HANA Cockpit with HA through a load balancer
In this article I will show how I built a HANA Cockpit with High Availability for our production databases, through a load balancer. I recommend reading first the SAP guide HowTo: High Availability for SAP HANA cockpit using SAP HANA system replication, as it describes most of the work required. It does not cover the setup of the load balancer, though, or the changes required in the HANA Cockpit configuration. I decided to write this article because of the time it took me to complete the setup and solve all the issues that appeared.
All SAP infrastructure in our company is in the Azure cloud. The HANA Cockpit landscape consists of two HANA Express database servers and one load balancer:
– The HANA servers run RedHat Linux v7.4 and HANA Express 2.0 SP04. HANA Cockpit is installed from the latest version available in SAP downloads at the time of this writing: v2.0 SP10 Patch 08 (from the SAPHANACOCKPIT10_8-70002299.SAR file).
– The load balancer is provided by Azure, with a backend pool pointing to the two servers (IP addresses) and a health probe pointing to the nameserver SQL port of the HANA databases (30013 in our case). This port is only active on the primary database when replication is active in logreplay mode. The XSA ports can also be a good choice, but note that the XSA startup takes quite some time (under an hour), so in the meantime the load balancer may not work.
I installed HANA Cockpit from the command line. The only thing to watch is to enter the FQDN, instead of the default value, in the ‘Local Host Name’ parameter. In this version the cockpit won’t work with just the hostname, because URLs are built without the domain and won’t be reachable from browsers. Note that the hostname cannot be changed later, so a mistake here means a reinstall! Use the same SID, instance number, etc. on both servers; only the FQDN should differ. In this article they are server01.mydomain and server02.mydomain.
So, our starting point is two identical HANA DB servers, server01 and server02, with the same cockpit up and running.
Configure HANA DB replication between servers
This is a well-known step with a lot of documentation. In this example the setup was done from the command line and it was quick.
On server01, which will be the primary:
hdbnsutil -sr_enable --name=ph0db01
Stop server02 (HDB stop). There are two important SAP OSS notes to consider at this point, as pointed out in the SAP guide:
Authentication is required for replication and for the XSA runtime, so we need to copy the following files from server01 to server02:
The xscontroller.ini file will be copied again later, but it is required now to start replication. Finally, run on server02 to register it as secondary:
hdbnsutil -sr_register --name=ph0db02 --remoteHost=server01.mydomain --remoteInstance=00 --replicationMode=sync --operationMode=logreplay
This is a good time to test the HANA takeover, which should be successful from the database perspective. The cockpit, however, won’t work on the secondary server02, because its hostname is different and all the configuration stored in the database points to server01.
Setup the load balancer
We’re using Azure, but I think it should be similar with other options. We need a DNS name for the load balancer: serverlb.mydomain. The setup itself is simple, as explained before. The important thing to understand is that the load balancer checks the backend pool addresses using the health probe settings, so the server that responds on port 30013 receives all requests. As our HANA replication is an active-passive scenario, no real load balancing takes place.
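To see from a shell which server the health probe would currently pick, something like the sketch below can help. It only imitates a TCP probe with bash’s /dev/tcp and the coreutils timeout command; it is not how Azure implements its probes, and the function names tcp_probe and active_backend are my own.

```shell
#!/usr/bin/env bash
# Minimal TCP probe sketch (hypothetical helper names, not an Azure tool).

# tcp_probe HOST PORT [TIMEOUT_SECONDS]
# Returns 0 when a TCP connection can be opened, non-zero otherwise.
tcp_probe() {
  local host=$1 port=$2 wait=${3:-3}
  # /dev/tcp is a bash feature; timeout(1) comes from GNU coreutils.
  timeout "$wait" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# active_backend PORT HOST...
# Prints the first host that answers on PORT; returns 1 when none does.
active_backend() {
  local port=$1 host
  shift
  for host in "$@"; do
    if tcp_probe "$host" "$port"; then
      printf '%s\n' "$host"
      return 0
    fi
  done
  return 1
}

# Example with the names used in this article:
# active_backend 30013 server01.mydomain server02.mydomain
```

With replication running, only the primary should be printed; after a takeover the answer should switch to the other server.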
Now the interesting part starts. We have to change the cockpit configuration to support the load balancer and remove the dependency on the HANA server names.
Cockpit is installed with an automatically generated certificate for the FQDN provided during installation. It is not valid for HA: it only covers that FQDN, so browsers will complain when the cockpit is reached through the load balancer name. So the first step is to get a signed certificate. I used the sapgenpse executable provided in the HANA exe directory, as usual, to generate the request:
sapgenpse get_pse -p cockpitlb.pse -r serverlb.req -k GN-dNSName:serverlb.mydomain -k GN-dNSName:server01.mydomain -k GN-dNSName:server02.mydomain "CN=serverlb.mydomain, OU=HANA, O=SAP, C=DE"
Note that the SAN (Subject Alternative Name) parameters should include all the hostnames involved; use the -k option as many times as required.
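If the list of hostnames grows, the repeated -k options can be generated instead of typed. This is just a small sketch; san_options is a helper name I made up, and it assumes hostnames without spaces, so the unquoted command substitution in the usage line is intentional.

```shell
#!/usr/bin/env bash
# Sketch: build the repeated "-k GN-dNSName:<host>" options from a host list.
# san_options is a hypothetical helper, not part of sapgenpse.
san_options() {
  local host out=""
  for host in "$@"; do
    out="$out -k GN-dNSName:$host"
  done
  # Drop the leading space before printing.
  printf '%s\n' "${out# }"
}

# Usage (left unquoted on purpose so the options are split into words):
# sapgenpse get_pse -p cockpitlb.pse -r serverlb.req \
#   $(san_options serverlb.mydomain server01.mydomain server02.mydomain) \
#   "CN=serverlb.mydomain, OU=HANA, O=SAP, C=DE"
```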
I sent the serverlb.req file to my team and they returned a signed certificate, base64 encoded and in PKCS#7 format (.p7b file). It is important to also get the intermediate and root authority certificates.
So I ended up with 3 files: serverlb.p7b, NALRoot.cer and iauth.cer
Configure XSA runtime
We should install these certificates in the right place, but first we need to reconfigure the XSA runtime so that it builds URLs pointing to the load balancer instead of the server names. This has to come first because, to import a certificate, the default_domain in XSA and the hostname in the certificate must match.
We need to change two parameters in the xscontroller.ini file on the primary server (it will be copied to the secondary server later). The file is in:
It is easier, though, to change it in HANA Studio, on the Configuration tab. The parameters to update are default_domain and api_url, and both should point to serverlb.mydomain. The available documentation only mentions default_domain, but I found that api_url is needed as well.
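To double check the two values from the command line after the change, a small reader like the one below can be used. It is only a sketch: ini_get is my own helper, and I am assuming that both parameters live in the [communication] section of xscontroller.ini and that api_url uses port 30030 (the port seen elsewhere in this article). Replace the placeholder path with the real location of the file.

```shell
#!/usr/bin/env bash
# Sketch: read one key from one section of an ini file (hypothetical helper).
# ini_get FILE SECTION KEY -> prints the value, returns 1 when not found.
ini_get() {
  awk -v section="$2" -v key="$3" '
    /^[ \t]*\[/ {                         # section header, e.g. [communication]
      gsub(/^[ \t]*\[|\][ \t]*$/, "")
      current = $0
      next
    }
    current == section {
      eq = index($0, "=")
      if (eq == 0) next
      name = substr($0, 1, eq - 1)
      gsub(/^[ \t]+|[ \t]+$/, "", name)
      if (name == key) {
        value = substr($0, eq + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", value)
        print value
        found = 1
        exit
      }
    }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# Usage (the path is a placeholder, the section name is an assumption):
# ini_get /path/to/xscontroller.ini communication default_domain
# ini_get /path/to/xscontroller.ini communication api_url
```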
Now an XSA restart is required, and it takes time, so it is a good moment for a coffee break. Run the command XSA restart as <sid>adm on server01.
After the restart you should be able to access the main XSA page from the URL:
and the other URLs on this page should go through the load balancer. You can use the command xs apps at any time to see the URLs available for each application.
But note that after changing the default domain, the XSA runtime has generated a new auto-issued certificate.
Issue nº1. This probably only happens to me because I’m running the cockpit on RedHat Linux. Links through the load balancer simply do not work from the server itself:
[root@server01 ~]# curl -vk https://serverlb.mydomain:30030
* About to connect() to serverlb.mydomain port 30030 (#0)
* Trying 10.XX.XX.XX…
* Connection timed out
* Failed connect to serverlb.mydomain:30030; Connection timed out
* Closing connection 0
curl: (7) Failed connect to serverlb.mydomain:30030; Connection timed out
Nice. The problem is related to a Linux configuration that prevents packets sent from server01 from reaching server01 again through the load balancer. The cause is these kernel parameters (review the documentation to understand why they can drop our packets):
[root@server01 data]# sysctl -a | grep '\.rp_filter'
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 2
net.ipv4.conf.lo.rp_filter = 0
There is an immediate workaround: add an entry to the hosts file of the two HANA DB servers:
<server ip address> serverlb.mydomain
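As the same line has to be added on both servers, it can be done with a small idempotent helper. This is a sketch under my own naming (add_hosts_entry); the optional third argument exists only so that it can be tried on a copy of the file first.

```shell
#!/usr/bin/env bash
# Sketch: append "IP NAME" to a hosts file unless NAME is already listed.
# add_hosts_entry IP NAME [FILE]   (FILE defaults to /etc/hosts)
add_hosts_entry() {
  local ip=$1 name=$2 file=${3:-/etc/hosts}
  # Look for NAME as a whole field on any line, ignoring comments.
  if awk -v n="$name" '
       { sub(/#.*/, "") }
       { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
       END { exit (found ? 0 : 1) }
     ' "$file"; then
    return 0
  fi
  printf '%s\t%s\n' "$ip" "$name" >> "$file"
}

# Usage as root, with the address from the line above:
# add_hosts_entry <server ip address> serverlb.mydomain
```

Running it twice leaves a single entry, so it is safe to include in a setup script.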
Import certificates in XSA runtime
So let’s go ahead. Now it’s time to import the certificates into the XSA runtime. Note that most xs commands must be preceded by a login; xs login or xs-admin-login are the right commands for that.
First it is necessary to convert the files, because we need the certificate in X.509 format (.pem file) and the private key, which is stored in the .pse file we created, as a separate file.
Issue nº2. Here I found another issue with the sapgenpse executable provided, which basically corrupts the certificate chain, so forget about a direct command like:
xs set-certificate serverlb.mydomain --pse cockpitlb.pse
SAP OSS note 2832060 – Verification error when setting Domain Certificate for SAP HANA Cockpit explains how to handle this error, but I needed some time to figure out how to create the files required for this certificate. With the help of openssl I was finally able to do it, and sorry about this command sequence…
To extract the private key from our .pse file:
sapgenpse import_own_cert -p cockpitlb.pse -c serverlb.p7b
sapgenpse export_p12 -p cockpitlb.pse -C 0 cockpitlb.p12
openssl pkcs12 -in cockpitlb.p12 -nocerts -nodes | sed -ne '/-BEGIN PRIVATE KEY-/,/-END PRIVATE KEY-/p' > privatekey.key
openssl pkcs8 -topk8 -in privatekey.key -out private_pkcs8_serverlb.key -nocrypt
To generate the .pem file directly from the signed certificate in X.509 format:
openssl pkcs7 -print_certs -in serverlb.p7b -out serverlb.pem
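Before importing, it may be worth checking that the extracted private key really belongs to the certificate, since a mismatch is easy to produce with so many conversions. The sketch below compares the public keys with openssl; key_matches_cert is my own helper name. Note that openssl x509 reads only the first certificate in the file, so this assumes the server certificate comes first in serverlb.pem.

```shell
#!/usr/bin/env bash
# Sketch: check that a private key matches a certificate (hypothetical helper).
# key_matches_cert CERT_PEM KEY_PEM
# Returns 0 on match, 1 on mismatch, 2 when openssl cannot read an input.
key_matches_cert() {
  local cert_pub key_pub
  # Public key embedded in the (first) certificate of the file.
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
  # Public key derived from the private key.
  key_pub=$(openssl pkey -in "$2" -pubout) || return 2
  [ "$cert_pub" = "$key_pub" ]
}

# Usage with the files created above:
# key_matches_cert serverlb.pem private_pkcs8_serverlb.key && echo "key and certificate match"
```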
and finally import the certificate into XSA runtime:
xs set-certificate serverlb.mydomain -c serverlb.pem -k private_pkcs8_serverlb.key
Then import the certificates of the CAs involved:
xs trust-certificate NALRoot -c NALRoot.cer
xs trust-certificate iauth -c iauth.cer
You can review all work done with commands:
Issue nº3. I also wasted time with the following error, which won’t happen if you’ve imported all the required SSL certificates as described above. From the command xs logs cockpit-admin-web-app --recent :
10/9/19 1:50:06.973 PM [APP/7-4] SYS #2.0#2019 10 09 13:50:06:973#+00:00#ERROR#/Handler#########rDamNLe85hOBWpXorfZ5qV5vtcGaiF4s######k1jbyhxy#PLAIN##GET request to /login/callback?code=aH7RTUTlWl completed with status 500 – Could not authenticate with UAA: Could not obtain access token: request to UAA at https://serverlb.mydomain:30032/uaa-security/oauth/token failed, error: Request to UAA failed: unable to get local issuer certificate (connecting to serverlb.mydomain:30032), #
Finally, copy the xscontroller.ini file to server02 and start the database there.
At this point the setup should work fine on both servers, always pointing to the load balancer.
Also, the command xs apps should now show all application URLs going through the load balancer.
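With many applications it is easy to miss one URL that still points to a server name. A possible check is to filter the output for URLs whose host is not the load balancer. The sketch below makes no assumption about the table layout of xs apps; it only assumes that URLs are printed with an http:// or https:// prefix, and urls_bypassing_lb is a helper name of my own.

```shell
#!/usr/bin/env bash
# Sketch: list URLs read from stdin whose host differs from the load balancer.
# urls_bypassing_lb LB_HOST   (hypothetical helper)
urls_bypassing_lb() {
  local lb=$1
  # Pull out every URL, then compare its host part (port stripped) with LB_HOST.
  grep -Eo 'https?://[^[:space:],]+' | awk -F/ -v lb="$lb" '
    { host = $3; sub(/:[0-9]+$/, "", host) }
    host != lb { print }
  '
}

# Usage after xs login:
# xs apps | urls_bypassing_lb serverlb.mydomain
```

An empty result means every URL found goes through serverlb.mydomain.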
This setup does not require any configuration change when performing a takeover, such as an IP or DNS redirection (as stated in the SAP HowTo guide), and it simplifies everyday admin work. There will always be a noticeable downtime when switching roles between the servers, because of the time required to stop the XSA runtime on one server and start it on the other.