SAP HANA Hands on tests ( part 2 ) : HANA DB Standby
Hello,
In a previous blog, I gave a brief overview of an SAP HANA installation on VMware ESXi ( SAP HANA Hands on tests ( part 1 ) : HANA DB installation ).
I had one HANA box installed, with a HANA DB running on it and an SAP ECC EHP7 instance as well ( not a recommended setup ) :
Now, I have replicated my first box using VMware functionality and cleaned up the HDB and ECC on it, in order to get a brand new HANA box.
The setup is the same.
I have configured NFS between my 2 boxes ( hdbtest1 and hdbtest2 ).
Let's configure the HANA scale-out standby:
Mount the /hana directory from hdbtest1 on hdbtest2 :
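As a sketch, the mount can be done like this ( the export options are assumptions; adapt them to your environment ):

```sh
# On hdbtest1 (NFS server), export the shared HANA filesystem:
# /etc/exports entry (assumed options):
#   /hana  hdbtest2(rw,no_root_squash,sync)

# On hdbtest2 (NFS client), mount the share under the same path:
mount -t nfs hdbtest1:/hana /hana

# To make it persistent, an /etc/fstab entry such as:
#   hdbtest1:/hana  /hana  nfs  defaults  0  0
```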
Install the required HANA software on the scale-out node using the embedded hdblcm tool.
This time I use hdblcm in “text mode” ( a bit of R3setup nostalgia 😉 ) :
SAP HANA Lifecycle Management – SAP HANA 1.00.82.00.394270
**********************************************************
System Properties:
HTL /hana/shared/HTL HDB_ALONE
HDB00 version: 1.00.82.00.394270 host: froxhdbtest plugins: lcapps,afl
Enter Root User Name [root]:
Enter Root User Password:
Collecting information from host ‘froxhdbtest2’…
Information collected from host ‘froxhdbtest2’.
Options:
Index | Listen Interface | Description
———————————————————————————————
1 | global | The HANA services will listen on all network interfaces
2 | internal | The HANA services will only listen on a specific network interface
NOTE : for testing purposes I selected option 1, but in a production setup I would probably use a dedicated NIC in order to put the HANA inter-process communication on a dedicated network.
Select Listen Interface / Enter Index [1]:
Enter System Administrator (htladm) Password:
Enter Certificate Host Name For Host ‘froxhdbtest2’ [froxhdbtest2]:
Summary before execution:
=========================
Add Additional Hosts to SAP HANA System
Add Hosts Parameters
Installation Path: /hana/shared
SAP HANA System ID: HTL
Install SSH Key: Yes
Root User Name: root
Listen Interface: global
Enable the installation or upgrade of the SAP Host Agent: Yes
Certificate Host Names: froxhdbtest2
Additional Hosts
froxhdbtest2
Storage Partition: <<assign automatically>>
Do you want to continue? (y/n): y
Adding additional host to SAP HANA Database…
Adding additional host…
Adding host ‘froxhdbtest2’…
Registering HANA Lifecycle Manager on remote hosts…
Regenerating SSL certificates on remote hosts…
Deploying Host agent configurations on remote hosts…
Updating Component List…
SAP HANA system : Add hosts finished successfully.
Log file written to ‘/var/tmp/hdb_hdblcm_add_hosts_2015-03-06_14.47.56/hdblcm.log’.
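For reference, the same add-hosts step could also be run non-interactively with the resident hdblcm; a sketch based on my SID and hostnames ( passwords are then prompted for, or read from a configuration file ):

```sh
# Run as root on the existing node: add froxhdbtest2 to system HTL
/hana/shared/HTL/hdblcm/hdblcm --action=add_hosts \
    --addhosts=froxhdbtest2 \
    --listen_interface=global
```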
Now you have an additional host ready to take over in case of a first-node failure :
Of course, in this particular case where hdbtest also holds the HDB storage, I would run into trouble if I lost it completely.
So I will swap the roles : hdbtest2 will be master and hdbtest will be slave :
Now let's play with the standby feature :
I have my ERP instance running, with an SGEN run in progress in order to generate some load.
Here is the HDB system status before the poweroff :
I power off hdbtest2.
We can see the hdbtest2 system going down. The HANA admin console shows some error messages :
The work processes on the ECC instance switch to reconnect status, and the transactions are rolled back :
The hdbtest1 host should take the lead after a while.
Now that the master role is lost on hdbtest2 following the poweroff, we can see the hdbindexserver initializing on hdbtest1 :
The services are now started on hdbtest1.
In my ECC instance, I can see that the work processes have reconnected to the HDB.
The ECC server got out of reconnect status. The system is available again after a few minutes.
The SGEN run is of course canceled, but I can resume it :
ECC is O.K. The database is available through the hdbtest1 node, which was the former standby host.
We can see that hdbtest2 is now shown as Inactive.
Now we can restart the hdbtest2 host. No failback should happen.
This is also O.K. The HDB continues to run on hdbtest1. Host hdbtest2 is back in the setup.
We are now fully back online.
The system was out for only a few minutes.
The test is O.K.
The next step will be to apply patches using this setup.
Hi Steve,
sorry, but you are mixing two HA scenarios here, and in the wrong way. You wrote at the beginning that you were going to configure HANA replication - wrong. What you created is a minimal scale-out setup (you added the second node to the existing HANA system), but you did not define the standby role. Therefore both nodes act as workers - no HA setup (no failover possible), and both nodes hold data.
When you stopped the second node (it does not matter that it was the master one), SAP reacted as it would with any DB: the work processes lost their connection to the DB and a re-login was probably needed. The HANA system cannot be functional without the second node up and running again, since it holds part of the data as well.
Br, Jan
Hello Jan,
You're right about the first part. At first I wanted to use the DR scenario, but I finally configured a scale-out system in order to get some HA functionality that way.
I'll correct this quickly.
For the second part : having that kind of setup to provide HA ( and not DR ) works.
I finally found out that the problem lay in the fact that the hdbuserstore was not updated with my second node's information ( for the DEFAULT entry ).
Once this was corrected, everything worked as expected, i.e. the SAP application instance ( here ECC ) goes into "reconnect" status and, once the HDB is activated on the standby node, the work processes reconnect to the database, now made available through node 2.
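As a sketch, the DEFAULT entry can be maintained on the ECC application server with the hdbuserstore tool ( the key, user and ports below are assumptions based on my instance number 00; the password is a placeholder ):

```sh
# As the <sid>adm user of the ECC instance: show the current entries
hdbuserstore LIST

# Update the DEFAULT key so that it knows both HDB hosts
# (SQL port of instance 00 = 30015)
hdbuserstore SET DEFAULT "froxhdbtest:30015;froxhdbtest2:30015" SAPSR3 <password>
```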
In my configuration, with only 2 nodes, the second node does not hold any data and is therefore a "standby only" node.
In a 4-node scenario, for example, you would have 3 worker nodes and 1 standby node not actively handling data. Should one of the 3 worker nodes fail, the standby node would take over the failed node's data so that the HDB system can work again.
In a 2-node configuration, only one node is available/active at a time, so I have an HA setup in an active/passive fashion.
Br, Steve.
Hello Steve,
HANA HA options are divided into DR support (backups, system and storage replication) and FR (fault recovery) support (service auto-restart and host auto-failover - scale-out with a standby node).
Even in your configuration with only 2 nodes, you still have to define the standby role:
Otherwise both nodes (workers) hold data and there is no failover possibility. So in that case the setup aims at performance or sizing scaling, not HA.
Hope it is clear now.
Br, Jan
Hello Jan,
This is interesting. In my test platform, here are the statuses for my HDB hosts :
hdbtest2 is master and hdbtest has no status ( thus no standby status ) in the detail part.
You're right, the index server roles are worker for both.
But when I kill hdbtest2, the hdbtest node takes over the master role.
Still, the ECC instance goes from "reconnect" to "connected" status and works properly.
What I did to get this working :
1 - Install HDB first node.
2 - Install ECC ( one HDB node available only )
3 - Scale out
4 - Did NOT redistribute tables after adding hosts.
In this situation the tables are "attached" to one host at a time :
My "slave" host shows :
My master host shows :
It holds all the tables.
When I "failover", here are the results for the queries :
So now the froxhdbtest host holds all the tables, and hence all the data.
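The per-host table counts can be checked with a query against the monitoring views, e.g. via hdbsql ( a sketch; credentials are placeholders, and M_CS_TABLES only covers column-store tables ):

```sh
# Count the column-store tables held by each host
hdbsql -i 00 -n froxhdbtest:30015 -u SYSTEM -p <password> \
  "SELECT host, COUNT(*) FROM m_cs_tables GROUP BY host"
```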
The failover works.
I have the following information for the hosts status :
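These host statuses can also be read from the M_LANDSCAPE_HOST_CONFIGURATION monitoring view, e.g. via hdbsql ( a sketch; credentials are placeholders ):

```sh
# Configured vs. actual role of each host in the landscape
hdbsql -i 00 -n froxhdbtest:30015 -u SYSTEM -p <password> \
  "SELECT host, host_active, indexserver_config_role, indexserver_actual_role
   FROM m_landscape_host_configuration"
```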
The only thing, as I understand it in the end, is to explicitly set a standby node in the configuration, right ? ( this will probably disallow the redistribute feature, I guess ).
I'll have a look at that.
Thanks & Best regards,
Steve.
Hi Steve,
I assumed that you hadn't redistributed the tables after adding the host - that is why it really acts like an HA active/standby solution 🙂 .
I am just wondering whether ECC realizes that it is using a HANA scale-out and manages the redistribution by itself, or how to make ECC or BW aware of a scale-out created from a single-node system.
You can also test adding a new table, since tables are distributed across the available index servers using a round-robin approach. You will then probably lose this "standby" node, i.e. the HA setup, since it will contain some data.
Can you please also check parameters in global.ini --> [table_placement] section?
Thanks and Br,
Jan
Hello Jan,
To me, the redistribution needs to be triggered manually after the scale-out is installed. So, so far, it's as if the applications are not "HANA aware" and do not trigger the table redistribution by themselves.
Newly created tables should indeed be distributed in a round-robin way. So you're right, the "HA" setup is then lost. I'll give it a try.
The [table_placement] section is as follows :
Thanks & Br, Steve.
@ Jan : I managed to bring the configuration in line with the best practices.
I now have a true standby HDB host :
To do so, I had to remove the host from the configuration and add it back.
I did it with hdblcmgui, and this time the standby role was offered. I did not get it at first with the command line.
Thanks & Best regards,
Steve.
Hi,
yes, now it looks OK to me 🙂 When you install a scale-out system or add a host to an existing system via hdblcm (not the GUI) and do not explicitly specify the role of the new host, it becomes a worker by default; see the Server Installation Guide:
Good luck with next testing 😉
Br, Jan
Hello Jan ,
That's it : the "problem" was that I used hdblcm with the --addhosts option but without the "role" option ( I ran into the problem solved by note 2068776 ).
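For reference, with the role specified explicitly, the command line variant works as well; a sketch:

```sh
# Add the host with an explicit standby role
# (without role=..., hdblcm defaults the new host to worker)
/hana/shared/HTL/hdblcm/hdblcm --action=add_hosts \
    --addhosts=froxhdbtest2:role=standby
```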
I'll update the blog accordingly.
Thanks & BR,
Steve.