HANA Active Active System Replication – Configuration, Failover & Failback
Having worked with an Active-Active Read Enabled (R/E) System Replication scenario, we wanted to share our experiences. The official documentation is good, but it does not provide many diagrams of the overall setup, process flows for failover and failback, or guidance on how read-only queries can be handled. I worked on this with a long-time colleague, Paul Barker, for a short evaluation of the Active-Active with Read Enabled capability.
System Replication Prerequisites
- 2+ HANA systems, we used HANA 2.00.20 (HANA 2 SP2).
- Same size systems
- Same HANA SID
- Same HANA Instance #
- Different host names 🙂
1. Initial Landscape Configuration
1a. Acquire an Environment
We used a Cloud Appliance Library (CAL) HANA instance for quick and easy access to HANA environments.
1b. Clone Environment, Rename Host
We cloned the HANA instance. With cloud providers such as GCP and AWS, this is a quick way of duplicating an existing environment.
We did hit one issue: after pausing the system, the networking was broken. Upon investigation we found that CAL has some clever start-up scripts that map host names and IPs automatically. These needed to be disabled to preserve changes made to the OS configuration. If you are experimenting with CAL, you would need to modify the following script.
## Tier2 (Secondary)
/etc/init.d/updatehosts
We now have 2 systems with the same SID but different host names. We need to tell HANA about the new host name, which can be achieved with this command.
## Tier2 (Secondary)
/hana/shared/HDB/hdblcm/hdblcm --action=rename_system --hostmap=vhcalhdbdb=tier2
1c. Configure System Replication
To enable system replication, we need to tell both the primary and secondary nodes about this configuration. The secondary needs to be stopped before issuing this command. When the secondary is re-started it will automatically sync all data with the primary node.
## Tier1 (Primary)
hdbnsutil -sr_enable --name=tier1
## Tier2 (Secondary)
hdbnsutil -sr_register --force_full_replica --remoteHost=vhcalhdbdb --remoteInstance=00 --replicationmode=syncmem --name=tier2 --operationMode=logreplay_readaccess
HDB start
1d. Networking – Virtual IPs
To hide the physical deployment from applications and client tools, we can use Virtual IPs to connect to our environment. To make this possible we need to add a second network interface to each HANA node. We also need to configure the Linux routing tables for each of the network interfaces, as adding the 2nd interface also affects the 1st one.
## Tier1 (Primary) & Tier2 (Secondary)
## Map the new network card (NIC) to eth2
udevadm trigger --subsystem-match=net -c add -y eth2
## Verify we now have 2 NICs
sid-hdb:~ # ifconfig -a
eth0      Link encap:Ethernet  HWaddr 0E:FA:20:F0:EE:D2
          inet addr:172.31.35.197  Bcast:172.31.47.255  Mask:255.255.240.0
          inet6 addr: fe80::cfa:20ff:fef0:eed2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:34438 errors:0 dropped:0 overruns:0 frame:0
          TX packets:25235 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:92844006 (88.5 Mb)  TX bytes:5562258 (5.3 Mb)
eth2      Link encap:Ethernet  HWaddr 0E:B1:BF:1E:04:96
          inet addr:172.31.46.157  Bcast:172.31.47.255  Mask:255.255.240.0
          inet6 addr: fe80::cb1:bfff:fe1e:496/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:38 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1560 (1.5 Kb)  TX bytes:1590 (1.5 Kb)
## Define the default routes for each NIC
sid-hdb:~ # ip route add default via 172.31.32.1 dev eth0 tab 1
sid-hdb:~ # ip route add default via 172.31.32.1 dev eth2 tab 2
sid-hdb:~ # ip route show table 1
default via 172.31.32.1 dev eth0
sid-hdb:~ # ip route show table 2
default via 172.31.32.1 dev eth2
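The default routes above live in routing tables 1 and 2, and Linux only consults a non-main table when a policy rule points traffic at it. The commands we captured do not show that step, so the following is a sketch under that assumption, using the two interface addresses from the ifconfig output. The helper name `print_policy_rule` is our own, and it only prints the `ip rule` commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: print source-based policy rules that send traffic from each
# NIC's address to its own routing table. Assumes tables 1 and 2 and the
# addresses shown in the ifconfig output; adjust for your own hosts.

# print_policy_rule <source-ip> <table-id>
print_policy_rule() {
    # "ip rule add from <addr> table <id>" selects a table by source address.
    printf 'ip rule add from %s table %s\n' "$1" "$2"
}

print_policy_rule 172.31.35.197 1   # eth0
print_policy_rule 172.31.46.157 2   # eth2
```

Once the printed commands look right, they can be run as root and checked with `ip rule show`.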
2. Failover in DR/HA Scenario
We can now access each HANA instance via either the original IP or the new virtual IP address (VIP). The primary (Tier1) allows any type of query, and it can also pass read-only queries to the secondary. We can also connect directly to the secondary if we wish to use it purely for read-only analytics. We can verify that our current configuration is as expected.
## Tier1 (Primary) or Tier2 (Secondary)
hdbnsutil -sr_state
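If you want to check the role from a script rather than by eye, the state output can be filtered. This is a minimal sketch that assumes the output contains a line beginning with `mode:` (for example `mode: primary` on the primary, or the replication mode on a secondary); that layout is from memory, not from this walkthrough, so check it against your own revision. The function name `sr_mode` is ours.

```shell
#!/bin/sh
# sr_mode: read "hdbnsutil -sr_state" output on stdin and print the value of
# the first line that starts with "mode:". The "operation mode:" line is not
# matched because it does not start with "mode:".
sr_mode() {
    sed -n 's/^mode:[[:space:]]*//p' | head -n 1
}

# Usage on a HANA node, as the <sid>adm user:
#   hdbnsutil -sr_state | sr_mode
```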
2b. Simulate Primary Failure
In a failover scenario the primary could stop unexpectedly. We can simulate this with a kill.
## Tier1 (Primary)
HDB kill -9
2c. Secondary Takeover
We now tell the secondary (Tier2) to become the primary.
## Secondary (Tier2)
hdbnsutil -sr_takeover
2d. Swap Virtual IP to new Primary
Tier2 is now primary, but queries are still being sent to the now dead Tier1 node. We use the AWS CLI to swap the VIP from the Tier1 node to Tier2. The command was generated using the AWS Console, but executing it via the CLI prevents errors. Here we are associating a network interface with a private IP.
## Windows, Mac or Linux with AWS Client Tools
aws ec2 associate-address --allocation-id "eipalloc-0b18c02cfc0694674" --network-interface-id "eni-00858248469
2e. Failover Completed
The process is now completed. We have swapped our primary HANA node from Tier1 to Tier2.
3. Failback to original configuration
The failback process is similar, but first we need to re-sync our old primary (Tier1) with any changes that have taken place while it was offline. The names primary and secondary are now confusing, because the actual nodes are reversed while the roles remain.
3a. Original Primary Down, Secondary Now Primary
We start with just a single active node (Tier2).
3b. Make old Primary Secondary
Before re-starting Tier1, we need to tell it that it is now a secondary node.
## Failed Primary (Tier1)
hdbnsutil -sr_register --force_full_replica --remoteHost=tier2 --remoteInstance=00 --replicationmode=syncmem --name=tier1 --operationMode=logreplay_readaccess
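The registration command in this walkthrough changes only in two places between steps: the remote host and the site name. As a small sketch, a helper (hypothetically named `print_sr_register`) can print the command for a given pair so it is harder to mix up the direction. It copies the flags exactly as used in this post and does not execute anything.

```shell
#!/bin/sh
# print_sr_register <remote-host> <site-name>
# Prints (does not run) the registration command with the flags used in this
# post. The secondary being registered must be stopped before running it.
print_sr_register() {
    printf 'hdbnsutil -sr_register --force_full_replica --remoteHost=%s --remoteInstance=00 --replicationmode=syncmem --name=%s --operationMode=logreplay_readaccess\n' "$1" "$2"
}

print_sr_register tier2 tier1        # Tier1 registering against Tier2 (failback)
print_sr_register vhcalhdbdb tier2   # Tier2 registering against Tier1 (normal setup)
```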
3c. Start new Secondary (old primary)
When Tier1 re-starts it will now sync all changes made during the time it was not running. We can also verify the status of our system replication configuration.
## Tier1 Failed Primary, becoming a Secondary
HDB start
hdbnsutil -sr_state
3d. Secondary re-sync with Primary
Initially the new secondary will not be available. The time before it becomes operational depends upon the volume of changes while it was off-line. With the re-sync completed we now have 2 nodes as before, but their roles are reversed.
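Rather than checking by hand, the wait can be scripted. The sketch below is a generic polling helper (the name `wait_for_code` is ours) that re-runs a command until it exits with a given status. It is meant to be paired with `systemReplicationStatus.py`, which ships with HANA and, as far as we recall from SAP's documentation, exits with status 15 when replication is ACTIVE; treat that code and the invocation in the comment as assumptions to verify on your revision.

```shell
#!/bin/sh
# wait_for_code <want> <tries> <delay-seconds> <command...>
# Runs the command up to <tries> times, sleeping <delay-seconds> between
# attempts, and returns 0 as soon as the command exits with status <want>.
# Returns 1 if that status is never seen.
wait_for_code() {
    want=$1
    tries=$2
    delay=$3
    shift 3
    while [ "$tries" -gt 0 ]; do
        rc=0
        "$@" >/dev/null 2>&1 || rc=$?
        if [ "$rc" -eq "$want" ]; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Assumed usage on the current primary, as the <sid>adm user:
#   wait_for_code 15 60 10 HDBSettings.sh systemReplicationStatus.py
```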
3e. Stop Primary
To promote Tier1 back to primary we need to stop the current primary.
## Tier2 (now Primary)
HDB stop
3f. Promote Secondary to Primary
We can now tell Tier1 that it is the primary node. It will automatically check Tier2 is not active and then take over.
## Tier1 Switching from Secondary to Primary
hdbnsutil -sr_takeover
3g. Swap Virtual IPs
The networking needs to be updated to reflect the changes in our deployment. We point VIP1 to our new primary and VIP2 back to the stopped primary (soon to become secondary).
## Windows, Mac or Linux with AWS Client Tools
## Switch Primary Virtual IP to Tier1 as Primary Node
aws ec2 associate-address --allocation-id "eipalloc-0b18c02cfc0694674" --network-interface-id "eni-059611a76ccc2c7b4" --allow-reassociation --private-ip-address "172.31.44.23" --region us-east-1
3h. Revert Tier2 to Secondary
We now need to tell Tier2 it is a Secondary node again.
## Tier2 revert to Secondary
hdbnsutil -sr_register --force_full_replica --remoteHost=vhcalhdbdb --remoteInstance=00 --replicationmode=syncmem --name=tier2 --operationMode=logreplay_readaccess
3i. Restart and Re-sync Secondary
When we re-start the secondary it will re-sync with the primary.
## Tier2 re-starting as a Secondary
HDB start
3j. Failback completed, both servers are restored.
We finish the process as we began, with 2 HANA servers in an Active-Active configuration. We can verify that everything is configured as expected.
## Either HANA Node
hdbnsutil -sr_state
Thanks for reading, hope it was useful.