HANA Multitier System Replication – HOW TO Configure/ Take-Over / Fail-Back Setup for DR scenario
———————————————————————————————————————————
Hello All,
This is my first SAP blog post. Happy knowledge sharing! Please let me know if you have any suggestions for improvement.
DISCLAIMER: This blog is based on due diligence performed at the time of writing. As options and paths can change over time, readers are advised to check the latest official information before making business decisions. The author accepts no responsibility for the current validity, accuracy, completeness or quality of information provided.
———————————————————————————————————————————
I was searching for a step-by-step HANA multitier replication fail-over and fail-back procedure for a DR scenario. I read all the blog posts related to multitier replication, but I could not find one that covers the DR scenario where the primary and the secondary are both unavailable at the same time. So I thought of writing up my experience.
In this blog post, we are going to discuss the following:
- Prerequisites to set up Multitier System Replication
- How to set up Multitier HANA System Replication
- Takeover and Failback Procedure for Multitier System Replication
HANA version used: HANA 1.0 SPS 12
Before we proceed, remember…
- Multi-target HANA replication (A->B, A->C) is available only from HANA 2.0 SPS 03 onwards
Architecture
Prerequisites to set up Multitier System Replication
I have listed some important prerequisites here.
- The SAP HANA software version of the secondary has to be equal to or newer than the one on the primary.
- In terms of configuration, the primary and secondary must be identical: not only the number of hosts but also host roles, failover groups, etc.
- SID and instance number must be identical.
- Hostnames in the primary system must be different from the hostnames used in the secondary systems.
- Ensure that log_mode is set to “normal”
- The required ports must be available. The same <instance number> is used for primary and secondary systems. The <instance number>+1 must be free on both systems, because this port range is used for system replication communication.
- The operation mode must be identical for all replications, a mixture (e.g. delta_datashipping from tier 1 to tier 2 and logreplay from tier 2 to tier 3) isn’t allowed
- Both systems should run on the same endianness platform
- How to find the endianness of the VM?
  - Run the following command: lscpu (SUSE & RHEL)
- Please refer below link for complete list – Link
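The two host-level checks from the list above (byte order and the <instance number>+1 port range) can be sketched as small shell helpers. This is only a sketch: the function names are made up for illustration, and it assumes that lscpu prints a "Byte Order:" line, as it normally does on SUSE and RHEL.

```shell
# Hypothetical helpers; the names are illustrative and not part of any SAP tool.

# Read lscpu output on stdin and print the value of its "Byte Order:" line.
parse_byte_order() {
    awk -F: '/^Byte Order/ { sub(/^[ \t]+/, "", $2); print $2 }'
}

# Print <instance number>+1 as a two-digit string (for example 00 -> 01).
# The 10# prefix forces base 10 so that 08 and 09 are not parsed as octal.
next_instance_nr() {
    printf '%02d\n' $(( 10#$1 + 1 ))
}

# Usage on each host (compare the results between the sites by hand):
#   lscpu | parse_byte_order
#   next_instance_nr 00
```

The second helper only tells you which instance number has to stay unused; whether its ports are really free still has to be checked on each host.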
Setup Multitier HANA System Replication
In Node 1:
Step 1: Perform one full backup
Step 2: hdbnsutil -sr_enable --name=NODE1
In Node 2:
Step 1: Stop the secondary system:
sapcontrol -nr <instance_number> -function StopSystem HDB
Step 2: Register the secondary system, choosing a replication mode and an operation mode:
hdbnsutil -sr_register --remoteHost=<node1 hostname> --remoteInstance=<instance number> --replicationMode=<sync|syncmem|async> --operationMode=<delta_datashipping|logreplay> --name=<site name>
EX:
hdbnsutil -sr_register --remoteHost=hananode01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --name=NODE2
Step 3: Start the secondary system to start replication
sapcontrol -nr <instance_number> -function StartSystem HDB
EX:
sapcontrol -nr 00 -function StartSystem HDB
NOTE: Wait until HANA replication from NODE-1 to NODE-2 has completed successfully and all services become ACTIVE.
Step 4: On NODE02, execute:
hdbnsutil -sr_enable
Note: This enables replication from NODE-2 to NODE-3.
In Node 3:
Step 1: Stop the HANA DB
sapcontrol -nr <instance_number> -function StopSystem HDB
Step 2: Register the secondary system, choosing a replication mode and an operation mode:
hdbnsutil -sr_register --remoteHost=<node2 hostname> --remoteInstance=<instance number> --replicationMode=<sync|syncmem|async> --operationMode=<delta_datashipping|logreplay> --name=<site name>
EX:
hdbnsutil -sr_register --remoteHost=hananode02 --remoteInstance=00 --replicationMode=async --operationMode=logreplay --name=NODE3
Step 3: Start the secondary system to start replication
sapcontrol -nr <instance_number> -function StartSystem HDB
EX:
sapcontrol -nr 00 -function StartSystem HDB
NOTE: Wait until HANA replication from NODE-2 to NODE-3 has completed successfully and all services become ACTIVE.
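The register command for NODE 2 and NODE 3 differs only in the source host, the replication mode and the site name. The shell sketch below assembles it from arguments and only prints it, so it can be reviewed before it is run as <sid>adm. The function name build_sr_register is hypothetical; the hdbnsutil options are the ones used in the steps above.

```shell
# Hypothetical dry-run helper: prints the hdbnsutil -sr_register command
# for one tier instead of executing it.
# Arguments: <source host> <instance nr> <replication mode> <operation mode> <site name>
build_sr_register() {
    local host=$1 nr=$2 repl=$3 op=$4 site=$5

    # Reject anything that is not one of the modes listed in this post.
    case $repl in
        sync|syncmem|async) ;;
        *) echo "unknown replication mode: $repl" >&2; return 1 ;;
    esac
    case $op in
        delta_datashipping|logreplay) ;;
        *) echo "unknown operation mode: $op" >&2; return 1 ;;
    esac

    echo "hdbnsutil -sr_register --remoteHost=$host --remoteInstance=$nr" \
         "--replicationMode=$repl --operationMode=$op --name=$site"
}

# Usage:
#   build_sr_register hananode01 00 sync  logreplay NODE2   # tier 2
#   build_sr_register hananode02 00 async logreplay NODE3   # tier 3
```

Keeping the operation mode as an explicit argument makes it easy to see that both tiers use the same one, which is a prerequisite for multitier replication.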
Monitor the Replication Progress and Status:
On the primary system (NODE01), run the following Python script as <sid>adm:
python $DIR_INSTANCE/exe/python_support/systemReplicationStatus.py
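The "wait until ACTIVE" notes in the setup steps can be turned into a small polling loop around this script. The sketch below assumes the exit codes that SAP documents for systemReplicationStatus.py (10 NoHSR, 11 Error, 12 Unknown, 13 Initializing, 14 Syncing, 15 Active); please verify them for your HANA revision. Both function names are hypothetical.

```shell
# Map the exit code of systemReplicationStatus.py to a readable state.
# The code values are an assumption based on SAP's documentation of the script.
sr_status_name() {
    case $1 in
        10) echo "NoHSR" ;;
        11) echo "Error" ;;
        12) echo "Unknown" ;;
        13) echo "Initializing" ;;
        14) echo "Syncing" ;;
        15) echo "Active" ;;
        *)  echo "Unexpected($1)" ;;
    esac
}

# Poll until the overall status is Active (returns 0) or Error (returns 1).
# Run as <sid>adm on the primary. Argument: poll interval in seconds (default 60).
wait_for_active() {
    local interval=${1:-60} rc
    while true; do
        rc=0
        python "$DIR_INSTANCE/exe/python_support/systemReplicationStatus.py" \
            >/dev/null 2>&1 || rc=$?
        echo "replication status: $(sr_status_name "$rc")"
        [ "$rc" -eq 15 ] && return 0
        [ "$rc" -eq 11 ] && return 1
        sleep "$interval"
    done
}
```

The loop only looks at the overall exit code; for the per-service view, run the script without redirecting its output.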
Takeover and Fail back Procedure for Multi-tier System Replication
In this blog post, we discuss only the DR scenario. (For the fail-over and fail-back procedure in the HA scenario, i.e. NODE 1 to NODE 2, please refer to the procedure here -> LINK)
We consider the following situation here:
In case of disaster, the complete primary region is down
(OR) the HANA primary and secondary hosts are down and not accessible
Fail-Over to DR Site:
Step 1: On NODE03, perform the takeover
hdbnsutil -sr_takeover
Step 2: HANA will come up and be accessible on NODE-3
Step 3: Bring up your SAP application on top of HANA
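Before bringing up the application, it can help to confirm that NODE03 now reports itself as primary. A minimal sketch, assuming that hdbnsutil -sr_state prints a line of the form "mode: <value>"; the helper name parse_sr_mode is hypothetical.

```shell
# Hypothetical helper: read `hdbnsutil -sr_state` output on stdin and print
# the value of its "mode:" line. The line format is an assumption; check the
# output of your HANA revision.
parse_sr_mode() {
    awk -F': *' '$1 == "mode" { print $2; exit }'
}

# Usage as <sid>adm on NODE03 after the takeover:
#   hdbnsutil -sr_state | parse_sr_mode
# A successful takeover should leave the site in primary mode.
```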
Fail-Back to Primary Region:
High-level fail-back steps:
- Bring your HANA VMs up in the primary region
- Register your ex-primary system (NODE01) with the DR HANA (current primary, NODE03)
- Once reverse replication from NODE 03 to NODE 01 is done, perform a takeover on NODE01
- Bring up your SAP application in the primary region
There are two possible scenarios here:
- Due to the disaster, you may need to rebuild the HANA system from scratch and then perform the fail-back
  - For this, please follow from Step 4 below
- Bring up the existing HANA NODE-1 and NODE-2 after the issue is resolved in the primary region
  - In this scenario, both NODE-01 and NODE-02 still have their existing configuration. So before we register NODE01 as secondary to NODE 03, we need to clean up the existing configuration (because NODE01 had the role PRIMARY when it went down)
Step 1:
On NODE-1: Start the HANA database (HDB start), then run:
Step 1: hdbnsutil -sr_unregister --id=3
Step 2: hdbnsutil -sr_unregister --id=2
Step 3: hdbnsutil -sr_disable --force
Step 4: hdbnsutil -sr_cleanup --force
Step 2:
On the NODE-2 VM, do not start the HANA database.
Step 1: hdbnsutil -sr_cleanup --force
The replication status should then be “NONE”.
Step 3:
On NODE03, replication should already be enabled by default. If it is not, execute the following command:
hdbnsutil -sr_enable --name=NODE3
Step 4:
On NODE01,
Step 1: Stop HANA (HDB stop)
Step 2: Register the secondary system, choosing a replication mode and an operation mode:
hdbnsutil -sr_register --remoteHost=<node3 hostname> --remoteInstance=<instance number> --replicationMode=<sync|syncmem|async> --operationMode=<delta_datashipping|logreplay> --name=<site name>
EX:
hdbnsutil -sr_register --remoteHost=hananode03 --remoteInstance=00 --replicationMode=async --operationMode=logreplay --name=NODE1
Step 3: Start the secondary system to start replication
sapcontrol -nr <instance_number> -function StartSystem HDB
EX:
sapcontrol -nr 00 -function StartSystem HDB
NOTE: Wait until HANA replication from NODE-3 to NODE-1 has completed successfully and all services become ACTIVE.
Step 5: Once the decision to fail back from the DR site to the primary site is made, execute the following on NODE01:
Step 1: On NODE01, perform the takeover
hdbnsutil -sr_takeover
Step 2: HANA will come up and be accessible on NODE-01
Step 3: Bring up your SAP application on top of HANA on NODE 01
At this point the business is running in the primary region (NODE01) again, and we can re-establish the multitier replication (NODE 01 -> NODE 02 -> NODE 03) to restore HA and DR.
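When re-establishing the chain, the order matters: each tier registers against the tier directly before it, and tier 2 has to be enabled as a replication source before tier 3 registers, as in the setup section. The small shell sketch below only prints that registration order for a list of hosts; the function name print_chain_plan is hypothetical and nothing is executed against HANA.

```shell
# Hypothetical helper: given hosts in tier order, print which host has to be
# registered against which source host. Nothing is executed against HANA.
print_chain_plan() {
    local prev=$1 host
    shift
    for host in "$@"; do
        echo "$host registers against $prev"
        prev=$host
    done
}

# Usage:
#   print_chain_plan hananode01 hananode02 hananode03
```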
—————–
References
- 1999880 – FAQ: SAP HANA System Replication
- 2057595 – FAQ: SAP HANA High Availability
- System replication guide
- Troubleshoot System Replication
- How To Perform System Replication for SAP HANA
- Network Configuration for SAP HANA System Replication
Hello Saminathan,
This is a great article!
I have a question, and it would be great if you could clarify it. I have a similar setup and intend to perform a planned DR test. Site A -> Site B -> Site C.
However, once the DR test is completed, we do not need the test data and would like to go back to the initial setup.
Overview:
Site A stopped
Site B stopped
On Site C I perform a takeover and complete the planned DR test. Once the testing is done, we get back to the initial setup A -> B -> C.
For this, I can see from your blog that you register your ex-primary system (NODE01) with the DR HANA (current primary, NODE03).
We do not want to replicate from C to A or B, because we do not want the test data. What is the best approach in your view?
Regards
Sunam
Hello Sunam Sarkar,
Thank you.
Step 1: Stop HANA at the DR site
Step 2: Start Site A and Site B (since they were only stopped, HANA replication resumes once they are started)
Step 3: Once replication becomes ACTIVE between Site A and Site B, register Site C again (if you register Site C at B, a full initial replication starts again)
Hope this helps you..
Hi KANAGARASU SAMINATHAN,
Thanks a lot for this wonderful blog.
I have a question. Suppose we have a setup like this: Site A -> Site B -> Site C,
and we want to perform an OS upgrade on SITE A that takes around 40 hours. During this activity we want to make SITE B the primary, and once the activity is completed on SITE A we want to make SITE A the primary again, including all the changes made at SITE B.
How can we achieve this?
Thanks & Regards,
Parag Jhade
Thanks for the compliment.
Your scenario is pretty straightforward.
Step 1: Perform a takeover on Site B (if SITE A and SITE B are configured in a cluster, this can be done using cluster commands; if not, you can perform the takeover on SITE B manually)
-> This causes minimal downtime for the SAP application. Until HANA on SITE B is completely online, SAP cannot connect to the DB (let's say 15 to 30 minutes)
Step 2: Transactions now run on the SITE B HANA database, and on SITE A you can perform your OS upgrade
Step 3: Once the OS upgrade is done, you can follow the same procedure to fail back to the original primary system.
Note:
Kanagarasu Saminathan - In response to your first point: instead of a takeover, why can't we disable the replication between Site A and Site B by following the standard procedure? After disabling the replication between Site A and Site B, bring up the Site B database; SAP will point to the Site B database, making it the primary. After the OS activity on Site A, activate replication between Site B and Site A again and then perform the takeover to Site A. I look forward to your point of view on this approach.
Hi Kanagarasu,
Can we have a scenario like Site A --> replicate to Site B and Site C in async logreplay mode?
All 3 sites are in different regions and there is no HA. We need to add one more region for replication.
Regards,
Harish