How to Set Up SAP HANA Multi-Target System Replication
SAP HANA System Replication is a reliable high availability and disaster recovery solution that continuously synchronizes a HANA database to a secondary location, either in the same data center or at a remote disaster recovery site.
This article outlines the steps to set up system replication when you want to replicate a primary to multiple targets, which may be in the same data center or at a remote site.
For clarity, let us assume the following.
Assumption 1
SITEA is the current primary
SITEB is the secondary (in the same data center as SITEA)
SITEC is the second secondary (at a remote location, usable as a DR site)
Assumption 2
SITEA and SITEB are managed by a cluster (several clustering products are available, for example Pacemaker)
As per the requirement, SITEC should also connect to the current primary SITEA as a second secondary
Since SITEA and SITEB are in the same data center, we will configure:
- Replication mode: synchronous in memory (syncmem)
- Operation mode: log replay (logreplay)
Since SITEC is in another data center as a remote site, we will configure:
- Replication mode: asynchronous (async)
- Operation mode: log replay (logreplay)
Note: The operation mode must be identical across a multi-target replication setup; a mixture (e.g. logreplay from SITEA to SITEB and delta_datashipping from SITEA to SITEC) isn't allowed.
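So, the overall scenario is as follows: SITEA replicates to SITEB with syncmem and to SITEC with async, both with operation mode logreplay.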
Assumption 3
Inter-node communication is established and all network-related requirements are in place, taking the remote site into account.
Each node can resolve the hostname of every other node.
Technical Steps to Configure Multi-Target HANA System Replication
- On the primary node (SITEA), log in as <SID>adm and enable replication
#hdbnsutil -sr_enable --name=SITEA
- On the secondary hosts (SITEB and SITEC), log in as <SID>adm and stop the HANA database
- As the root user, back up the existing SSFS data and key files on the secondary nodes (SITEB and SITEC)
#cd /usr/sap/<SID>/SYS/global/security/rsecssfs/
#mv data/SSFS_<SID>.DAT data/SSFS_<SID>.DAT.Orig
#mv key/SSFS_<SID>.KEY key/SSFS_<SID>.KEY.Orig
- Copy SSFS_<SID>.DAT and SSFS_<SID>.KEY from the primary node (SITEA) to the secondary nodes (SITEB and SITEC)
- On the secondary nodes (SITEB and SITEC), change the ownership of both files to <SID>adm:sapsys
- Log in as <SID>adm on each secondary node (both SITEB and SITEC) and register HANA system replication as below
On SITEB
#hdbnsutil -sr_register --remoteHost=SiteA_Hostname --remoteInstance=inst_num_of_SiteA \
--replicationMode=syncmem --operationMode=logreplay --name=SITEB
On SITEC
#hdbnsutil -sr_register --remoteHost=SiteA_Hostname --remoteInstance=inst_num_of_SiteA \
--replicationMode=async --operationMode=logreplay --name=SITEC
- Log in as <SID>adm and start HANA on SITEB and SITEC
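The two registrations above differ only in the site name and the replication mode. As a rough sketch, the helper below prints the `hdbnsutil -sr_register` command for each secondary instead of executing it, so the options can be reviewed before running them on a real host. The function name `sr_register_cmd` and the values `sitea-host` and `00` are placeholders chosen for illustration, not values from this setup.

```shell
#!/bin/sh
# Placeholder values (assumptions): use your primary host name and instance number.
PRIMARY_HOST="sitea-host"
INSTANCE_NO="00"

# Print (not execute) the registration command for one secondary site.
# $1 = site name, $2 = replication mode (syncmem or async)
sr_register_cmd() {
    printf 'hdbnsutil -sr_register --remoteHost=%s --remoteInstance=%s --replicationMode=%s --operationMode=logreplay --name=%s\n' \
        "$PRIMARY_HOST" "$INSTANCE_NO" "$2" "$1"
}

sr_register_cmd SITEB syncmem   # same data center as SITEA
sr_register_cmd SITEC async     # remote DR site
```

The operation mode is fixed to logreplay in the helper because, as noted earlier, it must be the same for every target.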
Validating system replication
On the primary system (SITEA), run the following as the <SID>adm user
#cdpy
This should take you to the following location
/usr/sap/<SID>/HDBXX/exe/python_support
Now run
#python systemReplicationStatus.py
In the example output, SITEA is replicating to SITEB with SYNCMEM and to SITEC with ASYNC.
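If the check is scripted rather than read by eye, the exit code of systemReplicationStatus.py is usually easier to test than its text output. The sketch below maps exit codes 10 to 15 to state names. Treat this mapping as an assumption and verify it against the documentation for your HANA revision before relying on it. The function name `sr_status_name` is made up for illustration.

```shell
#!/bin/sh
# Translate an exit code of systemReplicationStatus.py into a readable state.
# The code-to-state mapping is an assumption to verify on your own system.
sr_status_name() {
    case "$1" in
        10) echo "NoHSR" ;;
        11) echo "Error" ;;
        12) echo "Unknown" ;;
        13) echo "Initializing" ;;
        14) echo "Syncing" ;;
        15) echo "Active" ;;
        *)  echo "Unrecognized" ;;
    esac
}

# Typical use as <SID>adm on the primary (not executed here):
#   cdpy; python systemReplicationStatus.py; sr_status_name $?
sr_status_name 15
```

Under this mapping, only the Active state would indicate that all secondaries are connected and in sync.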
So far, the configuration looks simple and straightforward. The most challenging part is testing it successfully and achieving the expected outcome.
You may see the following challenges during testing:
- After the takeover, the other available secondary does not automatically connect to the new primary
- After the takeover, the other available secondary connects to the new primary with full data shipping, whereas the expectation is that only the missing/delta logs are applied on reconnect
To avoid these challenges, maintain the following parameters once all of the above configuration is done.
When the primary system replicates data changes to more than one secondary system, you should use force log retention and log retention propagation to reach an optimized re-sync and avoid a full data shipping after takeover or other disconnect situations.
Configure all nodes with
[system_replication]/enable_log_retention = force_on_takeover
During takeover on a secondary system, if force_on_takeover is set, HANA automatically changes the value to enable_log_retention = force after the takeover.
This means that the new primary, where you just performed the takeover, has the parameter value
[system_replication]/enable_log_retention = force
After the takeover is complete and you have done all system sanity checks, change the parameter from force back to force_on_takeover. This ensures that on the next takeover, the old primary can reconnect to the new primary with only the missing delta logs.
Set the following parameter in global.ini so that the available secondary registers automatically with the new primary
[system_replication]/register_secondaries_on_takeover = true
Suppose you take over on secondary SITEB in data center 1 from primary SITEA in the same data center. As a result, secondary SITEC in data center 2 will register automatically with the new primary SITEB in data center 1.
After the failure on the previous primary SITEA is resolved, register it with the new primary SITEB in data center 1.
If you want to be able to re-order your systems in a complex system replication landscape, set the following parameter in global.ini:
[system_replication]/propagate_log_retention = on
Log retention propagation retains the log based on the smallest savepoint log position in the whole system replication landscape.
To propagate log retention between all systems in a system replication landscape, set this parameter on all systems in the landscape.
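Because these three parameters have to match on every node, it can help to read them back from each node's global.ini after the change. The following is a minimal sketch, assuming a plain `key = value` layout under a `[system_replication]` section header. The helper name `sr_param` is hypothetical, and the demo runs against a throwaway file rather than a real configuration file.

```shell
#!/bin/sh
# Print the value of one key from the [system_replication] section of an ini file.
# $1 = path to a global.ini file, $2 = parameter name
sr_param() {
    awk -F'=' -v key="$2" '
        /^\[/ { in_section = ($0 ~ /^\[system_replication\]/) }
        in_section && $1 ~ ("^[ \t]*" key "[ \t]*$") {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            print value
            exit
        }
    ' "$1"
}

# Demo against a temporary file holding the three settings from this article.
demo_ini="$(mktemp)"
cat > "$demo_ini" <<'EOF'
[system_replication]
enable_log_retention = force_on_takeover
register_secondaries_on_takeover = true
propagate_log_retention = on
EOF

for key in enable_log_retention register_secondaries_on_takeover propagate_log_retention; do
    echo "$key = $(sr_param "$demo_ini" "$key")"
done
rm -f "$demo_ini"
```

On a new primary shortly after a takeover, the same check would be expected to show enable_log_retention = force until it is reset to force_on_takeover.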
DR-Drill Scenario
Ensure that replication is in sync with the secondary on which you intend to take over. Suppose that in a DR drill you decide to take over on SITEC from SITEA while keeping the cluster between SITEA and SITEB disabled; you may follow the sequence below.
Run the following commands as the <SID>adm user on SITEA
#cdpy
#python systemReplicationStatus.py
Check if replication is active and in sync.
On SITEC, perform the takeover as the <SID>adm user
#hdbnsutil -sr_takeover
#cdpy
#python systemReplicationStatus.py
Now SITEC is the current primary and SITEB is automatically attached to the new primary SITEC.
However, SITEC is replicating to SITEB with SYNCMEM, which is not what we want because SITEC is the remote DR site. This happens because SITEB's previous primary was SITEA, which replicated to SITEB with SYNCMEM.
To correct this, run the command below
#hdbnsutil -sr_changemode --mode=async
Run this command as the <SID>adm user on the secondary site whose replication mode you need to change.
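The rule applied throughout this article is that a secondary in the same data center as the primary uses syncmem, while a secondary in a different data center uses async. A small sketch of that rule follows. The function name `choose_replication_mode` and the data center labels `dc1` and `dc2` are hypothetical, and the script only prints the resulting command.

```shell
#!/bin/sh
# Pick a replication mode following the rule used in this article:
# same data center -> syncmem, different data center -> async.
# $1 = data center of the primary, $2 = data center of the secondary
choose_replication_mode() {
    if [ "$1" = "$2" ]; then
        echo "syncmem"
    else
        echo "async"
    fi
}

# After the takeover on SITEC (dc2), SITEB (dc1) is now a remote secondary.
mode="$(choose_replication_mode dc2 dc1)"
echo "hdbnsutil -sr_changemode --mode=$mode"
```

With SITEC as primary, the printed command matches the correction described above for SITEB.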
If SITEA becomes available again and you intend to keep SITEC as primary, you need to re-register SITEA to SITEC manually.
You will likely encounter this situation only during a DR drill, because as long as SITEB in the same data center is available, you will always prefer to take over to SITEB.
During testing or in a real scenario, ensure that after the takeover the available secondary registers with only the missing/delta logs and not with full data shipping. You can confirm this from the nameserver logs.
I hope this article helps you set up SAP HANA Multi-Target System Replication with only the missing delta logs shipped whenever a takeover happens, and with automatic reconnection of the available secondary to the new primary.
Thanks!!
Kindly share feedback or thoughts in a comment or ask questions if any.
Excellent blog..... All required information captured in a single thread !!!
Hi Ankit , Could you please suggest how config needs to be done when there is an additional system at DR site, means , Primary system A in data center 1 replicates data changes to secondary system B in the same data center. Primary system A also replicates data changes to secondary system C in data center 2. Secondary system C is a source system for a further secondary system D located in the same data center with system C.
Nice blog... have you tested it in a scale-out environment as well? I hope it won't make any difference.
Hi Ankit,
Can we have a scenario like SiteA replicating to SiteB and SiteC in async logreplay mode?
All 3 sites are in different regions and no HA is there. We need to add one more region for replication.
Regards,
Harish