SAP HANA HA and DR Series #7: Log Replication Modes
Good day everyone,
Following the earlier posts on System Replication and on its operation modes and useful parameters, today I would like to cover the log replication options offered by SAP HANA.
There are 4 log replication modes:
Synchronous in-memory (syncmem): This is the default log replication mode. In this mode, the primary node waits for an acknowledgement that the log has been received by the secondary node before committing a transaction. As long as the replication status is ACTIVE for all services, a failover to the secondary node does not lose any data. However, if at least one service has a status other than ACTIVE, a failover might result in data loss, because some changes committed on the primary node may not have reached the secondary due to a connectivity or service failure.
This option offers two main advantages over the other synchronous modes: shorter transaction delay and better performance. Because the primary node waits only for the secondary node to "receive" the logs, the transaction delay on the primary node is shorter (essentially the log transmission time). And because the primary node does not wait for any I/O or disk write activity on the secondary node, it performs better than it would in sync mode.
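The failover rule above can be written as a small check. This is a minimal Python sketch, not SAP tooling: the function name `failover_is_lossless` and the sample service names and statuses are illustrative assumptions. On a real system the per-service status would come from the replication monitoring of the primary node.

```python
def failover_is_lossless(service_status: dict[str, str]) -> bool:
    """Return True only if every replicated service reports ACTIVE."""
    # An empty mapping means we know nothing, so do not claim safety.
    if not service_status:
        return False
    return all(status == "ACTIVE" for status in service_status.values())


# Hypothetical snapshots of per-service replication status.
healthy = {"nameserver": "ACTIVE", "indexserver": "ACTIVE"}
degraded = {"nameserver": "ACTIVE", "indexserver": "ERROR"}

print(failover_is_lossless(healthy))
print(failover_is_lossless(degraded))
```

A single service that is not ACTIVE is enough to make the check fail, which mirrors the "at least one service" condition in the text.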
The waiting period of the primary system is determined by the logshipping_timeout parameter in global.ini. Its default value is 30 seconds; if no acknowledgement is received from the secondary node within that time, the primary node continues without replicating the data.
This replication mode can be ideal for a system replication setup used as a high availability and disaster recovery solution, especially if both nodes are in the same data center or very close to each other.
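To make the timeout behaviour concrete, here is a toy Python model of a primary node in a synchronous mode. It is a sketch under stated assumptions, not how SAP HANA is implemented: the class `PrimarySketch`, its fields, and the returned strings are invented for illustration, and the rule that later commits stay unreplicated until the secondary reconnects is my simplification.

```python
from dataclasses import dataclass


@dataclass
class PrimarySketch:
    """Toy model of a primary node in sync or syncmem mode (not SAP code)."""

    logshipping_timeout_s: float = 30.0  # default quoted in the text
    replication_active: bool = True

    def commit(self, ack_delay_s: float) -> str:
        """Return how a commit finishes, given the secondary's ack delay."""
        if not self.replication_active:
            # Assumption: once replication has dropped, commits no longer wait.
            return "committed locally (not replicated)"
        if ack_delay_s <= self.logshipping_timeout_s:
            return "committed after acknowledgement"
        # Timeout reached: stop waiting and carry on without the secondary.
        self.replication_active = False
        return "committed locally (not replicated)"


primary = PrimarySketch()
print(primary.commit(ack_delay_s=0.002))  # secondary answers quickly
print(primary.commit(ack_delay_s=45.0))   # secondary is slower than the timeout
```

The second call is the case the text warns about: the transaction is committed on the primary, but the secondary never confirmed it.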
Synchronous (sync): In this mode, the primary node waits for an acknowledgement that the log has been received AND persisted by the secondary node before committing a transaction.
The key benefit of this option compared to syncmem is the consistency between the primary and secondary nodes: the primary node will not commit a transaction until the secondary node has received and persisted the logs.
As with syncmem, the waiting period of the primary system is 30 seconds by default and is determined by the logshipping_timeout parameter in global.ini. If no acknowledgement is received from the secondary node within 30 seconds, the primary node continues without replicating the data.
This replication mode can be ideal for a system replication setup used as a high availability solution, especially if both nodes are in the same data center.
Synchronous (full sync): Full sync replication was introduced with SPS08 as an additional option for the sync mode. This mode guarantees zero data loss because the primary node waits until the secondary node has received the logs and persisted them to disk; if the secondary system is unavailable, transaction processing on the primary node is blocked until it becomes available again. This ensures that no transaction can be committed on the primary node without its logs being shipped to the secondary site.
This mode can be activated via the enable_full_sync parameter in the system replication section of the global.ini file. When the parameter is disabled, full sync is not configured. If you enable the parameter in a running system, full sync is configured but not activated immediately, to prevent transactions from being blocked.
Full sync is fully activated only when the parameter is enabled AND the secondary node is connected. You will then see the REPLICATION_STATUS become ACTIVE. From that point on, if there is a network connectivity issue between the two nodes, transactions on the primary node are blocked until the secondary node is back.
This replication mode can be ideal for multitier system replication configurations, especially between tier 2 and tier 3 nodes for data protection. It can also be used for a system replication HA configuration when both nodes are in the same local area network and data protection is the number one priority.
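The activation rules described above can be summarised as a small state machine. The following Python sketch is illustrative only: the class `FullSyncSketch`, the method `observe`, and the state names are my own labels, not SAP terminology or an SAP API.

```python
class FullSyncSketch:
    """Toy state machine for the full sync option (not SAP code)."""

    def __init__(self) -> None:
        self.active = False  # True once full sync has actually taken effect

    def observe(self, enable_full_sync: bool, secondary_connected: bool) -> str:
        """Return the state after seeing the parameter and connection status."""
        if not enable_full_sync:
            self.active = False
            return "not configured"
        if secondary_connected:
            self.active = True
            return "active"
        # The parameter is set but the secondary is not reachable.
        if self.active:
            return "blocking commits"
        return "configured, not yet active"


fs = FullSyncSketch()
print(fs.observe(enable_full_sync=True, secondary_connected=False))
print(fs.observe(enable_full_sync=True, secondary_connected=True))
print(fs.observe(enable_full_sync=True, secondary_connected=False))
```

The first and third calls take the same inputs but give different states. Enabling the parameter while the secondary is away does not block anything, whereas losing the secondary after full sync has become active does.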
Asynchronous (async): In this option, the primary node does not wait for any acknowledgement or confirmation from the secondary node. It commits a transaction once it has been written to the log file of the primary system, and sends the redo logs to the secondary node asynchronously. This option provides the best performance of all four, as the primary node does not have to wait for any data transfer between nodes or for I/O activity on the secondary node. However, async mode is more vulnerable to data loss than the other options: you should expect some data loss during a failover, especially an unexpected one.
This replication mode can be ideal for system replication as a DR solution in a distant location, or for companies where system performance is the number one priority. Because it still provides database consistency with near-zero data loss, it remains a favourable option. In a multitier system replication, async can also be used between tier 2 and tier 3 as a DR solution.
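To compare the four options side by side, the sketch below models only the extra time a commit spends waiting on the secondary node. It is a deliberately simplified Python model under my own assumptions: the function name, the mode labels as strings, and the sample timings are made up, and it ignores logshipping_timeout as well as the blocking behaviour of full sync when the secondary is unreachable.

```python
def replication_wait_ms(mode: str, transfer_ms: int, secondary_write_ms: int) -> int:
    """Extra time a commit waits on the secondary, per mode (toy model)."""
    if mode == "async":
        return 0  # no acknowledgement is awaited
    if mode == "syncmem":
        return transfer_ms  # wait until the log is received in memory
    if mode in ("sync", "fullsync"):
        return transfer_ms + secondary_write_ms  # received and persisted
    raise ValueError(f"unknown replication mode: {mode}")


# Hypothetical timings: 2 ms to ship the log, 1 ms to persist it remotely.
for mode in ("async", "syncmem", "sync", "fullsync"):
    print(mode, replication_wait_ms(mode, transfer_ms=2, secondary_write_ms=1))
```

With a distant secondary the transfer time dominates, which is why the text recommends async for remote DR sites and the synchronous modes for nodes that are close together.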
Have any questions about SAP HANA System Replication? Leave a comment below.
References and further reading:
How To Perform System Replication for SAP HANA
SAP Note 2165547 – FAQ: SAP HANA Database Backup & Recovery in an SAP HANA System Replication Landscape
SAP Note 1999880 – FAQ: SAP HANA System Replication
I have a question regarding system repl. log modes and the usage of the secondary site for dev&qas systems.
Do I only have the possibility to use async if my secondary site should be used for dev&qas systems?
As I have set a global mem. alloc limit and I've disabled preload_column_tables.
Or is syncmem also possible in this scenario?
Many thanks in advance!
Thanks for your message. I am more active on LinkedIn so apologies for my late answer.
You can use both options, but async might be the better option considering the secondary node is at a different site.
Hi Alper Somuncu,
Thanks a lot for the detailed explanation. I have a few doubts.
We used to disable replication and unregister the secondary, and a full replication is initiated when there is a gap of around 6 hours. If we restart replication within 4 hours, a full sync does not start; if the gap is above 5 to 6 hours, a full replication starts. How does this work? Is it based on log shipment, or on data and log sizes?
Thanks and Regards,
Hi Alper Somuncu,
First of all, the replication explanation is superb, thanks for that. I have a few doubts about system replication.
1. sync, syncmem, async -> primary does not wait for the secondary beyond the logshipping_timeout parameter -> 30 seconds (I read this in another post of your series)
2. full sync -> primary waits for the secondary until it receives the acknowledgement.
During that time the primary will be at a standstill and will not commit any transactions, meaning no transaction will be written to the DB, which might result in slowness or a hung state on the primary, right?
We used to disable replication and unregister the secondary.
When we restart replication within 4 to 5 hours, replication starts without a full sync.
When we restart after 6 hours, a full replication starts. How does this work? Is it based on log segments?
Thanks & Regards,