Purpose


This three-part blog series gives insight into the high availability of SAP HANA scale-up systems using HANA System Replication and automated failover with the SUSE High Availability Extension (SUSE HAE), implemented here on POWER servers. It can also be used as a reference for x86_64 servers, as the steps remain the same.

This first blog gives an overview of System Replication and the SUSE High Availability Extension. It also describes the methods, services, and agents involved in setting up automatic failover of HANA.

Scope


Below are the system versions on which this document was prepared. For up-to-date information, always consult the latest SAP Notes and guides when implementing high availability for HANA.

HANA Version - HDB 2.0.030.00 (Scale-Up)

SUSE Version - SLES for SAP Applications 12 SP 03 (without SLES for SAP Applications, customers do not receive the resource agents required for SAP)

Scenario Implemented - Performance Optimized Scenario

Reference


SUSE Best Practices



SAP Documentation



HANA High Availability Overview


The SAP HANA database runs mission-critical applications, and it is important that these systems remain available to users at all times. This requires that the systems can recover quickly after a system component failure (high availability) or after a disaster (disaster recovery), without any data loss (zero RPO) and within a very short recovery time (low RTO).

To provide fault recovery, the SAP HANA software includes a watchdog function that automatically restarts configured services (index server, name server, and so on) if they fail. In addition to these features, SAP and its partners offer the following high availability mechanisms for SAP HANA. These solutions are based on fully redundant servers and/or storage.

  • Host Auto-Failover: One (or more) standby nodes are added to an SAP HANA system and configured to work in standby mode. In case of failure, the data and log volumes of a failed worker node are taken over by a standby node, which becomes a worker node and takes over the user load. This solution does not need additional storage, only servers.

  • SAP HANA System Replication: SAP HANA continuously replicates all data to a secondary SAP HANA system. Data can be kept pre-loaded in the memory of the secondary system to minimize the recovery time objective (RTO). This solution needs additional servers and storage. The focus of this reference architecture is SAP HANA System Replication.

  • Storage Replication: Data replication is achieved by means of storage mirroring independent from the database software. Disks are mirrored without a control process from the SAP HANA system. SAP HANA hardware partners offer this solution. This solution needs additional servers and storage.



SAP HANA System Replication


SAP HANA System Replication is implemented between two different SAP HANA systems with the same number of active nodes. After system replication is set up between the two SAP HANA systems, it replicates all the data from the primary HANA system to the secondary HANA system (initial copy). After that, any logged changes in the primary system are also sent to the secondary system. The following replication modes are available for this procedure (a minimal command sketch for setting up replication follows the list):

  • Synchronous on disk (mode=sync): Transaction is committed after log entries are written on primary and secondary systems.

  • Synchronous in memory (mode=syncmem): Transaction is committed after the secondary system receives the logs, but before they are written to disks.

  • Asynchronous (mode=async): Transaction is committed after log entries are sent without any response from the secondary system.

  • Full Sync: Full synchronization is supported by SAP but cannot be configured with SUSE HAE. In full sync operation, transaction processing on the surviving node is suspended when the other node is down, so a failover with SUSE HAE is not possible.
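
As an illustration of how this is set up, below is a minimal command sketch for enabling system replication between the two scale-up systems. It uses the SID SLH and instance number 00 from this setup; the site names SITEA/SITEB and the host name placeholder are illustrative, and mode=sync is just one of the options listed above.

  # On the primary node, as the <sid>adm user (slhadm), enable system replication:
  hdbnsutil -sr_enable --name=SITEA

  # On the secondary node, stop HANA and register it against the primary:
  HDB stop
  hdbnsutil -sr_register --remoteHost=<primary-host> --remoteInstance=00 \
            --replicationMode=sync --operationMode=logreplay --name=SITEB
  HDB start

  # Check the replication state from the primary:
  hdbnsutil -sr_state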


If the primary SAP HANA system fails, the system administrator must perform a manual takeover. Takeover can be performed using SAP HANA Studio or the command line. Manual failover requires continuous monitoring and can lead to longer recovery times. To automate the failover process, the SUSE Linux Enterprise High Availability Extension (SUSE HAE) or a third-party vendor solution can be used. Using SUSE HAE for the takeover process helps customers achieve service level agreements for SAP HANA downtime by enabling faster recovery without manual intervention.
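
Without a cluster, such a manual takeover on the secondary node would look roughly as follows (a sketch, run as the <sid>adm user):

  # On the secondary node, promote it to primary:
  hdbnsutil -sr_takeover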

FYI: HANA 2.0 SPS 03 introduces the following new System Replication capabilities: Secondary Time Travel, Multi-Target Replication, and Invisible Takeover. Their usage and details are not discussed in this blog, but if you are interested, see SAP HANA 2.0 SPS 03 What’s New: High Availability – by the SAP HANA Academy.


SUSE High Availability Extension (HAE) Resource Agents (RA)


SUSE has implemented the scale-up scenario with the SAPHana resource agent (RA), which performs the actual check of the SAP HANA database instances. This RA is configured as a master/slave resource. In the scale-up scenario, the master assumes responsibility for the SAP HANA databases running in primary mode, and the slave is responsible for instances that are operated in synchronous (secondary) status.

To make configuring the cluster as simple as possible, SUSE also developed its SAPHanaTopology resource agent. This agent runs on all nodes of an SLE 12 HAE cluster and gathers information about the status and configuration of SAP HANA system replication. It is designed as a normal (stateless) clone.

SAP HANA System replication for Scale-Up is supported in the following scenarios or use cases:

  • Performance Optimized

  • Cost Optimized

  • Multi-tier

  • Multi-tenancy or MDC


Concept of the Performance Optimized Scenario

In case of a failure of the primary SAP HANA on node 1 (node or database instance), the cluster first tries to start a takeover. This allows the data already loaded at the secondary site to be used; typically, a takeover is much faster than a local restart. To automate this resource handling, we can use the SAP HANA resource agents included in SAPHanaSR. System replication of the productive database is automated with SAPHana and SAPHanaTopology, and the details of these agents and their usage are explained below.

You can set the level of automation via the parameter AUTOMATED_REGISTER. If automated registration is activated, the cluster will also automatically register a former failed primary as the new secondary.
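
To illustrate how this appears in the cluster configuration, below is a sketch of the SAPHana master/slave resource in crm syntax, using SID SLH and instance number 00 from this setup. The resource names and timeout values are only examples; refer to the SUSE best practices guide for the recommended values.

  # SAPHana primitive (crm shell); AUTOMATED_REGISTER controls automatic re-registration
  primitive rsc_SAPHana_SLH_HDB00 ocf:suse:SAPHana \
      params SID=SLH InstanceNumber=00 PREFER_SITE_TAKEOVER=true \
             DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false \
      op start timeout=3600 op stop timeout=3600 op promote timeout=3600 \
      op monitor interval=60 role=Master timeout=700 \
      op monitor interval=61 role=Slave timeout=700

  # Master/slave (multi-state) resource: the master runs the primary, the slave the secondary
  ms msl_SAPHana_SLH_HDB00 rsc_SAPHana_SLH_HDB00 \
      meta clone-max=2 clone-node-max=1 interleave=true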

For more insight into these scenarios, refer to the links provided in the Reference section.

SAPHana Agent


The SAPHana resource agent manages two SAP HANA database systems which are configured in system replication; SAPHana supports scale-up scenarios. Managing the two SAP HANA database systems means that the resource agent controls the start/stop of the instances. In addition, the resource agent is able to monitor the SAP HANA databases and check their availability at the landscape host configuration level. For this monitoring, the resource agent relies on interfaces provided by SAP. A third task of the resource agent is to check the synchronization status of the two SAP HANA databases: if the synchronization status is not "SOK", the cluster avoids failing over to the secondary side when the primary fails, in order to preserve data consistency.

The resource agent uses the following four interfaces provided by SAP:

  • sapcontrol/sapstartsrv: The interface sapcontrol/sapstartsrv is used to start/stop a HANA database instance/system

  • landscapeHostConfiguration: This interface is used to monitor a HANA system. The Python script is named landscapeHostConfiguration.py and provides detailed output about the HANA system status and node roles. For our monitoring, the overall status is relevant. This overall status is reported by the return code of the script: 0: internal fatal, 1: ERROR, 2: WARNING, 3: INFO, 4: OK. The SAPHana resource agent interprets return code 0 as FATAL, 1 as not-running or ERROR, and return codes 2, 3, and 4 as RUNNING.

  • hdbnsutil: The interface hdbnsutil is used to check the "topology" of the system replication as well as the current configuration (primary/secondary) of an SAP HANA database instance. A second task of the interface is to run a system replication takeover (sr_takeover) or to register a former primary against the new primary (sr_register).

  • hdbsql / systemReplicationStatus: This interface is an SQL query into HANA (the system replication table). The hdbsql query is replaced by the Python script systemReplicationStatus.py as of SAP HANA SPS 8 or 9. As long as hdbsql is used, you need to set up a secure store user for the Linux user root to be able to access the SAP HANA database: configure a secure store user key "SAPHANASR" which can connect to the SAP HANA database (see the sketch after this list).

  • saphostctrl: The interface saphostctrl uses the function ListInstances to figure out the virtual host name of the SAP HANA instance. This is the hostname used during the HANA installation.
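
For reference, a sketch of how two of these interfaces are typically exercised manually. The secure store key name SAPHANASR matches the text above; the SQL port, user, and password are placeholders that depend on your installation (3<instance>15 for a single-container system, 3<instance>13 for the system database of an MDC system).

  # Check the overall HANA status and its return code (as root; slhadm is the <sid>adm user)
  su - slhadm -c "HDBSettings.sh landscapeHostConfiguration.py"
  echo $?        # 0=fatal, 1=error, 2=warning, 3=info, 4=ok

  # Create the secure store key "SAPHANASR" for the Linux user root (needed while hdbsql is used)
  /usr/sap/SLH/HDB00/exe/hdbuserstore SET SAPHANASR <hostname>:30015 <monitoring-user> <password>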


SAPHanaTopology Agent


This resource agent (RA) analyzes the SAP HANA topology and "sends" all findings via node status attributes to all nodes in the cluster. These attributes are used by the SAPHana RA to control the SAP HANA databases. In addition, it starts and monitors the local saphostagent. A configuration sketch for this agent follows the interface list below.

  • landscapeHostConfiguration.py (interface to monitor a HANA system): landscapeHostConfiguration.py provides detailed output about the HANA system status and node roles. For our monitoring, the overall status is relevant. This overall status is reported by the return code of the script: 0: internal fatal, 1: ERROR, 2: WARNING, 3: INFO (possibly a switch of the running resource), 4: OK. The SAPHanaTopology resource agent interprets return code 1 as NOT-RUNNING (or one failure) and return codes 2, 3, and 4 as RUNNING. SAPHanaTopology scans the output table of landscapeHostConfiguration.py to identify the roles of the cluster node; roles means the configured and current role of the nameserver as well as the indexserver.

  • hdbnsutil: The interface hdbnsutil is used to check the "topology" of the system replication as well as the current configuration (primary/secondary) of an SAP HANA database instance. A second task of the interface is to run a system replication takeover (sr_takeover) or to register a former primary against the new primary (sr_register).

  • saphostctrl: The interface saphostctrl uses the function ListInstances to figure out the virtual host name of the SAP HANA instance. This is the hostname used during the HANA installation.
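
A sketch of the corresponding SAPHanaTopology clone resource in crm syntax, again with SID SLH and instance number 00 from this setup and illustrative resource names and timeouts:

  # SAPHanaTopology primitive (crm shell); gathers topology attributes on every node
  primitive rsc_SAPHanaTopology_SLH_HDB00 ocf:suse:SAPHanaTopology \
      params SID=SLH InstanceNumber=00 \
      op monitor interval=10 timeout=600 \
      op start timeout=600 op stop timeout=300

  # Stateless clone so the agent runs on all cluster nodes
  clone cln_SAPHanaTopology_SLH_HDB00 rsc_SAPHanaTopology_SLH_HDB00 \
      meta clone-node-max=1 interleave=true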


IP Agent


This Linux-specific resource manages virtual (alias) IP addresses. When the resource is created, the virtual IP is attached to the primary site; in case of a failover, it moves to the secondary.
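
For the virtual IP, a minimal sketch using the IPaddr2 resource agent together with the constraints that make the address follow the SAPHana master. The IP is the (masked) virtual IP from the parameter table at the end of this blog; resource names are illustrative.

  # Virtual IP primitive (crm shell)
  primitive rsc_ip_SLH_HDB00 ocf:heartbeat:IPaddr2 \
      params ip=10.XX.YYY.219 \
      op monitor interval=10 timeout=20

  # Keep the virtual IP on the node running the primary (master) SAPHana instance,
  # and start SAPHanaTopology before SAPHana
  colocation col_ip_with_master 2000: rsc_ip_SLH_HDB00:Started msl_SAPHana_SLH_HDB00:Master
  order ord_topology_before_hana Optional: cln_SAPHanaTopology_SLH_HDB00 msl_SAPHana_SLH_HDB00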

SUSE HAE Supported Scenarios and Pre-requisites


With the SAPHanaSR resource agent software package, SUSE limits support to scale-up (single-box to single-box) system replication with the following configurations and parameters:

  • Two-node clusters.

  • The cluster must include a valid STONITH method.
    - Any STONITH mechanism supported by SLE 12 HAE (like SBD, IPMI) is supported with SAPHanaSR.
    - This guide focuses on the sbd fencing method, as it is hardware independent.
    - If you use sbd as the fencing mechanism, you need one or more shared drives. For productive environments, we recommend more than one sbd device (a configuration sketch follows the Important note below).

  • Both nodes are in the same network segment (layer 2).

  • Technical users and groups, such as <sid>adm, are defined locally in the Linux system.

  • Name resolution of the cluster nodes and the virtual IP address must be done locally on all cluster nodes.

  • Time synchronization between the cluster nodes using NTP.

  • Both SAP HANA instances have the same SAP Identifier (SID) and instance number.

  • If the cluster nodes are installed in different data centers or data center areas, the environment must match the requirements of the SLE HAE cluster product, in particular regarding network latencies and the recommended maximum distance between the nodes. Please review the SLE HAE product documentation for these recommendations.

  • Automated registration of a failed primary after takeover.
    - As a good starting configuration for projects, we recommend switching off the automated registration of a failed primary. The setting AUTOMATED_REGISTER="false" is the default. In this case, you need to register a failed primary manually after a takeover, using SAP tools such as SAP HANA Studio or hdbnsutil.
    - For optimal automation, we recommend AUTOMATED_REGISTER="true".

  • Automated start of SAP HANA instances during system boot must be switched off.


Important: Valid STONITH Method - Without a valid STONITH method, the complete cluster is unsupported and will not work properly.
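
As an illustration of the sbd-based fencing referred to above, the following sketch initializes the shared disk for sbd and defines the STONITH resource. The device path is the one from the parameter table at the end of this blog; the timeout values are examples only and must be validated against the SUSE documentation for your environment.

  # Initialize the shared sbd device (run once, from one node)
  sbd -d /dev/disk/by-id/scsi-360000970000197700209533031354139 -4 180 -1 90 create

  # Reference the device in /etc/sysconfig/sbd on both nodes so the sbd daemon starts with the cluster
  SBD_DEVICE="/dev/disk/by-id/scsi-360000970000197700209533031354139"

  # STONITH resource for sbd (crm shell)
  primitive stonith-sbd stonith:external/sbd \
      params pcmk_delay_max=30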

Solution Overview


This blog provides an example configuration of SAP HANA high availability using SAP HANA System Replication and automated takeover with SUSE HAE and the SAPHanaSR resource agents. The SAP HANA scale-up systems run on IBM servers based on the POWER architecture. Each HANA server runs in its own LPAR, and the two LPARs are hosted on two different physical boxes, so the two SAP HANA scale-up systems are installed on separate physical hardware.

We have installed HANA as a multitenant database container (MDC), which allows multiple tenant databases in a single SAP HANA system running on a single server.

Parameter | Value | Role
Cluster Node 1 | XXXXXXXXX4021 | Cluster node name
Cluster Node 2 | YYYYYYYYY4022 | Cluster node name
SID | SLH (SYSTEMDB), SL1 (TenantDB) | SAP identifier (SID)
Instance Number | 00 | Instance number of the SAP HANA database; for system replication, Instance Number+1 is also blocked
Network Address | 10.XX.YYY.0/24 |
Network Mask | 10.XX.YYY.255 |
Virtual IP Address | 10.XX.YYY.219 |
Storage | Storage for HDB data and log files is connected "locally" (per node; not shared) |
SBD | /dev/disk/by-id/scsi-360000970000197700209533031354139 | STONITH device
Hawk Port | 7630 |
NTP Server | Address or name of your time server |


The next blog covers the solution design of this reference architecture.

Part 2: HANA Scale-Up HA with System Replication & Automated Failover using SUSE HAE on SLES 12 SP 3 – Part ...