High availability (HA), in simple terms, means protecting an application from unforeseen events such as hardware failure. There are many techniques for achieving it; this blog post focuses on the concepts behind configuring SAP HA in the public cloud.
Do We Need to Configure HA in the Cloud?
In general, cloud infrastructure is built with redundant components, but a piece of hardware can still fail at any time. Modern technology keeps driving availability higher, but at the same time businesses depend more and more on digital technology, so an outage has a far-reaching impact. At the end of the day, whether or not to have HA is a business decision.
What Are the Single Points of Failure in an SAP System?
What is an Availability Zone?
If we go back a few years, how was this done? We simply purchased 2 identical pieces of hardware, put them in the same building or maybe in 2 different buildings at the same site, and created a logical cluster.
With the cloud, this has changed to availability zones. Availability zones are physically separated locations within a region, each with 1 or more data centers, and an HA setup typically spans 2 of them. Cloud providers generally do not publish the exact distance between these locations, but they are usually assumed to be many miles apart while staying close enough (up to roughly 60 miles, or 100 km) to keep network latency low. Combined with that low latency, the physical separation provides a more robust option for high availability.
Application Single Points of Failure
All the data is stored in the database, and if the database becomes unavailable the application cannot function. In the old days, the majority of clusters used shared disks, which means the same data was available to both nodes. To protect data integrity, those clusters were configured so that only one node could access the data at any given time.
With availability zones there is no shared infrastructure, so we need a technique to keep the data in sync between the nodes. This can be accomplished with the replication or log shipping features of the database:
- Oracle Data Guard for Oracle.
- DB2 HADR for DB2.
- SAP Replication Server for SAP ASE (formerly Sybase).
- SQL Server Always On availability groups for SQL Server.
- HANA System Replication for SAP HANA.
On Linux, we can also use DRBD (Distributed Replicated Block Device), which replicates data at the block-device level instead of through the database.
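To make the idea of keeping two nodes in sync more concrete, here is a minimal Python sketch of synchronous log shipping. It is only a toy model: the `Node` and `ReplicatedPair` classes and the node names are made up for illustration and do not correspond to the API of any of the products listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A database node holding an ordered log of committed changes."""
    name: str
    log: list = field(default_factory=list)


class ReplicatedPair:
    """Hypothetical primary/standby pair using synchronous log shipping."""

    def __init__(self, primary, standby):
        self.primary = primary
        self.standby = standby

    def commit(self, record):
        # Write the record locally, then ship the same record to the
        # standby before the commit is considered complete.
        self.primary.log.append(record)
        self.standby.log.append(record)

    def failover(self):
        # Swap roles: the standby already holds every committed record.
        self.primary, self.standby = self.standby, self.primary
        return self.primary


pair = ReplicatedPair(Node("db-zone-a"), Node("db-zone-b"))
pair.commit("INSERT order 1001")
pair.commit("UPDATE order 1001")

new_primary = pair.failover()
print(new_primary.name, new_primary.log)
```

Because every commit reaches the standby before it completes, the node in the second zone can take over without losing committed data. Real products add networking, acknowledgements and asynchronous modes on top of this basic idea.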
Now we have a database that can potentially run on 2 nodes with 2 different IP addresses. From the application's perspective we need a single IP address, so that no manual intervention is required when the database fails over. This is where a virtual IP comes into play. A virtual IP is a floating IP address that moves between the two nodes and always points to the node that is currently active.
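The floating behaviour of a virtual IP can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `VirtualIP` class, the address `10.0.0.100` and the node names are hypothetical, and a real cluster would move the address through the cluster software and the cloud provider's networking rather than through a Python object.

```python
class VirtualIP:
    """Toy model of a floating IP address owned by exactly one node at a time."""

    def __init__(self, address, nodes):
        self.address = address
        self.nodes = list(nodes)
        self.owner = self.nodes[0]  # the first node starts as the active one

    def fail_over(self, failed_node):
        # Move the address only if the failed node currently owns it.
        if failed_node != self.owner:
            return
        survivors = [n for n in self.nodes if n != failed_node]
        if not survivors:
            raise RuntimeError("no surviving node to take over the virtual IP")
        self.owner = survivors[0]

    def resolve(self):
        # Clients only ever know self.address; this shows where it lands.
        return self.owner


vip = VirtualIP("10.0.0.100", ["db-node-1", "db-node-2"])
before = vip.resolve()
vip.fail_over("db-node-1")
after = vip.resolve()
print(vip.address, before, "->", after)
```

The application keeps connecting to the same address before and after the failover; only the node behind that address changes.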
SAP ASCS and ERS
SAP ASCS (also called Central Services) and ERS (Enqueue Replication Server) go hand in hand. ASCS covers 2 single points of failure: the SAP message server and the enqueue server. The enqueue server holds all application locks. When ASCS fails over without ERS, the message server starts on the other node and the application servers reconnect, but the locks are gone, which means all in-flight transactions are lost and users have to redo them. This is where ERS comes into play. ERS keeps a replica of the lock table from ASCS, and when ASCS fails over the lock table is rebuilt from that copy, so users are able to continue from where they left off.
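The difference ERS makes can be illustrated with a small Python sketch. It is a toy model, not SAP's implementation: the class names `EnqueueServer` and `ReplicationServer`, the lock key and the user names are assumptions chosen for the example.

```python
class ReplicationServer:
    """Toy ERS: keeps a copy of the lock table on another node."""

    def __init__(self):
        self.locks = {}


class EnqueueServer:
    """Toy enqueue server that holds application locks in memory."""

    def __init__(self, replica=None, restore_from=None):
        # After a failover the lock table starts empty unless a replica
        # is available to rebuild it from.
        self.locks = dict(restore_from.locks) if restore_from else {}
        self.replica = replica

    def lock(self, key, owner):
        if key in self.locks:
            return False  # someone else already holds the lock
        self.locks[key] = owner
        if self.replica is not None:
            self.replica.locks[key] = owner  # replicate to ERS
        return True


ers = ReplicationServer()
ascs = EnqueueServer(replica=ers)
ascs.lock("SALES_ORDER/4711", "alice")

# The ASCS node fails. Compare a restart without and with the ERS copy.
without_ers = EnqueueServer()
with_ers = EnqueueServer(restore_from=ers)
print(without_ers.locks, with_ers.locks)
```

Without the replica the new enqueue server comes up with an empty lock table, while the instance rebuilt from the ERS copy still knows that the order is locked.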
Would It Make Sense to Have ASCS and ERS on the Same Node?
Isn’t this like putting all the eggs in one basket? If the main lock table and the replicated table are on the same node, both are lost when that node goes down. By combining SAP parameters with the HA configuration capabilities of most cluster products (Pacemaker, HACMP, Veritas, MSCS), we can ensure that ASCS and ERS run on opposite nodes during normal operation.
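The rule "never run ASCS and ERS on the same node" is essentially an anti-colocation constraint. The Python sketch below shows the logic only; the function `place_ascs_and_ers` and the node names are hypothetical, and real cluster products express this rule in their own configuration rather than in code like this.

```python
def place_ascs_and_ers(healthy_nodes, preferred_ascs):
    """Return a placement that never puts ASCS and ERS on the same node."""
    if not healthy_nodes:
        raise RuntimeError("no healthy node available")

    # ASCS stays on its preferred node if that node is still healthy.
    if preferred_ascs in healthy_nodes:
        ascs = preferred_ascs
    else:
        ascs = healthy_nodes[0]

    # Anti-colocation: ERS may only use a node that ASCS is not on.
    # If no such node exists, ERS stays stopped instead of being co-located.
    others = [n for n in healthy_nodes if n != ascs]
    ers = others[0] if others else None
    return {"ASCS": ascs, "ERS": ers}


normal = place_ascs_and_ers(["node-a", "node-b"], preferred_ascs="node-a")
degraded = place_ascs_and_ers(["node-b"], preferred_ascs="node-a")
print(normal)
print(degraded)
```

With both nodes healthy the two services land on different nodes. With only one node left, the sketch runs ASCS and leaves ERS stopped, since a replica on the same node would not protect anything.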
SAP Application Servers
This is the easiest part of SAP HA. It is just a numbers game: the more application servers we have, the better the availability. When using availability zones, the key is to distribute them across both zones so that application servers remain available even if one zone goes down.
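As a small illustration of spreading application servers across zones, here is a Python sketch that assigns servers round-robin and then checks what is left after a zone failure. The server and zone names, as well as the helper functions `distribute` and `survivors`, are made up for the example.

```python
from collections import defaultdict
from itertools import cycle


def distribute(servers, zones):
    """Assign application servers to zones in round-robin order."""
    placement = defaultdict(list)
    for server, zone in zip(servers, cycle(zones)):
        placement[zone].append(server)
    return dict(placement)


def survivors(placement, failed_zone):
    """Return the servers that are still running after one zone is lost."""
    return [
        server
        for zone, members in placement.items()
        if zone != failed_zone
        for server in members
    ]


placement = distribute(["app1", "app2", "app3", "app4"], ["zone-a", "zone-b"])
remaining = survivors(placement, failed_zone="zone-a")
print(placement)
print(remaining)
```

With an even split, losing one zone still leaves half of the application servers running, so they need to be sized to carry the load on their own for the duration of the outage.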
Now that we have configured SAP HA, is that all? Not really. A common reason SAP HA fails in practice is that load balancing is not configured. Best practice is to use load balancing even if you have only one SAP application server, because it makes it easy to add more application servers to the landscape later.
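The following Python sketch shows why it pays to put load balancing in place from day one. It uses a simple "fewest active sessions" rule, which is an assumption made for the example and not necessarily how SAP distributes logons; the function `pick_server` and the server names are hypothetical.

```python
def pick_server(sessions):
    """Return the application server with the fewest active sessions."""
    if not sessions:
        raise ValueError("no application servers registered")
    return min(sessions, key=sessions.get)


# Start with a single application server behind the load balancer.
sessions = {"app1": 0}
for _ in range(3):
    sessions[pick_server(sessions)] += 1

# Later a second server is added; users need no new connection details.
sessions["app2"] = 0
next_target = pick_server(sessions)
print(sessions, next_target)
```

Because users always go through the balancer instead of a fixed host name, the new server starts receiving logons as soon as it is registered, and a failed server can be dropped from the list just as easily.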
Deploying a high availability architecture across multiple availability zones adds another layer of protection by distributing the SAP logical components across 2 independent physical locations connected with low latency. This design helps mitigate an outage caused by issues in one zone by moving the workload to the other zone.