Architecting solutions on SAP BTP for High Availability
SAP Business Technology Platform (BTP) is an open platform built on a multi-cloud foundation, which lets it run on top of different hyperscalers such as Microsoft Azure, AWS, GCP, and Alibaba Cloud. The partnership with these hyperscalers has made it possible for SAP to scale and offer SAP BTP across various regions. This provides a lot of flexibility for customers who want to co-locate SAP BTP with their existing software solutions and leverage its capabilities to either extend or integrate solutions. Once a customer subscribes to SAP BTP, they can create subaccounts for extension/integration scenarios in any of the available regions/IaaS providers. You can explore all the available capabilities of SAP BTP in the SAP Discovery Center.
From a service capability standpoint, a lot of work is in progress to surface the capabilities of the underlying hyperscaler within SAP BTP. For example, SAP BTP offers PostgreSQL and Object Store services, which in turn leverage the corresponding hyperscaler capabilities. For a customer this is transparent, and they do not need to worry much about how this works in AWS or Azure. The Kyma environment (based on Kubernetes) on SAP BTP exposes a catalog of services from other hyperscalers, which customers can subscribe to separately and use to architect their solution with best-of-breed services.
Work has also been carried out to offer a private connection between apps deployed on SAP BTP and the hyperscalers, to enhance performance and security: SAP Private Link.
This blog post came out of a recent customer discussion on how to architect solutions that support critical applications running on SAP BTP for High Availability (HA). Since SAP BTP runs on top of hyperscalers, it can benefit from the proven multi-Availability Zone concepts which the hyperscalers leverage.
Availability zones (AZs) are single failure domains within one geographical region: separate physical locations with independent power, network, and cooling. Multiple AZs exist in one region and are connected to each other through a low-latency network. SAP BTP services run on this multi-AZ concept of the underlying hyperscalers, thereby offering High Availability. Hence, if there is an outage in one of the AZs, the service/application will self-heal and continue to be served from the other AZs in the region.
The SAP BTP Service Description guide provides more information on the SLA for the services. As of 1-Aug-2021, SAP announced an increased availability of 99.95% for several critical services of SAP BTP. The SLAs are constantly reviewed, and there are many activities under way to increase the SLA commitments. Please review the roadmap and documentation to obtain the latest SLAs.
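To put an availability percentage in perspective, it helps to convert it into a downtime budget. The following is a minimal sketch of that arithmetic only. The function name and the 30-day period are my assumptions for illustration; the actual measurement period and exclusions are defined in the Service Description guide.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the downtime budget in minutes for an SLA over `days` days.

    Assumption: a flat period of `days` days with no exclusions. The real
    contractual calculation may differ.
    """
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# Compare a few availability levels over a hypothetical 30-day month.
for sla in (99.9, 99.95, 99.99):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% -> {minutes:.1f} minutes of downtime per 30 days")
```

Under these assumptions, 99.95% over 30 days leaves a budget of roughly 21.6 minutes, which is half of what 99.9% would allow.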
Another important topic which gets discussed a lot with customers is maintenance windows. The Maintenance Window documentation for SAP Cloud services lists both the regular maintenance windows and the major upgrade windows. During the regular maintenance windows, changes/bug fixes are rolled out which do not impact any of the services (zero-downtime maintenance, ZDM). However, SAP reserves four windows (one each quarter) to perform major upgrades. These could include major changes related to network/security/DB and would be communicated to customers in advance.
Now that we have covered some of the basic concepts, let’s look at some of the options which are available to architect solutions which are highly available. Please note that these are just possible scenarios which you can test out before productionizing them.
The ability to create SAP BTP accounts spread across the globe on different IaaS providers gives us more flexibility to architect such solutions. Whether you are developing a Fiori app as an extension or an interface using Integration Suite, you can easily move these artifacts between BTP accounts across different providers.
For illustration purposes, I am using as an example a large organization that has invested a lot with Microsoft and is leveraging Microsoft Azure for SAP and third-party workloads. This customer has operations across the EU and the US.
In the Solution Diagram below, I am using multiple subaccounts for staged development: DEV/TST/PRD (in Azure US) and another PRD (in Azure EU). All the development and testing happens in the subaccounts marked DEV/TST, and the changes are pushed across the landscape using the Transport Management service. As you can see, there are two productive subaccounts across two regions. Please refer to the documentation on best practices for deploying applications. The roadmap is constantly updated for the Transport Management & CI/CD services. Please check whether the artifacts you are looking to transport are supported.
Scenario 1: Cross-region failover with distribution of load
This scenario assumes you have end users scattered across the EU and the US and would like to load balance the Fiori Launchpad exposed on SAP BTP. The Fiori Launchpad is deployed on both SAP BTP accounts, in the EU and the US. The same custom domain has been configured for both Launchpads. For illustration purposes, I have used Azure Traffic Manager, which is a DNS-based traffic load balancer. Traffic Manager distributes the traffic between these two launchpads and also routes all traffic to one instance of the launchpad when the other goes down (for example, during a maintenance window which results in service disruption). There are different routing rules which you can configure to provide the best experience for your end users. There is best practice documentation which explains some methods for identifying a failover and the use of rule-based solutions like Akamai ION.
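The routing behaviour described above can be sketched as a small simulation. This is not the Azure Traffic Manager API and makes no network calls; it only illustrates the idea of priority-based failover and weighted distribution across healthy endpoints. The `Endpoint` class, the function names, and the `example.com` hostnames are hypothetical and introduced purely for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    hostname: str      # hypothetical hostname, not a real BTP URL
    priority: int      # lower number = preferred in priority routing
    weight: int        # relative share in weighted routing
    healthy: bool = True  # would normally come from a health probe


def pick_priority(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Failover: return the healthy endpoint with the lowest priority number."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda e: e.priority)


def pick_weighted(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Load distribution: pick a healthy endpoint at random, by weight."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None
    return random.choices(healthy, weights=[e.weight for e in healthy], k=1)[0]


endpoints = [
    Endpoint("us", "launchpad-us.example.com", priority=1, weight=50),
    Endpoint("eu", "launchpad-eu.example.com", priority=2, weight=50),
]

print(pick_priority(endpoints).name)   # us (preferred while healthy)

# Simulate the US launchpad failing its health probe.
endpoints[0].healthy = False
print(pick_priority(endpoints).name)   # eu (failover)
print(pick_weighted(endpoints).name)   # eu (only healthy endpoint left)
```

Because the failover happens at DNS level, clients only switch once their cached DNS record expires, so the record's time-to-live is one of the settings worth testing in such a setup.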
Scenario 2: In-country failover
In one of my recent engagements, I was dealing with a customer that operates in a highly regulated industry and was looking for ways to architect a solution on SAP BTP for one of their mission-critical apps. Due to tight regulations, the solution had to be based on providers within the country. In Australia, we have SAP BTP on two providers: AWS & Azure (similar to many other regions). This provides an option to architect a highly available solution across two providers. For the end user, this is transparent, as they would be directed to one of the Fiori launchpads depending on load and availability.
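As a rough way to reason about why two deployments help, here is a minimal sketch of the standard formula for redundant components. It rests on a strong assumption: the deployments fail independently of each other, and the routing layer in front of them is ignored. Real failures are rarely fully independent, so treat the result as an upper bound for discussion rather than a commitment. The function name is my own.

```python
from typing import Iterable


def combined_availability(availabilities: Iterable[float]) -> float:
    """Availability of a setup that is up when at least one deployment is up.

    Assumption: deployments fail independently. Inputs are fractions,
    for example 0.9995 for 99.95%.
    """
    probability_all_down = 1.0
    for availability in availabilities:
        probability_all_down *= 1.0 - availability
    return 1.0 - probability_all_down


# Two deployments, each assumed to offer 99.95%.
result = combined_availability([0.9995, 0.9995])
print(f"{result * 100:.6f}%")
```

The design point is that the probability of both deployments being down at the same time is the product of the individual unavailabilities, which is why keeping their failure causes (provider, maintenance window) separate matters.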
Another interesting thing to note in the above solution diagram is that the maintenance windows for SAP BTP on each of these providers fall on different weekends. Hence, it's very unlikely that a major upgrade to an SAP BTP service would affect the same service across all the providers in the country at the same time. Obviously, there are a lot of other aspects to think through with this setup, especially when you are looking to set up Azure ExpressRoute or AWS Direct Connect for connectivity into your on-premise systems.
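If you want to verify this for your own landscape, a simple overlap check is enough. The sketch below uses made-up dates and durations; the `Window` type and the provider labels are hypothetical, and the real schedule should be taken from the Maintenance Window documentation.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Tuple


class Window(NamedTuple):
    provider: str
    start: datetime
    duration: timedelta

    @property
    def end(self) -> datetime:
        return self.start + self.duration


def windows_overlap(a: Window, b: Window) -> bool:
    """Two intervals overlap if each one starts before the other ends."""
    return a.start < b.end and b.start < a.end


def overlapping_pairs(windows: List[Window]) -> List[Tuple[Window, Window]]:
    """Return pairs of windows on different providers that overlap in time."""
    pairs = []
    for i, a in enumerate(windows):
        for b in windows[i + 1:]:
            if a.provider != b.provider and windows_overlap(a, b):
                pairs.append((a, b))
    return pairs


# Hypothetical major upgrade windows, one week apart.
windows = [
    Window("provider-a", datetime(2024, 3, 9, 8, 0), timedelta(hours=4)),
    Window("provider-b", datetime(2024, 3, 16, 8, 0), timedelta(hours=4)),
]

conflicts = overlapping_pairs(windows)
print("No overlapping windows" if not conflicts else conflicts)
```

An empty result means there is always at least one provider outside its upgrade window, which is the property the failover setup relies on.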
I hope this blog post gave you some ideas on how you can leverage SAP BTP across different regions/providers to architect highly available solutions. Though I used the Fiori Launchpad as an example, the same approach could be used for other scenarios like Integration, Workflows, Automation, etc. I am keen to hear from the community if someone has tried to implement this setup.