SAP private linky swear with Azure – how many pinkies do I need? Architecture impact of SAP Private Link Service.
This post is part 3 of a series sharing service implementation experience and possible applications of SAP Private Link Service on Azure.
Find the table of contents and my curated news regarding series updates here.
Looking for part 4?
Find the associated GitHub repos here.
24th of Nov 2021: SAP introduced the hostname feature for PLS. Going forward, host names are used instead of private IPs. Not all screenshots below have been updated!
Continuing with the implementation journey of SAP Private Link we eventually need to reflect on our actual SAP system landscape outside of BTP. Are you running three SAP backend stages (dev, qa and prod), or are you one of those brave cloud folks having only prod and non-prod? Are they isolated in different network subnets or regions on Azure? Against which system stage do you develop from BTP? Are your workloads grouped by Subaccounts or by spaces within Cloud Foundry?
And most of all, what happens when your most productive backend system gets a “hiccup” and moves to its Disaster-Recovery-site? Should your workload on BTP anticipate that? Spoiler: Many customers only take care of disaster recovery for their SAP backend but completely forget about their integration layer in BTP.
So, in case you came here to see me cut off a metaphorical “pinky” again, you are in exactly the right place.
Fig.1 Many pinkies “swearing”
I extended the architecture overview from the first post to showcase a set of deployment views. I chose them in accordance with the customer deployment “styles” that I saw most often over time, and to outline configuration considerations given the variety of options. The focus is on deployment location and style rather than explaining its purpose in detail. High availability is anticipated like in the first post of the series. The VMs on the architecture diagrams below could represent a set of SAP application servers or Web Dispatchers, for instance. The load balancer behind the private link service is natively equipped to distribute traffic across a pool of systems (check fig.4 on the first post).
There are quite a number of possible permutations, but I believe you will be able to transform the presented cases to your individual one. In case I missed an important flavour causing a black hole somewhere: let me know and I might add it during the next iteration of the post series 😊
Let’s start with the “classic”.
Same Azure region, three stages mapped one-to-one, no DR site
Three stages of S4 and three subaccounts (or CF spaces, respectively), with recovery through backups or service healing only. Have a look at the SAP docs for more context on the subaccounts strategy.
Fig.2 Classic three stages architecture
In this scenario all involved components are deployed in the same Azure region (e.g. west EU). The SAP S4 DEV and QA systems are grouped in the same subnet and can therefore be accessed through the very same private link service. You configure the private link resources on BTP to connect against the service in the mentioned subnet. Traffic is routed based on the port that the caller addresses, as defined by the load balancing rules.
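The port-based routing can be pictured with a minimal Python sketch. It only models the idea of one private link service fronting both DEV and QA. The port numbers, host names, and the `resolve_backend` helper are made up for illustration and are not taken from the screenshots.

```python
from typing import Dict, Tuple

# Hypothetical load balancing rules: port addressed by the BTP app
# -> (backend host, backend port). All values are illustrative assumptions.
LOAD_BALANCING_RULES: Dict[int, Tuple[str, int]] = {
    44300: ("s4-dev.internal.example", 44300),
    44301: ("s4-qa.internal.example", 44300),
}


def resolve_backend(port: int) -> Tuple[str, int]:
    """Return the backend target for the port a BTP app connects to."""
    try:
        return LOAD_BALANCING_RULES[port]
    except KeyError:
        # No rule means the shared link service cannot forward this traffic.
        raise ValueError(f"no load balancing rule for port {port}") from None
```

With such a mapping, the DEV and QA spaces would use the same link service and differ only in the port they call.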
Fig.3 multiple BTP links registered with same instance of private link service in Azure
Fig.4 Screenshot of private link service load balancing rules
The “classic” setup mimics a typical setup that customers would configure with the Cloud Connector. It sacrifices the BTP workload in case of a failover of the SAP backend and hopes for speedy recovery.
Source: http://pictures.4ever.eu/, edited by me
I am going to spare you the image of a severed pinkie, because we lost the link between BTP and S4. Just kidding, here it is:
I wonder about a pinkie graveyard for all the lost pinkies (at least temporarily), because of the optimistic backup recovery strategy. On to the next one.
The untransparent one, where the link service is on the Hub network
Compared to the setup before we consolidate all private links from the individual spokes into the Hub VNet, where all other shared resources are.
Fig.5 Three stage architecture with private link on hub and firewall doing the routing
With that, all traffic – even from BTP via the private link – is bound to go through your corporate firewall.
The desire to have the link on the hub network is understandable when strong traffic inspection is required. In addition, it reduces the number of required private link service instances to one. On the downside you lose transparency, because the private link service would now only see the VM containing your firewall appliance. The SAP backend targets and the routing rules would only be maintained on your firewall from now on and be unknown to your Azure deployment.
Also, making changes to the private link service in such a scenario (trying new features, on-boarding other sandboxes etc.) becomes a high-risk operation, because of the potential to disrupt production when you alter or delete the wrong entry.
Multi Azure regions, three stages, sacrificial QA system, 1-many mapping, DR takes over QA
This setup spans the Azure region pair north and west Europe. The prod and non-prod VMs are placed into different regions. The QA system is sized and configured in such a way it can host the prod SAP system in case of a failover. During the failover QA is brought down to support the takeover of Prod.
I heard the term “sacrificial” QA for this setup the first time from my colleague @Robert Biro and found it very “speaking”.
Fig.6 Three stage architecture with sacrificial QA
To simplify the drawing, I “minified” the Dev instance. Other than that, the same private link setup strategy applies. For the actual DR process, however, you need to think about how you will re-route the traffic to the link service hosted in the DR site (here north Europe). A straightforward option would be to configure the prod space in CF to be connected to both SAP backend regions. For the switch you could then manually alter the BTP Destination config to point towards the DR site. Because the private link service is already set up, it is only a matter of overriding the “pointers” in the Destination service.
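Overriding the “pointer” could look roughly like the following Python sketch. It works on a plain dictionary only and does not call the Destination service. The destination name, host names, path, and the `point_to_dr` helper are hypothetical, and the property names are assumed to follow the usual key/value style of a destination.

```python
from copy import deepcopy
from urllib.parse import urlsplit, urlunsplit

# Hypothetical destination as it might be maintained for the prod space.
# Host names and path are placeholders, not real private link host names.
PROD_DESTINATION = {
    "Name": "s4-prod",
    "Type": "HTTP",
    "ProxyType": "PrivateLink",
    "URL": "https://pls-prod-westeu.example:44300/sap/opu/odata",
}


def point_to_dr(destination: dict, dr_host: str) -> dict:
    """Return a copy of the destination whose URL targets the DR host.

    Scheme, port, and path are kept, so only the host changes.
    """
    updated = deepcopy(destination)
    parts = urlsplit(destination["URL"])
    netloc = dr_host if parts.port is None else f"{dr_host}:{parts.port}"
    updated["URL"] = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
    return updated
```

Keeping the change limited to the host makes the manual step small, but it is still a manual step that someone has to perform correctly during the failover.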
In such a setup we are taking care of all the things discussed before while adding the capability to recover SAP in a different region. For the private link service to reflect that immediately, we need to prepare the route (orange arrow in fig.6). An additional configuration step remains for the DNS-based re-routing. With the next architectures we will slowly get close to automating that manual step further.
In a cloud environment it would be advisable to refrain from combining the DR SAP system with another one. You limit your flexibility for sizing, snoozing systems etc. by doing so. Referring to the “cattle vs. pets” paradigm, this would be an example of treating the SAP environment too much as a pet. Private-Link-wise it is doable either way.
May the sheep dog get you thinking about that paradigm…
Source: https://www.jetsetter.com/magazine/quarantine-pet-memes/, originally from Twitter user melz (@mvazquez17)
In addition to that we assumed that the BTP deployment in west EU survived while the productive SAP backend deployment did not. To be protected against that we need to have a spare BTP deployment in a second region too.
On to the next one.
Multi Azure regions, sacrificial QA system, traffic manager to anticipate failover
We have the same config as before with the exception that the private link in CF prod space does not target the DR region. We mimic the failover that happens on the S4 side and swap the productive user traffic to the app instance running in the QA space. One way of doing that would be an Azure Traffic Manager setup.
Fig.7 Three stage architecture with seamless failover via Azure Traffic Manager
A disaster recovery decision for a productive ERP is always made consciously by a human. Hence you would configure the Traffic Manager to do the DNS-based re-routing through your DR process or manually.
Compared to the architecture option presented before, this is a more managed approach, because your actions in Traffic Manager act on the DNS level for already prepared routes. Before, you would need to act on the Destination directly and change IPs or host names by hand. That is more error-prone during a hectic failover event.
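The idea behind such a DNS-based switch can be sketched in a few lines of Python. This is a simplified model of priority-based routing, not the Traffic Manager API. The endpoint names and the `pick_endpoint` helper are assumptions, and health probing is left out because the failover decision is taken by a human here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    priority: int          # lower number = preferred
    enabled: bool = True   # toggled by the DR process or manually


def pick_endpoint(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Return the preferred endpoint among those that are enabled."""
    candidates = [e for e in endpoints if e.enabled]
    return min(candidates, key=lambda e: e.priority, default=None)


# Hypothetical endpoints: the app in the prod space and the app in the QA space.
endpoints = [
    Endpoint("btp-app-prod-space", priority=1),
    Endpoint("btp-app-qa-space", priority=2),
]
```

Disabling the first endpoint as part of the DR process makes `pick_endpoint(endpoints)` return the QA-space app, without touching any Destination.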
In addition to that we still assume that the BTP deployment in west EU survived while the productive SAP backend deployment did not.
The fanciest architecture I saved until the end. On to the last one.
Cloud native setup with blue green deployment, DR for S4 and BTP
This time we have region redundancy not only for the SAP backend but also for BTP. However, on the BTP side the other stages are dropped from the architecture on purpose, because we embrace the cloud native pattern that early customer feedback is key, with weekly or even daily releases. Blue-green deployment styles, feature flagging and early adopter programs make it “bearable” to always develop directly against the productive SAP backend. Have a look at my post and SAP on Azure podcast episode regarding blue-green deployments to achieve such a process.
For a less busy drawing Dev and QA stages of the SAP backend have been “minified”.
Fig.8 Cloud native architecture with private link service
Again, we are using DNS-based routing with Traffic Manager to divert productive traffic on demand on the BTP side. To anticipate a “diagonal” region outage all private link services in BTP are connected with the prod and the DR private link endpoints on your own Azure VNets.
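To make the “diagonal” case more tangible, here is a small Python sketch that picks a usable route when regions are down. The region names and the `pick_route` helper are hypothetical. The only assumption taken from the architecture is that every BTP region holds links to both the prod and the DR site.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical region names, listed in order of preference.
BTP_REGIONS: List[str] = ["btp-westeu", "btp-northeu"]
BACKEND_REGIONS: List[str] = ["s4-westeu", "s4-northeu"]  # prod first, DR second


def pick_route(down: Set[str]) -> Optional[Tuple[str, str]]:
    """Return the first (BTP region, backend region) pair with both sides up."""
    for btp in BTP_REGIONS:
        if btp in down:
            continue
        for backend in BACKEND_REGIONS:
            if backend not in down:
                return (btp, backend)
    return None  # nothing reachable with the regions that are left
```

For example, `pick_route({"btp-westeu", "s4-northeu"})` yields the diagonal pair of the north Europe BTP deployment and the west Europe backend, which only works because the links were prepared in both directions up front.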
The manual CF Destination config override problem still applies of course. As before you could get around it by creating more redundancy on BTP and having a dedicated DR space for the diagonal failover.
This scenario needs the smallest number of private link services and artifacts on the BTP side, while disrupting typical three-stage concepts the most.
Such a shift to develop directly against the productive SAP backend has impact on internal development processes and culture at the customer side.
How about SAP RISE?
All architecture concepts and applications of the private link service shown today apply. You would need to ask your deployment team counterparts at SAP to set up the required private link endpoints on Azure and share the resource IDs.
Fig.9 One flavour of SAP RISE architecture with private link
I depicted one flavour of a potential setup here. If you already have an Azure landing zone that you want to reuse, you have three private options to connect to the SAP-managed RISE environment: VNet peering, VPN or ExpressRoute. None of them impacts your private link service deployment decisions. In case you need to traverse VNets, the routing challenge mentioned with fig.6 arises. Once you maintain routes yourself, this part is excluded from the managed agreement with SAP RISE, because SAP has no control over it.
So, how many metaphorical “pinkies” do you need for your SAP backend setup with BTP? Let me know in the comments section 😊
The architectures discussed today are by far not a complete list of the possible variations of your deployment together with SAP. They are meant to showcase the different touch points and failover processes to think about when applying private link service to integrate SAP backends with BTP.
In part four of this series, we will look at how to debug and test your private-link-enabled app. Any other topic you would like to see discussed in relation to the private link service? Just reach out via GitHub or in the comments section below.
As always feel free to ask lots of follow-up questions.