“Secure by default” for SAP on public cloud infrastructure
Time flies when you’re having fun. I noticed today it has been two years since I last blogged on this platform. My only excuse is that I have been rather busy…
In May 2019, I took up a position to run Network & Security Operations for SAP’s public cloud landscape in the Multi Cloud team. The Multi Cloud team inside of SAP is the main interface between anybody in the company making use of public cloud providers, and those providers themselves. Having always been in advisory or consultative roles in security before, this was an amazing opportunity to contribute directly to the security of SAP in public cloud. What came as a pleasant surprise on top was how much the experience in IIoT/OT security proved strangely beneficial in this role, as well as my prior time in analytics and data visualization, consulting and technical support. In a bizarre (but good) way, I feel my entire career in tech so far has led up to and prepared me for this role.
I jumped onto a train that was already running fast. The use of public cloud infrastructure has grown dramatically over the past couple of years, with resources deployed roughly doubling annually. The scale is truly remarkable: SAP now runs over 7.5 million cloud resources in 8,000+ active cloud accounts across AWS, Azure and GCP, as well as Alibaba Cloud and AWS China and Azure China. This is substantial, but it gets even more impressive when you consider that this is more than double the size we were in January of this year, and we’re projected to keep doubling in size each year for the near future.
At the same time, responsibilities for security in public cloud are shared between IaaS providers and those using IaaS services, and even shared between different SAP teams internally, adding to the complexity rather than reducing it. An example from AWS can be seen here. It is critical to understand fully what this means, so we avoid any “I thought <IaaS provider> took care of that?”. While the public cloud providers are responsible for security of the cloud, their customers are responsible for security in the cloud. That is, the IaaS provider ensures that its services and data centers are secure, but the consumer of IaaS services, in this case us in SAP, is responsible for the secure configuration of everything running inside the cloud account.
This is not a trivial task. There are countless examples in media reports where hacks and data breaches occurred due to unintentionally misconfigured cloud resources, with the Capital One incident perhaps the most famous – at least in North America. SAP has strict policies and hardening procedures in place, stipulated by our SAP Global Security organization (SGS), and we scan the various SAP teams using public cloud infrastructure for compliance with these policies. But the reality is that securing public cloud infrastructure is genuinely hard and a specialist’s job. Each public cloud provider does things differently, so skills acquired with one provider do not immediately translate to another. Meanwhile, the defaults in public cloud accounts are typically chosen for ease of onboarding, not security – for understandable economic reasons. That makes it all the more important to tackle potential security problems as early as possible.
You may have heard of “shift-left” and DevSecOps. At SAP, we now apply these principles to public cloud security and policy compliance. By taking care of security as early as possible in the lifecycle and allowing for continuous security testing in the CI/CD pipeline, security problems are found early and are more easily corrected, rather than being found only at a pre-production security quality gate – which could jeopardize planned go-lives – or after the solution has already been deployed, when they are detected by the regular compliance scans across the entire public cloud landscape. The toolset for this comes in three parts:
- “preventative controls” implemented as organizational policies on each public cloud provider platform (where available)
- “detective controls”, implemented as compliance controls-as-code in a docker container, allowing SAP teams to scan their code before it is deployed – as well as daily centralized scans across the entire SAP public cloud landscape and all four public cloud providers
- “reference architecture building blocks”, which consist of reference architecture guides for each cloud provider, indicating how compliance with policy can be achieved, and even more helpful: infrastructure-as-code (IaC) templates for common cloud resources that are “secure-by-default” and tested against the detective controls, and are guaranteed to be in line with SAP’s policies and hardening procedures
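To make “compliance controls-as-code” a little more concrete, here is a minimal Python sketch of what a detective control could look like. It is purely illustrative and not SAP’s actual tooling: the control IDs, the resource fields and the `scan` function are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical resource description, e.g. parsed from an IaC plan or a
# cloud inventory export. Field names are assumptions for illustration.
Resource = dict


@dataclass(frozen=True)
class Control:
    control_id: str                    # made-up ID, not a real policy number
    description: str
    applies_to: str                    # resource type the control inspects
    check: Callable[[Resource], bool]  # returns True when compliant


CONTROLS = [
    Control("STORAGE-001", "Storage buckets must not be publicly readable",
            "storage_bucket", lambda r: not r.get("public_read", False)),
    Control("STORAGE-002", "Storage buckets must be encrypted at rest",
            "storage_bucket", lambda r: r.get("encrypted", False)),
    Control("NET-001", "Firewall rules must not open SSH to the internet",
            "firewall_rule",
            lambda r: not (r.get("port") == 22
                           and r.get("source") == "0.0.0.0/0")),
]


def scan(resources):
    """Return (resource name, control ID) for every violated control."""
    findings = []
    for res in resources:
        for control in CONTROLS:
            if control.applies_to == res["type"] and not control.check(res):
                findings.append((res["name"], control.control_id))
    return findings


resources = [
    {"type": "storage_bucket", "name": "logs", "encrypted": True},
    {"type": "storage_bucket", "name": "exports",
     "public_read": True, "encrypted": False},
    {"type": "firewall_rule", "name": "ssh-any",
     "port": 22, "source": "0.0.0.0/0"},
]

findings = scan(resources)
for name, control_id in findings:
    print(f"{name}: violates {control_id}")
```

Because the controls are plain data plus a check function, the same list can in principle be run against a single team’s planned deployment or against an inventory of the whole landscape.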
One of my favorite aspects of public cloud architecture is the focus on “treat your infrastructure as cattle, not pets”. Every public cloud resource is ultimately an API call that can be automated, rather than some precious server somebody has built from scratch and perhaps even given a cute name. Through IaC scripts a desired state can be specified, and by running the script the infrastructure is created. By providing SAP internal teams with these building blocks that can be easily modified to their own needs, they don’t each have to work this out by themselves, saving a lot of time that can instead be spent on developing new features for their solution.
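The “desired state” idea can be sketched in a few lines of Python. This is a toy model, not how any particular IaC tool is implemented: the `plan` and `apply` functions and the resource dictionaries are hypothetical, and a real tool would make provider API calls where this sketch only updates a dictionary.

```python
def plan(desired, actual):
    """Work out which changes are needed to move 'actual' to 'desired'.

    Both arguments map a resource name to its settings.
    """
    to_create = {n: s for n, s in desired.items() if n not in actual}
    to_update = {n: s for n, s in desired.items()
                 if n in actual and actual[n] != s}
    to_delete = sorted(n for n in actual if n not in desired)
    return to_create, to_update, to_delete


def apply(desired, actual):
    """Return the new state after carrying out the plan (no real API calls)."""
    to_create, to_update, to_delete = plan(desired, actual)
    result = dict(actual)
    result.update(to_create)
    result.update(to_update)
    for name in to_delete:
        del result[name]
    return result


desired = {
    "vpc-main": {"cidr": "10.0.0.0/16"},
    "subnet-a": {"cidr": "10.0.1.0/24", "public": False},
}
actual = {
    "subnet-a": {"cidr": "10.0.1.0/24", "public": True},  # drifted setting
    "old-test-vm": {"size": "small"},                     # not declared
}

new_state = apply(desired, actual)
print(plan(desired, actual))
```

Running `apply` a second time changes nothing, which is the property that makes it safe to re-run the same script whenever the declared state or the real landscape drifts.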
These building block modules are developed by cloud security experts in the Multi Cloud Network & Security Architecture team, in close collaboration with the SGS Defensive Architecture team that sets the policies. For instance, with these modules, the entire reference architecture network topology can be materialized, including VPCs, routing tables, subnets, and firewall and peering rules. Each cloud provider does networking a bit differently, and the objects created differ between them. Providing these modules to developer teams saves them tons of time and effort: they know the modules are tested and validated with SGS, and if there are issues with them, they can quickly get help.
The detective controls allow SAP teams to scan for policy violations in the CI/CD pipeline whenever and as often as they want while they develop their solutions, allowing them to catch issues early. The very same control set is used in daily scans running across the SAP public cloud landscape to catch any issues that somehow slipped through the process, which are then followed up with the relevant team that operates the cloud account.
Finally, some misconfigurations are stopped directly with organizational policies (and in some cases, good public cloud provider default settings) implemented in each provider’s SAP organization, that apply to all SAP cloud accounts on each provider. This allows us to enforce multi-factor authentication for all cloud account users in SAP, for instance, or centrally enforced and collected audit logging that goes straight to the SIEM, or (where supported by the IaaS provider), enforced encryption of data in transit and at rest. Accidentally doing the wrong thing will return a friendly error message that informs the team that sorry, you can’t do that as per policy xyz. And if the team believes they have a valid business reason to do this, they can request an exception from the SGS team.
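As a rough illustration of how a preventative control differs from a detective one, the Python sketch below rejects a non-compliant request before anything is created. Real organizational policies are configured in each provider’s own policy mechanism, not written like this; the policy IDs, request fields and the `create_resource` function are assumptions made up for the example.

```python
class PolicyViolation(Exception):
    """Raised when a request is blocked by an organization-wide policy."""


# Hypothetical org policies: ID -> (predicate on the request, human-readable rule)
ORG_POLICIES = {
    "POL-ENC-01": (lambda req: req.get("encrypted_at_rest") is True,
                   "encryption at rest must be enabled"),
    "POL-LOG-01": (lambda req: req.get("audit_logging") is True,
                   "audit logging must be enabled"),
}


def create_resource(request):
    """Create the resource only if every org policy allows the request."""
    for policy_id, (allowed, rule) in ORG_POLICIES.items():
        if not allowed(request):
            raise PolicyViolation(
                f"Sorry, you can't do that as per policy {policy_id}: {rule}. "
                "If you have a valid business reason, request an exception."
            )
    return {"status": "created", **request}


try:
    create_resource({"name": "db-1", "encrypted_at_rest": False,
                     "audit_logging": True})
except PolicyViolation as err:
    message = str(err)
    print(message)
```

The point of the design is that the check sits in front of the create call for every account, so a mistake never reaches the landscape and the team learns which policy it ran into.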
Should such an exception be granted, we can apply waivers and exceptions to the controls, unique to that particular cloud landscape. There are already over 900 of those in place, and with everything based around “as-code”, we can integrate our tooling with an exception database to handle such exceptions at scale.
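Handling exceptions “as-code” could look something like the following Python sketch, which splits scan findings into open and waived ones based on a list of granted waivers. The `Waiver` fields, the expiry date and the shape of a finding are assumptions for illustration; the actual exception database and its schema are not described here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Waiver:
    account_id: str    # cloud account the exception was granted for
    control_id: str    # control being waived (hypothetical ID)
    expires: date      # assumed here: waivers are time-limited


def apply_waivers(findings, waivers, today):
    """Split findings into (open, waived) using waivers still valid today."""
    active = {(w.account_id, w.control_id)
              for w in waivers if w.expires >= today}
    open_findings, waived = [], []
    for finding in findings:
        key = (finding["account_id"], finding["control_id"])
        (waived if key in active else open_findings).append(finding)
    return open_findings, waived


findings = [
    {"account_id": "acct-1", "resource": "exports", "control_id": "STORAGE-001"},
    {"account_id": "acct-2", "resource": "exports", "control_id": "STORAGE-001"},
    {"account_id": "acct-1", "resource": "ssh-any", "control_id": "NET-001"},
]
waivers = [
    Waiver("acct-1", "STORAGE-001", date(2021, 6, 30)),  # still valid
    Waiver("acct-1", "NET-001", date(2020, 1, 31)),      # already expired
]

open_findings, waived = apply_waivers(findings, waivers, today=date(2020, 12, 1))
print(len(open_findings), "open,", len(waived), "waived")
```

Keying a waiver on the account and the control keeps the exception scoped to one landscape, so the same misconfiguration in another account is still reported.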
Coverage of the toolset will both deepen (more automation, increasingly higher up the stack and integration with other security processes that are part of SAP’s SecSDL) and widen (covering an increasing variety of cloud infrastructure for each cloud provider) in 2021, allowing developer teams to increasingly focus on the features and security of their own solution development and application stacks, while providing them with guardrails and tooling to get the basic cloud infrastructure right.
I’ve heard it said that “public cloud is more secure”. I would twist that a bit: public cloud has the potential to be more secure, and to achieve that state faster. It also has the potential to be much worse than data centers, not least due to its complexity and variation between the different providers, and often a lack of secure defaults. Multi cloud is not for everyone. As someone from the frontlines in cloud secops, I wouldn’t necessarily recommend it, unless you are of a size like us, where you can afford central teams of specialists. Consider the complexity involved and that you have to do everything you do for one provider also for each other platform you wish to support, at least until some sort of central tooling for public cloud emerges on the market that is provider-agnostic.
Having said that – if you have the capability and the resources, and the business reasons to do it – it’s been very rewarding for all involved to bring this toolset about in 2020, during what we can all agree has been a challenging year. SAP’s public cloud landscape is measurably better protected than it was a year ago, and we’re now in a far better position to accommodate future growth.
Public cloud infrastructure security is not the whole story – there is of course a whole appsec side to this as well. But without a good baseline, appsec stands no chance. The more you adopt infrastructure-as-code and security/compliance controls-as-code practices, as part of a shift-left DevSecOps approach, the easier it also is to get cloud security right, and get it right at scale.