Preventative controls – using organizational policies to provide guardrails for SAP’s public cloud accounts
There is something very important to remember about security in public cloud: it is the consumer of public cloud services who is responsible for the security of resources running in cloud accounts. This was put to me well by one of the IaaS providers in 2019:
Our defaults are for ease of onboarding, not security
– Public cloud provider comment
This is just a different expression of the shared responsibility model in public cloud that I referred to in a previous blog. It makes a lot of sense: public cloud providers are in the business of making it as easy as possible for their customers to adopt their services, and don’t want to put any obstacles in the way of customers getting started. They make it very clear: they are responsible for the security of the cloud, their customers are responsible for security in the cloud.
Getting security configuration right in public cloud is not necessarily easy, though. Each provider is its own “operating system”, if you like, and while general principles will be similar, the actual implementation on each cloud provider can be dramatically different. It is therefore not surprising that it often goes wrong – as examples in the media attest – and a whole security tooling category has emerged (Cloud Security Posture Management, or CSPM) to scan cloud resources for misconfigurations, check them against common security controls and alert on non-compliant resources.
SAP is no different here: my team conducts such compliance scans continuously, shares the results throughout the organization and provides reporting and analysis for follow-up and accountability.
However, wouldn’t it be nicer to take care of some of these common baseline security controls at the root, so that the misconfigurations don’t occur at all? There are actually ways to put guardrails in place to make cloud accounts more secure-by-default. We have a wide variety of workloads in public cloud at SAP, from customer-facing SaaS services to numerous internal systems and applications, across all board areas. This is compounded by operating in true multi-cloud mode, often with the same teams using multiple public cloud providers. Rather than only asking all teams in SAP to adhere to the security policies set by our SAP Global Security (SGS) organization, with requirements for each hyperscaler, we applied a number of controls directly on cloud accounts to make them more secure than they would be otherwise.
Hyperscaler organizations and policies
There was a great presentation from Brandon Sherman of Twilio at fwd:cloudsec 2020 that you should probably watch for background if you’re interested in this topic, as I will have to generalize.
Public cloud providers have a way to bundle accounts into “organizations”. These are hyperstructures, typically hierarchical, in which cloud accounts are organized, similar to domains and subdomains in more traditional IT, or namespaces in Kubernetes, for instance. The implementation and capabilities differ between providers, but they generally allow some way to set policies at group or organizational level. These are not “paper” policies meant to be read and followed by humans but technical policies that dictate certain behaviors, limit choices or change defaults.
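To make the idea of hierarchical, inherited policies concrete, here is a minimal Python sketch. It is not any provider's API: the `OrgNode` class, the node names and the action strings are all hypothetical, and real organization policies are considerably richer than a set of denied actions. It only shows how a restriction set high in the hierarchy reaches every account beneath it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OrgNode:
    """Hypothetical node in an organization tree: the root, a group, or an account."""
    name: str
    parent: Optional["OrgNode"] = None
    # Actions denied at this level; they also apply to everything below it.
    denied_actions: set = field(default_factory=set)

    def effective_denies(self) -> set:
        """Union of the denies set on this node and on every ancestor."""
        denies = set(self.denied_actions)
        node = self.parent
        while node is not None:
            denies |= node.denied_actions
            node = node.parent
        return denies

    def is_allowed(self, action: str) -> bool:
        return action not in self.effective_denies()


# Illustrative hierarchy: root -> group -> account (all names are made up).
root = OrgNode("org-root", denied_actions={"storage:MakeBucketPublic"})
sandbox = OrgNode("sandbox", parent=root, denied_actions={"network:OpenSshToInternet"})
account = OrgNode("team-a-dev", parent=sandbox)

print(account.is_allowed("storage:MakeBucketPublic"))  # denied at the root
print(account.is_allowed("compute:StartInstance"))     # not denied anywhere
```

The account itself carries no policy at all in this sketch; everything it is denied comes from its position in the tree, which is the property that makes organization-level guardrails attractive.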
The Multi Cloud team has essentially administrative privileges on those hyperscaler organizations, equivalent to “root” or “domain admin” in other IT contexts. We can set specific roles and policies that apply to all cloud accounts in the organization. This is how my SecOps team does compliance scanning, for instance, and our Hyperscaler team is responsible for setting these organizational policies we call “preventative controls”.
Preventative controls in place
We now have a good number of these preventative controls in place across AWS, Azure and GCP, with some minor caveats here and there depending on provider capability, implemented through organizational policies or hyperscaler-specific secure defaults. It is also important to note that many of these controls apply on the creation of new resources, rather than operating retroactively on existing resources. Depending on the control and the cloud provider functionality available, we may stop a resource from being created, apply the control on the team’s behalf, or let the resource be created and then immediately auto-remediate it and inform the cloud account admins of what happened and why.
These controls are in effect for all cloud accounts that are part of the SAP organizations in each provider. This does not cover all SAP cloud accounts: cloud accounts operated by recent acquisitions, for instance, are the main exclusion. Onboarding those accounts is a 2-step process: first they are monitored through our CSPM tooling to ensure these controls will have no business-critical impact on operations, and then they are brought into the relevant hyperscaler organizations, where the controls apply to them as to any other account.
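The three behaviors described above (block creation, apply the secure setting for the team, or create and then auto-remediate) can be sketched as follows. This is an illustrative model only: `Mode`, `Outcome` and `enforce` are hypothetical names, resources are plain dictionaries, and the remediation step is collapsed into a single call, whereas in practice the resource would exist briefly in its non-compliant state.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class Mode(Enum):
    DENY = "deny"                    # refuse to create the resource
    APPLY_DEFAULT = "apply_default"  # create it with the secure setting applied
    REMEDIATE = "remediate"          # create it, fix it right away, tell the admins


@dataclass
class Outcome:
    created: bool
    resource: dict
    messages: list = field(default_factory=list)


def enforce(resource: dict, setting: str, secure_value: Any, mode: Mode) -> Outcome:
    """Apply one hypothetical preventative control to a creation request."""
    if resource.get(setting) == secure_value:
        return Outcome(True, dict(resource))  # already compliant, nothing to do
    if mode is Mode.DENY:
        return Outcome(False, dict(resource),
                       [f"denied: {setting} must be {secure_value!r}"])
    fixed = {**resource, setting: secure_value}  # copy; the request is not mutated
    if mode is Mode.APPLY_DEFAULT:
        return Outcome(True, fixed)
    return Outcome(True, fixed,
                   [f"auto-remediated: {setting} reset to {secure_value!r}"])


bucket = {"name": "team-a-logs", "public": True}
for mode in Mode:
    result = enforce(bucket, "public", False, mode)
    print(mode.value, result.created, result.resource["public"], result.messages)
```

Which mode is available for a given control depends on what the provider supports, which is why the same written policy can end up enforced differently on different platforms.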
So, what does this practically mean? Let’s run through some of my favorites:
- Audit logs are centrally enforced and collected: on each provider, audit logs such as CloudTrail are centrally set up, collected and ingested into SAP’s Security Information and Event Management (SIEM) system
- MFA is enforced for all human users of cloud accounts: any user of the cloud provider’s admin console has to have multi-factor authentication enabled, or they won’t be able to use the account. This alone stops a whole category of security problems
- KMS configuration is enforced, including key auto-rotation
- Common admin and database ports are blocked from exposure to the internet, including SSH, RDP and VNC. This stops teams from inadvertently opening up resources to the internet
- Public storage buckets are not allowed by default, and exceptions require a review and approval process
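As an illustration of the port-blocking control in the list above, the following Python sketch checks whether an inbound rule would expose one of the blocked services to the whole internet. The rule format (`source`, `from_port`, `to_port`) is a made-up simplification, the port numbers are the conventional defaults for SSH, RDP and VNC, and the real controls are implemented through each provider's own policy mechanisms rather than code like this.

```python
import ipaddress

# Conventional default ports; the actual blocked list is an assumption here.
BLOCKED_PORTS = {22: "SSH", 3389: "RDP", 5900: "VNC"}


def is_internet_source(cidr: str) -> bool:
    """True if the CIDR covers the whole address space (0.0.0.0/0 or ::/0)."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen == 0


def violations(rule: dict) -> list:
    """Return the blocked services an inbound rule would expose to the internet."""
    if not is_internet_source(rule["source"]):
        return []
    low, high = rule["from_port"], rule["to_port"]
    return [name for port, name in sorted(BLOCKED_PORTS.items())
            if low <= port <= high]


# Hypothetical rules a team might try to create.
open_ssh = {"source": "0.0.0.0/0", "from_port": 22, "to_port": 22}
internal_ssh = {"source": "10.0.0.0/8", "from_port": 22, "to_port": 22}

print(violations(open_ssh))      # SSH is exposed to the internet
print(violations(internal_ssh))  # internal range only, no violation
```

Checking port ranges rather than single ports matters: a rule that opens every port to the internet exposes all of the blocked services at once.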
SAP already operates a landscape that has now grown to over 10 million cloud resources and around 9,000 active cloud accounts. We are expecting this to continue to rise over the coming years, including from teams not previously operating in public cloud. It is a great help and comfort to have these controls in place as these teams migrate to hyperscalers.
Part of a larger security and compliance framework
These preventative controls are part of a larger security and compliance framework at SAP for secure-by-default public cloud infrastructure, also mentioned in the previous blog linked above.
Security policies drive the controls and additional tooling. These can be restrictive (preventative controls), can alert us when something is not configured right (detective controls, of which there are a lot more, at different severity levels) or can be helpful, by providing secure-by-default infrastructure-as-code modules that are guaranteed to be compliant with the security policies and secure reference architecture.
There is a direct connection between the security policies and hardening procedures defined by our global security organization, and the controls and supporting tooling that are implemented. The diagram below shows the lineage from written policy to deployed resources.
Our partners in SGS define the policies, the secure reference architecture and the hardening procedures for public cloud infrastructure. In dialogue, the Multi Cloud team turns these policies into practical preventative and detective controls, as well as infrastructure-as-code modules we call building blocks. The dialogue is important to ensure that we strike a balance between the intent of a particular security control and the practical implications of implementing it as an organizational policy, as a detective control in CSPM scans (i.e. what are you actually scanning for) and in the IaC building blocks. In turn, this ensures that the written policies are pragmatic and implementable by the various teams within SAP running in public cloud.
These components are tested against each other to ensure they are aligned where they overlap. In fact, we recently had to get new exceptions for the test accounts used for the detective controls. During this testing we need to create “bad” resources as well as compliant ones, to ensure we detect compliance failures correctly. Tests started failing after a recent release of new preventative controls because we could no longer create the specific “bad” resource. (Which, in a roundabout way, also proved the preventative control worked as expected!)
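The interaction described above can be reproduced in a toy model. Everything in this Python sketch is hypothetical (the function names, the account name and the exemption mechanism); it only illustrates why a detective-control test that needs a “bad” resource stops working once a preventative control blocks that resource, and why an exception for the test account restores it.

```python
class CreationDenied(Exception):
    """Raised when the (hypothetical) preventative control blocks a request."""


def create_bucket(account: str, public: bool, exempt_accounts: set) -> dict:
    # Preventative control: public buckets are refused unless the account is exempt.
    if public and account not in exempt_accounts:
        raise CreationDenied(f"{account}: public buckets are not allowed")
    return {"account": account, "public": public}


def detect_public_bucket(bucket: dict) -> bool:
    # Detective control: flag buckets that are publicly accessible.
    return bucket["public"]


def run_detective_test(account: str, exempt_accounts: set) -> str:
    """Create one bad and one good resource and check the detector on both."""
    try:
        bad = create_bucket(account, True, exempt_accounts)
    except CreationDenied:
        return "blocked"  # the test cannot even set up its fixture
    good = create_bucket(account, False, exempt_accounts)
    ok = detect_public_bucket(bad) and not detect_public_bucket(good)
    return "pass" if ok else "fail"


print(run_detective_test("cspm-test", exempt_accounts=set()))
print(run_detective_test("cspm-test", exempt_accounts={"cspm-test"}))
```

A "blocked" result here is not a detector failure; it is evidence that the preventative control fired before the detective control had anything to find.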
Before preventative controls are rolled out across the landscape they are tested together with different lines of business throughout the organization and deployed in stages to avoid any unexpected incidents or side-effects. We also announce well in advance when new controls are rolled out, and track relevant policy violations from our central CSPM scans with extra focus and attention. Such CSPM tooling is already directly available to SAP teams to conduct ad-hoc scans of their own cloud accounts, encouraging a shift-left, infrastructure-as-code approach in their secure software development lifecycle.
Finally, the provider-specific reference architecture guides provided by the Network & Security Architecture team in Multi Cloud explain how teams in SAP can reach compliance on a particular platform in a practical way, while their building blocks help teams directly with secure-by-default infrastructure-as-code templates to build their software stacks and applications on top of.
A model to adopt in your own organizations
We have found this to be a very effective model. It is not just useful to know, as an SAP customer, how we protect the public cloud landscapes we operate: setting up preventative controls for your own public cloud environment is something you can adopt as well, if you haven’t already. Doing this across multiple cloud providers has not been easy and requires a team of specialists like our Multi Cloud Hyperscaler team. But if you are mostly concentrated on one particular provider (or two), this may well be feasible in your own organizations and is at least worth investigating.
You may already have your cloud accounts in a public cloud provider organization, if only because it simplifies billing. We would encourage you to investigate through your public cloud provider’s documentation how to implement such organizational policies on their platform for yourself, or contact your technical account management team for guidance.