Observability, monitoring and alerting services offered by SAP Business Technology Platform – Part 1
I have been a developer with SAP for many years, going back to the SAP ECC days, and I am still quite overwhelmed by the number of services it offers for application monitoring. Recently SAP introduced the Cloud Application Programming Model (CAP), which aims to ease the pain of developers building cloud business applications or microservices on SAP Business Technology Platform. SAP also provides a set of services that help monitor the applications deployed on SAP Business Technology Platform. These are called observability services.
Observability services do not provide only monitoring. They also provide alerting and insights about the app that is running on the platform or a server. If an app has performance issues while running on a platform, the observability services help you find where the issue is so that the stakeholders can resolve it. If the app performs well, these services also give good insight into why it performs well in a production environment, which helps the stakeholders document the best practices they followed and share them across multiple teams.
Make informed decisions
In the age of data and microservices, data-driven decisions are key. The more observability data we capture, the better our decisions are. Observability data also needs excellent tooling before it adds value to service quality: this only works if we capture the data in technologies that help unearth the value that lies in it. Data helps to break down barriers and understand many things. It is the duty of everyone who collects data to make as much of it as possible available to those who can benefit from it or who make informed decisions or take actions based on it.
To make data-driven decisions, we refer to the best practices of the industry. Google has released its Site Reliability Engineering (SRE) book to the public, which not only introduces the motivation for observing four main “signals” of a software system or service, but also builds a baseline of comparable indicators that help you judge the health and efficiency of the service you are offering. Those signals are:
Latency: The time it takes to service a request.
Traffic: A measure of how much demand is being placed on your system, measured in a high-level system-specific metric.
Errors: The rate of requests that fail, either explicitly, implicitly, or by policy.
Saturation: How “full” your service is.
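To make the four signals (latency, traffic, errors, saturation) a bit more tangible, here is a minimal Python sketch of how they might be computed for one observation window. It is not tied to any SAP service: the `Request` record, the `golden_signals` function, and the choice of a nearest-rank 95th percentile and "HTTP 5xx counts as a failure" are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """One served request (hypothetical record shape, for illustration only)."""
    duration_ms: float
    status: int


def percentile(sorted_values, fraction):
    """Nearest-rank percentile of an already sorted list (0.0 if empty)."""
    if not sorted_values:
        return 0.0
    rank = max(1, math.ceil(fraction * len(sorted_values)))
    return sorted_values[rank - 1]


def golden_signals(requests, window_seconds, in_flight, capacity):
    """Summarize latency, traffic, errors and saturation for one window.

    Assumptions: a request "fails" when its status is 5xx, and saturation is
    the share of a fixed concurrency capacity that is currently in use.
    """
    durations = sorted(r.duration_ms for r in requests)
    failed = sum(1 for r in requests if r.status >= 500)
    return {
        "latency_p95_ms": percentile(durations, 0.95),
        "traffic_rps": len(requests) / window_seconds,
        "error_rate": failed / len(requests) if requests else 0.0,
        "saturation": in_flight / capacity,
    }


if __name__ == "__main__":
    sample = [
        Request(100.0, 200),
        Request(120.0, 200),
        Request(150.0, 200),
        Request(900.0, 500),
    ]
    print(golden_signals(sample, window_seconds=2, in_flight=8, capacity=10))
```

In a real setup these numbers would usually come from the monitoring tooling rather than hand-written code, but the sketch shows that each signal answers a different question about the same stream of requests.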
It is very important to understand the basic observability data that one should be aware of, and, in a perfect world, that one should collect with the observability services:
Availability data: Check if your system is still up and running. This can be done with the help of a health endpoint (e.g. “blackbox check”, “synthetic check”) or a synthetic scenario test.
Metrics: Check your CPU, memory, disk capacity, traffic and much more. Ship agent technology with your system that checks those metrics at regular intervals. You can record metrics for a certain time frame and then analyze the outcome over a set period of time in order to compare results.
Events: Get insight into how and when events affect your system from the outside. Events (such as updates) can indicate if and why something has changed. This lets you trace data back to the event and lets you revert your system to the state before the event that might have caused the trouble.
Logs: Preset logs to monitor explicit conditions in the program flow. Logs signal the status of your process to the outside and help with orientation. They come in different granularities and are relevant for different personas (e.g. customer, operator, developer).
Traces: Trace transactions in your program flow. Traces can show the dependencies of all interactions of the system. They help you visualize the program flow and understand single elements. Traces deliver very important information that can help you understand dependencies and how they influence your application performance.
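A few of the data types above can be illustrated with a small, platform-neutral Python sketch: a blackbox availability check, a structured log line, and a timed trace span that share one correlation ID. All names here (`check_availability`, `log_event`, `span`, the `probe` callable) are hypothetical helpers invented for this example and are not part of any SAP API. In practice the probe would issue an HTTP GET against your health endpoint.

```python
import json
import time
import uuid
from contextlib import contextmanager


def check_availability(probe):
    """Blackbox check: call `probe` and report whether the system looks healthy.

    `probe` is any zero-argument callable that returns an HTTP-like status
    code (an assumption of this sketch, so no network access is needed here).
    """
    started = time.perf_counter()
    try:
        status = probe()
        up = 200 <= status < 300
    except Exception:
        # An unreachable system counts as "down" rather than crashing the check.
        status, up = None, False
    latency_ms = (time.perf_counter() - started) * 1000
    return {"up": up, "status": status, "latency_ms": latency_ms}


def log_event(level, message, correlation_id, **fields):
    """Build one structured (JSON) log line that carries a correlation ID."""
    record = {"level": level, "msg": message,
              "correlation_id": correlation_id, **fields}
    return json.dumps(record, sort_keys=True)


@contextmanager
def span(name, correlation_id, sink):
    """Time one step of the program flow and append the result to `sink`."""
    started = time.perf_counter()
    try:
        yield
    finally:
        sink.append({
            "span": name,
            "correlation_id": correlation_id,
            "duration_ms": (time.perf_counter() - started) * 1000,
        })


if __name__ == "__main__":
    correlation_id = str(uuid.uuid4())
    spans = []
    with span("load_orders", correlation_id, spans):
        print(log_event("INFO", "loading orders", correlation_id, count=3))
    print(check_availability(lambda: 200))
    print(spans)
```

The shared correlation ID is the design choice worth noticing: it is what lets you connect a log line to the trace span it was written in when you later investigate a slow or failed request.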
Observability in SAP Business Technology Platform
SAP Business Technology Platform offers a comprehensive set of observability tools and services that help us understand whether our product runs smoothly or needs improvement. In the beginning, it might take some getting used to, and yes, we might need to invest some time and work into setting up and tweaking our own observability strategy. But we do not need to worry about actually maintaining or running our own observability stacks: this is what SAP provides as a service.
Some highlights of the observability services:
- The services offer seamless consumption, starting with easy adoption of pre-packaged observability packages.
- No heavy configuration required.
- They are based on a combination of state-of-the-art open source approaches (Elastic/Kibana for Logging Services) and SAP-specific service developments (Alert Notification Service).
- Capabilities cover platform-internal use cases (e.g. an SAP development team investigates low-level app runtime behavior).
- They also cover customer-facing use cases (a customer investigates her application's runtime behavior).
SAP provides a whole set of services to store the data pools mentioned above. Depending on your needs and the capabilities of each service, you feed more or fewer of the observability data pools into the respective offerings and get the best tool for your actual needs. Some of the services offered by SAP Business Technology Platform are.
We put a lot of time, effort and money into developing and operating apps on SAP Business Technology Platform. We plan, code and test for weeks or months. What we might not have thought about is an observability plan. With observability, you monitor your resources, processes and interactions with other systems, which helps you preserve your success. Undetected bugs and costly downtimes can turn into your worst nightmare. Ultimately, you might lose customers because applications are not running smoothly. As some may say: “Slow is the new down”!
We will discuss how each of these services can be consumed, one by one, in the upcoming blog posts.