Detect Application Crashes in SAP BTP Cloud Foundry with SAP Alert Notification service (Part 1)
For the sake of readability, we often use SAP BTP as a short form of the complete “SAP Business Technology Platform” name and Alert Notification as a short form of the complete “SAP Alert Notification service for SAP BTP” name.
Introduction
Ensuring 100% availability of our cloud applications is critical for daily business operations. We must be fully aware of any deviations. If a disruption occurs, prompt reaction and recovery are pivotal.
In the context of SAP BTP, the monitoring of Cloud Foundry applications is natively provided by the platform. The native Cloud Controller not only provides full-fledged application management but also observes the application’s overall performance and availability. The missing piece of the puzzle is the alerting part. Following the SAP best practices for utilizing the SAP BTP solutions, this gap can be easily filled by an SAP cloud offering: SAP Alert Notification service.
In this blog post, we demonstrate how to be notified whenever an application crash occurs within your Cloud Foundry environment. You can choose among the multiple notification channels supported by Alert Notification to react. For this use case, we will take advantage of the integration with PagerDuty for incident management. For every application crash event delivered by Alert Notification, an incident will be triggered automatically in our PagerDuty account. In addition, a notification for such events will be delivered instantly via email to the responsible application administrator. However, this is just an example; use the Alert Notification action of your choice. Let’s start configuring the target scenario!
Prerequisites
Before we start, here is what you need to have in advance:
- Alert Notification service, enabled for an SAP BTP Cloud Foundry space
- Application instance, deployed in the same SAP BTP Cloud Foundry space
- (Optional) Configured account in PagerDuty
Configurations in Alert Notification
There is a list of application audit events for Cloud Foundry that can be matched by Alert Notification. The current implementation requires the administrator of the space that contains the application to add an existing technical user, specific to the data center, as a member of that space. This user must have the Space Auditor permission, which allows Alert Notification to obtain information from Cloud Controller in Cloud Foundry.
Check the technical user for your data center here.
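If you prefer the cf CLI over the cockpit for this step, the role assignment is a single `cf set-space-role` call. As a minimal sketch, the helper below only assembles that command line so you can review it before running it. The org, space, and user values are hypothetical placeholders; the real technical user name has to be taken from the documentation for your data center.

```python
import shlex


def space_auditor_command(technical_user: str, org: str, space: str) -> list[str]:
    """Build the cf CLI call that grants the SpaceAuditor role in one space."""
    # cf CLI syntax: cf set-space-role USERNAME ORG SPACE ROLE
    return ["cf", "set-space-role", technical_user, org, space, "SpaceAuditor"]


if __name__ == "__main__":
    # Placeholder values for illustration only; replace them with your own.
    cmd = space_auditor_command(
        "<technical-user-for-your-data-center>", "my-org", "my-space"
    )
    # Print a shell-safe version of the command instead of executing it.
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch free of side effects; you can paste the output into a terminal where you are already logged in with `cf login`.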
If you are not familiar with the Alert Notification’s terminology, before moving forward you can glance through the official documentation for more details.
In Alert Notification, let’s create a subscription and a matching condition.
There is a dedicated application audit event that is triggered when an application instance has crashed. In Alert Notification, such an event carries the eventType property with the value app.crash. Because events have many more properties, you can add further conditions to filter only certain crashes, e.g., those of particular applications.
Let’s create a condition, as part of our new subscription, with eventType equal to app.crash.
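To make the matching idea concrete, here is a small local sketch of how a condition on eventType, optionally combined with a second condition on an application name, selects crash events. The service performs its own matching; this code only illustrates the logic. The predicate names, the `tags.<name>` lookup form, the `app_name` tag, the AND combination of conditions, and the sample events are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    property_key: str    # event property to inspect, e.g. "eventType"
    predicate: str       # "EQUALS" or "CONTAINS" in this sketch
    property_value: str  # value to compare against


def lookup(event: dict, key: str) -> str:
    """Read a top-level property, or a tag via the hypothetical 'tags.<name>' form."""
    if key.startswith("tags."):
        return str(event.get("tags", {}).get(key[len("tags."):], ""))
    return str(event.get(key, ""))


def matches(event: dict, conditions: list[Condition]) -> bool:
    """Select an event only if every condition holds (AND is an assumption here)."""
    for cond in conditions:
        actual = lookup(event, cond.property_key)
        if cond.predicate == "EQUALS":
            ok = actual == cond.property_value
        elif cond.predicate == "CONTAINS":
            ok = cond.property_value in actual
        else:
            raise ValueError(f"unsupported predicate: {cond.predicate}")
        if not ok:
            return False
    return True


if __name__ == "__main__":
    any_crash = [Condition("eventType", "EQUALS", "app.crash")]
    orders_crash = any_crash + [Condition("tags.app_name", "EQUALS", "orders")]

    # Sample events invented for the example.
    events = [
        {"eventType": "app.crash", "tags": {"app_name": "orders"}},
        {"eventType": "app.crash", "tags": {"app_name": "billing"}},
        {"eventType": "audit.app.update", "tags": {"app_name": "orders"}},
    ]
    for event in events:
        print(event["eventType"], matches(event, any_crash), matches(event, orders_crash))
```

The first condition list corresponds to the condition created in this post; the second shows how an extra condition narrows the subscription to a single application.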
Next, create an action for the subscription, in this example one that uses the PagerDuty integration. Define a name and a meaningful description for the new action. To complete the configuration, you need to provide a Routing Key. Optionally, define a Field Mapping.
Let’s have a closer look at the configurations in PagerDuty. First, your account needs an integration of type Events API v2 on any PagerDuty service. To obtain the routing key, navigate to Services –> Service Directory in your PagerDuty account. Find the service for your scenario and go to Integrations. Locate the routing key under Integration Key and provide the value to Alert Notification.
In Field Mapping, you can optionally enter a comma-separated list of key-value pairs, where the key is a field name and the value is either a constant or a placeholder that is dynamically replaced with the value from the incoming event. See more details in PagerDuty Action Type.
At the bottom of the incident’s details in PagerDuty, there is an option called “View Message”. There, you can check the raw format of the event received via the PagerDuty Events API. This information is useful if you intend to map the properties of the Alert Notification event to concrete fields in PagerDuty.
Note: This field mapping overrides any default fields specified in the event tag.
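For orientation, this is roughly what a trigger event for the PagerDuty Events API v2 looks like and where the routing key fits in. Alert Notification builds and sends the request for you; the sketch below only assembles the JSON body so that the shape is visible. The summary text, the source value, and the sample routing key are placeholders, not values produced by the service.

```python
import json

# Public endpoint of the PagerDuty Events API v2 (shown for reference, not called here).
EVENTS_API_URL = "https://events.pagerduty.com/v2/enqueue"

# Severity values accepted by the Events API v2.
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key: str, summary: str, source: str,
                        severity: str = "critical") -> dict:
    """Assemble the body of an Events API v2 'trigger' request."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(ALLOWED_SEVERITIES)}")
    return {
        "routing_key": routing_key,   # the Integration Key of the PagerDuty service
        "event_action": "trigger",    # opens a new incident (or adds to an open one)
        "payload": {
            "summary": summary,       # short description shown on the incident
            "source": source,         # what the event is about, e.g. an app name
            "severity": severity,
        },
    }


if __name__ == "__main__":
    event = build_trigger_event(
        routing_key="<your-integration-key>",           # placeholder
        summary="Application instance crashed",         # placeholder text
        source="my-app in my-space",                    # placeholder text
    )
    print(json.dumps(event, indent=2))
```

Comparing this shape with the raw message of a real incident in PagerDuty is a quick way to see which fields Alert Notification fills in by default.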
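The idea behind the field mapping can be sketched as follows: split the comma-separated list into pairs, then replace placeholders with values from the incoming event. The `key=value` separator, the `{propertyName}` placeholder form, and the field names in the example are assumptions made for illustration; the PagerDuty Action Type documentation defines the syntax that the service actually accepts.

```python
import re

# Assumed placeholder form for this sketch: {propertyName}
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def parse_field_mapping(mapping: str) -> dict[str, str]:
    """Split 'a=b, c=d' into {'a': 'b', 'c': 'd'} (values must not contain commas)."""
    pairs: dict[str, str] = {}
    for item in mapping.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got: {item!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def resolve(mapping: dict[str, str], event: dict) -> dict[str, str]:
    """Replace placeholders with event values; constants pass through unchanged."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Unknown placeholders are left as written instead of being dropped.
        return str(event.get(name, match.group(0)))

    return {
        field: PLACEHOLDER.sub(substitute, template)
        for field, template in mapping.items()
    }


if __name__ == "__main__":
    # Hypothetical mapping and event, for illustration only.
    raw = "payload.severity=critical, payload.summary={subject}"
    event = {"eventType": "app.crash", "subject": "Application instance crashed"}
    print(resolve(parse_field_mapping(raw), event))
```

A constant such as `critical` is copied as is, while a placeholder takes its value from the event at delivery time, which is why one mapping can serve every crash event.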
Add the new action(s) to the subscription, review the summary and complete the subscription’s configuration in Alert Notification.
How will the crash event for our application instance be detected and handled within the Cloud Foundry environment? In general, Cloud Controller stages, starts, and runs the applications. Furthermore, it configures a health check that runs periodically for each application instance. If a previously healthy application instance fails a health check, it is considered unhealthy. As a result, the application instance is stopped and deleted, and a new instance is scheduled. This stopping and deletion of the application instance is reported back to Cloud Controller as a crash event. See more details about the health checks flow here.
Note: Audit events can be created at any point during the execution of the action they describe. This means the action associated with the event is not guaranteed to have succeeded. Keep this in mind for audit events concerning application instance stop, update, restart, etc. The application crash event is triggered when a health check for an instance fails.
And we are done! The goal of this scenario was to illustrate how to catch application crashes in the SAP BTP Cloud Foundry environment by utilizing Alert Notification. There is a variety of notification channels and actions that you can apply to stay informed and handle disruptions within the landscape. Above, we have shown just one example: the easy-to-consume integration with PagerDuty. Give it a try!
We intend to extend the use case in future blog posts. The goal is to introduce further options to react proactively and remediate application instance crashes automatically by using state-of-the-art DevOps tools, such as SAP Automation Pilot. Stay tuned!
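The flow described above can be pictured with a toy simulation. Nothing here calls Cloud Foundry: the class, the function, and the probe are invented for the example, and only the app.crash event type comes from this post.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable

_next_id = itertools.count(1)  # simple source of unique instance ids


@dataclass
class Instance:
    app: str
    instance_id: int = field(default_factory=lambda: next(_next_id))


def run_health_check(instances: list[Instance],
                     probe: Callable[[Instance], bool]) -> tuple[list[Instance], list[dict]]:
    """Run one health-check round and return (running instances, crash events)."""
    running: list[Instance] = []
    crash_events: list[dict] = []
    for inst in instances:
        if probe(inst):
            running.append(inst)  # still healthy, keep it
            continue
        # The unhealthy instance is stopped and deleted, which is reported as a crash...
        crash_events.append(
            {"eventType": "app.crash", "app": inst.app, "instance": inst.instance_id}
        )
        # ...and a fresh instance of the same app is scheduled in its place.
        running.append(Instance(app=inst.app))
    return running, crash_events


if __name__ == "__main__":
    first, second = Instance("orders"), Instance("orders")
    # Pretend that only the first instance fails its health check.
    running, events = run_health_check(
        [first, second], probe=lambda inst: inst.instance_id != first.instance_id
    )
    print(events)
    print(running)
```

The number of running instances stays the same after the round, while each replaced instance leaves behind exactly one crash event, which is the event the subscription in this post matches.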
Thanks for the very detailed blog. I am keen to know whether it can also detect if one of the dynamic forms goes offline. Or can it be used only in the event of an application crash?
Hello, Mehul,
Thanks for getting in touch! In a nutshell, SAP Alert Notification service is a technical alerting engine that provides out-of-the-box notifications and alarms about some BTP services but also exposes a REST API where the end users could post their own events. I'm not aware of what dynamic forms you're referring to but if you could set up some health check and inform when the forms go offline, you'll be also able to match and receive this event.
Regards,
Kristina
Hi Kristina,
Thanks for the quick response. I understand your explanation; however, it should at least trigger alerts for the normal event types.
The strange thing is that when we test the alert subscription, the mail is triggered successfully, but no mail arrives for the configured events.
Thanks,
Mehul
Am I missing anything else as part of the configuration? I have followed this blog post: Detect Application Crashes in SAP BTP Cloud Foundry with SAP Alert Notification service (Part 1) | SAP Blogs
The only difference is that I have used the email action type to send to a dedicated address. (Note: the test event is delivered successfully to that email address.)
Thanks,
Mehul
Hello, Mehul,
I suspect that you have not added the dedicated technical user as Space Auditor, as described here: https://help.sap.com/docs/ALERT_NOTIFICATION/5967a369d4b74f7a9c2b91f5df8e6ab6/4255e6064ea44f20a540c5ae0804500d.html?locale=en-US
Regards,
Kristina
Hi Kristina,
That's true indeed. Your suggestion has fixed the issue. Brilliant. Many thanks!!
Thanks,
Mehul
Hello,
Could you help us set up active collection from SAP BTP Cloud Foundry to CALM? We cannot find information on whether we need to configure a CALM connection to SAP BTP Cloud Foundry.
Thanks in advance.
br Leena Nissilä
Hello Leena,
Just to make sure I understand your query correctly: are you looking for information on how to integrate SAP Alert Notification service for SAP BTP with CALM (in order to consume SAP BTP Cloud Foundry events in CALM as well)?
Looking forward to hearing from you.
Kind regards,
Biser Simeonov
It is worth noting that you should access the BTP Cockpit via the same data center as the subaccount where the Alert Notification service is running; otherwise, there will be an error and you won't be able to add subscriptions.
For example, if you have a subaccount in EU10, access the Alert Notification service via the cockpit of EU10: https://cockpit.eu10.hana.ondemand.com/cockpit/ and if you have another subaccount in EU30, then you need to access the Alert Notification service via the cockpit of EU30: https://cockpit.eu30.hana.ondemand.com/cockpit.
Pieter
Hello, Pieter,
By design, you can access SAP Alert Notification service instances across landscapes. I've just tested the scenario with an instance located on EU10, which I accessed through the BTP Cockpit on EU30. I've also successfully executed the reverse scenario: an instance located in EU30 opened through the BTP Cockpit on EU10.
If you still experience difficulties with this, please, open a ticket with component BC-CP-LCM-ANS containing your account identifiers, so that my teammates would be able to support you further.
Best regards,
Kristina
Is it possible to share an instance of the Alert Notification service between different spaces in the same subaccount? Or does each space need its own instance of the Alert Notification service?
Thanks,
Mike Sharrar
Hello, Mike,
It depends on the scenario that you want to execute. Below are common use cases that do or do not require a separate instance per space.
You need SAP Alert Notification service instance per space when:
You can share SAP Alert Notification service instances between spaces when:
Keep in mind that all use cases with shared instances also apply to a single instance.
If you need any further support, just let me know.
Best regards,
Kristina