
Automatic alert remediation with Alert Notification, Automation Pilot and ServiceNow

Introduction

This blog post is part of a series of blog posts related to SAP Cloud Platform Alert Notification service.

Wouldn’t it be perfect if failure events from your business-critical application were instantly delivered to your incident management system? SAP Cloud Platform Alert Notification now provides out-of-the-box integration with ServiceNow. Furthermore, you can use SAP Cloud Platform Automation Pilot to execute alert remediation activities against the failed application and report the outcome back to ServiceNow. All these tasks can be fully automated and executed only when certain conditions are met, without requiring any further input from your side.

Automation Pilot is a powerful service that helps you automate multiple DevOps tasks. It provides catalogs with predefined commands, such as restarting an application or a database, and examples of custom recommended actions consisting of multiple steps. With those features in place, you can model your own commands that better fit your processes. Get more insights from this blog post.

The idea behind this blog post is to demonstrate a basic scenario in which a failure is reported in a business application called “espmcloudweb”, running in the SAP Cloud Platform Neo environment. Alert Notification delivers a critical alert to ServiceNow, which results in the creation of an incident. Additionally, it triggers a predefined recommended action in Automation Pilot, which aims to remediate the failure. In this case, the action gathers the current application health state for later analysis and restarts the application. Automation Pilot continuously sends notifications to Alert Notification about the status of the executed actions. Alert Notification then updates the incident in ServiceNow accordingly. Sounds good, doesn’t it? Let’s see how this setup can be achieved.

Configurations in Alert Notification

Let’s start by creating a ServiceNow action in Alert Notification. Navigate to the Alert Notification UI in your Cloud Cockpit, open the Actions tab and click the Create button.

In the displayed form, enter the name of the target table in ServiceNow where incidents are managed, and the user ID and password for authentication. Also specify the base URL of that ServiceNow instance.

Note: The user must have the incident_manager role assigned.

That’s it! The Alert Notification events will be delivered to your ServiceNow instance. No configurations in ServiceNow are needed! For more details, check the official documentation.

For this scenario, we need two active subscriptions, each with a clear purpose. The first one is dedicated to incident management in ServiceNow. The second one triggers a recommended action in Automation Pilot.

Let’s assign the ServiceNow action to the subscription “Incidents-Management”.

Having done this, all alerts with resource name “espmcloudweb” or “SAP CP Automation Pilot” will be delivered to ServiceNow. In this example, only alerts for application status changes (start, stop, etc.) are excluded. You can narrow the scope further by adding more specific matching conditions.

Let’s move forward with the second subscription, called “Recover-java-app”. The matching conditions set here are much more specific.

For each Alert Notification event with severity “ERROR” or “FATAL”, related to application “espmcloudweb”, we would like a recommended action to be executed. For this purpose, let’s create an action for Automation Pilot.
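The matching conditions of the two subscriptions can be pictured as simple predicates over an event. The following is a minimal Python sketch of that logic, not the service's actual matching engine. The function names and the `excluded_event_types` parameter are hypothetical, and since this post does not name the event types used for application status changes, they are left for the caller to supply.

```python
from typing import Any, Dict, Iterable

Event = Dict[str, Any]


def resource_name(event: Event) -> str:
    """Return the resource name of an event, or an empty string if absent."""
    return event.get("resource", {}).get("resourceName", "")


def matches_incidents_management(event: Event, excluded_event_types: Iterable[str]) -> bool:
    """Sketch of "Incidents-Management": one of the two resource names,
    minus the (caller-supplied, hypothetical) status-change event types."""
    return (
        resource_name(event) in {"espmcloudweb", "SAP CP Automation Pilot"}
        and event.get("eventType") not in set(excluded_event_types)
    )


def matches_recover_java_app(event: Event) -> bool:
    """Sketch of "Recover-java-app": severity ERROR or FATAL for espmcloudweb."""
    return (
        resource_name(event) == "espmcloudweb"
        and event.get("severity") in {"ERROR", "FATAL"}
    )
```

Under these assumptions, an event with severity “FATAL” for “espmcloudweb” would match both subscriptions, while an “INFO” event for the same application would match only “Incidents-Management”.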

This is a standard Webhook action with Basic Authentication. Credentials for a Service Account in Automation Pilot are required. In addition, you also need an event trigger URL, which can be called by Alert Notification as a Webhook.
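To illustrate what a Webhook call with Basic Authentication amounts to, here is a minimal Python sketch that builds (but does not send) such a request. Alert Notification performs the real call itself; the function names, the placeholder URL and the JSON body used here are assumptions for illustration only, not the actual request format of either service.

```python
import base64
import json
import urllib.request
from typing import Any, Dict


def basic_auth_header(username: str, password: str) -> str:
    """Encode credentials as an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_trigger_request(
    trigger_url: str, username: str, password: str, event: Dict[str, Any]
) -> urllib.request.Request:
    """Build a POST request carrying an event as JSON; nothing is sent here."""
    return urllib.request.Request(
        trigger_url,  # hypothetical: the event trigger URL built in Automation Pilot
        data=json.dumps(event).encode("utf-8"),
        headers={
            "Authorization": basic_auth_header(username, password),
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The username and password would be those of the Automation Pilot Service Account described in the next section.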

The action has been assigned to the active subscription “Recover-java-app”. More details will be shared in the section below.

Configurations in Automation Pilot

To access the Automation Pilot backend APIs, we need a Service Account. In the Automation Pilot UI, navigate to the Service Account section on the left-hand side and click the Create button.

Enter a suffix, the permissions required for the Service Account and a description. The username will be a combination of the tenant ID and the specified suffix. The password is generated automatically and displayed only once, so you need to store it somewhere safe.

To enable Automation Pilot to produce events for each execution status change, a destination to the Alert Notification instance must be created. In the Automation Pilot UI, navigate to the Alert Notification section on the left-hand side and click the Create button.

Specify the region where the service is enabled. Enter credentials for an Alert Notification user with Basic Authentication. You can easily create a new one from the Alert Notification’s UI, tab Security.

Note: At the time of writing this blog post, Automation Pilot supports only one destination to Alert Notification.

SAP provides many catalogs with commands for the most commonly used DevOps tasks. They can be used directly and save you time by eliminating the need to code them yourself. Glance through them in the Automation Pilot UI, section Catalogs->Provided by SAP.

After the Automation Pilot onboarding process is completed, catalogs with example commands are created in your tenant to better illustrate how the service can be used. For this scenario, we are going to use a predefined recommended action that is part of such a catalog. The purpose of the command is to gather information about the current application health state and restart the application. The name of the catalog is “Recommended Actions (RA-T000042)”.

Each command defines a set of Input Keys. These Keys describe what values need to be provided in order to trigger the command. Let’s check which are the required Input Keys for the selected command.

To execute actions against our cloud resources, Automation Pilot must be provided with appropriate credentials. Depending on the task, a different level of authorization might be required. In this case, an SAP Cloud Platform Neo OAuth client with Lifecycle Management & Monitoring permissions is sufficient for Automation Pilot to consume the Cloud Platform’s API. Find more details here.

The obtained values need to be stored as an Automation Pilot Input entity. For this purpose, navigate to the Inputs section and click the Create button. Define a catalog and a name. Inputs can be added to your own catalogs only. Add keys for the new input by using the Add button on the right-hand side.

Note: As each Input Key describes a value that is given to the command when it is triggered, the new Input Keys must have the same name and data type as the ones defined by the command.

Enter the value for clientId and select the data type string from the drop-down list. Repeat the same procedure for clientSecret, selecting secret as the data type. This is an important step, as it ensures the value will never be revealed. The new Input Keys are now safely stored in Automation Pilot for further use.

The remaining required Input Keys (region, resourceName and subAccount) will be derived from the Alert Notification event.
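The rule from the note above (same name, same data type) can be checked mechanically. Below is a minimal Python sketch of such a check. The function name and the dictionary representation of Input Keys are hypothetical and do not reflect how Automation Pilot stores or validates inputs internally.

```python
from typing import Dict, List


def find_key_mismatches(command_keys: Dict[str, str], input_keys: Dict[str, str]) -> List[str]:
    """Compare the keys of a stored input against the keys a command defines.

    Both arguments map a key name to its data type, for example
    {"clientId": "string", "clientSecret": "secret"}.
    Returns a list of human-readable problems; an empty list means the
    stored input is consistent with the command.
    """
    problems = []
    for name, data_type in input_keys.items():
        if name not in command_keys:
            # Key names are compared exactly, so "clientID" != "clientId".
            problems.append(f"{name}: not defined by the command")
        elif command_keys[name] != data_type:
            problems.append(f"{name}: expected {command_keys[name]}, got {data_type}")
    return problems
```

Keys that the command defines but the stored input omits are not reported, since they may be supplied from another source when the command is triggered.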

Let’s move forward by building the event trigger URL. Alert Notification needs it in order to call the Automation Pilot triggers API and start an execution. Navigate to the Executions section on the left-hand side and click the Build Event Trigger button.

We need to specify a few details about the recommended action that is going to be triggered via the URL. Here are some hints about the required fields:

  • Trigger Type: Alert Notification
  • Catalog: the command that is going to be used is part of a catalog, enter its name
  • Command: select the desired command
  • Version: select the version of the command that applies to this case. Keep in mind that even if two commands differ only by their version, they are still treated as different commands
  • Input Reference: select the credentials that were saved as an Automation Pilot input entity, so that they are always used when the command is executed.

The resulting URL is displayed in the upper part of the form. Provide this URL to Alert Notification in the Webhook action that calls Automation Pilot. The URL is valid only for a single recommended action, in this case “RecoverJavaApp”. For each command, a separate action must be configured in Alert Notification. For more details, check this link.

The configuration part is done! Let’s generate an alert for a failure in the application and observe the whole process flow. We are going to post a custom alert to Alert Notification. The event matches the conditions set in the active subscriptions mentioned above:

{
    "eventType": "ApplicationHealth",
    "resource": {
        "resourceName": "espmcloudweb",
        "resourceType": "app"
    },
    "severity": "FATAL",
    "category": "ALERT",
    "subject": "Failed availability check",
    "body": "Application health is in error state. Restart immediately.",
    "tags": {
        "ans:status": "CREATE_OR_UPDATE"
    }
}

Within a minute, a new incident is created in the target ServiceNow instance. Conveniently, the subject of the alert is displayed as the short description of the incident.

Simultaneously, the recommended action in Automation Pilot is triggered. If we check the execution status, it shows that the application is being restarted. Additionally, Automation Pilot sends notifications at each stage of its job, and the incident in ServiceNow is regularly updated.

 

Let’s post another event to Alert Notification and simulate a recovery of the reported issue:

{
    "eventType": "ApplicationHealth",
    "resource": {
        "resourceName": "espmcloudweb",
        "resourceType": "app"
    },
    "severity": "INFO",
    "category": "ALERT",
    "subject": "Recovered availability check",
    "body": "Application health is ok.",
    "tags": {
        "ans:status": "CLOSE"
    }
}

Within a minute, the status of the incident changes to “Resolved”.

If you scroll through the list of the incident’s activity updates in ServiceNow, you will be able to track the evolution of the application failure and its recovery in chronological order, with all the required details.
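The two custom events differ only in severity, subject, body and the “ans:status” tag, so they can be produced by one small helper. This is a minimal Python sketch with a hypothetical function name. It only builds the payloads shown in this post and does not post them, as the endpoint and credentials depend on your own Alert Notification instance.

```python
from typing import Any, Dict


def build_health_event(severity: str, subject: str, body: str, status: str) -> Dict[str, Any]:
    """Build an ApplicationHealth event for espmcloudweb, as used in this post."""
    return {
        "eventType": "ApplicationHealth",
        "resource": {
            "resourceName": "espmcloudweb",
            "resourceType": "app",
        },
        "severity": severity,
        "category": "ALERT",
        "subject": subject,
        "body": body,
        # "CREATE_OR_UPDATE" opens or updates the incident, "CLOSE" resolves it.
        "tags": {"ans:status": status},
    }


# The failure event that opens the incident.
failure = build_health_event(
    "FATAL",
    "Failed availability check",
    "Application health is in error state. Restart immediately.",
    "CREATE_OR_UPDATE",
)

# The recovery event that resolves it.
recovery = build_health_event(
    "INFO",
    "Recovered availability check",
    "Application health is ok.",
    "CLOSE",
)
```

Serializing either dictionary with `json.dumps` yields a payload equivalent to the corresponding JSON example.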

Try the solution in your own environment and explore the multitude of commands offered by Automation Pilot. Why not design your own? The service is still in customer beta and we are working continuously on its improvement, so even greater features are expected in the near future.

Stay tuned! We would highly appreciate your feedback.
