Monitoring and alerting on critical errors of SAP HANA components in real time with Open Stack IT Operation analytics using ELK Stack
In the first part of this blog we covered the basics of the ELK Stack and demonstrated how to visualize SAP HANA logs in real time with Open Stack ITOA. In this part we explore how to set up alerts for critical errors in HANA core components.
We can watch HANA logs for changes or anomalies in real time and take the necessary action in response. For example, we may want to monitor critical errors (e.g., Service_Shutdown) of a particular component in a HANA instance and be notified immediately.
This can be achieved by enabling the Watcher feature in X-Pack, an Elastic Stack extension that bundles security, alerting, monitoring, reporting, and graph capabilities into one easy-to-install package. While the X-Pack components are designed to work together seamlessly, you can easily enable or disable the features you want to use.
How Watches Work
X-Pack provides an API for creating, managing and testing watches. A watch describes a single alert and can contain multiple notification actions.
A watch is constructed from four simple building blocks:
A schedule for running a query and checking the condition.
The query to run as input to the condition. Watches support the full Elasticsearch query language, including aggregations.
A condition that determines whether or not to execute the actions. You can use simple conditions (always true), or use scripting for more sophisticated scenarios.
One or more actions, such as sending email, pushing data to third-party systems through a webhook, or indexing the results of the query.
A full history of all watches is maintained in an Elasticsearch index. This history keeps track of each time a watch is triggered and records the results from the query, whether the condition was met, and what actions were taken.
Step 1: Download and install the X-Pack ELK Stack packages
1.1 Install Elasticsearch
To use X-Pack you must already have installed Elasticsearch and Kibana as described in Part 1. Install X-Pack by following the simple steps in the link below.
Step 2: Configure Email Accounts for sending alerts
Watcher can send email using any SMTP email service. Email messages can contain basic HTML tags.
The email accounts that Watcher can use to send email are configured in the xpack.notification.email namespace in elasticsearch.yml.
If your email account is configured to require two-step verification, you need to generate and use a unique App Password to send email from Watcher. Authentication will fail if you use your primary password.
For reference, see: https://www.elastic.co/guide/en/x-pack/current/actions-email.html#actions-email
Configure the elasticsearch.yml file as shown below and restart Elasticsearch and Kibana.
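As a rough illustration of what such a configuration can look like, the Python sketch below prints flat elasticsearch.yml settings for one email account. The setting names follow the xpack.notification.email.account namespace from the X-Pack email action documentation; the account name, SMTP host, user, and password are placeholders assumed for this example, not values from the original setup.

```python
def email_account_settings(account, host, port, user, password, profile="standard"):
    """Return flat elasticsearch.yml lines for one Watcher email account.

    All argument values are supplied by the caller; the ones used in the
    demo below are placeholders, not real credentials.
    """
    prefix = f"xpack.notification.email.account.{account}"
    settings = {
        f"{prefix}.profile": profile,
        f"{prefix}.smtp.auth": "true",
        f"{prefix}.smtp.starttls.enable": "true",
        f"{prefix}.smtp.host": host,
        f"{prefix}.smtp.port": str(port),
        f"{prefix}.smtp.user": user,
        f"{prefix}.smtp.password": password,
    }
    return [f"{key}: {value}" for key, value in settings.items()]


# Hypothetical account details, for illustration only.
lines = email_account_settings(
    account="hana_alerts",
    host="smtp.example.com",
    port=587,
    user="alerts@example.com",
    password="your-app-password",
)
print("\n".join(lines))
```

The printed lines can be pasted into elasticsearch.yml once the placeholders are replaced with the details of your own SMTP service.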
Step 3: Configure watcher for alerting
3.1 Verify watcher
Verify that Watcher is enabled by calling the Watcher Stats API.
In Kibana, go to the "Dev Tools" tab, run the command "GET _xpack/watcher/stats", and check whether the watcher state is started or stopped, as shown below.
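If you prefer to check the state programmatically, a small helper like the one below can inspect the parsed JSON returned by the stats call. This is only a sketch: it assumes the response carries a watcher_state field either at the top level or per node under a stats list, which may differ between X-Pack versions, and the sample dictionary is made up for illustration.

```python
def watcher_is_started(stats):
    """Return True if every reported watcher_state is 'started'.

    Assumption: the state appears either as a top-level 'watcher_state'
    or inside a per-node 'stats' list, depending on the X-Pack version.
    """
    states = [stats.get("watcher_state")]
    states += [node.get("watcher_state") for node in stats.get("stats", [])]
    states = [state for state in states if state is not None]
    return bool(states) and all(state == "started" for state in states)


# Illustrative response, not captured from a real cluster.
sample_response = {"watcher_state": "started", "watch_count": 1}
print(watcher_is_started(sample_response))
```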
3.2 Configure Watcher
Create the watcher and schedule its trigger.
Schedule triggers define when the watch execution should start, based on date and time. All times are specified in UTC. In this example the watcher is triggered at an interval of 15 minutes. The image below shows the script that creates a watcher named "hanacomponent_health".
Use the search input to load the results of an Elasticsearch search request into the execution context when the watch is triggered. The image below shows how to query all entries in the index "hanalog" and sort them in descending order by timestamp, since we need the latest entry from the index.
When a watch is triggered, its condition determines whether or not to execute the watch actions. Watcher supports five condition types; here we use the 'compare' condition, which performs a simple comparison against a value in the watch payload.
In this example we compare the "ACTION" field with the keyword "Service_Shutdown". If a "Service_Shutdown" ACTION entry is found in the index, the alert is triggered.
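For readers without access to the screenshot, the trigger part of such a watch can be sketched as a Python dictionary that serializes to the JSON expected by Watcher. The helper name schedule_trigger is introduced here only for illustration.

```python
import json


def schedule_trigger(interval):
    """Build the 'trigger' section of a watch for a fixed interval such as '15m'."""
    return {"trigger": {"schedule": {"interval": interval}}}


trigger = schedule_trigger("15m")
print(json.dumps(trigger, indent=2))
```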
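A search input of this kind might look like the sketch below. It assumes the timestamp field is named @timestamp (the Logstash default), which may not match the mapping of your "hanalog" index; adjust the field name accordingly. Limiting the result to one hit is also an assumption, made because only the latest entry is needed.

```python
import json


def latest_entry_input(index, timestamp_field="@timestamp"):
    """Build a search input that returns the newest document in an index.

    Assumption: 'timestamp_field' names the date field of the log entries.
    """
    return {
        "input": {
            "search": {
                "request": {
                    "indices": [index],
                    "body": {
                        "query": {"match_all": {}},
                        # Newest entry first, and only that entry.
                        "sort": [{timestamp_field: {"order": "desc"}}],
                        "size": 1,
                    },
                }
            }
        }
    }


search_input = latest_entry_input("hanalog")
print(json.dumps(search_input, indent=2))
```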
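The comparison can be sketched as follows. The payload path assumes the search input returns the latest entry as the first hit and that the log documents carry an ACTION field, as described above. The resolve helper is a simplified local stand-in used to show how a dotted path is looked up; it is not Watcher's own implementation, and the sample payload is invented for the demonstration.

```python
def compare_condition(path, value):
    """Build a 'compare' condition that checks a payload path for equality."""
    return {"condition": {"compare": {path: {"eq": value}}}}


def resolve(path, context):
    """Follow a dotted path through nested dicts and lists (simplified sketch)."""
    current = context
    for part in path.split("."):
        current = current[int(part)] if isinstance(current, list) else current[part]
    return current


PATH = "ctx.payload.hits.hits.0._source.ACTION"
condition = compare_condition(PATH, "Service_Shutdown")

# Invented payload with a single hit, mimicking the search input result.
context = {"ctx": {"payload": {"hits": {"hits": [{"_source": {"ACTION": "Service_Shutdown"}}]}}}}
expected = condition["condition"]["compare"][PATH]["eq"]
condition_met = resolve(PATH, context) == expected
print(condition_met)
```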
When a watch's condition is met, its actions are executed. A watch can perform multiple actions. The actions are executed one at a time and each action executes independently. Any failures encountered while executing an action are recorded in the action result and in the watch history.
Watcher supports the following types of actions: email, webhook, index, logging, hipchat, Slack, and pagerduty. We have chosen 'email' as the action type because we want an email notification that includes the visualization report as an attachment.
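An email action with a report attachment could be sketched as below. The action name send_email, the recipient, the subject, the body text, and the attachment name are all hypothetical, and the reporting URL is left as a placeholder because it depends on the report generated in your own Kibana instance.

```python
import json


def email_action(recipients, subject, body_text, report_url,
                 attachment_name="hana_report.pdf"):
    """Build an 'actions' section with one email action and a reporting attachment."""
    return {
        "actions": {
            "send_email": {  # hypothetical action name
                "email": {
                    "to": recipients,
                    "subject": subject,
                    "body": {"text": body_text},
                    "attachments": {
                        attachment_name: {"reporting": {"url": report_url}}
                    },
                }
            }
        }
    }


action = email_action(
    recipients=["hana-admin@example.com"],           # placeholder address
    subject="HANA component alert: Service_Shutdown",
    body_text="A Service_Shutdown entry was found in the hanalog index.",
    report_url="<Kibana reporting generation URL>",  # placeholder, fill in your own
)
print(json.dumps(action, indent=2))
```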
Acknowledgement and Throttling
During the watch execution, once the condition is met, a decision is made per configured action as to whether it should be throttled. The main purpose of action throttling is to prevent too many executions of the same action for the same watch.
For example, suppose you have a watch that detects errors in an application’s log entries. The watch is triggered every five minutes and searches for errors during the last hour. In this case, if there are errors, there is a period of time where the watch is checked and its actions are executed multiple times based on the same errors. As a result, the system administrator receives multiple notifications about the same issue, which can be annoying.
To address this issue, Watcher supports time-based throttling. You can define a throttling period as part of the action configuration to limit how often the action is executed. When you set a throttling period, Watcher prevents repeated execution of the action if it has already executed within the throttling period.
Step 4: Run the watcher
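The effect of time-based throttling can be illustrated with a small simulation. This is a sketch of the idea only, not Watcher's internal logic: it assumes the condition is met on every trigger and that an action may run again once the full throttling period has elapsed. In an actual watch, the period is set with the throttle_period key in the action configuration (for example "30m").

```python
from datetime import datetime, timedelta


def throttled_executions(trigger_times, throttle_period):
    """Return the trigger times at which the action would actually run."""
    executed = []
    last_run = None
    for moment in trigger_times:
        # Run only if the action has not run within the throttling period.
        if last_run is None or moment - last_run >= throttle_period:
            executed.append(moment)
            last_run = moment
    return executed


start = datetime(2024, 1, 1, 0, 0)  # arbitrary start time for the demo
triggers = [start + timedelta(minutes=5 * i) for i in range(12)]  # every 5 minutes for an hour
executed = throttled_executions(triggers, timedelta(minutes=30))
print([moment.strftime("%H:%M") for moment in executed])
```

With a five-minute trigger and a thirty-minute throttling period, the administrator in this simulation receives two notifications in the hour instead of twelve.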
Now execute the watcher as shown below from the Kibana "Dev Tools" tab.
The watcher is created as defined in the script. When the compare condition matches, that is, when one of the HANA components goes down, an email alert notification is sent with a report.
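Putting the pieces together, the sketch below assembles a complete watch body and prints it as a request that can be pasted into Dev Tools. The watch name, index, interval, and ACTION keyword come from this walkthrough; the @timestamp field, the action name, the recipient, the subject and body text, and the reporting URL are assumptions or placeholders to be replaced with your own values.

```python
import json

WATCH_ID = "hanacomponent_health"

watch = {
    # Run every 15 minutes.
    "trigger": {"schedule": {"interval": "15m"}},
    # Load the newest entry of the hanalog index (timestamp field name assumed).
    "input": {
        "search": {
            "request": {
                "indices": ["hanalog"],
                "body": {
                    "query": {"match_all": {}},
                    "sort": [{"@timestamp": {"order": "desc"}}],
                    "size": 1,
                },
            }
        }
    },
    # Fire only when the newest entry reports a service shutdown.
    "condition": {
        "compare": {
            "ctx.payload.hits.hits.0._source.ACTION": {"eq": "Service_Shutdown"}
        }
    },
    # Send an email with the report attached (all values are placeholders).
    "actions": {
        "send_email": {
            "email": {
                "to": ["hana-admin@example.com"],
                "subject": "HANA component alert: Service_Shutdown",
                "body": {"text": "A HANA component reported Service_Shutdown."},
                "attachments": {
                    "hana_report.pdf": {
                        "reporting": {"url": "<Kibana reporting generation URL>"}
                    }
                },
            }
        }
    },
}

request = f"PUT _xpack/watcher/watch/{WATCH_ID}\n{json.dumps(watch, indent=2)}"
print(request)
```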
Below is the screenshot of the notification that is sent
References : https://www.elastic.co/guide/en/x-pack/current/xpack-alerting.html