What Really Matters in IT Event Management
Short Answer: Meaningful alerts
A Bit Longer Answer: A single pane of glass combining real-time event analytic visualizations with contextual alerts sent in real time
Event monitoring systems only show their true value when a component is trending towards failure or has already failed, for example a solid-state storage device (SSD) in a rack-mounted storage array.
With proper configuration and pertinent events streaming in, event monitoring systems can continually perform single-level and multi-level analytics to determine whether anomalies are occurring, whether thresholds are about to be breached, or whether they have already been breached. When such a condition exists and persists for a user-specified period, meaningful alerts can be sent via appropriate channels to assist with prompt resolution of the issue.
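As a rough illustration of the "condition persists for a user-specified period" idea, here is a minimal Python sketch. It is not how SAP ITOA or any particular product implements alerting; the class name `SustainedThresholdRule`, its fields, and the SSD temperature numbers are all hypothetical, chosen only to show the logic.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple


@dataclass
class SustainedThresholdRule:
    """Hypothetical rule: alert only when a breach lasts long enough."""
    threshold: float     # assumed limit, e.g. SSD temperature in Celsius
    hold_seconds: float  # user-specified period the breach must persist

    def alerts(self, events: Iterable[Tuple[float, float]]) -> Iterator[float]:
        """Yield the timestamp at which each sustained breach is confirmed.

        `events` are (timestamp_seconds, value) pairs in time order.
        """
        breach_start: Optional[float] = None
        alerted = False
        for ts, value in events:
            if value > self.threshold:
                if breach_start is None:
                    breach_start = ts  # first reading above the limit
                if not alerted and ts - breach_start >= self.hold_seconds:
                    alerted = True     # alert once per continuous breach
                    yield ts
            else:
                # Back under the limit: forget the breach and re-arm.
                breach_start = None
                alerted = False


# Made-up readings every 30 seconds: a sustained breach, then a brief one.
readings = [(0, 65), (30, 72), (60, 74), (90, 75), (120, 76), (150, 68), (180, 71)]
rule = SustainedThresholdRule(threshold=70.0, hold_seconds=60)
confirmed = list(rule.alerts(readings))
```

A brief spike that clears before the hold period elapses produces no alert, which is what keeps the alerts meaningful rather than noisy.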
This is all that really matters.
I can hear you asking: “But what about all the beautiful visualizations I display on my wall monitors showing the state of my systems?”
They are great, if someone happens to be looking at them when issues are beginning to occur. After the fact, these displays are of minimal value given the likelihood of cascade failures. They do look great on the walls though!
When an alert is received, the ability to quickly synchronize the visualizations with the alert time is key. This is when the visualizations show their true worth: a skilled practitioner is looking at them and needs to figure out what is causing the issue. The ability to quickly view the alerting component and its associated components, and to drill down quickly, is the hallmark of useful analytic visualizations.
A screenshot showing how to select a time range in SAP IT Operations Analytics. Custom time ranges can be selected by sliding the ruler along the top or designating specific dates and times in the calendar input. A variety of presets are also available in the drop-down menu.
Meaningful alerts and analytic visualizations that can be synchronized with the alert time are the two primary criteria that must be evaluated when selecting an event monitoring system. If an event monitoring system has top-tier capabilities in these two areas, it will be used and its return on investment (ROI) will be easy to measure.
The reason is simple.
The underlying cause(s) of the alert will eventually be determined. Once the causes are identified, additional analytic processing will be set up to monitor those components for the condition that caused the original alert. Over time, a library of analytic processing workflows will be built that continually monitors for the component problem signatures associated with previous issues. All systems will improve over time as the monitoring becomes better and better at noticing issues before they become problems. For example, a storage array manufacturer might set up different analytic processing workflows for different SSD manufacturers or batches of SSDs based on previous issues.
Other event monitoring system criteria to evaluate, in decreasing order of importance, are:
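To make the idea of a growing library of problem signatures a little more concrete, here is a small Python sketch. Everything in it is an assumption made for illustration: the `SignatureLibrary` class, the `"*"` wildcard for "all batches", the manufacturer name `AcmeFlash`, the batch `B17`, and the metric names are hypothetical and do not come from any real product.

```python
from typing import Callable, Dict, List, Tuple

# A signature is a predicate over one component's current metrics.
Signature = Callable[[Dict[str, float]], bool]


class SignatureLibrary:
    """Hypothetical registry of checks learned from previous issues."""

    def __init__(self) -> None:
        self._checks: Dict[Tuple[str, str], List[Tuple[str, Signature]]] = {}

    def register(self, manufacturer: str, batch: str, name: str,
                 check: Signature) -> None:
        """Add a named check; batch "*" applies to every batch."""
        self._checks.setdefault((manufacturer, batch), []).append((name, check))

    def matches(self, manufacturer: str, batch: str,
                metrics: Dict[str, float]) -> List[str]:
        """Return names of all registered signatures the metrics match."""
        candidates = list(self._checks.get((manufacturer, "*"), []))
        if batch != "*":
            candidates += self._checks.get((manufacturer, batch), [])
        return [name for name, check in candidates if check(metrics)]


library = SignatureLibrary()
# Learned from an earlier (made-up) incident affecting all AcmeFlash drives.
library.register("AcmeFlash", "*", "high-wear",
                 lambda m: m.get("wear_pct", 0) > 80)
# Learned from a (made-up) incident limited to batch B17.
library.register("AcmeFlash", "B17", "realloc-spike",
                 lambda m: m.get("reallocated_sectors", 0) > 50)

sample = {"wear_pct": 85, "reallocated_sectors": 60}
hits_b17 = library.matches("AcmeFlash", "B17", sample)
hits_b02 = library.matches("AcmeFlash", "B02", sample)
```

Each resolved incident adds one more `register` call, so the same drive readings are checked against everything learned so far.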
- Ease of creating analytic processing on input data streams
- Richness of analytic library functions (statistical, predictive, etc.)
- Scaling capability with increases in input volumes
- Alert hierarchies (e.g. a rack of SSDs versus individual SSDs)
A screenshot demonstrating how easy it is to build complex queries in SAP IT Operations Analytics using the Analytics Builder. The guided nature of the GUI means that all available data is presented in drop-down menus.
Features such as easy setup and parsing of input streams, user creation, etc. are all necessary. However, they do not constitute the core value that is needed for effective IT event management.
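The last criterion, alert hierarchies, can be sketched as a simple roll-up: when enough SSDs in one rack are alerting, send a single rack-level alert instead of one per drive. This is only an illustrative sketch; the function `roll_up`, the `rack_fraction` parameter, and the rack and drive identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def roll_up(alerting: Iterable[Tuple[str, str]],
            rack_sizes: Dict[str, int],
            rack_fraction: float = 0.5) -> List[str]:
    """Collapse per-SSD alerts into rack-level alerts where warranted.

    `alerting` holds (rack_id, ssd_id) pairs for drives currently alerting.
    `rack_sizes` maps rack_id to the number of SSDs installed in that rack.
    A rack-level alert replaces the individual ones when the share of
    alerting SSDs reaches `rack_fraction` (an assumed, tunable cutoff).
    """
    by_rack: Dict[str, Set[str]] = defaultdict(set)
    for rack, ssd in alerting:
        by_rack[rack].add(ssd)  # a set ignores duplicate reports

    result: List[str] = []
    for rack in sorted(by_rack):
        ssds = sorted(by_rack[rack])
        if len(ssds) / rack_sizes[rack] >= rack_fraction:
            result.append(f"rack:{rack}")
        else:
            result.extend(f"ssd:{rack}/{ssd}" for ssd in ssds)
    return result


current = [("r1", "s1"), ("r1", "s2"), ("r1", "s3"), ("r2", "s1")]
alerts = roll_up(current, rack_sizes={"r1": 4, "r2": 4})
```

Here three of four drives in `r1` are alerting, so the recipient gets one rack-level alert for `r1` and one drive-level alert for `r2`, which points the investigation at shared causes such as power or cooling first.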
SAP’s IT Operations Analytics (SAP ITOA) product focuses on what is important. It leverages the analytic power, processing speed, and scaling capabilities of the SAP HANA platform. And that’s why it deserves to be on your evaluation short list.