SAP IdM Monitoring in the cloud
Goal
One of the biggest issues we face in operating SAP IdM instances is live monitoring of the system. The vanilla tools provided by SAP do not fulfil all the needs we have when operating multiple IdM instances – or are too difficult to handle for a sysadmin who is not very experienced in this topic.
So we made plans to create a monitoring tool which is easy to set up, update, operate and use.
With on-premise solutions we would face several difficulties:
- The user has to set up a new monitoring instance
- The user has to install the whole framework
- The user has to administrate the tool
- Updates have to be distributed, and the user has to reinstall/upgrade the framework himself
- We do not have access to the data without VPN connections – which makes mass monitoring of our hundreds of customers very costly
- Data cannot be collected and analysed in a big-data environment, where the results might give us clues on how to optimize the systems
Given these issues, we decided to create a monitoring tool which operates in the cloud, and we wanted to offer the software as a service.
Architecture
A small agent (a Java application) is installed on the application server of the IdM instance. This agent collects data from the database via a JDBC connection and transmits the data via REST calls (over HTTPS) to our cloud application.
The REST interface saves the data into a HANA instance running in the SAP Cloud. From there the frontend (WebUI) can access the data and present it to the user.
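Simplified, the agent's collect-and-transmit cycle could look like the following sketch. The endpoint URL, payload format and metric names are hypothetical placeholders, not the real product configuration:

```java
// Minimal sketch of the agent's collect-and-transmit loop.
// Endpoint, payload layout and column names are illustrative assumptions.
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class MonitoringAgent {

    // Runs one query against the IdM database and wraps the single
    // numeric result into a small JSON payload.
    static String collect(Connection db, String metric, String sql) throws Exception {
        try (Statement st = db.createStatement(); ResultSet rs = st.executeQuery(sql)) {
            rs.next();
            return toPayload(metric, rs.getLong("num"));
        }
    }

    // Pure helper: builds the JSON body that gets POSTed to the cloud.
    static String toPayload(String metric, long value) {
        return String.format("{\"metric\":\"%s\",\"value\":%d,\"ts\":%d}",
                metric, value, System.currentTimeMillis());
    }

    // Transmits the payload over HTTPS to the REST interface.
    static int send(String endpoint, String payload) throws Exception {
        HttpURLConnection con = (HttpURLConnection) new URL(endpoint).openConnection();
        con.setRequestMethod("POST");
        con.setRequestProperty("Content-Type", "application/json");
        con.setDoOutput(true);
        con.getOutputStream().write(payload.getBytes(StandardCharsets.UTF_8));
        return con.getResponseCode();
    }
}
```

The payload builder is kept as a pure helper, so it can be tested without a database or network connection.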
Functionality
We currently monitor:
- Dispatcher availability
- Number of entries (active/inactive)
- Provisioning queue
- Approvals (open/closed)
- System Logs
- Linkstates
Supported IdM Versions:
- 8.0
- 7.2 > SP6
Databases
Since SAP IdM can be installed on several different databases, we need to cover those as well.
For that, we implemented a dynamic SQL abstraction layer which supports the following databases:
- Oracle
- MSSQL
- Sybase
DB2 is also compatible if you use the Oracle plugin on top of it.
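As an illustration of what such an abstraction layer might do, here is a heavily simplified sketch: the queries are kept in MSSQL flavour and rewritten for other targets at runtime. The concrete rewrite rules (and the assumption that Sybase accepts the T-SQL-style hints unchanged) are illustrative only, not the real layer:

```java
// Tiny sketch of a SQL dialect layer: MSSQL-flavoured statements are
// rewritten for other databases at runtime. Rules are illustrative.
public class SqlDialect {

    public enum Db { MSSQL, ORACLE, SYBASE }

    // Adapts a generic MSSQL-style statement to the target database.
    static String adapt(String sql, Db target) {
        switch (target) {
            case MSSQL:
            case SYBASE:
                // Assumption for this sketch: Sybase takes the T-SQL style as-is
                return sql;
            case ORACLE:
                // Oracle has no NOLOCK hint and uses SYSDATE instead of GETDATE()
                return sql.replace(" WITH (NOLOCK)", "")
                          .replace("GETDATE()", "SYSDATE");
            default:
                throw new IllegalArgumentException(target.toString());
        }
    }
}
```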
The queries for the statements are outsourced to a JSON file and can be edited, tweaked or removed if you don't want a particular query to be executed.
This way you also have control over the data which gets sent to the cloud systems; you might not want to transmit system logs to the server? Delete the query from the configuration file and they won't be monitored anymore.
Sample code:
{
  "dispatcher": {
    "all": "select MACHINE, LAST_VISITED from mc_dispatcher WITH (NOLOCK)"
  },
  "provqueue": {
    "all": "select count(*) as num from mxp_provision WITH (NOLOCK)",
    "onehour": "select count(*) as num from mxp_provision WITH (NOLOCK) where exectime >= DATEADD(HOUR, -1, GETDATE())",
    "fourhour": "select count(*) as num from mxp_provision WITH (NOLOCK) where exectime >= DATEADD(HOUR, -4, GETDATE()) and exectime <= DATEADD(HOUR, -1, GETDATE())",
    "eighthour": "select count(*) as num from mxp_provision WITH (NOLOCK) where exectime >= DATEADD(HOUR, -8, GETDATE()) and exectime <= DATEADD(HOUR, -4, GETDATE())",
    "oneday": "select count(*) as num from mxp_provision WITH (NOLOCK) where exectime >= DATEADD(HOUR, -24, GETDATE()) and exectime <= DATEADD(HOUR, -8, GETDATE())",
    "oneweek": "select count(*) as num from mxp_provision WITH (NOLOCK) where exectime <= DATEADD(DAY, -7, GETDATE())"
  },
  "entries": {
    "all": "select count(*) as num from idmv_entry_simple e1 WITH (NOLOCK) left join idmv_value_basic e2 WITH (NOLOCK) on e1.mcmskey = e2.mskey and e2.attrname = 'MX_DISABLED' where e1.mcentrytype = ?",
    "active": "select count(*) as num from idmv_entry_simple e1 WITH (NOLOCK) left join idmv_value_basic e2 WITH (NOLOCK) on e1.mcmskey = e2.mskey and e2.attrname = 'MX_DISABLED' where e1.mcentrytype = ? and e2.avalue is null"
  },
  "linkstates": {
    "all": "select mcexecstate, count(*) as num from mxi_link WITH (NOLOCK) group by mcexecstate"
  },
  "approvals": {
    "open": "select count(*) as num from mxi_approval WITH (NOLOCK)",
    "old": "select count(*) as num from idmv_linkaudit_basic WITH (NOLOCK) where mcoperation in (14,15)",
    "approved": "select count(*) as num from idmv_linkaudit_basic WITH (NOLOCK) where mcoperation = 14"
  },
  "syslog": {
    "all": "select top 10000 * from mc_syslog WITH (NOLOCK) where severity > 1 and logid > ? order by logid desc"
  }
}
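A trimmed-down query file can be honoured quite naturally on the agent side: any metric that is missing from the loaded configuration is simply never executed or transmitted. The sketch below uses an in-memory map to stand in for the parsed JSON file; the class and method names are illustrative:

```java
// Sketch: queries from the JSON file held as group -> variant -> SQL.
// A group the user deleted from the file is treated as "not monitored".
import java.util.HashMap;
import java.util.Map;

public class QueryConfig {

    private final Map<String, Map<String, String>> queries = new HashMap<>();

    // Registers one query variant (e.g. group "provqueue", variant "onehour").
    void put(String group, String variant, String sql) {
        queries.computeIfAbsent(group, g -> new HashMap<>()).put(variant, sql);
    }

    // Returns null when the query is absent, which the agent
    // interprets as "do not collect or transmit this metric".
    String lookup(String group, String variant) {
        Map<String, String> g = queries.get(group);
        return g == null ? null : g.get(variant);
    }

    boolean isMonitored(String group) {
        return queries.containsKey(group);
    }
}
```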
Machine Learning
One of the biggest benefits of using HANA:
We can analyse data and use fast algorithms to improve our systems. A very important topic is system logs.
You have hundreds or thousands of logs and no easy way to see what goes wrong in your IdM. Clicking through and analysing the logs one by one is the current solution.
We wanted to fix this issue and created a smart algorithm which collects the data, looks for similarities and creates patterns to categorize the logs.
Simplified, the algorithm works this way:
- We split the logs into words and look at the distribution of every word. This gives each word a specific “weight”.
- Then we compare the logs with different algorithms like Levenshtein distance and get further weights, too.
- We combine all weights with a special algorithm to get a final weight for every log.
- In the end the final weight is clustered, and similar logs end up in the same categories.
- Now we compare all categorized logs with their similar neighbours and generate a uniform SQL pattern from them.
- The new pattern gets created, and all data similar to this pattern gets categorized automatically.
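The similarity-and-clustering idea in the steps above can be sketched under simplified assumptions: logs whose normalised Levenshtein similarity exceeds a threshold fall into the same category. The real algorithm combines more signals (word weights, pattern generation); the threshold and the greedy strategy here are illustrative only:

```java
// Simplified sketch of similarity-based log categorization.
import java.util.ArrayList;
import java.util.List;

public class LogClusterer {

    // Classic Levenshtein edit distance between two strings.
    static int levenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++)
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1));
        return d[a.length()][b.length()];
    }

    // Similarity normalised to [0,1]; 1.0 means identical strings.
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / max;
    }

    // Greedy clustering: each log joins the first category whose first
    // member is similar enough, otherwise it opens a new category.
    static List<List<String>> cluster(List<String> logs, double threshold) {
        List<List<String>> categories = new ArrayList<>();
        for (String log : logs) {
            boolean placed = false;
            for (List<String> cat : categories) {
                if (similarity(cat.get(0), log) >= threshold) {
                    cat.add(log);
                    placed = true;
                    break;
                }
            }
            if (!placed) categories.add(new ArrayList<>(List.of(log)));
        }
        return categories;
    }
}
```

With a threshold around 0.7, two logs that differ only in a host name land in the same category, while an unrelated log opens a new one.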
In the end you have a really clear view of the issues your IdM has and can handle them.
Screenshots of the system log analysis are provided in the following gallery 😉
Overview
All the screenshots were made on a customer dev system we use to test our monitoring – sensitive things were blurred, and the data might seem a little bit “boring” 😉
Board
Dispatcher
Entries
Provisioning Queue
Approvals
Systemlogs Overview
Here you see our algorithm in action – the categorized logs are shown in a pie chart
Clicking on a category takes you to the detail view
You have the possibility to send a notification with the error to any mail address you want – or send it directly to our support:
Linkstates
Settings
Depending on the size of your instance, you might want to define custom thresholds for your board:
Current state
We are currently about to finish the alpha phase and are onboarding more and more of our customers' instances onto the new monitoring software.
We are also planning to open the service for general availability and to provide it as a free service.
As you saw in the mailing screenshots, we will offer you support – but you are free to decide whether you want us to support you or to handle things on your own 😉
Conclusion
The HANA Cloud is a great opportunity to enhance your SAP software with new, fast technologies – we want to use this platform to create a fast, reliable monitoring add-on for SAP IdM.
We want to extend the machine learning part of the monitoring to other topics – for example, to give you tips on which process in your IdM system fails often, how that might be avoided, and how other users perform in this area.
Predictive analysis is also a big topic we want to work on this year; one use case might be how long approvals take to be processed and how you can simplify them to save more time.
Do you have any other ideas to add to the monitoring? Or want to join a beta-phase? Or have questions? Hit me up 😉