Advanced monitoring and health check with RealCore’s CPI Dashboard
Although SAP CPI, as a relatively new product, is constantly being extended, a few features are still missing, especially in the area of monitoring: for example, an overview of the message volume of the last few days or weeks, or an overview of the system utilization. For that reason (and of course to see what’s technically possible), we’ve developed a small tool called “Realcore CPI Dashboard” which we’d like to share with you.
Since the article is a bit longer, we have divided it into the following sections:
- What is Realcore’s SAP CPI Dashboard?
- Features and data sources
- How to install and configure the dashboard?
What is Realcore’s SAP CPI Dashboard?
The Realcore SAP CPI Dashboard is a web-based tool (a web page) which enables you to check different parameters of your SAP CPI instance. For the sake of simplicity, the complete tool is packed into one single IFlow which does all the technical handling: retrieving the information, providing the API endpoints for the web frontend and delivering the web frontend itself. The following screenshot gives a preview of the look’n’feel.
Features and data sources
Now let’s take a look at the different features. Some are probably self-explanatory, while others certainly need an explanation. We will also highlight which data sources the values are taken from, so that you know how we retrieve the data.
The (system) overview tiles show four groups of values: the CPU load average for the last 1, 5 and 15 minutes; the CPU usage and IO wait time in percent (IO wait = time the processor idles while waiting for IO operations such as disk writes); the system uptime; and the message count for the current day.
- CPU Load Average: Read and parsed via Groovy Script from /proc/loadavg
- CPU usage and IO wait: Read and calculated via Groovy Script from /proc/stat
- System uptime: Read and parsed via Groovy Script from /proc/uptime
- Messages today: Read via REST call from /MessageProcessingLogs-api
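To make the parsing of the /proc files a bit more tangible, here is a minimal Java sketch of how such values could be derived. It is not the dashboard's actual Groovy code; the class and method names are our own illustration, and we assume the usual Linux field order of the aggregate "cpu" line (user, nice, system, idle, iowait, ...). Since the counters in /proc/stat are cumulative, CPU usage and IO wait have to be calculated from the difference between two samples.

```java
final class ProcStatSketch {

    /** Parses the 1, 5 and 15 minute load averages from the content of /proc/loadavg. */
    static double[] parseLoadAvg(String content) {
        String[] f = content.trim().split("\\s+");
        return new double[] {
            Double.parseDouble(f[0]), Double.parseDouble(f[1]), Double.parseDouble(f[2])
        };
    }

    /** Parses the uptime in whole seconds from the content of /proc/uptime. */
    static long parseUptimeSeconds(String content) {
        return (long) Double.parseDouble(content.trim().split("\\s+")[0]);
    }

    /** Parses the aggregate "cpu" line of /proc/stat into its tick counters. */
    static long[] parseCpuLine(String line) {
        String[] f = line.trim().split("\\s+");
        // f[0] is the label "cpu"; then: user nice system idle iowait irq softirq steal ...
        long[] counters = new long[f.length - 1];
        for (int i = 1; i < f.length; i++) {
            counters[i - 1] = Long.parseLong(f[i]);
        }
        return counters;
    }

    /** Returns {cpuUsagePercent, ioWaitPercent} for the interval between two samples. */
    static double[] cpuPercentages(long[] before, long[] after) {
        // Only the first eight counters are summed; guest time is already part of user/nice.
        int n = Math.min(8, Math.min(before.length, after.length));
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += after[i] - before[i];
        }
        if (total <= 0) {
            return new double[] {0.0, 0.0};
        }
        long idle = after[3] - before[3];
        long ioWait = after[4] - before[4];
        return new double[] {
            100.0 * (total - idle - ioWait) / total,
            100.0 * ioWait / total
        };
    }

    public static void main(String[] args) {
        // Made-up sample lines instead of real file reads, so the sketch runs anywhere.
        long[] before = parseCpuLine("cpu  100 0 50 800 50 0 0 0 0 0");
        long[] after = parseCpuLine("cpu  150 0 70 1700 80 0 0 0 0 0");
        double[] p = cpuPercentages(before, after);
        System.out.println("CPU usage %: " + p[0] + ", IO wait %: " + p[1]);
        System.out.println("Load 1 min: " + parseLoadAvg("0.52 0.41 0.30 2/345 12345")[0]);
    }
}
```

With the two made-up samples in main, 1000 ticks pass in total, of which 900 are idle and 30 are IO wait, which gives 7 % CPU usage and 3 % IO wait.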
System memory and swap
The system memory and swap section shows the current system (=container which hosts the IFlow) memory and swap size and usage. For both, memory and swap, you can see used, free and total values.
The actual values are read via Groovy Script from the /proc/meminfo system file. Note: For the memory, the value “MemAvailable” and not “MemFree” is used. (MemFree is usually smaller than MemAvailable, because MemAvailable also counts memory that the kernel can reclaim, such as caches. We thought it’s more interesting to find out how much memory could be allocated at all.)
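As an illustration, the used/free/total values could be derived from the /proc/meminfo content roughly like this. This is a sketch in Java rather than the dashboard's Groovy code, and the class and method names are hypothetical. The field names MemTotal, MemAvailable, SwapTotal and SwapFree are the ones the Linux kernel writes into that file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MemInfoSketch {

    /** Parses lines like "MemTotal:  8000000 kB" into a map of field name to value in kB. */
    static Map<String, Long> parse(String memInfo) {
        Map<String, Long> values = new LinkedHashMap<>();
        for (String line : memInfo.split("\\R")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a "key: value" line
            }
            String[] parts = line.substring(colon + 1).trim().split("\\s+");
            if (parts[0].isEmpty()) {
                continue; // key without a value
            }
            values.put(line.substring(0, colon).trim(), Long.parseLong(parts[0]));
        }
        return values;
    }

    /** Used memory based on MemAvailable (not MemFree), as described in the text. */
    static long usedMemoryKb(Map<String, Long> m) {
        return m.get("MemTotal") - m.get("MemAvailable");
    }

    static long usedSwapKb(Map<String, Long> m) {
        return m.get("SwapTotal") - m.get("SwapFree");
    }

    public static void main(String[] args) {
        // Made-up sample content; on a Linux host it would be read from /proc/meminfo.
        String sample = "MemTotal:        8000000 kB\n"
                + "MemFree:         1000000 kB\n"
                + "MemAvailable:    5000000 kB\n"
                + "SwapTotal:       2000000 kB\n"
                + "SwapFree:        1500000 kB\n";
        Map<String, Long> m = parse(sample);
        System.out.println("Used memory (kB): " + usedMemoryKb(m));
        System.out.println("Used swap (kB): " + usedSwapKb(m));
    }
}
```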
The Java Information card shows the memory available to the Java Virtual Machine (JVM) as well as some basic information about the Java installation in use, such as version, vendor and the system user under which the JVM is running.
Information about the Java version and the user is read via Groovy Script using Java’s System.getProperty() function with property keys such as java.*.
The CPU information card shows information about the CPUs assigned to the machine which hosts the dashboard IFlow. It gives you an idea of what hardware your tenant is running on and helps you become aware of changes to the host system.
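A minimal sketch of such a lookup in plain Java could look as follows. The selection of property keys is only an example, the "jvm.*" map keys are names we made up for this illustration, and reading the JVM memory via Runtime is our assumption about one possible way to obtain these numbers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class JavaInfoSketch {

    /** Collects a few standard system properties plus the JVM memory figures. */
    static Map<String, String> collect() {
        Map<String, String> info = new LinkedHashMap<>();
        // Standard JDK system property keys; "n/a" is used if a key is not set.
        for (String key : new String[] {"java.version", "java.vendor", "java.vm.name", "user.name"}) {
            info.put(key, System.getProperty(key, "n/a"));
        }
        // Hypothetical key names for the memory values reported by the JVM itself.
        Runtime rt = Runtime.getRuntime();
        info.put("jvm.maxMemoryBytes", Long.toString(rt.maxMemory()));
        info.put("jvm.totalMemoryBytes", Long.toString(rt.totalMemory()));
        info.put("jvm.freeMemoryBytes", Long.toString(rt.freeMemory()));
        return info;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```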
The actual values are read via Groovy Script from the /proc/cpuinfo system file.
The Disk usage card shows the usage (used/available space) for different mounts of the Linux machine your CPI is running on. It may help you to identify performance issues.
The information shown is gathered via Java’s java.io.File class and its methods getTotalSpace() and getUsableSpace().
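For illustration, the two File methods can be combined into a usage figure like this. It is a sketch with hypothetical class and method names; which mounts the dashboard actually inspects is not shown here, so the example simply iterates over the file system roots.

```java
import java.io.File;
import java.util.Locale;

final class DiskUsageSketch {

    static long usedBytes(long total, long usable) {
        return Math.max(0, total - usable);
    }

    static double usedPercent(long total, long usable) {
        // getTotalSpace() returns 0 if the path does not name a partition.
        return total <= 0 ? 0.0 : 100.0 * usedBytes(total, usable) / total;
    }

    /** Builds a one-line summary for the partition that contains the given path. */
    static String describe(File mount) {
        long total = mount.getTotalSpace();
        long usable = mount.getUsableSpace();
        return String.format(Locale.ROOT, "%s: %.1f%% used (%d of %d bytes usable)",
                mount.getPath(), usedPercent(total, usable), usable, total);
    }

    public static void main(String[] args) {
        File[] roots = File.listRoots();
        if (roots != null) {
            for (File root : roots) {
                System.out.println(describe(root));
            }
        }
    }
}
```

Note that getUsableSpace() reports the space available to the current JVM process, which can be lower than the raw free space of the partition.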
Message volume / last 30 days
This card shows the total message volume (=all messages regardless of their status) for the last 30 days, with the interval split into two halves. This makes it easy to see whether your total message volume during roughly the last two weeks is increasing or decreasing compared to the two weeks before.
The message volume is read via a REST API call from /MessageProcessingLogs-api.
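The comparison of the two halves can be sketched as follows. We assume here that the daily message counts have already been fetched and are ordered from oldest to newest; the class and method names are hypothetical and the numbers are made up.

```java
final class VolumeTrendSketch {

    /** Returns {olderHalfTotal, recentHalfTotal}; dailyCounts[0] is the oldest day. */
    static long[] halves(long[] dailyCounts) {
        int mid = dailyCounts.length / 2;
        long older = 0;
        long recent = 0;
        for (int i = 0; i < dailyCounts.length; i++) {
            if (i < mid) {
                older += dailyCounts[i];
            } else {
                recent += dailyCounts[i]; // with an odd number of days, the extra day lands here
            }
        }
        return new long[] {older, recent};
    }

    /** Relative change of the recent half compared to the older half, in percent. */
    static double changePercent(long older, long recent) {
        if (older == 0) {
            return recent == 0 ? 0.0 : Double.POSITIVE_INFINITY;
        }
        return 100.0 * (recent - older) / older;
    }

    public static void main(String[] args) {
        long[] counts = {10, 20, 30, 60}; // made-up daily totals
        long[] h = halves(counts);
        System.out.println("Older half: " + h[0] + ", recent half: " + h[1]);
        System.out.println("Change in %: " + changePercent(h[0], h[1]));
    }
}
```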
Message volume last 2 days
This card shows the total message volume for the last 48h. It should help you to find out if there are irregularities between the message volume today and the day before.
The message volume is read via a REST API call from /MessageProcessingLogs-api.
The Software versions card shows the software versions of different “core” parts of the SAP CPI (like OS, CPI release, Groovy version, …) as well as of each and every SAP CPI module. It may be helpful in two ways: if something breaks without any changes on your side, you can check whether there was an update which may have broken your interfaces; and you can find out which features of Groovy/Java/XSLT you can use (you may want to use bleeding-edge features, but aren’t sure whether CPI already runs the latest Java/Groovy/etc. version).
- Operating system: Read and parsed via Groovy Script from the /proc/version system file
- SAP CPI release: Read via Groovy Script and FrameworkUtil.getBundle()-function
- Groovy release: Read from Groovy-variable GroovySystem.version
- JVM version: Read via Groovy Script by use of System.getProperty(“java.version”)
- XSLT Engine: Read via XSLT mapping and its select/property function (<xsl:variable name="properties" select="('xsl:version')"/>)
- SAP CPI modules: Read via reflection/FrameworkUtil.getBundle()
The logfiles card lists various log files from the CPI and offers three options for each file: 1) Open the log file in a new web browser tab. 2) Open the log file in an in-page popup. 3) Download the log file. These are the same files as the ones accessible via the SAP CPI monitoring perspective, but with our dashboard you can view them immediately without having to download and unzip them first. Note: This view is only visible to dashboard users who have the appropriate “log file”-role.
The log files are accessed directly via Groovy’s/Java’s file API.
This view shows information about the current user of the dashboard. It shows the S-User/user account which was used to call the dashboard, shows whether the user has sufficient rights to view the dashboard at all, the logfiles and the password section, and lists all of the roles assigned to the user.
The username is read from SAP CPI’s message header. The “Can access”-checks are derived from the role list. The role list itself is retrieved via the SAP Cloud Platform Authorization and Management API.
The security material view lists all credential pairs from your SAP CPI instance. In addition, it is able to show the passwords/secret keys of your credentials. Since this is highly sensitive data, this card is disabled by default. Note: This view is only visible to dashboard users who have the appropriate “security material”-role.
The security material list is queried via the web-based /api/v1/UserCredentials-api. The passwords themselves are retrieved via the official ITApiFactory/SecureStoreService-(Groovy Script-)api. If you now wonder whether it’s really that easy to read passwords: yes, it is. Everyone who has sufficient rights to deploy an IFlow also has enough rights to read out all passwords. (If you’re interested, you should read this article and its comments.)
The artifact comparison view is my personal favourite among the dashboard’s functions. It collects all artifacts from the design perspective as well as from the runtime (=deployed artifacts), matches them by their artifact id and then shows whether their versions are equal or differ. That way you can easily find out where you have artifacts in new versions which aren’t deployed yet, as well as where you might have deleted an artifact but forgotten to undeploy it.
For artifact retrieval, two web APIs are used: /api/v1/IntegrationRuntimeArtifacts-api and /itspaces/odata/1.0/workspace.svc/ContentEntities.ContentPackages-api (unofficial).
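The matching step itself can be sketched as follows, assuming both API results have already been reduced to maps from artifact id to version string. The class, the enum and its values are hypothetical names chosen for this illustration, not the dashboard's actual code.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class ArtifactCompareSketch {

    enum Status { SAME_VERSION, VERSION_DIFFERS, NOT_DEPLOYED, ONLY_DEPLOYED }

    /** Matches design-time and runtime artifacts by id and classifies each id. */
    static Map<String, Status> compare(Map<String, String> design, Map<String, String> runtime) {
        Set<String> ids = new TreeSet<>(design.keySet());
        ids.addAll(runtime.keySet());
        Map<String, Status> result = new TreeMap<>();
        for (String id : ids) {
            String designVersion = design.get(id);
            String runtimeVersion = runtime.get(id);
            if (designVersion == null) {
                result.put(id, Status.ONLY_DEPLOYED); // deleted in design, but still deployed
            } else if (runtimeVersion == null) {
                result.put(id, Status.NOT_DEPLOYED);
            } else if (designVersion.equals(runtimeVersion)) {
                result.put(id, Status.SAME_VERSION);
            } else {
                result.put(id, Status.VERSION_DIFFERS);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> design = new TreeMap<>();
        design.put("OrderFlow", "1.0.3");
        design.put("InvoiceFlow", "2.1.0");
        Map<String, String> runtime = new TreeMap<>();
        runtime.put("OrderFlow", "1.0.2");
        runtime.put("LegacyFlow", "1.0.0");
        compare(design, runtime).forEach((id, status) -> System.out.println(id + ": " + status));
    }
}
```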
The alerting engine allows you to configure alerting rules which are then checked regularly by the dashboard. If a message or a certificate from the keystore matches one of the configured rules, an email will be sent out. You can configure two types of alerting rules: message-based rules and certificate-based rules. Message-based rules can check for specific IFlows, sender or receiver systems as well as for specific message states. If a message matches the combination of filters set up in a rule, an email will be sent to the mail receiver configured in the rule. Certificate-based rules allow you to set up alert emails that inform you X days before any certificate, or one specific certificate, loses its validity.
The alerting engine is implemented within the IFlow itself. Based on a timer/trigger, the /MessageProcessingLogs-api is called and the results are checked against the given alert rules.
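To give an idea of what such a rule check could look like, here is a minimal sketch. All class, field and method names are hypothetical, and we assume that an unset filter in a rule means "match anything". The real rule format of the dashboard may differ.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

final class AlertRuleSketch {

    /** Minimal stand-in for one entry returned by the message processing log API. */
    static final class Message {
        final String iflow, sender, receiver, status;

        Message(String iflow, String sender, String receiver, String status) {
            this.iflow = iflow;
            this.sender = sender;
            this.receiver = receiver;
            this.status = status;
        }
    }

    /** A message-based rule; a null filter is treated as "match anything". */
    static final class MessageRule {
        final String iflow, sender, receiver, status, mailTo;

        MessageRule(String iflow, String sender, String receiver, String status, String mailTo) {
            this.iflow = iflow;
            this.sender = sender;
            this.receiver = receiver;
            this.status = status;
            this.mailTo = mailTo;
        }

        /** All configured filters must match for the rule to fire. */
        boolean matches(Message m) {
            return fieldMatches(iflow, m.iflow) && fieldMatches(sender, m.sender)
                    && fieldMatches(receiver, m.receiver) && fieldMatches(status, m.status);
        }
    }

    static boolean fieldMatches(String filter, String value) {
        return filter == null || filter.equals(value);
    }

    /** True if the certificate expires within daysBefore days from today (or already has). */
    static boolean certificateDue(LocalDate notAfter, LocalDate today, int daysBefore) {
        return ChronoUnit.DAYS.between(today, notAfter) <= daysBefore;
    }

    public static void main(String[] args) {
        MessageRule rule = new MessageRule("OrderFlow", null, null, "FAILED", "ops@example.com");
        Message msg = new Message("OrderFlow", "ERP", "CRM", "FAILED");
        if (rule.matches(msg)) {
            System.out.println("Would send alert mail to " + rule.mailTo);
        }
    }
}
```

Sending the mail itself is left out on purpose, since that part depends on the mail adapter configuration of the IFlow.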
IFlow scheduler overview
The IFlow scheduler overview helps you with two lists. One list shows all upcoming IFlow run dates based on timer-events for the next 24h. The second list shows all IFlows containing timers and gives some more detailed information like the cron expression, the next three runs and parameters used in the timer configuration.
The run times/dates are calculated from the cron expressions. The cron expressions themselves are read from CPI’s runtime node via Groovy script and the OSGi classes. If you want to learn more about OSGi in the context of CPI, you should read this article.
IFlow comparison tool
The IFlow comparison tool allows you to compare two IFlows with ease. For text-based files it gives you a side-by-side diff view. For binary files it will show differences in CRC (checksums). Thus you can find differences between IFlows in seconds. You can either compare two IFlows from the same tenant or do a remote diff against another SAP CPI tenant.
This feature is based on two sources. For the package and artifact retrieval the (unofficial) /itspaces/odata/1.0/workspace.svc/ContentEntities.ContentPackages-api is used. For getting the content (IFlows) themselves the official /api/v1/IntegrationDesigntimeArtifacts-api is used.
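The checksum comparison for binary files can be sketched with the JDK's own CRC32 class. That the dashboard uses exactly this class is our assumption, and the wrapper class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class ChecksumDiffSketch {

    /** Calculates the CRC-32 checksum of the given file content. */
    static long crc32(byte[] content) {
        CRC32 crc = new CRC32();
        crc.update(content);
        return crc.getValue();
    }

    /** Two binary files are reported as different if their checksums differ. */
    static boolean differs(byte[] left, byte[] right) {
        return crc32(left) != crc32(right);
    }

    public static void main(String[] args) {
        byte[] a = "binary content A".getBytes(StandardCharsets.UTF_8);
        byte[] b = "binary content B".getBytes(StandardCharsets.UTF_8);
        System.out.printf("CRC left: %08X, CRC right: %08X, differs: %b%n",
                crc32(a), crc32(b), differs(a, b));
    }
}
```

Keep in mind that different checksums prove that two files differ, while equal checksums only make it very likely that they are identical.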
How to install and configure the dashboard?
Since this article is already rather long, we decided to provide the installation and configuration information in a second blog post which can be found here.
Time for a self-critical review: The development of the tool and especially the exploration of the technical / operational limits was definitely interesting. I think our tool definitely adds value to a CPI guy’s life.
Nevertheless, there are a few limitations. Of course, values such as CPU utilization or memory consumption only refer to the runtime node on which the dashboard IFlow is deployed. Since, to my knowledge, you cannot influence on which node(-id) an IFlow is deployed, it will be difficult to monitor all nodes. Still, I think reading RAM and CPU makes sense, because scaling across the nodes should leave all nodes with a similar load. (Attention: this is just my personal guess. At this point, I would love to be taught otherwise.)
The source code as well as the deployable IFlow has been published on GitHub as open source (MIT licensed). (A small manual on how to develop and “compile” the IFlow if you want to make changes will follow.)
We have now reached the end of the first article of this series. I hope you enjoyed reading it, and I appreciate your feedback, thoughts and comments.
Thanks and credits: I would like to thank…