Five Important Things Every BI4 Administrator Should Know About the BI Monitoring Application
The BI Platform 4.x monitoring application has undergone significant improvements since the 4.0 release. In this SCN blog series, I’ll outline some important fixes, tips, tricks, and documentation regarding the BI Monitoring application that every BI administrator should be made aware of. The goal of this series is for you to get the most out of the native BI monitoring application so that you may keep your BI landscape as stable and efficient as possible.
Repeated Alert Reminders
When you use probes and alerts to programmatically re-create user workflows and confirm the availability of various aspects of your BI landscape, you may find that your email or BI inbox receives alerts that were already triggered in the past. This is particularly annoying because you depend on these alerts to inform you of system outages. If alerts arrive when a watch’s danger rule has not actually been breached, you may find yourself logging on to the company network instead of enjoying your weekend in peace. These “false” alerts occur because, by default, a 2-day reminder re-sends alerts that have not been acknowledged or read by the BI administrator. These reminders are also sent if the Monitoring service is restarted.
To fix this annoyance, go to Central Management Console, Applications, Monitoring and change the reminder setting from 2 days to 0 days. Then restart your Adaptive Processing Server (Monitoring Service).
Stability and High CPU Consumption
If you have noticed stability issues in the Adaptive Processing Server (Monitoring Service) in the BI 4.0 Support Pack 4 codeline, the cause is a set of significant performance issues in that codeline. In some cases, the Monitoring service may use 100% CPU on the node where the service is running. This can starve other services on the node of resources, which defeats the purpose of monitoring in the first place. The good news is that we have discovered and corrected a few memory leaks that are to blame for the stability issue. The more metrics and probes you have defined in your BI landscape, the more likely you are to experience this issue. After you apply the correction, the Monitoring service should be stable, and you can scale out your metrics and probes as much as needed to keep tabs on your BI landscape.
This problem is tracked under problem report ADAPT01682093. For more details about the problem, see note 1833881.
To fix the problem, upgrade your BI landscape to one of the following codelines:
- SAP BI Platform 4.0 Support Pack 7
- SAP BI Platform 4.0 Patch 4.12
- SAP BI Platform 4.0 Patch 6.1
Monitoring the Availability of a BI Server
One of the best features of BI Monitoring is that, by default, you can keep a watchful eye on the status of your BI4 servers. If one of your BI4 servers goes down, you can configure a watch to send an alert so that you can get the server back online before it causes a widespread system outage. I have documented a simple danger rule example that you may use for this purpose in note 1839303.
There are a few caveats that you need to be aware of in terms of monitoring the availability of the BI server.
- You need to watch out for BI servers that get hung in stopping or starting status, not just those that stop unexpectedly or crash. There is an issue with this functionality in that the Monitoring application cannot differentiate between running state and stopping/starting state. We have tracked this issue in problem report ADAPT01685525. To correct this issue, upgrade your BI landscape to Support Pack 6 or higher.
- By default, the server metrics are cached for 60 seconds. This means that if a server stops shortly after its state was last checked, the watch will not detect the stop until the next time the server metrics are collected, which can be up to 60 seconds later. To detect changes sooner, set the “Metric Refresh Interval” parameter to a smaller value (15 seconds is the lowest possible value). This setting can be found under Central Management Console, Applications, Monitoring.
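To make the effect of the refresh interval concrete, here is a minimal Java sketch of the arithmetic. It assumes a simplified model in which metrics are refreshed at fixed multiples of the interval (0, I, 2I, ...); how the Monitoring service actually schedules its refreshes is not documented here, and the class and method names are hypothetical.

```java
// Simplified model (an assumption, not the Monitoring service's real scheduler):
// metrics are refreshed at t = 0, I, 2I, ... seconds, and a server failure
// becomes visible to a watch only at the first refresh at or after the failure.
final class MetricCacheDelay {

    // Lowest value the article says the "Metric Refresh Interval" accepts.
    static final long MIN_REFRESH_SECONDS = 15;

    // Seconds between a failure and the first refresh that can observe it.
    static long detectionDelaySeconds(long failureAtSeconds, long refreshIntervalSeconds) {
        if (refreshIntervalSeconds < MIN_REFRESH_SECONDS) {
            throw new IllegalArgumentException("interval must be at least 15 seconds");
        }
        if (failureAtSeconds < 0) {
            throw new IllegalArgumentException("failure time must not be negative");
        }
        long sinceLastRefresh = failureAtSeconds % refreshIntervalSeconds;
        return (refreshIntervalSeconds - sinceLastRefresh) % refreshIntervalSeconds;
    }

    public static void main(String[] args) {
        // A server that fails one second after a refresh:
        System.out.println(detectionDelaySeconds(1, 60)); // default interval
        System.out.println(detectionDelaySeconds(1, 15)); // lowest interval
    }
}
```

Under this model, a failure one second after a refresh goes unnoticed for 59 seconds with the default interval and for 14 seconds with the lowest one, at the cost of more frequent metric collection.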
Trending Database Woes
In order to be able to view historical data in the BI Monitoring application and to create trending reports against monitoring data, you must have a working trend database. For performance reasons, it is highly recommended that you migrate your trending data from the default Derby database to the Auditing database. For instructions, refer to note 1741961. Before you make the move, you need to be aware of the following issues with Oracle and SQL Server.
- The documentation for using Oracle as a trending database is not correct in the BI 4.0 SP6 Administrator’s Guide. To use Oracle for your trending database, refer to note 1768678.
- When using SQL Server as your trend database, some special configuration must be done before you enable the Auditing data source; otherwise data will not be successfully written to the details table, MOT_MES_DETAILS. Refer to note 1828472.
In smaller BI landscapes (with only one node), it is fine to use the default Apache Derby database to store trend data. For larger production landscapes (with two or more physical nodes), it is best practice to use the Audit data source because of its performance and data integrity benefits.
Probe Customization with Java
With custom probes, you can extend the functionality of probes to monitor almost any aspect of your BI landscape. Your imagination (and programming skills) is the limit. If you need to develop your own custom probes and would like a tutorial to get started, refer to the SCN whitepaper Developing and deploying a custom Java probe in BI 4.0. This whitepaper also includes example code for a probe that can be used to monitor the availability of the Web Servers in your application server landscape.
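As a rough idea of the kind of check such a probe might perform, the following is a minimal Java sketch of a web server availability test. It is not the whitepaper's code and does not use the BI platform probe SDK; it only uses the standard `java.net.http` client (Java 11 or later). The class name, the rule that any 2xx or 3xx status counts as available, and the default URL are all assumptions for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical stand-alone check; wiring it into a real custom probe
// would follow the whitepaper, which is not reproduced here.
final class WebServerAvailabilityCheck {

    // Assumption: 2xx and 3xx responses mean the web server is up.
    static boolean isAvailableStatus(int statusCode) {
        return statusCode >= 200 && statusCode < 400;
    }

    // Sends one GET request and reports whether the server answered in time.
    static boolean isReachable(URI url, Duration timeout) {
        HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
        HttpRequest request = HttpRequest.newBuilder(url).timeout(timeout).GET().build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return isAvailableStatus(response.statusCode());
        } catch (IOException e) {
            // Connection refused, timeout, DNS failure, and similar errors.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder URL; pass the address of your own web server instead.
        URI url = URI.create(args.length > 0 ? args[0] : "http://localhost:8080/BOE/BI");
        boolean up = isReachable(url, Duration.ofSeconds(10));
        System.out.println(url + " available: " + up);
    }
}
```

Keeping the status classification separate from the network call makes the pass or fail rule easy to adjust, for example if a redirect to a logon page should be treated as a failure in your landscape.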
Stay tuned for more BI Monitoring application updates, tips, tricks and documentation in the next blog in this series coming soon to the SCN near you.
"Prior to and including the Support Pack 6 codeline, using Oracle for the trending database is not functional"
However, since SP4, the official admin guide has kept saying that using the Auditing database as the trending database is possible. In addition, that was one of the "new features" highlighted in the SP4 release notes. Time goes by, and then it turns out that it simply does not work until a future release.
Was this feature ever tested and verified to be working when it was first announced to customers?
Sorry to sound negative, but this is very true of other features and/or functions in BI4.
Thanks for the comment and I understand your frustration. About your question:
"Was this feature ever tested and verified to be working when it was first announced to customers?"
It is possible that when it was tested originally it was found to work ok but a regression broke the feature. The good news is that it will be working soon.
It turns out that the documentation for using Oracle as the trending database is not correct. Refer to the following note, which will guide you through successfully using Oracle as the trending database: https://service.sap.com/sap/support/notes/1768678. No patch is required.
Very useful info -- as usual from Toby. Thank you.
Thanks a lot for the document.
I have a few questions:
1. We are using Oracle 11g as the database and also have the Oracle 11g client installed on the Windows server, but when I go to "<BOEInstallDir>\SAP BusinessObjects Enterprise XI 4.0\dataAccess\connectionServer\jdbc\oracle.sbo" I see an Oracle 10g file. Are we in trouble, or are we good?
2. Could you let me know what types of reports we can create on the current Apache trend database and where I can see all the tables?
Thanks in advance for the help.
Hello S J,
For question #1, make sure you follow note 1768678 and enter the correct syntax for Oracle 11 into the oracle.sbo. You edit this file for both Oracle 10 and 11 but the syntax is slightly different. The difference in syntax is covered in the note I refer to.
For information about the trend database and its tables, you can refer to the administrator's guide. This database is used to create reports based on historical data collected by the monitoring application (metrics and probes).
Great read! Is there a part 2? 😉
Good notes and tips for BO monitoring.
Are there any monitoring changes or enhancements that we should be looking for in BI 4.2?
According to my documentation, there are no significant changes to the BI Monitoring application in 4.2.
Regarding email notifications... Is there a way to pause the watch email notifications during nightly system restarts?
Great information and still relevant. Thank you, Toby.
I was able to get the results I needed by modifying the Throttling Criteria. Setting the alert to fire only when the rule evaluates to TRUE for 15 minutes allows enough time for a graceful system recycle without sending multiple alerts.
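The idea behind this kind of throttling can be sketched in Java as follows. This only illustrates the "rule must stay TRUE for a set duration before alerting" logic; it is not how the Monitoring application implements its Throttling Criteria, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative only: alert once when a rule has been continuously TRUE
// for at least the required duration, and re-arm when it becomes FALSE.
final class SustainedRuleThrottle {
    private final Duration requiredDuration;
    private Instant trueSince;   // null while the rule is FALSE
    private boolean alerted;     // true once an alert was sent for this breach

    SustainedRuleThrottle(Duration requiredDuration) {
        this.requiredDuration = requiredDuration;
    }

    // Returns true exactly when an alert should be sent for this evaluation.
    boolean evaluate(boolean ruleIsTrue, Instant now) {
        if (!ruleIsTrue) {
            trueSince = null;
            alerted = false;
            return false;
        }
        if (trueSince == null) {
            trueSince = now;
        }
        boolean sustained =
                Duration.between(trueSince, now).compareTo(requiredDuration) >= 0;
        if (sustained && !alerted) {
            alerted = true;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SustainedRuleThrottle throttle = new SustainedRuleThrottle(Duration.ofMinutes(15));
        Instant restartBegins = Instant.parse("2020-01-01T02:00:00Z");
        // Server is down for 10 minutes during a nightly restart: no alert.
        System.out.println(throttle.evaluate(true, restartBegins));
        System.out.println(throttle.evaluate(true, restartBegins.plusSeconds(600)));
        System.out.println(throttle.evaluate(false, restartBegins.plusSeconds(660)));
    }
}
```

With a 15-minute requirement, a restart that brings the server back within that window never satisfies the condition, so no alert is sent, while a genuine outage still alerts once after 15 minutes.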
We’ve configured the Monitoring service on our DEV and PROD environments to notify us if there are any memory leaks on our reporting nodes. So far it has been working great. However, we noticed that this service increases system connections (CMC home > Settings > View Global System metrics) as it checks the memory on the servers at regular intervals. Eventually, when the total number of connections reaches 3500+, our system hangs, leading to performance issues. To avoid this, we started restarting the reporting node SIAs one by one in our clustered environment (we have 4 reporting nodes in total, with an SIA running on each of them). Once all the reporting node SIAs have been restarted, system connections come down to 100+. This looks like cyclic behavior: we restart the SIAs whenever we see system connections reaching 3500+ (about every 10 days on average). I would appreciate it if someone could let me know how to avoid this.
About BO env
3 intelligent nodes – Audit/ SIA / Search
4 reporting nodes – 4 SIA/ WPS/ split APS/ 1 AJS (on each node)
2 Tomcat nodes
I tried disabling all the other default watches and using only the one we created for memory leaks (one watch per reporting node), but it did not help.
Thanks in advance!
Any thoughts on this please?