In the first blog of this series I covered the leading monitoring types. I now want to go into what I like to refer to as the pillars of monitoring: how we take those monitoring types and employ them in a usable and measurable interface that makes sense for the BI Administrator. The four pillars are metrics, probes, watches, and alerts. By covering all four pillars I believe we can offer the most complete monitoring solution ever attempted in a BI suite. In this blog I’m going to focus on metrics: how we interface with them and what information we can learn from them.
In the first blog I mentioned the monitoring types (Instrumented, Synthetic, and Network). To implement metrics we have to use instrumentation. Instrumentation lets us put statements around the functions in the code that we want to track. Tracking these key functions tells us how long the enterprise takes to perform very specific operations.
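To make the idea of "putting statements around a function" concrete, here is a minimal sketch in Python. It is not how Wily Introscope or the BI 4.0 instrumentation is implemented; the names `METRICS`, `instrumented`, `logon`, and the metric name `user_logon` are all hypothetical and exist only for illustration.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory metric store: metric name -> list of durations (seconds).
METRICS = defaultdict(list)


def instrumented(metric_name):
    """Wrap a function so every call records its elapsed time under metric_name."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even if the wrapped function raises.
                METRICS[metric_name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@instrumented("user_logon")  # hypothetical metric name
def logon(user):
    time.sleep(0.01)  # stand-in for the real work of logging a user on
    return f"session-for-{user}"


def average(metric_name):
    """Average recorded duration for a metric, or 0.0 if nothing was recorded."""
    samples = METRICS[metric_name]
    return sum(samples) / len(samples) if samples else 0.0
```

Calling `logon("alice")` a few times and then `average("user_logon")` gives the mean logon time. The point of the sketch is only that the timing statements sit around the function, so the function itself does not change.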
The application used to capture these metrics via the instrumentation is Wily Introscope by CA. We developed custom modules for Wily Introscope so that the specific BI 4.0 metrics can be presented. Wily is provided by SAP to all customers with an existing maintenance contract. If you do have maintenance, please log on to the SAP Support Portal and take a look at the technology components in the software download center.
Wily Introscope provides three features for interfacing with these metrics: dashboards, an investigator, and a transaction tracer. Your role and the information you want to obtain will determine which interface you use. Let’s assume you simply want to review the performance of your environment and monitor its overall health. In this case you would most likely use the dashboards. We developed more than fifteen dashboards covering the most important metrics, giving you a high-level overview of the health of your system. These dashboards are broken up into a few areas, as described below.
The web application server dashboards display the metrics captured for the specific applications you have running within your application server. These dashboards give you metrics such as user logon times, user request times, and service request times, among others. You can also separate your application servers and applications so you can view them individually. In this way you can check whether your Web Intelligence users are experiencing worse performance than your Crystal Reports or Analysis users. Because the application server is the main point of entry for all users in the landscape, this is the perfect spot to see what actual users are experiencing when interacting with your system.
The BI Platform dashboards display metrics from the backend servers, which are the servers that actually process your documents, reports, and dashboards. If you are curious about the real load on any particular BI server, you can review it within these dashboards. You will be able to see how many documents the servers have open, how long queries take to return from the database, CORBA response times, and stall counters. These dashboards are perfect for finding out whether you are getting proper load distribution across your servers, or why reports are taking longer than usual to process and return to users.
Next are the Explorer and Data Services dashboards. With Explorer we can display facet data, chart, infospace, execution, and CVOM response times. This set of dashboards shows the overall health of your Explorer servers, which can provide a first line of defense if something begins to go awry. Data Services is one of our newest instrumented areas and provides a wealth of information that was not previously available. This set of dashboards captures such things as CPU specifics, memory details, and request response times. Your Data Services jobs now have a full set of metrics accompanying them, giving you complete insight.
The investigator differs from the dashboards in that the dashboards are statically developed and present only a subset of the metrics sent by the BI environment to Wily. While these are some of the most important metrics for determining the overall health of the system, there is sometimes a need for more detail in order to identify what is going on in a specific area. If you see an issue in one of the dashboards, you can go into the investigator to drill into it further. The investigator is the collection of all the metrics produced by the SAP BI 4.0 landscape; the dashboards show just a subset of what is available there.
If you want to drill into specific details about the CMS like ‘How long are my queries taking to return from the repository?’ you can do exactly that. If you want to look into your input/output FRS to determine how long it takes to read or write the files to disk, you can find that here as well. If you want to drill into the specific servlets used by your application server to pull out specific information for that particular servlet, then yes you have that at your fingertips. It is important to know as well that the metrics collected here are not only the application metrics but also the operating system metrics. This provides you with all the metrics you need to fully understand what precisely is happening on the servers.
The transaction tracer is an awesome tool used specifically for root cause analysis. If you are experiencing delays but are unsure exactly where the issue resides, it is time to crack open the tracer. In the past with BI you would run a BOTrace or a -trace on a server to capture logs, then review them to determine precisely what was going on. This isn’t the best scenario, because these traces are written to disk and log all information for a specific server. You can easily end up looking through hundreds or even thousands of lines to get to the specific information you need. For most issues where the code has been instrumented, this is a thing of the past. You can filter on just about anything so that you review only the specific information you want. None of this is in a log format, and you have several different views for displaying the information.
One of the best ways to explain the tracer is with a specific example. In this example a user has come to me and mentioned that the ‘Quarterly Trends’ report is not running in the standard 3 minutes and is taking almost 10 minutes to return. To determine the root cause I set up the tracer to filter on the user’s name. The tracer will then capture every function call that results from that user’s actions. I will see when the user logged on, how long that took, the request for the document, and how long it takes to return the document to the user. I can see how long each function call takes, and the calls are displayed in a hierarchical format that tells me which one was called by which. Using this I can tell exactly where the performance hit takes place. If it is in the fetch rows call, I know that the database is taking a long time to return the data to my server.
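The reasoning in this example, walking a hierarchy of timed calls to find where the time actually goes, can be sketched in a few lines of Python. This is only an illustration of the idea, not the tracer’s data model or API. The `Call` class, the call names (`userRequest`, `openDocument`, `fetchRows`, and so on), and the timings are all made up to mirror the ten-minute ‘Quarterly Trends’ scenario.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Call:
    """One traced function call: its name, total time, and the calls it made."""
    name: str
    total_ms: float
    children: List["Call"] = field(default_factory=list)

    def self_ms(self) -> float:
        # Time spent in this call itself, excluding time spent in its children.
        return self.total_ms - sum(c.total_ms for c in self.children)


def slowest_path(root: Call) -> List[Tuple[str, float]]:
    """Follow the most expensive child at each level, from root to leaf."""
    path = [(root.name, root.total_ms)]
    node = root
    while node.children:
        node = max(node.children, key=lambda c: c.total_ms)
        path.append((node.name, node.total_ms))
    return path


def hotspot(root: Call) -> Call:
    """Return the call with the largest self time anywhere in the tree."""
    best = root
    stack = [root]
    while stack:
        node = stack.pop()
        if node.self_ms() > best.self_ms():
            best = node
        stack.extend(node.children)
    return best


# Hypothetical trace for one user request (all names and timings are invented).
trace = Call("userRequest", 600_000, [
    Call("logon", 2_000),
    Call("openDocument", 590_000, [
        Call("parseRequest", 1_000),
        Call("fetchRows", 560_000),
        Call("renderReport", 20_000),
    ]),
])

for name, ms in slowest_path(trace):
    print(f"{name}: {ms / 1000:.0f}s")
print("hotspot:", hotspot(trace).name)
```

With these invented numbers the slowest path runs from `userRequest` through `openDocument` to `fetchRows`, and `fetchRows` is also the hotspot. That matches the conclusion in the example: the database is where the time is going.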
No more combing through log files looking for the entries that are relevant to me. Now the information is directly in front of me and it is obvious where my issue resides. The only question is what I will do with all the time I save by using this approach.
In this blog I provided an overview of the metrics available through Wily Introscope. Solution Manager comes with Wily Introscope, so if you have a Solution Manager implementation then you also have access to Wily. If not, you can set up Wily Introscope to receive these metrics without Solution Manager. The only drawback is that Solution Manager provides numerous other root cause analysis tools that can greatly benefit the system as well. I can cover that in more detail later, but if nothing else at least get your feet wet with Wily. You can always implement Solution Manager later and connect it to your Wily implementation.
There are some base metrics provided in the deployment of SAP BI 4.0, but they are a bit limited in scope. I will cover those, as well as probes, in my next blog in this series. If you learned something from this post, let me know and rate the post. If so, I can go into other areas in more detail and possibly provide some real examples of how these tools were used to solve real issues or to load test real systems. Check out my next blog in this series, where I will discuss probes.