There has been some moaning about the built-in capabilities to monitor BW process chains. Clicking one chain after the other to see the recent status is just too much effort, and transaction RSPCM, which lists the last execution, lacks supervision with regard to timely execution. In fact, for those of you who do not want to use the administrator cockpit, there is no time supervision at all (except for the insider tip of transaction ST13, which however is not contained in the standard but is an add-on by SAP Active Global Support).
However, with release 7.30 BW development has finally caught up. Let me show you!
Let’s start – as always – with data warehousing workbench RSA1:
The first change that catches the eye is a new folder „Process Chains“ in the Modelling View. It seems a small change, but something important comes with it: search.
At last you are able to search for process chains by name directly in the workbench. Plus, you might have noticed that the display components are ordered hierarchically like any other folder in RSA1. And not only the display components, but also the chains themselves have a hierarchical structure, displaying the metachain-subchain relationship. So you can navigate directly to the lowest subchain in your metachain hierarchy even if you only remember the name of the top-level master.
But this is not what we were looking for; we spoke about monitoring. So let’s go to the Administration area of RSA1.
Ok, same tree, but that does not really make a difference. Let’s look at the Monitors.
If RSPCM was used in this system before the upgrade, I will be asked the first question, otherwise the second one. What does it mean? You might know that RSPCM requires you to choose the chains which shall be monitored. So I understand the second question: this will simplify the setup. But what about the first question? Well, RSPCM became user-dependent with 7.30. Each user has his own unique selection of process chains to monitor.
Good thing for most users. But stop – what if I have a central monitoring team where all administrators shall look at the same list? Does each of them need to pick his or her chains himself? No:
Besides the known possibilities to add a single chain or all chains of a display component, there is the option to „Select Include“.
And how do I create an include? I press the „Edit Include“ button.
And then I am on the usual RSPCM list and can add chains. But now, what does the new RSPCM list look like? Here it is:
The first five columns look pretty familiar. But there is something hidden there as well! Let’s click on one of the red chains:
The system is extracting the error messages from the single processes of the failed process chain run and directly displays them together with the corresponding message long text. So no more searching for the red process and navigating and searching the log. One click and you know what’s wrong (hopefully ;-). If not, you still can go to the log view of the process chain by the Log Button on the popup.
And look: the button alongside, called „Repair“, looks interesting. No more error analysis, just press „Repair“ for any failure. THAT was what you were looking for all these years, wasn’t it? Sorry, it is not that simple. The button will repair or repeat any failed process in the chain run, just as you could already do before via the context menu. And whether this is a good thing to try still depends on the particular error. But you should now be a little faster in finding that out and doing it.
What about the other two buttons? Let’s delay this a little and first go back to the overview table. There are some new columns:
First, there is a column to switch the „Time Monitoring“ on and off, because you might not be interested in the run time or start delay of every chain in RSPCM, especially considering that you can still schedule RSPCM in batch to perform an automatic monitoring of your chains and alert you in case of red or yellow lights. Regarding performance, it does not make much of a difference, because the run time calculations of previous chain runs are buffered in an extra table. After all, the performance of RSPCM has also greatly improved thanks to the following setting:
You can choose to refresh only those chains which have a yellow status, or not to refresh the status at all but to display the persisted status from the database. For the red and green runs, it is expected that the status does not change frequently. If you nevertheless expect a change for such chains, you can also force the status refresh in the transaction itself by pressing the „Refresh All“ button in the screenshot above.
Now, how come the system judges a run time of 48 seconds „Too Long“? Did somebody set a maximal run time for each chain? Let us click on the red icon.
Seems like this chain usually runs no longer than about 35 seconds. Considering this, 48 seconds is too long. More specifically, the chain usually takes 25.5 plus or minus 3.9 seconds. The 48 seconds are about 22.5 seconds more than usual, and these 22.5 seconds are much more than the usual deviation of 3.9 seconds.
So, the system uses some simple statistics to judge the run time of a chain run. If you like, I can also give you the mathematical formulas for the statistics, but I guess this is not so interesting for most of the readers 😉
To stabilize these statistics, the system even removes outliers before calculating the average run time and average deviation. If you are interested in the filtered outliers, you can press the „Outlier“ button:
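For the curious, the idea can be sketched in a few lines of Python. This is only an illustration and not necessarily the formula BW uses: the outlier rule (distance from the median), the factor 3.0, the tolerance 2.5 and all function names are my own assumptions, and the run times are made up.

```python
from statistics import mean, median


def remove_outliers(runtimes, factor=3.0):
    """Drop run times far away from the median (assumed rule, not SAP's)."""
    med = median(runtimes)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median([abs(r - med) for r in runtimes])
    if mad == 0:
        return list(runtimes)
    return [r for r in runtimes if abs(r - med) <= factor * mad]


def runtime_stats(runtimes):
    """Average run time and average deviation of the non-outlier runs."""
    kept = remove_outliers(runtimes)
    avg = mean(kept)
    dev = mean([abs(r - avg) for r in kept])
    return avg, dev


def is_too_long(current, runtimes, tolerance=2.5):
    """Flag a run that exceeds the average by several usual deviations."""
    avg, dev = runtime_stats(runtimes)
    return current > avg + tolerance * dev


# Made-up history in seconds; the 120 s run is filtered out as an outlier.
history = [22, 24, 25, 26, 27, 28, 30, 120]
print(runtime_stats(history))
print(is_too_long(48, history), is_too_long(29, history))
```

Without the outlier filter, the single 120-second run would inflate both the average and the deviation so much that a 48-second run would no longer stand out.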
This indeed is an outlier, don’t you think? Let’s get back in time a little bit and look at older runs of this chain by pressing the „+“ button twice, which will add twenty additional past values.
Now our outlier is not the biggest anymore. Even more interesting, the chain seems to have stabilized its run time over the past weeks. Considering the previous values as well, our current run is no longer extraordinarily long. Is it a false alarm then? No, because if the run time turns back to former instabilities, you would like to get notified. If that unstable behaviour becomes usual again, however, the system will automatically stop considering this run time extraordinary and thus stop alerting you, because it always considers the last 30 successful executions for calculating the statistics for the alerts.
So let us check which process made this current run longer. We press the „Display Processes“ button.
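This self-adjusting behaviour follows directly from the sliding window. Here is a minimal Python sketch, again under my own assumptions: the class name, the tolerance of 2.5 deviations and the run times are invented, and outlier removal is left out to keep it short. Only the window of 30 successful runs comes from the system's behaviour described above.

```python
from collections import deque


class RuntimeMonitor:
    """Judges run times against the last 30 successful executions."""

    def __init__(self, window=30, tolerance=2.5):
        # A deque with maxlen silently drops the oldest run when full.
        self.runs = deque(maxlen=window)
        self.tolerance = tolerance  # assumed factor, not taken from BW

    def record(self, runtime):
        """Store the run time of a successful execution."""
        self.runs.append(runtime)

    def is_unusual(self, runtime):
        """True if the run time exceeds average + tolerance * deviation."""
        avg = sum(self.runs) / len(self.runs)
        dev = sum(abs(r - avg) for r in self.runs) / len(self.runs)
        return runtime > avg + self.tolerance * dev


monitor = RuntimeMonitor()
for i in range(30):
    monitor.record(24 if i % 2 == 0 else 26)  # stable phase around 25 s
alert_before = monitor.is_unusual(48)

for i in range(30):
    monitor.record(45 if i % 2 == 0 else 51)  # slow runs become the norm
alert_after = monitor.is_unusual(48)

print(alert_before, alert_after)
```

Once the slow runs have pushed all the fast ones out of the window, 48 seconds is simply the new normal and no alert is raised.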
Of course a DTP. You would not have expected something different, would you? I now could click on the DTP load to open the DTP request log and start analyzing what went wrong, but at this point rather let’s go back to the RSPCM overview list and look at the „Delay“ column.
Today is 14.5.2010, so with an expected start on 27.4.2010 I would indeed judge the chain heavily delayed. But how does the system come to the conclusion that the chain should start on 27.4. (plus or minus six and a half days)? The answer comes when clicking on the LED.
Obviously this chain was executed only five times in the past, and it had been started once a week (on average). Seems like the responsible developer has lost interest in this chain in the meantime. If he continues not to start the chain, the system will stop complaining about the delay and turn the icon grey. Now it is important to note that this chain is started either immediately by hand or by event. If it were scheduled periodically in batch, the delay would not be calculated in this statistical manner; instead, the delay of the TRIGGER job is observed and the alert is raised after 10 minutes of delay.
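The delay estimate can be pictured in the same spirit. The following Python sketch is an assumption on my side, not the system's actual calculation: it takes the average interval between past starts to predict the next one, and the function names, the tolerance factor and the five start dates are invented (chosen so that the expected start lands on 27.4.2010; the deviation differs from the popup).

```python
from datetime import datetime, timedelta


def expected_next_start(starts):
    """Predict the next start from the average gap between past starts."""
    starts = sorted(starts)
    gaps = [(b - a).total_seconds() for a, b in zip(starts, starts[1:])]
    avg = sum(gaps) / len(gaps)
    dev = sum(abs(g - avg) for g in gaps) / len(gaps)
    return starts[-1] + timedelta(seconds=avg), timedelta(seconds=dev)


def is_delayed(now, starts, tolerance=2.0):
    """Chains started by hand or by event: compare against the prediction."""
    expected, dev = expected_next_start(starts)
    return now > expected + tolerance * dev


def is_trigger_late(planned, now):
    """Time-scheduled chains: alert once the trigger job is 10 minutes late."""
    return now - planned > timedelta(minutes=10)


# Five invented starts, roughly once a week.
starts = [
    datetime(2010, 3, 23),
    datetime(2010, 3, 29),
    datetime(2010, 4, 6),
    datetime(2010, 4, 12),
    datetime(2010, 4, 20),
]
expected, deviation = expected_next_start(starts)
print(expected, deviation)
print(is_delayed(datetime(2010, 5, 14), starts))
```

With only five executions the estimate is naturally rough, which is why the tolerance around the expected start is so wide.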
And now also the question about the two buttons in the status popup is answered: They open up the shown run time and start time popups.
I hope you have fun using these new functions and appreciate that it is not necessary to laboriously customize any thresholds in order to get meaningful alerts.
Don’t miss any of the other information on BW 7.30, which you can find at SAP Business Warehouse 7.3.