End-to-End Trace and Identifying Root Cause for BI 4.x Web Intelligence
I support Web Intelligence (WebI) for a living.
I talk to WebI and it talks to me. Mind you, it doesn’t talk to just anyone – especially when it’s encountering issues. You have to know how to talk to WebI, for it to give you the good stuff – like what’s really the problem.
At times, the error message WebI gives you isn’t all that informative. For example, just today the WebI document I was working on started to throw a “Failed to connect to the OLAP source. (Error: INF)” error:
WebI doesn’t tell you why it failed to the OLAP source, just that it couldn’t.
There’s other error messages, just as unhelpful to the end user. For example, there’s a common error where the error message states “csEX”. Just that, nothing else. Or the dreaded “WIS 30270” error. You search the SAP Notes/KBase for “WIS 30270” and you’ll get hundreds of hits, each describing a different root cause. So which Note/KBase is the right one, the one that describes your issue?
To determine the right one, you need to find the root cause error, the actual error that was thrown that led to the more-generic error message being shown the User.
What’s tricky about this is that, with BI 4.x, the root cause error likely didn’t originate within the Web Intelligence Processing Server. With the BI 4.x architecture, for improved scalability and manageability, components of the WebI document processing are done on different processes. For example, chart generation is done by the Visualization Service housed in the Adaptive Processing Server (APS). SAP BW BEx query data connections are done by the DSL Bridge Service also housed in the APS. Multi-source UNX Universe data connections utilize the Data Federator Service.
So let’s follow an example workflow. You open a WebI document reporting off of SAP BW BEx query in BI launchpad. You click the refresh button. The web browser sends this event to the web application server, the AnalyticalReporting web application handles the event, forwarding the refresh event to the Web Intelligence Processing Server.
The Web Intelligence Processing Server then determines the WebI Session associated with the open WebI document, sees that it’s a refresh request for BEx query data, so sends a ‘processDPCommandsEx’ remote procedure method call to a DSL Bridge Service in an APS. The DSL Bridge makes the call to the BEx query, marshals the results and sends it back to the WebI Processing Server, that composits the results in a report page, that it sends back to AnalyticalReporting that translates the XML internal format to HTML and sends it onwards to the web browser. Phew.
But in my example WebI document above, somewhere the process broke down, and an error was thrown. But where?
End-to-End Tracing Saved my Life
That’s a pretty facetious statement. But to be frank, the End-to-End trace tool, every day, saves me so much time and effort and makes me look pretty smart with customers. Before there was an End-to-End trace tool, trying to identify root cause within hundreds of MB’s of trace log files was always a daunting task. Much of the contents of the trace log was usually irrelevant to the issue I was investigating, and much of the time spent was manually filtering out the irrelevant.
What the End-to-End trace tool does is automatically filter out, from the trace files, all such irrelevant information. It allows me to trace all the action taken to service a request for a WebI Session, from web browser to WebI PS to APS and back all the way to the web browser, without being bothered by any activity done on the servers not part of the workflow.
Most Web Intelligence admins are familiar with this tool – likely when you contact Support, it’s the first thing we ask for. SAP KBase 1861180 is the document that describes how to download and use the End-to-End trace when working with SAP BusinessObjects BI Platform. I highly recommend you bookmark that KBase and keep it handy.
In fact, open up KBase 1861180. What I’m going to describe in this blog is how I use End-to-End traces to analyze the “Failed to connect to the OLAP source. (Error: INF)” error that I got above.
To effectively use this tool, you need to know a bit of how the Web Intelligence processing workflow operates.
Preparing the System for End-to-End Tracing
The first thing you have to do is go on every machine where BI 4.x is deployed and modify the BO_Trace.ini file. You’ll note that 1861180 says to set append to false and keep_num to 20, except for WebI workflows. For WebI, set keep_num to 50. I asked Toby, who wrote that KBase, to put that comment in just for WebI. WebI tends to be very chatty when it comes to trace log output, and I found that for more involved issues, 20 files just weren’t enough.
So make sure you edit the BO_Trace.ini appropriately:
Launching the End-to-End Tracing Tool
From 1861180, download the SAP Client Plug-In tool appropriate for the version of Internet Explorer you have installed on your client machine. Unfortunately, currently the tool only supports Internet Explorer, and no other brand of web browser. What the SAP Client Plug-in does is it acts as a proxy for Internet Explorer, and on all outgoing communication, injects a HTTP Header with an unique ID value. The presence of this ID value is detected and identified by BI launchpad. BI launchpad will then, for every action it takes to service the request associated with the ID value, increase the Trace level. Furthermore, it sends this unique ID to any requests it makes to backend servers. All BusinessObjects servers, including CMS, WebI Processing Servers, Adaptive Processing Servers, the File Repository Servers, etc, will also detect the presence of this ID and raise the Trace level accordingly.
So say you have the BI Platform system configured for default (Error) logging. Any request originating from Internet Explorer launched by the SAP Client Plug-In tool will automatically raise its Trace level to HIGH for any Server servicing this request.
It’s a self-extracting zip that will unzip contents and automatically launch the Plug-In:
Make sure you’ve closed all other instances of Internet Explorer on your client desktop, the click the “Launch” button. This will start up Internet Explorer, and also enable to “Next Step TraceLevel” combobox and the “Start Transaction” button. Don’t click the button just yet!
The thing to know is that it’ll only inject the HTTP Header with the unique ID only after you click “Start Transaction”. Before then, it won’t set the trace log level. So what you can do is, before starting transaction, do all the preliminary stuff to a point just before the action you know will trigger the error. For my example, I log onto BI launchpad and open the problematic WebI document:
In the above screenshot, I’m at a point just before I click the WebI Refresh button, that I know will trigger the error. I set the TraceLevel to “High” and click “Start Transaction”, then I refresh the document.
Important Note: with the WebI Web Viewer, I can wait to enable transactions till just before I take action in the viewer that triggers the error. However, if I want to End-to-End trace workflows that involve the WebI Applet Editor/Viewer (Rich Internet Client), then I have to “Start Transaction” before the Applet is launched. That’s just the way Java applets work. The Applet reads the web browser proxy information only at startup, and not any time later. So you have to have the transaction started before that point. You also need to go in Control Panel -> Java -> General -> Network Settings… and ensure the Applet is configured to use browser network proxy settings.
Once you “Start Transaction”, you need to keep an eye on the SAP Client Plug-In, to make sure it’s actually recording:
If the “Sent Bytes” and “Received Bytes” aren’t increasing as you work with the WebI Web Viewer, then the information isn’t being recorded in the traces.
You can ignore that “Settings are not valid” dialog box.
Follow instructions in KBase 1861180 and go onto all your server machines and collect the updated tracelog *.glf files. Copy them over to your client machine for analysis.
Also, if you look in the SAPClientPlugin folder, you’ll see a log folder within which is a timestamped MyBusinessTransaction folder, that contains a BusinessTransaction.xml file. Save this file as well:
Open the BusinessTransaction.xml file using a text editor, such as Notepad++. What you need from this file is a copy of the unique ID that the Plug-In has been injecting into the HTTP Header, identified by the “id” attribute:
Copy this id value, because you’ll be using it next.
Working with the GLFViewer
KBase 1861180 has a link to the download for the GLFViewer. The GLFViewer is that tool we use at SAP to read and analyze *.glf files. Download the GLFViewer and start it up.
Go to File -> Open -> Add Files… and select all the *.glf files that you’ve collected. Don’t click “OK” just yet! If you do, you’ll get the whole contents of all the trace files, and that’s not what you want. You want to filter out any trace file entry that’s not associated with the session you had with the SAP Client Plug-In.
So in the Open Files dialog box, enable the “Filter and only read matching entries:”, then select Column “DSRRootContextID”, then enter the unique ID value that you’ve copied from the BusinessTransaction.xml file:
and now click “OK”.
The default configuration of GLFViewer is to not show enough columns. Go to View -> Choose Columns… and select all columns:
Another thing I like to do is select the “ActionID” column and move it up to be second in display order.
I’m highlighting ActionID since it’s an important value. What’s an ActionID?
In the SAP BusinessObjects BI Platform 4.x system, each and every User action that leads ot a tangible change – clicking the Refresh button, opening a tree view, clicking Export, clicking the navigation button to go to the next page – generates an unique ID that all reactions resulting from the action is associated.
This unique ID is separate and different from the DSRRootContextID. For example, when I clicked the WebI refresh button, that generated an unique ActionID. This ActionID is then sent from the web application to the backend Web Intelligence Processing Server. Any request that the WebI Processing Server makes to any other Service – the CMS, the FRS, the APS – also is associated with this ActionID. All responses sent back to the web app is also associated with this ActionID.
The ActionID is the way in which we can follow the workflow associated with the User event End-to-End, from the initial request sent from the web browser, and the final response sent to the web browser.
So let’s say a User request results in an error response. To trace back the error to the root cause for that error, you would trace back, in the trace files, following the trail of the ActionID back to the first error that was thrown.
Let’s do this for my example.
Web Intelligence Processing Server **Error
So let’s go to the point in the workflow where we get the error message shown to the end user.
Whenever the Web Intelligence Processing Server sends an user-facing error message back to the client, it does so in XML format. In the WebI Processing Server traces, this XML message is logged prefixed by the string “**Error”.
So to find all error messages shown to the end user that passed through the Web Intelligence Processing Server, search the “Text” column for lines that contain “**Error”:
It’s easier if you scroll down to the very last entry in the GLFViewer, and search backwards:
Now that you have this error, read the”ActionID” associated with this log entry by going to View -> Filter:
What you’ve done is filtered out everything not associated with the workflow resulting in the “Failed to connect…” error.
Root Cause Error Identification
You now want to walk backwards from the **Error line thrown by the WebI Processing Server, to the actual root error.
It’s easier to see this if you enable text indent, by going ot View -> Indent Text According to Scope. When you enable this, you’ll see that the Text column becomes very colorful:
What the color lines define is scope. Scope defines the depth of calls that are made internally to service a request. For example, the WebI Processing Server invokes a request to the DSL Bridge Service. This request is logged by a “START OUTGOING CALL” line in the WebI Processing Server trace file, that indicates that a call was made to the APS. The APS trace file will log this request by a “START INCOMING CALL” line identifying the requester as a WebI Processing Server. The call scope has gone one deep. When the APS returns a response to the WebI Processing Servier, it logs “END INCOMING CALL”, and the WebI Processing Server logs “END OUTGOING CALL”.
So from the “**Error” line identified earlier, follow the depth of call upwards until one encounters the first line in the trace with Trace column equal to Error. You can see in the above screenshot that the root cause error actually originated in the APS, in the DSL Bridge Service. The actual roor error is:
“Password logon no longer possible – too many failed attempts on <host>”
Someone had tried to log onto the SAP BW system using my account, but with the wrong password.
That someone was me, of course. I had an old WebI document recurring schedule that was running, and I forgot to stop it before updating my password on my BW system. Instant lock-up!
Stopping the schedule, logging onto the BW under another admin account, and unlocking the User resolved this issue.
So you can see how End-to-End traces came in handy here.
You can ask yourself, “Well gee, why didn’t WebI just tell the end User that the Password was invalid?”
That touches on one of the security considerations that went into the design of BI Platform. If the system provides too much information concerning a logon failure, then a nefarious intruder can use that knowledge to try and hack into the system. And that’s not what anyone wants.
Even though this design makes troubleshooting a bit difficult, the availability of the End-to-End tool greatly mitigates the pain in getting to the root cause.