Understanding Caching in Web Intelligence
As the new intern on GSC Canada’s Web Intelligence team, I was initially overwhelmed by the amount of information and content that I had to consume. Thankfully the content was digital; otherwise I would have been swimming in a pool of books, papers and articles. However, I was still drowning, until two angels descended from the cloud and saved me with technical training sessions, where the knowledge transfer was efficient, speedy and long-lasting.
Ted Ueda and Jonathan Brown, despite having human qualities of being BI experts, were my angels. As they spoke with one another about what to share with me, they also considered how to share the same training with others. So I decided to document these training sessions and share them with the rest of the world’s Web Intelligence lovers.
On this page, I have documented the training session on the process of caching. We will take a look at the caching server, the location of cached folders, the content within them and troubleshooting tips.
The Basics
Caching report pages improves performance by eliminating the need to access the database every time a report is requested. Once a user requests a document, the document gets cached, so when a second user requests the same document there is no need to re-parse the WID file. We will look at the elements of the caching server and at what each cached component is.
Chapter 1 – The Cache Server
History of Cache Server
Consider a case where 5 users request a report and there are 16 processing servers. It is likely that each request gets processed by a different Web Intelligence Processing Server (WIPS), so 5 requests to view a document would use 5 different WIPS. Essentially, the WID file gets parsed, opened up and cached five times. “A very non-optimal behavior,” as Ted would describe it. To optimize the behavior, a shared disk or a shared cache is used between all WIPS. This allows each WIPS to take advantage of the information cached by every other WIPS. This setup is only relevant to XI 3.1, and we no longer recommend it in the newer versions of BI products.
There are issues with sharing the cache in BI 4.0 because the architecture is more complex. Because we now make use of 64-bit memory, there is no longer a memory limitation that would make us resort to a shared disk for caching purposes. We have also further optimized the I/O for WIPS in BI 4.0, and access to a shared disk to pull out cached information may slow that I/O down.
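To make the difference concrete, here is a minimal Python sketch of the scenario above. It is only a toy model: the function name, the round-robin routing and the document name are assumptions made for illustration, not how the real WIPS load balancing works.

```python
def count_parses(num_requests, num_servers, shared_cache):
    """Count how often the same document has to be parsed.

    Assumption: requests are routed round-robin, so with more servers
    than requests every request lands on a different server.
    """
    document = "report.wid"  # hypothetical document name
    shared = set()  # one cache visible to every server
    local = [set() for _ in range(num_servers)]  # one cache per server
    parses = 0
    for i in range(num_requests):
        server = i % num_servers
        cache = shared if shared_cache else local[server]
        if document not in cache:
            # Cache miss: this server has to parse the document itself.
            parses += 1
            cache.add(document)
    return parses


print(count_parses(5, 16, shared_cache=False))
print(count_parses(5, 16, shared_cache=True))
```

With separate caches, each of the 5 requests causes its own parse; with a shared cache, only the first request does.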
Keep in Mind
The Web Intelligence Processing Server (WIPS) essentially includes a caching server within itself, which is called CT, common thread, or a CT plugin. If you see errors associated with the CT plugin, then you know that it is the caching that comes into play.
Keep in mind that bugs also exist. For example, the shared cache disk does not get cleaned up, so the cache disk fills up and all the WIPS crash.
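If you suspect the cache disk is filling up, one simple check is to look at how full the volume holding the cache directory is. The sketch below uses Python's standard shutil.disk_usage. The function names and the 90% threshold are assumptions for illustration and are not part of any SAP tooling.

```python
import shutil


def cache_disk_usage_percent(cache_dir):
    """Return how full the volume holding cache_dir is, in percent."""
    usage = shutil.disk_usage(cache_dir)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def is_cache_disk_nearly_full(cache_dir, threshold_percent=90.0):
    """True when usage is at or above the (hypothetical) threshold."""
    return cache_disk_usage_percent(cache_dir) >= threshold_percent
```

Calling is_cache_disk_nearly_full with the path of your cache directory returns True once the volume reaches the threshold.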
Chapter 2 – Cached Data
What is in the cache directory?
<Install-Directory>\SAP BusinessObjects\SAP BusinessObjects Enterprise XI 4.0\Data\BIPW08R2_6400\BIPW08R2.WebIntelligenceProcessingServer
You will find a folder called sessions. Every time you open a Web Intelligence document, you will see that a session is created and within this folder, you will find further useful material.
In the session folder, you will find the BRep and Universes folders and certain temporary (tmp) files.
- The Cube file is the file which contains all the raw data. It contains the data provider information.
- The Document file is basically the template, the XML that forms the WebI document. The positioning of the objects and formatting of the report are examples of material inside this file.
- The Drill file contains information regarding the drilling of the data such as drilling levels.
- Repeng stands for Report Engine, which handles the calculations and what is displayed on the report.
If you refresh the report and monitor the files, you will notice it’s the Cube files that always get updated with a new time stamp. This indicates that we have new data.
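One way to do this monitoring is to take a snapshot of the file modification times before and after the refresh and compare the two. The Python sketch below is an illustration only. The function names are made up, and you would pass in the path of your own session folder.

```python
import os


def snapshot_mtimes(session_dir):
    """Map each file under session_dir to its modification time (ns)."""
    mtimes = {}
    for root, _dirs, files in os.walk(session_dir):
        for name in files:
            path = os.path.join(root, name)
            relative = os.path.relpath(path, session_dir)
            mtimes[relative] = os.stat(path).st_mtime_ns
    return mtimes


def changed_files(before, after):
    """Return files that are new or whose time stamp changed."""
    return sorted(
        path for path, mtime in after.items() if before.get(path) != mtime
    )
```

Take one snapshot, refresh the report, take a second snapshot and pass both to changed_files to see which files got a new time stamp.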
Drilling Through – BLOBs
If you drill further into the BRep, you will find BLOB files.
Most kinds of InfoObjects that represent documents, such as Web Intelligence InfoObjects, have associated data. This associated data is stored as files, ‘BLOBs’, in the FRS or in our case in the cache registry.
The BLOB files that we see are of three types:
Chapter 3 – Failover
What is failover?
“One WIPS runs multiple WebI documents and it runs into trouble. One of the documents happens to be bad and the WIPS is going to go down.” That was the scenario Ted gave. So what happens in this scenario?
In BI 4.0 we have something called the failover process. It is a process which allows documents to be moved over to a different WIPS. When a WIPS is processing a document, some client is asking it to do so. It could be the SDK, BI Launch Pad or something similar. So when the WebI Processing Server goes down, the client should notice an error: the WIPS is no longer responding because it just went down.
Another WIPS on the same machine is able to take on the same tasks, because it has access to the shared disk that has the cached data. This happens automatically: the data is simply transferred from WIPS1 to WIPS2. Fantastic!
What if all WIPS on the same machine are unavailable?
If WIPS1 is going down and WIPS2 is unavailable, then the state of the WebI document needs to move to another machine. A service on the Adaptive Processing Server, called the Document Recovery Service, takes care of this scenario. The Document Recovery Service is responsible for taking documents from WIPS1 and saving them on another WIPS on another machine, which we will call WIPS3.
But let’s take a step back. There are a couple of things happening before we transfer the data onto another WIPS.
- The first WIPS has to indicate that it’s crashed so we can start transferring the data over to the second WIPS on the same machine.
- The Document Recovery Service must pick up the failure of the first two WebI Processing Servers so it can transfer the data to the next machine.
The failover failed
If the first two WIPS fail, or the Document Recovery Service fails to pick up the document, or any of the steps in the transfer of data from one WIPS to another fails, then we see an error message that says ‘Failover failed’.
The user may report that they see a specific error all the time, and the log files show a ‘failover failed’ error. If this is the case, then the investigation must focus on why the transfer of data from one WIPS to another WIPS failed.
Failover failed basically stands for “we were not able to recover your job”.
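The decision flow described in this chapter can be summarized in a small Python sketch. It is a simplified model under stated assumptions: the Wips class, the fail_over function and the recovery_service_up flag are invented for illustration and do not correspond to any real BI platform API.

```python
from dataclasses import dataclass


@dataclass
class Wips:
    name: str
    machine: str
    available: bool = True


def fail_over(failed, servers, recovery_service_up=True):
    """Return the name of the WIPS that takes over the documents."""
    # Step 1: another WIPS on the same machine can take over, because it
    # has access to the same disk that holds the cached data.
    for server in servers:
        if (server is not failed and server.machine == failed.machine
                and server.available):
            return server.name
    # Step 2: otherwise the Document Recovery Service has to move the
    # documents to a WIPS on another machine.
    if recovery_service_up:
        for server in servers:
            if server.machine != failed.machine and server.available:
                return server.name
    # No step succeeded: we were not able to recover the job.
    raise RuntimeError("Failover failed")


wips1 = Wips("WIPS1", "node1", available=False)
wips2 = Wips("WIPS2", "node1")
wips3 = Wips("WIPS3", "node2")
print(fail_over(wips1, [wips1, wips2, wips3]))
```

In this model, the error is raised only when there is no WIPS left on the same machine and the recovery step cannot place the documents on another machine either.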
An End-2-End trace would be helpful to see which WIPS crashed, and then to dig further to find the reason behind the crash.
Useful Resources
KB 1861180 is a knowledge base article that demonstrates how to conduct End-2-End tracing.
Web Intelligence Caching in BusinessObjects Enterprise XI Release 2
Thanks for the wonderful knowledge. Keep sharing!
Regards,
Abhi
Can you please elaborate in more detail on "Another WIPS on the same machine is able to take on the same tasks, because it has access to the shared disk that has the cached data. It's very automatic where data is simply transferred from WIPS1 to WIPS2", given that you also indicate that "We have further optimized the new I/O for WIPS in BI 4.0 but access to a shared disk to pull out cached information may slow down the I/O"? Regards, Jin-Chong
Hi Jin-Chong,
To reply to your 1st question.
In a clustered environment, only the WIPS on the same machine will share the same disk space. So, say we have an environment with 2 clustered BOE nodes; let's call them node 1 and node 2.
On node 1 there are WIPS1 and WIPS2. On node 2 there is WIPS3.
WIPS1 and WIPS2 will store their cache on node 1, whereas WIPS3 will store its cache on node 2. WIPS1 and WIPS2 won't be able to access the cache made by WIPS3, and WIPS3 won't be able to access the cache folder made by WIPS1 and WIPS2.
As for your 2nd question, it's very relevant! As far as I know, and I just asked a coworker who has worked on the WebI server longer than I have, there is no in-memory cache on top of the on-disk cache. The only in-memory cache there is is not shared across users and is cleaned the moment you close your document. There is not just one in-memory cache but a multitude of them; for example, there is one for the search functionality, in order to increase the search speed in a document.
I guess the author mixed up cache and memory thresholds (the WebI server has a memory management system to avoid consuming memory above a defined limit).
Regards
Thank you so much for your clarification, Frederic.
Regards,
Jin-Chong
Hi JinChong,
To answer your question, I may refer you to a blog written by our own Ted Ueda, explaining the output cache directory difference in 3.1 and 4.x. Have a look at the content under:
XI 3.1 - configure Output Cache Directory for all WebIPS on a common network share.
BI 4.1 - configure Output Cache Directory for WebIPS on local disk and do not use network share.
Revisit the Sizing for your deployment of BI 4.x Web Intelligence Processing Servers!
Let me know if you have further questions!
Thanks,
Amid
Very Nicely explained.. thanks Amid