One of the quotes I like most I learned from Tom Kyte from Oracle.
It’s one from Artemus Ward:
“It ain’t so much the things we don’t know that get us into trouble.
It’s the things you know that just ain’t so.”
This blog post is about one of those cases.
A customer complained about the performance of the index creation process in one of his process chains.
I had a look at the processes messages and found this:
--- > 6 hours !!!!????
28.03.2011 12:47:30 SQL-END: 28.03.2011 12:47:30 06:36:35
There was one index that actually took 6 hours and 36 minutes to create, although the table scan was done in parallel on Oracle level.
The other indexes on the same table only took some minutes (except one other that took something around 30 minutes).
So what could be the reason for that?
The next thing I looked up was the number of distinct values for the different dimension KEY columns (since these are the only ones getting indexes on fact tables):
OBJECT DIST NUM_ROWS (LF) BLK
And this was already the hint required to solve this mystery.
Bitmap indexes are extremely efficient during lookup and when used to combine selection criteria.
What they don’t do very good is to handle large numbers of distinct values (always compared to the total number of rows in the table of course).
In this case the dimension for which the index creation took six hours (column KEY_SDIMENS1) had more than half as much distinct values than the total number of rows in the table.
If you’re used to data modelling in BW you know that it’s recommended not to have that many distinct values in the dimensions and if it cannot be avoided then the dimension should be marked as line item dimension.
This can be done in the Infocube modelling in transaction RSA1 by right-clicking on the dimension and opening the properties dialogue.
Flagging a dimension as line item dimension changes the data modell by that it removes the intermediate SID table that is normally put between fact tables and the actual dimension tables (for flexibility reasons) and joins the fact table directly with the dimension table.
It changes this :
[DIMENSION-TAB1] >-----< (SID-TAB1) >---< [[FACT TABLE]] >---< (SID-TAB2) >-----< [DIMENSION-TAB2]
[DIMENSION-TAB1] >-----< [[FACT TABLE]] >---< (SID-TAB2) >-----< [DIMENSION-TAB2]
Besides this change in the data modell the index for the line item dimension should not be a BITMAP index anymore, but a B*TREE index instead. Of course the SAP BW developers know about this and provided a way to avoid the creation of BITMAP indexes on such dimension key columns.
The fallacy the customer and I became victims of was to believe that flagging a dimension as line item automatically includes the setting for the B*Tree index. After all this totally makes sense, doesn’t it?
Unfortunately, but correctly, the BW developers separated these two functions and provided two flags for infocube dimensions that can be set independently:
[ x ] Line Item
[ x ] High Cardinality
Only setting the High Cardinality flag leads to the creation of B*Tree indexes!
Fortunately, this can be done any time, even when the infocube is already in use and filled with data.
This is of course documented (here) but as often, things mix up in the memory and we end up believing the wrong stuff.
Before I forget to mention it: as far as I know the High Cardinality flag really only does something on Oracle based BW systems as the Bitmap Indexes are exclusively available on Oracle.
Maybe there are similar features on other platforms, but I’m not aware of them, so I leave them out for now (and avoid believing the wrong stuff once more…).
So, make sure to check your line item dimensions to avoid long indexing runs.