Improve performance – by designing InfoCube dimensions correctly in #SAP #BW
Introduction
In reality dimensions in an InfoCube are often designed by business terms (like material, customer etc.) This often leads to the impression that InfoCube dimensions should be designed based on business constraints. This although should not be the leading criteria and shouldn’t drive the decision.
Aside from the datavolume which depends on the granularity of the data in the InfoCube, performance is very much depending on how the InfoObject are arranged in the dimensions. Although this has no impact on the size of the fact table it certainly has one on the size of the dimensions.
How is a dimension then designed?
The main goal distributing the InfoObjects in their dimensions must be to keep the dimensions as small as possible. The decision on how many dimension and what InfoObjects go where is purely technical driven. In some cases this matches the organisational view but this would only be a conicidence and not the goal.
There is a few guidelines that should be considered assigning InfoObjects to dimensions:
- Use as many dimensions as necessary but it’s more important to minimize dimension size rather than the number of dimensions.
- Within the dimension only characteristics that have a 1:n relation should be added (e.g. material and product hierarchy)
- Within a dimension there shouldn’t be n:m relations. (e.g. product hierarchy and customer)
- Document level InfoObjects or big characteristics should be designed as Line-Item dimensions. Line item dimensions are not a true dimensions they have a direct link between the fact table and the SID table.
- The most selective characteristics should be at the top of the dimension table
- Don’t mix characteristics with values that change frequently causing large dimension tables. (e.g. material and promotions)
- Consider also to combine unrelated characteristics it can improve performance by reducing the number of table joins. (you only have 13 dimensions so combine the small ones)
As a help the report (SE38) SAP_INFOCUBE_DESIGNS can be used.
This yellow marked dimension should be converted into a line item dimension if it contains a document level characteristic or it is simply bad design.
The maximum number of entries a dimension potentially can have is calculated through the cartesian product of all SID’s. (e.g. 10’000 customer and 1’000 product hierarchies lead to 10’000’000 possible combinations in the dimension table. It’s unlikely that this is going to happen and while designing the dimension this should also be considered – analyzing the possibilities of all customers buying all products in this case.
In cases where there is an m:m relationship it usually means there is a missing entity between those two and therefore they should be stored in different dimensions.
Once data is loaded into the InfoCube a check on the actual number of records loaded into the dimension table vs. the number of record in the fact table should be done. As a rule of thumb the ratio should be between 1:10 and 1:20.
Degenerated Dimensions
If a large dimension table reaches almost the size of the fact table when measured the number of rows in the tables it’s a degenerated dimension. The OLAP processer has to join two big tables which is bad for the query perfromance. Such dimensions can be marked as Line Item Dimensions causing the database not to create an actual dimension table. Checking the table /BIC/F<INFOCUBE> will then show that instead of the DMID dimension key the SID of the degenerated dimension table is placed in the fact table. (Field name RSSID). With this a join of the two tables is eliminated. Those dimensions can only hold one InfoObject as a 1:1 relationship must exist between the SID value and the DIMID.
Dimensions with a lot of unique values can be set to High Cardinality which changes the method of indexing dimensions. (ORA DB only) This results in a switch from a bitmap index to a B-Tree index.
Defining a dimension as Line Item Dimension / High Cardinality
Conclusion
Finding the optimal model and balancing the size and the number of dimensions is a delicate excercise.
Dimensions in MultiProvider do not have to follow the underlying InfoCubes definitions. Those can be focused on the end users need and be structrured by the organizations meaning. This does not affect the performance as the MultiProvider does not have a physically existing datamodel on the database.
Designing the dimension in an InfoCube correctly can have a significant improvement on performance!
Good info and nicely documented. Thanks for sharing Martin.
thanks Raman trying to sort out some of the forgotten basics 🙂
Its good one. Thru by reading also we may recollect as well.Even if you add how to select LID option(at cube level) with screen shot that may makes more effective content.
added.. always appreciate your feedback!
thanks Martin
Good work. Thanks for considering Martin.
HI Martin,
You are just refreshing all our basic knowledge which is always to be remembered in our BI endeavors. Thanks for sharing 🙂
Regards,
Suman
Thanks Suman and a lot of these things are database related which are not always clear even to a expirienced bw developer... 🙂
Martin
Hi Martin,
Cube design in BW is like the universe design in BO.
A bad cube design in BW will always have a negative impression for the client.
These days I am seeing lot of Inhouse team members using SAP_INFOCUBE_DESIGNS to judge how the 3rd part vendors have done the basic cube design for them.
Anyway, thanks a lot for the doc.
BR
Prabhith
Thanks for sharing valuable document.
Regards,
Sushant
really appreciate your support 🙂
Martin
Hi Martin,
Thanks for sharing and refreshing all our basics. 🙂 Nicely presented.
Regards,
Krishna Chaitanya.
Nice Blog for Performance Prospective 🙂
thanks I hope so flad you like it..
Martin came up with performance milestone now 🙂 . Thank you.
Hi Martin,
Thanks for sharing and refreshing the concepts.
Can i request you to also add 1 things" what does it mean ticking both the options i.e. LID and High cardinality etc."
Regards
Kamal
Hi Kamal
Thanks kamal for your input. I'll update the document on LID and High cardinality..
Martin
Nice document martin.... 🙂
Nice Basic information.
regards,
Sm
Helpful Information and documented is good. Thanks Martin for Sharing
Nicely documented, Martin.
Just noticed one typo at line 10. 😛
Regards
Amir
thanks for the typo notice i fixed it..
Useful points. Thank you
thanks
one of the basic modelling point which we have to consider but good to have it 🙂 . Thanks for sharing.
Kind Regards
Manna Das
good glad you like it 🙂
Hi Martin,
One more nice & Useful blog. Thanks for sharing.
BR
Aakash G
thanks for your comments
Hi Martin Grob,
Very useful info....Thanks for sharing... 🙂
Regards,
Rajesh.
thanks Rajesh you're welcome
Good information... Nicely explained.
thanks
Hi Martin,
Nice blog ,As it will help to improve modeling skills.
BR
Praveen Kumar
I hope it does help 🙂
thanks
Excellent Document !!
Thanks Martin.. Keep it up !!
Hi Martin,
thanks for sharing the InfoCube performance concepts in a compact document with the key basics.
Regards,
Michael
Often neclected 🙂
Nice blog Martin . Thanks for sharing.
you're welcome
Nice document....just a small doubt from your statement....
"
"
I think it should be conside to combine related characteristics in a same dimension which would reduce table joins and that should improve performance..
Thanks for your feedback.
The way it was meant when you consolidate small dimensions you should even consider to combine the ones that are unrelated. As we talk about small dimensions the combination of those will not create a big dimension table even they are unrelated. This frees up a dimension you can make better use..
Martin
Back to basics!!
Good Document Martin,
cheers
Raj
maybe basics but often also not understood what the impact is
thanks
Nice blog Martine. Very well explained.
Many Thanks for sharing.
Regards,
Sangita.
you're welcome glad you like it
Nice document Martine. It’ll be very helpful for the beginner & interview point of view.Thanks.
I'd like to clarify my doubt , what about slowly changing dimension ? How can we decide & define while modeling?
Regards,
B Devi
Nice and detailed blog on cube design, explaining all factors crisply. Thank you.
Cube design is an art. But like Arun Varadarajan once said in his blog, this art becomes meaningless with BWA or HANA. You can have a poorly designed cube and still escape from scrutiny as your BWA will give superb query performance.
(the comments from Klaus Nagel as well)
http://scn.sap.com/people/arun.varadarajan/blog/2010/12/10/twinkle-twinkle-little-star-do-i-really-need-you-at-all
Hi
Yes potentially you are right. That blog has been posted almost 3 years ago on my birthday 🙂 . How many of you have now HANA? Designing a cube's dimension correctly might be a dying art but not as fast as one hopes for..
Well said Martin, as I was once careful in designing dimensions with line item and high cardinality and using report SAP_INFOCUBE_DESIGNS. This will be a lantern to non-hana bw systems. Thumbs up for this document. 🙂
After running BW 7.4 on HANA with my cubes in-memory optimized and I didn't even care if I put all my characteristics in single dimension 😛 (tired of creating multiple dimensions as well 😉 )
Hi Martin,
I am not sure about HANA, but I would assume that many would have atleast implemented BWA. BWA was revolutionary for query performance. Those who used BW reporting seriously would have implemented BWA, don't you think?
I don't think that depends on if you are serious with BW or not. That solely depends on the amount of data you have to processes if a BWA was/is necessary vs. a solid datamodel.
Nice.. thanks for sharing the information.
Dear Martin,
Really good and informative..
Making BW consultants life easy 🙂
Regards
Kiran N
HI,
Martin it's really good stuff for bw/bi people
Hi Martin,
Usefull info.Thanks for explaining.
But they dont let me decide on dimensions here..:P
Hi Martin,
thanks again for sharing this knowledge.
I have two questions about the topic:
1. After remodelling the Dimensions of a example Cube (add new Line Item Dimensions and add characteristics with many entries, e.g. 0MATERIAL, 0DOC_NUMBER) I executed the Report SAP_INFOCUBE_DESIGNS again. As a result the Report does not shown the newly added dimensions of the Cube. Is there any automism to show the current dimensions each Cube in the Report?
2. I have tried to move a InfoObject (e.g. 0DOC_NUMBER) from a existing Cube dimension to a absolutely new added Line-Item Dimension. I have noticed that some of existing InfoObjects are marked with a blue font color and the reordering in another Cube-Dimension is not possible. What is the background of this marking?
Many thanks in advance!
Regards,
Michael
Hi Michael
1) The report won't show you the line item dimensions as they are not a dimension anymore but rather attached to the fact table. Thats why this only works if you only have one characteristic in a dimension
2) Once you have loaded data into your cube and the characteristics are in use the dimensions/characteristics are locked. (blue) you can only change them if you delete the content of the cube (including dimension tables)
Martin
Hi Martin,
Nice Document..thanks for sharing 🙂 .
Though these are Basics...sometimes we forget to apply them.Thanks for the the Refresh !!
In all...Good Article Martin....
rgds
SVU123
Hi Martin,
You really hit the nail on the head.with this block. You are blog as usual very informative and gives very practical tips.
Many thanks for sharing wonderful blog.
Regards,
Anand Kumar
Hi Martin ,
A very useful document .Thanks for sharing .
Regards,
Sanyukta
Great, gave good insight. Thanks
Fantastic piece of doc. Thanks for sharing
Hi Martin,
Thanks a lot for your valuable information. But still one question left in my mind.....
This Report SAP_INFOCUBE_DESIGNS will show the count of overall dimension.In this case, How do we find which characterstics is having too many entries out of multiple characterstics added in single DIMENSION (e.g 0CUST_SALES, 0SOLD_TO, 0SHIP_TO, 0BILLTOPRTY, 0PAYER) ?
In this sceario, which characterstic should be marked has Line item dimension ?
Please suggest me if I am missing some thing here.
Regards,
Suresh V
Hi
You can't define which object it is unless you understand what's behind the data. Make sure you don't mix combinations that cause a big number of possibilities.
Martin
Excellent blog 🙂
Good stuff Martin - Thanks for sharing!! 🙂 🙂
thanks for the comment glad you like it..
Good one.
I was really looking for this kind of consolidated blog where I can read all the stuffs .Thanks for sharing
Very Informative and helpful in terms of performance tuning. Appreciate and Thanks for sharing.. Hope to see such more from your side..!
Regards
YN
I mostly check for most of the design level suggestions you made. However I
did not understand how "The most selective characteristics should be at
the top of the dimension table" is going to impact?
I never check for this, may be intuitively
place prominent the info object at the top.
Can you explain this..
Your blog turned me nostalgic and reminded me my initial development days.
Regards,
A Prashant Kiran
well explained the basics, master
This is very conceptual document .Nice to read 🙂 .
thanks..
Hi Martin,
Its nice .. thanks for sharing information.
I tried to realign one of the cubes in our system, and found that there are almost 7-8 characteristcs are qualifying for Line Item Dimensions. But not able to make all above characters as LIds, as my cube crossing allowed number of dimensions.
Pl. suggest how to handle this scenario.
Thanks
Hi,
Really you've provided a good information. Thanks for sharing.
Does cubes in BW 7.3(also bw on hana) has line item dimension concepts? Because, I was just searching the properties by right click on dimension and I couldn't find line item dimension option there.
Would help if you can give some clarity here.
Thanks in advance.
Regards,
Antony Jerald.
Very helpful blog.
Thanks a lot