In SDN Businness Intelligence Forums I often find questions about so-called “line item dimensions”, what they are for or how can they improve performance? To give answers to this kind of questions, I decided to write a Blog.
1. Line Item creation
Let’s take a simple customer-sales cube as an example.
My customer sales cube stores information, which customer has bought at which day from which plant which article in which quantity and to which amount. Our cube will look like this for simplicity:
What happens after activation of your cube on the database?
To see this, call transaction LISTSCHEMA for this cube:
What you can see from the listing above, is that 3 additional dimensions to the standard dimensions time, unit and package have been created. Each dimension consists of one characteristic only. Secondly, the key of each dimension table consists of a unique DimensionID and as data fields the SID-tables of the corresponding characteristic.
You can display the definition of the table and get redirected to transaction SE11. Here you see how dimension table 1 is defined:
a.) What does this mean for query and reading the cube?
First the fact table has to be read, secondly a join is done between dimension table and SID-tables of the corresponding characteristics. So every time data is read from the cube, many joins have to be executed to retrieve the data from the underlying master data tables.
For database performance it will be much better if the fact table holds directly the values of the corresponding SID tables. This means, a join from the fact table to the master data SID-tables, no further joins needed.
b.) How do the dimension tables look with lots of data?
Now let us assume, a big retailer with thousands lines of POS-data is loading data into the cube. Let’s pick the material dimension as an example. This table could look like this:
The column Material SID holds the SID values from the /BI0/SMATERIAL table. As you may expected between DimID and SID is an 1:1 relationship.
To improve performance, the idea is now to create so-called line item dimensions.
c.) How do you create line item dimensions?
You define your line item in data warehousing workbench. Edit your cube, go to the dimension, which you want to be a line item. Press right-mouse button and choose “Properties”. A small pop-up window appears:
Here you can checkmark a dimension as line item. But that’s only possible if this dimension holds exactly one characteristic and the cube and aggregates holds no data!
By the way, did you know that you can do this change in one step: Mark all dimensions, right-mouse click and choose properties, the following pop-up appears:
If you want to change the type of dimension if there is already data in your cube, you have to delete all data of your cube or use the remodelling toolbox new in BI 7.0.
What happens if you change all free dimensions of our sample cube to line items on database?
LISTSCHEMA for our cube after saving and activating looks like that:
What happened? The dimension tables are still there! Why?
To examine it, I display dimension table 1 and go to transaction SE11 :
As we can see here, BI created a view for the dimension table with a join to the SID table! So everything is fine, we reduced the number of joins necessary to read data. Data of Fact-table is joined by a view directly to the characteristic SID table.
2. Detection of “possible” line items in existing cubes
Now you may have the “real world” situation where you created your cubes and started to load data for a while. But you can recognize that your queries get slower and slower. One reason for this may that one or more dimensions got degenerated. This means the relation between size of dimension table to fact table is greater than 20%. So it takes the query a very long time to read all data from the dimension table. But how can you detect such degenerated dimensions? The more cumbersome way would be go to SE16 or SE11 and check the table sizes and compare them to each other. But there is an easier and more exact way to determine the sizes: Call transaction RSRV. Then chose “All elementary tests” -> Database -> Database Information about InfoProvider Tables . Drag & Drop this test from the left to the right window and expand it. A pop-up window opens. Enter here the name of your cube and confirm. After that you can execute the test. The result will probably look like this:
You see the size of each dimension and if there is any dimension greater than 20%, examine it. Consider if your dimension may be holds one characteristic with many values and blows it up. But be aware: You have to delete all data in your cube to checkmark a dimension as line item dimension if your cube holds data already.
3. Recommendations for data modelling
But what does this mean for data modelling? When should I use line items? For this we look at aggregates! This seems a bit surprising, but we will see why. For sake of simplicity I create an aggregate with all free dimensions of the cube and we will look how it will be saved on database. The aggregate looks like this:
The technical name of the aggregate is 100017. This may be different on your system. Aggregate technical names are determined from a number range starting with 100000! Every new aggregate gets the next free number. Aggregates are saved views of the cube. So we will find according tables on the database. See LISTSCHEMA for more details:
If we display table of dimension 1, we see:
It’s again a view! And all other dimensions are views too. So what does the system do? If there are less than or equal to 16 characteristics in the aggregate, the dimensions will be saved as line items! These are so-called “flat aggregates”. If there are more than 16 characteristics in the aggregate then the dimension will be created accordingly to the dimensions of the basis cube!
For this, my personal recommendation is to create line items where ever and when ever possible! If you have less than 13 characteristics in your cube, put each characteristic in one dimension and define it as line item. Use in general all free dimensions, don’t save them for any later purpose you don’t know yet.
Ok, that’s it now. I hope you enjoyed reading this blog.