When we execute an InfoPackage (IP) and load data, the data goes to the database. Several reads and writes go back and forth every time. How fast our system can read and write is the main concern, so to improve reads and writes we need to concentrate on performance tuning.
Performance tuning can be categorized into two main parts, namely:
1) Query/report performance
2) Data load performance
1) Performance aspects with respect to Query/report:
a) Sound data model: When we build a cube, the dimensions should be well structured. As a rule of thumb, each dimension table should be no more than 10-15% of the fact table's size.
Line item dimensioning is part of this (details in point h).
b) Aggregates: An aggregate lets us keep a summarized subset of the actual data in a smaller cube.
To build an aggregate: go to RSA1 -> select the cube -> Maintain Aggregates -> click 'Create by yourself' -> drag and drop the required characteristics from left to right -> give the aggregate a name and description.
An aggregate is a smaller cube based on the bigger cube. We can fix a characteristic of a dimension to a certain value (say, plant = 1300), which takes less time to read. We can also leave out characteristics of the dimensions that we don't need.
When a query runs against the loaded data, it looks into the aggregates first to see whether one can satisfy the result. Since the aggregate is a smaller cube, it takes less time to read the data.
If a cube has an aggregate, each new request must be rolled up into it. If we don't do the roll-up, the aggregate will not contain the new data.
To evaluate how effectively the aggregates are built and used, go to tcode ST03.
If we don't need an aggregate we should delete it, because every aggregate adds overhead to the data load (the roll-up). We should build aggregates that can serve many queries, not just a single query.
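The aggregate idea above can be sketched as a Python toy model (not SAP code; the table contents, function names, and the plant example are illustrative assumptions):

```python
# Toy model of how an aggregate answers a query (illustrative only,
# not SAP internals; table contents are made up).
fact_table = [
    {"plant": "1300", "material": "M1", "qty": 10},
    {"plant": "1300", "material": "M2", "qty": 5},
    {"plant": "1400", "material": "M1", "qty": 7},
]

def build_aggregate(fact, group_by):
    """Pre-summarize the fact table over a subset of characteristics."""
    agg = {}
    for row in fact:
        key = tuple(row[c] for c in group_by)
        agg[key] = agg.get(key, 0) + row["qty"]
    return agg

# The aggregate keeps only 'plant' -- far fewer rows than the fact table.
agg_by_plant = build_aggregate(fact_table, ["plant"])

def query_total_qty(plant, aggregate):
    # The OLAP processor checks the aggregate first: one small lookup
    # instead of scanning every fact row.
    return aggregate.get((plant,), 0)

print(query_total_qty("1300", agg_by_plant))  # 15
```

The roll-up step in the text corresponds to re-running `build_aggregate` whenever new requests arrive; without it the aggregate would miss the new rows.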
c) Compression: When we compress a cube, another table is created, called the E-table (compressed requests stay here). The disadvantage of the E-table is that it loses each and every request ID (RID).
The F-table contains our request information. It is optimized for loading and deleting: we can delete any particular RID when we need to, and data is loaded request by request.
Benefit of the E-table/compression: the E-table holds the compressed results, and individual RIDs cannot exist in it. When we compress, the requests move from the F-table to the E-table and we lose the RIDs. As a result, the OLAP processor no longer has to read request by request; it goes to one place, the E-table, which improves performance.
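The F-table-to-E-table move can be sketched as a toy Python model (illustrative assumptions only, not SAP internals):

```python
# Toy model of cube compression (illustrative, not SAP internals).
# The F-table keeps rows per request (RID); compression moves them to
# the E-table, merging rows with the same characteristics and
# dropping the RID.
f_table = [
    {"rid": 1, "plant": "1300", "qty": 10},
    {"rid": 2, "plant": "1300", "qty": 4},
    {"rid": 3, "plant": "1400", "qty": 6},
]

def compress(f_rows):
    e_table = {}
    for row in f_rows:
        key = row["plant"]          # RID deliberately dropped
        e_table[key] = e_table.get(key, 0) + row["qty"]
    return e_table

e_table = compress(f_table)
# Fewer rows to read, but request-level deletion is no longer
# possible once the RIDs are gone.
print(e_table)  # {'1300': 14, '1400': 6}
```

This also shows the trade-off from the text: the E-table is faster to read, but you can no longer delete by request.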
d) Indexes: For faster retrieval of data from the RDBMS, indexes are used. An index basically maps key values to the locations of the matching rows.
Without indexes the system has to do a full scan of the database table. On cubes we often drop and rebuild indexes around data loads, which becomes expensive when the data volume is continuously growing. We drop the indexes before new loads so the load doesn't have to maintain them row by row, and we recreate them afterwards so that data retrieval is much faster.
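The full-scan-versus-index difference can be sketched conceptually in Python (a toy model, not real database code; the table and column names are made up):

```python
# Sketch of why an index beats a full table scan (conceptual model).
rows = [{"id": i, "plant": f"P{i % 100}", "qty": i} for i in range(10_000)]

def full_scan(rows, plant):
    # No index: every row must be touched.
    return [r for r in rows if r["plant"] == plant]

def build_index(rows, column):
    # An index maps a column value to the positions of matching rows.
    idx = {}
    for pos, r in enumerate(rows):
        idx.setdefault(r[column], []).append(pos)
    return idx

plant_idx = build_index(rows, "plant")

def indexed_read(rows, idx, plant):
    # With the index, only the matching positions are read.
    return [rows[pos] for pos in idx.get(plant, [])]

assert full_scan(rows, "P7") == indexed_read(rows, plant_idx, "P7")
# Dropping the index before a big load avoids maintaining it for every
# inserted row; rebuilding it afterwards restores fast reads.
```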
e) Physical partitioning (database partitioning):
It happens on the fact table. The fact table is already logically partitioned by RIDs (requests), but physical partitioning is done in the underlying database, and some databases do not allow it. In BI, physical partitioning happens in the underlying database by fiscal year/period or calendar year/month.
Let's say the cube is partitioned by fiscal year/period and a query selects a particular fiscal year/period; the database then goes only to that partition of the table and retrieves the data.
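Partition pruning can be sketched as a small Python model (conceptual only; the 0FISCPER-style partition keys and row contents are made-up examples):

```python
# Sketch of partition pruning by fiscal year/period (conceptual).
partitions = {
    "2023001": [{"fiscper": "2023001", "qty": 10}],
    "2023002": [{"fiscper": "2023002", "qty": 4},
                {"fiscper": "2023002", "qty": 6}],
}

def read_period(parts, fiscper):
    # Only the matching partition is scanned; all others are skipped.
    return sum(r["qty"] for r in parts.get(fiscper, []))

print(read_period(partitions, "2023002"))  # 10
```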
f) Logical Partitioning:
When a query accesses data and there is no logical partitioning, it has to read the whole object to retrieve the data (in this case, the DSO in the picture above). With logical partitioning, the data is split across several smaller cubes, and based on the query's selection parameters it goes only to the relevant cube for the result, which makes performance faster.
Logical partitioning is a good candidate for improved query performance. We can also set up a parallel pull: while pulling records, instead of one sequential process it can use, say, three processes at the same time, which improves performance.
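The routing and parallel-read ideas can be sketched together in Python (a conceptual model of a MultiProvider over year-partitioned cubes; cube names and contents are illustrative assumptions, not SAP code):

```python
# Sketch of logical partitioning: one small cube per year combined
# under a MultiProvider-like router (conceptual, not SAP code).
from concurrent.futures import ThreadPoolExecutor

cube_2022 = [{"year": 2022, "qty": 5}]
cube_2023 = [{"year": 2023, "qty": 8}, {"year": 2023, "qty": 2}]
multiprovider = {2022: cube_2022, 2023: cube_2023}

def query(mp, year):
    # The router reads only the part-cube matching the selection.
    return sum(r["qty"] for r in mp.get(year, []))

def query_all(mp):
    # With no year restriction, the part-cubes can be read in
    # parallel instead of one after another.
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(lambda rows: sum(r["qty"] for r in rows),
                            mp.values()))

print(query(multiprovider, 2023))   # 10
print(query_all(multiprovider))     # 15
```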
h) Line Item Dimensioning:
It is part of a good, sound data model. We use it when a dimension is getting too big, i.e., closer in size to the fact table. When a dimension is larger than about 10% of the fact table we can consider a line item dimension, and it helps performance: no dimension table is created, and the SID is stored directly in the fact table, which removes one join. We can compare dimension versus fact table sizes in tcode RSRV or through program SAP_INFOCUBE_DESIGNS.
To flag a line item dimension: go to tcode RSA1 -> go to the InfoArea -> select the InfoCube -> select the dimension -> go to change mode -> tick 'Line Item Dimension'.
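The 10% rule of thumb is easy to check with a rough calculation (the kind of ratio RSRV / SAP_INFOCUBE_DESIGNS report; the dimension names and row counts below are made up):

```python
# Rough check of the dimension-vs-fact ratio rule of thumb
# (illustrative row counts, not real statistics).
fact_rows = 1_000_000
dim_rows = {"DIM_CUSTOMER": 50_000, "DIM_DOCUMENT": 900_000}

def line_item_candidates(dims, fact, threshold=0.10):
    # A dimension close in size to the fact table is a candidate for
    # a line item dimension.
    return [name for name, n in dims.items() if n / fact > threshold]

print(line_item_candidates(dim_rows, fact_rows))  # ['DIM_DOCUMENT']
```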
2) Performance aspects with respect to Data Load:
a) Data Packet sizing
It controls the inbound and outbound flow of data. Data arrives in packets, and each packet is picked up by a dialog process (think of these as messengers): data packet 1 is picked up by messenger 1, and so on. Say we have 5 messengers and 20 data packets coming in: 5 packets are processed while the other 15 wait. But if each messenger (dialog process) can take 2 data packets at a time, then 10 packets are in flight at once, which is faster.
The number of dialog processes depends on system resources; we do not control them directly. In SM50 or SM66, the DIA entries are the dialog processes and the BGD entries are the background processes. Dialog work arrives as individual requests (tickets).
Data packet sizing shouldn't be too small or too big: too-small packets add per-packet overhead, while too-big packets overload individual processes and memory.
b) Write-optimized DSO:
It has only the active table, with no change log or activation queue. SID values are not generated during the load, and records with the same key are not aggregated. Data is available for further processing immediately after it is loaded (SIDs are generated at query runtime, since they are not stored here).
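The key-handling difference between a standard and a write-optimized DSO can be sketched as a toy Python model (conceptual only; the field names and technical-key layout are illustrative assumptions, not SAP internals):

```python
# Conceptual sketch of write-optimized vs standard DSO key handling.
loads = [
    {"doc": "D1", "amount": 100},
    {"doc": "D1", "amount": 120},   # same semantic key as the first row
]

# Standard DSO: on activation, records with the same key overwrite
# each other -- only one row per key survives.
standard_active = {}
for rec in loads:
    standard_active[rec["doc"]] = rec

# Write-optimized DSO: only an active table, keyed by a technical key
# (request / data packet / record number), so same-key records are
# NOT aggregated and the data is usable immediately after loading.
wo_active = {}
for rec_no, rec in enumerate(loads):
    wo_active[("REQ1", 1, rec_no)] = rec

print(len(standard_active), len(wo_active))  # 1 2
```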