How to Optimize Data Loads to Write-Optimized DSO
Write-optimized DSOs were introduced in SAP BI 7.0 and are generally used in the staging layer of an Enterprise Data Warehouse because data loads to them are fast. Unlike a standard DSO, which has three tables, a write-optimized DSO has only one: the active table. Data loaded to a write-optimized DSO therefore goes straight to the active table, which saves the activation time. These DSOs save further time by skipping the SID generation step.
However, write-optimized DSOs have one shortcoming. During data loads, all data packages are processed serially rather than in parallel, even if parallel processing is defined in the batch manager settings of the DTP. This results in very long load times when loading a large number of records (typical in full dump-and-reload scenarios).
The goal of this paper is to demonstrate how to enable parallel processing of data packages while loading to write-optimized DSOs, thereby reducing load time.
<< UPDATE >> This is applicable to SAP BI 7.0 only. In SAP BI 7.3 packages process in parallel by default.
Step-by-Step Solution
Parallel processing of data packages while loading to a write-optimized DSO can be enabled by defining the semantic key in the Semantic Groups of the DTP.
Open the DTP of the write-optimized DSO and, on the Extraction tab, click the Semantic Groups button.
In the pop-up screen, select the fields that form the semantic key of the DSO.
Make sure that parallel processing is enabled by going to the menu Goto > Settings for Batch Manager and defining the number of processes for parallel processing.
If you now run this DTP, you will notice that the data packages are processed in parallel and that the data load time improves significantly. Note that the improvement is most noticeable in loads involving large data sets.
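The idea behind the steps above can be illustrated outside SAP with a short Python sketch. It is not SAP code and does not reproduce the DTP's internal logic. It only shows, under the assumption that a semantic group keeps all records with the same key in the same package, why such packages can be written by parallel workers without two workers touching the same key. The field names (doc_no, item), the helper names, and the hash-based package assignment are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_packages(records, key_fields, num_packages):
    """Group records so that equal semantic keys always share one package.

    Hash-based assignment is an illustrative assumption, not SAP's algorithm.
    """
    packages = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in key_fields)
        packages[hash(key) % num_packages].append(record)
    return list(packages.values())


def load_package(package):
    """Stand-in for writing one package to the active table."""
    return len(package)


def load_in_parallel(records, key_fields, num_processes):
    """Build key-consistent packages and load them with parallel workers."""
    packages = build_packages(records, key_fields, num_processes)
    with ThreadPoolExecutor(max_workers=num_processes) as pool:
        written = list(pool.map(load_package, packages))
    return packages, sum(written)


# Hypothetical sample data; (doc_no, item) plays the role of the semantic key.
records = [
    {"doc_no": "1000", "item": "10", "amount": 5},
    {"doc_no": "1000", "item": "10", "amount": 7},
    {"doc_no": "1000", "item": "20", "amount": 3},
    {"doc_no": "2000", "item": "10", "amount": 9},
]

packages, total = load_in_parallel(records, ["doc_no", "item"], num_processes=3)
print(f"{total} records written in {len(packages)} package(s)")
```

In this sketch, num_processes corresponds loosely to the number of processes set in the batch manager settings, and key_fields to the fields selected in the Semantic Groups pop-up.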
Load Time Comparison
The first screenshot below shows that it took around 17 hours to load about 23.5 million records into a write-optimized DSO. During this load, the semantic key was NOT defined in the DTP.
The next screenshot shows that it took just a little over one and a half hours to load the same number of records into the write-optimized DSO (roughly 11 times faster!). The only difference was that this time the semantic key was defined in the DTP.
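The comparison above can be checked with a few lines of arithmetic. The sketch below uses the rounded figures quoted in the text (about 23.5 million records, about 17 hours, about 1.5 hours), so the results are approximations rather than measured values.

```python
# Rounded figures taken from the load time comparison in the text.
records = 23_500_000
hours_without_key = 17.0   # semantic key NOT defined in the DTP
hours_with_key = 1.5       # semantic key defined in the DTP

speedup = hours_without_key / hours_with_key
throughput_without_key = records / hours_without_key   # records per hour
throughput_with_key = records / hours_with_key         # records per hour

print(f"Speedup: about {speedup:.1f}x")
print(f"Without semantic key: about {throughput_without_key / 1e6:.2f} million records/hour")
print(f"With semantic key:    about {throughput_with_key / 1e6:.2f} million records/hour")
```

With these inputs the speedup comes out at about 11.3, which matches the "11 times faster" figure quoted above.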
1. Write Optimized DSO
2. SAP Note 1007769: Parallel updating in write-optimized DSO