How to improve the performance of a DTP (BW 7.3)
The data loading scenario you have described matches the behaviour of a DTP feature called ‘Semantic Grouping’.
When most DTPs execute, the volume of data being extracted is small enough that the average BW developer or support person does not notice (or care about) the impact of semantic grouping. DTPs are transported into the BW Test and BW Production systems with very little consideration of the semantic group configuration. In fairness, most of the time it does not matter that this configuration is ignored.
As the volume of data being extracted in a single request increases, so does the importance of the semantic grouping configuration. In your case, the effect is made even more noticeable due to the width of the records (300+ InfoObjects).
There are three (3) independent suggestions for you to consider implementing. Please ensure you test each of them prior to committing the change into any BW Production system.
1. Disable the DTP ‘Semantic Grouping’ feature.
2. Decrease the number of records per DataPacket.
3. Use write-optimised DataStores.
Suggestion 1: Disable the DTP ‘Semantic Grouping’ feature.
The semantic group configuration is located on the ‘Extraction’ tab within a DTP. It is the ‘Semantic Groups’ button just below the ‘Filters’ button. If there is a green tick on the right side of the button then the semantic group feature is turned on.
To turn it off, click the ‘Semantic Groups’ button then ensure all fields are un-ticked, then click the green tick button. Remember to save, check and activate the DTP.
The semantic grouping feature of a DTP dramatically changes the way the database table is read when creating the extracted DataPackets (and IDocs) from the DataProvider. It also changes the efficiency of the parallel processing of DataPackets.
When semantic grouping is turned on (because key fields have been identified), the extractor ensures that records are delivered in groups according to the master data values of the identified semantic keys (not necessarily in sorted order, but definitely grouped).
For example: if the 0DOC_NUM (EBELN) field was ticked as a key field for semantic grouping on the ‘Document Schedule Line (2LIS_02_SCL)’ extractor, then all the records for the same document number would be delivered together in the DataPacket. The DataPacket would continue to fill up with groups of records by document number until it reaches its configured size for ‘Number of records per DataPacket’.
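To make the grouping behaviour concrete, here is a small Python sketch (a conceptual model only; the real grouping happens inside the BW extractor in ABAP, and the document numbers below are made up):

```python
from itertools import groupby

# Hypothetical extracted rows: (document number, schedule line) pairs.
rows = [
    ("4500000001", 10), ("4500000002", 10), ("4500000001", 20),
    ("4500000002", 20), ("4500000003", 10), ("4500000001", 30),
]

def build_packets(rows, packet_size):
    """Sort the full set by the semantic key first (the expensive step),
    then fill DataPackets group by group. In this simplified model a
    group is never split across two packets."""
    prepared = sorted(rows, key=lambda r: r[0])
    packets, current = [], []
    for _, group in groupby(prepared, key=lambda r: r[0]):
        group = list(group)
        if current and len(current) + len(group) > packet_size:
            packets.append(current)
            current = []
        current.extend(group)
    if current:
        packets.append(current)
    return packets

packets = build_packets(rows, packet_size=4)
# All three rows of document 4500000001 land in the same packet.
```

Note that the sketch has to prepare (sort) the entire dataset before the first packet can be emitted; this is the same reason the first DataPacket of a large semantically grouped extraction takes so long to appear.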
It is this ‘grouping’ feature that is causing the long time delay when a DTP is first executed.
A full extract of 14 million records will first make a temporary copy of the data (all 300+ InfoObjects wide), then group the temporary dataset by the key fields identified in the DTP's semantic group configuration. The full dataset must be prepared first to ensure the grouping is 100% correct, hence it will take quite a while before the very first DataPacket is delivered.
The performance degrades further based upon several factors:
* The total number of records to be extracted for this single request;
* The width of the record being extracted;
* The number of key fields identified in the semantic group;
* The database resources required to hold a full temporary copy of the dataset being extracted.
Note: At this point of the extraction, the size of the DataPacket is irrelevant.
If you disable the semantic group feature, please ensure you diligently regression test the data loading process. Ensure you have considered the impact on:
* Extractor enhancements (BAPI and Customer Exit API);
* Error DTPs;
* Custom ABAP in start, field, end and expert routines;
* InfoSources with key fields leveraging run-time DataPacket compression;
* Process chain timing, specifically in regard to other dependencies.
Suggestion 2: Decrease the number of records per DataPacket.
This increases the number of DataPackets to be processed, but it also relieves the memory requirements per DataPacket. This can have a significant impact on how long it takes for a transformation to process each DataPacket, because the application server is no longer thrashing about doing virtual memory page swapping.
There is a balance point/sweet spot within each BW system beyond which running programs demand more memory from the application server than has been physically allocated. When this point is reached, the SAP Kernel and the operating system begin “paging” blocks of memory out to disk, effectively freezing access to those blocks by running programs. When a program tries to access a paged-out block, it pauses while the SAP Kernel/operating system retrieves it, and only then can it continue. This reduces the speed of execution of the ABAP program down to how fast the disk can be accessed. Given that today's CPUs and memory (RAM) run considerably faster than disk, the difference is very noticeable.
When an ABAP program (such as a DTP performing extraction, or a transformation processing a DataPacket) tries to handle enough records to force paging, it will take a lot longer to execute. The same number of records are extracted; it just takes longer.
By decreasing the number of records per DataPacket you reduce the demand on internal ABAP program memory, which in turn (hopefully) lowers the memory requirement below the threshold where heavy virtual memory thrashing occurs. An extreme example (not recommended) would be 10 records per DataPacket: the demand upon internal ABAP memory would be very low, even for a 300+ InfoObject wide record.
Assuming you are using the recommended default DataPacket size of 50,000 records, try changing this to 15,000. I'm suggesting 15,000 because most ETL pathways contain approximately 30 to 80+ InfoObjects, and 50,000 records per DataPacket processes very well on most BW systems. Since you have a much wider record, you can offset the additional memory requirement by reducing the number of records: exchanging a wider record for a shorter DataPacket.
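A rough back-of-envelope check helps when choosing the new packet size. The Python sketch below estimates the in-memory footprint of one DataPacket as records × fields × average bytes per field; the 18-byte average is an illustrative assumption, not a measured SAP figure:

```python
def packet_memory_mb(records_per_packet, infoobjects, avg_bytes_per_field=18):
    """Rough estimate of the internal-table memory one DataPacket needs.
    avg_bytes_per_field is an illustrative assumption, not an SAP figure."""
    return records_per_packet * infoobjects * avg_bytes_per_field / (1024 * 1024)

# A typical 60-InfoObject record at the default 50,000 rows per packet:
typical = packet_memory_mb(50_000, 60)        # ~51 MB
# The same packet size with a 300-InfoObject record is exactly 5x larger:
wide_default = packet_memory_mb(50_000, 300)  # ~257 MB
# Dropping to 15,000 rows brings the wide record back near the typical footprint:
wide_reduced = packet_memory_mb(15_000, 300)  # ~77 MB
```

Whatever byte figure applies on your system, the ratios hold: the packet footprint scales linearly with both record width and row count, so a 5x wider record can be offset by cutting the row count.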
Suggestion 3: Use write-optimised DataStores.
Consider converting any DataStore DataTargets into write-optimised DataStores. This is a viable suggestion because you have clearly stated the dataset is always a snapshot; hence there is little chance of different requests overlapping and benefiting from the delta determination performed during request activation (as a standard DataStore would). Statistically, the number of records loading into the DataTarget will also be the number of records that flow further downstream to other DataTargets, so you may as well remove the effort of a standard DataStore activation that determines delta changes.
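The activation effort being removed can be pictured with the following Python sketch (a simplified conceptual model, not SAP's actual activation logic): a standard DataStore compares each incoming record against the active table to derive before/after delta images, whereas a write-optimised DataStore simply appends.

```python
def activate_standard(active, incoming):
    """Standard DataStore activation (simplified model): compare each
    incoming record against the active table by key and derive the
    before/after delta images that feed downstream delta loads."""
    delta = []
    for key, new_rec in incoming.items():
        old_rec = active.get(key)
        if old_rec is None:
            delta.append(("new", key, new_rec))
        elif old_rec != new_rec:
            delta.append(("before", key, old_rec))
            delta.append(("after", key, new_rec))
        active[key] = new_rec
    return delta

def load_write_optimised(table, incoming):
    """Write-optimised DataStore load (simplified model): append every
    record with no lookup, comparison, or delta determination."""
    table.extend(incoming.items())

# For snapshot loads the comparison work in activate_standard() buys
# nothing, because every request replaces the full picture anyway.
active = {"DOC1": 100}
delta = activate_standard(active, {"DOC1": 120, "DOC2": 50})

wo_table = []
load_write_optimised(wo_table, {"DOC1": 120, "DOC2": 50})
```

The per-record lookup and comparison in the first function is the work you avoid by going write-optimised, which is why activation time drops.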
Please keep in mind that the three suggestions are all completely independent from each other. Implementing just one will make a difference.
Suggestion 1 will give you the greatest performance improvement, specifically when extracting the data from the DataProvider.
Suggestion 2 will mostly improve the performance of DataPacket processing through the Transformations (and InfoSources if you are using them).
Suggestion 3 will improve the committing of data into the DataStore, reducing the activation time.