

In various SAP BW projects we may come across production issues caused by long-running InfoPackages and DTPs in a process chain, thereby impacting SLAs, especially in the case of full loads. The main reason is the growing number of data postings in the ECC or BW system from which the extractor reads.

To improve the runtime of a long-running InfoPackage or DTP load in a significant manner, this document explains and shows how to implement a new process structure that can reduce runtime considerably.


1. Steps to follow prior to imposing the new structure

2. Insights into the new proposed process structure

3. SAP-recommended process settings for performance improvement of loading

4. Benefits and results, with an example showing the significance of the new structure

1. Steps to follow prior to imposing the new structure:

Before developing the new concept, we need to identify the important loads that take a long time (e.g., more than 2/5/10 hours) and the number of data records that are getting extracted from ECC/BW.

This helps us calculate throughput, i.e., the number of data records extracted/transferred per minute.
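As a small sketch (the record count and runtime below are invented figures, not taken from any real load), the throughput calculation is simply records divided by minutes:

```python
# Hypothetical figures for illustration: a full load extracting
# 12,000,000 records in 300 minutes.
records_extracted = 12_000_000
runtime_minutes = 300

# throughput = records transferred per minute
throughput = records_extracted / runtime_minutes
print(f"Throughput: {throughput:,.0f} records/minute")  # → Throughput: 40,000 records/minute
```

Tracking this number across runs makes it easy to see whether a load is slowing down because data volume grew or because per-record processing got slower.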

Here is an example of how the analysis can be done based on runtime history:

We consider a process chain Z_0CO_OM_CCA9_DATA loading cost center allocation transactional data.

a) Find the runtime history of this process by going to transaction /ssa/bwt and checking the runtime history of this process chain ID.

[Image: PC Runtimes.png]

This indicates that, as days progress, the process chain takes longer and longer to finish.

b) The next step is to check which loading process in the PC is causing the gradually longer run.

Here, for instance, we can see the following:

[Image: loading step.png]

The InfoPackage loading step takes a considerable amount of time to finish, and its runtime increases gradually.

c) This InfoPackage load has the following processing type:

[Image: Processing mode.png]

With the above three steps, we have identified the PCs that take longer and the cause of the delay.

2. Insights into the new proposed process structure


The proposed model splits a single InfoPackage into 4 InfoPackages, where the extraction is based on non-overlapping ranges set on one of the key fields of the extractor.


If we consider a transactional data load for cost center allocations, we can split the extraction process by filtering on a key InfoObject, Cost Element.

So for 4 ranges, the Cost Element field selection could look like this:

R1 InfoPackage: 00000 – 400000

R2 InfoPackage: 400001 – 700000

R3 InfoPackage: 700001 – 900000

R4 InfoPackage: 900001 – 999999

The best way to define the ranges is to take the total number of records extracted for the characteristic in previous runs and divide it by 4, so that each InfoPackage selection covers roughly a quarter of the data during the parallel run of the packages. The ranges can therefore be set based on the previous runs of the InfoPackages and the total number of records extracted and updated.
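A minimal sketch of this range derivation, assuming we have per-value record counts from a previous extraction (the Cost Element values and counts below are invented for illustration):

```python
# Derive four non-overlapping Cost Element ranges whose record volumes
# are roughly equal, based on per-value counts from a previous run.
counts = [("0000100000", 500_000), ("0000300000", 1_200_000),
          ("0000500000", 900_000), ("0000700000", 700_000),
          ("0000850000", 400_000), ("0000999999", 300_000)]

total = sum(c for _, c in counts)
target = total / 4  # aim: ~one quarter of the records per InfoPackage

ranges = []
low = counts[0][0]
running = 0
cut = 1
for key, cnt in counts:
    running += cnt
    # close a range once the cumulative count passes the next quarter mark
    if cut < 4 and running >= target * cut:
        ranges.append((low, key))
        # next range starts just above this key, so ranges never overlap
        low = str(int(key) + 1).zfill(len(key))
        cut += 1
ranges.append((low, counts[-1][0]))

for r1, r2 in ranges:
    print(f"{r1} - {r2}")
```

In practice the counts would come from a query on the PSA or the source table; the point is only that the quarter boundaries fall on actual key values, which is what keeps the four selections non-overlapping and balanced.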

This approach assumes that the processing mode of each of the 4 InfoPackages is set to load only up to the PSA, whereas it was previously set to "PSA and target in parallel".

So the processing mode of the new InfoPackages is as follows:

[Image: PSA 1.png]

Parallel processing of the 4 InfoPackages up to the PSA is especially suitable for system environments that focus on very fast extraction, where the subsequent posting of data into the data targets is secondary.

PSA update process variants then read the data from the PSA and update it to the target on completion of each InfoPackage. To avoid locks during the update, condition operators are introduced that wait for each update process to finish before the next update starts. This avoids locking conflicts while updating the target.

So, for 4 parallel InfoPackages, 4 PSA update variants are used to subsequently update the target.

The final structure looks like this:

[Image: PC Parallel.png]
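The shape of this chain can be sketched as follows. This is illustrative pseudologic, not SAP code: the function names, ranges, and strings are invented, and the real chain uses InfoPackages, PSA update variants, and condition operators rather than Python threads. It only shows the two phases: four extractions running in parallel up to the PSA, then the PSA-to-target updates running strictly one after another.

```python
from concurrent.futures import ThreadPoolExecutor

# invented, non-overlapping Cost Element ranges for the 4 InfoPackages
RANGES = [("000000", "400000"), ("400001", "700000"),
          ("700001", "900000"), ("900001", "999999")]

def extract_to_psa(rng):
    # stand-in for an InfoPackage with processing mode "only PSA"
    return f"PSA request for {rng[0]}-{rng[1]}"

def update_target(psa_request):
    # stand-in for a "read PSA and update data target" process variant
    return f"updated target from {psa_request}"

# phase 1: all four InfoPackages extract in parallel
with ThreadPoolExecutor(max_workers=4) as pool:
    psa_requests = list(pool.map(extract_to_psa, RANGES))

# phase 2: the condition operators serialize the updates, which in this
# sketch is just a plain sequential loop -> only one "update" at a time
results = [update_target(r) for r in psa_requests]
for line in results:
    print(line)
```

The design choice the sketch captures is that parallelism is spent where it is cheap (extraction, which does not lock the target), while the lock-sensitive target updates stay serialized.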


The PSA update process variant has an option to set the number of parallel work processes.

3. SAP-recommended process settings for performance improvement of loading

To smoothen the data load process, we can make use of the parallel processing settings in each of the 4 PSA update variants.

As per SAP, we can increase the number of parallel jobs from the default of 3 to 20.

In parallel processing, additional work processes are split off from the main work process to handle the BW processes. This distributes the work packages and thus improves the overall runtime when updating data from the PSA into the target.
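A back-of-the-envelope sketch of why the jump from 3 to 20 parallel jobs matters (the package count and per-package cost below are assumed numbers, purely for illustration): if a PSA request is split into data packages of roughly equal cost, wall time scales with the number of "waves" needed to get through them.

```python
import math

packages = 60            # data packages in the PSA request (assumed)
minutes_per_package = 2  # processing cost per package (assumed)

def wall_time(processes):
    # waves of work: each wave runs `processes` packages concurrently
    return math.ceil(packages / processes) * minutes_per_package

print(wall_time(3))   # default 3 jobs:  20 waves -> 40 minutes
print(wall_time(20))  # raised to 20:     3 waves ->  6 minutes
```

Real gains are smaller than this ideal model suggests, since work processes, dialog resources, and database contention impose their own limits, which is why the Basis checks below matter.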


To test the runtime of the proposed PC, it is advisable to check with the Basis team how many parallel modes are used for this batch processing. If the Basis team finds that the PC is using only a few parallel modes and all those parallel jobs are triggered on a single background server, there is a way to significantly improve the overall runtime.

It is recommended to create, with the help of Basis, a new server group for background processing that includes all servers with batch processes configured:

[Image: group server.png]

With this change, more jobs can be created at a time for parallel batch processing.

After Basis makes this change, a small adjustment is needed in the PSA update variant to point it to the new server group.


This increases the number of parallel jobs in background batch processing and thereby brings a significant improvement to the loading process.

4. Benefits and results, with an example showing the significance of the new structure

[Image: Runtime Final.png]

The result shown above is an improvement of more than 40 percent compared with the runtime before the change was proposed, and the approach can be tried and tested in different scenarios.

The basic structure can remain the same as shown above, but the filter selection, process variant settings, and range division will depend on the type of load (transactional or master data attribute) and on previous snapshots of the number of records inserted for a particular key field, which helps to split the ranges proposed for the parallel InfoPackage selections.

The same approach can be applied to a long-running DTP, i.e., splitting a single DTP into 3 or 4, depending on the scenario, based on non-overlapping filter selections, and setting up the parallel processing accordingly.
