Optimizing ABAP Program Performance with Parallel Processing
Some custom ABAP programs deal with business problems that take a long time to return results. Usually this is caused by complex algorithms and business logic that have to process a large amount of data from multiple sources. Some of these programs are very good candidates for a significant performance improvement through parallel processing.
The following characteristics make a sequential program a good candidate for conversion to a parallel processing program:
- The program is already optimized but still runs too long
- Its logic is implemented as a sequence of calls to subroutines; i.e., ABAP Forms, Function Modules and/or Class Methods
- It contains two or more time-consuming subroutine calls, one after another, that do not rely on each other's results and do not use global variables
Let us consider the program shown in the following diagram:
Its sequential version is shown on the left side of the diagram. The subroutines A, B, C and D do not rely on each other's results and do not use global variables. They could be moved into separate programs A, B, C and D, shown on the right side of the diagram. The parallel version of the program schedules programs A, B, C and D for immediate execution and waits in its synchronization routine for their completion. Once they have completed, it brings their results back and continues with its own execution.
Let us consider a sample pseudo-ABAP program that, for simplicity, uses only ABAP Forms, as shown in the following table:
The sample program executes 8 subroutines, and its total runtime is 135 minutes. A closer look at the program structure shows that subroutines 3 to 6 are independent of each other: they do not rely on each other's results and could be executed in any order. Running subroutines 3 to 6 in a different order brings no benefit to overall program performance; the program still runs for 135 minutes. Running them in parallel, however, greatly improves it. Rather than taking a total of 25 + 30 + 28 + 26 = 109 minutes, they finish in 30 minutes, the runtime of the longest one. Adding the runtime of subroutines 1, 2, 7 and 8, which cannot be run in parallel because they depend on each other, the total runtime drops from 135 minutes to 10 + 5 + 30 + 8 + 3 = 56 minutes.
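The same arithmetic can be written in a general form. This is only a sketch: the symbol $t_i$ (runtime of subroutine $i$ in minutes) is introduced here for illustration, and scheduling and synchronization overhead is assumed to be negligible.

```latex
\begin{align*}
T_{\text{sequential}} &= \sum_{i=1}^{8} t_i = (10 + 5 + 8 + 3) + (25 + 30 + 28 + 26) = 135 \\
T_{\text{parallel}}   &= (t_1 + t_2 + t_7 + t_8) + \max(t_3, t_4, t_5, t_6) = 26 + 30 = 56 \\
\text{speedup}        &= \frac{T_{\text{sequential}}}{T_{\text{parallel}}} = \frac{135}{56} \approx 2.4
\end{align*}
```

The parallel block is bounded by its slowest member, so the gain is largest when the independent subroutines have similar runtimes.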
As you can see, with a little reorganization of the program logic the program runs about 2.4 times faster (135 / 56). Often the improvement is more dramatic, and the parallelized version of a program can run even 10 times faster than the traditional sequential implementation.
You can use different techniques to make a program run some of its business logic in parallel. The simplest implementation uses the JOB_OPEN and JOB_CLOSE function modules to schedule independent programs that run subroutines 3 to 6 in parallel. Each parallel run can publish its results to persistent storage with the EXPORT TO DATABASE statement, which can write a large amount of data to an INDX-like table in a very short time. The calling program can then retrieve the published results with the IMPORT FROM DATABASE statement in its synchronization subroutine.
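The scheduling step described above could be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the report names ZABC_SUB3 to ZABC_SUB6 are hypothetical worker programs (one per independent subroutine), the job name simply reuses the report name, and error handling is reduced to a list message.

```abap
REPORT zabc_parallel_schedule_demo.

* Hypothetical worker reports, one per independent subroutine (3 to 6).
DATA: gt_reports  TYPE STANDARD TABLE OF sy-repid WITH DEFAULT KEY,
      gv_report   TYPE sy-repid,
      gv_jobname  TYPE tbtcjob-jobname,
      gv_jobcount TYPE tbtcjob-jobcount.

START-OF-SELECTION.
  APPEND 'ZABC_SUB3' TO gt_reports.
  APPEND 'ZABC_SUB4' TO gt_reports.
  APPEND 'ZABC_SUB5' TO gt_reports.
  APPEND 'ZABC_SUB6' TO gt_reports.

  LOOP AT gt_reports INTO gv_report.
    gv_jobname = gv_report.  " assumption: job is named after the report

*   Create an empty background job and obtain its job number.
    CALL FUNCTION 'JOB_OPEN'
      EXPORTING
        jobname          = gv_jobname
      IMPORTING
        jobcount         = gv_jobcount
      EXCEPTIONS
        cant_create_job  = 1
        invalid_job_data = 2
        jobname_missing  = 3
        OTHERS           = 4.
    IF sy-subrc <> 0.
      WRITE: / 'JOB_OPEN failed for', gv_report.
      CONTINUE.
    ENDIF.

*   Add the worker report as the only step of the job.
    SUBMIT (gv_report) VIA JOB gv_jobname NUMBER gv_jobcount AND RETURN.
    IF sy-subrc <> 0.
*     The opened job stays behind without a step; cleanup is omitted here.
      WRITE: / 'SUBMIT failed for', gv_report.
      CONTINUE.
    ENDIF.

*   Release the job for immediate start.
    CALL FUNCTION 'JOB_CLOSE'
      EXPORTING
        jobcount  = gv_jobcount
        jobname   = gv_jobname
        strtimmed = abap_true
      EXCEPTIONS
        OTHERS    = 1.
    IF sy-subrc <> 0.
      WRITE: / 'JOB_CLOSE failed for', gv_report.
    ENDIF.
  ENDLOOP.
```

Each worker would end with a statement such as EXPORT results = lt_results TO DATABASE indx(zp) ID 'ZABC_RESULT_SUB3', where the area zp, the key and the use of the standard INDX table are likewise assumptions. A productive version would probably use its own INDX-like table and pass a run identifier to the workers.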
The sample logic for the parallel version of the same program is shown in the following table:
The zabc_parallel_schedule() subroutine schedules 4 programs that execute subroutines 3 to 6 in parallel. Its runtime is only a few seconds, which rounds to 0 minutes. The zabc_parallel_sync() subroutine spends most of its time waiting for the 4 parallel programs to complete. Each of them exports its results to an INDX-like table and sets a FINISHED flag in, for example, the TVARV table. The zabc_parallel_sync() subroutine monitors the TVARV table and, once the FINISHED flag is detected, brings the results of the parallel runs back with IMPORT FROM DATABASE statements.
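A synchronization routine along these lines might look like the sketch below. It simplifies the approach described above in one respect, which is an assumption of this example and not the original design: instead of polling a FINISHED flag in TVARV, it treats a successful IMPORT (sy-subrc = 0) as the completion signal. The INDX area zp, the result keys, the result type and the polling constants are all hypothetical.

```abap
REPORT zabc_parallel_sync_demo.

* Assumed result format: each worker exports a table of strings.
TYPES ty_results TYPE STANDARD TABLE OF string WITH DEFAULT KEY.

CONSTANTS: gc_interval TYPE i VALUE 10,    " seconds between polls (assumed)
           gc_max_wait TYPE i VALUE 3600.  " give up after one hour (assumed)

DATA: gt_keys   TYPE STANDARD TABLE OF indx-srtfd WITH DEFAULT KEY,
      gv_key    TYPE indx-srtfd,
      gv_tabix  TYPE sy-tabix,
      gv_waited TYPE i,
      gv_lines  TYPE i,
      gt_part   TYPE ty_results,
      gt_all    TYPE ty_results.

START-OF-SELECTION.
* Hypothetical cluster keys, one per parallel worker.
  APPEND 'ZABC_RESULT_SUB3' TO gt_keys.
  APPEND 'ZABC_RESULT_SUB4' TO gt_keys.
  APPEND 'ZABC_RESULT_SUB5' TO gt_keys.
  APPEND 'ZABC_RESULT_SUB6' TO gt_keys.

* Poll until every worker has published its result or the timeout is hit.
  WHILE gt_keys IS NOT INITIAL AND gv_waited < gc_max_wait.
    LOOP AT gt_keys INTO gv_key.
      gv_tabix = sy-tabix.
      IMPORT results = gt_part FROM DATABASE indx(zp) ID gv_key.
      IF sy-subrc = 0.
*       Result is available: collect it and clean up the cluster.
        APPEND LINES OF gt_part TO gt_all.
        DELETE FROM DATABASE indx(zp) ID gv_key.
        DELETE gt_keys INDEX gv_tabix.
      ENDIF.
    ENDLOOP.
    IF gt_keys IS NOT INITIAL.
      WAIT UP TO gc_interval SECONDS.
      gv_waited = gv_waited + gc_interval.
    ENDIF.
  ENDWHILE.

  IF gt_keys IS INITIAL.
    gv_lines = lines( gt_all ).
    WRITE: / 'All parallel results collected, lines:', gv_lines.
  ELSE.
    WRITE: / 'Timed out waiting for parallel results.'.
  ENDIF.
```

A separate flag, as in the TVARV variant, has the advantage that a worker can also report failure, so the caller does not have to wait for the timeout when a job aborts.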
Implementing parallel processing algorithms can often speed up long-running programs significantly, anywhere from 2 to 10 times or more. The performance gains may be even greater in environments with multiple application servers and many background processes available for processing.
In a recent implementation of parallel processing in a custom FI/CO ABAP program that processed around 6 to 8 million records from the BSIK and BSID tables, the runtime decreased more than 7 times. Instead of a single sequential process, 30 processes were scheduled in parallel, and once they had completed, their results were synchronized. The parallelized program was executed on a server group with 3 application servers and over 30 background processes.
The upcoming blog/whitepaper “Optimizing ABAP Programs Performance with Parallel Processing Library” will present a custom ABAP library that makes parallel ABAP programming easy.