Skip to Content
Author's profile photo Former Member

Optimizing Already Optimized ABAP SQL SELECT Statement

Sometimes you have to extract a large number of records from a huge database table. This was the case with a custom ABAP program that prepares F110S Payment Runs for SAP mega customers. As part of the custom ABAP program algorithm, a simple SELECT statement is executed on BSIK table that contains over 60 million records. The SELECT statement extracts information from 25 fields of 6-8 million BSIK records and manipulates it to prepare multiple F110S Payment Runs. It takes for the custom ABAP program on average over 1 hour 30 minutes to do the job on powerful production system with multiple application servers. The custom program spends around 90% of time in a single SELECT statement that extracts BSIK data. The SELECT statement was optimized. The BSIK database table and its indexes were tuned up by BASIS team to speed up the process. Three application servers server group was dedicated to the payment process. Still the process was taking a long time to complete. The entire payment process could not be squeezed into allocated time slot.

For a time it looked that nothing else could be done – that the limitation of the system was reached. However, when you look closer at task at hand, you would notice that the custom ABAP program schedules multiple F110S jobs to run standard SAP payment processes in parallel. The custom ABAP program itself runs as a single process and it seems that it cannot be divided into multiple processes because of a single SELECT statement that consumes over 90% of its total runtime. The SELECT statement pseudo-code looks as shown in the following table:

The above select statement selects BSIK records for 2 company codes and for the entire numeric vendor range 1000000000 to 9999999999, selecting 6 to 8 million records out of 60 to 80 million records BSIK table consuming more than 90 minutes of runtime.

Since the production system is a powerful multi server system, the question is how to utilize its full power for executing a single SELECT statement. If you would divide a single SELECT statement into multiple SELECT statements for unique vendor sub ranges and execute them in parallel you might get results faster even so multiple application servers will be accessing a single database server to get data.

The question is how to determine the vendor sub ranges for multiple SELECT statement jobs. If you use simple approach and define vendor sub ranges as 1000000000-1999999999, 2000000000-2999999999, 3000000000-3999999999, … you might get uneven distribution of number of BSIK records in each vendor sub range as shown on the following diagram:

The sub-ranges as above will cause SELECT statement jobs to be unbalanced – one extracting a large number of records and running long time and another one extracting a small number of records and running short time.

It is important to balance a number of records in each SELECT statement job so they will be extracting similar number of records and having approximately similar runtime. The more balanced vendor sub range is shown on the following diagram:

The prototype ABAP programs were written to calculate balanced vendor sub ranges and then to execute a single SELECT statement on 1000000000-9999999999 vendor range as multiple SELECT statement jobs on balanced vendor sub ranges running in N parallel processes.

The following parallel SELECT statement prototype programs were written:

  • ZAWB_PARALEL_SELECT_RUN – starts the process, waits for all SELECT jobs to finish and imports retrieved BSIK data from INDX like table
  • ZAWB_ PARALEL_SELECT_SCH – calculates vendor sub ranges and schedules parallel SELECT jobs
  • ZAWB_ PARALEL_SELECT_GET – retrieves BSIK data for vendor sub ranges and exports it to INDX like database table

The prototype was tested on Q&A and Volume Performance Testing systems running up to 7 times than a test program with equivalent single SELECT statement. The prototype was tested with 2 to 80 parallel processes running on multiples application servers.

The best performance was achieved when number of parallel runs was equal or multiple of total number of available background processes on dedicated application servers; i.e., when running the prototype on 2 application servers with 20 available background processes, the best results were achieved when splitting SELECT statement into 20, 40, … parallel jobs.

The parallel SELECT statement jobs algorithm was implemented into the custom ABAP payment program dramatically improving its performance on the production system. The runtime was reduced on average from more than 1 hour 30 minutes to around 15 minutes.

Assigned tags

      6 Comments
      You must be Logged on to comment or reply to a post.
      Author's profile photo Former Member
      Former Member
      Hi Adam,

      This blog is really useful. We always face issues with all the FICO tables for performance. Can you please share the code for the above programs.

      If the test client has only one application server then will there be any difference in performance?

      Regards,
      Atish

      Author's profile photo Luis Lu
      Luis Lu
      Thanks for the nice article, if possible can you please shed some light on the details of vendor number partitioning that was used in the article?

      Thanks
      Luis

      Author's profile photo Former Member
      Former Member
      I agree that it will be useful if you can share the code for the partitioning of vendor numbers. I.e. how to determine the volume of data per vendor, etc.
      Author's profile photo Former Member
      Former Member
      I used to think about doing this, but had one apprehension. Suppose the production machine has a heavy job scheduling, then if we run a custom program which creates multiple process, will this not keep other waiting.
      Author's profile photo Clemens Li
      Clemens Li
      Thank you for the useful explanation, I like the graphics - they help a lot in understanding.

      Many readers would like to know some more technical details...

      When I read it, standard parallel processing in the area of contract accounting came to my mind. Here SAP already created a robust framework for mass processing using parallel job scheduling.

      The splitting of, in this case supplier numbers, into sub-ranges is called use of parallel objects. It is not required to create the parallel objects each time you want to run the jobs.

      I remember a colleague used the framework to define own customer-specific tasks, so it is worth to have a look at the Parallelization of Data Processing documentation.

      It is really amazing to see the many areas where SAP standard architects already did a lot of preparations for tasks that come up everywhere.

      Author's profile photo Former Member
      Former Member

      Hi Adam,

      Nice blog..seems its very intersting also. Please can you share  the code ,similar issue we were facing.

      • ZAWB_PARALEL_SELECT_RUN
      • ZAWB_ PARALEL_SELECT_SCH
      • ZAWB_ PARALEL_SELECT_GET

      Thanks&Regards,

      Gurunath Kumar D