Skip to Content

This BLOG is an expanded version of a post that I made to the ABAP forum a while ago. What I wanted to show was that the effect that when performance tuning, the effect of nested loops can be far worse than poorly constructed SELECT statements.

I wrote a small program which illustrates this point. I’ve posted the code below, but in a nutshell, what it does is select a number of FI document headers and line items from BKPF and BSEG. The select statement from BSEG is very inefficient. I did it that way to prove a point. It then reads all records from both tables using a nested loop. Then it reads all records from both tables using a much more efficient method. Finally, it reads all records from both tables using a different method which may be even more efficient, but may be somewhat more difficult to program. It keeps track of the number of records and time taken for each operation.

I ran the program twice in our QA environment – once selecting a small amount of data and again selecting a moderate amount. The outputs are:

First run –

Time for unindexed select : 00:06:09
Number of BKPF entries: 5,863
Number of BSEG entries: 17,683

Time for nested loop : 00:00:53
Number of BKPF reads : 5,863
Number of BSEG reads : 17,683

Time for indexed loop : 00:00:00
Number of BKPF reads : 5,863
Number of BSEG reads : 17,683

Time for parallel cursor : 00:00:00
Number of BKPF reads : 5,863
Number of BSEG reads : 17,683

Second run –

Time for unindexed select : 00:21:07
Number of BKPF entries: 55,777
Number of BSEG entries: 205,285

Time for nested loop : 02:16:21
Number of BKPF reads : 55,777
Number of BSEG reads : 205,285

Time for indexed loop : 00:00:01
Number of BKPF reads : 55,777
Number of BSEG reads : 205,285

Time for parallel cursor : 00:00:01
Number of BKPF reads : 55,777
Number of BSEG reads : 205,285

So what can we conclude? In the first case, a gain in time of almost a minute is not much, but in a dialogue program, it would be worthwhile. But in the second case the other method gains over two hours. This would allow a program that has to run in the background to run in the foreground. The most striking thing to me though, is the fact that the nested loop takes substantially longer than an extremely inefficient select statement.

In both cases, the loop using the parallel cursor method did not produce a substantial savings over the loop using a binary search followed by an indexed read; however, if you run this and extract a very large amount of data, it does run more quickly. In the program I have provided, the parallel cursor method does appear to be somewhat easier to program, but I have found that if the outer loop does not contain all of the records in the inner loop, then programming complexity is increased and I don’t think it warrants the extra effort. The binary search method is easy and runs quickly.

The select screen for the program is quite standard. The amount of data returned is determined by whatever the user enters and the programmer really has no control over it. So I’m suggesting that if you can’t guarantee that the amount of data is small, then you really ought to use the indexed read method.

Also note that the first select statement returns 17,683 rows from BSEG and took 06:09 to run and the second one returns 205,285 rows and takes 21:07. The second one retrieves almost 11.5 times as much data but takes only about 3.5 times as long to execute. The two runs of the program were on separate evenings when there shouldn’t be any load, so buffering and workload shouldn’t be issues.

So, my conclusion is: when tuning a program that you know will return a small amount of data, tune the select statement and don’t worry too much about loops; however, if the program may return a large amount of data, avoid nested loops.

And a final thought: once you get a program tuned to a certain point, it doesn’t make a lot of sense to try to spend a lot of time trying to reduce execution time by a small amount. (More on this later)

Code follows:

report ztest_nested_loop. data: bkpf type bkpf, bseg type bseg. select-options: s_bukrs for bseg-bukrs memory id buk obligatory, s_gjahr for bseg-gjahr memory id gjr obligatory, s_lifnr for bseg-lifnr memory id lif obligatory. data: bkpf_tab type standard table of bkpf, bkpf_lin like line of bkpf_tab, bseg_tab type standard table of bseg, bseg_lin like line of bseg_tab. data: start_time type sy-uzeit, end_time type sy-uzeit, difference type sy-uzeit, bkpf_entries type sy-tabix, bseg_entries type sy-tabix, bkpf_reads type sy-tabix, bseg_reads type sy-tabix. start-of-selection. perform unindexed_select. perform nested_loop. perform indexed_loop. PERFORM parallel_cursor. *&---------------------------------------------------------------------* *& Form unindexed_select *&---------------------------------------------------------------------* form unindexed_select. get time field start_time. select * from bseg into table bseg_tab where bukrs in s_bukrs and gjahr in s_gjahr and lifnr in s_lifnr. if sy-subrc <> 0. message id '00' type 'E' number '001' with 'No entries selected'. endif. select * from bkpf into table bkpf_tab for all entries in bseg_tab where bukrs = bseg_tab-bukrs and belnr = bseg_tab-belnr and gjahr = bseg_tab-gjahr and bstat = space. clear bseg_tab. refresh bseg_tab. select * from bseg into table bseg_tab for all entries in bkpf_tab where bukrs = bkpf_tab-bukrs and belnr = bkpf_tab-belnr and gjahr = bkpf_tab-gjahr. get time field end_time. difference = end_time - start_time. describe table bkpf_tab lines bkpf_entries. describe table bseg_tab lines bseg_entries. write: /001 'Time for unindexed select:', difference, /005 'Number of BKPF entries:', bkpf_entries, /005 'Number of BSEG entries:', bseg_entries. skip 1. endform. " unindexed_select *&---------------------------------------------------------------------* *& Form nested_loop *&---------------------------------------------------------------------* form nested_loop. get time field start_time. loop at bkpf_tab into bkpf_lin. bkpf_reads = bkpf_reads + 1. loop at bseg_tab into bseg_lin where bukrs = bkpf_lin-bukrs and belnr = bkpf_lin-belnr and gjahr = bkpf_lin-gjahr. bseg_reads = bseg_reads + 1. endloop. endloop. get time field end_time. difference = end_time - start_time. write: /001 'Time for nested loop:', difference, /005 'Number of BKPF reads:', bkpf_reads, /005 'Number of BSEG reads:', bseg_reads. skip 1. endform. " nested_loop *&---------------------------------------------------------------------* *& Form indexed_loop *&---------------------------------------------------------------------* form indexed_loop. data: bkpf_index like sy-tabix, bseg_index like sy-tabix. clear: bkpf_reads, bseg_reads. get time field start_time. sort: bkpf_tab by bukrs belnr gjahr, bseg_tab by bukrs belnr gjahr. loop at bkpf_tab into bkpf_lin. read table bseg_tab into bseg_lin with key bukrs = bkpf_lin-bukrs belnr = bkpf_lin-belnr gjahr = bkpf_lin-gjahr binary search. bkpf_reads = bkpf_reads + 1. bseg_index = sy-tabix. while sy-subrc = 0. bseg_index = bseg_index + 1. bseg_reads = bseg_reads + 1. read table bseg_tab into bseg_lin index bseg_index. if bseg_lin-bukrs <> bkpf_lin-bukrs or bseg_lin-belnr <> bkpf_lin-belnr or bseg_lin-gjahr <> bkpf_lin-gjahr. sy-subrc = 99. else. endif. endwhile. endloop. get time field end_time. difference = end_time - start_time. write: /001 'Time for indexed loop:', difference, /005 'Number of BKPF reads:', bkpf_reads, /005 'Number of BSEG reads:', bseg_reads. skip 1. endform. " indexed_loop *&---------------------------------------------------------------------* *& Form parallel_cursor *&---------------------------------------------------------------------* * text *----------------------------------------------------------------------* FORM parallel_cursor. DATA: bkpf_index LIKE sy-tabix, bseg_index LIKE sy-tabix. CLEAR: bkpf_reads, bseg_reads. GET TIME FIELD start_time. SORT: bkpf_tab BY bukrs belnr gjahr, bseg_tab BY bukrs belnr gjahr. bseg_index = 1. LOOP AT bkpf_tab INTO bkpf_lin. bkpf_reads = bkpf_reads + 1. LOOP AT bseg_tab INTO bseg_lin FROM bseg_index. IF bseg_lin-bukrs <> bkpf_lin-bukrs OR bseg_lin-belnr <> bkpf_lin-belnr OR bseg_lin-gjahr <> bkpf_lin-gjahr. bseg_index = sy-tabix. EXIT. ELSE. bseg_reads = bseg_reads + 1. ENDIF. ENDLOOP. ENDLOOP. GET TIME FIELD end_time. difference = end_time - start_time. WRITE: /001 'Time for parallel cursor :', difference, /005 'Number of BKPF reads :', bkpf_reads, /005 'Number of BSEG reads :', bseg_reads. ENDFORM. " parallel_cursor

More information can be found in SE30 (Tips and Tricks)

To report this post you need to login first.

21 Comments

You must be Logged on to comment or reply to a post.

  1. Former Member
    Your code for the Index and Nested loops is misleading, as these do not retrieve the data from the database. However this would make the Nested loop times greater by up to 20? miniutes.

    Another thing to look out for is the sequence that you try each of these processes. The first process to fill an internal table will take longer than a process to re-fill it. The memory to store the data has to be allocated.

    I was however suprised at how much longer the nested loop took to process the data.

    MattG.

    (0) 
    1. Former Member Post author
      Well, the idea was to compare how the performance of nested loops compared to an inefficient SELECT statement, so I retrieved the data entirely separately from LOOPing through it.

      Rob

      (0) 
      1. A. de Smidt
        As mentioned in my thread I noticed that the parallel loop fails if the nested table contains keys which the main table doesn’t have.

        the very slow standard nested loop doesn’t have this problem with additional keys in the nested table.

        Before I do the parallel looop I delete all the entries which are not in the main table from the nested  itab.

        perhaps there is a nicer alternative ๐Ÿ™‚

        (0) 
    1. Former Member
      thnx for good posts..

      You may also use “transporting no fields” addition while finding index of bseg_tab with binary search. this may reduce the assign overhead.

      just for extreme performance ๐Ÿ˜‰

      Shaban

      (0) 
    2. Former Member Post author
      I added your code to my program and modified it so that it did some more efficient selecting (the bad one was just to prove a point). Then I ran the three more efficient loops a number of times. What I found was that you code ran in about in about 80% of the time that the indexed loop took and about 70% of the time of the parallel cursor. So your code is faster. But then I wasn’t really trying to find the fastest way to process two tables, just to show that nested loops have to be avoided at all costs for large tables.

      Incidentally, from the above, the indexed loop method consistently outperforms the parallel cursor method. I find this surprising because the records in each table will be read only once using parallel cursors, but the inner table will be read more often using the indexed loop. The only reason I can think of for this is that the SAP kernel has somehow optimized the binary search.

      Rob

      (0) 
      1. Former Member
        Rob

        Just something to check with these methods is to only sort the data once. Record the time taken to sort the tables, and add it to the runtime of the methods that need sorted data.

        Sort methods, generally, are quicker at sorting unsorted data than sorted data.

        (0) 
  2. Former Member

    It is interesting to compare the times for very small number of records. (This is for SCARR, SPFLI on miniSAP). My times, in microseconds:<br/>Select: 96439<br/>Nested:    31<br/>Indexed:   59<br/>Other:     31<br/>This was for 6 SCARR and 10 SPFLI records.<br/>So this test would show that a Nested loop was the best option. (James’s ‘Other’ logic does not include the required sort).<br/><br/>I tried to find out what was the better method to update an internal table, assign or modify. I found that the overhead of the first assign was 20 times that of subsequent assigns. This assign was 6 times longer than the first modify, itself longer than later modifies.<br/><br/>What was odd about the results, it made a difference to the times if I modify loop first or second. <br/><br/>My code:<br/>&—-


    <br/>*& Report  Z_MG_TYPE_FS_ITAB                                           <br/>&                                                                     <br/>&–


    <br/>&   A little test program to see if loop at assigning would work.     <br/>&   This also proves that the field-symbol has structure.             <br/>&


    <br/><br/>REPORT  Z_MG_TYPE_FS_ITAB             .<br/><br/><br/>CONSTANTS: C_DONUMBER  TYPE I  VALUE 10.<br/><br/>TYPES: BEGIN OF TYP_SPFLI.<br/>INCLUDE  STRUCTURE SPFLI.<br/>TYPES:   TABIX1 TYPE SYTABIX,<br/>         TABIX2 TYPE SYTABIX,<br/>       END   OF TYP_SPFLI.<br/>FIELD-SYMBOLS: <SPFLI>  TYPE TYP_SPFLI.<br/>DATA: T_SPFLI TYPE STANDARD TABLE OF TYP_SPFLI,<br/>      H_SPFLI TYPE                   TYP_SPFLI.<br/><br/>TYPES: BEGIN OF TYP_RESULT,<br/>         INDEX   TYPE SYINDEX,<br/>         ASSIGN  TYPE I,<br/>         AFIRST  TYPE I,<br/>         MODIFY  TYPE I,<br/>         MFIRST  TYPE I,<br/>       END   OF TYP_RESULT.<br/>DATA: T_RESULT  TYPE STANDARD TABLE OF TYP_RESULT,<br/>      S_RESULT  TYPE                   TYP_RESULT.<br/><br/>DATA: TIMESTAMP1  TYPE I,<br/>      TIMESTAMP2  TYPE I,<br/>      MIN_TIME_A  TYPE I  VALUE 1000000,<br/>      MIN_TIME_M  TYPE I  VALUE 1000000,<br/>      MAX_TIME_A  TYPE I  VALUE 0,<br/>      MAX_TIME_M  TYPE I  VALUE 0,<br/>      MEANTIME_A  TYPE I  VALUE 0,<br/>      MEANTIME_M  TYPE I  VALUE 0.<br/><br/>PARAMETERS: P_DEBUG  AS CHECKBOX,<br/>           Who goes first.<br/>            P_ASSIGN RADIOBUTTON GROUP GO,<br/>            P_MODIFY RADIOBUTTON GROUP GO,<br/>            P_XTRAAS AS CHECKBOX.<br/><br/><br/><br/>START-OF-SELECTION.<br/>  IF  P_DEBUG  <>  SPACE.<br/>    BREAK-POINT.<br/>  ENDIF.<br/><br/>  SELECT * INTO TABLE T_SPFLI FROM SPFLI.<br/><br/>* Assign field-symbol, so it is not initial.<br/>  IF  P_XTRAAS  =  ‘X’.<br/>    ASSIGN H_SPFLI TO <SPFLI>.<br/>  ENDIF.<br/><br/>  DO C_DONUMBER TIMES.<br/>    S_RESULT-INDEX  =  SY-INDEX.<br/>    IF  P_MODIFY  =  SPACE.<br/>      PERFORM UPDATE_USING_ASSIGN.<br/>      PERFORM UPDATE_USING_MODIFY.<br/><br/>    ELSE.<br/>      PERFORM UPDATE_USING_MODIFY.<br/>      PERFORM UPDATE_USING_ASSIGN.<br/>    ENDIF.<br/>    APPEND S_RESULT TO T_RESULT.<br/>  ENDDO.<br/><br/><br/>  WRITE: /    ‘Time Assigning:’,  MIN_TIME_A,  MAX_TIME_A,  MEANTIME_A,<br/>         /    ‘Time Modifying:’,  MIN_TIME_M,  MAX_TIME_M,  MEANTIME_M.<br/>  ULINE.<br/>  IF  P_ASSIGN  =  ‘X’.<br/>    WRITE: /    ‘Assign processed first.’.<br/>  ELSE.<br/>    WRITE: /    ‘Modify processed first.’.<br/>  ENDIF.<br/>  ULINE.<br/>  SKIP.<br/>  LOOP AT T_RESULT  INTO S_RESULT.<br/>    WRITE: /    ‘Index :’,        S_RESULT-INDEX,<br/>                ‘Assign:’,        S_RESULT-ASSIGN,<br/>                ‘First assign:’,  S_RESULT-AFIRST,<br/>                ‘Modify:’,        S_RESULT-MODIFY,<br/>                ‘First modify:’,  S_RESULT-MFIRST.<br/>  ENDLOOP.<br/>  ULINE.<br/><br/><br/>*&


    <br/>&      Form  UPDATE_USING_ASSIGN<br/>*&


    <br/>       text<br/>*


    <br/>FORM UPDATE_USING_ASSIGN.<br/>  DATA: L_FIRSTTIME  TYPE I.<br/><br/>  GET RUN TIME FIELD TIMESTAMP1.<br/>  LOOP AT T_SPFLI ASSIGNING <SPFLI>.<br/>    <SPFLI>-TABIX1  =  SY-TABIX.<br/>    IF  SY-TABIX  =  1.<br/>      GET RUN TIME FIELD L_FIRSTTIME.<br/>    ENDIF.<br/>  ENDLOOP.<br/>  GET RUN TIME FIELD TIMESTAMP2.<br/>  TIMESTAMP2  =  TIMESTAMP2  –  TIMESTAMP1.<br/>  IF  TIMESTAMP2  <  MIN_TIME_A.<br/>    MIN_TIME_A  =  TIMESTAMP2.<br/>  ENDIF.<br/>  IF  TIMESTAMP2  >  MAX_TIME_A.<br/>    MAX_TIME_A  =  TIMESTAMP2.<br/>  ENDIF.<br/>  ADD TIMESTAMP2 TO MEANTIME_A.<br/>  S_RESULT-ASSIGN  =  TIMESTAMP2.<br/>  S_RESULT-AFIRST  =  L_FIRSTTIME  –  TIMESTAMP1.<br/>ENDFORM.                    ” UPDATE_USING_ASSIGN<br/><br/><br/>&


    <br/>&      Form  UPDATE_USING_MODIFY<br/>*&


    <br/>       text<br/>*


    *<br/>FORM UPDATE_USING_MODIFY.<br/>  DATA: L_FIRSTTIME  TYPE I.<br/><br/>  GET RUN TIME FIELD TIMESTAMP1.<br/>  LOOP AT T_SPFLI INTO H_SPFLI.<br/>    H_SPFLI-TABIX2  =  SY-TABIX.<br/>    MODIFY T_SPFLI FROM H_SPFLI.<br/>    IF  SY-TABIX  =  1.<br/>      GET RUN TIME FIELD L_FIRSTTIME.<br/>    ENDIF.<br/>  ENDLOOP.<br/>  GET RUN TIME FIELD TIMESTAMP2.<br/>  TIMESTAMP2  =  TIMESTAMP2  –  TIMESTAMP1.<br/>  IF  TIMESTAMP2  <  MIN_TIME_M.<br/>    MIN_TIME_M  =  TIMESTAMP2.<br/>  ENDIF.<br/>  IF  TIMESTAMP2  >  MAX_TIME_M.<br/>    MAX_TIME_M  =  TIMESTAMP2.<br/>  ENDIF.<br/>  ADD TIMESTAMP2 TO MEANTIME_M.<br/>  S_RESULT-MODIFY  =  TIMESTAMP2.<br/>  S_RESULT-MFIRST  =  L_FIRSTTIME  –  TIMESTAMP1.<br/>ENDFORM.                    ” UPDATE_USING_MODIFY

    (0) 
    1. Former Member Post author
      Hi Matthew – yes, for small amounts of data, there really is not much to be gained by using either of the techniques I’ve pointed out. But I’m really not sure if you can count on nested loops giving better performance. The time to do a loop is dependent on the number of records in the table. From F1 on READ: “The runtime required to read a line from a table with 100 entries using the index is around ca. 7 msn (standard microseconds), to read a line using a key of 30 bytes using a binary search, around 25 msn, and without binary search, around 100 msn.”

      Thanks for taking the time to look at this. I appreciate it.

      Rob

      (0) 
    2. RE: “James’s ‘Other’ logic does not include the required sort.”

      Very true.  Thanks for pointing this out.  I was originally using a sorted itab (no BINARY SEARCH), and neglected to add the sort when I morphed my own code to fit the blog’s example.

      Regards,
      James

      (0) 
  3. Former Member
    Hi all,
    found this and played a little with the coding. Why not just replace in the data defenition:
          bseg_tab type standard table of bseg,

    by
          bseg_tab type sorted table of bseg with non-           non-unique key bukrs belnr gjahr buzei,
    (or unique key)?

    Result in a test with around 400k entries was near the indexed loop.

    Greetings,
    Daniel

    (0) 
    1. Former Member Post author
      Yes – that’s a good point. There are a number of different techniques that can be used instead of the one I presented here. I like to use standard tables because I tend to re-use and re-sort them.

      But the main thrust of the blog isn’t how to fix the problem, it’s recogizing the problem in the first place. Many questions in the forum ask “How can I improve the performace of this code…?” The code will have a bunch of selects and then one or more nested loops.

      Most of the responses try to improve the selects only and ignore the nested loops. A lot of people just don’t see this as a problem.

      Rob

      (0) 

Leave a Reply