
I got a blog-worthy surprise when I did a quick performance test on two different ways to use the new-ish (7.4) table expressions to perform a search on an internal table.

My scenario involved internal table searches where the values would be missing a lot of the time. The unexpected result was that when a row can’t be found, using table expressions with exceptions is up to ten times slower than any other lookup method, including the good ol’ fashioned READ TABLE with SY-SUBRC.

Quick primer for those still getting accustomed to 7.4 ABAP (feel free to skip to the results):

With table expressions, finding an entry in a table is really neat. I have a table that includes a field “ID”. So the following READ TABLE can be rewritten using a table expression (the bit in square brackets):

READ TABLE itab INTO row WITH KEY id = find_id.
"Equivalent to:
row = itab[ id = find_id ].

If the matching entry cannot be found, you need to catch the exception:

  TRY.
      row = itab[ id = find_id ].
    CATCH cx_sy_itab_line_not_found.
      ...
  ENDTRY.

Another alternative is to use a VALUE constructor expression where you can add the keyword OPTIONAL (or DEFAULT) to just initialize any non-found values.

row = VALUE #( itab[ id = find_id ] OPTIONAL ).
IF row IS INITIAL.
  ...
ENDIF.

This particular usage of VALUE may seem a little awkward. I mean, why use x = VALUE #( y ) when you can just write x = y? The VALUE constructor’s sole purpose here is the OPTIONAL bit that lets us do away with the exception.
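As an aside, the DEFAULT variant mentioned above works the same way, but supplies a fallback value instead of an initial one. A sketch, where fallback_row is a hypothetical value of the row type:

```abap
" No exception: row receives fallback_row when no match is found
row = VALUE #( itab[ id = find_id ] DEFAULT fallback_row ).
```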

As I was working on a performance-sensitive component, I tested it to see what performs best.

Results:

For completeness, and additionally with suggestions by Jacques Nomssi, Quynh Doan Manh and Uwe Fetzer in the comments, I also added a couple of other ways to look for an entry in a table:

  • The trusty READ TABLE  … WITH KEY, with and without TRANSPORTING NO FIELDS
  • line_exists( )
  • ASSIGN itab[ … ]  – which (bizarrely) sets SY-SUBRC instead of producing an exception
  • REF #( ) instead of VALUE #( ).

I tested a million failed lookups on a 7.50 HANA system. Here are the results in microseconds:

line_exists( )                        :   610,758
READ TABLE ... TRANSPORTING NO FIELDS :   671,368
READ TABLE                            :   671,115
ASSIGN itab[ ... ]                    :   707,929
REF #( itab[ ... ] OPTIONAL )         :   888,562
VALUE #( itab[ ... ] OPTIONAL )       :   961,803
TRY ... itab[ ... ] CATCH             : 6,325,265

I did not expect the last one at all.

So the take-away here for me is that a TRY-CATCH may be easier to read, but should not be used in performance-sensitive code unless you expect the values to be found most of the time. I’m happy to sacrifice a little performance for readability, but this is a significant impact.

I suspect that this applies to CATCH blocks in general, but that’s another analysis for another day.

For comparison, I also re-ran the same but this time with a lookup value that did exist. Two things happened:

  • the exception was no longer an influence (to be expected)
  • line_exists( ) became a poor choice in my scenario, because we need to double up the search in order to also read the record:
  IF line_exists( itab[ id = find_id ] ).
    row = itab[ id = find_id ].
  ENDIF.
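When the data is needed, a single lookup that also tells you whether it succeeded avoids that doubling. A sketch using the same names as above:

```abap
" One lookup: populate row and check SY-SUBRC
READ TABLE itab INTO row WITH KEY id = find_id.
IF sy-subrc = 0.
  " row was found and is populated
ENDIF.
```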

To summarise:

  • If you don’t need the data, line_exists( ) is fastest.
  • If performance is number 1 priority and you need the data, READ TABLE is fastest.
  • For compact and/or readable code, use table expressions.
    (Yes, I know, ‘new’ ABAP can be used to make code either more readable or more cryptic, but that’s another discussion)
  • TRY-CATCH with table expressions can be a useful way to structure your code (e.g. to use a single catch handler for multiple expressions), but be mindful of the expected failure rate and performance-criticality of the component. If we’re talking anything less than thousands in a short space of time then you can safely ignore the impact.
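To illustrate the single-catch-handler point, one TRY block can cover several table expressions. This sketch assumes a second, hypothetical table itab2 keyed by head_id:

```abap
TRY.
    DATA(head) = itab[ id = find_id ].
    DATA(item) = itab2[ head_id = head-id ].
  CATCH cx_sy_itab_line_not_found.
    " either lookup failing lands here
ENDTRY.
```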

Update:

After several comments and suggestions, I uploaded my test program to GitHub at abap_itab_perf

It started as a quick-n-dirty program for comparing just two options, and grew as a copy/paste of alternatives. Feel free to play with it or even send any updates back via GitHub.


29 Comments


  1. Quynh Doan Manh

    I think overall the VALUE #( ) could be used. It makes the code easier to read; if the data doesn’t exist it’s fast enough, and in case the data does exist you don’t have to do a check plus an assignment like with line_exists( ).

    If you compare only line_exists( ) with READ TABLE, I think the READ TABLE should have TRANSPORTING NO FIELDS, to be fair :).

    1. Mike Pokraka Post author

      Thanks for your comment. I partly agree, but still find the additional VALUE clumsy. I have no idea why SAP didn’t just do a SY-SUBRC like with field symbols, particularly as there is such a performance penalty.

      Re. TRANSPORTING NO FIELDS, my initial test for my real work was where I needed the value, I added line_exists as an afterthought for the blog. But it’s a good point for a complete comparison, so I’ve added it and updated the results directly in the blog above, thanks.

  2. Jacques Nomssi

    Hello Mike,

    could you please measure the run-time behaviour of this variant?

    ASSIGN itab[ id = find_id ] TO FIELD-SYMBOL(<row>).

    No exceptions, check sy-subrc, it should be good enough.

    JNN

    1. Mike Pokraka Post author

      Thanks for the input, good point! I remember reading this a while ago, and thinking it’s bizarre. But we are where we are, and ASSIGN turns out to be a great result! My specific use case was a returning parameter so I didn’t even consider an FS.

    1. Mike Pokraka Post author

      Good point, but actually we don’t need to, as key fields of sorted or hashed tables provide a better alternative to binary search. But it’s certainly worth mentioning as not everyone knows all the possibilities.

      See https://help.sap.com/http.svc/rc/abapdocu_752_index_htm/7.52/en-US/index.htm?file=abentable_expressions.htm

      Personally I think BINARY SEARCH is unreliable and should be avoided if possible. I don’t like the idea of relying on prior code sorting the table and that nobody has APPENDED a row since then. If you need to search you need a key, so there’s no reason not to use a SORTED or HASHED table where you cannot go wrong.
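      For anyone unfamiliar with them, a sketch of such declarations (ty_row and the key field id are assumptions):

```abap
" The key is maintained on every INSERT, so lookups are always optimized
DATA sorted_tab TYPE SORTED TABLE OF ty_row WITH UNIQUE KEY id.
DATA hashed_tab TYPE HASHED TABLE OF ty_row WITH UNIQUE KEY id.
```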

      1. Kabir Siddarth

        Isn’t that SORT an overhead if you have a large internal table? The value added by binary search isn’t fantastic after that time-consuming sort. Like you said, sorted and hashed tables are the best bets.

        1. Mike Pokraka Post author

          A SORT can be beneficial in certain scenarios. Consider adding values one by one in random order – every INSERT into a sorted table adds performance cost, so here a one-off SORT at the end is likely to be more efficient.

          BUT… the reason I say BINARY SEARCH is unreliable is not for technical reasons but has to do with the developer(s). Example:

          DATA itab TYPE TABLE OF i.
          itab = VALUE #( ( 8 ) ( 5 ) ( 3 ) ).
          SORT itab.
          APPEND 2 TO itab.
          READ TABLE itab INTO DATA(n) WITH KEY table_line = 2 BINARY SEARCH. "not found, n = 0
          READ TABLE itab INTO n WITH KEY table_line = 2.                     "n = 2

          This example is obviously easy to work out, but complex programs with several development iterations are a different story – especially when the filling, sorting and reading are spaced out in the code.

          This mistake is not possible with sorted or hashed tables. For that reason alone I avoid BINARY SEARCH unless there is a major benefit.

      2. Matthew Billingham

        Since I started using SORTED and HASHED tables 18 years ago, I’ve never had a need to use BINARY SEARCH on tables I’ve defined myself.

        It baffles me that people are using these 7.40+ features… yet continue to use BINARY SEARCH instead of SORTED/HASHED.

    1. Mike Pokraka Post author

      Good point, updated in the blog. In my experience references are usually slightly slower than field symbols, and this is no exception.

    1. Mike Pokraka Post author

      Hah, it only started with the intention to compare two table expressions. Then I added another two and then the good suggestions came in 🙂

  3. Custodio de Oliveira

    Great post, thanks for sharing.

    Stating the obvious here (which actually was already said), the big difference there is the EXCEPTION, not the different ways of reading. Especially for the first 4 cases. I reckon it would probably be a good idea to run the tests a few times and compare the averages (not sure if you did this already).

    Also this:

     IF line_exists( itab[ id = find_id ] ).
        row = itab[ id = find_id ].
    ENDIF.

    should never be used, as per official ABAP programming guidelines.

     

    Cheers.

    1. André Brito

      Hi Mike,

      I agree with Custodio, and I know you well enough to say that I shouldn’t ask but … is this the average of multiple executions of the same tests?

      Anyway, I’m a bit surprised as I expected the results between READ TABLE and VALUE with OPTIONAL to be closer (even though it’s only a 300 microseconds difference .. 0.0003 seconds!). I guess I’m being picky. 🙂

      As always, well done and thanks for the info. Cheers!

      1. André Brito

        After reading the article more closely I noticed a small but important detail: it says you “tested a million failed lookups“.
        So… after reading (or failing to read) a million times, the difference between READ TABLE and VALUE with OPTIONAL is only 0.0003 seconds?? Really? Wow!

        And as far as I can see a 0.0003 seconds difference is negligible. I will still go with the VALUE statement.

         

        P.S.: did you really do a million lookups? 🙂

        1. Mike Pokraka Post author

          Yup, a million. Personally I see 0.3s per million as an acceptable margin for code clarity and consistency when the data volume is in the thousands or the DB time far exceeds that. The 6 seconds however took me by surprise and is something to keep in mind.

          I uploaded the code if you or anyone wants to play with it further – feel free to update back.

          Github: abap_itab_perf

      2. Mike Pokraka Post author

        Hi Andre,

        Of course it goes without saying that duplicate searches are to be avoided; I thought that was too obvious to require explanation 🙂

        It’s 300,000 microseconds, or 0.3 seconds. So the difference compared to an exception becomes over 6 seconds when running through a million records. (Yes, that silly British-American comma that much of the world uses as a thousands separator.)

         

         

  4. Kumud Singh

    It is interesting to see that the assumption of ‘ABAP 7.4’ elements being better than traditional elements fails sometimes. Thanks for sharing. Developers need to make a choice between readability of the code and performance on many occasions.

     

     

    Regards,

    Kumud

     

    1. Mike Pokraka Post author

      Agree, we (myself included) automatically assume new=faster. But on the other hand, I am not aware of any claims from SAP that the ‘new ABAP’ statements are faster, it’s almost always been about function, readability, new possibilities, etc. Faster to write perhaps 🙂

  5. Tomas Buryanek

    Thank you for the blog and testing.

    So I am staying with good old READ TABLE and line_exists( ) 🙂 EDIT: read the comments below, thanks Uwe and Mike.

      1. Tomas Buryanek

        Sorry, I do not want to give misleading “conclusions”.
        Should I add “checking existence within an internal table with best performance” then?

        1. Mike Pokraka Post author

          The conclusion, as it often is, is “it depends”.

          There are two things to consider in performance: absolute and relative performance.

          Absolute is how much you are processing: how much time does something consume over the workday, and how much is it going to affect the user? Remember this was for a million executions, so for a dialog user querying 1000 records, the impact of 0.006 vs. 0.009 seconds is nil.

          Likewise for relative performance: If a job does a bunch of DB stuff and runs for half an hour then even 6 seconds is all but meaningless.

          So in most scenarios the difference between the options (apart from exceptions, which was the point of the blog) is a perfectly acceptable tradeoff for consistent and readable code.

  6. Ankit Maskara

    I had a small question: shouldn’t TRY...CATCH be used wherever we use the new table expressions?

     

    Eg:

    TRY.
        line_exists( itab[ id = find_id ] ).
      CATCH cx_sy_itab_line_not_found.
        " Exception Handling
    ENDTRY.
    1. Mike Pokraka Post author

      A good question, but no. The function, and several of the variants detailed above, are designed precisely so that you don’t need to handle the exception.

      If you follow the very first link of the blog you’ll see the documentation states:

      If the specified row is not found, a handleable exception of the class CX_SY_ITAB_LINE_NOT_FOUND is raised in all operand positions, except when

      • a table expression is used in the statement ASSIGN, where sy-subrc is set to the value 4,
      • when used in the predicate function line_exists, where the truth value “false” is returned,
      • when used in the table function line_index, where the value 0 is returned.
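      In other words, these are the exception-free positions (a sketch, using the same itab as in the blog):

```abap
ASSIGN itab[ id = find_id ] TO FIELD-SYMBOL(<row>).  " sy-subrc = 4 if not found
IF line_exists( itab[ id = find_id ] ).              " false if not found
ENDIF.
DATA(idx) = line_index( itab[ id = find_id ] ).      " 0 if not found
```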
