Skip to Content
Author's profile photo Mike Pokraka

ABAP tips: Checking existence within an internal table

I got a blog-worthy surprise when I did a quick performance test on two different ways to use the new-ish (7.4) table expressions to perform a search on an internal table.

My scenario involved internal table searches where the values would be missing a lot of the time. The unexpected result was that when a row can’t be found, using table expressions with exceptions is ten times slower on average than all other lookup methods, including the good ol’ fashioned READ TABLE with SY-SUBRC.

Quick primer for those still getting accustomed to 7.4 ABAP (feel free to skip to the results):

With table expressions, finding an entry in a table is really neat. I have a table that includes a field “ID”. So the following READ TABLE can be rewritten using a table expression (the bit in square brackets):

READ TABLE itab INTO row WITH KEY id = find_id.
"Equivalent to:
row = itab[ id = find_id ].

If the matching entry cannot be found, you need to catch the exception:

  TRY.
      row = itab[ id = find_id ].
    CATCH cx_sy_itab_line_not_found.
      ...
  ENDTRY.

Another alternative is to use a VALUE constructor expression where you can add the keyword OPTIONAL (or DEFAULT) to just initialize any non-found values.

row = VALUE #( itab[ id = find_id ] OPTIONAL ).
IF row IS INITIAL.
  ...

This particular usage of VALUE may seem a little awkward. I mean, why use x = VALUE #( y ) when you can just write x = y? The VALUE constructor’s sole purpose here is the OPTIONAL bit that lets us do away with the exception.

As I was working on a performance-sensitive component, I tested it to see what performs best.

Results:

For completeness, and additionally with suggestions by Former Member, Quynh Doan Manh and Uwe Fetzer in the comments, I also added a couple of other ways to look for an entry in a table:

  • The trusty READ TABLE  … WITH KEY, with and without TRANSPORTING NO FIELDS
  • line_exists( )
  • ASSIGN itab[ … ]  – which (bizarrely) sets SY-SUBRC instead of producing an exception
  • REF #( ) instead of VALUE #( ).

I tested a million failed lookups on a 7.50 HANA system. Here are the results in microseconds:

line_exists( )                        :   610,758
READ TABLE ... TRANSPORTING NO FIELDS :   671,368
READ TABLE                            :   671,115
ASSIGN itab[ ... ]                    :   707,929
REF #( itab[ ... ] OPTIONAL )         :   888,562
VALUE #( itabl[ ... ] OPTIONAL )      :   961,803
TRY ... itab[ ... ] CATCH             : 6,325,265

I did not expect the last one at all.

So the take-away here for me is that a TRY-CATCH may be easier to read, but should not be used in performance-sensitive code unless you expect the values to be found most of the time. I’m happy to sacrifice a little performance for readability, but this is a significant impact.

I suspect that this applies to CATCH blocks in general, but that’s another analysis for another day.

For comparison, I also re-ran the same but this time with a lookup value that did exist. Two things happened:

  • the exception was no longer an influence (to be expected)
  • line_exists( ) became a poor choice in my scenario, because we need to double up the the search in order to also read the record:
  IF line_exists( itab[ id = find_id ] ).
    row = itab[ id = find_id ].

To summarise:

  • If you don’t need the data, line_exists( ) is fastest.
  • If performance is number 1 priority and you need the data, READ TABLE is fastest.
  • For compact and/or readable code, use table expressions.
    (Yes, I know, ‘new’ ABAP can be used to make code either more readable or more cryptic, but that’s another discussion)
  • TRY-CATCH with table expressions can be a useful way to structure your code (e.g. to use a single catch handler for multiple expressions), but be mindful of the expected failure rate and performance-criticality of the component. If we’re talking anything less than thousands in a short space of time then you can safely ignore the impact.

Update:

After several comments and suggestions, I uploaded my test program to GitHub at abap_itab_perf

It was just a quick-n-dirty originally for comparing just two options and just grew as a copy/paste of alternatives. Feel free to play with it or even send any updates back via GitHub.

Assigned Tags

      36 Comments
      You must be Logged on to comment or reply to a post.
      Author's profile photo Shai Sinai
      Shai Sinai

      Interesting analysis.

      Like you , I suspect the performance difference lies in the exceptions/TRY...CATCH mechanism.

      Author's profile photo Quynh Doan Manh
      Quynh Doan Manh

      I think overall the VALUE #( ) could be used. its make code easier to read, if data not exist its fast enough, and incase data exist you dont have to check + assignment like line_exists( ).

      If you compare only line_exists( ) with read table, I think read table should have TRANSPORT NO FIELD, to be fair :).

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      Thanks for your comment. I partly agree, but still find the additional VALUE clumsy. I have no idea why SAP didn't just do a SY-SUBRC like with field symbols, particularly as there is such a performance penalty.

      Re. TRANSPORTING NO FIELDS, my initial test for my real work was where I needed the value, I added line_exists as an afterthought for the blog. But it's a good point for a complete comparison, so I've added it and updated the results directly in the blog above, thanks.

      Author's profile photo Jacques Nomssi Nzali
      Jacques Nomssi Nzali

      Hello Mike,

      could you please measure the run-time behavoir of this variant?

      ASSIGN itab[ id = find_id ] TO FIELD-SYMBOLS(<row>).

      No exceptions, check sy-subrc, it should be good enough.

      JNN

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      Thanks for the input, good point! I remember reading this a while ago, and thinking it's bizarre. But we are where we are, and ASSIGN turns out to be a great result! My specific use case was a returning parameter so I didn't even consider an FS.

      Author's profile photo Konstantin Polushkin
      Konstantin Polushkin

      Hello. This construction not work.

      Author's profile photo Nahuel E Romanello
      Nahuel E Romanello

      Hi!, please be more explained as we cannot know what is in your mind!. Thanks!.

      Author's profile photo Omeralp Pirim
      Omeralp Pirim

      Hi Mike,

      Good comparison for internal table expressions. Additionally we can’t use binary search option in 7.40 table expression.

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      Good point, but actually we don’t need to, as key fields of sorted or hashed tables provide a better alternative to binary search. But it's certainly worth mentioning as not everyone knows all the possibilities.

      See https://help.sap.com/http.svc/rc/abapdocu_752_index_htm/7.52/en-US/index.htm?file=abentable_expressions.htm

      Personally I think BINARY SEARCH is unreliable and should be avoided if possible. I don't like the idea of relying on prior code sorting the table and that nobody has APPENDED a row since then. If you need to search you need a key, so there's no reason not to use a SORTED or HASHED table where you cannot go wrong.

      Author's profile photo Sai Kowluri
      Sai Kowluri

      Isn't that SORT an overhead if you have large internal table. The value added by binary search isn't fantastic after that time consuming sort. Like you said, sorted and hashed tables are the best bets.

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      A SORT can be beneficial in certain scenarios. Consider adding values one by one in random order – every INSERT into a sorted table adds performance cost, so here a one-off SORT at the end is likely to be more efficient.

      BUT… the reason I say BINARY SEARCH is unreliable is not for technical reasons but has to do with the developer(s). Example:

      DATA(itab) = VALUE #( ( 8 ) ( 5 ) ( 3 ) ).
      SORT itab.
      APPEND 2 TO itab.
      READ TABLE itab WITH KEY 2 INTO data(n) BINARY SEARCH.  "n = 0
      READ TABLE itab WITH KEY 2 INTO n.                      "n = 2

      This example is obviously easy to work out, but complex programs with several development iterations are a different story - especially when the filling, sorting and reading are spaced out in the code.

      This mistake is not possible with sorted or hashed tables. For that reason alone I avoid BINARY SEARCH unless there is a major benefit.

      Author's profile photo Matthew Billingham
      Matthew Billingham

      Since I started using SORTED and HASHED tables 18 years ago, I've never had a need to use BINARY SEARCH on tables I've defined myself.

      It baffles me that people are using these 7.40+ features... yet continue to use BINARY SEARCH instead ot SORTED/HASHED.

      Author's profile photo Uwe Fetzer
      Uwe Fetzer

      Hi Mike,

      can you add this to your measurements?

      DATA(flight) = REF #( flights[ carrid = 'LH' ] OPTIONAL ).

       

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      Good point, updated in the blog. In my experience references are usually slightly slower than field symbols, and this is no exception.

      Author's profile photo Jelena Perfiljeva
      Jelena Perfiljeva

      Nice blog! I always appreciate when someone does all the work for me and comes up with such simple, practical recommendations. 🙂 Well done.

       

      Author's profile photo Sai Kowluri
      Sai Kowluri

      Great Blog Mike, I have started following this blog. Hope you keep updating it with all your findings 🙂

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      Hah, it only started with the intention to compare two table expressions. Then I added another two and then the good suggestions came in 🙂

      Author's profile photo Custodio de Oliveira
      Custodio de Oliveira

      Great post, thanks for sharing.

      Stating the obvious here (which actually was already said), the big difference there is the EXCEPTION, not the different ways of read. Especially for the first 4 cases. I reckon it would probably be a good idea to run the tests a few times and compare the averages (not sure if you did this already).

      Also this:

       IF line_exists( itab[ id = find_id ] ).
          row = itab[ id = find_id ].
      ENDIF.

      should never be used, as per official ABAP programming guidelines.

       

      Cheers.

      Author's profile photo André Brito
      André Brito

      Hi Mike,

      I agree with Custodio and I know you enough to say that I shouldn't ask but ... . is this the average of  multiple execution of the same tests?

      Anyway, I'm a bit surprised as I expected the results between READ TABLE and VALUE with OPTIONAL to be more leveled (even though is only 300 microseconds difference .. 0.0003 seconds!). I guess I'm being picky. 🙂

      As always, well done and thanks for the info. Cheers!

      Author's profile photo André Brito
      André Brito

      After better reading of the article I noticed a small big detail, it says you "tested a million failed lookups".
      So... after reading (or failed reading) a million times the difference between ( READ TABLE vs VALUE with OPTIONAL ) is only 0.0003 seconds?? Really? Wow!

      And as far as I see 0.0003 seconds difference is neglectable. I will still go with the VALUE statement.

       

      P.S.: did you really do a million lookups? 🙂

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      Yup, a million. Personally I see 0.3s/million an acceptable margin for code clarity and consistency when the data volume is in the thousands or the DB time far exceeds that. The 6 seconds however took me by surprise and is something to keep in mind.

      I uploaded the code if you or anyone wants to  play with it further - feel free to update back.

      Github: abap_itab_perf

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      Hi Andre,

      Of course it goes without saying that duplicate searches are to be avoided, I thought that was too obvious to require explanation  🙂

      It's 300,000 microseconds, or 0.3 seconds. So the difference compared to an exception becomes over 6 seconds when running through a million records. (Yes, that silly British-American comma that much of the world uses as a thousands separator. )

       

       

      Author's profile photo Nahuel E Romanello
      Nahuel E Romanello

      You can use line_index( idem ) and then read the internal table by the index if it's not 0. 😀

      Author's profile photo Kumud Singh
      Kumud Singh

      It is interesting to see that the assumption of  'ABAP 7.4'  elements being better than traditional elements fails sometimes. Thanks for sharing. Developers need to make a choice between readability of the code and performance at many occasions.

       

       

      Regards,

      Kumud

       

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      Agree, we (myself included) automatically assume new=faster. But on the other hand, I am not aware of any claims from SAP that the 'new ABAP' statements are faster, it's almost always been about function, readability, new possibilities, etc. Faster to write perhaps 🙂

      Author's profile photo Tomas Buryanek
      Tomas Buryanek

      Thank you for the blog and testing.

      So I am staying with good old READ TABLE and line_exists( ) ? EDIT: read bellow comments, thanks Uwe and Mike.

      Author's profile photo Uwe Fetzer
      Uwe Fetzer

      No, this should definitely not the conclusion of this blog post.

      Author's profile photo Tomas Buryanek
      Tomas Buryanek

      Sorry, I do not want to give misleading "conclusions".
      Should I add "checking existence within an internal table with best performance" then?

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      The conclusion, as it often is, is "it depends".

      There are two things to consider in performance: absolute and relative performance.

      Absolute is how much are you processing. So how much time does something consume over the workday and how much is it going to affect the user. Remember this was for a million executions, so for a dialog user querying 1000 records, the impact of 0.006 vs. 0.009 seconds is nil.

      Likewise for relative performance: If a job does a bunch of DB stuff and runs for half an hour then even 6 seconds is all but meaningless.

      So in most scenarios the difference between the options (apart from exceptions, which was the point of the blog) is a perfectly acceptable tradeoff for consistent and readable code.

      Author's profile photo Ankit Maskara
      Ankit Maskara

      I had a small question, shouldn't try....catch be used where ever we use the new table expressions

       

      Eg:

      Try.
      
      line_exists( itab[ id = find_id ]
      
      catch cx_sy_itab_line_not_found.
      
      " Exception Handling
      
      endtry.
      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      A good question, but no. The function, and several of the variants detailed above, are designed with exactly the purpose to handle the exception.

      If you follow the very first link of the blog you'll see the documentation states:

      If the specified row is not found, a handleable expression of the class CX_SY_ITAB_LINE_NOT_FOUNDis raised in all operand positions, except when

      • a table expression is used in the statement ASSIGN, where sy-subrcis set to the value 4,
      • when used in the predicate function line_exists, where the truth value "false" is returned,
      • when used in the table function line_index, where the value 0is returned.
      Author's profile photo Ankit Maskara
      Ankit Maskara

      Didn't read that in detail…Thanks Mike...

      Author's profile photo Ravi alter
      Ravi alter

      Very nice article Mike .

      I am also working on performance and facing challenge using multiple line_exists for each value .With large data every time checking existence will not help much compare with sy-subrc .

      I am working to change my program hope so it will help .

      Author's profile photo Bartho. Maciag
      Bartho. Maciag

      Very nice post, thank you for that. I was asking myself exactly this question and was thinking about how to test it, then I found your post 🙂

      Author's profile photo Sri Harsha Chodisetti
      Sri Harsha Chodisetti

      It is a very useful post. I am wondering if we can use the "line_exists" for checking multiple conditions of the same value. in this case how can we check for the condition,

      if line_exixts ( itab[ id = find_id1 or id = find_id2 ] )

      i know its syntactically incorrect but I am trying to understand how we can achieve this.

      Author's profile photo Mike Pokraka
      Mike Pokraka
      Blog Post Author

      Good question, line_exists and READ TABLE both can only compare individual fields against a single value.

      LOOP AT has a more flexible WHERE clause and can be used for this purpose, but be sure to include an EXIT if you only want a single result.

      "Existence check
      LOOP AT itab WHERE value <= 0 OR date > sy-datum TRANSPORTING NO FIELDS. 
        EXIT.
      ENDLOOP.
      IF sy-subrc = 0.
        ...
      
      
      "Find any one matching row (READ TABLE equivalent)
      LOOP AT itab INTO data(row) WHERE id IN id_range.
        EXIT.
      ENDLOOP.