Martin Böschen

A HashSet in ABAP (compared to Java)

In Java, you manage collections of data with ... (surprise) Collections. They are implemented as classes in the language's library and have a relatively clean interface. It might be tempting to implement such a collection class in ABAP in order to have an interface similar to Java's. But this is not the ABAP way to do it. The documentation says:

In Java, all superior data objects (especially container variables such as strings) are modeled using classes. By contrast, ABAP provides very powerful, predefined types. Besides the predefined ABAP strings, internal tables are also provided that are used for structured data storage. These tables represent the most powerful ABAP type. Therefore, it is generally not beneficial to implement own container types using ABAP classes.

In this blog post, I want to compare the usage of a set in both ABAP and Java. So let's start with the Java version:

// Construction of a set (requires: import java.util.HashSet;)
HashSet<Integer> hashset = new HashSet<>();

// Adding '1', success should be true
boolean success = hashset.add(1);

// Adding '2', success should be true
success = hashset.add(2);

// Adding '2', success should be false
success = hashset.add(2);

// Checking existence, exists should be true
boolean exists = hashset.contains(2);

// Checking existence, exists should be false
exists = hashset.contains(3);

// Write number of elements: should be 2
System.out.println("Number of elements: " + hashset.size());

// Iterate over elements
for (int i : hashset) {
    System.out.println(i);
}

// Remove an element
hashset.remove(2);

// Write number of elements: should be 1
System.out.println("Number of elements: " + hashset.size());

Now, let us compare with the ABAP version:

" Construction of a set
DATA hashset TYPE HASHED TABLE OF i WITH UNIQUE KEY table_line.
DATA success TYPE abap_bool.
DATA exists TYPE abap_bool.

" Adding '1', success should be true
INSERT 1 INTO TABLE hashset.
success = xsdbool( sy-subrc = 0 ).

" Adding '2', success should be true
INSERT 2 INTO TABLE hashset.
success = xsdbool( sy-subrc = 0 ).

" Adding '2', success should be false
INSERT 2 INTO TABLE hashset.
success = xsdbool( sy-subrc = 0 ).

" Checking existence, exists should be true
exists = xsdbool( line_exists( hashset[ table_line = 2 ] ) ).

" Checking existence, exists should be false
exists = xsdbool( line_exists( hashset[ table_line = 3 ] ) ).

" Write number of elements: should be 2
WRITE: / 'Number of elements:', lines( hashset ).

" Iterate over elements
LOOP AT hashset ASSIGNING FIELD-SYMBOL(<integer>).
  WRITE: / <integer>.
ENDLOOP.

" Remove an element
DELETE TABLE hashset WITH TABLE KEY primary_key COMPONENTS table_line = 2.

" Write number of elements: should be 1
WRITE: / 'Number of elements:', lines( hashset ).

This is surprisingly similar. The main difference is that the operations in Java are calls to the methods of the set, while in ABAP you manipulate the set with the keywords of the language.

The Java interface is much cleaner: you can see how to manipulate a set in one place in the class documentation. The information in ABAP is scattered around the language documentation. So why not implement a set class in ABAP? (Side question: How would you deal with different types? Java uses generics as the solution. Feel free to post your ideas in the comments.) In the beginning, I said this is not “the ABAP way”. Of course this is not a real reason. The first one I can come up with is performance. Second, if you understand the manipulation described above, there is no real reason to encapsulate it behind a class. (Do you see further reasons? Again, feel free to post in the comments!)

One last remark: If you find yourself writing code like

SORT itab.
DELETE ADJACENT DUPLICATES FROM itab.

chances are that you really want a set and should use it as described above. The two lines are then no longer needed.
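A rough Java analogue of this remark, as a sketch only: the class and method names (`DedupSketch`, `sortThenDedup`, `viaSet`) are invented for illustration. Both methods return the same distinct values, but the set-based one never stores a duplicate in the first place. A `TreeSet` is used here so that the result is also sorted; a `HashSet` would do if the order does not matter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

public class DedupSketch {
    // Counterpart of SORT + DELETE ADJACENT DUPLICATES:
    // sort a copy, then skip every value equal to its predecessor.
    static List<Integer> sortThenDedup(List<Integer> input) {
        List<Integer> sorted = new ArrayList<>(input);
        Collections.sort(sorted);
        List<Integer> result = new ArrayList<>();
        for (Integer value : sorted) {
            if (result.isEmpty() || !result.get(result.size() - 1).equals(value)) {
                result.add(value);
            }
        }
        return result;
    }

    // Counterpart of using a set from the start:
    // duplicates are rejected when they are inserted.
    static List<Integer> viaSet(List<Integer> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, 1, 2, 3, 1);
        System.out.println(sortThenDedup(input)); // [1, 2, 3]
        System.out.println(viaSet(input));        // [1, 2, 3]
    }
}
```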

      Matthew Billingham

      If you have seen my recently published blog: you have explained in detail what I said there in a single line concerning ABAP OO and internal tables.

      Make use of the richness of ABAP – e.g. consider internal tables as objects, with “methods” like READ, LOOP AT etc. Don’t create a class that emulates Java or C++ iterators.

      You've also explained very neatly why my point is the case. All very timely. I've edited my blog to link to yours.

      For your final remark - I agree. That's absolutely an indicator you're using the wrong kind of internal table.



      Martin Böschen
      Blog Post Author

      Thanks for sharing in your blog post!

      Daniel Gent

      Hello old colleague,


      very interesting post. I agree, Java's solution is much cleaner, and it would be great to have standardized container/collection classes. Regarding whether to implement your own class to hide away some of the ugliness, I don't think it's a good trade-off, especially considering performance.

      The typing issues can surely be solved with "dynamic" programming, but I wouldn't touch that unless really necessary.

      Another point which could be made is that with constructor expressions and table expressions you can most of the time avoid the bulky and verbose syntax constructs which ABAP is known for, so maybe the gap between the ABAP solution and a clean version isn't that big.


      Best regards

      Fabian Lupa

      For me the main reason not to build your own inhouse collection framework is the missing generics in ABAP, as you pointed out. The gains of having encapsulated methods and a nice inheritance hierarchy are outweighed by the huge loss of type safety. If you have to “know” and remember the concrete type managed by a collection and cast every time you access a member it’s not safe enough to use. I have no idea how Java did it before they introduced generics.
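As a rough illustration of that loss of type safety (a sketch only; `RawCollectionSketch` and its methods are invented names), this is what an untyped collection looks like in Java. It is comparable in spirit to a container that only stores general object references: the compiler accepts any element, every read needs a cast, and a wrong assumption about the element type only shows up at runtime as a `ClassCastException`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RawCollectionSketch {
    // Untyped style: the list holds plain Objects, so every read needs a cast.
    @SuppressWarnings("rawtypes")
    static String firstNameUntyped(List names) {
        return (String) names.get(0);
    }

    // Generic style: the element type is checked by the compiler, no cast needed.
    static String firstNameTyped(List<String> names) {
        return names.get(0);
    }

    // Shows that the untyped variant only fails when the element is read.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean castFailsAtRuntime() {
        List raw = new ArrayList();
        raw.add(42); // accepted by the compiler although it is not a String
        try {
            firstNameUntyped(raw);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNameTyped(Arrays.asList("Bob"))); // Bob
        System.out.println(castFailsAtRuntime());                 // true
    }
}
```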

      Also, of course, you cannot do the fancy lambda expression collection stuff (map etc.) as there are neither lambda expressions nor even anonymous classes.

      With 7.40 a lot of the need for that is gone because of the new nice expressions. For value types at least. Unfortunately many things don’t easily work with objects. Like this Java 8 stream based filter operation on a collection

      List<Person> people = new ArrayList<Person>() {{
        add(new Person("Bob"));
        add(new Person("Bill"));
      }};
      List<Person> bills = people.stream()
          .filter(p -> p.getName().equals("Bill"))
          .collect(Collectors.toList());

      could be this in ABAP now:

      TYPES: lty_people_tab TYPE HASHED TABLE OF REF TO lcl_person WITH UNIQUE KEY table_line.
      DATA(lt_people) = VALUE lty_people_tab(
        ( NEW lcl_person( 'Bob' ) )
        ( NEW lcl_person( 'Bill' ) )
      ).
      DATA(lt_bills) = FILTER #( lt_people WHERE table_line->mv_name = 'Bill' ).

      But you just get a compiler error

      Which to me is surprising because in some statements object attributes are accessible. Like this works:

      SORT lt_people BY table_line->mv_name ASCENDING.

      Even code completion sometimes suggests the class components, but they don’t work. And I am not even expecting method calls to work within these statements (SORT cannot do “BY table_line->get_name( )” either).


      The closest thing to generics in ABAP I saw was using macros, which is not usable at all in my opinion but it surprisingly works and is syntactically correct (of course you can only "generate" local classes):

      DEFINE make_collection.
        CLASS lcl_&1 DEFINITION.
          PUBLIC SECTION.
            METHODS:
              add IMPORTING io_item TYPE REF TO &2,
              remove IMPORTING iv_index TYPE i,
              get IMPORTING iv_index TYPE i
                   RETURNING VALUE(ro_item) TYPE REF TO &2
                   RAISING cx_sy_itab_line_not_found.
          PRIVATE SECTION.
            DATA:
              mt_list TYPE STANDARD TABLE OF REF TO &2.
        ENDCLASS.
        CLASS lcl_&1 IMPLEMENTATION.
          METHOD add.
            APPEND io_item TO mt_list.
          ENDMETHOD.
          METHOD remove.
            DELETE mt_list INDEX iv_index.
          ENDMETHOD.
          METHOD get.
            TRY.
                ro_item = mt_list[ iv_index ].
              CATCH cx_sy_itab_line_not_found INTO DATA(lx_ex).
                RAISE EXCEPTION lx_ex.
            ENDTRY.
          ENDMETHOD.
        ENDCLASS.
      END-OF-DEFINITION.

      make_collection person_list lcl_person.

      DATA(lo_collection) = NEW lcl_person_list( ).
      lo_collection->add( NEW #( 'Bob' ) ).
      lo_collection->add( NEW #( 'Bill' ) ).
      DATA(lo_bob) = lo_collection->get( 1 ). " <- returns reference to LCL_PERSON
      lo_collection->remove( 2 ).
      The "SAP way" would be to generate these concrete classes at design time to get around the generics issue, using the class composer API perhaps.

      By the way: I think for WebDynpro there are collection and iterator classes available in SAP_BASIS. Of course, as pointed out above, they just use TYPE REF TO object and you need to cast every time. For reference:

      Jacques Nomssi Nzali

      A note about your final remark: I think dynamic grouping (the GROUP BY addition to LOOP AT) will help better in most cases.


      Christian Drumm

      Hi Martin,

      one little thing that has nothing to do with the general theme of your blog.

      Your second code example shows nicely why I always argue against comments on code in favour of explicit naming of variables and methods. Comments tend to rot as they are not updated together with the code. This already happens in such small example programs.

      " Adding '1', success should be false
      INSERT 2 INTO TABLE hashset.
      success = xsdbool( sy-subrc = 0 ).

      Besides that, there are some areas in SAP where implementations of collections and iterators are available. For example, the BOL in SAP CRM contains an implementation of the iterator pattern. However, I agree with Mat that in most cases you should stick to the ABAP language features.


      Uwe Fetzer

      That's why CRM is so d*mned fast 😉


      Christian Drumm

      I see that you already start to appreciate SAP IS-U ?


      Martin Böschen
      Blog Post Author

      Thanks for pointing that out, I updated the post.

      Former Member

      Hi Martin,

      Thanks a lot for your post. For the case that you need a hash set or list somewhere locally within your class, I agree with your argument that it doesn't help a lot to encapsulate it within a class. Also, if you want to have a specific global hash set type, you can define a table type in the dictionary.

      However, as soon as you design APIs where, e.g., interfaces contain hash sets or lists, it often makes sense to define your own data structures encapsulating those ABAP types, because this is the only way to restrict access to the built-in types. You can't just derive from a hashed table.

      For this reason, I still think it is a pity that there are no generics in ABAP that would allow defining general data structures; or, even better, an autoboxing/unboxing approach such that I'm always free to choose between class and built-in type.



      Matthias Bollmeier

      As Fabian discovered, there is a problem with defining the key of hash tables containing objects. How can we define such tables? Is that possible at all? I'm struggling for the same reason using "object hash tables". One solution could be to use a helper structure containing the key and the object reference. But of course, that is not an ideal way.

      Matthias Bollmeier

      Just realised that the following code generates no syntax error:

      DATA hash_table type hashed table of ref to <<ZIF_ANOBJECT>> with unique key table_line.

      But what would be the key in that case? Is there a trick to have some method or attribute in the class/interface which gets called to retrieve the key?

      Sandra Rossi

      The key of hash_table is the object reference, which is an 8 byte pointer to the actual location of the object. If you know one object reference, then you may use such an internal table to link each object to additional data.
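Java behaves in a comparable way when a class does not override `equals` and `hashCode`: a `HashSet` then distinguishes elements by object identity, not by their attribute values. The following is a minimal sketch with an invented `Person` class, not code from the post.

```java
import java.util.HashSet;
import java.util.Set;

public class IdentityKeySketch {
    // Hypothetical class; it deliberately does not override equals/hashCode,
    // so a HashSet tells instances apart by object identity only.
    static class Person {
        final String name;

        Person(String name) {
            this.name = name;
        }
    }

    static int countDistinct() {
        Set<Person> people = new HashSet<>();
        Person bill = new Person("Bill");
        people.add(bill);
        people.add(bill);               // same reference: rejected as duplicate
        people.add(new Person("Bill")); // same name, different object: accepted
        return people.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct()); // 2
    }
}
```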

      If you want to index objects of one class or interface by one or several of their attributes, you may maintain an internal table that is updated each time such an object is instantiated. If performance does not matter to you, you may also skip the index and just use a standard internal table containing all object references and use:

      LOOP AT itab ... WHERE table_line IS BOUND AND table_line->attribute = 'VALUE'.
      Matthias Bollmeier

      Thank you, Sandra, for clarifying the 8 byte pointer key.

      In our case, the key should be something useful, not the object pointer. Like in your example, we would like to use an object attribute to access the object in the table. BUT performance IS a requirement. That's why we are looking for a solution with a hash table instead of a standard table.

      So what would be the elegant way to access objects in hash tables by object attributes?

      Sandra Rossi

      It is as I said (not clearly). The only solution I know is to build an internal table yourself with the attribute values you're interested in, either in the class itself (in the constructor) or externally (each time you instantiate the class).
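Expressed in the Java terms of the post, this approach amounts to keeping a hash-based index from the attribute value to the object reference and filling it whenever an object is created. The sketch below is only an illustration under assumptions: `Person`, `create` and `findByName` are invented names, the names are assumed to be unique, and a factory method is used instead of the constructor so that no half-constructed object ends up in the index.

```java
import java.util.HashMap;
import java.util.Map;

public class AttributeIndexSketch {
    static class Person {
        // Index maintained by the class itself: attribute value -> object reference.
        private static final Map<String, Person> BY_NAME = new HashMap<>();

        final String name;

        private Person(String name) {
            this.name = name;
        }

        // Every instance is created here, so the index is always up to date.
        // Assumption: names are unique; a second "Bill" would replace the first entry.
        static Person create(String name) {
            Person person = new Person(name);
            BY_NAME.put(name, person);
            return person;
        }

        // Hash lookup by attribute value; returns null if nothing is registered.
        static Person findByName(String name) {
            return BY_NAME.get(name);
        }
    }

    public static void main(String[] args) {
        Person bill = Person.create("Bill");
        System.out.println(Person.findByName("Bill") == bill); // true
        System.out.println(Person.findByName("Bob"));          // null
    }
}
```

In ABAP, the corresponding helper would be a hashed table whose lines hold the attribute as the unique key plus the object reference, as suggested earlier in this thread.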