A HashSet in ABAP (compared to Java)
In Java, you manage collections of data with... (surprise) Collections. They are implemented as classes in the language's standard library and have a relatively clean interface. It might be tempting to implement such a collection class in ABAP in order to get an interface similar to Java's. But this is not the ABAP way to do it. The documentation says:
In Java, all superior data objects (especially container variables such as strings) are modeled using classes. By contrast, ABAP provides very powerful, predefined types. Besides the predefined ABAP strings, internal tables are also provided that are used for structured data storage. These tables represent the most powerful ABAP type. Therefore, it is generally not beneficial to implement own container types using ABAP classes.
In this blog post, I want to compare the usage of a set in both ABAP and Java. So let’s start with the Java version:
import java.util.HashSet;

// Construction of a set
HashSet<Integer> hashset = new HashSet<>();
// Adding '1', success should be true
boolean success = hashset.add(1);
// Adding '2', success should be true
success = hashset.add(2);
// Adding '2', success should be false
success = hashset.add(2);
// Checking existence, exists should be true
boolean exists = hashset.contains(2);
// Checking existence, exists should be false
exists = hashset.contains(3);
// Write number of elements: should be 2
System.out.println(hashset.size());
// Iterate over elements
for (int i : hashset) {
    System.out.println(i);
}
// Remove an element
hashset.remove(2);
// Write number of elements: should be 1
System.out.println(hashset.size());
Now, let us compare with the ABAP version:
" Construction of a set
DATA: hashset TYPE HASHED TABLE OF int4 WITH UNIQUE KEY table_line,
      success TYPE abap_bool,
      exists  TYPE abap_bool.
" Adding '1', success should be true
INSERT 1 INTO TABLE hashset.
success = xsdbool( sy-subrc = 0 ).
" Adding '2', success should be true
INSERT 2 INTO TABLE hashset.
success = xsdbool( sy-subrc = 0 ).
" Adding '2', success should be false
INSERT 2 INTO TABLE hashset.
success = xsdbool( sy-subrc = 0 ).
" Checking existence, exists should be true
exists = xsdbool( line_exists( hashset[ table_line = 2 ] ) ).
" Checking existence, exists should be false
exists = xsdbool( line_exists( hashset[ table_line = 3 ] ) ).
" Write number of elements: should be 2
WRITE: / 'Number of elements:', lines( hashset ).
" Iterate over elements
LOOP AT hashset ASSIGNING FIELD-SYMBOL(<integer>).
  WRITE: / <integer>.
ENDLOOP.
" Remove an element
DELETE TABLE hashset WITH TABLE KEY primary_key COMPONENTS table_line = 2.
" Write number of elements: should be 1
WRITE: / 'Number of elements:', lines( hashset ).
This is surprisingly similar. The main difference is that the operations in Java are calls to the methods of the set, while in ABAP you manipulate the set with the keywords of the language.
The Java interface is much cleaner: you can see how to manipulate a set in one place, the class documentation. In ABAP, the information is scattered across the language documentation. So why not implement a set class in ABAP? (Side question: How would you deal with different types? Java uses generics as its solution. Feel free to post your ideas in the comments.) In the beginning, I said this is not “the ABAP way”. Of course, this is not a real reason. The first one I can come up with is performance. Second, if you understand the operations described above, there is no real reason to encapsulate them behind a class. (Do you see further reasons? Again, feel free to post in the comments!)
One last remark: If you find yourself writing code like
SORT itab.
DELETE ADJACENT DUPLICATES FROM itab.
chances are that you really want a set and should use one as described above. The two lines are then no longer needed.
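To stay with the comparison, here is a minimal Java sketch of the same point. It is only an illustration: the class and method names are made up for this example. The first method mimics the sort-then-remove-adjacent-duplicates pattern on a list, the second one simply collects the values in a set. A `TreeSet` is used so that both results come out in the same order; a `HashSet` would also remove the duplicates but gives no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper class, for illustration only.
class DedupSketch {

    // List-based approach: sort first, then keep a value only if it differs
    // from the last value kept (equal values are neighbours after sorting).
    static List<Integer> sortAndDropAdjacentDuplicates(List<Integer> input) {
        List<Integer> sorted = new ArrayList<>(input);
        Collections.sort(sorted);
        List<Integer> result = new ArrayList<>();
        for (Integer value : sorted) {
            if (result.isEmpty() || !result.get(result.size() - 1).equals(value)) {
                result.add(value);
            }
        }
        return result;
    }

    // Set-based approach: duplicates are rejected on insertion,
    // so no separate clean-up step is needed.
    static List<Integer> collectInSet(List<Integer> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, 1, 2, 3, 1);
        System.out.println(sortAndDropAdjacentDuplicates(input)); // [1, 2, 3]
        System.out.println(collectInSet(input));                  // [1, 2, 3]
    }
}
```

Both methods return the same list; the set-based one just never lets the duplicates in.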
If you look at my recently published blog https://blogs.sap.com/2017/08/17/performance-maintenance-and-classes/, you'll see that you have explained in detail what I said there in a single line concerning ABAP OO and internal tables.
You've also explained very neatly why my point is the case. All very timely. I've edited my blog to link to yours.
For your final remark - I agree. That's absolutely an indicator you're using the wrong kind of internal table.
Thanks for sharing in your blog post!
Hello old colleague,
very interesting post. I agree, Java's solution is much cleaner, and it would be great to have standardized container/collection classes. As for implementing your own class to hide away some of the ugliness, I don't think it's a good trade-off, especially considering performance.
The typing issues can surely be solved with "dynamic" programming, but I wouldn't touch that unless really necessary.
Another point that could be made is that with constructor expressions and table expressions you can, most of the time, avoid the bulky and verbose syntax constructs ABAP is known for, so maybe the gap between the ABAP solution and a clean version isn't that big.
Best regards
For me the main reason not to build your own inhouse collection framework is the missing generics in ABAP, as you pointed out. The gains of having encapsulated methods and a nice inheritance hierarchy are outweighed by the huge loss of type safety. If you have to “know” and remember the concrete type managed by a collection and cast every time you access a member it’s not safe enough to use. I have no idea how Java did it before they introduced generics.
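To make the type-safety point concrete, here is a small Java sketch (class and method names are invented for illustration). The first method works on a set that only knows `Object`, roughly what a collection class based on `TYPE REF TO object` would offer, so every access needs a cast that can fail at runtime. The second method uses generics, so the compiler rejects a wrong element type before the program ever runs.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical example class, for illustration only.
class UntypedVersusGeneric {

    // Untyped style: the caller has to "know" the element type and cast.
    static int sumUntyped(Set<Object> values) {
        int sum = 0;
        for (Object value : values) {
            sum += (Integer) value; // ClassCastException if a non-Integer slipped in
        }
        return sum;
    }

    // Generic style: the element type is part of the signature, no cast needed.
    static int sumTyped(Set<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        Set<Object> untyped = new HashSet<>();
        untyped.add(1);
        untyped.add("two"); // compiles; the mistake only shows up at runtime
        try {
            sumUntyped(untyped);
        } catch (ClassCastException e) {
            System.out.println("Runtime failure: " + e.getClass().getSimpleName());
        }

        Set<Integer> typed = new HashSet<>();
        typed.add(1);
        typed.add(2);
        // typed.add("two"); // would be rejected by the compiler
        System.out.println(sumTyped(typed)); // 3
    }
}
```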
Also, of course, you cannot do the fancy lambda expression collection stuff (map etc.), as there are neither lambda expressions nor even anonymous classes.
With 7.40 a lot of the need for that is gone because of the new nice expressions. For value types at least. Unfortunately many things don’t easily work with objects. Like this Java 8 stream based filter operation on a collection
could be this in ABAP now:
But you just get a compiler error
Which to me is surprising because in some statements object attributes are accessible. Like this works:
Even code completion sometimes suggests the class components, but they don’t work. And I am not even expecting method calls to work within these statements (SORT cannot do “BY table_line->get_name( )” either).
The closest thing to generics in ABAP I saw was using macros, which is not usable at all in my opinion but it surprisingly works and is syntactically correct (of course you can only "generate" local classes):
The "SAP way" would be to generate these concrete classes at design time to get around the generics issue, using the class composer API perhaps.
By the way: I think for WebDynpro there are collection and iterators classes available in SAP_BASIS. Of course as pointed out above they just use TYPE REF TO object and you need to cast every time. For reference:
CL_OBJECT_COLLECTION_ITERATOR, CL_OBJECT_COLLECTION, CL_OBJECT_MAP
A note about your final remark: I think dynamic grouping (the GROUP BY addition to LOOP AT) will help better in most cases.
JNN
Hi Martin,
one little thing that has nothing to do with the general theme of your blog.
Your second code example shows nicely why I always argue against comments on code in favour of explicit naming of variables and methods. Comments tend to rot as they are not updated together with the code. This already happens in such small example programs.
Besides that, there are some areas in SAP where implementations of collections and iterators are available. For example, the BOL in SAP CRM contains an implementation of the iterator pattern. However, I agree with Mat that in most cases you should stick to the ABAP language features.
Christian
That's why CRM is so d*mned fast 😉
(duck'n'cover...)
I see that you already start to appreciate SAP IS-U
Thanks for pointing that out, I updated the post.
Hi Martin,
Thanks a lot for your post. If you need a hash set or list somewhere locally within your class, I agree with your argument that it doesn't help much to encapsulate it within a class. Also, if you want to have a specific global hash set type, you can define a table type in the dictionary.
However, as soon as you design APIs where, e.g., interfaces contain hash sets or lists, it often makes sense to define your own data structures encapsulating those ABAP types, because this is the only way to restrict access to the built-in types. You can't just derive from a hashed table.
For this reason, I still think it is a pity that there are no generics in ABAP that would allow defining general data structures; or, even better, an autoboxing/unboxing approach such that I'm always free to choose between a class and a built-in type.
Best,
Johannes
As Fabian discovered, there is a problem defining the key of hash tables containing objects. How can we define such tables? Is that possible at all? I'm struggling for the same reason when using "object hash tables". One solution could be to use a helper structure containing the key and the object reference. But of course, that is not an ideal way.
Just realised that the following code generates no syntax error:
But what would be the key in that case? Is there a trick to have some method or attribute in the class/interface that gets called to retrieve the key?
The key of hash_table is the object reference, which is an 8 byte pointer to the actual location of the object. If you know one object reference, then you may use such an internal table to link each object to additional data.
If you want to index objects of one class or interface by one or several of their attributes, you may maintain an internal table each time such objects are instantiated. If you're not concerned about performance, you may also skip indexing these objects and just use a standard internal table containing all object references and use:
Thank you, Sandra, for clarifying the 8 byte pointer key.
In our case, the key should be something useful, not the object pointer. Like in your example, we would like to use an object attribute to access the object in the table. BUT performance IS a requirement. That's why we are looking for a solution with a hash table instead of a standard table.
So what would be the elegant way to access objects in hash tables by object attributes?
It is as I said (though not clearly). The only solution I know is to build yourself an internal table with the attribute values you're interested in, either by the class itself (in the constructor) or externally (each time you instantiate the class).
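Since the post compares ABAP with Java, here is the same idea sketched in Java terms; it is an illustration of the pattern, not ABAP code, and the class `Customer`, its `name` attribute, and the method names are all hypothetical. The class keeps an index from the attribute value to the instance and updates it whenever an object is created, so lookups by attribute are hash-based instead of a scan over all references. The ABAP counterpart would be the helper structure mentioned above: a hashed table whose lines hold the attribute value as unique key plus the object reference. The sketch assumes the attribute is unique and never changes after creation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class; the name and its "name" attribute are illustrative only.
class Customer {

    // Index from attribute value to instance, maintained on every creation.
    private static final Map<String, Customer> BY_NAME = new HashMap<>();

    private final String name;

    private Customer(String name) {
        this.name = name;
    }

    // Factory method: creates the object and registers it under its key attribute.
    static Customer create(String name) {
        if (BY_NAME.containsKey(name)) {
            throw new IllegalArgumentException("Duplicate key: " + name);
        }
        Customer customer = new Customer(name);
        BY_NAME.put(name, customer);
        return customer;
    }

    // Hash-based lookup by attribute; returns null if nothing is registered.
    static Customer findByName(String name) {
        return BY_NAME.get(name);
    }

    String getName() {
        return name;
    }

    public static void main(String[] args) {
        Customer.create("Alice");
        Customer.create("Bob");
        System.out.println(findByName("Bob").getName()); // Bob
        System.out.println(findByName("Carol"));         // null
    }
}
```

Registration happens in a factory method here rather than in the constructor; either variant matches the "by the class itself" option, while an external registry would match the second one.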