ABAP Tips: The lifetime of a data object
Some time ago I played around with the concept of closures and wondered whether it could be done in ABAP, and whether it would even make sense there. My experiments did highlight some interesting details about memory management and data lifetimes in ABAP, which I’d like to share in this blog.
Closures are an advanced concept well outside the scope of ABAP, and you needn’t know anything about them for this blog. I’ve therefore decided to split the geek part that goes off the beaten track into a separate blog.
So let’s dive right in: One of the questions I wanted to get to the bottom of is when exactly a variable or object reference is destroyed and under what circumstances is it accessible? The online help explains how the Garbage Collector works, but what does this mean in practice?
A simplified explanation involves the two types of memory used by an ABAP program:
Stack memory is where runtime program modules live and is controlled by the program execution. When a program module such as method/procedure/function is called, it gets allocated memory on the stack and when it finishes, the memory is cleared from the stack. We also talk about the “call stack” when debugging, which is essentially the same thing. All local variables are part of the stack memory and thus disappear when execution finishes.
A simple example will demonstrate this:
REPORT zdata_object_lifetime_test.

CLASS lcl_main DEFINITION.
  PUBLIC SECTION.
    METHODS get_num_ref RETURNING VALUE(result) TYPE REF TO i.
ENDCLASS.

CLASS lcl_main IMPLEMENTATION.
  METHOD get_num_ref.
    DATA(i) = 4.
    GET REFERENCE OF i INTO result.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA(iref) = NEW lcl_main( )->get_num_ref( ).
  WRITE / iref->*.
As we can see, the variable no longer exists and the reference is therefore invalid.
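One way to guard against such a dangling reference at runtime is the IS BOUND predicate, which checks whether a reference variable can still be dereferenced. The following is a minimal sketch built on the example above; the report name is made up for illustration, and the behaviour for stack references is worth verifying on your own release.

```abap
REPORT zdata_object_lifetime_guard. "hypothetical report name

CLASS lcl_main DEFINITION.
  PUBLIC SECTION.
    METHODS get_num_ref RETURNING VALUE(result) TYPE REF TO i.
ENDCLASS.

CLASS lcl_main IMPLEMENTATION.
  METHOD get_num_ref.
    DATA(i) = 4.                      "local variable, lives on the stack
    GET REFERENCE OF i INTO result.   "reference into stack memory
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA(iref) = NEW lcl_main( )->get_num_ref( ).

  "IS BOUND is false for an initial reference and for one that
  "can no longer be dereferenced
  IF iref IS BOUND.
    WRITE / iref->*.
  ELSE.
    WRITE / 'Reference is no longer valid'.
  ENDIF.
```

The check does not make the local variable live any longer; it only avoids dereferencing a reference whose target has already left the stack.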
Local variables have their own quirks, which I’ve previously written about here and here, but in this blog I’d like to focus on heap objects.
Heap memory is where all data that is not local to a program module lives. This includes object attributes, global variables and object & data references. Since these have a lifetime beyond the creating module, they are not part of the stack and are managed centrally by garbage collection.
If we take our example program above and move the data declaration to the attributes section (a Ctrl-1 quick fix in Eclipse), it moves to the heap memory and is no longer part of the executing method code’s memory.
CLASS lcl_main DEFINITION.
  PUBLIC SECTION.
    METHODS get_num_ref RETURNING VALUE(result) TYPE REF TO i.
  PRIVATE SECTION.                    "<-- added
    DATA i TYPE i.                    "<-- added
ENDCLASS.

CLASS lcl_main IMPLEMENTATION.
  METHOD get_num_ref.
    i = 4.                            "<-- changed
    GET REFERENCE OF i INTO result.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA(iref) = NEW lcl_main( )->get_num_ref( ).
  WRITE / iref->*.
Note that in the program we do not even assign the object instance to a variable, but instead chain the method call straight onto the NEW constructor. Yet we will still get the correct result:
So far so good. I think many developers are already familiar with this setup.
Where it gets interesting, however, is when we use an object reference and explicitly clear it. The data still lives on because something is still referencing it. In other words heap object lifetimes are not limited by their containing or originating object.
Let’s demonstrate this by explicitly clearing the object whose attribute we’re referencing. Same code as above, just the last part changed to:
START-OF-SELECTION.
  DATA(o) = NEW lcl_main( ).
  DATA(iref) = o->get_num_ref( ).
  CLEAR o.
  WRITE / iref->*.
We can go one step further and force garbage collection:
START-OF-SELECTION.
  DATA(o) = NEW lcl_main( ).
  DATA(iref) = o->get_num_ref( ).
  CLEAR o.
  cl_abap_memory_utilities=>do_garbage_collection( ).
  WRITE / iref->*.
And still the referenced data element lives on without its parent object:
What is happening here is that the garbage collector knows that something is still referencing the element that was once the attribute of the lcl_main instance. Therefore even though the lcl_main instance is now cleared and the garbage collector has run, the element is still there.
Edit: Following comments from Sergey Muratov I’d like to point out that this is the observed behaviour. Whether the object instance is physically cleared from memory or remains as some kind of Zombie Object is unclear, but the end result is still that it is inaccessible and from the program’s point of view it no longer exists.
Edit 2: The question has been answered, see below
We have demonstrated that heap objects such as class attributes behave like fully independent objects in their own right. They are created by the parent class, but their lifetime is defined by the number of references an element has, as determined by the garbage collector.
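The same rule can be seen without any class at all. A minimal sketch, assuming an anonymous data object created with NEW and a made-up report name: two references point to the same heap data object, and clearing one of them leaves the other fully usable.

```abap
REPORT zdata_object_lifetime_anon. "hypothetical report name

START-OF-SELECTION.
  "Anonymous data object on the heap, no class involved
  DATA(ref1) = NEW i( 4 ).
  DATA(ref2) = ref1.           "second reference to the same data object

  CLEAR ref1.                  "removes one reference only
  cl_abap_memory_utilities=>do_garbage_collection( ).

  "ref2 still references the data object, so it is not garbage
  IF ref2 IS BOUND.
    WRITE / ref2->*.
  ENDIF.
```

Only once ref2 is cleared as well does nothing reference the data object any more, at which point the garbage collector is free to reclaim it.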
The example source code is on GitHub over here
Following the discussion in the comments, I dug deeper into the memory to figure out whether the object is physically removed from memory or just becomes inaccessible and remains purely for the attribute reference.
The result is that the entire instance remains in memory until the attribute reference is deleted. But I noticed one other interesting thing: Small strings are stored internally, but larger strings are created as a separate data object in memory.
My test used an object lcl_test with two attributes, s and s2. s2 gets assigned a 500 char (1000 bytes) value just because it’s easier to spot larger memory changes.
o = NEW lcl_test( ).
o->s = `Foo`.
o->s2 = `1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890` &&
        `1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890` &&
        `1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890` &&
        `1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890` &&
        `1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890`.
DATA(sref) = o->get_string_ref( ).
CLEAR o.
cl_abap_memory_utilities=>do_garbage_collection( ).
CLEAR sref.
Before the CLEAR o statement, the object is referenced by both O and SREF:
But note how s2 is now big enough that it is created as a separate data object in memory. This is exactly the scenario I suggested as a possibility: if s2 is the referenced attribute, the object instance could be deleted. However, as we shall see, this is not the case.
Note that the bound bytes of 1280 is the sum of 1224 for s2 and the 56 byte object size, so s2 is still considered part of the object.
Now what happens after the CLEAR o statement and the garbage collection run?
The object is still there, with the full memory occupied. It is referenced by the string attribute reference. Notably, the s2 value is also still there.
I tested this with the reference pointing to both s and s2, with the same result.
For completeness, after the CLEAR sref statement, all objects disappear. No need for a screenshot.
So the conclusion is that we do have a zombie object, and keeping a reference to an object’s attribute will still retain the complete object with all the memory it occupies.
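If retaining the whole instance is a concern, one option is to hand out a reference to a copy of the value rather than to the attribute itself. The sketch below assumes a class shaped like lcl_test with two public string attributes; the method get_string_copy and the report name are hypothetical and not part of the original test. This variant was not measured in the memory inspector here, so treat the expected effect (the instance becoming collectable once o is cleared) as an assumption to verify.

```abap
REPORT zdata_object_lifetime_copy. "hypothetical report name

CLASS lcl_test DEFINITION.
  PUBLIC SECTION.
    DATA s  TYPE string.
    DATA s2 TYPE string.
    "Hypothetical method: returns a reference to a copy, not to the attribute
    METHODS get_string_copy RETURNING VALUE(result) TYPE REF TO string.
ENDCLASS.

CLASS lcl_test IMPLEMENTATION.
  METHOD get_string_copy.
    "NEW creates an anonymous data object initialised with the value of s2
    result = NEW string( s2 ).
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA(o) = NEW lcl_test( ).
  o->s  = `Foo`.
  o->s2 = repeat( val = `1234567890` occ = 50 ). "500 characters

  DATA(sref) = o->get_string_copy( ).
  CLEAR o.
  cl_abap_memory_utilities=>do_garbage_collection( ).

  "sref points to the independent copy, not into the lcl_test instance
  DATA(len) = strlen( sref->* ).
  WRITE / len.
```

The trade-off is that changes made through sref no longer reach the attribute, so this only fits cases where the caller needs the value and not a live view of the object's state.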
🙂 thoroughly enjoyed it...You saved the fun part for the last...just like telling a story with an unexpected climax....Loved it and learnt a new thing, of course ! 🙂 Thanks for Sharing Mike Pokraka
I don't agree with the sentence "And still the referenced data element lives on without its parent object". I think that the instance attribute "i" cannot exist without its parent object. So the whole object should continue to exist in memory.
I don't know for sure, this is part of the internals. However, the behaviour is such that the object instance is inaccessible to the program but the attribute remains accessible.
Technically both could be feasible, a data object is a piece of memory on the heap with a pointer to it. We can have several references pointing to the same data object and it would be technically feasible to physically destroy the class instance from memory and keep the attribute, it simply becomes an anonymous data object. I don't know how SAP implemented it internally.
If you look at the memory usage of an object in the debugger it will show zero after CLEAR and the reference looks like a regular anonymous data object.
It is an interesting question, if I find some time I may investigate this further, but for all intents and purposes the behaviour as such is that an attribute reference remains alive even when an object is no longer accessible. Maybe I should reword that in the blog.
Thank you for your response! I agree that this depends on internal implementation.
There is a concept called strongly connected components (SCCs).
You can find some more info about it here:
Thanks, useful bit of info. I had a brief look using the memory inspector when I answered Sergey earlier, but could not spot evidence straight away whether it was deleted or not. Superficially it looked like the object is gone, but it was not fully conclusive and for time reasons I did not look deeper into it. I am interested to know for sure myself, so will get to the bottom of it!
I agree with Sergey.
With "Clear O", you only cleared the Object Reference (8 bytes pointer to a memory called the Object, containing the attribute "I" in your case). The Object itself (the attribute "I") is not cleared because there is still "iref" which points to an attribute of the object.
What is important to understand is the difference between Object Reference and Object.
So the "parent Object" is still here, it's only the Object Reference which is not here anymore.
Correct, clearing the reference only deletes the reference.
The garbage collector runs "periodically" (whatever that means) and will reclaim the memory of any unreferenced objects.
Therefore I explicitly invoked garbage collection, because I wanted to find out whether the external reference to the attribute just happens to still work because the object hasn't been reclaimed, or whether the attribute reference keeps the garbage collector away.
We have positively answered that question, the attribute is not garbage even if the object is no longer referenced. The further question Sergey raised is whether the object still lurks as zombie or whether the attribute has become an independent data object. Either way is technically possible, and ABAP also doesn't always behave as we expect. I have a test in mind, will try it later.
Thanks a lot for the investigation. By the way, if you have additional time to investigate, it would be fun to see, if you define an additional attribute of type Internal Table and load it, if its memory is released during "Clear O" or garbage collection. i.e. to determine if the garbage collection is done at the level of the whole object (all Dynamic attributes or none) or at the level of each Dynamic Attribute (string, internal table, data/object reference, and boxed structure).
Good question, I added some more attributes: a short & long string, and a small and large table. The short answer is the whole object is kept.
The screenshot is after the object has been cleared, but while the string reference SREF still points to the string attribute:
The screenshot is split into two because of other system objects in between.
You can see that the string and table attributes' contents are data objects in their own right, that are referenced by the attribute itself. In other words: an integer is part of the object, but a string or table is an internal reference to a data object outside the class object.
That's the so-called "dynamic" (string, internal table, references, boxed structures) and "static" data objects (the rest), but I'd prefer to stop using the word "static", because it has so many meanings in ABAP...
OK, I've answered this one, we have a zombie object. Updates in the blog, thanks for raising an interesting question.
Thank you for the update. I'm not surprised: The runtime and garbage collector are most likely optimized to put whole objects into memory (in JVM called "heap"). Tearing apart the data structure that forms an object would create more indirections and additional work in the garbage collector, reducing overall performance.
The interesting thing is that the data structure is selectively broken up, as shown above. I'm guessing internally there's some feature that keeps them closely coupled.
I once found this SAP Note (Destructors in ABAP Objects), which explains it nicely:
I love the solution in this note. "No specific solution.".
Between the blog post and the note, it makes a lot more sense to me. It will really help with debugging. Sometimes I wonder why. And I now know.
Hey Mike, 🙂
Great article. Thanks for sharing.
Hey Mahesh, thanks, glad you found it useful.
Really thought through and well-written, I enjoyed reading. Thanks. Are you functional programming with lambdas in ABAP?
Thanks for the kind words. I don't think that true lambdas are possible, since ABAP is more object-oriented than functional. The functional aspect consists of the SAP-delivered language components; it is not possible to assign them to variables, let alone create your own functions.
Even my closure experiments were more of a hack in that they substituted an inner object for the lambda function that defines a 'real' closure.
Very generic interfaces and some OO Patterns such as Inversion of Control are comparable to lambda-like behaviour when the calling code needs to have no idea what class it is or what it does. But you still have the overhead of a full class instance.
Hi, I like this blog very much and for me, it's always of interest to get deeper insights. Thank you very much for it.
I would also like to read about your experiences with closures, but I can't open the geek blog you mentioned. Could you please have a look at the link?
Thanks for the kind comment and for pointing out the faulty link – fixed.