ABAP Tips: The lifetime of a data object

pokrakam · 11-10-2020

Some time ago I played around with the concept of closures and wondered whether it could be done in ABAP, and whether it would even make sense there. My experiments did highlight some interesting details about memory management and data lifetimes in ABAP, which I'd like to share in this blog.

Closures are an advanced concept well outside the scope of ABAP, and you needn't know anything about them for this blog. I've therefore decided to split the geek part that goes off the beaten track into a separate blog.

So let's dive right in. One of the questions I wanted to get to the bottom of is when exactly a variable or object reference is destroyed, and under what circumstances it remains accessible. The online help explains how the Garbage Collector works, but what does this mean in practice?

A simplified explanation involves the two types of memory used by an ABAP program:

Stack Memory

Stack memory is where runtime program modules live and is controlled by the program execution. When a program module such as a method, procedure or function is called, it gets allocated memory on the stack, and when it finishes, that memory is cleared from the stack. We also talk about the "call stack" when debugging, which is essentially the same thing. All local variables are part of the stack memory and thus disappear when execution of their module finishes.

A simple example will demonstrate this:

REPORT zdata_object_lifetime_test.



CLASS lcl_main DEFINITION.

  PUBLIC SECTION.

    METHODS get_num_ref RETURNING VALUE(result) TYPE REF TO i.

ENDCLASS.



CLASS lcl_main IMPLEMENTATION.

  METHOD get_num_ref.

    DATA(i) = 4.

    GET REFERENCE OF i INTO result.

  ENDMETHOD.

ENDCLASS.



START-OF-SELECTION.

  DATA(iref) = NEW lcl_main( )->get_num_ref( ).

  WRITE / iref->*.

Result:

As we can see, the variable no longer exists and the reference is therefore invalid.
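To avoid running into an invalid reference like this, the reference can be tested before it is dereferenced. The following is a minimal sketch, assuming that the IS BOUND predicate reports a reference to a local variable as invalid once the method has returned; the report name zdata_object_lifetime_check is made up for illustration.

```abap
REPORT zdata_object_lifetime_check. "hypothetical report name

CLASS lcl_main DEFINITION.
  PUBLIC SECTION.
    METHODS get_num_ref RETURNING VALUE(result) TYPE REF TO i.
ENDCLASS.

CLASS lcl_main IMPLEMENTATION.
  METHOD get_num_ref.
    DATA(i) = 4.                      "local variable on the stack
    GET REFERENCE OF i INTO result.   "reference to stack memory
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA(iref) = NEW lcl_main( )->get_num_ref( ).

  "Only dereference if the reference still points to existing data
  IF iref IS BOUND.
    WRITE / iref->*.
  ELSE.
    WRITE / 'Reference is no longer valid'.
  ENDIF.
```

Given the behaviour described above, this should take the ELSE branch instead of failing on the dereference.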

Local variables have their own quirks, which I've previously written about here and here, but in this blog I'd like to focus on heap objects.

Heap Memory

Heap memory is where all data that is not local to a program module lives. This includes object attributes, global variables, and objects and data created via references. Since these have a lifetime beyond the module that created them, they are not part of the stack and are managed centrally by garbage collection.

If we take our example program above and move the data declaration to the attributes section (a Ctrl-1 quick fix in Eclipse), it moves to the heap memory and is no longer part of the executing method code's memory.

CLASS lcl_main DEFINITION.

  PUBLIC SECTION.

    METHODS get_num_ref RETURNING VALUE(result) TYPE REF TO i.

  PRIVATE SECTION.                 "<-- added

    DATA i TYPE i.                 "<-- added

ENDCLASS.



CLASS lcl_main IMPLEMENTATION.

  METHOD get_num_ref.

    i = 4.                         "<-- changed

    GET REFERENCE OF i INTO result.

  ENDMETHOD.

ENDCLASS.



START-OF-SELECTION.

  DATA(iref) = NEW lcl_main( )->get_num_ref( ).

  WRITE / iref->*.

Note that in the program we do not even assign the object instance to a variable, but instead chain the method call straight onto the NEW constructor. Yet we will still get the correct result:

So far so good. I think many developers are already familiar with this setup.
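An attribute is not the only way to get a number onto the heap. As a small variation (not part of the original example), the method can create an anonymous data object with NEW and return the reference to it. The class name lcl_anon and the report name are made up for this sketch.

```abap
REPORT zdata_object_lifetime_anon. "hypothetical report name

CLASS lcl_anon DEFINITION.
  PUBLIC SECTION.
    METHODS get_num_ref RETURNING VALUE(result) TYPE REF TO i.
ENDCLASS.

CLASS lcl_anon IMPLEMENTATION.
  METHOD get_num_ref.
    "NEW creates an anonymous data object of type i on the heap,
    "so it does not disappear when the method ends
    result = NEW i( 4 ).
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA(iref) = NEW lcl_anon( )->get_num_ref( ).
  WRITE / iref->*.
```

Here the data object has no parent instance at all; it exists for as long as iref (or any other reference) points to it.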

Where it gets interesting, however, is when we hold the object in a reference variable and explicitly clear it. The data still lives on because something is still referencing it. In other words, heap object lifetimes are not limited by their containing or originating object.

Let's demonstrate this by explicitly clearing the object whose attribute we're referencing. Same code as above, just the last part changed to:

START-OF-SELECTION.

  DATA(o) = NEW lcl_main( ).

  DATA(iref) = o->get_num_ref( ).

  CLEAR o.

  WRITE / iref->*.

Result:

We can go one step further and force garbage collection:

START-OF-SELECTION.

  DATA(o) = NEW lcl_main( ).

  DATA(iref) = o->get_num_ref( ).

  CLEAR o.

  cl_abap_memory_utilities=>do_garbage_collection( ).

  WRITE / iref->*.

And still the referenced data element lives on without its parent object:

What is happening here is that the garbage collector knows that something is still referencing the element that was once the attribute of the lcl_main instance. Therefore even though the lcl_main instance is now cleared and the garbage collector has run, the element is still there.

Edit: Following comments from sap.dev.sam I'd like to point out that this is the observed behaviour. Whether the object instance is physically cleared from memory or remains as some kind of Zombie Object is unclear, but the end result is still that it is inaccessible and from the program's point of view it no longer exists.

Edit 2: The question has been answered, see the Update section below.

Conclusion

We have demonstrated that heap objects such as class attributes behave like fully independent objects in their own right. They are created by the parent class, but their lifetime is defined by the number of references an element has, as determined by the garbage collector.

The example source code is on GitHub over here

Update

Following the discussion in the comments, I dug deeper into the memory to figure out whether the object is physically removed from memory or just becomes inaccessible and remains purely for the attribute reference.

The result is that the entire instance remains in memory until the attribute reference is deleted. But I noticed one other interesting thing: Small strings are stored internally, but larger strings are created as a separate data object in memory.

My test used an object lcl_test with two string attributes, s and s2. The attribute s2 gets assigned a 500-character (1000-byte) value, simply because larger memory changes are easier to spot.

    o = NEW lcl_test( ).

    o->s = `Foo`.

    o->s2 = `1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890` &&

            `1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890` &&

            `1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890` &&

            `1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890` &&

            `1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890`.

    DATA(sref) = o->get_string_ref( ).

    CLEAR o.

    cl_abap_memory_utilities=>do_garbage_collection( ).



    CLEAR sref.

Before the CLEAR o statement, the object is referenced by both O and SREF:

But note how s2 is now big enough that it is created as a separate data object in memory. This is exactly one scenario I had suggested as a possibility: if s2 is the referenced attribute and lives in its own data object, the object instance could in principle be deleted. However, as we shall see, this is not the case.

Note that the bound bytes of 1280 are the sum of 1224 bytes for s2 and the 56-byte object size (1224 + 56 = 1280), so s2 is still considered part of the object.

Now what happens after CLEAR o:

The object is still there, with the full memory occupied. It is referenced by the string attribute reference. Notably, the s2 value is also still there.

I tested this with the reference pointing to both s and s2, with the same result.

For completeness, after the clear sref statement, all objects disappear. No need for a screenshot.

So the conclusion is that we do have a zombie object, and keeping a reference to an object's attribute will still retain the complete object with all the memory it occupies.
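The same question can also be probed from within the program rather than by inspecting memory. The sketch below is not the method used for the analysis above; it is an alternative that assumes the standard class CL_ABAP_WEAK_REFERENCE (constructor parameter oref, method get) and uses a made-up report name. A weak reference does not keep an instance alive on its own, so whether get( ) still returns the instance after garbage collection indicates whether something else is retaining it.

```abap
REPORT zdata_object_lifetime_weak. "hypothetical report name

CLASS lcl_main DEFINITION.
  PUBLIC SECTION.
    METHODS get_num_ref RETURNING VALUE(result) TYPE REF TO i.
  PRIVATE SECTION.
    DATA i TYPE i.
ENDCLASS.

CLASS lcl_main IMPLEMENTATION.
  METHOD get_num_ref.
    i = 4.
    GET REFERENCE OF i INTO result.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA(o)    = NEW lcl_main( ).
  DATA(iref) = o->get_num_ref( ).
  "The weak reference observes the instance without retaining it
  DATA(weak) = NEW cl_abap_weak_reference( oref = o ).

  "Step 1: drop the object reference, keep the attribute reference
  CLEAR o.
  cl_abap_memory_utilities=>do_garbage_collection( ).
  DATA(survivor) = weak->get( ).
  IF survivor IS BOUND.
    WRITE / 'After CLEAR o: instance still in memory'.
  ELSE.
    WRITE / 'After CLEAR o: instance was collected'.
  ENDIF.

  "Step 2: also drop the attribute reference. survivor must be
  "cleared too, because it is a normal (strong) reference.
  CLEAR: survivor, iref.
  cl_abap_memory_utilities=>do_garbage_collection( ).
  survivor = weak->get( ).
  IF survivor IS BOUND.
    WRITE / 'After CLEAR iref: instance still in memory'.
  ELSE.
    WRITE / 'After CLEAR iref: instance was collected'.
  ENDIF.
```

If the conclusion above holds, the first check would be expected to find the instance and the second would not, but treat that as something to verify on your own system rather than a guaranteed output.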