Boxed components and memory efficiency
What’s a boxed component?
I read about them first in this document – http://scn.sap.com/docs/DOC-26231. Then I read the release notes – you do read the release notes when you upgrade to a new ABAP version, don’t you? It’s worth doing, as you can find some very nice language enhancements. I read the SAP help, but couldn’t quite get what the point of them was, nor how they worked.
Then I read this excellent blog, "SAP ABAP TYPE BOXED, New Type Category 7.02 Ehp2" on the ABAP Help Blog, which explains their functionality very clearly. Just now, I’ve created my first real-world use of the concept, so I’ll share it.
I’ve got some master data – materials and time-dependent attributes. I’ve got a few million records of data that contain materials and dates, and I want to find the attributes for those materials at that particular date. The records I’ve got are in no particular order, may refer to the same material many times, and may well have materials that don’t have any master data. Also, the materials in my records are only a subset of all the possible materials.
Got that? Good.
The volume of master data is such that I cannot read all the data into an internal table in one go. So what I do is maintain a buffer that starts off empty. For a given material, I first check the buffer. If it isn’t there, I check the database. If I still don’t find anything, I create a record in the buffer with a flag “not found”. If I do find something, then I add it to the buffer.
So, my line structure is something like: MATERIAL, CORRECTED_COST, CORRECTED_CURRENCY, NOT_FOUND. Here’s some ABAP:
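" Field types are illustrative assumptions; only the field names matter here.
TYPES: BEGIN OF corrected_cost_ty,
         material           TYPE c LENGTH 18,
         corrected_cost     TYPE p LENGTH 9 DECIMALS 2,
         corrected_currency TYPE c LENGTH 5,
         not_found          TYPE abap_bool,
       END OF corrected_cost_ty.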
Now, the problem is that if a material is NOT_FOUND, then it has no corrected cost. So for all the NOT_FOUND data, there are two fields which never contain anything, but which take up space nonetheless. This is where boxing comes in.
I create a type “attributes” which has CORRECTED_COST and CORRECTED_CURRENCY, then I redo my corrected_cost_ty like this.
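" Field types are illustrative assumptions; only the field names matter here.
TYPES: BEGIN OF attributes_ty,
         corrected_cost     TYPE p LENGTH 9 DECIMALS 2,
         corrected_currency TYPE c LENGTH 5,
       END OF attributes_ty.
TYPES: BEGIN OF corrected_cost_ty,
         material   TYPE c LENGTH 18,
         attributes TYPE attributes_ty BOXED,
         not_found  TYPE abap_bool,
       END OF corrected_cost_ty.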
Now, when I don’t have any attributes for a material (because it’s NOT_FOUND), memory is only allocated to the MATERIAL and NOT_FOUND.
Let’s say material is 18 bytes, not_found is 1 byte, corrected_cost is 17 bytes and corrected_currency 5 bytes. Then a fully populated record occupies 41 bytes. Without a boxed structure, every record in my internal table would occupy 41 bytes, so 6 entries would take 246 bytes in total. However, we do have a boxed structure, so the memory allocation looks something like this:
In this diagram, materials 1, 2 and 5 are not found, so their corrected cost and corrected currency are all initial. Boxing means that every record whose boxed fields are all initial shares the same area of memory for those fields. So these three not-found materials occupy 79 bytes (3 × 19 bytes for MATERIAL and NOT_FOUND, plus one shared 22-byte area for the initial attributes), instead of 123 bytes (3 × 41) without boxing.
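The buffer described earlier, combined with the boxed type, could be sketched roughly as follows. This is only an illustration under stated assumptions: the report name, the class lcl_cost_buffer, the method read_from_db and the field types are hypothetical and not taken from the original program, the database read is a placeholder, and the date handling is left out to match the simplified line structure.

```abap
REPORT z_boxed_buffer_sketch.

" Assumed field types; the post only names the fields.
TYPES: BEGIN OF attributes_ty,
         corrected_cost     TYPE p LENGTH 9 DECIMALS 2,
         corrected_currency TYPE c LENGTH 5,
       END OF attributes_ty.

TYPES: BEGIN OF corrected_cost_ty,
         material   TYPE c LENGTH 18,
         attributes TYPE attributes_ty BOXED,
         not_found  TYPE abap_bool,
       END OF corrected_cost_ty.

CLASS lcl_cost_buffer DEFINITION.
  PUBLIC SECTION.
    METHODS get
      IMPORTING i_material    TYPE corrected_cost_ty-material
      RETURNING VALUE(r_line) TYPE corrected_cost_ty.
  PRIVATE SECTION.
    DATA buffer TYPE HASHED TABLE OF corrected_cost_ty
                WITH UNIQUE KEY material.
    " Hypothetical database access; replace with the real SELECT.
    METHODS read_from_db
      IMPORTING i_material   TYPE corrected_cost_ty-material
      EXPORTING e_attributes TYPE attributes_ty
                e_found      TYPE abap_bool.
ENDCLASS.

CLASS lcl_cost_buffer IMPLEMENTATION.
  METHOD get.
    DATA attributes TYPE attributes_ty.
    DATA found      TYPE abap_bool.

    " 1. Check the buffer first.
    READ TABLE buffer INTO r_line WITH TABLE KEY material = i_material.
    IF sy-subrc = 0.
      RETURN.
    ENDIF.

    " 2. Not buffered yet: ask the database.
    read_from_db( EXPORTING i_material   = i_material
                  IMPORTING e_attributes = attributes
                            e_found      = found ).

    " 3. Buffer the result, found or not.
    CLEAR r_line.
    r_line-material = i_material.
    IF found = abap_true.
      r_line-attributes = attributes.
    ELSE.
      " The boxed component is deliberately not written here.
      r_line-not_found = abap_true.
    ENDIF.
    INSERT r_line INTO TABLE buffer.
  ENDMETHOD.

  METHOD read_from_db.
    " Placeholder: this sketch never finds anything.
    CLEAR: e_attributes, e_found.
  ENDMETHOD.
ENDCLASS.
```

The not-found branch only fills MATERIAL and NOT_FOUND, which is the case the boxed component is meant to keep cheap. Whether the savings actually materialise in a given program is best checked with the memory analysis tools rather than assumed.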
The only downside of using boxing is that I have to refer to my attributes as attributes-corrected_cost. I’d like to be able to use something like
INCLUDE TYPE attributes_ty BOXED.
But I guess we’ll have to wait a while for that to happen.
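As a minimal sketch of what that component path looks like in practice (the report name, variable name, values and field types below are assumptions made for illustration), reading and writing a boxed component works like any other substructure. As far as I understand the ABAP documentation, the first write access to the box is what ends the sharing of the initial value and gives the record its own memory for the attributes.

```abap
REPORT z_boxed_access_sketch.

" Assumed field types; the post only names the fields.
TYPES: BEGIN OF attributes_ty,
         corrected_cost     TYPE p LENGTH 9 DECIMALS 2,
         corrected_currency TYPE c LENGTH 5,
       END OF attributes_ty.

TYPES: BEGIN OF corrected_cost_ty,
         material   TYPE c LENGTH 18,
         attributes TYPE attributes_ty BOXED,
         not_found  TYPE abap_bool,
       END OF corrected_cost_ty.

DATA cost_line TYPE corrected_cost_ty.

START-OF-SELECTION.
  cost_line-material = 'MAT-0001'.

  " Read access goes through the component path.
  IF cost_line-attributes-corrected_cost IS INITIAL.
    WRITE / 'No corrected cost yet'.
  ENDIF.

  " Write access also goes through the component path.
  cost_line-attributes-corrected_cost     = '12.50'.
  cost_line-attributes-corrected_currency = 'EUR'.

  WRITE: / cost_line-material,
           cost_line-attributes-corrected_cost,
           cost_line-attributes-corrected_currency.
```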
I am so naïve I had always assumed deep structures would be dynamic by default...
This is much nicer than having to use data references and creating the objects yourself.
That's new! Thanks for sharing.
Then can we consider the field ERSDA an inbuilt BOXED component, as it has no initial values check?
For an internal table, does BOXED mean initial values on/off?
That is my understanding; please correct me if I am wrong.
No - initial values for transparent tables are entirely unrelated to the concept of BOXED structures. Initial values for transparent tables determine whether (for example) an integer field of a new record in the table is filled with '0' (in the database) or with "null".
I've updated my blog with a diagram that hopefully makes it clear that BOXED structures are about efficient allocation of memory, when using internal tables.
Up till now, my understanding was that if we set the initial values check, the field occupies memory, and otherwise it does not.
In the boxed case, the concept is that "it will share the same memory".
It's clear now.
Coming from a different environment (AS/400), I find it absolutely amazing that you still have to think about memory usage and not about the problem at hand.
Thanks for sharing.
If you are only dealing with thousands of bytes, it's not worth thinking about boxed components. But if you have a long-running job dealing with millions of rows and blocking main memory, you should still think about memory issues.
Customers tend to enter huge selection ranges, resulting in large amounts of data...
I work mainly (at the moment) in BW. We're often dealing with millions of records.
I am not saying that we do not need it in SAP. As I mentioned, I came from the S/38 and AS/400, where we had single-level store (http://en.wikipedia.org/wiki/Single-level_store)
and record-level access, which does not use SQL at all and allows you to process millions of records without blocking main memory...
And those features were available in August 1979 (http://en.wikipedia.org/wiki/IBM_System/38)
I guess I am spoiled by those features...
I used to work on an OLAP product called Essbase. It handled sparse data very efficiently.
It is a very impressive piece of technology.
This is the first time I have heard about it.
I read here, http://en.wikipedia.org/wiki/Essbase#Calculation_engine_2, about its block storage.
"Since version 7, Essbase has supported two "storage options" which take advantage of sparsity to minimize the amount of physical memory and disk space required to represent large multidimensional spaces."
When was version 7 released?
I also make it a point to look at the Release changes to see the new language elements available in ABAP.
A couple of questions -
1) If you have a dataset where it's common for a subset of the fields to be empty, then you could benefit from boxing. In my example, I have a buffer that also marks when the data could not be found in the db. In the past, I maintained two buffers, one with data, and one with the "not found" data (just the keys). Now, I can maintain just one buffer and have even better memory efficiency.
2) No precautions necessary. As soon as one of the boxed fields of a record is given a value, memory space is allocated to the boxed structure for that record.
Thank you for sharing,
I didn't understand boxed components before; now it is very clear to me.
Very neatly explained .. Thanks for this 🙂
Is there any reason why a person shouldn't use BOXED components?
No. Why would you think there would be?
Exactly, that's what I thought. So, shouldn't SAP be defaulting such things, rather than asking developers to add BOXED to every data declaration, in every program?
Maybe we should add this to the "Stuff SAP could easily fix" rant / wish list.
Juwin - correct that it would improve performance, but only in certain cases.
See the notes:
It also mentions:
This "generally" makes me skeptical that performance would be improved all the time. If BOXED were the default, I think the admin cost would be high and performance would not improve.
Also, it may add admin cost for a flat structure that doesn't have the same repeating deep structure.
Is my understanding correct that using boxed components is still reasonable in SAP BC 7.4, as it is not the default and is not applied automatically by the ABAP runtime?
As it was introduced with 7.02, I would have expected this to have become a standard way of saving memory, even though memory has become cheaper.
As far as I know, nothing has changed.
Thanks, Matthew Billingham, for the great blog. The same concept was used in last Saturday's SAP Inside Track Berlin session.