Simple isn’t easy – ABAP composition (Chunk it up)
Do you write code only for machines, or also for your fellow developers and curious functional consultants?
There are many ways to make ABAP code clean, readable, maintainable and awesome, and there is no one-size-fits-all. Two of the concepts to use are object composition and modular programming. Although they are basically two different things, they overlap in this context and I will mix them together here a bit. I tried to be brief, but this is the shortest I managed to get it now…
But let’s talk about abstraction first
The common saying is that ‘computers only work with ones and zeros’. And already there we have an abstraction, as the transistors in the integrated circuits in computers actually work with electrical voltage. The ones and zeros are an abstraction and a representation of that: voltage above a certain level is abstracted as, or represents, a ‘1’ and voltage below a certain level is a ‘0’. And the ones and zeros abstraction is simpler than dealing with the voltage.
The devices and machines running our code work in ‘machine code’ (or ‘ones and zeros’) and don’t really care about what fancy words or symbols we humans use to write our code. Assembly language is an abstraction up from the machine code and then we have even higher-level abstractions in languages like for instance C, and then ABAP. These are also referred to as programming language generations, with ABAP being a fourth-generation programming language.
These higher-level languages mean we don’t have to worry about the ones and zeros, or about the circuits. We typically don’t have to know how the memory bus operates. We typically don’t have to worry about how files are stored physically on a hard drive, or how HTTP works on top of TCP, on top of IP. We typically don’t have to worry about how the database works internally. We can just use it, and it’s simpler.
In ABAP we can for instance issue a SELECT statement in our code and just expect the database and the table to be there and deliver what we need, without knowing anything about how it works.
But how is it even possible that anything works at all, then? Well, I’m sometimes surprised that so many things actually work, and of course somebody has to know about those things. There are those who develop these layers, these aspects, these abstractions. There are those who program the programming languages and the IDEs. And that is what makes it simpler for those of us who typically work in the higher-level languages, like ABAP, which I would assume is the majority here… We don’t have to know everything, but somebody does.
Adding a perspective to that: it’s normally good, and in some cases even necessary, to have a good understanding of some of the different abstraction layers in order to utilize the systems optimally. Knowing for instance how a database works and interacts with the system should help a developer use it optimally, even though we do not need to be able to program the database itself. SELECT * FROM table might always work, but understanding what’s actually happening lets us see that we should prefer to select only the fields we need. A STANDARD TABLE might always work, but in many cases a SORTED or HASHED table would be better, and it helps to know how and why.
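As a minimal sketch of those two points, the report below selects only two fields and reads the result through a hashed table key. It assumes the classic flight demo table SCARR is available in the system; the report name and the type name carrier_name are made up for this illustration.

```abap
REPORT zdemo_field_list_and_hashed.  " hypothetical report name

" Only the two fields we actually need (assumes demo table SCARR exists)
TYPES: BEGIN OF carrier_name,
         carrid   TYPE scarr-carrid,
         carrname TYPE scarr-carrname,
       END OF carrier_name.

" Hashed table: access by the unique key does not depend on table size
DATA carrier_names TYPE HASHED TABLE OF carrier_name WITH UNIQUE KEY carrid.

START-OF-SELECTION.
  " Explicit field list instead of SELECT *
  SELECT FROM scarr
    FIELDS carrid, carrname
    INTO TABLE @carrier_names.

  " Key access through the hash administration instead of a linear scan
  READ TABLE carrier_names WITH TABLE KEY carrid = 'LH'
       REFERENCE INTO DATA(carrier).
  IF sy-subrc = 0.
    WRITE: / carrier->carrid, carrier->carrname.
  ENDIF.
```

With a STANDARD TABLE the same READ would search line by line, which is fine for a handful of rows but adds up when it happens inside a loop over a large data set.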
So what about this composition and modular thing, then?
In the same way that abstraction layers, as described above, are helping us as developers by making things simpler, we can use object composition and modular programming to help ourselves and our colleagues in a similar way, with the intention of making things simpler.
Instead of hammering away on the keyboard for line after line of continuous procedural code, we can compose these lines in smaller chunks and call on them one after the other.
When we have these smaller chunks together as methods in classes, the methods are ‘modular programming’, and the classes are ‘object composition’.
Simple isn’t easy, though…
It’s not always easy to do, and it’s not easy to show, demonstrate and explain in a short, clear and concise way either. So I’m omitting some things here and oversimplifying others, with the purpose of not having you sit here reading for hours, but merely to show the idea and the principle… 🙂
Let’s say we have some data. It could be input data from an integration or an app, or something we read and gathered from the database for some reason or another.
    items = VALUE #(
      ( order = '1' line = '1' item_id = '123' count = 2 operation = 'add' )
      ( order = '1' line = '2' item_id = '234' count = 4 operation = 'add' )
      ( order = '2' line = '1' item_id = '234' count = 4 operation = 'add' )
      ( order = '2' line = '2' item_id = '234' count = 6 operation = 'sub' )
      ( order = '2' line = '3' item_id = '234' count = 2 operation = 'add' )
      ( order = '3' line = '1' item_id = '123' count = 3 operation = 'sub' )
      ( order = '3' line = '2' item_id = '123' count = 5 operation = 'add' )
      ( order = '3' line = '3' item_id = '123' count = 5 operation = 'reset' )
      ( order = '3' line = '4' item_id = '123' count = 2 operation = 'sub' )
      ( order = '3' line = '5' item_id = '123' count = 4 operation = 'add' ) ).
So we have some kind of orders with order lines. There are item IDs, a count and an operation. The super obvious business requirement is to summarize the counts of the item IDs for each order, based on the operations. The expected outcome is something like this:
    ORDER  ITEM_ID  DESCRIPTION        TOTAL_COUNT  UPDATE_STATUS
    1      123      Gandalf the grey   2            ok
    1      234      Hosaka Ono-Sendai  4            ok
    3      123      Gandalf the grey   2            ok
The code that produced this output is the made-up example below, and we want to try to use composition and modularization to simplify it. It’s not the code in itself that’s important here but the idea of chunking it up…
    LOOP AT items REFERENCE INTO DATA(item)
         GROUP BY ( order   = item->order
                    item_id = item->item_id )
         REFERENCE INTO DATA(item_group).

      " One line per order and item ID. Also read the description for item.
      READ TABLE summary
        WITH KEY order   = item_group->order
                 item_id = item_group->item_id
        REFERENCE INTO DATA(summary_line).
      IF sy-subrc NE 0.
        INSERT CORRESPONDING #( item_group->* )
          INTO TABLE summary REFERENCE INTO summary_line.
        SELECT SINGLE FROM zitem_main_data_table
          FIELDS description
          WHERE id = @summary_line->item_id
          INTO @summary_line->description.
      ENDIF.

      " Perform the operations to calculate the summary
      LOOP AT GROUP item_group REFERENCE INTO DATA(item_line).
        CASE item_line->operation.
          WHEN zcl_constants=>item_operations-add.
            summary_line->total_count = summary_line->total_count + item_line->count.
          WHEN zcl_constants=>item_operations-sub.
            summary_line->total_count = summary_line->total_count - item_line->count.
          WHEN zcl_constants=>item_operations-reset.
            CLEAR summary_line->total_count.
        ENDCASE.
      ENDLOOP.

      " Call BAPI to update the system
      order_header = VALUE bapi_header_type( order_id = summary_line->order ).
      CALL FUNCTION 'ZBAPI_HANDLE_COUNT_FOR_ITEM'
        EXPORTING
          order_header    = order_header
          item_number     = summary_line->item_id
          item_count      = summary_line->total_count
        IMPORTING
          return_messages = return
          result          = bapi_result.
      IF line_exists( return[ type = 'E' ] ).
        summary_line->update_status = zcl_constants=>text-failed.
      ELSE.
        summary_line->update_status = bapi_result-status.
      ENDIF.

    ENDLOOP.

    " We only care about returning lines with a non-zero count
    DELETE summary WHERE total_count = 0.
We don’t need to focus on the details of the code itself. I’d say this is a pretty common representation of ABAP programming: there’s a loop with some things happening inside. It’s not too bad, just a little over 50 lines, and there’s not really that much happening. A bigger example would show the impact more clearly, but let’s practice this… Three main things are happening inside the outermost visible loop here… We find a unique line in the summary table (which might be populated before our loop), then we calculate things based on the operations, and finally we call a BAPI. We put those three things in methods of their own… If we want to reuse them globally in the system, composition and a separate class is the way to go. If it just belongs to this implementation, then a method of its own is my suggestion.
I’d say that finding a unique line is really a modular thing. It belongs to this loop and this class, and goes in a separate method get_unique_summary_line.
Handling the calculations/operations might be something to reuse, depending on what we’re doing. In this imaginary scenario it’s just a specific input thing, so it’s going in a method perform_count_operation in this same class.
For calling the BAPI we definitely want to create a global model class for the “Item object”. This may or may not be reused many times in our custom development in the system but we want to have the option and we want to be consistent in the calling. This is already instantiated before the loop into item_model. The wrapper method is called handle_count.
After doing this, our main loop can instead look like this:
    LOOP AT items REFERENCE INTO DATA(item)
         GROUP BY ( order   = item->order
                    item_id = item->item_id )
         REFERENCE INTO DATA(item_group).

      get_unique_summary_line(
        EXPORTING
          order         = item_group->order
          item_id       = item_group->item_id
        IMPORTING
          summary_line  = DATA(summary_line)
        CHANGING
          summary_table = summary ).

      LOOP AT GROUP item_group REFERENCE INTO DATA(item_line).
        perform_count_operation( summary_line = summary_line
                                 item_line    = item_line ).
      ENDLOOP.

      summary_line->update_status = item_model->handle_count(
        VALUE #( order_id   = summary_line->order
                 item_id    = summary_line->item_id
                 item_count = summary_line->total_count ) ).

    ENDLOOP.

    DELETE summary WHERE total_count = 0.
Now, I really like this better. Imagine starting from a more complex scenario instead, where the lines in the first loop are counted by the hundreds… And this is not at all related to loops specifically. The majority of procedural-style programming can be chunked up using composition and modular concepts.
Going back to the abstraction I started talking about – we get the unique line here without really caring (at this specific place) about HOW that’s done. Simpler. Then we perform the count operations without really caring (at this specific place) about HOW that’s done. Simpler. Simpler is one of the reasons we want to use the composition and modular approach.
If we have a look at examples of the new methods, we see the added benefit of them being quite short and having one specific thing to do. This method ONLY makes sure we get a unique line to work with later, trying to read the table and creating a new line if none is found. Simpler:
    METHOD get_unique_summary_line.
      READ TABLE summary_table
        WITH KEY order   = order
                 item_id = item_id
        REFERENCE INTO summary_line.
      IF sy-subrc NE 0.
        INSERT VALUE #( order       = order
                        item_id     = item_id
                        description = get_item_description( item_id ) )
          INTO TABLE summary_table REFERENCE INTO summary_line.
      ENDIF.
    ENDMETHOD.
Here we have also moved the SELECT SINGLE for the description text into a method of its own. Hopefully it’s no longer a SELECT SINGLE inside the loop (because we don’t want to do that), but a read on an internal table with all the descriptions already fetched. This just shows that the principle can be applied on as many levels as you deem appropriate.
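One way the "already fetched" variant could look is sketched below. Everything here is an assumption made for illustration: the class name lcl_item_texts, its types, and the idea of handing the prefetched descriptions to the constructor are not part of the original example.

```abap
CLASS lcl_item_texts DEFINITION.
  PUBLIC SECTION.
    TYPES: BEGIN OF item_text,
             id          TYPE c LENGTH 10,
             description TYPE string,
           END OF item_text,
           item_texts TYPE HASHED TABLE OF item_text WITH UNIQUE KEY id.

    " The caller fetches all descriptions once and hands them over
    METHODS constructor
      IMPORTING initial_texts TYPE item_texts.

    METHODS get_item_description
      IMPORTING item_id            TYPE item_text-id
      RETURNING VALUE(description) TYPE string.

  PRIVATE SECTION.
    DATA texts TYPE item_texts.
ENDCLASS.

CLASS lcl_item_texts IMPLEMENTATION.
  METHOD constructor.
    texts = initial_texts.
  ENDMETHOD.

  METHOD get_item_description.
    " Key read on the prefetched table, no database access inside the loop
    READ TABLE texts WITH TABLE KEY id = item_id
         REFERENCE INTO DATA(text_line).
    IF sy-subrc = 0.
      description = text_line->description.
    ENDIF.
    " Unknown IDs simply return an empty description in this sketch
  ENDMETHOD.
ENDCLASS.
```

The table passed to the constructor would typically be filled with a single SELECT on the item master data before the loop starts, so the loop itself only performs key reads in memory.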
The calculation method might look something like this. It ONLY handles the calculations. It does one thing and it does it well. HERE we care about how it’s done. Simpler:
    METHOD perform_count_operation.
      CASE item_line->operation.
        WHEN zcl_constants=>item_operations-add.
          summary_line->total_count = summary_line->total_count + item_line->count.
        WHEN zcl_constants=>item_operations-sub.
          summary_line->total_count = summary_line->total_count - item_line->count.
        WHEN zcl_constants=>item_operations-reset.
          CLEAR summary_line->total_count.
      ENDCASE.
    ENDMETHOD.
Another bonus is that this is now testable code. A big loop with many things happening is hard to create automatic tests for, because it needs to be run as a whole. But these smaller methods, each doing one thing, can easily have automatic unit tests written for them, testing that one thing in many ways.
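As a hedged illustration of what such a test could look like with ABAP Unit, here is a self-contained sketch. The local class lcl_item_summary, its simplified types and the literal operation values 'add', 'sub' and 'reset' are stand-ins invented for this sketch; the real method would use zcl_constants and the actual line types.

```abap
CLASS lcl_item_summary DEFINITION.
  PUBLIC SECTION.
    " Simplified stand-ins for the real item and summary line types
    TYPES: BEGIN OF item_type,
             operation TYPE string,
             count     TYPE i,
           END OF item_type,
           BEGIN OF summary_type,
             total_count TYPE i,
           END OF summary_type.
    METHODS perform_count_operation
      IMPORTING summary_line TYPE REF TO summary_type
                item_line    TYPE REF TO item_type.
ENDCLASS.

CLASS lcl_item_summary IMPLEMENTATION.
  METHOD perform_count_operation.
    CASE item_line->operation.
      WHEN 'add'.
        summary_line->total_count = summary_line->total_count + item_line->count.
      WHEN 'sub'.
        summary_line->total_count = summary_line->total_count - item_line->count.
      WHEN 'reset'.
        CLEAR summary_line->total_count.
    ENDCASE.
  ENDMETHOD.
ENDCLASS.

CLASS ltc_count_operation DEFINITION FINAL FOR TESTING
  DURATION SHORT RISK LEVEL HARMLESS.
  PRIVATE SECTION.
    DATA cut TYPE REF TO lcl_item_summary.
    METHODS setup.
    METHODS sub_decreases_total FOR TESTING.
    METHODS reset_clears_total FOR TESTING.
ENDCLASS.

CLASS ltc_count_operation IMPLEMENTATION.
  METHOD setup.
    " Fresh instance of the class under test before every test method
    cut = NEW #( ).
  ENDMETHOD.

  METHOD sub_decreases_total.
    DATA(summary_line) = NEW lcl_item_summary=>summary_type( total_count = 5 ).
    DATA(item_line) = NEW lcl_item_summary=>item_type( operation = 'sub' count = 2 ).

    cut->perform_count_operation( summary_line = summary_line
                                  item_line    = item_line ).

    cl_abap_unit_assert=>assert_equals( act = summary_line->total_count
                                        exp = 3 ).
  ENDMETHOD.

  METHOD reset_clears_total.
    DATA(summary_line) = NEW lcl_item_summary=>summary_type( total_count = 5 ).
    DATA(item_line) = NEW lcl_item_summary=>item_type( operation = 'reset' count = 9 ).

    cut->perform_count_operation( summary_line = summary_line
                                  item_line    = item_line ).

    cl_abap_unit_assert=>assert_equals( act = summary_line->total_count
                                        exp = 0 ).
  ENDMETHOD.
ENDCLASS.
```

Each test method sets up one small situation and checks one outcome, which is only practical because the calculation no longer sits in the middle of the loop next to the database read and the BAPI call.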
Approaching from another direction
Another way of doing this is to start from the requirements, like “We need this background job running each night, that gathers a list of materials and summarizes their goods movements over a rolling 7-day period. If some condition applies, a new purchase order should be created. And of course the log has to be emailed to somebody in Excel format”.
Ok, to do this we basically need to:
- Get the search criteria parameters
- Find the required data based on it
- Create purchase orders where needed
- Generate Excel file output
- Send Excel file in email
And you know what… That means that we basically need to implement these…
    METHOD get_search_criteria_parameters.
    ENDMETHOD.

    METHOD get_data.
    ENDMETHOD.

    METHOD create_purchase_orders.
    ENDMETHOD.

    METHOD generate_excel_file.
    ENDMETHOD.

    METHOD send_excel_file.
    ENDMETHOD.
I’m somewhat oversimplifying but I hope you get the idea… And inside of those methods we do it again, could be for instance:
    METHOD create_purchase_orders.
      LOOP AT goods_movement_data REFERENCE INTO DATA(goods_with_some_condition)
           WHERE some_condition = 'applies'.
        create_purchase_order( goods_with_some_condition ).
      ENDLOOP.
    ENDMETHOD.
Somewhere the “actual code” obviously still has to be written… 🙂
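To show how the pieces could hang together at the top level, here is a small sketch of a job class whose entry method reads almost like the requirement list. The class name lcl_nightly_job and the run method are assumptions added for illustration, and the step methods are left empty and without parameters on purpose.

```abap
CLASS lcl_nightly_job DEFINITION.
  PUBLIC SECTION.
    " Hypothetical entry point, e.g. called from the background job report
    METHODS run.
  PRIVATE SECTION.
    METHODS get_search_criteria_parameters.
    METHODS get_data.
    METHODS create_purchase_orders.
    METHODS generate_excel_file.
    METHODS send_excel_file.
ENDCLASS.

CLASS lcl_nightly_job IMPLEMENTATION.
  METHOD run.
    " One call per step of the requirement, in order
    get_search_criteria_parameters( ).
    get_data( ).
    create_purchase_orders( ).
    generate_excel_file( ).
    send_excel_file( ).
  ENDMETHOD.

  METHOD get_search_criteria_parameters.
  ENDMETHOD.

  METHOD get_data.
  ENDMETHOD.

  METHOD create_purchase_orders.
  ENDMETHOD.

  METHOD generate_excel_file.
  ENDMETHOD.

  METHOD send_excel_file.
  ENDMETHOD.
ENDCLASS.
```

In a real implementation the steps would pass data to each other through parameters or attributes; the point here is only that the run method stays readable for someone who knows the requirement but not the code.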
Just don’t overdo it…
It’s also important to mention that everything can be taken too far. If you are developing something where really high performance is required over a big data set, the overhead of very many method or function calls can eventually add up to a noticeable performance impact. At some point a choice must then be made between performance and a clean and pretty codebase. Know what you do, understand the techniques and concepts, and do the right thing.
Simple isn’t easy… Chunk it up!