Parallel ABAP Objects
Parallel Processing
Note: This page uses syntax introduced with ABAP release 7.40. For details, see the excellent blog by Horst Keller here on SCN:
ABAP Language News for Release 7.40
Standard SAP offers several ways to achieve parallel processing. However, each has drawbacks that make it annoying to work with and usually not worth the effort for everyday use. This topic has been a constant irritation for me, especially coming from other languages where parallel processing is commonplace. Asynchronously iterating through loops of slow processes is a serious pain, even more so when you care about the results or the completion of those processes, as in a fork / join style process.
The following are the standard ways of doing this:
- Call a remote-enabled function module using the STARTING NEW TASK addition (dialog work process)
- Call a remote-enabled function module using the IN BACKGROUND TASK addition (dialog work process)
- Submit a report to run as a background job via the JOB_OPEN and JOB_CLOSE function modules (background work process)
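To give a feel for the amount of code the third option involves, here is a minimal sketch of scheduling a report as an immediate background job with standard function modules. The report name ZDEMO_REPORT and the job name Z_DEMO_JOB are hypothetical placeholders, and the error handling is reduced to the bare minimum.

```abap
REPORT zdemo_submit_job.

DATA: lv_jobname  TYPE btcjob VALUE 'Z_DEMO_JOB',  " hypothetical job name
      lv_jobcount TYPE btcjobcnt.

" 1) Open the job and receive its job count
CALL FUNCTION 'JOB_OPEN'
  EXPORTING
    jobname          = lv_jobname
  IMPORTING
    jobcount         = lv_jobcount
  EXCEPTIONS
    cant_create_job  = 1
    invalid_job_data = 2
    jobname_missing  = 3
    OTHERS           = 4.
IF sy-subrc <> 0.
  RETURN.
ENDIF.

" 2) Add the report as a job step (ZDEMO_REPORT is a placeholder)
SUBMIT zdemo_report VIA JOB lv_jobname NUMBER lv_jobcount AND RETURN.

" 3) Close the job and release it for immediate start
CALL FUNCTION 'JOB_CLOSE'
  EXPORTING
    jobcount  = lv_jobcount
    jobname   = lv_jobname
    strtimmed = abap_true
  EXCEPTIONS
    OTHERS    = 1.
IF sy-subrc <> 0.
  " The job could not be released; handle as appropriate
  RETURN.
ENDIF.
```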
Issues / Justification
Calling function modules with STARTING NEW TASK gives you no error handling for free; you have to build it yourself. A common error occurs when there are insufficient resources to start a new task. The standard documentation explains this in detail, but the summary is that there must be at least 2 work processes available.
When calling STARTING NEW TASK, the only way to know that your task has completed is the CALLING method | PERFORMING subroutine addition at the end of the statement, and that has drawbacks of its own. The callback cannot be invoked once the method that issued the call has ended: if the calling method or program ends, NO callback will occur. This means you must manually WAIT UNTIL either a timeout elapses or a variable set inside the callback method is flagged. That is inconvenient, because we don't want to hand-code these waits every time we need this functionality, and we gain little if execution of the current method or program ends. Good OO coding practice means the calling method is usually simple and short-running, so we would rather store a result and retrieve it later, or have the callback always occur no matter what.
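For comparison, the standard pattern described above looks roughly like the following sketch. The function module Z_SLOW_WORK is a hypothetical remote-enabled function module without parameters; everything else is standard ABAP. Note how the caller has to check for resource errors itself and then block in a WAIT until the callback has set a flag.

```abap
REPORT zdemo_new_task.

CLASS lcl_caller DEFINITION.
  PUBLIC SECTION.
    METHODS run.
    " Callback methods must be public with exactly this parameter
    METHODS on_done IMPORTING p_task TYPE clike.
  PRIVATE SECTION.
    DATA mv_done TYPE abap_bool.
ENDCLASS.

CLASS lcl_caller IMPLEMENTATION.
  METHOD run.
    DATA lv_msg TYPE c LENGTH 80.

    CALL FUNCTION 'Z_SLOW_WORK'          " hypothetical RFC-enabled FM
      STARTING NEW TASK 'TASK1'
      DESTINATION IN GROUP DEFAULT
      CALLING on_done ON END OF TASK
      EXCEPTIONS
        communication_failure = 1 MESSAGE lv_msg
        system_failure        = 2 MESSAGE lv_msg
        resource_failure      = 3.
    IF sy-subrc <> 0.
      " No free work process, or the RFC could not be started
      RETURN.
    ENDIF.

    " Without this wait the callback is lost when the method ends
    WAIT UNTIL mv_done = abap_true UP TO 30 SECONDS.
  ENDMETHOD.

  METHOD on_done.
    " Pick up the outcome of the task, then flag completion
    RECEIVE RESULTS FROM FUNCTION 'Z_SLOW_WORK'
      EXCEPTIONS
        communication_failure = 1
        system_failure        = 2.
    mv_done = abap_true.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  NEW lcl_caller( )->run( ).
```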
In all of the above, the amount of code required just to implement the solution is irritating and not exactly enjoyable to work with. On this page I will discuss a new solution that uses objects: developers implement a RUNNABLE or CALLABLE interface, and the implementing object is then executed by custom parallel processing objects.
Important Notes to Keep in Mind
The following objects were custom built by me to achieve the outcomes mentioned in the section above. You should not attempt to change the code yourself without discussing it with me. Using these objects is not automatically efficient just because they process in parallel. You, the developer, should consider whether your process requires this at all, and follow the recommendations about which object to use for your scenario. The following technical considerations must be thought through:
- Merely waiting for an RFC connection takes approximately 7.6ms in my current tests, and this varies with load. It is minimal overhead even in a busy system, but it is still overhead and should be considered.
- Serialization is used to allow an object-oriented solution. Depending on the size of your runnable object, it takes at least 0.5 – 1.5ms to serialize or deserialize the objects internally.
- Any object variables inside your runnable that must be passed to the task must be serializable (they must implement IF_SERIALIZABLE_OBJECT), and any object reference variables inside those must equally be serializable. It is therefore best to call an internal method from the RUN or CALL method and initialize any object member variables there.
- When using the ZCL_PRC_[MANAGED|FUTURE]_TASK objects, shared memory is used. Efficiency decreases with load, as these tasks only use their originating application server and must compete for access to the shared singleton monitors.
- System profile settings impose limits to ensure there are not too many open RFC connections.
- Standard work process switching limitations apply. Executing any of these objects either causes an implicit commit or, in the case of the background job, calls COMMIT WORK explicitly. For this reason each task should encapsulate one logical unit of work (LUW). These objects can never be used to iterate over an open DB cursor, as the cursor is lost when switching.
Overview
Currently there are 4 objects that can be used as part of this framework. All of the objects exist in the ZPRC_OBJECTS package and the following are the key objects which implement the parallel processing.
——————————————————————————–
- ZCL_PRC_ISOLATED_TASK – This class allows the execution of a RUNNABLE interface via the START method. It starts a new task and executes on the DEFAULT RFC destination group. This means it can be executed on any of the available application servers in the DEFAULT group, which in our production landscape effectively means any of 6 available application servers. There is no easy way to know when this task has completed, hence it is an isolated task. The processing happens in a DIALOG work process.
——————————————————————————–
- ZCL_PRC_MANAGED_TASK – This class allows the execution of a RUNNABLE interface via the START method. It starts a new task and executes on the NONE RFC destination. This means it will launch on the same application server as the one it was started on. The task is managed via shared memory (which is application server specific) and you can use a ZCL_PRC_BARRIER=>WAIT call to determine when it is complete. You can also control how many simultaneous connections are allowed for this task. The processing happens in a DIALOG work process.
——————————————————————————–
- ZCL_PRC_FUTURE_TASK – This class allows the execution of a CALLABLE interface via the START method. It starts a new task and executes on the NONE RFC destination. The task is managed via shared memory, and when the task completes its results are stored so that they can be retrieved at any point in the future. You use the ZCL_PRC_FUTURE_EXECUTOR to SUBMIT a callable object and use the resulting future to GET the result. The processing happens in a DIALOG work process.
——————————————————————————–
- ZCL_PRC_BACKGROUND_JOB – This class allows the execution of a RUNNABLE interface via the START method. It creates a new background job which executes on any available application server. Subclasses can be set up to run on a specified job server group, such as ZCL_ISD_BACKGROUND_JOB, which runs on the BTC_DEDICATED group. The authorisation object S_BTCH_JOB with action ‘RELE’ is required to launch these successfully. The processing happens in a BACKGROUND work process.
——————————————————————————–
The following image shows the simplicity of executing each type
Implementing the Runnable Interface
The runnable interface, ZIF_PRC_RUNNABLE, must be implemented by any object that you want to run as a background job or as an isolated / managed task. The interface has just one method, the RUN method. This method is executed when the parallel process is started.
It is not possible to pass parameters to RUN. Instead, construct your object and set parameters on the constructed instance; the RUN method can then use those instance attributes. Because of this, it is usually a good idea to have RUN call a method that you could also call on a normally instantiated object, which lets you execute the same code in parallel or in serial as needed. Remember, however, that any object references in the class's member variables must implement the IF_SERIALIZABLE_OBJECT interface, or they will be nullified during serialization prior to executing on the parallel process.
As literally any code is valid in the RUN method, there should be no need for me to display an example of this. However, remember the shown class, as it will be used in the following examples.
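As a rough illustration of the advice above, a runnable could look like the following sketch. The class name ZCL_DEMO_RUNNABLE, its constructor parameter and the PROCESS method are hypothetical; only ZIF_PRC_RUNNABLE, its RUN method and IF_SERIALIZABLE_OBJECT come from the text. Listing IF_SERIALIZABLE_OBJECT explicitly is an assumption made for safety, in case the runnable interface does not already include it.

```abap
" Hypothetical runnable: parameters go in via the constructor,
" RUN only delegates to a method that can also be called serially.
CLASS zcl_demo_runnable DEFINITION PUBLIC FINAL CREATE PUBLIC.
  PUBLIC SECTION.
    INTERFACES if_serializable_object.
    INTERFACES zif_prc_runnable.
    METHODS constructor IMPORTING iv_count TYPE i.
    METHODS process.
  PRIVATE SECTION.
    DATA mv_count TYPE i.   " plain data survives serialization
ENDCLASS.

CLASS zcl_demo_runnable IMPLEMENTATION.
  METHOD constructor.
    mv_count = iv_count.
  ENDMETHOD.

  METHOD zif_prc_runnable~run.
    " Executed in the parallel process
    process( ).
  ENDMETHOD.

  METHOD process.
    " Create any non-serializable helper objects here, not in the
    " constructor, so they exist on the side that does the work.
    DO mv_count TIMES.
      " one slow unit of work per iteration
    ENDDO.
  ENDMETHOD.
ENDCLASS.
```

Calling `NEW zcl_demo_runnable( 5 )->process( )` runs the same logic serially, which is handy for testing.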
Implementing the Callable Interface
The callable interface, ZIF_PRC_CALLABLE, must be implemented by any object that you want to run as a future task. The interface has the methods shown below. The CALL method is called when the callable object is run.
The same restrictions and considerations apply to the callable as to the runnable above. The return type is specified as a literal and must match a legitimate ABAP type, dictionary type, or object. In this arbitrary example a string is the result type.
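The sketch below follows the later version of the interface described in the comment thread further down, where CALL returns a data reference in RV_RESULT and no type literal is needed. The class name ZCL_DEMO_CALLABLE is hypothetical, and if your version of ZIF_PRC_CALLABLE declares further methods, they would need implementations as well.

```abap
" Hypothetical callable returning a string via a data reference
CLASS zcl_demo_callable DEFINITION PUBLIC FINAL CREATE PUBLIC.
  PUBLIC SECTION.
    INTERFACES if_serializable_object.
    INTERFACES zif_prc_callable.
ENDCLASS.

CLASS zcl_demo_callable IMPLEMENTATION.
  METHOD zif_prc_callable~call.
    FIELD-SYMBOLS <lv_value> TYPE any.

    " Create the result on the heap so it outlives this method;
    " do not return a reference to a local variable.
    CREATE DATA rv_result TYPE string.
    ASSIGN rv_result->* TO <lv_value>.
    <lv_value> = `finished`.
  ENDMETHOD.
ENDCLASS.
```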
Isolated Task
The class ZCL_PRC_ISOLATED_TASK is the simplest example of parallel processing. The following example constructs the runnable and the task to process the runnable, then starts the task.
As shown previously, this can be simplified with the following (thanks to the 7.4 release changes)
A delay of up to 45s can be used. Keep it as short as possible, though: the RFC connection for the waiting task remains open even though the process is not running, and only a limited number of connections are available.
Note: a literal is used for the delay because it allows the WAIT UP TO statement to handle fractions of a second. However, each wait call causes a work process switch, which is additional overhead, and wait periods of less than 1 second are generally not recommended.
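A minimal sketch of both forms follows. It assumes that the task constructor accepts the runnable as its single positional parameter, which the text implies but does not spell out; check the class signature in your system. ZCL_DEMO_RUNNABLE stands for any class of your own that implements ZIF_PRC_RUNNABLE.

```abap
" Long form: construct the runnable, wrap it in a task, start it.
DATA lo_runnable TYPE REF TO zif_prc_runnable.
DATA lo_task     TYPE REF TO zcl_prc_isolated_task.

lo_runnable = NEW zcl_demo_runnable( 5 ).   " hypothetical runnable
lo_task     = NEW #( lo_runnable ).         " assumed constructor signature
lo_task->start( ).

" Short form using 7.40 constructor expressions and method chaining
NEW zcl_prc_isolated_task( NEW zcl_demo_runnable( 5 ) )->start( ).
```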
Managed Task
At first glance, the class ZCL_PRC_MANAGED_TASK appears the same as the isolated task. The basic usage is literally identical to that of the isolated task.
However, there are additional features; for example, the maximum number of work processes can be set for a user.
Furthermore, we can wait for the completion of these processes by using a barrier
Here is a full example
Future Task
The class ZCL_PRC_FUTURE_TASK cannot be constructed directly; it must be created by submitting an object which implements the callable interface as shown above. The ZCL_PRC_FUTURE_EXECUTOR is used for this purpose, and its primary use is to obtain a result which can be retrieved at some point in the future. The simple example below illustrates this in long form.
Now things are starting to get interesting!
Again we can do this in one line thanks to 7.4
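The calls below mirror the usage confirmed in the comment thread further down (SUBMIT on the executor, GET_RESULT with an IMPORTING parameter RESULT on the future). ZCL_DEMO_INT_CALLABLE is a hypothetical callable of your own whose CALL method returns an integer; the receiving variable must have the same type as the value the callable produces.

```abap
DATA lv_result TYPE i.

" Long form: submit the callable, keep the future, fetch the result later
DATA(lo_executor) = NEW zcl_prc_future_executor( ).
DATA(lo_future)   = lo_executor->submit( NEW zcl_demo_int_callable( ) ).

" ... other work can happen here while the task runs ...

lo_future->get_result( IMPORTING result = lv_result ).

" One-line form using method chaining
NEW zcl_prc_future_executor( )->submit(
  NEW zcl_demo_int_callable( ) )->get_result( IMPORTING result = lv_result ).
```

Passing a variable of the wrong type to GET_RESULT leads to a ZCX_PRC_RETURN_TYPE_MISMATCH exception, as discussed in the comments.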
This strategy can be used to dispatch pieces of work in a loop where we want to put the results back together in a specific order. Imagine, for instance, a scenario where the order of the results matters to whatever consumes them afterwards. We can still process the pieces in parallel as long as we put them back together in order. The following example simply shows how to process and then rebuild the order; it is a meaningless example, but the concept can be used to process complex, time-consuming units of work. The example simply multiplies the input index by 10.
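Using a plain internal table instead of a collection object, the idea can be sketched as follows. The class ZCL_DEMO_TIMES_TEN is hypothetical, and the sketch assumes that SUBMIT returns a reference to ZCL_PRC_FUTURE_TASK and that GET_RESULT waits until the task has finished. Because the futures are read back in the order they were submitted, the results end up in input order regardless of which task completes first.

```abap
" Hypothetical callable: returns its input index multiplied by 10
CLASS zcl_demo_times_ten DEFINITION PUBLIC FINAL CREATE PUBLIC.
  PUBLIC SECTION.
    INTERFACES if_serializable_object.
    INTERFACES zif_prc_callable.
    METHODS constructor IMPORTING iv_index TYPE i.
  PRIVATE SECTION.
    DATA mv_index TYPE i.
ENDCLASS.

CLASS zcl_demo_times_ten IMPLEMENTATION.
  METHOD constructor.
    mv_index = iv_index.
  ENDMETHOD.

  METHOD zif_prc_callable~call.
    FIELD-SYMBOLS <lv_value> TYPE any.
    CREATE DATA rv_result TYPE i.
    ASSIGN rv_result->* TO <lv_value>.
    <lv_value> = mv_index * 10.
  ENDMETHOD.
ENDCLASS.
```

```abap
" Calling program: fork in one loop, join in a second loop
DATA lt_futures TYPE STANDARD TABLE OF REF TO zcl_prc_future_task
                WITH DEFAULT KEY.          " assumed return type of SUBMIT
DATA lt_results TYPE STANDARD TABLE OF i WITH DEFAULT KEY.
DATA lv_result  TYPE i.

DATA(lo_executor) = NEW zcl_prc_future_executor( ).

" Fork: dispatch all units of work without waiting
DO 5 TIMES.
  DATA(lo_submitted) = lo_executor->submit( NEW zcl_demo_times_ten( sy-index ) ).
  APPEND lo_submitted TO lt_futures.
ENDDO.

" Join: collect the results in submission order
LOOP AT lt_futures INTO DATA(lo_future).
  lo_future->get_result( IMPORTING result = lv_result ).
  APPEND lv_result TO lt_results.
ENDLOOP.
```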
Note: the above example does use an internal collection object, ZCL_ISD_TABLELIST and since I don’t wish to do another example you would need to use a different collection. You could use ZCL_PRC_COLLECTION provided if you wish noting that the syntax will vary as it does not use a next( ) method and instead… or of course an internal table
Background Job
Note: This class requires the S_BTCH_JOB authorisation with an action of ‘RELE’ otherwise the jobs will be stuck scheduled. The unfortunate restriction is forced by the standard functions for submitting jobs. Check the following link for more details
http://help.sap.com/saphelp_nw04/helpdata/EN/5f/ff2138faeb3807e10000009b38f889/content.htm
The class ZCL_PRC_BACKGROUND_JOB allows for the dynamic scheduling of a background job which will run in a background work process. This technique allows work to be processed in parallel background jobs. Again, the simplest example should look familiar; it creates a generic background job called Z_PRC_BACKGROUND_JOB which runs immediately.
The background job is reasonably flexible and can have a specific schedule. Note that a job can never be re-run once it has finished, whether it completed successfully, failed, or otherwise. These jobs run one time only, ever. The following example shows a job with a custom job name, scheduled to run 60 seconds after the current time
Important Note!
If you are planning to use the above background job and you want it to be on a specific Job Server Group, then you should create a subclass of the above and set the mv_group variable in your constructor.
Files / Install / Setup
Firstly, you will need the files, which can be found in a .nugg file at the link below. If you do not know how to use a .nugg file, search for that and look for the program ZSAPLINK to import the .nugg
You can find the most recent nugget at the following location:
https://github.com/lessonteacher/parallel_objects
After installation you will need to create the following 2 shared memory areas in transaction SHMA, exactly as shown. You MUST have imported the nugget prior to this step. These shared memory areas are used for the managed tasks.
ZCL_PRC_MONITOR_AREA
ZCL_PRC_EXECUTOR_AREA
When using any of these objects, it is your responsibility as a developer to use them where it makes sense and to consider all of the points listed in this document. It is also important to handle all error scenarios and exceptions correctly. Otherwise the errors will generally originate from the PRC objects, as they throw PRC exceptions in some scenarios to allow some cleanup to occur (for the managed objects).
Enjoy!
Hugo
Thanks for this nice blog!
It's an interesting topic for sure
🙂
Hi,
Great blog. I have a question regarding an FM I found referenced in your code that is not available in your .nugg config. In class zcl_prc_managed_task there is a local class lcl_processor with a method dispatch( ). It calls the FM Z_PRC_RUN_MANAGED_TASK, which, when I double-click it, turns out not to have been imported, or was not in the .nugg file. Can you check this please?
Many thanks
smm
Much needed blog and great detail, the author has done a lot of work and many ideas can be taken from this.
Big thanks!
Hi,
Thanks for the comments.
Sorry, I have been away over the end-of-year period. Unfortunately, it seems I no longer have access to upload my files to the same place, so I might have to come up with an alternative solution.
If you need the FM in the interim, I can provide it until I can adjust the .nugg file. It is a bit annoying to get the package right since all the objects exist in my system; I wish I had a better strategy there.
Hi Hugo,
nice work and well explained in this article!
I agree with you, that parallel processing in SAP is not the same as in other languages. The use of RFC-FMs or background jobs is cumbersome. A concept of lightweight and easy to use threads is missing.
But even with all the overhead of full processes you can benefit greatly from parallel processing - if only it would be easy to use. And that is where your solution looks promising.
I've found the FMs in the Nugget. Don't know, why they did not get imported. Created them manually.
Many thanks,
Bernhard
Hi,
did you find another way to upload an updated nugget file?
I'm very interested in your development and would like to try it out on our system.
Thanks and Regards
Paul
Hi Paul,
I have added the most recent .nugg file to my random github.
https://github.com/lessonteacher/parallel_objects
Notes:
Post-edit: yes, I rechecked the nugg, and the ZIF_PRC_CALLABLE interface now returns a data ref and no longer requires the developer to specify whether the type is an object, what type it is, etc. The shortfall is that you must know to create a data ref to return, rather than returning a ref to a local value, which will lose scope when the method ends. E.g. you must create the data ref, assign ref->* of it to a field symbol and set its value; don't try to get a reference of a local variable into the result, as that will get cleaned up and unassigned when the scope is exited. Sadly I created this gotcha and it's a bit regretful...
Cheers,
Hugo
Hi Hugo,
thanks for the fast reply. After some time and work, I got the system to accept your code.
I tried the isolated tasks and it works fine.
But I’m struggling a little bit with the managed tasks. I programmed a simple class, which multiplies 5 by itself.
My first tries worked fine, but after a test I got a dump. Nothing serious (bug in my implementation).
After that I tried running the program again, but it never ended. After some debugging I discovered that ZCL_PRC_MONITOR_ACCESS->MT_REGISTER still got some entries from before:
I found the method ZCL_PRC_MONITOR_ACCESS->GARBAGE_COLLECT but this one doesn’t work, because for some reason there is no wp_id. I cleaned it up via debugger.
I’m not sure if there is a failsafe in your program that terminates this kind of constellation after a certain amount of time.
On another topic, the restriction on the maximum number of processes is great, but you shouldn’t use the user name as a criterion, because in our (and many other) production systems one user normally executes all jobs.
Or does this restriction only apply per user in one dialog/background session?
All in all a great piece of work. Thanks for sharing!
By the way: Is this code under some license (gpl or some other open source license?) and free to use?
If not, I think you should make it open source and let others (I would) contribute.
Regards,
Paul
Hi Paul,
Thanks for your comments and questions. Glad that you managed to get it working there.
For the ManagedTask, I would say that it's probably not quite as useful as the FutureTask. As you noted, it uses the user to basically attempt to manage the resources.
It turns out that if you do have an issue with an assignment, such as an unassigned field symbol, you can unfortunately end up in this situation where your user cannot run any more processes. I attempted to catch these in the calling function, but sadly there seems to be literally no way to handle such an error outside of where it occurs.
The garbage collect method you mentioned is indeed a fail-safe I tried to implement to deal with some of these registration issues, but if you are unlucky there are places where the wp id can be deregistered...
You are right, the username is not the greatest thing, but I didn't have many options. I believe I implemented an override, a table somewhere that you can configure with a username and their max to get around this situation... I searched in the nugg, and I think there is a table called ZPRCC_WP_CONFIG to allow this configuration per user.
Regarding the license, it is open source; I attached a license to the repo at github... Basically, the fact that it's in a .nugg, so the code can't actually be managed directly, is the reason I never bothered to do much with this stuff over on github or bitbucket regarding changes and potential pull requests from anyone who would want to contribute... But it turns out it's still good even just for these files, so I have taken to pushing them up there as-is. The same goes for the BoxedData framework, which I may actually update soon to put the 7.4+ version up instead of the nerfed one that is available currently.
Hi Hugo,
I'm struggling a bit with the future task; maybe you can give me some pointers.
My calling program:
data(lr_excecutor) = new zcl_prc_future_executor( ).
data(lr_future) = lr_excecutor->submit(
ir_callable = new zig_parallel_future_task( ) ).
data:
lr_data type ref to data.
lr_future->get_result(
importing
result = lr_data ).
The interface implementation:
method zif_prc_callable~call.
data: lv_int type i.
field-symbols:
<ls_blub> type any.
create data rv_result like lv_int.
assign rv_result->* to <ls_blub>.
Move 5 to <ls_blub>.
endmethod.
If I run this code, I get a ZCX_PRC_RETURN_TYPE_MISMATCH exception from zcl_prc_serialization_util=>deserialize_result.
Am I using the syntax wrongly, or am I missing a transformation? Could you give me a short sample implementation?
Thanks and regards,
Paul
This is indeed a good example of the power of ABAP 7.4. Your project is very cool, Hugo 😀
Warm regards,
Raphael Pacheco.
Hey Paul,
If you get this exception, it is an indication that you passed the wrong type into the get_result( ) method of the future task.
Your code for the implementation is actually correct. The only issue is in your calling program, where you are passing the data ref:
lr_future->get_result( importing result = lr_data ).
Only the implementation of the callable needs the annoying data ref stuff. From the calling program the idea is that you can pass whatever you like, so here you need to pass a variable of type 'i', as that is the type you are going to return. It's not that obvious, and it is the regretful feature I mentioned previously.
Either way, just change your calling program like this:
data(lr_excecutor) = new zcl_prc_future_executor( ).
data(lr_future) = lr_excecutor->submit( new zig_parallel_future_task( ) ).
data: lv_result type i.
lr_future->get_result( importing result = lv_result ).
This should solve the issue.
It works, thanks a lot!