TomVanDoo

You may have been in a situation where you have to process multiple objects. When you do this sequentially, it takes a while before that processing finishes. One solution could be to schedule your program in the background and just let it run there.

But what if you don't have the luxury of scheduling your report in the background? Or what if the sheer number of objects is so large that processing all of them would take a day, with the risk of overrunning your nightly timeframe and impacting the daily work?

Multi Threading

It would be better if you could actually launch multiple processing blocks at the same time. Each block could then process a single object and, when it finishes, release the slot so the next block can be launched.

That way, multiple objects get updated at the same time. Imagine 10 objects being processed at once rather than one object after the other: you could reduce the runtime to roughly 10% of the original report.

It's actually not that hard. If you create a remote-enabled function module containing the processing logic for one object, with the necessary parameters, you can simply launch it in a new task. That creates a new process (you can monitor it in transaction SM50) which ends as soon as your object is processed.
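
For illustration, such a remote-enabled function module could look roughly like this. ZUPDATE matches the pseudo-code below, but the IS_OBJECT parameter and its type ZS_OBJECT are made-up names; your own signature will differ.

function zupdate.
*    remote-enabled module (RFC supported)
*    importing
*         value(is_object) type zs_object  "RFC parameters are passed by value
     "... your update logic for one object goes here ...
     commit work. "each task runs in its own logical unit of work
endfunction.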

Here's a piece of pseudo-code to realise this principle.


data: lt_object type ztt_objects, "big *** table full of objects to update (type is illustrative)
      lv_task   type char12,      "each task needs a unique name
      lv_count  type i.
field-symbols: <line> like line of lt_object.

while lt_object[] is not initial.
     loop at lt_object assigning <line>.
          add 1 to lv_count.
          lv_task = |TASK{ lv_count }|.
          call function 'ZUPDATE' starting new task lv_task
               exporting
                    is_object = <line> "parameter name depends on your function module
               exceptions
                    system_failure        = 1
                    communication_failure = 2
                    others                = 9.
          if sy-subrc = 0.
               "the object is now being processed in its own task
               delete lt_object.
          endif.
     endloop.
endwhile.

Something like that.

Notice how there's a loop in a loop, to make sure we keep trying until every object has been handed off to a process. Once an object has been successfully launched, it is removed from the list.

Queue clog

But there's a catch with that approach. As long as the processing of an individual object doesn't take up too much time, and you have enough dialog processes available, things will work fine. As soon as a process ends, it's freed up to take on a new task.

But what if your processes are called upon faster than they finish? Within the blink of an eye, all your processes will be taken up and new tasks will be queued. That also means that nobody else can work on your system, because all dialog processes are being hogged by your program.

(Screenshot: notice how the queue is still launching processes, even after the main report has already ended.)

You do not want that to happen.

The first time that happened to me was on my very first assignment, where I had to migrate 200K Maintenance Notifications. I brought the development system to its knees on multiple occasions.

The solution back then was to double the number of dialog processes. A notification process then finished fast enough before the main report could schedule 19 new tasks, so the system never got overloaded.

Controlled Threading

So what you want is to control the number of threads that can be active at any given time. You want to be able to say that only 5 processes may be used, leaving 5 more for any other operations. (That means you could even launch these mass programs during the day!)

But how do you do that?

Well, you'll have to receive the result of each task, so you can keep a counter of active threads and prevent more from being spawned for as long as you don't want them to.

caller:


data: lt_object type ztt_objects, "big *** table full of objects to update
      lv_task   type char12,
      lv_count  type i.
field-symbols: <line> like line of lt_object.

while lt_object[] is not initial.
     loop at lt_object assigning <line>.
          add 1 to lv_count.
          lv_task = |TASK{ lv_count }|.
          call function 'ZUPDATE' starting new task lv_task
               calling me->receive_result on end of task
               exporting
                    is_object = <line>
               exceptions
                    system_failure        = 1
                    communication_failure = 2
                    others                = 9.
          if sy-subrc = 0.
               delete lt_object.
               add 1 to me->processes. "one more task in flight
          endif.
     endloop.
endwhile.

receiver:

method receive_result. "importing p_task type clike
     receive results from function 'ZUPDATE'.
     subtract 1 from me->processes. "one task finished, free up a slot
endmethod.
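
Since these snippets use me->, they assume the caller lives in a class. A minimal skeleton could look like this (lcl_runner and receive_result are made-up names); note that the callback method must have exactly one importing parameter p_task:

class lcl_runner definition.
     public section.
          methods: run, "contains the while/loop from the caller above
                   receive_result importing p_task type clike. "called on end of task
     private section.
          data: processes type i. "number of tasks currently in flight
endclass.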





This still just launches tasks as fast as possible, with no throttling. It just keeps the counter, but we still have to do something with that counter.

And here's the trick. There's a WAIT statement you can use to wait until the number of used processes drops below whatever limit you specify.

But this number is not updated right after a receive, unless your logical unit of work is updated. And that only happens after a commit, or a WAIT statement.

But wait, we already have a WAIT statement. Won't that update it?

Why yes, it will, but then it's updated after you've waited, which is pretty daft, because then you're still not quite sure whether the condition actually held.

So here's a trick to get around that.

caller:


data: lt_object type ztt_objects, "big *** table full of objects to update
      lv_task   type char12,
      lv_count  type i.
field-symbols: <line> like line of lt_object.

while lt_object[] is not initial.
     loop at lt_object assigning <line>.
          "throttle: don't spawn a new task while 5 are still running
          while me->processes >= 5.
               wait until me->processes < 5.
          endwhile.

          add 1 to lv_count.
          lv_task = |TASK{ lv_count }|.
          call function 'ZUPDATE' starting new task lv_task
               calling me->receive_result on end of task
               exporting
                    is_object = <line>
               exceptions
                    system_failure        = 1
                    communication_failure = 2
                    others                = 9.
          if sy-subrc = 0.
               delete lt_object.
               add 1 to me->processes.
          endif.
     endloop.
endwhile.

That'll keep the number of threads under control and still allow you to achieve massive performance improvements on mass processes!

Alternatives

Thanks to robin.vleeschhouwer for pointing out destination groups. By starting your RFC in a specific destination group, you let your system administrators control the number of processes in that group. The downside is that it's not as flexible as a parameter on your mass-processing report, and you have to run everything past your sysadmins.
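
For the curious, a rough sketch of that variant, reusing the names from the snippets above (the group name 'PARALLEL' is made up; the actual RFC server groups are maintained by your admins in transaction RZ12):

call function 'SPBT_INITIALIZE'
     exporting
          group_name = 'PARALLEL'
     exceptions
          others     = 1. "e.g. unknown group, no resources available

call function 'ZUPDATE' starting new task lv_task
     destination in group 'PARALLEL'
     calling me->receive_result on end of task
     exporting
          is_object = <line>
     exceptions
          system_failure        = 1
          communication_failure = 2
          resource_failure      = 3 "no free process in the group right now
          others                = 9.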

Another sweet addition came from shai.sinai, in the form of bgRFC. I have to admit that I had actually never even heard of it, so there's not much I can say at this point in time. Except that, skimming through the documentation, it looks like something pretty nifty.
