Cloud Integration – Using Parallel Processing in General and Iterating Splitter
This blog describes how to use the parallel processing option in a splitter scenario in SAP Cloud Integration, together with recommendations and important constraints for this configuration option.
Using Parallel Processing in General and Iterating Splitter
In many Cloud Integration scenarios, big messages are split into smaller parts using a splitter pattern. The smaller chunks are then processed separately. The splitter configuration offers an option to switch on parallel processing for the single splits. However, this option cannot or should not always be used, and it can even lead to unexpected problems. In this blog I describe the option and the important considerations for your integration flow configuration when using it.
Parallel Processing in Splitter
In the default configuration of the splitter step, the single splits are executed sequentially, one after the other. To improve the overall processing time of a splitter scenario, you can run the splits in parallel instead. To configure this, select the option Parallel Processing in the configuration of the Splitter. When using Parallel Processing, two additional settings have to be configured:
- A Timeout needs to be defined for the maximum processing time of the splits.
- For Splitter versions 1.5 and higher (available with the January 20, 2019 update), the Number of Concurrent Processes needs to be configured. The default is 10 threads, the same value that applied before the setting became configurable. Using this setting you can control the parallelism and the load on the receiver system.
The effect of the parallel processing at runtime is that the inbound request is split into multiple separate new exchanges that are then processed completely independently from each other.
The configured number of threads handles the parallel splits. If there are more splits than available threads, the next split is processed as soon as a thread becomes available. This means that the overall time for the split processing depends on the processing time of the single splits and on the number of splits to be executed.
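The relationship between split count, thread count and overall processing time can be illustrated with a small simulation. This is only a rough sketch, not how Cloud Integration schedules work internally: the function name estimate_total_time and the assumption that each split goes to whichever thread becomes free first are illustrative choices of mine.

```python
import heapq


def estimate_total_time(split_durations, concurrent_processes=10):
    """Estimate the wall-clock time for a list of split durations (seconds).

    Assumption: splits are taken in order, and each one is handed to the
    worker that becomes free first.
    """
    if concurrent_processes < 1:
        raise ValueError("concurrent_processes must be at least 1")
    # One entry per worker: the point in time at which it becomes free.
    worker_count = min(concurrent_processes, max(len(split_durations), 1))
    free_at = [0.0] * worker_count
    heapq.heapify(free_at)
    finish = 0.0
    for duration in split_durations:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + duration
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


# 25 splits of 2 seconds each: three "waves" on 10 workers.
print(estimate_total_time([2.0] * 25, concurrent_processes=10))  # 6.0
print(estimate_total_time([2.0] * 25, concurrent_processes=1))   # 50.0
```

With uniform splits the result is simply the number of waves multiplied by the split duration. With uneven durations, a few long-running splits can dominate the total.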
When using parallel processing in the splitter consider the following important aspects:
Is the Backend System able to Handle the Load?
If the splitter sends the single splits to a backend, you need to make sure that the backend can handle the expected number of parallel calls. Otherwise the requests may run into timeouts and the whole scenario may stop working. Configure the Number of Concurrent Processes accordingly.
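To get a feeling for how the number of concurrent processes caps the load on a receiver, the following sketch runs dummy "backend calls" on a fixed-size thread pool and records the highest number of calls in flight at the same time. It is a stand-alone illustration only: measure_peak_parallel_calls and call_backend are hypothetical names, and the sleep is a placeholder for a real backend round trip.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def measure_peak_parallel_calls(number_of_splits, concurrent_processes):
    """Run dummy backend calls and return the highest number seen in flight."""
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def call_backend(split_id):
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(0.02)  # placeholder for the real backend round trip
        with lock:
            in_flight -= 1
        return split_id

    # The pool size plays the role of the Number of Concurrent Processes.
    with ThreadPoolExecutor(max_workers=concurrent_processes) as pool:
        list(pool.map(call_backend, range(number_of_splits)))
    return peak


peak = measure_peak_parallel_calls(number_of_splits=40, concurrent_processes=4)
print(f"peak parallel backend calls: {peak}")
```

The peak never exceeds the pool size, no matter how many splits are queued. This is why the thread setting is the main lever for protecting a backend that only tolerates a limited number of simultaneous requests.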
Resource Consumption in Cloud Integration
If the splitter processes multiple splits in parallel, all of them use resources in the Cloud Integration tenant, such as memory, database connections and temporary storage for stream caching. No general recommendation is possible because the consumption depends heavily on the scenario and on the flow steps and features used for the splits. Carefully test your scenario with parallel processing and the expected message size and volume to identify issues with resource consumption. Depending on your scenario, activating parallel processing may not lead to the desired performance improvement, and it can cause severe issues in the scenario itself and even in other scenarios running on the tenant.
Timeout Ends Without Error
As already stated, if you switch on Parallel Processing in the General or Iterating Splitter, a Timeout field needs to be configured. This field defines the time after which the processing of the parallel splits ends at the latest and the next processing step, for example the Gather, is executed.
The Splitter interrupts the processing of the parallel splits after the configured timeout without an error and continues with the steps configured after the split. The timeout is therefore a very important setting and needs to be set high enough for all splits in your scenario to complete. Otherwise some splits may not be processed while the overall processing of the scenario continues with the next flow step. Depending on your scenario this could lead to data inconsistencies because not all splits are executed completely.
The recommendation is to test the scenario with the biggest expected messages under realistic conditions and check the execution time. Then define a timeout that fits the scenario.
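The "silent" nature of such a timeout can be reproduced outside Cloud Integration with a plain thread pool. The sketch below is an analogy under my own assumptions, not the Splitter implementation: run_splits_with_timeout is a hypothetical helper, slow splits are simulated by blocking on an event, and unlike the Splitter the sketch lets the unfinished splits run to completion afterwards instead of interrupting them.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait


def run_splits_with_timeout(slow_flags, timeout_seconds, workers=2):
    """Process splits in parallel and stop waiting after timeout_seconds.

    slow_flags[i] is True if split i takes longer than the timeout.
    Returns (completed_ids, unfinished_ids); no exception is raised.
    """
    release = threading.Event()  # keeps simulated slow splits busy

    def process(split_id, slow):
        if slow:
            release.wait()
        return split_id

    pool = ThreadPoolExecutor(max_workers=workers)
    futures = {
        pool.submit(process, i, slow): i for i, slow in enumerate(slow_flags)
    }
    # wait() simply returns when the timeout elapses; it raises no error.
    done, not_done = wait(futures, timeout=timeout_seconds)
    completed = sorted(futures[f] for f in done)
    unfinished = sorted(futures[f] for f in not_done)
    release.set()             # let the simulated slow splits end
    pool.shutdown(wait=True)  # clean up the worker threads
    return completed, unfinished


completed, unfinished = run_splits_with_timeout(
    [False, False, True, True], timeout_seconds=0.2
)
print("completed:", completed)
print("unfinished when the timeout hit:", unfinished)
```

The caller only learns about the unfinished splits because the sketch explicitly collects them. If the flow simply continues with the next step, nothing signals that part of the work is missing, which is exactly the risk described above.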
Parallel Splitter not Supported with Transactional Resources
A Splitter with parallel processing is not allowed together with transactional resources, for example data store flow steps or the JMS, XI or AS2 adapters. For details, see the blog How to configure transaction handling in Integration Flow.
For more configuration recommendations with respect to the Splitter also check out the following blogs: