How to avoid modeling errors in Netweaver BPM? Par...

Former Member · 01-19-2009

Preface

Industry and academia have come up with a number of languages and notations for business processes. The Business Process Modeling Notation (BPMN) has recently gained much momentum within the BPM community. In fact, BPMN is widely expected to become the "lingua franca" for business processes, and hence we have chosen BPMN as the process modeling notation in Netweaver BPM. For the most part, that's because it is easy to learn, comes with few modeling restrictions, captures all relevant workflow patterns and (also) aims at a non-technical audience. On the downside, the BPMN standard won't stop you from modeling deadlocks, non-terminating loops, and block violations, which may lead to idling process instances and other undesirable effects.

To help you avoid frequently observed mistakes before you actually run BPMN-based processes, we have collected some guidelines. In this progressively extended blog, we will present those guidelines and derive some recommendations from them.

Avoid mixing gateways

Unlike BPEL, BPMN does not enforce "block structured" process graphs where splitting and merging gateways are symmetrically paired up. Mixing gateways may lead to both deadlocks and token duplications - the example flow below has both issues:

Upon activation, the front "parallel split" (aka AND split) gateway produces two "tokens". These tokens are placed onto the upper and lower branch, respectively. Tokens both denote concurrent threads of control and act as markers for the execution progress within the process model. After "Activity 1a" and "Activity 1b" have completed, the corresponding tokens are forwarded to the inbound edges of the downstream "uncontrolled (simple) merge" (aka XOR merge) gateway, which simply passes both of them onto its outbound edge.

As a result, we have duplicated a single token and the downstream "Activity 2" is executed twice, which was most likely not intended. While token duplication cannot clearly be classified as erroneous, it is often undesirable and may even lead to non-deterministic behavior, as we will see further below.

Tokens that have passed through "Activity 2" are forwarded to the downstream "exclusive choice" (aka XOR split) gateway, which puts each token onto exactly one of its outbound edges. Which edge a token is put onto is decided by evaluating the conditions associated with the outbound edges. Interestingly, it now depends on those conditions whether or not we run into a deadlock. Mind that two tokens enter the XOR split. If it puts one token onto the upper and one onto the lower outbound edge, we are fine and won't run into a deadlock. This is because the downstream "parallel (synchronizing) join" (aka AND join) gateway expects a token on each of its inbound edges to fire (i.e., to consume these tokens and put one onto its outbound edge). However, if the XOR split puts both tokens onto the same outbound edge, it's "Houston, we have a deadlock!"
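The token flow described above can be played through in a few lines of Python. This is only a minimal sketch under stated assumptions: the function name `run_mixed_gateway_flow` and the `xor_choices` parameter are made up for illustration and are not part of Netweaver BPM; the list of choices simply stands in for whatever the edge conditions of the XOR split evaluate to.

```python
from collections import Counter


def run_mixed_gateway_flow(xor_choices):
    """Token game for: AND split -> XOR merge -> Activity 2 -> XOR split -> AND join.

    xor_choices lists the outbound edge ("upper" or "lower") that the XOR split
    picks for each arriving token (hypothetical stand-in for the edge conditions).
    """
    tokens_from_and_split = 2                        # one token per parallel branch
    tokens_after_xor_merge = tokens_from_and_split   # XOR merge forwards every token
    activity_2_runs = tokens_after_xor_merge         # Activity 2 runs once per token

    if len(xor_choices) != activity_2_runs:
        raise ValueError("need one routing decision per token reaching the XOR split")
    if any(choice not in ("upper", "lower") for choice in xor_choices):
        raise ValueError("choices must be 'upper' or 'lower'")

    waiting = Counter(xor_choices)                   # tokens in front of the AND join
    # The AND join fires only when it finds a token on each inbound edge.
    join_fired = min(waiting["upper"], waiting["lower"])
    stuck_tokens = sum(waiting.values()) - 2 * join_fired

    return {
        "activity_2_runs": activity_2_runs,
        "join_fired": join_fired,
        "stuck_tokens": stuck_tokens,
    }


if __name__ == "__main__":
    print(run_mixed_gateway_flow(["upper", "lower"]))  # one token per edge
    print(run_mixed_gateway_flow(["upper", "upper"]))  # both tokens on one edge
```

In both runs "Activity 2" is executed twice (the token duplication). With one token per edge the AND join fires once and nothing is left behind; with both tokens on the same edge the join never fires and two tokens remain stuck in front of it.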

Recommendation: Whenever possible, stick to block-structured flows, where identical gateways are symmetrically paired up. This will not only make the resulting process flow much easier to analyze and understand, but will also help you avoid deadlocks and token duplications. There may be cases where you want to explicitly leverage the richer expressiveness of BPMN and deliberately deviate from block structuring your process. That's fine, but be sure to understand how gateways behave. In essence: know what you are doing!

Careful with mixing back edges and AND joins

Sometimes, one will want to put a token back to an upstream activity to repeat past steps, for example to cope with exceptional situations or to finish gathering some required data. When such back edges are mixed with block-structured AND split/joins, we may unintentionally produce deadlocking processes. Take the flow below as an example:

Even though the AND split/join gateways nicely pair up, there is a potential cause for a deadlock. That deadlock is caused by the boundary event which is attached to "Activity 2". Whenever "Activity 2" runs into an issue (i.e., the invoked Web Service returns a "fault"), the activity will be cancelled and the token is put back to the upstream XOR merge gateway. From there, it will (again) pass through "Activity 1" and trigger the downstream AND split gateway. As a result, new tokens are put onto the upper and lower branch.

And here's where the issue is. All of a sudden we have a single token on the lower branch and two tokens on the upper branch. This is because when "Activity 2" was aborted, the token on the upper branch (either in front of or behind "Activity 3") was not affected. By now it will probably have proceeded to the downstream AND join, waiting for synchronization, requiring a token on the lower branch. And in fact, synchronization will initially happen, consuming a single token from both inbound branches. But that leaves the second token on the upper branch behind.

That second token will essentially wait in front of the AND join forever (well, until the process is manually shut down by an administrator). In any case, the process does not complete normally.
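The scenario above can be replayed as a small token count. The sketch below is an illustration only: `run_back_edge_flow` and its `faults` parameter are hypothetical names, and it assumes that the upper-branch token always reaches the AND join undisturbed while "Activity 2" on the lower branch faults a given number of times before it finally succeeds.

```python
def run_back_edge_flow(faults):
    """Token game for the flow with a boundary event attached to "Activity 2".

    faults is the number of times "Activity 2" fails before it succeeds
    (hypothetical parameter, used only for this illustration).
    """
    if faults < 0:
        raise ValueError("faults must not be negative")

    upper_waiting = 0    # tokens in front of the AND join, upper branch
    lower_waiting = 0    # tokens in front of the AND join, lower branch
    activity_1_runs = 0

    for attempt in range(faults + 1):
        activity_1_runs += 1   # token passes the XOR merge and "Activity 1"
        upper_waiting += 1     # AND split: upper token passes "Activity 3" untouched
        if attempt < faults:
            continue           # lower token: fault, boundary event sends it back
        lower_waiting += 1     # lower token: "Activity 2" finally completes

    # The AND join consumes one token from each inbound edge per firing.
    join_fired = min(upper_waiting, lower_waiting)
    return {
        "activity_1_runs": activity_1_runs,
        "join_fired": join_fired,
        "stuck_upper_tokens": upper_waiting - join_fired,
        "stuck_lower_tokens": lower_waiting - join_fired,
    }


if __name__ == "__main__":
    print(run_back_edge_flow(0))  # no fault: the join fires, nothing is left behind
    print(run_back_edge_flow(1))  # one fault: one upper token is left behind
```

Under these assumptions, every fault adds one more token to the upper branch that will never find a partner on the lower branch, so the number of stranded tokens equals the number of faults.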

Recommendation: Watch out for AND join gateways and make sure that the same number of tokens arrives on all of their inbound edges under all circumstances. In the given example, the deadlock can be avoided by making sure a boundary event cancels the complete "block" (the process fragment between the two AND split/join gateways):

The corresponding subflow is shown below:
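In token terms, cancelling the whole block means that a fault removes every token inside it, not just the one sitting in "Activity 2". A minimal sketch of that behavior follows, assuming the block is wrapped in a subflow whose boundary event discards all inner tokens on a fault; `run_block_as_subflow` and `faults` are illustrative names, not Netweaver BPM API.

```python
def run_block_as_subflow(faults):
    """Token game where a fault cancels the whole AND split/join block.

    faults is the number of times "Activity 2" fails before it succeeds
    (hypothetical parameter, used only for this illustration).
    """
    if faults < 0:
        raise ValueError("faults must not be negative")

    upper_waiting = 0     # tokens in front of the AND join, upper branch
    lower_waiting = 0     # tokens in front of the AND join, lower branch
    cancelled_tokens = 0  # tokens removed together with a cancelled block

    for attempt in range(faults + 1):
        upper_in_block, lower_in_block = 1, 1   # AND split inside the subflow
        if attempt < faults:
            # Assumed behavior: the boundary event on the subflow cancels
            # everything inside it, then a single token takes the back edge.
            cancelled_tokens += upper_in_block + lower_in_block
            continue
        upper_waiting += upper_in_block
        lower_waiting += lower_in_block

    join_fired = min(upper_waiting, lower_waiting)
    stuck_tokens = upper_waiting + lower_waiting - 2 * join_fired
    return {
        "join_fired": join_fired,
        "stuck_tokens": stuck_tokens,
        "cancelled_tokens": cancelled_tokens,
    }


if __name__ == "__main__":
    print(run_block_as_subflow(0))
    print(run_block_as_subflow(2))
```

However many faults occur, only the final, successful pass delivers tokens to the AND join, and it delivers exactly one per inbound edge, so no token is left waiting.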

In the next article of our series, we will shed some light on choosing the right end event and on having an "exit strategy" for loops and recursive subflow invocations. Feel free to send me your questions and comments.