Hi All,

Many documents and blogs have been written about How to Develop BPM Components, How to integrate external components into BPM, How to consume BPMs in other applications, BPM APIs etc.

But once these objects are developed and the time comes to actually run them on the server, whether for testing, quality assurance or end-user usage, beginner-level BPM developers often know very little about how and where to check what is going wrong with their processes.

Where is it stuck? Why is it stuck? Is there a technical issue? Is there a modelling issue? ... and the list goes on.

In this blog, I would like to walk you through a few steps that can help you find out why your process executions are disrupted on the server.

To find the exact root cause of a problem, we will follow a 4-step approach in the NetWeaver Administrator (NWA), which can be accessed by logging in to “http://<host>:<port>/nwa”. The steps should be followed in the order listed while troubleshooting your BPM processes.

Note: To troubleshoot your business processes, you need appropriate authorizations on the server. You can refer to This Article for all roles & authorizations related to BPM. I would suggest having the “BPM_SuperAdmin”/“NWA_SuperAdmin” role.

Please follow these steps in the exact order in which they are listed.

Step-1: Process Repository-Error Log


Log in to NWA –> Configuration –> Processes and Tasks –> Process Repository:



/wp-content/uploads/2014/09/nwa1_547095.jpg



Once you click on Process Repository, you will see all the BPM components deployed on your server. Select the BPM component concerned, scroll down the page, select the parent process instance and click on the process link:

/wp-content/uploads/2014/09/nwa2_547096.jpg

This will open up the Process Instance Repository (which contains all the process instances, both running and completed) for the selected BPM Component:

/wp-content/uploads/2014/09/nwa3_547100.jpg

The very first thing to do here is to click on the /wp-content/uploads/2014/09/nwa4_547107.jpg button, which opens the process design of the BPM so that you can see exactly where the process is stuck. If the process is stuck at a Human Activity, there will be a RED token on that particular activity.

                                                                              

Moving ahead, if you look at this repository closely, you will find a lot of operations that you can perform on your BPM process instances.

At the bottom, you will see a tab-strip:

/wp-content/uploads/2014/09/nwa3_547100.jpg

The last tab here is the “Error Log”. If the process instance is in Suspended/Error status, this tab is enabled and you will find the full error stack trace here. If the status of the process is OK/In-progress, the tab is greyed out (as above).

However, there are quite a few cases where the process is in error but you find nothing in the Error Log tab. This is when we go to Step-2.

Step-2: Process Repository-History

In the same tab-strip above, the 4th tab is the “History” tab. This tab lists all the execution steps of the BPM process instance.

If Step-1 does not help you, click on the History tab and select Advanced from the available dropdown options:

/wp-content/uploads/2014/09/nwa5_547130.jpg

In the description details listed, you should find the detailed error trace, typically around the 3rd entry.

Note: I would personally suggest copying the full error trace into a text editor such as Notepad before analysing it, because the table sometimes truncates the error text.
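Once the trace is sitting in a text file, the most useful line is usually the last "Caused by:" entry of the Java stack trace. Here is a minimal sketch of pulling that line out. The function name and the sample trace (including its package and class names) are made up for illustration and are not taken from a real BPM error.

```python
def root_cause(trace: str) -> str:
    """Return the innermost cause of a Java stack trace.

    Picks the last 'Caused by:' line; if there is none, falls back to the
    first non-empty line (the top-level exception).
    """
    lines = [ln.strip() for ln in trace.splitlines() if ln.strip()]
    causes = [ln for ln in lines if ln.startswith("Caused by:")]
    if causes:
        return causes[-1][len("Caused by:"):].strip()
    return lines[0] if lines else ""


# Hypothetical trace, pasted from the History tab into a string.
sample = """\
com.example.bpm.ProcessFailedException: Activity could not be completed
    at com.example.bpm.Engine.run(Engine.java:42)
Caused by: java.lang.IllegalStateException: Mapping failed
    at com.example.bpm.Mapper.map(Mapper.java:17)
Caused by: java.lang.NullPointerException: input context is null
    at com.example.bpm.Mapper.read(Mapper.java:9)
"""

print(root_cause(sample))
```

Reading the innermost cause first often tells you whether you are looking at a data problem or a modelling problem before you go through the rest of the trace.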

This step usually takes care of process modelling issues, data issues and other problems that are run-time errors rather than build errors.

The only kind of error this step may not fully explain is a failed service execution in an Automated Activity, in which case we move to Step-3.

Step-3: NWA-Connectivity Logs

On the NWA home page, go to SOA –> Logs & Traces –> Connectivity Logging & Tracing:

/wp-content/uploads/2014/09/nwa6_547160.jpg

When you open this, you will see all the error traces pertaining to service connectivity errors. Search for the interface that failed to execute (we identified it in Step-2, but do not yet know the exact cause) and check its errors:

/wp-content/uploads/2014/09/nwa7_547161.jpg

You are almost certain to find the final error here. If you do not (which is very rare), we move to Step-4.

Step-4: NetWeaver Logs

You can directly access the NetWeaver error logs by logging in to http://<host>:<port>/nwa/logs.

This log engine stores the logs of all errors that occur on the server (for all running applications, as well as the configuration logs).
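If you switch between several servers, a tiny helper can build the two entry points used in this blog (/nwa and /nwa/logs). This is only a sketch: the host name and port below are placeholders, and the function name is my own.

```python
def nwa_url(host: str, port: int, path: str = "") -> str:
    """Build an NWA URL of the form http://<host>:<port>/nwa[/<path>]."""
    base = f"http://{host}:{port}/nwa"
    cleaned = path.strip("/")
    return f"{base}/{cleaned}" if cleaned else base


# Placeholder host and port; replace them with your own server details.
print(nwa_url("bpmhost.example.com", 50000))
print(nwa_url("bpmhost.example.com", 50000, "logs"))
```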

First, change the error-log prioritization as shown in the snapshots below:

/wp-content/uploads/2014/09/nwa8_547162.jpg           

/wp-content/uploads/2014/09/nwa9_547190.jpg

You may enter the Development Component Name in the search criteria to search for errors pertaining to your Development Component.

You may then expand the error by clicking on the “+” sign to the left of the selected error entry and check the error details:

/wp-content/uploads/2014/09/nwa11_547196.jpg

Note: Please use this step as the last resort for finding the error details. It is very time-consuming to search for a single error among thousands of records on the server.
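If you manage to get the log records into a text file, a small script can narrow thousands of records down to the ones for your Development Component. The sketch below assumes a simple "SEVERITY|source|message" line layout, which is my own assumption for illustration and not the actual NWA export format. The component names in the sample are invented too.

```python
from typing import Iterable, List, Sequence


def filter_log(lines: Iterable[str], component: str,
               severities: Sequence[str] = ("ERROR", "FATAL")) -> List[str]:
    """Return messages of matching severity whose source mentions the component.

    Assumes each line looks like 'SEVERITY|source|message' (hypothetical layout).
    """
    wanted = {s.upper() for s in severities}
    hits = []
    for line in lines:
        parts = line.rstrip("\n").split("|", 2)
        if len(parts) != 3:
            continue  # skip lines that do not fit the assumed layout
        severity, source, message = (p.strip() for p in parts)
        if severity.upper() in wanted and component.lower() in source.lower():
            hits.append(message)
    return hits


# Hypothetical sample records.
sample_lines = [
    "INFO|demo.sap.com/leave_request_bpm|Process started",
    "ERROR|demo.sap.com/leave_request_bpm|Mapping failed for input context",
    "ERROR|demo.sap.com/other_app|Connection refused",
    "FATAL|demo.sap.com/leave_request_bpm|Service call timed out",
]

for message in filter_log(sample_lines, "leave_request_bpm"):
    print(message)
```

Filtering on severity first and component second mirrors what we do by hand in the log viewer: raise the priority threshold, then search by Development Component name.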

That’s it. I believe the steps above will help you track down the cause of the error.

But please note that you should follow these steps in the order in which they are listed.
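To make the "in order" rule concrete, here is a small sketch of the escalation logic: you only move to the next step when the current one shows nothing. The step names come from this blog. The dictionary of findings is something you would fill in by hand while checking NWA, and all names in the code are my own.

```python
from typing import Dict, Optional, Tuple

# The four steps from this blog, in the order they should be checked.
STEPS = [
    "Step-1: Process Repository-Error Log",
    "Step-2: Process Repository-History",
    "Step-3: NWA-Connectivity Logs",
    "Step-4: NetWeaver Logs",
]


def first_finding(findings: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return the earliest step that produced an error text, or None."""
    for step in STEPS:
        text = findings.get(step)
        if text:
            return step, text
    return None


# Example: the Error Log tab was empty, but History showed a failure.
notes = {
    STEPS[0]: "",
    STEPS[1]: "Automated activity failed",
}
print(first_finding(notes))
```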

For an overview of the errors across all BPM components as a whole, you may check the BPM System Overview option in NWA.

Check This Document for BPM System Overview.

You may also refer to the excellent article: A day in the life of an SAP NetWeaver Business Process Management Administrator by Birgit Heilig

Hope this helps.

Cheers. 😎

Sid.


10 Comments


  1. Andy Silvey

    Hi Sid,

    excellent article and thanks for sharing.

    Over the last few years I’ve spent a lot of time implementing the underlying infrastructure, resizing CE’s and supporting the BPM solutions.

    Your overview is great, and we’ve taken it one stage further, defining the BPM Solution roles and responsibilities for monitoring the BPM solution split across Basis, BPM Portal Team and BPM Project Development Team, which in fact deserves a blog.

    One thought which keeps coming back to me, and your feedback would be interesting: despite your and our best efforts to monitor and administer the BPM solutions, we still suffer from stuck processes for a variety of reasons. In the BPM area we seem to be very REACTIVE to issues, and on top of this it is often the User who spots the issue and raises an incident before Basis or the BPM Functional Team have seen it.

    I really think SAP need to improve this and find opportunities to automate the monitoring in BPM, for example automatically triggered emails or incident generation when there is a BPM issue. The BPM solution costs a huge amount of money, and it is simply embarrassing when Users report issues; slowly but surely the User buy-in is eroded.

    On top of that, BPM by the very nature of being a WorkFlow, has the User’s utmost trust that once the User triggers a WorkFlow, the work will actually flow, and the point of WorkFlow is that the User doesn’t have to check the progress, the work should flow and the process should complete, and therefore, even more so in the BPM area we could do with SAP building in automated monitoring to help us catch events before the Users raise an incident.

    What are your thoughts?

    Best regards,

    Andy.

    1. Siddhant Bhatankar Post author

      Hi Andy,

      Firstly, let me thank you for all the appreciation. It means a lot, especially coming from you.

      Secondly, you mentioned defining BPM Solution roles across different teams such as Basis, Portal etc. It sounds very interesting, but I have certain reservations about it. There is no doubt that a BPM Consultant has thorough knowledge of development, monitoring, error resolution etc. But if the BPM Solution roles are split across teams, it will be an obligation on our part to educate each and every one of them in BPM. Even then, we do not want a technical person merely pointing out ‘WHERE’ the problem lies. At the grass-roots level, we definitely want people to identify the exact location and cause of the problem and rectify it then and there.

      Coming to ‘Stuck Processes’, all I have to say is that the root cause of all stuck processes, in my experience, is ‘INSTABILITY’. When I say this word, I am in no way referring to run-time errors and design flaws.

      What I mean to say is, for example, if I make configurations today for connections between BPM & PI, where I am consuming PI Service Interfaces in BPM Processes through Simple Automated Activities, I cannot guarantee the users that this will work tomorrow too. And the most embarrassing part is, I have no answer to their ‘WHY?’

      A configuration working perfectly fine for a week suddenly stops working the next day.

      Or maybe an EJB executed through a BPM Process and interacting with the database, working perfectly fine for over a month, stops working all of a sudden (I had to go through this on a Client System 🙁 ).

      I totally agree with you that SAP needs to improve (tremendously) on this. My question to SAP would be: we never face such instability in an ABAP WorkFlow. I have many Workflows running successfully on R3 for over 2 years without any problems. Then why is BPM so dicey?

      It would be really great if SAP overcomes such issues at the earliest.

      As far as automation of BPM monitoring is concerned, I totally agree with you that SAP needs to work towards this. As you mentioned, email alerts or incident generation in case of a technical failure in the BPM system would be far better than BPM Admins like us having to check the system for errors every now and then.

      A workflow means total automation, and that definitely does not include end users having to spot the drawbacks. If we are working towards automating application flows for the users, we should at least be spared from checking tens of thousands of processes for technical errors. 🙁

      Let me emphasize here that there might be hundreds and thousands of processes running on a server. A person/team sitting and checking for technical errors for various processes just does not make sense.

      As expensive as the product is, there is wide scope for improvement. I hope, just like you, that SAP automates this procedure (or at least some part of it) so that users know they are buying a product worth their money! And it would definitely decrease the number of OSS messages raised too. 😉

      I hope I made some sense.

      Regards,

      Sid

      1. Andy Silvey

        Hi Sid,

        I totally agree on most of the points.

        Regarding the monitoring, the roles and responsibilities must be divided over function areas, because in some cases only Basis can do it and in other cases it makes no sense for Basis to do it. For example, unblocking Ports in MDM at Unix level so that the messages can keep flowing is a Basis task, whereas monitoring processes in Manage Processes should be done by a BPM Portal solution expert.

        Here are some of the areas I have looked at at more than one Customer regarding roles and responsibilities for BPM monitoring:

        1. BPM System Overview

        NWA -> Availability and Performance Management -> BPM system Overview

        • Monitor the availability of the BPM sub systems, adapters, and so on
        • Start a BPM application as part of the troubleshooting action when something fails
        • Configure the process engine
        • Check if the basic BPM functionalities work properly
        • View the number of process and task instances in different statuses

        Responsibility area: ??

        How often: Daily (it helps to catch any problems with system performance, make a decision that data archiving is needed)

        Estimated time: 5 minutes / day if everything OK, more time if something wrong happens

        2. Process Repository

        Configuration Management -> Processes and Tasks -> Process Repository

        • View the development components of the deployed processes, versions of the components deployed, and all the process and task definitions deployed.
        • Activate a version or deactivate a development component as a whole
        • Start the process instances
        • Download all the corresponding files of processes and tasks, for example, WSDL files, data types, and so on
        • Configure the involved Web services of the processes in a development component (Application communication)

        Responsibility: ??

        How often: After each transport, in case of any changes in development or software components, in case of errors

        Estimated time:  1 hour / each transport

        3. Process Management

        Operation Management -> Processes and Tasks -> Manage Processes

        • Monitor process instances and view their details
        • View the process flow of the instance and check the current workflow step of the process instance
        • View all the running processes, failed process instances, or all processes in error state
        • Analyze the process instance details and take an appropriate action, for example, suspend, resume, or cancel a process
        • Archive the completed and canceled process instances

        Responsibility: ??

        How often: Daily

        Estimated time: 5 minutes / day if OK, more time when some actions needed (like archiving)   

        4. Tasks management

        Operation Management -> Processes and Tasks -> Manage Tasks

        • View the details of task instances. You can view the corresponding process for each task instance
        • View all task instances, task instances with an error state or with lapsed deadlines
        • Search for task instances using the advanced search options
        • Analyze the task instance details and take an appropriate action, for example, suspend, resume, or cancel a task instance
        • Nominate a processor for task instances
        • Resend an offline task, if required

        Responsibility: ??

        How often: Daily

        Estimated time: 5 minutes / day if OK

        5. Business Logs

        • Monitor and analyze business events of BPM
        • View logs based on date, time, or more specified query
        • View details of log entry

        Responsibility: ??

        How often: On demand (if needed)

        6. Data Clearing & Archiving

        Processes and tasks can be archived with relevant data, such as process context, tasks, attachments and business logs.

        Archiving process instances helps to remove processes which are completed and no longer required by the business. All process instances which are marked for archiving are taken out of the database and archived to a location configured by the administrator. This includes processes, tasks and business logs related to the process.

        Process archiving frequency depends on:

        • Number of process instances and task instances started daily
        • Duration of process instance
        • Number of users assigned to the task

        Process archiving improves the overall performance of the system. It keeps only active processes in the database and reduces the database size.

        SAP recommends archiving entries of the business log that are older than 2 months. This archiving session should be executed on a daily basis at a specified time.

        Instruction: how to archive process instances – separate document will be created

        Responsibility: ??

        How often: it depends on system performance, number of process instances etc

        Estimated time: ??

        7. MDM Ports Monitoring & Clearing

        When the Import Manager/Server encounters an exception, it handles the exception according to its type. Structural exceptions prevent MDIS from processing any import records in an import file. When a structural exception is found, MDIS moves the offending import file from the port’s Ready folder to the port’s Structural folder. No importing occurs and the port is blocked until the problem is resolved, which requires manual action.

        Responsibility: ??

        How often: when something happens, not a repeatable action

        Estimated time:

        But really, in the year 2014, why are these tasks manual human activity, why isn’t it automated with alerts and emails and Solution Manager Alerts feeding into the Incident Management System?

        Do you archive your old processes and tasks? We are currently doing a POC on that using the out-of-the-box BPM archiving functionality from SAP. Obviously every customer should be archiving old processes and tasks, otherwise they will fill up the db and cause performance issues.

        Best regards,

        Andy.

        1. Siddhant Bhatankar Post author

          Hi Andy,

          Going by the classification you have listed, it looks like a perfect solution.

          The division of the roles and responsibilities according to your prioritization seems perfectly achievable.

          But don’t you think this should have been done long back?

          “But really, in the year 2014, why are these tasks manual human activity, why isn’t it automated with alerts and emails and Solution Manager Alerts feeding into the Incident Management System?”

          Honestly speaking, I could relate to this topic as far back as Dec 2012, when I attended a training on Solution Manager 7.1.

          Solution Manager alerts, Email alerts, Incident messages etc. should supposedly be easy to implement as far as capturing technical/system faults is concerned.

          With each and every new release of NetWeaver from SAP, there are always some new additions from the development perspective (PI Integration, Open UI Integration etc.). But sadly, there have been minimal or no updates with regard to monitoring for a long time now.

          I have been working with BPM for a long time now and have thoroughly enjoyed my journey so far, as far as developments are concerned. But when it comes to Monitoring Tasks, I admit, it is very painful and tedious.

          It would be really amazing if the roles and responsibilities are divided amongst teams as you have listed. But with that, there comes an added responsibility for SAP to release Training material for each and every group/team as to how they should handle their daily BPM Tasks.

          It is high time SAP automates at least some components of BPM Monitoring if not all.

          Every single word in your above reply seems ‘do-able’.

          And at this stage of the evolution of the BPM as a product, I would state it as a ‘Missing Feature’ instead of an enhancement.

          Customers are paying a lot to use this, and people like us are putting in a lot of effort to make things fall into place. Then why not make things easier for all of us?

          And yes, I have been following the exact same document for a very long time to archive my old BPM processes and tasks. Thanks for reminding me it’s time for my monthly archives. 😉

          It would be really awesome if you/someone from your team could write a blog with regards to the classification and distribution of roles and responsibilities you have listed above. Don’t forget to send me a link.

          Regards,

          Sid.

  2. Andy Silvey

    Hi Sid,

    agreed on all points, let’s hope SAP are listening.

    From the Basis perspective BPM is incredibly complicated; we have, amongst other things, configurations for:

    Entry from Portal to CE (FPN)

    CE integration with MDM, including Change Tracker

    CE Integration with BI – UDConnect

    CE integration with ECC – including Web Services

    CE integration with PI and including using PI as the Services Registry

    CE configurations for, amongst other things, javamail, Provider Systems, BPM configs

    NWDI infrastructure setup and integration with CE and CTS+

    Supporting the Developers working in NWDI

    MDM administration

    There are so many vulnerable pieces in the BPM puzzle, it would be a great help if monitoring could be integrated with SolMan.

    In the meantime we make do with configuring our own custom SolMan monitoring of the CE Default Traces searching for key words and generating alerts.

    Best regards,

    Andy.

    1. Siddhant Bhatankar Post author

      Hi Andy,

      I totally agree on all points. SAP needs to pay heed to all the areas you have mentioned above to assemble the vulnerable pieces of the BPM Puzzle. 🙂

      Waiting for some good news in the coming updates and releases. Life can be so much easier with all your suggestions implemented.

      In the meantime, patience is the key, I believe.

      Regards,

      Sid

  3. Eng Swee Yeoh

    Great article, Sid. This came in really handy as I move from ccBPM in a dual-stack PI system to NW BPM on PO – it definitely helped me to familiarise myself with all the different buttons, tabs, etc. to track down those BPM errors!

