In higher PI releases more and more parameters can be set dynamically. Still, many parameters on the Java stack are not dynamic, so a restart of the Java stack is required. Furthermore, troubleshooting may require a restart of individual server nodes. Especially if PI is used in business-critical scenarios or runs synchronous interfaces, getting a downtime window is usually very difficult. This blog describes a PI setup and a mechanism to restart the PI Java stack and also PI application servers (both ABAP and Java) with minimized impact on the business. The approach described below ensures that no messages fail during the restart, so the business impact is minimized.
PI is installed using multiple application servers. A WebDispatcher (or comparable loadbalancer) is configured according to the guide “How To… Scale PI” or in a High Availability setup as described in SAP Note 988383 (XI 3.0), 951910 (PI 7.0), 1052984 (PI 7.1), or 1614690 (AEX). Furthermore, the local bundle option is set for the mapping runtime as described in the guide How To Scale Up SAP NetWeaver Process Integration.
In case of high message backlogs during the restart it is also important to follow SAP Note 1963440 to ensure Exactly Once delivery.
General Message Flow Description:
To avoid any disruption of business during a Java restart, we first have to understand the types of connections that send requests to the Java stack:
HTTP between Integration Engine and Adapter Framework:
The Integration Engine forwards messages to the Java stack using HTTP. When configuring multiple instances for a PI system, a loadbalancer has to be configured to distribute the HTTP requests over the available instances. If you want to restart one instance, you therefore have to ensure that no messages are forwarded to it.
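The effect of taking an instance out of HTTP loadbalancing can be illustrated with a minimal sketch. This is a hypothetical round-robin model for illustration only, not the WebDispatcher's actual dispatching algorithm; the instance names are made up:

```python
import itertools

class RoundRobinBalancer:
    """Minimal round-robin dispatcher over PI instances (illustration only)."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.disabled = set()
        self._cycle = itertools.cycle(self.instances)

    def disable(self, instance):
        # Corresponds to deactivating the server in the WebDispatcher frontend.
        self.disabled.add(instance)

    def route(self):
        # Skip disabled instances so no new HTTP request reaches them.
        for _ in range(len(self.instances)):
            inst = next(self._cycle)
            if inst not in self.disabled:
                return inst
        raise RuntimeError("no active instance left")

balancer = RoundRobinBalancer(["pi_app1", "pi_app2"])
balancer.disable("pi_app2")
targets = {balancer.route() for _ in range(10)}
print(targets)  # only pi_app1 receives requests
```

Once an instance is disabled, no new request reaches it, while the remaining instances continue to serve traffic unchanged.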
JCo connection for mapping calls:
The Java runtime is called from the ABAP stack (non Advanced Adapter Engine scenario) via JCo. With multiple instances, the local bundle option should be used so that the server nodes only register at the local gateway. If you shut down the Java instance, you have to make sure that no JCo calls are forwarded to it. Since the mappings are triggered in the PI outbound queues, we have to ensure that the qin scheduler does not process messages on the corresponding ABAP instance. To do so, additional loadbalancing configuration is required, as described below.
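The idea behind the logon group can be sketched as follows. This is a simplified, hypothetical model of qRFC loadbalancing, not the actual qin scheduler implementation; the group and instance names are assumptions:

```python
# Hypothetical model: the qin scheduler only dispatches queue processing
# (and thus the JCo mapping calls) to ABAP instances that are members
# of the configured logon group.
logon_groups = {"qRFC_in": ["pi_app1", "pi_app2"]}

def pick_instance(group, counter):
    """Pick the instance that processes the next queue, cycling over
    the current members of the logon group."""
    members = logon_groups[group]
    if not members:
        raise RuntimeError("logon group is empty")
    return members[counter % len(members)]

# Before the restart: remove pi_app2 from the group (the RZ12 step).
logon_groups["qRFC_in"].remove("pi_app2")

# All subsequent queue dispatches now go to pi_app1 only.
assignments = [pick_instance("qRFC_in", i) for i in range(4)]
print(assignments)
```

Removing the instance from the group is what guarantees that no new mapping call lands on the instance that is about to be restarted.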
Sender adapters of the Adapter Framework:
Here we have to distinguish between HTTP-based adapters (SOAP, plain HTTP, or HTTP-based partner adapters) and other adapters. All HTTP-based adapters have to use an HTTP loadbalancer as an entry point to allow load distribution. Thus, messages can be redirected by changing the HTTP loadbalancing.
For the polling adapters it depends which Adapter Framework Scheduler is used. Based on Note 1355715, a new scheduler was introduced. This scheduler is recommended as the default and is enabled by setting the parameter relocMode to a negative value: a value of -n determines the number of calls (n) after which a polling channel can be relocated in the J2EE cluster. The relocation of a channel is triggered by a servlet, which the AFW scheduler invokes via an HTTP call. With multiple instances this call goes via the WebDispatcher. Hence, if the instance is taken out of loadbalancing, the polling channel will be assigned to another server node after the specified relocMode count. You therefore simply have to wait long enough (depending on the polling interval of your channel), and the polling channels will no longer be executed on this instance. With the old AFW scheduler no relocation takes place.
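The relocation behavior can be sketched with a simplified model. This is an illustration of the relocMode counting described above, not the actual AFW scheduler code; node names and the function are hypothetical:

```python
def run_polls(reloc_mode, assigned_node, active_nodes, total_polls):
    """Simplified model of the AFW scheduler with relocMode = -n:
    after every n polls the channel may be relocated, which is how it
    eventually leaves a node that was taken out of loadbalancing."""
    n = -reloc_mode  # relocMode is configured as a negative value
    history = []
    for poll in range(1, total_polls + 1):
        history.append(assigned_node)
        if poll % n == 0 and assigned_node not in active_nodes:
            # The relocation call goes via the WebDispatcher, so the
            # channel lands on a node still part of the loadbalancing.
            assigned_node = active_nodes[0]
    return history

# node_a was taken out of loadbalancing; with relocMode = -3 the
# channel moves away after the third poll.
history = run_polls(-3, "node_a", ["node_b"], 6)
print(history)  # ['node_a', 'node_a', 'node_a', 'node_b', 'node_b', 'node_b']
```

This also shows why you have to wait: the channel keeps running on the old node until the relocMode count is reached.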
In general, all polling sender adapters (JDBC, File, …) are handled by a scheduling algorithm which notices when a server node goes down and balances the channel to another one. If a message is only partially forwarded to PI during a restart, no acknowledgment is given to the sender application, and the message will therefore be reprocessed by another server node after the restart.
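This reprocessing relies on Exactly Once semantics: when no acknowledgment is returned, the message is retried, and the receiver must recognize the duplicate by its message ID. A minimal sketch of that idea, not PI's actual persistence layer (the function and IDs are made up):

```python
processed_ids = set()

def deliver(message_id, payload, crash_before_ack=False):
    """Deliver a message with Exactly Once semantics: retries after a
    missing acknowledgment are detected as duplicates by message ID."""
    if message_id in processed_ids:
        return "duplicate-ignored"
    processed_ids.add(message_id)
    if crash_before_ack:
        # Node restarted before acknowledging: roll back, so the retry
        # on another node can process the message again.
        processed_ids.discard(message_id)
        raise ConnectionError("no acknowledgment sent")
    return "acknowledged"

try:
    deliver("MSG-1", "data", crash_before_ack=True)  # first attempt fails
except ConnectionError:
    pass
result_retry = deliver("MSG-1", "data")      # retry succeeds
result_second = deliver("MSG-1", "data")     # further retry is a duplicate
print(result_retry, result_second)
```

This is also why SAP Note 1963440 matters for high backlogs: without duplicate detection, a retry after a missing acknowledgment would deliver the message twice.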
Necessary configuration steps to avoid mapping errors during restart:
To avoid errors when calling the mapping runtime, we have to ensure that the ABAP instance can be removed from the qRFC loadbalancing. The idea is to maintain a logon group in SMQR to allow manual intervention in the loadbalancing during a restart.
- Create Logon group in transaction RZ12:
In our example logon group qRFC_in was created.
- Maintain logon group for qin scheduler:
In SMQR maintain the AS Group as indicated below. Put in the qRFC_in logon group (standard is DEFAULT).
- Specify defined logon group:
- SMQR after the change:
Procedure for restarting the Java stack of an instance
Once all these prerequisites are met, perform the following steps for a restart:
- To avoid errors in the mapping JCo call, you have to change the logon group that is used in SMQR. For this, go to transaction RZ12 and take the instance out of the qRFC_in logon group. Do not forget to save!
- To remove the instance from HTTP loadbalancing disable it in the WebDispatcher frontend as indicated below.
First you have to deactivate the server (using right mouse click):
Confirm the popup that appears next. Looking at the instance after the change, you will see that the green tick is no longer displayed.
If you restart the complete instance of a dual-stack system (ABAP and Java), you might observe that after the restart of ABAP the instance is automatically added again to the WebDispatcher loadbalancing. This is corrected with Note 1976505.
- Ensure that any message backlog is processed before the restart. The best tool for this is Wily Introscope, using the dashboards for the mapping runtime and the Adapter Engine queues as shown below. As soon as no activities are displayed any longer for the server nodes of this instance (0 line in the graph), you can restart the server.
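The waiting step can be sketched as a simple drain loop. This is a hypothetical helper, not a Wily Introscope API; `read_backlog` stands in for whatever monitoring value you observe (e.g. the Adapter Engine queue depth):

```python
import time

def wait_for_drain(read_backlog, poll_interval=1.0, stable_reads=3):
    """Wait until the monitored backlog reports zero for several
    consecutive reads before the instance is restarted. Requiring
    multiple zero reads guards against a momentary dip to zero."""
    zero_count = 0
    while zero_count < stable_reads:
        if read_backlog() == 0:
            zero_count += 1
        else:
            zero_count = 0
        time.sleep(poll_interval)
    return True

# Simulated backlog that drains over time (stand-in for a Wily metric).
samples = iter([5, 2, 0, 1, 0, 0, 0])
drained = wait_for_drain(lambda: next(samples), poll_interval=0.0)
print("safe to restart" if drained else "still busy")
```

Only after the backlog has stayed at zero should the actual restart (e.g. via the SAP MMC or sapcontrol) be triggered.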
Please note: For a Java-only installation such as PO, AEX, or a non-central AAE, the procedure is valid as well. The main prerequisite is to have multiple application servers. For Java-only installations, only the isolation of the relevant instance in the WebDispatcher is required, since the mapping is always executed on the same server node.