This blog describes the Dead Letter Queue option in the JMS sender adapter. It describes the feature, when to use it, and how to monitor it. Dead letter handling in the JMS sender adapter is available for SAP Cloud Platform Integration customers with an Enterprise Edition license.
Dead Letter Handling in JMS Sender Adapter
Some messages received in the cloud integration system (usually large messages) can lead to an out-of-memory error in the worker node, because processing the message requires too much memory.
If such a message is processed in an asynchronously decoupled scenario using JMS, it is stored in the JMS queue via the inbound flow, where there are normally no memory-consuming conversions. In the outbound flow, however, the message may cause the problem described above: the node fails due to lack of memory, Cloud Platform restarts the node, and the message is polled from the queue again, causing another out-of-memory error. Each node restart would try to process the message again and again, leading to complete node unavailability.
To avoid this situation, messages whose processing stopped unexpectedly are retried only twice. After that, the message is taken out of processing and stored in a dead letter queue. Manual action is necessary to restart or delete such messages.
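The platform's internal mechanism is not exposed, but the policy just described can be sketched as a small simulation. All names here are illustrative assumptions, not SAP Cloud Platform Integration APIs:

```python
# Illustrative simulation of the dead-letter policy described above:
# a message whose processing is interrupted by a node crash is retried
# twice; after the second failed retry it is moved to a dead letter
# queue and requires manual action. All names are hypothetical.

MAX_RETRIES = 2  # fixed; not configurable in the JMS sender adapter

def handle_crash_interrupted(message, crash_count, dead_letter_queue):
    """Decide what happens to a message after its processing was
    interrupted by a node crash for the crash_count-th time."""
    if crash_count <= MAX_RETRIES:
        return "retry"  # stays in the normal queue, retried later
    dead_letter_queue.append(message)
    return "blocked"    # shown with status "Blocked" in the Queue Monitor

dlq = []
# The first crash and the first retry are tolerated; after the
# second retry also fails, the message is blocked.
print(handle_crash_interrupted("large-msg", 1, dlq))  # retry
print(handle_crash_interrupted("large-msg", 2, dlq))  # retry
print(handle_crash_interrupted("large-msg", 3, dlq))  # blocked
```

Blocked messages stay in the dead letter queue until an operator deletes or retries them, which mirrors the manual action mentioned above.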
Prerequisite: Broker Provisioning and Setup of JMS Scenario
The blog Configure Asynchronous Messaging with Retry using JMS Adapter describes in detail how to set up asynchronous scenarios using the JMS adapter.
Configure Dead Letter Handling in JMS Sender Channel
In the JMS sender adapter, the dead letter queue can be configured as of JMS adapter version 1.1. The checkbox for activating the dead letter queue, Dead-Letter Queue, is available on the Connection tab. It is selected by default, meaning that dead letter handling is active. After the second retry of a message whose processing could not be completed because of a node crash, the message is moved to a dead letter queue and not processed any further. The number of retries is not configurable.
Possible Performance Improvement for Small Messages
If you are sure that only small messages will be processed in your scenario and no memory issues will occur, you can deselect the checkbox to improve the performance, but do keep the risk of an outage in mind.
Checkbox Not Available – JMS Adapter Version Too Old
If you open an old integration flow, the Dead-Letter Queue setting might not be available because the version of the JMS adapter is too old. To find out the version, choose the Technical Information icon in the channel. If you want to use the dead letter handling option, delete the sender channel and add a new one from the palette. You will then get the latest version.
Dead Letter Queue Versus Explicit Retry Configuration
The dead letter queue configuration is not connected to the configuration of explicit retry handling described in the blog Configure Asynchronous Messaging with Retry Using JMS Adapter. The dead letter queue is only used for messages that could no longer be processed because of a node crash.
Messages that fail because of an error during processing do not go to the dead letter queue; they either stay in normal processing (with a defined retry interval) or are removed from JMS (if configured in explicit retry modeling).
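The distinction between the two failure paths can be summarized in a short sketch. The function and parameter names are hypothetical and only capture the routing behavior described above:

```python
# Illustrative routing of failed messages, per the behavior described
# above. Hypothetical names; not an SAP API.

def route_failed_message(failure_kind, retries_after_crash=0,
                         explicit_remove_on_error=False):
    """Where does a message go after a failure?

    failure_kind: "processing_error" (an error raised during message
                  processing) or "node_crash" (processing stopped
                  unexpectedly, e.g. out-of-memory).
    """
    if failure_kind == "processing_error":
        # Processing errors never reach the dead letter queue: the
        # message either stays in normal retry handling or is removed
        # from JMS if explicit retry modeling says so.
        return "removed_from_jms" if explicit_remove_on_error else "normal_retry"
    if failure_kind == "node_crash":
        # Crash-interrupted messages are retried twice, then moved
        # to the dead letter queue.
        return "normal_retry" if retries_after_crash < 2 else "dead_letter_queue"

print(route_failed_message("processing_error"))                   # normal_retry
print(route_failed_message("node_crash", retries_after_crash=2))  # dead_letter_queue
```

The key point the sketch encodes: only crash-interrupted messages ever reach the dead letter queue; ordinary processing errors are governed entirely by the explicit retry configuration.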
To check whether messages have been removed from processing and moved to the dead letter queue, you can use the monitoring tools provided by Cloud Platform Integration.
Monitor Message Locks in Manage Lock Monitor
As soon as a message is processed by the JMS sender adapter, a processing lock for this particular message is shown in the Message Locks monitor. The monitor is in the operations view, in the section Manage Locks. The messages belonging to the JMS sender adapter can be identified by the component JMS. The Source column tells you which JMS queue the message is processed in, in the format JMS:<queue name>.
Normally, this entry is removed as soon as processing of the message is completed. In the case of a node outage, processing cannot be completed; when the node is restarted, the message is marked as erroneous and the entry disappears from the lock monitor. After approximately 20 minutes, a retry is executed. If processing cannot be completed because of another node outage, a second retry is triggered.
After the second retry, the message is moved to a dead letter queue and can be found in the Queue Monitor (described in the next section).
Monitor Corresponding Message in Message Queue Monitor
You can check the message that was taken out of processing in the Message Queue monitor, which is in the operations view, in the section Manage Stores.
If messages have the processing status Blocked, this indicates that these messages have been removed from processing and moved to the dead letter queue. You also see that no time is defined for the next retry for these messages:
The easiest way to find these messages is to use the filter option or sort based on Status.
You can use the direct link for the message ID to jump directly to the message processing log for the message sent to the queue. There, you can find the integration flow name, the time the message was sent, and details of processing.
Analyze and Solve the Root Cause
To find out whether the message was causing the outages, you can download the message and check its size. Based on this analysis, you can decide whether to remove the message from processing completely, change your integration flow, and/or try to send the message again:
- If you decide to remove the message from processing completely, delete the message in the Message Queue monitor and also delete the lock entry in the Message Locks monitor, if available. Ask the sender of the message to send it in smaller chunks, if possible.
- If you decide to retry the message, choose Retry for this message in the Queue Monitor. The JMS sender adapter will trigger one more retry. Carefully monitor the node to see if it crashes again. If it does, you can be quite sure that it is the message that is causing the outage.