Cloud Integration – Configure Dead Letter Handling in JMS Adapter
This blog describes the Dead Letter Queue option in the JMS sender adapter. It describes the feature, when to use it, and how to monitor it. Dead letter handling in the JMS sender adapter is available for SAP Cloud Integration customers with an Enterprise Edition license.
Dead Letter Handling in JMS Sender Adapter
Some messages (usually large ones) received in the Cloud Integration system can cause an out-of-memory error in the worker node because processing them requires too much memory.
If such messages are processed in an asynchronously decoupled scenario using JMS, the inbound flow stores the message in the JMS queue, normally without any memory-consuming conversions. In the outbound flow, however, the message may cause the problem described above: the node fails due to lack of memory, Cloud Platform restarts the node, and the message is polled from the queue again, causing another out-of-memory error. Without a safeguard, the restarted node would try to process the message again and again, leading to complete node unavailability.
To avoid this situation, messages whose processing stopped unexpectedly are retried only twice. After that, the message is taken out of processing and stored in a dead letter queue. Manual action is necessary to restart or delete such messages.
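The retry limit described above can be illustrated with a small simulation. This is only a sketch of the rule as stated in this blog (initial attempt plus two retries, then dead letter handling), not the adapter's actual implementation. The names QueuedMessage, next_action, and simulate, and the statuses "Waiting" and "Completed", are hypothetical.

```python
from dataclasses import dataclass

# From the text: a message is retried only twice after an unexpected stop.
MAX_CRASH_RETRIES = 2


@dataclass
class QueuedMessage:
    message_id: str
    crash_count: int = 0     # deliveries that ended in a node crash
    status: str = "Waiting"  # hypothetical status; only "Blocked" is from the blog


def next_action(msg: QueuedMessage) -> str:
    """Decide what happens to a message once the node is running again."""
    if msg.crash_count > MAX_CRASH_RETRIES:
        # Initial attempt and both retries crashed: take it out of processing.
        msg.status = "Blocked"
        return "dead-letter"
    return "process"


def simulate(msg: QueuedMessage, crashes_node) -> list:
    """Deliver the message until it completes or is dead-lettered."""
    history = []
    while True:
        action = next_action(msg)
        history.append(action)
        if action == "dead-letter":
            return history
        if crashes_node(msg):
            msg.crash_count += 1  # node restarted, message is polled again
        else:
            msg.status = "Completed"
            return history
```

Calling simulate with a message that always crashes the node yields three processing attempts followed by one dead-letter decision, which is the point at which manual action becomes necessary.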
Prerequisite: Broker Provisioning and Setup of JMS Scenario
The blog Configure Asynchronous Messaging with Retry Using JMS Adapter describes in detail how to set up asynchronous scenarios using the JMS adapter.
Configure Dead Letter Handling in JMS Sender Channel
In the JMS sender adapter, the dead letter queue can be configured as of JMS adapter version 1.1. The checkbox, Dead-Letter Queue, for activating the dead letter queue is available on the Connection tab. The checkbox is selected by default, meaning that dead letter handling is active. After the second retry of a message that could not be completed because of a node crash, this message is moved to a dead letter queue and not processed any further. The number of retries is not configurable.
Performance Impact of the Configuration
Dead-Letter Queue handling is a performance-intensive operation, so you should expect a significant impact on performance. This impact is even higher if you run high-load scenarios, because Dead-Letter Queue handling is based on the database, and the database does not scale as well as JMS.
For high-load scenarios, or if you are sure that only small messages will be processed in your scenario, you should deselect the checkbox to improve performance, but keep the risk of an outage in mind. The recommended approach is to configure the size check in the sender adapter you use, so that large messages are rejected before they can cause an out-of-memory error. Unfortunately, not all sender adapters support the size check yet.
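The idea behind a size check can be sketched in a few lines: reject a payload at the entry point before any memory-intensive processing starts. This is a generic illustration, not the adapter's configuration or code. The 5 MB limit and the names MessageTooLarge and check_size are assumptions.

```python
# Hypothetical limit; the real value is whatever you configure in the adapter.
MAX_MESSAGE_BYTES = 5 * 1024 * 1024


class MessageTooLarge(Exception):
    """Raised when an incoming payload exceeds the configured limit."""


def check_size(payload: bytes, limit: int = MAX_MESSAGE_BYTES) -> bytes:
    """Return the payload unchanged, or reject it if it is larger than the limit."""
    size = len(payload)
    if size > limit:
        raise MessageTooLarge(f"payload is {size} bytes, limit is {limit} bytes")
    return payload
```

Rejecting at the sender side means the large message never reaches the JMS queue, so the outbound flow cannot be crashed by it.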
Checkbox Not Available – JMS Adapter Version Too Old
If you open an old integration flow, the Dead-Letter Queue setting might not be available because the version of the JMS adapter is too old. To find out the version, choose the Technical Information icon in the channel. If you want to use the dead letter handling option, delete the sender channel and add it again from the palette. You will then get the latest version.
Dead Letter Queue Versus Explicit Retry Configuration
The dead letter queue configuration is not connected to the configuration of explicit retry handling described in the blog Configure Asynchronous Messaging with Retry Using JMS Adapter. The dead letter queue is only used for messages that could not be processed because of a node crash.
Messages that fail because of an error during processing do not go to the dead letter queue. They either stay in normal processing (with a defined retry interval) or are removed from JMS (if configured in explicit retry modeling).
To check whether messages have been removed from processing and moved to the dead letter queue, you can use the monitoring tools provided by Cloud Integration.
Monitor Message Locks in Manage Lock Monitor
As soon as a message is processed by the JMS sender adapter, a processing lock for this particular message is shown in the Message Locks monitor. The monitor is in the operations view, in the section Manage Locks. The messages belonging to the JMS sender adapter can be identified by the component JMS. The Source column tells you which JMS queue the message is processed in, in the format JMS:<queue name>.
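If you copy lock entries out of the monitor for your own analysis, the Source format JMS:<queue name> is easy to split. The helper below is a minimal sketch; the function name and the sample queue name in the usage note are hypothetical.

```python
from typing import Optional


def queue_name_from_source(source: str) -> Optional[str]:
    """Extract the queue name from a Source value of the form 'JMS:<queue name>'.

    Returns None if the value does not belong to the JMS component.
    """
    prefix, separator, name = source.partition(":")
    if prefix != "JMS" or not separator or not name:
        return None
    return name
```

For example, queue_name_from_source("JMS:OrderQueue") returns "OrderQueue", while a value from another component returns None.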
Normally, this entry is removed as soon as processing of the message is completed. In the case of a node outage, processing cannot be completed. When the node is restarted, the message is marked as erroneous and the entry disappears from the lock monitor. After approximately 20 minutes, a retry is executed. If processing cannot be completed because of another node outage, a second retry is triggered.
After the second retry, the message is moved to a dead letter queue and can be found in the Queue Monitor (described in the next chapter).
Monitor Corresponding Message in Message Queue Monitor
You can check the message that was taken out of processing in the Message Queue monitor, which is in the operations view, in the section Manage Stores.
If messages have the processing status Blocked, this indicates that these messages have been removed from processing and moved to the dead letter queue. You also see that no time is defined for the next retry for these messages.
The easiest way to find these messages is to use the filter option or sort based on Status.
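The same filtering can be expressed on a list of queue entries, for example if you have noted them down from the monitor. This is a sketch on local sample data only; the field names message_id, status, and next_retry are hypothetical and do not represent an SAP API.

```python
def blocked_entries(entries: list) -> list:
    """Return the entries that were moved to the dead letter queue.

    Per the blog, such entries have status 'Blocked' and no next retry time.
    """
    return [
        entry for entry in entries
        if entry["status"] == "Blocked" and entry["next_retry"] is None
    ]


def sort_by_status(entries: list) -> list:
    """Sort entries alphabetically by status so equal statuses are grouped."""
    return sorted(entries, key=lambda entry: entry["status"])
```

Filtering is usually the quicker route when the queue holds many messages, because sorting still requires scanning for the Blocked group.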
You can use the direct link for the message ID to jump directly to the message processing log for the message sent to the queue. There, you can find the integration flow name, the time the message was sent, and details of processing.
Analyze and Solve the Root Cause
To find out whether the message was causing the outages, you can download the message and check its size. On the basis of this analysis, you can decide whether to completely remove the message from processing, change your integration flow, and/or try to send the message again:
- If you decide to remove the message completely from processing, delete the message in the Message Queue monitor. Ask the sender of the message to send it in smaller chunks, if possible.
- If you decide to retry the message, choose Retry for this message in the Queue Monitor. The JMS sender adapter will trigger one more retry. Carefully monitor the node to see if it crashes again. If it does, you can be quite sure that it is the message that is causing the outage.
Could I block the whole JMS queue, rather than only a single message in the queue?
No, this is not possible.
Suppose that, after several retries of a message in a JMS queue, we would like to stop retrying and keep the message in the queue. Is this possible, or can we only follow the approach described here, where the message is retried twice, moved to the dead letter queue, and later retried manually?
The question concerns the following scenario: a message is sent to a receiver system, but the receiver is being upgraded, so we need to queue the messages in CPI and send them later.
Thanks and regards,
No, this is not possible. If you configure exponential backoff, retries become less frequent after the first few attempts. But there is no option to stop the consumption for a specified time.
You could move the messages into another queue and trigger this via deployment of an integration flow consuming from this second queue.
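To illustrate why exponential backoff makes retries less frequent, here is a minimal sketch. It assumes that the interval doubles after each unsuccessful retry and is capped at a maximum. The function name and the parameter values in the usage note are hypothetical, so check the adapter's retry settings for the actual behavior.

```python
def retry_intervals(initial_minutes: int, max_minutes: int, retries: int) -> list:
    """Return the wait time before each retry, doubling up to a maximum."""
    intervals = []
    current = initial_minutes
    for _ in range(retries):
        intervals.append(min(current, max_minutes))  # never exceed the cap
        current *= 2                                  # assumed doubling per retry
    return intervals
```

With an initial interval of 1 minute and a maximum of 60 minutes, eight retries wait 1, 2, 4, 8, 16, 32, 60, and 60 minutes. The retries slow down, but they never stop, which matches the answer above.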
Thanks for the detailed blog, can you help with the below clarifications?
Thanks for the clarification.
5. So if we compare Dead letter queue handling with JMS to JDBC.
a. In the case of JMS, we have an explicit option to enable or disable DLQ. If it is enabled, a message causing a node outage is marked as "Blocked" in the queue (with no lock on the message in the lock monitor), and no further retry is processed for it. The developer can filter the queue for messages with the "Blocked" status and later delete them.
b. In the case of JDBC, DLQ is always enabled internally. Such messages are locked in the lock monitor and are not visible in the respective Data Store, as there is no "Blocked" status for Data Store. So can the developer delete the messages from the lock monitor?
Is there any way or trick to try out this behavior, that is, to make a message fail with an out-of-memory error? I tried pushing a large file (1 GB) into JMS, and the sender adapter consumed it successfully without any error.
6. Regarding the missing process ID: this is in the CF trial account.
5. a: correct
5. b: The message also stays in the data store but does not have the status Blocked. You need to correlate the entry in the Lock Monitor with the message in the Data Store by the entry ID, which is the message ID.
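The correlation described in 5. b amounts to a join on the ID. The sketch below works on local sample records only; the field names entry_id and message_id are hypothetical and not taken from an SAP API.

```python
def correlate(lock_entries: list, data_store_messages: list) -> list:
    """Pair each lock entry with the Data Store message that has the same ID.

    Lock entries without a matching message are skipped.
    """
    messages_by_id = {msg["message_id"]: msg for msg in data_store_messages}
    return [
        (lock, messages_by_id[lock["entry_id"]])
        for lock in lock_entries
        if lock["entry_id"] in messages_by_id
    ]
```

Each returned pair identifies a Data Store message that is currently locked, which is the closest equivalent to the Blocked status in the JMS queue monitor.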
You could create an out-of-memory for example if you allocate a lot of memory in a script.
6. In CF we don't have process IDs as in Neo, that's why it is not visible. We need to discuss if we can hide this column in CF.