Message processing during maintenance of CPI
We have been working on adding monitoring to our Figaf IRT application so you can monitor the status of your SAP CPI system. Along the way I made an interesting observation that I would like to share with you.
TL;DR: you can still process Integration Flows during a maintenance window on SAP CPI.
There was scheduled maintenance on my SAP CPI this weekend and some disruption to it. From the trust center, I got the following status.
I had always thought that maintenance meant the full system was down and could not be used.
We had just deployed our project and had it running over the weekend to start seeing how the monitoring was working in real life.
When I checked the tool the next day, I could see that we had captured the downtime, and it seems the disruption did not affect my tenant. We call different services on the CPI system to get its CPU and memory load. I’m not yet sure what these values will be able to tell us, so I’ll probably have to learn more about them. We got a graph like the following.
As you can see there is a period between 2 and 4 am CEST where the system is not responding. This correlates well with our scheduled maintenance.
I had expected that messages would not be processed during that time. We have a simple iflow that is called every 5 minutes to check the latency. It did not show any disruption during the period.
So it seems the messages are processed even when the system is reported as down. If we had gotten anything other than a 200 response, we would not have logged the latency, so the unbroken latency data means every call returned 200. In the next iteration, we will add alerts that you can set up rules for.
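The logging rule described above can be sketched roughly like this. It is a minimal illustration in Python under my own assumptions, not the actual Figaf implementation; the names `http_status` and `measure_latency` are hypothetical.

```python
import time
import urllib.request
from typing import Callable, Optional


def http_status(url: str, timeout: float = 10.0) -> int:
    """Call the endpoint and return the HTTP status code.

    urlopen raises urllib.error.HTTPError (an OSError subclass) for
    4xx/5xx responses, so those surface as exceptions, not return values.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def measure_latency(call: Callable[[], int]) -> Optional[float]:
    """Return the call duration in seconds, or None when the call did
    not answer with HTTP 200 (in which case no latency is logged)."""
    start = time.monotonic()
    try:
        status = call()
    except OSError:
        # Connection errors, timeouts and HTTP error responses land here.
        return None
    elapsed = time.monotonic() - start
    return elapsed if status == 200 else None
```

A scheduler would then run something like `measure_latency(lambda: http_status(iflow_url))` every 5 minutes, where `iflow_url` is a placeholder for the endpoint of the test iflow, and store only the non-`None` results. A gap in the stored values would then point to a failed call.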
Then the interesting question: can we see the messages processed in this period? If we look at the messages processed on the LatencyTest iflow, we can see the following. The screenshot is from the Figaf tool to give an easier view of your messages.
It looks as if no messages were processed between 1:59 and 4:18. Then at 4:22 all the messages processed during the outage are recorded. So no messages were lost during the upgrade, and processing was performed successfully. The messages were just logged after the upgrade was completed.
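Spotting such a hole in the message log can be automated. The sketch below is an assumption about how one could do it, not something the tool does today: the function name `find_gaps` is hypothetical, and the timestamps are made-up values on an arbitrary date that only mimic the pattern described above.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_gaps(
    timestamps: List[datetime], max_gap: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Return (earlier, later) pairs of consecutive log entries that are
    further apart than max_gap."""
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > max_gap:
            gaps.append((earlier, later))
    return gaps


# Illustrative log times only; the date is arbitrary.
day = datetime(2020, 1, 1)
log_times = [
    day.replace(hour=1, minute=54),
    day.replace(hour=1, minute=59),
    day.replace(hour=4, minute=18),
    day.replace(hour=4, minute=23),
]

# The iflow runs every 5 minutes, so anything above 10 minutes is suspicious.
for start, end in find_gaps(log_times, timedelta(minutes=10)):
    print(f"no log entries between {start:%H:%M} and {end:%H:%M}")
# prints: no log entries between 01:59 and 04:18
```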
So it seems the maintenance only affects monitoring and deployment. Already deployed iflows continue to run as they are. This makes it a lot easier for customers to run CPI, since messages are still processed.
Disclaimer: If the iflow had used more advanced features like the message store, passwords or adapters, we could have gotten a different result. It is also possible that only this particular upgrade behaved this way. And our 5-minute interval may have been too coarse to hit a moment when we would have gotten an error.
I’m looking forward to seeing what the next maintenance window looks like, what its impact will be, and whether we see something different then. Hopefully, by then we will have the option to log errors in the processing.
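To put a rough number on the interval concern: assuming an outage starts at a random moment relative to the probe schedule and every probe during the outage fails, the chance that at least one probe hits it is the outage length divided by the probe interval, capped at 1. The helper below is a hypothetical back-of-the-envelope sketch under exactly those assumptions.

```python
def detection_probability(outage_seconds: float, interval_seconds: float) -> float:
    """Chance that a fixed-interval probe lands inside an outage,
    assuming the outage start is uniformly random relative to the probes."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return min(1.0, max(0.0, outage_seconds) / interval_seconds)


# A 1-minute outage with a 5-minute probe interval:
print(detection_probability(60, 300))  # 0.2
```

So a short hiccup of about a minute would be missed four times out of five with the current setup, while any outage of 5 minutes or longer would always be seen.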