Marcel Christ

SAP Integration Suite: Resilient APIs using API Management, Event Mesh, and Cloud Integration

Introduction

This blog article outlines an approach to designing APIs and interfaces that are more robust against unavailable services, interface errors, and data loss. The following services of SAP Integration Suite are used to achieve this goal: SAP API Management, SAP Event Mesh, and SAP Cloud Integration. Similar ideas for applications are described in the discovery center mission “Enhance core ERP business processes with resilient applications on SAP BTP”: https://discovery-center.cloud.sap/missiondetail/3501/3542/

 

Scenario

In this scenario, sales orders are received via a public and asynchronous API, delivered to the event mesh, and transmitted via cloud integration to SAP S/4. The problem addressed here is how to reduce data loss in situations such as (1) errors in the request payload leading to mapping errors or failures in SAP S/4 processing, (2) unavailability of the iFlow, for example during redeployment, or (3) unavailability of the receiver system SAP S/4.

Of course, there is more than one solution to this, and I am sure I haven’t seen them all. Apart from the solution presented here you could for example:

  • Archive events in the data store of the cloud integration or in a cloud storage outside of the Integration Suite. This only works if the iFlow is always operational.
  • Use the retry functionality of the XI adapter. All events triggering an error before reaching the XI adapter still cannot be reprocessed and are lost.
  • Use a CAP service within a data integration hub. You get persistence of data for free and comfortable validation; plus, you are well prepared for the clean core and side-by-side architectures which will be relevant in the future, as described here by Thorsten. But you need the skill to implement and maintain CAP services.

Do you have any other ideas? Please let me know in the comments.

 

The following aspects of this scenario are outlined in six steps:

  • Configuration of the event mesh instance.
  • Configuration of the iFlow.
  • Some validation capabilities of the API.
  • Reprocessing events in case of errors.

 

Step 1: Create an event mesh instance

Set up an event mesh instance in your integration subaccount and create two keys – one for the API and one for the Cloud Integration access, for example:

Fig.1: Event Mesh Instance.
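As a rough illustration of what the instance configuration can look like, here is a minimal Event Mesh service descriptor. The emname, namespace, and filter patterns below are placeholders, not values from this scenario:

```json
{
  "emname": "orders-mesh",
  "namespace": "company/orders/v1",
  "options": {
    "management": true,
    "messaging": true,
    "messagingrest": true
  },
  "rules": {
    "queueRules": {
      "publishFilter": ["${namespace}/*"],
      "subscribeFilter": ["${namespace}/*"]
    },
    "topicRules": {
      "publishFilter": ["${namespace}/*"],
      "subscribeFilter": ["${namespace}/*"]
    }
  }
}
```

The two service keys for the API and for Cloud Integration are then created against this instance and carry the connection URIs and credentials.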

 

Step 2: Create the queues

Once the event mesh instance is up and running, create two queues: one for processing events and one dead message queue:

 

Fig. 2: Event Mesh Queues.

 

Configure the main queue for processing events with a dead message option:

Fig. 3: Enable Dead Message Queuing.

 

Step 3: Create the iFlow

This iFlow pulls events from the event mesh queue via the AMQP adapter, performs a JSON-to-XML conversion and a message mapping using Groovy, and delivers the request via the XI adapter to SAP S/4. Alerts are sent via an exception process to SAP AIF for better monitoring.

 


Fig. 4: Pulling events from Event Mesh and delivering to SAP S/4.
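The transformation performed by this iFlow can also be illustrated outside Cloud Integration. The following Python sketch is an illustration only: the actual iFlow uses the JSON-to-XML converter and a Groovy mapping script, and all field and element names below are assumptions, not taken from the real interface:

```python
import json
import xml.etree.ElementTree as ET

def map_order(payload: str) -> str:
    """Convert an incoming JSON order into an XML shape for the receiver.
    Field names are illustrative placeholders only."""
    order = json.loads(payload)
    root = ET.Element("SalesOrder")
    ET.SubElement(root, "OrderId").text = str(order["orderId"])
    ET.SubElement(root, "OrderDate").text = order["orderDate"]
    items = ET.SubElement(root, "Items")
    for item in order.get("items", []):
        node = ET.SubElement(items, "Item")
        ET.SubElement(node, "Material").text = item["material"]
        ET.SubElement(node, "Quantity").text = str(item["quantity"])
    return ET.tostring(root, encoding="unicode")

sample = '{"orderId": 4711, "orderDate": "2022-05-01", "items": [{"material": "M-01", "quantity": 2}]}'
print(map_order(sample))
```

If such a mapping throws an exception in the real iFlow, the exception subprocess raises an alert in SAP AIF and the event ends up in the dead message queue.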

 

You can find details of how to set up the exception process for AIF used in the above iFlow in this blog by Nicole Bohrmann.

Here is how you would configure the AMQP adapter:

Fig. 5: AMQP adapter.

Details of how to utilize the AMQP adapter in cloud integration are described in this excellent blog:
Cloud Integration – Connecting to Messaging Systems using the AMQP Adapter by Mandy Krimmel

 

Step 4: Expose the API

Create, configure, and publish your API in the API Hub Portal.

Call the event mesh instance directly by entering the URL as target endpoint.

 

Fig. 6: API target endpoint

Note: If the API provides more than one REST resource, the resource name is appended to the URL by default when calling the target endpoint. We chose to rewrite the target URL in the API policy:

 

Fig. 7: API Policy excerpt for rewriting target endpoint.
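Such a rewrite can be done with an AssignMessage policy that overwrites the target.url flow variable. The excerpt below is a sketch only; the host and queue segment are placeholders for your Event Mesh REST messaging endpoint, not values from this scenario:

```xml
<AssignMessage async="false" continueOnError="false" enabled="true" name="rewriteTargetUrl">
  <AssignVariable>
    <Name>target.url</Name>
    <!-- Placeholder: send every resource to the Event Mesh messaging endpoint -->
    <Value>https://{your-event-mesh-host}/messagingrest/v1/queues/{orders-queue}/messages</Value>
  </AssignVariable>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</AssignMessage>
```

The policy is attached to the target endpoint flow so that it runs after all resource-level processing.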

Here are references for this step:

https://docs.apigee.com/api-platform/reference/policies/assign-message-policy#assignvariable

https://stackoverflow.com/questions/24702740/apigee-modify-target-path-when-using-targetserver

 

Step 5: Test – sending sample requests

Below you see the response after sending an order request to the API. Since the API is asynchronous and moves the events into a queue of the event mesh, the response only states that the data has been received and will be processed. At this point, we cannot respond with accurate information about the actual processing status of the order.

 

  • Success – custom response and return code 201

Fig. 8: Success with custom response.

 

  • Failure – responding with validation error 422

To reduce errors in the iFlow and the backend system, we have set up some validation for incoming requests. Here is an example with a wrong date format and a missing mandatory field. This request is blocked right away, and the sender receives a verbose response:

Fig. 9: Validation error.
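The kind of checks behind such a validation error can be sketched as follows. This is an illustration only: the mandatory fields and the date format below are assumptions, and the real checks live in the API policy, not in Python:

```python
import re

REQUIRED_FIELDS = ["orderId", "orderDate", "customer"]  # assumed mandatory fields
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")       # assumed ISO date format

def validate_order(order: dict) -> tuple[int, dict]:
    """Return (status_code, body) the way the API would respond:
    422 with a verbose error list, or 201 on acceptance."""
    errors = []
    for field in REQUIRED_FIELDS:
        if field not in order:
            errors.append(f"Missing mandatory field: {field}")
    date = order.get("orderDate")
    if date is not None and not DATE_PATTERN.match(date):
        errors.append(f"Invalid date format: {date} (expected YYYY-MM-DD)")
    if errors:
        return 422, {"status": "rejected", "errors": errors}
    return 201, {"status": "received", "message": "Order accepted for processing"}

print(validate_order({"orderId": 1, "orderDate": "01.05.2022"}))
```

Rejecting malformed requests at the API layer keeps them out of the queue entirely, so they never reach the iFlow or the dead message queue.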

Step 6: Reprocessing orders 

If events have passed the API validation but an error occurs in the cloud integration, they are moved to the dead message queue. There are different ways to handle this; we decided to simply use an additional small iFlow which moves all events from the dead message queue back to the orders queue. As soon as the error in the cloud integration is fixed, the iFlow will deliver the events to SAP S/4 again. In case the iFlow is undeployed or not operational, events remain in the orders queue and are automatically processed once the iFlow is deployed. If the receiver SAP S/4 is not available, requests can be stored in the JMS queue of the XI adapter for reprocessing.
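The logic of this reprocessing can be simulated with two in-memory queues. This Python sketch only illustrates the flow of events; in the real setup, both queues live in Event Mesh, and the moves are performed by the iFlows:

```python
from collections import deque

orders_queue = deque()  # main queue for processing events
dead_queue = deque()    # dead message queue

def process(event, s4_available: bool):
    """Simulate the main iFlow: deliver to SAP S/4 or dead-letter the event."""
    if s4_available:
        return f"delivered {event}"
    dead_queue.append(event)
    return None

def redeliver_dead_messages():
    """Simulate the small reprocessing iFlow: move everything from the
    dead message queue back to the orders queue."""
    while dead_queue:
        orders_queue.append(dead_queue.popleft())

# A delivery fails while the receiver is down ...
process("order-1001", s4_available=False)
# ... and once the error is fixed, the event is moved back and processed again.
redeliver_dead_messages()
print(process(orders_queue.popleft(), s4_available=True))  # → delivered order-1001
```

No event is lost in either failure mode: it waits in the orders queue, in the dead message queue, or in the XI adapter's JMS queue until it can be delivered.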

 

 

Conclusion

This blog post provides some ideas for creating more resilient APIs. A use case is presented as an implementation utilizing a combination of SAP API Management, SAP Event Mesh, and SAP Cloud Integration.

Please feel free to post any questions in the comments section.

Stay up to date and follow the SAP Integration Suite topic page, post and answer questions, and read other posts on the topic.

 


      15 Comments
      Martin Pankraz

      Thanks for sharing Marcel Christ. Like the community driving this topic further.

      A couple of thoughts:

      • How about elaborating on the re-processing part? I suggest the circuit breaker pattern.
      • What about BTP outages? SAP suggests deploying everything in one region, which likely means all of your mentioned services are unavailable during a disaster. Full resiliency in terms of availability can only be achieved by using multiple regions and global routing. Critical BTP customers are implementing it like that. Have a look here. It was adopted by SAP as the "intelligent routing" mission.
      • Further, I recommend external monitoring, as that aspect goes down with the region too. See here. If you stay with BTP services, at least move them to another BTP region.
      Thorsten Duevelmeyer

      Hi Martin,

      just some ideas from my side:

      • yes, circuit breaker would be a good one. What do you think of simple dead letter queues and reprocessing of messages from there? If we assume that Event Mesh is nearly always up and running (otherwise the sender will get an error), we could handle re-processing of messages (in case of Cloud Integration errors) from there?
      • Adding more resilience to the service would be great, but comes with some extra costs.
      • What do you think about monitoring in Cloud ALM instead of Azure services?
      Martin Pankraz

      Re-processing from the dead-letter queue is feasible yes. But you are missing the part that avoids hitting the "recovering" application - hence the circuit breaker.

      From my perspective you need to assume that every Cloud service will fail eventually as it is an inherent challenge of any large distributed system. If your app needs to cater for that you are in uncertain "waters" if you think only about the uptime.

      Cost is important, yes. Some companies lose millions by the hour or worse during outages. The business case for the extra redundancy is easily calculated then 😉

      Cloud ALM vs. Azure is a typical topic of customer choice based on their platform. Is it SAP only, or do you have others? Also, who is monitoring? Many customers want to streamline this process and bring the different IT departments closer together. Hence the adoption of Azure Monitor for SAP solutions, for the Azure IT to better connect with the SAP Basis folks using Solution Manager, Cloud ALM, etc. Usually not one single approach for everyone.

      Marcel Christ
      Blog Post Author

      Hi Martin,

      thank you for the valuable comments and ideas!
      We have not yet considered a failure of BTP, and our experience is that it is more stable than SAP PO in this respect. Indeed, as Anurag mentions below, our setup is similar to what PI/PO offers out of the box. It would be nice to have the same in the cloud integration standard.

      In our case, the challenge is how to handle senders who cannot resend their requests and do not process the responses of their API calls. This time, we tried to keep it simple and use SAP means only, but we are using 3rd party functionalities (AWS, Kafka) for other interfaces too. Monitoring and complexity then become a challenge.

      So far we have no problems with big loads and the number of incoming calls can be limited in the API policies to protect the backend systems.

      Martin Pankraz

      Good to know.

      In terms of PI/PO availability -> it can be as stable as any PaaS or SaaS service if you implement best practices for high-availability and disaster recovery. If you run it on a single VM of course it will be less reliable than BTP. Compare apples for apples 😉

      Regarding your sender re-processing capability challenge, a reliable message queuing component makes sense. My point was that Event Mesh is not a message queue. SAP Cloud Integration offers JMS, but you're limited to one BTP region if you don't apply other means. That's described in the "intelligent routing" mission SAP released based on our co-developments.

      Anurag Gupta

      This is a good design to enable a retry mechanism for important interfaces; it almost mimics what we have in SAP PI/PO with persistence of messages at each phase of processing.

      But ideally I would not use Event Mesh in all interfaces; instead I would define appropriate API responses for the consumer in case of errors, so that the consumer can understand and handle them appropriately. In my opinion, Event Mesh solves event-driven cases where producers and subscribers are involved.

      Thorsten Duevelmeyer

      Hi Anurag,

      sure, that is also a way to do it.

      I think doing it with a combination of API Management and Event Mesh makes the service resilient and asynchronous. That seems to be a good idea for SAP environments & processes, as most requests can be handled asynchronously, also to protect the backend against too much load.

      Does that make sense for you?

      With a direct / sync backend connection, the direct way with HTTP response codes and response messages would definitely be the better way to do it.

      Martin Pankraz

      The lines at SAP blur between Events and Messages. Event Mesh is not a Messaging service, while Cloud Integration is not an Eventing service. For proper pattern design those shouldn't be mixed or applied to the opposite use cases. Even though you could survive for smaller loads, this would still be an anti-pattern in my opinion.

      As soon as you hit scale, not-ideal designs will be sub-optimal.

      Anurag Gupta

      Hi Thorsten Düvelmeyer / Marcel Christ

      The design which you have proposed is a good one for making APIs resilient. It is popularly known as the Storage First API pattern. AWS recommends it for asynchronous interfaces too, but not for all.

      https://aws.amazon.com/blogs/compute/building-storage-first-applications-with-http-apis-service-integrations/

      If I have to relate AWS SQS/SNS to SAP Cloud, two things come to mind: Event Mesh and Integration Suite's JMS. Since JMS doesn't support retry-limit-based dead letter queue forwarding out of the box, Event Mesh is the best option here.

      Marcel Christ
      Blog Post Author

      Thanks Anurag for bringing that up! It is an interesting pattern I did not know about and which nicely relates to our setup.

      Vijay Konam

      The same can be achieved using any 3rd party service bus (e.g. Azure Service Bus) as well. SAP Event Mesh is not mandatory. I just wanted to add this point here because most customers are already using Azure, AWS, or some other hyperscaler for realizing their event-driven architecture for non-SAP integration. It does not make sense to deploy SAP Event Mesh in that case (at least in our case). The only sad part is that SAP does not let S4/ECC systems talk to EDA platforms other than SAP's own Event Mesh. I wish this were not the case.

      I thank SAP for not restricting CPI to SAP Event Mesh only using the AMQP adapter!

       

      Thorsten Duevelmeyer

      Yes, it was a very good idea of SAP to implement an open platform in BTP 🙂
      Perhaps a combination of BTP services and other PaaS and SaaS solutions may be the best way for use cases in the future?

      Martin Pankraz

      Hi Vijay Konam,

      you are free to use a Partner AddOn, the ABAP SDK for Azure or your ABAP extensions to call any messaging service you like. Have a look here to get started.

      For the out-of-the-box SAP standard integration guide you are right, that is restricted to Event Mesh.

      Vijay Konam

      Thank you Martin. Good to know about ABAP SDK for Azure.

      Frank Li

      Thanks for sharing such an interesting topic; I learned a lot from this blog.