
Eng Swee’s tips for CPI development

It has been about three and a half years since my first foray into the world of cloud integration. A lot has changed since – the official product name changed from HCI to CPI (with a few others thrown in between that did not stick long enough); Eclipse was dropped in favor of WebUI (which I still do not agree with, but have chosen to move on); the many quirky bugs and errors that beleaguered the initial versions of the product have more or less been dealt with; and the product has gone from strength to strength, maturing with each passing month’s rolling software update. Of course, there are some parts that have not changed like the sorely missing self-service free trial access, but that will be a story/battle for another day. [Update 21 Jun 2019: Self-service CPI trial is finally available in SAP Cloud Platform Cloud Foundry (CF) environment – yay!]

While it is relatively easy for someone with an integration background (like PI) to pick up CPI and be productive, ensuring that integration flow designs are robust and can withstand the test of time is a different matter. Apache Camel, the underlying framework, supports a flexible modelling environment, but it also shifts the onus onto the developer/architect to ensure the interfaces are well designed.

After implementing custom CPI developments, first for SuccessFactors Employee Central integration and then for A2A integrations with an on-premise S/4HANA system, I have found various design and development practices that have worked particularly well for me. In the rest of this post, I will share them with you.

 

1. Utilise ProcessDirect adapter to create common/shared IFlows

Certain adapters have configuration parameters that are interface-agnostic and therefore can/should be reused across multiple interfaces. Unlike PI, there is no concept of communication channels in CPI. Each interface is designed within its own IFlow, and there are limited options for reuse. The introduction of the ProcessDirect adapter last year enables us to overcome this limitation.

For example, an IDoc receiver can be modelled as a separate IFlow as shown below using ProcessDirect in the sender channel.

This allows the common IFlow to be invoked by more than one IFlow simply by sending to the same endpoint via a ProcessDirect receiver channel as shown below.

The benefit of such an approach is that common values that need to be populated in an IDoc receiver channel (e.g. URL, Credential Name) are maintained in just one place. This simplifies matters when we need to deploy multiple interfaces at the same time, as well as future maintenance if there are changes to the common receiver.

During runtime, multiple messages (one for each linked IFlow) will appear in the message monitor and these can be linked together with the Message Correlation ID (as shown below).

This approach is applicable not only for grouping common channels, but also for reusing common mappings or common external libraries (e.g. if you use FormatConversionBean in more than one IFlow).

 

2. Use Groovy Script for mappings

For most integration scenarios, mapping is one of the most pivotal parts of the design and development. A mapping object changes many times throughout its lifecycle – from initial development, through the initial production release, to the many enhancements during the maintenance period. Sometimes it may end up looking totally different from how it was when it was first developed.

Therefore, deciding on which mapping approach to use is important – this affects how easy/effective it is to develop, maintain and enhance a mapping object during its lifecycle. Personally, I am most at home when working with a full-fledged programming language. In the context of CPI, that means developing mappings as Groovy Scripts, as I have described in detail in I *heart* Groovy mapping.

Using Groovy allows me to develop, maintain and enhance mappings in a very effective manner.
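To illustrate the idea, below is a minimal sketch of a Groovy mapping script using CPI's standard processData entry point. The source and target field names (Employee, PersonID, FirstName, etc.) are invented for illustration, and the script only runs inside a CPI tenant, which provides the Message class at runtime.

```groovy
import com.sap.gateway.ip.core.customdev.util.Message
import groovy.xml.MarkupBuilder

def Message processData(Message message) {
    // Parse the source payload via a Reader (see tip 3)
    def input = new XmlSlurper().parse(message.getBody(Reader))

    def writer = new StringWriter()
    def target = new MarkupBuilder(writer)

    // Build the target structure; field names here are purely illustrative
    target.Employees {
        input.Employee.each { emp ->
            Employee {
                ID(emp.PersonID.text())
                FullName("${emp.FirstName.text()} ${emp.LastName.text()}")
            }
        }
    }

    message.setBody(writer.toString())
    return message
}
```

Because the whole mapping is plain Groovy, it can be refactored, diffed and unit-tested like any other code, which is precisely what makes this approach effective over a mapping's lifecycle.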

 

3. Use Reader when accessing message body in Groovy Script

The technique of streaming the XmlSlurper input in Groovy scripts has been around for two years, yet I see too many scripts online (in forum and blog posts) and in CPI tenants using the following inefficient approach to access the payload (message body).

def body = message.getBody(java.lang.String)

While this is acceptable and may not cause any issues with smaller sized payloads, it does not scale well when the payload gets larger.

Instead, always access the message body with a Reader (even when you are not using XmlSlurper for further XML parsing).

def body = message.getBody(Reader)

Furthermore, you can use strong typing to take advantage of the IDE’s code completion in the subsequent lines of code. Also, Groovy extends Reader with many helper methods that let you work with IO efficiently.

Reader reader = message.getBody(Reader)
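Putting these pieces together, a script skeleton might look like the sketch below. The processData signature is CPI's standard Groovy script template; the XmlSlurper step is illustrative and only applies when the payload is XML.

```groovy
import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // Strong typing gives IDE code completion on 'reader'
    Reader reader = message.getBody(Reader)

    // XmlSlurper consumes the Reader directly, so the payload is
    // streamed rather than materialised as one large String
    def root = new XmlSlurper().parse(reader)

    // ... work with 'root' here ...

    return message
}
```

The difference is invisible for small test payloads but matters under load: getBody(String) forces the whole payload into heap memory, while the Reader lets downstream parsing consume it incrementally.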

 

4. Externalise parameters to optimise configurability of IFlows

Consider using externalised parameters to optimise the design of IFlows for configurability. This allows certain aspects of an IFlow to be changed without editing it, following the same approach as SAP’s prepackaged content, most of which is “Configure-Only”.

Common use cases for externalised parameters are:-

  • Endpoint, URL, server, directory and credentials
  • Scheduler for polling adapters
  • IDoc partner profile details
  • Parameter to control payload logging
  • Parameter values that differ in different environments (Dev, QA, Prod)
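As an example of the payload-logging use case, the sketch below assumes a Content Modifier earlier in the IFlow sets an exchange property from an externalised parameter (the name Enable_Payload_Logging is my own invention), so that logging can be switched on per environment without editing the IFlow.

```groovy
import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // 'Enable_Payload_Logging' is an assumed property name, set e.g. by a
    // Content Modifier from an externalised parameter {{Enable_Payload_Logging}}
    def logPayload = message.getProperty('Enable_Payload_Logging')

    if ('true'.equalsIgnoreCase(logPayload as String)) {
        def messageLog = messageLogFactory.getMessageLog(message)
        // Note: this materialises the body as a String, so treat it as a
        // troubleshooting aid rather than something left on permanently
        messageLog?.addAttachmentAsString('Payload', message.getBody(String), 'text/xml')
    }
    return message
}
```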

 

5. Populate Application Message ID to uniquely identify messages

The monitoring capabilities in CPI are somewhat limited. Often, when there are many messages in the system, it is not straightforward to uniquely identify a message that is required for further analysis. CPI does not have functionality such as User Defined Message Search (UDMS) that is available in PI. The search functionality for IDs is limited to searching by Message ID, Correlation ID or Application ID.

Fortunately, we can take advantage of the Application Message ID for our purposes. This is achieved by populating the message header SAP_ApplicationID with an appropriate value. Depending on the scenario, the following are two options to populate that header.

  1. For scenarios with HTTP-based sender channels, this can be populated by the sender system in HTTP header SAP_ApplicationID.
  2. Within the IFlow, message header SAP_ApplicationID can be populated via Content Modifier, Groovy Script or Javascript.

Note that the value populated via (1) can be overwritten by (2).

The onus is on the developer to consider what value is appropriate to populate into the Application Message ID. This could be a unique value sent by the sender system, or possibly the filename in the case of file-based integration – the screenshot below shows how a message can be searched based on the filename.
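For the file-based case, option (2) can be sketched with a short Groovy script such as the one below. It relies on the Camel convention that file-based senders populate the CamelFileName header; this is a sketch that only runs inside a CPI tenant.

```groovy
import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // For SFTP/file senders, Camel populates the 'CamelFileName' header
    def fileName = message.getHeaders().get('CamelFileName')
    if (fileName) {
        // This value appears as the Application Message ID in the monitor
        message.setHeader('SAP_ApplicationID', fileName)
    }
    return message
}
```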

 

6. Implement error handling in IFlow

CPI/Camel places the onus on the developer for error handling. This requires a paradigm shift for those coming from a PI background.

For asynchronous scenarios, not all CPI adapters have native capability for reprocessing messages, or native support for Exactly Once Quality of Service. As such, the developer needs to consider the cases where a message fails in CPI, and what type of error handling is required. Below are some options for consideration:-

  • For HTTP-based senders, explore possibility of sender system retriggering message to CPI.
  • For an SFTP sender, files are not consumed (archived or deleted) if an error occurs during message processing in CPI. Such erroneous files will be picked up for processing again during the next polling cycle. Design the IFlow/mapping such that no application errors are expected for any case, leaving only the possibility of systemic errors such as intermittent connectivity. This inherently enables the files to be automatically delivered again once the connection is re-established. The drawback of such an approach is that a high polling frequency can generate a lot of noise in the message monitor.
  • Consider utilising JMS adapter/queues to introduce native reprocessing functionality. The good news is that it is now available for non-Enterprise Edition tenants (requiring additional $$$ of course). The drawback is that JMS queues are “expensive” and even Enterprise Edition comes with only 30 queues by default. Furthermore, if it is a typical 2-tier landscape, then the queues in the non-production tenant will have to be split between Dev and QA (considering a typical 3-tier backend landscape).
  • Manually implement Exactly Once using approaches described here and here.
  • [Update 2 Jul 2019] – If you can get your hands on some JMS queues (by any means necessary!) – consider implementing the design detailed in Not enough JMS queues for Exactly Once? Share them between IFlows!

Another aspect to consider is implementing exception handling using an exception sub-process. This allows the developer to introduce explicit logic to handle exceptions that occur during message processing. However, for complex IFlows that are broken down into multiple Local Integration Processes (typical of SuccessFactors Employee Central or Hybris Marketing integrations involving multiple OData calls), the error handling can get tricky or messy. Exceptions are not automatically propagated from a Local Integration Process to the main Integration Process, so exception sub-process steps need to be implemented in each Local Integration Process block, which can lead to a lot of duplication.
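A common pattern inside an Exception Subprocess is to capture the caught exception for later analysis. The sketch below relies on the Camel convention that the caught exception is stored in the CamelExceptionCaught exchange property; attachment name and handling are my own choices, and the script only runs inside a CPI tenant.

```groovy
import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // Camel stores the caught exception in this exchange property
    def ex = message.getProperties().get('CamelExceptionCaught')
    if (ex != null) {
        def messageLog = messageLogFactory.getMessageLog(message)
        // Attach the error details to the message log for easier analysis
        messageLog?.addAttachmentAsString('ErrorDetails', ex.toString(), 'text/plain')
    }
    return message
}
```

Since such a script has no dependency on a particular Local Integration Process, keeping it as one shared script resource at least reduces the duplication across the exception sub-process blocks, even if the steps themselves must still be modelled in each block.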

 

7. Design integration artifacts to cater for 2-tier landscape

By default, most CPI landscapes are 2-tier, with one non-production tenant and one production tenant. However, the systems connected to CPI are typically 3-tier (Dev, QA & Prod) for example an on-premise SAP system. Therefore the common approach is to use the non-production CPI tenant to integrate with both Dev and QA environments.

This is achieved by duplicating the integration artifacts in design under a different name (typically by adding a suffix like _QA) and deploying them with different configured parameters.

Ensure that the design of the integration artifacts supports such a deployment approach. Some common aspects to consider are:-

  • Configurable endpoint, URL, server, credentials

  • Value Mapping containing different Agency/Identifier combinations

  • Separate packages for Dev and QA artifacts

 

8. Implement version management – both in WebUI and Git

Version management is an important aspect of any development lifecycle. CPI provides basic version management functionality in WebUI, and it is also possible to download the artifacts and manage them externally via an SCM like Git. My recommendation is to use both.

WebUI

The functionality provided allows us to persist the entire state of the artifact at a point in time. It is recommended to use ‘Save as Version‘ whenever the artifact reaches a somewhat stable/working state. This allows us to make further changes with the assurance that we can revert back to a working version if required.

Furthermore, without version management in WebUI, the state of the artifact remains in ‘Draft’. Because an IFlow consists of many objects (IFlow model, schemas, scripts), it is difficult to compare the state of an IFlow across the different environments/tenants. If the copies of the IFlow in Dev, QA and Prod all have the ‘Draft’ status, this will require manual comparison of all objects within an IFlow to determine if there are any differences between them.

Personally, I use the version number to know whether the IFlow is the same between environments, without having to inspect each object individually. CPI uses a 3-digit numbering system similar to Semantic Versioning. My approach is slightly different: I also use the number to indicate the release state of the IFlow during the major stages of the implementation lifecycle. The following denotes the different stages:-

Version No   State
1.1.0        Initial release for System & Integration Testing (SIT)
1.2.0        Initial release for User Acceptance Testing (UAT)
2.0.0        Initial release for Production

The last digit is incremented by one for every fix/enhancement.

Note: Thank you to Piotr Radzki, who highlighted that packages with objects in ‘Draft’ cannot be exported. Since exporting is a prerequisite for transporting to the Production environment, this is all the more reason to implement a version numbering system that is easy to use and makes sense.

Git

As part of my workflow, I also use Git within an IDE to externally manage my CPI development. While this is mainly for the Groovy Scripts, which typically go through many changes during the implementation lifecycle, I also store all the other objects of an IFlow.

I have described this approach using the HCI Eclipse plugin (before it was deprecated); now that I have switched over to IntelliJ IDEA, the approach is still applicable using the VCS functionality provided by IntelliJ.

 

Conclusion

So there it is – some of the practices, approaches and guidelines that have worked particularly well for me when I work on CPI development. It is by no means a “best practices” document (yet), as the whole area of cloud integration is still changing at a rapid pace. As Vadim Klimov commented here, “the more we work on cloud integration, the more we find new creative ways to deal with them” (paraphrased).

I welcome any comment, opinion, and feedback on these, and feel free to even disagree with me if you have a different perspective on them.

 

 

Conversation about this post on LinkedIn.

  • Good Blog. We also follow most of the points you have mentioned.

    Though you mentioned for two-tier landscape append System/Phase name we follow that for artifacts/credentials deployed so that it will not create confusion.

    However, keep up the good work and keep sharing to this community.

    Regards

    Rajesh Pasupula

     

     

    • Hi Rajesh

       

      Thanks for your comment. For design artifacts (IFlow, Value mapping, package), I do not add the suffix to these objects for the Dev version. The reason is that the Dev version is the “golden” version which will be the source of transport to Production environment. Ideally IMHO design artifacts should be free from environment/phase indicators in their naming – it’s just that the limitation enforced by a 2-tier landscape forces an additional suffix for additional duplicate IFlows.

       

      For security artifacts (e.g. user credentials), I use the following naming convention:-

      <SystemName>_<AdapterType>_<SystemIdentifier>

      Examples:-

      External_SFTP_QA

      SF_SuccessFactors_C0000nnnnn

      I will update the post later with this additional point.

       

      Regards

      Eng Swee

  • That is a really impressive and useful collection of recommendations and a summary of practices to follow. A definite must read for a CPI developer or architect!

    While regular usage of most of those points shall introduce a noticeable positive effect very quickly, if not instantly, I feel some best practices you mentioned might not be that obvious at first sight, but are hidden gems, and if not followed, can make things go wrong and cause severe issues at later stages (e.g. in a production environment) – tip #3 “Use Reader when accessing message body in Groovy Script” is one of them. In the cloud environment, it is easier to miss a point of performance optimization of internal resources used by the iFlow – such as memory consumption of the iFlow at runtime when it is put under load. This might be partly related to the complexity or impossibility of obtaining some relevant information (to my knowledge, JVM resource monitoring and JVM profiling tools are not exposed to customers in a user-friendly way), and on the other hand, to a certain assumption about the elasticity of cloud resources. As a result, as you mentioned, problems with performance and resource consumption of custom scripts can be introduced when developing iFlows and remain unnoticed until certain conditions / load patterns are met, making such scripts act as “time bombs” within iFlows.

    This emphasizes once again the necessity of testing scripts from various aspects – not only functional, but also stability and performance. In the absence of customer exposure to JVM-level monitoring and profiling tools in CPI, local testing becomes even more critical. The approach you described in earlier blogs on how to set up an environment for local execution of Groovy scripts can be re-used for other test types. Unit testing (with Spock or any other alternative framework) is one great example you have already covered in detail – we can also put a script under test in a local environment to identify resource leaks or suboptimal usage (locally, we have tools to observe the behaviour of the Java runtime and collect necessary metrics). It is not ideal, as it doesn’t put the script in the wider context of the iFlow and doesn’t reproduce the JVM settings of a CPI tenant accurately, but if we can emulate representative input to the script, some issues like inefficient usage of Strings or collections shall be spotted there, and certain conclusions originating from local runtime observations can be extrapolated and applied to CPI.

    And another one that stands close to the above: in addition to the various unit and performance tests that a script can undergo in a local environment even before it gets embedded into the iFlow and reaches integration testing, code quality checks are something that can help avoid some common stability / maintainability issues. A few practices to mention that nicely complement unit testing: code coverage (especially for large scripts) – to ensure that unit tests cover critical execution paths, and static code analysis – to run automatic checks against a certain rule set (the outcome of those checks can help the developer optimize the script coding). To be more precise, static code analysis can come first, followed by unit testing and test coverage verification – all done on the basis of scripts developed locally. I used SonarQube for static code analysis and Clover for test coverage (works well with Spock), and they both worked nicely for Groovy developments, but I can imagine organizations might have their own preferences in the tooling that is out there in these areas. As a more advanced option, this could even be automated and embedded into a development pipeline following CI/CD principles, but that is really something for the next level of organization of the development cycle for CPI – not only about tools and techniques, but mindset and development infrastructure.

    The comment started as just a few sentences, but it looks like it evolved into a long one… In short: test early, test often.

    • Hi Vadim

       

      Thanks for your extensive insight into this. I’m surprised that point 3 really stood out for many readers, but yes, it is definitely an important one.

       

      Yes, the capability for local testing does open up a lot of possibilities for the developer to ensure that scripts are developed in an optimal and robust manner. This becomes especially important the more complex a script gets. As such, “testing early” matters because it allows us to catch bugs earlier – it is a lot costlier to fix bugs at a later stage of the development cycle, especially if they could have been identified much earlier on. Another way to see this: if a script causes issues in a local environment, these could potentially be amplified when it is executed in a tenant.

       

      I haven’t had the need to run coverage checks for my scripts as mine are not too complex, but it is definitely a possibility, and this could be an area for further expansion as CPI matures together with the developer ecosystem around it.

       

      Regards

      Eng Swee

  • Hi Eng

    Love that you are also just using a two-tier system landscape. We added the function to have a virtual QA system to import into in the Figaf IRT tool. We added it mostly because it was easier to run tests of our transports when we only have one CPI tenant ourselves.

    You can see a demo of the transport part here.

    https://youtu.be/tMW5NAOaXWM?t=521

    It will add new postfixes to packages and iflows, and to HTTP/SOAP sender endpoints, but this can easily be added for other adapter types too. I guess we should also add it to ProcessDirect for both sender/receiver.

    • Hi

      We did improve the virtual landscapes, so it now also supports the ProcessDirect adapter. All processing via ProcessDirect then happens in separate flows.

      (link removed by moderator)

      We also moved the packages to start with Z that way they will always be placed at the end of your list of packages.

  • Eng Swee Yeoh

    This is a great list. I’m sure I’ll end up referring to it many times.

    Do you also have advice on the following?

    1. iFlow naming conventions. Is <sender>_<receiver>_<description> good enough?
    2. Guidance on organizing iFlows in packages. Since transports happen at a package level, how do you decide to group iFlows into packages?
      For e.g.: we will end up with a large number of simple file-to-file scenarios (100+) with no mappings. These will span dozens of different systems. The interfaces will go live in waves over 12-18 months. What would be the best way to build/organize these? All in one package? Packages across domains (HR/MFG etc)? One iFlow per package (I hope not!)?
    • Hi Harsh

       

      Thanks for your comment, and raising such a pertinent question.

      Unfortunately, I do not have a general one-size-fits-all recommendation for the two areas you brought up (and therefore it was not included in this post).

      However, I’ll try to address your questions specifically:-

      1) Naming convention is a very subjective matter and varies from organisation to organisation. Personally, I’ve worked with SuccessFactors integration quite a bit, so I draw my approach from SAP’s way of naming the prepackaged content (with some minor modifications of my own):-

      <Interface_ID><SourceSystem> <Content/BusinessObject> <Action> to <TargetSystem>

      e.g. IF001 – ServiceNow Employee ID Creation to Employee Central

       

      2) First of all, IMHO using CPI for file-to-file integration is an overkill. It would be better to use MFT tools for such integrations.

      Anyway, this is a tough one due to the issue with package-level transport. I would definitely split it into multiple packages across different domains/modules (FI, HR, SD, etc) and possibly even at a more granular level, across an end-to-end business process (e.g. Procure to Pay). Further considerations are the number of developers working on the implementation and the number of interfaces being developed during each wave/phase. The idea is to have each package worked on by ideally one developer, or if not, by up to 2-3 developers, so as to minimize coordination effort during package-level transport.

      Hopefully at some point CPI will have the capability to transport at the artifact level (IFlow, value mapping), and then we would not have to organise packages around a transport limitation.

       

      Regards

      Eng Swee

       

  • Thanks Eng Swee for a very informative and insightful blog.

    Can you suggest the best way of doing transports from the CPI Non-Prod to Prod tenant? As of now, we are using export & import, but we are facing a challenge moving artifact changes with externalized parameters. These externalized parameters get overwritten by Dev parameter values, so it is a cumbersome exercise to re-enter them with Prod values.

     

    Please suggest.

    Kind Regards,

    Anurag

  • Hi Eng,

    Great blog, thank you for your contribution and willingness to share your knowledge with the community. We are currently configuring our cloud integration architecture and had a question on retaining documents. What best practice do you utilize for archiving payloads? The business typically prefers to have an archive of payload data available for review.

    Thanks.

    -Jon