
Eng Swee’s tips for CPI development

It has been about three and a half years since my first foray into the world of cloud integration. A lot has changed since – the official product name changed from HCI to CPI (with a few others thrown in between that did not stick long enough); Eclipse was dropped in favor of WebUI (which I still do not agree with, but have chosen to move on); the many quirky bugs and errors that beleaguered the initial versions of the product have more or less been dealt with; and the product has gone from strength to strength, maturing with each passing month’s rolling software update. Of course, there are some parts that have not changed like the sorely missing self-service free trial access, but that will be a story/battle for another day. [Update 21 Jun 2019: Self-service CPI trial is finally available in SAP Cloud Platform Cloud Foundry (CF) environment – yay!]

While it is relatively easy for someone with an integration background (like PI) to pick up CPI and be productive, ensuring that integration flow designs are robust and can withstand the test of time is a different matter. Apache Camel, the underlying framework, supports a flexible modelling environment, but it also shifts the onus onto the developer/architect to ensure the interfaces are well designed.

After implementing custom CPI developments, first for SuccessFactors Employee Central integration and then for A2A integrations with an on-premise S/4HANA system, I have found various design and development practices that have worked particularly well for me. In the rest of this post, I will share them with you.

 

1. Utilise ProcessDirect adapter to create common/shared IFlows

Certain adapters have configuration parameters that are interface-agnostic and therefore can/should be reused across multiple interfaces. Unlike PI, there is no concept of communication channels in CPI. Each interface is designed within its own IFlow, and there are limited options for reuse. The introduction of the ProcessDirect adapter last year enables us to overcome this limitation.

For example, an IDoc receiver can be modelled as a separate IFlow as shown below using ProcessDirect in the sender channel.

This allows the common IFlow to be invoked by more than one IFlow simply by sending to the same endpoint via a ProcessDirect receiver channel as shown below.

The benefit of such an approach is that common values that need to be populated in an IDoc receiver channel (e.g. URL, Credential Name) are maintained in just one place. This simplifies matters when we need to deploy multiple interfaces at the same time, as well as future maintenance if there are changes to the common receiver.

During runtime, multiple messages (one for each linked IFlow) will appear in the message monitor and these can be linked together with the Message Correlation ID (as shown below).

This approach is applicable not only for grouping common channels, but also for reusing common mappings or common external libraries (e.g. if you use FormatConversionBean in more than one IFlow).

 

2. Use Groovy Script for mappings

For most integration scenarios, mapping is one of the most pivotal parts of the design and development. A mapping object changes many times throughout its lifecycle – from initial development, through the initial production release, to the many enhancements during the maintenance period. Sometimes it may end up looking totally different from how it was when it was first developed.

Therefore, deciding on which mapping approach to use is important – this affects how easy/effective it is to develop, maintain and enhance a mapping object during its lifecycle. Personally, I am most at home when working with a full-fledged programming language. In the context of CPI, that means developing mappings as Groovy Scripts, as I have described in detail in I *heart* Groovy mapping.

Using Groovy allows me to develop, maintain and enhance mappings in a very effective manner.
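To illustrate the idea, below is a minimal sketch of a Groovy mapping script using CPI's standard processData entry point. The source and target field names (Employee, PersonID, FirstName, etc.) are invented for illustration, and the script only runs inside a CPI tenant, which provides the Message class at runtime.

```groovy
import com.sap.gateway.ip.core.customdev.util.Message
import groovy.xml.MarkupBuilder

def Message processData(Message message) {
    // Parse the source payload via a Reader (see tip 3)
    def input = new XmlSlurper().parse(message.getBody(Reader))

    def writer = new StringWriter()
    def target = new MarkupBuilder(writer)

    // Build the target structure; field names here are purely illustrative
    target.Employees {
        input.Employee.each { emp ->
            Employee {
                ID(emp.PersonID.text())
                FullName("${emp.FirstName.text()} ${emp.LastName.text()}")
            }
        }
    }

    message.setBody(writer.toString())
    return message
}
```

Because the whole mapping is plain Groovy, it can be refactored, diffed and unit-tested like any other code, which is precisely what makes this approach effective over a mapping's lifecycle.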

 

3. Use Reader when accessing message body in Groovy Script

The technique of streaming the XmlSlurper input in Groovy scripts has been around for two years, yet I see too many scripts online (in forum and blog posts) and in CPI tenants using the following inefficient approach to access the payload (message body).

def body = message.getBody(java.lang.String)

While this is acceptable and may not cause any issues with smaller sized payloads, it does not scale well when the payload gets larger.

Instead, always access the message body with a Reader (even when you are not using XmlSlurper for further XML parsing).

def body = message.getBody(Reader)

Furthermore, you can use strong typing to take advantage of the IDE’s code completion in the subsequent lines of code. Also, Groovy extends Reader with many helper methods that let you work with IO efficiently.

Reader reader = message.getBody(Reader)
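Putting these pieces together, a script skeleton might look like the sketch below. The processData signature is CPI's standard Groovy script template; the XmlSlurper step is illustrative and only applies when the payload is XML.

```groovy
import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // Strong typing gives IDE code completion on 'reader'
    Reader reader = message.getBody(Reader)

    // XmlSlurper consumes the Reader directly, so the payload is
    // streamed rather than materialised as one large String
    def root = new XmlSlurper().parse(reader)

    // ... work with 'root' here ...

    return message
}
```

The difference is invisible for small test payloads but matters under load: getBody(String) forces the whole payload into heap memory, while the Reader lets downstream parsing consume it incrementally.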

 

4. Externalise parameters to optimise configurability of IFlows

Consider using externalised parameters to optimise the design of IFlows for configurability. This allows certain aspects of an IFlow to be changed without editing it, following the same approach as SAP’s prepackaged content, most of which is “Configure-Only”.

Common use cases for externalised parameters are:-

  • Endpoint, URL, server, directory and credentials
  • Scheduler for polling adapters
  • IDoc partner profile details
  • Parameter to control payload logging
  • Parameter values that differ in different environments (Dev, QA, Prod)
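As an example of the payload-logging use case, the sketch below assumes a Content Modifier earlier in the IFlow sets an exchange property from an externalised parameter (the name Enable_Payload_Logging is my own invention), so that logging can be switched on per environment without editing the IFlow.

```groovy
import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // 'Enable_Payload_Logging' is an assumed property name, set e.g. by a
    // Content Modifier from an externalised parameter {{Enable_Payload_Logging}}
    def logPayload = message.getProperty('Enable_Payload_Logging')

    if ('true'.equalsIgnoreCase(logPayload as String)) {
        def messageLog = messageLogFactory.getMessageLog(message)
        // Note: this materialises the body as a String, so treat it as a
        // troubleshooting aid rather than something left on permanently
        messageLog?.addAttachmentAsString('Payload', message.getBody(String), 'text/xml')
    }
    return message
}
```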

 

5. Populate Application Message ID to uniquely identify messages

The monitoring capabilities in CPI are somewhat limited. Often, when there are many messages in the system, it is not straightforward to uniquely identify a message that is required for further analysis. CPI does not have functionality such as User Defined Message Search (UDMS) that is available in PI. The search functionality for IDs is limited to searching by Message ID, Correlation ID or Application ID.

Fortunately, we can take advantage of the Application Message ID for our purposes. This is achieved by populating the message header SAP_ApplicationID with an appropriate value. Depending on the scenario, the following are two options to populate that header.

  1. For scenarios with HTTP-based sender channels, this can be populated by the sender system in HTTP header SAP_ApplicationID.
  2. Within the IFlow, message header SAP_ApplicationID can be populated via Content Modifier, Groovy Script or Javascript.

Note that the value populated via (1) can be overwritten by (2).

The onus is on the developer to consider what value is appropriate to populate into the Application Message ID. This could be a unique value sent by the sender system, or possibly the filename in the case of file-based integration – the screenshot below shows how a message can be searched based on the filename.
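For the file-based case, option (2) can be sketched with a short Groovy script such as the one below. It relies on the Camel convention that file-based senders populate the CamelFileName header; this is a sketch that only runs inside a CPI tenant.

```groovy
import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // For SFTP/file senders, Camel populates the 'CamelFileName' header
    def fileName = message.getHeaders().get('CamelFileName')
    if (fileName) {
        // This value appears as the Application Message ID in the monitor
        message.setHeader('SAP_ApplicationID', fileName)
    }
    return message
}
```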

 

6. Implement error handling in IFlow

CPI/Camel places the onus on the developer for error handling. This requires a paradigm shift for those coming from a PI background.

For asynchronous scenarios, not all CPI adapters have native capability for reprocessing messages, or native support for Exactly Once Quality of Service. As such, the developer needs to consider the cases where a message fails in CPI, and what type of error handling is required. Below are some options for consideration:-

  • For HTTP-based senders, explore possibility of sender system retriggering message to CPI.
  • For an SFTP sender, files are not consumed (archived or deleted) if an error occurs during message processing in CPI. Such erroneous files will be picked up for processing again during the next polling cycle. Design the IFlow/mapping such that no application errors are expected for any case, leaving only the possibility of systemic errors such as intermittent connectivity. This inherently enables the files to be automatically delivered again once the connection is re-established. The drawback of such an approach is that a high polling frequency can generate a lot of noise in the message monitor.
  • Consider utilising JMS adapter/queues to introduce native reprocessing functionality. The good news is that it is now available for non-Enterprise Edition tenants (requiring additional $$$ of course). The drawback is that JMS queues are “expensive” and even Enterprise Edition comes with only 30 queues by default. Furthermore, if it is a typical 2-tier landscape, then the queues in the non-production tenant will have to be split between Dev and QA (considering a typical 3-tier backend landscape).
  • Manually implement Exactly Once using approaches described here and here.
  • [Update 2 Jul 2019] – If you can get your hands on some JMS queues (by any means necessary!) – consider implementing the design detailed in Not enough JMS queues for Exactly Once? Share them between IFlows!

Another aspect to consider is implementing exception handling using an exception sub-process. This allows the developer to introduce explicit logic to handle exceptions that occur during message processing. However, for complex IFlows that are broken down into multiple Local Integration Processes (typical of SuccessFactors Employee Central or Hybris Marketing integrations involving multiple OData calls), the error handling can get tricky or messy. Exceptions are not automatically propagated from a Local Integration Process to the main Integration Process, so exception sub-process steps need to be implemented in each Local Integration Process block, which can lead to a lot of duplication.
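A common pattern inside an Exception Subprocess is to capture the caught exception for later analysis. The sketch below relies on the Camel convention that the caught exception is stored in the CamelExceptionCaught exchange property; attachment name and handling are my own choices, and the script only runs inside a CPI tenant.

```groovy
import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // Camel stores the caught exception in this exchange property
    def ex = message.getProperties().get('CamelExceptionCaught')
    if (ex != null) {
        def messageLog = messageLogFactory.getMessageLog(message)
        // Attach the error details to the message log for easier analysis
        messageLog?.addAttachmentAsString('ErrorDetails', ex.toString(), 'text/plain')
    }
    return message
}
```

Since such a script has no dependency on a particular Local Integration Process, keeping it as one shared script resource at least reduces the duplication across the exception sub-process blocks, even if the steps themselves must still be modelled in each block.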

 

7. Design integration artifacts to cater for 2-tier landscape

By default, most CPI landscapes are 2-tier, with one non-production tenant and one production tenant. However, the systems connected to CPI are typically 3-tier (Dev, QA & Prod) for example an on-premise SAP system. Therefore the common approach is to use the non-production CPI tenant to integrate with both Dev and QA environments.

This is achieved by duplicating the integration artifacts in design under a different name (typically by adding a suffix like _QA) and deploying them with different configured parameters.

Ensure that the design of the integration artifacts supports such a deployment approach. Some common aspects to consider are:-

  • Configurable endpoint, URL, server, credentials

  • Value Mapping containing different Agency/Identifier combinations

  • Separate packages for Dev and QA artifacts

 

8. Implement version management – both in WebUI and Git

Version management is an important aspect of any development lifecycle. CPI provides basic version management functionality in WebUI, and it is also possible to download the artifacts and manage them externally via an SCM like Git. My recommendation is to use both.

WebUI

The functionality provided allows us to persist the entire state of the artifact at a point in time. It is recommended to use ‘Save as Version‘ whenever the artifact reaches a somewhat stable/working state. This allows us to make further changes with the assurance that we can revert back to a working version if required.

Furthermore, without version management in WebUI, the state of the artifact remains in ‘Draft’. Because an IFlow consists of many objects (IFlow model, schemas, scripts), it is difficult to compare the state of an IFlow across the different environments/tenants. If the copies of the IFlow in Dev, QA and Prod all have the ‘Draft’ status, this will require manual comparison of all objects within an IFlow to determine if there are any differences between them.

Personally, I use the version number to know whether the IFlow is the same between environments, without having to inspect each object individually. CPI uses a 3-digit numbering system similar to Semantic Versioning. My approach is slightly different: I also use the number to indicate the release state of the IFlow during the major stages of the implementation lifecycle. The following denotes the different stages:-

Version No   State
1.1.0        Initial release for System & Integration Testing (SIT)
1.2.0        Initial release for User Acceptance Testing (UAT)
2.0.0        Initial release for Production

The last digit is incremented by one for every fix/enhancement.

Note: Thank you to Piotr Radzki, who highlighted that packages with objects in ‘Draft’ cannot be exported. Since exporting is a prerequisite for transporting to the Production environment, this is all the more reason to implement a version numbering system that is easy to use and makes sense.

Git

As part of my workflow, I also use Git within an IDE to externally manage my CPI development. While this is mainly for the Groovy Scripts, which typically go through many changes during the implementation lifecycle, I also store all the other objects of an IFlow.

I have described this approach using the HCI Eclipse plugin (before it was deprecated); now that I have switched over to IntelliJ IDEA, the approach is still applicable using the VCS functionality provided by IntelliJ.

 

Conclusion

So there it is – some of the practices, approaches and guidelines that have worked particularly well for me when I work on CPI development. It is by no means a “best practices” document (yet), as the whole area of cloud integration is still changing at a rapid pace. As Vadim Klimov commented here, “the more we work on cloud integration, the more we find new creative ways to deal with them” (paraphrased).

I welcome any comment, opinion, and feedback on these, and feel free to even disagree with me if you have a different perspective on them.

 

 

Conversation about this post on LinkedIn.

  • Good Blog. We also follow most of the points you have mentioned.

    Though you mentioned for two-tier landscape append System/Phase name we follow that for artifacts/credentials deployed so that it will not create confusion.

    However, keep up the good work and keep sharing to this community.

    Regards

    Rajesh Pasupula

     

     

    • Hi Rajesh

       

      Thanks for your comment. For design artifacts (IFlow, Value mapping, package), I do not add the suffix to these objects for the Dev version. The reason is that the Dev version is the “golden” version which will be the source of transport to Production environment. Ideally IMHO design artifacts should be free from environment/phase indicators in their naming – it’s just that the limitation enforced by a 2-tier landscape forces an additional suffix for additional duplicate IFlows.

       

      For security artifacts (e.g. user credentials), I use the following naming convention:-

      <SystemName>_<AdapterType>_<SystemIdentifier>

      Examples:-

      External_SFTP_QA

      SF_SuccessFactors_C0000nnnnn

      I will update the post later with this additional point.

       

      Regards

      Eng Swee

  • That is a really impressive and useful collection of recommendations and a summary of practices to follow. A definite must read for a CPI developer or architect!

    While regular usage of most of those points shall introduce a noticeable positive effect very quickly, if not instantly, I feel some best practices you mentioned might not be that obvious at first sight, but are hidden gems, and if not followed, can make things go wrong and cause severe issues at later stages (e.g. in a production environment) – tip #3 “Use Reader when accessing message body in Groovy Script” is one of them. In the cloud environment, it is easier to miss a point of performance optimization of internal resources used by the iFlow – such as memory consumption of the iFlow at runtime when it is put under load. This might be partly related to the complexity or impossibility of obtaining some relevant information (to my knowledge, JVM resource monitoring and JVM profiling tools are not exposed to customers in a user-friendly way), and on the other hand, to a certain assumption about the elasticity of cloud resources. As a result, as you mentioned, problems with performance and resource consumption of custom scripts can be introduced when developing iFlows and remain unnoticed until certain conditions / load patterns are met, making such scripts act as “time bombs” within iFlows.

    This emphasizes once again the necessity of testing scripts from various aspects – not only functional, but also stability and performance. In the absence of customer exposure to JVM-level monitoring and profiling tools in CPI, local testing becomes even more critical. The approach you described in earlier blogs on how to set up an environment for local execution of Groovy scripts can be re-used for other test types. Unit testing (with Spock or any other alternative framework) is one great example you have already covered in detail – we can also put a script under test in a local environment to identify resource leaks or suboptimal usage (locally, we have tools to observe the behaviour of the Java runtime and collect necessary metrics). It is not ideal, as it doesn’t put the script in the wider context of the iFlow and doesn’t reproduce the JVM settings of a CPI tenant accurately, but if we can emulate representative input to the script, some issues like inefficient usage of Strings or collections shall be spotted there, and certain conclusions originating from local runtime observations can be extrapolated and applied to CPI.

    And another one that stands close to the above: in addition to the various unit and performance tests that a script can undergo in a local environment even before it gets embedded into the iFlow and reaches integration testing, code quality checks are something that can help avoid some common stability / maintainability issues. A few practices to mention that nicely complement unit testing: code coverage (especially for large scripts) – to ensure that unit tests cover critical execution paths, and static code analysis – to run automatic checks against a certain rule set (the outcome of those checks can help the developer optimize the script coding). To be more precise, static code analysis can come first, followed by unit testing and test coverage verification – all done on the basis of scripts developed locally. I used SonarQube for static code analysis and Clover for test coverage (works well with Spock), and they both worked nicely for Groovy developments, but I can imagine organizations might have their own preferences in the tooling that is out there in these areas. As a more advanced option, this could even be automated and embedded into a development pipeline following CI/CD principles, but that is really something for the next level of organization of the development cycle for CPI – not only about tools and techniques, but mindset and development infrastructure.

    The comment started as just a few sentences, but it looks like it evolved into a long one… In short: test early, test often.

    • Hi Vadim

       

      Thanks for your extensive insight into this. I’m surprised that point 3 really stood out for many readers, but yes, it is definitely an important one.

       

      Yes, the capability for local testing does open up a lot of possibilities for the developer to ensure that scripts are developed in an optimal and robust manner. This becomes especially important the more complex a script gets. As such, “testing early” matters because it allows us to catch bugs earlier – it is a lot costlier to fix bugs at a later stage of the development cycle, especially if they could have been identified much earlier on. Another way to see this: if a script causes issues in a local environment, these could potentially be amplified when it is executed in a tenant.

       

      I haven’t had the need to run coverage checks for my scripts as mine are not too complex, but it is definitely a possibility, and this could be an area for further expansion as CPI matures together with the developer ecosystem around it.

       

      Regards

      Eng Swee

  • Hi Eng

    Love that you are also just using a two-tier system landscape. We added the function to have a virtual QA system to import into in the Figaf IRT tool. We added it mostly because it was easier to run tests of our transports when we only have one CPI tenant ourselves.

    You can see a demo of the transport part here.

    https://youtu.be/tMW5NAOaXWM?t=521

    It will add new postfixes to packages and iflows, and to HTTP/SOAP sender endpoints, but this can easily be added for other adapter types too. I guess we should also add it to ProcessDirect for both sender/receiver.

    • Hi

      We did improve the virtual landscapes, so it now also supports the ProcessDirect adapter. All processing via ProcessDirect then happens in separate flows.

      (link removed by moderator)

      We also moved the packages to start with Z that way they will always be placed at the end of your list of packages.

  • Eng Swee Yeoh

    This is a great list. I’m sure I’ll end up referring to it many times.

    Do you also have advice on the following?

    1. iFlow naming conventions. Is <sender>_<receiver>_<description> good enough?
    2. Guidance on organizing iFlows in packages. Since transports happen at a package level, how do you decide to group iFlows into packages?
      For e.g.: we will end up with a large number of simple file-to-file scenarios (100+) with no mappings. These will span dozens of different systems. The interfaces will go live in waves over 12-18 months. What would be the best way to build/organize these? All in one package? Packages across domains (HR/MFG etc)? One iFlow per package (I hope not!)?
    • Hi Harsh

       

      Thanks for your comment, and raising such a pertinent question.

      Unfortunately, I do not have a general one-size-fits-all recommendation for the two areas you brought up (and therefore it was not included in this post).

      However, I’ll try to address your questions specifically:-

      1) Naming convention is a very subjective matter and varies from organisation to organisation. Personally, I’ve worked with SuccessFactors integration quite a bit, so I draw my approach from SAP’s way of naming the prepackaged content (with some minor modifications of my own):-

      <Interface_ID><SourceSystem> <Content/BusinessObject> <Action> to <TargetSystem>

      e.g. IF001 – ServiceNow Employee ID Creation to Employee Central

       

      2) First of all, IMHO using CPI for file-to-file integration is an overkill. It would be better to use MFT tools for such integrations.

      Anyway, this is a tough one due to the issue with package-level transport. I would definitely split it into multiple packages across different domains/modules (FI, HR, SD, etc) and possibly even at a more granular level, across an end-to-end business process (e.g. Procure to Pay). Further considerations are the number of developers working on the implementation and the number of interfaces being developed during each wave/phase. The idea is to have each package worked on by ideally one developer, or if not, by up to 2-3 developers, so as to minimize coordination effort during package-level transport.

      Hopefully at some point CPI will have the capability to transport at the artifact level (IFlow, value mapping), and then we would not have to organise packages around a transport limitation.

       

      Regards

      Eng Swee

       

  • Thanks Eng Swee for a very informative and insightful blog.

    Can you suggest the best way of doing transports from the CPI Non-Prod to Prod tenant? As of now, we are using export & import, but we are facing a challenge moving artifact changes with externalized parameters. These externalized parameters get overwritten by Dev parameter values, so it is a cumbersome exercise to re-enter them with Prod values.

     

    Please suggest.

    Kind Regards,

    Anurag

  • Hi Eng,

    Great blog, thank you for your contribution and willingness to share your knowledge with the community. We are currently configuring our cloud integration architecture and had a question on retaining documents. What best practice do you utilize for archiving payloads? The business typically prefers to have an archive of payload data available for review.

    Thanks.

    -Jon