CPI Large Volume Migration: OData API. Robust audit & error reporting framework.

Introduction

This page describes the CPI migration integration flow for customers, which was implemented to migrate customer data from Siebel systems to SAP Marketing Cloud.

The Contacts IFlow of the standard SAP CPI package “SAP Marketing Cloud – File-Based Data Load” has been enhanced and customized according to the client requirements.

https://api.sap.com/package/SAPHybrisMarketingCloudfilebaseddataload?section=Artifacts

Part 2 of the blog describes the migration flow of interactions for loading many millions of records into SAP Marketing Cloud. The example here uses SAP Marketing Cloud, but the same approach and architecture can be used to migrate large volumes into any S/4HANA or C/4HANA system.

Limitations of Standard CPI Marketing Pre-Packaged Content

While the standard pre-packaged integration content provides a good starting point for migrating customer data from legacy systems, the standard CPI IFLOW has the following limitations.

  • The error handling delivered in the standard integration triggers alerts via e-mail, which is a poor fit for complex large-volume migration projects unless someone wants to flood their inbox.
  • There is no mechanism for full audit logging of which packets failed and which packets were sent successfully to Marketing Cloud for each file. Imagine a scenario where we need to migrate 20 million customers that are split into 200 files of 100k records, each file is split again into packets of 1,000 records per OData call to optimize performance, and the client wants full traceability back to the extract files to understand which records failed and which were processed successfully.
  • There is no standard CPI mechanism to automatically move failed files to an error folder. Because of that limitation, the failed files are reprocessed in the next run of the CPI IFLOW, resulting in many failures in CPI.
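
To make the volumes in this scenario concrete, the short Python sketch below works through the arithmetic. The figures are the ones quoted above; the variable names are purely illustrative.

```python
# Volumes taken from the scenario above; the names are illustrative only.
TOTAL_RECORDS = 20_000_000
RECORDS_PER_FILE = 100_000
RECORDS_PER_PACKET = 1_000

files = TOTAL_RECORDS // RECORDS_PER_FILE                  # files to process
packets_per_file = RECORDS_PER_FILE // RECORDS_PER_PACKET  # OData calls per file
total_packets = files * packets_per_file                   # OData calls overall

print(files, packets_per_file, total_packets)  # 200 100 20000
```

With roughly 20,000 OData calls to account for, one e-mail alert per failure is clearly not a workable audit trail.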

Hence we enhanced the standard content to address the above limitations and enable end-to-end logging, which tells us which 1,000-record packet of the splitter step was processed successfully and which packet failed. In this blog the logs are written to an AWS SFTP server; a future blog will cover how to write audit logs into HANA tables. Watch this space for audit logging using HANA tables!

High Level Architecture

The following is the high-level architecture for migrating data from Siebel systems into Marketing Cloud using SAP CPI DS. In a large-volume migration it is very important to have a staging area for profiling data, identifying data issues easily, reprocessing just the error packets, and reconciling the data between SAP Cloud and the source systems. We used SAP Cloud HANA as the staging area because the client is already licensed for it. However, you can use any DB as long as it is supported by CPI DS.

The diagram was exported using the Lucidchart online service.
  1. Data is extracted from Siebel and placed in the AWS SFTP Extract File folders for Contacts; each file contains 1 million records.
  2. CPI DS reads the files of 1 million records from SFTP and maps the data.
  3. CPI DS saves the files of 1 million records into the HANA DB.
  4. CPI DS reads the data from the HANA DB.
  5. CPI DS exports the data from the HANA DB into the AWS SFTP BeforeSplit folder.
  6. CPI PI reads each file of 1 million records from SFTP, splits it into files of 100k records and saves them under the AfterSplit folder.
  7. CPI PI reads the 100k files and loads the data into C4M using the OData API, in packets of 1,000 records per OData API call.
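
The two-level split in steps 6 and 7 can be sketched in plain Python as follows. This is only an illustration of the chunking logic, not the CPI implementation; the function names `chunks` and `plan_packets` are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def chunks(records: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield list(records[start:start + size])


def plan_packets(
    records: Sequence[str],
    file_size: int = 100_000,
    packet_size: int = 1_000,
) -> Iterator[Tuple[int, int, List[str]]]:
    """Yield (file_index, packet_index, packet) for the two-level split."""
    # Step 6: split the large extract into smaller files.
    for file_index, file_chunk in enumerate(chunks(records, file_size)):
        # Step 7: split each smaller file into packets, one per OData call.
        for packet_index, packet in enumerate(chunks(file_chunk, packet_size)):
            yield file_index, packet_index, packet


# Small numbers so the result is easy to inspect.
demo = [str(i) for i in range(25)]
for file_index, packet_index, packet in plan_packets(demo, file_size=10, packet_size=4):
    print(file_index, packet_index, packet)
```

Keeping both indices is what later makes it possible to trace a failed packet back to a specific file and record range.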

CPI Flow Description

Z.Contacts.Data.Load – With this integration flow you are able to load contacts via SFTP or HTTPS to your SAP Marketing Cloud system.

The IFLOW consists of a Main Process, Local Sub-Processes and an Exception Process.

CPI Flow Steps

SFTP Integration Process:

CPI Migration IFLOW Steps:

SENDER: SFTP – Trigger of the IFLOW. The step reads the SFTP directory and initiates the process if the source directory contains files matching the conditions.

After a file has been processed, it is moved to either the Archive directory or the Error directory (if there was an error during the load).

The Scheduler section also allows you to set how frequently the IFLOW is initiated.

Map and Send (Process Call) – Calls the Local Sub-Process “Map and Send”.

Check In Error (Content Modifier) – Sets an exchange property from the local variable to determine whether there was an error during the flow.

Has Failed? (Router) – Based on the error flag, the original file is saved under the Error or the Archive folder.

In Error (Content Modifier) – Replaces the standard archiveDirectory property with the error directory if there was an error. The archiveDirectory property tells the flow where to move the file after it has been completely processed.

Clean Variable (Write Variables) – Clears the variable so that it does not affect the next run.
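
The post-processing steps above amount to a small piece of routing logic. The Python sketch below mimics it with plain dictionaries standing in for CPI local variables and exchange properties. Only archiveDirectory is a name from the flow; `inError`, `errorDirectory`, the "true"/"false" values and the folder paths are assumptions made for the illustration.

```python
def post_process(variables: dict, properties: dict) -> dict:
    """Mimic the steps that decide where the original file is moved."""
    # Check In Error: copy the stored local variable into an exchange property.
    properties["inError"] = variables.get("inError", "false")
    # Has Failed? / In Error: on error, point archiveDirectory at the error folder.
    if properties["inError"] == "true":
        properties["archiveDirectory"] = properties["errorDirectory"]
    # Clean Variable: remove the variable so the next run starts clean.
    variables.pop("inError", None)
    return properties


variables = {"inError": "true"}
properties = {
    "archiveDirectory": "/Contacts/Archive",  # hypothetical path
    "errorDirectory": "/Contacts/Error",      # hypothetical path
}
print(post_process(variables, properties)["archiveDirectory"])
```

Clearing the variable at the end matters because a local variable outlives a single message; a stale error flag would send the next, healthy file to the Error folder.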

 

“Map and Send” and “Exception SubProcess” processes.

This subprocess carries the main logic of the IFLOW: setting properties, validating the CSV and XML structure, splitting the message into packets, transforming the message using XSLT mapping, sending packets to Marketing and handling any errors, as well as posting logs into the corresponding folders on the AWS SFTP server.

Content Modifier (Content Modifier) – Sets properties for the flow. It saves the original file name (used in the logs to tell which file was successful and which was not) and sets the default archiveDirectory so that the file is saved under Archive by default; this property can be overwritten in case of error.

CSV to XML Converter (CSV To XML Converter) – Converts CSV to XML based on Contacts_2.xsd. The field separator is set to “|”.
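
As a rough illustration of what this conversion step produces, the sketch below turns pipe-separated text into XML in Python. It is not the CPI converter itself. The element names CSV_Contacts and Contact are taken from the splitter XPath expression used in this flow, while the column names `Id` and `FirstName` are made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text: str, separator: str = "|") -> str:
    """Convert delimited text with a header row into a CSV_Contacts document."""
    # Assumes every row has the same number of columns as the header.
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=separator)
    root = ET.Element("CSV_Contacts")
    for row in reader:
        contact = ET.SubElement(root, "Contact")
        for name, value in row.items():
            ET.SubElement(contact, name).text = value
    return ET.tostring(root, encoding="unicode")


sample = "Id|FirstName\n1|Ann\n2|Bob\n"  # hypothetical columns
print(csv_to_xml(sample))
```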

XML Validator (XML Validator) – Validates the XML against TBS_Contacts_2.xsd. The validator checks field names and types, preventing the flow from posting invalid data.

 

General Splitter (General Splitter) – Splits the message into several packets (the packet size is set by the grouping parameter) using the XPath expression /CSV_Contacts/Contact. Parallel processing is on.
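
The grouping behaviour of the splitter can be approximated outside CPI as shown in the Python sketch below. It is an illustration under stated assumptions, not the splitter's implementation: the function name `split_contacts` is hypothetical, and each packet is assumed to keep the original root element around its group of Contact nodes.

```python
import xml.etree.ElementTree as ET
from typing import List


def split_contacts(xml_text: str, group_size: int = 1000) -> List[str]:
    """Split a CSV_Contacts document into packets of `group_size` contacts."""
    root = ET.fromstring(xml_text)
    # Relative to the root element, this selects /CSV_Contacts/Contact.
    contacts = root.findall("Contact")
    packets = []
    for start in range(0, len(contacts), group_size):
        packet_root = ET.Element(root.tag)
        packet_root.extend(contacts[start:start + group_size])
        packets.append(ET.tostring(packet_root, encoding="unicode"))
    return packets


doc = "<CSV_Contacts>" + "".join(
    f"<Contact><Id>{i}</Id></Contact>" for i in range(5)
) + "</CSV_Contacts>"
for packet in split_contacts(doc, group_size=2):
    print(packet)
```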

 

Mapping (XSLT Mapping) – This step maps the CSV file structure into Marketing OData format.

Standard content link https://api.sap.com/package/SAPHybrisMarketingCloudfilebaseddataload?section=Artifacts

Call OData (OData) – Sends Contacts to Marketing via the standard API (ContactOriginData method) using the POST method. The OData call uses the batch parameter with the multipart/mixed content type. The OData call sends the packets to Marketing, which in turn saves each request into its staging area.
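
In CPI the OData adapter assembles the batch request itself, so nothing like the following has to be written in the IFLOW. The Python sketch is only meant to show what a multipart/mixed OData $batch body with one change set looks like on the wire. The relative path `ContactOriginData`, the JSON payload and the field name `Id` are assumptions for the illustration and are not taken from the Marketing Cloud API definition.

```python
import json
import uuid
from typing import Dict, List, Tuple


def build_batch_body(
    contacts: List[Dict[str, str]],
    entity_set: str = "ContactOriginData",  # assumed relative path
) -> Tuple[str, str]:
    """Return (Content-Type header value, body) for a single-changeset $batch."""
    batch = f"batch_{uuid.uuid4().hex}"
    changeset = f"changeset_{uuid.uuid4().hex}"
    lines = [
        f"--{batch}",
        f"Content-Type: multipart/mixed; boundary={changeset}",
        "",
    ]
    for contact in contacts:
        # Each record becomes one HTTP request embedded in the change set.
        lines += [
            f"--{changeset}",
            "Content-Type: application/http",
            "Content-Transfer-Encoding: binary",
            "",
            f"POST {entity_set} HTTP/1.1",
            "Content-Type: application/json",
            "",
            json.dumps(contact),
        ]
    lines += [f"--{changeset}--", f"--{batch}--", ""]
    return f"multipart/mixed; boundary={batch}", "\r\n".join(lines)


content_type, body = build_batch_body([{"Id": "1"}, {"Id": "2"}])
print(content_type)
print(body)
```

Bundling a whole packet into one batch request is what keeps the number of HTTP round trips down to one per packet.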

Check OData Response (Router) – Checks the status of the OData call response. The OData call is expected to return a status code as part of the response.
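
The kind of check this router performs can be sketched in Python as below. The sketch assumes the response is a multipart batch response whose parts each carry an HTTP status line, and it treats any status of 400 or above (or a response with no status line at all) as a failed packet. The function name `packet_failed` and this exact rule are assumptions, not the routing condition used in the IFLOW.

```python
import re

# Matches embedded status lines such as "HTTP/1.1 201 Created".
STATUS_LINE = re.compile(r"^HTTP/1\.1 (\d{3})", re.MULTILINE)


def packet_failed(batch_response: str) -> bool:
    """Return True if any embedded response reports an error status."""
    codes = [int(code) for code in STATUS_LINE.findall(batch_response)]
    # A response without any status line is treated as a failure.
    return not codes or any(code >= 400 for code in codes)


ok = "--b\r\nContent-Type: application/http\r\n\r\nHTTP/1.1 201 Created\r\n\r\n--b--"
bad = "--b\r\nContent-Type: application/http\r\n\r\nHTTP/1.1 400 Bad Request\r\n\r\n--b--"
print(packet_failed(ok), packet_failed(bad))  # False True
```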

Set Props If Error (Content Modifier) – Overwrites CamelFileName in case of error. The log file name will contain “Failure” if there was a negative response from the OData call. The step also sets the errorFlag property, which is used to decide whether to save the error into the CSV log.

Write Variables 1 (Write Variables) – Saves a local variable in case of error. The variable is used in post-processing to decide where to save the file: under Archive in case of success, or under Error in case of a failure during processing.

Set Props If Success (Content Modifier) – Overwrites CamelFileName in case of success. The log file name will contain “Success” if there was no error in the response from the OData call.
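
A possible way to compose such a log file name is sketched below in Python. The flow only states that logs go into a folder named after the source file and that the name contains the outcome, the packet number and the CPI message ID; the exact layout, the separator and the `Logs` root folder used here are hypothetical.

```python
def log_file_name(
    source_file: str,
    outcome: str,
    packet_index: int,
    message_id: str,
    logs_root: str = "Logs",  # hypothetical folder name
) -> str:
    """Build a value for CamelFileName for one packet's response log."""
    if outcome not in {"Success", "Failure", "Exception"}:
        raise ValueError(f"unexpected outcome: {outcome}")
    # One folder per source file, one log file per packet.
    return f"{logs_root}/{source_file}/{outcome}_{packet_index}_{message_id}"


# The file name and message ID are placeholders for the example.
print(log_file_name("contacts_01.csv", "Success", 0, "AbCdEf123"))
```

Putting the packet number into the file name means the outcome of every packet can be read straight from a directory listing.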

Send to SFTP Success (Process Call) – Calls the sub-process “Send To FTP”.

Exception SubProcess:

Exception (Content Modifier) – Overwrites CamelFileName in case of an exception. The log file name will contain “Exception” if there was a negative response from the OData call.

Write Variables 2 (Write Variables) – Sets the local variable used later to move the original file into the Error folder.

Send to SFTP Exception (Process Call) – Calls the sub-process “Send To FTP”.

Custom SFTP-Based Logging and Exceptions Log – sub-processes that append to the custom CSV log stored on SFTP.

Send To FTP (Send) – Creates logs of the OData call responses on SFTP. The CamelFileName property tells SFTP where to store the files.

Router 2 (Router) – Checks whether the failed packet should be written into the exception log.

Write Exception Log (Process Call) – Calls the “Exceptions Log” sub-process.

Generate Line (Content Modifier) – Generates the body to append to the log, containing information about the failed packet and the error message (if applicable).

Append Error Log (Send) – Appends the line generated in the previous step to the CSV log.
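
The line produced by Generate Line could look like the output of the Python sketch below. The four fields (timestamp, original file name, number of the failed packet, error message text) are the ones the error log contains; the comma separator, the ISO timestamp format and the function name `error_log_line` are assumptions.

```python
import csv
import io
from datetime import datetime, timezone
from typing import Optional


def error_log_line(
    file_name: str,
    packet_index: int,
    error_message: str,
    now: Optional[datetime] = None,
) -> str:
    """Format one CSV line describing a failed packet."""
    now = now or datetime.now(timezone.utc)
    # Collapse line breaks so that each failure stays on a single log line.
    message = " ".join(error_message.split())
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(
        [now.isoformat(timespec="seconds"), file_name, packet_index, message]
    )
    return buffer.getvalue()


# The file name and message are placeholders for the example.
print(error_log_line("contacts_01.csv", 7, "Bad Request"), end="")
```

Using a CSV writer rather than string concatenation keeps the log parseable even when the error text itself contains the separator.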

Testing (Positive)

Once the source CSV file(s) are ready (the column count matches, the names match the XSD definition and the values satisfy the XSD), we can proceed with the testing.

Please make sure the files are available under the source directory folder on the SFTP server.

Step 1 – Put the file into the source directory under the SFTP folder.

Step 2 – Set File Name to match the file in the SFTP folder, set the Scheduler and deploy the flow.

Once the iflow is deployed and the file is in the specified directory, it will start processing at the scheduled time.

Step 3 – Go to “Monitor Message Processing” and make sure the iflow has been processed. Check the log.

Check the SFTP folder. If the file has been processed without errors (no 1,000-record packet failed), it should have been moved to the Archive directory.

Check the Logs folder. All response messages in this case were successful. A folder with the same name as the source file is created under the log folder, and the OData response for each 1,000-record packet is placed in it. Success on packet 0 indicates that records 1 to 1000 in the source file were processed successfully, and the alphanumeric ID after “success” is the ID of the CPI message that processed the source file. This audit log is very useful when tracing errors or reconciling source system files that contain millions of records, which are split into several 100k files and then further split into packets of 1,000 or 2,000 records in SAP CPI to optimise SAP Cloud OData performance.
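
The mapping from a packet number in the log to record numbers in the source file, as described above, can be written down in a couple of lines of Python. The function name `record_range` is illustrative; the packet size must match the grouping used in the splitter.

```python
from typing import Tuple


def record_range(packet_index: int, packet_size: int = 1000) -> Tuple[int, int]:
    """Return the 1-based (first, last) record numbers covered by a packet."""
    first = packet_index * packet_size + 1
    last = (packet_index + 1) * packet_size
    return first, last


print(record_range(0))   # (1, 1000)
print(record_range(99))  # (99001, 100000)
```

For the last packet of a file that is not an exact multiple of the packet size, the upper bound should be capped at the number of records in the file.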

Testing (Negative)

Once the source CSV file(s) are ready (the column count matches, the names match the XSD definition and the values satisfy the XSD), we can proceed with the testing.

Please make sure the files are available under the source directory folder on the SFTP server.

Step 1 – Put the file into the source directory under the SFTP folder.

Step 2 – Set File Name to match the file in the SFTP folder, set the Scheduler and deploy the flow.

Once the iflow is deployed and the file is in the specified directory, it will start processing at the scheduled time.

Step 3 – Check the SFTP folder. If the file has been processed with errors (at least one packet has failed), the original file should have been moved to the Error directory.

Check the Logs folder. A few packets failed with an OData Bad Request error due to data issues. A folder with the same name as the source file is created under the log folder, and the OData response for each 1,000-record packet is placed in it, as shown below. Failure on packet 0 indicates that records 1 to 1000 in the source file failed, and the alphanumeric ID after “failure” is the ID of the CPI message that processed the source file. This audit log is very useful when tracing errors or reconciling source system files that contain millions of records, which are split into several 100k files and then further split into packets of 1,000 or 2,000 records in SAP CPI to optimise SAP Cloud OData performance.

Step 4 – In addition, a custom error log file is written into the error folder so that you can review which packets failed in which file.

The error log file contains the timestamp, the original file name, the number of the failed packet and the error message text, as shown below.

 

Conclusion

This blog post demonstrates how to adjust the standard integration content for loading large data volumes, optimize performance and make the troubleshooting process more transparent.
