In this blog, we share our experience with file-based data load via SAP Cloud Platform Integration (HCI). We describe our test setup, present the lessons we learned, and give a few recommendations that will hopefully help users working with HCI for the first time. The blog does not cover every situation you can face in such an integration project; instead, it introduces our general approach to gaining a basic understanding of how the integration works.

Introduction

SAP Hybris Marketing offers many ways to integrate data from an external system. For instance, you can connect another SAP system like CRM, ERP or Cloud for Customer. Additionally, you can import data via an API or OData.

Another business scenario is to import data initially or regularly via a file using SAP Cloud Platform Integration services (HCI). This integration scenario allows you to create or update interactions, contacts, corporate accounts, or products and product categories in your SAP Hybris Marketing system. This is possible for both cloud and on-premise systems, although cloud systems are likely to be the most common case.

The new integration package named “SAP Hybris Marketing Cloud – file based data load” is available on the SAP API Business Hub. It runs on the SAP Cloud Platform (HCP) as an integration service tenant and connects to the SAP Hybris Marketing system via an OData service. The CSV files are loaded from an SFTP server to SAP Cloud Platform Integration.

The content is delivered as an iFlow. An iFlow is a graphical tool to configure integration scenarios. With an iFlow, you can immediately see the complete end-to-end integration without the need to drill down:

  • The sender and the receivers, and how many there are.
  • The interfaces used with the sender and receivers.
  • The adapters used with the sender and receivers.
  • The dynamic routing used.
  • The mappings used.

The following figure illustrates the integration flow for the object Interaction:

What happens in the iFlow for interactions?

The CSV files are stored on an SFTP server. HCI fetches the CSV files and processes them as follows:

  1. A script removes the header from the file.
  2. The CSV file is converted to XML format. With the help of the splitter, you can split the file into smaller packages.
  3. The mapping from XML to OData structure is performed.
  4. The split packages are sent via OData to SAP Hybris Marketing.
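
As a rough illustration of steps 1 and 2 (this is not the script shipped with the iFlow), the following Python sketch removes the header row, converts each CSV row to an XML element and splits the result into packages. The element names `Rows` and `Row`, the comma delimiter and the function name are assumptions made for this example; the default package size of 1000 matches the Splitter setting described later in this blog.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml_packages(csv_text, columns, package_size=1000, delimiter=","):
    """Drop the header row, convert rows to XML, and split into packages.

    The element names "Rows" and "Row" are hypothetical; the real
    structure is defined by the XSD delivered with the iFlow.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    next(reader, None)  # step 1: remove the header line
    rows = [row for row in reader if row]  # skip empty lines

    packages = []
    for start in range(0, len(rows), package_size):
        root = ET.Element("Rows")
        for row in rows[start:start + package_size]:
            item = ET.SubElement(root, "Row")
            # step 2: one child element per column, in mapping order
            for name, value in zip(columns, row):
                ET.SubElement(item, name).text = value
        packages.append(ET.tostring(root, encoding="unicode"))
    return packages
```

Each returned string corresponds to one package that would then be mapped and sent in its own OData request (steps 3 and 4).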

If an error occurs, an e-mail can be sent, so you do not have to monitor the process permanently.

 

Lessons Learned

Summary

Before going into details, we want to give you an overview of our findings and the resulting recommendations.

| Subject | Finding | Recommendation |
|---|---|---|
| Columns of the CSV Input File | All existing mapping fields/columns must be included in the CSV file. Even the order of the columns must be correct. | Use the attached CSV files to understand the structure. |
| Validation of the Data within the iFlow | Values in the CSV file are validated by HCI and can cause errors. | Check the document Mapping Details to understand the required format of the fields. |
| Structure of CSV Input File for Interactions | The staging area at SAP Hybris Marketing does not allow the import of multiple updates of the same contact within the same package. | In case of multiple interactions of the same contact, fill out the contact fields for only one interaction. Alternative: Leave the contact fields empty if you know that the contact already exists at SAP Hybris Marketing. |
| Error Handling / Processing of Big Files | In case of a validation error at HCI, the CSV file is not copied to the archive folder. After a specified time period (default: 5 minutes), the same CSV file is fetched and processed again by HCI. This means that all data is sent again to SAP Hybris Marketing. | For better error handling, avoid importing very large CSV files. |
| Log Level Configuration | When you deploy an iFlow, the log level is set to “Info” by default. | Change the log level to Debug at Monitor -> Monitor Integration Content. |
| Monitoring at SAP Hybris Marketing | Successful processing of a CSV file at HCI does not automatically mean a successful data import at SAP Hybris Marketing. | Once a CSV file has been processed successfully at HCI, also monitor the apps Import Monitor and Application Log at SAP Hybris Marketing. |
| Performance Optimization within an HCI tenant | It is possible to parallelize the process within an HCI tenant. | In the iFlow, at step Splitter, activate the checkbox Parallel Processing. |
| Performance Optimization with Several Nodes | You have the option to request additional resources. | Create a support ticket on component LOD-HCI-PI-OPS. |
| Number of Messages per Poll from SFTP Server | By default, the number of messages per poll from the SFTP server is 20. | In case of simultaneous load, reduce Max. Messages per Poll from 20 to 1 or 2. This change ensures that the files on the SFTP server are fetched by several nodes. |

More Details

For a detailed description of our test findings, see the following chapters.

Columns of the CSV Input File

The integration package “SAP Hybris Marketing Cloud – file based data load” comes with attached CSV sample files. These CSV files show which fields are imported.

In a first test, we used the sample file for contacts and copied it to our SFTP server. The import worked properly.

In a second test, we created our own CSV file for contacts. As our contact data samples had no company assigned, we removed the columns SAPERPConsumerAccountId, CompanyId and CustomerName. This time the iFlow raised an error stating that the XSD schema is incompatible with the CSV payload.

The reason for the error: the XML transformation expects all columns defined in the mapping, even if they are empty.

Recommendation:

Open the sample CSV file for the corresponding object to understand which columns/fields are required within the import file. You can find the sample CSV files on the tab Documents of the iFlow.

The mapping is also explained in the attached documents Set-Up Guide and Mapping Details.
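
A quick way to catch this problem before uploading is to compare the header of your own file with the header of the delivered sample file. The sketch below is only an illustration: the function name and the comma delimiter are assumptions, and the authoritative column list is the one in the sample CSV file and the Mapping Details document.

```python
import csv
import io


def check_header(sample_csv_text, own_csv_text, delimiter=","):
    """Compare the header of an import file with the sample file header.

    Returns a list of human-readable problems; an empty list means the
    header has the same columns in the same order as the sample.
    """
    sample_header = next(csv.reader(io.StringIO(sample_csv_text), delimiter=delimiter))
    own_header = next(csv.reader(io.StringIO(own_csv_text), delimiter=delimiter))

    problems = []
    missing = [col for col in sample_header if col not in own_header]
    extra = [col for col in own_header if col not in sample_header]
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")
    if not missing and not extra and own_header != sample_header:
        problems.append("column order differs from the sample file")
    return problems
```

For the second test described above, such a check would report SAPERPConsumerAccountId, CompanyId and CustomerName as missing. The fix is to keep these columns in the file and leave their values empty.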

Validation of the Data within the iFlow

In further tests, we imported files with a higher data volume – 100000 data sets in one file and almost 1 million data sets.
In customer projects, data from many different external systems is imported to the SAP Hybris Marketing system. In our test scenario, we did not have such an external system, so we generated our own sample data for the import with our own tool.

With the generated sample data, some issues occurred regarding the XSD validation within HCI. For example, a value of a field was too long, see figure below.

Another issue was the wrong format of the field Timestamp. In such a case, you must open the CSV file on the SFTP server and correct the data inconsistencies. The best way is to check the consistency of your data in the CSV import file before you copy the file to the SFTP server.

Recommendation:

Check the document Mapping Details to understand the required format of the fields.

Additionally, the rules with which the XML document must comply to be considered “valid” are defined in the XSD file. If you are unsure about the length or format of field values, you can find the rules in the iFlow under Resources.
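
Such a consistency check can be scripted before the file is copied to the SFTP server. The following sketch is only an assumption-laden example: the length limits in `MAX_LENGTHS` are hypothetical placeholders (the real limits are in the XSD under Resources), and the 14-digit timestamp format YYYYMMDDhhmmss is inferred from the sample rows shown later in this blog, so verify it against the Mapping Details document.

```python
from datetime import datetime

# Hypothetical limits for illustration only; take the real values from the XSD.
MAX_LENGTHS = {"FirstName": 40, "LastName": 40}


def validate_row(row, max_lengths=MAX_LENGTHS, timestamp_field="Timestamp"):
    """Return a list of problems found in one CSV row (a dict of column -> value)."""
    errors = []
    for field, limit in max_lengths.items():
        value = row.get(field, "")
        if len(value) > limit:
            errors.append(f"{field}: {len(value)} characters, limit is {limit}")

    timestamp = row.get(timestamp_field, "")
    if timestamp:
        try:
            # Assumed format: exactly 14 digits, YYYYMMDDhhmmss
            if len(timestamp) != 14 or not timestamp.isdigit():
                raise ValueError("not 14 digits")
            datetime.strptime(timestamp, "%Y%m%d%H%M%S")
        except ValueError:
            errors.append(f"{timestamp_field}: '{timestamp}' is not YYYYMMDDhhmmss")
    return errors
```

Running this over every row, for example with `csv.DictReader`, and printing the row number together with each problem makes it much easier to locate a single bad data set in a large file.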


Structure of CSV Input File for Interactions

As already mentioned, the content of the CSV import file must meet certain conditions to pass the built-in validation. For the interaction CSV file, you must also ensure that the structure of the data within the CSV file is valid.

In our test case, we imported the following CSV file for interactions:

ContactId ContactIdOrigin CommunicationMedium InteractionType Timestamp CampaignId InitiativeId InitiativeVersion Valuation Reason IsAnonymous Amount Currency Latitude Longitude SourceObjectType SourceObjectId SourceObjectAdditionalId SourceDataUrl ContentTitle ContentData FirstName LastName CustomerName EMailAddress PhoneNumber MobilePhoneNumber IsContact IsConsumer
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_VIEW 20170916220001 59369337 10702933 Torgard Tollefsen Alert Alarm Company TorgardTollefsen@rhyta.com TRUE
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_VIEW 20170916220002 59369337 10702933 Torgard Tollefsen Alert Alarm Company TorgardTollefsen@rhyta.com TRUE
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP PROD_REVIEW_VIEW 20170916220003 59369337 10702933 Torgard Tollefsen Alert Alarm Company TorgardTollefsen@rhyta.com TRUE
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP PROD_REVIEW_VIEW 20170916220033 59369337 10702933 Torgard Tollefsen Alert Alarm Company TorgardTollefsen@rhyta.com TRUE
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_VIEW 20170916220000 -19800431 -4464859 Tiago Correia Custom Lawn Care TiagoSilvaCorreia@gustr.com TRUE
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER ONLINE_SHOP PROD_REVIEW_VIEW 20170916220001 -19800431 -4464859 Tiago Correia Custom Lawn Care TiagoSilvaCorreia@gustr.com TRUE
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_ADD 20170916220031 -19800431 -4464859 Tiago Correia Custom Lawn Care TiagoSilvaCorreia@gustr.com TRUE

As a result, the Import Monitor app displays the following error:

The reason for this error:

At HCI, one row within the CSV file is mapped to a contact and to an interaction. This means that every row in the CSV file contains a contact and an interaction. Consequently, during the import to the SAP Hybris Marketing system, one contact and one interaction are imported for each row in the CSV file.

So after the mapping within HCI the following contact data sets will be imported to SAP Hybris Marketing:

ContactId ContactIdOrigin FirstName LastName CustomerName EMailAddress PhoneNumber MobilePhoneNumber IsContact IsConsumer
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER Torgard Tollefsen Alert Alarm Company TorgardTollefsen@rhyta.com TRUE
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER Torgard Tollefsen Alert Alarm Company TorgardTollefsen@rhyta.com TRUE
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER Torgard Tollefsen Alert Alarm Company TorgardTollefsen@rhyta.com TRUE
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER Torgard Tollefsen Alert Alarm Company TorgardTollefsen@rhyta.com TRUE
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER Tiago Correia Custom Lawn Care TiagoSilvaCorreia@gustr.com TRUE
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER Tiago Correia Custom Lawn Care TiagoSilvaCorreia@gustr.com TRUE
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER Tiago Correia Custom Lawn Care TiagoSilvaCorreia@gustr.com TRUE

 

The problem is that the staging area at SAP Hybris Marketing does not allow the import of multiple updates of the same contact within the same package.

Because of this finding, we recommend the following structure for the CSV import file, which prevents multiple updates of the same contact.

ContactId ContactIdOrigin CommunicationMedium InteractionType Timestamp CampaignId InitiativeId InitiativeVersion Valuation Reason IsAnonymous Amount Currency Latitude Longitude SourceObjectType SourceObjectId SourceObjectAdditionalId SourceDataUrl ContentTitle ContentData FirstName LastName CustomerName EMailAddress PhoneNumber MobilePhoneNumber IsContact IsConsumer
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_VIEW 20170916220001 59369337 10702933 Torgard Tollefsen Alert Alarm Company TorgardTollefsen@rhyta.com TRUE
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_VIEW 20170916220002 59369337 10702933
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP PROD_REVIEW_VIEW 20170916220003 59369337 10702933
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP PROD_REVIEW_VIEW 20170916220033 59369337 10702933
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_VIEW 20170916220000 -19800431 -4464859 Tiago Correia Custom Lawn Care TiagoSilvaCorreia@gustr.com TRUE
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER ONLINE_SHOP PROD_REVIEW_VIEW 20170916220001 -19800431 -4464859
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_ADD 20170916220031 -19800431 -4464859

 

If you are sure that the contacts for the interactions to be imported already exist, you can also leave the contact fields empty:

ContactId ContactIdOrigin CommunicationMedium InteractionType Timestamp CampaignId InitiativeId InitiativeVersion Valuation Reason IsAnonymous Amount Currency Latitude Longitude SourceObjectType SourceObjectId SourceObjectAdditionalId SourceDataUrl ContentTitle ContentData FirstName LastName CustomerName EMailAddress PhoneNumber MobilePhoneNumber IsContact IsConsumer
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_VIEW 20170916220001 59369337 10702933
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_VIEW 20170916220002 59369337 10702933
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP PROD_REVIEW_VIEW 20170916220003 59369337 10702933
TorgardTollefsen@rhyta.com SAP_HYBRIS_CONSUMER ONLINE_SHOP PROD_REVIEW_VIEW 20170916220033 59369337 10702933
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_VIEW 20170916220000 -19800431 -4464859
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER ONLINE_SHOP PROD_REVIEW_VIEW 20170916220001 -19800431 -4464859
TiagoSilvaCorreia@gustr.com SAP_HYBRIS_CONSUMER ONLINE_SHOP SHOP_ITEM_ADD 20170916220031 -19800431 -4464859

 

Recommendation:

Ensure that the CSV import file for interactions has the correct structure: fill in the contact fields for at most one interaction per contact.
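
If your source system always exports the full contact data with every interaction, the file can be cleaned up before the upload. The sketch below is one possible approach, not part of the delivered content: it keeps the contact fields only on the first row of each contact and blanks them on all later rows. The column names are taken from the header shown above; the function name and the choice to identify a contact by ContactId plus ContactIdOrigin are assumptions.

```python
# Contact columns of the interaction file (taken from the header shown above).
CONTACT_FIELDS = [
    "FirstName", "LastName", "CustomerName", "EMailAddress",
    "PhoneNumber", "MobilePhoneNumber", "IsContact", "IsConsumer",
]


def blank_repeated_contacts(rows):
    """Keep contact data only on the first interaction row of each contact.

    rows is a list of dicts (for example from csv.DictReader). A new list
    is returned; the input rows are left unchanged.
    """
    seen = set()
    result = []
    for row in rows:
        key = (row["ContactId"], row["ContactIdOrigin"])
        cleaned = dict(row)
        if key in seen:
            # Contact already sent with an earlier row: blank its contact fields.
            for field in CONTACT_FIELDS:
                if field in cleaned:
                    cleaned[field] = ""
        else:
            seen.add(key)
        result.append(cleaned)
    return result
```

The columns themselves stay in the file, so the header still matches the mapping; only the values are emptied.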

 

Error Handling / Processing of Big Files

As mentioned above, the iFlow includes a validation to ensure the consistency of the data in the file. However, there is no real error handling.

This means that, in case of a validation error at HCI, the complete CSV file is processed again and again until the issue is corrected in the source CSV file.

The regular, error-free flow is as follows:

  • You need to import a CSV file to SAP Hybris Marketing and copy the file to the SFTP server.
  • After approx. five minutes, depending on the configuration, the file is fetched by HCI and processed there.
  • If you don’t experience any errors within HCI, the data is imported into SAP Hybris Marketing.
  • Then the CSV file is copied to the archive folder at HCI.

In case of a validation error at HCI, the CSV file is not copied to the archive folder. After some time, the same CSV file is fetched by HCI and processed again. This means that all the data is sent to SAP Hybris Marketing again.

Imagine you have a CSV file consisting of 1 million data sets and only one data set has an error. In this case, the complete end-to-end flow is repeated until the error is corrected within the CSV file.

In the CSV file, every data set has a timestamp. With the help of this timestamp, SAP Hybris Marketing can recognize that the received data is obsolete and does not import it again.

However, when you copy several big CSV files with errors to the SFTP server, the repeated processing can cause a high workload at SAP Hybris Marketing, possibly resulting in bad system performance for end users.

Recommendation:

For better error handling, avoid importing very large CSV files. In addition, it is easier to find and correct an error in a rather “small” CSV file. Regarding overall performance, we could not identify a significant difference between “small” and “big” CSV files.
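
Splitting a large export into smaller files is easy to script. The following sketch assumes the file has already been read into a list of lines with the header as the first line; the function name and the chunk size are illustrative choices, not values recommended by SAP. The header is repeated in every chunk because the iFlow removes the first line of each file.

```python
def split_csv_lines(lines, rows_per_file):
    """Split CSV lines into chunks, repeating the header in every chunk.

    lines[0] must be the header line. Returns a list of line lists, each
    of which can be written to its own file on the SFTP server.
    """
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    if not lines:
        return []
    header, data = lines[0], lines[1:]
    return [
        [header] + data[start:start + rows_per_file]
        for start in range(0, len(data), rows_per_file)
    ]
```

With smaller files, a single faulty data set causes only its own chunk to be fetched and resent, instead of the complete export.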

 

Log Level Configuration

When you deploy an iFlow, the log level is by default set to “Info”.

Recommendation:

Change the log level to Debug at Monitor -> Monitor Integration Content

 

Monitoring at SAP Hybris Marketing

A successful processing of a CSV file at HCI does not automatically include a successful data import at SAP Hybris Marketing.

Successful processing at HCI only means that the data could be passed to the staging area (new with release 1708) of SAP Hybris Marketing. The staging area is a kind of inbound queue, where the imported data is stored and then further processed by the application.

In the staging area, some system-dependent checks are performed. For instance, when you import a contact with an unknown facet type, the contact is blocked in the staging area. This means that you can find the contact with an error in the application Import Monitor, but not in the Contacts application.

When the validation checks of the staging area are passed, the data is handed over to the application, where the import takes place. If an error occurs during the import of an object, for example because the format of a contact's phone number is incorrect, you can find the error message in the Application Log app.

Only if both the staging area checks and the data import are successful can you find the object in the corresponding application at SAP Hybris Marketing.

Recommendation:

Once a CSV file has been processed successfully at HCI, also monitor the apps Import Monitor and Application Log at SAP Hybris Marketing.

 

Performance Measurements

We carried out some performance measurements. To do so, we prepared CSV files with 10000 and 100000 data sets for contacts and interactions.

 

iFlow Contacts:

For a CSV file with 100000 data sets, it took 10 minutes on average to process all data sets within HCI (excluding the transfer time from SFTP to HCI, which strongly depends on network bandwidth).

We observed a short processing time for the step “CSV to XML Convert”: approx. 14 seconds. Most of the time is consumed by the OData requests (sending a data package to SAP Hybris Marketing and waiting for the response before the next package can be sent from HCI to SAP Hybris Marketing).

For CSV files with 10000 data sets, we observed roughly the same processing time per data set as for CSV files containing 100000 records; in other words, the processing time scaled approximately linearly with the file size.

Summary:

For a contact we observed an average processing time of 6 milliseconds per data set within HCI.

 

iFlow Interaction:

The performance of the Interaction iFlow is not as good as that of the Contact iFlow, because the Interaction iFlow triggers two OData calls: one for interactions and one for contacts. For example, when you have a CSV file with 100000 data sets, 100000 interactions and 100000 contacts are in fact transferred to SAP Hybris Marketing.

Summary: For an interaction we observed an average processing time of 13 milliseconds per data set within HCI.

 

Parallelization

Within an HCI Tenant

It is possible to parallelize the process within an HCI tenant. At the step “Splitter”, the content is split into packages with 1000 entries per package. The packages are then sent to SAP Hybris Marketing via OData call. This process can be parallelized by activating the checkbox Parallel Processing at the Splitter step.

Without parallelization, we observed that the Contact iFlow takes an average processing time of 6 milliseconds per data set within HCI.

With parallelization, the processing time decreased by a factor of 4: the average processing time was 1.5 milliseconds per data set.
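
The figures quoted in this section are consistent with each other, as the following small calculation shows. It uses only the averages reported above; the function names are introduced here for illustration, and the projected run times are simple extrapolations that assume the linear behaviour we observed, not additional measurements.

```python
def per_record_ms(total_seconds, records):
    """Average processing time per data set in milliseconds."""
    return total_seconds * 1000 / records


def total_minutes(ms_per_record, records):
    """Projected total processing time in minutes, assuming linear scaling."""
    return ms_per_record * records / 1000 / 60


# Contact iFlow: 100000 data sets in 10 minutes (600 seconds)
contact_ms = per_record_ms(600, 100_000)            # 6.0 ms per data set

# Parallel Processing at the Splitter: 1.5 ms per data set
speedup = contact_ms / 1.5                           # factor 4.0
parallel_minutes = total_minutes(1.5, 100_000)       # 2.5 minutes

# Interaction iFlow: 13 ms per data set
interaction_minutes = total_minutes(13, 100_000)     # about 21.7 minutes

print(contact_ms, speedup, parallel_minutes, round(interaction_minutes, 1))
```

In other words, a contact file with 100000 data sets that takes about 10 minutes without parallelization should take roughly 2.5 minutes with it, while an interaction file of the same size takes a little under 22 minutes without parallelization.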

 

With Several Nodes

If you want to execute a simultaneous load with several nodes, you have the option to request additional resources via a support ticket on component LOD-HCI-PI-OPS.

We did not test with several nodes.

Recommendation:

In case of simultaneous load, reduce Max. Messages per Poll from 20 to 1 or 2. This change ensures that the files on the SFTP server are fetched by several nodes.

To change the value, choose the button Configure in the iFlow.
