Werner Dähn

S/4Hana to Kafka

More and more customers use Apache Kafka as their real-time data backbone. SAP Hana contains the current state of the business, and Apache Kafka the entire stream of changes since the beginning. This enables all data consumers to get ERP data without accessing the expensive S/4Hana system, which is a great cost-saving measure and opens new possibilities.

Because Kafka is so popular, most tools support it, e.g. SAP Data Intelligence, all Big Data tools, ETL tools, pretty much everything.

What customers are missing, though, is an easy way to get S/4Hana data into Kafka. The S/4HanaConnector for Kafka helps here (see GitHub and Docker Hub).

The usage of the S/4HanaConnector is simple:

  1. Pull it from Docker Hub and start the container (see the command sketch below)
  2. Open the Admin UI and create connections to the S/4Hana system and Kafka
  3. Select the objects to produce data for
  4. Assign the objects to one or more producer instances
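
For step 1, a minimal command sketch, assuming a local Docker installation. The image name is taken from the Docker Hub link further below; ports and environment settings depend on the concrete deployment and are not shown:

    docker pull rtdi/s4hanaconnector
    docker run -d --name s4hanaconnector rtdi/s4hanaconnector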

Everything else happens under the covers: the Kafka schema definitions are derived from the SAP structures, an initial load is performed at first start, and from then on changes are produced with a latency of seconds.
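
To illustrate the schema derivation, here is a sketch of what a derived schema for the sales order header table VBAK could look like, built with Avro's Java SchemaBuilder. The field selection and the audit column are assumptions for illustration, not the connector's actual output:

    import org.apache.avro.Schema;
    import org.apache.avro.SchemaBuilder;

    public class DerivedSchemaSketch {
        public static void main(String[] args) {
            // Hypothetical Avro schema derived from the VBAK (sales order header) structure.
            Schema vbak = SchemaBuilder.record("VBAK")
                    .namespace("s4hana.sales")
                    .fields()
                    .requiredString("VBELN")        // sales document number (key)
                    .optionalString("AUART")        // sales document type
                    .optionalString("ERDAT")        // creation date
                    .optionalString("KUNNR")        // sold-to party
                    .optionalDouble("NETWR")        // net value
                    .optionalString("WAERK")        // document currency
                    .optionalString("_CHANGE_TYPE") // hypothetical CDC audit column (I/U/D)
                    .endRecord();
            System.out.println(vbak.toString(true));
        }
    }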

With this connector it is a matter of minutes to get data into Kafka.

 

Under the covers much more is going on: everything needed for a complete data, system, and process integration solution. This includes metadata about the landscape, impact/lineage information, and the ability to map data when reading, e.g. to rename columns or to adjust records to an existing schema instead of using a source-specific one.
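
As an illustration of such a read-time mapping, here is a minimal sketch of a column rename, e.g. VBELN to SalesOrderId. This is a generic example of the concept, not the connector's actual mapping API:

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class MappingSketch {
        // Rename source columns to the names of an existing target schema;
        // columns without a mapping entry keep their original name.
        static Map<String, Object> rename(Map<String, Object> row, Map<String, String> nameMap) {
            Map<String, Object> out = new LinkedHashMap<>();
            row.forEach((k, v) -> out.put(nameMap.getOrDefault(k, k), v));
            return out;
        }

        public static void main(String[] args) {
            Map<String, Object> row = Map.of("VBELN", "0000012345", "NETWR", 99.5);
            System.out.println(rename(row, Map.of("VBELN", "SalesOrderId", "NETWR", "NetValue")));
        }
    }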

This post is part of a series, and the connector's full power is unlocked when combining it with the other components.

 

13 Comments
      Gregor Wolf

      Hi Werner,

      looking at the screenshot you provide and at the README.md on GitHub, I see that you're connecting to the SAP HANA database below the SAP S/4HANA system. Wouldn't it make more sense to use the SAP S/4HANA or SAP S/4HANA Cloud Business Events? Based on these events, the corresponding APIs can be used to read the business objects with all details. In your approach the relationship between VBAK and VBAP must be defined manually, whereas in API_SALES_ORDER_SRV there is a navigation attribute defined in the OData service.

      Best regards
      Gregor

      Werner Dähn
      Blog Post Author

      Gregor Wolf I would argue every method has its pros and cons. Therefore I would like to provide adapters for all methods:

      • Business Events
      • IDocs
      • BAPIs
      • CDS with "delta.changeDataCapture: automatic" and ODP

      The reason I started with a low-level implementation is that it provides access to 100% of the data with the least amount of work. But it is definitely just one of many options.

      To be more precise, the Business Events have the following downsides in my opinion:

      1. Initial load is not covered
      2. Performance impact on the ERP system
      3. Only a subset of the ERP data is provided via Business Events out of the box
      4. Adding customizing fields requires extra effort
      5. Not all fields are exposed

       

      Regarding the navigation attribute, my goal is to do that in Kafka: import the CDS definition and then assemble the nested object there. I call that service the Object Assembly. Same result, while relieving the ERP system of this expensive work.
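
      For illustration, a minimal Kafka Streams sketch of such an Object Assembly. The topic names, the key layout (a 10-character VBELN prefix on the item key) and the plain JSON string values are all assumptions, and it only covers inserts; updates and deletes would need additional logic:

          import org.apache.kafka.common.serialization.Serdes;
          import org.apache.kafka.streams.KafkaStreams;
          import org.apache.kafka.streams.StreamsBuilder;
          import org.apache.kafka.streams.StreamsConfig;
          import org.apache.kafka.streams.kstream.Consumed;
          import org.apache.kafka.streams.kstream.Grouped;
          import org.apache.kafka.streams.kstream.KTable;
          import org.apache.kafka.streams.kstream.Materialized;
          import org.apache.kafka.streams.kstream.Produced;
          import java.util.Properties;

          public class SalesOrderAssembly {
              public static void main(String[] args) {
                  Properties props = new Properties();
                  props.put(StreamsConfig.APPLICATION_ID_CONFIG, "sales-order-assembly");
                  props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
                  props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
                  props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

                  StreamsBuilder builder = new StreamsBuilder();

                  // Order headers, one record per VBELN (assumed topic name).
                  KTable<String, String> headers = builder.table(
                          "S4.VBAK", Consumed.with(Serdes.String(), Serdes.String()));

                  // Order items are assumed to be keyed by VBELN+POSNR; re-key by the
                  // 10-character VBELN prefix and collect each order's items into a JSON array.
                  KTable<String, String> itemsPerOrder = builder
                          .stream("S4.VBAP", Consumed.with(Serdes.String(), Serdes.String()))
                          .groupBy((key, value) -> key.substring(0, 10),
                                  Grouped.with(Serdes.String(), Serdes.String()))
                          .aggregate(
                                  () -> "[]",
                                  (vbeln, item, agg) -> agg.equals("[]")
                                          ? "[" + item + "]"
                                          : agg.substring(0, agg.length() - 1) + "," + item + "]",
                                  Materialized.with(Serdes.String(), Serdes.String()));

                  // Nest the item array into the header object and publish the result.
                  headers.join(itemsPerOrder, (header, items) ->
                                  header.substring(0, header.length() - 1) + ",\"ITEMS\":" + items + "}")
                          .toStream()
                          .to("S4.SalesOrder", Produced.with(Serdes.String(), Serdes.String()));

                  new KafkaStreams(builder.build(), props).start();
              }
          }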

       

      Makes sense?

      Gregor Wolf

      I like the approach using the CDS definitions to re-assemble the object.

      Sainath Kumar

      Excellent article

      We have tried this and it works well. The only question is how to implement CDC, since Kafka expects a timestamp field.

      Werner Dähn
      Blog Post Author

      I'm not getting your point. Yes, Kafka Connect requires a timestamp field and you must guarantee there are no deletes; therefore Kafka Connect cannot be used here.

      This solution does not need a timestamp column to identify changes, and it produces CDC data including the time the change was created, the transaction id and more.
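
      For illustration, a minimal Java consumer reading such a CDC topic. The topic name is an assumption, and the CDC columns (change type, transaction id) are assumed to live inside the record payload:

          import org.apache.kafka.clients.consumer.ConsumerConfig;
          import org.apache.kafka.clients.consumer.ConsumerRecord;
          import org.apache.kafka.clients.consumer.KafkaConsumer;
          import org.apache.kafka.common.serialization.StringDeserializer;
          import java.time.Duration;
          import java.util.List;
          import java.util.Properties;

          public class CdcTopicReader {
              public static void main(String[] args) {
                  Properties props = new Properties();
                  props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
                  props.put(ConsumerConfig.GROUP_ID_CONFIG, "cdc-reader");
                  props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
                  props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
                  props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
                  try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                      consumer.subscribe(List.of("S4.VBAK"));   // assumed topic name
                      while (true) {
                          for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofSeconds(1))) {
                              // The record timestamp plus the CDC columns in the payload
                              // describe each change; no source timestamp column is needed.
                              System.out.printf("ts=%d key=%s value=%s%n",
                                      rec.timestamp(), rec.key(), rec.value());
                          }
                      }
                  }
              }
          }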

      Wenjie He

      Hi expert,

      Is the connector open source or licensed?

      Werner Dähn
      Blog Post Author

      Wenjie He Both: pay-per-use. See GitHub.

      姜 小鹏

      Hi expert, what is the pricing model for the connector?

      Werner Dähn
      Blog Post Author

      Please send me an email.

      Vijayakumar Mukunthan

      Hi Werner

      I am very much interested in this. Where did you code this? I am working in S/4 data migration, where at times it is difficult to do the field mapping from legacy to S/4. I would like this type of field mapping, as manual mapping is painful. We are using SAP Data Services for the data load.

      Thanks and Regards

      Vijay

      Werner Dähn
      Blog Post Author

      https://hub.docker.com/r/rtdi/s4hanaconnector

      wei cui

      Hello Werner,

      I configured 'connect to kafka', but I get an error: Topic cannot be read.

      I did not configure SASL in Kafka, so I left the remaining form fields empty.

      How can I fix this?

      Thanks.

       


       

      Werner Dähn
      Blog Post Author

      Let's discuss that via GitHub issues, yes?