Skip to Content
Technical Articles
Author's profile photo Yuliya Reich

SAP Data Intelligence | How to get headers of CDS view

In my previous post I promised you to show the combination of ABAP CDS Reader and Python operator. Here we are. You can use the approach below for SLT and ODP operators too, because the JSON output of operators is the same. In this blog post we are going to extract a standard CDS view from SAP S/4HANA Cloud into SAP Data Intelligence Data Lake. And what the most important is – we want to see headers in the file. This topic is more complicated, than the first one, but I’m sure you will get it 😊 Why I picked this topic? Firstly, I know that some colleagues encountered this problem. Secondly, there isn’t a standard solution for it at the moment. Moreover, this approach will help you to select columns or make some data preparation if it needed. So, let’s go!

For the CDS view extraction from SAP S/4HANA Cloud we are going to use ABAP CDS Reader operator in DI. It is a standard operator using to read data from a CDS view. There is a standard template “CDS View to File” using for data extraction from a CDS view into a csv file (located in the Data Lake in DI):

Problem description:

However, using this pipeline you will not see any headers in the file. It’s bad…very bad ☹

 

There are three versions of ABAP CDS Reader operator. The main difference here is the output data type. But independent of version you are using, the headers are still not visible in the file. And this problem we are going to solve in this post.

Solution:

Among these three versions only ABAP CDS Reader V2 has an information about fields in its output. So, we are going to use this version. The output data type of this operator is message. A message has a body and attributes. The body of the output deliveries data, and attribute some information about fields, last batch etc (see image below). Because there is not any standard solution to get headers, we must use a custom operator. In this post I’m going to use Custom Python operator. If you don’t know what it is, check my previous blog post.

Below you will see the example how attributes look like (I_COSTCENTERTEXT view):

I’m going to read headers from ‘ABAP’ -> ‘Fields’ -> ‘Name’. The pipeline looks like:

I’m not going to go through back end setup, we will concentrate us on Data Intelligence. Let’s create our Python operator:

1.Step – Create input and output ports for Python operator

Input port: name – inData, type – message

Output port: name – outString, type – string

2.Step – Add some python magic into the “Script”

from io import StringIO
import csv
import pandas as pd
import json
def on_input(inData):
    # read body
    data = StringIO(inData.body) 
    # read attributes
    var = json.dumps(inData.attributes) 
    result = json.loads(var)
    # from here we start json parsing 
    ABAP = result['ABAP']
    Fields = ABAP['Fields']
    # creating an empty list for headers
    columns = [] 
    for item in Fields:
        columns.append(item['Name'])   
    # data mapping using headers & saving into pandas dataframe
    df = pd.read_csv(data, index_col = False, names = columns) 
    # here you can prepare your data,
    # e.g. columns selection or records filtering
    df_csv = df.to_csv(index = False, header = True)
    api.send('outString', df_csv)
api.set_port_callback('inData', on_input)

 

Note, that you should take care of data types using pandas in python. I don’t consider it in the code. Moreover, don’t forget about last batch if you need it in your use case. In addition, be aware that the “on_input” function will be executed by each data package. So, this approach is appropriate for replication/delta mode (because in the initial load json of attributes looks different after the last package, parsing cannot be done anymore, and the pipeline will fail after the last package). But this code can be adapted for initial load with only a couple of code lines. Take it as an exercise 😊  You can use this approach, if the CDS’s structure will be changed. And, of course, this code can be optimized.

3.Step – Upload a cool icon for your python operator and save it.

Configure “ABAP CDS Reader V2” and “Write to File” operators, save and run your pipeline. You are almost done 😊

Your output with the headers:

 

 

Congratulations! As of today, you don’t have any troubles with CDS’s headers 😊  If you have any questions or ideas for the next blog post, feel free to let me know in the comments. Stay tuned!

How to connect SAP S/4HANA with SAP Data Intelligence:

Integrate SAP Data Intelligence and S/4HANA Cloud to utilise CDS Views – Part 1

How To Integrate SAP Data Intelligence and SAP S/4HANA AnyPremise

Assigned tags

      4 Comments
      You must be Logged on to comment or reply to a post.
      Author's profile photo Ian Henry
      Ian Henry

      Great blog Yuliya thank you for sharing,

      I have implemented that in my scenario.

      There's one error in your pasted code, the line below needs to be indented with an extra tab.

      columns.append(item['Name'])   

      Cheers, Ian.

      Author's profile photo Yuliya Reich
      Yuliya Reich
      Blog Post Author

      Hi Ian,

      yes, of course, you are right. Thanks for saying, I will correct it.

      Regards,

      Yuliya

      Author's profile photo Dimitri Vorobiev
      Dimitri Vorobiev

      Yuliya, you are on fire with all these super useful DI blog posts!

      Author's profile photo k samarth ashvinbhai
      k samarth ashvinbhai

      Hey Yuliya, Very Informative BLOG. Thank you for sharing

       

      Can you give me and idea what changes we can make to the code so that it will work for initial load as well?

      P.S.:- i have the same structure of CDS View as well.

      It will be a great help from you!

       

      Thanks,

      Samarth