In my previous post I promised you to show the combination of ABAP CDS Reader and Python operator. Here we are. You can use the approach below for SLT and ODP operators too, because the JSON output of operators is the same. In this blog post we are going to extract a standard CDS view from SAP S/4HANA Cloud into SAP Data Intelligence Data Lake. And what the most important is – we want to see headers in the file. This topic is more complicated, than the first one, but I’m sure you will get it 😊 Why I picked this topic? Firstly, I know that some colleagues encountered this problem. Secondly, there isn’t a standard solution for it at the moment. Moreover, this approach will help you to select columns or make some data preparation if it needed. So, let’s go!
For the CDS view extraction from SAP S/4HANA Cloud we are going to use ABAP CDS Reader operator in DI. It is a standard operator using to read data from a CDS view. There is a standard template “CDS View to File” using for data extraction from a CDS view into a csv file (located in the Data Lake in DI):
However, using this pipeline you will not see any headers in the file. It’s bad…very bad ☹
There are three versions of ABAP CDS Reader operator. The main difference here is the output data type. But independent of version you are using, the headers are still not visible in the file. And this problem we are going to solve in this post.
Among these three versions only ABAP CDS Reader V2 has an information about fields in its output. So, we are going to use this version. The output data type of this operator is message. A message has a body and attributes. The body of the output deliveries data, and attribute some information about fields, last batch etc (see image below). Because there is not any standard solution to get headers, we must use a custom operator. In this post I’m going to use Custom Python operator. If you don’t know what it is, check my previous blog post.
Below you will see the example how attributes look like (I_COSTCENTERTEXT view):
I’m going to read headers from ‘ABAP’ -> ‘Fields’ -> ‘Name’. The pipeline looks like:
I’m not going to go through back end setup, we will concentrate us on Data Intelligence. Let’s create our Python operator:
1.Step – Create input and output ports for Python operator
Input port: name – inData, type – message
Output port: name – outString, type – string
2.Step – Add some python magic into the “Script”
from io import StringIO import csv import pandas as pd import json def on_input(inData): # read body data = StringIO(inData.body) # read attributes var = json.dumps(inData.attributes) result = json.loads(var) # from here we start json parsing ABAP = result['ABAP'] Fields = ABAP['Fields'] # creating an empty list for headers columns =  for item in Fields: columns.append(item['Name']) # data mapping using headers & saving into pandas dataframe df = pd.read_csv(data, index_col = False, names = columns) # here you can prepare your data, # e.g. columns selection or records filtering df_csv = df.to_csv(index = False, header = True) api.send('outString', df_csv) api.set_port_callback('inData', on_input)
Note, that you should take care of data types using pandas in python. I don’t consider it in the code. Moreover, don’t forget about last batch if you need it in your use case. In addition, be aware that the “on_input” function will be executed by each data package. So, this approach is appropriate for replication/delta mode (because in the initial load json of attributes looks different after the last package, parsing cannot be done anymore, and the pipeline will fail after the last package). But this code can be adapted for initial load with only a couple of code lines. Take it as an exercise 😊 You can use this approach, if the CDS’s structure will be changed. And, of course, this code can be optimized.
3.Step – Upload a cool icon for your python operator and save it.
Configure “ABAP CDS Reader V2” and “Write to File” operators, save and run your pipeline. You are almost done 😊
Your output with the headers:
Congratulations! As of today, you don’t have any troubles with CDS’s headers 😊 If you have any questions or ideas for the next blog post, feel free to let me know in the comments. Stay tuned!
How to connect SAP S/4HANA with SAP Data Intelligence: