Skip to Content
Technical Articles
Author's profile photo Giri Raaj Ragupathi

Data flow using Python – in SAP “Data Warehouse Cloud “

  • Overview :

In real world Data Warehouse Implementations,  Business   requirment   demands  a need to cleanse, consolidate  and   integrate   the   data   to   suit   the   Analysis  /  Reporting  for the Organizations.The  data flow  in “SAP Data Warehouse Cloud”   Provides  us  a possibility to transfer data from a  source to  Target  destinations(DWC  Tables)  and cleanse, consolidate and integrate the data so that  it  can  be  used  for analysis,  reporting  and possibly also for planning.

We can Design a data flow to easily transport and transform data through a graphical user interface, before publishing the data to the destination table.  you can drag and drop sources from the Source Explorer, combine them as necessary, add additional operators to remove or add columns, aggregate the data, and run Python scripts.

This Blog Tries to answer some of the Python scripts required to be used in real world DWC implementations and help us to transform data for the dynamic requirments of the business.


 #Sample Data Flow Screen :


  • What is the Key Take Away from this Blog :
    • Guides you  to design DWC Dataflow
    • Guides you  to Transform data using Python scripts
    • Guides you  step by step way to Implement  DML operations in python.
    • Insert New Record using script
    • Modify Specific Field Contents.
    • Updating Values in specific Row number.
    • Concatenating Field Values.
    • Deleting specific Record


  • Step by Step Approach:


  • Step 1: In DWC Home Page -> Data Builder Option will Present the following screen


The above Dataflow option – select the New data Data flow, which will provide us the option to select the source tables/extractors.



  • Step 2: Drag the source and Target Objects from the left Panel.

As shown above, select the source from the left panel and also select projections and select Target objects in to the design area. Please note for target tables , we need to select the above shown option and deploy it.

Please insert SCRIPT operator box from the top panel to transform data with some logics and conditions. Here in we are going to see step by sep usage of python scrip in dataflow.

    • Scenario 1 : Insert New Record using script
    • Scenario 2 : Modify Specific Field Contents.
    • Scenario 3 : Updating Values in specific Row number.
    • Scenario 4 : Concatenating Field Values.
    • Scenario 5 : Deleting specific Record


  • Step 3: In Script Operator box – Select the Change Button from Properties



  • Step 4: Insert a New record using Python script ( Scenario 1)

For Demonstration purpose – we have considered source table with the column names shown below :


Once you navigate to script editor, for inserting new records – data.append code allows us to insert a new record in the recordset. please note as per our scenario we have to copy the last record values, for this purpose line no 19  : data(-1) is used in the code. It captures the last record field values  in a new dataframe called newRow.


Once we have the above code in place of the script editor , please save the code and deploy and ‘RUN’ the dataflow. as demonstrated here , the resultsets after the dataflow run – looks like this below screen.

 The same process need to be followed for any script implementation. for demonstration   purpose in scenario 1  used for Inserting new records.


  • Scenario 2 : Modify Specific Field Contents.


As demonstrated in the above step by step approach . follow step 3 & 4 for this scenario2,  to modify the field contents of the existing recordset. Please note same process need to be followed for any script implementation.

In this , we have modified the Rownum, company code  by incrementing 1.  as discussed  earlier the ‘append’ command helps us  to add records.  in the above screen line no – 33 & 34 performs value increment.


As we can see the output above , which modifies the rownum and company code values.



  • Scenario 3 : Updating Values in specific Row number.

This scenario – we will change the field contents base on condition. here we wanted to update the ‘Remarks’  for row no 2.

The code  which  is  used in line no : 16.   The key  command  is  to   perform  this  activity is[ ]. here we have to mention the row position. as we all know        in Array record count starts from 0. hence to replace the row no 2 : we need to provide 1.



          The output of the operation is shown above. row number 2 ‘Remarks’ has been modified.


  • Scenario 4 : Concatenating Field Values.

In this scenario – we will concatenate values extracted from other fields. Here the ‘Remarks’ Column is updated with ‘Parent_id’ and ‘Child_id”  with column values.

The Line no : 16 performs this operation.


The output shown above . Field values are concatenated in the last field. Using the ‘+’ operator we can combine and concatenate fields and store it in required fields. Please note we must use str(‘integer value’) , if we want to concatenate the integer values.



  • Scenario 5 : Deleting a Specific Record from Result set.

In this scenario – we will  delete the complete record based on conditions. we wanted to remov rownum less than 3. Hence we have copied the required rows to the new dataframe and re-assigned the filtered dataset to the original one.

Line No : 17 – the required results are copied to df_filtered data frame. and in line no : 18 , this is re-assigned to the actual data frame “data”.


As demonstrated the we can the output  rownumber 3 and above. These codes will be very handy for transforming datas during live implementation projects.


  • Conclusion:

Data Warehouse Cloud – is a powerful tool to re-vitalize the datawarehousing solution from SAP. In this Blog We managed to demonstrate one of the key features from DWC  dataflow in combination with Python scripts.

    • Handy python code to help in Implementations.
    • Provides us confidence to transform data based on logics.
    • Guides us  to filter and update the recordsets for dynamic requirments.


–  Giri Raaj. PMP –

Assigned Tags

      Be the first to leave a comment
      You must be Logged on to comment or reply to a post.