Technical Articles
SAP Datasphere Data Flow Series – Introduction and Sample Example
Hey Champs,
Let’s understand one of the important features of Datasphere: the data flow. Before moving on to the topic, we first have to understand three things well, i.e. the “What”, the “Why”, and the “How”.
This article is the first in the blog post series. It introduces the data flow along with one small example that we’ll use across the entire series listed below:
- Blog Post #1: Introduction to data flow and example
- Blog Post #2: SAP Datasphere Data Flow Series – Operators (Joins, Projection, Aggregation, Union)
- Blog Post #3: SAP Datasphere Data Flow Series – Script Operator Part 1
- Blog Post #4: SAP Datasphere Data Flow Series – Script Operator Part 2 (still cooking)
Why:
Data flow is an exciting new feature of SAP Datasphere because it provides a visual modeling experience for data integration. This makes it easier to combine and load data from various sources, including structured and unstructured data. Just imagine having BODS-style capabilities integrated right here; that is awesome.
What:
Data flow allows you to create and manage data pipelines using a graphical interface. These pipelines can be used to perform a variety of tasks, such as extracting data from different sources, transforming it, and loading it into target destinations.
How:
Well, this answer will not be short and simple. To understand it, let’s watch the full movie together.
User Interface:
Champs, let’s see all the tools and options SAP Datasphere provides us.
DataFlow User Interface
Now let’s cover a simple scenario: getting data from an Excel file source and loading it into a target table.
Wait… do we need to create the target table ourselves? No, not at all. Thanks to SAP for this wonderful feature: we just have to click “Create and Deploy Target Table”, and the target table will be created automatically.
USE CASE 1:
We got a requirement: while getting the order information from a different source table, we want to add leading zeros to the item column. So how do we do it?
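Before we build anything in Datasphere, here is a plain-Python illustration of what “adding leading zeros” means, using the standard `str.zfill` method. The pad width of 10 characters is just an assumption for this example; your requirement may specify a different width.

```python
# Zero-pad item numbers to a fixed width of 10 characters.
# The width (10) and sample values are assumptions for illustration.
items = ["10", "20", "1500"]
padded = [item.zfill(10) for item in items]
print(padded)  # ['0000000010', '0000000020', '0000001500']
```

This is exactly the transformation we will ask the script operator to perform on the item column.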
Let’s jump into the data flow to add the leading zeros:
First we will create the source table using an Excel file. After that, we need to drag and drop the Sales Order table into our play area:
Adding Source Table : Sales Order Table
Well, now let’s write one small Python script to add the leading zeros. “What? Datasphere supports Python??” Yes, that’s correct: Datasphere supports many Python functions, and the support is still evolving.
Drag and drop the script operator as shown in the image below and write the following code.
Adding Script Operator
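For reference, here is a sketch of the kind of script the operator runs. The script operator hands the incoming records to a `transform` function as a pandas DataFrame and writes out whatever DataFrame you return. The column name `ITEM` and the pad width of 10 are assumptions for this example; adjust them to match your own source table.

```python
import pandas as pd  # pandas is available inside the script operator


def transform(data):
    # 'data' is a pandas DataFrame holding the incoming records.
    # Convert the item column to string and pad it with leading
    # zeros to a fixed width of 10 characters.
    # Column name "ITEM" and width 10 are example assumptions.
    data["ITEM"] = data["ITEM"].astype(str).str.zfill(10)
    return data
```

The returned DataFrame is what flows onward to the target table.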
Once we have finished writing the script, we then click on “Add Table” to create the target table.
Adding Target Table
Now, once we have added the target table, click on it and a details panel will appear on the right side. For the time being, select the mode as Append. Then, in the top left corner, just click the Deploy option.
I will discuss all the remaining mode types in detail in further blogs.
Append: Write the data obtained from the data flow as new records appended to the end of the target table.
Now go to the target table, click on Data Preview, and it’s done: as we can see, all the items have leading zeros.
Running the Dataflow
That’s not the end. Stay tuned for more blogs, which I will be adding to the Data Flow Series.
Hey Kunal,
Great how you express yourself, quite fun.
Two questions here:
Can you share your Python code and also the link to the rest of the series?
Hello Luis,
The Python code is shown to the right side of the “Adding Script Operator” screenshot. It’s a very small one. I will write the next topics soon and add the links, and I will share a lot more Python scripts. Till then, party time!