SAP BTP Showcase – Load data into SAP Data Warehouse Cloud
This is the 1st of 7 blog posts which are part of the SAP Business Technology Platform Showcase series of blogs and videos. We invite you to check this overall blog, so you can understand the full end-to-end story and the context involving multiple SAP BTP solutions.
Here you will see how to easily consume data from systems of records (e.g. SAP ERP), cloud and on-premise databases (e.g. SAP HANA, SQLServer, Oracle, Athena, Redshift, BigQuery, etc.), oData Services, csv/text files available in your own local computer, or any File/Object store (e.g. Amazon S3). We will leverage SAP Data Warehouse Cloud’s Replication and Data Flow capabilities, as well as demonstrate how to access remote sources using data virtualization.
Below you can see this is the 1st step of the “Solution Map” prepared for the journey on the referred overall blog:
SAP BTP Showcase – Overall Technical Architecture
As a recap, SAP Data Warehouse Cloud is an analytic and persona-driven data warehouse-as-a-service tailored for business and IT users. It provides instant access to data via pre-built business content and adapters to integrate data from various sources. It is scalable, elastic, open and flexible, making it suitable for any company of any size.
In this blog we will handle the specific “Connection” service, used to connect more than 20 types of datasources. You can check all available technologies in in the SAP Help Portal – Connection Types.
As an example, we have logged on SAP Data Warehouse Cloud, and created a new Space, named “DASC SHOWCASE” (DASC stands for SAP HANA Database & Analytics Solutions in the Cloud).
Once created, we have to access the “DASC SHOWCASE” Space, and hit the “plus” (+) button in order to create a new “Connection”:
Let’s now work on multiple scenarios for loading data into SAP Data Warehouse Cloud.
Consume data from systems of records
In this showcase, we will show that SAP Data Warehouse Cloud provides multiple connections for SAP applications, as per the below examples:
For instance, let’s create a connection to an SAP ERP. We can use the SAP ABAP or ECC Connection Types in this example:
After informing the credentials to the ABAP system (e.g. ECC, BW, S4/HANA, or BW4/HANA, etc.), you can see a message showing that the connection is valid.
Then, if you go to “Data Builder” and create any new artifact, like a “Graphical View”, you will be able to choose a system in the “Sources/Connection” tab, for example, “ABAP”. Then you can choose the object’s type, pick-up a source and drag&drop it to the white area in the graphical builder. By clicking in “Preview Data”, you can easily check the contents of the ERP/BW object, as per the figure below.
Consume data from databases
Now let’s see how to connect to multiple database technologies, for example.
When you select a “Connection Type”, you will get a screen to fill with connections properties for each of the technologies:
Let’s connect to an SAP HANA Cloud instance for example:
Then, if you go to “Data Builder” and create any new artifact, like a “Graphical View”, you will be able to choose a system in the “Sources/Connection” tab, for example “HANA Cloud”. Then you can choose the object’s type, pick-up a source and drag&drop it to the white area in the graphical builder, as per the picture below.
Consume data via oData Services
Now let’s create a new “Connection” for consuming oData service data.
Let’s say we will consume an oData service, provided by an SAP HANA Cloud Native application, as presented in Blog 5: Develop a SAP HANA Cloud native application. You can point to the oData URL generated by the application, as the example below:
You can “right click” the “/production-plan-values” and copy the link. Then you can paste it on the Connection property, as per the screenshot below:
Then, if you go to “Data Builder” and create any new artifact, like a “Graphical View”, you will be able to choose a system in the “Sources/Connection” tab, for example “oData”. Then you can choose the object’s type, pick-up a structure and drag&drop it to the white area in the graphical builder, as per the picture below.
Consume csv/text files from your own local computer
Now let’s upload a local .csv example file, containing information about weekday’s names.
Then you can adjust the data structure accordingly, for example, renaming columns, changing datatypes, and even applying more sophisticated transformation rules.
Then you have your data uploaded into SAP Data Warehouse Cloud, and you can see the content of the table, as per the figure below:
Replicate data from hyperscaler’s file/object store using the “Data Flow” functionality
Now we are going to create a new “Connection” for “Amazon Simple Storage Service”, also known as “Amazon S3”, so we can replicate this data, persisting it on SAP Data Warehouse Cloud. You just have to inform S3’s Endpoint, Credentials and Root Path (which is the bucket itself).
Then, if you go to “Data Builder” and create a new “Data Flow”, you will be able to choose a system in the “Sources/Connection” tab, for example, “S3-Filesystem”. Then you can choose the directory, pick-up a source file and drag&drop it to the white area in the graphical builder. By clicking in “Preview Data”, you can easily check the contents of the file object.
Then, you need to click in “Add table”, as presented in the figure below, so you can create a new table on SAP Data Warehouse Cloud’s persistence area (which is an SAP HANA Cloud database instance).
Now you inform the target table name, and click in “Create and Deploy Table”, to effectively deploy this new table.
After that, you need to choose the load behavior, “APPEND” or “TRUNCATE” mode. Then, you hit the save button in the left top corner of the screen, as demonstrated in the picture below:
After saving your “Data Flow”, you are ready to execute it. Just click in the “Execute” button in the top left corner of your screen. The execution starts and you can see the Status as “Running”, on the right middle of your screen. Then you can click in the symbol highlighted in red, in order to analyze the execution in the “Data Flow Monitor”.
In the screenshot below, you can see that the execution was completed successfully, and a copy of your dataset seating in Amazon S3 is ready to use, replicated and persisted in SAP Data Warehouse Cloud.
Replicate data using “Table/View Replication”
You can rely on “Data Integration Monitor” to replicate tables/views’ contents from remote systems, persisting a copy of this data straight in SAP Data Warehouse Cloud. Just click on the object you want to replicate, then click in the “Table Replication” link, on the top right of your screen, like presented in the screenshot below:
If you need a one-shot synchronization, then you can choose “Load New Snapshot”, which will replicate the entire object’s contents into SAP Data Warehouse Cloud’s persistence area. That works for tables and views.
In case you need a real-time synchronization between the source system and SAP Data Warehouse Cloud, then you should choose “Enable Real-Time Access”. It first loads a snapshot of the source data into SAP Data Warehouse Cloud’s persistence area, and then keeps the replication synchronized in real-time. This strategy avoids intense read operations in source systems. For applying this technique, the database table in the source system has to be enabled for change data capture (CDC). For example, SAP HANA technology is CDC-enabled.
You can find detailed information for this feature in the SAP Help Portal – Loading and Monitoring Remote Tables.
Once the replication of the snapshot is finished, then you can see an updated Status “Available”.
Then, if you go to “Data Builder” and open the replicated artifact, you will see that the Data Access is “Replicated”, as highlighted in the screenshot below.
In this blog you could learn how to leverage SAP Data Warehouse Cloud’s persistence area, consuming or replicating data from different source systems & technologies. This data can now be modeled by business users and consumed by analytic applications, as presented on the following 2 linked blogs:
Blog 6: Provide governed business semantics with SAP Data Warehouse Cloud, and
Blog 7: Consume SAP Data Warehouse Cloud’s assets using SAP Analytics Cloud.
All of your feedback is appreciated. Enjoy!