Real-time data replication to Data Warehouse Cloud
I have introduced how to set up connection between Data Warehouse Cloud and source systems like S/4HANA and SAP HANA (both On-Premise and Cloud) in this blog. In addition, I have replicated and visualized S/4HANA CDS View data with Data Warehouse Cloud in this blog.
Now, I will replicate data from On-Premise SAP HANA to Data Warehouse Cloud in real-time in this blog.
I will follow the steps below:
- Create Graphical View
- Replicate data from SAP HANA to Data Warehouse Cloud in real-time
- Create a story for checking record count
- Update the data on source system and confirm the data is replicated in real-time
1. Create Graphical View
First, select “Data Builder” from the menu icon on the left and select the Space where the connection is created.
Click “New Graphical View” to create a new view.
In the left pane, you can select data sources of the view from both Repository (tables and views on Data Warehouse Cloud) and Sources (objects on the source systems). In this blog, I will select “Sources” to connect table in SAP HANA.
If you select “Sources”, the list of schemas in the SAP HANA is displayed.
In this blog, I will use table “LINEITEM_2” in schema “TPCH”.
Drag and drop the table to import it.
Enter “Business Name” and “Technical Name” of the view. Select “Type” to “Analytical Dataset” and “ON” to “Allow Consumption” to visualize the data.
Add a new field to count the number of the records. Select the target “LINEITEM VIEW” and click “Calculated Columns” icon (fx icon) from the popup menu.
Click the “+” icon on upper right to add “Calculated Column”.
Enter “Technical Name” and also “1” as “Expression”. By totaling this column, the number of records can be displayed.
Go to “LINEITEM VIEW” again. Click “…” of the numeric column and select “Change to Measure” to define the column as “Measure”.
Set all numeric columns including the “Calculated Column” “CNT” to “Measure”. Choose “SUM” as aggregation type.
In addition, you can define further configurations such as hierarchy, calculation column, filter, join with other table, etc. However, this blog will leave it as it is, save and deploy it.
Select the target and click “Data Preview” to display preview of the data. At this point, the data is not stored in the Data Warehouse Cloud, but connect to the source system (SAP HANA) each time.
Go back to “Data Builder”, the saved “Graphical View” and source tables used by the view are displayed in the list.
2. Replicate Data
You can create a report (story) for the Graphical View created in the previous step. In this case, you will access SAP HANA every time you request data.
On the other hand, you can also replicate the data to Data Warehouse Cloud for better performance. In this blog, I will try to replicate data in real-time.
* At present (May 2020), only data replication is possible, but a data flow function is planned to be available within this year (2020), which enables so-called ETL processing that performs data transformation while extracting and loading data.
From the menu icon on the left, select “Data Integration Monitor”.
The table you created earlier (the table that refers to SAP HANA) is displayed. You can select “Remote” and “Replicated” in the “Data Access” column.
- Remote: Have no data in Data Warehouse Cloud and access the source system every time
- Replicate: Replicate data from source system to Data Warehouse Cloud
Now change the setting to “Replicate”.
You can select “None” and “Real-Time” as the “Data Frequency”. Here, I will select ‘Real-Time’.
- None : Stores the data as a snapshot. Data Warehouse Cloud will not be updated even if the data source is updated unless the snapshot is updated
- Real-Time : Replicate changes that occur in the source system to Data Warehouse Cloud in real-time
Data replication (Initial Load) is started, the status becomes “Active” once it is completed. Now the table is in real-time replication status.
3. Create a story for record check
Create a simple story to check the number of the table records. Select “Story Builder” on the left menu. Select the space where the Graphical View was created.
Click “Create Story” to start creating a new story.
In the data source selection screen, select the Graphical View which was created before.
You can create story using various components like chart, table, geo map, etc.
In this blog, I will use chart as an example. Change the chart type to “Indicator” > “Numeric Point”, select “CNT” as “Measure”.
The current record count is displayed. Save it once completed.
4. Update data on the source system
Update data on the source system (On-Premise SAP HANA). Two records are inserted.
Refresh the story.
Now you can see that there are two additional data. The data maintained in the source system has been transferred to Data Warehouse Cloud in near real-time.
Now, I have introduced a way to replicate data from a source system (On-Premise SAP HANA) to Data Warehouse Cloud in real-time.
Generally, we need to set up ETL system, create ETL jobs and run them to perform data integration between systems. However, in Data Warehouse Cloud, even if you have set up a system connection (DP Agent setup), you can replicate data to Data Warehouse Cloud by just choosing data access mode. Moreover, You can also set whether the replication is in real-time or snapshot mode only by selecting the refresh frequency.