Implementing SDI with SAP HANA Cloud: concept and approach
In this blog, I will discuss the data replication process from a remote source to the SAP HANA Cloud environment using Smart Data Integration (SDI) as an ETL tool. Before that, let's look at the SDI concept, its architecture, its advantages and disadvantages compared with other ETL tools such as SAP Data Services, and the scenarios in which it is best to implement.
SDI is an ETL tool that supports data replication from a wide range of source systems to SAP HANA, and it is built into SAP HANA. No separate license is required for SDI. The data transformation operations, such as history preservation and row generation, are the same as in SAP Data Services (BODS), but SDI has some advantages over SAP Data Services as an ETL tool:
- A separate license is required for SAP Data Services, but SDI is built in.
- Smart Data Quality (SDQ) is included in HANA Cloud along with SDI, whereas SAP Data Services requires a separate license for the Data Quality transforms.
- SDI supports both real-time and batch replication, but BODS supports only batch replication, i.e. scheduled replication.
Also, if we compare SDI with SDA (Smart Data Access), one thing we need to understand is that in SDA the data is never materialized, i.e. it is not physically replicated on the target system and is only read through virtual tables. In SDI we also see the source table data through a virtual table created on the remote source in the cloud platform, until we do one of the following:
- create a flowgraph
- create a replication task
Both of these replicate the data physically from the source system to the target system.
Today I am discussing the process of integrating SDI with SAP HANA Cloud and replicating data from a remote HANA system to the HANA Cloud system.
The architecture will look like below –
We will use flowgraphs when transformation and data quality features such as cleansing are required, since these are not available in replication tasks. Otherwise we will use a replication task.
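The rule of thumb above can be written down as a tiny helper. This is only a sketch of the decision in this blog; the function name and its two flags are my own illustration and not part of any SAP API.

```python
def choose_sdi_artifact(needs_transformation: bool, needs_data_quality: bool) -> str:
    """Return the SDI artifact suggested by the rule of thumb in this blog."""
    # Transformations and data quality features (e.g. cleansing) exist only in flowgraphs.
    if needs_transformation or needs_data_quality:
        return "flowgraph"
    # A plain 1:1 copy of the source table is enough for a replication task.
    return "replication task"


print(choose_sdi_artifact(needs_transformation=False, needs_data_quality=False))
```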
## Steps of Implementation :
This assumes that the target HDI container is already implemented in SAP HANA Cloud; I will use my HDI container as the target system. I am also accessing a HANA box (with an XS classic schema) and a remote desktop as a service from AWS.
1. Install and Configure DP Agent:
We will install it on the remote desktop. The DP Agent will connect the source HANA XSC system with HANA Cloud using the HANA adapter. To download it, go to the –
After downloading, configure it as shown below during installation. Here I am naming the DP Agent PLB_AGENT:
Provide the details of the remote desktop where the DP Agent is being installed. All of this information is available from AWS, where the service is published.
Then register the agent by clicking the “Register Agent” button. Provide the agent name given at installation time, i.e. PLB_AGENT here, and the IP of the remote desktop as shown in AWS.
Then click “update status”. After this, select the HANA adapter and click “Register Adapter”.
Now PLB_AGENT will appear in the HANA Cloud environment.
2. Add a Remote Source: Go to the cloud environment and add a remote source.
The remote source will appear in the Database Explorer of the HANA Cloud environment.
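Besides looking in the Database Explorer, the result of steps 1 and 2 can also be checked with a few queries. Below is a minimal sketch, assuming the HANA system views AGENTS, ADAPTERS and REMOTE_SOURCES with the columns used here, and a DB-API cursor that accepts `?` placeholders (for example one from the `hdbcli` driver). The adapter name `HanaAdapter` and the remote source name `PLB_REMOTE` are assumptions for illustration. So that the sketch runs without a HANA connection, the demo part fakes the three views with an in-memory SQLite database.

```python
import sqlite3  # stand-in only, so the sketch runs without a HANA connection


def missing_sdi_objects(cursor, agent, adapter, remote_source):
    """Return labels of the SDI objects that the database does not list yet."""
    checks = [
        ("agent", "SELECT COUNT(*) FROM AGENTS WHERE AGENT_NAME = ?", agent),
        ("adapter", "SELECT COUNT(*) FROM ADAPTERS WHERE ADAPTER_NAME = ?", adapter),
        ("remote source",
         "SELECT COUNT(*) FROM REMOTE_SOURCES WHERE REMOTE_SOURCE_NAME = ?",
         remote_source),
    ]
    missing = []
    for label, sql, name in checks:
        cursor.execute(sql, (name,))
        if cursor.fetchone()[0] == 0:
            missing.append(f"{label} {name}")
    return missing


# Demo: fake the three views; agent and adapter are registered, the remote source is not.
demo = sqlite3.connect(":memory:")
cur = demo.cursor()
cur.execute("CREATE TABLE AGENTS (AGENT_NAME TEXT)")
cur.execute("CREATE TABLE ADAPTERS (ADAPTER_NAME TEXT)")
cur.execute("CREATE TABLE REMOTE_SOURCES (REMOTE_SOURCE_NAME TEXT)")
cur.execute("INSERT INTO AGENTS VALUES ('PLB_AGENT')")
cur.execute("INSERT INTO ADAPTERS VALUES ('HanaAdapter')")

print(missing_sdi_objects(cur, "PLB_AGENT", "HanaAdapter", "PLB_REMOTE"))
# prints ['remote source PLB_REMOTE']
```

Against a real system you would pass a cursor from your HANA Cloud connection instead of the SQLite one.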
3. Create a user-provided service to access the classic schema:
We need to create a user-provided service (UPS) to access the virtual table created through the remote source in the classic schema inside the HANA Cloud environment.
We need to change the mta.yaml file and create a .hdbgrants file like below:
The .hdbgrants file will look similar to the one below:
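To give an idea of the shape of such a grants file, here is a small sketch that builds one as JSON. It is not my actual file: the service name `PLB_UPS`, the schema `PLB_CLASSIC` and the remote source `PLB_REMOTE` are hypothetical, and the key names and privilege list follow the .hdbgrants layout as I understand it, so please check them against the HDI documentation before use.

```python
import json


def build_hdbgrants(service_name, classic_schema, remote_source):
    """Return the content of a .hdbgrants file as a dict (assumed layout)."""
    return {
        # Top-level key: name of the user-provided service that holds the grantor user.
        service_name: {
            # Privileges for the HDI container's object owner.
            "object_owner": {
                "schema_privileges": [
                    {
                        "schema": classic_schema,
                        "privileges_with_grant_option": ["SELECT", "SELECT METADATA"],
                    }
                ],
                "global_object_privileges": [
                    {
                        "name": remote_source,
                        "type": "REMOTE SOURCE",
                        "privileges": [
                            "CREATE VIRTUAL TABLE",
                            "CREATE REMOTE SUBSCRIPTION",
                        ],
                    }
                ],
            },
            # Privileges for the container's application (runtime) user.
            "application_user": {
                "schema_privileges": [
                    {
                        "schema": classic_schema,
                        "privileges": ["SELECT", "SELECT METADATA"],
                    }
                ]
            },
        }
    }


grants = build_hdbgrants("PLB_UPS", "PLB_CLASSIC", "PLB_REMOTE")
print(json.dumps(grants, indent=2))
```

The printed JSON is what would go into the .hdbgrants file in the `db` module of the MTA project.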
4. Create a virtual table inside my HDI container and create a flowgraph to replicate the data physically:
Then I created a flowgraph in my MTA project. Inside it, add the virtual table we created as the source of the flowgraph, then build the flowgraph.
The data will be stored physically in the target table “TARGET_CustomerDataTarget”.
You can run the flowgraph in real time or as a scheduled task.
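For the virtual table in this step, the definition is essentially one line that points at the remote object. The sketch below only generates that line. It assumes the `VIRTUAL TABLE ... AT "<remote source>"."<NULL>"."<schema>"."<table>"` form used for HANA remote sources, and all four names are hypothetical, not the ones from my system.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def virtual_table_definition(table, remote_source, remote_schema, remote_table):
    """Build the one-line content of a .hdbvirtualtable file (assumed syntax)."""
    # "<NULL>" fills the database part of the four-part remote object name.
    return (
        f"VIRTUAL TABLE {quote_ident(table)} AT "
        f'{quote_ident(remote_source)}."<NULL>".'
        f"{quote_ident(remote_schema)}.{quote_ident(remote_table)}"
    )


ddl = virtual_table_definition(
    "VT_CUSTOMER_DATA", "PLB_REMOTE", "SALES", "CUSTOMER_DATA"
)
print(ddl)
```

The flowgraph then uses this virtual table as its data source and writes to the physical target table.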
5. You can create a replication task if there is no transformation of data, and schedule it:
In this scenario I am creating a replication task and running it in real time. I created a replication task named “PLB_flowgraph.hdbreptask”, added the virtual table as the source, and set the task to real-time.
The replication task needs to be configured in the following way:
The data will be stored physically in the target table “TARGET_CustomerTarget”, and whenever a change occurs in the source table on the remote source, it will be reflected in the target table.
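Conceptually, a real-time replication task does an initial load and then keeps applying the source changes to the target. The following is only a toy simulation of that idea with Python dictionaries, not how SDI is implemented internally; the change tuples and sample rows are made up for illustration.

```python
def apply_change(target: dict, change: tuple) -> None:
    """Apply one captured change (operation, key, row) to the target table."""
    operation, key, row = change
    if operation in ("INSERT", "UPDATE"):
        target[key] = row
    elif operation == "DELETE":
        target.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {operation}")


# Source table on the remote system, keyed by primary key.
source_rows = {1: {"name": "Alice"}, 2: {"name": "Bob"}}

# Initial load: the target starts as a physical copy of the source.
target_rows = dict(source_rows)

# Changes captured on the source after the initial load.
changes = [
    ("UPDATE", 2, {"name": "Robert"}),
    ("INSERT", 3, {"name": "Carol"}),
    ("DELETE", 1, None),
]
for change in changes:
    apply_change(target_rows, change)

print(target_rows)
# prints {2: {'name': 'Robert'}, 3: {'name': 'Carol'}}
```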
Finally, we are able to replicate the data physically using an SDI flowgraph and a replication task. This is the end of this blog. I will meet you in my next blog with another useful topic. Bye.