HANA SDI is a superset of the other SAP data integration tools if, and only if, HANA is the main data target.
Nowadays a lot of questions come up around HANA SDI: Is HANA SDI the only future data integration solution for SAP? Should we use SAP SLT or HANA SDI for data replication now? Will SAP Data Services be dead in the future?
To answer these questions, let me explain what the SDI technology is and how it differs from the existing data integration tools.
HANA SDI (Smart Data Integration) is part of the HANA-delivered EIM (Enterprise Information Management) services, which are an optional component. EIM mainly provides the SDI and SDQ (Smart Data Quality) solutions. SDI is the only solution that combines batch ETL, real-time replication, and virtual data access in one technology. We no longer need separate components or products when our main data target is HANA. Because SDI is an integral part of the HANA architecture, HANA features such as aggregations, joins, unions, and AFL functions can be used in the data flows. SDI is also one of the best options for moving HANA data from on-premise to the cloud or vice versa.
Comparison and use cases with the other major SAP data provisioning tools:
SAP SLT (SAP Landscape Transformation): This tool is still considered the best choice for real-time replication from ABAP to ABAP, or from SAP ERP to HANA as a sidecar solution. It offers flexible options for writing ABAP code for data filtering and transformation, makes replication of cluster and pool tables easy, and has many other features for setting up table replication in a few clicks.
SAP DS (SAP Data Services): One of the best ETL tools available in the market; it can be called a single enterprise-level solution for data integration, transformation, and quality. It can be considered when there is no HANA, or when HANA is just one of many data targets.
SAP RS (SAP Replication Server): A real-time data integration solution that moves and synchronizes transactional data, including DDL and DML, across the enterprise. It is better suited to mission-critical scenarios that require near-zero downtime for data replication. SAP RS can be considered over HANA SDI when there are no HANA targets and real-time bi-directional data movement is needed.
So we can say that for most non-HANA cases the other tools are still needed.
Let's talk about the important SDI components, the architecture, and usage.
HANA DP Server (Data Provisioning Server): This is a server process that resides inside the HANA platform and must be running for SDI to be used. It handles the communication between the DP Agent and HANA (index server, XSC, XSA). If the DP Server is turned off, remote data sources are not accessible to HANA.
DP Agent (Data Provisioning Agent): The DP Agent is a very lightweight process that allows HANA to reach the configured remote sources through different types of adapters. It runs outside of HANA, most of the time close to the remote data source, and is set up with the SAP DP Agent tool.
The agent can be created and configured in the DP Agent configuration tool, or with SQL: CREATE AGENT "DPAgentName" PROTOCOL 'TCP' HOST '<hostname>' PORT 5050;
The DP Agent version that matches the product availability matrix can be downloaded from the software downloads area of the SAP portal.
The DP Agent tool is used to configure the agent service, register the agent in HANA, and deploy the adapters in HANA. New adapter .jar files can be added easily.
Each DP Agent has a unique name within HANA, and HANA must be able to connect to it. One agent can have several types of adapters deployed.
Once HANA can talk to the agent, it can query which adapters are deployed and registered, so that they can be made available for remote connections in HANA Studio or Web IDE.
The agent is mostly installed on the remote data server. If there is more than one remote data server in different environments, such as Windows and Linux, it can be installed separately on each server.
Adapter and adapter connection: An adapter translates HANA query requests into source requests and vice versa. It plays an important role in the data extraction and translation process. SAP provides many standard adapters so that different remote data sources can be accessed easily. For remote data sources whose adapters are not available or not suitable for the replication requirements, SAP provides an SDK development framework to build new adapters or extend existing standard ones; these are called custom adapters. At a high level, an adapter can be thought of as a database connection driver, like JDBC or ODBC.
Many standard adapters are provided by SAP as part of the HANA DP Agent installation. More adapters are available from SAP partners and on GitHub, and custom adapters can be built using the SAP SDK. An adapter is registered in HANA with SQL such as: CREATE ADAPTER "FileAdapter" AT LOCATION AGENT "DPAgentName";
Whether database operations such as SELECT, INSERT, UPDATE, and DELETE are available on a virtual table depends only on the adapter. If the adapter does not support DML operations, an INSERT statement fails in HANA with the error "Adapter does not support inserts".
If the adapter is not offered in HANA Studio or Web IDE when adding a remote connection, it first needs to be deployed and registered using the DP Agent tool.
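Which agents and adapters HANA currently knows about can also be checked from SQL. The following is a minimal sketch, assuming that the system views AGENTS, ADAPTERS, and ADAPTER_LOCATIONS in the SYS schema are available in your HANA revision and that your user is allowed to read them. SELECT * is used on purpose so that no column names have to be assumed.

```sql
-- Agents registered in HANA (via CREATE AGENT or the DP Agent tool)
SELECT * FROM "SYS"."AGENTS";

-- Adapters registered in HANA
SELECT * FROM "SYS"."ADAPTERS";

-- Which adapter is available at which agent location
SELECT * FROM "SYS"."ADAPTER_LOCATIONS";
```

If an adapter is missing from the result, it has most likely not been deployed and registered through the DP Agent tool yet.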
Remote connection: In HANA, under the Provisioning folder, remote sources can be added and configured based on the available, registered SDI adapters. A remote source establishes a connection between HANA and the remote database using the selected adapter (driver). Once the connection is established and active, remote data is accessible through virtual tables using the different SDI features.
Select the required adapter, or add it from the DP Agent tool if it is not available in HANA (you need to connect to the HANA system from the DP Agent tool with a user that has the required role, such as the SYSTEM admin user). Then provide the required parameters to enable the remote data connection.
Virtual tables: A virtual table represents remote data in HANA. Its look and feel and its access provisioning are like those of normal local schema tables in HANA. A virtual table holds only metadata; the data resides on the remote database server. One remote connection allows multiple virtual tables to be added in HANA. Here SDI uses the SDA (Smart Data Access) concept of adding virtual tables. The following operations can be performed on virtual tables:
- Simple copy: INSERT INTO <table> SELECT * FROM <virtual table>
- Read remote data: SELECT * FROM <virtual table>, or use the virtual table as a data source in calculation views or CDS views. A virtual table can even feed data to BW aDSO objects in a BW/4 application.
- Data transformation: using functions, flowgraphs, and stored procedures.
- Remote subscriptions: these enable notifications of changes in the remote system. This is important in real-time replication to identify records updated on the remote server and update the local HANA copy.
- Invoke virtual procedures: CALL <virtual procedure>
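The first two operations in the list above can be sketched in SQL as follows. All object names ("MYSCHEMA", "MyRemoteSource", "CUSTOMERS", "VT_CUSTOMERS", "CUSTOMERS_LOCAL") are hypothetical, and the two "<NULL>" path parts assume a source without a database or owner level; the actual remote object path depends on the adapter and the source system.

```sql
-- Create a virtual table on top of a remote object.
-- Assumption: remote source "MyRemoteSource" already exists and exposes "CUSTOMERS".
CREATE VIRTUAL TABLE "MYSCHEMA"."VT_CUSTOMERS"
  AT "MyRemoteSource"."<NULL>"."<NULL>"."CUSTOMERS";

-- Read remote data: the query is passed to the remote source through the adapter.
SELECT * FROM "MYSCHEMA"."VT_CUSTOMERS";

-- Simple copy: create an empty local table with the same structure, then load it.
CREATE COLUMN TABLE "MYSCHEMA"."CUSTOMERS_LOCAL"
  AS (SELECT * FROM "MYSCHEMA"."VT_CUSTOMERS") WITH NO DATA;

INSERT INTO "MYSCHEMA"."CUSTOMERS_LOCAL"
  SELECT * FROM "MYSCHEMA"."VT_CUSTOMERS";
```

Only the metadata of "VT_CUSTOMERS" lives in HANA; the rows are fetched from the remote server each time the virtual table is queried.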
Flowgraphs: Flowgraphs let you create data provisioning tasks graphically inside HANA; they are executed by the HANA calculation engine. They support ETL-style batch data flows as well as real-time data flows, and complex transformation logic can be implemented easily with them.
Replication Task: A replication task can be created in the HANA web-based development environment. A common use case of the hdbreptask editor is to create an exact 1:1 copy of a remote source in HANA, for example to quickly copy the data of thousands of tables into HANA without much transformation.
SDI nicely solves a common replication problem: when to start the initial load and when to start applying the replicated changes. SDI handles this with three loading phases.
Phase 1: Capture the new changes and queue them in the HANA DP Server. ALTER REMOTE SUBSCRIPTION <name> QUEUE;
Phase 2: Trigger the initial load.
Phase 3: Distribute the changes that were captured and queued since phase 1, on top of the data loaded in phase 2. ALTER REMOTE SUBSCRIPTION <name> DISTRIBUTE;
Note: The remote subscription feature is, as of now, available only for virtual tables.
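Put together, the three phases could look like the sketch below. The names "MYSCHEMA", "SUB_CUSTOMERS", "VT_CUSTOMERS", and "CUSTOMERS_LOCAL" are hypothetical; it is assumed that the virtual table and a structurally compatible target table already exist, and that the adapter behind the virtual table supports real-time change capture.

```sql
-- Define the subscription: changes on the virtual table are applied to the local target table.
CREATE REMOTE SUBSCRIPTION "MYSCHEMA"."SUB_CUSTOMERS"
  ON "MYSCHEMA"."VT_CUSTOMERS"
  TARGET TABLE "MYSCHEMA"."CUSTOMERS_LOCAL";

-- Phase 1: start capturing changes and queue them in the DP Server.
ALTER REMOTE SUBSCRIPTION "MYSCHEMA"."SUB_CUSTOMERS" QUEUE;

-- Phase 2: run the initial load while changes keep being queued.
INSERT INTO "MYSCHEMA"."CUSTOMERS_LOCAL"
  SELECT * FROM "MYSCHEMA"."VT_CUSTOMERS";

-- Phase 3: apply the queued changes and continue with real-time replication.
ALTER REMOTE SUBSCRIPTION "MYSCHEMA"."SUB_CUSTOMERS" DISTRIBUTE;
```

Starting the queue before the initial load is what prevents changes made during the load from being lost.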