Checking the behavior of SAP HANA Cloud virtual table replica feature
In this blog, I will share you the experience of behaviour check of SAP HANA Cloud Virtual Table Replica feature. I have checked following options.
- Snapshot (Cache)
SAP HANA Cloud can be positioned as “The Single Data Gateway”, this means any type of enterprise data can be integrated virtually. Specifically, various data sources can be integrated via virtual tables created on HANA Cloud (Data Federation).
However, it may happen that you want to physically store the data in SAP HANA Cloud from the performance and maintenance point of view.
Although On-Premise SAP HANA can replicate data with ETL functionality like SDI in such cases, SAP HANA Cloud can replicate data (or have snapshot data) only by setting it as an option of virtual tables. (No ETL job required!)
*Of course, data can be replicated to SAP HANA Cloud using SDI (also in batch and real-time linkage). Please refer to this blog for details.
I have referred the information as follows.
- SAP Help – Toggling Between Virtual Tables and Replication Tables
- SAP Help – ALTER VIRTUAL TABLE Statement (Data Definition)
* This blog shows the following parts (Online switch federation / caching / replication) of the SAP HANA Cloud overview document.
As a prerequisite, Data Provisioning Agent needs to be set up, please refer to this blog how to set up it.
1. Data Federation
First, create a virtual table on SAP HANA Cloud. It shows a table in On-Premise SAP HANA.
create virtual table "FVT_LINEITEM" at "OPHANA"."<NULL>"."TPCH"."LINEITEM_2";
Virtual Table is created.
Check the number of rows.
Then, execute SELECT statement to the virtual table. The statement is executed on the source system (On-Premise SAP HANA) and only the result set is transferred to the SAP HANA Cloud. It takes a little time due to this.
* On-Premise HANA is running on AWS us-east, SAP HANA Cloud is running on AWS Frankfurt
2. Data Replication
Next, execute ALTER TABLE statement against the virtual table which was created in the previous step. This statement changes the behavior of the virtual table from data federation to data replication.
alter virtual table "FVT_LINEITEM" add shared replica;
By this command, a table which stores replicated data is automatically generated in schema “_SYS_TABLE_REPLICA_DATA”.
From the runtime information of the generated table, you can see that the number of record is same as the source table and it physically stores data.
Check the number of records of the virtual table, it is same number as before ALTER TABLE.
Execute the same SELECT statement. The performance is much improved.
Then, update the source table data. In this test, I have added two records.
Check the number of records of the virtual table, you can find 2 records are added.
Also check the auto-generated table and you can see the record count is increased by 2 records. This means that the data updated on the source side has been replicated to SAP HANA Cloud in the near real-time.
Finally, stop and delete the replication by ALTER TABLE statement.
alter virtual table "FVT_LINEITEM" drop replica;
The auto-generated table is dropped.
3. Data Snapshot
Next, I will replicate snapshot data. For this execute ALTER Statement against the virtual table.
alter virtual table "FVT_LINEITEM" add shared snapshot replica;
A table is automatically generated on schema “_SYS_TABLE_REPLICA_DATA” as same as data replication.
Check the runtime information of the table and you can find the table physically stores data.
Execute SELECT statement and the result is returned as fast as the previous replication case.
And then, update the source table. In this case I have added one record.
The source table was updated, however the data in the SAP HANA Cloud snapshot table is not updated.
Check the number of rows of the virtual table, the data is not updated.
Then, update the snapshot data with the following command.
alter virtual table "FVT_LINEITEM" refresh snapshot replica;
The virtual table data was updated.
The Snapshot table was also updated.
Finally, delete the snapshot with the following command.
alter virtual table "FVT_LINEITEM" drop replica;
The snapshot table was droped.
Executing a SELECT statement against the virtual table now takes a little tame, as before. (Retrieving data from On-Premises SAP HANA)
The above is the introduction of the behaviour check results of the virtual table replication feature.
While virtual data integration can provide data to business users quickly, there are concerns about performance, maintenance and system/network workload.
The virtual table replication feature introduced to SAP HANA Cloud can perform real-time data replication and snapshot data creation simply by changing the virtual table settings with the ALTER TABLE statement. In other words, it is not necessary to create ETL jobs, and it is possible to set it up online, which makes it possible to optimize performance easily and quickly.
By utilizing this feature, first provide data to business users by virtual data integration, and then perform this setting when it is verified to be businessally beneficial and a faster response is required. This means, agile data utilization can be realized.
One thing I noticed...once you create a Replica for a Virtual table, you cannot execute any DML on the Virtual table
You will see an error similar to this.
Could not execute 'insert into "Customers" values (1,'r','c','a','b','c','d','e',4)'
Error: (dberror) : feature not supported: Could not execute DML on a virtual table (DBADMIN.Customers) when it has an enabled replica
Without the Replica enabled, insert/updates into the Virtual table will propagate to the Source table but not with Replica on.
Very good information and the possibility of using this new feature is good for our projects. I have some doubts regarding CDC, when I checked the ADAPTER_CAPABILITIES system view, I see hanaodbc is supporting CDC. I have following questions,
Hope you can clarify these questions. Thank you in advance.
Excellent demonstration. Thanks 🙂
Can I add shared Replica on a virtual table created inside HDI container? If Yes, what are the privileges that would be required? I tried assigning ALTER privilege on remote source to object owner #OO user and application user #RT user of container, but that didn't work.
Can anyone help me out with this?
Thanks in advance,
Could you please help me understanding if we have already replication task in SDI then what will be significance use case of virtual table shared replica. Because both serves same purpose of storing physical data into table.