Skip to Content
Technical Articles

A new approach for replicating tables across different SAP HANA systems

Starting with SAP HANA Cloud, SAP HANA supports a new remote table replication feature for replicating tables across different SAP HANA systems natively. With this new feature, users can handily replicate remote tables as it directly sends DDL and DML logs from a source system to a target system through SAP HANA smart data access (SDA) ODBC connection.
In a nutshell, the new remote table replication was developed with the following five key characteristics.

  • Transient (in-memory) log-based replication: As this new feature is log-based in transactional layer, it can provide better performance with less overhead to a source system compared to trigger-based replication
  • ODBC connection: It ensures optimized performance between HANA systems because it uses SDA ODBC connection with hanaodbc adapter for its log/data transfer
  • Transactional consistency: If multiple source tables are updated within one transaction, it applies the same behavior on the target tables to ensure database level transactional consistency (transactional consistency is only guaranteed if the transaction is handled on database level, and it doesn’t guarantee the transaction consistency on application level like ABAP)
  • Near real-time replication: It keeps replicating when changes happen (change data capture)
  • Auto reactivation for failures: When an indexserver is turned off for any reasons such as crash or upgrade, all replications are disabled but when the indexserver restarts, those are reactivated and resumes replicating

Getting started with the new remote table replication

First of all, this new feature is only supported between SAP HANA Cloud systems at this point. As the hybrid scenario becomes ever more important, replicating remote tables from SAP HANA 2.0 SPS05 or later to SAP HANA Cloud will also be available soon. It will become possible once SDA supports remote sources to on-premise databases from SAP HANA Cloud using SAP Cloud Connector for SAP HANA Cloud. Please find further details in the below SAP Note and stay tuned for future updates.

SAP Note 2600176 – SAP HANA Smart Data Access Supported Remote Sources

As it requires SAP HANA 2.0 SPS05 or later when the source system is SAP HANA platform edition, please also see the below table as the supported scenarios.

Scenario Supported?
Replication between SAP HANA Cloud systems Yes
Replication from SAP HANA 2.0 SPS05 or later (source) to SAP HANA Cloud (target) Planned in Q4 2020
Replication from SAP HANA 2.0 SPS04 or earlier (source) to SAP HANA Cloud (target) No
Replication from SAP HANA Cloud (source) to SAP HANA 2.0 SPS05 or earlier (target) No
Replication between SAP HANA 2.0 SPS05 or earlier systems No

Configuration and activation

In order to use this new remote table replication, the first step is creating a remote source with SDA from the target SAP HANA Cloud to the source SAP HANA Cloud. SDA supports 3 types of credential modes to access a remote source with SAP HANA Cloud: technical user, secondary credentials, and single sign-on (SSO) with JSON Web Tokens (JWT). Currently, this new feature only supports the technical user mode, and users need to configure ‘DML Mode’ of a remote source property as ‘readwrite’.

CREATE REMOTE SOURCE REMOTE_REP ADAPTER "hanaodbc"
    CONFIGURATION 'Driver=libodbcHDB.so;ServerNode=<remote_tenant_endpoint>;
    sslTrustStore="<digi_certificate_string>";encrypt=TRUE;'
    WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=<user_name>;password=<password>';

The <digi_certificate_string> should be entered as a string on a single line as below.

sslTrustStore="-----BEGIN CERTIFICATE-----MIIDr … WVbd4=-----END CERTIFICATE-----";encrypt=TRUE;'

The ‘readwrite’ is a default value for the remote source property ‘DML Mode’ in SAP HANA Database Explorer but please make sure that the created remote source is configured correctly.And the remote source should be named in UPPERCASE to initialize the remote table replication. For more details, please also see the below official documentation.

SAP HANA Cloud Administration Guide: Create an SAP HANA Remote Source

The next step is creating a virtual table in the target system. The following statement will create a virtual table V_CUSTOMER of MARKETING schema in target system based on CUSTOMER table of SALES schema in source system.

CREATE VIRTUAL TABLE MARKETING.V_CUSTOMER AT "REMOTE_REP"."<NULL>"."SALES"."CUSTOMER";

Then creating a replica table and remote subscription is needed.

-- create a replica table
CREATE COLUMN TABLE MARKETING.R_CUSTOMER LIKE MARKETING.V_CUSTOMER;

-- create a remote subscription
CREATE REMOTE SUBSCRIPTION SUB ON MARKETING.V_CUSTOMER TARGET TABLE MARKETING.R_CUSTOMER;

As the replica table is created with COLUMN LOADABLE by default, the entire columns will be loaded into memory and it ensures the performance. However, if needed, the user can switch it to PAGE LOADABLE. It means that column data is loaded by page into the buffer cache and it can reduce the memory usage. For switching, the user can use the following statement.

-- switch to page loadable
ALTER TABLE MARKETING.R_CUSTOMER PAGE LOADABLE;

-- switch to column loadable
ALTER TABLE MARKETING.R_CUSTOMER COLUMN LOADABLE;

Those are all about the configuration for new remote table replication, and the following statements can be used for activating, deactivating, or dropping the replication.

-- activate the replication
ALTER REMOTE SUBSCRIPTION SUB DISTRIBUTE;

-- deactivate the replication
ALTER REMOTE SUBSCRIPTION SUB RESET;

-- drop the replication
DROP REMOTE SUBSCRIPTION SUB;

If users drop the replication and want to reactivate it, they need to truncate or recreate the replica table. Please find the further details in the below links.

SAP HANA Cloud Administration Guide: Configure Remote Table Replication with the SDA HANA Adapter
SAP HANA Cloud Administration Guide: System Views for Monitoring Remote Table Replication

Auto reactivation

As mentioned, this new remote table replication feature supports auto reactivation for failures. During the reactivation, it synchronizes each pair of source and target table first then changes their replication status to ENABLED. Users can also manually do global reactivation with the following statement.

ALTER SYSTEM ENABLE ALL TABLE REPLICAS USING REMOTE SOURCE ALL;

 

Differences from the replication with SDI or SLT

SAP HANA smart data integration (SDI) with HanaAdapter can be used for replicating tables between SAP HANA Cloud systems, but this new feature with SDA hanaodbc adapter provides better performance with log-based replication and simplified approach without deploying the Data Provisioning (DP) Agent.
However, even though it supports near real-time change data capture as SDI does, there are some limitations. For example, the schema of source table and target table must be identical with no additional filters or different partitions. If you would like to know further details of limitations or comparisons between SDI and SDA, please also refer to the below link.

SAP HANA Cloud Administration Guide: Replicating Tables from Remote Sources

SAP Landscape Transformation (SLT) Replication Server can also be used for replicating tables between SAP HANA systems but it is trigger-based and requires additional installation. For these reasons, this new remote table replication can provides benefits from log-based replication with simplified approach compared to SLT.

Furthermore, the key differentiator compared to SDI and SLT is, as explained, in case multiple source tables are updated within one transaction, it applies the same behavior on the target tables to ensure database level transactional consistency. However, it doesn’t mean that SDI does not support transactional consistency. The difference is that this new remote table replication ensures transactional commit order based on original transaction commit ID.

Other limitations

For other limitations, please refer to the below.

  • The following data types are not supported with this new remote table replication
    • ALAPHNUM (not available in SAP HANA Cloud)
    • ARRAY
    • ST_GEOMETRY and ST_POINT
    • TEXT, BINTEXT, SHORTEXT (not available in SAP HANA Cloud)
    • CLOB (available but only as an alias of NCLOB in SAP HANA Cloud)
  • Full-text index is not supported (not available in SAP HANA Cloud)
  • It cannot be activated on other SAP HANA native replication activated tables in the same time

Hybrid scenario and futures

As briefly mentioned above, the hybrid scenario is growing in importance and this new remote table replication will play a crucial role as one of key enablers for the hybrid scenario. In that sense, replication from SAP HANA 2.0 SPS05 or later (source) to SAP HANA Cloud (target) will be possible in the near future. In parallel, it will keep evolving and be augmented with more capabilities and better supportabilities. So, please stay tuned for our next blog series.

6 Comments
You must be Logged on to comment or reply to a post.
  • Nice!

    A couple of thoughts when reading. Appreciate your input.

    1. One other odd limitation is that you cannot create subscriptions on Hana local tables. You always have to create a virtual table that reads via the adapter your own database. With this feature, it should be possible to resolve. Granted, not the most important use case, especially in the light of no-data-duplication but is required sometimes.
    2. You are saying the HanaAdapter of SDI does not support transactional consistency. Are you sure about that?
    3. Are all subscription writer types supported, insert-only, upsert, with change_type and change_time?
    4. Why the limitation of “source and target have the same partitioning”? I assume because of truncate statements? If that is the only reason, truncate table on production data happens less often than partition changes.
    5. What happens with the replication when the source does change the partitioning of a table?
    6. Is the adaptive schema supported, meaning when the source gets an additional column via alter-table-add does the target as well?
    7. The reason SDI does use triggers is because there is no API to read the Hana transaction log, not even an SAP internal one. It seems there is one now. Can this API be used for other consumers as well, e.g. SAP Data Services reading the change data or SAP BW Extractors having an easy way to identify changes or 3rd party?
    8. Will this solution make the SDI Adapter obsolete some time in the future as it can do everything the SDI Adapter can? For the supported Hana versions and deployments at least? Or asking the other way around, other than dpagent and no support for on-prem Hana yet, what are other limitations that might prevent using this new adapter?
    • Hi Werner,

       

      1. If the replication is about replicating within the same instance, it should be handled by OSTR (for scale-out). So, no need to use virtual tables for replication. I think the limitation you’re mentioning is about DP server trying to replicate local tables, isn’t it?
      2. In your blog (https://blogs.sap.com/2016/03/26/hana-smart-data-integration-transactional-consistency-and-delta-handling/), you’ve explained that SDI supports this as well. But the difference is that it is based on time changes. There can be multiple transactions committed in the same milli-seconds where as this is respecting the original commit transactions.
      3. The underlying technology is different from SDI, so different writer types are not supported.
      4. As the replication is done at the transactional layer and not session layer, internal dependencies exists
      5. The DDL changes are forwarded to the target
      6. Yes, any DDL changes are also applied to target
      7. This is limited to HANA to HANA only and cannot be consumed by other tools.
      8. There are still differences between SDI such as the writer types you mentioned above and support for flowgraphs

      @Lee, Seungjoon – Please correct if needed

      • Thank you Tae-Suk!

        Hello Werner,

        Thank you for your feedback and questions.

        As almost questions are already covered by Tae-Suk, let me add a few points as below.

        2. My explanation might somehow mislead readers. I didn’t say that SDI does not support transactional consistency, and that was not my intention. As Tae-Suk commented, it is about the difference. This new remote table replication ensures transaction commit order by leveraging original transaction commit ID.

        I added the below sentence to the blog to avoid any further misleading.

        “However, it doesn’t mean that SDI does not support transactional consistency. The difference is that this new remote table replication ensures transactional commit order based on original transaction commit ID.”

        3. Even though it shares the SQL interface of SDI, its underlying technology is different as Tae-Suk mentioned. It only supports real-time replication with schema change, and users will see ‘feature not supported’ for other writer types.

        5. More precisely, it supports adding empty partitions or dropping existing partitions, but it doesn’t support splitting or shuffling partitions.

        8. No, there is no plan like making the SDI adapter obsolete. However, many customers expect that replicating tables between SAP HANA Cloud systems should be possible natively without any further installation or customer managed application. And it is also natural to extend its scope such as hybrid as long as it is technically feasible and it can provide benefits. So, between SAP HANA systems, this is the recommended option as it gives better optimizations but customers can still prefer SDI as it has richer features for data movement.

  • Thanks for this outlook on (yet another) data replication technology for SAP HANA.

    Just as with the already existing products (SDI, SDA, SLT, Sybase Replication, RDSync, …) it is not quite clear, when to use what and what the implications of each of the different options are.

    Notably, those products focus on a single major data “federation” scenario: data is brought into HANA. While I am a fan of HANA, I can tell that this is not the single best scenario for many customers. This is especially true nowadays, where sufficiently “smart” data storage is available from the diverse cloud providers for a fraction of what HANA costs.

    So, why is there the perception that the SAP user base needs/wants yet another “bring-it-into-HANA” product? Surely the technology could have just been implemented within the existing products if it was simply about improving on the shortcomings of previous developments.

    Is there a specific use-case where this scenario really shines? What is it?

    Another question that I can’t quite figure out myself is: why package the replication of HANA internal data structures in ODBC? I understand that in order to operate “in the cloud” one cannot use arbitrary low-level constructs (i.e. using TCP/IP | UDP is the common denominator for network communication, implementing your own wire-protocol on top of ethernet or building your very own network interconnect is simply not an option on public IaaS infrastructure). But ODBC and its demands for type casting and memory-copying don’t seem to lend itself for replicating data structures between to instances of the same software.

    Can you provide some insight into this choice?

    Final question (for now): how does the “re-sync” work in case the replication had been interrupted? Where is the “last-known-good” sync point kept? As the virtual tables support read-/write access there must be at least two change-sequence-histories that need to be merged and kept in sync.

    How does that work and how can one monitor this?

    Cheers,

    Lars

    • Hello Lars,

      Regarding when to use what and a specific use case, please consider this new approach as a preferred option for replicating tables between SAP HANA Cloud instances.

      As you know, SAP HANA Cloud doesn’t support MDC. So, customers cannot create multiple tenants within the same SAP HANA Cloud instance. In on-premise world, MDC is being used for overcoming hardware limitations or for some business reasons. However, in cloud, as we don’t have such hardware limitations and we can leverage cloud elasticity, having separate instances can provide more benefits.

      In that sense, if customers want to have multiple tenants with SAP HANA Cloud as they did in on-premise, provisioning multiple SAP HANA Cloud instances is needed.

      And if they want to replicate tables between those SAP HANA Cloud instances, this is the recommended option as it is optimized for this use case and there is no need to deploy or operate additional application.

      Regarding your second question, ODBC is just used for transferring internal logs and replication requests. And SAP has been investing in SDA for further optimizations and performance improvements.

      Regarding the re-sync, this new approach is not based on keeping the sync points mechanism. And please refer to the above link for monitoring (SAP HANA Cloud Administration Guide: System Views for Monitoring Remote Table Replication).