Real Time Change Data Capture for SAP Data Services
In my last blog post, I introduced a new capability: real time HANA replication with SAP Sybase Replication Server (SRS). In this post, I am going to introduce another major feature in this release: real time change data capture for SAP Data Services.
Change Data Capture (CDC) is a software technique for keeping track of data changes caused by either data manipulation language (DML) statements, such as SQL insert, update and delete, or data definition language (DDL) statements, such as create table and alter table. CDC is typically the first step in ETL (extract, transform, and load), in which a data integration tool provides sophisticated transformation, enrichment and delivery capabilities to a wide range of destinations.
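To make the idea of tracking changes concrete, here is a minimal Python sketch. It is only an illustration, not how SAP Sybase Replication Server works internally: the `ChangeRecord`, `TrackedTable` and `replay` names are hypothetical, the toy table records changes at write time rather than reading a transaction log, and DDL changes are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ChangeRecord:
    """One captured DML change (hypothetical layout, for illustration only)."""
    seq: int               # position of the change in the log
    op: str                # "insert", "update" or "delete"
    key: int               # primary key of the affected row
    value: Optional[str]   # new row value, or None for a delete


class TrackedTable:
    """A toy table that appends a change record for every DML operation."""

    def __init__(self) -> None:
        self.rows: Dict[int, str] = {}
        self.log: List[ChangeRecord] = []

    def _record(self, op: str, key: int, value: Optional[str]) -> None:
        self.log.append(ChangeRecord(len(self.log) + 1, op, key, value))

    def insert(self, key: int, value: str) -> None:
        if key in self.rows:
            raise KeyError(f"row {key} already exists")
        self.rows[key] = value
        self._record("insert", key, value)

    def update(self, key: int, value: str) -> None:
        if key not in self.rows:
            raise KeyError(f"row {key} does not exist")
        self.rows[key] = value
        self._record("update", key, value)

    def delete(self, key: int) -> None:
        del self.rows[key]  # raises KeyError if the row is missing
        self._record("delete", key, None)


def replay(log: List[ChangeRecord]) -> Dict[int, str]:
    """Rebuild the table contents from the change log alone."""
    replica: Dict[int, str] = {}
    for change in sorted(log, key=lambda c: c.seq):
        if change.op == "delete":
            replica.pop(change.key, None)
        else:
            replica[change.key] = change.value
    return replica
```

The point of the sketch is that the ordered change log is enough to reproduce the source table elsewhere, which is what lets a downstream system stay in sync without re-reading the whole source.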
SAP Data Services is a market leader in data integration, providing powerful complex data transformation and data quality capabilities in its ETL. However, ETL is by definition batch oriented. It therefore introduces latency between source and target systems that depends on how frequently batch jobs are scheduled.
To provide real time transformation capability and reduce latency, SAP Sybase Replication Server in this release extends its log-based real time Change Data Capture (CDC) capability to SAP Data Services. Combined, SAP Sybase Replication Server and SAP Data Services provide a real time technique to capture, transform, and propagate high volumes of data to a data warehouse. This supports the notion of a real time enterprise and helps the business make rapid decisions based on fresh data, keeping pace with an ever-changing environment. The technique also eliminates resource intensive batch jobs, which reduces cost and makes better use of resources.
SAP Sybase Replication Server captures DML (data manipulation language) operations, such as inserts, updates and deletes, in real time from the source database transaction log and stores them in a runtime database. A Data Services continuous data flow, newly introduced in version 4.2, reads the change data recorded since the last checkpoint from the runtime database and applies it to the target system. If complex transformation and data quality processing are needed, these tasks can be performed before the data is applied to the target system, as illustrated in figure 1.
Figure 1 SAP Sybase Replication Server – Real Time Change Data Capture for SAP Data Services
SAP Sybase PowerDesigner, as illustrated in figure 2, is also included in this solution to automate the generation of SAP Sybase Replication Server scripts.
Figure 2 SAP PowerDesigner to generate Replication Server scripts
In this release, SAP Sybase ASE and Oracle databases are certified as sources, and SAP Sybase ASE is certified as the runtime database for staging change data. Other source databases, such as IBM DB2 and MS SQL Server, will be certified in later releases.
With the newly introduced real time CDC capability, the combined solution of SAP Sybase Replication Server and SAP Data Services brings real time data transformation to data integration and makes the real time enterprise a reality.