
Here are some frequently asked questions about the SAP Replication Server Data Assurance Option.

Does DA require a separate license?

Yes. Replication Server Data Assurance Option is a separately licensed product for Replication Server and supports Replication Server versions 15.1 and later. The Data Assurance Option (DA) comprises both the DA server and the DA agents:

  • The DA Server -- typically one instance per installation -- is a licensed component.
  • The DA agent -- often zero instances per installation -- is not licensed. DA agents are typically installed separately from the DA server. You can install both together, but there is little reason to, because the benefit comes from running them on different machines.

Replication Server Data Assurance Option is licensed through the SySAM license manager and is available on multiple platforms. For more information about obtaining a license, refer to the Obtaining a license section of the DA Installation Guide.

Why might I want to install a Data Assurance (DA) agent?

Some of the advantages of installing separate DA agents are:

  • Improves the throughput: When network latency between the DA server and the source or target database is high, running a DA agent physically closer to the source or target database and having it hash the rows as it sends them to DA server improves throughput.
  • Lightens the load on the DA server: If the DA server is performing many concurrent comparisons, DA agents can share the work and lighten the load on the DA server.
  • Improves the performance of the DA server: If one or more tables require the DA external sort feature, which is CPU- and disk-intensive, separate DA agents can perform the sort and take that load off the DA server.

Which version of DA am I running?

You can check the version of the DA server (or DA agent) in the following two ways:

a)  At the command prompt, from the DA installation directory, execute the version command:

   1> version
   2> go

       The returned result is:

VERSION

-------------------------------------------------------------------------------------------------------------------------------------------------------

SAP Replication Server Data Assurance Option - DA Server/15.7.1/SP304/P/generic/generic/damain/746/VM: SAP AG 1.8.0_45/OPT/Thu 28 Apr 2016 04:52:18 GMT

(0 rows affected)


b)  Run one of the DA jar files with:

  <PATH_TO_JAVA>\java -jar <PATH_TO_DA_LIB>\da-app.jar

     For example, from the top-level SAP directory of the DA installation, use the shared SAP JRE and run:

   :/sap> ./shared/SAPJRE-8_1_008_64BIT/bin/java -jar ./DA-15_5/server/lib/da-app.jar


     The returned result is:


SAP Replication Server Data Assurance Option - DA Server/15.7.1/SP304/P/generic/generic/damain/746/VM: SAP AG 1.8.0_45/OPT/Thu 28 Apr 2016 04:52:18 GMT

(0 rows affected)  

      

Does DA lock the source and target tables while it compares them?

No. DA does not lock entire tables, and it issues no explicit lock commands to the source and target data servers. It executes its table SELECT statements at the default isolation level, READ COMMITTED. How a SELECT statement is implemented varies across data servers, but it typically involves acquiring and releasing read locks row by row or page by page.

Can I filter the table rows that are compared?

Yes, you can use a WHERE clause to filter the table rows that you want to compare.

Add a WHERE clause to the compareset definition, for the source and/or the target:

create compareset persons_30plus
with source ase1 dbo person src
        where "dateadd(yy, 30, birthdate) <= getdate()"
      target hana1 PPL PERSON tgt
        where "ADD_YEARS(BIRTHDATE, 30) <= NOW()"
map all
go

The WHERE clauses are appended to the SELECT statements that DA issues against the source and target data servers.

Does DA only compare tables that have a PRIMARY KEY?

No, but for DA to compare two tables effectively, each table must have a column, or combination of columns, that can uniquely identify each row in the table. In DA, this is called the row key. Table columns that often make the best row keys are PRIMARY KEY columns, IDENTITY columns, and columns with a UNIQUE INDEX. However, any column, or combination of columns, that uniquely identifies each row in the table will suffice.

The create compareset … map all command automatically detects and assigns PRIMARY KEY and IDENTITY columns, and columns with a UNIQUE INDEX as row keys. If no such columns exist, all non-LOB columns are set as row keys.

Warning: If the columns assigned as the row key are neither a PRIMARY KEY nor an IDENTITY and do not have a UNIQUE INDEX, ordering by these columns can put a tremendous strain on the data server. Consider using DA external sort in these circumstances.

Warning: If the columns assigned as the row key are not unique, DA aborts the comparison (either immediately or after the first compare all phase) with a duplicate key error.
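
Before assigning an ad hoc row key, it can help to confirm that the candidate columns really are unique. The following is a minimal Python sketch of that check on in-memory rows. It is not DA code; the function name, column names, and sample data are all hypothetical, and in practice you would more likely run an equivalent GROUP BY ... HAVING COUNT(*) > 1 query against the table.

```python
from collections import Counter

def find_duplicate_keys(rows, key_columns):
    """Return the row-key values that occur more than once."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]

# Hypothetical sample data: (first_name, last_name) is not unique here.
people = [
    {"id": 1, "first_name": "Ann", "last_name": "Lee"},
    {"id": 2, "first_name": "Bob", "last_name": "Ray"},
    {"id": 3, "first_name": "Ann", "last_name": "Lee"},
]

print(find_duplicate_keys(people, ["id"]))                       # []
print(find_duplicate_keys(people, ["first_name", "last_name"]))  # [('Ann', 'Lee')]
```

A non-empty result means the chosen columns would trigger the duplicate key error described above.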

What is the throughput for row comparison?

Throughput varies greatly. The row comparison itself, which the DA server performs in memory, is quick and (often) relatively simple, so it is unlikely to be a bottleneck.

The time it takes for the JDBC driver to obtain the rows from the data servers usually governs the comparison throughput. A comparison runs only as fast as the slower of the source and target data servers.
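
As a rough back-of-the-envelope illustration, the estimate below treats the slower data server as the limiting rate. This is a simplified Python sketch, not a DA formula; the function name and the row rates are made-up figures, and it ignores network latency, hashing cost, and external sort.

```python
def estimated_compare_seconds(row_count, source_rows_per_sec, target_rows_per_sec):
    """Estimate compare time, assuming the slower server sets the pace."""
    effective_rate = min(source_rows_per_sec, target_rows_per_sec)
    return row_count / effective_rate

# Hypothetical figures: 10 million rows, source delivers 50,000 rows/s,
# target delivers 20,000 rows/s.
print(estimated_compare_seconds(10_000_000, 50_000, 20_000))  # 500.0
```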

Note: Throughput for the row comparison does not include the additional time it takes for DA to perform an external sort, if configured.

Can DA be installed on its own server to minimize impact on SRS and the primary and replicate databases?

Yes. There is no requirement to install DA server or DA agent on the same server as the SAP Replication Server, the primary database, or the replicate database.

How common are false positive results due to replication latency? Does DA have any mechanism for identifying these false positives?

In a live replication environment, false positive results are typical. How many occur depends on environmental factors, such as how frequently the tables are updated and whether you run DA during quiet times.

DA has a two-part strategy for eliminating false positive results:

  1. DA runs one or more re-compare differences phases. During a re-compare differences phase, DA re-selects from the replicate table only the rows that were detected as different (missing, orphaned, or inconsistent) and compares them to the cached copies of the primary table rows. The primary rows are not re-selected (they are cached during the initial compare), so further updates to the primary table are ignored. If a difference is caused by replication latency, it vanishes during a re-compare. The method may not work if a primary table row is updated and the transaction is replicated before the re-compare; in that case, DA can still report a false positive.
  2. If the re-compare differences method fails to eliminate a difference, DA uses the single and final verify differences phase. During this phase, DA re-selects all rows that are still different from both the primary and replicate tables and re-compares them.

Note: The verify differences phase is prone to false positives caused by replication latency in the same way as the initial compare. Any differences that remain after the verify differences phase must be investigated manually.
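
The re-compare idea can be illustrated with a small Python sketch. It is a simplified model rather than DA's implementation: tables are plain dictionaries keyed by row key, and the function names are hypothetical.

```python
def compare(primary, replicate):
    """Return keys whose rows are missing, orphaned, or inconsistent."""
    return {key for key in primary.keys() | replicate.keys()
            if primary.get(key) != replicate.get(key)}

def recompare(cached_primary, replicate, differing_keys):
    """Re-check only the differing keys against the cached primary rows."""
    return {key for key in differing_keys
            if cached_primary.get(key) != replicate.get(key)}

primary = {1: "a", 2: "b", 3: "c"}
replicate = {1: "a", 2: "old-b"}   # row 3 has not been replicated yet

cached_primary = dict(primary)     # snapshot taken during the initial compare
diffs = compare(cached_primary, replicate)
print(sorted(diffs))               # [2, 3]

# Replication catches up on row 3; row 2 is a genuine inconsistency.
replicate[3] = "c"

remaining = recompare(cached_primary, replicate, diffs)
print(sorted(remaining))           # [2]
```

Row 3 was a false positive caused by latency and disappears on the re-compare, while row 2 survives and would go on to the verify differences phase.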

What is direct reconciliation?

The easiest way to explain direct reconciliation is to first describe non-direct reconciliation, which is what happens when you use AUTO_RECON or CREATE_RECON_SCRIPT to reconcile your data. Non-direct reconciliation requires a minimum of three steps: compare all rows (perhaps with row column hashing), verify all differences (creating a column log of literal column values), and reconcile the differences (by script or automatically). While non-direct reconciliation might seem safer, it is also relatively slow.

With direct reconciliation, the compare and reconcile steps occur simultaneously, which makes reconciliation faster. Direct reconciliation:

  • Compares and reconciles rows in a single step, in memory.
  • Applies the appropriate insert, delete, or update statement directly to rows in the target table that are identified as different from the source table.
  • Does not save row values to disk.
  • Does not produce a detailed text or XML report (though there are still counters in the job history).
  • Enables multi-threaded reconciliation per comparison partition.
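
To make the single-step idea concrete, here is a minimal Python sketch that walks the source and target rows once and decides, per row key, which statement would bring the target in line with the source. It is an illustration only, not DA's implementation; the function name and the dictionary-based tables are assumptions.

```python
def reconcile_actions(source, target):
    """Return (action, key) pairs that would make target match source."""
    actions = []
    for key in sorted(source.keys() | target.keys()):
        if key not in target:
            actions.append(("insert", key))   # row missing from target
        elif key not in source:
            actions.append(("delete", key))   # orphaned row in target
        elif source[key] != target[key]:
            actions.append(("update", key))   # inconsistent row
    return actions

source = {1: "a", 2: "b", 3: "c"}
target = {2: "x", 3: "c", 4: "d"}

print(reconcile_actions(source, target))
# [('insert', 1), ('update', 2), ('delete', 4)]
```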

For more information about non-direct and direct reconciliation, refer to the Comparison and Reconciliation Strategies section of the DA User’s Guide.

What are the LITERAL and ROW_HASH compare options for?

A LITERAL column value is a column value as it appears in the database table. A ROW_HASH is an MD5 or CRC32 hash of one or more column values.

The following diagram shows how different options affect the comparison of a four-column table with 1 GB of data.

There are no definite rules about using any of these options. Choose your desired option based on the rate at which a data server can deliver rows, and the speed of the network. As a general rule:

  • Hashing row data using DATABASE_HASH is typically the optimal strategy.
  • Hashing row data using standalone DA agents and AGENT_HASH can improve performance in environments with high network latency.
  • Hashing row data with AGENT_HASH in environments with low network latency typically offers no performance improvement.
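
The trade-off between literal values and row hashes can be sketched in Python. This is an illustration of the general idea only: how DA serializes column values before hashing is not described here, so the separator-joined encoding and the function names below are assumptions.

```python
import hashlib
import zlib

def literal_bytes(row):
    # Assumed encoding: column values joined with a unit-separator character.
    return "\x1f".join(str(value) for value in row).encode("utf-8")

def md5_row_hash(row):
    return hashlib.md5(literal_bytes(row)).digest()

def crc32_row_hash(row):
    return zlib.crc32(literal_bytes(row))

# Hypothetical rows that differ in one column.
source_row = (42, "Ann", "Lee", "1980-05-17")
target_row = (42, "Ann", "Leigh", "1980-05-17")

print(len(literal_bytes(source_row)))  # literal payload size grows with the row
print(len(md5_row_hash(source_row)))   # 16 (an MD5 digest is always 16 bytes)
print(md5_row_hash(source_row) == md5_row_hash(target_row))  # False
```

Whatever the row width, the hash sent for comparison has a fixed size, which is why hashing close to the data helps most when the network is the constraint.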

Can DA run more than one job at a time?

Yes. DA has several configuration options to help manage running jobs concurrently:

  • CONFIG / COMPARER_MAX_CONCURRENT_THREADS: This option is a global, hard limit on the number of comparisons that can run at any one time. If you attempt to run a comparison while all compare threads are busy, the comparison is queued.

       Typical usage: Use this option to avoid starving DA of resources (CPU/memory).

  • JOB / MAX_CONCURRENT_COMPARISONS: This option is local to each job that defines it. In any job that contains two or more comparisons, this option controls how many of them can run at any one time.

       Typical usage: When a job has comparisons that all use the same source and target databases, use this option to avoid hitting a database too hard.

  • JOB / COMPARISON / PRIORITY: Each comparison has a priority configuration. In any job that contains two or more comparisons, you can set the priority of each comparison to control which will run first.

       Typical usage: Use this option to ensure that the most important tables are compared first.

For more information about these configuration options, refer to the Command Reference and Agent Command Reference sections of the DA User’s Guide.

How does DA manage its compare threads?

The compare thread allocator of DA manages multiple compare threads. The comparer_max_concurrent_threads configuration parameter specifies the maximum number of comparisons that can run simultaneously.

When the number of comparisons submitted to run exceeds the limit specified by the comparer_max_concurrent_threads configuration parameter, DA allocates the compare threads as follows:

  1. A comparison’s priority has more weight than its submission time. When a comparison is submitted for running, it takes the next available thread if its priority is higher than all of the other queued comparisons.
  2. Some comparisons are configured to run with a retry delay. The comparer_retry_delay_threshold_secs configuration parameter specifies the retry delay threshold value. If the retry delay is less than the configured threshold, the comparison retains its compare thread during the retry delay. Otherwise, the compare thread is returned to the available pool and the comparison is queued with the highest priority.
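
The priority-before-submission-time rule in the first point can be modeled with a small priority queue. The Python sketch below is a simplified illustration, not DA's allocator: the class name is hypothetical, priorities are modeled as integers where a larger number means higher priority (an assumption), and retry delays are left out.

```python
import heapq
import itertools

class CompareQueue:
    """Toy model: a fixed pool of compare threads plus a priority queue."""

    def __init__(self, max_concurrent_threads):
        self.free_threads = max_concurrent_threads
        self.running = []
        self._heap = []
        self._seq = itertools.count()  # submission order, used as tie-breaker

    def submit(self, name, priority):
        # Negate priority so the highest priority pops first from the min-heap.
        heapq.heappush(self._heap, (-priority, next(self._seq), name))
        self._dispatch()

    def finish(self, name):
        self.running.remove(name)
        self.free_threads += 1
        self._dispatch()

    def _dispatch(self):
        while self.free_threads > 0 and self._heap:
            _, _, name = heapq.heappop(self._heap)
            self.free_threads -= 1
            self.running.append(name)

queue = CompareQueue(max_concurrent_threads=1)
queue.submit("orders", priority=1)     # takes the only thread
queue.submit("audit", priority=1)      # queued
queue.submit("customers", priority=3)  # queued later, but higher priority
queue.finish("orders")
print(queue.running)                   # ['customers']
```

Although "audit" was submitted first, "customers" takes the freed thread because priority outweighs submission time.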

For more information about thread management, refer to the Compare Thread Management topic in the DA User's Guide.

How do I install JDBC drivers?

For SAP Adaptive Server Enterprise, SAP IQ, and SAP HANA, you do not need to install JDBC drivers, because DA server and DA agent ship with preconfigured drivers.

To install JDBC drivers for MSSQL, UDB, and Oracle, follow the steps mentioned in the DA Quick Start Guide and the Configuring JDBC Drivers section of the DA User’s guide.
