Skip to Content

Data Assurance (DA) has options for coping with in-flight data to help you plan and manage your comparison strategies.

Row differences occur due to replication latency; transactions applied to the source have not yet reached the target. These transactions are said to be ‘in-flight’. Differences detected during a comparison caused by in-flight data are known as false positive results. False positive results can stem from differences detected during the comparison phase (COMPARE_ALL).

If the DA server identifies row differences in the initial comparison, DA can recheck those rows to confirm that the primary row data is consistent with replicate row data. This is especially important in replication environments where there is replication latency in updating target databases.

Each comparison within a job has a RETRY_DIFF option, which defaults to NEVER. Set this option to WAIT_AND_RETRY so that DA runs one or more RECHECK_DIFFERENCES phases after the initial comparison. The RECHECK_DIFFERENCES phase rechecks only the rows that were found to be different during the initial comparison.

For example, if an initial comparison at 8:00 P.M. reveals an out-of-sync row, and the RETRY_DELAY_SEC option is 120 seconds, the re-comparison starts at 8.02 P.M., allowing replication to apply transactions that were in-flight during the initial comparison.

However, in a live replication environment, having one or more RECHECK_DIFFERENCES phases is not the best way to eliminate false positive results.

Let’s take the following scenario, which shows the how the WAIT_AND_RETRY technique might not be the best solution.

In this scenario, assume that:

  • There is a job/comparison comparing two large tables
  • The job/comparison RETRY_DELAY_SEC is set to 10; the RETRY_MAX is set to 2.
  • It will take 5 minutes to compare all rows in the source and target tables.

Define a job or comparison:

show job my_job my_comparison
go

OPTION                   VALUE
------------------------ -----------------------
…
NUM_PARTITIONS           1
RETRY_DELAY_SEC          10
RETRY_DIFF               WAIT_AND_RETRY
RETRY_MAX                2
…

DA compares the job/comparison in the following three phases:

  1.  COMPARE_ALL:
    • The job/comparison starts.
    • After 1 minute, DA detects 1 inconsistent row.
    • DA continues comparing the remaining rows over the remaining 4 minutes.
    • The phase ends.
  2. Job Comparison RECHECK_DIFFERENCES:
    • DA waits 10 seconds.
    • The job/comparison recheck (1) starts.
    • DA rechecks the inconsistent row (it is still inconsistent).
    • The phase ends.
  3. Job Comparison RECHECK_DIFFERENCES(2):
    • DA waits 10 seconds.
    • The job/comparison recheck (2) starts.
    • DA rechecks the inconsistent row (it is still inconsistent).
    • The phase ends.

In this example, after the COMPARE_ALL phase, 4 minutes have passed since the inconsistent row was detected. So, the wait time totals 4 minutes and 10 seconds.

When DA reaches the job comparison RECHECK_DIFFERENCES phase 2, because the recheck compares only 1 row, the comparison happens within a second or two. So, DA waits for only 10 seconds.

If your replication latency is 15 seconds, you might choose RETRY_DELAY_SEC 10 and RETRY_MAX 2 to avoid false positives. However, as in this scenario, the actual wait time is 4 minutes 10 seconds, and the row may change a few times while DA was busy/waiting.

Similarly, suppose DA detects a second inconsistent row after 4 minutes 30 seconds. This row would then be rechecked after 40 (30+10) seconds – the timing of the recheck is not precise.

To fix the wait time problem, use more than one partition with the NUM_PARTITON parameter.

For example:

NUM_PARTITIONS = 5

So instead of only 1 thread comparing the entire table in 5 minutes, DA uses 5 threads comparing 1/5th of the table at a time, taking 1 minute each.

The first inconsistent row is detected after 1 minute (right at the end of partition #1) and the actual wait time will be 10 seconds. Similarly, the second inconsistent row is detected after 30 seconds (half way through partition #5), and the actual wait time would be (30 + 10) seconds.

Using Configuration Parameters for RECHECK_DIFFERENCES Phase:

DA provides several global configuration parameters that you can modify to reduce the time DA takes for rechecks.

For example:

config comparer_retry*
go

NAME                                    VALUE   PENDING REQUIRE RESTART
--------------------------------------- ------- ------- ---------------
comparer_retry_delay_threshold_secs     20              false
comparer_retry_max_keys_per_clause      10              false
comparer_retry_min_fill_percent         10              false
comparer_retry_min_fill_percent_literal 30              false
comparer_retry_min_keys_in_range        5               false
comparer_retry_strategy                 DEFAULT         false

If DA detects several row differences, you can modify these global configuration parameters to vary the length of time it takes to recheck the differences. For more detailed information about all the global configuration parameters, refer to the config topic in the SAP Replication Server Data Assurance Option User’s Guide.

To report this post you need to login first.

Be the first to leave a comment

You must be Logged on to comment or reply to a post.

Leave a Reply