Hello Experts,

I was trying to understand the Table Comparison options and started my research with Row by Row select option and after experimenting I found these observations.

TABLE COMPARISON: –

Table comparison is used to update the target table when source records have been updated/changed. It is used in implementing slowly changing dimension (type 2 and type 3). It will always receive NORMAL input and will generate three opcodes: –

INSERT (I) – When new records have been inserted in source table but not present in target table.

UPDATE (U) – When source records have been updated and those updates in source need to be reflected in target.

DELETE (D) – When some records have been deleted from source and those records need to be deleted from target, if required.

It is always connected with query transform as this is the only transform that generates NORMAL opcode.

Based on the opcodes generated by table comparison the target is changed.

How it does the processing: –

In Table comparison option we have a Table_Name section wherein we select the table which is needed to be compared which is called comparison table, it is always the target table. Table comparison maintains two images of target table, Before Image and After Image. The final data present in After Image is reflected in Target. Before Image sends its changes to After Image. In short both Before Image and After Image are buffers holding the data.

What happens when “Row by Row select” is selected?

In row by row select method an SQL statement is executed based on columns defined in Input primary key column section. If we do not specify any column in compare columns section then query is: –

Select * from table_name where C1 = xxxx and C2 = yyyy and so on (based on columns present in Input primary key columns section).

If we specify any column in compare columns section then query is: –

Select C1,C2,C3,… (list of columns present in Compare column section) from table_name where C1 = xxxx and C2 = yyyy and so on (based on columns present in Input primary key columns section).

This result then brings the records in Table Comparisons Before Image buffer and compares with the source and if any change found it sends to After Image and result of After Image is send to Target.

In comparison method select “Row by Row select” option. Now it does the comparison.

Let’s suppose that initially no records are loaded in target table i.e., target table is blank which means Before Image and After Image sections are also blank. Now during first load i.e., when first time job gets executed, due to empty target table, all records are inserted in target table with opcode (I) provided there is no duplicate data present in column defined in Input Primary Key columns

Before Execution: – 

Table Comparison: –

After Execution: –

Opcodes I as all records are new based on EMPID (column name which we have defined in Input Primary Key columns.)

Now I truncate the table and in source add 2 new records for EMPID 1010.

Before Execution: –

After Execution: –

Final Target: –

How did it processed?

Records were inserted one by one in target and after every insert each incoming row was compared with rows present in target. Eg Firstly record with EMPID 1010 was inserted then when record with EMPID 1011 comes it’s checked whether this EMPID exists or not, if not then it Inserts.

Now when row number 10(1010 RAJ START…) is encountered with EMPID 1010, Table Comparison finds that 1010 record is already inserted(1010 RAJ…) so it updates it and sends it to After Image. Again when row number 11 (1010 RAJ JB..) comes it finds that EMPID with 1010 (1010 RAJ…) already exists so it updates that and sends that update to After Image. Again when row number 11 (1010 RAJ LFT..) comes it finds that EMPID with 1010 ((1010 RAJ…)) already exists so it updates that and sends that update to After Image.

Of all the updates the first update sent by before image is reflected in target and the other two updates are directly inserted. Like the first update sent was (1010 RAJ START…) so it updates 1010 RAJ… with 1010 RAJ START… and for other two updates 1010 RAJ JB.. and 1010 RAJ LFT.. they were inserted.

NOTE: – It’ll always compare with the Inserted records like Row number 10, 11 were always compared with the first row (1010,RAJ..)

Table Comparison fires an Update statement for the changed row.

Update table_name set Column = xxxxx where column_name = yyyyy.

Now I’ve changed the source data: –

Row with EMPID 1010 (1010 RAJ LFT..) and 1012 (RAJ PWC) has been deleted from source.

Before Execution: –

Source: –

Target: –

After Execution: –

Rows from source are not deleted from Target because we have not checked delete options in Table Comparison options section, hence TC does not checks for delete.

Always where clause will contain the list/combination of columns present in input primary key columns section.

Hope it helps!

Please correct me if I’ve missed something.

Thanks:)

To report this post you need to login first.

Be the first to leave a comment

You must be Logged on to comment or reply to a post.

Leave a Reply