TABLE COMPARISON: – Generated Key Column, same column name present in source and target, duplicate records in source
Adding further concepts to my knowledge I was working on Generated Key column of Table Comparison, placing the common column existing in both source and target and incoming source has duplicates. So below is my observation please have a glimpse of it!
TABLE COMPARISON: – Generated Key Column
When records coming from source has duplicates for column(s) defined in Input Primary Key Columns section and out of those columns one is selected in Generated key column then records inserted in target are random. So in brief I’ve selected the column which is present in both source and target. Here it’s EMP_ID.
How table comparison sends the records to the final target when this option is selected?
Initially my target table has data present in it. When job is executed then this transform brings the rows to it’s before image for processing and then comparison starts. Based on the OPCODES generated the rows are sent to final target table.
Here I put this brief note in below screenshots: –
Consider the initial target before job was executed: –
Generated key column value has EMP_ID column, which has duplicates, selected in table comparison: –
Initial source records before job is executed: –
Now after execution,
When the job is executed table comparison brings the records in it’s before image. Initially target table has existing records for EMP_ID 1010 (1010, RAJ WRITER,..) and 1011 (1011, RAJ MCKINSEY,…). Now records coming from source are compared one by one.
Firstly EMP_ID 1010 (1010, RAJ WRITER,..) is compared with existing id, 1010, RAJ WRITER,.., table comparison finds no change/update, hence no opcode is generated. Then incoming EMP_ID 1011 is compared with existing id, 1011, RAJ MCKINSEY,…, and again no opcode is generated. Now when EMP_ID 1010 (1010, RAJ,… ) comes from source its compared with existing id, 1010, RAJ WRITER,.., Table comparison detects that EMPNAME has been changed from RAJ WRITER to RAJ, hence it sends an update, ’U’, opcode. Similar is the case with EMPNAME RAJ EY for EMP_ID 1011.
So after execution: –
Records in final Target: –
This was the case when we had only one duplicate record coming from source.
Now let’s suppose that I add few more duplicates in my source for the EMP_ID’s which are present in target. Then how table comparison works, let’s see.
I added 3 more records for EMP_ID 1010 and 1011 in my source. So my new source becomes: –
Before execution my target table is: –
When job gets executed then for all the existing id’s which are already present in target an update, ‘U’, is sent by Table comparison: –
Now which record will be sent by our before image to the target?
For EMP_ID 1010, 4th record, with EMP_ID 1010, is sent to target table and for EMP_ID 1011 1st record, with EMP_ID 1011, is sent as displayed in above screenshot.
Again if I re-execute the same job then table comparison should work in a similar fashion: –
As per above observation it should send 4th update for EMP_ID 1010 and 1st update for EMP_ID 1011, right?
So the final target is: –
This time table comparison sent 1st update for EMP_ID 1010 and 1st update for EMP_ID 1011.
This is strange! 🙁
Now again if you’ll run the same job then again it’ll give different outputs and different records will be sent to target.
So this means if you use same column, present in source and target and column defined in Input Primary Key column has duplicate incoming rows, in Generated Key Column section then table comparison will randomly send the updated records to the target table.
So if the column(s) specified in Input Primary Key columns section has duplicate incoming records and out of them if same column is selected in Generated Key column then table comparison will arbitrarily send the records to the target.
To further understand how does records get processed by Table comparison you can visit this link: –
Please let me know if I’ve missed anything or something is incorrect.
Hope it helps!