Skip to Content

How to work with candidate selection in Match Transform

Candidate Selection process helps when new data set from source system is to be compared and matched to an existing data collection in data warehouse. Without candidate selection, the new set of matched data would simply get appended and there would not be any logical grouping between new data set and the existing data in database.

To achieve grouping between new source data set and existing DB data, we have option in Match Transform – Candidate Section

Pic1.jpeg.jpg

Open Match Transform: For example FirmAddress_MatchBatch

Pic2.jpg

Mapping Input:

  • Map default fields:  FIRM, ADDRESS_PRIMARY_NAME, ADDRESS_PRIMARY_NUMBER, ADDRESS_SECONDARY_NUMBER and POSTCODE
  • Apart from default fields, map all columns that is required to be mapped with DB columns

Options:

Right Click on “Group Forming” options and select Candidate Selection. In the candidate section editor, provide below details

Datastore: Select the valid DW connection where existing data collection is available

  • Suppose secondary source containing existing data collection is static, persistent data store can also be used as data store. This is also advantageous in case secondary source is not an RDBMS.

Cache Type: Pre-Load Cache or No-Cache

Auto Generate SQL or Custom SQL: Either use auto generated SQL or custom SQL for fetching columns from database table that is to be mapped with input data.

Break Key: Break Key on the input data must match with break key used in database table.

  • Use Break Key column from Database
  • Or Match the break key fields with DB columns

Use Constant Source Value: Suppose in case of MDM, you would be matching data from multiple sources. In case you want to keep track of input source data set, provide constant value (Physical Source Value) and map the same to the DB column (Physical Source Field).

Column Mapping: Map fields from input data set with your DB Table columns. In this case for break key fields, the Break Key column gets automatically set to “YES”.

Output: Map all required fields to the output

Pic3.jpg

Group Prioritization: Priority can be set using Group Forming > Group Prioritization operation option. This ensures that high priority records are the master or driver record. Without this prioritization, records from the original source are always considered as driver.

Update DB with new data set: In case data from new data set source has to be matched and merged again in the DB table, use Insert Else Update option in target table update control property.

Hope you find this information useful.

Thanks,

Sukanya Krishnan

To report this post you need to login first.

1 Comment

You must be Logged on to comment or reply to a post.

  1. Deepa Kannan

    Hi Suguna, the above information is very helpful. Can you please help me with the following scenario.

    I have 2 source which needs to be matched.

    source 1 has details EmpID, emp_Name, Emp_Address, Emp_City, Emp_ state.

    Source 2 has details SSN, P_Name, P_Address, P_city , P_state

    I need to get an output like  this in one row EMPID EmpID, emp_Name, Emp_Address, Emp_City, Emp_ state, SSN, P_Name, P_Address, P_city , P_state.

    I need to match based on name, city and state.

    right now I am able to match but I am getting in 2 lines lets say..

    Line 1 EmpID, emp_Name, Emp_Address, Emp_City, Emp_ state, Group number, Group rank

    Line 2 SSN, P_Name, P_Address, P_city , P_state,  Group number, Group rank

    then I have to manually track by group number and group Rank

    Please let me know, how to avoid this and get the results in one line without manual work.

    Thanks,

    ~ Deepa

    (0) 

Leave a Reply