Validation transform is used to filter or replace the source dataset based on criteria or validation rules to produce desired output dataset.
It enables to create validation rules on the input dataset, and generate the output based on whether they have passed or failed the validation
In this Scenario we are validating the data from the database table with correct format of the zip code.
If the zip code is less than 5 digit then we will filter that data & pass it to another table.
The Validation transform can generate three output dataset Pass, Fail, and RuleViolation.
- The Pass Output schema is identical with the Input schema.
- The Fail Output schema has 3 more columns, DI_ERRORACTION and DI_ERRORCOLUMNS, DI_ROWID.
- The RuleViolation has three columns DI_ROWID, DI_RULENAME and DI_COLUMNNAME.
1) Create project, job, workflow, dataflow as usual.
2) Drag source table, Validate transform& provide details.
- Double click on Validation transform to provide details. You can see the 3 types of dataset as described above.
- Add a validation rule.
- Click Add & fill the details about the rule as follows.
Action on Fail:-
1) Send to Fail:- on failure of the rule the record will sent to another target with “Fail” records.
2) Send to Pass:- even on failure pass the record to the normal target
3) Send to Both:- sends to both the targets.
Select the column to be validated, then decide the condition.
We have selected “Match Pattern“ as the condition pattern as ‘99999’.
So it will check whether Zip code is of 5 digits or not.
- Press OK. Then you can see the entry get added as follows.
3) Add a Target table to the dataflow & link the Validate Transform to it.
- Choose the validate condition as “Pass”
- Similarly do the connection for “Fail” & “Rule Violation” condition.
4) Validate the job & execute it.
5) Check the input & output.
- You can see in the input in the above figure where the last row has zip code of less than 5 digits. Now view the output.
- Output for Pass condition:-
- Output for Fail condition
You can see that the invalid record from input is transferred to the “CUST_Fail” table as shown above.
Three more columns “DI_ERRORACTION“, “DI_ERRORCOLUMNS“, “DI_ROWID” can also be seen.
- Output of the “RuleViolation” condition.
So in this way Validate transform is useful in validating the records based on the rules & categorising the bad records into different target which can be analysed later.
Thanks & Regards,