SCD Type 1 Full Load With Error Handle – For Beginners
This example may help us to understand the usage of SCD Type 1 and with how to handle the error messages.
Brief about Slowly Changing Dimensions: Slowly Changing Dimensions are dimensions that have data that changes over time.
There are three methods of handling Slowly Changing Dimensions are available: Here we are concentrating only on SCD Type 1.
Type 1- No history preservation – Natural consequence of normalization.
For a SCD Type 1 change, you find and update the appropriate attributes on a specific dimensional record. For example, to update a record in the
SALES_PERSON_DIMENSION table to show a change to an individual’s SALES_PERSON_NAME field, you simply update one record in the
SALES_PERSON_DIMENSION table. This action would update or correct that record for all fact records across time. In a dimensional model, facts have no meaning until you link them with their dimensions. If you change a dimensional attribute without appropriately accounting for the time dimension, the change becomes global across all fact records.
This is the data before the change:
|15||00120||Doe, John B||Atlanta|
This is the same table after the salesperson’s name has been changed:
|15||00120||Smith, John B||Atlanta|
However, suppose a salesperson transfers to a new sales team. Updating the salesperson’s dimensional record would update all previous facts so that the
salesperson would appear to have always belonged to the new sales team. This may cause issues in terms of reporting sales numbers for both teams. If you want to preserve an accurate history of who was on which sales team, Type 1 is not appropriate.
Below is the step by Step Batch Job creation using SCD Type 1 using error Handling.
Create new job
Add Try and “Script” controls from the pallet and drag to the work area
Create a Global variable for SYSDATE
Add below script in the script section.
# SET TODAYS DATE
$SYSDATE = cast( sysdate( ), ‘date’);
print( ‘Today\’s date:’ || cast( $SYSDATE, ‘varchar(10)’ ) );
Now double click on DF and add Source Table.
Add Query Transformation
Add LOAD_DATE new column in Query_Extract
Map created global variable $SYSDATE. If we mention sysdate() this functional call every time which may hit the performance.
Add another query transform for lookup table
Create new Function Call for Lookup table.
Required column added successfully via Lookup Table.
Add another Query Transform. This query will decide whether source record will insert and update.
Now remove primary key to the target fileds.
Create new column to set FLAG to update or Insert.
Now write if then else function if the LKP_PROD_ID is null update FLAG with INS if not with UPD.
ifthenelse(Query_LOOKUP_PRODUCT_TIM.LKP_PROD_KEY is null, ‘INS’, ‘UP’)
Now Create case Transform.
Create two rules to FLAG filed to set “INS” or ”UPD”
Create Insert and Update Query to align the fields
Change LKP_PROD_KEY to PROD_KEY and PROD_ID to SOURCE_PROD_ID for better understanding in the target table.
Now create Key Generation transform to generate Surrogate key
Select Target Dimension table with Surrogate key (PROD_KEY)
Set Target instance
Add a Key_Generation transformation for the Quary_Insert to add count for the new column.
And for Query _Update we need Surrogate key and other attributes. Use the Map Operation transform to update records.
By default Normal mode as Normal. We want to update records in normal mode.
Update Surrogate key, Product key and other attributes.
Go back to insert target table –> Options –> Update Error Handling as below:
Go back to Job screen and create catch block
Select required exception you want to catch. and Create script to display error messages
Compose your message to print errors in the script_ErrorLogs as below.
print( ‘Error Handling’);
print( error_message() || ‘ at ‘ || cast( error_timestamp(), ‘varchar(24)’));
raise_exception( ‘Job Failed’);
now Validate script before proceed further.
Now these messages will catch errors with job completion status.
Now create a script to print error message if there is any database rejections:
# print ( ‘ DB Error Handling’);
if( get_file_attribute( ‘[$$LOG_DIR]/ VENKYBODS_TRG_dbo_Product_dim.txt ‘, ‘SIZE’) > 0 )
raise_exception( ‘Job Failed Check Rejection File’);
note: VENKYBODS_TRG_dbo_Product_dim.txt is the file name which we mentioned in the target table error handling section.
Before Execute, Source and Target table data of Last_updated_Date.
Now Execute the job and we can see the Last_Updated_Dates.
Now try to generate any error to see the error log captured our error Handling.
try to implement the same and let me know if you need any further explanation on this.