Skip to Content
Author's profile photo Asgar Ali MK

SCD Type Implementation in BODS

Here I am trying to explain the methods to implement SCD types in BO Data Service. The different types of slowly changing dimension types are given below.

1.       Type 0

2.       Type 1

3.       Type 2

4.       Type 3

5.       Type  4

6.       Type 6/Hybrid.

In this document I will explain about first five types of SCD types with examples.

Source Data

Name

IDate

designation

John

2002.12.01

A

Jay

2012.12.01

A

Jasil

2012.12.01

A

SCD TYPE 0

The SCD Type 0 method is passive. Value remains the same as it were at the time the dimension record was first entered. Type 0 also applies to most date dimension attributes.

SCD TYPE 1

This method does not track any history data .This methodology overwrite old data with new data without keeping the history. This method mainly used for misspelled names.

Let consider below given data is our target data after the first run.

ID

Name

IDate

designation

1

John

2002.12.01

A

2

Jay

2012.12.01

A

3

Jasil

2012.12.01

A

During the next run, Consider the designation of John is changed to ‘B’ on the date 2003.12.01 then the output will be.

ID

Name

IDate

designation

1

John

2003.12.01

B

2

Jay

2012.12.01

A

3

Jasil

2012.12.01

A

Here the no history is preserved the designation and Idate are updated with new value.

In BODS it can be implemented by using following transformation.

SCD1.JPG

Source:- Test_Effective_date is our source given above (Section – Source Data)

QR_MAP :- Map the source data to query transform without applying any transformation.

TBL_CPM :- Table comparison used to compare source data and the target table data .

MP_OPR :- This will be used to insert new data and update old data.

KEY_GEN :- This transform used to generate a surrogated key .

SCD TYPE 2

In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key. Here 2 new column columns are inserted called start_date and end_date.

Let consider below given data is our target data after the first run.

ID

Name

IDate

designation

Start_Date

End_Date

1

John

2002.12.01

A

2002.12.01

9000.12.31

2

Jay

2002.12.01

A

2002.12.01

9000.12.31

3

Jasil

2002.12.01

A

2002.12.01

9000.12.31

During the next run, Consider the designation of John is changed to ‘B’ on the date 2003.12.01 then the output will be.

ID

Name

IDate

Designation

Start_date

End_date

1

John

2002.12.01

A

2002.12.01

2003.12.01

2

Jay

2002.12.01

A

2002.12.01

9000.12.31

3

Jasil

2002.12.01

A

2002.12.01

9000.12.31

4

John

2003.12.01

B

2003.12.01

9000.12.31

Here a new row is inserted and the end_date of the first row will be the start date of the newly  updated value. (note:-‘ 9000.12.31’ is the default value)

In BODS it can be implemented by using following transformation.

SCD2.JPG

SOURCE : This is our data as given above(Section – Source Data)

QR_MAP : Map the source data to query transform without applying any transformation.

TBL_CPM :- Table comparison used to compare source data and the target table data .

HIS_PRES:- This transform is used to store the history data and if any updates on source data then new row will be inserted.

KEY_GEN :- This transform used to generate a surrogated key .

SCD TYPE 3

This method tracks changes using separate columns and preserves limited history. The Type II preserves unlimited history as it’s limited to the number of columns designated for storing historical data. The original table structure of Type II differs from Type II by Type III adds additional columns. In the following example, an additional column has been added to the table to record the original target- only the previous history is stored.

Let consider below given data is our target data after the first run.

ID

Name

IDate

Curr_Designation

Effective_date

Pre_Designation

1

John

2002.12.01

A

2002.12.01

A

2

Jay

2002.12.01

A

2002.12.01

A

3

Jasil

2002.12.01

A

2002.12.01

A

During the next run, Consider the designation of John is changed to ‘B’ on the date 2003.12.01 then the output will be.

In the target output an extra column will be added to keep the previous value of that particular column.

ID

Name

IDate

Curr_Designation

Effective_date

Pre_designaion

1

John

2003.12.01

B

2003.12.01

A

2

Jay

2002.12.01

A

2002.12.01

A

3

Jasil

2002.12.01

A

2002.12.01

A

(Note:- on the first run the Pre_designaion and Current Designation are same)

In BODS it can be implemented by using following transformation.

SCD3.JPG

SOURCE : This is our data as given above(Section – Source Data).

QR_JOIN : This query transform is used to join the Source with target.

                       This will be a left outer join and the Source will be the outer source and the target is the inner source. This join will based on the name column on the both side.(Source.name = SCD3.name)

QR_INSERT :- This query transform used to filter the data which is new that means SCD3.name is null .

QR_UPDATE : This query transform used to filter the data which is already existed in target table but the designation is updated that means SCD3.name is not null and the designation from the source and the previous designation from the SCD3 table are not same.

MP_UPDATE : This transform used to update the target table by setting the map operation as  ‘normal to update’

KEY_GEN :- Key generation is used to generate an surrogated key for newly inserted row.

During the first run the source file have the data as given above and the target table will not have any data so that all set of data will moved to QR_INSERT flow and loaded into target table.

SCD TYPE 4

SCD Type 4 design technique is used when SCD Type 2 dimension grows rapidly due to the frequently changing dimension attributes. In SCD Type 4, frequently changing attributes will be removed from the main Table and added in to a History Table.

The Type 4 method is usually referred to as using “history tables”, where one table keeps the current data and an additional table is used to keep a record of some or all changes.

Let consider below given data is our target data after the first run.

ID

Name

IDate

designation

1

John

2002.12.01

A

2

Jay

2002.12.01

A

3

Jasil

2002.12.01

A

And the History table will not have any data.

During the next run, Consider the designation of John is changed to ‘B’ on the date 2003.12.01 then the output will be. There will be two table one will keep the current data and other will be the changed data (history).

SCD4_CUR

ID

Name

IDate

designation

1

John

2003.12.01

B

2

Jay

2002.12.01

A

3

Jasil

2002.12.01

A

SCD4_HIST

The history table will have an addition column compared to  the source that will be created date.

ID

Name

IDate

designation

Created_date

1

John

2002.12.01

A

  1. 2003.12.01

(Note: created date will be is the date when the designation is changed)

In BODS it can be implemented by using following transformation.

SCD4.JPG

SOURCE : This is our data as given above(Section – Source Data).

QR_JOIN : This query transform is used to join the Source with target.

                       This will be a left outer join and the Source will be the outer source and the target is the inner source. This join will based on the name column on the both side.(Source.name = SCD4_Curr.name)

QR_INSERT :- This query transform used to filter the data which is new that means SCD3.name is null .

QR_UPDATE : This query transform used to filter the data which is already existed in target table but the designation is updated that means SCD4_Curr.name is not null and the designation from the source and the previous designation from the SCD4_curr table are not same.

QR_CURR: This Query transform is used to updated the target table with new designation .

QR_HIST : This Query transform is used to updated the target table with Previous designation .

MP_UPDATE : This transform used to update the target table by setting the map operation as  ‘normal to update’

KEY_GEN :- Key generation is used to generate an surrogated key for newly inserted row.

During the first run the source file have the data as given above and the target table will not have any data so that all set of data will moved to QR_INSERT flow and loaded into target table.

There are different methods to implement SCD types in BODS and above explained is one method to implement the same.

Regards

Asgar

Assigned Tags

      16 Comments
      You must be Logged on to comment or reply to a post.
      Author's profile photo Sandeep Chandran
      Sandeep Chandran

      Hi Asgar,

        Thanks for sharing this valuable information...looking forward to read more on SCD...:)

      Author's profile photo Former Member
      Former Member

      Informative....

      Author's profile photo Sagar Girme
      Sagar Girme

      Hi Asgar,

      Good information. Just want to add simple but very important point here. Whenever updating target table, we need to make sure that we are using input keys. That is in target table properties, Options tab, in Update Control Use Input Keys as 'YES'.

      Thanks

      Sagar

      Author's profile photo Former Member
      Former Member

      Good work Buddy.

      Regards

      Arun

      Author's profile photo Former Member
      Former Member

      Thanks for Sharing! 🙂

      Author's profile photo Former Member
      Former Member

      Nice explanation.

      Author's profile photo Former Member
      Former Member

      QR_HIST : This Query transform is used to updated the target table with Previous designation .

      Could you please explain how can Query update the data.

      Author's profile photo Former Member
      Former Member

      Hello Sagar,

      Good explanation with Data flow diagram...

      Personally i like SCD-4 Type.. 😎

      Thanks for sharing...

      Author's profile photo Former Member
      Former Member

      This is great, but where is the Type 6/Hybrid portion?  Interested to see how you resolve that!

      Author's profile photo Kumari Varsha
      Kumari Varsha

      Hi Asgar,

      I am Trying to go from

      Id

      LS001

      LS003

      NT006

      LS007

      ES009

      LS004

      TO OUTPUT


      Col          ID

      T1          LS001,ES009,LS003

      T2          NT006

      T3          LS007,LS004

      pls help me out or give me  hint or idea so i can use that.

      Regards

      Varsha

      Author's profile photo Former Member
      Former Member

      Dear Varsha,

      I cant get the logic behind your output. I think you will have to hardcode the required output.

      Regards,

      Neha Khetan

      Author's profile photo Kumari Varsha
      Kumari Varsha

      Hi Neha,Thanks for your response,can you explain with this example because in real time more than 50000 data and i need output as above..?

      Regards

      varsha

      Author's profile photo Akhilesh Sengupta
      Akhilesh Sengupta

      Hi Varsha,

      Please explain the criteria behind grouping the data. Can help you after that.

      Thanks,

      Akhilesh

      Author's profile photo Former Member
      Former Member

      hi can we implement scd-2 with out   table comparison    ????????

      Author's profile photo Swetha N
      Swetha N

      very good explanation with examples.

      Author's profile photo Former Member
      Former Member

      very informative ...grt job