Skip to Content

In my previous article “An adventure in automation world” , I explained to build automatically ERD and Multi-dimensional models.

In BI, there is a third modeling technique named “Data Vault” which is based on three basic table types: Hubs, Links and Satellite.   You’ll find mode information on this site.

This technique uses massive load processing technique and requires adequate infrastructure but, in the other hand, permit to build a solid foundation layer for enterprise warehouse.

 

We decided to create new service based on this approach with the following questions in mind (a service and not a tool because code is not protected inside current extended models):

  • How to elaborate data vault data model from classical model using automation
  • How to generate all steps (extraction, staging and final data vault model) considering classical technical aspects without drowning end user
  • How to generate full traceability (mapping) between all models to keep full impact and linage powerdesigner’s capabilities
  • How to pre-optimize data model considering target physical database type
  • How to generate complete synchronized flows between source models, extraction models, staging model and
    final data vault model
  • How to generate fast prototyping, including SQL generation for pre-optimized physical data models in few minutes while
    manual work takes weeks or months

Using these goals, we decided to build a tool to support our services with PowerDesigner.  We named it “Extreme” because we automatically
build our targets in few minutes instead of spending weeks or months with manual stuff.

Here’s an example of what we want to do: 

  1. Determine a source model:

    Here’s a small sample database schema you can have

/wp-content/uploads/2014/07/p1_495962.png

  1. The final result should be:

/wp-content/uploads/2014/07/p2_495963.png

Additionally we want pre-optimized data models and complete Information
Liquidity models giving all basic instructions to elaborate ETL flows.

Well, as you can see our goals are very complete.  Here’re steps we follow to build a solution:

  1. Build automation algorithm
  2. Create extended models for:
    1. Physical data models:
      • Automation of data models and Information Liquidity Models
      • Pre-optimization of data models
    2. Information Liquidity Models
      • To support few extended attributes which will help in realization

The main extended model has only few attributes.  This is dramatically simple to determine parameters before launching the model production:

/wp-content/uploads/2014/07/p3_495966.png

The complexity is located inside the code (nearly 3000 lines of code), impossible to explain here.

The logic followed is quite “simple”:

  • Build extraction data model to limit stress on source system
  • Build a staging data model to preapare efficiently data before building data vault
    model
  • Build an ILM between source and extraction
  • Build an ILM between extraction and staging
  • Build an ILM between staging and data vault model
  • Build an ILM to integrate all flows

 

Let’s examine results:

First, prepare parameters in the source model and determine Business Keys.

/wp-content/uploads/2014/07/p4_495967.png

Next launch the Data Vault Analyzer and, after few minutes, you’ll get:

1.All data models and Information liquidity models

/wp-content/uploads/2014/07/p5_495969.png

2.With full traceability

/wp-content/uploads/2014/07/p6_495971.png

3.With also all transformations tasks to load target tables through models

/wp-content/uploads/2014/07/p7_495973.png

4.And the cherry on the cake, how to orchestrate everything:

/wp-content/uploads/2014/07/p8_495974.png

As you can see, using automation techniques can help providing “extreme” services with high and repetitive quality for your customer.

You’ll find also a white paper regarding this article here.  Feel free to share your comments !

Kr,

Thierry

To report this post you need to login first.

3 Comments

You must be Logged on to comment or reply to a post.

      1. Thierry de Spirlet Post author

        Thanks for your feedback.

        Unfortunately, I’ll not attend PD Europe day.  Don’t hesitate to let me know if you come in Belgium.  I’m working as freelance mostly in Belgium.

        Thierry

        (0) 

Leave a Reply