An adventure in automation world
After several years in BI domain, I’m still afraid considering time spent in many companies in poor quality design and lack of automation when having many custom or standardized systems.
While I use PowerDesigner since many years and I don’t find any good automation tool on the market, I decided to make my own tool.
First step was to decide what to automate and I choose to build a tool with the following capabilities:
- Helping to identify in source models key elements to build a datawarehouse
- Building automatically classical data models by usage of a more “business data model”:
- a. Extraction area
- b. Staging Area
- c. Final ERD or Multi-dimensional model
- Permit a common or specific optimization module
- Building data movement models with all needed components (tables/views) and intermediate transformation tasks
- Building complete flow orchestration to build each model or to build the complete datawarehouse
I’ll quickly describe the solution; excuse me if I don’t give complete details but this will be too long (+/- 2500 lines in vbs).
The first point was quite easy; I build an extended model to be added to each source model. Using this extension, I give the capability to identify Business Keys, Search keys (elements often used when searching data in a datamart, for future extension), forcing to consider another table and few other flags.
By quickly modifying display options, all of these elements are now available ask key elements in source analysis. This is really a funny behavior in PowerDesigner 😆 .
To build the second step (“business data model”), I created a new extension “Multi Dimensionnel.xem”. This model is a simplified approach of data modeling where a neophyte user can draw his basic fact and dimension. This model considers also the usage of pre-standardized components such as date dimension, time dimension, label storage,… This is extendable.
In this model elements to build a fact are classical measures and links to dimension are based on dimension’s business keys. After modifying display options, we can quicker display this information. After defining these elements, a mapping should be done between source models and the pseudo “business data model”.
Using scripting capabilities of PowerDesigner, I implemented logic to build the extraction of data in several data models from one single right click:
After several tests, I build all data models using initial mappings defined in the “business data model”; all classical datawareehousing components are considered. In addition to “simply” creating data models, I added source to target mapping. This permits a complete linage and traceability.
In the next step, permit a common or specific optimization module, I decided to bypass database specificities by creating one common optimization extension coupled with one specific extension for each database. The extension used is decided at runtime and added dynamically depending on current DBMS used to define the “business data model”. For example, if the target data model is Oracle 11g, we will directly get :
Note we used DBMS code to implement new extended models.
Last steps, building data movement models and orchestration was really challenging this was completely new to me; I implemented specific logic to link all model together in several ILM models.
After several trials and errors, I obtained desired result and I pushed the vice to make a complete data movement orchestration:
Well done, now I have an automation tool:
- Permitting fast prototyping with data models which can be directly implemented
- Decomposing and orchestrating all flows
- Providing complete linage and traceability
PowerDesigner is the perfect tool for such automation and I wait future evolution of PowerDesigner to interface SAP DataServices to directly implement ILM into operational tools ! 😉
Please feel free to add you experiences if I have missed any points of interest or not discovered the killer configuration switches to address this challenge.