Skip to Content
This blog mainly concentrates on the different modeling strategies used in conventional datawarehousing design. SAP BW adopted a different strategy in the infocube design and it is known as extended star schema.The star and snowflake schema are most commonly found in data warehouses where speed of data retrieval is more important than speed of insertion. As such, these schema are not normalized much, and are frequently left third normal or second normal form. It is necessary to understand the processes behind the curtain before we delve into design something. In this blog I have covered the definitions of different design methodologies.

 

Snowflake Schema

A common form of dimensional model. In a snowflake schema, different hierarchies in a dimension can be extended into their own dimensional tables. Therefore, a dimension can have more than a single dimension table. To be precise Dimensions with hierarchies can be decomposed into a snowflake structure when you want to avoid joins to big dimension tables when you are using an aggregate of the fact table. In OLAP, this Snowflake schema approach increases the number of joins and poor performance in retrieval of data. In few organizations, they try to normalize the dimension tables to save space. Since dimension tables hold less space, Snowflake schema approach may be avoided. An example is given below

 

Star Schema 

 Star Schema is a relational database schema for representing multimensional data. It is the simplest form of data warehouse schema that contains one or more dimensions and fact tables. It is called a star schema because the entity-relationship diagram between dimensions and fact tables resembles a star where one fact table is connected to multiple dimensions. Dimensions with hierarchies also encompassed in one dimension. The center of the star schema consists of a large fact table and it points towards the dimension tables. The advantage of star schema are slicing down, performance increase and easy understanding of data.

Steps in designing Star Schema :-

1. Identify a business process for analysis(like sales).

2. Identify measures or facts (sales dollar).

3. Identify dimensions for facts(product dimension, location dimension, time    dimension, organization dimension).

4. List the columns that describe each dimension.(region name, branch name, region name).

5. Determine the lowest level of summary in a fact table(sales dollar).

Important aspects of Star Schema & Snow Flake Schema :-

1. In a star schema every dimension will have a primary key.

2. In a star schema, a dimension table will not have any parent table.

3. Whereas in a snow flake schema, a dimension table will have one or more parent tables.

4. Hierarchies for the dimensions are stored in the dimensional table itself in star schema.

5. Whereas hierachies are broken into separate tables in snow flake schema. These hierachies helps to drill down the data from topmost hierachies to the lowermost hierarchies.

6. Space wise efficiency is more in snowflake schema because it is using 3rd normal form. But star schema is using 2nd normal form.

 7. Retrieval of data is very fast in star schema compared to snow flake schema because snow flake design requires more number of joins to retrieve data.

Extended star schema

Many of the problems associated with the basic star schema are resolved with the BW extended star schema. With the extended star schema, attributes are removed from the dimensions and placed outside the InfoCube in master data tables. The BW extended star schema differs from the basic star schema. It is divided by a solution dependent part (InfoCube) and a solution independent part (attribute, text and hierarchy tables) which is shared among other InfoCubes.In BW, attributes located in the dimensions are called characteristics. In BW, attributes located in a master data table of a characteristic are called attributes of the characteristic. When designing a solution, it is a great challenge to decide whether an attribute should reside in a dimension table and thus in the InfoCube or in a master table or even both. Data is loaded separately into the master data (attributes), text and hierarchy tables. The SID table provides the link between the master data and the dimension tables.

The fact table and the relevant dimension tables of an InfoCube are connected with one another relationally using the dimension keys. The dimension key is provided by the system per characteristic combination in a dimension table.With the execution of a query the OLAP processor checks the dimension tables of the InfoCube to be evaluated for the characteristic combinations required in the selection.The dimension keys determined in this way point the way to the information in the fact table. Dimension tables consist of a maximum of 248 characteristics. The Time dimension holds the time characteristics needed for analysis. The Unit dimension contains the unit of measure and currency characteristics needed to describe the key figures properly. The Data Packet dimension is used to identify discrete packets of information loaded into the InfoCube. In this way, packets can be deleted, reloaded or maintained individually. In this design we have the flexibility to analyse the infocube in 13*248 angles.

To report this post you need to login first.

3 Comments

You must be Logged on to comment or reply to a post.

    1. Githen Ronney Post author
      Good Morning

      When we create an infocube in SAP BW it will automatically use Extended Star Schema. The disadvantage of star schema in comparison with Snowflake schema is it requires more memory to srote data because star schema uses second normal form on the contraty Snowflake schema uses third normal form. The BW extended star schema differs from the basic star schema. It is divided by a solution dependent part (InfoCube) and a solution independent part (attribute, text and hierarchy tables) which is shared among other InfoCubes.
               In BW, attributes located in the dimensions are called characteristics. In BW, attributes located in a master data table of a characteristic are called attributes of the characteristic. When designing a solution, it is a great challenge to decide whether an attribute should reside in a dimension table and thus in the InfoCube or in a master table or even both. Data is loaded separately into the master data (attributes), text and hierarchy tables. The SID table provides the link between the master data and the dimension tables.
                        The flexibility of extended star schema is we can include 248 characteristics in a dimension table. One infocube can have maximum of 16 dimensions possible. So essentially in Extended star schema 16*248 angle analysis is possible whereas Star schema provides only 16 angle analysis.

      (0) 
    2. Arun Varadarajan
      You can have a star schema if all your characteristics for the cube were Line item dimensions – that way the link between DIM ID and SID can be avoided … but then would be like trying to prove that start schemas exist in BI also..

      Another place to find star schemas is in Aggregates – here anyway all the characteristics if you do not exceed 14 characteristics in your aggregate will be stored as a flat table and hence a star schema…

      (0) 

Leave a Reply