Twinkle Twinkle little STAR, do I really need you at all?
Initially, Enterprise Data Warehousing (EDW) was all about cubes, datamarts and STAR schemas, and about designing them properly for good performance. A lot of thought went into designing the cubes and making sure the cube tables were optimized. One of the usual performance checks was comparing the size of each dimension table against the FACT table, to make sure we did not go over the prescribed 20% limit unless it was inevitable.
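That 20% rule of thumb is simple enough to express in code. Here is a minimal sketch of the check described above; the table names, row counts and the `dimension_ratio` helper are all hypothetical, for illustration only:

```python
# Hypothetical row counts for a cube's fact table and its dimension tables.
fact_rows = 1_000_000
dimension_tables = {
    "DIM_CUSTOMER": 50_000,
    "DIM_MATERIAL": 120_000,
    "DIM_TIME": 365,
}

# Rule of thumb: a dimension table should stay under ~20% of the fact table.
LIMIT = 0.20

def dimension_ratio(dim_rows: int, fact_rows: int) -> float:
    """Size of a dimension table relative to the fact table."""
    return dim_rows / fact_rows

for name, rows in dimension_tables.items():
    ratio = dimension_ratio(rows, fact_rows)
    status = "OK" if ratio <= LIMIT else "candidate for redesign"
    print(f"{name}: {ratio:.1%} of fact table -> {status}")
```

A dimension that blows past the limit was the classic signal to redistribute characteristics across dimensions before the accelerator era.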
Now, enter the hardware accelerator, which uses in-memory computing and compresses the data in our cubes greatly to speed up reporting. It also makes life easier for designers, because ultimately the columnar processing of the accelerator kicks in when all else fails…
Taking the case of SAP BW…
Does this mean …
a. Cube design is no longer critical for query performance?
Since the BWA is going to index the columns and compress the data, even very poorly designed cubes with big, fat dimension tables would be handled well, because the BWA sees each column separately.
As a scenario, let's assume we have a cube with a dimension table that is 50% of the fact table: either there are too few entries in the fact table (deletion of requests without removing the dimension entries…) or the design is not very good, leading to too many entries in the dimension table. Since the columnar indexing in the BWA will take care of such duplication, should this design gaffe be considered forgivable?
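To see why columnar storage shrugs off such duplication, consider dictionary encoding, the kind of per-column compression a columnar engine applies. The sketch below is a deliberately simplified illustration, not the actual BWA compression algorithm:

```python
def dictionary_encode(column):
    """Replace repeated column values with small integer IDs plus a lookup
    dictionary - a simplified version of per-column compression in
    columnar engines."""
    dictionary = {}
    encoded = []
    for value in column:
        if value not in dictionary:
            dictionary[value] = len(dictionary)
        encoded.append(dictionary[value])
    return dictionary, encoded

# A bloated dimension column: many rows, but few distinct values.
column = ["DE", "US", "DE", "DE", "IN", "US", "DE", "IN"] * 1000

dictionary, encoded = dictionary_encode(column)
print(len(column))      # 8000 raw values
print(len(dictionary))  # only 3 distinct values need to be stored
```

However large the duplicated dimension entries are on disk, the column collapses to its distinct values plus a vector of small integers in memory, which is why the design gaffe costs so little at query time.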
Even if we do take the effort to set the cube design right by distributing the values properly across dimension tables, in a BWA scenario this would give a performance benefit that is hardly noticeable, because the original query already runs very quickly.
b. With BWA 7.3, should we look at cubes at all?
Since DSOs can be indexed, the EDW becomes a vast store of DSOs and InfoSets which could be used to report on data. We would then save the space being used for cubes, aggregates, indices, etc.
This would then mean that we would be reporting directly off transparent table structures for all purposes.
This brings us back to the same question… is cube design still relevant?
With HANA bringing in real-time analysis of transactional data, with HANA being planned for BW as well, and perhaps, at a later stage, with access to HANA via remote InfoProviders… do we need to maintain any data in BW at all?
This calls into question whether the principles of Inmon and Kimball with regard to data modeling and dimensional modeling are still relevant in today's world of accelerated reporting using hardware-based in-memory computation.
Is this the new data warehousing that we need to get used to and get familiar with?
It is not only SAP BW that has in-memory accelerator capabilities; we have Exalogic, NetApps, Vertica, etc., to name a few, and the same ideas apply to all of the above, which in turn start to question the concepts of ROLAP, MOLAP, HOLAP, etc.
Do you think these time-honored concepts are still valid, and will they continue to remain so going forward?