In my previous article I mentioned that HANA was a major enabler of Pervasive BI, as it allows for real-time data to be brought into the Data Warehouse.
In this article I would like to explore some technical options on how this can be done in practice, i.e. how real-time data can be brought into the Enterprise Data Warehouse to be analysed together with historical data.
BW powered by HANA – LSA++
Before looking at how real-time data fits into the Enterprise Data Warehouse, I will first briefly introduce the architecture of BW powered by HANA: LSA++.
Most traditional Enterprise Data Warehouses are built using a Layered Scalable Architecture (LSA). This architecture makes the EDW a more robust, scalable and flexible system that can cope with changes in business, technology and requirements. LSA is built with several layers, and data is physically copied from one layer to the next. It is not uncommon for traditional data warehouses to hold three copies of the same data, one in each layer: Acquisition, Transformation, Data Mart.
When we implement BW on HANA, we can create a different flavour of Layered Scalable Architecture that is much simpler, smaller and faster, and that relies more on virtual objects and less on copying data between layers. It is called LSA++.
Of course, an Enterprise Data Warehouse built on BW on HANA could be done the traditional way, but then it would not take advantage of many of the benefits that HANA's in-memory capabilities bring.
In LSA++ the first layer is called the "Open Operational Data Store Layer", or Open ODS Layer. It is equivalent to the original LSA's Acquisition Layer. Data is stored at field level (raw data), exactly as it is in the source system. Data can be stored in field-based DSOs (modelled in BW) or in HANA tables, which are accessible to BW through HANA views. Data can be loaded using scheduled extraction (the same way as in BW without HANA) or via real-time data replication, using SLT (SAP Landscape Transformation Replication Server). Both DSOs and HANA tables can be queried directly.
The next layer is the "Core Data Warehouse Layer", or Core DW Layer. Data is still at line-item level (not aggregated). At this level data can be transformed, cleansed or consolidated. Data is stored in Core DW DSOs that can be used by data marts in the next layer. This layer can also be queried directly.
The next layer is the "Virtual Data Mart Layer". Thanks to HANA, the structures on this layer are virtual and can combine data from all the other layers. These structures are used as query targets for reporting. InfoCubes, which store data physically, become obsolete. Virtual providers offer more flexibility and agility.
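To make the idea of a virtual structure more concrete, here is a minimal sketch in plain Python. It is only an analogy, not BW or HANA code: the class `VirtualProvider`, the method `sum_by` and the sample rows are all hypothetical names invented for this illustration. The point it shows is that a virtual provider stores nothing itself and reads the underlying layers at query time, so new rows in a lower layer appear in the next query without any load step.

```python
from itertools import chain
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, object]


class VirtualProvider:
    """Hypothetical stand-in for a virtual data mart: it holds references
    to its sources, never a copy of their data."""

    def __init__(self, *sources: Callable[[], Iterable[Row]]) -> None:
        self._sources = sources

    def rows(self) -> Iterator[Row]:
        # Sources are read lazily, each time a query runs.
        return chain.from_iterable(source() for source in self._sources)

    def sum_by(self, key: str, measure: str) -> Dict[object, float]:
        # Aggregate on the fly instead of reading a pre-built cube.
        totals: Dict[object, float] = {}
        for row in self.rows():
            totals[row[key]] = totals.get(row[key], 0.0) + float(row[measure])
        return totals


# Assumed sample data standing in for a Core DW DSO and an Open ODS table.
core_dw = [{"region": "EMEA", "amount": 100.0}]
open_ods = [{"region": "EMEA", "amount": 20.0}]

sales_mart = VirtualProvider(lambda: core_dw, lambda: open_ods)
before = sales_mart.sum_by("region", "amount")

# A newly replicated row is visible to the next query with no reload.
open_ods.append({"region": "APJ", "amount": 5.0})
after = sales_mart.sum_by("region", "amount")
```

In a physical InfoCube the second query would return the same result as the first until the next load ran; here the result changes as soon as the source does.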
Real-time data in the Enterprise Data Warehouse
Now that we have some understanding of the architecture of BW powered by HANA, we will add real-time scenarios to it.
It all depends on how data is stored in the source system. If the source system runs on a traditional database (not HANA), we can use SLT to replicate data into HANA tables in BW powered by HANA.
This happens in real time or near real time. Effectively, data makes its way into the EDW within seconds or minutes of the event, not hours later as in traditional data warehouses.
If the source system uses HANA as its database, then there is no replication latency at all. Data can be analysed directly in BW using Virtual InfoProviders (which read data directly from HANA) or through HANA views built directly on the source system. Data is made available in real time without having to go through extraction and transformation within the EDW. This eliminates replication, the latency that comes with it, and the additional storage of data in BW.
Benefits of bringing real-time data into the EDW
The adoption of LSA++ changes the way we do things:
By combining real-time data and historical data into one warehouse, the classic Pervasive BI applications become possible. This can bring phenomenal business benefits. Some examples are:
Some challenges
Of course, bringing real-time data into the Enterprise Data Warehouse creates its own challenges. What is real-time data today will become history tomorrow, and could be loaded again by the overnight load (assuming one exists). A mechanism to separate real-time data from historical data must be put in place; otherwise there is a risk of duplicated data.
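One possible way to think about such a separation mechanism is a time cutoff: history is trusted only for records before the cutoff, and the real-time source only for records from the cutoff onwards. The sketch below illustrates this in plain Python. It is an assumption for illustration rather than a BW feature; `Record`, `combine`, the field names and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Record:
    doc_id: str
    posted_at: datetime
    amount: float


def combine(historical: List[Record], realtime: List[Record],
            cutoff: datetime) -> List[Record]:
    """Read history strictly before the cutoff and real-time data from the
    cutoff onwards, so a record present in both sources is counted once."""
    old = [r for r in historical if r.posted_at < cutoff]
    new = [r for r in realtime if r.posted_at >= cutoff]
    return old + new


# Assumed sample data: document B was replicated in real time and was
# also picked up again by the overnight load.
historical = [
    Record("A", datetime(2024, 1, 1, 10, 0), 100.0),
    Record("B", datetime(2024, 1, 2, 1, 0), 50.0),
]
realtime = [
    Record("B", datetime(2024, 1, 2, 1, 0), 50.0),
    Record("C", datetime(2024, 1, 2, 9, 0), 25.0),
]

combined = combine(historical, realtime, cutoff=datetime(2024, 1, 2))
naive_total = sum(r.amount for r in historical + realtime)   # counts B twice
clean_total = sum(r.amount for r in combined)                # counts B once
```

With this sample data the naive union adds document B twice, while the cutoff-based combination includes it once. In practice the cutoff would need to move forward each time the overnight load completes.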
Conclusion
The ability to bring real-time data into the Enterprise Data Warehouse has never been so real. The adoption of LSA++, SLT and Virtual Providers makes it possible to combine historical and real-time data in the same analytics applications. This technology is available today.
This creates possibilities for applications that were not possible before, for solutions that truly support decision making based on the latest, most accurate information.
This is true disruption.