The future of the SAP EDW: Interview with Juergen ...

Former Member · ‎07-14-2015

This blog was previously published on my company's website and is posted here to reach the SCN audience as well.

At the High Tech Campus in Eindhoven, the Netherlands, Juergen Haupt, Product Manager SAP EDW (BW/HANA), gave a presentation for the Dutch User Group (VNSG). In the morning before the meeting, I was fortunate enough to get the chance to sit down with Mr. Haupt for an interview.

We talked about SAP BW on HANA, LSA++, native development, S/4HANA Analytics and everything in between.

Left: Juergen Haupt, SAP. Right: Sjoerd van Middelkoop, SOA People | Intenzz

Mr. Haupt, welcome to Eindhoven! Please introduce yourself to our readers.

Well, thank you, Sjoerd! OK, my name is Juergen Haupt and I have now been with SAP for 18 years, working in the area of Data Warehousing. Before joining SAP, I worked at Software AG, where I had my first contact with Data Warehousing. When I started working with the early releases of SAP BW, it quickly became clear to me that BW was a completely new BI approach that brought business requirements into focus. Nevertheless, the first versions were primarily focused on OLAP, not on data warehousing as defined, for example, by Bill Inmon. Knowing about the impact of ‘stove pipes’, and encouraged by customers, I began pushing Inmon’s idea of a ‘single version of the truth’ and Kimball’s ‘conformed dimensions’ towards an architecture-driven BW approach. Around 2005, more and more customers positioned BW as their Enterprise Data Warehouse and asked for more guidance on how to set up a BW EDW. As a consequence, we defined the Layered Scalable Architecture (LSA), which has become the standard for setting up a BW EDW on AnyDB today.

But there is never a standstill. Just when we had reached a solid, generally accepted state of LSA on RDBMS, SAP HANA and, a little later, BW on HANA entered the scene. This is the reason LSA++ for BW on HANA is the successor of the LSA for BW on anyDB.

Q: So, if we compare the ‘traditional’ BW to BW on HANA – what are the major differences?

Well, first of all, customers that moved to BW on HANA report tremendous performance gains with respect to data loads and querying. Then they notice the simplification through fewer InfoCubes. We see further simplification in BW on HANA 7.40 SP8 through the new Advanced DSO, which replaces traditional DSOs and InfoCubes. In addition to simplification comes flexibility through the new CompositeProvider, which allows you to combine any BW InfoProviders (DSOs, the new Advanced DSOs or InfoCubes) and create new virtual solutions. Even combinations with HANA native models outside of BW are possible.

But there are benefits that show only at second glance and that are maybe not so well known: let’s call it ‘the new openness of BW on HANA’. We all know from experience that integrating non-SAP raw data into BW took a certain effort in the past. You always had to define InfoObjects and assign them to the raw data fields. This is no longer a prerequisite for integrating data into BW, as BW on HANA 7.40 comes with so-called field-based modeling. Field-based modeling means that you can now integrate data into BW with considerably less effort than before. Regardless of whether you load data into BW or whether the data resides outside BW, you can now directly model and operate on field-level data without the need to define InfoObjects in advance and subsequently map the fields to the InfoObjects. This makes the integration of any data much easier. And how is this achieved? Well, the new Advanced DSOs allow storing field-level data in BW. Advanced DSOs can have only fields, a mixture of fields and InfoObjects, or just InfoObjects, like the old DSOs. On top of the BW Advanced DSOs with fields, or on any SQL or HANA view outside BW, you define the BW on HANA Open ODS Views to model reusable BW semantics identifying facts, master data, and the semantics of fields such as currency fields or text fields. Furthermore, in Open ODS Views you can define associations between Open ODS Views and InfoObjects, which means you model virtual star schemas. Last but not least, you can use Open ODS Views in a query, or combined with other Providers in a CompositeProvider, like any InfoProvider.

So, in short, BW on HANA can model and work on raw data regardless of where they are located, and we can integrate these raw data with the harmonized InfoObject world by associating InfoObjects in Open ODS Views with fields.

The idea of working with raw data in BW and the early and easy integration of raw data results in the new ‘Open ODS Layer’, which brings BW and the sources closer together.

Q: So what you are saying is that the functionality that has been developed for BW on HANA is actually created from an architectural point of view, and not from a technological point of view?

Exactly, this is an important driver. Knowing that HANA can work on data as it is, without transforming the data into specific analytic structures, you should be able to work with virtual objects directly on any field-level data. Bringing the source systems closer to BW means that we need something intermediate between the source and the fully fledged, top-down modeled EDW described by InfoObjects. This is achieved by the Open ODS Layer.

Q: LSA++ is, as you stated, the successor of LSA for BW on HANA scenarios. What are the main differences between the LSA approach and LSA++?

No architecture stays forever. Any architecture has to be reviewed continuously, especially when the circumstances change. When HANA came along and, a little later, BW on HANA was released, colleagues asked me very early: “Juergen, can you make an update of LSA for BW on HANA?” I hesitated, because it was clear that BW on HANA is more than just exchanging the relational database, more than the offering of the in-memory BW Accelerator. This is why just an ‘update of LSA’ was and is not adequate. I do not want to bore you with the discussions we had; we can see the results by looking at BW on HANA 7.4 and the LSA++ as successor of LSA:

Bearing in mind what I said before about BW on HANA, we can look at LSA++ from two different perspectives. The first I call LSA++ for simplified data warehousing.

This perspective deals with the traditional way of doing data warehousing: moving data to BW and organizing the data in a proper way. With LSA++ the architecture becomes far more streamlined and flexible. We can find two major differences here with respect to the traditional LSA. First, persistent Data Marts (BW InfoCubes) become obsolete through virtual composition of persistent data (CompositeProviders). The result is the LSA++ Virtual Data Mart Layer. Second, BW moves closer to the source data through BW field-based modeling. The result is the Open ODS Layer.

The Open ODS Layer broadens our architecture options, as it may serve as inbound layer not only for an EDW Layer that is described mainly by InfoObjects. We can also stage the data in a DWH layer that is mainly described by fields. We call this a raw or domain data warehouse. A domain data warehouse is dominated by one leading source system, and all other sources integrate in the domain DWH with respect to this leading source. For example, an S/4HANA system can be such a leading source system. All other sources would then integrate in the related BW domain data warehouse with respect to the S/4HANA semantics and values. Defining InfoObjects is always necessary if you have to harmonize multiple equivalent sources; this is the well-known EDW case.

But LSA++ is more than just simplified data warehousing. It is an open architecture, allowing an evolutionary DWH approach. I call this the LSA++ for logical data warehousing. It is a complementary perspective to the traditional LSA++ simplified data warehousing perspective: sources of any nature (operational sources, data lakes like Hadoop or Open ODS as Data InHubs) play a role equivalent to that of the data warehouse: they are a basis for analytics. The logical data warehouse as described by Gartner provides analytics and reporting on the original data as long as you can keep the service level agreements and cover the business requirements. You move data to the data warehouse only if the service level agreements are violated or the business requirements cannot be fulfilled.

The LSA++ supports the logical DWH approach via an agile virtual data mart layer. Agility comes from two modeling options in BW on HANA. First, it comes through the CompositeProviders, which allow you to combine any BW Provider with HANA models from outside BW, wherever they are located. Second, it comes through Open ODS Views of type fact, master or text, which allow you to define dimensional models on any data outside of BW defined by tables, SQL views or HANA views. You always have the possibility to switch a virtual Open ODS View source to a persisted BW Advanced DSO, as suggested by the logical DWH approach. Switching from virtual to persisted means that BW on HANA generates the data flow from the remote source to an Advanced DSO, and the Advanced DSO itself, based on the definition of the Open ODS View.

If you look at the virtual models on the source systems, as offered by HANA Live or S/4HANA Analytics, BW can then be considered an extension offering additional services, like historic data, business-consistent views et cetera, that the source cannot offer. The transition from the source model to BW can then happen in a very dynamic way.
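Editor's illustration: the rule of thumb described above (report on the original data while service levels and business requirements are met, persist only when they are not) can be written down as a small decision function. This is a minimal sketch in plain Python, not BW functionality; the class `SourceCheck`, its fields and the function `placement` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class SourceCheck:
    """Hypothetical assessment of one source for analytics purposes."""
    name: str
    meets_sla: bool           # assumed: service level agreements are kept on the source
    meets_requirements: bool  # assumed: business requirements are covered on the source


def placement(check: SourceCheck) -> str:
    """Return 'virtual' to keep reporting on the original data,
    or 'persisted' to move the data into the data warehouse."""
    if check.meets_sla and check.meets_requirements:
        return "virtual"
    return "persisted"


# Hypothetical sources, for illustration only.
for source in (
    SourceCheck("operational system", meets_sla=True, meets_requirements=True),
    SourceCheck("data lake", meets_sla=False, meets_requirements=True),
):
    print(source.name, "->", placement(source))
```

In the interview's terms, a 'persisted' outcome corresponds to switching a virtual Open ODS View source to an Advanced DSO; the sketch only models the decision, not the generated data flow.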

Q: On SAP HANA you can define normalized DWH models like Data Vault directly. Data Vault is quite popular with Dutch companies. Do you think Data Vault modeling is a valid alternative for SAP ERP data?

We call our team SAP EDW Product Management, which implies that we cover both BW on HANA and HANA native data warehouse modeling, as we call it. A native HANA data warehouse can be modeled using any known DWH model (e.g. dimensional, 3NF, data vault). That means freedom but also threats. Threats especially for customers who decide about their future DWH architecture based on sentiments and a perception of BW that is driven by the past. We find all kinds of BW perceptions in the market: people who love it and, for whatever reason, people who dislike it. I have quite a good idea why people may dislike BW, but one thing is clear to me: sentiments are a bad advisor. With a bad perception of the traditional BW in mind, we have already seen customers who tried to build a native HANA data warehouse for SAP Business Suite sources, saying “we have an SAP source system, and other SAP tools like PowerDesigner and Data Services, so we are going to ‘Vault it’.” To make a long story short: this finally ended up as a nightmare, as you have to rebuild all the semantics, associations and annotations natively. And it offers no business value, because with BW you get all this for free: BW knows these semantics because of the tight dictionary integration between SAP sources and BW.

In addition, Data Vault modeling assumes that you should always expect the worst from your sources. It assumes that source-model changes can happen frequently and at any time, forcing you to change your DWH models, links and so on. But that is not the reality with SAP source systems. The SAP source models are in general pretty stable, which makes the dimensional BW model work very well. Vaulting in general for SAP sources brings in complexity that cannot be justified.

Q: This is the case with standard SAP content. There is, however, not a single customer I know without quite a bit of customization in their SAP system. And the inability to adapt to these changes is a major point of criticism of BW.

Yes, you are right, and these customizations could not be modeled flexibly enough in the past. But this is no longer true with BW on HANA. With BW 7.40 SP8 we can now model a kind of dimensional satellite of a BW entity using Advanced DSOs with Open ODS Views on top, or directly in a CompositeProvider. Let me give you an example: you have all the standard SAP attributes in your 0COSTCENTER InfoObject. You have the requirement to model country-specific attributes, let’s say for the UK only. Today you store these attributes in an Advanced DSO and define an Open ODS View of type master on top of it. In any Open ODS View of type fact or in a CompositeProvider you can then associate or join the different views of the entity cost center, regardless of whether they come from an InfoObject like 0COSTCENTER or from an Open ODS View.

From my point of view, this will solve most modeling challenges customers had with such scenarios in the past: you load attributes with different ownership independently, you create new attributes without impacting the existing model, and you associate different attribute views and can even create dedicated authorizations.
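Editor's illustration: the cost center example can be pictured as a left join of fact rows with several independently maintained attribute views on the same key. The following is a minimal sketch in plain Python, not BW modeling code; all data values, attribute names such as `uk_region`, and the function `associate` are hypothetical and only stand in for the 0COSTCENTER attributes and the UK-specific satellite.

```python
# Hypothetical sample data; attribute names and values are illustrative only.
standard_attributes = {  # stands in for attributes of the 0COSTCENTER InfoObject
    "CC100": {"description": "Sales London", "country": "GB"},
    "CC200": {"description": "Sales Berlin", "country": "DE"},
}
uk_attributes = {  # stands in for the UK-only satellite with a master-type view on top
    "CC100": {"uk_region": "Greater London"},
}
facts = [
    {"costcenter": "CC100", "amount": 1200.0},
    {"costcenter": "CC200", "amount": 800.0},
]


def associate(fact_rows, *attribute_views):
    """Left-join each fact row with every attribute view on the cost center key."""
    result = []
    for row in fact_rows:
        enriched = dict(row)
        for view in attribute_views:
            # A cost center missing from a view simply contributes no attributes.
            enriched.update(view.get(row["costcenter"], {}))
        result.append(enriched)
    return result


report = associate(facts, standard_attributes, uk_attributes)
for line in report:
    print(line)
```

Because each attribute view is a separate structure, the UK attributes can be loaded or extended without touching the standard ones, which is the point made in the answer above.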

Overall: I don’t believe that it makes sense to create data vaults for SAP ERP operational systems, because it adds complexity but no value. BW on HANA is flexible enough to model the volatility of SAP source models caused by customization. On the other hand, if you have multiple, highly volatile non-SAP sources, you are free to create a data vault DWH natively on SAP HANA. The result would then be a hybrid architecture between BW and a native HANA DWH.

This blog is the first half of the interview I conducted with Juergen Haupt. The second half will be posted shortly!
