S/4HANA and Data Warehousing
One of the promises of S/4HANA is that analytics is integrated into the [S/4HANA] applications to bring analyses (insights) and the potentially resulting actions closely together. The HANA technology provides the prerequisites, as it can easily handle “OLTP and OLAP workloads”. The latter is sometimes translated into a statement that data warehouses will become obsolete in the light of S/4HANA. However, the actual translation should read: “I no longer have to offload data from my application into a data warehouse in order to analyse that data in an operational (isolated) context.” The fundamental point here is that analytics is not restricted to purely operational analytics. This blog elaborates on that difference.
Put simply: a business application manages a business process. Just take the Amazon website: it’s an application that handles Amazon’s order process. It allows users to create, change and read orders. Those orders are stored in a database. A complex business (i.e. an enterprise) has many such business processes and thus many apps that support those processes. Even though some apps share a database, as in SAP’s Business Suite or S/4HANA, there are usually multiple databases involved in running a modern enterprise:
- Simply take a company’s email server, which is part of a communications process. The emails, the address book, the traffic logs etc. sit in a database and constitute valuable data for analysis.
- Take a company’s webserver: it’s a simple app that manages access to information about products, services and other company assets. The clickstream tracked in log files constitutes a form of (non-transactional) database.
- Points of sale (tills, check-outs) in a retail or grocery store form part of the billing process and write to the billing database.
- Some business processes incorporate data from 3rd parties like partners, suppliers or market research companies, meaning that their databases get incorporated too.
The list can easily be extended when considering traditional processes (order, shipping, billing, logistics, …) and all the big data scenarios that arise on a daily basis; see here for a sample. The latter add new databases and, thus, potential data sources to be analysed. From all of that, it becomes obvious that not all of those applications will be hosted within S/4HANA. It is even unlikely that all the underlying data is physically stored within one single database. It is quite probable that it needs to be brought either physically or, at least, logically to one single place in order to be analysed. That single place hosts the analytic processing environment, i.e. some engines that apply semantics to the data.
Now, whatever the processing environment is (HANA, Hadoop, Exadata, BLU, Watson, …) and whatever technical power it provides, there is one fundamental fact: if the data to be processed is not consistent, meaning harmonised and clean, then the results of the analyses will be poor. “Garbage in, garbage out” applies here. Even if all originating data sources are consistent and clean, the union of their data is unlikely to be consistent. It starts with non-matching material codes, country IDs or customer numbers, extends to noisy sensor data and goes as far as DB clocks (whose values are materialised in timestamps) that are not in sync; simply look at Google’s efforts to tackle that problem.
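To make the point about non-matching codes concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the two source systems, the field names, the customer number format and the country mapping are assumptions, not taken from any real landscape. Each extract is clean on its own, yet the naive union produces a misleading result.

```python
from collections import defaultdict

# Hypothetical extracts from two systems; each one is consistent in isolation.
web_shop_orders = [
    {"customer": "C-0042", "country": "DE", "revenue": 100.0},
    {"customer": "C-0107", "country": "FR", "revenue": 50.0},
]
store_orders = [
    {"customer": "42", "country": "Germany", "revenue": 30.0},
    {"customer": "107", "country": "France", "revenue": 20.0},
]

# Assumed harmonisation rule: the store system uses country names,
# the web shop uses two-letter codes.
COUNTRY_TO_ISO = {"Germany": "DE", "France": "FR"}


def harmonise_store_order(row):
    """Map a store row onto the web shop's (assumed) conventions."""
    return {
        "customer": f"C-{int(row['customer']):04d}",
        "country": COUNTRY_TO_ISO[row["country"]],
        "revenue": row["revenue"],
    }


def revenue_by_country(rows):
    """Sum revenue per country code, exactly as the codes appear in the rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["revenue"]
    return dict(totals)


# Naive union: "DE" and "Germany" are treated as two different countries.
naive = revenue_by_country(web_shop_orders + store_orders)

# Harmonised union: store rows are mapped first, so the totals line up.
harmonised = revenue_by_country(
    web_shop_orders + [harmonise_store_order(r) for r in store_orders]
)

print(naive)
print(harmonised)
```

The naive result splits each country into two buckets, while the harmonised one reports a single total per country. No engine, however fast, can fix that on its own: somebody has to define the mapping.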
In summary: while analytics in S/4HANA is operational, there are two facts that make non-operational (i.e. beyond a single, isolated business process) and strategic analyses challenging:
- It is likely that enterprise data sits in more than one system.
- Data that originates from various systems is probably not clean and consistent when combined.
A popular choice to tackle that challenge is a data warehouse. Its fundamental task is to expose the enterprise data in a harmonised and consistent way (“single version of the truth”). This can be done by physically copying data into a single DB and then transforming, cleansing and harmonising the data there. It can also be done by exposing data in a logical way via views that contain the code to transform, cleanse and harmonise the data (federation). Both approaches do the same thing, simply at different moments in time: before or during query execution. But both approaches do cleanse and harmonise. There is no way around it. So data warehousing, whether physical or logical, is a task that does not go away. Operational analytics in S/4HANA cannot and does not intend to replace the strategic, multi-system analytics of a physical or logical data warehouse. The fact that both can leverage the same technical assets, e.g. HANA, should not obscure that difference.
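The difference between the physical and the logical approach can be sketched with a few lines of Python and an in-memory SQLite database. This is only an illustration under stated assumptions: the table names (`src_orders`, `country_map`, `dw_orders`, `v_orders`), the customer number format and the mapping are hypothetical, and SQLite merely stands in for whatever database is actually used. The same harmonising `SELECT` is applied once at load time (a copied table) and once at query time (a view).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical source table with non-harmonised keys.
    CREATE TABLE src_orders (customer TEXT, country TEXT, revenue REAL);
    INSERT INTO src_orders VALUES ('42', 'Germany', 30.0), ('107', 'France', 20.0);

    -- Assumed mapping from country names to two-letter codes.
    CREATE TABLE country_map (name TEXT PRIMARY KEY, iso TEXT);
    INSERT INTO country_map VALUES ('Germany', 'DE'), ('France', 'FR');
""")

# One harmonisation rule, shared by both approaches.
HARMONISE = """
    SELECT 'C-' || printf('%04d', CAST(o.customer AS INTEGER)) AS customer,
           m.iso AS country,
           o.revenue AS revenue
    FROM src_orders AS o
    JOIN country_map AS m ON m.name = o.country
"""

# Physical: the rule runs once, the result is stored as a copy.
con.execute(f"CREATE TABLE dw_orders AS {HARMONISE}")
# Logical: the rule is stored and runs whenever the view is queried.
con.execute(f"CREATE VIEW v_orders AS {HARMONISE}")


def revenue_by_country(relation):
    """Aggregate revenue per country from the given table or view."""
    rows = con.execute(
        f"SELECT country, SUM(revenue) FROM {relation} "
        "GROUP BY country ORDER BY country"
    )
    return dict(rows.fetchall())


before_physical = revenue_by_country("dw_orders")
before_logical = revenue_by_country("v_orders")

# A new order arrives in the source system.
con.execute("INSERT INTO src_orders VALUES ('42', 'Germany', 5.0)")

after_physical = revenue_by_country("dw_orders")  # stale until the next load
after_logical = revenue_by_country("v_orders")    # reflects the new row

print(before_physical, before_logical)
print(after_physical, after_logical)
```

Both relations return the same harmonised figures at first. After the new source row arrives, the view picks it up immediately, whereas the copied table stays unchanged until it is reloaded. In both cases the harmonisation logic has to exist; only the moment at which it runs differs.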
This blog has deliberately been neutral regarding the underlying product or approach used for data warehousing. This avoids mixing up technical product features with general tasks. In a subsequent blog, I will tackle the relationship between S/4HANA and BW-on-HANA.