Perhaps you started using SAP NetWeaver BW back in 2001, when the ERP project team would formally state that all reporting requirements would be covered by the “BW Team”. Or, even better, you are a customer who has used an entirely different BI solution for years, and now your company is looking longingly at SAP HANA and all the innovations and possibilities it brings to the table: S/4 HANA, Cloud Solutions, Platform as a Service, Suite on HANA, HANA Live, Digital Boardroom and many, many more.
In any case, if you are an Analytics consultant you are wondering: what data should I keep where? Which data should I report against directly in S/4 HANA, which in SAP BW, and which against this new beast called SAP Vora (with an additional Hadoop environment as well)? My colleagues have already discussed at length the need for an Enterprise Data Warehouse and, as a consequence, the fact that SAP BW is not dead. On top of that, you have all seen the announcement of SAP Vora. So, does it change everything? Don’t we need an EDW anymore once we have Hadoop and its OLAP engine powered by SAP in-memory technology?
The fact is, those are different things that serve different purposes. Your EDW environment will still be your single point of truth, allowing multidimensional data modeling to support your decision-making process and business querying. Hadoop should be leveraged for storing all kinds of data, at the finest granularity possible, from sources that demand constant streaming or constant updates of data. Yes, that is a rule of thumb, and yes, in an ideal world you should evaluate case by case, but it is a (good) starting point.
Let us discuss a couple of use cases:
- Your customer wants to store temperature data from thousands of different sensors across the world in order to analyze subtle temperature changes driven by ecological phenomena. In that case, I would go with a streaming acquisition tool to simply dump the sensor data into Hadoop, then use SAP Vora to analyze the data and integrate it into an EDW (BW on HANA) in order to run the advanced analytics and prediction algorithms that would allow the customer not only to report on the past, but also to find patterns and predict consequences. Why not put it all in the EDW, then? Simply put: the distributed architecture based on commodity hardware used by Hadoop is the most cost-efficient way to do it, and then you use the premium solution (which also requires more investment) to do the advanced stuff.
- Your customer has an e-commerce website for selling products. It works pretty well, and now your customer wants to go one step further: to access all navigation data from everybody who visits the website (logged in or not) and perform a cross-system analysis to identify which navigation patterns (searching for products, browsing category A and then category B, or even visiting today and coming back in 38 hrs, etc.) lead to certain buying behaviors and, based on that, to create specific offers that speed up the purchase experience and ultimately increase revenue. In this case, all navigation data could be stored in Hadoop, the primary analysis would be powered by SAP Vora, and the cross analysis with sales data could be done directly against S/4 HANA to allow real-time decision making and the identification of new patterns.
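To make the second use case a bit more concrete, here is a minimal sketch in plain Python of the kind of cross analysis described above: find the visitors who browsed category A and later category B, then compare how often they buy against everybody else. It does not use any SAP Vora or S/4 HANA API. The sample data and the names `CLICKS`, `BUYERS`, `visitors_with_pattern` and `conversion_rate` are all assumptions made up for illustration; in a real landscape the clicks would come from Hadoop and the buyers from the sales data.

```python
from collections import defaultdict

# Hypothetical navigation events: (visitor_id, timestamp_in_hours, category).
# In the scenario above these would come from the clickstream kept in Hadoop.
CLICKS = [
    ("v1", 0.0, "A"), ("v1", 0.5, "B"),
    ("v2", 1.0, "B"), ("v2", 2.0, "A"),
    ("v3", 0.0, "A"), ("v3", 38.0, "B"),
    ("v4", 3.0, "C"),
]

# Hypothetical set of visitors who bought something (the sales side).
BUYERS = {"v1", "v3"}


def visitors_with_pattern(clicks, first, second):
    """Return visitors who viewed `first` and, at a later time, `second`."""
    by_visitor = defaultdict(list)
    for visitor, ts, category in clicks:
        by_visitor[visitor].append((ts, category))

    matched = set()
    for visitor, events in by_visitor.items():
        seen_first = False
        # Walk each visitor's events in chronological order.
        for _, category in sorted(events):
            if category == first:
                seen_first = True
            elif category == second and seen_first:
                matched.add(visitor)
                break
    return matched


def conversion_rate(visitors, buyers):
    """Share of `visitors` that also appear in `buyers` (0.0 if empty)."""
    if not visitors:
        return 0.0
    return len(visitors & buyers) / len(visitors)


pattern_visitors = visitors_with_pattern(CLICKS, "A", "B")
all_visitors = {visitor for visitor, _, _ in CLICKS}
other_visitors = all_visitors - pattern_visitors

rate_pattern = conversion_rate(pattern_visitors, BUYERS)
rate_other = conversion_rate(other_visitors, BUYERS)
print(f"A then B: {rate_pattern:.0%} buy, others: {rate_other:.0%} buy")
```

The pattern detection runs on the raw, fine-grained clicks, which is the part you would push down to the Hadoop side, while the comparison with buyers is the small, aggregated step that meets the sales data.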
In short, new business models and business needs require that we be ready to face new challenges. It is up to us what we can offer our customers so that they can make a difference in the Digital Market.
All the best,