The ERP era led to the data warehousing era. ERP systems ensured that enterprises had process-related data, and there was growing curiosity to analyze it. ERP systems were good enough to churn out operational reports, but running historical reports put undue pressure on them, as they were primarily meant to capture process data, not to analyze stored data. This led to building data warehouses meant for storing historical data and optimized for reporting, or to be more precise, multi-dimensional reporting. As enterprises built data warehouses, there was a growing clamor for better reporting performance and better analytics performance overall.
That is when most of the data warehousing appliances made their entry into this market and projected themselves as a panacea for all data and reporting asks. (A data warehouse appliance is a combination hardware and software product that is designed specifically for analytical processing. An appliance allows the purchaser to deploy a high-performance data warehouse right out of the box; definition courtesy whatis.com.) These appliances kept guzzling enterprise and external data and kept churning out analysis after analysis.
This was an interesting era, but with unexpected exponential data growth these appliances also started facing various challenges, including scalability. Little were they aware of two key movements taking place in their own neighborhood:
- Customers were demanding more and demanding it NOW. Real-time was no longer a luxury but the need of the hour, and organizations like SAP had the courage to conceptualize an altogether new cutting-edge platform, HANA, with these demands in mind.
- The open source movement, especially Hadoop, was unfolding and evolving rapidly. It made the cost associated with scaling an appliance (read: storing data) look ridiculous.
While appliances kept guzzling more and more data, and customers kept paying to add more storage and power to them, those same customers were also evaluating open source Hadoop and cutting-edge platforms like HANA. The initial phase was nothing more than testing the waters for a particular use case. Lots of use cases were validated, and customers started making initial investments in HANA or Hadoop, and a few in both.
(Before we proceed, let us go through the HANA and Hadoop stories to understand this better.)
HANA Story: In June 2011, SAP HANA was launched in the market, and it was positioned as an appliance. Remember that HANA stands for high-performance analytic appliance, which put it in direct competition with data warehousing appliances. Lots of analytics use cases were validated on HANA, and performance was benchmarked against appliances such as Teradata and Netezza. A few customers did bite the bullet and made use-case-driven initial investments (their old appliance continued to be the key data store). SAP smartly added a database to the HANA positioning, and it became HANA DB (starting with SAP BW). The DB was much more cost effective than the overall appliance and was a big success, as a lot of big BW customers migrated from other databases to HANA DB. That was the true start of HANA in customers' SAP ecosystems.
At this point HANA had two main use cases: appliance and DB (for specific SAP applications). In the next wave of evolution, HANA DB was extended to almost all SAP applications, and HANA also evolved into a development platform, going much beyond just appliance and DB. With HANA DB underlying them, SAP started simplifying its applications, e.g. BW on HANA, ECC on HANA, BW/4HANA, S/4HANA etc. This in a way unlocked the true potential of HANA. SAP HANA is now well entrenched in the SAP ecosystem of most SAP customers, whether as a DB, an analytics appliance, a data mart or a platform.
Hadoop Story: It all started in 2003 with the release of the Google File System paper. Hadoop has evolved dramatically over the last 15 years, but the core of Apache Hadoop still consists of a storage part, known as the Hadoop Distributed File System (HDFS), and a processing part, which is the MapReduce programming model. Hadoop splits files into large blocks and distributes them across the nodes in a cluster. This storage part is still the main driver for enterprises to leverage Hadoop. The data explosion driven by digital has further emphasized the need for an efficient storage solution like Hadoop.
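To make the storage and processing split a little more concrete, here is a minimal Python sketch of the idea, not of Hadoop itself. It cuts a list of records into blocks, spreads the blocks across nodes, runs a map step on each block and then a reduce step that merges the partial results. All function names, the node names and the round-robin placement are assumptions made for illustration; real HDFS splits files by byte size and replicates each block, and none of that is modeled here.

```python
from collections import Counter
from itertools import cycle


def split_into_blocks(records, block_size):
    """Cut the input into fixed-size blocks (here: a fixed number of records)."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def place_blocks(blocks, nodes):
    """Assign blocks to nodes round-robin (a hypothetical placement policy)."""
    placement = {node: [] for node in nodes}
    for block, node in zip(blocks, cycle(nodes)):
        placement[node].append(block)
    return placement


def map_phase(block):
    """Map step: count the words in one block, independently of other blocks."""
    counts = Counter()
    for record in block:
        counts.update(record.split())
    return counts


def reduce_phase(partial_counts):
    """Reduce step: merge the per-block counts into one overall result."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


def word_count(records, nodes, block_size):
    """Run the whole toy pipeline: split, place, map on each node, reduce."""
    placement = place_blocks(split_into_blocks(records, block_size), nodes)
    partials = [map_phase(block) for blocks in placement.values() for block in blocks]
    return reduce_phase(partials)
```

Calling `word_count(["hana hadoop", "hadoop appliance"], ["node1", "node2"], block_size=1)` would place one record on each node and merge the two partial counts. The point of the sketch is that each map step only needs its own block, which is why adding nodes adds both storage and processing capacity.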
(Coming back to our story now)
Now that enterprises have HANA, a data warehousing appliance (Teradata, Netezza etc.) and Hadoop, there is an opportunity to rationalize the landscape and its data footprint. Doing so can unlock real value (millions of dollars) over time:
| ||Existing Landscape||Suggested Landscape (HANA + Hadoop): Retire Appliance|
| ||HANA: DB / data mart / platform; Teradata / Netezza etc.: data warehousing appliance; Hadoop: cost-effective data storage||HANA: DB / data mart / platform; Hadoop: cost-effective data storage; HANA + Hadoop together cover everything delivered by the data warehousing appliance in the current landscape|
| ||HANA + Teradata + Hadoop; managing 3 systems and their dependencies during their product lifecycles; bigger team, as there are more systems and a larger data footprint; 3 skill sets, i.e. a less optimized team; managing 3 partner relationships||HANA + Hadoop; managing 2 seamlessly integrated systems and their dependencies during their product lifecycles; optimized team, as there are fewer systems and a smaller data footprint; 2 skill sets, i.e. a more optimized team; managing 2 partner relationships|
|Associated Service Cost||Low (as you already have this non-optimized, inefficient architecture in place)||Medium (one-time appliance-to-HANA migration cost)|
|Hardware Cost||High, as 3 different hardware stacks need to be managed, with an expanding data footprint and redundancy, through the lifecycles of three solutions||Optimized, as 2 hardware stacks need to be managed, with an optimized data footprint (HANA), through the lifecycles of two solutions|
|Data Storage Cost||Data is distributed across the enterprise system (HANA), the appliance (Teradata) and Hadoop, with data redundancy, integration challenges and the cost associated with them||Data is distributed across the enterprise system (HANA) and Hadoop with almost no data redundancy, leveraging the integration capability of HANA and Hadoop|
|Real-Time Insight||Business cost associated with not having real-time insights||Seamless integration of real-time enterprise data and stored data (Hadoop) ensures insights are delivered at the right time|
|Scalable Digital Architecture||Cost associated with data challenges and migrations when executing digital transformation projects, plus dependencies on 3 disparate systems||Digital-ready, scalable HANA + Hadoop architecture|
To unlock the value, as the title of this blog goes, enterprises need to decide whether to continue feeding both elephants or to feed just one. The right decision is to feed one, and to feed the agile and evolving one (read: HANA) rather than the old, jaded giant (read: the data warehousing appliance). One can always argue that appliances are also evolving and offering a much better proposition, e.g. integration with HANA. I agree to some extent (in my view what they are doing is patch-up work, while HANA was built from scratch), but if value is to be unlocked, a decision needs to be taken between the two elephants. Considering how entrenched HANA is in the SAP ecosystem, the decision is simple: further monetize the investment in HANA, and let the old elephant retire and rest in his shed!!!
Please do share your views!!!
(The thoughts shared are purely mine and may not reflect those of my organization.)