Data Lake from SAP
Many SAP customers worldwide have invested in an enterprise data warehouse or data mart built on SAP HANA. They have gained tremendous benefits from HANA's in-memory capability as well as its speed and agility. However, data volumes continue to grow exponentially, driven by data from streaming applications, artificial intelligence, and machine learning. New data sources also include IoT devices, customer behavior data, and social media. Recently we worked with a customer who uses SAP HANA for its enterprise data warehouse and has also brought Snowflake into its landscape. Unfortunately, data tiering had not been implemented in that landscape. Another challenge was data growth: the customer anticipated that data volume would double in the next 1.5 years. Keeping everything in HANA is therefore an expensive affair for the customer.
I also see that many SAP customers have started investing in data lakes on AWS, Azure, or Google Cloud, and in Snowflake as well. Some of these customers have also invested in a Hadoop data lake even though 70% of their data comes from SAP (S/4HANA, BW/HANA).
So what option should we suggest to SAP customers using SAP HANA? Should they invest in Snowflake or build a data lake on AWS/Azure/GCP? Is there an option available from SAP?
Yes, there is an option from SAP to meet such requirements: HANA Data Lake (HDL).
SAP announced HDL in April 2020. It is part of SAP HANA Cloud services and is cost-effective as well.
What is HDL
SAP HANA Cloud offers low-cost storage options, including SAP HANA native storage extension and a built-in relational data lake.
Customers can keep current, business-critical (hot) data in memory for real-time processing and move data that is used frequently but not every day (warm) to the SAP HANA Native Storage Extension (NSE). For older but still important (cold) data, customers can use the HANA Data Lake (IQ) and still retain access to that data when and where they need it. This data tiering helps reduce cost and gives customers the freedom to choose where to store data based on when they need it.
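As a rough illustration of the hot/warm/cold split described above, the following Python sketch assigns data to a tier based on how recently it was accessed. The 90-day and 730-day thresholds and the function name are assumptions made purely for illustration; they are not SAP defaults, and real tiering decisions are usually made per table or partition according to business rules.

```python
from datetime import date

# Hypothetical thresholds, chosen only for illustration (not SAP defaults).
HOT_MAX_AGE_DAYS = 90     # accessed within about 3 months: keep in memory
WARM_MAX_AGE_DAYS = 730   # accessed within about 2 years: move to NSE


def choose_tier(last_accessed: date, today: date) -> str:
    """Return the storage tier for data last accessed on `last_accessed`."""
    age_days = (today - last_accessed).days
    if age_days < 0:
        raise ValueError("last_accessed is in the future")
    if age_days <= HOT_MAX_AGE_DAYS:
        return "hot: HANA in-memory"
    if age_days <= WARM_MAX_AGE_DAYS:
        return "warm: Native Storage Extension"
    return "cold: HANA Data Lake"


if __name__ == "__main__":
    today = date(2020, 6, 1)
    for accessed in (date(2020, 5, 1), date(2019, 6, 1), date(2015, 1, 1)):
        print(accessed, "->", choose_tier(accessed, today))
```

In practice the age criterion would likely be replaced by whatever the business considers "current", for example fiscal year or document status.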
HDL is a relational data lake, which means it is an SAP IQ database deployed in the cloud. It provides processing similar to the data lake offerings on Azure or AWS. It offers excellent compression: 10x compression of existing data, which saves storage cost. It can store structured as well as unstructured data. HDL can be enabled in an existing HANA Cloud instance or provisioned with a new HANA Cloud instance. More storage space can be added to HDL at any time. It also shares HANA Cloud security and provides features such as data encryption, audit logging, and tracking of data access.
- Integrated into HANA Cloud Instance
- Automatically provisioned and administered with HANA Cloud
- Based on SAP IQ technology
- Elastic scale, independent of HANA DB. Designed to scale up to petabytes of data
- High speed ingestion enablement
- Ability to analyze data with excellent performance
- Access to cloud storage (e.g. AWS S3, GCP Cloud Storage)
- Ingest any data from cloud or on-premises data sources
- Easy to set up and use (single access layer in HANA Cloud)
- Low TCO
- Fast analytic processing through columnar architecture
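To get a feel for what the 10x compression figure would mean for a landscape like the one described earlier, where data volume is expected to double in 1.5 years, here is a back-of-the-envelope Python sketch. The 10 TB starting volume and the 60% share of cold data are made-up inputs, and the 10x ratio is simply taken from the claim above; actual compression depends on the data.

```python
def projected_volume_tb(start_tb: float, years: float,
                        doubling_years: float = 1.5) -> float:
    """Volume after `years`, assuming it doubles every `doubling_years`."""
    return start_tb * 2 ** (years / doubling_years)


def lake_footprint_tb(volume_tb: float, cold_share: float,
                      compression: float = 10.0) -> float:
    """Compressed size of the cold portion once moved to the data lake."""
    if not 0.0 <= cold_share <= 1.0:
        raise ValueError("cold_share must be between 0 and 1")
    return volume_tb * cold_share / compression


if __name__ == "__main__":
    start_tb = 10.0     # hypothetical starting volume
    cold_share = 0.6    # hypothetical fraction of data that is cold
    for years in (0.0, 1.5, 3.0):
        total = projected_volume_tb(start_tb, years)
        kept_in_hana = total * (1 - cold_share)
        in_lake = lake_footprint_tb(total, cold_share)
        print(f"after {years} years: total {total:.1f} TB, "
              f"kept in HANA {kept_in_hana:.1f} TB, "
              f"data lake footprint {in_lake:.1f} TB")
```

Doubling in 1.5 years corresponds to roughly 59% growth per year, so the share of data that can be moved out of memory matters more with every planning cycle.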
Typical Use Cases
- Customers who have implemented SAP HANA on premise and can choose HANA Cloud as a hybrid option
- Customers who are looking for a cost-effective data lake solution
- Customers who are moving forward with their cloud journey
- Customers whose data volume is increasing and who have challenges controlling data volume as well as cost
- Customers who are looking for cost-effective storage without affecting performance
To summarize, SAP customers who are on their cloud journey should start evaluating SAP HANA Cloud, which offers HDL as an option. Rather than investing in a data lake on AWS, Azure, Google Cloud, or Hadoop, they should invest in HDL. HDL offers a cost-effective data lake solution and also provides excellent performance, with fast access to data whenever it is needed. With HDL, cost becomes much less of a concern for customers. So if you are investing in a data lake, think of SAP HANA Cloud, as it provides a built-in HANA data lake. Keep exploring!
Can HDL take data in any format, like a Hadoop data lake does?
Yes, it can take structured as well as unstructured data.