
Introduction to the Relational Data Lake Service DAT106 #SAPTechEd Recap

Source: SAP

Presented at SAP TechEd

Below are my rough notes.

Source: SAP

The legal disclaimer applies: anything future-looking is subject to change.

 

 

Source: SAP

SAP HANA Cloud provides data virtualization (federation, caching, and metadata) and persistence (in-memory, disk, and relational data lake)

Elastic, in an HTAP (hybrid transactional/analytical processing) environment

One place to access data

Applications – Data Warehouse Cloud, SAP Analytics Cloud

Source: SAP

Powers all applications, including SAP, custom, and third-party

Designed to be the entry point to the landscape, across different sources of data

Simplify data management processes

Source: SAP

Today there is a push for companies to move to the cloud; in reality, the gap between business expectations and the challenges of data management across enterprises was the primary motivation for SAP HANA Cloud.

It offers a single gateway to all data and related computation across a company.

HANA qualities – in-memory, multi-model, HTAP, real-time processing of data on demand – regardless of size, location or complexity, with easy integration – a unified access layer to simplify data processing and harmonize data integration work (source: SAP)

Point of virtualization is streamlined access – connect to information without having to move it to a single location
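To illustrate that streamlined access: in SAP HANA, federation surfaces a remote object as a virtual table in the local catalog, so queries run against it in place. The sketch below only composes the SQL string; every identifier (remote source, database, schema, table) is invented for illustration, and the exact syntax should be checked against the SAP HANA SQL reference.

```python
# Hypothetical sketch: compose the DDL that maps a remote table into the
# local HANA catalog as a virtual table. No data is moved by this statement;
# queries against the virtual table are federated to the remote source.

def create_virtual_table_sql(local_name: str, remote_source: str,
                             remote_db: str, remote_schema: str,
                             remote_table: str) -> str:
    """Build a CREATE VIRTUAL TABLE statement (identifiers are placeholders)."""
    return (
        f'CREATE VIRTUAL TABLE "{local_name}" '
        f'AT "{remote_source}"."{remote_db}"."{remote_schema}"."{remote_table}"'
    )

sql = create_virtual_table_sql("V_ORDERS", "MY_REMOTE", "NULL", "SALES", "ORDERS")
print(sql)
```

Once the virtual table exists, it can be queried like any local table, which is the point of the "connect without moving" model.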

It is fully customizable – development environment to build custom applications

Power of the cloud – control TCO via elasticity and implements serverless principles, HA and autonomous behavior

Source: SAP

How did we get to today?

Look at all deployment options; SAP HANA offers choices – from maintaining full control of your hardware and software installations to relying on SAP to fully manage them in any one of the major hyperscale cloud providers’ clouds.

In the cloud, you can decide to use your existing licenses and run SAP HANA instances in public clouds where an Infrastructure as a Service provider provisions and manages the hardware for you, while you are fully responsible for managing SAP HANA. (source: SAP)

Source: SAP

Alternatively, you can use SAP HANA as a Service, fully managed by SAP in a Platform as a Service environment. You sign a contract with SAP, and then you can choose any of the supported providers – e.g., Microsoft Azure or Google Cloud. You can activate and monitor your SAP HANA as a Service installation via SAP Cloud Platform Cockpit.

Finally, SAP HANA Cloud, coming soon, will offer a true cloud-native DBaaS with advanced capabilities, including automatically managed multi-tier storage options. (source: SAP)

 

Source: SAP

Illustrating where the relational data lake fits

Look at data in the enterprise in pyramid fashion

Top – where the highest-value data lives; most value, real-time access, kept in memory in HANA

As data loses value over time, it no longer needs real-time access, but should remain available with good performance for reporting and data science

HANA Cloud – move data out of memory via native storage extensions, down to the data lake

As you go down the pyramid, the cost of keeping data around decreases, while still enabling good performance

Moving up the stack: in an IoT scenario, for example, data arrives with unknown value – don’t store it in HANA right away. Take the data from raw storage and process it into the data lake as a first step, so you can query and analyze it, then decide how important it is; if it is important, move it up the pyramid
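The tiering idea behind the pyramid can be shown as a toy routing rule. The thresholds and tier labels below are invented for illustration; the point is simply that access "temperature" decides where data lives.

```python
# Minimal sketch of pyramid tiering: route data to a storage tier by how
# recently it was accessed. Thresholds (30/365 days) are invented values.

def choose_tier(days_since_last_access: int) -> str:
    if days_since_last_access <= 30:
        return "in-memory (HANA)"            # hot: highest value, real-time
    if days_since_last_access <= 365:
        return "native storage extension"    # warm: on disk, still in HANA
    return "relational data lake"            # cold: lowest cost, good performance

print(choose_tier(7))    # hot data stays at the top of the pyramid
print(choose_tier(400))  # aged data moves down to the data lake
```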

Source: SAP

HANA Cloud is a cloud-native service with elastic scalability for both compute and storage – increase/decrease the resources required at any given time, allowing full control of costs
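The cost-control claim is easiest to see with a little arithmetic. The prices and hourly demand below are invented; the comparison is between provisioning for peak all day versus scaling compute with demand.

```python
# Toy cost model for elasticity (all numbers invented):
# fixed provisioning pays for peak capacity every hour; elastic provisioning
# pays only for the compute units actually used each hour.

def fixed_cost(peak_units: int, hours: int, price_per_unit_hour: float) -> float:
    return peak_units * hours * price_per_unit_hour

def elastic_cost(units_per_hour, price_per_unit_hour: float) -> float:
    return sum(units_per_hour) * price_per_unit_hour

usage = [2, 2, 2, 8, 8, 2]                 # units needed in each of 6 hours
print(fixed_cost(8, len(usage), 1.0))      # provision for the peak: 48.0
print(elastic_cost(usage, 1.0))            # scale with demand: 24.0
```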

Source: SAP

Provides a simplified working environment – makes virtual access to data secure and easy to navigate, without moving/replicating it

Source: SAP

The concept of a data lake, applied to relational data

Industry research shows a significant amount of the data in data lakes is structured – and has value

The data lake is a source of value to the enterprise

Source: SAP

Relational data lake as a component of HANA Cloud – built in, provisioned

Scalable in terms of compute and storage

Allows fast data access, loading, and query processing

Secure model; security is not reimplemented

Source: SAP

Best experience for load-once, analyze-many, low-TCO data

A capability of the HANA Data Fabric, offering a singular experience as the gateway to all data

  • Supports the bottom tiers of a data pyramid with “good enough” performance at lowest cost
  • Offers a singular experience to the user with integrated security, tenancy, and tools
  • Large scale data analysis with full SQL support for all levels of the data pyramid

A managed cloud service for large or low-value data

  • Scales from terabytes to multi-petabyte
  • Automated deployment and operations (K8S) aligned with the ONE Cloud Infrastructure team
  • Efficient disk-optimized relational store based on IQ
    - Automatic storage tiering between EFS and S3 to balance costs
    - Local storage (SSD or HDD) for caching and performance optimization
  • Elastic and separate provisioning of compute and storage
  • Scalable to match costs to changes in data volume, user count, and complexity of workloads
  • “Structure imposed at ingest time” vs “structure imposed at query time”
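The last bullet, "structure at ingest time" vs "structure at query time", can be shown in miniature: either parse records into a typed schema when loading, or keep the raw text and impose a schema per query. The CSV rows below are invented sample data.

```python
# "Structure imposed at ingest time" vs "structure imposed at query time",
# in miniature, over two invented CSV rows (date, request count, hours).

raw = ["2019-11-01,42,3.5", "2019-11-02,17,2.0"]

# Ingest-time structure: pay the parsing cost up front, get typed rows.
structured = [(d, int(n), float(h))
              for d, n, h in (line.split(",") for line in raw)]

# Query-time structure: keep raw lines, parse only what each query needs.
def total_count(lines):
    return sum(int(line.split(",")[1]) for line in lines)

print(structured[0][1])   # typed access to the count column: 42
print(total_count(raw))   # schema applied only at query time: 59
```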

Cloud ready

  • Effective use of cloud storage & compute

Elasticity and scalability

  • Add / remove compute or storage on the fly
  • Grow to handle the largest data size, number of users, and workload complexity

TCO and Automation

  • Efficient use of cloud services, storage & compute
  • Fully automated test, deployment and most operations
  • Customer requests are fully self-service
  • Self-healing infrastructure

Performance – query processing

  • Full SQL expressiveness
  • Consuming PB of data
  • Linearly scalable with compute nodes

Secure – a singular experience in HANA Cloud, concepts aligned with HANA Cloud

  • Security (JWT to personalize)
  • Tenancy models
  • Tools (source: SAP)

Source: SAP

The cornerstone of the relational service

SAP IQ has been around for years

A disk-based system with a wide variety of compression algorithms

Has tight integration with SAP HANA

SAP IQ is a mature technology

Source: SAP

 

How SAP HANA & SAP IQ fit together (source: SAP)

Business users need revenue-generating insights that can only be gained from real-time access to both the burgeoning volumes of data flooding the organization in combination with the ever-growing stores of historical data. Summarized data is no longer sufficient… Unfortunately, these volumes of granular data strain storage and processing resources, making it difficult for organizations to get the fast and accurate decisions they need from data analysis.

Forward-looking organizations are moving away from traditional databases. Instead, they’re combining the real-time comprehensive analytics of SAP HANA with historical data cost-effectively held in an SAP IQ Near-line storage (NLS). This integrated solution delivers the performance and responsiveness business users demand, while keeping IT storage and maintenance costs in check; a balance of performance and cost.

(source: SAP)

Source: SAP

A real-world scenario today

Characteristics of telco analytics applications:

  • Analytics on a large volume of structured Call Data Records (CDRs)

  • Answers needed in a hurry

  • Ad hoc, complex queries that require optimization

  • Storage of sensitive data

  • Comprehensive profile of the customer (source: SAP)

Source: SAP

Provision data lake service

You do not have to provision it separately if you are using HANA Cloud

Any tool that works with HANA will work with the data lake
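Since the data lake surfaces through the regular HANA SQL interface, any HANA client can query it. The sketch below uses SAP's Python driver (hdbcli); the host, credentials, and table name are placeholders, and the connection code is only called when you supply a real endpoint.

```python
# Sketch: query the data lake through a standard HANA client (hdbcli).
# Table name, host, and credentials below are invented placeholders.

def lake_query(table: str, limit: int) -> str:
    """Plain SQL, same as for any HANA table."""
    return f'SELECT * FROM "{table}" LIMIT {limit}'

def run(host: str, port: int, user: str, password: str):
    from hdbcli import dbapi  # SAP's HANA Python client
    conn = dbapi.connect(address=host, port=port, user=user, password=password)
    try:
        cur = conn.cursor()
        cur.execute(lake_query("LAKE_EVENTS", 10))
        return cur.fetchall()
    finally:
        conn.close()

print(lake_query("LAKE_EVENTS", 10))
```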

Ability to ingest from and export to cloud storage

Source: SAP

Tenancy concept:

-Customer provisions one SQL Data Lake and one or more HANA compute nodes

-Each HANA instance binds one or more times

-Different colors are different virtual data warehouses

-Customer can allow read-only sharing (Source: SAP)
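The tenancy bullets above can be modeled as a toy data structure: one data lake, multiple HANA nodes bound to it, with optional read-only sharing. All names below are invented.

```python
# Toy model of the tenancy concept: one SQL Data Lake, several HANA compute
# nodes bound to it, some bindings read-only (shared). Names are invented.
from dataclasses import dataclass, field

@dataclass
class Binding:
    hana_node: str
    read_only: bool = False   # customer can allow read-only sharing

@dataclass
class DataLake:
    bindings: list = field(default_factory=list)

    def bind(self, hana_node: str, read_only: bool = False) -> None:
        self.bindings.append(Binding(hana_node, read_only))

lake = DataLake()
lake.bind("hana-dw-1")                        # a read/write virtual data warehouse
lake.bind("hana-reporting", read_only=True)   # shared, read-only consumer
print(len(lake.bindings))  # 2
```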

Source: SAP

HANA Database – Dimension tables

  • Customer usage data and billing information
  • Media information (track, album, artist names)

HANA Data Lake – Fact Tables

  • Specific track requests and listening data
  • Track names, date and time requested, time played, etc…

Historical data ingested from hyperscaler object storage

  • S3 in this demo
  • Performance driven by provisioned compute

-Minimize cost by dynamically increasing/decreasing compute to manage ingest performance
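Bulk ingest from object storage is typically expressed as a LOAD statement against the IQ-based store. The sketch below only composes an illustrative statement; the table name, bucket path, and option clauses are invented, and the real syntax should be taken from the SAP IQ / data lake documentation.

```python
# Illustrative only: compose a bulk-load statement pulling CSV files from
# hyperscaler object storage (S3 in this demo). Path and options invented.

def load_from_s3_sql(table: str, s3_url: str) -> str:
    return f"LOAD TABLE {table} FROM '{s3_url}' FORMAT CSV"

stmt = load_from_s3_sql("TRACK_REQUESTS", "s3://example-bucket/cdr/2019/part1.csv")
print(stmt)
```

Per the notes above, ingest performance is driven by provisioned compute, so the same statement runs faster or slower depending on how many compute units are currently allocated.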

Streaming data inserted directly into HANA Data Lake

Live analytics run directly off the streaming server, reducing latency and unburdening the transactional system (Source: SAP)

Demo showed SAP Analytics Cloud against the HANA Data Lake

Source: SAP

How everything fits together
