SAP HANA Data Strategy: HANA Data Modeling a Detailed Overview
This is part of the HANA Data Strategy series of BLOGS
Why talk about HANA Data Modeling?
HANA data modeling is what drives real-time analytics while lowering the total cost of ownership (TCO) and time to deliver (TDL) business insights using the SAP HANA digital business platform.
Another reason to talk about HANA data modeling now is that many customers are moving from HANA 1.0 to HANA 2.0. These customers need a deeper understanding of the newer HANA Deployment Infrastructure (HDI) architecture to take advantage of the advanced features it adds. In addition, there are no comprehensive sources on HANA data modeling, even though virtual calculation view models are at the heart of the HANA Intelligent Digital Business Platform.
Once customers adopt the full set of HANA virtual data modeling features, they can drive the latency out of their traditional approaches to analytics, creating a true real-time environment while significantly lowering the TCO of the solution by eliminating the need for traditional ETL, change data capture, data streaming and modeling infrastructures. Real-time here means that as soon as an event (order, meter reading, phone call) occurs in any system in a company’s enterprise, the HANA Intelligent Digital Business Platform will analyze it and deliver insight based on that event to the business.
In this blog, I give a detailed overview of HANA data modeling through a set of recorded presentations and demos. They explain all aspects of HANA data modeling, from harmonization with calculation views to data replication with replication tasks, and show how to do the basics. I will follow up with more advanced modeling presentations and demos in future blogs.
I will also give a detailed overview of the old HANA repository and the new HANA Deployment Infrastructure (HDI) architectures and the development tools used to build models in each. I will then explain the basics of the HDI architecture and demo how to:
- build a project,
- access classic schema tables from HDI using synonyms,
- access HDI artifacts like calculation views from external users (SAC),
- build a calculation view with a star join,
- union and join distributed tables in calculation views.
Introduction to HANA Data Modeling
This presentation is an overview of all aspects of HANA data modeling and data access. It explains all the different types of models that we use in HANA to:
- Ingest the enterprise data in real-time,
- Harmonize and analyze data in real-time, and
- Deliver business insights in real-time.
Throughout these presentations I refer to HANA classic modeling as HANA 1 modeling and HDI modeling as HANA 2 modeling.
Overview of HANA classic and HANA Deployment Infrastructure (HDI) architectures and tools
This presentation is an overview of the differences between HANA classic modeling, with its development tools Studio and Workbench, and the newer HDI modeling, with its tool WebIDE. It also gives an architectural overview of the new HDI containers.
Creating an HDI container and accessing a HANA classic schema from HDI container
This presentation covers the steps for creating an HDI container via a WebIDE database project. It then walks through granting an HDI container access to a HANA classic schema and its tables, and creating a simple calculation view to access those tables.
In the new HDI container architecture, developers can store views, calculation views, procedures and tables. This is unlike the classic HANA repository (_SYS_BIC schema), which stored only calculation views and the older analytic views, while tables were stored in other (classic) schemas. All of the SAP application data is still stored in classic schemas and tables, so a customer's calculation views created in HDI containers will need access to tables stored in HANA classic schemas.
The basic steps are:
- grant permissions for the HDI container's #OO (object owner) user to access HANA classic tables,
- create an external HANA service in your HDI container schema,
- create an .hdbsynonym file to set up local synonyms for each external table, and
- use these synonyms in a calculation view.
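The first step above can be sketched in plain SQL roughly as follows. This is only an illustration under stated assumptions: "CLASSIC_SCHEMA" and "MY_CONTAINER" are hypothetical names, and the statement must be run by a user who holds SELECT on that schema with grant option. Many projects express the same grant declaratively through the external service rather than granting directly.

```sql
-- Assumption: "CLASSIC_SCHEMA" is a hypothetical classic schema and
-- "MY_CONTAINER#OO" is the object owner of a hypothetical HDI container.
-- WITH GRANT OPTION lets the object owner pass SELECT on to consumers
-- of the calculation views it builds on top of these tables.
GRANT SELECT ON SCHEMA "CLASSIC_SCHEMA" TO "MY_CONTAINER#OO" WITH GRANT OPTION;
```

Granting on the whole schema keeps the sketch short; granting on individual tables is the narrower alternative if only a few tables are needed.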
This is a demonstration of all of this in action.
Granting access for SYSTEM HANA user id to HDI calculation views and tables
This presentation and demo show how to grant access to the calculation view we just created in the HDI container to the SYSTEM user ID, which is external to the HDI environment.
The basic steps to grant SYSTEM user id SELECT access to the GYDEMO_HDI_DB_1 container schema:
- Open an Admin SQL Console of container GYDEMO_HDI_DB_1
- Then grant permissions with these 4 steps:
- set schema "GYDEMO_HDI_DB_1#DI";
- create local temporary column table "#PRIVILEGES" like "_SYS_DI"."TT_SCHEMA_PRIVILEGES";
- insert into "#PRIVILEGES" ("PRIVILEGE_NAME", "PRINCIPAL_SCHEMA_NAME", "PRINCIPAL_NAME") values ('SELECT', '', 'SYSTEM');
- call "GYDEMO_HDI_DB_1#DI"."GRANT_CONTAINER_SCHEMA_PRIVILEGES"("#PRIVILEGES", "_SYS_DI"."T_NO_PARAMETERS", ?, ?, ?);
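Once the call succeeds, a quick check can be run as SYSTEM. This is a sketch only: the calculation view name "CV_DEMO" is a hypothetical placeholder, and the first query assumes the connected user is allowed to read the SYS.GRANTED_PRIVILEGES system view.

```sql
-- Confirm that SYSTEM now holds SELECT on the container schema.
SELECT "GRANTEE", "SCHEMA_NAME", "PRIVILEGE", "GRANTOR"
  FROM "SYS"."GRANTED_PRIVILEGES"
 WHERE "GRANTEE" = 'SYSTEM'
   AND "SCHEMA_NAME" = 'GYDEMO_HDI_DB_1';

-- Read from the calculation view through the container schema.
-- "CV_DEMO" is a hypothetical name; use the view deployed in your container.
SELECT * FROM "GYDEMO_HDI_DB_1"."CV_DEMO" LIMIT 10;
```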
Recorded presentation and demo:
HDI supported calculation views
This presentation and demo will cover the following topics:
- Discuss the supported HDI calculation views: CUBE, CUBE with star join and table function
- Discuss differences between classic scripted calculation view and a table function
- Demo the creation of a table function
- Demo the creation of a CUBE calculation view with a star join
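To make the table function topic concrete, here is a minimal sketch of what an .hdbfunction design-time file might contain. The function name, parameter, columns and the "ORDERS" object are all hypothetical; "ORDERS" is assumed to be a table or synonym that already exists in the container.

```sql
FUNCTION "TF_OPEN_ORDERS" (MIN_AMOUNT DECIMAL(15,2))
RETURNS TABLE ("ORDER_ID" INTEGER, "CUSTOMER_ID" INTEGER, "AMOUNT" DECIMAL(15,2))
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER AS
BEGIN
  -- The result must match the RETURNS TABLE signature column for column.
  RETURN
    SELECT "ORDER_ID", "CUSTOMER_ID", "AMOUNT"
      FROM "ORDERS"   -- hypothetical table or synonym in the container
     WHERE "AMOUNT" >= :MIN_AMOUNT;
END;
```

A function like this can then be used as a data source in a calculation view, which is the role a scripted calculation view played in classic modeling.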
More on this subject will be covered in a later BLOG focused completely on the migration tools used to migrate the HANA classic artifacts to HANA HDI artifacts.
Recorded presentation and demo:
Creating calculation views using remote tables in multiple servers
This presentation and demo will cover the following topics:
- Discussion and demo of running calculation views using virtual tables from multiple remote servers,
- Demo how to build a UNION of calculation views built on remote tables using trusted techniques, and
- Discuss the issues of joining remote tables and techniques to make it work.
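The union topic can be illustrated in plain SQL, outside of calculation views. This is a sketch under assumptions, not the technique shown in the demo: the remote sources "REMOTE_EU" and "REMOTE_US" are assumed to exist already as HANA-to-HANA remote sources, and all schema, table and column names are hypothetical.

```sql
-- Virtual tables pointing at the same table on two remote servers.
CREATE VIRTUAL TABLE "MYSCHEMA"."VT_ORDERS_EU" AT "REMOTE_EU"."<NULL>"."SALES"."ORDERS";
CREATE VIRTUAL TABLE "MYSCHEMA"."VT_ORDERS_US" AT "REMOTE_US"."<NULL>"."SALES"."ORDERS";

-- UNION ALL with a constant column that identifies each branch.
-- A filter on "SOURCE_SYSTEM" may allow a branch to be skipped entirely,
-- which avoids a round trip to the server that cannot contribute rows.
CREATE VIEW "MYSCHEMA"."V_ORDERS_ALL" AS
  SELECT 'EU' AS "SOURCE_SYSTEM", "ORDER_ID", "AMOUNT" FROM "MYSCHEMA"."VT_ORDERS_EU"
  UNION ALL
  SELECT 'US' AS "SOURCE_SYSTEM", "ORDER_ID", "AMOUNT" FROM "MYSCHEMA"."VT_ORDERS_US";
```

A union keeps each branch on its own server, whereas a join across the two virtual tables would need rows from one server to be moved before they can be matched against the other. That is why joins on remote tables need more care.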
Recorded presentation and demo:
After watching these presentations and demos customers should be able to explain the differences between the HANA classic modeling and the newer HDI modeling. Customers should be able to create an HDI architecture accessing HANA classic schemas and tables and be able to access those tables from HDI calculation views. Customers should also understand some of the basic techniques for using calculation views with star joins in a distributed environment.
The knowledge in this and future BLOGs will allow customers to quickly and easily create powerful models harmonizing their multiple silos of application data into a single enterprise view that can deliver insights to their business users in real-time.
SAP HANA Data Strategy BLOGs Index
- SAP HANA Data Strategy: HANA Data Modeling a Detailed Overview
- HANA Data Strategy: Data Ingestion including Real-Time Change Data Capture
- HANA Data Strategy: Data Ingestion – Virtualization
- HANA Data Strategy: HANA Data Tiering
Thanks for the nice collection of material.
After clicking on any of the video links all I get is a redirect to the general sap.com page. Why is that?
I'd love to see the videos.
Anyhow, more interesting to me is this:
I've yet to find a good explanation from SAP for what additional benefits the HDI concept delivers to data modeling. Up to now, all I can see is that I need to jump through more hoops (i.e. grant privileges and set things up) just to be able to do data modeling with the new modeling environment.
While "You need it in order to use WEB IDE" is a factually correct answer, it does not address the core point: how does the HDI concept deliver benefits in terms of productivity or value or whatever?
I would expect that if there is this high hurdle of getting people to productively work with this tool then there is a profound measurable benefit associated with it that allows users to say: "yes, this is worth the extra time".
So far, all I've seen is that users try to go through the motions of getting HDI to work for them just because they are under the impression that they have to use it with HANA2 (even though the XSC based repository is still fully available).
I concur and would even provide a harsher verdict. HDI is full of logical plot holes.
The goal of HDI vs XSA is to be able to deploy the same application multiple times. Can't be done with XSC because there, the owners are hard coded as _SYS_BIC. What is the main use case for deploying one application multiple times? To support different tenants. You use MDCs for that today, not schema separation.
HDI containers want to control the schema, yet you have to create synonyms for all used schema objects. A user alias would have done the trick. Your HDI container reads data from SAPHANADATA schema, hence you specify at deploy time that this owner SAPHANADATA shall be an alias pointing to MYS4SCHEMA. Today you have to create 100'000 synonyms.
The usability of HDI sucks. Customer first, run simple? Ever heard of these terms? Instead you enter commands like 30 years ago when using a C compiler and get back error messages in text form.
IDP and authorizations are taken from Hana. Yes I know, I get told constantly that this has been requested a lot. Yet nobody is using HDI development for non-Hana applications and this creates just another level of confusion and extra work. I have not seen a single customer using permissions outside of Hana or ABAP. Even if they want to use Kerberos, Hana is configured that way.
It is obvious that XSC has a few conceptual mistakes but at least the simple things can be done easily. And the goal of HDI would have been easy to achieve. Instead of deploying things in the same user, _SYS_BIC, deploy the stuff in the schema of the activating user. If I deploy it in development for me it ends up in my schema, in prod the deployer is a technical user. Aliasing of users for reading.
As a result most of the times I use WebIDE XSA version (not a good UI either by the way), create the runtime artifacts and then copy these as SQL and my programs create the runtime objects by executing the SQL code themselves during install time.
I would love to build a proper solution but it is too big for a single person.
My guess: On Prem XSA WebIDE and HDI containers are dead. No further development, no communication, no replacement. In future you will manage your database by connecting your on-prem Hana with a WebIDE service in the cloud. So SAP is hosting your WebIDE and your Hana Monitor, you do not install XSA locally any longer. Development of tables, views, calcviews is done in WebIDE but how they are deployed, I don't know. The clever move would be to keep the activation plugins but you specify the user to deploy into. Hence my guess is nobody at SAP will touch that and HDI will stay and not be replaced.
In the meantime applications like Successfactors, Hana Data Warehouse Cloud etc continue using the Hana database without HDI as they do today.
Sorry for the rant, years of frustration with HDI did bubble up.
Thanks for chiming in, Werner! I had a feeling, that I would not be alone with this view of HDI.
I think you are spot on with your guess that on-prem HDI is dead. If someone wants to build cloud foundry solutions (or any non-ABAP solutions for that matter) then they'll use liquibase or flyway or whatever schema evolution program they want to use.
Sure, "information models" are not covered by this, but frankly, the main benefit of calculation views etc. nowadays is that they expose (sap-internal) metadata to SAP tools.
Again, you are probably right that HDI won't be removed, even though it is clearly engineered past any users' needs. The next best thing a developer can do is to ignore it where possible and rely on deployment tools that actually deliver benefits without putting obstacles in the way at every single step.
Thanks for your feedback. I wanted to take the time to respond to a few of your points and answer a few of your questions.
In comparison to MDC for tenant isolation, schema-based isolation is simpler, leaner, and needs fewer computing resources, which, in turn, can help to reduce the overall TCO. Additionally, HDI can be easily integrated with the XSA instance manager.
Another use case for HDI is zero-downtime upgrades. This is not possible with XSC due to global users. One further use case is isolated development. With HDI containers, each developer can have a separate working environment, which means that several developers can work in parallel without interfering with each other’s work semantically (for example, overwriting objects or creating inconsistent dependencies) or in terms of speed of deployment (for example, deployments can run in parallel as opposed to the sequential execution of activations in the XSC environment).
Synonyms and projection views can be used to provide a defined interface to the outside of the container to limit the objects that refer to objects outside of the container. If every object in a container could directly access objects outside of the container, it would be very difficult to get an overview of external dependencies because you would have to scan all artefacts for external object references. With HDI, you can look at the synonyms and projection views to determine the external dependencies. Additionally, you can change individual synonym targets dynamically, depending on the target system. There is no need to touch each of the synonyms manually. This is especially useful if your synonyms point to multiple schemas.
Deploying in the schema of the activating user may be a feasible approach in some cases, but it can generally lead to a range of issues, including where to store the artefact metadata, what to do with existing objects in the schema, dependency management in the development and production schemas, and granting privileges for some deployed objects but not others in a manner that will survive an upgrade. If objects are coupled to non-technical users, it would also couple the lifetime of the objects to the lifetime of an individual user. This could cause issues if the user decides to leave the company. The productive approach you describe seems a bit like an HDI container, where the technical user would be the HDI container user.
Using SAP Web IDE to create runtime artifacts, copying them as SQL, and then creating the runtime objects by executing the SQL code during runtime is a valid approach, but I am not sure what the advantage over other methods would be. For calculation views, there is no guarantee that the specific create SQL statements will work in future releases. The official way is to deploy the XML via HDI.
Generally, there is no conceptual difference between XSA on premise and SAP Cloud Platform, Cloud Foundry environment. Even if you’re running completely in the cloud with SAP Web IDE full-stack and SAP Cloud Platform, SAP HANA service, the deployment model will stay the same, and HDI will manage the dependencies for you.
Thanks, appreciate it and read your text twice to make sure I got all points. Obviously I have different views in most areas regarding the pros and cons and their respective importance but these can only be opinions on either side.
We will see what the next SAP announcements will be in that area....
This is a very concise summary of the intended benefits of the HDI concept.
What, I feel, is still missing is the "pudding with the proof".
Where are examples for successful HDI apps that actually have been developed by teams in parallel and where the possibilities for zero-downtime upgrades and easier dependency management outweighed the difficulty of using HDI?
And how does it fit together with all the other development approaches present at customers/users?
SAP has released (pushed) several different development processes to its customer base over the past years.
Each of those (ABAP on HANA, native HANA modeling, XSC, XSA/HDI...) promised to make certain aspects easier/more efficient/lower the TCO (this one always has to be in it, doesn't it?) but required that the team process is organized along the lines of how the tooling works.
Yet, what I see in customers is that the very same people are tasked with handling these tools and come up with a somewhat coherent approach to deliver results for their organizations.
Maybe it is the lack of actual examples of how this reconciliation of tools and development processes has been successfully executed to the benefit of users, developers and the actual reduction of TCO 😉 that leads to the misuse/misunderstanding of those tools - I don't know.
What I do know is that every single HANA user/dev I interacted with heavily gravitated towards HANA Studio and the straightforward concept of "design & activate"-deployment of DB artifacts.
I really appreciate that you took the time to write this very good piece of commentary; I just don't think the intended benefits are realized in most cases.
I have struggled with understanding the "pros" of HDI myself. Ever since I learned from SAP that XSC will be going away in the future, I have been immersing myself in the HDI/XSA/Cloud Foundry/WebIDE concepts to fully understand the differences from XSC/HANA Studio. I learned a lot from Thomas Jung, and he makes great arguments similar to Mr Bregler (above). I think SAP took a fundamental software developer company approach to native HANA development and then promoted that same framework to their customer base. As has been mentioned, I have not seen clients develop HANA applications (HANA models with complex runtime environments) in the way SAP intended.
What bothers me is that SAP is so inconsistent. For instance, you want a modern Data Management Platform and you incorporate tools like SLT and they create XSC schemas in the target system and you have to create services and synonyms to consume the data in an HDI Container. Why not fully enable all the SAP delivered tools to adopt this modern XSA/HDI framework?
Lars, your point around zero-downtime is spot on. Who cares about that when referring to a development environment?
I have been working in the SAP Analytics space for a long time and am passionate about SAP products and services. About a year ago I started working with the XSA Web IDE and have faced a lot of challenges with it. A few points to address: Why make all data modeling so complicated when it was simple using HANA Studio and XSC? Why make authorisation of objects so complicated? Why make synonyms and then use the synonyms again in the flowgraphs?
We are working on an MTA and have used Web, Node.js and HDB modules together, and it's a nightmare to deploy it on QAS using an MTAR file.
Even SAP support is struggling with the issues. I don't know where we are heading. It's just wait and watch.
My apologies for being so direct.
The links are now fixed. Sorry it took so long...
Can you please update the presentation links. None of them works.
The links are now fixed. Sorry it took so long...