Key benefits of Data Modeling in SAP HANA:
Building analytics and data mart solutions using SAP HANA enterprise data modeling offers several benefits compared to traditional data warehousing solutions such as SAP BW.
- Virtual data models with on-the-fly calculation of results, which improves reporting accuracy and requires very limited data storage – powered by in-memory processing, columnar storage and parallel processing
- Ability to perform highly processing-intensive calculations efficiently – for example, identifying the customers whose sales revenue is greater than the average sales revenue per customer
- Real-time reporting, leveraging data replication and access techniques such as SLT and Smart Data Access
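The above-average-revenue example can be sketched in plain SQL; the SALES_ORDERS table and its CUSTOMER_ID and REVENUE columns are hypothetical names used only for illustration:

```sql
-- Sketch only: SALES_ORDERS(CUSTOMER_ID, REVENUE) is a hypothetical table.
-- Return the customers whose total revenue exceeds the average
-- total revenue per customer.
SELECT CUSTOMER_ID,
       SUM(REVENUE) AS TOTAL_REVENUE
FROM SALES_ORDERS
GROUP BY CUSTOMER_ID
HAVING SUM(REVENUE) >
       ( SELECT SUM(REVENUE) / COUNT(DISTINCT CUSTOMER_ID)
         FROM SALES_ORDERS );
```

In a graphical calculation view, the same logic would typically be modeled with an aggregation node joined back against the overall average.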
Apart from HANA sidecar or data mart solutions, HANA modeling also plays an essential role in BW on HANA mixed scenarios, S/4HANA Analytics, Predictive Analytics and native HANA applications.
Objectives of this blog:
In this blog, I would like to share some of the experiences and lessons learned from various projects while implementing HANA modeling solutions. The intent is to provide insights and approaches that can help HANA modelers when they start working on solution design and development. However, it does not cover a detailed explanation of the HANA modeling features.
Requirement analysis and setting expectations:
Understand the reporting requirements of the project clearly and conceptualize the HANA models to be built based on the required KPIs. A few key aspects of the solution design are the KPI definitions, including details such as data sources, dimensions, filters, calculation logic, granularity and data volumes.
At times, business users expect HANA models to deliver the best performance even with wide-open selection criteria and many columns in the output. Even though SAP HANA data models are expected to deliver sub-second response times, we need to be aware that a HANA instance has limited resources (memory, processing engines). Hence it is essential to implement HANA models efficiently, following the performance guidelines.
Test the waters before diving deep: Validate the features, tools and integration aspects
It is always better to start with a prototype of a sample end-to-end solution before proceeding with the full-fledged implementation. This includes steps such as setting up data provisioning using SLT or BODS, building HANA views and consuming the HANA views from the reporting tool. Prototyping helps us verify that all the functionality works as expected, and lets us identify and address connectivity and security related issues beforehand.
It is also recommended to verify the functionality and understand the pros and cons of new features before using them in our models.
Decision criteria for HANA Modeling approaches:
In recent releases such as SPS 10 and SPS 11, the HANA modeling functionality has been greatly enhanced with several features to cover various complex requirements. Always try to implement the models using graphical calculation views unless there are specific requirements that can only be implemented using SQLScript. In general, SQLScript-based calculation views should only be needed for scenarios such as complex lookups and calculations, recursive logic and the like.
While creating graphical calculation views, we need to implement the entire logic virtually using various nodes in different layers. Building complex and effective HANA views requires innovative thinking, solid data modeling skills and a very good understanding of the different SQL statements.
Implement your HANA modeling views using the features supported by the current support pack / revision level, and also consider SAP's guidelines and future road map:
For instance, SAP suggests that calculation views are to be implemented for most of the requirements:
- Dimension type calculation views: To model master data or to implement “value help” views for variables
- Star join type calculation views: As an alternative to analytic views
- Cube type calculation views: Mainly for reporting, with measures and aggregations
There are a few scenarios where we have to decide between “on-the-fly calculation” and “persistence of the results”. For highly processing-intensive calculations where real-time reporting is not essential, and for scenarios like calculating and storing snapshots of results (such as weekly inventory snapshots), we have to implement the logic using SQLScript stored procedures in HANA to persist the results in a table. Subsequently, a simple view can be built on this table to enable reporting.
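A minimal sketch of such a snapshot procedure, assuming hypothetical INVENTORY and INVENTORY_SNAPSHOT tables, could look like this:

```sql
-- Sketch only: all table and column names are hypothetical.
-- Persist an aggregated inventory snapshot with the current date,
-- so a simple reporting view can be built on INVENTORY_SNAPSHOT.
CREATE PROCEDURE PERSIST_INVENTORY_SNAPSHOT()
  LANGUAGE SQLSCRIPT AS
BEGIN
  INSERT INTO INVENTORY_SNAPSHOT (SNAPSHOT_DATE, MATERIAL, PLANT, STOCK_QTY)
  SELECT CURRENT_DATE, MATERIAL, PLANT, SUM(STOCK_QTY)
  FROM INVENTORY
  GROUP BY MATERIAL, PLANT;
END;
```

The procedure can then be executed on a weekly schedule, and the reporting view simply selects from INVENTORY_SNAPSHOT filtered by SNAPSHOT_DATE.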
Seeing Is Believing: Data validation is crucial
Prepare comprehensive data validation and test plans for your HANA views. We can leverage different techniques to ensure that a HANA view produces results exactly as per the requirement. Ensure that your test cases cover the validation of attributes and measures along with any filters, calculations and aggregations, counters, currency conversions etc.
Below are the key tools and techniques to perform data validation of HANA views:
- Data preview option: Using the data preview option at the HANA view level, and also at the individual node level, is the simplest way to validate data during the development of HANA views. Leverage options such as raw data with filters, distinct value analysis and generating the SQL statement from the log to perform different types of validation
- Custom SQL queries: We can write and execute custom SQL queries in the HANA studio SQL editor and compare them against the results of the HANA view to ensure that the results match. Here we can leverage various types of SQL statements to perform complex data validations – for example, comparing the data between the HANA view and the base tables
- Reporting from Excel, Analysis for Office or other reporting tools: For validating larger data volumes and semantics (labels, formatting etc.), we can leverage tools like Analysis for Office
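As an illustration of the custom-SQL technique, the following query compares row counts and totals between a view and its base table side by side; the names "MYPACKAGE/CV_SALES" (activated calculation view) and SALES_ORDERS (base table with a REVENUE column) are hypothetical:

```sql
-- Sketch only: view, table and column names are hypothetical.
-- Compare the record count and the revenue total of the activated
-- calculation view against the base table in a single result set.
SELECT 'VIEW' AS SOURCE, COUNT(*) AS ROW_COUNT, SUM("REVENUE") AS TOTAL
FROM "_SYS_BIC"."MYPACKAGE/CV_SALES"
UNION ALL
SELECT 'TABLE', COUNT(*), SUM("REVENUE")
FROM "SALES_ORDERS";
```

If the two rows differ, drilling down by a dimension (adding it to both SELECTs and a GROUP BY) usually narrows down where the view logic deviates.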
Be conscious about the input data – a few important aspects of data provisioning:
Identify the list of tables to be imported from different sources using SLT or other data provisioning tools and assess the memory requirements. To ensure optimal utilization of the HANA database, it is advisable to replicate only those tables which are essential to meet the requirements.
A few options to optimize SLT table replication:
- Leverage the BW objects (DSOs or InfoObjects) if the corresponding data is already available in the connected BW schema – this saves space in HANA as we avoid the table replication
- Apply filters to avoid replicating unwanted data from large tables into HANA via SLT
Try to leverage the transformation capabilities at the SLT or BODS level, wherever feasible. Especially in the scenarios where we need to filter the data model based on a calculated column, it would be ideal to derive this calculated column during the data provisioning.
Smart tools will enable better Productivity:
There are several tools and options available in HANA studio which help us maintain views in a simplified manner and increase productivity. Leverage these tools and features while building and maintaining HANA views.
Listed below are some of these tools and their utility in HANA modeling process:
- Show lineage (columns): helps us to trace the origin of attributes and measures in HANA views.
- Replacing nodes and data sources (in graphical views): To replace a node (projection, join, etc.) with a different node, or replace a data source (view or table) with a different view or table within a modeled view
- Import columns from Table type or flat file (For script based Calculation views): This will simplify the creation of output structure for a script based calculation view – instead of manually maintaining the output columns, we can import the column details from an existing table or view
Note: The following options are available when we right click on a HANA view:
- Generate Select SQL – Using this option we can get the generated SELECT statement for any of the HANA views, which can be customized and executed from the SQL editor
- Refactoring views: Using this option we can move the views across the packages, which automatically adjusts the inner views to reflect the new package.
- Where-used list: To identify the list of objects where the current view has been used and assess the impact of any changes
- Auto-documentation: To generate the documentation of a modeled view, which can be leveraged as part of the technical documentation
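For example, a statement obtained via Generate Select SQL can be customized and re-executed from the SQL editor; the view name and input parameter below are hypothetical, and the PLACEHOLDER syntax is how values are passed to the input parameters of a calculation view:

```sql
-- Sketch only: view name, parameter name and columns are hypothetical.
-- Query an activated calculation view, supplying a value for the
-- input parameter P_YEAR and aggregating the result by region.
SELECT "REGION", SUM("REVENUE") AS "REVENUE"
FROM "_SYS_BIC"."MYPACKAGE/CV_SALES"
     ( PLACEHOLDER."$$P_YEAR$$" => '2016' )
GROUP BY "REGION";
```

Starting from the generated statement rather than typing the quoted package/view path by hand avoids the most common naming and case-sensitivity mistakes.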
Conclusion: My sincere thanks to the SCN community, and especially to all the experts who have been a great source of inspiration. I hope this blog will be useful to you in learning about and implementing HANA modeling solutions.