In this post, I would like to provide SAP HANA modeling recommendations for setting up HANA information models, optimizing performance, using the SAP HANA modeler and development perspectives, as well as solving complex requirements efficiently in SAP HANA.
Every project/partner companies require SAP HANA best practices and recommendations to enhance the optimization of the system performance. Hence, some of the consolidated best practices and guidelines including tips for SAP HANA modeling are mentioned below.
Below there are some HANA Modelling best practices and guidelines :
- The data should be limited as quickly as possible. Static filters should be defined on source data.
- Variables are applied in SAP HANA Views to push down filters at runtime
- Speed up the filters by partitioning the tables on which filter is applied.
- Only those fields that are required in query results should be selected. Use SELECT * should be avoided.
- Calculations should be performed as late as possible. For example, perform aggregation before calculation if possible.
- Design time filters are more helpful rather than inner join in order to limit the data volume.
- The transfer of large result sets between the SAP HANA DB and client applications should be avoided.
- The data transfer between Calculation Views must be reduced.
- Reduce the intermediate results between nodes inside a Calculation View.
- Use Inner join/referential/left outer joins as maximum any way it depends on the business needs.
- Use referential joins when referential integrity is maintained. With referential integrity, SAP HANA does not evaluate join for all queries.
- Correct cardinality in joins (n:1 or 1:1) must be specified – only if sure
- Join the tables on key fields or indexed columns.
- Avoid joining columns which are having more NULL values
- Avoid joins using calculated columns.
- Try to reduce the number of join fields.
- Avoid joins where the data types of the joins fields are different as this requires type conversion.
- Instead of joining for combining large data sets, we can use Unions.
- All views/tables are nice to be used with a projection node. Projection nodes improve performance by narrowing the data set – considering only the required data columns.
- Unions can be more efficient by using Union Pruning.
- Table Functions can be used as a substitute for Scripted Calculation Views.
- Avoid filters on Calculated Columns.
- Use Aggregation nodes to handle duplicate data set.
- Avoid using imperative logics with loops or branched logic also avoid using cursors.
- Breaking large, complex SQL statements into smaller independent steps using the SQL SCRIPT variable increases the parallelization of execution of the query.
- Use analytical privilege latest SQL analytical privilege (SP10) to filter the data based on business requirement
- Partition the tables if they are having a huge number of records for better performance Max 2B records per table (or table partition) as the load is balanced across multiple servers.
- Do not output the data in a granular format in Calculation View when analysis displays it as aggregates.
- Try to avoid creating Restricted columns based on Calculated Columns.
- Create information views in layers. This helps SAP HANA to process the requests in parallel improving the system performance.
- Enabling caching for Calculation views where applicable.
- Dictionary compression technique providing HANA memory 10x or more in-memory data storing capacity.
- High IOPS based Log Volume storage providing 1000 times faster log data sync between memory DB changed data and persistent storage log data providing HANA fast data change performance.
A set of best practices is required, to understand which tool should be used and when depending on a potentially complex set of criteria so that we can use the HANA speed to achieve massive parallel processing, columnar compression, and column widths.