HANA Modelling – Consolidated Best Practices for Better Performance

Hi All,

This post consolidates the HANA modelling best practices we have seen deliver better performance.

  • Use a star join instead of joining multiple master-data/dimension nodes to a fact table of measures.
  • Use calculation views for all modelling and design that involves complex calculations and data massaging.

For master data/attribute-view scenarios, use the DIMENSION data category.

For multi-dimensional/OLAP scenarios, use the CUBE data category.

  • Even attribute views and analytic views with calculated columns, measures, or other logic are processed by HANA's calculation engine.
  • If the design involves many joins of master-data dimensions with measures (fact-table columns), use the star-join type. This performs better than making the joins at multiple nodes (see the SQL sketch after this list).

 

  • For views/models whose underlying data does not change frequently, set a cache invalidation period; this returns results more quickly and holds the cached result for an hour or a day.
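
To illustrate the star-join recommendation above: in SQL terms, a star join keeps the fact table at the centre and joins each dimension to it directly, rather than chaining joins across several nodes. A minimal sketch, with hypothetical table and column names:

    -- Fact table at the centre; each dimension joined directly to it.
    -- With n:1 cardinality and join optimization, HANA can prune a
    -- dimension join entirely when a query requests none of its columns.
    SELECT d1."REGION",
           d2."PRODUCT_GROUP",
           SUM(f."SALES_AMOUNT") AS "TOTAL_SALES"
    FROM "SALES_FACT" f
    LEFT OUTER JOIN "DIM_CUSTOMER" d1 ON f."CUSTOMER_ID" = d1."CUSTOMER_ID"
    LEFT OUTER JOIN "DIM_PRODUCT"  d2 ON f."PRODUCT_ID" = d2."PRODUCT_ID"
    GROUP BY d1."REGION", d2."PRODUCT_GROUP";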

Suggestions on Joins

  • Where possible, reuse the same node (with its calculated columns) for different conditions: build separate projections over it and combine them with a union node, instead of projecting the same node/information into different segments.

If joins are required, prefer left outer or right outer joins with the cardinality set (n:1 or 1:1), but only if you are sure about the driving tables; this enhances performance and enables the "Optimize join = TRUE" option.

Also, consider distributing joins across multiple join nodes in analytic views and then joining those analytic views in the calculation view. Avoid a single data source (or node) feeding multiple nodes in a calculation view.

  • Minimize the number of nodes and joins as much as possible, and try to use unions instead.
  • For better query performance and quicker retrieval of results, avoid filtering or joining on calculated columns, and do not mix scripted and graphical calculation views in a single view.
  • Alternatively, CE functions such as CE_CALC can be used in a scripted view to combine scripted views or several graphical calculation views, as sketched below.
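
As a minimal sketch of that CE-function style (table, column, and variable names are hypothetical; note that CE functions belong to the older SQLScript approach and were later deprecated in favour of plain SQL):

    -- Body of a scripted calculation view: read a fact table, derive a
    -- calculated column with CE_CALC, and join it to a dimension.
    lt_fact = CE_COLUMN_TABLE("MYSCHEMA"."SALES_FACT",
                              ["CUSTOMER_ID", "SALES_AMOUNT"]);
    lt_calc = CE_PROJECTION(:lt_fact,
                ["CUSTOMER_ID", "SALES_AMOUNT",
                 CE_CALC('"SALES_AMOUNT" * 0.1', decimal(15,2)) AS "DISCOUNT"]);
    lt_dim  = CE_COLUMN_TABLE("MYSCHEMA"."DIM_CUSTOMER",
                              ["CUSTOMER_ID", "REGION"]);
    var_out = CE_LEFT_OUTER_JOIN(:lt_calc, :lt_dim, ["CUSTOMER_ID"]);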

Always follow – HANA Development Points

  • Use left outer joins wherever possible.
  • Specify the cardinality in joins (n:1 or 1:1), but only if you are sure.
  • Set Optimize join = TRUE (only if the cardinality is set as above).
  • Use table functions instead of scripted calculation views (see the sketch after this list).
  • Execute in the SQL engine (for HANA Live).
  • Reduce the data set as early as possible; use design-time filters at the lowest level.
  • Input parameters: placeholders that are part of the model and can be used in calculations; they can accept multiple values and can be derived from a table or a stored procedure.
  • Ensure variables (the WHERE clause) are pushed down to the lowest level; confirm this using the Visualize Plan.
  • Use analytic privileges and SQL analytic privileges (SP10) to filter data.
  • Avoid script-based calculation views; the WHERE clause will not be pushed down.
  • Using a filter is better than using an inner join to limit the data set (see the comparison after this list).
  • Avoid filters on calculated columns (consider materializing these columns).
  • One of the best practices in HANA modelling is to define joins on columns with either INTEGER or BIGINT data types.

Joins on calculated columns/fields with NVARCHAR or DECIMAL data types are not recommended, as they might create performance issues.
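
As referenced in the list above, a table function can replace a scripted calculation view. A minimal sketch, assuming a hypothetical schema, table, and columns:

    -- A parameterized table function; the filter stays in plain SQL,
    -- so the optimizer can push it down to the table scan.
    CREATE FUNCTION "MYSCHEMA"."TF_SALES_BY_REGION" (IN in_year INTEGER)
    RETURNS TABLE ("REGION" NVARCHAR(20), "TOTAL_SALES" DECIMAL(15,2))
    LANGUAGE SQLSCRIPT SQL SECURITY INVOKER AS
    BEGIN
      RETURN SELECT "REGION",
                    SUM("SALES_AMOUNT") AS "TOTAL_SALES"
             FROM "MYSCHEMA"."SALES_FACT"
             WHERE "YEAR" = :in_year
             GROUP BY "REGION";
    END;

The function can then be consumed directly (SELECT * FROM "MYSCHEMA"."TF_SALES_BY_REGION"(2016)) or used as the data source of a graphical calculation view.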
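
Likewise, for the filter-versus-inner-join point: a plain filter lets the engine restrict rows at the scan, whereas an inner join used only to limit the data set forces join processing first. A sketch with the same hypothetical names:

    -- Preferred: restrict rows with a plain filter at the lowest level.
    SELECT "CUSTOMER_ID", "SALES_AMOUNT"
    FROM "MYSCHEMA"."SALES_FACT"
    WHERE "YEAR" = 2016;

    -- Avoid: an inner join whose only purpose is to restrict rows.
    SELECT f."CUSTOMER_ID", f."SALES_AMOUNT"
    FROM "MYSCHEMA"."SALES_FACT" f
    INNER JOIN "MYSCHEMA"."VALID_YEARS" y
      ON f."YEAR" = y."YEAR";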

 

These are some of the best practices followed by many consultants and developers. Still, some complex business requirements force us to deviate from these best practices, and in such cases they can be set aside.

18 Comments

      Former Member

      Hello Prasanna,

      Nice blog. Could you please mention the HANA version on which these best practices were tested? I also have another question: could you please explain the point "Avoid single data source (or node) feeding multiple nodes in calculation view"? I don't see how reusing a node affects performance.

      Regards,
      Safiyu

      Prasana Ravichandran
      Blog Post Author

      Hi Saidumuhamed,

      Using a single node or table in multiple instances means something like this: you have a fact table or an aggregation node with some quantifiable information, and you join it to multiple nodes (say, a filter for each year/region of sales and forecast), join those to various master-data nodes, and finally combine everything.

      Instead, we can use a star join, with that node as the central fact node joined to the multiple master-data nodes. This is a rather rare scenario/bad modelling that I witnessed at one of our clients.

      G B

      Hi Prasana

      In the above case, i.e. the analytic view, I have understood what you meant. Did you mean the statement 'Avoid projecting same node/information to different segments' to be true only in the case of an analytic view?

      Prasana Ravichandran
      Blog Post Author

      It differs from case to case, but it mostly applies to analytic views, or to calculation views that compare the same factual/transactional data.

      But it is better to avoid it for calculation views as well, since it adds performance overhead.

      Former Member

      Hi Prasana,

      Thanks for this nice blog post.

      Do you have any experience or best practices on creating calculated columns (e.g. column A + column B, or creating flags using IF and CASE statements) in a calculation view?

      - On the lowest level?
      - On the aggregation node?

      Best regards,
      Fabian

      Prasana Ravichandran
      Blog Post Author

      Hi Fabian,

      Yes, we did.

      Basically, for such IF/CASE statements and decision scenarios, calculated columns can be added in both places:

      1) On the lowest level, if the calculated column is a measure such as a count or sales figure to which a discount or other logic needs to be applied (so that it does not give you an unexpected value).

      2) On the aggregation node, if your calculated column is a flag or dimension that needs to be calculated on the sum of measures (e.g. set flag = 'HIGH' if avg. sales > 20,000; see the sketch below).

      So it depends on the use case. What we used most is the lowest level.
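
      For the flag case, the calculated column could use a column-engine expression along these lines (the column name is hypothetical):

          if("AVG_SALES" > 20000, 'HIGH', 'LOW')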

      Warm Regards,
      Prasana Ravichandran

      Former Member

      Hi Prasana,

      Thanks for your answer. This is very helpful.

      Following on from this, I have another question on the layer architecture of your views.

      Do you use staging, core, data-mart, and reporting layers to give your HANA views a clear structure?
      E.g. combine header and line items in the staging layer (maybe filtering some data) to build a data set, then refer to this staging view in the core layer and add fields by joins, calculated columns, etc.
      Or is it better to build a data set in only one view and avoid using different layers?


      Warm regards,
      Fabian

      Prasana Ravichandran
      Blog Post Author

      That's a good question.

      If your landscape is a side-car approach on HANA and you are not using any HANA Live views (pre-built content delivered by SAP), you can go with the staging, core, and data-mart/reporting layers as you mentioned.

      Since most of our landscape was based on HANA Live, we modelled/enhanced the master-data HANA views in a separate package, kept the factual HANA views (such as sales and billing documents) separate, and combined them in the reporting views.

      You will get better performance when you combine several master-data views with the fact data in a star-join node. Or you can simply copy/enhance the standard HANA Live content that is already available as part of the HANA Live installation.

      Former Member

      Very good points with clear explanations; we still always follow these HANA development points. 🙂

      Prasana Ravichandran
      Blog Post Author

      Thanks, Narayan.

      Srinath Gattamaneni

      Hi Prasana,

      I have one doubt. Suppose we filter data at a node in the middle using an input parameter, where the end user provides a value when the popup occurs. My doubt is: does it filter the data from the bottom node with the user-supplied value, or does the filtering happen only from the middle node onwards?

      Prasana Ravichandran
      Blog Post Author

      I am not fully clear about your question, but in general the filter pushdown always happens at the projection/aggregation node where you filter the attribute.

      Srinath Gattamaneni

      Suppose one projection node is in the middle of a calculation view, and I applied a filter there with an input parameter. My doubt is: does it filter the data from that particular node onwards, or from the bottom nodes of the calculation view?

      Prasana Ravichandran
      Blog Post Author

      Of course, from the particular node where you applied the filter on the input parameter.

      It won't do any automatic filtering below that unless a filter based on the input parameter is detected there.
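
      For reference, a consumer passes the input parameter value with the placeholder syntax when querying the calculation view; a sketch with hypothetical package, view, and parameter names:

          SELECT "REGION", SUM("SALES_AMOUNT") AS "TOTAL_SALES"
          FROM "_SYS_BIC"."mypackage/CV_SALES"
               ('PLACEHOLDER' = ('$$P_YEAR$$', '2016'))
          GROUP BY "REGION";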

      Sergey Shablykin

      Dear Prasana!

      Great wiki-style blog!

      Could you share information source for this: "Avoid single data source (or node) feeding multiple nodes in calculation view"?

      best regards,

      Sergey

      Prasana Ravichandran
      Blog Post Author

      Hi,

      E.g. let's say I have a projection node for table LIKP, and I am using the same projection node in four different join/aggregation nodes. That pattern is better avoided and should be revisited!

      Ricardo A Spiller

      Hi Prasana!

      It was really helpful!

      Many thanks!
      Cheers from Brazil!

      Prasana Ravichandran
      Blog Post Author

      Obrigado, amigo! (Thanks, my friend!)