Skip to Content
Technical Articles

Improve SAP HANA calculation view performance via Derived from Table Input Parameter

Introduction
Many customers use custom fiscal calendar including week start day, fiscal period start/end, and a unique date format to report sales and/or financials. For example, one customer has set MM-YYYY (e.g., 03-2020) as a standard format for internal reports. BI tools are required to query analytic models using this date format. Since such a date format may not be available in SAP ERP tables, joining on a calculated column is deemed to be an acceptable performance albeit at the expense of calculation view performance. This blog post presents an alternative method to filter base tables for different date formats to improve calculation view performance. Imagine having to work with a few thousand rows of data instead of hundreds of millions!

Derived from Table input parameter can be used to implement an alternative method to filter base tables not containing columns with required date format.

Tables Used
For this exercise, I created two simple base tables, namely _DATES and _SALES.

At customer sites, DATES table is generally an enhancement to M_TIME_DIMENSION with columns added per customer’s prevailing practices.  For example, YYYYMM can be derived from M_TIME_DIMENION while MM-YYYY is an enhancement either in SAP HANA database or SLT.

SALES table with gross sales amount by date or SKU typically does not contain data by customer-specific period format, as shown above. And SALES tables are large, containing billions of rows of data at a medium- to large customer.

Prerequisites
Both the custom date format (MM-YYYY) and standard date format (YYYYMM) have to be present in a persistent table. And one the two date formats have to be available in fact table to be filtered.


SAP HANA Calculation View

Using above tables, a typical calculation view has been created, as shown below.

Calculated Columns
A calculated column was created in each projection, namely MANDT (100). Calendar and Gross_Sales projections are joined on MANDT. Note: This is a pseudo-join and not recommended in real-world modeling. I have created this join solely to eliminate any effect of filtering on either projection.

Input Parameters
Two input parameters were created, one on each base table.

1. Filter DATES table for YYYYMM based on user input in MM-YYYY format. For example, the user would input 03-2020 to filter DATES table.

2. Filter SALES table based on YYYYMM value derived from DATES table. Remember that SALES table does not contain dates in MM-YYYY format and so cannot be filtered directly.

Key items to note:

  • Derived From Table Return Column (from DATES table) that is present in SALES table.
  • Filter Column in DATES table (MM-YYYY) to get the corresponding value from YYYYMM column in DATES table.
  • Filter value set to the other input parameter.

User Input
When end users run the calculation view, the below dialog box is presented to them. As shown, the requested time period is in the desired format, i.e., MM-YYYY in this case.

The above dialog box results from the first input parameter.

Results
Query results are shown below. As you will notice, only sales for period 04-2020 are displayed.

So What’s the Big Deal You Ask?
Technically you could have created a common date format column in Sales table, filtered on DATE table using any type of input parameter, and joined on date columns to get the same results. The difference lies in performance!

If we can filter both tables at the lowest level, we only lift minimum amount of required data from base tables, which has a significant impact on calculation view performance. Using above methodology, we’re able to filter SALES table for the requested time period, and lift only specific row(s) of data corresponding to that time period. In this case, only one (1) row of data was generated from SALES table, corresponding to the input dialog box time period, instead of 6 rows. Extrapolate this performance for over hundreds of millions of rows of data and you will start to envision the benefits of this procedure.

Visualization Plan
See below screenshot from visualization plan for the above calculation view.

As you will notice:

  • DATES table (left side) is filtered using MM-YYYY values
  • SALES table is filtered using YYYYMM values
  • Both tables return one (1) row of data

This tells us that we were able to filter SALES table at the base level using MM-YYYY input parameter value. This reduces the volume of data lifted from SALES table, helping improve the query runtime performance. See below screenshot to illustrate how the derived input parameter value is pushed down to base table.

Conclusion
Derived from Table type of input parameter offers an alternative and powerful method to:

  • Filter base tables using columns not necessarily present in a given table but can be derived from another table
  • Design SAP HANA queries using custom date formats per customer calendars

Thank you for reading and I hope you find this blog post useful to advance analytics query design. Please let me know if I can be of further help on this topic.

Be the first to leave a comment
You must be Logged on to comment or reply to a post.