Execution management by HANA Graphical Calculation View
SAP HANA, An appliance with set of unique capabilities has provided wide range of possibilities for end user to perform data modeling. One among them is ‘Graphical Calculation View’, which helps in leveraging the fullest potential of SAP HANA in data modeling sector.
This Document helps in understanding the nature of execution management by HANA Graphical Calculation Views by utilizing the allowed set of properties in it. So as to reveal out the effectiveness in handling the properties to manage the execution flow.
First among the property set under discussion is
‘KEEP FLAG’ for attributes in Aggregation Node
Above property gives chance for end user to leverage the full capacity of calculation engine by utilizing the nature of Aggregation node.
Let us understand how it is achieved by considering a simple example.
Consider a simple SALARY_ANALYSIS table as shown below. Keeping it simple to make the understanding better.
Above table gives the salary details of each employee type on date basis. Let us now try to get maximum salary in each of the employee_category/type using graphical calculation view.
Step 1: Create a new graphical calculation view of DIMENSION Data category and add an aggregation node to the view.
Now add the above created column table into the newly inserted aggregation node and set EMPLOYEE_TYPE as the output column.
Along with which add SALARY as an aggregated column to the output and rename it as MAX_SALARY_IN_EACH_EMP_TYPE
and set its aggregation type to MAX
Step 2 : Now connect the Aggregation Node to the default Projection Node and just select MAX_SALARY_IN_EACH_EMP_TYPE column to the output in Projection Node.
Step 3 : Save and activate the view. After which perform Data Preview on the Aggregation Node. We get Maximum Salary for each employee type in the underlying table.
Step 4 : Now perform Data Preview at the Calculation View level
Here we see that the end result of the view gives only one row of data discarding the Employee_Type with which grouping was done in the aggregation node. Plan Visualized Query for the same preview is as show below.
By the above visualization of execution plan we figure out that the EMPLOYEE_TYPE column used in the aggregation node is pruned and not passed to the higher nodes when it is not queried in the End Result. Thus we are unable to get the MAX_SALARY for each employee type, instead we are getting the MAX_SALARY out of the total salary list.
To achieve the former case of getting MAX_SALARY based on the EMPLOYEE_TYPE even when the EMPLOYEE_TYPE is not part of the end query, we must enable this special property called ‘Keep Flag‘ on the attributes.
Step 5 : Go back to the Aggregation Node used in the view and select the EMPLOYEE_TYPE column which is added as attribute, set the property ‘KEEP_FLAG‘ to true and activate the view.
Step 6 : Once the activation is completed successfully, perform data preview of the model and check the result. We now get the MAX_SALARY grouped by EMPLOYEE_TYPE although EMPLOYEE_TYPE is not part of the end query.
Plan Visualization on the same query now behaves differently after setting the ‘KEEP_FLAG’ to true for EMPLOYEE_TYPE column, thus propagates the grouping attribute to the higher nodes as shown below:
Hence the end user is given chance to either retain the query optimization or to fit the model for required response by toggling the KEEP_FLAG property of the attributes in aggregation node.
Let us now pitch into an other connected property of Aggregation Node.
Always Aggregate Result in Aggregation Node
Taking the above example of SALARY_ANALYSIS, let us understand the usage of ‘Always Aggregate Result’ property in the aggregation node.
Step 1: Create a new Graphical Calculation View of cube type and add SALARY_ANALYSIS table as the data source into the aggregation node,. now select EMPLOYEE_TYPE and YEAR as non-aggregated or attribute columns and SALARY as the aggregated output column.
Step 2 : Save and activate the view, after which execute query on the above view involving EMPLOYEE_TYPE and SALARY columns as show below
The above select statement do not involve any client side aggregation or group by clause, output values here are the result of default aggregation and group by operations in the aggregation node as shown below:
Let us now execute the same statement with ‘Where Clause’ usage in the query and see the result
Here we see that the introduction of ‘Where clause’ has also introduced the same column in ‘Group by’ Clause and thus the result gets varied from the previous query by having YEAR column also as part of the group by operation in aggregation node.
Step 3: To Avoid the above introduction of filter column in the group by clause and to address the only set of columns requested in the query, we have a property called Always Aggregate Result to be set to true so that the aggregation will not vary based on the filter column in the requested query.
Above cases are true only when client side aggregations are not set in the requested query.
Step 4 : Now execute the same query with’ Where clause’ on YEAR column after setting the ALWAYS AGGREGATE RESULT property.
We now see that the grouping happens only by EMPLOYEE_TYPE which is in the requested query.
There by we see the usage and benefits of two key properties KEEP_FLAG and ALWAYS AGGREGATE RESULT for execution management
in Graphical Calculation View .
Hope the provided information is useful. Any suggestion and feedback for improvement will be much appreciated.