SAP HANA: Using “Dynamic Join” in Calculation View (Graphical)
Hi Folks,
This blog is intended to share my experiences on using “Dynamic Join” in “Join” node of a Calculation View (Graphical).
Note: i have done this on HANA Rev 68.
Problem Description:
We will have some scenario’s in which we would want the “system” to have intelligence on whether to do a join or not before aggregation depending on the “Query” it receives from the UI ( Front end).
Example:
Let us take a sample table data i.e employee with the following DDL and data as shown below:
CREATE COLUMN TABLE "EMPLOYEE" ("EMP NO" INTEGER,
"EMPLOYEE NAME" VARCHAR(200),
"EMPLOYEE TYPE" INTEGER,
"GENDER" VARCHAR(10),
"AGE" INTEGER ,
"REGION" VARCHAR(10),
"COUNTRY" VARCHAR (20),
"SALARY" DECIMAL(18, 0))
Load the data into the EMPLOYEE Table with the below mentioned insert statements:
insert into "EMPLOYEE" values(1,'Sachin',1,'M',40,'APAC','IND',50000);
insert into "EMPLOYEE" values(2,'Ganguly',1,'M',42,'APAC','PAK',40000);
insert into "EMPLOYEE" values(3,'Dravid',1,'M',40,'AMER','US',40000);
insert into "EMPLOYEE" values(4,'Laxman',1,'M',43,'AMER','US',40000);
insert into "EMPLOYEE" values(5,'Dhoni',1,'M',35,'EMEA','GER',40000);
insert into "EMPLOYEE" values(6,'Sehwag',1,'M',36,'EMEA','GER',30000);
insert into "EMPLOYEE" values(7,'Kohli',1,'M',23,'EMEA','UK',20000);
insert into "EMPLOYEE" values(8,'Kumar',1,'M',22,'EMEA','GER',10000);
insert into "EMPLOYEE" values(1,'Law',2,'M',24,'APAC','IND',30000);
insert into "EMPLOYEE" values(2,'Eddie',2,'M',26,'EMEA','UK',150000);
insert into "EMPLOYEE" values(3,'Paul',2,'M',23,'APAC','IND',120000);
insert into "EMPLOYEE" values(4,'Howrang',2,'M',22,'AMER','US',60000);
insert into "EMPLOYEE" values(5,'Xiayou',2,'F',22,'AMER','US',8000);
insert into "EMPLOYEE" values(6,'Nina',2,'F',22,'AMER','CAN',70000);
insert into "EMPLOYEE" values(1,'Federer',3,'M',30,'APAC','IND',1150000);
insert into "EMPLOYEE" values(2,'Nadal',3,'M',29,'APAC','IND',5230000);
insert into "EMPLOYEE" values(3,'Djokovic',3,29,24,'APAC','IND',5045000);
insert into "EMPLOYEE" values(4,'Murray',3,'M',24,'APAC','CHN',55650000);
insert into "EMPLOYEE" values(5,'Sampras',3,'M',44,'AMER','CAN',5660000);
insert into "EMPLOYEE" values(6,'Agassi',3,'M',45,'AMER','US',5056000);
insert into "EMPLOYEE" values(7,'Venus',3,'F',28,'AMER','US',9500500);
insert into "EMPLOYEE" values(8,'Serena',3,'F',29,'AMER','US',9507000);
insert into "EMPLOYEE" values(1,'Messi',4,'M',24,'APAC','IND',510000);
insert into "EMPLOYEE" values(2,'Ronaldo',4,'M',28,'AMER','CAN',500);
insert into "EMPLOYEE" values(3,'Xavi',4,'M',30,'EMEA','UK',5002300);
insert into "EMPLOYEE" values(4,'Beckham',4,'M',40,'EMEA','UK',7850000);
Now Let us try to analyze the share of salaries earned by each “Employee Type” at “Region” level and “Country” level.
We would need to create a calculation view in which we shall use 2 “Aggregation” nodes of which one will not include “Employee Type” in the aggregation as shown below.
Let me name this Calculation view as “CV_DYNAMICJOIN_TRUE”.
Aggregation 1:
Aggregation 2:
While adding “Salary” from this node , rename it as “Total_Salary”.
Now let us join these “Aggregation nodes” with join between them as “Inner Join” and “Dynamic Join” set to “True”
as shown below:
Now we need to create a “Calculated Column” i.e “Share_Salary” which gives us the salary shared at “Region” or “Country” level in the “Projection Node” as shown below:
Now activate the view and let us analyze the results.
Now let us analyze the output:
Output:
You can see in the above result that we get “Salary” at Region and Employee Type level but the “Total Salary” is at “Region Level” and the “Share Salary” gives the share of salaries for each employee type at Region level.
Case 2:
I have created the similar calculation view keeping Dynamic Join as “False”.
Let us see the output for similar query to analyze the salaries of Employee Type at Region level below:
You can see here , the Join is behaving like a “Static Join” at Region as well as Country level and gives the salaries at the same level. Which gives us the erroneous results and hence gives more records i.e 16 instead of 11.
Now let us analyze Viz plans to check how many records each aggregate node is giving for the above mentioned queries:
Case 1 ( Dynamic Join = ‘True’) :
As you can see above we are getting 3 records from the left node. i.e we have 3 regions and we are getting “Total Salaries” for those 3 regions at “Region” Level and similarly we are getting 11 records from the right node i.e Region and Employee Type level.
Case 2 ( Dynamic Join = ‘False’ ) :
As you can see above we are getting 7 records from the left node i.e Region and Country level. and similarly we are getting 16 records from the right node i.e Region,Country and Employee Type level.
So the above “Dynamic Join” helps to “intelligently” decide based on the query received from UI whether it has to join at Country level or not. Which makes our work as a modeller bit easier.
Hoping this blog helped you to understand the benefits of using “Dynamic Join”. Please do let me know your feedback on this.
Your’s
Krishna Tangudu 🙂
HI Krishna,
Good document and Thanks for sharing..
Phani.
Hi Pani,
Thank you for your feedback 🙂
Regards
Krishna Tangudu
Nice document.
Thank you Raj 🙂
Good one Krishna 🙂 Waiting for more 🙂
Hi Akshay,
Thank you for your feedback. Will keep my best efforts while writing.
Regards,
Krishna Tangudu
Krishna,
Nice , Thank you for sharing this article.
Regards
Mangesh K
Thanks Mangesh 🙂
Regards,
Krishna Tangudu
Hi Krishna,
I can't currently make sense out of these numbers and check what is returned as you are using Region in both examples and there is no "Country" field in the table
CREATE COLUMN TABLE "EMPLOYEE" ("EMP NO" INTEGER CS_INT,
"EMPLOYEE NAME" VARCHAR(200),
"EMPLOYEE TYPE" INTEGER,
"GENDER" VARCHAR(10),
"AGE" INTEGER ,
"REGION" VARCHAR(10),
"SALARY" DECIMAL(18, 0))
Is there something missing?
Thank you
Denis
Hi Denis,
Apologies for the confusion.
Thanks for the correction, Yes i have missed keeping "COUNTRY" in the DDL. Have corrected the DDL and the insert statements.
Do let me know if you have any other questions.
Regards,
Krishna Tangudu
Hi Krishna!
First, thank you for share this information this has been helpfull.
But i have some problems when i want to show it in the hana datapreview
becouse, the data there are sumarized, and dont show the same information
that the query, i would like to know, if is posible and "How to.." show the
same information at data preview layer.
Regards
Hi Widerman,
Thanks for your feedback 🙂 Glad you liked it.
Regarding data preview, Even i don know how to control the query getting framed in the way we wanted it.
Hoping someone who knows that information would pitch in and help us here .
Regards,
Krishna Tangudu
Hi Krishna!
Thanks for your fast answer...
But if i use this CV in a BI report tool, for example "LUMIRA", do you think
that it cant show the information, like the query, or like the data prevew.
Regards
It should be possible with "LUMIRA" , i didnt try it as of now but i will try and post the results for you ( whether it is yes or no 🙂 )
Regards,
Krishna Tangudu
Good document for the Dynamic Join.
Is it possible to use the dynamic join in the script type calculation view? If yes, can you provide the detail codes on how to achieve it?
Thanks,
Derek
Cool! Thanks for investigating it.
Thanks for sharing such insightful document.
Hi,
In the second aggregation node how did you got the total salary?? Â i mean where the calculation done?
we have to do right?? correct me if i'm wrong
Venkatesh
Hi Krishna,
I applied your sample for 16 fields, which table has 14 million rows (I'm planning to calculate index for my sales data).
My problem is I'm not getting the result. Its working too slow and my CPU is becoming almost full.
I'm not sure but maybe writing an optimized sql from CV is a good idea? Or is it same?
Is there any other solution for this?
Thanks.