Query Execution Performance: Temporary Tables in Joins and Procedures
Hi,
I would like to share my experience of Using Temporary Tables in SAP HANA Joins and Procedures.
My Query Performance is getting two times faster by using Temporary Tables in SAP HANA Joins and Procedures.
My Scenario is,
based on Business Requirement, I have to Write Left Outer Join Query between two tables and This output has to be loaded into another target table.
Left Table Name: “L_Table” [93,25,825 Rows]
Right Table Name: “R_Table” [43 Rows]
To be Load Table Name: “Target_Table”
For this Scenario, I have Written Procedure.
Below, I have explained two methods I used. (With Out Temporary Table & With Temporary Table)
By Both Methods, I am getting same set of output.
In terms of Query Execution time, I am getting benefit when I use temporary Table.
Method 1: Joins With Out Temporary Table:
Please find the code snippet in Procedure,
Insert into “Target_Table” (“ROW_ID”, “EMAIL_ADDRESS”,
“Column3”, “Column4“,
“Column5“, “Column6“, “Column7“,
“Column8“, “Column9“, “Column10“,
“Column11“, “Column12“, “Column13“, “Column14“,
“Column15“, “Column16“, “Column17“, “Column18“,
“Column19“, “Column20“, “Column21“, “Column22“,
“Column23“, “Column24“)
Select L.”ROW_ID”, L.”EMAIL_ADDRESS”, L.”Column3“, L.”Column4“,
L.”Column5“, L.”Column6“, L.”Column7“,
L.”Column8“, L.”Column9“, L.”Column10“,
L.”Column11“, L.”Column12“, L.”Column13“, L.”Column14“,
L.”Column15“, L.”Column16“, L.”Column17“, L.”Column18“,
L.”Column19“, L.”Column20“, L.”Column21“, L.”Column22“,
R.”Column23“, R.”Column24“
from “L_Table” L
Left Outer Join
(Select distinct “EMAIL_ADDR” As “EMAIL_ADDR”, “Column23“, “Column24“
From “R_Table”
Where “UPDATE_TIMESTP_VAL” In (Select Max(“UPDATE_TIMESTP_VAL”) From
“R_Table” Group By “EMAIL_ADDR”)
AND “END_DATE” IS NULL) As R
On L.”EMAIL_ADDRESS” = R.”EMAIL_ADDR”;
This Procedure Execution time takes –> 5 minutes and 48 seconds.
Method 1: Joins With Temporary Table:
Please find the code snippet in Procedure,
CREATE LOCAL TEMPORARY TABLE “#TMP_TBL” (
EMAIL_ADDR VARCHAR(75),
Column23 VARCHAR(1),
Column24 DATE );
Insert into “#TMP_TBL”
Select distinct “EMAIL_ADDR” As “EMAIL_ADDR”, “Column23”, “Column24”
From “R_Table”
Where “UPDATE_TIMESTP_VAL” In (Select Max(“UPDATE_TIMESTP_VAL”) From
“R_Table” Group By “EMAIL_ADDR”)
AND “END_DATE” IS NULL;
Insert into “Target_Table” (“ROW_ID”, “EMAIL_ADDRESS”,
“Column3”, “Column4”,
“Column5”, “Column6”, “Column7”,
“Column8”, “Column9”, “Column10”,
“Column11”, “Column12”, “Column13”, “Column14”,
“Column15”, “Column16”, “Column17”, “Column18”,
“Column19”, “Column20”, “Column21”, “Column22”,
“Column23”, “Column24”)
Select L.”ROW_ID”, L.”EMAIL_ADDRESS”, L.”Column3″, L.”Column4″,
L.”Column5″, L.”Column6″, L.”Column7″,
L.”Column8″, L.”Column9″, L.”Column10″,
L.”Column11″, L.”Column12″, L.”Column13″, L.”Column14″,
L.”Column15″, L.”Column16″, L.”Column17″, L.”Column18″,
L.”Column19″, L.”Column20″, L.”Column21″, L.”Column22″,
R.”Column23″, R.”Column24″
from “L_Table” L
Left Outer Join
“#TMP_TBL” R
On F.”EMAIL_ADDRESS” = R.”EMAIL_ADDR”;
This Procedure Execution time takes –> 2 minutes and 43 seconds.
Thanks & Regards,
Muthuram
Please rate this post, If you like it or helpful 🙂
Hi Muthuram,
What you are trying to convey with this blog ? Does the developer need to use temporary tables instead of joins ?
Hi Nithin,
From this experience, what I am trying to say is,
If we are writing joining queries against some set of columns which is selected from Table (for large no of records), these select columns anyway will occupy some memory randomly While performing join operation.
If we created temporary Table, these data would be stored in defined memory structure and will perform the Join operation.
So My experience, for this kind of scenarios, Using Temporary Table is also one good approach.
Best Regards,
Muthu
Hi Muthuram,
A little explanation, or your query execution plan, on why one method is faster than the other would have been nice.