Table Transpose in SAP HANA Modeling
This document is prepared based on HANA SPS6 revision 63.
Jody Hesch beautifully explained how to do transposition and its use cases in the document How To: Dynamic Transposition in HANA, which involves an additional table. If the requirement is to transpose a table without creating any additional table, we can do this entirely through modeling in HANA Studio. There are many possible ways of doing this, and what follows is just another way of doing it.
Once the table is available in HANA Studio, modeling is done on the HANA base table, and the output of the information view is the transposed data.
Based on comments, this document was modified on Jan 8, 2014, and two more approaches were added. Special thanks to Justin Molenaur and Krishna Tangudu for making this better.
Based on a comment, this document was modified on May 2, 2016. Special thanks to Pradeep Gupta for making this document even better than before.
Approach 1:
- An Analytic view is built for each base table column that needs transposition.
- In this case 6 columns need transposition, hence 6 Analytic views are created.
- A calculated column (VALUE) is created in each Analytic view which derives the value of a particular month in a year.
- Create a Calculation view based on the Analytic views created above and combine them using Union with Constant Value.
- There is no need to create a calculated column (MONTH) in each Analytic view, as it can be derived in the Calculation view to improve performance.
Approach 2:
- One general Analytic view is created instead of several, with the required attributes and measures selected.
- In this case we select the 6 measures M_JAN, M_FEB, M_MAR, M_APR, M_MAY, M_JUN in addition to the common attributes.
- Create a Calculation view based on the general Analytic view created above and combine six projections of it using Union with Constant Value.
- A calculated column (VALUE) is created in each Projection node which derives the value of a particular month in a year.
Approach 3:
- No Analytic view is created; instead, the base table is used directly.
- Create a Calculation view with the base table used directly in each Projection node.
- Here also, 6 Projection nodes are used.
- A calculated column (VALUE) is created in each Projection node which derives the value of a particular month in a year.
—————————————————————————————————
Approach 4:
- With a single SQLScript calculation view, the table can easily be transposed.
- This is the easiest approach and compares favorably with the others.
—————————————————————————————————
The DDL used for the workaround is given below:
—————————————————————————————————
CREATE COLUMN TABLE TEST.ACTUALS (
ID INTEGER NOT NULL,
NAME VARCHAR (20) NOT NULL,
YEAR VARCHAR (4),
M_JAN INTEGER,
M_FEB INTEGER,
M_MAR INTEGER,
M_APR INTEGER,
M_MAY INTEGER,
M_JUN INTEGER,
PRIMARY KEY (ID));
INSERT INTO TEST.ACTUALS VALUES (1,'NAME1','2012',101,102,103,104,105,106);
INSERT INTO TEST.ACTUALS VALUES (2,'NAME2','2012',111,112,113,114,115,116);
INSERT INTO TEST.ACTUALS VALUES (3,'NAME3','2012',121,122,123,124,125,126);
INSERT INTO TEST.ACTUALS VALUES (4,'NAME4','2012',131,132,133,134,135,136);
INSERT INTO TEST.ACTUALS VALUES (5,'NAME5','2012',141,142,143,144,145,146);
INSERT INTO TEST.ACTUALS VALUES (6,'NAME6','2013',201,202,203,204,205,206);
INSERT INTO TEST.ACTUALS VALUES (7,'NAME7','2013',211,212,213,214,215,216);
INSERT INTO TEST.ACTUALS VALUES (8,'NAME8','2013',221,222,223,224,225,226);
INSERT INTO TEST.ACTUALS VALUES (9,'NAME9','2013',231,232,233,234,235,236);
INSERT INTO TEST.ACTUALS VALUES (10,'NAME10','2013',241,242,243,244,245,246);
—————————————————————————————————
The data in the table is:
Transposed data:
Implementation steps for Approach 1:
- An Analytic view is built for each base table column that needs transposition.
- In this case 6 columns need transposition, hence 6 Analytic views are created.
- A calculated column (VALUE) is created in each Analytic view which derives the value of a particular month in a year.
- Create a Calculation view based on the Analytic views created above and combine them using Union with Constant Value.
- There is no need to create a calculated column (MONTH) in each Analytic view, as it can be derived in the Calculation view to improve performance.
Now let us see this in action.
Let's start by building the Analytic view AN_M_JAN based on column M_JAN. In the Data Foundation, select the attributes ID, NAME, and YEAR, which will be common to all the Analytic views, plus only the month M_JAN, skipping the other columns as shown below.
In the Logical Join, create the calculated column VALUE, set its expression to the base table column of the same name ("M_JAN"), and validate the syntax as shown below.
In the Semantics, hide the attribute M_JAN as it is not required in the output as shown below.
Now validate and activate the Analytic view and run a data preview. You will see only the values corresponding to M_JAN.
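Conceptually, AN_M_JAN returns the equivalent of the following plain SQL (a sketch for illustration only; the actual object is a graphical Analytic view):

-- Sketch: the common attributes plus the January measure renamed to VALUE
SELECT ID, NAME, YEAR, M_JAN AS VALUE
FROM TEST.ACTUALS;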
Create the second Analytic view, AN_M_FEB, based on column M_FEB; the process is the same as above for M_JAN. In the Data Foundation, make sure that you select the month M_FEB, not M_JAN.
The data preview for AN_M_FEB shows values for M_FEB only.
Similarly, create the other 4 Analytic views: AN_M_MAR, AN_M_APR, AN_M_MAY, AN_M_JUN.
Create the Calculation view CA_ACTUALS_MONTH. From the scenario panel, drag and drop a "Projection" node and add the Analytic view to it. Do not select the M_JAN column, as the calculated column VALUE is used instead. Similarly, add a Projection node for each of the other Analytic views; in total, 6 Projection nodes are required, one per Analytic view.
Now add a "Union" node above the six "Projection" nodes and connect them. In the Details section, click "Auto Map by Name". The only attribute missing in the output is MONTH. In Target(s) under the Details section, create a target column MONTH with data type VARCHAR and size 3, which will contain the 3-letter month names (e.g. JAN, FEB, MAR).
Right-click on MONTH, choose "Manage Mappings", and enter the constant value for each source model accordingly.
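Taken together, the Union with Constant Value behaves like the following SQL (a sketch; view names are shortened, and in a real query the Analytic views would be addressed through the _SYS_BIC schema):

SELECT ID, NAME, YEAR, 'JAN' AS MONTH, VALUE FROM AN_M_JAN
UNION ALL
SELECT ID, NAME, YEAR, 'FEB' AS MONTH, VALUE FROM AN_M_FEB
UNION ALL
SELECT ID, NAME, YEAR, 'MAR' AS MONTH, VALUE FROM AN_M_MAR
UNION ALL
SELECT ID, NAME, YEAR, 'APR' AS MONTH, VALUE FROM AN_M_APR
UNION ALL
SELECT ID, NAME, YEAR, 'MAY' AS MONTH, VALUE FROM AN_M_MAY
UNION ALL
SELECT ID, NAME, YEAR, 'JUN' AS MONTH, VALUE FROM AN_M_JUN;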
The final Calculation view would be like:
Save, validate, activate, and run the data preview:
This is our desired output: the view with the data transposed.
But what about the performance?
Total number of records the information view contains:
To check whether the filters are pushed down to the Analytic view search, find the "BWPopSearch" operation and check the details of that node in the visual plan. Please refer to the excellent document by Ravindra Channe explaining "Projection Filter push down in Calculation View", which in turn points to Lars Breddemann's great blog "Show me the timelines, baby!"
Let us apply filter for the year 2012.
SELECT NAME, YEAR, MONTH, VALUE FROM "_SYS_BIC"."MDM/CA_ACTUALS_VALUE" WHERE YEAR = '2012';
The Analytic search when expanded will show:
Though the table in our case is small, the filter is pushed down irrespective of table size, fetching only the required records from the base table, which helps improve performance.
Implementation steps for Approach 2:
- One general Analytic view is created instead of several, with the required attributes and measures selected.
- In this case we select the 6 measures M_JAN, M_FEB, M_MAR, M_APR, M_MAY, M_JUN in addition to the common attributes.
- Create a Calculation view based on the general Analytic view created above and combine six projections of it using Union with Constant Value.
- A calculated column (VALUE) is created in each Projection node which derives the value of a particular month in a year.
Let us see this in action.
Create the general Analytic view with no calculated columns; it is simple and straightforward, as shown below:
Create the Calculation view. Drag and drop a Projection node and add the general Analytic view; select only the measure M_JAN in addition to the common attributes. Create the calculated column VALUE as shown below:
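The calculated column VALUE in this node is simply the month measure itself, so the first Projection node effectively yields the following (a sketch; AN_ACTUALS is a placeholder name for the general Analytic view):

-- Sketch of the first Projection node's output
SELECT ID, NAME, YEAR, M_JAN AS VALUE FROM AN_ACTUALS;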
Now add 5 more Projection nodes, each based on the same Analytic view. Create the calculated column VALUE in each Projection node for the respective month (M_FEB, M_MAR, etc.).
Now add a Union node above these projections; the rest of the process is the same as in Approach 1.
Implementation steps for Approach 3:
- No Analytic view is created; instead, the base table is used directly.
- Create a Calculation view with the base table used directly in each Projection node.
- Here also, 6 Projection nodes are used.
- A calculated column (VALUE) is created in each node which derives the value of a particular month in a year (see the sketch below).
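As the comment thread below shows (see Justin Molenaur's correction), plain projection nodes directly on the base table performed poorly at volume; aggregation nodes on the base table should be used instead. Each node then effectively computes something like the following (a sketch):

-- Sketch of one aggregation node in Approach 3
SELECT ID, NAME, YEAR, SUM(M_JAN) AS VALUE
FROM TEST.ACTUALS
GROUP BY ID, NAME, YEAR;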
————————————————————————————————————————————————————-
Implementation steps for Approach 4: (recommended)
Create the SQLScript as below:
BEGIN
var_out =
SELECT ID, NAME, YEAR, 'JAN' as "MONTH", M_JAN as "VALUE" from TEST.ACTUALS
UNION
SELECT ID, NAME, YEAR, 'FEB' as "MONTH", M_FEB as "VALUE" from TEST.ACTUALS
UNION
SELECT ID, NAME, YEAR, 'MAR' as "MONTH", M_MAR as "VALUE" from TEST.ACTUALS
UNION
SELECT ID, NAME, YEAR, 'APR' as "MONTH", M_APR as "VALUE" from TEST.ACTUALS
UNION
SELECT ID, NAME, YEAR, 'MAY' as "MONTH", M_MAY as "VALUE" from TEST.ACTUALS
UNION
SELECT ID, NAME, YEAR, 'JUN' as "MONTH", M_JUN as "VALUE" from TEST.ACTUALS
;
END
Output:
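A side note on the script above: because ID is the primary key, the six branches can never produce duplicate rows, so UNION ALL (which skips the duplicate-elimination step of plain UNION) is an equivalent and cheaper choice, as Pradeep also points out in the comments below. On revisions that support it, the same logic can also be packaged as a SQLScript table function instead of a scripted calculation view; the following is a minimal sketch (the function name is illustrative):

CREATE FUNCTION TEST.TF_ACTUALS_TRANSPOSED
RETURNS TABLE (ID INTEGER, NAME VARCHAR(20), YEAR VARCHAR(4),
               "MONTH" VARCHAR(3), "VALUE" INTEGER)
LANGUAGE SQLSCRIPT SQL SECURITY INVOKER READS SQL DATA AS
BEGIN
RETURN
SELECT ID, NAME, YEAR, 'JAN' AS "MONTH", M_JAN AS "VALUE" FROM TEST.ACTUALS
UNION ALL
SELECT ID, NAME, YEAR, 'FEB' AS "MONTH", M_FEB AS "VALUE" FROM TEST.ACTUALS
UNION ALL
SELECT ID, NAME, YEAR, 'MAR' AS "MONTH", M_MAR AS "VALUE" FROM TEST.ACTUALS
UNION ALL
SELECT ID, NAME, YEAR, 'APR' AS "MONTH", M_APR AS "VALUE" FROM TEST.ACTUALS
UNION ALL
SELECT ID, NAME, YEAR, 'MAY' AS "MONTH", M_MAY AS "VALUE" FROM TEST.ACTUALS
UNION ALL
SELECT ID, NAME, YEAR, 'JUN' AS "MONTH", M_JUN AS "VALUE" FROM TEST.ACTUALS;
END;

The function can then be consumed like a table, and filters on the outer query apply as usual:

SELECT * FROM TEST.TF_ACTUALS_TRANSPOSED() WHERE YEAR = '2012';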
Now you are familiar with different approaches to table transposition.
Thank You for your time.
Thanks for sharing 🙂 Nice and useful one!!!
This could be leveraged only for table transpose, right? Is there any other way we can make use of it?
Regards,
Antony Jerald.
Thanks Antony.
Let's see what others suggest, and that will become another way of doing it 🙂
Thanks for posting. Helpful information on how to transpose a table in HANA Studio without creating an additional table. Awesome...
Thanks Nageshwar
Good discussion and illustration of this topic. I believe in the same post you mention, How To: Dynamic Transposition in HANA, John Appleby mentions this technique in his first comment. You have shown very clearly how to achieve this as well as the effects of filter pushdown.
I would be very interested in how this scales on data sets > 100m. If I have some time I may give it a shot.
Regards,
Justin
I also mentioned the technique in my original document, but just glossed over it. 🙂
True and you are the one who initiated the concept Jody. Thanks.
I've done this with billions of rows and it works great. I've also used it where you have an attribute for what measure you have, which happens with weather data (e.g. Min Temp, Max Temp, Rain, Snow), and a single measure column. It works great there too.
Not as fast as having well normalized data of course.
Thanks John. I'm just not clear which approach (1/2/3) you followed, or whether it was a different one, that worked great for billions of rows.
Hi Raj,
One more nice blog from you 🙂
Observations from my side:
1) I guess we can create only one generalized analytic view and then use it 6 times (with projections), and then use a UNION node as mentioned above, instead of creating 6 analytic views (a larger number of objects), as shown below:
I have used AN_DYNAMIC_TRANSPOSE 6 times with the help of projections, filtered out the required columns, and then went for a UNION (the rest all remains the same).
Stats I got:
In this approach, the reusability of the models is good. I need just 2 models compared to 7 (with the addition of months, the number of models will increase).
Execution takes a bit longer than your model, but it hasn't been tested with a larger amount of data; it would be interesting if we could check how these 2 models behave there.
Hoping SCN experts here on performance would comment on this approach 🙂
Regards,
Krishna Tangudu
Hey guys,
First of all, this is an excellent thread of posts on Table Transposition started by Jody Hesch and followed up by Raj Kumar Salla and Lalitha Swaroop Krishna Tangudu. The great thing about this series of documents is that it delves into the details of how HANA handles different models and gives very good insight into what we can expect in terms of performance from each model.
I loved how Raj and Krishna pushed Jody's original idea ahead, and I also pushed it a little bit further.
I created a graphical Calc View with UCV, as in the previous proposals, but without creating any Analytic views.
Instead, I created the Union component on top of 6 projections mapped directly to the physical tables.
The great thing is that the Calc Engine (my system is rev68) is able to push down the filters directly to the tables, without any intermediary AN views.
Comparing the runtimes, even though the data set was really small (I considered the same dataset that Raj defined), I observed that the model on top of the tables was actually the fastest. The runtimes are described below (in order, I tested Raj's original UCV approach with 6 ANs, Krishna's approach with one generic AN and my approach, with no ANs and the Calc View directly on top of the tables).
And looking into the PlanViz, you can see the filter being pushed down to each table projection (i.e. just 5 rows coming from each one).
EDIT: I'm not able to attach pictures to SCN, I'm trying to figure out why.
Hi Henrique,
Thanks for trying this with base tables. I see below that Justin tried the same with bulk data and found that using AN views is better than direct base tables.
Regards
Raj
Check his latest comment below. 😉
Hi Krishna,
Reusing the view is indeed a good idea. Can you please do a quick check by expanding the sub-execution node (right-click and choose execute) and see whether the BWPopSearch operation is performed? If yes, then modeling effort will be reduced to a great extent.
Regards
Raj
Hi Raj,
Yes, I checked yesterday. The filters are pushed down here 🙂.
Please find the screen shot below for your reference:
Regards,
Krishna Tangudu
Thanks, Henrique. Just a few comments below:
Just to be clear, they are two different approaches to the same problem - not extensions of one another. Moreover, in my original post, I did make a reference to the same idea of Union with Constant Values as one solution:
The original use case that motivated dynamic transposition was actually transposing more than 40 Cost Elements, rather than the monthly "buckets" - hence my emphasis on the challenges of the UCV approach and preference for the "identity matrix" approach.
Also, thanks for sharing your test results. In all of my experience, aggregation in the OLAP Engine (i.e. bwPopParallelAggregation) is faster than CalcEngine aggregations, so if source data requires aggregation - I'd still recommend that source nodes in CalcView be Analytic Views rather than base tables with aggregation nodes. As a follow-on, you may want to repeat your test with large datasets - I've found that overhead in different approaches can skew true performance results when dealing with small datasets.
Finally - just want to make sure credit isn't lost to the original developers of dynamic transposition. Mentioned in my original post, but just to be sure 🙂
Hey all, good stuff going on here. I couldn't resist getting some real results against some volume. With tiny volumes like this, it is very difficult to get any results that drive toward a legitimate conclusion.
My process and goal was to see how different a UCV based on Analytic views vs. tables would be. I am not differentiating between the same ANV vs. multiple ANVs, as I feel you would always take the 'reusable' approach and just put projections on the same ANV multiple times.
Keep in mind this is executed on rev 62, so this may not be valid for future revisions. My understanding is that there are very few performance improvements in SPS6 revisions, so I wouldn't expect much difference until SPS7.
What I found is that there is a pretty drastic difference once you move to large volumes. I have some further testing in mind, but just wanted to get this published. My results concur with Jody's recommendation above - "I'd still recommend that source nodes in CalcView be Analytic Views rather than base tables with aggregation nodes." This dataset was almost 200m rows in size, so it is a more realistic test.
Source data, based on COPA.
create column table "MOLJUS02"."TESTING_UCV" (
"DOCUMENT_NO" NVARCHAR(10) not null,
"DOCUMENT_ITEM" NVARCHAR(6) not null,
"RECORD_TYPE" NVARCHAR(1) not null,
"MATERIAL" NVARCHAR(18) not null,
"MEASURE1" DECIMAL(15,2) not null default 0,
"MEASURE2" DECIMAL(15,2) not null default 0,
"MEASURE3" DECIMAL(15,2) not null default 0,
primary key ("DOCUMENT_NO", "DOCUMENT_ITEM")
);
INSERT INTO "MOLJUS02"."TESTING_UCV"
SELECT BELNR, POSNR, VRGAR, ARTNR,10.15,20.91,15.50
FROM "SE2_SH2"."CE12000";
Statement 'INSERT INTO "MOLJUS02"."TESTING_UCV" SELECT BELNR, POSNR, VRGAR, ARTNR, 10.15,20.91,15.50 FROM ...'
successfully executed in 24:59.033 minutes (server processing time: 22:02.922 minutes) - Rows Affected: 192699999
merge delta of "MOLJUS02"."TESTING_UCV";
ANV – have two joins for further testing
UCV based on ANV’s
UCV based on tables
ANALYTIC VIEW UCV
SELECT "VRGAR_T", "MEASURE_TYPE", SUM("VALUE")
FROM "_SYS_BIC"."sandbox.justin.UCV_TESTING/CV_TESTING_UCV"
GROUP BY "VRGAR_T", "MEASURE_TYPE"
Statement 'SELECT "VRGAR_T", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 302 ms 409 µs (server processing time: 258 ms 782 µs)
Fetched 15 row(s) in 0 ms 28 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "VRGAR_T", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 307 ms 56 µs (server processing time: 250 ms 448 µs)
Fetched 15 row(s) in 0 ms 68 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "VRGAR_T", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 296 ms 691 µs (server processing time: 257 ms 723 µs)
Fetched 15 row(s) in 0 ms 75 µs (server processing time: 0 ms 0 µs)
SELECT "VRGAR_T", "MEASURE_TYPE", SUM("VALUE")
FROM "_SYS_BIC"."sandbox.justin.UCV_TESTING/CV_TESTING_UCV"
WHERE "VRGAR_T" = '1'
GROUP BY "VRGAR_T", "MEASURE_TYPE"
Statement 'SELECT "VRGAR_T", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 340 ms 227 µs (server processing time: 266 ms 970 µs)
Fetched 3 row(s) in 0 ms 11 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "VRGAR_T", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 323 ms 579 µs (server processing time: 280 ms 813 µs)
Fetched 3 row(s) in 0 ms 24 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "VRGAR_T", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 371 ms 607 µs (server processing time: 295 ms 670 µs)
Fetched 3 row(s) in 0 ms 23 µs (server processing time: 0 ms 0 µs)
TABLE BASED UCV
SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE")
FROM "_SYS_BIC"."sandbox.justin.UCV_TESTING/UCV_TABLE"
GROUP BY "RECORD_TYPE", "MEASURE_TYPE"
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 29.439 seconds (server processing time: 29.400 seconds)
Fetched 15 row(s) in 0 ms 29 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 29.570 seconds (server processing time: 29.532 seconds)
Fetched 15 row(s) in 0 ms 91 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 29.023 seconds (server processing time: 28.983 seconds)
Fetched 15 row(s) in 0 ms 85 µs (server processing time: 0 ms 0 µs)
SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE")
FROM "_SYS_BIC"."sandbox.justin.UCV_TESTING/UCV_TABLE"
WHERE "RECORD_TYPE" = '1'
GROUP BY "RECORD_TYPE", "MEASURE_TYPE"
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 14.098 seconds (server processing time: 14.038 seconds)
Fetched 3 row(s) in 0 ms 9 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 14.258 seconds (server processing time: 14.220 seconds)
Fetched 3 row(s) in 0 ms 13 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 14.430 seconds (server processing time: 14.392 seconds)
Fetched 3 row(s) in 0 ms 23 µs (server processing time: 0 ms 0 µs)
Regards,
Justin
Hi Justin,
Thanks for confirming with the stats. As you already have the model, if possible I would like you to check the same with 3 different analytic views.
What we have done here is use the same analytic view 3 times, so I was expecting it to treat them as separate instances and ensure parallelism. So I felt there should be no performance deviation between using separate analytic views vs. a single analytic view, but on the smaller data set it proved otherwise.
I was also keeping an eye on CPU time utilization. Most of the time it gives better stats compared to separate analytic views, but it changes drastically; each time I take the stats I get a different result.
As you are testing with more data, I would like to get confirmation with the stats from you.
Note: I checked on HANA rev 68.
Regards,
Krishna Tangudu
I can clearly see the ANV based model (with same ANV in multiple projections) is achieving parallel processing, so I am not going to explore that one any further. From a design perspective it doesn't make a whole lot of sense to me, and it is shown to execute as expected.
See PlanViz timeline for the ANV based UCV I tested.
Regards,
Justin
Hi Justin,
Thanks for the clarifications based on your testing. So now both a single AN view and tables can be considered as solutions. Strange to see that aggregation using the Calc Engine performs better than aggregation using the OLAP engine.
I've seen many such behaviors on HANA, which in general makes it very difficult to draw conclusions or make generalized statements.
Thanks a lot for your reply 🙂
Regards,
Krishna Tangudu
That is actually expected behavior, given the improvements for HANA Live, but I'd expect it only in SPS7. The surprise is to see it already in rev62 (I believe that's Justin's rev).
Could you run PlanViz on your table-based UCV and see if the filters are being pushed down?
Hi Justin,
That's a great job. My big thanks to you for spending your time on this 🙂 🙂 🙂
Also, you made it clear that source nodes in a CalcView should be Analytic views rather than base tables with aggregation nodes.
Also, the image "UCV based on ANV's" seems broken. I request you to upload it again if possible.
One thing I still need to clarify is whether to go for the same Analytic view multiple times or for different Analytic views, especially for bulk data.
Your test results will be testimonial to all of us 🙂
Regards
Raj
Updated image, should be good now.
Also - I made a slight misstep here, the table based UCV doesn't use aggregation nodes, only projections. Therefore the test was not accurate.
I created another UCV with aggregation nodes on tables and saw that it is actually slightly faster than the ANV-based calc views! So I may have to reverse the previous conclusion regarding UCV based on tables vs. ANVs. Additionally, at the end I can see that filter pushdown is happening in both cases, so both look very viable as solutions.
Next test will be using a joined column of an attribute view in the UCV and testing that performance against a table based UCV that joins after the final union aggregation. I think John Appleby mentioned in some other posts that performing the UCV and then joining out to the required dimensions works well. I would be interested to see it against the ANV method mentioned above.
Happy HANA everyone,
Justin
UCV with aggregation nodes
SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE")
FROM "_SYS_BIC"."sandbox.justin.UCV_TESTING/UCV_TABLE_WITH_AGG"
GROUP BY "RECORD_TYPE", "MEASURE_TYPE"
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 288 ms 383 µs (server processing time: 241 ms 308 µs)
Fetched 15 row(s) in 0 ms 41 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 281 ms 922 µs (server processing time: 278 ms 821 µs)
Fetched 15 row(s) in 0 ms 46 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 272 ms 750 µs (server processing time: 269 ms 295 µs)
Fetched 15 row(s) in 0 ms 49 µs (server processing time: 0 ms 0 µs)
SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE")
FROM "_SYS_BIC"."sandbox.justin.UCV_TESTING/UCV_TABLE_WITH_AGG"
WHERE "RECORD_TYPE" = '1'
GROUP BY "RECORD_TYPE", "MEASURE_TYPE"
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 235 ms 984 µs (server processing time: 232 ms 383 µs)
Fetched 3 row(s) in 0 ms 11 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 266 ms 191 µs (server processing time: 262 ms 401 µs)
Fetched 3 row(s) in 0 ms 17 µs (server processing time: 0 ms 0 µs)
Statement 'SELECT "RECORD_TYPE", "MEASURE_TYPE", SUM("VALUE") FROM ...'
successfully executed in 264 ms 126 µs (server processing time: 261 ms 281 µs)
Fetched 3 row(s) in 0 ms 17 µs (server processing time: 0 ms 0 µs)
Henrique - for your request, here is the Vizplan for the second query. There is no explicit *Search operation going on, but from the row count I can see that the filter is definitely getting pushed down (1 record from each source).
To get another look at this, I changed the query to use a WHERE clause containing the constant value column defined in the UNION. My expectation was that the tables that didn't contain the values would not be accessed.
SELECT "RECORD_TYPE",SUM("VALUE")
FROM "_SYS_BIC"."sandbox.justin.UCV_TESTING/UCV_TABLE_WITH_AGG"
WHERE "MEASURE_TYPE" = '''MEASURE1'''
GROUP BY "RECORD_TYPE"
UCV on tables with aggregation nodes: behavior as expected; only one table is accessed with ceAggregationPop. There are some interesting nodes to the left, and I'm not sure what's going on there; they might be the other two tables leaving some type of artifact indicating that their source data is not applicable to the UCV?
Same operation on the UCV based on ANVs - the filter is pushed down and only one ANV is accessed.
SELECT "VRGAR_T",SUM("VALUE")
FROM "_SYS_BIC"."sandbox.justin.UCV_TESTING/CV_TESTING_UCV"
WHERE "MEASURE_TYPE" = '''MEASURE1'''
GROUP BY "VRGAR_T"
Interesting finding on base table being faster than AN views. Not the first time we've seen unintuitive behavior in HANA, and certainly not the last!
Thanks for the confirmation. And yes, the aggregation step took longer than any of the others in my tests.
Thanks Justin.
Hey Raj, just a slight correction to your above revision based on our findings.
In Approach 3, it's critical that you don't use the base tables directly in a projection node. As my earlier (erroneous) test showed, if you just use a projection the performance is terrible. To get good response times, you must use an aggregation node on the base tables.
Regards,
Justin
I did have the word "alternatives" in my original text that I somehow lost and had to rewrite. 🙂
But I think it's clear to everyone that joins and UCVs are alternatives to each other.
Good One Raj!! Thanks for sharing 🙂
Nice blog, thanks for sharing. KV.
Pretty good article Raj.
Hi Raj,
Good Day!
I understand now why your name is Raj. You are a real king at writing technical blogs. Good work. Keep up the good work.
Keep sharing !
Regards,
Hari Suseelan
Thanks Hari for your kind words. Down the line you will see many kings coming, and one day you will be one too.
Very informative.
thanks for sharing your knowledge!
regards
Thanks Azeem and Swaroop.
Good explanation.
Good article
Really a valuable information on Transpose.
Thanks,
Ramesh
I thought we could do this through ETL only... nice stuff Raj.
Well written, well researched, well produced.
On top of that the content is useful and the solution is not obvious or 'easy' to find without experience with this problem.
In short: this is the kind of blog post I personally love to read!
Thanks for that and keep coming with this sort of stuff 😉 .
- Lars
Thank you Lars for your wonderful comment.
Great Stuff Raj
Another awesome blog from you
Thanks a lot for sharing this
Regards,
Vivek
Thanks Vivek.
Great Effort!!
- Ashwin
What a fruitful day it's been. This morning I didn't know what the acronym UCV meant nor applied constant values in calculation views!
Now, after working through Thinking in HANA - Part 1: Set Operators by Jody Hesch and onwards to table transposition and the alternatives here, I'm starting to appreciate the beauty of modelling and develop a love for it. Blogs like this, with theory supported by well-documented practice, are the treasures of SCN not available in current books, and thus a boon to those not from a modelling/BI background.
So thanks for the efforts, now onwards to dynamic transposition! 🙂
Awesome, Jon-Paul. Your feedback is what motivates folks here on SCN to keep up with these posts. I look forward to seeing what you think of the dynamic transposition article! And all of its extensive commentary. 🙂
Thank you Jon-Paul for your kind comment. Hope you too will contribute your experience in SCN in the near future. Wish you a good fruitful journey in HANA 🙂
Hello Raj,
Just wondering if you have HA300 SP6 material handy - I am appearing for my certification and completed HA300 SP5 training from SAP in Summer, 13. But now SP5 examination has been discontinued. So, would appreciate if you can help.
Thanks,
-Sudhir
Hello Sudhir,
Sorry to say I don't have.
Hello Raj,
Great Stuff. Really you made it very easy and well documented. Hoping you come with more and more great stuff.
Regards
Sri
Thanks Sri
Very Good one
Well documented and good stuff.-Thanks Raj for posting this!
Tried both approaches... with single and multiple Analytic views.
Regards
Mary
Hi Raj,
I've a similar requirement. Instead of processing the same ANVs or different ANVs multiple times in parallel, how about going for a scripted CV where we can read the data once into a temp table (a smaller set of data) and use it for further calculations or for applying filters? For handling huge volumes (300K), I wanted to know if it works. Any views on this, please?
Thanks,
Satish
Hi Satish,
Try using approach 2 and let me know.
Regards
Raj
This is really good and useful and helpful.
Thanks Raj !
Regards
Hyukjoo
Hey All,
Thanks for the info. I have tried it myself; however, I was just wondering how I can filter on the MONTH in the Data Preview?
This is so I will be able to filter the data according to JAN, FEB, etc.
I don't seem to be able to get it filtered.
Your help is much appreciated.
Regards,
Robert
Hi Glen,
At Data preview, use the option "Add Filter" to filter from the result set.
I was able to achieve the output format using simple SQL: 🙂
SELECT ID, NAME, YEAR, 'JAN' as "MONTH", M_JAN as "VALUE" from ACTUALS
UNION ALL
SELECT ID, NAME, YEAR, 'FEB' as "MONTH", M_FEB as "VALUE" from ACTUALS
UNION ALL
SELECT ID, NAME, YEAR, 'MAR' as "MONTH", M_MAR as "VALUE" from ACTUALS
UNION ALL
SELECT ID, NAME, YEAR, 'APR' as "MONTH", M_APR as "VALUE" from ACTUALS
UNION ALL
SELECT ID, NAME, YEAR, 'MAY' as "MONTH", M_MAY as "VALUE" from ACTUALS
UNION ALL
SELECT ID, NAME, YEAR, 'JUN' as "MONTH", M_JUN as "VALUE" from ACTUALS;
Hi Pradeep,
Yes you are right 🙂
Hello All,
I faced a typical issue in transposing to columns.
A fixed set of column transpositions I have done using only CVs and filters, e.g. 6 columns.
A couple of months ago I had another problem.
I have a customized finance table.
I have to transpose the data to around 60+ columns, and it may still grow in the future.
Any suggestions?
Can we do this with CVs? I would say it's not possible, because of the growing number of columns and the huge performance cost.
It may only be possible with procedures/scripted CVs, but unfortunately I got lost applying that solution.
My point is: how can I handle this situation?
Is there any code I can use to overcome this kind of problem?
--cheers
Hi Lajwanth,
Have you seen the last approach in this document, which uses a SQLScript CV to achieve the transpose?
Regards
Raj
Thank you Raj.
I hope you are talking about Approach 3; I missed it, but it still may not solve my problem, because I can't create 60 projections. Doing a union (even after dividing the projections into small chunks and unioning them) is not working/recommended; up to 20 projections (with two union chunks) the unions work fine, but beyond that system performance suffers on my system.
Is there any alternative option?
-Lajwanth
Hi Lajwanth Singh,
Yes, in fact, there is a solution. Your requirement is the exact one we solved on a project several years ago.
See my post here: How To: Dynamic Transposition in HANA
Raj Kumar Salla - thank you for actually referencing my original post. That being the case, I'll have to kindly disagree with your statement about a scripted approach. 🙂
A scripted approach, especially for a high number of columns to transpose, would not be the best approach due to:
1) Excessive development effort to hand-code all of the required transposition nodes.
2) Maintenance challenges for such a code base, i.e. defects in code, readability.
3) Performance deficiencies resulting from materialization of all fields (unlike graphical solutions which only materialize columns requested).
4) Optimizations for graphical views that are missing from scripted approaches.
Thanks,
Jody
Got it..
Thank you Jody, 🙂
-Lajwanth
Hi Lajwanth,
I'm talking about Approach # 4 using SQLScript.
I would like to know which approach you use once you are done. If it is new, please share it with us.
Thank You
Raj Kumar Salla
Hi Jody Hesch
I agree with you.
Having n number of columns makes the modeling complicated.
Let's wait for Lajwanth Singh's feedback on the approach he followed.
Regards
Raj
Hi Raj,
I saw the 3rd approach; due to the large number of comments I missed Approach 4.
We didn't follow any approach yet, because we are in the go-live phase and this requirement will come up in our next sprint...
If I find a better approach I will definitely post it.
-Lajwanth.
Hello,
Is it possible to transpose rows to columns, where the columns might be dynamic?
Ex: Input
And the output:
The Col2 values might increase dynamically.
Is this possible in HANA, since a pivot function is not present?
In a calculation view, the columns have to be fixed.
If the number of columns changes, you can go for dynamic SQL, as sketched below.
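For illustration, here is a hedged sketch of that idea: a SQLScript procedure that assembles the UNION ALL statement from the column catalog and executes it. All names are illustrative, and the dynamic-SQL facility (EXECUTE IMMEDIATE, or EXEC on older revisions) should be verified on your revision:

CREATE PROCEDURE TEST.UNPIVOT_ACTUALS AS
BEGIN
DECLARE v_sql NVARCHAR(5000) := '';
-- Cursor over the M_* columns of the source table
-- (ESCAPE is used because underscore is itself a LIKE wildcard)
DECLARE CURSOR c_cols FOR
SELECT COLUMN_NAME
FROM TABLE_COLUMNS
WHERE SCHEMA_NAME = 'TEST'
AND TABLE_NAME = 'ACTUALS'
AND COLUMN_NAME LIKE 'M\_%' ESCAPE '\'
ORDER BY POSITION;
-- Assemble one SELECT branch per month column
FOR r AS c_cols DO
IF LENGTH(:v_sql) > 0 THEN
v_sql := :v_sql || ' UNION ALL ';
END IF;
v_sql := :v_sql || 'SELECT ID, NAME, YEAR, ''' || SUBSTRING(r.COLUMN_NAME, 3)
|| ''' AS MONTH, ' || r.COLUMN_NAME || ' AS VALUE FROM TEST.ACTUALS';
END FOR;
-- Execute the assembled statement and return its result set
EXECUTE IMMEDIATE (:v_sql);
END;

CALL TEST.UNPIVOT_ACTUALS();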
Hi Raj,
This is an amazing write up. It helped me a lot.
I am also trying to transpose a table from rows to columns. Is there already a blog that explains rows to columns, or could you write one in detail?
Appreciate your help!!!
Thanks,
Ram
Hi Raj,
Thanks for such a wonderful blog. Now I have a requirement where I have around 105 columns that need to be transposed to rows. Could you please suggest which approach I can follow for my requirement?
Ex:
In my source I have the fields below: XXM01, XXM02 ... XX100; we have these up to 100, and they are accounts.
Now in the target I have a field called Account, to which I need to map XXM01 ... XX100, with their values going to the Amount field, as below.
Could you please suggest which approach I should follow?
Thanks,
Pavan
Hi Pavan,
Better to go for a script-based table function.
I would like to know how you did it.
Regards
Raj
Hi Raj,
Can you provide me with such a table function? I have a very similar requirement to the questioner above, with 700 columns to be transposed to rows.
Best Regards,
Zsolt
Hi Raj,
I used Matrix method.
Thanks,
Pavan
Halil Guenal has written a blog about using a generic SQLScript procedure for (un)pivoting tables.
To me, this looks like a great improvement on the methods described in this blog post.
Have a look here:
https://blogs.sap.com/2019/04/21/how-to-pivotunpivot-in-sap-hana/