 # Cumulative Sum / Running Total in HANA Calculation View

Requirement :

Most of the popular reporting tools can handle cumulative sum or running total , but still there have been lot of posts in SCN about whether it is possible to do it in HANA itself.  So, just wanted to see , if it can be easily done via Graphical Calculation view and it seems to be working .

Conditional summation – If in Calculated column

Note: This might still be easier to do in Scripted View or Reporting Tool .  Scripted Calculation View is deprecated now, instead use Table Function .

In Figure 1, our base data is in Column 1 and 2 and we want to add Colum 3 ( “C UM_SUM” ) to existing data set. . Figure 1 : Base data and result

Detail Steps :

Figure 2 shows overall design of the calculation view.  It is pretty simple one, isn’t it ? Figure 2 : Graphical Calculation View

a) I have two projections (  ALL_DATA and ALL_DATA2 )  of the base data ( first two columns in Figure 1 )

b) Created one Calculated Column “DUMMY” on each projection and assigned it a value 1 .  You can assign any other constant value .

c) In join node ( JOIN_1 ) ,(inner)  joined these two data sets via ‘DUMMY” column  .  After joining output would be like below Figure 3: Output of Join node.

If you notice, for every month now I shall have 12 records  ( overall 12 X 12 = 144 ) .  Where my Month matches with Month1 , that is my actual value for that month . And I need to sum all other values where  Month >= Month1 .  So, for above example, sum of  10, 20 and 30 would give me cumulative sum for Month 3.

To do this, I defined two calculated attributes  , one to check if Month = Month1  and another to check If Month >= Month1  ( refer Figure 4 ) . Figure 4: Calculated Columns ( Attributes) on Join_1 node

d) Now, I have two aggregation nodes. One with filter SAME_MONTH = “Y’ and another with GREATER_EQ_MONTH = ‘Y’ . Figure 5 : Aggregation node ( C UM_SUM )  –

I took Month and Amount_1 and renamed Amount_1 to C UM_SUM . Figure 6 : Aggregation_2

Took Month and Amount with a Filter SAME_MONTH = ‘Y’ .

e) Lastly , we need to union these two aggregation nodes.  Take Amount from one node and C UM_SUM from another node. Figure 7 : Union Node

Result is in Figure 1 itself

Note: Model can be much more complex based on number of characteristics you take, in my case it was only Month . If you really want to go for this approach ( have enmity with script or coding 🙂   ) , do not use the base tables in JOIN_1 . Use this model after you aggregate data to a certain extent.  Obviously , you would like to avoid a join m X m where m is a large number .

Equivalent SQL Code would be as simple as below . We can create a Calculation View ( Scripted) or  Table Function using this SQL .

a )

``````
select a."Month" ,a."Amount" , ( select sum(b."Amount") from "<SCHEMA>"."RUN_SUM" b  where a."Month" >= b."Month" ) as run_sum
from "<SCHEMA>"."RUN_SUM" a
order by a."Month"

``````

Or you can use Window function for better performance

b)

``````
select "Month" , "Amount" , Sum("Amount") over ( order by  "Month") from  "<SCHEMA>"."RUN_SUM"
order by "Month"

``````

SQL in Section c running against a table with 1.3 billion records and I get the results in 40 micro seconds

c )

``````
SELECT CALMONTH , ORDER_QUAN , SUM (ORDER_QUAN) OVER ( ORDER BY CALMONTH) AS SUM_VAL
FROM
( SELECT CALMONTH , SUM(ORDER_QUAN) AS "ORDER_QUAN" FROM "MY_TABLE"
WHERE CALMONTH BETWEEN '201401' AND '201603'
GROUP BY CALMONTH
)
``````

### Assigned Tags

You must be Logged on to comment or reply to a post. Nice work!

Although personally I think that you hit the nail on the head with this remark

Note: This might still be easier to do in Scripted View or Reporting Tool .

It's very likely also more efficient to simply use window functions in plain SQL or a table typed function (R.I.P. scripted calc views). Anindya Bose
Blog Post Author

Hi Lars

Thanks for the feedback. It is good to get feedback from experts like you .

Wanted to see if this can be done without a piece of code .  I shall add equivalent SQL code to achieve the same .

Regards

Anindya Anindya Bose
Blog Post Author

Hi Lars

I see SQL Code and Graphical Calculation view both are creating  Nested Loop Join . Does system understand what I am trying to do and creating Calc Scenario intelligently ?

Here is part of PlanViz . Please shed some light here . Regards

Anindya From what I can see you decided to implement the cumulative sum in SQL in a very similar fashion then what you did in the calc. view.

The scalar subquery gets rewritten into a join and that's what you get in PlanViz.

A more efficient approach to ***. sum is to use window functions instead. Hi Folks,

I liked this approach for calculating the Cumulative sum which I am using in Hana but my concern is I am dealing with the millions of records whenever the Query is run it goes back to the 1 st record & start calculating the Sum based upon the Amount .

Is there any approach by which I could  save my Sum as on today somewhere & then calculate the Sum from this point itself instead of going back to the 1st record

Hope that clears what I am trying to implement

Regards
Rahul Hi all,

someone try to add more dimension (with some cut-off dimension) to the model and can help me?

Andrea Hi Andrea,

have you or anyone tried with multiple dimensions? I could get them added using the windows function by grouping these dimensions, but it seems that if in that particular month, there are absolutely no data for a dimension value such as plant 1001, the existing cumulative value for previous month does not roll up to that future month.

example:

Jan 2017 plant 1000   Value : 500

Jan 2017 plant 1001  Value : 200

Feb 2017 plant 1001  Value : 150

The result using the windows functions are giving me cumulative as below

Jan 2017 Plant 1000 Value : 500

Jan 2017 Plant 1001 Value : 200

Feb 2017 Plant 1001 value 350

I would expect another record for Feb 2017 where Plant 1000 = Value 500 being brough forward from previous month.

The syntax i am using are example below

SELECT "CalWeek", "Plant", "StorageLocation", "Value", SUM ("Value") OVER ( partition by "Plant", "StorageLocation" ORDER BY "CalWeek" ) AS "Cum_Value"
FROM "Table"
order by "CalWeek";

Eddie Hi ,

I am new to SAP HANA Calculation view and I modified the design a bit and this is also working. 1. Table TAB1 has three columns as mentioned below, 2) Load the same table in two different Projections Projection_1 and Projection_2

3) Do inner join between these projections based on the DUMMY variable value(DUMMY value can be any constant value must be same for all the months), after the INNER JOIN the design and data looks likes below,  4) In Projection_3 create a Filter FILTER_GREATER and enter the formula like this

if("MONTH_1"<="MONTH",'Y','N') 5) In the Aggregation_2 node set the Expression as below, Also In the Aggregation node apply the aggregation function to AMOUNT1 which will store the cumulative sum of Amount 6) In the Final Projection, rename the AMOUNT1 as Cumulative_AMOUNT 7) The final value looks like this, Note : 