Advanced Formulas – How they work
You can learn how to work with advanced formulas in different ways. You can study the syntax , understand the basic rules, learn from examples or simply use trial and error. Probably you will use all these approaches together.
When I started working with advanced formulas I did it in the same way. But after a short time, I felt that the trial and error part still was too big. Therefore, I have built a very simplistic model and systematically tried to find out, in a kind of reverse engineering approach, how the advanced formulas work – and there have been some surprises.
In this blog, I want to share this experience. My focus is on the core calculations themselves, i.e. the DATA and RESULTLOOKUP statements that you write down to perform your actual calculations, the IF conditions, and the variables. These are the building blocks to set up more complex formulas.
We used this approach when building the content package “Integrated Financial Planning for SAP S/4HANA”. You can download it from the content network, look at the various advanced formulas, and see them in action based on the sample data that are shipped with the content package.
My simplistic model contains only three dimensions:
- Date dimension with two values 202101 and 202102
- Generic dimension Dim1 with values # (Unassigned), A1, A2 and A3
- Account dimension Measure with values Amount, Quantity, and Price
In the following examples, you will always see the advanced formula and the data before and after the execution. “-“ in these tables means “Null”, meaning this record does not exist. Changed cells are highlighted in yellow.
Now let’s take a look at the different examples, starting very simple and then building on each other.
1. Single Assignment of Constant
DATA([d/Measure] = "Quantity", [d/Dim1] = "A1", [d/Date] = "202101") = 1
Single assignment means that one single cell of the data cube that is spanned by the three dimensions is addressed by a single value of each of the dimensions.
2. Bulk Assignment of Constants
DATA([d/Measure] = "Quantity") = 1
Here, only one dimension is addressed by a single value. The other two dimensions are not mentioned at all in the formula. In this case, data records are generated by multiplying the master data combinations, in this case 2 * 4 = 8.
3. Single Record Calculation
DATA([d/Measure] = "Quantity", [d/Dim1] = "A1", [d/Date] = "202101") = RESULTLOOKUP([d/Measure] = "Quantity", [d/Dim1] = "A1", [d/Date] = "202101") * 2
This is now the first time that we access existing data on the right side of the formula with the RESULTLOOKUP statement. As in example 1, a single cell is addressed explicitly.
4. Bulk Calculation
DATA([d/Measure] = "Quantity") = RESULTLOOKUP([d/Measure] = "Quantity") * 2
As in example 2, we now leave out two of the dimensions and the formula automatically expands to a whole data slice. However, there is one fundamental difference which you will fully recognize in the next chapter and which has enormous impact on the performance: This formula is only calculated based on the existing fact data, master data combinations are not generated in the background.
5. Calculation Plus Constant
DATA([d/Measure] = "Quantity") = RESULTLOOKUP([d/Measure] = "Quantity") * 2 + 1
This example brings together 2 and 4 and reveals the first surprise: No records are generated for A3. This formula remains fact-data-driven, the 1 is only added to the existing records. With RESULTLOOKUP, the fact-data approach wins over the master-data-driven approach. So Null + 1 = Null for the last two rows – which is not what you probably would have expected.
Now we introduce a new behavior by creating “asymmetric” formulas:
DATA([d/Measure] = "Quantity", [d/Dim1] = "#") = RESULTLOOKUP([d/Measure] = "Quantity")
The DATA result is filtered by a single value for Dim1, but in the RESULTLOOKUP Dim1 is left out. This asymmetric assignment leads to an aggregation: For each Date value, the quantity for all the Dim1 members (#, A1, A2, A3) is aggregated and written to Dim1 = #.
What happens when we add a constant in addition?
DATA([d/Measure] = "Quantity", [d/Dim1] = "#") = RESULTLOOKUP([d/Measure] = "Quantity") + 1
The next surprise happens: The aggregation that we have introduced by the asymmetric formula takes place after the calculation. This means that the 1 is added to all the members and then aggregated. Otherwise we would have got 31 and 301 as a result. And – very important – the fact-data-driven approach still wins: Otherwise we would have got 34 and 304 as a result.
Now let’s try the other way around: The filter on Dim1 = # is in the RESULTLOOKUP on the right side and Dim1 is omitted in the DATA term. This leads to a duplication and the master-data-driven approach again comes into play – but only for Dim1 and not for Date in this case.
DATA([d/Measure] = "Quantity") = RESULTLOOKUP([d/Measure] = "Quantity", [d/Dim1] = "#") * 2
In the next step, we will work with several RESULTLOOKUPs on the right side. Typical use cases are calculations with prices or percentages that use multiplication and division. As we will see, this leads to a join of datasets.
DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Quantity") * RESULTLOOKUP([d/Measure] = "Price", [d/Dim1] = "#")
Very often, the different parts of the calculation are stored on different levels. In this example, the prices are independent of the dimension Dim1 and are therefore stored with Dim1 = #, but the quantity is planned on the level Dim1/Date. Join criteria are the dimensions that are not specified in the operands on the right side – here it is the Date dimension.
The rule for the left side is generalized from the rules above:
- Aggregation takes place using dimensions that are specified with a single value on the left side but do not occur in any of the operands on the right side.
- Duplication is done for dimensions that are not specified with a single value on the left side, but for all operands on the right side.
- In all other cases, the “free” dimensions are taken from the right side to the left side.
In the example above, there is no aggregation and duplication. Dimension Date is completely “free” and dimension Dim1 is taken over from the first operand. Join criterium is the Date value.
9. Virtual Variable Members
Virtual variable members are a powerful concept to store intermediate results in an advanced formula. Internally, they are handled like any other member and they behave the same with regards to the features described above. The only difference is that the data in its corresponding data slices disappears after execution of the data action. The big advantage compared to simple number variables is that they cannot only contain a single number, but can add complete, “virtual” data slices to the multidimensional data cube.
For the next example, assume that you use actuals from last year as starting point for your plan, but you want to smooth out the periodic fluctuations.
VARIABLEMEMBER #TOTAL OF [d/Date] DATA([d/Measure] = "Quantity", [d/Date] = #TOTAL) = RESULTLOOKUP([d/Measure] = "Quantity") DATA([d/Measure] = "Quantity") = RESULTLOOKUP([d/Measure] = "Quantity", [d/Date] = #TOTAL) / 2
This little example visualizes how variable members work. The first equation aggregates the quantity across the dates. The second equation is another example of the duplication in example 7. It calculates the average value and moves it to each of the two periods.
How to make this more dynamic and fact data instead of master data driven? Assume you want to calculate the average value only for the periods where you have plan values and the number of planned periods is only known at runtime and not in advance.
VARIABLEMEMBER #TOTAL OF [d/Date] VARIABLEMEMBER #NUM OF [d/Date] DATA([d/Measure] = "Quantity", [d/Date] = #TOTAL) = RESULTLOOKUP([d/Measure] = "Quantity") DATA([d/Measure] = "Quantity", [d/Date] = #NUM) = RESULTLOOKUP([d/Measure] = "Quantity") * 0 + 1 DATA([d/Measure] = "Quantity") = (RESULTLOOKUP([d/Measure] = "Quantity") * 0 + 1) * RESULTLOOKUP([d/Measure] = "Quantity", [d/Date] = #TOTAL) / RESULTLOOKUP([d/Measure] = "Quantity", [d/Date] = #NUM)
I’ve introduced another variable member #NUM which counts the number of planned periods. Note the trick to multiply the RESULTLOOKUP by 0 to make this fact data driven.
10. IF with RESULTLOOKUP
In the following two examples, we are using cell value conditions.
The first example is simple, straightforward, and intuitive:
IF RESULTLOOKUP([d/Measure] = "Quantity") > 10 THEN DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 2 ENDIF
However, the second example reveals some surprises and may be counterintuitive, but in the end the IF with RESULTLOOKUP works in a consistent way.
IF RESULTLOOKUP([d/Measure] = "Quantity", [d/Date] = "202101") = 10 THEN DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 2 ELSEIF RESULTLOOKUP([d/Measure] = "Quantity", [d/Date] = "202101") != 10 THEN DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 3 ELSE DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 4 ENDIF
At first glance, it might be strange that the rows with Date = 202102 for Dim1 = A1 and A2 also are multiplied. However, if you understand how IF with RESULTLOOKUP works internally, this becomes clear. You can interpret IF in the following way: At runtime, it is transformed into a condition on member Ids for the dimensions that are not part of the RESULTLOOKUP:
IF [d/Dim1] = "A1" THEN DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 2 ELSEIF [d/Dim1] = "A2" THEN DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 3 ELSEIF [d/Dim1] = ("A3", "#") THEN DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 4 ENDIF
Next, let’s look at ELSE. One could think that the first two conditions “=10” and “!=10” cover all cases and ELSE is not needed. This is not correct as IF with RESULTLOOKUP is fact data driven and there is no fact data for Quantity for Dim1 = # and A3. ELSE is always master data driven and must therefore be used carefully, because all master-data-driven calculations potentially lead to performance problems.
The key take-aways of this exercise are:
- Formulas can work master-data or fact-data-driven. The system switches from a fact-data-driven approach to a master-data-driven approach in the following cases: Assignment of constants, asymmetric formulas with duplication, and the ELSE clause.
- Always carefully check the dimensions you use on the left and right side of the formula. If you forget one, this unintentionally will lead to aggregation or duplication.
- Master-data-driven formulas bear the risk of performance problems because master data combinations are multiplied out.
- Calculations are done before aggregation and not vice versa.
- IF with RESULTLOOKUP may feel counterintuitive: The more restrictive the condition, i.e. the more dimensions are included in the RESULTLOOKUP, the more data is affected if the condition is fulfilled.
- Like SQL, advanced formulas are a declarative programming language. They describe the “What” rather than the “How”. Other formula languages, e.g. FOX in SAP Business Planning and Consolidation, are mostly imperative languages.
For additional information, please see the following references:
Integrated Financial Planning for SAP S/4HANA with SAP Analytics Cloud
Visit your community topic page to learn more about SAP Analytics Cloud
Find all Q&A about SAP Analytics Cloud and feel free to ask your own question
"Counterintuitive" is a key word when talking about advanced formula language.
11 years ago I started with BPC standard script logic. It was limited and sometimes also counterintuitive. But today I compare script logic with advanced formula and looks like script logic is much better 🙂
Agreed, the new syntax and keywords seem like a step backwards.
Nice document. I was looking for a similar document since i am not from program background. Thanks for the same.
Now, i am struck up in writing the logic to consider GL account hierarchy (Parent node) in the IF condition in the Result lookup,
I want to write a code IF , say GL account group Base member, Hierarchy node is "COS", then, the code should work.
I am getting error. Can you throw some light on this.
"How to use Base Member (Hierarchy ) in the Result look up statement"
I think you don't need IF with RESULTLOOKUP. You simply can use the BASEMEMBER function within the IF, e.g.
Yes. It is working. Thanks
One more thing,
I want to add one condition, example, if the value of one measure is >0, then, one statement,
Else, another statement, that, I think , I need to use Result lookup.
Please share your views.
Yes, you can combine different conditions, e.g. a condition on a member or member property and a condition on a cell value (RESULTLOOKUP), either by nesting them or combining them with AND or OR.
Hi Hartmut Koerner
Nice explanation of data actions, I am still relatively new to data actions and I would like to understand if we can manipulate dimension columns in a model as well? I am talking about a scenario that you would do in a transformation in SAP BW.
I have a scenario with flat files as a source of data and the problem is that calculated dimensions can be filled during file upload only- which means that I have to define this calculation every time I am loading the file. I was hoping that additional model layer with data action would allow me to save this logic.
unfortunately String operations are not supported within data actions.
Hi @Hartmut Koerner ,
Big Thanks to you for a very consise explanation on Data Actions ,but still I am quite new to this script thing ( Advanced Formula ) in Data Actions and I would like to know if we can get input from a prompt ( Variable ) on story where the input would be in either percentage or decimal and based on the two types of input respectively , the calculations must be carried out for the input and result must be shown as a column in story.
Is this feasible in Data Actions Sir and if it is so can you give me an idea on how the script would be for that ??
you must define a data action parameter of type number
and can access it in the formula using "%":
DATA() = RESULTLOOKUP() * %Param1%
Thank you for providing this excellent explanation and examples. However, I cannot understand how the IF statement in the last example is internally interpreted so differently. Is there any other additional content to help in understanding?
IF [d/Dim1] = "A1" THEN
DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 2
interpreted same as
IF [d/Dim1] = "A1" THEN
DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 2
To me the two conditions are completely different, and should yield different results. It would be great to know the interpretive process.
I assume you have a copy/paste error here and wanted to compare
First it is important not to think in "data rows" but in "cells" of the data cube. The first IF condition is evaluated for all data cells with Measure = Quantity and Data = 202101. And now it becomes interesting: In case the condition is true for a data cell, it is true not only for this cell, but for all other cells with respect to the Measure and Date dimension. In this example, the IF condition is true for Dim1 = A1, Measure = Quantity, Date = 202101, but it is true for all cells with Dim1 = A1! And this is why the condition can be reformulated as IF Dim1 = A1.
Thank you for reading my question (correcting the incorrect copy& paste) and answering it.
Your confirmation of the unusual interpretation of the IF condition helps but from syntax point of view it is completely unexpected.
Thank you again.