Introduction

hartmut_koerner · ‎05-03-2021

Introduction

You can learn how to work with advanced formulas in different ways. You can study the syntax , understand the basic rules, learn from examples or simply use trial and error. Probably you will use all these approaches together.

When I started working with advanced formulas I did it in the same way. But after a short time, I felt that the trial and error part still was too big. Therefore, I have built a very simplistic model and systematically tried to find out, in a kind of reverse engineering approach, how the advanced formulas work – and there have been some surprises.

In this blog, I want to share this experience. My focus is on the core calculations themselves, i.e. the DATA and RESULTLOOKUP statements that you write down to perform your actual calculations, the IF conditions, and the variables. These are the building blocks to set up more complex formulas.

We used this approach when building the content package “Integrated Financial Planning for SAP S/4HANA”. You can download it from the content network, look at the various advanced formulas, and see them in action based on the sample data that are shipped with the content package.

Example model

My simplistic model contains only three dimensions:

Date dimension with two values 202101 and 202102

Generic dimension Dim1 with values # (Unassigned), A1, A2 and A3

Account dimension Measure with values Amount, Quantity, and Price

In the following examples, you will always see the advanced formula and the data before and after the execution. “-“ in these tables means “Null”, meaning this record does not exist. Changed cells are highlighted in yellow.

Now let’s take a look at the different examples, starting very simple and then building on each other.

1. Single Assignment of Constant

DATA([d/Measure] = "Quantity", [d/Dim1] = "A1", [d/Date] = "202101") = 1

Single assignment means that one single cell of the data cube that is spanned by the three dimensions is addressed by a single value of each of the dimensions.

2. Bulk Assignment of Constants

DATA([d/Measure] = "Quantity") = 1

Here, only one dimension is addressed by a single value. The other two dimensions are not mentioned at all in the formula. In this case, data records are generated by multiplying the master data combinations, in this case 2 * 4 = 8.

3. Single Record Calculation

DATA([d/Measure] = "Quantity", [d/Dim1] = "A1", [d/Date] = "202101") = 

  RESULTLOOKUP([d/Measure] = "Quantity", [d/Dim1] = "A1", [d/Date] = "202101") * 2

This is now the first time that we access existing data on the right side of the formula with the RESULTLOOKUP statement. As in example 1, a single cell is addressed explicitly.

4. Bulk Calculation

DATA([d/Measure] = "Quantity") = RESULTLOOKUP([d/Measure] = "Quantity") * 2

As in example 2, we now leave out two of the dimensions and the formula automatically expands to a whole data slice. However, there is one fundamental difference which you will fully recognize in the next chapter and which has enormous impact on the performance: This formula is only calculated based on the existing fact data, master data combinations are not generated in the background.

5. Calculation Plus Constant

DATA([d/Measure] = "Quantity") = RESULTLOOKUP([d/Measure] = "Quantity") * 2 + 1

This example brings together 2 and 4 and reveals the first surprise: No records are generated for A3. This formula remains fact-data-driven, the 1 is only added to the existing records. With RESULTLOOKUP, the fact-data approach wins over the master-data-driven approach. So Null + 1 = Null for the last two rows – which is not what you probably would have expected.

6. Aggregation

Now we introduce a new behavior by creating “asymmetric” formulas:

DATA([d/Measure] = "Quantity", [d/Dim1] = "#") = 

  RESULTLOOKUP([d/Measure] = "Quantity")

The DATA result is filtered by a single value for Dim1, but in the RESULTLOOKUP Dim1 is left out. This asymmetric assignment leads to an aggregation: For each Date value, the quantity for all the Dim1 members (#, A1, A2, A3) is aggregated and written to Dim1 = #.

What happens when we add a constant in addition?

DATA([d/Measure] = "Quantity", [d/Dim1] = "#") = 

  RESULTLOOKUP([d/Measure] = "Quantity") + 1

The next surprise happens: The aggregation that we have introduced by the asymmetric formula takes place after the calculation. This means that the 1 is added to all the members and then aggregated. Otherwise we would have got 31 and 301 as a result. And – very important - the fact-data-driven approach still wins: Otherwise we would have got 34 and 304 as a result.

7. Duplication

Now let’s try the other way around: The filter on Dim1 = # is in the RESULTLOOKUP on the right side and Dim1 is omitted in the DATA term. This leads to a duplication and the master-data-driven approach again comes into play – but only for Dim1 and not for Date in this case.

DATA([d/Measure] = "Quantity") = 

  RESULTLOOKUP([d/Measure] = "Quantity", [d/Dim1] = "#") * 2

8. Join

In the next step, we will work with several RESULTLOOKUPs on the right side. Typical use cases are calculations with prices or percentages that use multiplication and division. As we will see, this leads to a join of datasets.

DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Quantity") * 

  RESULTLOOKUP([d/Measure] = "Price", [d/Dim1] = "#")

Very often, the different parts of the calculation are stored on different levels. In this example, the prices are independent of the dimension Dim1 and are therefore stored with Dim1 = #, but the quantity is planned on the level Dim1/Date. Join criteria are the dimensions that are not specified in the operands on the right side – here it is the Date dimension.

The rule for the left side is generalized from the rules above:

Aggregation takes place using dimensions that are specified with a single value on the left side but do not occur in any of the operands on the right side.

Duplication is done for dimensions that are not specified with a single value on the left side, but for all operands on the right side.

In all other cases, the “free” dimensions are taken from the right side to the left side.

In the example above, there is no aggregation and duplication. Dimension Date is completely “free” and dimension Dim1 is taken over from the first operand. Join criterium is the Date value.

9. Virtual Variable Members

Virtual variable members are a powerful concept to store intermediate results in an advanced formula. Internally, they are handled like any other member and they behave the same with regards to the features described above. The only difference is that the data in its corresponding data slices disappears after execution of the data action. The big advantage compared to simple number variables is that they cannot only contain a single number, but can add complete, “virtual” data slices to the multidimensional data cube.

For the next example, assume that you use actuals from last year as starting point for your plan, but you want to smooth out the periodic fluctuations.

VARIABLEMEMBER #TOTAL OF [d/Date]

DATA([d/Measure] = "Quantity", [d/Date] = #TOTAL) = RESULTLOOKUP([d/Measure] = "Quantity")

DATA([d/Measure] = "Quantity") = RESULTLOOKUP([d/Measure] = "Quantity", [d/Date] = #TOTAL) / 2

This little example visualizes how variable members work. The first equation aggregates the quantity across the dates. The second equation is another example of the duplication in example 7. It calculates the average value and moves it to each of the two periods.

How to make this more dynamic and fact data instead of master data driven? Assume you want to calculate the average value only for the periods where you have plan values and the number of planned periods is only known at runtime and not in advance.

VARIABLEMEMBER #TOTAL OF [d/Date]

VARIABLEMEMBER #NUM OF [d/Date]

DATA([d/Measure] = "Quantity", [d/Date] = #TOTAL) = 

  RESULTLOOKUP([d/Measure] = "Quantity")

DATA([d/Measure] = "Quantity", [d/Date] = #NUM) = 

  RESULTLOOKUP([d/Measure] = "Quantity") * 0 + 1

DATA([d/Measure] = "Quantity") = (RESULTLOOKUP([d/Measure] = "Quantity") * 0 + 1) *  

  RESULTLOOKUP([d/Measure] = "Quantity", [d/Date] = #TOTAL) /

  RESULTLOOKUP([d/Measure] = "Quantity", [d/Date] = #NUM)

I’ve introduced another variable member #NUM which counts the number of planned periods. Note the trick to multiply the RESULTLOOKUP by 0 to make this fact data driven.

10. IF with RESULTLOOKUP

In the following two examples, we are using cell value conditions.

The first example is simple, straightforward, and intuitive:

IF RESULTLOOKUP([d/Measure] = "Quantity") > 10 THEN

  DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 2

ENDIF

However, the second example reveals some surprises and may be counterintuitive, but in the end the IF with RESULTLOOKUP works in a consistent way.

IF RESULTLOOKUP([d/Measure] = "Quantity", [d/Date] = "202101") = 10 THEN

  DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 2

ELSEIF RESULTLOOKUP([d/Measure] = "Quantity", [d/Date] = "202101") != 10 THEN

  DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 3

ELSE

  DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 4

ENDIF

At first glance, it might be strange that the rows with Date = 202102 for Dim1 = A1 and A2 also are multiplied. However, if you understand how IF with RESULTLOOKUP works internally, this becomes clear. You can interpret IF in the following way: At runtime, it is transformed into a condition on member Ids for the dimensions that are not part of the RESULTLOOKUP:

IF [d/Dim1] = "A1" THEN

  DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 2

ELSEIF [d/Dim1] = "A2" THEN

  DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 3

ELSEIF [d/Dim1] = ("A3", "#") THEN

  DATA([d/Measure] = "Amount") = RESULTLOOKUP([d/Measure] = "Amount") * 4

ENDIF

Next, let’s look at ELSE. One could think that the first two conditions “=10” and “!=10” cover all cases and ELSE is not needed. This is not correct as IF with RESULTLOOKUP is fact data driven and there is no fact data for Quantity for Dim1 = # and A3. ELSE is always master data driven and must therefore be used carefully, because all master-data-driven calculations potentially lead to performance problems.

Conclusion

The key take-aways of this exercise are:

Formulas can work master-data or fact-data-driven. The system switches from a fact-data-driven approach to a master-data-driven approach in the following cases: Assignment of constants, asymmetric formulas with duplication, and the ELSE clause.

Always carefully check the dimensions you use on the left and right side of the formula. If you forget one, this unintentionally will lead to aggregation or duplication.

Master-data-driven formulas bear the risk of performance problems because master data combinations are multiplied out.

Calculations are done before aggregation and not vice versa.

IF with RESULTLOOKUP may feel counterintuitive: The more restrictive the condition, i.e. the more dimensions are included in the RESULTLOOKUP, the more data is affected if the condition is fulfilled.

Like SQL, advanced formulas are a declarative programming language. They describe the “What” rather than the “How”. Other formula languages, e.g. FOX in SAP Business Planning and Consolidation, are mostly imperative languages.

For additional information, please see the following references:

Integrated Financial Planning for SAP S/4HANA with SAP Analytics Cloud

Visit your community topic page to learn more about SAP Analytics Cloud

Find all Q&A about SAP Analytics Cloud and feel free to ask your own question

Advanced Formulas – How they work

Introduction

Example model

1. Single Assignment of Constant

2. Bulk Assignment of Constants

3. Single Record Calculation

4. Bulk Calculation

5. Calculation Plus Constant

6. Aggregation

7. Duplication

8. Join

9. Virtual Variable Members

10. IF with RESULTLOOKUP

Conclusion

Get Your SAP HANA Idea Incubator Badge Today!

SCN Mission - SAP HANA Quiz Challenge is now retired

Share your #HANAStory and Win