Skip to Content

Understanding Query Performance in NW BI and BIA

Understanding Query Performance in NW BI and BIA

Query performance has always been a hot topic but with the advent of the BI accelerator (BIA) it has become scalding hot. Customers, consultants and colleagues report performance improvements well beyond factor 100 but also of factors like 4.7 and even 1. People are sometimes puzzled by such numbers: why is it so different? This blog attempts to make you understand what’s going on behind the scenes.

First of all, there are three rules of thumb that apply when looking at query performance with BIA:

  • query processing time basically comprises three components: data access incl. aggregation, calculations (OLAP), client rendering;
    BIA addresses the data access time (albeit in special situations also the calculations time)
  • slow queries (when running w/o BIA) can significantly benefit by using BIA;
    fast queries won’t benefit that much, i.e. if the query is already fast w/o BIA then don’t expect factor 100 – but if it’s already fast then who cares?
  • beside better performance you can expect stable performance;
    this is a consequence of the simple processing paradigms inside BIA that (a) avoid the volatility of a DB optimizer and that (b) guarantee linear scalability: more blades or faster CPUs directly translate into better performance.

Now, let’s dive into more details and find out where BIA helps by looking at a query example. In figure 1 below, a query can be seen that has been used in some demos. Sales figures are shown by countries and quarters. In figure 2, there is a little box showing some statistics behind the processing of that query. Understanding those numbers will enable you to understand query processing performance, not only for BIA but also in case of an RDBMS-based approach:

  1. total rows (TOTAL): this is the number of facts in the underlying infocube. So this is basically all the data that is available to answer the query. In this example, there are just over 1 bio records in the fact table. This is the starting point for the query.
  2. selected rows (DBSEL): this is the number of records that are actually relevant to the query. On the screenshot you see that some filters have been set on Product Group and Cal. Year/Quarter. In other words: out of the 1 bio records only 142,384,965 records are actually relevant. So the first thing, the BIA or an RDBMS has to do is to filter the 1 bio records and to identify those 142 mio records for further processing. Both, BIA and RDBMS, typically use indexes for that purpose. Both are probably equally capable in doing this. However, identifying those records is one thing, processing them is the other. Here, column-wise processing, compression and parallelization inside BIA are a clear differentiator. See Comparing the BI Accelerator (f.k.a. HPA) to Traditional RDBMS Technology.
  3. transferred rows (DBTRANS):  this is the number of result rows when you summarize (aggregate) the records identified under 2. In the example, records are aggregated by C-store (see lines in result grid) and Cal. Year/Quarter (see columns in result grid). The statistics reveal that there are 2500 combinations of C-stores and Cal. Year/Quarter. As a little calculation on the side: there are 5 quarters; so there must be 2500/5 = 500 C-stores in the hierarchy that is displayed in the lines of the result grid. Here, the BIA does tremendeously well, even if DBSEL is huge.
  4. cells in result grid (CELLS): there are 35 cells in the example – see the red box labeled with 4 in the figure – all refering to a (basic) key figure; in this case, there is only one, namely No. of Sales. This means that the under 3. aggregated rows (coming from the BIA or the RDBMS) have been further aggregated to 35 key figure values. In this example, the reason for this is that the hierarchy logic imposed through the display hierarchy on C-store (i.e. a C-store has been grouped by its respective country). The hierarchy logic is processed outside the aggregation layer (represented by the BIA or the RDBMS) and inside the calculation layer. The latter is implemented on the application server.
    If the result is visualized, i.e. digested by an end-user and not subsequently processed by a machine, then the number of cells must be small as nobody – other than Rain Man – can derive business decisions from a huge and unmanageable result set. This is an important fact!

Figure 1: Example of a query and its result grid.

Figure 2: Statistics for the query of figure 1.

No. Description Abbreviation Value
1. total rows TOTAL 1,002,752,234
2. selected rows DBSEL 142,384,965
3. transferred rows DBTRANS 2,500
4. cells CELLS 35

In the simplest case, the rows calculated under 3. are displayed one-to-one in the result grid under 4., thus 2500 rows translate into 2500 cells (per requested key figure). In practice, queries comprise some additional logic, like the display hierarchy in this instance. Other options are exception aggregations, internal business revenue elimination, formula calculation before aggregation, zero elimination, certain conditions like top-N or bottom-N, …

Overall the following rules apply:

Performance Situation Example / Rule of Thumb* What to do?
DBSEL is very high eg. DBSEL > 1 mio define aggregates (materialized views, indexed views, summary tables), especially using filters [no problem]
DBTRANS << DBSEL eg. DBSEL / DBTRANS > 100 define aggregates [no problem]
DBTRANS is very high eg. DBTRANS > 50000 In both cases the following applies:

  • If CELLS is significantly lower than DBTRANS then check the additional logic that requires to read much more details than are displayed. In some cases, there are opportunities for this logic to be avoided, bypassed or shifted – e.g. from the query processing to the loading stage. In practice, we sometimes see query definitions that comprise a number of (orthogonal, independent) pieces of logic that could be distributed if there was a query for each piece that then sit side-by-side in a dashboard.
    Known cases for which this situation currently cannot be avoided are display hierarchies, exception aggregations, internal business revenue elimination, formula calculation before aggregation, zero elimination, certain conditions like top-N or bottom-N.
  • If CELLS is equally high as DBTRANS then it is likely that either the manageability constraint has been violated (thus the query is likely to not provide you with the benefit that you expect) or the context is a different one, such as extracting mass data that is to be fed into a subsequent, machine-driven process [+]. In the latter case, it makes sense to switch to an RDBMS-based processing.
  • RDBMS support cursors and thus can transfer mass data in portions. While this does not translate into an immediate performance benefit it, at least, allows the data to be transferred in a resource-efficient manner. This is why it makes sense to switch to a RDBMS-based processing mode in an extraction scenario (see [+] above).
  • The BIA processing time in such a situation is dominated by the communication between BIA and NW BI servers which should run over a 1 Gbit connection. As an example: 1 mio rows of 1 kB each will then need 8 sec.
  • Furthermore, data is currently transferred in one portion; there is no cursor-like mechanism as in the RDBMS case.

* The values given here are very rough indicators on a row-level. Frequently, they also depend on the size of the row or the underlying technical infrastructure (hardware resources, OS and DB platforms etc.). So consider them with care.


The table above indicates where the BIA is strong but also describes situations that should be avoided. Infocube modelling and query design are still relevant albeit to a fairly reduced extent. In that sense, it’s like with any other feature: it can be sensibly used and create an outrageous benefit. But if pushed to the boundaries then things can become odd – my favourite example in that respect is an empty database table with a huge number of partitions that cause queries to run for minutes even though the table is empty. Nobody will question the benefits of table partitioning just because of that.
Anyway. I hope that, with this blog, you have gained some coloured insights into the components that determine query processing performance. Use those “colours” when confronted with those who try to make you think that it’s all around black and white.

You must be Logged on to comment or reply to a post.
  • If you have the time, I would personally very much appreciate a follow-up on exactly why the “known” problem-situations are in fact problems.  (I can guess at some of the answers but I’m not at all sure I’m correct.)

    If you do have the time, your further explanations of these problem-situations would give us a lot of practical insight into the inner workings that your post is focussed on.  (Unless of course, this would give away proprietary architectural details that you’re not at liberty to divulge.)

    Again, thanks very much for a wonderful post.  I hope that SDN sees a lot more like them from you and others in the future.


    • Hi David,

      thanks for the compliments.

      Let’s simply take the example of top-n: in order to determine your top-10 customers you have to look at each and every customer that you have (as any of them can theoretically qualify for the top-10). A retail or utilities company typically has millions of customers. So you need (a) to aggregate the facts to an aggregated row for every customer, (b) sort those rows along the key figure (that qualifies the top-n ranking) and (c) discard all but the first 10 of them. In the end, you see a list of 10 entries but, internally, the engine might have gone through a list of millions. This strikes many users (“why does it take so long to show 10 rows?”).

      In NW 2004s, (a) is processed on the BIA or an RDBMS, (b) and (c) are processed in the application server where NW BI’s calculation layer sits. The latter implies to transfer millions of rows from the BIA or RDBMS to the application server. There would be a lot to say why the situation is like this. But that would blow the scope of this comment.

      There are plans to tackle situations as with top-n, exception aggregations etc., and push them into BIA. In fact, the current dev. version of NW BI and BIA can calculate top-n queries inside the BIA engine. The performance is incredible. However, it has been a tedious effort to get there: OLAP processing imposes a certain order in the processing logic (just consider shuffling (a), (b), (c) above – that wouldn’t work). This is a major difference with SQL (or relational algebra in general). There, joins, projections, selections, unions can – to a certain extent – be processing in several orders without affecting the result.

      Actually, when I find some more time, I might pick one or the other situation and will dive into some more details in a follow-up blog.



      • Thomas, it’d be so great if you could explain the hierarchy processing case as well. Couldn’t the aggregation be performed by the aggregation layer rather than the calculation layer? Isn’t possible to have materialized aggregates for at least some nodes in the hierarchy?

        Thanks, Davide

          • Hi,

            in my understanding, the aggregation is always performed by the calculation layer (i.e. the OLAP processor) when the read mode is set to ‘A’ (query to read all data at once).

            On the other hand, when it’s set to ‘X’ the aggregation could be done by the aggregation layer: “When accessing the database, the best aggregate table is used and, if possible, data is aggregated in the database”.

            Anyway, it’s not thoughrolly clear to me when this can actually happen, as “presentation hierarchies are always imported completely on a leaf level” with this read mode.

            Thanks, Davide

      • Hi Thomas,

        Thank you for this very interesting and raising hopes blog!
        Do you know when this new functionality (top-n,except. aggr) in BIA will be available?
        We have simple queries for counting customers which are even slower executed over BIA (200 sec.) then over a InfoCube directly (45 sec).

  • Thanks for this very clear explanation of such a technical subject!

    (and i will definitely re-use your ‘Rain Man’ joke the next time a customer wants to use his BW-reports as an “access-database” :-))

  • Hello Thomas,
    I believe the reason you get different performance improvement factors like sometimes 100, sometimes 40, sometime 1, and even sometimes -1 when comparing response times of the BIA to that of the database is namely due to the fact that the query response times of the database are very variable. It depends on whether a good aggregate exists or whether the optimizer finds a suitable index for that query. With a perfect aggregate or index the DB response time can beat that of the BIA. But with no suitable aggregates and no suitable indexes, and a lousy access path, the response time of a query could take … a very long time. I work with the SAP DB2 porting team and have done some researches on this variation. I think the best way to explain it it through a graph of both response times. I have such a graph. For the BIA the response time is a smooth and almost a straight line. For the DB it is a very rugged line with a lot of ups and downs which only reflects what I said before.
  • Hello Thomas ,
    Thanks for the article . This igves basic thumb rule for the analysis and approach to be followed while deciding for Relational or Trex aggregates .
  • We are considering to upgrade BI3.5 to BI7.0 due to the fact of BIA having with great performance improvement. Now we want to know whether BIA always work for data access if using 3rd part report tools under MDX standard or others.
    • Yes, the MDX-based interfaces in SAP BI are supported by BIA. It’s basically similar to traditional aggregates.



  • Hi Thomas,

    Thanks for the explanation on the blog. We have a current issue with the BIA crashing when a specific cube starts being hit for reporting. We couldn’t find information elsewhere, so maybe you can help. Does it make any difference to BIA whether we have a cube with 1 KF and 1,000M records, or if we break these KF into 100, so the cube would have 10M records instead, and we create a CKF to summate all the 100 individual KF’s?
    From a DB point of view, the size of the cube is still the same, but what about BIA? Does it work better reading one huge column, or is it better at all to break this huge column in, let’s say, 100 chunks?

    Thanks in advance!

    • Hi Karina,

      it is difficult to say which model suits BIA best. If there are 100 key figures and typical queries only request a hand-full of them then BIA can leverage such an advantage much better than an RDBMS (which is row-oriented and would read all 100 key figures from disk to only throw away 95 of them once data reaches the processor). On the other hand, if there is only 1 key figure, BIA might be able compress better. Overall, it is wild guessing …

      What is clear is that an RDBMS would prefer the “100 key figure approach”. BIA doesn’t mind.

      Hope this helps a little bit.


  • Hi Thomas,

    I found an exact answer to my question in your blog. Happy for that. Currently, we are analysing if high ratio of DBTRANS and DBSEL has a negative impact in query performace. But I am claming that since we are using BWA we should not think much about the high ratio. I will be refering to your blog. And I will need a document (maybe published by SAP) to prove my claim. I will appreciate if you help me to find this.

    Thanks in advance,