Skip to Content

One of the quotes I like most I learned from Tom Kyte from Oracle.
It’s one from Artemus Ward:

It ain’t so much the things we don’t know that get us into trouble.
It’s the things you know that just ain’t so.”

This blog post is about one of those cases.

A customer complained about the performance of the index creation process in one of his process chains.

I had a look at the processes messages and found this:

[...]
28.03.2011 06:10:55 SQL: 28.03.2011 06:10:55 ALE_BWP_010
28.03.2011 06:10:55  CREATE BITMAP INDEX
28.03.2011 06:10:55 "SAPBWP"."/BIC/ECUBENAME~040" ON "/BIC/ECUBENAME"
28.03.2011 06:10:55 ( "KEY_SDIMENS1" ) PCTFREE 10 INITRANS
28.03.2011 06:10:55    20 MAXTRANS 255 STORAGE ( INITIAL 16 K NEXT
28.03.2011 06:10:55 81920 K MINEXTENTS 1 MAXEXTENTS UNLIMITED
28.03.2011 06:10:55 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
28.03.2011 06:10:55 BUFFER_POOL DEFAULT ) LOCAL PARALLEL TABLESPACE
28.03.2011 06:10:55 "PSAPBWP" NOLOGGING COMPUTE STATISTIC
 --- > 6 hours !!!!????
28.03.2011 12:47:30 SQL-END: 28.03.2011 12:47:30 06:36:35
[...]

There was one index that actually took 6 hours and 36 minutes to create, although the table scan was done in parallel on Oracle level.

The other indexes on the same table only took some minutes (except one other that took something around 30 minutes).
So what could be the reason for that?

The next thing I looked up was the number of distinct values for the different dimension KEY columns (since these are the only ones getting indexes on fact tables):

OBJECT                      DIST        NUM_ROWS   (LF) BLK
Table: /BIC/ECUBENAME                   783908640   14192040
 
  KEY_SDIMENS1            445965129              <<<<<<
    KEY_SDIMENS2                 1529
    KEY_SDIMENS3                 3223
    KEY_SDIMENS4                  795
    KEY_SDIMENS5                  439
    KEY_SDIMENS6                28578
    KEY_SDIMENS7              1126484
    KEY_SDIMENS8                83675
    KEY_SDIMENS9                53296
    KEY_SDIMENSA              2465897
    KEY_SDIMENSB                  170
    KEY_SDIMENSC            194594206             <<<<<<
    KEY_SDIMENSD                    1
    KEY_SDIMENSP                    1
    KEY_SDIMENST                  631
    KEY_SDIMENSU                   43

And this was already the hint required to solve this mystery.

Bitmap indexes are extremely efficient during lookup and when used to combine selection criteria.
What they don’t do very good is to handle large numbers of distinct values (always compared to the total number of rows in the table of course).
In this case the dimension for which the index creation took six hours (column KEY_SDIMENS1) had more than half as much distinct values than the total number of rows in the table.

If you’re used to data modelling in BW you know that it’s recommended not to have that many distinct values in the dimensions and if it cannot be avoided then the dimension should be marked as line item dimension.
This can be done in the Infocube modelling in transaction RSA1 by right-clicking on the dimension and opening the properties dialogue.

Flagging a dimension as line item dimension changes the data modell by that it removes the intermediate SID table that is normally put between fact tables and the actual dimension tables (for flexibility reasons) and joins the fact table directly with the dimension table.

It changes this :

[DIMENSION-TAB1] >-----<  (SID-TAB1) >---< [[FACT TABLE]] >---<  (SID-TAB2) >-----< [DIMENSION-TAB2]

to this:

 [DIMENSION-TAB1] >-----< [[FACT TABLE]]  >---<  (SID-TAB2) >-----< [DIMENSION-TAB2]

Besides this change in the data modell the index for the line item dimension should not be a BITMAP index anymore, but a B*TREE index instead. Of course the SAP BW developers know about this and provided a way to avoid the creation of BITMAP indexes on such dimension key columns.

The fallacy the customer and I became victims of was to believe that flagging a dimension as line item automatically includes the setting for the B*Tree index. After all this totally makes sense, doesn’t it?

Unfortunately, but correctly, the BW developers separated these two functions and provided two flags for infocube dimensions that can be set independently:

[ x ] Line Item
and
[ x ] High Cardinality

Only setting the High Cardinality flag leads to the creation of B*Tree indexes!
Fortunately, this can be done any time, even when the infocube is already in use and filled with data.

RSA1 Dimension Properties dialogue

This is of course documented (here) but as often, things mix up in the memory and we end up believing the wrong stuff.

Before I forget to mention it: as far as I know the High Cardinality flag really only does something on Oracle based BW systems as the Bitmap Indexes are exclusively available on Oracle.
Maybe there are similar features on other platforms, but I’m not aware of them, so I leave them out for now (and avoid believing the wrong stuff once more…).

So, make sure to check your line item dimensions to avoid long indexing runs.

Best regards,

Lars

To report this post you need to login first.

12 Comments

You must be Logged on to comment or reply to a post.

  1. Stefan Koehler
    Hello Lars,
    wooohoo … right there the next blog written by you … it seems like you are a getting a blog monster 😛

    Do you know what would be awesome?
    Such an option for real time info cubes to avoid “b-tree to bitmap conversion” and get the use of star transformations with real time info cubes.

    As far as i know – SAP has not implemented this possibilty, because of the “bitmap index issues with DMLs” with older oracle versions .. but since 10g that should be no problem anymore .. and 10g is out now for a loooong time.

    Also parallel DML on such cubes should be no issue.

    Keep it up!

    Regards
    Stefan

    http://richardfoote.wordpress.com/2010/05/25/bitmap-index-degradation-after-dml-prior-to-10g-beauty-and-the-beast/
    http://richardfoote.wordpress.com/2010/06/01/bitmap-index-degradation-since-10g-fix-you/

    (0) 
    1. Lars Breddemann Post author
      Hi Stefan,

      thanks again for the applause!
      That way I’m surely going to continue writing blog posts … 😉

      Concerning the B*tree-Bitmap-Plans the point they’re used for is to generally provide a sensible option for the Optimizer to handle OR-conditions (which we have in BW).

      So they are in fact not _that_ bad in some cases…

      The fact that index maintenance had been improoved in Oracle >=10g unfortunately didn’t resolve the more serious issue: rowid-range-based locking. With that you can generate self-deadlocks and therefore this is not really something we want to see in BW.

      But of course you’re right: transactional infocubes are little bastards performance wise, especially when the optimizer is really sensitive for ‘outlier! data (the data that is not yet covered by the statistics).

      We’ll have to see what can be done with that in the future, I guess…

      And sure: we’re obviously reading the same blogs 😉

      Cheers,
      Lars

      (0) 
      1. Stefan Koehler
        Hello Lars,
        as you know i am not very fimiliar with the SAP BW internals.

        You are right that existing bitmap indexes and high parallel DMLs can cause such locking situations, but does this really happen on real time info cubes?

        If the BW end user works with this kind of cubes – does it really execute DMLs in parallel through SAP OLAP engine?  (for example by loading/filling reqeusts, etc.)

        If not bitmap indexes should be no problem on that cubes.

        As already mentioned – the SAP BW OLAP engine is some kind of alien stuff for me ..

        Regards
        Stefan

        (0) 
        1. Lars Breddemann Post author
          Hey Stefan,

          unfortunately I’m a bit short on time today, but briefly:

          – we’re not performing what Oracle declares as Parallel DML anywhere (using parallel server). What we in fact do is to run serial DML simultanously.

          – For transactional infocubes it can happen that multiple processes enter data at the same time into the open request (depending on the application using this data structure).

          And yes – many design considerations for the OLAP <-> DB mapping cannot be understood from the angle of the database layer.
          That still is something I have to work to get my mind wrapped around every day 😉

          cheers,
          Lars

          (0) 
          1. Lars Breddemann Post author
            Hi Stefan,

            I’ve to row back on the parallel DML statement!
            In fact, there are some bits of code (in some special mass data changing reports) that actually enable parallel dml on session level.

            And again I learned something new… 🙂

            (0) 
            1. Raj majeti
              Hi Stephen,

              Can you please let me know  how you found the distinct entries in the  dimension tables? Can you please guide me for one more my early  watch report  suggestion.

              Regards,
              Rajesh

              (0) 
        2. Jens Gleichmann
          Hello Lars,

          I have got some different information about the High Cardinality flag. Some guys from SAP support told us to set it in some special cases other said it is maidness to set this option in cause of the bad performance.

          However in every EWA report you can read the following statement:
          “Recommendation: Remove all “High-Cardinality” indicators from the InfoCube definition. In rare cases, the time used to recreate the indexes can increase. In this case, set the indicator again.
          Note: Setting the “Line Item” indicator is useful for reporting and upload performance, but it can have a negative influence on the performance of input help functions.”

          As you posted if you set the ‘line item dimension’ flag you should also set the ‘High-Cardinality’ flag.
          We have still some dimensions which are over this limitation of 20% of the fact table. (little hint for you: use the stats in ST14 or report SAP_INFOCUBE_DESIGNS to identify cubes which are over this threshold of 20%)

          afaik the high cardinality flag shouldn´t be set because it could lead to bad query performance. So if you want to activate it, you should thoroughly test it with all queries.

          So my question is what is currently supported and are there any known issues with this flags?

          @Stefan: You are right, sometimes it sound like some alien stuff, but you know -> the truth is (somewhere) out there 🙂

          Regards,
          Jens

          PS: Another hint: To see which cube has one of this flag you can search for it in table RSDDIME 😉

          (0) 
          1. Lars Breddemann Post author
            Hi Jens,

            of course there is good reason to try to avoid the usage of that flag.
            One of them is the fact that without a BITMAP index in-place the STAR-Transformation-Optimization step cannot be performed with the selection criteria on the HighCard-Dimension.

            But it is still possible with all other dimensions!

            The question YOU (or the BW Admin/Designer) now has to answer is: do I use very specific restrictions on my line-item-dimension during reporting and do I need this dimension for the STAR TRANSFORMATION?

            If you answer this question with YES! Then you likely don’t have much choice but to accept a super long index-rebuild time.

            As with any recommendations, the EWA-report tends to be too universal in its statements.

            Supported are of course all combination of flag settings. But ‘supported’ and ‘recommended’ are two different things.

            Personally, I see it as a tool that need to be understood in its consequences to correctly use it. Just like any other tool.

            Rules-of-Thumb like the 20% are really that: rules-of-thumb.
            They don’t take away the need to think through the problem for you.

            BTW: good to see that there’s also some knowledge about tables like RSDDIME or report SAP_INFOCUBE_DESIGNS outside SAP 😉

            Cheers,
            Lars

            (0) 
  2. Subhasish Haldar

    Hi Lars,

    Are you really sure that selecting Line Item Dimension option removes the SID table ?

    1) SID table actually resides between DIM table and master data table.

    2) selecting Line Item Dimension removes the Dimension table and puts the SID of the characteristic into the fact table.

    Please clarify..

    Regards,

    Subbu.

    (0) 
    1. Lars Breddemann Post author

      Hello Subbu,

      yes, I’m sure about this, because:

      1. I tried it out AND
      2. it’s documented like this

      This whole feature is just about this.
      Why don’t you just go on and try it out?

      Please check yourself 😉

      regards, Lars

      (0) 
      1. Ethan Jewett

        Hi Lars,

        I’m pretty sure Subbu is right.

        I think you just mixed up the SID table and Dimension table in the diagram in the blog, at least in terms of BW terminology. The Fact table includes Dimension IDs, and the dimension table contains SIDs (which are part of an InfoObject). So the diagram should be more like:

        SID tab1 <—> Dim tab1 <—> Fact table <—> Dim tab2 <—> SID tab2

        When you define a dimension as a line item dimension it becomes:

        SID tab1 <—> Fact table <—> Dim tab2 <—> SID tab2

        The fact table includes SIDs directly in the column that would normally contain DIM IDs. Actually the Dim tab1 becomes a view in the data dictionary, I believe. Normally a dimension table is a transparent table. The dimension table view has the same two fields as a normal dimension table, but with both of those fields mapped to the SID field in the SID table. There is no longer really a dimension table, at least not at the data dictionary level that one can see in SE11.

        Cheers,

        Ethan

        (0) 
        1. Lars Breddemann Post author

          Ahhh… ok, this was about the naming of the tables…*gosh*

          Ethan, Subbu: you’re right!

          It’s the SID-table are now directly to the fact table – not the dimension table. My fault on nomenclature here. 🙁

          Thanks for pointing out.

          Concerning the “conversion” of the table to a view you’re right – that’s what we do in order to keep the structure logically the same for all dimensions.

          BTW: it’s also the case when aggregates are build and only one attribute of a dimension is placed into the aggregate – it’s turned into a line-item dimension including the view for the aggregate cube.

          best regards,

          Lars

          (0) 

Leave a Reply