
Applies to:  BI 7.0

Introduction

After a BW implementation goes live, data from the source systems keeps flowing into the BW system on a daily basis, and the data volume grows incrementally day by day. Over time this fills up all the InfoCubes we have built in our Data Mart Layer, and then the question of maintenance comes into the picture. That is exactly what I am going to demonstrate here: the steps and strategies to handle it.

Disadvantages with the current Infocubes 😥

  • Large data volumes in the cubes, which lead to higher administration effort
  • Dropping and creating indexes takes longer, which can delay process chains
  • Aggregate roll-up takes longer
  • Attribute change runs take substantial time during master data loads
  • Reporting performance becomes a big question mark
  • Aggregates may need to be recreated every now and then to keep reporting performance acceptable
  • The initial fill of new aggregates takes substantial time
  • Ad-hoc querying at year end becomes tough when fetching data from the bigger cubes/aggregates

I am sharing my current Data Mart Layer, which sits under a MultiProvider. The go-live happened in 2008, which tells you that data from 2008 to date is available in our InfoCubes. Our implementation partner created year-wise cubes as shown below.

[Image: Existing year-wise cubes under the MultiProvider]

The above layer has served us well so far, because the data volume could be accommodated by the available cubes. However, the data volume has grown substantially from 2012 onwards: the (2012 and 2013) cube alone holds roughly 13 crore (130 million) records, and the previous years' data is also a sizeable figure. We have other smaller cubes in the MultiProvider as well. I hope you can see the bottleneck situation now.

I have proposed a solution to overcome all the disadvantages mentioned above. Archiving is not possible in my case, as it is against our management policy, although we are trying our best to convince management and get the approvals. So I have made a proposal to organize the data for all years according to the extent of usage. You can see "Very less usage" and "Mostly used in trend analysis" in the above figure for your understanding.

Strategy 1: I have proposed to club the older years' cubes together, keeping the extent of usage in mind. Usually we do trend analysis considering the last year's and current year's figures; some companies may go with the last 3 years, depending on the client's requirement.

My proposed Data Mart Layer

[Image: Proposed year-wise cubes under the MultiProvider]

I have clubbed 2008 to 2011 into a single InfoCube, as that data is hardly used for business analysis. If you observed closely, I have also split the (2012 & 2013) cube into a separate cube for each year because, as I stated at the beginning, I have 13 crore records for both years in a single cube. It has been observed that reports run slower when users do trend analysis as we approach the year end.

Eventually we will have to club 2012, 2013 and (maybe) 2014 into a single InfoCube when the importance of that period becomes very low. The whole idea is to keep merging older cubes as time passes, based on the importance and usage of the data by business analysts.
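To make the clubbing rule concrete, here is a small illustrative sketch (plain Python, not SAP code; the cube names and the cut-off years are hypothetical) of how records would be routed to the proposed year-wise layout based on 0CALYEAR:

```python
# Illustrative only (not SAP code): route a record to a logical partition
# based on 0CALYEAR, mirroring the proposed year-wise cube layout.
# Cube names and year boundaries below are hypothetical.

def target_cube(calyear: int, current_year: int = 2014) -> str:
    """Pick the target InfoCube for a record by calendar year."""
    if calyear <= 2011:
        return "CUBE_2008_2011"     # rarely used history, clubbed into one cube
    elif calyear < current_year:
        return "CUBE_%d" % calyear  # one cube per recent year (trend analysis)
    return "CUBE_CURRENT"           # current year's cube

for year in (2009, 2012, 2013, 2014):
    print(year, "->", target_cube(year))
# 2009 -> CUBE_2008_2011
# 2012 -> CUBE_2012
# 2013 -> CUBE_2013
# 2014 -> CUBE_CURRENT
```

As the years roll on, only the boundaries in this rule change; the merged "history" cube simply absorbs more years.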

Advantages of this Strategy 🙂

  • Lower administration effort by focusing on the current cubes
  • Faster dropping and rebuilding of indexes during data loads
  • Trend analysis can be done quickly
  • All the other disadvantages mentioned at the beginning can be overcome by this approach

Most Important advantage


We are all aware that all images (before and after) of a particular document come into the DSO (in overwrite mode), and the delta images are then updated from the change log to the InfoCubes in addition mode. We clean the change log regularly as a housekeeping activity, but we never cleanse the InfoCubes, which have accumulated all the images of a particular document. You can see a lot of unwanted records that are nullified by the after images. That is why my cubes hold 13 to 15 crore records.

After taking up this whole exercise, your new cubes will be loaded with the latest status of all the historical documents from the EDW layer (the central DSO). This reduces your cube sizes dramatically. The cube sizes are now just 2-3 crore records, but all the reports show the correct KPIs, just as before 🙂
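As a rough illustration of why this reload shrinks the cubes so much, here is a hedged Python sketch (document IDs and key-figure values are made up) showing how before/after image pairs from the change log pile up in an additively loaded cube, while a reload from the DSO keeps only the latest status per document:

```python
# Illustrative only (not SAP code): one document edited twice.
# Each correction sends a before image (negated old value) and an after image.
changelog = [
    ("DOC1", +100),                  # initial after image
    ("DOC1", -100), ("DOC1", +120),  # 1st change: before + after image
    ("DOC1", -120), ("DOC1", +90),   # 2nd change: before + after image
]

# Cube loaded additively from the change log: 5 rows, net value 90.
cube_rows = len(changelog)
cube_net = sum(value for _, value in changelog)

# Reload from the DSO: overwrite mode keeps only the latest status, so the
# image pairs collapse to one row per document with the same net value.
dso = {}
for doc, value in changelog:
    dso[doc] = dso.get(doc, 0) + value
reload_rows = len(dso)

print(cube_rows, reload_rows, cube_net, dso["DOC1"])  # 5 1 90 90
```

Same KPI, a fraction of the records: that is the whole point of reloading from the EDW layer instead of copying the old cubes.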

Steps to carry out this entire activity without disturbing the existing setup

  1. Draw flow charts like the ones above on paper, so that you have a clear idea of what you are going to do in your system
  2. Take any one-year cube and make 4 copies (structures only) as per the second image
  3. Do not forget to partition these new cubes by 0CALMONTH according to the respective years, because the copied cubes will carry the old partitions based on the first image. This is a key step in organizing your data systematically
  4. Note: We have an EDW layer (a central DSO, which is our total warehouse) underneath the Data Mart Layer, so I will be loading data into these new cubes from the central DSO
  5. Copy the transformation between the DSO and an existing cube (any year, because the transformation is uniform across all year-wise cubes) to connect the DSO to the new cubes
  6. Create full-load DTPs for all the new cubes and load each cube in chunks
  7. Copy the existing MultiProvider under a new technical name and extend the description with "Back up"
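For the partitioning step, the 0CALMONTH values for each new cube are easy to enumerate. A small sketch (plain Python, purely illustrative of the YYYYMM values you would use when defining the partitions for, say, the new 2012 cube):

```python
# Illustrative only: enumerate the 0CALMONTH (YYYYMM) values that bound the
# monthly partitions of one year-wise cube, e.g. the new 2012 cube.

def calmonth_partitions(year: int) -> list:
    """Return the twelve 0CALMONTH values for one calendar year."""
    return ["%d%02d" % (year, m) for m in range(1, 13)]

print(calmonth_partitions(2012)[:3])   # ['201201', '201202', '201203']
print(len(calmonth_partitions(2012)))  # 12
```

For the clubbed 2008-2011 cube the same idea applies across four years, so you would partition from 200801 to 201112.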

                    

Strategy 2: The idea behind copying is that I will detach the existing year-wise cubes from the original MultiProvider and add the new year-wise cubes to it (making sure the assignments are done for the new cubes). This way we need not worry about our BEx queries, WAD or the Portal, because all the queries based on the existing MultiProvider will continue to run as usual. The only difference is that the data will be retrieved from the new year-wise cubes. This strategy completely avoids the risk of touching the front end.

  8.   Regenerate all your queries in RSRT to refresh them at the database level. This ensures all your queries fetch data from the new cubes

  9.   Make sure to change the 0INFOPROV restriction to the new Cubes in your queries, if any

10.   Modify all your Process Chains with the new Cube variants

11.   Verify the data thoroughly in your Portal reports by comparing with the data display of your Back up MultiProvider, using the same query restrictions/filters etc.

12.   You can also verify the data by copying queries from your main MultiProvider to the Back up MultiProvider using RSZC

13.   Observe your Portal reports for 2-3 days, comparing with the Back up MultiProvider. If everything is OK, you can drop all the old cubes

Conclusion

This is a pure production support/maintenance activity, which will have to be taken up periodically over time. As archiving is not in my scope, I proposed the above ideas and implemented them.

Hope this will help all our BI developers. Thanks for reading my blog.




27 Comments


  1. Raman Korrapati

    Hi Suman,

    It's a very good approach when we have more than 5 years of data in the targets, and a nice graphical representation.

    Below BW 7.3 your approach is a very good one; from BW 7.3 onwards, SPO is a good option, as it was newly introduced in 7.3.

    Thanks

    1. Suman Chakravarthy K Post author

      Yeah, it is purely applicable up to BI 7.0. But this has to be done on a continuous basis after a certain period of time; otherwise, a lot of performance and bottleneck situations will arise.

      Thanks for the comments and rating..

  2. Harish Kumar

    Hi Suman,

    This can be used if the system is in BW 7.0.

    In 7.3 and higher, SPOs are the best and reduce manual activity to a great extent. I mean, we don't need to create copy cubes or create transformation rules and DTPs.

    These activities are totally taken care of automatically when we leverage SPOs.

    Br, Harish

    1. Suman Chakravarthy K Post author

      Hi Harish,

      Thanks for the info. I am aware that SPO can take care of this in BW 7.3. I clearly mentioned at the beginning that this applies to BI 7.0. I think there are a lot of companies still running on BI 7.0; I hope this will help them at least. What do you say?

      Regards,

      Suman

      1. Harish Kumar

        It sure will help and your contribution to SCN is indispensable.

        Thanks for keeping the forum lively.

        I’m in BW 7.4 and we are planning to implement SPOs. That’s why I mentioned SPOs in the comment above.

        Br, Harish

        1. Suman Chakravarthy K Post author

          I am really enlightened by your comments 🙂 🙂 🙂 . This gives a super kick to motivate myself and others too. I am a huge fan of SCN 🙂 .

          No issues on your comment on SPO. It all depends on how you convey your opinion. I liked your intervention on SPO.

  3. Arpit Khandelwal

    Hi Suman,

    Nice post with summarization of approaches. We were planning to use the similar kind of approach initially but then we decided to opt for Archiving.

    Thanks for your contribution 🙂 .

    Br,

    Arpit

  4. Raki R

    Hi all. Everyone worries a lot about reporting performance, especially with MultiProviders. I have just seen this blog post; it's simply nice. Suman analyzed this a lot and proposed an efficient model. We currently have this problem in our project, so I am going to propose this model soon. Thanks a lot to Suman for making and proposing such a wonderful flow.

    1. Suman Chakravarthy K Post author

      Hi Ravi,

      This is an amazing comment 🙂 🙂 . It gives me a super kick to continue sharing my knowledge here. Performance is the bottleneck in any project; the better you handle it, the more popularity you will get from the user community. You can propose it and get back to me with your team's feedback. Thanks for all the compliments 🙂

      Regards,

      Suman

  5. Chanda Janardhan sudhir Kumar

    Hi Suman,

    Excellent document .

    I just proposed the same to handle huge data in our system.

    My idea was to keep the original cube as it is, holding only the current FY data, with 3 other cubes holding the last FY, the FY before that, and the oldest low-usage data.

    We can have a process chain to move the data into the right cube at every new FY.

    And the MultiProvider concept as you mentioned. By doing this I am not changing the current logic & design; I am just moving processed data into other cubes.

    Well if gets approved, will seek your help to improve on the design to make it more efficient.

    This document will really help many who are stuck in the box. It will really change the approach to performance tuning for life.

    Thanks you for the great effort.

    Best regards,

    sudhir.

    1. Suman Chakravarthy K Post author

      Hi Janardhan,

      I have no words to say now. I did not expect this much response for this blog. I am extremely happy today. You can very well propose my idea and see the magic in performance improvement. 🙂 🙂 🙂

      Regards,

      Suman

      1. Chanda Janardhan sudhir Kumar

        Hi Suman,

        Can I check with you whether we can do the same for the inventory cube, 0IC_C03?

        Also, if a cube is compressed, how do we move the data?

        Please clarify, or kindly share your personal mail ID; I want to share my design and ask how to improve it.

        Thanks & Regards,

        sudhir.

        1. Suman Chakravarthy K Post author

          Hi Janardhan,

          I hope you observed in my blog that we reload data into the new cubes from the EDW layer (DSO), not from the old cubes into the new cubes. But inventory cubes are tricky; loading and modeling are different for inventory. Please analyze thoroughly with your team, keeping my idea in mind.

          Regards,

          Suman

  6. Jino Jose

    Hi Suman,

    Nice document.  I liked the approach

    I have one question on the data: your cube size was 15 crores because of the nullified data and became 2-3 crores when loaded from the DSO. Is there any reason for not considering the option of compressing the cube with zero elimination, which might have brought the cube data size down? I know that with compression you won't be able to delete data based on a request, but this is old data, so I assume we won't come across a scenario where we need to delete it.

    Thanks,

    Jino

    1. Suman Chakravarthy K Post author

      Hi Jino,

      I am aware of zero elimination, but I heard it is not safe to do during compression, so I never undertook it. I do compression, but without zero elimination. I will definitely think about it henceforth. Anyway, all my past data has been taken care of and merged into a few cubes with my approach; I will make a strategy for the future cubes now. Thanks for pointing it out.

      Regards,

      Suman

  7. John Lang

    Hi Suman,

    Use of the “MultiProvider Hint” feature is not well known but is extremely useful in your existing scenario where there is 1 year in each cube; specifically with regards to the 2008/9/10/11 cubes in the above design.

    By adding the MultiProvider Hint configuration you can leverage the query execution navigation state to determine if a cube should be scanned and returned as part of the result DataSet. This happens auto-magically once you have configured it.

    The performance optimisation is that the dimension table is used to determine if there is any matching data in that cube that matches the current query navigation state being executed.

    Given that the time dimension is always relatively small when compared to the other dimensions this provides a reasonably consistent and quick yes/no answer to the OLAP engine as to whether or not it should bother scanning the cubes fact tables (F, E & L).

    Given that most queries involve a time characteristic … try configuring the MultiProvider Hint to use one of the time characteristics that are used in the most executed queries.

    For Example: 0FISCYEAR, 0FISCPER, 0CALYEAR, 0CALMONTH, 0CALWEEK and 0CALDAY.

    This older document is a very good read and Google has way more information on using the MultiProvider Hint feature: http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/b03b7f4c-c270-2910-a8b8-91e0f6d77096

    This will also save you the effort of having to move the data round (and the slight risk associated with even touching it in the first place).

    The above suggestion is only focusing on the cubes for 2008 to 2011 part of the above discussion.

    Hope this helps.

    Kind regards,

    John.

    1. Suman Chakravarthy K Post author

      HI John,

      Thanks for taking the time to write such a detailed reply. But my blog only handles the modeling part; you are talking about reporting optimization. I have made another document which already covers the MultiProvider hint and many more techniques.

      Have a look at this : http://scn.sap.com/community/business-explorer/blog/2013/07/25/how-to-optimize-reporting-performance

      I have already got the doc which you posted about MP.

      Regards,

      Suman

  8. Chandra Sekhar

    Hi Suman,

    Its really a nice blog.

    I have one question:

    We have now started data archiving from BW to NLS. We deleted the historical data from the current cubes and archived it, so the current cube (which is on BIA) has only 5 years of data now.

    After archiving, the delete-index steps are taking longer in the process chain compared to earlier, and query performance has also degraded.

    Can you please tell me why the drop-index steps are taking longer?

    Thanks & Regards,

    Chandra Sekhar.

    1. Suman Chakravarthy K Post author

      Actually, building indexes takes longer than dropping them; dropping should happen quickly, depending on the available background processes. Since you have already archived the old data, that should definitely reduce the burden on the cube. Check with your BASIS or DB team whether there are any issues.

      Just analyze your drop-index job thoroughly to see where exactly it is taking time. It also depends on your database system; your database can hold either bitmap or B-tree indexes.

      Thanks a lot for your compliments 🙂

      1. Chandra Sekhar

        Hi Suman,

        Thanks for the prompt reply.

        I have checked with Basis; they are not sure about the solution. I am still analyzing the issue.

        Thanks & Regards,

        Chandra Sekhar.

  9. Chandresh Patel

    Hi Suman,

    With your option, we clubbed all data up to the year 2010, and after that one cube per year. But 5 years down the line this scenario won't be applicable.

    I have one solution for that: we can divide the cubes based on sales orgs or sales areas. It will be useful because users mainly check reports only for their own region.

    Thanks,

    Chandresh Patel

    1. Suman Chakravarthy K Post author

      HI Patel,

      You said: "With your option, we clubbed all data up to the year 2010, and after that one cube per year. But 5 years down the line this scenario won't be applicable."

      After 5 years you will have to club again, say 2011 to 2016 into a single cube. This is what I have explained in the blog. All our reports are based on time characteristics. If we club by sales org or any other field, you will have to scan all the cubes to fetch your data for the various combinations of fields in the query, which will cause performance issues again. Just think about it.

      My proposal is physical partitioning and your proposal is logical partitioning.

      Thanks for commenting.

      Regards,

      Suman

  10. Sakthi S

    Hi Suman

    Nice blog and a good presentation. I have also done the same during production support, but looking ahead, if we are in the year 2025, how many cubes will we have (with year-wise alterations)?

    Also, how many times will the client allow alterations to the cubes and the data (re)loading?

    In the worst case, if we have queries based on a selection of InfoProviders (InfoCubes), how much consultant effort will this restructuring take?

    That said, this will improve report performance drastically, and it is also recommended for BW systems below 7.3, provided the effort and risks involved are considered acceptable.

    Regards,

    Sakthi.

    1. Suman Chakravarthy K Post author

      Hi Sakthi S ,

      Thanks for expressing your concerns about this blog 🙂 My idea is to keep the last 2 or 3 years of data in separate cubes, and we have to keep merging the older cubes as time goes on. It is to avoid exactly this that 7.3 came up with the SPO concept, which can take care of this scenario automatically. No client will keep the past 15 to 20 years online, right? They will definitely have to archive the older data, so the number of year-wise cubes will be small at any point in time.

      Regards,

      Suman 🙂

