Applies to: BI 7.0
After the BW implementation is over in any client, the data from the Source systems will keep on extracting to BW system on a daily basis. The size of the data volume increases day by day incrementally. This fills up(accumulates) all our Infocubes what we have developed in our Data Mart Layer over a period of time. Then the question of maintenance will come into the picture. That’s exactly, I am going to demonstrate about the steps and Strategies to handle it.
Disadvantages with the current Infocubes 😥
- Large data volumes in Cubes, which leads to higher administration efforts
- Dropping and Creating indices takes longer times which can delay process chains
- Aggregates Roll-up will be longer
- Attribute Change run takes substantial time during Master data loads
- Reporting performance will be a big question
- Recreating Aggregates may require every now & then to meet reporting performance.
- Initial Fill up will take substantial times to fill up new aggregates
- Adhoc Querying during Year Ends will be tough, while fetching data from bigger Cubes/Aggregates
I am sharing my current Data Mart Layer which is under a Multiprovider. The Go-Live had happened in 2008.This tells you that the data is available from 2008 to till date in our Infocubes. Our implementation partner had created Year wise Cubes like below.
This above Layer has been serving good so far. Because our data volume is able to adjust with the available cubes. Now the data volume has substantially hiked since 2012 on wards. The data volume is roughly around 13 Crores records in (2012 and 2013) Cube. The previous years data was also some big figure. We have other small cubes in the Multiprovider also. Hope you got the bottle neck situation now.
I have proposed a solution to overcome above mentioned all disadvantages. Archiving is not possible in my case as it is against our management policy. But we are trying our best to convince the management to get the approvals . So I have made a proposal to organize all years data as per the extent of usage. You could see “Very Less usage” and “Mostly Used in Trend Analysis” in the above figure for your understanding.
Strategy 1 : I have proposed to club the older years cubes by keeping the extent of usage in mind. Usually we do Trend Analysis by considering Last Year and Current year figures. Some companies may go with last 3 years also, depends up on the client’s requirement.
I have clubbed 2008 to 2011 into a single Infocube as it is hardly used for business analysis. If you have observed, I have split (2012 & 2013) Cube into separate Cube for each year. Because, As I stated in the beginning, I have 13 Crore records for both years in a single Cube. It has been observed that reports are running slower when the users do trend analysis, as we approach towards the year end.
Eventually, we will have to club 2012, 2013 and 2014(may be) into a single Infocube when the importance of that period is very less. The whole idea is to merge older Cubes as the time passes by based on the importance and usage of data by Business Analysts.
Advantages of this Strategy 🙂
- Lower Administration efforts by focusing on current Cubes
- Faster dropping and re-building indices during data loads
- Trend Analysis can be done quickly
- All the other disadvantages (which are mentioned in the beginning) can be overcome by this approach
Most Important advantage
We all aware that all images(before and after) of a particular document will come to DSO(in overwrite mode) and then updates the delta images from Change-log to Infocubes in addition mode. We clean the change-log regularly as a house keeping activity. But we never cleanse the Infocubes which has accumulated all images of a particular document. You could see lot of unwanted records which are nullified with the after images. That’s why my Cubes hold 13 to 15 crore records.
After taking up this whole exercise, your new cubes will update the latest status of all the historical documents from the EDW layer(Central DSO). This will reduce your cube sizes dramatically. Now the cube sizes are just 2 -3 Crores, but all the reports shows the correct KPIs as like before 🙂
Steps to carry out this entire activity without disturbing the existing setup
- Draw flow charts like above in a white paper. So that you will have great idea on what you are going to do in your system
- Take any one year Cube and make 4 copies(only Structures) as per the second image
- Do not forget to partition these new cubes according to the respective years by 0calmonth. Because, the copied Cubes will have old partitions based on first image. This is a key step to organize your data systematically
- Note: We have EDW Layer(Central DSO which is our total warehouse) under Data Mart Layer. So I will be uploading data to these new Cubes from the central DSO
- Copy the transformation between DSO and existing Cube(any year, because transformation is uniform for all year wise Cubes) to the DSO and New Cubes
- Create Full load DTPs for all new Cubes and load all the respective Cubes by chunks
- Copy the existing Multiprovider with a new technical name and extend the description by “Back up”.
Strategy 2 : The idea in copying is, I will detach the existing year wise Cubes from the original Multiprovider and add the new year wise Cubes to the original Multiprovider(make sure assignments are done to the new cubes). By this, we need not to worry about our Bex Queries, WAD and Portal. Because, all the Queries which are based on the existing Multiprovider will continue to run as usual. The only difference is, the data will be retrieved from the new year wise Cubes. This strategy will completely avoid the risk of touching front-end.
8. Regenerate all your Queries in RSRT to refresh them from the database level. This will ensure all your queries will fetch data from the new Cubes
9. Make sure to change the 0INFOPROV restriction to the new Cubes in your queries, if any
10. Modify all your Process Chains with the new Cube variants
11. Verify the data thoroughly in your Portal Reports by comparing with the data display of you Back Up Multiprovider with same Query restrictions/Filters etc..
12. You can also verify the data by copying Queries from your main Multiprovider to Back up Multiprovider by using RSZC
13. Observe your Portal reports for 2-3 days by comparing with the Back up Multiprovider. If everything is ok, then you can drop all the old Cubes
This is a pure Production Support/maintenance activity, which will have to be taken up over a period of time. As Archiving is not in my scope, I have proposed above ideas and implemented the same.
Hope this will help all our BI developers. Thanks for reading my blog.