From the very beginning, BIA was positioned as an appliance that can be deployed quickly. The feedback we have received from customers has been overwhelmingly positive and suggests that a detailed understanding of how BIA works may not even be required: just create some BIA indexes, ensure that the process chains include a roll-up – and your performance issues are gone! Anyone who has spent a lot of time improving query performance by defining aggregates, partitioning database tables, creating secondary indexes, configuring the OLAP cache, etc., will probably remember the first time he or she executed a (previously) long-running query with BIA. It is easy to see why BIA reduces the total cost of ownership and increases the overall return on investment of the BI environment.
As with everything else in life, the initial excitement is sometimes short-lived. Users get used to the new query response times very quickly and soon expect all queries to perform extremely well. Sooner or later a query is “discovered” whose runtime is not much faster than before BIA was deployed. It is therefore very important that we all have a good understanding of how BIA works, so that we can set the right expectations upfront.
Fortunately, the performance improvement is very predictable. For instance, it doesn’t take long to realize that the size of the result set and the complexity of the query have a much larger impact on the query runtime than the number of selected records in the info cube. With this basic understanding, it should be possible to identify which cubes should be indexed, make design recommendations for queries, show users how to analyze data effectively, etc. To summarize: even though BIA is easy to deploy and doesn’t require a lot of development work, a detailed understanding of BIA will allow you to get much more out of it. In fact, you will not only achieve better query performance; with minor design changes you will also be able to reduce data latency and disk space requirements.
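To make the point about result-set size concrete, here is a toy sketch in plain Python (not SAP code; all names and data are made up). It models a column-oriented, in-memory aggregation, which is the principle behind BIA's speed: scanning the rows is cheap and parallelizable, so what the user ultimately perceives is dominated by how many rows the query has to materialize and ship to the front end.

```python
# Toy illustration of columnar aggregation (hypothetical data, not SAP code).
# Two queries scan the same one million rows; the difference is the size
# of the result set each one produces.
from collections import defaultdict
import random

random.seed(42)

# A "column store": each column is a flat list of one million values.
n_rows = 1_000_000
region = [random.randrange(10) for _ in range(n_rows)]  # low-cardinality dimension
order_id = list(range(n_rows))                          # high-cardinality dimension
revenue = [1.0] * n_rows                                # measure

def aggregate(keys, values):
    """Hash aggregation over two parallel columns (GROUP BY keys, SUM(values))."""
    totals = defaultdict(float)
    for k, v in zip(keys, values):
        totals[k] += v
    return totals

# Query 1: GROUP BY region -> tiny result set (10 rows to transfer and render).
by_region = aggregate(region, revenue)

# Query 2: GROUP BY order_id -> a result set as large as the cube itself,
# which is what actually makes such a query feel slow to the end user.
by_order = aggregate(order_id, revenue)

print(len(by_region), len(by_order))  # 10 vs 1000000 result rows
```

Both queries touch every record, but only the second one forces a million-row result set through the stack, which is why query design (small result sets, drill down step by step) matters more than raw cube size once BIA is in place.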
In case you want to learn more about BIA, we have some good news. The SAP NetWeaver BI RIG will publish a series of blogs on BIA to address topics like the following:
Deployment Best Practices – How do you know which info cubes should be indexed? How do you have to change your process chains? (Hint: Using the “roll-up” process type may not always be a good choice.)
BIA Maintenance – How do you determine whether an index needs to be rebuilt? How often do you have to perform a reorg? When do you need to set up a delta index?
BIA Consistency Checks – What check programs are available? How often should you run them?
Aggregates vs. BIA vs. Info Cube – What are the main differences with respect to query response time, data latency and disk space requirements? When does it make sense to use aggregates? Should info cubes be compressed?
New BIA Features – We will discuss features that have been introduced recently, such as backup and recovery, FEMS compression and package-wise data reads. We will also provide an outlook on what is to come in the near future.
Finally, we will provide a lot of information on
Data Modeling and BIA
Since this topic hasn’t been discussed much on SDN, we will provide you with a brief introduction:
There are basically two ways of looking at BIA. One is the view we mentioned in the introduction to this blog: you deploy BIA, create some BIA indexes and – voilà – your response times are much shorter. There is no need to optimize the data model of the info cube, or to create secondary indexes, aggregates, etc. In other words, less work is spent on maintenance and performance optimization, which in turn leads to the aforementioned lower total cost of ownership. There is, of course, nothing wrong with this view. After all, this is why you decided to deploy BIA in the first place.
A different way of looking at BIA is the following: rather than just deploying BIA and expecting better query response times while spending less effort on performance-related tasks, we are suggesting a (relatively minor) re-design of your (enterprise) data warehousing architecture. You will be surprised to learn that a lot of performance-related design guidelines become irrelevant once BIA is in place. Just imagine you have to design an info cube but don’t have to worry about performance! How would that info cube differ from what you have in place right now? Should BIA even have an impact on your Enterprise Data Warehousing Architecture? I bet that if you spend some time thinking about this, you will come to the following conclusion: BIA changes everything!
Without going into much detail, we would like to provide you with a simple example to demonstrate how BIA can have an impact on the data staging process as well as the query response time.
Consider a very simple scenario consisting of a DSO with sales order line items and an info cube that contains sales order information at a more aggregated level. Users can slice and dice the data in the info cube and – in case actual line item information is needed – call a DSO report using the report-to-report interface. (We have seen this particular scenario many times.) How would you model this scenario if you could utilize BIA?

First of all, BIA can efficiently aggregate a huge number of records. When executing a query, you want to make sure that the result set remains relatively small; the number of records in the info cube, however, is almost irrelevant, since BIA can aggregate hundreds of millions of records within a very short period of time. So the first change you would make is to store the sales order line items in the info cube. Users can still run the same reports as before, but they no longer have to call a DSO report when they need to look up a particular line item. All information is in the info cube, and the report-to-report interface is now obsolete.

In fact, there is no need to run any report on the DSO at all. If the DSO is no longer used for reporting, you don’t need the SIDs and can therefore deselect the BEx flag. Moreover, if the DataSource from which the DSO is loaded provides delta information, you might want to consider turning the DSO into a write-optimized DSO. (You couldn’t do that before, because write-optimized DSOs cannot be used for reporting.) In other words, you have eliminated the DSO activation and thus reduced data latency. Furthermore, the disk space requirement is reduced considerably, since a write-optimized DSO has no change-log table.
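The latency argument above can be sketched in a few lines of plain Python (hypothetical function and field names, a deliberately simplified model of BW staging, not actual BW code): a standard DSO must run an activation step that compares each delta record against the active table and writes before/after images to a change log, whereas a write-optimized DSO simply appends the delta.

```python
# Simplified model of the two DSO load paths (illustrative only, not BW code).

def load_standard_dso(active, change_log, delta):
    """Standard DSO: activation compares each record against the active table
    and writes before/after images to the change log."""
    for key, value in delta:
        before = active.get(key)
        if before is not None:
            change_log.append((key, -before))  # before-image (reversal)
        change_log.append((key, value))        # after-image
        active[key] = value

def load_write_optimized_dso(table, delta):
    """Write-optimized DSO: straight append - no activation, no change log."""
    table.extend(delta)

# Delta from the DataSource: (sales order line item, revenue).
delta = [("item-1", 100.0), ("item-2", 250.0)]

active, change_log = {}, []
load_standard_dso(active, change_log, delta)

wo_table = []
load_write_optimized_dso(wo_table, delta)

# Both paths stage the same data, but the write-optimized path skipped the
# record-by-record comparison and the change-log writes entirely.
print(len(change_log), len(wo_table))  # 2 2
```

The point is not the handful of records in the toy example but that an entire step, and its tables, drops out of the process chain; with millions of line items per load, that is where the data latency and disk space savings come from.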
While this example is rather simple, it clearly demonstrates that BIA can have a positive impact on the whole data staging process. Overall, you may end up with reduced data latency (in addition to the reduction you get by eliminating aggregates and secondary indexes), fewer info providers and fewer queries. With this in mind, have a look at your architecture and see whether you can find areas that can be improved after a successful deployment of BIA. As promised, we will cover this topic in more detail in future blogs.