From the Archives: Factors to consider for utilizing materialized views
In this post, originally written by Glenn Paulley and posted to sybase.com in September of 2009, Glenn talks about the two different types of materialized views and the questions to think about when considering using materialized views.
Starting with Version 10, SQL Anywhere supports deferred-maintenance materialized views; version 11 introduced support for immediately-maintained materialized views. The major differences between the two are:
- With deferred-maintenance materialized views, the query optimizer may answer queries utilizing one or more materialized views that contain stale data. The “staleness” of any view, and whether or not the view can be used in query answering, is entirely under DBA control. However, deferred-maintenance views permit one to tradeoff data accuracy with the performance gains offered by the materialized view, and the update maintenance cost for that view.
- Conversely, immediately-maintained materialized view are updated within the same transaction as the base-table updates upon which the materialized view is defined. Immediately-maintained views offer a derived, up-to-the-minute copy of the view’s underlying base tables, at the expense of requiring view maintenance with each update operation.
In summary, deferred-maintenance views permit the amortization of maintenance costs for the materialized view. In contrast, immediately-maintained materialized views require each update transaction to incur the overhead of view maintenance, which may result in contention between concurrently-executing transactions.
The deferred-or-immediate materialized view decision is but one of the factors to consider by a database administrator when deciding whether or not to use a materialized view – in the literature this question is known as the view selection problem. However, there are several considerations other than the deferred-or-immediate maintenance tradeoff. Here is a checklist of questions to consider when deciding upon the utility of a materialized view:
- What is the set of queries that can benefit from creation of materialized views?
Answering this question involves an analysis of the query workload for the system, including detailed consideration of both the definition and frequency of individual queries. A good starting point is to begin with frequently executed and expensive queries, particularly those expensive queries with critical response time requirements. SQL Anywhere’s Application Profiling capability, contained in the Sybase Central SQL Anywhere plug-in, is an excellent way to capture your application’s workload and determine the “heavy hitters” contained within it.
Materialized views that can benefit multiple queries in common represent the most significant gains, because the storage and maintenance costs for the view are constant but the benefits of the materialized view increase with usage. Also, remember that a single query can make use of multiple materialized views. Splitting a complex materialized view into multiple views may permit the optimizer to utilize the materialized view to assist a larger set of queries. If considering a materialized view involving aggregation (
GROUP BY), it is often better to materialize basic functions that will permit a wider applicability to multiple queries; for example,
AVG()can be obtained from a combination of
COUNT(*). SQL Anywhere’s query optimizer is intelligent enough to utilize
COUNT()from a materialized view when the original query contains
- Does potential improvement in query performance outweigh the storage and maintenance costs of materialized views?
One must tradeoff the potential improvements in query performance with the space requirements for materialized views – and their indexes – and the maintenance costs for the view. Here, one must be aware of the update patterns from application requests; materialized views on heavily-updated base tables may have unacceptable maintenance costs, for two reasons: the cost of the updates to the materialized views themselves, and the increase in lock contention amongst update transactions from concurrent updates to the table (or index) containing the materialized view. This latter problem is difficult to assess without proper capacity planning.
DBAs often fail to realize that materialized views can be indexed, just like any other base table. Indexes are particularly useful when the application query contains additional joins to tables that are not included in the view; if indexes exist, the optimizer has more physical operator choices – particularly indexed nested-loop join – that can result in significant speed improvements.
- Can the same query be allowed to return different results, if the optimizer chooses to utilize stale data from a materialized view in one case, and chooses to process the underlying (and up-to-date) base tables in another?
- Can stored data for materialized views be allowed to become stale?
- How stale can the data become before it is unacceptable?
These latter questions pertain to the tradeoffs of immediate versus deferred-maintenance materialized views. As described above, deferred-maintenance permits one to amortize view maintenance across multiple update transactions, at the expense of data staleness. Whether or not your application can benefit from deferred-maintenance views is primarily a business question, not a systems one.
In a later post, I’ve present some examples of using materialized views. My thanks to colleague Anil Goel for providing much of the detail in this article.