In this post, originally written by Glenn Paulley and posted to sybase.com in February of 2011, Glenn talks about the effect of using snapshot isolation in combination with materialized views.


Snapshot isolation and materialized views are two important features that have been part of the SQL Anywhere server since the Version 10 release, which first shipped in September 2006. In the past, I’ve written articles that explain the tradeoffs of using snapshot isolation and presented material on the space-time tradeoffs of materialized views.

In one article on snapshot isolation, I wrote:

Of course, snapshot isolation doesn’t come for free. It is necessary for the database system to construct archive copies of changed data in anticipation of new snapshot transactions. With SQL Anywhere, copies of snapshot rows are managed automatically, written to the temp file (which grows on demand) as necessary. However, though the management impact is near zero, query performance can suffer as snapshot rows may need to be fetched individually from the snapshot row store in the temp file, based on the snapshot semantics of the transaction. The degree of performance degradation depends entirely on the application and its workload, and will be worse with update-intensive workloads.

In this article, I’ll describe the interaction between snapshot isolation and materialized views, particularly as it pertains to the SQL Anywhere server’s transaction log, and discuss the inherent tradeoffs.

Refreshing materialized views

With deferred maintenance materialized views, modifications to the materialized view’s underlying base tables proceed without any additional locking or (immediate) maintenance overhead. Put another way, update transactions modify the values or rows of base tables, and upon COMMIT these changes are made persistent. However, in this situation the materialized view contents then become stale; whether or not an SQL statement can exploit the stale contents of the materialized view is controlled by setting options for that query’s connection. To refresh the view’s contents, one issues the REFRESH MATERIALIZED VIEW statement on that view, which effectively performs a TRUNCATE TABLE on the base table containing the view, and then immediately performs an INSERT ... FROM SELECT to re-populate the materialized view.

On the other hand, with immediately-maintained materialized views a REFRESH MATERIALIZED VIEW statement is required only to initially populate the view. After that, any modifications to underlying base tables are correspondingly applied to the immediately-maintained materialized view(s) that refer to those base tables within the same transaction. When the transaction completes, either the transaction performs a COMMIT to make the changes permanent, or issues a ROLLBACK to undo them.

Example

Consider the following simple, contrived example – we create a simple, single-table immediately-maintained materialized view over the Products table in the sample DEMO database:


CREATE MATERIALIZED VIEW groupo.shirt_products( prod_id, prod_name, prod_description, prod_size, prod_color, prod_quantity, prod_unit_price)
AS
SELECT "id", "name", "description", "size", color, quantity, unitprice
FROM Products
WHERE "name" LIKE '%shirt%'

To make the view immediately-maintained, we first define a unique index on the table instantiating the view:


CREATE UNIQUE INDEX products ON groupo.shirt_products (prod_id ASC);

and then specify that the view should be maintained immediately:


ALTER MATERIALIZED VIEW groupo.shirt_products IMMEDIATE REFRESH

and, finally, initialize the materialized view:


REFRESH MATERIALIZED VIEW groupo.shirt_products

With the creation of the materialized view now complete, we start a new transaction and modify some of the underlying rows of the Products base table:

UPDATE products SET description = ‘Modified’ WHERE “name” LIKE ‘%Tee Shirt%’; COMMIT

If we look at the contents of the materialized view, we can see that the modifications to the underlying Products table are now reflected in the view:

Contents of the shirt_products materialized view

Materialized views and the transaction log

If we look at the contents of the transaction log for the DEMO database at this point – using the DBTRAN utility – we see the following:


--CONNECT-1010-0000975095-DBA-2011-02-02 11:38
--BEGIN TRANSACTION-1010-0000975106
BEGIN TRANSACTION
go
--SQL-1010-0000975109
begin
  set temporary option first_day_of_week = '7';
  set temporary option date_order = 'YMD';
  set temporary option nearest_century = '50';
  set temporary option date_format = 'YYYY-MM-DD';
  set temporary option timestamp_format = 'YYYY-MM-DD HH:NN:SS.SSS';
  set temporary option time_format = 'HH:NN:SS.SSS';
  set temporary option default_timestamp_increment = '1';
  set temporary option timestamp_with_time_zone_format = 'YYYY-MM-DD HH:NN:SS.SSS+HH:NN';
  create materialized view groupo.shirt_products( prod_id,prod_name,prod_description,prod_size,prod_color,prod_quantity,prod_unit_price )
    as select products.id,products.name,products.description,products.size,products.color,products.quantity,products.unitprice
      from GROUPO.products
      where products.name like '%shirt%';
  set temporary option first_day_of_week = ;
  set temporary option date_order = ;
  set temporary option nearest_century = ;
  set temporary option date_format = ;
  set temporary option timestamp_format = ;
  set temporary option time_format = ;
  set temporary option default_timestamp_increment = ;
  set temporary option timestamp_with_time_zone_format = ;
end
go
--COMMIT-1010-0000976304
COMMIT WORK
go
--BEGIN TRANSACTION-1010-0000976307
BEGIN TRANSACTION
go
--SQL-1010-0000976310
comment to preserve format on view groupo.shirt_products is
  'create materialized view groupo.shirt_products( prod_id, prod_name, prod_description, prod_size, prod_color, prod_quantity, prod_unit_price)
as select "id", "name", "description", "size", color, quantity, unitprice
from products
where "name" like ''%shirt%'''
go
--COMMIT-1010-0000976661
COMMIT WORK
go
--BEGIN TRANSACTION-1010-0000976664
BEGIN TRANSACTION
go
--SQL-1010-0000976667
create unique index products on groupo.shirt_products(prod_id asc)
go
--COMMIT-1010-0000976745
COMMIT WORK
go
--BEGIN TRANSACTION-1010-0000976748
BEGIN TRANSACTION
go
--SQL-1010-0000976751
alter materialized view groupo.shirt_products immediate refresh
go
--COMMIT-1010-0000976826
COMMIT WORK
go
--CHECKPOINT-0000-0000976829-2011-02-02 11:39
--BEGIN TRANSACTION-1010-0000976859
BEGIN TRANSACTION
go
--SQL-1010-0000976862
refresh materialized view groupo.shirt_products
go
--COMMIT-1010-0000976921
COMMIT WORK
go
--BEGIN TRANSACTION-1010-0000976924
BEGIN TRANSACTION
go
--UPDATE-1010-0000977130
UPDATE GROUPO.Products
  SET Description='Modified'
WHERE ID=300
go
--UPDATE-1010-0000977152
UPDATE GROUPO.Products
  SET Description='Modified'
WHERE ID=301
go
--UPDATE-1010-0000977174
UPDATE GROUPO.Products
  SET Description='Modified'
WHERE ID=302
go
--COMMIT-1010-0000977494
COMMIT WORK
go


There are a few things I’d like to point out regarding the transaction log contents above:

  • Note that the CREATE MATERIALIZED VIEW statement appears as part of a batch (lines 18-40) that includes statements to re-establish pertinent connection option settings that were in effect at the time the CREATE MATERIALIZED VIEW statement was issued.
  • The REFRESH MATERIALIZED VIEW statement is logged as a separate statement in the transaction log (line 85). By default, the REFRESH MATERIALIZED VIEW statement uses WITH SHARE MODE locking if the WITH clause is not specified. With immediately-maintained views, only WITH SHARE MODE, WITH EXCLUSIVE MODE, and WITH ISOLATION LEVEL SERIALIZABLE are permitted lock modes.
  • Row modifications to the shirt_products materialized view do not appear in the transaction log. Since the view is an immediately-updateable view, it is sufficient for the transaction log to contain only the base table modifications caused by the UPDATE statement (lines 94-105). If the log is replayed during recovery, the UPDATE statements in the transaction log will automatically invoke the corresponding modifications to the materialized view.

Impact of snapshot isolation

Enabling a SQL Anywhere database for snapshot isolation is done by setting the option allow_snapshot_isolation as follows:


SET OPTION PUBLIC.allow_snapshot_isolation = 'On';

When snapshot isolation is enabled, the default isolation setting for the REFRESH MATERIALIZED VIEW statement is WITH ISOLATION LEVEL SNAPSHOT, so that the connection executing the REFRESH statement is not blocked by other concurrent update transactions that are simultaneously modifying the underlying base tables referenced in the materialized view.

While this is advantageous, there is a tradeoff. The transaction log must contain enough detail about the modifications to the database that we can be confident that applying the log(s) to an arbitrary backup during recovery will re-establish as precisely as we can the contents of the database. With a REFRESH MATERIALIZED VIEW WITH ISOLATION LEVEL SNAPSHOT statement, the statement will “see” the rows of the underlying base tables according to snapshot semantics – that is, the REFRESH statement will fail to “see” any modified rows by committed transactions that took place after the REFRESH statement was started. To properly replay that REFRESH statement from a transaction log during recovery, the server would need to retain the state of every row of every underlying base table so that the REFRESH statement can compute the identical contents of the materialized view.

Similar phenomena can occur if the REFRESH MATERIALIZED VIEW statement is run at any isolation level other than SERIALIZABLE, SHARE MODE, or EXCLUSIVE MODE. Rather than attempt to provide all of this necessary context in the transaction log, when a weaker isolation level is used with a REFRESH MATERIALIZED VIEW statement the server logs both the REFRESH statement and each individual row inserted to the materialized view, rather than log the REFRESH statement alone.

In this case, if one looks at the SQL statements generated by the transaction log utility DBTRAN, the DBTRAN output will only include the REFRESH MATERIALIZED VIEW statement, since the REFRESH statement is all that is necessary when translating a log to SQL to apply it to a different database. On recovery, however, the server ignores the REFRESH MATERIALIZED VIEW statement, and utilizes the individual INSERT statements in the log to recover the materialized view contents.

We are considering enhancements to DBTRAN so that the INSERT statements logged in these cases appear in the output as comments, so that their existence can be verified.

To report this post you need to login first.

Be the first to leave a comment

You must be Logged on to comment or reply to a post.

Leave a Reply