From the Archives: Select Over an Update Statement Part 2
In this post, originally written by Glenn Paulley and posted to sybase.com in May of 2009, Glenn talks about using SELECT over various DML queries, a feature that was added to SQL Anywhere in version 12. This feature is worth re-iterating, as it is very useful, but lightly used feature. For example, it provides a very simple way to retrieve the primary key of a newly inserted row as part of the actual insert statement.
SELECT pkey_col FROM (INSERT INTO mytable(col2) VALUES( 'hello')) REFERENCING (FINAL as t_final) order by 1
In a previous post in May 2009 I expressed admiration for a SQL language feature in IBM’s DB2 product that permits one to use an update DML statement as a table expression in a query’s
FROM clause. Here is a simple example to illustrate DB2’s syntax:
SELECT T_updated.* FROM NEW TABLE ( UPDATE T SET x=7 WHERE y=6 ) AS T_updated WHERE T_updated.pk IN ( SELECT S.fk FROM S WHERE S.z = 8 )
With this construction, it is straightforward to join the modified rows to other tables, return the modified rows to the application via a cursor, output the modified rows to a file, and so on. Without this extension, one would have to define a
BEFORE TRIGGER to copy the modified rows to another (different) table, manage the contents of that other table (and handle concurrent updaters), and execute a separate
SELECT statement over the trigger-inserted table (only) after the
UPDATE statement had been executed. That’s a fair amount of work just to output what changes an update statement made. I am pleased to report that this language feature is now available in the SQL Anywhere Version 12.
Syntax and semantics
The grammar of this feature in SQL Anywhere 12 is as follows:
<table primary> ::= <table or query name> [ [ AS ] <correlation name> [ ( <derived column list> ) ] ] | other forms of table references ... | ( <dml derived table> ) REFERENCING <result option> <dml derived table> ::= <delete statement> | <insert statement> | <merge statement> | <update statement> <result option> ::= OLD [AS] <correlation name> | FINAL [AS] <correlation name> | OLD [AS] <correlation name> FINAL [AS] <correlation name>
This syntax differs from what is offered in DB2, largely for two reasons: the first is that we wanted to make the syntax simple for those applications that wanted to compute the contrasts between the new and old values of a row from an
MERGE statement; and the second was to make it easy to join a dml-derived-table to other objects in the same request.
The semantics of a dml-derived-table is as follows. During DESCRIBE, the dml-derived-table is ignored. At OPEN time, the DML update statement is executed first, and the rows affected by that statement are materialized into a temporary table. The column names in the schema of that temporary table are taken directly from the table being modified – the dml-derived-table can update only one table. One can refer to these values by qualifying them with the correlation name given in the
OLDcolumns contain the values as seen by the scan operator that finds the rows to include in the update operation.
FINALcolumns contain the values after referential integrity checks have been made, computed and default columns filled in, and all triggers fired (excluding
FOR STATEMENT AFTERtriggers).
With these declarations come straightforward restrictions. For
INSERT statements, you can only specify
DELETE statements, you can only specify
OLD. However, for
MERGE statements, you can specify either or both correlation names, permitting easy comparison between the old and new values of any updated row. Note that while the grammar refers to these names as “correlation names”, they don’t refer to a separate range variable over the updated table. Rather, they are merely a syntactic device to make it simple to refer to old and new column values in the same statement, mirroring the syntax in row triggers . In other words, if you specify
REFERENCING (OLD AS O FINAL AS F ), there is an implicit join predicate: O.rowid = F.rowid.
Here is a simple example showing how one can use a dml-derived-tablein a query to report on the impact of a specific database update. Using the sample Demo database, the query answers the following question:
Update all Products with a 7% price increase. List the affected Products and their Orders which were shipped between 10 April 2000 and 21 May 2000 and whose order quantity is greater than 36.
(Aside: the Demo database contains only dates between 2000 and 2001, making the query above a bit convoluted, but I think the reader should be able to understand the general idea.) Here’s what the query/update statement looks like in DBISQL:
and its graphical plan. Note the new “DML” operator in the access plan tree, signifying the generation of the temporary table containing the modified rows:
Another advantage of the use of “correlation names” –
ROW TRIGGER syntax – is that one need not come up with (another) set of unique column names for the updated table. Simply qualifying the column name by the
FINAL correlation name is sufficient.
At the moment, we are restricting dml-derived-tables to include at most one updated table, even though SQL Anywhere has long supported
UPDATE statements over join table expressions. We may relax this restriction in a subsequent release.