From The Archives: SELECT over an UPDATE statement
In this post, originally written by Glenn Paulley and posted to sybase.com in May of 2009, Glenn talks about using SELECT over various DML queries, a feature that was added to SQL Anywhere in version 12
Few, I think, would argue with the statement that SQL is a complex language, and there is considerable “bulk” in ANSI/ISO standard SQL that is of debatable value. What I’d like to describe in this post, however, is an extension to standard SQL that I think is really quite useful – so much so that I wish I’d thought of it, but I have to give credit to the people at IBM, Krishna Kulkarni in particular. The extension is another form of “table reference” in the
FROM clause of a
SELECT statement: a table function that contains an update statement (
DELETE). It’s easiest to illustrate this extension with an example:
SELECT T_updated.* FROM NEW TABLE ( UPDATE T SET x=7 WHERE y=6 ) AS T_updated WHERE T_updated.pk IN ( SELECT S.fk FROM S where S.z = 8 )
What does this SQL statement do? First, the embedded
UPDATE is executed, creating a virtual derived table consisting of the modified rows of table T, as specified by the
NEW keyword. Second, the quantifier T_updated is evaluated over these virtual derived table rows, and the
SELECT statement processes them as if they were from any other table reference; in the example, only the modified rows of T that satisfy the
WHERE clause on line 3 become part of the
SELECT‘s result set. With the above model, it is straightforward to join the modified rows to other tables, return the modified rows to the application via a cursor, output the modified rows to a file, and so on. So simple, it’s brilliant. Without this extension, one would have to define a
BEFORE trigger to copy the modified rows to another (different) table, manage the contents of that other table (handle concurrent updaters), and execute a separate
SELECT statement over the trigger-inserted table (only) after the
UPDATE statement had been executed. That’s a fair amount of work just to output what changes an update statement made. Here is the grammar for this form of table reference:
<table primary> ::= <table or query name> [ [ AS ] <correlation name> [ ( <derived column list> ) ] ] | other forms of table references, such as derived tables | <data change delta table> [ AS ] <correlation name> [ ( <derived column list> ) ] <data change delta table> ::= <result option> TABLE ( <data change statement> ) <data change statement> ::= <searched delete statement> | <insert statement> | <merge statement> | <searched update statement> <result option> ::= NEW | OLD | FINAL
NEW signifies that the virtual derived table will contain new or modified rows as a result of an
OLD signifies that the virtual derived table will contain copies of rows prior to their modification or deletion.
FINAL causes an exception to be raised if the rows that are modified are altered by any
In addition to the above semantics, there are additional restrictions on how the update statement can be utilized, largely a result of avoiding problems due to the order of evaluation. One such restriction is that only one “data change delta table” can be specified in a single request.
The ability to specify a “data change delta table” is an important extension to standard SQL, and has been available in IBM’s DB2 product since the DB2 8.1.4 release. In March 2008, this feature was adopted as an enhancement to ANSI/ISO standard SQL, and should appear as part of the forthcoming SQL:2011 standard.
One shortcoming of the syntax supported by both DB2 and the forthcoming SQL standard is that one cannot directly refer to both the original and modified versions of the same row, in the same way one can do in a row-level trigger by using the
REFERENCING [ OLD | NEW ] AS … syntax. Instead, one has to construct two common table expressions (using two
WITH clauses), one to construct a derived table containing the “after” version with the update DML statement, and the other to return the table’s existing values. The query must then join these two quantifiers together to get the old and new versions of the same row. Jan-Eike Michels of IBM’s Santa Teresa laboratory was kind enough to provide an example, which I’ve modified slightly:
WITH temp1 (T1PK, T1C1 ) AS (SELECT PK, C1 FROM T), temp2 ( T2PK, T2C2 ) AS (SELECT PK, C1 FROM NEW TABLE (UPDATE T SET C1 = C1 + 2)) SELECT T2PK AS keyvalue, T1C1 AS old, T2C1 AS new, T2C1 - T1C1 AS delta FROM temp1 JOIN temp2 ON (temp1.T1PK = temp2.T2PK )
At the moment, SQL Anywhere does not support “data change delta tables” – at least, not yet (note: As mentioned in the introduction, this feature was added to SQL Anywhere version 12). My thanks to Jan-Eike for his example and for answering some technical questions surrounding this SQL extension, and to Krishna for bringing it forward to the INCITS DM32.2 SQL standards committee.