Skip to Content

In this post, originally written by Glenn Paulley and posted to sybase.com in April of 2009, Glenn provides details on the MERGE statement in SQL Anywhere.  The latest documentation on the MERGE statement can be found here.

The MERGE statement first appeared as Feature F312 in the SQL:2003 ANSI/ISO standard. MERGE is useful in situations where one has to insert a set of rows but when cleaning or other ETL (Extract-Transform-Load) processing is required. MERGE permits one to implement quite sophisticated logic to process a set of tuples to be loaded in a single statement. In this article, I’d like to briefly outline the functionality of the MERGE statement, how it has changed between the SQL:2003 and SQL:2008 standards, and the extensions to MERGE that are supported in SQL Anywhere. To do this, I’ll extensively borrow from a presentation on the MERGE statement and materialized views that my colleague Anil Goel presented at the Sybase Techwave conference in Las Vegas in August, 2008.

Example

When processing a set of input rows to be inserted into a base table, it is sometimes necessary to perform “INSERT-else-UPDATE” processing, which is precisely the main intent of the MERGE statement. In addition, it would be beneficial to provide:

  • Provide alternate actions for INSERT;
  • Conditionally skip the INSERT and/or the UPDATE for specific rows;
  • DELETE an existing row; and
  • Permit arbitrary conditions to be specified for alternate actions.

Suppose we have an existing, non-empty table T and wish to process a set of input rows S as follows:

FOR each input row in S DO
   IF EXISTING THEN // row of S matches an existing row in table T
     IF condition-1 THEN SKIP
     ELSE IF condition-2 THEN ERROR
     ELSE IF condition-3 THEN DELETE FROM T
     ELSE UPDATE T
   ELSE
     IF condition-4 THEN SKIP
     ELSE IF condition-5 THEN ERROR
     ELSE INSERT INTO T    
END

This entire piece of logic can be executed as a single MERGE statement as follows. The example below merges existing table rows from the base table T with rows from an input file on the client machine, which are accessed using Version 11’s OPENSTRING construction:

MERGE INTO T
USING (SELECT * FROM OPENSTRING( FILE '/usr/paulley/techwave/2008/data1.txt')
           WITH( id INT, qty INT, cmt LONG VARCHAR)  S) S
ON PRIMARY KEY
WHEN NOT MATCHED AND S.cmt = 'ignore' THEN SKIP
WHEN NOT MATCHED AND S.qty < 0  THEN RAISERROR
WHEN NOT MATCHED AND S.qty = 0 THEN INSERT VALUES( S.id, 1 )
WHEN NOT MATCHED THEN INSERT VALUES( S.id, S.qty )

WHEN MATCHED AND S.cmt = 'ignore' THEN SKIP
WHEN MATCHED AND S.cmt = 'new'  THEN RAISERROR
WHEN MATCHED AND T.qty + S.qty

MERGE semantics – briefly

The MERGE statement defined in the SQL:2003 standard permitted at most one WHEN MATCHED and at most one WHEN NOT MATCHED clause.
The semantics of SQL:2003 MERGE are as follows.

  • For each row of target-object, determine if the MERGE search-condition evaluates to TRUE for each row of the source-object, to find ‘matching’ rows from the source and target.
  • Generate an error if more than one matching row is found.
  • Add the single matching row, if found, to the set of rows to be processed by the WHEN MATCHED clause, if present;
  • The WHEN NOT MATCHED clause, if present, processes rows of source-object that fail to match any target-object row.

With Feature F313, SQL:2008 permits multiple WHEN [NOT] MATCHED clauses with optional search conditions. In this case, the order of the WHEN clauses matter: it is the merge action of the first WHEN clause that satisfies its search condition that is executed. However, even with Feature F313, the standard MERGE statement isn’t that flexible: the only operation a WHEN MATCHED clause can perform is an UPDATE, and the only operation a WHEN NOT MATCHED clause can perform is an INSERT. With SQL Anywhere 11, we extended MERGE statement functionality in several ways, some of these inherited from the older INSERT ... ON EXISTING statement that was introduced to SQL Anywhere Version 8 in 2002:

  • One can use MERGE with an input dataset from a client machine using the OPENSTRING construction;
  • use an explicit column list to permit column reordering from input to target table;
  • use of WITH AUTO NAME syntax to do column matching by name;
  • the MERGE search-condition can specify PRIMARY KEY as a shorthand to generate a condition based on primary key columns;
  • several additional WHEN MATCHED actions:
    • DELETE the existing row
    • RAISERROR [ error-number ]
    • SKIP the input row
    • UPDATE [ DEFAULTS { ON | OFF } the existing row
  • additional WHEN NOT MATCHED actions:
    • INSERT the input row
    • RAISERROR [ error-number ]
    • SKIP the input row

A detailed description of the semantics of the MERGE statement, along with the semantics of INSERT, DELETE, and UPDATE row-level triggers on the target table, can be found in the SQL Anywhere documentation. Here is the BNF for the MERGE statement syntax supported by SQL Anywhere 11:

merge                ::= "MERGE" "INTO" merge_into_tabterm merge_using
                         simple_tabterm merge_on_clause merge_op_list
merge_using          ::= "USING" | "USING WITH AUTO NAME"
merge_into_tabterm   ::= ( tabref | "(" query_expr ")" as identifier
                         | "(" "WITH" with_list query_expr ")" as identifier ) derived_collist
merge_on_clause      ::= "ON" merge_search_cond
merge_search_cond    ::= searchcond
                         | "PRIMARY" "KEY"
                         | "INDEX" indexname
merge_op_list        ::= merge_op { "," merge_op }
merge_op             ::= "WHEN MATCHED" "THEN" merge_matched_op
                         | "WHEN MATCHED" "AND" searchcond "THEN"
                           merge_matched_op
                         | "WHEN NOT" "MATCHED" "THEN"
                           merge_not_matched_op
                         | "WHEN NOT" "MATCHED" "AND" searchcond
                           "THEN" merge_not_matched_op
merge_common_op      ::= "SKIP"
                         | "RAISERROR"
                         | "RAISERROR" integer_or_var
merge_matched_op     ::= merge_common_op
                         | "UPDATE SET" setlist
                         | "UPDATE"
                         | "UPDATE" "DEFAULTS" "ON"
                         | "UPDATE" "DEFAULTS" "OFF"
                         | "DELETE"
merge_not_matched_op ::= merge_common_op
                         | "INSERT" "VALUES" "(" ins_vallist ")"
                         | "INSERT" "(" inscollist ")" "VALUES" "(" ins_vallist ")"
                         | "INSERT"
ins_vallist          ::= { ins_valitem "," } ins_valitem
ins_valitem          ::= "DEFAULT" | expression
inscollist           ::= { inscolname "," } inscolname
inscolname           ::= identifier [ "." identifier ]

Another useful extension that we are considering for a future release is to support outer joins between the source- and target-objects, a construction already supported by Microsoft SQL Server 2008. My thanks to Anil Goel for putting together most of the above material for our presentation at Techwave 2008.

To report this post you need to login first.

Be the first to leave a comment

You must be Logged on to comment or reply to a post.

Leave a Reply