SAP on IBM i: Using the Query Supervisor to identify expensive SQL statements
Query Supervisor overview
In May 2021, IBM has introduced the Query Supervisor with Db2 for i PTF Group SF99704 Level 13 for IBM i 7.4 and Db2 for i PTF Group SF99703 Level 24 for IBM i 7.3. If you keep your systems current with PTF levels, you may have already installed those PTF levels and can start using the Query Supervisor now. So, what is the benefit of the Query Supervisor and how can you use it in your SAP landscape?
The Query Supervisor acts like a watch dog for unusually expensive SQL statements, and you can decide to let it bark (send a warning message or write a log file entry) or bite (terminate the execution of the SQL statement). What makes an SQL statement “expensive”? You can either look at the total execution time (elapsed time) or at the resource consumption, such as temporary storage allocation, I/O count or CPU time consumed. The Query Supervisor allows you to define thresholds for those characteristics, and when the defined threshold is exceeded, one or multiple user written exit programs are executed. In the input structure filled by the Query Supervisor, your program gets information about the current SQL statement as well as the threshold type and value that triggered the program execution, so your program can perform different actions depending on the type of threshold. In your program, you can set a return code of type integer. If it is 0 when the program ends, the SQL statement will continue to run. If it is 1, the SQL statement will be terminated with SQLCODE -666 (SQL0666) and SQLSTATE ‘57005’. If the SQL statement originated from an ABAP program, the program will be aborted with a runtime error DBSQL_SQL_ERROR.
Description of a sample exit program
At the end of this article, you can find the source of a CL program that you can use to get your first experience. It can be compiled with the CRTCLPGM command, which is available on any IBM i system. If you want to compile the exit program with the ILE compiler, make sure to specify the default activation group or ACTGRP(*CALLER). When the Query Supervisor invokes this program, it is sending one or two messages to message queue QSYSOPR. Because the program will be executed in the same job as the SQL statement, the message details in QSYSOPR will tell you in which job the expensive statement was executed. With that, you have a convenient way to identify very expensive SQL statements in one particular spot, rather than trying to analyze a variety of data sets to identify expensive SQL statements.
The first message will tell you details about the threshold and the SQL statement in the plan cache. For example, this message could be:
Query Supervisor: QRO hash: 23EFD51C, plan ID: 4344, threshold value: 30, type: ELAPSED TIME
As you can see here, the Query Supervisor had fired because the elapsed time threshold was exceeded, in this case with an execution time of 30 seconds. The plan ID can be used together with the stored procedure QSYS2.DUMP_PLAN_CACHE to get more information about the statement that exceeded the threshold. The stored procedure dumps the statement information into a physical file, which can then be imported as a plan cache snapshot into the SQL Performance Center of IBM i Access Client Solutions (ACS) and evaluated through Visual Explain. To dump the plan cache for the given plan ID, execute the stored procedure QSYS2.DUMP_PLAN_CACHE in SQL through a tool of your choice, be it the function Run SQL Scripts from IBM i Access Client Solutions, the green screen tool STRSQL or any other SQL interface:
CALL QSYS2.DUMP_PLAN_CACHE( FILESCHEMA => '<your library>', FILENAME => '<your filename>', PLAN_IDENTIFIER => 4344)
Please note that the SQL statement may be no longer in the plan cache if too much time has passed between the message in QSYSOPR and the execution of the plan cache dump. In that case, the output file will only contain two records with column QQRID=3018, but no SQL statement information.
The second message is giving you application information. It may or may not exist, depending on the application that executed the SQL statement in question. For SQL statements executed by SAP NetWeaver ABAP, you will usually see this information. An example for such a message could be:
Application information: Application: RSEUTAB_RESET Program: <context><trans></trans><main>RSEUTAB_RESET</main><program>RSEUTAB_RESET</program><cont>23</cont></context> User: JOHNDOE
The user JOHNDOE in this message is not an IBM i user profile (which would be <sid>ADM for SAP installations), but the actual end user within the SAP system. The program information allows you to directly jump into the ABAP source of the SAP program that executed the SQL statement. You can get to it by executing function module DB6_DIAG_VIEW_SOURCE through transaction SE37. The function module takes the program name and the line number as input parameters. You get them from the exit program message; the program name is enclosed between <program> and </program>, the line number is enclosed between <cont> and </cont>. Function key F8 displays the ABAP code where the statement was executed with the cursor on the SQL statement in question:
As you can see in the source below, the sample CL program is setting the return code (&RC) to 0, so here the Query Supervisor will allow the statement to continue to completion and return to the SAP application as if nothing had happened.
Activation of the sample exit program
ADDEXITPGM EXITPNT(QIBM_QQQ_QRY_SUPER) FORMAT(QRYS0100) PGMNBR(*LOW) PGM(<your library>/<your program>) THDSAFE(*YES) TEXT('Query Supervisor send QSYSOPR message')
Note: If you modify and recompile your program later, the new program will be picked up automatically by newly started processes. To ensure that already active processes use the new version, you must remove the exit program and add it again.
After that, you can define thresholds by executing the stored procedure QSYS2.ADD_QUERY_THRESHOLD in SQL through a tool of your choice, be it the function Run SQL Scripts from IBM i Access Client Solutions (ACS), the green screen tool STRSQL or any other SQL interface. When executing the stored procedure, you can specify the threshold type and a threshold value. There are four different threshold types supported, and you can call the stored procedure multiple times, once for each threshold. You can also specify a filter for job names, subsystems or user names, but if you want to watch your overall system, you can omit those filters. The supported threshold types are:
|CPU TIME||The total processing unit time used by the query, in seconds.|
|ELAPSED TIME||The total clock time, in seconds.|
|TEMPORARY STORAGE||The amount of storage, in megabytes (MB), that the query uses.|
|TOTAL IO COUNT||The total number of I/O operations.|
To define a threshold for all SQL statements that exceed a total execution time of 30 seconds, you could call the stored procedure ADD_QUERY_THRESHOLD wit the following parameters:
CALL QSYS2.ADD_QUERY_THRESHOLD( THRESHOLD_NAME => 'MAXTIME', THRESHOLD_TYPE => 'ELAPSED TIME', THRESHOLD_VALUE => 30, LONG_COMMENT => 'Elapsed time > 30 secs for all jobs')
The threshold name must be unique and is needed as input parameter in stored procedure QSYS2.REMOVE_QUERY_THRESHOLD to remove the threshold when it no longer needs to be observed. You can query the system view QSYS2.QUERY_SUPERVISOR to get a list of all defined thresholds.
A complete description of the Query Supervisor and related topics can be found in the IBM i documentation at https://www.ibm.com/docs/en/i/7.4?topic=tools-query-supervisor.
Source code of sample exit program
Below is the sample code of the Query Supervisor exit program described previously. It is intended for educational purposes only. There is no support, and there is no guarantee that the program does not cause damage or fulfills a specific purpose. You can use it on an as-is basis and modify it according to your needs.
PGM PARM(&QRYS0100 &RC) DCL VAR(&QRYS0100) TYPE(*CHAR) LEN(8192) DCL VAR(&RC) TYPE(*INT) LEN(4) DCL VAR(&MSGLEN) TYPE(*INT) LEN(2) DCL VAR(&MSGTXT) TYPE(*CHAR) LEN(1022) DCL VAR(&MSGDTA) TYPE(*CHAR) LEN(1024) DCL VAR(&LOWWORD) TYPE(*UINT) LEN(4) DCL VAR(&HIGHWORD) TYPE(*INT) LEN(4) DCL VAR(&HIGHWORDU) TYPE(*UINT) LEN(4) DCL VAR(&MAXINT) TYPE(*DEC) LEN(15 0) + VALUE(4294967296) DCL VAR(&QROHASH) TYPE(*CHAR) LEN(8) DCL VAR(&PLANID) TYPE(*DEC) LEN(15 0) DCL VAR(&THRESHTYPE) TYPE(*CHAR) LEN(30) DCL VAR(&THRESHVAL) TYPE(*DEC) LEN(15 0) DCL VAR(&OFSCLNTAPP) TYPE(*INT) LEN(4) DCL VAR(&LENCLNTAPP) TYPE(*INT) LEN(4) DCL VAR(&OFSCLNTPGM) TYPE(*INT) LEN(4) DCL VAR(&LENCLNTPGM) TYPE(*INT) LEN(4) DCL VAR(&OFSCLNTUSR) TYPE(*INT) LEN(4) DCL VAR(&LENCLNTUSR) TYPE(*INT) LEN(4) MONMSG MSGID(CPF0000) CHGVAR VAR(&RC) VALUE(0) /* General information about query and threshold: */ CHGVAR VAR(&QROHASH) VALUE(%SST(&QRYS0100 59 8)) CHGVAR VAR(&HIGHWORDU) VALUE(%BINARY(&QRYS0100 67 4)) CHGVAR VAR(&LOWWORD) VALUE(%BINARY(&QRYS0100 71 4)) CHGVAR VAR(&PLANID) VALUE(%DEC(&HIGHWORDU 15 0) * + &MAXINT + %DEC(&LOWWORD 15 0)) CHGVAR VAR(&THRESHTYPE) VALUE(%SST(&QRYS0100 135 30)) CHGVAR VAR(&HIGHWORD) VALUE(%BINARY(&QRYS0100 165 4)) CHGVAR VAR(&LOWWORD) VALUE(%BINARY(&QRYS0100 169 4)) CHGVAR VAR(&THRESHVAL) VALUE(%DEC(&HIGHWORD 15 0) * + &MAXINT + %DEC(&LOWWORD 15 0)) CHGVAR VAR(&MSGTXT) VALUE('Query Supervisor: QRO + hash:' *BCAT &QROHASH *TCAT ', plan ID:' + *BCAT %CHAR(&PLANID) *TCAT ', threshold + value:' *BCAT %CHAR(&THRESHVAL) *TCAT ', + type:' *BCAT &THRESHTYPE) CHGVAR VAR(&MSGLEN) VALUE(%LEN(&MSGTXT)) CHGVAR VAR(%BINARY(&MSGDTA 1 2)) VALUE(&MSGLEN) CHGVAR VAR(&MSGDTA) VALUE(&MSGDTA *TCAT &MSGTXT) SNDPGMMSG MSGID(SQL7064) MSGF(QSQLMSG) MSGDTA(&MSGDTA) + TOMSGQ(*SYSOPR) /* Information about SAP ABAP program and user: */ CHGVAR VAR(&OFSCLNTAPP) VALUE(%BINARY(&QRYS0100 199 4)) CHGVAR VAR(&LENCLNTAPP) VALUE(%BINARY(&QRYS0100 203 4)) CHGVAR VAR(&OFSCLNTPGM) VALUE(%BINARY(&QRYS0100 207 4)) CHGVAR VAR(&LENCLNTPGM) VALUE(%BINARY(&QRYS0100 211 4)) CHGVAR VAR(&OFSCLNTUSR) VALUE(%BINARY(&QRYS0100 215 4)) CHGVAR VAR(&LENCLNTUSR) VALUE(%BINARY(&QRYS0100 219 4)) IF COND(&LENCLNTAPP *GT 0 *OR &LENCLNTPGM *GT 0 + *OR &LENCLNTUSR *GT 0) THEN(DO) CHGVAR VAR(&MSGTXT) VALUE('Application information:') IF COND(&LENCLNTAPP *GT 0) THEN(DO) CHGVAR VAR(&OFSCLNTAPP) VALUE(&OFSCLNTAPP + 1) CHGVAR VAR(&MSGTXT) VALUE(&MSGTXT *BCAT + 'Application:' *BCAT %SST(&QRYS0100 + &OFSCLNTAPP &LENCLNTAPP)) ENDDO IF COND(&LENCLNTPGM *GT 0) THEN(DO) CHGVAR VAR(&OFSCLNTPGM) VALUE(&OFSCLNTPGM + 1) CHGVAR VAR(&MSGTXT) VALUE(&MSGTXT *BCAT 'Program:' + *BCAT %SST(&QRYS0100 &OFSCLNTPGM + &LENCLNTPGM)) ENDDO IF COND(&LENCLNTUSR *GT 0) THEN(DO) CHGVAR VAR(&OFSCLNTUSR) VALUE(&OFSCLNTUSR + 1) CHGVAR VAR(&MSGTXT) VALUE(&MSGTXT *BCAT 'User:' + *BCAT %SST(&QRYS0100 &OFSCLNTUSR + &LENCLNTUSR)) ENDDO CHGVAR VAR(&MSGDTA) VALUE(' ') CHGVAR VAR(%BINARY(&MSGDTA 1 2)) VALUE(&MSGLEN) CHGVAR VAR(&MSGDTA) VALUE(&MSGDTA *TCAT &MSGTXT) SNDPGMMSG MSGID(SQL7064) MSGF(QSQLMSG) MSGDTA(&MSGDTA) + TOMSGQ(*SYSOPR) ENDDO ENDPGM