Fast Data Access (FDA), Part 2: How to detect and tune long-running FDA statements
This is part two of my blog series on fast data access. I recommend that you read my first blog for an introduction to FDA on IBM Db2. In this 2nd blog, I will now concentrate on the tuning of FDA statements.
An FDA statement with a long average runtime may be a sign of a bad access path chosen by the Db2 optimizer, but it may also indicate that the corresponding ABAP application supplies a large FOR ALL ENTRIES table as input table. So, the first step is to distinguish between those two cases.
In the following, I will use a slightly extended version of my simple ABAP test program from my first blog on FDA . You can play with this test program and run it on any test system that uses FDA for FOR ALL ENTRIES statements.
DATA: db6 LIKE standard table of t100. DATA: db6out LIKE standard table of t100. DATA: start_time TYPE i, end_time TYPE i, run_time TYPE float. SELECT * FROM t100 UP TO 100000 ROWS INTO TABLE db6 . GET RUN TIME FIELD start_time. SELECT * INTO TABLE db6out FROM t100 BYPASSING BUFFER FOR ALL ENTRIES IN db6 WHERE sprsl = db6-sprsl AND arbgb = db6-arbgb AND msgnr = db6-msgnr. GET RUN TIME FIELD end_time. run_time = end_time - start_time . WRITE : / run_time . GET RUN TIME FIELD start_time. SELECT SPRSL INTO TABLE db6out FROM t100 BYPASSING BUFFER FOR ALL ENTRIES IN db6 WHERE sprsl = db6-sprsl AND arbgb = db6-arbgb AND msgnr = db6-msgnr. GET RUN TIME FIELD end_time. run_time = end_time - start_time . WRITE : / run_time .
The test program selects 100,000 rows from database table T100 to fill the internal ABAP table db6. After this, the program executes two different FOR ALL ENTRIES statements to join the database table T100 with the internal table db6. The first FOR ALL ENTRIES statements will return 100,000 rows since the WHERE clause uses the primary key columns of table T100 and all columns from table T100 are selected. The second FOR ALL ENTRIES statement uses the same WHERE clause, but only a small number of rows are returned since the statement selects only distinct values for column SPRSL.
We already know that the simple statements in this test program are executed well by the Db2 optimizer and the runtime is not influenced by a bad optimizer decision or a missing index.
Understanding FDA statements in the SQL trace
When using ST05 to trace FDA statements, make sure that you’re using a current DBSL version including the patch from SAP Note 2416250. In addition, you should include the ABAP code fix from SAP Note 2497337 . Otherwise, the ST05 trace entries may be misleading.
If you take an ST05 trace of the test program and filter for table T100, you will find the following aggregated rows. The new ST05 trace (compared to ST05_OLD) shows by default an aggregated view where more details can be displayed on request.
The last two statements in the trace correspond to the FDA statements. The second statement selects only two rows since my test system contains only two languages. Both statements have a relatively high execution time. If you preselect the FDA statements and press the Display Individual Records button as marked above, you can see in more detail where the statements spend their time.
In the Single Records section of the trace, you’ll find several different operation types, some of which are very specific for FDA statements.
Let’s have a closer look at the different steps of the process:
- First of all, both FDA statements are prepared. This involves sending the SQL statement text to Db2. On the Db2 side, the engine may have to generate an access path for the SQL statement and insert the statement into the package cache if it doesn’t exist yet.
- After the PREPARE operation, you’ll find an OPEN statement in the trace. For SQL statements that don’t have LOB input parameters, the statement is executed in this step and most of the statement runtime on the database server is accumulated in this step. However, this is not the case for FDA statements. Here, no network roundtrip to the database takes place and the Db2 CLI client is waiting for the FDA LOB input data.
- After the OPEN operation, you’ll find a number of artificial WRITE_MEMORY_TABLE_DATA statements. During these steps, the DBSL hands over the FDA LOB input data to the Db2 CLI client in packages of roughly 1MB. The aggregated number of rows handed over to the Db2 CLI client in our test program is 100,000 (37367 + 34271 + 28362).Again there’s no network roundtrip to the database server during the WRITE_MEMORY_TABLE_DATA steps. The Db2 CLI client collects the data in memory. After 25MB of data, it may start to write a temporary file into a temp directory. In this case, those files should be cleaned up automatically immediately after statement execution. If you observe left-over CLI* files in the temp directory of your application server, make sure that you’re using a current DBSL version including the patches from SAP Notes 2420469 and 2461903 .
- Finally, the DBSL signals with CLOSE_MEMORY_TABLE_DATA that all FDA input data has been provided. In this step, the Db2 client transfers all collected data to the database server and the database server starts executing the statement. In most cases, the CLOSE_MEMORY_TABLE_DATA step therefore is the step with the longest elapsed time for FDA statements.
- After the database server has executed the statement, the first block of rows is sent to the SAP application server and FETCH operations into the internal ABAP result table can take place. If the result set is small, the FETCH operations will be very fast since data is just copied from Db2 client memory to ABAP memory. You can see this in the second FDA example. If the result set is large, the Db2 client may have to request more data from the database server. In this case, the FETCH operations will include additional network roundtrips and some additional database execution time. It depends on the result set of the FDA statement if the FETCH operations or the CLOSE_MEMORY_TABLE_DATA step take the most elapsed time.
If you find an FDA statement in the SQL trace with a long elapsed time in CLOSE_MEMORY_TABLE_DATA, try to relate this to the number of input tuples sent with WRITE_MEMORY_TABLE_DATA to judge if the statement execution is too slow.
Interpreting the FDA statement execution time in the SQL cache of Db2
In most cases, you will detect long running FDA statements in the SQL cache of the database (open DBACOCKPIT and choose Performance -> SQL Cache or DBACOCKPIT -> Performance -> Top SQL Statements). Again, the question arises: Is the statement slow because it has a huge amount of input tuples, or is it slow because of a bad access path in Db2?
When looking at the SQL cache data, I recommend adding the column Routine Invocations using the Change Layout button in the ALV screen. Applied to our test statements, you may see the following data:
As you can see, the Routine Invocations column roughly contains the number of input tuples of the FDA statement (100,000). Db2 internally counts five additional routine invocations per execution. So you can calculate the average number of rows in the FOR ALL ENTRIES table of the ABAP application with the following formula:
<Avg. Number of Input Tuples> = <Routine Invocations> / <No. of Coord. Executions> – 5
This is very important information for your statement analysis. The formula can be applied in most cases, except for the following where it’s not applicable:
- Db2 is using a suboptimal access path where the MEMORY_TABLE function is executed more than one time.As stated in my first blog, usually a nested loop join with MEMORY_TABLE on the outer side of the join is desirable. In this case, MEMORY_TABLE is only executed once per statement execution. However, Db2 may have decided otherwise. So if you see an extremely huge number of routine executions, this may also be a sign of a bad access path and you should EXPLAIN the statement to check if MEMORY_TABLE is used in the right place.
- The SQL statement is using other external functions. However, this is almost never the case. CDS introduced more external functions in the SAPTOOLS schema, but they are very rarely used in ABAP applications.
If you compare the database execution time of the statements to the elapsed times displayed in the ST05 trace, you will find that for the second statement with the small result set, the database execution time almost matches the elapsed time of the CLOSE_MEMORY_TABLE_DATA step. In case of the first statement with the larger result set, the database execution time is longer than the elapsed time of the CLOSE_MEMORY_TABLE_DATA step because fetching the result set also includes some database execution time.
Typical bad access path patterns for FDA statements
In most cases, FOR ALL ENTRIES statements are not overly complex, the WHERE clause is selective, and a matching index exists that matches the WHERE clause. In exceptional cases where this is not the case, Db2 may decide for a bad access path. There are two main patterns:
A nested loop join is used but there’s not a good index:
The screenshot shows an FDA statement with a bad nested loop join access path. Db2 uses a nested loop join with MEMORY_TABLE on the outer side ([O]) but the used index does not match the predicate on column MSGNR in the WHERE clause. In this way, the predicate cannot be used as START/STOP predicate for the index scan, and multiple full index scans will make this statement very slow.
The statement above is artificial, but it demonstrates the problem. In this case, creating an index with column MSGNR as the left most column will help.
MEMORY_TABLE is used as the inner table of a Join:
The statement from the screenshot above has been extracted from a customer incident. The access path shows an FDA statement where MEMORY_TABLE is used as the inner table of a nested loop join. In this way, MEMORY_TABLE needs to be executed multiple times and the Routine Invocations count in the SQL cache will explode. Even worse, the used index does not help for the given WHERE clause.
In this case, creating an index with the first columns CLIENT, SRC_GUID and maybe QUAN_CF and FINAL_ENTRY may have helped to get a good access path.
There were also defects in Db2 that resulted in such bad access patterns (see, for example, SAP Note 2338139). If you start with FDA and you’re still using Db2 10.5, you should at least have Fix Pack 8. Since we’re constantly working on improvements for the Db2 engine and the Db2 optimizer, I recommend that you always use the most current versions and Fix Pack levels of Db2.
Tuning options for FDA statements
I encourage you to use FDA on your system. Some of our larger customers already do so with their production systems without problems.
However, there have always been OPEN SQL statements and, in particular, FOR ALL ENTRIES statements that required statement tuning because maybe the use case is customer-specific or the data distribution in the underlying tables is very special, or maybe Db2 does something odd and immediate help is needed. Feel free, of course, to open a customer incident if you need help with tuning your statement. Still, I’d like to briefly sketch existing options in this blogpost.
I already mentioned the most important tuning option “Creating a matching index”. Other tuning options like adding an optimization guideline (SAP Note 868888) or an SAP optimizer profile (SAP Note 1818503) also exist for FDA statements. Here, I concentrate on ABAP hints that are specific for FDA statements (SAP Note 2575792).
MEMORY_TABLE cardinality hints
At prepare time of an FDA statement, the Db2 optimizer cannot know how many rows the MEMORY_TABLE table function will provide. Therefore, it needs to make a default assumption if no other information is provided. The built-in Db2 default for such cases is 1,000 rows. But we found that the optimizer sometimes decided for a hash join too early with this default build. That’s why we currently add a CARDINALITY 1 clause to our FDA statement. This is subject to change in the DBSL with future Db2 versions.
SELECT DISTINCT … FROM … , TABLE( SAPTOOLS.MEMORY_TABLE( CAST( ? AS BLOB( 2G )) ) CARDINALITY 1 ) AS "t_00" ( … ) WHERE …
Since we assume here that the MEMORY_TABLE will provide only one row, Db2 favors simple nested loop join access plans. In most of the SAP query cases, this is exactly what we like to see. However, in cases where no good index exists or where the WHERE clause is not selective, a different join method may be better.
You can EXPLAIN your problematic FDA statement and try to change the CARDINALITY value to see if this changes the access path in a good way. After you have found a better CARDINALITY VALUE, you can add this value as an ABAP hint to your application code. To change the CARDINALITY, for example,to 1,000,000 rows, add the following ABAP hint:
SELECT … %_HINTS DB6 'MEMORY_TABLE_CARDINALITY 1000000'.
If you don’t want to change the ABAP application, the CARDINALITY clause can also be changed with an SAP optimizer profile (SAP Note 2700223).
Adding a GROUP BY clause to the MEMORY_TABLE queries
As described in my first blog, duplicate input tuples contained in the FOR ALL ENTRIES table can be sorted out early. So it’s ok to add a GROUP BY clause around the MEMORY_TABLE function. This is not done in SAP kernel version 7.49 by default. However, since some customers changed this default with good results, we changed the DBSL default with SAP kernel version 7.53.
If your system is running with SAP kernel version 7.49, I suggest you test the following SAP profile parameter setting (see also SAP Note 2675236):
Last resort: Switching off FDA for selected FOR ALL ENTRIES queries
If, for any reason, you encounter a query that cannot be tuned in its FDA form, you can still switch off FDA locally for this query using a DBI hint:
SELECT … %_HINTS DB6 '&prefer_join 1& &prefer_join_with_fda 0&'.
I hope you’ll never see a case like this, but if it happens, we’d like your feedback to enhance the Db2 optimizer in upcoming Db2 Fix Packs and versions. In such a case, please open an incident or message me. Most likely, we will need optimizer information (see SAP Note 83819) for the slow query.
Leave a comment and let me know your thoughts on FDA and if this blogpost helps you.
Thanks for useful informations about FDA.
Is there any Cons. when developers use FDA to process very few data(1 or under 10) in an internal table ?
When they don't use FDA to process data in an internal table, DB2 will create GENROW temp table instead of MEMORY_TABLE.
Is there any overhead to create and handle MEMORY_TABLE instead of GENROW in Db2 point of view ?
you should not be worried that DB2 needs to allocate 2GB each time SAPTOOLS.MEMORY_TABLE is allocated.
select typename, length, AS_LOCATOR from syscat.funcparms where funcschema = 'SAPTOOLS' and funcname ='MEMORY_TABLE'
TYPENAME LENGTH AS_LOCATOR
BLOB 2147483647 Y
The function does have a locator input parameter. So memory is only allocated as needed. If FDA has only one 1 row of input data, DBSL sends only a tiny BLOB with a few bytes length to the database server.
In 10.5 FP5 this locator attribute has been missing. Therefore DBSl activates FDA only with FP6 and higher.
there is some process invocation overhead for MEMORY_TABLE. However from our test results this can be neglected.
If you do not use FDA for such small internal tables you may have to prepare a new SQL statement with the matching number of tuples in the VALUES clause. This prepare overhead is much larger.
For FDA queries with a very large number of input tuples DB2 may have to write some temp data. However for those queries FDA saves so many network roundtrips that this can be neglected too.