Why Fast Data Access (FDA) for ABAP “FOR ALL ENTRI...

Frank-Martin · ‎11-20-2018

The FOR ALL ENTRIES clause is OPEN SQL statements is a frequently used feature in ABAP. OPEN SQL statements with FOR ALL ENTRIES clause represent one of the OPEN SQL statement types that contain both database tables and ABAP internal tables. To optimize performance, the data of the ABAP internal table needs to be transferred to the database in an effective way. In the past, JOIN transformations were used to execute this kind of SQL statement. With fast data access (FDA), a new way of executing SQL statements with FOR ALL ENTRIES clauses is available.

In this blog, I am going to explain how fast data access (FDA) in the SAP ABAP kernel works and why FDA is mostly preferable to JOIN transformations for OPEN SQL SELECT statements with FOR ALL ENTRIES clauses.

By default, the database shared library (DBSL) for IBM Db2 for Linux, UNIX, and Windows uses FDA for these statements – provided you have an SAP kernel 7.49 and higher and a database version 10.5 FP7 and higher on Unicode systems.

What Are ABAP “FOR ALL ENTRIES” Statements?

OPEN SQL statements with FOR ALL ENTRIES clause represent one of the OPEN SQL statement types that contain both database tables and ABAP internal tables. (For more information, see the documentation on SAP Help Portal.)

The basic form of these statements is:

SELECT <column list>
FROM < database tables>
FOR ALL ENTRIES IN @itab
WHERE < clause containing database table columns and itab columns>

Basically, this statement type implements a JOIN between the database tables and the ABAP-internal table itab. Only distinct rows in the result set are returned to the ABAP program. Since the ABAP-internal table is located on the SAP application server, the referenced column values from this table need to be transferred to the database server.

A Test Program

In the following, I will use the following simple ABAP test program to demonstrate how FOR ALL ENTRIES statements with FDA work. You can try this program on one of your test systems with different FOR ALL ENTRIES settings.

REPORT  Z_FAE_TEST .

DATA: db6    LIKE standard table of t100.
DATA: db6out LIKE standard table of t100.

DATA: start_time TYPE i,
end_time   TYPE i,
run_time   TYPE float.

SELECT * FROM t100 UP TO 100000 ROWS INTO TABLE db6 .

GET RUN TIME FIELD start_time.

SELECT * INTO TABLE db6out FROM t100
BYPASSING BUFFER
FOR ALL ENTRIES IN db6
WHERE sprsl = db6-sprsl
AND arbgb = db6-arbgb
AND msgnr = db6-msgnr.

GET RUN TIME FIELD end_time.
run_time = end_time - start_time .

WRITE : / run_time .

The test program selects 100,000 rows from database table T100 into an internal table called db6. After this, the program uses a FOR ALL ENTRIES statement to join database table T100 with internal table db6. Since the join uses the primary key columns of table T100, the FOR ALL ENTRIES statement will return 100,000 rows. The program displays the elapsed time used by the FOR ALL ENTRIES statement.

The “traditional” JOIN transformation of “FOR ALL ENTRIES” statements

Now let us turn to the JOIN transformation, which was used for FOR ALL ENTRIES statements before fast data access (FDA) was available. I will outline how it works and what its drawbacks are. The SAP kernel supports different methods to execute FOR ALL ENTRIES statements. These methods can be configured by a set of SAP profile parameters. Non-FDA parameters are, for example, listed in SAP Note 48230. Due to different SQL and database optimizer capabilities, different database platforms have chosen different kernel default settings for FOR ALL ENTRIES parameters.

Before FDA has been introduced, IBM Db2 for Linux, UNIX, and Windows used the so-called JOIN transformation as the default setting for FOR ALL ENTRIES statements. The JOIN transformation is mainly controlled by the following kernel parameter settings:

rsdb/prefer_join=1
rsdb/prefer_join_with_fda=0 ( if exists )
rsdb/max_blocking_factor=30

If your test system already uses FDA as a default transformation for FOR ALL ENTRIES statements, you can force the program to use the old “traditional” JOIN transformation using a local DBI hint:

SELECT * INTO TABLE db6out FROM t100
BYPASSING BUFFER
FOR ALL ENTRIES IN db6
WHERE sprsl = db6-sprsl
AND arbgb = db6-arbgb
AND msgnr = db6-msgnr
%_HINTS DB6 '&prefer_join 1& &prefer_join_with_fda 0&'.

If you run the FOR ALL ENTRIES statement with JOIN transformation and trace performance using transaction ST05, you will find that the ABAP statement is translated into a large number of SQL statements of the following form:

SELECT *
FROM
"T100",
( SELECT * FROM (
VALUES ( CAST ( ? AS VARCHAR(3) ),CAST ( ? AS VARCHAR(60) ), CAST ( ? AS VARCHAR(9) ) ),
( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ),
( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ),
( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ),
( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ),
( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ),
( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ), ( ?, ?, ? ) )
AS FAE_TMP ( "C_0", "C_1", "C_2" ) GROUP BY "C_0", "C_1", "C_2" ) AS "t_00”
WHERE
"T100"."SPRSL" = "t_00"."C_0" AND
"T100"."ARBGB" = "t_00"."C_1" AND
"T100"."MSGNR" = "t_00"."C_2"
WITH UR

The ABAP statement references three columns from internal table db6. Therefore, each SQL statement joins table T100 with a table constructed from tuples containing three values. Each SQL statement contains 30 value tuples corresponding to 30 rows in internal table db6. The blocking of the ABAP statement into SQL statements is determined by SAP profile parameter rsdb/max_blocking_factor=30.

Since the internal table db6 contains 100,000 rows, you will find that 3,333 SQL statements containing 30 value tuples are executed, followed by a single SQL statement containing only 10 tuples. In total, all 100,000 rows from the internal table are sent to the database.

The “old” JOIN transformation of FOR ALL ENTRIES statements has several disadvantages:

Additional network roundtrip time
For large internal FOR ALL ENTRIES tables, a large number of SQL statements needs to be send to the database server. Each of these SQL statements causes additional network roundtrips between application server and database server. So, for simple SQL statements like in our test example, much of the elapsed time for the ABAP statement is in fact network roundtrip time.In our example, you can compare the average execution time of the SQL statements in the ST05 trace and compare them with the average execution time of the statements in the SQL cache of the database. You will find that the average time in the ST05 trace is much higher. The difference is due to network roundtrip times.You can monitor the average network roundtrip times in the DBA Cockpit. Call transaction DBACOCKPIT and choose Performance -> Network Statistics in the navigation pane.

Increasing number of SQL statements
The length of SQL statement varies with the number of included value tuples. Over time, one ABAP FOR ALL ENTRIES statement might prepare different SQL statements up to the value in rsdb/max_blocking_factor. Increasing the value in rsdb/max_blocking_factor might reduce the number of SQL statements sent to the database server but increase the number of different SQL statements. This makes it more difficult to monitor one ABAP statement in the SQL cache and it increases the number of SQL statements in the SQL cache. The global kernel default setting rsdb/max_blocking_factor=30 is a compromise between those two factors.

Let me briefly explain what the access plan looks like in this case. We will later compare this access plan with the access plan of the FDA transformation.

In standard cases, where a selective index exists that matches the WHERE clause of the SQL statement, the optimizer will use a nested loop join (NLJOIN) access plan to execute the SQL statements. The outer ([O]) side of the nested loop join will be a table generated from the value tuples in the SQL statement. This object gets the generic name GENROW in the access plan listed in EXPLAIN. This way the database can make use of all three key columns in the primary key of table T100.

Background Information: What Are Db2 Generic Table Functions?

To explain how FDA is implemented on IBM Db2, let me provide some background about generic table functions in Db2. If you want to know more, have a look at the IBM Knowledge Center.

A normal table function is a database object that may have input values but a static output structure with named columns with a defined type. For a generic table function the output structure is not defined. A generic table function might return a different output structure in each SQL statement. The number, name, and type of columns returned by a generic table function is defined in the SQL statement text.

Generic table functions are also referred to as polymorphic table functions in the SQL standard.

The FDA Transformation of “FOR ALL ENTRIES” Statements

The basic idea of fast data access (FDA) is to send the complete content of an ABAP-internal table to the database in one SQL statement. Currently only the SAP HANA database, IBM Db2 for i, and IBM Db2 for Linux, UNIX, and Windows support FDA for FOR ALL ENTRIES statements. While the SAP HANA database uses an SAP-proprietary SQL feature to implement FDA, IBM Db2 for Linux, UNIX, and Windows makes use of the existing generic table function feature in Db2, in particular, the generic table function SAPTOOLS.MEMORY_TABLE

When you install a new SAP system running in IBM Db2 for LUW using the software provisioning manager, you get SAPTOOLS.MEMORY_TABLE automatically: The software provisioning manager runs the script db6_update_db, which creates objects in the SAPTOOLS schema, including the generic table function SAPTOOLS.MEMORY_TABLE. When you update an SAP system, don't forget to run the db6_update_db script manually – otherwise you won't get SAPTOOLS.MEMORY_TABLE!

IBM Db2 for Linux, UNIX, and Windows supports FDA as of SAP kernel 7.49 and database version 10.5 FP7 on Unicode systems. If your system meets these requirements, the SAP kernel default settings will activate FDA for FOR ALL ENTRIES statements. You can also activate FDA with a local DBI hint if you have overwritten the kernel default settings with other parameter settings. For our test program, this would look like this:

SELECT * INTO TABLE db6out FROM t100
BYPASSING BUFFER
FOR ALL ENTRIES IN db6
WHERE sprsl = db6-sprsl
AND arbgb = db6-arbgb
AND msgnr = db6-msgnr
%_HINTS DB6 '&prefer_join 0& &prefer_join_with_fda 1&'.

When you test the sample program with FDA, you will find that only one SQL statement is executed for the ABAP FOR ALL ENTRIES statement. This SQL statement returns all 100,000 rows that are returned to the SAP program. Although the execution time of the FDA SQL statement is much longer than the individual execution times of the JOIN transformation statement, all in all the test program runs much faster with FDA.

SELECT DISTINCT "T100".*
FROM
"T100",
TABLE( SAPTOOLS.MEMORY_TABLE( CAST( ? AS BLOB( 2G )) ) CARDINALITY 1 )
AS "t_00"
( "C_0" VARCHAR(3) , "C_1" VARCHAR(60) , "C_2" VARCHAR(9) )
WHERE
"T100"."SPRSL" = "t_00"."C_0" AND
"T100"."ARBGB" = "t_00"."C_1" AND
"T100"."MSGNR" = "t_00"."C_2"
WITH UR

The SQL statement contains the generic table function SAPTOOLS.MEMORY_TABLE. This table function has a BLOB input parameter and returns a result set that matches the three columns of internal table db6 that are referenced in the WHERE clause of the ABAP statement.

The DBSL for IBM Db2 encodes all column values from the internal table into BLOB format and sends it to the database server. On the database server, the SAPTOOLS.MEMORY_TABLE function decodes the BLOB input data and returns all value tuples from the internal table db6 in its result set. Different ABAP FOR ALL ENTRIES statements might require a different number or different data types of columns returned by the SAPTOOLS.MEMORY_TABLE. This is handled by the generic table function of Db2.

The FDA transformation of FOR ALL ENTRIES statements has the following advantages:

Only one SQL statement is executed. This reduces the network roundtrip time overhead. The difference between FDA and non-FDA execution times will be even faster if the average network roundtrip time for your application server is high (maybe due to a bad network connection).

Since only one single SQL statement is prepared, the performance of an ABAP FOR ALL ENTRIES statement can be more easily monitored with DB2 means in the SQL cache of the database.

With the JOIN transformation, individual SQL statements might return duplicate rows in the result set (although this is not the case with our test program). With FDA, duplicate rows in the result set are eliminated on the database side with the DISTINCT keyword in the SQL statement. This can further reduce network overhead and ABAP processing time.

The DBSL knows that a FOR ALL ENTRIES statement is executed that eliminates duplicates in the result set. Therefore, it can be "smart" and eliminate some duplicates in the input tuples before even sending them to the database server.

With SAP kernel 7.53, we've changed the SQL statement transformation with FDA slightly to include a GROUP BY clause (see SAP Note 2675236). This change ensures that duplicates in the input tuples are eliminated early. This change can also optionally be enabled with the 7.49 kernel. The SQL statement of our test program has not changed much because of this:

SELECT DISTINCT "T100".*
FROM
"T100",
( SELECT * FROM TABLE( SAPTOOLS.MEMORY_TABLE( CAST( ? AS BLOB( 2G )) ) CARDINALITY 1 )
AS "t_00"
( "C_0" VARCHAR(3) , "C_1" VARCHAR(60) , "C_2" VARCHAR(9) )
GROUP BY ( "C_0", "C_1”, "C_2" ) ) AS "t_00"
WHERE
"T100"."SPRSL" = "t_00"."C_0" AND
"T100"."ARBGB" = "t_00"."C_1" AND
"T100"."MSGNR" = "t_00"."C_2"
WITH UR

Note that this change with SAP kernel 7.53 makes the FDA SQL statement look even more similar to the JOIN transformation SQL statement.

In normal cases, where a selective index exists that matches the WHERE clause of the SQL statement, the access plan generated for an FDA statement also looks very similar to the JOIN transformation statement:

Compared to the JOIN transformation statement, you will now find the table function name MEMORY_TABLE instead of GENROW in the execution plan. Also, in the FDA case the database table should be on the inner ([I]) side of the nested loop join because only on the inner side all predicates from the WHERE clause can be used in the index scan. MEMORY_TABLE on the outer side ([O]) of the nested loop join is only executed once and of course you will see a TBSCAN here because table functions do not have indexes.

At prepare time of the SQL statement, Db2 does not know how many rows MEMORY_TABLE will return. To hint Db2 to use a nested loop join, we currently include a CARDINALITY 1 clause in the SQL statement text.

Performance Considerations and Statement Tuning

Overall, we received positive feedback from customers about the new FDA transformation. In large productive customer systems FDA is used by default.

However, for all FOR ALL ENTRIES (to SQL statement) transformations, borderline cases exist where Db2 chooses a bad access plan for the SQL statement or cannot choose a good access plan because no suitable index exists. This has not changed with FDA. Unfortunately, after activating FDA, some SQL statements might run slowly that have run faster before and vice versa. So, you might need to tune a small number of statements after you have activated FDA. Often the simplest tuning option “creating a matching index” helps. Watch out for another blog post where I go into more detail and concentrate on FDA statement monitoring and statement tuning.

Please also be prepared that you might see a few slowly running FDA statements. This does not necessarily indicate a bad access plan and does not necessarily require statement tuning. Remember that before FDA, you might have had a large number of SQL statements for the same ABAP statement. With the switch to FDA, all these statements have been replaced by one single FDA statement. The longer execution time of this single FDA statement can be caused by a large number of input rows in the FOR ALL ENTRIES table or by a large result set fetched by the FDA query. However, in total, the execution time of the single FDA statement is probably still shorter than the total execution time of all previous SQL statements summed up. Hence, don't overreact - a slowly running FDA statement is often no need for concern, but only natural.

Hope this helps. Your feedback on this is welcome.