[Oracle] DB Optimizer Part I – Understanding FOR ALL ENTRIES (and its processing by SAP DBSL / Oracle)
I got some nice feedback on my first blog about the oracle optimizer behavior in a special case. So i decided to start a tiny series about common issues with the oracle optimizer in a SAP environment. I don’t have a specific topic order. I just start with the topics that come to my mind and what i have seen over years very regularly.
The first part of this series will cover a pretty common topic called “IN-LIST”. SAP tried to hide the database layer (open SQL) completely from its ABAP layer, which makes sense (of course) regarding the reusability of code for different database platforms. But if you hide a layer behind an interface (SAP DBSL), it is pretty easy to run in some issues without knowing it.
I don’t focus on specific database code like adding specific database hints to ABAP code or using native SQL. The guys who are using these technics are mostly aware of what is happening behind.
Basically an IN-LIST is an expression in the WHERE clause of a SQL. You can specify multiple values and query a table (and even more, but this is not used in a SAP environment).
Let’s assume the following: You want to query an employee table and search for several personnel numbers. You can write the query in two ways (for the simplest way):
- SELECT <col> from <tab> where <col> = <p-number> or <col> = <p-number> or <col> = <p-number> ……
- SELECT <col> from <tab> where <col> in (<p-number>, <p-number>, <p-number>)
As you can see the second example is way shorter and become even more common, if you have more column values.
Now think about a SAP internal table which is used to query a database table. The SAP DBSL uses the IN-LISTs for FOR ALL ENTRIES lists in ABAP.
As an example regarding the 2 SELECTs above it looks like this in ABAP:
SELECT <col> from <tab> into <itab2> FOR ALL ENTRIES IN <itab> where <col> = <itab>-<col>
rsdb/max_in_blocking_factor (= number of values in IN-LIST by each execution)
It is used if the rsdb/prefer_in_itab_opt parameter is set to 1 (this is the default as of Kernel 6.x) and any time a conversion into IN lists is technically possible.
The default value of rsdb/max_in_blocking_factor is 5 (as far as i can remember). This means, that every database SQL contains 5 values in the IN-LIST, if you execute a SELECT in ABAP code with FOR AL ENTRIES. The database SQL is executed several times (with 5 IN-LIST values) in a row until all internal table entries are processed, if your internal ABAP table has more than 5 values.
Now you maybe think “Cool, so we can increase the parameter rsdb/max_in_blocking_factor and run less SQL statements on the database with more IN-LIST values”. Yes you can do that of course (i have used this approach a few times too), but you need to know the limitations and impact on the Oracle cost based optimizer in such scenarios. However you can also run into issues even with the default value of 5, but i get into that details later on.
The optimizer on IN-LISTs
This part of the blog tries to demonstrate the limits of the CBO regarding IN-LISTs (especially in a SAP environment with disabled CBO enhancements). The demos are run on an Oracle 22.214.171.124 database. The optimizer will transfer the IN-LIST into (multiple) “OR” concatenation internally, but this should be a side note only.
SQL> create table ZTEST as select trunc(dbms_random.value(1,6)) col from dba_objects where rownum <=100; SQL> exec dbms_stats.gather_table_stats(NULL,'ZTEST');
Now we have a table ZTEST and a column col with 5 nearly equally distributed values over 100 rows. To simplify the example i will disregard “special configurations” like histograms, etc.
- How many rows do we expect by running a query on table ZTEST with “col IN (1)”?
- … i would say 20 (= 100 * (1/5) * 1) .
- How many rows do we expect by running a query on table ZTEST with “col IN (1,2)”?
- … i would say 40 (= 100 * (1/5) * 2).
- How many rows do we expect by running a query on table ZTEST with “col IN (1,2,3)”?
- … i would say 60 (= 100 * (1/5) * 3).
- How many rows do we expect by running a query on table ZTEST with “col IN (1,1,1)”?
- … i would say 20 (= 100 * (1/5) * 1).
- How many rows do we expect by running a query on table ZTEST with “col IN (1,1,2)”?
- … i would say 40 (= 100 * (1/5) * 2).
Let’s verify our calculations with the calculations of the CBO.
SQL> select col from ZTEST where col in(1);
SQL> select col from ZTEST where col in(1,2);
SQL> select col from ZTEST where col in(1,2,3);
SQL> select col from ZTEST where col in(1,1,1);
SQL> select col from ZTEST where col in(1,1,2);
… so the oracle database optimizer is really clever, isn’t it? The CBO calculates the same results (column E-Rows) as we expected it. It is even clever enough to remove the duplicates in the IN-LIST and estimate the correct amount of rows.
However SQLs are not executed this way (with literals) by SAP DBSL (default). If you execute an open SQL through SAP DBSL, it translates this SQL with bind variables. So let’s do the same exercise with bind variables and disabled “by SAP default” CBO enhancements (like bind variable peeking).
SQL> var val1 number; SQL> var val2 number; SQL> var val3 number; SQL> var val4 number; SQL> var val5 number; SQL> exec :val1 := 1; SQL> exec :val2 := 2; SQL> exec :val3 := 3; SQL> exec :val4 := 1; SQL> exec :val5 := 1; SQL> alter session set "_optim_peek_user_binds"=FALSE; SQL> select col from ZTEST where col in(:val1);
SQL> select col from ZTEST where col in(:val1,:val2);
SQL> select col from ZTEST where col in(:val1,:val2,:val3);
SQL> select col from ZTEST where col in(:val1,:val4,:val5);
SQL> select col from ZTEST where col in(:val1,:val4,:val2);
… so the oracle database optimizer isn’t that clever anymore. The SAP default settings for oracle databases limit some CBO features (for other comprehensible reasons). In this case the optimizer can not look “behind” the bind variables and filter duplicates. This leads to a “over estimation” and can result in bad execution plans (wrong join order or join method are common issues then).
After demonstrating the CBO behavior there are some basic recommendations by using “FOR ALL ENTRIES”:
- Remove internal table duplicates (as far as possible)
- Use literals (by ABAP hint “substitute literals”), if the bad calculation is based on the values itself (and if the values do not change often)
There are several cases like out-of-range values, unequal distributed values (and so on), which can lead to a wrong calculation (by design), but this blog should focus on the basics in a SAP environment with some common recommendations.
If you have any further questions – please feel free to ask or in case of performance issues get in contact directly.