Code Inspector’s Performance Checks (II)

former_member184455 · ‎07-02-2007

This is the second article in a series dealing with the performance checks that are delivered with the Code Inspector, SAP’s tool for the analysis of static ABAP code and other repository objects. In the previous article we saw that every frequently executed database access should be supported by an appropriate database index. If the WHERE clause of a database access fails to hit an index, the Code Inspector will protest.

Articles of this series:

Code Inspector’s Performance Checks (I)
SELECT statements that bypass the table buffer (discussed in this article)
Code Inspector’s Performance Checks (III)
Code Inspector’s Performance Checks (IIII)

Data that is frequently accessed and rarely modified should be read from a table buffered on the application server, and not from the database. It is about 10-30 times faster to read one entry from a buffered table than to read it from the database cache – say it takes only 20µs instead of 400µs. Again, things can go wrong when buffered tables are accessed inadequately; that is, the buffer may be bypassed and the performance advantage lost. But do not worry – there is another Code Inspector check at hand to help you. This article explains when it is best to buffer tables. It also describes the appropriate Code Inspector check to detect statements that bypass the buffer and how to correct the underlying problems.

When and how to buffer tables

Small or medium sized database table with data that is frequently read, but only rarely modified, should be buffered within SAP’s table buffer. In fact, there are two types of table buffer: one for single record buffered tables, and one for generically and fully buffered tables, but for simplicity let's refer to 'the (SAP) table buffer'. The best candidates for buffering are tables containing business configuration data; for example, a table containing the address data of a company's subsidaries. Since the address of a subsidary will not change frequently, and since there will be at maximum some hundred to thousand data sets, the table should be buffered. Whether or not a table is buffered is defined in its technical settings (transaction SE11). You can choose between the following buffering types:

single record buffering to buffer single records, identified by the full primary key

generic buffering to buffer generic key areas with 1, 2, ..., n key fields, where n is shorter than the primary key length

full buffering to buffer the whole table; for client dependend tables, full buffering corresponds to generic buffering with respect to the client field

If data is modified on application server A it can take up to two minutes for the buffer entries to be invalidated on another application server B of the same system. During this interval, a user on application server B may read outdated information. For some critical applications this behavior can lead to data inconsistencies. Therefore, for these applications buffering must be avoided. When a buffer range is invalidated by a modification on server A, the data is not reloaded from the database into the buffer of server B with the first read access. Instead, it takes up to ten read accesses until the database interface is 'confident' enough to re-fill the buffer with the data from the database. This 'pending' behavior of the buffer, plus the communication overhead created by the invalidation mechanism, is the reason why frequently modified tables should not be buffered.

SELECT statements that bypass the table buffer

For the table buffer to be quick in responding to a request, it works in a simple and straightforward fashion. The SAP table buffer is not aware of sophisticated things such as secondary indexes (it only knows the primary one), SQL Joins, aggregate functions, or complicated selection ranges. This is why there is a list of OpenSQL options that lead to an implicit bypassing of the buffer, because the buffer cannot handle them. Therefore, avoid the following options if you want to access data from buffered tables:
List of statements that bypass the SAP table buffer (as of July 2007):

Implicit bypassing caused by an option of the SELECT statement

SELECT from a single record buffered table without explicit use of ‘SELECT SINGLE’ (starting with SAP NetWeaver 7.10, the key word 'SINGLE' is no longer required to make use of the single record buffer)
SELECT from a buffered table in an Open SQL Join or use of a buffered table as a joined table in a database view defined in the data dictionary
SELECT with a subquery
SELECT with an aggregate function: COUNT(), MIN(), MAX(), SUM(), AVG()
SELECT DISTINCT ...
SELECT ... GROUP BY ... [HAVING cond ...]
SELECT ... ORDER BY ... where the sort order differs from the primary key
SELECT with option ‘CLIENT SPECIFIED‘, but no client field in the WHERE clause
SELECT ... WHERE a IS [NOT] NULL

Implicit bypassing caused by an incomplete WHERE clause or inappropriate buffer setting

Generic key not fully specified in the WHERE clause of a SELECT from a generically buffered table
Primary key not fully specified in the WHERE clause of a SELECT from a single record buffered table

Intended Bypassing

Use the option ‘BYPASSING BUFFER’ to avoid accessing outdated table buffer information, and to read data from the database instead. This will only be relevant for some special applications.
The statement ‘SELECT FOR UPDATE’ sets a database lock and therefore always has to bypass the table buffer.

Native SQL statements also bypass the SAP table buffer. Moreover, they do not trigger the buffer invalidation mechanism after a database change, and therefore should not be used in application programming. Using any statement of the list above with a buffered table causes the SAP table buffer to be bypassed, thereby forgoing the performance advantage of the buffer access over a database access.

Details of the Code Inspector check

For every SELECT statement, the Code Inspector reads the technical properties of the table accessed (including buffer settings) from the data dictionary. For this check, only SELECT statements accessing buffered tables will be further analyzed by the Code Inspector:

The Code Inspector analyzes whether one of the options mentioned in the above list is used in the SELECT statement
For single record buffered and generically buffered tables, the WHERE clause is also analyzed. Only if the buffered key range is fully specified in the WHERE clause and solely contains 'AND' relations (conjunctions), and if the fields in the WHERE clause are compared against the current parameters with a simple ‘EQ’ or ‘=’ condition, will the buffer be used.
Range conditions (‘WHERE field IN range ’) can only make use of the buffer if the range contains one single ‘EQ’ or '=' condition. Since the Code Inspector does not know the content of a range at runtime, such range conditions are always reported.
If the SELECT comes with a 'CLIENT SPECIFIED' option, Code Inspector checks whether the WHERE clause contains the client field. Since client dependend tables are always buffered generically with respect to the client field (even if they are formally fully buffered), the buffer will be bypassed without an implicitly or explicitly specified client.
If there is an ORDER BY field1 field2 ... option, Code Inspector analyzes whether the fields determining the sort order correspond to the (leading part of the) primary key sequence. Of course, the 'ORDER BY PRIMARY KEY' option also uses the buffer.

For every statement identified to bypass the table buffer implicitly, the Code Inspector raises a warning message. The use of Native SQL is reported by the 'Critical Statements' check of the Code Inspector.

How to proceed with a Code Inspector message

Now, what should you do when the Code Inspector informs you that a SELECT statement on a buffered table bypasses the SAP table buffer?
First of all: Do not - without further analysis - change the buffer settings of the table, for example, buffering it more ‘generously’, with a more generic key range!
Consider that changes of buffer settings influence other statements and that the size of the table buffer is limited. Tables with wrong buffer settings may displace many other tables from the buffer, because they consume too much memory. Instead, investigate the following:

Sometimes developers misuse the implicit bypassing of the buffer caused by a statement of the above list as a feature, because they want to bypass the buffer for some reason. Never do this! The list of statements that bypass the table buffer can change. If bypassing is wanted, state this explicitly by using the “BYPASSING BUFFER” option.
If bypassing is not wanted, and the statement bypassing the buffer is frequently used in a production system, try to rewrite the SELECT statement so that the buffer can be used, as described below:

When implicit bypassing is caused by an option of the SELECT statement

Insert the missing key word 'SINGLE' if you access a single record buffered table. If the primary key is fully specified in the WHERE clause, the access will be able to use the buffer.
JOINs are very elegant and powerful SQL statements, and for tables that are not buffered it is clearly better to read data from the database with SELECT … JOIN … INTO TABLE than using nested SELECT ... ENDSELECT statements. But this is no longer true if one or more of the joined tables are buffered! Since the table buffer cannot handle OpenSQL Joins or subqueries, replace these constructs by explicit SELECT statements on each table (if they can make use of the table buffer). For example use subsequent SELECT statements, the FOR ALL ENTRIES option, or SELECT ... ENDSELECT loops.
For SELECT statements with a subquery the recommendations for JOINs (see above) also apply.
Replace the aggregate functions COUNT(), MIN(), MAX(), SUM(), AVG() with ABAP code. For a fully or generically buffered table it is better to read the data into an internal table and to perform the counting, summing or averaging in ABAP. A special case is the COUNT( * ) option which is often misused to check whether there is at least one entry (for a special key range) in a table. To perform an existence check in this way is a bad idea because it bypasses the table buffer and hurts the database! An existence check with SELECT COUNT(*) FROM dbtab … UP TO 1 ROWS can be harmful because some database platforms really count all entries (the UP TO 1 ROWS option just tells them to return only one value, which is the count result), while you might think that it stops after one entry has been found. You better do a SELECT first_keyfield FROM dbtab … UP TO 1 ROWS instead. This also increases the probability that the statement will use the table buffer.
Replace the DISTINCT option with functions of the ABAP LOOP processing.
Rebuild the SQL option GROUP BY ... [HAVING cond] with functions of the ABAP LOOP processing.
Replace ORDER BY (if the sort order differs from the primary key) by a SORT operation in ABAP.
The option CLIENT SPECIFIED should not be used in application code. If it is needed, though, it can be beneficial to first read all existing clients into an internal table. It may then be possible to use the buffer by specifying the clients in the WHERE clause using the FOR ALL ENTRIES option.
Since, in contrast to databases, neither the ABAP language nor the table buffer know NULL values, the condition WHERE a IS [NOT] NULL cannot be processed in ABAP - so bypassing the buffer is unevitable.

When implicit bypassing is caused by an incomplete WHERE clause or inappropriate buffer setting

Try to complete the WHERE clause with the missing key field. Sometimes the missing information can be read easily (and quickly) from another buffered table. If it is not possible to complete the WHERE clause, have a look at all SELECT statements accessing the buffered table. Check if the buffer is used in most of the (performance critical) cases, or if there are accesses that could benefit from a more ‘generous’ buffering. On the other hand, your analysis may also reveal that buffering the table is not sensible at all. Only change the buffer settings after such an analysis.
The recommendations from list item 1 also apply.

Transaction ST10 (Table Call Statistics) shows the current status of a table in the table buffer. It is always a good idea to check the buffer status in a test or production system to see whether the buffer settings make sense. There should not be too many (ideally: no) buffer invalidations for a table, and it should not allocate too much memory in the buffer. You may also be able to identify further candidates for buffering with the help of this transaction. Now you have learned how the Code Inspector check "SELECT statements that bypass the table buffer" helps to improve program performance. Please remember that some Open SQL constructs like Joins can be very elegant - but that they should not be applied to buffered tables in performance critical code. Otherwise, you will compromise the efficiency of your programs. For general aspects of program efficiency see also my blog Cooking and software efficiency.