Application Development Blog Posts
Learn and share on deeper, cross technology development topics such as integration and connectivity, automation, cloud extensibility, developing at scale, and security.
cancel
Showing results for 
Search instead for 
Did you mean: 
Former Member


One of the most important considerations when writing a select statement against a large table is the effective use of an index. However this is sometimes more easily said than done. Have you ever found that your WHERE clause is missing just one field of an index and that field is not at the end of the index?

There are some situations where you can make more effective use of the entire index even if you are missing a field. Here is a simple trick to help you do just that. If the field you are missing cannot contain too many entries, then if you create a range table with all possible entries and add that range table to your WHERE clause, you can dramatically speed up a SELECT statement. Even when you take into account the extra time needed to retrieve the key fields, the results are worth it. This may seem a bit counter-intuitive, but the example code shows what I'm doing (but be careful – if you run this code in a QA environment, it may take a while):


I ran the above code in QA instances with a DB2 environment in both 4.6C and 4.7. There are more indexes on BKPF in 4.7, but I tried to use one that is in both versions. I also ran a similar program in Oracle with comparable results. But I really don’t know if it will work with other databases – please let me know!

I ran this many times in both active and quiet systems. Here are some typical results:


Time for first (fully qualified) select : 148 microseconds

Time for second (unindexed) select : 1,873,906 microseconds
Time for third select (indexed by selecting from the check table) : 455 microseconds

Time for fourth (partially indexed) select : 816,253 microseconds
Time for fifth select (indexed by hardcoding the domain values) : 43,259 microseconds
Time for sixth select (indexed by selecting the domain values) : 43,332 microseconds



Some things to note:

In the above times, the first select shows what happens in the ideal world. We are comparing select 2 against select 3 and select 4 against selects 5 and 6. But selects 2 and 3 should return the same results as should selects 4, 5 and 6.

But the point is that even though start out knowing nothing about (and presumably not caring about) the company code in selects 2 and 3 and the document status in selects 4, 5 and 6, if you put all possible values of these fields into the select statement, the results are dramatic.

If you try to combine both tricks, you will probably find that they don’t work very well together. Once seems to be enough.

12 Comments