Skip to Content
Author's profile photo David Halitsky

“Yes, Virginia, the check is in the mail” (or, why you can’t trust SQL)

In my previous blog post on “FOR ALL ENTRIES IN itab” vs SELECTs with LOOPs on itabs, I documented a remarkable reversal in efficiency between two tests – one against the SAP-delivered table GLPCA and one against the SAP-delivered table COBK.  Even after I took Anton’s suggestion to add BYPASSING BUFFER to the selects, and even after I changed the COLLECT to a hashed read/insert operation, the reversed results persisted:  a) against COBK, the “FOR ALL ENTRIES IN itab” was better by an order of magnitude (tenths of seconds vs seconds)  b) against GLPCA, the SELECT within LOOP on itab was better by a simialr order of magnitude (tenths of seconds vs seconds).  So I turned on SQL trace via ST05 and lo and behold – an explanation immediately appeared.  In the GLPCA case:  a) the SELECT within LOOP generated fetches on 3 matching columns of index 1;  b) “the FOR ALL ENTRIES IN itab” generated fetches on only 2 matching columns of index 1.  But in the COBK case, both methods generated fetches on 3 matching columns of index 0, i.e. the primary key.  So – what have we learned from this example?  Well I think it’s fair to say that we’ve merely re-validated the two oldest proverbs in the IT bible:  a) there’s no such thing as a free lunch; b) pay me now or pay me later.  In other words, “neat” SQL syntactic constructs are not panaceae that can always be trusted and developers must still do due diligence on every piece of code they write for back-end retrievals.  Some, particularly the younger set here, will ask: “But Grandpa – the CPU’s are so fast now and we can throw so many of them at any given problem, why should we care?”  Well, there’s two answers to this question.  First – the original answer from a very wise IT guru whose name has been forgotten: “A CPU second wasted can never be retrieved”.  Second – as time goes by and Enterprise SOA becomes the GAWODB (generally accepted way of doing business), I predict that the BW paradigm for OLAP will be supplanted by a different paradigm in which real-time OLAP is made possible by heavy use of alternative indices accompanied by really fast tricks that permit updating of these indices in real time.  And as this paradigm gains ascendancy, focus will shift off the bells and whistles of data processing, where it has been since Bill and Larry redefined the field to their own commercial advantage, and back to the basics.  (This second point, BTW, is the reason why this post is a separate blog entry, and not a comment on my previous entry.  I wanted to make sure that the prediction was seen, and not buried in the details of a technical discussion.)  Well, I gotta go code a function module now, so I won’t have time to tell y’all the story of Bill Mann and “invisible fields” in Model 204 (the US precursor of ADABAS, which SAP had the good sense to purchase.)  It’s a great story and it pertains to the prediction made above, so I hope no one will mind if I recount it in my next post …        

Assigned Tags

      You must be Logged on to comment or reply to a post.
      Author's profile photo Former Member
      Former Member
      I always like to see the - "what's going on under the covers explanations".  Now in the case of only having 2 of the 3 columns, I'm assuming you had the columns 1 and 2 and we're talking about the difference between that and having cols 1, 2, and 3, and column 3 was reasonable distinctive.  (Guess I could go back and read the related blog ...)

      Now if you are talking about having cols 2 and 3 of a 3 column index, then I suppose we'd have to know what DB and version and table statistics, because Oracle does have a feature in 9i called Index Skip Scan that might allow you to utilize cols 2+3 to probe the 3 column index.

      I think it was Benjamin Franklin that said "A CPU second wasted can never be retrieved".  Or was it Ozzie Osbourne?  I always get those two confused.

      Author's profile photo David Halitsky
      David Halitsky
      John -

      Your instincts were correct - this was a DB2 example and the performance difference came down to the fact that at the bottom of each loop on the itab, ABAP was able to pass off a fully-resolved triple of singletons to DB2 (index_field_1_value, index_field_2_value, index_field_3_value), whereas in the FOR ALL ENTRIES version, the best ABAP could do is pass off a fully-resolved pair of singletons (index_field_1_value, index_field_2_value) plus an "in" list containing the values of index_field_3.

      It is well-known that any OR'd where-clause list CAN drive any DB optimizer nuts if it doesn't know the cardinalities involved, or they're too complex to compute on.  This example is kinda a special case of that truism.

      Didn't know about that skip-scan feature of O - thanks for the tip.