The War on BINARY SEARCH

Jelena · 02-12-2020

Folks,

My first ABAP program was written in SAP R/3 4.7 back in 2005. It wasn’t long before I discovered the BINARY SEARCH addition to the READ command and thought it was the bee’s knees. Suddenly my programs ran so much faster and I was getting pats on the back from senior management.

Fast forward to 2020 and I hate you, BINARY SEARCH, I hate you very, very much. Why such a drastic change?

Side Effects Include Death

If you’ve worked with SAP for some time, you must know about backwards compatibility. It means that when SAP comes up with new ABAP syntax or features, the old commands do not change and continue to run as they did before. This is great for customers because they don’t end up with syntax errors in their custom code after an ABAP upgrade. But backwards compatibility has a very dangerous side effect: it does not motivate developers to learn anything new. Indeed, in the on-premise SAP world, one can still party like it’s 1999. Or more like 1972.

The case of BINARY SEARCH is especially egregious. Table types other than STANDARD have existed for more than a decade. The HASHED table type works perfectly with a unique key. But even if that is not an option, the SORTED table type uses the binary search algorithm for key access and can have a non-unique key. It’s practically the same thing but better!
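To make the comparison concrete, here is a minimal sketch of the two alternatives. The row type ty_booking, its fields, and the sample values are made up for illustration, and the inline declarations assume ABAP 7.40 or later.

```abap
REPORT z_table_types_sketch.

" Hypothetical row type, for illustration only.
TYPES: BEGIN OF ty_booking,
         carrid TYPE c LENGTH 3,
         connid TYPE n LENGTH 4,
         seats  TYPE i,
       END OF ty_booking.

" SORTED: the runtime keeps the rows in key order; the key may be non-unique.
TYPES ty_sorted TYPE SORTED TABLE OF ty_booking
                WITH NON-UNIQUE KEY carrid connid.

" HASHED: requires a unique key.
TYPES ty_hashed TYPE HASHED TABLE OF ty_booking
                WITH UNIQUE KEY carrid connid.

DATA bookings_sorted TYPE ty_sorted.
DATA bookings_hashed TYPE ty_hashed.

START-OF-SELECTION.
  " Rows can arrive in any order; the sorted table keeps itself sorted.
  INSERT VALUE ty_booking( carrid = 'LH' connid = '0400' seats = 2 )
         INTO TABLE bookings_sorted.
  INSERT VALUE ty_booking( carrid = 'AA' connid = '0017' seats = 1 )
         INTO TABLE bookings_sorted.
  INSERT VALUE ty_booking( carrid = 'LH' connid = '0400' seats = 2 )
         INTO TABLE bookings_hashed.
  INSERT VALUE ty_booking( carrid = 'AA' connid = '0017' seats = 1 )
         INTO TABLE bookings_hashed.

  " Same READ statement for both: no BINARY SEARCH addition, no SORT.
  READ TABLE bookings_sorted INTO DATA(from_sorted)
       WITH TABLE KEY carrid = 'LH' connid = '0400'.
  IF sy-subrc = 0.
    WRITE: / 'Sorted table:', from_sorted-seats.
  ENDIF.

  READ TABLE bookings_hashed INTO DATA(from_hashed)
       WITH TABLE KEY carrid = 'LH' connid = '0400'.
  IF sy-subrc = 0.
    WRITE: / 'Hashed table:', from_hashed-seats.
  ENDIF.
```

The point of the sketch is that the search strategy is part of the table definition, so the code that reads the table does not have to know or care about it.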

But to this day, BINARY SEARCH continues to piggyback on its old fame. I still see it mentioned in some internal guidelines (or rather “misguidelines”?) and random general performance improvement suggestions, as if it’s the best thing since sliced bread. It even still rears its head in the SCN blogs.

It’s Not About Performance

The subject of performance with different internal table types has already been covered in the SCN blogs, books, and ABAP documentation. In my own simple test (fill different tables with lots of data, then read it in a LOOP), I got a very predictable result. The HASHED table was the fastest by a mile, and the SORTED table, on average, performed just as fast as BINARY SEARCH. But while working on that simple program, I realized that it’s not even about the performance.

In this program, I was planning to create a routine for each table type and call them thusly:

GET RUN TIME FIELD t1.
PERFORM sorted_table.
GET RUN TIME FIELD t2.

Inside these routines, I’d call other routines to select data from database and then read it:

FORM sorted_table.
  DATA: bookings_sorted TYPE t_sorted_table.

  PERFORM get_bookings CHANGING bookings_sorted.
  PERFORM read_bookings USING bookings_sorted.
ENDFORM.

(I deliberately went with a procedural program here to make the point clear to every developer. But the same issue would’ve been obvious if I had used class methods instead.)

For standard, hashed, and sorted tables this worked fine. But when the time came to type in the code for BINARY SEARCH, my brilliant copy-paste design failed for obvious reasons. BINARY SEARCH requires a special READ command syntax, hence the read_bookings routine with a plain READ could not be used for it. Either I had to put additional code in the routine and add a flag to tell it when to use what, or I had to write special code outside of the routine just for BINARY SEARCH.
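The difference in syntax looks roughly like this. This is a sketch, not the original test program: ty_booking, the table names, and the sample values are hypothetical, and VALUE assumes ABAP 7.40 or later.

```abap
REPORT z_read_syntax_sketch.

" Hypothetical row type, for illustration only.
TYPES: BEGIN OF ty_booking,
         carrid TYPE c LENGTH 3,
         connid TYPE n LENGTH 4,
         seats  TYPE i,
       END OF ty_booking.

DATA bookings_standard TYPE STANDARD TABLE OF ty_booking
                       WITH NON-UNIQUE KEY carrid connid.
DATA bookings_sorted   TYPE SORTED TABLE OF ty_booking
                       WITH NON-UNIQUE KEY carrid connid.
DATA booking           TYPE ty_booking.

START-OF-SELECTION.
  booking = VALUE #( carrid = 'LH' connid = '0400' seats = 2 ).
  INSERT booking INTO TABLE bookings_sorted.
  APPEND booking TO bookings_standard.
  booking = VALUE #( carrid = 'AA' connid = '0017' seats = 1 ).
  INSERT booking INTO TABLE bookings_sorted.
  APPEND booking TO bookings_standard.

  " Plain key read: the statement is the same for any table type
  " that defines this key (on a STANDARD table it is a linear search).
  READ TABLE bookings_sorted INTO booking
       WITH TABLE KEY carrid = 'LH' connid = '0400'.

  " BINARY SEARCH variant: only for the standard table, needs an
  " explicit SORT beforehand, a free key (WITH KEY), and the addition.
  SORT bookings_standard BY carrid connid.
  READ TABLE bookings_standard INTO booking
       WITH KEY carrid = 'LH' connid = '0400'
       BINARY SEARCH.
```

The second READ is a different statement, which is why one shared routine could not serve both cases without a flag or a special code path.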

Less flexible and reusable code is always kind of a downer.

What You See Is Not What You Get

In addition to “dirtying of the code”, BINARY SEARCH has another problem: sometimes it works, sometimes it doesn’t, and you might not even know it. (Horst Keller correctly referred to it as “error-prone”.) The internal table must first be sorted in ascending order by the search key for BINARY SEARCH to produce the expected results. What happens if it’s not sorted properly? As I learned back in 2006, in that case the result is like a lottery. Depending on the specific data, it might work correctly sometimes, but then it could miss a record that clearly exists in the table.

One might say “what’s the big deal, just remember to sort” but that’s exactly the problem. You’re relying on generations of future developers who will maintain the program to remember that it requires “special treatment”. What could possibly go wrong?
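A tiny sketch of the lottery, assuming ABAP 7.40 or later for VALUE and inline declarations. The row type and the values are made up. No result is stated for the first READ because the outcome on unsorted data is not guaranteed.

```abap
REPORT z_binary_search_lottery.

" Hypothetical one-column row type, for illustration only.
TYPES: BEGIN OF ty_row,
         id TYPE i,
       END OF ty_row.

DATA rows TYPE STANDARD TABLE OF ty_row WITH DEFAULT KEY.

START-OF-SELECTION.
  " Deliberately NOT sorted by id.
  rows = VALUE #( ( id = 3 ) ( id = 1 ) ( id = 2 ) ).

  " id = 3 is in the table, but the search assumes ascending order.
  " A textbook binary search would probe the middle row (id = 1),
  " continue to the right, and never look at the first row.
  READ TABLE rows INTO DATA(row) WITH KEY id = 3 BINARY SEARCH.
  WRITE: / 'Unsorted table, sy-subrc =', sy-subrc.

  " After sorting by the search key, the read is reliable again.
  SORT rows BY id.
  READ TABLE rows INTO row WITH KEY id = 3 BINARY SEARCH.
  WRITE: / 'Sorted table,   sy-subrc =', sy-subrc.
```

Nothing in the syntax check or at runtime complains about the first READ. That silence is what makes the bug hard to spot.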

Let Database Do Its Job

BINARY SEARCH signals not only the program’s age or lack of table type awareness by the previous developers. Usually, it comes hand in hand with bad program design and underused database capabilities.

Out of curiosity, I ran a code scan for the BINARY SEARCH clause in a random set of legacy programs. In the vast majority of cases, BINARY SEARCH could have been replaced not just by a HASHED or SORTED table but by a SELECT statement. A typical example here would be reading the MARA and MAKT tables separately, then merging the two internal tables. That’s just a straight-up SELECT… JOIN. And in the scenarios where quantity or amount, for example, is accumulated by material or customer, SELECT… SUM would do just fine.

“Code pushdown” might be a novel HANA-related concept but allowing the database to do its job has always been a good practice. It’s just that we didn’t always practice it. And leaning on the crutch of BINARY SEARCH did not help. It’s time to let go.
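Here is roughly what those two replacements could look like. This is a sketch using the newer Open SQL syntax (comma-separated lists, @ host variables, INTO at the end), which assumes ABAP 7.40 SP08 or later. VBAP as the source for the quantity example is my own pick for illustration, and a real program would restrict both selections with a proper WHERE clause.

```abap
REPORT z_pushdown_sketch.

START-OF-SELECTION.
  " Instead of selecting MARA and MAKT separately and merging the two
  " internal tables with READ ... BINARY SEARCH, let the database join.
  " Note: an inner join drops materials that have no description in
  " the logon language.
  SELECT m~matnr, m~mtart, t~maktx
    FROM mara AS m
    INNER JOIN makt AS t ON t~matnr = m~matnr
    WHERE t~spras = @sy-langu
    INTO TABLE @DATA(materials).

  " Instead of looping and accumulating quantities per material in
  " ABAP, let the database aggregate. The unit is part of the grouping
  " so that different units are not added together.
  SELECT matnr, vrkme, SUM( kwmeng ) AS total_quantity
    FROM vbap
    GROUP BY matnr, vrkme
    INTO TABLE @DATA(quantities).

  WRITE: / 'Materials read:', lines( materials ).
  WRITE: / 'Quantity rows: ', lines( quantities ).
```

In both cases there is no second internal table to sort and search, so the question of which READ variant to use does not come up at all.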

Conclusion

Does all this mean that BINARY SEARCH has no use whatsoever and must be purged from ABAP syntax? Probably not. But the valid use cases for it are so few and far between that, as a tool, BINARY SEARCH deserves to be put out of sight on a very high shelf of our ABAP garage. Or, preferably, burned with fire. 🙂