SAP on HANA, and Pushdown for All: News about ABAP...

former_member182046 · ‎01-13-2013

You may have heard about the January 10 news conference, in which SAP announced the availability of the Business Suite on HANA. Along with some fellow SAP Mentors, I was given the chance to attend this event in the Frankfurt/Main location, and while “Business Suite on HANA” per se wasn’t new to me, other information from the event was genuinely new and noteworthy.

Fig. 1: SAP Mentors at the "SAP on HANA" press conference: luke.marson, steinermatt, mark.chalfen, thorstenster, peter.langner2, tobias.trapp (photo by martin.gillet2)

A change in the ABAP programming model going beyond what has been previously discussed

To sum it up briefly, in previous years, all ABAP programming against the database used a layer of abstraction that reduced the possibilities to the lowest common denominator: a very small subset of SQL features, namely those supported by all databases on which SAP runs.

Then came the plan to run the Business Suite on HANA. When you run existing operational ABAP applications on HANA, the result is sometimes disappointing, and large parts of the applications don’t run any faster than they do on other databases. The reason is, simply put, that traditional ABAP applications won’t let HANA do what it does best: The things HANA is best at are traditionally either simply not done in ABAP applications (e.g. mass queries instead of SELECT SINGLE), or the ABAP application logic handles them itself instead of leaving them to the database (e.g. calculations, complex joins and transformations of datasets).

Enter code pushdown and HANA-specific code

In order to achieve the amazing performance HANA is capable of, executing queries and calculations hundreds or even thousands of times faster than other databases, one has to change the applications so that they let the database do the things it can perform much faster than the ABAP application server. This means that a portion of the application code must bypass the abstraction layer and be written natively and specifically for HANA. (For a better understanding of code pushdown, please read eric.westenberger's excellent post, SAP TechEd 2012 – ABAP for SAP HANA: the database is not a black box anymore.)
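To make the idea of pushdown concrete, here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for the database. It is neither ABAP nor HANA, and the table orders with its columns is a hypothetical example. The first function mimics the traditional pattern (one single-row read per key, with the sum computed in the application), while the second leaves the calculation to the database as a single set-based aggregate.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical 'orders' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)"
    )
    rows = [(1, "A", 100), (2, "A", 250), (3, "B", 40), (4, "A", 50)]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


def total_in_application(conn, customer):
    """Traditional style: read rows one at a time and sum in the application."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM orders WHERE customer = ?", (customer,)
        )
    ]
    total = 0
    for order_id in ids:
        # One round trip per row, comparable to a SELECT SINGLE inside a loop.
        (amount,) = conn.execute(
            "SELECT amount FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        total += amount
    return total


def total_pushed_down(conn, customer):
    """Pushdown style: the database computes the aggregate in one query."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer = ?",
        (customer,),
    ).fetchone()
    return total


conn = make_demo_db()
print(total_in_application(conn, "A"), total_pushed_down(conn, "A"))
```

Both functions return the same total. The difference lies in where the work happens and how many round trips to the database are needed, which is exactly where a database like HANA can play to its strengths.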

You might ask: “Okay, so native SQL exploiting specific features of database platforms and stored procedures run faster than traditional ABAP applications, which involve several more physical, logical, and abstraction layers – big deal. We knew this before HANA. Why didn’t we use stored procedures and native SQL to optimize ABAP applications for, say, DB2 or Oracle?”

The canonical (politically correct) answer would be: For traditional database systems, the performance gains that could be achieved with such an approach exist but are outweighed by the loss of compatibility. In other words, it’s better to have an application that runs at 80% of its optimal speed but on all databases supported by SAP than to have an application that runs 1/4 faster but only on one database platform. This argument works as long as the delta between database-agnostic and database-specific code is not too big. We wouldn’t lightly give up portability and simplicity for marginal performance gains, but if we’re talking factors of 100 or even 1,000, the business value might be so drastically increased that it’s worth giving up the abstraction against the database platform.

Gradually giving up abstraction

Logically, there are three stages of giving up abstraction against the database platform in the SAP Business Suite:

Introducing optimizations that work for all database platforms or are at least performance-neutral: SELECT SUM( … )
Using native features for HANA but not for other database systems
Using native features for HANA and the corresponding native features for other database systems

To fork or to rewrite?

When optimizing existing Business Suite code to run faster on HANA, you’ve got two options:

Fork for HANA
For every optimization, introduce a fork (IF HANA … THEN … ELSE … ENDIF). The existing code remains unchanged, and new, incompatible code using HANA-specific artifacts is added as the HANA branch of the fork. There’s a lot of value in this approach: It is the least disruptive for existing installations that don’t migrate to HANA, because if the part of the code they run is not rewritten, there is no danger of introducing new bugs, and any existing modifications or enhancements will continue to work as before. The new code that is processed in the HANA case can be radically optimized for HANA, no holds barred. The downside: redundant business logic, increased complexity, more testing required, all resulting in higher maintenance costs and loss of agility.
Rewrite for all
For as many optimizations as possible, try to get away without a fork. Rewrite the code once, so that the new code is processed on systems with HANA and without HANA. This can be done wherever there is a way to rewrite the code so that it performs with at least the same speed on traditional database systems, and significantly faster on HANA. This approach has the advantage that the resulting codebase is simpler and less redundant, resulting in easier and cheaper maintenance and better agility. The downside is that it goes only so far in exploiting the advantages of HANA, still leaving much to be desired, and is more disruptive in the short term.
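The two options can be sketched side by side. The following Python snippet is only an illustration of the structure, not ABAP and not real HANA code: the function names, the platform string, and the "open quantity" calculation are hypothetical, and the HANA branch is ordinary Python standing in for what would be native database code.

```python
from typing import Dict, List

Item = Dict[str, int]


def open_quantity_generic(items: List[Item]) -> int:
    """Existing, database-agnostic logic; it stays untouched in the fork approach."""
    total = 0
    for item in items:
        total += item["ordered"] - item["delivered"]
    return total


def open_quantity_hana(items: List[Item]) -> int:
    """Placeholder for a HANA-specific implementation.

    In a real system this would call native code on the database. Here it is
    plain Python so that the sketch runs anywhere.
    """
    return sum(i["ordered"] for i in items) - sum(i["delivered"] for i in items)


def open_quantity_forked(items: List[Item], platform: str) -> int:
    """Fork for HANA: two copies of the business logic behind one IF."""
    if platform == "HANA":
        return open_quantity_hana(items)
    return open_quantity_generic(items)


def open_quantity_rewritten(items: List[Item]) -> int:
    """Rewrite for all: one implementation that every platform executes."""
    return sum(i["ordered"] - i["delivered"] for i in items)


items = [{"ordered": 10, "delivered": 4}, {"ordered": 5, "delivered": 5}]
results = {
    "fork on HANA": open_quantity_forked(items, "HANA"),
    "fork elsewhere": open_quantity_forked(items, "OTHER"),
    "rewrite": open_quantity_rewritten(items),
}
print(results)
```

Note that the fork needs two test runs to cover one piece of business logic, and both branches must be kept in sync whenever the logic changes. That is the maintenance cost described above, in miniature.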

The smart solution is to combine the two approaches: Realize as much of the HANA optimization as possible with the “rewrite for all” approach, but don’t be shy to use “fork for HANA” when amazing factors or entirely new features are within reach. Each individual design consideration should be impacted by the value of the particular enhancement (how frequently does it run, and what benefit can be gained by making it run faster) and by the disruption it causes (how likely it is that the code to be replaced or forked has been modified, enhanced, or duplicated by customers, and how harmful is the respective decision to replace or fork to those enhancements).

A double standard for HANA and other databases?

With these paradigm changes, there is one thing I found slightly political, or architecturally inconsistent: If we concede that code pushdown to the database makes sense when it creates drastic improvements and new opportunities in the HANA case, then why should the same approach be per se bad for other database systems?

Of course I can see why HANA is special to SAP and doesn’t receive the same treatment by the Business Suite as other database systems. A certain extent of what might be called favoritism in a half-jesting fashion is acceptable here, but it would be difficult to follow if SAP were to pragmatically overthrow the previously existing architecture axiom of database-agnosticism for their own database only but keep it firmly in place for the databases of all other vendors.

In some cases, the same stored procedures or native SQL code accelerating the Business Suite on HANA might run perfectly on other database platforms, or could be ported with minimum effort and achieve significant performance improvements there, too. If the improved performance is worth the redundancy in the code base (with all its implications for maintainability, error-proneness, and so on), then it might be justifiable to pursue this avenue even for databases other than HANA.

Please note that this is two big steps away from the previous paradigm of database-agnostic ABAP applications. Having said that, please note also that exceptions have been made in the past, for example in SAP Business Warehouse, whose code is full of forks for specific database platforms.

And pushdown for all

Considering this, I was fascinated when Sam Yen (I think it was him), after demoing MRP on HANA, explained that stored procedures would now be used for HANA and other databases, too. This means that the new (revived) architecture paradigm of code pushdown, in which the ABAP application server gives some of the work it has traditionally considered its own to the database, won’t be limited to the HANA case.

I don’t remember the exact wording, but I found two blog posts whose authors confirm my understanding.

siva.darivemula writes in his excellent recap blog post, Real-time Business Processes powered by SAP HANA:

“Using SQL Stored Procedures (note: as I understand it), a single version of Business Suite can work with all major databases such as Oracle, IBM DB2, SQL Server, Sybase ASE, etc. in addition to SAP HANA database. This is great news for current customers that have significant investments in their database infrastructure.”

I even have a second, more prominent witness: vishal.sikka writes in his post, A Promise Delivered – SAP Business Suite Is Now Powered by HANA:

“Moreover, thanks to the power and universality of SQL, SAP Business Suite remains open to other database technologies and vendors and only one version of SAP Business Suite goes forward without disruption. Innovations in the SAP Business Suite, such as push down of data centric processing logic from the application server to database tier via stored procedures would be made available to other databases too making them perform better too.”

I might dwell more extensively in a later installment of this blog on what this means for ABAP application development: less schizophrenia at the ABAP level, but a harder struggle to deal with the peculiarities of various database platforms at the native level. For now, let it just be noted that this is a great step towards pragmatic architecting, and I look forward to the fruits of SAP’s cooperation with database vendors on a tighter database integration and better compatibility. After all, the broader the shared base of compatible functionality between the databases under an ABAP server, the better for all involved. In a way, it’s as if SAP were aiming for a drastic expansion of the speed and functional richness of OpenSQL, and perhaps this is indeed where we will end up.

The road ahead

There’s room for speculation about the implications of this news, and more will be known in the future. For now, it seems safe to say that the previously bipolar HANA/non-HANA model transforms into a spectrum where, instead of being limited to the lowest common denominator, we can now exploit the different sets of native capabilities of various database platforms if we so desire:

Power of HANA: Thanks to the Calculation Engine, SQLScript, and many other features of HANA such as its built-in Predictive Analytics capabilities, the power of HANA goes way beyond the power of SQL, and the Business Suite’s application designers will use those features to help HANA customers get the most out of the platform. (Works on HANA only.)
Power of SQL: More or less compatible or portable SQL Stored Procedures or native SQL going beyond the power of OpenSQL are another option. (Works on some but not all supported databases.)
Power of ABAP: And there’s of course good old OpenSQL. (Works on all databases supported by SAP.)
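One way to picture this spectrum is as a fallback chain: for each piece of functionality, run the most powerful implementation that both exists and is supported by the database at hand. The Python sketch below is purely illustrative. The tier names, the platform names, and their capability sets are my own assumptions for the sake of the example, not anything SAP has announced.

```python
from typing import Dict, List, Set

# Tiers ordered from most specific to most portable, following the list above.
TIERS: List[str] = ["hana_native", "native_sql", "open_sql"]

# Hypothetical capability sets, for illustration only.
PLATFORMS: Dict[str, Set[str]] = {
    "hana": {"hana_native", "native_sql", "open_sql"},
    "db_with_procedures": {"native_sql", "open_sql"},
    "db_open_sql_only": {"open_sql"},
}


def pick_tier(implemented: Set[str], platform_supports: Set[str]) -> str:
    """Return the most powerful tier that is both implemented and supported."""
    for tier in TIERS:
        if tier in implemented and tier in platform_supports:
            return tier
    raise LookupError("no runnable implementation; an OpenSQL fallback is expected")


# Assume this feature has a HANA-native variant and an OpenSQL fallback,
# but no portable stored-procedure variant.
implemented = {"hana_native", "open_sql"}
for name, capabilities in PLATFORMS.items():
    print(name, "->", pick_tier(implemented, capabilities))
```

In this example the database with stored-procedure support still falls back to OpenSQL, because nobody has written the middle-tier variant for this particular feature. How many tiers to implement for a given feature is the same cost-versus-benefit decision discussed for forks earlier.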

I can’t wait to explore what’s coming, and to continue to contribute to the conversation. Let's keep each other posted.