Key innovations introduced with SAP HANA

schneidertho · ‎12-11-2012

(2nd part of SAP TechEd recap blog)

In my last blog I wrote about why we need ABAP for SAP HANA, what we have already done and what we are planning to do to facilitate the integration between ABAP and the in-memory database technology, and which possibilities SAP HANA offers for ABAP (custom) development.

In this blog I want to write about the concrete impact SAP HANA has on the ABAPer.

Key innovations introduced with SAP HANA

To understand the impact SAP HANA has on the ABAPer, it helps to revisit some of the key innovations introduced with the in-memory database technology. Since you will find plenty of information about these innovations elsewhere, I will keep it short and focus on five innovations built into SAP HANA that I consider particularly important for the rest of this blog. For details on in-memory database technology in general you can, for example, take a look at the home page of HPI (where you will also find the original copies of the pictograms I have used).

	The first important innovation is (I am sure that you have heard about it) that SAP HANA is capable of storing all relevant business data in main memory. That does not mean that SAP HANA does not support storing data on disk. It does mean that SAP HANA (typically) does not need to access the disk during query execution.
	SAP HANA supports multi-core architectures by distributing queries across multiple CPU cores (and across multiple server nodes). Applications running on SAP HANA can hence benefit from massive parallelization. To give you an idea of what is technically possible: during SAPPHIRE in Orlando vishal.sikka demonstrated SAP HANA running on a system with 4,000 CPU cores.
	Within SAP HANA data can be organized either in row store (like in ‘traditional’ databases) or in column store. While both stores have specific advantages and disadvantages, the bulk of business data resides in the column store where it can be quickly searched and aggregated.
	SAP HANA can compress business data (in the column store). This reduces not only the amount of main memory needed, but also the amount of data to be transferred between main memory and CPU. The latter is important for handling the bottleneck of in-memory database technology: the CPU waiting for data to be loaded from main memory into cache (in contrast to disk I/O, which used to be the bottleneck in the past).
	Last but not least, I want to mention SAP HANA’s capability to partition datasets. Partitioning is of particular interest for large database tables because it supports the parallelization of queries. For example, if a database table is spread across multiple partitions (on one or more server nodes), the aggregation of one column can be done in two steps. In the first step all partitions are aggregated simultaneously. In the second step the results of the first step are added up.
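The two-step aggregation described for partitioned tables can be sketched in a few lines of Python. This is only a toy illustration of the idea, not how SAP HANA implements it: the column `amounts`, the helper names and the partition count are all made up for this example, and Python threads merely show the structure (they do not give real multi-core speed-up for this kind of work).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical column of line-item amounts, held column-wise as a plain list.
amounts = list(range(1, 1001))


def split_into_partitions(column, partition_count):
    """Split a column into contiguous partitions of roughly equal size."""
    size = max(1, -(-len(column) // partition_count))  # ceiling division
    return [column[i:i + size] for i in range(0, len(column), size)]


def aggregate_partitioned(column, partition_count=4):
    """Sum a column in two steps: per partition first, then across partitions."""
    partitions = split_into_partitions(column, partition_count)
    # Step 1: aggregate all partitions independently of each other.
    with ThreadPoolExecutor(max_workers=partition_count) as pool:
        partial_sums = list(pool.map(sum, partitions))
    # Step 2: add up the partial results of step 1.
    return sum(partial_sums)


total = aggregate_partitioned(amounts)
print(total)  # 500500
```

The point of the sketch is that step 1 needs no coordination between partitions, which is what makes it a good fit for many cores or several server nodes.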

Remark: while I have described the innovations one by one, it is the combination of them that really accounts for the power of SAP HANA.

A new paradigm for ABAP emerges: “code-2-data”

To leverage the outlined innovations introduced with SAP HANA from ABAP, the database access becomes very important. And it is crucial – not to say compulsory – for ABAP applications to move data-intensive operations (i.e. costly calculations on large datasets) to the database layer. This is in some respects a paradigm shift, which is often referred to as the “code pushdown” or “code-2-data” approach (in contrast to the “data-2-code” approach ABAP programs followed in the past).

The above diagram illustrates the two approaches: the “data-2-code” approach on the left and the “code-2-data” approach on the right.

What is the difference between the two?

In the past (based on traditional database technology) ABAP applications considered the database to be the bottleneck. Hence the complete business logic – including costly calculations – was implemented on the application layer. Very often a large number of records was transferred from the database to the application server to calculate only a few results there (for example to aggregate many line items to only a handful of key figures).

Now and in the future (based on SAP HANA) the database is not the bottleneck anymore. ABAP applications need to move – at least – parts of the business logic to the database layer. This not only reduces the amount of data to be transferred between database and application server, but it also ensures that the part of the business logic moved to the database (implicitly) benefits from the innovations built into SAP HANA.
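The difference between the two approaches can be made tangible with a small sketch. It uses Python with an in-memory SQLite database purely as a stand-in, since the effect on the amount of transferred data is the same in principle; the table `line_items`, its content and the function names are invented for this illustration and have nothing to do with a real ABAP or SAP HANA system.

```python
import sqlite3

# Stand-in database with a handful of hypothetical line items.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE line_items (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO line_items VALUES (?, ?)",
    [("A", 10), ("A", 20), ("B", 5), ("B", 15), ("B", 30)],
)


def totals_data_to_code(connection):
    """data-2-code: transfer every line item, aggregate in the application."""
    rows = connection.execute(
        "SELECT customer, amount FROM line_items"
    ).fetchall()
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0) + amount
    return totals, len(rows)


def totals_code_to_data(connection):
    """code-2-data: let the database aggregate, transfer only the results."""
    rows = connection.execute(
        "SELECT customer, SUM(amount) FROM line_items GROUP BY customer"
    ).fetchall()
    return dict(rows), len(rows)


print(totals_data_to_code(conn))  # ({'A': 30, 'B': 50}, 5)
print(totals_code_to_data(conn))  # ({'A': 30, 'B': 50}, 2)
```

Both functions return the same key figures, but the second one moves two rows instead of five to the application. With millions of line items and a handful of key figures, that gap is what the “code-2-data” approach is about.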

However – especially when looking at existing ABAP programs – rewriting the business logic completely in the database might not be reasonable due to the effort involved. Instead the “code pushdown” should mainly focus on costly calculations on large datasets (aka ‘number crunching’). Orchestration, process and display logic – which cannot be implemented in SAP HANA as of today or which will not benefit significantly from being rewritten in the database – should stay on the AS ABAP.

What does all that mean for the ABAPer?

OK, now you might want to understand what all that means in terms of your skills and competencies as an ABAPer. tobias.trapp has already touched on this in his blog ‘First experience with ABAP for HANA’. I would like to add a few thoughts.

During SAP TechEd I have heard questions like:

Do I have to forget everything I know about ABAP development?
Does Open SQL also work on SAP HANA?
Will all function modules be re-implemented as database procedures?

You can put your mind at rest. As already mentioned in the first blog of this series, existing ABAP programs (developed for traditional databases) continue to run on SAP HANA. That implies that a lot of your ABAP knowledge stays relevant as well. You can use Open SQL to access SAP HANA just like you access any traditional database. You can work with function groups (even though I personally prefer classes) when running on SAP HANA. And you should not attempt to re-implement all your function modules as database procedures. In five words: many things stay the same.

Even the performance guidelines we gave for traditional databases in the past remain valid for SAP HANA to a large extent. But certain guidelines become more important (‘use field lists’) and others less important (‘avoid access using non-key fields in the WHERE clause’).
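To make the ‘use field lists’ guideline concrete, here is a small sketch, again in Python with SQLite as a stand-in rather than ABAP on SAP HANA. The table `orders` and its columns are hypothetical. The query filters on a non-key field (`status`) and compares a `SELECT *` with an explicit field list.

```python
import sqlite3

# Stand-in database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, status TEXT, note TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer, status, note, amount) VALUES (?, ?, ?, ?)",
    [("A", "open", "n1", 10), ("B", "closed", "n2", 20), ("C", "open", "n3", 30)],
)

# Without a field list: every column of every matching row is returned.
all_columns = conn.execute(
    "SELECT * FROM orders WHERE status = 'open' ORDER BY id"
).fetchall()

# With a field list: only the columns the application actually needs.
field_list = conn.execute(
    "SELECT id, amount FROM orders WHERE status = 'open' ORDER BY id"
).fetchall()

print(len(all_columns[0]), len(field_list[0]))  # 5 2
print(field_list)  # [(1, 10), (3, 30)]
```

In a column store, each requested column has to be read and assembled into result rows separately, so asking only for the columns you need matters more than on a row store. SQLite itself is a row store, so the sketch shows the shape of the queries, not the performance effect.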

We would definitely like to encourage you to apply your ABAP knowledge to business problems when running on SAP HANA as well. In fact, when optimizing an existing ABAP program for SAP HANA you should first try to do the optimization by ‘standard’ ABAP means. Only if that does not do the trick should you consider using SQL features beyond Open SQL (i.e. Native SQL / ADBC), SAP HANA views or database procedures from within ABAP (which is planned to be quite easy in SAP NetWeaver AS ABAP 7.4).

At the same time you need to look beyond your own backyard. I would like to encourage you to brush up on your SQL knowledge (beyond Open SQL features), gain experience with performance optimization and familiarize yourself with modeling and scripting in SAP HANA.

Now eric.westenberger will take over. In the remaining two blogs Eric will deepen some of the thoughts, give you background information on the database architecture of the AS ABAP and show you some concrete examples.

SAP TechEd 2012 – ABAP for SAP HANA: what the heck is code pushdown?
