(3rd part of SAP TechEd recap blog for session CD202)
In the previous part, Thorsten Schneider explained that in the context of leveraging SAP HANA from ABAP, the database access becomes a very important aspect of the application logic because of the so-called “code pushdown” (or better, “code-2-data”) paradigm. In this (and the following) blog, I want to focus a bit more on the technical aspects including how to detect and optimize an existing report.
Before going into the details, let me repeat that it is not mandatory to apply this new paradigm in each and every scenario. It is particularly suited for those scenarios where current applications and processes are just not good enough anymore for the business requirements of today; maybe just in terms of performance, but maybe also in terms of what Tobias Trapp called in his excellent blog “Intelligence follows data”, i.e. benefiting in real-time from all available business data.
You can compare this aspect to some extend with the difference between classical and quantum mechanics, where the classical methods can still be applied to most standard problems involving the motion of bodies, but for effects on another scale (in physics: microscopic scale, for business data: data volume) new paradigms have to be considered in order to overcome technical obstacles.
Now, the main motivation for applying the code-2-data paradigm on SAP HANA stems from the following two facts
- In scenarios involving big data sets, the actual result sets needed for a user interface or, more generally, business process step, are usually much smaller than the involved data.
- SAP HANA provides powerful engines for data crunching and handling business calculations on-the-fly which go far beyond standard database features (e.g. for handing unstructured data, conversion, classification and prediction algorithms, etc.).
From a technical perspective, the main interface to the SAP HANA database is SQL. In particular, also all of the advanced capabilities in SAP HANA can be accessed remotely via the corresponding database clients. Hence, also from ABAP you can access all these capabilities and I will explain below how this works.
Recap: Database architecture of the AS ABAP
Let me shortly summarize the core database architecture of the AS ABAP. All the database tables used by the ABAP server are stored in a dedicated database schema (usually called SAP<SID>, where SID is the system ID). The ABAP server connects to the database with a user of the same name. When writing a piece of ABAP coding involving database access (usually an Open SQL statement), the ABAP kernel (more precisely, the so-called DBI (database interface) and DBSL (database specific library)) translate this to a native SQL call to the database system and pass the result back into the ABAP data structures.
- Open SQL (the standard way): Open SQL is embedded into the ABAP language and provides a database independent syntax and semantics based on a subset of the SQL standard, which is common to all supported database. It is tightly integrated with ABAP data dictionary; in particular Open SQL allows only access to tables and views created via the SE11 in the default schema. Open SQL provides many qualities and services like syntax checks, automatic client handling, and table buffering.
In terms of data access, Open SQL essentially allows standard SELECT statements involving joins, aggregations, sub-selects, etc. However, just to give some examples, the following SQL constructs are currently not part of Open SQL: CASE, UNION, GROUPING SET, or expressions (e.g. string operations like concatenation or substring).
- Native SQL (for experts): native SQL allows accessing all native database features, and all tables or views independently of the data dictionary, but lacks the qualities mentioned above (in particular, obviously, database independence). Native SQL can be used either via ADBC (recommended), or via EXEC SQL.
Note that both variants support so-called secondary connections (maintained via the DBA Cockpit), which allows to access SAP HANA running side-by-side. There is a nice blog by Thomas Jung about the basics of using ADBC to access SAP HANA from ABAP via secondary connections.
The fact that native SQL plays a more important role on SAP HANA compared to the traditional databases is obvious: fully exploiting the power of SAP HANA means to make use of non-standard features. This might mean to use a specific SQL feature not supported by all databases (e.g. when using fuzzy search), or accessing a non-standard object (such as a calculation view) via SQL. These aspects and examples will be explained in more detail in the next blog.
A concrete example for code pushdown via Open SQL
Pushing code to the database is nothing new. For instance a SELECT statement involving a WHERE clause for filtering is a “pushed down” version of fetching all rows and filtering in the application server. More generally, using standard Open SQL is the simplest way of pushing down operations to the SAP HANA database.
For pure acceleration, I would always recommend to start with this approach first because
- The resulting code will still run on all database systems.
- The resulting code will nevertheless benefit from many SAP HANA capabilities (e.g. operations aggregations, sorting can be optimally supported by the column store and leverage parallelization).
Let’s consider a concrete example roughly consisting of the following steps:
- Some master data of the customer (name, address data, etc.)
- Some key figures reflecting his payment behavior (total amounts of unpaid invoices; average days until invoice settlement, etc.)
- A segmentation based on these key figures and some customizing (e.g. A, B, C customers).
An existing plain ABAP implementation may have a certain runtime (say, a few minutes based on a couple of million invoice records). In order to detect potential for optimization on SAP HANA, the proven runtime and code analysis toolsin the AS ABAP can be used (e.g. the runtime analysis (SAT) or the code inspector (SCI)). With ABAP 7.40 there will be dedicated checks for SAP HANA combined with guidelines and proposals, all integrated into the ABAP development tools in Eclipse.
The implementation of our example is based on the NetWeaver Enterprise Procurement Model, a relatively simple data model realizing a “Sell from stock” scenario. Using the tools mentioned above, it is possible to localize those parts of the coding where an optimization of the database access seems possible, e.g. by reducing the number of currency conversions and database round-trips by leveraging standard features in Open SQL (e.g. grouping and join):
(Note that the screenshots are previews of the upcoming Eclipse based versions of these tools which will be available with AS ABAP 7.4.)
An optimized version of coding using only Open SQL above may look as follows:
Indeed a performance improvement of several factors can be observed, and using the performance tools on just this coding segment (which is of course only a part of the overall ABAP report) the effect of code-pushdown also very visible:
Note that while this approach works on all database systems, it is particularly beneficial on SAP HANA due to the nature of the column store and support for parallelization. In the next blog we will see how it is possible to even improve the performance but another factor when using HANA capabilities which are not standard database features.
Code pushdown starts and ends at the user level
While accelerating certain database accesses or operations by a factor is great, it is crucial to keep the overall scenario and in particular the end-user in focus. The end user obviously does not care about “manifested aggregates” or “table paging”, but only if the system can provide him with the correct information as fast as possible and presented in a visually appealing way. All of these three qualities are crucial for the end-user.
We often use the ABAP List Viewer (ALV) as a blueprint for implications of the code-2-data paradigm. Classically when using the ALV, data is selected from the database into an internal table and this internal table is handed to the ALV which in turn displays the data provides many UI services (like sorting, paging, filtering, and export) based on this data. Clearly, there are limits for this approach when the data volume becomes too large. In addition, the initial rendering time might be quite long because all data is fetched from the database and processed by the ALV before the initial output.
On SAP HANA we can challenge this design and offer an additional variant of the ALV where the data itself is not passed in advance but only a description of the data source and the ALV handles the data access internally. This is in particular interesting when the data source itself is a view in SAP HANA involving complex data crunching (this will be discussed in the next blog).
However, the display of the (raw) result data for the consumer still involves application design and business logic. For instance
- A certain column (on the database a character of length 1) maybe needs to be presented as an icon (based on a mapping of fixed domain values),
- Currencies should be displayed properly (taking for instance the decimal places for the individual currencies into account (e.g. for Japanese Yen (JPY) there are no decimal places and the raw value from the database has to be shifted by 2 decimal places).
Such qualities are classically provided by components in the AS ABAP such as for example the ALV or the value help framework. We believe that the combination of such widely used components with the innovations of SAP HANA allows broad adoption in different user interface technologies (classical SAP GUI, Web Dynpro ABAP, SAP UI5, etc.).
In summary, it is important to remember the following three aspects
- The “code-pushdown” paradigm on SAP HANA allows addressing existing challenges and creating new opportunities when dealing with mass data in ABAP. The simplest approach for code-pushdown is to leverage standard Open SQL in an optimal way (the 5 golden rules are still valid in general), which allows optimizations without losing database independence. There is tool support for identifying and optimizing existing ABAP coding.
- All functionality in SAP HANA (including those going far beyond standard database features) can be consumed from ABAP via SQL (in releases below AS ABAP 7.4 usage of native SQL is required, with the upcoming AS ABAP 7.4 there will be dedicated support for using SAP HANA artifacts in standard ABAP).
- It is important always to consider the end-2-end process or application in order to bring the power of SAP HANA to the user.
In the next part, I will speak in detail about how to exploit concrete advanced SAP HANA features such as views, procedures, business functions, and fuzzy search from ABAP.