POV: HANA’s impact on ABAP Programming
Background
What is HANA?
HANA (High-Performance Analytic Appliance) is a new in-memory data management appliance released by SAP. It is delivered as a separate add-on (software and hardware) for faster analytics. The appliance processes massive quantities of real-time data in the main memory of the server to provide immediate results for analysis.
HANA is a pre-configured analytic appliance. SAP’s hardware partners provide the hardware and pre-install the SAP in-memory solution on that hardware.
How does it work?
Data is replicated from SAP or non-SAP sources into HANA. For SAP Business Suite, a Replication Server and Agent come with HANA to transmit the data. For non-SAP sources, the BusinessObjects Data Services tool can be used to replicate the data into HANA.
HANA has a row store and a column store, which hold data in relational (row-based) and columnar format respectively. Columnar storage is a proven concept for high-performance analytics. The introduction of a calculation and aggregation engine at the database level is a huge advance in processing.
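To make the row store versus column store distinction concrete, here is a minimal sketch in plain Python. It is not HANA code: ordinary lists stand in for the two storage layouts, and the table and field names (order_id, region, amount) are invented for the illustration.

```python
# Illustrative only: plain Python lists stand in for the two storage layouts.
# The field names (order_id, region, amount) are made up for this sketch.

rows = [
    (1, "EMEA", 120.0),
    (2, "APJ", 80.0),
    (3, "EMEA", 200.0),
    (4, "AMER", 50.0),
]

# Row store: each record is kept together, one tuple per row.
row_store = list(rows)

# Column store: each field is kept together, one list per column.
column_store = {
    "order_id": [r[0] for r in rows],
    "region": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def total_amount_row_store(store):
    # Every full record is visited even though only one field is needed.
    return sum(record[2] for record in store)


def total_amount_column_store(store):
    # Only the "amount" column is scanned.
    return sum(store["amount"])


def fetch_order_row_store(store, order_id):
    # A single-record read finds the whole record in one place.
    for record in store:
        if record[0] == order_id:
            return record
    return None


def fetch_order_column_store(store, order_id):
    # The same read has to reassemble the record from every column.
    position = store["order_id"].index(order_id)
    return tuple(store[name][position] for name in ("order_id", "region", "amount"))
```

The aggregate touches a single list in the columnar layout, which is the intuition behind using it for analytics. Reading one complete record, on the other hand, is more natural in the row layout.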
SQL and MDX processors are responsible for parsing SQL and MDX query requests and processing them.
HANA currently supports a few clients, such as BO Explorer and Microsoft Excel.
SAP HANA Studio is the front end for data modelling and administration.
Does HANA support ABAP Programming?
HANA 1.0 doesn’t support ABAP. It supports a few clients, such as SAP BWA (BW Accelerator), Microsoft Excel, and SAP BusinessObjects Analysis, edition for Microsoft Office. HANA 1.5 will support ABAP programming and reporting, which means that ABAP developers will be able to write code to read and modify data in the HANA appliance.
Will HANA impact ABAP Programming style?
Traditional ABAP programming style leans towards reducing the load on the database and shifting the processing to the application server. This has more to do with data storage models, and the SAP 3-tier system architecture provided another layer for data processing. ABAP developers were therefore confined to the concept of “pull the required data to the application server and process it there”, and along came the complexities of buffering.
With the introduction of HANA, data is held in volatile memory, so it can be retrieved and processed in the same “memory”. At first thought, this gives a feeling that the ABAP OpenSQL programming style should change and that some of the “banned” OpenSQL statements, like ‘Order By’, should come back into use. Well, hold on!! Let’s not jump to conclusions. Let’s analyze some specific scenarios from the HANA perspective and then decide:
- Scenario-1: There is a database table with 50 records in it and you wish to read the table 1000 times for values based on field F1.
  - Non-HANA: Since the database table is relatively small, make a copy of the table in your program and then read the desired record from that copy as and when required.
  - With HANA: Read data from the database table as and when required and process it. Remember, with HANA, query and data processing happen in the same place!!!
- Scenario-2: Let’s consider the opposite of Scenario-1. There is a database table with 10000 records in it and you wish to read the table 100 times for values based on field F1.
  - Non-HANA: Since the database table is very large, it’s better to use SELECT SINGLE than to load the entire table into an internal table.
  - With HANA: Same as Non-HANA. Read data from the database table as and when required and process it. The only advantage here is that there is no network traffic with every query, as query and data processing happen in the same place!!!
- Scenario-3: Joins vs. nested SELECT statements
  - Non-HANA: A join is the preferred way to fetch data from two tables, rather than a nested SELECT statement. No second thoughts about it from the OpenSQL perspective.
  - With HANA: From a relational database perspective, joins and nested SELECTs each have their own advantages. With HANA, the decision to use one or the other depends on further requirements. If any intermediate processing has to be done, a nested SELECT is better than a join.
- Scenario-4: Optimize the cost of database searches
  - HANA or no HANA, the cost of a database search should always be optimized, and the concept of an index remains extremely important.
- Scenario-5: Table buffering
  - Non-HANA: SAP provides a buffering facility to increase performance. Since buffers reside on the application server, it takes considerably less time to read data locally than to read it from the database.
  - With HANA: No additional table buffering is required, as the whole database resides in volatile memory.
- Scenario-6: ABAP SORT vs. ORDER BY clause
  - Non-HANA: The optimized solution is to use ABAP SORT and reduce the load on the database server.
  - With HANA: It really doesn’t matter. Both statements should be equally good or bad.
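Scenarios 3 and 6 can be sketched outside ABAP as well. The example below is only an analogy, under loud assumptions: an in-memory SQLite database stands in for the database layer, Python stands in for the application server, and the tables (orders, items) and their columns are invented. It says nothing about actual HANA or OpenSQL performance. It only shows that both styles return the same data and where the work happens in each.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the database, Python for the
# application server. Table and column names are invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items (order_id INTEGER, material TEXT, qty INTEGER);
    INSERT INTO orders VALUES (1, 'ACME'), (2, 'GLOBEX');
    INSERT INTO items VALUES (1, 'M-02', 5), (1, 'M-01', 3), (2, 'M-03', 7);
    """
)


def read_with_join(connection):
    # Scenario 3, join: one statement, the database combines both tables.
    query = """
        SELECT o.customer, i.material, i.qty
        FROM orders AS o
        JOIN items AS i ON i.order_id = o.order_id
    """
    return connection.execute(query).fetchall()


def read_with_nested_select(connection):
    # Scenario 3, nested select: one inner query per outer row, which
    # leaves room for intermediate processing between the two reads.
    result = []
    outer = connection.execute("SELECT order_id, customer FROM orders").fetchall()
    for order_id, customer in outer:
        inner = connection.execute(
            "SELECT material, qty FROM items WHERE order_id = ?", (order_id,)
        )
        for material, qty in inner:
            result.append((customer, material, qty))
    return result


def sorted_by_database(connection):
    # Scenario 6, ORDER BY: the database returns rows already sorted.
    return connection.execute(
        "SELECT material, qty FROM items ORDER BY material"
    ).fetchall()


def sorted_by_application(connection):
    # Scenario 6, application-side sort (the analogue of ABAP SORT).
    rows = connection.execute("SELECT material, qty FROM items").fetchall()
    return sorted(rows, key=lambda row: row[0])
```

The nested variant issues one extra query per order, which is exactly the cost the traditional advice tries to avoid. Whether that cost still matters is the question the scenarios raise.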
Conclusion
After analyzing some of the basic programming scenarios, programming with HANA seems to be a completely different world that ABAP developers should watch out for. While the database programming best practices still hold, developers will have the flexibility to use data processing statements that were considered expensive without HANA. Another completely new dimension to ABAP programming will be the columnar database concept, which is also part of the HANA appliance.
This is the main point. E.g. in BW there are hundreds of ABAP lines to activate data in a DSO. SAP will try to replace them by a DB stored proc.
And that's where the whole uncertainty comes into place. You will probably be able to slightly adjust your SELECT statements and internal tables to get some advantages out of HANA. But I'm doing a lot of very complex programming on BW-IP planning functions and complex ABAPs in transformations. Just the effort to fetch the data to the application layer and put it back to the DB layer afterwards means that all this coding is inferior to a stored proc.
So the question that remains to be answered: Will ABAP be only available for us oldtimers or will there be a place where ABAP will be superior to stored procs? I'm not deep enough into HANA to fully answer the question but I'm looking forward to seeing the answer.
Furthermore: in most cases, it is prohibitive to bypass a join on the RDBMS by implementing a naive nested loop join in the application server. Commercial RDBMS have a variety of join strategies and algorithms to choose from (hash, sort-merge, partitioned, index-supported etc.). So making this a HANA-specific argument is nonsense.
HANA's strengths don't come from the pure fact of "in-memory" alone.
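For readers who have not met the join algorithms named above, here is a minimal sketch contrasting a naive nested loop join with a hash join. It is plain Python with invented sample data, not code from any RDBMS, and only illustrates why the two differ in the amount of work done.

```python
# Illustrative only: lists of tuples stand in for two database tables.
orders = [(1, "ACME"), (2, "GLOBEX"), (3, "INITECH")]
items = [(1, "M-01"), (1, "M-02"), (3, "M-03")]


def nested_loop_join(left, right):
    # Naive approach: compare every left row with every right row,
    # i.e. len(left) * len(right) comparisons.
    result = []
    for left_key, left_value in left:
        for right_key, right_value in right:
            if left_key == right_key:
                result.append((left_key, left_value, right_value))
    return result


def hash_join(left, right):
    # Build phase: index the left table by join key in a dictionary.
    buckets = {}
    for left_key, left_value in left:
        buckets.setdefault(left_key, []).append(left_value)
    # Probe phase: one dictionary lookup per right row.
    result = []
    for right_key, right_value in right:
        for left_value in buckets.get(right_key, []):
            result.append((right_key, left_value, right_value))
    return result
```

Both functions return the same rows. The hash join reads each table once, while the nested loop compares every pair of rows.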
Just to clarify: there won't be a HANA 1.5. It will stay HANA 1.0, but with SP3.
2. New Data or Change Data?
3. Is ABAP able to fetch from in memory and from database at the same time?
Was wondering about these scenarios too.....
1. Changes in Select Queries --> Either changes in the ABAP kernel will be required to incorporate many Open SQL statements, OR new syntax might be required to access the HANA database (the in-memory database, to be precise) that supports access via row / column. A whole new processing paradigm may be required for columnar operations.
2. Database updates --> These may either be restricted to local changes or be synchronized back to the core system.
In my opinion, you're underestimating the impact that HANA will eventually have on the SAP architecture.
Currently, SAP intentionally limits the load on the database layer, and uses the CPU / memory of the application layer, as this layer can generally scale out more easily (e.g. by adding additional application server instances). In this architecture ABAP statements are used to extract and manipulate data in the memory of the application server.
With HANA the capacity and scalability of the database layer are both dramatically increased. With this architecture, continuing to extract and manipulate data at the application layer (e.g. using ABAP) no longer makes sense, as these functions can be done more quickly and efficiently in the memory of the database layer.
So, what does this mean for ABAP Programming? Most likely, the complete removal of the need for the majority of data and application logic. Like you said in your conclusion, "a completely different world".
Cheers,
Jon
We do it this way to avoid logic at the database level in so many different ways. One way is pulling everything into an internal table and then doing the logic on it. Who hasn't done that? That makes it quicker because the data is "in memory" within the ABAP program. We hit the database once instead of multiple times with a select single. And yes, some of this is because we are running on an AS/400.
But won't it be nice to do our logic at the "database level"? We wouldn't have to pull all the records into an internal table to do the logic. Instead, select single could be used. And yes, I also think it won't be a good way to program all the time.
Now I could be completely wrong. But that's one of the things I think HANA can help with. The data is now in memory instead of at the DB level. It's like using the tables that aren't changed often - they reside in the buffer.
All of our ABAP code would eventually have to change to take advantage of HANA's capabilities. A whole new way of thinking / programming.
BR,
Michelle
btw - I smiled at scenarios 1 and 2. As most of my programming nowadays is in BW, if a table has fewer than 50'000 records, it's a candidate for full memory buffering!
Good to see your effort. Well done.
Regards
Thiru
It's really helpful for us to get first-level knowledge of how HANA is useful for ABAPers.
However, I miss the OO-approach here. In an ideal world, we would not so much be talking about "reading thousands of entries from VBAK", as "accessing the sales order object"... I've been playing around with the Shared Memory Objects technique recently, which (as far as I can see) applies the same principles as HANA, but in a more logical and (I dare say) intelligent way, by creating and storing objects - containing data and business logic - in memory.
If all we do is adapt our select statements without revising our holistic coding approach, then the gain is rather short-sighted.
With a lot of buzz about HANA, I was expecting a whole revolution, but this seems like very little change.
Is ABAP coding really going to be impacted so little?
Also, is HANA available to be used by clients?
Thanks,
Kumud
From an ABAP point of view, what is the difference between working with HANA and, for example, traditional ABAP with fully buffered tables (for example, when reading customizing)?
Thanks,
Peter