A Newbie to SAP HANA
“With HANA the only limitation is our imagination!!”
These were the signing-off words of Dr. Vishal Sikka, delivering the course An Introduction to SAP HANA on openSAP.
I grabbed the opportunity to introduce myself to this amazing new platform from SAP through this course. With a total learning time of 2-3 hours, the course was powerful enough to get me carried away by the mind-blowing features offered by SAP HANA, features that were once deemed impossible in the technical world.
As an introductory background, Dr. Sikka explained why SAP built HANA, some of the history, how HANA came about and how it reached its present state. He explained that the relational database was designed in the late '80s and early '90s. In those days SQL, built on relational algebra, started gaining popularity as people wanted to get rid of file-based data management. People wanted to move to a more structured, relational way of managing data, which is why the Structured Query Language became so popular. The hardware on which the relational database was implemented was significantly different from the hardware available today. Today we have multicore processing systems and DRAM that is larger and cheaper, in contrast to what was available in the late '80s and early '90s.
This led SAP to think that a completely new kind of database paradigm was within reach, and it was pretty clear that if SAP was to build this new paradigm, it had to be built entirely around the new reality of the available hardware.
Dr. Sikka also touched on the advent of columnar technology, when people were doing OLTP for transactions and OLAP for analytics separately. Though the columnar database had good advantages to offer in analytical performance, it was largely a disk-based implementation. With the availability of multicore processing systems and large, less expensive DRAM, it became possible to imagine an in-memory database with faster and better performance. This led to the development of a much-needed next-generation database, which later evolved into a whole new platform, removing the need for the traditional 3-tier architecture that was invented as an efficiency mechanism so that the database would not become a bottleneck.
SAP HANA Technology
The second chapter, which by and large covered the SAP HANA technology, started with the concept of parallelism. The multicore processors, the in-memory database, the columnar structure, and rethinking and redesigning everything from scratch give HANA the capability to offer operator and intra-operator parallelism, facilitating 12.5-15 million aggregations per second per core.
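The idea behind intra-operator parallelism can be sketched in a few lines: a single aggregation is split into chunks that are processed concurrently, with the partial results combined at the end. This is only an illustration of the principle (HANA parallelizes across physical cores in its engine; the function and chunking scheme here are my own):

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    """Aggregate one chunk of the column independently."""
    return sum(chunk)

def parallel_column_sum(column, workers=4):
    """Intra-operator parallelism, sketched: split one column into one
    chunk per worker, aggregate the chunks concurrently, then combine
    the partial results into the final aggregate."""
    if not column:
        return 0
    size = (len(column) + workers - 1) // workers
    chunks = [column[i:i + size] for i in range(0, len(column), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))
```

The combine step works because summation is associative; the same split-aggregate-combine shape applies to other aggregates like count, min and max.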
The column store and the row store in HANA are pretty much the same as the traditional ones, except that they are in-memory. The row store has the advantage that it can perform transactions faster, and the column store, as already mentioned, can do analytics and reads dramatically faster. In HANA the two co-exist, and this has led to amazing inventions like Optimistic Latch-Free Index Traversal (OLFIT), which facilitates storing a transaction in memory without locking the entire index or system. This caters to the need for wide data structures in financial applications, which have massive tables with around 320 fields.
One of the myths about column stores is that they are slow when it comes to transactions. But in HANA, new inventions have led to a different storage structure. There is an optimized Main column store, a semi-optimized column store called the Delta Store, and an L1 Delta Store, which is a row store. The L1 Delta Store, which sits in front of the Delta Store, absorbs transactions very fast, and when the system gets some breathing room the data is moved to the Delta Store or the Main Store, as the case may be. In the meantime, if a query needs information from all three stores, a join can be performed across them.
Next came Projections, Dynamic Aggregation and Integrated Compression. I was in a what-are-these mood when I heard them for the first time, but Dr. Sikka's explanation made them all clear. HANA supports the principle of Minimal Projection, where you extract only the columns that are desired and skip the rest, which is very fast in a column store. Aggregations like the sum of transactions or sales are performed dynamically in HANA. Further, the data is compressed to use minimal memory, i.e., storage of redundant data is avoided using different methods of Integrated Compression, such as Dictionary Compression.
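Dictionary compression is easy to demonstrate: each distinct value is stored once in a dictionary, and the column itself becomes a list of small integer codes pointing into it. A minimal sketch (the sorted dictionary and function names are my own; HANA additionally bit-packs the codes and applies further compression on top):

```python
def dictionary_encode(values):
    """Dictionary compression: store each distinct value once and replace
    the column with integer codes referencing the dictionary. For a column
    with few distinct values, the codes are far smaller than the values."""
    dictionary = sorted(set(values))
    code = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [code[v] for v in values]

def dictionary_decode(dictionary, codes):
    """Reconstruct the original column from dictionary plus codes."""
    return [dictionary[c] for c in codes]
```

A country column with millions of rows but only ~200 distinct values, for example, collapses to a 200-entry dictionary plus one small code per row, and comparisons during scans can run on the codes instead of the strings.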
HANA has the capabilities of Insert Only, Partitioning, Scale-out and Active-Passive storage, which help it offer features that other databases cannot. Using Insert Only, new entries can simply be appended, and outdated entries can then be dealt with in a separate operation. Data can be partitioned across various nodes, which can bring amazing performance enhancements: operations that once required days can be done in a couple of seconds. In scenarios where the semantics of the application are known, the frequently required or hot data can be kept in Hot/Active storage and the rest can be put into Cold/Passive storage.
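A minimal sketch of the insert-only idea, assuming the common append-only design in which an update writes a new version and marks the old one invalid rather than overwriting in place, so earlier states stay queryable (the class, its fields and the logical clock are illustrative, not HANA's actual internals):

```python
import itertools

class InsertOnlyTable:
    """Insert-only sketch: every write appends; an update invalidates the
    previous version instead of removing it, so the table can be read
    'as of' any earlier logical timestamp (simplified time travel)."""

    def __init__(self):
        self._clock = itertools.count(1)
        # Each row: [key, value, valid_from, valid_to] (valid_to None = current)
        self.rows = []

    def insert(self, key, value):
        self.rows.append([key, value, next(self._clock), None])

    def update(self, key, value):
        now = next(self._clock)
        for row in self.rows:
            if row[0] == key and row[3] is None:
                row[3] = now          # invalidate the old version, don't remove it
        self.rows.append([key, value, now, None])

    def as_of(self, ts):
        """Reconstruct the table state at logical time ts."""
        return {r[0]: r[1] for r in self.rows
                if r[2] <= ts and (r[3] is None or r[3] > ts)}
```

Whether outdated versions are eventually removed or kept around is a policy choice in such a design: physically removing them reclaims memory, while retaining the invalidated versions is exactly what makes as-of queries possible.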
SQL Libraries and Summary
The beauty of HANA is that even though it is completely designed from the ground up, the interfaces remain the same ones people have been using for decades. HANA mainly supports SQL, the structured query language that tons of people around the world use. Apart from SQL it supports MDX, text functions, all types of functional enhancements for business functions, map-reduce operations, and a stored-procedure language, SQLScript. HANA also has a native low-level language, L, which is built on LLVM.
Apart from this, as HANA itself is written in C++, there are lots of C++ libraries for GIS data, text data, the Business Function Library, the Predictive Analysis Library and so on. SAP is also coming up with a way to safely integrate external libraries written by anyone, which it calls the AFL.
So when you look at HANA, the whole picture contains five important technical aspects, viz., the Core, Engines, SQL, Libraries and the Application Services, which make it more than a database: a Platform!
After all that was designed, there was a need to rethink the notions of performance and benchmarks. Hence came Dr. Sikka's 5 Dimensions of Performance:
1. Data Size
2. Query Complexity
3. Rate of Change
4. Data - prepared or raw
5. Response Time
The more of these five dimensions a workload involves, the more HANA's performance stands out.
Roadmap and Re-thinking Software Development using HANA
With the development of HANA complete, the question was where it would be used. The roadmap was quite straightforward: bring every single product to HANA. Today every single SAP product produces its best results with HANA. The Business Suite, Cloud applications, BW, and the application platforms ABAP 7.4 and Java are all optimized to run on HANA. Everything that was done before is being rethought and refactored on the HANA platform.
In the final summary video, Dr. Sikka touched on a wide range of innovations and developments that have taken place, which were eyebrow-raising and convincing that the only limitation with HANA is our imagination!!!
Thanks for the summary!
Sorry for this late comment, but for the insert only part, does the outdated entry get removed or does it just get invalidated? Because in the case of removal, there won't be any 'timetravelling' as Vishal calls it.