SAP HANA Introduction
SAP HANA is an in-memory data platform. With SAP HANA, you can build applications that integrate the business control logic and the database layer with unprecedented performance. As a developer, one of the key questions is how you can minimize data movement. The more you can do directly on the data in memory next to the CPUs, the better the application will perform.
SAP HANA is a flexible, data-source-agnostic appliance that enables users to analyze large volumes of SAP ERP data in real time using in-memory computing.
Data resides in memory (RAM) and not on hard disk. This is best suited for performing real-time analytics and for developing and deploying real-time applications. Handling data in memory gives the CPU quick access to it for processing.
The SAP HANA appliance is a combination of hardware and software that integrates various SAP technologies, such as the SAP HANA Database, SAP Landscape Transformation Server and Sybase replication technology.
SAP HANA Database is a hybrid in-memory database that combines row-based, column-based and object-oriented technology.
The advantages gained from in-memory processing are further amplified by the parallel processing capabilities of multi-core CPU architectures.
Past Vs Future
Data in a traditional RDBMS is disk-based. Data is processed by the CPU, but because main memory capacity is limited, only a subset of the data is held in memory by the buffer cache at any time. The architecture of these systems was designed with a focus on optimizing disk access, for example by minimizing the number of disk blocks that have to be read into main memory when processing a query.
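The buffer cache behaviour described above can be sketched with a toy least-recently-used (LRU) cache. This is only an illustration of the general technique, not the implementation of any particular database; the class name BufferCache, the fake_disk function and the block layout are all assumptions made for this example.

```python
from collections import OrderedDict


class BufferCache:
    """Toy LRU buffer cache: keeps at most `capacity` disk blocks in memory."""

    def __init__(self, capacity, read_block_from_disk):
        self.capacity = capacity
        self.read_block_from_disk = read_block_from_disk  # hypothetical disk reader
        self.blocks = OrderedDict()  # block_id -> contents, least recently used first
        self.disk_reads = 0

    def get(self, block_id):
        if block_id in self.blocks:
            # Cache hit: mark the block as most recently used.
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        # Cache miss: pay for a disk read, then evict the LRU block if over capacity.
        self.disk_reads += 1
        block = self.read_block_from_disk(block_id)
        self.blocks[block_id] = block
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return block


def fake_disk(block_id):
    # Simulated disk (assumption): block i holds the integers 10*i .. 10*i + 9.
    return list(range(block_id * 10, block_id * 10 + 10))


cache = BufferCache(capacity=2, read_block_from_disk=fake_disk)
for block_id in [0, 1, 0, 2, 1]:
    cache.get(block_id)
print(cache.disk_reads)  # 4 of the 5 accesses had to go to "disk"
```

Every access pays for the hit-or-miss check and the LRU bookkeeping, and most accesses in this run still end in a disk read. That overhead is what a disk-based design tries to minimize.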
In the SAP HANA database, all data is loaded into memory, so there is no longer any need to check whether a piece of data is already in memory or has to be read from disk. Because of the column store (vertical, column-wise storage, meaning that the values of one attribute are stored sequentially in memory), the data is CPU-aligned: there is no expensive calculation of LRU positions or logical block addresses, only direct (pointer) addressing of data. Since all data is available in main memory, reads require no disk I/O. Disks or solid-state drives are still required for persistence, so that the data remains consistent after a power failure. The optimization mechanisms of the database include the row and column stores, partitioning, compression and the processing of delta records.
Understand Column Data Storage
Relational databases typically use row-based data storage. SAP HANA uses both row-based and column-based storage, and is optimized for column-based storage.
Row Storage: The records of a table are stored as a sequence of rows.
Column Storage: The records of a table are stored as a sequence of columns, so each column holds the values of a single attribute for all records.
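As a minimal sketch of the two layouts, the same small table can be flattened row by row or column by column. The table contents and the names rows, row_store and column_store are hypothetical and only serve this illustration; Python lists stand in for contiguous memory.

```python
# A small table with three attributes (hypothetical sample data).
rows = [
    ("DE", "Laptop", 1200),
    ("US", "Phone", 800),
    ("DE", "Phone", 750),
]

# Row storage: complete records are laid out one after another.
row_store = [value for record in rows for value in record]

# Column storage: all values of one attribute are laid out one after another.
column_store = {
    "country": [record[0] for record in rows],
    "product": [record[1] for record in rows],
    "price": [record[2] for record in rows],
}

print(row_store)
print(column_store["price"])
```

In the row layout the three prices are separated by other attributes, while in the column layout they sit next to each other.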
The combination of both storage mechanisms gives HANA an edge over other database systems. Some of the advantages of these storage mechanisms are:
- Flexibility and performance
- Better Compression
- Optimization through parallel processing
When to Use Row Store, when Column Store?
If you want to report on all the columns, the row store is more suitable, because reconstructing the complete row is one of the most expensive column store operations.
If a table has to be filled with a large amount of data that should be aggregated and analyzed, a column store is more suitable.
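The trade-off can be illustrated with a toy column store. The data and the helper names total and reconstruct_row are assumptions for this sketch. Aggregating one attribute reads a single column, while rebuilding one complete record needs a lookup in every column.

```python
# Toy column store (hypothetical data): one list per attribute.
columns = {
    "country": ["DE", "US", "DE", "FR"],
    "product": ["Laptop", "Phone", "Phone", "Tablet"],
    "price": [1200, 800, 750, 400],
}


def total(column_name):
    # Aggregation reads one column sequentially and ignores all the others.
    return sum(columns[column_name])


def reconstruct_row(index):
    # Rebuilding a full record needs one lookup in every column.
    return tuple(values[index] for values in columns.values())


print(total("price"))      # 3150
print(reconstruct_row(1))  # ('US', 'Phone', 800)
```

With only three columns the difference is small. In a wide table with hundreds of columns, every reconstructed row costs hundreds of separate lookups.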
Compressed data can be loaded into the CPU cache faster. This is because the limiting factor is the data transport between memory and CPU cache, and so the performance gain exceeds the additional computing time needed for decompression.
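One common way to compress a column is dictionary encoding, where each distinct value is stored once and the column itself holds only small integer IDs. The sketch below illustrates that general technique under simplified assumptions; it is not SAP HANA's actual compression code, and the function names are hypothetical.

```python
def dictionary_encode(values):
    """Replace each value with a small integer ID into a sorted dictionary."""
    dictionary = sorted(set(values))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[value] for value in values]


def dictionary_decode(dictionary, encoded):
    """Restore the original values from the dictionary and the ID list."""
    return [dictionary[i] for i in encoded]


country = ["DE", "US", "DE", "DE", "FR", "US", "DE"]
dictionary, encoded = dictionary_encode(country)

print(dictionary)  # ['DE', 'FR', 'US']
print(encoded)     # [0, 2, 0, 0, 1, 2, 0]
assert dictionary_decode(dictionary, encoded) == country
```

Columns with few distinct values benefit most, because the repeated values shrink to short IDs. Decoding is a plain index lookup, which is the small extra computing cost mentioned above.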
Column-based storage also allows execution of operations in parallel using multiple processor cores. In a column store, data is already vertically partitioned. This means that operations on different columns can easily be processed in parallel. If multiple columns need to be searched or aggregated, each of these operations can be assigned to a different processor core. In addition, operations on one column can be parallelized by partitioning the column into multiple sections that can be processed by different processor cores.
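The idea of partitioning one column into sections that are processed by different workers can be sketched as follows. The helper names split and parallel_sum are assumptions for this example. Python threads are used only to show the structure: in CPython they share one interpreter lock, so this sketch does not demonstrate a real multi-core speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split(column, parts):
    """Cut a column into at most `parts` contiguous sections."""
    size = max(1, -(-len(column) // parts))  # ceiling division, at least 1
    return [column[i:i + size] for i in range(0, len(column), size)]


def parallel_sum(column, workers=4):
    """Sum each section in its own worker, then combine the partial results."""
    sections = split(column, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, sections))
    return sum(partial_sums)


price = list(range(1, 1001))  # hypothetical column holding 1 .. 1000
print(parallel_sum(price))    # 500500
```

The same pattern applies across columns: each column to be searched or aggregated can be handed to a separate worker, because the columns are stored independently of each other.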
OLAP & OLTP in a single database
For SAP Business Suite on HANA, SAP is considering a new benchmark, because Suite on HANA combines OLAP and OLTP in what you could call OLEP (Online Everything Processing), which also includes text, semantic and predictive processing. This “everything” processing on a single system is not comparable to any other current database. A traditional benchmark on an SAP HANA-powered Business Suite system would therefore require isolated processing, independent of the parallel SAP HANA capabilities.