In-Memory Analytics and HANA
In-Memory Analytics
In-memory analytics is an approach to querying data while it resides in the main memory of the computer instead of on physical disks. This improves response times, which in turn supports faster business decisions. Besides very fast query responses, in-memory analytics can also eliminate tasks such as loading data into cubes/InfoProviders, building indexes, or maintaining aggregates. This reduces IT cost and allows faster implementations.
Need for in-memory analytics:
- Huge and ever-growing data volumes
- Increasing competition in business
- Need for faster decision-support systems
- Technical challenges in BI
- Availability of advanced 64-bit processors
High Performance Analytic Appliance (HANA)
HANA is an analytic appliance, a combination of software and hardware, that uses in-memory analytics to process business applications. It can analyze huge volumes of data in real time, which addresses one of today's key business challenges: analyzing the massive amount of data produced in daily business, which keeps growing day by day.
HANA includes an in-memory database that supports real-time activities and columnar storage of data. In short, it is an in-memory, columnar, real-time database.
Traditional database vs. HANA database:

| No. | Traditional Database | HANA Database |
|---|---|---|
| 1 | Three-layered approach | Two-layered approach |
| 2 | Time-consuming processing | Fast processing |
| 3 | Calculations are performed in the application after fetching data from the database | Performance-intensive and complex calculations are performed inside the database itself |
Row and Column storage
In a traditional database, data is stored in a two-dimensional structure of rows and columns, as shown below.
| EMP ID | First Name | Last Name | DOB | City | Gender |
|---|---|---|---|---|---|
| 101 | Ram | Jadhav | 04-01-1980 | Pune | Male |
| 102 | Bunty | Patil | 13-08-1983 | Mumbai | Male |
| 103 | Sheela | Patil | 21-11-1985 | Pune | Female |
| 104 | Deepak | Kore | 16-05-1986 | Pune | Male |
| … | … | … | … | … | … |
Such data is organized serially, tuple by tuple. Retrieving a whole record is easy, but analyzing the data is a time-consuming process. When the data volume is huge, it must be stored on large hard drives.
Row data layout
- Data is stored tuple wise
- Easy access to whole records
- Higher cost for single-attribute calculations
- Example:
| Rec ID | EMP ID | First Name | Last Name | DOB | City | Gender |
|---|---|---|---|---|---|---|
| 1 | 101 | Ram | Jadhav | 04-01-1980 | Pune | Male |
| 2 | 102 | Bunty | Patil | 13-08-1983 | Mumbai | Male |
| 3 | 103 | Sheela | Patil | 21-11-1985 | Pune | Female |
| 4 | 104 | Deepak | Kore | 16-05-1986 | Pune | Male |
| … | … | … | … | … | … | … |
Data is stored tuple by tuple. It is easy to access a whole record by its key, but analyzing the data is costly.
Now, if I want to calculate how many employees are from Pune, it works as follows:
- The whole table is loaded into main memory.
- The City attribute is scanned to obtain the count.
- Only one of the six loaded columns is actually used; the other five are loaded into main memory unnecessarily.
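The steps above can be sketched in a few lines of Python. The sample data below is hypothetical, taken from the example table; each record is one tuple, mimicking a row-oriented layout:

```python
# Row-oriented layout: the table is a list of whole tuples (records).
rows = [
    (101, "Ram",    "Jadhav", "04-01-1980", "Pune",   "Male"),
    (102, "Bunty",  "Patil",  "13-08-1983", "Mumbai", "Male"),
    (103, "Sheela", "Patil",  "21-11-1985", "Pune",   "Female"),
    (104, "Deepak", "Kore",   "16-05-1986", "Pune",   "Male"),
]

# Every whole tuple must be touched, even though only the City field
# (index 4) is needed for the count.
pune_count = sum(1 for row in rows if row[4] == "Pune")
print(pune_count)  # 3
```

Note that the scan iterates over complete records; the other five attributes of each tuple sit in memory without ever being used.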
Column data layout
- Data is stored attribute wise
- Need reconstruction to access whole record
- Lower cost for attribute calculations
- Example
| Rec ID | EMP ID | Rec ID | First Name | Rec ID | Last Name | Rec ID | DOB | Rec ID | City | Rec ID | Gender |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 101 | 1 | Ram | 1 | Jadhav | 1 | 04-01-1980 | 1 | Pune | 1 | Male |
| 2 | 102 | 2 | Bunty | 2 | Patil | 2 | 13-08-1983 | 2 | Mumbai | 2 | Male |
| 3 | 103 | 3 | Sheela | 3 | Patil | 3 | 21-11-1985 | 3 | Pune | 3 | Female |
| 4 | 104 | 4 | Deepak | 4 | Kore | 4 | 16-05-1986 | 4 | Pune | 4 | Male |
| … | … | … | … | … | … | … | … | … | … | … | … |
Data is stored attribute by attribute. It is easy to access and analyze individual attributes, but reconstructing a whole record is costly.
Now, if I want to calculate how many employees are from Pune, it works as follows:
- Only the column named “City” is loaded into main memory, and the count is calculated.
- Loading of unnecessary data is avoided.
- To reconstruct a whole record, the Rec ID is used to collect the matching value from each attribute column.
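The same count in a column-oriented layout can be sketched as follows; each attribute is its own list, and the list index plays the role of the Rec ID (the data is the same hypothetical sample as above):

```python
# Column-oriented layout: one list per attribute, aligned by position.
emp_id     = [101, 102, 103, 104]
first_name = ["Ram", "Bunty", "Sheela", "Deepak"]
last_name  = ["Jadhav", "Patil", "Patil", "Kore"]
dob        = ["04-01-1980", "13-08-1983", "21-11-1985", "16-05-1986"]
city       = ["Pune", "Mumbai", "Pune", "Pune"]
gender     = ["Male", "Male", "Female", "Male"]

# Only the City column is scanned; the other five columns stay untouched.
pune_count = city.count("Pune")
print(pune_count)  # 3

# Reconstructing a whole record needs one lookup per column, using the
# shared position (Rec ID) — here, the record for EMP ID 103.
i = emp_id.index(103)
record = (emp_id[i], first_name[i], last_name[i], dob[i], city[i], gender[i])
```

This illustrates the trade-off: the attribute calculation touches one column only, while record reconstruction must visit all six columns.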
Replication Process
Replication is the process of copying data from a source system into the SAP HANA target database.
There are three types:
- Trigger based replication
- ETL based replication
- Log based replication
Trigger based replication:
This method uses SAP Landscape Transformation (LT) replication to move data from the source system to the SAP HANA target database. SAP LT ensures full data consistency and integrity during data migration.
ETL based replication
This method uses SAP BusinessObjects Data Services to extract, transform, and load the relevant data from the source system into the SAP HANA target system.
Log based replication
This method uses Sybase replication to load data from the source system into the target SAP HANA database.