SAP HANA: The Row Store, Column Store and Data Compression
Here is an attempt to explain the row store data layout, column store data layout and the data compression technique.
Row Store : Here all data belonging to a row is placed next to each other. See the example below.
Table 1 :
Name | Location | Gender |
….. | ….. | …. |
Sachin | Mumbai | M |
Sania | Hyderabad | F |
Dravid | Bangalore | M |
……. | …… | …… |
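The row store corresponding to the above table places the values record by record:
…, Sachin, Mumbai, M, Sania, Hyderabad, F, Dravid, Bangalore, M, …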
Column store : Here the contents of a column are placed next to each other. For Table 1 the values are laid out column by column:
…, Sachin, Sania, Dravid, …, Mumbai, Hyderabad, Bangalore, …, M, F, M, …
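The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the ordering of values, not of how HANA manages memory internally; the function names `row_store_layout` and `column_store_layout` are made up for this example, and only the three rows visible in Table 1 are used.

```python
# The three visible rows of Table 1 (elided rows are left out).
rows = [
    ("Sachin", "Mumbai", "M"),
    ("Sania", "Hyderabad", "F"),
    ("Dravid", "Bangalore", "M"),
]


def row_store_layout(rows):
    """Flatten record by record: all fields of one row sit next to each other."""
    return [value for row in rows for value in row]


def column_store_layout(rows):
    """Flatten column by column: all values of one column sit next to each other."""
    return [value for column in zip(*rows) for value in column]


row_layout = row_store_layout(rows)
column_layout = column_store_layout(rows)

print(row_layout)
print(column_layout)
```

In the column layout, reading every value of Gender means scanning one contiguous run, whereas in the row layout the same values are interleaved with Name and Location.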
Data Compression : SAP HANA provides a series of data compression techniques that can be used for data in the column store. To store the contents of a column, the HANA database creates at least two data structures: a dictionary vector and an attribute vector. See Table 2 below and the corresponding column store.
Table 2.
Record | Name | Location | Gender |
….. | ….. | ….. | …. |
3 | Blue | Mumbai | M |
4 | Blue | Bangalore | M |
5 | Green | Chennai | F |
6 | Red | Mumbai | M |
7 | Red | Bangalore | F |
…… | ….. | …… | …… |
In the above example, the column ‘Name’ has the repeating values ‘Blue’ and ‘Red’. The columns ‘Location’ and ‘Gender’ contain repeated values as well. The dictionary vector stores each distinct value of a column only once, in sorted order, and maintains a position for each value. For the above example, the dictionary vectors of Name, Location and Gender could be as follows.
Dictionary vector : Name
Name | Position |
…. | …… |
Blue | 10 |
Green | 11 |
Red | 12 |
….. | …… |
Dictionary vector : Location
Location | Position |
…. | …… |
Bangalore | 3 |
Chennai | 4 |
Mumbai | 5 |
….. | …… |
Dictionary vector : Gender
Gender | Position |
F | 1 |
M | 2 |
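As a rough sketch of how such a dictionary vector could be derived, the Python below collects the distinct values of a column, sorts them and assigns positions. The helper `build_dictionary` and the choice of positions starting at 1 are assumptions made for this illustration, not HANA's actual implementation. Because only the five visible records of Table 2 are used, Name and Location get positions starting at 1 rather than the 10 to 12 and 3 to 5 shown above, where elided entries sort earlier.

```python
def build_dictionary(column_values):
    """Map each distinct value to its position in sorted order (positions start at 1)."""
    distinct_sorted = sorted(set(column_values))
    return {value: position for position, value in enumerate(distinct_sorted, start=1)}


# Column contents of records 3 to 7 in Table 2.
names = ["Blue", "Blue", "Green", "Red", "Red"]
locations = ["Mumbai", "Bangalore", "Chennai", "Mumbai", "Bangalore"]
genders = ["M", "M", "F", "M", "F"]

name_dictionary = build_dictionary(names)
location_dictionary = build_dictionary(locations)
gender_dictionary = build_dictionary(genders)

print(name_dictionary)
print(location_dictionary)
print(gender_dictionary)
```

Each value is stored once no matter how often it repeats in the column, which is where the saving comes from when a column has few distinct values.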
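Now the attribute vector corresponding to the above table would be as follows. It stores integer values, which are the positions in the dictionary vector. For records 3 to 7:
Name : …, 10, 10, 11, 12, 12, …
Location : …, 5, 3, 4, 5, 3, …
Gender : …, 2, 2, 1, 2, 1, …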
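The encoding step can be sketched as follows, taking the dictionary positions exactly as listed in the dictionary vector tables above. The helpers `encode_column` and `decode_column` are hypothetical names for this illustration. The decode step is included only to show that the original column can be recovered from the two structures.

```python
# Dictionary vectors with the positions given in the tables above.
name_dictionary = {"Blue": 10, "Green": 11, "Red": 12}
location_dictionary = {"Bangalore": 3, "Chennai": 4, "Mumbai": 5}
gender_dictionary = {"F": 1, "M": 2}

# Records 3 to 7 of Table 2 as (Name, Location, Gender).
records = [
    ("Blue", "Mumbai", "M"),
    ("Blue", "Bangalore", "M"),
    ("Green", "Chennai", "F"),
    ("Red", "Mumbai", "M"),
    ("Red", "Bangalore", "F"),
]


def encode_column(values, dictionary):
    """Replace every value by its position in the dictionary vector."""
    return [dictionary[value] for value in values]


def decode_column(attribute_vector, dictionary):
    """Look the positions up again to recover the original values."""
    position_to_value = {position: value for value, position in dictionary.items()}
    return [position_to_value[position] for position in attribute_vector]


# Split the records into one list per column.
names, locations, genders = (list(column) for column in zip(*records))

name_vector = encode_column(names, name_dictionary)
location_vector = encode_column(locations, location_dictionary)
gender_vector = encode_column(genders, gender_dictionary)

print(name_vector)
print(location_vector)
print(gender_vector)
```

The attribute vector holds one small integer per record instead of the full value, and the dictionary holds each distinct value once.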
Hi Praveen,
Although the document started off nicely, it seems incomplete and a bit misleading. You do not tell us what exactly the 'Data Compression' stated in the document title is. If this is part of a series of documents, please mention it here.
Thanks, Benedict, for the comments.
There will be successive documents covering aspects of ABAP on HANA.
Not sure I get the point of having yet another "here's what's different between row and column store tables" piece.
By now there are so many very good and detailed explanations available (e.g. SAP HANA Administration Guide - SAP Library) that this posting unfortunately doesn't add much value.
Also, and that's worse in my eyes: it leaves many questions open. Why is the column store advantageous? When is it not? What about these 'vectors' - why are those important?
I highly recommend getting up to the level of the conversation first, e.g. by checking Why SAP HANA? - YouTube (column vs row store is discussed around min 44), and then focusing your writing on a specific aspect that interests you.
I'm sure there's more that you can tell us about your insights into this topic.
Go for it!