Summary
Introduction - First Thoughts
SAP produce a brief guide to this - http://help.sap.com/hana/hana_db_part_en.pdfInterested to find out how/why you might deploy explicit partitioning in HANA.Now I understand that HANA will dynamically partition tables to a query - i.e. to use as many cores as possible at 2Gb/sec per core. Tables need to be partitioned to spread them across mutile hosts. So why else might you partition?SAP list the 'usual' partitioning benefits -
All well and good. OK - a few HANA only considerations :-
* Delta Merge? - a key part of the design of HANA and a topic in itself. To summarise in one sentence - data changes are stored in a separate area, with its own data dictionary; this area is periodically merged with the main store (Shall we say HANA is fast - but a dictionary encoded column store is going to have to compromise somewhere ...)
** OK. Is that where 'we' get to improve parallelization - by splitting tables across nodes? I.e. HANA can partition tables 'on the fly' WITHIN A NODE ; to use more than one node for a query we need to distribute data across nodes. That makes sense.2 Billion row limit? I'm guessing thats a facet of a dictionary encoded column store which has restricted itself to integer for 'rowids'? This initially sounds a bit restrictive. However I think when you get to billion row plus tables you'd be considering partitioning on most RDBMS systems. And of course with HANA we need to consider Delta Merge processing. Also big tables are going to reside in big systems - which are likely to have more than one node -and you'd probably want to split that big table across those nodes. So this row limit is not looking like an issue. With billion row plus tables you'd probably have gone for partitioning for other reasons - not because of the 2 billion row limit.
Types of Partitioning
HANA provides two categories of table partitions ; Single Level and Multi-Level.Lets look at Single Level first.This looks pretty familiar (I'm talking from a Sybase ASE/SQL Server background) :-
Syntax is simple. You can specify the number of partitions or - and this seems to be SAPs recommendation - use the GET_NUM_SERVERS() function.A restriction in Single Level table partitions is that the columns that are used to partition the table must be part of of the primary key (except hash partitions - which cannot have a primary key defined)
Multi-Level
Slightly harder to visualise than Single Level partitioning, Multi-Level partitioning :-
Three types of Multi-Level partition :-
I can see that Hash-Range allows for maximum parallelization - i.e. highly likely to engage all nodes. Certainly Delta Merge will be completed on all nodes. So if data modifications are restricted to the current month (or quarter etc.) the Delta Merge should be consistent AND will utilize all available nodes - so it's completed as fast as possible
Partition Handling
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
37 | |
10 | |
5 | |
4 | |
4 | |
4 | |
3 | |
3 | |
3 | |
2 |