NVM – HANA game changer?
Recently, especially in the last few days during Sapphire, the topic of non-volatile memory (NVM) was hot as hell. A lot of people called it a game changer in the context of HANA.
First of all, what is NVM? It is another type of memory which is non-volatile, as the name already tells us. But what about size, speed and cost? It will be available in sizes from 128GB to 1TB. It will be 20-25 times faster than a normal HDD, a little slower than DRAM (DDR4), and roughly half the price. So it avoids disk access, which makes life easier for HANA in high-I/O situations like startup. Big HANA systems (>10TB) have a long start time of about 10-15 minutes. With NVM, for instance Intel's 3D XPoint as introduced at Sapphire, this can be reduced to under one minute!
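To get a feeling for these restart numbers, here is a small back-of-the-envelope sketch (an illustration only, not a vendor spec): what aggregate read throughput would be needed to load a 10TB system in the stated times?

```python
def required_throughput_gb_s(data_tb: float, minutes: float) -> float:
    """Aggregate read throughput (GB/s) needed to load data_tb TB in the given minutes."""
    return data_tb * 1024 / (minutes * 60)

# 10TB system restarted from disk in 12 minutes (mid-range of the 10-15min above)
print(f"{required_throughput_gb_s(10, 12):.1f} GB/s")
# The same system restarted in one minute, as claimed for NVM
print(f"{required_throughput_gb_s(10, 1):.1f} GB/s")
```

Loading 10TB in around 12 minutes already implies roughly 14 GB/s of sustained throughput; getting under one minute implies well over 100 GB/s, which is why avoiding the reload entirely matters more than faster disks.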
Another example is data aging. With this feature (only available with HANA) it is possible to unload big parts of tables in the shape of partitions, which lowers the memory footprint of HANA. It is NOT archiving! The data is just unloaded from memory to disk, while the hottest / most used data stays in memory. If your hardware is at its maximum, archiving and data aging are the housekeepers for your system. But every access to the unloaded partitions means expensive disk I/O. This can be sped up with NVM.
If your systems are at their maximum RAM extension despite your housekeeping tasks, NVM can be a valid and cheap alternative from a TCO perspective compared to buying new hardware.
Other reasons besides high performance or avoiding new hardware purchases I currently can't imagine in the context of HANA. When does expensive I/O take place in a HANA system? Mostly at starting, stopping, delta merges, log writes, shadow pages and savepoints.
1) Delta Merges
If delta merges are taking too long, just think about your partitioning. Every partition has its own delta store.
2) Savepoints
Analyze the savepoints in your system with the SQL collection (note 1969700). You will see that most of them are in the range of 10-1000ms. If they are slower, you have to analyze the reason, but normally this should not influence end users' performance, because the real issue is the lock time in the critical phase of the savepoint. This phase should be as short as possible.
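A trivial way to do this check, assuming you have exported the savepoint durations from the SQL collection into a list (the sample values below are invented for illustration):

```python
# Savepoint durations in milliseconds, e.g. exported from the SQL collection
# of note 1969700 (these particular values are made up)
durations_ms = [25, 180, 640, 950, 2400, 75]

# The text above gives 10-1000ms as the normal range
normal = [d for d in durations_ms if 10 <= d <= 1000]
slow = [d for d in durations_ms if d > 1000]

print(f"normal: {len(normal)}, slow (needs analysis): {len(slow)}")
```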
3) Log writes
So far I have never seen a system that met the HWCCT KPIs and still had issues with log writing, unless the disk was full – no matter whether flash drives or hard disks are used.
4) Starting / Stopping
When you have an SLA to keep downtime as low as possible, you use the DBSL suspend feature together with system replication. In other cases the start and stop times are not interesting.
5) Load / Unload
Yes, this could be another use case. Data which was unloaded because of its unload priority, or which has never been loaded since the last restart, has to be loaded from disk on first access. But hot data should always stay in memory, even during a lazy load after a restart. It can have an impact, but if your system has been running for a while and is used by end users as it should be, there is no need to optimize this, because the data is already available fast in DRAM.
6) Data Aging
Yes, this part can really profit from NVM technology, but you have to analyze how often this "old" data is actually accessed. If it is accessed frequently, you should adjust the threshold and load it back into memory. I think this is also not a pain point.
7) Shadow pages
They are only used in rollback scenarios or for long-running uncommitted transactions, which should not be the case all the time.
The other thing is that NVM uses the DDR bus to achieve high performance. This means you must free up some slots, which lowers your main memory (DDR4). This may not be a big deal right now, but when Skylake processors are delivered with your new Intel hardware, the number of DDR slots will be cut in half. With a 4-socket Haswell/Broadwell server you can currently address a maximum of 12TB via 96 DIMMs with 128GB each. However, only 1TB per socket is HANA-certified, which means a maximum of 4TB with 4 sockets. All other servers which address more memory have more sockets.
So a Skylake server with 4 sockets can address a maximum of 6TB via 48 DIMMs with 128GB each. When you use some of these DIMMs for NVM, you cut down the main memory once more.
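The slot math above can be sketched in a few lines (slot and DIMM counts are taken from the text as assumptions about then-current hardware, not from a spec sheet):

```python
def max_memory_tb(sockets: int, dimms_per_socket: int, dimm_gb: int) -> float:
    """Maximum addressable memory in TB for a given slot configuration."""
    return sockets * dimms_per_socket * dimm_gb / 1024

# 4-socket Haswell/Broadwell: 96 DIMM slots total (24 per socket), 128GB DIMMs
print(max_memory_tb(4, 24, 128))  # 12.0 TB

# 4-socket Skylake: 48 DIMM slots total (12 per socket), 128GB DIMMs
print(max_memory_tb(4, 12, 128))  # 6.0 TB
```

Every slot reassigned to an NVM DIMM subtracts directly from the DRAM side of this calculation.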
Yes, NVM will change systems and their usage dramatically, but in the context of HANA and with the upcoming Skylake processors I don't see a use case besides high performance and extending hardware when it reaches its maximum. Speeding up the start and stop times of a HANA system is not a valid TCO use case for NVM. Most operations are still performed in DRAM, and DRAM is still faster. OK, it is more expensive and currently limited to 128GB per DIMM, but NVM will speed up disk I/O, not extend main memory. Maybe SAP will also certify and limit NVM 🙂
More details: HANA: First adaption with NVM
There seems to be a very significant misconception about the way HANA will use NVM technology in your post.
1) How NVM is used by HANA
SAP HANA will treat NVRAM a lot like classical DRAM. More precisely, HANA will put the entire column store MAIN into NVRAM instead of DRAM. That means you're not losing any memory capacity; you're just shifting it from DRAM to NVRAM. With NVRAM DIMMs potentially being much larger, you will actually gain capacity.
2) How NVM is not used by HANA
NVM is not used as an accelerator for persistency access (i.e., to speed up disk I/O). The persistency stays just the way it is today.
3) Higher availability due to shorter restart times
If data is not unloaded from memory at database or server shutdown, it doesn’t have to be reloaded at startup. That means maximum performance right from the start and no reload from persistency at all for the vast majority of data – the column store MAIN.
4) Lower TCO and higher capacity
The expectation is that NVRAM DIMMs will be both larger and cheaper than DRAM DIMMs. That means you can have higher capacity at a lower cost.
Hope that brings some more clarity into the picture!
Thanks for sharing the SAP view as product manager. But with the current state of the HANA core it is not possible to use it as you described.
=> You are losing DRAM capacity because NVRAM DIMMs use the DRAM slots, which means you lose performance. The upcoming Intel mainboards have fewer DRAM slots than the current ones. Of course you gain capacity with NVRAM, but you lose performance. So you would need tiering: tables that need high performance stay in normal DRAM, and the warm tables go into NVRAM.
You have two variants for NVRAM usage:
- use it like classical DRAM
- use it as a filesystem
Currently there is no RAM tiering available which can keep hot tables in DRAM and warm tables in NVRAM. Storage tiering within one filesystem is currently not officially supported by SAP, so why should DRAM tiering be supported? In the current state of the HANA core it is not possible to use NVRAM as you described.
=> The first tests by SAP and Intel use it as a persistency layer to accelerate the start time.
=> Every table and memory part in HANA has an unload priority; if it is not frequently used, it will be unloaded. It is possible that this current design will have to be adjusted for NVM, but right now it can be a performance killer for this usage. Details on this feature can be found in another blog: "HANA – table unload priorities"
=> Higher memory capacity in total, but slower
Currently, 12 DRAM slots per socket for Skylake result in a maximum of 3TB DRAM when using 2 sockets with 128GB DIMMs.
2 sockets => 3TB fully DRAM-equipped, or with 1 socket on 512GB NVRAM DIMMs => 1.5TB DRAM & 6TB NVRAM = 7.5TB
4 sockets => 6TB fully DRAM-equipped, or with 2 sockets on 512GB NVRAM DIMMs => 3TB DRAM & 12TB NVRAM = 15TB
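These mixed configurations can be reproduced with a short sketch (slot counts and DIMM sizes are the assumptions stated in the text):

```python
SLOTS_PER_SOCKET = 12   # Skylake DRAM slots per socket, per the text
DRAM_DIMM_GB = 128
NVRAM_DIMM_GB = 512

def mixed_capacity_tb(dram_sockets: int, nvram_sockets: int):
    """DRAM, NVRAM and total capacity in TB for a mixed DIMM population."""
    dram = dram_sockets * SLOTS_PER_SOCKET * DRAM_DIMM_GB / 1024
    nvram = nvram_sockets * SLOTS_PER_SOCKET * NVRAM_DIMM_GB / 1024
    return dram, nvram, dram + nvram

print(mixed_capacity_tb(1, 1))  # 2-socket server: (1.5, 6.0, 7.5)
print(mixed_capacity_tb(2, 2))  # 4-socket server: (3.0, 12.0, 15.0)
```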
Yes, of course you get higher capacity, but also lower performance. Currently there is no possibility for RAM placement of tables in HANA, as already explained. If this becomes possible in the future, it could be a great chance to lower the TCO, but not when using NVRAM as a filesystem. Another thing is the core-to-memory sizing ratios. These ratios also have to be adjusted; otherwise it is neither supported nor usable.