NVM - HANA game changer?

jgleichmann · ‎05-26-2017

Lately, and especially over the last few days during Sapphire, non-volatile memory (NVM) has been a hot topic. A lot of people have called it a game changer in the context of HANA.

First of all, what is NVM? It is another type of memory which is non-volatile, as the name already tells us. But what about size, speed and cost? It will be available in sizes from 128GB to 1TB. It will be 20-25 times faster than a normal HDD and a little slower than DRAM (DDR4), at roughly half the price. So it avoids disk access, which makes life easier for HANA in high I/O situations such as startup. Big HANA systems (>10TB) have a long start time of about 10-15 minutes. With NVM, for instance Intel's 3D XPoint as introduced at Sapphire, this can be reduced to under 1 minute!
Another example is data aging. With this feature (HANA only) it is possible to unload big parts of the data in the form of partitions, which lowers the memory footprint of HANA. It is NOT archiving! The data is just unloaded from memory to disk, and the hottest / most used data stays in memory. If your hardware is at its maximum, archiving and data aging are the housekeepers for your system. But every access to the unloaded partitions means expensive disk I/O. This can be sped up with NVM.

If your systems have reached their maximum RAM expansion in spite of your housekeeping tasks, NVM can be a valid and cheap alternative from a TCO perspective compared to buying new hardware.

Apart from high performance and avoiding the purchase of new hardware, I currently can't think of other reasons in the context of HANA. When does expensive I/O take place in a HANA system? Mostly during starting, stopping, delta merges, log writes, shadow pages and savepoints.

1) Delta Merges
If delta merges are taking too long, just think about your partitioning. Every partition has its own delta store.

2) Savepoints
Analyze your savepoints in the system with the SQL collection (note 1969700). You will see that most of them are in the range of 10-1000ms. If they are slower you have to analyze the reason, but normally this should not influence end-user performance, because the real issue is the lock time in the critical phase of the savepoint. This phase should be as short as possible.

3) Log writes
So far I have never seen a system which met the HWCCT KPIs and still had issues with log writing, unless the disk was full, no matter whether flash drives or hard disks were used.

4) Starting / Stopping
If your SLA requires keeping downtime as low as possible, you use the DBSL suspend feature with system replication. In other cases the start and stop times are not interesting.

5) Load / Unload
Yes, this could be another use case. Data which was unloaded because of its unload priority, or which has not been loaded since the last restart, has to be loaded from disk on first access. But hot data should always stay in memory, also during a lazy load after a restart. It can have an impact, but if your system has been running for a while and is used by end users as it should be, there is no need to optimize performance because the data is already quickly available in DRAM.

6) Data Aging
Yes, this part can really profit from NVM technology, but you have to analyze how often this "old" data is accessed. If it is accessed frequently you should adjust the threshold and load it back into memory. I think this is also not a pain point.

7) Shadow pages
They are only used in rollback scenarios or long-running uncommitted transactions, which should not be the case all the time.
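
To make the savepoint check from point 2 a bit more concrete, here is a minimal Python sketch of how exported savepoint timings could be triaged. The 1000ms limit for the overall duration comes from the range mentioned above. The `Savepoint` record, the `triage` function and the 100ms limit for the critical phase are assumptions for illustration only, not SAP values, and the sample numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class Savepoint:
    """Hypothetical record for one savepoint, e.g. exported from a monitoring query."""
    total_ms: float     # overall savepoint duration
    critical_ms: float  # duration of the blocking critical phase


def triage(sp: Savepoint,
           total_limit_ms: float = 1000.0,    # upper end of the 10-1000ms range
           critical_limit_ms: float = 100.0   # assumed limit, adjust to your system
           ) -> str:
    """Classify a savepoint; the critical phase is checked first because it blocks users."""
    if sp.critical_ms > critical_limit_ms:
        return "critical phase too long"
    if sp.total_ms > total_limit_ms:
        return "slow, analyze the reason"
    return "ok"


# Made-up sample values, only to show the three outcomes.
for sp in (Savepoint(500, 5), Savepoint(4000, 8), Savepoint(900, 350)):
    print(sp, "->", triage(sp))
```

The order of the checks reflects the argument above: a long savepoint is worth a look, but only a long critical phase is something end users actually feel.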

The other thing is that NVM uses the DDR bus to achieve high performance. This means that you must free some slots, which lowers your main memory (DDR4). This may currently not be a big deal, but once Skylake processors are delivered in your new Intel hardware, the number of DDR slots will be cut in half. So with a 4-socket Haswell/Broadwell server you can currently address a max. of 12TB via 96 DIMMs with 128GB each. Only 1TB per socket is HANA certified, which means a max. of 4TB with 4 sockets. All other servers which address more memory have more sockets.
So a Skylake server with 4 sockets can address a max. of 6TB via 48 DIMMs with 128GB each. When you use some of the DIMM slots for NVM you cut down the memory once more.
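
The slot arithmetic can be checked with a few lines of Python. This is only a sketch: the function name `dram_capacity_tb` is made up, and the example of giving 16 slots to NVM is a hypothetical split, not a vendor configuration.

```python
def dram_capacity_tb(dimm_slots: int, dimm_size_gb: int = 128, nvm_slots: int = 0) -> float:
    """DRAM capacity in TB that remains when nvm_slots of the DIMM slots hold NVM."""
    if not 0 <= nvm_slots <= dimm_slots:
        raise ValueError("nvm_slots must be between 0 and dimm_slots")
    return (dimm_slots - nvm_slots) * dimm_size_gb / 1024


print(dram_capacity_tb(96))                # 4-socket Haswell/Broadwell: 12.0
print(dram_capacity_tb(48))                # 4-socket Skylake: 6.0
print(dram_capacity_tb(48, nvm_slots=16))  # hypothetical: 16 slots used for NVM: 4.0
```

In this hypothetical split the Skylake box ends up with the same 4TB of DRAM that is HANA certified for 4 sockets today, so every further slot given to NVM costs certified main memory.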

Summary
Yes, NVM will change systems and their usage dramatically, but in the context of HANA and with the upcoming Skylake processors I don't see a use case besides high performance and extending hardware that is reaching its maximum. Speeding up the start and stop times of a HANA system is not a valid TCO use case for NVM. Most operations are still performed in DRAM, and this is still faster. OK, DRAM is more expensive and currently limited to 128GB per DIMM, but NVM will speed up disk I/O, not extend main memory. Maybe SAP will also certify and limit NVM 🙂

More details: HANA: First adaption with NVM
