SAP HANA & Persistent Memory
Moderation Note: This blog post is no longer updated. A successor has published a newer version; please refer to that version for the latest information.
Over the last few months, Intel has been releasing more and more information on its new memory technology, including its official name: Intel® Optane™ DC persistent memory. The promise is nothing short of reimagining the data center memory and storage hierarchy.
Intel is introducing a new memory tier between byte-addressable DRAM and block-based SSD storage devices. They are approaching this new tier from both directions: with faster block devices and, far more innovative, with byte-addressable main memory in a DIMM form factor that is persistent. “Persistent” means that it does not lose stored information, even after a reboot of the server.
SAP is very proud that SAP HANA is the first major database platform that is specifically optimized for Intel® Optane™ DC persistent memory. SAP and Intel’s strong collaboration over many years made this possible. SAP HANA has been optimized since HANA 2.0 SPS 03 to leverage the unique characteristics of this technology – even though the hardware is not yet available on the market.
SAP HANA stays in control
Most applications rely solely on the operating system for memory allocation and management. Of course, SAP HANA needs to allocate memory from the operating system just like every other application. Once allocated, SAP HANA prefers to exert a much higher degree of control over its memory management. The reason for that is simple: It allows for a much higher degree of optimization. This is especially important for an in-memory database like SAP HANA. This paradigm extends seamlessly to persistent memory.
In other words: SAP HANA knows which data structures benefit most from persistent memory. SAP HANA automatically detects persistent memory hardware and adjusts itself by automatically placing these data structures on persistent memory, while all others remain in DRAM.
This is possible via App Direct mode – one of the two operation modes of persistent memory that Intel publicly revealed just a few days ago. App Direct mode allows applications to store data persistently on persistent memory. Memory Mode – not used by HANA! – doesn’t offer this data persistency, but offers cheaper and/or larger main memory.
Using App Direct mode, persistent memory is especially suited for non-volatile data structures. It is no coincidence that “non-volatile memory” and “non-volatile RAM” are often used synonymously with persistent memory.
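To make the App Direct idea a bit more tangible: on Linux, persistent memory in this mode is typically exposed to applications as memory-mapped files on a DAX-enabled filesystem. The following is only a minimal sketch of that access pattern, not how SAP HANA implements it. It uses an ordinary file as a stand-in, so it runs on any filesystem; the function names and the file name are made up for illustration.

```python
import mmap
import os
import tempfile


def write_persistent(path: str, payload: bytes, size: int = 4096) -> None:
    """Map a file into the address space and store bytes with plain memory writes.

    On a DAX-mounted persistent memory filesystem the mapping would point
    straight at the DIMMs; here a regular file stands in for it (assumption).
    """
    if len(payload) > size:
        raise ValueError("payload does not fit into the mapped region")
    with open(path, "w+b") as f:
        f.truncate(size)                      # reserve the region
        with mmap.mmap(f.fileno(), size) as region:
            region[:len(payload)] = payload   # byte-addressable store
            region.flush()                    # make sure the data is durable


def read_persistent(path: str, length: int) -> bytes:
    """Re-map the file, e.g. after a restart, and read the bytes back."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as region:
            return bytes(region[:length])


if __name__ == "__main__":
    # Hypothetical file name; a real setup would use a path on a DAX mount.
    path = os.path.join(tempfile.mkdtemp(), "main_fragment.bin")
    write_persistent(path, b"column store main")
    print(read_persistent(path, 17))
```

The point of the sketch is the access pattern: data is written with ordinary memory operations, and it is still there when the region is mapped again later.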
Given these characteristics, an excellent candidate for placement in persistent memory is the column store main. It is heavily optimized in terms of compression, leading to a very stable – non-volatile – data structure. The main store typically contains well over 90% of the data footprint in most SAP HANA databases, which means it offers a lot of potential. Furthermore, it is reconstructed only rarely, namely during the delta merge, a process that is triggered only after a certain threshold of changes to the database table has been reached. For most tables, a delta merge does not happen more than once a day.
This design fits SAP HANA’s architecture perfectly. The separation of write-optimized delta and read-optimized main stores and the characteristics of both are a perfect match to the respective strengths of DRAM and persistent memory.
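The split between delta and main can be illustrated with a small toy model. This is a sketch under simplifying assumptions, not SAP HANA's actual merge logic: the class name, the fixed change-count threshold, and the use of sorting as a stand-in for the compression and reorganization of the main store are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnFragmentPair:
    """Toy model of one column with a write-optimized delta and a read-optimized main."""

    merge_threshold: int = 3          # hypothetical: merge after this many changes
    main: tuple = ()                  # stable between merges (candidate for persistent memory)
    delta: list = field(default_factory=list)   # absorbs all writes (stays in DRAM)
    merges: int = 0

    def insert(self, value) -> None:
        # Every write goes to the delta only; the main is never modified in place.
        self.delta.append(value)
        if len(self.delta) >= self.merge_threshold:
            self.delta_merge()

    def delta_merge(self) -> None:
        # Rebuild the main from old main + delta, then start with an empty delta.
        # Sorting stands in for the real compression/reorganization step.
        self.main = tuple(sorted(self.main + tuple(self.delta)))
        self.delta = []
        self.merges += 1

    def scan(self) -> list:
        # Reads have to consider both fragments.
        return list(self.main) + list(self.delta)


if __name__ == "__main__":
    column = ColumnFragmentPair()
    for value in [5, 3, 9, 1, 7, 2, 8]:
        column.insert(value)
    print(column.main, column.delta, column.merges)
```

In this model the main is rewritten only twice for seven inserts, which is the property that makes it a good fit for a medium that favors reads over writes.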
The persistence layer remains
Although the majority of data will be stored “persistently” in persistent memory, other key features of SAP HANA rely on the persistence layer on traditional persistent storage, e.g. SSDs. This includes the row store and the column store delta, as well as system replication and database backups. With SAP HANA’s shared-nothing architecture, this also has an impact on auto-host failover, since the persistent memory of an inactive host cannot be re-assigned to an active one. There’s also the cost aspect. Persistent memory will be cheaper than DRAM, but it will still be more expensive than traditional persistent storage devices. SAP HANA employs techniques to reduce the memory footprint of memory-hungry data types, for example, BLOB (Binary Large Object). Keeping such data in memory – even in persistent memory – would increase the cost unnecessarily.
Why should I care?
At this stage, I’d like to say a few words on how the whole persistent memory story benefits you as a customer running SAP HANA.
Increased main memory capacity
Not long ago, Intel revealed that persistent memory will be available in capacities up to 512 GB per module at launch. Compared to current DRAM – which is available in sizes up to 128 GB – this is already an increase by a factor of four. Considering that most systems don’t even use 128 GB DRAM DIMMs these days, due to limited supply and a corresponding price tag, the potential gain is even higher.
It is not hard to conclude that the overall capacity of SAP HANA will benefit greatly from this technology. Current technical boundaries will be pushed back, and even if your SAP HANA instance doesn’t need higher capacity, you will still benefit from the expected lower price of persistent memory compared to DRAM.
Although persistent memory is not a data tiering solution, its higher capacity will influence data tiering anyway by raising the threshold at which it becomes necessary, possibly to a level where you can simply ignore it and keep all your data where it is – in main memory with maximum performance.
Data loading at startup
Something quite unique to SAP HANA is the consistent implementation of its “in-memory first” paradigm. All database operations are performed directly on the in-memory data structures, instead of first applying everything to persistent block-based storage (e.g., SSDs) and then simply replicating the changes to an in-memory cache, as many legacy databases on the market do. This means that a table must be loaded into main memory before any operation – read or write – can be performed on it. For the vast majority of tables – those in SAP HANA’s column store – this happens asynchronously after a restart of the database. The database is fully available during that time, but queries to tables that are not yet fully loaded might experience reduced performance.
With pure DRAM, this initial load happens every time the database is started, which means also after planned or unplanned outages. Systems that cannot tolerate the impact on performance often use system replication to circumvent this and, in case of a required restart, switch the workload to the replicated instance. The big disadvantage is that you need an entire second set of hardware for this – complete with CPUs, network, main memory and storage.
With persistent memory, the initial load of the column store is no longer necessary. Column store data is retained across database and even server restarts, which decreases the loading time significantly.
At Sapphire 2018 in Orlando, SAP co-founder and chairman Hasso Plattner presented the very first numbers on the improvement in startup times with persistent memory. Based on a 6 TB instance of SAP HANA, startup time including data loading improved by a factor of 12.5 – from 50 minutes with regular DRAM to just 4 minutes with persistent memory. This significantly lowers the floor for planned business downtimes – for example due to an upgrade – to a mere few minutes, instead of almost an hour. Reducing business downtimes by this magnitude is otherwise only possible by employing measures like SAP HANA system replication.
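The factor quoted above follows directly from the two startup times. As a quick back-of-the-envelope check (the function name is made up for illustration, and the inputs are just the figures from the presentation):

```python
def startup_improvement(before_min: float, after_min: float) -> tuple:
    """Return (speedup factor, minutes of downtime saved per restart)."""
    return before_min / after_min, before_min - after_min


if __name__ == "__main__":
    factor, saved = startup_improvement(before_min=50, after_min=4)
    print(f"{factor}x faster, {saved} minutes saved per restart")
```

The same helper can be used with your own measured startup times to estimate what a restart would cost with and without persistent memory.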
To learn more about SAP HANA and Intel Optane DC Persistent Memory go to http://sap.com/persistent-memory.
Great work, this is impressive.
Nice Work! What is the impact of this new memory management tier on backup and restore strategy for SAP HANA?
No impact at all. Data durability of HANA is still managed by the persistence layer, which means backups are still required in the same way as with a pure DRAM system.
Thank you for this update.
Hi Andreas, thanks a lot for sharing!
Does the current version of the BW sizing report (note 2296290) take this into account already? What version is needed?
It seems that version 2.6.1 does not show the persistent memory part (?).
Best regards, Olaf
Are there any customer stories, like the one Intel presented at TechEd last year, from those who have adopted persistent memory?
Great information. I do have a question about main storage in persistent memory. Will data be simultaneously placed in /hana/data paths on disk and also on persistent memory? Depending on the answer, let's assume it's only in persistent memory. What happens if we lose a persistent DIMM? Is there redundancy like RAID for these DIMMs, or does HANA rebuild from /hana/data and /hana/logs?
In HANA, column store tables are divided into two fragments, i.e. main and delta. The main fragment is reader-friendly; it contains most of the data and changes rarely. The delta fragment is writer-friendly and contains a smaller part of the data (mostly changes).
So all changes in HANA happen only on the delta fragment of the column store. When a delta fragment becomes too large, it is merged with the main fragment. This process is called the delta merge operation.
With the support of persistent memory in HANA, the architecture changes slightly: the main fragments of column store tables now reside in persistent memory (earlier they resided in RAM), while the delta fragments stay in RAM (no change here). The reason for this technique is that writes to persistent memory are slower than reads, which is why the delta fragment, where all changes happen, is kept in RAM. So on every delta merge operation, the delta fragment (RAM) of the column store is merged into the main fragment (NVRAM) and a new empty delta is created (in RAM). And on every savepoint, data from memory is persisted to disk.
So the technique of data management in HANA doesn't change much; it's just that the main fragment of the column store resides in persistent memory instead of RAM. Also, persistent memory doesn't eliminate the use of disk for storage; HANA still stores data there at every savepoint. So if your persistent memory DIMM fails and you need to replace it, HANA will get the data from disk when you start it.
Hello Andrea - Thank you for sharing the details.
Question: Can we have HANA 1.0 SPS12 installed on this? I understand that HANA 1.0 SPS12 will not recognize the 'Optane Persistent Memory', but will it error out and refuse to install?
I want to understand if we can have HANA 1.0 SPS12 on this new hardware and then perform an in-place upgrade to HANA 2.0 SPS4.
thank you very much!
You cannot use Intel Optane DC persistent memory with HANA 1.0 SPS12. You need to be at least on HANA 2 SPS03 Rev 35. You can find more details on the requirements in SAP Note: 2618154.
Nice article. What is the expected I/O latency difference between DRAM and NVRAM? It would be great if you could share performance benchmarks for DRAM vs. NVRAM.