SAP HANA & Persistent Memory
Moderation Note: This blog post is no longer being updated. A successor has written a newer version with the latest information. Please see the more up-to-date version of this blog post here.
Over the last few months, Intel has released more and more information about its new memory technology, including its official name: Intel® Optane™ DC persistent memory. The promise is nothing short of reimagining the data center memory and storage hierarchy.
Intel is introducing a new memory tier between byte-addressable DRAM and block-based SSD storage devices. It is approaching this new tier from both directions: with faster block devices and, far more innovative, with byte-addressable main memory in a DIMM form factor that is persistent. “Persistent” means it does not lose stored information even after a reboot of the server.
SAP is very proud that SAP HANA is the first major database platform specifically optimized for Intel® Optane™ DC persistent memory. Many years of close collaboration between SAP and Intel made this possible. SAP HANA has been optimized since HANA 2.0 SPS 03 to leverage the unique characteristics of this technology, even though the hardware is not yet available on the market.
SAP HANA stays in control
Most applications rely solely on the operating system for memory allocation and management. Of course, SAP HANA needs to allocate memory from the operating system just like every other application. Once allocated, SAP HANA prefers to exert a much higher degree of control over its memory management. The reason for that is simple: It allows for a much higher degree of optimization. This is especially important for an in-memory database like SAP HANA. This paradigm extends seamlessly to persistent memory.
In other words: SAP HANA knows which data structures benefit most from persistent memory. SAP HANA automatically detects persistent memory hardware and adjusts itself by automatically placing these data structures on persistent memory, while all others remain in DRAM.
This is possible via App Direct mode of persistent memory – one of two operation modes of persistent memory that Intel just publicly revealed several days ago. App Direct mode allows applications to store data persistently on persistent memory. Memory Mode – not used by HANA! – doesn’t offer this data persistency, but offers cheaper and/or larger main memory.
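To make the idea of byte-addressable, persistent storage more tangible, here is a minimal sketch in Python. It is an illustration only, not how SAP HANA is implemented: a regular memory-mapped file stands in for a persistent memory region, and the path, region size and helper names are assumptions made for this example. On real hardware, the file would typically live on a DAX-mounted file system so that loads and stores reach the persistent memory directly.

```python
import mmap
import os
import tempfile

REGION_SIZE = 4096  # one page; a real persistent memory region would be far larger


def open_region(path, size=REGION_SIZE):
    """Map a file into the address space and return (file object, mapping)."""
    is_new = not os.path.exists(path)
    f = open(path, "w+b" if is_new else "r+b")
    if is_new:
        f.truncate(size)  # reserve the region
    return f, mmap.mmap(f.fileno(), size)


def close_region(f, region):
    region.flush()  # make sure stores reach the backing medium
    region.close()
    f.close()


# Hypothetical path; an ordinary temp file stands in for persistent memory here.
path = os.path.join(tempfile.mkdtemp(), "column_main.bin")

# First "run": store bytes at an arbitrary offset, as with ordinary memory.
f, region = open_region(path)
region[128:133] = b"HANA!"
close_region(f, region)

# Second "run" (simulating a restart): the bytes are still there, no reload step.
f, region = open_region(path)
restored = bytes(region[128:133])
close_region(f, region)

print(restored)  # b'HANA!'
```

The point of the sketch is the access pattern: the application addresses individual bytes directly instead of reading and writing blocks, and the content survives closing and reopening the mapping.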
Using App Direct mode, persistent memory is especially suited for non-volatile data structures. It is no coincidence that “non-volatile memory” and “non-volatile RAM” are often used synonymously with persistent memory.
Given these characteristics, an excellent candidate for placement in persistent memory is the column store main. It is heavily optimized in terms of compression, leading to a very stable (non-volatile) data structure. The main store typically contains well over 90% of the data footprint in most SAP HANA databases, which means it offers a lot of potential. Furthermore, it is rebuilt only rarely, during the delta merge, a process that is triggered only after the changes to a database table have reached a certain threshold. For most tables, a delta merge does not happen more than once a day.
This design fits SAP HANA’s architecture perfectly. The separation of write-optimized delta and read-optimized main stores and the characteristics of both are a perfect match to the respective strengths of DRAM and persistent memory.
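The split between delta and main can be pictured with a small toy model. The sketch below is a deliberately simplified assumption, not SAP HANA's actual data structures: the class name, the dictionary encoding and the `merge_threshold` parameter are hypothetical and chosen only to show why the main changes so rarely.

```python
from bisect import bisect_left


class Column:
    """Toy column with a read-optimized main and a write-optimized delta."""

    def __init__(self, merge_threshold=4):
        self.merge_threshold = merge_threshold  # hypothetical tuning knob
        self.dictionary = []  # main: sorted distinct values
        self.main = []        # main: one dictionary index per row
        self.delta = []       # delta: raw values in arrival order
        self.merges = 0

    def insert(self, value):
        self.delta.append(value)  # cheap append, the main is not touched
        if len(self.delta) >= self.merge_threshold:
            self.delta_merge()

    def delta_merge(self):
        """Rebuild the compressed main from the old main plus the delta."""
        rows = [self.dictionary[i] for i in self.main] + self.delta
        self.dictionary = sorted(set(rows))
        self.main = [bisect_left(self.dictionary, v) for v in rows]
        self.delta = []
        self.merges += 1

    def count(self, value):
        """Reads consult both stores."""
        n = self.delta.count(value)
        i = bisect_left(self.dictionary, value)
        if i < len(self.dictionary) and self.dictionary[i] == value:
            n += self.main.count(i)
        return n


col = Column(merge_threshold=4)
for city in ["Berlin", "Walldorf", "Berlin", "Palo Alto", "Walldorf"]:
    col.insert(city)

print(col.merges, col.delta, col.count("Walldorf"))  # 1 ['Walldorf'] 2
```

In this model, every write lands in the delta, while the main is rewritten only once per merge. That is the property that makes a main-like structure a natural fit for persistent memory and a delta-like structure a natural fit for DRAM.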
The persistence layer remains
Although the majority of data will be stored “persistently” in persistent memory, other key features of SAP HANA rely on the persistence layer on traditional persistent storage, e.g. SSDs. This includes the row store and the column store delta, as well as system replication and database backups. With SAP HANA’s shared-nothing architecture, this also has an impact on auto-host failover, since the persistent memory of an inactive host cannot be re-assigned to an active one. There’s also the cost aspect. Persistent memory will be cheaper than DRAM, but it will still be more expensive than traditional persistent storage devices. SAP HANA employs techniques to reduce the memory footprint of memory-hungry data types, for example, BLOB (Binary Large Object). Keeping such data in memory – even in persistent memory – would increase the cost unnecessarily.
Why should I care?
At this stage, I’d like to say a few words on how the whole persistent memory story benefits you as a customer running SAP HANA.
Increased main memory capacity
Not long ago, Intel revealed that persistent memory will be available in capacities up to 512 GB per module at launch. Compared to current DRAM – which is available in sizes up to 128 GB – this is already an increase by a factor of four. Considering that most systems don’t even use 128 GB DRAM DIMMs these days, due to limited supply and a corresponding price tag, the potential gain is even higher.
It is not hard to conclude that the overall capacity of SAP HANA will benefit greatly from this technology. Current technical limits will be lifted, and even if your SAP HANA instance doesn’t need higher capacity, you will still benefit from the expected lower price of persistent memory compared to DRAM.
Although persistent memory is not a data tiering solution, its higher capacity will still influence data tiering by raising the threshold at which tiering becomes necessary, possibly to a level where you can simply ignore it and keep all your data where it is: in main memory with maximum performance.
Data loading at startup
Something quite unique to SAP HANA is the consistent implementation of its “in-memory first” paradigm. All database operations are performed directly on the in-memory data structures, instead of first applying everything to persistent block-based storage (e.g., SSDs) and then replicating the changes to an in-memory cache, as many legacy databases on the market do. This means that a table must be loaded into main memory before any operation, read or write, can be performed on it. For the vast majority of tables, those in SAP HANA’s column store, this happens asynchronously after a restart of the database. The database is fully available during that time, but queries to tables that are not yet fully loaded might experience reduced performance.
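One way to picture this startup behavior is a background loader that races with incoming queries. The following sketch is a simplified assumption for illustration, not SAP HANA's actual loading logic: the `Table` class, the on-demand load inside `row_count` and the table names are all hypothetical.

```python
import threading


class Table:
    def __init__(self, name, stored_rows):
        self.name = name
        self._stored_rows = stored_rows  # stands in for the persistence layer
        self._rows = None                # in-memory copy, empty after a restart
        self._lock = threading.Lock()

    @property
    def loaded(self):
        return self._rows is not None

    def load(self):
        with self._lock:  # the background loader and a query may race
            if self._rows is None:
                self._rows = list(self._stored_rows)

    def row_count(self):
        self.load()  # pay the load cost now if the table is still pending
        return len(self._rows)


def start_background_load(tables):
    """Load all tables asynchronously while the database accepts queries."""
    def worker():
        for table in tables:
            table.load()

    loader = threading.Thread(target=worker, daemon=True)
    loader.start()
    return loader


tables = [Table("ORDERS", range(1000)), Table("ITEMS", range(5000))]
loader = start_background_load(tables)
items_rows = tables[1].row_count()  # answered even if the loader is not there yet
loader.join()
all_loaded = all(t.loaded for t in tables)
```

A query that hits a table before the loader reaches it still gets an answer, but it has to wait for that table's load first, which is the reduced performance described above.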
With pure DRAM, this initial load happens every time the database is started, including after planned or unplanned outages. Systems that cannot tolerate the impact on performance often use system replication to circumvent this and, in case of a required restart, switch the workload to the replicated instance. The big disadvantage is that you need an entire second set of hardware for this, complete with CPUs, network, main memory and storage.
With persistent memory, the initial load of the column store main is no longer necessary. Its data is retained across database and even server restarts, which significantly reduces the overall loading time.
At Sapphire 2018 in Orlando, SAP co-founder and chairman Hasso Plattner presented the very first numbers on the improvement in startup times with persistent memory. Based on a 6 TB instance of SAP HANA, startup time including data loading improved by a factor of 12.5, from 50 minutes with regular DRAM to just 4 minutes with persistent memory. This significantly lowers the floor for planned business downtimes, for example due to an upgrade, to a few minutes instead of almost an hour. Reducing business downtimes by this magnitude is otherwise only possible by employing measures like SAP HANA system replication.
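For readers who want to check the quoted factor, the arithmetic is a short calculation. It uses only the two startup times quoted above; the variable names are chosen for this example.

```python
dram_startup_min = 50  # startup with regular DRAM, as quoted above
pmem_startup_min = 4   # startup with persistent memory, as quoted above

speedup = dram_startup_min / pmem_startup_min
saved_min = dram_startup_min - pmem_startup_min

print(f"speedup: {speedup}x, downtime saved per restart: {saved_min} min")
# speedup: 12.5x, downtime saved per restart: 46 min
```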
To learn more about SAP HANA and Intel Optane DC Persistent Memory go to http://sap.com/persistent-memory.