Technology Blogs by SAP
ruediger_karl
Advisor
This blog is intended to provide a better understanding of SAP HANA NSE and its configuration.


Design Principles of NSE


SAP HANA NSE (Native Storage Extension) adds a seamlessly integrated, disk-based processing extension to SAP HANA’s in-memory column store, supporting a large spectrum of data sizes at an improved cost-to-performance ratio. It manages data without fully loading it into memory: performance-critical data is processed in memory (hot data), while less frequently accessed data is managed by NSE (warm data).

The buffer cache is the key component of NSE to ensure performant access to data (pages) on disk. It avoids redundant I/O operations by keeping frequently accessed pages in memory rather than reading them from disk repeatedly. The buffer cache uses LRU (Least Recently Used) and HBL (Hot Buffer List) strategies and reuses memory pages from its internal pools instead of allocating or deallocating pages via HANA memory management.

On the one hand, each (re-)load of data by the buffer cache requires disk I/O and adds a performance penalty to query processing compared to in-memory data. On the other hand, a reasonable amount of disk I/O in the buffer cache is acceptable as long as it doesn’t harm the system performance. Keep in mind that data in NSE is not intended for real-time operations. See also the section “Identifying Warm Data in the Database” below.

Database objects in SAP HANA have a load unit definition assigned at table, partition, or column level. Data in NSE has the load unit “page loadable”, whereas in-memory data has the load unit “column loadable”.
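The load unit can be changed with plain SQL. Below is a minimal sketch of the documented ALTER TABLE load-unit syntax; schema, table, and column names are placeholders for illustration.

    -- Move an entire table (all partitions and columns) to NSE (warm data)
    ALTER TABLE "MYSCHEMA"."SALES_HISTORY" PAGE LOADABLE CASCADE;

    -- Move only a single column to NSE (the column definition is repeated)
    ALTER TABLE "MYSCHEMA"."SALES_HISTORY" ALTER ("LONG_TEXT" NVARCHAR(2000) PAGE LOADABLE);

    -- Revert the table to fully in-memory processing (hot data)
    ALTER TABLE "MYSCHEMA"."SALES_HISTORY" COLUMN LOADABLE CASCADE;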

 

Configuration Challenges


With these design principles in mind, configuring NSE means finding an optimized buffer cache size that can manage the warm data in the database while providing acceptable query performance for applications and end users. The buffer cache size therefore has a strong dependency on the size of the warm data in the database and on the access patterns of your workload.

There are two extremes of the configuration, which are both not desirable options:

1) Configuring a buffer cache so large that it avoids any disk I/O, but also keeps rarely accessed warm data in memory (best performance, but an inefficient buffer cache size).

2) Configuring a buffer cache so small that the requested warm data cannot be loaded into memory (leading to query cancellations, i.e. out-of-buffer events), or that the disk I/O for warm-data processing causes unacceptable query performance which might even affect the overall system performance of HANA (worst performance).

You’d need to iterate between these two extremes to find the optimized buffer cache size while monitoring your system performance constantly.

The next section provides some starting points that might help you find an initial sizing for your buffer cache.

 

Starting Points and Recommendations to Configure SAP HANA NSE


Please keep in mind that the initial sizing needs to be adapted to a growing warm data size and changing workload requirements.

The given examples are simplified and assume a single-host, single-container database. It is also possible to define buffer cache sizes per index server in SAP HANA (e.g., in scale-out systems or systems with multiple database containers).

  1. With SAP HANA SPS05 and SAP HANA Cloud, the maximum buffer cache size is defined by default as 10% of the HANA database memory (global allocation limit, i.e. GAL). This default value is initially sufficient for many use cases.
    Example: GAL = 6TB; Max Buffer Cache Size = 600GB (10% GAL)

  2. The buffer cache can initially manage a warm data size of about 8 times the maximum buffer cache size (1:8 ratio). That is a good starting point, but depending on your workload and access patterns you might even use a lower ratio, e.g., 1:10. Smaller buffer cache sizes with lower ratios are indicated where the data in NSE is touched very rarely.
    Example: Buffer Cache Size = 600GB; Warm Data Size on disk = 4.8TB

  3. There is no maximum or fixed ratio between hot and warm data size in HANA, i.e., you can change the buffer cache size to a larger or smaller value than the default. However, you need to validate the size with the actual workload in your system. For larger warm data sizes in NSE, you should consider fast disk I/O subsystems for your HANA data volume. Please keep in mind that, depending on your workload, larger warm data sizes may also generate larger intermediate result sets, which may then exceed the HANA workspace and/or the 2 billion records limit for calculated result sets.
    Example: GAL = 6TB; Workspace = 3TB; Hot Data Size = 1.5TB; Buffer Cache Size = 1.5TB; Warm Data Size on disk = 12TB

  4. For the HANA data disk storage and backup storage sizing you should simply keep in mind that hot data (column loadable) and warm data (page loadable) have almost the same disk footprint.


You can configure the buffer cache size in the Database Explorer (SQL console) or in the SAP HANA Cockpit.
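As a sketch, the maximum buffer cache size can also be set via SQL in indexserver.ini; the section and parameter names below (buffer_cache_cs with max_size in MB and max_size_rel in percent) follow the documented NSE settings, but please verify them against your HANA revision. The values are examples only.

    -- Set an absolute maximum buffer cache size (value in MB, here 600 GB)
    ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
      SET ('buffer_cache_cs', 'max_size') = '614400' WITH RECONFIGURE;

    -- Or set it relative to the global allocation limit (percent, default 10)
    ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
      SET ('buffer_cache_cs', 'max_size_rel') = '10' WITH RECONFIGURE;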

 

Identifying Warm Data in the Database


The preferred way to configure HANA NSE is to first identify the warm data size in the database and then configure the buffer cache size. Let’s start with some common characteristics of warm data.

Characteristics of Warm Data:



  1. Warm data is a synonym for data that is less frequently or even rarely accessed in the database

  2. Warm data doesn’t need to be kept in memory at all times

  3. Using warm data might incur performance degradation, but reduces HANA’s main memory consumption


Static approach based on Application Data Design


Based on these characteristics and some knowledge about your applications you can apply a static approach to identify warm data.

  • Less frequently accessed data in your application(s) is, for example, application log data or statistics data. These records are often written once and rarely read afterwards.

  • Less frequently accessed data also includes historical data in the application. Some application tables have, e.g., a date/time column that provides an indicator to “split” a table into current and older (historical) parts. This table split can be implemented by table range-partitioning in the HANA database, where the partition key is, e.g., the date/time column of your table (see the sketch below).

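As a simplified sketch (table, column, and date ranges are made up for illustration), such a split could look like this: the table is range-partitioned on the date column, and only the historical partition is moved to NSE.

    -- Range-partition a hypothetical ORDERS table on its date column
    ALTER TABLE "MYSCHEMA"."ORDERS" PARTITION BY RANGE ("ORDER_DATE")
      (PARTITION '2010-01-01' <= VALUES < '2023-01-01',   -- historical part
       PARTITION OTHERS);                                 -- current part

    -- Keep current data hot, move the historical range partition to NSE
    -- (look up the partition number in the partition system views first)
    ALTER TABLE "MYSCHEMA"."ORDERS" ALTER PARTITION 1 PAGE LOADABLE;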

Without much knowledge about the applications running on the database, you may also monitor the table access statistics in HANA (M_TABLE_STATISTICS or the SAP HANA Cockpit) to identify rarely accessed tables.
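A simple sketch of such a check (the column selection assumes the documented READ_COUNT/WRITE_COUNT counters of the view):

    -- Tables with the fewest read accesses since the statistics were last reset
    SELECT schema_name, table_name, read_count, write_count
      FROM M_TABLE_STATISTICS
     ORDER BY read_count ASC
     LIMIT 20;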

Dynamic approach with SAP HANA NSE Advisor


Another option to identify warm data is the NSE Advisor. This SAP HANA feature provides recommendations for your data to be column loadable (hot data) or page loadable (warm data) on a fine-granular level, i.e., table, partition, or column level.

Please note that the NSE Advisor is not available yet in SAP HANA Cloud.

While the NSE Advisor is enabled, the SAP HANA database counts all column scans and materializations. After the NSE Advisor has been disabled, it categorizes all accessed objects based on their scan density and generates load unit recommendations for them. The categories are:

  • Objects considered hot (high scan density), recommended as column loadable

  • Objects with average scan density, with no recommendation

  • Objects considered cold (low scan density), recommended as page loadable

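A minimal sketch of a typical advisor run on HANA on-premise, assuming the documented collection switch in indexserver.ini (cs_access_statistics / collection_enabled) and the monitoring view M_CS_NSE_ADVISOR; verify both against your revision:

    -- 1) Start collecting access statistics for the NSE Advisor
    ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
      SET ('cs_access_statistics', 'collection_enabled') = 'true' WITH RECONFIGURE;

    -- 2) Run a representative workload ...

    -- 3) Stop the collection and read the recommendations
    ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
      SET ('cs_access_statistics', 'collection_enabled') = 'false' WITH RECONFIGURE;

    SELECT * FROM M_CS_NSE_ADVISOR;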

You can configure the NSE Advisor to control this categorization with corresponding parameters by defining the percentage of hot and cold objects as thresholds relative to all objects accessed during a run, e.g.,

  • HOT_OBJECT_THRESHOLD_PERCENT = 10 (default)

  • COLD_OBJECT_THRESHOLD_PERCENT = 10 (default)


The sum of HOT_OBJECT_THRESHOLD_PERCENT and COLD_OBJECT_THRESHOLD_PERCENT must not exceed 100%.

Please keep in mind that statistics are only collected for objects that have been accessed while the advisor was running. Other “historical” data that wasn’t accessed during the advisor run is not considered; it needs to be identified with the static approach. Running the advisor constantly is also not recommended, as it consumes CPU and memory for the statistics collection. Typically, you would run the advisor for representative workloads of your system, e.g., period-end closing, end-user activity during a workday, year-over-year comparisons, or reporting.

To control the amount of statistics data and the CPU consumption of the NSE Advisor, you can exclude objects smaller than a given row count from the statistics collection with a corresponding parameter (>= HANA 2 SPS05), e.g.

  • MIN_ROW_COUNT = 10000 (default)


We recommend starting with a few large tables to see the effect and gain experience with NSE. We do not recommend using NSE for tables smaller than 1 MB.
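To find a few large candidate tables, you can, for example, look at the current in-memory footprint of the column store tables (a sketch; M_CS_TABLES reports sizes per table partition):

    -- Largest column store tables/partitions by current total memory footprint
    SELECT schema_name, table_name,
           ROUND(memory_size_in_total / 1024 / 1024 / 1024, 2) AS size_gb,
           record_count
      FROM M_CS_TABLES
     ORDER BY memory_size_in_total DESC
     LIMIT 20;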

In the NSE Advisor you can also suppress recommendations for database objects smaller than a given size in bytes with a corresponding parameter, e.g.

  • MIN_OBJECT_SIZE = 1048576 (default)


It is also recommended to use a production-like test system for the classification and calibration of hot and warm data. In that test system you should be able to create a production-like workload, e.g., by using the Capture and Replay tooling. You can repeat the NSE Advisor runs with different parameter settings and compare the lists of recommended objects in the NSE Advisor monitoring view. As you cannot export the monitoring view directly, you can create a custom table with the same structure, insert the results from the view, and export that table.
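A sketch of this export workaround (schema, table name, and export path are placeholders; the monitoring view name is the hedged M_CS_NSE_ADVISOR from above):

    -- Snapshot the advisor recommendations into a regular table
    CREATE COLUMN TABLE "MYSCHEMA"."NSE_ADVISOR_RUN_01" AS
      (SELECT * FROM M_CS_NSE_ADVISOR);

    -- The snapshot table can then be exported, e.g. as CSV
    EXPORT "MYSCHEMA"."NSE_ADVISOR_RUN_01" AS CSV INTO '/tmp/nse_advisor_run_01' WITH REPLACE;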

You can configure the NSE Advisor in the Database Explorer (SQL console) or in the SAP HANA Cockpit for HANA on-premise.

 

Monitoring the Buffer Cache


As the workload may change and the warm data size grows in the system, you need to monitor your buffer cache regularly. This helps you identify whether the buffer cache is sized properly.

Out-Of-Buffer Events


The first indicator of an undersized buffer cache are out-of-buffer events (OOB events). Whenever the buffer cache cannot allocate memory pages because the request would exceed the currently defined maximum buffer cache size, the HANA database raises an OOB event. You can view this event by querying the SYS.M_EVENTS system view or in the SAP HANA Cockpit. You can solve the issue by increasing the maximum buffer cache size according to the warm data size (see the 1:8 recommendation above). Please also note that in some cases the access patterns in your system require, e.g., loading a large page-loadable column that doesn’t fit into the buffer cache. In that case you’d need to increase the buffer cache size to at least the size of the largest page-loadable column of your NSE tables.
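A sketch of the corresponding check and remediation (the max_size value is an example and reuses the hedged parameter name from the configuration section above):

    -- Check whether the database has reported an out-of-buffer event
    SELECT * FROM SYS.M_EVENTS;

    -- If the warm data has outgrown the cache, raise the maximum buffer cache size (MB)
    ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
      SET ('buffer_cache_cs', 'max_size') = '819200' WITH RECONFIGURE;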

Once an out-of-buffers error has been reported, the server auto-resolves the event as soon as it notices that several subsequent buffer requests have been successfully fulfilled by the buffer cache without raising another error.

Monitoring the Buffer Cache KPIs


The following monitoring views are available in the HANA database (also documented in the SAP Online Help):

  • M_BUFFER_CACHE_STATISTICS

  • M_BUFFER_CACHE_POOL_STATISTICS


The M_BUFFER_CACHE_POOL_STATISTICS is much more detailed and provides information per “buffer pool”, where each pool is used for a specific page size. The buffer cache manages page sizes from 4KB to 16MB.

The monitoring view M_BUFFER_CACHE_STATISTICS provides the cumulated values of all buffer cache pools in the HANA database.

The buffer cache KPIs can be monitored in the SAP HANA Cockpit or in the SAP HANA Database Explorer by querying the buffer cache monitoring views named above.
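For example, a quick SQL check of the cumulated KPIs might look like this (the selected columns follow the documented layout of these views; verify them against your HANA revision):

    -- Cumulated buffer cache KPIs; a low hit ratio points to frequent disk I/O
    SELECT host, port, max_size, allocated_size, used_size, hit_ratio
      FROM M_BUFFER_CACHE_STATISTICS;

    -- Per-pool breakdown by page size class (4 KB ... 16 MB)
    SELECT * FROM M_BUFFER_CACHE_POOL_STATISTICS;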

The “Configured Cache Size” in the SAP HANA Cockpit is the configurable maximum buffer cache size in the HANA system. The monitoring views provide additional information about the allocated buffer cache size, which is the actual amount of memory currently allocated by the buffer cache in the HANA system. In case the used buffer cache size is constantly well below the allocated size (there might have been a rare peak usage of the buffer cache in the past), you can “shrink” the allocated buffer cache size by configuring the maximum buffer cache size below the allocated size. This forces the buffer cache to release the allocated memory, which then becomes available to HANA memory management. Afterwards, you can configure the maximum buffer cache size back to its original value, as the maximum buffer cache size is not pre-allocated.
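A sketch of this shrink procedure, using the same hedged parameter as above (values are examples in MB):

    -- Temporarily lower the maximum below the allocated size to force the cache to release memory
    ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
      SET ('buffer_cache_cs', 'max_size') = '204800' WITH RECONFIGURE;  -- e.g. 200 GB

    -- Then restore the original maximum; memory is re-allocated only on demand
    ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
      SET ('buffer_cache_cs', 'max_size') = '614400' WITH RECONFIGURE;  -- back to 600 GB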

Indication of an undersized Buffer Cache


Apart from out-of-buffer events, the occurrence of one or more of the following observations in the HANA system may indicate an undersized NSE buffer cache:

 

Further References


SAP HANA Administration Guide (Online Help)

SAP HANA NSE Whitepaper (HANA 2 SPS05)

SAP Note NSE FAQ

SAP HANA Capture and Replay Tooling (Online Help)
17 Comments
Cocquerel
Active Contributor
Does the NSE Advisor work with extended tables (Dynamic Data Tiering)? Is there a way to estimate the disk size change when switching tables from Dynamic Data Tiering to NSE?
ruediger_karl
Advisor

Hi Michael, Dynamic Tiering (extended tables, multistore tables) is another (older) technology in SAP HANA and cannot be mixed with HANA NSE. In other words, using the NSE Advisor for extended tables won't work, as you also cannot assign load units to extended tables. There seems to be a slight disk size difference between tables in DT and tables in HANA NSE. Column-loadable tables and page-loadable tables in HANA have almost the same disk footprint, and I found a comparison between tables in DT and HANA tables under https://blogs.sap.com/2020/06/12/hana-dynamic-tiering-setup-and-operations/ (at the end of the blog). If you consider migrating DT tables to HANA NSE, please note that moving tables from DT to HANA may take much longer than from HANA to DT. Instead, you might think of leaving existing DT tables where they are and applying NSE to all new warm data tables in HANA.

Is it possible to reduce the size of the NSE buffer cache drastically as long as NSE is not used?
I was wondering what to do in case the HANA DB is running out of memory.

We are running SPS5 Rev 59, and with a 2TB HANA we are "wasting" around 200GB of memory as long as NSE is not used.

Thanks in advance
ruediger_karl
Advisor
Hi Jürgen,

please keep in mind that the NSE buffer cache does not pre-allocate any memory when it is not used. Even if you have configured a max buffer cache size of 200GB, it is not allocated by NSE as long as no tables/partitions/columns are defined as page loadable. The buffer cache will allocate memory and increase its size dynamically depending on the data used by NSE. The good news is that there is nothing to do and there is no waste of memory.

Regards, Ruediger
SriKrishna
Active Participant
Hello ,

In this blog, the ratio is mentioned as 1:4. Please suggest the ratio we need to consider while configuring the buffer cache.

 

https://blogs.sap.com/2019/07/22/increase-hana-data-capacity-with-sap-hana-native-storage-extensionn...

 
ruediger_karl
Advisor
Hi,

as stated in my blog, a good starting point you can consider for sizing is a buffer cache size to warm data size ratio of 1:8.

A fixed ratio of hot data to warm data is not given. I'm aware of blogs or articles that posted a ratio of 1:4, but this has never been defined by HANA. You can use this ratio as a starting point, but you can grow beyond it. See the example in my blog, where the ratio between hot and warm data size is 1:8:

Example: GAL = 6TB; Workspace = 3TB; Hot Data Size = 1.5TB; Buffer Cache Size = 1.5TB; Warm Data Size on disk = 12TB
hervianzh
Explorer

Hi ruediger,

is it possible to apply NSE to row store tables ?

ruediger_karl
Advisor
Hi Hervian,

no. NSE can only be used for column store tables.

Regards, Ruediger
hervianzh
Explorer

Hi Ruediger,

 

any advice for high RAM-consuming row store tables? This table is a standard one (DFKKSUM), and we cannot do archiving right now as it contains only 3 years of data and already consumes more than 1TB of RAM for this single table

 

Thanks,

Hervian

ruediger_karl
Advisor
Hi Hervian,

I'm not the application expert. This table is indeed a row store table that contains records for mass bookings of FI-CA jobs. These jobs produce a very large number of frequent updates on that table, counting the bookings they have made. That is also the reason why, in this exceptional case, the table is delivered by SAP as a row store table. As you shouldn't change the table type, you cannot apply NSE to that table. The recommendation is therefore to contact SAP support/consulting to investigate whether you could optimize the configuration in FI-CA for mass activities. The table size seems to be quite large.

Hope that helps,

Ruediger
hervianzh
Explorer
Hi Ruediger,

 

Thanks for your insight,

 

If you have any reference for row store table RAM usage optimization for the HANA database, please kindly let me know.

Thanks,

Hervian
ruediger_karl
Advisor
There is an additional Data Volume Service offering by SAP Consulting. You can give it a try for further optimization (contact: daniel.rasskasow@sap.com).
533190789
Explorer
Hi,
What happens when the buffer is full and a new request comes in? Will the buffer flush the least recently used page and serve the new request or generate an out of buffer event?

 
ruediger_karl
Advisor
Hi Vira,

the buffer cache flushes the least recently used pages to free up buffer cache memory for new incoming requests.

Regards, Ruediger
marcus_greger
Discoverer

Dear Rüdiger,

As a general rule of thumb I understand that tables with a low select count could be considered as NSE candidates. Now I ask myself how write operations on tables (delete, insert, update) factor into this. As an example: if you have a large table with a low select but high to very high insert count, wouldn't NSE negatively impact performance of the aforementioned inserts?

I have seen recommendations elsewhere* that tables with low read / high change rate are considered good candidates for NSE. I don't yet understand why, because I would assume that ANY kind of table access (not just selects) would potentially be impacted by NSE.

I already tried to find an answer on help.sap.com and in SAP notes and blogs. But found nothing to help me better understand why "write access" on tables doesn't factor in the same way as read access.

Best regards,
Marcus

*See: https://blogs.sap.com/2021/10/25/hana-nse-part-i-tech.-details-qa/

ruediger_karl
Advisor
Hi Marcus,

any inserts/updates/deletes to NSE data are written into the HANA delta store. The delta store is always held in memory. The changes in the delta store are merged regularly into the main store, which contains the read-only data of both in-memory and NSE data. Hence, the statement "tables with low read / high change rate are considered good candidates for NSE" is correct.

Regards, Ruediger
marcus_greger
Discoverer
Hi Rüdiger,

Thanks for the quick reply! That clears it up for me. 🙂

Regards,
Marcus