Skip to Content
Technical Articles
Author's profile photo Jens Gleichmann

Details about HANA LOBs

last updated: 2022-06-22 16:30 CEST

Ok, large objects aka LOBs (no not Line of Business) are nothing special and well known from other database systems.
LOBs are unstructured data like a picture, a PDF or a message XML content. The shared properties of this kind of data is that they can be quite large.

There are different types of LOBs:

Type Description
BLOB The BLOB data type is used to store large amounts of binary data. BLOB values can be converted to VARBINARY.
CLOB The CLOB data type is used to store large amounts of 7-bit ASCII character data. CLOB values can be converted to VARCHAR.
NCLOB The NCLOB data type is used to store a large Unicode character object. NCLOB values can be converted to NVARCHAR.
TEXT The TEXT data type enables text search features. This data type can be defined for column tables, but not for row tables. This is not a standalone SQL type. Selecting a TEXT column yields a column of type NCLOB.
BINTEXT The BINTEXT data type is similar to the TEXT data type and supports text search features, but it is possible to insert binary data. This data type can be defined for column tables, but not for row tables. This is not a standalone SQL type. Selecting a BINTEXT column yields a column of type NCLOB.

There are also certain restriction for them e.g.:

LOB columns cannot appear in ORDER BY or GROUP BY clauses

or

LOB columns cannot appear in WHERE clauses as predicates other than LIKE (meaning that no comparison is allowed).

All this details and more, you can access via the official documentation.

For a in memory database those data can be a ressource killer. Because of this, not all LOB data is loaded into HANA. But there is more to report, because they are handled differently in depency to the HANA release and the running application. This must be taken into account in order to achieve max. performance.

Content:

  1. types of LOBs
  2. VARBINARY vs. LOBs
  3. garbage collection
  4. Parametrization
  5. Mini Checks
  6. Frequently occuring known issues
  7. Threads
  8. SQL checks
  9. Summary

Sources for details
2400005 – FAQ: SAP HANA Persistence

2220627 – FAQ: SAP HANA LOBs

1994962 – How-To: Activation of Hybrid LOBs in SAP HANA

2169283 – FAQ: SAP HANA Garbage Collection


1. Types of LOBs

LOBs can be stored as CS and RS table. Here we have different types of LOBs:

  1. hybrid LOBs (RS+CS): Small values (typically below 1 KB), in memory like normal column values
  2. Packed / midsize LOBs : Midsize values (typically between 1 KB and 4 KB), packed with other LOBs into one container on disk
  3. File LOBs: Large values (typically above 4 KB): one container on disk per LOB

Be aware that packed and file LOBs are also named as disk LOBs.

You can use the table data collector (attached to note 1969700) to identifiy which table stores the LOBs in which way:
HANA_Tables_DataCollector*

SCHEMA_NAME TABLE_NAME  COLUMN_NAME   LOB_TYPE DISK_GB BINARY_GB LOB_COUNT
schema table CONTENT PACKED 3.00 2.90 1215200
schema table CONTENT FILE 1.20 0.80 1215200

2. VARBINARY vs. LOBs

VARBINARY can also hold quite large binary data (up to 5000 byte), but it doesn’t belong to the LOB data types. As a consequence the hybrid LOB feature can’t be used and the column is loaded into memory when it is accessed.

When you upgrade to SAP_BASIS >= 7.53 the previous VARBINARY columns remain in place until an ABAP table conversion happens. News tables are created with a LOB data type.

As a consequence many potentially large tables with large LRAW columns (<= 5000 byte) are completely loaded into memory (e.g. BALDAT, PCL2, SOC3, SWNCMONI, SWWCNTP0). In order to reduce the memory consumption you can consider taking advantage of paged attributes / Native Storage Extension (SAP Note 2799997) so that unused column pages are evicted from memory in an LRU manner. Please also see SAP Notes 2816823 and 2898319 to safeguard NSE usage in table conversion events. Be sure that your revision is at least on Rev. 57 to avoid issues with the LOBs.

LRAW length limit SAP_BASIS SAP HANA Data Type
<= 1000 all VARBINARY
1001 – 5000 <= 7.52 VARBINARY
1001 – 5000 >= 7.53 BLOB / CLOB
> 5000 all BLOB / CLOB

 

3. Garbage Collection

There is also a LOB garbage collection (=LGC) for deleted / fragmented data.
You can also trigger a manually GC for all CS tables:
ALTER SYSTEM RECLAIM COLUMN LOB SPACE
for a defined CS table:
ALTER TABLE “<table_name>” WITH PARAMETERS (‘LOB_GARBAGE_COLLECTION’ = ‘1’)
For row store tables it is not possible to trigger LOB garbage collection on table level.

ALTER SYSTEM RECLAIM [ROW|COLUMN] LOB SPACE [ALL] The ALL key word is available as of SAP HANA 2.00 and makes sure that also LOBs of tables are cleaned that are currently not loaded into memory.

Be aware that LOB garbage collection itself doesn’t purge the LOB files on disk level. This is handled at a later time by the persistence garbage collection. Starting with SAP HANA 2.0 SPS 06 you can alternatively use the SAP HANA job scheduler to trigger column store LOB garbage collection at specific times (SAP Note 3147465).

indexserver.ini -> [lobhandling] -> garbage_collect_daily_schedule_s -> 3600
Available starting with SAP HANA 2.0 SPS 00 as alternative to garbage_collect_interval_s (see below), inactive per default with SAP HANA <= 2.0 SPS 03
Specifies the time in seconds since midnight when the volume-scan LOB garbage collection is executed on a daily basis. A negative value pauses the daily scan.

indexserver.ini -> [lobhandling] -> garbage_collect_interval_s -> 900 (<= 2.00.053) / 43200 (>= 2.00.054)
Defines the time interval of (table-wise) column store LOB garbage collections, value 0 disables LOB garbage collection

Specifies the time in seconds between table-wise LOB garbage collection runs. If the interval is zero the table-wise LOB garbage collection is paused.

Attention: When garbage_collect_daily_schedule_s is explicitly set on SAP HANA <= 2.0 SPS 03, garbage_collect_interval_s is ignored.

 

indexserver.ini -> [row_engine] -> garbage_lob_file_handler_enabled -> true
# Value ‘true’ activates row store LOB garbage collection
indexserver.ini -> [row_engine] -> garbage_lob_file_handler_interval -> 10
# Defines the time interval of row store LOB garbage collections, value 0 disables LOB garbage collection

In some patched environments both the table- and the volume-wise LGC is active.This is not neccessary! Only the volume-wise LOB garbage collection can remove LOBs belonging to previously removed columns, partitions or tables. Therefore, it is possible to completely disable the table-wise LOB garbage collection to eliminate its impact on system performance, but it is not recommended to disable the volume-wise LOB garbage collection for productive cases with LOB load.

SPS06
SchedulerJobs can only execute procedures, therefore the following command will create a procedure that triggers the LOB garbage collection:

CREATE PROCEDURE PROC_RECLAIM_COLUMN_LOB_SPACE AS BEGIN
EXEC 'ALTER SYSTEM RECLAIM COLUMN LOB SPACE';
END;

The following example defines the job to be executed once a week on Sunday beginning at 3 am at night:

CREATE SCHEDULER JOB RECLAIM_COLUMN_LOB_SPACE CRON '* * * SUN 3 0 0' ENABLE PROCEDURE PROC_RECLAIM_COLUMN_LOB_SPACE;

SQL Collection (1969700)

HANA_GarbageCollection_LOBs

4. Parametrization

Here are the default values (for SPS05 – some other releases may differ):

indexserver.ini -> [persistence] -> midsizelob_threshold = 0
global.ini -> [sql] -> lob_memory_threshold
global.ini -> [sql] -> default_lob_storage_type = hybrid
indexserver.ini -> [persistence] -> disposition_lob_read = 1
indexserver.ini -> [persistence] -> disposition_lob_read_small = 1
indexserver.ini -> [persistence] -> disposition_lob_write = 1
indexserver.ini -> [persistence] -> disposition_lob_write_small = 1
indexserver.ini -> [lobhandling] -> garbage_collect_interval_s = 43200
# When garbage_collect_daily_schedule_s is explicitly set on SAP HANA <= 2.0 SPS 03, garbage_collect_interval_s is ignored
indexserver.ini -> [row_engine] -> garbage_lob_file_handler_enabled = true
# row Value ‘true’ activates row store LOB garbage collection
indexserver.ini -> [row_engine] -> garbage_lob_file_handler_interval = 10
# Defines the time interval of row store LOB garbage collections, value 0 disables LOB garbage collectionglobal.ini -> [persistence] -> lob_page_cache_max_size => 10-20% GAL
global.ini -> [persistence] -> lob_page_cache_reclaim_target_size = 5%
global.ini -> [persistence] -> lob_page_cache_reclaim_threshold_size = 10%indexserver.ini -> [lobhandling] -> garbage_collect_daily_schedule_s -> 3600

indexserver.ini -> [cs_nse_advisor] -> enable_lob_pageloadable = false (SPS5) / true (SPS06)

# This parameter controls if LOB columns are considered by the NSE advisor (SAP Note 2799997). It is available starting with SAP HANA 2.00.059 and its default value is “false” for SAP HANA 2.0 SPS 05 and “true” for SAP HANA >= 2.0 SPS 06 (SAP Note 3125483).

 

Attention: The LOB Page Cache only includes the disk LOBs (packed + file)!

This means you can limit the size of in memory lobs which are placed as ressource killer. For sure it is a trade-off but are LOB data really business critical? You can save a lot of resources at the beginning of your sizing. You can also use NSE to page out LOBs for certain tables / columns which is an addition to the limitatiion parameter because they are global. With NSE you can also limit the memory LOBs in a different way.

 

5. Mini Checks

Check ID Description SAP note
M0371 Unused space in packed LOBs vs. data % 2220627
M0372 Unused space in file LOBs vs. data % 2220627
M0373 Tables with many small file LOBs 2220627
M0374 Tables with many disk LOBs 2220627
M0450 Tables with memory LOBs > 2GB 1994962
M0849 Orphan disk LOBs 2220627
M0850 Time since last CS LOB garb. coll. (days) 2220627

Check out note 1999993 – How-To: Interpreting SAP HANA Mini Check Results how to handle them.

 

6. Frequently occuring known issues

Memory leak

Starting with JDBC 2.3.37 JDBC uses transactional LOBs. As a consequence database requests returning LOB data are no longer closed at the end of the statement execution. As a consequence statements can remain in SUSPENDED state for a long time, introducing various trouble like blocked garbage collection or growing Pool/Statistics and Pool/RowEngine/QueryExecution allocator sizes. This affects e.g. all Solution Managers Java stacks!
This can be fixed by setting transactionalLobs=false as stated out in 2711824 – High Number of Prepared Statements Causing High Usage of Memory Allocator Pool/Statistics.

 

I/O issues

The critical phase can also take longer in context of many disk LOBs (SAP Note 2220627) and a large Pool/PersistenceManager/PersistentSpace/DefaultConverter/ConvPage allocator (SAP Note 1999997, issue number 257466). The FlushResourcesThread is in this case busy clearing converter pages in a call stack like:

memset_impl
PageAccess::ConverterIndexPageControlBlock::resetFlushPerSPCounter
DataAccess::PersistentSpaceImpl::resetFlushPerSPCounter
ResourceManager::WaitAndSwitchCounterResource::determineFlushDestination

In this scenario each critical phase has a certain duration, even if no I/O writes are done. Reducing the number of disk LOBs in the system can reduce the runtime of the critical phase. With SAP HANA >= 2.00.048.04 and >= 2.00.054 this scenario is optimized and significantly shorter critical phases can be expected.

 

7. Threads

Following known thread methods:

GarbageLobFileHandler Lob garbage collection (row store)
RSLobGarbageRemover Deletion of orphan row store LOBs
LobGarbageCollector Lob garbage collection (column store)
LobGarbageCollectorJob

Column store LOB garbage collection

Thread detail “table wise” indicates a table wise LOB garbage collection triggered by regular runs based on lob_garbage_collect_interval_s

 

8. SQL checks

Also for LOBs Martin has created some amazing scripts packed into the SQL Collection (1969700):

HANA_GarbageCollection_LOBs

HANA_LOBs_HybridLOBActivation_CommandGenerator

HANA_LOBs_LOBFiles

HANA_GarbageCollection_LOBs_ColumnStore_Reclaims_1.00.122.17+

HANA_LOBs_LOBSizeHistogram_CommandGenerator

 

9. Summary

In the end LOBs should not be placed for a long time into a in-memory database like HANA. But for some circumstances it is not possible for every scenario. Here you have to find a traid-off between performance and ressources. Therefor you have to tune the thresholds of the parameters or use NSE. This can save a lot of memory also during the initial sizing of the system and the further operation after going live.

Assigned Tags

      1 Comment
      You must be Logged on to comment or reply to a post.
      Author's profile photo Andreas Jankowiak
      Andreas Jankowiak

      Thank you Jens, very interesting!