Frequency of access, operational usefulness, performance requirements, and storage costs are key considerations for assigning data to different storage tiers. Data tiering is an integral component of an organization’s data architecture strategy: it moves data between high-performing but costlier storage options and slower, less expensive ones, offering valuable performance-cost tradeoffs for large enterprises.
I recently shared an overview of the data tiering options available for SAP HANA. Now let’s take a closer look at the SAP HANA dynamic tiering solution for warm data management.
Is Your Data Hot, Warm, or Cold?
To recap, data tiering ensures that a customer’s mission-critical data, classified by SAP as “hot” data, is located on the highest performance (and highest TCO) storage: the SAP HANA in-memory database. “Warm” data, the focus of this post, is accessed less frequently than hot data and has relaxed performance constraints, so it is stored in lower-cost, disk-backed columnar stores within SAP HANA. Rarely accessed or inactive “cold” data is managed separately from the SAP HANA database and relegated to slower, less expensive disk or Hadoop storage.
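To make the hot, warm, and cold distinction concrete, here is a minimal Python sketch of a classification rule. The thresholds, the `DatasetStats` type, and the `classify_tier` function are hypothetical names invented for illustration; SAP does not prescribe these numbers, and a real policy would also weigh performance requirements and cost.

```python
from dataclasses import dataclass

# Hypothetical thresholds (assumptions, not SAP guidance).
HOT_MIN_READS_PER_DAY = 100
WARM_MIN_READS_PER_DAY = 1


@dataclass
class DatasetStats:
    name: str
    reads_per_day: float
    mission_critical: bool = False


def classify_tier(stats: DatasetStats) -> str:
    """Return 'hot', 'warm', or 'cold' for a dataset."""
    # Mission-critical or heavily read data stays in memory.
    if stats.mission_critical or stats.reads_per_day >= HOT_MIN_READS_PER_DAY:
        return "hot"
    # Occasionally read data can live in a disk-backed store.
    if stats.reads_per_day >= WARM_MIN_READS_PER_DAY:
        return "warm"
    # Rarely read or inactive data goes to external storage.
    return "cold"
```

For example, a table read a dozen times a day would land in the warm tier under these assumed thresholds, while a mission-critical ledger stays hot no matter how rarely it is read.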
SAP HANA Dynamic Tiering
SAP HANA dynamic tiering is an integrated component of the SAP HANA database that offloads the storage and processing of less frequently accessed warm data from SAP HANA memory. It is a cost-effective option that adds intelligent, disk-based extended storage to the SAP HANA database. It allows organizations to create new multistore tables and extended tables that behave in the same way as other SAP HANA tables, but whose data resides in the disk-based extended store. Warm data in the extended store is online and available for both queries and updates, and can be combined with hot data in SAP HANA memory.
SAP HANA dynamic tiering’s extended tables are disk-based columnar tables created in the HANA data catalog. All data loaded into an extended table resides on disk and can be updated, deleted or queried just like a normal SAP HANA table. Multistore tables are SAP HANA partitioned tables that have partitions in both the SAP HANA default column store and the SAP HANA dynamic tiering extended store as illustrated below.
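For the extended tables described above, the following Python sketch assembles the kind of DDL statement involved. The `USING EXTENDED STORAGE` clause reflects the dynamic tiering SQL syntax as I understand it; treat it as an assumption and check it against the documentation for your release. The helper name `extended_table_ddl` and the table and column names are hypothetical.

```python
def extended_table_ddl(table, columns):
    """Build a CREATE TABLE statement for a disk-based extended table.

    `columns` is a list of (name, sql_type) pairs. The trailing
    USING EXTENDED STORAGE clause is assumed from the dynamic tiering
    documentation; verify it for your SAP HANA version.
    """
    column_list = ", ".join(f'"{name}" {sql_type}' for name, sql_type in columns)
    return f'CREATE TABLE "{table}" ({column_list}) USING EXTENDED STORAGE'


# Hypothetical table used only for illustration.
statement = extended_table_ddl(
    "SALES_HISTORY",
    [("ID", "INTEGER"), ("AMOUNT", "DECIMAL(15,2)")],
)
print(statement)
```

The sketch only builds the statement text; executing it would require a database connection, which is left out here.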
Multistore tables let you work with a single, partitioned SAP HANA table instead of creating separate hot and warm store tables. Each partition resides either in memory or in extended storage. Multistore tables support built-in runtime partition pruning for query optimization. They also simplify data modelling, and you can easily manage the movement of data partitions between the two types of storage using SQL statements or the Data Lifecycle Management tool.
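To illustrate the idea behind partition pruning on a multistore table, here is a small Python model. It is a toy simulation, not SAP HANA's implementation: the `Partition` and `MultistoreTable` classes and their methods are hypothetical, and range partitioning on an integer key (a year, say) is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    low: int    # inclusive lower bound of the key range
    high: int   # exclusive upper bound of the key range
    store: str  # "default" (in-memory) or "extended" (disk)
    rows: list = field(default_factory=list)


class MultistoreTable:
    """One logical table whose partitions live in two kinds of storage."""

    def __init__(self, partitions):
        self.partitions = list(partitions)

    def insert(self, key, value):
        # Route the row to the partition whose range covers the key.
        for p in self.partitions:
            if p.low <= key < p.high:
                p.rows.append((key, value))
                return
        raise ValueError(f"no partition covers key {key}")

    def prune(self, low, high):
        """Return only the partitions whose range overlaps [low, high)."""
        return [p for p in self.partitions if p.low < high and low < p.high]

    def query(self, low, high):
        # Scan just the surviving partitions, whichever store they are in.
        return [
            row
            for p in self.prune(low, high)
            for row in p.rows
            if low <= row[0] < high
        ]
```

A query for recent keys touches only the in-memory partition in this model, while a query spanning the whole range reads from both stores through the same table.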
An Integrated Storage and Processing Tier Within SAP HANA
Even though SAP HANA dynamic tiering creates a new store that sits in the SAP HANA database, developers and administrators experience it as a single database. That is because SAP HANA dynamic tiering is an integrated storage and processing tier within SAP HANA, sharing common installation, monitoring, and administration tools. Relevant processes such as backup and recovery and system replication include SAP HANA dynamic tiering data. If a multistore table partition is moved from memory to disk, the move is processed within a transaction boundary, meaning that if the transaction is interrupted or doesn’t complete, the data will not end up in two places.
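The transaction-boundary guarantee can be pictured with a short Python sketch. This is a simplified model under stated assumptions, not how SAP HANA implements it: the two stores are plain dictionaries, and `move_partition`, `locations`, and `MoveInterrupted` are hypothetical names. The point is only that an interrupted move is rolled back so the partition exists in exactly one store.

```python
class MoveInterrupted(Exception):
    """Raised to simulate a failure partway through a move."""


def move_partition(stores, partition_id, source, target, fail_after_copy=False):
    """Move a partition between stores, undoing the copy if anything fails."""
    rows = stores[source][partition_id]
    try:
        stores[target][partition_id] = list(rows)  # step 1: copy to the target
        if fail_after_copy:
            raise MoveInterrupted(partition_id)
        del stores[source][partition_id]           # step 2: remove from the source
    except Exception:
        stores[target].pop(partition_id, None)     # roll back the partial copy
        raise


def locations(stores, partition_id):
    """List the stores that currently hold the partition."""
    return [name for name, parts in stores.items() if partition_id in parts]
```

After a simulated interruption, `locations` still reports only the source store; after a successful move, it reports only the target.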
Although SAP HANA dynamic tiering uses columnar database technology similar to SAP HANA’s, it maintains data on disk, so its sizing and configuration requirements differ from those of a pure SAP HANA in-memory system. A single SAP HANA dynamic tiering node on properly sized hardware can manage up to 100 TB of compressed data on disk.
In production environments, SAP HANA dynamic tiering is normally operated on a dedicated host, but it may also be co-deployed on the same host as SAP HANA for scale-up systems. For SAP HANA scale-out systems, SAP HANA dynamic tiering should be installed on its own machine. The SAP HANA dynamic tiering server may be deployed on commodity hardware, making it a lower cost option for managing warm data, although with slower performance and reduced functional parity compared to SAP HANA in-memory nodes.
When functional gaps exist between SAP HANA and SAP HANA dynamic tiering, data is pulled from SAP HANA dynamic tiering into SAP HANA memory for processing. The Predictive Analysis Library (PAL) provides a good example: PAL runs in SAP HANA, not in the SAP HANA dynamic tiering server, so if a PAL function is executed on SAP HANA dynamic tiering data, the data is automatically copied to SAP HANA memory prior to processing.
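The fallback behavior described above can be sketched in a few lines of Python. Everything here is illustrative: the set of operations the warm tier is assumed to support, the `pal_mean` stand-in for a PAL function, and the `run` helper are hypothetical and do not reflect the actual capability list of either engine.

```python
# Hypothetical capability set: operations the warm tier can run on its own.
EXTENDED_STORE_SUPPORTED = {"sum", "count"}

OPERATIONS = {
    "sum": sum,
    "count": len,
    # Stand-in for a PAL function that only the in-memory engine offers.
    "pal_mean": lambda values: sum(values) / len(values),
}


def run(operation, values, store):
    """Return (result, engine), where engine names the tier that did the work."""
    func = OPERATIONS[operation]
    if store == "extended":
        if operation in EXTENDED_STORE_SUPPORTED:
            return func(values), "extended"
        # Functional gap: copy the warm data into memory, then process it there.
        values = list(values)
    return func(values), "in-memory"
```

In this model a `sum` over warm data is answered by the extended store, while `pal_mean` over the same data is copied to memory first and processed there.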
A Powerful Solution for Warm Data Management
SAP HANA dynamic tiering is a simple yet powerful and cost-effective solution to warm data management for SAP HANA. Managing and storing warm data—data that does not need to be maintained in-memory for real-time processing—is a crucial step for organizations seeking to decouple their data growth from expensive hardware growth.
Learn more about SAP HANA dynamic tiering and stay tuned for our next instalment focused on a second option for warm data management: SAP HANA Extension Nodes.