Employing Extension Nodes for Warm Data Management
In the increasingly complex world of information management, we know a few things to be true: security is paramount, sprawl is pervasive, cataloging and governance are essential.
Among these truths, a simple fact: data changes over time. This means that in order to control sprawl, maintain airtight security, and deploy effective cataloging and governance strategies, an organization must ensure that its data storage solutions acknowledge the shifting and variable value of data, allowing for effortless, real-time retrieval and analysis of mission-critical information and safe, cost-effective warehousing of static, inactive data.
This essential categorizing of data from most-used to least is known as data tiering, and at SAP, we use a simple classification system for our data tiers: hot, warm, and cold. You can read a more comprehensive overview of data tiering in my colleague Courtney Claussen’s recent blog post, but for now we need only a brief recap: mission-critical hot data has strict performance requirements and is stored in-memory within SAP HANA. At the other end of the data lifecycle, cold data is rarely accessed and read-only and can be stored and managed separately from the in-memory database at lower cost.
Hot and cold—relatively straightforward tiers of data with relatively straightforward storage options. But there is a third, middle tier—warm data, which presents an organization with more varied storage options. Warm data is not operational and is less frequently accessed than hot data. Previous SAP blog posts have discussed warm data storage options, including Dynamic Tiering. Customers who use SAP BW on SAP HANA, SAP BW/4HANA, and SAP HANA native applications have another option: the Extension Node.
The Extension Node option exists in a scale-out SAP HANA landscape, where the system employs multiple HANA worker nodes: one master node and several slave nodes. The minimal setup for using an extension node for warm data consists of two hosts: a worker node to store hot data (which must be the master node) and an extension node to store warm data, as illustrated in the diagram below. The extension node itself is a slave node that has been configured to process warm data exclusively. When the database is queried, it automatically retrieves data from the appropriate node. The other node is uninvolved, which creates a clear separation of workload between the two nodes and a clear separation of the data sets. The beauty of this landscape? These nodes are functionally identical; whatever you can do for hot data, you can do for warm data, so there are no inherent limitations or exceptions when using extension nodes for warm data storage.
While the extension node option allows warm data to be stored at a lower price point than in-memory hot data, extension nodes can be somewhat slower to access for processing than worker nodes, due to the size and quality of their CPUs and the fact that some data sets may need to be reloaded from disk.
The lower price point of an extension node is also due to memory capacity. A hot data worker node can only be half full at any given time: for example, if your worker node has a memory capacity of one terabyte (TB), its effective storage capacity is 500 gigabytes (GB). An extension node, on the other hand, can store double its actual memory capacity, so a one TB memory capacity amounts to two TB of warm data storage.
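To make the arithmetic concrete, here is a minimal sketch of the capacity rule of thumb described above. The function and constant names are hypothetical and exist only for illustration; the two factors (half of memory for a hot worker node, double the memory for an extension node) come from the figures in this post, and 1 TB is treated as 1,000 GB to match them. Actual sizing should follow SAP's sizing guidance rather than this simplification.

```python
# Illustrative only: names are hypothetical, factors are taken from the text.
HOT_NODE_FACTOR = 0.5        # a hot worker node can only be half full
EXTENSION_NODE_FACTOR = 2.0  # an extension node stores double its memory size


def effective_capacity_gb(memory_gb: float, is_extension_node: bool) -> float:
    """Return the approximate data volume a node can hold, in GB."""
    if memory_gb < 0:
        raise ValueError("memory_gb must be non-negative")
    factor = EXTENSION_NODE_FACTOR if is_extension_node else HOT_NODE_FACTOR
    return memory_gb * factor


if __name__ == "__main__":
    one_tb_in_gb = 1000  # decimal TB, matching the 1 TB -> 500 GB example
    hot = effective_capacity_gb(one_tb_in_gb, is_extension_node=False)
    warm = effective_capacity_gb(one_tb_in_gb, is_extension_node=True)
    print(f"Hot worker node: {hot:.0f} GB of hot data")
    print(f"Extension node:  {warm:.0f} GB of warm data")
```

With the same one TB of memory, the extension node in this sketch holds four times the data volume of the hot worker node, which is where much of the cost advantage comes from.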
Added Storage Capacity
How is this possible? An extension node provides more storage capacity than hot nodes by extensively leveraging the disk to unload and load any data that does not initially fit into memory. This requires a suitable data layout (e.g. table partitioning) and query design in the application, similar to what is available in BW on HANA and BW/4HANA. Additionally, the extension node benefits from SAP HANA TDI 5, which is based on workload-driven SAP HANA sizing and brings a relaxed core-memory ratio, allowing for a flexible relationship between memory capacity and the number of cores. This allows the extension node to use weaker and fewer CPUs for the same amount of memory. However, for SAP HANA native applications, the memory size of the hot nodes and the extension nodes must be identical, a requirement known as symmetric memory sizing. BW on HANA and SAP BW/4HANA on SAP HANA 2 SPS03 allow for asymmetric extension node sizing, meaning the extension node may have a higher or lower memory size compared to the hot nodes.
SAP customers may be wondering if they can employ more than one extension node. Though it is possible, we encourage customers to begin with one extension node and leverage its factor-of-two storage capacity, because a single large extension node is more manageable than multiple extension nodes. However, should the need arise over time, multiple extension and worker nodes are supported.
Customers may also be wondering exactly how extension nodes are different from dynamic tiering. Each option offers its own value proposition. Dynamic tiering does not support the complete functional scope, but it provides a large-volume solution at a low price point. The extension node supports the complete functional scope of the SAP HANA database, offering the same advanced analytics functionality for in-memory storage and retrieval of warm data that is available for hot data. The two warm storage options can coexist, though their mixed use is not supported in SAP BW use cases, which are the prime example of the value of the extension node: use with analytical rather than transactional workloads.
The extension node provides an efficient, cost-effective warm data storage solution without any sacrifice of functionality, which allows companies to remain in control of their data at every stage of its lifecycle.
To learn more about how extension nodes can transform your warm data storage, see our SAP HANA extension nodes FAQ document.