Update – Data Lifecycle Management for BW-on-HANA
Data managed by the SAP BW application can be classified into three basic categories:
Hot – frequently accessed (planning, reporting, advanced analytics, data manipulation)
Warm – less frequently accessed, relaxed performance SLAs (usually batch jobs), simple access patterns
Cold – archive, data lakes
For the “warm” data in your BW system, SAP HANA currently offers two options: the non-active data concept (high-priority RAM displacement) and HANA dynamic tiering (using an ExtendedStorage server). We have re-evaluated the situation and found that, with the advancements in hardware technology and in SAP HANA, there is now an even simpler and more compelling option. Instead of introducing a new ExtendedStorage server into your HANA cluster to store the “warm” data, you can use standard HANA nodes, but with a different sizing formula and RAM/CPU ratio.
You basically run an “asymmetric” HANA scale-out landscape: one group of nodes (for your “hot” data) with standard sizing, and another group with “relaxed sizing” – the “extension” group – which stores more “warm” data than it has RAM available. This allows you to run a HANA scale-out landscape with fewer nodes and less overall RAM, but with the same data footprint.
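To make the arithmetic behind the “relaxed sizing” idea concrete, here is a minimal sketch. All numbers (node RAM, ratios, data volumes) are purely illustrative assumptions, not SAP sizing guidance:

```python
import math

# Hypothetical sizing sketch: every figure below is an illustrative
# assumption, not an official SAP sizing recommendation.

def nodes_needed(data_tb, data_per_node_tb):
    """Round up the node count needed for a given data volume."""
    return math.ceil(data_tb / data_per_node_tb)

RAM_PER_NODE_TB = 2     # assumed RAM of one standard node
STANDARD_RATIO = 0.5    # classic rule of thumb: data <= 50% of RAM
RELAXED_RATIO = 1.5     # assumed relaxed ratio for the "extension" group

hot_tb, warm_tb = 4, 6  # assumed hot and warm data volumes

# Symmetric landscape: all data sized with the standard 50% rule.
symmetric = nodes_needed(hot_tb + warm_tb, RAM_PER_NODE_TB * STANDARD_RATIO)

# Asymmetric landscape: hot nodes keep the standard rule, while the
# extension group stores more data than it has RAM.
asymmetric = (nodes_needed(hot_tb, RAM_PER_NODE_TB * STANDARD_RATIO)
              + nodes_needed(warm_tb, RAM_PER_NODE_TB * RELAXED_RATIO))

print(symmetric)   # 10 nodes
print(asymmetric)  # 6 nodes (4 hot + 2 extension)
```

Under these assumed numbers, the same 10 TB footprint drops from ten standard nodes to six, which is the effect the asymmetric landscape is aiming for.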
Such a setup is significantly easier to set up and administer, and it offers, right out of the box, all features of HANA with respect to operations, updates, and data management. The differentiation of data into “hot” and “warm” can be done easily via the BW application, using standard HANA techniques to relocate the data between nodes.
We are currently preparing the boundary conditions for these setups and are in intensive discussion with our hardware partners to enable attractive offers. The goal is to start a beta program in mid-2016. Please stay tuned for more details to follow very soon.
Please note that this setup is currently only planned for BW-on-HANA, since it relies heavily on BW's partitioning, pruning, and control of the access patterns. For applications looking for a more generic “warm” data concept, the HANA dynamic tiering feature is still a valid option. HANA dynamic tiering continues to be supported for BW as well.
Exciting development Klaus, thank you for sharing. I will be following the development very closely.
The original design criteria for BWoH recommended a data-to-memory ratio of 50% to ensure you are able to keep your data in memory without unloads. As customers started using BWoH, they realised that many columns are unused and rarely take up memory, as they are never loaded, so customers squeezed more data into their HANA systems. SAP then introduced the non-active data concept, acknowledging that objects like PSAs need not be in memory all of the time, which provided an opportunity for more active data per GB of memory.

As customers increase beyond the 50% ratio (we have seen some with over 100%), performance management and capacity planning become closely linked, as the risk of hot data being unloaded to satisfy other data demands increases. Lots of data introduces other capacity issues to consider, like increased cluster node reorganisation, backup times, and test data management challenges (e.g. copy-back of data). And from a day-to-day operational perspective, we have seen reliability degrade, with unexpected out-of-memory errors increasing as HANA struggles to balance the demands. Push any system to the limit and you are going to get issues. The trouble is, with no hard limits, it is difficult to define what the limit is until problems occur.
To avoid disappointment with the performance of BWoH using this model, has SAP considered what data-to-memory ratios are reasonable for each of the hot and warm zones? E.g. stick to the 50% data-to-memory ratio for your hot data, to maximise the likelihood of your data being in memory, and allow 150% data-to-memory for the warm data?
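A quick back-of-the-envelope check of the ratios suggested in the question above (the RAM figures per zone are made-up values for illustration only):

```python
# Hypothetical capacity check for the suggested zone ratios:
# 50% data-to-memory for hot, 150% for warm. RAM figures are assumptions.

def data_capacity_tb(ram_tb, data_to_memory_ratio):
    """Data volume a zone can hold at the given data-to-memory ratio."""
    return ram_tb * data_to_memory_ratio

hot_ram_tb, warm_ram_tb = 8, 4  # assumed RAM per zone

hot_capacity = data_capacity_tb(hot_ram_tb, 0.5)    # stays safely in memory
warm_capacity = data_capacity_tb(warm_ram_tb, 1.5)  # relies on displacement

# Blended data-to-memory ratio across the whole landscape.
blended = (hot_capacity + warm_capacity) / (hot_ram_tb + warm_ram_tb)

print(hot_capacity)       # 4.0 TB of hot data
print(warm_capacity)      # 6.0 TB of warm data
print(round(blended, 2))  # 0.83 overall data-to-memory ratio
```

The interesting consequence of such a split is the blended figure: even with the warm zone at 150%, the landscape as a whole stays below a 1:1 data-to-memory ratio in this example.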
I would be interested to hear your thoughts,
I am not sure I understand the direction correctly... can you please confirm that the following understanding is correct?
1.) Extension slaves CAN use the same HW as the primary – they will just hold more data, which will lead to more frequent displacements – however, the CPU/RAM ratio CAN be the same...
2.) Extension slaves CAN have maximised RAM (a relaxed CPU/RAM ratio) to save costs even more – so this would lead to different models...
...where you can use the first, the second, or both options – is this correct? I am just trying to confirm whether the "warm" tier is achieved through displacements, through relaxed HW, or both (if desired)...
I have now published more details here: More Details - HANA Extension Nodes for BW-on-HANA
This should answer your questions why and how it works for BW and what deployment options look like.
Best regards, Klaus
Really interesting, thanks. Are there any updates in relation to Disaster Recovery support, i.e. (HANA) System Replication? It has been a planned feature for a while, but so far it does not seem to have been enabled for Dynamic Tiering.
Thanks in advance.
HANA System Replication (HSR) works in all flavours with the new Extension Node concept. Partial HSR support for Dynamic Tiering is planned for the end of this year – for more details, please see the roadmap slides for SAP HANA.
Best regards, Klaus
Since reading your blog about the new "Extension nodes", I have been wondering whether the concept of "Extension nodes" can be applied to a single-node installation at a later stage.
Let's assume that a customer is buying a single-node 4 TB system, which is available now and would be sufficient for some time.
Also, let's assume that for the moment the customer has good reasons to initially prefer a scale-up over a scale-out approach.
But because of a very positive development of the business, a lot more source systems might become relevant in the future. This means that the initial sizing can become too small, causing the known trouble. The active/non-active concept might not be as efficient as desired either. But parts of the data (old and new) could be candidates for storage on the Extension nodes.
Since the customer is on a single-node installation, I have my doubts whether you can move from a single-node installation to an "unbalanced" scale-out configuration at a later stage, continuing with the 4 TB machine as the hot data store. I could find note 2130603, which describes the scenario of a backup/restore procedure from scale-up to scale-out; step 5 describes that nodes can be added to a single-node installation. But I wonder if such a migration to an "unbalanced" scale-out scenario with extension nodes might struggle due to other aspects I am overlooking (table placement note 1908075 apart).
Another reason I would like that approach: DT is an all-or-nothing approach for a table and is only available for WO (A)DSOs and PSAs. But Extension nodes could work on "warm" partitions of a partitioned table (like NLS can operate on a subset of the data of a given table). If my understanding is correct, DWF DDO can repartition a table so that the warm part of the data could be put on the Extension nodes while the hot part stays on the original node. That would be a plus in my eyes as well.
What do you think? Would such an approach give some protection to the investment made, given an unpredictable future development of the data volume?
Or is this just too much of a hack?
Kind regards, Philipp
Moving from a single-node instance to a scale-out system works via the backup/recovery method described in note 2130603. Once the additional node is added, you start classifying your data as "warm" and then start moving data to this node (using DWF DDO). Currently only additional nodes of the same size are supported, i.e. in your example an additional 4 TB node would be added. We are looking into further options (see option 3 in my follow-up blog).
We support "warm" partitions for the advanced DSOs in conjunction with the Extension Node concept. Please again see the details in the follow-up blog.
Best regards, Klaus
Many thanks, looking forward to the third part of the blog.
Kind regards, Philipp
The document does mention that Dynamic Tiering is still supported, but I have seen other documents that say "The current dynamic tiering approach is discontinued with SAP IQ as a warm store" (DOC-39944). Could you provide some guidance as to the approach for new migrations?
The information provided in the document you mentioned is wrong, and I have asked the colleague to correct it as soon as possible to avoid confusion. The message about BW and HANA Dynamic Tiering is clear: we continue to support customers using or wanting to use DT, but we recommend the Extension Node approach for BW customers looking for an efficient way to store "warm" data.
Migrating from DT to the Extension Nodes is fairly easy, as it is just a matter of moving the data from the Extended Storage server (DT) back to HANA and using the new data placement methods for the Extension Nodes. Since we have only a very small number of DT customers, we have no "automation" of this process.
Best regards, Klaus
Is there any update on when this feature will become generally available?
Where can I get some additional information or tips on the expected storage capability of the Extension nodes, since the examples seem to say that memory-to-data ratios vary between 1:1 and 1:1.5?
GA for the Extension Node concept for options 1 and 2 is planned with the DSP (Datacenter Service Point) for HANA SP12 (~September 2016).
Whether and how option 2 is possible depends on the HW setup used. Please check with your HW partner, as mentioned here:
Best regards, Klaus
Is there any documentation on this besides these two blogs and the SAP Note? Something official saying that it is real would help people who want to use it.
Is it possible to migrate an on-premise BW on HANA system with an NLS setup to the cloud?
What are the best possible options and approaches for this?