SAP HANA Cloud, Data Lake Relational Engine on Object Storage
With the 2022 QRC3 release of HANA Cloud, the data lake relational engine gains a major new feature that changes the TCO profile of the relational engine and improves performance in a variety of areas.
What is this change?
As you may know, the HANA Cloud, data lake relational engine was based on the on-premise SAP IQ technology. This core underwent a transformation to make it available as a cloud service, while still providing a wide breadth of features and the maturity of the on-premise product at its core.
Up to this point, the HANA data lake relational engine (also known as HDLRE) has relied on more traditional block-based disk storage in the cloud to provide its service. This enabled us to make the service available in the cloud in a shorter timeframe, with complete confidence in its quality and with a huge amount of functionality. However, there were some drawbacks. For example, when provisioning an HDLRE instance, you had to pre-provision your storage and grow it yourself when it got full. In addition, the amount of storage you provisioned had a direct impact on the I/O performance of your instance. Finally, you could not shrink your storage once it was provisioned unless you rebuilt your instance.
Starting with the 2022 QRC3 release, HDLRE now leverages HDL Files storage for its database files. In other words, HDLRE stores its database files in object storage. This resolves many of the drawbacks cited above and provides some additional benefits.
What Are The Benefits?
Moving to object storage for the HANA Cloud, data lake relational engine storage provides some significant benefits.
- No more need to pre-provision storage. Storage will dynamically grow and shrink (in 1TB increments) based on the amount of data you have stored in HDLRE. This also makes provisioning simpler since you no longer need to supply a size for the relational engine.
- The cost of storage goes down because storage is elastic, and grows/shrinks based on your usage. This will potentially provide significant savings for at-rest data storage.
- Performance becomes independent of provisioning size. Your storage performance is more dynamically scalable and related more to the amount of activity occurring than the amount of data stored.
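To make the 1TB increment concrete, here is a minimal Python sketch. It assumes that capacity is simply rounded up to the next whole terabyte of stored data. The function name and the rounding rule are illustrative assumptions, not a description of how the service is implemented.

```python
import math

INCREMENT_TB = 1  # increment size stated above


def allocated_storage_tb(data_tb: float) -> int:
    """Capacity in TB for a given amount of stored data.

    Assumption: usage is rounded up to the next whole increment.
    """
    if data_tb < 0:
        raise ValueError("data_tb must be non-negative")
    return math.ceil(data_tb / INCREMENT_TB) * INCREMENT_TB


# Storage follows the data: it grows as data is loaded and shrinks after deletes.
for stored in (0.2, 3.4, 4.0, 2.1):
    print(f"{stored} TB stored -> {allocated_storage_tb(stored)} TB allocated")
```

Under this assumption, loading data from 3.4TB to 4.0TB keeps the allocation at 4TB, and deleting down to 2.1TB brings it back to 3TB without any manual resizing.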
How Are Costs Impacted?
There are a few changes to metering that will impact costs for HANA Cloud, data lake. Overall, we expect most, if not all, customers to see a decrease in costs. In some cases this will be a small benefit, and in others it will be significant. It is also important to note that these cost improvements will be realized incrementally, since there is some cost associated with the upgrade to use cloud object storage.
Your HANA Cloud, data lake billing will also look slightly different with the new Cloud DBSpace feature enabled. Below is a summary of what you can expect to see.
As I mentioned above, object storage is significantly cheaper than block-based storage. We expect to see a large decrease in costs for data at rest.
The SLA for backup is not changing, but there are a couple of changes related to backup that will affect TCO. The system database, which is small compared to the amount of user data we expect to see in a data lake, will still be backed up using traditional database backup methods. The user data, which is stored in object storage, will leverage the features of object storage (e.g. snapshots and durability) to ensure recoverability if a problem arises that requires recovery. Depending on your actual usage of the relational engine, you could see a significant decrease in backup costs. For example, if you only add and rarely update data in HDLRE, your backup costs will be significantly lower. If, however, you are constantly updating data in HDLRE, your backup costs will be similar to what they are today.
There are some changes to the compute charges for HDLRE that align it more closely with a pay-per-use model. The existing compute charges (the number of vCPUs allocated to HDLRE processing) are not changing, but the HANA Cloud, data lake Files API calls metric will now be used to track reads from and writes to object storage. These costs are directly related to your actual usage of the data lake. If you are storing data and only querying it infrequently, you will not see much of a change in your compute costs. However, if you are querying the data lake heavily, you could see an increase in your compute costs.
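The split between the two compute components can be illustrated with a small Python sketch. The function name, its parameters, and all rates and volumes below are hypothetical placeholders chosen for illustration. They are not SAP prices or the actual metering formula.

```python
def compute_cost(vcpu_hours: float, api_calls: int,
                 rate_per_vcpu_hour: float, rate_per_1000_calls: float) -> float:
    """Sum of a vCPU-based charge and a usage-based Files API call charge.

    All inputs are hypothetical; real rates come from your SAP contract.
    """
    vcpu_charge = vcpu_hours * rate_per_vcpu_hour
    api_charge = (api_calls / 1000) * rate_per_1000_calls
    return vcpu_charge + api_charge


# Placeholder numbers only: same vCPU allocation, different query activity.
light = compute_cost(vcpu_hours=100, api_calls=2_000,
                     rate_per_vcpu_hour=1.0, rate_per_1000_calls=0.5)
heavy = compute_cost(vcpu_hours=100, api_calls=2_000_000,
                     rate_per_vcpu_hour=1.0, rate_per_1000_calls=0.5)
print(f"light usage: {light}, heavy usage: {heavy}")
```

With the vCPU allocation held constant, the only term that moves is the API call component, which is why a lightly queried data lake sees little change while a heavily queried one could see an increase.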
Network Data Transfer
You may also see some changes in the Network Data Transfer metric. Prior to this change, there were network data transfer charges on some cloud providers for both read and write activities to the data lake. After this change, this metric should reflect only read operations from the data lake (i.e. query result sets and file reads). In almost all cases this should result in no change or a reduction of Network Data Transfer charges.
How Do I Upgrade My Existing HDLRE Instance to Use Object Storage?
With the release of HANA Cloud QRC3, all new instances of HANA Cloud, data lake relational engine will use object storage by default. However, existing instances will continue to use the traditional storage until upgraded. The upgrade to object storage for your instance is a two-step process. The first step is to upgrade the software to QRC3. This is a ‘regular’ upgrade, available to you from the HANA Cloud Central tool. You can do this upgrade at the time of your choosing.
After the software for your instance has been upgraded, another upgrade will be made available to you to perform the upgrade to object storage. This upgrade is executed in the exact same way as the software upgrade – from the HANA Cloud Central interface for your instance. The only difference is that this upgrade will take a little bit longer than a regular upgrade, since your data will be moved to object storage. The duration of the upgrade is dependent on how large your instance is. It could be anywhere from a few minutes for a 1TB instance, to a few hours for much larger instances.
Once the upgrade is complete, you are all set. There is nothing further that you must do to enable the use of object storage.
We recommend that you schedule and perform this upgrade based on your business availability. All HANA Cloud, data lake instances will be upgraded to the 2022 QRC3 release by SAP beginning in Q1 of 2023. The pre-defined maintenance window will be used for this upgrade.