Former Member

Why SAP Vora for Big Data?

As the former SAP CIO used to say, “HANA is growing up so fast.” Well, HANA has certainly matured over the last five years, and with age, the adolescent has new demands.

Anyone who knows anything about HANA knows it's quite expensive, in terms of both licensing and hardware. In the BW world, this becomes a problem because of explosive growth in data, primarily due to IoT. So SAP provides the option to store warm data in a columnar, disk-based store (Dynamic Tiering), which is managed directly by HANA. This is far cheaper than HANA in-memory storage and thus lowers the cost per unit of data stored in the overall solution.

But why stop there? As customers' appetite for storing and processing data grows, SAP has to offer a way to leverage Big Data / Hadoop as a cold store. A popular strategy is to use Smart Data Access (SDA) to reach Hadoop via Hive or Spark. What most people are not aware of is that this is not a good way of utilizing Hadoop. With SDA, HANA simply sends the query to Hadoop, and the data is returned to HANA for processing. However, the entire premise of HANA is to send the code to where the data is, so SDA is not the right approach for Big Data. What is needed is for HANA to be able to inject its query into the Hadoop nodes and leverage Hadoop's processing power, for example to run transformations on very large data sets that cannot be loaded into HANA memory anyway.
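To make the "send the code to where the data is" argument concrete, here is a toy sketch in plain Python. It is not SDA or Vora code and uses no SAP or Hadoop API; the partitions, the function names, and the numbers are all made up for illustration. It only counts how many rows have to travel to the central engine when raw rows are pulled back, versus when each node aggregates locally and ships its partial results.

```python
from collections import Counter

# Hypothetical cold-store data spread across three "Hadoop nodes".
# Each row is (region, amount); names and values are illustrative only.
partitions = [
    [("EU", 10), ("US", 5), ("EU", 7)],
    [("US", 3), ("APJ", 8), ("EU", 1)],
    [("APJ", 2), ("US", 9), ("US", 4)],
]


def pull_then_aggregate(parts):
    """Ship every raw row to the central engine, then aggregate there."""
    shipped = [row for part in parts for row in part]
    totals = Counter()
    for region, amount in shipped:
        totals[region] += amount
    return dict(totals), len(shipped)


def push_down_aggregate(parts):
    """Aggregate on each node and ship only the per-node partial sums."""
    partials = []
    for part in parts:
        local = Counter()
        for region, amount in part:
            local[region] += amount
        partials.extend(local.items())
    # The central engine only merges the partial sums it received.
    totals = Counter()
    for region, amount in partials:
        totals[region] += amount
    return dict(totals), len(partials)


pulled_totals, pulled_rows = pull_then_aggregate(partitions)
pushed_totals, pushed_rows = push_down_aggregate(partitions)

print("rows shipped when pulling raw data:", pulled_rows)
print("rows shipped when pushing the aggregate down:", pushed_rows)
print("same answer either way:", pulled_totals == pushed_totals)
```

Both approaches give the same totals, but the pushed-down version moves fewer rows. On a toy data set the difference is small; on data sets too large for HANA memory, the gap between shipping raw rows and shipping partial results is the whole point.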

This is the problem Vora addresses. Vora is a layer that sits on top of Spark in Hadoop, and its role is simply to allow HANA to leverage Hadoop for processing-intensive work. There are other advantages as well, such as support for hierarchies and currencies in line with HANA.

The downside of Vora is that it triples the sizing requirement for the Hadoop cluster and adds significant SAP licensing costs. This damages the case for Big Data with SAP for the time being. But Vora is still new, and as time passes we will see more refinements and perhaps a more feasible licensing strategy from SAP.

      Puntis Palazzolo

      Hi Adeel. It's true that SAP HANA Vora fits into the Hadoop ecosystem and leverages Spark capabilities. However, it is not only meant to be used with HANA; it is also a standalone platform. Vora can work with HANA when customers want to build big data scenarios that combine their contextual data with their Hadoop data, but in general it is a standalone platform with multiple engines, such as Time Series, Graph Engine, Relational Engine, Hierarchies, Disk and Document Store, etc. These allow customers to consume and analyze different data types and develop their full-stack applications on Vora. It certainly doesn't triple the sizing for the Hadoop cluster or the licensing cost. For more information, please refer to the Vora sizing guide, and please get in touch for more details on pricing.

      Daniel Rutschmann

      Hi Adeel,

      I see that Puntis has already made several good points. I'm not sure where you got the information that Vora triples the data size. This is certainly not true.

      You are talking purely about the data tiering use case. In my opinion, that is not really a Big Data use case. It is a use case mostly driven by Data Warehousing and Data Marting scenarios. Big Data tackles different scenarios and challenges. While Vora provides significant value for the data tiering use case, it was certainly not built just to provide a faster and more efficient integration from SAP HANA to Hadoop. Vora's main focus is to provide rich, interactive analytical and advanced data processing capabilities for Hadoop and NoSQL use cases.

      Have a look at a blog I posted today that explains this in more detail and might help in understanding the bigger picture.

      Best, Daniel