Big Data with SAP | SAP HANA 2.0 – An Introduction

In this blog series you will find quotes, background, suggested further reading, and other information related to my latest book, SAP HANA 2.0, An Introduction, published by SAP Press.

As the goal of the book is to provide an introduction, we could not always devote as many pages to each topic as we would have liked. Big data is one such topic, although the book does include a short paragraph covering SAP Data Hub, SAP Vora, and SAP HANA Hadoop Integration. In this blog, I will cover big data topics in a bit more detail and include references to further information.

Any good? Post a comment, share on social media, and/or give a like. Thanks!

Pure Gold

When business discovered big data it was welcomed as the new (black) gold.

Alas, after five years of drilling and prospecting not everyone remained as enthusiastic.

Looking at web searches with Google Trends, we can see that interest in big data took off in 2012 but has since waned a bit, overtaken by data science and machine learning by 2018. Who’s to blame? Cloud computing.

Google Trends: Big Data versus Data Science

The Vs Everyone Must Know

According to The Origins of ‘Big Data’: An Etymological Detective Story, the term goes back to the 1990s, but from a technical perspective big data took shape between 2004 and 2008, when the contemporary search giants Google and Yahoo developed and later open-sourced MapReduce and the Hadoop Distributed File System (HDFS). Pig, Hive, Zookeeper, and other Apache open source projects followed (the current count is 49).

Big data was initially characterised by three Vs: volume, velocity, and variety. IBM added veracity in The Four V’s of Big Data; then came the 5 Vs Everyone Must Know, The evolution of big data – the ‘6 Vs’, the Seven V’s of Big Data, and the 10 Vs of Big Data. SAP even went up to eleven by adding the V of Vora (more on that V below).

Illustration from The 42 V’s of Big Data and Data Science

Should you want to learn more, the Big data entry on Wikipedia provides as good an introduction as any (including a shady picture of the SAP Big Data bus). Alternatively, visit

The Path Forward

With the Sybase acquisition of 2010, SAP got hold of several big data-related technologies like IQ and Event Stream Processor (ESP) for IoT (Internet-of-Things) ingestion.

In 2012, SAP bundled SAP HANA with several of these technologies as the Real-time Data Platform (RTDP).

 

SAP Real-Time Data Platform (2012)

Infinite Insights

A year later, in 2013, SAP acquired KXEN, the Knowledge eXtraction Engine, which had just brought InfiniteInsight to market for self-service predictive analytics, bringing data mining to the business professional, no PhD required. SAP InfiniteInsight would morph into SAP Predictive Analytics, with the Automated Predictive Library (APL) providing SAP HANA integration. Although we would now file this under Analytics, at the time data mining was the way to go to unlock big data.

For more information about data mining and advanced analytics, see

Smart Data Services

The same year, with the release of SAP HANA SPS 06, smart data access added virtualisation to the SAP HANA platform, which enabled direct access to Hadoop and other data sources from SAP HANA.

Other “smart” technologies followed the next year with SPS 09 (2014): Smart Data Streaming (later Streaming Analytics), based on ESP; Dynamic Tiering (smart data tiering was considered as well), a native big data solution based on IQ; and Smart Data Integration (SDI) and Smart Data Quality (SDQ), both BusinessObjects Data Services technologies, to address the veracity of big data.

SAP HANA smart data access

On the Bus

Also in 2013, SAP partnered with Hortonworks (now Cloudera) to resell big data platforms and started the Big Data Tour to get the developer community on the bus.

Hop Aboard the SAP Big Data Bus | Disrupt SF 2013

SAP HANA, the Real-Time Business Platform

Quo Vadis?

The next year, 2014, Spark integration was added along with a certified Spark distribution, raising some questions about the future direction of SAP (HANA).

Illustration from Bridging two worlds: Integration of SAP and Hadoop Ecosystems

Voracious

Big data integration took one step further with the release of SAP HANA Vora, announced at SAP TechEd 2015.

The name was later shortened to SAP Vora to underline that this was an independent product which did not require the SAP HANA platform (see the FAQ for your questions).

Big Data-as-a-Service

In 2016, SAP acquired Altiscale’s Big Data-as-a-Service (BDaaS) solution, integrated as SAP Cloud Platform Big Data Services. Vora was added to the service and this brought more good news.

Intelligent Technologies

SAP Leonardo was introduced at SAPPHIRE NOW 2017 as an innovation portfolio bringing together the Internet of Things, machine learning, blockchain, analytics, artificial intelligence, and big data technologies.

In 2019, again at SAPPHIRE NOW, SAP refocussed on the business and announced the Business Technology Platform (BTP) as its successor: the fastest way to turn data into business value (yes, that’s one of the big data V’s).

Cloud-native, Multi-cloud, and Hybrid

SAP Vora 2.0

For version 2.0, SAP Vora was re-architected to run inside Docker containers with Kubernetes for cluster management, providing customers “the flexibility to choose among cloud, on-premise and hybrid deployment models, and they can migrate between these options easily and with minimal disruption”.

SAP Data Hub

SAP Vora was now also included with another new containerised application, SAP Data Hub.

Illustration from What is SAP HANA Cold Data Tiering? by Ruediger Karl

SAP Data Intelligence

In 2019, SAP Data Hub was made available as a managed service under the name SAP Data Intelligence, and just recently (March 2020) the on-premise product and the cloud-based service were merged.

A Single Gateway to All Your Data

SAP HANA Cloud, Data Lake

Just released as well (March 2020) is SAP HANA Cloud. This service includes SAP HANA data lake, where we find our old friend IQ at work.

SAP HANA Cloud uses the same container and Kubernetes orchestration technologies as Data Intelligence (and Vora).

Smart data access (virtualisation) plays an important role in the design of SAP HANA Cloud. This includes, of course, access to the usual big data suspects, Hadoop and Spark, but also to Google BigQuery and Amazon Athena.

For more information, see

Learned Something New?  

Post a comment, share on social media, and/or give a like. That’s how the community works. Thanks.

If you would like to receive updates, connect with me on

Best,

Denys van Kempen

Bonus Track

Between Two Schnitzels

SAP HANA 2.0 – An Introduction

Just getting started with SAP HANA? Or do you have a migration to SAP HANA 2.0 coming up? Need a quick update covering the business benefits and a technology overview? Want to understand the role of the system administrator, developer, data integrator, security officer, data scientist, data modeler, project manager, and other SAP HANA stakeholders? My latest book about SAP HANA 2.0 covers everything you need to know.

Get it from SAP Press or Amazon:

For the other posts, see
