
Hi Everyone,

In my earlier blogs, I shared what I learnt while exploring Big Data and Hadoop: "Big Data Facts and Its Importance" and "Hadoop, Its Importance and Use Cases".

In this blog, I would like to share how Hadoop and HANA can be integrated with each other.

Let's start with the advantages of using Hadoop:

It can easily handle huge volumes of data

It is very good for storing unstructured data

It is reliable, scalable and fault tolerant

It is open source, so it is less costly

It provides batch processing

Now let's look at some of the limitations of Hadoop:

It is not efficient for small amounts of data

It is less mature than traditional database systems

It is difficult to find qualified talent

It is not suited for real-time scenarios

HANA and Hadoop:

As you already know by now, Hadoop can store huge amounts of data. It is well suited for storing unstructured data, is good at manipulating very large files, and is tolerant of hardware and software failures.

But the main challenge with Hadoop is getting information out of this huge volume of data in real time.

Now, we also have HANA, and as you all already know, HANA is well suited for processing data in real time.

So to get real-time information from massive storage such as Hadoop, we can use HANA, which can be directly integrated with Hadoop.

In short, we can combine Hadoop and HANA to get real-time information from huge data volumes.

Read Solving Big Data with SAP HANA and Hadoop:

http://www.saphana.com/community/blogs/blog/2012/08/27/solving-big-data-with-sap-hana-and-hadoop?q4654483=1

Watch the replay of SAP Big Data Chat on HANA and Hadoop:

http://timoelliott.com/blog/2013/08/sap-big-data-chat-hana-hadoop.html

Read Demystifying Big Data with SAP HANA and Hadoop:

http://events.sap.com/sapphirenow/en/session/2457

Read Hadoop + SAP HANA: Turning Infinite Storage into Instant Insights:

http://www.saphana.com/community/blogs/blog/2013/09/20/hadoop-sap-hana-turning-infinite-storage-into-instant-insights

SAP, Hadoop and HANA:

As explained in the SAP CIO Guide on Using Hadoop, Hadoop can be used in various ways, as described below.

I have added Smart Data Access myself, as it was not available at the time the guide was written, but we can now use Smart Data Access to connect HANA with Hadoop.

Let's see how Hadoop can be used in the SAP world:

[Image: HANA HADOOP.JPG]

Hadoop as a flexible data store:

As we know, Hadoop is less costly, so we can use it as a flexible data store for data from various sources, both SAP and non-SAP, such as social data, streaming data and transaction data. Keeping all of this data in Hadoop makes it available for whatever information or analysis we need later.

Hadoop as a simple database:

We can also use Hadoop as a simple database for storing and retrieving data in very large data sets. We can retrieve data from Hadoop using Hive or HBase.

Hadoop as a processing engine:

We can use the power of the MapReduce programming model for many purposes: for example, Pig can be used for data analysis and Mahout for data mining. We can write MapReduce application code in the language of our choice, which can then be scheduled and executed on Hadoop.
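To make the MapReduce model a little more concrete, here is a minimal sketch of a word count written in plain Python. It does not run on Hadoop; it only mimics the three phases (map, shuffle, reduce) in a single process, and the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are my own for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop stores data", "HANA processes data"]
    print(word_count(sample))
```

On a real cluster the mappers and reducers would run in parallel on many nodes and the framework would handle the shuffle, but the division of work is the same.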

Hadoop for data analytics:

We can mine the data held in Hadoop for business intelligence and analytics.

We have a huge amount of data in Hadoop, but not all of it is useful, since much of it is low-value data, so we will load only the useful data into HANA.

To load data from Hadoop into HANA, we will use SAP Data Services.
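The idea of loading only the useful data can be sketched in a few lines of Python. This is not how SAP Data Services is configured; it only illustrates the filter-then-load step. The record layout and the rule in `is_useful` (keep purchase events that carry a customer id) are hypothetical and chosen just for the example.

```python
def is_useful(record):
    """Hypothetical rule: keep purchase events that carry a customer id."""
    return record.get("event") == "purchase" and bool(record.get("customer_id"))


def select_for_load(records):
    """Return only the records worth loading into the in-memory database."""
    return [record for record in records if is_useful(record)]


if __name__ == "__main__":
    raw = [
        {"event": "page_view", "customer_id": "C1"},
        {"event": "purchase", "customer_id": "C1", "amount": 25.0},
        {"event": "purchase", "customer_id": "", "amount": 10.0},
    ]
    # Only the second record satisfies the rule and would be loaded.
    print(select_for_load(raw))
```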

You can check the YouTube video below on how to load data from Hadoop to HANA:


For more detail about the above scenarios, please refer to the SAP CIO Guide on Using Hadoop.

Accessing Hadoop using Smart Data Access:

Smart Data Access is a new feature that was introduced with SAP HANA SPS06. It enables access to remote data as if it were stored in local tables, without copying the data into HANA.

One of the main benefits of Smart Data Access is that we don’t need any special syntax to access heterogeneous data sources.

Let's say we have structured data stored in HANA and unstructured data stored in Hadoop.

We can now remotely access the Hadoop data using Smart Data Access and combine the structured and unstructured data to create new models, gain real insight into our business, and make better decisions.

How this works:

Let's say we created a combined model using structured as well as unstructured data, as described above, and this model is available for reporting.

Now we make a request through our reporting tool. Based on the request, HANA determines the best way to extract the data (including where and how the data will be processed, based on optimum utilization of application and system resources) and sends a request to Hadoop.
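The flow described above can be illustrated with a toy example in Python. This is not HANA's optimizer and not the Smart Data Access API; `RemoteSource` and `combined_report` are hypothetical names, and the data is made up. The sketch only shows the general idea of pushing a filter down to the remote store so that only matching rows are transferred, and then combining them with local data.

```python
class RemoteSource:
    """Stand-in for a remote store such as Hadoop (hypothetical class)."""

    def __init__(self, rows):
        self._rows = rows
        self.rows_transferred = 0  # counts rows sent back to the caller

    def scan(self, predicate=None):
        """Return rows, applying the predicate on the remote side if given."""
        result = [row for row in self._rows if predicate is None or predicate(row)]
        self.rows_transferred += len(result)
        return result


def combined_report(customers, remote, product):
    """Join local customer rows with remote mentions of one product."""
    # Push the product filter down so only matching rows leave the remote store.
    mentions = remote.scan(lambda row: row["product"] == product)
    names = {customer["id"]: customer["name"] for customer in customers}
    # Combine locally: keep mentions that belong to a known customer.
    return [
        (names[m["customer_id"]], m["text"])
        for m in mentions
        if m["customer_id"] in names
    ]


if __name__ == "__main__":
    customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
    remote = RemoteSource([
        {"customer_id": 1, "product": "phone", "text": "love it"},
        {"customer_id": 2, "product": "laptop", "text": "too slow"},
        {"customer_id": 3, "product": "phone", "text": "ok"},
    ])
    print(combined_report(customers, remote, "phone"))
    print(remote.rows_transferred)
```

Without the pushed-down filter, all three remote rows would have to be transferred before the join; with it, only the two rows about the requested product are.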

Check this awesome blog on Smart Data Access with Hadoop Hive & Impala by Aron MacDonald.

To learn more about Smart Data Access, check the blogs below:

http://scn.sap.com/community/hana-in-memory/blog/2013/08/22/smart-data-access-a-new-feature-by-hana

http://www.saphana.com/community/learn/startups/blog/2013/08/21/hana-curious–smart-data-access

http://www.saphana.com/community/blogs/blog/2013/07/22/smart-data-access-data-virtualization-with-sap-hana

You can also check the videos on how to use Smart Data Access at HANA Academy (in these videos, HANA is connected to Sybase IQ using Smart Data Access):

http://www.saphana.com/community/hana-academy#sps6

Also check the blog below by Aron on streaming real-time data to Hadoop and HANA:

http://scn.sap.com/community/developer-center/hana/blog/2013/08/07/streaming-real-time-data-to-hadoop-and-hana

Check out this video on how Hadoop and HANA can work together by Intel:

https://youtube.com/watch?v=P41tnTT3pek

Hadoop and HANA Use Cases:

1.) Genome Analysis:

MKI is using HANA with Hadoop to improve patient care in the realm of cancer research.

Genome analysis is the technique used to determine and compare genetic sequences (e.g. the DNA in chromosomes).

Learn why HANA was selected for real-time Big Data analysis to deliver advanced medical treatment.

Check the video below:

http://events.sap.com/sapphirenow/en/session/2388

Also check out the YouTube video below:

2.) Real Time Retail Point of Sales:

3.) Using Big Data In the Stadium to improve fan service:

https://youtube.com/watch?v=jxdSa0F8F8g

Check out more HANA Customer Stories:

http://www.sapbigdata.com/stories/bigpoint-solves-big-data-challenges-with-sap-hana/

Check the blog below to learn more about Hadoop use cases:

http://www.saphana.com/community/blogs/blog/2013/10/01/the-big-data-frenzy-and-how-humanity-benefits

SAP’s Hadoop Strategy:

To get the latest news regarding SAP and Hadoop, follow SAP’s Big data site: http://www.sapbigdata.com/

Check this blog to learn about SAP’s Hadoop strategy:

http://www.saphana.com/community/learn/bigdata/blog/2013/05/09/saps-hadoop-strategy

Recently, SAP signed agreements to redistribute and support the Intel Distribution for Apache Hadoop and the Hortonworks Data Platform for its customers.

http://www.news-sap.com/sap-helps-customers-achieve-real-time-big-data-results/

Hortonworks is a company that develops, distributes and supports Hadoop.

Also read the article below from InformationWeek:

http://www.informationweek.com/software/information-management/sap-expands-big-data-push/240161134

If you are interested, you can also join tomorrow’s SAP Big Data Chat with Hortonworks:

http://www.saphana.com/community/blogs/blog/2013/10/02/join-the-sap-big-data-chat-with-hortonworks

Learn more about Hadoop and HANA Integration:


Follow the SAP Database and Technology channel at https://www.brighttalk.com/channel/9727 and watch all webinars for free.

Check the document below for links to all Big Data webinars:

http://scn.sap.com/docs/DOC-44661

Read about the SAP Hortonworks Reference Architecture:

http://hortonworks.com/wp-content/uploads/2013/09/Reference.Architecture.SAP_Hortonworks.v1.1.pdf

Read about Combining SAP Real-Time Data Platform with Hortonworks Data Platform:

http://hortonworks.com/wp-content/uploads/2013/09/SAP_HortonWorks_GB_24469_en.pdf

Thank You for reading my blog.


11 Comments


  1. Aron MacDonald

    Thanks for the mention.

    Very nice overview and collection of links.

    In addition to what you’ve mentioned, if you don’t have BODS and are prepared to invest in a small amount of custom Java development, then you can also use Hadoop Oozie to schedule data loads between HANA and Hadoop if required.

    Hadoop Flume can also be used to stream data (such as Twitter) in real time to HANA, if you don’t have Sybase ESP.

    E.g.

    http://scn.sap.com/community/developer-center/hana/blog/2013/08/07/streaming-real-time-data-to-hadoop-and-hana

    There are lots of interesting integration options to leverage the benefits of the 2 platforms. 🙂

  2. Pinaki Patra

    Hi Vivek ,

    I am trying to connect Hadoop to SAP HANA SP9.

    But in the Web IDE we are not getting any adapter name.

    Do we have to install it separately?

  3. Amit Lal

    Hi Vivek,

    Do you know if Hadoop can work in a HANA MCOD scenario, or does it only fit with multitenant database containers (MDC)?

    Regards,
    AK

