
Part 1 is here: Share the Knowledge – Big Data and the Real-Time Data Platform Including SAP HANA and Apache Hadoop

Continuing on with HANA and Hadoop in this SAP TechEd recording:

/wp-content/uploads/2014/01/1fig_353957.png

Figure 1: Source: SAP

Big data starts at about 500 million records, not because you can’t store that much, but because that is when you start to query it and run into issues

With HANA you can handle billions of records and terabytes of data

Hadoop comes into the picture when you have hundreds of terabytes of data

At some point you know you are not putting it in HANA

HANA is real-time and has an event stream processor.  You might turn to Hadoop when you have massive amounts of data to ingest.  Hadoop parallelizes the work across machines.

HANA handles a variety of data and can push it to Hadoop.  Hadoop gives you the flexibility to handle all types of data, including image processing.

For Hadoop, the value is the “storage area”, the data lake.  HANA is for high-value data at lower volumes.

You can offload historical data to Hadoop.  Hadoop is not a database.  It manages blocks of data.

Hadoop vs. NLS?  On BW there is a Near-Line Storage (NLS) option based on Sybase IQ to offload data from HANA while guaranteeing the data is there and consistent.  Right now you cannot do NLS in Hadoop.  Hadoop doesn’t have transactions.

/wp-content/uploads/2014/01/2fig_353958.png

Figure 2: Source: SAP

You can go from HANA out to other databases

Smart data access is the “glue”

You can create virtual tables in HANA that refer to tables in other databases

You don’t have to write the syntax of the other sources, and you get richer semantics

You are pushing the processing down to the remote source

Smart data access will send data out to the remote site

Automatic data translation is convenient as well.
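As a rough illustration of the virtual table idea, here is a minimal Python sketch that only builds the text of a `CREATE VIRTUAL TABLE ... AT ...` statement. It does not connect to anything. The schema, table, and remote source names (`ANALYTICS`, `VT_WEBLOGS`, `HIVE_SRC`, `weblogs`) are made up for the example, and the number of name parts needed to locate the remote table is an assumption that depends on the source type.

```python
def quote_ident(name):
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_virtual_table_sql(local_schema, local_name, remote_source, remote_path):
    """Build a CREATE VIRTUAL TABLE statement as a string.

    remote_path holds the name parts that locate the table on the remote
    source (for example database, schema, table). All names passed in
    here are hypothetical.
    """
    target = ".".join(quote_ident(p) for p in (local_schema, local_name))
    source = ".".join(quote_ident(p) for p in (remote_source, *remote_path))
    return f"CREATE VIRTUAL TABLE {target} AT {source}"


# Hypothetical names: a Hive remote source exposing a "weblogs" table.
stmt = create_virtual_table_sql(
    "ANALYTICS", "VT_WEBLOGS", "HIVE_SRC", ("HIVE", "default", "weblogs")
)
print(stmt)
```

Once such a virtual table exists, queries are written against it like any local table, and the remote source does the work it can.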

/wp-content/uploads/2014/01/3fig_353959.png

Figure 3: Source: SAP

Smart data access is one way to connect the “worlds”.

On the left of Figure 3 is the consumption model, store and process, and ingest.

You can use the data in one of two ways – applications such as machine learning & predictive analytics (product recommendations).  Analytics use cases include dashboards, explorations (Lumira) – these can use HANA or Hadoop.

You can go from BusinessObjects to Hadoop

At the bottom you have ESP, the replication framework, and information management; Data Services can operate with Hadoop.

/wp-content/uploads/2014/01/4fig_353963.png

Figure 4: Source: SAP

With a direct HANA-to-Hadoop connection via Smart Data Access you have virtual data access.  Integration via ETL moves the data; with terabytes of data you can move it on a schedule, but it is not interactive.  Data Services gives you Pig with scripting.

You can use BI against Hive using multi-source universes as of BI 4.1 for scheduled reports.

Question & Answer

Q: How do you deal with the fact that the two systems have different response characteristics?

A: With SP7 there is the remote materialization capability to cache queries – you are trading space for time (remote caching)

They are looking at improvements to get into Hive faster
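The caching trade-off can be sketched in a few lines of Python. This is a conceptual illustration only, not how HANA implements remote materialization: `RemoteResultCache`, `fake_hive`, and the 60-second maximum age are all invented for the example. Results of a slow remote query are kept locally (costing space) so that repeated queries come back quickly (saving time) until the stored copy is too old.

```python
import time


class RemoteResultCache:
    """Keep results of slow remote queries for a limited time (conceptual sketch)."""

    def __init__(self, run_remote, max_age_seconds, clock=time.monotonic):
        self._run_remote = run_remote      # function that runs the slow query
        self._max_age = max_age_seconds    # how long a stored result stays valid
        self._clock = clock                # injectable clock, handy for testing
        self._entries = {}                 # query text -> (timestamp, rows)
        self.remote_calls = 0              # how often we really went remote

    def query(self, sql):
        now = self._clock()
        hit = self._entries.get(sql)
        if hit is not None and now - hit[0] <= self._max_age:
            return hit[1]                  # fresh enough: answer from local copy
        self.remote_calls += 1
        rows = self._run_remote(sql)       # slow path: ask the remote system
        self._entries[sql] = (now, rows)
        return rows


def fake_hive(sql):
    """Stand-in for a slow Hive query; returns made-up rows."""
    return [("2014-01", 42)]


fake_now = [0.0]                           # hand-driven clock for the demo
cache = RemoteResultCache(fake_hive, max_age_seconds=60, clock=lambda: fake_now[0])
sql = "SELECT month, COUNT(*) FROM weblogs GROUP BY month"

first = cache.query(sql)                   # goes to the remote system
second = cache.query(sql)                  # served from the cache
fake_now[0] = 120.0                        # two minutes later the copy is stale
third = cache.query(sql)                   # goes remote again
print(cache.remote_calls)
```

The price of the faster second answer is that it may be up to `max_age_seconds` out of date, which is why this suits data that does not change by the minute.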

Q: Smart data access works against different sources?

A: Yes, Teradata, ASE, IQ, SQL Server

Q: What distribution is certified?

A: SAP resells the Hortonworks and Intel distributions

Hive 0.9 or greater is supported, along with Hadoop 1

Q: What connection does smart data access use?

A: It uses ODBC; BI uses JDBC


3 Comments


  1. Kamaljit Vilkhoo

    Hi Tammy

    For a customer who has HANA EE and BW on HANA on the same database, should he go for Hadoop or NLS as the data repository for mature data?

    Kind Regards

    Kamaljit Vilkhoo

    1. Francis Yesudas

      SAP has a new feature with HANA called the Federated Enterprise Data Warehouse. Customers can invest in a low-budget Hadoop system (or a cloud-based one for low cost). Hadoop relies on cheap commodity hardware rather than dedicated HANA hardware, which is very expensive to expand as your data grows over a few years. Hadoop is good at massively parallel job processing and can give performance results similar to HANA. You can connect the SAP data model with the Hadoop data model using a new concept called Open ODS View.

