This is a companion to my earlier blog, where I demonstrated HADOOP HBASE records being read by HANA and presented using SAPUI5.

Tip of the iceberg: Using Hana with Hadoop Hbase


In this blog I expand on this topic further by enabling HANA to read and write to HBASE.

I’ve created a simple application where HBASE is used to capture the change log of a HANA table.


But why, you might say?


HADOOP is designed for BIG Data (e.g. at the Petabyte scale), and at its core it uses disks for storage. It’s still very much a ‘do it yourself’ technology, with very few Enterprise-ready applications built on top of it yet.

HANA is arguably designed for LARGE Data (e.g. at the Terabyte scale) and makes use of the latest In-Memory technology. By contrast it has a huge catalog of SAP software now running on it.


The two technologies are complementary in an Enterprise Data Architecture.


Many modern websites (e.g. Facebook, Twitter, LinkedIn) use a complex combination of HADOOP, traditional RDBMS, custom development and a wide collection of website scripting tools. The aspiration for many of these websites is millions or billions of users.


For Enterprises that don’t consider IT their core business, venturing down this path may not be the most straightforward option. HANA is more than just a database; it also provides the platform for building and deploying custom Enterprise Applications for Desktop and Mobile. In this case a far simpler BIG Data Architecture might be to use just HANA & HADOOP. Enterprise applications may only be targeted at hundreds to tens of thousands of users.


In the following example I’ve built a very simple application on HANA, which enables a table to be maintained on an HTML5 webpage. For audit and control purposes I wanted to keep a log of all the changes users make to Column values. I could have implemented this purely in HANA, but to demonstrate the integration potential of HANA and HADOOP HBASE, I have opted to also write the changes to HBASE. On a small scale the benefit of this is negligible, but on a larger scale there may be considerable cost savings from storing low value data on an open source, disk based solution such as HADOOP.


The following diagram represents the technical data flow of my example:

Now for an example.

The following shows the DOCUMENTS table in HANA, and HBASE Table Log prior to a change:


In HBASE the change log (for Doc_Name DOC1 & Row 1) is stored as:

Now I make a change, e.g. changing the Free Text from ‘APPLE’ to ‘APPLE CIDER’:

Update Successful!  (Changes written to HANA and HBASE)


From SAPUI5 the HBASE Change Log appears as:

Above you can now see the history of the ‘Free Text’ field.



In HBASE the Change log Table appears as:

NOTE: In the HADOOP User Interface (HUE) only the latest change is shown; however, behind the scenes I’ve defined the HBASE table to store up to 100 changes (versions) of an individual column.


I can also check these in Hbase Stargate directly, though they are BASE64 encoded:


OR by checking the HbaseTableLog xsjs I created, using the GET Method (which decodes):

NOTE: I used POSTMAN here to help format the returned JSON to make it a bit easier to test. Above you’ll see the history for the Free_Text field.
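The decoding step mentioned above can be sketched roughly as follows. This is not the code from the prototype; it is a minimal illustration that assumes the JSON layout the HBase REST (Stargate) interface returns, where row keys, column names and cell values (the `$` property) are Base64 strings. It uses the Node.js `Buffer` API for decoding, which is an assumption: inside the XS engine a different Base64 helper would be needed. The sample response, including the row key `DOC1`, is hypothetical.

```javascript
// Decode a Base64 string to UTF-8 text.
// Assumption: Node.js Buffer is available (it is not part of the XS engine).
function fromBase64(text) {
  return Buffer.from(text, "base64").toString("utf8");
}

// Flatten a Stargate-style JSON response into readable entries,
// one per cell: { row, column, timestamp, value }.
function decodeStargate(response) {
  const entries = [];
  for (const row of response.Row || []) {
    const rowKey = fromBase64(row.key);
    for (const cell of row.Cell || []) {
      entries.push({
        row: rowKey,
        column: fromBase64(cell.column),
        timestamp: cell.timestamp,
        value: fromBase64(cell.$),
      });
    }
  }
  return entries;
}

// Hypothetical sample: one row with one cell in the 'table' column family.
const sampleResponse = {
  Row: [{
    key: "RE9DMQ==",
    Cell: [{ column: "dGFibGU6RnJlZV9UZXh0", timestamp: 1, $: "QVBQTEU=" }],
  }],
};

console.log(decodeStargate(sampleResponse));
```

When several versions of a column are requested, each version arrives as its own cell with its own timestamp, so the same loop yields the full history.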

The key feature to get this prototype working is the HANA SPS07 functionality which enables XS JavaScript libraries (xsjslib) to be called on ODATA CRUD events.


E.g. DOCUMENTS.xsodata

service {
  "HADOOP"."HanaHbase::DOCUMENTS" as "DOCUMENTS"
     update using "HanaHbase:DOCUMENTS_table_exits.xsjslib::update_instead";
}


NOTE: To compare the Before and After records and determine which field changed, I’ve made use of the example code provided by Thomas Jung for converting a SQL record set to a JSON object; see http://scn.sap.com/thread/3447784
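Once both records are available as plain JSON objects, the comparison itself can be quite small. The following is a minimal sketch of that idea, not the code from the prototype; the function name `changedFields` and the column names `Doc_Name` and `Row` in the sample are hypothetical.

```javascript
// Compare two records (plain objects keyed by column name) and return
// one entry per column whose value differs.
function changedFields(beforeRecord, afterRecord) {
  const changes = [];
  for (const column of Object.keys(afterRecord)) {
    if (beforeRecord[column] !== afterRecord[column]) {
      changes.push({
        column: column,
        oldValue: beforeRecord[column],
        newValue: afterRecord[column],
      });
    }
  }
  return changes;
}

// Hypothetical before/after images of one DOCUMENTS row.
const beforeRow = { Doc_Name: "DOC1", Row: 1, Free_Text: "APPLE" };
const afterRow = { Doc_Name: "DOC1", Row: 1, Free_Text: "APPLE CIDER" };

console.log(changedFields(beforeRow, afterRow));
```

The strict comparison (`!==`) only suits scalar column values such as strings and numbers; dates or binary columns would need their own comparison.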


During the ODATA PUT (UPDATE) I’ve modified both HANA & HBASE with the most recent change.
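To give an idea of what the HBASE side of such an update can look like, here is a minimal sketch that builds a request body for the HBase REST (Stargate) interface, which expects row keys, column names and values as Base64 strings. It is an illustration under assumptions rather than the prototype's code: the function name `buildStargatePut`, the row key `DOC1` and the shape of the `changes` array are hypothetical, and `Buffer` is a Node.js API that the XS engine does not provide.

```javascript
// Encode text as Base64.
// Assumption: Node.js Buffer is available (not part of the XS engine).
function toBase64(text) {
  return Buffer.from(String(text), "utf8").toString("base64");
}

// Build a Stargate-style JSON body that writes one cell per changed column
// into the given column family of a single row.
function buildStargatePut(rowKey, family, changes) {
  return {
    Row: [{
      key: toBase64(rowKey),
      Cell: changes.map(function (change) {
        return {
          column: toBase64(family + ":" + change.column),
          $: toBase64(change.newValue),
        };
      }),
    }],
  };
}

// Hypothetical change to the Free_Text column, logged in the 'table' family.
const putBody = buildStargatePut("DOC1", "table", [
  { column: "Free_Text", newValue: "APPLE CIDER" },
]);

console.log(JSON.stringify(putBody));
```

A body like this would typically be sent with an HTTP PUT and a `Content-Type: application/json` header to the row's URL on the Stargate server. Because no timestamp is set, HBASE assigns one when the cell is written, which is what makes each write a new version.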


The HANA table is only set up to store the current value.

The equivalent HBASE table is defined to keep the most recent 100 changes per COLUMN.


The HBASE Log table was defined as:

create 'HanaTableLog', {NAME => 'table', VERSIONS => 100}
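With VERSIONS set to 100, older values stay available, but a plain read returns only the newest one. The HBase REST (Stargate) interface accepts a `v` query parameter to request more versions of a cell. The sketch below only builds such a URL; the host and port, the row key `DOC1` and the function name `versionsUrl` are assumptions for illustration, and the column family and qualifier are assumed to be URL-safe.

```javascript
// Build a Stargate URL that asks for up to maxVersions versions of one column.
// Only the row key is URL-encoded; family and qualifier are assumed URL-safe.
function versionsUrl(baseUrl, table, rowKey, family, qualifier, maxVersions) {
  return baseUrl + "/" + table + "/" + encodeURIComponent(rowKey) +
    "/" + family + ":" + qualifier + "?v=" + maxVersions;
}

// Hypothetical Stargate endpoint and row key.
const historyUrl = versionsUrl(
  "http://localhost:8080", "HanaTableLog", "DOC1", "table", "Free_Text", 100
);

console.log(historyUrl);
```

A GET on this URL with an `Accept: application/json` header should return one cell per stored version, each with its own timestamp.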


The complete code for this prototype is available on github.

https://github.com/AronMacDonald/HanaHbase


3 Comments


  1. ro ma

    Nice post!

    For information Hue HBase Browser actually shows the latest 3 versions when clicking on the editor icon of some cell.

  2. Madhusudhan Kalithireddy

    Hi Aron ,

    Thanks for the nice explanation, it helped me a lot.

    Can we use XS to create the tables in Hadoop and load the data on a daily basis (approx. 500+ tables from the HANA DB)?

    Kindly explain the process, if there is a way to achieve that requirement.

    Regards,

    Madhu
