“Hey, whut?!!” (or whatever the similar expression is in your native language) you shouted after reading the title. “Isn’t Big Data some gazillionbytes of data people run on Hadoop?” Yes, this is what you hear and read everywhere. But according to this month’s poll results at KDnuggets – a popular site on Business Analytics, Big Data, Data Mining, and Data Science – not exactly.

For the fourth year in a row the majority of answers to the question “What was the largest dataset you analyzed?” remains “in the GB range”, where G stands for giga, not for gazillion. The most popular range is actually 1-10GB of data. And this is the range that SAP HANA, express edition (aka HXE) covers: you can run it for free with up to 32GB of RAM, including productive use with community support.

“Still, data analysis is all about Hadoop nowadays,” you may continue. Yet again, reality is different. According to another KDnuggets survey from this year, the top 3 most popular tools for Analytics/Data Science in 2016 are:

  1. R
  2. Python
  3. SQL


As you know, SAP HANA is an in-memory, SQL-compliant database. Even better, you can connect to it with the tools from the first two spots on the list. Check these How-to’s:

And since you have mentioned Hadoop, yes, SAP HANA can integrate with it as well. But let’s take one step at a time.

So, SAP HANA Express could be used in half of all data analysis scenarios that members of KDnuggets have done. And did I mention doing it for free?

I am planning to join this movement too. Over the next weeks or months I am going to use SAP HANA Express and share with you the most interesting findings from my journey – in the form of developer tutorials or blog posts.

Just for that I got an Intel NUC, model NUC5i5RYH. But for what I plan, it should not matter which version of HXE you use: on premise or in CAL, as a virtual machine or installed with the binary installer.

Luckily for me, there are already a number of blogs on HXE, including those from my teammate Craig Cmehil. I am going to piggyback on those for a jump start, and then run my own set of exercises. So, please stay tuned.


2 Comments


  1. Gregory Misiorek

    Hi Witalij,

    before i jump into python (and not scala which seems more native to Spark) do you know if xse will work with SAPNewDBMDXProvider.dll version 1.5.22.26? i have a lot of client versions installed on my Windows 8 machine and i’m not entirely sure if another installation will result in this OLE DB integration or not.

    otherwise, keep those exercises coming.

    thx, greg

    1. Witalij Rudnicki Post author

      Hi Greg. That’s quite a question you brought for me. I think it was 2009 when I last connected to HANA with MS Excel 😀 And now I have the latest HANA clients (SAPNewDBMDXProvider.dll version 1.5.26.30) on my PC, which allowed me to successfully test a connection to HXE’s SystemDB, but I have no cubes defined there to do any querying 🙂

