Tableau on HANA
For my review of the “native” connector look here:
Tableau on HANA
Over the last couple of weeks I’ve had several discussions on Tableau software, not only with my BI colleagues but also with my clients. I knew it was a frontend tool, but that was about it. I did not have an opinion on the user-friendliness of Tableau, nor on the way it connects to SAP data sources. Luckily, I have a habit of fiddling around with a product once I hear a number of people talking about it. In my book, it must be worth investigating if there is a buzz.
Being a strong believer in the capabilities of HANA, I decided to combine the two and investigate the possibilities of using Tableau on top of HANA. I consulted the book of knowledge (Google) and entered “tableau HANA”. To my surprise, the following post showed up:
Vishal Sikka (SAP’s CTO) demonstrated Tableau connecting to SAP Business Information Warehouse (SAP BW) running on HANA. That’s interesting. Why would Vishal promote a non-SAP product? Either he really likes the functionality or he wants to demonstrate the openness of the HANA solution to 3rd party applications. Whatever the motive, as Vishal demoed “BW running on HANA”, this gives me the perfect opportunity to investigate the possibilities (and possible limitations) of using the other option: HANA standalone (for example by using the Amazon HANA images).
It always starts with a download
I downloaded a trial version of “Tableau Desktop” from the Tableau website and installed it. Flawless and simple, nothing to it. The next step is connecting it to HANA. As no native HANA connector is available at the moment (as I understood it, the folks over at Tableau will release one in the coming months), I decided to make a connection via ODBC, as this had proven to work beautifully in my Siri demo.
Creating a connection
After opening Tableau I pressed “connect to Data” and made a connection to HANA via ODBC:
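For readers who want to check the same ODBC route outside of Tableau, here is a minimal sketch of a DSN-less connection string. The host, port, user and password are placeholders I made up, and the driver name `HDBODBC` is an assumption about how the HANA client is registered on your machine (check your ODBC administrator). The helper name `hana_odbc_connection_string` is hypothetical too.

```python
# Sketch only: assembles an ODBC connection string for a HANA server.
# All values passed in below are placeholders, not settings from this post.

def hana_odbc_connection_string(host: str, port: int, user: str, password: str,
                                driver: str = "HDBODBC") -> str:
    """Return a DSN-less ODBC connection string (no escaping of special characters)."""
    parts = {
        "DRIVER": "{" + driver + "}",    # assumed driver name as registered by the HANA client
        "SERVERNODE": f"{host}:{port}",  # host and SQL port go into a single key
        "UID": user,
        "PWD": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


conn_str = hana_odbc_connection_string("hana.example.com", 30015, "DEMO_USER", "secret")
print(conn_str)
```

A string like this could then be handed to an ODBC library of your choice. Passwords containing `;` or `}` would need extra escaping, which this sketch leaves out.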
After connecting I can select one or multiple tables. There is even the possibility to use SQL to make a sub-selection of a table (e.g. select only year 2011).
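To give an idea of what such a sub-selection looks like, here is a small sketch. I use Python’s built-in `sqlite3` as a stand-in for HANA so it runs anywhere, and the `SALES` table with its columns is purely hypothetical; only the `WHERE` clause matters.

```python
import sqlite3

# sqlite3 stands in for HANA; table and column names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALES (CUSTOMER TEXT, SALES_YEAR INTEGER, AMOUNT REAL)")
conn.executemany(
    "INSERT INTO SALES VALUES (?, ?, ?)",
    [("Adobe", 2010, 120.0), ("Adobe", 2011, 150.0), ("Initech", 2011, 80.0)],
)

# The kind of statement you could enter as custom SQL:
# only the rows for 2011 ever reach the frontend.
custom_sql = (
    "SELECT CUSTOMER, SALES_YEAR, AMOUNT FROM SALES "
    "WHERE SALES_YEAR = 2011 ORDER BY CUSTOMER"
)
rows_2011 = conn.execute(custom_sql).fetchall()
print(rows_2011)
```

The point is simply that the restriction is applied in the database, so the frontend never has to deal with the other years.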
If you want to select a HANA view (like a calculation view) be sure to select the “_SYS_BIC” schema and not your default schema which holds your normal tables!:
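As a rough illustration of how an activated view is addressed in that schema, the sketch below builds a query string. The package `demo.sales`, the view `CV_SALES` and the column names are hypothetical, and the `package/view` naming reflects my understanding of how activated views show up under “_SYS_BIC”, so verify it against your own system.

```python
# Sketch only: builds the SQL text, it does not connect to anything.

def calc_view_name(package: str, view: str) -> str:
    """Return the quoted name of an activated view in the _SYS_BIC schema."""
    return f'"_SYS_BIC"."{package}/{view}"'


def aggregate_query(package: str, view: str, dimension: str, measure: str) -> str:
    """Return a query that sums one measure per value of one dimension."""
    target = calc_view_name(package, view)
    return (
        f'SELECT "{dimension}", SUM("{measure}") AS "{measure}" '
        f'FROM {target} GROUP BY "{dimension}"'
    )


# Hypothetical package, view and columns.
query = aggregate_query("demo.sales", "CV_SALES", "CUSTOMER", "AMOUNT")
print(query)
```

Note the double quotes around the view name: the slash between package and view would otherwise not be accepted in an identifier.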
Tableau automatically checks the ODBC driver for compatibility limitations. When connecting to the HANA driver it comes up with a bunch of warnings, most of which seem irrelevant.
I took a chance here and ignored the warnings. I wanted to see any limitations for myself once I had created some demo content on my “Sales” table.
After pressing OK, the next option is to create a “Tableau data extract”. That does not sound like a real-time scenario, but fortunately Tableau still gives the option to connect to a live data set. That is exactly what I want: how will Tableau handle the big data volume (100 million+ records) I have in my table?
Press “Connect Live”. Good to go!:
Filtering is the key to success
The fun starts. I started off by just throwing in some dimensions and measures and letting it all rip. Well, forget about trying that on 100 million+ records. The query will simply not give back any results. I did a test by putting my customers in the rows against a sales measure, and it did not come back with any results (well, actually it ran for some hours before I decided to kill it).
You will need to make smart selections when you want to use Tableau on HANA via ODBC. You cannot simply request all data in your HANA table and expect lightning-fast response times. So I decided to use filters to create a smaller data set. This gives another limitation however…
If you want to filter values, you cannot do this by selecting values from a populated dropdown box when the dimension has a lot of different values (a filter on month is not a problem, a filter on 100,000 customers is). It seems the query reads the full table again, which makes it impossible to come back with a list of values. The workaround for this is to switch off the auto refresh and set a filter manually:
The good part is that after switching off the auto refresh, you can still do a live search for a value to set as a manual filter. In the example below I searched for customer Adobe, which comes back with three different hits (note this is due to the way I stored my data):
Even better, it only takes seconds to do this while going through my large dataset!
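To make the difference between “list every value” and “search for one value, then filter” a bit more tangible, here is a small sketch. It is not what Tableau actually sends to HANA (I did not trace that); it only shows the two kinds of queries involved. Again `sqlite3` stands in for HANA, and the `SALES` table, its columns and both helper functions are made up.

```python
import sqlite3

# sqlite3 stands in for HANA; table, columns and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALES (CUSTOMER TEXT, SALES_MONTH TEXT, AMOUNT REAL)")
conn.executemany("INSERT INTO SALES VALUES (?, ?, ?)", [
    ("Adobe", "2011-01", 100.0),
    ("Adobe Systems", "2011-01", 50.0),
    ("ADOBE Inc", "2011-02", 25.0),
    ("Initech", "2011-01", 70.0),
    ("Adobe", "2011-02", 30.0),
])


def search_customers(term: str) -> list:
    """Return only the distinct customer names that contain the search term."""
    sql = (
        "SELECT DISTINCT CUSTOMER FROM SALES "
        "WHERE UPPER(CUSTOMER) LIKE '%' || UPPER(?) || '%' ORDER BY CUSTOMER"
    )
    return [row[0] for row in conn.execute(sql, (term,))]


def sales_per_month(customers: list) -> list:
    """Aggregate sales per month for the manually chosen customers only."""
    placeholders = ", ".join("?" for _ in customers)
    sql = (
        f"SELECT SALES_MONTH, SUM(AMOUNT) FROM SALES "
        f"WHERE CUSTOMER IN ({placeholders}) "
        f"GROUP BY SALES_MONTH ORDER BY SALES_MONTH"
    )
    return conn.execute(sql, list(customers)).fetchall()


hits = search_customers("adobe")   # a short list instead of every customer
totals = sales_per_month(hits)     # aggregation restricted to those hits
print(hits, totals)
```

The search returns a handful of names rather than the full customer list, and the aggregation afterwards only touches the rows for those names.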
Creating some test reports
Tableau hosts a number of small tutorials, which explain the basics of filtering, connecting and formatting data. Go through them (takes about an hour) and you will be confident enough to create some reports.
Based on my limited set of dimensions and fields I threw together some reports in a couple of hours:
Sales data per customer and salesrep, plotted per month. By using the Analysis option I added totals on rows and columns. This chart type is called a highlight table; it emphasizes the larger values based on thresholds you can set yourself or let the system determine.
A bullet graph showing actual and target per salesrep for customer Adobe. A bullet graph is a variation of a bar graph developed to replace dashboard gauges and meters. Table is sorted descending (just a simple click) and I added a “quick filter” on sales, which generated a slider. Simple and neat.
The added value, which comes with Tableau, is that you can take the individual sheets and create a dashboard. Combining my two sheets will look something like this after adding it to a dashboard sheet:
Again, performance is great when doing all of this. I did not witness any lagginess whatsoever.
I can imagine that not being able to query the full dataset at once can be a serious limitation for some. You would have to know in advance what you want to report on. This is where it gets misty. Is Tableau a tool to create dashboards and consume them easily, or is it a tool for full analysis that lets you dive deep into information without knowing what to query next once you have found the answer to your initial query? After trying it out for some hours, I believe it is more of a dashboard tool than a full data analysis tool. For this reason I believe it can still be of added value on top of HANA. As mentioned, once you set a correct filter on your dataset, the performance is excellent. The data comes back in seconds.
As for HANA, I believe she behaved beautifully today. The ODBC connector proved to be flexible and stable enough to easily attach a 3rd party tool like Tableau to it. It did not give a single error during my tests.
I’m looking forward to the future and the new Tableau HANA plugin. Apparently it will be released sometime in the next couple of months. I’ll be sure to try it out once it is available and see if the limitations are ironed out!
Thank you for reading this and take care.