Triple H: Hadoop, Hive, HANA on Windows 10’s Bash Shell (Part-3)
This is my last post in this series.
Here I show the setup of my SAP HXE 2.0 instance, installed locally on my desktop.
I also describe the steps I performed to access my Hadoop/Hive database, installed on my local Linux installation (the Windows 10 Bash Shell), via SAP Smart Data Access.
Finally, I use SAP HANA Studio to virtualize the remote tables of my Hadoop/Hive database.
The steps here are from the SAP HANA Express playlist on the SAP HANA Academy YouTube channel. I assume you have SAP HANA 2.0 Express Edition up and running on a virtual machine, and that SAP HANA Studio is installed and connected to SAP HXE.
Let’s get started:
Putty as SSH client
Simba Apache Hive ODBC/JDBC drivers for Linux
SAP HANA Studio – Data Provisioning – SDA
Putty as SSH Client
As you may know, the interface that comes with VMware Workstation is not very user-friendly. In my case, I just use PuTTY as an SSH client to access my SAP HXE instance. With very simple settings, the private IP address and the SSH port (22), I am able to connect to my SAP HXE instance:
To install the UnixODBC driver, you need to register your virtual machine with SUSE to get the repository updates. Just follow video 22 from the SAP playlist, then video 27 for the UnixODBC installation.
Update: I had no choice but to update to UnixODBC 2.3.4. This is because “Simba Hive ODBC” was not connecting at all.
To install UnixODBC manually, the C++ compiler is required. Just use “sudo yast” as shown in this SAP HANA Academy video. I did not install VMware Tools itself, just the C++ compiler.
I used FileZilla to move the UnixODBC 2.3.4 file to the instance, then unpacked it with gunzip and tar. Finally, the installation went as follows:
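The unpack-and-build sequence can be sketched roughly as below. This is only a sketch of the usual source build (configure, make, make install), not a transcript of my session: the tarball location is an assumption, and the `run` and `build_unixodbc` helpers are hypothetical names introduced for illustration. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: the tarball path is an assumption, adjust it to your upload folder.

# run: execute a command, or only print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_unixodbc TARBALL: unpack, build and install unixODBC from source.
# The body runs in a subshell, so the caller's working directory is unchanged.
build_unixodbc() (
    set -e
    tarball="$1"                      # e.g. /_drivers/unixODBC-2.3.4.tar.gz (assumed)
    workdir=$(dirname "$tarball")
    srcdir="${tarball%.tar.gz}"       # directory name created by tar
    run cd "$workdir"
    run gunzip "$tarball"             # leaves the .tar file
    run tar -xf "${tarball%.gz}"      # unpacks the source tree
    run cd "$srcdir"
    run ./configure                   # needs the C/C++ compiler installed via yast
    run make
    run sudo make install             # default prefix is /usr/local
)

# Example (prints the commands only):
#   DRY_RUN=1 build_unixodbc /_drivers/unixODBC-2.3.4.tar.gz
```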
I also had to install “libsasl2” from the SUSE VMWare repository. This is also required for “Simba Hive ODBC”:
Now, keep in mind that the instance number for SAP HXE 2.0 is 90, not 00. In this case, the Tenant Database port will be 39013, just in case you decide to test the UnixODBC connection.
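The port number follows from the instance number: instance 90 gives 39013, that is, 3 + instance number + 13. A minimal sketch of that arithmetic, where `hana_sql_port` is a hypothetical helper name and not part of any SAP tool:

```shell
#!/bin/sh
# hana_sql_port NN: print 3<NN>13 for a two-digit instance number NN.
# Hypothetical helper, only illustrating the pattern behind the port above.
hana_sql_port() {
    case "$1" in
        [0-9][0-9]) echo "3${1}13" ;;
        *) echo "instance number must be two digits, e.g. 90" >&2
           return 1 ;;
    esac
}

# Example:
#   hana_sql_port 90
```

The instance number is handled as a string on purpose, so that values with a leading zero such as 00 are not misread as octal numbers.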
Simba Apache Hive ODBC/JDBC drivers for Linux
Download it from Simba Technologies. Registration is needed for a 30-day trial. The license is sent by email after registration.
I downloaded it locally. On my SAP HXE instance, I first created a directory for the drivers:
sudo mkdir /_drivers
sudo chmod -R 777 /_drivers
Then, using FileZilla, I uploaded the ZIP file to my SAP HXE instance:
The steps here are from SAP HANA Academy “SDA: Configuring ODBC Drivers” YouTube video.
After the “gunzip” and “tar” steps, the new folder “simba” is created in the installation path. The steps are pretty much the same as in the SAP HANA Academy video. The only difference is that the “simba” folder now has two subdirectories: 32 and 64. Do not forget to put the license file into the home directory of the hxeadm user.
I added my Hadoop/Hive IP address to the /etc/hosts file as follows:
127.0.0.2 hxehost.localdomain.com hxehost
These are my private IP addresses, so there is no problem showing them here.
Preparing the Simba ODBC (64-bit) ini file. There is no need to copy it to the home directory of the hxeadm user:
# Generic ODBCInstLib
# SimbaDM / unixODBC
Adding the Hive DSN configuration to the .odbc.ini file:
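For readers who want a starting point, a DSN entry could look roughly like the one written by the sketch below. The DSN name and the driver path are the ones that appear in the isql test later in this post; the host, port 10000 (the usual HiveServer2 default), `HiveServerType`, `AuthMech` and `UID` values are assumptions that must be adapted to your own Hive server, and `write_hive_dsn` is a hypothetical helper name.

```shell
#!/bin/sh
# write_hive_dsn FILE HOST: append an example Hive DSN to FILE.
# All values except the DSN name and driver path are assumptions.
write_hive_dsn() {
    cat >> "$1" <<EOF
[MYHIVE]
Driver=/_drivers/simba/hiveodbc/lib/64/libsimbahiveodbc64.so
Host=$2
Port=10000
HiveServerType=2
AuthMech=2
UID=hiveuser
EOF
}

# Example (the host name is a placeholder):
#   write_hive_dsn "$HOME/.odbc.ini" my-hive-host
```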
My customer.sh file:
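As a rough idea of what such a file usually holds, the sketch below sets the two environment variables an ODBC setup like this typically depends on. It is an assumption and not a copy of my file: the library directories are guesses based on the paths used in this post and on the default /usr/local prefix of a unixODBC source build.

```shell
# Hypothetical customer.sh contents for the hxeadm user (assumed, not the original file).

# Let the loader find unixODBC (built from source, default prefix /usr/local)
# and the 64-bit Simba Hive driver libraries.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}/usr/local/lib:/_drivers/simba/hiveodbc/lib/64"

# Tell unixODBC which file holds the user DSN definitions.
export ODBCINI="$HOME/.odbc.ini"
```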
Testing my Hadoop/Hive connection using “isql”. The troubleshooting and the final solution took me 3 days, and I almost gave up.
Every attempt to connect to my Hive system failed with this error:
$ isql -v MYHIVE hiveuser password
[unixODBC][Driver Manager]Can't open lib '/_drivers/simba/hiveodbc/lib/64/libsimbahiveodbc64.so' : file not found
[ISQL]ERROR: Could not SQLConnect
This is because something was missing from my installation:
$ ldd /_drivers/simba/hiveodbc/lib/64/libsimbahiveodbc64.so
This error is misleading. The Simba Hive ODBC driver “libsimbahiveodbc64.so” was just fine; the ldd output showed that one of its dependencies, “libsasl2.so.2”, was missing from my SAP HXE instance. So I installed libsasl2 from the repository using ‘sudo yast’. That installation provides only “libsasl2.so.3”, while Simba Hive ODBC needs “libsasl2.so.2”.
So the last step was to create a symbolic link as follows:
sudo ln -s libsasl2.so.3 libsasl2.so.2
Checking “libsimbahiveodbc64.so” again:
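One way to repeat this check without reading the whole ldd output by eye is to filter it for unresolved entries. The sketch below assumes the usual ldd line format, “name => not found”; `missing_libs` is a hypothetical helper name.

```shell
#!/bin/sh
# missing_libs: read ldd output on stdin and print the names of the
# libraries that ldd reports as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example:
#   ldd /_drivers/simba/hiveodbc/lib/64/libsimbahiveodbc64.so | missing_libs
```

An empty result means every dependency of the driver was resolved.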
Then the connection was successful:
hxeadm@hxehost:/usr/sap/HXE/home> isql -v myhive hduser xxxxxxxxx
| Connected! |
| sql-statement |
| help [tablename] |
| quit |
SAP HANA Studio – Data Provisioning – SDA
New remote data source for my Hadoop/Hive system. The DSN here is the one from my .odbc.ini file:
My Hive database:
SQL console script to create the SDA virtual tables:
CREATE VIRTUAL TABLE "LIVE2"."VT_MYHIVE_CONNECTIONS" AT "HIVE"."HIVE"."live2"."connections";
SDA virtual table content:
That’s all. SAP Smart Data Access with the Hadoop/Hive ODBC driver from Simba Technologies is still very easy to set up.