Visualize your first Hive dataset on the SAP Cloud Platform Big Data Services with SAP Lumira
SAP Cloud Platform Big Data Services is a high-performance Big Data as a Service (BDaaS) solution based on Apache Hadoop.
In this blog, we will show you how to connect SAP Lumira to this BDaaS solution to visualize a Hive dataset.
We will use the desktop edition of SAP Lumira 1.31.10 installed on Windows 10 64-bit. You can also use SAP Lumira Discovery 2.1 to do the same.
Before establishing the connection between SAP Lumira and the Big Data Services, make sure your PuTTY profile is configured with an SSH tunnel that forwards local port 10000 on your Windows machine to port 10000 of the Hive Server 2 service running within the Big Data Services. For more details, consult the Big Data Services documentation.
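A quick way to confirm the tunnel is active before launching SAP Lumira is to check whether anything is listening on local port 10000. Here is a minimal Python sketch (the function name is our own; it only tests TCP reachability, not Hive itself):

```python
import socket

def tunnel_is_up(host: str = "localhost", port: int = 10000,
                 timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port,
    e.g. the PuTTY SSH tunnel forwarding to Hive Server 2."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False, re-check the tunnel settings in your PuTTY profile before proceeding.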
Then, follow the steps below:
1. Connect to your Big Data Services Workbench using PuTTY
2. Launch SAP Lumira
3. Create a new dataset (menu File -> New) by selecting the SQL on Hadoop source
4. Enter your credentials to connect to Hive Server 2. Set Host to localhost, Port to 10000, and User to your Big Data Services Workbench username.
Leave the Password field empty: PuTTY already authenticates to the Workbench with your private SSH key
5. Choose any Hive table in your schema and select the columns you want in the dataset. In this example, we include all the columns. Click the Create button to prepare and acquire the data
6. Build a visualization of your dataset
As you can see, it takes only a few steps to load a Hive dataset into SAP Lumira and build a report on top of it.
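Under the hood, a HiveServer2 client such as Lumira's SQL on Hadoop connector reaches the service through a JDBC-style URL. The following sketch builds such a URL for the tunnel endpoint used above; the helper name and the default "default" schema are our illustrative assumptions, not part of the Lumira configuration dialog:

```python
def hive_jdbc_url(host: str = "localhost", port: int = 10000,
                  schema: str = "default") -> str:
    """Build a HiveServer2 JDBC URL for the given endpoint and schema.
    With the SSH tunnel in place, localhost:10000 maps to the
    Hive Server 2 running inside the Big Data Services."""
    return f"jdbc:hive2://{host}:{port}/{schema}"

print(hive_jdbc_url())  # → jdbc:hive2://localhost:10000/default
```

The same URL shape works for other JDBC-based Hive clients, which is why the tunnel setup described earlier is the only environment-specific piece of this walkthrough.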