Shivam Shukla

Python Meeting Data Lake Part 1 – Connect and Query

Hi all,

I am sharing what I have recently learned about the data lake, exploring how we can interact with it directly.

Prerequisites: https://developers.sap.com/tutorials/hana-cloud-dl-clients-overview.html Create your own data lake instance; either a managed or a standalone instance is fine.

Here is my data lake instance, up and running.

 

Install the SAP IQ client. Make sure you pick the latest version from SAP Software Downloads.

You can also follow the tutorials above to understand the data lake and the SAP HANA Cloud instance. A data lake is a good way to store different kinds of data from different sources in one place and, importantly, at low cost.

There are already a few posts that can help you get started on this topic; I am sharing my own learning here. So here we go.

  • Go to the BTP Cockpit and create your free trial account: https://account.hana.ondemand.com/
  • Create your development space and, under SAP HANA Cloud, create a data lake instance. Configure IP whitelisting according to your requirements.

 

  • SAP IQ installation: download the SAP IQ drivers from SAP Software Downloads.
  • Open the ODBC Data Source Administrator in administrator mode (I am showing this on my Windows system). If you see SAP IQ in the screen below, the driver installation on your system is fine.

 

  • Now go to your HANA data lake instance, right-click at the top right, copy the SQL endpoint, and enter it in the driver details below.

 

  • Test the connection.

 

  • Once this is done, you are ready to connect to your data lake over ODBC. The programming environment is now your choice.

 

  • I am using Jupyter Notebook and Python (pyodbc) to interact with the data lake. I have created a few tables and inserted some data, which I read back in my Python client.
  • https://pypi.org/project/pyodbc/

Install pyodbc (for example, with pip install pyodbc).
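
With pyodbc installed, the driver check done earlier in the ODBC administrator can also be repeated from Python. The following is only a minimal sketch: find_iq_drivers is a hypothetical helper written for this post, and it assumes the driver name contains the word "IQ" and that the DSN is called HDLSA, as in the connection string used later.

```python
import re

try:
    import pyodbc  # third-party package, installed with: pip install pyodbc
except ImportError:
    pyodbc = None  # lets the helper below be used even without pyodbc


def find_iq_drivers(driver_names):
    """Return the driver names that contain the standalone word 'IQ'."""
    return [
        name for name in driver_names
        if re.search(r"\bIQ\b", name, re.IGNORECASE)
    ]


if pyodbc is not None:
    # pyodbc.drivers() lists the installed ODBC drivers,
    # pyodbc.dataSources() maps each configured DSN to its driver.
    print("IQ drivers:", find_iq_drivers(pyodbc.drivers()))
    print("DSN HDLSA defined:", "HDLSA" in pyodbc.dataSources())
```

If the first list is empty, the driver installation is probably the place to look; if the DSN is missing, the data source has not been saved yet.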

 

  • Open Jupyter Notebook and try to connect to the data lake instance.

 

Import the package and provide the connection details.

import pyodbc
# HDLSA is the ODBC data source (DSN) configured above; use your own user and password.
cnxn = pyodbc.connect('DSN=HDLSA;UID=HDLADMIN;PWD=abc1234@123A')
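
To keep the password out of the notebook, the connection string can be assembled from environment variables instead. This is a sketch under assumptions: build_connection_string is a hypothetical helper, HDL_UID and HDL_PWD are variable names made up for this example, and the brace quoting follows the general ODBC convention, which I have not verified against the SAP IQ driver specifically.

```python
import os


def build_connection_string(dsn, uid, pwd):
    """Assemble an ODBC connection string, bracing the password if needed."""
    # ODBC uses ';' as a separator, so a value containing ';' or braces
    # is wrapped in {...}, with any '}' inside it doubled.
    if any(ch in pwd for ch in ";{}"):
        pwd = "{" + pwd.replace("}", "}}") + "}"
    return f"DSN={dsn};UID={uid};PWD={pwd}"


# HDL_UID and HDL_PWD are hypothetical names chosen for this sketch.
conn_str = build_connection_string(
    "HDLSA",
    os.environ.get("HDL_UID", "HDLADMIN"),
    os.environ.get("HDL_PWD", ""),
)
# cnxn = pyodbc.connect(conn_str)  # uncomment once the variables are set
```

The connect call is left commented out so the cell runs even before the environment variables are defined.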

 

Open a cursor and execute a SELECT statement.

cur = cnxn.cursor()
cur.execute('SELECT * FROM HOTEL.HOTEL')

The HOTEL table in the HOTEL schema has already been created; follow the data lake tutorials to set it up.

Fetch the data.

rows = cur.fetchall()
rows
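
fetchall() loads the whole result into memory, which is fine for a small table. For larger tables, the rows can be pulled in batches with fetchmany(), which is part of the Python DB-API that pyodbc follows. The sketch below uses an in-memory sqlite3 database purely as a stand-in so that it runs without a data lake; iter_rows is a hypothetical helper, and the HNO and NAME columns and their values are made up for the demo.

```python
import sqlite3  # stand-in for the data lake connection in this demo


def iter_rows(cursor, batch_size=1000):
    """Yield rows from an executed cursor, fetching batch_size at a time."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield from batch


# Demo data; with pyodbc you would reuse the cursor opened on cnxn instead.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE HOTEL (HNO INTEGER, NAME TEXT)")
demo.executemany(
    "INSERT INTO HOTEL VALUES (?, ?)",
    [(10, "Congress"), (11, "Regency"), (12, "Long Island")],
)
demo_cur = demo.cursor()
demo_cur.execute("SELECT * FROM HOTEL ORDER BY HNO")
names = [row[1] for row in iter_rows(demo_cur, batch_size=2)]
print(names)
```

With the data lake connection, the same helper would be called right after cur.execute(...) in place of cur.fetchall().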

Print the records.

That brings us to the end of connecting to the data lake from Python. In the next part, we will upload data from a CSV file to the data lake.

Keep learning, keep querying 🙂
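
One possible way to print the records with their column names is to read the names from cursor.description, which every DB-API cursor (including pyodbc's) provides after a SELECT. As before, sqlite3 is only a stand-in so the sketch runs anywhere; rows_to_dicts is a hypothetical helper and the table contents are invented.

```python
import sqlite3  # stand-in for the data lake connection in this demo


def rows_to_dicts(cursor):
    """Turn the remaining rows of an executed cursor into dictionaries."""
    # The first item of each description entry is the column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE HOTEL (HNO INTEGER, NAME TEXT)")
demo.executemany(
    "INSERT INTO HOTEL VALUES (?, ?)",
    [(10, "Congress"), (11, "Regency")],
)
demo_cur = demo.cursor()
demo_cur.execute("SELECT * FROM HOTEL ORDER BY HNO")
records = rows_to_dicts(demo_cur)
for record in records:
    print(record)
```

A list of dictionaries like this can also be handed to other tools in the notebook without further conversion.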

 

Thanks.
