Connecting to HANA with Python using ODBC or oData
As a HANA developer it is almost mandatory for you to learn a scripting language. It will give you a ton of off new development opportunities. You probably have heard about using JavaScript as a scripting language for connecting an oData service to SAPUI5 or using R to make statistical calculations, but did you know that my TechEd Demo Jam entry from 2012 was entirely based on Python? And did you know that my HANA Inno Jam entry from 2012 was based on Ruby? I can imagine you didn’t. The point I’m trying to make is that learning a scripting language will broaden the things you can build with HANA tremendously. It is a corny statement, but the sky is truly the limit once you master the language(s).
Study young grasshopper!
Learning a scripting language can be tricky, but the web will give you lots of free training opportunities. Learn Python is an obvious one, Code Academy another one. Every language has a similar training website. Just Google and enjoy 😉
A head start
To give you a head start after your training, I would like to share my preferred ways of connecting HANA via Python. Python is my favorite language as it was developed by a Dutch guy. Just kidding of course ;-), I love it as it is a language which is easy to learn and you can do great things with only a few lines of code (compared to other languages). If you really want to know why I like it so much, a guy called Tim Peters said it best in his “Zen of Python”.
For the head start, I will give you two ways of connecting: via ODBC and ofourse via oData. For our examples we’ll use a script to create records in HANA. In the below examples I’ll use a simple table: two fields: a timestamp field and a numeric field.
ODBC
The easiest way of connecting to HANA via ODBC in Python is by using PyODBC. Install it and create a so called DSN via an ODBC manager in Windows or Linux. After that, you will be creating records in HANA in no time via your Python script:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import pyodbc
cnxn = pyodbc.connect(‘DSN=YOUR_ODBC_CONNECTION;UID=USERNAME;PWD=PASSWORD’)
cursor = cnxn.cursor()
cursor.execute(“insert into MY_TABLE values (CURRENT_TIMESTAMP,’100000’)”)
cnxn.commit()
We just posted a value of 100000 with the current timestamp of the system (using the function CURRENT_TIMESTAMP).
Ofcourse this is a hardcoded example. In a real life example you will be using variables and loops to create records. Be sure to follow one of the trainings as listed above and you will be able to handle those in no time aswell.
oData service
Connecting in Python via ODBC can be tricky as you will need an ODBC driver. Now HANA comes with one of course, but sometimes you are on some exotic Linux Distro and simply do not have an ODBC driver which will compile. In that case oData comes to the rescue.
HANA SPS6 comes with some great new features, using POST in your RESTfull service is one of them. The Python library of choice in this case is Requests. It is dead simple to use. Installing it is pretty much the hardest part.
So without further a due: this is how you create a script that posts records:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests
import json
url = ‘YOUR_ODATA_SERVICE_HERE’
payload = {‘TIMESTAMP’: ‘/Date(1474823911255)/’, ‘VALUE’: ‘100000’}
headers = {“Content-type”: ‘application/json;charset=utf-8’}
auth = ‘USERNAME’, ‘ PASSWORD_HERE’
r = requests.post(url, data=json.dumps(payload), headers=headers, auth=auth)
print r.text
The timestamp might look a bit weird (13 digits). It is the only way the service accepts the value and represents the Unix timestamp with millisecond precision, i.e. the number of milliseconds since Unix epoch.
As you can see in the above example I also import “json”. This is required as the service can only handle the Json format.
Have fun developing, be creative and you will be creating that killer demo in no time!
Ronald.
Great blog Ronald. Just wondering... Do you have to have HANA studio installed for this to work? I've seen a couple of other blogs on this subject where you needed some Python *stuff* that's installed by the HANA studio. That would only be to get the ODBC driver I assume.
I like the odata method better, but wonder what the performance differences are...
Hi Jason, tx for your comments, much appreciated!
In order to get the ODBC driver you will need to install the HANA Client. Nothing more is required for creating some Python scripts.
If you want to defines a RESTfull service (oDate) for your data you either need to create one via HANA Studio or use any of the webbased IDEs to define one.
Performance wise I have not come across any problems yet. The XS engine pushes the calculation towards the HANA database so that means performance should be pretty good. Only thing which remains will be the normal http traffic you are generating.
Also see Thomas his blog if you want to get some more background:
http://scn.sap.com/community/developer-center/hana/blog/2012/11/29/sap-hana-extended-application-services
Have fun.
Nice one! Personally I would've never thought of using Python, but this opens up quite some possibilities!
Bedankt jonguh!
Hi Roland. Excelent post, short and effective.
Leaving aside ODBC installing issues, which do you think would be the preferred way to connect to HANA form another languages (not just Python)?
Now OData XS allows POST request we can perform any operation through that channel, and it seems to be more scalable.
I'd like to hear (read) your opinion 😉
Thanks!
Hi Luis,
Think looking at other languages (e.g JS), oData would be the way to go. If you look at the blog post from Thomas its pretty much: ODBC for legacy, Rest for new developments. Personally, I would prefer oData. Easy to use, no driver misery, lightweight,...
Tx for commenting!
Luis,
The answer entirely depends on what you want to do with the data. If you want to manipulate small amounts of data - couple of hundreds of sales orders, etc. - then oData is a way to go. If you want to move significant amounts of data back and forth then you definitely should go with TDS (tabular data stream?) type of access - ODBC, JDBC, etc.
The choice (as usual) is not language dependent but scenario dependent.
Completely agree.
Thanks both for your answers!
It's good to have in mind that TDS is a good option for large amount of data.
Good post, that you!
I would like to add some more materials about the Python and SAP HANA.
Tx Viktor, nice example scripts you posted to github with the PyODBC example!
Can any1 help me how to integrate R with HANA...
You may want to check the documentation and use a search engine first. This question has been answered many times already.
Hello Ronald,
Great blog, very helpful. I'm just wondering if its possible to connect to an SAP R/3 system that runs on HANA database via python?
Python is open-source, easy to learn language. Python is a high-level Programming language. Python has powerful libraries for web development , automation and data analytics.
http://zenithtrainings.com/