Skip to Content
Technical Articles
Author's profile photo Witalij Rudnicki

SAP HANA Spatial and GDAL in Python on Windows OS

In my previous blog posts, I wrote about GDAL with SAP HANA driver in OSGeo4W and how it enables GDAL utilities in the command line. During my SAP TechEd community session, we discussed as well the possibility of coding this in Python.

The GDAL project maintains generated Python bindings for GDAL and OGR. Generally speaking, the classes and methods mostly match those of the GDAL and OGR C++ classes.” as described in Python bindings — GDAL documentation.

Setup

If you are using OSGeo4W, then make sure python3-gdal is selected to be installed.

In my case I want to use Jupyter to code, so I install it from OSGeo4W too.

Let’s check in OSGeo4W Shell, that these components are installed.

where jupyter
python -m pip list | find /i "gdal"

Geospatial data

In my example, which I used for a demo at SAP TechEd in Las Vegas, NV, data comes from my fitness watch.

I walked my usual route in Las Vegas before the conference: https://www.strava.com/activities/8109863427

… and then I downloaded it from the website as a GPX file called My_classic_evening_walk_in_Vegas.gpx.

Coding in Python

Let me start Jupyter with jupyter notebook as Jupyter Labs is not yet available in OSGeo4W.

Import OSGEO packages

One that we need to work with vector data is ogr, but a version number can be read only from the gdal module.

try:
    from osgeo import ogr, osr, gdal
except:
    sys.exit('ERROR: cannot find GDAL/OGR modules')
else:
    print(gdal.VersionInfo('VERSION_NUM'))

Import was successful, and the version is 3.6.2.

The GDAL team acknowledges that Python programmers expect exceptions to be enabled by default, but says that exceptions are disabled by default to preserve backward compatibility.” accordingly to https://gdal.org/api/python_gotchas.html#python-bindings-do-not-raise-exceptions-unless-you-explicitly-call-useexceptions.

ogr.UseExceptions()

Check OGR drivers for HANA and GPX are included

cnt = ogr.GetDriverCount()
print(f"Number of drivers: {cnt}")

for i in range(cnt):
    driver = ogr.GetDriver(i)
    driverName = driver.GetName()
    if driverName in ('GPX', 'HANA'): print(i, driverName)

Both HANA and GPX drivers are available among 78 different OGR drivers.

Open input dataset with spatial data…

In my case a GPX file My_classic_evening_walk_in_Vegas.gpx:

ds_in = ogr.Open("My_classic_evening_walk_in_Vegas.gpx")
ds_in.GetName()

…and check its content

print(ds_in.GetLayerCount())
for layer in ds_in:
    print(f"{layer.GetName()} has {layer.GetFeatureCount()} features")

There are two layers with data: tracks and track_points.

Set up ODBC connection string to SAP HANA db and a schema

host = "95e788ad-25e2-4440-91e6-1c2e1eb3646c.hana.trial-us10.hanacloud.ondemand.com"
port = "443"
user = "DBAdmin"
schema = "TESTGEO"

I use getpass to get secret input of the password.

import getpass

pswd = getpass.getpass('Password:')

You can find all connection string options at https://gdal.org/drivers/vector/hana.html#dataset-open-options.

connString = f"HOST={host};PORT={port};USER={user};PASSWORD={pswd};SCHEMA={schema};DRIVER=HDBODBC"

Connect to SAP HANA and set a target schema

ds_out = ogr.GetDriverByName("HANA").Open(f"HANA:{connString}", update=True)

ds_out.GetName()

Check existing tables in the schema in SAP HANA db

HANA table corresponds to a “layer” in GDAL/OGR terminology.

print(ds_out.GetLayerCount())
for layer_out in ds_out:
    print(layer_out.GetName())

At this stage schema is empty as can be seen from DB Explorer too.

Copy a layer to SAP HANA db

I want to copy only one layer tracks from the GPX file into my SAP HANA’s schema.

in_layer = ds_in.GetLayerByName('tracks')
out_layer = ds_out.CopyLayer(in_layer,'tracks')
print(ds_out.GetLayerCount())
for layer_out in ds_out:
    print(layer_out.GetName())

There is one table TRACKS in SAP HANA’s schema now, as can be seen in the DB Explorer too.

Copy all layers from an input data source into SAP HANA db

ds_out = ogr.GetDriverByName("HANA").CopyDataSource(ds_in, f"HANA:{connString}")
print(ds_out.GetLayerCount())
for layer_out in ds_out:
    print(layer_out.GetName())

All five layers have been copied to five tables in SAP HANA.

Use SAP HANA Database Explorer

We can see all five tables and preview data from for example TRACKS.

This is the geometry stored in that cell.

Does it look familiar compared to the one from the beginning of this post?


Regards,
-Vitaliy, aka @Sygyzmundovych

Assigned Tags

      Be the first to leave a comment
      You must be Logged on to comment or reply to a post.