Skip to Content
Author's profile photo Serban Petrescu

HANA Spatial Demos: Geocoding, Clustering, Aggregation

Recently, I participated in Witalij Rudnicki‘s Code JAM on HANA Spatial and remembered how fun working with HANA Spatial can be. A while back, I’ve built some small demo apps as an introduction to spatial and I decided to share them with the community. After all, sharing is caring 🙂

All the code is freely available on GitHub: serban-petrescu/hana-spatial-demo. More detailed steps for setting it up can be found in the README. For the remainder of the blog, I will describe the rationale behind it and discuss on the more interesting snippets of code.

Business story

Let’s imagine a purely fictional story to back-up our applications:

  • Our customer is an insurance company specialised in house insurance.
  • Currently, they have all their data in Excel files.
  • They want to move it into a dedicated system capable of grouping houses in the same area together.

Looking back, this scenario might not be that far from reality after all.


To fulfil our customer’s needs, we will do the following steps:

  • Take the data from the Excel file (.csv) and import it in HANA. This data contains:
    • The insurance policy number (unique business key for a policy).
    • The policy holder name.
    • The insured object’s address.
    • And the total sum insured.
  • Geocode the house addresses to get their coordinates.
  • Build apps for visualising, aggregating and clustering the houses.

Technical setup

We will do all of this on a HANA Trial MDC, which we create following the steps described in the “Setup your trial SAP HANA MDC instance” SAP developer tutorial. The development itself can be done directly in the Web-based Workbench.

The Google Geocoding API is a fairly good tool for obtaining coordinates from textual addresses. To use it, we also need an API Key for it. In addition, HANA needs to “trust” the SSL certificate of Google, so we need to create a new trust store for it.

Data import

First we need to create a CDS entity for storing the input data:

    // the main table, should store all the necessary data
    entity eInsurance { 
        key policyNumber:       String(32);
            personName:         String(128) not null;
            objectAddress:      String(256) not null;
            insuredSum:         Decimal(16,2) not null;
            objectLocation:     hana.ST_POINT(4326);
            objectGeocoded:     hana.TINYINT default 0;

Then we can use a table import file to move the data from the CSV to the table itself:

import = [
        cdstable  = "";
        schema = "WORKSPACE_MASTER_SPATIAL";
        file = "";
        header = false;
        delimField = ",";


Normally, we could use HANA’s built-in functionality: the GEOCODE INDEX with a user-defined geocoding provider. Back when I wrote these apps, on an MDC I was not able to actually change the system configuration to use my own provider, so I decided to just call my function through an XS Job:

    "description": "Perform geocoding",
    "action": "workspace.master.service:job.xsjs::batch",
    "schedules": [{
        "xscron": "* * * * * * */5"

Roughly speaking, the following code is responsible for calling the Google API:

 * Calls the google geocoding API.
 * @param   {string}    sApiKey     The Google Maps API key.
 * @param   {string}    sAddress    The address which should be geocoded.
 * @returns {object|null}   The response body returned by the service.
 * Null if the request failed.
function callGeocodingApi(sApiKey, sAddress) {
    try {
        var sUrl = "/geocode/json?address=" + encodeURIComponent(sAddress) + "&key=" + sApiKey,
            oDestination = $.net.http.readDestination("workspace.master.service", "googleApi"),
            oClient = new $.net.http.Client(),
            oRequest = new $.net.http.Request($.net.http.GET, sUrl),
            oResponse = oClient.request(oRequest, oDestination).getResponse(),
            oData = JSON.parse(oResponse.body.asString());
        return oData;
    catch (e) {
        return null;

To track which addresses were already processed, I also added a objectGeocoded flag to the CDS entity.


To view the points on a simple Google Map, we need to decompose the ST_POINT objects into the latitude and longitude components and expose them through an OData service.

For this we can simply create a simple CDS view:

    // the expanded view (expands the objectLocation into coords)
    // contains only the points which were geocoded succesfully
    view vExpandedInsurance as select from eInsurance{
        objectLocation.ST_Y() as objectLatitude,
        objectLocation.ST_X() as objectLongitude
    } where eInsurance.objectGeocoded = 1;

And then add it to the xsodata service:

"" as "ExpandedInsurances"
   keys ("policyNumber")
   create forbidden
   update forbidden
   delete forbidden;

Using a simple UI5 app, we get a fairly nice view of the points:

As a bonus, if we click on a point, we get more details about the insurance policy behind it.

Data acquisition

We don’t want our customer to always use the CSV import. It should just be there for the initial “migration” of the data.

When new policies are created, we would like to help our customer in getting decent-quality data, so we build a small UI which shows exactly where an address is when creating the policy:

Technically, this UI just reuses the existing geocoding functionality from before.


One very interesting use case would be to cluster up the insured objects. Think about natural disasters; they have the potential of affecting a large number of houses located in relative close proximity. For an insurance company, having a large lump of risk in the same area can easily spell disaster.

Because we are not sure how the customer would like to have it, we build three different prototypes supporting this.

UI-based clustering

The first one is simply using Google’s Marker Clustering functionality. We don’t really like it though, because it will surely not scale and it doesn’t really use anything from HANA. Also, we don’t get to know exactly what’s the total amount of sum insured in each cluster (just the number of houses).

Static clustering

The second one assumes that the user does not need to input the number of clusters, but we can decide on it. Truth be told, this variant only exists because, back when I created the applications, using a calculation view input parameter in place of the hard-coded cluster count would result in an error.

We write a fairly simple calculation view for this and then expose it through OData:

var_out = SELECT ST_ClusterID() AS "id", SUM("insuredSum") AS "totalSum", 
	COUNT("insuredSum") AS "objectCount", ST_ClusterCentroid().ST_Y() AS "centerLatitude",
	ST_ClusterCentroid().ST_X() AS "centerLongitude"
		SELECT "insuredSum", "objectLocation".ST_Transform(1000004326) AS "objectLocation" 
			WHERE "objectLocation" IS NOT NULL
	GROUP CLUSTER BY "objectLocation"

As you can see, this uses the built-in KMEANS CLUSTER HANA spatial feature.

Dynamic clustering

We don’t really like this either, because we hard-coded the cluster count. To work around the calculation view activation error, we move the code from the view into a simple XSJS script (with basically the same content, but with the cluster count parameterized). In the end, we obtain something that looks like this:


The customer might also want to select manually how to group the points together. To support this kind of scenario, we also create two different prototypes. For both prototypes, we want to:

  • Split all the points into two groups based on some kind of user input.
  • Display which points are in each group.
  • Aggregate the total sum insured per group.

Radius-based aggregation

In this prototype, we split the points into two groups based on their distance to a user selected point. If they are inside a circle of given radius and center, then they are in the “inside” group, otherwise they are “outside”.

For determining this relationship, we use the following query that makes use of the ST_DISTANCE method:

 var_out = SELECT "policyNumber", 
	"objectLocation".ST_Y() AS "objectLatitude", 
	"objectLocation".ST_X() AS "objectLongitude", 
	    'POINT(' || :IV_LNG || ' ' ||  :IV_LAT || ')', 4326), 'meter') < :iv_radius 
	THEN 1 ELSE 0 END AS "group"
	WHERE "objectLocation" IS NOT NULL;

Similarly, we build a view for the aggregated values. Here we can take into consideration the fact that we will always have two groups (“inside” and “outside”):

var_out = SELECT 1 AS "group", IFNULL(SUM("insuredSum"), 0) AS "totalSum", COUNT("policyNumber") AS "objectCount"
	WHERE "objectLocation" IS NOT NULL AND "objectLocation".ST_DISTANCE(
		NEW ST_POINT('POINT(' || :IV_LNG || ' ' ||  :IV_LAT || ')', 4326), 'meter') < :iv_radius
	SELECT 0 AS "group", IFNULL(SUM("insuredSum"), 0) AS "totalSum", COUNT("policyNumber") AS "objectCount"
	WHERE "objectLocation" IS NOT NULL AND "objectLocation".ST_DISTANCE(
		NEW ST_POINT('POINT(' || :IV_LNG || ' ' ||  :IV_LAT || ')', 4326), 'meter') >= :iv_radius; 

Then we just display these two sets of data on a simple “picker” UI:

Region-based aggregation

The second version is to actually let the user draw the region as a polygon. Thankfully, we can use the built-in ST_CoveredBy predicate to work this out:

var_out = SELECT "policyNumber", "objectLocation".ST_Y() AS "objectLatitude", 
	"objectLocation".ST_X() AS "objectLongitude",
	"objectLocation".ST_CoveredBy(NEW ST_POLYGON(:iv_polygon, 4326)) AS "group"
	WHERE "objectLocation" IS NOT NULL;

The polygon itself is an input variable to this calculation view. The map includes a set of drawing tools which also support polygons. Using this functionality we can build a simple UI for this prototype as well:


With a fairly low amount of coding, we can easily build interactive insight tools based on the HANA Spatial functionalities. I hope you also find the examples cool!



Assigned Tags

      You must be Logged on to comment or reply to a post.
      Author's profile photo Ina Felsheim
      Ina Felsheim

      Thanks for sharing this. Love all of the screen shots and specifics of how to use HANA Spatial.

      Author's profile photo Former Member
      Former Member


      I know that this blog is quite old but I find it very interesting.

      I have some questions: you wrote: Normally, we could use HANA’s built-in functionality: the GEOCODE INDEX

      Could you be more precise. I have a requirement to reverse geocode a lat/long to an address with the standard HANA address and geocode directory.

      I Know it is possible using flowgraphs and the geocode control, but I am looking to find a way to do it without having to use flowgraphs. All I have found is using Google API which I don't want to do.

      I would like to use the built-in functionality to do so directly from SQL or any other language but without having to use flowgrpahs.

      Could you help me finding some info on that ?


      Best regards.


      Author's profile photo Serban Petrescu
      Serban Petrescu
      Blog Post Author

      Hi Guillaume,

      HANA has a built-in functionality that allows you to GEOCODE (i.e. to derive lat lng from a textual address) row automatically in an asynchronous manner. It is described in the documentation that I linked above. In a nutshell, HANA uses a so-called geocoding provider that actually does the geocoding - this provider can be the "SAP HANA smart data quality geocode" (which may not be available depending on your HANA installation/subscription) or you can define your own provider using JavaScript. I remember that at some point I implemented a custom provider based on the Google API.

      If you set up such a provider and then add an appropriate index to your table, the lat / lng columns should be automatically filled over time with appropriate values. The only additional prerequisite is that the table should have a specific structure (one or more address columns - it depends on the geocoding provider + one output columns for the coordinates as a ST_POINT).

      Hope this helps,