Technical Articles
(Reverse) Geospatial Nearest Neighbour Searches
Finding the k nearest butchers? (Photo by henry perks on Unsplash)
I have occasionally been asked for Geospatial Nearest Neighbour searches and its inverse the Reverse Nearest Neighbour. So I thought, I spend a couple of minutes to write a simple, but general purpose, SQLScript and provide it to you within this blog post. So this is gonna be a quick read and I hope, you will find some use for the scripts.
k-Nearest Neighbour
Problem Statement
Given a table of geometries (i.e. points). Find the k geometries, that are closest to a given geometry (i.e. point).
Exemplary Use Case
You want to visit The Brandenburg Gate in Berlin and are searching for an accommodation nearby. You use a k-Nearest Neighbour Search on the AirBnB listings of Berlin to find the 10 (k=10) closest listings. (…this dataset has also been used in my previous blogs)
Stored Function
Input Parameter | Description |
ref_geom | The reference geometry to measure distance |
nb_geom_tab | Name of the table containing the geometries |
nb_id_col | Name of the column containing the record ID |
nb_geom_col | Name of the column containing the ST_Geometry |
k | The number of nearest neighbours to return |
Output Parameter | Description |
ID | The record ID as per nb_id_col |
SHAPE | The geometry as per nb_geom_col |
CREATE OR REPLACE FUNCTION KNN
(
ref_geom ST_GEOMETRY,
nb_geom_tab NVARCHAR(100),
nb_id_col NVARCHAR(50),
nb_geom_col NVARCHAR(50),
k INTEGER
)
RETURNS TABLE (ID NVARCHAR(100), SHAPE ST_GEOMETRY)
AS BEGIN
DECLARE geom_list TABLE (ID NVARCHAR(100), SHAPE ST_GEOMETRY);
EXECUTE IMMEDIATE
'SELECT CAST(' || :nb_id_col || ' AS NVARCHAR(100)) AS ID, ' || :nb_geom_col || ' as SHAPE FROM ' || :nb_geom_tab
INTO geom_list;
RETURN
SELECT TOP :k ID, SHAPE
FROM :geom_list
ORDER BY :ref_geom.ST_DISTANCE(SHAPE) ASC;
END;
Usage
Find the 10 closest AirBnB listings to The Brandenburg Gate. Note that the function should return k records.
SELECT *
FROM
KNN(
ST_GEOMFROMTEXT('POINT(13.37770 52.51634)',1000004326),
'LISTINGS_AIRBNB_BERLIN',
'ID',
'SHAPE',
10
);
Reverse k-Nearest Neighbour
Problem Statement
Given a table of geometries (i.e. points). Find the geometries, where a given geometry (i.e. point) is one of the k closest geometries.
Exemplary Use Case
You are the owner of an AirBnB listing in Berlin and would like to improve your marketing. For this reason, you would like to know, for which landmarks/points of interest your listing is shown to prospects as one of the 10 (k=10) closest listings.
Stored Function
Input Parameter | Description |
ref_geom | The reference geometry to measure distance |
nb_geom_tab | Name of the table containing the (neighbour) geometries |
nb_geom_col | Name of the column containing the ST_Geometry |
poi_geom_tab | Name of the table containing the references for neighbourhoods (landmarks/pois) |
poi_id_col | Name of the column containing the record ID |
poi_geom_col | Name of the column containing the ST_Geometry |
k | The number of nearest neighbours to inspect |
Output Parameter | Description |
ID | The record ID as per poi_id_col |
SHAPE | The geometry as per poi_geom_col |
CREATE OR REPLACE FUNCTION RKNN
(
ref_geom ST_GEOMETRY,
nb_geom_tab NVARCHAR(100),
nb_geom_col NVARCHAR(50),
poi_geom_tab NVARCHAR(100),
poi_id_col NVARCHAR(50),
poi_geom_col NVARCHAR(50),
k INTEGER
)
RETURNS TABLE (ID NVARCHAR(100), SHAPE ST_GEOMETRY)
AS BEGIN
DECLARE nb_geom_list TABLE (SHAPE ST_GEOMETRY);
DECLARE poi_geom_list TABLE (ID NVARCHAR(100), SHAPE ST_GEOMETRY);
EXECUTE IMMEDIATE 'SELECT ' || :nb_geom_col || ' as SHAPE FROM ' || :nb_geom_tab INTO nb_geom_list;
EXECUTE IMMEDIATE 'SELECT CAST(' || :poi_id_col || ' AS NVARCHAR(100)) AS ID, ' || :poi_geom_col || ' as SHAPE FROM ' || :poi_geom_tab INTO poi_geom_list;
RETURN
SELECT ID, SHAPE
FROM
(
SELECT ID, SHAPE, MAX(DIST) AS MAXDIST
FROM
(
SELECT a.ID, a.SHAPE, a.SHAPE.ST_DISTANCE(b.SHAPE) AS DIST, RANK() OVER (PARTITION BY a.ID ORDER BY a.SHAPE.ST_DISTANCE(b.SHAPE) ASC) AS RNK
FROM :poi_geom_list a, :nb_geom_list b
)
WHERE RNK <= :k
GROUP BY ID, SHAPE
)
WHERE ref_geom.ST_DISTANCE(SHAPE) <= MAXDIST;
END;
Usage
Find the landmarks, where my AirBnB is listed as one of the 10 closest AirBnBs. Note that the function can return any number of records (i.e. probably not k).
SELECT *
FROM
RKNN(
ST_GEOMFROMTEXT('POINT(13.38023 52.51614)',1000004326),
'LISTINGS_AIRBNB_BERLIN',
'SHAPE',
'POIS_BERLIN',
'OSM_ID',
'WAY',
10
);
Summary
The two above outlined functions can be used to do a simple k-Nearest Neighbour or Reverse k-Nearest Neighbour search in SAP HANA. The functions are designed to work independent of the underlying dataset and should be usable for your data without modification.
Well, maybe you need to fix bugs…. Did I mention that there is no warranty?
Thanks Mathias Kemeter for the great blog!
I was wondering, if there was any plan to have nearest neighbor functionality directly in database functions? Similar to Oracle's Oracle sdo_nn. That way the performance could be better as spatial index on the database level could be used?
Hi Kimmo,
you are exactly right. By offering native support and leveraging the index, the performance can be increased.
In fact we discussed this within the team and made it part of our long-term roadmap. This in turn means we haven't planned it on the short- or mid-term. One outcome of the discussion was that from a SQL interface perspective our variant would probably be conceptually different from sdo_nn.
If you would like to increase the development priority, I happily invite you to reach out via email or file the suggestion on the Customer Influence portal (both will eventually reach my inbox).
Regards,
Mathias