Technical Articles
Calculating Isochrones using SAP HANA Graph and Spatial
In this blog post I will describe how to calculate drive time distances and isochrones on street network data. Isochrones are areas reachable within a certain time threshold. They are helpful in many transportation-related planning scenarios: retail shops, hospitals, schools, bridges, evacuation routes etc. The script and sample data is on github.
The image shows isochrones for swimming pools in London on a street network of 1.5 mio. road segments. Areas of the same color indicate the same drive time distance.
My colleague Witalij Rudnicki got a question after one of his TechEd sessions: how to calculate the number of people who can reach a given set of 10 retail shops within 60 minutes by car. This problem is related to isochrones, which depict the area accessible from a point within a certain (travel) time [1]. Since Mathias Kemeter and I ran a workshop on spatial and graph processing using London street network data [2], I thought it makes sense to elaborate on drive time distances and isochrones a little more. So here is the guiding question for the scenario we want to address using SAP HANA Graph and Spatial:
I want to move to London and I like swimming.
From which areas can I get to a swimming pool within a certain travel time?
Since our travel is “restricted” to public roads, travel time is not calculated from straight air distance, but from speeds limits and the road network… which makes it a graph problem. The graph is the set of road segments (=edges) and junctions (=vertices). The travel time is simply calculated using the segment’s length divided by the speed limit. The required data can be obtained from Open Street Map using python [3] or – if you want to run the below example on your own – just get the data and code from github [4].
We will approach isochrones in four steps.
- calculate drive times from a single start vertex to all reachable vertices using the shortest paths one-to-all function
- add point geometries for map visualization
- use spatial aggregation functions to calculate areas from point sets
- merge isochrones from multiple start vertices into a single result
The basic algorithm: Shortest Paths One-To-All
SAP HANA provides a built-in graph engine and one of the native algorithms calculates shortest paths one-to-all (SPOA): given a start point, the algorithm returns shortest path “distances” to all reachable points in our street network. Note that the “distance” measure can be derived from an arbitrary cost function. In our case we will use travel time (= length/speed). The SPOA function call in SAP HANA Cloud looks like the following:
SHORTEST_PATHS_ONE_TO_ALL(:g, :v_start, "CALCULATED_COST",
(EDGE e, DOUBLE currentPathCost) => DOUBLE{
IF(:currentPathCost < 1200.0) {
RETURN :e."length"/(DOUBLE(:e."SPEED_MPH")*0.44704);
}
ELSE {
END TRAVERSE;
}
});
Where :g is the road graph, :v_start is the start vertex. The cost function contains road segments’ “length” (in meters) and “SPEED_MPH” (maximum allowed speed in miles per hour). As 1 mile/hour equals 0.44704 meters/second, the expression returns seconds. The calculation stops if a travel time limit (1200 sec = 20 min) is reached. In our code sample, the SPOA algorithm is wrapped in a database procedure and returns a table with the vertex key “osmid” and the “CALCUALTED_COST” in seconds needed to reach this vertex.
CREATE PROCEDURE "ISOCHRONES"."GS_SPOA"(
IN i_startID BIGINT, -- the key of the start vertex, a road junction
IN i_maxSeconds DOUBLE, -- the maximum drive time distance in seconds
OUT o_vertices "ISOCHRONES"."TT_SPOA_VERTICES"
)
LANGUAGE GRAPH READS SQL DATA AS
BEGIN
-- Get the graph g
GRAPH g = Graph("ISOCHRONES", "LONDON_GRAPH");
-- Get the start vertex by its key
VERTEX v_start = Vertex(:g, :i_startID);
-- Running shortest paths one-to-all in g, starting at v_start.
-- The travel time distance to a vertex is stored
-- in the attribute CALCULATED_COST
GRAPH g_spoa = SHORTEST_PATHS_ONE_TO_ALL(:g, :v_start, "CALCULATED_COST",
(EDGE e, DOUBLE currentPathCost) => DOUBLE{
IF(:currentPathCost < :i_maxSeconds) {
RETURN :e."length"/(DOUBLE(:e."SPEED_MPH")*0.44704);
}
ELSE {
END TRAVERSE;
}
});
-- The procedure returns "osmid" and "CALCULATED_COST"
-- from all vertices in the SPOA graph
o_vertices =
SELECT :v."osmid", :v."CALCULATED_COST" FOREACH v IN Vertices(:g_spoa);
END;
Calling the procedure from one of London’s swimming pools in Forest Hill (osmid = 1275065419) returns a tabular result. The road junction 1912540961 is just a second away from or swimming pool.
Adding geometries to the result for map visualization
Next, let’s add some spatial data to see this result on a map. One of the key benefits of using SAP HANA’s graph engine is that you can leverage the full power of SQL for post-processing. So, we can create a SQL table function that calls the SPOA procedure and joins the vertices’ point geometries to the SPOA result.
CREATE FUNCTION "ISOCHRONES"."F_SPOA"(
IN i_startID BIGINT, -- the key of the start vertex
IN i_maxSeconds DOUBLE -- the maximum distance/cost
)
RETURNS TABLE ("START_ID" BIGINT, "osmid" BIGINT, "CALCULATED_COST" DOUBLE, "SHAPE" ST_GEOMETRY(32630))
LANGUAGE SQLSCRIPT READS SQL DATA AS
BEGIN
-- call the SPOA procedure and store the results in o_vertices
CALL "ISOCHRONES"."GS_SPOA"(:i_startID, :i_maxSeconds, o_vertices);
-- join o_vertices to the LONDON_VERICES table to add geometries to the result
RETURN
SELECT V."osmid" AS "START_ID", V."osmid", V."CALCULATED_COST", T."SHAPE"
FROM :o_vertices AS V
LEFT JOIN "ISOCHRONES"."LONDON_VERTICES" AS T
ON V."osmid" = T."osmid";
END;
The “F_SPOA” table function is called via SELECT.
SELECT * FROM "ISOCHRONES"."F_SPOA"(1275065419, 60);
If you are using a database tool like DBeaver you can immediately plot the result on a map. I marked the Forest Hill swimming pool in the middle with a red dot. The tooltip on one of the junctions on the left shows it takes approx. 56 seconds to get there.
From points to areas – spatial aggregation
So, given a start point and a time limit, the function returns a set of points and travel times. We can now use SAP HANA’s spatial functions to find the area which is covered by the point set. For our purpose, two alternatives seem suitable: concave hull and hexagon clustering.
- Using ST_ConcaveHullAggr on our point set essentially calculates a so-called alpha shape [5] for which the proper radius is chosen automatically.
- Spatial clustering using hexagons is another built-in mechanism to aggregate point sets. A simple way to use this clustering is by specifying the number of grid cells or the cell’s area.
Extending the function from above, we introduce a parameter i_resultType which defines the requested output: concave hull or grid clustering.
CREATE FUNCTION "ISOCHRONES"."F_ISOCHRONE_SINGLE"(
IN i_startID BIGINT, -- the key of the start vertex
IN i_maxSeconds DOUBLE, -- the maximum distance/cost
IN i_resultType NVARCHAR(20) -- indicates if the result should be POINTS, CONVEXHULL, or HEXAGON
)
RETURNS TABLE("START_ID" BIGINT, "ID" BIGINT, "SHAPE" ST_GEOMETRY(32630), "CALCULATED_COST" DOUBLE)
LANGUAGE SQLSCRIPT READS SQL DATA AS
BEGIN
-- call the SPOA procedure and store the results in o_vertices
CALL "ISOCHRONES"."GS_SPOA"(:i_startID, :i_maxSeconds, o_vertices);
-- join o_vertices to the LONDON_VERICES table to add geometries to the result
spoaPoints = SELECT V."osmid" AS "ID", V."CALCULATED_COST", T."SHAPE"
FROM :o_vertices AS V
LEFT JOIN "ISOCHRONES"."LONDON_VERTICES" AS T
ON V."osmid" = T."osmid";
-- apply CONCAVEHULL_AGGR on the point set
IF (:i_resultType = 'CONCAVEHULL') THEN
RETURN SELECT :i_startID AS "START_ID", :i_startID AS "ID", ST_CONCAVEHULLAGGR("SHAPE") AS "SHAPE", :i_maxSeconds AS "CALCULATED_COST"
FROM :spoaPoints;
-- apply HEXAGON CLUSTERING on the point set
ELSEIF (:i_resultType = 'HEXAGON') THEN
RETURN SELECT :i_startID AS "START_ID", ST_CLUSTERID() AS "ID", ST_CLUSTERCELL() AS "SHAPE", CAST(AVG("CALCULATED_COST") AS DOUBLE) AS "CALCULATED_COST"
FROM :spoaPoints
GROUP CLUSTER BY "SHAPE" USING HEXAGON CELL AREA 10000 'meter';
-- default: just return the points
ELSE
RETURN SELECT :i_startID AS "START_ID", "ID", "SHAPE", "CALCULATED_COST" FROM :spoaPoints;
END IF;
END;
We can call the new function, specifying the start point, the time limit, and the aggregation method.
SELECT * FROM "DAT260"."F_ISOCHRONE_SINGLE"(1275065419, 90, 'CONCAVEHULL');
SELECT * FROM "DAT260"."F_ISOCHRONE_SINGLE"(1275065419, 90, 'HEXAGON');
The concave hull on the left represents the area that can be reached within 90 seconds. The hexagons on the right provide even more information: the CALCULATED_COST for each cell is the average travel time for all points in the cell. As you can see, there are holes in the shapes. The reason is simple – there are no public roads (or more exact, public road junctions). The hole in the left lower part is because of a school building.
What’s better than one swimming pool?
Our guiding question from the beginning…
From which areas can I get to a swimming pool within a certain travel time?
… requires one last extension to our function. We need to run the isochrones calculation logic from the last step for all swimming pools in London. We do this in three stages.
- Get all swimming pools
We use the APPLY_FILTER function to evaluate a SQL WHERE condition (amenity = swimming pool) which is passed to the function. - Run the individual isochrone calculations in parallel
The MAP_MERGE operator calls the SPOA function for each of the swimming pools. - Find the nearest swimming pool for each of the points in the street network
Some vertices in our network are close to multiple swimming pools. For each point we will pick the minimum travel time by running a RANK() window function and select the first ranked record.
The final function below takes a filter in form of a SQL WHERE condition, a time limit, and a number-of-cells configuration for the hexagon clustering.
CREATE FUNCTION "ISOCHRONES"."F_ISOCHRONE_MULTI" (
IN i_filter NVARCHAR(5000), -- a SQL WHERE clause to filter the data
IN i_maxSeconds DOUBLE, -- the maximum distance/cost
IN i_cells INT -- a parameter to configure the number of cells for hexagon clustering
)
RETURNS TABLE ("START_ID" BIGINT, "ID" BIGINT, "SHAPE" ST_GEOMETRY(32630), "CALCULATED_COST" DOUBLE)
LANGUAGE SQLSCRIPT READS SQL DATA AS
BEGIN
-- apply the incoming filter clause on the LONDON_POI table
startPOIs = APPLY_FILTER("ISOCHRONES"."LONDON_POI", :i_filter);
-- call the function F_SPOA for all records in startPOIs and merge the result into spoaPoints
spoaPoints = MAP_MERGE(:startPOIs, "ISOCHRONES"."F_SPOA"(:startPOIs."VERTEX_OSMID", :i_maxSeconds));
-- for each vertex, calculate the minimum distance and START_ID, then apply hexagon clustering
RETURN SELECT MAX("START_ID") AS "START_ID", ST_CLUSTERID() AS "ID", ST_CLUSTERCELL() AS "SHAPE", AVG("CALCULATED_COST") AS "CALCULATED_COST" FROM (
SELECT "START_ID", "ID", "SHAPE", "CALCULATED_COST" FROM (
SELECT "START_ID", "osmid" AS "ID", "SHAPE", "CALCULATED_COST", RANK() OVER (PARTITION BY "osmid" ORDER BY "CALCULATED_COST" ASC) AS "RANK"
FROM :spoaPoints
) WHERE "RANK" = 1
) GROUP CLUSTER BY "SHAPE" USING HEXAGON X CELLS :i_cells;
END;
Calling this isochrone function on all swimming pools with a time limit of 20 minutes and requesting 110 hexagons in X direction looks like this:
SELECT * FROM "F_ISOCHRONE_MULTI"(' "amenity"=''swimming_pool'' ', 1200 , 110);
DBeaver is nice for basic visualization of geometries. However, for more advanced stuff you need to turn to a GIS application like Esri ArcGIS Pro or QGIS. With ArcGIS Pro you can put the SELECT statement into a query layer and add some color coding to the hexagons using the CALCULATED_COST measure. The result is below – same color indicates same travel time distance to the next swimming pool. Red areas are more close to a swimming pool.
As a swimming fan moving to London, I should stay away from the blue areas… unless I dare to jump into river Thames.
The calculation of isochrones for 35 swimming pools, result merging, and spatial clustering on the street network with 1.5 mio. edges takes about 6 seconds.
Summary
We have seen how to leverage some of SAP HANA’s multi-model capabilities in an integrated way. Combining built-in engines for spatial and graph processing within the relational database allows you to solve many transportation-related problems in the data layer. Play around with the code and explore London – all you need is a free SAP HANA Cloud trial instance. If you are interested in multi-model processing, check out the blog posts tagged with “SAP HANA multi-model processing”.
[1] https://en.wikipedia.org/wiki/Isochrone_map
[2] https://github.com/SAP-samples/teched2020-DAT260
[3] https://github.com/SAP-samples/hana-graph-examples/blob/main/NOTEBOOKS/LONDON/HANA-ML_Graph_demo.ipynb
[4] https://github.com/SAP-samples/hana-graph-examples/tree/main/ISOCHRONES
[5] https://en.wikipedia.org/wiki/Alpha_shape
Excellent post!