Get to know your neighbors with SAP HANA
Last month I did a SAP CodeJam in Walldorf (this is where SAP is headquartered) on the topic of Geospatial with SAP HANA. One of my questions to local participants was “What country has the longest land border with Germany?”. And to my surprise, no one had immediate answer. “France?.. Netherlands?.. Poland?.. Austria?..” participants were not sure.
Ok, calling SAP HANA Express and geospatial processing to the rescue! I have created a new set of SAP HANA Geospatial tutorials last week in case you missed them.
To answer the question we need a dataset with geographical shapes of all countries. One of the most popular open data sets is the World Borders from Thematic Mapping. But in this case let’s download Countries shape file from the same place we downloaded Continents for previous blogs, as this will help as with the exercises next week.
Once you have the file downloaded and uncompressed, load it with Spatial Reference System ID
1000004326 (planar GPS coordinates) using desktop upload in Eclipse (SAP HANA Studio) or the SQL statement below after uploading files to HANA’s file system.
IMPORT "TESTSGEO"."cntry00" AS SHAPEFILE FROM '/usr/sap/HXE/home/Downloads/cntry00/cntry00' WITH REPLACE SRID 1000004326 THREADS 4;
Quick check of the loaded table.
select count(*) from "TESTSGEO"."cntry00"; --result should be 252 select CNTRY_NAME, SHAPE.ST_SRID() from "TESTSGEO"."cntry00" where CNTRY_NAME = 'Germany'; /* result should be CNTRY_NAME;SHAPE.ST_SRID() Germany;1 000 004 326 */
Now let’s find what neighbor has the longest border line with Germany! For that:
- join a table with itself using spatial predicate
ST_Touches()to find countries sharing borders,
- calculate a line string representing a border between each two neighbors using set method
- calculate the length of each shared border (the line string), but first remember to convert that geometry to Spatial Reference System
4326to do it on the Round Earth!
select b.CNTRY_NAME as "neighbour", b.shape.ST_Intersection(a.shape).ST_SRID(4326).st_Length('kilometer') as "border" from "TESTSGEO"."cntry00" a join "TESTSGEO"."cntry00" b on a."SHAPE".ST_Touches(b.shape) = 1 where a.CNTRY_NAME like 'Germany' order by "border" desc;
And here is the result that was surprising to participants. The longest border was with Austria!
[As Ethan Jewett rightfully noticed in his tweet commenting this post:
“Yeah, these measurements are all significantly shorter than the official border lengths, but in the right order…”
Ethan was right – and I will explain why these measurements are not equal to the real ones. But the order is still correct, and it is Austria that has the longest border with Germany.]
Going back to our CodeJam’s participants. To their excuse I must say I was wrong about neighbors of Poland too! Poland has the longest border with Czech Republic, and not with Ukraine as I had said 🙂
The other interesting fact is that Germany has the biggest number of neighbors in Europe. Check it with the following statement.
select a.CNTRY_NAME as "country", count(b.CNTRY_NAME) as "neighbours" from "TESTSGEO"."cntry00" a join "TESTSGEO"."cntry00" b on a."SHAPE".ST_Touches(b.shape) = 1 group by a.CNTRY_NAME order by "neighbours" desc;
Russia has more neighbors, but then Russia is transcontinental country, which is the topic for another blog.
So, what is your country’s neighbor with the longest (or the shortest) border?
And what about #GeospatialTuesday in the title? Recently my fellows from the team started introducing thematic day blogs. Meredith Hassett with #APIFriday, Abdel DADOUCHE with #PredictiveThursday, and recently Jonathan Baker with #OpenSourceMonday. Seems like Lucia Subatin and DJ Adams haven’t chosen their days yet, so I need to rush before I am left with weekend only 😉
‘Till next week, then!
-Vitaliy, aka @Sygyzmundovych
I like my blogs to be a surprise... Unpredictable 😛
Nice job here, as always!
Another option for doing the JOIN, especially if you already imported the shapes with the "round-earth" SRID, would be to use "is empty" predicate on the intersection:
It is important to ensure that you also check that the "other" country is not the same as the target country, because you would get an error otherwise (the intersection between the target country's shape and itself will surely not result in a line-based shape; so the length call would fail for it).