More than just a great Oasis song… this is a special case to watch out for when using HANA Geospatial features. This will be obvious to some, but here is something not particularly well documented, and I felt it was worth sharing.
So far but yet so close
I’d been playing around with HANA’s Geospatial features. The challenge of the day was joining information from about 100,000 weather stations around the globe with nearby world cities. Part of this meant holding geo-coordinate information (i.e. Longitude and Latitude pairs) in columns with data type ST_POINT(4326) – the WGS84 international standard, and using the ST_DISTANCE method to return the distance between them. Easy enough.
Soon, I noticed something strange – although almost every city/station pair I had was behaving as expected, for a small handful I was getting some nonsensical results. Here are three such examples, along with a sensible result for Heidelberg (sorry SAP, Walldorf wasn’t in my table).
In the first example, my query was telling me that the nearest weather station to Auckland, New Zealand was in Moron de la Frontera… a Spanish Air Force Base… on the other side of the globe. And it thought that the distance was exactly zero.
My immediate assumption was that I had done something stupid in my select statements, or had formatted some data wrong. Wanting to sanity check, I looked up the co-ordinates for each area.
- Auckland: (174.764980799, -36.850013)
- Spanish weather station: (-5.6, 37.1833 ).
A quick check on Google maps showed that this was indeed correct… certainly both in their respective neighbourhoods.
So the co-ordinates seemed okay, and I even didn’t fall into the rookie trap of getting the longitude/latitude order wrong.
(for the uninitiated, ST_POINT has these in (long,lat) order, the opposite of the more common way of presenting, which has (lat,lon). Something I find slightly annoying).
Clearly something else was the matter…
Reproducing the issue
I thought it would be a good idea to try manually constructing the points, in case my SELECT or Calculation View logic was faulty. This also gave a distance of zero.
Now let’s move one of the points ever so slightly….
Holy moly! Now we see the correct distance of about 20,000 km.
What happened? All we did was shift one of the points about 5 km – you can see on the map it’s still in the same area.
So what’s going on? Is it because military bases are somehow restricted areas on the WGS84 standard?
Hmm… probably not. Let’s try with another one of my pairs which did not work.
Here’s the city of Shenzhen, China (114.122123099, 22.55237051):
And a weather station in La Quiaca, a sleepy town on the Argentina/Bolivia border. (-65.6, -22.1)
Again, these points manually give a distance of zero.
At this point, I saw that the Wikipedia article for La Quiaca has an interesting sentence:
“an approximate antipode to Hong Kong”.
^see, I told you…
For those unfamiliar with Asian geography, let’s zoom out of Shenzhen and see where it is…
Aha… right next to Hong Kong. So we seem to have found the culprit.
In all cases, the pairs which were showing as zero distance were about as far away as it is possible to get on the other side of the globe.
In these extreme situations, distance formulae tend to break down a little. Let’s dig further into this.
I believe the implementaiton of ST_DISTANCE in SAP HANA uses Vincenty’s Inverse Formula which you can read all about here.
The math may not mean much to you, but there is a little caveat at the end worth noting:
“Between two nearly antipodal points, the iterative formula may fail to converge”. In other words… the equation fails for points on the opposite side of the globe.
That’s exactly what happened here.
To further back this up, I found an online calculator with an explicit use of Vincenty’s Formulae here
It comes with a similar warning:
Vincenty’s inverse solution can fail on nearly antipodal points. Testing with GeographicLib test data, I’ve found this can happen with distances greater than 19,936 km, or within around 75 km of the antipodal point.
And sure enough, my co-ordinate pairs are throwing a ‘Not a Number’ error.
So there you have it. If you are implementing any similar Geospatial projects, be sure to double-check and protect against antipodal failures.
In my case, I mad a separate view to show the pairs giving exactly zero distance and manually confirmed the results.
I hope this saves someone a headache one day.
Good luck, and happy bug hunting!