Technical Articles
From QCon London to Kariera IT Gdańsk – the week-long #GeospatialTuesday
It’s been busy week. That’s why #GeospatialTuesday entry is coming on Friday. Better late than never. The week started with QCon conference in London, where I talked about and demoed geospatial data processing using SAP HANA. At the same time Craig was doing his magic explaining Express edition to the conference participants.
While travelling to London I found that interesting discussion on Reddit (the original post there was deleted for whatever reason) about the size of London comparing to European countries. What a great idea to show the use of SAP HANA and data sets we’ve been using in previous blogs already!
So, the question is what countries in Europe have population smaller than London?
Here is the answer from my SAP HANA Express and the data we already have after previous blogs.
Now let me walk you through how I approached it.
We should have all required data available already:
- GeoNames (which has population data and locations for countries and cities) in schema
GEONAME
, - Country shapes in the table
"TESTSGEO"."cntry00"
, - Continents shapes in the table
"TESTSGEO"."continent"
From GeoNames data we need data for the Great Britain (instead of Poland as in the previous blog) and countries (where featurecode
is PCLI
) only. To get all countries lets download the complete dataset, but then filter and upload only what we need.
wget http://download.geonames.org/export/dump/allCountries.zip
unzip allCountries.zip
cat allCountries.txt | grep PCLI > allPCLI.txt
wc -l allPCLI.txt
The output from the last command (counting lines in allPCLI.txt
file) was 201 for me at the time of writing.
Let’s load this file into a separate table, so following statements will be more clear.
CREATE TABLE "GEONAME"."PCLIS" LIKE "GEONAME"."GEONAMES" WITH NO DATA;
IMPORT FROM CSV FILE '/usr/sap/HXE/HDB90/work/geonames/allPCLI.txt' INTO "GEONAME"."PCLIS"
WITH FIELD DELIMITED BY '\t'
DATE FORMAT 'YYYY-MM-DD'
ERROR LOG 'PCLI.err'
NO TYPE CHECK
THREADS 8
OPTIONALLY ENCLOSED BY ''
COLUMN LIST ("GEONAMESID", "NAME", "ASCIINAME", "ALTERNATENAMES", "LATITUDE", "LONGITUDE", "FEATURECLASS",
"FEATURECODE", "COUNTRYCODE", "CC2", "ADMIN1CODE", "ADMIN2CODE", "ADMIN3CODE", "ADMIN4CODE", "POPULATION",
"ELEVATION", "DEM", "TIMEZONE", "MODIFICATIONDATE");
UPDATE "GEONAME"."PCLIS"
SET "LOC_4326" = ST_GeomFromText('POINT(' || "LONGITUDE" || ' ' || "LATITUDE" || ')', 4326),
"LOC_3857" = ST_GeomFromText('POINT(' || "LONGITUDE" || ' ' || "LATITUDE" || ')', 4326).ST_Transform(3857)
WHERE "LOC_4326" is null;
Now that all data is there let query it. First to check the population of London…
select a."NAME", a."POPULATION", a."MODIFICATIONDATE"
from "GEONAME"."GEONAMES" a
join "GEONAME"."FEATURECODES" f
on f."CODE" = a."FEATURECLASS" || '.' || a."FEATURECODE"
where f."NAME" = 'capital of a political entity'
and a.NAME = 'London';
/* The result at the time of writing this blog
NAME;POPULATION;MODIFICATIONDATE
London;7 556 900;2016-09-14
*/
…and not to find European countries with population smaller than London…
select
ST_UnionAggr(d."SHAPE").ST_asGeoJSON()
from ( select b."LOC_4326"
from "GEONAME"."GEONAMES" a
join "GEONAME"."PCLIS" b on a.POPULATION >= b.POPULATION
join "TESTSGEO"."continent" c on c."SHAPE".ST_Covers(b."LOC_4326") = 1
where a.NAME = 'London'
and a.FEATURECODE = 'PPLC'
and c."CONTINENT" = 'Europe' ) c
join "TESTSGEO"."cntry00" d on c."LOC_4326".ST_Transform(1000004326).ST_Within(d."SHAPE") = 1;
…and use another popular website http://geojsonlint.com/ to visualize the GeoJSON output from the query. That was it.
I am writing this in Gdańsk airport finishing this busy week. Today I presented here at Kariera IT conference. And I am excited as it was the biggest number of participants I had so far at this events series. Next stop will be in Warsaw on March 24th. See you there?