Technical Articles
2019 in review, or … Location history processed with SAP HANA (part 3): honeycomb clustering
Today is my last work day of 2019. So, exceptionally I am going to post Geospatial Tuesday post on Friday.
Looking back at 2019 I must say it was quite a year:
- Presented at 27 events, where 7 of them were for non-SAP audiences, like GeekWeekWro or GDG DevFest Poland,
- Data Engineering with SAP Data Hub was added as a new topic of SAP CodeJams, and thank you (Pawel Grzeskowiak, Martin Fischer, ASUG Michigan, Jakob Flaman, Leszek Bednarz) for hosting 5 of them already this year (4 more already requested for 2020),
- Crossed the mark of 100 blog posts on SAP Community (this one is #102),
- Together with our SAP Developer Advocates and Friends team — Lucia Subatin, Riley Rainey, Maximilian Streifeneder, Marius Obert, DJ Adams, Kevin Muessig, Josh Bentley, Rich Heilman, Thomas Jung, Kevin Nelson and Casey — interacted with developers attending our Developer Garages in all three SAP TechEd locations world-wide,
- Hosted 5 meetups of SAP Community here in Wrocław, plus another unforgettable SAP Inside Track #sitWRO (and I am really proud that Bartlomiej Slawinski, Ewelina Pękała and Izabela Rębisz added Warsaw, Katowice and Kraków to the map of SAP Community events in Poland in 2019; Michal Krawczyk, what about Poznań?);
- Last, but not least: enjoyed beers at SAP Stammtisch‘es in Stuttgart, Bern, Kraków, Warsaw (x2), Berlin (x2) and Istanbul (ok, it was rakı there — but on my explicit request to Abdulbasit Gulsen, Huseyin Dereli, Ridvan Polat, Mustafa Kerim Yılmaz).
Now it is time to move 2019 badges to the archive box. Actually, I need another archive box as there is no space left in the current one…
Going back to SAP HANA
I mentioned some locations visited already, but let us get back to employing SAP HANA, express edition (aka HXE), to analyze location data from 2019.
One thing that happened since my last post on the topic was the release of service pack 4 of HXE 2.0 and with it came some new functionality, like the new spatial aggregation method. My HXE system is already upgraded to SPS 04, and hopefully yours too.
Exercise 5: Hexagonal aggregation
Hexagonal grid-based aggregation is often called honeycomb, and has quite a few well discussed advantages.
So, let’s use it for my 2019 in Review. I reloaded the newest file provided by Google Takeout, and now let’s run some queries.
SELECT St_UnionAggr("Honeycomb") as "Honeycombs"
FROM (SELECT COUNT(*), ST_ClusterID(), ST_ClusterCell() as "Honeycomb"
FROM (SELECT point
FROM "GOOGLE"."LOC_HISTORY"
WHERE YEAR("TIMESTAMP") = 2019)
GROUP CLUSTER BY point
USING HEXAGON Y CELLS 20);
And the result for 2019 is:
This would be about right – except of the visit to Republique du Congo. This is just an outlier, i.e. an error recorded by GPS sensor. Because of navigation ubiquity nowadays we keep forgetting that’s really complex topic where “the Special and General theories of Relativity must be taken into account“.
Let me look more closely on travels within Poland. For that I downloaded the latest shape file with country boarders from http://thematicmapping.org/downloads/world_borders.php and uploaded it into my HXE database.
IMPORT "TM"."WORLD_BORDERS" AS SHAPEFILE FROM '/usr/sap/HXE/HDB90/work/spatial/TM_WORLD_BORDERS-0.3/TM_WORLD_BORDERS-0.3' WITH REPLACE SRID 4326 THREADS 4;
Now let me run the query that selects only locations within boarders of Poland.
SELECT St_UnionAggr("Honeycomb") as "Honeycombs"
FROM (SELECT COUNT(*), ST_ClusterID(), ST_ClusterCell() as "Honeycomb"
FROM (SELECT point
FROM "GOOGLE"."LOC_HISTORY" l, "TM"."WORLD_BORDERS" b
WHERE YEAR(l."TIMESTAMP") = 2019
AND b."ISO2" = 'PL'
AND l.point.ST_CoveredBy(b."SHAPE")=1)
GROUP CLUSTER BY point
USING HEXAGON X CELLS 60);
One additional change you may notice in the query is change from HEXAGON Y
to HEXAGON X
. This changes the alignment of hexagons, where now they are arranged row-wise, that is, the tips are pointing upwards and downwards.
The map clearly shows directions in which I travelled this year to present at Career in IT events in Poland: Poznań, Warsaw and Kraków 🙂
And with that…
…it is time to wish you all good mood, enjoyable break (if you take some), Happy New Year (if a year is changing in your calendar), and “see” you here at community.sap.com in 2020!
Till next year!
-Vitaliy (aka @Sygyzmundovych)
Impressive badge collection
You really deserve your firstname: Vitality ?;)
Thanks for your inspirational blog-series.
??
Thanks, Christian. Actually the root of my name "vital" comes from Latin "loving life", so I am thinking maybe I should change it to "Beeraliy" ??
do you have any statistics how often your name was spelled correctly?
That I can count on fingers, no need to start a HANA instance ?