Personal Insights
Your location history processed with SAP HANA – part 1
It all started with this question from Gregor Wolf:
“.@Sygyzmundovych have you ever tried to import the Google Location History from https://takeout.google.com/settings/takeout … into #HANA Geospacial?”
Well, in fact, I haven’t. But it turned out to be a good exercise to use two of SAP HANA’s multi-model processing capabilities: geospatial and JSON. As usually, I am using free SAP HANA, express edition (HXE) here.
Requesting and uploading data
I have a private Google account, so I requested Location History and My Places (places I rated on Google Maps) data:
- My Places are available as GeoJSON data, so it can be a topic for another post.
- Location History is available either as a plain JSON or KML geospatial format. I am going to request it in JSON format.
Once received requested data I downloaded it to my HXE’s /usr/sap/HXE/HDB90/work/
folder and uncompressed. Let’s look at the data structure.
hxeadm@hxehost:/usr/sap/HXE/HDB90/work/Takeout/Location History> head -20 Location\ History.json
{
"locations" : [ {
"timestampMs" : "1550519193999",
"latitudeE7" : 510754802,
"longitudeE7" : 169642113,
"accuracy" : 8,
"velocity" : 0,
"altitude" : 121,
"verticalAccuracy" : 3
}, {
"timestampMs" : "1550519070999",
"latitudeE7" : 510801796,
"longitudeE7" : 169665475,
"accuracy" : 8,
"velocity" : 9,
"heading" : 215,
"altitude" : 123,
"verticalAccuracy" : 3
}, {
...
Fields I am interested in are:
timestampMs
is a UTC timestamp in ms using UNIX Epoch time representation, i.e."1550519193999"
isFebruary 18, 2019 7:46:33.999 PM GMT
,latitudeE7
is a latitude multiplied by 10,000,000 (or1e7
) to represent it as an integer number, i.e."510754802"
is 51,0754802 value in a GPS position,longitudeE7
is a longitude multiplied by 10,000,000.
Loading data into SAP HANA
We can parse data before loading it in the database. But loading it in the original format will be a good example of processing JSON with SQL in SAP HANA.
CREATE SCHEMA "GOOGLE";
CREATE COLUMN TABLE "GOOGLE"."DUMP"(
"DATA" CLOB NOT NULL
);
IMPORT FROM CSV FILE '/usr/sap/HXE/HDB90/work/Takeout/Location History/Location History.json' INTO "GOOGLE"."DUMP"
WITH RECORD DELIMITED BY '\r'
FIELD DELIMITED BY '\t';
It is not a CSV file, but I pretended it is – with one record and with one field. That’s why I picked delimiters that are not present in the file.
Convert JSON to relational data using SQL
Next, I want to convert JSON into records of a LOC_HISTORY
table with a timestamp and a geographical position represented by ST_POINT
data type.
CREATE TABLE "GOOGLE"."LOC_HISTORY" AS
(SELECT
TO_SECONDDATE(ADD_SECONDS('1970-01-01 00:00:00', "TSMS" / 1000)) as "TIMESTAMP",
new ST_Point("LONE7"/1e7, "LATE7"/1e7) as "POINT"
FROM JSON_TABLE("GOOGLE"."DUMP"."DATA", '$.locations[*]'
COLUMNS
(
RN FOR ORDINALITY,
TSMS BIGINT PATH '$.timestampMs',
LATE7 BIGINT PATH '$.latitudeE7',
LONE7 BIGINT PATH '$.longitudeE7',
ALT INT PATH '$.altitude'
)
)
);
I used JSON_TABLE() function. Please note, that I am not using HANA’s Document store for this exercise, but only relational engine and its SQL.
I can see data in the table now. My location data is available from 2013 till yesterday.
Exercise 1: Reconstructing the walk
Let’s see. Last Sunday, February 17th, I went on my Instagram hunt between 10:27 and 11:42 local time (CET
time zone), while my children were attending UniKids classes. I just need to rework a procedure from my earlier blog Calculating a path from IoT GPS data stored as ST_Point.
CALL "GOOGLE"."PATH_FROM_POINTS"(
TS_START => LOCALTOUTC(To_TimeStamp('2019-02-17 10:27:00'), 'CET'),
TS_END => LOCALTOUTC(To_TimeStamp('2019-02-17 11:42:00'), 'CET'),
PATH => ?
);
And the result in WKB format is:
SRID=4326;LINESTRING (17.06334400177002 51.11322212219238,17.064226150512695 51.1140832901001,17.066997528076172 51.115102767944336,17.063793182373047 51.11575794219971,17.066638946533203 51.11593723297119,17.06586265563965 51.11723232269287,17.062926292419434 51.11690616607666,17.05934238433838 51.1182861328125,17.057364463806152 51.1193790435791,17.056079864501953 51.120232582092285,17.054750442504883 51.12098693847656,17.05545997619629 51.12276363372803,17.056791305541992 51.12459754943848,17.05857276916504 51.12675952911377,17.05574607849121 51.12709903717041,17.054173469543457 51.12527084350586,17.052987098693848 51.12366008758545,17.052385330200195 51.12179660797119,17.053675651550293 51.119704246520996,17.056140899658203 51.11831569671631,17.057037353515625 51.11619281768799,17.060548782348633 51.11597728729248,17.06048011779785 51.11384963989258,17.06330394744873 51.11301517486572)
This can be easily visualised using GeoJson.io website:
Location History stores the location roughly every 3 minutes, so the line is not smooth on the visualization, but reflects the path I did if comparing to the map generated on Endomondo data for the same walk:
In the second part of this blog post I do some spatial clustering on this data set.
‘Till next #GeospatialTuesday then,
-Vitaliy, aka @Sygyzmundovych
Hi Witalij,
finally I found some time and need to try out your blog. I'm trying to use the HDI Container available in the SAP Cloud Platfrom Cloud Foundry trial for that. First I converted the JSON into a csv for easier import into HANA:
As I don't have access to a file system local to the HANA DB I've installed the SAP Data Hub developer edition and created this Graph which reads the CSV in 1MB chunks and the HANA client processes 100 rows at a time:
Using that my 2.032.779 rows where imported in just about 15 minutes:
And I could start the conversion into Points
which took only took 9 seconds and now my complete location history from January 2011 to February 2019 is in the LOC_HISTORY table:
But it seems that the HANA Database used for the HDI container are not yet on SAP HANA 2.0 SP 3. Because I get the following error when I try to create the procedure:
the source I'm using is:
Maybe you have a tip there.
Best regards
Gregor
And what is the version in M_DATABASE view?
https://help.sap.com/viewer/4fe29514fd584807ac9f2a04f6754767/2.0.03/en-US/20ae63aa7519101496f6b832ec86afbd.html
Greetings from vacations on Malta ?? i’ll Need to check what Google locations tracked from here 😉
Thank you for the hint. It's only 2.00.002.00.1490188421 :-(. Enjoy Malta!
I hope you are on SPS04 in the meantime, because I just published part 3 using the new aggregation algorithm from that service pack 🙂