The data geek challenge – using SAP Lumira.
I decided to attempt the challenge using some Bureau of Meteorology (Australia) data, which has a very long record of rainfall in Hobart (where I live).
I downloaded the data from www.bom.gov.au and then used SAP Lumira to import the parts of the CSV that I wanted.
It is easy to select the data that you want to bring into the tool: you can either toggle Select All, or select/deselect individual rows.
As I had a number of blank rows in my .csv file, I deleted these prior to importing.
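If you would rather script this clean-up than delete the rows by hand, the following is a minimal Python sketch of the idea. It is not something Lumira does for you, and the column names and values in the sample are made up for illustration; they are not the real layout of the Bureau of Meteorology export.

```python
import csv
import io

# Hypothetical sample data: the column names and values are illustrative only.
raw = """Year,Month,Day,Rainfall_mm
1931,1,1,0.0

1931,1,2,2.4
,,,
1931,1,3,0.0
"""

def drop_blank_rows(text):
    """Return the CSV rows that contain at least one non-empty cell."""
    reader = csv.reader(io.StringIO(text))
    # An empty line parses to [], and a line of only commas parses to
    # a list of empty strings; both are treated as blank here.
    return [row for row in reader if any(cell.strip() for cell in row)]

rows = drop_blank_rows(raw)
print(len(rows))
```

The cleaned rows could then be written back out with `csv.writer` before importing the file.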
Once I had imported the data into Lumira, I wanted to reduce the amount of data that I was looking at, as the dataset contained data for every day from 1895! For the demonstration I wanted to look at the variation in Hobart at 40-year intervals: 1931, 1971 and 2011. Using the data view, I was able to replicate the column with the year in it, group the years 1931, 1971 and 2011, select these years, and then remove all the other years from my dataset.
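The same "keep only three years" filter can be sketched outside the tool in a few lines of Python. This is only an illustration of the logic, assuming each record is a `(year, month, day, rainfall_mm)` tuple; the sample values and the `keep_years` helper are hypothetical.

```python
# Hypothetical records as (year, month, day, rainfall_mm) tuples.
records = [
    (1930, 12, 31, 1.2),
    (1931, 1, 1, 0.0),
    (1971, 6, 15, 3.4),
    (1990, 3, 9, 0.0),
    (2011, 9, 2, 5.0),
]

YEARS_OF_INTEREST = {1931, 1971, 2011}

def keep_years(records, years):
    """Keep only the records whose year is in the given set."""
    return [r for r in records if r[0] in years]

selected = keep_years(records, YEARS_OF_INTEREST)
print(sorted({r[0] for r in selected}))
```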
Enrichment: I decided that I wanted to enrich the data for the output and create some measures. In the bottom left-hand box, I selected Show (for the semantic elements available). I then selected Enrich to create measures for each of these rows of data.
My data also has some time variation: for example, there are days, months and years available in the dataset. I wanted to create a time hierarchy for these data, as I wanted to show the variation in the days that rain fell in the different years. I selected them in the attributes table, and then selected Create Time Hierarchy.
At this point, I wanted to start visualising the data, so I switched from data view to split view, and looked at the amount of rainfall per month for the 3 years that I had previously selected. I was not too sure what output would best demonstrate the variation in rainfall, so I tried a couple of different types of visualisations.
Has the amount of rain changed that much between 1931 and 2011? For this visualisation, I looked at the average rainfall by day (in blue) and the sum of rainfall by day (in red) for the different years.
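For readers who want to check figures like these outside the tool, here is a rough Python sketch of the two measures, the total and the average daily rainfall per year. The records are invented sample values, not the Hobart data, and `rainfall_by_year` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records as (year, month, day, rainfall_mm) tuples.
records = [
    (1931, 1, 1, 0.0),
    (1931, 1, 2, 2.0),
    (1931, 1, 3, 4.0),
    (2011, 1, 1, 1.0),
    (2011, 1, 2, 3.0),
]

def rainfall_by_year(records):
    """Return the total and mean daily rainfall (mm) for each year."""
    by_year = defaultdict(list)
    for year, _month, _day, mm in records:
        by_year[year].append(mm)
    return {
        year: {"total_mm": sum(values), "mean_mm_per_day": mean(values)}
        for year, values in by_year.items()
    }

summary = rainfall_by_year(records)
for year in sorted(summary):
    print(year, summary[year])
```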
I then went back to the heat map to see if there was that much variation in the days. From this image, you can see that the days with the highest rainfall change from year to year. Not really that surprising. Next, I wanted to look at the monthly variation.
I then wanted to look at a few other things with the data, using different charting tools. So I started to look at the relationship between the 3 years chosen above and the days with little to no rain, to see how these varied over the years.
Noticing the large amount of data sitting on 0 mm of rain (days without rainfall), I then filtered the data to look only at the rainfall values equal to 0 mm. The aim of this was to see if there were more or fewer days with no rain over the years (this is then plotted by month). It makes a pretty picture too!
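The "filter to 0 mm, then count by month" step can be sketched in Python as follows. Again, this only illustrates the logic and is not how Lumira does it internally; the records and the `dry_days_by_month` helper are hypothetical.

```python
from collections import Counter

# Hypothetical records as (year, month, day, rainfall_mm) tuples.
records = [
    (1931, 1, 1, 0.0),
    (1931, 1, 2, 2.4),
    (1931, 1, 3, 0.0),
    (1971, 1, 1, 0.0),
    (2011, 2, 1, 1.0),
]

def dry_days_by_month(records):
    """Count the days with exactly 0 mm of rain for each (year, month)."""
    counts = Counter()
    for year, month, _day, mm in records:
        if mm == 0:
            counts[(year, month)] += 1
    return counts

dry = dry_days_by_month(records)
for key in sorted(dry):
    print(key, dry[key])
```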
But was this really significant? Looking at the chart below, and at the average number of days, you can see that there is actually not that great a variation in the number of days without rain in Hobart between 1931, 1971 and 2011.
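One way to put a number on that comparison is the average count of dry days per month for each year. The sketch below assumes that is the average being compared, which may differ from what the chart shows; the records are made-up sample values and `mean_dry_days_per_month` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical records as (year, month, day, rainfall_mm) tuples.
records = [
    (1931, 1, 1, 0.0),
    (1931, 1, 2, 0.0),
    (1931, 2, 1, 0.0),
    (1931, 2, 2, 3.0),
    (2011, 1, 1, 0.0),
    (2011, 1, 2, 1.0),
    (2011, 2, 1, 2.0),
]

def mean_dry_days_per_month(records):
    """Average number of 0 mm days per month, for each year in the data."""
    dry = Counter()  # (year, month) -> number of dry days
    months = {}      # year -> set of months present in the data
    for year, month, _day, mm in records:
        months.setdefault(year, set()).add(month)
        if mm == 0:
            dry[(year, month)] += 1
    # Months with no dry days still count towards the average (as zero).
    return {
        year: sum(dry[(year, m)] for m in seen) / len(seen)
        for year, seen in months.items()
    }

averages = mean_dry_days_per_month(records)
for year in sorted(averages):
    print(year, averages[year])
```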
So where to from here? As I progressed with the tool, I noticed how easy it was to change the title and the names of datasets. I guess with more time, there are other tips and tricks that the user could pick up. This looks like a great tool to visualise data.