Data Geek Challenge – Using SAP VI to analyze States Energy Data from 1960 to 2009 in relation to population data
It’s been fun playing with SAP Visual Intelligence (VI) over the last month or so, using it to look at and analyze various public domain data sets. In the original blog article, I showed how SAP VI was used to look at US States Energy Data from 1960 to 2009.
I have updated this blog post to include additional analysis – this time including the population data for the US States from 1960 to 2009, and to present some visualizations that lead to some interesting conclusions. Update Dec 17: I have uploaded an updated YouTube video which includes the population data. I have also embedded the video at the end of this article.
I have broken down this updated post into two sections: the first section dealing with the earlier energy usage trends and which was more about using the cool features of SAP VI in some hopefully interesting ways, and the second newer section with an analysis of energy usage in relation to the population data, wherein SAP VI is used for critical analysis and coming up with some interesting conclusions.
Hope you find this as enjoyable and interesting as I did!
Section I: Energy usage trends across the decades
1. States’ Energy data was obtained from www.data.gov (can also be obtained through www.eia.gov). The energy category chosen is “Asphalt and Road Oil consumed by the industrial sector.
2. After loading the data into VI, created a measure on energy usage, and also a geographic hierarchy on State.
3. Making use of the trellis feature, used SAP VI to visualize the energy usage trends across the decades.
4. Using saved visualizations and coming up with a time-lapse video of energy usage across decades.
Shown here are some stills from the presentation (all linked to the earlier youtube video).
I.1 Chloropleth Chart of Energy Usage from 2000 to 2009 for US – contiguous states. A nice way of visualizing energy usage by state.
I.2 Radar Chart of Energy Usage from 1960 to 1969 for all states. Another neat way of looking at the data.
I.3 Utilizing a 3D Column Chart to visualize Energy Usage from 1989 to 2009 for the three states of Nevada, Pennsylvania, Texas
Section II: States’ Energy Usage in relation to Population
1. States’ (historical) population data was obtained from the US census site www.census.gov
2. The merge feature of SAP VI was used to merge the two data sets (energy and population) on State+Year.
[Notes: a) had to come up with the unique key column State+Year outside of VI; b) I also wanted to come up with a calculated column on energy usage per person – the Add Calculation feature in VI seems limited to one (measure) column, so had to do this outside of VI.]
3. With all the data loaded, created measures for sum energy usage (in Billion Btu), and average energy usage per person (in KBtu). Also created once again the geographic hierarchy on State.
4. Now for some visualizations! The chart type chosen is the Column Chart with Dual Value axes.
II.1 Population and Energy Usage (Sum, and Avg per person) from 1960 to 2009
Note, as population has been on a steady rise over the years, energy usage has trended up twice, with a marked dip in the 70s to early 80s, followed by another marked dip starting in the mid-2000s. The first dip correlates well with the economic recession in the 70s, with the more recent dip probably being caused by higher oil prices of late, which in turn has caused numerous counties/municipalities to stop paving some of their roads, or converting them to gravel roads because they can no longer afford paving them. Also, the avg energy usage per person (in KBtu), while plotted on this chart, is too small to be seen. Let’s check the same data for the three states seen earlier in chart I.3: Nevada, Pennsylvania, and Texax.
II.2 Population and Energy Usage (Sum, and Avg per person) from 1960 to 2009 for NV, PA, and TX
Somewhat like II.1 but with a lot more scatter, and with a just about noticeable downward trend in the average energy usage.
Do all the states have similar data charts? No! The charts for Alaska and DC are quite different.
II.3 Population and Energy Usage (Sum, and Avg per person) from 1960 to 2009 for Alaska
The spike seen in the chart correlates nicely with the increased spending in the oil and gas extraction sector in Alaska from 1997 to 2002.
II.4 Population and Energy Usage (Sum, and Avg per person) from 1960 to 2009 for DC
The population in DC has actually trended down till about 1999, while energy usage on the other hand is seen to be slowly trending up on the whole through to the 90s, then beginning to trend down for the last few years – perhaps no more space in DC for any more new roads to be built! 😛 I’ll leave more interesting speculations to the reader.
Finally, let’s look at the data for a couple of states having budgetary issues of late, CA and IL. For this chart, I have used State as a Legend Color on the X-axis dimension.
II.5 Population and Energy Usage (Sum, and Avg per person) from 1960 to 2009 for CA and IL.
Again, I’ll leave it to the reader to derive some conclusions from this chart, and try and correlate those with data/information from other sources. You’re most welcome to leave those conclusions as feedback to this blog post, and if there are any questions for me, I’ll get back to you.
Finally, here is the embedded YouTube video:
Thanks for reading.
Cheers, and Happy Holidays!