There is a massive shift underway in analytics: the line between traditional analytics (where we primarily looked at the past and used line, bar and pie charts to represent data) and predictive analytics (which, taking a very broad view, includes statistics as well as machine learning and deep learning) is blurring. We no longer look just at “what happened”, but increasingly at “why did it happen” and “what is going to happen”. Add to that Big Data, which is hard to make sense of without statistical or predictive analysis, and it is clear that our visualization tools will increasingly need to be able to visualize the results of such analysis.
In this blog post, I will introduce three Lumira extensions that show how this can be done. The code for these with sample files is available from SAP’s lumira-viz-library repository on GitHub.
Forecast with 80% and 90% confidence intervals
The first chart shows actuals with a forecast and its 80% and 90% confidence intervals. Confidence intervals are a standard statistical technique for showing how much certainty there is in a forecast. The narrower the confidence intervals, the more reliable the forecast; the wider they are, the more uncertainty we have to deal with. Suppose we are using such a forecast to decide where to invest. If the confidence intervals are very wide, looking only at the forecast itself could be dramatically misleading.
You can see clearly in this example (based on per capita GDP WDI data from 2014) that the confidence interval for Australia is really narrow (no surprise, as the actuals are very smooth to begin with). But the situation is very different for Greece. While the forecast itself shows an upward trend, the confidence intervals are really wide, and we should certainly be prepared (simply based on past performance) for the possibility that it doesn’t recover at all. (Obviously, predictive algorithms can’t predict political changes. Who knows what will happen to Greece? The point is that there is a great deal of uncertainty…)
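To make the relationship between uncertainty and interval width concrete, here is a minimal sketch of how 80% and 90% bounds could be derived from a point forecast. The post does not say how the bounds in the sample files were produced, so everything here is an assumption for illustration: the function name intervalBands is hypothetical, the forecast errors are assumed to be normally distributed, and a standard error per forecast step is assumed to be available.

```javascript
// Two-sided z-scores under the (assumed) normal-error model.
const Z = { 80: 1.2816, 90: 1.6449 };

// Hypothetical helper: forecast and stdErr are parallel arrays of numbers.
// Returns one row per forecast step with lower/upper bounds for each level.
function intervalBands(forecast, stdErr, levels = [80, 90]) {
  return forecast.map((point, i) => {
    const row = { forecast: point };
    for (const level of levels) {
      const halfWidth = Z[level] * stdErr[i];
      row["lo" + level] = point - halfWidth;
      row["hi" + level] = point + halfWidth;
    }
    return row;
  });
}
```

A larger standard error widens both bands, and the 90% band is always wider than the 80% band for the same step. That is the visual cue separating the Australia and Greece examples above.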
Forecast with single confidence interval
Once I had developed the chart above, a colleague asked me if we could do the same with a single confidence interval, and most certainly we can. In this example, we have a seasonal time series, and you can see the chart handles that pretty well.
Holt-Winters Exponential Smoothing
Another common predictive analysis is exponential smoothing, where we give the algorithm a dataset and it smooths out the noise to reveal a clearer signal, typically shown as a trend line of some kind. There are different variations of this; to produce the data file I used Holt-Winters. The chart should work just as well with other smoothing techniques, including moving averages. In this case, we’re applying Holt-Winters exponential smoothing (in red) to a seasonally adjusted time series of tomatoes sold by weight (in blue) to see if there is a trend. (There isn’t really one; it is largely stable. There is a lot of daily variation even after removing the seasonal effect, but the exponential smoothing shows there is just a slight growth over the course of the ~2.5 years included in this set.)
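For readers who have not met the technique, the sketch below shows simple (single) exponential smoothing. This is deliberately not Holt-Winters, which adds trend and seasonal components; it is the most basic member of the same family and is only meant to show what "smoothing out" means. The function name and the alpha parameter are illustrative choices, not taken from the extension's code.

```javascript
// Simple exponential smoothing: each smoothed value is a weighted blend of
// the current observation and the previous smoothed value.
// alpha close to 1 follows the data closely; alpha close to 0 smooths heavily.
function exponentialSmoothing(values, alpha) {
  if (!(alpha > 0 && alpha <= 1)) {
    throw new RangeError("alpha must be in the range (0, 1]");
  }
  const smoothed = [];
  values.forEach((x, i) => {
    // Seed the series with the first observation.
    smoothed.push(i === 0 ? x : alpha * x + (1 - alpha) * smoothed[i - 1]);
  });
  return smoothed;
}
```

For example, exponentialSmoothing([10, 20, 30], 0.5) returns [10, 15, 22.5]. The smoothed series lags behind the raw values, which is what damps day-to-day variation.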
SVG Path Mini Language
d3.svg.line() and d3.svg.area() are just helper functions around SVG paths. You can build an SVG path yourself by concatenating a string of command letters and coordinates. A line starts with ‘M’ (move to), and each subsequent step is indicated by an ‘L’ (line to). Each command takes X and Y coordinates, which lets you draw whatever line or area you want. So, “M0,0L10,0L10,10L0,10L0,0” would create a little 10×10 square outline. Then, rather than push all data points through d3.svg.line() or .area(), we simply put the “pen down” the first time we see a value, and continue until the values stop again.
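The “pen down” idea can be sketched roughly as follows. This is an illustration under assumptions, not the extension's actual code: the function name gappedLinePath is hypothetical, and the points are assumed to be already scaled to pixel coordinates, with a missing value represented by a y of null, undefined or NaN.

```javascript
// Build an SVG path string by hand, lifting the pen wherever a value is missing.
function gappedLinePath(points) {
  let d = "";
  let penDown = false;
  for (const p of points) {
    const missing = p.y === null || p.y === undefined || Number.isNaN(p.y);
    if (missing) {
      penDown = false; // lift the pen; the next real value starts a new segment
      continue;
    }
    // 'M' starts a new sub-path, 'L' continues the current one.
    d += (penDown ? "L" : "M") + p.x + "," + p.y;
    penDown = true;
  }
  return d;
}
```

Given the points (0,5), (10,7), (20,missing), (30,3) and (40,4), this produces “M0,5L10,7M30,3L40,4”: two separate line segments with a gap in between. The resulting string can be assigned to the d attribute of a path element, for example with a d3 selection's .attr("d", ...). Note that a single isolated value yields only an ‘M’ with no following ‘L’, so it draws nothing visible.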
Code and sample files
You can find all the code and sample files in SAP’s lumira-viz-library on GitHub.