The Starting Point
I wanted to build a cool demo of Predictive Factory for this event, with one goal:
Forecast the growth of Internet usage per country.
Gathering the Source Data
I had my burning question, and went hunting for data. I turned to the World Bank Open Data website, usually a goldmine for such stats.
The indicator Internet users (per 100 people) was a perfect fit. You can learn about this indicator here.
To keep this demo simple, I used no extra variables (adding some would be a nice next step). The dataset contained only three fields: year, country, and Internet users (per 100 people).
I cleaned up the initial list of countries, as the data was not always reliable or complete. At the end of this, I was left with 153 countries in my dataset.
I selected a year range matching the 20 years of TechEd: 1996 to 2015 (the last year available).
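For readers who want to reproduce this preparation step outside the tool, here is a minimal sketch in Python. It is not the exact cleanup I applied: the rule used here (keep a country only if it has a value for every year from 1996 to 2015) is an assumption, and `prepare_dataset` and the `(country, year, value)` row layout are hypothetical names chosen for illustration.

```python
from collections import defaultdict

FIRST_YEAR, LAST_YEAR = 1996, 2015  # the year range used in this demo


def prepare_dataset(rows):
    """Keep only countries that have a value for every year in the range.

    `rows` is an iterable of (country, year, value) tuples, where `value`
    is Internet users per 100 people, or None when the figure is missing.
    The "complete series only" rule is an assumption for this sketch.
    """
    by_country = defaultdict(dict)
    for country, year, value in rows:
        if FIRST_YEAR <= year <= LAST_YEAR and value is not None:
            by_country[country][year] = value

    expected_years = range(FIRST_YEAR, LAST_YEAR + 1)
    cleaned = []
    for country in sorted(by_country):
        series = by_country[country]
        # Drop the whole country if any year in the range is missing.
        if all(year in series for year in expected_years):
            cleaned.extend((country, year, series[year]) for year in expected_years)
    return cleaned
```

The result is a long-format list with exactly the three fields described above, sorted by country and year.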
Creating My Time Series Predictive Model
Since the recent release of SAP BusinessObjects Predictive Analytics 3.1, it is possible to create time series predictive models directly within the Predictive Factory.
I created a project, pointed it to the right data source, and from there creating the predictive model is literally one simple form-filling step.
I click on Time Series Forecasting to create the predictive model.
The segment used is the Country variable, which means one separate predictive model will be created for each country. Three forecasts are requested, for the years 2016, 2017 and 2018.
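To illustrate what "one model per segment" means, here is a rough sketch in Python. It does not reproduce the algorithm Predictive Factory uses: a plain least-squares straight line stands in for the real time series model, and the function names are hypothetical. A straight line will not capture effects such as a plateau, so treat this only as an illustration of the segmentation idea.

```python
from collections import defaultdict


def linear_forecast(years, values, horizon):
    """Fit value = a + b * year by ordinary least squares and extrapolate.

    Stand-in model for this sketch only. Needs at least two distinct years.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    last = max(years)
    # Clamp to 0..100 because the indicator is "users per 100 people".
    return {
        last + step: min(100.0, max(0.0, intercept + slope * (last + step)))
        for step in range(1, horizon + 1)
    }


def forecast_by_segment(rows, horizon=3):
    """Build one independent forecast per country (the segment variable).

    `rows` is an iterable of (country, year, value) tuples.
    """
    segments = defaultdict(list)
    for country, year, value in rows:
        segments[country].append((year, value))

    forecasts = {}
    for country, points in segments.items():
        points.sort()  # order each series by year
        years = [year for year, _ in points]
        values = [value for _, value in points]
        forecasts[country] = linear_forecast(years, values, horizon)
    return forecasts
```

With data ending in 2015 and the default `horizon=3`, each country gets its own forecasts for 2016, 2017 and 2018, fitted only on that country's history.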
Debriefing the Model
In the Reports tab, I see the performance of my predictive models for the different countries: the most accurate models (aka “Top Segments”) and the least accurate ones (aka “Bottom Segments”).
There is a warning for Poland, as there is not enough confidence to predict the next 3 years.
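The idea behind a Top and Bottom Segments ranking can be sketched in a few lines of Python. This is an illustration under assumptions: it uses the mean absolute percentage error (MAPE) as the accuracy measure, and the post does not say which metric the report relies on. The function names are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, ignoring points where actual is 0."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is 0")
    return sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def rank_segments(errors, n=3):
    """Return (top, bottom) segment names from a {segment: error} mapping.

    `top` lists the n most accurate segments (lowest error first),
    `bottom` lists the n least accurate ones (highest error first).
    """
    ordered = sorted(errors, key=errors.get)
    return ordered[:n], ordered[-n:][::-1]
```

Computing `mape` per country on held-out years and passing the results to `rank_segments` gives a list comparable in spirit to the one in the report.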
I can look at the predictive model for Germany. I see that Internet usage is reaching a kind of plateau, and only one percentage point of growth is gained each year.
The situation for a country like India is very different: Internet usage is predicted to grow to 35% of the population in 2016 and 47% in 2017. This means approximately 162 million new Internet users. No wonder companies like Facebook are so interested in India’s appetite for the Internet at the moment ;-).
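The 162 million figure follows from the 12-point jump between 35% and 47%. The population base is not stated in the post, so the small Python sketch below works backwards from the quoted numbers to show the base they imply, roughly 1.35 billion people. The function name is hypothetical.

```python
def implied_population(new_users, share_before, share_after):
    """Population base implied by a user gain and two usage shares.

    Shares are fractions of the population (0.35 means 35%).
    """
    gain = share_after - share_before  # 0.47 - 0.35 = 0.12, i.e. 12 points
    return new_users / gain


# Figures quoted in the text: 35% in 2016, 47% in 2017, 162 million new users.
population = implied_population(162_000_000, 0.35, 0.47)
print(f"Implied population: {population / 1e9:.2f} billion")
```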
Scheduling the Forecasts Refresh
The purpose of the Predictive Factory is to cover a predictive project end to end: not only the initial creation of the predictive model but also the subsequent steps.
So the next question is: “When I get the new data point for 2016, how can I automatically refresh the predictive model to forecast up to the year 2019?”
Again, I just need to fill in a form and off I go!
I ask for a model refresh at the end of each year, with 3 forecasts to be generated and written to a dedicated table.
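The logic of such a yearly refresh can be sketched in Python as follows. This is not how Predictive Factory implements scheduling; it only illustrates the rolling forecast window and the idea of a dedicated output table. The table name `internet_forecasts`, its columns, and both function names are assumptions made for this example, and SQLite stands in for whatever database holds the results.

```python
import sqlite3


def forecast_years(last_observed_year, horizon=3):
    """Years to forecast once data up to `last_observed_year` is available."""
    return [last_observed_year + step for step in range(1, horizon + 1)]


def store_forecasts(conn, run_year, forecasts):
    """Write one refresh run into a dedicated table (hypothetical schema).

    `forecasts` maps country -> {year: predicted value}.
    Returns the number of rows written.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS internet_forecasts ("
        "run_year INTEGER, country TEXT, year INTEGER, value REAL, "
        "PRIMARY KEY (run_year, country, year))"
    )
    rows = [
        (run_year, country, year, value)
        for country, by_year in forecasts.items()
        for year, value in by_year.items()
    ]
    # INSERT OR REPLACE makes rerunning the same refresh idempotent.
    conn.executemany(
        "INSERT OR REPLACE INTO internet_forecasts VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)
```

Once the 2016 data point arrives, `forecast_years(2016)` returns `[2017, 2018, 2019]`, which is exactly the shift in the forecast window described above.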
Creating this end-to-end process in the Predictive Factory was quite a straightforward and pleasant experience. You can easily imagine the same process deployed in many industries such as retail, telco, manufacturing… every industry needs to forecast and plan ahead!
If I find more time in the future (this one is difficult to predict ;-)), I would like to see how I can add valuable extra predictor variables to further refine the accuracy of the forecasts and further narrow the confidence intervals. I am thinking, for instance, of GDP growth, population growth, etc.
I look forward to hearing from you on your adventures with the Predictive Factory!
The data used in this blog comes from the International Telecommunication Union, World Telecommunication/ICT Development Report and database, and World Bank estimates.