I would like to share something with you that I have built in close collaboration with other friends of Predictive Analytics, HANA and BI. This blog is going to be a living document so please expect that the content will keep growing and being updated.

My own first experience with what is today mostly known as predictive analytics was back in 1999. At the time I was working as a business consultant designing and implementing traditional Business Intelligence solutions on the company’s own ERP system. The BI solutions were at that time mainly based on fixed reporting, perhaps a bit of self-service reporting and in some cases even dashboards with links to the Balanced Scorecard methodology.

Then a customer contacted us asking for assistance in building a better sales forecast, in order to achieve better utilization of their manufacturing resources and, overall, to even out the capacity requirements across several factories.

The customer had just been fined by the European Commission for forming a cartel and sharing the market with its “competitors”. Given that business model, they weren’t all too experienced with forecasting. Luckily they were open to new ideas, including that we would try to incorporate more explanatory variables in the forecast than just last year’s sales plus/minus 5 percent. So we started using a data mining tool, feeding it with variables such as macro-economic factors from the national bank, interest rates over time, outside temperature, budgets in local municipalities etc. What we found during our data mining experiments was that order intake 2 months ago was a strong indicator of the actual invoiced sales. No big surprise there, as these numbers are of course heavily biased. After cleaning up the biased variables and gaining better domain knowledge we found something that to this day is still sort of an eye-opener to me. The outside temperature had a significant impact on the order intake. At first I thought that it was again just bias or pure coincidence. Correlation does not always imply causality; a correlation without a causal link is sometimes called a spurious relationship. (*)

Back to the forecasting assignment and the finding that there was a strong correlation between the outside temperature 2 months prior and the order intake. Approaching the business we found that there actually was a good explanation for using the outside temperature 2-3 months ago as an independent indicator, as this company’s product is dug 3-4 meters into the ground. When it is very cold it simply isn’t feasible to run large-scale construction projects digging in the frozen ground. Moreover, the municipalities were the predominant customers and their revenue increased with falling temperatures, due to their ownership of local energy companies. On top of that, municipalities are usually driven by yearly budgets, and what is not spent in year 0 has to be saved in year 1. Using this newly gained information we were able to build a forecast and predict the capacity, resource and material requirements with much better accuracy.
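A minimal sketch of how a lagged relationship like the temperature finding above can be checked. The Python below is illustrative only: the monthly numbers are invented (the order series is constructed from the temperature two months earlier, so the answer is known in advance), and the helper names `pearson` and `lagged_correlation` are hypothetical, not functions of the data mining tool used in the project.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(indicator, target, lag):
    """Correlate indicator[t - lag] with target[t] for a non-negative lag."""
    if len(indicator) != len(target):
        raise ValueError("series must cover the same months")
    if lag < 0 or lag > len(target) - 2:
        raise ValueError("lag must leave at least two overlapping months")
    if lag == 0:
        return pearson(indicator, target)
    return pearson(indicator[:-lag], target[lag:])


# Invented monthly average temperatures (degrees Celsius), January to December.
temperature = [2, 0, 4, 9, 14, 17, 19, 18, 14, 9, 5, 2]
# Invented order intake, built so that month t depends on temperature at t - 2.
order_intake = [120, 115] + [100 + 5 * t for t in temperature[:-2]]

# Scan a few candidate lags and pick the one with the strongest correlation.
correlation_by_lag = {
    lag: lagged_correlation(temperature, order_intake, lag) for lag in range(5)
}
best_lag = max(correlation_by_lag, key=correlation_by_lag.get)
print(best_lag, round(correlation_by_lag[best_lag], 3))
```

A high correlation at some lag is only a candidate indicator; as in the project, it still needs a plausible business explanation before it goes into the forecast.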

Preparing the data for predictive analytics and feeding the algorithms.

The data for daily average temperature was accessible through the national meteorological institute. Ugeoversigt (weekly overview): DMI

The macro-economic data were also easily accessible from the Danish national bank (Danmarks Nationalbank); many others provide this service today.

Finally, the company’s own historic sales numbers, which of course were a bit biased due to the cartel issue.


Forecast.jpg

(chart for illustration purposes only, not actual company data)

Main driving indicators for needed manufacturing capacity derived by the data mining tool:

Daily average temperature 3, 4 & 5 months ago (one variable was used for each lag).

Macro-economic indicators for interest rate (the rate that municipalities can borrow money).

Orders on hand 1, 2 and 3 months ago.

Order intake 2 and 3 months ago.
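The indicators listed above are all lagged versions of monthly series, so the data preparation boils down to shifting each series by the right number of months. The Python sketch below shows one way this could be done. It is an illustration under assumptions: all numbers are invented, the function name `build_lagged_rows` is hypothetical, and lag 0 for the interest rate is my assumption because no lag is stated for it.

```python
def build_lagged_rows(series_by_name, lags_by_name, target):
    """Build one feature row per month from lagged values of each series.

    Months that do not have a full history for the largest lag are skipped.
    """
    lengths = {len(values) for values in series_by_name.values()} | {len(target)}
    if len(lengths) != 1:
        raise ValueError("all series must cover the same months")
    max_lag = max(lag for lags in lags_by_name.values() for lag in lags)
    rows, targets = [], []
    for month in range(max_lag, len(target)):
        row = {}
        for name, lags in lags_by_name.items():
            for lag in lags:
                row[f"{name}_lag{lag}"] = series_by_name[name][month - lag]
        rows.append(row)
        targets.append(target[month])
    return rows, targets


# Lags as listed in the article; lag 0 for the interest rate is an assumption.
LAGS = {
    "temperature": [3, 4, 5],
    "interest_rate": [0],
    "orders_on_hand": [1, 2, 3],
    "order_intake": [2, 3],
}

# Invented monthly series, for illustration only.
series = {
    "temperature": [2, 0, 4, 9, 14, 17, 19, 18, 14, 9, 5, 2],
    "interest_rate": [4.0, 4.0, 3.9, 3.9, 3.8, 3.8, 3.8, 3.7, 3.7, 3.7, 3.6, 3.6],
    "orders_on_hand": [300, 310, 305, 320, 340, 360, 380, 390, 385, 370, 350, 330],
    "order_intake": [120, 115, 110, 100, 120, 145, 170, 185, 195, 190, 170, 145],
}
capacity_needed = [80, 78, 75, 70, 82, 95, 110, 118, 124, 121, 109, 96]

rows, targets = build_lagged_rows(series, LAGS, capacity_needed)
print(len(rows), "training rows; first row:", rows[0])
```

Each row can then be fed to whatever regression or data mining algorithm is used, with `targets` as the value to predict.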

After this first real project experience with data mining I was hooked, possibly for life. That a machine can actually learn seems so intriguing, or even magic, to me. And honestly, I can sometimes be driven more by my enthusiasm and passion for what happens inside the magic box than by the possible later usefulness. This is probably why I thrived during my years as an external lecturer teaching Data Warehouse & Data Mining classes at the University of Southern Denmark. I also supervised bachelor and master students on Business Intelligence & Predictive Analytics projects, which was a lot of fun while gaining and passing along a lot of knowledge. I am actually still in contact with many of my former students on a weekly basis; many work in BI consulting today, I am proud to say. 🙂

(*) Spurious relationship: the correlation between the number of babies born in Sweden and the number of storks arriving in Sweden:

Correlation and causality.jpg

Borrowed from John MacGregor’s great book on SAP Predictive Analysis (SAP Galileo Press).

To me it was a great day when SAP announced its intention to greatly intensify its commitment to building advanced analytics. The combination of SAP HANA, SAP Predictive Analysis & InfiniteInsight is in my view capable of providing a very decent toolbox for both data scientists and business analysts. Should there be a need for algorithms beyond those provided out of the box, the data scientist can embed new algorithms based on R directly in SAP Predictive Analysis. The business analysts will benefit from the many automated data mining processes built directly into SAP InfiniteInsight, providing quick results. Use SAP HANA’s in-memory and in-database processing capabilities to execute the predictive models and you have, in my view, a very comprehensive end-to-end solution.

Embed the built predictive models into SAP BI tools such as Dashboard Designer or Web Intelligence and you have just democratized predictive analytics, making it available to literally everyone regardless of their knowledge of statistics etc.

— xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx —

Together with fellow passionate colleagues I have created a few articles around SAP Predictive Analytics which you might find interesting.

In this article you will find an end-to-end data mining case on how SAP Predictive Analysis and SAP InfiniteInsight can be used in a real-world business to predict which potential customers will buy additional products, based on their interests, behaviour and other variables. Using the combined strength of SAP Predictive Analysis and SAP InfiniteInsight, this article demonstrates how these data mining tools produce really great solutions to real business challenges.

Compared with the published results on this dataset, SAP Predictive Analysis and InfiniteInsight would have tied for 1st place with an American statistics professor and his team.

SAP Predictive Analysis – implementing real life data mining use case predicting who will buy additional insurance

Model comparison.jpg

For papers describing results on this dataset: http://www.wi.leidenuniv.nl/~putten/library/cc2000/

Data files:

— xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx —

This article applies to those who are interested in building Predictive capabilities into SAP BI Clients such as Dashboards or Reports. Using this approach can help us reach a much broader audience and potentially also a lot more business users.

In essence this can be a way to start democratizing predictive analytics by empowering end users, making it available to literally everyone. Furthermore, this could take some repetitive work off the shoulders of the usually very scarce data scientist resources, allowing the data scientists to focus on building predictive models and making them more productive and efficient.

Embed Predictive into SAP BI Clients

Embedding Predictive in SAP BI.jpg

Democratizing Predictive Analytics.jpg

— xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx —

If you share my interest in the Balanced Scorecard methodology and how this approach can assist in communicating the company’s goals and objectives in a strategy scorecard, these articles describe an approach that uses statistics to measure the correlation between individual KPIs and perspectives.

Balanced_Scorecard.jpg

Example: correlation between KPIs in the Employee learning and growth perspective and a KPI in the Internal process perspective.

Balanced_Scorecard_Cause_And_effect.jpg

Using Predictive Analysis to optimize a performance management solution.

Build a performance management solution with SAP PA. Which Algorithm to use?

SAP Predictive Analysis to help select and place KPI’s in a performance management solution

— xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx —

Other articles related to SAP Predictive Analytics:

SAP’s Predictive Analysis allows users to leverage many types of algorithms and visualizations. One of the interesting algorithms is Apriori also known as association analysis or basket analysis.

In this video I will demonstrate how to use the Apriori algorithm on data from the Titanic accident.

The business question: Watch this video if you would like some arguments to use with your boss for travelling 2nd or 1st class.
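To give a feel for what an association analysis reports, here is a minimal Python sketch of the three measures such rules are usually ranked by: support, confidence and lift. It is not the Apriori algorithm itself (there is no candidate generation or pruning) and not how SAP Predictive Analysis implements it. The eight passenger records are invented toy data, not the actual Titanic data set, and `rule_metrics` is a hypothetical helper name.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(transactions)
    if n == 0:
        raise ValueError("need at least one transaction")
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_consequent = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n  # share of all records matching the full rule
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (n_consequent / n) if n_consequent else 0.0
    return support, confidence, lift


# Invented toy records, NOT the real Titanic data.
passengers = [
    {"class=1st", "sex=female", "survived=yes"},
    {"class=1st", "sex=male", "survived=yes"},
    {"class=1st", "sex=female", "survived=yes"},
    {"class=2nd", "sex=female", "survived=yes"},
    {"class=2nd", "sex=male", "survived=no"},
    {"class=3rd", "sex=male", "survived=no"},
    {"class=3rd", "sex=female", "survived=no"},
    {"class=3rd", "sex=male", "survived=no"},
]

support, confidence, lift = rule_metrics(passengers, {"class=1st"}, {"survived=yes"})
print(support, confidence, lift)
```

On real data, Apriori would first restrict the search to item sets whose support exceeds a chosen minimum, and only then compute confidence and lift for the candidate rules.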

Guide: SAP Predictive Analysis – how to perform an association analysis using data from the Titanic accident.

Using data mining best practices to ensure optimal predictive flow

Using custom R functions in SAP PA – getting started and step-by-step guide

Guide: SAP Predictive Analysis – Adding further data mining algorithms & visualizations.




— xxx — xxx — xxx — xxx — xxx — xxx —

Below you see a few links to relevant data-sources in order to try out SAP Predictive Analysis and SAP InfiniteInsight.


If you would like to get something that is already prepared I would recommend the following:

http://www.bigdata-startups.com/public-data/

More specific:

http://aws.amazon.com/publicdatasets/

Common Crawl Corpus

A corpus of web crawl data composed of over 5 billion web pages. This data set is freely available on Amazon S3 and is released under the Common Crawl Terms of Use.

1000 Genomes Project

The 1000 Genomes Project, initiated in 2008, is an international public-private consortium that aims to build the most detailed map of human genetic variation available.

Google Books Ngrams


For a good end-to-end realistic business example – a classification challenge:


Data files:


There are also quite a few data sets uploaded to our very own community: http://scn.sap.com/community/predictive-analysis


— xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx —

Articles related to SAP BusinessObjects Enterprise

How to Performance Optimize SAP BusinessObjects Reports Based Upon SAP BW using BICS Connectivity

https://scn.sap.com/docs/DOC-33706

SAP BusinessObjects Increasing Stability by Setting Limits on Max. Retrievable Cells From SAP BW into Web Intelligence Using BICS

https://scn.sap.com/docs/DOC-31900

How to Evaluate SAP BusinessObjects Out of Memory Generated Java Heap Files

https://scn.sap.com/docs/DOC-35084 (a bit more technical)

— xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx — xxx —

I hope you found something that could be useful for your journey with Predictive Analytics, HANA or BusinessObjects.

Kindest regards,

Kurt Holst

3 Comments

  1. Xuening Wu

    It is a great blog; it includes lots of information and makes a lot of sense. I want to build a similar model. Where can I get the data (e.g. the data of the Titanic accident)? Many thanks!

    1. Kurt Holst Post author

      Hi Xuening,


      Thanks for your reply.

      As per your suggestion I have added a few relevant data sources which can be used with SAP Predictive Analysis & InfiniteInsight.

      Best regards, Kurt Holst
