This blog post is the second in an ongoing series that will run for the duration of the Rugby World Cup 2019!
I would like to walk you through the main steps of the Rugby World Cup 2019 predictions project, from its inception to the effective delivery of the final predictions in a SAP Analytics Cloud dashboard.
Let’s kick it off!
The starting point was to predict the results of every RWC 2019 game, from the pools up to the final. This was my “business” question.
(scratching my head on the project)
Many straightforward business questions call for predictive or machine learning technologies:
- Is this customer likely to be interested in my company’s product?
- What is the current mood of my employees? Are they likely to leave or to remain supportive?
- Is this shipment likely to be delayed?
- Will this manufactured product meet the specifications and the quality criteria?
- Is this transaction fraudulent?
(so many business questions out there…)
It does not matter if your predictive project is rugby-related or business-related.
Ask yourself the exact same high-level questions when you kick it off.
What are the business objectives?
My primary goal was pretty straightforward: I wanted to predict the game results.
As a secondary goal, I wanted my predictive model to be understandable by rugby-savvy people.
The predictive model had to avoid being a black box; it needed to be self-explanatory.
In general, the business question you ask yourself determines the next steps (and sometimes it might not even be answerable), so take the time to think about it!
What is the best project plan to put in place to achieve these objectives?
I had an unusual project plan: 1 man, 1 computer and not enough time to do it all…
I knew from day 1 that I had to deliver the pool game predictions the day before the very first game, Japan vs Russia, took place.
My project plan basically relied on… sleepless nights after the usual workday.
This is probably not the best way to tackle real-world business cases ;-)!
Allocate enough time & resources to achieve your business goals. Identify the right tools & techniques to support your project.
I used SAP Analytics Cloud, more specifically Smart Predict coupled with its powerful visualization capabilities.
While I knew the data preparation would require a fair amount of time, I knew I could count on the automated Smart Predict techniques to speed up the predictive modeling phase.
What are the success criteria?
It’s interesting, because I did not define my success criteria from day 1 (I should have!). Do not forget to define your business success criteria and your predictive success criteria!
I actually looked for a sweet spot: a predictive model that is both accurate and simple, and that would deliver reliable predictions.
My “business” success criterion is, at the end of the day, the accuracy of the game predictions. It translates into a predictive success criterion: finding the right compromise between model quality, model robustness and model simplicity.
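To make the “business” criterion concrete, here is a minimal Python sketch of how prediction accuracy could be measured: the share of games whose predicted winner matches the actual winner. This is only an illustration, not what Smart Predict computes internally; the `GamePrediction` class, the `prediction_accuracy` function and the sample records are hypothetical placeholders that I made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GamePrediction:
    """One game with its predicted and actual winner (hypothetical structure)."""
    home: str
    away: str
    predicted_winner: str
    actual_winner: str


def prediction_accuracy(games):
    """Return the fraction of games whose winner was predicted correctly."""
    if not games:
        raise ValueError("at least one game is needed to compute an accuracy")
    correct = sum(1 for g in games if g.predicted_winner == g.actual_winner)
    return correct / len(games)


# Invented placeholder records, NOT real Rugby World Cup results.
sample_games = [
    GamePrediction("Team A", "Team B", predicted_winner="Team A", actual_winner="Team A"),
    GamePrediction("Team C", "Team D", predicted_winner="Team C", actual_winner="Team D"),
    GamePrediction("Team A", "Team C", predicted_winner="Team A", actual_winner="Team A"),
    GamePrediction("Team B", "Team D", predicted_winner="Team D", actual_winner="Team D"),
]

accuracy = prediction_accuracy(sample_games)
print(f"Correctly predicted games: {accuracy:.0%}")
```

A single number like this is easy to explain to rugby fans, which is one reason to agree on it before the modeling starts.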
The good news is that the patented automated machine learning technology used by Smart Predict really accelerates & streamlines the modeling phase.
Do not consider the creation of efficient & accurate predictive models as your end-point.
Think about how you will present the insights to your end users and what they will use them for (hint: making confident decisions).
At the end of the day, end users do not always care about the predictive models; they just want trusted insights that they can act on.
(happy end-users are the one & only success criteria)
The reality is not always as bright and shiny as we would like it to be. There are multiple project constraints that should be considered (in case you ask, yes, you need a good project manager):
Do I have sufficient data to support my project, in quality and quantity?
This one was not a given for me.
To build my predictive model, I realized that I needed to build a good history of rugby games, and that for this I would need to get the base information from specialized sports websites.
I’ll come back to this in an upcoming blog post, but the data preparation phase was time-consuming and I had to dedicate quite some energy and thought to it.
What are the constraints & the risks of my project?
I knew I might have to do some data preparation, so securing an SAP HANA instance was key to the success of the project. I am not afraid of writing some basic SQL if I need to, and this proved to be a useful skill for creating the final data model.
You need a good data architect in the team. Rather than rushing into the project execution, take the time to think about all the constraints and their possible mitigations, and secure these early. Real-world examples include: data sensitivity & security, legal rules (GDPR…), data size, data availability…
I used public data that’s freely available on the Internet, but even there, some websites will ask you to credit them, and I think it’s only fair to recognize the great work they have been doing.
French poet René Char summed it up well: “(you need to) act as a primitive and plan as a strategist”.
(take some time and an empty whiteboard – ask yourself the right questions)
At some point I realized that:
- I had a great “business” question to address.
- I had a running SAP HANA instance to store & model my datasets.
- I had some fair business knowledge about rugby (although I am certainly not an expert).
- The internet was full of rugby statistics that I could reuse in my data models.
- I might be able to please rugby fans out there and, more generally, the SAP community members.
It was high time for me to:
- prepare more coffee 🙂
- define & populate cool data models
- use them as a foundation for creating the predictive models.
I’ll cover the data preparation journey in the next blog post. Stay tuned! I hope you’ll enjoy excellent rugby games in the meantime!