Much excitement this week. As my 1983 Boys Almanac told me many years ago, March 2015 would have a solar eclipse here in England on March 20th, at 9:32am. Here is my picture of the eclipse:
More 50 shades of grey than the solar corona. Same experience in 1999. At least I didn’t flog all the way up to the Faroe Islands where the view was the same, and the disappointment in the crowds was as heavy as the mist.
Eclipses can be predicted literally hundreds of years into the future, and astronomers have been doing so since the time of the Babylonians. On eclipse day there were little charts saying: Bristol 9:28am, Manchester 9:32am, etc. All absolutely correct. But no-one would attempt to predict the cloud cover even four hours ahead.
This got me thinking about the nature of predictions in business. Here are some things to learn from last week’s eclipse.
Don’t focus on the easy problem, focus on the right problem
The motion of the sun, moon and earth is governed by Newton’s Laws. But the fundamental question that people were actually asking last Friday was “what will I see, and where should I go to see it?”. In business, it is tempting to solve the problems that, as data scientists, we know we can solve, rather than the gnarly real-world problems that the business needs solved.
Give a range for your estimate
Although precise weather prediction is hard, data on the likelihood of clear skies on a March morning in different locations would have been useful. That alone would have prevented many expensive and arduous trips to the Faroe Islands. The same applies in business: when using pricing algorithms to estimate the likely impact of a price rise, you need to quantify the confidence of your prediction.
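To make the point about ranges concrete, here is a minimal sketch of how a historical clear-sky rate could be reported with an interval rather than as a single number. It uses the Wilson score interval, which is one reasonable choice among several and not a method anyone used on the day. The counts for each location are invented purely for illustration, and the function name `wilson_interval` is my own.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion.

    z=1.96 corresponds to roughly 95% coverage.
    Returns (low, high), clipped to the range [0, 1].
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical data: clear mornings out of the last 30 late-March mornings
# on record at each location. These numbers are made up for the example.
history = {"Faroe Islands": (6, 30), "Leicester": (14, 30)}

for place, (clear, total) in history.items():
    low, high = wilson_interval(clear, total)
    print(f"{place}: {clear / total:.0%} clear, roughly {low:.0%} to {high:.0%}")
```

With only 30 observations the intervals come out wide, which is itself useful information: it tells the decision maker how much weight the headline percentage can bear.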
Combining disciplines and experts is hard
Astronomers and weathermen don’t work together often, so it was difficult to get them to focus on the main problem, “what will I see, where should I go”, which needed data from both. They have different approaches, units, timescales and maps. You should expect this in your work as a data scientist as well. It is tempting to join sets of data that have come from different systems, gathered for different purposes in different ways, without asking whether they really belong together. For example, suppose you are trying to predict the impact of a product change on customer satisfaction, based on historical data. Are you sure you know why that survey data was collected and who was asked, and is it a good proxy for customer satisfaction in the new use case?
Decision makers require context to make the optimal choice
Not only did we need data from both astronomy and climatology, but to answer the question of “what will I see; where should I go”, that data should have been linked to local experts who knew driving distances, traffic patterns and viewing locations. If someone had told me at 7am that there were good viewing conditions two hours’ drive away, then I would have driven up to Leicester. Also, just how rare is this eclipse, and when will the next one be (gulp, 2027 in England)? We should always try to frame our recommendations in the business context so that they are easy to understand.
At SAP we have been sharpening up our Predictive Analytics with a brand new release. We solve real business problems, not cute mathematical three-body problems. See you in the Western US in August 2017, where there will be a total eclipse, and my research says that the chance of a sunny day is 90%.