The National Football League has officially hit pre-season. Around the US, fantasy football drafts are being assembled. Personally, I think the rise in fantasy football leagues has been a big driver in the amount and availability of statistics for professional football. But I digress.

Did you see this article over on TechCrunch? http://techcrunch.com/2013/08/04/how-data-changes-preconceptions-about-nfl-football-the-weather-and-the-parallel-universe/ Author Alex Williams was trying to evaluate the impact of inclement weather on the outcome of NFL games. He analyzed data from “471,392 plays in 2,898 games played since 2002.” He also brought in climate data from the Climate Data Center. You can check out the article to see the process behind the analysis. (I like to think the data never lies, but it’s hard to believe the Frozen Tundra of Lambeau Field doesn’t amount to more of a competitive advantage in late December.)

I love this example of using historical data and seemingly unrelated data sets to posit some correlations. Imagine what you could do if you extended the data captured. You could answer questions like the following:

  • Do stadiums with retractable roofs completely alleviate any weather advantage/disadvantage?
  • What kinds of concessions historically track higher or lower depending on both the weather outside and the temp in the stadium?
  • How does the weather impact the degree of no-shows for the tickets? Are scalping prices impacted?
  • Are there significant differences in injury reports during inclement weather in the last 15 years (benefiting from modern equipment) compared with the previous 15 years? Or is there simply a higher quantity of injuries being reported, due to changes in league rules?
  • Do specific coaches prepare better for inclement weather? What are the cumulative records as they move from team to team, stadium to stadium?

There are a million more interesting questions to ask. And not just about football. Every company can rattle off questions like this about interesting correlations where they are not sure of the exact impact. The only way to answer these questions is data. Data, data, data.

Not only do you need reliable access to the data, you also need to model it. You need to store it in a structure that can support massive, ad hoc queries by your data scientists. That’s just the beginning. You also need to clean and transform the data from all of these different sources into relatable chunks of data. (What does “retractable” mean? Is it the same in every case?)
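To make the clean-and-transform step a little more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the rows are made up (not real NFL or weather records), the field names, the `normalize_roof` helper, and the "inclement" thresholds are hypothetical, and a real project would do this work in a proper data platform rather than in a script.

```python
from collections import defaultdict

# Made-up rows for illustration only; these are NOT real NFL or weather records.
GAMES = [
    {"stadium": "Stadium A", "date": "2012-12-02", "roof": "Open-Air", "home_won": True},
    {"stadium": "Stadium A", "date": "2012-12-16", "roof": "open air", "home_won": True},
    {"stadium": "Stadium B", "date": "2012-12-02", "roof": "Retractable (closed)", "home_won": False},
    {"stadium": "Stadium B", "date": "2012-12-16", "roof": "RETRACTABLE - open", "home_won": True},
]
WEATHER = [
    {"stadium": "Stadium A", "date": "2012-12-02", "temp_f": 18, "precip_in": 0.4},
    {"stadium": "Stadium A", "date": "2012-12-16", "temp_f": 45, "precip_in": 0.0},
    {"stadium": "Stadium B", "date": "2012-12-02", "temp_f": 18, "precip_in": 0.4},
    {"stadium": "Stadium B", "date": "2012-12-16", "temp_f": 30, "precip_in": 0.0},
]


def normalize_roof(raw):
    """Map free-text roof descriptions onto a small, consistent vocabulary."""
    text = raw.strip().lower()
    if "retractable" in text:
        return "retractable_open" if "open" in text else "retractable_closed"
    if "dome" in text or "fixed" in text:
        return "fixed_roof"
    return "open_air"


def join_games_to_weather(games, weather):
    """Attach the weather reading for the same stadium and date to each game."""
    by_key = {(w["stadium"], w["date"]): w for w in weather}
    joined = []
    for game in games:
        reading = by_key.get((game["stadium"], game["date"]))
        if reading is None:
            continue  # no matching reading; skip the game rather than guess
        row = dict(game)
        row["roof"] = normalize_roof(game["roof"])
        row["temp_f"] = reading["temp_f"]
        row["precip_in"] = reading["precip_in"]
        joined.append(row)
    return joined


def condition(row):
    """Label a game; the thresholds are arbitrary assumptions, not a standard."""
    exposed = row["roof"] in ("open_air", "retractable_open")
    bad_weather = row["temp_f"] <= 32 or row["precip_in"] > 0.1
    return "inclement" if exposed and bad_weather else "fair_or_sheltered"


def home_win_rate_by_condition(rows):
    """Return the share of home wins for each weather condition label."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        label = condition(row)
        totals[label] += 1
        wins[label] += int(row["home_won"])
    return {label: wins[label] / totals[label] for label in totals}


rates = home_win_rate_by_condition(join_games_to_weather(GAMES, WEATHER))
print(rates)
```

Notice that the “retractable” question shows up immediately: a retractable roof that is closed has to be treated differently from one that is open, and the source systems may not spell it the same way twice. Four toy rows prove nothing about football, of course; the point is only the shape of the work.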

Luckily, we have some technology to help you with that. Check out SAP HANA, SAP Data Services, and SAP Lumira for starters. Then, perhaps, do some cool things with big data and enter the Data Geek challenge!

1 Comment

  1. Jason Cao

    Hi Ina, thanks for sharing the link to the TechCrunch article – really interesting. You posted some relevant and interesting follow-on questions as well.

    We’re having SAP Inside Track Vancouver 2013 focused on the topic of Big Data in September. For anyone not in the Vancouver area who would like to join virtually, we’ll be posting an access link shortly before the event starts. Nic Smith will speak more about Lumira and the Data Geek Challenge as well. 🙂
