If you listen to sports talk radio, you’ve probably heard heated debates about the rise in scoring in football. After the epic LA Rams vs. KC Chiefs Monday Night Shootout, many wondered if this was the new normal around the NFL. While these discussions are a good source of entertainment, many tend to be very subjective, with little to no data to support their arguments. As a data geek, and in an attempt to demystify this, I thought that I would do my own analytics to see what trends I could find.
Sports Has a Data Issue
As I’ve written in the past, sports, like most organizations, is not immune to data issues. Sports data sets tend to be siloed and aggregated, have inconsistent timelines, are loaded with master data issues, and are full of many-to-many joins, to name a few. Getting a full picture of the NFL was more difficult than I had originally thought. For you data geeks, here’s what I did…
- To understand wins and losses, I wrote a quick Python script that (1) extracted data from the different tables on Yahoo Sports, (2) normalized it into a common data format, and (3) changed the URL to loop through all of the different years. Yahoo Sports only went back to 2000.
- To get the individual game scores and spreads over the past 30 years, along with the betting odds and the outcomes of the individual games, I went to Google Sports. As with the data set above, a simple Python script was able to extract all of this data and normalize it. The challenge here was that, with all of the different weeks, seasons, and page formats, this involved a lot of data massaging after the fact.
- To make sure that I was tracking the right franchises, I normalized the team names. Over the years, many franchises have moved locations, most notably the LA Rams and the LA Chargers. To avoid the data being skewed, I mapped every team to a single franchise ID. For example, the LAR franchise can signify both the LA Rams and the St. Louis Rams. As with most slowly changing dimensions, it seemed much quicker to do this in a spreadsheet.
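For anyone who would rather keep the franchise mapping in code than in a spreadsheet, here is a minimal sketch of the idea. The dictionary, the `franchise_id` helper, the `LAC` ID, and the sample rows are all my own illustrative assumptions, not the actual lookup table I used.

```python
# Hypothetical lookup from historical team names to one franchise ID.
# Only "LAR" comes from the post; "LAC" is an assumed ID for the Chargers.
FRANCHISE_ID = {
    "St. Louis Rams": "LAR",
    "Los Angeles Rams": "LAR",
    "San Diego Chargers": "LAC",
    "Los Angeles Chargers": "LAC",
}


def franchise_id(team_name: str) -> str:
    """Return the franchise ID for a team name, or raise if it is unmapped."""
    try:
        return FRANCHISE_ID[team_name.strip()]
    except KeyError:
        raise KeyError(f"No franchise ID mapped for team: {team_name!r}") from None


# Made-up rows for illustration only: (season, team name, points per game).
rows = [
    (2015, "St. Louis Rams", 20.0),
    (2016, "Los Angeles Rams", 21.0),
]

# After normalization both seasons belong to the same franchise.
normalized = [(season, franchise_id(team), ppg) for season, team, ppg in rows]
```

Raising on an unmapped name, rather than passing it through, makes a missing entry show up immediately instead of quietly creating a new "franchise" in the results.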
Is Scoring up in the NFL? By how much?
This year, teams are averaging 23.3 points per game, which is 7.5% higher than last year and 3.1% higher than the average over the past 10 years. That said, teams scored more points per game in 2013 (23.4) than this year.
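To make the arithmetic behind those percentages explicit, here is a small sketch. The two helper functions are my own illustration. The baselines are backed out from the percentages quoted above, so treat them as implied values rather than numbers pulled from the data set.

```python
def pct_change(current: float, baseline: float) -> float:
    """Percentage change of `current` relative to `baseline`."""
    return (current - baseline) / baseline * 100.0


def implied_baseline(current: float, pct_higher: float) -> float:
    """Baseline implied by 'current is pct_higher percent above baseline'."""
    return current / (1.0 + pct_higher / 100.0)


this_year = 23.3  # points per game, as quoted in the text

# Baselines implied by the quoted percentages (not taken from the raw data).
last_year = implied_baseline(this_year, 7.5)     # roughly 21.7 points per game
ten_year_avg = implied_baseline(this_year, 3.1)  # roughly 22.6 points per game

# How far 2013 (23.4 points per game) sits above this year, in percent.
gap_2013 = pct_change(23.4, this_year)           # roughly 0.4%
```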
If we look at this trend over the past 30 years, we can definitely see a slow increase over time: scoring is up 15% over that period. If the trend holds, we can expect scoring to keep rising.
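A trend line like this can be approximated with an ordinary least-squares fit. The sketch below assumes a simple straight-line model, which may not be what the charting tool used, and the season averages in it are made up purely to show the mechanics.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up league averages for illustration only (not the real NFL numbers).
seasons = [2000, 2001, 2002, 2003]
points_per_game = [20.0, 20.5, 21.0, 21.5]

slope, intercept = fit_line(seasons, points_per_game)

# Extrapolating one season ahead assumes the linear trend simply continues.
projected_2004 = slope * 2004 + intercept
```

The slope is the average change in points per game per season. Extending the line forward is only as good as the assumption that nothing changes the trend.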
Are Specific Teams Bringing Up or Bringing Down This Average?
If we drill down to the individual teams, you can easily see the gap in scoring. The colors show scoring increasing as you move from top left to bottom right. As an example, New England has averaged almost 30 points per game this decade, whereas Cleveland has averaged just over 17. You can also see that many of the high-scoring teams have always been high scoring, and vice versa.
If you isolate specific teams over the past decade, you can see that the five highest-scoring teams are all 10% higher than the league average.
The lowest-scoring teams, meanwhile, are all 10% lower than the league average.
Does More Scoring Lead to More Wins? Does Less Scoring Lead to More Losses?
The two visuals below show every team across every year, comparing points scored vs. wins (left) and points against vs. wins (right). Both are colored by whether the team won the Super Bowl that year. In general, you can see the following:
* The teams that win more, score more.
* The teams that win more, give up fewer points.
* The teams that win the Super Bowl all have above-average offenses and defenses.
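One way to put a number on the first two observations is a correlation coefficient between points and wins. The visuals above are scatter plots, so this Pearson calculation is my own add-on rather than part of the original analysis, and the team-season rows are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up team seasons: (points scored, points against, wins).
team_seasons = [
    (450, 300, 12),
    (400, 340, 10),
    (350, 360, 8),
    (300, 400, 5),
    (280, 430, 3),
]

scored = [row[0] for row in team_seasons]
allowed = [row[1] for row in team_seasons]
wins = [row[2] for row in team_seasons]

r_scored = pearson(scored, wins)    # positive when higher scoring goes with more wins
r_allowed = pearson(allowed, wins)  # negative when fewer points allowed goes with more wins
```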
How Does All Of This Affect Gambling Lines?
The following visual shows the total points scored per game and the average over/under (O/U) line per game. You can see that the O/U line closely matches the actual results of the games.
If you look at how often games finished over versus under the O/U line, you can see that it’s been a coin flip over the past 30 years.
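The coin-flip check comes down to classifying each game against its line and counting. Here is a minimal sketch of that step. The function names and the sample games are made up for illustration, and I am assuming that pushes (totals landing exactly on the line) are left out of the rate.

```python
from collections import Counter


def ou_result(home_pts: int, away_pts: int, line: float) -> str:
    """Classify a game as 'over', 'under', or 'push' against the O/U line."""
    total = home_pts + away_pts
    if total > line:
        return "over"
    if total < line:
        return "under"
    return "push"


def over_rate(games) -> float:
    """Share of decided games (pushes excluded) that finished over the line."""
    counts = Counter(ou_result(home, away, line) for home, away, line in games)
    decided = counts["over"] + counts["under"]
    if decided == 0:
        raise ValueError("no decided games to compute a rate from")
    return counts["over"] / decided


# Made-up games for illustration: (home points, away points, O/U line).
games = [
    (27, 24, 47.5),  # total 51 -> over
    (17, 10, 41.0),  # total 27 -> under
    (31, 28, 52.0),  # total 59 -> over
    (23, 17, 40.0),  # total 40 -> push
]

rate = over_rate(games)
```

A rate close to 0.5 over a large number of games is what "coin flip" means here.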
If you look at this on a yearly basis, you can see that it gets more accurate each year, with no large fluctuations from year to year.
Where Are the Gems?
As with all things analytics, there are always a few good outliers. Over the past 10 years, some teams have been notorious for going over or under the O/U at home. For example, it’s wise to bet the “under” on KC at home. Indianapolis, meanwhile, is an “under” at home but an “over” on the road.
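Finding these outliers is a group-by on team and venue. The sketch below shows one way to do that in plain Python. The team IDs `AAA` and `BBB` and all of the results are placeholders, not real teams or real outcomes.

```python
from collections import defaultdict


def over_rate_by_team_and_venue(results):
    """Map (team, venue) to the share of decided games that went over.

    `results` is an iterable of (team, venue, outcome) tuples, where venue is
    "home" or "away" and outcome is "over", "under", or "push".
    """
    tallies = defaultdict(lambda: [0, 0])  # key -> [overs, decided games]
    for team, venue, outcome in results:
        if outcome == "push":
            continue  # pushes are excluded from the rate
        tally = tallies[(team, venue)]
        tally[1] += 1
        if outcome == "over":
            tally[0] += 1
    return {key: overs / decided for key, (overs, decided) in tallies.items()}


# Placeholder results for two hypothetical teams.
results = [
    ("AAA", "home", "under"),
    ("AAA", "home", "under"),
    ("AAA", "home", "over"),
    ("AAA", "away", "over"),
    ("BBB", "home", "over"),
    ("BBB", "home", "push"),
    ("BBB", "away", "under"),
]

rates = over_rate_by_team_and_venue(results)
```

With only a handful of home games per team each season, it is worth checking the number of decided games behind each rate before reading too much into an outlier.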
What Does All Of This Mean?
While I was able to very easily ask and answer lots of questions in the data, my answers were not very eye-opening. Here’s what I found:
- There is a steady year-over-year increase in scoring in the NFL.
- While 2018 had a lot of scoring (7.5% higher than last year), it’s not at a level we haven’t seen before. 2013 had more scoring.
- You can’t just score a lot to win a Super Bowl. You need a good offense and a good defense.
- The gambling lines are far more accurate than I would have expected.
Just like in your organization, data is everywhere. However, the true analytics is very often left to too few people. This is another fun example of how analytics can be applied, in this case to help understand what drives scoring in the NFL.
Want a demo? Use the comments below to get in touch and I’d be happy to share more details.