My very first love in sport is NOT rugby, despite appearances.
It is basketball, which to me means the NBA! (I was 16 when the Dream Team made history in Barcelona.)
I wanted to write about basketball and predictive analytics, and noticed that the next big event in the NBA season is the NBA All-Star Game.
As Wikipedia writes: “The NBA All-Star Game is a basketball exhibition game hosted annually by the National Basketball Association (NBA), matching the league’s star players from the Eastern Conference against their counterparts from the Western Conference. Each conference team consists of 12 players, making it 24 in total.”
I wondered whether I could use data from the 2015/2016 season to predict the 2016/2017 All-Stars.
I went to the NBA.com website and looked for the appropriate statistics. It is a nicely done website, and I was able to retrieve all the player statistics from the beginning of each regular season up to the All-Star Game.
Based on this source I created two data sets.
The first one is my training dataset with all the player statistics for the 2015/2016 season.
I knew which players had been selected as All-Stars, whether as starters or reserves, so I added this information to the training dataset.
The second one is the dataset that contains the player stats for the ongoing 2016/2017 season.
I applied the predictive model I created based on the training dataset to predict the likely All-Star players.
Both datasets contain statistics such as the number of games played, won and lost, the points scored, and the other major game statistics.
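To make the difference between the two datasets concrete, here is a minimal Python sketch of the labeling step. It is only an illustration: I prepared the real data for SAP BusinessObjects Predictive Analytics, not with this code, and the column names, the placeholder rows and the helper `label_training_rows` are all hypothetical.

```python
# Hypothetical rows: names and numbers are placeholders, not real NBA.com data.
season_2015_16 = [
    {"player": "Player A", "gp": 50, "w": 35, "l": 15, "pts": 27.1, "fgm": 9.4},
    {"player": "Player B", "gp": 48, "w": 20, "l": 28, "pts": 11.3, "fgm": 4.2},
]

# Known 2015/2016 selections, starters and reserves together (placeholder).
all_stars_2016 = {"Player A"}


def label_training_rows(rows, all_stars):
    """Return copies of the rows with a binary `all_star` target column added."""
    return [{**row, "all_star": int(row["player"] in all_stars)} for row in rows]


training = label_training_rows(season_2015_16, all_stars_2016)
```

The 2016/2017 dataset would have the same statistic columns but no `all_star` column, since that is exactly what the model has to predict.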
In the first step I created the predictive model using SAP BusinessObjects Predictive Analytics.
All-Star is my target variable.
All the player-related information (team, age, statistics) was used as explanatory variables.
Here is the debrief of my predictive model: out of the 28 variables, only 3 are kept.
- FGM: Field Goals Made. It has the most important contribution, 67%.
- FGA: Field Goals Attempted. Second most important contribution, 28%.
- FTM: Free Throws Made. Third most important contribution, 5%.
If I double-click on the FGM bar, I get to this diagram.
The further to the left a category is, the more it explains the output.
If I make more than 7.1 field goals per game on average, chances are I will be named an All-Star.
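Read on its own, the FGM chart boils down to a one-variable rule of thumb. The sketch below is my simplification, not the model itself: the real model also uses FGA and FTM, and the function name `likely_all_star` is hypothetical.

```python
FGM_THRESHOLD = 7.1  # field goals made per game, as read from the FGM chart


def likely_all_star(fgm_per_game, threshold=FGM_THRESHOLD):
    """Single-variable reading of the chart: above the threshold looks promising."""
    return fgm_per_game > threshold
```

A player averaging exactly 7.1 field goals would not pass this check, because the rule is "more than 7.1".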
Once my predictive model had been created, I could apply it to predict the 2016/2017 All Stars!
I asked for the probability as output, along with the player names.
You can see the All-Star predictions sorted by decreasing probability of the player being an All-Star.
Based on these predictions, here are my starting fives for East & West.
These are my final picks, based on what the solution proposed to me.
I selected Harden instead of Lillard as my second guard for the West, as I consider “The Beard” a more complete player 😉
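Going from a ranked list to a starting five is a small selection problem: each conference starts two guards and three frontcourt players. Here is a hedged Python sketch of that step. The rows, the field names (`conf`, `pos`, `proba`) and the function `starting_fives` are hypothetical, and the probabilities are placeholders rather than output from my model.

```python
from collections import defaultdict

# Hypothetical scored rows: placeholder names and probabilities.
scored = [
    {"player": "West G1", "conf": "West", "pos": "G", "proba": 0.97},
    {"player": "West F1", "conf": "West", "pos": "F", "proba": 0.95},
    {"player": "West F2", "conf": "West", "pos": "F", "proba": 0.93},
    {"player": "West G2", "conf": "West", "pos": "G", "proba": 0.90},
    {"player": "West G3", "conf": "West", "pos": "G", "proba": 0.88},
    {"player": "West F3", "conf": "West", "pos": "F", "proba": 0.80},
    {"player": "West F4", "conf": "West", "pos": "F", "proba": 0.60},
    {"player": "East G1", "conf": "East", "pos": "G", "proba": 0.99},
]

SLOTS = {"G": 2, "F": 3}  # two guards and three frontcourt players per conference


def starting_fives(rows, slots=SLOTS):
    """Walk the rows by decreasing probability and fill each conference's slots."""
    lineups = defaultdict(list)
    taken = defaultdict(int)
    for row in sorted(rows, key=lambda r: r["proba"], reverse=True):
        key = (row["conf"], row["pos"])
        if taken[key] < slots.get(row["pos"], 0):
            taken[key] += 1
            lineups[row["conf"]].append(row["player"])
    return dict(lineups)


lineups = starting_fives(scored)
```

This purely mechanical pick would not reproduce a judgment call like my Harden-over-Lillard choice; that one was my own override of the ranking.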
I really enjoyed writing this blog today.
I hope it illustrates how quickly & easily you can reach meaningful results with SAP BusinessObjects Predictive Analytics.
Enjoy predictive, enjoy basketball and Happy New Year!
PS: You can vote for the 2017 All-Star players here.
The starting line-up for the NBA All-Star Game 2017 will be announced on January 19.
Credit is due to NBA.com for providing the statistics used in this blog post.
January 20 update:
The NBA has announced the results… drum roll… see http://global.nba.com/news/nba-all-star-starters-announced/
Let’s compare the predictions & the reality!
The East is pretty much perfect 🙂 That's 4 out of 5.
The only exception is Kyrie Irving, who was preferred to Isaiah Thomas as the second guard because of his fan rank and player rank. I suspect the fact that he plays for Cleveland had an impact.
It’s striking to see that the Media ranked the players exactly the same as my East predictions.
The West is not bad either: 3 out of 5.
I got Stephen Curry wrong, but you could argue he was “saved” by the fans (player popularity had a huge impact here, not on-court statistics).
I think it’s very strange that Russell Westbrook missed out, although I like Curry as well.
Kawhi Leonard was ranked just after Cousins by the predictions, I guess because his dominant strengths are defensive, whereas the predictive model favored offensive strengths.
The predictions went in favor of DeMarcus Cousins, but you can see Kawhi was preferred by all parties to Cousins (Fan/Player/Media rank).
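The two scores above are simply the overlap between predicted and announced starters. Here is a small Python sketch of that count. One caveat: the predicted lineups below are my reconstruction from the differences described in this update (Thomas instead of Irving, Westbrook instead of Curry, Cousins instead of Leonard), and the helper `hits` is hypothetical.

```python
# Predicted starters, reconstructed from the differences described above.
predicted = {
    "East": {"Isaiah Thomas", "DeMar DeRozan", "LeBron James",
             "Giannis Antetokounmpo", "Jimmy Butler"},
    "West": {"Russell Westbrook", "James Harden", "Kevin Durant",
             "Anthony Davis", "DeMarcus Cousins"},
}

# Starters announced by the NBA for the 2017 All-Star Game.
actual = {
    "East": {"Kyrie Irving", "DeMar DeRozan", "LeBron James",
             "Giannis Antetokounmpo", "Jimmy Butler"},
    "West": {"Stephen Curry", "James Harden", "Kevin Durant",
             "Anthony Davis", "Kawhi Leonard"},
}


def hits(predicted_five, actual_five):
    """Number of predicted starters who were actually named starters."""
    return len(predicted_five & actual_five)


score = {conf: hits(predicted[conf], actual[conf]) for conf in predicted}
print(score)  # {'East': 4, 'West': 3}
```

Using sets keeps the comparison independent of the order in which players are listed.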
So what’s the takeaway for me here?
- I need to take into account more factors. Clearly player/team popularity had an impact on some choices (Curry, Irving).
- It might also be that players who have already been All-Stars once or several times are preferred to newcomers, even when the newcomers perform very well on the court.
- A tougher sport-related question is tied to the completeness of a player. Some players are selected because they excel in several areas of the game (Leonard). We might need some composite index or variable that reflects this.
- Another question is tied to the charisma/leadership of a player. It’s not fully equivalent to the popularity of a player. It means that a player is recognized as one of the team leaders. Players know and recognize that, and I guess it shows in their votes.
Let’s wait for the reserves announcement now.
I hope you enjoy this!