NBA In-Game Win Probability

boooeee
Posts: 88
Joined: Sun Jan 22, 2012 5:32 am
Contact:

NBA In-Game Win Probability

Post by boooeee »

NBA In-Game Win Probability

Still a work in progress, but I've built an in-game win probability model using play-by-play data going back to the 2004 season. The probability is modeled as a function of score differential, time remaining, possession, and the Vegas point spread (which allows an adjustment for relative team strength). There are more details in the blog post linked above, but any comments or feedback are welcome here. I'll be publishing a recap of each game of the Finals from the standpoint of win probability.
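
For readers who want something concrete, one very simple way to turn play-by-play history into a probability is an empirical lookup table: bucket each historical game state and record how often the home team went on to win. This is only a sketch and not necessarily how the model in the post is fit. The field names (`seconds_left`, `home_lead`, `home_has_ball`, `home_spread`, `home_won`) and the bucket widths are assumptions made up for illustration.

```python
from collections import defaultdict

def empirical_win_prob(snapshots, time_bucket=60, spread_bucket=5):
    """Share of home wins for each bucketed game state.

    Each snapshot is a dict with hypothetical keys:
      seconds_left  - seconds remaining in the game
      home_lead     - home score minus away score
      home_has_ball - True if the home team has possession
      home_spread   - points the home team is favoured by (negative = underdog)
      home_won      - True if the home team won the game
    """
    tally = defaultdict(lambda: [0, 0])  # state -> [home wins, snapshots seen]
    for s in snapshots:
        state = (
            s["seconds_left"] // time_bucket,
            s["home_lead"],
            s["home_has_ball"],
            int(s["home_spread"] // spread_bucket),
        )
        tally[state][0] += int(s["home_won"])
        tally[state][1] += 1
    return {state: wins / n for state, (wins, n) in tally.items()}

# Toy data only: four late-game snapshots in the same state, three home wins.
toy = [
    {"seconds_left": 17, "home_lead": 3, "home_has_ball": False, "home_spread": 8, "home_won": True},
    {"seconds_left": 15, "home_lead": 3, "home_has_ball": False, "home_spread": 8, "home_won": True},
    {"seconds_left": 20, "home_lead": 3, "home_has_ball": False, "home_spread": 9, "home_won": False},
    {"seconds_left": 18, "home_lead": 3, "home_has_ball": False, "home_spread": 7, "home_won": True},
    {"seconds_left": 700, "home_lead": -2, "home_has_ball": True, "home_spread": 8, "home_won": True},
]
table = empirical_win_prob(toy)
```

A raw table like this gets noisy quickly in sparse states, which is one reason a smoothed or regression-based fit is usually preferred.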

Here's a direct link to the visualization I plan to use: Heat-Pacers - Game 1. It's somewhat interactive in that you can use the sliders on the bottom to zoom in on any section of the game.

I know I'm not the first to take a stab at this, so any lessons learned would be greatly appreciated.
v-zero
Posts: 520
Joined: Sat Oct 27, 2012 12:30 pm

Re: NBA In-Game Win Probability

Post by v-zero »

It's funny you chose to do this during the playoffs, as that's exactly what got me trying to do it last year, and it's also pretty funny what you said about ANS, as that was one of my greatest inspirations for it too.

I took a more theoretical route on the calculation, rather than data mining, and reached very similar numbers (for the situation you describe of being 3 up at time t), but the data-mining route is the only obvious way to allow for the 'strangeness' that can occur in the final five minutes of a game, so I was always aware that I was somewhat blind come the final few minutes.

I tacked it to the ESPN game tracker page for any particular game using a python script, and that provided another script with the data to update a matplotlib graph as/when new data arrived.

Anyway, awesome work! :D

One thing: If you do attach it to a feed of some sort then make sure you can avoid one thing I noticed, which is that every now and then the tracker will update the time to a slightly earlier one... In this instance you either want to delete your last data point, or put the new one in its right place between other past data points: graphs that have little backwards bits in them don't look right. :lol:
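
Both fixes described above are easy to sketch if the series is kept as a list of `(elapsed_seconds, win_prob)` tuples sorted by time. The function names and the tuple layout are assumptions for illustration, not taken from the actual script.

```python
import bisect

def add_point(series, elapsed, win_prob):
    """Option 1: insert the point in its right place so the series stays sorted by time."""
    bisect.insort(series, (elapsed, win_prob))

def add_point_truncating(series, elapsed, win_prob):
    """Option 2: drop every point at or after the new timestamp, then append."""
    # (elapsed,) sorts before any (elapsed, x), so this finds the first point
    # whose time is >= elapsed.
    cut = bisect.bisect_left(series, (elapsed,))
    del series[cut:]
    series.append((elapsed, win_prob))

# The tracker reports 10s, then 20s, then jumps back to 15s.
inserted = []
for t, wp in [(10, 0.60), (20, 0.62), (15, 0.61)]:
    add_point(inserted, t, wp)

truncated = []
for t, wp in [(10, 0.60), (20, 0.62), (15, 0.61)]:
    add_point_truncating(truncated, t, wp)
```

Either way the plotted line never runs backwards; the first option keeps every observation, the second trusts the latest update.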

Second thing: Looking at your first graph, for a team ahead by three at some arbitrary time... Wouldn't you expect a team ahead by three at t=0 to have the same implied chance of victory as a team favoured by three (by the bookies) with the game just about to start? In which case the estimate of 0.55 is about 0.05 off from where I'm sitting. Likewise if a team that is a dog by five is up by three at t=0 you might expect that to be equivalent to being the dog by just two, which would imply a win probability at t=0 of about 0.43 usually. That seems worth investigating, could be to do with the fit, could be to do with flaws in my assumptions, could be a bit of both. :?: If I was to theorise I would say that it is bound to have something to do with the weak elasticity of NBA scorelines.
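
The equivalence argued above can be checked with a simple normal model of the final margin. This is a common textbook-style approach and not necessarily the calculation used by anyone in this thread. The standard deviation of 12 points is an assumption, chosen here because it reproduces the 0.60 and 0.43 figures mentioned above.

```python
import math

SIGMA = 12.0  # assumed std. dev. of the final margin around the spread, in points

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def win_prob(lead, spread, frac_remaining=1.0, sigma=SIGMA):
    """P(win) for a team ahead by `lead` and favoured by `spread` points over
    a full game, with `frac_remaining` of the game still to play (0 < frac <= 1).
    Assumes scoring over the remaining time is normal with independent increments."""
    expected_final_margin = lead + spread * frac_remaining
    return norm_cdf(expected_final_margin / (sigma * math.sqrt(frac_remaining)))

print(round(win_prob(0, 3), 2))   # favoured by 3, no head start: 0.6
print(round(win_prob(3, 0), 2))   # even teams, 3-point head start: 0.6
print(round(win_prob(3, -5), 2))  # 5-point dog, 3-point head start: 0.43
print(round(win_prob(0, -2), 2))  # 2-point dog, no head start: 0.43
```

In this model a head start and a spread of the same size are interchangeable at tip-off by construction, so any gap in a fitted model at t=0 has to come from the fit or from the data.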
Mike G
Posts: 6175
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: NBA In-Game Win Probability

Post by Mike G »

It's nice. It makes lots of sense to give the better team the advantage at game start and see how the underdog can chip away at it.

You might label the y-axis "Miami win probability", for clarity.

With 17 seconds left in regulation, Indiana is down 3 and has the ball. Their chances of winning the game are only 2.5% -- something like a 6% chance of hitting a 3, and then winning in overtime. That seems pretty slim. But it's based on previous similar events, rather than on the Pacers' .345 season 3pt% --?

With 0:49 left in overtime, the score is tied, and Indiana has the ball. Yet Miami has a .696 win probability?
Miami gets the ball with :31 to play and the score still tied. Their win% goes up only .027 (to .723) ?
Bobbofitos
Posts: 306
Joined: Sat Apr 16, 2011 7:40 am
Location: Cambridge, MA
Contact:

Re: NBA In-Game Win Probability

Post by Bobbofitos »

Mike G wrote:It's nice. It makes lots of sense to give the better team the advantage at game start and see how the underdog can chip away at it.

You might label the y-axis "Miami win probability", for clarity.

With 17 seconds left in regulation, Indiana is down 3 and has the ball. Their chances of winning the game are only 2.5% -- something like a 6% chance of hitting a 3, and then winning in overtime. That seems pretty slim. But it's based on previous similar events, rather than on the Pacers' .345 season 3pt% --?

With 0:49 left in overtime, the score is tied, and Indiana has the ball. Yet Miami has a .696 win probability?
Miami gets the ball with :31 to play and the score still tied. Their win% goes up only .027 (to .723) ?
About the bold: It's wrong.

This is a good exercise; more PBP stats should derive their values from WP added or subtracted, rather than treating everything in a vacuum. Step 1, though, is getting the WP model correct. This looks good but has some errors.
boooeee
Posts: 88
Joined: Sun Jan 22, 2012 5:32 am
Contact:

Re: NBA In-Game Win Probability

Post by boooeee »

Mike G wrote:With 17 seconds left in regulation, Indiana is down 3 and has the ball. Their chances of winning the game are only 2.5% -- something like a 6% chance of hitting a 3, and then winning in overtime. That seems pretty slim. But it's based on previous similar events, rather than on the Pacers' .345 season 3pt% --?

With 0:49 left in overtime, the score is tied, and Indiana has the ball. Yet Miami has a .696 win probability?
Miami gets the ball with :31 to play and the score still tied. Their win% goes up only .027 (to .723) ?
Bobbofitos wrote:About the bold: It's wrong.
I was prepared to start pulling data from past games to show why the model was right... until I realized I had a bad merge in my code. The model itself seems to be okay; I was just merging it with the game's play-by-play data incorrectly. Mike - With the corrected code, the Pacers win probability is 8.5% with 17 seconds left and down by 3, which seems in line for an 8 point underdog. With 0:49 left, Pacers in possession, and the score tied, the Heat win probability is 54.2%, not 69.6%. It then increases to 72.3% when Miami gets the ball at 0:31 with the score still tied.
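
The post does not say what exactly went wrong in the merge, so the following is only a generic guard against this class of bug: join each play to the model output on an explicit game-state key and fail loudly when a play has no match, rather than silently picking up a neighbouring row. The key fields and function name are hypothetical; the two probabilities in the toy table are the corrected Heat figures quoted above.

```python
def attach_win_prob(plays, model_table):
    """Attach a model probability to every play, raising on any unmatched state."""
    out = []
    for play in plays:
        key = (play["seconds_left"], play["home_lead"], play["home_has_ball"])
        if key not in model_table:
            raise KeyError(f"no model row for game state {key}")
        out.append({**play, "home_wp": model_table[key]})
    return out

# Hypothetical model output keyed by (seconds_left, home_lead, home_has_ball).
model_table = {
    (49, 0, False): 0.542,  # tied, Pacers ball, 0:49 left in overtime
    (31, 0, True): 0.723,   # tied, Heat ball, 0:31 left in overtime
}
plays = [
    {"seconds_left": 49, "home_lead": 0, "home_has_ball": False},
    {"seconds_left": 31, "home_lead": 0, "home_has_ball": True},
]
merged = attach_win_prob(plays, model_table)
```

A quick row-count check before and after the join catches the other common failure, where duplicate keys multiply rows.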

Mike G and Bobbofitos: Thank you for the review. It's exactly why I posted it here. I'll be correcting the blog posts soon.
boooeee
Posts: 88
Joined: Sun Jan 22, 2012 5:32 am
Contact:

Re: NBA In-Game Win Probability

Post by boooeee »

v-zero wrote:I tacked it to the ESPN game tracker page for any particular game using a python script, and that provided another script with the data to update a matplotlib graph as/when new data arrived.
Thanks for the tips. For the live updating you did, do you happen to have any examples I can go off of? I would like to be able to share the probabilities in real time, but I'm completely out of my element. My coding skills are pretty much confined to R. I've learned just enough HTML and JavaScript to make a mess out of my blog.
Mike G
Posts: 6175
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: NBA In-Game Win Probability

Post by Mike G »

boooeee wrote:
... Pacers win probability is 8.5% with 17 seconds left and down by 3, which seems in line for an 8 point underdog.
And Pacers with the ball.
Does 8 points per 48 minutes have much bearing on 17 seconds? I mean, we're talking one possession, basically.
In 5 minutes (OT), the favorite might have a 60% chance of winning. And so your 8.5% may mean the Pacers are about 20% likely to gain 3 points in the last :17 of regulation.
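
The arithmetic behind that estimate, written out as a small sketch. It assumes the only route to a win is forcing overtime, so P(win) = P(force OT) x P(win OT); the 40% overtime figure is simply the complement of the 60% guess above, and the function name is made up for illustration.

```python
def implied_tie_prob(win_prob, ot_win_prob):
    """Back out P(force overtime) from P(win), assuming overtime is the only path to a win."""
    return win_prob / ot_win_prob

underdog_ot_win = 1 - 0.60  # favourite wins overtime 60% of the time (a guess)

# Corrected model output: 8.5% to win when down 3 with 17 seconds left.
tie_prob = implied_tie_prob(0.085, underdog_ot_win)
print(round(tie_prob, 2))  # 0.21

# Same identity run forward: a 6% chance to tie gives roughly the original 2.5%.
print(round(0.06 * underdog_ot_win, 3))  # 0.024
```

The assumption ignores wins in regulation (for example a four-point play, or a quick score followed by a steal), so the true chance of tying is somewhat lower than this back-calculation suggests.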

This is still barely half of their normal chance of making a 3. Is this what pbp has shown? -- That when the trailing team must get a 3 with 15-20 sec. -- and presumably they put at least 4 good shooters on the floor -- the defense can still knock that much off their accuracy?

Another general question: When a team is favored by 8 to start the game, and after 24 minutes or 40 minutes the score is vastly different from expected, is it possible or likely that on this night they aren't 8 points better?

Historic pbp should detect this. Sometimes a key player is missing or hurt, or the team is especially sharp or flat; and this accounts for the discrepancy. And in such cases, wouldn't the subsequent 24 or 8 minutes have a different probability?

I'm thinking that in 48 minutes, the initial 'favorite/underdog' y-intercept should steadily reduce to zero and be replaced by the actual point differential of the game. Such that teams which are even after 48 minutes (or 47:43) may be, in fact, evenly matched on this night.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine
Contact:

Re: NBA In-Game Win Probability

Post by DSMok1 »

Mike G wrote:
I'm thinking that in 48 minutes, the initial 'favorite/underdog' y-intercept should steadily reduce to zero and be replaced by the actual point differential of the game. Such that teams which are even after 48 minutes (or 47:43) may be, in fact, evenly matched on this night.
I strongly disagree with that. This is definitely a small sample size issue for the one game.

That said, the regression takes that into account as a possibility if I understand its construction correctly.
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
Mike G
Posts: 6175
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: NBA In-Game Win Probability

Post by Mike G »

And of course, there are also individual matchup issues between 2 teams, which you might say distorts their relative strengths: How a team performs against the whole league, vs how they perform against a certain opponent.

The Heat were 1-2 vs the Pacers in the season and 2-2 vs the Bulls; also struggled to dispatch them in the postseason.

If DWade is mired in a slump -- missed 9 of his last 10 shots, shooting 42% in the playoffs -- is he closer to 42% likely to make his next shot, or 52% (as he shot in the season) ?
An even larger sample might include previous seasons; but that may be even less relevant.
boooeee
Posts: 88
Joined: Sun Jan 22, 2012 5:32 am
Contact:

Re: NBA In-Game Win Probability

Post by boooeee »

Mike G wrote:
... Pacers win probability is 8.5% with 17 seconds left and down by 3, which seems in line for an 8 point underdog.
And Pacers with the ball.
Does 8 points per 48 minutes have much bearing on 17 seconds? I mean, we're talking one possession, basically.
In 5 minutes (OT), the favorite might have a 60% chance of winning. And so your 8.5% may mean the Pacers are about 20% likely to gain 3 points in the last :17 of regulation.

This is still barely half of their normal chance of making a 3. Is this what pbp has shown? -- That when the trailing team must get a 3 with 15-20 sec. -- and presumably they put at least 4 good shooters on the floor -- the defense can still knock that much off their accuracy?
This appears to be what the actual pbp data is showing. Going back to 2004, there were 329 games where a team started their possession being down by 3 with between 15 and 20 seconds left on the clock. Those teams won 9.7% of the time (they forced overtime about 20% of the time). If the team was an underdog of 5 to 10 points, they won 5% of the time. A key thing to consider is that even if the team manages to score a three, they will probably still leave time on the clock in regulation for their opponent to answer.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine
Contact:

Re: NBA In-Game Win Probability

Post by DSMok1 »

Have you seen the work Ed Kupfer did on this way back in 2006 (before he got picked up by the Rockets)?

Here are some links to his thread:

Page 1 http://godismyjudgeok.com/DStats/APBRme ... t=586.html
Page 2 http://godismyjudgeok.com/DStats/APBRme ... rt=15.html
Page 3 http://godismyjudgeok.com/DStats/APBRme ... rt=30.html

And another thread of his with graphs: http://godismyjudgeok.com/DStats/APBRme ... t=686.html

Also, Brian Burke, of Advanced NFL Stats, set up an empirical model for NCAA basketball in 2009: http://wagesofwins.com/2009/03/05/model ... ian-burke/

And here's another thread: http://godismyjudgeok.com/DStats/APBRme ... =1701.html
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
Mike G
Posts: 6175
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: NBA In-Game Win Probability

Post by Mike G »

boooeee wrote:
Going back to 2004, there were 329 games where a team started their possession being down by 3 with between 15 and 20 seconds left on the clock. Those teams won 9.7% of the time (they forced overtime about 20% of the time). If the team was an underdog of 5 to 10 points, they won 5% of the time. A key thing to consider is that even if the team manages to score a three, they will probably still leave time on the clock in regulation for their opponent to answer.
Thanks, boooeee. This is all well within the range of plausible. The opponent answer is a few %, defensively crowding the arc is several %, etc.

I guess it still would seem to matter what the team's normal 3FG% is, of those 329 instances. It's a bit more specific than just the 2 teams' ORtg and DRtg.
And whether they have a Reggie Miller.
boooeee
Posts: 88
Joined: Sun Jan 22, 2012 5:32 am
Contact:

Re: NBA In-Game Win Probability

Post by boooeee »

DSMok1 wrote:
I'm thinking that in 48 minutes, the initial 'favorite/underdog' y-intercept should steadily reduce to zero and be replaced by the actual point differential of the game. Such that teams which are even after 48 minutes (or 47:43) may be, in fact, evenly matched on this night.
I strongly disagree with that. This is definitely a small sample size issue for the one game.

That said, the regression takes that into account as a possibility if I understand its construction correctly.

DSMok - You would be correct on the small sample size issue. Here are the winning percentages for underdogs that manage to force overtime:

0 - 4.5 point underdog: won 47% (n=338)
5 - 9.5 point underdog: won 36% (n=264)
10 point or greater underdog: won 35% (n=49)

5-9.5 underdogs win about 25% in general, so there is some improvement if they make it to overtime, but that could just be due to the compressed time.
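
To put a number on the sample-size point, here is a sketch of a 95% Wilson score interval for each bucket above. The win counts are back-calculated from the rounded percentages and the n values, so they are approximate; the function name is just illustrative.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# (underdog bucket, reported win share, overtime games)
buckets = [("0-4.5", 0.47, 338), ("5-9.5", 0.36, 264), ("10+", 0.35, 49)]

intervals = {}
for label, share, n in buckets:
    wins = round(share * n)  # approximate, recovered from the rounded percentage
    intervals[label] = wilson_interval(wins, n)
    low, high = intervals[label]
    print(f"{label:>6}: {wins}/{n}  95% interval {low:.2f} to {high:.2f}")
```

With only 49 games, the interval for the 10+ bucket is wide (roughly 23% to 49%), so that 35% cannot really be distinguished from the 36% in the bucket next to it.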
Mike G
Posts: 6175
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: NBA In-Game Win Probability

Post by Mike G »

boooeee wrote:
0 - 4.5 point underdog: won 47% (n=338)
5 - 9.5 point underdog: won 36% (n=264)
10 point or greater underdog: won 35% (n=49)
The 49 overtimes between teams that are very unevenly matched (10+) have gone to the underdog almost as frequently as those between (5 - 9.5) moderately mismatched teams.

A 13 point underdog will win only about 15% of the time.
What's the 5-minute version of the 48-minute Pythagorean Win% -- and how do these overtime win% compare?
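
One rough way to get a 5-minute analogue is sketched below. It swaps the Pythagorean formula for a normal model of scoring margin, which scales more easily to a shorter period. It assumes margin accumulates with independent increments, that the standard deviation over 48 minutes is about 12 points, and that a tied overtime (leading to another overtime) can be ignored. All three are assumptions, not results from the data in this thread.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def period_win_prob(spread, minutes, sigma_48=12.0):
    """P(outscoring the opponent over `minutes`) for a team favoured by
    `spread` points per 48 minutes (negative spread = underdog)."""
    frac = minutes / 48.0
    mean = spread * frac                 # expected margin over the period
    sd = sigma_48 * math.sqrt(frac)      # spread of outcomes over the period
    return norm_cdf(mean / sd)

for dog in (2, 7, 13):
    full = period_win_prob(-dog, 48)
    overtime = period_win_prob(-dog, 5)
    print(f"{dog}-point underdog: {full:.2f} over 48 minutes, {overtime:.2f} over 5")
```

Under these assumptions a 13-point underdog comes out near 14% over a full game and in the mid-30s over five minutes, which is in the same neighbourhood as the overtime results quoted above, though the model puts the 5 - 9.5 group somewhat higher than the observed 36%.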
Bobbofitos
Posts: 306
Joined: Sat Apr 16, 2011 7:40 am
Location: Cambridge, MA
Contact:

Re: NBA In-Game Win Probability

Post by Bobbofitos »

Mike G wrote:And of course, there are also individual matchup issues between 2 teams, which you might say distorts their relative strengths: How a team performs against the whole league, vs how they perform against a certain opponent.

The Heat were 1-2 vs the Pacers in the season and 2-2 vs the Bulls; also struggled to dispatch them in the postseason.

If DWade is mired in a slump -- missed 9 of his last 10 shots, shooting 42% in the playoffs -- is he closer to 42% likely to make his next shot, or 52% (as he shot in the season) ?
An even larger sample might include previous seasons; but that may be even less relevant.
It's probably somewhere in between. Bayes and all that.
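
A minimal sketch of the "somewhere in between" idea, treating the season percentage as a prior and the playoff shots as new evidence (a beta-binomial update). The 52% and 42% figures are from the quoted post; the attempt counts and the weight given to the prior are hypothetical numbers chosen only to show the mechanics.

```python
def shrunk_rate(prior_rate, prior_weight, makes, attempts):
    """Posterior mean for a make rate: a Beta prior worth `prior_weight`
    pseudo-attempts centred on `prior_rate`, updated with the new sample."""
    return (prior_rate * prior_weight + makes) / (prior_weight + attempts)

season_rate = 0.52       # from the thread
prior_weight = 1000      # hypothetical: season evidence counted as 1000 attempts
playoff_attempts = 200   # hypothetical
playoff_makes = 84       # 42% of 200, matching the playoff figure in the thread

estimate = shrunk_rate(season_rate, prior_weight, playoff_makes, playoff_attempts)
print(round(estimate, 3))
```

The estimate always lands between the two percentages, and how close it sits to the slump number depends entirely on how much weight the prior is given relative to the recent sample.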