
Simulating NBA games (J.E., 2011)

Posted: Fri Apr 15, 2011 2:23 am
by Crow
page 1 of 2

Author Message
back2newbelf



Joined: 21 Jun 2005
Posts: 259


PostPosted: Mon Mar 07, 2011 8:44 pm Post subject: Simulating NBA games Reply with quote
Hey

My next project will be simulating entire games. This will take a long time before it's finished but I thought
I should make the thread now for discussion.

GOAL The end goal would be to have a system that's better at predicting how many points two 5-man units
score against each other. I think predicting the end-of-game point differential has its flaws.
The system should be able to rate players according to how many points they add to an average team, either
per possession or per game. It should also be able to rate players according to how much value they add
to an already existing team. This should be different from value added to the average team because teams
have different needs.

First step would be to get some baseline numbers on RMSE for predicted vs. actual points scored for
every possession. I'll start with using (1) just homecourt advantage, (2) simple on-court rating
(divided by 5, because there are 5 players playing), (3) regularized adjusted +/- (RAPM), and (3a) RAPM with coaching or any
other add-on I might come up with down the line.
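
A minimal sketch of how such a baseline RMSE could be computed, assuming each possession is stored as a predicted and an actual point value. The function name, the sample possessions and the 1.07 constant are made up for illustration and are not measured numbers.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual points per possession."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical data: actual points scored on five possessions.
actual_points = [0, 2, 3, 0, 2]

# Simplest possible baseline: predict one constant value for every possession
# (illustrative number only).
constant_prediction = [1.07] * len(actual_points)

baseline_rmse = rmse(constant_prediction, actual_points)
print(f"baseline RMSE: {baseline_rmse:.3f}")
```

Any richer model (on-court rating, RAPM, a simulation) would then be judged by whether it lowers this number on possessions it was not trained on.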

Why simulate entire games? I think that player metrics that have just offensive and defensive rating
miss key interactions between players. As an example, I think good offensive rebounders should be paired
with low turnover guys, or else they won't be able to make use of that skill. Also, if you already have 3 Jordans
you probably don't want to add 2 more Jordans, even if he's the best player of all time. You're probably better
off adding 2 Rodmans. Conventional player metrics don't really capture that.

How to simulate games. There are many facets that need to be looked at. I'll probably make threads for
these as I'm doing my research on the topics.

First simulating step The start would be, for every 5 man lineup and possession, to guess who's gonna use
that possession (taking everybody's usage rates), then guess what he's gonna do (using assist rate, turnover rate, drawing foul rate). If he's shooting we guess if it's a 3 or a 2, then use his 3pt% and 2FG% to guess if it went in. If he's fouled, see if the other team is over the limit and continue accordingly. If there's a miss, calculate chances the ball will be rebounded by the attacking team, using OReb%s. That would be the simple model.
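
The simple model above could be sketched roughly as follows. This is only an illustration under stated assumptions: fouls and free throws are left out, the offensive rebound chance is one team-level number, and the `Player` fields, the function name and all rates are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    usage: float       # relative share of possessions used while on court
    tov_rate: float    # chance a used possession ends in a turnover
    three_rate: float  # share of shot attempts that are threes
    fg2_pct: float
    fg3_pct: float

def simulate_possession(lineup, oreb_prob, rng, max_chances=10):
    """Return points scored on one possession (fouls ignored in this sketch)."""
    for _ in range(max_chances):
        # Step 1: pick who uses the possession, weighted by usage rate.
        shooter = rng.choices(lineup, weights=[p.usage for p in lineup])[0]
        # Step 2: a turnover ends the possession with no points.
        if rng.random() < shooter.tov_rate:
            return 0
        # Step 3: decide 3 vs. 2, then whether the shot goes in.
        if rng.random() < shooter.three_rate:
            points, make_prob = 3, shooter.fg3_pct
        else:
            points, make_prob = 2, shooter.fg2_pct
        if rng.random() < make_prob:
            return points
        # Step 4: after a miss, play continues only on an offensive rebound.
        if rng.random() >= oreb_prob:
            return 0
    return 0

# Made-up lineup of five identical players, simulated 1000 times.
rng = random.Random(0)
lineup = [Player(f"P{i}", 0.20, 0.13, 0.25, 0.48, 0.36) for i in range(5)]
sample_points = [simulate_possession(lineup, 0.26, rng) for _ in range(1000)]
print(sum(sample_points) / len(sample_points))
```

Every refinement listed below (usage after steals, defensive impact, diminishing returns) would replace one of the fixed rates here with something that depends on the other nine players or on the game state.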

From there I will make this more and more detailed, this includes the following problems and questions:
-Are there different usage rates for possessions after steals and offensive rebounds? The offensive rebounder
is definitely going to use the ball more than the players around him!?
-What about defensive impact on usage rate? If teams play with 4 good defenders and one bad one, and the opponent
is aware of it, the player being defended by the bad defender should see his usage rate rise.
-What about good players that always play with other good, high usage players?
If they go to a very bad team their usage rate is probably gonna rise. We need a way to predict this well.
-If we want to know if the ball is going to be turned over, is TOV% a good predictor of that?
We will probably also have to bake team turnovers into everybody's TOV% to get better results?! Maybe adjusted
turnover numbers are a better predictor of turnovers? Maybe a mix of individual numbers and adjusted numbers?
How about diminishing returns on turnovers? All questions apply to steals as well.
-Do we have different turnover rates after steals and offensive rebounds?
-When predicting FG% and 3pt%, do we use individual numbers only? Mix them with adjusted FG% numbers?
If we add lots of high usage rate players to a lineup, does that have an effect on FG% as well? How much?
What if the lineup is composed of lots of low usage players?
-Fouling: Do we just use individual fouling numbers or do we combine with adjusted fouling numbers?
-Offensive rebounding: Pretty much the same questions as with steals. Use individual numbers or adjusted?
Diminishing returns? Do teams go for offensive rebounds less when they're far ahead?

The aim is to have a website that predicts wins for the remainder of the season, lets the user trade players and then updates the win predictions accordingly.

Any comments or links to studies that deal with some of the stuff mentioned would be appreciated
_________________
http://stats-for-the-nba.appspot.com/
EvanZ



Joined: 22 Nov 2010
Posts: 268


PostPosted: Mon Mar 07, 2011 9:32 pm Post subject: Reply with quote
When you can beat Vegas, then we'll know you're onto something. Laughing
_________________
http://www.thecity2.com
http://www.ibb.gatech.edu/evan-zamir
Crow



Joined: 20 Jan 2009
Posts: 806


PostPosted: Mon Mar 07, 2011 10:04 pm Post subject: Reply with quote
If you do

Adjusted +/- modified in some or many ways to try to reduce noise / error

and take it down to the 4 factors for players,

and Adjusted +/- for player pairs

and Adjusted +/- for the 4 factors for player pairs, first,

and then generalize that information up to player and player pair Adjusted +/- for "player types" of some kind

and do it for player-opponent type pairs too (especially counterpart),

then I think you would have more resources to research some of the smaller questions and make more sensitive predictions of play by play results in a dynamic game prediction model.



I don't know if any of the simulation / computer game designers here will share of their knowledgebase and work product but perhaps one or more might comment here or there if they feel it is safe, in public or privately. Or you could evaluate their products and think about how they work and what values and assumptions they appear to be using and how well they track with observations and your preliminary predictions.
bbstats



Joined: 25 Apr 2010
Posts: 38


PostPosted: Tue Mar 08, 2011 12:23 am Post subject: Reply with quote
Yeah - I'm wondering if you will be able to make better judgements via 5-man lineups or adjusted plus-minus. Intuitively, single-player statistics would have less error, but lineups capture interactions in ways single-player numbers cannot...I think the main concern is finding out the distribution of how events occur between specific players. If you're doing a complete simulation, averages aren't going to cut it (in my opinion).

Apparently you've put together a very low-error player rating system? I read the summary of your SSAC poster...brilliant stuff man, I'd love to see the method&results!

I'm also assuming that you would do thousands of simulations to give some distribution of game results?
_________________
http://thebasketballdistribution.blogspot.com

http://twitter.com/bbstats
back2newbelf



Joined: 21 Jun 2005
Posts: 259


PostPosted: Thu Mar 10, 2011 2:15 pm Post subject: Reply with quote
bbstats wrote:

I'd love to see the method&results!

Thanks. I'll train and test that method on 5 man units also. I'll post the results in a few weeks
Quote:

I'm also assuming that you would do thousands of simulations to give some distribution of game results?
I'll do some tests on how many simulations I'll have to run to get stable results. Hopefully it won't be too many
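
One way to think about "how many simulations" is through the standard error of the estimate, which shrinks with the square root of the number of simulated games. The sketch below uses a toy game model with made-up scoring probabilities; the function names and the 95-possession game length are assumptions for illustration only.

```python
import math
import random

def simulate_game(rng, probs_a, probs_b, possessions=95):
    """Toy game: every possession scores 0, 2 or 3 points with fixed probabilities."""
    outcomes = [0, 2, 3]
    score_a = sum(rng.choices(outcomes, weights=probs_a, k=possessions))
    score_b = sum(rng.choices(outcomes, weights=probs_b, k=possessions))
    return score_a - score_b

def estimate_win_prob(n_games, seed=0):
    """Estimate team A's win probability and the standard error of that estimate."""
    rng = random.Random(seed)
    # Made-up probabilities of scoring (0, 2, 3) points on a possession.
    team_a = [0.52, 0.38, 0.10]
    team_b = [0.54, 0.37, 0.09]
    # Ties are counted as non-wins to keep the sketch short.
    wins = sum(simulate_game(rng, team_a, team_b) > 0 for _ in range(n_games))
    p = wins / n_games
    std_err = math.sqrt(p * (1 - p) / n_games)
    return p, std_err

for n in (100, 1000, 10000):
    p, se = estimate_win_prob(n)
    print(f"{n:>6} games: win prob {p:.3f} +/- {se:.3f}")
```

Since the standard error can never exceed 0.5 / sqrt(n), about 10,000 games pin a win probability down to within roughly half a percentage point (one standard error).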
_________________
http://stats-for-the-nba.appspot.com/
back2newbelf



Joined: 21 Jun 2005
Posts: 259


PostPosted: Thu Mar 10, 2011 8:35 pm Post subject: Reply with quote
I'm thinking about using (rather flat) sigmoid functions to model diminishing returns. In those pictures X-Axis would be sum of individual numbers, Y-Axis would be expected team number. A good rebounder should help the average rebounding team more than he would help an already above average rebounding team. Does that make sense?

Thoughts on using a sigmoid function?
_________________
http://stats-for-the-nba.appspot.com/
EvanZ



Joined: 22 Nov 2010
Posts: 268


PostPosted: Thu Mar 10, 2011 9:48 pm Post subject: Reply with quote
back2newbelf wrote:
I'm thinking about using (rather flat) sigmoid functions to model diminishing returns. In those pictures X-Axis would be sum of individual numbers, Y-Axis would be expected team number. A good rebounder should help the average rebounding team more than he would help an already above average rebounding team. Does that make sense?

Thoughts on using a sigmoid function?


I think you have it reversed. As the expected (predicted) rebounding increases (x-axis), the actual rebounding flattens (y-axis). That's the sigmoid. The way you said it would be a logit function (inverse sigmoid).
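
For reference, a small sketch of the two functions being discussed and of a flattening team curve built from the sigmoid. The `expected_team_rate` mapping and its `midpoint` and `steepness` parameters are hypothetical and would have to be fitted to data.

```python
import math

def sigmoid(x):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Inverse of the sigmoid: maps a probability in (0, 1) back to a real number."""
    return math.log(p / (1.0 - p))

def expected_team_rate(individual_sum, midpoint, steepness):
    """Hypothetical mapping from summed individual rates to an expected team rate."""
    return sigmoid(steepness * (individual_sum - midpoint))

# Above the midpoint, equal steps in the summed individual numbers
# produce smaller and smaller gains in the expected team number.
average = expected_team_rate(0.5, midpoint=0.5, steepness=4.0)
good = expected_team_rate(0.6, midpoint=0.5, steepness=4.0)
great = expected_team_rate(0.7, midpoint=0.5, steepness=4.0)
print(good - average, great - good)
```

Note that the sigmoid only flattens above its midpoint; below it the curve is convex, so where "average" sits relative to the midpoint decides whether returns are diminishing or increasing.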
_________________
http://www.thecity2.com
http://www.ibb.gatech.edu/evan-zamir
Mike G



Joined: 14 Jan 2005
Posts: 3563
Location: Hendersonville, NC

PostPosted: Fri Mar 11, 2011 9:04 am Post subject: Reply with quote
It would certainly be fun to test your model by pitting 5 Zach Randolphs against 5 Chris Pauls, for example. Such extreme lineups will require some insight to simulate one team's lack of a distributor and the other's lack of rebounding.

These are vital functions that, outside an occasional possession, no NBA team would ever neglect to have represented on the floor. But some coaches like to push the envelope, and swingmen play power forward.

How many minutes does Zach play without someone to get him the ball down low? I think if you can quantify the effect of assist men on their pass recipients, you've gotten a major obstacle behind you.
_________________
36% of all statistics are wrong
bchaikin



Joined: 27 Jan 2005
Posts: 685
Location: cleveland, ohio

PostPosted: Fri Mar 11, 2011 12:36 pm Post subject: Reply with quote
Mike G wrote:
Such extreme lineups will require some insight to simulate one team's lack of a distributor... These are vital functions that, outside an occasional possession, no NBA team would ever neglect to have represented on the floor.

what constitutes extreme? the 1978-79 san diego clippers finished with an above .500 record (43-39), shot a 2pt FG% right near the league average, yet assisted on just 41% of their FGMs. that same season milwaukee had an ast/fgm ratio of 66% yet finished with a below .500 record (38-44), and a better defensive team than the clippers...

in 04-05 and 05-06 combined the dallas mavericks, among 30 teams, had the league's lowest ast/fgm ratio at 51% (the highest was 66%, average was 58%), but the 8th highest eFG%, 9th highest 2pt FG%, and 4th highest 3pt FG%...

Mike G wrote:
I think if you can quantify the effect of assist men on their pass recipients, you've gotten a major obstacle behind you.

just what is the effect of assist men on their pass recipients?...

fyi since 1977-78 30 teams had regular season ast/fgm ratios of from 66%-68%, and combined shot an eFG% of 48.9%, a 2pt FG% of 48.1%, and a 3pt FG% of 35.6%. 134 teams had a regular season ast/fgm ratio of from 56%-58%, and combined shot an eFG% of 48.7%, a 2pt FG% of 48.1%, and a 3pt FG% of 35.1%...

that's the same 2pt FG%, almost the same 3pt FG%, yet 10% less ast/fgm...
Crow



Joined: 20 Jan 2009
Posts: 806


PostPosted: Sat Mar 12, 2011 1:45 pm Post subject: Reply with quote
The league average ast/fgm ratio this season is 57.7% so a range of 56-58% would be just below average this season and those with 66%-68% are rare to the tune of less than 1 per season.

Maybe 56-58% is not a problem. Teams with a 56-58% ast/fgm ratio this season again shoot better than the league average by a small amount. But they also turned it over about 2% less. Their average offensive efficiency is only 0.3 points less than average.

Maybe there is a scorekeeper effect on this band of teams. I haven't matched them up to Mike G's data.

For teams who had less than a 56% ast/fgm ratio, they shot a little worse than league average on raw FG%, got about 3.5% more free throws though to help offset, but also turned it over about 3% more. That is a somewhat different style of play, but 5 of 9 are still currently playoff seeded. Still their average offensive efficiency is 1.3 points less than league average so some concern is probably warranted. How much of the concern would / should be on assists and how much on other factors of offense will vary by team and probably by analyst.

The free throw rates are probably affected by the frequency of drives (mostly unassisted?).

How teams work the fastbreak (dribble or pass right before the shot) could affect the ratios too.
nakoned



Joined: 03 Jan 2011
Posts: 2


PostPosted: Mon Mar 14, 2011 10:20 am Post subject: Reply with quote
I was always wondering what distribution do people use when they simulate NBA games? Also do they seed player vs team events per possession? The difficulty with simulating a possession (and ultimately game), unlike stock move, for example, is there are more than 2 possible outcomes that can happen on every possession, in a broad sense. So I was curious what people use to accomplish that? I also think this is the place to start if you want to accomplish your task. Would be interesting to see though if there are any thoughts (or literature) on that...
back2newbelf



Joined: 21 Jun 2005
Posts: 259


PostPosted: Mon Mar 14, 2011 12:56 pm Post subject: Reply with quote
nakoned wrote:
I was always wondering what distribution do people use when they simulate NBA games? Also do they seed player vs team events per possession?

What do you mean by "distribution" and "seed"?
Quote:
The difficulty with simulating a possession (and ultimately game), unlike stock move, for example, is there are more than 2 possible outcomes that can happen on every possession, in a broad sense. So I was curious what people use to accomplish that? I also think this is the place to start if you want to accomplish your task. Would be interesting to see though if there are any thoughts (or literature) on that...

One way would certainly be to assign probabilities to all possible events, then multiply those probabilities with the amount of points scored in those possible events to get an overall expected points per possession. Or, maybe an easier way, just simulate the possession a couple of times and take the average
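
Both approaches can be put side by side in a short sketch. The outcome table is made up for illustration (1 point stands for a split pair of free throws), and the function names are hypothetical.

```python
import random

# Hypothetical outcome table: points scored -> probability (sums to 1).
outcome_probs = {0: 0.52, 1: 0.04, 2: 0.34, 3: 0.10}

def expected_points(probs):
    """Analytic expectation: sum of points times probability."""
    return sum(points * p for points, p in probs.items())

def simulated_points(probs, n, seed=0):
    """Monte Carlo estimate: draw n possessions and average the points."""
    rng = random.Random(seed)
    draws = rng.choices(list(probs.keys()), weights=list(probs.values()), k=n)
    return sum(draws) / n

print(expected_points(outcome_probs))           # about 1.02 points per possession
print(simulated_points(outcome_probs, 100000))  # close to the analytic value
```

With fixed, independent probabilities the two agree up to sampling noise. Simulation starts to pay off once the probabilities depend on what happened earlier in the possession or game, where writing out every state by hand gets tedious.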
_________________
http://stats-for-the-nba.appspot.com/
nakoned



Joined: 03 Jan 2011
Posts: 2


PostPosted: Mon Mar 14, 2011 5:04 pm Post subject: Reply with quote
Quote:

What do you mean by "distribution" and "seed"?

When you perform a simulation you are basically drawing random values out of a PDF multiple times. A seed is each value drawn out of the distribution.

Quote:

One way would certainly be to assign probabilities to all possible events, then multiply those probabilities with the amount of points scored in those possible events to get an overall expected points per possession. Or, maybe an easier way, just simulate the possession a couple of times and take the average

It seems to me that what you are doing here is just confirming probabilities you estimated earlier. Why do simulation at all then? Think about it... if on average a possession for a team is expected to end in FG 50% of the time (I am just making the numbers), by running "simulation" 10000 times, it will give you exactly that. I.e. 50% of possessions will end in FG. Then what are you trying to simulate? That is why I posted earlier question about distribution of events. Each event in NBA might be normally distributed, but since each possession can end in a different event, this distribution of multiple events needs to be modeled correctly. I think this is the first step on the way of designing simulation for any process. Or maybe I misunderstood what you meant?
back2newbelf



Joined: 21 Jun 2005
Posts: 259


PostPosted: Mon Mar 14, 2011 5:41 pm Post subject: Reply with quote
nakoned wrote:
Think about it... if on average a possession for a team is expected to end in FG 50% of the time (I am just making the numbers), by running "simulation" 10000 times, it will give you exactly that. I.e. 50% of possessions will end in FG.

I know it gives the same result, it's just that running it a bunch of times might be simpler to code than keeping track of all possible states and their according probabilities.
But I might want to keep track of all possible states anyway because it probably matters (in terms of defensive efficiency of your team's next defensive possession) how your possession ended.
_________________
http://stats-for-the-nba.appspot.com/
EvanZ



Joined: 22 Nov 2010
Posts: 268


PostPosted: Tue Mar 15, 2011 1:56 pm Post subject: Reply with quote
How does AccuScore work? Anybody know?
_________________
http://www.thecity2.com
http://www.ibb.gatech.edu/evan-zamir

page 2 of 2

Author Message
asimpkins



Joined: 30 Apr 2006
Posts: 245
Location: Pleasanton, CA

PostPosted: Tue Mar 15, 2011 6:40 pm Post subject: Reply with quote
I haven't been around here in a while, but this thread was brought to my attention and I thought I'd comment. I've taken over the xohoops.com* project after Ben F. decided to move on.

Our simulator has done a good job of getting past the first few hurdles described and it is producing credible results. It hasn't yet attempted most of the more advanced adjustments mentioned above though, so I'm very interested in seeing how this project develops. I'm particularly interested in which of these advanced adjustments have the biggest impact on the final result and which only make a minor difference.

Our approach is to parse NBA play by plays to build up tendencies for each player and then create new play by play lines based on how the tendencies of those 10 players intersect. Much like you outlined in "First simulating step", we decide who uses the possession and then what he does and if needed any reactions to that until the offense resets or the defense gets control of the ball. We have adjustments for rotations that get over/under 100% usage, we have a small home court adjustment, we factor in some shot contesting defense by looking at a combination of a player's oncourt/offcourt opponent efg% as well as their NBA team's defensive performance, and we provide some value for assists by estimating both potentially assisted and unassisted rates for many actions.
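
The post does not say how the over/under 100% usage adjustment works. As a point of comparison only, the simplest possible assumption is proportional rescaling, sketched below with hypothetical names and numbers; it is not a description of the xohoops.com method.

```python
def normalize_usage(usage_rates):
    """Rescale a lineup's usage rates so they sum to exactly 1.0.

    Assumes every player's usage shrinks or grows by the same factor,
    which is a simplification and not the adjustment described in the post.
    """
    total = sum(usage_rates.values())
    if total <= 0:
        raise ValueError("usage rates must sum to a positive number")
    return {name: rate / total for name, rate in usage_rates.items()}

# Hypothetical lineup whose individual usage rates add up to 115%.
lineup_usage = {"A": 0.30, "B": 0.28, "C": 0.22, "D": 0.20, "E": 0.15}
adjusted = normalize_usage(lineup_usage)
print(adjusted)
```

A more realistic version would also change efficiency as usage moves, since players asked to use more possessions than usual may not keep their shooting percentages.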

A few more of the problems we've run into that I didn't see mentioned:

1. How is rebounding affected by the position you play? If a player rebounds at a certain rate as a PF, can you reliably use that same rate for him at SF? Or does the position you play fundamentally affect the rebounding opportunities you'll get?

2. How does the ability (or lack of) to shoot from distance affect the offense by spacing the floor and stretching the defense? If you play a squad of guys that can't shoot outside the paint, does their collective efficiency drop off?

3. How do players adjust depending on foul trouble and the threat of being benched? Are they less likely to pick up a foul (or the ref to blow the whistle) if it would send them to the bench?

*For those unfamiliar, xohoops.com is an NBA fantasy league with multi-year contracts, drafts, free agency, and a full game simulator designed to produce play by plays based on the fantasy rosters and the specified substitution patterns. There's not much available for newcomers yet, but we hope to open new leagues for next season.

Re: Simulating NBA games (J.E., 2011)

Posted: Mon May 30, 2011 2:08 am
by Crow
Do you feel you are at the point where you could do this for the Finals? Just an idea, no pressure.

Re: Simulating NBA games (J.E., 2011)

Posted: Mon May 30, 2011 9:01 am
by J.E.
Crow wrote:Do you feel you are at the point where you could do this for the Finals? Just an idea, no pressure.
No. Turnovers are needed for simulation and I haven't done them yet.
Also, before I would recommend simulation over RAPM or simple team differential a lot of time will go by. It definitely won't produce better out-of-sample error right off the bat