Ideas for data to extract from PBP
Re: Ideas for data to extract from PBP
Daniel, what I make of it is the KGification of the Boston offense. For better or, far more likely, for worse. Nash has about 9% more of his assists within three feet of the basket. This is about the same % of "excess" assists Rondo has at long mid-range.
What would be interesting to see is the Nash/Rondo comparison, for lineups not including KG (and perhaps the removal of one or more of Stoudemire/Frye/Hill, who have played a similar mid-range role, on the other side).
Re: Ideas for data to extract from PBP
DSMok1 wrote: Interesting. Pierce and (particularly) Garnett are both exceptionally good midrange shooters.
Exceptionally good midrange shooters? What data specifically are you referring to? I am not sure what link, if any, you are using.
I see Garnett as an exceptionally good midrange shooter at Hoopdata (about 47%) but overall those shots are still below league average eFG% for all shots. Pierce for all his midrange shots is quite near league average for midrange shots and those shots are dramatically below league average eFG% for all shots. Discretionary midrangers that are not wide open are probably just passable for Garnett and often a poor choice by Pierce.
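To make the comparison concrete, here is a minimal sketch of the arithmetic behind it. A two-point shot made 47% of the time is worth 0.94 points per attempt, and eFG% counts a made three as 1.5 makes so that twos and threes can be put on one scale. The function names and the sample shot mix are made up for illustration; no league-average figure is assumed here.

```python
def efg_pct(fgm, fg3m, fga):
    """Effective field goal percentage: (FGM + 0.5 * 3PM) / FGA."""
    if fga <= 0:
        raise ValueError("fga must be positive")
    return (fgm + 0.5 * fg3m) / fga


def points_per_shot(fg_pct, shot_value):
    """Expected points per attempt, ignoring free throws and rebounds."""
    return fg_pct * shot_value


# The ~47% midrange figure quoted above, as points per attempt.
midrange_pps = points_per_shot(0.47, 2)

# Hypothetical shot mix: 45 makes on 100 attempts, 10 of the makes are threes.
sample_efg = efg_pct(45, 10, 100)
```

On this scale a 47% two-point shooter has an eFG% of 47% on those shots, so whether they help the team depends entirely on the baseline they are compared against.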
Re: Ideas for data to extract from PBP
schtevie wrote: Daniel, what I make of it is the KGification of the Boston offense. For better or, far more likely, for worse. Nash has about 9% more of his assists within three feet of the basket. This is about the same % of "excess" assists Rondo has at long mid-range.
What would be interesting to see is the Nash/Rondo comparison, for lineups not including KG (and perhaps the removal of one or more of Stoudemire/Frye/Hill, who have played a similar mid-range role, on the other side).
I agree, it's mostly about KG.
Crow wrote: Exceptionally good midrange shooters? What data specifically are you referring to? I am not sure what link, if any, you are using.
I see Garnett as an exceptionally good midrange shooter at Hoopdata (about 47%) but overall those shots are still below league average eFG% for all shots. Pierce for all his midrange shots is quite near league average for midrange shots and those shots are dramatically below league average eFG% for all shots. Discretionary midrangers that are not wide open are probably just passable for Garnett and often a poor choice by Pierce.
I guess I was primarily thinking of Garnett, who is certainly one of the best in the midrange. Certainly above the threshold where shooting them contributes to the overall team efficiency.
Re: Ideas for data to extract from PBP
DSMok1 wrote: I guess I was primarily thinking of Garnett, who is certainly one of the best in the midrange. Certainly above the threshold where shooting them contributes to the overall team efficiency.
Daniel, I am curious as to how you can be certain about the second point.
To be certain requires establishing the baseline for some kind of implicit comparison.
The simplest baseline - average PPP for mid-range vs. non-mid-range shots - clearly shows the former to be inferior. Even for KG.
Another baseline, introducing a simple notion of opportunity cost, makes certainty more likely. If a certain fraction of KG's mid-range shots are taken late in the shot clock, his below-global-average mid-range shot might, in fact, be above average given the alternative. And definitively answering a question like this (and the more important related question of shot clock optimization) is why gathering shot clock time/distance data is very important.
But then there is a third baseline, expanding the concept of opportunity cost, where certainty seems to me to be less likely. Who's to say that KG (in particular) has been optimally deployed throughout his career in taking so many long 2s relative to other shots? The recent plot Jeremias provided reminds us of this point. The Celtics could well be more efficient as a team if KG's assists from Rondo were occurring around the basket.
Re: Ideas for data to extract from PBP
Chart: Rondo to Allen (blue), Pierce (red), Garnett (yellow). These are total numbers, although it would probably be a good idea to normalize by the shooter's number of offensive possessions.
edit: CP3's graph looks almost exactly like Rondo's if you put "rim" and "1ft" in the same bin
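The normalization suggested above could look roughly like this: divide each shooter's assisted makes per distance bin by that shooter's offensive possessions and scale to a per-100 rate. This is only a sketch; the function name, the bin labels, and the counts in the example are placeholders, not numbers from the chart.

```python
def assisted_makes_per_100(assisted_by_bin, shooter_possessions):
    """Scale raw assisted-make counts to a rate per 100 shooter possessions.

    assisted_by_bin: dict mapping a distance-bin label to a raw count.
    shooter_possessions: offensive possessions the shooter was on the floor for.
    """
    if shooter_possessions <= 0:
        raise ValueError("shooter_possessions must be positive")
    return {
        label: 100.0 * count / shooter_possessions
        for label, count in assisted_by_bin.items()
    }


# Placeholder counts for one passer/shooter pair.
example = assisted_makes_per_100({"rim": 30, "16-23 ft": 60}, 3000)
```

With rates instead of totals, a shooter who simply played more minutes no longer dominates the picture.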
Re: Ideas for data to extract from PBP
kpascual wrote: How about 5 man units?
J.E. wrote: You mean something like this http://stats-for-the-nba.appspot.com/PB ... br_ids.rar ?
Yes I did, and I am an idiot because I've been to your site many times and have seen this. Please ignore me.
See viewtopic.php?f=2&t=8033
Re: Ideas for data to extract from PBP
For the '11-'12 season I get the following distribution of 2 pointers taken by distance
Does anyone have any suggestions on what bins I should use? I'm thinking [0-3] (around the hoop), [4-13] (midrange), [14-24] (longer midrange and long distance 2)
I don't want to create too many bins, because I still want to use a player's FG% for that bin, which I can only do if said player has more than X shots in that bin
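The binning idea above can be sketched as follows: assign each 2-point attempt to a distance bin, then report a player's FG% for a bin only when he has more than X attempts there. The bin edges are the ones proposed above; the bin names, the `(distance_ft, made)` input shape, and the parameter name `min_shots` (standing in for X) are assumptions for illustration.

```python
from collections import defaultdict

# (name, low_ft, high_ft), inclusive on both ends, as proposed above.
BINS = [("close", 0, 3), ("mid", 4, 13), ("long", 14, 24)]


def bin_for(distance_ft, bins=BINS):
    """Return the bin name for a shot distance, or None if outside all bins."""
    for name, low, high in bins:
        if low <= distance_ft <= high:
            return name
    return None


def fg_pct_by_bin(shots, min_shots, bins=BINS):
    """shots: iterable of (distance_ft, made) pairs for one player.

    Returns {bin_name: FG% or None}; None means the player does not have
    more than min_shots attempts in that bin, so the FG% is not trusted.
    """
    made = defaultdict(int)
    attempts = defaultdict(int)
    for distance_ft, is_made in shots:
        name = bin_for(distance_ft, bins)
        if name is None:
            continue
        attempts[name] += 1
        made[name] += 1 if is_made else 0
    return {
        name: (made[name] / attempts[name] if attempts[name] > min_shots else None)
        for name, _, _ in bins
    }
```

Because the bins are passed in as a parameter, a finer split can be tried without touching the rest of the code, at the cost of more players falling under the threshold.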
Re: Ideas for data to extract from PBP
Here's where And1s happen by shot distance
Not too surprising. Somewhere around '10 bbr started to list them as "1 ft" or "2 ft", instead of using "at rim" like they did before. Not sure why
Re: Ideas for data to extract from PBP
J.E. wrote: For the '11-'12 season I get the following distribution of 2 pointers taken by distance
Does anyone have any suggestions on what bins I should use? I'm thinking [0-3] (around the hoop), [4-13] (midrange), [14-24] (longer midrange and long distance 2)
I don't want to create too many bins, because I still want to use a player's FG% for that bin, which I can only do if said player has more than X shots in that bin
Those bins look reasonable. It might be worthwhile to have a shorter range bin that can catch the floaters/hooks as distinct from jumpers (perhaps 4-8 or so?).
Re: Ideas for data to extract from PBP
A bit of a data dump with some valuable PBP info like:
-rebounds split into afterFG and afterFT
-And1s (split into subsequent FT missed/made)
-defensive fouls drawn
-offensive fouls drawn
-away assists
-away blocks (split into defense/offense got the ball "good block/bad block")
-live/dead TOs
-(un)assisted makes(2s) from close/mid/far
and maybe more. The first couple of columns are standard BoxScore data, up until "POINTS"
http://stats-for-the-nba.appspot.com/data/2006.txt
http://stats-for-the-nba.appspot.com/data/2007.txt
etc.
Please tell me if you spot any errors
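As an illustration of how one of the splits listed above might be derived, here is a rough sketch that classifies rebounds as afterFG or afterFT by remembering the most recent missed shot. The event labels (`fg_miss`, `ft_miss`, `rebound`, and so on) are hypothetical; they are not the format of the linked files or of any real play-by-play feed, and this is not necessarily how the posted data was produced.

```python
def split_rebounds(events):
    """Count rebounds by the kind of miss that preceded them.

    events: sequence of string labels in game order (hypothetical labels).
    A rebound with no pending miss is ignored rather than guessed at.
    """
    counts = {"afterFG": 0, "afterFT": 0}
    pending = None  # kind of the most recent unrebounded miss
    for event in events:
        if event == "fg_miss":
            pending = "afterFG"
        elif event == "ft_miss":
            pending = "afterFT"
        elif event == "rebound" and pending is not None:
            counts[pending] += 1
            pending = None
    return counts


example = split_rebounds(
    ["fg_miss", "rebound", "ft_made", "ft_miss", "rebound", "rebound"]
)
```

Real play-by-play has more edge cases (team rebounds, misses on the first of two free throws), so a production version would need extra rules.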
Re: Ideas for data to extract from PBP
What is the source for this data? Basketball Reference?
Re: Ideas for data to extract from PBP
DSMok1 wrote: What is the source for this data? Basketball Reference?
Yeah, you can tell from the unique IDs.
Re: Ideas for data to extract from PBP
J.E. wrote: Please tell me if you spot any errors
Age, you are adding up the age as well for players who played on multiple teams in the respective season. Right at the start of the 2006 file, Jim Jackson is listed with the age of 70, because he played for Suns and Lakers.
Re: Ideas for data to extract from PBP
I'm not sure if it is too late for this, but it would be nice to have counterpart stats from the play-by-play data publicly available, similar to what 82games.com does.
Re: Ideas for data to extract from PBP
mystic wrote: Age, you are adding up the age as well for players who played on multiple teams in the respective season. Right at the start of the 2006 file, Jim Jackson is listed with the age of 70, because he played for Suns and Lakers.
Right, thanks. %s are also messed up for players that played on multiple teams.
"I'm not sure if it is too late for this, but it would be nice to have counterpart stats from the play-by-play data publicly available, similar to what 82games.com does."
You can never tell for sure who is defending whom via PBP, so I'm not too excited about trying to extract counterpart data.
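One way to avoid both problems when merging a player's per-team rows is to sum only the counting stats, carry the age over once, and recompute percentages from the summed totals. This is a sketch under assumed field names (`player`, `age`, `fgm`, `fga`), not the actual column layout of the posted files; the shot counts in the example are invented, and the age of 35 is simply half of the 70 reported above.

```python
COUNTING_STATS = ("fgm", "fga")  # assumed names; extend with other counting columns


def merge_player_rows(rows):
    """Combine one player's per-team rows for a season into a single row.

    Counting stats are summed, age is taken once (not summed), and FG% is
    recomputed from the totals instead of adding or averaging per-team values.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    merged = {"player": rows[0]["player"], "age": rows[0]["age"]}
    for key in COUNTING_STATS:
        merged[key] = sum(row[key] for row in rows)
    merged["fg_pct"] = merged["fgm"] / merged["fga"] if merged["fga"] else None
    return merged


# Two stints in one season; shot counts are invented for illustration.
rows = [
    {"player": "Jim Jackson", "team": "PHO", "age": 35, "fgm": 10, "fga": 25},
    {"player": "Jim Jackson", "team": "LAL", "age": 35, "fgm": 5, "fga": 15},
]
merged = merge_player_rows(rows)
```

Recomputing from totals also weights each stint by its attempts, which a plain average of the two per-team percentages would not.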