Re: Ideas for data to extract from PBP

Posted: Wed Oct 24, 2012 4:25 pm
by schtevie
Daniel, what I make of it is the KGification of the Boston offense. For better or, far more likely, for worse. Nash has about 9% more of his assists within three feet of the basket. This is about the same % of "excess" assists Rondo has at long mid-range.

What would be interesting to see is the Nash/Rondo comparison, for lineups not including KG (and perhaps the removal of one or more of Stoudemire/Frye/Hill, who have played a similar mid-range role, on the other side).

Re: Ideas for data to extract from PBP

Posted: Wed Oct 24, 2012 4:39 pm
by Crow
DSMok1 wrote:Interesting. Pierce and (particularly) Garnett are both exceptionally good midrange shooters.

Exceptionally good midrange shooters? What data specifically are you referring to? I am not sure what link, if any, you are using.

I see Garnett as an exceptionally good midrange shooter at Hoopdata (about 47%), but overall those shots are still below league average eFG% for all shots. Pierce, across all his midrange shots, is quite near league average for midrange shots, and those shots are dramatically below league average eFG% for all shots. Discretionary midrangers that are not wide open are probably just passable for Garnett and often a poor choice by Pierce.
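
As a rough arithmetic check on that comparison: for two-point shots eFG% equals FG%, so a 47% midrange shooter yields 0.94 points per shot. The sketch below only makes that arithmetic explicit. The function names are illustrative, fouls and offensive rebounds are ignored, and the league-average eFG% is left as a parameter because no figure is quoted in this thread.

```python
def points_per_shot(fg_pct, shot_value=2):
    """Expected points per attempt, ignoring fouls and offensive rebounds."""
    return fg_pct * shot_value


def efg_pct(fg_pct, shot_value=2):
    """Effective FG%: a made three counts 1.5 times a made two."""
    return fg_pct * shot_value / 2


def beats_baseline(fg_pct, baseline_efg, shot_value=2):
    """True if this shot type's eFG% exceeds the supplied baseline eFG%."""
    return efg_pct(fg_pct, shot_value) > baseline_efg


# Garnett's midrange figure from the post above (about 47%).
kg_midrange_pps = points_per_shot(0.47)
```

Calling beats_baseline(0.47, baseline_efg) with whatever league-wide eFG% one trusts answers the "above the threshold" question for the simplest baseline only.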

Re: Ideas for data to extract from PBP

Posted: Wed Oct 24, 2012 4:58 pm
by DSMok1
schtevie wrote:Daniel, what I make of it is the KGification of the Boston offense. For better or, far more likely, for worse. Nash has about 9% more of his assists within three feet of the basket. This is about the same % of "excess" assists Rondo has at long mid-range.

What would be interesting to see is the Nash/Rondo comparison, for lineups not including KG (and perhaps the removal of one or more of Stoudemire/Frye/Hill, who have played a similar mid-range role, on the other side).
I agree, it's mostly about KG.
Crow wrote:Exceptionally good midrange shooters? What data specifically are you referring to? I am not sure what link, if any, you are using.

I see Garnett as an exceptionally good midrange shooter at Hoopdata (about 47%), but overall those shots are still below league average eFG% for all shots. Pierce, across all his midrange shots, is quite near league average for midrange shots, and those shots are dramatically below league average eFG% for all shots. Discretionary midrangers that are not wide open are probably just passable for Garnett and often a poor choice by Pierce.
I guess I was primarily thinking of Garnett, who is certainly one of the best in the midrange. Certainly above the threshold where shooting them contributes to the overall team efficiency.

Re: Ideas for data to extract from PBP

Posted: Wed Oct 24, 2012 7:17 pm
by schtevie
DSMok1 wrote:I guess I was primarily thinking of Garnett, who is certainly one of the best in the midrange. Certainly above the threshold where shooting them contributes to the overall team efficiency.
Daniel, I am curious as to how you can be certain about the second point.

To be certain requires establishing the baseline for some kind of implicit comparison.

The simplest baseline - average PPP for mid-range vs. non-mid-range shots - clearly shows the former to be inferior. Even for KG.

Another baseline, introducing a simple notion of opportunity cost, makes certainty more likely. If a certain fraction of KG's mid-range shots are taken late in the shot clock, his below-global-average mid-range shot might, in fact, be above average given the alternative. And definitively answering a question like this (and the more important related question of shot clock optimization) is why gathering shot clock time/distance data is very important.

But then there is a third baseline, expanding the concept of opportunity cost, where certainty seems to me to be less likely. Who's to say that KG (in particular) has been optimally deployed throughout his career in taking so many long 2s relative to other shots? The recent plot Jeremias provided reminds us of this point. The Celtics could well be more efficient as a team if Rondo's assists to KG were occurring around the basket.
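
The second baseline (value relative to the alternative at the same point in the shot clock) could be computed roughly as below, if shot clock data were available. This is a sketch under stated assumptions: the (seconds_left, points_scored) record layout, the function names, and the 6-second bucket width are all hypothetical, and no real PBP feed is implied.

```python
from collections import defaultdict


def pps_by_bucket(shots, bucket_size=6):
    """Average points per shot, grouped by seconds left on the shot clock.

    `shots` is an iterable of (seconds_left, points_scored) pairs.
    This record layout is an assumption made for illustration.
    """
    totals = defaultdict(lambda: [0, 0])  # bucket start -> [points, attempts]
    for seconds_left, points in shots:
        bucket = int(seconds_left // bucket_size) * bucket_size
        totals[bucket][0] += points
        totals[bucket][1] += 1
    return {b: pts / n for b, (pts, n) in totals.items()}


def relative_value(player_shots, league_shots, bucket_size=6):
    """Player PPS minus league PPS in each bucket where both have attempts."""
    player = pps_by_bucket(player_shots, bucket_size)
    league = pps_by_bucket(league_shots, bucket_size)
    return {b: player[b] - league[b] for b in player if b in league}
```

A positive value in the late-clock bucket alongside a negative overall difference would be the pattern described above: a below-average shot that is still better than the alternative.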

Re: Ideas for data to extract from PBP

Posted: Wed Oct 24, 2012 9:40 pm
by J.E.
Click. Rondo to Allen (blue), Pierce (red), Garnett (yellow). Total numbers, although it would probably be a good idea to normalize by shooter's # of offensive possessions

edit: CP3's graph looks almost exactly like Rondo's if you put "rim" and "1ft" in the same bin
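
The normalization mentioned above could look like the following minimal sketch. The dictionary inputs and the function name are hypothetical; the counts would come from the PBP extraction.

```python
def per_100_possessions(assisted_makes, shooter_possessions):
    """Assisted makes per 100 of the shooter's offensive possessions.

    Both arguments are dicts keyed by shooter name (hypothetical layout).
    Shooters with no recorded possessions are skipped rather than divided by zero.
    """
    rates = {}
    for shooter, makes in assisted_makes.items():
        poss = shooter_possessions.get(shooter, 0)
        if poss > 0:
            rates[shooter] = 100.0 * makes / poss
    return rates
```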

Re: Ideas for data to extract from PBP

Posted: Wed Nov 07, 2012 4:08 am
by kpascual
J.E. wrote:
kpascual wrote:How about 5 man units?
You mean something like this http://stats-for-the-nba.appspot.com/PB ... br_ids.rar ?
See viewtopic.php?f=2&t=8033
Yes I did, and I am an idiot because I've been to your site many times and have seen this. Please ignore me.

Re: Ideas for data to extract from PBP

Posted: Fri Nov 30, 2012 9:16 pm
by J.E.
For the '11-'12 season I get the following distribution of 2 pointers taken by distance

Does anyone have any suggestions on what bins I should use? I'm thinking [0-3] (around the hoop), [4-13] (midrange), [14-24] (longer midrange and long distance 2)

I don't want to create too many bins, because I still want to use a player's FG% for that bin, which I can only do if said player has more than X shots in that bin
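
A minimal sketch of the proposed bins with the minimum-shots cutoff. It assumes whole-foot distances and a list of (distance_ft, made) pairs for two-point attempts; the bin names, the function names, and the default of 50 for X are placeholders, not values from the data.

```python
# (name, low_ft, high_ft), inclusive on both ends, whole feet assumed.
BINS = [("close", 0, 3), ("mid", 4, 13), ("long", 14, 24)]


def bin_for(distance_ft):
    """Return the bin name for a shot distance, or None if outside 0-24 ft."""
    for name, low, high in BINS:
        if low <= distance_ft <= high:
            return name
    return None


def fg_pct_by_bin(shots, min_shots=50):
    """FG% per bin, or None where the player has min_shots attempts or fewer.

    `shots` is an iterable of (distance_ft, made) pairs (hypothetical layout).
    """
    counts = {name: [0, 0] for name, _, _ in BINS}  # name -> [makes, attempts]
    for distance_ft, made in shots:
        name = bin_for(distance_ft)
        if name is None:
            continue
        counts[name][0] += int(made)
        counts[name][1] += 1
    return {
        name: (makes / attempts if attempts > min_shots else None)
        for name, (makes, attempts) in counts.items()
    }
```

Bins that come back as None are the ones where some fallback (for example a league-average FG% for that bin) would be needed, which is the cost of adding more bins.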

Re: Ideas for data to extract from PBP

Posted: Tue Dec 04, 2012 3:45 pm
by J.E.
Here's where And1s happen by shot distance

Not too surprising. Somewhere around '10 bbr started to list them as "1 ft" or "2 ft", instead of using "at rim" like they did before. Not sure why

Re: Ideas for data to extract from PBP

Posted: Tue Dec 04, 2012 3:52 pm
by DSMok1
J.E. wrote:For the '11-'12 season I get the following distribution of 2 pointers taken by distance

Does anyone have any suggestions on what bins I should use? I'm thinking [0-3] (around the hoop), [4-13] (midrange), [14-24] (longer midrange and long distance 2)

I don't want to create too many bins, because I still want to use a player's FG% for that bin, which I can only do if said player has more than X shots in that bin
Those bins look reasonable. It might be worthwhile to have a shorter range bin that can catch the floaters/hooks as distinct from jumpers (perhaps 4-8 or so?).

Re: Ideas for data to extract from PBP

Posted: Tue Dec 04, 2012 4:23 pm
by J.E.
A bit of a data dump with some valuable PBP info like:
-rebounds split into afterFG and afterFT
-And1s (split into subsequent FT missed/made)
-defensive fouls drawn
-offensive fouls drawn
-away assists
-away blocks (split by whether the defense or the offense got the ball: "good block"/"bad block")
-live/dead TOs
-(un)assisted makes(2s) from close/mid/far

and maybe more. The first couple of columns are standard BoxScore data, up until "POINTS"

http://stats-for-the-nba.appspot.com/data/2006.txt
http://stats-for-the-nba.appspot.com/data/2007.txt
etc.

Please tell me if you spot any errors

Re: Ideas for data to extract from PBP

Posted: Tue Dec 04, 2012 4:35 pm
by DSMok1
What is the source for this data? Basketball Reference?

Re: Ideas for data to extract from PBP

Posted: Tue Dec 04, 2012 4:55 pm
by v-zero
DSMok1 wrote:What is the source for this data? Basketball Reference?
Yeah, you can tell from the unique IDs.

Re: Ideas for data to extract from PBP

Posted: Tue Dec 04, 2012 4:59 pm
by mystic
J.E. wrote: Please tell me if you spot any errors
Age: you are adding up the age as well for players who played on multiple teams in the respective season. Right at the start of the 2006 file, Jim Jackson is listed with an age of 70, because he played for the Suns and the Lakers.
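
One way to avoid this when merging a player's per-team rows is to sum only the counting stats, take age once, and recompute percentages from the summed makes and attempts. The row layout (dicts with 'age', 'fgm', 'fga') and the function name are hypothetical, and the sketch assumes age is identical across a player's rows within a season.

```python
def combine_team_rows(rows):
    """Merge one player's per-team rows for a season into a single row.

    Each row is a dict with hypothetical keys 'age', 'fgm', 'fga'.
    Counting stats are summed; age and percentages are not.
    """
    fgm = sum(r["fgm"] for r in rows)
    fga = sum(r["fga"] for r in rows)
    return {
        "age": rows[0]["age"],  # assumed equal in every row, so taken once
        "fgm": fgm,
        "fga": fga,
        "fg_pct": fgm / fga if fga else None,  # recomputed, never summed
    }
```

With two rows at age 35 this returns 35 rather than 70, and a 10-for-40 stint plus a 30-for-40 stint gives 0.5 rather than a summed 1.0.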

Re: Ideas for data to extract from PBP

Posted: Tue Dec 04, 2012 9:17 pm
by KAN
I'm not sure if it is too late for this, but it would be nice to have counterpart stats from the play-by-play data publicly available, similar to what 82games.com does.

Re: Ideas for data to extract from PBP

Posted: Wed Dec 05, 2012 11:50 am
by J.E.
mystic wrote:Age: you are adding up the age as well for players who played on multiple teams in the respective season. Right at the start of the 2006 file, Jim Jackson is listed with an age of 70, because he played for the Suns and the Lakers.
Right, thanks. %s are also messed up for players that played on multiple teams.
KAN wrote:I'm not sure if it is too late for this, but it would be nice to have counterpart stats from the play-by-play data publicly available, similar to what 82games.com does.
You can never tell for sure who is defending whom via PBP, so I'm not too excited about trying to extract counterpart data