Advanced Statistical Plus/Minus (DSMok1, 2010)
Posted: Fri Apr 15, 2011 12:50 am
recovered page 1 of 7
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Wed Jul 14, 2010 4:39 pm Post subject: Advanced Statistical Plus/Minus Reply with quote
Advanced Statistical Plus/Minus
I've been working on deriving a new SPM regression based purely upon "advanced" stats (like TS% and OR%) for some time now. I feel comfortable enough with the results thus far to release the first iteration of this SPM.
The data used: Neil Paine's collection of 1-Yr APM's (unfortunately without std err's; I estimated the standard errors for weighting purposes), Joe Sill's 4 Year RAPM's, with the regression toward 0 backed out, and finally (and most importantly) Steve Ilardi's 6-Year APM's posted on this forum. These 6 Year APM's had quite low errors, and provided the groundwork for this regression. I weighted each player in the regression by 1/stderr^2, where stderr is their APM standard error.
I then compiled the advanced metrics from the Basketball Reference Play Index for each player, and weighted-averaged the multi-year data (including playoffs, for the APM's that included those). Thus I have 3 APM data sets and the associated advanced statistics.
I experimented with a number of constructions for the rebounding and especially the scoring parts of this regression. Finding a good way to relate turnovers, shooting, usage, and assists proved elusive for some time. I finally have a construction I am comfortable with, though (like with any construction) there are a few holes.
To avoid over-weighting steals and blocks for defense, I also included offensive rating and defensive rating of the teams. This is not included in the final SPM, because the team adjustment (to make the teams sum to their efficiency differentials) already accounts for this.
Here are the factors in this regression:
Code:
Factor Value
TRB% 1.33823090
TRB^2 -0.08918572
TRB^3 0.00219790
STL% 1.43951052
BLK% 0.35237880
MPG 0.10099403
TO% Coeff 0.66920540
PPP Threshold 1.64758151
PPP USG Scale 0.01394727
PPP AST Scale 0.01005596
Scoring 0.55728095
USG Const 4.67604494
Intercept -6.90680060
Let me explain.
First of all, note the rebounding terms. I discovered that the value of splitting rebounding into offensive and defensive was much less than that of adding this nonlinearity (which didn't work when ORB and DRB were split). Basically, in the neighborhood of 10%, there isn't a huge amount of change. A player that gets very few rebounds hurts the team a lot, and a player near 20% rebounds helps quite a bit. Here's a quick table:
Code:
TRB% Pts
0 0.00
2.5 2.82
5 4.74
7.5 5.95
10 6.66
12.5 7.09
15 7.42
17.5 7.89
20 8.67
22.5 10.00
25 12.06
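The table above can be reproduced from the three TRB coefficients in the factor list. This is only a minimal sketch: the function name `rebounding_points` is made up for illustration, and it assumes the rebounding value is simply the cubic evaluated at the whole-number TRB%.

```python
# Coefficients copied from the factor table in the post.
TRB_LINEAR = 1.33823090
TRB_SQUARED = -0.08918572
TRB_CUBED = 0.00219790


def rebounding_points(trb_pct):
    """Cubic rebounding term; trb_pct is a whole-number percentage (10 means 10%)."""
    return (TRB_LINEAR * trb_pct
            + TRB_SQUARED * trb_pct ** 2
            + TRB_CUBED * trb_pct ** 3)


if __name__ == "__main__":
    # Step through 0, 2.5, 5, ..., 25 as in the table.
    for tenths in range(0, 251, 25):
        trb = tenths / 10
        print(f"{trb:5.1f} {rebounding_points(trb):6.2f}")
```

The flat stretch around 10% and the upturn past 20% both come from the cubic term, which is also why extreme rebounders sit far out on the curve.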
Next: steals, blocks and MPG. These are all straightforward, linear terms. Be aware, though: I'm inputting these percentages throughout in their whole-number forms, like Basketball-Reference outputs them.
Charges taken would be added into the steals term--other research I've done shows them to be equivalent in SPM terms (1 ChgTkn = 1 Steal). I'm trying to make this SPM able to be applied historically; thus I've left that out.
Here's the complicated part: the scoring term. First the actual formula:
Code:
{TS%*2*(1-TO%/100) - TO%Coeff*(TO%/100) - (PPPThreshold - PPPUSGScale*USG% - PPPASTScale*AST%)}*(USG% + USGConst)*Scoring
What's going on here? First of all, this is basically an efficiency*USG term. It takes into account TS%, USG%, TO%, and AST% to create a composite scoring value.
Now, term by term. The True Shooting term is very basic. It gives the number of points scored per possession used by the player. Next, the turnover term provides the penalty for each turnover. These terms make up the efficiency side of the equation.
Next, the PPP (Points per Possession) threshold and modifiers. The threshold is just a baseline constant. The usage modifier is then subtracted from it, so the bar a player has to clear drops as his usage rises; the regression indicates a clear benefit to higher usage, roughly 0.1 PPP for every 7 percentage points of USG%. Finally, the assist modifier. This is the ONLY place in the regression that has assists included. It was not significant anywhere else I tried it, compared to this location in the regression. Assists also modify the PPP; when everything is multiplied through, the assists basically go to the form AST%*(USG%+Constant), which is a reasonable construction.
Finally, the whole (PPP - PPPThreshold) term is multiplied by (USG% + USGConst). Again, we're using whole percentages, everywhere but with TS% (I'm following Basketball Reference on this). Because of the USGConst, even if a player has NO usage, he still gets some credit for assists. Just not very much. In other words, Steve Blake just isn't that great.
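For anyone trying to replicate this, the scoring formula and the factor table can be sketched in Python as below. Two things are assumptions on my part rather than statements from the post: that the raw SPM is the plain sum of the rebounding, steal, block, MPG and scoring terms plus the intercept, and the made-up stat line used in the demo. TS% goes in as a fraction (0.56); everything else goes in as a whole-number percentage. The team adjustment is not applied here.

```python
# Coefficients copied from the factor table in the post.
COEF = {
    "trb": 1.33823090, "trb2": -0.08918572, "trb3": 0.00219790,
    "stl": 1.43951052, "blk": 0.35237880, "mpg": 0.10099403,
    "to_coeff": 0.66920540, "ppp_threshold": 1.64758151,
    "ppp_usg_scale": 0.01394727, "ppp_ast_scale": 0.01005596,
    "scoring": 0.55728095, "usg_const": 4.67604494,
    "intercept": -6.90680060,
}


def scoring_term(ts, usg, tov, ast, c=COEF):
    """Efficiency-times-usage term. ts is a fraction; usg, tov, ast are whole percentages."""
    ppp = ts * 2 * (1 - tov / 100)                 # points per possession used
    turnover_penalty = c["to_coeff"] * (tov / 100)
    threshold = (c["ppp_threshold"]
                 - c["ppp_usg_scale"] * usg
                 - c["ppp_ast_scale"] * ast)
    efficiency = ppp - turnover_penalty - threshold
    return efficiency * (usg + c["usg_const"]) * c["scoring"]


def rebounding_term(trb, c=COEF):
    """Cubic in whole-number TRB%."""
    return c["trb"] * trb + c["trb2"] * trb ** 2 + c["trb3"] * trb ** 3


def raw_spm(ts, usg, tov, ast, trb, stl, blk, mpg, c=COEF):
    # Assumption: the terms are simply added together with the intercept.
    return (rebounding_term(trb, c)
            + c["stl"] * stl + c["blk"] * blk + c["mpg"] * mpg
            + scoring_term(ts, usg, tov, ast, c)
            + c["intercept"])


if __name__ == "__main__":
    # Hypothetical stat line, not a real player.
    made_up = dict(ts=0.56, usg=22.0, tov=13.0, ast=15.0,
                   trb=10.0, stl=1.5, blk=1.0, mpg=30.0)
    print(round(raw_spm(**made_up), 2))
```

Because of the (USG% + USGConst) multiplier, each extra point of AST% is worth PPPASTScale*(USG% + USGConst)*Scoring, so a zero-usage passer still gets a small credit.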
Finally, after compiling the RAW SPM, the team adjustment must be applied. This can range from negligible (Cleveland, Boston, and Utah had 0 team adjustments this year) to quite large (+1.36 for ORL, -1.43 for GSW). Mostly defense is what is accounted for by the team adjustment since it is not captured well by the regression.
Here is a sample of the results--the top 20 in SPM, minimum 1000 minutes:
Code:
Rnk Tm Player G MP SPM
1 CLE LeBron James 76 2966 12.16
2 MIA Dwyane Wade 77 2792 9.69
3 NOH Chris Paul 45 1712 6.51
4 ORL Dwight Howard 82 2843 6.31
5 SAS Manu Ginobili 75 2150 5.57
6 LAL Kobe Bryant 73 2835 5.38
7 OKC Kevin Durant 82 3239 5.32
8 SAS Tim Duncan 78 2438 5.21
9 BOS Rajon Rondo 81 2963 4.82
10 LAC Marcus Camby 51 1596 4.81
11 UTA Deron Williams 76 2802 4.42
12 ATL Josh Smith 81 2871 4.32
13 DAL Dirk Nowitzki 81 3039 4.07
14 LAL Pau Gasol 65 2403 4.01
15 UTA Carlos Boozer 78 2673 3.99
16 WAS Gilbert Arenas 32 1169 3.90
17 DEN Nene Hilario 82 2755 3.77
18 TOR Chris Bosh 70 2526 3.62
19 CHA Gerald Wallace 76 3119 3.53
20 DEN Carmelo Anthony 69 2634 3.51
The full results for 2009-2010 regular season are here: Google Spreadsheet: Advanced SPM 09-10
EDIT: See later in this thread for revisions to this method and a complete spreadsheet to play with.
Last edited by DSMok1 on Tue Oct 26, 2010 12:11 pm; edited 1 time in total
Ilardi
Joined: 15 May 2008
Posts: 263
Location: Lawrence, KS
PostPosted: Wed Jul 14, 2010 4:55 pm Post subject: Reply with quote
Nice work, DSM: this looks like an important contribution.
A couple of quick questions:
a) Can you provide standard error (se) estimates for the SPM values?
b) Did you consider using any of the advanced metrics from 82games? I've always thought eFG% Allowed would be quite useful in an SPM model . . .
c) What is the correlation between your SPM values for each player and his corresponding APM value? (i.e., the zero-order correlation for the entire league)
d) Any plans for "out-of-sample testing" on this new SPM metric (a la Joe Sill)?
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Wed Jul 14, 2010 5:38 pm Post subject: Reply with quote
Ilardi wrote:
Nice work, DSM: this looks like an important contribution.
A couple of quick questions:
a) Can you provide standard error (se) estimates for the SPM values?
b) Did you consider using any of the advanced metrics from 82games? I've always thought eFG% Allowed would be quite useful in an SPM model . . .
c) What is the correlation between your SPM values for each player and his corresponding APM value? (i.e., the zero-order correlation for the entire league)
d) Any plans for "out-of-sample testing" on this new SPM metric (a la Joe Sill)?
Good to see you around, Ilardi!
a) How would I go about developing them for a nonlinear model? I would love to, but haven't figured out how. Another issue with the standard errors is that the APM against which we are regressing has error within it (which I think biases the error on the regression upwards).
b) I wanted to make this metric as useful historically as possible. Basketball Reference has all of the stats used in this regression back to 1977. A more intricate SPM is possible, using things like eFG% allowed, location of assists, etc.
c) I can run that... should I do it just on the low-error six season sample?
d) That would be tough for me to do. I don't have a lot of samples to work with.
Thanks for the input!
Ilardi
Joined: 15 May 2008
Posts: 263
Location: Lawrence, KS
PostPosted: Wed Jul 14, 2010 6:01 pm Post subject: Reply with quote
Thanks: and most guys on the forum call me 'Steve'.
I'd have to get a consult to figure out how to calculate se's on a nonlinear metric like that, but I know it must be do-able. Perhaps someone on this forum can point the way to a workable approach?
As for the correlation between SPM and APM, I might suggest using the 08-09 season, for which you have my 6-season estimates (weighted heavily toward 08-09), as well as your own SPM values.
On the out-of-sample test: presumably it would be possible to calculate SPM values for each player based on games through, say, the first 4 months of last season, and then use those estimates to predict results of the final 2 months. (Same basic approach Joe used with his ridge regression APM numbers.) It would be a fair amount of work, but should be easily do-able, at least in principle.
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Wed Jul 14, 2010 11:47 pm Post subject: Reply with quote
Ilardi wrote:
Thanks: and most guys on the forum call me 'Steve'.
I'd have to get a consult to figure out how to calculate se's on a nonlinear metric like that, but I know it must be do-able. Perhaps someone on this forum can point the way to a workable approach?
As for the correlation between SPM and APM, I might suggest using the 08-09 season, for which you have my 6-season estimates (weighted heavily toward 08-09), as well as your own SPM values.
On the out-of-sample test: presumably it would be possible to calculate SPM values for each player based on games through, say, the first 4 months of last season, and then use those estimates to predict results of the final 2 months. (Same basic approach Joe used with his ridge regression APM numbers.) It would be a fair amount of work, but should be easily do-able, at least in principle.
I'd love to figure out how to do standard errors on nonlinear metrics.
I'll look into the correlation for the data you suggested, when I have time.
I still have issues with the out of sample test, because it is replacing a descriptive stat with a predictive stat--which is why the ridge regression technique provided the best out-of-sample results. It's basically regression to the mean. When I do regression, I'm going to use the samples, with their error, and regress in a Bayesian manner toward a prior based on peripheral data.
Ilardi
Joined: 15 May 2008
Posts: 263
Location: Lawrence, KS
PostPosted: Thu Jul 15, 2010 12:18 pm Post subject: Reply with quote
DSMok1 wrote:
I still have issues with the out of sample test, because it is replacing a descriptive stat with a predictive stat--which is why the ridge regression technique provided the best out-of-sample results. It's basically regression to the mean. When I do regression, I'm going to use the samples, with their error, and regress in a Bayesian manner toward a prior based on peripheral data.
But isn't the utility of any metric linked in large part to its predictive ability? Certainly, in the natural sciences, the valid prediction of phenomena is regarded as the sine qua non of the entire enterprise, so I'm admittedly a bit biased, but suffice it to say that even NBA decision makers realize that it's much more valuable to have a stat that gives accurate prediction than one that merely provides accurate description.
Also, although ridge regression makes use of 'regression to the mean', it does so in a limited way - essentially by simply reining in outlier values via an a priori (Bayesian) determination that they are unlikely. In my view, it's an extremely clever technique for enhancing the 'signal' of player APM values via tamping down the 'noise' of extreme variations in efficiency from one low-minute lineup to another.
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Thu Jul 15, 2010 1:00 pm Post subject: Reply with quote
Ilardi wrote:
DSMok1 wrote:
I still have issues with the out of sample test, because it is replacing a descriptive stat with a predictive stat--which is why the ridge regression technique provided the best out-of-sample results. It's basically regression to the mean. When I do regression, I'm going to use the samples, with their error, and regress in a Bayesian manner toward a prior based on peripheral data.
But isn't the utility of any metric linked in large part to its predictive ability? Certainly, in the natural sciences, the valid prediction of phenomena is regarded as the sine qua non of the entire enterprise, so I'm admittedly a bit biased, but suffice it to say that even NBA decision makers realize that it's much more valuable to have a stat that gives accurate prediction than one that merely provides accurate description.
Also, although ridge regression makes use of 'regression to the mean', it does so in a limited way - essentially by simply reining in outlier values via an a priori (Bayesian) determination that they are unlikely. In my view, it's an extremely clever technique for enhancing the 'signal' of player APM values via tamping down the 'noise' of extreme variations in efficiency from one low-minute lineup to another.
I'm not disputing the value of prediction. However, I'd like to do that AFTER the SPM is calculated. In other words, construct a SPM, THEN apply the Bayesian regression to estimate "true talent", then combine with previous years to create a projection. I simply want the SPM itself to not be "biased" with information outside of actual production numbers.
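One common way to do the "regress toward a prior" step described above is a precision-weighted blend under a normal-normal model. The sketch below is only an illustration of that general idea, not the method used in this thread; the function name and the numbers in the demo are hypothetical.

```python
def shrink_toward_prior(estimate, estimate_se, prior_mean, prior_sd):
    """Blend a noisy estimate with a prior, weighting each by 1/variance.

    Assumes both the estimate's error and the prior are normal.
    Returns (posterior_mean, posterior_sd).
    """
    w_est = 1.0 / estimate_se ** 2
    w_prior = 1.0 / prior_sd ** 2
    posterior_mean = (w_est * estimate + w_prior * prior_mean) / (w_est + w_prior)
    posterior_sd = (w_est + w_prior) ** -0.5
    return posterior_mean, posterior_sd


if __name__ == "__main__":
    # Hypothetical: a +10 rating with standard error 2, prior centered on 0 with sd 2.
    print(shrink_toward_prior(10.0, 2.0, 0.0, 2.0))
```

The noisier the single-season number, the harder it gets pulled toward the prior, which keeps the descriptive SPM and the "true talent" estimate as two separate steps.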
I agree that RAPM works very well, but it does have a few quirks. Like Anderson Varejao getting very highly rated because it is so unlikely that LeBron is really a +11 player.
Ilardi
Joined: 15 May 2008
Posts: 263
Location: Lawrence, KS
PostPosted: Sat Jul 17, 2010 10:40 am Post subject: Reply with quote
I've also had Varejao rated highly using a more traditional APM approach . . .
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Sat Jul 17, 2010 12:27 pm Post subject: Reply with quote
Ilardi wrote:
I've also had Varejao rated highly using a more traditional APM approach . . .
15th is pretty high. That's where the 4-year RAPM had him. Don't you think using the Bayesian prior in that way could cause some odd effects like that?
Also--would it be possible to get from you a 4 year, regular season only APM, through 09-10? Then I could use the advanced stats collected by Hoopdata in that time to run a more comprehensive SPM.
_________________
GodismyJudgeOK.com/DStats
Neil Paine
Joined: 13 Oct 2005
Posts: 774
Location: Atlanta, GA
PostPosted: Sat Jul 17, 2010 2:50 pm Post subject: Reply with quote
Great work, DSMok!! I'm trying to replicate your work, and I had a question: how are you doing the team adjustment? What I always did was to find the minute-weighted average of each team's SPM and multiply by 5, then subtract that from the team's actual efficiency differential and divide the result by 5. But when I do that, my team adjustments don't match yours (ORL is +1.26, GSW is -1.70). Is it a rounding issue (I'm using the full, calculated versions of the BBR stats, while you used rounded versions), or is my team adjustment method incorrect?
_________________
http://www.basketball-reference.com/blog/
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Sat Jul 17, 2010 4:02 pm Post subject: Reply with quote
Neil Paine wrote:
Great work, DSMok!! I'm trying to replicate your work, and I had a question: how are you doing the team adjustment? What I always did was to find the minute-weighted average of each team's SPM and multiply by 5, then subtract that from the team's actual efficiency differential and divide the result by 5. But when I do that, my team adjustments don't match yours (ORL is +1.26, GSW is -1.70). Is it a rounding issue (I'm using the full, calculated versions of the BBR stats, while you used rounded versions), or is my team adjustment method incorrect?
I didn't use the team efficiency precisely. I summed to a blend of 2/3 SRS and 1/3 efficiency differential. Is the SRS calculated from efficiency differentials or point margins? If it is calculated off of efficiency differentials per game, it should be the best thing to sum to. I think it's point differential, which is why I used the average I did. But whatever you choose to sum to, that's up to you.
I'm glad you're doing this! You've got all of the data for compiling a full list and actually doing the team adjustments correctly.
I'm hoping this doesn't undervalue great centers--because there weren't any in the time period I used for the regression, I don't know if the top end of the regression can capture them. Then again, I don't know how much a great center truly contributed, either.
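The team adjustment discussed above can be sketched as follows. It follows Neil's description (minute-weighted average SPM times 5, subtracted from the target margin, divided by 5) and the 2/3 SRS plus 1/3 efficiency differential blend. The function names and the sample numbers are made up for illustration.

```python
def blended_target(srs, efficiency_diff):
    """Target margin: 2/3 SRS plus 1/3 efficiency differential."""
    return (2.0 * srs + efficiency_diff) / 3.0


def team_adjustment(players, target_margin):
    """players is a list of (raw_spm, minutes) pairs for one team.

    Returns the constant added to every player's raw SPM so that
    5 * (minute-weighted average SPM) equals target_margin.
    """
    total_minutes = sum(minutes for _, minutes in players)
    weighted_avg = sum(spm * minutes for spm, minutes in players) / total_minutes
    return (target_margin - 5.0 * weighted_avg) / 5.0


if __name__ == "__main__":
    # Hypothetical two-player "team", just to show the arithmetic.
    roster = [(2.0, 1000), (0.0, 1000)]
    target = blended_target(srs=6.0, efficiency_diff=6.0)
    print(team_adjustment(roster, target))
```

Swapping `blended_target` for the plain efficiency differential gives the version that sums to efficiency differential alone, so the choice of target is the only thing that changes between the two approaches.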
_________________
GodismyJudgeOK.com/DStats
Ilardi
Joined: 15 May 2008
Posts: 263
Location: Lawrence, KS
PostPosted: Sat Jul 17, 2010 4:59 pm Post subject: Reply with quote
DSMok1 wrote:
Also--would it be possible to get from you a 4 year, regular season only APM, through 09-10? Then I could use the advanced stats collected by Hoopdata in that time to run a more comprehensive SPM.
I haven't run it yet, but maybe your request will be just the catalyst I need. Is it really the case that no one else out there has put out any publicly available multi-year APM stats? Are Aaron's 2-year APM stats on basketballvalue.com all there is? If that's the case, I really will try to carve out the time to work on this . . .
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Sat Jul 17, 2010 5:28 pm Post subject: Reply with quote
Ilardi wrote:
DSMok1 wrote:
Also--would it be possible to get from you a 4 year, regular season only APM, through 09-10? Then I could use the advanced stats collected by Hoopdata in that time to run a more comprehensive SPM.
I haven't run it yet, but maybe your request will be just the catalyst I need. Is it really the case that no one else out there has put out any publicly available multi-year APM stats? Are Aaron's 2-year APM stats on basketballvalue.com all there is? If that's the case, I really will try to carve out the time to work on this . . .
I don't know of any other APM's out there, now that the RAPM was taken down. The one I could use would be an "average" APM over the last 4 years.
_________________
GodismyJudgeOK.com/DStats
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Jul 19, 2010 9:53 am Post subject: Reply with quote
Neil Paine wrote:
Yeah, SRS is actually just SOS-adjusted point differential per game, which means it's not tempo-independent (we do it that way because we only have game possessions going back to 1986-87). If we had historical SOS-adjusted efficiency differential, that would definitely be the thing to sum to, but since we don't, I'm probably going to just sum to efficiency differential (which is what APM does anyway).
In case I didn't say so, I like this new regression a lot! The most glaring problem with the old regression was that it drastically overvalued assists (and therefore PGs -- I found that the average PG was +1 or so while every other position was near zero), but it looks like you fixed this by tying AST% to the scoring term instead of having it stand alone. I would imagine this retrodicts better than the old regression as a result.
...
One troubling result at a first glance is that Dennis Rodman's 1995 is +10.91, one of the greatest seasons of all time... Maybe the rebounding term needs to be re-evaluated?
(Posted from an email)
It looks like the rebounding term will need some more work. The cubic works for just about everyone, except Rodman. He breaks the regression. The 30% TRB% is way out into the nonlinear term, and is worth like 9 points more than D-Howard's 22% TRB%.
Here are a few possibilities:
The cubic is the best fit, but only by a hair. The power + Sq is very close in terms of fit, and would probably be my pick to use. The pure power curve can't capture the desired up-turn at higher TRB% rates, but in terms of fit is still very close to power+Sq (most of the difference is out where there aren't any observations). The linear is here for reference (incidentally, TRB% by itself is a BETTER fit than ORB% and DRB% split out).
_________________
GodismyJudgeOK.com/DStats
Mike G
Joined: 14 Jan 2005
Posts: 3573
Location: Hendersonville, NC
PostPosted: Mon Jul 19, 2010 10:19 am Post subject: Reply with quote
DSMok1 wrote:
...
I'm hoping this doesn't undervalue great centers...
This is reminiscent of a discussion starting with Nazr Mohammed (this year), that many of his rates resembled prime Moses, Gilmore, and others. Low versatility index the likely culprit.
It was asked then whether those known greats were also undervalued by the (then most current) SPM method.
If versatile less-than-great centers (Divac, Daugherty) seem to be better than Moses or Artis, then maybe it needs to be fixed?
page 2 of 7
erivera7
Joined: 19 Jan 2009
Posts: 185
Location: Chicago, IL
PostPosted: Wed Dec 08, 2010 12:57 pm Post subject: Reply with quote
Ah, okay. Thanks for making the data readily available for the public! Much appreciated.
_________________
@erivera7
I cover the Orlando Magic - Magic Basketball
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Wed Dec 15, 2010 11:31 am Post subject: Reply with quote
I'm preparing to evaluate salaries and contracts, so I'm working on long-term projections based upon the latest updated true-talent level.
Obviously not all players will age the same; some of the young guns will probably jump to transcendent status like CP3, LeBron, and D-Wade did early on. We just don't know who yet!
Here's the pretty picture:
Those are the top 25 projected players in the 2015-2016 season.
Note some of the rookies from this year moving up! Also note OKC's roster--3 players present.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Dec 20, 2010 7:32 pm Post subject: Reply with quote
Updated Spreadsheet: https://docs.google.com/leaf?id=0Bx1NfC ... MzRl&hl=en
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
EvanZ
Joined: 22 Nov 2010
Posts: 295
PostPosted: Mon Dec 20, 2010 9:13 pm Post subject: Reply with quote
DSMok1 wrote:
Obviously not all players will age the same; some of the young guns will probably jump to transcendent status like CP3, LeBron, and D-Wade did early on. We just don't know who yet!
I hope CP3 is around in 5 years, but his knees may be mush by then.
_________________
http://www.thecity2.com
http://www.ibb.gatech.edu/evan-zamir
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Dec 20, 2010 9:22 pm Post subject: Reply with quote
EvanZ wrote:
DSMok1 wrote:
Obviously not all players will age the same; some of the young guns will probably jump to transcendent status like CP3, LeBron, and D-Wade did early on. We just don't know who yet!
I hope CP3 is around in 5 years, but his knees may be mush by then.
Yeah. I'd expect outliers like him to age more rapidly.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
page 3 of 7
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Thu Jul 22, 2010 11:12 am Post subject: Reply with quote
DSMok1 wrote:
Here's the regression, with the TRB power term and a linear term only. Honestly, the fit is similar enough that the TRB^2 additional term is probably purely overfitting. (You were right, Rhuidean.)
Here is that regression:
Code:
TRB% -0.114650868
TRBPower 6.762898760
TRBExponent 0.284850151
STL% 1.484282673
BLK% 0.333514467
MPG 0.103521312
TO% Coeff 0.628638746
PPP Threshold 1.634054153
PPP USG Scale 0.013865918
PPP AST Scale 0.009744137
Scoring 0.579187325
USG Const 3.910166612
Intercept -12.466987594
I discovered a flaw in my calculations; I had a slight regression-to-replacement applied to the APM. Removing it doesn't change the SPM terms much, but slightly increases the overall spread. Corrected SPM:
Code:
TRB% -0.11262729
TRBPower 7.21103324
TRBExponent 0.27533883
STL% 1.52798518
BLK% 0.34027348
MPG 0.10394047
TO% Coeff 0.63662985
PPP Threshold 1.64666575
PPP USG Scale 0.01397684
PPP AST Scale 0.00970050
Scoring 0.58831902
USG Const 4.28737016
Intercept -12.78185481
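To see why the power form tames the Rodman problem, here is a rough comparison of the original cubic and the corrected power-plus-linear rebounding terms at 22% and 30% TRB%. It assumes the power term enters as TRB%*coeff + TRBPower*TRB%^TRBExponent, which is my reading of the coefficient names. The two regressions have different intercepts, so only the gap between two players is comparable, not the levels.

```python
def cubic_rebounding(trb):
    """Original cubic rebounding term (whole-number TRB%)."""
    return 1.33823090 * trb - 0.08918572 * trb ** 2 + 0.00219790 * trb ** 3


def power_rebounding(trb):
    """Corrected model: linear term plus power term (assumed form)."""
    return -0.11262729 * trb + 7.21103324 * trb ** 0.27533883


if __name__ == "__main__":
    for name, fn in (("cubic", cubic_rebounding), ("power", power_rebounding)):
        gap = fn(30.0) - fn(22.0)
        print(f"{name}: 30% TRB is worth {gap:.2f} more than 22% TRB")
```

Under the cubic the gap is more than nine points; under the power curve it shrinks to well under one point.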
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Thu Jul 22, 2010 12:48 pm Post subject: Reply with quote
I just went through and estimated OSPM and DSPM from the same data, except that the 1 year estimates did not include OAPM and DAPM split up.
Here is Offensive SPM:
Code:
TRB% -0.09696127
TRBPower 12.81133914
TRBExponent 0.11844250
STL% 0.27096567
BLK% -0.08060675
MPG 0.05489518
TO% 0.53648983
PPP Thresh 1.25957931
PPP USG Scale 0.00884937
PPP AST Scale 0.00845557
Scoring 0.59905610
USG Coeff 8.78684193
Intercept -16.03676076
And Defensive SPM:
Code:
TRB% -0.07702830
TRBPower 15.09091745
TRBExponent -0.06311682
STL% -1.23381652
BLK% -0.41721519
MPG -0.04970989
TO% -0.05031619
PPP Thresh -4.96111508
PPP USG Scale -0.05976497
PPP AST Scale 0.00216782
Scoring 0.06565780
USG Coeff 29.80487400
Intercept -23.96225822
1) On the Defensive SPM, NEGATIVE IS GOOD.
2) I'm sure some of these variables are insignificant.
3) Splitting into offensive and defensive rebounding could be beneficial for these, I suppose.
4) These do not sum perfectly to the SPM regression above. Usually within .05, however.
5) The Defensive SPM can do some weird things with low-minute players, it seems (with the negative (=good) intercept).
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
Guy
Joined: 02 May 2007
Posts: 128
PostPosted: Fri Jul 23, 2010 10:37 am Post subject: Reply with quote
DSMok1:
Could you tell us the marginal value of one unit of each main boxscore stat under your current SPM?
And a question: have you ever looked to see if two separate models, one for C/F and one for Guards, improves predictive power at all? I assume it doesn't or someone would have done it by now, but intuitively it seems like coefficients might change.
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Jul 23, 2010 10:46 am Post subject: Reply with quote
Guy wrote:
DSMok1:
Could you tell us the marginal value of one unit of each main boxscore stat under your current SPM?
And a question: have you ever looked to see if two separate models, one for C/F and one for Guards, improves predictive power at all? I assume it doesn't or someone would have done it by now, but intuitively it seems like coefficients might change.
The marginal value of one unit of each boxscore stat depends. I'm using advanced stats (obviously) to calculate, so even the linear terms vary depending on, for example, how many 2's were taken by the opponent (for the block term).
For rebounds, it is a power curve, so not equal. Somewhere in the area of 0.23 points per each % of TRB added, but obviously varying.
For the scoring term, it's a 4 dimensional surface, so rather hard to plot. Maybe I can put up a spreadsheet calculator to experiment with.
As for splitting up... well, I haven't worked with one with this regression. The problem is that positions are both continuous and ambiguous. I can tell a PG away from a Center, but not always a PG away from a SF. (What was LeBron, anyway?). Because of that fact, I generally feel it's not useful.
Besides, the results from the current regression seem to be doing quite well. The offensive side mimics APM well; the defensive side is limited by the terms to input.
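The marginal rebounding value mentioned above is just the slope of the rebounding curve. A small sketch, assuming the corrected power-plus-linear coefficients posted earlier in the thread are the ones in use (the function name is hypothetical):

```python
TRB_LINEAR = -0.11262729
TRB_POWER = 7.21103324
TRB_EXPONENT = 0.27533883


def marginal_rebounding(trb):
    """Derivative of TRB_LINEAR*trb + TRB_POWER*trb**TRB_EXPONENT with respect to trb."""
    return TRB_LINEAR + TRB_POWER * TRB_EXPONENT * trb ** (TRB_EXPONENT - 1.0)


if __name__ == "__main__":
    for trb in (5.0, 10.0, 15.0, 20.0):
        print(f"TRB% {trb:4.1f}: {marginal_rebounding(trb):.3f} points per extra 1% TRB")
```

The slope falls as TRB% rises, which is why no single per-rebound number applies to every player.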
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
Guy
Joined: 02 May 2007
Posts: 128
PostPosted: Fri Jul 23, 2010 11:31 am Post subject: Reply with quote
Thanks. So if I'm following, one rebound adds something like .25-.30 points. Would be interesting to know implied value of a marginal assist as well (assuming average team/opponent).
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Jul 23, 2010 11:45 am Post subject: Reply with quote
Guy wrote:
Thanks. So if I'm following, one rebound adds something like .25-.30 points. Would be interesting to know implied value of a marginal assist as well (assuming average team/opponent).
Assists are totally wrapped up in the scoring term. I just picked a random player (J.J. Redick, in this case) and added another % to his AST%. His SPM went up 0.13.
But that will vary mightily based on USG%, because the assist term is multiplied by usage.
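Since AST% only appears inside the scoring term, the value of one extra point of AST% works out to PPPASTScale*(USG% + USGConst)*Scoring. A quick sketch using the corrected coefficients (assumed to be the ones in use here); the usage rates in the demo are hypothetical inputs, not figures from the thread:

```python
PPP_AST_SCALE = 0.00970050
SCORING = 0.58831902
USG_CONST = 4.28737016


def marginal_assist(usg):
    """SPM gained per +1 in AST% at a given whole-number USG%."""
    return PPP_AST_SCALE * (usg + USG_CONST) * SCORING


if __name__ == "__main__":
    for usg in (10.0, 18.5, 30.0):
        print(f"USG% {usg:4.1f}: +{marginal_assist(usg):.3f} SPM per extra 1% AST")
```

A usage rate around 18.5% gives roughly the 0.13 quoted above, and a 30% usage player gets noticeably more out of the same bump in AST%.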
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Jul 23, 2010 1:06 pm Post subject: Reply with quote
For interest's sake, an example of this new SPM in action:
NBA Finals, 09-10:
Code:
Player                SPM   %Min  Contrib   VORP  NBA SPM  NBA VORP
Kobe Bryant          5.91  85.8%     5.06   7.64     7.51      9.01
Pau Gasol            4.82  87.3%     4.21   6.83     6.42      8.22
Ron Artest          -1.43  74.7%    -1.07   1.17     0.17      2.37
Lamar Odom          -1.30  57.2%    -0.74   0.97     0.30      1.89
Derek Fisher        -1.99  63.7%    -1.27   0.64    -0.39      1.66
Sasha Vujacic        0.85  15.5%     0.13   0.60     2.45      0.84
Andrew Bynum        -2.19  52.1%    -1.14   0.42    -0.59      1.25
DJ Mbenga          -13.14   0.9%    -0.12  -0.09   -11.54     -0.08
Josh Powell        -12.23   2.4%    -0.29  -0.22   -10.63     -0.18
Shannon Brown       -4.68  25.0%    -1.17  -0.42    -3.08     -0.02
Jordan Farmar       -4.88  26.2%    -1.28  -0.49    -3.28     -0.07
Luke Walton         -9.49   9.2%    -0.88  -0.60    -7.89     -0.45

Player                SPM   %Min  Contrib   VORP  NBA SPM  NBA VORP
Kevin Garnett        4.70  66.1%     3.11   5.09     6.30      6.15
Rajon Rondo          2.87  81.0%     2.32   4.75     4.47      6.05
Paul Pierce         -0.13  82.8%    -0.11   2.37     1.47      3.70
Glen Davis          -0.15  42.9%    -0.06   1.22     1.45      1.91
Rasheed Wallace     -1.02  42.9%    -0.44   0.85     0.58      1.54
Nate Robinson       -0.37  21.1%    -0.08   0.56     1.23      0.89
Kendrick Perkins    -2.02  42.0%    -0.85   0.41    -0.42      1.08
Marquis Daniels      1.04   1.2%     0.01   0.05     2.64      0.07
Tony Allen          -3.03  30.7%    -0.93  -0.01    -1.43      0.48
Brian Scalabrine   -16.39   0.3%    -0.05  -0.04   -14.79     -0.04
Michael Finley     -22.93   1.5%    -0.34  -0.30   -21.33     -0.27
Ray Allen           -3.50  82.2%    -2.87  -0.41    -1.90      0.91
Shelden Williams   -21.50   5.4%    -1.15  -0.99   -19.90     -0.91
What I'm showing is the SPM (already team-adjusted to match the efficiency differential, though this adjustment was less than 0.1 for each team), the share of minutes played (%Min), the contribution (SPM*%Min), the VORP ((SPM+3)*%Min), and the SPM and VORP adjusted for the level of competition (I assumed that these teams were playing at a +8 efficiency differential level, i.e. +1.6 per player, and added that to the SPM).
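The derived columns can be reproduced with a few lines. This is only a sketch: the function and constant names are hypothetical, and reading the +8 team differential as +1.6 per player (8 divided over five players) is an inference from the constant 1.60 gap between the SPM and NBA SPM columns.

```python
REPLACEMENT_OFFSET = 3.0      # VORP as posted: (SPM + 3) * %Min
COMPETITION_BONUS = 8.0 / 5   # assumed: +8 team differential over 5 players


def finals_columns(spm: float, pct_min: float) -> dict:
    """Return the derived table columns for one player.

    pct_min is the share of team minutes as a fraction (85.8% -> 0.858).
    """
    nba_spm = spm + COMPETITION_BONUS
    return {
        "contrib": spm * pct_min,
        "vorp": (spm + REPLACEMENT_OFFSET) * pct_min,
        "nba_spm": nba_spm,
        "nba_vorp": (nba_spm + REPLACEMENT_OFFSET) * pct_min,
    }


kobe = finals_columns(5.91, 0.858)
print({name: round(value, 2) for name, value in kobe.items()})
```

Small differences from the posted table in the second decimal place are expected, because the table was presumably computed from unrounded SPM and minutes values.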
This new regression does recognize Kobe as superior to Pau, but only by a little. Compare with my old beta regression: http://sonicscentral.com/apbrmetrics/vi ... 1839#31839. It's a big difference!
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Jul 26, 2010 1:12 pm Post subject: Reply with quote
I noted a few posts above that the offensive and defensive SPM had some unnecessary variables included. For instance, though shooting is statistically significant for defense (a good shooter = bad defender), it should not be included in the regression. We can't just ASSUME that a good shooter will be a bad defender. It's just that a bad shooter who is playing in the NBA must be a good defender, or else he wouldn't be here!
So, throwing out tangential variables, here are the offensive and defensive regressions for OSPM and DSPM:
Here is Offensive SPM:
Code:
TRB% -0.20610998
TRBPower 8.42678761
TRBExponent 0.20247387
MPG 0.08242452
TO% 0.63261410
PPP Thresh 1.35785021
PPP USG Scale 0.00926961
PPP AST Scale 0.00957339
Scoring 0.56025531
USG Coeff 7.01988471
Intercept -10.92774435
And Defensive SPM (with negative being good):
Code:
TRB% -0.14025543
TRBPower -1.64008765
TRBExponent 0.19101623
STL% -1.45142242
BLK% -0.41795713
MPG 0.00344333
Intercept 6.52648729
How well do these fit the data? Well, I'll report R-squared results here, just to give an idea. I didn't weight all data points evenly in the regression (the weights were based on the error in the APM), so take these numbers with a grain of salt.
Overall SPM: 0.28
Offensive SPM: 0.56
Defensive SPM: 0.33
Sum of OSPM and DSPM (comparing to overall APM): 0.27
These new offensive and defensive SPM's will not sum to the overall SPM. The average difference between the Sum and SPM is around 0.3 to 0.4, so not too far off, but not great either.
I would recommend using these offensive and defensive SPM's over those above that included insignificant/inappropriate variables.
Also, remember to sum/adjust each of these to force the team totals to equal the team's efficiency above or below NBA average. (I guess that's what we should sum to--right?)
EDIT:
Probably better to use ORB and DRB instead of TRB when doing the specific regressions for OSPM and DSPM:
Code:
ORB% -0.10129651
ORBPower 5.16775677
ORBExponent 0.15790017
MPG 0.08366916
TO% 0.65655052
PPP Thresh 1.35219734
PPP USG Scale 0.00887959
PPP AST Scale 0.00971672
Scoring 0.56548199
USG Coeff 7.46146945
Intercept -5.66757625
And Defensive SPM (with negative being good):
Code:
DRB% -0.04258455
DRBPower -5.48349226
DRBExponent 0.16254300
STL% -1.42555701
BLK% -0.45616939
MPG 0.01151452
Intercept 11.43405907
The TRB forms behaved oddly for high rebounding percentages on offense. Use these latter (ORB/DRB) forms!
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Aug 06, 2010 12:05 pm Post subject: Reply with quote
I've added a nonlinear assist term experimentally; it seems to help considerably, but reduces the accuracy at the extremes (i.e. 0% AST).
Here is the NEW SPREADSHEET with lots of additional columns and calculations.
Including True Value II, my regressed best estimate of the player's value.
In actual dollars!
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
philrl
Joined: 28 Oct 2009
Posts: 6
PostPosted: Fri Aug 06, 2010 2:41 pm Post subject: Reply with quote
How much weight does TM DRtg get in DSPM? Would it make sense to use only the DRtg when the player is on the court?
It seems that good defenders on bad teams will get hurt by this, notably Amir Johnson. When he was in, the Raptors were close to a league average defense (107.92). It seems kind of unfair to adjust him to his team's awful rating, when most of the damage was done while he was on the bench.
huevonkiller
Joined: 25 May 2010
Posts: 15
Location: Miami, Fl
PostPosted: Sat Aug 07, 2010 8:47 am Post subject: Reply with quote
DSMok1 wrote:
I've added a nonlinear assist term experimentally; it seems to help considerably, but reduces the accuracy at the extremes (i.e. 0% AST).
Here is the NEW SPREADSHEET with lots of additional columns and calculations.
Including True Value II, my regressed best estimate of the player's value.
In actual dollars!
Awesome, thanks. Intriguing predictive outlook too.
battaile
Joined: 27 Jul 2009
Posts: 38
PostPosted: Wed Aug 11, 2010 11:56 am Post subject: Reply with quote
I'm sure this is blatantly obvious and I'm just having a senior moment, but what is the Rk column? (column B)
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Wed Aug 11, 2010 2:04 pm Post subject: Reply with quote
battaile wrote:
I'm sure this is blatantly obvious and I'm just having a senior moment, but what is the Rk column? (column B)
Just an artifact of the data from Basketball Reference. I should have deleted it... It was the "rank" of players on each team by win shares.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
battaile
Joined: 27 Jul 2009
Posts: 38
PostPosted: Thu Aug 12, 2010 11:41 am Post subject: Reply with quote
DSMok1 wrote:
battaile wrote:
I'm sure this is blatantly obvious and I'm just having a senior moment, but what is the Rk column? (column B)
Just an artifact of the data from Basketball Reference. I should have deleted it... It was the "rank" of players on each team by win shares.
Ah cool, thanks!
Is there a post somewhere that explains the various columns? I'm pretty fascinated by this spreadsheet but struggling to understand some of the relationships. (like looking at Ariza's mostly pedestrian advanced stat numbers through column W and figuring out what it is about him that gives him a true value over 9m, etc)
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Thu Oct 07, 2010 12:15 pm Post subject: Reply with quote
A couple of spreadsheets:
All players since 1978, full Advanced SPM numbers
and
An example of 1 year's full calculations, including a macro to update to other years or within this coming year.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Wed Jul 14, 2010 4:39 pm Post subject: Advanced Statistical Plus/Minus Reply with quote
Advanced Statistical Plus/Minus
I've been working on deriving a new SPM regression based purely upon "advanced" stats (like TS% and OR%) for some time now. I feel comfortable enough with the results thus far to release the first iteration of this SPM.
The data used: Neil Paine's collection of 1-Yr APM's (unfortunately without std err's; I estimated the standard errors for weighting purposes), Joe Sill's 4 Year RAPM's, with the regression toward 0 backed out, and finally (and most importantly) Steve Ilardi's 6-Year APM's posted on this forum. These 6 Year APM's had quite low errors, and provided the groundwork for this regression. I weighted each player in the regression by 1/stderr^2, where stderr is their APM standard error.
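To illustrate the 1/stderr^2 weighting, here is a minimal one-predictor weighted least squares fit in plain Python. It is a sketch of the weighting idea only, not the actual nonlinear multi-variable regression used for the SPM; `weighted_linear_fit` and the toy data are made up for the example.

```python
def weighted_linear_fit(x, y, stderr):
    """Fit y = intercept + slope * x, weighting each point by 1 / stderr**2."""
    w = [1.0 / s ** 2 for s in stderr]
    sw = sum(w)
    # Weighted means of x and y.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted covariance and variance give the slope.
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Toy data: the last point is far off the line but has a large standard
# error, so it barely moves the weighted fit.
x = [1, 2, 3, 4]
y = [3, 5, 7, 20]
print(weighted_linear_fit(x, y, [1, 1, 1, 1]))    # equal weights
print(weighted_linear_fit(x, y, [1, 1, 1, 10]))   # noisy point down-weighted
```

The design point is that a low-error estimate such as a 6-year APM pulls the fit much harder than a noisy 1-year APM for the same player.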
I then compiled the advanced metrics from the Basketball Reference Play Index for each player, and weighted-averaged the multi-year data (including playoffs, for the APM's that included those). Thus I have 3 APM data sets and the associated advanced statistics.
I experimented with a number of constructions for the rebounding and especially the scoring parts of this regression. Finding a good way to relate turnovers, shooting, usage, and assists proved elusive for some time. I finally now have a construction I am comfortable with, though (like with any construction) there are a few holes.
To avoid over-weighting steals and blocks for defense, I also included offensive rating and defensive rating of the teams. This is not included in the final SPM, because the team adjustment (to make the teams sum to their efficiency differentials) already accounts for this.
Here are the factors in this regression:
Code:
Factor Value
TRB% 1.33823090
TRB^2 -0.08918572
TRB^3 0.00219790
STL% 1.43951052
BLK% 0.35237880
MPG 0.10099403
TO% Coeff 0.66920540
PPP Threshold 1.64758151
PPP USG Scale 0.01394727
PPP AST Scale 0.01005596
Scoring 0.55728095
USG Const 4.67604494
Intercept -6.90680060
Let me explain.
First of all, note the rebounding terms. I discovered that the value of splitting rebounding into offensive and defensive was much less than that of adding this nonlinearity (which didn't work when ORB and DRB were split). Basically, in the neighborhood of 10%, there isn't a huge amount of change. A player who gets very few rebounds hurts the team a lot, and a player near 20% TRB% helps quite a bit. Here's a quick table:
Code:
TRB% Pts
0 0.00
2.5 2.82
5 4.74
7.5 5.95
10 6.66
12.5 7.09
15 7.42
17.5 7.89
20 8.67
22.5 10.00
25 12.06
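The table follows directly from the three TRB coefficients listed above. A short sketch that regenerates it (the function name `trb_points` is just for illustration):

```python
# Cubic rebounding coefficients from the "Factor / Value" table.
TRB1 = 1.33823090
TRB2 = -0.08918572
TRB3 = 0.00219790


def trb_points(trb_pct: float) -> float:
    """Rebounding contribution for a TRB% given as a whole number (10 = 10%)."""
    return TRB1 * trb_pct + TRB2 * trb_pct ** 2 + TRB3 * trb_pct ** 3


for step in range(11):
    pct = 2.5 * step
    print(f"{pct:5.1f}  {trb_points(pct):6.2f}")
```

Note how flat the curve is around 10% and how quickly the cubic term takes over above 20%, which matters for extreme rebounders.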
Next: steals, blocks and MPG. These are all straightforward, linear terms. Be aware, though: I'm inputting these percentages throughout in their whole-number forms, like Basketball-Reference outputs them.
Charges taken would be added into the steals term--other research I've done shows them to be equivalent in SPM terms (1 ChgTkn = 1 Steal). I'm trying to make this SPM able to be applied historically; thus I've left that out.
Here's the complicated part: the scoring term. First the actual formula:
Code:
{TS%*2*(1-TO%/100) - TO%Coeff*(TO%/100) - (PPPThreshold - PPPUSGScale*USG% - PPPASTScale*AST%)}*(USG% + USGConst)*Scoring
What's going on here? First of all, this is basically an efficiency*USG term. It takes into account TS%, USG%, TO%, and AST% to create a composite scoring value.
Now, term by term. The True Shooting term is very basic. It gives the number of points scored per possession used by the player. Next, the turnover term provides the penalty for each turnover. These terms make up the efficiency side of the equation.
Next, the PPP (Points per Possession) threshold and modifiers. The threshold is just a baseline constant. The usage modifier then lowers that threshold as usage rises; the regression indicates a clear benefit to higher usage, roughly 0.1 PPP for every 7 percentage points of USG%. Finally, the assist modifier. This is the ONLY place in the regression where assists are included; they were not significant anywhere else I tried them. Assists also lower the PPP threshold; when everything is multiplied through, the assists basically take the form AST%*(USG%+Constant), which is a reasonable construction.
Finally, the whole (PPP - PPPThreshold) term is multiplied by (USG% + USGConst). Again, we're using whole percentages, everywhere but with TS% (I'm following Basketball Reference on this). Because of the USGConst, even if a player has NO usage, he still gets some credit for assists. Just not very much. In other words, Steve Blake just isn't that great.
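The scoring formula above can be written out as a small function. This is a sketch under the stated conventions (TS% as a fraction, everything else as whole percentages); the function name and the sample stat line are hypothetical and do not belong to any real player.

```python
# Coefficients from the "Factor / Value" table in this post.
TO_COEFF = 0.66920540
PPP_THRESHOLD = 1.64758151
PPP_USG_SCALE = 0.01394727
PPP_AST_SCALE = 0.01005596
SCORING = 0.55728095
USG_CONST = 4.67604494


def scoring_term(ts: float, to_pct: float, usg_pct: float, ast_pct: float) -> float:
    """Scoring component of the raw SPM.

    ts is a fraction (0.55); to_pct, usg_pct and ast_pct are whole
    percentages (13 means 13%).
    """
    # Points per possession used, after removing possessions lost to turnovers.
    ppp = ts * 2 * (1 - to_pct / 100)
    # Extra penalty per turnover.
    turnover_penalty = TO_COEFF * (to_pct / 100)
    # Baseline efficiency, lowered by usage and by assists.
    threshold = PPP_THRESHOLD - PPP_USG_SCALE * usg_pct - PPP_AST_SCALE * ast_pct
    return (ppp - turnover_penalty - threshold) * (usg_pct + USG_CONST) * SCORING


# Hypothetical stat line: 55% TS, 13% TO, 20% USG, 10% AST.
print(round(scoring_term(0.55, 13, 20, 10), 2))
```

The value is often negative on its own; the other terms and the intercept are added before the team adjustment. Because of USG_CONST, a player with zero usage still gets a small credit for assists.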
Finally, after compiling the RAW SPM, the team adjustment must be applied. This can range from negligible (Cleveland, Boston, and Utah had 0 team adjustments this year) to quite large (+1.36 for ORL, -1.43 for GSW). The team adjustment mostly accounts for defense, since defense is not captured well by the regression.
Here is a sample of the results--the top 20 in SPM, minimum 1000 minutes:
Code:
Rnk Tm Player G MP SPM
1 CLE LeBron James 76 2966 12.16
2 MIA Dwyane Wade 77 2792 9.69
3 NOH Chris Paul 45 1712 6.51
4 ORL Dwight Howard 82 2843 6.31
5 SAS Manu Ginobili 75 2150 5.57
6 LAL Kobe Bryant 73 2835 5.38
7 OKC Kevin Durant 82 3239 5.32
8 SAS Tim Duncan 78 2438 5.21
9 BOS Rajon Rondo 81 2963 4.82
10 LAC Marcus Camby 51 1596 4.81
11 UTA Deron Williams 76 2802 4.42
12 ATL Josh Smith 81 2871 4.32
13 DAL Dirk Nowitzki 81 3039 4.07
14 LAL Pau Gasol 65 2403 4.01
15 UTA Carlos Boozer 78 2673 3.99
16 WAS Gilbert Arenas 32 1169 3.90
17 DEN Nene Hilario 82 2755 3.77
18 TOR Chris Bosh 70 2526 3.62
19 CHA Gerald Wallace 76 3119 3.53
20 DEN Carmelo Anthony 69 2634 3.51
The full results for 2009-2010 regular season are here: Google Spreadsheet: Advanced SPM 09-10
EDIT: See later in this thread for revisions to this method and a complete spreadsheet to play with.
Last edited by DSMok1 on Tue Oct 26, 2010 12:11 pm; edited 1 time in total
Ilardi
Joined: 15 May 2008
Posts: 263
Location: Lawrence, KS
PostPosted: Wed Jul 14, 2010 4:55 pm Post subject: Reply with quote
Nice work, DSM: this looks like an important contribution.
A couple of quick questions:
a) Can you provide standard error (se) estimates for the SPM values?
b) Did you consider using any of the advanced metrics from 82games? I've always thought eFG% Allowed would be quite useful in an SPM model . . .
c) What is the correlation between your SPM values for each player and his corresponding APM value? (i.e., the zero-order correlation for the entire league)
d) Any plans for "out-of-sample testing" on this new SPM metric (a la Joe Sill)?
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Wed Jul 14, 2010 5:38 pm Post subject: Reply with quote
Ilardi wrote:
Nice work, DSM: this looks like an important contribution.
A couple of quick questions:
a) Can you provide standard error (se) estimates for the SPM values?
b) Did you consider using any of the advanced metrics from 82games? I've always thought eFG% Allowed would be quite useful in an SPM model . . .
c) What is the correlation between your SPM values for each player and his corresponding APM value? (i.e., the zero-order correlation for the entire league)
d) Any plans for "out-of-sample testing" on this new SPM metric (a la Joe Sill)?
Good to see you around, Ilardi!
a) How would I go about developing them for a nonlinear model? I would love to, but haven't figured out how. Another issue with the standard errors is that the APM against which we are regressing has error within it (which I think biases the error on the regression upwards).
b) I wanted to make this metric as useful historically as possible. Basketball Reference has all of the stats used in this regression back to 1977. A more intricate SPM is possible, using things like eFG% allowed, location of assists, etc.
c) I can run that... should I do it just on the low-error six season sample?
d) That would be tough for me to do. I don't have a lot of samples to work with.
Thanks for the input!
Ilardi
Joined: 15 May 2008
Posts: 263
Location: Lawrence, KS
PostPosted: Wed Jul 14, 2010 6:01 pm Post subject: Reply with quote
Thanks: and most guys on the forum call me 'Steve'.
I'd have to get a consult to figure out how to calculate se's on a nonlinear metric like that, but I know it must be do-able. Perhaps someone on this forum can point the way to a workable approach?
As for the correlation between SPM and APM, I might suggest using the 08-09 season, for which you have my 6-season estimates (weighted heavily toward 08-09), as well as your own SPM values.
On the out-of-sample test: presumably it would be possible to calculate SPM values for each player based on games through, say, the first 4 months of last season, and then use those estimates to predict results of the final 2 months. (Same basic approach Joe used with his ridge regression APM numbers.) It would be a fair amount of work, but should be easily do-able, at least in principle.
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Wed Jul 14, 2010 11:47 pm Post subject: Reply with quote
Ilardi wrote:
Thanks: and most guys on the forum call me 'Steve'.
I'd have to get a consult to figure out how to calculate se's on a nonlinear metric like that, but I know it must be do-able. Perhaps someone on this forum can point the way to a workable approach?
As for the correlation between SPM and APM, I might suggest using the 08-09 season, for which you have my 6-season estimates (weighted heavily toward 08-09), as well as your own SPM values.
On the out-of-sample test: presumably it would be possible to calculate SPM values for each player based on games through, say, the first 4 months of last season, and then use those estimates to predict results of the final 2 months. (Same basic approach Joe used with his ridge regression APM numbers.) It would be a fair amount of work, but should be easily do-able, at least in principle.
I'd love to figure out how to do standard errors on nonlinear metrics.
I'll look into the correlation for the data you suggested, when I have time.
I still have issues with the out of sample test, because it is replacing a descriptive stat with a predictive stat--which is why the ridge regression technique provided the best out-of-sample results. It's basically regression to the mean. When I do regression, I'm going to use the samples, with their error, and regress in a Bayesian manner toward a prior based on peripheral data.
Ilardi
Joined: 15 May 2008
Posts: 263
Location: Lawrence, KS
PostPosted: Thu Jul 15, 2010 12:18 pm Post subject: Reply with quote
DSMok1 wrote:
I still have issues with the out of sample test, because it is replacing a descriptive stat with a predictive stat--which is why the ridge regression technique provided the best out-of-sample results. It's basically regression to the mean. When I do regression, I'm going to use the samples, with their error, and regress in a Bayesian manner toward a prior based on peripheral data.
But isn't the utility of any metric linked in large part to its predictive ability? Certainly, in the natural sciences, the valid prediction of phenomena is regarded as the sine qua non of the entire enterprise, so I'm admittedly a bit biased, but suffice it to say that even NBA decision makers realize that it's much more valuable to have a stat that gives accurate prediction than one that merely provides accurate description.
Also, although ridge regression makes use of 'regression to the mean', it does so in a limited way - essentially by simply reining in outlier values via an a priori (Bayesian) determination that they are unlikely. In my view, it's an extremely clever technique for enhancing the 'signal' of player APM values via tamping down the 'noise' of extreme variations in efficiency from one low-minute lineup to another.
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Thu Jul 15, 2010 1:00 pm Post subject: Reply with quote
Ilardi wrote:
DSMok1 wrote:
I still have issues with the out of sample test, because it is replacing a descriptive stat with a predictive stat--which is why the ridge regression technique provided the best out-of-sample results. It's basically regression to the mean. When I do regression, I'm going to use the samples, with their error, and regress in a Bayesian manner toward a prior based on peripheral data.
But isn't the utility of any metric linked in large part to its predictive ability? Certainly, in the natural sciences, the valid prediction of phenomena is regarded as the sine qua non of the entire enterprise, so I'm admittedly a bit biased, but suffice it to say that even NBA decision makers realize that it's much more valuable to have a stat that gives accurate prediction than one that merely provides accurate description.
Also, although ridge regression makes use of 'regression to the mean', it does so in a limited way - essentially by simply reining in outlier values via an a priori (Bayesian) determination that they are unlikely. In my view, it's an extremely clever technique for enhancing the 'signal' of player APM values via tamping down the 'noise' of extreme variations in efficiency from one low-minute lineup to another.
I'm not disputing the value of prediction. However, I'd like to do that AFTER the SPM is calculated. In other words, construct a SPM, THEN apply the Bayesian regression to estimate "true talent", then combine with previous years to create a projection. I simply want the SPM itself to not be "biased" with information outside of actual production numbers.
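One common way to do the "regress toward a prior" step is an inverse-variance (normal-normal) update. The sketch below assumes that form; it is not necessarily the exact method intended here, and the function name and numbers are illustrative only.

```python
def shrink_toward_prior(observed: float, stderr: float,
                        prior_mean: float, prior_sd: float) -> float:
    """Precision-weighted average of an observed rating and a prior.

    A noisy observation (large stderr) is pulled strongly toward the
    prior; a precise one stays close to its observed value.
    """
    w_obs = 1.0 / stderr ** 2
    w_prior = 1.0 / prior_sd ** 2
    return (w_obs * observed + w_prior * prior_mean) / (w_obs + w_prior)


# Same observed +10 rating, different amounts of noise.
print(shrink_toward_prior(10.0, 2.0, 0.0, 2.0))   # equal precision: halfway
print(shrink_toward_prior(10.0, 0.5, 0.0, 2.0))   # precise: stays near +10
```

Keeping this step separate from the SPM itself matches the order described above: compute the descriptive SPM first, then estimate true talent, then project.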
I agree that RAPM works very well, but it does have a few quirks. Like Anderson Varejao getting rated very highly because it is so unlikely that LeBron is really a +11 player.
Ilardi
Joined: 15 May 2008
Posts: 263
Location: Lawrence, KS
PostPosted: Sat Jul 17, 2010 10:40 am Post subject: Reply with quote
I've also had Varejao rated highly using a more traditional APM approach . . .
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Sat Jul 17, 2010 12:27 pm Post subject: Reply with quote
Ilardi wrote:
I've also had Varejao rated highly using a more traditional APM approach . . .
15th is pretty high. That's where the 4-year RAPM had him. Don't you think that using the Bayesian prior in that way could cause some odd effects like that?
Also--would it be possible to get from you a 4 year, regular season only APM, through 09-10? Then I could use the advanced stats collected by Hoopdata in that time to run a more comprehensive SPM.
_________________
GodismyJudgeOK.com/DStats
Neil Paine
Joined: 13 Oct 2005
Posts: 774
Location: Atlanta, GA
PostPosted: Sat Jul 17, 2010 2:50 pm Post subject: Reply with quote
Great work, DSMok!! I'm trying to replicate your work, and I had a question: how are you doing the team adjustment? What I always did was to find the minute-weighted average of each team's SPM and multiply by 5, then subtract that from the team's actual efficiency differential and divide the result by 5. But when I do that, my team adjustments don't match yours (ORL is +1.26, GSW is -1.70). Is it a rounding issue (I'm using the full, calculated versions of the BBR stats, while you used rounded versions), or is my team adjustment method incorrect?
_________________
http://www.basketball-reference.com/blog/
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Sat Jul 17, 2010 4:02 pm Post subject: Reply with quote
Neil Paine wrote:
Great work, DSMok!! I'm trying to replicate your work, and I had a question: how are you doing the team adjustment? What I always did was to find the minute-weighted average of each team's SPM and multiply by 5, then subtract that from the team's actual efficiency differential and divide the result by 5. But when I do that, my team adjustments don't match yours (ORL is +1.26, GSW is -1.70). Is it a rounding issue (I'm using the full, calculated versions of the BBR stats, while you used rounded versions), or is my team adjustment method incorrect?
I didn't use the team efficiency precisely. I summed to a blend of 2/3 SRS and 1/3 efficiency differential. Is the SRS calculated from efficiency differentials or point margins? If it is calculated off of efficiency differentials per game, it should be the best thing to sum to. I think it's point differential, which is why I used the average I did. But whatever you choose to sum to, that's up to you.
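Putting the two descriptions together, the team adjustment might be sketched as follows. It combines the minute-weighted method described in the quoted question with the 2/3 SRS, 1/3 efficiency differential target; the function names and the toy roster are hypothetical.

```python
def blended_target(srs: float, eff_diff: float) -> float:
    """Target team rating: 2/3 SRS plus 1/3 efficiency differential."""
    return (2.0 * srs + eff_diff) / 3.0


def team_adjustment(raw_spm, minutes, target: float) -> float:
    """Constant to add to every player's raw SPM on one team.

    The minute-weighted average SPM times 5 is the team's implied rating;
    the gap to the target is split evenly across the five floor spots.
    """
    total_minutes = sum(minutes)
    weighted_avg = sum(s * m for s, m in zip(raw_spm, minutes)) / total_minutes
    implied_team_rating = 5.0 * weighted_avg
    return (target - implied_team_rating) / 5.0


# Toy roster: three players with raw SPMs and minutes played.
raw = [4.0, 0.5, -2.0]
mins = [3000, 2500, 1500]
adj = team_adjustment(raw, mins, blended_target(srs=5.0, eff_diff=6.5))
print([round(s + adj, 2) for s in raw])
```

Whatever target is chosen, the adjusted minute-weighted team sum reproduces it exactly, which explains how two people using different targets end up with different adjustments for the same team.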
I'm glad you're doing this! You've got all of the data for compiling a full list and actually doing the team adjustments correctly.
I'm hoping this doesn't undervalue great centers--because there weren't any in the time period I used for the regression, I don't know if the top end of the regression can capture them. Then again, I don't know how much a great center truly contributed, either.
_________________
GodismyJudgeOK.com/DStats
Ilardi
Joined: 15 May 2008
Posts: 263
Location: Lawrence, KS
PostPosted: Sat Jul 17, 2010 4:59 pm Post subject: Reply with quote
DSMok1 wrote:
Also--would it be possible to get from you a 4 year, regular season only APM, through 09-10? Then I could use the advanced stats collected by Hoopdata in that time to run a more comprehensive SPM.
I haven't run it yet, but maybe your request will be just the catalyst I need. Is it really the case that no one else out there has put out any publicly available multi-year APM stats? Are Aaron's 2-year APM stats on basketballvalue.com all there is? If that's the case, I really will try to carve out the time to work on this . . .
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Sat Jul 17, 2010 5:28 pm Post subject: Reply with quote
Ilardi wrote:
DSMok1 wrote:
Also--would it be possible to get from you a 4 year, regular season only APM, through 09-10? Then I could use the advanced stats collected by Hoopdata in that time to run a more comprehensive SPM.
I haven't run it yet, but maybe your request will be just the catalyst I need. Is it really the case that no one else out there has put out any publicly available multi-year APM stats? Are Aaron's 2-year APM stats on basketballvalue.com all there is? If that's the case, I really will try to carve out the time to work on this . . .
I don't know of any other APM's out there, now that the RAPM was taken down. The one I could use would be an "average" APM over the last 4 years.
_________________
GodismyJudgeOK.com/DStats
DSMok1
Joined: 05 Aug 2009
Posts: 602
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Jul 19, 2010 9:53 am Post subject: Reply with quote
Neil Paine wrote:
Yeah, SRS is actually just SOS-adjusted point differential per game, which means it's not tempo-independent (we do it that way because we only have game possessions going back to 1986-87). If we had historical SOS-adjusted efficiency differential, that would definitely be the thing to sum to, but since we don't, I'm probably going to just sum to efficiency differential (which is what APM does anyway).
In case I didn't say so, I like this new regression a lot! The most glaring problem with the old regression was that it drastically overvalued assists (and therefore PGs -- I found that the average PG was +1 or so while every other position was near zero), but it looks like you fixed this by tying AST% to the scoring term instead of having it stand alone. I would imagine this retrodicts better than the old regression as a result.
...
One troubling result at a first glance is that Dennis Rodman's 1995 is +10.91, one of the greatest seasons of all time... Maybe the rebounding term needs to be re-evaluated?
(Posted from an email)
It looks like the rebounding term will need some more work. The cubic works for just about everyone, except Rodman. He breaks the regression. The 30% TRB% is way out into the nonlinear term, and is worth like 9 points more than D-Howard's 22% TRB%.
Here are a few possibilities:
The cubic is the best fit, but only by a hair. The power + Sq is very close in terms of fit, and would probably be my pick to use. The pure power curve can't capture the desired up-turn at higher TRB% rates, but in terms of fit is still very close to power+Sq (most of the difference is out where there aren't any observations). The linear is here for reference (incidentally, TRB% by itself is a BETTER fit than ORB% and DRB% split out).
_________________
GodismyJudgeOK.com/DStats
Mike G
Joined: 14 Jan 2005
Posts: 3573
Location: Hendersonville, NC
PostPosted: Mon Jul 19, 2010 10:19 am Post subject: Reply with quote
DSMok1 wrote:
...
I'm hoping this doesn't undervalue great centers...
This is reminiscent of a discussion starting with Nazr Mohammed (this year), that many of his rates resembled prime Moses, Gilmore, and others. Low versatility index the likely culprit.
It was asked then whether those known greats were also undervalued by the (then most current) SPM method.
If versatile less-than-great centers (Divac, Daugherty) seem to be better than Moses or Artis, then maybe it needs to be fixed?
page 2 of 7
erivera7
Joined: 19 Jan 2009
Posts: 185
Location: Chicago, IL
PostPosted: Wed Dec 08, 2010 12:57 pm Post subject: Reply with quote
Ah, okay. Thanks for making the data readily available for the public! Much appreciated.
_________________
@erivera7
I cover the Orlando Magic - Magic Basketball
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Wed Dec 15, 2010 11:31 am Post subject: Reply with quote
I'm preparing to evaluate salaries and contracts, so I'm working on long-term projections based upon the latest updated true-talent level.
Obviously not all players will age the same; some of the young guns will probably jump to transcendent status, as CP3, LeBron, and D-Wade did. We just don't know who, yet!
Here's the pretty picture:
Those are the top 25 projected players in the 2015-2016 season.
Note some of the rookies from this year moving up! Also note OKC's roster--3 players present.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Dec 20, 2010 7:32 pm Post subject: Reply with quote
Updated Spreadsheet: https://docs.google.com/leaf?id=0Bx1NfC ... MzRl&hl=en
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
EvanZ
Joined: 22 Nov 2010
Posts: 295
PostPosted: Mon Dec 20, 2010 9:13 pm Post subject: Reply with quote
DSMok1 wrote:
Obviously not all players will age the same; some of the young guns will probably jump to transcendent status the way CP3, LeBron, and D-Wade did early on. We just don't know who yet!
I hope CP3 is around in 5 years, but his knees may be mush by then.
_________________
http://www.thecity2.com
http://www.ibb.gatech.edu/evan-zamir
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Dec 20, 2010 9:22 pm Post subject: Reply with quote
EvanZ wrote:
DSMok1 wrote:
Obviously not all players will age the same; some of the young guns will probably jump to transcendent status the way CP3, LeBron, and D-Wade did early on. We just don't know who yet!
I hope CP3 is around in 5 years, but his knees may be mush by then.
Yeah. I'd expect outliers like him to age more rapidly.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
page 3 of 7
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Thu Jul 22, 2010 11:12 am Post subject: Reply with quote
DSMok1 wrote:
Here's the regression, with the TRB power term and a linear term only. Honestly, the fit is similar enough that the TRB^2 additional term is probably purely overfitting. (You were right, Rhuidean.)
Here is that regression:
Code:
TRB% -0.114650868
TRBPower 6.762898760
TRBExponent 0.284850151
STL% 1.484282673
BLK% 0.333514467
MPG 0.103521312
TO% Coeff 0.628638746
PPP Threshold 1.634054153
PPP USG Scale 0.013865918
PPP AST Scale 0.009744137
Scoring 0.579187325
USG Const 3.910166612
Intercept -12.466987594
I discovered a flaw in my calculations; I had a slight regression-to-replacement applied to the APM. Removing it doesn't change the SPM terms much, but slightly increases the overall spread. Corrected SPM:
Code:
TRB% -0.11262729
TRBPower 7.21103324
TRBExponent 0.27533883
STL% 1.52798518
BLK% 0.34027348
MPG 0.10394047
TO% Coeff 0.63662985
PPP Threshold 1.64666575
PPP USG Scale 0.01397684
PPP AST Scale 0.00970050
Scoring 0.58831902
USG Const 4.28737016
Intercept -12.78185481
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Thu Jul 22, 2010 12:48 pm Post subject: Reply with quote
I just went through and estimated OSPM and DSPM from the same data, except that the 1 year estimates did not include OAPM and DAPM split up.
Here is Offensive SPM:
Code:
TRB% -0.09696127
TRBPower 12.81133914
TRBExponent 0.11844250
STL% 0.27096567
BLK% -0.08060675
MPG 0.05489518
TO% 0.53648983
PPP Thresh 1.25957931
PPP USG Scale 0.00884937
PPP AST Scale 0.00845557
Scoring 0.59905610
USG Coeff 8.78684193
Intercept -16.03676076
And Defensive SPM:
Code:
TRB% -0.07702830
TRBPower 15.09091745
TRBExponent -0.06311682
STL% -1.23381652
BLK% -0.41721519
MPG -0.04970989
TO% -0.05031619
PPP Thresh -4.96111508
PPP USG Scale -0.05976497
PPP AST Scale 0.00216782
Scoring 0.06565780
USG Coeff 29.80487400
Intercept -23.96225822
1) On the Defensive SPM, NEGATIVE IS GOOD.
2) I'm sure some of these variables are insignificant.
3) Splitting into offensive and defensive rebounding could be beneficial for these, I suppose.
4) These do not sum perfectly to the SPM regression above. Usually within .05, however.
5) The Defensive SPM can do some weird things with low-minute players, it seems (with the negative (=good) intercept).
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
Guy
Joined: 02 May 2007
Posts: 128
PostPosted: Fri Jul 23, 2010 10:37 am Post subject: Reply with quote
DSMok1:
Could you tell us the marginal value of one unit of each main boxscore stat under your current SPM?
And a question: have you ever looked to see if two separate models, one for C/F and one for Guards, improves predictive power at all? I assume it doesn't or someone would have done it by now, but intuitively it seems like coefficients might change.
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Jul 23, 2010 10:46 am Post subject: Reply with quote
Guy wrote:
DSMok1:
Could you tell us the marginal value of one unit of each main boxscore stat under your current SPM?
And a question: have you ever looked to see if two separate models, one for C/F and one for Guards, improves predictive power at all? I assume it doesn't or someone would have done it by now, but intuitively it seems like coefficients might change.
The marginal value of one unit of each boxscore stat depends. I'm using advanced stats (obviously) to calculate, so even the linear terms vary depending on, for example, how many 2's were taken by the opponent (for the block term).
For rebounds, it is a power curve, so not equal. Somewhere in the area of 0.23 points per each % of TRB added, but obviously varying.
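To make the "power curve" concrete, here is a minimal Python sketch of the rebound term's slope. It assumes the term has the form TRB% coefficient * TRB% + TRBPower * TRB%^TRBExponent, using the corrected overall SPM coefficients posted earlier in the thread; that functional form is an inference from the coefficient names, not something stated outright, and the function names are hypothetical.

```python
# Coefficients from the corrected overall SPM table earlier in the thread.
TRB_LINEAR = -0.11262729
TRB_POWER = 7.21103324
TRB_EXPONENT = 0.27533883


def rebound_term(trb_pct: float) -> float:
    """Assumed form: linear * TRB% + power * TRB% ** exponent."""
    return TRB_LINEAR * trb_pct + TRB_POWER * trb_pct ** TRB_EXPONENT


def marginal_rebound_value(trb_pct: float) -> float:
    """Analytic slope of the assumed term: points per additional 1% of TRB."""
    return TRB_LINEAR + TRB_POWER * TRB_EXPONENT * trb_pct ** (TRB_EXPONENT - 1.0)


if __name__ == "__main__":
    for trb in (5.0, 10.0, 15.0, 20.0):
        print(f"TRB% = {trb:4.1f}  term = {rebound_term(trb):6.3f}  "
              f"slope = {marginal_rebound_value(trb):6.3f}")
```

Under this assumed form the slope falls from about 0.26 at TRB% = 10 to about 0.22 at TRB% = 12, which brackets the 0.23 figure quoted above and shrinks further for heavier rebounders.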
For the scoring term, it's a 4 dimensional surface, so rather hard to plot. Maybe I can put up a spreadsheet calculator to experiment with.
As for splitting up... well, I haven't tried that with this regression. The problem is that positions are both continuous and ambiguous. I can tell a PG from a Center, but not always a PG from a SF. (What was LeBron, anyway?) Because of that, I generally feel it's not useful.
Besides, the results from the current regression seem to be doing quite well. The offensive side mimics APM well; the defensive side is limited by the terms to input.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
Guy
Joined: 02 May 2007
Posts: 128
PostPosted: Fri Jul 23, 2010 11:31 am Post subject: Reply with quote
Thanks. So if I'm following, one rebound adds something like .25-.30 points. Would be interesting to know implied value of a marginal assist as well (assuming average team/opponent).
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Jul 23, 2010 11:45 am Post subject: Reply with quote
Guy wrote:
Thanks. So if I'm following, one rebound adds something like .25-.30 points. Would be interesting to know implied value of a marginal assist as well (assuming average team/opponent).
Assists are totally wrapped up in the scoring term. I just picked a random player (JJ Redick, in this case) and added another % to his AST%. His SPM went up 0.13.
But that will vary mightily based on USG%, because the assist term is multiplied by usage.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Jul 23, 2010 1:06 pm Post subject: Reply with quote
For interest's sake, here is an example of this new SPM in action:
NBA Finals, 09-10:
Code:
Player               SPM    %Min  Contrib   VORP  NBA SPM  NBA VORP
Kobe Bryant         5.91   85.8%     5.06   7.64     7.51      9.01
Pau Gasol           4.82   87.3%     4.21   6.83     6.42      8.22
Ron Artest         -1.43   74.7%    -1.07   1.17     0.17      2.37
Lamar Odom         -1.30   57.2%    -0.74   0.97     0.30      1.89
Derek Fisher       -1.99   63.7%    -1.27   0.64    -0.39      1.66
Sasha Vujacic       0.85   15.5%     0.13   0.60     2.45      0.84
Andrew Bynum       -2.19   52.1%    -1.14   0.42    -0.59      1.25
DJ Mbenga         -13.14    0.9%    -0.12  -0.09   -11.54     -0.08
Josh Powell       -12.23    2.4%    -0.29  -0.22   -10.63     -0.18
Shannon Brown      -4.68   25.0%    -1.17  -0.42    -3.08     -0.02
Jordan Farmar      -4.88   26.2%    -1.28  -0.49    -3.28     -0.07
Luke Walton        -9.49    9.2%    -0.88  -0.60    -7.89     -0.45

Player               SPM    %Min  Contrib   VORP  NBA SPM  NBA VORP
Kevin Garnett       4.70   66.1%     3.11   5.09     6.30      6.15
Rajon Rondo         2.87   81.0%     2.32   4.75     4.47      6.05
Paul Pierce        -0.13   82.8%    -0.11   2.37     1.47      3.70
Glen Davis         -0.15   42.9%    -0.06   1.22     1.45      1.91
Rasheed Wallace    -1.02   42.9%    -0.44   0.85     0.58      1.54
Nate Robinson      -0.37   21.1%    -0.08   0.56     1.23      0.89
Kendrick Perkins   -2.02   42.0%    -0.85   0.41    -0.42      1.08
Marquis Daniels     1.04    1.2%     0.01   0.05     2.64      0.07
Tony Allen         -3.03   30.7%    -0.93  -0.01    -1.43      0.48
Brian Scalabrine  -16.39    0.3%    -0.05  -0.04   -14.79     -0.04
Michael Finley    -22.93    1.5%    -0.34  -0.30   -21.33     -0.27
Ray Allen          -3.50   82.2%    -2.87  -0.41    -1.90      0.91
Shelden Williams  -21.50    5.4%    -1.15  -0.99   -19.90     -0.91
What I'm showing is the SPM (already team-adjusted to match the efficiency differential, though this adjustment was less than 0.1 for each team), the share of minutes played (%Min), the contribution (SPM*%Min), the VORP ((SPM+3)*%Min), and the SPM and VORP adjusted for the level of competition. For that last adjustment I assumed these teams were playing at a +8 efficiency differential level and added that to the SPM; in the table this works out to +1.6 per player (8/5).
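The derived columns can be reproduced with a few lines of Python. This is only a sketch: the function names are hypothetical, and the +1.6 competition adjustment (the +8 team differential split across five players) is inferred from the gap between the SPM and NBA SPM columns rather than stated as a formula in the post.

```python
REPLACEMENT_OFFSET = 3.0            # from the post: VORP = (SPM + 3) * %Min
COMPETITION_ADJUSTMENT = 8.0 / 5.0  # assumption: +8 team level spread over 5 players


def contribution(spm: float, pct_min: float) -> float:
    """Contribution to the team's efficiency differential: SPM * %Min."""
    return spm * pct_min


def vorp(spm: float, pct_min: float) -> float:
    """Value over replacement: (SPM + 3) * %Min."""
    return (spm + REPLACEMENT_OFFSET) * pct_min


def nba_spm(spm: float) -> float:
    """SPM shifted to account for the assumed level of competition."""
    return spm + COMPETITION_ADJUSTMENT


if __name__ == "__main__":
    spm, pct_min = 5.91, 0.858  # Kobe Bryant's row from the table
    print(round(contribution(spm, pct_min), 2))
    print(round(vorp(spm, pct_min), 2))
    print(round(nba_spm(spm), 2))
    print(round(vorp(nba_spm(spm), pct_min), 2))
```

Because the table's SPM and %Min are themselves rounded, values recomputed this way can differ from the posted columns by about 0.01.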
This new regression does recognize Kobe as superior to Pau, but only by a little. Compare with my old beta regression: http://sonicscentral.com/apbrmetrics/vi ... 1839#31839. It's a big difference!
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Jul 26, 2010 1:12 pm Post subject: Reply with quote
I noted a few posts above that the offensive and defensive SPM had some unnecessary variables included. For instance, though shooting is statistically significant for defense (a good shooter = bad defender), it should not be included in the regression. We can't just ASSUME that a good shooter will be a bad defender. It's just that a bad shooter who is playing in the NBA must be a good defender, or else he wouldn't be here!
So, throwing out tangential variables, here are the offensive and defensive regressions for OSPM and DSPM:
Here is Offensive SPM:
Code:
TRB% -0.20610998
TRBPower 8.42678761
TRBExponent 0.20247387
MPG 0.08242452
TO% 0.63261410
PPP Thresh 1.35785021
PPP USG Scale 0.00926961
PPP AST Scale 0.00957339
Scoring 0.56025531
USG Coeff 7.01988471
Intercept -10.92774435
And Defensive SPM (with negative being good):
Code:
TRB% -0.14025543
TRBPower -1.64008765
TRBExponent 0.19101623
STL% -1.45142242
BLK% -0.41795713
MPG 0.00344333
Intercept 6.52648729
How well do these fit the data? Well, I'll report R-squared results here, just to give an idea. I weighted the data points unequally in the regression (based on the error in the APM), so take these numbers with a lump of salt.
Overall SPM: 0.28
Offensive SPM: 0.56
Defensive SPM: 0.33
Sum of OSPM and DSPM (comparing to overall APM): 0.27
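For readers wondering how an R-squared can be defined when points carry unequal weights, here is one common convention, sketched in Python. It assumes weights of 1/stderr^2 (the weighting described at the start of the thread); whether the figures above were computed this way is not stated, and the function name and sample numbers are made up for illustration.

```python
def weighted_r_squared(actual, predicted, std_errs):
    """Weighted R^2 using weights 1/stderr^2 (one convention among several)."""
    weights = [1.0 / (se * se) for se in std_errs]
    total_weight = sum(weights)
    # Weighted mean of the target (here, the APM values).
    mean = sum(w * y for w, y in zip(weights, actual)) / total_weight
    ss_res = sum(w * (y - p) ** 2 for w, y, p in zip(weights, actual, predicted))
    ss_tot = sum(w * (y - mean) ** 2 for w, y in zip(weights, actual))
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up APM values, SPM predictions, and APM standard errors.
    apm = [6.0, 2.5, -1.0, -3.5, 0.5]
    spm = [4.8, 1.9, -0.2, -2.0, 1.1]
    stderr = [1.5, 2.0, 2.5, 4.0, 3.0]
    print(round(weighted_r_squared(apm, spm, stderr), 3))
```

Low-error players dominate both sums, so a weighted R-squared can differ noticeably from the unweighted one on the same data.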
These new offensive and defensive SPM's will not sum to the overall SPM. The average difference between the Sum and SPM is around 0.3 to 0.4, so not too far off, but not great either.
I would recommend using these offensive and defensive SPM's over those above that included insignificant/inappropriate variables.
Also, remember to sum/adjust each of these to force the team totals to equal the team's efficiency above or below NBA average. (I guess that's what we should sum to--right?)
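One simple way to do that team adjustment is sketched below in Python. It assumes an additive shift: every player on the team gets the same constant added, chosen so the minutes-weighted sum of SPM equals the team's efficiency differential. The exact adjustment used for the spreadsheet is not spelled out in the thread, so the function name, the additive form, and the sample roster are all assumptions.

```python
def team_adjust(raw_spm, pct_min, team_diff):
    """Shift each player's SPM by a constant so the team total matches team_diff.

    raw_spm   -- unadjusted SPM per player
    pct_min   -- fraction of available minutes each player played (sums to ~5.0)
    team_diff -- team efficiency differential relative to NBA average
    """
    total_share = sum(pct_min)
    raw_total = sum(s * m for s, m in zip(raw_spm, pct_min))
    shift = (team_diff - raw_total) / total_share
    return [s + shift for s in raw_spm]


if __name__ == "__main__":
    # Hypothetical six-man roster.
    raw = [2.0, 0.0, -1.0, -2.0, -3.0, -4.0]
    minutes = [0.9, 0.9, 0.9, 0.9, 0.9, 0.5]
    adjusted = team_adjust(raw, minutes, team_diff=1.0)
    print([round(a, 2) for a in adjusted])
    print(round(sum(a * m for a, m in zip(adjusted, minutes)), 6))
```

The same function can be applied separately to OSPM and DSPM, using the team's offensive and defensive efficiency relative to league average as the targets.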
EDIT:
Probably better to use ORB and DRB instead of TRB when doing the specific regressions for OSPM and DSPM:
Code:
ORB% -0.10129651
ORBPower 5.16775677
ORBExponent 0.15790017
MPG 0.08366916
TO% 0.65655052
PPP Thresh 1.35219734
PPP USG Scale 0.00887959
PPP AST Scale 0.00971672
Scoring 0.56548199
USG Coeff 7.46146945
Intercept -5.66757625
And Defensive SPM (with negative being good):
Code:
DRB% -0.04258455
DRBPower -5.48349226
DRBExponent 0.16254300
STL% -1.42555701
BLK% -0.45616939
MPG 0.01151452
Intercept 11.43405907
The TRB forms behaved oddly for high rebounding percentages on offense. Use these lower forms!
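As a rough illustration of how the DRB-based defensive coefficients might be applied, here is a Python sketch. It assumes the rebound part is DRB% coefficient * DRB% + DRBPower * DRB%^DRBExponent and that the remaining terms are plain linear coefficients; that form is inferred from the coefficient names rather than confirmed in the thread, and the function name and sample stat line are hypothetical.

```python
# Coefficients copied from the DRB-based Defensive SPM table above.
DRB_LINEAR = -0.04258455
DRB_POWER = -5.48349226
DRB_EXPONENT = 0.16254300
STL_COEFF = -1.42555701
BLK_COEFF = -0.45616939
MPG_COEFF = 0.01151452
INTERCEPT = 11.43405907


def defensive_spm(drb_pct, stl_pct, blk_pct, mpg):
    """Assumed functional form; on this scale NEGATIVE is good defense."""
    rebound = DRB_LINEAR * drb_pct + DRB_POWER * drb_pct ** DRB_EXPONENT
    return (rebound + STL_COEFF * stl_pct + BLK_COEFF * blk_pct
            + MPG_COEFF * mpg + INTERCEPT)


if __name__ == "__main__":
    # Hypothetical stat line: 15% DRB, 1.5% STL, 1.5% BLK, 25 MPG.
    print(round(defensive_spm(15.0, 1.5, 1.5, 25.0), 2))
```

For this hypothetical stat line the result lands slightly below zero. The value is before any team adjustment, so it should not be read as a final DSPM.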
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Aug 06, 2010 12:05 pm Post subject: Reply with quote
I've added a nonlinear assist term experimentally; it seems to help considerably, but reduces the accuracy at the extremes (e.g., 0% AST).
Here is the NEW SPREADSHEET with lots of additional columns and calculations.
Including True Value II, my regressed best estimate of the player's value.
In actual dollars!
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
philrl
Joined: 28 Oct 2009
Posts: 6
PostPosted: Fri Aug 06, 2010 2:41 pm Post subject: Reply with quote
How much weight does TM DRtg get in DSPM? Would it make sense to use only the DRtg when the player is on the court?
It seems that good defenders on bad teams will get hurt by this, notably Amir Johnson. When he's in, the Raptors were close to a league average defense (107.92). It seems kind of unfair to adjust him to his team's awful rating, when most of the damage was done while he was on the bench.
huevonkiller
Joined: 25 May 2010
Posts: 15
Location: Miami, Fl
PostPosted: Sat Aug 07, 2010 8:47 am Post subject: Reply with quote
DSMok1 wrote:
I've added a nonlinear assist term experimentally; it seems to help considerably, but reduces the accuracy at the extremes (e.g., 0% AST).
Here is the NEW SPREADSHEET with lots of additional columns and calculations.
Including True Value II, my regressed best estimate of the player's value.
In actual dollars!
Awesome, thanks. Intriguing predictive outlook too.
battaile
Joined: 27 Jul 2009
Posts: 38
PostPosted: Wed Aug 11, 2010 11:56 am Post subject: Reply with quote
I'm sure this is blatantly obvious and I'm just having a senior moment, but what is the Rk column? (column B)
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Wed Aug 11, 2010 2:04 pm Post subject: Reply with quote
battaile wrote:
I'm sure this is blatantly obvious and I'm just having a senior moment, but what is the Rk column? (column B)
Just an artifact of the data from Basketball Reference. I should have deleted it... It was the "rank" of players on each team by win shares.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
battaile
Joined: 27 Jul 2009
Posts: 38
PostPosted: Thu Aug 12, 2010 11:41 am Post subject: Reply with quote
DSMok1 wrote:
battaile wrote:
I'm sure this is blatantly obvious and I'm just having a senior moment, but what is the Rk column? (column B)
Just an artifact of the data from Basketball Reference. I should have deleted it... It was the "rank" of players on each team by win shares.
Ah cool, thanks!
Is there a post somewhere that explains the various columns? I'm pretty fascinated by this spreadsheet but struggling to understand some of the relationships. (like looking at Ariza's mostly pedestrian advanced stat numbers through column W and figuring out what it is about him that gives him a true value over 9m, etc)
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Thu Oct 07, 2010 12:15 pm Post subject: Reply with quote
A couple of spreadsheets:
All players since 1978, full Advanced SPM numbers
and
An example of 1 year's full calculations, including a macro to update to other years or within this coming year.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1