Re: The popularization of BPM
Posted: Thu Nov 27, 2014 5:42 pm
by Mike G
v-zero wrote:
Mike G wrote:
If you want to retrodict 2010 by use of BPM, could you just ignore that year's data from the RAPM that was used to build BPM? Would that suddenly make the test more valid? Or would BPM have the same coefficients with or without a given season?
Doing this still leaves the major issue of model/variable selection. BPM is a result of some model/variable selection process, and as such its formulation is a direct result of all the data which was used to create it, and that will include 2010 whether you remove 2010 from the data on which you fit the parameters or not.
Let's try this again.
Suppose you want to test BPM by retrodicting the 2010 season. Then you might create a 2010-free database version of RAPM; and from that a 2010-free version of BPM; apply these BPM numbers to player possessions of 2010, and see if they sum to team point differential.
If a 12-year interval -- 2002 to 2013 -- can be out-of-sample, then surely a one-year interval can be.
Then, if it turns out that omitting any single season from the RAPM database doesn't noticeably change your BPM coefficients and parameters, any subset of any interval can be tested with as much validity as if it were 1990 or 1970.
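In outline, the leave-one-season-out test described above might look like the sketch below. Everything in it is a stand-in: the data is synthetic and `fit_bpm_coefficients` is a hypothetical one-step regression, not the actual RAPM/BPM pipeline.

```python
# Toy outline of the leave-one-season-out test: fit the metric on every
# season except the held-out one, then score the held-out season's players.
# "box" stands in for per-player box-score variables and "rapm" for the
# RAPM values that BPM is fit against. All values are synthetic.
import numpy as np

rng = np.random.default_rng(42)
seasons = list(range(2002, 2014))
box = {s: rng.normal(size=(50, 6)) for s in seasons}
rapm = {s: box[s] @ np.arange(1.0, 7.0) + rng.normal(size=50) for s in seasons}

def fit_bpm_coefficients(train_seasons):
    """Regress (stand-in) RAPM on box-score variables, pooled over seasons."""
    X = np.vstack([box[s] for s in train_seasons])
    y = np.concatenate([rapm[s] for s in train_seasons])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

held_out = 2010
coef = fit_bpm_coefficients([s for s in seasons if s != held_out])
pred_2010 = box[held_out] @ coef  # apply the 2010-free "BPM" to 2010 players
```

Comparing `pred_2010` (minute-weighted and summed by team) against actual 2010 team point differential would be the test proposed above.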
Re: The popularization of BPM
Posted: Thu Nov 27, 2014 6:10 pm
by mystic
Mike G wrote:
Suppose you want to test BPM by retrodicting the 2010 season. Then you might create a 2010-free database version of RAPM; and from that a 2010-free version of BPM; apply these BPM numbers to player possessions of 2010, and see if they sum to team point differential.
http://www.basketball-reference.com/about/bpm.html
Code: Select all
BPM_Team_Adjustment = [Team_Rating * 120% - Σ(Player_%Min * Player_RawBPM)] / 5
BPM sums up to the team level by default. Even if we ignore that in-sample testing isn't particularly useful to begin with, the team-level adjustment made in the BPM calculation makes your test completely useless.
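A toy sketch of why that adjustment guarantees the sum (all numbers invented; this is not Daniel's actual code):

```python
# The same adjustment constant is added to every player, and the %Min weights
# of a full rotation sum to 5 (five players on court at all times). So the
# minute-weighted total lands on 120% of the team rating no matter what the
# raw BPM values were.
team_rating = 4.0                               # net pts per 100 possessions
raw_bpm = [3.1, 1.0, -0.5, -1.2, 0.4, -2.0]     # invented raw player BPM
pct_min = [0.90, 0.85, 0.80, 0.75, 0.70, 1.00]  # shares of team minutes, sum = 5

raw_sum = sum(m * b for m, b in zip(pct_min, raw_bpm))
adjustment = (team_rating * 1.2 - raw_sum) / 5
adjusted = [b + adjustment for b in raw_bpm]

# equals 120% of team_rating regardless of the raw values
team_sum = sum(m * b for m, b in zip(pct_min, adjusted))
```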
Mike G wrote:
If a 12-year interval -- 2002 to 2013 -- can be out-of-sample, then surely a one-year interval can be.
That interval is only out-of-sample, if you have no data from that interval used in order to generate the metric itself or the individual player values.
Mike G wrote:
Then, if it turns out that when omitting any single season from the RAPM database, your BPM coefficients and parameters don't seem to change; then any subset of any interval can be tested, with as much validity as if it were 1990 or 1970.
That is completely irrelevant, because if no data from 1970 or 1990 was used, those years can be tested "out-of-sample" no matter how the coefficients change when part of the underlying dataset is removed.
Re: The popularization of BPM
Posted: Thu Nov 27, 2014 6:26 pm
by v-zero
The key thing to understand is that with a metric such as BPM the model has been informed by the data over the period of J.E.'s long-term RAPM, so nothing in that period is valid to test with, end of story. Not only the values of the parameters, but the parameters themselves (as in, which parameters to include, and how) have been informed by that data.
Re: The popularization of BPM
Posted: Thu Nov 27, 2014 6:53 pm
by Mike G
No doubt this will also be largely misunderstood by today's posters; nevertheless:
If you have a metric that's just player uniform numbers with some other factors, so that as long as their # doesn't change they're the same player; and after Yr0 you notice large discrepancies between player 'rates' and team wins; then you didn't notice anything, because it's all "in sample"?
Re: The popularization of BPM
Posted: Thu Nov 27, 2014 7:59 pm
by mystic
Ok, let me try to rephrase your question a bit: Is a test, in which player values from yr0 are used to predict team wins in yr1, an out-of-sample or in-sample test? As long as you haven't used any data from yr1 in order to create the metric itself or the specific player values for yr0, it is out-of-sample.
Some "smart guy" might just try to fit player values from yr0 onto yr1 team performances using regression in order to beat the test Neil proposed. In that case we would again have an in-sample test if those yr0 values are used to predict yr1 team performances, because the corresponding coefficients for the yr0 values are based on yr1 data as well.
On the other hand, you may use earlier seasons to teach your metric how to fit the different player variables together. Say we start in 1977 and use 1977 values to predict 1978 team performances, thereby determining the coefficients that minimize the error; we may then use those coefficients on the 1978 dataset, which can be used to predict 1979 team performances out-of-sample. In the next step you calculate the coefficients for the 1979 dataset by regressing player values from 1977 on 1978 team data as well as 1978 player values on 1979 team data. The coefficients found that way can be used to calculate 1979 player values, which then predict 1980 team performances out-of-sample. And so on ...
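The rolling scheme just described can be sketched as follows (synthetic data throughout; this illustrates the procedure, not anyone's actual metric):

```python
# Walk-forward scheme: the coefficients used to score season Y are fit only
# on season pairs strictly before Y, so the prediction of the next season
# stays out-of-sample. All data is randomly generated for illustration.
import numpy as np

rng = np.random.default_rng(0)
years = list(range(1977, 1985))
# X[y]: team-aggregated player variables for season y (toy)
# y_next[y]: the following season's team performance (the regression target)
X = {y: rng.normal(size=(30, 4)) for y in years}
y_next = {y: rng.normal(size=30) for y in years}

out_of_sample_mse = []
for i in range(1, len(years) - 1):
    train = years[:i]                      # seasons strictly before years[i]
    A = np.vstack([X[y] for y in train])
    b = np.concatenate([y_next[y] for y in train])
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    pred = X[years[i]] @ coef              # score years[i], predict next season
    out_of_sample_mse.append(float(np.mean((pred - y_next[years[i]]) ** 2)))
```

Each entry of `out_of_sample_mse` is an error on a season whose data never entered the corresponding fit, which is the property mystic is insisting on.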
Re: The popularization of BPM
Posted: Thu Nov 27, 2014 10:07 pm
by Mike G
That's all very agreeable; thanks.
What you're describing applies to an ongoing annual tweaking of a metric. But since RAPM is calculated over a 14-year interval, and BPM has been created to fit it, do we really have to accept that BPM cannot be given the same test as other metrics in the interval? How is it different to just exclude a given year, re-calculate RAPM, re-align BPM with those values, and check the BPM for that year against the team point differential?
If BPM were calculated based on 2001-2013 RAPM, then 2014 is out of sample. It's not relevant that another version of BPM already exists that includes 2014 data. That's not the BPM we'd be testing.
And given the strong possibility that one season's omission from the database really doesn't give you a radically different BPM, why bother? Just so someone can be satisfied that it's out of sample, I guess.
Re: The popularization of BPM
Posted: Thu Nov 27, 2014 10:24 pm
by AcrossTheCourt
Neil Paine wrote:
I plan to do an even more thorough examination of this when I have time (which seems like never), but here is the evidence that Statistical Plus/Minus metrics (whether ASPM/BPM or SPM) are the best of the boxscore metrics...
Let me preface by saying the real test for any boxscore stat should be how it predicts team performance out of sample. It's long been known that any boxscore stat can boast a high correlation with team W% in-sample just by employing a team adjustment (like BPM does) or otherwise setting things up so that points scored/allowed and possessions employed/acquired add up at the individual level to team totals. What matters is how well a metric predicts the performance of a future team, given who its players are and how well those players have performed in the metric in the past.
I looked at metrics from that perspective here, and found that over the 2001-2012 period, ASPM did better than any other boxscore metric at predicting out-of-sample team performance, especially the further out of sample you go (using data from 2 and 3 years prior to predict Year Y). Over the summer, I also ran the same test using data from 1978-2014 for my SPM metric, Daniel's old ASPM (a version behind the current BPM), PER, WS/48 and a plus/minus estimate constructed from Basketball on Paper's ORtg/%Poss/DRtg Skill Curves.
(Just to expound on the BoP metric, I set up a fake 5-man unit w/ the player and 4 avg teammates. The teammates' ORtg changed based on the player's %Poss, like this. So if I use 25% of poss, my avg teammate uses 18.8% on avg. If tradeoff is 1.2, then he gains 1.2*(20-18.8) of ORtg.)
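As a quick sanity check of the tradeoff arithmetic in that parenthetical (numbers taken straight from the quote):

```python
# BoP usage-efficiency arithmetic: a 25%-usage player leaves
# (100 - 25) / 4 = 18.75% of possessions for each of four average teammates,
# and with a tradeoff of 1.2 each teammate's ORtg shifts by
# 1.2 * (20 - 18.75) relative to the 20%-usage baseline.
player_usage = 25.0
teammate_usage = (100.0 - player_usage) / 4     # 18.75, quoted as ~18.8
tradeoff = 1.2
ortg_gain = tradeoff * (20.0 - teammate_usage)  # 1.5 ORtg points per teammate
```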
Code: Select all
+-----------+--------+--------+--------+--------+
| Metric | Year-1 | Year-2 | Year-3 | Year-4 |
+-----------+--------+--------+--------+--------+
| SPM-1 | .776 | .662 | .593 | .532 |
| ASPM-1 | .763 | .647 | .577 | .511 |
| PER-1 | .663 | .598 | .538 | .485 |
| bop_1.2-1 | .743 | .614 | .528 | .465 |
| WS48-1 | .734 | .598 | .515 | .462 |
+-----------+--------+--------+--------+--------+
Btw, I came to the 1.2 tradeoff from running the same test over 1977-2014 (I didn't include ASPM in this test because it didn't extend back to 1977). Each BoP number represents the usage-efficiency tradeoff for that version of the metric:
Code: Select all
+-----------+--------+--------+--------+--------+
| metric | Year-1 | Year-2 | Year-3 | Year-4 |
+-----------+--------+--------+--------+--------+
| SPM-1 | .775 | .658 | .590 | .534 |
| PER-1 | .660 | .594 | .532 | .486 |
| bop_1.2-1 | .741 | .609 | .521 | .465 |
| bop_1.3-1 | .741 | .609 | .521 | .464 |
| bop_1.1-1 | .741 | .608 | .520 | .465 |
| bop_1.4-1 | .740 | .609 | .521 | .463 |
| bop_1.0-1 | .741 | .607 | .519 | .465 |
| bop_0.9-1 | .741 | .606 | .518 | .465 |
| bop_0.8-1 | .740 | .604 | .516 | .464 |
| WS48-1 | .733 | .593 | .507 | .463 |
+-----------+--------+--------+--------+--------+
In any event, no matter how many times I look at which metric does the best job of predicting future team performance, the Statistical Plus/Minus metrics are always in a class by themselves, particularly as you use data from further out of the sample being predicted. I still need to plug Daniel's new BPM into this framework, but I would be surprised if it didn't perform much better than PER and Win Shares/48 in a similar test.
That's showing the correlation coefficient between net team ratings and expected ratings from the metrics, right? I just want to know so I can tell if what I'm doing is worthwhile. And what do you do with rookies? Or guys who missed an entire season due to injury/whatever else?
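For what it's worth, here is a minimal sketch of the kind of minutes-weighted test being discussed, with invented data and a league-average fallback for players with no prior-year value (one plausible way to handle the rookie question; I don't know what Neil actually does):

```python
# Each team's expected rating is the minutes-weighted sum of its players'
# metric values from a prior season; the reported number would then be the
# correlation of expected vs. actual net rating. All names/values invented.
import numpy as np

rng = np.random.default_rng(1)
prior_metric = {f"p{i}": float(rng.normal(0.0, 2.5)) for i in range(60)}
league_avg = float(np.mean(list(prior_metric.values())))

expected, actual = [], []
for _ in range(30):
    roster = [f"p{rng.integers(0, 70)}" for _ in range(10)]  # p60+ = rookies
    minutes = rng.random(10)
    minutes = 5.0 * minutes / minutes.sum()   # five players on court at once
    e = sum(m * prior_metric.get(p, league_avg)  # fallback for missing values
            for m, p in zip(minutes, roster))
    expected.append(e)
    actual.append(e + float(rng.normal(0.0, 2.0)))  # toy "season outcome"

r = float(np.corrcoef(expected, actual)[0, 1])
```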
Re: The popularization of BPM
Posted: Fri Nov 28, 2014 7:38 am
by permaximum
@AcrossTheCourt
I think he uses metric averages for those players.
Re: The popularization of BPM
Posted: Fri Nov 28, 2014 8:29 am
by mystic
Mike G wrote:
If BPM were calculated based on 2001-2013 RAPM, then 2014 is out of sample. It's not relevant that another version of BPM already exists that includes 2014 data. That's not the BPM we'd be testing.
Yes.
Mike G wrote:
And given the strong possibility that one season's omission from the database really doesn't give you a radically different BPM, why bother? Just so someone can be satisfied that it's out of sample, I guess.
First, that it wouldn't be "radically different" is an assumption, not something we know. And even a slightly different BPM can easily lead to worse results than the current version; in fact, it might be possible that in an out-of-sample test BPM would end up being worse in year 3 or 4 than PER, for example. We just don't know that, and that's why an out-of-sample test is necessary.
Re: The popularization of BPM
Posted: Fri Nov 28, 2014 11:58 am
by DSMok1
This is an interesting discussion to follow!
I do have one question. Since Box Plus/Minus was not created as a predictive regression (it was not tuned to predict the subsequent year of data), would it really be truly in-sample to use those 14 years for which the RAPM basis was originally created? I agree, however, that using absolutely out-of-sample data would be best for the test. I'm just not certain how significant any in-sample issues would be, since subsequent-year team results were not really what the regression was done on. I guess we'll see.
Re: The popularization of BPM
Posted: Fri Nov 28, 2014 7:10 pm
by permaximum
permaximum wrote:
However as far as public metrics go, BPM will destroy all others. Actually I didn't test RAPM or xRAPM. Will do it in a few days. Let's see which one is worse. Noise or missing info?
Well, it looks like missing info is worse. xRAPM or RPM is obviously better than all other metrics, and blending it with anything will cause serious overfitting problems. Only a few things improve its predictions, and only by a small margin.
Also, MPG is better than PER or WP. Those metrics are that bad...
1.RPM
2.BPM
3.WS
4.MPG
Re: The popularization of BPM
Posted: Sat Nov 29, 2014 12:08 am
by DSMok1
permaximum wrote:
permaximum wrote:
However as far as public metrics go, BPM will destroy all others. Actually I didn't test RAPM or xRAPM. Will do it in a few days. Let's see which one is worse. Noise or missing info?
Well, it looks like missing info is worse. xRAPM or RPM is obviously better than all other metrics, and blending it with anything will cause serious overfitting problems. Only a few things improve its predictions, and only by a small margin.
Also, MPG is better than PER or WP. Those metrics are that bad...
1.RPM
2.BPM
3.WS
4.MPG
That's not a surprise, since xRAPM uses all the box score data and also the lineup data. Much more information. As long as it's done right, which I'm confident Jeremias had done, it should certainly come out on top. Also, it is explicitly set up to maximize predictiveness.
Re: The popularization of BPM
Posted: Sun Nov 30, 2014 3:17 pm
by permaximum
DSMok1 wrote:
permaximum wrote:
permaximum wrote:
However as far as public metrics go, BPM will destroy all others. Actually I didn't test RAPM or xRAPM. Will do it in a few days. Let's see which one is worse. Noise or missing info?
Well, it looks like missing info is worse. xRAPM or RPM is obviously better than all other metrics, and blending it with anything will cause serious overfitting problems. Only a few things improve its predictions, and only by a small margin.
Also, MPG is better than PER or WP. Those metrics are that bad...
1.RPM
2.BPM
3.WS
4.MPG
That's not a surprise, since xRAPM uses all the box score data and also the lineup data. Much more information. As long as it's done right, which I'm confident Jeremias had done, it should certainly come out on top. Also, it is explicitly set up to maximize predictiveness.
These results are not final. I found out that testing prediction accuracy on an incomplete season can't be more than a small validity test. And since the 2000-2013 seasons are in-sample for BPM, I can't compare it with xRAPM yet.
Re: The popularization of BPM
Posted: Mon Dec 01, 2014 6:18 pm
by DSMok1
For those who are interested, and those who have critiqued BPM based on individual significant players being under- or over-represented, I compiled the most over/underrated players by BPM within the regression sample.
I explored again whether there was any correlation of the "misses" with a specific type of player, but I couldn't find any trends of note.
Here is a list of the 5 most overrated and 5 most underrated players at each position, among those who played over 10,000 minutes from 2001-2014. I can also put up each player's stat lines if anyone is interested. This is basically an eyeball test: does BPM have a significant hole, besides the things a box score doesn't measure?
Code: Select all
╔════════════════════╦══════════╦═════════╦══════╦═══════╦══════╦═══════╦═══════╦══════╦═══════╦═══════╦═══════════╦═════════╦═════════╗
║ Player ║ Position ║ Minutes ║ PER ║ WS/48 ║ RAPM ║ ORAPM ║ DRAPM ║ BPM ║ O-BPM ║ D-BPM ║ Delta Tot ║ Delta O ║ Delta D ║
╠════════════════════╬══════════╬═════════╬══════╬═══════╬══════╬═══════╬═══════╬══════╬═══════╬═══════╬═══════════╬═════════╬═════════╣
║ Steve Nash ║ 1 ║ 32920 ║ 21.0 ║ 0.176 ║ 6.0 ║ 6.6 ║ -0.6 ║ 1.6 ║ 3.8 ║ -2.3 ║ -4.4 ║ -2.8 ║ -1.7 ║
║ Derek Fisher ║ 1 ║ 27059 ║ 11.8 ║ 0.093 ║ 2.5 ║ 1.4 ║ 1.1 ║ -0.7 ║ -0.1 ║ -0.6 ║ -3.2 ║ -1.5 ║ -1.7 ║
║ Mike Conley ║ 1 ║ 16628 ║ 16.2 ║ 0.116 ║ 4.1 ║ 4.0 ║ 0.1 ║ 1.2 ║ 1.5 ║ -0.3 ║ -2.9 ║ -2.5 ║ -0.4 ║
║ Baron Davis ║ 1 ║ 27069 ║ 18.1 ║ 0.108 ║ 5.5 ║ 4.3 ║ 1.2 ║ 2.9 ║ 3.1 ║ -0.2 ║ -2.6 ║ -1.2 ║ -1.4 ║
║ Keyon Dooling ║ 1 ║ 14134 ║ 11.5 ║ 0.063 ║ -0.8 ║ -0.2 ║ -0.6 ║ -3.0 ║ -1.1 ║ -1.8 ║ -2.2 ║ -0.9 ║ -1.2 ║
║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║
║ Mario Chalmers ║ 1 ║ 12053 ║ 12.6 ║ 0.100 ║ -1.6 ║ -1.4 ║ -0.2 ║ 0.7 ║ 0.2 ║ 0.6 ║ 2.3 ║ 1.6 ║ 0.8 ║
║ Nick Van Exel ║ 1 ║ 11070 ║ 15.6 ║ 0.082 ║ -3.0 ║ -0.3 ║ -2.7 ║ -0.4 ║ 1.9 ║ -2.4 ║ 2.6 ║ 2.2 ║ 0.3 ║
║ Brandon Jennings ║ 1 ║ 12762 ║ 16.1 ║ 0.088 ║ -1.9 ║ 0.5 ║ -2.4 ║ 1.1 ║ 2.0 ║ -0.9 ║ 3.0 ║ 1.5 ║ 1.5 ║
║ Stephon Marbury ║ 1 ║ 21679 ║ 18.9 ║ 0.123 ║ -1.6 ║ 1.9 ║ -3.5 ║ 1.5 ║ 3.5 ║ -2.0 ║ 3.1 ║ 1.6 ║ 1.5 ║
║ Rajon Rondo ║ 1 ║ 16612 ║ 17.1 ║ 0.130 ║ -1.2 ║ -0.8 ║ -0.4 ║ 2.3 ║ 0.6 ║ 1.7 ║ 3.5 ║ 1.4 ║ 2.1 ║
║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║
║ Tony Allen ║ 2 ║ 12557 ║ 14.5 ║ 0.110 ║ 3.4 ║ -0.6 ║ 4.0 ║ 1.0 ║ -1.0 ║ 2.0 ║ -2.4 ║ -0.4 ║ -2.0 ║
║ Joe Johnson ║ 2 ║ 35661 ║ 16.2 ║ 0.096 ║ 2.8 ║ 3.0 ║ -0.2 ║ 0.7 ║ 2.0 ║ -1.3 ║ -2.1 ║ -1.0 ║ -1.1 ║
║ Cuttino Mobley ║ 2 ║ 23666 ║ 14.4 ║ 0.091 ║ 2.0 ║ 2.1 ║ -0.1 ║ 0.3 ║ 0.9 ║ -0.6 ║ -1.7 ║ -1.2 ║ -0.5 ║
║ Vince Carter ║ 2 ║ 34328 ║ 19.8 ║ 0.138 ║ 4.8 ║ 3.4 ║ 1.4 ║ 3.3 ║ 3.6 ║ -0.4 ║ -1.5 ║ 0.2 ║ -1.8 ║
║ Marquis Daniels ║ 2 ║ 11707 ║ 12.9 ║ 0.068 ║ 0.5 ║ -0.2 ║ 0.7 ║ -1.0 ║ -1.3 ║ 0.3 ║ -1.5 ║ -1.1 ║ -0.4 ║
║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║
║ Jarrett Jack ║ 2 ║ 19266 ║ 14.2 ║ 0.080 ║ -3.4 ║ -1.1 ║ -2.3 ║ -0.9 ║ 0.3 ║ -1.2 ║ 2.5 ║ 1.4 ║ 1.1 ║
║ O.J. Mayo ║ 2 ║ 14132 ║ 13.8 ║ 0.065 ║ -3.3 ║ -0.1 ║ -3.1 ║ -0.5 ║ 0.7 ║ -1.2 ║ 2.8 ║ 0.8 ║ 1.9 ║
║ Tyreke Evans ║ 2 ║ 10912 ║ 17.0 ║ 0.073 ║ -2.6 ║ -0.6 ║ -2.0 ║ 0.6 ║ 0.9 ║ -0.3 ║ 3.2 ║ 1.5 ║ 1.7 ║
║ Fred Jones ║ 2 ║ 10299 ║ 11.3 ║ 0.077 ║ -4.1 ║ -2.6 ║ -1.5 ║ -0.9 ║ -0.8 ║ -0.1 ║ 3.2 ║ 1.8 ║ 1.4 ║
║ Josh Childress ║ 2 ║ 10432 ║ 15.6 ║ 0.118 ║ -2.1 ║ 1.0 ║ -3.0 ║ 1.3 ║ 1.1 ║ 0.2 ║ 3.4 ║ 0.1 ║ 3.2 ║
║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║
║ Corliss Williamson ║ 3 ║ 10248 ║ 15.3 ║ 0.104 ║ 1.2 ║ -1.3 ║ 2.6 ║ -2.7 ║ -1.3 ║ -1.4 ║ -3.9 ║ 0.0 ║ -4.0 ║
║ Luol Deng ║ 3 ║ 24235 ║ 15.8 ║ 0.120 ║ 4.8 ║ 1.9 ║ 3.0 ║ 1.1 ║ 0.2 ║ 0.8 ║ -3.7 ║ -1.7 ║ -2.2 ║
║ Josh Howard ║ 3 ║ 15350 ║ 16.7 ║ 0.119 ║ 3.2 ║ 2.0 ║ 1.2 ║ 0.6 ║ 0.5 ║ 0.1 ║ -2.6 ║ -1.5 ║ -1.1 ║
║ Paul Pierce ║ 3 ║ 38223 ║ 20.5 ║ 0.164 ║ 5.9 ║ 3.6 ║ 2.2 ║ 3.6 ║ 3.0 ║ 0.6 ║ -2.3 ║ -0.6 ║ -1.6 ║
║ Metta World Peace ║ 3 ║ 28452 ║ 15.1 ║ 0.099 ║ 4.4 ║ 1.0 ║ 3.4 ║ 2.1 ║ 0.8 ║ 1.2 ║ -2.3 ║ -0.2 ║ -2.2 ║
║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║
║ Dorell Wright ║ 3 ║ 11676 ║ 14.4 ║ 0.100 ║ -2.1 ║ -1.6 ║ -0.6 ║ 0.3 ║ 0.1 ║ 0.2 ║ 2.4 ║ 1.7 ║ 0.8 ║
║ Kevin Durant ║ 3 ║ 20629 ║ 24.6 ║ 0.206 ║ 1.3 ║ 2.5 ║ -1.2 ║ 4.4 ║ 4.1 ║ 0.3 ║ 3.1 ║ 1.6 ║ 1.5 ║
║ Nicolas Batum ║ 3 ║ 12436 ║ 15.6 ║ 0.124 ║ -0.3 ║ 2.2 ║ -2.4 ║ 3.0 ║ 1.9 ║ 1.0 ║ 3.3 ║ -0.3 ║ 3.4 ║
║ Ricky Davis ║ 3 ║ 20800 ║ 14.7 ║ 0.062 ║ -4.2 ║ -1.1 ║ -3.2 ║ -0.7 ║ 0.4 ║ -1.1 ║ 3.5 ║ 1.5 ║ 2.1 ║
║ Damien Wilkins ║ 3 ║ 10839 ║ 11.6 ║ 0.054 ║ -6.1 ║ -2.6 ║ -3.5 ║ -1.6 ║ -1.2 ║ -0.5 ║ 4.5 ║ 1.4 ║ 3.0 ║
║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║
║ LaMarcus Aldridge ║ 4 ║ 20460 ║ 20.0 ║ 0.143 ║ 5.8 ║ 2.3 ║ 3.4 ║ 1.2 ║ 0.9 ║ 0.3 ║ -4.6 ║ -1.4 ║ -3.1 ║
║ Kevin Garnett ║ 4 ║ 35362 ║ 24.0 ║ 0.204 ║ 9.7 ║ 2.4 ║ 7.3 ║ 6.0 ║ 2.4 ║ 3.6 ║ -3.7 ║ 0.0 ║ -3.7 ║
║ Dirk Nowitzki ║ 4 ║ 38661 ║ 24.2 ║ 0.218 ║ 7.4 ║ 5.2 ║ 2.1 ║ 4.0 ║ 3.5 ║ 0.5 ║ -3.4 ║ -1.7 ║ -1.6 ║
║ Amir Johnson ║ 4 ║ 11193 ║ 16.3 ║ 0.146 ║ 5.8 ║ 1.7 ║ 4.1 ║ 2.6 ║ 0.3 ║ 2.3 ║ -3.2 ║ -1.4 ║ -1.8 ║
║ Rasheed Wallace ║ 4 ║ 25408 ║ 17.4 ║ 0.141 ║ 5.3 ║ 1.0 ║ 4.3 ║ 2.4 ║ 0.7 ║ 1.6 ║ -2.9 ║ -0.3 ║ -2.7 ║
║ Jason Thompson ║ 4 ║ 12336 ║ 14.0 ║ 0.077 ║ 0.9 ║ 0.7 ║ 0.2 ║ -1.7 ║ -1.4 ║ -0.3 ║ -2.6 ║ -2.1 ║ -0.5 ║
║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║
║ Karl Malone ║ 4 ║ 10244 ║ 21.8 ║ 0.180 ║ 0.8 ║ 0.5 ║ 0.2 ║ 4.3 ║ 2.5 ║ 1.9 ║ 3.5 ║ 2.0 ║ 1.7 ║
║ Chris Webber ║ 4 ║ 15570 ║ 20.4 ║ 0.120 ║ -1.2 ║ -1.3 ║ 0.1 ║ 2.5 ║ 0.7 ║ 1.8 ║ 3.7 ║ 2.0 ║ 1.7 ║
║ Troy Murphy ║ 4 ║ 19921 ║ 15.4 ║ 0.121 ║ -4.0 ║ -1.4 ║ -2.7 ║ -0.1 ║ -0.2 ║ 0.0 ║ 3.9 ║ 1.2 ║ 2.7 ║
║ Jeff Green ║ 4 ║ 15580 ║ 13.1 ║ 0.072 ║ -5.1 ║ -3.1 ║ -2.0 ║ -1.0 ║ -0.8 ║ -0.2 ║ 4.1 ║ 2.3 ║ 1.8 ║
║ Hakim Warrick ║ 4 ║ 10624 ║ 15.5 ║ 0.092 ║ -8.5 ║ -4.4 ║ -4.1 ║ -3.0 ║ -1.4 ║ -1.7 ║ 5.5 ║ 3.0 ║ 2.4 ║
║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║
║ Jason Collins ║ 5 ║ 14932 ║ 7.0 ║ 0.064 ║ 1.6 ║ -3.8 ║ 5.4 ║ -1.9 ║ -3.6 ║ 1.7 ║ -3.5 ║ 0.2 ║ -3.7 ║
║ Nick Collison ║ 5 ║ 16577 ║ 13.6 ║ 0.119 ║ 3.9 ║ 1.4 ║ 2.5 ║ 0.5 ║ -0.3 ║ 0.8 ║ -3.4 ║ -1.7 ║ -1.7 ║
║ Dikembe Mutombo ║ 5 ║ 11677 ║ 15.4 ║ 0.155 ║ 3.2 ║ -2.5 ║ 5.7 ║ 0.1 ║ -2.5 ║ 2.5 ║ -3.1 ║ 0.0 ║ -3.2 ║
║ Anderson Varejao ║ 5 ║ 13827 ║ 15.7 ║ 0.150 ║ 4.5 ║ 0.9 ║ 3.6 ║ 1.7 ║ -0.6 ║ 2.3 ║ -2.8 ║ -1.5 ║ -1.3 ║
║ Jeff Foster ║ 5 ║ 15664 ║ 14.1 ║ 0.144 ║ 4.0 ║ 0.6 ║ 3.4 ║ 1.3 ║ -0.5 ║ 1.8 ║ -2.7 ║ -1.1 ║ -1.6 ║
║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║
║ Marcus Camby ║ 5 ║ 22292 ║ 17.9 ║ 0.146 ║ 0.9 ║ -1.8 ║ 2.7 ║ 3.5 ║ -1.4 ║ 4.9 ║ 2.6 ║ 0.4 ║ 2.2 ║
║ Brian Grant ║ 5 ║ 11355 ║ 14.0 ║ 0.118 ║ -3.1 ║ -3.1 ║ 0.0 ║ -0.4 ║ -1.5 ║ 1.1 ║ 2.7 ║ 1.6 ║ 1.1 ║
║ Brook Lopez ║ 5 ║ 11339 ║ 20.6 ║ 0.134 ║ -2.1 ║ -1.3 ║ -0.9 ║ 0.8 ║ 0.6 ║ 0.2 ║ 2.9 ║ 1.9 ║ 1.1 ║
║ Joakim Noah ║ 5 ║ 14084 ║ 18.2 ║ 0.172 ║ 1.0 ║ -0.8 ║ 1.8 ║ 4.4 ║ 0.4 ║ 4.0 ║ 3.4 ║ 1.2 ║ 2.2 ║
║ J.J. Hickson ║ 5 ║ 10078 ║ 16.3 ║ 0.092 ║ -6.8 ║ -4.4 ║ -2.4 ║ -2.2 ║ -1.8 ║ -0.3 ║ 4.6 ║ 2.6 ║ 2.1 ║
╚════════════════════╩══════════╩═════════╩══════╩═══════╩══════╩═══════╩═══════╩══════╩═══════╩═══════╩═══════════╩═════════╩═════════╝
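For readers parsing the table: the Delta columns appear to be simply BPM minus RAPM (total, offense, defense). A quick check against the Steve Nash row, with values copied from the table:

```python
# Steve Nash row from the table above: RAPM 6.0 / 6.6 / -0.6,
# BPM 1.6 / 3.8 / -2.3, Deltas -4.4 / -2.8 / -1.7.
rapm, orapm, drapm = 6.0, 6.6, -0.6
bpm, obpm, dbpm = 1.6, 3.8, -2.3
delta_tot = round(bpm - rapm, 1)    # matches Delta Tot in the table
delta_o = round(obpm - orapm, 1)    # matches Delta O
delta_d = round(dbpm - drapm, 1)    # matches Delta D
```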
Re: The popularization of BPM
Posted: Mon Dec 01, 2014 7:55 pm
by schtevie
DSMok1 wrote:For those who are interested and those who have critiqued BPM based on single significant players being under or over-represented, I compiled the most over/under rated players by BPM within the regression sample.
I explored again whether there was any correlation of the "misses" with a specific type of player, but I couldn't find any trends of note.
Regarding the data presented, a question: what does the "Delta Tot" distribution look like? And what are its summary statistics? I am supposing it looks normal, with a mean of about zero and a standard deviation of about 1.3. And if this is about right, what does it suggest about the utility of the data if one wishes to infer, say, that Player X is better than Player Y?
And then I return to the issue of the relevance of the "gold standard" 14-year RAPM. Repeating my claim that the vast majority of folks potentially interested in +/- related statistics care primarily about sorting out the hierarchy of the elite: what recommends 14-year RAPM if its results are fundamentally at variance with year-on-year xRAPM, which quite frankly gives much more intuitive results?
I don't know (and wouldn't expect) that regressing box score stats on the 14-year panel would significantly alter the coefficients (or the model), but perhaps it would be worth checking?