LambdaPM: A new way of looking at adjusted +/-

Crow
Posts: 10622
Joined: Thu Apr 14, 2011 11:10 pm

Re: LambdaPM: A new way of looking at adjusted +/-

Post by Crow »

Any updates on your research?
Rhuidean
Posts: 15
Joined: Thu Jun 02, 2011 3:48 pm

Re: LambdaPM: A new way of looking at adjusted +/-

Post by Rhuidean »

Hey,

Here is the latest version of this work: http://arxiv.org/abs/1301.3523

I'll have code online in the near future so that you can run experiments yourself on any years of interest.
mtamada
Posts: 163
Joined: Thu Apr 14, 2011 11:35 pm

Re: LambdaPM: A new way of looking at adjusted +/-

Post by mtamada »

Really nice work, combining box score stats and regularization with APM in a sophisticated way.
bchaikin
Posts: 307
Joined: Thu May 12, 2011 2:09 am

Re: LambdaPM: A new way of looking at adjusted +/-

Post by bchaikin »

Rhuidean wrote:Here is the latest version of this work: http://arxiv.org/abs/1301.3523

your paper lists keyon dooling as the most underrated player in the 2010-11 season, correct?...

if so, here's my question - if an nba GM asked you why, what would you tell him? here is why i ask:

his man defense was very good that season, but:

- he shot worse overall than just the league average PG (50.3% vs 52.3% ScFG%). among the 39 PGs that played at least 1500 minutes, he shot the 11th lowest 2pt FG% (43.9%). that's a lot of missed shots rebounded by the defense - from a team perspective a missed shot rebounded by the defense is the same as a turnover, a zero point team possession...

and among the 52 PGs that played at least 1000 minutes that season, dooling:

- shot the 18th lowest 2pt FG% at 43.9%...
- had the 2nd worst (51st out of 52) offensive rebounding rate...
- had the 5th worst (48th out of 52) total rebounding rate...
- had the 17th lowest per minute steal rate...
- had the 10th lowest/worst shot blocking rate...
- had the 10th lowest/worst passing rate for assists...
- had the 12th lowest FTA/min rate, so he was not drawing many fouls on defenders...

he drew 24 offensive fouls that season, but his forced turnovers per 3000 minutes, (ST+CH)/min x 3000 = (24+54)/1757 x 3000 = 133, were nowhere near the best among PGs (235 per 3000 min for chris paul)...
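The per-3000-minute forced turnover rate quoted above can be checked with a short sketch. The function name `per_3000_minutes` is made up for illustration; the inputs (24 charges drawn, 54 steals, 1757 minutes) are the figures from the post.

```python
def per_3000_minutes(events, minutes):
    """Scale a raw event count to a rate per 3000 minutes played."""
    return events / minutes * 3000

# Figures from the post: 24 offensive fouls drawn (CH) + 54 steals (ST) in 1757 minutes.
dooling_rate = per_3000_minutes(24 + 54, 1757)
print(round(dooling_rate))
```

The same helper applies to any other counting stat that needs to be compared across players with different minutes.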
bbstats
Posts: 227
Joined: Thu Apr 21, 2011 8:25 pm
Location: Boone, NC

Re: LambdaPM: A new way of looking at adjusted +/-

Post by bbstats »

Lots of fun math here...trying to understand "Sparse Player Rating"...

Can someone explain more simply how this varies from Least Squares/APM?


Also - of course maximizing prediction performance of players with more minutes will improve final-score prediction...but this then ignores players who perform equally well with less playing time? Or was this addressed?

I guess it is important for us to remember, at any rate.
EvanZ
Posts: 912
Joined: Thu Apr 14, 2011 10:41 pm
Location: The City

Re: LambdaPM: A new way of looking at adjusted +/-

Post by EvanZ »

bbstats wrote:Lots of fun math here...trying to understand "Sparse Player Rating"...

Can someone explain more simply how this varies from Least Squares/APM?


Also - of course maximizing prediction performance of players with more minutes will improve final-score prediction...but this then ignores players who perform equally well with less playing time? Or was this addressed?

I guess it is important for us to remember, at any rate.
I assume regularization would enforce that "players who perform equally well with less playing time" would be regressed closer to the mean anyway. Someone correct me if I am wrong about that.
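A rough way to see this shrinkage, as a sketch under strong simplifying assumptions rather than the model from the paper: treat one player in isolation, with a 0/1 indicator column and no teammates or opponents. Minimizing sum((y_i - beta)^2) + lambda * beta^2 then gives beta = sum(y) / (n + lambda). The function name, the penalty value, and the two players below are all hypothetical.

```python
def ridge_rating(total_margin, n_obs, lam):
    """Closed-form ridge estimate for a single isolated player:
    beta = sum(y) / (n + lambda). Teammates and opponents are ignored."""
    return total_margin / (n_obs + lam)

lam = 100.0  # hypothetical penalty strength

# Two hypothetical players with the same raw average of +0.05 per possession,
# observed over different numbers of possessions.
starter = ridge_rating(0.05 * 4000, 4000, lam)
reserve = ridge_rating(0.05 * 400, 400, lam)

print(starter, reserve)
```

With the same raw average, the player with fewer observations ends up closer to zero, which is the regression toward the mean described above.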
v-zero
Posts: 520
Joined: Sat Oct 27, 2012 12:30 pm

Re: LambdaPM: A new way of looking at adjusted +/-

Post by v-zero »

EvanZ wrote:
bbstats wrote:Lots of fun math here...trying to understand "Sparse Player Rating"...

Can someone explain more simply how this varies from Least Squares/APM?
From my brief perusal earlier I would liken it to RAPM with additional penalization for straying too far from any given player's box-score based rating. In that I would say it is very similar to xRAPM.
mtamada
Posts: 163
Joined: Thu Apr 14, 2011 11:35 pm

Re: LambdaPM: A new way of looking at adjusted +/-

Post by mtamada »

v-zero wrote:From my brief perusal earlier I would liken it to RAPM with additional penalization for straying too far from any given player's box-score based rating. In that I would say it is very similar to xRAPM.
Right, what would be interesting to see would be a comparison of SPR's predictive ability and xRAPM's. I think xRAPM also takes into account previous seasons' stats, whereas SPR is single-season only? But one could presumably do a multi-season version of SPR. Or a single-season version of xRAPM.


A couple of editorial/formatting comments on the paper: Table 2 appears early in the paper and has a column for SPR2, but SPR2 isn't explained until section 7 in the paper, so the reader is left scratching his head wondering what SPR2 is. Worse, a text search for "SPR2" only leads back to those summary tables and figures, instead of to section 7. If nothing else, an asterisk and a footnote telling the reader "See section 7 for a description of SPR2" would be helpful.

The same with figure 2, the caption mentions SPR2 but doesn't explain it or tell the reader where to find the explanation. And the legend inside the figure refers to "Bamp-CV" and "Bamp-CV2" which I presume are earlier names for SPR and SPR2, and ought to be updated to those new names.
mtamada
Posts: 163
Joined: Thu Apr 14, 2011 11:35 pm

Re: LambdaPM: A new way of looking at adjusted +/-

Post by mtamada »

bchaikin wrote:Here is the latest version of this work: http://arxiv.org/abs/1301.3523

your paper lists keyon dooling as the most underrated player in the 2010-11 season, correct?...

if so, here's my question - if an nba GM asked you why, what would you tell him? here is why i ask:

his man defense was very good that season, but:

[...box score stats deleted]

The paper makes it clear (or at least tries to make it clear) that the lists are of players who are underrated or overrated relative to their box score stats. If a player's box score stats are the 300th best in the league, but his impact on the court is equal to say the 250th best player, then he's still a well below average player, but one whose box score stats underrate him.

Dooling's SPR rating was significantly higher than his box score rating. Thus, he was highly underrated according to SPR.

Dooling's SPR rating was positive but not very high, he was below Raymond Felton and Vince Carter for example, and not much higher than Jared Dudley. Just because a player is underrated doesn't mean that he's a guy you want to add to your team.
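The "underrated relative to box score" idea can be sketched as a simple difference between two ratings. Everything below is hypothetical: the player names, the numbers, and the function `rating_gaps` are invented for illustration and are not values from the paper.

```python
def rating_gaps(spr_rating, box_rating):
    """Gap = on-court rating minus box score rating.
    Positive gap: the box score underrates the player.
    Negative gap: the box score overrates the player."""
    return {p: spr_rating[p] - box_rating[p] for p in spr_rating}

# Hypothetical ratings, not taken from the paper.
spr = {"Player A": 0.5, "Player B": 4.0, "Player C": 1.0}
box = {"Player A": -2.0, "Player B": 4.5, "Player C": 3.0}

gaps = rating_gaps(spr, box)
most_underrated = max(gaps, key=gaps.get)
most_overrated = min(gaps, key=gaps.get)
print(most_underrated, most_overrated)
```

Note that Player A has the lowest on-court rating of the three and is still the most underrated, which is the same situation as a modestly rated player topping the underrated list.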


As for the actual reasons why the box score stats seem to underrate Dooling: assuming that SPR is correct in saying he was underrated, then the usual reasons would be the things that are not captured by box score stats: defense, and the ability to make one's teammates play better be it by positioning, hustle, passing, decision-making, or whatever.
Rhuidean
Posts: 15
Joined: Thu Jun 02, 2011 3:48 pm

Re: LambdaPM: A new way of looking at adjusted +/-

Post by Rhuidean »

@mtamada: Thanks for those formatting comments, much appreciated.

1. I'd tend to agree with @mtamada's interpretation of "overrated." Essentially, SPR gives you both:

a) a player ratings vector (like APM does),
b) but also gives you a weights vector for box score statistics, which you can then use to calculate another rating for each player (kind of like Hollinger's PER).

If there is a big gap between these two numbers, then someone is overrated (or underrated). Or at least, this is how I'm defining "overrated" in this manuscript.

2. SPR is currently single season only. I'm not sure of the best way to make it multi-season at the moment. How does one weight previous years' PBP data and box score data...? I guess this is a parameter one could also learn via cross-validation. Or you could try something ad hoc like just cutting the weight of each past year by 1/2...I dunno.

3. Yeah, if you set lambda_1 to zero, then SPR is basically least squares regression with a "closeness to the subspace represented by the box score matrix" penalty. On the other hand, ridge regression is least squares regression, but with a "closeness to the point 0" penalty. So in this sense, SPR and ridge regression are similar.

The upshot seems to be that using a box score prior (like in SPR) gets you better player ratings than just using the ridge regression prior (see Table 2 and Table 4 for head-to-head comparisons of SPR and ridge regression for this dataset).
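The comparison in point 3 can be written out side by side. This is a sketch in notation assumed here for illustration, not copied from the paper: X is the design matrix of who was on the court, y the observed margins, beta the player ratings, R the player-by-statistic box score matrix, z the box score weights, and lambda_2 a stand-in name for the coefficient on the box score penalty.

```latex
% Notation is assumed for illustration; see the paper for the exact objective.
\begin{align*}
\text{ridge:}\quad
  & \min_{\beta}\ \|y - X\beta\|_2^2 + \lambda\,\|\beta - 0\|_2^2 \\
\text{box score prior } (\lambda_1 = 0)\text{:}\quad
  & \min_{\beta,\,z}\ \|y - X\beta\|_2^2 + \lambda_2\,\|\beta - Rz\|_2^2 \\
\text{since}\quad
  & \min_{z}\ \|\beta - Rz\|_2^2
    = \operatorname{dist}\bigl(\beta,\ \operatorname{col}(R)\bigr)^2
\end{align*}
```

Read this way, ridge pulls the ratings toward a single point (the origin), while the box score version pulls them toward the nearest point in the column space of R. The two numbers per player from point 1 are then beta_i and (Rz)_i.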
v-zero
Posts: 520
Joined: Sat Oct 27, 2012 12:30 pm

Re: LambdaPM: A new way of looking at adjusted +/-

Post by v-zero »

Rhuidean wrote: 2. SPR is currently single season only. I'm not sure of the best way to make it multi-season at the moment. How does one weight previous years' PBP data and box score data...? I guess this is a parameter one could also learn via cross-validation. Or you could try something ad hoc like just cutting the weight of each past year by 1/2...I dunno.
How about using an exponential decay function (with time as the decay parameter) for all of the data, box score and matchup - even in season? Finding the decay coefficient could be done on the basis of one-step-ahead prediction accuracy.
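A minimal sketch of that idea, with a weighted mean standing in for the actual regression: weight each past observation by exp(-rate * age), then pick the rate from a candidate grid by one-step-ahead squared error. The function names, the toy series, and the candidate rates are all hypothetical. A rate of ln 2 per year reproduces the "cut each past year by 1/2" suggestion.

```python
import math

def decay_weights(ages, rate):
    """Exponential decay weight for each observation, given its age."""
    return [math.exp(-rate * age) for age in ages]

def one_step_ahead_error(series, rate):
    """Sum of squared errors when each value is forecast by the
    exponentially weighted mean of all earlier values."""
    total = 0.0
    for t in range(1, len(series)):
        ages = [t - s for s in range(t)]  # age of each earlier point
        w = decay_weights(ages, rate)
        forecast = sum(wi * xi for wi, xi in zip(w, series[:t])) / sum(w)
        total += (series[t] - forecast) ** 2
    return total

def pick_rate(series, candidate_rates):
    """Choose the candidate decay rate with the lowest one-step-ahead error."""
    return min(candidate_rates, key=lambda r: one_step_ahead_error(series, r))

# Hypothetical, steadily improving margins; not real data.
margins = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
best = pick_rate(margins, [0.0, math.log(2), 2.0])

# rate = ln 2 halves the weight for every unit of age.
halving = decay_weights([0, 1, 2], math.log(2))
print(best, halving)
```

On a trending series like this one the fastest decay wins, because recent values are the most informative; on noisy, stable data a slower decay would be expected to do better.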
bchaikin
Posts: 307
Joined: Thu May 12, 2011 2:09 am

Re: LambdaPM: A new way of looking at adjusted +/-

Post by bchaikin »

mtamada wrote:The paper makes it clear (or at least tries to make it clear) that the lists are of players who are underrated or overrated relative to their box score stats.

here is what the paper clearly states:

In the National Basketball Association (NBA), teams must make choices about which players to acquire, how much to pay them, and other decisions that are fundamentally dependent on player effectiveness. Thus, there is great interest in quantitatively understanding the impact of each player. In this paper we develop a new penalized regression model for the NBA, use cross-validation to select its tuning parameters, and then use it to produce ratings of player ability. We then apply the model to the 2010-2011 NBA season to predict the outcome of games. We compare the performance of our procedure to other known regression techniques for this problem, and demonstrate empirically that our model produces substantially better predictions. We evaluate the performance of our procedure against the Las Vegas gambling lines, and show that with a sufficiently large number of games to train on our model outperforms those lines. Finally, we demonstrate how the technique developed in this paper can be used to quantitively identify “overrated” players who are less impactful than common wisdom might suggest.

these are some bold claims...

so what would you tell an nba GM, who wants to understand your system, about keyon dooling - specifically? that my system shows he's the league's most underrated player, but i don't know exactly why? that:

he was underrated... the usual reasons would be the things that are not captured by box score stats: defense, and the ability to make one's teammates play better be it by positioning, hustle, passing, decision-making, or whatever.

this helps a GM - how?...

this same system lists shawn marion as one of the league's most overrated players in 2010-11. the same shawn marion that was the mavericks' top perimeter defender that regular season (he typically guarded the opponent's top offensive option), and their top perimeter defender in the playoffs on the team that won the title that year. just out of curiosity have you ever heard how mark cuban publicly raves about how valuable marion has been to the mavericks, especially when talking about that title team?...

so why does this system list marion as one of the league's most overrated players that season? what - specifically - did he do or not do that made him overrated?...
talkingpractice
Posts: 194
Joined: Tue Oct 30, 2012 6:58 pm
Location: The Alpha Quadrant

Re: LambdaPM: A new way of looking at adjusted +/-

Post by talkingpractice »

Dooling's IPV on the defensive side of the court was a tad over +2 over 2700 combined minutes in the 2010 and 2011 seasons, and he was a heck of a good defensive player for ORL from 2006-2008 too.
Rhuidean
Posts: 15
Joined: Thu Jun 02, 2011 3:48 pm

Re: LambdaPM: A new way of looking at adjusted +/-

Post by Rhuidean »

bchaikin wrote:The paper makes it clear (or at least tries to make it clear) that the lists are of players who are underrated or overrated relative to their box score stats.

here is what the paper clearly states:

In the National Basketball Association (NBA), teams must make choices about which players to acquire, how much to pay them, and other decisions that are fundamentally dependent on player effectiveness. Thus, there is great interest in quantitatively understanding the impact of each player. In this paper we develop a new penalized regression model for the NBA, use cross-validation to select its tuning parameters, and then use it to produce ratings of player ability. We then apply the model to the 2010-2011 NBA season to predict the outcome of games. We compare the performance of our procedure to other known regression techniques for this problem, and demonstrate empirically that our model produces substantially better predictions. We evaluate the performance of our procedure against the Las Vegas gambling lines, and show that with a sufficiently large number of games to train on our model outperforms those lines. Finally, we demonstrate how the technique developed in this paper can be used to quantitively identify “overrated” players who are less impactful than common wisdom might suggest.

these are some bold claims...

so what would you tell an nba GM, who wants to understand your system, about keyon dooling - specifically? that my system shows he's the league's most underrated player, but i don't know exactly why? that:

he was underrated... the usual reasons would be the things that are not captured by box score stats: defense, and the ability to make one's teammates play better be it by positioning, hustle, passing, decision-making, or whatever.

this helps a GM - how?...

this same system lists shawn marion as one of the league's most overrated players in 2010-11. the same shawn marion that was the mavericks' top perimeter defender that regular season (he typically guarded the opponent's top offensive option), and their top perimeter defender in the playoffs on the team that won the title that year. just out of curiosity have you ever heard how mark cuban publicly raves about how valuable marion has been to the mavericks, especially when talking about that title team?...

so why does this system list marion as one of the league's most overrated players that season? what - specifically - did he do or not do that made him overrated?...
I guess perhaps my goal wasn't clear. Essentially, there are several questions of interest:

(1) What is Player X's rating according to your regression procedure?
(2) Why should I believe your ratings over other ratings?
(3) Why is Player X rated at that level by your system?

My goal was to answer questions (1) and (2).

It seems to me that fully answering (3) would require watching a lot of video tape on Player X, talking to coaches and players, etc.
Crow
Posts: 10622
Joined: Thu Apr 14, 2011 11:10 pm

Re: LambdaPM: A new way of looking at adjusted +/-

Post by Crow »

Rhuidean,

Thanks for providing an update. I will look into it later.