RAPM request thread

colts18
Posts: 313
Joined: Fri Aug 31, 2012 1:52 am

Re: RAPM request thread

Post by colts18 »

DSMok1 wrote:Jerry posted 15 year RAPM with age adjustments here; it's basically what I used to create Box Plus/Minus: http://www.apbr.org/metrics/viewtopic.p ... 673#p24673
You used age adjustments for your BPM?
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: RAPM request thread

Post by DSMok1 »

colts18 wrote:
DSMok1 wrote:Jerry posted 15 year RAPM with age adjustments here; it's basically what I used to create Box Plus/Minus: http://www.apbr.org/metrics/viewtopic.p ... 673#p24673
You used age adjustments for your BPM?
The age adjustments are included in the RAPM regression and then backed out again. This is necessary to get the most accurate RAPM results; otherwise, the regression would use the same value for 38-year-old Shaq as for 28-year-old Shaq.
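A minimal sketch of what "included in the regression and then backed out" could look like, assuming the age effect is a known additive curve applied to every player on the floor. The curve `age_curve`, its parameters, and the function names are hypothetical illustrations, not the model Jerry actually ran.

```python
import numpy as np

def age_curve(age, peak_age=27.0, decline=0.05):
    """Hypothetical aging curve: points per 100 possessions relative to peak."""
    return -decline * (age - peak_age) ** 2

def fit_age_adjusted_rapm(X, ages, y, lam):
    """Ridge fit of age-neutral (peak-level) ratings given a known aging curve.

    X    : (stints, players), +1 for one team, -1 for the other, 0 if off the floor
    ages : (stints, players), age of each player during each stint
    y    : (stints,), point margin per 100 possessions
    """
    # Move the age effect of everyone on the floor to the left-hand side.
    age_effect = (X * age_curve(ages)).sum(axis=1)
    y_adj = y - age_effect
    # Closed-form ridge: (X'X + lam * I) beta = X' y_adj
    n_players = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_players), X.T @ y_adj)

def rating_at_age(peak_rating, age):
    """Back the adjustment out: the implied rating at a specific age."""
    return peak_rating + age_curve(age)
```

Under these assumptions the regression estimates one age-neutral number per player, and `rating_at_age` turns it into different values for the same player at 28 and at 38.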
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
Nathan
Posts: 137
Joined: Sat Jun 22, 2013 4:30 pm

Re: RAPM request thread

Post by Nathan »

This is sort of a minor quibble, but I was planning to look at the effect of age on plus/minus as well, and if age was involved in the calculation of RAPM it might complicate this. For example, if you have a 19 year old and a 35 year old with identical box score stats, which one is (most likely) the better player? My early results suggest that there's a significant difference, with the older player being nearly 2 points per 100 possessions better than the younger player.

EDIT: to be clear, I see why the age adjustments you mention are crucial to producing RAPM from a 15-year sample. This is just me continuing to argue for why I would prefer single-year APM for the purposes of making SPM.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: RAPM request thread

Post by DSMok1 »

Nathan wrote:This is sort of a minor quibble, but I was planning to look at the effect of age on plus/minus as well, and if age was involved in the calculation of RAPM it might complicate this. For example, if you have a 19 year old and a 35 year old with identical box score stats, which one is (most likely) the better player? My early results suggest that there's a significant difference, with the older player being nearly 2 points per 100 possessions better than the younger player.

EDIT: to be clear, I see why the age adjustments you mention are crucial to producing RAPM from a 15-year sample. This is just me continuing to argue for why I would prefer single-year APM for the purposes of making SPM.
I understand your concerns, but the age adjustment was purely based on a best fit within RAPM, not related to box scores at all. So I don't think it would be an issue for you.

Single year APM is just too noisy to really use for anything, in my opinion.
Nathan
Posts: 137
Joined: Sat Jun 22, 2013 4:30 pm

Re: RAPM request thread

Post by Nathan »

I guess I could give it a whirl with the 15-year RAPM, although I have the lingering concern that it assumes all NBA players follow the league average aging curve. I can see how ultimately it contains more information than 15 independent years of NPI APM. Are the numbers in that file "peak" RAPM (e.g. so Jordan's 4.18 is his retrojected peak based on his few over-the-hill seasons that appear in the sample)? I guess in any case I need to know the specifics of the aging curve used.
Nathan
Posts: 137
Joined: Sat Jun 22, 2013 4:30 pm

Re: RAPM request thread

Post by Nathan »

One other thought on age adjustments in 15-year RAPM. It seems like it would perform badly for players with unusual career trajectories. For instance, am I right to say that it would tend to overestimate Derrick Rose's effectiveness this season because this "should" be his peak, and underestimate Rose's effectiveness in his MVP season because he "shouldn't" have been very good at that age, given how poorly he played post-injury?

I know of course that such cases are quite uncommon, but is this a correct understanding of how the age adjustment works?
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: RAPM request thread

Post by DSMok1 »

Nathan wrote:One other thought on age adjustments in 15-year RAPM. It seems like it would perform badly for players with unusual career trajectories. For instance, am I right to say that it would tend to overestimate Derrick Rose's effectiveness this season because this "should" be his peak, and underestimate Rose's effectiveness in his MVP season because he "shouldn't" have been very good at that age, given how poorly he played post-injury?

I know of course that such cases are quite uncommon, but is this a correct understanding of how the age adjustment works?
Yes, that's a definite issue, but not easily solved.
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: RAPM request thread

Post by J.E. »

One step towards a solution would be to add injuries to the model (ACL, meniscus tear/repair), but that data is hard to come by.
Nathan
Posts: 137
Joined: Sat Jun 22, 2013 4:30 pm

Re: RAPM request thread

Post by Nathan »

Prior-year informed RAPM (meaning RAPM with priors equal to the previous year's RAPM; I think that's the correct term) wouldn't suffer from that problem quite as dramatically. One way to make prior-year informed RAPM even better at dealing with these cases might be to adjust the prior when games are missed. Basically, a player's post-absence prior would be lower than his pre-absence prior by an amount that is some function of the number of games missed. For instance (ignoring age adjustments for simplicity):

Player A previous year RAPM +3, misses no games, so his prior is simply +3.

Player B previous year RAPM +3, misses the first 10 games, so his prior is +2.

Player C previous year RAPM +3, misses the first 40 games, so his prior is 0.

Player D previous year RAPM +3, plays the first 30 games, misses 10 games, then plays the last 40 games. His prior for the first 30 games is +3, for the last 40 games is +2. Pre-injury and post-injury player D would basically be treated as different players in the model.

Player E previous year RAPM +3, but missed the last 40 games of last season, so his prior is 0.

That would improve the situation to the extent possible without delving into the specifics of different injuries.
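One function that reproduces the example numbers above, as a sketch only: a fixed penalty of 0.1 per missed game, floored at zero for positive priors. Both the penalty size and the floor are assumptions chosen to match players A, B, C and E; any other decreasing function of games missed would fit the idea equally well.

```python
def absence_adjusted_prior(prev_rapm, games_missed, penalty_per_game=0.1):
    """Discount last season's RAPM for time missed.

    Hypothetical rule: subtract a fixed penalty per missed game, but never
    push a positive prior below zero and never change a non-positive prior.
    """
    floor = min(prev_rapm, 0.0)
    return max(prev_rapm - penalty_per_game * games_missed, floor)

# Players A, B, C and E from the examples (all +3 the previous year).
for label, missed in [("A", 0), ("B", 10), ("C", 40), ("E", 40)]:
    print(label, absence_adjusted_prior(3.0, missed))
```

Player D would be entered as two separate players in the design matrix: the pre-injury version with the prior for 0 games missed, and the post-injury version with the prior for 10 games missed.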



EDIT: I know this is the request thread, so to be clear, I'm not requesting that you do this; it's just a thought. I'd still like to have either more seasons of 1-year APM or the underlying aging curve for the 15-year RAPM so that I can continue working on my SPM. There's no particular hurry, but I would like to know whether or not you can do this, so that I know whether I should wait or start working on a different project.
permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: RAPM request thread

Post by permaximum »

DSMok1 wrote:
colts18 wrote:
DSMok1 wrote:Jerry posted 15 year RAPM with age adjustments here; it's basically what I used to create Box Plus/Minus: http://www.apbr.org/metrics/viewtopic.p ... 673#p24673
You used age adjustments for your BPM?
The age adjustments are included in the RAPM regression and then backed out again. This is necessary to get the most accurate RAPM results; otherwise, the regression would use the same value for 38-year-old Shaq as for 28-year-old Shaq.
Well, that's not true for that 15-year RAPM: the age adjustment was not applied in reverse after the calculation. It was true, however, for the 14-year RAPM that was shared earlier.

@Nathan

If you go down that route, you won't like what you see. There are a lot of generalized adjustments in RAPM, RPM, etc. To reduce noise, J.E. added a tremendous amount of bias. What you're looking for is single-year APM or single-year NPI-RAPM. However, you should know that, for some reason, people get different RAPM results for the same year, and the differences cannot be explained by the regression penalty. I believe the languages and packages used to calculate it are the likely cause. For example, I suspect there's something wrong with the glmnet package for R: it uses an efficient method to do ridge regression, but it probably skews the results if you use the weights argument.
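One way to check whether a package is skewing a weighted ridge fit is to compare its output with the closed-form solution on a small problem. A minimal sketch, assuming the objective is a weighted sum of squared errors plus lam times the squared coefficient norm, with no intercept and no column standardisation; the function name is hypothetical and nothing here has been checked against glmnet.

```python
import numpy as np

def weighted_ridge(X, y, w, lam):
    """Minimise sum_i w_i * (y_i - x_i . beta)^2 + lam * ||beta||^2.

    No intercept, no standardisation of columns, lam on the raw scale.
    """
    Xw = X * w[:, None]                      # each row scaled by its weight
    A = X.T @ Xw + lam * np.eye(X.shape[1])  # X' W X + lam * I
    return np.linalg.solve(A, Xw.T @ y)      # solves A beta = X' W y
```

Differences between implementations can come down to conventions such as standardising columns, fitting an intercept, or scaling the penalty by the number of observations, so the same nominal lambda need not describe the same fit.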

To develop an SPM, the ideal scenario would be to use one big RAPM (1996/97 to today, playoffs included) that has the age adjustment applied and then reverted. Obviously, because of the ridge penalty and the generalized age adjustment, there will be some bias. To reduce the bias of the age adjustment, you can calculate the single-year NPI-RAPM of each player in that period and then adjust the big ridge regression values with your findings. That gives a different age adjustment for different players.

I could do that if I had matchup files for all those years. But I don't have them :)
Nathan
Posts: 137
Joined: Sat Jun 22, 2013 4:30 pm

Re: RAPM request thread

Post by Nathan »

Indeed, my first choice would be to have many years of NPI APM and not have to worry about bias at all. Only if it truly turned out to be too noisy (e.g. even for simple linear model SPM the coefficients don't converge strongly) would I turn to other options. I don't think that will be the case, however. Previously I had been using GotBuckets' APM, which comes in 2-year chunks, but it hasn't been updated recently. With ratings as high as +19, it was very noisy (i.e. clearly did not utilize ridge regression), but I still got decent results.
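To illustrate why unpenalised APM can produce extreme ratings, here is a small simulation sketch comparing ordinary least squares with ridge on the same stints. Everything in it is synthetic and hypothetical: the stint matrix, the noise level, and the penalty `lam=50.0` are made up for illustration and are not GotBuckets' or anyone else's settings.

```python
import numpy as np

def apm_and_rapm(X, y, lam):
    """Return (unpenalised APM, ridge-penalised RAPM) for the same stints."""
    apm, *_ = np.linalg.lstsq(X, y, rcond=None)
    rapm = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    return apm, rapm

rng = np.random.default_rng(0)
n_stints, n_players = 300, 10
X = rng.choice([-1.0, 0.0, 1.0], size=(n_stints, n_players))
# Players 0 and 1 almost always share the floor; this near-collinearity
# is a common reason plain APM estimates blow up.
X[:, 1] = X[:, 0]
X[:5, 1] = 0.0
true_rating = rng.normal(0.0, 2.0, size=n_players)
y = X @ true_rating + rng.normal(0.0, 12.0, size=n_stints)

apm, rapm = apm_and_rapm(X, y, lam=50.0)
print("largest |APM| :", np.abs(apm).max())
print("largest |RAPM|:", np.abs(rapm).max())
```

The ridge coefficient vector is never longer than the least-squares one, which is the shrinkage that trades noise for bias.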
colts18
Posts: 313
Joined: Fri Aug 31, 2012 1:52 am

Re: RAPM request thread

Post by colts18 »

permaximum wrote:
To develop an SPM, the ideal scenario would be to use one big RAPM (1996/97 to today, playoffs included) that has the age adjustment applied and then reverted. Obviously, because of the ridge penalty and the generalized age adjustment, there will be some bias. To reduce the bias of the age adjustment, you can calculate the single-year NPI-RAPM of each player in that period and then adjust the big ridge regression values with your findings. That gives a different age adjustment for different players.

I could do that if I had matchup files for all those years. But I don't have them :)
There is pbp data since 1997, so I hope someone goes back and runs a 20-year RAPM for 1997-2016. It would be crazy if we could get a dataset that comprehensive. I hope that J.E. will be able to run that after the season.

It would also be interesting if BPM could be recalibrated on that 20-year dataset. That would make it more accurate for past seasons, because the dataset would extend into more years when the 3-point shot was not a big deal. I'm not sure we can use BPM to evaluate '80s players, because different skill sets might have been more valuable in that era (mid-range shooting instead of 3-point shooting).
Last edited by colts18 on Thu Jan 21, 2016 5:50 pm, edited 1 time in total.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: RAPM request thread

Post by DSMok1 »

colts18 wrote:
permaximum wrote:
To develop an SPM, the ideal scenario would be to use one big RAPM (1996/97 to today, playoffs included) that has the age adjustment applied and then reverted. Obviously, because of the ridge penalty and the generalized age adjustment, there will be some bias. To reduce the bias of the age adjustment, you can calculate the single-year NPI-RAPM of each player in that period and then adjust the big ridge regression values with your findings. That gives a different age adjustment for different players.

I could do that if I had matchup files for all those years. But I don't have them :)
There is pbp data since 1997, so I hope someone goes back and runs a 20-year RAPM for 1997-2016. It would be crazy if we could get a dataset that comprehensive. I hope that J.E. will be able to run that after the season.

It would also be interesting if BPM could be recalibrated on that 20-year dataset. That would make it more accurate for past seasons, because the dataset would extend into more years when the 3-point shot was not a big deal. I'm not sure we can use BPM to evaluate '80s players, because different skill sets might have been more valuable in that era (mid-range shooting instead of 3-point shooting).
I'd love to update BPM to a new basis. I'd also like to see if a long-term RAPM with an MPG prior could do any better.
rsmth
Posts: 2
Joined: Tue Jun 18, 2013 2:54 pm

Re: RAPM request thread

Post by rsmth »

On the topic of RAPM: if anyone is familiar with R's glmnet (specifically the offset argument) and can answer the question below, please PM me or reply here.

http://stackoverflow.com/questions/3495 ... sion-prior
xkonk
Posts: 307
Joined: Fri Apr 15, 2011 12:37 am

Re: RAPM request thread

Post by xkonk »

If the code you posted there is what you actually ran, then problem #1 is that the 1415prior in 'offset=1415prior' doesn't exist (you named it prior1415) and can't exist because variable names can't start with a numeral.
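The code from the linked question is not reproduced in this thread, so what follows is only a sketch of the general technique (ridge regression that shrinks toward a prior by way of an offset), written in Python with NumPy rather than glmnet. The function name and the variable `prior_1415` are hypothetical; note that Python, like R, does not allow an identifier to start with a digit.

```python
import numpy as np

def ridge_toward_prior(X, y, prior, lam):
    """Ridge regression that shrinks coefficients toward `prior`, not toward 0."""
    offset = X @ prior      # what the prior alone predicts for each stint
    resid = y - offset      # fit only what the prior leaves unexplained
    delta = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ resid)
    return prior + delta    # final rating = prior + penalised correction

# Toy usage: three players, last season's ratings used as the prior.
rng = np.random.default_rng(2)
X = rng.choice([-1.0, 0.0, 1.0], size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0.0, 5.0, size=100)
prior_1415 = np.array([3.0, 0.0, -1.0])
print(ridge_toward_prior(X, y, prior_1415, lam=20.0))
```

With a very large `lam` the result stays at the prior; with a very small one it approaches the ordinary least-squares fit.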