A few Questions about RAPM
Posted: Fri Dec 12, 2014 11:32 pm
Hello, I recently became very interested in RAPM and was hoping to generate some multi-year and weighted results. However, I am fairly inexperienced at coding and at building more complex models, which brings me to my questions:
1. First, where might I be able to buy or download the needed data sets? I have found a few websites, but none provide play-by-play data going back further than 2005.
2. If I wanted to generate the average RAPM over multiple years (let's say 2001-2010), would I need to create one data set containing every season?
3. I have found the R code for basic RAPM, but I was wondering how to weight certain seasons, or just the playoffs, more heavily.
4. I found that different websites yield different RAPM results. For example, the 2002 data according to http://stats-for-the-nba.appspot.com is very different from https://sites.google.com/site/rapmstats/2002-rapm. Which website should I use to cross-reference my (single-season) results? Are the differences due to the choice of priors?
5. Looking at pre-existing RAPM data, I noticed that some seasons seem to have depressed results, and I was wondering how that might affect multi-year RAPM. Would it be best to normalize the results (standard deviations away from the mean), and if so, how would I do that in RStudio?
Thanks for any answers.
3. I have found the R code for basic RAPM, but I was wondering how to weight certain seasons, or just the playoffs, more heavily.
Code:
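library(glmnet)
# 'data' is assumed to be loaded already: one row per stint, player columns plus the columns used below
Marg <- data$MarginPer100  # stored but not used in this rebounding fit
Poss <- data$Possessions   # possessions per stint, used as observation weights
# Rebound-rate margin: home rate minus its complement (100 - home rate)
RebMarg <- data$RebRateHome - (100 - data$RebRateHome)
# Drop the non-player columns so they do not enter the design matrix
data$Possessions <- NULL
data$RebRateHome <- NULL
data$MarginPer100 <- NULL
x <- data.matrix(data)
# Cross-validate the penalty with alpha=0, so lambda is tuned for ridge rather than the default lasso
cv_fit <- cv.glmnet(x, RebMarg, weights=Poss, alpha=0, nfolds=5)
lambda.min <- cv_fit$lambda.min
# Possession-weighted ridge regression at the selected lambda
ridge <- glmnet(x, RebMarg, family="gaussian", weights=Poss, alpha=0, lambda=lambda.min)
coef(ridge, s=lambda.min)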
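One possible way to weight certain seasons or the playoffs more heavily is to scale the possession weights before they are passed to glmnet. The sketch below only builds the weight vector, using base R. It assumes the seasons have been stacked into one stint-level data set, and that it carries a Season column and an IsPlayoffs column. Those two columns, the toy numbers, and the multipliers are all made up for illustration and are not in the code above.

```r
# Hypothetical stacked stint data: one row per stint across several seasons.
# Season and IsPlayoffs are assumed columns, not part of the original code.
stints <- data.frame(
  Season      = c(2008, 2008, 2009, 2009, 2010, 2010),
  IsPlayoffs  = c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE),
  Possessions = c(10, 12, 8, 15, 20, 5)
)

# Assumed multipliers: later seasons count more, playoff stints count double.
season_mult  <- c("2008" = 0.5, "2009" = 0.75, "2010" = 1.0)
playoff_mult <- 2

# Observation weight = possessions x season multiplier x playoff multiplier.
w <- stints$Possessions *
  season_mult[as.character(stints$Season)] *
  ifelse(stints$IsPlayoffs, playoff_mult, 1)
w <- unname(w)  # drop the season names carried over from the lookup
print(w)
```

With real data, w would be passed as weights= to both cv.glmnet and glmnet in place of Poss. The Season and IsPlayoffs columns would also need to be set to NULL before data.matrix is called, otherwise they end up in x as predictors.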
5. Looking at pre-existing RAPM data, I noticed that some seasons seem to have depressed results, and I was wondering how that might affect multi-year RAPM. Would it be best to normalize the results (standard deviations away from the mean), and if so, how would I do that in RStudio?
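As for the mechanics of normalizing, a minimal base R sketch is below. It converts each season's values to standard deviations from that season's mean. The data frame, its column names, and the numbers are placeholders, and whether normalizing is the best choice for multi-year RAPM is a separate question this does not settle.

```r
# Hypothetical single-season RAPM results stacked into one data frame.
rapm <- data.frame(
  Player = c("A", "B", "C", "A", "B", "C"),
  Season = c(2002, 2002, 2002, 2003, 2003, 2003),
  RAPM   = c(1, 2, 3, 2, 4, 6)
)

# z-score within each season: (value - season mean) / season sd.
rapm$RAPM_z <- ave(rapm$RAPM, rapm$Season,
                   FUN = function(v) (v - mean(v)) / sd(v))
print(rapm)
```

This uses an unweighted mean and standard deviation, so low-minute players count as much as starters. Weighting by possessions or minutes would be a reasonable variation.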