You can view the 1983 to 2012 retrodictions for models 1 and 2 here (https://docs.google.com/spreadsheet/ccc ... SM0E#gid=0).
The 1983 to 2013 model 3 comparisons are here (https://docs.google.com/spreadsheet/ccc ... NTVE#gid=0).
And the 2013 predictions for models 1 and 2 are here (https://docs.google.com/spreadsheet/ccc ... sp=sharing).
Model 1 (Peak Win Shares):
The goal of the first model is to predict how many win shares a player will produce in his peak season before his 26th birthday. I found each NBA player’s peak pre-26 season in terms of win shares; in the case of more recent players, I estimated how many win shares they would collect in their age-25 season. I then fit mixed-effects models to predict “peak win shares,” with all of the basic box-score stats, age (in days), position, SOS, SRS, and a couple of interactions as fixed effects, and "era" (college seasons in 5-year blocks from 1980-85 to 2010+) as a random effect. To keep the retrodictions out of sample, I fit a separate model for each player in each year, using all players except ‘ego’ (the player being predicted).
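To make the "all players except ego" step concrete, here is a minimal sketch of the leave-one-player-out loop. It is not the actual model: a one-predictor least-squares line stands in for the mixed-effects model, and the `stat` column, the helper names, and the toy numbers are all made up for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least squares for y = a + b*x (a stand-in for the mixed-effects model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def leave_one_out_predictions(rows):
    """For each player-season, fit on every other player and predict this one."""
    preds = {}
    for ego in rows:
        # Drop ALL of ego's seasons so the prediction stays out of sample.
        train = [r for r in rows if r["name"] != ego["name"]]
        intercept, slope = fit_line([r["stat"] for r in train],
                                    [r["peak_ws"] for r in train])
        preds[(ego["name"], ego["season"])] = intercept + slope * ego["stat"]
    return preds

# Hypothetical toy data: one invented box-score composite per player-season.
rows = [
    {"name": "A", "season": 2011, "stat": 1.0, "peak_ws": 2.0},
    {"name": "A", "season": 2012, "stat": 2.0, "peak_ws": 4.0},
    {"name": "B", "season": 2012, "stat": 3.0, "peak_ws": 6.0},
    {"name": "C", "season": 2012, "stat": 4.0, "peak_ws": 8.0},
    {"name": "D", "season": 2012, "stat": 5.0, "peak_ws": 10.0},
]
loo = leave_one_out_predictions(rows)
```

Filtering on the player's name rather than on the single row matters here, because a player with several college seasons would otherwise help train the model that predicts him.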
After computing this value for each player in each year, I ran an additional regression using the “most recent college season prediction”, “2nd most recent college season prediction”, … “n’th most recent college season prediction” to explain the observed win share peak. This sets the weight for each college year and lets me compute a single value for the player in the year he was drafted.
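A rough sketch of how the season weights could be fit, under several assumptions that are not from the model itself: only two seasons are used, there is no intercept, every player has both seasons, and the function name and toy numbers are invented.

```python
def fit_season_weights(recent, previous, observed):
    """Least-squares weights (w1, w2) for observed ~ w1*recent + w2*previous.

    Assumes no intercept and exactly two seasons per player.
    """
    s11 = sum(a * a for a in recent)
    s22 = sum(b * b for b in previous)
    s12 = sum(a * b for a, b in zip(recent, previous))
    s1y = sum(a * y for a, y in zip(recent, observed))
    s2y = sum(b * y for b, y in zip(previous, observed))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("season predictions are collinear; weights not identified")
    # Solve the 2x2 normal equations with Cramer's rule.
    w1 = (s1y * s22 - s2y * s12) / det
    w2 = (s2y * s11 - s1y * s12) / det
    return w1, w2

# Hypothetical per-season predictions and observed peaks for four players.
recent = [4.0, 6.0, 2.0, 8.0]      # most recent college season prediction
previous = [3.0, 5.0, 4.0, 6.0]    # 2nd most recent college season prediction
observed = [0.7 * r + 0.3 * p for r, p in zip(recent, previous)]

w1, w2 = fit_season_weights(recent, previous, observed)
# One value per player in his draft year.
draft_values = [w1 * r + w2 * p for r, p in zip(recent, previous)]
```

Players with only one college season would need separate handling (for example, their own regression), which this sketch leaves out.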
Model 2 (Outcome likelihood):
This model attempts to capture the high-variance, gambling nature of the draft. Rather than pegging a specific expected production to each player, it gives each player a percent likelihood of being a "bust" (0 WS), "bench-warmer" (> 0, < 5 WS), "starter" (> 5, < 10 WS), or "star" (> 10 WS) at his peak performance. This model uses multinomial regression with most of the same predictors as model 1 (though it also includes height and weight).
For now this model does not try to account for the information given in each college season, but instead looks only at a player’s final season. I may add season weighting similar to model 1's in the near future, however.
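For anyone who wants to see the moving parts of Model 2, here is a small sketch of the outcome buckets and of the softmax step that a multinomial regression uses to turn one linear score per outcome into probabilities. The fitted coefficients are not shown. Two things are assumptions: a peak of exactly 5 WS is put in "starter" and exactly 10 WS in "star" (the ranges above leave those boundaries open), and anything at or below 0 WS counts as a bust. The function names and example scores are made up.

```python
import math

def outcome_label(peak_ws):
    """Map peak win shares to an outcome bucket (boundary handling is assumed)."""
    if peak_ws <= 0:
        return "bust"
    if peak_ws < 5:
        return "bench-warmer"
    if peak_ws < 10:
        return "starter"
    return "star"

def outcome_probabilities(scores):
    """Softmax: convert one linear score per outcome into probabilities."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

# Hypothetical linear scores for one prospect (not real model output).
scores = {"bust": 0.2, "bench-warmer": 1.1, "starter": 0.4, "star": -0.9}
probs = outcome_probabilities(scores)
```

The four probabilities always sum to 1, which is what lets each player be read as a spread of outcomes rather than a single expected value.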
Model 3 (Comparison finder):
This is just a fun little model that helps find past player seasons that are similar to ego’s. All it does is look for the players who minimize the average absolute difference, measured in standard deviations, across a set of statistics. The actual math chosen was a bit arbitrary and someone may come up with a better version, but here is what I am using for now:
((abs(2P.X - 2P.Y) + abs(2PA.X - 2PA.Y) + abs(3P.X - 3P.Y) + abs(3PA.X - 3PA.Y) + abs(FTA.X - FTA.Y) + abs(FT.X - FT.Y))/3 +
(abs(AST.X - AST.Y) + abs(TOV.X - TOV.Y))/2 +
(abs(STL.X - STL.Y) + abs(BLK.X - BLK.Y))/2 +
abs(TRB.X - TRB.Y) +
abs(PF.X - PF.Y)/4 +
(abs(Height.X - Height.Y) + abs(Weight.X - Weight.Y))/2 +
abs(Age.X - Age.Y) +
(abs(SOS.X - SOS.Y) + abs(SRS.X - SRS.Y))/2
) / 7.25
