Draft Projection Model [June 10 Update]

VJL · Post by **VJL** » Mon Jun 10, 2013 2:30 pm

I posted a simple explanation and results of my draft projection models on the APBR board here (viewtopic.php?f=2&t=8242), but I have since made some significant changes and wanted to open the model up to more technical scrutiny.

The model I present here is a mixed-effect linear regression with "the most win shares earned in an NBA season before age 26" as the dependent variable, a collection of college stats as fixed effects, and "college" and "college"-by-"era" groupings as random effects. Here is the R code for the model (using the lme4 package):

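Code:

# lmer() comes from the lme4 package
library(lme4)

# Response: most win shares earned in an NBA season before age 26
model <- lmer(WS.mx ~
	log(Age) + SRS + SOS +                          # age and team ratings
	X3P + X3PA + X2P + X2PA + FT*Pos + FTA*Pos +    # scoring as makes and attempts
	AST*TOV +
	TRB*Pos +
	STL + BLK + PF*Pos +
	Height +
	(1 | era:College) + (1 | College),              # random intercepts
	data = stat)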

Most of the variables should be self-explanatory. SRS and SOS are team ratings: "Simple Rating System" and "Strength of Schedule". Pos is position coded 1 through 5, with decimals when appropriate. "Era" is simply the past 30 years of NCAA basketball broken into 5-year blocks: players who played in the same block (80 to 85, 85 to 90, 95 to 00...) are grouped into the same "era". This variable in particular is one where I am interested in suggestions. What are some better, more distinctive break-points that changed the college environment? Institution of the 3-point line is one obvious one. Any changes in the rules or culture regarding early entry are probably important as well.
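For anyone following along, here is one way the 5-year "era" blocks could be built with base R's `cut()`. This is only a sketch: the `Season` column, the demo data frame, and the exact break years are assumptions for illustration and are not taken from the post.

```r
# Hypothetical example data: Season is the year a player's college season ended.
stat_demo <- data.frame(
  Player = c("A", "B", "C", "D"),
  Season = c(1983, 1988, 1999, 2012)
)

# Assumed break points: 5-year blocks starting in 1980.
breaks <- seq(1980, 2015, by = 5)

# right = FALSE makes each block [start, start + 5), so 1985 lands in "1985-1990".
stat_demo$era <- cut(
  stat_demo$Season,
  breaks = breaks,
  right = FALSE,
  labels = paste(head(breaks, -1), tail(breaks, -1), sep = "-")
)

print(stat_demo)
```

Swapping `breaks` for a hand-picked vector of rule-change years would give uneven eras without changing anything else.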

Here are the standardized outputs for the fixed effects:

Here is a link [http://t.co/fQnQ48rGgY] to the google document with the projection results 1981 to 2014. These also include the latest results with my multinomial "bust, bench, starter, or star" model and an RAPM-based model similar to the above. I should note that I made no effort to make these outputs "out of sample" so don't take the past results too seriously (especially for the RAPM model which uses a smaller sample).
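To make the "out of sample" caveat concrete, here is a rough sketch of leave-one-draft-class-out prediction. It uses simulated data and a plain `lm()` with a single predictor in place of the mixed model, purely to keep the example self-contained; the `DraftYear` column and every number in it are invented.

```r
set.seed(1)

# Simulated stand-in data; none of these values come from the real dataset.
demo <- data.frame(
  DraftYear = rep(2001:2005, each = 20),
  STL = runif(100, 0.5, 3)
)
demo$WS.mx <- 1 + 2 * demo$STL + rnorm(100)

# For each draft class, fit on every other class and predict the held-out one.
demo$pred_oos <- NA_real_
for (yr in unique(demo$DraftYear)) {
  held_out <- demo$DraftYear == yr
  fit <- lm(WS.mx ~ STL, data = demo[!held_out, ])
  demo$pred_oos[held_out] <- predict(fit, newdata = demo[held_out, ])
}

# Out-of-sample root mean squared error across all classes.
rmse_oos <- sqrt(mean((demo$WS.mx - demo$pred_oos)^2))
print(rmse_oos)
```

Each player's prediction here comes from a fit that never saw his draft class, which is the property the linked results do not have.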

Crow · Post by **Crow** » Thu Jun 13, 2013 1:48 pm

Thanks.

McLemore is estimated here at a 3% chance to be a star. The team insiders who pick him really high will probably be at least somewhat disappointed if they are expecting a "star".

Past estimates include a 3% chance of Paul George becoming a star. 12% for Wall. Westbrook only 6%. Durant pretty low at only 30%. Love was at 45% and Beasley at 47%.

Refinement still possible. I'd think that per 100 possession stats would be more helpful than the raw totals that I assume are being used. Would a separate shooting efficiency element add anything?

Misses with the projections will still occur with any system but I think there is still room to gain better fit to past cases and into the future with more work.

VJL · Post by **VJL** » Thu Jun 13, 2013 5:37 pm

Thanks for the comments!

Crow wrote:
McLemore is estimated here at a 3% chance to be a star. The team insiders who pick him really high will probably be at least somewhat disappointed if they are expecting a "star".
Past estimates include a 3% chance of Paul George becoming a star. 12% for Wall. Westbrook only 6%. Durant pretty low at only 30%. Love was at 45% and Beasley at 47%.

I should note that the "bust, bench, starter, star" percentages are from a completely different model than the one above. It uses a multinomial model and only includes players' final college season. There are some things I don't like about that model's design, and I am not very happy with the results either. I include it in the google document because it may add another wrinkle to the assessment, but I don't take it very seriously for now.

The results of the model presented above are only represented in the 1st column of the linked Google document, and the RAPM model in the second column is very similar to the above but has a few more variables and uses a smaller dataset. As you can see, those models loved George and give him the second highest average score of his class (in sample disclaimer applies).

I don't expect any amount of refinement to catch Westbrook or Beasley. I have no idea what Oklahoma City noticed with Westbrook but I doubt it can be identified in the numbers (maybe I am wrong though). Beasley... is Beasley. I think there is very good reason to doubt McLemore, so until the future proves otherwise I am considering his mediocre rating a feature rather than a bug.

Crow wrote:
I'd think that per 100 possession stats would be more helpful than the raw totals that I assume are being used.

The dataset from 2002 onward is pace adjusted, so that is effectively looking at "per possession". Ultimately I would love to pace-adjust the entire dataset, but that information isn't readily available. I might try to calculate "pseudo-pace" for the rest of the dataset using team stats, but I can't even find ORBs for most college players before 2000 so even that would be a very loose approximation. Still... the closer the data gets to per possession the better.
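As a rough illustration of the "pseudo-pace" idea, team possessions can be approximated from box-score totals and used to rescale a player's numbers. The sketch below assumes the commonly used 0.44 weight on free throw attempts and a 40-minute college game. The function names and all inputs are hypothetical, and the fallback for missing offensive rebounds is just one loose way to handle the gap described above.

```r
# Rough team possession estimate from per-game box-score totals.
# The 0.44 free-throw weight is a common approximation, not a value from the post.
estimate_possessions <- function(fga, fta, tov, orb = NA) {
  # If offensive rebounds are unavailable, leave them out; this overstates possessions.
  orb <- ifelse(is.na(orb), 0, orb)
  fga - orb + tov + 0.44 * fta
}

# Convert a per-game stat to a per-100-possession rate, scaling team
# possessions by the share of the game the player was on the floor.
per_100_poss <- function(stat_pg, min_pg, team_poss_pg, game_min = 40) {
  poss_on_floor <- team_poss_pg * (min_pg / game_min)
  100 * stat_pg / poss_on_floor
}

# Hypothetical team averages per game.
team_poss <- estimate_possessions(fga = 58, fta = 20, tov = 13, orb = 11)
print(team_poss)

# Hypothetical player: 2.1 steals in 30 minutes per game.
print(per_100_poss(stat_pg = 2.1, min_pg = 30, team_poss_pg = team_poss))
```

Without the offensive rebound term every team's pace is inflated, so the adjustment is only trustworthy for comparing teams to the extent that offensive rebounding is similar across them.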

Crow wrote:
Would a separate shooting efficiency element add anything?

I played with this. My original model of scoring included 2pnt% + 3pnt% + FT% + 3PA + 2PA + FTA. There may be a way to make this work, but the problem I encountered was with players like Brandon Ashley, who last season shot 100% from three on 1 attempt every other game. The model didn't handle that appropriately. Since the current approach credits for makes and debits for misses, I think it does a good job of implicitly capturing efficiency.
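The small-sample problem with percentages can be seen in a toy example. The two players and the coefficients below are made up for illustration; they are not estimates from the model.

```r
# Hypothetical per-game three-point lines for two players.
shooters <- data.frame(
  Player = c("low_volume", "high_volume"),
  X3P    = c(0.5, 2.4),   # makes per game
  X3PA   = c(0.5, 6.0)    # attempts per game
)

# A percentage term sees the low-volume player as the far better shooter.
shooters$pct <- shooters$X3P / shooters$X3PA

# With separate terms for makes and attempts, the contribution scales with volume.
# Illustrative coefficients only: each make is credited, each attempt is debited.
b_make <- 1.0
b_att  <- -0.3
shooters$contribution <- b_make * shooters$X3P + b_att * shooters$X3PA

print(shooters)
```

With a positive weight on makes and a negative weight on attempts, a miss costs the attempt penalty and a make nets the difference, so a perfect percentage on a trickle of shots earns very little.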

Barncore · Post by **Barncore** » Fri Jun 14, 2013 9:17 am

It's impossible for metrics to catch the Beasleys and the Westbrooks of the world, simply because it can't take into account psychological intangibles. It can make pretty good guesses about a player's physical impact, his skill level, and his feel for the game. But it's the psychological intangibles that make the difference.

In Beasley's case, he has all the basketball talent in the world. He's athletic, skilled, and has a natural scorer's instinct. But he just didn't give enough of a fuck. His own ego trumped his teamwork vision, and ultimately his basketball IQ. He wasn't Miami's go-to guy from the start (poor fit) and Beasley didn't feel like earning his stripes so he fizzled out.

In Westbrook's case, he was a raw 6th man in college with tonnes of athletic ability, a decent feel for the game, but a killer work ethic. His work habits are reportedly crazy good, and that's what I think contributes to player development the most. (It's the reason why I believe Oladipo has the highest upside of anyone in the draft.)

Here are some other previous prospects with reportedly great work habits (or some sort of other great psychological intangible like mental approach, will, competitive desire, character, camaraderie, self-belief): Kevin Love, Kevin Durant, Michael Kidd-Gilchrist (though still plenty of room left to grow), Damian Lillard (although he was also the beneficiary of being the perfect fit with Portland), Jae Crowder, Kenneth Faried, Kawhi Leonard, Greg Monroe, Stephen Curry, Derrick Rose, Jrue Holiday, Brandon Knight, Chandler Parsons, Isaiah Thomas, Jeremy Lin, Klay Thompson, Ben Wallace, Al Horford, Joakim Noah, and obviously players like Kobe Bryant, LeBron James, Chris Paul, and Michael Jordan too. Just to mention a few.

Here are some players with reported "poor psychological intangibles" (who therefore may struggle to reach their potential, or never did): Michael Beasley, DeMarcus Cousins, Terrence Williams, pre-2012 JR Smith, Marreese Speights, Tyrus Thomas, Rashad McCants, Delonte West, Sean Williams, Julius Hodge, many more.

Definitely a correlation. Their growth curve largely depends on how they're wired internally. But there's no way to incorporate that into a metric without using subjectivity and guessing, since it's hard to get a feel for how a player is "wired" without meeting them in person. Even then it's hard to tell. All we have are draft interviews and general hearsay. If there was a way to measure this somehow THEN we would never miss a sleeper again.

VJL · Post by **VJL** » Fri Jun 14, 2013 12:58 pm

Yeah.. I am a Timberwolves fan, so I was subjected to a lot of Michael Beasley. The tools were there for the most part, and he was even a pretty likable guy, but there were some things (like an inability to go right) that should not be a problem for a 3rd/4th year player.

Barncore wrote:
Michael Beasley, DeMarcus Cousins, Terrence Williams, pre-2012 JR Smith, Marreese Speights, Tyrus Thomas, Rashad McCants, Delonte West, Sean Williams, Julius Hodge

Heh... not sure if you did this intentionally, but that reads like a list of the top busts in my projection model.

I agree that approach to work is the first thing you should look at after assessing production. It is a really difficult thing to get a sense of as a fan unfortunately, but I am sure the guys actually making the picks have access to that kind of information... or are at least able to make good guesses. Oladipo looks like a good one from this perspective. I think Nate Wolters is another guy who has that reputation of "never leaves the gym" which only adds to his position as my favorite player outside the obvious elite prospects.

jmethven · Post by **jmethven** » Wed Jun 19, 2013 6:57 pm

It sounds plausible enough, but I seriously doubt the notion that if we could only figure out a player's mental makeup, we would know whether or not he would succeed. It seems to me much more plausible that the determination of whether or not a player has a strong work ethic is done with hindsight - i.e., if the player succeeded beyond expectations, he must have had a good work ethic, or if he failed, he must have had a poor work ethic. While I don't doubt that it is important for a player to work hard to improve their game (obviously so), the draft-eligible population consists of players who have already worked hard enough to get their games to the point where they have starred at the NCAA level. This is an extreme population where I believe that a strong work ethic is the norm, not the exception. Most players follow a fairly predictable development path and a lot of great players were great right away because they had already put in the work long before they got to the NBA.

Most of the players you mention with 'poor psychological intangibles' do seem to have a poor basketball IQ, but what's to say that feel for the game isn't as much of a natural talent as raw athleticism? I feel that inaccurate evaluation of talent is just as strong an explanation for the successes and failures of some players as psychological intangibles, if not more so.

In the case of Beasley and Westbrook, I do actually think that Beasley underperformed his talent level, but not all of it was due to poor intangibles. In the NBA, he is an average athlete and slightly undersized at his position, which always limited his upside. Conversely, Westbrook has most definitely improved his shooting and passing ability from where he was at in college, but he also is more explosive athletically than just about any other point guard in the league. It may not have really been showcased at UCLA, but I can guarantee you that scouts could see it and that's why he went #4 overall.

kjb · Post by **kjb** » Wed Jun 19, 2013 8:27 pm

Very interesting approach. Yours is a different statistical approach than the one I've taken as I've tried to build a statistical model for the draft, but the results are pretty similar. I notice, for example, that you have Danny Green 9th in his draft year. I'm a crappy programmer so I have underclassmen (not in the draft) muddling stuff up in my spreadsheet, but my system also had Green in the top 10. You have Ty Lawson high -- so did I. Same for Faried. And others.

We have some disagreements this year -- you've got Wolters a lot higher than I do. Our general "shot grouping" is pretty similar, but the order is a bit different. Have you tried incorporating physical measures in your system? Comparing players to the average drafted player at their position, I use standing reach, bench press, combined vertical and combined sprint and agility drill. All data from the combine. I wish I had better physical data, but it's the only objective source for physical measures available. I also have an "intangibles" adjuster in case of injuries, psychological issues, legal issues, etc.

I want to do a complete rebuild, but I'm limited by time and my own programming shortfalls.

Statman · Post by **Statman** » Wed Jun 19, 2013 9:28 pm

Barncore wrote:It's impossible for metrics to catch the Beasleys and the Westbrooks of the world, simply because it can't take into account psychological intangibles. It can make pretty good guesses about a player's physical impact, his skill level, and his feel for the game. But it's the psychological intangibles that make the difference......

Here are some players with reported "poor psychological intangibles" (who therefore may struggle to reach their potential, or never did): Michael Beasley, DeMarcus Cousins, Terrence Williams, pre-2012 JR Smith, Marreese Speights, Tyrus Thomas, Rashad McCants, Delonte West, Sean Williams, Julius Hodge, many more.

Definitely a correlation. Their growth curve largely depends on how they're wired internally. But there's no way to incorporate that into a metric without using subjectivity and guessing, since it's hard to get a feel for how a player is "wired" without meeting them in person. Even then it's hard to tell. All we have are draft interviews and general hearsay. If there was a way to measure this somehow THEN we would never miss a sleeper again.

It's a tricky thing. I was one who posted what seemed like hundreds of times before the 2007 draft that Kevin Durant should have been the #1 pick over Oden - because of my ratings, but also my worries about Oden's health for the future (broke a bone in his hand at a young age, I figured broken foot bones in those massive feet would come - but alas it was the knee. Plus, he already walked like an old man as a teenager). People argued Durant's lack of strength (not able to do one rep at 185 lbs on bench) was a sign of poor work ethic. I didn't agree, I just figured the kid was a serious gym rat (just played basketball) and never lifted - I saw that as "upside".

BUT, the very NEXT year - I posted many times that Beasley should have been #1 (and maybe Love #2) over Rose. Beasley was even more (slightly) "productive" statistically than Durant the year before, although some of Durant's outliers looked slightly better (steal & turnover rate specifically). I was pretty certain he and Love would be starters for many seasons and possibly/probably? multiple All Stars if they put the work in and avoid catastrophic injuries. Rose I doubted, in terms of future "stardom". Statistically, he looked fairly pedestrian, although my ratings rated him MUCH higher than one would expect for a PG who didn't break 15 pts, 5 ast, or 1.5 steals per game in college. I THOUGHT he had similar red flags to Beasley - mainly because it was common knowledge he cheated on the SAT to qualify (someone else took his test). BUT, statistically, he was a true freshman PG who rated well, which rarely happens except for future NBA starters/stars - which I better understand now. I just fell in love with the crazy positives the numbers showed for Beasley and Love. Oh, yeah, and Harden too. I can't wait to re-crunch the 2008 numbers, since my ratings have been tweaked some.

Anyway, I (and it appears others here) have a much better understanding of what the numbers mean than I did 5 years ago - but it hasn't changed THAT much. MANY NBA GMs are still notoriously bad about almost completely ignoring quality college production in favor of combine scores, a few NBA camp looks, and team workouts.

The very first thing I will do when I finish the historical NCAA ratings is probably redraft the last 15 seasons based solely on just my basic impact rating (in relation to class) compared to the actual NBA drafts. I will include EVERY college player - so if a guy rated high enough to be drafted by my numbers but wasn't drafted in real life and never played a minute of NBA ball - I'll still include him in my results with a big fat zero for NBA production. I'm guessing my redrafted results on my most basic college rating will still be an improvement over real life. Of course, we'll be able to improve that even more when we get into player statistical fingerprints and similarity scores - since a database of over 45,000 players who played quality minutes is no longer too small a sample size to work with.

But trolling over all these lines of data is SO tedious and slow. I can't wait until the groundwork is done - I wish I didn't have a "real" job.

BTW - great work other college to pro guys I've seen here - maybe by next season (when I'm caught up and others have best "tweaked" theirs) we'll be able to have a great thread on all our draft board comparisons based on our results.

Maybe I should keep my mouth shut until I can post real data - but you guys are inspiring me. Thanks!

VJL · Post by **VJL** » Wed Jun 19, 2013 10:52 pm

I have been impressed with the consistency across different approaches to this problem. Brocato's results posted elsewhere in this forum are largely in agreement as well. Another guy who has used my data but taken a pretty different approach arrived at very similar rankings. It makes me a lot more comfortable with the results to see the consistency.

I am curious where the models differ on Wolters. He has rated really high in pretty much every iteration of the model I have looked at.

kjb wrote:
Have you tried incorporating physical measures in your system?

I have played with combine stuff quite a bit. It isn't in the model above, because the sample goes all the way back into the 80s... and there is no information back that far. However it does use height as you can see above. The RAPM-based model that is also listed in the linked google-spreadsheet and combines with the WS model to get the "AVG" score uses combine measures. The only combine stuff that I have found really matters is replacing height with standing reach and adding no step vertical. I wager things like bench, agility, and sprint would give information as well, but the problem is that a lot of players pass on those drills so including those variables limits the sample.

VJL · Post by **VJL** » Wed Jun 19, 2013 11:06 pm

Statman wrote:
Anyway, I (and it appears others here) have a much better understanding of what the numbers mean than I did 5 years ago - but it hasn't changed THAT much. MANY NBA GMs are still notoriously bad about almost completely ignoring quality college production in favor of combine scores, a few NBA camp looks, and team workouts.

Even really basic stuff has clearly not filtered into draft decisions for most teams. One thing that has become extremely apparent through this process is the importance of steals. Not necessarily because of "steals" themselves, but more likely because they say something about general dominance, awareness, reaction time... who knows. All I know is they absolutely help predict success, and they are just as important for bigs as they are for guards. Meanwhile... a center with mediocre overall statistics and suffering from a stress fracture as a 19 year old is shooting up draft boards in spite of the fact that he has the lowest steal rate of any prospect in the past 30 years.

Check out the top and bottom 25 college steal rates for centers:

Three of the four best centers of the time period are on the left (Ewing is right behind at 1.5 steals per 40) and the best on the right is probably Olden Polynice. Obviously no team should be drafting purely on steal rate, but I can't imagine Len would be considered a top-10 pick in a world where the relationship between steals and pro success was understood by the majority of NBA teams.

Statman · Post by **Statman** » Thu Jun 20, 2013 5:03 am

VJL wrote:
Three of the four best centers of the time period are on the left (Ewing is right behind at 1.5 steals per 40) and the best on the right is probably Olden Polynice. Obviously no team should be drafting purely on steal rate, but I can't imagine Len would be considered a top-10 pick in a world where the relationship between steals and pro success was understood by the majority of NBA teams.

On some ESPN show I watched today there were rumors that Len would go #1 overall. I don't see it at all personally. At least Noel has all the athleticism outlier stats of someone that will stick barring further injuries.

I've used steal rate many times when trying to explain why I doubt a college player's future NBA success. I used that argument in a related thread about Shabazz Muhammad. Shabazz also has a very poor assist rate AND practically no block rate - appears doomed for failure to me.

Shabazz had been projected #1 or #2 for about 3 years for this draft - it now appears he may drop out of the lottery. I do think teams are a little better at this than they used to be. Well, unless Len goes top 3.

My ratings (AFTER pace, SoS, etc.) can be broken down into any box score minutiae - I'm sure in the near future I'll start saying things like "no player has ever played more than 2000 minutes in the NBA with a college HN steal rating as low as _____" or other such arguments - with examples of past draft busts that were unable to defy the argument.

VJL · Post by **VJL** » Thu Jun 20, 2013 3:05 pm

Statman wrote:
I've used steal rate many times when trying to explain why I doubt a college player's future NBA success. I used that argument in a related thread about Shabazz Muhammad. Shabazz also has a very poor assist rate AND practically no block rate - appears doomed for failure to me.

Bazz is sporting steal and assist rates both <= 1 per 40. I can find two non-bigs who made an NBA roster with <= 1 in each of those collegiate stats: Yakouba Diawara and Joe Crawford. The best argument I have found for him is Harrison Barnes' relative success. Barnes had a similar profile, but even then my numbers liked Barnes a lot more than they like Bazz. My theory is that the age thing (he was a freshman who was old for a sophomore) led to some misconceptions about how dominant Bazz truly was in high school, and it has been tough for scouts to go back and reevaluate him with his real age in mind.
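A per-40 screen like the one described here is easy to reproduce. The sketch below uses invented players and season totals; only the "at or below 1 per 40 in both steals and assists" rule comes from the post, and the column names are assumptions.

```r
# Hypothetical season totals; names and numbers are invented.
prospects <- data.frame(
  Player = c("wing_a", "wing_b", "guard_c"),
  MIN    = c(990, 1020, 1100),
  STL    = c(22, 48, 61),
  AST    = c(24, 70, 160)
)

# Scale a season total to a per-40-minute rate.
per_40 <- function(total, minutes) 40 * total / minutes

prospects$STL40 <- per_40(prospects$STL, prospects$MIN)
prospects$AST40 <- per_40(prospects$AST, prospects$MIN)

# Flag players at or below 1 per 40 in both steals and assists.
flagged <- subset(prospects, STL40 <= 1 & AST40 <= 1)
print(flagged)
```

Adding a position filter (for example, excluding bigs) would narrow this to the "non-bigs" comparison made above.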

As you said though... it looks like most teams are at least catching onto the problems with Shabazz.

Barncore · Post by **Barncore** » Thu Jun 20, 2013 7:36 pm

kjb wrote:Have you tried incorporating physical measures in your system? Comparing players to the average drafted player at their position, I use standing reach, bench press, combined vertical and combined sprint and agility drill. All data from the combine. I wish I had better physical data, but it's the only objective source for physical measures available. I also have an "intangibles" adjuster in case of injuries, psychological issues, legal issues, etc.

You'll want to be careful when taking into account combine results. Some of the tests they run have no correlation to NBA performance, some do. Have you seen Kevin Hetrik's work over at Hardwood Paroxysm? He did some amazing work on the predictive power of the draft combine. Check it out.

VJL · Post by **VJL** » Thu Jun 20, 2013 8:02 pm

I have read that. Good stuff.

I think the biggest problem with combine data for now is the crappy sample. Only started collecting it recently and many players sit out everything but the vertical, and there are always a few top-tier players who miss that as well.

That is why I only use reach and no step vert when I use combine stuff at all.
