skill curve numbers are wrong
-
- Posts: 262
- Joined: Sun Nov 23, 2014 6:18 pm
skill curve numbers are wrong
After hearing enough people complain about the Raptors' inefficient offensive rotation, I decided to look up an optimized team usage distribution.
As this is the quintessential usage vs. efficiency debate, I immediately went to skill curves to find out what's going on.
I used the slope from Eli's seminal 2008 study and Dean Oliver's weights for high-, mid- and low-usage players.
Every team I ran an optimized usage distribution for would easily have had the best NBA offense of all time, not to mention really quixotic usage numbers, such as NBA All-Stars taking 1 shot per game.
Based on this, it's pretty clear that the skill curve is incorrect.
Most of the APBR discussion on the usage-efficiency debate was lost with the old forum, but I'm not really sure why there hasn't been more focus on this issue, given its potential impact.
I think Scott Sereday was onto something with his 2010 study, where he included assisted and unassisted close shots, 2-point jumpers, 3-pointers and passing possessions as variables for the skill curve. Unfortunately, Scott did not post any usable data from his study, so it has no real application.
I really hope I'm missing something here or am simply wrong, so I welcome any thoughts you guys might have about skill curves, the numbers we have for them, or any future studies we could do.
Re: skill curve numbers are wrong
I agree that a generalized skill curve can be surpassed by looking at sub-types of usage and their different levels at different usage. I have previously said that lineup optimization that doesn't look at limitations on who gets early and middle shot clock usage vs late can be distorted. The issue of whether there is a real drop off in lineup efficiency at very high combined usage has recently been mentioned.
There is still a fair amount about the usage-efficiency issue available if you look at the short list of threads worth reading, the recovered threads thread, or Daniel's recovery thread.
Sadly the original yahoo group is no longer intact. Only fragments available thru web archives. Should have been looked after, given occasional posts to keep it deemed active. But alas.
Re: skill curve numbers are wrong
It might help to figure out if you've made a mistake or missed something if you posted an example calculation or two.
Re: skill curve numbers are wrong
xkonk wrote: It might help to figure out if you've made a mistake or missed something if you posted an example calculation or two.
Maybe someone can tell me if my methodology doesn't make sense.
I'm using Eli's and Oliver's numbers to determine what a player's starting-point offensive rating should be, based on their usage and current offensive rating.
I then determine, for each player on a roster, what offensive rating they would have after their first, second, third, fourth, etc. possession.
After this, I order all possible possessions by their offensive rating and select the top 100 possessions.
After counting how many of those 100 possessions each player has, I determine what their new offensive rating would be based on these usage numbers. I then add up each player's new offensive rating and take the average to determine what the team's offensive rating could be.
Re: skill curve numbers are wrong
I don't understand this.
Re: skill curve numbers are wrong
http://www.basketball-reference.com/blog/?p=5500
Do you see any flaws with that?
I ran something essentially the same as this, but instead of finding the optimal offensive rating for a five-man unit, I found the entire team's optimal offensive rating.
The resulting numbers should grab your attention.
Re: skill curve numbers are wrong
The approach and the write-up don't sound similar to me. I don't know what you mean by recalculating after the 1st, 2nd, 3rd, 4th possession and selecting the top 100 possessions, etc. But maybe others do. The most important thing is your own understanding of what you are doing, how and why. I can't help on this one, unless I just did by stating that I am still puzzled.
Re: skill curve numbers are wrong
The skill curve tells us what a player's efficiency will be after a change in their usage.
I'll give an example with these numbers: player x has a starting offensive rating of 130, player y 128 and player z 125. For simplicity's sake, let's assume player x is a low-usage player, so every one-step increase (or decrease) in usage lowers (or raises) his offensive rating by 2. Players y and z are high-usage players, and their offensive rating changes by 1 for every one-step increase or decrease in usage.
This shows what each player's offensive rating will be after each increase in usage.
player x - 130, 128, 126, 124, 122
player y - 128, 127, 126, 125, 124
player z - 125, 124, 123, 122, 121
If a team has 10 possessions, we want to figure out how those ten possessions should be divided up to give the highest team offensive rating. That means we would want these possessions:
130x, 128x, 128y, 127y, 126x, 126y, 125y, 125z, 124x, 124z
which consists of player x having 4 possessions, player y having 4 possessions and player z having 2 possessions.
Based on this, the adjusted offensive ratings would be:
player x - 124
player y - 125
player z - 124
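For anyone who wants to check the arithmetic, here is a minimal Python sketch of the selection step as described in this example. It is only one reading of the procedure, not a definitive implementation: the function names are made up for illustration, each player's curve is assumed to be a straight line, and ties at the same rating (for example the three possessions at 124) are assumed to go to the player with fewer possessions so far, which happens to reproduce the 4/4/2 split above. The team figure is the plain average of the adjusted ratings, as in the earlier description of the method.

```python
# Sketch of the greedy possession allocation from the x/y/z example.
# Assumptions (not from any published method): linear curves, and ties
# broken in favor of the player with fewer possessions so far, then by name.

def next_rating(players, counts, name):
    """Offensive rating of this player's next possession."""
    start, drop = players[name]
    return start - drop * counts[name]

def allocate(players, total_possessions):
    """Hand out possessions one at a time to the best next possession."""
    counts = dict.fromkeys(players, 0)
    for _ in range(total_possessions):
        best = min(
            players,
            key=lambda n: (-next_rating(players, counts, n), counts[n], n),
        )
        counts[best] += 1
    return counts

def adjusted_ratings(players, counts):
    """Rating at each player's final usage level (players with no possessions are skipped)."""
    return {
        name: start - drop * (counts[name] - 1)
        for name, (start, drop) in players.items()
        if counts[name] > 0
    }

# name: (starting offensive rating, drop per extra possession)
players = {"x": (130, 2), "y": (128, 1), "z": (125, 1)}

counts = allocate(players, 10)
adjusted = adjusted_ratings(players, counts)
team_rating = sum(adjusted.values()) / len(adjusted)  # simple average

print(counts)
print(adjusted)
print(round(team_rating, 2))
```

With a different tie-break rule the split between x, y and z at 124 can change, so the tie handling is worth stating explicitly when comparing results.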
-----
Do you understand what I'm talking about now?
Re: skill curve numbers are wrong
Alright, but if you say you are finding values that look too high, can you present an example calculation of exactly that?
Re: skill curve numbers are wrong
In an optimal lineup, these are the suggested usage ratings for each player on the Raptors.
(The data is from a few weeks ago, so it has probably changed slightly.)
Louis Williams 42.99421222
Jonas Valanciunas 33.21329046
Patrick Patterson 32.69024652
Kyle Lowry 29.81350482
James Johnson 24.84458735
Amir Johnson 14.90675241
DeMar DeRozan 10.46087889
Tyler Hansbrough 2.092175777
Greivis Vasquez 4.184351554
Terrence Ross 0
Re: skill curve numbers are wrong
Can anyone else run an optimized lineup analysis (as seen here: http://www.basketball-reference.com/blog/?p=5500)
and see if they also get crazy results?
It shouldn't take too long and I think the ramifications are meaningful.
Re: skill curve numbers are wrong
Whenever people do statistical analysis, there is a significant concern with false attribution. I'm not familiar with the particulars of the skill curve theory or derivation (perhaps I should read a copy of Dean Oliver's book) but it's hard to work out causal relationships from correlations. Causation could go the other way....
For every team I ran an optimized usage distribution for, they would have easily had the best NBA offense of all time, not to mention really quixotic usage numbers such as NBA all stars shooting 1 shot per game.
Based on this, it's pretty clear that the skill curve is incorrect.
...
For example, let's suppose for a moment that coaches tend to play their starters for more minutes when the team is behind. That would mean that the starters get more usage when they miss more, and that produces a 'skill curve' pattern.
Now, there are some clever ways to test for the coach hypothesis in particular, but a nice thing to start with - if the data can readily be found - is to look at box score splits and see whether early usage drives late performance, or early performance drives late usage in a more general way.
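A rough Python sketch of that kind of split comparison, under stated assumptions: the field names (`usg_1h`, `ortg_1h`, `usg_2h`, `ortg_2h`) are hypothetical, and the four game rows are made-up placeholders that only show the shape of the input, not real box score data. It compares the correlation of first-half usage with second-half efficiency against the correlation of first-half efficiency with second-half usage. A correlation comparison like this cannot settle the direction of causation by itself; it is just a first look.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def cross_lagged(games):
    """Return (early usage vs late efficiency, early efficiency vs late usage)."""
    usage_then_eff = pearson(
        [g["usg_1h"] for g in games], [g["ortg_2h"] for g in games]
    )
    eff_then_usage = pearson(
        [g["ortg_1h"] for g in games], [g["usg_2h"] for g in games]
    )
    return usage_then_eff, eff_then_usage

# Placeholder rows for one player; replace with real half-by-half splits.
games = [
    {"usg_1h": 28.0, "ortg_1h": 105.0, "usg_2h": 30.0, "ortg_2h": 101.0},
    {"usg_1h": 22.0, "ortg_1h": 118.0, "usg_2h": 26.0, "ortg_2h": 110.0},
    {"usg_1h": 31.0, "ortg_1h": 97.0, "usg_2h": 27.0, "ortg_2h": 104.0},
    {"usg_1h": 25.0, "ortg_1h": 112.0, "usg_2h": 29.0, "ortg_2h": 99.0},
]

usage_then_eff, eff_then_usage = cross_lagged(games)
print(usage_then_eff, eff_then_usage)
```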
Re: skill curve numbers are wrong
I just ran a quick & dirty least squares regression on play-by-play data from San Antonio Spurs home games, November 2004 through June 2014, modeling mentions of 'duncan' as a function of the Spurs' lead in points and the time left in the game in 12-second chunks (negative for overtime), and got the following.
So it does seem like Duncan is less likely to be on the floor while the Spurs lead. ... What does Duncan's usage curve look like?
Code:
Residuals:
     Min       1Q   Median       3Q      Max
-0.14240 -0.09211 -0.04595  0.04641  0.93884

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  7.970e-02  2.497e-03  31.924  < 2e-16 ***
lead        -5.886e-04  9.090e-05  -6.475  9.9e-11 ***
chunk        2.499e-04  2.028e-05  12.319  < 2e-16 ***
Re: skill curve numbers are wrong
What's the DV in that regression? Are those coefficients "big"?
Re: skill curve numbers are wrong
xkonk wrote: What's the DV in that regression? Are those coefficients "big"?
The average rate of mentions of Duncan in a line of the play by play. I'm not sure what 'big' means in this context.
Hmm... the data set is small enough to do direct regression, so here's a logit regression of the chance that 'duncan' shows up in a line of the play-by-play in a Spurs home game.
Code:
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-0.6831  -0.5096  -0.4842  -0.4201   2.4112

Coefficients:
                Estimate Std. Error z value Pr(>|z|)
(Intercept)   -2.219e+00  1.741e-02 -127.48   <2e-16 ***
lead          -2.873e-02  1.550e-03  -18.54   <2e-16 ***
timeleft       1.161e-04  9.894e-06   11.74   <2e-16 ***
lead:timeleft  1.585e-05  1.171e-06   13.53   <2e-16 ***
...
Hmm.. maybe I should run a regression of second half mentions in response to the first half lead...
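One way to get a feel for whether the coefficients are "big" is to plug them back in and compare predicted mention rates at a tied score and at a 10-point Spurs lead. The Python sketch below only re-evaluates the two fitted equations posted above; the function names are made up, and evaluating at zero time left is a choice made here so that the unit of `timeleft` (which the post does not state) and the interaction term do not matter.

```python
from math import exp

# Coefficients copied from the two regression outputs in this thread.
LINEAR = {"intercept": 7.970e-02, "lead": -5.886e-04, "chunk": 2.499e-04}
LOGIT = {
    "intercept": -2.219e+00,
    "lead": -2.873e-02,
    "timeleft": 1.161e-04,
    "lead:timeleft": 1.585e-05,
}

def linear_rate(lead, chunk):
    """Predicted mention rate from the least squares fit (chunk = 12-second chunks left)."""
    return LINEAR["intercept"] + LINEAR["lead"] * lead + LINEAR["chunk"] * chunk

def logit_prob(lead, timeleft):
    """Predicted mention probability from the logit fit (timeleft unit as used in that fit)."""
    z = (
        LOGIT["intercept"]
        + LOGIT["lead"] * lead
        + LOGIT["timeleft"] * timeleft
        + LOGIT["lead:timeleft"] * lead * timeleft
    )
    return 1.0 / (1.0 + exp(-z))

# Compare a tied game with a 10-point Spurs lead, both at zero time left.
for lead in (0, 10):
    print(lead, round(linear_rate(lead, 0), 4), round(logit_prob(lead, 0), 4))
```

Because the logit model has a positive lead:timeleft interaction, the effect of the lead depends on how much time is left, so the comparison at zero time left is only one slice of the fitted surface.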