Correlation of FT attempts with other boxscore stats.

Mike G
Posts: 6145
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Correlation of FT attempts with other boxscore stats.

Post by Mike G »

Some players are said to get lots of free throws, and others don't seem to. What exactly are the expectations, based on other things a player does on the court?
Trying to keep events as separate as possible, I've got the season's boxscore stats and converted them to 2pt and 3pt FG makes and misses; OReb and DReb.
From most positive to most negative correlation to players' FTA:

_TO     2FG    Stl    Min    DReb    3FG   2FGX   3fgX   OReb    Blk    Ast    PF
.830   .525   .065   .0533   .005  -.042  -.049  -.090  -.124  -.132  -.221  -.321
Now, I have no idea why TO should have the greatest association with FTA; nor why the tendency to foul would be most inversely correlated with being fouled.

Using these coefficients to create 'expected' FTA, the players who got more and fewer than expected in 2010-11:

err  FT hogs           tm   FTA  exp  FT%      err  FT dogs          tm   FTA  exp  FT%
383  howard,dwight     orl  916  533  .596    -177  marion,shawn     dal  164  341  .768
335  martin,kevin      hou  669  334  .888    -156  horford,al       atl  188  344  .798
207  durant,kevin      okl  675  468  .880    -151  young,thaddeus   phi  174  325  .707
199  griffin,blake     lac  695  496  .642    -147  prince,tayshaun  det  168  315  .702
192  williams,lou      phi  356  164  .823    -141  diaw,boris       cha  123  264  .683

189  love,kevin        min  499  310  .850    -131  rondo,rajon      bos  132  263  .568
186  harden,james      okl  343  157  .843    -113  jefferson,al     uta  289  402  .761
178  ginobili,manu     san  410  232  .871    -111  garnett,kevin    bos  217  328  .862
178  westbrook,russel  okl  631  453  .842    -109  lee,david        gsw  267  376  .787
178  wade,dwyane       mia  652  474  .758    -100  milicic,darko    min  115  215  .557

err  FT hogs           tm   FTA  exp  FT%      err  FT dogs          tm   FTA  exp  FT%
175  gallinari,danilo  nyk  290  115  .893     -98  scola,luis       hou  290  388  .738
158  paul,chris        nor  384  226  .878     -94  beasley,michael  min  291  385  .753
147  james,lebron      mia  663  516  .759     -93  weems,sonny      tor   94  187  .766
144  williams,deron    uta  354  210  .853     -93  holiday,jrue     phi  209  302  .823
143  billups,chauncey  den  287  144  .923     -93  parker,tony      san  303  396  .769

134  granger,danny     ind  466  332  .848     -88  hawes,spencer    phi   88  176  .534
129  pierce,paul       bos  449  320  .860     -88  collison,darren  ind  232  320  .871
128  stuckey,rodney    det  381  253  .866     -87  bledsoe,eric     lac  133  220  .744
125  maggette,corey    mil  325  200  .834     -86  terry,jason      dal  214  300  .850
119  harris,devin      njn  307  188  .840     -82  fields,landry    nyk  147  229  .769
                                               -81  boozer,carlos    chi  244  325  .701
Players with multiple teams have not had their part-seasons combined.
About 64% of players rated as getting fewer than expected, thanks to those FT hoggers on the left.
Kevin Martin gets twice as many as would be predicted from his other totals.
Several on the right get to the line about half as often as they should.
NickS
Posts: 4
Joined: Fri Apr 15, 2011 2:27 am

Re: Correlation of FT attempts with other boxscore stats.

Post by NickS »

Mike G wrote:Now, I have no idea why TO should have the greatest association with FTA; nor why the tendency to foul would be most inversely correlated with being fouled.
TOs seem obvious. The closer you get to the rim (either driving, or catching a pass in the paint) the more likely you are to get fouled and the more likely you are to lose the ball.

Foul rate could be an indication of "respect from the refs" or, more likely, it could be a sign that players who have a physical advantage over their counterparts are both more likely to be fouled and have less need to foul.
xkonk
Posts: 307
Joined: Fri Apr 15, 2011 12:37 am

Re: Correlation of FT attempts with other boxscore stats.

Post by xkonk »

Mike G wrote: Trying to keep events as separate as possible, I've got the season's boxscore stats and converted them to 2pt and 3pt FG makes and misses; OReb and DReb.
From most positive to most negative correlation to players' FTA:

_TO     2FG    Stl    Min    DReb    3FG   2FGX   3fgX   OReb    Blk    Ast    PF
.830   .525   .065   .0533   .005  -.042  -.049  -.090  -.124  -.132  -.221  -.321
Did you turn these stats into rates in any way? In a dataset I have for '09 and '10 there's a pretty strong correlation between FTA and either minutes played over the season or minutes per game, which you would expect; the more you play the more attempts you get. Yet you have a nearly 0 correlation.
Mike G
Posts: 6145
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Correlation of FT attempts with other boxscore stats.

Post by Mike G »

My bad, those are coefficients. Just playing 3000 minutes should get you (.0533 * 3000) 160 FTA.
Mike G
Posts: 6145
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Correlation of FT attempts with other boxscore stats.

Post by Mike G »

NickS wrote:
Mike G wrote:Now, I have no idea why TO should have the greatest association with FTA; nor why the tendency to foul would be most inversely correlated with being fouled.
TOs seem obvious. The closer you get to the rim (either driving, or catching a pass in the paint) the more likely you are to get fouled and the more likely you are to lose the ball.
And when you get rid of the ball, collecting assists, you're less likely to get fouled. Negative relation there.
NickS
Posts: 4
Joined: Fri Apr 15, 2011 2:27 am

Re: Correlation of FT attempts with other boxscore stats.

Post by NickS »

Mike G wrote:And when you get rid of the ball, collecting assists, you're less likely to get fouled. Negative relation there.
Actually I think the negative coefficient for assists might just be counteracting the positive coefficient for TOs.

If you think about TOs as falling into "scoring TOs" and "passing TOs" you would expect FTA to correlate strongly with the former, and not with the latter. So the negative coefficient for assists could, essentially, be estimating which portion of total TOs are "passing TOs".
mtamada
Posts: 163
Joined: Thu Apr 14, 2011 11:35 pm

Re: Correlation of FT attempts with other boxscore stats.

Post by mtamada »

Mike G wrote:Now, I have no idea why

[...]

the tendency to foul would be most inversely correlated with being fouled.
Post players tend to foul more than perimeter players do (point guards especially). When the perimeter defense breaks down, the post players are the ones who have to plug the gap, often fouling the driver or shooter in the process.

Post players of course also tend to draw more FTAs on offense, compared to perimeter players.


Also, it's not clear how you came up with the predicted FTA values; you don't want to use those correlations, which are bivariate statistics. You want to use a multivariate model, such as multivariate ordinary least squares.
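A minimal sketch of the multivariate least-squares fit suggested here, in plain Python via the normal equations. Everything in it is an assumption for illustration: the three predictors, the five made-up stat lines, the "true" coefficients used to generate the toy FTA values, and the choice to fit without an intercept. It is not the method used earlier in the thread.

```python
def solve(a, b):
    """Solve the square system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical season totals: (minutes, turnovers, personal fouls).
ROWS = [(3000, 250, 200), (2500, 100, 250), (1500, 120, 90),
        (800, 30, 110), (2000, 180, 150)]
# Toy FTA generated from known coefficients so the fit can be checked.
TRUE_COEFS = (0.05, 0.8, -0.3)
FTA = [sum(c * x for c, x in zip(TRUE_COEFS, row)) for row in ROWS]

estimated = ols(ROWS, FTA)
print([round(b, 4) for b in estimated])
```

Unlike a set of separate bivariate correlations, all coefficients here are estimated jointly, so each one is the effect of its stat with the others held fixed.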
xkonk
Posts: 307
Joined: Fri Apr 15, 2011 12:37 am

Re: Correlation of FT attempts with other boxscore stats.

Post by xkonk »

Mike G wrote:My bad, those are coefficients. Just playing 3000 minutes should get you (.0533 * 3000) 160 FTA.
In that case, you're going to have a hard time using the order to determine importance because of scaling effects. Minutes have the largest range of any stat (the maximum is over 3000), followed by rebounds, made and missed shots, free throws, etc., down to steals, which will rarely be over 200. Given an equal correlation, minutes will have a smaller regression coefficient because of its larger range. You should scale the stats somehow if you want to determine some order of importance. You'll also probably want to use all of the stats and look at a multivariate fit as mtamada suggested, although that can still be dicey to interpret.
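To make the scaling point concrete, here is a small sketch using the identity slope = r * sd(y) / sd(x) for a one-predictor regression. The five stat lines are made up, and steals are set to exactly minutes / 20 so that both predictors have the same correlation with FTA by construction; none of these numbers come from the thread's data.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation from population moments."""
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (pstdev(xs) * pstdev(ys))


def slope(xs, ys):
    """Simple-regression slope of ys on xs: r * sd(y) / sd(x)."""
    return pearson(xs, ys) * pstdev(ys) / pstdev(xs)


# Hypothetical totals for five players.
minutes = [500, 1200, 1800, 2400, 3000]
steals = [m / 20 for m in minutes]      # same shape, 20x smaller range
fta = [40, 150, 130, 300, 280]

r_min, r_stl = pearson(minutes, fta), pearson(steals, fta)
b_min, b_stl = slope(minutes, fta), slope(steals, fta)

# Standardized slope = raw slope * sd(x) / sd(y); units drop out.
beta_min = b_min * pstdev(minutes) / pstdev(fta)
beta_stl = b_stl * pstdev(steals) / pstdev(fta)

print(f"r:      minutes {r_min:.3f}  steals {r_stl:.3f}")
print(f"slope:  minutes {b_min:.4f}  steals {b_stl:.4f}")
print(f"beta:   minutes {beta_min:.3f}  steals {beta_stl:.3f}")
```

The raw slopes differ by the factor of 20 in range even though the correlations match, while the standardized slopes agree, which is why standardizing (or otherwise rescaling) is the usual first step before ranking coefficients.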

Regardless, I'm still a little confused as to your result. In '09 and '10, there's a very strong positive correlation between virtually all the stats because of the playing time confound - the more you play, the more stats you accumulate. The only negative correlations I find are between blocks and three-point makes/attempts and offensive rebounds and three-point makes/attempts, and those are close to 0. If you didn't turn the stats into per-36 rates or use some kind of cutoff or do something to account for playing time, how did you get a negative correlation?
Mike G
Posts: 6145
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Correlation of FT attempts with other boxscore stats.

Post by Mike G »

Every player has a unique stat profile, and so each case will be unique in its 'order of importance'.
I didn't attempt to un-confound the minutes effect. I don't know if just playing 3000 minutes and literally getting zeroes in other categories will get you 160 FTA. Probably not.

Every category is confounded with every other, and these are just averages. We may imagine an intuitive explanation, but it's never the whole story.
Shot blocking doesn't get you FTA -- the correlation is negative. I assume that's because most shot-blockers are not Dwight Howard; they tend to be non-scoring defensive specialists.

I would have expected OReb and FTA to be correlated. But again, it may be that OReb specialists dominate the field and get fewer than avg 2pt FG (per minute).

If we look at an average stat profile, scaled to 3000 minutes, he gets 302 FTA. If this hypothetical average player's FTA are ascribed to the other stats, their order is like this:

..          2FG    Min    TO   Stl   DReb  3FG   Blk  3fgX   OReb   2FGX    Ast    PF
avg player  382   3000   169    91   378    80    60   143    135    482    267    257
coeff.      .52   .053   .83  .065  .005  -.04  -.13  -.09   -.12   -.05   -.22   -.32
FTA from:   200    160   140     6     2    -3    -8   -13    -17    -24    -59    -82
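The "FTA from" row is just coefficient times stat, summed across categories. A short sketch that redoes the arithmetic, using the three-decimal coefficients from the first post and the average-player line above (the variable names are mine):

```python
# Coefficients from the first post and the 3000-minute average stat line.
COEF = {"2FG": .525, "Min": .0533, "TO": .830, "Stl": .065, "DReb": .005,
        "3FG": -.042, "Blk": -.132, "3fgX": -.090, "OReb": -.124,
        "2FGX": -.049, "Ast": -.221, "PF": -.321}
AVG = {"2FG": 382, "Min": 3000, "TO": 169, "Stl": 91, "DReb": 378,
       "3FG": 80, "Blk": 60, "3fgX": 143, "OReb": 135,
       "2FGX": 482, "Ast": 267, "PF": 257}

# Contribution of each category to expected FTA.
contrib = {stat: COEF[stat] * AVG[stat] for stat in COEF}
expected_fta = sum(contrib.values())

# Most positive to most negative contribution.
ranked = sorted(contrib, key=contrib.get, reverse=True)
for stat in ranked:
    print(f"{stat:>5} {contrib[stat]:8.1f}")
print(f"total {expected_fta:8.1f}")
```

The total lands close to the 302 FTA quoted for the average player, with small differences down to rounding in the posted coefficients.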
Committing a foul doesn't make it less likely that you will shoot FTs. It's just that big scorers try to stay on the floor, avoid fouling, etc. And there do seem to be 'fouling specialists' who do maybe one other thing well.

While it may not be instantly intuitive that TO and FTA should be connected, it's not hard to believe that players with very low TO also get few FTA.
xkonk
Posts: 307
Joined: Fri Apr 15, 2011 12:37 am

Re: Correlation of FT attempts with other boxscore stats.

Post by xkonk »

Mike G wrote:Every player has a unique stat profile, and so each case will be unique in its 'order of importance'.

Shot blocking doesn't get you FTA -- the correlation is negative.
I still don't understand how you found this result. I took the total regular season stats for 2011 (e.g., Dwight played 2935 minutes, had 916 free throw attempts, 186 blocks, 279 turnovers, etc) for all players; stats are not broken out for traded players (e.g. Jeff Green is in the data once, not twice). I ran the correlation between blocks and free throw attempts. It's positive, r = .469. If I adjust both to per-minute, I still get a positive correlation, r = .126. If I use a minute cutoff of 800 and per-minute blocks and free throw attempts, the correlation drops to about 0 but is still numerically positive. How did you get a negative correlation?
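For anyone wanting to repeat the three checks described here (season totals, per-minute rates, per-minute rates with a minutes cutoff), a sketch on made-up data follows. The five players and their rates are hypothetical and chosen only to show how totals and rates can disagree in sign; they do not reproduce the r values quoted above.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation from population moments."""
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (pstdev(xs) * pstdev(ys))


# Hypothetical players: (minutes, blocks per minute, FTA per minute).
players = [(300, 0.02, 0.12), (600, 0.03, 0.08), (1200, 0.02, 0.12),
           (2400, 0.03, 0.08), (3000, 0.02, 0.12)]

# 1) Season totals: playing time inflates both columns together.
blk_total = [m * b for m, b, _ in players]
fta_total = [m * f for m, _, f in players]
r_totals = pearson(blk_total, fta_total)

# 2) Per-minute rates, every player weighted equally.
r_rates = pearson([b for _, b, _ in players], [f for _, _, f in players])

# 3) Per-minute rates, keeping only players with at least 800 minutes.
regulars = [p for p in players if p[0] >= 800]
r_cutoff = pearson([b for _, b, _ in regulars], [f for _, _, f in regulars])

print(f"totals {r_totals:.3f}  rates {r_rates:.3f}  rates, 800+ min {r_cutoff:.3f}")
```

In this toy set the totals correlate positively purely through minutes, even though the per-minute rates move in opposite directions.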
gfarkas
Posts: 19
Joined: Thu May 05, 2011 2:04 pm

Re: Correlation of FT attempts with other boxscore stats.

Post by gfarkas »

Mike G wrote:Every player has a unique stat profile, and so each case will be unique in its 'order of importance'.
I didn't attempt to un-confound the minutes effect. I don't know if just playing 3000 minutes and literally getting zeroes in other categories will get you 160 FTA. Probably not.
What about seconds? If you convert minutes to seconds and re-run, how do the results change in your "order of importance" of coefficients?
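A quick sketch of what the seconds conversion would do, assuming the refit leaves the predictions unchanged and simply rescales the minutes coefficient by 1/60. It uses the .0533 and .005 coefficients from the first post; the variable names are mine.

```python
MIN_COEF = 0.0533    # FTA per minute, from the first post
DREB_COEF = 0.005    # FTA per defensive rebound, from the first post
minutes = 3000

# Same model with playing time expressed in seconds.
sec_coef = MIN_COEF / 60
seconds = minutes * 60

contrib_min = MIN_COEF * minutes   # FTA attributed to playing time
contrib_sec = sec_coef * seconds   # identical, up to float rounding

print(f"per-minute coef {MIN_COEF:.4f} -> per-second coef {sec_coef:.6f}")
print(f"FTA from playing time: {contrib_min:.1f} vs {contrib_sec:.1f}")
print("minutes outranks DReb by raw coefficient:", MIN_COEF > DREB_COEF)
print("seconds outranks DReb by raw coefficient:", sec_coef > DREB_COEF)
```

The raw coefficient drops below DReb's once time is in seconds, but the FTA attributed to playing time stays the same, so an ordering based on coefficient times stat does not depend on the unit.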
Mike G
Posts: 6145
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Correlation of FT attempts with other boxscore stats.

Post by Mike G »

xkonk wrote:... I ran the correlation between blocks and free throw attempts. It's positive, r = .469. If I adjust both to per-minute, I still get a positive correlation, r = .126. If I use a minute cutoff of 800 and per-minute blocks and free throw attempts, the correlation drops to about 0 but is still numerically positive. How did you get a negative correlation?
This suggests that you are weighting every player equally, regardless of minutes. Then, of course, you have to use a cutoff, and you will get different results.
What I did was manually adjust about 12 coefficients until they stopped improving the total error between actual and expected. In this process, r would be an intermediate step, and what I wanted were coefficients anyway.

Once I had minimized the total prediction error (absolute value), I got a zero sum across the league by a minor adjustment in the minutes coefficient. I was kind of hoping minutes would not be a factor, and FTA would be a result of a player's activity (production stats); but apparently just being on the floor creates some FTA.
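The procedure described above can be roughly automated. This is a sketch under stated assumptions, not the actual spreadsheet work: a simple coordinate search nudges one coefficient at a time and keeps any change that lowers the total absolute error, then the minutes coefficient is shifted so the signed errors sum to zero. The three predictors, the five stat lines, and the step schedule are all hypothetical.

```python
# Hypothetical season totals: (minutes, 2FG made, turnovers, actual FTA).
PLAYERS = [
    (3000, 600, 250, 650),
    (2500, 300, 120, 380),
    (1800, 250, 150, 330),
    (1000, 100, 40, 130),
    (600, 40, 30, 70),
]


def expected_fta(coefs, row):
    """Linear prediction: sum of coefficient * stat, no intercept."""
    return sum(c * x for c, x in zip(coefs, row[:3]))


def total_abs_error(coefs):
    return sum(abs(row[3] - expected_fta(coefs, row)) for row in PLAYERS)


def fit_by_nudging(start, step=0.1, min_step=1e-4, max_passes=20000):
    """Nudge each coefficient up or down; halve the step when nothing helps."""
    coefs = list(start)
    best = total_abs_error(coefs)
    for _ in range(max_passes):
        if step < min_step:
            break
        improved = False
        for i in range(len(coefs)):
            for delta in (step, -step):
                trial = coefs[:]
                trial[i] += delta
                err = total_abs_error(trial)
                if err < best:
                    coefs, best, improved = trial, err, True
        if not improved:
            step /= 2
    return coefs, best


def zero_sum_minutes(coefs):
    """Shift the minutes coefficient so actual minus expected sums to zero."""
    signed = sum(row[3] - expected_fta(coefs, row) for row in PLAYERS)
    adjusted = coefs[:]
    adjusted[0] += signed / sum(row[0] for row in PLAYERS)
    return adjusted


start = [0.0, 0.0, 0.0]
coefs, err = fit_by_nudging(start)
balanced = zero_sum_minutes(coefs)
print("fitted:", [round(c, 4) for c in coefs], "abs error:", round(err, 1))
print("zero-sum:", [round(c, 4) for c in balanced])
```

Minimizing absolute error this way is closer to a least-absolute-deviations fit than to ordinary least squares, and a coordinate search can stall short of the true minimum, so the output should be read as a starting point.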


Gabe, in the 'order of importance' post, minutes is one of the big 3 (along with 2FG and TO, for creating FTA) in the average player's stat line. This can of course be different for various players.
xkonk
Posts: 307
Joined: Fri Apr 15, 2011 12:37 am

Re: Correlation of FT attempts with other boxscore stats.

Post by xkonk »

Mike G wrote: What I did was manually adjust about 12 coefficients until they stopped improving the total error between actual and expected. In this process, r would be an intermediate step, and what I wanted were coefficients anyway.

Once I had minimized the total prediction error (absolute value), I got a zero sum across the league by a minor adjustment in the minutes coefficient. I was kind of hoping minutes would not be a factor, and FTA would be a result of a player's activity (production stats); but apparently just being on the floor creates some FTA.
Soooo.... you ran a kind of partial correlation, trying to parse out minutes? I still can't really picture what you did. But if you wanted to account for minutes, why not run the correlations/regressions on per-36 stats or something similar? Either way you'll still have the scaling issue mentioned earlier if you're interested in comparing the coefficients for importance.