Thanks to basketball-reference.com, we now have data on where players take their shots from. I decided to run a multiple regression in Excel (while bored at work) with each player’s fta/fga ratio as the dependent variable, regressed on where they take their shots from and their ft%. Turns out I found some interesting things.
Note: this was done on 4/16 (one day before the end of the season)
First, I sorted the players by total games played, from most to least. Unfortunately, the only way I could figure out to get this data was to type in all the information by hand. As I mentioned before, I was bored at work, so at this point I have only been able to enter the top 125 players in the league by total games played. Second, bb-ref breaks the shooting zones down into 5 areas (0-3 ft, 3-10 ft, 10-16 ft, 16 ft to the 3-point line, and 3-pointers). Third, knowing that some players get fouled more because of low ft% (aka Hack-a-Dwight), I included this as a dependent variable too.
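For anyone who wants to reproduce this outside Excel, here is a minimal Python sketch of the same kind of regression. Everything in it is an assumption for illustration: the numbers are made up (not bb-ref data), the function names are hypothetical, only two shot-zone shares are used, and "standard residual" is assumed to mean the residual divided by the sample standard deviation of all residuals. That assumption is at least consistent with the tables in this post, where every residual divided by its standard residual gives about 0.1045. Also note that if the zone columns are shares of FGA that sum to 1, one zone has to be left out when an intercept is included, or the coefficients are not uniquely determined.

```python
import statistics


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, row):
    """Predicted FTr for one player given fitted coefficients."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


# Made-up rows: (share of FGA at 0-3 ft, share of FGA from three, FT%)
rows = [
    (0.45, 0.02, 0.55), (0.20, 0.40, 0.85), (0.35, 0.10, 0.70),
    (0.15, 0.45, 0.80), (0.50, 0.00, 0.50), (0.25, 0.30, 0.78),
    (0.30, 0.25, 0.72), (0.40, 0.05, 0.62),
]
ftr = [0.62, 0.30, 0.38, 0.18, 0.70, 0.27, 0.33, 0.41]  # made-up FTA/FGA

beta = fit_ols(rows, ftr)
predicted = [predict(beta, r) for r in rows]
residuals = [obs - p for obs, p in zip(ftr, predicted)]  # positive = more FTA than predicted
sd = statistics.stdev(residuals)                          # assumed scaling for "standard residuals"
standard_residuals = [r / sd for r in residuals]
```

A positive residual here corresponds to the first table (more free throws than the shot profile predicts) and a negative one to the second.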
The top 10 players who, according to the model, should have a lower fta/fga ratio (actual ratio above predicted) were:
Observation	Predicted FTr	Residuals	Standard Residuals
dwight howard	0.419912404	0.374087596	3.578385236
james harden	0.275461085	0.276538915	2.6452702
ian mahinmi	0.50381553	0.26318447	2.517526462
steven adams	0.4778401	0.2501599	2.392938183
derrick williams	0.300941215	0.241058785	2.30588024
demar derozen	0.232901054	0.212098946	2.028861003
kevin durant	0.279068269	0.196931731	1.88377697
paul pierce	0.260908333	0.174091667	1.665297264
kevin love	0.271227469	0.169772531	1.623981994
deandre jordan	0.558135875	0.167864125	1.605726873
I would feel fine throwing Dwight out of the study, but I thought it was interesting how much of an outlier he was.
The next person on the list is Dwight’s teammate. I don’t like Harden, and this result only adds to my dislike of how often he gets to the line.
Side note: I looked at just the SGs in the league, and Harden had a standard residual of 3.7. Dwight was only a 2.6 among centers (I took out Rudy Gobert because he had an fta/fga ratio over 1, and that was just crazy).
If anyone is interested, these are the top 10 players who, according to the model, should have a higher fta/fga ratio (actual ratio below predicted):
Observation	Predicted FTr	Residuals	Standard Residuals
boris diaw	0.296043177	-0.136043177	-1.301339314
kentavious caldwell-pope	0.263942526	-0.136942526	-1.309942156
miles plumlee	0.37804879	-0.13804879	-1.320524275
serge ibaka	0.346964902	-0.142964902	-1.367550009
thaddeus young	0.322257996	-0.143257996	-1.370353634
kosta koufos	0.360397372	-0.157397372	-1.505605743
norris cole	0.299395855	-0.161395855	-1.543853773
carlos boozer	0.366456826	-0.167456826	-1.601830801
samuel dalembert	0.555521721	-0.190521721	-1.822461159
shawn marion	0.308026409	-0.215026409	-2.05686405
There are not any outliers in this group.
My questions are:
1) Am I actually looking at anything relevant?
2) If so, do you think I should look into this further and include every player who has played at least 42 games in the regular season?
3) Are there any variables I should add or take away?
Looking at this year's fta/fga
Re: Looking at this year's fta/fga
Try taking into account the drives per game (or total drives) stat from NBA.com. You can use that to get an idea of who the aggressive players are. Also try adding a height or weight variable to see if either correlates with fouls drawn.
Re: Looking at this year's fta/fga
If I'm reading your chart correctly, you have negative predicted ratios. Those obviously can't happen, so you're modeling the data incorrectly. It's an interesting project though. What's the effect of free throw %?
Re: Looking at this year's fta/fga
A somewhat risky variable to add -- but legitimate if you create the category in advance, without looking at the data -- would be a dummy (i.e. binary) variable to indicate superstardom.  Not so much coddled superstars, which I'm guessing is one of the things you're trying to detect with this analysis -- but a more basic, non-subjective measure of who are the superstars of the league who might in theory be getting generous calls from the refs.  Maybe all-pro status, or number of all-star games played, or a career WS or PER or RAPM measure.
Did you use FT% as an independent variable? You state that you used it as a dependent variable, but I'm guessing that you meant independent. This variable could potentially be improved by either adding something like points per game and an interaction term, or creating an index of "incentives to hack this guy", i.e. a combination of being a low FT% player while also being a guy that the team depends on for scoring. If Meyers Leonard shoots a low FT%, who cares; the other team is basically indifferent to whether they foul him or whether he shoots. But if it's Dwight Howard or Shaq with the ball, that's when the low FT% creates a bigger incentive to foul.
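To make the interaction idea concrete, here is a rough Python sketch of how such a term could be built before it goes into the regression. The players and numbers are invented, and `hack_features` is a hypothetical helper, not anything from the original spreadsheet. I use the miss rate (1 - FT%) times points per game as the interaction so that a bigger value means a bigger incentive to foul; as long as points per game is also in the model, this carries the same information as FT% times points per game.

```python
# Hypothetical players: (label, FT%, points per game). Numbers are invented.
players = [
    ("low FT%, high scorer", 0.50, 18.0),
    ("low FT%, low scorer", 0.50, 4.0),
    ("high FT%, high scorer", 0.85, 18.0),
]


def hack_features(ft_pct, ppg):
    """Return [FT%, PPG, interaction] as candidate regression inputs."""
    miss_rate = 1.0 - ft_pct               # higher means worse at the line
    return [ft_pct, ppg, miss_rate * ppg]  # last entry is the interaction term


features = {label: hack_features(ft, ppg) for label, ft, ppg in players}
```

The two low FT% players share the same FT% column, but only the high scorer gets a large interaction value, which is the distinction described above.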
A standard presentation of regression results should include the coefficients and either the standard errors or the t-statistics for each independent variable, or at least an indication of their significance level.
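As a sketch of what that presentation involves, the Python below computes the coefficient, standard error, and t-statistic for the simplest case of one predictor, using the textbook closed-form formulas. The data are invented and `simple_ols_summary` is a hypothetical name; a multiple regression like the one in the first post needs the matrix version of the same calculation, which Excel's regression output already reports.

```python
import math


def simple_ols_summary(x, y):
    """Coefficient, standard error, and t-statistic for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)  # residual variance, n - 2 degrees of freedom
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + mx * mx / sxx))
    return {
        "intercept": (intercept, se_intercept, intercept / se_intercept),
        "slope": (slope, se_slope, slope / se_slope),
    }


# Invented example: FT% as the only predictor of FTA/FGA.
ft_pct = [0.50, 0.55, 0.62, 0.70, 0.72, 0.78, 0.80, 0.85]
ftr = [0.70, 0.62, 0.41, 0.38, 0.33, 0.27, 0.18, 0.30]

for name, (coef, se, t) in simple_ols_summary(ft_pct, ftr).items():
    print(f"{name:>9}: coef={coef:.3f}  se={se:.3f}  t={t:.2f}")
```

A t-statistic well away from zero (roughly beyond 2 in absolute value for samples of this kind) is the usual first signal that a variable is doing something.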
				TheSpiceWeasel
Re: Looking at this year's fta/fga
bsped wrote: First, I sorted the players by the total number of games they played from most to least. Unfortunately, the only way I can figure out to get this data is by hand typing in all the information. As I mentioned before, I was bored at work so at this point I was only able to enter in the top 125 players in the league so far by total games played. Second, bb-ref breaks down the shooting zones into 5 areas (0-3 ft, 3-10 ft, 10-16 ft, 16 ft to the 3-point line, and 3-pointers). Third, knowing that some players get fouled more because of low ft% (aka Hack-a-Dwight) I included this as a dependent variable too.
When you write "by hand", do you actually mean literally typing by hand? Or copy-and-paste? Because B-R.com has this nifty little feature that makes it easy to copy-and-paste. So if you're re-typing by hand, I'm wondering how many typos and errors are getting introduced and muddling your data.