How do I measure noise?

TeamEd
Posts: 21
Joined: Tue Jan 27, 2015 2:33 pm
Location: Toronto

How do I measure noise?

Post by TeamEd »

So, I want to revisit the question of measuring noise in a dataset. This is related to my project looking at productivity based on in-game splits, but I figure this is worth asking in a new thread.

I have formulas that tell me how much players increase or decrease their rate of recording box score stats when trailing. My set goes back to 1996-97. This data appears to be OK for shot attempts, but quite noisy for assists, rebounds and blocks. I want to measure this noise to see if the numbers I'm finding are useful.

To do this, I think I need to run a linear regression of season X numbers on season X-1 numbers to get a correlation coefficient that tells me whether the measure predicts itself. I don't really know how to do this, but I think I could figure it out. The problem as I see it is that I need to do this for every player in the dataset individually, then weight the results by minutes played or something to get an overall correlation coefficient for the data... or something. I expect I'll also want to see if a three-year average is predictive where season-by-season numbers aren't.
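
A minimal sketch of a pooled version of this idea, assuming that instead of running one regression per player, each player's pair of consecutive seasons is treated as one data point and weighted by minutes. The function name, the sample numbers, and the choice of minutes as the weight are illustrative assumptions, not values from the actual dataset.

```python
import math

def weighted_correlation(x, y, w):
    """Pearson correlation of x and y, with a non-negative weight per point."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers: one entry per player who has two consecutive seasons.
prev_season = [0.10, -0.05, 0.20, 0.00, 0.15]  # change in rate when trailing, season X-1
next_season = [0.08, -0.02, 0.15, 0.03, 0.05]  # same players, season X
minutes = [2400, 1800, 3000, 600, 1200]        # assumed weight for each pair

r = weighted_correlation(prev_season, next_season, minutes)
print(f"minutes-weighted year-to-year correlation: {r:.3f}")
```

With equal weights this reduces to the ordinary correlation coefficient, so the weighting only changes how much low-minute players count toward the overall number.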

Anyway, I've googled and haven't found any tutorials, although I'm also not sure exactly what I'm looking for.

So, ignoring the above if it doesn't make sense: I have a set of new stats. I think they might be noisy. How do I measure this noise?

/ This exceeds my J-School education.
@EdTubb edwardtubb at gmail
NateTG
Posts: 72
Joined: Thu Dec 13, 2012 9:11 pm

Re: How do I measure noise?

Post by NateTG »

Are there any tools for dealing with data or statistics that you're familiar with already?

You can find lots of stuff for doing regressions using R (which is a tool used for doing statistics).

If you run a regression you can check the coefficient of determination, R² (http://en.wikipedia.org/wiki/Coefficien ... ermination), or something similar to see how good your predictions are.
TeamEd wrote:This data appears to be OK for shot attempts, but quite noisy for assists, rebounds and blocks.
Can you explain this a little more? Is this just because there are fewer assists, blocks and rebounds than shot attempts, or something else?
TeamEd
Posts: 21
Joined: Tue Jan 27, 2015 2:33 pm
Location: Toronto

Re: How do I measure noise?

Post by TeamEd »

I don't really have a lot of experience with stats tools. I'll have a look at R.

What I mean is that the year-over-year numbers I'm getting for the change in shooting rate when behind appear to be fairly consistent. The year-over-year changes in assist rate, block rate, etc. appear to be closer to random. I expect it's a sample size thing.
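
A rough sketch of why sample size could produce this pattern, assuming (as a simplification) that each possession is an independent chance to record the stat with a fixed probability. Under that assumption the standard error of an observed rate is sqrt(p(1-p)/n), so relative to the rate itself the error grows as the event gets rarer. The rates and the possession count below are made up for illustration.

```python
import math

def relative_standard_error(p, n):
    """Standard error of an observed rate divided by the rate itself,
    assuming n independent chances that each succeed with probability p."""
    return math.sqrt((1.0 - p) / (p * n))

# Made-up per-possession rates, purely for illustration.
examples = {"shot attempt": 0.20, "assist": 0.05, "block": 0.01}
possessions_while_trailing = 2000  # hypothetical season total for one player

for stat, p in examples.items():
    rse = relative_standard_error(p, possessions_while_trailing)
    print(f"{stat}: relative standard error about {rse:.0%}")
```

Since the relative error falls only with the square root of n, a rare stat needs many times more possessions to be measured as precisely as a common one.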
Chris Hoffman
Posts: 9
Joined: Mon Apr 06, 2015 1:58 pm

Re: How do I measure noise?

Post by Chris Hoffman »

Call me silly, but the only stats that relate to your study are assists. A rebound may be either offensive or defensive, so what percentage of each team's rebounds are offensive and what percentage are defensive? That may help you get a handle on your rebounds.

For the purpose of answering the question "do the rates of shots increase when trailing", blocks are irrelevant, and whether or not the shots go in is not relevant either. It sounds like you are asking whether a team pulls the trigger more when it is behind. So to simplify your study, look only at shots taken. Get a handle on that one stat, then add the complexity of assists.

I would argue that even inbounding the ball could be an assist, because the person who wants the ball will not inbound it. Also, the number of assists per possession could be a number you want to look at if the data exists. The question is whether you have enough data to account for the complexity of assists; if not, ignore them for now. Just my opinions.

Is an assist only counted when the shot goes in? If so, then because whether or not the shot goes in is not relevant to your question, assists are a stat you do not need to look at. I hope that makes sense; check my logic here. I don't like assists because they only account for the last pass a player receives before the field goal. If it takes a team three passes around the arc in rapid succession to find someone open for a three-pointer, who gets the assist if the shot goes in? The last person to pass the ball. But you could argue that the entire dynamic and synergy of the three passes is what created the opening for the three-point shot, and thus all three passes should be credited as assists to their respective players.

-happy Easter
-chris
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: How do I measure noise?

Post by DSMok1 »

What you need, Ed, is to do year-to-year correlations and then back out how much noise there is vs. how much signal. Here's one method: http://blog.philbirnbaum.com/2011/08/ta ... -kind.html
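
One simple way to back out signal and noise from a year-to-year correlation, sketched under stated assumptions and not necessarily the method in the linked post: treat each observed season as true level plus independent noise, with the true level unchanged between the two seasons and the noise variance the same in both. Under those assumptions the year-to-year correlation r equals the share of observed variance that is signal. The function names and numbers are illustrative.

```python
import statistics

def split_variance(values, r):
    """Split the observed variance of one season's values into signal and
    noise parts, given the year-to-year correlation r (assumed model:
    observed = true level + independent noise)."""
    observed = statistics.pvariance(values)
    signal = r * observed
    noise = (1.0 - r) * observed
    return signal, noise

def reliability_of_average(r, k):
    """Spearman-Brown formula: expected reliability of a k-season average,
    given the single-season year-to-year correlation r."""
    return k * r / (1.0 + (k - 1) * r)

# Made-up single-season values and a made-up correlation.
season_values = [0.10, -0.05, 0.20, 0.00, 0.15]
signal_var, noise_var = split_variance(season_values, r=0.3)
print(f"signal variance {signal_var:.5f}, noise variance {noise_var:.5f}")
print(f"reliability of a 3-season average: {reliability_of_average(0.3, 3):.4f}")
```

The second function speaks to the three-year-average question from the first post: averaging several seasons shrinks the noise part while leaving the signal part alone, so a weak single-season correlation can still mean a usable multi-season number.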

Alternatively, you could do a simple regression to predict year 2 from year 1, and see what the slope and fit of the curves look like. That will give a good intuitive grasp.
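
A small sketch of that regression, written as ordinary least squares by hand so it needs no extra libraries. The function name and the sample numbers are made up for illustration. A slope near 1 with a high R² would suggest the stat largely repeats from year to year, while a slope near 0 would suggest year 1 says little about year 2.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.
    Returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # squared correlation of x and y
    return slope, intercept, r_squared

# Made-up numbers: each position is one player.
year_1 = [0.10, -0.05, 0.20, 0.00, 0.15]
year_2 = [0.08, -0.02, 0.15, 0.03, 0.05]

slope, intercept, r_squared = fit_line(year_1, year_2)
print(f"slope {slope:.3f}, intercept {intercept:.3f}, R^2 {r_squared:.3f}")
```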

I wouldn't worry too much about any changes in true signal between year 1 and year 2 at this point.
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
TeamEd
Posts: 21
Joined: Tue Jan 27, 2015 2:33 pm
Location: Toronto

Re: How do I measure noise?

Post by TeamEd »

Ok. This seems do-able where R is proving hard to get into. I was thinking something along these lines, but I was getting stuck (1) figuring out how to set up my tables and (2) working out how to weight each player's contribution to an overall measurement of noise. I'll get on this. Thanks for the advice.
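
For the table setup, one possible layout is a single row per player per pair of consecutive seasons. The sketch below assumes the source data is one row per player-season with a season start year, the stat value, and minutes; the field names, the hypothetical players, and the use of the smaller of the two minute totals as the weight are all assumptions for illustration.

```python
def pair_consecutive_seasons(rows):
    """Turn one-row-per-player-season data into one row per
    (player, season, season + 1) pair, skipping players with no next season."""
    by_key = {(row["player"], row["season"]): row for row in rows}
    pairs = []
    for (player, season), row in sorted(by_key.items()):
        following = by_key.get((player, season + 1))
        if following is None:
            continue  # no consecutive season to compare against
        pairs.append({
            "player": player,
            "season": season,
            "value_prev": row["value"],
            "value_next": following["value"],
            # Assumed weight: the smaller of the two seasons' minutes.
            "weight": min(row["minutes"], following["minutes"]),
        })
    return pairs

# Hypothetical rows; "season" is the start year (1996 means 1996-97).
rows = [
    {"player": "Player A", "season": 1996, "value": 0.10, "minutes": 2400},
    {"player": "Player A", "season": 1997, "value": 0.08, "minutes": 2000},
    {"player": "Player A", "season": 1999, "value": 0.12, "minutes": 1500},
    {"player": "Player B", "season": 1997, "value": -0.05, "minutes": 900},
]

for pair in pair_consecutive_seasons(rows):
    print(pair)
```

Each output row then supplies one x value, one y value, and one weight, which is the shape a year-to-year correlation or regression needs.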

On the note about looking at blocks, rebounds, steals, etc.: I think there might be some interesting stuff there on a per-player basis, but there are obviously going to be more issues in those numbers than in the larger samples for shots.

And, yeah. Assists are an imperfect measure of passing. Ideally I'd want to know how raw passing rate increases or decreases, but I don't have that data. It is what it is.