gfarkas wrote:
I'm just curious, but why would you recommend Ridge Regression right off the bat? I've always considered it a technique to be used when a problem is ill-posed and/or OLS doesn't yield a unique solution.
Well, I do see an ill-posed problem posted here: we have 3 independent variables and a lot more equations. And we can always find a lambda for which the RMSE of ridge regression is smaller than the RMSE of OLS.
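Here's a minimal sketch of that comparison on simulated data (the 3-variable setup, the noise level, and the lambda grid are all made up for illustration, not taken from the thread): just scan a range of lambdas and check held-out RMSE against OLS.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n, p = 200, 3                                   # 3 independent variables, many observations
X = rng.normal(size=(n, p))
beta = np.array([1.5, -2.0, 0.5])
y = X @ beta + rng.normal(scale=3.0, size=n)    # fairly noisy response

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ols_rmse = mean_squared_error(y_te, LinearRegression().fit(X_tr, y_tr).predict(X_te)) ** 0.5

# scan a grid of lambdas and keep the one with the smallest held-out RMSE
best_lam, best_rmse = min(
    ((lam, mean_squared_error(y_te, Ridge(alpha=lam).fit(X_tr, y_tr).predict(X_te)) ** 0.5)
     for lam in np.logspace(-3, 3, 25)),
    key=lambda t: t[1],
)
print(f"OLS   RMSE: {ols_rmse:.3f}")
print(f"Ridge RMSE: {best_rmse:.3f} at lambda = {best_lam:.3g}")
```

On any particular draw the winning lambda can be close to zero; the point is just that the comparison is cheap to run.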
The difference might be this: I -- and maybe GabeF too -- was envisioning 3 independent variables, a lot of observations, and just one equation. But now I see that one could fit a different equation for each individual player, or, more accurately, the same equation with different parameters. With several hundred players and thus several hundred parameters to estimate, yeah, there are going to be some outlier estimates, and we might want to rein them in via ridge regression or something similar.
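To make that per-player picture concrete, here's a rough sketch (simulated players, a hand-picked lambda, nothing from any actual data in this thread): the same one-predictor equation fit separately for each player, once by OLS and once by ridge, so you can see the shrinkage rein in the extreme slope estimates.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(1)
n_players, obs_per_player = 300, 8
true_slopes = rng.normal(loc=0.8, scale=0.3, size=n_players)

ols_slopes, ridge_slopes = [], []
for j in range(n_players):
    x = rng.normal(size=(obs_per_player, 1))
    y = true_slopes[j] * x[:, 0] + rng.normal(scale=2.0, size=obs_per_player)
    ols_slopes.append(LinearRegression().fit(x, y).coef_[0])
    ridge_slopes.append(Ridge(alpha=5.0).fit(x, y).coef_[0])   # lambda picked by hand

# with only a handful of observations per player, the unregularized slopes
# scatter far more widely than the true slopes; ridge pulls the extremes back in
print("std of true slopes: ", round(float(np.std(true_slopes)), 3))
print("std of OLS slopes:  ", round(float(np.std(ols_slopes)), 3))
print("std of ridge slopes:", round(float(np.std(ridge_slopes)), 3))
```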
There are also fixed effects models which combine all the players together (as in my original interpretation) but which allow individual players to have one or more parameters of their own. Ridge regression might not be used in such models, but it is still true that with so many parameters Type I errors are almost guaranteed to occur, and there are a variety of corrective measures one might take.
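For the fixed-effects version, a hedged sketch using statsmodels (again simulated data; the formula, player labels, and the FDR correction are just one way to do it): pool everyone into a single regression with a dummy per player, then correct the per-player p-values for multiple comparisons, since with dozens or hundreds of parameters some nominal "significance" is expected by chance alone.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(2)
n_players, obs_per_player = 50, 20
rows = []
for j in range(n_players):
    x = rng.normal(size=obs_per_player)
    # no true player effect here, so any "significant" player dummy is a Type I error
    y = 1.0 * x + rng.normal(size=obs_per_player)
    rows.append(pd.DataFrame({"player": j, "x": x, "y": y}))
df = pd.concat(rows, ignore_index=True)

# one regression for everyone, with a fixed effect (dummy) per player
fit = smf.ols("y ~ x + C(player)", data=df).fit()

# look only at the player-dummy p-values and correct them for multiple comparisons
player_p = fit.pvalues.filter(like="C(player)")
reject, _, _, _ = multipletests(player_p, alpha=0.05, method="fdr_bh")
print("nominally significant player effects:", int((player_p < 0.05).sum()))
print("significant after FDR correction:    ", int(reject.sum()))
```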