Re: Predicting RAPM with stats.nba
Posted: Thu Jun 01, 2023 8:41 pm
v-zero wrote: ↑Thu Jun 01, 2023 9:51 am Yeah managing to get that model from the box score alone has been quite pleasing. It's not reinventing the wheel, but combined with my streaming variant of a plus minus model it has pretty strong predictive power (though I have a model which uses an extended play-by-play box score which is what I usually use, this was mostly for fun and to see what adding higher-dimensionality could do for a box score model).
Agreed, super cool to see your work where "simpler" models get great performance. Sometimes more complex isn't better, or the gain isn't worth the added complexity.

The streaming variant definitely seems like a natural next step and that totally makes sense.
v-zero wrote: ↑Thu Jun 01, 2023 9:51 am Interesting to see the dataset you're using, I can now understand how you managed to get Caruso to rate so highly. He's an excellent player, but box score metrics really struggle to home in on that fact, but with those additional stats in there I can see what your model has managed to latch onto.
Yeah, I think Caruso is a pretty good player and my model is probably overrating him quite a bit. I'm guessing he's a leader in some of the more exotic stats, like those in the hustle category, and that's probably why he's so high here.
I think for now it's mainly a personal curiosity, and I wanted to make my hobby a little more serious. I've worked in other areas of basketball but not as much on the front office side. My current project goal is to build more complex player value models, starting with just the average or per-possession box score and then moving into time series (this sounds like what you've done with play-by-play). I think a time-series ML model is maybe (or maybe not) a more natural way to deal with games over time than an exponential moving average, but the difficulty also goes way up. Not sure, hoping to find out!
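For anyone following along, here is a rough sketch of the exponential moving average baseline I have in mind. This is not v-zero's streaming model or my actual pipeline; the function name `ema_rating`, the `alpha` smoothing weight, and the per-game numbers are all made up for illustration.

```python
def ema_rating(game_values, alpha=0.1):
    """Return the running exponential moving average after each game.

    game_values: per-game numbers for one player, oldest game first
                 (hypothetical input, e.g. a per-game impact estimate).
    alpha:       weight on the newest game; an assumed tuning knob.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    estimate = None
    history = []
    for value in game_values:
        if estimate is None:
            # Seed the average with the first observed game.
            estimate = float(value)
        else:
            # Blend the newest game with the previous estimate.
            estimate = alpha * value + (1 - alpha) * estimate
        history.append(estimate)
    return history


games = [2.0, 4.0, 0.0, 6.0]  # made-up per-game values
print(ema_rating(games, alpha=0.5))  # [2.0, 3.0, 1.5, 3.75]
```

A larger `alpha` reacts faster to recent games but is noisier. A time-series model would in effect learn that weighting from the data instead of fixing it up front, which is where the extra difficulty comes in.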