Request a short overview of the current state of analytics

Mike G
Posts: 6144
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Request a short overview of the current state of analyti

Post by Mike G »

If you're a coach in the middle of a game, you have to be able to grasp "how good" a player is on that night, with a very small sample size.

In the course of a season, you have to allot minutes based on what some methods would call a too-small sample size: If a full season is too small, a partial season is even more useless.

If you need a bunch of seasons to rate a player, he's not the same player he was at the beginning of the analysis.

One stat which is consistent at all levels, and which seems to best correlate with the minutes a player gets, is eWins -- short for equivalent wins added to an average team. Check out the recent playoff threads and others here.
It's not "publicly available", except that if you want spreadsheets I can send them to you.
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Request a short overview of the current state of analyti

Post by Crow »

When was the last time the eWins formula changed? Any plans to tweak it to improve its explanatory power for team results?
Voyaging
Posts: 19
Joined: Thu Aug 14, 2014 8:47 pm

Re: Request a short overview of the current state of analyti

Post by Voyaging »

Mike G wrote:If you're a coach in the middle of a game, you have to be able to grasp "how good" a player is on that night, with a very small sample size.

In the course of a season, you have to allot minutes based on what some methods would call a too-small sample size: If a full season is too small, a partial season is even more useless.

If you need a bunch of seasons to rate a player, he's not the same player he was at the beginning of the analysis.

One stat which is consistent at all levels, and which seems to best correlate with the minutes a player gets, is eWins -- short for equivalent wins added to an average team. Check out the recent playoff threads and others here.
It's not "publicly available", except that if you want spreadsheets I can send them to you.
Does correlating with actual minutes mean it's a good metric? I don't want to underestimate NBA coaches but I'm sure there could be massive improvements to lineup and minute management. I'd love to see those spreadsheets, though, if you wouldn't mind. PM them or whatever is easiest for you.
Voyaging
Posts: 19
Joined: Thu Aug 14, 2014 8:47 pm

Re: Request a short overview of the current state of analyti

Post by Voyaging »

permaximum wrote:The best metric for evaluating player value should be the one whose out-of-sample prediction accuracy at a 100% roster turnover rate beats the others. Among public metrics, that's currently PER (which is not even empirical). However, player tracking metrics have not been tested at large turnover rates since there is less data, and one of them recently won the prediction test.

In my work, I found that at the NBA's current yearly roster turnover rate, RPM > BPM >= NPI-RAPM > WS > MPG > PER >= WP.

This means most of those metrics are better at capturing team value, but they are not good at distributing it across players. That's why they are somewhat good at predicting the next year at the team level, but as years pass they get worse and worse because of the eventual high roster turnover rate.

So, I would check PER first to get a rough idea of how good a player is, and then look at RPM for his defensive impact, since PER is not really good at capturing that side of the game.
So in other words, PER is better at judging a player's independent value, and would be more likely to remain accurate if the player joins a new team, etc.? Did I understand that right?

A combination of PER and DRPM would be a reasonable start then?

Thanks again to everyone.
Mike G
Posts: 6144
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Request a short overview of the current state of analyti

Post by Mike G »

Voyaging wrote:Does correlating with actual minutes mean it's a good metric? I don't want to underestimate NBA coaches but I'm sure there could be massive improvements to lineup and minute management.
I figure it's not a bad sign.
When two elite coaches meet in a playoff series, and you get roughly zero correlation between minutes and some metric, you have to wonder about that metric.
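
As a rough illustration of that minutes-vs-metric check, here is a minimal sketch of a Pearson correlation between minutes per game and a player rating. The numbers are made up for illustration (they are not eWins or any real playoff data), and `pearson_r` is just a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical six-man rotation: minutes per game and some per-player rating.
minutes = [38.0, 35.5, 30.0, 22.0, 14.5, 8.0]
rating = [9.1, 7.4, 6.0, 3.2, 2.5, 0.9]

r = pearson_r(minutes, rating)
print(f"minutes vs rating: r = {r:.2f}")
```

A value near zero for a metric, in a series where both coaches are presumably playing their best players, would be the warning sign described above.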

Meanwhile, if a player metric does not sum to team wins or +/-, it can hardly be expected to predict future seasons for any team, whether a player stays put or moves on.

Team PER -- player PER multiplied by minutes, summed for a team, and divided by total minutes -- does not reflect a team's strength. A better defensive team will have relatively low PER vs an offense-only team.
This could be easily fixed with a team defensive component, but it's not.
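
The Team PER calculation described here is just a minute-weighted average, which can be sketched as below. The roster values are invented for illustration, and `team_per` is a hypothetical helper name, not anything from Basketball-Reference.

```python
def team_per(players):
    """Minute-weighted average PER for a team.

    `players` is an iterable of (per, minutes) pairs.
    """
    players = list(players)
    total_minutes = sum(minutes for _, minutes in players)
    if total_minutes <= 0:
        raise ValueError("total minutes must be positive")
    # Player PER multiplied by minutes, summed, divided by total minutes.
    return sum(per * minutes for per, minutes in players) / total_minutes

# Hypothetical roster: (PER, season minutes).
roster = [(25.0, 2800), (18.0, 2500), (15.0, 2400), (12.0, 2000), (10.0, 1500)]
print(f"Team PER: {team_per(roster):.2f}")
```

Nothing in this average knows about team defense, which is why a strong defensive team can come out looking worse than an offense-only one.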

There are also no adjustments for whether a player starts or subs, plays long minutes or short, creates his own shot or relies on passers, tends to assist many 3-pointers, rebounds for a strong or weak rebounding team, or has anomalously high assist or block rates at home.
These adjustments are lacking in many or perhaps all boxscore metrics, an exception being eWins.

With a change of team, teammates, or head coach, any of these factors may change dramatically, and the player's raw stats will do likewise. And yet he may be the same player, doing mostly the same things. With appropriate adjustments, his stats are seen to be more consistent, as is his total rating.

If you don't want your email on your profile, PM me where you'd like to receive an eWins file or three: regular season or playoffs for 2015, last 15 yrs, etc.
permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: Request a short overview of the current state of analyti

Post by permaximum »

Voyaging wrote:So in other words, PER is better at judging a player's independent value, and would be more likely to remain accurate if the player joins a new team, etc.? Did I understand that right?

A combination of PER and DRPM would be a reasonable start then?

Thanks again to everyone.
That's right. But bear in mind that RPM includes players' contributions in previous years. Also, I haven't seen anybody test BPM's predictive power at a 100% roster turnover rate. Theoretically it should do worse than RPM, and thus worse than PER too. Also, I can't speak yet for player tracking metrics, or for private metrics such as Mike G's eWins. I can only say they don't do well in yearly prediction tests. But don't forget that those are not prediction races at the "player" level. Perhaps eWins is the best for what you seek. There just isn't any public evidence for it.
permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: Request a short overview of the current state of analyti

Post by permaximum »

I should add, given the lack of accuracy in predicting the next year's team wins by using the best player metric (which is not even empirical), all-in-one player metrics' current state is not good enough. I have a suspicion MPG (minutes per game) may even beat PER and thus RPM in a retrodiction test when it comes to predicting at 100% roster turnover rate.
Statman
Posts: 548
Joined: Fri Apr 15, 2011 5:29 pm
Location: Arlington, Texas

Re: Request a short overview of the current state of analyti

Post by Statman »

Mike G wrote:If you're a coach in the middle of a game, you have to be able to grasp "how good" a player is on that night, with a very small sample size.

In the course of a season, you have to allot minutes based on what some methods would call a too-small sample size: If a full season is too small, a partial season is even more useless.

If you need a bunch of seasons to rate a player, he's not the same player he was at the beginning of the analysis.


One stat which is consistent at all levels, and which seems to best correlate with the minutes a player gets, is eWins -- short for equivalent wins added to an average team. Check out the recent playoff threads and others here.
It's not "publicly available", except that if you want spreadsheets I can send them to you.
I'm completely with Mike on the first part.

The second part, while Mike may be right (not gonna hate) - my work is my WAR (or should I say WAR/minute) & my HnI. It's at my site, currently in PDFs. I have all the regular season results from '80 to now in a spreadsheet if anyone is interested. I sent all the info to Neil Paine last season when he was going to run retrodictions on all the "known" box score metrics - he has yet to post any results. I am very confident on how well it would stand up to any box-score based metric - especially the "older" seasons (80s & 90s) when some seem to veer off a bit (my subjective opinion). Like others who do this - if I wasn't happy with the results & confident it would stand up to scrutiny, I wouldn't be doing it.

I have thought about running my own method of testing the "known" metrics (I'll make sure Mike sends me his if I do since it's not on b-r) - but I don't like having a horse in the race, if my work comes out on or very near top, I could imagine many jumping in and saying I should test differently. I just spent the last month on all the draft model retrodictions & such - I'm focusing heavily on player career projections for now (blending in D league, & summer league & exhibition as well), since that's what some others I've consulted with recently seem the most interested in.

If I ever test - I will have the results in a massive spreadsheet so others can scrutinize & possibly run their own test/correlations - I won't just come back with "here are the correlations - trust me". I don't believe anyone here is unscrupulous (make up results) - but errors can abound, so having all the data for others to check just seems prudent.

If a possible future employer asked me to do it - I'd drop everything & test tomorrow. I'd also be crossing my fingers like I did with the draft retrodictions - I knew in theory my stuff in general would hold up, but I was nervous that the reason I hadn't seen others do that many past retrodictions was that it had been tested by others before & their draft models couldn't beat draft position over a big sample size. Since mine is solely production based (no other factors to allow results to mimic actual draft position better), I was afraid maybe 538.com & Neil Paine were right (their draft model came out about 2 days before my retrodictions were done) & the biggest factor in a draft model should be scouting (i.e. "expert" draft mocks) - & my work has absolutely zero of that. It worked out ok. More than ok.
Neil Paine
Posts: 73
Joined: Mon Apr 18, 2011 1:18 am
Location: Philadelphia

Re: Request a short overview of the current state of analyti

Post by Neil Paine »

permaximum wrote:The best metric for evaluating player value should be the one whose out-of-sample prediction accuracy at a 100% roster turnover rate beats the others. Among public metrics, that's currently PER (which is not even empirical). However, player tracking metrics have not been tested at large turnover rates since there is less data, and one of them recently won the prediction test.

In my work, I found that at the NBA's current yearly roster turnover rate, RPM > BPM >= NPI-RAPM > WS > MPG > PER >= WP.

This means most of those metrics are better at capturing team value, but they are not good at distributing it across players. That's why they are somewhat good at predicting the next year at the team level, but as years pass they get worse and worse because of the eventual high roster turnover rate.

So, I would check PER first to get a rough idea of how good a player is, and then look at RPM for his defensive impact, since PER is not really good at capturing that side of the game.
I'd like to see your research on this... What I found was that even totally out of sample (i.e., not using any years that went into RAPM and therefore formed the basis for BPM), and even when giving an incredibly disproportionate weight to teams with heavy roster turnover, BPM was better at predicting team performance than PER. Perhaps a team adjustment would have helped PER close the gap, but as currently constituted at BBR, etc., I don't have any evidence that PER is better at prediction approaching 100% roster turnover.
permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: Request a short overview of the current state of analyti

Post by permaximum »

My work was a straight retrodiction test without any adjustment for different roster turnovers. It simply used the NBA's yearly roster turnover rate. And the result was as I wrote in a previous post:

RPM > BPM >= NPI-RAPM > WS > MPG > PER >= WP.

I even sent you an OBPM-DBPM-DWS-OWS-MPG blend with a small league adjustment for a retrodiction test. I didn't include PER or WP or any other public metric because they didn't help at all. So, my findings support your claim.

As for PER being the best at 100% roster turnover rate, I referred to this article. As I wrote in a previous post, BPM was not tested, but theoretically it shouldn't surpass xRAPM.

I guess heavily weighting for roster turnover doesn't give exactly the same results as a retrodiction test at 100% roster turnover. Still, it looks like PER started to close the gap with more weight on roster turnover.
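
For readers new to the term, here is a minimal sketch of what one step of a retrodiction test might look like. It assumes a plus/minus-style rating, prior-season ratings applied to the minutes actually played in the target season, and a fixed stand-in rating for players with no prior season. All names, the stand-in value, and the numbers are hypothetical; this is not the exact setup used in any of the tests discussed in this thread.

```python
from math import sqrt

def predict_team_rating(prior_ratings, minutes, default_rating=-2.0, on_court=5):
    """Estimate a team's rating from last season's player ratings.

    prior_ratings: {player: rating from the previous season}
    minutes:       {player: minutes actually played in the target season}
    default_rating is an assumed stand-in for players with no prior rating.
    """
    total_minutes = sum(minutes.values())
    if total_minutes <= 0:
        raise ValueError("total minutes must be positive")
    weighted = sum(prior_ratings.get(player, default_rating) * m
                   for player, m in minutes.items())
    # Minute-weighted average rating, scaled by the five players on the floor.
    return weighted / total_minutes * on_court

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual team ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted))

# Hypothetical team: player "D" has no prior-season rating.
prior = {"A": 4.0, "B": 1.0, "C": -2.0}
played = {"A": 3000, "B": 2000, "D": 1000}
estimate = predict_team_rating(prior, played)
print(f"predicted team rating: {estimate:+.1f}")
```

Running this over every team-season and comparing `rmse` across metrics gives a ranking at the league's natural turnover rate. A 100% turnover test would restrict it to rosters made up entirely of players rated on other teams.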
Neil Paine
Posts: 73
Joined: Mon Apr 18, 2011 1:18 am
Location: Philadelphia

Re: Request a short overview of the current state of analyti

Post by Neil Paine »

No worries -- I realize it sounded like I was disputing things... Was just curious if you'd run a test I hadn't seen before.

In any event, it seems to be a testament to the sheer power of usage rate that PER does gain ground over WS when giving extreme weight to changed rosters. Other roles/skills are fluid, but if you can score, you can take that with you just about anywhere.
colts18
Posts: 313
Joined: Fri Aug 31, 2012 1:52 am

Re: Request a short overview of the current state of analyti

Post by colts18 »

Neil Paine wrote:
permaximum wrote:The best metric for evaluating player value should be the one whose out-of-sample prediction accuracy at a 100% roster turnover rate beats the others. Among public metrics, that's currently PER (which is not even empirical). However, player tracking metrics have not been tested at large turnover rates since there is less data, and one of them recently won the prediction test.

In my work, I found that at the NBA's current yearly roster turnover rate, RPM > BPM >= NPI-RAPM > WS > MPG > PER >= WP.

This means most of those metrics are better at capturing team value, but they are not good at distributing it across players. That's why they are somewhat good at predicting the next year at the team level, but as years pass they get worse and worse because of the eventual high roster turnover rate.

So, I would check PER first to get a rough idea of how good a player is, and then look at RPM for his defensive impact, since PER is not really good at capturing that side of the game.
I'd like to see your research on this... What I found was that even totally out of sample (i.e., not using any years that went into RAPM and therefore formed the basis for BPM), and even when giving an incredibly disproportionate weight to teams with heavy roster turnover, BPM was better at predicting team performance than PER. Perhaps a team adjustment would have helped PER close the gap, but as currently constituted at BBR, etc., I don't have any evidence that PER is better at prediction approaching 100% roster turnover.
How does WP or MPG compare in this test?
Neil Paine
Posts: 73
Joined: Mon Apr 18, 2011 1:18 am
Location: Philadelphia

Re: Request a short overview of the current state of analyti

Post by Neil Paine »

colts18 wrote: How does WP or MPG compare in this test?
Great question. I didn't test WP because I didn't have it matched up to player IDs -- and, frankly, it has consistently finished last in tests of this ilk. I don't even see the point in going out of my way to include it anymore.

MPG is a better metric to test. The setup I used wasn't really well equipped to include it (ratings were minute-weighted, which you can't do with MPG, and it will take some thought to figure out how to squeeze it into the SPS framework), but I suspect it would fare quite well. I tinkered a little with a random forest SPM regression predicting RAPM from boxscore inputs, and the most important input was MPG.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Request a short overview of the current state of analyti

Post by DSMok1 »

Neil Paine wrote:
colts18 wrote: How does WP or MPG compare in this test?
Great question. I didn't test WP because I didn't have it matched up to player IDs -- and, frankly, it has consistently finished last in tests of this ilk. I don't even see the point in going out of my way to include it anymore.

MPG is a better metric to test. The setup I used wasn't really well equipped to include it (ratings were minute-weighted, which you can't do with MPG, and it will take some thought to figure out how to squeeze it into the SPS framework), but I suspect it would fare quite well. I tinkered a little with a random forest SPM regression predicting RAPM from boxscore inputs, and the most important input was MPG.
I just wrote a post the other day on predicting BPM from MPG; the results should be very similar to this: viewtopic.php?p=24660#p24660

I use regressed MPG (usually adding 4 games of 0 minutes) to adjust for players who play a bunch of minutes in, say, only the last 3 games of the season.
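
A minimal sketch of that regression, assuming it amounts to padding the games-played denominator with zero-minute games. The function name and sample numbers are made up for illustration.

```python
def regressed_mpg(total_minutes, games_played, padding_games=4):
    """Minutes per game after adding `padding_games` games of 0 minutes."""
    if games_played < 0 or padding_games < 0:
        raise ValueError("game counts must be non-negative")
    games = games_played + padding_games
    if games == 0:
        return 0.0
    return total_minutes / games

# A late-season call-up: 36 minutes in each of only 3 games.
call_up = regressed_mpg(total_minutes=108, games_played=3)
# A full-time starter: 36 minutes in each of 80 games.
starter = regressed_mpg(total_minutes=2880, games_played=80)
print(f"call-up: {call_up:.1f} MPG, starter: {starter:.1f} MPG")
```

Both players have a raw 36.0 MPG, but the padding pulls the three-game sample down sharply while barely moving the 80-game one.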
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
Voyaging
Posts: 19
Joined: Thu Aug 14, 2014 8:47 pm

Re: Request a short overview of the current state of analyti

Post by Voyaging »

How is the NBA.com PIE rating?

Also, Crow, could I see the results of your blend (and DRE, whatever that is)? Thanks!

One last question: what statistic is most likely to produce a ranking that would agree with the "common sense" opinion of the average basketball fan (as opposed to some advanced statistics that rank players like Draymond Green and Andre Iguodala very high)?