APBRmetrics

Posted: **Thu Jun 09, 2011 7:48 pm**

KnickerBlogger

Joined: 30 Dec 2004
Posts: 180

Posted: Thu Dec 15, 2005 4:13 pm Post subject: HOF Standards Test (by Kubatko)
Quote:
In his excellent book The Politics of Glory, Bill James outlined what he called the Hall of Fame Standards Test. In a nutshell, the system awards points to players for various accomplishments: hitting .300, winning 300 games, etc. James designed the system such that the average Hall of Famer had a score of 50. It’s important to note that the system was designed to identify players who are likely candidates for the Hall of Fame rather than players who should be in the Hall of Fame. I wanted to develop a similar system for basketball...

Read more: http://www.courtsidetimes.net/articles/298/
_________________
KnickerBlogger.Net - now indispensable!
Doc319
Guest

Posted: Fri Dec 16, 2005 1:33 am Post subject: Hall of Fame Standards
Justin's article is very interesting. I have written for quite some time about something that Justin mentions: the accomplishments of ABA players have been overlooked by the Hall of Fame. Part of the reason for this is that ABA statistics are not "official." While the NFL includes AFL stats in its official lists, the NBA keeps separate NBA and NBA/ABA lists, with more emphasis placed on the "official" NBA-only numbers. Justin is absolutely correct about Artis Gilmore's achievements; as Justin's numbers indicate, Artis logged some impressive accomplishments comparable to the very best Hall of Famers. He should have been a first ballot inductee--and he was not even a Finalist last year.

For those of you who may have missed it, here is a link to my article about Gilmore:

http://hoopshype.com/articles/gilmore_friedman.htm

Roger Brown is another ABA player who deserves Hall of Fame induction. He had a much shorter and less statistically impressive career than Gilmore, starting his pro career late due to an unfair ban and retiring fairly young due to injuries. Justin, how does Roger Brown's career compute in your system?

--David Friedman
http://20secondtimeout.blogspot.com/

mtamada

Joined: 28 Jan 2005
Posts: 377

Posted: Fri Dec 16, 2005 6:19 am
The Reggie Miller example that someone mentioned is a good one; I agree that he's about where the bottom rung of the Hall of Fame should be. He probably will get more votes than his HoF Standards score suggests, for two reasons (actually, they're variations of the same reason). In addition, I have a couple of other observations about the HoF Standards formula.

1. Reggie had some notable outburst (in a good way) moments in widely televised games. Of course, this item is difficult to quantify in a formula.

2. Playoff performance. According to MikeG's stats, Reggie is one of a relative handful of players who achieved BETTER stats in the playoffs than in the regular season. The popular collective memory supports this clutch-in-the-playoffs notion (see Item 1).

Which is one area where I think the formula could be improved; except for championships won, it appears to completely ignore playoff performances. Real-life HoF votes should, and probably do, take playoff performances into account.

Other stats such as FG% and 3pt% are also left out; it's possible that HoF voters ignore such stats but I think it is more likely that they are taken into account, as part of voters' overall evaluation of a player. Reggie's TS% is not up there with the centers and other big men, but he had an excellent one for a Shooting Guard.

I.e., Reggie was a notable deadeye marksman, which voters will certainly remember and presumably take into account, but the HoF Standards formula ignores this.

So that's Item 3: Points, Rebounds, and Assists are probably the stats that HoF voters pay the most attention to, but the formula shortchanges players who contributed in other ways (and probably overvalues gunners who scored a lot but also missed a lot: Maravich, Wilkins, Chambers).
Defense is a whole other issue, probably every statistical formula undervalues defense -- but it's quite possible that HoF voters do also.

4. HoF "Standards" vs "Monitor"? From the article, the formula appears to focus on actual Hall of Famers and where they stand within the Hall of Fame. That's an interesting exercise, but probably less interesting than a HoF "monitor" which estimates the probability that a player will get elected at all. Later on the article applies the formula to non-HoFers, which is okay but not ideal. Non-HoFers have to get elected first, so ideally we'd want a formula which measures THAT, not where they stack up against Kareem or Dr. J.

5. There's a wide choice of methodologies for deriving such a HoF Monitor. Ordinary Least Squares regression would probably work decently, as would Discriminant Analysis, but Logit (also known as Logistic) Regression might be best. All three of these techniques would have two advantages: they would look at ALL players, not just the Hall of Famers, and identify the chief differences between the HoFers and the non-HoFers. And they would look at all variables; if some variables such as personal fouls are irrelevant then the equations will properly give them a tiny coefficient (in fact I'd probably not even bother putting personal fouls into the regression). But if FG% is important in voters' minds, that can be measured.

There might be different FG% standards based on position (50% is excellent for a guard, not such a big deal for a center) and on time period (50% was virtually unheard of in the 1950s, common in the 1970s and 1980s, and became rarer in the 1990s and 2000s). These can easily be incorporated into the regression.

While I think I understand why the formula caps rewards for things such as championships, it is probably better either to transform the data non-linearly (by taking logarithms, e.g., to reduce the impact of winning 11 championships) or to use splines or step functions (i.e. a lower impact for championships after, say, the 3rd one, maybe lower still after the 6th one), which I think is better than putting a strict ceiling on rewarding the number of championships won. Russell's 11 were almost certainly a factor in his election, so why cap championships at 4?
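
To make the three options concrete, here is a minimal sketch comparing a strict cap, a log transform, and a step function for championships. The function names, the `log(1 + n)` form, the knots at 3 and 6, and the weights 1.0 / 0.5 / 0.25 are all assumptions chosen for illustration; none of them comes from Justin's article.

```python
import math

def capped(titles, cap=4):
    """Strict ceiling: titles beyond `cap` earn nothing extra."""
    return min(titles, cap)

def logged(titles):
    """Log transform (assumed form log(1 + n)): diminishing returns, no ceiling."""
    return math.log(1 + titles)

def stepped(titles, knots=(3, 6), weights=(1.0, 0.5, 0.25)):
    """Step function: full weight up to the 3rd title, half weight for
    titles 4-6, quarter weight after the 6th (hypothetical weights)."""
    value, lower = 0.0, 0
    for knot, weight in zip(knots + (float("inf"),), weights):
        value += weight * max(0, min(titles, knot) - lower)
        lower = knot
    return value

for n in (1, 4, 11):
    print(n, capped(n), round(logged(n), 3), stepped(n))
```

Under the cap, a player with 11 titles scores the same as one with 4; under the other two he still scores higher, just not 11/4 times higher.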

Other items in the proposed formula such as the separate bonus for averaging 20 ppg can also be put into the regression and tested.
Mike G

Joined: 14 Jan 2005
Posts: 3613
Location: Hendersonville, NC

Posted: Fri Dec 16, 2005 6:58 am
By anticipating what qualities the HOF process seems to incorporate, I think Justin's applying those to current members and eligible players. It doesn't really mean he 'agrees' with those standards.

Miller and Rodman may have something of an inside track, in that they are all-time great Specialists. No one else did ____ as well, for so long. Then, since Dennis is such a 'bad boy', Reggie is the 'good guy', of the 2; so he'll wait fewer years.

I share MikeT's reservations about 'maximum' values placed on titles and other categories. Then again, whether for the general public (fans, that is) or for some top-secret committee, such a formula has to be Understandable. Ever try to explain logistic regression to a roomful of cigar-chomping 70-somethings?

Admission to the Hall is based on poetry at least as much as it is by the numbers. In the end, it's all about Fame. Funny how that works.
parinella

Joined: 16 Dec 2005
Posts: 10

Posted: Fri Dec 16, 2005 9:58 am
http://www.baseball-reference.com/about ... _standards
Here's a link to both of Bill James' metrics, the HOF Standards and the HOF Monitor. The Monitor weights seasonal performance more heavily than career performance. Even someone like Rafael Palmeiro, never thought of as one of the most dominant players at any given time (never better than 5th in MVP voting) has more than half of his points from achieving various seasonal milestones like 100 RBI or leading the league in doubles.

Did you look at this too?
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

Posted: Fri Dec 16, 2005 6:46 pm
Doc319 wrote:
Justin, how does Roger Brown's career compute in your system?

Roger Brown's score is 33. I would say that puts him in the category of "legitimate HoF candidate". However, if Artis Gilmore can't get in, I wouldn't hold out hope for Brown.
_________________
Regards,
Justin Kubatko
Basketball-Reference.com

Last edited by jkubatko on Fri Dec 16, 2005 9:53 pm; edited 1 time in total
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

Posted: Fri Dec 16, 2005 6:51 pm
parinella wrote:
Did you look at this too?

No, I didn't try to create a Hall of Fame Monitor. As you said, the Monitor takes into account seasonal accomplishments, while the Standards Test takes into account career accomplishments. Maybe I'll get to that some day.
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

Posted: Fri Dec 16, 2005 6:54 pm
Mike Tamada, your comments deserve a reply, but I want to make sure I put some thought into it before I post. Hopefully I'll get to it this weekend.
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
mtamada

Joined: 28 Jan 2005
Posts: 377

Posted: Fri Dec 16, 2005 9:42 pm
Something which I belatedly thought about: rather than looking at all players as I suggested, it might be better to employ a 2-step process. First, throw out all players who clearly have no chance for the Hall of Fame. This could be done based on their statistics, or something as simple as eliminating all players who were selected for fewer than three All-star games, or some such.

Then do the analysis on the remaining players.

This is sort of an intermediate procedure between looking at all players, and looking only at the characteristics of Hall of Famers (as JustinK's article seems to describe). Most players have no chance for the HoF, and including them in the analysis probably just adds noise.

But if we want a good comparison of those inside the HoF to those outside (rather than a comparison of the players who are already in the HoF), then we don't want to focus solely on HoFers and ignore the non-HoFers.
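
The two-step process above can be sketched roughly as follows. The player rows are made up for illustration, and the field names (`all_star`, `hof`) and the helper `candidate_pool` are hypothetical.

```python
def candidate_pool(players, min_all_star=3):
    """Step 1: drop players selected for fewer than `min_all_star` All-Star games."""
    return [p for p in players if p["all_star"] >= min_all_star]

# Made-up rows for illustration only; not real career data.
players = [
    {"name": "Player A", "all_star": 0, "hof": False},
    {"name": "Player B", "all_star": 2, "hof": False},
    {"name": "Player C", "all_star": 4, "hof": False},
    {"name": "Player D", "all_star": 9, "hof": True},
]

pool = candidate_pool(players)
# Step 2 would fit the regression on `pool` only, so that it still
# contains both HoFers and plausible non-HoFers.
print([p["name"] for p in pool])
```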
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

Posted: Mon Dec 19, 2005 2:20 am
Much of what Mike Tamada suggests (discriminant analysis, logistic regression) I had looked into in the past. Since I don't have any notes for what I attempted, though, I decided to try some logistic regression models once again. I tried to use as many independent variables as I could think of, but I did not use statistics that have not been kept for most of the NBA's history (e.g., steals, turnovers, and blocks). Here is a summary of my findings:

* As I expected, ABA statistics, honors, and championships were not important predictors of HoF status. Thus, when building the model, I only considered NBA statistics. My player pool consisted of players who played a minimum of 400 NBA games and had been eligible for at least one Hall of Fame election. (I don't like ignoring the ABA statistics, but that's what the voters have apparently done. How else can you explain Mel Daniels being left out? Daniels was named to the All-Star team seven times, the All-ABA first team four times, and the All-ABA second team once; won the Rookie of the Year award, two MVP awards, and two playoff MVP awards; and was on three ABA championship teams.)

* The number of NBA All-Star game selections, number of All-NBA selections (first-, second-, or third-team), and number of NBA championships won were important predictors of HoF status. To reduce the impact of large counts, I added 0.5 to each count (to avoid the problem of zero counts) and then took the natural log. The signs of the coefficients for these variables were all positive.

* Points per game, rebounds per game, and assists per game were the only statistics that were particularly useful for predicting HoF status. (None of the shooting percentages were an important predictor of HoF status.) What was odd, though, was that in my early models the coefficients for the per game statistics were all negative. That is, holding all other variables in the model constant, as a given per game statistic increased, the probability of HoF election decreased. I thought that I might be missing a key variable, so I added height to the model, guessing that forwards and centers (usually the taller players) need to "do more" to enter the Hall than guards do. It seems I was correct. The sign of the coefficient for height was negative. In other words, holding all other variables constant, as a player's height increases, his probability of election decreases. After the addition of height, the signs of the coefficients for the per game statistics were all positive. Let me note that I could have -- and perhaps should have -- used a position indicator rather than height.

* My final model had seven independent variables: height, NBA points per game, NBA rebounds per game, NBA assists per game, NBA All-Star game selections, All-NBA selections, and number of NBA championships won. The latter three variables were transformed as described above. Other than height, all of the predictors had positive coefficients. All predictors were significant at the 0.10 level. I did not fit an intercept when building the model. The parameter estimates from the model can be used to obtain predicted probabilities of HoF election.

* The players with the ten highest predicted probabilities of HoF election are:

Code:

Wilt Chamberlain 1.0000
Bill Russell 1.0000
Kareem Abdul-Jabbar .9992
Magic Johnson .9991
Bob Pettit .9987
Larry Bird .9984
Bob Cousy .9980
Oscar Robertson .9979
John Havlicek .9976
Jerry West .9958

All of these players are in the HoF.

* The players with the ten highest predicted probabilities of HoF election who are not in the HoF are:

Code:

Jo Jo White .8120
Dominique Wilkins .8040
Spencer Haywood .7960
Gus Johnson .7886
Dennis Johnson .7356
Bob Dandridge .7077
Joe Dumars .6808
Willie Naulls .5354
Adrian Dantley .5132
Marques Johnson .4815

* The players with the ten lowest predicted probabilities of HoF election who are in the HoF are:

Code:

Lenny Wilkens .2673
David Thompson .2553
Connie Hawkins .2307
Frank Ramsey .1146
Dick McGuire .1116
K.C. Jones .0765
Joe Fulks .0468
Bill Bradley .0429
Dan Issel .0405
Calvin Murphy .0263

The selections of Thompson, Hawkins, and Issel seem to indicate that the voters are willing to consider some ABA accomplishments. Issel's selection makes Artis Gilmore's absence even more puzzling, since Gilmore had the better ABA career and NBA career. Speaking of Gilmore, his predicted probability of election is 0.0913. (Remember, that's based solely on his NBA accomplishments.)

* Finally, the players with the ten highest predicted probabilities of HoF election who are retired but not yet eligible for the HoF are:

Code:

Michael Jordan .9993
Hakeem Olajuwon .9907
David Robinson .9818
Scottie Pippen .9810
Charles Barkley .9753
Karl Malone .9723
John Stockton .8547
Dennis Rodman .8196
Patrick Ewing .7838
Mitch Richmond .7798

At this point I have probably gone on for far too long. I did this rather quickly, so I probably missed some things in the process. If you have any comments or suggestions I would like to hear them.
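
For readers who want to try something similar, here is a minimal sketch of the kind of model described above: a logistic regression with no intercept, with count variables transformed as log(count + 0.5). It is not the code used for the results in this post. To stay short it uses only two of the seven predictors (height and All-Star selections), the rows are made up, and the fitting routine is plain gradient ascent, which is an assumption rather than the method actually used.

```python
import math

def transform_count(count):
    """Add 0.5 (so zero counts are defined), then take the natural log."""
    return math.log(count + 0.5)

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict(weights, row):
    """Predicted probability of election; note there is no intercept term."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)))

def fit_logistic(rows, labels, steps=5000, rate=0.05):
    """Gradient ascent on the mean log-likelihood (illustrative, not optimized)."""
    weights = [0.0] * len(rows[0])
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for row, y in zip(rows, labels):
            err = y - predict(weights, row)
            for j, x in enumerate(row):
                grad[j] += err * x
        weights = [w + rate * g / len(rows) for w, g in zip(weights, grad)]
    return weights

# Made-up players: (height in feet, All-Star selections, 1 if in HoF else 0).
toy = [
    (6.3, 0, 0), (6.8, 1, 0), (6.5, 2, 0), (6.9, 0, 0),
    (6.2, 9, 1), (6.7, 12, 1), (6.9, 7, 1),
]
rows = [[h, transform_count(n)] for h, n, _ in toy]
labels = [y for _, _, y in toy]

weights = fit_logistic(rows, labels)
print(weights)
print(predict(weights, [6.5, transform_count(10)]))
```

The predicted probabilities in the tables above come from the full seven-variable model, so nothing printed here should be compared with them.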
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
mathayus

Joined: 15 Aug 2005
Posts: 212

Posted: Mon Dec 19, 2005 2:35 am
Good work, and I'm particularly happy about the respect given to the ABA. With that said, my ideas for improvement:

I don't like the idea of using both direct statistics and accolades. It seems to me to lead to both double counting and inconsistency. The double counting should be obvious, but as for the inconsistency: if Russell and Chamberlain each win the MVP and finish 2nd by a similar voting margin over a two-year span, why should Chamberlain get the edge because we count his superior statistics in a year where Russell was deemed superior by the voters?

I don't like counting 1st team, 2nd team and MVP accolades all the same weight. I believe Justin had a system before that counted an MVP at 5 times the value of all-NBA 1st team status; that seems more like it. What I'd actually try to do is statistically analyze how rare certain feats are, and weight them accordingly. For example, if there are the same number of MVP winners as players who've been named to the all-NBA 1st team 3 times, maybe make the MVP worth 3 times that of an all-NBA 1st team.
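
A rough sketch of that rarity idea: find how many all-NBA 1st team selections it takes to be as rare as an MVP award, and use that number as the MVP weight. The helper name `mvp_weight` and the counts below are hypothetical, not real award data.

```python
def mvp_weight(first_team_counts, n_mvp_winners):
    """Smallest k such that no more players have at least k first-team
    selections than have won an MVP; k is then the weight of one MVP."""
    k = 1
    while sum(1 for c in first_team_counts if c >= k) > n_mvp_winners:
        k += 1
    return k

# Made-up data: first-team selections for ten players, and four MVP winners.
first_team_counts = [1, 1, 1, 2, 2, 3, 3, 5, 1, 4]
print(mvp_weight(first_team_counts, n_mvp_winners=4))
```

Here four players have at least 3 first-team selections, matching the four MVP winners, so an MVP would be weighted as 3 first-team selections.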

Also, while I know there is no perfect way to do this, I don't like giving all players on a team equal credit for a championship. I think championships are important enough to try to come up with a more precise breakdown.
Mike G

Joined: 14 Jan 2005
Posts: 3613
Location: Hendersonville, NC

Posted: Mon Dec 19, 2005 7:11 am
jkubatko wrote:

... I added height to the model, guessing that forwards and centers (usually the taller players) need to "do more" to enter the Hall than guards do. It seems I was correct.
[...]
The selections of Thompson, Hawkins, and Issel seem to indicate that the voters are willing to consider some ABA accomplishments.

Very interesting that you could (by regression) 'prove' a bias against ABA players and taller players. DT only played his rookie year in the ABA; the Hawk bolted at his first chance. Issel was shorter than Gilmore.

Other forms of discrimination you might investigate for correlations: Where did the guy play? Average miles from NY or LA would yield negative correlations. Black vs. White would, as well.

In your list of '10 likeliest who are not in', none are centers, all are black, the majority did not get NY/LA exposure. In the '10 unlikeliest who are in' list, 5 are white (including the only center, Issel).
schtevie

Joined: 18 Apr 2005
Posts: 413

Posted: Mon Dec 19, 2005 9:46 am
Justin, good work.

It seems to me there are ultimately two types of HOF regressions that are of interest. The first, which you seem to be converging upon, addresses the issue of how selectors have historically decided eligibility. Coming up with a robust specification of this, including stats and apparent biases (among which I would include the Celtics factor: when opening up a HOF, you need to stock it, and the Celtics were the perennial champions) is noteworthy.

The other would be at least as interesting, where the explanatory variables included are arguably the ones that really matter, and the specification is decided a priori. (Never mind the statistical significance.) Thus, absolute totals wouldn't matter (points per game, rebounds per game, etc.) and relative ones (points per possession and proportion of offensive possessions utilized, and rebound rates, etc.) would be substituted for them. To not discriminate against players in eras where average scoring efficiencies were low, these variables could be entered relative to the mean.
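
As a small illustration of entering a rate stat relative to the mean, here is a sketch that divides a player's figure by the league average for his era. Using a ratio (rather than a difference) is an assumption, and the numbers are hypothetical.

```python
def relative_to_mean(player_value, league_mean):
    """Express a rate stat as a ratio to the league average of the same era."""
    return player_value / league_mean

# Hypothetical points-per-possession figures, not real data.
low_scoring_era = relative_to_mean(1.00, 0.92)   # above average for his era
high_scoring_era = relative_to_mean(1.05, 1.06)  # below average for his era
print(round(low_scoring_era, 3), round(high_scoring_era, 3))
```

The second player has the higher raw figure, but the first one rates higher once each is compared with his own league.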

What would be the results of such an effort? Hopefully, a reordering of the hierarchy which would be more meritocratic.
mtamada

Joined: 28 Jan 2005
Posts: 377

Posted: Mon Dec 19, 2005 9:36 pm
Interesting results, especially the negative coefficients which turned positive when you corrected for height.

I don't think the double-counting of accolades and stats is a weakness, in fact I think it's a strength because if HoF voters do indeed take BOTH into account (and it seems intuitively likely that they do), then the equation should measure this fact -- as indeed it appears to do.

In addition to the per game stats, did you also include career totals? Possibly this should include a time trend, as some players' careers (not necessarily the superstars, but the all-stars) seem to be longer now than they were 30 or 40 years ago.

Another thing to try, if you haven't already: what happens if you do not transform the all-star, all-NBA, and MVP variables at all, i.e. use them in their raw linear form? How does the accuracy compare with that of the regression using the transformed variables?

Interesting that the top players not in the HoF were all ones that I think rightfully should not be in -- the Dandridges, Jo Jo Whites, and Spencer Haywoods. Members of the Hall of the Very Good, not the Hall of Fame. So the equation and/or the voters seem to be doing a good job there. On the other hand, the equation estimates pretty high (though not overwhelmingly high) probabilities to those players, so perhaps the equation is telling us that those players ought to be in. Personally I think they are rightfully not.

On the other hand, the low-rated players who are in: most of them I think should be in. Maybe it means the equation has problems, maybe there are extenuating circumstances. There's the ABA angle you already mentioned. There's the NCAA angle: Bill Bradley might be a Hall of Famer even if he'd never played in the NBA. Ramsey and KC Jones might be getting the Celtic boost that Schtevie mentioned; in addition, KC would get an NCAA boost. Fulks and McGuire were from a different era; the statistical equation probably doesn't work well for them. Wilkens, I'm not sure of ... maybe a regression which included career totals and which used his raw instead of his transformed count of all-star games would estimate a higher probability for him.

Quote:
I don't like counting 1st team, 2nd team and MVP accolades all the same weight.

One of the values of these multivariate techniques is they do NOT assign the same weight, nor do they rely on the investigator to guess at what the weights should be. The technique estimates what the optimum weight is. JustinK didn't report the coefficients, but it's safe to assume that the logistic regression's weight on the number of MVP awards was a good deal higher than the weight on all-NBA awards. One bit of fine-tuning that he could do would be to have separate variables for the number of 1st team, 2nd team, and 3rd team awards, but my guess is that would add only a little to the accuracy of the equation.

MikeG's comments about using these sorts of equations to detect anti-ABA or Black-White bias are also interesting. Labor economists frequently do exactly this sort of thing in measuring bias in the salaries of Black vs White athletes, using salary as the dependent variable. Baseball salaries seem to be the most commonly looked at, with NBA salaries next. I imagine that economists have looked at NFL salaries, but I haven't seen a study. There probably aren't enough Black players to make an NHL study of any use.
Mike G

Joined: 14 Jan 2005
Posts: 3613
Location: Hendersonville, NC

Posted: Tue Dec 20, 2005 7:36 am
mtamada wrote:
...
I don't think the double-counting of accolades and stats is a weakness, in fact I think it's a strength ...

Indeed, when allstars and all-NBA lists were drawn from 8-9 teams, a middling player was much more likely to get this kind of accolade. One thing Justin doesn't directly confront is the enormously more stringent entry requirement in this century, compared to the '60s and '70s.

Yet by 'double-counting' accomplishments, he does pretty much exactly account for the relative abundance of old-timers in the Hall.

I've often wondered why Andy Phillip, Tom Gola, Slater Martin, Bobby Wanzer, Arnie Risen are in the Hall. On my list, they're down there with Bradley, KC, Tricky Dick, et al. Oh, and Bob Houbregs.

Now that All great players skip college, I wonder how future elections will be different. I mean, if Bill Bradley's 100-or-so college games were really more relevant than 800+ NBA games (where he was a pretty mediocre player); and in the 21st Century, players really don't have 'college careers'; then Pro accomplishments are all there is.

And so, retrospectively, players with tremendous pro careers ought to be entering the Hall via the 'Veterans' Committee' ( -- better late than never).

If Artis Gilmore played today, he'd be just about the best center in the game, year after year. As it was, he was never all-NBA; he was an (NBA) allstar a modest 6 times. By my measures, he was a legitimate MVP candidate; and probably the most deserving of any non-allstar ever, a time or two.

Sometimes, being 'overrated' carries over from college, to the NBA, to the HOF. Think Pete Maravich, or Earl Monroe. What did they have that Gilmore didn't have?

mtamada

Joined: 28 Jan 2005
Posts: 377

Posted: Tue Dec 20, 2005 7:54 am
Mike G wrote:
I've often wondered why Andy Phillip, Tom Gola, Slater Martin, Bobby Wanzer, Arnie Risen are in the Hall. On my list, they're down there with Bradley, KC, Tricky Dick, et al. Oh, and Bob Houbregs.

Now that All great players skip college, I wonder how future elections will be different. I mean, if Bill Bradley's 100-or-so college games were really more relevant than 800+ NBA games (where he was a pretty mediocre player); and in the 21st Century, players really don't have 'college careers'; then Pro accomplishments are all there is.

[...]

Sometimes, being 'overrated' carries over from college, to the NBA, to the HOF. Think Pete Maravich, or Earl Monroe. What did they have that Gilmore didn't have?

Fame.

I don't know about the others you mention in the first paragraph, but Houbregs got in due to his stellar college career. His pro career, even by 1950s standards, was short and undistinguished.

Bradley, and maybe Maravich and Bill Walton were probably the last (American male) college players who would've been voted in thanks to their college careers rather than their pro careers. Nowadays I think the pro career is the de facto determinant of HoF election. Walton and Maravich both had good pro careers, probably not enough to merit HoF-dom, but their college careers were (under the old way of viewing basketball, when college was relatively more important) more than enough to get them in.

In the case of Maravich and Monroe vs Gilmore, it's partly a question of college career (in Maravich's case), more however I think a case of the anti-ABA bias that you've already mentioned. Although in Gilmore's case there's an additional mysterious anti-Artis bias. Don't know where from. In the age when giants still ruled the league, he was one of them, along with Kareem and Moses. Not as good as either of them, but still good enough for the HoF.
mtamada

Joined: 28 Jan 2005
Posts: 377

Posted: Tue Dec 20, 2005 8:01 am
Oh, there's still the issue of playoff performance. One of the virtues of MikeG's career player ratings is that, almost uniquely among hoop statisticians, he explicitly takes playoff stats into account.

With playoff stats, it's hard to say whether career totals or per game averages would be more important (or maybe per game averages at career peak), so it's another case of it being a good idea to put both into the regression and see what the coefficients are. There's probably a time trend here that would need to be accounted for, as there are more possible playoff games these days (although MikeG points out that the average team is less likely to make the playoffs today, compared to, say, when 8 out of 12 qualified).

Career playoff point and assist accumulations, possibly with a time trend, might make Sam Jones and KC Jones look like more likely HoF candidates.
Mike G

Joined: 14 Jan 2005
Posts: 3615
Location: Hendersonville, NC

Posted: Tue Dec 20, 2005 8:38 am
mtamada wrote:

...in Gilmore's case there's an additional mysterious anti-Artis bias. Don't know where from. In the age when giants still ruled the league, he was one of them, along with Kareem and Moses. Not as good as either of them, but still good enough for the HoF.

Artis never hooked up with a supporting cast good enough to reach the Finals. Moses did, overpowering his superior, Kareem, in the process.

Similarity studies and Hall of Fame studies could have a lot in common. I tend to agree with the side that says we shouldn't give extra credit for commendations, but let the numbers speak for themselves. So with nothing but the pace-adjusted numbers, Artis' career (equivalent) totals look like these players':

      career equivalents    ePts   eReb  eBlk
 .00  Artis Gilmore        26006  16335  3197
 .52  Elvin Hayes          26607  15573  3016
 .56  Robert Parish        25909  16861  2688
 .63  Patrick Ewing        28669  13776  3223
1.06  Moses Malone         30687  19310  2068
1.06  Shaquille O'Neal     30906  13117  2669

Shaq just passed Artis last season, in the total of all these columns. These figures Do include ABA numbers, scaled down by contrived factors, year by year.

All these players are clear Hall-of-Famers; next guys in the list are Hakeem, DRob, Barkley, Erving, Pettit, Russell, Baylor, Pippen, Bird, Issel, Schayes, Havlicek, Wilt, and finally the non-HOF Buck Williams. After Buck are Drexler, Oscar, Lanier, Bellamy...

Obviously, there is a big difference in whether ABA stats are included, even though I've modestly scaled them down. Here's a look at Artis' resemblers on a per-minute basis:

 dif  per 36 minutes     Sco  Reb
 .00  Artis Gilmore       19   12
 .45  Ralph Sampson       17   11
 .47  Harry Gallatin      18   11
 .49  Elvin Hayes         19   11
 .49  Elton Brand         20   11
 .51  Willis Reed         18   11
 .52  Jermaine O'Neal     19   10
 .55  Dan Roundfield      16   11
 .56  Robert Parish       18   11
 .57  Derrick Coleman     18   10
 .57  Patrick Ewing       23   11

Gilmore's rates look fine next to Willis Reed's. In fact, I have checked Artis' NBA sub-career alone, and it still looks better than Willis'; even counting playoff games.

Both these lists include Ast, Stl, PF, TO, and Blk in their computations.
schtevie

Joined: 18 Apr 2005
Posts: 413

Posted: Tue Dec 20, 2005 9:54 am
If we are throwing out a wish list for Justin, I would be intrigued to see the effect of a very rough proxy for individual defensive ability, namely relative team defensive strength (points per possession above or below average, say). It should work to the benefit of all those who played with Bill Russell, but how else, I am curious.
Jon Cohodas

Joined: 08 Jul 2005
Posts: 31
Location: Richmond, VA

Posted: Tue Dec 20, 2005 10:30 am Post subject: Thoughts and questions about jkubatko's study
Quote:
* My final model had seven independent variables: height, NBA points per game, NBA rebounds per game, NBA assists per game, NBA All-Star game selections, All-NBA selections, and number of NBA championships won. The latter three variables were transformed as described above. Other than height, all of the predictors had positive coefficients. All predictors were significant at the 0.10 level. I did not fit an intercept when building the model

First, let me say that I really like this study. I wish I could find the time to do these things rather than just read and chip in my 2 cents on occasion.

Without looking at the data, I would expect that the last three variables would be the best predictors because they measure the same things one looks for in a HOF. A player who is an All-NBA player, or an All-Star in a season is like a HOF player for that season.

I would expect that the interaction would be the exact opposite of the way you modeled it. In other words, I would think that not transforming the variables would predict better (as was noted by others), and that adding terms like the square of the variables would be significant. The interpretation would be that each additional accolade is more and more valuable on the margin.
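A minimal sketch of what adding squared terms could look like, assuming the raw (untransformed) counts are kept. The column names and the example player are made up for illustration; they are not taken from the study's data.

```python
# Hypothetical sketch: add squared accolade counts to a predictor row.
# Column names are illustrative, not taken from the study's data.
ACCOLADES = ("all_star", "all_nba", "championships")

def add_squared_terms(row):
    """Return a copy of `row` with a squared term for each raw accolade count."""
    expanded = dict(row)
    for name in ACCOLADES:
        expanded[name + "_sq"] = row[name] ** 2
    return expanded

# Made-up player: 5 All-Star selections, 2 All-NBA selections, 1 title.
player = {"all_star": 5, "all_nba": 2, "championships": 1}
print(add_squared_terms(player))
```

If the coefficient on a squared column came out positive and significant, that would support the reading that each additional accolade is worth more than the last.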
Kevin Pelton
Site Admin

Joined: 30 Dec 2004
Posts: 979
Location: Seattle

PostPosted: Tue Dec 20, 2005 10:53 am Post subject: Reply with quote
Given the resentment of the anti-ABA bias evident in this thread, I'm wondering, how much work has ever been done on looking at how those statistics translated, either post-merger or for players who switched leagues?

And is there an argument for discounting NBA performance during this period because the league was weakened by having quality players in the ABA?
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

PostPosted: Tue Dec 20, 2005 12:18 pm Post subject: Reply with quote
I never got around to posting how well the model does, so let me do that now. I fit basically the same model as I described earlier, with one significant change: I did not transform the count data. This change improved my error rates, as some suggested it would.

The study had 576 players. Of these 576 players, 504 were not in the HoF and 72 were. I used the model to make predictions of HoF status. If the player's predicted probability of election was greater than or equal to 0.5, I predicted they were in the HoF. Of the 72 players in the HoF, 53 were correctly classified (73.61%) and 19 were not (26.39%). Here are players in the HoF who were not classified as HoF players by the model:

Code:

Clyde Lovellette .485
Harry Gallatin .438
Walt Bellamy .410
Gail Goodrich .403
Bob Davies .367
Jack Twyman .352
Tom Gola .322
Bill Walton .292
Bobby Wanzer .267
Dick McGuire .232
Andy Phillip .203
Earl Monroe .200
David Thompson .192
Connie Hawkins .184
Arnie Risen .163
Dan Issel .037
Calvin Murphy .033
Bill Bradley .024
Joe Fulks .023

Note that K.C. Jones -- who had one of the lowest predicted probabilities of election for a HoF player using the transformed counts -- is not on the list above. His probability of election using the raw counts is .634. Basically, some guys got a big boost from this change (e.g., Jones, Frank Ramsey, and Lenny Wilkens) and some went the other way (e.g., Bill Walton).

Of the 504 non-HoF players, 493 were correctly classified (97.82%) and 11 were not (2.18%). Here are the non-HoF players who were classified as HoF players by the model:

Code:

Dominique Wilkins .961
Jo Jo White .816
Gus Johnson .784
Dennis Johnson .669
Joe Dumars .613
Spencer Haywood .568
Adrian Dantley .562
Willie Naulls .535
Jack Sikma .527
Tom Sanders .520
Richie Guerin .517

Overall, 546 of the 576 players (94.79%) were correctly classified by the model. I don't think Dominique Wilkins will be on this list very long, although I thought he was a shoo-in last year, so who knows. Tom Sanders makes this list mainly because he played on so many championship teams. I don't expect him to make it in the future.
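For anyone who wants to check the rates, here is a small sketch that recomputes them from the counts quoted above. The variable names are mine, and the "always guess non-HoF" baseline is an extra comparison that was not part of the original write-up.

```python
# Counts as reported above, using a 0.5 probability cutoff.
hof_correct, hof_missed = 53, 19      # the 72 HoF players
non_correct, non_wrong = 493, 11      # the 504 non-HoF players

total = hof_correct + hof_missed + non_correct + non_wrong

sensitivity = hof_correct / (hof_correct + hof_missed)   # HoF players caught
specificity = non_correct / (non_correct + non_wrong)    # non-HoF players caught
accuracy = (hof_correct + non_correct) / total

# Baseline for comparison: predict "not in the HoF" for everyone.
baseline = (non_correct + non_wrong) / total

print(f"sensitivity {sensitivity:.2%}")
print(f"specificity {specificity:.2%}")
print(f"accuracy    {accuracy:.2%}")
print(f"baseline    {baseline:.2%}")
```

Because only 72 of the 576 players are in the HoF, the baseline is already 87.5%, so the share of HoF players caught is arguably the more telling number.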

I'll try to get to some of the other suggestions that people made later on.
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
Mike G

Joined: 14 Jan 2005
Posts: 3615
Location: Hendersonville, NC

PostPosted: Tue Dec 20, 2005 12:24 pm Post subject: Reply with quote
Kevin brings up 2 separate issues. The 2nd one is an extension of the 'expansion/dilution' question, in which NBA talent is not equally great on a per-team basis, from year to year. The expansion era of 1967-71 deprived NBA teams of real and potential talent; whether the 'expansion teams' were in the NBA or the ABA matters less.

For example, the Lakers lost Gail Goodrich to the expansion Suns; he may have liked the 3-point shot and gone ABA, but he didn't. Either way, talent tended to drift away from existing teams.

Of course, Goody may have beat the Lakers a time or 2, while he was a Sun. If he were a Buc or a Squire, he wouldn't have. But the players who went to the ABA strengthened that league, and the NBA/ABA conversion rates are somewhat more decipherable than 'expansion adjustments' are.

Which brings us up to Kevin's first point. What kind of 'equivalency' is there between the senior league and the upstarts? What were Mel Daniels' 1213 rebounds in 1968 really worth, in terms of 1968 NBA rebounds?

There are a couple of ways of guessing at this, at least. One way is to compare successive seasons of players migrating from one league to the other, and average their ratios of NBA/ABA totals in the various categories. Another would be to compare per-minute rates and average those.

This latter course presumes that the players would be getting the same minutes in either league. In fact they probably wouldn't, but we don't know. Since the NBA was undergoing rapid expansion during the first half of the ABA's life, lots of guys got more minutes without going to the ABA.
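Either averaging approach can be sketched in a few lines. The three migrants and their rebound rates below are invented purely to show the mechanics; real inputs would be each player's last ABA season and first NBA season (or the reverse).

```python
from statistics import mean

# Hypothetical migrants: (ABA-season rate, following NBA-season rate), per 36 min.
migrants = {
    "Player A": (12.0, 10.0),
    "Player B": (9.0, 8.1),
    "Player C": (15.0, 12.0),
}

def conversion_factor(pairs):
    """Average of each player's NBA/ABA ratio."""
    return mean(nba / aba for aba, nba in pairs)

factor = conversion_factor(migrants.values())
print(round(factor, 3))
```

Feeding in season totals instead of per-minute rates gives the first method; the function itself does not change.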

At some point, I've decided to go with my biases. I like the ABA, because they were innovative; ultimately, the NBA became more like the ABA, than vice versa. The ABA modernized pro basketball, and a merger occurred just in time.

So as a concession to ABA players, I grant them their minutes; but I 'dock' them for their production in what began as an inferior league. By 1976, the leagues were virtually identical in their quality of competition, as evidenced by how players from both leagues had their numbers affected by the merger.

While the NBA added the 4 ABA squads in 1976-77, it was truly a massive contraction season. 3 ABA teams dissolved, and their players were infused into the NBA. Players on every team lost minutes, their rebound% dropped, points and assists were harder to come by.

So for the 9 years of the ABA, there's a rather jerky rise to equivalence with the NBA. The ABA dominated the drafts of 1970 and 1971, grabbing Issel, Erving, Gilmore, McGinnis, Scott, etc. In 1976, talent among 7 remaining ABA teams (after 3 teams had recently folded) seems indistinguishable from the average talent among 18 NBA teams.

People are definitely squeamish about 'docking' ABA stats: Either include them or exclude them. Well, I just can't see Haywood's 30-20 averages of 1970 as making any sense. Yet Erving's numbers of 1976 would be essentially replicated in 1980-82.

So I've used yearly 'docking' rates for ABA seasons; and I generate 'equivalent totals' based on these. Scoring rates of 1968 are hit by a .60 factor; rebounds by .72. Year by year, these rise until both are .95 or more, for 1976.
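A rough sketch of the idea, for illustration only. The 1968 and 1976 endpoints are the ones quoted above; the straight line between them is an assumption made to keep the example short, since the actual year-by-year rise was jerkier.

```python
# Endpoints from the post: 1968 scoring .60, rebounds .72; both about .95 by 1976.
# The straight-line path between them is an assumption, not the real curve.
START, END = 1968, 1976
RATES = {"scoring": (0.60, 0.95), "rebounds": (0.72, 0.95)}

def docking_rate(stat, year):
    """Linearly interpolated ABA-to-NBA factor for an ABA season."""
    lo, hi = RATES[stat]
    frac = (year - START) / (END - START)
    return lo + (hi - lo) * frac

def equivalent_total(stat, year, aba_total):
    """ABA total scaled to an NBA-equivalent total for that season."""
    return aba_total * docking_rate(stat, year)

# Mel Daniels' 1213 ABA rebounds in 1968, mentioned above.
print(round(equivalent_total("rebounds", 1968, 1213)))
```

With the .72 factor, those 1213 rebounds come out to roughly 873 NBA-equivalent rebounds.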
mtamada

Joined: 28 Jan 2005
Posts: 377

PostPosted: Tue Dec 20, 2005 6:20 pm Post subject: Reply with quote
Mike G wrote:
There are a couple of ways of guessing at this, at least. One way is to compare successive seasons of players migrating from one league to the other, and average their ratios of NBA/ABA totals in the various categories. Another would be to compare per-minute rates and average those.

[...]

At some point, I've decided to go with my biases. I like the ABA, because they were innovative; ultimately, the NBA became more like the ABA, than vice versa. The ABA modernized pro basketball, and a merger occurred just in time.

[...]

So I've used yearly 'docking' rates for ABA seasons; and I generate 'equivalent totals' based on these. Scoring rates of 1968 are hit by a .60 factor; rebounds by .72. Year by year, these rise until both are .95 or more, for 1976.

Another strength of MikeG's work, which I haven't seen other people do, is to explicitly include ABA stats, but with docking. I'd thought your docking rates had been directly derived from your analysis of the year-to-year stats of the migrants?

One potential use of JustinK's regression results, if ABA stats were added as additional variables, is that they would give us a hard-number estimate of the relative weight of ABA stats and NBA stats.

Of course, this weight would merely reflect the judgements of the HoF voters, it would not necessarily be the correct weight or the objective weight that should be used.

It's worthwhile at this point to address a point made by some others, against the use of subjective accolade stats such as MVP or all-stardom. There are, as someone mentioned earlier, two types of HoF formulas we could create. A "normative" one would be one which attempts to measure who "should" be in the HoF, using (hopefully) some sort of objective or theoretically based standards. A purely-stats based one which ignores subjective judgements such as MVP awards is usually attempting to achieve this.

The other kind of formula is the "positive" one, which doesn't worry about who should be in, but instead simply asks how do the HoF voters actually behave, how can we predict who's in and who's not. If MVP awards, personality, race, Celtic bias, anti-ABA bias, or whatever are part of the process, then the formula will attempt to measure it. Not to justify those biases, but simply to say that they are factors which voters have taken into account.

Both formulas have their uses. The regression approach that JustinK uses is firmly in the "positive" mode; it could be used as a springboard for future normative formulas, but regressions are by nature empirical analytic techniques; they look at what actually happened, they do not attempt to tell us what should have happened. They take the stats into account, but also the accolades, because that's what HoF voters actually do.
Doc319
Guest

PostPosted: Sun Dec 25, 2005 12:57 am Post subject: Gilmore Article in Dec. 9 Issue of Sports Collectors Digest Reply with quote
Here is how I summarized my SCD Gilmore article at 20 Second Timeout:

Artis Gilmore Article Published in Dec. 9 Issue of Sports Collectors Digest

Artis Gilmore ranks first in NBA/ABA history in career regular season field goal percentage, third in blocked shots (this statistic was not recorded during the careers of Wilt Chamberlain and Bill Russell), fifth in rebounds and 18th in points. He was the 1975 ABA Playoff MVP, leading the Kentucky Colonels to the championship. Prior to that he enjoyed a distinguished collegiate career at Jacksonville, becoming one of only five Division I players to average 20-plus points and 20-plus rebounds for a career. It is inexplicable and inexcusable that he has not been inducted in the Basketball Hall of Fame.

The December 9 issue of Sports Collectors Digest contains my lengthy profile of Gilmore's career, including the A-Train's recollections of playing against UCLA in the 1970 NCAA Championship, facing off against Wilt Chamberlain in the 1972 NBA-ABA All-Star Game and competing against Kareem Abdul-Jabbar in the 1983 Western Conference Finals.

--David Friedman
http://20secondtimeout.blogspot.com/

Mike G

Joined: 14 Jan 2005
Posts: 3615
Location: Hendersonville, NC

PostPosted: Mon Dec 26, 2005 11:33 am Post subject: Reply with quote
mtamada wrote:
... to explicitly include ABA stats, but with docking. I'd thought your docking rates had been directly derived from your analysis of the year-to-year stats of the migrants?


Yes, you could call it analysis. But I had to fairly liberally smooth the curve, which for some seasons was based on only 4-5 players, or only 1-2 major ones.

The 'docking' still doesn't make much of a dent in the careers of hugely-productive guys (Erving, Gilmore, etc) who dominated the 2nd-half of the ABA timeline. However, perennial allstars of the 1st 4-5 years are hugely taxed (Daniels, etc).

Since I don't remember how recently I've said it, I'll repeat: Gilmore is quite likely held out of the Hall due to the 'baggage' that many ABA fans would like him to carry with him: Precisely those original ABA'ers like Mel and Rajah, who with unadulterated stats make much more presentable cases.

Quote:
One potential use of JustinK's regression results, if ABA stats were added as additional variables, is that they would give us a hard-number estimate of the relative weight of ABA stats and NBA stats.

Of course, this weight would merely reflect the judgements of the HoF voters, it would not necessarily be the correct weight or the objective weight that should be used.

It might be a challenge to lay correlations out in a sober manner, without making the HoF voting process look ridiculous. I suspect ABA stats may actually have a negative correlation, as do height, African-ness, and distance from NY/Bos/LA.

The 'old guard' which has hijacked objectivity surely has to die off sometime. Or be put out to pasture.
gabefarkas

Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC

PostPosted: Mon Jan 02, 2006 8:49 pm Post subject: Reply with quote
admin wrote:

And is there an argument for discounting NBA performance during this period because the league was weakened by having quality players in the ABA?

I tried to make that same argument a few months back on the e-mail list, but no one seemed to be having any of it.

That said, I'm still a strong believer in it. If the ABA stats have to be discounted because so many good players were in the NBA, then shouldn't NBA stats also be discounted accordingly based on the good players that were in the ABA?
Mike G

Joined: 14 Jan 2005
Posts: 3615
Location: Hendersonville, NC

PostPosted: Tue Jan 03, 2006 7:19 am Post subject: Reply with quote
Gabe, earlier in the thread I suggested breaking this into 2 components: NBA/ABA equivalency, and NBA expansion.

If I/we grant a factor like:

10 ABA Reb/48 = 9 NBA Reb/48

for the year 1971, that is an ABA equivalency to the NBA in that year, for rebounds.

We might further decide on an NBA expansion factor for 1971, like:

10 '1971' Reb/48 = 8 '2006' Reb/48 .

Now, we find that 10 ABA Reb/48 (1971) = 10 * .9 * .8 = 7.2 NBA Reb/48 (2006)

The ABA rebound is still just 90% of the NBA rebound, of 1971. Discounting NBA stats for 1968-76 (or any other time), due to expansion or absent talent, is a separate issue.
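The two factors are independent and simply chain together. A tiny sketch using the example numbers above (the variable and function names are mine):

```python
# Example factors from the post; both are illustrative, not measured values.
aba_to_nba_1971 = 0.9    # 10 ABA Reb/48 = 9 NBA Reb/48, same year
nba_1971_to_2006 = 0.8   # 10 '1971' Reb/48 = 8 '2006' Reb/48

def to_2006_nba(aba_rate):
    """Apply the league equivalency first, then the expansion-era adjustment."""
    return aba_rate * aba_to_nba_1971 * nba_1971_to_2006

print(round(to_2006_nba(10), 2))  # 7.2
```

Dropping either factor to 1.0 switches that adjustment off without touching the other, which is the point of keeping the two questions separate.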

We can see how playing time increases for players continuing through expansion seasons. The challenge is in creating stat equivalencies based on PT changes.
_________________
36% of all statistics are wrong


Mike G

Joined: 14 Jan 2005
Posts: 3564
Location: Hendersonville, NC

PostPosted: Fri Dec 16, 2005 11:34 am Post subject: Kubatko all over the place: Similarity Index Reply with quote
http://sportsillustrated.cnn.com/2005/w ... ity/3.html

Justin gets into comparing a sampling of past players -- mostly some greats, but not all -- to last year's players in the NBA. Using career rates for the alltimers, we get a reasonable look at currently similar players.

Since I've been doing this for years, it's easy enough to compare notes. Justin says the closest 2005 comps to Kiki Vandeweghe are Szczerbiak and Redd. I can't disagree! For Bernard, he has Carmelo, followed by Maggette. Bingo! on both counts.

Some others are iffier. For Dennis Rodman, he likes Andre Iguodala and Reggie Evans. Evans, in a most untypical year for him, looks like clearly the closest fit -- even if only for 24 minutes. But Iguodala?

From the article:
"...Rodman's rebound rate (23.4 percent) dwarfs Iguodala's (9.7 percent)....the similarity score for these two players is only 810. While some may see this as a failure of the similarity scores method, I see it as success. Rodman was a rather unique player, so it's not surprising that it's hard to find a good match for him."

After Evans, I see another 108 NBA players from 2005 whose stats look more similar to Rodman's, and finally I see Iguodala. Kurt Thomas and Jeff Foster, for starters.

It's clearly a failure of the method. I'm guessing it weighs Reb% and FT% about equally, even though FT had nothing to do with Rodman's game. Rebounds, of course, were his game.

Thanks to Reggie Evans' freakishly-good year, Rodman's not the most unique player to compare to last year's bunch. Bird, Magic, Moses, Hakeem, and Stockton are even further from their nearest resemblers. As are Walton, McGinnis, Gilmore, Wilt, and Russell.
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

PostPosted: Fri Dec 16, 2005 12:04 pm Post subject: Reply with quote
Mike, that article has been up for almost a month, and was already linked to on this board.
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
KnickerBlogger

Joined: 30 Dec 2004
Posts: 180

PostPosted: Fri Dec 16, 2005 12:07 pm Post subject: Re: Kubatko all over the place: Similarity Index Reply with quote
Mike G wrote:
http://sportsillustrated.cnn.com/2005/w ... ity/3.html

Justin gets into comparing a sampling of past players -- mostly some greats, but not all -- to last year's players in the NBA. Using career rates for the alltimers, we get a reasonable look at currently similar players.

Since I've been doing this for years, it's easy enough to compare notes.

Are these published anywhere?
_________________
KnickerBlogger.Net - now indispensable!
Mike G

Joined: 14 Jan 2005
Posts: 3564
Location: Hendersonville, NC

PostPosted: Fri Dec 16, 2005 12:44 pm Post subject: Reply with quote
Unlike other similarity measures, I use a 'difference' measure. Zero (.00) is the difference between a player and himself.
Based on productivity statistics (not height, age, weight), adjusted to game pace; some NBA-top-50 players and their nearest 2005 counterparts:

dif per-36-minute Sco Reb Ast
.00 Charles Barkley 23 12 4
.57 Carlos Boozer 21 11 3
.66 Kevin Garnett 25 14 6

.00 Larry Bird 23 10 6
.64 Chris Webber 18 10 5
.67 Paul Pierce 25 7 4

.00 Clyde Drexler 20 7 6
.50 Larry Hughes 21 6 5
.57 Steve Francis 20 6 7

Anything over .50 is not really very similar.

.00 Alex English 21 6 4
.39 Carmelo Anthony 22 6 3
.41 Jason Richardson 21 6 4

.00 Patrick Ewing 23 11 2
.46 Elton Brand 22 10 3
.49 Zydrunas Ilgauskas 20 10 1

.00 Magic Johnson 19 8 10
.77 Jamaal Tinsley 18 5 8
.78 Jason Kidd 15 8 9
.79 Steve Francis 20 6 7

.00 Michael Jordan 31 6 5
.55 Vince Carter 28 6 5
.58 Tracy McGrady 26 6 6

dif per-36-minute Sco Reb Ast
.00 Karl Malone 27 11 4
.55 Dirk Nowitzki 29 10 3
.72 Pau Gasol 25 9 3
.72 Paul Pierce 25 7 4

.00 Moses Malone 21 13 1
.70 Drew Gooden 18 11 2
.78 Zach Randolph 20 11 2
.78 Elton Brand 22 10 3
.78 Zydrunas Ilgauskas 20 10 1

.00 Kevin McHale 21 9 2
.48 Zydrunas Ilgauskas 20 10 1
.51 Elton Brand 22 10 3

.00 Hakeem Olajuwon 23 12 3
.68 Elton Brand 22 10 3
.79 Tim Duncan 26 13 3

.00 Robert Parish 18 11 2
.42 Drew Gooden 18 11 2
.45 Chris Bosh 17 9 2

.00 Scottie Pippen 18 7 6
.49 Chris Webber 18 10 5
.68 Larry Hughes 21 6 5

dif per-36-minute Sco Reb Ast
.00 David Robinson 24 12 3
.54 Elton Brand 22 10 3
.54 Tim Duncan 26 13 3

.00 John Stockton 16 3 12
.74 Steve Nash 19 3 11
.74 Brevin Knight 11 3 11
.74 Baron Davis 20 4 9

.00 Isiah Thomas 18 4 9
.32 Baron Davis 20 4 9
.34 Jamaal Tinsley 18 5 8

.00 Dominique Wilkins 25 7 2
.55 Carmelo Anthony 22 6 3
.57 Grant Hill 22 5 3

.00 James Worthy 19 6 3
.36 Jason Richardson 21 6 4
.43 Grant Hill 22 5 3

The above players' careers are entirely after 1978, when stats are considered to be 'complete'. Players whose careers spanned or overlapped the 1974-78 'semi-complete' stat era include these guys:

dif per-36-minute Sco Reb Ast
.00 Kareem Abdul-Jabbar 25 11 3
.42 Pau Gasol 25 9 3
.45 Elton Brand 22 10 3

.00 Nate Archibald 18 2 7
.47 Jason Terry 18 3 6
.56 Earl Boykins 17 2 6

.00 Rick Barry 22 6 4
.38 Larry Hughes 21 6 5
.39 Manu Ginobili 26 6 5

.00 Dave Cowens 16 12 4
.35 Lamar Odom 16 11 4
.55 Mehmet Okur 18 11 3

.00 Julius Erving 23 8 4
.70 Pau Gasol 25 9 3
.70 Andrei Kirilenko 21 8 4
.70 Paul Pierce 25 7 4

.00 Walt Frazier 19 5 6
.39 Mike Bibby 20 4 6
.49 Tony Parker 20 4 7

.00 George Gervin 24 5 3
.50 Carmelo Anthony 22 6 3
.64 Grant Hill 22 5 3

dif per-36-minute Sco Reb Ast
.00 Artis Gilmore 19 12 2
.61 Zydrunas Ilgauskas 20 10 1
.68 Elton Brand 22 10 3

.00 Elvin Hayes 19 11 2
.42 Zydrunas Ilgauskas 20 10 1
.43 Chris Bosh 17 9 2

.00 Dan Issel 21 9 2
.13 Shareef Abdur-Rahim 21 8 2
.47 Zach Randolph 20 11 2

.00 Bob Lanier 21 10 3
.43 Elton Brand 22 10 3
.55 Pau Gasol 25 9 3

.00 Pete Maravich 22 4 5
.43 Richard Hamilton 21 4 5
.54 Gilbert Arenas 26 5 5

.00 Bob McAdoo 23 10 2
.44 Elton Brand 22 10 3
.44 Pau Gasol 25 9 3

.00 George McGinnis 19 10 4
.67 Chris Webber 18 10 5
.72 Carlos Boozer 21 11 3

dif per-36-minute Sco Reb Ast
.00 Wes Unseld 10 12 4
.57 P.J. Brown 12 10 3
.68 Kurt Thomas 11 11 2

.00 Bill Walton 16 13 4
.78 Lamar Odom 16 11 4
.88 Marcus Camby 12 12 3

I don't separate out the various shooting %. Rather, they are 'built in' to the Scoring rate. I've also factored in Stl, TO, Blk, and PF; but I'm not showing them, in the interest of visual clarity.

For years before 1974, I've 'invented' Stl, TO, and Blk figures. This has advantages and disadvantages; but it does offer the prospect of comparing players from all eras. If this is anathema to you, do not read further.

dif per-36-minute Sco Reb Ast
.00 Paul Arizin 23 6 2
.42 Carmelo Anthony 22 6 3
.55 Shareef Abdur-Rahim 21 8 2

.00 Elgin Baylor 23 10 4
.53 Pau Gasol 25 9 3
.64 Elton Brand 22 10 3

.00 Walt Bellamy 18 11 2
.32 Mehmet Okur 18 11 3
.37 Drew Gooden 18 11 2
.37 Zach Randolph 20 11 2

.00 Dave Bing 19 3 6
.32 Jason Terry 18 3 6
.38 Mike Bibby 20 4 6

.00 Wilt Chamberlain 23 15 4
.71 Tim Duncan 26 13 3
.89 Kevin Garnett 25 14 6

.00 Bob Cousy 19 4 8
.30 Baron Davis 20 4 9
.33 Jamaal Tinsley 18 5 8

.00 Billy Cunningham 20 9 4
.53 Chris Webber 18 10 5
.66 Kenyon Martin 18 9 3

dif per-36-minute Sco Reb Ast
.00 Dave DeBusschere 15 9 3
.36 Nene Hilario 14 9 2
.40 Kenyon Martin 18 9 3
.42 Rasheed Wallace 16 9 2

.00 Hal Greer 18 4 4
.47 Mike James 17 4 5
.48 Ricky Davis 18 4 3

.00 Cliff Hagan 20 6 4
.53 Stephen Jackson 21 6 3
.55 Mike Miller 21 5 4

.00 John Havlicek 20 5 5
.39 Jason Richardson 21 6 4
.47 Grant Hill 22 5 3

.00 Connie Hawkins 16 7 4
.44 Kenyon Martin 18 9 3
.51 Darius Miles 16 6 3

.00 Neil Johnston 24 10 3
.39 Pau Gasol 25 9 3
.45 Elton Brand 22 10 3

.00 Sam Jones 21 4 3
.31 Grant Hill 22 5 3
.36 Cuttino Mobley 18 4 3

dif per-36-minute Sco Reb Ast
.00 Jerry Lucas 16 12 3
.57 Lamar Odom 16 11 4
.65 Rasheed Wallace 16 9 2

.00 George Mikan 27 13 3
.58 Shaquille O'Neal 29 12 3
.66 Jermaine O'Neal 27 10 2

.00 Earl Monroe 20 3 4
.39 Ricky Davis 18 4 3
.46 Richard Hamilton 21 4 5

.00 Bob Pettit 25 12 3
.52 Pau Gasol 25 9 3
.54 Elton Brand 22 10 3

.00 Willis Reed 18 11 2
.40 Drew Gooden 18 11 2
.40 Zydrunas Ilgauskas 20 10 1

.00 Oscar Robertson 22 5 8
.46 Baron Davis 20 4 9
.53 Steve Francis 20 6 7

dif per-36-minute Sco Reb Ast
.00 Bill Russell 13 15 4
.64 Marcus Camby 12 12 3
1.08 Ben Wallace 10 13 2

.00 Dolph Schayes 21 10 3
.48 Elton Brand 22 10 3
.54 Kenyon Martin 18 9 3

.00 Bill Sharman 21 3 3
.36 Jalen Rose 21 4 3
.42 Ben Gordon 22 4 3

.00 Nate Thurmond 13 12 3
.51 Marcus Camby 12 12 3
.61 Chris Andersen 14 10 2

.00 Jerry West 25 4 6
.58 Manu Ginobili 26 6 5
.59 Dwyane Wade 27 5 7

.00 Lenny Wilkens 16 4 7
.25 Kirk Hinrich 16 4 7
.43 Rafer Alston 15 4 7

There are many mentions of Ilgauskas, Gasol, Brand, Kenyon. I guess these are players who are very typical (or archetypical) at their position; and which seem to be in short supply, at present.

Last edited by Mike G on Sat Dec 17, 2005 6:41 am; edited 1 time in total
Mike G

Joined: 14 Jan 2005
Posts: 3564
Location: Hendersonville, NC

PostPosted: Fri Dec 16, 2005 1:16 pm Post subject: Reply with quote
It occurs to me that there is no universally appropriate 'publishable' form of this stuff. However, anyone who wishes may own the Excel workbook from which they are derived. Just hit the 'email' button at the bottom of the post, and request a copy. I have a '571 career-averages' version and a 'player-season' version (which is much the larger one).
bchaikin

Joined: 27 Jan 2005
Posts: 685
Location: cleveland, ohio

PostPosted: Fri Dec 16, 2005 3:45 pm Post subject: Reply with quote
in the article for the examples given for Player = Current Counterparts, were these merely examples, or the closest current comparisons? i.e. is this saying mcgrady and vinsanity are the closest comparisons for jordan, or are they just 2 examples and there are others as close or closer? and were these comparisons age based, or just stats based?...

Please note that even though two players may be statistically similar, it does not mean that they were the same type of player. It simply means they produced similar results.

by "similar results" was this meant in just certain statistical categories, all statistical categories (if not which were left out), or how they helped their teams win games?...

http://www.basketball-reference.com/about/similar.html

at the above webpage i do not see steals in any part of the formula - are they included in any similarity calculations?...
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

PostPosted: Fri Dec 16, 2005 6:10 pm Post subject: Reply with quote
bchaikin wrote:
in the article for the examples given for Player = Current Counterparts, were these merely examples, or the closest current comparisons? i.e. is this saying mcgrady and vinsanity are the closest comparisons for jordan, or are they just 2 examples and there are others as close or closer? and were these comparisons age based, or just stats based?...

The current players presented were the two most similar players. I used a career composite for the past greats and not-so-greats, and used 2004-05 statistics for the current players. So among all players in 2004-05, McGrady and Carter had results that were most similar to a "typical" Michael Jordan season.

bchaikin wrote:
by "similar results" was this meant in just certain statistical categories, all statistical categories (if not which were left out), or how they helped their teams win games?...

The categories are listed here.

bchaikin wrote:
http://www.basketball-reference.com/about/similar.html

at the above webpage i do not see steals in any part of the formula - are they included in any similarity calculations?...

The method is open source, so that page explains everything. Steals are not included. I remember having a reason for not using them, but can't recall it at the moment. Perhaps I should revisit the issue.
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

PostPosted: Fri Dec 16, 2005 6:24 pm Post subject: Reply with quote
Mike G wrote:
Unlike other similarity measures, I use a 'difference' measure. Zero is the difference between a player and himself.

I'm not sure what you mean by that. In my system, 1000 is the top score. If you compare a player to himself, the similarity score will be 1000. That's the same as saying the difference between a player and himself is 0. Also, I do look at differences. This article explains my method in great detail. I could present the results as differences I suppose, but I think they're much easier to interpret when presented as a part of a whole. For example, which of the following is easier to interpret:

1) The difference between the two players is 95.

or

2) The similarity between the two players is 905 out of 1000.
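To make the equivalence concrete, here is a minimal sketch. It assumes the score is simply 1000 minus the accumulated difference, which is how the 95 versus 905 example above reads; the function names are only for illustration.

```python
MAX_SCORE = 1000  # a player compared to himself

def similarity_from_difference(diff):
    """Convert an accumulated difference into a score out of 1000."""
    return MAX_SCORE - diff

def difference_from_similarity(sim):
    """Convert a score out of 1000 back into a difference."""
    return MAX_SCORE - sim

print(similarity_from_difference(95))    # 905
print(difference_from_similarity(1000))  # 0
```

Both directions carry the same information; the only choice is which end of the scale reads as "identical".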
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
mtamada

Joined: 28 Jan 2005
Posts: 376

PostPosted: Fri Dec 16, 2005 8:58 pm Post subject: Reply with quote
One name which popped out at me from MikeG's list was Pau Gasol. He's on the Top 3 Most Similar for the Mailman, Kareem, Dr. J, and Bob Lanier. He's not necessarily real similar to most of them, but he was in the Top 3.

This brings up some mildly interesting possibilities: the player who was "most similar to the most players" (or alternatively, the most superstars) -- not necessarily the one with the greatest similarity score, but one with lots and lots of big similars. Gasol's a highly talented player, but he wouldn't have been my guess to be the player with similarity to those Hall of Famers.
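Counting this is straightforward once the comps lists are in hand. A quick sketch, using only the four lists from MikeG's tables that I mentioned above (a full count would need every list in his post):

```python
from collections import Counter

# Nearest 2005 comps copied from MikeG's tables for four players.
comps = {
    "Karl Malone": ["Dirk Nowitzki", "Pau Gasol", "Paul Pierce"],
    "Kareem Abdul-Jabbar": ["Pau Gasol", "Elton Brand"],
    "Julius Erving": ["Pau Gasol", "Andrei Kirilenko", "Paul Pierce"],
    "Bob Lanier": ["Elton Brand", "Pau Gasol"],
}

# How many lists does each current player show up in?
appearances = Counter(name for names in comps.values() for name in names)
print(appearances.most_common(1))  # [('Pau Gasol', 4)]
```

This counts appearances only; it says nothing about how close each match is, which is the distinction between "most similar to the most players" and "greatest similarity score".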

In his Minkoff Player Ratings from several years ago, Tony Minkoff identified the "Most Average Player" one season -- I think it was Derrick McKey.

I'd have to look at the stats more closely, but I share MikeG's reaction to the Iguodala-Rodman similarity score. Although Rodman was unique, he was not far from a pretty common NBA player-type: the non-shooting big rebounder. Prior to Rodman, Mark Landsberger was probably the prototype. But there's a ton of others: Pete Cross, Larry Smith, Charles Oakley, Kurt Rambis, etc. And their slightly higher-scoring brethren: Lloyd Neal, Paul Silas, etc. Surely most or all of these players should have greater similarity to Rodman than Iguodala does.
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

PostPosted: Fri Dec 16, 2005 9:01 pm Post subject: Reply with quote
mtamada wrote:
I'd have to look at the stats more closely, but I share MikeG's reaction to the Iguodala-Rodman similarity score. Although Rodman was unique, he was not far from a pretty common NBA player-type: the non-shooting big rebounder. Prior to Rodman, Mark Landsberger was probably the prototype. But there's a ton of others: Pete Cross, Larry Smith, Charles Oakley, Kurt Rambis, etc. And their slightly higher-scoring brethren: Lloyd Neal, Paul Silas, etc. Surely most or all of these players should have greater similarity to Rodman than Iguodala does.

Well sure, but I only used players from the 2004-05 season.
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

PostPosted: Fri Dec 16, 2005 9:08 pm Post subject: Reply with quote
As a follow-up, check out Rodman's page on my web site. In the similarity scores section you'll find names like Smith, Rambis, Oakley, and Silas.
_________________
Regards,
Justin Kubatko
Basketball-Reference.com
bchaikin

Joined: 27 Jan 2005
Posts: 685
Location: cleveland, ohio

PostPosted: Fri Dec 16, 2005 10:32 pm Post subject: Reply with quote
Please note that even though two players may be statistically similar, it does not mean that they were the same type of player. It simply means they produced similar results.

The categories are listed here.

i've looked at that page, but your article makes clear the above statement on the first page. so what exactly do you mean by "...it simply means they produced similar results..."? what do you mean by results? do you mean the word statistics? just some of their statistics? or by "results" do you mean their effect on their teams?...

you state two players may be statistically similar, then go on to say they may not be the same type of player, but that they do produce similar results, as if to imply that overall their different combination of statistics are in some way - when totaled - similar...

Steals are not included. I remember having a reason for not using them, but can't recall it at the moment. Perhaps I should revisit the issue.

a steal is credited to one player and results in a zero point possession for the opponent (well the vast majority of the time anyway). there aren't many ways a single player can force the opponent into a zero point team possession by himself - a steal, forcing a turnover (which we don't have defensive stats for - yet), taking a charge...

forcing the man you're guarding into missing a shot isn't as good as a steal because a rebound ensues that could be offensive - same with a blocked shot. since def rebs occur on about 2/3 (a little more last season) of missed shots, for a single defender to get credit for a steal is - statistically speaking - a greater effect than forcing a miss because that miss could be rebounded by the offense, continuing that team's possession...
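
The arithmetic behind that comparison can be sketched as follows. This is a minimal illustration, assuming a steal always ends the possession and using the "about 2/3" defensive rebound share quoted above; the function name and event labels are hypothetical.

```python
# Share of missed shots rebounded by the defense; "about 2/3" is the figure
# quoted in the post, not a measured value.
DEF_REB_RATE = 2 / 3


def possession_end_probability(event: str, def_reb_rate: float = DEF_REB_RATE) -> float:
    """Probability that one defensive event ends the opponent's possession.

    Simplifying assumptions: a steal always ends the possession, and a forced
    miss or block ends it only when the defense secures the rebound.
    """
    if event == "steal":
        return 1.0
    if event in ("forced_miss", "block"):
        return def_reb_rate
    raise ValueError(f"unknown event: {event}")


steal = possession_end_probability("steal")
miss = possession_end_probability("forced_miss")
ratio = steal / miss  # forced misses needed to match one steal, in expected stops
print(f"steal: {steal:.2f}, forced miss: {miss:.2f}, ratio: {ratio:.2f}")
```

Under these assumptions one steal is worth roughly one and a half forced misses in expected stops; a different rebound rate changes the ratio accordingly.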
Mike G

Joined: 14 Jan 2005
Posts: 3564
Location: Hendersonville, NC

PostPosted: Sat Dec 17, 2005 7:04 am Post subject: Reply with quote
I think of similarity as "what other player does this player's job just about as well". So if I have to rest Dennis Rodman, who is an able replacement? Iguodala or Foster?

Or, if Rodman bolts as a free agent and I need to fill that void, what player can I hope to do that with, mostly? No doubt Rodman's quite unique; but it may be these 'outlier' types that really create the hard tests of our systems.

Checking career rates of alltime-greats against seasonal numbers of current players is something I'd never done before. Season vs season, or career vs career makes some sense. However, fun is fun; and peak years of alltime greats are going to tend to be rather incomparable.

I've thought of going with the <1000 scale, rather than >0. Since I use 7 variables (Sco, Reb, Ast, Stl, TO, Blk, PF), it's like a 7-dimensional dartboard's bullseye. There will be many, many more players within 1.00 than within .50 .
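
To make the dartboard picture concrete, here is a rough sketch. It assumes similarity is a Euclidean distance over the seven categories, each scaled by some per-category spread; the post does not give the actual formula, so the scaling, the function names, and the sample stat lines are all hypothetical.

```python
import math

CATEGORIES = ("Sco", "Reb", "Ast", "Stl", "TO", "Blk", "PF")


def similarity_distance(a: dict, b: dict, spread: dict) -> float:
    """Euclidean distance between two stat lines, each category scaled by `spread`."""
    return math.sqrt(sum(((a[c] - b[c]) / spread[c]) ** 2 for c in CATEGORIES))


def volume_ratio(r_big: float, r_small: float, dims: int = len(CATEGORIES)) -> float:
    """Ratio of the volumes of two balls of the given radii in `dims` dimensions."""
    return (r_big / r_small) ** dims


# Hypothetical stat lines and spreads, for illustration only.
spread = {c: 1.0 for c in CATEGORIES}
player_a = {"Sco": 10.0, "Reb": 12.0, "Ast": 2.0, "Stl": 1.0, "TO": 2.0, "Blk": 1.0, "PF": 3.0}
player_b = {"Sco": 13.0, "Reb": 8.0, "Ast": 2.0, "Stl": 1.0, "TO": 2.0, "Blk": 1.0, "PF": 3.0}

print(similarity_distance(player_a, player_b, spread))
print(volume_ratio(1.00, 0.50))
```

Because volume in seven dimensions grows with the seventh power of the radius, the region within 1.00 is 2^7 = 128 times larger than the region within .50, which is why so many more players land inside the wider ring.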

Including players from before 1978 (and '74) invites the phenomenon of seeing certain players comparable to many oldtimers. Since I've assigned 'average' (based on other stats) Stl, Blk, and TO to that era's players, current players with 'average' stats in those categories will push to the head of the line. It's a 2-edged sword (or razor).

I appreciate the variety of methods in this endeavor. Sometimes I get a good idea; or at least a good idea of what people want to see.
bchaikin

Joined: 27 Jan 2005
Posts: 685
Location: cleveland, ohio

PostPosted: Sat Dec 17, 2005 11:45 am Post subject: Reply with quote
I think of similarity as "what other player does this player's job just about as well".

if - and i said if - that was the author's intent, that player A is as good as player B, in terms of how much they help their team win, i have a tough time reconciling antoine walker being as good as scottie pippen....

the other comparisons look pretty good. plus so far in 05-06, after some 20+ games, elton brand sure does look similar to karl malone, maybe even tim duncan, in terms of helping his team win, even more so than in 04-05...
jkubatko

Joined: 05 Jan 2005
Posts: 702
Location: Columbus, OH

PostPosted: Sat Dec 17, 2005 1:21 pm Post subject: Reply with quote
bchaikin wrote:
i've looked at that page, but your article makes clear the above statement on the first page. so what exactly do you mean by "...it simply means they produced similar results..."? what do you mean by results? do you mean the word statistics? just some of their statistics? or by "results" do you mean their effect on their teams?...

I guess a better way to say it would be "similar statistical results" or "similar statistical outcomes in the categories I have chosen to measure". I'm actually not sure if what appears in the article is what I originally wrote. As you should know, the SI.com editors take some liberties when editing these articles. It would make for an interesting study to see if players that are similar using my method also affect their teams in similar ways. Maybe I'll try to do that at some point.
_________________
Regards,
Justin Kubatko
Basketball-Reference.com

page 2 of 2 missing

Posted: **Tue Jun 14, 2011 12:26 am**

page 10 of 10

Author Message
Mike G

Joined: 14 Jan 2005
Posts: 3596
Location: Hendersonville, NC

PostPosted: Tue Apr 13, 2010 12:30 pm Post subject: Reply with quote
BobboFitos wrote:
bumping this. could we get an end-of-the-year update?
What?
Is it the end of the year?
_________________
36% of all statistics are wrong
BobboFitos

Joined: 21 Feb 2009
Posts: 199
Location: Cambridge, MA

PostPosted: Tue Apr 13, 2010 12:34 pm Post subject: Reply with quote
Mike G wrote:
BobboFitos wrote:
bumping this. could we get an end-of-the-year update?
What?
Is it the end of the year?

Fair enough! I can wait a few days.
_________________
-Rob
Mike G

Joined: 14 Jan 2005
Posts: 3596
Location: Hendersonville, NC

PostPosted: Tue Apr 13, 2010 12:37 pm Post subject: Reply with quote
Here it is; we're both in the 2nd division.
Code:
b2nb   CSco   vegas  09Py   SRS    SPM    KP     2009   eyriq
6.67   7.15   7.35   7.62   7.63   7.74   7.85   8.05   8.06

BoFi   BRSim  stea   JH     eWins  BD     BS     WS
8.09   8.12   8.19   8.26   8.37   8.60   8.63   8.91

Average errors.
Half of us are worse than if we'd just repeated last year's record.
Last year's Pythagorean beats almost everyone.
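
For anyone scoring their own picks, a minimal sketch of the calculation follows. It assumes "average error" means the mean absolute difference between predicted and actual wins (the thread does not spell out the definition); the team labels and win totals are made up.

```python
def average_error(predicted: dict, actual: dict) -> float:
    """Mean absolute difference between predicted and actual win totals."""
    return sum(abs(predicted[team] - actual[team]) for team in actual) / len(actual)


# Hypothetical three-team league, for illustration only.
actual = {"A": 55, "B": 38, "C": 31}
my_picks = {"A": 50, "B": 41, "C": 30}
last_year = {"A": 60, "B": 35, "C": 25}  # baseline: repeat last season's record

print("picks:", average_error(my_picks, actual))
print("last year's record:", average_error(last_year, actual))
```

The second call is the "just repeat last year's record" baseline; a forecaster only adds value by scoring below it.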
_________________
36% of all statistics are wrong
HoopStudies

Joined: 30 Dec 2004
Posts: 705
Location: Near Philadelphia, PA

PostPosted: Tue Apr 13, 2010 2:00 pm Post subject: Reply with quote
Mike G wrote:
Here it is; we're both in the 2nd division.
Code:
b2nb   CSco   vegas  09Py   SRS    SPM    KP     2009   eyriq
6.67   7.15   7.35   7.62   7.63   7.74   7.85   8.05   8.06

BoFi   BRSim  stea   JH     eWins  BD     BS     WS
8.09   8.12   8.19   8.26   8.37   8.60   8.63   8.91

Average errors.
Half of us are worse than if we'd just repeated last year's record.
Last year's Pythagorean beats almost everyone.

Do we have a key for relating what all the names mean? b2nb is back2newbelf, but what was the basic method? Not sure who or what most of these are.
_________________
Dean Oliver
Author, Basketball on Paper
The postings are my own & don't necess represent positions, strategies or opinions of employers.
Mike G

Joined: 14 Jan 2005
Posts: 3596
Location: Hendersonville, NC

PostPosted: Tue Apr 13, 2010 2:15 pm Post subject: Reply with quote
b2nb's prediction appears on page 3 of this thread.
Others trickle in thereafter.
_________________
36% of all statistics are wrong
Royce

Joined: 13 Feb 2005
Posts: 3

PostPosted: Wed Aug 18, 2010 6:53 pm Post subject: 2010 Predictions Reply with quote
Jumping in after five years or so of active lurking ...

I finally got around to looking back at last year's Summer Forecast on ESPN.com, found here: http://sports.espn.go.com/nba/news/stor ... tStandings

Some background: I cast a wide net, hoping for a "wisdom of crowds" effect. (There were other good reasons for getting a lot of people involved, but that was a key motive.) We ended up with 53 panelists, all associated with ESPN's NBA coverage in some way.

To help ensure broad participation, I took several steps to simplify the project for panelists:
+ Used Google Forms
+ Asked for a simple estimate of wins per team (which might have led to a little inflation, but that was prorated out)
+ Advised panelists to respond with their best guesstimates (as opposed to studying up), figuring that the differences of opinion would wash out
+ Didn't require exactly 1,230 wins total (as noted, results were prorated)
+ Promised not to reveal their responses except in the aggregate, mostly to remove any apprehensions they might have about making projections so early, well before the season was to begin
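
The prorating step mentioned in the list above might look something like this. It is only a sketch, assuming a simple proportional rescale so the league total comes out to 1,230; the exact procedure used for the ESPN panel is not described, and the function name and sample estimates are hypothetical.

```python
TOTAL_WINS = 1230  # 30 teams x 82 games / 2


def prorate(wins: dict, total: float = TOTAL_WINS) -> dict:
    """Rescale win estimates proportionally so they sum to `total`."""
    raw_total = sum(wins.values())
    return {team: w * total / raw_total for team, w in wins.items()}


# Hypothetical panel averages: every team guessed at 42 wins (1,260 in all),
# which is the kind of mild inflation that simple per-team estimates invite.
raw = {f"team_{i}": 42 for i in range(30)}
adjusted = prorate(raw)
print(sum(adjusted.values()), adjusted["team_0"])
```

A proportional rescale keeps each team's share of the total intact while removing the overall inflation.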

Result: An average error of 7.03, if my math is right.

Pretty good outcome, confirming that we might have seen a positive "crowd wisdom" effect.

Naturally we saw some compression in the results, toward the mean, but then again, predicting the Cavs to revert from 66 wins to 61 wins worked out pretty well.

Worth noting is that the group had very mixed results in the other categories -- seemed prescient on whether Bosh and Stoudemire would stay put, but struggled with categories like Best and Worst Newcomers. That probably says something about which areas are easier/harder to predict, but I won't read too much into one year's results.

For that reason, I can't vouch for how our panel will perform going forward, but I'm curious to see how it goes now that we've brought even more people into the process, with 93 voters this summer -- particularly in terms of W-L record.

_________________

Royce Webb
NBA Editor | ESPN.com
Crow

Joined: 20 Jan 2009
Posts: 816

PostPosted: Wed Aug 18, 2010 8:22 pm Post subject: Reply with quote
An average error of 7 or a little less keeps showing up as good or best, both here and with the ESPN average.

(I previously suggested that a blend of methods / experts might be a good way to go, or at least something to factor in partially or consider, and the performance of the ESPN average for this season tends to support that.)

Most predictors, especially the strict method-based predictors, I assume are trying to best estimate individual teams even though the "prize" is the lowest average error. The season I won the prediction contest here I engaged in some efforts to minimize average error (looking at the predictions of others and regressing to the mean some, especially where I was less confident) but I could have been more aggressive about it.

Regressing to the mean or some form of team adjustment for its recent history or regression to the mean for a group of teams with similar recent history might be able to go somewhat below an average error of 7 consistently. Or not. I think the tracked Vegas line was at 6 to 7 the last 2 years. It probably will take more than casual effort to get to 6 or below consistently.
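
A rough sketch of the two adjustments discussed here, blending several forecasts and then pulling the result toward the league mean. The equal-weight blend and the shrink weight are assumptions chosen for illustration, not anyone's actual method, and the function names are hypothetical.

```python
LEAGUE_MEAN_WINS = 41.0  # half of an 82-game schedule


def blend(predictions: list) -> float:
    """Equal-weight average of several win forecasts for one team."""
    return sum(predictions) / len(predictions)


def regress_to_mean(predicted_wins: float, weight: float,
                    mean: float = LEAGUE_MEAN_WINS) -> float:
    """Keep `weight` of the distance from the mean (1.0 = no regression)."""
    return mean + weight * (predicted_wins - mean)


# Hypothetical forecasts for one team from three different methods.
forecasts = [58.0, 61.0, 64.0]
combined = blend(forecasts)
print(combined, regress_to_mean(combined, weight=0.5))
```

A lower `weight` means less confidence in the raw forecast; choosing it per team, based on how sure one is, is the kind of tuning described in the post.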

P.S. Congrats to back2newbelf for his apparent win. I might go back and re-read the detail about your method that linked offense with defense again; or if you've evolved or refined it further, I'd be interested in hearing a new explanation. You said essentially that it was a primary resource but not the exclusive source of your predictions? What approach do you plan this season, if you select to offer them?

If there is a repeat champ here or more broadly, and especially if that includes beating the Vegas line (or the average of several or many), that probably would deserve more attention.
back2newbelf

Joined: 21 Jun 2005
Posts: 271

PostPosted: Thu Aug 19, 2010 7:33 am Post subject: Reply with quote
Crow wrote:

P.S. Congrats to back2newbelf for his apparent win. I might go back and re-read the detail about your method that linked offense with defense again; or if you've evolved or refined it further, I'd be interested in hearing a new explanation. You said essentially that it was a primary resource but not the exclusive source of your predictions? What approach do you plan this season, if you select to offer them?

Thank you.

Unfortunately I was/am quite busy with my diploma thesis and didn't have the time to do any refinements. What I want to do is see which weights (right now almost everything is split 50/50) produce the best retrodiction results, then see how it compares to regularized adj +/- in retrodiction and create a rating that combines the two to hopefully get even better results

My rating system was the exclusive source of the predictions, rating system wise. I had to guess minute projections and also I drifted most teams to average

For this season I think I'll probably use my (not yet refined) rating system, regularized adj +/-, coaching changes, whether teams had draft picks or not and whether teams had many players playing at the World Championship (I expect those with heavy minutes at WC to miss slightly more games in the upcoming NBA season due to injuries)

Posted: **Wed Jan 28, 2015 9:47 am**

Thank you so much for your post.
I'm lucky to have learned so much here.

APBRmetrics

Recovered old threads- miscellaneous topics

Re: Recovered old threads- miscellaneous topics


Re: 2009-10 predictions
