APBRmetrics

The discussion of the analysis of basketball through objective evidence, especially basketball statistics.
Posted: Fri Apr 15, 2011 7:28 pm

Joined: Thu Apr 14, 2011 11:10 pm
Posts: 2441
Page 1


dsparks



Joined: 22 Feb 2008
Posts: 61
Posted: Tue Jul 08, 2008 10:29 pm Post subject: SPI Playing Style Trichotomy

I understand that the last thing we need is yet another classification of playing styles, but I thought I would share and solicit feedback on my latest invention: a playing style spectrum, derived from identifying a player's Scoring/Perimeter/Interior tendencies.

Post with details: http://arbitrarian.wordpress.com/2008/0 ... -spectrum/
Direct to Google Maps version of graphic: http://bit.ly/spi (note easy-to-remember URL!)

I would appreciate any comments, but especially those addressing the utility of this conceptualization of playing style, and whether or not these objective categorizations mesh with your own objective findings and subjective impressions.
_________________
David
http://arbitrarian.wordpress.com

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Tue Jul 08, 2008 11:10 pm Post subject:

I like the groupings and the display method. The comprehensive maps are impressive as always, though sometimes I wish for stripped-down datasets (be it just guys who played last season over a certain rating, or just the top 50 of all time, or whatever) that are easier to digest. Not instead of the comprehensive displays, but perhaps in addition to them. How are championships distributed among #1 / 2 guys?

One twist on this might be to look at the playing style of a single player thru his biggest-minute lineups. Does he change as opportunity and need change, or does he do his thing and let others change or let things go undone? You could also do a time series thru a season or thru a career. A comet changing color (in some cases).

dsparks



Joined: 22 Feb 2008
Posts: 61
Posted: Thu Jul 10, 2008 6:03 am Post subject:

Mountain: Here's a partial response, in the form of some examples. These are per-game locations for two very different players.

[Images: per-game SPI plots for Carmelo Anthony and Shane Battier]

Notice the extent to which Battier covers the whole spectrum, while Anthony concentrates heavily in the Scoring direction. Each point is a game, and I've labeled years, but it's still hard from these to get a sense of any time trend. If I find time, I'll post some more.
_________________
David
http://arbitrarian.wordpress.com

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Thu Jul 10, 2008 8:32 am Post subject:

I had been thinking month to month thru a season but game by game is a very good choice too. The player maps are like galaxies.

gabefarkas



Joined: 31 Dec 2004
Posts: 1292
Location: Durham, NC
Posted: Thu Jul 10, 2008 8:58 pm Post subject:

dsparks wrote:
Mountain: Here's a partial response, in the form of some examples. These are per-game locations for two very different players. Notice the extent to which Battier covers the whole spectrum, while Anthony concentrates heavily in the Scoring direction. Each point is a game, and I've labeled years, but it's still hard from these to get a sense of any time trend. If I find time, I'll post some more.
Maybe I missed it, but why are some points in a larger font than others?

Ryan J. Parker



Joined: 23 Mar 2007
Posts: 706
Location: Raleigh, NC
Posted: Thu Jul 10, 2008 9:04 pm Post subject:

I think it's supposed to be 3D, gabe.

dsparks



Joined: 22 Feb 2008
Posts: 61
Posted: Thu Jul 10, 2008 9:30 pm Post subject:

gabefarkas: Points are scaled according to MEV (model-estimated value, the linear-weighted points-created measure I derived from regression output), so a bigger font size is associated with more productive games.

Here's a link to a PDF featuring only those players who, in the 07-08 season, had BoxScores (i.e. wins produced) greater than 5: http://peoplesstatistic.googlepages.com/seasonspi.pdf

Note that the sizes of these names are scaled, but it is not very noticeable here...

Incidentally, trading Odom for Artest is a big change in playing style for the Lakers--Artest plays somewhat like Bryant, and I doubt the Lakers need two such players, even at different positions...
_________________
David
http://arbitrarian.wordpress.com

dsparks



Joined: 22 Feb 2008
Posts: 61
Posted: Mon Jul 14, 2008 4:00 pm Post subject:

In the interest of Science, I thought it might prove enlightening to see which playing styles most influenced team success, and which combinations of two styles on the same team yielded the most success. To this end, I have mashed together two ugly regressions, the output of which can be seen here: http://arbitrarian.wordpress.com/2008/0 ... chemistry/

The gist is that Pure Scorers aren't worth that much, while the Scorer's Opposite types are. This was gratifying to see. Please feel free to comment on/question the validity of the methodology I've used here.
_________________
David
http://arbitrarian.wordpress.com

Serhat Ugur (hoopseng)



Joined: 13 Oct 2006
Posts: 204
Location: Basketball Research
Posted: Mon Jul 14, 2008 4:49 pm Post subject:

You're definitely on your way to joining a team as an analyst. Any intentions in that direction?
_________________
http://www.nbastuffer.com

NickS



Joined: 30 Dec 2004
Posts: 384
Posted: Mon Jul 14, 2008 5:32 pm Post subject:

I think the big question, for a regression like that, is what your mental model is for how players earn minutes. One model is that a coach identifies the eight best players on the team (using their personal sense of "best") and gives them minutes in descending order. A different model would be that the coach has a mental sense of the category production needed (scoring, passing, rebounding, defending, etc.), and of their willingness to trade off categories, and that they try to maximize the combination of those qualities. In that model, a player with strong deficiencies in any of those categories will only get minutes if there's another player who can compensate for that deficiency -- no matter what other strengths that player has.

For example, it was mentioned in the Boston thread that Eduardo Najera has a great Adj +/-. Looking at his career, he's playing about 50% more minutes in Denver than he did in Dallas. Why is this? One possibility is that he has improved; the other is that Denver has more playing time available for a non-scorer, because their primary scorers are such high usage. Would Najera lose minutes if Denver traded Anthony for Prince, as has been rumored (assuming he had stayed in Denver)? Probably.

What this means is that when you see teams that play non-scorers more minutes, it could just mean that the team has players who can carry the scoring load. Steve Smith and Michael Jordan both show up on the "perimeter scorer" list, but they're very different in terms of how many non-scorer minutes you would have playing alongside them (though Smith played with Dikembe).

NickS



Joined: 30 Dec 2004
Posts: 384
Posted: Mon Jul 14, 2008 5:58 pm Post subject:

One other question that might reveal a confounding factor in your regressions: can you also list the minute-weighted age for each of the seven categories? I wonder whether the "pure scorer" category skews younger. It just seems, off the top of my head, that fewer old players in the league would count as pure scorers.

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Mon Jul 14, 2008 10:10 pm Post subject:

Thanks for more original research / conversation starter, David. I looked at the 07-08 team distributions by player type briefly and didn't see an immediate pattern for good or bad teams. I imagine, though, that the impact of large or small amounts of minutes by certain player types would show up in a check of team performance on specific 4 factors.

To try to get new insight into team design, I wonder what you'd find if you tallied, by team, the minutes of each player type played by those in the top 25% and top 50% on MEV for their type. Maybe good teams would show more of a similar design pattern when looking at minutes played at high quality by player type? Maybe not. Would be interesting to see.

I also wonder how the top 50 most used lineups in the league look in terms of 5-man sets of player types, and whether any patterns can be found there for primary lineup choice and performance that could serve as a starting point for more lineup analysis and optimization.

dsparks



Joined: 22 Feb 2008
Posts: 61
Posted: Tue Jul 15, 2008 10:38 am Post subject:

Hoopseng: I doubt any teams are in the market for a statistical charlatan such as myself, but thanks.

NickS: Thanks for your comments--you apparently noticed that I did not spend too much time thinking about the regression before I ran it. I think that my mental model is: the coach identifies some fuzzy positional archetypes, given that he will likely be facing teams which field just such a set of archetypes. Then, the coach maximizes absolute productivity, subject to the constraint that his set of players can relatively successfully defend a team composed of the typical positional archetypes. I do think, though, that if a coach had a roster consisting of history's 11 best Centers ever, and one mediocre point guard, he would probably mostly play the Centers, though he might try to have better dribbling/passing centers bring the ball up more, etc.

With respect to your Smith v. Jordan note, teams featuring Steve Smith averaged 2405.5 PI minutes, while teams featuring Jordan averaged 2778.967. Both sets of teams are well above the PI minutes average of 1692.855. Interestingly, the correlation between PI and SS minutes is (statistically significant) -0.1605, while between PI and SP it's (still significant) 0.1344. In fact, here are the correlations of minutes from each position:
Code:
            SSmin       SPmin       PPmin       PImin       IImin       ISmin       MMmin
SSmin  1.00000000 -0.48196407 -0.14395292 -0.16048360  0.16661903  0.09804888 -0.33637971
SPmin -0.48196407  1.00000000 -0.37548611  0.13439206  0.06598575 -0.18009094 -0.14863258
PPmin -0.14395292 -0.37548611  1.00000000  0.02029574 -0.02945263 -0.30949438  0.17448774
PImin -0.16048360  0.13439206  0.02029574  1.00000000 -0.41219002 -0.32294959 -0.04301926
IImin  0.16661903  0.06598575 -0.02945263 -0.41219002  1.00000000 -0.31292568 -0.25340193
ISmin  0.09804888 -0.18009094 -0.30949438 -0.32294959 -0.31292568  1.00000000 -0.36775846
MMmin -0.33637971 -0.14863258  0.17448774 -0.04301926 -0.25340193 -0.36775846  1.00000000
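
A minimal sketch of how a correlation table like the one above could be produced, assuming one row per team-season and one column of minutes per type. The minute values below are made up for illustration and are not from David's dataset; the function names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_matrix(columns):
    """columns maps a type label to its list of team-season minutes."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical minutes for four team-seasons (illustrative numbers only).
minutes = {
    "SSmin": [1200, 300, 2500, 800],
    "SPmin": [4000, 5200, 2100, 4600],
    "PImin": [1500, 2900, 900, 2400],
}
corr = correlation_matrix(minutes)
for name, row in corr.items():
    print(name, " ".join(f"{value: .4f}" for value in row.values()))
```

Because a team's minutes are roughly fixed in total, more minutes for one type must come out of the others, so some negative correlation between columns is expected even without any lineup strategy behind it.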
This is interesting--it seems as though possibly having an II allows you to play a SS (or vice-versa), or that PP allows a MM (or vice-versa). Apparently, SS and SP combinations are less common--but a quick look at teams with this trait doesn't tell me whether it's because it's hard to find two such players to play together, or if it's because it's a bad combination. Any thoughts? Re: your age question, I don't have age in my dataset, but I can figure years since rookie year as a proxy for experience. Here's minutes-weighted mean experience by archetype: Code:
1 5.162527
2 5.621524
3 5.587629
4 5.885027
5 5.294387
6 4.485098
7 5.245737
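
For readers who want to reproduce this kind of figure, here is a small sketch of a minutes-weighted mean by archetype. It assumes each player-season row carries a type label, minutes played, and years since rookie season; the rows shown are invented for illustration.

```python
from collections import defaultdict

def weighted_mean_experience(rows):
    """rows: iterable of (ptype, minutes, years_since_rookie_season)."""
    minutes_total = defaultdict(float)
    weighted_total = defaultdict(float)
    for ptype, minutes, experience in rows:
        minutes_total[ptype] += minutes
        weighted_total[ptype] += minutes * experience
    # Each type's mean experience, weighting every player-season by minutes played.
    return {p: weighted_total[p] / minutes_total[p] for p in minutes_total}

# Hypothetical player-seasons (illustrative numbers only).
rows = [("SS", 3000, 2), ("SS", 1000, 6), ("PI", 2000, 8), ("PI", 2000, 4)]
result = weighted_mean_experience(rows)
print(result)
```

Weighting by minutes keeps end-of-bench veterans or rookies with a handful of minutes from dominating a type's average.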
So, pure scorers do trend younger (experience-wise), but not as young as Interior Scorers; the oldest group is Scorer's Opposite. So, I included team minutes-weighted experience in a regression, and found largely similar results:
Code:
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.4345718  6.8373802   1.234 0.217601
SSmin       0.0003435  0.0003983   0.862 0.388625
SPmin       0.0002704  0.0003715   0.728 0.466848
PPmin       0.0006624  0.0003809   1.739 0.082329 .
PImin       0.0014914  0.0004090   3.647 0.000277 ***
IImin       0.0014946  0.0004189   3.568 0.000374 ***
ISmin       0.0011841  0.0003971   2.982 0.002924 **
MMmin       0.0009609  0.0003899   2.465 0.013862 *
teamexp     2.8580071  0.2113714  13.521  < 2e-16 ***
Mountain: As usual, thanks for your encouragement and comments. It's not exactly what you suggested, but I eliminated the bottom 50% of MEV producers from my dataset, and reran the regression. The results are pretty similar, I think: Code:
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 13.0786779  2.1931608   5.963 3.27e-09 ***
SSmin       -0.0001565  0.0002320  -0.674  0.50015
SPmin       -0.0004183  0.0002154  -1.942  0.05237 .
PPmin       -0.0004177  0.0002406  -1.736  0.08276 .
PImin        0.0007178  0.0002342   3.065  0.00223 **
IImin        0.0009534  0.0002239   4.257 2.24e-05 ***
ISmin        0.0009362  0.0002161   4.331 1.61e-05 ***
MMmin        0.0003248  0.0002073   1.567  0.11748
teamexp      2.4274228  0.1881754  12.900  < 2e-16 ***
teamMEV      0.0126957  0.0015354   8.269 3.66e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
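
The output above is from R. As a rough illustration of the same kind of fit, here is a self-contained ordinary least squares sketch using the normal equations. Everything in it is an assumption for illustration: the response is taken to be team wins (the thread does not state the dependent variable), only two minute columns are used, minutes are expressed in thousands to keep the arithmetic well conditioned, and the data are synthetic.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients, intercept first, via X'X b = X'y."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic team-seasons: (PI minutes, SS minutes, team experience),
# with minutes in thousands. Wins are generated from known coefficients
# so the fit can be checked against them.
rows = [(1.0, 0.5, 4.0), (2.0, 1.5, 5.0), (1.5, 2.5, 6.0),
        (3.0, 1.0, 5.5), (0.5, 2.0, 4.5), (2.5, 3.0, 6.5)]
true_coefs = [8.0, 1.5, 0.3, 2.5]
wins = [true_coefs[0] + sum(c * x for c, x in zip(true_coefs[1:], r)) for r in rows]
coefs = ols(rows, wins)
print([round(c, 6) for c in coefs])
```

With noise-free synthetic data the fit recovers the generating coefficients; real team data would of course leave residuals, and standard errors would need the residual variance as well.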
I wouldn't put too much stock in that regression, though, even though I tried to control for quality somewhat with teamMEV. I don't think I'm up for looking at the top 50 most-used lineups--I don't have the data in an easily accessible form at the moment. Thanks for all your thoughts.
_________________
David
http://arbitrarian.wordpress.com

NickS



Joined: 30 Dec 2004
Posts: 384
Posted: Tue Jul 15, 2008 12:39 pm Post subject:

dsparks wrote:
NickS: Thanks for your comments--you apparently noticed that I did not spend too much time thinking about the regression before I ran it.
I agree with Mountain; that's not a bad thing. It's a fun starting point to have a set of numbers that we know aren't methodologically rigorous and try to think about what could make them more robust. Thanks for the answers, I'm still looking at those charts.

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Tue Jul 15, 2008 5:59 pm Post subject:

Interior Scorer and Mixed are negative by more than -.2 with the most types. Generally avoid putting them on the floor if you can get the more preferred front-line types, unless they break from this norm and work on your team / in your lineup?

Blazers are by far the highest on minutes played by interior scorers. Mavs, Knicks and Wolves are next, but with only about half as much as the Blazers. Celts, Cavs, Nuggets, Pistons, Lakers, Kings and Jazz field none by these definitions. Interior defense appears more important than interior scoring, though both should be fine.

Jazz, Pistons, Nuggets lead the way with use of Mixed. Blazers and Hornets, along with Heat, are lowest.

The Pure Perimeter and Pure Interior combination has the only correlation that is positive by more than .2, and these types are the only ones with 2 positive correlations. Emphasize playing them together and in the right combos with others?

Pure Perimeter highest: Blazers, Wolves, Bobcats. Lowest: Warriors, Spurs, Lakers, Rockets, 76ers.
Pure Interior highest: Cavs, 76ers, Pistons. Lowest: Bulls, Rockets, Wolves.

Just looking quickly at the very best on these best and worst types from a pair perspective, it looks like the Cavs score the best, with a +2 on the 4 parts. I don't think anyone got a -2.

A top-half Pure Scorer the least negative overall perimeter choice?

Bigs have more impact, by these results. Go bigger than conventional? Suggested team and lineup construction procedure (for at least a first cut): put your best bigs on the floor as much as possible and fill in with the right perimeter guys for them? Staffing at SF may be a critical choice, where playing like a big rather than a perimeter player is more helpful.

David, have you posted, or will you post, a list of the playing style of all 07-08 players (over some minutes threshold)? Player pairs paint one picture on average, but player pair performance surely varies in distinct 5-man lineups, and perhaps widely in some or many cases.
That is why I showed interest in looking at top 50 lineups (feeling like a comprehensive tabulation of results for all lineups would be too much to ask for). But with a list of player types folks could at least speculate about efficacy of 5 man lineups off the average correlations you presented and the specific first iteration adjusted lineup performance data Eli presented.


Last edited by Crow on Thu May 12, 2011 3:09 am, edited 1 time in total.

Posted: Fri Apr 15, 2011 7:29 pm

Joined: Thu Apr 14, 2011 11:10 pm
Posts: 2441
Page 2

dsparks



Joined: 22 Feb 2008
Posts: 61
Posted: Tue Jul 15, 2008 11:30 pm Post subject:

Alright, I wasn't going to do it, but here's a regression of the interaction of five types at once: Code:
Coefficients:
                               Estimate Std. Error t value Pr(>|t|)
(Intercept)                    2.60E+01   1.09E+00  23.797   <2e-16 ***
teamMEV                        2.14E-02   1.36E-03  15.749   <2e-16 ***
SSmin:PPmin:PImin:ISmin:MMmin  2.13E-17   1.28E-17   1.665   0.0961 .
SSmin:PPmin:PImin:IImin:MMmin  6.06E-18   7.36E-18   0.823   0.4106
SSmin:SPmin:PPmin:PImin:ISmin  5.84E-18   6.57E-18   0.889   0.3744
SSmin:PImin:IImin:ISmin:MMmin  5.41E-18   1.17E-17   0.461   0.645
SSmin:SPmin:IImin:ISmin:MMmin  2.01E-18   2.85E-18   0.705   0.4808
SSmin:SPmin:PPmin:IImin:ISmin  1.61E-18   3.61E-18   0.446   0.656
SPmin:PPmin:PImin:IImin:MMmin  1.46E-18   2.27E-18   0.644   0.5196
SSmin:SPmin:PImin:IImin:MMmin -9.05E-19   4.72E-18  -0.192   0.848
SPmin:PPmin:IImin:ISmin:MMmin -2.23E-18   2.02E-18  -1.102   0.2707
PPmin:PImin:IImin:ISmin:MMmin -2.29E-18   1.10E-17  -0.209   0.8348
SPmin:PPmin:PImin:ISmin:MMmin -2.56E-18   3.36E-18  -0.763   0.4459
SSmin:SPmin:PPmin:IImin:MMmin -3.31E-18   2.51E-18  -1.323   0.186
SSmin:SPmin:PImin:ISmin:MMmin -3.70E-18   7.83E-18  -0.473   0.6362
SPmin:PPmin:PImin:IImin:ISmin -4.12E-18   3.50E-18  -1.178   0.2392
SPmin:PImin:IImin:ISmin:MMmin -4.37E-18   4.00E-18  -1.092   0.275
SSmin:PPmin:IImin:ISmin:MMmin -4.44E-18   6.86E-18  -0.647   0.5175
SSmin:SPmin:PPmin:PImin:MMmin -6.62E-18   6.04E-18  -1.096   0.2733
SSmin:SPmin:PImin:IImin:ISmin -7.86E-18   5.09E-18  -1.544   0.1228
SSmin:SPmin:PPmin:PImin:IImin -9.12E-18   4.00E-18  -2.28    2.28E-02 *
SSmin:PPmin:PImin:IImin:ISmin -9.64E-18   1.10E-17  -0.88    0.3789
SSmin:SPmin:PPmin:ISmin:MMmin -2.19E-17   8.88E-18  -2.462   0.014 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.95 on 1154 degrees of freedom
  (3 observations deleted due to missingness)
Multiple R-Squared: 0.2115, Adjusted R-squared: 0.1965
F-statistic: 14.07 on 22 and 1154 DF, p-value: < 2.2e-16
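
A short sketch that may help in reading the table above. It enumerates every five-of-seven combination of types (which gives the 21 interaction rows) and shows why the coefficients are so tiny: each term is a product of five minute totals. The team minutes below are hypothetical round numbers, not taken from the dataset.

```python
from itertools import combinations
from math import prod

TYPES = ["SS", "SP", "PP", "PI", "II", "IS", "MM"]

# Every five-type subset of the seven archetypes: one interaction term each.
five_way_terms = list(combinations(TYPES, 5))

def interaction_value(team_minutes, term):
    """Product of a team's season minutes across the five types in the term."""
    return prod(team_minutes[t] for t in term)

# Hypothetical team-season minutes by type (illustrative numbers only).
team = {"SS": 1500, "SP": 4000, "PP": 2000, "PI": 1700,
        "II": 3500, "IS": 2500, "MM": 2500}
value = interaction_value(team, ("SS", "PP", "PI", "IS", "MM"))

print(len(five_way_terms))   # number of interaction rows
print(f"{value:.4e}")        # size of one interaction term
print(2.13e-17 * value)      # its contribution using the top coefficient above
```

With minutes in the low thousands, a five-way product lands around 1e16, so a coefficient near 1e-17 still shifts the fitted value by a fraction of a unit; the small exponents reflect units rather than negligible effects.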
Note that I've sorted the variables by coefficient, from largest to smallest, but also note that almost none of these are significant. These results are probably essentially meaningless, but there may be some useful findings here. For example, note that the only difference between the "best" lineup and the "worst" is that the best one has a Scorer's Opposite (PI) in the place of a Perimeter Scorer (SP). The two teams with the rosters most closely resembling the "worst" in 2008 were MEM and SEA. The two teams most closely resembling the "best" in 2008 were CHI and NOH. Does this mean that replacing Juan Carlos Navarro with Julian Wright (1.84 and 1.72 BoxScores last season--a reasonable exchange) would drastically improve Memphis? That's my recommendation to their GM--I'd like to see the field experiment enacted.

Similarly, the second best and second worst differ only in that the second best has a MM in place of an IS. Perhaps taking an Interior Scorer on a team with one pure scorer and lots of interior presence already, and making him play more on the perimeter and look to shoot less, would improve the team.

Mountain, per your request, here's a list of each player in 07-08, along with their BoxScores, MEV, and SPI7 type: http://spreadsheets.google.com/pub?key= ... AqUOX6pdbQ

They're in order of type, then sorted by BoxScores. Unfortunately, Google Docs doesn't let viewers sort, but if you sort on BoxScores, you find that the most valuable SS type, Peja Stojakovic, doesn't come until 46th on the list.

Also, if you want to see where some of the better players stand in terms of pure SPI (as in, not forced into one of the seven categories), I posted a graphic above with some of the better players from 07-08: http://peoplesstatistic.googlepages.com/seasonspi.pdf

To close, a couple of history's best teams:

CHI1996:
Code:
      player           tmyr     min BoxScore     MEV ptype
12384 Michael Jordan   CHI1996 3090    17.93 2225.92 SP
12506 Scottie Pippen   CHI1996 2825    12.34 1532.85 SP
12399 Toni Kukoc       CHI1996 2103     8.84 1097.59 SP
12539 Dennis Rodman    CHI1996 2088     6.86  852.24 II
12389 Steve Kerr       CHI1996 1919     5.99  743.21 SP
12349 Ron Harper       CHI1996 1886     5.97  740.86 PP
12415 Luc Longley      CHI1996 1641     4.34  538.35 MM
12622 Bill Wennington  CHI1996 1065     2.61  323.72 IS
12225 Jud Buechler     CHI1996  740     2.20  272.92 MM
12223 Randy Brown      CHI1996  671     1.78  221.26 PP
12564 Dickey Simpkins  CHI1996  685     1.54  191.17 II
12230 Jason Caffey     CHI1996  545     0.96  119.75 MM
12553 John Salley      CHI1996  191     0.43   52.89 PI
12291 James Edwards    CHI1996  274     0.23   28.08 SS
12339 Jack Haley       CHI1996    7     0.00    0.01 SS
The top three are all SPs! LAL1972: Code:
      player           tmyr     min BoxScore     MEV ptype
 3556 Jerry West       LAL1972 2973    17.42 1880.29 SP
 3400 Gail Goodrich    LAL1972 3040    14.38 1552.39 SS
 3358 Wilt Chamberlain LAL1972 3469    10.07 1087.56 II
 3465 Jim McMillian    LAL1972 3050     8.83  953.72 SS
 3406 Happy Hairston   LAL1972 2748     6.64  717.18 II
 3505 Flynn Robinson   LAL1972 1007     4.34  469.07 SP
 3499 Pat Riley        LAL1972  926     2.30  248.50 SS
 3387 Leroy Ellis      LAL1972 1081     1.72  185.96 II
 3534 John Trapp       LAL1972  759     1.44  156.00 IS
 3388 Keith Erickson   LAL1972  262     0.69   74.74 MM
 3343 Elgin Baylor     LAL1972  239     0.58   62.67 IS
 3365 Jim Cleamons     LAL1972  201     0.57   61.41 SP
LAL2000: Code:
      player           tmyr     min BoxScore     MEV ptype
13981 Shaquille O'Neal LAL2000 3163    19.16 2367.23 IS
13727 Kobe Bryant      LAL2000 2524    10.93 1350.69 SP
14026 Glen Rice        LAL2000 2530     7.69  950.03 SS
13841 Ron Harper       LAL2000 2042     5.47  675.36 PI
13859 Robert Horry     LAL2000 1685     5.38  664.52 PI
13832 A.C. Green       LAL2000 1929     4.83  597.06 II
13808 Derek Fisher     LAL2000 1803     3.73  461.10 SP
13812 Rick Fox         LAL2000 1473     3.67  453.84 SP
14049 Brian Shaw       LAL2000 1249     3.16  390.94 PI
13904 Travis Knight    LAL2000  410     0.97  119.80 II
13824 Devean George    LAL2000  345     0.82  101.77 IS
14045 John Salley      LAL2000  303     0.65   80.41 PI
13922 Tyronn Lue       LAL2000  146     0.33   40.67 SP
13747 John Celestand   LAL2000  185     0.14   17.61 SP
13874 Sam Jacobson     LAL2000   18     0.05    6.02 SP
BOS1986: Code:
      player           tmyr     min BoxScore     MEV ptype
 8145 Larry Bird       BOS1986 3113    15.63 2283.71 MM
 8330 Kevin McHale     BOS1986 2397    10.00 1460.87 IS
 8357 Robert Parish    BOS1986 2567     9.53 1392.30 II
 8267 Dennis Johnson   BOS1986 2732     8.13 1188.73 SP
 8132 Danny Ainge      BOS1986 2407     6.94 1014.73 PP
 8442 Bill Walton      BOS1986 1546     5.63  822.40 PI
 8395 Jerry Sichting   BOS1986 1596     3.96  579.00 SP
 8446 Scott Wedman     BOS1986 1402     3.40  497.02 SS
 8165 Rick Carlisle    BOS1986  760     1.40  204.03 PP
 8412 David Thirdkill  BOS1986  385     0.92  134.45 MM
 8438 Sam Vincent      BOS1986  432     0.90  130.79 SP
 8294 Greg Kite        BOS1986  464     0.56   81.86 II
 8470 Sly Williams     BOS1986   54     0.00    0.51 IS
BOS2008: Code:
      player           tmyr     min BoxScore     MEV ptype
17754 Kevin Garnett    BOS2008 2328    12.26 1523.75 PI
17973 Paul Pierce      BOS2008 2874    11.29 1403.39 SP
17599 Ray Allen        BOS2008 2624     8.32 1033.90 SP
18001 Rajon Rondo      BOS2008 2306     8.16 1014.88 PP
17968 Kendrick Perkins BOS2008 1912     5.66  703.15 II
17976 James Posey      BOS2008 1821     4.79  595.26 PI
17812 Eddie House      BOS2008 1480     4.00  496.94 SP
17977 Leon Powe        BOS2008  809     3.44  427.08 II
17600 Tony Allen       BOS2008 1373     3.24  402.62 MM
17703 Glen Davis       BOS2008  940     2.37  294.18 II
17679 Sam Cassell      BOS2008  299     0.70   87.23 SP
18009 Brian Scalabrine BOS2008  512     0.69   86.11 PI
17663 P.J. Brown       BOS2008  209     0.52   64.45 II
17975 Scot Pollard     BOS2008  173     0.38   47.21 II
17982 Gabe Pruitt      BOS2008   95     0.19   23.65 SP
I reckon looking at outliers isn't all that informative, but it sure is interesting. Anyway, I'm sure all of you will have substantially better insight than I have been able to muster. Thanks for your comments.

On an unrelated note: My BoxScores metric is the one I prefer to use to measure value--it attempts to capture the number of wins for which a player is responsible. However, I don't particularly care for the name BoxScores--it's measuring wins, so I'd like to have "wins," or something like that, in the name. Unfortunately, WinShares (my first choice), Wins Produced, WinVal, Win Scores, etc. etc. have all been "taken," much to my chagrin. Any ideas for something catchy, yet accurate, that has a nice abbreviation, and hasn't been used already? I'd be willing to consider acronyms, if they roll off the tongue well; I like "MEV" a lot for my Model Estimated Value metric... Best suggestion gets a pretty graph (PDF) depicting the team of their choice.
_________________
David
http://arbitrarian.wordpress.com

Mike G



Joined: 14 Jan 2005
Posts: 3605
Location: Hendersonville, NC
Posted: Wed Jul 16, 2008 6:35 am Post subject:

How about WinChunks? (WC for short.) WinPortions? WinParts?
_________________
36% of all statistics are wrong

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Wed Jul 16, 2008 8:36 pm Post subject:

Thanks very much for the past season's type reference list, the 5-type regressions and the championship examples. I will study them further.

On the lineups shown, I understand the caution given about reading too much into the regression results, but it is interesting that Pure Scorers are in 6 of the 7 league-wide positive lineups. Pure Perimeter, Pure Interior, Interior Scorer, Mixed and Scorer's Opposite are all in 5; Perimeter Scorer is the lowest at just 4. The prior averages give one story (if you reach for one like I did), but the detail reveals more. All the types can be positive in the right combinations.

Reviewing the most used lineups of teams will be educational about lineups and about coaches & front offices. It should be helpful in thinking about trades. Thanks for again going the extra miles, David, and sharing your work here.

I also got around to viewing the top 50 graphic. I find it a nice cut to help get oriented to the type spectrum and gradations.

Last edited by Mountain on Sat Jul 19, 2008 11:16 am; edited 2 times in total
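
The tally in this post can be checked directly against the regression table earlier in the thread. The sketch below copies the seven interaction terms with positive coefficients from that table and counts how often each type appears; it also compares the highest- and lowest-coefficient lineups. Only the variable names are my own.

```python
from collections import Counter

# The seven interaction terms with positive estimates, copied from the
# five-type regression output posted earlier in the thread.
positive_terms = [
    "SSmin:PPmin:PImin:ISmin:MMmin",
    "SSmin:PPmin:PImin:IImin:MMmin",
    "SSmin:SPmin:PPmin:PImin:ISmin",
    "SSmin:PImin:IImin:ISmin:MMmin",
    "SSmin:SPmin:IImin:ISmin:MMmin",
    "SSmin:SPmin:PPmin:IImin:ISmin",
    "SPmin:PPmin:PImin:IImin:MMmin",
]

# Strip the "min" suffix and count appearances of each type.
counts = Counter(name[:2] for term in positive_terms for name in term.split(":"))

# Highest and lowest coefficient rows from the same table.
best = set("SSmin:PPmin:PImin:ISmin:MMmin".split(":"))
worst = set("SSmin:SPmin:PPmin:ISmin:MMmin".split(":"))

print(sorted(counts.items(), key=lambda kv: -kv[1]))
print(best - worst, worst - best)
```

The counts come out as SS in 6, SP in 4 and every other type in 5, and the best and worst rows differ only by PI in place of SP, as described in the thread.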

QMcCall3



Joined: 17 Jul 2008
Posts: 9
Posted: Thu Jul 17, 2008 10:01 pm Post subject:

Mountain wrote:
One twist on this might be to look at playing style of a single player thru his biggest minute lineups. Does he change as opportunity and need change or does he do his thing and let others change or let things go undone? You could also do a timeseries thru a season or thru a career. A comet changing color (in some cases).
Hey folks, my first post here, but I have enjoyed the work here for some time...

A comment/question for David or others who may have ideas (relaying this question from someone else): Might it be interesting or useful to look at how static each type is across a career? David, you mentioned that "pure scorers do trend younger (experience-wise)", but it would be interesting to see if some types are more static than others... I could imagine pure perimeter players being relatively more static than others because many point guards are limited by their relative size...

Obviously there are the confounding factors of trades, personnel changes, or changes in style of play... but it may be interesting as a means to project rookie development or to figure out which of a team's assets are most valuable...

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Fri Jul 18, 2008 5:15 pm Post subject:

How static (or changing) each type (at a given point) is on average across the rest of a career would be of interest, in addition to the same question for a specific player. And how type changes or not under different coaches & teams (and for which does a player produce the highest MEV or win contribution or both?).

Maybe even a look at type movement (drift toward another type or actual crossing over) in games against counterpart types or team opponent types would help give more shape to the stat movement. For example, of all Kobe's counterparts or team opponents, which half saw him move more toward other neighboring types, and when was he taking the right path and increasing win %, and when was he getting steered and not doing as well as expected?

Comparison of player type from regular season vs playoffs would be another possible research extension that could be quite interesting in some cases.

Welcome Q. I've read a few posts at your blog. Anyone with interest in the WNBA or a general "re-thinking of basketball" can use the link from his screen name to go there. A recent post makes note, among other things, that WNBA-NBA similarity translations can be done using David's graphics.

Last edited by Mountain on Fri Jul 18, 2008 9:49 pm; edited 1 time in total

QMcCall3



Joined: 17 Jul 2008
Posts: 9
Posted: Fri Jul 18, 2008 7:26 pm Post subject: WNBA SPI Playing Styles spectrum....

Thanks Mountain. Any input people have is certainly welcome.

For those interested in an NBA/WNBA comparison, David also created a WNBA spectrum, which was posted on my blog yesterday: http://rethinkbball.blogspot.com/2008/0 ... layer.html

A friend of mine commented that the NBA spectrum seemed to be more "clustered", while the WNBA seemed distributed across the spectrum a little bit more. I haven't yet taken the time to look at the percentage of players within each dimension across the two leagues, but it's an interesting point with regard to the differences in how each game is played.

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Fri Jul 18, 2008 10:11 pm Post subject:

This might be a situation where quantitative tests of quadrant density, or mean distance between players, or whatever, would be better than the eye's impression. I am guessing that in large part because I believe the NBA graphic has many more points, and that may be the main reason for the visual interpretation. But there are people around who could address this topic better.
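
One of the quantitative checks suggested here, mean distance between players, can be sketched in a few lines. This assumes each player has a two-dimensional position on the spectrum; the coordinates below are invented, and the function name is hypothetical.

```python
from itertools import combinations
from math import dist, sqrt

def mean_pairwise_distance(points):
    """Average Euclidean distance over all unordered pairs of points."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)

# Two hypothetical leagues: the same layout at two different scales.
tight = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
spread = [(3 * x, 3 * y) for x, y in tight]

tight_distance = mean_pairwise_distance(tight)
spread_distance = mean_pairwise_distance(spread)
print(tight_distance, spread_distance)
```

Because this is an average over pairs, it does not grow just because one graphic has more points than the other, which is the concern raised about comparing the NBA and WNBA plots by eye.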

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Fri Jul 18, 2008 10:45 pm Post subject:

It is a small sample but championships are the goal so I found the type boxscore averages by players of each type for the 5 championship teams David listed who played over 1200 minutes. The order highest to lowest: Interior Scorer, Perimeter Scorer, Pure Scorer, Mixed, Pure Interior, Pure Perimeter and Scorer's Opposite. 5 of the top 10 boxscores were Perimeter Scorers and no other category had more than 1. A bigger study would be appropriate but I went with what was here just to see what it showed.
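
The averages described in this post can be recomputed from the five rosters posted earlier in the thread. The sketch below copies minutes, BoxScore and type for every player on those rosters with at least 1,000 minutes, then applies the over-1,200-minutes filter; the function and variable names are my own.

```python
from collections import defaultdict

# (minutes, BoxScore, type) copied from the five rosters earlier in the
# thread; players under 1,000 minutes are left out to keep this short.
ROSTERS = {
    "CHI1996": [(3090, 17.93, "SP"), (2825, 12.34, "SP"), (2103, 8.84, "SP"),
                (2088, 6.86, "II"), (1919, 5.99, "SP"), (1886, 5.97, "PP"),
                (1641, 4.34, "MM"), (1065, 2.61, "IS")],
    "LAL1972": [(2973, 17.42, "SP"), (3040, 14.38, "SS"), (3469, 10.07, "II"),
                (3050, 8.83, "SS"), (2748, 6.64, "II"), (1007, 4.34, "SP"),
                (1081, 1.72, "II")],
    "LAL2000": [(3163, 19.16, "IS"), (2524, 10.93, "SP"), (2530, 7.69, "SS"),
                (2042, 5.47, "PI"), (1685, 5.38, "PI"), (1929, 4.83, "II"),
                (1803, 3.73, "SP"), (1473, 3.67, "SP"), (1249, 3.16, "PI")],
    "BOS1986": [(3113, 15.63, "MM"), (2397, 10.00, "IS"), (2567, 9.53, "II"),
                (2732, 8.13, "SP"), (2407, 6.94, "PP"), (1546, 5.63, "PI"),
                (1596, 3.96, "SP"), (1402, 3.40, "SS")],
    "BOS2008": [(2328, 12.26, "PI"), (2874, 11.29, "SP"), (2624, 8.32, "SP"),
                (2306, 8.16, "PP"), (1912, 5.66, "II"), (1821, 4.79, "PI"),
                (1480, 4.00, "SP"), (1373, 3.24, "MM")],
}
MIN_MINUTES = 1200

def mean_boxscore_by_type(rosters, min_minutes=MIN_MINUTES):
    """Average BoxScore per type among players over the minutes threshold."""
    by_type = defaultdict(list)
    for players in rosters.values():
        for minutes, boxscore, ptype in players:
            if minutes > min_minutes:
                by_type[ptype].append(boxscore)
    return {p: sum(v) / len(v) for p, v in by_type.items()}

means = mean_boxscore_by_type(ROSTERS)
ranking = sorted(means, key=means.get, reverse=True)

qualified = [(b, p) for players in ROSTERS.values()
             for m, b, p in players if m > MIN_MINUTES]
top10_types = [p for _, p in sorted(qualified, reverse=True)[:10]]

print(ranking)
print(top10_types.count("SP"))
```

Run on these rosters, the ordering is IS, SP, SS, MM, II, PP, PI, and five of the ten highest BoxScores belong to Perimeter Scorers, which agrees with the figures in the post. Note that Interior Scorer sits on top because of just two players (O'Neal and McHale), so the small-sample caveat applies strongly.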

dsparks



Joined: 22 Feb 2008
Posts: 61
Posted: Sat Jul 19, 2008 9:16 am Post subject:

Mike G and Mountain: Thanks for those suggestions--you have inspired me to think of VC, for Victory Contribution or Victories Contributed. It has the advantage of going well with MEV, PVC and VCR, which I also use, except in the other cases the V is for "Value," which sounds like a slogan for a big-box department store. So, I guess I'm still working on the name. MCW, for Marginal Contribution to Wins? That has the advantage of sounding scholarly and economics-y... Anyway, still an open topic. Mountain, you had asked for the Spurs team as well, but I failed to post it last time. Here they are, for 2006: Code:
      player           tmyr     min BoxScore     MEV ptype
16722 Tim Duncan       SAS2006 2784    12.79 1537.68 II
16947 Tony Parker      SAS2006 2715    11.10 1335.06 SP
16760 Manu Ginobili    SAS2006 1813     7.88  947.92 SP
16652 Bruce Bowen      SAS2006 2755     5.24  630.60 MM
16737 Michael Finley   SAS2006 2038     4.83  581.29 SS
16905 Nazr Mohammed    SAS2006 1389     4.40  528.90 II
16626 Brent Barry      SAS2006 1258     3.76  452.43 MM
16923 Rasho Nesterovic SAS2006 1515     3.66  440.50 II
16801 Robert Horry     SAS2006 1182     3.38  406.47 PI
17054 Nick Van Exel    SAS2006  986     2.13  256.65 SP
17053 Beno Udrih       SAS2006  586     1.77  213.39 SP
16931 Fabricio Oberto  SAS2006  490     1.11  133.13 II
16877 Sean Marks       SAS2006  181     0.66   79.64 IS
16996 Melvin Sanders   SAS2006  113     0.26   31.29 MM
16998 Alex Scales      SAS2006    0     0.00    0.00 MM
Mountain's post of 07/16 encouraged me to look into the typical productivity of each archetype, and so here's a table, sorted by mean BoxScores (I like the name BoxScores when I abbreviate it as BXS): Code:
    N     sum(BXS)  mean(BXS)  mean(VCR)
SP  4031  12431.15  3.083887   0.8841091
MM  2540   7101.7   2.795945   0.8621128
PP  2151   5832.14  2.711362   0.8987916
IS  2534   6515.77  2.571338   0.8495555
PI  1712   4166.33  2.433604   0.8423861
II  3559   8272.32  2.324338   0.8501778
SS  1590   3550.65  2.233113   0.7835383

The first thing I notice is that Perimeter Scorers are at the top. This might mean that they are the most productive, or it might mean that my BXS formulation overweights their statistical profiles--i.e. scoring gets too much credit. On the other hand, Pure Scorers are the least productive on average, which makes it seem as though scoring isn't necessarily overweighted. (VCR, incidentally, is Valuable Contributions Ratio, which is percent of valuable contributions to team success over percent of team minutes played.)

I would be interested in doing a top-50 most used lineups investigation, but I don't have a list of the lineups handy. Does anyone know of such a list with player names in a "Firstname Lastname" format, so it's easier to match up with my dataset?

I like the idea of making a "comet" tracing player styles over time, but that will take a little bit of coding to make happen. Sooner or later I'll get around to it, but I beg your patience for the time being. On the other hand, it's easy to see how much variance exists within a player's career. At a per-game level (where the SPI type is assessed for each game a player plays), here is a list of the most variable (in terms of direction, not radius) and least variable players (over 12 min/gp) for my more modern, per-game dataset: http://spreadsheets.google.com/pub?key= ... l1rtmb3fRA

By the way, I strongly second the recommendation of the work Q is doing at his blog--his work is an excellent example of application of some of the more abstract things we're doing here, backed up with subjective observation, etc. Very well done, and much more attention needs to be paid to the WNBA anyway.

Mountain, it's a very interesting idea to examine how type changes depending on the nature of the opposition. This is something I might try looking into. You guys are just asking questions faster than I can answer!

Q: I noticed the same thing about the increased clustering of the NBA spectrum. I'm not sure if this a) reflects reality, b) is an artifact of the plotting mechanism, or c) is just because the NBA spectrum is a sample of the "top" players, while the WNBA is a much broader sample.

Re: Q's and Mountain's posts on location density and distribution, I have similarly wondered whether, if we look at a spectrum plot for a team-year, we could see "holes" where the team is not producing. For example, New Orleans, in 2008, looked like this:

Each point is a single player-game. The apparent "lines" are due to the way the RGB to HSV conversion function works, and the fact that we're looking at single-game samples. For NOH, we can see some low-density areas in the Scorer's Opposite and Scoring Perimeter regions. What's not clear to me is whether New Orleans has any "need" to fill these holes, or if part of their success is due to filling other regions very well, while these are left mostly empty.

Can anyone suggest a way of measuring/identifying gaps algorithmically? From an EDA standpoint, looking at the graphic is probably the best way to go--there's much more information than could be captured by a single number--but it might be interesting to identify gaps, identify players who could fill those gaps, and perhaps identify useful/profitable trades. For example, I'm not the biggest fan of Lamar Odom, but he occupies the Scorer's Opposite and Pure Interior positions on the 2008 Lakers graph, while Ron Artest pretty much trends toward the top-right portion of the graph, which is currently already filled aptly by this year's Arbitrary MVP:

Ron Artest:

If anything, the Lakers have a hole in the Interior Scorer spot, but perhaps this was their motivation for acquiring Gasol, who does that sort of thing pretty well, historically:

Anyway, I'm graphed out for now.
_________________
David

http://arbitrarian.wordpress.com
Back to top

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Sat Jul 19, 2008 11:29 am Post subject:

David, I am glad you are finding names you like. Once you sketch out your whole system and it stabilizes it will be a little easier for me as a reader to stay on track. I edited some stuff from earlier because I don't think I was using all the labels and type assignments completely accurately. Of course proceed at your pace, though you are quite prolific. I just suggest extensions when I see them for your consideration and action if/when you can. Thanks for the many direct responses including the data on the Spurs. Basketball Value does have the top 50 most used lineups when you sort the list this way: http://basketballvalue.com/topunits.php ... order=DESC With text-to-columns delimited splitting, some flipping of first names in front of last names, and cutting extraneous columns, this would seem like a good source for your eventual use. If that would work but it would make a difference to have it done for you, I'd be willing if you PM'ed me an e-mail address to send it to and any further formatting preferences. Thanks again.
Back to top

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Sat Jul 19, 2008 12:59 pm Post subject:

Another angle that could be explored is whether player type helps with college-to-NBA translations more than traditional position assignment does. Many players must play a different type when they get to the pros, and failure to adapt to the type or role might explain a good deal. Were there any indicators in the game records that the player could successfully play another type in certain lineups in the rotation, or when a teammate was out or an opponent forced it? Is the player's "talent" flexible enough to adapt to a different type? I think the type model can guide consideration even if it is just one way to slice things.

7 is a pretty good simplified typology, but were or are you tempted at all to add a couple more? Just asking.

In the NBA it may be that too many players are trying to be pure scorers, perhaps repeating their college type. The bottom half may be giving the true NBA pure scorers a worse name than they deserve. Any player in the bottom half of their type I'd scrutinize for any signs he can be successful at that type or signs he could be better at another. I understand teams sometimes need a type even though below average, for depth or insurance or because of the type preferences of the coach / minutes demands of the system, but where possible you'd want to upgrade.

The main thrust of the typology was simplification... but you could reverse direction and add back some detail in some circumstances or for some users. Instead of being presented as a Perimeter Scorer because that is the type a player is most often or is most like on average, it would be possible to present a player, for example, as 55% Perimeter Scorer at BXS 10 / 40% Pure Perimeter at BXS 8. The threshold for breaking out a 2nd type might be 30-40%, to avoid the mess unless a player is being used or performing as a hybrid. How much is the result of role and how much is game-to-game variance in opportunity and performance will vary and is unclear, but this more detailed typing would be intended as an alert system for further study of what is happening and why.

It is also now possible, using one of David's recent files giving all player types and performance, to summarize a team for a season by SPI7. It is individuals and unique 5-man lineups that actually perform, but this summary is another way to see the team as a whole, above the often confusing and small-sample detail. Boston was 37% Perimeter Scorer at an average BXS of 8.2, 24% Scorer's Opposite at 8.07, 20% Pure Interior at 3.96, 12% Pure Perimeter at 8.16 and 7% Mixed at 3.24. There was no player predominantly a Pure Scorer or Interior Scorer. That is breadth of contribution but pretty strong type choice. The Sonics were 9% Perimeter Scorer at an average BXS of 1.54, essentially no Scorer's Opposite, 25% Pure Interior at 1.79, 21% Pure Perimeter at 1.88, 14% Mixed at 1.45, 20% Pure Scorer at 2.44 and 10% Interior Scorer at 1.84. That is more variety with no big positive contributions.
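For anyone who wants to reproduce a season summary by SPI7 like the one above, here is a minimal Python sketch. It uses the 2006 Spurs rows David posted earlier in the thread rather than the Boston or Seattle files, which are not shown here. The function name `team_type_summary` is made up for illustration, and weighting the average BXS by minutes is an assumption, since the post does not say how the per-type averages were computed.

```python
from collections import defaultdict

# (player, minutes, BoxScore, ptype) rows copied from the 2006 Spurs table in this thread
SPURS_2006 = [
    ("Tim Duncan", 2784, 12.79, "II"),
    ("Tony Parker", 2715, 11.10, "SP"),
    ("Manu Ginobili", 1813, 7.88, "SP"),
    ("Bruce Bowen", 2755, 5.24, "MM"),
    ("Michael Finley", 2038, 4.83, "SS"),
    ("Nazr Mohammed", 1389, 4.40, "II"),
    ("Brent Barry", 1258, 3.76, "MM"),
    ("Rasho Nesterovic", 1515, 3.66, "II"),
    ("Robert Horry", 1182, 3.38, "PI"),
    ("Nick Van Exel", 986, 2.13, "SP"),
    ("Beno Udrih", 586, 1.77, "SP"),
    ("Fabricio Oberto", 490, 1.11, "II"),
    ("Sean Marks", 181, 0.66, "IS"),
    ("Melvin Sanders", 113, 0.26, "MM"),
    ("Alex Scales", 0, 0.00, "MM"),
]


def team_type_summary(rows):
    """Return {ptype: (share of team minutes, minutes-weighted mean BXS)}.

    Minutes-weighting is an assumption, not something stated in the thread.
    """
    minutes = defaultdict(float)
    weighted_bxs = defaultdict(float)
    for _player, mins, bxs, ptype in rows:
        minutes[ptype] += mins
        weighted_bxs[ptype] += mins * bxs
    total = sum(minutes.values())
    return {
        t: (minutes[t] / total, weighted_bxs[t] / minutes[t])
        for t in minutes
        if minutes[t] > 0  # skip types with no minutes to avoid dividing by zero
    }


summary = team_type_summary(SPURS_2006)
# Print types from largest to smallest share of team minutes
for ptype, (share, mean_bxs) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{ptype}: {share:.0%} of minutes, BXS {mean_bxs:.2f}")
```

Each player is counted entirely under his dominant type here, which is the simple version of the summary; a split-type version would divide his minutes first.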
Back to top

dsparks



Joined: 22 Feb 2008
Posts: 61
Posted: Sat Jul 19, 2008 2:36 pm Post subject:

Mountain: I dropped the ball on finding the top 50 lineups, thanks for the link. It was a matter of minutes to put together: http://spreadsheets.google.com/pub?key= ... D_Mu-4i9VA

Recall: 1==SS; 2==SP; 3==PP; 4==PI; 5==II; 6==IS; 7==MM

I've got it sorted by team; I thought that would make for the easiest comparison.
_________________
David

http://arbitrarian.wordpress.com
Back to top

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Sat Jul 19, 2008 3:27 pm Post subject:

Great! Next thing to review. Looking at teams, I see that the 76ers and Cavs were the least diverse teams by type, judging by % of total minutes played by the top 2 types, and it was the same 2 types for both: II and SP. One was just above and the other just below 80% of all minutes. That is distinctive, presumably intentional.
Back to top

Mountain



Joined: 13 Mar 2007
Posts: 1527
Posted: Sat Jul 19, 2008 4:13 pm Post subject:

Only 8 of the top 50 most used lineups include a type 1 Pure Scorer and the average performance is 60% below the average for these lineups. The best and only one over average was the Hornets and Paul. (Good result but bad neighborhood?) Teams that started the position progression with 22 did slightly better than average and better than those starting with 23. 222 did even better, 223 did way better (used by Celtics, Suns, Nets, Wizards, Magic). 224 slipped some. 225 and 226 were terrible. The best interior combo was by far 4,5. The worst were 5,5 and 6,7. The most used lineup was 22577 and the performance was 40% above the average. Magic had 2 variations and Pistons and Nuggets also used it. The worst lineup in the top 50 was for the team headed for and managed by the group in OKC. The runner-up was the Bobcats. They were both substantially worse than the rest.

Last edited by Mountain on Sat Jul 19, 2008 6:32 pm; edited 3 times in total
Back to top

QMcCall3



Joined: 17 Jul 2008
Posts: 9
Posted: Sat Jul 19, 2008 4:44 pm Post subject:

Mountain wrote:
The worst lineup in the top 50 was for the team headed for and managed by the group in OKC. The runner-up was the Bobcats. They were both substantially worse than the rest.
Surprising that the Knicks weren't the worst... Just to be clear, which of the OKC/CHA lineups are you referring to as the worst? How do you think this type of analysis might compare to the best & worst plus/minus lineups?


PostPosted: Fri Apr 15, 2011 7:30 pm 
Offline

Joined: Thu Apr 14, 2011 11:10 pm
Posts: 2441
page 3

Author Message
Mountain



Joined: 13 Mar 2007
Posts: 1527


Posted: Sat Jul 19, 2008 5:03 pm Post subject:
Thanks, I missed 2 NY lineups that were indeed the worst on raw team +/-, because of number formatting and bad eyesight with small numbers.

Worst -Crawford Curry Marbury Randolph Richardson by almost double
Next- Crawford Curry Jones Randolph Richardson
both 22667- a combination of 226 and 67 with both ranking terrible on average.

then
Watson Durant Green Collison Petro- 13557
perimeters with 1's were weak and 57 is halfway between the 2 worst interiors (5,5 and 6,7) and third weakest on average here.

Felton - Mohammed- Okafor - Richardson- Wallace
23557


Boston's best lineup, composed of 22345, had the best perimeter type and interior type and set the overall best team raw +/-. There are chicken-and-egg and small-sample issues, and of course different combinations can work for different teams based on talent, but I think the information from the top 50 lineups is interesting.
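The lineup codes used in this discussion (22667, 13557, 22345) can be built mechanically from the type numbering David gave (1==SS; 2==SP; 3==PP; 4==PI; 5==II; 6==IS; 7==MM). Below is a small Python sketch of that encoding, assuming the code is just the five type digits sorted in ascending order, which is what the examples look like. The helper names `lineup_code` and `contains_combo` are hypothetical.

```python
from collections import Counter

# Type numbering as given in the thread
TYPE_CODE = {"SS": 1, "SP": 2, "PP": 3, "PI": 4, "II": 5, "IS": 6, "MM": 7}


def lineup_code(ptypes):
    """Encode a lineup's player types as a sorted digit string, e.g. '22667'."""
    return "".join(str(code) for code in sorted(TYPE_CODE[t] for t in ptypes))


def contains_combo(code, combo):
    """True if every digit of combo appears in code at least as many times."""
    have, need = Counter(code), Counter(combo)
    return all(have[digit] >= count for digit, count in need.items())


# A lineup with two Perimeter Scorers, two Interior Scorers and one Mixed player
knicks_like = lineup_code(["SP", "IS", "SP", "MM", "IS"])
print(knicks_like)
print(contains_combo(knicks_like, "226"), contains_combo(knicks_like, "67"))
```

With codes in this form, grouping the top 50 lineups by a perimeter prefix such as 22 or 223, or by an interior pair such as 45, is a matter of filtering on `contains_combo`.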


Boston was 37% Perimeter Scorer, 24% Scorer's Opposite, 20% Pure Interior, 12% Pure Perimeter and 7% Mixed. There was no player predominantly a Pure Scorer or Interior Scorer.

The average for the 5 champions David provided since 1986 was 35% Perimeter Scorer, 13% Scorer's Opposite, 13% Pure Interior, 8% Pure Perimeter, 12% Mixed, 6% Pure Scorer and 7% Interior Scorer. Small sample but SP led the way. Mixed, Pure Scorer and Interior Scorer were also light.


Purely on minutes distribution by type the ten teams most similar to Celtics last season were

2. SAC
LAL
PHO
HOU
League average
MIA
ATL
PHI
CLE
NJN

the least similar were

21. CHI
DAL
DET
TOR
ORL
MEM
MIN
POR
SEA
30. NOH with the least similar type distribution
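The post does not say which similarity measure produced this ranking, so the following is only one plausible way to do it: treat each team's minutes distribution by type as a 7-element vector and rank by Euclidean distance from the Celtics. The Boston, champions-average and Sonics percentages are the ones quoted in this thread (with "essentially no Scorer's Opposite" for Seattle entered as 0, which is an approximation); the `distance` function is an illustrative assumption.

```python
import math

TYPES = ["SP", "PI", "II", "PP", "MM", "SS", "IS"]

# Percent of team minutes by type, as quoted in the thread
boston = dict(SP=37, PI=24, II=20, PP=12, MM=7, SS=0, IS=0)
champions_avg = dict(SP=35, PI=13, II=13, PP=8, MM=12, SS=6, IS=7)
sonics = dict(SP=9, PI=0, II=25, PP=21, MM=14, SS=20, IS=10)  # PI entered as 0


def distance(a, b):
    """Euclidean distance between two type-minute distributions (an assumed metric)."""
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in TYPES))


# Rank the comparison profiles from most to least similar to Boston
candidates = {"Champions average": champions_avg, "Sonics": sonics}
for name, dist in sorted(
    ((name, distance(boston, profile)) for name, profile in candidates.items()),
    key=lambda pair: pair[1],
):
    print(f"{name}: distance {dist:.1f}")
```

Other choices, such as summed absolute differences or a correlation, could reorder teams that are close together, so a ranking like the one above should be read with the metric in mind.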

Last edited by Mountain on Sat Jul 19, 2008 8:45 pm; edited 1 time in total
Back to top
Mountain



Joined: 13 Mar 2007
Posts: 1527


Posted: Sat Jul 19, 2008 8:13 pm Post subject:
(MEV and BXS are boxscore based so the similarity rankings have more to do with offense than defense.
Pairing BXS quick n dirty with the adjusted defensive +/- of players in each type would probably give better similarity scores if you went beyond minutes distribution and also looked at performance by type as well.

Kings had nice breadth of type contributions towards offense. I'll have to look into them a little more.)
Back to top
dsparks



Joined: 22 Feb 2008
Posts: 61


Posted: Fri Jul 25, 2008 11:11 am Post subject:
Alright, I broke down and did a regression at the game level, instead of the season level. For the DV, I used final scoring margin for team A (thus, it will be a positive number of points if they won, negative if they lost). The IVs are, for each of the seven style archetypes, the number of minutes played in the game for team A, less the number of minutes played in the game for team B. Thus, you get something like this actual game observation, where team A beat team B by 13:

Code:

MARGIN  Perimeter Scorer  Pure Scorer  Scorer's Opposite  Mixed  Pure Perimeter  Interior Scorer
    13                29           39                -19     16             -34              -39
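To make the construction of one observation concrete, here is a short Python sketch of how a game row like the one above could be assembled from two box scores. The box scores, point totals and function names are hypothetical; only the idea (minutes by type for team A, less minutes by type for team B) comes from the post.

```python
TYPES = ["SP", "SS", "PI", "MM", "PP", "IS", "II"]


def type_minutes(box):
    """Sum minutes by style type for one team; box is a list of (ptype, minutes)."""
    totals = dict.fromkeys(TYPES, 0)
    for ptype, mins in box:
        totals[ptype] += mins
    return totals


def game_row(points_a, points_b, box_a, box_b):
    """One regression observation: margin for team A plus minute differentials by type."""
    a, b = type_minutes(box_a), type_minutes(box_b)
    row = {"MARGIN": points_a - points_b}
    for t in TYPES:
        row["d" + t] = a[t] - b[t]
    return row


# Hypothetical box scores: each team's minutes total 240 (a regulation game)
box_a = [("SP", 40), ("SP", 35), ("PI", 38), ("II", 36),
         ("PP", 30), ("MM", 25), ("SS", 20), ("IS", 16)]
box_b = [("SS", 40), ("SP", 36), ("IS", 38), ("II", 30),
         ("MM", 34), ("PP", 32), ("SP", 30)]

row = game_row(101, 94, box_a, box_b)
print(row)
```

One thing this makes visible: whenever both teams log the same total minutes, the seven differentials sum to zero, so they are not free to vary independently of one another.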


Here's the regression output:
Code:
Call:
lm(formula = WIDE[, 1] ~ WIDE[, -1] - 1)

Residuals:
    Min       1Q   Median       3Q      Max
-55.706   -5.163    3.571   11.646   62.769

Coefficients:
                                  Estimate  Std. Error  t value  Pr(>|t|)
WIDE[, -1]MIN.Perimeter Scorer   -0.028335    0.002797  -10.129    <2e-16 ***
WIDE[, -1]MIN.Pure Scorer        -0.102925    0.003515  -29.282    <2e-16 ***
WIDE[, -1]MIN.Scorer's Opposite   0.065226    0.003277   19.904    <2e-16 ***
WIDE[, -1]MIN.Mixed              -0.003964    0.002963   -1.338     0.181
WIDE[, -1]MIN.Pure Perimeter      0.075319    0.003164   23.808    <2e-16 ***
WIDE[, -1]MIN.Interior Scorer    -0.056873    0.003333  -17.062    <2e-16 ***
WIDE[, -1]MIN.Pure Interior       0.028303    0.003349    8.452    <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 12.99 on 24703 degrees of freedom
Multiple R-Squared: 0.08299, Adjusted R-squared: 0.08273
F-statistic: 319.4 on 7 and 24703 DF, p-value: < 2.2e-16


And, here's that output in a graphical format, with a line indicating zero, and tiny error bars:




And here's a graphical version using Type-Year minute differentials as the IVs:


The error bars are a little larger, as we are estimating more coefficients with the same amount of data.

The results here look pretty conclusive. Without doing interactions, each additional minute played by a Pure Perimeter or Scorer's Opposite player yields an increase of 0.075 (or 0.065) additional points on the final scoring margin, on average. Pure Scorers, on the other hand, dock your team a point of margin for every ten minutes they play. And yet, as our esteem'd colleague Dr. Berri would note, many Pure Scorers like Kevin Durant and Carmelo Anthony are lauded for their contributions...
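As a worked example of reading these coefficients, the Python sketch below plugs the sample game observation from earlier in this post into the fitted model. The coefficients are copied from the regression output. The sample row shows only six of the seven type columns, so the Pure Interior differential is set to 0 here purely as a placeholder assumption.

```python
# Coefficients from the game-level regression output (no intercept)
COEF = {
    "Perimeter Scorer": -0.028335,
    "Pure Scorer": -0.102925,
    "Scorer's Opposite": 0.065226,
    "Mixed": -0.003964,
    "Pure Perimeter": 0.075319,
    "Interior Scorer": -0.056873,
    "Pure Interior": 0.028303,
}

# Minute differentials from the sample observation (team A won by 13)
example = {
    "Perimeter Scorer": 29,
    "Pure Scorer": 39,
    "Scorer's Opposite": -19,
    "Mixed": 16,
    "Pure Perimeter": -34,
    "Interior Scorer": -39,
    "Pure Interior": 0,  # assumption: this column is not shown in the sample row
}

# Fitted margin is the sum of coefficient times minute differential
predicted = sum(COEF[t] * example[t] for t in COEF)
print(f"predicted margin: {predicted:.2f}, actual margin: 13")
```

The fitted margin for this game comes out negative even though team A won by 13, which is in keeping with the R-squared of about 0.08: the type mix shifts the expected margin a little, but most of any single game's result is left unexplained.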

I'm not sure what-all to make of this, but I know that some of you will have some really good ideas.

And, once you've digested all that, I've got a big puzzle for everyone to solve that will require a whole new topic!
_________________
David

http://arbitrarian.wordpress.com
Back to top
dsparks



Joined: 22 Feb 2008
Posts: 61


Posted: Fri Jul 25, 2008 11:54 am Post subject:
Ok, so this isn't the puzzle I promised, and I know it's not cool to double post, but this is new stuff:

I ran a couple of regressions similar to the above, except instead of using scoring margin, I used other margins, such as offensive rebounding, assists, etc., some of which make more sense than others:

Offensive Rebounds:


Defensive Rebounds:


Total Rebounds:


My guess is that Pure Scorers increase offensive rebound margin because they take (and possibly miss) a lot of shots, and Scorer's Opposites reduce OR margin not necessarily because they fail to gather a lot of boards, but because they don't chuck. Just a theory though.

One surprise is that it appears that Pure Perimeters contribute more overall to rebounding margin than do Perimeter and Pure Scorers... No theory as to why just yet.

Assists:

No surprises here, really. It's possible that Pure Interiors help assist margin by blocking shots, which prevents assists...

Turnovers:

Again, no real surprises here. The top three don't handle the ball as much, and are more defensively-oriented.

Blocks:

No surprises.

Personal Fouls:

I'm not sure about this one... It seems as though Pure and Perimeter Scorers would get fouled more, in their scoring attempts. Also, I would have thought that Scorer's Opps and Pure Interiors, playing a more defensive game would give up more fouls... any ideas?

Missed field goals:

Largely unsurprising, except for the Pure Perimeter helping the most (in this case a positive coefficient is bad--a larger margin of missed fgs), given that Point guard-types are not typically good shooters nor well-known shot defenders. My hunch is that this is the assist effect--better passes lead to more made shots, and certainly assists and made shots are highly correlated.

Anyway, I'd love to hear what you actual knowledgeable people have to say about all this.
_________________
David

http://arbitrarian.wordpress.com
Back to top
Harold Almonte



Joined: 04 Aug 2006
Posts: 616


Posted: Fri Jul 25, 2008 2:36 pm Post subject:
There is a strong correlation between scoring margin (wins), assists (linked to teammates' FG and FG%), low FGX, and defensive rebounds (shot defense?). This is the 1st factor and some of the 3rd, and it punishes scoring (shot attempters). Do scorers (or those called scorers) tend to be more one-dimensional than defenders? Anyway, being too one-dimensional toward the defensive end is not good either, but should it be worse? I would like to see this puzzle with playoff games.

I won't talk about your BoxScore metric, but in your playing style exemplars, a lot of "pures" and "scorers" could or might be classified in a more balanced "mixed" category.
Back to top
Mountain



Joined: 13 Mar 2007
Posts: 1527


Posted: Fri Jul 25, 2008 7:21 pm Post subject:
There are a lot of different snapshots / levels of analysis being provided to keep in mind. Scorer's Opposites and Pure Perimeters certainly looking good on these graphs.

Interesting to review an earlier dataset and see that teams with the rare Scorer's Opposites (PI) often have several, which increases the impression that at least in some cases this is knowledgeable selection. Posey as 9th best in this group might help explain the degree of interest more than his adjusted +/- score does. Odom #2 and Josh Smith #3 seem worth mentioning too.


If I follow the "Code box" information / explanation correctly, it looks like a good way to collapse all lineup match-ups onto a single spectrum. It would be interesting to see what the per minute average results look like for a set of "bands" (minute distributions that are fairly similar) for teams and the league as a whole. For a public snapshot you could "collapse" to positive and negative minute differentials and get down to 128 lineup bands. If you were on the inside I'd think you might look at around 600-3,000 bands and see what useful clues can be found from it.
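The figure of 128 bands follows from keeping only the sign of each of the seven type differentials, since 2^7 = 128. A minimal Python sketch of that collapse is below; the `band` function is a hypothetical helper, and counting a differential of exactly zero as positive is an arbitrary assumption made to keep the count at 128.

```python
from itertools import product


def band(diffs):
    """Collapse seven minute differentials to a sign pattern such as '++-+--+'.

    A differential of exactly zero is treated as '+' (an arbitrary choice).
    """
    return "".join("+" if d >= 0 else "-" for d in diffs)


# The sample observation from David's post, with a made-up 7th value of 8
sample = (29, 39, -19, 16, -34, -39, 8)
print(band(sample))

# Every possible sign pattern across seven types
all_bands = {band(signs) for signs in product((1, -1), repeat=7)}
print(len(all_bands))
```

Average margin per band could then be tabulated by grouping game rows on the band string. Finer bands, such as the 600 to 3,000 suggested above, would need cut points on the size of each differential and not just its sign.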
Back to top
Mountain



Joined: 13 Mar 2007
Posts: 1527


Posted: Sat Jul 26, 2008 5:37 pm Post subject:
I can see how a Pure Perimeter facing an opponent without one on the court might have an advantage, but going back to an issue of some discussion in the last year or two, how well do teams perform with 2 Pure Perimeters on the court together? Not just combo guards in size or mainstream description but Pure Perimeter types?


Noticed A Miller had one of the worst adjusted +/-s in the league last season, a sharp contrast to the previous 3 seasons. What to make of it? Only notably good with T Young. Who might trade for him? I don't see an obvious playoff team wanting him unless maybe the Suns wanted him instead of one of the large, longer term contracts. Wait n see how it goes for Philly. If things go well they probably stick with him and let his salary go away, but if they don't meet higher expectations he will probably be the first to go. For their sake I'd think they'd want an experienced 3 pt shooting PG if they could get one. Would they go for Billups? Maybe, if Joe D didn't overprice him. Perhaps they could get Mo Williams. Or possibly Farmar. I'd guess something will get done this season or next summer. Unless he and Brand really click. They didn't as Clippers in 2002-3: Miller had the worst season of his career too, and Brand slipped from the previous year and did better after Miller left. Team slipped considerably. But of course many things could have been involved other than their direct playing interaction. It will be interesting to see what raw player pairs and the various adjusted measures show next season.
Back to top
gabefarkas



Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC

Posted: Tue Jul 29, 2008 11:16 am Post subject:
dsparks wrote:
Alright, I broke down and did a regression at the game level, instead of the season level. For the DV, I used final scoring margin for team A (thus, it will be a positive number of points if they won, negative if they lost). The IVs are, for each of the seven style archetypes, the number of minutes played in the game for team A, less the number of minutes played in the game for team B. Thus, you get something like this actual game observation, where team A beat team B by 13:

Code:

MARGIN  Perimeter Scorer  Pure Scorer  Scorer's Opposite  Mixed  Pure Perimeter  Interior Scorer
    13                29           39                -19     16             -34              -39

This is really, really, really neat.

One thought:
I seem to remember when you initially rolled out the different designations that although you're categorizing players, their assignments seemed more fluid. In other words, if someone looked like they were 70% Perimeter Scorer, 20% Pure Perimeter, and 10% Pure Scorer, they got assigned as a Perimeter Scorer.

So, I was thinking it might be interesting to factor this into the analysis, so that instead of saying "Perimeter Scorer XYZ played 30 minutes", you count those 30 minutes as 70% of a Perimeter Scorer on the floor, 20% of a Pure Perimeter on the floor, and 10% of a Pure Scorer on the floor.

I'm assuming the totals will still add up to 10 guys on the floor (give or take) at any one time.

But, if Team A has two guys on the floor who are both tagged as Perimeter Scorers, but in reality may be 55% Perimeter Scorer and 45% Pure Perimeter, it's almost like having one of each out there, no?
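In code, the idea is just to spread each player's minutes across types in proportion to his type percentages. Here is a minimal Python sketch using the 70/20/10 and 55/45 examples from this post; the function names `split_minutes` and `team_split_minutes` are hypothetical.

```python
from collections import defaultdict


def split_minutes(minutes, type_pct):
    """Divide a player's minutes across types in proportion to his type percentages."""
    total = sum(type_pct.values())
    return {t: minutes * pct / total for t, pct in type_pct.items()}


def team_split_minutes(players):
    """Sum split minutes over a list of (minutes, type_pct) players."""
    totals = defaultdict(float)
    for minutes, type_pct in players:
        for t, mins in split_minutes(minutes, type_pct).items():
            totals[t] += mins
    return dict(totals)


# Perimeter Scorer XYZ: 30 minutes at 70% SP, 20% PP, 10% SS
xyz = split_minutes(30, {"SP": 70, "PP": 20, "SS": 10})
print(xyz)

# Two players tagged SP who are each really 55% SP / 45% PP, 30 minutes apiece
pair = team_split_minutes([(30, {"SP": 55, "PP": 45}), (30, {"SP": 55, "PP": 45})])
print(pair)
```

The split minutes still add up to the minutes actually played, so team totals are unchanged; only the allocation across types moves. In the second case the pair is credited with 33 SP minutes and 27 PP minutes, which is close to having one of each on the floor.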


PS - this was my 1000th post on the board. Yikes!
Back to top
Mountain



Joined: 13 Mar 2007
Posts: 1527


Posted: Tue Jul 29, 2008 8:47 pm Post subject:
Gabe makes a good suggestion. It probably would be worthwhile to profile lineups both ways and using player type splits might be more accurate. Then again if distinct role fulfillment by individuals matters then the dominant descriptor may be the most important.

If one was afforded the time you could look at the type split detail for each type in the league and year to year league type split movement. And player type split change in games where the team won or lost or performed x amount better or worse on offensive or defensive efficiency or net counterpart production. And you could look at player type splits in lineups that are successful or not and how player type changes in and outside of particular player pairings. And you could compute average type split detail for starters by type, on just teams that made the playoffs or didn't. Or by age, height, pay, PER, BXS, playoff / regular season ratio levels, or where adjusted +/- is more than x standard errors from the mean, etc.

Or look at the type split change for players changing teams or starter vs sub or pace or across coaches / career. Which coaches / GMs are type purists based on who they play or pay / trade? Which teams "change" players how and to what effect? Looking league-wide over enough time can you say anything useful about major conversion projects?

Or tabulate team 4 factor performance across a player's game to game type spectrum and ask does type behavior correlate with important team 4 factor trends in subtle ways in addition to obvious ones. Or correlate player type in game to performance and see whether the trends suggest a player seek to be that archetype or "shift" some. When does a player's positive or negative performance occur and what type are they expressing at those times? And look at changes in players faced by various splits.

Or look at entire league's performance against say the champs and see what type or type split composite does best / worst against them by position to help guide the giant killer strategy.

Or look at the team archetype net minutes profile game by game vs the average type playing time profile of those teams to describe the coach vs coach clash and you could look at which patterns looked better / worse across the league when facing them regular season and especially in playoffs. Archetype net minutes profiles could help with team similarity statements or for that matter coaching similarity.

Plenty to keep one busy full-time. Or more.

Last edited by Mountain on Wed Jul 30, 2008 5:56 pm; edited 1 time in total
Back to top
dsparks



Joined: 22 Feb 2008
Posts: 61


Posted: Wed Jul 30, 2008 9:43 am Post subject:
Mountain: You could write a book using just your good ideas, and call it "Things We Should Look Into," by Mountain. Once I figure out a good framework for investigating some of those splits, I'm all over it.

This time, though, I used gabefarkas' idea. By looking at each player's proximity to each of six "ideal" archetype points, I calculated their percentage of similarity to that point. Each player's percentages add up to one, and so I've divided each player's minutes, in each game he played, according to these percentages. E.g. Player A played 40 minutes, and was classified as 20% pure scorer, 50% perimeter scorer, and 30% interior scorer: his minutes would be counted for his team as 8 SS, 20 SP, and 12 IS. One possible advantage is that we are getting rid of the "Mixed" type this way... I ran some regressions with very cool results; I'll report the output, then discuss each in turn:

Code:

(MARGIN) Predicting margin by difference in minutes played by split type
Estimate Std. Error t value Pr(>|t|)
dSS -0.302 0.004 -82.940 <2e-16 ***
dSP -0.118 0.003 -36.502 <2e-16 ***
dPP 0.129 0.004 35.196 <2e-16 ***
dPI 0.051 0.003 17.613 <2e-16 ***
dII -0.045 0.030 -1.506 0.132
dIS -0.187 0.003 -66.030 <2e-16 ***

(JOINT) Predicting own team scoring, model includes both own team and opponent minutes played by split type
Estimate Std. Error t value Pr(>|t|)
tSS 0.135 0.005 28.652 < 2e-16 ***
tSP 0.232 0.004 59.798 < 2e-16 ***
tPP 0.361 0.005 76.984 < 2e-16 ***
tPI 0.150 0.004 41.702 < 2e-16 ***
tII 0.336 0.041 8.150 3.74E-16 ***
tIS 0.080 0.003 23.207 < 2e-16 ***
oSS 0.436 0.005 92.909 < 2e-16 ***
oSP 0.350 0.004 90.076 < 2e-16 ***
oPP 0.232 0.005 49.419 < 2e-16 ***
oPI 0.100 0.004 27.669 < 2e-16 ***
oII 0.380 0.041 9.232 < 2e-16 ***
oIS 0.267 0.003 77.223 < 2e-16 ***

(OFFENSE) Predicting own team scoring, including only own team minutes played by split type
Estimate Std. Error t value Pr(>|t|)
tSS 0.410 0.004 93.823 < 2e-16 ***
tSP 0.521 0.003 170.220 < 2e-16 ***
tPP 0.687 0.004 160.293 < 2e-16 ***
tPI 0.444 0.003 149.168 < 2e-16 ***
tII 0.178 0.047 3.765 0.000167 ***
tIS 0.343 0.003 120.562 < 2e-16 ***

(DEFENSE) Predicting opponent scoring, including only own team minutes played by split type
Estimate Std. Error t value Pr(>|t|)
tSS 0.636 0.004 155.430 <2e-16 ***
tSP 0.561 0.003 195.660 <2e-16 ***
tPP 0.452 0.004 112.640 <2e-16 ***
tPI 0.278 0.003 99.700 <2e-16 ***
tII 0.486 0.044 11.010 <2e-16 ***
tIS 0.438 0.003 164.510 <2e-16 ***


The MARGIN results report the change in final margin for every minute played FOR a team by a type less every minute played AGAINST a team by a type. So, every minute you had a Pure Perimeter on the floor that your opponent did not (well, not literally, but on net), your team adds an average of 0.129 to the final margin.

The JOINT results try to estimate offensive and defensive production of each type. The DV is team points scored. The t** coefficients indicate the average increase in points per minute on the floor by that type; larger is better. The o** coefficients indicate the same, but since it's opponent type-minutes, larger is bad. Note that the difference in each type's t** and o** coefficients gives, up to rounding, the coefficients we find in the MARGIN model. For example, tSS-oSS = 0.135 - 0.436 = -0.301, which matches dSS = -0.302 to within rounding of the reported figures.
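The relationship between the JOINT and MARGIN coefficients can be checked for all six split types with a few lines of Python. The numbers below are copied from the tables above; the tolerance of 0.0015 is an assumption chosen to allow for the three-decimal rounding in the reported output.

```python
# Coefficients copied from the MARGIN and JOINT tables (rounded to 3 decimals)
margin = {"SS": -0.302, "SP": -0.118, "PP": 0.129, "PI": 0.051, "II": -0.045, "IS": -0.187}
joint_team = {"SS": 0.135, "SP": 0.232, "PP": 0.361, "PI": 0.150, "II": 0.336, "IS": 0.080}
joint_opp = {"SS": 0.436, "SP": 0.350, "PP": 0.232, "PI": 0.100, "II": 0.380, "IS": 0.267}

TOLERANCE = 0.0015  # allows for rounding in the reported coefficients

# Gap between (own-team minus opponent) and the MARGIN coefficient, per type
gaps = {t: (joint_team[t] - joint_opp[t]) - margin[t] for t in margin}
for t, gap in gaps.items():
    status = "ok" if abs(gap) <= TOLERANCE else "MISMATCH"
    print(f"{t}: t-o = {joint_team[t] - joint_opp[t]:+.3f}, d = {margin[t]:+.3f}, {status}")
```

All six types agree to within about 0.001, so the two models are telling the same story about net impact; the JOINT model just separates it into an own-scoring side and an opponent-scoring side.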

The OFFENSE results try to estimate offensive production, but without controlling for opponent type-minutes. Higher is better. The DEFENSE results try to estimate defensive production, without controlling for opponent type-minutes. Lower is better (indicating fewer points given up per additional minute played). I'm not sure which model is the most useful to look at, although I suspect that the JOINT model is more useful than the OFFENSE and DEFENSE models combined, but I haven't thought too much about it.

Incidentally, here's a breakdown of my dataset by minutes logged by each split type:

Code:

Number of minutes played by each split type
teamSS teamSP teamPP teamPI teamII teamIS
1511996 2728679 1646662 2157942 312689 2176822

Percentage of total minutes played by each split type
teamSS teamSP teamPP teamPI teamII teamIS
0.144 0.259 0.156 0.205 0.030 0.207


As you can see, Pure Interior play is very rare. I'm not sure exactly why this is. It could be that there are lots of minutes played by players who we would classify generally as Pure Interior players, but they actually swing back and forth between Scorer's Opposite and Perimeter Scorer, and rarely play games in which they fit largely into the Pure Interior type. It could also be that such players exist, but don't get a lot of minutes... The regressions seem to indicate that II is not the least productive type.
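The percentage row in the table above is just each type's minutes divided by the grand total, which a short Python check confirms. The minute totals are copied from the table; nothing else is assumed.

```python
# Minutes logged by each split type, copied from the table above
minutes = {
    "teamSS": 1511996,
    "teamSP": 2728679,
    "teamPP": 1646662,
    "teamPI": 2157942,
    "teamII": 312689,
    "teamIS": 2176822,
}

total = sum(minutes.values())
# Share of all minutes for each split type
shares = {t: m / total for t, m in minutes.items()}

for t, share in shares.items():
    print(f"{t}: {share:.3f}")
```

The rounded shares sum to 1.001 instead of exactly 1 only because each one is rounded to three decimals separately.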

Mountain, I've got a spreadsheet for you: It's each team, and their distribution of minutes at each split type.

http://spreadsheets.google.com/pub?key= ... HRx1PMkaPw

It's sorted by season, and wins within season. Someday, perhaps, Google Docs will let viewers sort by column, but until then, I guess you'll have to copy and paste to glean much more from that.

One final bit: Correlations of each team's type percentages with each other and with Pythagorean win projection:

Code:

              teamSS      teamSP      teamPP      teamPI      teamII      teamIS    teampyth
teamSS    1.00000000 -0.04100399 -0.26441698  -0.4555426  0.24006872 -0.10362708 -0.28279211
teamSP   -0.04100399  1.00000000 -0.56163292  -0.0974909 -0.03042045 -0.41776926 -0.09560936
teamPP   -0.26441698 -0.56163292  1.00000000  -0.1557434  0.27530468  0.02720873  0.26684871
teamPI   -0.45554255 -0.09749091 -0.15574338   1.0000000 -0.21398326 -0.40869420  0.31039440
teamII    0.24006872 -0.03042045  0.27530468  -0.2139833  1.00000000 -0.29709183  0.07201173
teamIS   -0.10362708 -0.41776926  0.02720873  -0.4086942 -0.29709183  1.00000000 -0.24893058
teampyth -0.28279211 -0.09560936  0.26684871   0.3103944  0.07201173 -0.24893058  1.00000000


This (all of the above results, actually) backs up the Berri thesis (the one with which I agree), that scorers, especially pure scorers, are overrated.
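To make the pattern in the last row easier to see, here is a small sketch that ranks the split types by their correlation with Pythagorean win projection. The values are copied from the teampyth row of the matrix above; the variable names are illustrative. These are simple pairwise correlations, so they suggest a pattern rather than establish an effect.

```python
# Correlation of each split type's share of team minutes with teampyth,
# copied from the last row of the matrix above.
pyth_corr = {
    "teamSS": -0.28279211,
    "teamSP": -0.09560936,
    "teamPP": 0.26684871,
    "teamPI": 0.31039440,
    "teamII": 0.07201173,
    "teamIS": -0.24893058,
}

# Sort from most positive to most negative correlation.
ranked = sorted(pyth_corr.items(), key=lambda item: item[1], reverse=True)
for split, corr in ranked:
    print(f"{split}: {corr:+.3f}")

# Collect the splits whose share is negatively correlated with teampyth.
negative = [split for split, corr in pyth_corr.items() if corr < 0]
print("Negative:", negative)
```

In this table, the three splits with a negative correlation are exactly the three that include a Scoring component (SS, SP, IS), with teamSS the most negative.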

Incidentally, I think Ron Artest is a much better fit for the Rockets than he would have been for the Lakers, especially given how much Houston had to give up. Morey is a genius--if he's reading this, I'd like him to know that I am available to run regressions and make graphs all day for the Rockets. He managed, essentially, to turn Bonzi Wells (for whom he got Bobby Jackson) plus a rookie who, like all rookies, is characterized by a high degree of uncertainty, into Ron Artest. Whenever I read about Morey's doings, I wonder what sort of statistics and models they're employing, because he manages to swing what appear to be really good deals almost all of the time.

I have a hard time believing that given the huge economic incentives involved, more teams haven't caught up to some of the leaders, in terms of statistical analysis. Is there truly that great of a gulf between the sophisticated and "unsophisticated" General Managers? Anyway, it's exciting to think that somewhere in Houston, someone is doing something like what we have going on here, and making things happen.


Update: Here's an SPI plot of last year's Rockets, plus Barry and Artest:


If anything, Artest is a more valuable version of Bobby Jackson, and is redundant for McGrady, not Battier. As many have pointed out, though, McGrady might not have to carry the load so much, and this would probably be a good thing for his back.
_________________
David

http://arbitrarian.wordpress.com
dsparks



Joined: 22 Feb 2008
Posts: 61


Posted: Wed Jul 30, 2008 12:44 pm Post subject:
Please forgive me for monopolizing the conversation.

It has occurred to me, in thinking about the Rockets' trade, that what the Rockets really need, and what this trade may allow them to do, is put someone in a more Pure Perimeter situation. If you look at the graphic in the last post, you'll see that their starting point, Alston, is somewhere between Perimeter Scorer and Pure Perimeter. I don't know that they'll be able to make such a trade, but perhaps they could ask Alston to focus more on the facilitating aspects of his position than the shooting. I wondered whether or not teams could shift players in a meaningful way and find success.

Compare the following two SPI plots. The first features the roster of the 2008 Celtics, but uses their 2007 statistics. The second plots their actual 2008 positions.



Note that both Rondo and Pierce move much more toward the Pure Perimeter position. Ray Allen (and Garnett) allowed Pierce to carry less of the scoring load, and work more on the facilitating. Posey's presence may have allowed Rondo to focus less on the defensive aspects of perimeter play and move more toward a Pure Perimeter facilitator. Certainly, the addition of two all-star players made the difference for the Celtics in 2008, but perhaps some of the improvement came via the ability to repurpose players they already had...

Incidentally, notice how neither of these teams has anyone remotely like a Pure Scorer or an Interior Scorer, which were the two worst types according to the MARGIN regression above.
_________________
David

http://arbitrarian.wordpress.com
Harold Almonte



Joined: 04 Aug 2006
Posts: 616


Posted: Wed Jul 30, 2008 5:29 pm Post subject:
A couple fewer FGA (a more comfortable zone of the usage-efficiency curve), a better defensive frontcourt (more increasing returns on defensive stats), a lighter scoring load, and less attention from the defense (more comfortable shot creation), and suddenly they move away from pure and toward mixed. And more wins.

I think that tending toward "pure" scoring is often not the scorer's choice, but a sign of the team's lack of diversity. Scorers are not "overrated"; they are called on to be pure or one-dimensional because the coach has no other option.

PS: Not to mention that WOW underrates scoring.

Last edited by Harold Almonte on Wed Jul 30, 2008 8:53 pm; edited 2 times in total
Mountain



Joined: 13 Mar 2007
Posts: 1527


Posted: Wed Jul 30, 2008 6:26 pm Post subject:
David, I threw out some leads; run with any you wish. I am not sure which, if any, I will get to anytime soon.

Thanks for the new spreadsheet, I'll look at it.

And eventually "the puzzle".

Could you remind me: have you published a spreadsheet with the type splits for every player yet?
I thought you might have, but didn't find it in a quick look back. It would be of interest for some of these newer research directions.

Last edited by Mountain on Wed Jul 30, 2008 7:36 pm; edited 3 times in total
QMcCall3



Joined: 17 Jul 2008
Posts: 9


Posted: Wed Jul 30, 2008 6:28 pm Post subject:
Mountain wrote:

Or look at the type split change for players changing teams or starter vs sub or pace or across coaches / career. Which coaches / GMs are type purists based on who they play or pay / trade? Which teams "change" players how and to what effect? Looking leaguewide over enough time can you say anything useful about major conversion projects?...Or look at entire league's performance against say the champs and see what type or type split composite does best / worst against them by position to help guide the giant killer strategy.


This line of thinking made me think about the Golden State Warriors -- and makes me wonder if Don Nelson runs through a mental model of these questions before making his decisions. In fact, I think he gets so caught up in trying to exploit mismatches in player styles that he puts together imbalanced rosters that can't win consistently or make deep runs in the playoffs.

Anyway, I'd be willing to bet that a lot could be learned from looking at the Warriors regarding shifting playing styles from game to game just because of the way Nelson plays with lineups. And looking at the Warriors roster and their offseason changes brought up a few additional questions too.

Here's the Warriors' current roster, split into what I imagine their rotation might be (with some guesses as to the styles rookies might grow into):

Biedrins: pure interior
Harrington: mixed
Maggette: perimeter scorer
Jackson: perimeter scorer
Ellis: perimeter scorer

Turiaf: scorer's opposite
Wright: pure interior
Azu: interior scorer
Williams: perimeter scorer
Beli: pure scorer

Watson: perimeter scorer
Morrow: (pure scorer)
Randolph: (perimeter scorer)
Hendrix: scorer's opposite
Perovic: pure interior
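A quick tally of the styles guessed in the roster above shows how concentrated it is. This is only an illustrative sketch: the labels are the poster's own guesses (including the parenthetical ones for the rookies), and the variable names are made up for the example.

```python
from collections import Counter

# Style labels as guessed in the roster list above (rookie guesses included).
roster_styles = {
    "Biedrins": "pure interior",
    "Harrington": "mixed",
    "Maggette": "perimeter scorer",
    "Jackson": "perimeter scorer",
    "Ellis": "perimeter scorer",
    "Turiaf": "scorer's opposite",
    "Wright": "pure interior",
    "Azu": "interior scorer",
    "Williams": "perimeter scorer",
    "Beli": "pure scorer",
    "Watson": "perimeter scorer",
    "Morrow": "pure scorer",
    "Randolph": "perimeter scorer",
    "Hendrix": "scorer's opposite",
    "Perovic": "pure interior",
}

# Count how many players fall under each style, most common first.
style_counts = Counter(roster_styles.values())
for style, count in style_counts.most_common():
    print(f"{style}: {count}")
```

By this count, perimeter scorers are the largest group, and no one on the list is labeled pure perimeter.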

So a few questions:

First, Dave Berri often uses Win Score to analyze the impact of transactions in terms of wins. I wonder if Boxscores can be used in the same way with any validity, since they take a team's wins into account. For example, this new roster has a total Boxscore of 40.19 based on last year's numbers, compared to last season's 47.26 (minus a few bit players). Is it fair to say the Warriors will be 7 wins worse this year, or is that not possible to say with Boxscores?

Second, the Warriors seem to be following a theory that pure distributors are not necessary (though Marcus Williams may take on that role in this system). It brings up a question for me about whether it would be worth better defining a sub-spectrum, especially for point guards. For example, Ellis (the expected 08-09 point), Marcus Williams, and Baron Davis are all "perimeter scorers", but Davis is statistically more of a distributor and Ellis more of a scorer.

Within the SPI spectrum, I tend to think of point guards in terms of facilitators (lead guards who are able to create scoring opportunities for others), creators (lead guards who are able to create scoring opportunities for themselves) and utility guards (the bigger guards who might rebound more and score less). A "combo guard" is therefore any guard who does something other than just facilitate. All of those fall somewhere in the range from perimeter scorer to pure perimeter.

I wonder if there would be a way to tease out which specific type of point guard fits best with a given team and how they need to function/shift to most effectively run a given offense.

Third, Chris Mullin has said the Warriors intend to run more this year and we see that they have put together a roster where they can come at you in waves of the same types of players, constantly keeping the pressure on.

I haven’t really looked across multiple teams, but I wonder how many teams employ this "redundancy" strategy (maintaining a consistent style of play) vs a “diversity” strategy (having different styles of play and keeping opponents off-balance). It would be interesting to see if one or the other was more/less effective and with what combinations. For whatever reason, I tend to favor the redundancy strategy, especially if there is a coherent strategy around it. But I could see where it might be beneficial to have a backup who can do some other things, in the Warriors’ case – a post scorer or pure distributor could be useful in terms of making adjustments in the face of different matchups.

Fourth: The Warriors also have a number of young players, and it would be interesting to try to project style development (as I’ve mentioned before), much in the way that Kevin Broom’s diamond rating projects diamonds in the rough by looking at productivity. Perhaps using the diamond rating with the S-P-I scales would help in that regard?
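For reference, the arithmetic behind the first question is sketched below, using the two Boxscore totals quoted there. The variable names are illustrative. Whether a gap in summed Boxscores translates one-for-one into wins is exactly the open question, so this computes only the gap.

```python
# Summed Boxscores quoted in the first question above.
boxscore_last_season = 47.26  # last season's roster, minus a few bit players
boxscore_new_roster = 40.19   # new roster, based on last year's numbers

# Difference between the two totals, rounded to two decimals.
boxscore_gap = round(boxscore_last_season - boxscore_new_roster, 2)
print(f"Gap in summed Boxscores: {boxscore_gap:.2f}")
```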

Sorry for the long post, but this work is getting more and more interesting as people are digging into it further.
Mountain



Joined: 13 Mar 2007
Posts: 1527


Posted: Wed Jul 30, 2008 6:34 pm Post subject:
The Rockets' PG staffing remains their biggest issue, I think, as I discussed some last season, though a corollary question might be how much McGrady should play with the ball. Morey (and Ed K. and Eli W.) probably have data on the Rockets' offensive efficiency when McGrady has the ball in the first x seconds of a play, or for y elapsed seconds of a play, versus when he doesn't. Are the Rockets more or less efficient when he is the "play-maker"? I don't know, but they should, and that information is key to next steps, though under any circumstances a more Pure Perimeter player would seem to be a worthwhile supplemental option.

I'll note that McGrady was estimated as only a +1.5 on offensive +/- this past season. Alston was barely under neutral. Not much difference. But this is for all their time on court. With great resources, though, you could run offensive adjusted +/- for splits of any kind (being the primary play-maker on the play, or for an offensive set, or a specific play or option call, or whatever) with useful sample sizes. Barry was almost +5 on offensive +/-, but that was in a particular role. Still, if you did a split for adjusted offensive +/- when he served as a play-maker last season or during past seasons, you could get some read (in other team contexts) on using him for that, if they had any interest in him playing PG and felt they could live with him guarding PGs or felt comfortable with some type of cross-match in specific lineups or times in the game.

Brooks was only -1 on offense, not bad for a rookie PG, but was very weak on adjusted defensive +/-, clearly the worst on the team. (Another small PG with this weakness.) How much does he improve on either next season? Adelman and McGrady are helpful resources, reducing the pressure on the PG to be traditional, and options are good, but the Rockets' PG situation still does not look that strong, especially compared to many elite teams.

Morey says he is staying with Yao and McGrady, and Barry and Artest are nice changes, but is he staying with Alston, or is he open to, or looking for, the right opportunity for change from outside his assembled options? You can say some good things about Alston, but I focus on the 17th-ranked offense and think that PG decision-making is a big part of that. I can't see a team outside the top ten in offensive efficiency making it to the conference finals, much less winning the title. Barry is probably just a bit player, and Artest was just neutral on adjusted offensive +/-, so I don't expect his addition to vault the Rockets' offense forward.


Pure Perimeter and Scorer's Opposite continue to look very good. Some analysis within the types, to see where within the spectrum the team results are best, would also be interesting, or at least a confirmation of guesses from the type scores. Do the best results within a type clearly lean toward one type line or the other, or is it pretty scattered?


PostPosted: Fri Apr 15, 2011 7:32 pm 
Offline

Joined: Thu Apr 14, 2011 11:10 pm
Posts: 2441
page 4 of 4 missing

