Positions in 2D

DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Positions in 2D

Post by DSMok1 »

As part of my work on revising Box Plus/Minus, I am investigating using a two-dimensional position spectrum rather than the conventional positions.

I started down this road with BPM 2.0, and I am moving further in that direction with the next revision of BPM.

The two dimensions I am proposing are generally Size and Creation.

These generally line up with the first two components of various principal component analyses I have done or seen when evaluating types and roles of basketball players.

I am defining the size dimension based on the percentage of the team's rebounds and blocks the player accumulates when they are on the floor. Defining everything in the context of the team keeps the focus on the role of the player, hence the position of the player on that team. It also allows this approach to be flexible across any league.
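To make "percentage of the team's stat while on the floor" concrete, here is a minimal sketch of the usual box-score estimate: prorate the team total to the player's minutes, then divide. The function name and the sample numbers are hypothetical, and this is an assumption about the mechanics, not necessarily the exact calculation used for BPM.

```python
def on_floor_share(player_stat, player_minutes, team_stat, team_minutes):
    """Estimate the share of a team stat a player collects while on the floor.

    team_minutes is the team's total player-minutes (5 x minutes played by
    the team), so team_minutes / 5 is the time the team was on the court.
    """
    if player_minutes <= 0 or team_stat <= 0:
        return 0.0
    # Team total prorated to the minutes this player was actually on the floor.
    team_stat_while_on = team_stat * player_minutes / (team_minutes / 5)
    return player_stat / team_stat_while_on


# Hypothetical season line: 800 of the team's 3600 rebounds in 2400 minutes,
# for a team that logged 19,800 total player-minutes.
share = on_floor_share(800, 2400, 3600, 19800)
print(f"{share:.3f}")
```

Because the estimate only needs season totals and minutes, it works for any league that publishes a box score.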

I am defining the creation dimension based on the percentage of the team's points and assists the player accumulates when they are on the floor, with an added bonus when the points are scored efficiently relative to the team's average true shooting percentage. A high value on this dimension generally indicates a player who is good on offense.

For both of these dimensions I am adjusting for the variance of the two metrics being averaged. The variance of assists and of blocks is divided by two because there is a much greater spread in those percentages. (This basically means I'm using z-scores of the two components, but informally.)

As always for percentages of team production, the league average is 20%.
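A short derivation of the 20% figure, assuming on-floor shares are estimated from season totals and minutes (the symbols are my own notation, not from BPM): let x_i be a player's stat total, X the sum of x_i over the roster, m_i the player's minutes, and T the sum of m_i (team player-minutes). The minutes-weighted average share is then one fifth, because five players split the floor at all times.

```latex
\[
s_i = \frac{x_i}{X}\cdot\frac{T/5}{m_i},
\qquad
\frac{\sum_i m_i\, s_i}{\sum_i m_i}
  = \frac{1}{T}\sum_i \frac{x_i}{X}\cdot\frac{T}{5}
  = \frac{1}{5}\cdot\frac{\sum_i x_i}{X}
  = \frac{1}{5} = 20\%.
\]
```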

I am then transforming the resultant percentages for each of these two dimensions into a one to five scale, capping outliers at 1.0 and 5.0 in that dimension. The exact capping bounds I have not nailed down yet.

Additionally, I am converting the resultant creation position to a letter for discussion purposes. A pure creator would be an A, with a secondary creator as a B, continuing on to E for the players with no creation ability.

Note that because of the way I am evaluating these, neither dimension has players evenly distributed along it. There are far fewer A creators than D or E creators. Similarly, there are far fewer pure 5s in the size dimension than 1s and 2s.

Here is a visualization showing all of the players in the NBA over the last 43 years and where they would fall on this two-dimensional position spectrum:

https://public.tableau.com/views/ofNBAS ... share_link

Thoughts?
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Positions in 2D

Post by Crow »

What are the scales of the axes in the visualization (and why not listed)?

The data is normalized? What is the brief explanation of why and how?



What would you think of a 3rd dimension being the context of % time mainly with starters to time with mainly bench? Is that already a factor simply off % minutes and average utilization patterns?


Size & creation by position overall... versus % of time at each different position with different relative size and possibly different creation levels and then an average position and size / creation description based on the specific positions?

Whose position list are you starting with?


Still declining to use pbp and tracking data to facilitate application of formula to non-NBA contexts? How available / acceptable / usable to you are pbp and tracking data for NCAA? Is anyone using BPM beyond NBA / G League / NCAA? Could you still have a standard baseline BPM and bolt on pbp and tracking data for a BPM+ for NBA?

D-BPM and the underlying shot defense component is really weak as currently based on all of team minutes and not exclusively when player is on court.

Tracking data and possibly pbp could get at player "position / location" on the court, especially at time of possession final action.

Your definition of size is exclusively determined by defensive markers? Why not size classification separately by behavior on both sides of the court? Cases where the 2 are not the same are important / interesting.

"I am defining the size dimension based on the percentage of the team's rebounds and blocks the player accumulates when they are on the floor." So are you using pbp then or is it still share based on total team time?

Dividing player into big / small on size and more / less on creation, which quads gain or lose on average in New BPM vs. current? Are the hybrids "hurt" / "more fairly scored"?


Now or later, what other aspects of BPM formula are under review for possible change / possible discussion?
RowRowFan
Posts: 11
Joined: Wed Jan 03, 2024 5:06 am

Re: Positions in 2D

Post by RowRowFan »

DSMok1 wrote: Sat Jan 06, 2024 4:21 pm As part of my work on revising Box Plus/Minus, I am investigating using a two-dimensional position spectrum rather than the conventional positions.


In the context of roles, it would of course not work before this data existed, but maybe incorporate Synergy offensive play-type frequencies to go even more in depth.

IIRC BBI uses Second Spectrum for some of their defensive roles too (I'm not sure how else they would define a "Chaser", for example), and maybe that's something to look into as well.
Mike G
Posts: 6144
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Positions in 2D

Post by Mike G »

How will these meta-positions be used in a new version of BPM?
Is there already some 'positional' influence in the current BPM?

LeBron is shown at b-r.com as playing all 5 positions in his career. In 6 seasons as a Laker, he's shown at 4 positions. In order of BPM:

BPM   yr     pos.   WS/48   PER
8.4   2020   PG     .204    25.5
8.1   2021   PG     .179    24.2
8.0   2019   SF     .179    25.6
7.7   2022   C      .172    26.2
7.5   2024   PF     .159    23.9
6.1   2023   PF     .138    23.9
WS falls in the same order, while PER does not quite.
I don't remember LeBron playing C at all. That season was his highest 3PAr, Blk%, and Off/Def ratio in both BPM and WS. It was also his lowest Reb% and Ast% in LA.

Can you anticipate and describe briefly how a position designation would affect his BPM over this span?


EDIT -- From totaling positions as designated at b-r.com, I get these disparities in "wins" rates, per 48 min. Showing fraction of total minutes (.200 being avg) this season:

pos   %min   per   WS/48   bpm
C    .176   .140   .152   .129
PF   .203   .102   .098   .100
SF   .217   .078   .080   .078
SG   .215   .086   .081   .091
PG   .189   .112   .106   .118
A quarter-million player minutes here.
While BPM is less disparate, it still raises the question of why the positions with higher proficiency get fewer minutes on avg. Are PG and C just elite, with not enough of their minutes available?
Or is it just rare to have 2 C or 2 PG in the game at one time, while we see lineups of 4 wings and a PG (or a big) pretty often?
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Positions in 2D

Post by DSMok1 »

Crow wrote: Sun Jan 07, 2024 12:09 am What are the scales of the axes in the visualization (and why not listed)?

The data is normalized? What is the brief explanation of why and how?
The vertical scale is average(% TRB, % BLK_normalized), where normalized means the spread is divided by 2 so that the stdevs of % TRB and % BLK are similar.

The horizontal scale is average(% Pts_Adj, % AST_normalized), where normalized means the spread is divided by 2 so that the stdevs of % Pts and % AST are similar.

I reduced assists and blocks so that they do not dominate the combined metric.
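A rough sketch of that averaging, under one loud assumption: "dividing the spread by 2" is read here as shrinking each share halfway toward the 20% league average. The post does not spell out the centering, so treat `halve_spread` and both axis functions as hypothetical illustrations.

```python
LEAGUE_AVG = 0.20  # five players share the floor, so the average share is 20%


def halve_spread(pct, center=LEAGUE_AVG):
    """Shrink a share halfway toward the center (assumed to be league average)."""
    return center + (pct - center) / 2


def size_axis(trb_pct, blk_pct):
    """Vertical axis: average of %TRB and spread-halved %BLK."""
    return (trb_pct + halve_spread(blk_pct)) / 2


def creation_axis(pts_adj_pct, ast_pct):
    """Horizontal axis: average of adjusted %Pts and spread-halved %AST."""
    return (pts_adj_pct + halve_spread(ast_pct)) / 2


# A shot blocker taking 50% of on-floor blocks and 30% of rebounds:
print(round(size_axis(0.30, 0.50), 3))
```

Without the halving, the same player would land at 0.40 on the size axis, which shows how an extreme block share would otherwise dominate the combined metric.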
Crow wrote: Sun Jan 07, 2024 12:09 am What would you think of a 3rd dimension being the context of % time mainly with starters to time with mainly bench? Is that already a factor simply off % minutes and average utilization patterns?

Size & creation by position overall... versus % of time at each different position with different relative size and possibly different creation levels and then an average position and size / creation description based on the specific positions?
The 3rd principal component of team role would be shooting, measured by FT% and 3 pointers. But that's at least as much quality as role on the team. I'm not super concerned about time with or against starters, although I understand that could influence things somewhat.
Crow wrote: Sun Jan 07, 2024 12:09 am Whose position list are you starting with?
Basketball Reference
Crow wrote: Sun Jan 07, 2024 12:09 am Still declining to use pbp and tracking data to facilitate application of formula to non-NBA contexts? How available / acceptable / usable to you are pbp and tracking data for NCAA? Is anyone using BPM beyond NBA / G League / NCAA? Could you still have a standard baseline BPM and bolt on pbp and tracking data for a BPM+ for NBA?

D-BPM and the underlying shot defense component is really weak as currently based on all of team minutes and not exclusively when player is on court.

Tracking data and possibly pbp could get at player "position / location" on the court, especially at time of possession final action.
Correct. BPM is and will remain purely a box-score metric. Others can and have used additional information to build more accurate models exclusively targeted at the modern NBA. My goal is for this to be as robust and flexible a model as possible when data is limited. BPM has been used for a number of other leagues around the world, some of which I am aware of.
Crow wrote: Sun Jan 07, 2024 12:09 am Your definition of size is exclusively determined by defensive markers? Why not size classification separately by behavior on both sides of the court? Cases where the 2 are not the same are important / interesting.

"I am defining the size dimension based on the percentage of the team's rebounds and blocks the player accumulates when they are on the floor." So are you using pbp then or is it still share based on total team time?

Dividing player into big / small on size and more / less on creation, which quads gain or lose on average in New BPM vs. current? Are the hybrids "hurt" / "more fairly scored"?

Now or later, what other aspects of BPM formula are under review for possible change / possible discussion?
I did look at other statistics as part of these regressions but found little benefit. Size as defined by TRB and BLK really does a good job of capturing a player's functional size on court.

This is simply a refinement of the 2D position spectrum that was being used for BPM 2.0. I don't know yet how it will impact the final BPM regression. I am open to tweaking anything about BPM other than its basic definition of what it should be as a statistic.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Positions in 2D

Post by DSMok1 »

Mike G wrote: Sun Jan 07, 2024 2:49 pm How will these meta-positions be used in a new version of BPM?
Is there already some 'positional' influence in the current BPM?

LeBron is shown at b-r.com as playing all 5 positions in his career. In 6 seasons as a Laker, he's shown at 4 positions. In order of BPM:

BPM   yr     pos.   WS/48   PER
8.4   2020   PG     .204    25.5
8.1   2021   PG     .179    24.2
8.0   2019   SF     .179    25.6
7.7   2022   C      .172    26.2
7.5   2024   PF     .159    23.9
6.1   2023   PF     .138    23.9
WS falls in the same order, while PER does not quite.
I don't remember LeBron playing C at all. That season was his highest 3PAr, Blk%, and Off/Def ratio in both BPM and WS. It was also his lowest Reb% and Ast% in LA.

Can you anticipate and describe briefly how a position designation would affect his BPM over this span?
I have never used position designations in BPM, but rather a position spectrum estimated from the box score.

Interestingly, LeBron has never really shifted around much in the 2D position spectrum:
https://public.tableau.com/shared/QSRGF ... share_link
He's squarely a 3A player, no matter what he's called. He always plays about the same role.
Mike G wrote: Sun Jan 07, 2024 2:49 pm EDIT -- From totaling positions as designated at b-r.com, I get these disparities in "wins" rates, per 48 min. Showing fraction of total minutes (.200 being avg) this season:

pos   %min   per   WS/48   bpm
C    .176   .140   .152   .129
PF   .203   .102   .098   .100
SF   .217   .078   .080   .078
SG   .215   .086   .081   .091
PG   .189   .112   .106   .118
A quarter-million player minutes here.
While BPM is less disparate, it still raises the question of why the positions with higher proficiency get fewer minutes on avg. Are PG and C just elite, with not enough of their minutes available?
Or is it just rare to have 2 C or 2 PG in the game at one time, while we see lineups of 4 wings and a PG (or a big) pretty often?
When looking at the distributions, certain stats are not in a bell curve. Assists and blocks are not--there is a strong right skew. These skills seem to be more rare and valuable. It feels that players without defined roles (and perhaps not great skills) end up in the SF and SG category.

Thoughts?
Mike G
Posts: 6144
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Positions in 2D

Post by Mike G »

The basic question is how (and why) does 'position spectrum' influence once and future BPM?
Are rebounds/assists/ etc. more or less influential for 5A, 1C etc?
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Positions in 2D

Post by DSMok1 »

In light of the discussion here and at RealGM, I have revised my methodology significantly.
  1. For the Size dimension, the new methodology is to take % of team's (TRB + 3*BLK). In other words, blocks are worth 3x rebounds. This was found by regressing on actual player size (height and wingspan), based on a measurement dataset. Using an additive approach makes this setup far more robust. Hassan Whiteside's 2015 season is still an outlier, but that can't be helped...
  2. For the Offensive Creation dimension, I revamped the methodology and the regression basis. I compiled a large dataset from PBPstats.com to accurately measure creation--location of assists, self creation, and shooting vs. location were all included, along with team context. Using this superior basis (incidentally, Steve Nash had 4 of the top 6 seasons), I found a completely different approach to the offensive creation dimension was better.

    The new methodology is to take % of team's (AST + Pts scored above 0.85*TmTS%). In other words, the baseline is 85% of the team's true shooting percentage--points scored above this threshold are indicative of creation. Anything less is just...somebody shooting. This really highlights the importance of efficient scoring.
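The two revised numerators can be sketched as follows. The size share follows the stated (TRB + 3*BLK) formula directly. For creation, I am assuming "points above 0.85*TmTS%" means points beyond what that efficiency would yield on the player's own true shot attempts (TSA = FGA + 0.44*FTA), floored at zero; the floor and all function names are my assumptions. How the team denominator for creation is formed is not specified in the post, so it is left out. Inputs are assumed to cover only the player's on-floor time.

```python
def size_share(trb, blk, team_trb, team_blk):
    """Player's share of the team's (TRB + 3*BLK): a block counts as 3 rebounds."""
    return (trb + 3 * blk) / (team_trb + 3 * team_blk)


def true_shot_attempts(fga, fta):
    """Standard true shot attempts: FGA + 0.44 * FTA."""
    return fga + 0.44 * fta


def points_above_baseline(pts, fga, fta, team_ts, floor_at_zero=True):
    """Points beyond what 0.85 * team TS% would yield on the same attempts.

    TS% = PTS / (2 * TSA), so the baseline points are 2 * TSA * 0.85 * team_ts.
    The zero floor is an assumption, not something the post states.
    """
    baseline_pts = 2 * true_shot_attempts(fga, fta) * 0.85 * team_ts
    extra = pts - baseline_pts
    return max(extra, 0.0) if floor_at_zero else extra


# Hypothetical lines on a team with a .570 TS%.
print(round(size_share(800, 150, 3600, 400), 3))
print(round(points_above_baseline(1500, 1000, 400, 0.57), 1))  # efficient scorer
print(round(points_above_baseline(900, 1000, 300, 0.57), 1))   # below baseline
```

Under these assumptions, the third line shows a high-volume but inefficient scorer contributing nothing through scoring; any creation credit would have to come from assists.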
Using these revamped position dimensions, I plotted again the 2-Dimensional position spectrum. As before, the distribution is right-skewed, particularly for the size dimension.

Therefore, I set the position designations accordingly. I selected 10% intervals for the boundaries between positions. 10%, 20% (which is by definition average), 30%, and 40% are the separating points between the position designations (1,2,3,4,5) and (E,D,C,B,A).

For the continuous spectrum, the position dimensions will be bounded by 5% and 45%, which correspond to the range 1.0 to 5.0.
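As a minimal sketch of that mapping: a linear scale with 5% at 1.0 and 45% at 5.0 puts the 10/20/30/40% boundaries at 1.5, 2.5, 3.5, and 4.5. The function names are hypothetical, and assigning a share that sits exactly on a boundary to the higher bin is my assumption.

```python
BOUNDARIES = (0.10, 0.20, 0.30, 0.40)


def continuous_position(pct):
    """Map a team-share fraction linearly so 5% -> 1.0 and 45% -> 5.0, capped."""
    clipped = min(max(pct, 0.05), 0.45)
    return 1.0 + (clipped - 0.05) / 0.10


def bin_index(pct):
    """Return 0-4: how many of the 10/20/30/40% boundaries the share reaches."""
    return sum(pct >= b for b in BOUNDARIES)


def size_label(pct):
    """Size designation 1 (smallest role) through 5 (biggest role)."""
    return str(bin_index(pct) + 1)


def creation_label(pct):
    """Creation designation E (least creation) through A (primary creator)."""
    return "EDCBA"[bin_index(pct)]


def position_code(size_pct, creation_pct):
    """Combine both dimensions into a code such as '3A'."""
    return size_label(size_pct) + creation_label(creation_pct)


# Hypothetical player: 25% size share, 42% creation share.
print(position_code(0.25, 0.42), round(continuous_position(0.25), 2))
```

The discrete labels are for discussion; the continuous value is what a regression could consume.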

https://public.tableau.com/views/ofNBAS ... zHome=no#3

Features of the visualization:
  • Clicking on any point will provide a tooltip about that player-season and also highlight all other seasons by that player between 1990 and 2023.
  • Filter by year, team, and by minutes played.
  • Colors indicate the player's BPM, or at least their BPM as currently formulated.
If you look at the boundary seasons (the "frontier"), you will see Hassan Whiteside, Greg Oden, Joel Embiid, Nikola Jokic, Russell Westbrook, and Chris Paul.

Interestingly, Magic Johnson (1991) is also in the top 5 Creation seasons. Recall this dataset only goes back to 1990.

Think of this "Offensive Creation" dimension as identifying whom the defense will prioritize stopping in its game plan: the players it is most concerned about. Interestingly, Rudy Gobert's best seasons show up above league average creation (dimension 5C), because his efficient rim scoring was a significant offensive weapon for his team.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Positions in 2D

Post by DSMok1 »

Mike G wrote: Tue Jan 09, 2024 11:47 pm The basic question is how (and why) does 'position spectrum' influence once and future BPM?
Are rebounds/assists/ etc. more or less influential for 5A, 1C etc?
In BPM 2.0, a simple version of adjusting by position was already used for the coefficients. It makes sense to me, at least, that different roles on a team will lead to some of the box score statistics being more or less indicative of actual impact on team performance. See the writeup here: https://www.basketball-reference.com/about/bpm2.html
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Positions in 2D

Post by Crow »

Pulled up the Tableau chart for Thunder 2023. The 2D positions for some players on the chart do not match the pull-up screen.


Giddey, Bazley, JRE off by a letter. Joe off by a number.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Positions in 2D

Post by DSMok1 »

Ah good point. I forgot to update the tool tip variable. I'll fix that tomorrow.

EDIT: Fixed
nbacouchside
Posts: 151
Joined: Sun Jul 14, 2013 4:58 am

Re: Positions in 2D

Post by nbacouchside »

DSMok1 wrote: Tue Jan 16, 2024 12:43 am The new methodology is to take % of team's (AST + Pts scored above 0.85*TmTS%). In other words, the baseline is 85% of the team's true shooting percentage--points scored above this threshold are indicative of creation. Anything less is just...somebody shooting. This really highlights the importance of efficient scoring.
Using these revamped position dimensions, I plotted again the 2-Dimensional position spectrum. As before, the distribution is right-skewed, particularly for the size dimension.
Isn't the use of Pts scored above baseline introducing too much of a quality component to this? For instance, a player like Scoot Henderson this year is clearly charged with a lot of creation, but he is bad at it, so his scoring is below the .85*TmTS% threshold so he gets no credit for creation despite clearly having that role on the team. Why not use True Shot attempts with Assists? Or why not use Usage and AST?
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Positions in 2D

Post by DSMok1 »

nbacouchside wrote: Wed Jan 17, 2024 6:23 am
DSMok1 wrote: Tue Jan 16, 2024 12:43 am The new methodology is to take % of team's (AST + Pts scored above 0.85*TmTS%). In other words, the baseline is 85% of the team's true shooting percentage--points scored above this threshold are indicative of creation. Anything less is just...somebody shooting. This really highlights the importance of efficient scoring.
Using these revamped position dimensions, I plotted again the 2-Dimensional position spectrum. As before, the distribution is right-skewed, particularly for the size dimension.
Isn't the use of Pts scored above baseline introducing too much of a quality component to this? For instance, a player like Scoot Henderson this year is clearly charged with a lot of creation, but he is bad at it, so his scoring is below the .85*TmTS% threshold so he gets no credit for creation despite clearly having that role on the team. Why not use True Shot attempts with Assists? Or why not use Usage and AST?
That is a very fair point.

When I just ran this regression, the fit was definitely a lot better with this efficiency-based approach. But my regression basis was actual creation, actual advantages created, not attempted creation. So Scoot is currently attempting to create advantages but not necessarily succeeding.

We are obviously in the same situation with assists, where assists are measuring some sort of success in attempted creation.

What would be the regression basis for attempted creation versus actual creation?

Actually, I think the data set I already got from pbpstats.com would work pretty well. Assisted field goal attempts should give most of the credit to the assister... Remove shots off of offensive rebounds... And I still think I don't want to give much credit for mid-range shots. That's not much creation if you're getting a mid-range shot. You don't have to create an advantage at all to shoot a mid-ranger....
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Positions in 2D

Post by DSMok1 »

It looks like a formulation of % of team's (TSA + 5*AST) works well for offensive load/attempted creation.
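A small sketch of that offensive load formula, taking TSA as the standard FGA + 0.44*FTA. The dictionary layout, function names, and sample numbers are hypothetical, and the team totals are assumed to cover only the player's on-floor time.

```python
def offensive_load(fga, fta, ast):
    """TSA + 5*AST, where TSA = FGA + 0.44*FTA."""
    return (fga + 0.44 * fta) + 5 * ast


def load_share(player, team):
    """Player's share of the team's offensive load (attempted creation)."""
    return offensive_load(**player) / offensive_load(**team)


# Hypothetical player and on-floor team totals.
player = {"fga": 1200, "fta": 500, "ast": 600}
team = {"fga": 7200, "fta": 1900, "ast": 2100}
print(f"{load_share(player, team):.3f}")
```

Because the formula counts attempts rather than points, a high-usage but inefficient player still registers as carrying a creation role, which was the concern raised about the efficiency-based version.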

Here is the Visualization with the revision made:

https://public.tableau.com/views/ofNBAS ... zHome=no#2

Thanks for helping steer me in the right direction, Kevin!
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Positions in 2D

Post by Crow »

I dunno if there is another navigation option (there should be), but to look at one team instead of the pre-selection of all teams, you have to know to click the downward triangle icon at the top right of the team click and switch to "single value list".

A really non-obvious pathway likely to evade and annoy many.

I had to hunt to figure it out.