Box score stats vs. expected, by in-game situation

TeamEd
Posts: 21
Joined: Tue Jan 27, 2015 2:33 pm
Location: Toronto

Box score stats vs. expected, by in-game situation

Post by TeamEd »

I had a lot of fun with this:

I've been working with a database of player in-game splits going back to 1996-97 that I pulled from NBA.com. It’s a big database.

I've been using it to try to figure out ways to analyze player mentality by game situation. Basically, I want to show how players change their approach by score. Do some players look to shoot more in comebacks? Do they pass more or less aggressively if the score is close? I think with this dataset I'm able to measure how players change their games, and that feels exciting.

You can see some of my earlier attempts in an earlier thread: viewtopic.php?f=2&t=8847&start=15

In that thread, I came up with a measure I called Mentality that tells, as an example, how a player changes his shooting rate in comebacks. A Shooting Mentality of +5 says a player takes 5% more true shot attempts in comebacks than you would expect. (There are nuances in this, as I adjusted for league averages, but that essentially covers the old idea).

That approach had a couple of problems:

- First, it introduced a completely new stat that’s tricky to grasp. A +5 shooting mentality is interesting, but it doesn’t relate easily back to any other stat.

- Second, because it’s a ratio that I adjust to seasonal league averages, it's difficult to get any sort of accurate career-spanning number. That makes it tricky to properly categorize players.

So, I’ve come up with a new approach with the same data. Lacking a better name, I’m calling this method "Productivity vs. Expected." It works for any box score stat. So assists, for example, become "Assists vs. Expected." The final number is not a ratio, it's assists. Which is easy.

You can see a sample of my data in my Tableau charts for comeback situations. A comeback situation is any game score where a player is behind or tied. I’ll pop down the main link, and then explain my formulas.
https://public.tableau.com/views/NBAPro ... _count=yes

The charts show how many more shots, assists, rebounds etc. a player has recorded in an in-game situation than you would expect if his numbers were distributed evenly according to his minutes played in that situation.

For comebacks, the basic formula for Shots vs. Expected is this:

Shots vs. Expected = TSAbehind - TSAexpected (TSA is true shot attempts)

Where TSAexpected is: TSA * ((MPbehind / MP) * (League Adjustment * 0.01 + 1))

Where League Adjustment is: ((TSALeague / MPLeague) - (TSALeagueBehind / MPLeagueBehind)) / (TSALeague / MPLeague) * -100

I still use a league adjustment because there are severe leaguewide trends in many box stats. This season, the NBA has seen 5.4% more shot attempts recorded in comebacks, vs overall, for instance.

The final measure tells the number of shots a player has taken above what you’d expect. The beauty of this method is that instead of a ratio the final unit is simply shots.
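
Here is a minimal Python sketch of the three formulas above, for anyone who wants to check the arithmetic. The function and argument names are made up for illustration, and the sample numbers are invented (chosen so the league adjustment comes out at the 5.4% mentioned above); they are not from the NBA.com database.

```python
def league_adjustment(tsa_league, mp_league, tsa_league_behind, mp_league_behind):
    """Percent by which the league's TSA-per-minute rate when behind or tied
    differs from its overall rate (positive = more shots when behind)."""
    overall_rate = tsa_league / mp_league
    behind_rate = tsa_league_behind / mp_league_behind
    return (overall_rate - behind_rate) / overall_rate * -100


def tsa_expected(tsa, mp, mp_behind, adjustment_pct):
    """Season TSA scaled by the share of minutes played behind or tied,
    then nudged by the league adjustment."""
    return tsa * ((mp_behind / mp) * (adjustment_pct * 0.01 + 1))


def shots_vs_expected(tsa_behind, tsa, mp, mp_behind, adjustment_pct):
    """Shots vs. Expected, in units of true shot attempts."""
    return tsa_behind - tsa_expected(tsa, mp, mp_behind, adjustment_pct)


# Invented league totals: 0.500 TSA per minute overall, 0.527 when behind or tied.
adj = league_adjustment(1000, 2000, 527, 1000)  # about 5.4

# Invented player: 1000 TSA in 2000 minutes, 560 of them in 1000 minutes behind or tied.
extra = shots_vs_expected(560, 1000, 2000, 1000, adj)  # about 33 shots above expected
```

Without the league adjustment this invented player would show 60 extra shots; the adjustment trims that to about 33 because the whole league shoots more when behind.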

It tells me that this season, JJ Redick (of all people) leads the league, having taken 38 more TSAs than expected in comebacks. Players like Kyle Lowry, Jamal Crawford and Mike Conley also rank highly -- Lowry is the guy who sparked my whole obsession with in-game splits.

Meanwhile, James Harden falls at the bottom of the chart, having taken 57 fewer TSAs when trailing this season than you would expect (a Shots vs. Expected of -57). Also ranking near the bottom are Rudy Gay, Lou Williams and Gordon Hayward.

My tableau charts also include charts for:
- Shots: https://public.tableau.com/shared/5SCFW ... _count=yes
- Assists: https://public.tableau.com/shared/5YN49 ... _count=yes
- Rebounds: https://public.tableau.com/shared/C7MJ2 ... _count=yes
- Steals: https://public.tableau.com/shared/TWKZ3 ... _count=yes
- Turnovers: https://public.tableau.com/shared/H2DWK ... _count=yes
- Blocks: https://public.tableau.com/shared/MCQ3D ... _count=yes

The data goes back to the 1996-97 season. So, there’s a lot there to look at. You can also filter by team. I’ve found it interesting to look at team comeback approaches. Look at how the Grizzlies look off Z-Bo for Conley, for instance: https://public.tableau.com/shared/SJZ5C ... _count=yes

The same formulas work for any other counting stat (that’s in the NBA.com in-game splits database). I've got a few more that aren't in the Tableau.

The process also works for other in-game situations. I’m putting together a similar package for close score situations (+/- 5 points) that’ll follow this one, for instance.

Lastly, the final numbers can also be turned into a per game or per minute stat or used as a counting stat in and of itself.
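
As a rough sketch of that conversion in Python: the choice of denominator here (minutes or games played in the situation) is my assumption for illustration, and the function names and the 1,000-minute figure are invented.

```python
def per_36(vs_expected, minutes_in_situation):
    """Scale a Productivity vs. Expected total to a per-36-minute rate.
    Assumes the denominator is minutes played in the situation itself."""
    return vs_expected / minutes_in_situation * 36


def per_game(vs_expected, games_played):
    """Spread a Productivity vs. Expected total across games played."""
    return vs_expected / games_played


# A player who is +38 shots over a hypothetical 1,000 minutes behind or tied:
rate = per_36(38, 1000)  # about 1.37 extra shots per 36 minutes
```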

It’s with this last use that I’ll finish.

As a counting stat, career Productivity vs. Expected reveals players who have consistently become more involved in certain game situations. A player who’s recorded many years of positive Shots vs. Expected has a track record of being active in comebacks, for example. You would think of these players as those who look to their own offence in comebacks or who put their teams on their backs.
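
Because the unit is shots (or assists, rebounds, etc.) and not a ratio, a career figure is just the sum of the season figures. A small Python sketch of that roll-up, with a hypothetical row layout and invented numbers:

```python
from collections import defaultdict


def career_totals(season_rows):
    """Sum season-level vs. Expected values into one career total per player.
    Each row is (player, season, value); the layout is assumed for illustration."""
    totals = defaultdict(float)
    for player, _season, value in season_rows:
        totals[player] += value
    return dict(totals)


# Invented season values, not taken from the database.
rows = [
    ("Player A", "2012-13", 12.5),
    ("Player A", "2013-14", 20.0),
    ("Player B", "2013-14", -8.0),
]
careers = career_totals(rows)  # {"Player A": 32.5, "Player B": -8.0}
```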

In turn, the career numbers reveal who’s excelled in the opposite situation. A player who’s less productive in comebacks vs. his overall numbers, for instance, is necessarily more productive with a lead. You could fairly call these players ‘frontrunners.’
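
To see why the two situations mirror each other, here is a small Python check using the unadjusted version of the formula (league adjustment left out, which is a simplification) and invented numbers. If every minute is either "behind or tied" or "ahead," the two surpluses cancel exactly; with the league adjustment applied they need not cancel to the last decimal.

```python
def surplus(stat_in_situation, stat_total, mp_in_situation, mp_total):
    """Unadjusted stat vs. expected for one situation (no league adjustment)."""
    return stat_in_situation - stat_total * (mp_in_situation / mp_total)


# Invented player: 1000 TSA in 2000 minutes, split evenly between the two situations.
tsa_total, mp_total = 1000, 2000
tsa_behind, mp_behind = 560, 1000
tsa_ahead, mp_ahead = tsa_total - tsa_behind, mp_total - mp_behind

behind = surplus(tsa_behind, tsa_total, mp_behind, mp_total)  # +60
ahead = surplus(tsa_ahead, tsa_total, mp_ahead, mp_total)     # -60
```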

And, this is where I’m having so much fun with this data:

This next Tableau set lists players against each other by accumulated career shots, assists, rebounds etc. vs expected. https://public.tableau.com/views/NBACar ... _count=yes

(The other tabs in this Tableau chart career numbers in each stat alongside per36 numbers. Be sure to check those out too.)

I love this chart.

As you can see, there are some really fun results here. Tim Duncan is a monster in shots vs. expected in comebacks, both overall and per36. Nash likewise is a monster in both shots and assists. (Duncan meanwhile is merely excellent in assists vs. expected.) These two guys have consistently put their teams on their backs.

Other guys who rank highly in both include Ray Allen, Dirk, Fisher and for better or for worse, Josh Smith.

Meanwhile, I’ve got some really interesting guys at the bottom of the career charts. McGrady, Iverson and Prince have poor comeback shot numbers, for instance.

Comparing the names at the top of the list to those at the bottom, I can’t help but think I’m on to something here.

It’s worth noting Lebron rates pretty poorly on Shots vs. Expected, though you can see he’s tended to record more assists in comebacks to make up for it.

Some other outliers:

- Andre Miller has easily recorded the fewest assists vs. expected.

- Mozgov is recording many more rebounds per36 than anyone with his number of total rebounds.

- Hibbert, meanwhile, is down at the bottom of that chart, along with Lebron.

- Kobe and Iverson have recorded the fewest Steals vs. Expected. Lebron ranks poorly here too.

- Kidd recorded the fewest Turnovers vs. Expected, by far. Nash ranks near the top, which isn't a surprise considering his extra assists.

- And, for whatever reason, Mutombo ranks poorly here on Blocks vs. Expected. It’s hard to know if that’s a fair figure, since his career started before my database does. Something to work on.

Overall, I’m giggling with excitement over the potential of this data.

What do you guys think?

(Oh, and I’m aware there’s a problem with Glenn Robinson’s numbers)
@EdTubb edwardtubb at gmail
boooeee
Posts: 88
Joined: Sun Jan 22, 2012 5:32 am

Re: Box score stats vs. expected, by in-game situation

Post by boooeee »

Very interesting stuff Ed. Can you define what a "comeback" situation is? Is it whenever the team is trailing? Or only when they trail but ultimately win? Or only when trailing by a certain amount?
TeamEd
Posts: 21
Joined: Tue Jan 27, 2015 2:33 pm
Location: Toronto

Re: Box score stats vs. expected, by in-game situation

Post by TeamEd »

boooeee wrote:Very interesting stuff Ed. Can you define what a "comeback" situation is? Is it whenever the team is trailing? Or only when they trail but ultimately win? Or only when trailing by a certain amount?
A "comeback" for the purpose of these stats is any score in which a team isn't winning. So, behind or tied at any time in a game no matter the outcome, including garbage time if applicable.

It's maybe not the best way to do it, but I'm happy with it as a proof of concept. Basically, I'm using a broad definition to get the largest sample size I can, to try to avoid the kind of issues that crop up with clutch stats.
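
For anyone working from play-by-play instead of the pre-aggregated NBA.com splits, the definition boils down to a one-line test on the score. A Python sketch, with hypothetical function names; the inclusive 5-point window for the planned close-score split is my assumption.

```python
def is_comeback(team_score, opponent_score):
    """Comeback situation as defined here: the team is behind or tied,
    at any point in the game, regardless of the final outcome."""
    return team_score <= opponent_score


def is_close(team_score, opponent_score, window=5):
    """Close-score situation: margin within +/- `window` points (inclusive is assumed)."""
    return abs(team_score - opponent_score) <= window


is_comeback(88, 88)  # True: tied counts
is_comeback(90, 88)  # False: the team is ahead
is_close(80, 86)     # False: a 6-point margin is outside the window
```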
Mike G
Posts: 6144
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Box score stats vs. expected, by in-game situation

Post by Mike G »

Sorry if it's obvious, but:
When a player isn't shooting (passing, etc) are you sure he's on the floor?
TeamEd
Posts: 21
Joined: Tue Jan 27, 2015 2:33 pm
Location: Toronto

Re: Box score stats vs. expected, by in-game situation

Post by TeamEd »

Mike G wrote:Sorry if it's obvious, but:
When a player isn't shooting (passing, etc) are you sure he's on the floor?
Yes. It's based on this data: http://stats.nba.com/player/#!/201933/s ... lit=ingame
NateTG
Posts: 72
Joined: Thu Dec 13, 2012 9:11 pm

Re: Box score stats vs. expected, by in-game situation

Post by NateTG »

It came up in another thread, but since you have split data, can you easily test the correlation of second half true shooting with first half usage?
TeamEd
Posts: 21
Joined: Tue Jan 27, 2015 2:33 pm
Location: Toronto

Re: Box score stats vs. expected, by in-game situation

Post by TeamEd »

NateTG wrote:It came up in another thread, but since you have split data, can you easily test the correlation of second half true shooting with first half usage?
I don't have per game data, just full-season data with splits by score. So, if you mean "can I correlate how first half usage affects second half shooting on a game-by-game basis," I can't.

That said, there's potential to scrape game-by-game data from NBA.com. Their javascript interface allows you to filter by date, which means I can make that same request to pull stats from a single date. The method I use would let me pull a game-by-game database for every player, which would make that analysis possible. But, I mean... that would make my database 82 times larger than it is, and I don't really have the means to run those lookups efficiently. And that's the best I could possibly do with NBA.com's data. When other people have per-play data, there isn't really a point to only going halfway.
NateTG
Posts: 72
Joined: Thu Dec 13, 2012 9:11 pm

Re: Box score stats vs. expected, by in-game situation

Post by NateTG »

TeamEd wrote:...
That said, there's potential to scrape game-by-game data from NBA.com. Their javascript interface allows you to filter by date, which means I can make that same request to pull stats from a single date. ...
Ah, I thought it might be easy. I've been grinding away at converting the play-by-play data from basketballvalue into a more consistent format, so I should just do that fundamental work and answer my own question.
TeamEd
Posts: 21
Joined: Tue Jan 27, 2015 2:33 pm
Location: Toronto

Re: Box score stats vs. expected, by in-game situation

Post by TeamEd »

One last demonstration before I do a full treatment of close score situations.

This is a simple Tableau of the data plotting Shots vs. Expected with Assists vs. Expected in a scatter chart.

Here's the combined data from 1996 to present: https://public.tableau.com/shared/MBG6B ... _count=yes

And here's the data for this season: https://public.tableau.com/views/NBAAss ... _count=yes

This is the kind of visual representation of mentality I've been looking to present for a while now.

With Shots on the X and Assists on the Y, I can start to separate out players by type.

The top right quarter is the "puts the team on his back" quarter. Those players record more shots and assists when trailing or tied. This season, the most extreme of this type are Mike Conley, Reggie Jackson and (for better or for worse) Josh Smith. Since 1996, the two most extreme outliers here are Tim Duncan and Steve Nash. Ray Allen and Stockton also rank highly. (I'd imagine with more data before 1996 Stockton might have totals nearer to Nash and Duncan.)

The bottom right quarter is the "I got this" quarter. They take more shots, but record fewer assists. Kyle Lowry, Jamal Crawford and JJ Barea are the leaders this season. Overall, Andre Miller is far and away the most extreme member of this quarter. He along with Nash and Duncan are the most extreme outliers in this chart. Also with big totals here are Kidd and Durant.

The top left quarter is the "one more pass" quarter. They take fewer shots but record more assists. James Harden runs away with this quarter this year. Also up there are Wall, Rudy Gay and Blake. For the total numbers Sam Cassell and Mike Miller rank highly here.

Finally, the bottom left quarter is the "let's go team!" quarter. They take fewer shots and record fewer assists. This season Lebron James, Ty Lawson and Devin Harris lead in this quarter. Interestingly though, when you look at the career chart, no one has racked up big totals in this quarter. Kevin Martin leads, but he's not far from the pack.

For his career, Lebron is solidly in the "one more pass" quarter. He's +42 on Assists vs. Expected and -105 on Shots vs. Expected.
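
The four quarters can be written down as a simple lookup on the signs of the two numbers. A Python sketch; the function name is made up, and treating exactly zero as "more" is an arbitrary choice on my part.

```python
def quadrant(shots_vs_expected, assists_vs_expected):
    """Name the scatter-chart quarter for a player.
    Shots vs. Expected is the X axis, Assists vs. Expected the Y axis.
    A value of exactly zero is counted as positive (arbitrary tie-break)."""
    more_shots = shots_vs_expected >= 0
    more_assists = assists_vs_expected >= 0
    if more_shots and more_assists:
        return "puts the team on his back"  # top right
    if more_shots:
        return "I got this"                 # bottom right
    if more_assists:
        return "one more pass"              # top left
    return "let's go team!"                 # bottom left


# Lebron's career line from above: -105 shots, +42 assists.
quadrant(-105, 42)  # "one more pass"
```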

One last group I want to draw your attention to are the players who're most extreme in the negative on the shooting axis on the career chart. These guys: https://public.tableau.com/profile/edtu ... /JM2ZWTBTZ
They shoot less in comebacks but don't change their assist rate to compensate. It's an interesting group in that little chevron shape on the left side: Gary Payton, Tracy McGrady, Latrell Sprewell, Allen Iverson, Elton Brand and Antawn Jamison. There are a lot of shots in that group... but only a couple of past-their-prime championships.