APBRmetrics

Posted: **Thu Jan 15, 2015 7:59 am**

Context matters. Ok, how many significant ones are there?

At a real broad-brush level, I'd say there are good, average and bad teams. Offensively biased, balanced (or neutral) and defensively biased. Take the combination and you have a matrix of 9 contexts. How much further to go? Look at three rates of inside scoring (field goals plus FT) and three rates of 3pt strength, add that detail, and you have 81 contexts. Enough? If not, maybe add three levels of movement and passing. Now you have 243. Add the flip side, inside scoring and 3pt rates on defense, and you have almost 2200 (2187) different possible contexts. Well, now it is probably time to turn and whittle it down. There are only 30 teams. Unless you want to take context to the lineup level (and I wouldn't yet), you only have 30 team-level average contexts. And if you did a cluster analysis you might only have 10-15. Long way around and back, but to understand context I think it is worth trying to grapple with these 7 criteria. I didn't even include turnovers and rebounds. You could, but hey, I am simplifying. 2200 possible contexts seems like enough without going into 5-6 digits for these criteria or coaching or PG style. Especially given the intended collapse back into 10, 15 or at most 30.
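A minimal sketch of the counting above, assuming three levels for each of the 7 criteria. The criterion names and the low/mid/high labels are made up for illustration; only the counts come from the post.

```python
from itertools import product

# Three levels per criterion (labels are illustrative assumptions).
LEVELS = ("low", "mid", "high")

# The 7 criteria named in the post (identifier names are hypothetical).
CRITERIA = (
    "team_quality",        # good / average / bad
    "off_def_bias",        # offensive / balanced / defensive
    "inside_scoring",      # offense
    "three_point",         # offense
    "movement_passing",
    "def_inside_scoring",  # allowed on defense
    "def_three_point",     # allowed on defense
)


def all_contexts():
    """Enumerate every combination of levels across the criteria."""
    return [dict(zip(CRITERIA, combo))
            for combo in product(LEVELS, repeat=len(CRITERIA))]


contexts = all_contexts()

# Running totals as criteria are added: 2, 4, 5, then all 7.
running = [len(LEVELS) ** k for k in (2, 4, 5, 7)]

print(len(contexts))  # 2187, the "almost 2200"
print(running)        # [9, 81, 243, 2187]
```

With only 30 teams, at most 30 of these 2187 cells can be occupied in a season, which is why collapsing back to 10-15 clusters is the practical endpoint.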

Anyone else have interest in this topic and / or have a different approach?

Posted: **Thu Jan 15, 2015 8:43 am**

If you did the team-level analysis and wanted to proceed to lineup level, then the 2200 possible contexts come back into play. Maybe with lineup cluster analysis, you decide to find the 25-60 most common / similar. Maybe you simplify even further. With 30 teams and probably 12-16,000 total lineups, that is a pretty big clustering task. Alternatively you could focus on each team's 5-10 most used lineups or roll them up into bundles of lineups with 4-5 starters, 2-3, or 1 & none.
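The starter-count bundling could be sketched like this. It is only an illustration: the player labels and minutes below are placeholders, not real lineup data, and the function names are hypothetical.

```python
from collections import defaultdict


def starter_bundle(lineup, starters):
    """Label a 5-man lineup by how many of the team's starters it contains."""
    n = len(set(lineup) & set(starters))
    if n >= 4:
        return "4-5 starters"
    if n >= 2:
        return "2-3 starters"
    return "0-1 starters"


def roll_up_minutes(lineup_minutes, starters):
    """Sum minutes per bundle. lineup_minutes maps a lineup tuple to minutes."""
    totals = defaultdict(float)
    for lineup, minutes in lineup_minutes.items():
        totals[starter_bundle(lineup, starters)] += minutes
    return dict(totals)


# Placeholder labels: S1..S5 are starters, B1..B5 are bench players.
starters = ("S1", "S2", "S3", "S4", "S5")
lineup_minutes = {
    ("S1", "S2", "S3", "S4", "S5"): 400.0,
    ("S1", "S2", "S3", "S4", "B1"): 150.0,
    ("S1", "S2", "B1", "B2", "B3"): 90.0,
    ("S1", "B1", "B2", "B3", "B4"): 40.0,
    ("B1", "B2", "B3", "B4", "B5"): 20.0,
}

bundles = roll_up_minutes(lineup_minutes, starters)
print(bundles)
```

That turns thousands of lineups into three bundles per team, at the cost of ignoring which starters are on the floor.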

Posted: **Thu Jan 15, 2015 6:41 pm**

If this clustering were done, the final step ideally would be graphical analysis to collapse this 7-dimension universe into a cool visual that would make people look.

Posted: **Thu Jan 15, 2015 7:18 pm**

That's interesting.

Not sure how it would be done, but it would be cool to add in offensive sets (e.g. is this a heavy pick and roll team) and defensive strategies (e.g. Portland doesn't hedge).

Once you got all of that clustering done, I think it would reveal some interesting stuff in terms of matchups. Indy vs. Houston, what should be expected there?

Posted: **Thu Jan 15, 2015 7:37 pm**

Yeah there is more of importance that could take the matrix into 5-6 digits in size. I imagine some teams look at these factors separately but I wonder if any try to define the full context and optimize for and against. Coaches will say they are doing it but are they really subconsciously dealing with all the dimensions interactively? If they are, then they don't need supercomputers... because then they are claiming to already be one.

Posted: **Tue Jan 20, 2015 3:01 am**

Took a quick look at teams by just SRS, Off. and Def. efficiency. They seemed to sort into 13 groups of 2-3 teams, with GSW a stand-alone. May post the data at a later time.

Posted: **Thu Jul 09, 2015 5:40 am**

If one focused on lineup context, one could rank order lineups by position using offensive and defensive ratings of one kind or another (RPM, BPM, win share based, blend, etc.). This way you get a set of 20 contexts on each side of the court. That is pretty manageable. And the contexts have some court geography or geometry that might enhance their credibility and usefulness.

If you stay at this level of context, you don't have good, average or bad at each position, just relative values.

Comparing the lineup APM estimates (and other metric scores and discrete stats) of lineup clusters with the same or close contexts might be interesting. What is working best on average leaguewide?

One could also try to compare players and their RPM based on the relative context quality of their lineup mix. RPM does this for you, but it might add insight or intrigue to sorta know the quality of the context a player in a position is in alongside their RPM output. Perhaps more challenging or questionable RPM estimates could be identified, or at least guessed at / wrestled with better, by looking at such a split instead of just the RPM output.

Posted: **Thu Jul 09, 2015 5:58 am**

Crow wrote: Context matters. Ok, how many significant ones are there?

...
Anyone else have interest in this topic and / or have a different approach?

Basically, every time you control for some variable statistically, you're calling that variable a significant context. So there's a whole library of statistical techniques for dealing with context and testing how significant a particular context is in your data set. From the math perspective, techniques like principal component analysis allow for data-driven clustering.

In the abstract, I think that an important dimension is the spectrum that ranges from garbage time to clutch time. As usual, I'll say that it's probably better to make decisions with a particular question in mind, rather than trying to come up with something that will be all things to all people.

Posted: **Thu Jul 09, 2015 6:04 am**

I came up with 2 ways to use the idea. One can tailor it further probably. Pursuing specific questions is generally good science. But good science or good application can come out of thinking in concepts / models without being completely controlled by the search for a specific answer to a specific question. A lot of scientific advancements spring from different / better models. I don't make any advance claim of the power of this "string theory" but thinking about it, trying to mold it and employ it might lead to some good, the intended or something short of it or unexpected. "Energy" (or Quality) in players, energy in lineup contexts, in the set of both. Don't elemental atom "particles" behave somewhat differently, at least sometimes, depending on where they are and what they are near? You can prod gently and try to redirect, dismiss or maybe find something of value in the call for further examination, separation of context and actor / agent / proton-neutron-electron. It is a stab-in-the-dark thought shared, fwiw, to perhaps someone else. People continue to say RPM is a black box without context (while RAPM factors are within reach but hardly used, probably). Instead of just saying the "charge" estimate is negative or positive and estimated at a certain size, wouldn't it help to be able to also say we find this particle in this contextual "universe" x percent of the time, this context y%, z%, etc.? The particle is a set of events and an average event / outcome. I'd rather try to think beyond the constructed average. At least for a bit. Until going back to a world of fighting for the validity and utility of even that constructed average. Understanding the detail better and using that understanding better is a tall task but eventually that is the call. Model, then go beyond to a richer understanding. As another recent thread shows, it is hard to generate interest in RAPM splits publicly.
Me, I'd rather see a player's RAPM splits in these 20 lineup contexts or maybe 5 context clusters than not see them. And same for player types and league averages for the contexts and context clusters. Howl at the moon.
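The "x percent of the time, y%, z%" idea is simple to sketch. This assumes you already have a player's minutes tagged by context; the context names and minute totals here are placeholders, not real data, and `context_shares` is a hypothetical helper.

```python
def context_shares(minutes_by_context):
    """Return the percent of a player's minutes spent in each context."""
    total = sum(minutes_by_context.values())
    if total <= 0:
        raise ValueError("player has no minutes in any context")
    return {ctx: 100.0 * mins / total
            for ctx, mins in minutes_by_context.items()}


# Placeholder minutes for one player across three context clusters.
example_minutes = {"context_x": 600.0, "context_y": 300.0, "context_z": 100.0}

shares = context_shares(example_minutes)
for ctx, pct in shares.items():
    print(f"{ctx}: {pct:.1f}% of minutes")
```

A RAPM estimate per context would sit next to each share, so the overall number can be read as a minutes-weighted blend rather than a single "charge".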

Posted: **Thu Jul 09, 2015 7:05 am**

Ah... My 20 contexts aren't 20. I thought of the math wrong.

Is it 120 instead? That would blow the reasonable-level-of-split argument. Late-night thinking can be more free and creative but mistakes can happen too.

Or maybe you get back to a manageable 24 contexts by saying the "context" is just the other 4 without the 5th element player. Or settle for 24 contexts per position.

Or save the exercise by folding the contexts into some manageable and distinctive enough context clusters as suggested previously.

Or maybe replace with a model of PG-wings-bigs, with the contributions of the latter two roles aggregated. That would be way simpler (6 splits? Who would entertain 6 splits? Anybody else?). Cruder. But perhaps still of some use value.
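One reading that reproduces the 120, 24 and 6 figures: treat a "context" as the rank order of positions (or roles) from best-rated to worst-rated, so the counts are just permutations. That interpretation is an assumption, not something spelled out above.

```python
from itertools import permutations

POSITIONS = ("PG", "SG", "SF", "PF", "C")
ROLES = ("PG", "wings", "bigs")  # wings and bigs each aggregated

# Every ordering of the five positions from best-rated to worst-rated.
full_orderings = list(permutations(POSITIONS))


def teammate_orderings(player_position):
    """Orderings of the other four positions, leaving the studied player out."""
    others = [p for p in POSITIONS if p != player_position]
    return list(permutations(others))


role_orderings = list(permutations(ROLES))

print(len(full_orderings))            # 5! = 120
print(len(teammate_orderings("PG")))  # 4! = 24 per position
print(len(role_orderings))            # 3! = 6
```

Under this reading, 24 per position across five positions is still 120 in total, so the real savings come from the 6-split role model or from clustering.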

On a future day, after more thought. This turns out to be attempted visioning, rather than destination selected, directions set, march, find out something. Not surprised. Seems like I drift into trying something akin to this every so often. Far as I could take and twist this old topic at the moment for my own use. Before I sleep.

Then back to simpler applied science without so much theory and modeling.

Posted: **Thu Jul 09, 2015 8:12 am**

Overall RPM can and I think should be used in player evaluation related to acquisition, as argued in another thread.

Overall RPM can be used in lineup and rotation construction, of course with other stats, metrics, coaching concepts, video and memory.

But using overall RPM may not be the best or last option. Critics / skeptics of RPM would be right to say that players don't play in their average context and deliver an average impact. They play in specific contexts or context clusters and deliver varying performances in them. The construction of RPM is a useful thing but its subsequent deconstruction to adjusted performance levels in context clusters, fraught with estimated error, would allow the analyst to say: using the data samples available closest to this lineup or for this player in this context, this is our best guess of performance adjusted for other players. Isn't that most likely the thing that coaches want to hear, might listen to, might use? Would they listen more than they are listening to overall RPM or the less refined, trustworthy raw plus minus? Worth a try I'd think.

Using overall RPM averages (or any average metric, or average judgment from video, or composite analytic and/or subjective evaluation) you could say: Smart, Thomas, Olynyk and Zeller are all good. I'll play them in any and every combination with confidence. Meanwhile the adjusted data for certain pairings may be (likely is, I think) great for some combinations vs. others. (If the data says it is all the same, then you have the data saying that rather than just assuming.) Sample size still constrains data interpretation and reliability but better to think about that adjusted data than not. And by reviewing the adjusted data you can optimize your testing of the best lineups and player context matches to confirm or deny their initial usage priority level. It becomes operational optimization / art after the RPM build and split/clustering. You need the full cycle.

If coaches are refusing to at least consider RPM, some of the rationales are: refusing to use THAT average, which they don't fully understand or trust to be good enough; believing their memory or judgment or intuition is capable of optimizing every possible circumstance better than that average; and thinking the raw data and their judgment are good enough without actually compiling and adjusting all the data, instead making what appears to be a thoughtful judgment over what they can remember and adjust mentally. The hundreds of lineups and contexts over many hundreds of stints. Are coaches supercomputers? Sorta but not exactly. Looking at adjusted context clusters should be explained as doing what they are trying to do / already doing, without them having to do as much of the tedious background recall and math. They still get to decide what feels right in the moment (for now); we are just helping them get to the point where the art starts, faster and probably surer. Yes, we are here to help.

Posted: **Mon Jul 13, 2015 11:31 pm**

Lacking adjusted lineup context cluster splits at this time, I tried looking at bigger-minute, positive raw plus minus lineup data, multiseason, for players. I counted the number of "creators" and "true bigs" on court with the studied player and also looked at the studied player's usage rank. Of course there is plenty more one could look at, but I found this to be a useful context start, with some strong trends found, both expected and not so obvious.
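A rough sketch of that kind of tally, grouping a player's lineups by how many creators and true bigs share the floor. The field names and the records are placeholders for illustration only; no real lineup data is shown, and how a "creator" or "true big" gets tagged is left to the analyst.

```python
from collections import defaultdict


def summarize_by_context(lineups):
    """Group lineups by (creators, true_bigs) and total their raw plus minus.

    Each record is a dict with hypothetical keys:
    'creators', 'true_bigs', 'minutes', 'plus_minus'.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # [minutes, plus_minus]
    for rec in lineups:
        key = (rec["creators"], rec["true_bigs"])
        totals[key][0] += rec["minutes"]
        totals[key][1] += rec["plus_minus"]
    return {
        key: {"minutes": mins, "pm_per_48": 48.0 * pm / mins}
        for key, (mins, pm) in totals.items()
        if mins > 0
    }


# Placeholder records for one studied player (not real data).
placeholder_lineups = [
    {"creators": 2, "true_bigs": 1, "minutes": 100.0, "plus_minus": 25.0},
    {"creators": 2, "true_bigs": 1, "minutes": 140.0, "plus_minus": 35.0},
    {"creators": 1, "true_bigs": 2, "minutes": 120.0, "plus_minus": -10.0},
]

summary = summarize_by_context(placeholder_lineups)
for key, stats in sorted(summary.items()):
    print(key, stats)
```

This is raw plus minus, so it carries none of the teammate and opponent adjustment that RPM or RAPM splits would.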

Posted: **Tue Jul 14, 2015 10:33 pm**

/soapbox

Everything is "context" to some extent. This is the curse of dimensionality at work. The more we drill down into these stats, the worse it will seemingly become. Not that we know less in an absolute sense, but the more we know (and don't account for in our models) the more non-analytics-believers can use "context" as an excuse for continued ignorance.

/soapbox

Posted: **Mon Jul 20, 2015 3:37 am**

Different ways to slice it. Continuing to try to find the right usage and a manageable level of detail, I could define the context set largely as: the top 7 most used team lineups and a weighted composite of the rest, and the same for other teams, with a focus on the best 8 teams. 232 matchup contexts, of which 32-64 matter most. And again you could collapse contexts into fewer context clusters with enough similarity to each other.
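One way the 232 and 32-64 figures could come out, worked as arithmetic. This is a guess at the counting, not a statement of how it was actually done: it assumes 8 lineup contexts per opponent (top 7 plus the composite) across 29 opponents, and that "matter most" means the best 8 teams with either all 8 or just the top 4 of their lineup contexts.

```python
N_TEAMS = 30
TOP_LINEUPS = 7

# Top 7 lineups plus one weighted composite of everything else.
contexts_per_team = TOP_LINEUPS + 1

# From one team's point of view, every other team is an opponent.
opponents = N_TEAMS - 1
matchup_contexts = opponents * contexts_per_team

# Focus on the best 8 teams (assumed reading of "matter most").
BEST_TEAMS = 8
vs_best_all_contexts = BEST_TEAMS * contexts_per_team
vs_best_top4_only = BEST_TEAMS * 4  # assumption: only their top 4 lineups

print(matchup_contexts)                         # 232
print(vs_best_top4_only, vs_best_all_contexts)  # 32 64
```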

How many "contexts" are there?
