2000-2006 PBP, matchupfile & RAPM

DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: 2000-2006 PBP, matchupfile & RAPM

Post by DSMok1 »

Wonderful data, J.E.! Will be very useful.

Are you using BBRef ID's throughout? Do you have some sort of player database linking BBRef IDs to names and Bio data? Just asking.
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
colts18
Posts: 313
Joined: Fri Aug 31, 2012 1:52 am

Re: 2000-2006 PBP, matchupfile & RAPM

Post by colts18 »

Can you explain the prior-informed RAPM and how it works? In the non-informed data, Duncan leads from 01-07 every single year, yet KG's 2003 and 2004 prior-informed is higher than Duncan's. If you are using prior data, wouldn't Duncan be higher since he is higher in every single season?
sideshowbob
Posts: 54
Joined: Fri Apr 15, 2011 4:43 am

Re: 2000-2006 PBP, matchupfile & RAPM

Post by sideshowbob »

Awesome, as usual.

Does this mean that BBV's 2012 data is complete, and that the prior-informed 2012 data should be up soon?
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: 2000-2006 PBP, matchupfile & RAPM

Post by J.E. »

DSMok1 wrote:Are you using BBRef ID's throughout?
I plan to do that, yes
DSMok1 wrote:Do you have some sort of player database linking BBRef IDs to names and Bio data?
I'll create files which have playerID and various BoxScore stats in csv format, and I have one file that has "bbr playerID;full name", but nothing with Bio data. What would you want to use that for?
colts18 wrote:Can you explain the prior-informed RAPM and how it works? In the non-informed data, Duncan leads from 01-07 every single year, yet KG's 2003 and 2004 prior-informed is higher than Duncan's. If you are using prior data, wouldn't Duncan be higher since he is higher in every single season?
There's a discrepancy here because the data I used for computing informed RAPM was incomplete. For most of those early years it was missing >15%
sideshowbob wrote:Does this mean that BBV's 2012 data is complete, and that the prior-informed 2012 data should be up soon?
?
Everything in this thread was grabbed from BBR. I don't think bbv has updated their 2012 matchupfile(s). 2012 informed RAPM is already up, it's just missing the playoffs. Hopefully I get to test all RAPM versions (vanilla, RAPM informed RAPM, BoxScore+PBP informed RAPM) before the season and then I'll upload whatever did best
sideshowbob
Posts: 54
Joined: Fri Apr 15, 2011 4:43 am

Re: 2000-2006 PBP, matchupfile & RAPM

Post by sideshowbob »

J.E. wrote:?
Everything in this thread was grabbed from BBR. I don't think bbv has updated their 2012 matchupfile(s). 2012 informed RAPM is already up, it's just missing the playoffs. Hopefully I get to test all RAPM versions (vanilla, RAPM informed RAPM, BoxScore+PBP informed RAPM) before the season and then I'll upload whatever did best
My mistake, I meant with the playoffs, and somehow it flew over my head that this was all done with BBR data. Just checked BBV, and yeah the data still hasn't been updated to include the playoffs.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: 2000-2006 PBP, matchupfile & RAPM

Post by DSMok1 »

J.E. wrote:
DSMok1 wrote:Are you using BBRef ID's throughout?
I plan to do that, yes
DSMok1 wrote:Do you have some sort of player database linking BBRef IDs to names and Bio data?
I'll create files which have playerID and various BoxScore stats in csv format, and I have one file that has "bbr playerID;full name", but nothing with Bio data. What would you want to use that for?
I thought it might be nice to link in with these tables for exact age data: http://www.basketball-reference.com/players/a/
colts18
Posts: 313
Joined: Fri Aug 31, 2012 1:52 am

Re: 2000-2006 PBP, matchupfile & RAPM

Post by colts18 »

For the missed quarters, couldn't you use ESPN or B-R's play by play to figure out who was playing in the missing time?

Is 2001 the last season you have pbp for? Is there anyone out there that has 2000 pbp data? I would bet that Shaq in 2000 might be the only guy with a 10+ RAPM.
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: 2000-2006 PBP, matchupfile & RAPM

Post by J.E. »

J.E. wrote:Everything in this thread was grabbed from BBR
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: 2000-2006 PBP, matchupfile & RAPM

Post by J.E. »

http://stats-for-the-nba.appspot.com/PB ... br_ids.rar

Contains (almost) all matchupdata from 2000/2001 to 2012, split into regular season and playoffs.
Fixed a bug where players that were substituted in during free throws showed up one possession too early in the matchupfiles.
From 2007 onwards it's pretty much bbv's dataset, with bbr player page urls listed instead of bbv's player id's. The games which bbv did not have (but bbr does) are at the bottom of each matchupfile.
For the year 2007 I'm torn between using bbv's matchupdata or mine. Mine probably has more errors, but bbv back then only listed total possessions for both teams, instead of splitting into home/away
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: 2000-2006 PBP, matchupfile & RAPM

Post by Crow »

Is non-prior informed RAPM coming eventually for 2007-11?

Is there something that could be done systematically with the prior-informed, non-prior informed and multiyear RAPM data over the stretch of time they are available to identify the most "out of step" values, which might be signs of larger than average errors for those datapoints? I think that might be interesting and helpful.
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: 2000-2006 PBP, matchupfile & RAPM

Post by J.E. »

Crow wrote:Is non-prior informed RAPM coming eventually for 2007-11?
Probably not
Crow wrote:Is there something that could be done systematically with the prior-informed, non-prior informed and multiyear RAPM data over the stretch of time they are available to identify the most "out of step" values, which might be signs of larger than average errors for those datapoints?
What exactly do you mean by "out of step"?
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: 2000-2006 PBP, matchupfile & RAPM

Post by DSMok1 »

Where do you explain the columns in the matchup files? I see no key.
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: 2000-2006 PBP, matchupfile & RAPM

Post by J.E. »

It's the same format as bbv with the difference that bbv's files contain more information (that is not needed for RAPM or whatever).
If you split each line by TAB, gameid is at 0, home players are at 5-9, away players are at 10-14, home points at 29, away points at 30, home possessions at 33, away possessions at 34.
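
For anyone who wants to read the files programmatically, here is a minimal sketch of parsing one line under the layout above. It assumes the positions are 0-based (since gameid is given as 0) and that points and possessions parse as numbers; the names `Matchup` and `parse_matchup_line` are made up for illustration, and the example line is synthetic, not taken from a real matchupfile.

```python
from typing import NamedTuple, Tuple

# Column positions from the description above, assumed to be 0-based.
GAME_ID = 0
HOME_PLAYERS = slice(5, 10)   # columns 5-9
AWAY_PLAYERS = slice(10, 15)  # columns 10-14
HOME_POINTS, AWAY_POINTS = 29, 30
HOME_POSS, AWAY_POSS = 33, 34


class Matchup(NamedTuple):
    """One line of a matchupfile (hypothetical container, not an official format)."""
    game_id: str
    home_players: Tuple[str, ...]
    away_players: Tuple[str, ...]
    home_points: float
    away_points: float
    home_possessions: float
    away_possessions: float


def parse_matchup_line(line: str) -> Matchup:
    """Split a TAB-separated line and pick out the columns needed for RAPM."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) <= AWAY_POSS:
        raise ValueError(
            f"expected at least {AWAY_POSS + 1} columns, got {len(fields)}"
        )
    return Matchup(
        game_id=fields[GAME_ID],
        home_players=tuple(fields[HOME_PLAYERS]),
        away_players=tuple(fields[AWAY_PLAYERS]),
        # float() is an assumption; the post does not say how values are stored.
        home_points=float(fields[HOME_POINTS]),
        away_points=float(fields[AWAY_POINTS]),
        home_possessions=float(fields[HOME_POSS]),
        away_possessions=float(fields[AWAY_POSS]),
    )


# Synthetic example line with 35 columns; unused columns are filled with "0".
example_fields = ["0"] * 35
example_fields[GAME_ID] = "game001"
example_fields[HOME_PLAYERS] = ["h1", "h2", "h3", "h4", "h5"]
example_fields[AWAY_PLAYERS] = ["a1", "a2", "a3", "a4", "a5"]
example_fields[HOME_POINTS] = "4"
example_fields[AWAY_POINTS] = "2"
example_fields[HOME_POSS] = "3"
example_fields[AWAY_POSS] = "3"
example = parse_matchup_line("\t".join(example_fields))
```

If the real files turn out to use 1-based positions, only the constants at the top need to shift.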
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: 2000-2006 PBP, matchupfile & RAPM

Post by Crow »

Out of step could mean that one of the prior-informed, non-prior informed and multiyear RAPM values for a player differs from the other values for that player by X standard deviations more than the average difference among them for those datasets.
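
One possible reading of that, as a rough sketch: for each player and each RAPM version, take the gap between that value and the mean of the player's other values, then flag gaps that sit more than X standard deviations from the average gap across all players. The function name `out_of_step`, the version labels and the numbers below are all hypothetical; this is just one way to operationalize the idea, not an established method.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def out_of_step(
    ratings: Dict[str, Dict[str, float]], threshold: float = 2.0
) -> List[Tuple[str, str, float]]:
    """Return (player, version, gap) entries whose gap is an outlier.

    ratings maps player -> {version name -> RAPM value}.
    gap = value minus the mean of that player's other versions.
    """
    gaps: List[Tuple[str, str, float]] = []
    for player, versions in ratings.items():
        for name, value in versions.items():
            others = [v for k, v in versions.items() if k != name]
            if others:  # need at least one other version to compare against
                gaps.append((player, name, value - mean(others)))

    values = [gap for _, _, gap in gaps]
    if len(values) < 2:
        return []
    center = mean(values)
    spread = pstdev(values)  # population std dev over all gaps
    if spread == 0:
        return []
    return [
        (player, name, gap)
        for player, name, gap in gaps
        if abs(gap - center) / spread > threshold
    ]


# Made-up numbers purely for illustration.
sample = {
    f"player{i}": {"informed": 1.0, "vanilla": 1.0, "multiyear": 1.0}
    for i in range(10)
}
sample["playerX"] = {"informed": 9.0, "vanilla": 0.0, "multiyear": 0.0}
flagged = out_of_step(sample, threshold=3.0)
```

With a lower threshold the other two values for the same player would also get flagged, since each of them looks far from the mean of the remaining pair, so the choice of X matters quite a bit.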
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: 2000-2006 PBP, matchupfile & RAPM

Post by DSMok1 »

J.E.: where did the matchup files go on your site?