nba.com now has play by play data back to 1997

colts18
Posts: 313
Joined: Fri Aug 31, 2012 1:52 am

Re: nba.com now has play by play data back to 1997

Post by colts18 »

J.E. wrote:Python with urllib is a good place to start.
Further, you could use Python's BeautifulSoup, or go the laborious way with string.split and string.replace.
OK, how long would that process take?
EvanZ
Posts: 912
Joined: Thu Apr 14, 2011 10:41 pm
Location: The City

Post by EvanZ »

colts18 wrote:
J.E. wrote:Python with urllib is a good place to start.
Further, you could use Python's BeautifulSoup, or go the laborious way with string.split and string.replace.
OK, how long would that process take?
It's not a weekend project.
colts18
Posts: 313
Joined: Fri Aug 31, 2012 1:52 am

Post by colts18 »

EvanZ wrote:
colts18 wrote:
J.E. wrote:Python with urllib is a good place to start.
Further, you could use Python's BeautifulSoup, or go the laborious way with string.split and string.replace.
OK, how long would that process take?
It's not a weekend project.
Can you give me an estimate of how many hours I would need to commit?
v-zero
Posts: 520
Joined: Sat Oct 27, 2012 12:30 pm

Post by v-zero »

Do you have any experience with programming in general, or with Python specifically? If not, I suggest you would need many (30+) hours to acquire the skills to complete the (laborious) process of parsing the PBP data to get 5-on-5 data.
EvanZ
Posts: 912
Joined: Thu Apr 14, 2011 10:41 pm
Location: The City

Post by EvanZ »

Yeah, colts, I don't want to be Debbie Downer here, but based on your line of questioning, I'm guessing you have very little programming experience. In that case, it will take a long time. I would definitely encourage you to learn how to program (there are plenty of free online courses now), and try to tackle this project after you get some significant "flight time" under your belt.
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Post by J.E. »

colts, were you asking how long it would take to write a crawler and then strip the HTML, or were you asking how long the entire process would take, from HTML to 5-on-5 matchups?

The first one is definitely doable in a couple of hours if you have some programming experience.
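
For a sense of scale on the first part (crawl, then strip the HTML), here is a minimal sketch using only the standard library. It swaps in html.parser for BeautifulSoup so that it runs without extra installs. The names fetch, strip_html and TextExtractor are made up for illustration, and the User-Agent header is an assumption about what a server might want, not something confirmed in this thread.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page, skipping script/style blocks."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def strip_html(html):
    """Return the non-empty text fragments of an HTML string, in page order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def fetch(url, timeout=30):
    """Download a page and decode it as text (hypothetical helper)."""
    # The User-Agent value is an assumption; some servers reject the default.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Usage would be something like strip_html(fetch(url)) for each game page. If a given URL returns raw data rather than an HTML page, the stripping step is unnecessary and only the download is needed. Turning the resulting text into 5-on-5 matchups is the long part and is not covered by this sketch.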
colts18
Posts: 313
Joined: Fri Aug 31, 2012 1:52 am

Post by colts18 »

EvanZ wrote:Yeah, colts, I don't want to be Debbie Downer here, but based on your line of questioning, I'm guessing you have very little programming experience. In that case, it will take a long time. I would definitely encourage you to learn how to program (there are plenty of free online courses now), and try to tackle this project after you get some significant "flight time" under your belt.
Yeah, my programming knowledge is limited. Oh well, I guess I have to hope someone else does it.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Post by DSMok1 »

colts18 wrote:
EvanZ wrote:Yeah, colts, I don't want to be Debbie Downer here, but based on your line of questioning, I'm guessing you have very little programming experience. In that case, it will take a long time. I would definitely encourage you to learn how to program (there are plenty of free online courses now), and try to tackle this project after you get some significant "flight time" under your belt.
Yeah, my programming knowledge is limited. Oh well, I guess I have to hope someone else does it.
You're not the only one in that boat.
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
sideshowbob
Posts: 54
Joined: Fri Apr 15, 2011 4:43 am

Post by sideshowbob »

colts18 wrote:Yeah, my programming knowledge is limited. Oh well, I guess I have to hope someone else does it.
I'd assume BBR will have the PbP data soon, so I wouldn't worry.
bigbossman
Posts: 1
Joined: Fri Feb 22, 2013 6:15 pm

Post by bigbossman »

What does RangeType mean? It can be 0-2.

First post!

Thanks
kpascual wrote:I doubted you, and boy was I wrong. Thanks colts!

Raw play by play example:
http://stats.nba.com/stats/playbyplay?G ... dPeriod=10

Raw box score example:
http://stats.nba.com/stats/boxscore?Gam ... dPeriod=10

The boxscore API is kind of cool. You can produce a box score not only for each quarter, but also for any given range within a game, like for a given 2 minute span in the first quarter. StartRange/EndRange represent the number of tenths of a second elapsed in the game. Lots of possibilities here.

Example:
http://stats.nba.com/stats/boxscore?Gam ... dPeriod=10
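
To make the StartRange/EndRange description concrete, here is a small sketch of the conversion from game clock to tenths of a second elapsed. It assumes 12-minute quarters and 5-minute overtimes; elapsed_tenths is a hypothetical helper, not part of the API.

```python
REGULATION_PERIODS = 4
QUARTER_SECONDS = 12 * 60
OVERTIME_SECONDS = 5 * 60


def elapsed_tenths(period, minutes_left, seconds_left=0.0):
    """Tenths of a second elapsed in the game at a given period and clock.

    period is 1-4 for quarters and 5+ for overtimes; minutes_left and
    seconds_left are the time remaining on the clock in that period.
    """
    if period < 1:
        raise ValueError("period must be 1 or greater")
    in_regulation = period <= REGULATION_PERIODS
    period_length = QUARTER_SECONDS if in_regulation else OVERTIME_SECONDS
    remaining = minutes_left * 60 + seconds_left
    if not 0 <= remaining <= period_length:
        raise ValueError("clock value does not fit in this period")
    # Seconds played in all periods that finished before this one.
    completed_quarters = min(period - 1, REGULATION_PERIODS)
    completed_overtimes = max(period - 1 - REGULATION_PERIODS, 0)
    before = (completed_quarters * QUARTER_SECONDS
              + completed_overtimes * OVERTIME_SECONDS)
    return round((before + period_length - remaining) * 10)
```

Under these assumptions, the 2-minute span from 10:00 to 8:00 left in the first quarter would be StartRange 1200 and EndRange 2400.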
sndesai1
Posts: 141
Joined: Fri Mar 08, 2013 10:00 pm

Post by sndesai1 »

bigbossman wrote:What does RangeType mean? It can be 0-2.

First post!

Thanks
kpascual wrote:I doubted you, and boy was I wrong. Thanks colts!

Raw play by play example:
http://stats.nba.com/stats/playbyplay?G ... dPeriod=10

Raw box score example:
http://stats.nba.com/stats/boxscore?Gam ... dPeriod=10

The boxscore API is kind of cool. You can produce a box score not only for each quarter, but also for any given range within a game, like for a given 2 minute span in the first quarter. StartRange/EndRange represent the number of tenths of a second elapsed in the game. Lots of possibilities here.

Example:
http://stats.nba.com/stats/boxscore?Gam ... dPeriod=10
From changing it and looking at the differences, I think it controls the type of box score you get:
0 - seems to provide the complete game box score, including players with a DNP
1 - seems to provide the complete game box score, but not including players with a DNP
2 - seems to provide a limited box score, with the time span probably determined by StartRange and EndRange in the URL
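
A rough sketch of those three cases as code, in case it helps anyone building URLs by hand. The meanings come only from the trial and error above, so treat them as guesses. range_params is a hypothetical helper, the parameter spellings RangeType, StartRange and EndRange are assumed to match the URL, and leaving the range out for types 0 and 1 is also an assumption. The rest of the query string (game ID, periods and so on) is not shown here and would have to be added separately.

```python
from urllib.parse import urlencode

# Observed behaviour per RangeType value (tentative, from experimentation).
RANGE_TYPES = {
    0: "complete game box score, including players with a DNP",
    1: "complete game box score, excluding players with a DNP",
    2: "limited box score bounded by StartRange and EndRange",
}


def range_params(range_type, start_range=None, end_range=None):
    """Build only the range-related part of a box score query string."""
    if range_type not in RANGE_TYPES:
        raise ValueError("RangeType appears to accept only 0, 1 or 2")
    params = {"RangeType": range_type}
    if range_type == 2:
        # A limited box score needs both ends of the span, in tenths of a second.
        if start_range is None or end_range is None:
            raise ValueError("RangeType 2 needs start_range and end_range")
        if not 0 <= start_range < end_range:
            raise ValueError("expected 0 <= start_range < end_range")
        params["StartRange"] = start_range
        params["EndRange"] = end_range
    return urlencode(params)
```

For example, range_params(2, 1200, 2400) gives "RangeType=2&StartRange=1200&EndRange=2400", which would be appended to the other parameters of the box score URL.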