Play by Play JSON Feeds

nileriver
Posts: 63
Joined: Thu Jul 18, 2013 3:24 pm
Location: Vancouver, WA

Play by Play JSON Feeds

Post by nileriver »

After seeing several posts asking how to get play-by-play data, I have been learning how to do web scraping. In another thread I found a link to a Sports Illustrated JSON feed (http://data.sportsillustrated.cnn.com/j ... yplay.json). I was wondering what other JSON or XML feeds others use to get data.
kohanz
Posts: 34
Joined: Fri Jan 04, 2013 6:58 pm

Re: Play by Play JSON Feeds

Post by kohanz »

NBA.com/stats has JSON feeds for their data as well. If you learn to use the developer tools in Chrome to sniff out what data is being loaded for a given page (already described in another thread on this topic), discovering JSON feeds becomes easier.
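Once a feed URL has been found this way, the first job is usually working out how the payload is laid out. A minimal Python sketch of one way to do that is below; the `describe` helper and the sample payload are made up for illustration and do not reflect the actual layout of any NBA, ESPN, or SI feed.

```python
import json


def describe(node, path="$", depth=2):
    """Return lines naming the keys and value types near the top of a JSON document."""
    lines = [f"{path}: {type(node).__name__}"]
    if depth == 0:
        return lines
    if isinstance(node, dict):
        for key, value in node.items():
            lines += describe(value, f"{path}.{key}", depth - 1)
    elif isinstance(node, list) and node:
        # Assumption: the first element is representative of the rest of the list.
        lines += describe(node[0], f"{path}[0]", depth - 1)
    return lines


# Hypothetical payload for illustration only; real feeds have their own layout.
sample = json.loads(
    '{"game": {"id": "X", "plays": [{"clock": "11:42", "text": "Jump ball"}]}}'
)
print("\n".join(describe(sample, depth=3)))
```

Run against a saved copy of a real response, this gives a quick outline of where the play-by-play list sits without repeatedly requesting the page.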

However, the main thing I want to post is for people to try and use these feeds as responsibly as they can. Don't hammer the feed with tons of requests. Once a game is finished, the PBP data isn't going to change, so for example in my app (not yet released), once the game is over, I never hit that feed again. During games, I might query it every 5 minutes at most, which is reasonable.
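The policy described above (never re-request a finished game, and poll a live game at most every 5 minutes) could be sketched in Python roughly like this. The `GameFeedPoller` class, the game IDs, and the `game_is_final` flag are hypothetical names for illustration; how you detect that a game is over depends on the feed you are reading.

```python
import time

POLL_INTERVAL_SECONDS = 5 * 60  # "every 5 minutes at most" during live games


class GameFeedPoller:
    """Decides whether a game's play-by-play feed may be requested again."""

    def __init__(self, interval=POLL_INTERVAL_SECONDS, clock=time.monotonic):
        self.interval = interval
        self.clock = clock       # injectable so the policy can be tested without waiting
        self.last_fetch = {}     # game_id -> clock reading at the last request
        self.finished = set()    # game_ids whose data is final

    def should_fetch(self, game_id):
        if game_id in self.finished:
            return False  # final data will not change, so never ask again
        last = self.last_fetch.get(game_id)
        return last is None or self.clock() - last >= self.interval

    def record_fetch(self, game_id, game_is_final):
        """Call after each request; game_is_final comes from the caller's own check."""
        self.last_fetch[game_id] = self.clock()
        if game_is_final:
            self.finished.add(game_id)
```

The scraper would call `should_fetch(game_id)` before every request and `record_fetch(...)` after it, so the throttling lives in one place instead of being scattered through the scraping code.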
nileriver
Posts: 63
Joined: Thu Jul 18, 2013 3:24 pm
Location: Vancouver, WA

Re: Play by Play JSON Feeds

Post by nileriver »

I definitely agree about being responsible with the number of requests you send to a server. I have been using Firebug in Chrome to look at the structure of various websites (NBA, ESPN, and SI) and was hoping others would share their experiences.
EvanZ
Posts: 912
Joined: Thu Apr 14, 2011 10:41 pm
Location: The City

Re: Play by Play JSON Feeds

Post by EvanZ »

kohanz wrote: However, the main thing I want to post is for people to try and use these feeds as responsibly as they can. Don't hammer the feed with tons of requests. Once a game is finished, the PBP data isn't going to change, so for example in my app (not yet released), once the game is over, I never hit that feed again. During games, I might query it every 5 minutes at most, which is reasonable.
Their servers should be set up to handle thousands of requests per second. I don't think anyone should worry about their scraper being mistaken for a DoS attack.
nileriver
Posts: 63
Joined: Thu Jul 18, 2013 3:24 pm
Location: Vancouver, WA

Re: Play by Play JSON Feeds

Post by nileriver »

EvanZ wrote:
kohanz wrote: However, the main thing I want to post is for people to try and use these feeds as responsibly as they can. Don't hammer the feed with tons of requests. Once a game is finished, the PBP data isn't going to change, so for example in my app (not yet released), once the game is over, I never hit that feed again. During games, I might query it every 5 minutes at most, which is reasonable.
Their servers should be set up to handle thousands of requests per second. I don't think anyone should worry about their scraper being mistaken for a DoS attack.
It is always best practice not to put unnecessary load on a server, no matter how small. If you have already scraped the information, there should be no reason to hit the page again. Taking the time to make sure your code is not running redundant tasks is important: beyond the load on the server, redundant requests also slow down your own code. I agree that the impact would be negligible on the servers that the NBA or ESPN have. However, you should be grateful that they provide this information and respectful in the way you grab it.
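One simple way to avoid hitting a page you have already scraped is to keep a local copy and check it first. Here is a rough Python sketch, assuming one JSON file per game on disk; the `load_or_fetch` name, the `pbp_cache` directory, and the caller-supplied `fetch` function are all hypothetical.

```python
import json
from pathlib import Path


def load_or_fetch(game_id, fetch, cache_dir="pbp_cache"):
    """Return data for game_id, calling fetch(game_id) only on a cache miss."""
    cache_path = Path(cache_dir) / f"{game_id}.json"
    if cache_path.exists():
        # Already scraped: read the local copy, no network request.
        return json.loads(cache_path.read_text(encoding="utf-8"))
    data = fetch(game_id)  # caller-supplied function that performs the actual request
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

This only makes sense for games that are already finished, since a game in progress would be frozen at whatever was first saved.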
kohanz
Posts: 34
Joined: Fri Jan 04, 2013 6:58 pm

Re: Play by Play JSON Feeds

Post by kohanz »

EvanZ wrote:
kohanz wrote: However, the main thing I want to post is for people to try and use these feeds as responsibly as they can. Don't hammer the feed with tons of requests. Once a game is finished, the PBP data isn't going to change, so for example in my app (not yet released), once the game is over, I never hit that feed again. During games, I might query it every 5 minutes at most, which is reasonable.
Their servers should be set up to handle thousands of requests per second. I don't think anyone should worry about their scraper being mistaken for a DoS attack.
My recommendation is not centered on a concern about DoS. The website being scraped generally pays considerable amounts for that data, and I wouldn't recommend drawing attention to yourself by re-purposing that data for your own projects. I mean, if it's just for hobby analysis at home, that's fine, but I think websites such as vorped and nbawowy (which, Evan, I'm a big fan of) start to enter a bit of a grey area. As long as they don't get enough traffic for the bigger sites to notice, and as long as they have no monetization, I think they'll be fine, but it's basically a case of not being noticed. I also don't think the sellers of that type of data would be thrilled to find escalating amounts of scraping.