APBRmetrics

The discussion of the analysis of basketball through objective evidence, especially basketball statistics.

All times are UTC




PostPosted: Mon Apr 22, 2013 4:17 am 
Joined: Fri Jan 11, 2013 8:39 pm
Posts: 5
To me, the hardest part of doing basketball analytics right now is data manipulation. I have enough of a stats/math background that I have various tools I'd love to run numbers through, but actually getting the data into a format I can work with (generally a simple CSV will do) is hard.

For example, the data at basketball-reference is useful, and it's fantastic that I can export individual tables as CSVs. However, it's difficult to get it into a format where you can compare players in any meaningful way; you have to download each player's table one at a time, then do some Excel work to make it usable. It would be great if there were some sort of downloadable database of player statistics. One did exist (databasebasketball.com), but it's only updated through 2009.
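As a rough sketch of the "Excel work" step, the per-player CSV exports can be stacked into one table with a short script. This assumes each player's table has already been saved as CSV text; the function names, the "Player" column, and the sample numbers below are made up for illustration and are not real statistics or anything the site provides.

```python
import csv
import io


def combine_player_tables(tables):
    """Stack per-player CSV exports into one list of rows.

    `tables` maps a player name to the CSV text saved for that player.
    Each output row gains a "Player" column so rows stay identifiable.
    """
    combined = []
    for player, csv_text in tables.items():
        for row in csv.DictReader(io.StringIO(csv_text)):
            combined.append({"Player": player, **row})
    return combined


def write_csv(rows, path):
    """Write the combined rows to `path`, using the union of all columns."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)


# Made-up example data, not real statistics.
example = {
    "Player A": "Season,G,PTS\n2011-12,60,900\n2012-13,70,1200\n",
    "Player B": "Season,G,PTS\n2012-13,82,1500\n",
}
rows = combine_player_tables(example)
```

Calling `write_csv(rows, "players.csv")` would then produce a single file with one row per player-season, which is the shape most stats tools expect.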

Ditto with the +/- numbers. There's a real need right now for someone to keep public, updated RAPM numbers, and I have the software/knowhow to do that... except the play-by-play data is so scattered. With a play-by-play parser, or database along the lines of the old basketballvalue one, this would be a much easier project to tackle.
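For context, RAPM is usually described as a ridge regression over lineup stints: one row per stint, +1 for each home player on the court, -1 for each away player, and the stint's point margin as the target. The sketch below is a minimal pure-Python version under those assumptions. It ignores possession weighting and everything else a production model would need, and the stint format, function names, and numbers are hypothetical.

```python
def ridge_solve(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y by Gaussian elimination."""
    n_cols = len(X[0])
    # Build the augmented matrix [X^T X + lam*I | X^T y].
    A = []
    for i in range(n_cols):
        row = [sum(r[i] * r[j] for r in X) for j in range(n_cols)]
        row[i] += lam
        row.append(sum(r[i] * t for r, t in zip(X, y)))
        A.append(row)
    # Forward elimination with partial pivoting.
    for col in range(n_cols):
        pivot = max(range(col, n_cols), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n_cols):
            factor = A[r][col] / A[col][col]
            for c in range(col, n_cols + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    beta = [0.0] * n_cols
    for i in reversed(range(n_cols)):
        known = sum(A[i][j] * beta[j] for j in range(i + 1, n_cols))
        beta[i] = (A[i][n_cols] - known) / A[i][i]
    return beta


def rapm(stints, players, lam=1.0):
    """Estimate one rating per player from (home, away, margin) stints."""
    index = {p: i for i, p in enumerate(players)}
    X, y = [], []
    for home, away, margin in stints:
        row = [0.0] * len(players)
        for p in home:
            row[index[p]] = 1.0   # home player on court
        for p in away:
            row[index[p]] = -1.0  # away player on court
        X.append(row)
        y.append(margin)
    return dict(zip(players, ridge_solve(X, y, lam)))


# Toy one-on-one stints with made-up margins (home minus away).
stints = [
    (["A"], ["C"], 10.0),
    (["B"], ["C"], -2.0),
    (["A"], ["D"], 6.0),
    (["B"], ["D"], -6.0),
]
ratings = rapm(stints, ["A", "B", "C", "D"], lam=1.0)
```

The regression itself is the easy part; the hard part, as noted above, is turning scattered play-by-play into clean stints in the first place.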

Another example, while we're at it: Synergy. The Synergy Silverlight app makes it relatively easy to see how a player is doing, but difficult to compare players, or do any sort of real research into the Synergy numbers league-wide. While this is maybe the hardest task (because the numbers are all presented via Silverlight and not in HTML like the other sources), it would be of real value to the basketball community.

So, maybe you guys know some tricks I don't: how do you acquire your data, and get it into workable formats? Any tips to share? I figure I can't be the only one who has this same roadblock on a regular basis.


PostPosted: Mon Apr 22, 2013 6:00 am 
Joined: Fri Apr 15, 2011 5:29 pm
Posts: 339
Location: Arlington, Texas
andylarsen wrote:
So, maybe you guys know some tricks I don't: how do you acquire your data, and get it into workable formats? Any tips to share? I figure I can't be the only one who has this same roadblock on a regular basis.



Believe me, you aren't the only one with this problem. I have been very focused on college b-ball for a while (as many on here know), and compiling college data is MUCH more difficult than compiling NBA data: 347 teams, inconsistencies between sources, trying to get accurate player class years (frosh, soph, etc.), not to mention player ages.

I have so much work to do before the NBA draft, and roadblocks like these soak up a lot of my time.

I feel your pain - and welcome ANY ideas - with hopes I can find something that can help.

_________________
Dan

http://hoopsnerd.com
https://twitter.com/Hoops_Nerd


PostPosted: Mon Apr 22, 2013 2:39 pm 
Joined: Sat Feb 18, 2012 8:29 pm
Posts: 6
andylarsen wrote:
For example, the data at basketball-reference is useful, and it's fantastic that I can export individual tables as CSVs. However, it's difficult to get it into a format where you can compare players in any meaningful way; you have to download each player's table one at a time, then do some Excel work to make it usable. It would be great if there were some sort of downloadable database of player statistics. One did exist (databasebasketball.com), but it's only updated through 2009.


I would suggest one thing: go to the league summary pages at B-R, such as here:

http://www.basketball-reference.com/lea ... otals.html

If you hover your mouse over the "Stats" tab at the top, you can toggle between Totals, Per 36, Per Game, etc. That's a lot faster than going player by player, and everything on those pages can also be grabbed in CSV form. If all you're looking for is player data over the last 20 years, these pages will get you there much faster. You can also establish data connections through Excel.
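If only the Totals table has been downloaded, the Per 36 view is simple arithmetic: a total multiplied by 36 and divided by minutes played. Here is a small sketch of that conversion. The column names "MP", "PTS", and "TRB" are assumptions about the CSV header, and the sample row is made up.

```python
def per_36(total, minutes):
    """Scale a season total to a per-36-minute rate."""
    if minutes <= 0:
        return None  # no playing time, so no meaningful rate
    return total * 36.0 / minutes


def add_per_36(rows, stat_columns, minutes_column="MP"):
    """Return copies of `rows` with a "<stat>_per36" field for each stat."""
    result = []
    for row in rows:
        minutes = float(row[minutes_column])
        new_row = dict(row)
        for stat in stat_columns:
            new_row[stat + "_per36"] = per_36(float(row[stat]), minutes)
        result.append(new_row)
    return result


# Made-up totals row, with values as strings the way csv.DictReader yields them.
totals = [{"Player": "Player A", "MP": "2700", "PTS": "1800", "TRB": "600"}]
rates = add_per_36(totals, ["PTS", "TRB"])
```

For the sample row, 1800 points in 2700 minutes works out to 24.0 points per 36 minutes.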


PostPosted: Mon Apr 22, 2013 3:52 pm 
Joined: Thu Apr 14, 2011 11:18 pm
Posts: 620
Location: Maine
Jon Nichols wrote:
andylarsen wrote:
For example, the data at basketball-reference is useful, and it's fantastic that I can export individual tables as CSVs. However, it's difficult to get it into a format where you can compare players in any meaningful way; you have to download each player's table one at a time, then do some Excel work to make it usable. It would be great if there were some sort of downloadable database of player statistics. One did exist (databasebasketball.com), but it's only updated through 2009.


I would suggest one thing: go to the league summary pages at B-R, such as here:

http://www.basketball-reference.com/lea ... otals.html

If you hover your mouse over the "Stats" tab at the top, you can toggle between Totals, Per 36, Per Game, etc. That's a lot faster than going player by player, and everything on those pages can also be grabbed in CSV form. If all you're looking for is player data over the last 20 years, these pages will get you there much faster. You can also establish data connections through Excel.


I have run up against the same hurdles, and still use BBref as my primary data source (with Excel macros/data connections to download large quantities of data).

_________________
APBRmetrics Forum Administrator
GodismyJudgeOK.com/DStats/
Twitter.com/DSMok1


PostPosted: Mon Apr 22, 2013 5:49 pm 
Site Admin

Joined: Thu Apr 14, 2011 10:05 pm
Posts: 64
I'd note that some of what you're finding is by design. For example, the ability to sort and manage Synergy data is part of what teams/media corporations are paying for with the full version.


PostPosted: Thu Apr 25, 2013 5:11 am 
Joined: Sat Feb 16, 2013 11:56 am
Posts: 167
It'd be great to have an NBA data warehouse for stuff like this that isn't someone's personal website, abandoned once the researcher is bought by a team. Just some shared site for important data like play-by-play CSVs and yearly adjusted/regularized plus-minus numbers.

Has anyone parsed that data from the late 90's? That's a project we should focus on. It's play by play data from the Jordan Bulls era and you can get great estimates for prime Shaq.


PostPosted: Thu Apr 25, 2013 6:02 pm 
Joined: Tue Dec 04, 2012 11:43 pm
Posts: 4
AcrossTheCourt wrote:
It'd be great to have an NBA data warehouse for stuff like this that isn't someone's personal website, abandoned once the researcher is bought by a team. Just some shared site for important data like play-by-play CSVs and yearly adjusted/regularized plus-minus numbers.

Has anyone parsed that data from the late 90's? That's a project we should focus on. It's play by play data from the Jordan Bulls era and you can get great estimates for prime Shaq.


How can you expect someone to be interested enough in basketball to focus on creating public data, but not jump at the opportunity to work for a team? From what I understand, generally the team doesn't give them a choice and that is why the updates stop occurring.

I guess I'm just saying, you can't blame them. ;)


PostPosted: Thu Apr 25, 2013 8:28 pm 
Joined: Sat Feb 16, 2013 11:56 am
Posts: 167
No, not blaming them for joining a team. The problem is once they do the website is toast. So you need a website with a large group of people or some system where certain roles can be filled once the person is gone.


PostPosted: Fri Apr 26, 2013 12:49 am 
Joined: Thu Apr 14, 2011 10:41 pm
Posts: 817
Location: Hotlanta
AcrossTheCourt wrote:
No, not blaming them for joining a team. The problem is once they do the website is toast. So you need a website with a large group of people or some system where certain roles can be filled once the person is gone.


You need someone like me who is happy with his current job and not looking to be hired by the NBA (although I can't say the opportunity hasn't been presented).

And if I did take a job, I'd make it a condition of being hired that the site would have to stay up.

FWIW, I'm actually planning to get the 90's data (as far back as I can go) this summer and put it on nbawowy. Look for it.

_________________
The City: A Golden State Warriors-Centric NBA Blog
NBA WOWY: Find statistics of your favorite team with any arbitrary combination of players on or off the court.


PostPosted: Fri Apr 26, 2013 2:08 am 
Joined: Thu Apr 14, 2011 11:18 pm
Posts: 620
Location: Maine
EvanZ wrote:
AcrossTheCourt wrote:
No, not blaming them for joining a team. The problem is once they do the website is toast. So you need a website with a large group of people or some system where certain roles can be filled once the person is gone.


You need someone like me who is happy with his current job and not looking to be hired by the NBA (although I can't say the opportunity hasn't been presented).

And if I did take a job, I'd make it a condition of being hired that the site would have to stay up.

FWIW, I'm actually planning to get the 90's data (as far back as I can go) this summer and put it on nbawowy. Look for it.


I'm with Evan here--I am not looking to get hired either. Unfortunately, my data analysis/programming skills aren't what Evan's are!

Hopefully some of the data will be pulled together for easier usage sometime soon--more and more people have the ability to compile it.

_________________
APBRmetrics Forum Administrator
GodismyJudgeOK.com/DStats/
Twitter.com/DSMok1


PostPosted: Fri Apr 26, 2013 10:55 pm 
Joined: Sat Feb 16, 2013 11:56 am
Posts: 167
EvanZ wrote:
AcrossTheCourt wrote:
No, not blaming them for joining a team. The problem is once they do the website is toast. So you need a website with a large group of people or some system where certain roles can be filled once the person is gone.


You need someone like me who is happy with his current job and not looking to be hired by the NBA (although I can't say the opportunity hasn't been presented).

And if I did take a job, I'd make it a condition of being hired that the site would have to stay up.

FWIW, I'm actually planning to get the 90's data (as far back as I can go) this summer and put it on nbawowy. Look for it.

I'd love to see that 90's data in any form, though it'd be fun to play around with the raw data. Thanks for that.


PostPosted: Sat Apr 27, 2013 4:16 pm 
Joined: Thu Mar 01, 2012 7:02 pm
Posts: 50
I've been thinking for a while about building an API on top of my site, so that my play-by-play, shot chart, and five-man data can be publicly accessible to you all. Actually, it's already 70% built.

But I hesitate to do so mainly because I would incur all the bandwidth costs, and would have to deal with all the complaints about data quality, too. Plus I spent countless hours building all the scripts, and man is it a lot of ***** work.

I would also like to point out that my NBA data scraping scripts are open-sourced, so you can get the data on your own computer. https://github.com/kpascual/nbascrape

_________________
http://vorped.com
https://twitter.com/vorped


PostPosted: Sat Apr 27, 2013 5:38 pm 
Joined: Thu Apr 14, 2011 10:41 pm
Posts: 817
Location: Hotlanta
Are you running everything off your own personal server?

_________________
The City: A Golden State Warriors-Centric NBA Blog
NBA WOWY: Find statistics of your favorite team with any arbitrary combination of players on or off the court.


PostPosted: Sun Apr 28, 2013 2:07 am 
Joined: Tue Dec 04, 2012 11:43 pm
Posts: 4
kpascual wrote:
I've been thinking for a while about building an API on top of my site, so that my play-by-play, shot chart, and five-man data can be publicly accessible to you all. Actually, it's already 70% built.

But I hesitate to do so mainly because I would incur all the bandwidth costs, and would have to deal with all the complaints about data quality, too. Plus I spent countless hours building all the scripts, and man is it a lot of fucking work.

I would also like to point out that my NBA data scraping scripts are open-sourced, so you can get the data on your own computer. https://github.com/kpascual/nbascrape

I'm really excited to look through this, thank you.


PostPosted: Sun Apr 28, 2013 5:09 pm 
Joined: Thu Mar 01, 2012 7:02 pm
Posts: 50
EvanZ wrote:
Are you running everything off your own personal server?


Shared web hosting. So I don't own the box per se, but I'm paying for it. I was thinking of moving everything to Amazon (AWS) to scale, but again, bandwidth costs.

You know what, **** it, I'll expose some of my data. I'll post a link when it's complete.

_________________
http://vorped.com
https://twitter.com/vorped


Last edited by DSMok1 on Sun Apr 28, 2013 7:36 pm, edited 1 time in total.
