@Daniel, the way my site works is that the app is running off Nodejitsu and the database is running off MongoLab. The plan I'm on right now is 1 GB per month storage and that's about $10/mo. Not a big deal. But it goes up steeply from there. 4 GB per month is $40/mo. Every GB after that is another $11, so 10 GB would be over $100/mo. Since I don't make money off the site, that's $1200/yr coming out of my pocket. And for all I know the database might end up being around 15 GB to go back to '97.
@colts18, right now nbawowy is just this season. My play-by-play is coming from NBC Sports, which was the cleanest, easiest source to parse that I could find (for example, providing full names on every play). Unfortunately, before I could scrape prior seasons from the site (which I think they had at one point), they took them all down (I'm assuming at the request of the NBA).
How does this parsing out process work? Is there some kind of code to make it work?
Yes, there is some kind of code. The primary challenge of parsing play-by-play is determining who is on the court. Using the NBC dataset, I've got an error rate that is very, very low. It's almost a perfect process with very little manual correction involved. Every day it takes me about 5 minutes to update the site. The one good thing about going back and dealing with old data is that once you do it, you don't have to mess with it again. I'd love to add it to my site, but like Ken said, it's a matter of finding time.
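To give a feel for the on-court tracking problem, here's a minimal sketch in JavaScript (the site's own stack). The event shape and player names are hypothetical, not the actual NBC feed or nbawowy's code; the idea is just to show why full names matter: you start from the known lineup and apply each substitution in order, and any sub involving a player you don't think is on the floor signals a parsing error to correct by hand.

```javascript
// Hypothetical substitution events; the field names here are
// illustrative, not the real feed's schema.
function trackLineup(starters, events, team) {
  const onCourt = new Set(starters); // the five players currently on the floor
  for (const ev of events) {
    if (ev.type !== "substitution" || ev.team !== team) continue;
    if (!onCourt.has(ev.out)) {
      // A sub for someone we don't have on the floor means our state
      // is wrong; flag it for manual correction instead of guessing.
      throw new Error(`inconsistent sub: ${ev.out} is not on court`);
    }
    onCourt.delete(ev.out);
    onCourt.add(ev.in);
  }
  return [...onCourt].sort();
}

// Example with made-up names:
const lineup = trackLineup(
  ["A. Guard", "B. Guard", "C. Wing", "D. Forward", "E. Center"],
  [{ type: "substitution", team: "BOS", out: "E. Center", in: "F. Backup" }],
  "BOS"
);
console.log(lineup); // starters with F. Backup in place of E. Center
```

This is where ambiguous sources bite: if a feed only gives last names, "Smith out" can match two players and the lineup state silently drifts, which is why a clean source keeps the error rate so low.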
Hey, Ken. There's a D3 meetup in SF tonight at Trulia. Any chance you're going? I'll be there.