Advanced NFL Stats released play-by-play data for the 2002 through 2012 seasons. The play descriptions are human-generated, so doing any data science on them is difficult until you transform the data. Once it is transformed, you can merge it with other datasets to get even more insight. Ideally, you want an easily queryable dataset that you can explore with Hive, Pig, or Impala.
I’ve blogged about some of the hand-written MapReduce jobs I’ve created based on the dataset. So far, I’ve correlated quarterbacks with the receivers they throw to most often. http://www.jesse-anderson.com/2013/01/nfl-play-by-play-analysis/
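As a rough illustration of the kind of transformation involved, here is a minimal Python sketch that pulls a passer and receiver out of free-text play descriptions and counts the pairs. It is not the MapReduce code from the blog post. The description format, the player names in the sample plays, and the function names `extract_pass` and `count_targets` are all assumptions made for this example; the real data would need a more forgiving parser.

```python
import re
from collections import Counter

# ASSUMPTION: descriptions look roughly like
#   "(14:56) J.Smith pass short right to A.Jones for 12 yards."
# The real, human-entered data is less regular than this.
NAME = r"[A-Z][A-Za-z'\-]*\.[A-Za-z'\-]+"
PASS_RE = re.compile(
    r"(?P<passer>" + NAME + r") pass .*?\bto (?P<receiver>" + NAME + r")"
)


def extract_pass(description):
    """Return (passer, receiver) for a pass play, or None for anything else."""
    match = PASS_RE.search(description)
    if match is None:
        return None
    return match.group("passer"), match.group("receiver")


def count_targets(descriptions):
    """Count how often each passer targeted each receiver."""
    counts = Counter()
    for description in descriptions:
        pair = extract_pass(description)
        if pair is not None:
            counts[pair] += 1
    return counts


# Made-up sample lines (hypothetical players), not taken from the dataset.
SAMPLE_PLAYS = [
    "(14:56) J.Smith pass short right to A.Jones to the 45 for 12 yards.",
    "(14:20) B.Miller up the middle for 2 yards.",
    "(13:41) J.Smith pass deep left to R.Brown for 30 yards.",
    "(13:02) J.Smith pass incomplete short middle to A.Jones.",
]

if __name__ == "__main__":
    for (passer, receiver), n in count_targets(SAMPLE_PLAYS).most_common():
        print(passer, receiver, n)
```

In a MapReduce job the extraction step would sit in the mapper and the counting in the reducer; with the fields already split out into columns, the same question becomes a simple GROUP BY in Hive or Impala.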
I am a Creative Engineer with many years of experience in creating products and helping companies improve their software engineering. I strive to provide developers with the resources to learn new technologies and improve their skillsets. I am a Curriculum Developer and Instructor at Cloudera. To help the local community, I volunteer my time as the President of the Northern Nevada Software Developers Group.