This presentation gives you the big picture of how Hadoop works. We will cover the key pieces of the Hadoop ecosystem:
HDFS, the distributed, fault-tolerant filesystem.
Map/Reduce, the method for batch processing distributed data.
In this intro we will cover the key processes (the NameNode, the TaskTracker, and the JobTracker) and the key phases of a job: the map, the reduce, and the sort and shuffle.
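As a rough illustration of how the map, the sort and shuffle, and the reduce fit together, here is a minimal single-process sketch in Python that counts words. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle_and_sort`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example. On a real cluster each phase would run in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_and_sort(pairs):
    """Group values by key and return the groups in sorted key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, then sort and shuffle, then reduce, all in memory."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_and_sort(mapped)
    return [reduce_phase(key, values) for key, values in grouped]


if __name__ == "__main__":
    sample = ["the map the reduce", "the sort and shuffle"]
    for word, count in word_count(sample):
        print(word, count)
```

The point of the sketch is the shape of the data: the map emits key/value pairs, the sort and shuffle brings all values for one key together, and the reduce sees each key exactly once with its full list of values.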
A diverse ecosystem of tools is commonly used alongside Hadoop. Time allows only a brief mention of each and its features: Flume, Oozie, Hive, Pig, and HBase.
Tom Hanlon is currently an instructor at Cloudera, where he delivers courses on the wonders of the Hadoop ecosystem.
Before beginning his relationship with Hadoop and large distributed data, he had a happy and lengthy relationship with MySQL, with a focus on web operations.
He has been a trainer for MySQL, Sun, and Percona.