The Hadoop MapReduce API
Learn how to get started writing programs against Hadoop’s API.
Introduction to MapReduce Algorithms
Writing programs for MapReduce requires analyzing problems in a new way. This lecture shows how some
common functions can be expressed as part of a MapReduce pipeline.
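As a rough illustration of the idea (not material from the lecture itself), the sketch below expresses word counting as a map step, a grouping step, and a reduce step. It runs in memory in plain Python and does not use the Hadoop API; every function name here (map_phase, shuffle, reduce_phase, run_job, word_count_mapper, sum_reducer) is hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle/sort step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer once per key, visiting keys in sorted order."""
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


def run_job(records, mapper, reducer):
    """Chain the three phases into one in-memory 'job'."""
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


def word_count_mapper(line):
    """Emit (word, 1) for each whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    """Sum all counts emitted for one word."""
    return sum(values)


lines = ["the quick brown fox", "the lazy dog", "the fox"]
counts = run_job(lines, word_count_mapper, sum_reducer)
print(counts)
```

Other common functions fit the same shape by swapping the mapper and reducer: for instance, a filter is a mapper that emits only matching records, paired with a reducer that passes its values through.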
Debugging MapReduce Programs
Debugging in a distributed environment is challenging. This lecture will expose you to best practices for program
design to mitigate debugging challenges, as well as local testing tools and techniques for debugging at scale.
Optimizing MapReduce Programs
We’ll use the Cloudera Training VM to work through an example where you write a MapReduce program and
improve its performance using techniques explored earlier.
NOTE: Attendees should download the Cloudera Training VM from http://cloudera.com/hadoop-training-virtual-machine. VMware Player (Windows, Linux) or VMware Fusion (OS X) is required to use it.
Aaron Kimball is a software engineer at Cloudera, Inc., the Commercial Hadoop company. Aaron is the principal developer of Sqoop, the SQL-to-Hadoop database import/export tool. He has been working with Hadoop since early 2007 and contributes actively to its development. Through Cloudera, he also provides training to developers and system administrators working with Hadoop. Aaron holds a B.S. in Computer Science from Cornell University and an M.S. in Computer Science and Engineering from the University of Washington.
Comments
Two most useful explanations: 1) MapRed algorithms 2) Debugging MapRed jobs.
Great presentation.
I really enjoyed the explanation of the different MapReduce algorithms (i.e., using MapReduce for joins).