THIS TUTORIAL HAS REQUIREMENTS AND INSTRUCTIONS LISTED BELOW
Start on low heat with a base of Hadoop; map, then reduce. Flavor, to taste, with Scala’s concise, functional syntax and collections library. Simmer with some Pig bones: a tuple model and high-level join and aggregation operators. Mix in Cascading to hold everything together and boil until it’s very, very hot, and you get Scalding, an API for MapReduce out of Twitter.
Scalding is an open source Scala framework for concisely describing Hadoop MapReduce jobs. I started the project at Twitter as a way for ad server engineers to run simple queries on the ad logs, without needing to learn a specialized language like Pig, or dive too deeply into the guts of Hadoop. Since then, it’s been adopted by teams at Etsy, LinkedIn, eBay, SoundCloud, LivePerson, Stripe, and others, and extended with convenient APIs for everything from large-scale sparse matrix multiplication to locality-sensitive hashing.
This tutorial will walk you through getting started with Scalding, from writing the simplest word-count job up to using probabilistic data structures for distributed machine learning. No specific background in Scala, Hadoop, distributed computing or machine learning is required, though an interest in any or all of these might help.
Bring a laptop, or share one with a friend.
TUTORIAL REQUIREMENTS AND INSTRUCTIONS FOR ATTENDEES
* No specific knowledge needed. Some familiarity with either Scala or Hadoop would be helpful but is not at all required.
* A laptop with a working JDK installation.
QUESTIONS for the speaker? Use the “Leave a Comment or Question” section at the bottom to address them.
Avi has led product, engineering, and data science teams at Etsy, Twitter and Dabble DB (which he co-founded and Twitter acquired). He’s known for his open source work on projects such as Seaside, Scalding, and Algebird. Avi currently works at Stripe.
Help us make this conference the best it can be for you. Have questions you'd like this speaker to address? Suggestions for issues that deserve extra attention? Feedback that you'd like to share with the speaker and other attendees?
View a complete list of OSCON contacts