Hadoop, Pig, and Twitter

Kevin Weil (Twitter, Inc.)
Databases
Location: E145/E146

Massive growth in the size of business datasets is leading many companies to Hadoop, an emerging architecture for parallel data processing and a top-level Apache project. The migration path can be challenging, however, in part because MapReduce analyses are written in programming languages like Java and Python rather than SQL. Apache Pig is a high-level framework built on top of Hadoop that offers a powerful yet vastly simplified way to analyze data stored there. It lets businesses harness the power of Hadoop through a simple language that anyone who understands SQL can readily learn. In this presentation, I will introduce Pig and show how it has been used at Twitter to solve numerous analytics challenges that had become intractable with our former MySQL-based architecture.
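As a rough illustration of the gap the abstract describes (not taken from the talk), the sketch below spells out in plain Python the map, shuffle, and reduce steps behind a simple per-user count. The sample records and every function name are hypothetical, and nothing here calls Hadoop itself. In SQL the same analysis would be roughly `SELECT user, COUNT(*) FROM records GROUP BY user`, and in Pig Latin roughly `grouped = GROUP records BY user; counts = FOREACH grouped GENERATE group, COUNT(records);`.

```python
from collections import defaultdict

# Hypothetical sample records of (user, action); not data from the talk.
records = [
    ("alice", "tweet"),
    ("bob", "tweet"),
    ("alice", "follow"),
    ("alice", "tweet"),
]


def map_phase(record):
    """Emit a (key, 1) pair for one record, as a MapReduce mapper would."""
    user, _action = record
    yield user, 1


def shuffle(pairs):
    """Group emitted values by key, mimicking the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the values for one key, as a MapReduce reducer would."""
    return key, sum(values)


def count_by_user(rows):
    """Run the three phases in-process and return {user: count}."""
    pairs = (pair for row in rows for pair in map_phase(row))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = count_by_user(records)
print(counts)
```

Even for a single grouped count, the hand-written version needs three functions plus glue code, which is the kind of overhead a higher-level language like Pig is meant to remove.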

Kevin Weil

Twitter, Inc.

Kevin Weil leads the analytics team at Twitter, building distributed infrastructure and applying data analysis at massive scale to help grow the popular micro-blogging service. With millions of monthly site visitors and many more interacting through API-based third-party applications, Twitter has one of the world’s most varied and interesting datasets. Prior to joining Twitter, Kevin led the analytics team at the Kleiner Perkins-backed web media startup Cooliris. Kevin earned his bachelor’s degree in Mathematics and Physics from Harvard University and has a master’s degree in Physics from Stanford University.
