Scribe - Moving Data at Massive Scale

Robert Johnson (Facebook)
Tools & Techniques
Location: Portland 255

In just six years Facebook has grown from an experiment in a dorm room to a service used by over 400 million people worldwide. This explosive growth has presented many challenges in systems software and required the development of many custom solutions to handle the unique problems that arise at this scale.

One piece of software we’ve found to be particularly useful in scaling our site is Scribe, an open source system for aggregating massive amounts of logging data from thousands of machines, or more generally moving around large amounts of data in an asynchronous and mostly-reliable way. We initially wrote Scribe for just a couple of use cases, to replace special-purpose systems that were bursting at the seams as we grew. It turned out to be so useful that it now handles hundreds of use cases and delivers over 150 billion messages a day. Scribe is also now being used by Twitter.
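To make "asynchronous and mostly-reliable" concrete, here is a minimal store-and-forward sketch in Python. It is an illustration only, not Scribe's code or API: the class name BufferedForwarder, the send callback, the max_buffered limit, and the drop-oldest policy are all assumptions made for this example. The idea it shows is that callers hand off a message and return immediately, messages wait in a local buffer while the downstream is unreachable, and data is lost only when that buffer overflows.

```python
from collections import deque


class BufferedForwarder:
    """Hypothetical store-and-forward relay (illustrative, not Scribe's API)."""

    def __init__(self, send, max_buffered=1000):
        # send: callable taking a list of (category, message) tuples and
        # returning True on success, False if the downstream is unavailable.
        self._send = send
        self._pending = deque()
        self._max = max_buffered
        self.dropped = 0  # count of messages shed because the buffer was full

    def log(self, category, message):
        """Queue a message locally; the caller never waits on the network."""
        if len(self._pending) >= self._max:
            # Assumed policy: shed the oldest entry when full. This is the
            # "mostly" in mostly-reliable.
            self._pending.popleft()
            self.dropped += 1
        self._pending.append((category, message))

    def flush(self):
        """Try to forward the whole buffer; keep everything if the send fails."""
        if not self._pending:
            return True
        batch = list(self._pending)
        if self._send(batch):
            self._pending.clear()
            return True
        return False
```

In a sketch like this, flush() would be called periodically by a background loop, so a downstream outage costs latency rather than data until the buffer limit is reached.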

In this talk I’ll describe the design philosophy and goals for Scribe and the decisions made in its implementation. I’ll discuss some of the lessons learned as we brought it into production and scaled it with Facebook, and what our plans are for the future.

Robert Johnson

Facebook

Robert Johnson is Director of Engineering at Facebook, where he leads the software development efforts to cost-effectively scale Facebook’s infrastructure and optimize performance for its many millions of users. During his time with the company, the number of users has expanded by more than twenty-fold and Facebook now handles billions of page views a day.

Robert was previously at ActiveVideo Networks, where he led the distributed systems and set-top software development teams. He has worked in a wide variety of engineering roles, from robotics to embedded systems to web software. He received a B.S. in Engineering and Applied Science from Caltech.

Comments

Michael Stack
07/26/2010 12:14pm PDT

I thought this was one of the best sessions I attended all week – nice job. Very informative, and presented at just the right level of technical depth.

Do you think you’ll be able to post the slides from your presentation?

Michael
