Sponsors

  • 10gen
  • DataStax, Inc.
  • Dell
  • Google
  • Lexis Nexis
  • Oracle
  • VMware
  • Percona

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the convention, contact Sharon Cordesse at scordesse@oreilly.com

Download the OSCON Data Sponsor/Exhibitor Prospectus

Media Partner Opportunities

For information on trade opportunities with O'Reilly conferences or contact mediapartners@ oreilly.com

Press and Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com

OSCON Bulletin

To stay abreast of convention news and announcements, please sign up for the OSCON email bulletin (login required)

Contact Us

View a complete list of OSCON contacts

Facebook Messages and HBase

Nicolas Spiegelberg (Facebook)
Data: Hadoop
Location: C123
Average rating: ****.
(4.38, 8 ratings)

When Facebook first started exploring the next generation of Messages, handling the large data quantity and growth was a primary concern. The current Messages infrastructure handles over 350 million users sending over 15 billion person-to-person messages per month. Our chat service supports over 300 million users who send over 120 billion messages per month. The past storage implementation utilized MySQL, but many limitations arose during scaling. Read-optimized data structures, manual sharding, rigid replication options, and operationally-intensive scaling were MySQL pain points that necessitated a look at alternative storage solutions. We set up test frameworks to evaluate clusters of MySQL, Apache Cassandra, Apache HBase, and a couple of other systems. We ultimately chose HBase as the best solution for our workload.

This is a technically-dense talk that should primarily appeal to software engineers and system architects. No prior knowledge of HBase is necessary, however we expect attendees to have a comfortable knowledge of computer architecture and database fundamentals. The audience should leave with an understanding of the benefits of an HBase datastore solution, how Facebook uses HBase to reach massive scale with reliability, and common APIs / configuration parameters to consider when deploying your own HBase solution.

Nicolas Spiegelberg

Facebook

Nicolas Spiegelberg is a storage engineer in the Facebook messaging team. He helped implement the HBase storage solution for Facebook Messages from design to deployment. Additionally, Nicolas is an HBase committer and PMC, who has contributed many critical features such as HDFS data reliability, Bloom Filters, and an enhanced compaction algorithm.

UA-Huntsville : Masters in Computer Engineering

Comments on this page are now closed.

Comments

Louis Frolio
08/12/2011 6:48am PDT

Is there a webcast of this event?

Shu Zhang
07/27/2011 11:50am PDT

Don’t know if this is the right forum for it, but didn’t get a chance to ask in person and didn’t see your email anywhere… So, splitting the data into different hbase clusters was only for partitioning right? Are there any replication done at all across geographies? How do you deal with minimizing latencies for data access across the world? Maybe it’s in cache layer or maybe the partitioning takes user’s geography into account?

An unrelated question is, without any cross-cluster replication, how do you deal with a datacenter outage?

Thanks, my email is shuzhang0@gmail.com

Picture of Siddharth Anand
Siddharth Anand
07/26/2011 12:13pm PDT

Great talk!