Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the convention, contact Sharon Cordesse at scordesse@oreilly.com

Download the OSCON Data Sponsor/Exhibitor Prospectus

Media Partner Opportunities

For information on trade opportunities with O'Reilly conferences, contact mediapartners@oreilly.com

Press and Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com

OSCON Bulletin

To stay abreast of convention news and announcements, please sign up for the OSCON email bulletin (login required)

Contact Us

View a complete list of OSCON contacts

Taming the Big Data Fire Hose

John Hugg (VoltDB)
Data: Real-Time and Streaming
Location: C121/122
Average rating: 3.50 (4 ratings)

The term Big Data describes a new class of database applications that need to process massive data volumes in two disparate states – real time and historical. In either state, the requirements of Big Data applications vastly exceed the capabilities of traditional, one-size-fits-all database systems. Most Big Data applications require MPP scale-out architectures and have the following characteristics:
1. A “fire hose” data source, such as HTTP streams, a sensor grid, or other machine-generated data
2. A real-time database capable of ingesting, organizing and managing high volume inputs
3. A persistent data storage and analysis infrastructure capable of managing petabyte+ historical databases
In this talk, we will introduce a simple formula for all Big Data applications: Big Data = Fast Data + Deep Data. Through a use-case format, we will discuss the specialized requirements for real-time (“fast”) and analytic (“deep”) data management. We’ll also explore ways in which popular business intelligence solutions can be used to implement real-time and historical analytics.
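The Fast Data + Deep Data split described above can be sketched in a few lines. This is a minimal, illustrative simulation only: the FastStore/DeepStore names and the windowed roll-up are assumptions for the sketch, not VoltDB or any other product's API. It shows a fast in-memory tier absorbing fire-hose writes and periodically draining aggregates into a historical tier.

```python
import random
import time
from collections import defaultdict

class FastStore:
    """Real-time ("fast") tier: absorbs high-volume writes, keeps hot aggregates.
    Hypothetical class for illustration, not a real database API."""
    def __init__(self):
        self.counts = defaultdict(int)

    def ingest(self, event):
        self.counts[event["key"]] += 1

    def drain(self):
        """Hand off the current window's aggregates and reset for the next window."""
        snapshot = dict(self.counts)
        self.counts.clear()
        return snapshot

class DeepStore:
    """Historical ("deep") tier: accumulates rolled-up windows for later analytics."""
    def __init__(self):
        self.windows = []

    def load(self, snapshot):
        self.windows.append(snapshot)

    def total(self, key):
        return sum(w.get(key, 0) for w in self.windows)

def fire_hose(n):
    """Simulated machine-generated event stream (the "fire hose")."""
    for _ in range(n):
        yield {"key": random.choice(["sensor-a", "sensor-b"]), "ts": time.time()}

fast, deep = FastStore(), DeepStore()
for i, event in enumerate(fire_hose(1000), start=1):
    fast.ingest(event)
    if i % 250 == 0:  # window boundary: roll the fast tier up into the deep tier
        deep.load(fast.drain())

# Every ingested event is accounted for across the deep tier's windows.
print(deep.total("sensor-a") + deep.total("sensor-b"))
```

In a production system the fast tier would be a scale-out OLTP store and the deep tier a petabyte-scale warehouse, but the handoff pattern (ingest, window, roll up) is the same.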


John Hugg


John Hugg is one of the architects of the VoltDB database, where he spends his days building open source, scalable, enterprise-ready transaction processing tools.

He has spent his professional career working on non-traditional solutions to data management problems, including large-dataset non-parametric statistics, column-oriented analytics systems, cloud deployments of RDBMSs, and XML databases.