Sponsors

  • 10gen
  • DataStax, Inc.
  • Dell
  • Google
  • LexisNexis
  • Oracle
  • VMware
  • Percona

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the convention, contact Sharon Cordesse at scordesse@oreilly.com

Download the OSCON Data Sponsor/Exhibitor Prospectus

Media Partner Opportunities

For information on trade opportunities with O'Reilly conferences, contact mediapartners@oreilly.com

Press and Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com

OSCON Bulletin

To stay abreast of convention news and announcements, please sign up for the OSCON email bulletin (login required)

Contact Us

View a complete list of OSCON contacts

Monday, 07/25/2011

9:00am

Monday, 07/25/2011
Location: Oregon Ballroom 203/204
Sarah Novotny (NGINX), Bradford Stephens (Drawn to Scale)
Opening remarks by the OSCON Data program chairs, Sarah Novotny and Bradford Stephens. Read more.

9:05am

Monday, 07/25/2011
Location: Oregon Ballroom 203/204
Tom Quisel (OkCupid)
Average rating: ***..
(3.22, 9 ratings)
Dive into the distributed system that powers OkCupid’s match searches. Learn how we use C++, event-based programming, and SSDs to solve problems that crop up when building a high performance, high availability distributed system. Read more.

9:20am

Monday, 07/25/2011
Location: Oregon Ballroom 203/204
Benjamin Black (Boundary)
Average rating: ***..
(3.67, 12 ratings)
Keynote by Benjamin Black, Co-founder, fast_ip. Read more.

9:40am

Monday, 07/25/2011
Location: Oregon Ballroom 203/204
Steve Yegge (Google)
Average rating: ****.
(4.71, 17 ratings)
It's 2021. You have a petabyte drive on your keychain, your startup company leases bulk cloud storage by the exabyte, and you have a million cores for data crunching. You can even have your own copy of the entire world's public semantic data. What do you do with it? If you're not sure yet, I've got plenty of ideas for you. Read more.

10:00am

Monday, 07/25/2011
Location: Oregon Ballroom 203/204
Average rating: **...
(2.50, 2 ratings)
An open microphone question and answer session with the morning's keynote speakers. Read more.

10:10am

Monday, 07/25/2011
Location: Exhibit Hall C
Morning Break (30m)

10:40am

Monday, 07/25/2011
Data: Relational
Location: C121/122
Tags: dba_dude
Lars Thalmann (Oracle)
Average rating: ****.
(4.50, 2 ratings)
We describe the new replication features in MySQL 5.5 (GA) and MySQL 5.6 (Development release). Read more.
Monday, 07/25/2011
Data: Hadoop
Location: C123
Tom Hanlon (Cloudera)
Average rating: ****.
(4.27, 11 ratings)
Hadoop gives you the ability to process massive amounts of data at scale. This presentation will show you how Hadoop makes use of commodity hardware to let you build a system that scales, deals gracefully with the failure of individual nodes, and gives you the power of Map/Reduce to process petabytes. Read more.
Monday, 07/25/2011
Data: Roulette
Location: C124
Andrew Turner (GeoIQ)
Average rating: ***..
(3.58, 12 ratings)
We're surrounded by data: open government data, streaming media, and data we create as we track our lives and connect with our communities. Learn how to leverage easy-to-use tools to combine this data for personal and organizational decision making without requiring complex processes or training. Read more.
Monday, 07/25/2011
Location: C125/126
TBC
Monday, 07/25/2011
Data: NoSQL Databases
Location: B118-119
Siddharth Anand (Netflix)
Average rating: ***..
(3.70, 10 ratings)
Over the past few years, Netflix has migrated to the cloud. This talk details Netflix's transition away from relational databases and towards high-availability (NoSQL) storage systems. We rely on a combination of proprietary (e.g. SimpleDB and S3) and open-source (e.g. Cassandra and HBase) NoSQL technologies. Read more.

11:30am

Monday, 07/25/2011
Data: Relational
Location: C121/122
Ryan Lowe (Percona), Haidong Ji (Percona)
Average rating: ****.
(4.00, 2 ratings)
With most modern web applications, there are requirements for both SQL access to complex data as well as simple Key-Value look-ups. This session will cover how to use the HandlerSocket Plug-In for MySQL to get exponentially faster look-ups for simple access patterns. Read more.
Monday, 07/25/2011
Data: Roulette
Location: C123
Gleicon Moraes (7co.cc)
Average rating: *....
(1.88, 8 ratings)
Ever had to dig into a system that misused the most basic features of an RDBMS? Better yet, after the whole NoSQL storm, have you wondered why it didn't show up earlier, when you had to twist your schema to fit something it was not designed for? Check out this collection of anti-patterns, take comfort that you are not alone, and see how you can benefit from it even without big data around. Read more.
Monday, 07/25/2011
Data: Hadoop
Location: C124
Owen O'Malley (HortonWorks)
Average rating: **...
(2.25, 4 ratings)
Adding security to an existing product is never easy, but our team at Yahoo added strong authentication to Apache Hadoop by integrating it with Kerberos. This project was delivered on time and is currently deployed on all of Yahoo's 40,000 Hadoop computers. Come learn how we added security to Hadoop and why it matters. Read more.
Monday, 07/25/2011
Data: Products and Services
Location: C125/126
Aurelian Dumitru (Dell, Inc)
Average rating: **...
(2.00, 1 rating)
In this session, Dell will discuss the analysis of the data types suitable for transfer between Hadoop and EDW, the EDW/Hadoop data lifecycle, data governance between Hadoop and DBMS, and ETL performance tuning and best practices (e.g. Hadoop/DBMS connector, node and network designs). Read more.
Monday, 07/25/2011
Data: NoSQL Databases
Location: B118-119
Patrick Lightbody (New Relic)
Average rating: **...
(2.78, 9 ratings)
Between the NoSQL movement and new cloud offerings, it seems there are new storage options popping up every day. How do you select which one is the best for your project? The truth is that it's unlikely one option is best for all your needs. This session walks you through the various options considered by one startup and how it selected five separate storage engines, and has no regrets about doing so! Read more.

1:30pm

Monday, 07/25/2011
Data: Hadoop
Location: C121/122
Greg Fodor (Etsy)
Average rating: ***..
(3.75, 4 ratings)
The data & analytics teams at Etsy build up and tear down more than a thousand independent Hadoop clusters on EC2 each month. This talk discusses the benefits of this approach, where Elastic Map Reduce serves as a "meta-cluster" in which on-demand Hadoop clusters can be created, used, and shut down quickly and easily. Read more.
Monday, 07/25/2011
Data: Roulette
Location: C123
Ted Dziuba (eBay Local/Milo.com)
Average rating: ****.
(4.64, 11 ratings)
What happens when you write data to disk? We'll explore everything between your programming language and the spinning platters - both optimizations and dangerous pitfalls. Read more.
Monday, 07/25/2011
Benoit Sigoure (StumbleUpon, Inc.)
Average rating: ****.
(4.30, 10 ratings)
OpenTSDB is an open-source, distributed time series database designed to monitor large clusters of commodity machines at an unprecedented level of granularity. OpenTSDB enables operations teams to keep track in real-time of all the metrics exposed by operating systems, applications and network equipment, and makes the data easily accessible. Read more.
Monday, 07/25/2011
Data: Products and Services
Location: C125/126
Jonathan Ellis (DataStax)
Average rating: ***..
(3.67, 3 ratings)
Brisk is an open-source Hadoop and Hive distro that utilizes Cassandra for its core services. Brisk provides integrated Hadoop MapReduce, Hive and job and task tracking, while providing an HDFS-compatible storage layer powered by Cassandra. By accelerating the time between data creation and analysis with DataStax’ Brisk, users experience greater reliability, simpler deployment and lower TCO. Read more.
Monday, 07/25/2011
Data: NoSQL Databases
Location: B118-119
Tags: nosql_nerd
Roger Bodamer (10gen)
Average rating: ***..
(3.83, 6 ratings)
In this workshop, one of the core MongoDB committers will present the fundamental principles of MongoDB, how to set up and interact with the database, and what to consider when building applications using a document-based data model. Read more.

2:20pm

Monday, 07/25/2011
Data: Relational
Location: C121/122
Tags: dba_dude
Bruce Momjian (EnterpriseDB)
Average rating: ****.
(4.00, 1 rating)
Multiversion Concurrency Control (MVCC) allows Postgres to offer high concurrency even during significant database read/write activity. MVCC specifically offers behavior where "readers never block writers, and writers never block readers". This talk explains how MVCC is implemented in Postgres and highlights optimizations which minimize the downsides of MVCC. This talk is for advanced users. Read more.
Monday, 07/25/2011
Theo Schlossnagle (OmniTI/Circonus)
Average rating: ****.
(4.38, 8 ratings)
The art of dealing with real-time data is not new. In fact, much of the world's economy is propped up by making decisions on data in sub-millisecond time. The technology is there; we have the power. We'll take a whirlwind tour of the open-source Esper system and understand how to integrate it into your stack to enable rapid decision making on real-time data from anywhere in your architecture. Read more.
Monday, 07/25/2011
Data: Hadoop
Location: C124
Arun Murthy (Hortonworks Inc.)
Average rating: ***..
(3.00, 4 ratings)
YARN is the next generation of Hadoop Map-Reduce designed to scale out much further while allowing for running applications other than pure Map-Reduce in a highly fault-tolerant manner. Read more.
Monday, 07/25/2011
Data: NoSQL Databases
Location: B118-119
Ezra Zygmuntowicz (VMware Inc)
Average rating: ****.
(4.00, 2 ratings)
Redis is an entry in the new breed of NoSQL databases, but it takes a different approach that makes it much more interesting than most of the other key/value stores in the same category. Come learn what makes Redis so useful that it seems everyone is adding it to their toolbox. Read more.

3:30pm

Monday, 07/25/2011
Data: Relational
Location: C121/122
Average rating: ***..
(3.00, 3 ratings)
We at DeNA (the largest social game provider in Japan) handle over 2 billion page views per day with MySQL. We use SSDs heavily and tune Linux. We run non-trivial solutions such as non-stop, automated MySQL master failover. We also use MySQL not only as a traditional RDBMS but also as an extremely high-performance NoSQL store. I'd like to introduce the MySQL solutions that make our social games scale better. Read more.
Monday, 07/25/2011
Jonathan Seidman (Orbitz Worldwide), Ramesh Venkataramaiah (Orbitz Worldwide)
Average rating: **...
(2.75, 8 ratings)
An overview of the state of the art for bringing together the analytical power of the R language with the big data capabilities of Hadoop. Read more.
Monday, 07/25/2011
Aaron Kimball (Magnify Consulting)
Average rating: ***..
(3.62, 8 ratings)
This talk introduces an open-source SQL-based system for continuous or ad-hoc analysis of streaming data built on top of Flume-based data collection for Hadoop. Attendees will understand how to use a new tool to extend their Hadoop data collection pipeline with real-time streaming analytics. Read more.
Monday, 07/25/2011
Location: B118-119
Tom White (Cloudera)
Average rating: ***..
(3.33, 3 ratings)
Apache Whirr is a way to run distributed systems - such as Hadoop, HBase, Cassandra, and ZooKeeper - in the cloud. Whirr provides a simple API for starting and stopping clusters for evaluation, test, or production purposes. This talk explains Whirr's architecture and shows how to use it. Read more.

4:20pm

Monday, 07/25/2011
Data: Relational
Location: C121/122
Inaam Rana (Oracle), Calvin Sun (Twitter)
Average rating: *....
(1.33, 6 ratings)
There are many exciting InnoDB performance and scalability features in MySQL 5.5 and its upcoming release. But how do you best use them? What are the caveats? In this session, we will describe those performance and scalability features in depth. We will also present some benchmark results that explore the performance of those features. Read more.
Monday, 07/25/2011
Noah Pepper (Lucky Sort), Homer Strong (Lucky Sort)
Average rating: ***..
(3.18, 11 ratings)
We produce gorgeous LaTeX reports while harnessing the power of R on the backend. The data is pulled from our PostgreSQL database, and the analysis and visualizations are fast and distributed thanks to Redis. We'll talk about weaving together open source tools to build powerful analytics reporting engines that rival the commercial alternatives. Read more.
Monday, 07/25/2011
Data: NoSQL Databases
Location: C124
Rusty Klophaus (Basho Technologies)
Average rating: ****.
(4.67, 3 ratings)
The Basho engineering team has been working to make Riak more queryable with the addition of built-in indexing plus a SQL-style query language. In this talk, Rusty describes the usage, benefits, limitations, and evolution of this functionality, called Secondary Indices. He also covers the challenges and pitfalls of adding indexing to a distributed datastore. Read more.
Monday, 07/25/2011
Location: B118-119
Brian Aker (HP)
Average rating: ***..
(3.50, 2 ratings)
Many people view topics like Map/Reduce and queue systems as advanced concepts that require in-depth knowledge and time-consuming software setup. Gearman is changing all that by making this barrier to entry as low as possible with an open source, distributed job queuing system. Read more.

5:00pm

Monday, 07/25/2011
Location: Gather (Double Tree Hotel bar)
Average rating: **...
(2.70, 10 ratings)
Join other Android developers for happy hour at Gather in the Double Tree Hotel on Monday evening. Meet face-to-face and share experiences with other developers working on Android. The first 100 people there get a free drink ticket. Read more.

7:00pm

Monday, 07/25/2011
Location: Oregon Ballroom
Average rating: ****.
(4.79, 24 ratings)
If you had five minutes on stage what would you say? What if you only got 20 slides and they rotated automatically after 15 seconds? Would you pitch a project? Launch a web site? Teach a hack? We’re going to find out when we conduct our third Ignite event at OSCON. Read more.

9:00pm

Monday, 07/25/2011
Location: C121/122
Moderated by: John Mark Walker
GlusterFS is an open source scale-out NAS solution. The software is a powerful and flexible solution that simplifies the task of managing unstructured file data whether you have a few terabytes of storage or multiple petabytes. In this BoF, we'll discuss the GlusterFS architecture, roadmap and share recipes for deploying at scale. Read more.
Monday, 07/25/2011
Location: C123
Moderated by: Jeffrey Osier-Mixon
The Yocto Project™, shepherded by the Linux Foundation, is an open source collaboration project that provides templates, tools and methods to help you create custom Linux-based systems for embedded products regardless of the hardware architecture. This BoF is a place for people to learn about the Yocto Project and discuss embedded Linux tools solutions. Read more.
Monday, 07/25/2011
Location: C124
Moderated by: Justin Early
Writing code for ECMAScript 5, Node.js, jQuery, or Dojo? Come see a demonstration of how the VJET JavaScript IDE helps you code faster, discover problems earlier, search the code base, and run and debug, all within the Eclipse VJET JavaScript IDE. Read more.
Monday, 07/25/2011
Location: C125/126
Moderated by: Linda Halligan
Average rating: *****
(5.00, 1 rating)
LinuxChix is a community for women who like Linux and Free Software, and for women and men who want to support women in computing. The membership ranges from novices to experienced users, and includes professional and amateur programmers, system administrators and technical writers. Read more.
Monday, 07/25/2011
Location: See BoF Schedule for Locations
Average rating: ***..
(3.00, 1 rating)
Birds of a Feather (BoF) sessions provide face to face exposure to those interested in the same projects and concepts. BoFs can be organized for individual projects or broader topics (best practices, open data, standards). BoFs are entirely up to you. We post your topic online and onsite and provide the space and time. You provide the engaging topic. Read more.

Tuesday, 07/26/2011

9:00am

Tuesday, 07/26/2011
Location: Oregon Ballroom 203/204
Sarah Novotny (NGINX), Bradford Stephens (Drawn to Scale)
Opening remarks by the OSCON Data program chairs, Sarah Novotny and Bradford Stephens. Read more.

9:05am

Tuesday, 07/26/2011
Location: Oregon Ballroom 203/204
Dwight Merriman (10gen)
Average rating: ***..
(3.71, 7 ratings)
Much has been made of scalability as a driver for choosing a database, but the choice of a database influences much more than the scaling architecture. Different database choices drive different data models which in turn influence the development process. Read more.

9:20am

Tuesday, 07/26/2011
Location: Oregon Ballroom 203/204
Adrian Cockcroft (Battery)
Average rating: ****.
(4.44, 9 ratings)
Keynote by Adrian Cockcroft, Cloud Architect, Netflix. Read more.

9:40am

Tuesday, 07/26/2011
Location: Oregon Ballroom 203/204
Brian Aker (HP)
Average rating: ***..
(3.50, 8 ratings)
We love data, and today we generate data in astronomical amounts. When we hit save on a document, snap a photo, or fill out a form online, we want to know that this data will persist, and we want to know that we can share, access, or reference it in the future. For any meaningful use, we need to know how data relates to other data. Read more.

10:00am

Tuesday, 07/26/2011
Location: Oregon Ballroom 203/204
Average rating: ****.
(4.00, 2 ratings)
The first OSCON Data Innovation Award winner will be announced. Read more.

10:40am

Tuesday, 07/26/2011
Data: Real-Time and Streaming
Location: C121/122
John Hugg (VoltDB)
Average rating: ***..
(3.50, 4 ratings)
In this talk, we will introduce a simple formula for all Big Data applications: Big Data = Fast Data + Deep Data. Through a use-case format, we will discuss the specialized requirements for real-time (“fast”) and analytic (“deep”) data management. Read more.
Tuesday, 07/26/2011
Data: Relational
Location: C123
Selena Deckelmann (PostgreSQL)
Average rating: ****.
(4.12, 8 ratings)
PostgreSQL continues to provide a major release every year full of improvements, better performance and features that measure up to the most popular commercial databases. Our 2011 release, 9.1, is no exception! Read more.
Tuesday, 07/26/2011
Jean-Daniel Cryans (Cloudera)
Average rating: ****.
(4.00, 4 ratings)
Imagine for a moment doing a JOIN on two HBase tables. Crazy talk, right? Well, now you can, thanks to Hive. True, it is only meant to be used in a batch context, but we have been doing it for a few months now at StumbleUpon, and our analysts and engineers love it. This presentation will cover how the Hive-HBase integration works and how we use it at our company. Read more.
Tuesday, 07/26/2011
Data: Big Data
Location: B118-119
Jay Kreps (LinkedIn)
Average rating: ****.
(4.11, 9 ratings)
The last few years have brought a wealth of new data technologies organized around horizontal scalability. This talk will cover the essential infrastructure areas: real-time stream processing, offline data crunching, large-scale data deployments and live serving. The focus will be on how these ingredients come together to enable innovative data-driven products at LinkedIn. Read more.

11:30am

Tuesday, 07/26/2011
Data: Relational
Location: C121/122
Andrew Aksyonoff (Sphinx Technologies), Richard Kelm (Sphinx Search)
Average rating: *....
(1.90, 10 ratings)
Whether you're a beginner Web guy or a veteran DBA, whether you get your hands dirty with code or just manage systems, you still must know algorithms. How come? Because that knowledge enables you to optimize your work, conduct correct benchmarks, and make educated decisions. We'll show you how knowing only a little about SQL internals can help so much with tuning things. Read more.
Tuesday, 07/26/2011
Data: Hadoop
Location: C123
Nicolas Spiegelberg (Facebook)
Average rating: ****.
(4.38, 8 ratings)
In November, Facebook launched a new version of Messages that combines chat, SMS, email, and Messages into a real-time conversation. Facebook relies on Apache HBase, a NoSQL-style database, for storing this real-time message data. This talk will elaborate on our decision process, system configuration, scaling issues, and advantages gained by choosing Open Source. Read more.
Tuesday, 07/26/2011
Jeff Hamann (Forest Informatics)
Average rating: **...
(2.67, 3 ratings)
Learn how to cobble together a PostgreSQL database, install a few handy R packages, a pinch of language extensions, and a handful of publicly available data to generate a forest monitoring platform to help landscape managers make better decisions using basic design-engineering paradigms to perform quick trade-off analyses. Read more.
Tuesday, 07/26/2011
Data: Products and Services
Location: C125/126
Bill Fox J.D., M.A. (LexisNexis), Charles Kaminski (LexisNexis)
Average rating: ****.
(4.00, 1 rating)
A big data case study with the NY Medicaid Inspector General's Office and HPCC Systems from LexisNexis. Read more.
Tuesday, 07/26/2011
Data: Big Data
Location: B118-119
Jared Williams (New York State Senate), Noel Hidalgo (World Economic Forum), Graylin Kim (New York State Senate)
Average rating: ***..
(3.50, 2 ratings)
The story of the development team and what lessons we learned in building Open Legislation - an open government platform. It will detail our transition from a MySQL back end to an application fully powered by Lucene, the data quality and efficiency issues that we’ve had to address, and how we’re now trying to rebuild internal trust after our iterative and initially shaky development process. Read more.

1:30pm

Tuesday, 07/26/2011
Data: Big Data
Location: C121/122
Tom Wilkie (Acunu Ltd)
Average rating: ****.
(4.80, 5 ratings)
The standard Linux storage stack wasn't designed for write-heavy big data workloads, nor is it well-suited to modern hardware: large, slow SATA disks, SSDs or many cores. Castle, an open-source project, is a ground-up overhauling of RAID, file systems, and the POSIX interface. Read more.
Tuesday, 07/26/2011
Data: Relational
Location: C123
Jeremy Bingham (Dailykos.com)
Average rating: **...
(2.40, 5 ratings)
Keeping a busy site going when you don't have a lot of servers or developer resources can be a struggle. Hear what we did at Daily Kos to make the most of what we had to bring MySQL in line, make it quick, and keep the users and the boss happy. Read more.
Tuesday, 07/26/2011
Russell Hanson (RSI/Harvard/TCIN)
Average rating: **...
(2.67, 3 ratings)
Synthetic biology is a new field where basic biological components can be engineered to create something new. It often involves DNA synthesizers, ligation, promoters, and polymerase chain reaction -- which may or may not be safe for your in silico environment. However, as the size and complexity of the systems increase, tools become more and more important, thus CAD for biology has emerged. Read more.
Tuesday, 07/26/2011
Data: NoSQL Databases
Location: B118-119
Tags: nosql_nerd
Dwight Merriman (10gen)
Average rating: ****.
(4.00, 3 ratings)
One of the challenges that comes with moving to MongoDB is figuring out how to best model your data. While most developers have internalized the rules of thumb for designing schemas for RDBMSs, these rules don't always apply to MongoDB. Read more.

2:20pm

Tuesday, 07/26/2011
Data: Relational
Location: C121/122
Brian Aker (HP)
Average rating: ****.
(4.00, 2 ratings)
Ever wondered what would happen if you could rethink a decade's worth of design changes? Drizzle is a redesign of the MySQL server targeted at web development and cloud infrastructure. Update yourself on the latest features and use cases for Drizzle7 and what is in store for the near future. Read more.
Tuesday, 07/26/2011
Data: Big Data
Location: C123
Kate Matsudaira (SEOmoz)
Average rating: ***..
(3.50, 10 ratings)
Building large data applications can present a unique set of technical challenges because things that often work well in the conventional development environment can become incredibly arduous or expensive when applied on a much bigger scale. This talk will cover some of those challenges and potential solutions for each. Read more.
Tuesday, 07/26/2011
David Pacheco (Joyent), Brendan Gregg (Joyent)
Average rating: ***..
(3.00, 3 ratings)
We'll present the architecture and implementation of a Node.js/DTrace-based distributed platform for analyzing the performance of cloud applications in real-time. We'll do a live demo on a real, internet-facing cloud and discuss some of the interesting performance pathologies we've found and explained using this tool. Read more.
Tuesday, 07/26/2011
Data: Big Data
Location: B118-119
Erik Onnen (Urban Airship)
Average rating: *****
(5.00, 3 ratings)
This talk will cover lessons learned in building Urban Airship's large-scale data warehouse in EC2 including PostgreSQL, Kafka, Cassandra, HBase and Hadoop. Read more.

3:30pm

Tuesday, 07/26/2011
Data: Scaling
Location: C121/122
Laura Thomson (Mozilla Corporation), Josh Berkus (PostgreSQL Experts), Corey Shields (Mozilla Corporation), Justin Dow (Mozilla Corporation)
Average rating: **...
(2.75, 4 ratings)
If you've ever had to move from data center to data center or to the cloud, or from old hardware to new hardware, you know that it's even more painful than moving house. In this presentation, survivors will tell you how to stay sane (and how to get it right) with a case study from Mozilla: moving 30TB of crash reports with no downtime in data collection. Read more.
Tuesday, 07/26/2011
Location: C123
Scott Andreas (Boundary Inc.)
Average rating: ***..
(3.87, 15 ratings)
This language-agnostic proposal focuses upon concepts and strategies critical to the design and implementation of asynchronous systems and data processing layers. Key components include a survey of implementation strategies for non-blocking edge tiers, patterns for building out a distributed worker / processing tier, along with several horror stories of cascading failures and their resolution. Read more.
Tuesday, 07/26/2011
Data: Roulette
Location: C124
Sharing data is critical in a world where crisis can occur at any moment. Often, valuable data is stored in disparate locations with no information on how to access it. This presentation discusses spatial data discovery and open source tools for implementing a data-sharing catalog. Esri’s Geoportal Server will be used to show sharing and discovery in action. Talk is open to all attendees. Read more.
Tuesday, 07/26/2011
Data: NoSQL Databases
Location: B118-119
Adam Silberstein (Yahoo!)
Average rating: ***..
(3.50, 2 ratings)
I will give an overview of PNUTS, a large-scale, geographically replicated serving data store in widespread use at Yahoo! I will introduce key use cases, the main system components, key design decisions, and ongoing work. Read more.

4:20pm

Tuesday, 07/26/2011
Location: C121/122
Robert Treat (OmniTI)
Average rating: ****.
(4.17, 6 ratings)
Everyone thinks they know what sharding is and how to do it, but simple horizontal read scaling is small potatoes. In this talk we'll focus on the sharding pattern for large scale read/write architectures, based on real world implementations. Supporting millions of users on commodity hardware doesn't need magical software, just careful application of the right scalability pattern. Read more.
Tuesday, 07/26/2011
Josh Patterson (Cloudera)
Average rating: ***..
(3.75, 8 ratings)
Time series sensors are being ubiquitously integrated in places like cell phones, environmental sensors, and the smart grid. As we scale out this type of data, RDBMS systems strain to keep up with the high insertion rates and real-time query requirements. In this talk we introduce “Lumberyard”, which provides scalable indexing and low-latency fuzzy pattern searching of time series data. Read more.
Tuesday, 07/26/2011
Data: Roulette
Location: C124
Peter Neubauer (Neo Technology)
Average rating: **...
(2.00, 1 rating)
Location-based services are hot, but geographic datasets are complex. But this shouldn’t put you off writing awesome location-aware services. This talk will show how to create spatial models and query the Open Street Map dataset together with social data using the Neo4j graph database. Read more.
Tuesday, 07/26/2011
Data: Products and Services
Location: C125/126
Harry Heymann (foursquare)
Average rating: ****.
(4.00, 2 ratings)
A talk about how to scale foursquare using MongoDB and Scala. Read more.
Tuesday, 07/26/2011
Data: Scaling
Location: B118-119
Andy Blyler (Barracuda Networks), Lindsay Snider
Average rating: ****.
(4.00, 1 rating)
Solr, an open source enterprise search server, scales very well within an index (vertical scaling). It is when you have multiple indexes (horizontal scaling) that it starts to get hairy, which happens a lot when you are hosting a cloud-based solution for multiple users. In this session we will discuss these issues in depth, as well as techniques for overcoming them. Read more.

5:00pm

Tuesday, 07/26/2011
Location: Expo Hall
Average rating: ***..
(3.92, 24 ratings)
Grab a drink and kick off the 13th edition of OSCON by meeting and mingling with exhibitors and fellow attendees. Read more.

6:00pm

Tuesday, 07/26/2011
Location: Hall B
Average rating: ****.
(4.22, 37 ratings)
Step right up and join us at the O'Reilly OSCON Carnival. There will be games, clowns, sumo wrestling, log rolling, tattoos, and lots more. There's free food, free wine, and free beer. You’ve never seen a carnival like this. Trust us. Read more.

8:00pm

Tuesday, 07/26/2011
Location: 411 NW Park Ave.
Average rating: ****.
(4.08, 12 ratings)
Join Puppet Labs and SwellPath Interactive at their headquarters in the Pearl District. The party is free, as in free beer, food and fun. Two floors, two open bars, and more. Take the Green or Yellow line (free transit) west to Union Station and walk 2 blocks west to 411 NW Park Ave. Read more.

Wednesday, 07/27/2011

9:00am

Wednesday, 07/27/2011
Location: Portland Ballroom
Average rating: ***..
(3.63, 19 ratings)
Keynotes today will be shared by OSCON, OSCON Data, and OSCON Java. Read more.

9:05am

Wednesday, 07/27/2011
Location: Portland Ballroom
Jono Bacon (Canonical Ltd)
Average rating: **...
(2.64, 55 ratings)
In this new keynote, Jono Bacon, author of The Art of Community (O'Reilly), founder of the Community Leadership Summit and award-winning Community Manager for the global Ubuntu community, talks about the new opportunities and challenges we face in understanding the art and science of community leadership. Read more.

9:20am

Wednesday, 07/27/2011
Location: Portland Ballroom
Steve Holden (Holden Web LLC)

9:25am

Wednesday, 07/27/2011
Location: Portland Ballroom
Gianugo Rabellino (Microsoft)
Average rating: **...
(2.51, 49 ratings)
The world is changing, and so is Microsoft. We are continuing down the path of even greater openness and interoperability in new ways . . . not just in development, but rising to meet the challenges and opportunities of the cloud and becoming flexible and nimble in the world of mobile. Read more.

9:40am

Wednesday, 07/27/2011
Location: Portland Ballroom
Ariel Waldman (Spacehack.org)
Average rating: ****.
(4.35, 62 ratings)
From launching robots into space to discovering distant galaxies: how people are creating open source space exploration and hacking science. Read more.

9:55am

Wednesday, 07/27/2011
Location: Portland Ballroom
Average rating: *....
(1.97, 37 ratings)

10:40am

Wednesday, 07/27/2011
Data: NoSQL Databases
Location: Oregon Ballroom 203
Jeffrey Kirkell (Project Management Institute)
Average rating: *....
(1.17, 6 ratings)
The popularity of NoSQL opens up an endless array of possible uses but also causes its own set of problems. Riak, a NoSQL offering created by Basho, addresses this by claiming to have no single point of failure. Proving this claim goes a long way toward dispelling concerns within an enterprise about adopting a non-relational solution. Read more.
Wednesday, 07/27/2011
Data: NoSQL Databases
Location: Oregon Ballroom 204
Bradley Holt (Found Line)
Average rating: ***..
(3.12, 8 ratings)
CouchDB is a document-oriented database that uses JSON documents, has a RESTful HTTP API, and employs map/reduce views for querying data. This tutorial will teach web developers the concepts they need to get started using CouchDB in their projects. Libraries are available for CouchDB’s RESTful HTTP API in many programming languages and we will take a look at some of the more popular ones. Read more.

1:40pm

Wednesday, 07/27/2011
Data: Roulette
Location: Oregon Ballroom 203
Krishna Sankar (Tata America International)
Average rating: ***..
(3.00, 3 ratings)
Algorithms are getting raunchier, tools more potent, and competitions more intimate! Let us mix analytics tools (like R & Mahout) and a dash of algorithmics to work on BigData Analytics competitions and see if the answer is always 42. In the process we will explore and apply a few good algorithms to the Heritage Health competition … Read more.
Wednesday, 07/27/2011
Data: Relational
Location: Oregon Ballroom 204
Tags: sql, postgres, dba
Robert Treat (OmniTI)
Average rating: ***..
(3.50, 6 ratings)
The open source database landscape has never been in more turmoil, and yet the popularity of Postgres continues to grow and grow. Get up to speed on what you need to know to administer the world's most advanced open source database, including installation, configuration, tuning, and how best to use PostgreSQL's community resources, with a special focus on Postgres 9 and the upcoming 9.1 release. Read more.

4:10pm

Wednesday, 07/27/2011
Data: Analytics and Visualization
Location: Oregon Ballroom 203
Robin Anil (Google), Ted Dunning (MapR Technologies)
Average rating: **...
(2.75, 4 ratings)
This hands-on tutorial aims to teach the basics of the important machine learning algorithms in Mahout and to help you get it up and running on a Hadoop cluster. Mahout is an open source implementation of a collection of algorithms designed from the ground up to sift through terabytes of data and help bring out important patterns that are otherwise out of reach of standard tools. Read more.
Wednesday, 07/27/2011
Data: Roulette
Location: Oregon Ballroom 204
Dhruv Bansal (Infochimps), Winnie Hsia (Infochimps)
You have an idea for an app. Great! First you have to munge and maintain the data. Did you know there is one data API to pull clean, updated data from multiple sources? It slices, it dices, it serves out data on geo, social & more! And you don't even need to touch MySQL. Mash up some data with the Infochimps Data Scientists Jacob Perkins, Dhruv Bansal and Ham the Incredible Coding Chimp. Read more.

5:40pm

Wednesday, 07/27/2011
Location: Expo Hall
Average rating: ***..
(3.27, 11 ratings)
Quench your thirst with vendor-hosted libations and snacks while you check out all the cool stuff in the expo hall. Read more.

7:00pm

Wednesday, 07/27/2011
Location: See BoF Schedule for Locations
Average rating: *****
(5.00, 1 rating)
Birds of a Feather (BoF) sessions provide face to face exposure to those interested in the same projects and concepts. BoFs can be organized for individual projects or broader topics (best practices, open data, standards). BoFs are entirely up to you. We post your topic and provide the space and time. You provide the engaging topic. Read more.

Thursday, 07/28/2011

9:00am

Thursday, 07/28/2011
Location: Portland Ballroom
Jim Zemlin (The Linux Foundation)
Average rating: ****.
(4.28, 29 ratings)
On the eve of Linux’s 20th anniversary, Jim Zemlin invites the OSCON audience into his “Bizarro World” of 2011. The world of computing has been turned upside down. Microsoft’s stock is down. They are now filing antitrust suits rather than being the subject of them. Heck, Microsoft is even contributing code to Linux. And for good reason. Read more.

9:15am

Thursday, 07/28/2011
Location: Portland Ballroom
Fred Trotter (FredTrotter.com)
Average rating: ***..
(3.13, 30 ratings)
Open source software will power a new Internet layer, the Health Internet, which will finally make healthcare data liquid. The Health Internet will change healthcare the same way the Internet changed everything else: better, faster, cheaper. Read more.

9:20am

Thursday, 07/28/2011
Location: Portland Ballroom
Eri Gentry (BioCurious)
Average rating: ****.
(4.19, 31 ratings)
Join Eri Gentry, founder of BioCurious, the world’s first “hackerspace for biology”, on a journey from garage biology to community lab. Read more.

9:35am

Thursday, 07/28/2011
Location: Portland Ballroom
John Graham-Cumming (CloudFlare)
Average rating: ****.
(4.14, 21 ratings)
This talk tells the behind-the-scenes story of the apology campaign complete with source code, tips on dealing with the old-school media, how Twitter helped and didn't, and a call for people who want to change the world to be "reasonably unreasonable" because nothing ever gets done by the reasonable. Read more.

9:45am

Thursday, 07/28/2011
Location: Portland Ballroom
Gabe Zichermann (Gamification.Co & Gamification Summit)
Average rating: ****.
(4.03, 33 ratings)
Creating engaging user experiences in software has become the mantra of businesses big and small, but what about open source? Do we do enough user-centric design, and are we creating the kind of long-term user engagement we want? What challenges do open source advocates and developers face in building truly engaging experiences, and how can gamification make open-everywhere a reality? Read more.

10:00am

Thursday, 07/28/2011
Location: Portland Ballroom
The 7th Annual O’Reilly Open Source Award winners will be announced. Read more.

7:00pm

Thursday, 07/28/2011
Location: See BoF Schedule for Locations
Birds of a Feather (BoF) sessions provide face to face exposure to those interested in the same projects and concepts. BoFs can be organized for individual projects or broader topics (best practices, open data, standards). BoFs are entirely up to you. We post your topic and provide the space and time. You provide the engaging topic. Read more.

9:00pm

Thursday, 07/28/2011
Location: Jupiter Hotel @ the Dream Tent
Average rating: ***..
(3.33, 3 ratings)
Thursday, July 28th: the (mt) Media Temple Party, held at the Jupiter Hotel @ the Dream Tent, with an open bar, all-you-can-eat tacos, and a DJ! Read more.

Friday, 07/29/2011

9:00am

Friday, 07/29/2011
Location: Portland Ballroom
Edd Dumbill (Silicon Valley Data Science), Sarah Novotny (NGINX)
Average rating: ****.
(4.18, 11 ratings)
Opening remarks by the OSCON program chairs, Sarah Novotny and Edd Dumbill. Read more.

9:05am

Friday, 07/29/2011
Location: Portland Ballroom
Dan Melton (Code for America)
Average rating: ***..
(3.50, 26 ratings)
Code for America is a new type of public service for geeks to leverage their engineering skills to bring open source practices to communities across America. We'll talk about the growing geek corps and the challenges of leveraging each other's work in building our digital communities. Read more.

9:20am

Friday, 07/29/2011
Location: Portland Ballroom
Brian Fitzpatrick (Google, Inc.)
Average rating: ****.
(4.53, 36 ratings)
Keynote by Brian Fitzpatrick, Engineering Manager, Google, Inc. Read more.

9:35am

Friday, 07/29/2011
Location: Portland Ballroom
Karen Sandler (GNOME Foundation)
Average rating: ****.
(4.64, 39 ratings)
Keynote by Karen Sandler, Executive Director, GNOME Foundation. Read more.

12:40pm

Friday, 07/29/2011
Location: Portland Ballroom
Paul Fenwick (Perl Training Australia)
Average rating: ****.
(4.92, 36 ratings)
Our brains are not at all suited for modern life, and are plagued by a raft of bugs and unwanted features that we've been unable to remove. Join us in a tour of some of the most amusing bugs and exploits wetware has to offer. Read more.