Data

Today’s system architectures embrace many flavors of data: relational, NoSQL, big data and streaming

Location: D137/138
Josh Berkus (PostgreSQL Experts, Inc.)
Average rating: ****.
(4.32, 22 ratings)
So, you've inherited a PostgreSQL server. Congratulations? Thanks to Postgres' popularity as the database for new applications, thousands of developers, system administrators and devops are finding themselves in charge of PostgreSQL servers with no idea what to do next. This tutorial will cover the essentials. Read more.
Location: Portland 255
Ray DiGiacomo, Jr. (Lion Data Systems, LLC)
Average rating: **...
(2.32, 53 ratings)
This workshop will give attendees an introduction to R, an open-source statistical computing environment that some say is even more powerful and flexible than SAS and SPSS. The session will also provide an introduction to predictive analytics theory and to R's ability to apply that theory to real-world situations. Read more.
Location: Portland 252
Shaun Verch (MongoDB)
Average rating: **...
(2.25, 24 ratings)
This tutorial will be a crash course in the basics of how to use MongoDB, as well as an introduction to some of MongoDB's core design principles. We'll start by going over the fundamentals of what MongoDB is, use that as context for starting a simple application, and finish off by showing how to set up MongoDB Replica Sets and Sharded Clusters. Read more.
Location: Portland 255
Tom Wheeler (Cloudera, Inc.)
Average rating: ****.
(4.06, 48 ratings)
This is a solid introduction to Apache Hadoop that explains what it is, why it's relevant and how it works. No previous experience is required, and participants will gain a clear understanding of how Apache Hadoop (and many complementary tools) can be used for scalable data processing as well as approaches for integrating it with existing systems. Read more.
Location: D139/140
Henrik Ingo (MongoDB)
Average rating: ***..
(3.75, 8 ratings)
High availability has become a mandatory feature for databases. MySQL replication is the most used replication solution on the Internet, but a whole family of alternatives exists in the MySQL ecosystem. This tutorial walks you through your options and teaches you how to weigh the pros and cons of each to pick the solution that best matches your use case. Read more.
Location: Portland 256
Erik Hatcher (LucidWorks)
Average rating: ***..
(3.33, 21 ratings)
Apache Solr is a blazing-fast, highly scalable, Lucene-based search engine used in thousands of applications and projects at organizations such as Zappos, Wells Fargo, Getty Images and many more. This tutorial will provide you with the fundamentals, enabling you to be up and running with Solr in minutes. Read more.
Location: D137/138
Michael Hunger (Neo Technology)
Average rating: ****.
(4.08, 12 ratings)
This tutorial covers the core functionality of the Neo4j graph database. With a mixture of theory and hands-on practice sessions, attendees will quickly learn how easy it is to develop a Neo4j-backed application. Read more.
Location: E143/144
Ted Dunning (MapR), Jacques Nadeau (Apache Foundation/MapR)
Average rating: **...
(2.91, 11 ratings)
We’ll start the session by giving users an overview of Apache Drill and its key extension APIs. Afterwards, we’ll describe an example use case where Apache Drill’s native capabilities are lacking. We’ll then work through design and development using Java and scripting to add extensions to the Apache Drill platform. Read more.
Location: Portland 256
Kathleen Ting (Cloudera)
Average rating: ***..
(3.11, 9 ratings)
ZooKeeper is the unsung hero. Although it is a critical component, ZooKeeper is often noticed only after it’s missing. In this presentation, we'll talk about how to efficiently resolve some of the common issues that can make ZooKeeper unavailable. An impenetrable ZooKeeper makes for a healthy cluster. Read more.
Location: Portland 256
Sanjay Radia (Hortonworks), Suresh Srinivas (Hortonworks)
Average rating: ***..
(3.75, 8 ratings)
Hadoop 2.0 offers major HDFS improvements: new append-pipeline, federation, wire compatibility, NameNode HA, performance improvements, etc. In this session, we'll describe these features, their benefits and the development underway for the next HDFS release. This includes data management features, added support for storage devices and improvements to performance, diagnosability and manageability. Read more.
Location: Portland 256
Dawn Nafus (Intel)
Average rating: **...
(2.25, 4 ratings)
How can open source help people get something useful out of the sensor data they generate? Based on social science research, this session will give developers some simple tools to understand how non-geeks make sense of complex data, and offer some approaches to improving the user experience of both hardware and software based on that knowledge. Read more.
Location: Portland 256
Jesse Anderson (Cloudera)
Average rating: ***..
(3.50, 20 ratings)
Gaining insight into data is even more interesting when the data comes from the NFL. See how I take play-by-play data, combine it with other datasets and draw insights from the result. Read more.
Location: Portland 256
Mark Grover (Cloudera)
Average rating: ****.
(4.64, 11 ratings)
If you have ever wanted to dabble with Apache Hadoop, Hive, HBase or other projects in the Hadoop ecosystem but have been discouraged by the painful process of installation and configuration of these projects, this talk is for you. We will learn how to install Hadoop, Hive and HBase on a cluster by making use of various packages from Apache Bigtop. Read more.
Location: Portland 256
Russell Branca (Cloudant)
Average rating: ***..
(3.10, 10 ratings)
MapReduce has become a household name in data processing these days, but it is typically used in a backend, batch-oriented manner across large data sets. In this talk we'll explore pipelining data sets far too large to fit in the browser through MapReduce implementations in CouchDB, server-side JavaScript, and finally directly in the browser, allowing for large-scale yet interactive data analysis. Read more.
Location: Portland 256
Peter Zaitsev (Percona Inc)
Average rating: ***..
(3.08, 13 ratings)
In many performance evaluation studies, you will find comparisons made in terms of peak throughput or the corresponding response time. This can be misleading. In this brief presentation, we will look at why such metrics can mislead, and provide a framework and principles for performance evaluation that focus on providing good service in real-world production environments. Read more.
Location: Portland 256
Calvin Sun (Twitter)
Average rating: ***..
(3.67, 3 ratings)
MySQL 5.6 is simply a better MySQL with improvements that enhance every functional area of the database kernel. There are many new features in the InnoDB storage engine, including: better performance and scalability, online DDL, persistent statistics, NoSQL access, and many more. Read more.
Location: Portland 256
Bradford Stephens (Drawn to Scale)
Average rating: *....
(1.50, 4 ratings)
Spire is one of the first open source distributed SQL databases. Architected from the ground up with no legacy code, it's meant to power large-scale applications with tens of thousands of reads and writes at petabyte scale. This talk will cover parts of Spire such as the distributed computational fabric, distributed indexing, query planning, and more. Read more.
Location: D136
Byron Ruth (The Children's Hospital of Philadelphia), Michael Italia (The Children's Hospital of Philadelphia)
Average rating: ****.
(4.33, 3 ratings)
The biomedical research community is in the midst of a data revolution driven by the adoption of electronic health records and the arrival of next-generation genomic technologies. Researchers require tools that scale with this increase without added complexity. To address this need we have developed Harvest, an open source framework for rapid development of purpose-built data discovery web applications. Read more.
Location: Portland 256
Robert Hodges (Continuent.com)
Average rating: ****.
(4.25, 4 ratings)
Successful database applications do not happen by accident. In this talk we will present a half-dozen design patterns for database management to help implement 24x7 applications that handle hundreds of terabytes spread over multiple continents on databases like MySQL. Start using these patterns now and avoid a lot of pain later. Read more.
Location: Portland 256
Dimitri Fontaine (2ndQuadrant)
Average rating: **...
(2.40, 5 ratings)
Once a Top-10 internet audience site. 32 million users. Billions of photos and comments, more than 6TB of them. Migrating away from MySQL to PostgreSQL! Read more.
Location: Portland 255
Paco Nathan (Databricks)
As new data sets become available through municipal Open Data initiatives, how can these be leveraged to reveal insights and build services for communities? This talk shows Cascalog and Open Data from the City of Palo Alto to create a sample app. Some programming background is helpful, but the emphasis is on process: how to approach large-scale Open Data to build data products for a community. Read more.
Location: Portland 256
Ligaya Turmelle (MySQL)
Average rating: ****.
(4.00, 7 ratings)
Many companies need their employees to do more than one job: programmer, DBA, sysadmin. The more skills you have, the more you can contribute to the overall success of the company and improve your own job marketability. Learn the basic MySQL server administration commands that every developer should know, what each does and how to use them. Read more.
Location: D135
Joakim Recht (Tradeshift)
Average rating: **...
(2.12, 8 ratings)
Going from a transactional SQL/ACID-based system to a scalable NoSQL-based system can be both scary and somewhat mysterious. Many developers don't believe it can be done. It can, however. In this talk, we'll see how and to what degree. Read more.
Location: D135
Christophe Pettus (PostgreSQL Experts, Inc.)
Average rating: ****.
(4.58, 12 ratings)
With the addition of JSON functionality, PostgreSQL can hold its trunk high when compared to non-SQL databases. We'll explore the ways you can use the non-structured-data features of PostgreSQL, how they perform... and when you shouldn't use them. Read more.
