Sponsors

  • Microsoft
  • Nebula
  • Google
  • SugarCRM
  • Facebook
  • HP
  • Intel
  • Rackspace Hosting
  • WSO2
  • Alfresco
  • BlackBerry
  • CUBRID
  • Dell
  • eBay
  • Heroku
  • InfiniteGraph
  • JBoss
  • LeaseWeb
  • Liferay
  • Media Temple, Inc.
  • OpenShift
  • Oracle
  • Percona
  • Puppet Labs
  • Qualcomm Innovation Center, Inc.
  • Rentrak
  • Silicon Mechanics
  • SoftLayer Technologies, Inc.
  • SourceGear
  • Urban Airship
  • Vertica
  • VMware

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the convention, contact Sharon Cordesse at scordesse@oreilly.com

Download the OSCON Sponsor/Exhibitor Prospectus

Contact Us

View a complete list of OSCON contacts

Sessions tagged with 'data_scientists'

Jonathan Seidman (Orbitz Worldwide), Ramesh Venkataramaiah (Orbitz Worldwide)
An overview of the state of the art for bringing together the analytical power of the R language with the big data capabilities of Hadoop.
Philipp Janert (Principal Value, LLC)
Data Analysis is often wrapped in a bit of mystery, with specialized tools, fancy terminology, and difficult techniques. This tutorial takes a different stance: we will review a set of basic methods and techniques, which are nevertheless essential if you want to think about and understand data. Particular emphasis is placed on ways to gain insight through graphical methods.
Jeff Hamann (Forest Informatics)
Learn how to cobble together a PostgreSQL database, a few handy R packages, a pinch of language extensions, and a handful of publicly available data sets into a forest monitoring platform. The platform helps landscape managers make better decisions by using basic design-engineering paradigms to perform quick trade-off analyses.
Robin Anil (Google), Ted Dunning (MapR Technologies)
This hands-on tutorial teaches the basics of the important machine learning algorithms in Mahout and aims to help you get it up and running on a Hadoop cluster. Mahout is an open source implementation of a collection of algorithms designed from the ground up to sift through terabytes of data and bring out important patterns that are otherwise out of reach of standard tools.
Jean-Daniel Cryans (Cloudera)
Imagine for a moment doing a JOIN on two HBase tables. Crazy talk, right? Well, now you can, thanks to Hive. True, it is only meant to be used in a batch context, but we have been doing it for a few months now at StumbleUpon, and our analysts and engineers love it. This presentation will cover how the Hive-HBase integration works and how we use it at our company.
Josh Patterson (Cloudera)
Time series sensors are being ubiquitously integrated in places like cell phones, environmental sensors, and the smart grid. As this type of data scales out, RDBMS systems strain to keep up with the high insertion rates and real-time query requirements. In this talk we introduce “Lumberyard”, which provides scalable indexing and low-latency fuzzy pattern searching for time series data.
Noah Pepper (Lucky Sort), Homer Strong (Lucky Sort)
We produce gorgeous LaTeX reports while harnessing the power of R on the backend. The data is pulled from our PostgreSQL database, and the analysis and visualizations are fast and distributed thanks to Redis. We'll talk about weaving together open source tools to build powerful analytics reporting engines that rival the commercial alternatives.
Russell Hanson (RSI/Harvard/TCIN)
Synthetic biology is a new field where basic biological components can be engineered to create something new. It often involves DNA synthesizers, ligation, promoters, and polymerase chain reaction -- which may or may not be safe for your in silico environment. However, as the size and complexity of the systems increase, tools become more and more important, thus CAD for biology has emerged.