Essential Data Analysis Workshop

Philipp Janert (Principal Value, LLC)
Average rating: 4.00 (5 ratings)

Attendance at this tutorial is open to both OSCON and OSCON Data attendees. You don’t need to sign up in advance, but please note that it lasts all day. The tutorial takes place on the regular OSCON (Portland) side of the convention center.

Data analysis is often wrapped in a bit of mystery, with specialized tools, fancy terminology, and difficult techniques. This tutorial takes a different stance: we will review a set of basic, almost trivial, methods and techniques that are nevertheless essential if you want to think about and understand data. Particular emphasis is placed on ways to gain insight through graphical methods.

Part 1: Basic Basics

  • Histograms and Kernel-Density Estimates
  • Cumulative Distribution Functions and Rank-Order Plots
  • Summary Statistics and Box-Plots
  • The Importance of Power-Law Distributions (with an aside on Log plots)
  • Smoothing Techniques for Trend Detection

Part 2: Advanced Basics

  • Time Series
  • Bootstrapping and Other Resampling Methods
  • Multivariate Problems: SPLOMs and Coplots
  • Other Ideas: Parallel Coordinate and Mosaic Plots

This is a talk on conceptual methods and techniques that are independent of any particular software tool or application. For this reason, we will largely steer clear of tool-specific technical details.

Philipp Janert

Principal Value, LLC

After previous careers in physics and software development, Philipp K. Janert currently provides consulting services for data analysis, algorithm development, and mathematical modeling.

He is the author of two books on data analysis: “Data Analysis with Open Source Tools” (O’Reilly) and “Gnuplot in Action – Understanding Data with Graphs” (Manning Publications).

He holds a Ph.D. in theoretical physics from the University of Washington. Visit his company website at www.principal-value.com

Comments

Becky Vorpagel
07/28/2011 11:34am PDT

Very good session. Could have used some discussion of the tools used; content tapered off towards the end.

Soren Hansen
07/27/2011 4:11pm PDT

I love sessions where the speaker is as comfortable with the subject as Philipp Janert was here. He really managed to take complex subjects and make them blindingly obvious, answered questions eloquently and succinctly.

Simply delightful!