Personal schedule for Scott Sadler
Keynote
Location: Oregon Ballroom 203/204
Dive into the distributed system that powers OkCupid’s match searches. Learn how we use C++, event-based programming, and SSDs to solve problems that crop up when building a high-performance, high-availability distributed system.
Keynote
Location: Oregon Ballroom 201/202
Keynote by Raffi Krikorian, developer, Twitter.
Keynote
Location: Oregon Ballroom 203/204
It's 2021. You have a petabyte drive on your keychain, your startup company leases bulk cloud storage by the exabyte, and you have a million cores for data crunching. You can even have your own copy of the entire world's public semantic data. What do you do with it? If you're not sure yet, I've got plenty of ideas for you.
We're surrounded by data: open government data, streaming media, and data we create as we track our lives and connect with our communities. Learn how to use easy-to-use tools to combine this data for personal and organizational decision making without requiring complex processes or training.
In this session, Dell will discuss which data types are suitable for transfer between Hadoop and an EDW, the EDW/Hadoop data lifecycle, data governance between Hadoop and a DBMS, and ETL performance tuning and best practices (e.g., Hadoop/DBMS connectors, node and network designs).
Ever had to dig into a system that misused the most basic features of an RDBMS? Better yet, after the whole NoSQL storm, have you wondered why it didn't show up sooner, back when you had to twist your schema to fit into something it was not designed for? Check out this collection of anti-patterns, feel better knowing that you are not alone, and learn how you can benefit from it even without big data around.
Always wanted to create hardware devices that can interact with the real world? Heard about the Arduino electronics prototyping platform but not sure how to get started? When you attend this workshop you will: set up an Arduino board & software; learn how the Arduino fits into the field of physical computing; and make your Arduino respond to button presses and blink lights. Hardware is fun!
Brisk is an open-source Hadoop and Hive distro that utilizes Cassandra for its core services. Brisk provides integrated Hadoop MapReduce, Hive and job and task tracking, while providing an HDFS-compatible storage layer powered by Cassandra. By accelerating the time between data creation and analysis with DataStax’ Brisk, users experience greater reliability, simpler deployment and lower TCO.
The data & analytics teams at Etsy build up and tear down more than a thousand independent Hadoop clusters on EC2 each month. This talk discusses the benefits of this approach, where Elastic Map Reduce serves as a "meta-cluster" in which on-demand Hadoop clusters can be created, used, and shut down quickly and easily.
What happens when you write data to disk? We'll explore everything between your programming language and the spinning platters - both optimizations and dangerous pitfalls.
Learning the syntax of a new language is easy, but learning to think under a different paradigm is hard. This session helps you transition from a Java-writing imperative programmer to a functional programmer, using Java, Clojure, and Scala for examples.
The art of dealing with real-time data is not new. In fact, much of the world's economy is propped up by decisions made on data in sub-millisecond time. The technology is there; we have the power. We'll take a whirlwind tour of the open-source Esper system and see how to integrate it into your stack to enable rapid decision making on real-time data from anywhere in your architecture.
YARN is the next generation of Hadoop Map-Reduce designed to scale out much further while allowing for running applications other than pure Map-Reduce in a highly fault-tolerant manner.
We at DeNA (the largest social game provider in Japan) handle over 2 billion page views per day with MySQL. We make heavy use of SSDs and tune Linux. We run non-trivial solutions such as non-stop, automated MySQL master failover. We also use MySQL not only as a traditional RDBMS but also as an extremely high-performance NoSQL store. I'd like to introduce the MySQL solutions that make our social games scale better.
This talk introduces an open-source SQL-based system for continuous or ad-hoc analysis of streaming data built on top of Flume-based data collection for Hadoop.
Attendees will understand how to use a new tool to extend their Hadoop data collection pipeline with real-time streaming analytics.
There are many exciting InnoDB performance and scalability features in MySQL 5.5 and its upcoming release. But how do you best use them? What are the caveats? In this session, we will describe those performance and scalability features in depth. We will also present some benchmark results that explore the performance of those features.
We'll talk about the roles of A/B testing and similar techniques in web applications, examine an open-source A/B framework for PHP, and present general design ideas that can be applied to building similar systems using other technology stacks.
You've heard that Functional programming (FP) is good for concurrency. Mastering FP will improve all the code you write.
FP changes practices like TDD; learn how design is more structured and tests are more precise. See why FP-style functions and data structures are actually more reusable than objects. Leave with new tools that eliminate bloat, improve code quality, and speed development.
Keynote
Location: Oregon Ballroom 203/204
Keynote by Adrian Cockcroft, Cloud Architect, Netflix.
Imagine for a moment doing a JOIN on two HBase tables. Crazy talk, right? Well, now you can, thanks to Hive. True, it is only meant to be used in a batch context, but we have been doing it for a few months now at StumbleUpon, and our analysts and engineers love it. This presentation will cover how the Hive-HBase integration works and how we use it at our company.
Apache Cassandra is a powerful new distributed database system that, when used correctly, provides a simple framework for managing large, rapidly changing, and/or high-value datasets. But Cassandra is a bit rough around the edges. In particular, the system has a reputation for being unforgiving when misconfigured or burdened with unusual workloads.
A quick and effective jump start for using Apache Solr, the Lucene-based search server. Solr powers the search and discovery systems of sites such as Zappos, Smithsonian's collections, The Motley Fool, Orbitz, and many, many others. This three-hour session will give you the basics to immediately begin using Solr on your own data.
StatusNet (http://status.net/), best known as the open source microblogging platform, has a powerful plugin system for building new social networking applications. In this tutorial, the core developers of StatusNet show how to build server-side plugins, API clients, and custom themes to make your own social network tools.
Sharing data is critical in a world where a crisis can occur at any moment. Often, valuable data is stored in disparate locations with no information on how to access it. This presentation discusses spatial data discovery and open source tools for implementing a data-sharing catalog. Esri’s Geoportal Server will be used to show sharing and discovery in action. The talk is open to all attendees.
Twitter is the largest Ruby on Rails installation on the web right now. However, we have been moving from solely hosting Rails applications to a mixed Rails and JVM deployment. This migration has been ongoing for a few years at Twitter, and we now run several back-end, high-throughput, and critical components on the JVM.
Java 7 is out in 2 days and now is the time to do some old school hacking with it! We've picked some existing open source projects that could benefit from some Java 7 spring cleaning and you're going to help us wield the feather duster.
This session has limited space for 15 attendees on a "first come, first served" basis.
Location-based services are hot, but geographic datasets are complex. That shouldn’t put you off writing awesome location-aware services. This talk will show how to create spatial models and query the OpenStreetMap dataset together with social data using the Neo4j graph database.
Step right up and join us at the O'Reilly OSCON Carnival. There will be games, clowns, sumo wrestling, log rolling, tattoos, and lots more. There's free food, free wine, and free beer. You’ve never seen a carnival like this. Trust us.
Event
Location: 411 NW Park Ave.
Join Puppet Labs and SwellPath Interactive at their headquarters in the Pearl District. The party is free, as in free beer, food and fun. Two floors, two open bars, and more. Take the Green or Yellow line (free transit) west to Union Station and walk 2 blocks west to 411 NW Park Ave.
The popularity of NoSQL opens up an endless array of possible uses but also causes its own set of problems. Riak, a NoSQL offering created by Basho, addresses this by claiming to have no single point of failure. Proving this claim goes a long way toward dispelling the concerns an enterprise may have about adopting a non-relational solution.
You've written applications for the JVM, using various frameworks and maybe even various languages. You understand how to rig up the CLASSPATH, get .class files to load, compile source, and set up an IDE. But you've always wanted a better understanding of the plumbing underneath. How does JVM bytecode work? What happens to bytecode after you hand it off to the JVM?
The Go programming language was designed to make programming productive and efficient. Go is a concurrent language that compiles quickly to machine code yet has the convenience of garbage collection and the power of run-time reflection. This talk is an introduction to Go that focuses on how the design of the language helps it achieve those goals.
Algorithms are getting raunchier, tools more potent and competitions more intimate! Let us mix analytics tools (like R & Mahout) and a dash of algorithmics to work on BigData Analytics competitions and see if the answer is always 42. In the process we will explore and apply a few good algorithms to the Heritage Health competition …
A blatant rip-off of Josh Bloch's "Java Puzzlers: Traps, Pitfalls, and Corner Cases", Python Puzzlers reveals some of Python's productivity-threatening oddities by showing several short code examples and asking the audience to explain their behavior.
OpenID, OAuth, and other efforts to open up the social web are a dizzying mix of successes and setbacks. Are they being widely adopted, or eclipsed by proprietary alternatives? Are they good enough for mainstream users, or still too geeky? And have their fiercest proponents “sold out” by taking jobs at Google and Facebook, or are they continuing the fight from within? Come hear the inside story.
You have an idea for an app. Great! First you have to munge and maintain the data. Did you know there is one data API to pull clean, updated data from multiple sources?
It slices, it dices, it serves out data on geo, social & more! And you don't even need to touch MySQL.
Mash up some data with the Infochimps Data Scientists Jacob Perkins, Dhruv Bansal and Ham the Incredible Coding Chimp.
Object-functional languages have a number of desirable properties and have proven very useful in practice. Unfortunately, the merger brings with it a raft of complications, which are the root of nearly all of Scala's infamous complexity. This talk will present a new framework for resolving these issues, based around the notion of statically-typed functional object prototypes.
This hands-on tutorial teaches the basics of the important machine learning algorithms in Mahout and aims to help you get it up and running on a Hadoop cluster. Mahout is an open source implementation of a collection of algorithms designed from the ground up to sift through terabytes of data and help bring out important patterns that are otherwise out of reach of standard tools.
StreamSQL EventFlow is a Complex Event Processing language for building real-time applications. EventFlow is unique in that it is primarily a visual language. This talk will focus on the StreamBase Event Processing Platform, the design of visual representations for language features and the co-development of an Eclipse-based IDE along with a new programming language.
Event
Location: Expo Hall
Quench your thirst with vendor-hosted libations and snacks while you check out all the cool stuff in the expo hall.
When working with structured, semi-structured, and unstructured data, there is often a tendency to try to force one tool, either Hadoop or a traditional DBMS, to do all the work. But there are reasons to use Hadoop for some analytics projects and a purpose-built analytics platform for others. The magic comes in knowing when to use which and how these two tools can work together.
Most medical devices today use proprietary/custom software platforms (operating systems, messaging framework, alarms, etc.). This talk will present Shahid's recent work using FOSS to build safety-critical medical devices and the challenges associated with such solutions. Shahid will present the architectures considered, their benefits and detriments, and findings from real-world FOSS implementations.
Languages with first-class functions are different. Callbacks and `each' are just the start - the fun really begins when you start learning from the Lisp guys and writing code that writes code that writes code. Think differently about your JavaScript and do more with less code.
First done at OSCON 2010, we thought this session was extremely useful in helping developers work better with Google technology and in answering questions they might be baffled about. So, for 40 minutes, we'll be happy to answer nearly any question an engineer might have. Many Googlers covering everything from Android to search will be in attendance and ready to answer your questions.
Get the most out of your logs with logstash. Logstash is free, open source, and scalable, and exists to help you debug, analyze, and correlate issues in real-time across your infrastructure and your business.
Is your application distributed? How have you chosen to deal with the implications of this distribution? In this session we will introduce and explore ZooKeeper. Originally developed at Yahoo and used by HBase, ZooKeeper is a wonderful tool. ZooKeeper is straightforward and provides an interface allowing for easy configuration and use.
Many modern web applications now rely solely on JavaScript to render their frontend. But if you want to create mashups, load data from many different places, or include external widgets in your site, you quickly run into limits imposed by browser and security restrictions. In this presentation I will talk about techniques, old and new, that help you with such problems.
Discover a variety of creative techniques for dramatically improving page load speed that focus on low-hanging fruit rather than micro-optimization, and see what impact they had when applied to the world's fifth largest website, Wikipedia. Trevor and Roan will explore optimization beyond server load, minification and gzip, and offer up new open source libraries to help others do the same.
Keynote
Location: Portland Ballroom
Our brains are not at all suited for modern life, and are plagued by a raft of bugs and unwanted features that we've been unable to remove. Join us in a tour of some of the most amusing bugs and exploits wetware has to offer.
Event
Location: Portland Ballroom Foyer
Take the opportunity to network one last time and exchange contact information with one another.
Event
Location: Meet in MLK Lobby of the Oregon Convention Center
One of the best ways to experience Portland, this walking tour will expose you to the culturally underground, the socially underground, and the subterranean underground of Portland. Please register in advance. Tickets are $19 per person.