THIS TUTORIAL HAS REQUIREMENTS AND INSTRUCTIONS LISTED BELOW
Apache Drill is a new type of massively parallel processing (MPP) framework that allows companies to federate NoSQL and traditional data storage technologies behind a single query interface. This query interface uses DrQL, a superset of SQL:2003 enhanced to support manipulation of complex hierarchical and schema-less data. In addition to being the first open source framework to tackle this problem, Apache Drill provides a powerful abstraction layer that allows users to extend the processing framework to solve their own business problems.
We’ll start the session with an overview of Apache Drill and its key extension APIs. Afterwards, we’ll describe an example use case where Apache Drill’s native capabilities are lacking. We’ll then work through design and development, using Java and scripting, to add extensions to the Apache Drill platform.
The coding exercises will produce new logical and physical data processing operators, a new type of storage engine, additional query optimizer rules, and an implementation of a new domain-specific language focused on our particular use case. Upon completion, attendees will have a strong understanding of Apache Drill fundamentals, a set of useful real-world extensions to the platform, and a new tool in their data analysis tool chest.
TUTORIAL REQUIREMENTS AND INSTRUCTIONS FOR ATTENDEES
Attendees who plan to code may want to look at Apache Drill (http://incubator.apache.org/drill) beforehand, and should bring a laptop with Java 1.7, Maven, and Git installed.
Serial startup artist and open-source innovator, particularly interested in large data systems and statistical modeling.
Nadeau is MapR’s lead developer on the Apache Drill open source project. Prior to joining MapR, he was CTO with in.vu and YapMap, where he built and launched a massively parallel distributed search engine on top of Hadoop, supporting more than 650 million documents with sub-second response times. He evolved the platform through three major architectures, ultimately building a custom indexing kernel.