Adding security to an existing product is never easy, but our team at Yahoo added strong authentication to Apache Hadoop by integrating it with Kerberos. This project was delivered on time and is currently deployed on all of Yahoo's 40,000 Hadoop computers. Come learn how we added security to and why it matters.
This talk introduces an open-source SQL-based system for continuous or ad-hoc analysis of streaming data built on top of Flume-based data collection for Hadoop.
Attendees will understand how to use a new tool to extend their Hadoop data collection pipeline with real-time streaming analytics.
Algorithms are getting raunchier, tools more potent and competitions more intimate! Let us mix analytics tools (like R & Mahout) and a dash of algorithmics to work on BigData Analytics competitions and see if the answer is always 42. In the process we will explore and apply a few good algorithms, to the Heritage Health competition …
YARN is the next generation of Hadoop Map-Reduce designed to scale out much further while allowing for running applications other than pure Map-Reduce in a highly fault-tolerant manner.