We'll present the architecture and implementation of a Node.js/DTrace-based distributed platform for analyzing the performance of cloud applications in real time. We'll do a live demo on a real, internet-facing cloud and discuss some of the interesting performance pathologies we've found and explained using this tool.
This language-agnostic proposal focuses on concepts and strategies critical to the design and implementation of asynchronous systems and data processing layers. Key components include a survey of implementation strategies for non-blocking edge tiers, patterns for building out a distributed worker / processing tier, and several horror stories of cascading failures and how they were resolved.
An overview of the state of the art for bringing together the analytical power of the R language with the big data capabilities of Hadoop.
We produce gorgeous LaTeX reports while harnessing the power of R on the backend. The data is pulled from our PostgreSQL database, and the analysis and visualizations are fast and distributed thanks to Redis. We'll talk about weaving together open source tools to build powerful analytics reporting engines that rival the commercial alternatives.
The last few years have brought a wealth of new data technologies organized around horizontal scalability. This talk will cover the essential infrastructure areas: real-time stream processing, offline data crunching, large-scale data deployments and live serving. The focus will be on how these ingredients come together to enable innovative data-driven products at LinkedIn.
Algorithms are getting raunchier, tools more potent and competitions more intimate! Let us mix analytics tools (like R & Mahout) and a dash of algorithmics to work on BigData Analytics competitions and see if the answer is always 42. In the process we will explore a few good algorithms and apply them to the Heritage Health competition …