The art of dealing with real-time data is not new. In fact, much of the world's economy is propped up my making decisions on data sub milliseconds. The technology is there, we have the power. We'll take a whirlwind tour of the open-source Esper system and understand how to integrate it into your stack to enable rapid decision making on real-time data from anywhere in your architecture.
In November, Facebook launched a new version of Messages that combines chat, SMS, email, and Messages into a real-time conversation. Facebook relies on Apache HBase, a NoSQL-style database, for storing this real-time message data. This talk will elaborate on our decision process, system configuration, scaling issues, and advantages gained by choosing Open Source.
OpenTSDB is an open-source, distributed time series database designed to monitor large clusters of commodity machines at an unprecedented level of granularity. OpenTSDB enables operations teams to keep track in real-time of all the metrics exposed by operating systems, applications and network equipment, and makes the data easily accessible.
This talk introduces an open-source SQL-based system for continuous or ad-hoc analysis of streaming data built on top of Flume-based data collection for Hadoop.
Attendees will understand how to use a new tool to extend their Hadoop data collection pipeline with real-time streaming analytics.
In this talk, we will introduce a simple formula for all Big Data applications: Big Data = Fast Data + Deep Data. Through a use-case format, we will discuss the specialized requirements for real-time (“fast”) and analytic (“deep”) data management.
The last few years have brought a wealth of new data technologies organized around horizontal scalability. This talk will cover the essential infrastructure areas: real-time stream processing, offline data crunching, large-scale data deployments and live serving. The focus will be on how these ingredients come together to enable innovative data-driven products at LinkedIn.