Netflix created a suite of tools, collectively called the Simian Army, to improve resiliency and maintain the cloud environment.
Failure modes are typically corner cases that are tested poorly, if at all. Only by failing often can we ensure that we are resilient to failure.
We look for ways to induce failure in our production environment to better prepare us for the inevitable failures that will occur.
We’ve open sourced some of the monkeys (Chaos, Janitor, Conformity), and are working on releasing the rest. The presentation will cover the reason for creating each monkey, what we’ve been able to learn from running them, and tips for those interested in adopting the approach.
Chaos Monkey randomly terminates virtual machines to ensure that services are resilient to node failure.
Chaos Gorilla is a more powerful version of Chaos Monkey: it simulates the outage of an entire AWS Availability Zone (data center) to ensure resiliency to a single zone failure.
Latency Monkey induces random network delays and errors to ensure that services are resilient to degradation in their dependencies.
Janitor Monkey is the cloud cleaning crew. It prevents clutter by cleaning up old and unused resources.
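To make the difference between Chaos Monkey and Chaos Gorilla concrete, here is a minimal in-memory sketch. It is not the Simian Army code and makes no AWS calls. The `Instance` record, the function names, and the sample fleet are all hypothetical, and the "one random victim per service" policy is an assumption chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    # Hypothetical record for one virtual machine; not a real AWS type.
    instance_id: str
    service: str
    zone: str


def pick_chaos_monkey_victims(instances, rng):
    """Pick one random instance per service (a node-failure test)."""
    by_service = {}
    for inst in instances:
        by_service.setdefault(inst.service, []).append(inst)
    return [rng.choice(group) for group in by_service.values()]


def pick_chaos_gorilla_victims(instances, zone):
    """Pick every instance in one zone (a zone-failure test)."""
    return [inst for inst in instances if inst.zone == zone]


def survives(instances, victims):
    """True if every service still has an instance once the victims are gone."""
    remaining = set(instances) - set(victims)
    services = {inst.service for inst in instances}
    return services == {inst.service for inst in remaining}


# Made-up fleet: "api" spans two zones, "search" sits entirely in zone-a.
FLEET = [
    Instance("i-1", "api", "zone-a"),
    Instance("i-2", "api", "zone-b"),
    Instance("i-3", "search", "zone-a"),
    Instance("i-4", "search", "zone-a"),
]

if __name__ == "__main__":
    rng = random.Random(0)
    print(survives(FLEET, pick_chaos_monkey_victims(FLEET, rng)))
    print(survives(FLEET, pick_chaos_gorilla_victims(FLEET, "zone-a")))
```

In this made-up fleet every service runs two instances, so losing one node per service is survivable, but "search" lives entirely in zone-a and does not survive a simulated outage of that zone. That is the kind of gap a zone-level test is meant to expose.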
Ariel Tseitlin is Director of Cloud Solutions at Netflix and is responsible for making Netflix successful in the Cloud, including managing the reliability and availability of the Netflix streaming service. Ariel’s team builds the Simian Army, including the Chaos Monkey, making the Netflix streaming service more resilient and reliable. Prior to Netflix, Ariel was VP of Technology and Products at Sungevity and before that was the Founder & CEO of CTOWorks, a software consultancy helping early-stage entrepreneurs deliver their first product to market. Earlier in his career, Ariel held senior management positions at Siebel Systems and Oracle. Ariel holds a bachelor’s degree in Computer Science from UC Berkeley and an MBA with honors from the Wharton School of Business at the University of Pennsylvania.