Ryan Kelly: Testing for Graceful Failure with Vaurien and Marteau

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Ryan Kelly:
Testing for Graceful Failure with Vaurien and Marteau
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
@ Kiwi PyCon 2013 - Sunday, 08 Sep 2013 - Track 2
http://nz.pycon.org/

**Audience level**

Intermediate

**Description**

This talk shows how the Mozilla Services team tests failure scenarios in its web services with two Python-based tools: Marteau, a web-based UI for easily running load tests, and Vaurien, a misbehaving TCP proxy that can simulate various backend failures. Used together, these tools help ensure a service will not only scale up to meet demand, but will fail gracefully if it reaches its breaking point.

**Abstract**

So you've built an awesome webapp, put it through its paces, and assured yourself that it does what it's supposed to do. Great! Now how does it behave when things start to go wrong?

This talk will demonstrate how the Mozilla Services team tests for failure scenarios in our web services, focusing on two key Python-based tools: Marteau, a web-based frontend for easily running load tests and analyzing the results, and Vaurien, a misbehaving TCP proxy that can simulate a variety of backend failure modes.

Used together, these tools can help ensure that a web service will not only scale up to meet its expected demand, but will fail gracefully when it finally reaches its breaking point.

The talk will cover:

* Real-life examples of bugs that only show up when your app is under load; bugs that can turn a brief partial outage into a cascading whole-system failure.
* The basics of writing a load-testing suite for your app.
* How to set up Marteau for easy on-demand load testing.
* How to use Vaurien to simulate various kinds of backend failure, such as an overloaded database, misconfigured DNS, or a suddenly-disappearing job queue.
* Some tips for systematically applying these tools to your own setup.
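
To make the idea of a "misbehaving TCP proxy" concrete, here is a minimal sketch using only the Python standard library. It is not Vaurien's implementation or API; the function names (`start_proxy`, `start_echo_backend`), the behaviour names, and the `delay` parameter are assumptions chosen for illustration. The proxy sits between a client and a backend and either forwards traffic untouched, forwards it after a pause, drops the connection, or goes silent.

```python
import socket
import threading
import time


def pipe(src, dst):
    """Copy bytes from src to dst until src closes, then half-close dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass  # the other side went away; nothing more to copy
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client, backend_addr, behaviour, delay):
    """Serve one client connection according to the chosen behaviour."""
    with client:
        if behaviour == "error":
            return  # drop the connection straight away
        if behaviour == "hang":
            time.sleep(delay)  # accept, then stay silent and never forward
            return
        if behaviour == "delay":
            time.sleep(delay)  # simulate a slow backend before forwarding
        with socket.create_connection(backend_addr) as backend:
            back = threading.Thread(target=pipe, args=(backend, client), daemon=True)
            back.start()
            pipe(client, backend)
            back.join()


def start_proxy(backend_addr, behaviour="transparent", delay=0.0):
    """Listen on an ephemeral local port; return (listening socket, address)."""
    server = socket.create_server(("127.0.0.1", 0))

    def accept_loop():
        while True:
            try:
                client, _ = server.accept()
            except OSError:
                return  # listening socket was closed
            threading.Thread(
                target=handle, args=(client, backend_addr, behaviour, delay), daemon=True
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server, server.getsockname()


def start_echo_backend():
    """A stand-in backend (hypothetical) that echoes whatever it receives."""
    server = socket.create_server(("127.0.0.1", 0))

    def serve():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                return
            threading.Thread(target=pipe, args=(conn, conn), daemon=True).start()

    threading.Thread(target=serve, daemon=True).start()
    return server, server.getsockname()
```

Pointing an application at the proxy address instead of the real backend, for example `start_proxy(backend_addr, "delay", delay=0.3)`, lets you observe how it copes with each failure mode without touching the backend itself.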

**YouTube**

http://www.youtube.com/watch?v=WSdyU5s-SMI

New Zealand Python User Group

September 08, 2013

Transcript

  1. You have a web application. You are confident that it works. You have a solid deployment and monitoring setup.
  2. You have a web application. You are confident that it works. You have a solid deployment and monitoring setup. Now... what happens when things start to break?
  3. Mozilla Services: Firefox Sync Server, FirefoxOS Marketplace, Push Notification Service, more coming... Organic process and tooling. This will be a demo, not a tutorial.
  4. Our shiny new loadtesting tool. Built on existing best-of-breed libraries. Designed for distributed real-time operation. Highly pluggable.
  5. A misbehaving TCP proxy. Behaviours: delay, hang, error, ... Protocols: raw tcp, http, memcached, mysql, ... Easily extensible for your needs. French for "rapscallion".
  6. Combine with loadtests to check for cascading failures. Integrate into functional tests, for specific error sequences (example opens in new tab...).
  7. How does your app interact with the outside world? What happens when the outside world misbehaves? How can you simulate that under controlled conditions? Come hack with us! "Continuous Stress Testing"
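
The slides suggest combining failure injection with load tests and functional tests. The sketch below shows one way that combination could look using only the standard library; it does not use the talk's tools. All names here (`start_hanging_backend`, `fetch_greeting`, `run_load`) and the 0.2 second timeout are hypothetical. A backend that accepts connections but never replies stands in for a stalled dependency, and a small thread pool plays the role of the load test. The check is that every request returns a fallback promptly rather than piling up.

```python
import socket
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def start_hanging_backend():
    """Accept connections but never reply, like a stalled database."""
    server = socket.create_server(("127.0.0.1", 0))
    held = []  # keep accepted sockets referenced so they stay open

    def serve():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                return  # listening socket was closed
            held.append(conn)

    threading.Thread(target=serve, daemon=True).start()
    return server, server.getsockname()


def fetch_greeting(addr, timeout=0.2, fallback=b"service unavailable"):
    """Hypothetical client call: ask the backend, give up after `timeout`."""
    try:
        with socket.create_connection(addr, timeout=timeout) as s:
            s.sendall(b"hello\n")
            return s.recv(4096) or fallback
    except OSError:  # covers timeouts, refused and reset connections
        return fallback


def run_load(addr, requests=20, workers=10):
    """Fire `requests` calls from `workers` threads; return results and wall time."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda _: fetch_greeting(addr), range(requests)))
    return results, time.monotonic() - start
```

A functional test can then assert on both the results and the elapsed time, for example that `run_load(addr)` returns only fallback values and finishes within a small multiple of the timeout. A client without a timeout would block in this test indefinitely, which is the kind of cascading failure the slides warn about.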