[Chaos Meetup] Running a Successful GameDay

Ho Ming Li
January 24, 2019

Delivered during the meetup "Silicon Valley Chaos Engineering Community > Chaos Engineering at Twilio" at Microsoft Reactor.



Transcript

  1. Prime Down: Amazon’s sale day turns into fail day (TechCrunch)
    Delta Outage: Computer malfunction results in nationwide ground stop (NBC)
    Slack Outage: Connectivity issues hit workplaces (WSJ)
  2. GameDay
    Experiment #1: Attack (Inject Failure), Attack, Attack, ...
    Experiment #2: Attack, Attack, ...
    Experiment #3: Attack, Attack, ...
    ...
  3. Get rid of the Fog of War so you can clearly see the map and strategize accordingly.
    Gain Deep Insight with:
    - Metrics
    - Logging
    - Request Tracing
  4. GameDay is not just a one-time event
    Think about the next GameDay
    Track and Measure Success over time
  5. Attack: Inject a small amount of latency between app and database
    Expectation: Users experience a delay roughly the same as the injected latency
  6. “Edge”: DNS, CDN
    “Front End”: LB, API
    “Back End”: App/Web Server, Queue, RDB, KV DB, Search Index
    “Infrastructure”: Container, Kubernetes, Virtual Machine, Physical Server, Storage, Network, Data Center, Geography
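
The attack and expectation on slide 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speaker's tooling: `fake_db_query` is a hypothetical stand-in for an app-to-database call, the delay is injected in-process with `time.sleep` rather than at the network layer, and the tolerance used to decide "roughly the same" is an arbitrary choice.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_injected_latency(call: Callable[[], T], delay_s: float) -> Callable[[], T]:
    """Wrap a zero-argument call so each invocation is delayed by delay_s seconds."""
    def delayed() -> T:
        time.sleep(delay_s)  # the "attack": extra latency before the real call
        return call()
    return delayed


def timed(call: Callable[[], T]) -> float:
    """Return the wall-clock duration of one invocation, in seconds."""
    start = time.perf_counter()
    call()
    return time.perf_counter() - start


def expectation_holds(baseline_s: float, observed_s: float,
                      injected_s: float, tolerance_s: float) -> bool:
    """True if the added delay is within tolerance_s of the injected latency."""
    added = observed_s - baseline_s
    return abs(added - injected_s) <= tolerance_s


def fake_db_query() -> str:
    """Hypothetical stand-in for an app-to-database call (assumption)."""
    return "row"


if __name__ == "__main__":
    injected = 0.05  # seconds of injected latency (illustrative value)
    baseline = timed(fake_db_query)
    observed = timed(with_injected_latency(fake_db_query, injected))
    # If the app retried or made several queries per request, the added delay
    # would exceed the injected latency and this check would fail.
    print("expectation holds:",
          expectation_holds(baseline, observed, injected, tolerance_s=0.05))
```

A failed check is the interesting outcome in a GameDay: it suggests that the injected latency is being amplified somewhere between the database and the user.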