
The overnight failure

This talk is based on a true story.

Here I share the story of how I created a big bug and the lessons I learned from the experience, in the hope that you can learn from them and be better prepared when it happens to you (it eventually will).

Presented at: RubyConf 2017

Sebastian Sogamoso

November 10, 2017
Transcript

  1. A B

  2. A B

  3. A B

  4. A B

  5. Recap • Users carpooled every day • The payment process ran once a week • Passengers were charged • Drivers were paid

  6. Black Saturday • 06:00 Weekly process ran • 06:25 User couldn’t pay for breakfast • 06:34 Users reported bug • 22:50

  7. Boss: hey, sorry to call you this early but we have a problem with payments in production and a lot of customers are complaining about it

  8. Black Saturday • 06:00 Weekly process ran • 06:25 User couldn’t pay for breakfast • 06:34 Users reported bug • 06:43 Manager woke me up

  9. Black Saturday • 06:00 Weekly process ran • 06:25 User couldn’t pay for breakfast • 06:34 Users reported bug • 06:43 Manager woke me up • 07:28 Problem contained

  10. Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • Passenger: User ID: 9 Driver: User ID: 100 $10.00 • 0 0 0

  11. Black Saturday • 06:00 Weekly process ran • 06:25 User couldn’t pay for breakfast • 06:34 Users reported bug • 06:43 Manager woke me up • 07:28 Problem contained • 22:50 Deployed a fix to production

  12. Black Saturday • 06:00 Weekly process ran • 06:25 User couldn’t pay for breakfast • 06:34 Users reported bug • 06:43 Manager woke me up • 07:28 Problem contained • 22:50 Deployed a fix to production • 22:55 Started looking for a new job

  13. Black Saturday • 06:00 Weekly process ran • 06:25 User couldn’t pay for breakfast • 06:34 Users reported bug • 06:43 Manager woke me up • 07:28 Problem contained • 22:50 Deployed a fix to production

  14. Thousands of users affected by the bug • Users were charged up to 200 times • A single user was charged over $5k • Maxed out credit cards • Emptied bank accounts
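
The slides above do not say what caused the repeated charges, so the following is only a general illustration, not the speaker's code or fix. It sketches one common guard against charging a passenger more than once for the same ride: an idempotency key that is checked before each charge. Every name here (`FakeGateway`, `WeeklyPayments`, the `week` label, the ride hash fields) is hypothetical, and the in-memory `Set` stands in for what would normally be a unique constraint in a database. The user IDs and the $10.00 amount are borrowed from slide 10.

```ruby
require "set"

# Hypothetical stand-in for a payment gateway: it only records charges.
class FakeGateway
  attr_reader :charges

  def initialize
    @charges = []
  end

  def charge(user_id:, amount_cents:)
    @charges << { user_id: user_id, amount_cents: amount_cents }
  end
end

# Hypothetical weekly job that charges each ride at most once per week,
# even if the job runs again or the same ride appears more than once.
class WeeklyPayments
  def initialize(gateway)
    @gateway = gateway
    # Assumption: in a real system this would be persisted and enforced
    # by a unique index, not kept in memory.
    @processed_keys = Set.new
  end

  def run(rides, week:)
    rides.each do |ride|
      key = "#{week}:#{ride[:id]}" # idempotency key for this ride and week
      # Set#add? returns nil when the key is already present, so a
      # repeated ride is skipped instead of being charged again.
      next unless @processed_keys.add?(key)

      @gateway.charge(user_id: ride[:passenger_id], amount_cents: ride[:amount_cents])
    end
  end
end

gateway = FakeGateway.new
job = WeeklyPayments.new(gateway)
rides = [{ id: 1, passenger_id: 9, driver_id: 100, amount_cents: 1000 }]

job.run(rides, week: "week-1")
job.run(rides, week: "week-1") # accidental second run of the same week
puts gateway.charges.length
```

With this guard in place, the second run finds the key already recorded and makes no new charge. Many payment providers offer a similar mechanism on their side, which is worth combining with a local check like this one.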