Reliability

Piyush Verma
September 21, 2019

Every product either dies a hero or lives long enough to hit reliability issues.
Whether it’s your code or a service that you connect to, there will be a disk that fails, a network that partitions, a CPU that throttles, or memory that fills up.
While you go about fixing this, what is the cost of failure, in both effort and lost business, and how much does each nine of reliability cost?
The talk takes a simple sample product and examines each of its failure points in depth. We take one fault at a time and introduce incremental changes to the architecture, the product, and the support structure (such as monitoring and logging) to detect and overcome those failures.
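
As a rough illustration of what "each nine" means in downtime terms, the arithmetic can be sketched as below. This is not taken from the talk: the function name is made up for the example, and it assumes a 365-day year and that availability is measured as a fraction of wall-clock time.

```python
# Yearly downtime budget for a given number of nines of availability.
# Assumption: a 365-day year; leap years are ignored.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(nines: int) -> float:
    """Return the minutes per year a service may be down at `nines` nines."""
    unavailability = 10 ** (-nines)  # e.g. 3 nines -> 0.001 (99.9% available)
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(1, 6):
        availability = 100 * (1 - 10 ** (-n))
        budget = allowed_downtime_minutes(n)
        print(f"{n} nines ({availability:.3f}%): {budget:.2f} min/year")
```

Each additional nine shrinks the budget by a factor of ten: three nines leaves about 525.6 minutes (roughly 8.76 hours) per year, while five nines leaves a little over 5 minutes.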

Transcript

  1. 01 At least one server is online. 02 All servers are below 100%. 03 All servers are responding within x ms. 04 All of the above. (slide 5)
  2. 6

  3. Sample Product: User sends SMS “Remind me to buy milk at 6:30 PM” to 53308. Service receives SMS. Cron gets activated when the time is right. Call the user. (slide 7)
  4. Sample Product: Inbound. User sends SMS “Remind me to buy milk at 6:30 PM” to 53308. Service receives SMS. Cron gets activated when the time is right. Call the user. (slide 8)
  5. Sample Product: Outbound. User sends SMS “Remind me to buy milk at 6:30 PM” to 53308. Service receives SMS. Cron gets activated when the time is right. Call the user. (slide 9)
  6. “A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable” (Leslie Lamport, https://www.microsoft.com/en-us/research/uploads/prod/2016/12/Distribution.pdf) (slide 11)
  7. Sample Product: Outbound, Part 1. User sends SMS “Remind me to buy milk at 6:30 PM” to 53308. Service receives SMS. Cron gets activated when the time is right. Call the user. (slide 35)
  8. Sample Product: User sends SMS “Remind me to buy milk at 6:30 PM” to 53308. Service receives SMS. Cron gets activated when the time is right. Call the user. (slide 53)
  9. 64
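
The sample product that recurs in the slides (an SMS comes in, a cron job fires when the time is right, the user gets a call) could be sketched roughly as follows. This is a minimal in-memory illustration, not the speaker's implementation: the `Reminder` class, `parse_sms`, `cron_tick`, the message pattern, and the `call` callback are all hypothetical names, and the SMS gateway and telephony provider are left out entirely.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reminder:
    phone: str
    task: str
    due: datetime


# Assumed message shape, modelled on the slide's example text.
PATTERN = re.compile(
    r"remind me to (?P<task>.+) at "
    r"(?P<hour>1[0-2]|0?[1-9]):(?P<minute>[0-5]\d)\s*(?P<ampm>[AP]M)",
    re.IGNORECASE,
)


def parse_sms(sender: str, body: str, now: datetime) -> Reminder:
    """Inbound path: turn an SMS body into a Reminder, or raise ValueError."""
    match = PATTERN.fullmatch(body.strip())
    if match is None:
        raise ValueError(f"unrecognised reminder: {body!r}")
    hour = int(match["hour"]) % 12  # 12 AM -> 0, 12 PM -> 0 (+12 below)
    if match["ampm"].upper() == "PM":
        hour += 12
    due = now.replace(hour=hour, minute=int(match["minute"]),
                      second=0, microsecond=0)
    if due <= now:  # time already passed today, so schedule for tomorrow
        due += timedelta(days=1)
    return Reminder(phone=sender, task=match["task"], due=due)


def cron_tick(store: list, now: datetime, call) -> None:
    """Outbound path: call every user whose reminder is due, then drop it."""
    due = [r for r in store if r.due <= now]
    for reminder in due:
        call(reminder.phone, reminder.task)  # stand-in for a telephony API
        store.remove(reminder)


if __name__ == "__main__":
    store = []
    received_at = datetime(2019, 9, 21, 12, 0)
    store.append(parse_sms("+10000000000",
                           "Remind me to buy milk at 6:30 PM", received_at))
    cron_tick(store, datetime(2019, 9, 21, 18, 30),
              lambda phone, task: print(f"calling {phone}: {task}"))
```

Even in this toy form, each step is a separate place where things can go wrong: the message may not parse, the store lives only in memory, and a failed `call` would still remove the reminder.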