Distributed Systems: Your Only Guarantee Is Inconsistency

Anthony
September 29, 2017


Transcript

  1. Distributed Systems: Your Only Guarantee Is Inconsistency

  2. None
  3. Our end-of-month pipeline

    • Generate the user's invoice
    • Charge them
    • Email them
    • Place account holds on delinquent users
    • Generate reports for internal finance teams
    • Perform other relevant actions
  4. Architecture goals: Where we are... Where we want to be...

  5. There's a lot to do

    class MonthClose
      def perform
        generate_invoice_items               # expensive
        amount = generate_invoice            # expensive
        success = charge_balance(amount)     # external dependencies
        email_user(amount)                   # external dependencies
        handle_failed_charge unless success  # complicated and messy
      end
    end
  6. What if it fails halfway through?
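
To make the question concrete, here is a minimal plain-Ruby sketch (no background job library involved) of a month close whose charge step fails once and is then naively rerun from the top. `Ledger`, `FlakyGateway`, and `month_close` are hypothetical stand-ins invented for illustration, not code from the talk.

```ruby
# Hypothetical in-memory stand-ins; none of these names come from the talk.
Ledger = Struct.new(:invoice_items, :charges)

class FlakyGateway
  def initialize
    @calls = 0
  end

  # Fails on the first call to simulate an external outage, then succeeds.
  def charge(amount)
    @calls += 1
    raise "gateway timeout" if @calls == 1

    amount
  end
end

# Mirrors the shape of MonthClose#perform: every step runs in one method.
def month_close(ledger, gateway)
  ledger.invoice_items << 10                # stands in for generate_invoice_items
  amount = ledger.invoice_items.sum         # stands in for generate_invoice
  ledger.charges << gateway.charge(amount)  # stands in for charge_balance
end

ledger = Ledger.new([], [])
gateway = FlakyGateway.new

begin
  month_close(ledger, gateway)
rescue RuntimeError
  # A naive retry reruns every step, including the ones that already succeeded.
  month_close(ledger, gateway)
end

puts ledger.invoice_items.inspect # => [10, 10]
puts ledger.charges.inspect       # => [20]
```

In this sketch the rerun adds a second invoice item, so the user is charged 20 instead of 10. Failing halfway is not only about lost work; it is also about work that gets repeated.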

  7. Background workers

  8. Background workers

    • Persistent jobs (SQL or Redis)
    • Prioritized job queues
    • Immediate, recurring, or delayed scheduling
    • Expect failure: automatic retries
    • Batch jobs with success/failure callbacks
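
Two of the features listed above, prioritized queues and automatic retries, can be sketched in a few lines of plain Ruby. This is a toy, assuming an in-memory array instead of SQL or Redis; `TinyQueue` and its methods are hypothetical and are not the API of Sidekiq, Delayed Job, or Resque.

```ruby
# A toy in-memory job runner; real libraries persist jobs in SQL or Redis.
# TinyQueue and its methods are hypothetical, not any library's API.
class TinyQueue
  Job = Struct.new(:name, :priority, :attempts, :work)

  attr_reader :log

  def initialize(max_attempts: 3)
    @jobs = []
    @max_attempts = max_attempts
    @log = []
  end

  # A lower priority number runs first.
  def enqueue(name, priority: 5, &work)
    @jobs << Job.new(name, priority, 0, work)
  end

  # Runs jobs in priority order until the queue is empty.
  def drain
    until @jobs.empty?
      job = @jobs.min_by(&:priority)
      @jobs.delete(job)
      run(job)
    end
  end

  private

  def run(job)
    job.attempts += 1
    job.work.call(job.attempts)
    @log << "#{job.name}: ok after #{job.attempts} attempt(s)"
  rescue StandardError => e
    if job.attempts < @max_attempts
      @jobs << job # expect failure: put the job back for another try
    else
      @log << "#{job.name}: gave up (#{e.message})"
    end
  end
end

queue = TinyQueue.new
queue.enqueue("email", priority: 9) { |_attempt| :sent }
queue.enqueue("charge", priority: 1) { |attempt| raise "timeout" if attempt < 3 }
queue.drain
puts queue.log
# charge: ok after 3 attempt(s)
# email: ok after 1 attempt(s)
```

The retry only works because the failed job goes back into the queue as data. That is the same reason real libraries persist jobs rather than holding them in process memory.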
  9. None
  10. Background workers

    • Sidekiq
    • Delayed Job
    • Resque

  11. # app/workers/expensive_job_worker.rb

    class ExpensiveJobWorker
      include Sidekiq::Worker
      sidekiq_options(queue: :high)

      def perform(args)
        ExpensiveJob.new(args).expensive_method
      end
    end

    # app/lib/expensive_job.rb
    class ExpensiveJob
      def initialize(args)
        @args = args
      end

      def expensive_method
      end
    end
  12. # Run it in the background

    ExpensiveJobWorker.perform_async(args)

  13. # Run it in the background… in 10 minutes

    ExpensiveJobWorker.perform_in(10.minutes, args)
  14. # Run it in the background every day

    # whenever gem => https://github.com/javan/whenever
    every :day do
      runner "ExpensiveJobWorker.perform_async(args)"
    end
  15. We can do better

    class MonthCloseWorker
      def perform
        generate_invoice_items
        amount = generate_invoice
        charge_balance(amount)
        email_user(amount)
        handle_failed_charge
      end
    end
  16. Applying it to our use case

    class MonthCloseWorker
      def perform
        generate_invoice_items
        amount = generate_invoice
        PaymentWorker.perform_async(amount)
        EmailWorker.perform_async(amount)
      end
    end

    class PaymentWorker
      def perform(amount)
        success = charge_user(amount)
        HandleFailedChargeWorker.perform_async unless success
      end
    end

    class HandleFailedChargeWorker
      def perform
        handle_failed_charge
      end
    end

    class EmailWorker
      def perform(amount)
        email_user(amount)
      end
    end
  17. So much better

    Before
    • ~30 minutes per user (on average)
    • 1-2 days for entire month close process

    After
    • <10 minutes per user
    • <8 hours for entire month close process
  18. Whoops! We just introduced all kinds of bugs

  19. None
  20. Average time between steps

    Before: 10 µs
    After: 5 min? 10 min?
  21. Our mental models are an ideal world. They're created from user stories or an ideal workflow. They don't necessarily represent reality.
  22. Ideal workflows: Invoice is generated and then payment is attempted

  23. Ideal workflows: Payment fails and then the user is suspended

  24. Ideal workflows: Payment succeeds and then the user is emailed a receipt
  25. Notice the and thens?

  26. Reality likes buts

  27. Real world workflows: Invoice is generated, but the user applied a credit before the payment could be made

  28. Real world workflows: Payment is attempted but the user removed their credit card before we realized we couldn’t charge them

  29. Real world workflows: Payment is attempted but the user already paid manually
  30. Learning #1: When you pass information, you are working under the assumption that it represents the state of the world at that time
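
A minimal sketch of this learning, assuming a plain array as the job queue and a hypothetical `User` struct (neither is from the talk): the amount is copied into the job when it is enqueued, and the user's balance changes before the job runs.

```ruby
# Hypothetical in-memory model; User and the array queue are illustration only.
User = Struct.new(:balance)

user = User.new(10)
queue = []

# Enqueue time: the current balance is copied into the job's arguments.
queue << { job: "PaymentWorker", amount: user.balance }

# Minutes pass before a worker picks the job up. Meanwhile the user
# applies a credit, so the real balance changes.
user.balance -= 4

# Execution time: the job still carries the amount from enqueue time.
job = queue.shift
stale = job[:amount] != user.balance
puts "job amount: #{job[:amount]}, current balance: #{user.balance}, stale: #{stale}"
# => job amount: 10, current balance: 6, stale: true
```

The job argument is a snapshot, not a live reference. Any time that passes between enqueue and execution is time in which the snapshot can go out of date.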
  31. Learning #2: Changing methods from synchronous to asynchronous is an implicit change in behavior
  32. What can we do?

  33. “Well we need payments to run immediately after an invoice is generated, so we'll mark it highest priority”

  34. “Well we need payments to run immediately after an invoice is generated, so we'll mark it highest priority” NO!
  35. Don’t engage in a priority arms race

  36. queue_priority:

      - this_one_first_do_not_move_down
      - super_critical
      - critical
      - highest
      - higher
      - high
      - default
  37. So… what can we do?

  38. Assume the world changes

  39.
    class PaymentWorker
      def perform(amount)
        current_balance = user.balance
        if current_balance != amount
          # charge user? throw error? do nothing?
        else
          charge_user(amount)
        end
      end
    end
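
The slide leaves the mismatch branch as an open question. The sketch below picks one possible policy purely for illustration: charge whatever is owed at execution time, and do nothing if nothing is owed. This policy is an assumption, not the talk's answer, and `User`, the constructor argument, and the return symbols are hypothetical.

```ruby
# Hypothetical stand-ins; the policy shown is one option, not the talk's answer.
User = Struct.new(:balance)

class PaymentWorker
  attr_reader :charged

  def initialize(user)
    @user = user
    @charged = []
  end

  # `amount` is what the balance was when the job was enqueued.
  def perform(amount)
    current_balance = @user.balance # re-read state at execution time

    if current_balance <= 0
      :nothing_to_charge
    elsif current_balance != amount
      charge_user(current_balance)
      :charged_current_balance
    else
      charge_user(amount)
      :charged_as_enqueued
    end
  end

  private

  # Stand-in for a real payment call: records the charge and settles the balance.
  def charge_user(amount)
    @charged << amount
    @user.balance -= amount
  end
end

user = User.new(10)
worker = PaymentWorker.new(user)
user.balance = 6 # a credit was applied between enqueue and execution

first = worker.perform(10)
second = worker.perform(10) # e.g. the same job delivered a second time
p first          # => :charged_current_balance
p second         # => :nothing_to_charge
p worker.charged # => [6]
```

Whichever policy fits the business, the important part is the re-read on the first line of `perform`: the worker treats its argument as a hint and the current state as the truth.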
  40. Bonus

  41. None
  42. Freeze your world in time

  43. From: billing@digitalocean.com
    Subject: Your August 2017 Invoice

    Hi Anthony,

    Thanks for being a loyal customer! As of 2017-08-07 19:31:09 PST, your balance is $10.00.

    Thanks,
    DigitalOcean
  44. From: billing@digitalocean.com
    Subject: Your August 2017 Invoice

    Hi Anthony,

    Thanks for being a loyal customer! As of 2017-08-07 19:31:09 PST, your balance is $10.00. As of 2017-08-08 03:31:09 CET, your balance is Ft2565.

    Thanks,
    DigitalOcean
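
One way to read "freeze your world in time" is to capture the value together with the moment it was read, and to print both. The sketch below assumes a hypothetical `BalanceSnapshot` struct and `render_invoice_email` helper, with a made-up timestamp in UTC; it does not attempt the time zone or currency conversion shown on the slide.

```ruby
# Hypothetical helper names; not taken from the talk or any library.
BalanceSnapshot = Struct.new(:amount_cents, :captured_at)

# Renders only what was true at capture time and says when that was.
def render_invoice_email(name, snapshot)
  stamp = snapshot.captured_at.getutc.strftime("%Y-%m-%d %H:%M:%S UTC")
  dollars = format("%.2f", snapshot.amount_cents / 100.0)
  "Hi #{name},\nAs of #{stamp}, your balance is $#{dollars}."
end

# The snapshot is taken once, when the balance is read, and passed along
# to whichever worker eventually sends the email.
snapshot = BalanceSnapshot.new(1000, Time.utc(2017, 8, 8, 2, 31, 9))
email = render_invoice_email("Anthony", snapshot)
puts email
# Hi Anthony,
# As of 2017-08-08 02:31:09 UTC, your balance is $10.00.
```

The email stays correct no matter how late it is sent, because it makes a claim about a specific moment rather than about "now".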
  45. Embrace the inconsistency

  46. Thanks azacharakis@do.co @azacharax