Distributed Systems: Your Only Guarantee Is Inconsistency

Anthony
September 29, 2017

Transcript

  1. Distributed Systems
    Your Only Guarantee Is Inconsistency

  3. Our end-of-month pipeline
    ● Generate the user's invoice
    ● Charge them
    ● Email them
    ● Place account holds on delinquent users
    ● Generate reports for internal finance teams
    ● Perform other relevant actions

  4. Architecture goals
    Where we are...
    Where we want to be...

  5. There's a lot to do
    class MonthClose
      def perform
        generate_invoice_items               # expensive
        amount = generate_invoice            # expensive
        success = charge_balance(amount)     # external dependencies
        email_user(amount)                   # external dependencies
        handle_failed_charge unless success  # complicated and messy
      end
    end

  6. What if it fails halfway through?

  7. Background workers

  8. Background workers
    ● Persistent jobs (SQL or Redis)
    ● Prioritized job queues
    ● Immediate, recurring, or delayed scheduling
    ● Expect failure: automatic retries
    ● Batch jobs with success/failure callbacks
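
The "expect failure: automatic retries" point can be made concrete with a toy queue. This is a minimal in-memory sketch, not how Sidekiq, Delayed Job, or Resque are implemented; `TinyQueue`, `max_attempts`, and the flaky lambda are all hypothetical names made up for illustration.

```ruby
# Toy in-memory job queue with automatic retries (illustration only).
class TinyQueue
  Job = Struct.new(:callable, :args, :attempts)

  attr_reader :dead

  def initialize(max_attempts: 3)
    @max_attempts = max_attempts
    @jobs = []
    @dead = [] # jobs that used up all their attempts
  end

  def enqueue(callable, *args)
    @jobs << Job.new(callable, args, 0)
  end

  # Run until the queue is empty. A failed job goes back on the queue
  # until it has been attempted max_attempts times.
  def drain
    until @jobs.empty?
      job = @jobs.shift
      job.attempts += 1
      begin
        job.callable.call(*job.args)
      rescue StandardError
        if job.attempts < @max_attempts
          @jobs << job
        else
          @dead << job
        end
      end
    end
  end
end

calls = 0
flaky = lambda do |user_id|
  calls += 1
  raise "payment gateway timeout" if calls < 3 # fails twice, then succeeds
  "charged user #{user_id}"
end

queue = TinyQueue.new(max_attempts: 5)
queue.enqueue(flaky, 42)
queue.drain
```

Because a retried job runs its body more than once, any side effect inside it (a charge, an email) has to tolerate being repeated.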

  10. Background workers
    ● Sidekiq
    ● Delayed Job
    ● Resque

  11. # app/workers/expensive_job_worker.rb
      class ExpensiveJobWorker
        include Sidekiq::Worker
        sidekiq_options(queue: :high)
        def perform(args)
          ExpensiveJob.new(args).expensive_method
        end
      end
      # app/lib/expensive_job.rb
      class ExpensiveJob
        def initialize(args)
          @args = args
        end
        def expensive_method
        end
      end

  12. # Run it in the background
    ExpensiveJobWorker.perform_async(args)

  13. # Run it in the background… in 10 minutes
    ExpensiveJobWorker.perform_in(10.minutes, args)

  14. # Run it in the background every day
    # whenever gem => https://github.com/javan/whenever
    every :day do
      runner "ExpensiveJobWorker.perform_async(args)"
    end

  15. We can do better
    class MonthCloseWorker
      include Sidekiq::Worker
      def perform
        generate_invoice_items
        amount = generate_invoice
        success = charge_balance(amount)
        email_user(amount)
        handle_failed_charge unless success
      end
    end

  16. Applying it to our use case
    class MonthCloseWorker
      include Sidekiq::Worker
      def perform
        generate_invoice_items
        amount = generate_invoice
        PaymentWorker.perform_async(amount)
        EmailWorker.perform_async(amount)
      end
    end
    class PaymentWorker
      include Sidekiq::Worker
      def perform(amount)
        success = charge_user(amount)
        HandleFailedChargeWorker.perform_async unless success
      end
    end
    class HandleFailedChargeWorker
      include Sidekiq::Worker
      def perform
        handle_failed_charge
      end
    end
    class EmailWorker
      include Sidekiq::Worker
      def perform(amount)
        email_user(amount)
      end
    end

  17. So much better
    Before
    ● ~30 minutes per user (on average)
    ● 1-2 days for entire month close process
    After
    ● <10 minutes per user
    ● <8 hours for entire month close process

  18. Whoops!
    We just introduced all kinds of bugs

  20. Average time between steps
    Before: 10 µs
    After: 5 min? 10 min?

  21. Our mental models describe an ideal world
    They're created from user stories or an ideal workflow
    They don't necessarily represent reality

  22. Ideal workflows
    Invoice is generated and then payment is attempted

  23. Ideal workflows
    Payment fails and then the user is suspended

  24. Ideal workflows
    Payment succeeds and then the user is emailed a receipt

  25. Notice the “and then”s?

  26. Reality likes “but”s

  27. Real world workflows
    Invoice is generated, but the user applied a credit before the
    payment could be made

  28. Real world workflows
    Payment is attempted but the user removed their credit card
    before we realized we couldn’t charge them

  29. Real world workflows
    Payment is attempted but the user already paid manually

  30. Learning #1
    When you pass information, you are working under the
    assumption that it still represents the state of the world
    at the time it is used
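
A small sketch of how a passed value goes stale between enqueue and execution. Everything here is a hypothetical stand-in (a plain array as the queue, an `Account` struct, amounts in cents); it only illustrates the gap in time, not the talk's real billing code.

```ruby
# Hypothetical in-memory stand-ins; amounts are in cents.
Account = Struct.new(:balance)

account = Account.new(30_00)

queue = []
# The invoice step enqueues the amount it saw at that moment.
queue << { amount: account.balance }

# Minutes pass before a worker picks the job up.
# In the meantime the user applies a credit.
account.balance -= 10_00

job = queue.shift
stale_amount   = job[:amount]     # what the job was told
current_amount = account.balance  # what is true when the job runs
```

In the synchronous version the two values could not realistically diverge; once a queue sits between the steps, they can.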

  31. Learning #2
    Changing methods from synchronous to asynchronous is an
    implicit change in behavior

  32. What can we do?

  33. “Well we need payments to run immediately after an invoice is generated, so
    we'll mark it highest priority”

  34. “Well we need payments to run immediately after an invoice is generated, so
    we'll mark it highest priority”
    NO!

  35. Don’t engage in a priority arms race

  36. queue_priority:
    - this_one_first_do_not_move_down
    - super_critical
    - critical
    - highest
    - higher
    - high
    - default

  37. So… what can we do?

  38. Assume the world changes

  39. class PaymentWorker
        def perform(amount)
          current_balance = user.balance
          if current_balance != amount
            # charge user? throw error? do nothing?
          else
            charge_user(amount)
          end
        end
      end
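
The slide leaves the mismatch branch open. One possible answer, sketched here as an assumption rather than the speaker's choice, is to treat the passed amount as a hint, re-read the balance when the job runs, and charge what is owed at that moment. `User`, `FakeGateway`, and `PaymentJob` are hypothetical stand-ins so the example runs without Rails or Sidekiq.

```ruby
# Hypothetical stand-ins; amounts are in cents.
User = Struct.new(:id, :balance)

class FakeGateway
  attr_reader :charges

  def initialize
    @charges = []
  end

  # Records the charge and always succeeds.
  def charge(user, amount)
    @charges << [user.id, amount]
    true
  end
end

class PaymentJob
  def initialize(users, gateway)
    @users = users
    @gateway = gateway
  end

  # expected_amount is what the invoice step saw: a hint, not a fact.
  def perform(user_id, expected_amount)
    user = @users.fetch(user_id)
    current = user.balance # re-read state at execution time
    return :nothing_owed if current <= 0

    # Assumed policy: charge what is owed right now.
    if @gateway.charge(user, current)
      user.balance -= current
      current == expected_amount ? :charged : :charged_adjusted
    else
      :charge_failed
    end
  end
end

users = { 1 => User.new(1, 20_00) } # user applied a credit after invoicing
gateway = FakeGateway.new
job = PaymentJob.new(users, gateway)

first  = job.perform(1, 30_00) # invoice step saw 30.00
second = job.perform(1, 30_00) # a retry or duplicate delivery
```

Under this policy a retry or duplicate delivery finds nothing owed and does not charge again. Raising an error or doing nothing are equally valid policies; which one is right depends on the business rules.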

  40. Bonus

  42. Freeze your world in time

  43. From: [email protected]
    Subject: Your August 2017 Invoice
    Hi Anthony,
    Thanks for being a loyal customer!
    As of 2017-08-07 19:31:09 PST, your balance is $10.00.
    Thanks,
    DigitalOcean

  44. From: [email protected]
    Subject: Your August 2017 Invoice
    Hi Anthony,
    Thanks for being a loyal customer!
    As of 2017-08-07 19:31:09 PST, your balance is $10.00.
    As of 2017-08-08 03:31:09 CET, your balance is Ft2565.
    Thanks,
    DigitalOcean
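
The "as of" line in these emails suggests capturing the balance together with a timestamp and rendering the message from that frozen value, so a late-running email job cannot contradict itself. A minimal sketch, assuming a hypothetical `BalanceSnapshot` struct, amounts in cents, and an illustrative UTC timestamp:

```ruby
# Hypothetical value object: the balance as it was at one point in time.
BalanceSnapshot = Struct.new(:user_name, :balance_cents, :captured_at) do
  def to_email_line
    format("As of %s, your balance is $%.2f.",
           captured_at.getutc.strftime("%Y-%m-%d %H:%M:%S UTC"),
           balance_cents / 100.0)
  end
end

# Captured once, when the invoice is generated, then frozen.
snapshot = BalanceSnapshot.new("Anthony", 1000, Time.utc(2017, 8, 8, 3, 31, 9)).freeze

# The email job renders from the snapshot, however late it runs.
line = snapshot.to_email_line
```

The statement stays true whenever it is read, because it only claims what the balance was at the captured time.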

  45. Embrace the inconsistency

  46. Thanks
    [email protected]
    @azacharax
