Background jobs at scale (Montreal.rb)

Talks at Montreal.rb

Kerstin Puschke

February 26, 2019

Transcript

  1. Kerstin Puschke
    @titanoboa42
    Background jobs at scale

  3. Scaling applications using background jobs
    keeping code simple

  7. Outline
    • Introduction to background jobs
    • Features
    • Mastering challenges

  11. Outline
    • Being RESTful
    • Background jobs at scale
    • Summary

  12. Introduction to background jobs

  13. Background job: work to be done later
    [Diagram: App Server, Worker]

  16. Asynchronous communication
    [Diagram: App Server, Message Queue, Task Queue, three Workers]

  17. Background job backend: task queue & broker

  18. Encapsulating async communication
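
As a rough illustration of slides 13 to 18, the sketch below uses Ruby's thread-safe `Queue` as a stand-in for the task queue and plain threads as workers. The class name `InMemoryBackend` and its methods are invented for this example; a real backend would keep the queue in a broker rather than in process memory.

```ruby
# In-process stand-in for a background job backend: a task queue plus workers.
# A real backend would keep the queue in a broker such as Redis.
class InMemoryBackend
  def initialize
    @queue = Queue.new     # thread-safe FIFO from Ruby's core library
    @results = Queue.new   # collects worker output for inspection
  end

  # Called by the app server: returns immediately, the work happens later.
  def enqueue(job_name, payload)
    @queue << { name: job_name, payload: payload }
  end

  # Starts `count` worker threads that process jobs until the queue is closed.
  def start_workers(count)
    @workers = Array.new(count) do
      Thread.new do
        # Queue#pop blocks while the queue is empty and returns nil once it is closed.
        while (job = @queue.pop)
          @results << "#{job[:name]} processed #{job[:payload]}"
        end
      end
    end
  end

  # Stops accepting jobs, waits for the workers, and returns everything they produced.
  def drain
    @queue.close
    @workers.each(&:join)
    Array.new(@results.size) { @results.pop }
  end
end

backend = InMemoryBackend.new
backend.start_workers(3)
backend.enqueue("SendEmail", "user-1")
backend.enqueue("SendEmail", "user-2")
puts backend.drain.sort
```

The caller of `enqueue` never waits for the job itself, which is the asynchronous communication the backend encapsulates.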

  19. Features

  20. Response times
    [Diagram: App Server, Task Queue, Worker]

  21. Spikeability
    [Diagram: App Server, Task Queue, Worker]

  22. Parallelization
    [Diagram: App Server, Task Queue, three Workers]

  23. Retries
    [Diagram: App Server, Task Queue, three Workers]
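
To make the retry feature concrete, here is a minimal in-process sketch with exponential backoff. The helper `perform_with_retries` and its parameters are assumptions made for this example; real backends usually retry by putting the failed job back on the queue instead of looping inside the worker.

```ruby
# Runs a job and retries it with exponential backoff when it raises.
# `sleeper` is injectable so the example does not actually wait.
def perform_with_retries(max_attempts: 3, base_delay: 1, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    raise if attempt >= max_attempts   # give up and surface the error
    sleeper.call(base_delay * 2**(attempt - 1))
    retry
  end
end

delays = []
result = perform_with_retries(sleeper: ->(s) { delays << s }) do |attempt|
  raise "flaky dependency" if attempt < 3
  "succeeded on attempt #{attempt}"
end
puts result          # succeeded on attempt 3
puts delays.inspect  # [1, 2]
```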

  24. Prioritization
    [Diagram: App Server, High Prio Queue, Low Prio Queue, two Workers]
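
One simple way to read the prioritization diagram is that workers always check the high priority queue before the low priority one. The sketch below assumes that strict ordering; the class `PrioritizedQueues` is hypothetical.

```ruby
# Worker-side polling order: always drain the high priority queue first.
class PrioritizedQueues
  def initialize
    @high = []
    @low = []
  end

  def enqueue(job, priority: :low)
    (priority == :high ? @high : @low) << job
  end

  # Returns the next job a worker should run, or nil when both queues are empty.
  def next_job
    @high.shift || @low.shift
  end
end

queues = PrioritizedQueues.new
queues.enqueue("rebuild search index")
queues.enqueue("send password reset", priority: :high)
queues.enqueue("export report")
order = []
while (job = queues.next_job)
  order << job
end
puts order.inspect
# ["send password reset", "rebuild search index", "export report"]
```

With strict ordering, a steady stream of high priority jobs can starve the low priority queue, so dedicated workers per queue may be a better fit in some setups.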

  27. Mastering challenges

  30. No exactly-once delivery
    • “At-least-once” vs. “at-most-once” delivery
    • Idempotent jobs & at-least-once delivery
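
A possible shape for an idempotent job under at-least-once delivery is sketched below. `ProcessedLedger` and `ChargeCustomerJob` are hypothetical names; the ledger stands in for a durable store such as a database table with a unique index on the key.

```ruby
require "set"

# Remembers which idempotency keys have already been handled.
class ProcessedLedger
  def initialize
    @keys = Set.new
  end

  # Returns true only the first time a key is recorded.
  def record_once(key)
    !@keys.add?(key).nil?
  end
end

# Running this job twice for the same order has the same effect as running it once.
class ChargeCustomerJob
  attr_reader :charges

  def initialize(ledger)
    @ledger = ledger
    @charges = []
  end

  def perform(order_id, amount_cents)
    # A redelivered job finds its key already recorded and becomes a no-op.
    return :skipped unless @ledger.record_once("charge:#{order_id}")
    @charges << [order_id, amount_cents]
    :charged
  end
end

job = ChargeCustomerJob.new(ProcessedLedger.new)
puts job.perform(42, 1999)   # charged
puts job.perform(42, 1999)   # skipped (simulated redelivery)
puts job.charges.length      # 1
```

In a real system the key and the side effect would need to be written atomically; otherwise a crash between the two steps could lose the charge.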

  33. Out of order delivery
    • If order matters, queue sequentially
    • First job queues follow-up jobs
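
The idea that the first job queues its follow-up jobs can be sketched as follows. The `Pipeline` class and its step names are made up for illustration, and the worker deliberately picks jobs in random order to mimic out-of-order delivery.

```ruby
# Order is enforced by having each job enqueue its successor only after it has finished.
class Pipeline
  attr_reader :log

  def initialize
    @queue = []
    @log = []
  end

  def enqueue(step, order_id)
    @queue << [step, order_id]
  end

  # Simulates a worker that takes jobs in arbitrary order. The chain still holds,
  # because a follow-up job does not exist until its predecessor has completed.
  def work_off
    until @queue.empty?
      step, order_id = @queue.delete_at(rand(@queue.length))
      send(step, order_id)
    end
  end

  private

  def reserve_stock(order_id)
    @log << "reserve_stock #{order_id}"
    enqueue(:charge_payment, order_id)   # first job queues the follow-up job
  end

  def charge_payment(order_id)
    @log << "charge_payment #{order_id}"
    enqueue(:send_receipt, order_id)
  end

  def send_receipt(order_id)
    @log << "send_receipt #{order_id}"
  end
end

pipeline = Pipeline.new
pipeline.enqueue(:reserve_stock, 1)
pipeline.enqueue(:reserve_stock, 2)
pipeline.work_off
puts pipeline.log
```

Jobs for different orders may interleave, but the three steps of any single order always appear in sequence.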

  36. Job queued and processed by different versions
    • No breaking changes to job parameters
    • Changes need to be backwards compatible until legacy jobs have been processed
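
A small sketch of a backwards compatible parameter change: a hypothetical `WelcomeEmailJob` gains a `locale` argument in its second version, while jobs enqueued by the first version are still waiting in the queue with a single argument.

```ruby
# Version 2 of the job adds a `locale` parameter. Payloads written by version 1
# carry only the user id, so the new parameter needs a default.
class WelcomeEmailJob
  DEFAULT_LOCALE = "en"

  # v1 signature was perform(user_id); v2 adds an optional second argument.
  def perform(user_id, locale = DEFAULT_LOCALE)
    "welcome email for user #{user_id} in #{locale}"
  end
end

job = WelcomeEmailJob.new
legacy_args = [7]          # payload serialized by the old version
current_args = [7, "fr"]   # payload serialized by the new version
puts job.perform(*legacy_args)    # welcome email for user 7 in en
puts job.perform(*current_args)   # welcome email for user 7 in fr
```

Once no legacy payloads remain in the queue, the default could be dropped in a later release.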

  39. Eventual consistency (at best)
    • Prepare for inconsistency
    • Trade-off: lack of consistency guarantees vs. benefits of background jobs

  43. Non-transactional queuing
    • Don’t queue from within a db transaction
    • Otherwise the job may run before the commit, or even after a rollback
    • Commit before queuing or stage transactionally
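
Staging jobs transactionally could look roughly like the sketch below, which assumes that "staging" means writing the job to the same database as the data and forwarding it to the queue only after commit. `ToyDatabase` is a toy stand-in with all-or-nothing transactions, not a real database client.

```ruby
# Toy database in which rows and staged jobs commit or roll back together.
class ToyDatabase
  attr_reader :rows, :staged_jobs

  def initialize
    @rows = []
    @staged_jobs = []
  end

  def transaction
    rows_before = @rows.dup
    jobs_before = @staged_jobs.dup
    yield self
  rescue StandardError
    @rows = rows_before          # roll back both tables
    @staged_jobs = jobs_before
    raise
  end

  def insert_row(row)
    @rows << row
  end

  def stage_job(job)
    @staged_jobs << job
  end

  # Run by a separate step after commit: moves staged jobs to the real queue.
  def flush_staged_jobs(queue)
    queue.concat(@staged_jobs)
    @staged_jobs = []
  end
end

queue = []
db = ToyDatabase.new

db.transaction do |tx|
  tx.insert_row({ id: 1, email: "a@example.com" })
  tx.stage_job([:welcome_email, 1])
end

begin
  db.transaction do |tx|
    tx.insert_row({ id: 2, email: "b@example.com" })
    tx.stage_job([:welcome_email, 2])
    raise "validation failed"   # rollback: neither the row nor the job survives
  end
rescue RuntimeError
  # the caller handles the failure
end

db.flush_staged_jobs(queue)
puts queue.inspect   # [[:welcome_email, 1]]
```

Because the job only reaches the queue after the commit, a worker can never see it before the row exists or after the row was rolled back.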

  44. Being RESTful

  49. Don’t lie about resource creation
    • 202 Accepted
    • Location: temporary resource
    • 303 See Other
    • Location: does not represent target resource
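
One way to read these bullets is sketched below with Rack-style `[status, headers, body]` triplets built without any framework. The `ExportsApi` class, the URL scheme, and the in-memory store are all hypothetical.

```ruby
# Answers 202 with a temporary resource while the job is pending,
# and redirects with 303 once the real resource exists.
class ExportsApi
  def initialize
    @jobs = {}      # job id => :pending or the id of the finished export
    @next_id = 0
  end

  # POST /exports: the export is not created yet, so answer 202 instead of 201
  # and point at a temporary resource that represents the pending job.
  def create
    id = (@next_id += 1)
    @jobs[id] = :pending
    [202, { "Location" => "/export-jobs/#{id}" }, ""]
  end

  # Called by the worker once the background job has produced the export.
  def complete(job_id, export_id)
    @jobs[job_id] = export_id
  end

  # GET /export-jobs/:id: 200 while pending, 303 See Other once the export exists.
  def job_status(job_id)
    state = @jobs.fetch(job_id) { return [404, {}, ""] }
    return [200, {}, "pending"] if state == :pending
    [303, { "Location" => "/exports/#{state}" }, ""]
  end
end

api = ExportsApi.new
status, headers, _body = api.create
puts status                 # 202
puts headers["Location"]    # /export-jobs/1
api.complete(1, "abc")
p api.job_status(1)
```

The temporary resource never pretends to be the export itself; it only reports progress and then points to the export's own URL.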

  54. Callers can enforce (a)sync behaviour
    • Expect header
    • 202-accepted
    • 200-ok/201-created/204-no-content
    • 417 Expectation Failed
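
The following sketch shows how a server might act on such an `Expect` header. The function `plan_response` and its flags are assumptions for this example, and the expectation tokens are taken from the slide; the HTTP/1.1 specification itself only defines `100-continue`.

```ruby
# Decides how to handle a request based on a caller-supplied Expect header.
# Returns [status, mode].
SYNC_EXPECTATIONS = %w[200-ok 201-created 204-no-content].freeze

def plan_response(expect_header, supports_sync:, supports_async:)
  case expect_header&.downcase
  when nil
    supports_async ? [202, :async] : [201, :sync]   # server picks its default
  when "202-accepted"
    supports_async ? [202, :async] : [417, :rejected]
  when *SYNC_EXPECTATIONS
    supports_sync ? [expect_header.to_i, :sync] : [417, :rejected]
  else
    [417, :rejected]   # unknown expectation: Expectation Failed
  end
end

p plan_response("202-accepted", supports_sync: true, supports_async: true)   # [202, :async]
p plan_response("201-created", supports_sync: false, supports_async: true)   # [417, :rejected]
```

A caller that cannot cope with asynchronous processing thus gets an explicit 417 instead of a 202 it did not ask for.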

  55. Background jobs at scale

  58. DelayedJob is easy to get started with
    • No additional infrastructure
    • ActiveRecord

  59. ActiveJob makes swapping backends easy
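
The adapter idea behind this can be sketched in plain Ruby. None of the classes below are ActiveJob's API; in a Rails app the backend is selected through `config.active_job.queue_adapter`, while here `BaseJob.adapter` is a hypothetical stand-in for that setting.

```ruby
# Jobs talk to one interface, and the backend behind it can be swapped.
class InlineAdapter
  def enqueue(job_class, *args)
    job_class.new.perform(*args)   # run immediately, handy in tests
  end
end

class ArrayQueueAdapter
  attr_reader :queue

  def initialize
    @queue = []
  end

  def enqueue(job_class, *args)
    @queue << [job_class, args]    # a worker would pick this up later
    nil
  end
end

class BaseJob
  class << self
    attr_accessor :adapter

    def perform_later(*args)
      BaseJob.adapter.enqueue(self, *args)
    end
  end
end

class GreetJob < BaseJob
  def perform(name)
    "hello #{name}"
  end
end

BaseJob.adapter = InlineAdapter.new
puts GreetJob.perform_later("Ada")          # hello Ada

BaseJob.adapter = ArrayQueueAdapter.new     # swap the backend, job code unchanged
GreetJob.perform_later("Ada")
puts BaseJob.adapter.queue.length           # 1
```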

  63. DelayedJob issues
    • Overhead of relational database
    • Workers monitored from outside
    • Frequently needs workers to restart

  68. Resque scales
    • Redis - no relational db
    • Parent-child forking for workers
    • Rarely needs workers to restart
    • Workers manage their own state

  71. Resque issues
    • Child processes
    • Memory hungry and slow

  75. Sidekiq scales
    • Redis - no relational db
    • Threads instead of child processes
    • Fast and less memory hungry

  77. Sidekiq issues
    • Requires thread safe code
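
What thread safe code means in practice can be illustrated with a shared counter. The sketch below is generic Ruby rather than Sidekiq code, and the names `ThreadSafeCounter` and `run_jobs` are made up for this example.

```ruby
# Several worker threads share one counter. The Mutex makes the read-modify-write
# step atomic; without it, concurrent updates may be lost.
class ThreadSafeCounter
  def initialize
    @count = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @count += 1 }
  end

  def value
    @lock.synchronize { @count }
  end
end

# Simulates `threads` workers that each run `jobs_per_thread` jobs.
def run_jobs(counter, threads: 8, jobs_per_thread: 1_000)
  Array.new(threads) do
    Thread.new { jobs_per_thread.times { counter.increment } }
  end.each(&:join)
  counter.value
end

puts run_jobs(ThreadSafeCounter.new)   # 8000
```

The same concern applies to any state shared between jobs, such as class-level caches or memoized clients.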

  81. Long running jobs - Resque
    • Prevent worker shutdown
    • No deployments
    • Not cloud-friendly

  83. Long running jobs - Sidekiq
    • Aborted and requeued on shutdown
    • Job may not finish before being aborted again

  84. github.com/Shopify/job-iteration

  87. Large collections
    • Split job into collection and task to be done
    • Checkpoint after iteration & requeue
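
The split into collection and task, with a checkpoint after each iteration, might look like the sketch below. This is not the job-iteration API; `IterativeJob`, the integer cursor, and the `should_stop` callback are assumptions chosen to keep the example self-contained.

```ruby
# The collection (what to iterate over) is kept separate from the task (what to do
# per item). A cursor is checkpointed after every iteration so that a requeued job
# resumes where it stopped.
class IterativeJob
  attr_reader :processed

  def initialize(collection)
    @collection = collection
    @processed = []
  end

  # Runs from `cursor` until done or until `should_stop` returns true.
  # Returns nil when finished, otherwise the cursor to requeue with.
  def perform(cursor: 0, should_stop: -> { false })
    (cursor...@collection.length).each do |index|
      process_item(@collection[index])
      checkpoint = index + 1
      return checkpoint if should_stop.call && checkpoint < @collection.length
    end
    nil
  end

  private

  # The task to be done for a single item.
  def process_item(item)
    @processed << item
  end
end

job = IterativeJob.new(%w[a b c d e])
cursor = 0
runs = 0
while cursor
  runs += 1
  budget = 2   # pretend the worker shuts down after two items
  cursor = job.perform(cursor: cursor, should_stop: -> { (budget -= 1) <= 0 })
end
puts job.processed.inspect   # ["a", "b", "c", "d", "e"]
puts runs                    # 3
```

Each run handles at most two items, yet every item is processed exactly once across the three runs because the cursor survives the interruption.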

  91. Interruptible job with automatic resuming
    • Shut down workers anytime
    • Disaster prevention
    • Data integrity

  92. Abstracting scaling issues simplifies concrete background jobs

  93. github.com/Shopify/job-iteration

  97. Background jobs
    • Benefit apps of all sizes
    • Require trade-offs
    • Keep code simple at scale

  98. Thanks! Questions?
    @titanoboa42
    https://www.shopify.com/careers