Background jobs at scale (Montreal.rb)

Kerstin Puschke @titanoboa42 Background jobs at scale

Scaling applications using background jobs while keeping code simple

Outline
• Introduction to background jobs
• Features
• Mastering challenges
• Being RESTful
• Background jobs at scale
• Summary
Introduction to background jobs

Background job: work to be done later
[Diagram: App Server, Worker]

Asynchronous communication
[Diagram: App Server, Message Queue (Task Queue), multiple Workers]

Background job backend: task queue & broker

Encapsulating async communication
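The picture on these slides can be sketched in a few lines of plain Ruby. This is only an in-process illustration, assuming a `Queue` as a stand-in for the broker and threads as stand-ins for separate worker processes; the job name `sum_job` and its payload are made up for the example.

```ruby
# A Queue stands in for the task queue / broker,
# threads stand in for separate worker processes.
task_queue = Queue.new
results = Queue.new

workers = 3.times.map do |id|
  Thread.new do
    # Each worker pops jobs until it receives the :stop sentinel.
    while (job = task_queue.pop) != :stop
      results << [id, job[:name], job[:args].sum]
    end
  end
end

# The "app server" only enqueues a description of the work and moves on.
5.times { |i| task_queue << { name: "sum_job", args: [i, i + 1] } }

# One sentinel per worker lets each of them shut down cleanly.
workers.size.times { task_queue << :stop }
workers.each(&:join)

processed = []
processed << results.pop until results.empty?
puts "processed #{processed.size} jobs"
```

The enqueueing side never waits for a result, which is the property the later slides build on.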

Features

Response times
[Diagram: App Server, Task Queue, Worker]

Spikeability
[Diagram: App Server, Task Queue, Worker]

Parallelization
[Diagram: App Server, Task Queue, multiple Workers]

Retries
[Diagram: App Server, Task Queue, multiple Workers]

Prioritization
[Diagram: App Server, High Prio Queue, Low Prio Queue, multiple Workers]
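Retries and prioritization can be illustrated together with a toy worker loop. This is a sketch under stated assumptions: the queue names, the job names, and the `MAX_ATTEMPTS` limit are hypothetical, and real backends implement both features with far more care (backoff, dead job handling).

```ruby
MAX_ATTEMPTS = 3 # hypothetical retry budget

queues = { high: [], low: [] }
log = []
calls = Hash.new(0)

enqueue = ->(queue, name) { queues[queue] << { name: name, queue: queue, attempts: 0 } }

# Always prefer the high priority queue; fall back to the low priority one.
next_job = -> { queues[:high].shift || queues[:low].shift }

# Simulated job body: "charge_card" fails on its first call only.
run = lambda do |name|
  calls[name] += 1
  raise "temporary failure" if name == "charge_card" && calls[name] < 2
  log << name
end

enqueue.call(:low, "send_newsletter")
enqueue.call(:high, "charge_card")
enqueue.call(:low, "rebuild_report")

while (job = next_job.call)
  begin
    run.call(job[:name])
  rescue StandardError
    job[:attempts] += 1
    # Requeue on the job's own queue until the attempt budget is used up.
    queues[job[:queue]] << job if job[:attempts] < MAX_ATTEMPTS
  end
end

puts log.inspect
```

The high priority job is retried and completed before any low priority job starts, even though it was queued second.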

Mastering challenges

No exactly-once delivery
• “At least” vs. “at most” once delivery
• Idempotent jobs & at-least-once delivery
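An idempotent job can be delivered more than once without doing its work twice. The following is a minimal sketch, assuming a hypothetical `ChargeOrderJob` and an in-memory `Set` as the record of processed orders; in a real system that record would need to live in shared storage, for example behind a unique database constraint.

```ruby
require "set"

class ChargeOrderJob
  attr_reader :charges

  def initialize
    @processed = Set.new # stand-in for durable, shared storage
    @charges = []
  end

  def perform(order_id, amount_cents)
    # Set#add? returns nil if the id is already present,
    # so a redelivered job becomes a no-op.
    return :skipped unless @processed.add?(order_id)

    @charges << [order_id, amount_cents]
    :charged
  end
end

job = ChargeOrderJob.new
first = job.perform(42, 1_000)
second = job.perform(42, 1_000) # duplicate delivery of the same job
puts [first, second].inspect
```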

Out of order delivery
• If order matters, queue sequentially
• First job queues follow-up jobs
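Queueing sequentially means the dependent job does not exist in the queue until the first one has finished. A small sketch of that idea, with hypothetical job names `import` and `notify` and an array standing in for the task queue:

```ruby
queue = []
log = []

jobs = {
  "import" => lambda do |file|
    log << "imported #{file}"
    # Only after the import succeeded is the follow-up job queued,
    # so it cannot be delivered ahead of the import.
    queue << ["notify", file]
  end,
  "notify" => ->(file) { log << "notified about #{file}" }
}

queue << ["import", "orders.csv"]

# Toy worker loop.
until queue.empty?
  name, arg = queue.shift
  jobs.fetch(name).call(arg)
end

puts log.inspect
```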

Job queued and processed by different versions
• No breaking changes to job parameters
• Changes need to be backwards compatible until legacy jobs have been processed
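One common way to keep a parameter change backwards compatible is to add new parameters with defaults rather than changing existing ones. The sketch below assumes a hypothetical `WelcomeEmailJob` whose first version took only a `user_id`.

```ruby
# Version 1 of the job was queued as perform(user_id).
# Version 2 adds an optional parameter instead of changing the existing one,
# so payloads queued by version 1 can still be processed.
class WelcomeEmailJob
  def perform(user_id, locale = "en")
    "welcome mail for user #{user_id} in #{locale}"
  end
end

legacy_payload = [7]        # queued before the deploy
current_payload = [7, "fr"] # queued after the deploy

job = WelcomeEmailJob.new
puts job.perform(*legacy_payload)
puts job.perform(*current_payload)
```

Once all legacy jobs have been processed, the default can be removed in a later release if desired.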

Eventual consistency (at best)
• Prepare for inconsistency
• Trade-off: lack of consistency guarantees vs. benefits of background jobs

Non-transactional queuing
• Don’t queue from within a db transaction
• Job may run before the commit, or even though the transaction was rolled back
• Commit before queuing or stage transactionally
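The "commit before queuing" rule can be pictured with a toy transaction that defers enqueueing until after a successful commit. Everything here is an assumption made for illustration: `ToyTransaction` is not a real library class, and the array `queue` stands in for the job backend.

```ruby
# Toy transaction: blocks registered with after_commit run only
# if the transaction body finishes without raising.
class ToyTransaction
  def initialize
    @callbacks = []
  end

  def after_commit(&block)
    @callbacks << block
  end

  def run_callbacks
    @callbacks.each(&:call)
  end

  def self.run
    tx = new
    yield tx
    tx.run_callbacks # "commit" succeeded, now it is safe to enqueue
    true
  rescue StandardError
    false # "rollback": the registered callbacks are dropped
  end
end

queue = []

committed = ToyTransaction.run do |tx|
  # ... write the order row here ...
  tx.after_commit { queue << [:send_receipt, 1] }
end

rolled_back = ToyTransaction.run do |tx|
  tx.after_commit { queue << [:send_receipt, 2] }
  raise "validation failed" # simulated rollback
end

puts queue.inspect
```

Staging transactionally is the alternative named on the slide: the job is written to a table inside the same transaction and moved to the queue afterwards.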

Being RESTful

Don’t lie about resource creation
• 202 Accepted
• Location: temporary resource
• 303 See Other
• Location: does not represent target resource
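The status code flow on this slide can be sketched with Rack-style response triples. This is a hedged illustration only: the paths `/jobs/...` and `/reports/1`, the method names, and the in-memory `jobs` hash are assumptions, not part of the talk.

```ruby
require "securerandom"

# `jobs` is an in-memory stand-in for wherever job state is really stored.
def accept_upload(jobs)
  job_id = SecureRandom.uuid
  jobs[job_id] = { state: "queued", result_path: nil }
  # 202: the work is accepted, but the target resource does not exist yet.
  # Location points at a temporary status resource.
  [202, { "Location" => "/jobs/#{job_id}" }, []]
end

def job_status(jobs, job_id)
  job = jobs.fetch(job_id)
  if job[:state] == "done"
    # 303: the status resource redirects to the finished resource.
    [303, { "Location" => job[:result_path] }, []]
  else
    [200, { "Content-Type" => "text/plain" }, [job[:state]]]
  end
end

jobs = {}
status, headers, _body = accept_upload(jobs)
job_id = headers["Location"].split("/").last

before = job_status(jobs, job_id).first
jobs[job_id].update(state: "done", result_path: "/reports/1") # worker finished
after_status, after_headers, _ = job_status(jobs, job_id)

puts [status, before, after_status, after_headers["Location"]].inspect
```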

Callers can enforce (a)sync behaviour
• Expect header
• 202-accepted
• 200-ok/201-created/204-no-content
• 417 Expectation Failed
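A server honouring these expectations might decide along the following lines. This is a sketch, not a reference implementation: the `Expect` values are the ones listed on the slide (standard HTTP itself only defines `100-continue`), and `negotiate` with its two capability flags is a hypothetical helper.

```ruby
# Returns :sync, :async, or 417 depending on what the caller expects
# and what the server is able to do for this request.
def negotiate(expect_header, can_run_sync:, can_run_async:)
  case expect_header
  when nil
    # No expectation: the server is free to choose.
    can_run_async ? :async : :sync
  when "202-accepted"
    can_run_async ? :async : 417
  when "200-ok", "201-created", "204-no-content"
    can_run_sync ? :sync : 417
  else
    417 # any other expectation is rejected in this sketch
  end
end

puts negotiate("202-accepted", can_run_sync: true, can_run_async: true)
puts negotiate("201-created", can_run_sync: false, can_run_async: true)
```

A caller that must have the result immediately sends one of the synchronous expectations and treats 417 Expectation Failed as "this endpoint can only process the request in the background".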

Background jobs at scale

DelayedJob is easy to get started with
• No additional infrastructure
• ActiveRecord

ActiveJob makes swapping backends easy

DelayedJob issues
• Overhead of relational database
• Workers monitored from outside
• Frequently needs workers to restart

Resque scales
• Redis - no relational db
• Parent-child forking for workers
• Rarely needs workers to restart
• Workers manage their own state

Resque issues
• Child processes
• Memory hungry and slow

Sidekiq scales
• Redis - no relational db
• Threads instead of child processes
• Fast and less memory hungry

Sidekiq issues
• Requires thread-safe code
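What "thread-safe" means in practice: state shared between jobs running on different threads has to be protected. A minimal sketch with a hypothetical `Counter` class, using a `Mutex` from Ruby's core library:

```ruby
class Counter
  attr_reader :value

  def initialize
    @value = 0
    @lock = Mutex.new
  end

  def increment
    # The read-modify-write on @value happens under the lock,
    # so concurrent threads cannot interleave inside it.
    @lock.synchronize { @value += 1 }
  end
end

counter = Counter.new
threads = 10.times.map do
  Thread.new { 1_000.times { counter.increment } }
end
threads.each(&:join)

puts counter.value
```

Avoiding shared mutable state altogether is often simpler than locking it.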

Long running jobs - Resque
• Prevent worker shutdown
• No deployments
• Not cloud-friendly

Long running jobs - Sidekiq
• Aborted and requeued on shutdown
• Job may not finish before being aborted again

github.com/Shopify/job-iteration

Large collections
• Split job into collection and task to be done
• Checkpoint after iteration & requeue

Interruptible job with automatic resuming
• Shutdown workers anytime
• Disaster prevention
• Data integrity
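The idea of splitting a job into a collection and a per-item task, with a checkpoint between iterations, can be sketched in plain Ruby. This is explicitly not the API of the job-iteration gem: `IterationSketch`, the integer cursor, and the `shutdown_requested` callable are assumptions made for illustration.

```ruby
# Plain Ruby illustration; this is NOT the job-iteration gem's API.
class IterationSketch
  def initialize(collection, queue)
    @collection = collection
    @queue = queue # stand-in for the job backend
  end

  # `cursor` is the index of the next item to process.
  def perform(cursor, shutdown_requested:)
    @collection.each_with_index.drop(cursor).each do |item, index|
      if shutdown_requested.call
        @queue << index # checkpoint: requeue with the current cursor
        return :interrupted
      end
      yield item # the task to be done for a single item
    end
    :finished
  end
end

queue = []
done = []
job = IterationSketch.new(%w[a b c d e], queue)

# First run: the worker is asked to shut down after two items.
first = job.perform(0, shutdown_requested: -> { done.size >= 2 }) { |item| done << item }

# Second run: another worker resumes from the checkpointed cursor.
second = job.perform(queue.shift, shutdown_requested: -> { false }) { |item| done << item }

puts [first, second, done].inspect
```

Because the check happens between iterations, no item is left half processed when a worker shuts down.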

Abstracting scaling issues simplifies concrete background jobs

github.com/Shopify/job-iteration

Background jobs
• Benefit apps of all sizes
• Require trade-offs
• Keep code simple at scale

Thanks! Questions? @titanoboa42 https://www.shopify.com/careers
