Async Processing
for Fun and Profit
Mike Perham
@mperham
Slide 2
Slide 2 text
Me
Director of Engineering, TheClymb.com
Slide 3
Slide 3 text
Agenda
• Basics
• Protips
• Sidekiq
Slide 4
Slide 4 text
Why?
• User-Perceived Performance
• I/O is SLOOOOOOOOOOOW
• I/O is unreliable
Slide 5
Slide 5 text
What?
• Optional work
• Anything not required to build the HTTP
response
Slide 6
Slide 6 text
How?
• Client puts message on a queue
• Server pulls message off queue
• Worker executes code based on message
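The three steps above can be sketched in plain Ruby. This is only an in-process illustration: the core `Queue` class stands in for a real broker, and the `HANDLERS` table, the `"greet"` job, and the `:stop` sentinel are hypothetical names invented for the example.

```ruby
QUEUE = Queue.new     # stand-in for a real broker such as Redis (assumption)
RESULTS = Queue.new   # collects output so the example can be inspected

# Hypothetical handler table: message name => code to run.
HANDLERS = {
  "greet" => ->(name) { "Hello, #{name}!" }
}

# Client: put a message on the queue.
QUEUE << { "job" => "greet", "args" => ["world"] }
QUEUE << :stop # sentinel telling the server to shut down

# Server: pull messages off the queue and hand each one to a worker.
server = Thread.new do
  loop do
    message = QUEUE.pop
    break if message == :stop
    # Worker: execute code based on the message.
    RESULTS << HANDLERS.fetch(message["job"]).call(*message["args"])
  end
end

server.join
greeting = RESULTS.pop
```

In a real deployment the client and server are separate processes, so the message has to be serialized on its way through the queue.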
Slide 7
Slide 7 text
How?
async do
# perform some work
end
Slide 8
Slide 8 text
How?
• Marshalling a Proc
• Need to serialize closure
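A short check of why the block form is hard to ship across processes: Ruby's `Marshal` refuses to dump a `Proc`, because the closure carries its surrounding bindings. The variable names here are illustrative only.

```ruby
counter = 41
work = proc { counter + 1 } # closes over the local variable `counter`

begin
  Marshal.dump(work)
  marshal_error = nil
rescue TypeError => e
  # Ruby cannot serialize the closure and its captured bindings.
  marshal_error = e
end
```

The proc still runs fine locally; it just cannot be turned into bytes for a queue.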
Slide 9
Slide 9 text
How?
async(instance, :method, args)
Slide 10
Slide 10 text
How?
• Marshal instance
• Marshal args
Slide 11
Slide 11 text
How?
async(Class, :method, args)
Slide 12
Slide 12 text
How?
• Serialize just the class name
• Marshal the arguments
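A minimal sketch of this last form, assuming JSON as the wire format rather than `Marshal`. The `async` and `perform` helpers and the `ImageSync` class are hypothetical and are not the actual Resque or Sidekiq API.

```ruby
require "json"

# Hypothetical job class; any class with a class method would do.
class ImageSync
  def self.sync(user_id, size)
    "synced user #{user_id} at #{size}"
  end
end

# Client side: serialize only the class name, method name, and arguments.
def async(klass, method, *args)
  JSON.generate("class" => klass.name, "method" => method.to_s, "args" => args)
end

# Worker side: look the class up by name and invoke the method.
def perform(payload)
  message = JSON.parse(payload)
  Object.const_get(message["class"]).public_send(message["method"], *message["args"])
end

payload = async(ImageSync, :sync, 123, "large")
outcome = perform(payload)
```

Because only a class name travels over the wire, the worker process needs the same code loaded but no shared object state.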
Slide 13
Slide 13 text
Congratulations!
This is exactly how Resque and Sidekiq work.
Slide 14
Slide 14 text
How?
Slide 15
Slide 15 text
How?
Slide 16
Slide 16 text
How?
Slide 17
Slide 17 text
Tip #1
Small, Stateless Messages
Slide 18
Slide 18 text
Stateless
• Database holds objects (nouns)
• Queue holds actions (verbs)
• “Perform X on Object 123”
Slide 19
Slide 19 text
Avoid State
• Bad
• @user.delay.sync_images
• Good
• User.delay.sync_images(@user.id)
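To make the difference concrete, here is a rough comparison of what each style would put on the queue. The `User` struct and its fields are hypothetical stand-ins for a database-backed model, and the message layout is an assumption, not the exact format any library uses.

```ruby
require "json"

# Hypothetical model with a large attribute.
User = Struct.new(:id, :email, :bio)
user = User.new(123, "mike@example.com", "x" * 10_000)

# Bad: a snapshot of the whole object travels through the queue and may be
# stale by the time a worker picks it up.
stateful = Marshal.dump(user)

# Good: only the identifier travels; the worker reloads fresh data by id.
stateless = JSON.generate("class" => "User", "method" => "sync_images", "args" => [user.id])
```

The stateless message stays a few dozen bytes no matter how large the record grows.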
Slide 20
Slide 20 text
Simple Types
• Small & Easy to read
• Cross-platform
• Sidekiq / Resque use JSON
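One practical consequence of a JSON wire format, shown with Ruby's standard `json` library: anything that is not a simple JSON type changes shape on the way through. The argument values are made up for illustration.

```ruby
require "json"

# Arguments as the client sees them: a symbol-keyed hash with a symbol value.
args = [123, { size: :large }]

# What the worker sees after a JSON round trip through the queue.
received = JSON.parse(JSON.generate(args))
same = (received == args)
```

Here `received` comes back with string keys and string values, so jobs are best written to expect strings, numbers, arrays, and hashes only.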
Slide 21
Slide 21 text
Debugging
Slide 22
Slide 22 text
Tip #2
Idempotent, Transactional
Units of Work
Slide 23
Slide 23 text
Idempotent
• Fancy computer science term
• “Work can be applied multiple times without
changing the result beyond the initial
application”
Slide 24
Slide 24 text
Idempotent
• Canceling an order
• Updating user’s email address
Slide 25
Slide 25 text
Not Idempotent!
• Charging credit card
• Sending email
Slide 26
Slide 26 text
Idempotent
• Your code has bugs!
• Sidekiq will retry jobs that raise
• Design jobs to be retried
• e.g. verify you need to perform action
before performing it
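A minimal sketch of the "verify before performing" idea, using an in-memory struct in place of a database row. `Order`, `CancelOrder`, and the `refunds_issued` counter are hypothetical names for illustration.

```ruby
# Hypothetical in-memory order; a real job would load it from the database.
Order = Struct.new(:id, :status, :refunds_issued)

class CancelOrder
  def self.perform(order)
    # Verify the action is still needed; on a retry this makes the job a no-op.
    return if order.status == :canceled

    order.refunds_issued += 1
    order.status = :canceled
  end
end

order = Order.new(1, :open, 0)
3.times { CancelOrder.perform(order) } # simulate the job being retried
```

In a real system the refund and the status change would also need to commit together, otherwise a failure between them defeats the guard.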
Slide 27
Slide 27 text
Transactional
• Infinite credit?
Slide 28
Slide 28 text
Tip #3
Embrace Concurrency
Slide 29
Slide 29 text
Concurrency
• Resque/DJ = 4 workers
• Sidekiq = 100 workers
• Mike, what is best in life?
Slide 30
Slide 30 text
“To crush their
servers, see them
smoking before you
and hear the
lamentations of
their admins.”
Slide 31
Slide 31 text
Concurrency
• Use connection_pool gem to limit client
connections
• Split work into small batches
• 100 items => 10 jobs of 10 items
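The batching arithmetic can be sketched with `Enumerable#each_slice`. The `ProcessItems` job name and the message layout are hypothetical.

```ruby
item_ids = (1..100).to_a

# Split 100 items into 10 messages of 10 ids each, so many workers can
# process them concurrently instead of one worker handling all 100.
jobs = item_ids.each_slice(10).map do |batch|
  { "class" => "ProcessItems", "args" => [batch] }
end
```

Smaller batches also mean a failed job only retries its own 10 items.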
Slide 32
Slide 32 text
Concurrency
• Thread safety rarely an issue
• Most gem maintainers very responsive
• Recently fixed:
• cocaine, typhoeus
Slide 33
Slide 33 text
Theory, meet Practice
Slide 34
Slide 34 text
Sidekiq
• Simple, efficient message processing
• Like Resque, but 10x faster
Slide 35
Slide 35 text
MODERN RUBY IS NOT SLOW
SINGLE THREADING IS SLOW
Slide 36
Slide 36 text
Concurrency
• To scale single-threaded, create lots of
processes.
• HORRIBLY RAM INEFFICIENT
Quick Rant
• GIL + poor GC
• C extension API must DIE
• What’s larger: 50% or 800%?
Slide 39
Slide 39 text
Client
[Diagram: inside the Rails process, Your App calls the Sidekiq Client API, which passes through Client Middleware before reaching Redis]
Slide 40
Slide 40 text
Server
[Diagram: a Sidekiq process containing a Fetcher, a Manager, and four Processors, each with its own Server Middleware and Worker; the process is connected to Redis]
Slide 41
Slide 41 text
Versions
• Sidekiq - Free, LGPLv3
• Sidekiq Pro - more features, support, $
• motivation!
Slide 42
Slide 42 text
Features
              Sidekiq      Resque      DJ
Concurrency   Threads      Processes   Processes
Store         Redis        Redis       DB
Hooks         middleware   callbacks   callbacks
Web UI        ✓            ✓           ?
Scheduler     ✓            ?           ✓
Retry         ✓            ?           ✓
Delay         ✓            ?           ✓
Batches       Pro          ?           -
? = optional
Slide 43
Slide 43 text
Future
• Nicer, more functional Web UI
• APIs for managing queues / retries
• Rails 4 Queue API
Slide 44
Slide 44 text
Pro Future
• Enterprise-y features
• Workflow
• Notifications