Hacking Sidekiq for Fun (and Profit) - RubyConf Australia 2014

Get an introduction to the history of background processing in Ruby, what Sidekiq is, and how you can use Sidekiq to do some excellent things. Presented at RubyConf Australia 2014 in Sydney on the 21st of February, 2014.

Darcy Laycock

February 21, 2014

Transcript

  1. Hello!

  2. I'm Darcy. @sutto

  3. If I talk too fast, yell out

  4. Hacking Sidekiq For Fun!

  5. (and Profit)

  6. Sidekiq.explain!

  7. Background Processing

  8. Complex logic with many side effects

  9. Slow logic

  10. Decouple from Request / Response Lifecycle

  11. e.g. hit an external API to update expired data on

    every request
  12. Background Jobs in Ruby

  13. backgroundrb

  14. delayed job

  15. resque

  16. sidekiq

  17. Mike Perham

  18. None
  19. Enter Sidekiq

  20. Threaded

  21. Actor based concurrency

  22. None
  23. Even in MRI, great for network-heavy logic

  24. None
  25. Redis for job & metadata storage

  26. … using the same Redis layout as Resque.

  27. Batteries-included

  28. Built in exception reporting support

  29. None
  30. Job retries on failures

  31. Scheduled Jobs!

  32. Built in support for many queues

  33. Built to be extensible

  34. … we'll talk about this shortly

  35. Sidekiq Pro

  36. Not free, but adds powerful features such as batching, notifications, metrics and reliable workers.
  37. Scales from a small number of jobs to many

  38. None
  39. None
  40. But how?

  41. class HelloPersonWorker

        include Sidekiq::Worker
        sidekiq_options queue: "onboarding"

        def perform(name)
          Rails.logger.info "Saying hello to #{name}"
        end

      end
  42. Bonus: Testing is simple - instantiate and test worker class.

  43. HelloPersonWorker.perform_async "Darcy"
      # => "2dbfbc3f28d26be12107db84"

  44. Sidekiq::Client.push({
        "class" => HelloPersonWorker,
        "args" => ["Darcy"]
      })

  45. {
        "class": "HelloPersonWorker",
        "args": ["Darcy"],
        "retry": true,
        "queue": "onboarding",
        "jid": "2dbfbc3f28d26be12107db84",
        "enqueued_at": 1392701981.091299
      }
  46. sadd queues onboarding
      lpush queue:onboarding [json encoded job]

  47. Scheduled job?

  48. Sidekiq::Client.push({
        "class" => HelloPersonWorker,
        "args" => ["Darcy"],
        "at" => Time.now.to_f
      })
  49. zadd schedule timestamp_1 encoded_json_blob

  50. Remembering JSON includes Queue Name
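
The enqueue path on slides 44 to 50 can be sketched without a Redis server. This is a minimal illustration, not Sidekiq's own code: `FakeRedis` and `push_job` are hypothetical names, and the in-memory hashes only mimic the `sadd` / `lpush` / `zadd` layout shown above.

```ruby
require "json"
require "securerandom"

# In-memory stand-in for the Redis structures named on the slides
# (assumption: real Sidekiq talks to Redis; this only mimics the layout).
class FakeRedis
  attr_reader :sets, :lists, :sorted_sets

  def initialize
    @sets = Hash.new { |h, k| h[k] = [] }
    @lists = Hash.new { |h, k| h[k] = [] }
    @sorted_sets = Hash.new { |h, k| h[k] = [] }
  end

  def sadd(key, member)
    @sets[key] << member unless @sets[key].include?(member)
  end

  def lpush(key, value)
    @lists[key].unshift(value)
  end

  def zadd(key, score, member)
    @sorted_sets[key] << [score, member]
    @sorted_sets[key].sort_by!(&:first)
  end
end

# Hypothetical helper: fills in defaults, then stores the JSON payload
# either on the queue list or in the "schedule" sorted set.
def push_job(redis, item)
  payload = {
    "retry" => true,
    "queue" => "default",
    "jid"   => SecureRandom.hex(12)
  }.merge(item)

  if payload.key?("at")
    # Scheduled: the score is the run-at timestamp; the JSON still names the queue.
    redis.zadd("schedule", payload["at"], JSON.generate(payload))
  else
    payload["enqueued_at"] = Time.now.to_f
    redis.sadd("queues", payload["queue"])
    redis.lpush("queue:#{payload['queue']}", JSON.generate(payload))
  end
  payload["jid"]
end
```

Because the queue name travels inside the JSON, whatever later moves a due job out of `schedule` can work out the target list from the payload alone.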

  51. How do we run jobs?

  52. Sidekiq::Manager

  53. Sidekiq::Fetch

  54. … or, more accurately, its strategy.

  55. class BasicFetch
        def initialize(options)
          @strictly_ordered_queues = !!options[:strict]
          @queues = options[:queues].map { |q| "queue:#{q}" }
          @unique_queues = @queues.uniq
        end

        def retrieve_work
          work = Sidekiq.redis { |conn| conn.brpop(*queues_cmd) }
          UnitOfWork.new(*work) if work
        end

        # ... more goes here ...
      end
  56. brpop semantics control how / what we fetch from

  57. Sidekiq::Fetch::UnitOfWork

  58. UnitOfWork = Struct.new(:queue, :message) do
        def acknowledge
          # nothing to do
        end

        def queue_name
          queue.gsub(/.*queue:/, '')
        end

        def requeue
          Sidekiq.redis do |conn|
            conn.rpush("queue:#{queue_name}", message)
          end
        end
      end
  59. Job is fetched, run through middleware and “invoked”.

  60. More around that, but that's the core.
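
To make "run through middleware and invoked" concrete, here is a minimal sketch of a middleware chain in plain Ruby. `MiniChain`, `TimingMiddleware`, `GreetingWorker` and `process` are hypothetical names for illustration; the `call(worker, item, queue)` signature follows the middleware shown in this talk, but the chain itself is a simplified stand-in, not Sidekiq's implementation.

```ruby
# Records what happened, so the ordering is easy to inspect.
EVENTS = []

# Minimal middleware chain: each entry wraps the next and must yield to continue.
class MiniChain
  def initialize
    @entries = []
  end

  def add(middleware)
    @entries << middleware
    self
  end

  # Runs every middleware around the final block, outermost first.
  def invoke(worker, item, queue, &final)
    chain = @entries.dup
    traverse = lambda do
      if chain.empty?
        final.call
      else
        chain.shift.call(worker, item, queue, &traverse)
      end
    end
    traverse.call
  end
end

class TimingMiddleware
  def call(worker, item, queue)
    EVENTS << "start #{item['jid']}"
    yield
    EVENTS << "done #{item['jid']}"
  end
end

class GreetingWorker
  def perform(name)
    EVENTS << "hello #{name}"
  end
end

# Instantiate the worker class named in the job and call perform inside the chain.
def process(chain, item)
  worker = Object.const_get(item["class"]).new
  chain.invoke(worker, item, item["queue"]) { worker.perform(*item["args"]) }
end
```

A middleware that never yields stops the job from running, which is what makes the unique-jobs trick later in the talk possible.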

  61. Making the most of Sidekiq

  62. Trick #1: Unique Jobs

  63. Middleware!

  64. module Sidekiq
        module Middleware
          module Server
            class Logging

              def call(worker, item, queue)
                Sidekiq::Logging.with_context("#{worker.class.to_s} JID-#{item['jid']}") do
                  begin
                    start = Time.now
                    logger.info { "start" }
                    yield
                    logger.info { "done: #{elapsed(start)} sec" }
                  rescue Exception
                    logger.info { "fail: #{elapsed(start)} sec" }
                    raise
                  end
                end
              end

              def elapsed(start)
                (Time.now - start).to_f.round(3)
              end

              def logger
                Sidekiq.logger
              end
            end
          end
        end
      end
  65. def call(worker, item, queue)
        Sidekiq::Logging.with_context("#{worker.class.to_s} JID-#{item['jid']}") do
          begin
            start = Time.now
            logger.info { "start" }
            yield
            logger.info { "done: #{elapsed(start)} sec" }
          rescue Exception
            logger.info { "fail: #{elapsed(start)} sec" }
            raise
          end
        end
      end
  66. We want to queue a job, but only if it's

    not currently queued.
  67. E.g. Loading external data from an API.

  68. $ gem install sidekiq-middleware

  69. Store job uniqueness somehow?

  70. Client AND Server middleware

  71. Pushing AND Pulling Jobs

  72. module Sidekiq
        module Middleware
          module Client
            class UniqueJobs
              def call(worker_class, item, queue)
                worker_class = worker_class.constantize if worker_class.is_a?(String)
                enabled = Sidekiq::Middleware::Helpers.unique_enabled?(worker_class, item)

                if enabled
                  expiration = Sidekiq::Middleware::Helpers.unique_exiration(worker_class)
                  job_id = item['jid']
                  unique = false

                  # Scheduled
                  if item.has_key?('at')
                    # Use expiration period as specified in configuration,
                    # but relative to job schedule time
                    expiration += (item['at'].to_i - Time.now.to_i)
                  end

                  unique_key = Sidekiq::Middleware::Helpers.unique_digest(worker_class, item)

                  Sidekiq.redis do |conn|
                    conn.watch(unique_key)

                    locked_job_id = conn.get(unique_key)
                    if locked_job_id && locked_job_id != job_id
                      conn.unwatch
                    else
                      unique = conn.multi do
                        conn.setex(unique_key, expiration, job_id)
                      end
                    end
                  end

                  yield if unique
                else
                  yield
                end
              end
            end
          end
        end
      end
  73. unique_key = Sidekiq::Middleware::Helpers.unique_digest(worker_class, item)

      Sidekiq.redis do |conn|
        conn.watch(unique_key)

        locked_job_id = conn.get(unique_key)
        if locked_job_id && locked_job_id != job_id
          conn.unwatch
        else
          unique = conn.multi do
            conn.setex(unique_key, expiration, job_id)
          end
        end
      end

      yield if unique
  74. 1. Generate a key from the job structure
      2. Check if locked
      3. Set lock (with expiry) if not locked
  75. … with some redis magic

  76. Hash the JSON of the job, check existence of hash

    in Redis
  77. None
  78. module Sidekiq
        module Middleware
          module Server
            class UniqueJobs
              def call(worker_instance, item, queue)
                worker_class = worker_instance.class
                enabled = Sidekiq::Middleware::Helpers.unique_enabled?(worker_class, item)

                if enabled
                  begin
                    yield
                  ensure
                    unless Sidekiq::Middleware::Helpers.unique_manual?(worker_class)
                      clear(worker_class, item)
                    end
                  end
                else
                  yield
                end
              end

              def clear(worker_class, item)
                Sidekiq.redis do |conn|
                  conn.del Sidekiq::Middleware::Helpers.unique_digest(worker_class, item)
                end
              end
            end
          end
        end
      end
  79. begin
        yield
      ensure
        unless Sidekiq::Middleware::Helpers.unique_manual?(worker_class)
          clear(worker_class, item)
        end
      end

  80. 1. Run the work
      2. Unless user has specified they'll clear the lock, clear the lock
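
The client and server halves of the unique-jobs trick can be sketched together in plain Ruby. This is an illustration under stated assumptions, not the gem's code: `LockStore`, `unique_key_for`, `enqueue_unique` and `perform_unique` are hypothetical names, the lock table lives in memory rather than Redis, and hashing class, queue and args is an assumed choice of what makes two jobs "the same".

```ruby
require "digest"
require "json"

# In-memory lock table standing in for Redis GET / SETEX / DEL (assumption:
# the slides use WATCH and MULTI on Redis to make the check-and-set atomic).
class LockStore
  def initialize
    @locks = {}
    @mutex = Mutex.new
  end

  # Takes the lock if it is absent, expired, or already held by this owner.
  def acquire(key, ttl, owner, now: Time.now.to_f)
    @mutex.synchronize do
      current = @locks[key]
      if current && current[:expires_at] > now && current[:owner] != owner
        false
      else
        @locks[key] = { owner: owner, expires_at: now + ttl }
        true
      end
    end
  end

  def release(key)
    @mutex.synchronize { @locks.delete(key) }
  end
end

# Step 1: generate a key from the job structure.
def unique_key_for(item)
  Digest::MD5.hexdigest(JSON.generate([item["class"], item["queue"], item["args"]]))
end

# Client side (steps 2 and 3): only yield, i.e. enqueue, when the lock was taken.
def enqueue_unique(store, item, ttl: 60)
  return false unless store.acquire(unique_key_for(item), ttl, item["jid"])
  yield
  true
end

# Server side: run the work, then always clear the lock, even on failure.
def perform_unique(store, item)
  yield
ensure
  store.release(unique_key_for(item))
end
```

The expiry matters because a worker that dies before reaching its `ensure` would otherwise leave the lock in place forever.
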
  81. Trick #2: Adjusting Running Instances

  82. Transitions into more powerful territory

  83. How can we prioritise and schedule work?

  84. Control number of workers dedicated to a task

  85. Temporarily pause a queue

  86. Make queues block others

  87. Dynamically reorganise workers without process changes

  88. $ gem install sidekiq-limit_fetch

  89. Store queue / system metadata in Redis (separate from jobs)

  90. Act on settings stored in Redis to control the system

    state.
  91. None
  92. None
  93. A smart locking algorithm for jobs in Lua

  94. … running on your application's Redis server

  95. None
  96. Use the lua script to acquire a job instead of

    brpop
  97. Custom fetch strategy, a unit of work (to update state)

    & way to manage metadata.
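
The three pieces named above (a custom fetch strategy, a unit of work that updates state, and a way to manage metadata) can be modelled in memory. This sketch is not the algorithm used by sidekiq-limit_fetch, which the talk says runs as a Lua script inside Redis; `LimitedFetch` and its methods are hypothetical names chosen for illustration.

```ruby
# Hypothetical in-memory model: queue metadata lives apart from the jobs.
class LimitedFetch
  Work = Struct.new(:queue, :message, :fetcher) do
    # Called when the job finishes so the worker slot is freed.
    def acknowledge
      fetcher.release(queue)
    end
  end

  def initialize(queues)
    @jobs = queues.to_h { |q| [q, []] }
    @meta = queues.to_h { |q| [q, { paused: false, limit: nil, busy: 0 }] }
  end

  def push(queue, message)
    @jobs.fetch(queue) << message
  end

  def pause(queue)
    @meta.fetch(queue)[:paused] = true
  end

  def unpause(queue)
    @meta.fetch(queue)[:paused] = false
  end

  # Caps how many jobs from this queue may run at once.
  def limit(queue, max)
    @meta.fetch(queue)[:limit] = max
  end

  def release(queue)
    @meta.fetch(queue)[:busy] -= 1
  end

  # Returns the next job from the first queue that is allowed to run one.
  def retrieve_work
    @jobs.each do |queue, messages|
      meta = @meta.fetch(queue)
      next if messages.empty? || meta[:paused]
      next if meta[:limit] && meta[:busy] >= meta[:limit]

      meta[:busy] += 1
      return Work.new(queue, messages.shift, self)
    end
    nil
  end
end
```

Because the settings are data rather than process flags, changing a limit or pausing a queue takes effect on the next fetch without restarting anything.
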
  98. Trick #3: Replacing the queue

  99. … if we can replace the fetcher, what about how

    we store data?
  100. None
  101. None
  102. None
  103. $ gem install sidekiq-sqs

  104. Monkey patches Sidekiq::Client.push to override adding a job

  105. … but reasonably cleanly switches the fetch strategy

  106. … but avoid this.

  107. Final Trick: Spot Instances and AutoScaling

  108. Spot instances are, on average, much, much cheaper

  109. … but are transient and not guaranteed

  110. Normal Price for a c1.medium: $0.145 per Hour

  111. Spot Price for a c1.medium: Average $0.018 per Hour, Peak

    at $5.00 an hour.
  112. None
  113. Perfect for processing big amounts of data / backed up

    queues
  114. “I always want at least 5 worker servers, but I

    can happily jump to 100 if need be”
  115. … but do it without human intervention

  116. Sidekiq

  117. Sidekiq
       CloudWatch (Metrics): Publish Sidekiq Queue Sizes

  118. Sidekiq
       CloudWatch (Metrics): Publish Sidekiq Queue Sizes
       AutoScaling: Scale Up / Down based on # of waiting jobs

  119. Sidekiq
       CloudWatch (Metrics): Publish Sidekiq Queue Sizes
       AutoScaling: Scale Up / Down based on # of waiting jobs
       EC2 Instances: Launch Instances w/ Queue or Groups specified via User Data

  120. Sidekiq
       CloudWatch (Metrics): Publish Sidekiq Queue Sizes
       AutoScaling: Scale Up / Down based on # of waiting jobs
       EC2 Instances: Launch Instances w/ Queue or Groups specified via User Data
       Read Queues / Group from the config, start processing jobs
  121. System responds to load and instigates measures to help solve

    it
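
The scaling rule behind "at least 5 worker servers, but happily 100" comes down to clamped arithmetic on the published queue size. In the setup described above the AutoScaling policy makes this decision, so the sketch below only illustrates the calculation; `JOBS_PER_INSTANCE` is an assumed tuning value and `desired_instances` is a hypothetical helper.

```ruby
# Assumed tuning value (hypothetical): backlog one instance should absorb.
JOBS_PER_INSTANCE = 1_000
MIN_INSTANCES = 5    # "I always want at least 5 worker servers"
MAX_INSTANCES = 100  # "I can happily jump to 100 if need be"

# Maps the number of waiting jobs to a fleet size within the allowed range.
def desired_instances(waiting_jobs)
  needed = (waiting_jobs.to_f / JOBS_PER_INSTANCE).ceil
  needed.clamp(MIN_INSTANCES, MAX_INSTANCES)
end
```

With these assumed numbers, an empty queue keeps the floor of 5 instances, and a backlog of a million jobs is capped at 100.
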
  122. None
  123. None
  124. I mentioned queue groups, but they're not a Sidekiq feature.

  125. sidekiq.yml config is run through ERB first.

  126. Write logic based on ENV

  127. ---
       queues:
       <% if ENV['SIDEKIQ_GROUP'] == 'onboarding' %>
         - [hello, 10]
         - [world, 5]
       <% else %>
         - [hello, 2]
         - [world, 2]
         - [default, 1]
       <% end %>
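
The ERB-then-YAML step can be tried with nothing but the standard library. This sketch renders the template from the slide for a given group; `CONFIG_TEMPLATE` and `queues_for` are hypothetical names, and the template is inlined here rather than read from a `sidekiq.yml` file.

```ruby
require "erb"
require "yaml"

# The config from the slide, inlined so the example is self-contained.
CONFIG_TEMPLATE = <<~YAML
  ---
  queues:
  <% if ENV['SIDEKIQ_GROUP'] == 'onboarding' %>
    - [hello, 10]
    - [world, 5]
  <% else %>
    - [hello, 2]
    - [world, 2]
    - [default, 1]
  <% end %>
YAML

# Renders the template with SIDEKIQ_GROUP set, then parses the resulting YAML.
def queues_for(group)
  previous = ENV["SIDEKIQ_GROUP"]
  ENV["SIDEKIQ_GROUP"] = group
  YAML.safe_load(ERB.new(CONFIG_TEMPLATE).result)["queues"]
ensure
  # Restore the environment so callers are not affected.
  ENV["SIDEKIQ_GROUP"] = previous
end
```

On a spot instance, the group would come from the instance's User Data, exported as `SIDEKIQ_GROUP` before Sidekiq starts.
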
  128. Want to move to controllable queues per instance / group

  129. Opportunity for custom extensions?

  130. On spot, expect machines to just be turned off at

    any time
  131. Jobs must be idempotent

  132. … also, you must design them to fail gracefully

  133. It shouldn't matter if it runs 1 time or 1000

  134. Harder than it sounds
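
One way to get "1 time or 1000" behaviour is to write absolute state keyed by an identifier instead of applying a change on top of whatever is there. A minimal sketch, with hypothetical helper names and a plain Hash standing in for the datastore:

```ruby
# Idempotent: writes the absolute value keyed by record id, so replays converge.
def import_record(store, record)
  store[record.fetch("id")] = record.fetch("total")
end

# Not idempotent: every replay adds the total again.
def import_record_naive(store, record)
  id = record.fetch("id")
  store[id] = store.fetch(id, 0) + record.fetch("total")
end
```

If a spot instance disappears halfway through and the job is retried, the first version ends in the same state, while the second double counts.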

  135. Other tips

  136. Tooling is sparse(r)

  137. Metrics around performance specifically

  138. … but you can tail log files and collate information

  139. None
  140. … just from tailing files

  141. Data imports start approaching Hadoop-level complexity.

  142. Summary

  143. Understand how Sidekiq is designed / structured

  144. Know Redis & store job metadata there

  145. You can do things in your Redis instance using Lua

  146. But don't limit yourself to what's there

  147. Bend Queues to your will.

  148. A huge amount of existing Sidekiq middleware

  149. And libraries for other languages

  150. Also, consider when you need to move to a proper

    MQ.
  151. What's coming in Sidekiq?

  152. Sidekiq 3.0!

  153. Main Feature: Dead Job Queue. For when a job fails all retries.
  154. Sidekiq Pro 2.0!

  155. Main Feature: Nested Job Batches. For complex workflows

  156. e.g. multi stage data imports.

  157. Questions?