Andrew Markle
July 14, 2025

From Resque to SolidQueue - Rethinking our background jobs for modern times

If your Rails app has been around for a while, you might still be using Resque for background jobs. We were too—until scaling issues, missing features, and increasing maintenance costs made it clear that Resque was no longer working for us.

This year we migrated to SolidQueue, Rails’ new default job runner, and haven't looked back. This talk will walk you through how we did it: what worked, what didn’t, and what we learned along the way.

Key takeaways:
• Why we left Resque

• How we migrated with minimal disruption using a parallel rollout

• Why we went to the effort of renaming all our queues so that they are SLO-based (for example, within_1_minute), and why this matters

• Lessons learned, pitfalls to avoid, and how SolidQueue made our jobs (and jobs!) easier

If your background jobs are from a previous era, this talk will give you a practical, real-world migration playbook to modernize with SolidQueue—without breaking anything.

Transcript

  1. self.introduce
    • Staff engineer at Fullscript
    • Work on our platform team
    • Work to shepherd and maintain the long-term health of our monolith
    • First Rails version: 4.0
  2. 2009: Resque • 2011: Fullscript monolith started • 2012: Sidekiq • 2014: ActiveJob • 2020: GoodJob • 2023: SolidQueue • 2024: Rails 8
  3. Problems: Processes that took a nap
    Resque::PruneDeadWorkerDirtyExit (×10)
  4. Alternatives: Lots of great modern options
    • ✅ Sidekiq gets better every year
    • ❌ GoodJob is a great PostgreSQL ActiveJob backend but does not support MySQL
    • ✅ SolidQueue is the new default queue adapter for Rails
  5. Why SolidQueue?
    ✅ Backed by a SQL database instead of Redis
    ✅ It supports MySQL
    ✅ It's open source
    ✅ It's the default in Rails and it's maintained by Rails
  6. Plan of action
    • Use ActiveJob
    • Set up SolidQueue and database
    • Set up queues
    • Set up workers
    • Set up metrics and monitoring
    • Migrate with minimal / no downtime / no jobs lost
    • Optimize for performance / cleanup
    • Remove Resque
  7. Use ActiveJob: everything needs to be enqueued with ActiveJob
    ❌ Resque.enqueue(MyJob, params)
    ✅ class MyJob < ApplicationJob
    ✅ MyJob.perform_later(params)
    ✅ MyJob.perform_now(params)
  8. Database setup: separate DB or same DB?
    ✅ Use the same database engine as your primary DB
      • For us this was MySQL
      • This lets you set up solid_queue on that same MySQL instance, using a separate schema, for environments where scale is not a concern
      • This saves cost and can be a simpler setup for local and staging environments
    ✅ Separate DB is a question of scale (but recommended)
      • We process almost 3 million jobs per day
      • A separate DB was a must for production
  9. What about transaction integrity?
    ❌ No transaction integrity with Resque
    ❌ No transaction integrity using a separate queue DB
    ❌ Transaction integrity on the same DB as primary?
    ✅ Transaction integrity only with enqueue_after_transaction_commit (Rails 8)
        class MyJob < ApplicationJob
          self.enqueue_after_transaction_commit = true
        end
  10. Transactional integrity
        class MyJob < ApplicationJob
          self.enqueue_after_transaction_commit = true
        end

        ActiveRecord::Base.transaction do
          user = User.create!(email: email)
          MyJob.perform_later(user)
        end
  11. Database setup
    ✅ Same DB engine as our primary (MySQL)
    ✅ Separate DB for production
    ❌ No transactional integrity. Same as Resque.
  12. Queue problems
    api_webhooks, avatax, batch_payouts, critical, default, elastic_indexing, heavy, high, iterable_sync, low, mailer, mailers, manual_only, medium, notifications, quarantine, shipments, stripe, stripe_fee_refunds, stripe_webhook_events, tax_form_generation, throttled
  13. Queue problems: Mix of priority-based and domain-based queues (same 22 queues as slide 12)
  16. Old Queues (same 22 queues as slide 12)
  17. Queue latency: the most important metric
    Time elapsed, in seconds, between when the oldest job in the queue was enqueued and now.
  18.
        def self.queue_as(queue_name = nil, &block)
          if queue_name.to_s.match?(/^within_/)
            raise "Unsupported queue name: #{queue_name}" \
              unless SUPPORTED_QUEUES[queue_name]
            self.queue_adapter = :solid_queue
          end
          super
        end
  19. Tip: Have a backup so you don't lose jobs
    If the job fails to enqueue in SolidQueue, send it back to Resque
        # in application_job.rb
        around_enqueue do |job, block|
          block.call
        rescue SolidQueue::Job::EnqueueError => error
          Rails.error.report(error)
          job.queue_adapter = :resque
          job.set(queue: "a_resque_queue").perform_later(job.arguments)
        end
  20. Problem: Too many args
    When TEXT is (way) too small
        create_table "solid_queue_jobs", charset: "utf8mb4", collation: "utf8mb4_0900_ai_ci", force: :cascade do |t|
          t.string "queue_name", null: false
          t.string "class_name", null: false
          t.text "arguments"
          t.integer "priority", default: 0, null: false
          t.string "active_job_id"
          t.datetime "scheduled_at"
          t.datetime "finished_at"
          t.string "concurrency_key"
          t.datetime "created_at", null: false
          t.datetime "updated_at", null: false
          t.index ["active_job_id"], name: "index_solid_queue_jobs_on_active_job_id"
          t.index ["class_name"], name: "index_solid_queue_jobs_on_class_name"
  21. Solution: Make args larger
    Run a migration to increase the storage size of arguments
        execute <<-SQL
          ALTER TABLE solid_queue_jobs MODIFY arguments MEDIUMTEXT
        SQL
  22. Problem: Workers starting to get killed
    Random workers dying...
    SolidQueue::Processes::ProcessExitError: Process pid=<num> exited unexpectedly. Received unhandled signal 9.
  24. Why: Worker running out of memory
    Memory usage spiked to over 3.5 GB against a limit of 2 GB
  25. Problem: Too many connections
    We ran out of database connections 😭
    ActiveRecord::ConnectionNotEstablished: Too many connections (SolidQueue::Job::EnqueueError) (×4)
  26. Why: We did it to ourselves
    We autoscaled our way out of database connections 😭
    ActiveRecord::ConnectionNotEstablished: Too many connections (SolidQueue::Job::EnqueueError)
  27. Problem: Slow jobs
    How do you keep the fast queues fast?
    within_1_minute 🐌 🐌 🐌
  28. Solution: Set an SLO for job runtime
    Keep jobs accountable to run fast (especially for the fast queues). Our maximum job runtime was 1/10 of the queue's time window.
        Queue Name           Maximum Time
        within_1_minute      6 seconds
        within_5_minutes     30 seconds
        within_10_minutes    1 minute
        within_1_hour        6 minutes
        within_4_hours       24 minutes
        within_1_day         2.4 hours
  29.
        around_perform :alert_slow_job_for_queue

        def alert_slow_job_for_queue
          start_time = Time.current
          yield
          job_duration = Time.current - start_time
          too_slow = (job_duration > slowest_acceptable_time_for(queue_name))
          if SUPPORTED_QUEUES[queue_name.to_sym] && too_slow
            Rails.error.report(SlowJobError.new(job_duration))
          end
        end

        def slowest_acceptable_time_for(queue_name)
          SUPPORTED_QUEUES[queue_name.to_sym] / 10
        end
  30. Insight: 💡 Redis != MySQL
    What works fine in Redis can kill your DB
        Model.find_each do |record|
          MyJob.perform_later(record)
        end
  31. Insight: 💡 Use perform_all_later
    Batch multiple jobs into a single SQL insert
        Model.find_in_batches(batch_size: 500) do |batch|
          jobs = batch.map { |record| MyJob.new(record) }
          ActiveJob.perform_all_later(jobs)
        end
  32. Insight: 💡 Use perform_all_later
    Batch multiple jobs into a single SQL insert
        Model.find_in_batches(batch_size: 500) do |batch|
          jobs = batch.map { |record| MyJob.set(wait: rand(30.minutes)).new(record) }
          ActiveJob.perform_all_later(jobs)
        end
  33. Final task: Swap the default queue adapter
        # in config/application.rb
        config.active_job.queue_adapter = :solid_queue
        config.action_mailer.deliver_later_queue_name = :within_10_minutes
        config.active_storage.queues.analysis = :within_1_day
        config.active_storage.queues.purge = :within_1_day
  34. Recap
    • Use ActiveJob
    • Set up SolidQueue and database
    • Set up queues
    • Set up workers
    • Set up metrics and monitoring
    • Migrate with minimal / no downtime / no jobs lost
    • Optimize for performance
    • Remove Resque
  35. How did it go? Migration took about 3 months
    • 2 devs for about 2 months to set up SolidQueue and migrate all our jobs
    • Most of the work was setting up SolidQueue and then adjusting our infrastructure when we got it wrong
    • Once confident, we were able to migrate all of our jobs in a couple of weeks
    • 1 dev from ops to set up the infrastructure and fine-tune it as we went
    • 1 DBA who made sure our new DB was set up properly and running smoothly
  36. All done!
    • We're processing almost 3 million jobs per day
    • Performance of SolidQueue is amazing compared to Resque
    • Easy to horizontally scale workers based on demand
    • The observability is great: being able to write a SQL query and see what your jobs have done is really powerful
    • Renaming our queues was one of the biggest benefits
    • Very reliable!
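
The SLO-based queue scheme from slides 17, 28, and 29 can be sketched in plain Ruby. This is a minimal sketch and not the code from the talk: the shape of SUPPORTED_QUEUES (queue name mapped to a latency target in seconds) is an assumption inferred from slide 29, and max_runtime_for, queue_latency, and breaching_slo? are hypothetical helper names.

```ruby
# Assumed shape: queue name => latency target in seconds.
# The queue names are from slide 28; the hash itself is a guess at the real constant.
SUPPORTED_QUEUES = {
  within_1_minute: 60,
  within_5_minutes: 5 * 60,
  within_10_minutes: 10 * 60,
  within_1_hour: 60 * 60,
  within_4_hours: 4 * 60 * 60,
  within_1_day: 24 * 60 * 60
}.freeze

# Runtime budget for a single job: one tenth of the queue's latency target.
def max_runtime_for(queue_name)
  SUPPORTED_QUEUES.fetch(queue_name.to_sym) / 10.0
end

# Queue latency: seconds between the oldest waiting job's enqueue time and now.
# Returns 0.0 for an empty queue.
def queue_latency(enqueued_at_times, now: Time.now)
  oldest = enqueued_at_times.min
  oldest ? now - oldest : 0.0
end

# A queue breaches its SLO when its latency exceeds the target in its name.
def breaching_slo?(queue_name, enqueued_at_times, now: Time.now)
  queue_latency(enqueued_at_times, now: now) > SUPPORTED_QUEUES.fetch(queue_name.to_sym)
end
```

With these targets, within_1_minute allows 6 seconds per job and within_1_day allows 8,640 seconds (2.4 hours), which matches the table on slide 28.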
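
Slides 25 and 26 describe autoscaling past the database's connection limit. The arithmetic behind that failure can be sketched as follows. Every number and name here is hypothetical, since the talk does not give the real pod counts, pool sizes, or connection limits, and the sketch assumes the worst case where every process opens its full connection pool.

```ruby
# Worst-case connections opened by the worker fleet, assuming each process
# may hold up to pool_size connections at once (an assumption, not a measurement).
def connections_needed(pods:, processes_per_pod:, pool_size:)
  pods * processes_per_pod * pool_size
end

# Largest pod count that stays under the database's connection limit,
# after reserving some connections for everything else that talks to the DB.
def max_safe_pods(max_connections:, reserved:, processes_per_pod:, pool_size:)
  (max_connections - reserved) / (processes_per_pod * pool_size)
end

# Hypothetical example: 40 pods x 4 processes x 5 connections = 800 connections,
# which is more than a 500-connection limit can serve.
needed = connections_needed(pods: 40, processes_per_pod: 4, pool_size: 5)
ceiling = max_safe_pods(max_connections: 500, reserved: 100, processes_per_pod: 4, pool_size: 5)
puts "needed=#{needed} max_safe_pods=#{ceiling}"
```

A ceiling like this can be used as the autoscaler's upper bound so that scaling workers up never exhausts the queue database.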
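
The batching and jitter pattern from slides 31 and 32 can be illustrated without Rails. The sketch below only plans the batches: plan_batches is a hypothetical helper, and in a Rails app each planned batch would correspond to one ActiveJob.perform_all_later call as shown on the slides. The batch size of 500 and the 30-minute window are taken from slide 32.

```ruby
# Split record ids into batches and give each job a random delay inside
# max_wait seconds, so the enqueued jobs do not all become ready at once.
# rng is injectable so the plan can be made reproducible.
def plan_batches(record_ids, batch_size: 500, max_wait: 30 * 60, rng: Random.new)
  record_ids.each_slice(batch_size).map do |batch|
    batch.map { |id| { record_id: id, wait: rng.rand(max_wait) } }
  end
end

batches = plan_batches((1..1200).to_a, rng: Random.new(42))
# 1,200 records become 3 bulk enqueues instead of 1,200 single-row inserts.
puts batches.map(&:size).inspect
```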