ACIDic Jobs: A Layman's Guide to Job Bliss
Stephen Margheim, RubyConf 2021

Stephen Margheim @fractaledmind

Jobs are essential

But what is a Job?

Job == Verb
Job != ActiveJob object

Job == State Mutation
Job != inspection or retrieval of information

Job => Side Effects
Job !=> return value

A job is
• a Ruby class object,
• representing a state mutation action,
• that takes as input a representation of initial state
• and produces side-effects representing a next state.

Jobs must be idempotent & transactional

[Diagram: a Transaction wrapping Operation 1, Operation 2, Operation 3]

The ACIDic Guarantees
• Atomicity: everything succeeds or everything fails
• Consistency: the data always ends up in a valid state, as defined by your schema
• Isolation: concurrent transactions won't conflict with each other
• Durability: once committed always committed, even with system failures

Jobs · Databases

Idempotency
f(f(x)) == f(x)
Job.perform & Job.perform == Job.perform

The Idempotent Guarantee
• Functional Idempotency: the function always returns the same result, even if called multiple times
  f(f(x)) == f(x)
• Practical Idempotency: the side-effect(s) will happen once and only once, no matter how many times the job is performed
  Job.perform == Job.perform & Job.perform
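The two forms of idempotency can be contrasted in plain Ruby. This is only a sketch: `normalize`, `ReceiptJob`, and its `SENT` and `DELIVERIES` constants are hypothetical names that do not appear in the talk, and the in-memory set stands in for a durable record of completed work.

```ruby
require "set"

# Functional idempotency: applying the function to its own result changes nothing.
normalize = ->(email) { email.strip.downcase }
once  = normalize.call("  John@Example.COM ")
twice = normalize.call(once)

# Practical idempotency: the side effect happens once, however often we perform.
class ReceiptJob
  SENT = Set.new   # hypothetical record of completed work
  DELIVERIES = []  # the observable side effect

  def perform(order_id)
    return if SENT.include?(order_id) # already done: skip the side effect

    DELIVERIES << "receipt for order #{order_id}"
    SENT << order_id
  end
end

job = ReceiptJob.new
3.times { job.perform(42) }
```

Performing the job three times leaves a single entry in `DELIVERIES`, which is the "once and only once" property the slide describes.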

Jobs · Retries

class JobRun < ActiveRecord::Base
  # ...
end

ACIDic Job Principles (Nathan Griffith)
• Use a transaction to guarantee atomic execution
• Use locks to prevent concurrent data access
• Use idempotency and retries to ensure eventual completion
• Ensure enqueuing other jobs is co-transactional
• Split complex operations into steps

ACIDic Jobs Level 0 — Transactional Jobs

def perform(from_account, to_account, amount)
  run = JobRun.find_or_create_by!(
    job_class: self.class,
    job_id: job_id,
    job_args: [from_account, to_account, amount])

  run.with_lock do
    from_account.lock!
    to_account.lock!
    # debit the sender, credit the receiver
    from_account.update!(balance: from_account.balance - amount)
    to_account.update!(balance: to_account.balance + amount)
  end
end

“Just remember that Sidekiq will execute your job at least once, not exactly once.” (Mike Perham)

ACIDic Jobs Level 0 Recap
• Use a database record to make job runs transactional
• Use a database lock to mitigate concurrency issues
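The all-or-nothing behavior that a transaction plus a lock provides can be illustrated without a database. The `Ledger` class below is a hypothetical in-memory stand-in, not ActiveRecord: a mutex plays the role of the row lock, and a snapshot plays the role of rollback.

```ruby
# Hypothetical in-memory stand-in for a database transaction: take a snapshot,
# run the block, and restore the snapshot if anything raises.
class Ledger
  attr_reader :balances

  def initialize(balances)
    @balances = balances
    @mutex = Mutex.new # plays the role of the row lock
  end

  def transaction
    @mutex.synchronize do
      snapshot = @balances.dup
      begin
        yield
      rescue StandardError
        @balances = snapshot # roll back every change made in the block
        raise
      end
    end
  end

  def transfer(from, to, amount)
    transaction do
      @balances[from] -= amount
      raise ArgumentError, "insufficient funds" if @balances[from].negative?

      @balances[to] += amount
    end
  end
end

ledger = Ledger.new("john" => 100_00, "jane" => 50_00)
ledger.transfer("john", "jane", 10_00)
begin
  ledger.transfer("john", "jane", 500_00)
rescue ArgumentError
  # the failed transfer left no partial debit behind
end
```

The second transfer debits John before it raises, yet the rollback restores his balance, so only the first transfer is visible afterwards.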

ACIDic Jobs Level 1 — Idempotent Jobs

Idempotency & Uniqueness
To guarantee idempotency, we must be able to define and identify the job uniquely.

john_account.balance # initial state
# => 100_00
TransferBalanceJob.perform_later(john_account, jane_account, 10_00)
TransferBalanceJob.perform_later(john_account, jane_account, 10_00)
john_account.balance # resulting state
# => 80_00 or 90_00 ?

Forms of Job Uniqueness
• each job run uses a generic unique entity representing this job run
• each job run uses a generic unique entity representing this job execution (based on args)

Unique Job by Job Run
def perform(from_account, to_account, amount)
  run = JobRun.find_or_create_by!(job_class: self.class, job_id: job_id)

  run.with_lock do
    return if run.completed?

    # debit the sender, credit the receiver
    from_account.update!(balance: from_account.balance - amount)
    to_account.update!(balance: to_account.balance + amount)
    run.update!(completed_at: Time.current)
  end
end

john_account.balance # initial state
# => 100_00
TransferBalanceJob.perform_later(john_account, jane_account, 10_00)
TransferBalanceJob.perform_later(john_account, jane_account, 10_00)
john_account.balance # resulting state
# => 80_00 ?

Unique Job by Execution Args
def perform(from_account, to_account, amount)
  run = JobRun.find_or_create_by!(
    job_class: self.class,
    job_args: [from_account, to_account, amount])

  run.with_lock do
    return if run.completed?

    # debit the sender, credit the receiver
    from_account.update!(balance: from_account.balance - amount)
    to_account.update!(balance: to_account.balance + amount)
    run.update!(completed_at: Time.current)
  end
end

john_account.balance # initial state
# => 100_00
TransferBalanceJob.perform_later(john_account, jane_account, 10_00)
TransferBalanceJob.perform_later(john_account, jane_account, 10_00)
john_account.balance # resulting state
# => 90_00 ?

Unique Job flexibly and generically
class TransferBalanceJob < ApplicationJob
  prepend UniqueByJobRun
  uniquely_identified_by_job_id
  # uniquely_identified_by_job_args

  def perform(from_account, to_account, amount)
    from_account.lock!
    to_account.lock!
    # debit the sender, credit the receiver
    from_account.update!(balance: from_account.balance - amount)
    to_account.update!(balance: to_account.balance + amount)
  end
end

john_account.balance # initial state
# => 100_00
TransferBalanceJob.perform_later(john_account, jane_account, 10_00)
TransferBalanceJob.perform_later(john_account, jane_account, 10_00)
john_account.balance # resulting state
# => 80_00

Unique Job flexibly and generically
class TransferBalanceJob < ApplicationJob
  prepend UniqueByJobRun
  # uniquely_identified_by_job_id
  uniquely_identified_by_job_args

  def perform(from_account, to_account, amount)
    from_account.lock!
    to_account.lock!
    # debit the sender, credit the receiver
    from_account.update!(balance: from_account.balance - amount)
    to_account.update!(balance: to_account.balance + amount)
  end
end

john_account.balance # initial state
# => 100_00
TransferBalanceJob.perform_later(john_account, jane_account, 10_00)
TransferBalanceJob.perform_later(john_account, jane_account, 10_00)
john_account.balance # resulting state
# => 90_00

ACIDic Jobs Level 1 Recap
• Use a database record to make job runs idempotent
  • custom or generic
  • by job ID or by job args
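The difference between the two uniqueness keys can be reproduced in memory. In this sketch `TransferJob`, its `RUNS` hash, and the `unique_by:` keyword are hypothetical stand-ins for the JobRun table and the `uniquely_identified_by_*` macros; it is not the implementation shown in the talk.

```ruby
# Hypothetical in-memory stand-in for the JobRun table, keyed by a uniqueness key.
class TransferJob
  RUNS = {}
  attr_reader :job_id

  def initialize(job_id)
    @job_id = job_id
  end

  def perform(balances, from, to, amount, unique_by:)
    key = if unique_by == :job_id
      [self.class.name, job_id]
    else
      [self.class.name, from, to, amount]
    end
    return if RUNS[key] # a run with this key already completed

    balances[from] -= amount
    balances[to] += amount
    RUNS[key] = true
  end
end

# Unique by job ID: a retry of the same job is skipped, a second enqueue runs.
by_id = { "john" => 100_00, "jane" => 0 }
TransferJob.new("a1").perform(by_id, "john", "jane", 10_00, unique_by: :job_id)
TransferJob.new("a1").perform(by_id, "john", "jane", 10_00, unique_by: :job_id)
TransferJob.new("b2").perform(by_id, "john", "jane", 10_00, unique_by: :job_id)

# Unique by args: a second enqueue with identical arguments is skipped too.
by_args = { "john" => 100_00, "jane" => 0 }
TransferJob.new("c3").perform(by_args, "john", "jane", 10_00, unique_by: :job_args)
TransferJob.new("d4").perform(by_args, "john", "jane", 10_00, unique_by: :job_args)
```

Keying by job ID treats two enqueues as two transfers, while keying by args collapses them into one, which matches the 80_00 and 90_00 outcomes on the slides.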

ACIDic Jobs Level 2 — Enqueuing other Jobs

uniquely_identified_by_job_id

def perform(from_account, to_account, amount)
  from_account.lock!
  to_account.lock!
  # debit the sender, credit the receiver
  from_account.update!(balance: from_account.balance - amount)
  to_account.update!(balance: to_account.balance + amount)

  TransferMailer.with(account: from_account).outbound.deliver_later
  TransferMailer.with(account: to_account).inbound.deliver_later
end

Failure Condition 1
[Timeline diagram with lanes "job queue" and "job process". Events shown: TransferMailer.deliver_later, transaction commits, job starts, job fails.]
Job queued by web process and dequeued by background worker

Failure Condition 2
[Timeline diagram with lanes "job queue" and "job process". Events shown: TransferMailer.deliver_later, job starts, job fails, rollback.]
Job queued by web process and dequeued by background worker

Solution Option 1
• A database-backed job queue
  • delayed_job
  • que
  • good_job
But... no Sidekiq and increased db load

Solution Option 2
• A transactionally-staged job queue
So... more Sidekiq and minimal increased db load

uniquely_identified_by_job_id

def perform(from_account, to_account, amount)
  from_account.lock!
  to_account.lock!
  # debit the sender, credit the receiver
  from_account.update!(balance: from_account.balance - amount)
  to_account.update!(balance: to_account.balance + amount)

  TransferMailer.with(account: from_account).outbound.deliver_acidic
  TransferMailer.with(account: to_account).inbound.deliver_acidic
end

# [...] marks values that are missing from the source
def deliver_acidic(options = {})
  job = delivery_job_class
  attributes = { adapter: "activejob", job_name: [...] }
  job_args = if job <= ActionMailer::Parameterized::MailDeliveryJob
    [[...], @action, "deliver_now", {params: @params, args: @args}]
  else
    [[...], @action, "deliver_now", @params, *@args]
  end
  attributes[:job_args] = [...]
  StagedJob.create!(attributes)
end

class StagedJob < ActiveRecord::Base
  after_create_commit :enqueue_job

  def enqueue_job
    case adapter
    when "activejob"
      ActiveJob::Base.deserialize(job_args).enqueue
    when "sidekiq"
      Sidekiq::Client.push("class" => job_name, "args" => job_args)
    end
  end
end

ACIDic Jobs Level 2 Recap
• Use transactionally-staged jobs to keep job enqueuing co-transactional with standard database operations
  • while keeping Sidekiq, and
  • not requiring an independent de-staging process
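The core idea of staging can be sketched without Rails: a job staged inside a transaction reaches the real queue only if that transaction commits. `StagingDb` and its methods are hypothetical names for illustration; in the talk this role is played by the `StagedJob` record and its `after_create_commit` callback.

```ruby
# Hypothetical in-memory sketch: jobs staged inside a transaction are pushed
# to the real queue only when the transaction commits.
class StagingDb
  attr_reader :queue

  def initialize
    @queue = []   # stands in for the Sidekiq/Redis queue
    @staged = nil # staged jobs of the open transaction, if any
  end

  def transaction
    @staged = []
    yield
    @queue.concat(@staged) # commit: de-stage into the real queue
  rescue StandardError
    raise # rollback: staged jobs vanish along with the other writes
  ensure
    @staged = nil
  end

  def stage(job_name, *args)
    raise "stage must be called inside a transaction" unless @staged

    @staged << [job_name, args]
  end
end

db = StagingDb.new
db.transaction { db.stage("OutboundMailer", 1) }
begin
  db.transaction do
    db.stage("InboundMailer", 2)
    raise "balance update failed"
  end
rescue RuntimeError
  # the staged InboundMailer job was discarded with the transaction
end
```

Only the job from the committed transaction is queued, so a worker can never pick up a job whose data was rolled back or is not yet committed.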

ACIDic Jobs Level 3 — Operational Steps

def perform(order)
  order.lock!
  order.process_and_fulfill!
  ShopifyAPI::Fulfillment.create!({
    amount: order.amount,
    customer: order.purchaser,
  })
  OrderMailer.with(order: order).fulfilled.deliver_acidic
end
?

Workflow step 1 step 2 step 3

Workflow — Run 1 step 1 step 2 step 3

Workflow — Run 2 step 1 step 2 step 3

Job-wise vs Step-wise Idempotency


uniquely_identified_by_job_args

def perform(order)
  @job.with_lock do
    order.process_and_fulfill!
    @job.update!(recovery_point: :fulfill_order)
  end if @job.recovery_point == :start
  # ...
end

uniquely_identified_by_job_args

def perform(order)
  # ...
  @job.with_lock do
    ShopifyAPI::Fulfillment.create!({ ... })
    @job.update!(recovery_point: :send_email)
  end if @job.recovery_point == :fulfill_order
  # ...
end

uniquely_identified_by_job_args

def perform(order)
  # ...
  @job.with_lock do
    OrderMailer.with(order: order).fulfilled.deliver_acidic
    @job.update!(recovery_point: :finished)
  end if @job.recovery_point == :send_email
end

include WithAcidity

def perform(order)
  with_acidity do
    step :process_order
    step :fulfill_order
    step :send_emails
  end
end

def process_order
  # ...
end

def fulfill_order
  # ...
end

def send_emails
  # ...
end

module WithAcidity
  def perform_step(current_step_method, next_step_method)
    return unless @job.recovery_point == current_step_method

    @job.with_lock do
      method(current_step_method).call
      @job.update!(recovery_point: next_step_method)
    end
  end
end

ACIDic Jobs Level 3 Recap
• Use a recovery key to keep track of which steps in a workflow job have been successfully completed
  • make each step ACIDic, and
  • keep the entire workflow job ACIDic
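How a recovery point lets a retried workflow resume where it stopped can be shown with a plain hash in place of the persisted job-run record. `OrderWorkflow`, its `fail_on:` option, and the `log` array are hypothetical names used only for this sketch.

```ruby
# Hypothetical in-memory sketch of step-wise recovery; `run` stands in for the
# persisted job-run record and its recovery_point column.
class OrderWorkflow
  STEPS = %i[process_order fulfill_order send_emails].freeze
  attr_reader :log

  def initialize(run, fail_on: nil)
    @run = run
    @fail_on = fail_on # simulate a failure in this step
    @log = []
  end

  def perform
    STEPS.each_with_index do |step, index|
      next unless @run[:recovery_point] == step # skip completed steps
      raise "#{step} failed" if step == @fail_on

      @log << step # the step's work happens here
      @run[:recovery_point] = STEPS[index + 1] || :finished
    end
  end
end

run = { recovery_point: :process_order }
first_attempt = OrderWorkflow.new(run, fail_on: :fulfill_order)
begin
  first_attempt.perform
rescue RuntimeError
  # the first attempt stops at fulfill_order; the recovery point is kept
end

second_attempt = OrderWorkflow.new(run)
second_attempt.perform
```

The second attempt skips `process_order` because the recovery point already moved past it, so each step's side effect runs once across both attempts.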

ACIDic Jobs Level 4 — Step Batches

def perform(order)
  with_acidity do
    step :process_order
    step :fulfill_order, awaits: [ShopifyFulfillJob]
    step :send_emails
  end
end

“Batches are Sidekiq Pro's [tool to] create a set of jobs to execute in parallel and then execute a callback when all the jobs are finished.” (Mike Perham)

Parallel Executing + Workflow Blocking

[Diagram labels: Workflow Job A, Shopify Fulfill Job A, Workflow Job B, Workflow Job A]

ACIDic Jobs Level 4 Recap
• Use Sidekiq Batches to allow parallel, separately queued jobs to be used within a multi-step workflow
  • keep steps serially dependent
  • while allowing for parallelization
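The batch pattern, parallel jobs followed by a completion callback that resumes the workflow, can be approximated with threads. `TinyBatch` is a hypothetical stand-in written for this sketch and is not the Sidekiq Pro Batch API; the job names are likewise illustrative.

```ruby
# Hypothetical stand-in for a batch: run jobs in parallel threads and fire a
# callback once all of them have finished. Not the Sidekiq Pro API.
class TinyBatch
  def initialize(&on_complete)
    @jobs = []
    @on_complete = on_complete
  end

  def add(&job)
    @jobs << job
  end

  def run
    threads = @jobs.map { |job| Thread.new(&job) }
    results = threads.map(&:value) # blocks until every job has finished
    @on_complete.call(results)
  end
end

events = []
batch = TinyBatch.new { |results| events << [:resume_workflow, results] }
batch.add { :shopify_fulfilled }
batch.add { :inventory_updated }
batch.run
```

The callback fires exactly once, after both jobs return, which is the point at which a workflow step declared with `awaits:` could hand control to the next step.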

