• It’s really hard to isolate the problem in a distributed environment. • “Programmable” means programming is always needed. • Nothing is reliable except MySQL. Lessons from Kuroko1
def check_assignment_delay(exec_no_pid) if exec_no_pid.started_at < 1.minutes.ago Kuroko2::Notifications .not_assigned(exec_no_pid, @hostname).deliver_now exec_no_pid.touch(:mailed_at) end end
def check_process_absence(execution) begin Process.kill(0, execution.pid) rescue Errno::EPERM true rescue Errno::ESRCH if Execution.exists?(execution.id) notify_process_absence(execution) end end end