Slide 1

Slide 1 text

<3 Unix Applying beautiful Unix idioms to build a Ruby prefork server

Slide 2

Slide 2 text

Sahil Muthoo @sahilmuthoo ThoughtWorks Studios

Slide 3

Slide 3 text

snap-ci.com The missing link between GitHub and Heroku

Slide 4

Slide 4 text

Four decades of Unix

Slide 5

Slide 5 text

Transcends languages and frameworks

Slide 6

Slide 6 text

Solving problems for four decades

Slide 7

Slide 7 text

Unicorn Resque

Slide 8

Slide 8 text

Unicorn Resque UNIX Live free or die

Slide 9

Slide 9 text

Man pages Tell your kill(2) from your kill(1)

Slide 10

Slide 10 text

System calls

Slide 11

Slide 11 text

Ruby was built for Unix hacking

Slide 12

Slide 12 text

System calls

Unix        Ruby
fork(2)     Process.fork
execve(2)   Kernel.exec
pipe(2)     IO.pipe
select(2)   IO.select
kill(2)     Process.kill

Slide 13

Slide 13 text

Processes The atoms of Unix

Slide 14

Slide 14 text

Processes have ids

% irb
>> Process.pid
=> 705

Process #705

Slide 15

Slide 15 text

Processes have parents

% irb
>> Process.pid
=> 705
>> Process.ppid
=> 99930

Process #705
  Parent process: #99930

Slide 16

Slide 16 text

Process lineage

% pstree -s irb
-+= 00001 root /sbin/launchd
 \-+= 00136 sahilm /sbin/launchd
   \-+= 08735 sahilm /Applications/iTerm.app
     \-+= 08739 sahilm -/bin/bash
       \--= 08830 sahilm irb

Slide 17

Slide 17 text

Processes have friendly names

% irb
>> $PROGRAM_NAME
=> "irb"
>> $PROGRAM_NAME="Ponies!"
=> "Ponies!"

Process #705
  Parent process: #99930
  Name: irb -> Ponies!

Slide 18

Slide 18 text

Process names are useful

% bin/kaanta
[kaanta master (PID: 4402)] INFO -- Listening on 0.0.0.0:8080
[kaanta master (PID: 4402)] INFO -- Spawning 3 workers
[kaanta worker 0 (PID: 4429)] INFO -- up
[kaanta worker 1 (PID: 4430)] INFO -- up
[kaanta worker 2 (PID: 4431)] INFO -- up

Slide 21

Slide 21 text

Processes have resources

% irb
>> STDIN.fileno
=> 0
>> STDOUT.fileno
=> 1
>> STDERR.fileno
=> 2

Process #705
  Parent process: #99930
  Name: Ponies!
  Resources: fd0-2

Slide 22

Slide 22 text

file == resource == file

% irb
>> socket = TCPServer.open
=> #
>> socket.fileno
=> 3

Process #705
  Parent process: #99930
  Name: Ponies!
  Resources: fd0-2, fd3

Slide 23

Slide 23 text

Processes can fork

% irb
>> fork { puts "Oh hai!" }
=> 5306
Oh hai!

Process #705
  Parent process: #99930
  Name: Ponies!
  Resources: fd0-2, fd3

Slide 24

Slide 24 text

% irb
>> fork { puts "Oh hai!" }
=> 5306
Oh hai!

Process #705
  Parent process: #99930
  Name: Ponies!
  Resources: fd0-2, fd3

=> Oh hai!

Process #5306
  Parent process: #705
  Name: Ponies!
  Resources: fd0-2, fd3

Slide 26

Slide 26 text

fork returns twice

if fork
  puts "forked"
else
  puts "didn't fork"
end

=> forked
=> didn't fork

Process #705
  Parent process: #99930
  Name: Ponies!
  Resources: fd0-2, fd3

Slide 27

Slide 27 text

fork returns twice

puts "Parent: #{Process.pid}"
if fork
  puts "#{Process.pid} forked"
else
  puts "#{Process.pid} didn't fork"
end

=> Parent: 705
=> 705 forked
=> 5306 didn't fork

Process #705
  Parent process: #99930
  Name: Ponies!
  Resources: fd0-2, fd3
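
To make the two return values visible from one place, the sketch below has the child report its own pid and parent pid back over a pipe. The helper name `fork_return_values` is made up for this illustration and is not part of kaanta.

```ruby
# Sketch: observe both return values of fork from the parent's side.
# fork_return_values is a hypothetical helper name.
def fork_return_values
  reader, writer = IO.pipe
  parent_pid = Process.pid

  child_pid = fork
  if child_pid.nil?
    # Child branch: fork returned nil here.
    reader.close
    writer.puts "#{Process.pid} #{Process.ppid}"
    writer.close
    exit!(0) # leave without running at_exit handlers copied from the parent
  end

  # Parent branch: fork returned the child's pid.
  writer.close
  reported_pid, reported_ppid = reader.gets.split.map(&:to_i)
  reader.close
  Process.waitpid(child_pid)

  { parent: parent_pid, fork_result: child_pid,
    child_saw_pid: reported_pid, child_saw_ppid: reported_ppid }
end

puts fork_return_values.inspect
```

In the parent, `fork` evaluates to the child's pid; in the child it evaluates to `nil`, which is why the `if fork ... else ... end` form on the slide runs both branches, one per process.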

Slide 28

Slide 28 text

fork(2) is copy-on-write friendly

Slide 29

Slide 29 text

require 'tempfile'

homework = Tempfile.new('')
pid = fork do
  homework.write "5 * 5 = 25"
end

# The parent checks right away, without waiting for the child.
if File.zero?(homework)
  puts "Y U NO HOMEWORK!"
else
  puts "Have a cookie"
end

Slide 30

Slide 30 text

Y U NO HOMEWORK!

Slide 31

Slide 31 text

require 'tempfile'

homework = Tempfile.new('')
pid = fork do
  homework.write "5 * 5 = 25"
end

# Wait for the child to finish (and flush its write) before checking.
Process.waitpid(pid)

if File.zero?(homework)
  puts "Y U NO HOMEWORK!"
else
  puts "Have a cookie"
end

Slide 32

Slide 32 text

Wait for your children waitpid(2) and friends

Slide 33

Slide 33 text

Processes have exit statuses

3.times do
  fork { exit([0, 1].sample) }
end

3.times do
  pid, status = Process.wait2
  if status.exitstatus == 1
    puts "#{pid} failed!"
  end
end

Slide 34

Slide 34 text

wait2 returns status

3.times do
  fork { exit([0, 1].sample) }
end

3.times do
  pid, status = Process.wait2
  if status.exitstatus == 1
    puts "#{pid} failed!"
  end
end

Slide 35

Slide 35 text

Nonzero exit statuses indicate errors

3.times do
  fork { exit([0, 1].sample) }
end

3.times do
  pid, status = Process.wait2
  if status.exitstatus == 1
    puts "#{pid} failed!"
  end
end
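
A deterministic variant of the same idea, as a rough sketch: each child exits with a known code, and the parent checks that `Process.wait2` reports exactly those codes. `collect_exit_statuses` is a hypothetical helper name.

```ruby
# Sketch: fork one child per exit code and record what wait2 reports.
# collect_exit_statuses is a hypothetical helper name.
def collect_exit_statuses(codes)
  expected = {}
  codes.each do |code|
    pid = fork { exit!(code) }
    expected[pid] = code
  end

  reported = {}
  codes.size.times do
    pid, status = Process.wait2 # blocks until any one child has exited
    reported[pid] = status.exitstatus
  end

  [expected, reported]
end

expected, reported = collect_exit_statuses([0, 1, 2])
failed = reported.reject { |_pid, code| code.zero? }
puts "#{failed.size} of #{reported.size} children failed"
```

Children can finish in any order, so the results are keyed by pid rather than collected in sequence.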

Slide 36

Slide 36 text

% bin/kaanta
[kaanta master (PID: 13685)] INFO -- Listening on 0.0.0.0:8080
[kaanta master (PID: 13685)] INFO -- Spawning 3 workers
[kaanta worker 0 (PID: 13712)] INFO -- up
[kaanta worker 1 (PID: 13713)] INFO -- up
[kaanta worker 2 (PID: 13714)] INFO -- up
[kaanta master (PID: 13685)] INFO -- reaped worker 0 (PID:13712) status: 0
[kaanta worker 0 (PID: 13725)] INFO -- up

% kill 13712

Slide 38

Slide 38 text

Processes can receive signals

% irb
>> Process.pid
=> 14576
>> Terminated: 15
%

% kill -SIGTERM 14576

Slide 39

Slide 39 text

Processes can receive signals

% irb
>> Process.pid
=> 14576
>> Terminated: 15
%

% kill -15 14576

Slide 40

Slide 40 text

kill(2) isn't only for killing

puts Process.pid
i = 0
loop do
  print i += 1
  sleep 1
end

% ruby kill.rb
15068
1234...

% kill -SIGSTOP 15068
[2]+ Stopped ruby kill.rb

Slide 41

Slide 41 text

kill(2) isn't only for killing

puts Process.pid
i = 0
loop do
  print i += 1
  sleep 1
end

% kill -SIGCONT 15068
% 5678...

Slide 42

Slide 42 text

man 7 signal

Slide 43

Slide 43 text

Processes can trap signals

% irb
>> Process.pid
=> 15369
>> trap(:TERM){puts "nope"}
=> "DEFAULT"
>> nope

% kill -SIGTERM 15369

Slide 44

Slide 44 text

Refuse to die

% irb
>> Process.pid
=> 15369
>> trap(:TERM){puts "nope"}
=> "DEFAULT"
>> nope

% kill -SIGTERM 15369

Slide 45

Slide 45 text

A more useful example

pid = fork do
  loop do
    print "."
    sleep 1
  end
end

# Forward SIGTERM to the child.
trap(:TERM) do
  Process.kill(:TERM, pid)
end

Process.waitpid(pid)

% kill -SIGTERM 15369
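
One more way to use a trap, sketched under the assumption that a worker should shut down cleanly rather than die from the signal: the child installs a TERM handler, tells the parent it is ready over a pipe, and the parent then sends the signal. `run_worker_until_term` is a hypothetical helper name.

```ruby
# Sketch: a child that exits cleanly when it receives SIGTERM.
# run_worker_until_term is a hypothetical helper name.
def run_worker_until_term
  ready_r, ready_w = IO.pipe

  pid = fork do
    ready_r.close
    trap(:TERM) { exit!(0) } # replaces the default action for TERM
    ready_w.puts "ready"     # the handler is installed, tell the parent
    ready_w.close
    sleep                    # sleep until a signal arrives
  end

  ready_w.close
  ready_r.gets               # block until the child has set its trap
  ready_r.close

  Process.kill(:TERM, pid)
  _, status = Process.waitpid2(pid)
  status
end

status = run_worker_until_term
puts "exited normally: #{status.exited?}, exit status: #{status.exitstatus.inspect}"
```

The pipe handshake matters: without it the parent could send TERM before the child has installed its handler, and the child would take the default action instead.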

Slide 46

Slide 46 text

The parent gets a SIGCHLD when a child exits

trap(:CHLD) do
  exit
end

fork do
  sleep 1
end

Slide 47

Slide 47 text

Signals are asynchronous

trap(:CHLD) do
  exit
end

fork do
  sleep 1
end

# tight loop
loop {}

Slide 48

Slide 48 text

exited = 0
trap(:CHLD) do
  exited += 1
  exit if exited == 3
end

3.times do
  fork do
    sleep 1
  end
end

# tight loop
loop {}

Slide 49

Slide 49 text

May never exit

Slide 50

Slide 50 text

The kernel maintains a signal queue

Slide 51

Slide 51 text

Signal delivery is unreliable: several SIGCHLDs that arrive close together can be delivered as one

Slide 52

Slide 52 text

def handler(exited)
  loop do
    Process.waitpid(-1, Process::WNOHANG) || break
    exited += 1
    exit if exited == 3
  end
  exited
rescue Errno::ECHILD
end

exited = 0
trap(:CHLD) { exited = handler(exited) }

3.times do
  fork do
    sleep 3
  end
end

# tight loop
loop {}

Slide 53

Slide 53 text

def handler(exited)
  loop do
    Process.waitpid(-1, Process::WNOHANG) || break
    exited += 1
    exit if exited == 3
  end
  exited
rescue Errno::ECHILD
end

exited = 0
trap(:CHLD) { exited = handler(exited) }

3.times do
  fork do
    sleep 3
  end
end

# tight loop
loop {}

Slide 54

Slide 54 text

def handler(exited)
  loop do
    Process.waitpid(-1, Process::WNOHANG) || break
    exited += 1
    exit if exited == 3
  end
  exited
rescue Errno::ECHILD
end

exited = 0
trap(:CHLD) { exited = handler(exited) }

3.times do
  fork do
    sleep 3
  end
end

# tight loop
loop {}

Slide 55

Slide 55 text

Processes can communicate

reader, writer = IO.pipe

fork do
  writer.close
  puts reader.gets
  reader.close
end

reader.close
writer.write("Wake up!")
writer.close
Process.wait

Slide 56

Slide 56 text

Pipes are one-way

reader, writer = IO.pipe

fork do
  writer.close
  puts reader.gets
  reader.close
end

reader.close
writer.write("Wake up!")
writer.close
Process.wait
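
Because each pipe carries data in one direction only, a request/reply exchange needs two of them. The following is a minimal sketch of that arrangement; `upcase_in_child` and the upcasing "work" are invented for the example.

```ruby
# Sketch: one pipe per direction gives two-way communication.
# upcase_in_child is a hypothetical helper name.
def upcase_in_child(line)
  request_r, request_w = IO.pipe # parent -> child
  reply_r, reply_w = IO.pipe     # child -> parent

  pid = fork do
    # Close the ends this process does not use.
    request_w.close
    reply_r.close
    reply_w.puts request_r.gets.upcase
    reply_w.close
    exit!(0)
  end

  request_r.close
  reply_w.close
  request_w.puts line
  request_w.close
  answer = reply_r.gets.chomp
  reply_r.close
  Process.waitpid(pid)
  answer
end

puts upcase_in_child("wake up!")
```

Closing the unused ends in each process is what lets a reader see end-of-file once the last writer is gone.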

Slide 57

Slide 57 text

select(2) tells you when stuff is ready

reader, writer = IO.pipe

fork do
  writer.puts("Do this")
  sleep 2
  writer.puts("Do that")
end

loop do
  ret = IO.select([reader], nil, nil, 1)
  if ret
    puts reader.gets
  else
    puts "no work :("
  end
end

Do this
no work :(
Do that
no work :(

Slide 58

Slide 58 text

IO.select(read_array [, write_array [, error_array [, timeout]]]) -> array or nil
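
A small sketch of the two possible results, using a pipe inside a single process: `nil` when the timeout expires with nothing to read, and an array of three arrays (readable, writable, errored) once data is waiting. `select_demo` is a hypothetical helper name.

```ruby
# Sketch: what IO.select returns on timeout and on readiness.
# select_demo is a hypothetical helper name.
def select_demo
  reader, writer = IO.pipe

  # Nothing has been written yet, so this times out after 0.1 seconds.
  idle = IO.select([reader], nil, nil, 0.1)

  writer.puts "Do this"
  writer.flush

  # Now the read end has data, so select returns immediately.
  ready = IO.select([reader], nil, nil, 0.1)
  line = reader.gets

  reader.close
  writer.close
  [idle, ready, line]
end

idle, ready, line = select_demo
puts "timeout result: #{idle.inspect}"
puts "ready readers:  #{ready[0].size}"
puts "line read:      #{line.inspect}"
```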

Slide 59

Slide 59 text

Preforking servers

Slide 60

Slide 60 text

The master fork(2)s each worker; the workers select(2) on a shared socket

Slide 61

Slide 61 text

Unicorn, Passenger

Slide 62

Slide 62 text

And Kaanta (Hindi for fork) github.com/sahilm/kaanta

Slide 63

Slide 63 text

One master forks many children

Slide 64

Slide 64 text

Concurrency via fork(2)

Slide 65

Slide 65 text

Faster to boot and more memory efficient

Slide 66

Slide 66 text

def start
  $PROGRAM_NAME = "kaanta master"
  @master_pid = Process.pid
  @socket = TCPServer.open(Config.host, Config.port)
  spawn_workers
end

def spawn_workers
  worker_number = -1
  until (worker_number += 1) == Config.workers
    @workers.value?(worker_number) && next
    worker = Kaanta::Worker.new(@master_pid, @socket, tempfile,
                                worker_number, logger)
    pid = fork { worker.start }
    @workers[pid] = worker
  end
end

Slide 68

Slide 68 text

One socket shared amongst all workers

Slide 69

Slide 69 text

Kernel load balances connections
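
The shared-socket idea can be tried in isolation. In this rough sketch the listening socket is opened before forking, every worker inherits it, and each worker serves exactly one connection so the outcome is predictable. The helper name, the loopback address, and the one-connection limit are assumptions made for the example and are not how kaanta itself is written.

```ruby
require 'socket'

# Sketch: several forked workers accepting on one inherited listening socket.
# serve_once_per_worker is a hypothetical helper name.
def serve_once_per_worker(count)
  server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
  port = server.addr[1]

  worker_pids = count.times.map do
    fork do
      client = server.accept  # every worker waits on the same socket
      client.puts Process.pid # reply with the pid that got the connection
      client.close
      exit!(0)
    end
  end

  answers = count.times.map do
    sock = TCPSocket.new('127.0.0.1', port)
    answer = sock.gets.to_i
    sock.close
    answer
  end

  worker_pids.each { |pid| Process.waitpid(pid) }
  server.close
  [worker_pids, answers]
end

workers, answers = serve_once_per_worker(2)
puts "workers:     #{workers.inspect}"
puts "answered by: #{answers.inspect}"
```

The master never touches a connection here. Whichever worker the kernel wakes handles the client, which is the property the slides rely on.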

Slide 70

Slide 70 text

while alive && @master_pid == Process.ppid do
  if ret
    begin
      client = @socket.accept_nonblock
    rescue Errno::EAGAIN
    end
  end
  ret = begin
    IO.select([@socket], nil, nil, Config.timeout / 2) || next
  rescue Errno::EBADF
  end
end

Slide 71

Slide 71 text

Easier sysadmining

Slide 72

Slide 72 text

Managed via signals to master

Slide 73

Slide 73 text

No zoo of processes to manage

Slide 74

Slide 74 text

Works well with simple tools like kill

Slide 75

Slide 75 text

loop do
  reap_workers
  case (mode = @sig_queue.shift)
  when nil
    kill_runaway_workers
    spawn_workers
  when 'QUIT', 'TERM', 'INT'
    # we don't handle gracefully stopping workers
    # to demonstrate that workers go down quickly
    # after the master quits by tracking their
    # Process.ppid.
    break
  when 'TTIN'
    Config.workers += 1
  when 'TTOU'
    unless Config.workers <= 0
      Config.workers -= 1
      kill_worker('QUIT', @workers.keys.max)
    end
  end
end

Slide 76

Slide 76 text

% bin/kaanta
[kaanta master (PID: 729)] INFO -- Listening on 0.0.0.0:8080
[kaanta master (PID: 729)] INFO -- Spawning 3 workers
[kaanta worker 0 (PID: 756)] INFO -- up
[kaanta worker 1 (PID: 757)] INFO -- up
[kaanta worker 2 (PID: 758)] INFO -- up
[kaanta worker 3 (PID: 862)] INFO -- up

% kill -TTIN 729

Slide 77

Slide 77 text

Moar idioms!

Slide 78

Slide 78 text

Let's turn the Unix up to 11

Slide 79

Slide 79 text

How do you kill hung workers?

Slide 80

Slide 80 text

Hand each worker an unlinked temporary file

def spawn_workers
  worker_number = -1
  until (worker_number += 1) == Config.workers
    @workers.value?(worker_number) && next
    tempfile = Tempfile.new('')
    tempfile.unlink
    tempfile.sync = true
    worker = Kaanta::Worker.new(tempfile)
    pid = fork { init_worker(worker) }
    @workers[pid] = worker
  end
end

Slide 81

Slide 81 text

Workers must chmod that tempfile periodically

while alive && @master_pid == Process.ppid do
  tempfile.chmod(i += 1)
  if ret
    begin
      client = @socket.accept_nonblock
      command = client.gets
      logger.info("Executing: #{command}")
      client.write `#{command}`
      client.flush
      client.close
    rescue Errno::EAGAIN
    end
  end
  tempfile.chmod(i += 1)
  ret = begin
    IO.select([@socket], nil, nil, Config.timeout / 2) || next
  rescue Errno::EBADF
  end
end

Slide 82

Slide 82 text

Kill workers that failed to chmod in time

def kill_runaway_workers
  now = Time.now
  @workers.each_pair do |pid, worker|
    (now - worker.tempfile.ctime) <= Config.timeout && next
    kill_worker('KILL', pid)
  end
end

Slide 83

Slide 83 text

Spawn replacements in your master loop

loop do
  reap_workers
  case (mode = @sig_queue.shift)
  when nil
    kill_runaway_workers
    spawn_workers
  when 'QUIT', 'TERM', 'INT'
    break
  when 'TTIN'
    Config.workers += 1
  when 'TTOU'
    unless Config.workers <= 0
      Config.workers -= 1
      kill_worker('QUIT', @workers.keys.max)
    end
  end
end

Slide 84

Slide 84 text

fchmod(2) heartbeat
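
The heartbeat works because changing a file's mode updates its ctime, even when the file no longer has a name on disk. A minimal sketch of that effect follows; the helper name, the file prefix, the chosen mode, and the 1.1 second pause (to stay clear of filesystems with one-second timestamps) are all assumptions made for the example.

```ruby
require 'tempfile'

# Sketch: chmod on an unlinked tempfile moves its ctime forward.
# heartbeat_ctimes is a hypothetical helper name.
def heartbeat_ctimes
  file = Tempfile.new('heartbeat')
  file.unlink          # no name on disk, but the descriptor stays open
  before = file.ctime

  sleep 1.1            # longer than one second, for coarse timestamps
  file.chmod(0o644)    # the heartbeat: a metadata-only change
  after = file.ctime

  file.close
  [before, after]
end

before, after = heartbeat_ctimes
puts "ctime moved forward by #{(after - before).round(2)}s"
```

A master holding the same open file can compare `Time.now` against that ctime, as `kill_runaway_workers` does, without any data ever being written to the file.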

Slide 85

Slide 85 text

How do you reap dead workers?

Slide 86

Slide 86 text

def reap_workers
  loop do
    pid, status = Process.waitpid2(-1, Process::WNOHANG) || break
    reap_worker(pid, status)
  end
rescue Errno::ECHILD
end

def reap_worker(pid, status)
  worker = @workers.delete(pid)
  worker.tempfile.close rescue nil
  logger.info "reaped worker #{worker.number} " \
              "(PID:#{pid}) " \
              "status: #{status.exitstatus}"
end

Slide 87

Slide 87 text

waitpid(2) with WNOHANG
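
The same reaping pattern, pulled out of the server as a standalone sketch. With `WNOHANG`, `waitpid2` returns `nil` when children exist but none has exited yet, and raises `Errno::ECHILD` when there are no children at all. Both helper names are hypothetical.

```ruby
# Sketch: collect every child that has already exited, without blocking.
# reap_exited_children and spawn_and_reap are hypothetical helper names.
def reap_exited_children
  reaped = []
  loop do
    pid, status = Process.waitpid2(-1, Process::WNOHANG)
    break if pid.nil? # children exist, but none has exited yet
    reaped << [pid, status.exitstatus]
  end
  reaped
rescue Errno::ECHILD  # no children left at all
  reaped
end

def spawn_and_reap(count)
  pids = count.times.map { |n| fork { exit!(n) } }
  seen = []
  until seen.size == count
    seen.concat(reap_exited_children)
    sleep 0.05        # the master would do other work here
  end
  [pids, seen]
end

pids, seen = spawn_and_reap(3)
seen.each { |pid, code| puts "reaped #{pid} with status #{code}" }
```

Looping until `waitpid2` has nothing more to report is what makes this robust when several exits are announced by a single SIGCHLD.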

Slide 88

Slide 88 text

How does a worker detect its parent's death?

Slide 89

Slide 89 text

while alive && @master_pid == Process.ppid do
  tempfile.chmod(i += 1)
  if ret
    begin
      client = @socket.accept_nonblock
      command = client.gets
      logger.info("Executing: #{command}")
      client.write `#{command}`
      client.flush
      client.close
    rescue Errno::EAGAIN
    end
  end
  tempfile.chmod(i += 1)
  ret = begin
    IO.select([@socket], nil, nil, Config.timeout / 2) || next
  rescue Errno::EBADF
  end
end

Slide 90

Slide 90 text

Time out IO.select in timeout / 2 seconds

Slide 91

Slide 91 text

Make sure the master pid is correct
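
The ppid check can be watched in action with a throwaway "master" that dies immediately. In this sketch the worker remembers the master's pid from before the fork, polls `Process.ppid`, and reports over a pipe once it has been re-parented. `detect_orphaning` and the 0.05 second polling interval are assumptions made for the example.

```ruby
# Sketch: a worker notices its master's death by watching Process.ppid.
# detect_orphaning is a hypothetical helper name.
def detect_orphaning
  reader, writer = IO.pipe

  master = fork do
    reader.close
    master_pid = Process.pid # recorded before forking the worker
    fork do
      # Worker: keep going only while the original master is our parent.
      sleep 0.05 while Process.ppid == master_pid
      writer.puts "orphaned"
      writer.close
      exit!(0)
    end
    writer.close
    exit!(0)                 # the master dies right away
  end

  writer.close
  Process.waitpid(master)
  message = reader.gets      # blocks until the worker reports
  reader.close
  message && message.chomp
end

puts detect_orphaning
```

Recording the pid before the fork avoids a race: a worker that only called `Process.ppid` after starting up could already have been re-parented and would then never see a change.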

Slide 92

Slide 92 text

More tricks in kaanta's source

Slide 93

Slide 93 text

Daemonizing a process

Slide 94

Slide 94 text

Deferred signal handling
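
One common shape for deferred handling, sketched here as an assumption about the technique rather than a copy of kaanta's source: the trap block only records the signal name, and the main loop decides what to do with it later, much like the `@sig_queue.shift` in the master loop. `SIGNAL_QUEUE`, `install_deferred_handlers`, and `next_signal` are hypothetical names.

```ruby
# Sketch: keep trap handlers tiny and do the real work in the main loop.
# SIGNAL_QUEUE, install_deferred_handlers and next_signal are hypothetical names.
SIGNAL_QUEUE = []

def install_deferred_handlers(signals)
  signals.each do |name|
    trap(name) { SIGNAL_QUEUE << name } # only record that the signal arrived
  end
end

# Wait up to `timeout` seconds for a recorded signal; nil if none arrives.
def next_signal(timeout)
  deadline = Time.now + timeout
  sleep 0.01 while SIGNAL_QUEUE.empty? && Time.now < deadline
  SIGNAL_QUEUE.shift
end

install_deferred_handlers(%w[USR1 USR2])
Process.kill('USR1', Process.pid)
puts "main loop saw #{next_signal(1).inspect}"
```

Keeping the handler this small avoids doing logging, forking, or other non-trivial work at whatever arbitrary point the signal happens to interrupt.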

Slide 95

Slide 95 text

Fin

Slide 96

Slide 96 text

Thank you