
A tale of three web servers


A talk about three popular Ruby web servers at Amsterdam.rb. How and why are these servers different? Along the way we’ll learn some things about the options we have to let a Ruby program do multiple things at the same time.

Thijs Cadier

January 21, 2014

Transcript

  1. Wikipedia: "Concurrency is a property of systems in which several computations are executing simultaneously, and potentially interacting with each other."
  2. Three main ways of doing concurrency in Ruby:
     • Multi-process (Unicorn)
     • Threading (Puma)
     • Event-driven (Thin)
     Disclaimer: for the sake of simplicity we will focus on the original strong point of each of these three servers; the story is a bit more complex in reality. There are other web servers out there too.
  3. Multi-process
     • When you start Unicorn you start a master process
     • The master process does not handle requests, but controls one or more child processes that do
     • It starts these processes by forking itself
  4. Fork is a Unix system call that makes a copy of a process, with the exact state it has at the time of forking.
  5. Code:

     @best_year_ever = 2014

     puts "#{Process.pid}: I'm the original process"

     if fork
       puts "#{Process.pid}: I'm the master"
     else
       puts "#{Process.pid}: I'm the child"
       @best_year_ever = 2015
     end

     puts "#{Process.pid}: The best year ever is #{@best_year_ever}"

     Console output:

     351: I'm the original process
     351: I'm the master
     351: The best year ever is 2014
     372: I'm the child
     372: The best year ever is 2015
  6. def spawn_missing_workers
       worker_nr = -1
       until (worker_nr += 1) == @worker_processes
         worker = Worker.new(worker_nr)
         if pid = fork
           # Run in the master
           WORKERS[pid] = worker
           worker.atfork_parent
         else
           # Run in the child
           after_fork_internal
           # Start the loop that handles incoming requests
           worker_loop(worker)
         end
       end
     end
  7. server = TCPServer.new(8080)
     loop do
       # Wait for an incoming connection
       socket = server.accept
       # Process the incoming request with some backend
       process_request(socket.gets)
     end
  8. Copy on write
     • Memory is not copied on forking
     • It does get copied when it's written to
     • Therefore code used by frameworks and such occupies memory only once
     • Introduced in Ruby 2.0 (used to be available in REE too)
  9. Recap: Multi-process
     • Multi-process does concurrency by running separate worker processes that handle requests
     • If you expect your workers to break, it's easy to kill them without affecting other workers
     • Concurrency is limited by the number of processes
     • Every process uses the full amount of memory. Copy on write helps, a bit.
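The master/worker pattern from the recap above can be sketched in a few lines. This is a hypothetical toy, not Unicorn's actual code: `WORKER_COUNT` and the worker body are made up for illustration. The master forks a fixed number of workers and then only supervises them.

```ruby
# Toy preforking sketch: the master process forks workers and
# never handles requests itself; each worker would run its own
# request loop (simulated here by printing and exiting).
WORKER_COUNT = 2

worker_pids = WORKER_COUNT.times.map do |nr|
  fork do
    # In a real server this would be the accept/request loop.
    puts "worker #{nr} (pid #{Process.pid}) started"
    exit 0
  end
end

# The master only supervises: it waits for its children and
# could respawn any worker that dies.
statuses = worker_pids.map { |pid| Process.wait2(pid).last }
puts "all workers exited cleanly: #{statuses.all?(&:success?)}"
```

A real master would loop forever, respawning workers instead of exiting once they are gone.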
  10. Threading
      • One process handles multiple requests by running multiple threads
      • Threads live in the same process and therefore share the same global state
  11. You have to take the shared global state into account when using threaded code. Let's look at some examples.
  12. Code:

      5.times do |i|
        Thread.new do
          sleep rand(5)
          puts "I'm thread #{i}"
        end
      end

      sleep 10

      Console output:

      I'm thread 1
      I'm thread 3
      I'm thread 4
      I'm thread 0
      I'm thread 2
  13. Code:

      @best_year_ever = 2014

      5.times do |i|
        Thread.new do
          sleep rand(5)
          puts "I'm thread #{i} and the best year ever is #{@best_year_ever}"
        end
      end

      sleep 2
      @best_year_ever = 2015

      sleep 30

      Console output:

      I'm thread 2 and the best year ever is 2014
      I'm thread 4 and the best year ever is 2014
      I'm thread 0 and the best year ever is 2014
      I'm thread 1 and the best year ever is 2015
      I'm thread 3 and the best year ever is 2015
  14. Code:

      @total = 0

      100.times do |i|
        Thread.new do
          sleep rand(0.5)
          snapshot_of_total = @total
          sleep rand(0.5)
          @total = snapshot_of_total + 1
        end
      end

      sleep 5
      puts @total

      Console output:

      6
  15. Code:

      @total = 0
      @lock = Mutex.new

      100.times do |i|
        Thread.new do
          sleep rand(0.5)
          @lock.synchronize do
            snapshot_of_total = @total
            sleep rand(0.5)
            @total = snapshot_of_total + 1
          end
        end
      end

      sleep 60
      puts @total

      Console output:

      100
  16. Code:

      @total = 0

      100.times do |i|
        snapshot_of_total = @total
        sleep rand(0.5)
        @total = snapshot_of_total + 1
      end

      puts @total

      Console output:

      100
  17. Threads can work well to achieve concurrency, but it can be really hard to make them independent of each other.
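One way to keep threads independent is to communicate through Ruby's thread-safe Queue instead of sharing instance variables directly. A minimal sketch (the job/result setup is invented for illustration, not from the talk's examples):

```ruby
# Threads coordinate via two thread-safe Queues: one for jobs,
# one for results. No Mutex is needed in our own code because
# Queue does its own locking internally.
jobs    = Queue.new
results = Queue.new

workers = 4.times.map do
  Thread.new do
    # Each worker pulls jobs until it sees the :done marker.
    while (n = jobs.pop) != :done
      results << n * n
    end
  end
end

10.times { |n| jobs << n }
4.times  { jobs << :done }  # one stop marker per worker
workers.each(&:join)

total = 0
total += results.pop until results.empty?
puts total  # sum of squares 0..9 = 285
```

Compare this with the Mutex example on the earlier slide: the result is just as deterministic, but the synchronization lives inside Queue instead of in our code.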
  18. Global Interpreter Lock (GIL)
      • Every time the interpreter runs a line of Ruby code it takes a lock
      • IO operations are run outside of the GIL
      • If you run operations on hashes, for example, in multiple threads, your program will still only utilize one CPU core
      • Rubinius and JRuby don't have a GIL
  19. module Puma
        class ThreadPool
          def initialize(min, max, *extra, &block)
            @cond  = ConditionVariable.new
            @mutex = Mutex.new
            @min = min
            @max = max
            @workers = []
            @mutex.synchronize do
              @min.times { spawn_thread }
            end
          end
        end
      end
  20. while true
        mutex.synchronize do
          while todo.empty?
            @waiting += 1
            @cond.wait mutex
            @waiting -= 1
          end

          work = @todo.pop
        end

        block.call(work, *extra)
      end
  21. def <<(work)
        @mutex.synchronize do
          @todo << work

          if @waiting == 0 and @spawned < @max
            spawn_thread
          end

          @cond.signal
        end
      end
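The Puma excerpts above can be condensed into a runnable toy. TinyPool below is a hypothetical stripped-down pool, not Puma's API, but it uses the same ingredients: a Mutex guards the work list, and a ConditionVariable parks idle workers until work is signalled in.

```ruby
# A toy thread pool in the spirit of the Puma excerpts:
# << pushes work under the lock and signals one sleeping worker.
class TinyPool
  def initialize(size, &block)
    @mutex = Mutex.new
    @cond  = ConditionVariable.new
    @todo  = []
    @block = block
    @threads = size.times.map { spawn_thread }
  end

  def <<(work)
    @mutex.synchronize do
      @todo << work
      @cond.signal  # wake one sleeping worker, if any
    end
  end

  def spawn_thread
    Thread.new do
      loop do
        work = @mutex.synchronize do
          # Sleep (releasing the lock) until there is work.
          @cond.wait(@mutex) while @todo.empty?
          @todo.shift
        end
        break if work == :shutdown
        @block.call(work)
      end
    end
  end

  def shutdown
    # One :shutdown marker per worker; FIFO order means real
    # work queued earlier is always processed first.
    @threads.size.times { self << :shutdown }
    @threads.each(&:join)
  end
end

results = Queue.new
pool = TinyPool.new(3) { |n| results << n * 2 }
(1..5).each { |n| pool << n }
pool.shutdown

sum = 0
sum += results.pop until results.empty?
puts sum  # 2 + 4 + 6 + 8 + 10 = 30
```

The real pool also tracks `@waiting` and `@spawned` so it can grow lazily between min and max; that bookkeeping is omitted here.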
  22. while @status == :run
        begin
          ios = IO.select sockets
          ios.first.each do |sock|
            if io = sock.accept_nonblock
              c = Client.new io, nil
              pool << c
            end
          end
        end
      end
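The accept loop above leans on IO.select. Here is a tiny runnable sketch of the same pattern, using a pipe in place of listening sockets (the names are invented for the demo):

```ruby
# IO.select blocks until one of the watched IOs is readable,
# which is how the server waits on many sockets with one thread.
reader, writer = IO.pipe

# Simulate a client that sends data after a short delay.
sender = Thread.new do
  sleep 0.1
  writer.write("hello")
  writer.close
end

# select returns [readable, writable, errored]; here we only
# watch the read end, like the server only watches its sockets.
readable, = IO.select([reader], nil, nil, 5)
message = readable.first.read
sender.join
puts message  # => "hello"
```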
  23. Recap: Threading
      • Server keeps a pool of worker threads
      • They wait for work to come in and process it outside of the server's main lock
      • Concurrency is limited by the size of the thread pool
      • Worker threads use little memory compared to processes
  24. Event-driven
      • One process handles multiple requests by running an event loop that schedules work
      • All code that gets run in this loop has to split itself up into the smallest feasible units of work
  25. Code:

      require 'eventmachine'

      EM.run do
        5.times do |i|
          EM.add_timer(rand(5)) do
            puts "I'm callback #{i}"
          end
        end
      end

      Console output:

      I'm callback 1
      I'm callback 2
      I'm callback 0
      I'm callback 3
      I'm callback 4
  26. Code:

      require 'eventmachine'

      @best_year_ever = 2014

      EM.run do
        5.times do |i|
          EM.add_timer(rand(5)) do
            puts "I'm callback #{i} and the best year ever is #{@best_year_ever}"
          end
        end

        EM.add_timer(2) do
          @best_year_ever = 2015
        end
      end

      Console output:

      I'm callback 1 and the best year ever is 2014
      I'm callback 3 and the best year ever is 2014
      I'm callback 0 and the best year ever is 2015
      I'm callback 2 and the best year ever is 2015
      I'm callback 4 and the best year ever is 2015
  27. Code:

      require 'eventmachine'

      @total = 0

      EM.run do
        100.times do |i|
          EM.add_timer(rand(0.5)) do
            snapshot_of_total = @total
            EM.add_timer(rand(0.5)) do
              @total = snapshot_of_total + 1
            end
          end
        end

        EM.add_timer(5) do
          puts @total
        end
      end

      Console output:

      4
  28. You can only use code that works in an event-driven way, or you'll end up with a program that's not concurrent.
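To make the scheduling model concrete, here is a toy event loop in plain Ruby. Nothing below comes from EventMachine itself; ToyLoop is a hypothetical sketch of what add_timer-style scheduling boils down to: one thread repeatedly runs the callback whose deadline is due next, and callbacks may schedule further callbacks.

```ruby
# A toy single-threaded event loop: timers are [deadline, callback]
# pairs; run picks the earliest deadline, sleeps until it is due,
# and invokes the callback. A real reactor would use select/epoll
# here so it can also watch sockets while waiting.
class ToyLoop
  def initialize
    @timers = []
  end

  def add_timer(delay, &callback)
    @timers << [Time.now + delay, callback]
  end

  def run
    until @timers.empty?
      @timers.sort_by!(&:first)
      fire_at, callback = @timers.shift
      now = Time.now
      sleep(fire_at - now) if fire_at > now
      # Callbacks run to completion and may schedule new timers,
      # which is why they must be small units of work.
      callback.call
    end
  end
end

order = []
reactor = ToyLoop.new
reactor.add_timer(0.02) { order << :second }
reactor.add_timer(0.01) do
  order << :first
  reactor.add_timer(0.02) { order << :third }  # nested, like slide 27
end
reactor.run
p order  # => [:first, :second, :third]
```

A long-running callback would stall every other timer, which is exactly why all code in the loop has to be event-driven.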
  29. module Thin
        module Connection
          def receive_data(data)
            process if @request.parse(data)
          end

          def process
            EventMachine.defer(
              method(:pre_process),
              method(:post_process)
            )
          end

          def pre_process
            @app.call(@request.env)
          end

          def post_process(result)
            @response.status, @response.headers, @response.body = *result
            @response.each do |chunk|
              send_data chunk
            end
          end
        end
      end
  30. Recap: Event-driven
      • Server runs a loop that schedules execution of operations and callbacks
      • Whenever an operation has to wait for something it stops; a callback gets called when the wait is over
      • Hardly any memory is used by the callbacks; they're just Ruby blocks
      • Concurrency can be an order of magnitude higher than in the other two models
      • All code running in the loop has to be event-driven
  31. • For most apps threading makes sense; the Ruby/Rails ecosystem seems to (slowly) be moving this way
      • If you run highly concurrent apps with long-running streams, event-driven allows you to scale
      • If you don't have a high-traffic site, or you expect your workers to break, go for good old multi-process