A Brief Aside on Request Queueing

John Pignata

March 13, 2013

Transcript

  1. REQUEST QUEUEING @jpignata

  2. None
  3. None
  4. nginx thin load balancer

  5. nginx thin load balancer X-Request-Start: 1362955863565

  6. # queue time: now minus the X-Request-Start timestamp (in ms)
     # that the load balancer stamped on the request
     now = Time.now
     x_request_start = request.headers["HTTP_X_REQUEST_START"]
     queue_start = Time.at(x_request_start.to_i / 1000)
     queue_time = now - queue_start
  7. nginx thin load balancer New Relic

  8. New Relic Agent ActionController::Base#process_action Rails New Relic

  9. Response Time: the amount of time required to process a single request
  10. THROUGHPUT: the number of requests that can be processed within a unit of time
  11. ARRIVAL RATE: the number of requests enqueued for processing in a unit of time
  12. Latency: the time delay prior to processing a request
  13. CONCURRENCY: how many requests can be processed simultaneously

  14. require "socket"
      server = TCPServer.new(3000)
      loop do
        client = server.accept
        sleep 1
        client.puts Time.now
        client.close
      end
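
The tcpping.rb client used in the following measurements isn't shown in the deck; as a rough sketch of what such a harness might look like (every detail here is an assumption), it opens the number of concurrent connections given on the command line, waits for each reply, and prints the same statistics the next slides report:

    # Hypothetical reconstruction of tcpping.rb; not from the talk.
    require "socket"

    count = ARGV[0].to_i
    latencies = Queue.new
    started = Time.now

    threads = Array.new(count) do
      Thread.new do
        t0 = Time.now
        socket = TCPSocket.new("localhost", 3000)
        socket.gets # block until the server writes its one-line response
        socket.close
        latencies << Time.now - t0
      end
    end

    threads.each(&:join)
    total = Time.now - started
    samples = Array.new(count) { latencies.pop }

    puts "Throughput: #{(count / total).round(4)} req/sec"
    puts "Average: #{(samples.inject(:+) / count).round(4)} secs"
    puts "Max: #{samples.max.round(4)} secs"
    puts "Min: #{samples.min.round(4)} secs"
    puts "Total: #{total.round(4)} secs"
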
  15. jp@oeuf:~$ ./tcpping.rb 1
      Throughput: 0.9974 req/sec
      Average: 1.0023 secs
      Max: 1.0023 secs
      Min: 1.0023 secs
      Total: 1.0026 secs
  16. jp@oeuf:~$ ./tcpping.rb 2
      Throughput: 0.9983 req/sec
      Average: 1.5024 secs
      Max: 2.003 secs
      Min: 1.0019 secs
      Total: 2.0033 secs
  17. jp@oeuf:~$ ./tcpping.rb 4
      Throughput: 0.9982 req/sec
      Average: 2.5037 secs
      Max: 4.0051 secs
      Min: 1.0025 secs
      Total: 4.0071 secs
  18. jp@oeuf:~$ ./tcpping.rb 25
      Throughput: 0.9985 req/sec
      Average: 13.0168 secs
      Max: 25.0288 secs
      Min: 1.0014 secs
      Total: 25.0364 secs
  19. require "java"
      require "socket"
      import "java.util.concurrent.Executors"
      pool = Executors.new_fixed_thread_pool(2)
      server = TCPServer.new(3000)
      loop do
        # only hand off a connection when a pool thread is free
        if pool.active_count < pool.maximum_pool_size
          client = server.accept
          pool.execute do
            sleep 1
            client.puts Time.now
            client.close
          end
        end
      end
  20. jp@oeuf:~$ ./tcpping.rb 1
      Throughput: 0.9881 req/sec
      Average: 1.0116 secs
      Max: 1.0116 secs
      Min: 1.0116 secs
      Total: 1.0119 secs
  21. jp@oeuf:~$ ./tcpping.rb 2
      Throughput: 1.972 req/sec
      Average: 1.0133 secs
      Max: 1.0138 secs
      Min: 1.0129 secs
      Total: 1.014 secs
  22. jp@oeuf:~$ ./tcpping.rb 4
      Throughput: 1.9858 req/sec
      Average: 1.5077 secs
      Max: 2.0119 secs
      Min: 1.0042 secs
      Total: 2.0142 secs
  23. jp@oeuf:~$ ./tcpping.rb 25
      Throughput: 1.9157 req/sec
      Average: 6.7896 secs
      Max: 13.0433 secs
      Min: 1.0059 secs
      Total: 13.0499 secs
  24. require "java"
      require "socket"
      import "java.util.concurrent.Executors"
      size = ARGV[0].to_i
      pool = Executors.new_fixed_thread_pool(size)
      server = TCPServer.new(3000)
      loop do
        if pool.active_count < pool.maximum_pool_size
          client = server.accept
          pool.execute do
            sleep 1
            client.puts Time.now
            client.close
          end
        end
      end
  25. jp@oeuf:~$ jruby ./thread_pool_server.rb 25

  26. jp@oeuf:~$ ./tcpping.rb 25
      Throughput: 24.1932 req/sec
      Average: 1.023 secs
      Max: 1.028 secs
      Min: 1.0191 secs
      Total: 1.0332 secs
  27. jp@oeuf:~$ ./tcpping.rb 50
      Throughput: 24.6 req/sec
      Average: 1.5127 secs
      Max: 2.0238 secs
      Min: 1.0056 secs
      Total: 2.0324 secs
  28. jp@oeuf:~$ ./tcpping.rb 100
      Throughput: 24.7026 req/sec
      Average: 2.5089 secs
      Max: 4.0256 secs
      Min: 1.0031 secs
      Total: 4.048 secs
  29. listen() Backlog

  30. socket() bind() listen() accept()

  31. int listen(int sockfd, int backlog)

  32. require "socket"
      server = TCPServer.new(3000)
      server.listen(1024)
      loop do
        client = server.accept
        sleep 1
        client.puts Time.now
        client.close
      end
  33. client sends SYN J (connect called); server creates an entry on the
      listen queue and replies SYN K, ACK J+1; connect returns and the
      client considers the socket open; client sends ACK K+1, after which
      the connection is acceptable by the server process
  34. incoming connections are enqueued and passed to accept() one at

    a time, first in, first out
  35. different application servers call accept() with different frequencies

  36. Unicorn workers each call accept() on the queue and work
      connections directly from it
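
A minimal sketch of that preforking pattern (illustrative; not Unicorn's actual code): several processes share a single listen socket, and each calls accept() on it directly, so the kernel hands every queued connection to exactly one waiting worker.

    # Toy preforking server: each forked worker accepts from the shared socket.
    require "socket"

    server = TCPServer.new(3000)

    4.times do
      fork do
        loop do
          client = server.accept # blocks until the kernel dequeues a connection
          client.puts Process.pid
          client.close
        end
      end
    end

    Process.waitall
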
  37. Puma will call accept() and pass connections to its worker thread
      pool
  38. Thin, via EventMachine, will call accept() 10 times per reactor

    tick and place connections into a separate queue
  39. these backlogs could represent the majority of your request queue

    time
  40. this is generally invisible to your application as it is
      happening underneath it, in the kernel or application server
  41. CONCURRENCY

  42. nginx thin load balancer

  43. assuming Rack::Lock middleware; i.e., Rails’ config.threadsafe! commented out.

  44. elapsed wall clock time: 400ms. A single thin serves eight queued
      50ms requests one at a time; response time + latency for each
      request in the queue: 50ms, 100ms, 150ms, 200ms, 250ms, 300ms,
      350ms, 400ms
  45. elapsed wall clock time: 100ms. Four unicorn workers serve the same
      eight 50ms requests two apiece; response time + latency: 50ms for
      the first four requests, 100ms for the remaining four
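
The arithmetic behind these two pictures can be sketched directly; as an illustration (not from the deck), with n workers each taking a fixed service time, the request at zero-based queue position i completes after (i / n + 1) * service_time:

    # Completion time for each of `requests` queued jobs spread across
    # `workers` identical workers, each job taking `service_time` seconds.
    def completion_times(requests, workers, service_time)
      Array.new(requests) { |i| (i / workers + 1) * service_time }
    end

    completion_times(8, 1, 0.05) # thin, slide 44: 0.05, 0.10, ... 0.40
    completion_times(8, 4, 0.05) # unicorn, slide 45: 0.05 four times, then 0.10 four times
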
  46. RESPONSE TIME VARIANCE

  47. a Rails request queue with high response time variance: 50ms,
      130ms, 20ms, 9.5secs, 340ms, 9ms, 2.3secs, 693ms
  48. the same queue, showing response time + latency as each request
      waits behind the others: 50ms, 180ms, 200ms, 9.7secs, 10secs,
      10secs, 12.3secs, 13secs
  49. load balancer algorithms

  50. leastconns or bybusyness

  51. def balance(connection)
        instance = instances.sort_by(&:connection_count).first
        begin
          instance.connection_count += 1
          instance.process(connection)
        ensure
          instance.connection_count -= 1
        end
      end
  52. round-robin or least recently used

  53. def balance(connection)
        @processed ||= 0
        instance = instances[@processed % instances.count]
        instance.process(connection)
      ensure
        @processed += 1
      end
  54. Random

  55. def balance(connection)
        instances.sample.process(connection)
      end

  56. your queue distribution and worker utilization will be in large

    part dictated by the dealing strategy
  57. no algorithm will compensate for an unstable distribution in
      response time
  58. leastconns doesn’t know that all connections aren’t equal

  59. the least recently used worker might still be processing a
      long-running request
  60. Client latency

  61. unicorn load balancer

  62. unicorn load balancer lol, edge network

  63. slow clients could cause a worker to spend time shoveling

    bytes over the wire
  64. work through a buffering intermediary to allow the worker to

    immediately flush the response and move on
  65. TODO

  66. 1. Measure from the outside in

  67. jp@oeuf:~$ ping -c 5 www.stanford.edu
      PING www-v6.stanford.edu (171.67.215.200): 56 data bytes
      64 bytes from 171.67.215.200: icmp_seq=0 ttl=252 time=91.151 ms
      64 bytes from 171.67.215.200: icmp_seq=1 ttl=252 time=96.684 ms
      64 bytes from 171.67.215.200: icmp_seq=2 ttl=252 time=93.497 ms
      64 bytes from 171.67.215.200: icmp_seq=3 ttl=252 time=96.360 ms
      64 bytes from 171.67.215.200: icmp_seq=4 ttl=252 time=92.683 ms
      --- www-v6.stanford.edu ping statistics ---
      5 packets transmitted, 5 packets received, 0.0% packet loss
      round-trip min/avg/max/stddev = 91.151/94.075/96.684/2.138 ms
  68. jp@oeuf:~$ time curl -s http://www.stanford.edu > /dev/null
      real 0m0.394s
      user 0m0.007s
      sys  0m0.011s
  69. Synthetic transactions capture all of the latency
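
Scripted in Ruby, the same outside-in check might look like this minimal sketch (the URL matches the curl slide; everything else is illustrative): time the full HTTP round trip from a vantage point outside your infrastructure, so queueing, network latency, and processing are all captured together.

    # Time an entire HTTP round trip, the scripted version of the curl slide.
    require "net/http"

    start = Time.now
    response = Net::HTTP.get_response(URI("http://www.stanford.edu/"))
    elapsed = Time.now - start

    puts "status=#{response.code} elapsed=#{(elapsed * 1000).round}ms"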

  70. 2. Instrument your actual queueing

  71. None
  72. # Forgive me father for I [@daveyeu] have sinned.
      if defined?(Thin)
        class Thin::Connection
          def process_with_log_connections
            size = backend.instance_variable_get("@connections").size
            Stats.measure("thin.connections", size)
            process_without_log_connections
          end
          alias_method_chain :process, :log_connections
        end
      end
  73. # https://gist.github.com/jpignata/5084567
      class UnicornConnectionMonitor
        def initialize(app, options = {})
          @app = app
          @statsd = options.fetch(:statsd)
        end
        def call(env)
          Raindrops::Linux.tcp_listener_stats("0.0.0.0:3000").each do |_, stats|
            @statsd.measure("unicorn.connections.active", stats.active)
            @statsd.measure("unicorn.connections.queued", stats.queued)
          end
          @app.call(env)
        end
      end
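
Wiring that middleware in might look like this hypothetical config.ru fragment; statsd_client stands in for any object that responds to #measure, which is all the slide's code assumes:

    use UnicornConnectionMonitor, statsd: statsd_client
    run MyApp # placeholder for your Rack application
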
  74. 3. Provision more backends

  75. 4. USE a buffering proxy

  76. 5. SQUASH THE RESPONSE TIME OUTLIERS

  77. Rack::Timeout
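
A sketch of wiring it up (the five-second budget is an illustrative value, and the option name varies across versions of the rack-timeout gem): abort a pathological outlier fast instead of letting it occupy a worker and back up the queue behind it.

    # config.ru
    require "rack-timeout"

    use Rack::Timeout, service_timeout: 5 # seconds
    run MyApp # placeholder for your Rack application
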

  78. 6. Move toward CONCURRENT APPLICATION servers

  79. (hint: you can still use Rails)

  80. THANKS! @jpignata http://tx.pignata.com