Slide 1

REQUEST QUEUEING @jpignata

Slide 2

No content

Slide 3

No content

Slide 4

nginx thin load balancer

Slide 5

nginx thin load balancer X-Request-Start: 1362955863565

Slide 6

# Queue time: the delta between the proxy stamping X-Request-Start
# (milliseconds since the epoch) and the application seeing the request.
now = Time.now
x_request_start = request.headers["HTTP_X_REQUEST_START"]
queue_start = Time.at(x_request_start.to_i / 1000)
queue_time = now - queue_start
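
In a Rails app this calculation would typically live early in the middleware stack. A minimal sketch of such a middleware; the class name and env key are hypothetical, only the calculation comes from the slide:

# Hypothetical Rack middleware around the calculation above.
class QueueTimeLogger
  def initialize(app)
    @app = app
  end

  def call(env)
    if (raw = env["HTTP_X_REQUEST_START"])
      queue_start = Time.at(raw.to_i / 1000) # header holds ms since epoch
      env["queue.time"] = Time.now - queue_start
    end

    @app.call(env)
  end
end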

Slide 7

nginx thin load balancer New Relic

Slide 8

New Relic Agent hooks ActionController::Base#process_action inside Rails and reports to New Relic

Slide 9

Response Time the amount of time required to process a single request

Slide 10

THROUGHPUT the number of requests that can be processed within a unit of time

Slide 11

ARRIVAL RATE the number of requests that are enqueued for processing in a unit of time

Slide 12

Latency the time delay prior to processing a request

Slide 13

CONCURRENCY how many requests can be processed simultaneously

Slide 14

require "socket" server = TCPServer.new(3000) loop do client = server.accept sleep 1 client.puts Time.now client.close end

Slide 15

jp@oeuf:~$ ./tcpping.rb 1
Throughput: 0.9974 req/sec
Average: 1.0023 secs
Max: 1.0023 secs
Min: 1.0023 secs
Total: 1.0026 secs
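
tcpping.rb itself never appears in the deck. A plausible sketch of such a client (every detail here is an assumption): open the requested number of concurrent connections to localhost:3000, wait for each reply, and report the same statistics:

#!/usr/bin/env ruby
# Hypothetical reconstruction of tcpping.rb. Opens N connections in
# parallel, times each round trip, and prints aggregate stats.
require "socket"
require "thread"

count = ARGV[0].to_i
durations = Queue.new
started = Time.now

threads = Array.new(count) do
  Thread.new do
    t0 = Time.now
    socket = TCPSocket.new("localhost", 3000)
    socket.gets # block until the server writes its response line
    socket.close
    durations << Time.now - t0
  end
end

threads.each(&:join)
total = Time.now - started
samples = Array.new(count) { durations.pop }

puts "Throughput: #{(count / total).round(4)} req/sec"
puts "Average: #{(samples.inject(:+) / count).round(4)} secs"
puts "Max: #{samples.max.round(4)} secs"
puts "Min: #{samples.min.round(4)} secs"
puts "Total: #{total.round(4)} secs"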

Slide 16

jp@oeuf:~$ ./tcpping.rb 2
Throughput: 0.9983 req/sec
Average: 1.5024 secs
Max: 2.003 secs
Min: 1.0019 secs
Total: 2.0033 secs

Slide 17

jp@oeuf:~$ ./tcpping.rb 4
Throughput: 0.9982 req/sec
Average: 2.5037 secs
Max: 4.0051 secs
Min: 1.0025 secs
Total: 4.0071 secs

Slide 18

jp@oeuf:~$ ./tcpping.rb 25
Throughput: 0.9985 req/sec
Average: 13.0168 secs
Max: 25.0288 secs
Min: 1.0014 secs
Total: 25.0364 secs

Slide 19

require "java" require "socket" import "java.util.concurrent.Executors" pool = Executors.new_fixed_thread_pool(2) server = TCPServer.new(3000) loop do if pool.active_count < pool.maximum_pool_size client = server.accept pool.execute do sleep 1 client.puts Time.now client.close end end end

Slide 20

jp@oeuf:~$ ./tcpping.rb 1
Throughput: 0.9881 req/sec
Average: 1.0116 secs
Max: 1.0116 secs
Min: 1.0116 secs
Total: 1.0119 secs

Slide 21

jp@oeuf:~$ ./tcpping.rb 2
Throughput: 1.972 req/sec
Average: 1.0133 secs
Max: 1.0138 secs
Min: 1.0129 secs
Total: 1.014 secs

Slide 22

jp@oeuf:~$ ./tcpping.rb 4
Throughput: 1.9858 req/sec
Average: 1.5077 secs
Max: 2.0119 secs
Min: 1.0042 secs
Total: 2.0142 secs

Slide 23

jp@oeuf:~$ ./tcpping.rb 25
Throughput: 1.9157 req/sec
Average: 6.7896 secs
Max: 13.0433 secs
Min: 1.0059 secs
Total: 13.0499 secs

Slide 24

require "java" require "socket" import "java.util.concurrent.Executors" size = ARGV[0].to_i pool = Executors.new_fixed_thread_pool(size) server = TCPServer.new(3000) loop do if pool.active_count < pool.maximum_pool_size client = server.accept pool.execute do sleep 1 client.puts Time.now client.close end end end

Slide 25

jp@oeuf:~$ jruby ./thread_pool_server.rb 25

Slide 26

jp@oeuf:~$ ./tcpping.rb 25
Throughput: 24.1932 req/sec
Average: 1.023 secs
Max: 1.028 secs
Min: 1.0191 secs
Total: 1.0332 secs

Slide 27

jp@oeuf:~$ ./tcpping.rb 50
Throughput: 24.6 req/sec
Average: 1.5127 secs
Max: 2.0238 secs
Min: 1.0056 secs
Total: 2.0324 secs

Slide 28

jp@oeuf:~$ ./tcpping.rb 100
Throughput: 24.7026 req/sec
Average: 2.5089 secs
Max: 4.0256 secs
Min: 1.0031 secs
Total: 4.048 secs

Slide 29

listen() Backlog

Slide 30

socket() bind() listen() accept()
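
Ruby's TCPServer.new performs the first three of these calls for you. A sketch of the same sequence made explicitly with the stdlib Socket API:

require "socket"

socket = Socket.new(:INET, :STREAM)        # socket()
socket.bind(Addrinfo.tcp("0.0.0.0", 3000)) # bind()
socket.listen(1024)                        # listen(), backlog of 1024
client, _addrinfo = socket.accept          # accept() pops one queued connection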

Slide 31

int listen(int sockfd, int backlog)

Slide 32

require "socket" server = TCPServer.new(3000) server.listen(1024) loop do client = server.accept sleep 1 client.puts Time.now client.close end

Slide 33

Client sends SYN J (connect is called); the server creates an entry on the listen queue.
The server replies SYN K, ACK J+1; connect returns and the client considers the socket open.
The client sends ACK K+1; the connection is now acceptable by the server process.

Slide 34

incoming connections are enqueued and passed to accept() one at a time, first in, first out

Slide 35

different application servers call accept() with different frequencies

Slide 36

Unicorn workers each call accept() on the queue and work connections directly from it

Slide 37

Puma will call accept() and pass connections to its worker thread pool

Slide 38

Thin, via EventMachine, will call accept() 10 times per reactor tick and place connections into a separate queue

Slide 39

these backlogs could represent the majority of your request queue time

Slide 40

this is generally invisible to your application as it is happening underneath it, in the kernel or application server

Slide 41

CONCURRENCY

Slide 42

nginx thin load balancer

Slide 43

assuming Rack::Lock middleware; i.e., Rails’ config.threadsafe! commented out.
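
Rack::Lock is why a single thin handles only one request at a time here: it serializes the whole app behind a mutex. A stripped-down sketch of the idea (not the actual Rack source):

require "thread"

# Requests serialize on one mutex; effective concurrency drops to one.
class NaiveLock
  def initialize(app)
    @app = app
    @mutex = Mutex.new
  end

  def call(env)
    @mutex.synchronize { @app.call(env) }
  end
end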

Slide 44

elapsed wall clock time: 400ms. Eight queued 50ms requests drain through a single thin one at a time; response time + latency for successive requests climbs: 50ms, 100ms, 150ms, 200ms, 250ms, 300ms, 350ms, 400ms.

Slide 45

elapsed wall clock time: 100ms. The same eight 50ms requests dealt across four unicorn workers, two deep each; response time + latency: 50ms for the first four requests, 100ms for the second four.

Slide 46

RESPONSE TIME VARIANCE

Slide 47

Request queue in front of a single Rails backend: eight requests with wildly varying response times of 50ms, 130ms, 20ms, 9.5secs, 340ms, 9ms, 2.3secs, and 693ms.

Slide 48

The same queue showing cumulative response time + latency, as each request waits behind the ones ahead of it: 50ms, 180ms, 200ms, 9.7secs, 10secs, 10secs, 12.3secs, 13secs.

Slide 49

load balancer algorithms

Slide 50

leastconns or bybusyness

Slide 51

def balance(connection)
  instance = instances.sort_by(&:connection_count).first

  begin
    instance.connection_count += 1
    instance.process(connection)
  ensure
    instance.connection_count -= 1
  end
end

Slide 52

round-robin or least recently used

Slide 53

def balance(connection)
  @processed ||= 0
  instance = instances[@processed % instances.count]
  instance.process(connection)
ensure
  @processed += 1
end

Slide 54

Random

Slide 55

def balance(connection)
  instances.sample.process(connection)
end

Slide 56

your queue distribution and worker utilization will be in large part dictated by the dealing strategy

Slide 57

no algorithm will compensate for an unstable distribution in response time

Slide 58

leastconns doesn’t know that all connections aren’t equal

Slide 59

the least recently used worker might still be processing a long-running request

Slide 60

Client latency

Slide 61

unicorn load balancer

Slide 62

unicorn load balancer lol, edge network

Slide 63

slow clients could cause a worker to spend time shoveling bytes over the wire

Slide 64

work through a buffering intermediary to allow the worker to immediately flush the response and move on

Slide 65

TODO

Slide 66

1. Measure from the outside in

Slide 67

jp@oeuf:~$ ping -c 5 www.stanford.edu
PING www-v6.stanford.edu (171.67.215.200): 56 data bytes
64 bytes from 171.67.215.200: icmp_seq=0 ttl=252 time=91.151 ms
64 bytes from 171.67.215.200: icmp_seq=1 ttl=252 time=96.684 ms
64 bytes from 171.67.215.200: icmp_seq=2 ttl=252 time=93.497 ms
64 bytes from 171.67.215.200: icmp_seq=3 ttl=252 time=96.360 ms
64 bytes from 171.67.215.200: icmp_seq=4 ttl=252 time=92.683 ms
--- www-v6.stanford.edu ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 91.151/94.075/96.684/2.138 ms

Slide 68

jp@oeuf:~$ time curl -s http://www.stanford.edu > /dev/null

real    0m0.394s
user    0m0.007s
sys     0m0.011s

Slide 69

Synthetic transactions capture all of the latency

Slide 70

2. Instrument your actual queueing

Slide 71

No content

Slide 72

# Forgive me father for I [@daveyeu] have sinned.
if defined?(Thin)
  class Thin::Connection
    def process_with_log_connections
      size = backend.instance_variable_get("@connections").size
      Stats.measure("thin.connections", size)
      process_without_log_connections
    end

    alias_method_chain :process, :log_connections
  end
end

Slide 73

# https://gist.github.com/jpignata/5084567
class UnicornConnectionMonitor
  def initialize(app, options = {})
    @app = app
    @statsd = options.fetch(:statsd)
  end

  def call(env)
    Raindrops::Linux.tcp_listener_stats("0.0.0.0:3000").each do |_, stats|
      @statsd.measure("unicorn.connections.active", stats.active)
      @statsd.measure("unicorn.connections.queued", stats.queued)
    end

    @app.call(env)
  end
end
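
Mounting the middleware is then a one-liner in the rackup file. A sketch; Stats and MyApp::Application are placeholders, and any client responding to measure(name, value) will do:

# config.ru (sketch; Stats and MyApp::Application are placeholders)
use UnicornConnectionMonitor, statsd: Stats
run MyApp::Application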

Slide 74

3. Provision more backends

Slide 75

4. USE a buffering proxy

Slide 76

5. SQUASH THE RESPONSE TIME OUTLIERS

Slide 77

Rack::Timeout
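
A sketch of wiring it up, using the gem's configuration style from around the time of this talk (newer rack-timeout versions set a service timeout on the middleware instead):

# Gemfile
gem "rack-timeout"

# config/initializers/rack_timeout.rb
# Abort requests that run longer than five seconds so a slow outlier
# can't hold a worker, and everyone queued behind it, hostage.
Rack::Timeout.timeout = 5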

Slide 78

6. Move toward CONCURRENT APPLICATION servers

Slide 79

(hint: you can still use Rails)
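
For instance, a threaded server such as Puma can run a Rails app once the app is thread-safe, which is what slide 43's config.threadsafe! controls. A minimal sketch with illustrative values:

# config/puma.rb (illustrative values)
workers 2     # forked worker processes
threads 8, 8  # min/max threads per worker; requires thread-safe code

# config/environments/production.rb (Rails 3.x era)
config.threadsafe!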

Slide 80

THANKS! @jpignata http://tx.pignata.com