
Twemproxy


twemproxy (pronounced "two-em-proxy"), aka nutcracker, is a fast and lightweight proxy for the memcached and redis protocols. It was built primarily to reduce the connection count on the backend caching servers. Twemproxy was created within Twitter, initially to support Memcached, with Redis support added 4 months ago.

Justin Mares

April 19, 2013


Transcript

  1. Deployed as Remote Proxy

    Diagram: m client connections are funneled through the proxy into n persistent server connections, with m >> n.
  2. Fault Tolerance with Remote Proxy

    Diagram: the same fan-in (m client connections, n server connections, m >> n), shown with redundant proxy instances so no single proxy is a point of failure.
  3. Features

    Fast and lightweight. Persistent server connections. Protocol pipelining. Shards data automatically across multiple servers. Supports multiple cache pools simultaneously. Supports ketama, aka consistent hashing. Configuration through a YAML file. Disables nodes on failures.
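A minimal pool definition showing several of these features. The pool name, addresses, and values here are made up for illustration; the keys are twemproxy's documented YAML config keys:

```yaml
alpha:                        # hypothetical pool name
  listen: 127.0.0.1:22121
  hash: fnv1a_64
  distribution: ketama        # consistent hashing
  auto_eject_hosts: true      # disable nodes on failures
  redis: true
  servers:                    # data is sharded across these
    - 127.0.0.1:6379:1
    - 127.0.0.1:6380:1
```

A second pool block alongside this one is all it takes to serve multiple cache pools simultaneously.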
  4. Pipelining

    Diagram: client requests "get k1", "get k2", "delete k3" arrive separately at NUTCRACKER and are pipelined together over one server connection. Tradeoff: latency for throughput.
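A sketch of what pipelining buys (illustrative Python, not twemproxy's C): several protocol commands are coalesced into one wire buffer, so the proxy issues a single write on its persistent server connection instead of one per request.

```python
def pipeline(commands):
    """Join CRLF-terminated protocol commands into one wire buffer."""
    return b"".join(c.encode() + b"\r\n" for c in commands)

# One buffer -> one write() instead of three separate round trips.
wire = pipeline(["get k1", "get k2", "delete k3"])
```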
  5. Fault Tolerance

    Diagram (timeline t0, t1, t2): the client sends "get key" through the proxy; when the server connection fails, the proxy answers "-ERR Connection refused\r\n" or "-ERR Connection timed out\r\n" => client retries on the -ERR response!
  6. Client retries on -ERR responses

    -ERR Connection refused\r\n (errno = ECONNREFUSED)
    -ERR Invalid argument\r\n (errno = EINVAL)
    -ERR Connection timed out\r\n (errno = ETIMEDOUT)
    -ERR Host is down\r\n (errno = EHOSTDOWN)
    -ERR Connection reset by peer\r\n (errno = ECONNRESET)
    ...
  7. Retries with slow server

    Diagram (timeline t0 ... tn): client-timeout = 30 ms against a stuck server whose 99p latency is > 30 ms; every timed-out "get key" is retried, stacking more copies onto the slow server => the outstanding request queue keeps growing.
  8. Solution: Use "timeout:" config

    Diagram (timeline t0, t1, t2): 1. The proxy times out the request and answers "-ERR connection timed out". 2. Set timeout: to the expected 99p or 999p latency. 3. Set the client-side timeout > timeout:. => The outstanding request queue stays bounded!
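As a config sketch (the pool name and listen address are made up; timeout: is in milliseconds):

```yaml
alpha:              # hypothetical pool
  listen: 127.0.0.1:22121
  timeout: 400      # ~ backend 99p/999p latency; the proxy answers
                    # "-ERR Connection timed out" past this.
                    # Keep the client library's own timeout strictly larger.
```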
  9. Re-routing on failures

    Diagram: proxies P route across servers S1 ... S6; each proxy computes a global view from local information, so when a server is ejected the ring shrinks (ring-size = 6 -> ring-size = 5) and its keys are re-routed.
  10. Re-routing on failures

    Diagram: back off an ejected server by server_retry_timeout:; retry a new request on the ejected server only after server_retry_timeout: time has elapsed. Config shown: redis-cache-pool with auto_eject_hosts: true, server_retry_timeout: 30000, server_failure_limit: 3, timeout: 400
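Laid out as a YAML fragment, the slide's pool definition reads (the comments are my gloss):

```yaml
redis-cache-pool:
  auto_eject_hosts: true       # temporarily eject a failing server
  server_retry_timeout: 30000  # ms to back off before retrying it
  server_failure_limit: 3      # consecutive failures before ejection
  timeout: 400                 # per-request proxy timeout in ms
```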
  11. Succeeding on failures

    Diagram: server_failure_limit: 3, client retries: 3. t0: server S dies. t1: the query is re-routed to S' after server_failure_limit tries. => client-side retries >= server_failure_limit for success => set a TTL on items.
  12. Simultaneous failures

    Diagram: servers S1 ... S6 with several failing at once; each proxy P ends up with a different view (cluster-size = 5? cluster-size = 4? cluster-size = 3?), so the global view is fragmented. => Solution: a large server_retry_timeout:
  13. hash_tag: "{}"

    Command with multiple keys? E.g. MGET foo bar. Solution: split into MGET foo, MGET bar. E.g. SINTER foo bar. Solution: SINTER foo{tag} bar{tag}. Read "notes/redis.md" for details on supported commands.
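A sketch of how a "{}" hash_tag behaves (illustrative, not twemproxy's actual code): only the substring between the braces is hashed, so keys sharing a tag land on the same shard. The shard function here is a stand-in (CRC32 mod n), not ketama.

```python
import zlib

def hash_tag_key(key, open_tag="{", close_tag="}"):
    """Return the part of the key that gets hashed."""
    i = key.find(open_tag)
    if i >= 0:
        j = key.find(close_tag, i + 1)
        if j > i + 1:
            return key[i + 1:j]   # hash only the tag contents
    return key                    # no usable tag: hash the whole key

def shard(key, n_servers):
    return zlib.crc32(hash_tag_key(key).encode()) % n_servers
```

Because foo{tag} and bar{tag} hash to the same shard, a multi-key command like SINTER can be served by a single backend.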
  14. Deployment Checklist

    Logging enabled at LOG_INFO / LOG_NOTICE for debugging?
    Are the exposed stats being collected?
    Is timeout: set?
    Is the redis pool used as a cache? -> set auto_eject_hosts: true
    Is the redis pool used as a data store? -> set auto_eject_hosts: false
    Is the server_retry_timeout: value reasonable for your application domain?
    Is swap enabled on the redis server? Are you ok with high latency variance?
  15. Deployment Checklist

    Have you tested your setup for resiliency to:
    Permanent failures: kill / reboot machines.
    Transient failures: SIGSTOP redis / drop packets using iptables.
    What's your strategy for updating configuration?
    What is the value of -m (--mbuf-size)? What is your max key length? Is it less than --mbuf-size?
    How many connections is your proxy meant to handle? -> file descriptor limit: "ulimit -n"
    Are you using commands with multiple keys? If so, is hash_tag configured?
  16. Why Twemproxy?

    Persistent server connections: faster client restarts, and a close() from the client is filtered rather than torn through to the server. Protocol pipelining. Enables simple, dumb clients. Hides the semantics of the underlying cache pool. Easy configuration. Automatic sharding and fault-tolerance capability.
  17. Core Event Loop

    for (;;) { wait(&event); process(&event); }
    One event per connection (event : connection = 1:1). Non-blocking I/O. ET vs LT (edge- vs level-triggered readiness). Runs on the main thread.
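A minimal model of that wait/process loop, sketched with Python's stdlib selectors module (level-triggered, i.e. the "LT" of ET vs LT); the names are illustrative, not twemproxy's:

```python
import selectors
import socket

def run_once(sel):
    """One iteration: wait for events, dispatch each fd's handler."""
    for key, _events in sel.select(timeout=1):
        key.data(key.fileobj)          # handler registered with the fd

# usage: a socketpair stands in for one client connection
sel = selectors.DefaultSelector()
client, proxy_side = socket.socketpair()
received = []
sel.register(proxy_side, selectors.EVENT_READ,
             lambda s: received.append(s.recv(64)))
client.sendall(b"get k1\r\n")
run_once(sel)                          # wait(&event); process(&event);
```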
  18. mbuf

    /*
     * mbuf header is at the tail end of the mbuf. This enables us to catch
     * buffer overrun early by asserting on the magic value during get or
     * put operations
     *
     *   <------------- mbuf_chunk_size ------------->
     *   +-------------------------------------------+
     *   |       mbuf data          |  mbuf header   |
     *   |     (mbuf_offset)        | (struct mbuf)  |
     *   +-------------------------------------------+
     *   ^           ^        ^     ^^
     *   |           |        |     ||
     *   \           |        |     |\
     * mbuf->start   \        |     | mbuf->end (one byte past valid bound)
     *           mbuf->pos    \     |
     *                         \    mbuf
     *                          mbuf->last (one byte past valid byte)
     */
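A toy model of that tail-header trick (illustrative values; the header is reduced here to just its 4-byte magic): a write that runs past the data bound tramples the magic, so an assert on get/put catches the overrun early.

```python
import struct

CHUNK_SIZE = 64
MAGIC = 0xDEADBEEF
END = CHUNK_SIZE - 4          # one byte past the valid data bound

def mbuf_get():
    buf = bytearray(CHUNK_SIZE)
    struct.pack_into("<I", buf, END, MAGIC)   # "header" at the tail
    return buf

def magic_ok(buf):
    return struct.unpack_from("<I", buf, END)[0] == MAGIC

buf = mbuf_get()
buf[0:6] = b"get k1"          # normal write inside the data region
ok_before = magic_ok(buf)
buf[END] = 0                  # simulated overrun into the header
ok_after = magic_ok(buf)
```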
  19. Chain of handlers

    Chain of processing handlers: each takes an input message and produces output for the next handler in the chain, like Unix pipes: cat foo.txt | tr -s " " | cut -d " " -f 2
    Filter: manipulates output produced by a handler, possibly short-circuiting the chain.
    Forwarder: chooses one of the backend servers to send the request to.
    Req | Filter-1 | Filter-2 | Forwarder
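The filter/forwarder pipeline can be sketched like this (the filters are hypothetical, for illustration): each handler feeds the next, a filter may answer directly and short-circuit the chain, and otherwise the forwarder picks a backend.

```python
def make_chain(filters, forwarder):
    def run(req):
        for f in filters:
            req, reply = f(req)
            if reply is not None:          # filter short-circuits
                return reply
        return forwarder(req)
    return run

def strip_ws(req):                          # Filter-1: normalize input
    return req.strip(), None

def reject_empty(req):                      # Filter-2: maybe short-circuit
    return (req, None) if req else (req, "-ERR empty request\r\n")

chain = make_chain([strip_ws, reject_empty],
                   forwarder=lambda req: "forwarded: " + req)
```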
  20. Chain of handlers

    Request path: read_event -> req_recv -> req_filter* -> req_forward, then write_event -> req_send toward the server.
    Response path: read_event -> rsp_recv -> rsp_filter* -> rsp_forward, then write_event -> rsp_send toward the client.
    Outstanding requests are tracked in order on a queue: req_1 <- req_2 <- req_3 ...
  21. Stats Collection

    Diagram: the main thread owns a current counter buffer (a, b, ...); a stats thread serves aggregated counters to clients connecting on localhost:22222. Both start zeroed, with a shared flag = 1 requesting collection.
  22. Stats Collection

    Diagram: the main thread increments its current buffer (a = 1, b = 5, ...) while the stats thread's aggregate is still zeroed (flag = 1).
  23. Stats Collection

    Diagram: the main thread notices the flag, swaps its filled buffer (a = 1, b = 5) out for a zeroed one, and clears the flag (flag = 0).
  24. Stats Collection

    Diagram: the stats thread aggregates (+=) the swapped-out values (a = 1, b = 5) into the snapshot it serves and sets flag = 1 again, while the main thread keeps counting in the fresh buffer (a = 4, b = 1).
  25. Stats Collection

    Diagram: on the next swap (flag = 0) the cycle repeats, handing over a = 4, b = 1 to be folded in; <stats> responses on localhost:22222 always read from the stats thread's own copy.
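The buffer-swapping scheme can be sketched as follows (illustrative: a lock stands in for the shared flag, and the names are mine, not twemproxy's): the main thread increments "current"; on each collection the buffers swap and the stats side folds ("+=") the filled one into the aggregate it serves on the stats port.

```python
import threading

class Stats:
    def __init__(self, keys):
        self.current = dict.fromkeys(keys, 0)    # main thread writes here
        self.shadow = dict.fromkeys(keys, 0)     # handed to stats thread
        self.aggregate = dict.fromkeys(keys, 0)  # what <stats> reads
        self.lock = threading.Lock()

    def incr(self, key, n=1):                    # main thread
        with self.lock:
            self.current[key] += n

    def collect(self):                           # stats thread
        with self.lock:                          # "flag" flip + buffer swap
            self.current, self.shadow = self.shadow, self.current
        for k, v in self.shadow.items():         # aggregate the snapshot
            self.aggregate[k] += v
            self.shadow[k] = 0
        return dict(self.aggregate)

# usage mirroring the slides: count, collect, count again, collect
s = Stats(["a", "b"])
s.incr("a"); s.incr("b", 5)
first = s.collect()
s.incr("a", 4); s.incr("b")
second = s.collect()
```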