Containers and Microservices make performance worse

A snarky but mainly light-hearted look at how introducing microservices and containers into your infrastructure might just make it less performant if you don't think about the new challenges.

Presented at the London Web Performance meetup

Gareth Rushgrove

August 04, 2015

Transcript

  1. Containers and Microservices: Make everything performance worse. Gareth Rushgrove, Puppet Labs

  2. Gareth Rushgrove @garethr

  3. Gareth Rushgrove

  4. Gareth Rushgrove

  5. Quiz Time

  6. Apologies to the following. I love you really.

  7. This Talk

  8. Gareth Rushgrove

  9. snark /sna:k/ noun: make snide and sharply critical comments

  10. Fast in-process communication vs slow and unreliable HTTP

  11. The performance cost of an overlay network

  12. What happened to ps and friends?

  13. Microservices and Performance: It will get bad before it gets good

  14. Monoliths are pretty easy to understand at a high level

  15. [Diagram: Monolith]

  16. Even adding in the supporting infrastructure it’s not so bad

  17. [Diagram: Monolith, Database, Load Balancer, 2 x Network]

  18. I can optimise the network, the load balancer, the database, the application or the client. Easy.

  19. What about Microservices? Well, first start with a diagram like this…

  20. [Diagram: 8 x Microservice]

  21. Whatever protocol you’re using, let’s assume you’re communicating over the network…

  22. [Diagram: 8 x Microservice, 8 x Network]

  23. And you probably want more than one instance of each service…

  24. [Diagram: Microservice, Load Balancer, Network]

  25. [Diagram: 8 x Microservice, 8 x Network, 6 x (Load Balancer, Network)]

  26. And you probably have a few different databases now too…

  27. [Diagram: 8 x Microservice, 8 x Network, 6 x (Load Balancer, Network), 2 x (Database, Network)]

  28. The Bad: In my made-up 8-service architecture we went from 5 things to optimise up to 32
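
The jump from 5 to 32 can be checked by tallying the boxes in the monolith diagram and in the final microservices diagram. The sketch below is only that tally: the dictionary layout and the name `count_components` are illustrative assumptions, while the box counts are read off the slides.

```python
# Box counts read off the monolith and microservices diagrams in the slides.
monolith = {"monolith": 1, "database": 1, "load_balancer": 1, "network": 2}

microservices = {
    "microservice": 8,
    "service_network": 8,        # one network box drawn per service
    "load_balancer": 6,
    "load_balancer_network": 6,  # one network box drawn per load balancer
    "database": 2,
    "database_network": 2,       # one network box drawn per database
}


def count_components(diagram):
    """Return the total number of boxes (things to optimise) in a diagram."""
    return sum(diagram.values())


print(count_components(monolith))       # 5
print(count_components(microservices))  # 32
```
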
  29. The Bad: We went from 3 network hops to, er, more depending on the request

  30. The Bad: We ignored the cost of serialisation/deserialisation (JSON can be expensive)
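
One rough way to see the serialisation cost is to time a plain in-process call against a JSON encode/decode round trip of the same data. This is a minimal sketch, not a benchmark from the talk: the payload shape, its size and the function names are assumptions, and the absolute numbers will vary by machine.

```python
import json
import timeit

# Hypothetical payload; the shape and size are assumptions for illustration.
payload = {
    "user_id": 42,
    "items": [{"sku": f"sku-{i}", "qty": i % 5} for i in range(1000)],
}


def in_process(data):
    """Monolith-style call: the callee receives the object directly."""
    return data


def round_trip(data):
    """Service-style call: serialise on one side, deserialise on the other."""
    return json.loads(json.dumps(data))


if __name__ == "__main__":
    n = 200
    direct = timeit.timeit(lambda: in_process(payload), number=n)
    via_json = timeit.timeit(lambda: round_trip(payload), number=n)
    print(f"in-process:      {direct / n * 1e6:.1f} us per call")
    print(f"JSON round trip: {via_json / n * 1e6:.1f} us per call")
```

The round trip here excludes the network entirely, so it is a lower bound on what a real service call adds.
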
  31. The Bad: The operational overhead just jumped considerably

  32. The Bad: Lots more network traffic. Watch out for latency in particular

  33. The Bad: Without request tracing you’re doomed
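
At its simplest, request tracing means minting an ID at the edge and forwarding it on every downstream call so that log lines from different services can be correlated. The sketch below shows only that idea. The header name `X-Request-ID` is a common convention assumed here, not something named in the talk, and all function names are hypothetical.

```python
import uuid

TRACE_HEADER = "X-Request-ID"  # assumed header name; a common convention


def ensure_trace_id(headers):
    """Return a copy of headers that carries a trace ID, minting one if absent."""
    out = dict(headers)
    out.setdefault(TRACE_HEADER, uuid.uuid4().hex)
    return out


def outgoing_headers(incoming, extra=None):
    """Build headers for a downstream call, forwarding the caller's trace ID."""
    headers = dict(extra or {})
    headers[TRACE_HEADER] = incoming[TRACE_HEADER]
    return headers


def log_line(service, headers, message):
    """Format a log line prefixed with the trace ID so hops can be correlated."""
    return f"[{headers[TRACE_HEADER]}] {service}: {message}"


edge = ensure_trace_id({})  # request arrives at the edge without an ID
downstream = outgoing_headers(edge, {"Accept": "application/json"})
print(log_line("frontend", edge, "received request"))
print(log_line("orders", downstream, "handling call from frontend"))
```

Both log lines carry the same ID, which is what lets one request be followed across services.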

  34. The Good: Granular services are easier to optimise individually

  35. The Good: Individual services can be scaled independently

  36. Debugging with Containers: Is that process inside or outside a container?

  37. Problems with free and top

  38. Can a container use that memory?

      $ free
                   total       used       free     shared    buffers     cached
      Mem:       1024444     864140     160304       5024      50008     637736
      -/+ buffers/cache:     176396     848048
      Swap:       473084         16     473068

      $ docker exec test-container free
                   total       used       free     shared    buffers     cached
      Mem:       1024444     866440     158004       5024      50000     637732
      -/+ buffers/cache:     178708     845736
      Swap:       473084         16     473068

  39. Memory stats come from the proc filesystem: /proc/meminfo, /proc/vmstat, etc.

  40. /proc/meminfo and /proc/vmstat are not aware of cgroups
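
Because /proc/meminfo reports the host-wide figure, a process that wants its own limit has to read the cgroup filesystem instead. The sketch below assumes the usual mount point `/sys/fs/cgroup` and the standard file names for cgroup v1 (`memory/memory.limit_in_bytes`) and cgroup v2 (`memory.max`). The function names are illustrative, and actual mount layouts vary between systems.

```python
from pathlib import Path

# Candidate files, relative to the cgroup mount point. The first is the
# cgroup v1 memory controller, the second is cgroup v2.
LIMIT_FILES = ("memory/memory.limit_in_bytes", "memory.max")


def cgroup_memory_limit(root="/sys/fs/cgroup"):
    """Return the cgroup memory limit in bytes, or None if unset or unreadable."""
    for rel in LIMIT_FILES:
        try:
            raw = (Path(root) / rel).read_text().strip()
        except OSError:
            continue  # file missing under this layout, try the next one
        if raw == "max":  # cgroup v2 spelling of "no limit"
            return None
        return int(raw)
    return None


def meminfo_total(path="/proc/meminfo"):
    """Return MemTotal from a meminfo file in bytes (the host-wide figure)."""
    for line in Path(path).read_text().splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024  # value is reported in kB
    raise ValueError("MemTotal not found")
```

Note that cgroup v1 reports "no limit" as a very large number rather than a keyword, so a caller would typically take the smaller of the two values.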

  41. Problems with ps

  42. Is this process in a container?

      $ ps aux
      USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
      ...
      999       1807  0.2 11.4 867624 464572 ?        Ssl  09:38   0:21 mysqld

  43. Which container is that?

      $ ps -eo ucmd,cgroup
      COMMAND         CGROUP
      ...
      mysqld          9:perf_event:/docker/61e76d2c39121282474ff895b9b3ba2addd775cdea6d2ba89ce76c28
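
The cgroup column shown by ps comes from /proc/<pid>/cgroup, so the container ID can be pulled out programmatically. This is a minimal sketch that assumes the `/docker/<hex id>` path layout seen in the output above. Other runtimes and other Docker configurations lay out their cgroup paths differently, and the function names are illustrative.

```python
import re

# Matches the cgroup path layout shown in the ps output: /docker/<hex id>.
DOCKER_PATH = re.compile(r"/docker/([0-9a-f]+)")


def container_id_from_cgroup(cgroup_text):
    """Return the Docker container ID found in /proc/<pid>/cgroup text, or None."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-id:controllers:path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = DOCKER_PATH.search(parts[2])
        if match:
            return match.group(1)
    return None


def container_id_for_pid(pid):
    """Read /proc/<pid>/cgroup on a Linux host and extract the container ID."""
    with open(f"/proc/{pid}/cgroup") as handle:
        return container_id_from_cgroup(handle.read())
```

The first 12 characters of the result can then be matched against the container IDs listed by `docker ps`.
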
  44. Sysdig

  45. Sysdig provides a kernel module, which hooks into cgroups and namespaces

  46. CPU usage across containers

      $ sudo sysdig -c topcontainers_cpu
      CPU%      container.name
      -----------------------------------------------------------------------
      90.13%    mysql
      15.93%    wordpress1
      7.27%     haproxy
      3.46%     wordpress2

  47. CPU usage in a single container

      $ sudo sysdig -pc -c topprocs_cpu container.name=client
      CPU%      Process     container.name
      ----------------------------------------------
      02.69%    bash        client
      31.04%    curl        client
      0.74%     sleep       client

  48. Network bandwidth

      $ sudo sysdig -pc -c topprocs_net
      Bytes      Process      Host_pid   Container_pid   container.name
      ---------------------------------------------------------------
      72.06KB    haproxy      7385       13              haproxy
      56.96KB    docker.io    1775       7039            host
      44.45KB    mysqld       6995       91              mysql
      44.45KB    mysqld       6995       99              mysql
      29.36KB    apache2      7893       124             wordpress1
      29.36KB    apache2      26895      126             wordpress4
      29.36KB    apache2      26622      131             wordpress2
      29.36KB    apache2      27935      132             wordpress3
      29.36KB    apache2      27306      125             wordpress4
      22.23KB    mysqld       6995       90              mysqlclient

  49. Traffic between containers

      $ sudo sysdig -pc -A -c echo_fds "fd.ip=172.17.0.3 and fd.ip=172.17.0.7"
      ------ Write 103B to [haproxy] [d468ee81543a] 172.17.0.7:37557->172.17.0.3:80 (hapr
      GET / HTTP/1.1
      User-Agent: curl/7.35.0
      Host: 172.17.0.7
      Accept: */*
      X-Forwarded-For: 172.17.0.8

      ------ Read 103B from [wordpress1] [12b8c6a04031] 172.17.0.7:37557->172.17.0.3:80 (
      GET / HTTP/1.1
      User-Agent: curl/7.35.0
      Host: 172.17.0.7
      Accept: */*
      X-Forwarded-For: 172.17.0.8

      ------ Write 346B to [wordpress1] [12b8c6a04031] 172.17.0.7:37557->172.17.0.3:80 (a
      HTTP/1.1 302 Found
      Date: Sat, 21 Feb 2015 22:19:18 GMT

  50. The Bad: Don’t expect existing debugging tools to work

  51. The Good: New tools are emerging, often with better interfaces

  52. Container Overhead: Count the performance penalties

  53. Gareth Rushgrove

  54. The Good: Containers add very little overhead

  55. Gareth Rushgrove

  56. Gareth Rushgrove

  57. The Bad: Memory cgroups can be expensive

  58. "By default, the memory subsystem uses 40 bytes of memory per physical page on x86_64 systems. These resources are consumed even if memory is not used in any hierarchy." Source: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-memory.html
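
To put the 40-bytes-per-page figure in context, the arithmetic below converts it to an absolute overhead for a given amount of RAM. It assumes the usual 4096-byte x86_64 page size and ignores huge pages. The function name and the example RAM sizes are illustrative, not from the talk.

```python
PAGE_SIZE = 4096     # bytes; usual x86_64 page size (assumption, huge pages ignored)
BYTES_PER_PAGE = 40  # figure quoted from the Red Hat guide

GIB = 1024 ** 3
MIB = 1024 ** 2


def memcg_overhead_bytes(ram_bytes):
    """Bookkeeping memory implied by 40 bytes per physical page."""
    return (ram_bytes // PAGE_SIZE) * BYTES_PER_PAGE


for ram_gib in (1, 64):
    overhead = memcg_overhead_bytes(ram_gib * GIB)
    print(f"{ram_gib} GiB RAM -> {overhead / MIB:.0f} MiB overhead")
# 1 GiB RAM -> 10 MiB overhead
# 64 GiB RAM -> 640 MiB overhead
```

That works out to 40 / 4096, or just under 1% of physical memory.
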
  59. The Bad: Container networking is hard. Overlay networks make it easy. But slow.

  60. http://www.generictestdomain.net/docker/weave/networking/stupidity/2015/04/05/weave-is-kinda-slow/

  61. https://arjanschaaf.github.io/is-the-network-the-limit/

  62. The Good

  63. $ qperf between two EC2 c3.8xlarge with 10Gb/s
      tcp_bw:  bw = 1.2 GB/sec
      udp_lat: latency = 48.1 us

      $ qperf over weave network using ODP/VXLAN
      tcp_bw:  bw = 1.09 GB/sec
      udp_lat: latency = 61.9 us
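
Expressed as relative changes, the two qperf runs above look like this. The numbers are the ones on the slide, and the helper name `percent_change` is illustrative.

```python
def percent_change(baseline, measured):
    """Relative change from baseline to measured, in percent."""
    return (measured - baseline) / baseline * 100


bandwidth = percent_change(1.2, 1.09)  # GB/sec, direct vs weave
latency = percent_change(48.1, 61.9)   # microseconds, direct vs weave

print(f"bandwidth: {bandwidth:+.1f}%")  # bandwidth: -9.2%
print(f"latency:   {latency:+.1f}%")    # latency:   +28.7%
```

So in this measurement the overlay costs roughly 9% of bandwidth and adds roughly 29% to UDP latency.
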
  64. The Bad: Linux containers don’t really contain

  65. Gareth Rushgrove

  66. The Bad: To get strict isolation guarantees you’re going to wrap them in virtual machines anyway

  67. The Good: Projects like Clear Linux from Intel are innovating in this space

  68. The Good: User namespaces (host/container separation) and seccomp (limit syscalls) are coming to Docker

  69. Conclusions: Snark free zone

  70. Containers and microservices pose new performance challenges

  71. Most problems aren’t performance problems

  72. If you’re not already a networking expert, start learning now

  73. Lots of opportunities for new tooling to improve things

  74. Questions? And thanks for listening