Kubecon EU 2018

Envoy internals deep dive


Matt Klein

May 03, 2018


  1. Envoy internals deep dive Matt Klein / @mattklein123, Software Engineer

  2. Agenda
    • Envoy design goals
    • Architecture overview
    • Threading model
    • Hot restart
    • Stats
    • Q&A
  3. What is Envoy?
    The network should be transparent to applications. When network and application problems do occur, it should be easy to determine the source of the problem.
  4. Envoy design goals
    • Out of process architecture
    • Low latency, high performance, dev productivity
    • L3/L4 filter architecture
    • HTTP L7 filter architecture
    • HTTP/2 first
    • Service/config discovery
    • Active/passive health checking
    • Advanced load balancing
    • Best in class observability
    • Service/middle/edge proxy
    • Hot restart
  5. Envoy architecture diagram
    [Diagram labels: Worker; Connection; Listener filters; TCP filter manager; TCP read filters; TCP write filters; HTTP conn manager; HTTP codec; HTTP read filters; HTTP write filters; Service router; Upstream conn pool; Backend services; Stats; Admin; Cluster/Listener/Route Manager; xDS API]
  6. Envoy threading model (c10k)
    [Diagram: one thread per connection, compared with one thread whose event loop serves several connections]
    • Connection per thread does not scale
    • Scaling requires many connections per thread: “c10k”
    • This requires asynchronous programming paradigms, which are harder to work with
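The "many connections per thread" idea can be sketched in a few lines. This is a minimal illustration using Python's standard `selectors` module, not Envoy's implementation; the function name `serve_on_one_thread` and the upper-casing echo behavior are assumptions made up for the example.

```python
import selectors
import socket


def serve_on_one_thread(num_connections):
    """Answer one message per connection from a single-threaded event loop."""
    sel = selectors.DefaultSelector()
    clients = []
    for i in range(num_connections):
        server_side, client_side = socket.socketpair()
        server_side.setblocking(False)
        # Register interest in readability; no thread is parked on this socket.
        sel.register(server_side, selectors.EVENT_READ)
        client_side.sendall(f"ping {i}".encode())
        clients.append(client_side)

    pending = num_connections
    while pending:
        # One select() call waits on every registered connection at once.
        for key, _events in sel.select():
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())
            sel.unregister(conn)
            conn.close()
            pending -= 1

    replies = [c.recv(1024) for c in clients]
    for c in clients:
        c.close()
    sel.close()
    return replies


replies = serve_on_one_thread(50)
```

The cost named on the slide shows up even here: the handler logic is driven by readiness events rather than written as straight-line blocking code.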
  7. Envoy threading model (overview)
    [Diagram labels: Main thread; Worker thread(s); File flush thread(s); Listeners; xDS; Runtime; Stat flush; Admin; Connections; Process management]
    • Main thread handles miscellaneous non-data-plane tasks
    • Worker thread(s) are embarrassingly parallel and handle listeners, connections, and proxying
    • File flush threads keep file writes off the worker threads so the workers do not block
    • Designed to be 100% non-blocking
    • Designed to scale to massive parallelism (# of HW threads)
  8. Envoy threading model (RCU)
    • RCU = Read-Copy-Update
    • Synchronization primitive heavily used in the Linux kernel
    • Scales extremely well for read-heavy reader/writer workloads
    [Diagram labels: Reader; Updater; Event loop; Ref-counted Data; Copy; New Ref-counted Data; “Quiescent period”]
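The read-copy-update pattern can be sketched as follows. This is a simplified Python illustration, not kernel RCU or Envoy code: `RcuCell` is a hypothetical helper, publishing relies on reference assignment being atomic in CPython, and Python's own reference counting plays the role of the slide's "ref-counted data" (the old version is freed once the last reader drops it).

```python
import threading


class RcuCell:
    """Holds a reference to a snapshot that readers treat as immutable."""

    def __init__(self, value):
        self._value = value
        self._write_lock = threading.Lock()  # serializes updaters only

    def read(self):
        # Readers take no lock: they just grab the current reference.
        return self._value

    def update(self, mutate):
        with self._write_lock:
            new_value = dict(self._value)  # copy
            mutate(new_value)              # update the private copy
            self._value = new_value        # publish with one reference swap


hosts = RcuCell({"backend-1": "healthy"})

snapshot = hosts.read()  # a reader still holding the old version
hosts.update(lambda d: d.update({"backend-2": "healthy"}))
latest = hosts.read()    # later readers see the new version
```

The old snapshot stays valid for as long as a reader holds it, which is why readers never need to block the updater.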
  9. Envoy threading model (TLS and RCU)
    [Diagram: the Main thread and Workers 1 to 4 each hold an array of TLS “slots” numbered 0 to 5; labels: Event loop, post(), TLS get()]
    • TLS = Thread Local Storage
    • TLS slots can be allocated dynamically by objects
    • RCU is used to post shared read-only data from the main thread to workers
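A rough sketch of slot allocation plus posting, under stated assumptions: each worker's event loop is modeled as a queue of callbacks, and `Worker`, `SlotAllocator`, `allocate`, and `set` are hypothetical names for this example rather than Envoy's API.

```python
import queue
import threading


class Worker(threading.Thread):
    def __init__(self, name):
        super().__init__(name=name, daemon=True)
        self.tasks = queue.Queue()  # stands in for the worker's event loop
        self.slots = []             # this worker's private slot array

    def post(self, fn):
        self.tasks.put(fn)

    def run(self):
        while True:
            fn = self.tasks.get()
            if fn is None:          # shutdown signal
                break
            fn(self)


class SlotAllocator:
    """Used from the main thread only."""

    def __init__(self, workers):
        self.workers = workers
        self.next_index = 0

    def allocate(self):
        index = self.next_index
        self.next_index += 1
        return index

    def set(self, index, value):
        def install(worker):
            # Runs on the worker's own thread, so no lock is needed.
            while len(worker.slots) <= index:
                worker.slots.append(None)
            worker.slots[index] = value

        for w in self.workers:
            w.post(install)


workers = [Worker(f"worker-{i}") for i in range(1, 5)]
for w in workers:
    w.start()

tls = SlotAllocator(workers)
route_slot = tls.allocate()
tls.set(route_slot, ("route-table", "v1"))  # shared read-only data

seen = queue.Queue()
for w in workers:
    # Each worker reads its own slot; this runs after install (FIFO queue).
    w.post(lambda worker: seen.put((worker.name, worker.slots[route_slot])))
    w.post(None)
for w in workers:
    w.join()
observed = sorted(seen.get() for _ in workers)
```

Because every write to a worker's slot array happens on that worker's thread, reads on the hot path need no synchronization.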
  10. Envoy threading model (cluster updates example)
    [Diagram, Main and Worker sides, with numbered callouts: (1) Cluster manager; (2) Health checker; (3) xDS/DNS; (4) Worker event loop; (5) Post handler / TLS update; (6) TLS; (7) IO event / load balancer]
    • Complete example of TLS and RCU for cluster updates
  11. Envoy hot restart (overview)
    [Diagram: a rolling deploy, in which a load balancer splits traffic (33% / 67%) across Service A and Service A’ instances, compared with a hot restart deploy, in which each instance moves Service A -> A’]
    • Full binary reload without dropping any connections
    • Very useful in legacy environments that lack a container scheduler
  12. Envoy hot restart (mechanics)
    [Diagram labels: Shared memory (Stats, Locks); Primary process (Stats, Logs, Hot restarter); Secondary process (Stats, Logs, Hot restarter); UDS]
    • Stats/locks kept in shared memory
    • Simple RPC protocol over unix domain sockets (UDS)
    • Sockets, some stats, etc. passed between processes
    • Built for containers
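Passing a socket between processes over a unix domain socket can be sketched with file-descriptor passing. This is an assumption-laden illustration, not Envoy's RPC protocol: both "processes" are simulated inside one Python process, the `b"listener"` message is made up, and `socket.send_fds` / `socket.recv_fds` need Python 3.9+ on a Unix platform.

```python
import socket

# The "primary" owns a listening socket on an ephemeral loopback port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
port = listener.getsockname()[1]

# A unix domain socket pair stands in for the channel between the processes.
primary_end, secondary_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Primary: hand the listening socket's descriptor to the secondary.
socket.send_fds(primary_end, [b"listener"], [listener.fileno()])

# Secondary: receive the descriptor and wrap it in a socket object.
msg, fds, _flags, _addr = socket.recv_fds(secondary_end, 1024, 1)
inherited = socket.socket(fileno=fds[0])
inherited_port = inherited.getsockname()[1]

# Primary closes its copy; the kernel socket stays open through the new fd.
listener.close()

# A client connecting to the same port is now accepted by the secondary.
client = socket.create_connection(("127.0.0.1", port))
conn, _peer = inherited.accept()
client.sendall(b"hello")
received = conn.recv(1024)

for s in (client, conn, inherited, primary_end, secondary_end):
    s.close()
```

The port never stops listening during the handoff, which is what lets a new binary take over without refusing connections.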
  13. Envoy stats (overview)
    [Diagram labels: Store; Scope; Counters; Gauges; Histograms; Flusher; Sink; Admin]
    • Store: holds stats
    • Sink: protocol adapter (statsd, gRPC, etc.)
    • Admin: allows pull access
    • Flusher: allows push access
    • Scope: discrete grouping of stats that can be deleted. Critical for dynamic xDS on top of hot restart shared memory
  14. Envoy stats (TLS scopes)
    1. Store is global
    2. Stats first looked up in TLS cache
    3. Not found, allocated in central cache, added to TLS cache
    4. Counters/gauges in shared memory
    5. Histograms in process memory
    6. Scope deletion causes a TLS cache flush on all threads
    [Diagram callouts matching the steps: Store (main thread) (1); Scope TLS cache (2); Scope central cache (3); Shared memory (4); Process memory (5); TLS scope cache flush (6)]
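Steps 1 to 3 can be sketched as a two-level lookup. This is a simplified Python illustration under stated assumptions: `Store`, `Counter`, and the stat name `http.rq_total` are names chosen for the example, a per-counter lock stands in for an atomic increment, and shared memory and scope deletion (steps 4 to 6) are left out.

```python
import threading


class Counter:
    def __init__(self, name):
        self.name = name
        self.value = 0
        self._lock = threading.Lock()  # stands in for an atomic increment

    def inc(self):
        with self._lock:
            self.value += 1


class Store:
    def __init__(self):
        self._central = {}                    # central cache
        self._central_lock = threading.Lock()
        self._tls = threading.local()         # per-thread cache
        self.central_lookups = 0

    def counter(self, name):
        cache = getattr(self._tls, "cache", None)
        if cache is None:
            cache = self._tls.cache = {}
        stat = cache.get(name)                # step 2: lock-free TLS lookup
        if stat is None:
            with self._central_lock:          # step 3: miss, go central
                self.central_lookups += 1
                stat = self._central.setdefault(name, Counter(name))
            cache[name] = stat                # remember it for this thread
        return stat


store = Store()


def worker():
    for _ in range(1000):
        store.counter("http.rq_total").inc()


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

central_lookups = store.central_lookups       # one miss per worker thread
total = store.counter("http.rq_total").value
```

Each thread pays for the central lock once per stat name; every later lookup is served from its own cache.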
  15. Envoy stats (TLS histograms)
    (1) TLS histogram values are recorded into “current” without locks
    (2) Periodic merge/flush
    (3) Post to each worker to swap the current histogram (recording now happens on the alternate)
    (4) Post back to the main thread to continue the merge
    (5) Merge all TLS histograms without locks
    [Diagram: a parent histogram and TLS histograms, each with Histogram A, Histogram B, and a pointer to the current histogram; callouts: recordValue(...) (1); Merge histogram (2); TLS post() to swap current (3); TLS post() back to continue merge (4); Merge all background histograms (5)]
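The swap-then-merge sequence can be sketched like this. It is a simplified illustration, not Envoy code: worker event loops are modeled as callback queues, a "histogram" is just a list of recorded values, and the `Worker` and `merge` names are hypothetical.

```python
import queue
import threading


class Worker(threading.Thread):
    def __init__(self):
        super().__init__(daemon=True)
        self.tasks = queue.Queue()  # stands in for the worker's event loop
        self.buffers = ([], [])     # histogram A and histogram B
        self.current = 0            # index of the histogram being recorded into

    def run(self):
        while True:
            fn = self.tasks.get()
            if fn is None:          # shutdown signal
                return
            fn(self)

    def record_value(self, value):
        # (1) Only this worker's thread touches its current buffer.
        self.buffers[self.current].append(value)


def merge(workers, parent):
    """(2) Runs on the main thread."""
    acks = queue.Queue()

    def swap(worker):
        worker.current ^= 1         # (3) recording moves to the alternate
        acks.put(worker)            # (4) post back to the main thread

    for w in workers:
        w.tasks.put(swap)
    for _ in workers:
        acks.get()
    for w in workers:
        idle = w.buffers[w.current ^ 1]
        parent.extend(idle)         # (5) no worker writes to the idle buffer
        idle.clear()


workers = [Worker() for _ in range(3)]
for w in workers:
    w.start()
for i, w in enumerate(workers):
    for latency in (1, 2, 3):
        value = 10 * i + latency
        w.tasks.put(lambda wk, v=value: wk.record_value(v))

parent = []
merge(workers, parent)
first_merge = sorted(parent)

workers[0].tasks.put(lambda wk: wk.record_value(99))
merge(workers, parent)
second_merge = sorted(parent)

for w in workers:
    w.tasks.put(None)
    w.join()
```

Double buffering is what removes the lock: at any moment each buffer has exactly one thread that may touch it.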
  16. Summary
    • Bias for developer productivity without sacrificing high throughput and low latency
    • Architecture is embarrassingly parallel and designed for mostly lock-free scaling across a high HW thread count
    • Heavy use of the RCU locking paradigm and TLS
    • Designed for a containerized world
    • Extensibility is key
    [Slide label: Cloud native summary]
  17. More information
    • I’ve written detailed blog posts about these topics
    • Please find them on Medium: https://medium.com/@mattklein123
  18. Q&A
    • Thanks for coming! Questions welcome on Twitter: @mattklein123
    • We are super excited about building a community around Envoy. Talk to us if you need help getting started.
    • https://www.envoyproxy.io/
    • Lyft is hiring!