Slide 1

Envoy internals deep dive
Matt Klein / @mattklein123, Software Engineer @Lyft

Slide 2

Agenda
● Envoy design goals
● Architecture overview
● Threading model
● Hot restart
● Stats
● Q&A

Slide 3

What is Envoy?
The network should be transparent to applications. When network and application problems do occur, it should be easy to determine the source of the problem.

Slide 4

Envoy design goals
● Out of process architecture
● Low latency, high performance, dev productivity
● L3/L4 filter architecture
● HTTP L7 filter architecture
● HTTP/2 first
● Service/config discovery
● Active/passive health checking
● Advanced load balancing
● Best in class observability
● Service/middle/edge proxy
● Hot restart

Slide 5

Envoy architecture diagram
[diagram: on a worker, a connection flows through the listener filters, the TCP filter manager (TCP read/write filters), the HTTP connection manager (HTTP codec, HTTP read/write filters), and the service router into the upstream connection pool and on to backend services; alongside sit stats, the admin interface, and the cluster/listener/route manager driven by the xDS API]
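The L3/L4 filter chain in the diagram can be pictured as below. This is a hypothetical, heavily simplified sketch (ReadFilter, FilterStatus, and FilterManager mirror the concept, not Envoy's exact Network API): bytes read from a connection flow through an ordered chain of filters, each of which can continue or stop iteration.

    // Simplified sketch of an L4 read-filter chain (illustrative names only).
    #include <cstddef>
    #include <memory>
    #include <string>
    #include <vector>

    enum class FilterStatus { Continue, StopIteration };

    class ReadFilter {
    public:
      virtual ~ReadFilter() = default;
      virtual FilterStatus onData(std::string& data) = 0;
    };

    // Example filter: counts the bytes flowing through the connection.
    class ByteCounterFilter : public ReadFilter {
    public:
      FilterStatus onData(std::string& data) override {
        bytes_ += data.size();
        return FilterStatus::Continue;  // hand the data to the next filter
      }
      size_t bytes_{0};
    };

    // The filter manager walks the chain for every chunk read off the socket.
    class FilterManager {
    public:
      void addReadFilter(std::unique_ptr<ReadFilter> filter) {
        filters_.push_back(std::move(filter));
      }
      void onRead(std::string data) {
        for (auto& filter : filters_) {
          if (filter->onData(data) == FilterStatus::StopIteration) {
            return;  // a filter took over (e.g. waiting for more data)
          }
        }
        // ...otherwise forward to the next stage (router / upstream pool)...
      }
    private:
      std::vector<std::unique_ptr<ReadFilter>> filters_;
    };

The HTTP L7 filters in the diagram follow the same pattern one layer up, operating on decoded headers and bodies produced by the HTTP codec instead of raw bytes.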

Slide 6

Envoy threading model (c10k)
[diagram: one thread per connection vs. a single thread running an event loop that serves many connections]
● Connection per thread does not scale
● Scaling requires many connections per thread: “c10k”
● Requires asynchronous programming paradigms: harder
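A minimal illustration of the second picture: one thread multiplexing many connections over an event loop. This sketch uses a plain poll() loop and an arbitrary port (9000) rather than Envoy's actual libevent-based dispatcher, and it simply echoes data back.

    // One thread, many connections: the basic "c10k" event-loop idea.
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <poll.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <vector>

    int main() {
      int listener = socket(AF_INET, SOCK_STREAM, 0);
      sockaddr_in addr{};
      addr.sin_family = AF_INET;
      addr.sin_addr.s_addr = htonl(INADDR_ANY);
      addr.sin_port = htons(9000);  // arbitrary port for the sketch
      bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
      listen(listener, SOMAXCONN);

      std::vector<pollfd> fds{{listener, POLLIN, 0}};
      while (true) {
        poll(fds.data(), static_cast<nfds_t>(fds.size()), -1);  // wait for any ready fd
        for (size_t i = 0; i < fds.size(); ++i) {
          if (!(fds[i].revents & POLLIN)) continue;
          if (fds[i].fd == listener) {
            // New connection: register it with this same thread's loop.
            fds.push_back({accept(listener, nullptr, nullptr), POLLIN, 0});
          } else {
            char buf[4096];
            ssize_t n = read(fds[i].fd, buf, sizeof(buf));
            if (n <= 0) {                      // peer closed or error: drop it
              close(fds[i].fd);
              fds.erase(fds.begin() + i--);
            } else {
              write(fds[i].fd, buf, n);        // echo the bytes back
            }
          }
        }
      }
    }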

Slide 7

Envoy threading model (overview)
[diagram: a main thread (xDS, runtime, stat flush, admin, process management), multiple worker thread(s) (listeners, connections), and file flush thread(s)]
● Main thread handles non-data plane misc tasks
● Worker thread(s) embarrassingly parallel and handle listeners, connections, and proxying
● File flush threads avoid blocking
● Designed to be 100% non-blocking
● Designed to scale to massive parallelism (# of HW threads)
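A rough sketch of that process layout under the assumptions above: a main thread for control-plane work and one independent worker event loop per hardware thread (runWorkerEventLoop is a placeholder name, not an Envoy function).

    #include <algorithm>
    #include <thread>
    #include <vector>

    // Placeholder for a worker: in the real proxy each worker runs its own
    // libevent-style event loop owning its listeners and connections.
    void runWorkerEventLoop(unsigned index) { (void)index; }

    int main() {
      const unsigned num_workers = std::max(1u, std::thread::hardware_concurrency());
      std::vector<std::thread> workers;
      for (unsigned i = 0; i < num_workers; ++i) {
        workers.emplace_back(runWorkerEventLoop, i);
      }

      // Main thread: xDS, runtime, stat flushing, admin, process management
      // (all omitted here).

      for (auto& w : workers) {
        w.join();
      }
    }

Because the workers share almost no mutable state, adding hardware threads adds capacity without adding lock contention.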

Slide 8

Envoy threading model (RCU)
● RCU = Read-Copy-Update
● Synchronization primitive heavily used in the Linux kernel
● Scales extremely well for R/W locking that is read heavy
[diagram: a reader’s event loop holds ref-counted data; the updater makes a copy, publishes the new ref-counted data, and the old data is released after a “quiescent period”]
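The pattern can be sketched with ref-counted snapshots. This is an illustration of the idea only, not Envoy's implementation (Envoy ties the quiescent period to the worker event loops rather than to atomic pointer operations): readers take a reference without any write lock, the updater copies the data, mutates the copy, and publishes it, and the old data is freed when the last reader releases it.

    #include <map>
    #include <memory>
    #include <string>

    using RouteTable = std::map<std::string, std::string>;

    // The shared, read-mostly data. Readers never take a write lock on it.
    std::shared_ptr<const RouteTable> g_routes = std::make_shared<RouteTable>();

    // Reader (e.g. a worker handling a request): grab a ref-counted snapshot.
    std::shared_ptr<const RouteTable> snapshotForRead() {
      return std::atomic_load(&g_routes);
    }

    // Updater (main thread): copy, modify the copy, then publish it. The old
    // table stays alive until the last in-flight reader drops its snapshot.
    void updateRoute(const std::string& host, const std::string& cluster) {
      auto copy = std::make_shared<RouteTable>(*std::atomic_load(&g_routes));
      (*copy)[host] = cluster;
      std::atomic_store(&g_routes, std::shared_ptr<const RouteTable>(std::move(copy)));
    }

The copy can be arbitrarily expensive without slowing readers down, which is why the pattern fits read-heavy data such as route tables and cluster membership.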

Slide 9

Envoy threading model (TLS and RCU)
[diagram: the main thread and each worker (Worker 1–4) hold the same numbered set of “slots”; the main thread post()s data into a slot and each worker’s event loop reads it with a TLS get()]
● TLS = Thread Local Storage
● TLS slots can be allocated dynamically by objects
● RCU is used to post shared read-only data from the main thread to workers
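A hypothetical, much-simplified version of the slot mechanism: the main thread allocates a slot index and post()s read-only, ref-counted data to every worker's event loop, and each worker then reads its own thread-local copy with no locks on the hot path. Dispatcher, SlotAllocator, and the use of std::string as the payload are illustrative stand-ins, not Envoy's API.

    #include <cstddef>
    #include <functional>
    #include <memory>
    #include <mutex>
    #include <queue>
    #include <vector>

    // Minimal stand-in for a per-worker event loop: callbacks posted from
    // other threads run later, on the owning worker thread.
    class Dispatcher {
    public:
      void post(std::function<void()> cb) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(cb));
      }
      void drain() {  // called by the owning worker inside its event loop
        std::queue<std::function<void()>> pending;
        {
          std::lock_guard<std::mutex> lock(mutex_);
          pending.swap(queue_);
        }
        while (!pending.empty()) { pending.front()(); pending.pop(); }
      }
    private:
      std::mutex mutex_;
      std::queue<std::function<void()>> queue_;
    };

    // Each thread owns its own vector of slot values.
    thread_local std::vector<std::shared_ptr<const std::string>> tls_slots;

    class SlotAllocator {
    public:
      explicit SlotAllocator(std::vector<Dispatcher*> workers) : workers_(std::move(workers)) {}

      // Main thread: allocate a slot index and fan the shared data out.
      size_t allocateAndSet(std::shared_ptr<const std::string> data) {
        size_t index = next_index_++;
        for (Dispatcher* d : workers_) {
          d->post([index, data] {  // runs on the worker thread
            if (tls_slots.size() <= index) tls_slots.resize(index + 1);
            tls_slots[index] = data;  // ref-counted: old copy freed when unused
          });
        }
        return index;
      }

      // Worker thread hot path: lock-free read of this thread's copy.
      static const std::string& get(size_t index) { return *tls_slots[index]; }

    private:
      std::vector<Dispatcher*> workers_;
      size_t next_index_{0};
    };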

Slide 10

Envoy threading model (cluster updates example)
[diagram: on the main thread, the cluster manager (1) is driven by the health checker (2) and xDS/DNS (3); on each worker, the event loop (4) runs the post handler / TLS update (5) into TLS (6), and the IO event / load balancer path (7) reads the result]
● Complete example of TLS and RCU for cluster updates
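Condensing the numbered steps into code, under the same simplifications as the previous sketch (PostFn, onClusterUpdate, and pickHost are illustrative names): the cluster manager publishes a new immutable host list from the main thread, and each worker's load balancer reads its thread-local copy on the next I/O event.

    #include <cstddef>
    #include <functional>
    #include <memory>
    #include <string>
    #include <vector>

    using HostList = std::vector<std::string>;
    // One "post to this worker's event loop" function per worker.
    using PostFn = std::function<void(std::function<void()>)>;

    // Per-worker snapshot; only ever written by callbacks posted to that worker.
    thread_local std::shared_ptr<const HostList> tls_hosts;

    // Main thread (cluster manager), after an xDS/DNS/health-check event
    // (1)-(3): build a new immutable host list and fan it out (5)(6).
    void onClusterUpdate(std::shared_ptr<const HostList> new_hosts,
                         const std::vector<PostFn>& post_to_worker) {
      for (const PostFn& post : post_to_worker) {
        post([new_hosts] { tls_hosts = new_hosts; });
      }
    }

    // Worker thread, inside an I/O event (4)(7): lock-free host selection
    // from the thread-local snapshot (assumes the list is non-empty).
    const std::string& pickHost(size_t round_robin_index) {
      return (*tls_hosts)[round_robin_index % tls_hosts->size()];
    }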

Slide 11

Envoy hot restart (overview)
[diagram: a rolling deploy replaces Service A with A’ behind a load balancer in percentage steps (33% / 67%), while a hot restart deploy swaps A -> A’ in place on each host]
● Full binary reload without dropping any connections
● Very useful in legacy/non-container scheduler worlds

Slide 12

Envoy hot restart (mechanics)
[diagram: primary and secondary processes (each with their own stats, logs, and hot restarter) share stats and locks via shared memory and communicate over UDS]
● Stats/locks kept in shared memory
● Simple RPC protocol over unix domain sockets (UDS)
● Sockets, some stats, etc. passed between processes
● Built for containers
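The "sockets passed between processes" part relies on standard unix domain socket fd passing. The sketch below shows the sending side using SCM_RIGHTS ancillary data; the RPC framing Envoy layers on top is omitted, and send_fd is an illustrative helper name.

    // Hand a listening socket from the primary to the secondary process over
    // an already-connected unix domain socket, so the new process can accept
    // on the same port without dropping connections.
    #include <cstring>
    #include <sys/socket.h>

    int send_fd(int uds_fd, int fd_to_pass) {
      char data = 'F';                       // one byte of "real" payload
      iovec iov{&data, 1};

      char control[CMSG_SPACE(sizeof(int))];
      std::memset(control, 0, sizeof(control));

      msghdr msg{};
      msg.msg_iov = &iov;
      msg.msg_iovlen = 1;
      msg.msg_control = control;
      msg.msg_controllen = sizeof(control);

      cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
      cmsg->cmsg_level = SOL_SOCKET;
      cmsg->cmsg_type = SCM_RIGHTS;          // ancillary data carries an fd
      cmsg->cmsg_len = CMSG_LEN(sizeof(int));
      std::memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

      return sendmsg(uds_fd, &msg, 0) < 0 ? -1 : 0;
    }

The receiving process does the mirror image with recvmsg(), pulling a new fd (referring to the same kernel socket) out of the ancillary data.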

Slide 13

Envoy stats (overview)
[diagram: a store holding counters, gauges, and histograms, exposed to the admin interface and pushed by the flusher to one or more sinks]
● Store: holds stats
● Sink: protocol adapter (statsd, gRPC, etc.)
● Admin: allows pull access
● Flusher: allows push access
● Scope: discrete grouping of stats that can be deleted. Critical for dynamic xDS on top of hot restart shared memory
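A minimal sketch of the store/sink/flusher split (Store, Sink, and StdoutSink are illustrative names, not Envoy's API): the store owns the counters, a sink adapts them to some wire protocol, and a flush pushes the deltas accumulated since the last flush to every configured sink.

    #include <cstdint>
    #include <cstdio>
    #include <map>
    #include <memory>
    #include <string>
    #include <vector>

    class Sink {
    public:
      virtual ~Sink() = default;
      virtual void flushCounter(const std::string& name, uint64_t delta) = 0;
    };

    class StdoutSink : public Sink {  // stand-in for a statsd/gRPC sink
    public:
      void flushCounter(const std::string& name, uint64_t delta) override {
        std::printf("%s: +%llu\n", name.c_str(),
                    static_cast<unsigned long long>(delta));
      }
    };

    class Store {
    public:
      void inc(const std::string& name) { counters_[name]++; }

      // Flusher: push the deltas accumulated since the last flush to all sinks.
      void flush(std::vector<std::unique_ptr<Sink>>& sinks) {
        for (auto& [name, value] : counters_) {
          uint64_t delta = value - flushed_[name];
          flushed_[name] = value;
          for (auto& sink : sinks) sink->flushCounter(name, delta);
        }
      }

    private:
      std::map<std::string, uint64_t> counters_;
      std::map<std::string, uint64_t> flushed_;  // values at the last flush
    };

The admin path is the pull-side counterpart: it simply reads the store's current values on demand instead of waiting for the periodic flush.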

Slide 14

Envoy stats (TLS scopes)
[diagram: the store on the main thread (1) owns per-scope central caches (3) of counters, gauges, and histograms; stat entries for counters/gauges live in shared memory (4) and histograms in process memory (5); each worker keeps a scope TLS cache (2) that is flushed on scope deletion (6)]
1. Store is global
2. Stats first looked up in TLS cache
3. Not found, allocated in central cache, added to TLS cache
4. Counters/gauges in shared memory
5. Histograms in process memory
6. Scope deletion causes a TLS cache flush on all threads
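The two-level lookup in steps (2)-(4) might look roughly like this. It is a sketch only: a static thread_local map stands in for Envoy's TLS slots, and the "central cache" here is ordinary heap memory rather than the shared-memory block used for counters/gauges.

    #include <atomic>
    #include <cstdint>
    #include <memory>
    #include <mutex>
    #include <string>
    #include <unordered_map>

    struct Counter { std::atomic<uint64_t> value{0}; };

    class Scope {
    public:
      Counter& counter(const std::string& name) {
        // (2) Hot path: this thread's cache, no locking.
        auto& tls = tlsCache();
        auto it = tls.find(name);
        if (it != tls.end()) return *it->second;

        // (3) Miss: consult the central cache under a lock, allocating if
        // needed, then remember the result in the TLS cache.
        std::shared_ptr<Counter> counter;
        {
          std::lock_guard<std::mutex> lock(mutex_);
          auto& slot = central_[name];
          if (!slot) slot = std::make_shared<Counter>();  // (4) Envoy places this in shared memory
          counter = slot;
        }
        tls[name] = counter;
        return *counter;
      }

    private:
      // One cache per (scope, thread). A static thread_local member keeps the
      // sketch short; Envoy uses its TLS slot mechanism instead.
      std::unordered_map<std::string, std::shared_ptr<Counter>>& tlsCache() {
        static thread_local std::unordered_map<
            Scope*, std::unordered_map<std::string, std::shared_ptr<Counter>>> caches;
        return caches[this];
      }

      std::mutex mutex_;
      std::unordered_map<std::string, std::shared_ptr<Counter>> central_;
    };

Step (6) then amounts to posting a "drop this scope's entries" callback to every worker so stale TLS pointers never outlive the deleted scope.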

Slide 15

Envoy stats (TLS histograms)
[diagram: a parent histogram on the main thread and one TLS histogram per worker; each TLS histogram holds two histograms (A and B) with a pointer to the “current” one used by recordValue(...)]
(1) TLS histogram values recorded into “current” without locks
(2) Periodic merge/flush
(3) Post to each worker to swap current histogram (record now happens on alternate)
(4) Post back to main thread to continue merge
(5) Merge all TLS histograms without locks
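The swap-then-merge protocol in steps (1)-(5) can be sketched with a double-buffered per-worker histogram. Names and the raw-value vectors are illustrative (Envoy uses HdrHistogram-style structures): the worker records into the active buffer with no locks, the main thread posts a swap to the worker's event loop, and once the worker acknowledges via a post back, the now-idle buffer can be merged safely.

    #include <cstdint>
    #include <vector>

    class TlsHistogram {
    public:
      // (1) Worker thread only: lock-free record into the active buffer.
      void recordValue(uint64_t value) { buffers_[active_].push_back(value); }

      // (3) Runs on the worker thread via a post(): flip the active buffer so
      // new records go to the alternate one.
      void swapBuffers() { active_ ^= 1; }

      // (4)(5) Main thread, after the worker has posted back its ack (the
      // post round trip provides the ordering): the inactive buffer is no
      // longer written to, so it can be merged and cleared without locks.
      void mergeInto(std::vector<uint64_t>& parent) {
        std::vector<uint64_t>& idle = buffers_[active_ ^ 1];
        parent.insert(parent.end(), idle.begin(), idle.end());
        idle.clear();
      }

    private:
      std::vector<uint64_t> buffers_[2];
      int active_{0};
    };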

Slide 16

Summary
● Bias for developer productivity without sacrificing high throughput and low latency
● Architecture is embarrassingly parallel and designed for mostly lock-free scaling across high HW thread counts
● Heavy use of the RCU locking paradigm and TLS
● Designed for a containerized world
● Extensibility is key

Slide 17

More information
● I’ve written detailed blog posts about these topics
● Please find them on Medium: https://medium.com/@mattklein123

Slide 18

Q&A
● Thanks for coming! Questions welcome on Twitter: @mattklein123
● We are super excited about building a community around Envoy. Talk to us if you need help getting started.
● https://www.envoyproxy.io/
● Lyft is hiring!