Slide 1

Slide 1 text

Design and Patterns of Distributed Systems
Using Kubernetes as a real-world Example
Michael Gasch (June 2023)

Slide 2

Slide 2 text

2 (c) Michael Gasch 2023 @embano1
about://me
• First Computer 1992 (C-64)
• IHK Applied Computer Science/Max-Planck Society (-2007)
• Systems Engineer Dell (-2015)
• Engineer and Research VMware Office of the CTO (-2022)
• Product Manager AWS EventBridge (2022-)
• Self-taught Developer (Golang)
• Public Speaking and Blogging (@embano1 www.mgasch.com)

Slide 3

Slide 3 text

Act 1 – Kubernetes Introduction

Slide 4

Slide 4 text

In 2013 Docker changed the IT World
Credit: docker.com

Slide 5

Slide 5 text

Kubernetes won the Container Orchestration War

Slide 6

Slide 6 text

Overview (Kubernetes Architecture)
Control Plane: API Server, etcd, Controller Manager, Scheduler, …
Access: REST API, SDKs, Web UI, kubectl
Workers: Kubelet and Pod (shown three times)

Slide 7

Slide 7 text

Distributed Systems ain’t easy

Slide 8

Slide 8 text

Design Considerations (Kubernetes Architecture)
Control Plane: API Server, etcd, Controller Manager, Scheduler, …
Access: REST API, SDKs, Web UI, kubectl
Workers: Kubelet and Pod (shown three times)
Considerations: Availability, Scalability, Resiliency, Security, Flow Control, Extensibility, Observability, Durability, Consistency, Deployment, Versioning, Responsiveness, Programming Model, Open Source, Recovery, Communication

Slide 9

Slide 9 text

(inherent) Complexity

Slide 10

Slide 10 text

Act 2 – Taming Complexity with Patterns

Slide 11

Slide 11 text

Definition (Patterns)
• Common Definition: “A Solution to a Problem in a Context”
• Applicable in lots of different Situations
• Patterns are Solutions
• Why as well as how
• Code Examples
Source: https://martinfowler.com/articles/writingPatterns.html

Slide 12

Slide 12 text

Examples (Patterns)
• Consistent Core
• Control (Feedback) Loops*
• Idempotent Receiver
• Leader-Follower
• Quorum
• Replicated Log
• State Watch
• Versioned Value
Source: https://mar

Slide 13

Slide 13 text

Act 3 – Control Loops

Slide 14

Slide 14 text

Kubernetes Architecture
Control Plane: API Server, etcd, Controller Manager, Scheduler, …
Access: REST API, SDKs, Web UI, kubectl
Workers: Kubelet and Pod (shown three times)

Slide 15

Slide 15 text

Control Loops everywhere (Kubernetes Architecture)
Control Plane, API Server, Workers
Legend: [symbol] = Control Loop

Slide 16

Slide 16 text

Commands vs Events (Kubernetes Architecture)

Commands
• Requests (intent) to do something
• Named in the imperative, e.g. “CREATE”
• Can be rejected
• Higher coupling between sender and owner
• Typically used in synchronous 1-to-1 request/response communication

Events
• Something that has happened (a fact)
• Named in past tense, e.g. “CREATED”
• Cannot (semantically) be rejected by receiver
• Lowest coupling between sender and owner
• Asynchronous 1-to-many communication, e.g. publish/subscribe
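The distinction between commands and events can be made concrete with a minimal Go sketch. Everything here is hypothetical and only for illustration: the `CreateReplicaSet` and `ReplicaSetCreated` types, the in-memory `Bus`, and the validation rule are assumptions, not Kubernetes APIs. Delivery is also done in-process and synchronously to keep the example short, whereas real publish/subscribe would be asynchronous.

```go
package main

import (
	"errors"
	"fmt"
)

// CreateReplicaSet is a command: an imperative request that the owner may reject.
type CreateReplicaSet struct {
	Name     string
	Replicas int
}

// ReplicaSetCreated is an event: a fact in past tense that receivers can only react to.
type ReplicaSetCreated struct {
	Name     string
	Replicas int
}

// Bus is a hypothetical in-memory fan-out to all subscribers.
type Bus struct {
	subscribers []func(ReplicaSetCreated)
}

// Subscribe registers a receiver; the sender never learns who is listening.
func (b *Bus) Subscribe(fn func(ReplicaSetCreated)) {
	b.subscribers = append(b.subscribers, fn)
}

// Publish delivers the event to every subscriber (1-to-many).
func (b *Bus) Publish(ev ReplicaSetCreated) {
	for _, fn := range b.subscribers {
		fn(ev)
	}
}

// Handle validates the command (1-to-1, may be rejected) and, on success,
// publishes the resulting event.
func Handle(cmd CreateReplicaSet, bus *Bus) error {
	if cmd.Replicas < 0 {
		return errors.New("rejected: replicas must not be negative")
	}
	bus.Publish(ReplicaSetCreated{Name: cmd.Name, Replicas: cmd.Replicas})
	return nil
}

func main() {
	bus := &Bus{}
	bus.Subscribe(func(ev ReplicaSetCreated) { fmt.Println("controller saw:", ev.Name) })
	bus.Subscribe(func(ev ReplicaSetCreated) { fmt.Println("audit log saw:", ev.Name) })

	fmt.Println(Handle(CreateReplicaSet{Name: "web", Replicas: 2}, bus))
	fmt.Println(Handle(CreateReplicaSet{Name: "bad", Replicas: -1}, bus))
}
```

Note that only the command path returns an error. Subscribers have no way to reject the event; they can only decide how to react to it.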

Slide 17

Slide 17 text

Request Flow (Kubernetes Architecture)
$ kubectl create -f my_replicaset.yaml
apiVersion: extensions/v1beta1
kind: ReplicaSet
spec:
  replicas: 2
Requests: POST, WATCH
API Server stages: REST, Decoding, Conversion & Defaulting, Admission, Persistency (etcd), …
Response: EVENT
Legend: Commands, Events

Slide 18

Slide 18 text

Inside the Control Loop (Kubernetes Architecture)
Observe, Analyze, Act
apiVersion: extensions/v1beta1
kind: ReplicaSet
spec:
  replicas: 2
desired := getDesiredState()
current := getCurrentState()
diff := desired - current
if diff < 0 { deletePods() }
if diff > 0 { createPods() }
Legend: Command, Event (Edge-Triggered), Event (Level-Triggered)
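The pseudocode on this slide can be turned into a small runnable Go sketch. The `Cluster` type is a hypothetical in-memory stand-in for the state a real controller would read from the API server, and the pod naming is invented for the example. Each call to `Reconcile` performs one observe/analyze/act pass over the full state (level-triggered) rather than reacting to a single change.

```go
package main

import "fmt"

// Cluster is a hypothetical in-memory stand-in for API server state.
type Cluster struct {
	Desired int      // corresponds to spec.replicas
	Pods    []string // pods that currently exist
	nextID  int      // counter used to name new pods
}

// Reconcile runs one observe/analyze/act pass and returns the computed diff.
func (c *Cluster) Reconcile() int {
	// Observe: read desired and current state.
	desired, current := c.Desired, len(c.Pods)
	if desired < 0 {
		desired = 0 // defensive: treat a negative spec as "no pods"
	}

	// Analyze: compare the two.
	diff := desired - current

	// Act: converge the current state towards the desired state.
	switch {
	case diff > 0:
		for i := 0; i < diff; i++ {
			c.nextID++
			c.Pods = append(c.Pods, fmt.Sprintf("pod-%d", c.nextID))
		}
	case diff < 0:
		c.Pods = c.Pods[:desired]
	}
	return diff
}

func main() {
	c := &Cluster{Desired: 2}
	fmt.Println(c.Reconcile(), c.Pods) // scales up
	fmt.Println(c.Reconcile(), c.Pods) // nothing to do
	c.Desired = 1
	fmt.Println(c.Reconcile(), c.Pods) // scales down
}
```

Because the loop always compares complete states, running it again after a crash or a missed notification still converges to the desired replica count.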

Slide 19

Slide 19 text

Controllers, oh my… (Kubernetes Architecture)

Slide 20

Slide 20 text

Choreography (over Coordination)

Slide 21

Slide 21 text

Asynchronous Integration (Kubernetes Architecture)
apiVersion: extensions/v1beta1
kind: ReplicaSet
spec:
  replicas: 2
Sequence over Time (Commands go to the API Server, Events come from it):
• Command: CREATE ReplicaSet
• Event: ReplicaSet CREATED (received by the ReplicaSet Controller)
• Command: CREATE Pod
• Event: Pod CREATED (received by the Scheduler)
• Command: BIND Pod
• Event: Pod BOUND (received by the Kubelet)
• Command: UPDATE Pod (“running”)
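A minimal Go sketch can illustrate the choreography idea behind this sequence. It is a simplification under stated assumptions: the `Event` type, the event names, and the `Run` dispatcher are hypothetical, and the command round-trips through the API server are collapsed so that each controller directly returns the follow-up events its command would cause. The point is that no controller calls another one; each reacts only to the events it cares about.

```go
package main

import "fmt"

// Event is a fact emitted after a change was persisted (hypothetical shape).
type Event struct {
	Kind string // e.g. "ReplicaSetCreated", "PodCreated", "PodBound"
	Name string
}

// Controller reacts to an event and returns any follow-up events.
type Controller func(Event) []Event

// replicaSetController creates two pods when a ReplicaSet was created
// (two replicas, matching the example spec).
func replicaSetController(ev Event) []Event {
	if ev.Kind != "ReplicaSetCreated" {
		return nil
	}
	return []Event{
		{Kind: "PodCreated", Name: ev.Name + "-0"},
		{Kind: "PodCreated", Name: ev.Name + "-1"},
	}
}

// scheduler binds newly created pods.
func scheduler(ev Event) []Event {
	if ev.Kind != "PodCreated" {
		return nil
	}
	return []Event{{Kind: "PodBound", Name: ev.Name}}
}

// kubelet starts pods that were bound.
func kubelet(ev Event) []Event {
	if ev.Kind != "PodBound" {
		return nil
	}
	return []Event{{Kind: "PodRunning", Name: ev.Name}}
}

// Run delivers every event to every controller until no new events appear.
// The controllers share nothing but the event history.
func Run(first Event, controllers ...Controller) []Event {
	history := []Event{first}
	for i := 0; i < len(history); i++ {
		for _, c := range controllers {
			history = append(history, c(history[i])...)
		}
	}
	return history
}

func main() {
	start := Event{Kind: "ReplicaSetCreated", Name: "web"}
	// Registration order does not matter: there is no central coordinator.
	for _, ev := range Run(start, kubelet, scheduler, replicaSetController) {
		fmt.Println(ev.Kind, ev.Name)
	}
}
```

Adding a further controller means registering one more function with `Run`; none of the existing controllers has to change, which is the extensibility argument for choreography.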

Slide 22

Slide 22 text

Act 4 – Writing Controllers

Slide 23

Slide 23 text

A different Mindset (Writing Controllers)
Autonomous Processes
• Single Responsibility Principle
• Decoupling via event-driven Messaging
• No central Coordinator

Slide 24

Slide 24 text

A different Mindset (Writing Controllers)
Autonomous Processes
• Single Responsibility Principle
• Decoupling via event-driven Messaging
• No central Coordinator
Concurrency & Asynchrony
• Eventually consistent by Design
• Don’t rely on (assume) Order

Slide 25

Slide 25 text

A different Mindset (Writing Controllers)
Autonomous Processes
• Single Responsibility Principle
• Decoupling via event-driven Messaging
• No central Coordinator
Concurrency & Asynchrony
• Eventually consistent by Design
• Don’t rely on (assume) Order
Stateless over Stateful
• API server (etcd) is the Source of Truth*
• In-memory Cache via Reconciliation

Slide 26

Slide 26 text

A different Mindset (Writing Controllers)
Autonomous Processes
• Single Responsibility Principle
• Decoupling via event-driven Messaging
• No central Coordinator
Concurrency & Asynchrony
• Eventually consistent by Design
• Don’t rely on (assume) Order
Stateless over Stateful
• API server (etcd) is the Source of Truth*
• In-memory Cache via Reconciliation
Defensive Programming
• Things will go wrong (crash)
• No shared (wall) Clock
• Anticipate Effects on the Rest of the System

Slide 27

Slide 27 text

A different Mindset (Writing Controllers)
Autonomous Processes
• Single Responsibility Principle
• Decoupling via event-driven Messaging
• No central Coordinator
Concurrency & Asynchrony
• Eventually consistent by Design
• Don’t rely on (assume) Order
Stateless over Stateful
• API server (etcd) is the Source of Truth*
• In-memory Cache via Reconciliation
Defensive Programming
• Things will go wrong (crash)
• No shared (wall) Clock
• Anticipate Effects on the Rest of the System
Side Effects
• Delivery and Processing Guarantees only within Kubernetes

Slide 28

Slide 28 text

Act 5 – State Watch Pattern: Kubernetes ListerWatcher Implementation

Slide 29

Slide 29 text

Problem (State Watch Pattern)
Clients are interested in changes to the specific values on the server. It's difficult for clients to structure their logic if they need to poll the server continuously to look for changes. If clients open too many connections to the server for watching changes, it can overwhelm the server.

Slide 30

Slide 30 text

Solution (State Watch Pattern)
Allow clients to register their interest with the server for specific state changes. The server notifies the interested clients when state changes happen. The client maintains a Single Socket Channel with the server. The server sends state change notifications on this channel.

Slide 31

Slide 31 text

Kubernetes ListerWatcher: LIST Phase (State Watch Pattern)
Participants: Controller (C), API SRV (A), etcd (E)
1 (C): GET https://127.0.0.1:65048/api/v1/namespaces/default/pods?limit=500
2 (A): Get("/registry/pods/default").WithPrefix()
3 (E): RangeResponse { "header": { "cluster_id": 14358680983224840000, "member_id": 1033796535975940100, "revision": 3788, "raft_term": 2 }, "kvs": [...] }
4 (A): Response Body: {"kind":"PodList","apiVersion":"v1","metadata":{"resourceVersion":"3788"},"items":[...]}

Slide 32

Slide 32 text

Kubernetes ListerWatcher: WATCH Phase (State Watch Pattern)
Participants: Controller (C), API SRV (A), etcd (E)
5 (C): GET https://127.0.0.1:65048/api/v1/namespaces/default/pods?resourceVersion=3788&watch=true
6 (A): Watch("/registry/pods/default").WithPrefix().WithRev(3788+1)
7 (E): WatchResponse { "Header": { "cluster_id": 14358680983224840000, "member_id": 1033796535975940100, "revision": 4067, "raft_term": 2 }, "Events": [{"type": 0, "kv": {"key":"/registry/pods/default/vcsim-7c578468cc-j2d6p","value":"...","create_revision":2005,"mod_revision":4067,...}}] }
8 (A): Response Body: {"type":"MODIFIED","object":{"apiVersion":"v1","kind":"Pod","metadata":{"name":"vcsim-7c578468cc-j2d6p","resourceVersion":"4067",...},"spec":{...}}}
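The LIST-then-WATCH handshake from the two slides above can be sketched in Go with a toy versioned store. This is an illustration under assumptions, not client-go or etcd code: `Store`, `Change`, `List`, and `WatchFrom` are hypothetical names, the store keeps its whole history in memory, and `WatchFrom` returns a batch instead of a long-lived stream. What it does show is why the snapshot revision matters: the watch resumes at revision+1, so nothing that happened after the LIST is lost.

```go
package main

import "fmt"

// Change is one entry in a hypothetical append-only, versioned store.
type Change struct {
	Revision int
	Key      string
	Value    string
}

// Store keeps the full history so a watcher can resume from any revision.
type Store struct {
	history []Change
}

// Put records a new value and returns the revision assigned to it.
func (s *Store) Put(key, value string) int {
	rev := len(s.history) + 1
	s.history = append(s.history, Change{Revision: rev, Key: key, Value: value})
	return rev
}

// List returns the latest value per key and the revision of that snapshot.
func (s *Store) List() (map[string]string, int) {
	snapshot := make(map[string]string)
	for _, c := range s.history {
		snapshot[c.Key] = c.Value
	}
	return snapshot, len(s.history)
}

// WatchFrom returns a copy of all changes with a revision greater than rev.
func (s *Store) WatchFrom(rev int) []Change {
	if rev < 0 {
		rev = 0
	}
	if rev > len(s.history) {
		rev = len(s.history)
	}
	return append([]Change(nil), s.history[rev:]...)
}

func main() {
	s := &Store{}
	s.Put("pods/default/a", "Pending")
	s.Put("pods/default/b", "Pending")

	// LIST phase: take a snapshot and remember its revision.
	cache, rev := s.List()

	// A change lands after the LIST but before the WATCH starts.
	s.Put("pods/default/a", "Running")

	// WATCH phase: resume right after the snapshot revision.
	for _, c := range s.WatchFrom(rev) {
		cache[c.Key] = c.Value
		rev = c.Revision
	}
	fmt.Println(cache, rev)
}
```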

Slide 33

Slide 33 text

Considerations (State Watch Pattern)
• Can a Controller miss Events e.g., during Downtime?
• How to handle duplicate Events, such as when re-LISTING due to transient Network Errors?
• How to reconcile Changes on external Resources that don’t support a WATCH mechanism?
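One possible answer to the duplicate-event question is an idempotent receiver that tracks the newest version it has seen per object. The Go sketch below is a hedged illustration: `Update` and `Cache` are hypothetical types, and comparing versions numerically is an assumption made for the example. Kubernetes guidance is, to my understanding, that clients should treat `resourceVersion` as opaque, so a real controller would more likely rely on level-triggered reconciliation against the current state.

```go
package main

import "fmt"

// Update is a hypothetical watch notification carrying the object's version.
type Update struct {
	Name    string
	Version int
	Phase   string
}

// Cache remembers the newest version and phase seen per object.
type Cache struct {
	versions map[string]int
	phases   map[string]string
}

// NewCache returns an empty cache.
func NewCache() *Cache {
	return &Cache{versions: map[string]int{}, phases: map[string]string{}}
}

// Apply stores the update and reports whether it changed the cache.
// Duplicates and stale updates (version not newer than the stored one) are ignored,
// which makes applying the same update twice harmless.
func (c *Cache) Apply(u Update) bool {
	if seen, ok := c.versions[u.Name]; ok && u.Version <= seen {
		return false
	}
	c.versions[u.Name] = u.Version
	c.phases[u.Name] = u.Phase
	return true
}

func main() {
	c := NewCache()
	updates := []Update{
		{Name: "vcsim", Version: 4067, Phase: "Running"},
		{Name: "vcsim", Version: 4067, Phase: "Running"}, // duplicate, e.g. after a re-LIST
		{Name: "vcsim", Version: 4001, Phase: "Pending"}, // stale, arrives late
	}
	for _, u := range updates {
		fmt.Println(u.Version, c.Apply(u))
	}
}
```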

Slide 34

Slide 34 text

Recap

Slide 35

Slide 35 text

Recap 1/2
• Kubernetes is a Distributed System
• Building and operating Distributed Systems is hard
• Patterns decompose complex Problems into understandable and reusable Solutions

Slide 36

Slide 36 text

Recap 2/2
• Closed Feedback (Control) Loops provide Boundaries
• Choreography and event-driven Integration unlock Extensibility and Autonomy
• At the Cost of added Developer Complexity (Asynchrony and Eventual Consistency)
• Consistent (replicated) Core, State Watch and Versioned Values for Durability, Availability, Scalability, and Consistency

Slide 37

Slide 37 text

THANK YOU
@embano1
www.mgasch.com