Reconciling Everything - Speaker Deck

Slide 1

Slide 1 text

RECONCILING EVERYTHING
ANDREW GODWIN // @[email protected]

Slide 2

Slide 2 text

Andrew Godwin / @[email protected]
Hi, I’m Andrew Godwin
• Principal Engineer at Astronomer.io (Airflow)
• Django Migrations, Channels, Async, others
• Building distributed systems since 2008

Slide 3

Slide 3 text

Let's start with a bold statement
I hear it drives engagement. Maybe also heckling.

Slide 4

Slide 4 text

There are only two good* ways to build distributed systems**:
● Event-Driven Via Queues: Totally separate stores of truth that only talk via queues
● Reconciliation Loops: Stateless components that talk to one store of truth

Slide 5

Slide 5 text

What is "good"?
Low maintenance and high reliability, for me at least.

Slide 6

Slide 6 text

What kinds of distributed systems?
Remember, friends don't let friends write microservices

Slide 7

Slide 7 text

● Due to size: What companies often claim is the reason
● Due to team structure: What the actual reason often is
● Due to federation: It's what the internet was built on!

Slide 8

Slide 8 text

RPC Microservices
[Diagram: Email Service, Accounting Service, Billing Service, Reservation Service]

Slide 9

Slide 9 text

A system is defined by its failure modes
Nobody gets paged when it's up and running happily

Slide 10

Slide 10 text

Event-Driven (Message Passing)
[Diagram: Task Runners, Billing System, Log Storage, Analytics Aggregator, and three queues]

Slide 11

Slide 11 text

What kind of queue is it?
Hint: It will not deliver messages exactly once

Slide 12

Slide 12 text

● At-most-once: In case of failure, a message will not get delivered
● At-least-once: In case of failure, a message will get delivered twice or more
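The difference between the two guarantees often comes down to whether a consumer acknowledges a message before or after processing it. The following is a minimal sketch of that idea, not any particular queue's API: `run_consumer` and its in-memory `deque` are hypothetical stand-ins, and the single simulated crash always lands between the two steps.

```python
from collections import deque


def run_consumer(queue, ack_first):
    """Drain `queue`, simulating one consumer crash between the two steps.

    Returns the list of messages that were actually processed.
    """
    processed = []
    crash_pending = True  # the consumer crashes exactly once, on the first message
    while queue:
        message = queue[0]
        if ack_first:
            queue.popleft()  # step 1: acknowledge (message leaves the queue)
            if crash_pending:
                crash_pending = False
                continue  # crash before processing: the message is lost
            processed.append(message)  # step 2: process
        else:
            processed.append(message)  # step 1: process
            if crash_pending:
                crash_pending = False
                continue  # crash before ack: the message is delivered again
            queue.popleft()  # step 2: acknowledge
    return processed


print(run_consumer(deque(["a", "b"]), ack_first=True))   # at-most-once
print(run_consumer(deque(["a", "b"]), ack_first=False))  # at-least-once
```

With acknowledge-first, message "a" never gets processed; with acknowledge-last, "a" is processed twice.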

Slide 13

Slide 13 text

● Backlogs: Need more queue consumers, can be asymmetric
● Replays: I hope you can cope with lots of duplicates!
● Overflow: Do you grind to a halt? Or drop?
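One common way to cope with duplicates from replays is to make the consumer idempotent by remembering which message IDs it has already applied. This is a sketch under assumptions: the `IdempotentConsumer` class and the `id`/`amount` message fields are invented for illustration, and a real system would need to keep the seen-ID set in durable storage rather than in memory.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed on a unique message ID."""

    def __init__(self):
        self.seen_ids = set()  # assumption: in production this lives in a durable store
        self.total = 0

    def handle(self, message):
        """Apply the message; return False if it was a duplicate and skipped."""
        if message["id"] in self.seen_ids:
            return False
        self.total += message["amount"]
        self.seen_ids.add(message["id"])
        return True


consumer = IdempotentConsumer()
replayed = [
    {"id": "m1", "amount": 5},
    {"id": "m2", "amount": 7},
    {"id": "m1", "amount": 5},  # duplicate delivery
    {"id": "m1", "amount": 5},  # and again
    {"id": "m3", "amount": 1},
]
results = [consumer.handle(message) for message in replayed]
print(results, consumer.total)
```

The duplicates of `m1` are skipped, so the total reflects each message once.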

Slide 14

Slide 14 text

Neither is "perfect"
(But I recommend at-least-once unless you truly do not care about the data)

Slide 15

Slide 15 text

Reconciliation (Control) Loop
[Diagram: Kubectl (User), Deployment Controller, Pod Controller, and Kubelet around a central DATABASE]

Slide 16

Slide 16 text

Reconciliation (Control) Loop
[Diagram: Kubectl (User), Deployment Controller, Pod Controller, and Kubelet around the apiserver, which is backed by etcd]

Slide 17

Slide 17 text

Kubernetes Example
● User creates a new Deployment
● Deployment controller notices that the deployment has no pods
  ○ To reconcile this, it makes one
● Pod controller notices that the pod is not assigned
  ○ To reconcile this, it assigns it to a node
● Kubelet notices it has an assigned pod that is not running
  ○ To reconcile this, it creates it locally and starts it
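The chain of steps above can be sketched as three independent functions that each read and write one shared store. This is a toy model, not Kubernetes code: the `Store` and `Pod` classes, the node names, and the least-loaded placement rule are all assumptions made for illustration, and the controller names simply follow the slide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Pod:
    name: str
    deployment: str
    node: Optional[str] = None
    running: bool = False


@dataclass
class Store:
    """Stand-in for the single store of truth."""
    deployments: dict = field(default_factory=dict)  # name -> desired pod count
    pods: list = field(default_factory=list)
    nodes: list = field(default_factory=lambda: ["node-a", "node-b"])


def deployment_controller(store):
    """Create pods for any deployment that has fewer pods than desired."""
    changed = False
    for name, desired in store.deployments.items():
        existing = [p for p in store.pods if p.deployment == name]
        for index in range(len(existing), desired):
            store.pods.append(Pod(name=f"{name}-{index}", deployment=name))
            changed = True
    return changed


def pod_controller(store):
    """Assign each unassigned pod to the node with the fewest pods."""
    changed = False
    for pod in store.pods:
        if pod.node is None:
            pod.node = min(
                store.nodes,
                key=lambda node: sum(p.node == node for p in store.pods),
            )
            changed = True
    return changed


def kubelet(store, node):
    """Start any pod assigned to this node that is not running."""
    changed = False
    for pod in store.pods:
        if pod.node == node and not pod.running:
            pod.running = True
            changed = True
    return changed


def reconcile_until_stable(store, max_rounds=10):
    """Run every controller in turn until a full round changes nothing."""
    for round_number in range(1, max_rounds + 1):
        changes = [deployment_controller(store), pod_controller(store)]
        changes += [kubelet(store, node) for node in store.nodes]
        if not any(changes):
            return round_number
    raise RuntimeError("store did not converge")


store = Store()
store.deployments["web"] = 2  # the user creates a Deployment
reconcile_until_stable(store)
print([(p.name, p.node, p.running) for p in store.pods])
```

None of the functions call each other; each one only notices a gap in the store and closes it, which is why marking a pod as not running and reconciling again brings it back.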

Slide 18

Slide 18 text

One "single point of failure" is great
Keeping one service up is much simpler

Slide 19

Slide 19 text

So how do reconciliation loops fail?
They slow down and halt, but with easy restarts and no data loss

Slide 20

Slide 20 text

Each controller must be stateless
This lets you scale them up, down, restart, upgrade, and recover easily

Slide 21

Slide 21 text

Practical example: Takahē
Of course I wrote an ActivityPub/Fediverse server!

Slide 22

Slide 22 text


Slide 23

Slide 23 text

The Fediverse is an event-driven system
Well, mostly. It's also imperfect and still evolving.

Slide 24

Slide 24 text

Making a single Fediverse post
[Diagram: Sending Client, Origin API, Fanout System, Destination Inbox, Timeline Builder, Receiving Client, plus two queues]

Slide 25

Slide 25 text

A server needs background workers
Both to fan out sending of posts, and to process incoming posts

Slide 26

Slide 26 text

Stator
Takahē's state-machine reconciliation system

Slide 27

Slide 27 text

See more inside github.com/jointakahe/takahe

Slide 28

Slide 28 text

You can also do this with Airflow
We have an Astronomer executor which uses this for running tasks

Slide 29

Slide 29 text

Airflow Example
● A Task Instance is scheduled to run, and the scheduler passes it over
● The Executor makes a new Workload for the Task Instance
● Allocator notices that the Workload has no Runner
  ○ To reconcile this, it assigns it to a runner
● Runner notices it has an assigned Workload that is not running
  ○ To reconcile this, it creates it locally and starts it

Slide 30

Slide 30 text

So, how should you build them?
Carefully! Ha, ha.

Slide 31

Slide 31 text

● All state in the central store: You can maybe get away with caching elsewhere
● Zero service-to-service comms: Communication paths are failure modes
● Controllers should be simple loops: Query database, reconcile, repeat
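A "query database, reconcile, repeat" controller can be very small. The sketch below uses SQLite from the Python standard library as the central store and borrows the Workload/Runner/Allocator vocabulary from the Airflow example; the `workloads` table, its columns, and the modulo placement rule are hypothetical and do not reflect Airflow's or Astronomer's actual schema.

```python
import sqlite3
import time


def create_schema(conn):
    """Hypothetical single table holding all state."""
    conn.execute("CREATE TABLE workloads (id INTEGER PRIMARY KEY, runner TEXT)")
    conn.commit()


def allocator_pass(conn, runners):
    """One loop iteration: query for gaps, reconcile them, write back.

    Returns the number of workloads this pass assigned.
    """
    rows = conn.execute(
        "SELECT id FROM workloads WHERE runner IS NULL ORDER BY id"
    ).fetchall()
    assigned = 0
    for (workload_id,) in rows:
        runner = runners[workload_id % len(runners)]
        # The extra IS NULL guard keeps the write harmless if another
        # allocator instance assigned this workload in the meantime.
        cursor = conn.execute(
            "UPDATE workloads SET runner = ? WHERE id = ? AND runner IS NULL",
            (runner, workload_id),
        )
        assigned += cursor.rowcount
    conn.commit()
    return assigned


def run_controller(conn, runners, iterations, interval=1.0):
    """The whole controller: no in-memory state survives between passes."""
    for _ in range(iterations):
        allocator_pass(conn, runners)
        time.sleep(interval)


conn = sqlite3.connect(":memory:")
create_schema(conn)
for _ in range(3):
    conn.execute("INSERT INTO workloads (runner) VALUES (NULL)")
conn.commit()
print(allocator_pass(conn, ["r1", "r2"]))  # first pass assigns everything
print(allocator_pass(conn, ["r1", "r2"]))  # second pass finds nothing to do
```

Because each pass starts from a fresh query, the process can be killed and restarted at any point without losing track of work.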

Slide 32

Slide 32 text

They are easy to test and refactor
Mocking your datastore is all that's needed; everything else should flow
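As a rough illustration of why a mocked datastore is enough, here is a controller function tested with `unittest.mock` from the standard library. The `reconcile_unassigned` function and the `list_unassigned`/`assign` store methods are hypothetical names chosen for this sketch, not an interface from Takahē or Airflow.

```python
from unittest.mock import Mock, call


def reconcile_unassigned(store, runners):
    """Controller logic under test: its only dependency is `store`."""
    assigned = 0
    for index, workload_id in enumerate(store.list_unassigned()):
        store.assign(workload_id, runners[index % len(runners)])
        assigned += 1
    return assigned


def test_assigns_each_unassigned_workload():
    store = Mock()
    store.list_unassigned.return_value = [10, 11, 12]
    assert reconcile_unassigned(store, ["r1", "r2"]) == 3
    assert store.assign.call_args_list == [
        call(10, "r1"),
        call(11, "r2"),
        call(12, "r1"),
    ]


def test_does_nothing_when_store_is_consistent():
    store = Mock()
    store.list_unassigned.return_value = []
    assert reconcile_unassigned(store, ["r1"]) == 0
    store.assign.assert_not_called()
```

There is no queue, network peer, or other service to fake, because the controller never talks to anything except the store.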

Slide 33

Slide 33 text

They self-heal!
Provided your datastore stays up, everything else is flexible

Slide 34

Slide 34 text

There are downsides - mostly max scale
But I generally design systems with upper bounds on their throughput

Slide 35

Slide 35 text

The datastore will be your bottleneck
But modern DBs are very capable - and you can do replicas!

Slide 36

Slide 36 text

Queues are still fine!
If you need the complexity and scale, then great. But they are harder!

Slide 37

Slide 37 text

Remember, there's no universal solution
Just please, please, avoid building a mess of microservices that do RPC calls

Slide 38

Slide 38 text

Thanks!
Andrew Godwin [email protected]
Want to talk more about Takahē and ActivityPub protocols? Come to the Open Space at 3pm in Room 251AB!