Reconciling Everything

A talk I gave at PyCon US 2023.

Andrew Godwin

April 22, 2023

Transcript

  1. RECONCILING EVERYTHING
    ANDREW GODWIN // @[email protected]

  2. Hi, I’m
    Andrew Godwin
    • Principal Engineer at Astronomer.io (Airflow)
    • Django Migrations, Channels, Async, others
    • Building distributed systems since 2008

  3. Let's start with a bold statement
    I hear it drives engagement. Maybe also heckling.

  4. There are only two good* ways to build distributed systems**:
    Event-Driven Via Queues
    Totally separate stores of truth that only talk via queues
    Reconciliation Loops
    Stateless components that talk to one store of truth

  5. What is "good"?
    Low maintenance and high reliability, for me at least.

  6. What kinds of distributed systems?
    Remember, friends don't let friends write microservices

  7. Due to size
    What companies often claim is the reason
    Due to team structure
    What the actual reason often is
    Due to federation
    It's what the internet was built on!

  8. RPC Microservices
    [Diagram: Email Service, Accounting Service, Billing Service, Reservation Service]

  9. A system is defined by its failure modes
    Nobody gets paged when it's up and running happily

  10. Event-Driven (Message Passing)
    [Diagram: Task Runners, Billing System, Log Storage, Analytics Aggregator, and three QUEUE boxes]

  11. What kind of queue is it?
    Hint: It will not deliver messages exactly once

  12. At-most-once
    In case of failure, a message will not get delivered
    At-least-once
    In case of failure, a message will get delivered twice or more
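
The difference between the two guarantees comes down to when the consumer acknowledges a message relative to doing the work. The sketch below is a toy in-memory model, not any real queue's API; `deliver`, `ack_first`, and `crash_on` are hypothetical names, and the "crash" is simulated by skipping the rest of one iteration.

```python
from collections import deque


def deliver(messages, ack_first, crash_on):
    """Return the side effects seen when the consumer crashes once on `crash_on`.

    ack_first=True models at-most-once (ack, then work).
    ack_first=False models at-least-once (work, then ack).
    """
    pending = deque(messages)
    effects = []
    crashed = False
    while pending:
        message = pending[0]
        if ack_first:
            pending.popleft()  # acknowledged: the queue forgets the message
            if message == crash_on and not crashed:
                crashed = True  # crash before the work happens: message is lost
                continue
            effects.append(message)
        else:
            effects.append(message)  # the work (side effect) happens first
            if message == crash_on and not crashed:
                crashed = True  # crash before the ack: message is redelivered
                continue
            pending.popleft()
    return effects


print(deliver(["a", "b", "c"], ack_first=True, crash_on="b"))
print(deliver(["a", "b", "c"], ack_first=False, crash_on="b"))
```

With a crash on "b", the ack-first run never applies "b", while the ack-last run applies it twice.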

  13. Backlogs
    Need more queue consumers, can be asymmetric
    Replays
    I hope you can cope with lots of duplicates!
    Overflow
    Do you grind to a halt? Or drop?
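
One common way to cope with the duplicates that replays produce is to make the consumer idempotent by remembering which message IDs it has already applied. This is a minimal sketch under assumptions: the message shape (`id`, `amount`) and the `IdempotentConsumer` class are hypothetical, and a real system would need to keep the seen-ID set in durable storage alongside the data it protects.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed on a message ID."""

    def __init__(self):
        self.seen_ids = set()  # in-memory only; an assumption for this sketch
        self.balance = 0

    def handle(self, message):
        """Apply the message and return True, or return False for a duplicate."""
        if message["id"] in self.seen_ids:
            return False
        self.balance += message["amount"]
        self.seen_ids.add(message["id"])
        return True


consumer = IdempotentConsumer()
replayed = [
    {"id": "m1", "amount": 10},
    {"id": "m2", "amount": 5},
    {"id": "m1", "amount": 10},  # duplicate delivery of m1
]
results = [consumer.handle(message) for message in replayed]
print(results, consumer.balance)
```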

  14. Neither is "perfect"
    (But I recommend at-least-once unless you truly do not care about the data)

  15. Reconciliation (Control) Loop
    [Diagram: Kubelet, Deployment Controller, Pod Controller, Kubectl (User), DATABASE]

  16. Reconciliation (Control) Loop
    [Diagram: Kubelet, Deployment Controller, Pod Controller, Kubectl (User), apiserver, etcd]

  17. Kubernetes Example
    ● User creates a new Deployment
    ● Deployment controller notices that the deployment has no pods
    ○ To reconcile this, it makes one
    ● Pod controller notices that the pod is not assigned
    ○ To reconcile this, it assigns it to a node
    ● Kubelet notices it has an assigned pod that is not running
    ○ To reconcile this, it creates it locally and starts it
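
The steps above can be sketched as three independent functions that each read a shared store and fix one kind of discrepancy. This is an illustration of the pattern only, not Kubernetes code: the dict-based `store`, its keys, and every function name here are assumptions made for the example, and the placement logic is deliberately trivial.

```python
def deployment_controller(store):
    """Create a pod for any deployment that has none. Returns True if it changed anything."""
    changed = False
    for name in store["deployments"]:
        if not any(pod["deployment"] == name for pod in store["pods"]):
            store["pods"].append({"deployment": name, "node": None, "running": False})
            changed = True
    return changed


def pod_controller(store):
    """Assign any unassigned pod to a node."""
    changed = False
    for pod in store["pods"]:
        if pod["node"] is None:
            pod["node"] = store["nodes"][0]  # trivial placement, for illustration
            changed = True
    return changed


def kubelet(store, node):
    """Start any pod assigned to `node` that is not yet running."""
    changed = False
    for pod in store["pods"]:
        if pod["node"] == node and not pod["running"]:
            pod["running"] = True
            changed = True
    return changed


def reconcile_until_stable(store):
    """Run every controller until a full pass changes nothing; return the pass count."""
    passes = 0
    while True:
        passes += 1
        changed = [
            deployment_controller(store),
            pod_controller(store),
            kubelet(store, store["nodes"][0]),
        ]
        if not any(changed):
            return passes


store = {"deployments": ["web"], "pods": [], "nodes": ["node-1"]}
reconcile_until_stable(store)
print(store["pods"])
```

None of the functions call each other or hold state between passes; they only communicate through the store, so running any of them again on a reconciled store is a no-op.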

  18. One "single point of failure" is great
    Keeping one service up is much simpler

  19. So how do reconciliation loops fail?
    They slow down and halt, but with easy restarts and no data loss

  20. Each controller must be stateless
    This lets you scale them up, down, restart, upgrade, and recover easily

  21. Practical example: Takahē
    Of course I wrote an ActivityPub/Fediverse server!

  23. The Fediverse is an event-driven system
    Well, mostly. It's also imperfect and still evolving.

  24. Making a single Fediverse post
    [Diagram: Sending Client, Origin API, Fanout System, Destination Inbox, Timeline Builder, Receiving Client, and two QUEUE boxes]

  25. A server needs background workers
    Both to fan out sending of posts, and to process incoming posts

  26. Stator
    Takahē's state-machine reconciliation system
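
To give a feel for what a state-machine reconciliation system might look like, here is a generic sketch. It is not Stator's actual API (see the Takahē repository for that); the `Post` dataclass, the state names, `HANDLERS`, and `run_once` are all hypothetical. Each non-terminal state has a handler that either returns the next state or returns None, meaning "not yet, try again on a later pass".

```python
from dataclasses import dataclass


@dataclass
class Post:
    content: str
    state: str = "new"
    attempts: int = 0


def handle_new(post):
    """Pretend delivery fails on the first attempt and succeeds afterwards."""
    post.attempts += 1
    return "fanned_out" if post.attempts >= 2 else None


# States without a handler are terminal and are skipped by the runner.
HANDLERS = {"new": handle_new}


def run_once(posts):
    """One reconciliation pass; returns how many posts changed state."""
    transitions = 0
    for post in posts:
        handler = HANDLERS.get(post.state)
        if handler is None:
            continue
        next_state = handler(post)
        if next_state is not None:
            post.state = next_state
            transitions += 1
    return transitions


posts = [Post("Hello, Fediverse")]
while run_once(posts):
    pass  # stop at the first pass with no transitions; a real runner keeps polling
print(posts[0].state)
```

Because progress is recorded as a state on the object itself, a runner that dies mid-pass loses nothing: the next pass simply retries whatever has not yet moved on.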

  27. See more inside github.com/jointakahe/takahe

  28. You can also do this with Airflow
    We have an Astronomer executor which uses this for running tasks

  29. Airflow Example
    ● A Task Instance is scheduled to run, and the scheduler passes it over
    ● The Executor makes a new Workload for the Task Instance
    ● Allocator notices that the Workload has no Runner
    ○ To reconcile this, it assigns it to a runner
    ● Runner notices it has an assigned Workload that is not running
    ○ To reconcile this, it creates it locally and starts it

  30. So, how should you build them?
    Carefully! Ha, ha.

  31. All state in the central store
    You can maybe get away with caching elsewhere
    Zero service-to-service comms
    Communication paths are failure modes
    Controllers should be simple loops
    Query database, reconcile, repeat
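
"Query database, reconcile, repeat" can be sketched with nothing but the standard library. The example below uses an in-memory SQLite database as a stand-in for the central store; the `workloads` table, its columns, and the function names are assumptions for illustration. A long-running controller would sleep between passes and loop forever, whereas this version stops once a pass finds nothing to do so that it terminates.

```python
import sqlite3


def make_store():
    """Build a toy central store with two unassigned workloads and one assigned."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE workloads (id INTEGER PRIMARY KEY, runner TEXT)")
    db.executemany(
        "INSERT INTO workloads (runner) VALUES (?)",
        [(None,), (None,), ("runner-1",)],
    )
    db.commit()
    return db


def reconcile_once(db, runner):
    """Query for unassigned workloads and assign them; return how many were found."""
    rows = db.execute("SELECT id FROM workloads WHERE runner IS NULL").fetchall()
    for (workload_id,) in rows:
        # The extra IS NULL guard keeps the update harmless if another
        # controller instance assigned the row in the meantime.
        db.execute(
            "UPDATE workloads SET runner = ? WHERE id = ? AND runner IS NULL",
            (runner, workload_id),
        )
    db.commit()
    return len(rows)


def control_loop(db, runner, max_passes=10):
    """Query, reconcile, repeat; stops early once a pass finds nothing to do."""
    total = 0
    for _ in range(max_passes):
        found = reconcile_once(db, runner)
        if found == 0:
            break
        total += found
    return total


db = make_store()
print(control_loop(db, "runner-2"))
```

The controller keeps no state of its own between passes, so it can be killed and restarted at any point and will pick up from whatever the store says.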

  32. They are easy to test and refactor
    Mocking your datastore is all that's needed; everything else should flow
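
As a rough illustration of why a mocked datastore is enough, consider a controller written as a plain function over a store object. The store interface used here (`unassigned()` and `assign()`) and the controller name are hypothetical; `unittest.mock.Mock` stands in for the real datastore.

```python
from unittest.mock import Mock, call


def assign_controller(store, runner):
    """One pass: assign every unassigned workload to `runner`."""
    for workload_id in store.unassigned():
        store.assign(workload_id, runner)


def test_assigns_every_unassigned_workload():
    store = Mock()
    store.unassigned.return_value = [1, 2]
    assign_controller(store, "runner-a")
    assert store.assign.call_args_list == [call(1, "runner-a"), call(2, "runner-a")]


def test_does_nothing_when_already_reconciled():
    store = Mock()
    store.unassigned.return_value = []
    assign_controller(store, "runner-a")
    store.assign.assert_not_called()


test_assigns_every_unassigned_workload()
test_does_nothing_when_already_reconciled()
```

Since the controller has no other collaborators, these two cases (work to do, nothing to do) cover its whole behaviour.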

  33. They self-heal!
    Provided your datastore stays up, everything else is flexible

  34. There are downsides - mostly max scale
    But I generally design systems with upper bounds on their throughput

  35. The datastore will be your bottleneck
    But modern DBs are very capable - and you can do replicas!

  36. Queues are still fine!
    If you need the complexity and scale, then great. But they are harder!

  37. Remember, there's no universal solution
    Just please, please, avoid building a mess of microservices that call each other over RPC

  38. Thanks!
    Andrew Godwin
    [email protected]
    Want to talk more about Takahē and ActivityPub protocols?
    Come to the Open Space at 3pm in Room 251AB!
