Mistakes we made adopting event sourcing (and how we recovered)

Nat Pryce
February 05, 2020

Over the last few years we have been building new systems that have event-sourced architectures. Event sourcing is a good fit for our needs because the organisation wants to preserve an accurate history of the information managed by the system and analyse it for (among other things) fraud detection. When we started, however, none of us had built a system with an event-sourced architecture before. Despite reading plenty of advice on what to do and what to avoid, and experience reports from other projects, we made some significant mistakes in our design. This talk describes where we went wrong, in the hope that others can learn from our failures.

But it’s not all bad news. We were able to recover from our mistakes with an ease that surprised us. I’ll also describe the factors that allowed us to easily change our architecture, in the hope that others can learn from our successes too.

Presented at DDDEU 2020.

Transcript

  1. Mistakes we made adopting event sourcing (and how we recovered)
    Nat Pryce
    natpryce.com
    @natpryce

  2. We were building an editorial system for scientific publishing
    [Diagram labels: Submit, Review, Accept, Typeset, Publish, Read, Research, Transfer, Reject, Revise, Research Integrity Analysis]

  3. Containment & facing metaphors: core & peripheral applications
    [Diagram label: Research Integrity Analysis]

  4. Technical underpinnings: boring technologies
    http://boringtechnology.club/

  5. Mistakes we made

  6. Mistakes we made
    ● Seduced by eventual consistency
    [Diagram labels: two Application boxes, HTTP, Event History Service, Event History Store]

  7. Mistakes we made
    ● Seduced by eventual consistency
    ● Command processors both stored events and current state
    [Diagram labels: Application, Current State Store, Event History Service, Event History Store]

  8. Mistakes we made
    ● Seduced by eventual consistency
    ● Command processors both stored events and current state
    ● Confusion between event-driven & event-sourced architecture
    [Diagram labels: two Application boxes, Event History Service, Event History Store]

  9. Mistakes we made
    ● Seduced by eventual consistency
    ● Command processors both stored events and current state
    ● Confusion between event-driven & event-sourced architecture
    ● Used the event store as a message bus
    [Diagram labels: two Application boxes, Event History Service, Event History Store]

  10. How we recognised our mistakes

  11. Perhaps I was a little Machiavellian
    I wanted us to build an intuition for the advantages, disadvantages and trade-offs inherent in an event-sourced architecture, rather than apply patterns cookie-cutter style.

  12. What we decided to do

  13. Change our terminology
    The word “event” is such an overused term!
    We had many discussions about how to name different kinds of event:
    ● those that are part of the historical record
    ● those that are emitted by our active application monitoring
    ● those that are notifications that should trigger activity
    ● etc.
    Our latest attempt:
    ● "historical fact" not "event"
    ● "historical record" not "event store"

  14. Use REST & HTTP for integration, not events
    HATEOAS means that peripheral applications are context independent
    ● We can link our legacy systems to them
    ● We can invoke them from test tools, even in the live environment
    Links and HTTP content type negotiation to manage service evolution
    ● Decouple release cadences across the organisation

  15. Make command handlers transactional
    The application components connect directly to the event store database
    We kept the HTTP service, but only for reading events
    It turned out to be very useful for hack-days!

  16. Treat snapshot of entity state as a read-through cache
    1. In a read transaction:
       a. load the recent state of the entity into an in-memory model
    2. In a write transaction:
       a. load events that occurred to the entity since the recent projection into the in-memory model
       b. perform business logic
       c. record events resulting from executing the command
    3. In a write transaction:
       a. save the in-memory state if it was created from more recent events than the latest state in the database
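
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's code: Fact, Submission, HistoricalRecord, SnapshotCache and RenameHandler are hypothetical names, the two stores are in-memory stand-ins for database tables, and the transaction boundaries are only marked by comments.

```kotlin
// Hypothetical names throughout; in-memory stand-ins replace the real database.

data class Fact(val sequence: Long, val entityId: String, val title: String)

// Current state of one entity, plus the sequence number of the last fact folded in.
data class Submission(val id: String, val title: String = "", val version: Long = 0) {
    fun evolve(fact: Fact): Submission = copy(title = fact.title, version = fact.sequence)
}

// Stand-in for the append-only table of recorded facts.
class HistoricalRecord {
    private val facts = mutableListOf<Fact>()

    fun since(entityId: String, version: Long): List<Fact> =
        facts.filter { it.entityId == entityId && it.sequence > version }

    fun append(entityId: String, title: String): Fact =
        Fact(facts.size + 1L, entityId, title).also { facts.add(it) }
}

// Stand-in for the table of entity snapshots.
class SnapshotCache {
    private val snapshots = mutableMapOf<String, Submission>()

    fun load(id: String): Submission = snapshots[id] ?: Submission(id)

    // Only overwrite when the candidate was built from more recent facts.
    fun saveIfNewer(candidate: Submission) {
        val stored = snapshots[candidate.id]
        if (stored == null || candidate.version > stored.version) {
            snapshots[candidate.id] = candidate
        }
    }
}

class RenameHandler(private val record: HistoricalRecord, private val cache: SnapshotCache) {
    fun handle(id: String, newTitle: String): Submission {
        // Step 1 (read transaction): load the most recent snapshot.
        val snapshot = cache.load(id)

        // Step 2 (write transaction): catch up on newer facts, run the
        // business logic, then record the fact produced by the command.
        val current = record.since(id, snapshot.version)
            .fold(snapshot) { state, fact -> state.evolve(fact) }
        require(newTitle.isNotBlank()) { "title must not be blank" }
        val updated = current.evolve(record.append(id, newTitle))

        // Step 3 (separate write transaction): refresh the cache if we are ahead of it.
        cache.saveIfNewer(updated)
        return updated
    }
}
```

Because the snapshot is only a cache, losing the step 3 write is harmless in this sketch: the next command simply replays more facts in step 2.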

  17. Store current entity state as JSON, not relational tables
    Use existing code for:
    ● Serialisation/deserialisation between JSON and Kotlin domain model
    ● Migrating JSON structures in the database concurrently with live transactions
    Alternatively: discard JSON and rebuild from events if the format is out of date
    Postgres can index on properties of JSON objects
    ● No need to denormalise JSON into columns to select entities
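
The "discard and rebuild" alternative mentioned above could look something like this. It is a sketch under assumptions: the slide does not say how the format is versioned, so the formatVersion field, the CURRENT_FORMAT constant and the loadOrRebuild function are all hypothetical, and JSON decoding is passed in as a function rather than tied to a particular library.

```kotlin
// Hypothetical: the snapshot row carries a format version next to its JSON payload.
const val CURRENT_FORMAT = 2

data class StoredSnapshot(val formatVersion: Int, val json: String)

// Decode the stored JSON when its format is current; otherwise ignore it
// and rebuild the state by replaying the recorded events.
fun <S> loadOrRebuild(
    stored: StoredSnapshot?,
    decode: (String) -> S,
    rebuildFromEvents: () -> S
): S =
    if (stored != null && stored.formatVersion == CURRENT_FORMAT) decode(stored.json)
    else rebuildFromEvents()
```

This only works because the events, not the JSON, are the source of truth; the snapshot can always be thrown away.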

  18. How we recovered

  19. Why was that so easy?

  20. We used the "Hexagonal" architecture
    [Diagram label: HTTP service]
    https://alistair.cockburn.us/hexagonal-architecture/

  21. We had extensive automated tests
    val scenario = newScenario()
    @Test
    fun `an author cannot edit another author's submission`() {
        val alice = scenario.newAuthor(aliceDetails, aliceUserId, theJournal)
        val bob = scenario.newAuthor(bobDetails, bobUserId, theJournal)
        val alicesSubmission = alice.canCreateSubmission(
            journal = theJournal,
            manuscriptType = article)
        bob.cannotUpdate(alicesSubmission, SetAuthors(bob),
            because = NotAuthorised(bob.userId, theJournal, alicesSubmission, alice.userId))
    }

  22. We integrated and deployed continuously
    Everyone pushed straight to master: no branches or pull requests
    Every commit was automatically built, tested and promoted to live.
    Release decoupled from deployment, controlled by feature flags. Features could be turned on for a single session with cookies, or for all users.
    Client-driven compatibility tests automatically inserted into build pipelines of dependencies
    The tests have to have good coverage!
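
A feature flag that can be switched on for one session via a cookie, or for all users, might be checked along these lines. The slide gives no implementation details, so the FeatureFlags class, the "features" cookie name and its comma-separated format are assumptions made for illustration.

```kotlin
// Hypothetical sketch: the "features" cookie and its comma-separated format are assumed.
class FeatureFlags(private val enabledForEveryone: Set<String>) {

    // A flag is on if it has been released to all users,
    // or if this session's cookie opts in to it.
    fun isEnabled(flag: String, cookies: Map<String, String>): Boolean {
        val sessionFlags = cookies["features"].orEmpty()
            .split(',')
            .map { it.trim() }
            .filter { it.isNotEmpty() }
        return flag in enabledForEveryone || flag in sessionFlags
    }
}
```

With a check like this, code for an unfinished feature can be deployed on every commit while staying invisible until the flag is released.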

  23. Our automated tests enforced the Hexagonal architecture
    [Diagram labels: Test, Direct Actor, In-memory storage]
    Screenplay Pattern
    https://ideas.riverglide.com/page-objects-refactored-12ec3541990

  24. The exact same test code ran against the HTTP services
    [Diagram labels: Test, HTTP Actor, HTTP service]
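
One way to read the last two slides: the scenario is written once against an actor interface, and only the actor implementation changes between the in-memory and HTTP runs. The sketch below assumes that reading. AuthorActor, DirectAuthor, HttpAuthor and the /submissions paths are hypothetical, and the HTTP transport is a plain function so the example runs without a server.

```kotlin
// Hypothetical port: what a test actor can do, expressed in domain terms only.
interface AuthorActor {
    fun createSubmission(title: String): String // returns the submission id
    fun titleOf(submissionId: String): String?
}

// "Direct" actor: works in-process against in-memory storage.
class DirectAuthor(
    private val storage: MutableMap<String, String> = mutableMapOf()
) : AuthorActor {
    override fun createSubmission(title: String): String =
        "submission-${storage.size + 1}".also { storage[it] = title }

    override fun titleOf(submissionId: String): String? = storage[submissionId]
}

// "HTTP" actor: same port, but every step goes through an HTTP transport.
// The paths are illustrative assumptions, not the real service's API.
class HttpAuthor(
    private val send: (method: String, path: String, body: String) -> String
) : AuthorActor {
    override fun createSubmission(title: String): String =
        send("POST", "/submissions", title)

    override fun titleOf(submissionId: String): String? =
        send("GET", "/submissions/$submissionId", "").takeIf { it.isNotEmpty() }
}

// The scenario knows nothing about storage or transport,
// so it can be reused unchanged with either actor.
fun authorCanReadBackSubmission(author: AuthorActor): Boolean {
    val id = author.createSubmission("On event sourcing")
    return author.titleOf(id) == "On event sourcing"
}
```

Keeping technical details out of the scenario is what would let the same check follow the system from in-memory storage to a deployed service.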

  25. … when deployed into a cloud environment
    [Diagram labels: Test, HTTP Actor, Firewall, Load balancer, three Virtual Machine boxes]

  26. … through the browser
    [Diagram labels: Test, Browsing Actor, Web browser, three Front-end cluster boxes, three Service cluster boxes]

  27. … against production
    [Diagram labels: Test, Browsing Actor, Web browser, CDN, three Data centre boxes]

  28. Lessons Learned

  29. The distinction between commands and events is vital
    The confusion between "event sourced" and "event driven" seems very common.
    ● Developers joining the project often go down the same rabbit hole we did.
    Drawing a clear distinction between "things that cause side effects" and "things that happened in the past" is more important than Command/Query separation.
    (Even in an event-driven architecture, event handlers would perform commands in response to events and – to use the terminology we adopted – record facts in the historical record about what happened.)
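
In code, the distinction can be made explicit by giving commands and historical facts separate types. The following is an illustrative sketch only: the manuscript commands, the fact names and the handle function are hypothetical, not the project's domain model.

```kotlin
// Hypothetical types illustrating the distinction; not the project's real model.

// A command is a request: it may be rejected, and accepting it has side effects.
sealed class Command
data class SubmitManuscript(val author: String, val title: String) : Command()
data class WithdrawManuscript(val author: String) : Command()

// A historical fact describes something that already happened; it is never rejected.
sealed class HistoricalFact
data class ManuscriptSubmitted(val author: String, val title: String) : HistoricalFact()
data class ManuscriptWithdrawn(val author: String) : HistoricalFact()

sealed class Outcome
data class Accepted(val facts: List<HistoricalFact>) : Outcome()
data class Rejected(val reason: String) : Outcome()

// Handling a command validates it against current state and,
// if it is accepted, yields the facts to append to the historical record.
fun handle(command: Command, authorsWithSubmissions: Set<String>): Outcome =
    when (command) {
        is SubmitManuscript ->
            if (command.title.isBlank()) Rejected("title required")
            else Accepted(listOf(ManuscriptSubmitted(command.author, command.title)))
        is WithdrawManuscript ->
            if (command.author in authorsWithSubmissions)
                Accepted(listOf(ManuscriptWithdrawn(command.author)))
            else Rejected("nothing to withdraw")
    }
```

Naming the fact types in the past tense and the command types in the imperative keeps the two from being confused at call sites.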

  30. Event sourcing combines well with other architectural styles
    In our case:
    Event Sourcing: Domain & data model
    Hexagonal: Intra-process software architecture
    CQRS: Intra-application data flow
    REST: Integrate applications & organisations

  31. Eschew technical details in functional tests
    By defining our tests in terms of the domain model only, they were able to support significant changes to the technical architecture.
    The Hexagonal architecture and Screenplay pattern combine well to achieve this.
    Hexagonal architecture
    https://alistair.cockburn.us/hexagonal-architecture/
    Screenplay Pattern
    https://ideas.riverglide.com/page-objects-refactored-12ec3541990

  32. Thank you
    Nat Pryce
    natpryce.com
    @natpryce
