Mistakes we made adopting event sourcing (and how we recovered)

Nat Pryce
February 05, 2020

Over the last few years we have been building new systems that have event-sourced architectures. Event-sourcing is a good fit for our needs because the organisation wants to preserve an accurate history of information managed by the system and analyse it for (among other things) fraud detection. When we started, however, none of us had built a system with an event-sourced architecture before. Despite reading plenty of advice on what to do and what to avoid, and experience reports from other projects, we made some significant mistakes in our design. This talk describes where we went wrong, in the hope that others can learn from our failures.

But it’s not all bad news. We were able to recover from our mistakes with an ease that surprised us. I’ll also describe the factors that allowed us to easily change our architecture, in the hope that others can learn from our successes too.

Presented at DDDEU 2020.

Transcript

  1. Mistakes we made adopting event sourcing (and how we recovered)

    Nat Pryce natpryce.com @natpryce
  2. We were building an editorial system for scientific publishing

    [Diagram: Submit, Review, Accept, Typeset, Publish, Read, Research, Transfer, Reject, Revise, Research Integrity Analysis]
  3. Containment & facing metaphors: core & peripheral applications

    [Diagram: Research Integrity Analysis]
  4. Technical underpinnings: boring technologies http://boringtechnology.club/

  5. Mistakes we made

  6. Mistakes we made

    • Seduced by eventual consistency
    [Diagram: Application, Event History Service, Event History Store, Application, HTTP]
  7. Mistakes we made

    • Seduced by eventual consistency
    • Command processors both stored events and current state
    [Diagram: Application, Current State Store, Event History Service, Event History Store]
  8. Mistakes we made

    • Seduced by eventual consistency
    • Command processors both stored events and current state
    • Confusion between event-driven & event-sourced architecture
    [Diagram: Event History Service, Application, Event History Store, Application]
  9. Mistakes we made

    • Seduced by eventual consistency
    • Command processors both stored events and current state
    • Confusion between event-driven & event-sourced architecture
    • Used the event store as a message bus
    [Diagram: Event History Service, Application, Event History Store, Application]
  10. How we recognised our mistakes

  11. Perhaps I was a little Machiavellian

    I wanted us to build an intuition for the advantages, disadvantages and trade-offs inherent in an event-sourced architecture, rather than apply patterns cookie-cutter style.
  12. What we decided to do

  13. Change our terminology

    The word “event” is such an overused term!
    We had many discussions about how to name different kinds of event:
    • those that are part of the historical record
    • those that are emitted by our active application monitoring
    • those that are notifications that should trigger activity
    • etc.
    Our latest attempt:
    • "historical fact" not "event"
    • "historical record" not "event store"
  14. Use REST & HTTP for integration, not events

    HATEOAS means that peripheral applications are context independent
    • We can link our legacy systems to them
    • We can invoke them from test tools, even in the live environment
    Links and HTTP content type negotiation to manage service evolution
    • Decouple release cadences across the organisation
  15. Make command handlers transactional

    The application components connect directly to the event store database
    We kept the HTTP service, but only for reading events
    It turned out to be very useful for hack-days!
  16. Treat snapshot of entity state as a read-through cache

    1. In a read transaction:
       1. load the recent state of the entity into an in-memory model
    2. In a write transaction:
       1. load events that occurred to the entity since the recent projection into the in-memory model
       2. perform business logic
       3. record events resulting from executing the command
    3. In a write transaction:
       1. save the in-memory state if it was created from more recent events than the latest state in the database
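
The three steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the project's code: the types `Event`, `Snapshot` and `InMemoryStore`, the per-entity sequence numbers and the "rename" command are all assumptions, and an in-memory store stands in for the database transactions.

```kotlin
// Hypothetical names throughout: none of these types come from the talk.
data class Event(val entityId: String, val sequence: Long, val title: String)

// Snapshot of entity state, plus the sequence number of the last event folded into it.
data class Snapshot(val entityId: String, val upTo: Long, val title: String)

// Stand-in for the database: an event history and a table of saved snapshots.
class InMemoryStore {
    val events = mutableListOf<Event>()
    val snapshots = mutableMapOf<String, Snapshot>()

    fun eventsSince(entityId: String, sequence: Long): List<Event> =
        events.filter { it.entityId == entityId && it.sequence > sequence }
}

fun applyEvent(state: Snapshot, event: Event): Snapshot =
    state.copy(upTo = event.sequence, title = event.title)

fun handleRename(store: InMemoryStore, entityId: String, newTitle: String): Snapshot {
    // 1. Read transaction: load the most recently saved state (it may be stale or absent).
    val cached = store.snapshots[entityId] ?: Snapshot(entityId, upTo = 0, title = "")

    // 2. Write transaction: catch up on newer events, run the business logic,
    //    and record the event that results from the command.
    val current = store.eventsSince(entityId, cached.upTo)
        .fold(cached) { state, event -> applyEvent(state, event) }
    require(newTitle.isNotBlank()) { "title must not be blank" }
    val recorded = Event(entityId, sequence = current.upTo + 1, title = newTitle)
    store.events.add(recorded)
    val updated = applyEvent(current, recorded)

    // 3. Separate write transaction: save the snapshot only if it is newer than the stored one.
    val stored = store.snapshots[entityId]
    if (stored == null || stored.upTo < updated.upTo) {
        store.snapshots[entityId] = updated
    }
    return updated
}
```

Because the snapshot is only a cache, losing or deleting it is harmless in this sketch: the next command rebuilds the state by replaying the entity's events from the start.
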
  17. Store current entity state as JSON, not relational tables

    Use existing code for:
    • Serialisation/deserialisation between JSON and Kotlin domain model
    • Migrating JSON structures in the database concurrently with live transactions
    Alternatively: discard JSON and rebuild from events if the format is out of date
    Postgres can index on properties of JSON objects
    • No need to denormalise JSON into columns to select entities
  18. How we recovered

  19. Why was that so easy?

  20. We used the "Hexagonal" architecture

    [Diagram: HTTP service]
    https://alistair.cockburn.us/hexagonal-architecture/

  21. We had extensive automated tests

    val scenario = newScenario()

    @Test
    fun `an author cannot edit another author's submission`() {
        val alice = scenario.newAuthor(aliceDetails, aliceUserId, theJournal)
        val bob = scenario.newAuthor(bobDetails, bobUserId, theJournal)

        val alicesSubmission = alice.canCreateSubmission(
            journal = theJournal,
            manuscriptType = article)

        bob.cannotUpdate(alicesSubmission, SetAuthors(bob),
            because = NotAuthorised(bob.userId, theJournal, alicesSubmission, alice.userId))
    }
  22. We integrated and deployed continuously

    Everyone pushed straight to master: no branches or pull requests
    Every commit was automatically built, tested and promoted to live.
    Release decoupled from deployment, controlled by feature flags.
    Features could be turned on for a single session with cookies, or for all users.
    Client-driven compatibility tests automatically inserted into build pipelines of dependencies
    The tests have to have good coverage!
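
One way such a flag check might look is sketched below. The talk does not show the real mechanism, so the class name, the `features` cookie and its comma-separated format are all assumptions made for illustration.

```kotlin
// Hypothetical sketch: the class, the cookie name and the cookie format are assumptions.
class FeatureFlags(private val enabledForAll: Set<String>) {
    // A feature is on if it has been released to everyone, or if this session's
    // "features" cookie (a comma-separated list) opts in to it.
    fun isEnabled(feature: String, cookies: Map<String, String>): Boolean {
        if (feature in enabledForAll) return true
        val optedIn = cookies["features"].orEmpty().split(",").map { it.trim() }
        return feature in optedIn
    }
}
```

With a check like this, code for an unfinished feature can be deployed to live on every commit and stay dark until the flag is turned on for one session or for all users.
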
  23. Our automated tests enforced the Hexagonal architecture

    [Diagram: Test, Direct Actor, In-memory storage]
    Screenplay Pattern https://ideas.riverglide.com/page-objects-refactored-12ec3541990
  24. The exact same test code ran against the HTTP services

    [Diagram: Test, HTTP Actor, HTTP service]
  25. … when deployed into a cloud environment

    [Diagram: Test, HTTP Actor, Firewall, Load balancer, Virtual Machine (×3)]
  26. … through the browser

    [Diagram: Test, Browsing Actor, Web browser, Front-end cluster (×3), Service cluster (×3)]
  27. … against production

    [Diagram: Test, Browsing Actor, Web browser, CDN, Data centre (×3)]
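
The idea running through slides 23 to 27 is that a scenario is written once against an actor interface, and only the actor implementation changes between environments. A minimal sketch follows. All names here are hypothetical, and the "HTTP" actor uses a plain function as its transport, where a real one would use an HTTP client or drive a browser.

```kotlin
// Hypothetical port: the operations an "author" actor offers to test scenarios.
interface AuthorActor {
    fun createSubmission(title: String): String   // returns the new submission's id
    fun titleOf(submissionId: String): String?
}

// Adapter that talks to the domain directly, with in-memory storage.
class DirectAuthorActor : AuthorActor {
    private val submissions = mutableMapOf<String, String>()

    override fun createSubmission(title: String): String {
        val id = "submission-${submissions.size + 1}"
        submissions[id] = title
        return id
    }

    override fun titleOf(submissionId: String): String? = submissions[submissionId]
}

// Adapter that goes through a transport function, standing in for a real HTTP client.
class HttpAuthorActor(
    private val send: (method: String, path: String, body: String) -> String
) : AuthorActor {
    override fun createSubmission(title: String): String =
        send("POST", "/submissions", title)

    override fun titleOf(submissionId: String): String? =
        send("GET", "/submissions/$submissionId", "").takeIf { it.isNotEmpty() }
}

// In-process stand-in for a deployed service, so the sketch runs without a network.
fun fakeServer(): (String, String, String) -> String {
    val backend = DirectAuthorActor()
    return { method, path, body ->
        if (method == "POST") backend.createSubmission(body)
        else backend.titleOf(path.removePrefix("/submissions/")).orEmpty()
    }
}

// The scenario is written once, in domain terms, and knows nothing about the transport.
fun authorCanCreateSubmission(author: AuthorActor) {
    val id = author.createSubmission("On Boring Technologies")
    check(author.titleOf(id) == "On Boring Technologies")
}
```

Calling `authorCanCreateSubmission(DirectAuthorActor())` and `authorCanCreateSubmission(HttpAuthorActor(fakeServer()))` runs the same scenario through both adapters. Because the scenario cannot mention HTTP or storage, the tests also keep those details out of the domain code.
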
  28. Lessons Learned

  29. The distinction between commands and events is vital

    The confusion between "event sourced" and "event driven" seems very common.
    • Developers joining the project often go down the same rabbit hole we did.
    Drawing a clear distinction between "things that cause side effects" and "things that happened in the past" is more important than Command/Query separation.
    (Even in an event-driven architecture, event handlers would perform commands in response to events and – to use the terminology we adopted – record facts in the historical record about what happened.)
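
The distinction can be made visible in the type system. The sketch below is an illustration only: the types `Command`, `HistoricalFact`, `SetTitle` and `TitleSet` are hypothetical and are not taken from the project's domain model.

```kotlin
// Hypothetical types illustrating the distinction; not the project's real model.

// A command is a request: it may be refused, and handling it causes side effects.
sealed class Command
data class SetTitle(val submissionId: String, val title: String) : Command()

// A historical fact records something that has already happened; it cannot be refused.
sealed class HistoricalFact
data class TitleSet(val submissionId: String, val title: String) : HistoricalFact()

// Handling a command either refuses it (by throwing) or yields the facts
// to append to the historical record.
fun handle(command: Command): List<HistoricalFact> = when (command) {
    is SetTitle -> {
        require(command.title.isNotBlank()) { "title must not be blank" }
        listOf(TitleSet(command.submissionId, command.title))
    }
}
```

Keeping the two hierarchies separate means the compiler rejects code that treats a fact from the record as something to be executed, or that stores an unvalidated command as history.
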
  30. Event sourcing combines well with other architectural styles

    In our case:
    • Event Sourcing: Domain & data model
    • Hexagonal: Intra-process software architecture
    • CQRS: Intra-application data flow
    • REST: Integrate applications & organisations
  31. Eschew technical details in functional tests

    By defining our tests in terms of the domain model only, they were able to support significant changes to the technical architecture.
    The Hexagonal architecture and Screenplay pattern combine well to achieve this.
    Hexagonal architecture https://alistair.cockburn.us/hexagonal-architecture/
    Screenplay Pattern https://ideas.riverglide.com/page-objects-refactored-12ec3541990
  32. Thank you

    Nat Pryce natpryce.com @natpryce