Slide 1

Slide 1 text

Mistakes we made adopting event sourcing (and how we recovered)
Nat Pryce
natpryce.com
@natpryce

Slide 2

Slide 2 text

We were building an editorial system for scientific publishing
[Diagram labels: Submit, Review, Accept, Typeset, Publish, Read, Research, Transfer, Reject, Revise, Research Integrity Analysis]

Slide 3

Slide 3 text

Containment & facing metaphors: core & peripheral applications
[Diagram label: Research Integrity Analysis]

Slide 4

Slide 4 text

Technical underpinnings: boring technologies http://boringtechnology.club/

Slide 5

Slide 5 text

Mistakes we made

Slide 6

Slide 6 text

Mistakes we made
● Seduced by eventual consistency
[Diagram labels: Application, Application, HTTP, Event History Service, Event History Store]

Slide 7

Slide 7 text

Mistakes we made
● Seduced by eventual consistency
● Command processors both stored events and current state
[Diagram labels: Application, Current State Store, Event History Service, Event History Store]

Slide 8

Slide 8 text

Mistakes we made
● Seduced by eventual consistency
● Command processors both stored events and current state
● Confusion between event-driven & event-sourced architecture
[Diagram labels: Application, Application, Event History Service, Event History Store]

Slide 9

Slide 9 text

Mistakes we made
● Seduced by eventual consistency
● Command processors both stored events and current state
● Confusion between event-driven & event-sourced architecture
● Used the event store as a message bus
[Diagram labels: Application, Application, Event History Service, Event History Store]

Slide 10

Slide 10 text

How we recognised our mistakes

Slide 11

Slide 11 text

Perhaps I was a little Machiavellian
I wanted us to build an intuition for the advantages, disadvantages and trade-offs inherent in an event-sourced architecture, rather than apply patterns cookie-cutter style.

Slide 12

Slide 12 text

What we decided to do

Slide 13

Slide 13 text

Change our terminology
The word "event" is such an overused term!
We had many discussions about how to name different kinds of event:
● those that are part of the historical record
● those that are emitted by our active application monitoring
● those that are notifications that should trigger activity
● etc.
Our latest attempt:
● "historical fact" not "event"
● "historical record" not "event store"

Slide 14

Slide 14 text

Use REST & HTTP for integration, not events
HATEOAS means that peripheral applications are context independent
● We can link our legacy systems to them
● We can invoke them from test tools, even in the live environment
Links and HTTP content type negotiation to manage service evolution
● Decouple release cadences across the organisation
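A minimal sketch of what "context independent" can look like from the client side: the caller follows a named link offered by a resource instead of building URLs itself. The `Resource` shape, the link relation names and the `fetch` function are assumptions made for illustration; the slides do not show the real representation.

```kotlin
// Hypothetical representation: a resource carries its data plus named links.
data class Resource(val fields: Map<String, String>, val links: Map<String, String>)

// The client knows link relation names, never URL structures.
// `fetch` stands in for an HTTP GET; any caller (legacy system, test tool) can supply one.
class LinkFollowingClient(private val fetch: (String) -> Resource) {
    fun follow(from: Resource, rel: String): Resource {
        val url = from.links[rel] ?: error("no '$rel' link offered")
        return fetch(url)
    }
}
```

Because the URL comes from the link, the peripheral application can move or change its URL scheme without the callers being redeployed.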

Slide 15

Slide 15 text

Make command handlers transactional
The application components connect directly to the event store database.
We kept the HTTP service, but only for reading events.
It turned out to be very useful for hack-days!
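A minimal sketch of a transactional command handler, assuming an in-memory list guarded by a lock as a stand-in for a database transaction. All names (`HistoricalRecord`, `Submitted`, `Accepted`, `accept`) are hypothetical. The point it illustrates is that the check against history and the recording of the new fact happen in the same transaction.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

sealed interface Fact { val submissionId: String }
data class Submitted(override val submissionId: String) : Fact
data class Accepted(override val submissionId: String) : Fact

class HistoricalRecord {
    private val lock = ReentrantLock()
    private val facts = mutableListOf<Fact>()

    // Stand-in for a database transaction: the block reads and appends under one lock.
    fun <T> transaction(block: (MutableList<Fact>) -> T): T = lock.withLock { block(facts) }

    fun read(): List<Fact> = lock.withLock { facts.toList() }
}

// Command handler: decide and record atomically, so two concurrent
// "accept" commands cannot both pass the check.
fun accept(record: HistoricalRecord, submissionId: String): Boolean =
    record.transaction { facts ->
        val history = facts.filter { it.submissionId == submissionId }
        val canAccept = history.any { it is Submitted } && history.none { it is Accepted }
        if (canAccept) facts += Accepted(submissionId)
        canAccept
    }
```

With an eventually consistent HTTP service in between, the check and the append are separate requests, and a second command can slip in between them.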

Slide 16

Slide 16 text

Treat snapshot of entity state as a read-through cache
1. In a read transaction:
   1. load the recent state of the entity into an in-memory model
2. In a write transaction:
   1. load events that occurred to the entity since the recent projection into the in-memory model
   2. perform business logic
   3. record events resulting from executing the command
3. In a write transaction:
   1. save the in-memory state if it was created from more recent events than the latest state in the database
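The three steps above can be sketched as follows. This is an illustration under assumptions, not the project's code: in-memory collections stand in for the database, `@Synchronized` methods stand in for transactions, and the domain (`AuthorAdded`, `SubmissionState`) is invented for the example.

```kotlin
data class AuthorAdded(val submissionId: String, val sequence: Long, val author: String)

// The snapshot: `version` is the sequence number of the last fact folded into it.
data class SubmissionState(val version: Long = 0, val authors: Set<String> = emptySet()) {
    fun evolve(fact: AuthorAdded) = SubmissionState(fact.sequence, authors + fact.author)
}

class SubmissionStore {
    private val facts = mutableListOf<AuthorAdded>()                 // the historical record
    private val snapshots = mutableMapOf<String, SubmissionState>()  // the cache

    // Step 1 (read transaction): load the most recent snapshot, if any.
    @Synchronized
    private fun loadSnapshot(id: String): SubmissionState = snapshots[id] ?: SubmissionState()

    // Step 2 (write transaction): catch up on newer facts, run the logic, record new facts.
    @Synchronized
    private fun catchUpAndRecord(id: String, cached: SubmissionState, author: String): SubmissionState {
        val current = facts
            .filter { it.submissionId == id && it.sequence > cached.version }
            .fold(cached) { state, fact -> state.evolve(fact) }
        if (author in current.authors) return current // business rule: nothing new to record
        val fact = AuthorAdded(id, current.version + 1, author)
        facts += fact
        return current.evolve(fact)
    }

    // Step 3 (write transaction): save only if built from newer facts than the stored snapshot.
    @Synchronized
    private fun saveIfNewer(id: String, state: SubmissionState) {
        val stored = snapshots[id]
        if (stored == null || stored.version < state.version) snapshots[id] = state
    }

    fun addAuthor(id: String, author: String): SubmissionState {
        val cached = loadSnapshot(id)
        val updated = catchUpAndRecord(id, cached, author)
        saveIfNewer(id, updated)
        return updated
    }

    fun factCount(id: String): Int = synchronized(this) { facts.count { it.submissionId == id } }
}
```

If step 3 never runs, nothing is lost: the next command simply replays more facts in step 2, which is what makes the snapshot a cache rather than a second source of truth.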

Slide 17

Slide 17 text

Store current entity state as JSON, not relational tables
Use existing code for:
● Serialisation/deserialisation between JSON and Kotlin domain model
● Migrating JSON structures in the database concurrently with live transactions
Alternatively: discard JSON and rebuild from events if the format is out of date
Postgres can index on properties of JSON objects
● No need to denormalise JSON into columns to select entities
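A minimal sketch of the "discard and rebuild" alternative, assuming each stored JSON document is tagged with a format version. The version tag, the `StoredState` shape and the toy domain are assumptions; `parse` stands in for the existing JSON deserialisation code, and catching up on facts recorded after the snapshot is left out for brevity.

```kotlin
// Hypothetical version of the JSON format that the current code can read.
const val CURRENT_FORMAT = 2

// A stored row: the JSON text plus the format version it was written with.
data class StoredState(val formatVersion: Int, val json: String)

data class TitleChanged(val title: String)
data class Submission(val title: String, val revisions: Int)

// Rebuild the state from scratch by replaying the historical record.
fun replay(facts: List<TitleChanged>): Submission =
    facts.fold(Submission(title = "", revisions = 0)) { state, fact ->
        Submission(fact.title, state.revisions + 1)
    }

// Use the stored JSON only if it is in the current format; otherwise ignore it and replay.
fun load(stored: StoredState?, facts: List<TitleChanged>, parse: (String) -> Submission): Submission =
    if (stored != null && stored.formatVersion == CURRENT_FORMAT) parse(stored.json) else replay(facts)
```

This only works because the historical record, not the JSON, is the source of truth.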

Slide 18

Slide 18 text

How we recovered

Slide 19

Slide 19 text

Why was that so easy?

Slide 20

Slide 20 text

We used the "Hexagonal" architecture
https://alistair.cockburn.us/hexagonal-architecture/
[Diagram label: HTTP service]

Slide 21

Slide 21 text

We had extensive automated tests
    val scenario = newScenario()
    @Test
    fun `an author cannot edit another author's submission`() {
        val alice = scenario.newAuthor(aliceDetails, aliceUserId, theJournal)
        val bob = scenario.newAuthor(bobDetails, bobUserId, theJournal)
        val alicesSubmission = alice.canCreateSubmission(
            journal = theJournal,
            manuscriptType = article)
        bob.cannotUpdate(alicesSubmission, SetAuthors(bob),
            because = NotAuthorised(bob.userId, theJournal, alicesSubmission, alice.userId))
    }

Slide 22

Slide 22 text

We integrated and deployed continuously
Everyone pushed straight to master: no branches or pull requests.
Every commit was automatically built, tested and promoted to live.
Release decoupled from deployment, controlled by feature flags.
Features could be turned on for a single session with cookies, or for all users.
Client-driven compatibility tests automatically inserted into build pipelines of dependencies.
The tests have to have good coverage!
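A minimal sketch of a feature-flag check that separates release from deployment. The cookie naming scheme (`feature-<name>=on`) and the `FeatureFlags` class are assumptions for illustration; the slides do not describe the real mechanism.

```kotlin
// Hypothetical scheme: a feature is on if it is released to everyone,
// or if this session carries a cookie "feature-<name>" with value "on".
class FeatureFlags(private val enabledForAllUsers: Set<String>) {
    fun isEnabled(feature: String, sessionCookies: Map<String, String>): Boolean =
        feature in enabledForAllUsers || sessionCookies["feature-$feature"] == "on"
}
```

Code for an unreleased feature can then be deployed to live with every commit and stay dark until the flag is switched.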

Slide 23

Slide 23 text

Our automated tests enforced the Hexagonal architecture
[Diagram labels: Test, Direct Actor, In-memory storage]
Screenplay Pattern https://ideas.riverglide.com/page-objects-refactored-12ec3541990

Slide 24

Slide 24 text

The exact same test code ran against the HTTP services
[Diagram labels: Test, HTTP Actor, HTTP service]
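A minimal sketch of how one scenario can run against different adapters. The `EditorialSystem` port, its methods and the in-memory adapter are hypothetical; only the in-memory variant is shown, and an HTTP-backed actor would implement the same interface.

```kotlin
// The port: what a test actor can ask of the system, in domain terms only.
interface EditorialSystem {
    fun createSubmission(authorId: String, title: String): String
    fun setTitle(userId: String, submissionId: String, title: String): Boolean
    fun titleOf(submissionId: String): String?
}

// Adapter used by a "direct" actor: domain logic with in-memory storage.
class InMemoryEditorialSystem : EditorialSystem {
    private data class Submission(val ownerId: String, val title: String)
    private val submissions = mutableMapOf<String, Submission>()

    override fun createSubmission(authorId: String, title: String): String {
        val id = "submission-${submissions.size + 1}"
        submissions[id] = Submission(authorId, title)
        return id
    }

    override fun setTitle(userId: String, submissionId: String, title: String): Boolean {
        val existing = submissions[submissionId] ?: return false
        if (existing.ownerId != userId) return false // only the owner may edit
        submissions[submissionId] = existing.copy(title = title)
        return true
    }

    override fun titleOf(submissionId: String): String? = submissions[submissionId]?.title
}

// The scenario is written once against the port and knows nothing about transport.
fun anAuthorCannotEditAnotherAuthorsSubmission(system: EditorialSystem) {
    val submission = system.createSubmission("alice", "Original title")
    check(!system.setTitle("bob", submission, "Hijacked")) { "bob should not be authorised" }
    check(system.titleOf(submission) == "Original title")
}
```

Because the scenario mentions no URLs, status codes or page elements, swapping the adapter changes where the test runs without changing what it asserts.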

Slide 25

Slide 25 text

… when deployed into a cloud environment
[Diagram labels: Test, HTTP Actor, Firewall, Load balancer, Virtual Machine (×3)]

Slide 26

Slide 26 text

… through the browser
[Diagram labels: Test, Browsing Actor, Web browser, Front-end cluster (×3), Service cluster (×3)]

Slide 27

Slide 27 text

… against production
[Diagram labels: Test, Browsing Actor, Web browser, CDN, Data centre (×3)]

Slide 28

Slide 28 text

Lessons Learned

Slide 29

Slide 29 text

The distinction between commands and events is vital
The confusion between "event sourced" and "event driven" seems very common.
● Developers joining the project often go down the same rabbit hole we did.
Drawing a clear distinction between "things that cause side effects" and "things that happened in the past" is more important than Command/Query separation.
(Even in an event-driven architecture, event handlers would perform commands in response to events and, to use the terminology we adopted, record facts in the historical record about what happened.)
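One way to make this distinction hard to blur is to give commands and historical facts separate types. The sketch below is an assumption about how that could look, with invented names: commands are imperative and can be rejected, facts are past tense and are only ever appended.

```kotlin
// Commands: requests that may cause side effects and may be refused.
sealed interface Command
data class SubmitManuscript(val submissionId: String, val title: String) : Command
data class AcceptSubmission(val submissionId: String) : Command

// Historical facts: things that have already happened; never rejected, only recorded.
sealed interface HistoricalFact
data class ManuscriptSubmitted(val submissionId: String, val title: String) : HistoricalFact
data class SubmissionAccepted(val submissionId: String) : HistoricalFact

sealed interface Outcome
data class Recorded(val facts: List<HistoricalFact>) : Outcome
data class Rejected(val reason: String) : Outcome

// A command handler reads the history and decides which new facts, if any, to record.
fun handle(history: List<HistoricalFact>, command: Command): Outcome = when (command) {
    is SubmitManuscript ->
        if (history.any { it is ManuscriptSubmitted && it.submissionId == command.submissionId })
            Rejected("already submitted")
        else
            Recorded(listOf(ManuscriptSubmitted(command.submissionId, command.title)))
    is AcceptSubmission ->
        if (history.any { it is ManuscriptSubmitted && it.submissionId == command.submissionId })
            Recorded(listOf(SubmissionAccepted(command.submissionId)))
        else
            Rejected("no such submission")
}
```

With the two hierarchies kept apart, the compiler refuses an attempt to append a command to the historical record or to dispatch a fact as if it were a request.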

Slide 30

Slide 30 text

Event sourcing combines well with other architectural styles
In our case:
● Event Sourcing: Domain & data model
● Hexagonal: Intra-process software architecture
● CQRS: Intra-application data flow
● REST: Integrate applications & organisations

Slide 31

Slide 31 text

Eschew technical details in functional tests
By defining our tests in terms of the domain model only, they were able to support significant changes to the technical architecture.
The Hexagonal architecture and Screenplay pattern combine well to achieve this.
Hexagonal architecture https://alistair.cockburn.us/hexagonal-architecture/
Screenplay Pattern https://ideas.riverglide.com/page-objects-refactored-12ec3541990

Slide 32

Slide 32 text

Thank you
Nat Pryce
natpryce.com
@natpryce