Getting (service) design right

Slide 1

Slide 1 text

Getting (service) design right
or how timeless wisdom can lead the way
Uwe Friedrichsen – codecentric AG – 2016-2019

Slide 2

Slide 2 text

Uwe Friedrichsen
IT traveller. Dot Connector. Cartographer of uncharted territory. Keeper of timeless wisdom.
CTO and Fellow at codecentric.
https://www.speakerdeck.com/ufried
https://medium.com/@ufried
@ufried

Slide 3

Slide 3 text

The marvelous world of microservices

Slide 4

Slide 4 text

So, you are doing microservices ...

Slide 5

Slide 5 text

Yeah!

Slide 6

Slide 6 text

... of course, putting them in Docker ...

Slide 7

Slide 7 text

Yeah!

Slide 8

Slide 8 text

... and running them on Kubernetes

Slide 9

Slide 9 text

Yeah!

Slide 10

Slide 10 text

So, you are doing it the cloud-native way

Slide 11

Slide 11 text

“Cloud native computing uses an open source software stack to deploy applications as microservices, packaging each part into its own container, and dynamically orchestrating those containers to optimize resource utilization.” Source: Cloud Native Computing Foundation Homepage (https://www.cncf.io/)

Slide 12

Slide 12 text

Yeah!

Slide 13

Slide 13 text

Well, ever pondered the consequences?

Slide 14

Slide 14 text

Eh, what?

Slide 15

Slide 15 text

Why do you ask?

Slide 16

Slide 16 text

Consequences of cloud-native computing

Slide 17

Slide 17 text

Cloud-native computing → Services packaged in containers → Different processes at runtime → Remote communication between services → Distributed system

Slide 18

Slide 18 text

Distributed systems are not limited to cloud-native computing

Slide 19

Slide 19 text

(Almost) every system is a distributed system -- Chas Emerick http://www.infoq.com/presentations/problems-distributed-systems

Slide 20

Slide 20 text

The software you develop and maintain is most likely part of a (big) distributed system landscape

Slide 21

Slide 21 text

Consequences of distributed systems

Slide 22

Slide 22 text

Everything fails, all the time. -- Werner Vogels

Slide 23

Slide 23 text

Failures in distributed systems ...
• Crash failure
• Omission failure
• Timing failure
• Response failure
• Byzantine failure

Slide 24

Slide 24 text

... lead to a variety of effects
• Lost messages
• Incomplete messages
• Duplicate messages
• Distorted messages
• Out-of-order message arrival
• Partial, out-of-sync local memory
• ...

Slide 25

Slide 25 text

These effects are based on the non-determinism introduced by remote communication due to distributed failure modes

Slide 26

Slide 26 text

Understand that remote communication points are predetermined breaking points of your application.
Accept that the effects will hit you at the application level.
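What handling these effects at the application level can look like is sketched below for two of them, duplicate messages and out-of-order arrival. This is a minimal illustration, not part of the talk: the `Message` fields (`message_id`, `sequence`) and the `Receiver` class are hypothetical, and the sketch assumes a single sender that numbers its messages.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Message:
    message_id: str  # hypothetical: unique ID assigned by the sender
    sequence: int    # hypothetical: per-sender, strictly increasing number
    payload: str


@dataclass
class Receiver:
    """Applies each message at most once and ignores outdated ones."""
    seen_ids: Set[str] = field(default_factory=set)
    last_sequence: int = 0
    applied: List[str] = field(default_factory=list)

    def handle(self, msg: Message) -> str:
        # Duplicate delivery: the same message ID was already applied.
        if msg.message_id in self.seen_ids:
            return "duplicate"
        # Out-of-order arrival: a newer message was already applied.
        if msg.sequence <= self.last_sequence:
            return "stale"
        self.seen_ids.add(msg.message_id)
        self.last_sequence = msg.sequence
        self.applied.append(msg.payload)
        return "applied"
```

Dropping stale messages is only one possible policy; buffering and reordering them would be another, depending on the use case.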

Slide 27

Slide 27 text

What can we do against it?

Slide 28

Slide 28 text

Typical measures
• High-availability (HA) hardware/software
  • Only applicable for very small installations
  • Usually not available in cloud environments
• Delegate failure handling to infrastructure level
  • Partial relief, will not solve all problems
• Implement resilient software design patterns
  • Very important, still will not fix a bad design
• Minimize number of remote communication points
  • Minimize problem surface by design

Slide 29

Slide 29 text

Reducing remote communication points

Slide 30

Slide 30 text

Reducing remote communication points
• Reduce the number of distributed runtime artifacts, i.e., create coarser-grained services
• Reduce the need to communicate across runtime artifact boundaries, i.e., get the functional design right

Slide 31

Slide 31 text

Slide 32

Slide 32 text

Remember? https://www.martinfowler.com/articles/distributed-objects-microservices.html

Slide 33

Slide 33 text

If you can get away with a monolith, just do it! You are still allowed to structure it well ... ;)

Slide 34

Slide 34 text

Still, your functional domain usually will be too big to put it all in a single monolith.
Thus, you will need to distribute your system landscape to a certain degree.

Slide 35

Slide 35 text

Rule of thumb
Always think (at least) twice before distributing functionality. Only do it if you really need independent deployments.
Addendum
You may also want to distribute your functionality if you have very disparate NFRs for different parts of your functionality (e.g., security *, availability, reliability, scalability **, portability, resource utilization).
* forgotten most of the time
** less often needed than most people assume

Slide 36

Slide 36 text

Slide 37

Slide 37 text

Case study
(Very simple) eCommerce shop
• Implements the core functionalities
  • Search & show
  • Add to shopping cart
  • Checkout
  • Shipment
• Payments only touched as black box
• No recommendations, etc.

Slide 38

Slide 38 text

The typical design approach ... a.k.a. “the counter-example”

Slide 39

Slide 39 text

Typical design approach
Focus on avoiding redundancy and maximizing reuse
1. Start with a comprehensive domain (actually: E/R) model

Slide 40

Slide 40 text

Slide 41

Slide 41 text

Typical design approach
Focus on avoiding redundancy and maximizing reuse
1. Start with a comprehensive domain (actually: E/R) model
2. Wrap entities with services

Slide 42

Slide 42 text

Diagram: services (S) wrapping entities
• CustomerService: Customer
• OrderService: Order
• ProductService: Product

Slide 43

Slide 43 text

Slide 44

Slide 44 text

Diagram: functionality spread over the services (S)
• CustomerService (entity: Customer)
• OrderService (entity: Order): Add to shopping cart
• ProductService (entity: Product): Search/show

Slide 45

Slide 45 text

Typical design approach
Focus on avoiding redundancy and maximizing reuse
1. Start with a comprehensive domain (actually: E/R) model
2. Wrap entities with services
3. Spread functionality over services
4. Add “process services” for “complex use cases”
  • i.e., use cases that touch more than one data service

Slide 46

Slide 46 text

Diagram: data services and process services (S)
• CustomerService (entity: Customer)
• OrderService (entity: Order): Add to shopping cart
• ProductService (entity: Product): Search/show
• CheckoutService: Checkout
• ShipmentService: Shipment

Slide 47

Slide 47 text

Slide 48

Slide 48 text

Diagram: complete counter-example design (S = service)
• CustomerService (entity: Customer): Customer self care
• OrderService (entity: Order): Add to shopping cart
• ProductService (entity: Product): Search/show, Product catalog maintenance
• CheckoutService: Checkout
• ShipmentService: Shipment

Slide 49

Slide 49 text

This (familiar) design looks innocuous at first sight.
But how good is it in terms of remote communication?

Slide 50

Slide 50 text

Sequence diagram: Checkout
Participants: CheckoutService, OrderService, ProductService, CustomerService, Payment Provider
1. read order
2. loop [order items]: read price
3. calculate price
4. read payment methods
5. pay
6. mark paid

Slide 51

Slide 51 text

Sequence diagram: Shipment
Participants: ShipmentService, OrderService, ProductService, CustomerService, Delivery Provider
1. read order
2. loop [order items]: read product and packaging information
3. read delivery address
4. inform delivery provider
5. mark dispatched
6. loop [order items]: update items in stock

Slide 52

Slide 52 text

Findings
• Core business use cases are failure-prone and slow
• Data maintenance use cases are robust and fast
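A rough back-of-the-envelope model shows why the core use cases end up failure-prone: every remote call has to succeed for the use case to succeed. The sketch below is an illustration under stated assumptions, not a measurement. The call counts are one reading of the checkout and shipment sequences of the counter-example (one call per remote step, one per order item inside a loop), failures are assumed to be independent, and the per-call success probability is a hypothetical input.

```python
def checkout_remote_calls(order_items: int) -> int:
    # read order + read price (per item) + read payment methods + pay + mark paid
    # ("calculate price" is assumed to be local to the CheckoutService)
    return 1 + order_items + 1 + 1 + 1


def shipment_remote_calls(order_items: int) -> int:
    # read order + read product/packaging info (per item) + read delivery address
    # + inform delivery provider + mark dispatched + update items in stock (per item)
    return 1 + order_items + 1 + 1 + 1 + order_items


def use_case_success_probability(remote_calls: int, call_success: float) -> float:
    # Assumption: calls fail independently, so all of them must succeed.
    return call_success ** remote_calls


if __name__ == "__main__":
    items = 5            # hypothetical order size
    per_call = 0.999     # hypothetical success probability of a single remote call
    for name, calls in (("checkout", checkout_remote_calls(items)),
                        ("shipment", shipment_remote_calls(items))):
        print(name, calls, round(use_case_success_probability(calls, per_call), 4))
```

Response time behaves similarly: with sequential calls, the latencies of all remote calls add up, so the use cases with the most calls are also the slowest.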

Slide 53

Slide 53 text

Congratulations! You designed a system for a company whose core business purpose is to maintain data, not to make money!

Slide 54

Slide 54 text

Properties of the design
• Focus on avoiding redundancy and maximizing reuse
• Based on traditional OO design practices
• Results in high coupling between services
• Results in moderate cohesion inside services
• Okay for CRUD applications
  • But then better use a generator, scaffolding framework, ...
• Okay-ish for single-process applications
  • Tends to affect maintainability negatively
• Not okay for distributed services
  • Big failure surface, bad response times

Slide 55

Slide 55 text

How can we do better?

Slide 56

Slide 56 text

Let’s do a bit of research ...

Slide 57

Slide 57 text

Structured design by W. P. Stevens, G. J. Myers and L. L. Constantine [Ste 1974]

Slide 58

Slide 58 text

“The fewer and simpler the connections between modules, the easier it is to understand each module without reference to other modules. Minimizing connections between modules also minimizes the paths along which changes and errors can propagate into other parts of the system, thus eliminating disastrous ‘ripple’ effects, where changes in one part cause errors in another, necessitating additional changes elsewhere, giving rise to new errors, etc.” [Ste 1974]

Slide 59

Slide 59 text

“Coupling is the measure of the strength of association established by a connection from one module to another.” [Ste 1974]

Slide 60

Slide 60 text

Coupling: contributing factors, from low to high coupling [Ste 1974]
• Interface complexity
  • Low: Simple, obvious
  • High: Complicated, obscure
• Type of connection
  • Low: To module by name (depending on interface)
  • High: To internal elements (depending on implementation)
• Type of communication
  • Low: Data (control flow handled by environment)
  • Medium: Control (explicit passing of control)
  • High: Hybrid (manipulation of internal control flow by parameters)

Slide 61

Slide 61 text

Realize that this paper was written at a very different time and in a very different context than we face today.
While the core concepts are timeless and still valid, we usually need to rethink the concrete instructions.

Slide 62

Slide 62 text

Coupling: contributing factors, from low to high coupling
• Interface complexity
  • Low: Simple, obvious
  • High: Complicated, obscure
• Type of connection
  • Low: To module by name (depending on interface)
  • High: To internal elements (depending on implementation)
• Type of communication
  • Low: Data (control flow handled by environment)
  • Medium: Control (explicit passing of control)
  • High: Hybrid (manipulation of internal control flow by parameters)
• Functional independence *
  • Low: Independent (does not need other service to work)
  • Medium: Partly dependent (graceful degradation of service)
  • High: Fully dependent (does not work without other service)
* Ability of a service to complete its task without the other service being present
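The middle level of functional independence, "partly dependent", can be illustrated with a small sketch of graceful degradation. It is a hypothetical example, not taken from the talk: `add_to_cart`, `stock_lookup` and the decision to accept an item when stock cannot be verified are all assumptions made for illustration. A fully dependent service would fail here instead.

```python
from typing import Callable, List


def add_to_cart(cart: List[str], product_id: str,
                stock_lookup: Callable[[str], int]) -> str:
    """Adds a product to the cart, degrading gracefully if stock is unknown.

    `stock_lookup` stands for a remote call to another service (hypothetical).
    """
    try:
        in_stock = stock_lookup(product_id)
    except ConnectionError:
        # The other service is not reachable: degrade instead of failing.
        # Assumed business decision: accept the item and verify stock later.
        cart.append(product_id)
        return "added (stock unverified)"
    if in_stock <= 0:
        return "rejected (out of stock)"
    cart.append(product_id)
    return "added"
```

Whether accepting unverified items is acceptable is a business decision; the point of the sketch is only that the use case still completes without the other service being present.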

Slide 63

Slide 63 text

“Coupling is reduced when the relationships among elements not in the same module are minimized. There are two ways of achieving this – minimizing the relationships among modules and maximizing relationships among elements in the same module. In practice, both ways are used. [...] Binding is the measure of the cohesiveness of a module. The objective here is to reduce coupling by striving for high binding.” [Ste 1974]

Slide 64

Slide 64 text

On the criteria to be used in decomposing systems into modules by David L. Parnas [Par 1972]

Slide 65

Slide 65 text

Separation of concerns: One concept/decision per module
+
Information hiding: Reveal as little as possible about internal implementation
Benefits
• Better changeability: Changes are kept local
• Independent teams: Teams can work more easily and independently on different modules
• Easier to comprehend: Modules can be understood more easily on their own

Slide 66

Slide 66 text

Research findings
• High cohesion, low coupling leads the right way
  • Separation of concerns and information hiding support implementing them
• Concrete paper instructions should not be followed blindly
  • Different context (single process, very limited hardware)
  • Would lead to nano or pico services → lots of remote calls
• You need to rethink instructions in the current context
  • Required for all CS papers from a different time & context
  • Leads to concept of Functional Independence in this context
  • Reduces risk of “vertical decomposition” (i.e., layered design)

Slide 67

Slide 67 text

Core functional decomposition approaches
• Vertical decomposition (layer design, composition)
• Horizontal decomposition (pillar design, segregation)
In practice, you typically use a combination of both approaches

Slide 68

Slide 68 text

Vertical decomposition
• Based on “uses” relation
• Typical drivers are reuse and avoidance of redundancy
• Creates strong coupling (high functional dependence)
• Often useful pattern inside a process boundary
  • Due to deterministic communication behavior
• Problematic across process boundaries
→ Should be avoided in service design

Slide 69

Slide 69 text

Horizontal decomposition
• Based on functional segregation
• Typical drivers are autonomy and independence
• Creates low coupling (high functional independence)
• Useful pattern across process boundaries
• Can also be useful inside a process boundary
  • Fewer accidental "ripple" effects due to changes
→ Should be preferred in service design

Slide 70

Slide 70 text

Watch out!
• Vertical decomposition is our default design approach
  • We’ve learned it in our CS education (divide and conquer, ...)
  • It’s emphasized in our daily work (DRY, reusability, ...)
  • Even our IDEs support it (“Extract method”)
  • It's everywhere! It's predominant!
• It takes energy not to apply vertical decomposition
• Most people never learned horizontal decomposition

Slide 71

Slide 71 text

How to learn horizontal decomposition?

Slide 72

Slide 72 text

Domain-Driven Design by Eric Evans [Eva 2004]

Slide 73

Slide 73 text

DDD to the rescue?
• Naive application of building block patterns leads to the counter-example design we have seen before
  • Not useful in our context due to high coupling
• “Service” pattern leads to process service working on entities
  • Anti-pattern in our context due to high coupling
• “Conceptual contours” supports high cohesion
  • Yet, tends to be too fine-grained for our context
• “Bounded contexts” supports low coupling
  • Yet, tends to be too coarse-grained for our context
→ Mixed emotions: Good, but not the expected panacea

Slide 74

Slide 74 text

And now?

Slide 75

Slide 75 text

For good service design, look at the behavior first, not the data

Slide 76

Slide 76 text

Case study
(Very simple) eCommerce shop
• Implements the core functionalities
  • Search & show
  • Add to shopping cart
  • Checkout
  • Shipment
  • Customer self-care
  • Product catalog maintenance
• Payments only touched as black box
• No recommendations, etc.

Slide 77

Slide 77 text

Core reasoning To reduce the number of remote calls needed for a given functionality, we need to spread the functionality between the services in a way that a single use case/user interaction less often needs to cross service boundaries. Therefore, we try to organize services around use cases/user interactions.

Slide 78

Slide 78 text

Use case diagram: eCommerce shop
• Use cases: Search & Show, Add to shopping cart, Checkout, Shipment, Customer self-care, Product catalog maintenance
• Actors: Customer, Back office employee, Warehouse employee

Slide 79

Slide 79 text

Use case diagram: eCommerce shop
• Use cases: Search & Show, Add to shopping cart, Checkout, Shipment, Customer self-care, Product catalog maintenance
• Actors: Customer, Back office employee, Warehouse employee
Three different actors
• Indicator for cohesion boundaries
• (At least) three different UIs
• Could be completely different architectures
  • Depending on user needs, usage patterns and other NFRs
• As an architect this gives you additional options

Slide 80

Slide 80 text

Warehouse employee Search & Show Add to shopping cart Checkout Shipment Customer self-care Product catalog maintenance eCommerce shop Customer Back office employee Could be a mobile-first FE with service-oriented backend Could be a special warehouse device FE with a monolithic backend Could be a rich desktop app Could be a desktop browser first FE with a service-oriented backend

Slide 81

Slide 81 text

Search & Show Add to shopping cart Checkout Shipment Customer self-care Product catalog maintenance

Slide 82

Slide 82 text

Behavior-based design approach
Focus on minimum cross-service communication inside a use case/user interaction
1. Each use case/user interaction is a service candidate

Slide 83

Slide 83 text

Diagram: service candidates (S = service)
• SearchService (candidate): Search/show
• ShoppingCartService (candidate): Add to shopping cart
• CheckoutService (candidate): Checkout
• ShipmentService (candidate): Shipment
• CustomerMDService * (candidate): Customer self care
• ProductCatalogService (candidate): Product catalog maintenance
* MD = Master Data

Slide 84

Slide 84 text

Slide 85

Slide 85 text

Diagram: service candidates (S = service)
• SearchService (candidate): Search/show
• ShoppingCartService (candidate): Add to shopping cart
• CheckoutService (candidate): Checkout
• ShipmentService (candidate): Shipment
• CustomerMDService (candidate): Customer self care
• ProductCatalogService (candidate): Product catalog maintenance
Splitting up use cases into multiple services is not needed in this example

Slide 86

Slide 86 text

Behavior-based design approach
Focus on minimum cross-service communication inside a use case/user interaction
1. Each use case/user interaction is a service candidate
2. Possibly split big use cases into multiple services
  • Only if really needed (e.g., multiple teams, disparate NFRs)
  • Look for functional clusters with low coupling between them
3. Try to group several use cases in a single service
  • Strive for a sweet spot in terms of an overall trade-off
  • Look for service candidates that operate on the same data

Slide 87

Slide 87 text

Diagram: service candidates and the data they operate on (S = service)
• SearchService (candidate): Search/show; data: Product catalog
• ProductCatalogService (candidate): Product catalog maintenance; data: Product catalog
• ShoppingCartService (candidate): Add to shopping cart; data: Shopping cart
• CheckoutService (candidate): Checkout; data: Buying order
• ShipmentService (candidate): Shipment; data: Inventory data / shipping order
• CustomerMDService (candidate): Customer self care; data: Customer master data

Slide 88

Slide 88 text

Slide 89

Slide 89 text

Architectural reasoning
• Same data ...
• ... but different actors
• Option to work on a single product catalog database here outweighs different actors using a single service
→ Unite in one service (unless you decide to use a different architectural style for the back office employee application)
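The heuristic "look for service candidates that operate on the same data" can be sketched as a simple grouping step. This is only an illustration: the candidate-to-data mapping below follows the case study, while the function names are hypothetical, and the result is a list of merge suggestions that still needs architectural reasoning (actors, NFRs, trade-offs), not a decision.

```python
from collections import defaultdict
from typing import Dict, List

# Service candidates of the case study and the data they operate on.
CANDIDATES: Dict[str, str] = {
    "SearchService": "product catalog",
    "ProductCatalogService": "product catalog",
    "ShoppingCartService": "shopping cart",
    "CheckoutService": "buying order",
    "ShipmentService": "inventory data / shipping order",
    "CustomerMDService": "customer master data",
}


def group_by_data(candidates: Dict[str, str]) -> Dict[str, List[str]]:
    """Groups service candidates by the data they operate on."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for service, data in candidates.items():
        groups[data].append(service)
    return dict(groups)


def merge_suggestions(candidates: Dict[str, str]) -> Dict[str, List[str]]:
    """Returns only the groups in which several candidates share the same data."""
    return {data: sorted(services)
            for data, services in group_by_data(candidates).items()
            if len(services) > 1}
```

For the case study, the only suggestion is to unite SearchService and ProductCatalogService. Merges that rest on weaker signals, such as sequential cohesion between use cases, are not found by this grouping and remain a judgment call.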

Slide 90

Slide 90 text

Diagram: services after uniting search and product catalog (S = service; four services are still marked as candidates)
• ProductCatalogService: Product catalog maintenance, Search/show; data: Product catalog
• ShoppingCartService: Add to shopping cart; data: Shopping cart
• CheckoutService: Checkout; data: Buying order
• ShipmentService: Shipment; data: Inventory data / shipping order
• CustomerMDService: Customer self care; data: Customer master data

Slide 91

Slide 91 text

Slide 92

Slide 92 text

Architectural reasoning
• Some (sequential) cohesion and could work on same data ...
• ... but unification is still not imperative
• Need to ponder other aspects and balance trade-offs
  • Different representations for shopping cart and order needed?
  • UI part of the service?
  • How does payment interfere (not considered in the example)?
→ Here we assume that it is best to unite the services

Slide 93

Slide 93 text

Diagram: services after uniting shopping cart and checkout (S = service; two services are still marked as candidates)
• ProductCatalogService: Product catalog maintenance, Search/show; data: Product catalog
• OrderCreationService: Add to shopping cart, Checkout; data: Shopping cart / buying order
• ShipmentService: Shipment; data: Inventory data / shipping order
• CustomerMDService: Customer self care; data: Customer master data

Slide 94

Slide 94 text

Additional reasoning
• Buying order vs. shipping order
  • Fewer commonalities than shopping cart and buying order
  • Shipping order is only an “ephemeral” entity
  • Different actors using them
  → Keep them separated (we need a signaling mechanism then)
• Who updates items in stock?
  • No longer part of product catalog maintenance
  • Warehouse employee responsible (more reasonable anyway)
  → Add additional use case “Fill up inventory”

Slide 95

Slide 95 text

Slide 96

Slide 96 text

Nice, but is this design any better? Again: How good is it in terms of remote communication?

Slide 97

Slide 97 text

Sequence diagram: Checkout
Participants: CheckoutService, Payment Provider
1. calculate price
2. pay

Slide 98

Slide 98 text

Sequence diagram: Shipment
Participants: WarehouseService, Delivery Provider
1. inform delivery provider
2. update items in stock

Slide 99

Slide 99 text

Findings
• All use cases are robust and fast
Side note: It is not always as nice and simple as in this example

Slide 100

Slide 100 text

Hmm, any trade-offs?

Slide 101

Slide 101 text

1st law of architectural work: Every decision has its price. No decision is for free. (Translation: No decision only has upsides. Every decision also has downsides.)

Slide 102

Slide 102 text

2nd law of architectural work: A decision can only be evaluated with respect to its context. (Translation: Decisions are not invariably “good” or “bad”, but only in a given context.)

Slide 103

Slide 103 text

Trade-offs of the approach
• Biggest concern: What about the data?
  • Data replication and reconciliation
  • Entity distribution (no single source of truth for an entity)
• Question cannot be answered in general
  • Here we will evaluate it with respect to the given example
  • Plus some general considerations (but no general evaluation)

Slide 104

Slide 104 text

Slide 105

Slide 105 text

Entity diagram (E = entity)
• Customer: name, address, payment methods
• Order: customer, payment information, shipping information, order items (product, quantity)
• Product: name, description, image(s), price, items in stock, packaging information (size, weight, special care)
Annotations on the diagram
• Only used as copy template
• Only relevant for search/show
• Only relevant for checkout
• Just an ID for business related referencing purposes
• Only relevant for checkout (including invoice address)
• Only relevant for shipment
• Only relevant for shipment (including delivery address)
• Different for checkout and shipment (only IDs and quantities needed)
• Immutable after completion (all data copied into order)

Slide 106

Slide 106 text

Diagram: resulting services, their use cases and data (S = service)
• CustomerMDService: Customer self care; data: Customer master data
• OrderCreationService: Add to shopping cart, Checkout; data: Shopping cart / buying order
• WarehouseService: Shipment, Fill up inventory; data: Inventory data / shipping order
• ProductCatalogService: Product catalog maintenance, Search/show; data: Product catalog
Numbered notes on the diagram
1. Needs to copy some product (and customer) data into the order (could be handled by the frontend)
2. Needs to signal data for shipment order (signaling mechanism required)
3. Putting these use cases in a single service avoids the need for data replication
4. Putting these use cases in a single service avoids the need for data signaling

Slide 107

Slide 107 text

Findings
• All use cases are robust and fast
• Minimal need to transfer data between services
  • Solvable via frontend and standard data transfer solution (batch file, transfer table, message queue, ...)
• No data replication and reconciliation solution needed
Side note: It is not always as nice and simple as in this example
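Copying product data into the order, so that the order is immutable after completion and needs no later lookup in the product catalog, can be sketched as follows. This is a minimal illustration under assumptions: the class and field names (`Product`, `OrderItem`, `price_cents`, `snapshot_item`) are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Product:
    """Mutable entry owned by the product catalog (hypothetical fields)."""
    product_id: str
    name: str
    price_cents: int


@dataclass(frozen=True)
class OrderItem:
    """Immutable copy of the product data needed by the order."""
    product_id: str   # kept only as a reference ID
    name: str
    price_cents: int  # price at the time the item was ordered
    quantity: int


@dataclass(frozen=True)
class Order:
    customer_id: str
    items: Tuple[OrderItem, ...]

    def total_cents(self) -> int:
        return sum(item.price_cents * item.quantity for item in self.items)


def snapshot_item(product: Product, quantity: int) -> OrderItem:
    # Copy the values instead of holding a reference to the catalog entry.
    return OrderItem(product.product_id, product.name,
                     product.price_cents, quantity)
```

Because the order holds copies, a later price change in the catalog does not affect a completed order, and checkout does not need a remote call to the catalog to calculate the price.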

Slide 108

Slide 108 text

Still, it is not always that nice and easy.
Translation: There are situations where two or more copies of the same data need to be kept in sync.

Slide 109

Slide 109 text

It might even hit us in our example
Diagram: resulting services, their use cases and data (S = service)
• CustomerMDService: Customer self care; data: Customer master data
• OrderCreationService: Add to shopping cart, Checkout; data: Shopping cart / buying order
• WarehouseService: Shipment, Fill up inventory; data: Inventory data / shipping order
• ProductCatalogService: Product catalog maintenance, Search/show; data: Product catalog
Numbered notes on the diagram
1. Might want to allow update of payment methods in the context of customer self care and checkout (requires one-way synchronization of master data after change)
2. Might want to allow adding items to shopping cart only if items are in stock (requires one-way synchronization of transactional data after change)

Slide 110

Slide 110 text

How can we keep the data in sync?

Slide 111

Slide 111 text

Options to keep data in sync
• Shared database
  • Compromises original reasoning to use services
• Distributed transactions (2-phase commit)
  • Tight coupling compromises service independence
  • Compromises availability and scalability
• Eventual consistency
  • Sufficient for basically all functional requirements
  • Supports low coupling and high availability
  • Downside: Much harder programming model than ACID TX

Slide 112

Slide 112 text

Options for eventual consistency
• Batch pull
  • Consumer pulls data batch when ready to process data
  • Very robust approach (also suitable for legacy integration)
  • Data sync delay may be longer than acceptable
• Batch bootstrapping & delta push
  • Initial state sync via batch pull, then push of delta updates
  • Often combined with event sourcing, CQRS, reactive, ...
  • Fast, robust (if done right) and still quite lightweight
• Distributed log
  • Offers advantages of previous approach in one technology
  • Kafka currently is the best-known implementation
  • Still have a plan how to recover if the tool breaks
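The "batch bootstrapping & delta push" option can be sketched in a few lines. This is an in-memory illustration under assumptions, not a production design: `Source`, `Replica` and `Delta` are hypothetical names, versions are a simple counter on the source side, and transport (message queue, log, ...) is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Delta:
    version: int  # version of the source state after this change
    key: str
    value: int


@dataclass
class Source:
    """Owner of the data, e.g. items in stock per product (hypothetical)."""
    state: Dict[str, int] = field(default_factory=dict)
    version: int = 0

    def update(self, key: str, value: int) -> Delta:
        self.version += 1
        self.state[key] = value
        return Delta(self.version, key, value)  # would be pushed to consumers

    def snapshot(self) -> Tuple[int, Dict[str, int]]:
        return self.version, dict(self.state)   # batch for bootstrapping


@dataclass
class Replica:
    """Consumer-side copy that becomes consistent eventually."""
    state: Dict[str, int] = field(default_factory=dict)
    version: int = 0

    def bootstrap(self, source: Source) -> None:
        # Initial state sync via batch pull.
        self.version, self.state = source.snapshot()

    def apply(self, delta: Delta) -> bool:
        if delta.version <= self.version:
            return False  # duplicate, or already contained in the snapshot
        if delta.version != self.version + 1:
            raise ValueError("gap detected, bootstrap again")
        self.state[delta.key] = delta.value
        self.version = delta.version
        return True
```

The gap check is the recovery plan in miniature: if a delta is lost, the replica notices the missing version and can fall back to a fresh batch pull instead of silently drifting out of sync.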

Slide 113

Slide 113 text

And the single source of truth issue?

Slide 114

Slide 114 text

Pondering single source of truth
• Usually task for analytical data processing
  • Orthogonal, well-understood issue
  • Many solutions available
• Sometimes needed in transactional systems (e.g., CRM)
  • Question if it is really a need or just a habit
  • Strive for eventual consistency
  • Go for event streams or distributed logs for fast updates

Slide 115

Slide 115 text

Wrap-up

Slide 116

Slide 116 text

Wrap-up
• Think (at least) twice before distributing functionality
• Strive for low coupling, support with high cohesion
• Prefer horizontal decomposition in service design
  • Favor functional independence over reuse
• The magic is in the behavior, not the data
  • Employ, e.g., use cases to find service boundaries
• Prefer eventual consistency for data synchronization
• Value the timeless wisdom
  • But update the instructions to the given current context

Slide 117

Slide 117 text

No content

Slide 118

Slide 118 text

Uwe Friedrichsen https://www.speakerdeck.com/ufried https://medium.com/@ufried @ufried