Getting (service) design right

This slide deck addresses functional service design not only from a maintainability point of view, but also from an operational point of view: how do you spread your business functionality across your service landscape to maximize availability and responsiveness?

It starts with the observation that virtually all system landscapes these days, including cloud-native ones, are distributed systems. Due to the failure modes of distributed systems, every remote communication is a predetermined breaking point of your system.

After briefly discussing other options to increase availability and responsiveness, the biggest part of the deck discusses how a good functional design can positively influence these system properties.

Using a very simple eCommerce application as an example, the deck first shows a typical design and why it is suboptimal in terms of availability and responsiveness. After that, we briefly dive into classical CS papers and see what we can learn from them. We also learn that while the core insights of those papers are mostly timeless, their implementation instructions are not, and that we need to update them to our current context.

Then we use these ideas to derive a different functional decomposition approach that starts with the system behavior, and apply it to our example - which leads to much better system behavior in terms of availability and responsiveness. Finally, we discuss the most important trade-offs of this design approach - as every design decision of course also has its downsides ... ;)

As always, the voice track is missing. Still, I hope the slides on their own also offer you a few valuable ideas.

Uwe Friedrichsen

March 07, 2019

Transcript

  1. Getting (service) design right or how timeless wisdom can lead

    the way Uwe Friedrichsen – codecentric AG – 2016-2019
  2. Uwe Friedrichsen IT traveller. Dot Connector. Cartographer of uncharted territory.

    Keeper of timeless wisdom. CTO and Fellow at codecentric. https://www.speakerdeck.com/ufried https://medium.com/@ufried @ufried
  3. The marvelous world of microservices

  4. So, you are doing microservices ...

  5. Yeah!

  6. ... of course, putting them in Docker ...

  7. Yeah!

  8. ... and running them on Kubernetes

  9. Yeah!

  10. So, you are doing it the cloud-native way

  11. “Cloud native computing uses an open source software stack to

    deploy applications as microservices, packaging each part into its own container, and dynamically orchestrating those containers to optimize resource utilization.” Source: Cloud Native Computing Foundation Homepage (https://www.cncf.io/)
  12. Yeah!

  13. Well, ever pondered the consequences?

  14. Eh, what?

  15. Why do you ask?

  16. Consequences of cloud-native computing

  17. Different processes at runtime Remote communication between services Distributed system

    Cloud-native computing Services packaged in containers
  18. Distributed systems are not limited to cloud-native computing

  19. (Almost) every system is a distributed system -- Chas Emerick

    http://www.infoq.com/presentations/problems-distributed-systems
  20. The software you develop and maintain is most likely part

    of a (big) distributed system landscape
  21. Consequences of distributed systems

  22. Everything fails, all the time. -- Werner Vogels

  23. Failures in distributed systems ... • Crash failure • Omission

    failure • Timing failure • Response failure • Byzantine failure
  24. ... lead to a variety of effects • Lost messages

    • Incomplete messages • Duplicate messages • Distorted messages • Out-of-order message arrival • Partial, out-of-sync local memory • ...
  25. These effects are based on the non-determinism introduced by remote

    communication due to distributed failure modes
  26. Understand that remote communication points are predetermined breaking points of

    your application Accept that the effects will hit you at the application level
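
A rough back-of-the-envelope sketch of why every remote communication point matters: if a use case needs several remote calls that all have to succeed, and each call succeeds independently with the same probability, the use case only succeeds with that probability raised to the number of calls. The independence assumption, the function name and the 99.9% figure are illustrative assumptions, not numbers from the deck.

```python
def use_case_availability(per_call_availability: float, remote_calls: int) -> float:
    """Availability of a use case whose remote calls must all succeed.

    Assumes independent failures with identical availability per call,
    which is a simplification of real distributed failure modes.
    """
    if not 0.0 <= per_call_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if remote_calls < 0:
        raise ValueError("remote_calls must not be negative")
    return per_call_availability ** remote_calls


if __name__ == "__main__":
    # Hypothetical figure: every single remote call succeeds 99.9% of the time.
    for calls in (1, 5, 10, 50):
        print(calls, use_case_availability(0.999, calls))
```

The more remote calls a single user interaction needs, the further its availability drops below that of any single call.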
  27. What can we do against it?

  28. Typical measures • High-availability (HA) hardware/software • Only applicable for

    very small installations • Usually not available in cloud environments • Delegate failure handling to infrastructure level • Partial relief, will not solve all problems • Implement resilient software design patterns • Very important, still will not fix a bad design • Minimize number of remote communication points • Minimize problem surface by design
  29. Reducing remote communication points

  30. Reducing remote communication points Reduce the number of distributed runtime

    artifacts i.e., create coarser grained services Reduce the need to communicate across runtime artifact boundaries i.e., get the functional design right
  31. Reducing remote communication points Reduce the number of distributed runtime

    artifacts i.e., create coarser grained services Reduce the need to communicate across runtime artifact boundaries i.e., get the functional design right
  32. Remember? https://www.martinfowler.com/articles/ distributed-objects-microservices.html

  33. If you can get away with a monolith, just do

    it! You are still allowed to structure it well ... ;)
  34. Still, your functional domain usually will be too big to

    put it all in a single monolith Thus, you will need to distribute your system landscape to a certain degree
  35. Rule of thumb Always think (at least) twice before distributing

    functionality Only do it if you really need independent deployments Addendum You may also want to distribute your functionality if you have very disparate NFRs for different parts of your functionality (e.g., security *, availability, reliability, scalability **, portability, resource utilization) * forgotten most of the time ** less often needed than most people assume
  36. Reducing remote communication points Reduce the number of distributed runtime

    artifacts i.e., create coarser grained services Reduce the need to communicate across runtime artifact boundaries i.e., get the functional design right
  37. Case study (Very simple) eCommerce shop • Implements the core

    functionalities • Search & show • Add to shopping cart • Checkout • Shipment • Payments only touched as black box • No recommendations, etc.
  38. The typical design approach ... a.k.a. “the counter-example”

  39. Typical design approach Focus on avoiding redundancy and maximizing reuse

    1. Start with a comprehensive domain (actually: E/R) model
  40. * * * * Customer • name • address •

    payment methods E Order • customer • payment information • shipping information • order items • product • quantity E Product • name • description • image(s) • price • items in stock • packaging information (size, weight, special care) E
  41. Typical design approach Focus on avoiding redundancy and maximizing re-use

    1. Start with a comprehensive domain (actually: E/R) model 2. Wrap entities with services
  42. CustomerService S Customer OrderService S Order ProductService S Product

  43. Typical design approach Focus on avoiding redundancy and maximizing re-use

    1. Start with a comprehensive domain (actually: E/R) model 2. Wrap entities with services 3. Spread functionality over services
  44. ProductService OrderService CustomerService S Customer OrderService • Add to shopping

    cart S ProductService • Search/show S Order Product
  45. Typical design approach Focus on avoiding redundancy and maximizing re-use

    1. Start with a comprehensive domain (actually: E/R) model 2. Wrap entities with services 3. Spread functionality over services 4. Add “process services” for “complex use cases” • i.e., use cases that touch more than one data service
  46. CustomerService S Customer OrderService • Add to shopping cart S

    ProductService • Search/show S Order Product CheckoutService • Checkout S ShipmentService • Shipment S
  47. Typical design approach Focus on avoiding redundancy and maximizing re-use

    1. Start with a comprehensive domain (actually: E/R) model 2. Wrap entities with services 3. Spread functionality over services 4. Add “process services” for “complex use cases” • i.e., use cases that touch more than one data service 5. Add missing data maintenance use cases
  48. CustomerService ProductService • Search/show CustomerService • Customer self care S

    Customer OrderService • Add to shopping cart S ProductService • Search/show • Product catalog maintenance S CheckoutService • Checkout S ShipmentService • Shipment S Order Product
  49. This (familiar) design looks innocuous at first sight But how

    good is it in terms of remote communication?
  50. CheckoutService OrderService ProductService CustomerService Payment Provider <proceed to checkout> read

    order read price loop [order items] calculate price read payment methods <show price and ask for payment method> <proceed to payment> pay mark paid <report back completion>
  51. ShipmentService OrderService ProductService CustomerService Delivery Provider <initiate shipment> read order

    read product and packaging information loop [order items] read delivery address <show shipment information> <parcel(s) packed – initiate delivery> inform delivery provider mark dispatched update items in stock loop [order items]
  52. Findings • Core business use cases are failure-prone and slow

    • Data maintenance use cases are robust and fast
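
To make the finding concrete, the two sequence diagrams above can be read as plain lists of remote calls. The sketch below counts them per use case; the function names are hypothetical, and the counts follow one reading of the flattened diagrams (one call per arrow, one call per order item inside each loop).

```python
def checkout_remote_calls(order_items: int) -> int:
    """Remote calls CheckoutService makes in the entity-based design."""
    read_order = 1
    read_price = order_items      # loop [order items] against ProductService
    read_payment_methods = 1
    pay = 1                       # external payment provider
    mark_paid = 1
    return read_order + read_price + read_payment_methods + pay + mark_paid


def shipment_remote_calls(order_items: int) -> int:
    """Remote calls ShipmentService makes in the entity-based design."""
    read_order = 1
    read_product_info = order_items   # loop [order items]
    read_delivery_address = 1
    inform_delivery_provider = 1
    mark_dispatched = 1
    update_stock = order_items        # second loop [order items]
    return (read_order + read_product_info + read_delivery_address
            + inform_delivery_provider + mark_dispatched + update_stock)


if __name__ == "__main__":
    for items in (1, 3, 10):
        print(items, checkout_remote_calls(items), shipment_remote_calls(items))
```

Under this reading the number of remote calls grows with the size of the order, which is why the core business use cases end up both slower and more failure-prone than the data maintenance ones.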
  53. Congratulations! You designed a system for a company whose core

    business purpose is to maintain data, not to make money!
  54. Properties of the design • Focus on avoiding redundancy and

    maximizing reuse • Based on traditional OO design practices • Results in high coupling between services • Results in moderate cohesion inside services • Okay for CRUD applications • But then better use a generator, scaffolding framework, ... • Okay-ish for single-process applications • Tends to affect maintainability negatively • Not okay for distributed services • Big failure surface, bad response times
  55. How can we do better?

  56. Let’s do a bit of research ...

  57. Structured design by W. P. Stevens, G. J. Myers and

    L. L. Constantine [Ste 1974]
  58. “The fewer and simpler the connections between modules, the easier

    it is to understand each module without reference to other modules. Minimizing connections between modules also minimizes the paths along which changes and errors can propagate into other parts of the system, thus eliminating disastrous ‘ripple’ effects, where changes in one part cause errors in another, necessitating additional changes elsewhere, giving rise to new errors, etc.” [Ste 1974]
  59. “Coupling is the measure of the strength of association established

    by a connection from one module to another.” [Ste 1974]
  60. Coupling High Low Contributing factors Interface Complexity Type of Connection

    Type of Communication Simple, obvious Complicated, obscure To module by name (depending on interface) To internal elements (depending on implementation) Data (control flow handled by environment) Control (Explicit passing of control) Hybrid (Manipulation of internal control flow by parameters) [Ste 1974]
  61. Realize that this paper was written at a very different

    time and in a very different context than we face today While the core concepts are timeless and still valid, we usually need to rethink the concrete instructions
  62. Contributing factors Interface Complexity Type of Connection Type of Communication

    Simple, obvious Complicated, obscure To module by name (depending on interface) To internal elements (depending on implementation) Data (control flow handled by environment) Control (Explicit passing of control) Hybrid (Manipulation of internal control flow by parameters) * Ability of a service to complete its task without the other service being present Functional Independence * Independent (does not need other service to work) Fully dependent (does not work without other service) Partly dependent (graceful degradation of service) Coupling High Low
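
The three levels of functional independence can be sketched in code. This is a minimal illustration, assuming a hypothetical `RemoteCallFailed` exception raised by whatever client talks to the other service; none of the names come from the deck.

```python
from typing import Callable, List


class RemoteCallFailed(Exception):
    """Signals that the other service could not be reached (hypothetical)."""


def fully_dependent(fetch_remote: Callable[[], List[str]]) -> List[str]:
    # No fallback: a failure of the other service becomes our failure.
    return fetch_remote()


def partly_dependent(fetch_remote: Callable[[], List[str]],
                     fallback: List[str]) -> List[str]:
    # Graceful degradation: serve a reduced answer if the other service is down.
    try:
        return fetch_remote()
    except RemoteCallFailed:
        return list(fallback)


def independent(local_data: List[str]) -> List[str]:
    # Works purely on data the service owns; no remote call at all.
    return list(local_data)
```

Only the first variant turns an outage of the other service into an outage of the caller; the other two keep the task completable, either in reduced form or in full.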
  63. “Coupling is reduced when the relationships among elements not in

    the same module are minimized. There are two ways of achieving this – minimizing the relationships among modules and maximizing relationships among elements in the same module. In practice, both ways are used. [...] Binding is the measure of the cohesiveness of a module. The objective here is to reduce coupling by striving for high binding.” [Ste 1974]
  64. On the criteria to be used in decomposing systems into

    modules by David L. Parnas [Par 1972]
  65. Separation of concerns One concept/decision per module Information hiding Reveal

    as little as possible about internal implementation + Better changeability Changes are kept local Independent teams Teams can work on different modules independently more easily Easier to comprehend Modules can be understood on their own more easily
  66. Research findings • High cohesion, low coupling leads the right

    way • Separation of Concerns and Information hiding support implementing them • Concrete paper instructions should not be followed blindly • Different context (single process, very limited hardware) • Would lead to nano or pico services → lots of remote calls • You need to rethink instructions in the current context • Required for all CS papers from a different time & context • Leads to concept of Functional Independence in this context • Reduces risk of “vertical decomposition” (i.e., layered design)
  67. <uses> Functionality vertical decomposition (layer design, composition) In practice, you

    typically use a combination of both approaches Core functional decomposition approaches horizontal decomposition (pillar design, segregation)
  68. Vertical decomposition • Based on “uses” relation • Typical drivers

    are reuse and avoidance of redundancy • Creates strong coupling (high functional dependence) • Often useful pattern inside a process boundary • Due to deterministic communication behavior • Problematic across process boundaries → Should be avoided in service design
  69. Horizontal decomposition • Based on functional segregation • Typical drivers

    are autonomy and independence • Creates low coupling (high functional independence) • Useful pattern across process boundaries • Can also be useful inside a process boundary • Less accidental "ripple" effects due to changes → Should be preferred in service design
  70. Watch out! • Vertical decomposition is our default design approach

    • We’ve learned it in our CS education (divide and conquer, ...) • It’s emphasized in our daily work (DRY, reusability, ...) • Even our IDEs support it (“Extract method”) • It's everywhere! It's predominant! • It takes energy not to apply vertical decomposition • Most people never learned horizontal decomposition
  71. How to learn horizontal decomposition?

  72. Domain-Driven Design by Eric Evans [Eva 2004]

  73. DDD to the rescue? • Naive application of building block

    patterns leads to the counter-example design we have seen before • Not useful in our context due to high coupling • “Service” pattern leads to process service working on entities • Anti-pattern in our context due to high coupling • “Conceptual contours” supports high cohesion • Yet, tends to be too fine grained for our context • “Bounded contexts” supports low coupling • Yet, tends to be too coarse grained for our context → Mixed emotions: Good, but not the expected panacea
  74. And now?

  75. For good service design, look at the behavior first, not

    the data
  76. Case study (Very simple) eCommerce shop • Implements the core

    functionalities • Search & show • Add to shopping cart • Checkout • Shipment • Customer self-care • Product catalog maintenance • Payments only touched as black box • No recommendations, etc.
  77. Core reasoning To reduce the number of remote calls needed

    for a given functionality, we need to spread the functionality between the services in a way that a single use case/user interaction less often needs to cross service boundaries. Therefore, we try to organize services around use cases/user interactions.
  78. Search & Show Add to shopping cart Checkout Shipment Customer

    self-care Product catalog maintenance eCommerce shop Customer Back office employee Warehouse employee
  79. Search & Show Add to shopping cart Checkout Shipment Customer

    self-care Product catalog maintenance eCommerce shop Customer Back office employee Warehouse employee Three different actors • Indicator for cohesion boundaries • (At least) three different UIs • Could be completely different architectures • Depending on user needs, usage patterns and other NFRs • As an architect this gives you additional options
  80. Warehouse employee Search & Show Add to shopping cart Checkout

    Shipment Customer self-care Product catalog maintenance eCommerce shop Customer Back office employee Could be a mobile-first FE with service-oriented backend Could be a special warehouse device FE with a monolithic backend Could be a rich desktop app Could be a desktop browser first FE with a service-oriented backend
  81. Search & Show Add to shopping cart Checkout Shipment Customer

    self-care Product catalog maintenance
  82. Behavior-based design approach Focus on minimum cross-service communication inside a

    use case/user interaction 1. Each use case/user interaction is a service candidate
  83. ProductCatalogService • Product catalog maintenance S ShipmentService • Shipment S

    CustomerMDService * • Customer self care S Candidate Candidate Candidate ShoppingCartService • Add to shopping cart S CheckoutService • Checkout S SearchService • Search/show S Candidate Candidate Candidate * MD = Master Data
  84. Behavior-based design approach Focus on minimum cross-service communication inside a

    use case/user interaction 1. Each use case/user interaction is a service candidate 2. Possibly split big use cases in multiple services • Only if really needed (e.g., multiple teams, disparate NFRs) • Look for functional clusters with low coupling between them
  85. ShoppingCartService • Add to shopping cart S CheckoutService • Checkout

    S SearchService • Search/show S Candidate Candidate Candidate ProductCatalogService • Product catalog maintenance S ShipmentService • Shipment S CustomerMDService • Customer self care S Candidate Candidate Candidate Splitting up use cases in multiple services not needed in this example
  86. Behavior-based design approach Focus on minimum cross-service communication inside a

    use case/user interaction 1. Each use case/user interaction is a service candidate 2. Possibly split big use cases in multiple services • Only if really needed (e.g., multiple teams, disparate NFRs) • Look for functional clusters with low coupling between them 3. Try to group several use cases in a single service • Strive for a sweet spot in terms of an overall trade-off • Look for service candidates that operate on the same data
  87. ShoppingCartService • Add to shopping cart S ProductCatalogService • Product

    catalog maintenance S CheckoutService • Checkout S ShipmentService • Shipment S SearchService • Search/show S CustomerMDService • Customer self care S Candidate Candidate Candidate Candidate Candidate Candidate Product catalog Customer master data Shopping cart Buying order Inventory data / shipping order Product catalog
  88. ShoppingCartService • Add to shopping cart S ProductCatalogService • Product

    catalog maintenance S CheckoutService • Checkout S ShipmentService • Shipment S SearchService • Search/show S CustomerMDService • Customer self care S Candidate Candidate Candidate Candidate Candidate Candidate Product catalog Customer master data Shopping cart Buying order Inventory data / shipping order Product catalog Service candidates working on the same data
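
Step 3 of the approach ("look for service candidates that operate on the same data") can be sketched as a simple grouping. The assignment of data to candidates below is read off the slide above and is an assumption wherever the flattened diagram is ambiguous; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Service candidate -> data it primarily works on (as read from the slide).
CANDIDATE_DATA: Dict[str, str] = {
    "SearchService": "product catalog",
    "ProductCatalogService": "product catalog",
    "ShoppingCartService": "shopping cart",
    "CheckoutService": "buying order",
    "ShipmentService": "inventory data / shipping order",
    "CustomerMDService": "customer master data",
}


def merge_candidates(candidate_data: Dict[str, str]) -> Dict[str, List[str]]:
    """Return data sets that more than one service candidate works on."""
    by_data: Dict[str, List[str]] = defaultdict(list)
    for service, data in candidate_data.items():
        by_data[data].append(service)
    return {data: sorted(services)
            for data, services in by_data.items()
            if len(services) > 1}


if __name__ == "__main__":
    print(merge_candidates(CANDIDATE_DATA))
```

Such a grouping only yields candidates for unification. Whether to actually merge them remains an architectural trade-off, for example when different actors use the candidates.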
  89. Architectural reasoning • Same data ... • ... but different

    actors • Option to work on a single product catalog database here outweighs different actors using a single service → Unite in one service (unless you decide to use a different architectural style for the back office employee application)
  90. ShoppingCartService • Add to shopping cart S ProductCatalogService • Product

    catalog maintenance • Search/show S CheckoutService • Checkout S ShipmentService • Shipment S CustomerMDService • Customer self care S Candidate Candidate Candidate Candidate Customer master data Shopping cart Buying order Inventory data / shipping order Product catalog
  91. ShoppingCartService • Add to shopping cart S ProductCatalogService • Product

    catalog maintenance • Search/show S CheckoutService • Checkout S ShipmentService • Shipment S CustomerMDService • Customer self care S Candidate Candidate Candidate Candidate Customer master data Shopping cart Buying order Inventory data / shipping order Product catalog Service candidates working on the same type of data (shopping cart is a preliminary order)
  92. Architectural reasoning • Some (sequential) cohesion and could work on

    same data ... • ... but unification is still not imperative • Need to ponder other aspects and balance trade-offs • Different representations for shopping cart and order needed? • UI part of the service? • How does payment interfere (not considered in the example)? → Here we assume that it is best to unite the services
  93. ProductCatalogService • Product catalog maintenance • Search/show S OrderCreationService •

    Add to shopping cart • Checkout S ShipmentService • Shipment S CustomerMDService • Customer self care S Candidate Candidate Customer master data Shopping cart / buying order Inventory data / shipping order Product catalog
  94. Additional reasoning • Buying order vs. shipping order • Less

    commonalities than shopping cart and buying order • Shipping order is only “ephemeral” entity • Different actors using them → Keep them separated (we need a signaling mechanism then) • Who updates items in stock? • No longer part of product catalog maintenance • Warehouse employee responsible (more reasonable anyway) → Add additional use case “Fill up inventory”
  95. CustomerMDService • Customer self care S Customer master data OrderCreationService

    • Add to shopping cart • Checkout S Shopping cart / buying order WarehouseService • Shipment • Fill up inventory S Inventory data / shipping order ProductCatalogService • Product catalog maintenance • Search/show S Product catalog
  96. Nice, but is this design any better? Again: How good

    is it in terms of remote communication?
  97. CheckoutService Payment Provider <proceed to checkout> <show price and ask

    for payment method> <proceed to payment> pay <report back completion> calculate price
  98. WarehouseService Delivery Provider inform delivery provider <initiate shipment> <show shipment

    information> <parcel(s) packed – initiate delivery> update items in stock
  99. Findings • All use cases are robust and fast Side

    note: It is not always as nice and simple as in this example
  100. Hmm, any trade-offs?

  101. 1st law of architectural work: Every decision has its price.

    No decision is for free. (Translation: No decision only has upsides. Every decision also has downsides.)
  102. 2nd law of architectural work: A decision can only be

    evaluated with respect to its context. (Translation: Decisions are not invariably “good” or “bad”, but only in a given context.)
  103. Trade-offs of the approach • Biggest concern: What about the

    data? • Data replication and reconciliation • Entity distribution (no single source of truth for an entity) • Question cannot be answered in general • Here we will evaluate it with respect to the given example • Plus some general considerations (but no general evaluation)
  104. * * * * Customer • name • address •

    payment methods E Order • customer • payment information • shipping information • order items • product • quantity E Product • name • description • image(s) • price • items in stock • packaging information (size, weight, special care) E This diagram is misleading!
  105. * * * * Customer • name • address •

    payment methods E Order • customer • payment information • shipping information • order items • product • quantity E Product • name • description • image(s) • price • items in stock • packaging information (size, weight, special care) E Only used as copy template Only relevant for search/show Only relevant for checkout Just an ID for business related referencing purposes Only relevant for checkout (including invoice address) Only relevant for shipment Only relevant for shipment (including delivery address) Different for checkout and shipment (only IDs and quantities needed) Immutable after completion (all data copied into order)
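
The annotations "only used as copy template" and "immutable after completion (all data copied into order)" can be illustrated with a small sketch. It assumes a hypothetical in-memory catalog and hypothetical type names; the point is only that the order keeps its own copy of the product data and does not look it up again later.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Tuple


@dataclass(frozen=True)
class OrderItem:
    product_id: str        # just an ID for referencing purposes
    product_name: str      # copied from the catalog at completion time
    unit_price: Decimal    # copied from the catalog at completion time
    quantity: int


@dataclass(frozen=True)
class Order:
    customer_id: str
    items: Tuple[OrderItem, ...]

    def total(self) -> Decimal:
        return sum((i.unit_price * i.quantity for i in self.items), Decimal("0"))


def complete_order(customer_id: str,
                   cart: Dict[str, int],
                   catalog: Dict[str, dict]) -> Order:
    """Copy the needed product data into an immutable order."""
    items = tuple(
        OrderItem(pid, catalog[pid]["name"], catalog[pid]["price"], qty)
        for pid, qty in cart.items()
    )
    return Order(customer_id, items)
```

Because the order carries its own copy, a later price change in the product catalog does not affect completed orders, and neither checkout nor shipment has to call the catalog to read them.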
  106. CustomerMDService • Customer self care S Customer master data OrderCreationService

    • Add to shopping cart • Checkout S Shopping cart / buying order WarehouseService • Shipment • Fill up inventory S Inventory data / shipping order ProductCatalogService • Product catalog maintenance • Search/show S Product catalog Putting these use cases in a single service avoids the need for data replication 3 3 3 Putting these use cases in a single service avoids the need for data signaling 4 4 Needs to signal data for shipment order (signaling mechanism required) 2 2 2 Needs to copy some product (and customer) data into the order (could be handled by the frontend) 1 1 1 1
  107. Findings • All use cases are robust and fast •

    Minimal need to transfer data between services • Solvable via frontend and standard data transfer solution (batch file, transfer table, message queue, ...) • No data replication and reconciliation solution needed Side note: It is not always as nice and simple as in this example
  108. Still, it is not always that nice and easy Translation:

    There are situations where two or more copies of the same data need to be kept in sync
  109. CustomerMDService • Customer self care S Customer master data OrderCreationService

    • Add to shopping cart • Checkout S Shopping cart / buying order WarehouseService • Shipment • Fill up inventory S Inventory data / shipping order ProductCatalogService • Product catalog maintenance • Search/show S Product catalog Might want to allow update of payment methods in the context of customer self care and checkout (requires one-way synchronization of master data after change) 1 1 1 Might want to allow adding items to shopping cart only if items are in stock (requires one-way synchronization of transactional data after change) 2 2 2 It might even hit us in our example
  110. How can we keep the data in sync?

  111. Options to keep data in sync • Shared database •

    Compromises original reasoning to use services • Distributed transactions (2-phase commit) • Tight coupling compromises service independence • Compromises availability and scalability • Eventual consistency • Sufficient for basically all functional requirements • Supports low coupling and high availability • Downside: Much harder programming model than ACID TX
  112. Options for eventual consistency • Batch pull • Consumer pulls

    data batch when ready to process data • Very robust approach (also suitable for legacy integration) • Data sync delay may be longer than acceptable • Batch bootstrapping & delta push • Initial state sync via batch pull, then push of delta updates • Often combined with event sourcing, CQRS, reactive, ... • Fast, robust (if done right) and still quite lightweight • Distributed log • Offers advantages of previous approach in one technology • Kafka currently is the best-known implementation • Still have a plan how to recover if the tool breaks
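
A minimal sketch of "batch bootstrapping & delta push", assuming the producer numbers its updates with a gap-free sequence and the snapshot records the sequence it was taken at. The class and field names are hypothetical; a real setup would typically sit on top of a message queue or distributed log.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class StockChanged:
    sequence: int
    product_id: str
    items_in_stock: int


@dataclass
class StockReplica:
    stock: Dict[str, int] = field(default_factory=dict)
    last_sequence: int = 0

    def bootstrap(self, snapshot: Dict[str, int], snapshot_sequence: int) -> None:
        # Initial state sync via batch pull.
        self.stock = dict(snapshot)
        self.last_sequence = snapshot_sequence

    def apply(self, event: StockChanged) -> bool:
        # Delta push: ignore duplicates, refuse gaps, apply the next update.
        if event.sequence <= self.last_sequence:
            return False  # duplicate or already contained in the snapshot
        if event.sequence != self.last_sequence + 1:
            raise ValueError("gap in update sequence, bootstrap again")
        self.stock[event.product_id] = event.items_in_stock
        self.last_sequence = event.sequence
        return True
```

Making `apply` idempotent and gap-aware covers the duplicate and out-of-order messages listed earlier as effects of distributed failures. Falling back to a fresh bootstrap is one possible recovery plan if the push channel breaks.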
  113. And the single source of truth issue?

  114. Pondering single source of truth • Usually task for analytical

    data processing • Orthogonal, well-understood issue • Many solutions available • Sometimes needed in transactional systems (e.g., CRM) • Question if it is really a need or just a habit • Strive for eventual consistency • Go for event streams or distributed logs for fast updates
  115. Wrap-up

  116. Wrap-up • Think (at least) twice before distributing functionality •

    Strive for low coupling, support with high cohesion • Prefer horizontal decomposition in service design • Favor functional independence over reuse • The magic is in the behavior, not the data • Employ, e.g., use cases to find service boundaries • Prefer eventual consistency for data synchronization • Value the timeless wisdom • But update the instructions to the given current context
  118. Uwe Friedrichsen https://www.speakerdeck.com/ufried https://medium.com/@ufried @ufried