Underestimated costs of microservice architectures

In many business success stories, our beautiful software systems degrade into monolithic Big Balls of Mud. To fix these monstrosities, we as developers and architects have begun to reach for microservices as our solution. Beautifully designed architecture diagrams and org charts clearly show the benefits in terms of coordination, batch size, and codebase understandability, but is it really all unicorns and rainbows?

As folks who have been down this road can tell you, microservice architectures don’t solve all our problems. Tradeoffs abound, and in this talk we’ll see the costs we need to be prepared to pay when we introduce microservices, including team dynamics as well as technical tradeoffs around consistency, failure handling, and observability.

Colin Jones

May 02, 2018

Transcript

  1. 8th Light, Inc. Colin Jones @trptcolin https://8thlight.com Underestimated costs of microservice architectures
  2. Microservices

  3. Happy!

  4. ☠ Sad ☠

  5. None
  6. None
  7. None
  8. None
  9. Avoid microservices?

  10. Avoid microservices?

  11. On the other hand: “But to gain any benefit from microservice thinking, you have to understand what it is, how to do it, and why you should usually do something else.” - Martin Fowler
  12. Hype Cycle

  13. Accusations

  14. We underestimate costs

  15. Benefits

  16. independent deployability

  17. independent scalability, independent deployability

  18. fault tolerance, independent deployability, independent scalability

  19. avoid dependency hell, independent deployability, independent scalability, fault tolerance

  20. architectural boundaries, independent deployability, independent scalability, fault tolerance, avoid dependency hell

  21. small team ownership, independent deployability, independent scalability, fault tolerance, avoid dependency hell, architectural boundaries

  22. eliminate legacy code, independent deployability, independent scalability, fault tolerance, avoid dependency hell, architectural boundaries, small team ownership

  23. eliminate legacy code, independent deployability, independent scalability, fault tolerance, avoid dependency hell, architectural boundaries, small team ownership; microservices!
  24. Costs & Mitigations

  25. Well-understood costs

  26. Latency

  27. Latency Mitigations
    • Cache responses
    • Batch calls together
    • Coarse-grained service API
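The first two latency mitigations (caching and batching) can be combined in one client-side wrapper. The following is a minimal Python sketch, not anything shown in the talk: `CachingBatchClient`, `fetch_many`, and the fake backend are hypothetical names, and it assumes the remote service exposes some batched lookup that takes a list of IDs.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple


class CachingBatchClient:
    """Wraps a remote lookup with a TTL cache and one batched call per request."""

    def __init__(self, fetch_many: Callable[[List[str]], Dict[str, dict]],
                 ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_many = fetch_many  # hypothetical batched remote call
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, dict]] = {}

    def get_many(self, ids: Iterable[str]) -> Dict[str, dict]:
        now = self._clock()
        wanted = list(dict.fromkeys(ids))  # de-duplicate, keep order
        fresh: Dict[str, dict] = {}
        missing: List[str] = []
        for item_id in wanted:
            entry = self._cache.get(item_id)
            if entry is not None and now - entry[0] < self._ttl:
                fresh[item_id] = entry[1]
            else:
                missing.append(item_id)
        if missing:
            # One round trip for all cache misses instead of one call per ID.
            fetched = self._fetch_many(missing)
            for item_id, value in fetched.items():
                self._cache[item_id] = (now, value)
                fresh[item_id] = value
        return fresh


# Fake backend standing in for the remote service; it records each round trip.
calls: List[List[str]] = []


def fake_fetch_many(ids: List[str]) -> Dict[str, dict]:
    calls.append(list(ids))
    return {item_id: {"id": item_id} for item_id in ids}
```

The cache trades freshness for latency, so the TTL has to be chosen with the data consistency costs later in the deck in mind.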
  28. Additional infrastructure, Latency

  29. Additional infrastructure Mitigations
    • Containers (e.g. Docker)
    • Infrastructure automation & configuration management
    • Virtual machines / cloud
    • Auto-scaling (metered cost)
    • Serverless
  30. Understanding != Paying

  31. Underestimated Costs

  32. Data consistency, Additional infrastructure, Latency

  33. Data consistency [diagram: Orders Service, Main App, Main DB]

  34. Data consistency [diagram: Orders Service, Main App, Orders DB, Main DB]

  35. Data consistency Mitigations
    • Design for eventual consistency
    • Canonical source for data (aka “system of record” / “source of truth”) and derived data
    • Backend sync processes
    • Service teams co-own ETL for analytics / business intelligence / data warehouse
  36. Data consistency
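One way to read the "canonical source plus derived data" and "backend sync" mitigations: the Orders DB stays the system of record, and the main app keeps a derived copy that tolerates duplicate or out-of-order updates. The Python sketch below illustrates that idea under stated assumptions. The `OrderChanged` event, its `version` field, and `DerivedOrderView` are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class OrderChanged:
    """Hypothetical change event published by the Orders service."""
    order_id: str
    version: int  # assumed to increase with every change in the system of record
    status: str


class DerivedOrderView:
    """Derived copy kept by the main app; the Orders DB stays canonical."""

    def __init__(self) -> None:
        self._rows: Dict[str, OrderChanged] = {}

    def apply(self, event: OrderChanged) -> bool:
        current = self._rows.get(event.order_id)
        if current is not None and current.version >= event.version:
            return False  # duplicate or stale delivery: safe to ignore
        self._rows[event.order_id] = event
        return True

    def sync(self, events: Iterable[OrderChanged]) -> int:
        """Backend sync pass: replay events, return how many changed the view."""
        return sum(1 for event in events if self.apply(event))

    def status_of(self, order_id: str) -> str:
        return self._rows[order_id].status
```

Because applying an event twice has no further effect, the sync process can re-deliver freely, and the view converges on the canonical state once all events have arrived.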

  37. Failure modes, Data consistency, Additional infrastructure, Latency

  38. Failure modes [diagram: A, B]

  39. Failure modes [diagram: A, B, ???]

  40. Failure modes [diagram: A, B]

  41. Failure modes [diagram: A, B]

  42. Failure modes [diagram: A, B, ???]

  43. Failure modes Mitigations
    • Use retries (with backoff; cap the max time)
    • Read the remote end to see if it succeeded
    • Use fallbacks for read timeouts
    • Use circuit breaker to limit cascading failures
    • Use bulkheads to protect independent modules
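The "retries with backoff, cap the max time" mitigation might look roughly like the Python sketch below. It is an illustration, not the speaker's implementation: `TransientError`, `call_with_retries`, and all default values are assumptions, and the jitter strategy is one common choice among several.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. timeouts)."""


def call_with_retries(operation: Callable[[], T],
                      max_elapsed: float = 5.0,
                      base_delay: float = 0.1,
                      max_delay: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> T:
    """Retry `operation` with exponential backoff and jitter, within a time budget."""
    start = clock()
    attempt = 0
    while True:
        try:
            return operation()
        except TransientError:
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            delay = random.uniform(0, ceiling)  # jitter spreads out retries
            if clock() - start + delay > max_elapsed:
                raise  # budget exhausted: surface the last failure
            sleep(delay)
            attempt += 1
```

Retrying is only safe when the operation is idempotent. When it is not, the caller first needs to check the remote end to see whether the earlier attempt actually succeeded, as the slide suggests.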
  44. Failure modes
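For the circuit breaker mitigation, a deliberately small Python sketch follows. The class name, thresholds, and the single-trial "half-open" behavior are assumptions made for illustration. Production libraries typically add thread safety, metrics, and finer-grained state handling.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, operation: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("skipping call; downstream marked unhealthy")
            # Timeout elapsed: "half-open", so let one trial call through.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller can catch `CircuitOpenError` and serve a fallback (a cached or default value) instead of waiting on a dependency that is already known to be failing, which is what limits the cascade.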

  45. Development & testing, Failure modes, Data consistency, Additional infrastructure, Latency

  46. Development & testing

  47. Development & testing: from “Testing Microservices, the Sane Way” by Cindy Sridharan
  48. Development & testing Mitigations
    • Expand testing mindset to staging/production observability efforts
    • Integrate only a few services / API checking and rely more on unit tests
    • Unify on lightweight runtimes and single database technology [only delays the problem; conflicts with team ownership]
    • Test against external environments with services set up [risks test pollution]
    • Orchestrate new isolated infrastructure for each test run
  49. Observability, Failure modes, Development & testing, Data consistency, Additional infrastructure, Latency
  50. Observability

  51. Observability Mitigations
    • Log aggregation with correlation IDs
    • Error reporting / alerting [generally on symptoms, not causes]
    • Distributed tracing tools
    • Monitoring tools / dashboards
    • Fancier 3rd-party observability tools
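To make "log aggregation with correlation IDs" concrete, here is a small Python sketch using only the standard library. The `X-Correlation-ID` header name, the `orders-service` logger name, and the helper functions are assumptions for illustration. The point is only that every log line carries an ID that is also forwarded to downstream calls.

```python
import contextvars
import io
import logging
import uuid

# Holds the ID for the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Copies the current correlation ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("orders-service")  # hypothetical service name
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s cid=%(correlation_id)s %(message)s"))
    handler.addFilter(CorrelationIdFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def handle_request(logger: logging.Logger, incoming_headers: dict) -> dict:
    # Reuse the caller's ID if present; otherwise this service starts the trail.
    cid = incoming_headers.get("X-Correlation-ID") or uuid.uuid4().hex
    token = correlation_id.set(cid)
    try:
        logger.info("handling request")
        # Headers to forward on downstream calls so their logs share the ID.
        return {"X-Correlation-ID": cid}
    finally:
        correlation_id.reset(token)
```

With every service logging the same ID, searching the aggregated logs for one `cid` value reconstructs a request's path across services.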
  52. Tunnel vision, Failure modes, Development & testing, Observability, Data consistency, Additional infrastructure, Latency
  53. Tunnel vision

  54. Tunnel vision Mitigations
    • Measure business metrics, not team velocity
    • Make sure team / service incentives are aligned with the company’s
    • Rotations / team exchanges / dynamic re-teaming
    • Cross-org communication
  55. Implicit connection data, Failure modes, Development & testing, Observability, Tunnel vision, Data consistency, Additional infrastructure, Latency
  56. Implicit connection data

  57. Implicit connection data Mitigations
    • Well known API contracts / specifications (e.g. JSON Schema, Swagger, Protocol Buffers, Thrift)
    • Centralized/standardized repository to track service metadata for service discovery
    • Put all services in one codebase (monorepo) for easier searchability
    • Custom tooling based on log aggregation / monitoring
  58. Inter-team priority conflicts, Failure modes, Development & testing, Observability, Tunnel vision, Implicit connection data, Data consistency, Additional infrastructure, Latency
  59. Inter-team priority conflicts [diagram: Them, Us]

  60. Inter-team priority conflicts [diagram: Them, Consumer B, Us (Consumer A), Consumer C, Consumer C]
  61. Inter-team priority conflicts Mitigations
    • Make our case really well to the service team, or management
    • Add staff on heavily-used microservices
    • Split heavily-used services further
    • Contribute to their project (aka “internal open-source”)
    • Rebuild a similar service with the changes we need
  62. Hard to change across boundaries, Failure modes, Development & testing, Observability, Tunnel vision, Inter-team priority conflicts, Implicit connection data, Data consistency, Additional infrastructure, Latency
  63. Hard to change across boundaries

  64. Hard to change across boundaries

  65. Hard to change across boundaries Mitigations
    • Be deliberate about the choice to Extract Microservice
    • Version your API? [controversial]
    • If versioning / breaking: Have a well-defined way to communicate breaking changes / deadlines
    • Sticking with the same runtime (e.g. JVM) makes Inline Microservice possible
    • Cross-org communication
  66. Hard to change across boundaries Mitigations
    • Skill and culture of backwards compatibility (SemVer, Postel’s Law)
    • Don’t make breaking changes
    • Well known API contracts / specifications (e.g. JSON Schema, Swagger, Protocol Buffers, Thrift)
    • Consumer-driven contract tests in CI
    • See also “Implicit connection data” mitigations
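The idea behind consumer-driven contract tests and Postel's Law can be sketched in a few lines of Python. This is a hand-rolled illustration, not a real contract-testing tool: `ORDER_CONTRACT`, its fields, and `contract_violations` are hypothetical. A consumer declares only the fields it relies on, and the provider's CI checks its responses against that declaration.

```python
from typing import Any, Dict, List

# What this consumer actually relies on: field name -> expected Python type.
# Hypothetical contract for a hypothetical "orders" endpoint.
ORDER_CONTRACT: Dict[str, type] = {"id": str, "status": str, "total_cents": int}


def contract_violations(response: Dict[str, Any],
                        contract: Dict[str, type]) -> List[str]:
    """Return human-readable problems; an empty list means the contract holds."""
    problems: List[str] = []
    for field, expected in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected):
            actual = type(response[field]).__name__
            problems.append(f"wrong type for {field}: {actual}")
    # Extra fields are deliberately ignored (tolerant reader / Postel's Law),
    # so the provider can add data without breaking this consumer.
    return problems
```

Running a check like this for every known consumer before a provider deploys turns "don't make breaking changes" from a convention into something CI can enforce.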
  67. Failure modes, Development & testing, Observability, Tunnel vision, Inter-team priority conflicts, Implicit connection data, Data consistency, Additional infrastructure, Latency, Hard to change across boundaries; microservices!
  68. Alternatives

  69. Milliservices? Centiservices?

  70. Modules / encapsulation

  71. Modules / encapsulation

  72. Modules / encapsulation

  73. Modules / encapsulation

  74. Recommendations

  75. List problems. Then solutions. Then pros and cons.

  76. Don’t believe the hype.

  77. Be ready to pay the costs.

  78. Make sure you’re getting the benefits.

  79. Make the change easy. Then make the change.

  80. Make the change easy. Then make the change. (maybe)

  81. Learn more

  82. Learn more

  83. Learn more
    • Ben Christensen. “Don’t Build a Distributed Monolith”: https://www.microservices.com/talks/dont-build-a-distributed-monolith/
    • Michael Feathers. “Microservices and the Failure of Encapsulation”: https://michaelfeathers.silvrback.com/microservices-and-the-failure-of-encapsulaton
    • Michael Feathers. Working Effectively with Legacy Code: https://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052
    • Martin Fowler. Patterns of Enterprise Application Architecture: https://www.martinfowler.com/books/eaa.html
    • Martin Fowler. “MicroservicePremium”: https://www.martinfowler.com/bliki/MicroservicePremium.html
    • Martin Fowler. “Microservice Prerequisites”: https://www.martinfowler.com/bliki/MicroservicePrerequisites.html
    • Martin Fowler. “Microservices Resource Guide”: https://www.martinfowler.com/microservices/
  84. Learn more
    • Susan Fowler. Production-Ready Microservices: http://shop.oreilly.com/product/0636920053675.do
    • John Gall. The Systems Bible: https://www.amazon.com/Systems-Bible-Beginners-Guide-Large/dp/0961825170
    • David Heinemeier Hansson. “The Majestic Monolith”: https://m.signalvnoise.com/the-majestic-monolith-29166d022228
    • Rich Hickey. “Hammock Driven Development”: https://www.youtube.com/watch?v=f84n5oFoZBc
    • Gregor Hohpe and Bobby Woolf. Enterprise Integration Patterns: http://www.enterpriseintegrationpatterns.com/
    • Mike Knepper. “The Hidden Costs of Leaving a Monolith”: https://8thlight.com/blog/mike-knepper/2016/01/20/hidden-costs-of-leaving-a-monolith.html
    • Dan Manges. “The Modular Monolith: Rails Architecture”: https://medium.com/@dan_manges/the-modular-monolith-rails-architecture-fb1023826fc4
    • Sam Newman. Building Microservices: https://samnewman.io/books/building_microservices/
  85. Learn more
    • Michael Nygard. “The Entity Service Antipattern”: http://www.michaelnygard.com/blog/2017/12/the-entity-service-antipattern/
    • Michael Nygard. Release It!, 2nd edition: https://pragprog.com/book/mnee2/release-it-second-edition
    • Ozan Onay. “You are not Google”: https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb
    • Arnon Rotem-Gal-Oz. “Fallacies of Distributed Computing Explained”: http://www.rgoarchitects.com/Files/fallacies.pdf
    • Cindy Sridharan. “Testing Microservices, the Sane Way”: https://medium.com/@copyconstruct/testing-microservices-the-sane-way-9bb31d158c16
    • Cindy Sridharan. “Testing in Production, the safe way”: https://medium.com/@copyconstruct/testing-in-production-the-safe-way-18ca102d0ef1
    • Jim Waldo, Geoff Wyant, Ann Wolrath, and Sam Kendall. “A Note on Distributed Computing”: http://web.cs.wpi.edu/~cs3013/a11/Papers/Waldo_NoteOnDistributedComputing.pdf
  86. 8th Light, Inc. Colin Jones @trptcolin https://8thlight.com Thank you!