The Art of Architecting for Scale

Microservice Meetup Munich

SQUER Solutions

June 20, 2023

Transcript

  1. @duffleit
    The Art of
    Architecting for Scale
    @duffleit

    View Slide

  2. @duffleit
    DAVID LEITNER
    Principal Engineer
    👋 [email protected]
    🌎 @duffleit

    View Slide

  3. @duffleit
    📦
    👧 🧑 👧 🧑 👧
    🔥
    Safely and sustainably reduce
    lead time to thank you.
    Daniel Terhorst-North
    Complex System

    View Slide

  4. @duffleit
    🔥
    🔴
    Simple System

    View Slide

  5. @duffleit
    🥳
    🟢
    Simple System

    View Slide

  6. @duffleit
    Simple System

    View Slide

  7. @duffleit
    Complicated System
    🔥
    🔴

    View Slide

  8. @duffleit
    Complicated System

    View Slide

  9. @duffleit
    Complicated System
    Module A Module B
    Module C

    View Slide

  10. @duffleit
    Complicated System
    Distributed
    Systems
    Service A Service B
    Service C

    View Slide

  11. @duffleit
    Deployment Units
    Monolithic Distributed
    Modularisation
    Bad
    Well
    Big Ball
    Of Mud

    View Slide

  12. @duffleit
    “Sadly, architecture has
    been undervalued for so
    long that many
    engineers regard life
    with a BIG BALL OF MUD
    as normal.“
    Foote & Yoder

    View Slide

  13. @duffleit
    Deployment Units
    Monolithic Distributed
    Modularisation
    Bad
    Well
    Big Ball
    Of Mud
    Distributed
    Monolith
    Well-Structured
    Modulith

    View Slide

  14. @duffleit
    There is a multitude of reasons
    to go for Distributed Systems.
    Modularisation is none of them.

    View Slide

  15. @duffleit
    If you can’t build a well-structured
    monolith, what makes you think
    microservices are the answer?
    Simon Brown

    View Slide

  16. @duffleit
    Module A Module B
    Module C
    Single Deployment Unit

    View Slide

  17. @duffleit
    Service A Service B
    Service C

    View Slide

  18. @duffleit
    Service A Service B
    Service C
    Individual
    Scaling Demand
    10x Users
    Usual Traffic

    View Slide

  19. @duffleit
    Service A Service B
    Service C
    Individual
    Scaling Demand
    Technology
    Segmentation

    View Slide

  20. @duffleit
    Service A Service B
    Service C
    Individual
    Scaling Demand
    Technology
    Segmentation
    🇩🇪
    🇩🇪
    🇺🇸
    Co-Locating

    View Slide

  21. @duffleit
    Service A Service B
    Service C

    View Slide

  22. @duffleit
    Service A Service B
    Service C
    👧 🧑
    🧑
    👧 🧑
    🧑
    👧 🧑
    🧑 Ownership
    You build it,
    You Own it
    You build it,
    You Run it

    View Slide

  23. @duffleit
    Single Deployment Unit
    Module A Module B
    Module C
    👧 🧑
    🧑
    👧 🧑
    🧑
    👧 🧑
    🧑 Ownership
    You build it,
    You Own it
    You build it,
    You Run it
    🔥

    View Slide

  24. @duffleit
    Single Deployment Unit
    Module A Module B
    Module C
    👧 🧑
    🧑
    👧 🧑
    🧑
    👧 🧑
    🧑 Ownership
    You build it,
    You Own it
    You build it,
    You Run it
    🔥
    🔥

    View Slide

  25. @duffleit
    Service A Service B
    Service C
    👧 🧑
    🧑
    👧 🧑
    🧑
    👧 🧑
    🧑 Take Ownership
    and Responsibility

    View Slide

  26. @duffleit
    Technical
    Benefits
    Organizational
    Autonomy
    Scale Engineering
    Co-Locating
    Technology
    Segmentation
    Individual
    Scaling Demand

    View Slide

  27. @duffleit
    Technical
    Benefits Organizational
    Autonomy
    Scale Engineering
    Co-Locating
    Technology
    Segmentation
    Individual
    Scaling Demand
    20% 80%

    View Slide

  28. @duffleit
    👧 🧑
    🧑👧 🧑
    🧑
    👧 🧑
    🧑
    📦

    View Slide

  29. @duffleit
    📦
    📦
    👧 🧑
    🧑
    📦
    👧 🧑
    🧑
    👧 🧑
    🧑

    View Slide

  30. @duffleit
    Deployment Units
    Monolithic Distributed
    Modularisation
    Bad
    Well
    Big Ball
    Of Mud
    Well-Structured
    Modulith
    Distributed
    Monolith
    Autonomous
    Service-Based
    Architecture

    View Slide

  31. @duffleit
    Deployment Units
    Distributed
    Modularisation
    Bad
    Well
    Big Ball
    Of Mud
    Well-Structured
    Modulith
    Distributed
    Monolith
    Autonomous
    Service-Based
    Architecture
    Monolithic

    View Slide

  32. @duffleit
    Deployment Units
    Monolithic Distributed
    Modularisation
    Bad
    Well
    Complex
    Distributed
    System
    Complicated
    Monolithic
    System

    View Slide

  33. @duffleit
    👧 🧑
    🧑👧 🧑
    🧑
    👧 🧑
    🧑
    🔴
    🔥
    🟢
    Complicated Monolithic System
    Age
    Calculation
    Lending System

    View Slide

  34. @duffleit
    👧 🧑
    🧑
    🔥
    👧 🧑
    🧑
    👧 🧑
    🧑
    🔴
    🔴
    Complex Distributed System
    Loan
    Team
    Lending System

    View Slide

  35. @duffleit
    👧 🧑
    🧑
    🔥
    👧 🧑
    🧑
    👧 🧑
    🧑
    🔴
    🔴
    🔴
    Complex Distributed System
    Loan
    Team
    Lending System

    View Slide

  36. @duffleit
    👧 🧑
    🧑
    🔥
    👧 🧑
    🧑
    👧 🧑
    🧑
    🔴
    Complex Distributed System
    Loan
    Team
    Lending System

    View Slide

  37. @duffleit
    Relation Between
    Root Cause and Effect
    Complex
    Distributed
    System
    Complicated
    Monolithic
    System
    is usually
    given
    is usually
    not given

    View Slide

  38. @duffleit
    What’s the difference between a
    method call within a single deployment unit

    View Slide

  39. @duffleit
    What’s the difference between a
    method call within a single deployment unit
    Deployment Unit
    Module A Module B
    moduleB.createUser() fun createUser() { /*...*/ }

    View Slide

  40. @duffleit
    What’s the difference between a
    method call within a single deployment unit,
    and a method call across the network.
    Deployment Unit
    Deployment Unit
    Service A Service B
    restClient.user() fun createUser() { /*...*/ }

    View Slide

  41. @duffleit
    What’s the difference between a
    method call within a single deployment unit,
    and a method call across the network.
    Everything.

    View Slide
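
The contrast on the last few slides can be sketched in Kotlin, the language the slide snippets appear to use. This is a minimal illustration, not the speaker's code: `UserModule`, `UserTransport`, `RemoteResult`, and `createUserRemotely` are hypothetical names, and the two failure outcomes are only examples of what a network adds.

```kotlin
// In-process call: the only outcomes are a return value or a bug.
class UserModule {
    private val users = mutableListOf<String>()

    fun createUser(name: String): Int {
        users.add(name)
        return users.size // position doubles as a simple user id
    }
}

// The same operation across the network has more outcomes (hypothetical model).
sealed interface RemoteResult {
    data class Success(val userId: Int) : RemoteResult
    object Timeout : RemoteResult     // no answer in time: the user may or may not exist
    object Unavailable : RemoteResult // service down or unreachable
}

// Hypothetical stand-in for the REST client on the slide.
fun interface UserTransport {
    fun send(name: String): RemoteResult
}

// The caller has to decide what each outcome means for the user.
fun createUserRemotely(transport: UserTransport, name: String): String =
    when (val result = transport.send(name)) {
        is RemoteResult.Success -> "created user ${result.userId}"
        RemoteResult.Timeout -> "unknown outcome, the user may or may not exist"
        RemoteResult.Unavailable -> "not created, try again later"
    }
```

The in-process call has one happy path. The remote call forces the caller to handle an unknown outcome, which no amount of code inside Service B can remove.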

  42. @duffleit
    👧 🧑
    🧑
    🔥
    👧 🧑
    🧑
    👧 🧑
    🧑
    🔴
    Complex Distributed System

    View Slide

  43. @duffleit
    👧 🧑
    🧑
    🔥
    👧 🧑
    🧑
    👧 🧑
    🧑
    🔴
    Complex Distributed System
    https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing

    View Slide

  44. @duffleit
    Relation Between
    Root Cause and Effect
    Complex
    Distributed
    System
    Complicated
    Monolithic
    System
    is usually
    given
    is usually
    not given
    Cause of Failures reliability availability

    View Slide

  45. @duffleit
    📦
    📦
    👧 🧑
    🧑
    📦
    👧 🧑
    🧑
    👧 🧑
    🧑

    View Slide

  46. @duffleit
    📦
    📦
    👧 🧑
    🧑
    📦
    👧 🧑
    🧑
    👧 🧑
    🧑
    🔥

    View Slide

  47. @duffleit
    📦
    📦
    👧 🧑
    🧑
    📦
    👧 🧑
    🧑
    👧 🧑
    🧑
    🤯

    View Slide

  48. @duffleit
    Relation Between
    Root Cause and Effect
    Complex
    Distributed
    System
    Complicated
    Monolithic
    System
    is usually
    given
    is usually
    not given
    Cause of Failures reliability availability

    View Slide

  49. @duffleit
    Relation Between
    Root Cause and Effect
    Complex
    Distributed
    System
    Complicated
    Monolithic
    System
    is usually
    given
    is usually
    not given
    Cause of Failures reliability availability

    View Slide

  50. @duffleit
    👧 🧑
    🧑
    🔥
    👧 🧑
    🧑
    🔴
    Complex Distributed System
    👧 🧑
    🧑
    Loan
    Team
    Lending System

    View Slide

  51. @duffleit
    👧 🧑
    🧑
    🔥
    👧 🧑
    🧑
    🔴
    🔴
    Complex Distributed System
    👧 🧑
    🧑
    🟢
    Account Team
    Loan
    Team
    Lending System

    View Slide

  52. @duffleit
    👧 🧑
    🧑
    🔥
    👧 🧑
    🧑
    Complex Distributed System
    👧 🧑
    🧑
    🔴
    Loan
    Team
    Lending System
    Account Team

    View Slide

  53. @duffleit
    We strive for a stable production.
    But this is totally elusive
    in a complex and sometimes chaotic environment.
    The sooner we accept this, the better.

    View Slide

  54. @duffleit
    We need to architect with
    Failure in Mind.
    🔥

    View Slide

  55. @duffleit
    👧 🧑
    🧑
    👧 🧑
    👧 🧑
    🧑
    Payment Service
    🧑
    Account Service
    🛍 Online Shopping System

    View Slide

  56. @duffleit
    👧 🧑
    🧑
    👧 🧑
    👧 🧑
    🧑
    🔴
    Payment Service
    🧑
    Account Service
    🔥
    Users[]
    🔥
    🛍 Online Shopping System

    View Slide

  57. @duffleit
    👧 🧑
    🧑
    👧 🧑
    👧 🧑
    🧑
    🔴
    Payment Service
    🧑
    Account Service
    Users[]
    🔥
    🛍 Online Shopping System

    View Slide

  58. @duffleit
    👧 🧑
    🧑
    👧 🧑
    👧 🧑
    🧑
    🔴
    Payment Service
    🧑
    Account Service
    Users[]
    cache
    🛍 Online Shopping System

    View Slide

  59. @duffleit
    👧 🧑
    🧑
    👧 🧑
    👧 🧑
    🧑
    🔴
    Payment Service
    🧑
    Account Service
    Users[]
    stream
    UserChanged
    🛍 Online Shopping System

    View Slide

  60. @duffleit
    👧 🧑
    🧑
    👧 🧑
    👧 🧑
    🧑
    🔴
    Payment Service
    🧑
    Account Service
    projection
    stream
    Users[]
    UserChanged
    🛍 Online Shopping System

    View Slide

  61. @duffleit
    👧 🧑
    🧑
    👧 🧑
    👧 🧑
    🧑
    🔴
    Payment Service
    🧑
    Account Service
    projection
    stream
    Users[]
    UserChanged
    🛍 Online Shopping System

    View Slide
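
Slides 58 to 61 move from a cache to a projection fed by a `UserChanged` stream. A minimal sketch of such a projection follows; the event fields, the `version` counter, and the class names are assumptions made for illustration, not details from the talk.

```kotlin
// Event published by the Account Service whenever a user changes (hypothetical shape).
data class UserChanged(val userId: String, val name: String, val version: Long)

// Local, read-only view of users that the Payment Service owns and queries.
class UserProjection {
    private val users = mutableMapOf<String, UserChanged>()

    // Apply one event; duplicates and out-of-order deliveries are ignored.
    fun handle(event: UserChanged) {
        val current = users[event.userId]
        if (current == null || event.version > current.version) {
            users[event.userId] = event
        }
    }

    // Answered from local state, so it works even while the Account Service is down.
    fun nameOf(userId: String): String? = users[userId]?.name
}
```

The trade-off in this sketch is staleness: the Payment Service can keep answering during an Account Service outage, but only with the data it has already received.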

  62. @duffleit
    👧 🧑
    🧑
    👧 🧑
    👧 🧑
    🧑
    Payment Service
    🧑
    Account Service
    projection
    stream
    🔥🔥
    🔥
    🔥
    Users[]
    🛍 Online Shopping System

    View Slide

  63. @duffleit
    👧 🧑
    🧑
    👧 🧑
    👧 🧑
    🧑
    Payment Service
    🧑
    Account Service
    🔥
    🚀
    LBs
    CBs
    Caching

    View Slide
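
Slide 63 lists circuit breakers (CBs) next to load balancers and caching. The sketch below shows one simple way a circuit breaker can work; the thresholds, the injected `clock`, and the `fallback` parameter are illustrative assumptions, and a production system would more likely use an existing resilience library.

```kotlin
// Minimal circuit breaker: after too many consecutive failures it stops calling
// the dependency for a while and answers with a fallback instead.
class CircuitBreaker(
    private val failureThreshold: Int = 3,
    private val openMillis: Long = 1_000,
    private val clock: () -> Long = System::currentTimeMillis
) {
    private var consecutiveFailures = 0
    private var openedAt: Long? = null

    fun <T> call(fallback: T, action: () -> T): T {
        val opened = openedAt
        // Open: fail fast without touching the dependency.
        if (opened != null && clock() - opened < openMillis) return fallback
        return try {
            val result = action() // closed, or half-open after the wait time
            consecutiveFailures = 0
            openedAt = null
            result
        } catch (e: Exception) {
            consecutiveFailures++
            if (consecutiveFailures >= failureThreshold) openedAt = clock()
            fallback
        }
    }
}
```

A caller might write `breaker.call(cachedUsers) { accountClient.loadUsers() }`, where both names are placeholders for whatever the service already has.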

  64. @duffleit
    We need to architect with
    Failure and Scale in Mind.
    🔥

    View Slide

  65. @duffleit
    Services that can run in a
    chaotic environment can run everywhere.
    Services that can only run in a
    stable environment can only run there.

    View Slide

  66. @duffleit
    👧 🧑
    🧑
    🔥
    👧 🧑
    👧 🧑
    🧑
    🔴
    Payment Team
    🧑
    Account Team
    Execute
    Payment
    Payment
    Gateway
    Team

    View Slide

  67. @duffleit
    👧 🧑
    🧑
    👧 🧑
    🧑
    👧 🧑
    Payment Team
    🧑
    Account Team
    🤯

    View Slide

  68. @duffleit
    <
    👧 🧑
    🧑
    🔥
    SEPA Payment Team
    Account Team
    Execute
    Payment
    CREDITCARD
    Payment
    Team
    <
    👧 🧑
    🧑
    🧑
    👧 🧑

    View Slide

  69. @duffleit
    <
    👧 🧑
    🧑
    🔥
    SEPA Payment Team
    Account Team
    Execute
    Payment
    CREDITCARD
    Payment
    Team
    <
    COMPLICATED End2End flow
    👧 🧑
    🧑
    COMPLICATED End2End flow
    🧑
    👧 🧑
    😌

    View Slide

  70. @duffleit
    👧 🧑
    🧑
    🔥
    🔴
    Payment Team
    Account Team
    Execute
    Payment
    Payment
    Gateway
    Team
    COMPLEX End2End flow
    👧 🧑
    🧑
    👧 🧑
    🧑

    View Slide

  71. @duffleit
    We need to architect for
    End-2-End Customer Journeys.
    🪢

    View Slide

  72. @duffleit
    <
    SEPA Payment Team
    Account Team
    FRAUD
    CHECK
    <
    🧑
    👧 🧑
    👧 🧑
    🧑
    Core
    Team
    👧 🧑
    🧑
    💁 Check
    Payment
    👩💻

    View Slide

  73. @duffleit
    <
    SEPA Payment Team
    Account Team
    Check
    Payment
    FRAUD
    CHECK
    <
    Core
    Team
    🧑
    👧 🧑
    👧 🧑
    🧑
    👧 🧑
    🧑
    How many
    Nines?
    Availability?
    Nines    Percentage  Yearly Outage
    1 Nine   90%         36.5 days
    2 Nines  99%         3.65 days
    3 Nines  99.9%       8.76 hours
    4 Nines  99.99%      52.56 minutes
    5 Nines  99.999%     5.26 minutes
    6 Nines  99.9999%    31.5 seconds
    A Planned Outage
    is an Outage

    View Slide
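
The yearly outage column follows directly from the percentage. A small sketch of the arithmetic, assuming a 365-day year; `yearlyOutageMinutes` is a hypothetical helper name.

```kotlin
import kotlin.math.pow

// Allowed downtime per year, in minutes, for an availability of N nines.
fun yearlyOutageMinutes(nines: Int): Double {
    val unavailability = 10.0.pow(-nines) // 3 nines -> 0.001
    val minutesPerYear = 365 * 24 * 60    // 525,600 minutes in a non-leap year
    return unavailability * minutesPerYear
}
```

For four nines this gives about 52.56 minutes, and for three nines about 525.6 minutes, which is the 8.76 hours in the table.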

  74. @duffleit
    Can our service have
    99.99% availability
    if our dependencies have a
    99.9% availability?

    View Slide
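
One way to reason about this question, under the simplifying assumptions that failures are independent and that every request needs every dependency synchronously: availabilities multiply, so the chain cannot be more available than its weakest part. `serialAvailability` is a hypothetical helper, not something from the talk.

```kotlin
// Availability of a call chain in which every part must be up at the same time.
// Assumes independent failures; values are fractions such as 0.999.
fun serialAvailability(vararg parts: Double): Double =
    parts.fold(1.0) { acc, availability -> acc * availability }
```

Under these assumptions, a service that is itself up 99.99% of the time and calls one 99.9% dependency on every request ends up at roughly 99.89%. Taking the dependency out of the synchronous path, as the cache and stream slides earlier in the deck do, is what changes the picture.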

  75. @duffleit
    <
    SEPA Payment Team
    Account Team
    Check
    Payment
    FRAUD
    CHECK
    <
    🧑
    👧 🧑
    👧 🧑
    🧑
    Core
    Team
    👧 🧑
    🧑
    How many
    Nines?
    Nines    Percentage  Yearly Outage
    1 Nine   90%         36.5 days
    2 Nines  99%         3.65 days
    3 Nines  99.9%       8.76 hours
    4 Nines  99.99%      52.56 minutes
    5 Nines  99.999%     5.26 minutes
    6 Nines  99.9999%    31.5 seconds
    A Planned Outage
    is an Outage
    rescheduler
    Availability?

    View Slide
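
Slide 75 adds a box labelled `rescheduler` without further detail. One possible reading, offered here as an assumption, is that a payment check which cannot be completed right now is parked and retried later instead of failing the caller. A hypothetical sketch:

```kotlin
// Hypothetical rescheduler: work that fails now is parked and retried later.
// `action` throws when the dependency (for example the fraud check) is unavailable.
class Rescheduler<T>(private val action: (T) -> Unit) {
    private val pending = ArrayDeque<T>()

    // Try once; on failure park the item instead of propagating the error.
    // Returns true if the item was processed immediately.
    fun submit(item: T): Boolean {
        val done = runCatching { action(item) }.isSuccess
        if (!done) pending.addLast(item)
        return done
    }

    // Meant to be called periodically: retries each parked item once
    // and returns how many are still waiting.
    fun retryPending(): Int {
        repeat(pending.size) {
            val item = pending.removeFirst()
            if (runCatching { action(item) }.isFailure) pending.addLast(item)
        }
        return pending.size
    }
}
```

In this reading the caller accepts a delayed result in exchange for not inheriting the dependency's downtime. The sketch keeps parked items in memory only, so a real version would need durable storage.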

  76. @duffleit
    <
    SEPA Payment Team
    Account Team
    Check
    Payment
    FRAUD
    CHECK
    <
    🧑
    👧 🧑
    👧 🧑
    🧑
    Core
    Team
    👧 🧑
    🧑
    How many
    Nines?
    Nines    Percentage  Yearly Outage
    1 Nine   90%         36.5 days
    2 Nines  99%         3.65 days
    3 Nines  99.9%       8.76 hours
    4 Nines  99.99%      52.56 minutes
    5 Nines  99.999%     5.26 minutes
    6 Nines  99.9999%    31.5 seconds
    🔴
    Availability?

    View Slide

  77. @duffleit
    <
    SEPA Payment Team
    Account Team
    Check
    Payment
    FRAUD
    CHECK
    <
    🧑
    👧 🧑
    👧 🧑
    🧑
    Core
    Team
    👧 🧑
    🧑
    MTTF?
    Platform teams
    SLAs

    View Slide

  78. @duffleit
    We need to architect driven
    by Availability.
    9⃣

    View Slide

  79. @duffleit
    <
    SEPA Payment Team
    Account Team
    FRAUD
    CHECK
    <
    👧 🧑
    🧑
    Legacy
    System
    👧 🧑
    🧑
    👧 🧑
    🧑

    View Slide

  80. @duffleit
    <
    SEPA Payment Team
    Account Team
    FRAUD
    CHECK
    <
    👧 🧑
    🧑
    Legacy
    System
    👧 🧑
    🧑

    View Slide

  81. @duffleit
    Readers
    Writers
    Writers
    Readers
    API
    User Fraud
    Data
    User Fraud
    Data
    ⏳ Downtime Given

    View Slide

  82. @duffleit
    Readers
    Writers
    Writers
    Readers
    API
    User Fraud
    Data
    User Fraud
    Data
    🔥

    View Slide

  83. @duffleit
    Readers
    Writers
    Writers
    Readers
    API
    User Fraud
    Data
    User Fraud
    Data
    🔥

    View Slide

  84. @duffleit
    CDC
    Readers
    Writers
    Writers
    Readers
    API
    User Fraud
    Data
    User Fraud
    Data

    View Slide

  85. @duffleit
    Readers Writers
    Writers
    Readers
    API
    User Fraud
    Data
    Events
    User Fraud
    Data
    CDC

    View Slide

  86. @duffleit
    Readers
    Writers
    Writers
    Readers
    API
    User Fraud
    Data
    User Fraud
    Data
    🔥
    Events
    CDC

    View Slide

  87. @duffleit
    Readers
    Writers
    Writers
    Readers
    API
    User Fraud
    Data
    User Fraud
    Data
    👷
    Events
    CDC

    View Slide

  88. @duffleit
    Readers
    Writers
    Writers
    Readers
    API
    User Fraud
    Data
    User Fraud
    Data
    Zero-Downtime
    Migration
    Events
    CDC

    View Slide
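
Slides 81 to 88 build up a migration in which a CDC feed copies changes from the legacy store into a new one while readers and writers keep working. A minimal sketch of the replay side, assuming an ordered change log with a monotonically increasing `sequence` number; all names are hypothetical.

```kotlin
// One captured change from the legacy database (hypothetical shape).
// A null value stands for a delete.
data class ChangeEvent(val sequence: Long, val key: String, val value: String?)

// New store that is kept in sync by replaying the legacy change log.
class ReplicatedStore {
    private val data = mutableMapOf<String, String>()

    var lastApplied = 0L
        private set

    // Idempotent: an event that was already applied is skipped,
    // so the feed can be restarted from an earlier position safely.
    fun replay(event: ChangeEvent) {
        if (event.sequence <= lastApplied) return
        val value = event.value
        if (value == null) data.remove(event.key) else data[event.key] = value
        lastApplied = event.sequence
    }

    fun read(key: String): String? = data[key]
}
```

Once `lastApplied` has caught up with the legacy log, readers can be pointed at the new store first and writers afterwards, which is one way to get the ordering the slides suggest without a downtime window.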

  89. @duffleit
    We need to architect driven by
    a Zero-Downtime Mindset.
    🦜

    View Slide

  90. @duffleit
    We need to
    Start Dancing
    with the Beast

    View Slide

  91. @duffleit
    👧 🧑
    🧑
    👧 🧑
    🧑
    Complex Distributed System
    👧 🧑
    🧑
    Gamedays
    🤡
    🤡
    🤡
    Payment Team

    View Slide

  92. @duffleit
    We need to embrace
    Chaos.
    💣

    View Slide
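
A game day needs a way to make a dependency fail on purpose. The following is a minimal fault-injection wrapper, a sketch only; `ChaosWrapper` and its `failureRate` parameter are hypothetical, and real chaos tooling usually works at the infrastructure level instead.

```kotlin
import kotlin.random.Random

// Wraps a dependency call and makes it fail with a configurable probability.
class ChaosWrapper(
    private val failureRate: Double,             // 0.0 = never fail, 1.0 = always fail
    private val random: Random = Random.Default
) {
    fun <T> call(action: () -> T): T {
        // nextDouble() is in [0, 1), so a rate of 0.0 never triggers and 1.0 always does.
        if (random.nextDouble() < failureRate) {
            throw IllegalStateException("injected failure")
        }
        return action()
    }
}
```

Wrapping the call to a neighbouring service this way during an exercise shows whether fallbacks, circuit breakers, and alerts behave as the team expects.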

  93. @duffleit
    So many things to consider,
    Let’s sum up.

    View Slide

  94. @duffleit
    Complex
    Distributed
    System
    Deployment Units
    Monolithic Distributed
    Complicated
    Monolithic
    System
    Modularisation is
    no reason to go
    for Distributed
    Systems.

    View Slide

  95. @duffleit
    We need to architect with
    Failure and Scale in Mind.
    🔥
    We need to architect for
    End-2-End Customer Journeys.
    🪢
    We need to architect driven
    by Availability.
    9⃣
    We need to architect with
    a Zero-Downtime Mindset.
    🦜
    We need to embrace
    Chaos.
    💣

    View Slide

  96. @duffleit
    👧 🧑
    🧑
    👧 🧑
    🧑
    👧 🧑
    🧑
    🔥

    View Slide

  97. @duffleit
    📦
    👧 🧑
    🧑
    📦
    📦
    Keep the Flow of Value high
    & the Amount of Outages low.
    👧 🧑
    🧑
    👧 🧑
    🧑

    View Slide

  98. @duffleit
    Keep the lead time
    to Thank You low,
    even at scale.

    View Slide

  99. @duffleit
    DAVID LEITNER
    Principal Engineer
    👋 [email protected]
    🌎 @duffleit

    View Slide

  100. @duffleit
    squer.link/arch-scale
    WORKSHOP
    Architecting for Scale
    Free Friends & Family DryRun

    View Slide