Alexandre Touret
June 21, 2024

[GEECON24] Let's Learn to Identify Technical Requirements for Better Design

Have you ever heard phrases like "it must work 24/7," "I want 100% availability," only to end up with "in reality, a VM will be more than sufficient"? Or conversely, "No SLA, my platform is not critical, it just needs to run precisely at 6:54 AM on the first day of the month"? If these situations sound familiar, don't miss out! Whether these Non-Functional Requirements are explicit or not, they are the keystone of any architecture aligned with client needs.

Drawing on two fictional examples (any resemblance to reality is purely coincidental, or maybe not), we will explore how to navigate the pitfalls of overengineering and establish a pragmatic approach to identifying the right architecture for the right business need.

By the end of this presentation, we'll know how to identify those elusive NFRs that will help us design better architectures while avoiding unnecessary complexity!

Transcript

  1. Let's Learn to Identify Technical Requirements for Better Design. Philippe Duval & Alexandre Touret. Questions: sli.do / #geecon
  2. Sensitive data repository
     Customer’s goals: ❑ 1 million registered users ❑ 500 GB of data ❑ 50 TPS (peak)
     After 2 years…: ❑ 40 million active users ❑ 40 TB of secured storage ❑ 1,200 TPS
  3. Backstage… « It doesn’t handle the load » « Unavailable » « It crashed again » « The database is dying… yet again… » « Latency is too high » « It lags » « The app doesn’t scale »
  4. How to be part of the 31%? → Go beyond the functional requirements! ❑ Understand the technical requirements ❑ Draft the simplest design possible ❑ Adapt design costs to the business value
  5. Two projects for the following platforms: ❑ Available 24/7 and beyond ❑ Available « We don’t know »
  6. We design payments technology that powers the growth of millions of businesses around the world.
     7000+ engineers in over 40 countries
     Managing 43+ billion transactions per year
     €250M spent in R&D every year
     Handling 150+ payment methods
     #1 European payment processor
  7. SLO/SLI/SLA?
     ❑ Service Level Objective: INTERNAL service level objectives. Example: 99% of the web pages must be loaded in less than 2 sec.
     ❑ Service Level Indicator: indicator for checking our SLO. Example: effective measurement of the rendering time captured from the HTTP server access logs.
     ❑ Service Level Agreement: contractual agreement (SLA < SLO).
     […] provides web pages and provides APIs. The customer who uses our application.
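
The SLO/SLI relationship on this slide can be sketched in a few lines of Python. This is only an illustration: the access-log format (render time in seconds as the last field), the sample lines, and the function names are assumptions, not something shown in the deck.

```python
SLO_TARGET = 0.99             # "99% of the web pages..."
SLO_THRESHOLD_SECONDS = 2.0   # "...must be loaded in less than 2 sec"


def parse_render_time(log_line: str) -> float:
    """Read the render time from a log line (hypothetical format: last field, in seconds)."""
    return float(log_line.rsplit(maxsplit=1)[-1])


def compute_sli(render_times: list[float], threshold: float = SLO_THRESHOLD_SECONDS) -> float:
    """SLI: fraction of page loads that finished in less than the threshold."""
    if not render_times:
        raise ValueError("no measurements to evaluate")
    fast = sum(1 for t in render_times if t < threshold)
    return fast / len(render_times)


def slo_met(sli: float, target: float = SLO_TARGET) -> bool:
    """The SLO holds when the measured indicator reaches the target."""
    return sli >= target


# Made-up sample of HTTP server access-log lines.
SAMPLE_LOG = [
    "GET /home 200 0.412",
    "GET /cart 200 1.870",
    "GET /checkout 200 2.350",
    "GET /home 200 0.655",
]

times = [parse_render_time(line) for line in SAMPLE_LOG]
sli = compute_sli(times)
print(f"SLI = {sli:.2%}, SLO met: {slo_met(sli)}")
```

Keeping the indicator (a measurement) separate from the objective (a target) mirrors the slide: the same SLI can later be compared with a looser contractual SLA.
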
  8. Availability
     ❑ Ability of a platform to be available and provide a service to users when they need it (e.g. 99%)
     ❑ It can be restricted to a specific time slot (e.g. from 8 AM to 8 PM)
     Availability → authorized interruption time (per 365-day year):
     90% → 36 days and 12 hours
     95% → 18 days and 6 hours
     99% → 3 days, 15 hours and 36 minutes
     99.9% → 8 hours, 45 minutes and 36 seconds
     99.95% → 4 hours, 22 minutes and 48 seconds
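
The interruption budget behind an availability target is simple arithmetic. The sketch below assumes a 365-day year; the function name and the restricted-slot example are illustrative assumptions.

```python
from datetime import timedelta

YEAR = timedelta(days=365)  # assumption: non-leap year


def allowed_downtime(availability_percent: float, period: timedelta = YEAR) -> timedelta:
    """Authorized interruption time for a given availability target over a period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    seconds = period.total_seconds() * (100 - availability_percent) / 100
    return timedelta(seconds=round(seconds))


for target in (90, 95, 99, 99.9, 99.95):
    print(f"{target}% -> {allowed_downtime(target)}")

# Availability restricted to a time slot, e.g. 8 AM to 8 PM (12 hours a day):
business_hours = timedelta(hours=12 * 365)
print("99% during business hours ->", allowed_downtime(99, business_hours))
```

Restricting the measured period shrinks the absolute budget, but outages outside the slot no longer count against it, which is often the cheaper requirement to meet.
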
  9. RTO/RPO, DRS?
     ❑ Disaster Recovery Site: what do we do when us-east1 is down?
     ❑ Recovery Time Objective: how long does it take to restore the service after a crash? Example: recover in 1h.
     ❑ Recovery Point Objective: how much data do you allow yourself to lose? Example: orders from the last five minutes.
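
One way to make RTO and RPO testable is to compare them with what the platform actually does. The sketch below reuses the two example targets from the slide; the backup interval and the detection, failover and restore durations are hypothetical inputs, not figures from the talk.

```python
from datetime import timedelta

RTO = timedelta(hours=1)    # slide example: "Recover in 1h"
RPO = timedelta(minutes=5)  # slide example: "Orders from the last five minutes"


def meets_rpo(backup_interval: timedelta, rpo: timedelta = RPO) -> bool:
    """Worst case, everything written since the last backup or replication point is lost."""
    return backup_interval <= rpo


def meets_rto(detection: timedelta, failover: timedelta, restore: timedelta,
              rto: timedelta = RTO) -> bool:
    """The service is back once the outage is detected, traffic is switched and data is restored."""
    return detection + failover + restore <= rto


# Hypothetical setups: a nightly dump versus replication every minute.
print("nightly backup meets RPO:", meets_rpo(timedelta(hours=24)))
print("1-minute replication meets RPO:", meets_rpo(timedelta(minutes=1)))
print("recovery plan meets RTO:",
      meets_rto(timedelta(minutes=5), timedelta(minutes=10), timedelta(minutes=30)))
```
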
  10. 24/7 available application, eleven nines Inc. ✓ Ordering ✓ Payment « THIS APPLICATION MUST NEVER CRASH, 24/7, 100% OF THE TIME »
  11. Financial impacts
     Total EUR 38.65:
     ❑ Cloud SQL • 24/7, db-standard-2 instance • 10.0 GiB storage • EUR 38.65
     ❑ Cloud Run • CPU per request, 2 CPU • Memory: 1 GiB • 5/7, 11h-22h, 5,000 orders/month • EUR 0.00
     Total EUR 825.81:
     ❑ Cloud Spanner • 100 processing units • 10 GiB storage • 50 GiB backup • EUR 72.90
     ❑ App Engine standard instances • 24/7, F4_1G x 4 instances • EUR 752.91
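
The two totals on this slide can be checked from the line items. Grouping the four services into two options is an assumption, inferred only from the fact that the amounts add up; the option names are illustrative.

```python
from decimal import Decimal

# Line items in EUR as quoted on the slide; the grouping is an assumption.
option_a = {"Cloud SQL": Decimal("38.65"), "Cloud Run": Decimal("0.00")}
option_b = {"Cloud Spanner": Decimal("72.90"), "App Engine standard": Decimal("752.91")}


def total(items: dict[str, Decimal]) -> Decimal:
    """Sum the line items with exact decimal arithmetic."""
    return sum(items.values(), Decimal("0"))


print("Option A:", total(option_a))
print("Option B:", total(option_b))
print("Cost ratio B/A:", round(total(option_b) / total(option_a), 1))
```
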
  12. An online French pastry shop. Short description of the purpose: « An e-commerce platform for French local products »
  13. Architecture (V1): Amazon API Gateway, Lambda, Amazon DynamoDB, Amazon Lambda, Kafka, Client
  14. Why may these platforms belong to the 31%? ❑ The NFRs weren’t adapted to the business model ❑ The cost management wasn’t considered (vs scalability) ❑ The NFRs weren’t fully aligned with the business domain. OK, let’s fix them!
  15. Our approach: understand the functional requirements; technical goals; know and get feedback from Ops; evaluate the risks; answer the 1-billion-dollar question: « Is it worth it? »
  18. High Availability, Auto Recovery. New paradigm: from « Hope it will never happen » to « recover easily »
  19. How to mitigate latency problems?
     ❑ Timeouts: “avoid resource overload / avoid waiting forever”
     ❑ Retries: “Mitigate punctual unavailability”
     ❑ Exponential Backoff: “…while avoiding the domino effect”
     ❑ Circuit breakers: “do not […]”
     ❑ Graceful degradation: “what is my MVP”
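
The patterns named on this slide can be combined in a small sketch. Everything below is a minimal illustration under assumptions: the class and function names, the thresholds, and the `fetch_recommendations` dependency are hypothetical, and a production system would more likely rely on an existing resilience library.

```python
import random
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses to call a dependency that keeps failing."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open, failing fast")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def backoff_delays(retries, base=0.1, cap=5.0):
    """Exponential delays: base, 2*base, 4*base, ... each capped at `cap` seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_retries(func, retries=3, base=0.1, cap=5.0,
                      sleep=time.sleep, jitter=random.random):
    """Call once, then retry up to `retries` times with jittered exponential backoff."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return func()
        except CircuitOpenError:
            raise  # retrying against an open circuit would only add load
        except Exception:
            if attempt == retries:
                raise
            sleep(delays[attempt] * jitter())  # random wait between 0 and the delay


def fetch_recommendations():
    """Hypothetical dependency that is currently too slow."""
    raise TimeoutError("upstream too slow")


breaker = CircuitBreaker(failure_threshold=2)
try:
    items = call_with_retries(lambda: breaker.call(fetch_recommendations),
                              retries=2, base=0.01)
except Exception:
    items = []  # graceful degradation: serve the page without recommendations
print(items)
```

The wrapped call should carry its own timeout (for instance the `timeout` argument of `urllib.request.urlopen`), otherwise a hanging dependency never produces the failure that the retries and the breaker react to.
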
  21. Think about Observability by Design! It’s more a feature of the platform than just a bunch of tools...
     Visualize; Alert; Metrics; Traces; Logs; Global Dashboard; Technical console for advanced logs; Prometheus ecosystem; Elastic ecosystem; CaaS specific; OpenTelemetry ecosystem; Legacy
  23. Probability: Low 1, Medium 2, High 3. Impact: Low 1, Medium 2, High 3. Risk: database access issue
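
A probability/impact grid like this one is often turned into a single score so that risks can be ranked. The sketch below assumes the common convention of multiplying the two levels; the slide only shows the 1 to 3 scales, and the levels assigned to each risk (as well as the second risk) are made up for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def risk_score(probability: Level, impact: Level) -> int:
    """Score a risk as probability x impact (an assumed convention, range 1 to 9)."""
    return int(probability) * int(impact)


# Hypothetical assessment; only the first risk name comes from the slide.
risks = {
    "Database access issue": (Level.MEDIUM, Level.HIGH),
    "Slow third-party API": (Level.HIGH, Level.LOW),
}

ranked = sorted(risks.items(), key=lambda item: risk_score(*item[1]), reverse=True)
for name, (probability, impact) in ranked:
    print(f"{name}: probability={probability.name}, impact={impact.name}, "
          f"score={risk_score(probability, impact)}")
```
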
  26. Is it worth it? « Does the creation of pastries require 99.95% availability? » « Is the mobile app mandatory for ordering a kebab? »
  27. What did our approach bring? ❑ Clear, well-fitted technical goals ❑ A pragmatic risk assessment ❑ Simplicity & evolvability
  28. If you already operate platforms… Onboard the Ops. Gather, pinpoint & […] the risks. Build a knowledge base.
  29. To sum up: Dig into the user needs. Pinpoint the requirements. Get down to basics! Iterate!
  30. Don’t be a stranger! Follow & get in touch. Follow us: @malkav30, linkedin.com/in/phduval/; @touret_alex, linkedin.com/in/atouret. Follow our tech team: blog.worldline.tech, @Worldlinetech. Feedback