[SCaLE16x] Silo-Based Architectures for High Availability Applications

High availability is becoming a de facto requirement of today's applications. Customer-facing IT failures directly cost revenue and trust, as users have grown accustomed to easily switching to more reliable service providers. Lack of availability in internal systems blocks employee productivity and adds to the financial burden. Thus, it is critical to have a healthy, performant, resilient IT structure serving as the backbone of your business. But there are no textbook solutions for achieving five 9s availability. Data redundancy, computing clusters, load balancing and fail-over mechanisms each address one potential issue individually, but none treats the systems in your organisation holistically to maximise business revenue.

Not everyone has the financial and technical ability to use the latest and greatest CDN and offload their high-availability requirements to such third parties. This is where ingenuity comes into play, and my goal is to show you a different way of architecting an application, one that is centered around solving your own business needs without a huge additional cost. We devised this solution while working with a very large US airline, using open-source technologies, to meet the Black Friday & Cyber Monday traffic requirements.

Silos are a clever method of grouping servers in such a way that they can be scaled both horizontally and vertically, depending on the actual application needs. Most importantly, this approach frees you from over-optimizing the architecture upfront by allowing fine adjustments that are easy to integrate into your Agile workflow.

Georgiana Gligor

March 10, 2018

Transcript

  1. SILO-BASED ARCHITECTURES FOR HIGH AVAILABILITY APPLICATIONS
    Georgiana Gligor / Tekkie Consulting / @gbtekkie

  2. @gbtekkie SCaLE 16X
    GEORGIANA GLIGOR
    ✤ Geek. Mother. Do-er.
    ✤ on LAMP/LEMP stack since 2003
    ✤ Architecture / DevOps consultant
    ✤ RomaniaPHP Organizer
    ✤ PhD Student
    @gbtekkie
    [email protected]

  3. AGENDA
    what is high availability (HA)?
    the need for high availability
    silos: a possible approach
    advantages and disadvantages

  4. View Slide

  5. https://youtu.be/MQm5BnhTBEQ

  6. Software industry is built around anticipating change.

  7. anticipate vs accommodate

  8. TYPICAL APPLICATION

  9. @gbtekkie SCaLE 16X

  10. View Slide

  11. ADJUSTING
    [Diagram: Browser, internet, Load balancer, three Frontend servers, Business Logic, master and slave databases (reads, writes)]

  12. ADJUSTING
    redundancy
    [Diagram: Browser, internet, Load balancer, three Frontend servers, Business Logic, master and slave databases (reads, writes)]

  13. ADJUSTING
    resilience
    [Diagram: Browser, internet, Load balancer, three Frontend servers, Business Logic, master and slave databases (reads, writes)]

  14. TYPICAL LAYERING

  15. APPLICATION ARCHITECTURE

  16. HIGH AVAILABILITY

  17. AVAILABILITY
    Ability to access the system:
    ✤ retrieve information
    ✤ alter information
    ✤ send new data

  18. https://flic.kr/p/dkasBz

  19. THE 9s DANCE
    Uptime      Downtime (per year)
    90.000 %    36.50 days            one nine
    99.000 %    3.65 days             two nines
    99.900 %    8.76 hrs              three nines
    99.950 %    4 hrs 23 mins
    99.990 %    52.56 mins            four nines
    99.999 %    5.26 mins             five nines
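
The downtime column in the table above follows directly from the uptime percentage. A minimal Python sketch of that arithmetic, assuming a 365-day year (which is what the slide's figures imply); the function name is made up for illustration:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Downtime allowed per year, in minutes, for a given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100 - uptime_percent) / 100 * MINUTES_PER_YEAR


# Print the same uptime levels as the slide, in minutes and in hours.
for uptime in (90.0, 99.0, 99.9, 99.95, 99.99, 99.999):
    minutes = downtime_minutes_per_year(uptime)
    print(f"{uptime:7.3f} %  {minutes:10.2f} min  ({minutes / 60:8.2f} h)")
```

Each extra nine divides the downtime budget by ten, which is why the step from four to five nines leaves only about five minutes per year.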

  20. THE 9s DANCE
    Uptime      Downtime (per year)
    90.000 %    36.50 days
    99.000 %    3.65 days
    99.900 %    8.76 hrs
    99.950 %    4 hrs 23 mins         Amazon SLA
    99.990 %    52.56 mins            four nines
    99.999 %    5.26 mins             five nines

  21. IMPACT
    $ 40 / sec * 3600 = $ 144,000 / hour
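
The IMPACT figure can be combined with the uptime levels from THE 9s DANCE to estimate what a given availability target costs per year. This is only a sketch: it assumes revenue is lost uniformly at the slide's $40 per second while the system is down, and the function names are hypothetical.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365 * 24 * SECONDS_PER_HOUR  # 365-day year, as in the uptime table


def downtime_cost(seconds_down: float, revenue_per_second: float) -> float:
    """Revenue lost while the system is unavailable for `seconds_down` seconds."""
    return seconds_down * revenue_per_second


def yearly_cost_at_uptime(uptime_percent: float, revenue_per_second: float) -> float:
    """Revenue lost per year if all downtime allowed by `uptime_percent` is used."""
    seconds_down = (100 - uptime_percent) / 100 * SECONDS_PER_YEAR
    return downtime_cost(seconds_down, revenue_per_second)


# The slide's case: one hour of downtime at $40 per second.
print(downtime_cost(SECONDS_PER_HOUR, 40))
```

Under the same assumption, the roughly 4 hrs 23 mins of yearly downtime allowed at 99.95 % uptime would cost about $630,000, while five nines would cost under $13,000.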

  22. USER BEHAVIOUR
                                 amazon       facebook     youtube
    Alexa Rank                   6            3            2
    daily time on site           12:07 mins   19:27 mins   23:44 mins
    daily pageviews / visitor    11.83        9.38         12.84
    bounce rate                  21 %         29 %         33 %

  23. HIGH AVAILABILITY TRIANGLE
    cost
    complexity
    risk

  24. DOWNTIME
    scheduled
    ‣ you
    unscheduled
    ‣ you
    ‣ others

  25. HAPPENS TO THE BEST

  26. MICHAEL JACKSON

  27. H.A. SYSTEM CHARACTERISTICS

  28. NO SINGLE POINT OF FAILURE
    https://flic.kr/p/quMmFw

  29. RELIABLE CROSSOVER
    https://flic.kr/p/RLKw8z

  30. DETECT FAILURES AS THEY OCCUR
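
The slide does not say how failures are detected. One common approach, shown here purely as an assumption, is to poll a health check and mark a target unhealthy after a number of consecutive failed checks, so that a single slow response does not trigger a crossover. The class name and threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureDetector:
    """Tracks consecutive failed health checks for one target (e.g. a server)."""

    threshold: int = 3             # failed checks in a row before declaring failure
    consecutive_failures: int = 0

    @property
    def is_healthy(self) -> bool:
        return self.consecutive_failures < self.threshold

    def record(self, check_passed: bool) -> bool:
        """Record one health-check result and return the current health status."""
        if check_passed:
            self.consecutive_failures = 0   # any success resets the counter
        else:
            self.consecutive_failures += 1
        return self.is_healthy


detector = FailureDetector(threshold=2)
statuses = [detector.record(ok) for ok in (True, False, False, True)]
print(statuses)  # [True, True, False, True]
```

The threshold trades detection speed against false alarms: a lower value reacts faster but is more easily fooled by a transient glitch.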

  31. HA BEST PRACTICES
    1. no single points of failure
    2. stateless application design
    3. automate infrastructure for consistency & reliability
    4. clever monitoring and alerting
    5. geographically distribute your machines
    6. keep spare capacity to meet increasing demand

  32. A man’s got to know his limitations.
    - Dirty Harry

  33. SILOS

  34. TRY UPGRADE TO PHP7

  35. WHAT IS A SILO?
    1 silo = full setup of servers that deliver the end-to-end functionality
    ✤ frontend (SPAs, PWAs, etc)
    ✤ backend (e.g. PHP services)
    ✤ data (including cache)
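
The definition above can be modelled as a small data structure: each silo carries its own frontend, backend and data servers, and a router only needs to choose a whole silo. The sketch below is an illustration, not the setup from the talk; the host names, the `healthy` flag and the round-robin choice are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Silo:
    """One full, independent setup that serves requests end to end."""

    name: str
    frontend: List[str]   # e.g. hosts serving the SPA / PWA
    backend: List[str]    # e.g. hosts running PHP services
    data: List[str]       # e.g. database and cache hosts
    healthy: bool = True


def pick_silo(silos: List[Silo], request_id: int) -> Silo:
    """Spread requests round-robin over healthy silos, skipping failed ones."""
    candidates = [silo for silo in silos if silo.healthy]
    if not candidates:
        raise RuntimeError("no healthy silo available")
    return candidates[request_id % len(candidates)]


silos = [
    Silo("silo-a", ["fe-a1"], ["be-a1"], ["db-a1", "cache-a1"]),
    Silo("silo-b", ["fe-b1"], ["be-b1"], ["db-b1", "cache-b1"], healthy=False),
    Silo("silo-c", ["fe-c1"], ["be-c1"], ["db-c1", "cache-c1"]),
]
print([pick_silo(silos, i).name for i in range(4)])
```

Because every silo is complete on its own, taking `silo-b` out of rotation does not leave any other silo without a backend or data layer.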

  36. WHAT IS A SILO?

  37. SILO-BASED ARCHITECTURE

  38. MULTIPLE CACHES

  39. A/B TESTING
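
The slide gives only the title, so the following is one possible way to run an A/B test across silos, stated as an assumption: hash each user ID into a bucket and map buckets to silos by weight, so that a user always lands on the same variant. The silo names and weights are hypothetical.

```python
import hashlib
from typing import Dict


def assign_silo(user_id: str, weights: Dict[str, int]) -> str:
    """Deterministically map a user to a silo, proportionally to integer weights."""
    if any(weight < 0 for weight in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # A stable hash keeps the assignment sticky across requests and processes.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    cumulative = 0
    for silo_name, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return silo_name
    raise AssertionError("unreachable: bucket is always below the total weight")


# Send roughly 10 % of users to the silo running the experimental variant.
weights = {"silo-control": 90, "silo-variant": 10}
print(assign_silo("user-42", weights))
```

Since a silo is a complete stack, the variant can differ in frontend, backend and data layers at once, which is harder to achieve when only a single layer is split.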

  40. GEOGRAPHICAL DISTRIBUTION

  41. LIVE UPGRADES
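
A live upgrade with silos could proceed one silo at a time: drain it, upgrade it, then return it to service while the remaining silos keep serving traffic. The slide does not spell out the procedure, so the steps, names and version strings below are assumptions for illustration only.

```python
from typing import Dict, List


def rolling_upgrade(silos: Dict[str, str], new_version: str,
                    min_in_service: int = 1) -> List[str]:
    """Upgrade silos one by one, keeping at least `min_in_service` serving traffic.

    `silos` maps a silo name to its current version and is updated in place.
    Returns the ordered list of actions taken.
    """
    if len(silos) - 1 < min_in_service:
        raise ValueError("not enough silos to keep serving traffic during the upgrade")
    in_service = set(silos)
    log: List[str] = []
    for name in sorted(silos):
        in_service.discard(name)          # stop routing new traffic to this silo
        log.append(f"drain {name}")
        assert len(in_service) >= min_in_service
        silos[name] = new_version         # the actual deployment would happen here
        log.append(f"upgrade {name} to {new_version}")
        in_service.add(name)              # put the silo back behind the router
        log.append(f"restore {name}")
    return log


silos = {"silo-a": "php5.6", "silo-b": "php5.6", "silo-c": "php5.6"}
for action in rolling_upgrade(silos, "php7"):
    print(action)
```

In a real rollout the restore step would typically be gated on health checks passing for the upgraded silo, and a failed check would halt the loop before the next silo is drained.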

  42. ADVANTAGES
    ✤ reuse familiar technology
    ✤ real A/B testing
    ✤ no BHUF requirements
    ✤ no disruption => brand loyalty
    ✤ lower Total Cost of Ownership
    ✤ simplify scalability

  43. DISADVANTAGES
    ✤ needs razor-sharp DevOps team
    ✤ small increase in hardware costs on kick-off
    ✤ adds complexity to the monitoring layer
    ✤ reconsider traceability
    ✤ different bug reproducing and hunting

  44. TAKEAWAYS

  45. TAKEAWAYS
    ✤ build situational awareness with clever monitoring
    ✤ automate outage detection
    ✤ powerful A/B testing

  46. FURTHER READING
    ✤ Wikipedia HA page
    ✤ OpenStack’s HA concepts
    ✤ Merge Hemo report from FDA
    ✤ USA Presidential Policy Directive 21
    ✤ “Beyond Legacy Code” book
    ✤ TechCrunch’s summary of sites affected by Michael Jackson’s death
    ✤ Netflix lessons learned after AWS outage
    ✤ Netflix Chaos Monkey source code
    ✤ Brian Adler’s talk on “Architecting for High Availability and Multi-Cloud”

  47. Questions?
    Efficient architecture.
    Performance oriented.
    AI enhanced. [email protected]