[ZendCon 2017] Silo-Based Architectures for High Availability Applications

Georgiana Gligor

October 25, 2017

Transcript

  1. SILO-BASED ARCHITECTURES FOR HIGH AVAILABILITY APPLICATIONS. Georgiana Gligor / Tekkie Consulting / @gbtekkie
  2. GEORGIANA GLIGOR
    ✤ Geek. Mother. Do-er.
    ✤ on LAMP stack since 2003
    ✤ Architecture / DevOps consultant
    ✤ RomaniaPHP Organizer
    ✤ PhD Student
    @gbtekkie gb@tekkie.ro
  3. AGENDA
    ✤ what is high availability (HA)?
    ✤ the need for high availability
    ✤ silos: a possible approach
    ✤ advantages and disadvantages
  4. None
  5. https://youtu.be/MQm5BnhTBEQ

  6. Software industry is built around anticipating change.

  7. anticipate vs accommodate

  8. TYPICAL APPLICATION

  9. None

  10. None
  11. ADJUSTING [diagram: Browser, internet, Load balancer, three Frontend nodes, Business Logic, master and slave databases, reads and writes]

  12. ADJUSTING: redundancy [same diagram]

  13. ADJUSTING: resilience [same diagram]
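
The effect of the redundancy step can be estimated with a back-of-the-envelope calculation. The sketch below assumes that nodes fail independently and that each node is available 99 % of the time; both are illustrative assumptions, not figures from the talk, and the function names are hypothetical.

```python
from math import prod

def parallel(availability: float, copies: int) -> float:
    """Availability of `copies` redundant nodes where any one is enough.

    Assumes independent failures, which real deployments only approximate.
    """
    return 1 - (1 - availability) ** copies

def serial(availabilities: list[float]) -> float:
    """Availability of a chain where every layer must be up."""
    return prod(availabilities)

# Hypothetical 99 % per node: one frontend vs. three behind the load balancer.
single_frontend = serial([0.99, 0.99, 0.99])              # LB, frontend, DB
three_frontends = serial([0.99, parallel(0.99, 3), 0.99])

print(f"single frontend: {single_frontend:.4%}")
print(f"three frontends: {three_frontends:.4%}")
```

Under these assumptions the frontend layer stops being the weak link, while the single load balancer and database still cap the overall figure.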
  14. TYPICAL LAYERING

  15. APPLICATION ARCHITECTURE

  16. HIGH AVAILABILITY

  17. AVAILABILITY
    Ability to access the system:
    ✤ retrieve information
    ✤ alter information
    ✤ send new data
  18. https://flic.kr/p/dkasBz

  19. THE 9s DANCE
      Uptime      Downtime (per year)
      90.000 %    36.50 days           one nine
      99.000 %    3.65 days            two nines
      99.900 %    8.76 hrs             three nines
      99.950 %    4 hrs 23 mins
      99.990 %    52.56 mins           four nines
      99.999 %    5.26 mins            five nines
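
The downtime column follows directly from the uptime percentage. A minimal sketch of the arithmetic, assuming a 365-day year (the function names are illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, assuming a non-leap year

def downtime_hours_per_year(uptime_percent: float) -> float:
    """Hours of downtime per year implied by an uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR

def describe(hours: float) -> str:
    """Format a duration in the unit the slide uses for that magnitude."""
    if hours >= 24:
        return f"{hours / 24:.2f} days"
    if hours >= 1:
        return f"{hours:.2f} hrs"
    return f"{hours * 60:.2f} mins"

for uptime in (90.0, 99.0, 99.9, 99.95, 99.99, 99.999):
    print(f"{uptime:7.3f} %  ->  {describe(downtime_hours_per_year(uptime))}")
```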
  20. THE 9s DANCE
      Uptime      Downtime (per year)
      90.000 %    36.50 days
      99.000 %    3.65 days
      99.900 %    8.76 hrs
      99.950 %    4 hrs 23 mins        Amazon SLA
      99.990 %    52.56 mins           four nines
      99.999 %    5.26 mins            five nines
  21. IMPACT: $ 40 / sec * 3600 = $ 144,000 / hour
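
The hourly figure can be combined with an uptime target to estimate what a year of downtime costs. A rough sketch, using the $ 40 / sec from the slide and assuming revenue is lost uniformly during an outage (a simplification; the function names are illustrative):

```python
REVENUE_PER_SECOND = 40    # USD, the figure used on the slide
SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365 * 24 * SECONDS_PER_HOUR

def loss_per_hour(revenue_per_second: float) -> float:
    """Revenue lost during one hour of downtime."""
    return revenue_per_second * SECONDS_PER_HOUR

def loss_per_year(uptime_percent: float, revenue_per_second: float) -> float:
    """Revenue lost per year if all permitted downtime is actually used."""
    downtime_seconds = (1 - uptime_percent / 100) * SECONDS_PER_YEAR
    return downtime_seconds * revenue_per_second

print(f"per hour: $ {loss_per_hour(REVENUE_PER_SECOND):,.0f}")
for uptime in (99.9, 99.95, 99.99):
    yearly = loss_per_year(uptime, REVENUE_PER_SECOND)
    print(f"{uptime} % uptime: about $ {yearly:,.0f} per year")
```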
  22. USER BEHAVIOUR
                                  amazon       facebook     youtube
      Alexa Rank                  6            3            2
      daily time on site          12:07 mins   19:27 mins   23:44 mins
      daily pageviews / visitor   11.83        9.38         12.84
      bounce rate                 21 %         29 %         33 %
  23. HIGH AVAILABILITY TRIANGLE: cost, complexity, risk

  24. DOWNTIME
    ✤ scheduled
      ‣ you
    ✤ unscheduled
      ‣ you
      ‣ others

  25. HAPPENS TO THE BEST

  26. MICHAEL JACKSON

  27. H.A. SYSTEM CHARACTERISTICS

  28. https://flic.kr/p/quMmFw NO SINGLE POINT OF FAILURE

  29. https://en.wikipedia.org/wiki/Point_of_sale#/media/File:Cash_Registers.JPG RELIABLE CROSSOVER

  30. https://flic.kr/p/4S4uDz DETECT FAILURES AS THEY OCCUR
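
One common way to detect failures as they occur is a periodic health probe that marks a node unhealthy after several consecutive failures. The sketch below is an illustration, not something shown in the talk: the `FailureDetector` class, the `probe` callable and the threshold of 3 are all assumptions. The probe is injected so the example runs without a network.

```python
from typing import Callable

class FailureDetector:
    """Marks a node unhealthy after `threshold` consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool], threshold: int = 3):
        self.probe = probe          # hypothetical health check, True = OK
        self.threshold = threshold
        self.consecutive_failures = 0
        self.healthy = True

    def check(self) -> bool:
        """Run one probe and return the node's current health state."""
        try:
            ok = self.probe()
        except Exception:
            ok = False              # a probe that raises counts as a failure
        if ok:
            self.consecutive_failures = 0
            self.healthy = True
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.threshold:
                self.healthy = False
        return self.healthy

# Simulated probe results: one success, three failures, then recovery.
responses = iter([True, False, False, False, True])
detector = FailureDetector(lambda: next(responses))
states = [detector.check() for _ in range(5)]
print(states)  # [True, True, True, False, True]
```

Requiring several consecutive failures trades detection speed for fewer false alarms; the right threshold depends on the probe interval and on how quickly crossover must happen.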

  31. HA BEST PRACTICES
    1. no single points of failure
    2. stateless application design
    3. automate infrastructure for consistency & reliability
    4. clever monitoring and alerting
    5. geographically distribute your machines
    6. keep spare capacity to meet increasing demand
  32. A man’s got to know his limitations. - Dirty Harry
  33. SILOS

  34. TRY UPGRADE TO PHP7

  35. WHAT IS A SILO?
    ✤ frontend (SPAs, PWAs, etc)
    ✤ backend (e.g. PHP services)
    ✤ data (including cache)
    1 silo = full setup of servers that deliver the end-to-end functionality
  36. WHAT IS A SILO?

  37. SILO-BASED ARCHITECTURE

  38. MULTIPLE CACHES

  39. A/B TESTING

  40. GEOGRAPHICAL DISTRIBUTION

  41. LIVE UPGRADES
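
Because each silo delivers the full end-to-end functionality, A/B testing and live upgrades reduce to a routing decision: which silo serves a given user. The sketch below is one possible way to do that, not the speaker's implementation. The `Silo` class, `pick_silo` function, weights and names are all hypothetical.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Silo:
    name: str
    weight: int = 1       # relative share of traffic (A/B split)
    active: bool = True   # set to False to drain the silo before an upgrade

def pick_silo(user_id: str, silos: list[Silo]) -> Silo:
    """Deterministically map a user to one active silo, honouring weights."""
    candidates = [s for s in silos if s.active and s.weight > 0]
    if not candidates:
        raise RuntimeError("no active silo available")
    total = sum(s.weight for s in candidates)
    # Hash the user id so the same user keeps landing on the same silo.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    for silo in candidates:
        if bucket < silo.weight:
            return silo
        bucket -= silo.weight
    raise AssertionError("unreachable: bucket is always below total weight")

# Hypothetical 90/10 split between the current release and a candidate.
silos = [Silo("silo-a", weight=9), Silo("silo-b", weight=1)]
print(pick_silo("user-42", silos).name)

# Live upgrade: drain silo-a, upgrade it, then set it active again.
silos[0].active = False
print(pick_silo("user-42", silos).name)  # silo-b
```

This simple modulo scheme can move users between the remaining silos whenever the active set changes; a consistent-hashing variant would reduce that reshuffling, at the cost of more code.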

  42. ADVANTAGES
    ✤ reuse familiar technology
    ✤ real A/B testing
    ✤ no BHUF requirements
    ✤ no disruption => brand loyalty
    ✤ lower Total Cost of Ownership
    ✤ simplify scalability
  43. DISADVANTAGES
    ✤ needs razor-sharp DevOps team
    ✤ small increase in hardware costs on kick-off
    ✤ adds complexity to the monitoring layer
    ✤ reconsider traceability
    ✤ different bug reproducing and hunting
  44. TAKEAWAYS

  45. TAKEAWAYS
    ✤ build situational awareness with clever monitoring
    ✤ automate outage detection
    ✤ powerful A/B testing
  46. FURTHER READING
    ✤ Wikipedia HA page
    ✤ OpenStack’s HA concepts
    ✤ Merge Hemo report from FDA
    ✤ USA Presidential Policy Directive 21
    ✤ “Beyond Legacy Code” book
    ✤ TechCrunch’s summary of sites affected by Michael Jackson’s death
    ✤ Netflix lessons learned after AWS outage
    ✤ Netflix Chaos Monkey source code
    ✤ Brian Adler’s talk on “Architecting for High Availability and Multi-Cloud”
  47. QUESTIONS? Efficient architecture. Performance oriented. AI enhanced. dev@tekkie.ro