about:infrastructure

Yamashita, Yuu

August 11, 2014

Transcript

  1. Outline
      1. What’s an infra engineer?
      2. What’s a fault-tolerant system?
      3. Appendix: the current & future of fault-tolerant systems
  2. 1. What’s an infra engineer?
      A person who builds and maintains the “infrastructure”.
  3. 1-1. What’s the infrastructure?
      Keep everything up and running:
      * Design
      * Implementation
      * Maintenance
      ⇒ It’s all based on the design.
      High availability =~ Fault tolerance
  4. 2-1. Redundancy
      Allocate enough capacity, including spares.
      * N+M Redundancy
        * Have at least N units, plus M spares to cover faults.
      * Active/Passive (Active/Backup)
        * Single master with hot standby
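The N+M rule on this slide can be written down as a one-line capacity check. This is only a minimal sketch: the function name `has_enough_capacity` and its parameters are hypothetical and do not come from the deck.

```python
def has_enough_capacity(required: int, spares: int, failed: int) -> bool:
    """Return True if an N+M pool still has at least N healthy units.

    required: N, the number of units needed to carry the load
    spares:   M, the extra units provisioned to absorb faults
    failed:   how many units are currently down
    """
    healthy = (required + spares) - failed
    return healthy >= required


# A 4+2 pool tolerates two simultaneous faults, but not a third.
print(has_enough_capacity(required=4, spares=2, failed=2))
print(has_enough_capacity(required=4, spares=2, failed=3))
```

In other words, the pool survives as long as the number of failed units does not exceed M.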
  5. 2-1. Redundancy (examples)
      * N+M Redundancy:
        * LB + app servers
        * RAID5/RAID6/RAID-Z
      * Active/Passive:
        * MySQL replication
        * HSRP/VRRP/UCARP
      ⇒ There are best practices
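The Active/Passive pattern from the slides above can be illustrated with a toy failover model. It is a sketch under stated assumptions: `Node` and `ActivePassivePair` are hypothetical names, and the code does not reflect how MySQL replication or VRRP actually elect a master.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool = True


class ActivePassivePair:
    """One active node serving traffic plus one hot standby."""

    def __init__(self, active: Node, standby: Node) -> None:
        self.active = active
        self.standby = standby

    def failover_if_needed(self) -> bool:
        """Promote the standby if the active node is down.

        Returns True when the roles were swapped. Nothing happens if the
        active node is healthy or if the standby is itself unhealthy.
        """
        if self.active.healthy or not self.standby.healthy:
            return False
        self.active, self.standby = self.standby, self.active
        return True
```

After a failover the old active node becomes the standby, so it has to be repaired before the pair is redundant again.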
  6. 2-2. Homeostasis (examples)
      * Health-check & Recovery:
        * mon, monit
        * cron
        * chef
        * puppet
      ⇒ A continuous process to detect faults
      ⇒ Needs automation
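The health-check and recovery cycle on this slide can be sketched as a small function that a scheduler such as cron might call periodically. The name `check_and_recover` and its callback parameters are hypothetical, and this is not the interface of mon, monit, chef, or puppet.

```python
from typing import Callable


def check_and_recover(
    is_healthy: Callable[[], bool],
    recover: Callable[[], None],
    max_attempts: int = 3,
) -> bool:
    """Run one health-check cycle, attempting recovery on failure.

    Calls recover() at most max_attempts times and returns True if the
    target is healthy at the end of the cycle.
    """
    for _ in range(max_attempts):
        if is_healthy():
            return True
        recover()
    # Re-check once after the final recovery attempt.
    return is_healthy()
```

Bounding the number of attempts keeps a broken recovery action from looping forever; a False result is the point where a human would need to be alerted.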
  7. Conclusion
      1. Learn the best practices
        * Try AWS/GCP & Cloud Design Patterns
        * Redundancy is by design; kill all SPOFs
      2. Automate everything =~ programming
        * Cloud services must have an API
        * Design your API and automate the process
  8. The fault-tolerant system
      current:
      * Redundancy
      * Homeostasis
      future:
      * Redundancy
      * Homeostasis
      * Apoptosis (NEW!)
  9. Apoptosis: programmed suicide
      * The death of long-living processes has a great impact on the overall system.
      ⇒ If so, kill them all before they get old...
      * Netflix/SimianArmy (Chaos Monkey)
        * Terminates EC2 instances at random
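The two ideas on this slide, retiring processes before they get old and terminating instances at random, can be sketched as two selection functions. Everything here is a hypothetical illustration: `Instance`, `select_expired`, and `pick_random_victim` are made-up names, and the code neither calls the EC2 API nor reflects how Chaos Monkey is implemented.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instance:
    instance_id: str
    age_seconds: float


def select_expired(instances: List[Instance], max_age_seconds: float) -> List[Instance]:
    """Return the instances that have reached their maximum allowed age."""
    return [i for i in instances if i.age_seconds >= max_age_seconds]


def pick_random_victim(instances: List[Instance], rng: random.Random) -> Optional[Instance]:
    """Pick one instance at random to terminate, or None if the fleet is empty."""
    return rng.choice(instances) if instances else None
```

Passing the random generator in explicitly makes the random choice reproducible in tests; the actual termination step is deliberately left out of the sketch.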
  10. To use “apoptosis” in the system...
      * Make the lifecycle of processes shorter
      * Aggressive system-wide “homeostasis”
        * GoogleCloudPlatform/kubernetes
      * Decouple the app and the system
        * Use containers =~ Docker
  11. EOF