Automation @ Hitta.se and why it happened

A 10-minute lightning talk given at the first Stockholm DevOps meetup on why and how we wound up with a certain level of infrastructure automation at Hitta.se.

Approx 30 minutes.

Mårten Gustafson

June 01, 2011

Transcript

  1. AUTOMATION @ HITTA.SE and why it happened Thursday, June 2, 2011
  2. HISTORY
  3. INFRA 1.0: web (windows); unique visitors / week. We started small.
  4. INFRA 1.5: web (windows); cache + api (windows); unique visitors / week. And grew ad-hoc.
  5. INFRA 2.0 (BETA): static + api (linux); web (windows); cache + api (windows); unique visitors / week. For various reasons, Linux got introduced in parts of the solution.
  6. INFRA 2.0 (REFINED): static + api (linux); static (linux); api (linux). Those Linux nodes were then split out into two main node groups.
  7. INFRA 2.0: api (linux); static (linux); web (windows); web (linux); unique visitors / week. The much-simplified, high-level overview of the current solution.
  8. STATUS System load? System latency? Redundancy?
  13. PROBLEM? web (windows); api (linux); static (linux); web (linux); 3 x static, 4 x web, 4 x web, 4 x API
  14. PROBLEM? web (windows); api (linux); static (linux); web (linux). 4 + 4 + 3 = 11: • 11 nginx • 11 tomcat • 11 sshd • 11 /etc/sudoers • 11 foo... • 11 bar... All manually configured (by us, or per ticket by the provider).
  15. THERE MUST BE A BETTER WAY!
  16. AD-HOC VS DEFINED
  17. MANUAL VS AUTOMATIC
  18. PREDICTABILITY PLEASE!
  19. WE WANT
  20. WE WANT: Reproducible infrastructure
  21. WE WANT: Reproducible infrastructure; Disaster recovery
  22. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ)
  23. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ); Define once
  24. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ); We use X in way Y; Define once
  25. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ); Apply everywhere (it should be applied); We use X in way Y; Define once
  26. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ); Apply everywhere (it should be applied); We use X in way Y; Metadata; Define once
  27. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ); Apply everywhere (it should be applied); We use X in way Y; Metadata; How many X do we have?; Define once
  28. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ); Apply everywhere (it should be applied); We use X in way Y; Metadata; How many X do we have?; How many Z of version Y are we running?; Define once
  29. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ); Apply everywhere (it should be applied); We use X in way Y; Metadata; Query your infrastructure!; How many X do we have?; How many Z of version Y are we running?; Define once
  30. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ); Apply everywhere (it should be applied); We use X in way Y; Metadata; Query your infrastructure!; How many X do we have?; How many Z of version Y are we running?; Define once; Consistency
  31. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ); Apply everywhere (it should be applied); We use X in way Y; Metadata; Query your infrastructure!; How many X do we have?; How many Z of version Y are we running?; Define once; Consistency; Predictability
  32. WE WANT: Reproducible infrastructure; Disaster recovery; Free expansion (3 or 100 won’t differ); Apply everywhere (it should be applied); We use X in way Y; Metadata; Query your infrastructure!; How many X do we have?; How many Z of version Y are we running?; Define once; Consistency; Predictability; Confidence
  33. EXPERIENCE AUTOMATION: Push-button deployment of infra + solution = complete stack
  34. TOOLING http://en.wikipedia.org/wiki/Comparison_of_open_source_configuration_management_software cfengine, then puppet, then chef
  35. Right from the start, it’s pretty obvious what’s being used, as far as my network and “research” count. The long-timers started with cfengine, and most tend to be on puppet, with some on
  36. At the time of selecting tech, we had neither the manpower nor the time to learn and introduce any tool ourselves. And we were unable to find anyone who could assist us with
  37. MÅRTEN STATUS: Stockholm, I was disappoint.
  38. On the other hand, we did find quite a few people who had experience with Puppet.
  39. SETUP: Puppet Agent Search; Puppet Agent Web; Puppet Agent Development: subversion, jenkins ci, puppet master. Conceptual setup.
  42. SETUP: Puppet Agent Search: users, sudo, sshd, search; Puppet Agent Web: jvm, nginx, play!, users, sudo, sshd; Puppet Agent Development: subversion, jenkins ci, puppet master. Conceptual setup.
  44. 4 1/2 day; could have been done in 2-3 days; we insisted on 5 days. It need not be expensive.
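
The "Define once", "Apply everywhere" and "Query your infrastructure!" points, together with the conceptual setup slide, could be sketched roughly as below. This is a minimal Python illustration, not Puppet code and not the actual Hitta.se configuration. The module names (users, sudo, sshd, search, jvm, nginx, play!) come from the setup slide. The grouping into a shared baseline, the hostnames, the node counts and all function names are assumptions made up for the example.

```python
from collections import Counter

# Baseline modules shared by every node, defined once.
# (Module names are from the setup slide; treating them as a shared
# baseline is an assumption.)
BASE = ["users", "sudo", "sshd"]

# Role -> list of modules. The grouping is read off the setup slide.
ROLES = {
    "search": BASE + ["search"],
    "web": BASE + ["jvm", "nginx", "play!"],
}

# Hypothetical inventory: hostnames and counts are invented for illustration.
NODES = {
    "search01": "search",
    "web01": "web",
    "web02": "web",
}


def modules_for(node):
    """Return the modules applied to a node, derived from its role."""
    return ROLES[NODES[node]]


def count_nodes_with(module):
    """Answer 'How many X do we have?' for a given module."""
    return sum(1 for node in NODES if module in modules_for(node))


def module_census():
    """Count, per module, how many nodes it is applied to."""
    return Counter(m for node in NODES for m in modules_for(node))


if __name__ == "__main__":
    print(count_nodes_with("nginx"))  # 2 (the two hypothetical web nodes)
    print(count_nodes_with("sshd"))   # 3 (every node gets the baseline)
```

Because each node's modules are derived from its role, adding a fourth or a hundredth web node is one more inventory entry rather than another round of manual configuration, and the same data answers the inventory questions from the "WE WANT" slides.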