How the SRE Team Came to Be, and What Comes Next (SREチームができるまでと、これから)

Presentation for
#shinjukugl 新宿 Geek Lounge#7 SRE Meetup (2019/3/12)
https://shinjuku-geek-lounge.connpass.com/event/120693/

Takuya Nishigori

March 12, 2019

Transcript

  1. How the SRE team came to be, and what comes next. VOYAGE GROUP, Inc. @_nishigori. SRE LifeCycle in my team - @_nishigori #shinjukugl
  2. Who are you? • Twitter @_nishigori • fluct SREs Manager • I love Makefile
  3. What I did last year: "PHP That Supports the Business" (事業を支えるPHP), a series written by VOYAGE GROUP engineers for WEB+DB PRESS, published by Gijutsu-Hyoron • Vol.104: Continuous PHP version upgrades • Vol.107: Do you really know php.ini? (I also work with plenty of languages other than PHP)
  4. VOYAGE GROUP's businesses (as of 2018) https://voyagegroup.com/business/
  5. (no text on slide)
  6. Agenda • What is SRE? • SRE Book • Case: fluct • class SRE implements DevOps • Division is not a firewall
  7. What is SRE?
  8. (no text on slide)
  9. Read it!! (to put it bluntly)
  10. Case: fluct SSP
  11. (no text on slide)
  12. Really complicated ><
  13. In all of this, which part do we want to protect?
  14. SLO (service level objective): eliminate cases where ads cannot be seen because of system-side problems in the ad delivery path
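  15. SLO (service level objective). Not included in the SLO: • the publisher site's own JavaScript breaking • NoAd (has many, many reasons). Ads can fail to show for a wide variety of reasons. We do take countermeasures for these cases, but they are deliberately kept out of the SLO. (With hundreds or thousands of publishers, they come up fairly often in any given month.)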
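  16. SLO (service level objective). In the diagram on the right: • requests that directly involve the delivery servers -> never stop responding • conversely, anything outside that path may occasionally go down, sorry • admin console, etc ... • (we are working on it, though)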
  17. SLI (service level indicator) • error rate • latency (respond within XX ms)
  18. "Don't guess, measure." (Rob Pike, Notes on Programming in C) Monitoring matters. When an SLI is at risk, you cannot tell what the cause is unless everything else is visualized too, e.g. machine resource usage per process. Figure on the right: from sre-book Simplicity Part III - Practices
  19. The fluct SRE team is a group that exists to keep the SLO / SLA maintained
  20. SRE book 5.3, "What Qualifies as Engineering": I mapped what the fluct SRE team does onto it and pulled out some excerpts
  21. Software engineering: we do write a fair amount of code • terraform • consul and related anything • puppet manifests • building bots • etc...
  22. Systems engineering • building and operating a fast build system (Makefile) • improving deploy speed • fiddling with Grafana monitoring • OS upgrades and kernel parameter tweaking • Container Orchestration
  23. Toil (let's wipe it out) • adding target tables to redash • clicking through EC2 Instance draining on ECS. Overhead: "An◦-sensei, we want more people..."
  24. Our kanban is pretty loose
  25. After a few years of working like this: • monitoring increased a lot • production meetings became more frequent (SRE book, chapter 31, "Communication and Collaboration in SRE") • blue-green deployment / canary release
  26. The fluct SRE team is not "dedicated full-time" but "specialized"
  27. (no text on slide)
  28. Division is not a firewall
  29. class SRE implements DevOps: DevOps as philosophy / culture; SRE is prescriptive (a norm) ~ SRE vs. DevOps: competing standards or close friends? ~
  30. class SRE implements DevOps. For the fluct SRE team: • developers are not customers (they are teammates) • a word we dislike: "request" • words we like: "consultation", (leading to) "pair programming"
  31. Pair programming is handy!!!
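  32. Because SRE is only a specialty, frankly anyone may do the work. Even so, there are plenty of areas where the SRE team is strong: • AWS knowledge • OS-specific problems • CI/CD thinking. Working on these through pair programming turns out to be a great way to learn.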
  33. A problem we often run into when pairing: No permissions! No permissions!! No permissions!!!
  34. Appropriate delegation of authority: "You are administrator". Our style is to actively hand out IAM Users that can do anything on AWS.
  35. It is not "hand over admin and done", which is why we pair program.
  36. Conversely, there are areas where we want developers to stay focused on their own work, so for things like networking we try to stay a step ahead: • AWS Transit Gateway • VPN Connections • EL8 support • etc ...
  37. There are no walls (and if there are, let's tear them down)
  38. "What I more or less wanted to convey": if you want to introduce SRE, first decide your SLO / SLI; everything else comes after that. (Not covered this time) incident handling and so on. Recommended reading: a level-by-level checklist for evaluating an SRE team
  39. "What I more or less wanted to convey": if you find yourself wanting to create an SRE team, also ask "does it need to be a team?" Departments and teams are not walls; spare no mutual effort to keep it that way.
  40. [nits] Q. If we started a new service, would we create an SRE team?
  41. [nits] Q. If we started a new service, would we create an SRE team? A. No
  42. [nits] Q. If we started a new service, would we create an SRE team? A. No • I think it would not be at that scale yet • I probably would not create a team
  43. [nits] Q. If we started a new service, would we create an SRE team? If we did, I would decide the SLO / SLA and aim for a group of Full Cycle Developers, I think • Full Cycle Developers at Netflix: Operate What You Build • the Japanese translation of the same article
  44. Other things I wanted to talk about • Release Cycle • Reliability vs. Speed • the curse of Infrastructure as Code • handling site reliability. I would be glad to talk about them somewhere.
  45. thx!
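
Slide 17 names two SLIs, error rate and latency within a threshold, and slide 14 ties them to an SLO. A minimal Python sketch of how such SLIs might be computed from request records follows. The `Request` type, the rule that a 5xx status counts as an error, the sample data, and every threshold value are assumptions made for illustration; the talk does not give them (the latency target on the slide is left as XX ms).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One delivery request (hypothetical record shape, not from the talk)."""
    status: int        # HTTP status code returned to the client
    latency_ms: float  # time taken to respond, in milliseconds


def error_rate(requests):
    """SLI: fraction of requests that ended in a server-side (5xx) error."""
    if not requests:
        return 0.0
    errors = sum(1 for r in requests if r.status >= 500)
    return errors / len(requests)


def within_latency(requests, threshold_ms):
    """SLI: fraction of requests answered within threshold_ms."""
    if not requests:
        return 1.0
    fast = sum(1 for r in requests if r.latency_ms <= threshold_ms)
    return fast / len(requests)


def meets_slo(requests, max_error_rate, threshold_ms, min_fast_ratio):
    """True when both SLIs are inside their (assumed) SLO targets."""
    return (error_rate(requests) <= max_error_rate
            and within_latency(requests, threshold_ms) >= min_fast_ratio)


# Made-up sample: one 503 and one slow response out of four requests.
sample = [
    Request(200, 12.0),
    Request(200, 48.0),
    Request(503, 5.0),
    Request(200, 130.0),
]

if __name__ == "__main__":
    print("error rate:", error_rate(sample))
    print("within 50 ms:", within_latency(sample, 50.0))
    print("meets SLO:", meets_slo(sample, 0.01, 50.0, 0.9))
```

In practice these numbers would more likely come from the monitoring stack than from raw records; the sketch only shows that an SLI is a ratio measured over requests and that an SLO is a target placed on that ratio.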