Slide 1

Slide 1 text

Implementing Site Reliability Engineering in your organization - Making Culture, Enabling DevOps, Building Platform - Takeshi Kondo / @chaspy 2021/11/16 Infra Study 2nd #7 "SRE and Organizations"

Slide 2

Slide 2 text

Who am I chaspy chaspy_ Engineering Manager Site Reliability at Recruit Co., Ltd. Takeshi Kondo

Slide 3

Slide 3 text

SRE NEXT 2020 https://sre-next.dev/schedule#c4

Slide 4

Slide 4 text

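What I will cover today / What I won't cover / Audience • Will cover • The thinking behind implementing SRE in an organization • Won't cover • Specific technologies • Worked examples of SRE practices • Audience • People who want to implement SRE in their organization but are struggling with it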

Slide 5

Slide 5 text

TL;DR • Things to keep in mind when implementing SRE in an organization • Understand narratives • Reduce cognitive load • Put everything into words, thoroughly

Slide 6

Slide 6 text

Infra Study Meetup #3 "SRE: Past and Future" https://speakerdeck.com/masayoshi/sre-culture-organization?slide=29

Slide 7

Slide 7 text

Agenda 1. What does it mean to implement SRE in an organization? 2. What we have done as SRE at Study Sapuri / Quipper 3. SRE and culture 4. SRE and DevOps 5. SRE and Platform 6. Summary and what's next

Slide 8

Slide 8 text

Agenda 1. What does it mean to implement SRE in an organization? 2. What we have done as SRE at Study Sapuri / Quipper 3. SRE and culture ← main 4. SRE and DevOps 5. SRE and Platform 6. Summary and what's next

Slide 9

Slide 9 text

Agenda 1. What does it mean to implement SRE in an organization? 2. What we have done as SRE at Study Sapuri / Quipper 3. SRE and culture 4. SRE and DevOps 5. SRE and Platform 6. Summary and what's next

Slide 10

Slide 10 text

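What does it mean to implement SRE in an organization? • The goal of SRE • Controlling the reliability of the site • Deciding whether to invest in agility or reliability, using the SLO as the indicator • Aim for a state where the teams that build their own products and services do these things as a matter of course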
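The SLO-based decision on slide 10 can be illustrated with an error-budget check. This is a minimal sketch, not the speaker's tooling: the `SloWindow` structure, the `investment_focus` helper, and the rule "spend on agility while any budget remains" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    """Request counts for one SLO window (hypothetical structure)."""

    slo_target: float  # e.g. 0.999 means 99.9% of requests should succeed
    total_requests: int
    failed_requests: int

    def error_budget(self) -> float:
        """Number of failed requests the SLO tolerates in this window."""
        return (1.0 - self.slo_target) * self.total_requests

    def budget_remaining(self) -> float:
        """Fraction of the error budget still unspent (negative if overspent)."""
        budget = self.error_budget()
        if budget == 0:
            # A 100% target (or an empty window) leaves no budget to spend.
            return 0.0 if self.failed_requests == 0 else float("-inf")
        return 1.0 - self.failed_requests / budget


def investment_focus(window: SloWindow, threshold: float = 0.0) -> str:
    """Assumed rule: invest in agility while budget remains, else reliability."""
    return "agility" if window.budget_remaining() > threshold else "reliability"


if __name__ == "__main__":
    window = SloWindow(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
    print(investment_focus(window))
```

Raising `threshold` makes the rule more conservative: a team could switch to reliability work while some budget is still left.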

Slide 11

Slide 11 text

Organizations are systems too https://speakerdeck.com/masayoshi/sre-culture-organization?slide=29

Slide 12

Slide 12 text

Organizations are systems too 💡

Slide 13

Slide 13 text

Organizations are systems too 🤔

Slide 14

Slide 14 text

Organizations are people 🙆

Slide 15

Slide 15 text

Organizations don't talk, you know https://note.com/qsona/n/ncb9e1f242fb4

Slide 16

Slide 16 text

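What does it mean to implement SRE in an organization? • The goal of SRE • Controlling the reliability of the site • Deciding whether to invest in agility or reliability, using the SLO as the indicator • Aim for a state where the teams that build their own products and services do these things as a matter of course. It means getting the organization, the teams, and the people to carry something out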

Slide 17

Slide 17 text

Agenda 1. What does it mean to implement SRE in an organization? 2. What we have done as SRE at Study Sapuri / Quipper 3. SRE and culture 4. SRE and DevOps 5. SRE and Platform 6. Summary and what's next

Slide 18

Slide 18 text

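Vision / Mission / Values of the Study Sapuri K12 SRE Team • Vision • Realize a development organization that can keep building the best learning products • Mission • Build the platform and culture that let self-contained teams keep delivering products quickly and safely • Values • Fail smart / Learning / Borderless / Metrics-driven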

Slide 19

Slide 19 text

Vision / Mission / Values of the Study Sapuri K12 SRE Team https://blog.studysapuri.jp/entry/sre-vision-mission-values

Slide 20

Slide 20 text

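Organization size over time (chart series: Developers, SRE). "Developers" is the number of Web Engineers (frontend & backend) at both Study Sapuri and Quipper. Native engineers are excluded.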

Slide 21

Slide 21 text

Timeline at 2020-01-11 (SRE NEXT 2020)

Slide 22

Slide 22 text

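The situation as of January 2020 • We were moving the Platform onto Kubernetes and aiming to become microservices-ready • We put processes in place so that the organization and the systems could scale • Defining service ownership • Design Doc • Production Readiness Checklist • Self-service infrastructure (Terraform). Blog post: "The world we want to realize by adopting Kubernetes, and the microservices beyond it" https://blog.studysapuri.jp/entry/future-with-kubernetes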

Slide 23

Slide 23 text

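A year and a half later, as of October 2021 • Aiming to be "self-contained" • Each service team writes Design Docs as a matter of course, thinks through and defines SLIs/SLOs, and reviews them regularly • It should be fair to say this has taken root as culture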

Slide 24

Slide 24 text

Culture 🤔

Slide 25

Slide 25 text

Agenda 1. What does it mean to implement SRE in an organization? 2. What we have done as SRE at Study Sapuri / Quipper 3. SRE and culture ← main 4. SRE and DevOps 5. SRE and Platform 6. Summary and what's next

Slide 26

Slide 26 text

What is culture? The totality of the many behaviors that humans acquire as members of society (from Wikipedia)

Slide 27

Slide 27 text

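What does implementing SRE in an organization mean? • Implementing SRE in an organization ▶ Building an SRE culture in the development organization • A state in which developers, as members of the development organization, carry out practices such as defining SLIs/SLOs and naturally behave in ways that control reliability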

Slide 28

Slide 28 text

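How do you build a culture? • Get the development team to accept a foreign culture • Organizations are made up of people, so understand human nature • People feel uneasy about what they don't know • People don't want to do difficult things • People are bad at change. That is where they differ from systems.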

Slide 29

Slide 29 text

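What does it take for a foreign culture to be accepted? • People know what the thing (SRE culture) is • They understand the rationale and benefits, and feel an incentive • Practicing it is as easy as it can possibly be. It is hard to achieve this just by "explaining".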

Slide 30

Slide 30 text

Three points for building culture • Understand narratives • Reduce cognitive load • Put everything into words, thoroughly

Slide 31

Slide 31 text

Three points for building culture • Understand narratives • Reduce cognitive load • Put everything into words, thoroughly

Slide 32

Slide 32 text

Understand narratives https://publishing.newspicks.com/books/9784910063010 https://twitter.com/chaspy_/status/1223088950387982337?s=20 https://twitter.com/chaspy_/status/1403587911421894657?s=20

Slide 33

Slide 33 text

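In praise of ad-hoc 1on1s: cooperate across section boundaries https://speakerdeck.com/chaspy/how-we-overcame-the-covid-19-crisis?slide=56 • Web Developer: I learned about the operational burden while team size is changing, and how they see the current situation • Business Developer / Planner: we talked about "Do you even know what SRE is?", and shared learner (user) KPIs and reliability indicators. By the way, when I set one of these up, I write an agenda in a Google Doc and share it in advance. Otherwise people would be startled.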

Slide 34

Slide 34 text

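Understand narratives • Get to know others through dialogue, and recognize where mutual understanding falls short • While collaborating with people in different positions, think through how to realize the world we are aiming for • It gets the other side to understand what we are trying to do in the first place • It prompts us to think about how to explain the incentives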

Slide 35

Slide 35 text

Three points for building culture • Understand narratives • Reduce cognitive load • Put everything into words, thoroughly

Slide 36

Slide 36 text

Cognitive load: In cognitive psychology, cognitive load refers to the used amount of working memory resources (from Wikipedia)

Slide 37

Slide 37 text

Reduce cognitive load. Blog post: "I gave a talk titled 'SLO Review' at SRE NEXT 2020 #srenext" https://blog.studysapuri.jp/entry/2020/01/30/slo-review

Slide 38

Slide 38 text

Reduce cognitive load • Documentation and the paths that lead to it • Guides and automation • Simplicity

Slide 39

Slide 39 text

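Reduce cognitive load • Along with the deploy-completed notification, we post a link to the Preview environment (an environment generated for each PR) and a link to the Argo CD UI • When a Kubernetes manifest is changed, we post the diff of the output after running kustomize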
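The notification described on slide 39 could be assembled roughly as follows. This is a sketch under stated assumptions: the function names, the message layout, and the `example.com` URLs are hypothetical, and the two input strings stand in for rendered manifest output before and after a change. It uses only Python's standard `difflib`.

```python
import difflib


def rendered_manifest_diff(before: str, after: str) -> str:
    """Return a unified diff between two rendered manifests ("" if identical)."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="before",
        tofile="after",
        lineterm="",
    )
    return "\n".join(lines)


def build_notification(preview_url: str, argocd_url: str, diff: str) -> str:
    """Compose one message with both links and, if present, the manifest diff."""
    parts = [
        "Deploy completed",
        f"Preview environment: {preview_url}",
        f"Argo CD UI: {argocd_url}",
    ]
    if diff:
        parts += ["Rendered manifest diff:", diff]
    return "\n".join(parts)


if __name__ == "__main__":
    old = "kind: Deployment\nspec:\n  replicas: 1\n"
    new = "kind: Deployment\nspec:\n  replicas: 2\n"
    print(
        build_notification(
            "https://preview.example.com/pr-123",  # hypothetical URL
            "https://argocd.example.com/applications/demo",  # hypothetical URL
            rendered_manifest_diff(old, new),
        )
    )
```

Putting the links and the diff in one message means a developer does not have to remember where each tool lives, which is the cognitive-load point the slide makes.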

Slide 40

Slide 40 text

Reduce cognitive load • Manifest linting with Conftest • Example finding: Node affinity is not specified • The message points to documentation that explains how to fix it
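The idea on slide 40, a lint finding that carries a link to the fix, can be sketched in Python. Note that Conftest policies are written in Rego, so this is an analogy rather than the team's actual policy; `DOC_URL` and `lint_node_affinity` are hypothetical names, and the check covers only Deployments.

```python
from typing import Any

# Hypothetical documentation page that explains how to set node affinity.
DOC_URL = "https://example.com/docs/node-affinity"


def lint_node_affinity(manifest: dict[str, Any]) -> list[str]:
    """Return lint messages for a Deployment manifest that lacks nodeAffinity."""
    if manifest.get("kind") != "Deployment":
        return []
    # In a Deployment, the Pod spec lives at spec.template.spec.
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    affinity = pod_spec.get("affinity") or {}
    if "nodeAffinity" in affinity:
        return []
    name = manifest.get("metadata", {}).get("name", "<unnamed>")
    return [
        f"Deployment {name}: nodeAffinity is not set. "
        f"See {DOC_URL} for how to fix it."
    ]


if __name__ == "__main__":
    deployment = {
        "kind": "Deployment",
        "metadata": {"name": "web"},
        "spec": {"template": {"spec": {"containers": []}}},
    }
    for message in lint_node_affinity(deployment):
        print(message)
```

The design choice worth copying is in the message itself: it names the resource, states what is missing, and links to the remedy, so the reader never has to search for the rule's rationale.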

Slide 41

Slide 41 text

Terraform Platform in Quipper. Blog post: "I spoke about our product's Terraform Platform at HashiTalks Japan 2021" https://blog.studysapuri.jp/entry/2021/10/13/080000

Slide 42

Slide 42 text

Three points for building culture • Understand narratives • Reduce cognitive load • Put everything into words, thoroughly

Slide 43

Slide 43 text

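Put everything into words, thoroughly • Put the Background, Problem, Why, What, and How into words • Make it easy to get feedback • What is written down gets shared on its own. Words are powerful. All we can do is keep repeating the cycle: put it into words, make a decision, try it, and look back.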

Slide 44

Slide 44 text

quipper/snippets-ja

Slide 45

Slide 45 text

quipper/snippets-ja

Slide 46

Slide 46 text

Three points for building culture • Understand narratives • Reduce cognitive load • Put everything into words, thoroughly

Slide 47

Slide 47 text

Agenda 1. What does it mean to implement SRE in an organization? 2. What we have done so far as SRE at Study Sapuri / Quipper 3. SRE and culture 4. SRE and DevOps 5. SRE and Platform 6. Summary and what's next

Slide 48

Slide 48 text

Class SRE implements DevOps • SRE is what puts the DevOps philosophy into practice • What's the Difference Between DevOps and SRE? (class SRE implements DevOps) https://www.youtube.com/watch?v=uTEL8Ff1Zvk

Slide 49

Slide 49 text

What is DevOps?

Slide 50

Slide 50 text

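Study Sapuri Elementary / Junior High / High School / University Development Department, "Technology Strategy Group" • Make the product development organization and its systems more resilient to change • Define the technical vision and direction • Bring technical issues and technical debt under control • Establish an improvement cycle and acquire the ability to self-diagnose • Currently active as the DevOps WG (Working Group)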

Slide 51

Slide 51 text

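DevOps WG • Members from each development team, SRE, and QA are working out how to "acquire the ability to self-diagnose" • Running DX Criteria assessments • Creating value stream maps of the development process • Collecting and observing metrics • Providing input for fundamental improvements to the Platform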

Slide 52

Slide 52 text

Agenda 1. What does it mean to implement SRE in an organization? 2. What we have done as SRE at Study Sapuri / Quipper 3. SRE and culture 4. SRE and DevOps 5. SRE and Platform 6. Summary and what's next

Slide 53

Slide 53 text

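Why Platform? • Platform = a shared execution foundation used in common • Why do we need a Platform? • To reduce operational load • To reduce cognitive load • To collect metrics more effectively • To raise both agility and reliability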

Slide 54

Slide 54 text

Cloud Native Platform: We built a "mission tree" that arranges team members' missions in a tree structure. I plan to write a blog post about it. These are the child nodes of the tree called Platform.

Slide 55

Slide 55 text

Agenda 1. What does it mean to implement SRE in an organization? 2. What we have done as SRE at Study Sapuri / Quipper 3. SRE and culture 4. SRE and DevOps 5. SRE and Platform 6. Summary and what's next

Slide 56

Slide 56 text

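Summary • Things to keep in mind when implementing SRE in an organization • Understand narratives • Reduce cognitive load • Put everything into words, thoroughly • Build an organization and systems that are resilient to change, across the whole development organization • Realizing DevOps • Developing the Platform • Supporting SRE practice within development teams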

Slide 57

Slide 57 text

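What's next • We will keep implementing Site Reliability Engineering in the organization • We will support development teams in acquiring capabilities • The SRE Team will focus on Platform development while paying down infrastructure technical debt • Security follows the same pattern as reliability. We will build the culture and platform that let development teams practice it autonomously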

Slide 58

Slide 58 text

We are hiring! https://brand.studysapuri.jp/career/position/sre

Slide 59

Slide 59 text

Thank you! chaspy chaspy_ Engineering Manager Site Reliability at Recruit Co., Ltd. Takeshi Kondo

Slide 60

Slide 60 text

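FAQ: Where was resistance strongest, and what caused it to break down? • There was never really a scene where we "met strong resistance" • That is because we introduced things slowly and carefully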

Slide 61

Slide 61 text

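FAQ: After the SRE implementation took root, did other teams' mindset or behavior change in any way? • They now make performance improvements autonomously, based on SLO violation alerts and load test results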

Slide 62

Slide 62 text

FAQ: Does the SRE team itself develop infrastructure platforms? • Yes. We basically develop platforms built on the cloud • Cloud Native Platform • Application CI/CD • Infrastructure CI/CD • Kubernetes Platform • Loadtest Platform