note's service infrastructure "Shikinen Sengu" (式年遷宮, periodic rebuilding) initiative #notetechmeetup

varu3
April 22, 2022

Slides from my lightning talk at note tech meetup #2, the engineers' LT session.

YouTube:
https://www.youtube.com/watch?v=U5Z23eFyCL0

Transcript

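  1. note inc. Nine key challenges the note engineering team is taking on for 2022 (https://note.jp/n/n39681098a0d2)

    • A note article published at the end of last year
    • "Migrate to a container platform to make operations more efficient and less labor-intensive" was listed as one of the key challenges
    • The project started at the end of last year, and the migration is proceeding gradually, led by the SRE team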
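  2. General benefits of containerization

    • A better deploy cycle
      ◦ Automated deploys using CI/CD
      ◦ Blue-Green deploys and Canary deploys
    • Better development environments, easier operations, etc…
      ◦ Development environments are easy to build
      ◦ Operations can be automated with Kubernetes mechanisms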
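  3. The idea of "software Shikinen Sengu"

    • Shikinen Sengu
      ◦ A ceremony at Ise Jingu in which every shrine building is rebuilt once every 20 years
      ◦ Its purpose is to extend the service life of the buildings and to pass the craft on
    • The same can be said of systems
      ◦ note's system has a long history and has been in operation since the early days of AWS
      ◦ In some areas, best practices today differ from those of that time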
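  4. What we are doing to get a grip on the whole system and refactor it

    • Overhauling the network configuration
    • Tidying up the error notification / monitoring / logging environment
    • Measuring performance and revisiting scale-out and capacity
    • Improving deploy speed and stability
    • Sorting out the batch jobs that run as rake tasks
    • Taking inventory of gems that are (possibly) unused
    • Making it quick to spin up the environment needed for development
    • Removing the parts that depend on the existing CI environment (Jenkins)
    • etc…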
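  5. Overview of note's EKS migration project

    • Move the runtime platform of each note application to EKS (Kubernetes)
    • note is split into several services, whose applications run on EC2
      ◦ Frontend: Nuxt.js, etc.
      ◦ Backend: Ruby on Rails
      ◦ Batch: Ruby on Rails (rake tasks)
      ◦ Asynchronous workers: Sidekiq
    • Migrate these to EKS one after another, making operations of the whole note service more efficient and less labor-intensive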
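  6. EKS cluster configuration

    • A single-cluster, multi-tenant configuration was adopted
      ◦ NodeGroups are separated by application role
      ◦ namespaces and NodeGroups map 1:1, and nodeSelector controls which Nodes a Pod starts on
      ◦ Pods start in a secondary CIDR (100.XX.XX.XX) to prevent IP address exhaustion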
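The 1:1 mapping between namespace and NodeGroup can be illustrated with a small sketch. This is not note's actual manifest: the role names are hypothetical, and it assumes managed node groups, whose nodes EKS labels with `eks.amazonaws.com/nodegroup`. The slides do not say which label the real nodeSelector uses.

```python
# Hypothetical role names; the slides do not list the real namespaces or node groups.
ROLES = ["batch", "backend", "frontend", "worker"]

# Label that EKS attaches to nodes of a managed node group (assumed to be the selector key).
NODEGROUP_LABEL = "eks.amazonaws.com/nodegroup"


def pod_manifest_for(role: str, image: str) -> dict:
    """Build a Pod manifest pinned to the node group that matches its namespace."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"{role}-example", "namespace": role},
        "spec": {
            # 1:1 mapping: namespace "batch" -> node group "batch"
            "nodeSelector": {NODEGROUP_LABEL: role},
            "containers": [{"name": role, "image": image}],
        },
    }
```

Deriving both the namespace and the selector value from one role name keeps the two from drifting apart, which is the point of a 1:1 rule.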
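  7. Migration order by service

    (Diagram: services plotted by migration difficulty, from high to low, and by migration order)
    • Batch: Ruby on Rails (rake tasks + cron) ※ the system that runs jobs at scheduled times (we are here)
    • Backend: Ruby on Rails
    • Frontend: Nuxt.js, etc.
    • Worker: Ruby on Rails (Sidekiq)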
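  8. Progress of the batch migration (from October 2021 to June 2022)

    1. Develop the features needed to meet the requirements
      • Develop an environment that can run oneshot jobs
      • Develop cronjob-manager
    2. Build monitoring and alert detection
      • Pod alert detection (Datadog)
      • Application alert detection (Sentry)
    3. Test and verify the batches
      • Run each batch as a oneshot job
      • Deploy to the beta environment
    4. Migrate
      • Deploy to the production environment
      • Monitor after deployment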
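  9. An environment that can run oneshot jobs: k8s-job-executor

    • We developed a feature for running commands from a Slackbot
    • When a command is issued, a Pod starts up and runs it
    • Output is aggregated in CloudWatch Logs and viewed through Managed Service for Grafana
    • In Grafana, logs can be viewed per Pod and per application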
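As a rough illustration of the flow above, the sketch below turns a chat command into a Kubernetes Job manifest. It is not the k8s-job-executor implementation, which the slides do not show: the function name, the `oneshot-` name prefix, the default namespace, and the choice of a Job rather than a bare Pod are all assumptions.

```python
import shlex


def oneshot_job(command_line: str, image: str, namespace: str = "batch") -> dict:
    """Turn a chat command such as 'bundle exec rake foo:bar' into a Job manifest."""
    args = shlex.split(command_line)  # shell-like splitting that honors quotes
    if not args:
        raise ValueError("empty command")
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        # generateName lets the API server append a unique suffix on creation
        "metadata": {"generateName": "oneshot-", "namespace": namespace},
        "spec": {
            "backoffLimit": 0,  # assumption: a oneshot run is not retried automatically
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "oneshot", "image": image, "command": args}
                    ],
                }
            },
        },
    }
```

Passing the command as an argument list instead of a shell string avoids an extra shell inside the container and keeps quoting predictable.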
  10. cronjob-manager (Kubernetes Custom Controller)

    • Kubernetes requires one manifest per CronJob (that is, per batch)
    • Rewriting the image of every CronJob each time a new container image is built is painful… (we used to rely on shell-script tricks)
    • We developed a Kubernetes Custom Controller called cronjob-manager
    • A single manifest can control multiple CronJobs
    • Creating a CronJobManager resource automatically creates the CronJobs
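The idea of one manifest fanning out into many CronJobs can be sketched as a pure function. The shape of the `CronJobManager` spec used here (`image` plus a `jobs` list) is hypothetical, since the slides do not show the real custom resource schema. Only the generated `batch/v1` CronJob fields are standard Kubernetes.

```python
def expand_cronjobs(manager: dict) -> list[dict]:
    """Expand one hypothetical CronJobManager spec into batch/v1 CronJob manifests."""
    spec = manager["spec"]
    namespace = manager["metadata"]["namespace"]
    cronjobs = []
    for job in spec["jobs"]:
        cronjobs.append({
            "apiVersion": "batch/v1",
            "kind": "CronJob",
            "metadata": {"name": job["name"], "namespace": namespace},
            "spec": {
                "schedule": job["schedule"],
                "jobTemplate": {"spec": {"template": {"spec": {
                    "restartPolicy": "OnFailure",
                    "containers": [{
                        "name": job["name"],
                        "image": spec["image"],  # one image shared by every CronJob
                        "command": job["command"],
                    }],
                }}}},
            },
        })
    return cronjobs


# Hypothetical example resource: two batches, one image tag.
manager = {
    "metadata": {"name": "rails-batches", "namespace": "batch"},
    "spec": {
        "image": "example/app:v2",
        "jobs": [
            {"name": "daily-report", "schedule": "0 3 * * *",
             "command": ["bundle", "exec", "rake", "reports:daily"]},
            {"name": "hourly-cleanup", "schedule": "0 * * * *",
             "command": ["bundle", "exec", "rake", "cleanup:hourly"]},
        ],
    },
}
```

Because the image tag lives in one place, shipping a new container image means changing a single field rather than editing every CronJob manifest.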
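  11. Alert detection

    • Detection methods are split between Node/Pod-level alerts and application-level alerts
    • Node/Pod-level alerts
      ◦ Use Datadog
      ◦ Deploy the Datadog Agent as a DaemonSet and monitor metrics
    • Application-level alerts
      ◦ Use Sentry
      ◦ Backtraces raised by the application can be analyzed in detail
    • When a problem occurs, it is caught by one of the two kinds of alert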
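  12. Alert detection: action levels

    • Alerts that used to pour into a single channel are now organized by action level
    • Each level notifies its own Slack channel, three channels in total

    info
    • Alerts that need no response
    • Basically just collecting information
    • Information used during incidents or to verify behavior

    warn
    • Alerts that must be handled by the next day
    • No urgent response needed, but users could be affected if left alone

    critical
    • Alerts that need an immediate response
    • Watched by all development engineers
    • Users may already be starting to be affected in some way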
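The three action levels amount to a small routing table. The sketch below shows one way to express it. The channel names are hypothetical, and the slides do not describe how routing is actually configured in Datadog or Sentry.

```python
from enum import Enum


class ActionLevel(Enum):
    INFO = "info"          # no response needed
    WARN = "warn"          # handle by the next day
    CRITICAL = "critical"  # respond immediately


# Hypothetical channel names, one per action level.
CHANNELS = {
    ActionLevel.INFO: "#alert-info",
    ActionLevel.WARN: "#alert-warn",
    ActionLevel.CRITICAL: "#alert-critical",
}


def route_alert(level: str) -> str:
    """Return the Slack channel for an alert; an unknown level raises ValueError."""
    return CHANNELS[ActionLevel(level.lower())]
```

Rejecting unknown levels, instead of falling back to a default channel, keeps unclassified alerts from drifting into the critical channel unnoticed.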
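  13. Alert detection: keeping critical clean

    • critical is sacred ground: it takes constant, untiring care to make sure no unnecessary alerts fire there
    • As unnecessary alerts pile up, people no longer know what to do about an alert
    • And eventually nobody looks at the channel any more…
    • To prevent that, we periodically review thresholds and alert definitions so that the alerts stay appropriate
    • To make it even better (open issues)
      ◦ Summarize in a weekly report which alerts fired and how many times
      ◦ Track the number of alerts over time and stay aware of increases and decreases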
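The weekly summary listed as an open issue could start from something as small as the following sketch. The input format, a list of `(date, alert_name)` records, is an assumption. The slides do not say where the alert history would come from.

```python
from collections import Counter
from datetime import date


def weekly_summary(alerts: list[tuple[date, str]]) -> dict[tuple[int, int], Counter]:
    """Group (day, alert_name) records by ISO (year, week) and count each alert name."""
    summary: dict[tuple[int, int], Counter] = {}
    for day, name in alerts:
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # ISO year and ISO week number
        summary.setdefault(key, Counter())[name] += 1
    return summary
```

Comparing the counters of consecutive weeks gives the increase or decrease per alert that the fixed-point observation calls for.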
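  14. Testing and verifying the batches

    (Flow diagram)
    1. Check what the batch does
    2. Run it with k8s-job-executor on beta
    3. Check the logs for errors
    4. Check its behavior, such as execution time
    5. Deploy with cronjob-manager on beta
    6. Once a certain number have accumulated, deploy with cronjob-manager on production
    This cycle is repeated for each batch.

    • There are about 300 batches in total
    • The SRE team and the product development teams are splitting the work and verifying them one by one
    • Starting with the low-priority ones, batches are being moved onto EKS in sequence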
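  15. What we like about moving to EKS (Kubernetes)

    Job execution history is immediately visible
    • The kubectl get jobs command shows the history of recently executed Jobs right away
    • Details such as past execution logs can be checked individually for each Job
    • Re-running a failed Job is easy

    Fine-grained control per Job
    • CPU / Memory limits and requests can be specified per Job
    • Retry counts, parallelism, and similar settings can be tuned in detail
    • Running only specific Jobs on Fargate is also possible (future work)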
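The per-Job controls mentioned above map onto standard `batch/v1` Job fields: `resources.requests` and `resources.limits` on the container, and `backoffLimit` and `parallelism` on the Job spec. The following is a minimal sketch. The function name and all default values are illustrative assumptions, not note's settings.

```python
def tuned_job(name: str, image: str, command: list[str], *,
              cpu_request: str = "250m", memory_request: str = "256Mi",
              cpu_limit: str = "1", memory_limit: str = "1Gi",
              backoff_limit: int = 2, parallelism: int = 1) -> dict:
    """Build a batch/v1 Job manifest with per-Job resources and retry settings."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": backoff_limit,  # retries before the Job is marked failed
            "parallelism": parallelism,     # Pods allowed to run at the same time
            "template": {"spec": {
                "restartPolicy": "Never",
                "containers": [{
                    "name": name,
                    "image": image,
                    "command": command,
                    "resources": {
                        "requests": {"cpu": cpu_request, "memory": memory_request},
                        "limits": {"cpu": cpu_limit, "memory": memory_limit},
                    },
                }],
            }},
        },
    }
```

A memory-hungry batch can then be given, say, `memory_limit="4Gi"` without touching the settings of any other Job.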
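  16. Benefits of note's "Shikinen Sengu"

    • Operational convenience
      ◦ Almost every operation can now be performed from Slack, which improved convenience (building containers, running commands, viewing logs, etc.)
      ◦ Logs can now be viewed and searched in external tools such as Grafana
    • Incidents are easier to discover
      ◦ With the Datadog and Sentry monitoring environment in place, problems in CronJob behavior can now be detected immediately
      ◦ Alerts are organized by action level, which improved the on-call environment
    • The batch jobs themselves were improved and reviewed
      ◦ Fixed batches that had no test data, batches that did not emit proper logs, and so on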