SRE in LINE Shop team

Slides from the sponsor session given at JJUG CCC 2020 Fall on November 7, 2020.
https://ccc2020fall.java-users.jp/

LINE Developers

November 07, 2020
Transcript

  1. SRE in LINE Shop team
      2020/11/07 JJUG CCC 2020 Fall (https://ccc2020fall.java-users.jp)
      Manabu Matsuzaki, Development Department 1, LINE Fukuoka Corporation

  2. About me
      @matsumana
      LINE Fukuoka Corporation, Development Department 1
      SRE/Server Side Engineer
      https://github.com/matsumana
      Manabu Matsuzaki

  3. Agenda
      • Introduction to the LINE Shop service
      • Introduction to the LINE Shop service architecture
      • SRE efforts in the LINE Shop team

  4. Introduction to the LINE Shop service

  5. What is LINE Shop?
      • The internal name for LINE's content sales platform (LINE stickers, emoji, and themes)
      • The Sticker Shop and Theme Shop inside the LINE app
      • LINE STORE on the web (https://store.line.me/)

  6. Custom stickers
      • Stickers in which part of the text can be changed
      • https://linecorp.com/ja/pr/news/ja/2019/2664

  7. Message stickers
      • Stickers that accept longer text than custom stickers
      • https://linecorp.com/ja/pr/news/ja/2020/3127

  8. LINE Stickers Premium
      • A subscription service for Creators' Stickers
      • https://store.line.me/stickers-premium/landing/ja
      • http://creator-mag.line.me/ja/archives/1075007192.html

  9. Service scale
      • Sticker figures *1
        • Sticker sets on sale: 8.55 million (as of March 2020)
        • Stickers sent per day: 433 million on average (as of April 2019)
      • RPS (requests/sec) *2
        • Usual peak: ~80K RPS (as of 2020/10)
        • New Year peak: ~120K RPS (as of 2020/01)
      *1 https://linecorp.com/ja/pr/news/ja/2020/3127
      *2 https://logmi.jp/tech/articles/322924

  10. Number of friends of the LINE Stickers LINE Official Account over time
      • 2018/12: 39,000,000
      • 2019/12: 55,000,000
      • 2020/10: 63,000,000

  11. Introduction to the LINE Shop service architecture

  12. Microservices at LINE
      • LINE Shop is built from microservices
      • Seen from the LINE messaging platform as a whole, LINE Shop is itself one microservice *1
      *1: "The long road to microservices in LINE's messaging platform" https://linedevday.linecorp.com/jp/2019/sessions/D1-6

  13. Microservices in LINE Shop (partial)

  14. Frameworks and monitoring

  15. Armeria: feature overview
      • Asynchronous and reactive (like Spring WebFlux)
      • HTTP/2
      • Supports gRPC and Thrift as well as REST APIs
      • Client-side load balancing
        • https://armeria.dev/docs/client-service-discovery
      • Provides the features microservices need
        • Circuit breaker, service discovery (DNS, etc.), distributed tracing (Zipkin integration), etc.

  16. Armeria: feature overview
      • A Swagger-like debug console
      • Integration
        • Spring Boot integration
        • Can be embedded in an existing Java web application
      • etc.

  17. Armeria: references
      • Official site: https://armeria.dev
      • GitHub repo: https://github.com/line/armeria
      • LINE DEVELOPER DAY 2019, "Armeria: A Microservice Framework Well-suited Everywhere"
        • https://linedevday.linecorp.com/jp/2019/sessions/D2-2
        • https://youtu.be/lii7oNzAOx0
        • https://speakerdeck.com/line_devday2019/armeria-a-microservice-framework-well-suited-everywhere

  18. Armeria: references
      • I gave a lightning talk titled "An Introduction to Armeria for Spring Boot Users" at a JSUG study meeting
        • https://matsumana.info/blog/2020/07/30/introduce-to-armeria-for-spring-users/

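  19. LINE Shop microservices and the development team
      • About 25 team members across Tokyo and Fukuoka (server-side + SRE)
      • Teams are formed per project and add features to or modify multiple microservices as needed

  20. Things to consider with microservices
      • Distributed Tracing
      • Circuit Breaker to prevent Cascading Failure
      • Service decomposition designed for Graceful Degradation
      • Service Discovery
      https://employment.en-japan.com/engineerhub/entry/2018/10/09/110000

  21. Distributed Tracing
      • Visualizes API calls and the time each one took
      • Used to investigate bottlenecks when latency is a problem
      • LINE Shop uses Zipkin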

  22. Circuit Breaker
      https://armeria.dev/docs/client-circuit-breaker
      • When a failure occurs in a downstream service, stop sending requests to it until the failure is resolved

  23. Circuit Breaker

  24. Circuit Breaker
      Note: a Cascading Failure occurs

  25. Circuit Breaker

  26. Circuit Breaker (Armeria's FailFastException)
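The behavior described on slides 22 to 26 can be illustrated with a small state machine. This is a minimal sketch in plain Java, not Armeria's circuit breaker: the class name `SimpleCircuitBreaker`, the consecutive-failure threshold, the injected clock, and the use of `IllegalStateException` for failing fast are all assumptions made for illustration (Armeria signals this case with its own `FailFastException`).

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical minimal circuit breaker for illustration; not Armeria's implementation.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;     // consecutive failures that open the circuit
    private final long openDurationMillis;  // how long calls are rejected once open
    private final LongSupplier clockMillis; // injectable time source, eases testing

    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAtMillis;

    SimpleCircuitBreaker(int failureThreshold, long openDurationMillis, LongSupplier clockMillis) {
        this.failureThreshold = failureThreshold;
        this.openDurationMillis = openDurationMillis;
        this.clockMillis = clockMillis;
    }

    // Returns the current state; an OPEN circuit becomes HALF_OPEN after the wait time.
    synchronized State state() {
        if (state == State.OPEN && clockMillis.getAsLong() - openedAtMillis >= openDurationMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    // Runs the call, or fails fast without touching the backend while the circuit is OPEN.
    <T> T call(Supplier<T> remoteCall) {
        if (state() == State.OPEN) {
            throw new IllegalStateException("circuit is open; failing fast");
        }
        try {
            T result = remoteCall.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            throw e;
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        // A failed trial call in HALF_OPEN, or too many consecutive failures, opens the circuit.
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAtMillis = clockMillis.getAsLong();
        }
    }
}
```

Failing fast while the circuit is OPEN is what keeps a slow or failing downstream service from tying up the caller's threads, which is how a single failure turns into a cascading one.
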

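  27. Graceful Degradation
      • When a failure occurs, protect what really matters by lowering the quality of part of the response (e.g. serving stale cached data)
      • With microservices, split services with the following in mind
        • Which features should keep working even when a service fails?
        • Can some features keep working with a lower-quality response?
      Reference: "Site Reliability Engineering" (Japanese edition), p. 281, 22.2.2 Load Shedding and Graceful Degradation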
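The "serve stale cached data" example from slide 27 could look roughly like the following. This is only a sketch under stated assumptions: `StaleCacheFallback` is a hypothetical helper, the backend is modeled as a plain `Function`, keys and values are assumed to be non-null, and the cache is unbounded with no expiry, which a real service would need to address.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical wrapper that serves the last known value when the backend call fails.
final class StaleCacheFallback<K, V> {
    private final Function<K, V> backend;                          // primary lookup that may fail
    private final Map<K, V> lastKnown = new ConcurrentHashMap<>(); // last good value per key

    StaleCacheFallback(Function<K, V> backend) {
        this.backend = backend;
    }

    // Returns fresh data when possible, otherwise stale data, otherwise the default value.
    V get(K key, V defaultValue) {
        try {
            V fresh = backend.apply(key);
            lastKnown.put(key, fresh); // remember the latest good value
            return fresh;
        } catch (RuntimeException e) {
            // Degrade instead of failing the whole response.
            return lastKnown.getOrDefault(key, defaultValue);
        }
    }
}
```

Whether stale data is acceptable is a per-feature decision, which is why the slide frames it as a question to ask when splitting services.
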
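  28. Service Discovery (in LINE Shop)

  29. Central Dogma: feature overview
      • A configuration repository service
      • Clients that watch a configuration receive change notifications
      • Uses Git as the backend
      • Client libraries (Java, Go)
      • etc.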

  30. Central Dogma: use cases
      • Manage settings that need to change dynamically in Central Dogma
        • e.g. service discovery, rate limit settings, A/B tests, etc.

  31. Central Dogma: references
      • Official site: https://line.github.io/centraldogma/
      • GitHub repo: https://github.com/line/centraldogma/
      • LINE DEVELOPER DAY 2017, "Central Dogma: LINE's Git-backed highly-available service configuration repository"
        • https://www.slideshare.net/linecorp/central-dogma-lines-gitbacked-highlyavailable-service-configuration-repository
        • https://www.youtube.com/watch?v=BmgizIFwMq4

  32. SRE efforts in the LINE Shop team

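  33. SLI

  34. SLI
      • SLI (Service Level Indicator)
        • API availability (request success rate: successful requests / total requests)
        • Latency
        • etc.
      • SLO (Service Level Objective)
        • A reliability target for the service, based on SLIs
        • An SLO of 100% is the wrong target
          • It leaves no room for improvements, new features, or maintenance
          • The SLA of the underlying platform may itself be below 100%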
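The availability formula on slide 34, and the reason a 100% SLO leaves no room for change, can be shown with a little arithmetic. The following is a minimal sketch; the class `AvailabilitySli`, the "error budget" helper, and the sample numbers are illustrative assumptions and do not come from LINE Shop's actual SLOs.

```java
// Hypothetical helper for illustration; the numbers used with it are not LINE Shop's SLOs.
final class AvailabilitySli {
    private AvailabilitySli() {}

    // Availability SLI = successful requests / total requests (1.0 when there is no traffic).
    static double availability(long successCount, long totalCount) {
        if (successCount < 0 || totalCount < 0 || successCount > totalCount) {
            throw new IllegalArgumentException("counts must satisfy 0 <= success <= total");
        }
        return totalCount == 0 ? 1.0 : (double) successCount / totalCount;
    }

    // Number of failed requests that an SLO target tolerates for a given request volume.
    static long errorBudget(double sloTarget, long totalCount) {
        return (long) Math.floor((1.0 - sloTarget) * totalCount);
    }

    public static void main(String[] args) {
        System.out.println(availability(999, 1000));       // measured SLI
        System.out.println(errorBudget(0.999, 1_000_000)); // failures tolerated at 99.9%
        System.out.println(errorBudget(1.0, 1_000_000));   // a 100% SLO tolerates none
    }
}
```

With a 100% target the budget is zero, so any deployment or maintenance that causes even one failed request would violate the objective.
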
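  35. SLI
      • LINE Shop uses API availability (success rate) and API latency as SLIs
        • Based on a case study in "The Site Reliability Workbook" (Japanese edition)
      • Visualized with Prometheus + Grafana
      • Used to check user impact when a service failure occurs
      • SRE practice calls for agreeing on SLOs with stakeholders, but we have not done that yet (future work)

  36. SLI Dashboard
      • A dashboard that shows the current state
      • Thresholds and background colors are configured, so the background color changes with the metric values

  37. SLI Dashboard

  38. About the "Service Reliability Hierarchy"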

  39. Service Reliability Hierarchy
      From "Site Reliability Engineering" (Japanese edition), Part III: Practices
      • 7. Product
      • 6. Development
      • 5. Capacity Planning
      • 4. Testing + Release procedures
      • 3. Postmortem / Root Cause Analysis
      • 2. Incident Response
      • 1. Monitoring

  40. Service Reliability Hierarchy (same list as slide 39)

  41. Service Reliability Hierarchy (same list as slide 39)

  42. Monitoring - Alerting
      • Alert levels and notification channels are split according to whether an alert has a direct impact on users

  43. Monitoring - Alerting
      • Error (direct impact on users)
        • Latency degradation
        • Increase in error responses/sec
        • etc.
      • Warn (a potential cause of problems, or low service impact)
        • CPU usage
        • JVM GC
        • An application server went down (no service impact if only a few servers)
        • etc.

  44. Monitoring - metrics we collect
      • Per-API metrics
        • Server/Client latency (50th, 90th, 99th percentile, etc.)
        • Requests/sec
        • Error responses/sec
      • Log volume (Warn, Error)
      • JVM (GC, Heap, etc.)
      • DB client metrics (HikariCP, etc.)
      • Server load (CPU, Memory, Network Traffic, etc.)
      • etc.

  45. Examples of metrics exported by Armeria (partial)
      • Server/Client latency (50th, 90th, 99th percentile, etc.)
      • Requests/sec
      • Error responses/sec
      • Circuit breaker (CLOSED, OPEN, HALF_OPEN, etc.)

  46. Examples of metrics exported by Armeria (partial)
      • Request/Response size
        • Monitors for caller-side problems that increase request size and raise server load
        • Monitors for server-side problems that return invalid (extremely small) responses
      • Time the Armeria client spent on DNS name resolution
        • Monitors for slow DNS name resolution that degrades latency

  47. Examples of application-specific metrics
      • Versions (which version was deployed, and when?)
        • Application
        • Frameworks
          • Armeria
          • Spring Boot
        • JVM
        • etc.

  48. Examples of application-specific metrics
      https://github.com/line/armeria/blob/armeria-1.1.0/core/src/main/java/com/linecorp/armeria/server/Server.java#L375

  49. Examples of application-specific metrics
      Metrics:
        armeria_build_info{version="1.1.0", …} 1.0
      PromQL:
        sum(armeria_build_info{project=~"$project", …}) by (project, version)

  50. Monitoring - Batch job
      Metrics:
        shop_batch_successful_time_seconds{job="foo", period="10min"} 1601313019
        Note: "1601313019" is the UNIX time at which the job completed successfully
      Alert rule:
        time() - shop_batch_successful_time_seconds{period="10min"} > 60 * 10 * 3
        Note: "3" is a buffer so that transient errors do not raise an alert; adjust to taste.
        Note: in this example, an alert fires if a batch job labeled period="10min" does not complete successfully within 30 minutes of its previous successful completion.
      https://www.robustperception.io/monitoring-batch-jobs-in-python
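The alert rule on slide 50 is plain arithmetic on timestamps, so it can be restated as a small function to make the threshold explicit. This is a sketch for illustration only: `BatchFreshnessCheck` and its parameter names are hypothetical, and in the setup described in the talk the comparison is evaluated by Prometheus, not by application code.

```java
// Hypothetical restatement of the alert rule; in practice Prometheus evaluates the expression.
final class BatchFreshnessCheck {
    private BatchFreshnessCheck() {}

    // Mirrors: time() - shop_batch_successful_time_seconds > periodSeconds * bufferFactor
    static boolean shouldAlert(long nowEpochSeconds,
                               long lastSuccessEpochSeconds,
                               long periodSeconds,
                               int bufferFactor) {
        long secondsSinceSuccess = nowEpochSeconds - lastSuccessEpochSeconds;
        return secondsSinceSuccess > periodSeconds * bufferFactor;
    }

    public static void main(String[] args) {
        long lastSuccess = 1601313019L; // value from the slide's sample metric
        // 10-minute job with a buffer factor of 3: alert once more than 1800 seconds have passed.
        System.out.println(shouldAlert(lastSuccess + 1800, lastSuccess, 600, 3));
        System.out.println(shouldAlert(lastSuccess + 1801, lastSuccess, 600, 3));
    }
}
```

Recording the time of the last success, instead of a success/failure flag, also catches jobs that silently stop running.
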
  51. Service Reliability Hierarchy (same list as slide 39)

  52. on-call
      • Two people take the on-call rotation each week
      • Alerts are handled following a guide written by the team
      • Check the alerts that fired and escalate to team members
      • Members who are not on call also join the alert response
      • Collect dumps, logs, and so on
        • JVM Thread dump
        • JVM Heap dump
        • JFR logs
        • etc.

  53. on-call
      • Escalate to the relevant channels when there is service impact
      • File a ticket for the issue
      • Write a report when a service failure occurs
        • This is what SRE practice calls a postmortem
      • etc.

  54. Service Reliability Hierarchy (same list as slide 39)

  55. Postmortem
      • A postmortem is the report written when a service failure occurs, and the practice around it
      • LINE has long had a postmortem culture
      • In LINE Shop, we write a postmortem and hold a meeting with the teams involved

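  56. Postmortem
      • Items covered in a postmortem
        • Scope of impact
        • Cause of the failure
        • Timeline of events
        • Measures to prevent recurrence
          • Was there a problem with failure detection? How can it be improved?
          • Was there a problem with how the failure was handled? How can it be improved?

  57. Service Reliability Hierarchy (same list as slide 39)

  58. Capacity Planning
      • The moment just after 0:00 on New Year's Day is one of the events with the largest traffic increase
      • We prepare in advance every year
        • Servers are scaled up or out based on metrics from the previous New Year's Day and other events
      • A transcript of the talk given at LINE Developer Meetup is available
        • https://logmi.jp/tech/articles/322924

  59. Capacity Planning
      • Everyday request volume also keeps growing, driven by new features and by growth in the number of friends of the LINE Stickers LINE Official Account
      • We scale out as needed at times other than New Year's Day as well

  60. Other topics

  61. Other topics
      • Handling sudden overload
      • What we considered and prepared for using k8s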

  62. Handling sudden overload
      • A sudden overload can cause a service failure
      • Possible causes
        • A sudden spike after being featured on TV
        • A request increase caused by deploying a version with a bug
        • etc.
      • How do we deal with it?
        • Rate limit (reject a portion of requests to prevent overload)
        • Scale out servers

  63. Rate limit in LINE Shop
      • Rate limit settings are managed in Central Dogma
      • Rate limiting is implemented in the application with Armeria's ThrottlingService
      https://armeria.dev/docs/advanced-production-checklist/
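To make the idea of rejecting a portion of requests concrete, here is a minimal token-bucket sketch in plain Java. It is not Armeria's ThrottlingService and does not read settings from Central Dogma: the class `TokenBucketRateLimiter`, its constructor parameters, and the injected clock are assumptions made for illustration, and the token bucket itself is just one common rate-limiting technique, not necessarily the one LINE Shop uses.

```java
import java.util.function.LongSupplier;

// Hypothetical token-bucket limiter for illustration; not Armeria's ThrottlingService.
final class TokenBucketRateLimiter {
    private final long capacity;            // maximum number of stored tokens (burst size)
    private final long refillIntervalNanos; // time needed to earn one token
    private final LongSupplier clockNanos;  // injectable time source, eases testing

    private long tokens;
    private long lastRefillNanos;

    TokenBucketRateLimiter(long capacity, long refillIntervalNanos, LongSupplier clockNanos) {
        if (capacity <= 0 || refillIntervalNanos <= 0) {
            throw new IllegalArgumentException("capacity and refill interval must be positive");
        }
        this.capacity = capacity;
        this.refillIntervalNanos = refillIntervalNanos;
        this.clockNanos = clockNanos;
        this.tokens = capacity; // start with a full bucket
        this.lastRefillNanos = clockNanos.getAsLong();
    }

    // Returns true if the request may proceed, false if it should be rejected.
    synchronized boolean tryAcquire() {
        long now = clockNanos.getAsLong();
        long earned = (now - lastRefillNanos) / refillIntervalNanos;
        if (earned > 0) {
            // Add whole tokens only, and never exceed the bucket capacity.
            tokens = Math.min(capacity, tokens + Math.min(earned, capacity));
            lastRefillNanos += earned * refillIntervalNanos;
        }
        if (tokens > 0) {
            tokens--;
            return true;
        }
        return false;
    }
}
```

A caller would check `tryAcquire()` before handling each request and return an error response when it yields `false`. Keeping the capacity and refill interval in an external configuration store is what allows the limit to be changed without redeploying.
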
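  64. What we considered and prepared for using k8s

  65. Why does LINE Shop want to use k8s?
      • Most microservices currently run on VMs and physical servers in Verda (private cloud)
        • Provisioned with Ansible
        • Scaling out takes time
      • We want to scale out quickly when requests surge
      • Verda also offers a k8s service

  66. What we considered and prepared (monitoring)
      • The common pattern for Prometheus monitoring in a k8s environment
        → Deploy Prometheus inside the cluster

  67. What we considered and prepared (monitoring)
      • We want to avoid operating Prometheus servers ourselves
        • The total number of metrics collected across LINE Shop services (scrape_samples_scraped) is over 6,000,000
        • Running Prometheus servers stably requires Prometheus knowledge and know-how
      • We want to keep using our current Prometheus/Alertmanager/alert rules/Grafana dashboards as-is in the k8s environment

  68. What we considered and prepared (monitoring)
      • Metrics can also be collected by a Prometheus server outside the cluster via the k8s API server

  69. What we considered and prepared (monitoring)
      • We want to avoid collecting metrics from deployed Pods via the k8s API server
        • Even as the number of Pods grows, we want to put as little load as possible on the k8s API server

  70. What we considered and prepared (monitoring)
      • We developed a reverse proxy application and deployed it to the k8s cluster
      • Pod metrics are collected via the reverse proxy application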

  71. Summary
      • Introduced the LINE Shop service
      • Introduced the LINE Shop service architecture
        • Armeria and Central Dogma
        • Things to consider with microservices
          • Distributed Tracing
          • Circuit Breaker to prevent Cascading Failure
          • Service decomposition designed for Graceful Degradation
          • Service Discovery

  72. Summary
      • Introduced SRE efforts in the LINE Shop team
        • Service Reliability Hierarchy
          • Monitoring
          • Incident Response
          • Postmortem
          • Capacity Planning

  73. LINE DEVELOPER DAY 2020
      • Official site: https://linedevday.linecorp.com/2020/ja
      • Dates: 2020/11/25-27 (held online)
      • More than 150 sessions in total
      • Japanese-English interpretation for all sessions

  74. We are hiring
      • LINE Fukuoka Corporation
        • Server-side Engineer
          https://linefukuoka.co.jp/ja/career/list/engineer/development_engineer_server-side
      • LINE Corporation
        • Senior Server-side Engineer / Content Sales Platform
          https://linecorp.com/ja/career/position/665
        • Site Reliability Engineer / Content Sales Platform
          https://linecorp.com/ja/career/position/1535

  75. Thank you for listening!