LINE ShopチームでのSREの取り組み / SRE in LINE Shop team
LINE Developers
November 07, 2020
Slides from the sponsor session at JJUG CCC 2020 Fall, held on November 7, 2020.
https://ccc2020fall.java-users.jp/
Transcript
SRE in the LINE Shop Team
2020/11/07 JJUG CCC 2020 Fall ( https://ccc2020fall.java-users.jp )
LINE Fukuoka Corp., Development Department 1, Manabu Matsuzaki
Self-introduction
• Manabu Matsuzaki (@matsumana)
• LINE Fukuoka Corp., Development Department 1
• SRE/Server Side Engineer
• https://github.com/matsumana
Agenda
• Introduction to the LINE Shop service
• Introduction to the LINE Shop service architecture
• SRE in the LINE Shop team
Introduction to the LINE Shop service
What is LINE Shop?
• The internal name for LINE's content sales platform (LINE stickers, emoji, and themes)
  • The Sticker Shop and Theme Shop in the LINE app
  • LINE STORE on the web (https://store.line.me/)
Custom stickers
• Stickers in which part of the text can be changed
• https://linecorp.com/ja/pr/news/ja/2019/2664
Message stickers
• Stickers that accept more text than custom stickers
• https://linecorp.com/ja/pr/news/ja/2020/3127
LINE Stickers Premium
• A subscription service for Creators' Stickers
• https://store.line.me/stickers-premium/landing/ja
• http://creator-mag.line.me/ja/archives/1075007192.html
Service scale
• Sticker figures *1
  • Stickers on sale: 8.55 million sets (as of March 2020)
  • Stickers sent per day: 433 million on average (as of April 2019)
• RPS (requests/sec) *2
  • Usual peak: ~80K RPS (as of 2020/10)
  • New Year peak: ~120K RPS (as of 2020/01)
*1 https://linecorp.com/ja/pr/news/ja/2020/3127
*2 https://logmi.jp/tech/articles/322924
Number of friends of the LINE Stickers LINE Official Account over time
• 2018/12: 39,000,000
• 2019/12: 55,000,000
• 2020/10: 63,000,000
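Introduction to the LINE Shop service architecture
Microservices at LINE
• LINE Shop is built from microservices
• Seen from the LINE messaging platform as a whole, LINE Shop itself is also a single microservice *1
*1: "The long road to microservices in LINE's messaging platform" https://linedevday.linecorp.com/jp/2019/sessions/D1-6
Microservices in LINE Shop (partial)
Frameworks and monitoring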
Armeria: feature overview
• Asynchronous and reactive (like Spring WebFlux)
• HTTP/2
• Supports gRPC and Thrift in addition to REST APIs
• Client-side load balancing
  • https://armeria.dev/docs/client-service-discovery
• Provides the features microservices need
  • Circuit breaker, service discovery (DNS etc.), distributed tracing (Zipkin integration), etc.
Armeria: feature overview (continued)
• A Swagger-like debug console
• Integration
  • Spring Boot integration
  • Can be embedded into an existing Java web application
• etc.
Armeria: references
• Official site: https://armeria.dev
• GitHub repo: https://github.com/line/armeria
• LINE DEVELOPER DAY 2019, "Armeria: A microservice framework well suited everywhere"
  • https://linedevday.linecorp.com/jp/2019/sessions/D2-2
  • https://youtu.be/lii7oNzAOx0
  • https://speakerdeck.com/line_devday2019/armeria-a-microservice-framework-well-suited-everywhere
• Lightning talk at a JSUG study meeting, titled "Introduction to Armeria for Spring Boot users"
  • https://matsumana.info/blog/2020/07/30/introduce-to-armeria-for-spring-users/
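LINE Shop microservices and the development team
• 25 team members across Tokyo and Fukuoka (server side + SRE)
• Teams are formed per project, and each team adds features to or modifies multiple microservices as needed
Things to consider when building microservices
• Distributed Tracing
• Circuit Breaker to prevent Cascading Failure
• Service decomposition with Graceful Degradation in mind
• Service Discovery
https://employment.en-japan.com/engineerhub/entry/2018/10/09/110000
Distributed Tracing
• Visualizes API calls and the time each one took
• Used to investigate bottlenecks when there is a latency problem
• LINE Shop uses Zipkin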
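Circuit Breaker
https://armeria.dev/docs/client-circuit-breaker
• When a failure occurs in a downstream service, stop sending requests to it until the failure is resolved
Circuit Breaker (diagram)
Circuit Breaker (diagram) *A Cascading Failure occurs
Circuit Breaker (diagram)
Circuit Breaker (diagram: Armeria's FailFastException)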
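The behavior described in the Circuit Breaker slides can be sketched roughly as follows. This is a minimal, count-based illustration written for this transcript, not Armeria's implementation (the talk only states that Armeria fails fast with FailFastException); the class name `CircuitBreaker`, the exception `CircuitOpenError`, and the parameters `failure_threshold` and `open_seconds` are all hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open (hypothetical name)."""


class CircuitBreaker:
    """Count-based circuit breaker sketch with CLOSED, OPEN and HALF_OPEN states."""

    def __init__(self, failure_threshold=3, open_seconds=10.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.open_seconds = open_seconds            # how long to reject calls once open
        self.clock = clock                          # injectable clock, for testing
        self.failures = 0
        self.opened_at = None                       # None means the circuit is CLOSED

    def state(self):
        if self.opened_at is None:
            return "CLOSED"
        if self.clock() - self.opened_at >= self.open_seconds:
            return "HALF_OPEN"  # one trial call is allowed through
        return "OPEN"

    def call(self, func, *args, **kwargs):
        if self.state() == "OPEN":
            # Fail fast instead of waiting on a downstream that is known to be failing.
            raise CircuitOpenError("downstream is failing; request rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        self.failures = 0
        self.opened_at = None

    def _on_failure(self):
        self.failures += 1
        # A failed trial call in HALF_OPEN re-opens the circuit immediately.
        if self.opened_at is not None or self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

Rejecting calls while the circuit is open keeps callers from piling up on a failing dependency, which is how the pattern limits a Cascading Failure.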
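Graceful Degradation
• When a failure occurs, protect what really matters by lowering the quality of part of the response (e.g. serving stale cached data)
• For microservices, decompose services with the following in mind
  • Which features should keep working even when a service fails?
  • Can some features keep working with lower response quality?
Reference: "Site Reliability Engineering" (Japanese edition), p. 281, section 22.2.2 "Load Shedding and Graceful Degradation"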
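As a rough sketch of the "serve stale cached data" example above: the function below tries the live backend first and falls back to the last cached value when the call fails. The function name `fetch_with_fallback` and the `(value, degraded)` return shape are assumptions made for illustration, not something shown in the talk.

```python
def fetch_with_fallback(key, fetch, cache):
    """Return (value, degraded).

    Tries the live backend via fetch(key). If that fails and a previously
    cached value exists, returns the stale value with degraded=True.
    If nothing is cached, the original error is re-raised.
    """
    try:
        value = fetch(key)
    except Exception:
        if key in cache:
            return cache[key], True  # degraded: stale data, but the feature keeps working
        raise
    cache[key] = value  # refresh the cache on every successful call
    return value, False
```

Returning the `degraded` flag lets the caller decide whether to mark the response as possibly out of date.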
Service Discovery (the LINE Shop case)
Central Dogma: feature overview
• A configuration repository service
• Clients can watch an entry and receive change notifications
• Uses Git as the backend
• Client libraries (Java, Go)
• etc.
Central Dogma: use cases
• Settings that need to change dynamically are managed in Central Dogma
  • e.g. service discovery, rate limit settings, A/B tests, etc.
Central Dogma: references
• Official site: https://line.github.io/centraldogma/
• GitHub repo: https://github.com/line/centraldogma/
• LINE DEVELOPER DAY 2017, "Central Dogma: LINE's Git-backed highly-available service configuration repository"
  • https://www.slideshare.net/linecorp/central-dogma-lines-gitbacked-highlyavailable-service-configuration-repository
  • https://www.youtube.com/watch?v=BmgizIFwMq4
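SRE in the LINE Shop team
SLI
SLI and SLO
• SLI (Service Level Indicator)
  • API availability (request success rate: successful requests / total requests)
  • Latency
  • etc.
• SLO (Service Level Objective)
  • A reliability target for the service, based on SLIs
  • An SLO of 100% is the wrong target
    • It leaves no room for improvements, new features, or maintenance
    • The SLA of the underlying platform may itself be below 100%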
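SLI at LINE Shop
• LINE Shop uses API availability (success rate) and API latency as SLIs
  • Based on a case study in "The Site Reliability Workbook" (Japanese edition: サイトリライアビリティワークブック)
• Visualized with Prometheus + Grafana
• Used to check the impact on users when a service failure occurs
• SRE practice says SLOs must be agreed with stakeholders, but this has not been done yet (future work)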
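The availability SLI defined above (successful requests / total requests), and the reason an SLO of 100% leaves no room for change, can be illustrated with a small calculation. This is a sketch with made-up numbers; the function names and the idea of reporting the "remaining error budget" as a fraction are assumptions for illustration, not figures or tooling from the talk.

```python
def availability(success_count, total_count):
    """Availability SLI: successful requests / total requests."""
    if total_count == 0:
        return 1.0  # no traffic, nothing failed
    return success_count / total_count


def error_budget_remaining(success_count, total_count, slo):
    """Fraction of the error budget still unused (1.0 = untouched, below 0 = SLO missed)."""
    allowed_failures = (1.0 - slo) * total_count
    actual_failures = total_count - success_count
    if allowed_failures == 0:
        # With an SLO of 100% there is no budget at all: any single failure misses it.
        return 0.0 if actual_failures == 0 else float("-inf")
    return 1.0 - actual_failures / allowed_failures
```

For example, with a hypothetical SLO of 99.9% and 10,000 requests, 10 failures are allowed, so 5 failures leave about half of the budget.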
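SLI Dashboard
• A dashboard that shows the current state
• Thresholds and background colors are configured, so the background color changes with the metric value
SLI Dashboard (screenshot)
About the "Service Reliability Hierarchy"
Service Reliability Hierarchy, from "Site Reliability Engineering" (Japanese edition), Part III "Practices"
• 7. Product
• 6. Development
• 5. Capacity planning
• 4. Testing and release procedures
• 3. Postmortem / root cause analysis
• 2. Incident response
• 1. Monitoring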
Monitoring - Alerting
• The alert level and the notification channel are chosen according to whether the alert directly affects users
• Error (direct impact on users)
  • Worsening latency
  • Increase in error responses/sec
  • etc.
• Warn (something that may cause a problem, or that has little service impact)
  • CPU usage
  • JVM GC
  • An application server went down (no service impact if the outage is limited)
  • etc.
Monitoring - Collected metrics
• Per-API metrics
  • Server/Client latency (50th, 90th, 99th percentile, etc.)
  • Requests/sec
  • Error responses/sec
• Log volume (Warn, Error)
• JVM (GC, Heap, etc.)
• DB client metrics (HikariCP, etc.)
• Server load (CPU, Memory, Network Traffic, etc.)
• etc.
Examples of metrics exported by Armeria (partial)
• Server/Client latency (50th, 90th, 99th percentile, etc.)
• Requests/sec
• Error responses/sec
• Circuit breaker (CLOSED, OPEN, HALF_OPEN, etc.)
• Request/Response size
  • Monitors whether a problem on the caller side is increasing request size and raising server load
  • Monitors whether a problem on the server side is producing invalid (extremely small) responses
• Time taken by the Armeria client for DNS name resolution
  • Monitors whether slow name resolution caused by DNS is worsening latency
Examples of application-specific metrics
• Versions (which version was deployed, and when?)
  • Application
  • Framework
    • Armeria
    • Spring Boot
  • JVM
  • etc.
• https://github.com/line/armeria/blob/armeria-1.1.0/core/src/main/java/com/linecorp/armeria/server/Server.java#L375
• Metrics: armeria_build_info{version="1.1.0", …} 1.0
• PromQL: sum(armeria_build_info{project=~"$project", …}) by (project, version)
Monitoring - Batch job
• Metrics: shop_batch_successful_time_seconds{job="foo", period="10min"} 1601313019
  • "1601313019" is the UNIX time at which the job last completed successfully
• Alert rule: time() - shop_batch_successful_time_seconds{period="10min"} > 60 * 10 * 3
  • The "3" is a buffer so that a transient error does not raise an alert; adjust to taste
  • In this example, a batch job labeled period="10min" raises an alert if it has not completed successfully within 30 minutes of its last successful run
• https://www.robustperception.io/monitoring-batch-jobs-in-python
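The arithmetic behind the alert rule above can be checked with a few lines of Python. This mirrors the PromQL expression `time() - last_success > 60 * 10 * 3`; the function name `batch_job_is_stale` and its parameter names are hypothetical, chosen only for this illustration.

```python
def batch_job_is_stale(last_success_unix, now_unix, period_seconds=600, buffer_factor=3):
    """True when the last successful run is older than period * buffer.

    With the defaults (a 10-minute job and a buffer of 3), the job counts as
    stale once more than 1800 seconds (30 minutes) have passed since the
    last successful completion.
    """
    return now_unix - last_success_unix > period_seconds * buffer_factor
```

Using the timestamp from the slide, a check at 1601313019 + 1800 is still within the limit, and one second later the alert condition holds.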
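on-call
• On-call duty rotates weekly between two people
• Alerts are handled according to a guide written by the team
  • Check the alerts that fired and escalate to team members
  • Members who are not on call also take part in handling alerts
• Collect dumps and logs
  • JVM thread dump
  • JVM heap dump
  • JFR logs
  • etc.
• If there is service impact, escalate to the relevant channels
• File a ticket for each issue that occurs
• Write a report when a service failure occurs
  • This is the postmortem of SRE practice
• etc.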
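Postmortem
• A postmortem is the report written when a service failure occurs, and also the practice around it
• LINE has long had a postmortem culture
• At LINE Shop, we write a postmortem and hold a meeting with the teams involved
• Items covered in a postmortem
  • Scope of impact
  • Cause of the failure
  • Timeline of events
  • Measures to prevent recurrence
    • Were there problems in detecting the failure? How can detection be improved?
    • Were there problems in handling the failure? How can handling be improved?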
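Capacity planning
• Around 0:00 on New Year's Day is one of the events with the largest traffic increase
  • We prepare in advance every year
  • Servers are scaled up and scaled out based on metrics from the previous New Year's Day and other events
• A write-up of the talk given at LINE Developer Meetup is available here
  • https://logmi.jp/tech/articles/322924
• The usual request volume also keeps growing, driven by new features and the growing number of friends of the LINE Stickers LINE Official Account
  • We scale out as needed at times other than New Year's Day as well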
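Other topics
• Handling sudden overload
• What we considered and prepared for using k8s
Handling sudden overload
• A sudden overload can cause a service failure
• Some possible causes
  • A sudden spike after being featured on TV
  • A request increase caused by deploying a version that contains a bug
  • etc.
• How to handle it?
  • Rate limit (reject a certain number of requests to avoid entering an overloaded state)
  • Scale out servers
Rate limit at LINE Shop
• Rate limit settings are managed in Central Dogma
• Rate limiting is implemented in the application with Armeria's ThrottlingService
https://armeria.dev/docs/advanced-production-checklist/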
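To make the idea of rejecting requests above a configured rate concrete, here is a minimal token-bucket sketch. It is not Armeria's ThrottlingService and does not read from Central Dogma; the class name `TokenBucket` and the parameters `rate_per_sec` and `burst` are assumptions, and token bucket is just one common way to implement a rate limit (the talk does not say which strategy LINE Shop uses).

```python
import time


class TokenBucket:
    """Token-bucket rate limiter sketch: allow() returns False when a request should be rejected."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = burst         # maximum number of stored tokens
        self.tokens = float(burst)    # start with a full bucket
        self.clock = clock            # injectable clock, for testing
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: the caller would reject this request
```

In a setup like the one described in the slide, the rate would come from a dynamically updated setting rather than a constructor argument, so the limit can be changed without redeploying.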
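What we considered and prepared for using k8s
Why does LINE Shop want to use k8s?
• Most microservices currently run on Verda (private cloud) VMs and physical servers
  • Provisioned with Ansible
  • Scaling out takes time
• We want to scale out quickly when requests increase suddenly
• Verda also provides a k8s service
What we considered and prepared (monitoring)
• The common pattern for Prometheus monitoring in a k8s environment → deploy Prometheus inside the cluster
• We want to avoid operating those Prometheus servers ourselves
  • The total number of metrics collected across LINE Shop services (scrape_samples_scraped) is over 6,000,000
  • Running Prometheus servers stably requires Prometheus knowledge and know-how
• We want to keep using our current Prometheus/Alertmanager/alert rules/Grafana dashboards as they are in the k8s environment
• It is also possible to collect metrics from a Prometheus server outside the cluster via the k8s API server
• We want to avoid collecting the metrics of deployed Pods via the k8s API server
  • Even as the number of Pods grows, we want to put as little load as possible on the k8s API server
• We developed a reverse proxy application and deployed it to the k8s cluster
  • Pod metrics are collected through the reverse proxy application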
Summary
• Introduced the LINE Shop service
• Introduced the LINE Shop service architecture
  • Armeria and Central Dogma
  • Things to consider when building microservices
    • Distributed Tracing
    • Circuit Breaker to prevent Cascading Failure
    • Service decomposition with Graceful Degradation in mind
    • Service Discovery
• Introduced SRE in the LINE Shop team
  • Service Reliability Hierarchy
    • Monitoring
    • Incident response
    • Postmortem
    • Capacity planning
LINE DEVELOPER DAY 2020
• Official site: https://linedevday.linecorp.com/2020/ja
• Dates: 2020/11/25~27 (held online)
• More than 150 sessions in total
• Japanese/English interpretation for all sessions
We are hiring
• LINE Fukuoka Corp.
  • Server-side engineer https://linefukuoka.co.jp/ja/career/list/engineer/development_engineer_server-side
• LINE Corp.
  • Senior server-side engineer / content sales platform https://linecorp.com/ja/career/position/665
  • Site Reliability Engineer / content sales platform https://linecorp.com/ja/career/position/1535
Thank you for listening!