Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
サービスのパフォーマンス数値と 依存関係を用いたサービス同士の 協調スケール構想 / Web ...
Search
itkq
December 23, 2017
0
890
サービスのパフォーマンス数値と 依存関係を用いたサービス同士の 協調スケール構想 / Web System Architecture #1
Web System Architecture 研究会 #1
itkq
December 23, 2017
Tweet
Share
More Decks by itkq
See All by itkq
SREからゼロイチプロダクト開発へ ー越境する打席の立ち方と期待への応え方ー / Product Engineering Night #8
itkq
2
4.2k
アクセス制御にまつわる改善 / Improving access control
itkq
2
2k
SRE Lounge #13 Service Level at Ubie
itkq
4
2.4k
Cookpad summer internship 2019 - infra intro
itkq
2
10k
Where Chaos Engineering comes from, and what's next
itkq
6
7.8k
Re:silience から始めるカオスエンジニアリング生活 / A Life of Chaos Engineering Starting with Resilience
itkq
12
4k
Featured
See All Featured
GraphQLとの向き合い方2022年版
quramy
49
14k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.6k
Designing for humans not robots
tammielis
253
25k
A better future with KSS
kneath
239
17k
The World Runs on Bad Software
bkeepers
PRO
70
11k
Scaling GitHub
holman
463
140k
Bash Introduction
62gerente
615
210k
Stop Working from a Prison Cell
hatefulcrawdad
271
21k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
252
21k
Why Our Code Smells
bkeepers
PRO
339
57k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.5k
How STYLIGHT went responsive
nonsquared
100
5.8k
Transcript
αʔϏεͷύϑΥʔϚϯεͱ ґଘؔΛ༻͍ͨαʔϏεಉ࢜ͷ ڠௐεέʔϧߏ Web System Architecture ݚڀձ #1 @itkq
Me 2 @itkq Takuya Kosugiyama ౦େ ใ௨৴ܥ M2 όΠτ (SRE)
͍ͨ͜ (͖Ύʔ)
ࢀՃత • Web System ͷՄೳੑɺݶքɺֶज़తՁʹ ڵຯ͕͋Δ • ༷ʑͳਓʑͱ͍ٞͨ͠ • ए͍͏ͪ
[ཁग़య] ʹੈքΛ͍͛ͨ 3
ࢀՃత • Web System ͷՄೳੑɺݶքɺֶज़తՁʹ ڵຯ͕͋Δ • ༷ʑͳਓʑͱ͍ٞͨ͠ • ए͍͏ͪ
[ཁग़య] ʹੈքΛ͍͛ͨ 3 • ϦΞϧ id:y_uuki ͞ΜΛݟʹ͖ͨ
ൃද༰ • ࠷ۙͷ Web ΞʔΩςΫνϟͷಈͱ SRE ຊΛ ಡΜͰߟ͑ͨ͜ͱ • ίϯςφϕʔεͷαʔϏεϝογϡΛ׆༻ͨ͠
ӡ༻্ͷεέʔϧղܾͷͨΊͷ͕͔Γ 4
എܠɿΞϓϦέʔγϣϯίϯςφ • ίϯςφܕԾԽ • খΦʔόʔϔου • Container as a Service
• Production ready • Docker, Kubernetes • Managed services (GKE, ECS, EKS) 5
എܠɿαʔϏεࢦΞʔΩςΫνϟ • ϞϊϦεͷେنԽͱݶք • อकੑɺ։ൃޮɺ… • αʔϏεࢦΞʔΩςΫνϟ • ػೳΛαʔϏεͱͯ͠Γग़͢ •
αʔϏεಉ͕࢜࿈ܞ 6 Q. αʔϏεςΟεΧόϦ? ϦτϥΠ? λΠϜΞτ?
എܠɿαʔϏεϝογϡ • service-to-service ௨৴ͷࡍͰϓϩΩγΛհ͢Δ • αʔϏεϝογϡɿ L7 ϓϩΩγʹΑΔωοτϫʔΫͷநϨΠϠʔ 7 Service
A Proxy Service B Proxy Controller Data Plane Control Plane
എܠɿαʔϏεϝογϡͷ༻్ • Envoy, Linkerd • Advanced load balancing • Circuit
breaking • Rate limiting • Service discovery • Observation • Statistics • Logging • Tracing 8 ։ൃج൫ͷྖҬʹۙ͘ɺӡ༻ͷԠ༻ߟ͑Δ༨͋Γ
ίϯςφɾϚΠΫϩαʔϏεͷ࣮ӡ༻՝ 1. ϗετΩϟύγςΟϓϥϯχϯά • ίϯςφͷਫฏεέʔϧ༰қ • ͔͠͠ϗετΩϟύγςΟඞཁ 2. ίϯςφϦιʔεͷܾఆ๏ •
“desired count” ୭͕Ͳ͏ܾΊΔͷ͔ 3. αʔϏεಉ͕࢜ڠௐͨ͠εέʔϧ • ґଘઌαʔϏεͷΩϟύγςΟߟྀ͢Δඞཁ 9 ͍ͣΕΩϟύγςΟͷͱͯ͠ूͰ͖Δ
ΩϟύγςΟϓϥϯχϯάͷ͔͏͖࢟ • Site Reliability Engineering 18.2 10 ΠϯςϯτϕʔεͷΩϟύγςΟඪ Ϧιʔε੍ɾྉۚ࠷খԽͷ࠷దԽ ґଘؔɾύϑΥʔϚϯεϝτϦΫεͷѲ
ΩϟύγςΟϓϥϯχϯάͷ͔͏͖࢟ • Site Reliability Engineering 18.2 10 ΠϯςϯτϕʔεͷΩϟύγςΟඪ Ϧιʔε੍ɾྉۚ࠷খԽͷ࠷దԽ ґଘؔɾύϑΥʔϚϯεϝτϦΫεͷѲ
࠶ܝɿ࣮ӡ༻՝ 11 1. ϗετΩϟύγςΟϓϥϯχϯά 2. ίϯςφϦιʔεͷܾఆ๏ 3. αʔϏεಉ͕࢜ڠௐͨ͠εέʔϧ
࠶ܝɿ࣮ӡ༻՝ 11 1. ϗετΩϟύγςΟϓϥϯχϯά 2. ίϯςφϦιʔεͷܾఆ๏ 3. αʔϏεಉ͕࢜ڠௐͨ͠εέʔϧ جૅݕ౼ɿ •
ύϑΥʔϚϯεఆྔԽ
՝Πϝʔδɿఆঢ়گ 12 A B C user facing internal × 2
× 1 × 1 D System × 1
՝Πϝʔδ 1 13 A B C user facing internal ×
2 × 1 × 1 D System × 1 desired count: 6 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ desired count: 3
՝Πϝʔδ 1 13 A B C user facing internal ×
2 × 1 × 1 D System × 1 desired count: 6 × 6 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ × 3 desired count: 3 C ͷߟྀ࿙Ε
՝Πϝʔδ 1 13 A B C user facing internal ×
2 × 1 × 1 D System × 1 desired count: 6 × 6 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ C × 3 desired count: 3 C ͷߟྀ࿙Ε
՝Πϝʔδ 1 13 A B C user facing internal ×
2 × 1 × 1 D System × 1 desired count: 6 × 6 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ C A × 3 desired count: 3 C ͷߟྀ࿙Ε
՝Πϝʔδ 2 14 A B C user facing internal ×
2 × 1 × 1 D System × 1 desired count: 6 desired count: 3 desired count: 3 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ
՝Πϝʔδ 2 14 A B C user facing internal ×
2 × 1 × 1 D System × 1 desired count: 6 × 6 × 3 × 3 desired count: 3 desired count: 3 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ D ͷߟྀ࿙Ε
՝Πϝʔδ 2 14 A B C user facing internal ×
2 × 1 × 1 D System × 1 desired count: 6 × 6 B × 3 × 3 desired count: 3 desired count: 3 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ D ͷߟྀ࿙Ε
՝Πϝʔδ 2 14 A B C user facing internal ×
2 × 1 × 1 D System × 1 desired count: 6 × 6 B A × 3 × 3 desired count: 3 desired count: 3 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ D ͷߟྀ࿙Ε
ղܾΠϝʔδɿఆঢ়گ 15 A 100 rps/container A B C user facing
internal × 2 × 1 × 1 D System × 1 B 200 rps/container C 100 rps/container D 100 rps/container 0.4 0.2 0.7
ղܾΠϝʔδ 16 A 100 rps/container A B C user facing
internal × 2 × 1 × 1 D System × 1 B 200 rps/container C 100 rps/container D 100 rps/container 0.4 0.2 0.7 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ
ղܾΠϝʔδ 16 A 100 rps/container A B C user facing
internal × 2 × 1 × 1 D System × 1 B 200 rps/container C 100 rps/container D 100 rps/container 0.4 0.2 0.7 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ × 6
ղܾΠϝʔδ 16 A 100 rps/container A B C user facing
internal × 2 × 1 × 1 D System × 1 B 200 rps/container C 100 rps/container D 100 rps/container 0.4 0.2 0.7 ex. ϐʔΫ࣌ A ͷෛՙ 3 ഒ (100 * 6 * 0.4 + 100 * 0.7) / 200 ~ 2 (100 * 6 * 0.2) / 100 ~ 2 × 6 × 2 × 2 ύϑΥʔϚϯε + ґଘ → ඞཁ࠷ݶͷίϯςφ
ɿύϑΥʔϚϯεఆྔԽͱґଘ • ఆྔԽʹؔͯ͠ • ͲͷΑ͏ʹଌఆ͢Δ͔ • ϝτϦΫε rps ͰΑ͍͔ •
ͲͷΤϯυϙΠϯτʹର͢Δ rps ͔ • ϦΫΤετύλʔϯ prod ͱಉҰ͔ • αʔϏεͷґଘͲ͏ܾ·Δ͔ 17 ύϑΥʔϚϯεࣗಈଌఆɾఆྔԽͷ࣮ྫ͕গͳ͍
ɿύϑΥʔϚϯεఆྔԽͱґଘ • ఆྔԽʹؔͯ͠ • ͲͷΑ͏ʹଌఆ͢Δ͔ • ϝτϦΫε rps ͰΑ͍͔ •
ͲͷΤϯυϙΠϯτʹର͢Δ rps ͔ • ϦΫΤετύλʔϯ prod ͱಉҰ͔ • αʔϏεͷґଘͲ͏ܾ·Δ͔ 17 ύϑΥʔϚϯεࣗಈଌఆɾఆྔԽͷ࣮ྫ͕গͳ͍ αʔϏεϝογϡ + shadowing
αʔϏεϝογϡߏྫ 18 front-proxy Request B proxy A proxy stats discovery
• connection / request count • 1xx, 2xx, … 5xx count • etc. per proxy metrics:
shadowing 19 front-proxy Request B proxy A proxy Prod Shadow
# routes { "virtual_hosts": [ { "name": "service_a", "domains": [ "*" ], "routes": [ { "prefix": "/", "cluster": "service_a", "shadow": { "cluster": "service_a_prime" } ... A’ B’ strage front->A: GET / front->A: GET /users A->B: GET /awesome_process record log shadow shadow
ύϑΥʔϚϯεଌఆҊ • ೖྗ • ϦΫΤετύεॏΈ d • ಉ࣌ଓ c •
ϨεϙϯελΠϜඪ r • ग़ྗɿϝτϦΫε • d ʹैͬͨ WRR, c ฒྻͰϦΫΤετ 20 RTT ≦ r Ҏ͔ͭ 5xx Ҏ֎ͷϦΫΤετ ୯Ґ࣌ؒ rps = GET / 0.3 GET /users 0.2 POST /users 0.2 GET /depends_b 0.1 …
·ͱΊ • ϚΠΫϩαʔϏεʹ͏αʔϏεϝογϡͷଘࡏ • ࣮ӡ༻ͷԠ༻͕ݕ౼༨͋Γ • ࠷ߴͷΩϟύγςΟϓϥϯχϯάͷ͕͔Γ ͱͯ͠ɺύϑΥʔϚϯεఆྔԽʹண • αʔϏεϝογϡͰϦΫΤετύεॏΈ
Λࢉग़͠ϝτϦΫεΛܭࢉ • ύϑΥʔϚϯε࣮ଌɺґଘࠓޙͷ՝ 21