Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Monitoring JUST EAT on AWS
Search
Peter Mounce
April 24, 2015
Technology
0
120
Monitoring JUST EAT on AWS
Or, why we didn't just use CloudWatch.
Peter Mounce
April 24, 2015
Tweet
Share
More Decks by Peter Mounce
See All by Peter Mounce
Modern Monitoring for .NET
petemounce
0
160
Embracing DevOps at JUST EAT, within a Microsoft platform
petemounce
1
280
Other Decks in Technology
See All in Technology
Git 研修 Advanced【MIXI 24新卒技術研修】
mixi_engineers
PRO
0
200
公共領域から学ぶ クラウド移行についてエンジニアが意識していること
kawakawa2222
0
140
データ分析を支える技術 生成AI再入門
ishikawa_satoru
0
380
データベース研修 分析向けSQL入門【MIXI 24新卒技術研修】
mixi_engineers
PRO
0
110
AIエージェントを現場に導入する目線とは
masahiro_nishimi
1
1.5k
大規模ドラレコデータ収集・機械学習基盤を支える AWS CDK 〜導入・運用事例紹介〜
pemugi
0
110
コンテナ・K8s研修 - 後半 Kubernetes 基礎&ハンズオン【MIXI 24新卒技術研修】
mixi_engineers
PRO
1
120
クラウド利用者の「責任」をどう果たす?AWSセキュリティ対策のススメ #AWSSummit
hiashisan
0
280
What if...? 처음부터 다시 LLM 어플리케이션을 개발한다면
huffon
0
1k
[2024最新版]AWS Control Towerを使ったセキュアなマルチアカウント環境の作り方
hiashisan
0
270
AWSでRAGを作る法方
sonoda_mj
1
140
サーバーレスAPI(API Gateway+Lambda)とNext.jsで 個人ブログを作ろう!
shuntaka
PRO
0
560
Featured
See All Featured
Designing Experiences People Love
moore
136
23k
4 Signs Your Business is Dying
shpigford
178
21k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
277
13k
Principles of Awesome APIs and How to Build Them.
keavy
124
16k
Visualization
eitanlees
139
14k
Fashionably flexible responsive web design (full day workshop)
malarkey
399
65k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
36
9.1k
The Invisible Customer
myddelton
117
13k
How to Think Like a Performance Engineer
csswizardry
4
590
Reflections from 52 weeks, 52 projects
jeffersonlam
346
19k
Fantastic passwords and where to find them - at NoRuKo
philnash
42
2.7k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
353
29k
Transcript
Monitoring JUST EAT on AWS (Or: why we didn’t just
use AWS CloudWatch) Peter Mounce @petemounce / @justeat_tech
What did we want? Peter Mounce @petemounce / @justeat_tech One
source of truth Alerts that fire in (hopefully) a few seconds Data we can keep for a long time Data we can get rid of when we want
What did we end up with? Harvests OS-level perf-counters into
statsd Apps publish their own metrics where they choose Publishers: PerfTap + app-specific Peter Mounce @petemounce / @justeat_tech
What did we end up with? Send metrics over UDP:
timers.uk.paymentsapi.checkout.200.005.eu-west-1.a:343|ms Receiver: StatsD (by Etsy) Peter Mounce @petemounce / @justeat_tech
What did we end up with? Aggregator: Graphite Peter Mounce
@petemounce / @justeat_tech
What did we end up with? Check-runner / alerter: Seyren
Peter Mounce @petemounce / @justeat_tech
What did we end up with? absolute(diffSeries(movingAverage(sumSeries(stats_counts.consumercommunicationservice. uk.*.event-*.reaction-savetoken.*.eu-west-1.*),50),movingAverage(sumSeries(stats. timers.api-consumer.asp-net-responses.*authorizetoken.put.200.*.*.*.count,stats. timers.api-consumer.asp-net-responses.loginuser.post.200.*.*.*.count,stats.timers.api-
consumer.asp-net-responses.create.post.201.*.*.*.count),50))) Just kidding. Example alert Peter Mounce @petemounce / @justeat_tech
What did we end up with? absolute( diffSeries( movingAverage( sumSeries(
stats_counts.consumercommunicationservice.uk.*.event-*.reaction-savetoken.*.eu-west-1.*) ,50), movingAverage( sumSeries( stats.timers.api-consumer.asp-net-responses.*authorizetoken.put.200.*.*.*.count, stats.timers.api-consumer.asp-net-responses.loginuser.post.200.*.*.*.count, stats.timers.api-consumer.asp-net-responses.create.post.201.*.*.*.count ) ,50) ) ) Example alert (comprehensible) Peter Mounce @petemounce / @justeat_tech
What did we end up with? • PagerDuty • Grafana
• HipChat Some other stuff too Peter Mounce @petemounce / @justeat_tech
What does it look like? Peter Mounce @petemounce / @justeat_tech
Diagram credit
What does it cost? Peter Mounce @petemounce / @justeat_tech Graphite
+ whisper 1x m3.2xlarge, 12x 1TB @ 500 PIOPs StatsD 1x m3.xlarge Carbon-relay 1x m3.xlarge Seyren 1x c3.xlarge Grafana S3 website PagerDuty somebody else’s problem ;-) Buys: 200k metrics / sec & alarm latency around 2min
What did we gain? Graphite has more analysis functions than
CloudWatch does. Graphite: ~100 CloudWatch: 5…? Rich set of data analysis functions Peter Mounce @petemounce / @justeat_tech
What did we gain? CloudWatch - retains data for 2
weeks … or until shortly after resources are terminated … so we would need to archive data ourselves Capability for historical analysis Peter Mounce @petemounce / @justeat_tech
What did we gain? CloudWatch • 1 min granularity •
~2 min latency (CloudWatch::DynamoDB - 5 min granularity on CCU) Our MTR-React is shorter Peter Mounce @petemounce / @justeat_tech
Happiness! (Mostly) Peter Mounce @petemounce / @justeat_tech
We’re recruiting! http://tech.just-eat.com/jobs Peter Mounce @petemounce / @justeat_tech