Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Monitoring JUST EAT on AWS
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
Peter Mounce
April 24, 2015
Technology
0
140
Monitoring JUST EAT on AWS
Or, why we didn't just use CloudWatch.
Peter Mounce
April 24, 2015
Tweet
Share
More Decks by Peter Mounce
See All by Peter Mounce
Modern Monitoring for .NET
petemounce
0
170
Embracing DevOps at JUST EAT, within a Microsoft platform
petemounce
1
340
Other Decks in Technology
See All in Technology
君はジョシュアツリーを知っているか?名前をつけて事象を正しく認識しよう / Do you know Joshua Tree?
ykanoh
3
110
スピンアウト講座01_GitHub管理
overflowinc
0
1.2k
DMBOKを使ってレバレジーズのデータマネジメントを評価した
leveragestech
0
160
ADK + Gemini Enterprise で 外部 API 連携エージェント作るなら OAuth の仕組みを理解しておこう
kaz1437
0
150
建設DXを支えるANDPAD: 2025年のセキュリティの取り組みと卒業したいセキュリティ
andpad
0
160
Kiro Meetup #7 Kiro アップデート (2025/12/15〜2026/3/20)
katzueno
2
240
_Architecture_Modernization_から学ぶ現状理解から設計への道のり.pdf
satohjohn
2
710
Phase02_AI座学_応用
overflowinc
0
2.5k
Phase04_ターミナル基礎
overflowinc
0
2k
SLI/SLO 導入で 避けるべきこと3選
yagikota
0
140
20年以上続く PHP 大規模プロダクトを Kubernetes へ ── クラウド基盤刷新プロジェクトの4年間
oogfranz
PRO
0
160
Goのerror型がシンプルであることの恩恵について理解する
yamatai1212
1
300
Featured
See All Featured
B2B Lead Gen: Tactics, Traps & Triumph
marketingsoph
0
84
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
55
3.3k
sira's awesome portfolio website redesign presentation
elsirapls
0
200
Facilitating Awesome Meetings
lara
57
6.8k
How to Talk to Developers About Accessibility
jct
2
160
Have SEOs Ruined the Internet? - User Awareness of SEO in 2025
akashhashmi
0
300
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1.2k
How People are Using Generative and Agentic AI to Supercharge Their Products, Projects, Services and Value Streams Today
helenjbeal
1
140
The Illustrated Guide to Node.js - THAT Conference 2024
reverentgeek
1
320
The B2B funnel & how to create a winning content strategy
katarinadahlin
PRO
1
310
Building a Modern Day E-commerce SEO Strategy
aleyda
45
9k
Documentation Writing (for coders)
carmenintech
77
5.3k
Transcript
Monitoring JUST EAT on AWS (Or: why we didn’t just
use AWS CloudWatch) Peter Mounce @petemounce / @justeat_tech
What did we want? Peter Mounce @petemounce / @justeat_tech One
source of truth Alerts that fire in (hopefully) a few seconds Data we can keep for a long time Data we can get rid of when we want
What did we end up with? Harvests OS-level perf-counters into
statsd Apps publish their own metrics where they choose Publishers: PerfTap + app-specific Peter Mounce @petemounce / @justeat_tech
What did we end up with? Send metrics over UDP:
timers.uk.paymentsapi.checkout.200.005.eu-west-1.a:343|ms Receiver: StatsD (by Etsy) Peter Mounce @petemounce / @justeat_tech
What did we end up with? Aggregator: Graphite Peter Mounce
@petemounce / @justeat_tech
What did we end up with? Check-runner / alerter: Seyren
Peter Mounce @petemounce / @justeat_tech
What did we end up with? absolute(diffSeries(movingAverage(sumSeries(stats_counts.consumercommunicationservice. uk.*.event-*.reaction-savetoken.*.eu-west-1.*),50),movingAverage(sumSeries(stats. timers.api-consumer.asp-net-responses.*authorizetoken.put.200.*.*.*.count,stats. timers.api-consumer.asp-net-responses.loginuser.post.200.*.*.*.count,stats.timers.api-
consumer.asp-net-responses.create.post.201.*.*.*.count),50))) Just kidding. Example alert Peter Mounce @petemounce / @justeat_tech
What did we end up with? absolute( diffSeries( movingAverage( sumSeries(
stats_counts.consumercommunicationservice.uk.*.event-*.reaction-savetoken.*.eu-west-1.*) ,50), movingAverage( sumSeries( stats.timers.api-consumer.asp-net-responses.*authorizetoken.put.200.*.*.*.count, stats.timers.api-consumer.asp-net-responses.loginuser.post.200.*.*.*.count, stats.timers.api-consumer.asp-net-responses.create.post.201.*.*.*.count ) ,50) ) ) Example alert (comprehensible) Peter Mounce @petemounce / @justeat_tech
What did we end up with? • PagerDuty • Grafana
• HipChat Some other stuff too Peter Mounce @petemounce / @justeat_tech
What does it look like? Peter Mounce @petemounce / @justeat_tech
Diagram credit
What does it cost? Peter Mounce @petemounce / @justeat_tech Graphite
+ whisper 1x m3.2xlarge, 12x 1TB @ 500 PIOPs StatsD 1x m3.xlarge Carbon-relay 1x m3.xlarge Seyren 1x c3.xlarge Grafana S3 website PagerDuty somebody else’s problem ;-) Buys: 200k metrics / sec & alarm latency around 2min
What did we gain? Graphite has more analysis functions than
CloudWatch does. Graphite: ~100 CloudWatch: 5…? Rich set of data analysis functions Peter Mounce @petemounce / @justeat_tech
What did we gain? CloudWatch - retains data for 2
weeks … or until shortly after resources are terminated … so we would need to archive data ourselves Capability for historical analysis Peter Mounce @petemounce / @justeat_tech
What did we gain? CloudWatch • 1 min granularity •
~2 min latency (CloudWatch::DynamoDB - 5 min granularity on CCU) Our MTR-React is shorter Peter Mounce @petemounce / @justeat_tech
Happiness! (Mostly) Peter Mounce @petemounce / @justeat_tech
We’re recruiting! http://tech.just-eat.com/jobs Peter Mounce @petemounce / @justeat_tech