Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Monitoring JUST EAT on AWS
Search
Peter Mounce
April 24, 2015
Technology
140
0
Share
Monitoring JUST EAT on AWS
Or, why we didn't just use CloudWatch.
Peter Mounce
April 24, 2015
More Decks by Peter Mounce
See All by Peter Mounce
Modern Monitoring for .NET
petemounce
0
170
Embracing DevOps at JUST EAT, within a Microsoft platform
petemounce
1
340
Other Decks in Technology
See All in Technology
MySQL 9.7がやってきた ~これまでのあらすじと基本情報~ @ 日本MySQLユーザ会会2026年04月 / mysql97-yattekita
sakaik
0
150
[Oracle TechNight#99] 生成AI時代のAI/ML入門 ~ AIとオラクルデータベースの関係 (後半)
oracle4engineer
PRO
1
150
Pure Intonation on Browser: Building a Sequencer with Ruby
nagachika
0
390
「SaaSの次の時代」に重要性を増すステークホルダーマネジメントの要諦 ~解像度を圧倒的に高めPdMの価値を最大化させる方法~
kakehashi
PRO
3
3.3k
ボトムアップの改善の火を灯し続けろ!〜支援現場で学んだ、消えないための3つの打ち手〜 / 20260509 Kazuki Mori
shift_evolve
PRO
0
170
Cortex Codeのコスト見積ヒントご紹介
yokatsuki
0
130
Google Cloud Next '26 の裏でこっそりリリースされたCloud Number Registry & Cloud Hub コスト分析 を試してみた
hikaru1001
0
140
Shipping AI Agents — Lessons from Production
vvatanabe
0
300
世界の中心でApp Runnerを叫ぶ FINAL
tsukuboshi
0
140
要件定義の精度を高めるための型と生成AIの活用 / Using Types and Generative AI to Improve the Accuracy of Requirements Definition
haru860
0
240
Fabric MCPの紹介と使い分け
ryomaru0825
1
100
需要創出(Chatwork)×供給(BPaaS) フライホイールとMoat 実行能力の最適配置とAI戦略
kubell_hr
0
1.6k
Featured
See All Featured
Testing 201, or: Great Expectations
jmmastey
46
8.1k
Thoughts on Productivity
jonyablonski
76
5.1k
Color Theory Basics | Prateek | Gurzu
gurzu
0
300
The untapped power of vector embeddings
frankvandijk
2
1.7k
Scaling GitHub
holman
464
140k
Hiding What from Whom? A Critical Review of the History of Programming languages for Music
tomoyanonymous
2
780
Documentation Writing (for coders)
carmenintech
77
5.3k
AI Search: Implications for SEO and How to Move Forward - #ShenzhenSEOConference
aleyda
1
1.2k
Organizational Design Perspectives: An Ontology of Organizational Design Elements
kimpetersen
PRO
1
680
Large-scale JavaScript Application Architecture
addyosmani
515
110k
Impact Scores and Hybrid Strategies: The future of link building
tamaranovitovic
0
270
WCS-LA-2024
lcolladotor
0
560
Transcript
Monitoring JUST EAT on AWS (Or: why we didn’t just
use AWS CloudWatch) Peter Mounce @petemounce / @justeat_tech
What did we want? Peter Mounce @petemounce / @justeat_tech One
source of truth Alerts that fire in (hopefully) a few seconds Data we can keep for a long time Data we can get rid of when we want
What did we end up with? Harvests OS-level perf-counters into
statsd Apps publish their own metrics where they choose Publishers: PerfTap + app-specific Peter Mounce @petemounce / @justeat_tech
What did we end up with? Send metrics over UDP:
timers.uk.paymentsapi.checkout.200.005.eu-west-1.a:343|ms Receiver: StatsD (by Etsy) Peter Mounce @petemounce / @justeat_tech
What did we end up with? Aggregator: Graphite Peter Mounce
@petemounce / @justeat_tech
What did we end up with? Check-runner / alerter: Seyren
Peter Mounce @petemounce / @justeat_tech
What did we end up with? absolute(diffSeries(movingAverage(sumSeries(stats_counts.consumercommunicationservice. uk.*.event-*.reaction-savetoken.*.eu-west-1.*),50),movingAverage(sumSeries(stats. timers.api-consumer.asp-net-responses.*authorizetoken.put.200.*.*.*.count,stats. timers.api-consumer.asp-net-responses.loginuser.post.200.*.*.*.count,stats.timers.api-
consumer.asp-net-responses.create.post.201.*.*.*.count),50))) Just kidding. Example alert Peter Mounce @petemounce / @justeat_tech
What did we end up with? absolute( diffSeries( movingAverage( sumSeries(
stats_counts.consumercommunicationservice.uk.*.event-*.reaction-savetoken.*.eu-west-1.*) ,50), movingAverage( sumSeries( stats.timers.api-consumer.asp-net-responses.*authorizetoken.put.200.*.*.*.count, stats.timers.api-consumer.asp-net-responses.loginuser.post.200.*.*.*.count, stats.timers.api-consumer.asp-net-responses.create.post.201.*.*.*.count ) ,50) ) ) Example alert (comprehensible) Peter Mounce @petemounce / @justeat_tech
What did we end up with? • PagerDuty • Grafana
• HipChat Some other stuff too Peter Mounce @petemounce / @justeat_tech
What does it look like? Peter Mounce @petemounce / @justeat_tech
Diagram credit
What does it cost? Peter Mounce @petemounce / @justeat_tech Graphite
+ whisper 1x m3.2xlarge, 12x 1TB @ 500 PIOPs StatsD 1x m3.xlarge Carbon-relay 1x m3.xlarge Seyren 1x c3.xlarge Grafana S3 website PagerDuty somebody else’s problem ;-) Buys: 200k metrics / sec & alarm latency around 2min
What did we gain? Graphite has more analysis functions than
CloudWatch does. Graphite: ~100 CloudWatch: 5…? Rich set of data analysis functions Peter Mounce @petemounce / @justeat_tech
What did we gain? CloudWatch - retains data for 2
weeks … or until shortly after resources are terminated … so we would need to archive data ourselves Capability for historical analysis Peter Mounce @petemounce / @justeat_tech
What did we gain? CloudWatch • 1 min granularity •
~2 min latency (CloudWatch::DynamoDB - 5 min granularity on CCU) Our MTR-React is shorter Peter Mounce @petemounce / @justeat_tech
Happiness! (Mostly) Peter Mounce @petemounce / @justeat_tech
We’re recruiting! http://tech.just-eat.com/jobs Peter Mounce @petemounce / @justeat_tech