Monitoring JUST EAT on AWS
Or, why we didn't just use CloudWatch.
Peter Mounce
April 24, 2015
Transcript
Monitoring JUST EAT on AWS (Or: why we didn’t just use AWS CloudWatch)
Peter Mounce @petemounce / @justeat_tech
What did we want?
One source of truth
Alerts that fire in (hopefully) a few seconds
Data we can keep for a long time
Data we can get rid of when we want
What did we end up with?
Publishers: PerfTap + app-specific
PerfTap harvests OS-level perf-counters into StatsD
Apps publish their own metrics where they choose
What did we end up with?
Receiver: StatsD (by Etsy)
Send metrics over UDP:
timers.uk.paymentsapi.checkout.200.005.eu-west-1.a:343|ms
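The wire format on this slide is a StatsD timer: a bucket name, a colon, a value, and the `|ms` type suffix, sent as a single UDP datagram. A minimal sketch of a publisher, assuming a StatsD listener on its conventional port 8125; the function names and the host `statsd.example.internal` are hypothetical and not taken from the deck:

```python
import socket


def format_timer(bucket: str, millis: int) -> bytes:
    """Encode one StatsD timer sample as '<bucket>:<value>|ms'."""
    return f"{bucket}:{millis}|ms".encode("ascii")


def send_metric(payload: bytes, host: str, port: int = 8125) -> None:
    """Send one datagram and return immediately.

    UDP gives no delivery guarantee, so a lost packet is a lost sample;
    in exchange the caller never blocks on the metrics pipeline.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# The sample shown on the slide: a 343 ms timing for the checkout bucket.
payload = format_timer(
    "timers.uk.paymentsapi.checkout.200.005.eu-west-1.a", 343
)

# Hypothetical host; uncomment to send against a real StatsD instance.
# send_metric(payload, "statsd.example.internal")
```

Because the send is fire-and-forget, an app can publish metrics "where it chooses" without making request latency depend on the monitoring stack being healthy.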
What did we end up with?
Aggregator: Graphite
What did we end up with?
Check-runner / alerter: Seyren
What did we end up with?
Example alert
absolute(diffSeries(movingAverage(sumSeries(stats_counts.consumercommunicationservice.uk.*.event-*.reaction-savetoken.*.eu-west-1.*),50),movingAverage(sumSeries(stats.timers.api-consumer.asp-net-responses.*authorizetoken.put.200.*.*.*.count,stats.timers.api-consumer.asp-net-responses.loginuser.post.200.*.*.*.count,stats.timers.api-consumer.asp-net-responses.create.post.201.*.*.*.count),50)))
Just kidding.
What did we end up with?
Example alert (comprehensible)
absolute(
  diffSeries(
    movingAverage(
      sumSeries(
        stats_counts.consumercommunicationservice.uk.*.event-*.reaction-savetoken.*.eu-west-1.*
      ),
      50
    ),
    movingAverage(
      sumSeries(
        stats.timers.api-consumer.asp-net-responses.*authorizetoken.put.200.*.*.*.count,
        stats.timers.api-consumer.asp-net-responses.loginuser.post.200.*.*.*.count,
        stats.timers.api-consumer.asp-net-responses.create.post.201.*.*.*.count
      ),
      50
    )
  )
)
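Read from the inside out, the alert sums two groups of series, smooths each with a moving average, subtracts one from the other, and takes the absolute value, so it rises when the two rates drift apart. The sketch below models that arithmetic in plain Python. It is a simplified illustration, not Graphite's implementation: the function names mirror the Graphite ones, but the handling of missing points and of the window is an assumption, the sample numbers are invented, and the window is 2 rather than 50 to keep the example small.

```python
from typing import List, Optional, Sequence

Series = List[Optional[float]]


def sum_series(*series: Sequence[Optional[float]]) -> Series:
    """Pointwise sum across series, skipping missing (None) points."""
    out: Series = []
    for points in zip(*series):
        present = [p for p in points if p is not None]
        out.append(sum(present) if present else None)
    return out


def moving_average(series: Sequence[Optional[float]], window: int) -> Series:
    """Trailing mean over at most `window` points ending at each index."""
    out: Series = []
    for i in range(len(series)):
        chunk = [p for p in series[max(0, i - window + 1): i + 1] if p is not None]
        out.append(sum(chunk) / len(chunk) if chunk else None)
    return out


def diff_series(first: Sequence[Optional[float]],
                *rest: Sequence[Optional[float]]) -> Series:
    """Subtract every later series from the first, pointwise."""
    out: Series = []
    for points in zip(first, *rest):
        head, tail = points[0], points[1:]
        if head is None:
            out.append(None)
        else:
            out.append(head - sum(p for p in tail if p is not None))
    return out


def absolute(series: Sequence[Optional[float]]) -> Series:
    """Pointwise absolute value, preserving missing points."""
    return [abs(p) if p is not None else None for p in series]


# Invented sample data: token saves stop while API successes continue.
tokens_saved = [10, 12, 11, 0, 0]
api_successes = [[4, 5, 4, 5, 4], [3, 4, 4, 3, 4], [3, 3, 3, 3, 3]]

alert_series = absolute(
    diff_series(
        moving_average(sum_series(tokens_saved), 2),
        moving_average(sum_series(*api_successes), 2),
    )
)
```

A check-runner such as Seyren would then compare the latest value of a series like `alert_series` against warn and error thresholds; here the gap grows once the first group drops to zero.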
What did we end up with?
Some other stuff too
• PagerDuty
• Grafana
• HipChat
What does it look like?
Diagram credit
What does it cost?
Graphite + whisper: 1x m3.2xlarge, 12x 1TB @ 500 PIOPS
StatsD: 1x m3.xlarge
Carbon-relay: 1x m3.xlarge
Seyren: 1x c3.xlarge
Grafana: S3 website
PagerDuty: somebody else’s problem ;-)
Buys: 200k metrics / sec & alarm latency around 2 min
What did we gain?
Rich set of data analysis functions
Graphite has more analysis functions than CloudWatch does.
Graphite: ~100
CloudWatch: 5…?
What did we gain?
Capability for historical analysis
CloudWatch
- retains data for 2 weeks
… or until shortly after resources are terminated
… so we would need to archive data ourselves
What did we gain?
Our MTR-React is shorter
CloudWatch
• 1 min granularity
• ~2 min latency
(CloudWatch::DynamoDB - 5 min granularity on CCU)
Happiness! (Mostly)
We’re recruiting! http://tech.just-eat.com/jobs