Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Monitoring JUST EAT on AWS
Search
Peter Mounce
April 24, 2015
Technology
0
130
Monitoring JUST EAT on AWS
Or, why we didn't just use CloudWatch.
Peter Mounce
April 24, 2015
Tweet
Share
More Decks by Peter Mounce
See All by Peter Mounce
Modern Monitoring for .NET
petemounce
0
160
Embracing DevOps at JUST EAT, within a Microsoft platform
petemounce
1
290
Other Decks in Technology
See All in Technology
Microsoft Azure全冠になってみた ~アレを使い倒した者が試験を制す!?~/Obtained all Microsoft Azure certifications Those who use "that" to the full will win the exam! ?
yuj1osm
1
110
How to be an AWS Community Builder | 君もAWS Community Builderになろう!〜2024 冬 CB募集直前対策編?!〜
coosuke
PRO
2
2.8k
Postman と API セキュリティ / Postman and API Security
yokawasa
0
200
2024年にチャレンジしたことを振り返るぞ
mitchan
0
130
Storage Browser for Amazon S3
miu_crescent
1
130
社内イベント管理システムを1週間でAKSからACAに移行した話し
shingo_kawahara
0
180
Password-less Journey - パスキーへの移行を見据えたユーザーの準備 @ AXIES 2024
ritou
3
1.4k
Oracle Cloud Infrastructure:2024年12月度サービス・アップデート
oracle4engineer
PRO
0
160
5分でわかるDuckDB
chanyou0311
10
3.2k
Turing × atmaCup #18 - 1st Place Solution
hakubishin3
0
470
生成AIのガバナンスの全体像と現実解
fnifni
1
180
日本版とグローバル版のモバイルアプリ統合の開発の裏側と今後の展望
miichan
1
120
Featured
See All Featured
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
33
1.9k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
2
160
XXLCSS - How to scale CSS and keep your sanity
sugarenia
247
1.3M
The MySQL Ecosystem @ GitHub 2015
samlambert
250
12k
It's Worth the Effort
3n
183
28k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
365
25k
Put a Button on it: Removing Barriers to Going Fast.
kastner
59
3.6k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
17
2.2k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
95
17k
Building Flexible Design Systems
yeseniaperezcruz
327
38k
Build The Right Thing And Hit Your Dates
maggiecrowley
33
2.4k
The Art of Programming - Codeland 2020
erikaheidi
53
13k
Transcript
Monitoring JUST EAT on AWS (Or: why we didn’t just
use AWS CloudWatch) Peter Mounce @petemounce / @justeat_tech
What did we want? Peter Mounce @petemounce / @justeat_tech One
source of truth Alerts that fire in (hopefully) a few seconds Data we can keep for a long time Data we can get rid of when we want
What did we end up with? Harvests OS-level perf-counters into
statsd Apps publish their own metrics where they choose Publishers: PerfTap + app-specific Peter Mounce @petemounce / @justeat_tech
What did we end up with? Send metrics over UDP:
timers.uk.paymentsapi.checkout.200.005.eu-west-1.a:343|ms Receiver: StatsD (by Etsy) Peter Mounce @petemounce / @justeat_tech
What did we end up with? Aggregator: Graphite Peter Mounce
@petemounce / @justeat_tech
What did we end up with? Check-runner / alerter: Seyren
Peter Mounce @petemounce / @justeat_tech
What did we end up with? absolute(diffSeries(movingAverage(sumSeries(stats_counts.consumercommunicationservice. uk.*.event-*.reaction-savetoken.*.eu-west-1.*),50),movingAverage(sumSeries(stats. timers.api-consumer.asp-net-responses.*authorizetoken.put.200.*.*.*.count,stats. timers.api-consumer.asp-net-responses.loginuser.post.200.*.*.*.count,stats.timers.api-
consumer.asp-net-responses.create.post.201.*.*.*.count),50))) Just kidding. Example alert Peter Mounce @petemounce / @justeat_tech
What did we end up with? absolute( diffSeries( movingAverage( sumSeries(
stats_counts.consumercommunicationservice.uk.*.event-*.reaction-savetoken.*.eu-west-1.*) ,50), movingAverage( sumSeries( stats.timers.api-consumer.asp-net-responses.*authorizetoken.put.200.*.*.*.count, stats.timers.api-consumer.asp-net-responses.loginuser.post.200.*.*.*.count, stats.timers.api-consumer.asp-net-responses.create.post.201.*.*.*.count ) ,50) ) ) Example alert (comprehensible) Peter Mounce @petemounce / @justeat_tech
What did we end up with? • PagerDuty • Grafana
• HipChat Some other stuff too Peter Mounce @petemounce / @justeat_tech
What does it look like? Peter Mounce @petemounce / @justeat_tech
Diagram credit
What does it cost? Peter Mounce @petemounce / @justeat_tech Graphite
+ whisper 1x m3.2xlarge, 12x 1TB @ 500 PIOPs StatsD 1x m3.xlarge Carbon-relay 1x m3.xlarge Seyren 1x c3.xlarge Grafana S3 website PagerDuty somebody else’s problem ;-) Buys: 200k metrics / sec & alarm latency around 2min
What did we gain? Graphite has more analysis functions than
CloudWatch does. Graphite: ~100 CloudWatch: 5…? Rich set of data analysis functions Peter Mounce @petemounce / @justeat_tech
What did we gain? CloudWatch - retains data for 2
weeks … or until shortly after resources are terminated … so we would need to archive data ourselves Capability for historical analysis Peter Mounce @petemounce / @justeat_tech
What did we gain? CloudWatch • 1 min granularity •
~2 min latency (CloudWatch::DynamoDB - 5 min granularity on CCU) Our MTR-React is shorter Peter Mounce @petemounce / @justeat_tech
Happiness! (Mostly) Peter Mounce @petemounce / @justeat_tech
We’re recruiting! http://tech.just-eat.com/jobs Peter Mounce @petemounce / @justeat_tech