Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
RPC Metrics at Google
Search
JBD
August 09, 2018
Programming
2
590
RPC Metrics at Google
JBD
August 09, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.8k
Servers are doomed to fail
rakyll
3
1.5k
Serverless Containers
rakyll
1
260
Critical Path Analysis
rakyll
0
630
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
どの様にAIエージェントと 協業すべきだったのか?
takefumiyoshii
2
610
ソフトウェア設計の実践的な考え方
masuda220
PRO
3
490
止められない医療アプリ、そっと Swift 6 へ
medley
1
120
開発生産性を上げるための生成AI活用術
starfish719
1
170
After go func(): Goroutines Through a Beginner’s Eye
97vaibhav
0
230
CSC509 Lecture 04
javiergs
PRO
0
300
iOSアプリの信頼性を向上させる取り組み/ios-app-improve-reliability
shino8rayu9
0
150
Local Peer-to-Peer APIはどのように使われていくのか?
hal_spidernight
2
450
フロントエンド開発に役立つクライアントプログラム共通のノウハウ / Universal client-side programming best practices for frontend development
nrslib
7
3.9k
CSC509 Lecture 03
javiergs
PRO
0
330
LLMとPlaywright/reg-suitを活用した jQueryリファクタリングの実際
kinocoboy2
4
670
ABEMAモバイルアプリが Kotlin Multiplatformと歩んだ5年 ─ 導入と運用、成功と課題 / iOSDC 2025
akkyie
0
330
Featured
See All Featured
Testing 201, or: Great Expectations
jmmastey
45
7.7k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
188
55k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3.1k
Writing Fast Ruby
sferik
629
62k
Learning to Love Humans: Emotional Interface Design
aarron
274
40k
Raft: Consensus for Rubyists
vanstee
139
7.1k
Why You Should Never Use an ORM
jnunemaker
PRO
59
9.6k
Stop Working from a Prison Cell
hatefulcrawdad
271
21k
For a Future-Friendly Web
brad_frost
180
9.9k
Site-Speed That Sticks
csswizardry
11
880
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
9.7k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
16k
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that
there was an outage."
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store
@rakyll Questions infra teams want to ask: • Are we
meeting the SLO for the other team? • What’s the impact of a product on infra? • How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency
distribution for RPCs originated at Google Analytics. • Requests take took more than 100ms for the customer #123. • Compare the request latency initiated at web vs mobile frontend.
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store originator=analytics; ...
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import “cloud.google.com/go/pubsub”
@rakyll +
Thank you! JBD, Google
[email protected]
@rakyll