Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
RPC Metrics at Google
Search
JBD
August 09, 2018
Programming
2
580
RPC Metrics at Google
JBD
August 09, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.8k
Servers are doomed to fail
rakyll
3
1.5k
Serverless Containers
rakyll
1
260
Critical Path Analysis
rakyll
0
620
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
[FEConf 2025] 모노레포 절망편, 14개 레포로 부활하기까지 걸린 1년
mmmaxkim
0
990
CEDEC2025 長期運営ゲームをあと10年続けるための0から始める自動テスト ~4000項目を50%自動化し、月1→毎日実行にした3年間~
akatsukigames_tech
0
150
開発チーム・開発組織の設計改善スキルの向上
masuda220
PRO
13
7.8k
自作OSでDOOMを動かしてみた
zakki0925224
1
1.4k
GUI操作LLMの最新動向: UI-TARSと関連論文紹介
kfujikawa
0
1k
物語を動かす行動"量" #エンジニアニメ
konifar
14
5.5k
Honoアップデート 2025年夏
yusukebe
1
850
未来を拓くAI技術〜エージェント開発とAI駆動開発〜
leveragestech
2
180
「リーダーは意思決定する人」って本当?~ 学びを現場で活かす、リーダー4ヶ月目の試行錯誤 ~
marina1017
0
240
UbieのAIパートナーを支えるコンテキストエンジニアリング実践
syucream
2
710
kiroでゲームを作ってみた
iriikeita
0
180
Flutter로 Gemini와 MCP를 활용한 Agentic App 만들기 - 박제창 2025 I/O Extended Seoul
itsmedreamwalker
0
150
Featured
See All Featured
Building Applications with DynamoDB
mza
96
6.6k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
26k
The Power of CSS Pseudo Elements
geoffreycrofte
77
5.9k
Practical Orchestrator
shlominoach
190
11k
The World Runs on Bad Software
bkeepers
PRO
70
11k
Unsuck your backbone
ammeep
671
58k
How To Stay Up To Date on Web Technology
chriscoyier
790
250k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
Connecting the Dots Between Site Speed, User Experience & Your Business [WebExpo 2025]
tammyeverts
8
480
The Cost Of JavaScript in 2023
addyosmani
53
8.8k
Site-Speed That Sticks
csswizardry
10
790
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that
there was an outage."
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store
@rakyll Questions infra teams want to ask: • Are we
meeting the SLO for the other team? • What’s the impact of a product on infra? • How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency
distribution for RPCs originated at Google Analytics. • Requests take took more than 100ms for the customer #123. • Compare the request latency initiated at web vs mobile frontend.
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store originator=analytics; ...
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import “cloud.google.com/go/pubsub”
@rakyll +
Thank you! JBD, Google
[email protected]
@rakyll