RPC Metrics at Google
JBD
August 09, 2018
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that there was an outage."
@rakyll SLOs: a principled way of saying what level of downtime is acceptable.
• Error rate
• Latency expectations
@rakyll [Diagram: Analytics frontend server, Authentication, Reporting, Users, ..., Spanner, Blob Store]
@rakyll Questions infra teams want to ask:
• Are we meeting the SLO for the other team?
• What's the impact of a product on infra?
• How much do we need to scale up if the product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways:
• Latency distribution for RPCs originating at Google Analytics.
• Requests that took more than 100ms for customer #123.
• Compare the latency of requests initiated at the web vs. the mobile frontend.
@rakyll [Diagram: Analytics frontend server, Authentication, Reporting, Users, ..., Spanner, Blob Store; requests carry the tag originator=analytics; ...]
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpcz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import "cloud.google.com/go/pubsub"
Thank you! JBD, Google
[email protected]
@rakyll