Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
RPC Metrics at Google
Search
JBD
August 09, 2018
Programming
2
520
RPC Metrics at Google
JBD
August 09, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.8k
Debugging Code Generation in Go
rakyll
5
1.5k
Are you ready for production?
rakyll
8
2.7k
Servers are doomed to fail
rakyll
3
1.5k
Serverless Containers
rakyll
1
240
Critical Path Analysis
rakyll
0
520
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
「今のプロジェクトいろいろ大変なんですよ、app/services とかもあって……」/After Kaigi on Rails 2024 LT Night
junk0612
5
2.1k
3 Effective Rules for Using Signals in Angular
manfredsteyer
PRO
0
100
3 Effective Rules for Using Signals in Angular
manfredsteyer
PRO
1
100
色々なIaCツールを実際に触って比較してみる
iriikeita
0
330
みんなでプロポーザルを書いてみた
yuriko1211
0
260
弊社の「意識チョット低いアーキテクチャ」10選
texmeijin
5
24k
ペアーズにおけるAmazon Bedrockを⽤いた障害対応⽀援 ⽣成AIツールの導⼊事例 @ 20241115配信AWSウェビナー登壇
fukubaka0825
6
1.9k
Streams APIとTCPフロー制御 / Web Streams API and TCP flow control
tasshi
2
350
cmp.Or に感動した
otakakot
2
140
광고 소재 심사 과정에 AI를 도입하여 광고 서비스 생산성 향상시키기
kakao
PRO
0
170
シェーダーで魅せるMapLibreの動的ラスタータイル
satoshi7190
1
480
TypeScriptでライブラリとの依存を限定的にする方法
tutinoko
2
660
Featured
See All Featured
Optimizing for Happiness
mojombo
376
70k
The Straight Up "How To Draw Better" Workshop
denniskardys
232
140k
VelocityConf: Rendering Performance Case Studies
addyosmani
325
24k
GraphQLとの向き合い方2022年版
quramy
43
13k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
506
140k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
329
21k
Fontdeck: Realign not Redesign
paulrobertlloyd
82
5.2k
Music & Morning Musume
bryan
46
6.2k
4 Signs Your Business is Dying
shpigford
180
21k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
169
50k
Code Reviewing Like a Champion
maltzj
520
39k
Become a Pro
speakerdeck
PRO
25
5k
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that
there was an outage."
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store
@rakyll Questions infra teams want to ask: • Are we
meeting the SLO for the other team? • What’s the impact of a product on infra? • How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency
distribution for RPCs originated at Google Analytics. • Requests take took more than 100ms for the customer #123. • Compare the request latency initiated at web vs mobile frontend.
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store originator=analytics; ...
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import “cloud.google.com/go/pubsub”
@rakyll +
Thank you! JBD, Google
[email protected]
@rakyll