Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
RPC Metrics at Google
Search
JBD
August 09, 2018
Programming
2
530
RPC Metrics at Google
JBD
August 09, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.8k
Debugging Code Generation in Go
rakyll
5
1.5k
Are you ready for production?
rakyll
8
2.7k
Servers are doomed to fail
rakyll
3
1.5k
Serverless Containers
rakyll
1
240
Critical Path Analysis
rakyll
0
530
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
HTTP compression in PHP and Symfony apps
dunglas
2
1.7k
わたしの星のままで一番星になる ~ 出産を機にSIerからEC事業会社に転職した話 ~
kimura_m_29
0
180
Keeping it Ruby: Why Your Product Needs a Ruby SDK - RubyWorld 2024
envek
0
180
これが俺の”自分戦略” プロセスを楽しんでいこう! - Developers CAREER Boost 2024
niftycorp
PRO
0
190
Mermaid x AST x 生成AI = コードとドキュメントの完全同期への道
shibuyamizuho
0
160
Symfony Mapper Component
soyuka
2
730
DevFest Tokyo 2025 - Flutter のアプリアーキテクチャ現在地点
wasabeef
5
900
CQRS+ES の力を使って効果を感じる / Feel the effects of using the power of CQRS+ES
seike460
PRO
0
120
The Efficiency Paradox and How to Save Yourself and the World
hollycummins
1
440
責務を分離するための例外設計 - PHPカンファレンス 2024
kajitack
1
390
あれやってみてー駆動から成長を加速させる / areyattemite-driven
nashiusagi
1
200
ゆるやかにgolangci-lintのルールを強くする / Kyoto.go #56
utgwkk
1
370
Featured
See All Featured
BBQ
matthewcrist
85
9.4k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
26
1.9k
Product Roadmaps are Hard
iamctodd
PRO
49
11k
Large-scale JavaScript Application Architecture
addyosmani
510
110k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
226
22k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
48
2.2k
Optimizing for Happiness
mojombo
376
70k
4 Signs Your Business is Dying
shpigford
181
21k
Building Flexible Design Systems
yeseniaperezcruz
327
38k
Designing for Performance
lara
604
68k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
95
17k
Java REST API Framework Comparison - PWX 2021
mraible
PRO
28
8.3k
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that
there was an outage."
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store
@rakyll Questions infra teams want to ask: • Are we
meeting the SLO for the other team? • What’s the impact of a product on infra? • How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency
distribution for RPCs originated at Google Analytics. • Requests take took more than 100ms for the customer #123. • Compare the request latency initiated at web vs mobile frontend.
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store originator=analytics; ...
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import “cloud.google.com/go/pubsub”
@rakyll +
Thank you! JBD, Google
[email protected]
@rakyll