Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
RPC Metrics at Google
Search
JBD
August 09, 2018
Programming
2
540
RPC Metrics at Google
JBD
August 09, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.8k
Debugging Code Generation in Go
rakyll
5
1.5k
Are you ready for production?
rakyll
8
2.7k
Servers are doomed to fail
rakyll
3
1.5k
Serverless Containers
rakyll
1
240
Critical Path Analysis
rakyll
0
540
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
Итераторы в Go 1.23: зачем они нужны, как использовать, и насколько они быстрые?
lamodatech
0
1.4k
Jaspr Dart Web Framework 박제창 @Devfest 2024
itsmedreamwalker
0
150
PHPカンファレンス 2024|共創を加速するための若手の技術挑戦
weddingpark
0
140
ASP.NET Core の OpenAPIサポート
h455h1
0
110
技術的負債と向き合うカイゼン活動を1年続けて分かった "持続可能" なプロダクト開発
yuichiro_serita
0
300
「とりあえず動く」コードはよい、「読みやすい」コードはもっとよい / Code that 'just works' is good, but code that is 'readable' is even better.
mkmk884
6
1.4k
chibiccをCILに移植した結果 (NGK2025S版)
kekyo
PRO
0
130
混沌とした例外処理とエラー監視に秩序をもたらす
morihirok
13
2.2k
PSR-15 はあなたのための ものではない? - phpcon2024
myamagishi
0
400
DevFest - Serverless 101 with Google Cloud Functions
tunmise
0
140
Findy Team+ Awardを受賞したかった!ベストプラクティス応募内容をふりかえり、開発生産性向上もふりかえる / Findy Team Plus Award BestPractice and DPE Retrospective 2024
honyanya
0
140
週次リリースを実現するための グローバルアプリ開発
tera_ny
1
1.2k
Featured
See All Featured
Docker and Python
trallard
43
3.2k
Bootstrapping a Software Product
garrettdimon
PRO
305
110k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
365
25k
The Invisible Side of Design
smashingmag
299
50k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
49k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
3.6k
Building a Scalable Design System with Sketch
lauravandoore
460
33k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
29
960
The Cost Of JavaScript in 2023
addyosmani
46
7.2k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
33
2.7k
Building Better People: How to give real-time feedback that sticks.
wjessup
366
19k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
232
17k
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that
there was an outage."
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store
@rakyll Questions infra teams want to ask: • Are we
meeting the SLO for the other team? • What’s the impact of a product on infra? • How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency
distribution for RPCs originated at Google Analytics. • Requests take took more than 100ms for the customer #123. • Compare the request latency initiated at web vs mobile frontend.
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store originator=analytics; ...
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import “cloud.google.com/go/pubsub”
@rakyll +
Thank you! JBD, Google
[email protected]
@rakyll