Observability_at_Google_--_OSCON.pdf
JBD
July 23, 2018
Programming
Transcript
Observability at Google JBD, Google (@rakyll)
History
• Long history of distributed systems
• 10ks of different services built by 100s of teams
• Many backends/analysis tools invented here ™
100% availability (is a lie)
“A service is available if users cannot tell there is an outage.”
“Google Load Balancers are available if users cannot tell there is an outage.”
SLOs
Principled way of saying what level of downtime is acceptable.
• Error rate
• Latency expectations
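As a rough illustration of the two SLO dimensions above, the following Go sketch checks a batch of requests against an error-rate target and a latency expectation. The `SLO` and `Request` types, their field names, and the sample numbers are assumptions made for this example; they do not come from the talk or from any Google tooling.

```go
package main

import "fmt"

// SLO is a hypothetical objective covering the two dimensions on the slide.
type SLO struct {
	MaxErrorRate    float64 // largest acceptable fraction of failed requests
	LatencyBoundMs  float64 // a request counts as "fast" if it finishes within this bound
	MinFastFraction float64 // smallest acceptable fraction of fast requests
}

// Request is one observed request outcome (hypothetical shape).
type Request struct {
	LatencyMs float64
	Failed    bool
}

// ErrorRate returns the fraction of requests that failed.
func ErrorRate(reqs []Request) float64 {
	if len(reqs) == 0 {
		return 0
	}
	failed := 0
	for _, r := range reqs {
		if r.Failed {
			failed++
		}
	}
	return float64(failed) / float64(len(reqs))
}

// FastFraction returns the fraction of requests that finished within boundMs.
func FastFraction(reqs []Request, boundMs float64) float64 {
	if len(reqs) == 0 {
		return 1
	}
	fast := 0
	for _, r := range reqs {
		if r.LatencyMs <= boundMs {
			fast++
		}
	}
	return float64(fast) / float64(len(reqs))
}

// Met reports whether both the error-rate and the latency expectation hold.
func (s SLO) Met(reqs []Request) bool {
	return ErrorRate(reqs) <= s.MaxErrorRate &&
		FastFraction(reqs, s.LatencyBoundMs) >= s.MinFastFraction
}

// sample returns made-up data: three fast successes and one slow failure.
func sample() []Request {
	return []Request{
		{LatencyMs: 20},
		{LatencyMs: 35},
		{LatencyMs: 80},
		{LatencyMs: 400, Failed: true},
	}
}

func main() {
	reqs := sample()
	strict := SLO{MaxErrorRate: 0.01, LatencyBoundMs: 100, MinFastFraction: 0.99}
	loose := SLO{MaxErrorRate: 0.25, LatencyBoundMs: 100, MinFastFraction: 0.75}
	fmt.Println("error rate:", ErrorRate(reqs))
	fmt.Println("strict SLO met:", strict.Met(reqs))
	fmt.Println("loose SLO met:", loose.Met(reqs))
}
```

The same data passes one objective and fails the other, which is the point of an SLO: it turns "is this acceptable?" into an explicit, checkable statement.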
An observable system tells more than its availability.
Context, status, expectations, debuggability
How?
• Observe by collecting signals
• Export them to analysis tools
• Correlate and analyze to find root cause
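The collect, export, correlate flow above might be sketched in Go roughly as follows. Everything here is hypothetical and uses only the standard library: `Signal`, `Exporter`, `MemoryExporter`, and `Collector` are illustrative names, and the in-memory exporter merely stands in for a real analysis backend.

```go
package main

import "fmt"

// Signal is a hypothetical unit of telemetry tagged with the request it belongs to.
type Signal struct {
	TraceID string
	Kind    string // for example "trace", "metric", or "log"
	Detail  string
}

// Exporter receives collected signals; an analysis backend would implement it.
type Exporter interface {
	Export(s Signal)
}

// MemoryExporter stands in for an analysis tool by grouping signals in memory.
type MemoryExporter struct {
	byTrace map[string][]Signal
}

// NewMemoryExporter returns an exporter with an initialized index.
func NewMemoryExporter() *MemoryExporter {
	return &MemoryExporter{byTrace: make(map[string][]Signal)}
}

// Export stores the signal under its trace ID.
func (m *MemoryExporter) Export(s Signal) {
	m.byTrace[s.TraceID] = append(m.byTrace[s.TraceID], s)
}

// Correlate returns every signal recorded for one request, in arrival order.
func (m *MemoryExporter) Correlate(traceID string) []Signal {
	return m.byTrace[traceID]
}

// Collector fans each recorded signal out to all registered exporters.
type Collector struct {
	exporters []Exporter
}

// Register adds an exporter to the fan-out list.
func (c *Collector) Register(e Exporter) {
	c.exporters = append(c.exporters, e)
}

// Record hands the signal to every registered exporter.
func (c *Collector) Record(s Signal) {
	for _, e := range c.exporters {
		e.Export(s)
	}
}

// run records made-up signals for two requests and returns the backend.
func run() *MemoryExporter {
	backend := NewMemoryExporter()
	c := &Collector{}
	c.Register(backend)
	c.Record(Signal{TraceID: "req-1", Kind: "trace", Detail: "GET /checkout took 900ms"})
	c.Record(Signal{TraceID: "req-2", Kind: "trace", Detail: "GET /home took 12ms"})
	c.Record(Signal{TraceID: "req-1", Kind: "log", Detail: "storage client retried 3 times"})
	return backend
}

func main() {
	// Looking at all signals for the slow request side by side hints at the cause.
	for _, s := range run().Correlate("req-1") {
		fmt.Printf("%s: %s\n", s.Kind, s.Detail)
	}
}
```

Keying every signal by a shared request identifier is the design choice that makes the last step possible; without it, traces, metrics, and logs cannot be lined up.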
This is hard
• Must have integrations for web, RPC, and storage clients
• Must support all languages
• Must be context aware (e.g. canary vs prod)
• Must support many analysis tools
• Developers need to add custom instrumentation
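To make the "context aware" requirement concrete, here is a minimal Go sketch that carries a deployment tag through `context.Context` so that instrumented code can break counts down by canary vs prod. The helper names (`WithEnvironment`, `Environment`, `Counter`, `handle`) are assumptions for illustration and are not the OpenCensus tag API.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// envKey is an unexported context key, so other packages cannot collide with it.
type envKey struct{}

// WithEnvironment returns a context carrying a deployment tag such as "canary" or "prod".
func WithEnvironment(ctx context.Context, env string) context.Context {
	return context.WithValue(ctx, envKey{}, env)
}

// Environment reads the deployment tag, or "unknown" if none was set.
func Environment(ctx context.Context) string {
	if env, ok := ctx.Value(envKey{}).(string); ok {
		return env
	}
	return "unknown"
}

// Counter counts recorded events per environment tag.
type Counter struct {
	mu     sync.Mutex
	counts map[string]int
}

// NewCounter returns a Counter ready for use.
func NewCounter() *Counter {
	return &Counter{counts: make(map[string]int)}
}

// Record increments the count for whatever environment the context carries.
func (c *Counter) Record(ctx context.Context) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[Environment(ctx)]++
}

// Count returns how many events were recorded for env.
func (c *Counter) Count(env string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[env]
}

// handle stands in for library code that is instrumented once and
// never needs to know which environment it runs in.
func handle(ctx context.Context, c *Counter) {
	c.Record(ctx)
}

// run simulates two canary requests, one prod request, and one untagged request.
func run() *Counter {
	c := NewCounter()
	canary := WithEnvironment(context.Background(), "canary")
	prod := WithEnvironment(context.Background(), "prod")
	handle(canary, c)
	handle(canary, c)
	handle(prod, c)
	handle(context.Background(), c)
	return c
}

func main() {
	c := run()
	for _, env := range []string{"canary", "prod", "unknown"} {
		fmt.Println(env, c.Count(env))
	}
}
```

Because the tag travels with the request context, the instrumentation inside `handle` stays identical across environments while the recorded data can still be split by them.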
This is too hard!
Borg Stubby Census
opencensus.io
Z-Pages
• Allow processes to report their own dashboards.
• Z-Pages have no sampling.
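The idea behind Z-Pages, a process serving a dashboard of its own unsampled data, can be approximated with the Go standard library alone. This sketch is not the OpenCensus zpages package; `RPCStats`, `render`, and the `/debug/rpcz` path used here are chosen purely for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sort"
	"sync"
)

// RPCStats keeps an in-process count for every call, with no sampling.
type RPCStats struct {
	mu     sync.Mutex
	counts map[string]int
}

// NewRPCStats returns an empty stats table.
func NewRPCStats() *RPCStats {
	return &RPCStats{counts: make(map[string]int)}
}

// Record counts one call to the given method.
func (s *RPCStats) Record(method string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counts[method]++
}

// ServeHTTP renders the counts as plain text, sorted by method name.
func (s *RPCStats) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	s.mu.Lock()
	defer s.mu.Unlock()
	methods := make([]string, 0, len(s.counts))
	for m := range s.counts {
		methods = append(methods, m)
	}
	sort.Strings(methods)
	for _, m := range methods {
		fmt.Fprintf(w, "%s %d\n", m, s.counts[m])
	}
}

// render issues an in-memory GET request so the page can be inspected
// without opening a network port.
func render(h http.Handler, path string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	stats := NewRPCStats()
	stats.Record("/shop.Cart/Add")
	stats.Record("/shop.Cart/Add")
	stats.Record("/shop.Cart/Checkout")

	mux := http.NewServeMux()
	mux.Handle("/debug/rpcz", stats)
	// A real process would also call http.ListenAndServe with this mux.
	fmt.Print(render(mux, "/debug/rpcz"))
}
```

Serving the page from inside the process means it works even when no external backend is configured, which is what makes this style of page useful for first-line debugging.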
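Try!
    import (
        "log"

        "go.opencensus.io/plugin/ocgrpc"
        "google.golang.org/grpc"
    )

    // lis is a net.Listener created elsewhere, e.g. with net.Listen.
    // The stats handler instruments every RPC the server handles.
    s := grpc.NewServer(grpc.StatsHandler(&ocgrpc.ServerHandler{}))
    if err := s.Serve(lis); err != nil {
        log.Fatalf("Failed to serve: %v", err)
    }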
    import (
        "log"

        "contrib.go.opencensus.io/exporter/stackdriver"
        "go.opencensus.io/stats/view"
        "go.opencensus.io/trace"
    )

    exporter, err := stackdriver.NewExporter(stackdriver.Options{ … })
    if err != nil {
        log.Fatal(err)
    }
    // Send both stats (views) and traces to the same exporter.
    view.RegisterExporter(exporter)
    trace.RegisterExporter(exporter)
Roadmap
• Stable libraries in 8+ languages
• Exporter daemon
• Cluster-wide Z-Pages
• Smart sampling
• Exemplars
• Framework, database, MQ integrations
opencensus.io
Thank you! opencensus.io JBD, Google
@rakyll