Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Observability_at_Google_--_OSCON.pdf
Search
JBD
July 23, 2018
Programming
1
260
Observability_at_Google_--_OSCON.pdf
JBD
July 23, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.2k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.9k
Servers are doomed to fail
rakyll
3
1.5k
Serverless Containers
rakyll
1
270
Critical Path Analysis
rakyll
0
660
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
Claude Codeの「Compacting Conversation」を体感50%減! CLAUDE.md + 8 Skills で挑むコンテキスト管理術
kmurahama
0
270
Cell-Based Architecture
larchanjo
0
130
20 years of Symfony, what's next?
fabpot
2
360
Rediscover the Console - SymfonyCon Amsterdam 2025
chalasr
2
170
AIコードレビューがチームの"文脈"を 読めるようになるまで
marutaku
0
360
React Native New Architecture 移行実践報告
taminif
1
160
안드로이드 9년차 개발자, 프론트엔드 주니어로 커리어 리셋하기
maryang
1
120
sbt 2
xuwei_k
0
300
「コードは上から下へ読むのが一番」と思った時に、思い出してほしい話
panda728
PRO
38
26k
まだ間に合う!Claude Code元年をふりかえる
nogu66
5
840
20251127_ぼっちのための懇親会対策会議
kokamoto01_metaps
2
440
Rubyで鍛える仕組み化プロヂュース力
muryoimpl
0
140
Featured
See All Featured
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
254
22k
The Power of CSS Pseudo Elements
geoffreycrofte
80
6.1k
Producing Creativity
orderedlist
PRO
348
40k
Learning to Love Humans: Emotional Interface Design
aarron
274
41k
Fireside Chat
paigeccino
41
3.7k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
Speed Design
sergeychernyshev
33
1.4k
Become a Pro
speakerdeck
PRO
31
5.7k
4 Signs Your Business is Dying
shpigford
186
22k
Art, The Web, and Tiny UX
lynnandtonic
304
21k
GitHub's CSS Performance
jonrohan
1032
470k
Stop Working from a Prison Cell
hatefulcrawdad
273
21k
Transcript
Observability at Google JBD, Google (@rakyll)
@rakyll History Long history of distributed systems 10ks of different
services built by 100s of teams Many backends/analysis tools invented here ™
@rakyll
@rakyll 100% availability (is a lie)
“ @rakyll A service is available if users cannot tell
there is an outage.
“ @rakyll Google Load Balancers are available if users cannot
tell there is an outage.
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll An observable system tells more than its availability.
@rakyll Context, status, expectations, debuggability
@rakyll How? Observe by collecting signals Export them to analysis
tools Correlate and analyze to find root cause
@rakyll
@rakyll
@rakyll
@rakyll
@rakyll This is hard Must have integrations for web, RPC,
and storage clients Must support all languages Must be context aware (e.g. canary vs prod) Must support many analysis tools Developers need to add custom instrumentation
@rakyll This is too hard!
@rakyll Borg Stubby Census
opencensus.io
@rakyll
@rakyll
@rakyll
@rakyll
@rakyll Z-Pages • Allows processes report their own dashboards. •
Z-Pages have no sampling.
@rakyll Try! import “go.opencensus.io/plugin/ocgrpc” s := grpc.NewServer(grpc.StatsHandler(&ocgrpc.ServerHandler{})) if err :=
s.Serve(lis); err != nil { log.Fatalf("Failed to serve: %v", err) }
@rakyll import ( “go.opencensus.io/stats/view” “go.opencensus.io/trace” “contrib.go.opencensus.io/exporter/stackdriver” ) exporter, err :=
stackdriver.NewExporter(stackdriver.Options{ … }) if err != nil { log.Fatal(err) } view.RegisterExporter(exporter) trace.RegisterExporter(exporter)
@rakyll
@rakyll
@rakyll Roadmap Stable libraries in 8+ languages Exporter daemon Cluster-wide
Z-Pages Smart sampling Exemplars Framework, database, MQ integrations
opencensus.io
Thank you! opencensus.io JBD, Google
[email protected]
@rakyll