Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Observability_at_Google_--_OSCON.pdf
Search
JBD
July 23, 2018
Programming
1
220
Observability_at_Google_--_OSCON.pdf
JBD
July 23, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.8k
Debugging Code Generation in Go
rakyll
5
1.5k
Are you ready for production?
rakyll
8
2.7k
Servers are doomed to fail
rakyll
3
1.5k
Serverless Containers
rakyll
1
240
Critical Path Analysis
rakyll
0
530
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
採用事例の少ないSvelteを選んだ理由と それを正解にするためにやっていること
oekazuma
2
1k
[JAWS-UG横浜 #76] イケてるアップデートを宇宙いち早く紹介するよ!
maroon1st
0
460
Webエンジニア主体のモバイルチームの 生産性を高く保つためにやったこと
igreenwood
0
330
ゆるやかにgolangci-lintのルールを強くする / Kyoto.go #56
utgwkk
1
370
rails stats で紐解く ANDPAD のイマを支える技術たち
andpad
1
290
Zoneless Testing
rainerhahnekamp
0
120
あれやってみてー駆動から成長を加速させる / areyattemite-driven
nashiusagi
1
200
生成AIでGitHubソースコード取得して仕様書を作成
shukob
0
310
return文におけるstd::moveについて
onihusube
1
980
急成長期の品質とスピードを両立するフロントエンド技術基盤
soarteclab
0
930
KubeCon + CloudNativeCon NA 2024 Overviewat Kubernetes Meetup Tokyo #68 / amsy810_k8sjp68
masayaaoyama
0
250
モバイルアプリにおける自動テストの導入戦略
ostk0069
0
110
Featured
See All Featured
Making Projects Easy
brettharned
116
5.9k
Stop Working from a Prison Cell
hatefulcrawdad
267
20k
Scaling GitHub
holman
458
140k
Building Applications with DynamoDB
mza
91
6.1k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
330
21k
Embracing the Ebb and Flow
colly
84
4.5k
Testing 201, or: Great Expectations
jmmastey
40
7.1k
How to train your dragon (web standard)
notwaldorf
88
5.7k
Practical Orchestrator
shlominoach
186
10k
BBQ
matthewcrist
85
9.4k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
6
510
Transcript
Observability at Google JBD, Google (@rakyll)
@rakyll History Long history of distributed systems 10ks of different
services built by 100s of teams Many backends/analysis tools invented here ™
@rakyll
@rakyll 100% availability (is a lie)
“ @rakyll A service is available if users cannot tell
there is an outage.
“ @rakyll Google Load Balancers are available if users cannot
tell there is an outage.
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll An observable system tells more than its availability.
@rakyll Context, status, expectations, debuggability
@rakyll How? Observe by collecting signals Export them to analysis
tools Correlate and analyze to find root cause
@rakyll
@rakyll
@rakyll
@rakyll
@rakyll This is hard Must have integrations for web, RPC,
and storage clients Must support all languages Must be context aware (e.g. canary vs prod) Must support many analysis tools Developers need to add custom instrumentation
@rakyll This is too hard!
@rakyll Borg Stubby Census
opencensus.io
@rakyll
@rakyll
@rakyll
@rakyll
@rakyll Z-Pages • Allows processes report their own dashboards. •
Z-Pages have no sampling.
@rakyll Try! import “go.opencensus.io/plugin/ocgrpc” s := grpc.NewServer(grpc.StatsHandler(&ocgrpc.ServerHandler{})) if err :=
s.Serve(lis); err != nil { log.Fatalf("Failed to serve: %v", err) }
@rakyll import ( “go.opencensus.io/stats/view” “go.opencensus.io/trace” “contrib.go.opencensus.io/exporter/stackdriver” ) exporter, err :=
stackdriver.NewExporter(stackdriver.Options{ … }) if err != nil { log.Fatal(err) } view.RegisterExporter(exporter) trace.RegisterExporter(exporter)
@rakyll
@rakyll
@rakyll Roadmap Stable libraries in 8+ languages Exporter daemon Cluster-wide
Z-Pages Smart sampling Exemplars Framework, database, MQ integrations
opencensus.io
Thank you! opencensus.io JBD, Google
[email protected]
@rakyll