CPDD
JBD
October 02, 2018
Transcript
systems? who does that?
jaana b. dogan
6+ years at Google, touched many projects
early days (of a company)
growing...
what growth looks like:
service A service B service C service D service E service auth email
one becomes many
failure in isolation
who to ping in failure?
and it goes larger...
good guy jeff
code search
go_library(
    name = "logs",
    srcs = ["logs.go"],
    visibility = ["//visibility:public"],
    deps = [ …. ],
)

References (641 occurrences)
- //source/ads/monitoring/BUILD
- //source/ads/analysis/BUILD
- //source/ads/mobile/BUILD
...
[diagram: frontend server, authentication, users, images, memcache, blobservice, memcache, memcache, (metadata), (disks), load balancer]
[same diagram, labeled: critical path]
cpdd (critical path driven development)
discover the critical paths
make them reliable and fast
make them debuggable
how do we get there? events or tracing
why? why? why? why? why?
[trace] GET /timeline: edge-lb, sched, api-server, auth.Auth, cache.Get, mysql.Query, user.Profile, cache.Get, mysql.Query, images.Filter, blobstore.Get
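One way to read "discover the critical paths" from a trace like the one above is to follow, at each level, the call that finishes last. The sketch below is a simplification under stated assumptions: the `Span` type, the nesting of the spans, and every timing are hypothetical and were chosen only for illustration; the span names are taken from the trace slide. A real tracing system records far more than this.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Span is a hypothetical, minimal record of one timed operation in a trace.
type Span struct {
	Name     string
	Start    time.Duration // offset from the start of the request
	End      time.Duration
	Children []*Span
}

// CriticalPath walks down from the root and, at each level, follows the
// child that ends last. This is a simplification: it ignores gaps between
// children and work done by the parent itself.
func CriticalPath(root *Span) []string {
	var path []string
	for s := root; s != nil; {
		path = append(path, s.Name)
		var last *Span
		for _, c := range s.Children {
			if last == nil || c.End > last.End {
				last = c
			}
		}
		s = last
	}
	return path
}

func ms(n int) time.Duration { return time.Duration(n) * time.Millisecond }

// exampleTrace builds a made-up trace; nesting and timings are assumptions.
func exampleTrace() *Span {
	return &Span{Name: "GET /timeline", Start: ms(0), End: ms(120), Children: []*Span{
		{Name: "auth.Auth", Start: ms(2), End: ms(20), Children: []*Span{
			{Name: "cache.Get", Start: ms(3), End: ms(5)},
			{Name: "mysql.Query", Start: ms(5), End: ms(19)},
		}},
		{Name: "user.Profile", Start: ms(20), End: ms(45), Children: []*Span{
			{Name: "cache.Get", Start: ms(21), End: ms(23)},
			{Name: "mysql.Query", Start: ms(23), End: ms(44)},
		}},
		{Name: "images.Filter", Start: ms(45), End: ms(118), Children: []*Span{
			{Name: "blobstore.Get", Start: ms(47), End: ms(115)},
		}},
	}}
}

func main() {
	// Prints: GET /timeline -> images.Filter -> blobstore.Get
	fmt.Println(strings.Join(CriticalPath(exampleTrace()), " -> "))
}
```

With these invented timings, blobstore.Get dominates the request, so that is where reliability and latency work would pay off first.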
bare metal / kernel / process scheduler / network stack / cloud stack / user process / frameworks / your code
[same trace] not my fault
[trace] GET /timeline: auth.Auth, cache.Get, mysql.Query, user.Profile, cache.Get, mysql.Query, images.Filter, blobstore.Get
cache.Get, mysql.Query, blob.Get: where is the source code?
cache.Get, mysql.Query, blob.Get: who to call?
cache.Get, mysql.Query, blob.Get: give me the logs, runtime events, profiles...
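The three questions on these slides (where is the source, who to call, where are the logs) suggest attaching ownership metadata to the services that appear in a trace. The following is only a sketch of that idea: the `Owner` type, the `registry` map, and every path, rotation name, and log query in it are hypothetical placeholders, not anything shown in the talk.

```go
package main

import (
	"fmt"
	"strings"
)

// Owner answers the three questions from the slides for one service.
// All fields and values here are hypothetical placeholders.
type Owner struct {
	Source string // where is the source code?
	OnCall string // who to call?
	Logs   string // how to fetch the logs
}

// registry maps a service prefix (the part of a span name before the dot)
// to its owner. In practice this would be generated, not hand-written.
var registry = map[string]Owner{
	"cache": {Source: "//example/cache", OnCall: "cache-oncall", Logs: "service=cache"},
	"mysql": {Source: "//example/mysql", OnCall: "storage-oncall", Logs: "service=mysql"},
	"blob":  {Source: "//example/blob", OnCall: "blob-oncall", Logs: "service=blob"},
}

// Lookup resolves a span name such as "mysql.Query" to the owner of its
// service prefix. It reports false when the service is not registered.
func Lookup(span string) (Owner, bool) {
	service, _, _ := strings.Cut(span, ".")
	o, ok := registry[service]
	return o, ok
}

func main() {
	for _, span := range []string{"cache.Get", "mysql.Query", "blob.Get"} {
		if o, ok := Lookup(span); ok {
			fmt.Printf("%s: source=%s oncall=%s logs=%q\n", span, o.Source, o.OnCall, o.Logs)
		}
	}
}
```

Keying on the span name keeps the lookup independent of any particular tracing backend; a span from an unregistered service simply yields no owner.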
challenges...
CPDD CHALLENGE #1: this is an organizational problem
github.com/w3c/distributed-tracing
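The effort linked above was later published as the W3C Trace Context specification, which defines a `traceparent` header so that services owned by different teams can join the same trace. The sketch below formats and parses version `00` of that header (`version-traceid-parentid-flags`, lowercase hex). The `TraceContext` type and function names are assumptions made for this example, and validation is deliberately minimal.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// TraceContext holds the fields carried by a version-00 traceparent header.
// The type name and layout are assumptions made for this sketch.
type TraceContext struct {
	TraceID [16]byte
	SpanID  [8]byte
	Sampled bool
}

// Header renders the context as "00-<trace-id>-<parent-id>-<flags>".
func (tc TraceContext) Header() string {
	flags := byte(0)
	if tc.Sampled {
		flags = 1
	}
	return fmt.Sprintf("00-%s-%s-%02x",
		hex.EncodeToString(tc.TraceID[:]), hex.EncodeToString(tc.SpanID[:]), flags)
}

// ParseTraceparent parses a version-00 header. It rejects other versions,
// uppercase hex, wrong field lengths, and all-zero IDs.
func ParseTraceparent(h string) (TraceContext, error) {
	var tc TraceContext
	parts := strings.Split(h, "-")
	if len(parts) != 4 || parts[0] != "00" || h != strings.ToLower(h) {
		return TraceContext{}, errors.New("traceparent: unsupported or malformed header")
	}
	if len(parts[1]) != 32 || len(parts[2]) != 16 || len(parts[3]) != 2 {
		return TraceContext{}, errors.New("traceparent: wrong field length")
	}
	if _, err := hex.Decode(tc.TraceID[:], []byte(parts[1])); err != nil {
		return TraceContext{}, err
	}
	if _, err := hex.Decode(tc.SpanID[:], []byte(parts[2])); err != nil {
		return TraceContext{}, err
	}
	flags, err := hex.DecodeString(parts[3])
	if err != nil {
		return TraceContext{}, err
	}
	if tc.TraceID == ([16]byte{}) || tc.SpanID == ([8]byte{}) {
		return TraceContext{}, errors.New("traceparent: all-zero id")
	}
	tc.Sampled = flags[0]&1 == 1 // lowest bit is the sampled flag
	return tc, nil
}

func main() {
	h := "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
	tc, err := ParseTraceparent(h)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(tc.Sampled, tc.Header() == h)
}
```

A service that receives this header would keep the trace ID, generate a fresh span ID for its own work, and send the updated header on each outgoing call.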
CPDD CHALLENGE #2: engineers don’t know where to start
CPDD CHALLENGE #3: infra is still a blackbox
CPDD CHALLENGE #4: instrumentation is expensive
CPDD CHALLENGE #5: dynamic capabilities are underestimated
cpdd: a tool to close knowledge gaps (which we don’t talk about)
fin jbd@google.com