@rakyll
monitoring and debugging containerized systems
Jaana B. Dogan, Google
[email protected]
me: overly frustrated engineer, 15+ years in networking systems, making systems more reliable
the new old monitoring? (maybe)
systems are growing... and you are not in control
bare metal / kernel / network stack / cloud stack / libraries / frameworks / your code
complexity is inevitable
container
container, container
container, container, message queue
container, container, storage/database
container, container, load balancer; location=us-west, location=europe-central
host, host, container, container, load balancer
container, container, container, container, container: orchestrated hot mess
areas of issues:
- lack of locality
- networking
- scheduling
- dependencies
“my job is done here”
after going to production...
1. monitor
2. alert
3. troubleshoot
4. fix
load balancer
load balancer, critical path
discovering critical paths
making them reliable then fast
making them debuggable
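One way to discover a critical path from trace data can be sketched as follows. This is a minimal illustration, not the method from the talk: the `Span` type, its fields, and the sample timings are all hypothetical, and span names are assumed to be unique so they can serve as IDs. It also uses a simple heuristic: starting at the root, follow the child span that finishes last at each level.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    """Hypothetical span record; names are assumed unique and act as IDs."""
    name: str
    parent: Optional[str]  # name of the parent span, None for the root
    start_ms: float
    end_ms: float


def critical_path(spans: List[Span]) -> List[str]:
    """Return span names from the root down, always taking the child that ends last."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        children.setdefault(span.parent, []).append(span)

    roots = children.get(None, [])
    if len(roots) != 1:
        raise ValueError("expected exactly one root span")

    path: List[str] = []
    current: Optional[Span] = roots[0]
    while current is not None:
        path.append(current.name)
        kids = children.get(current.name, [])
        # The child that finishes last is the one the parent had to wait for.
        current = max(kids, key=lambda s: s.end_ms) if kids else None
    return path


# Made-up timings for illustration only.
trace = [
    Span("load balancer", None, 0, 120),
    Span("frontend", "load balancer", 5, 115),
    Span("auth", "frontend", 10, 30),
    Span("storage/database", "frontend", 35, 110),
]
print(critical_path(trace))
```

With these sample timings the path runs through the storage call rather than the auth call, so that is the dependency to make reliable, then fast, then debuggable.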
Latency Numbers Every Programmer Should Know by Jeff Dean
ping, pong, pongservice:6996; project: ping the pong server.
opencensus.io
not my team!
where is the source code?
who to page?
give me the logs, runtime events, profiles...
http://server:9999/tracez
challenges...
no wire standards
traceparent: ---
Example:
traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
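The example header splits into four dash-separated hex fields. A minimal parsing sketch, assuming the field names and validity rules of the W3C Trace Context specification (version, trace-id, parent-id, trace-flags); the slide's own placeholder names did not survive extraction, so these names are an assumption, and `parse_traceparent` and `TraceParent` are hypothetical helpers:

```python
import re
from typing import NamedTuple, Optional

# Layout assumed from W3C Trace Context, version 00:
# 2 hex (version) - 32 hex (trace-id) - 16 hex (parent-id) - 2 hex (trace-flags)
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Return the parsed fields, or None if the header is malformed."""
    match = _TRACEPARENT.fullmatch(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are treated as invalid.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    # The lowest bit of the flags byte marks the trace as sampled.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


header = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"
print(parse_traceparent(header))
```

This sketch only accepts the strict four-field layout; a production parser would also need to decide how to treat headers from future versions.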
no export standards
areas of issues:
- locality
- networking
- scheduling
- dependencies
fin
[email protected]