on prototyping, developing, and testing applications with Kubernetes from an appops perspective (tools & techniques)
• But not really (much) about …
  • troubleshooting installation or upgrade issues
  • performance testing or optimising containerized microservices
  • SRE-style troubleshooting (check out what Googlers say on this topic)
list
6. missing/bad config or secret
7. lifecycle issues (probes fail)
8. can’t reach service
9. looking in the wrong place: where is localhost?
10. failed mounts
OODA loop
• logs? Establish a baseline.
• Orient: Formulate hypotheses. Don’t jump to conclusions.
• Decide: Sort hypotheses by likelihood. Pick one of the hypotheses.
• Act: Test the hypothesis you picked. If confirmed, fix it; otherwise, continue.
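The Decide and Act steps can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Hypothesis` type, the `first_confirmed` function, and the likelihood numbers are hypothetical names and values, not something the slides prescribe.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    description: str
    likelihood: float          # rough guess between 0 and 1 (assumption)
    test: Callable[[], bool]   # returns True if the hypothesis is confirmed


def first_confirmed(hypotheses: List[Hypothesis]) -> Optional[Hypothesis]:
    """Decide: sort by likelihood. Act: test one hypothesis at a time."""
    ranked = sorted(hypotheses, key=lambda h: h.likelihood, reverse=True)
    for hypothesis in ranked:
        if hypothesis.test():
            return hypothesis  # confirmed: this is the one to fix
    return None                # nothing confirmed: observe and orient again
```

Returning `None` stands for going around the loop again with fresh observations instead of forcing a conclusion.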
low-level metrics (CPU, memory)
• Application-specific metrics (full-blown instrumentation vs service mesh-based approaches)
• Options
  • Roll your own with the industry standards: Prometheus + Grafana
  • Cloud provider native
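To make "application-specific metrics" concrete, here is a minimal hand-rolled sketch that renders a counter in the Prometheus text exposition format. It deliberately avoids any client library; the function name `render_counter` and the metric name in the usage note are hypothetical, and label values are not escaped, so treat it as an illustration rather than a replacement for a real Prometheus client.

```python
def render_counter(name: str, help_text: str, samples: dict) -> str:
    """Render one counter in the Prometheus text exposition format.

    `samples` maps a tuple of (label, value) pairs to the counter value,
    for example {(("code", "200"),): 3}. Label values are NOT escaped here.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        suffix = "{" + label_str + "}" if label_str else ""
        lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"
```

Calling `render_counter("http_requests_total", "Total HTTP requests.", {(("code", "200"),): 3})` yields the HELP and TYPE comment lines followed by one sample line per label set, which is what a `/metrics` endpoint would serve.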
In the app, log to stdout or, if you can’t, use an adapter
• Options
  • Roll your own with the industry standards: ELK/EFK stack
  • Cloud provider native, such as CloudWatch or Stackdriver
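A minimal sketch of "log to stdout" using only the Python standard library. The JSON field names and the helper `make_stdout_logger` are assumptions chosen for illustration; a log collector in the cluster would then pick up whatever the container writes to stdout.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON object per line (field names assumed)."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def make_stdout_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to stdout (or to `stream`)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]   # replace any handlers set up earlier
    logger.propagate = False      # avoid duplicate lines via the root logger
    return logger
```

The optional `stream` parameter exists only so the logger can be pointed at a buffer in tests; in the container it stays at the default.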
debugging
• Roots: need to overcome limitations of “time-synced logs”
• Specifications: OpenCensus and OpenTracing
• Tooling: Zipkin, Jaeger, Stackdriver
• A must-have in a microservices setup
• Debugging: use KubeSquash
Flowchart, happy path:
• kubectl apply
• API Server stores desired state
• Scheduler sees new pod, selects node
• Scheduler assigns pod to a fitting node
• kubelet watches API server and notices new pod
• kubelet asks container runtime via CRI to launch container(s)
• container runtime pulls image
• container runtime runs images
• kubelet takes over pod lifecycle (probes)
• pod runs until deleted or evicted
• garbage collection
Flowchart, checks along the way:
• etcd happy? If not: ask cluster admin
• does the pod get scheduled? If not: fork out more $$$
• container runtime happy? If not: ask cluster admin
• can access container registry? If not: fix access to registry
• is container starting up? (init containers) If not: debug app
• container crashing after startup? If yes: debug app
• probes fine?
• no leaking resources? soak testing, monitoring
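The checks in this flow can be approximated by inspecting a pod's status, for example the JSON printed by `kubectl get pod <name> -o json`. The sketch below is an assumption-laden illustration: the function name `diagnose` and the hint strings are hypothetical, while the status fields it reads (`conditions`, `initContainerStatuses`, `containerStatuses`, `state.waiting.reason`, `ready`) are taken from the Kubernetes Pod status schema.

```python
def diagnose(pod: dict) -> str:
    """Map a pod's status to a rough hint about where in the flow it is stuck."""
    status = pod.get("status") or {}

    # Scheduling: the PodScheduled condition is False when no node fits.
    for cond in status.get("conditions") or []:
        if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
            return "not scheduled: check requests and cluster capacity"

    containers = (status.get("initContainerStatuses") or []) + (
        status.get("containerStatuses") or []
    )

    # Waiting reasons reported by the kubelet for each container.
    for cs in containers:
        waiting = (cs.get("state") or {}).get("waiting") or {}
        reason = waiting.get("reason", "")
        if reason in ("ErrImagePull", "ImagePullBackOff"):
            return "image pull failed: fix access to registry"
        if reason == "CreateContainerConfigError":
            return "missing/bad config or secret"
        if reason == "CrashLoopBackOff":
            return "container crashing after startup: debug app"

    # A running but not-ready container usually points at a failing probe.
    for cs in status.get("containerStatuses") or []:
        if not cs.get("ready", False):
            return "container not ready: check probes"

    return "no obvious problem in pod status: look at logs and metrics"
```

The order of the checks mirrors the flow: scheduling first, then image pull and container start, then probes.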
and policies
• Use kubectl auth can-i to check RBAC permissions
• Make yourself familiar with:
  • Pod Security Policies, which might constrain your app too much
  • Network Policies, which might be too strict for your app’s communication needs
• See kubernetes-security.info
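As a sketch of how `kubectl auth can-i` can be scripted, the helper below builds the command line and, if `kubectl` is installed, runs it. The function names and the example verb, resource, and namespace are hypothetical; the `--namespace` and `--as` flags are standard kubectl options. Running it for real assumes a reachable cluster and a configured kubeconfig.

```python
import shutil
import subprocess
from typing import List, Optional


def can_i_command(verb: str, resource: str,
                  namespace: Optional[str] = None,
                  as_user: Optional[str] = None) -> List[str]:
    """Build the argv for `kubectl auth can-i <verb> <resource>`."""
    cmd = ["kubectl", "auth", "can-i", verb, resource]
    if namespace:
        cmd += ["--namespace", namespace]
    if as_user:
        cmd += ["--as", as_user]   # impersonate a user or service account
    return cmd


def can_i(verb: str, resource: str, **kwargs) -> Optional[bool]:
    """Return True/False from kubectl, or None if kubectl is not installed."""
    if shutil.which("kubectl") is None:
        return None
    result = subprocess.run(can_i_command(verb, resource, **kwargs),
                            capture_output=True, text=True)
    return result.stdout.strip() == "yes"   # kubectl prints "yes" or "no"
```

For example, `can_i("create", "deployments", namespace="dev")` asks whether the current identity may create deployments in a namespace called `dev`.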
wrote an application server. For load-balancing purposes, where would you put a reverse proxy such as NGINX?
A. Into the container (same Dockerfile)
B. Into a sidecar container (same pod)
C. Into a separate pod
your apps the cloud native way by …
• knowing and using the Kubernetes primitives (services, deployments)
• implementing retries & timeouts (in-tree or via service mesh)
• avoiding hardcoded (start-up) dependencies
• listening on 0.0.0.0 (not 127.0.0.1)
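The "retries & timeouts" bullet, done in the application itself, might look like the sketch below. Everything here is an assumption for illustration: the name `call_with_retries`, the default values, and the choice of exponential backoff. The operation receives the timeout it should apply to its own network call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(op: Callable[[float], T],
                      attempts: int = 3,
                      timeout: float = 2.0,
                      base_delay: float = 0.1,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call op(timeout), retrying on connection errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Exception = RuntimeError("unreachable")
    for attempt in range(attempts):
        try:
            return op(timeout)
        except (ConnectionError, TimeoutError) as err:
            last_error = err
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)   # 0.1s, 0.2s, 0.4s, ...
    raise last_error
```

Retrying a dependency that is not up yet, instead of crashing at start-up, is also one way to avoid the hardcoded start-up dependencies mentioned in the list.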
Apply chaos engineering while all is well, and learn from it where and how your system fails
• Provide debug tools in the image, but keep footprint and security in mind
• Automate all the things: Autoscaler, Brigade, Draft, Forge, Helm, Knative, ksync, odo, Operators, Skaffold, watchpod, etc.
Articles, slide decks, videos
site
• Debugging microservices - Squash vs. Telepresence
• Debugging and Troubleshooting Microservices in Kubernetes with Ray Tsang (Google)
• Troubleshooting Kubernetes Using Logs
• Debug a Go Application in Kubernetes from IDE
• Troubleshooting Kubernetes Networking Issues
• Video: CrashLoopBackoff, Pending, FailedMount and Friends: Debugging Common Kubernetes Cluster
• Video: Troubleshooting & Debugging Microservices in Kubernetes
• Slide deck: Evolution of Monitoring and Prometheus