Kubernetes Best Practices

Kubernetes and friends are powerful tools that can really simplify your operations. However, there are many gotchas and common pitfalls that can ruin your experience. I’ll share some best practices around building and deploying your containers that will let you run more stably, efficiently, and securely.

Sandeep Dinesh

July 14, 2017

Transcript

  1. Kubernetes Best Practices. Sandeep Dinesh, Developer Advocate, @sandeepdinesh, github.com/thesandlord

  2. Google Cloud Platform 2 Kubernetes is really flexible

  3. But you might […] yourself in the […]

  4. Building Containers

  5. Don’t trust arbitrary base images!

  6. Static Analysis of Containers: https://github.com/banyanops/collector https://github.com/coreos/clair

  7. Use small base images

  11. Overhead: Node.js App

    Your App → 5MB
    Your App’s Dependencies → 95MB
    Total App Size → 100MB

    Docker Base Images:
    node:8 → 667MB ← 6.6x App Size!! ← 13.3x “min” overhead!!
    node:8-wheezy → 521MB
    node:8-slim → 225MB
    node:8-alpine → 63.7MB
    scratch → ~50MB

    Pros:
    • Builds are faster
    • Need less storage
    • Cold starts (image pull) are faster
    • Potentially less attack surface

    Cons:
    • Less tooling inside container
    • “Non-standard” environment
  12. Use the “builder pattern”

  13. Code → Build Container → Build Artifact(s) → Runtime Container

    Build Container: Compiler, Dev Deps, Unit Tests, etc...
    Build Artifact(s): Binaries, Static Files, Bundles, Transpiled Code
    Runtime Container: Runtime Env, Debug/Monitor Tooling
  14. Docker is bringing native support for multi-stage builds in Docker CE 17.05

  15. Container Internals

  16. Use a non-root user inside the container

  17. Example Dockerfile

    FROM node:alpine
    RUN apk update && apk add imagemagick
    # Alpine ships BusyBox addgroup/adduser, not groupadd/useradd
    RUN addgroup -S nodejs && adduser -S -G nodejs nodejs
    WORKDIR /app
    ADD package.json package.json
    RUN npm install
    ADD index.js index.js
    # Drop root privileges before the app starts
    USER nodejs
    CMD npm start
  18. Enforce it!

    apiVersion: v1
    kind: Pod
    metadata:
      name: hello-world
    spec:
      containers:
      - # specification of the pod’s containers
        # ...
        securityContext:
          runAsNonRoot: true
  19. Make the filesystem read-only

  20. Enforce it!

    apiVersion: v1
    kind: Pod
    metadata:
      name: hello-world
    spec:
      containers:
      - # specification of the pod’s containers
        # ...
        # readOnlyRootFilesystem is a container-level field, so the
        # securityContext sits under the container entry
        securityContext:
          runAsNonRoot: true
          readOnlyRootFilesystem: true
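
A read-only root filesystem usually still needs somewhere to write scratch files. A minimal sketch of one way to handle that, assuming the app only writes to /tmp; the container name, image, user ID, and mount path are placeholders that do not come from the slides:

```yaml
# Sketch only: container name, image, runAsUser value, and /tmp are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: hello-world
spec:
  containers:
  - name: hello-world
    image: example.com/hello-world:1.0.0
    securityContext:
      runAsNonRoot: true
      # A numeric UID lets the kubelet verify the non-root requirement.
      runAsUser: 1000
      readOnlyRootFilesystem: true
    volumeMounts:
    # The root filesystem is read-only, so scratch files go to a mounted volume.
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}
```

The emptyDir volume lives and dies with the pod, so nothing written there should be treated as durable.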
  21. One process per container

  22. Don’t restart on failure. Crash cleanly instead.

  23. Log to stdout and stderr

  24. Add “dumb-init” to prevent zombie processes

  25. Example Dockerfile

    FROM node:alpine
    RUN apk update && apk add imagemagick
    # Alpine ships BusyBox addgroup/adduser, not groupadd/useradd
    RUN addgroup -S nodejs && adduser -S -G nodejs nodejs
    # Install dumb-init as root, before switching users, so chmod succeeds
    ADD https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64 \
        /usr/local/bin/dumb-init
    RUN chmod +x /usr/local/bin/dumb-init
    # The ENTRYPOINT path must match the install path above
    ENTRYPOINT ["/usr/local/bin/dumb-init", "--"]
    WORKDIR /app
    ADD package.json package.json
    RUN npm install
    ADD index.js index.js
    USER nodejs
    CMD npm start
  26. Good News: No need to do this in K8s 1.7

  27. Deployments

  28. Use the “record” option for easier rollbacks

  29. $ kubectl apply -f deployment.yaml --record

    …
    $ kubectl rollout history deployments my-deployment
    deployments "my-deployment"
    REVISION  CHANGE-CAUSE
    1         kubectl apply -f deployment.yaml --record
    2         kubectl edit deployments my-deployment
    3         kubectl set image deployment/my-deployment my-container=app:2.0
  30. Use plenty of descriptive labels

  31. Labels

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 12
      template:
        metadata:
          labels:
            name: web
            color: blue
            experimental: 'true'
  32. Labels

    App: Nifty, Phase: Dev, Role: FE
    App: Nifty, Phase: Test, Role: FE
    App: Nifty, Phase: Dev, Role: BE
    App: Nifty, Phase: Test, Role: BE
  33. Labels (same four label sets), selecting App = Nifty

  34. Labels (same four label sets), selecting Role = BE

  35. Labels (same four label sets), selecting Phase = Dev

  36. Labels (same four label sets), selecting Phase = Dev, Role = FE

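
The selections above map directly onto label selectors. A minimal sketch of the last one (Phase = Dev, Role = FE) expressed as a Service selector; the Service name and port numbers are hypothetical and only the label keys and values come from the slides:

```yaml
# Sketch only: the Service name and ports are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: nifty-frontend-dev
spec:
  # Equality-based selector: a pod must carry ALL of these labels to match.
  selector:
    App: Nifty
    Phase: Dev
    Role: FE
  ports:
  - port: 80
    targetPort: 8080
```

The same selection works ad hoc from the command line with `kubectl get pods -l Phase=Dev,Role=FE`.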
  37. Use sidecar containers for proxies, watchers, etc.

  38. Examples: each App talks to its own Proxy sidecar over localhost, and the Proxies hold the secure connection to the Database

  39. Examples: incoming requests reach each App through its own Proxy sidecar over localhost; the Proxy handles auth, rate limiting, etc.

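
A sidecar is just a second container in the same pod. A minimal sketch of the database-proxy case, assuming a proxy image that listens on port 5432; the pod name, both images, the port, and the environment variable names are placeholders, not anything from the talk:

```yaml
# Sketch only: pod name, images, port, and env var names are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-proxy
spec:
  containers:
  - name: app
    image: example.com/myapp:1.0.0
    env:
    # Containers in a pod share a network namespace,
    # so the app reaches the proxy via localhost.
    - name: DATABASE_HOST
      value: "127.0.0.1"
    - name: DATABASE_PORT
      value: "5432"
  - name: db-proxy
    image: example.com/db-proxy:1.0.0
    ports:
    - containerPort: 5432
```

The app never needs database credentials or TLS settings for the remote end; those stay with the proxy container.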
  40. Don’t use sidecars for bootstrapping!

  41. Use init containers instead!

  42. Init container example

    apiVersion: v1
    kind: Pod
    metadata:
      name: awesomeapp-pod
      labels:
        app: awesomeapp
      annotations:
        pod.beta.kubernetes.io/init-containers: '[
          {
            "name": "init-myapp",
            "image": "busybox",
            "command": ["sh", "-c", "until nslookup myapp; do echo waiting for myapp; sleep 2; done;"]
          },
          {
            "name": "init-mydb",
            "image": "busybox",
            "command": ["sh", "-c", "until nslookup mydb; do echo waiting for mydb; sleep 2; done;"]
          }
        ]'
    spec:
      containers:
      - name: awesomeapp-container
        image: busybox
        command: ['sh', '-c', 'echo The app is running! && sleep 3600']
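
The slide declares the init containers through the beta annotation. Since Kubernetes 1.6 the same pod can, as far as I know, be written with the `initContainers` field in the pod spec, which is the form newer clusters expect. A sketch of that translation, reusing the names from the slide:

```yaml
# Same pod as on the slide, with the init containers moved
# from the beta annotation into spec.initContainers.
apiVersion: v1
kind: Pod
metadata:
  name: awesomeapp-pod
  labels:
    app: awesomeapp
spec:
  # Init containers run one at a time, in order, and each must
  # exit successfully before the app containers start.
  initContainers:
  - name: init-myapp
    image: busybox
    command: ['sh', '-c', 'until nslookup myapp; do echo waiting for myapp; sleep 2; done;']
  - name: init-mydb
    image: busybox
    command: ['sh', '-c', 'until nslookup mydb; do echo waiting for mydb; sleep 2; done;']
  containers:
  - name: awesomeapp-container
    image: busybox
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
```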
  43. Don’t use :latest or no tag

  44. Readiness and Liveness probes are your friend

  45. Health Checks

    Readiness → Is the app ready to start serving traffic?
    • Won’t be added to a service endpoint until it passes
    • Required for a “production app” in my opinion

    Liveness → Is the app still running?
    • Default is “process is running”
    • Possible that the process can be running but not working correctly
    • Good to define, might not be 100% necessary

    These can sometimes be the same endpoint, but not always
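
A minimal sketch of both probes on one container, assuming the app serves HTTP on port 3000 and exposes separate /ready and /healthz endpoints; the pod name, image, port, paths, and timing values are all illustrative assumptions:

```yaml
# Sketch only: pod name, image, port, paths, and timings are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: probed-app
spec:
  containers:
  - name: app
    image: example.com/myapp:1.0.0
    ports:
    - containerPort: 3000
    # Readiness gates traffic: the pod is only added to Service
    # endpoints while this probe passes.
    readinessProbe:
      httpGet:
        path: /ready
        port: 3000
      initialDelaySeconds: 5
      periodSeconds: 10
    # Liveness gates restarts: repeated failures make the kubelet
    # restart the container.
    livenessProbe:
      httpGet:
        path: /healthz
        port: 3000
      initialDelaySeconds: 15
      periodSeconds: 20
      failureThreshold: 3
```

Keeping the liveness check cheaper and more lenient than the readiness check helps avoid restart loops when a dependency is merely slow.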
  46. Services

  47. Don’t always use type: LoadBalancer

  48. Ingress is great

  49. One Ingress in front of Service 1, Service 2 and Service 3, routing by host and path: websocket.mydomain.com, mydomain.com/foo, mydomain.com/bar

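
A sketch of what that fan-out might look like as a manifest. The hosts and paths come from the slide; which host or path goes to which service, the service names, and the ports are assumptions. It uses the `networking.k8s.io/v1` API that current clusters serve rather than the `extensions/v1beta1` group shown elsewhere in this deck:

```yaml
# Sketch only: Ingress name, service names, ports, and the
# host/path-to-service mapping are assumptions.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: fanout
spec:
  rules:
  - host: websocket.mydomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: service-1
            port:
              number: 80
  - host: mydomain.com
    http:
      paths:
      - path: /foo
        pathType: Prefix
        backend:
          service:
            name: service-2
            port:
              number: 80
      - path: /bar
        pathType: Prefix
        backend:
          service:
            name: service-3
            port:
              number: 80
```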
  50. type: NodePort can be “good enough”

  51. Use Static IPs. They are free*!

  52. Static IP examples

    apiVersion: v1
    kind: Service
    metadata:
      name: myservice
    spec:
      type: LoadBalancer
      loadBalancerIP: QQQ.ZZZ.YYY.XXX
      ports:
      - port: 80
        targetPort: 3000
        protocol: TCP
      selector:
        name: myapp

    $ gcloud compute addresses create ingress --global
    …
    $ gcloud compute addresses create myservice --region=us-west1
    Created …
    address: QQQ.ZZZ.YYY.XXX
    …

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: myingress
      annotations:
        kubernetes.io/ingress.global-static-ip-name: "ingress"
    spec:
      backend:
        serviceName: myservice
        servicePort: 80
  53. Map external services to internal ones

  54. External Services

    Hosted Database:

    kind: Service
    apiVersion: v1
    metadata:
      name: mydatabase
      namespace: prod
    spec:
      type: ExternalName
      externalName: my.database.example.com
      ports:
      - port: 12345

    Database outside cluster but inside network:

    kind: Service
    apiVersion: v1
    metadata:
      name: mydatabase
    spec:
      ports:
      - protocol: TCP
        port: 80
        targetPort: 12345
    ---
    kind: Endpoints
    apiVersion: v1
    metadata:
      name: mydatabase
    subsets:
    - addresses:
      - ip: 10.128.0.2
      ports:
      - port: 12345
  55. Application Architecture

  56. Use Helm Charts

  57. ALL downstream dependencies are unreliable

  58. Make sure your microservices aren’t too micro

  59. Use a “Service Mesh”

  60. https://github.com/istio/istio https://github.com/linkerd/linkerd

  61. Use a PaaS?

  63. Cluster Management

  64. Use Google Container Engine

  65. Resources, Anti-Affinity, and Scheduling

  66. Node Affinity: hostname, zone, region, instance-type, os, arch, custom!

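
A minimal sketch of node affinity using one well-known node label (zone) as a hard requirement and one custom label as a soft preference. The pod name, image, zone value, and the `disktype` label are hypothetical; the label key shown is the current well-known name, which differs from the beta-prefixed keys in use when this talk was given:

```yaml
# Sketch only: pod name, image, zone value, and the custom
# "disktype" label are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: only schedule onto nodes in this zone.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - us-west1-a
      # Soft preference: favor nodes carrying a custom label.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: app
    image: example.com/myapp:1.0.0
```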
  67. Node Taints / Tolerations: special hardware, dedicated hosts, etc.

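
Taints keep pods off a node unless they explicitly tolerate the taint. A sketch of the pod side, assuming a node was tainted beforehand with something like `kubectl taint nodes node1 dedicated=gpu:NoSchedule`; the node name, taint key and value, pod name, and image are all hypothetical:

```yaml
# Sketch only: taint key/value, pod name, and image are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
spec:
  # Without this toleration the scheduler will not place the pod
  # on nodes tainted dedicated=gpu:NoSchedule.
  tolerations:
  - key: dedicated
    operator: Equal
    value: gpu
    effect: NoSchedule
  containers:
  - name: app
    image: example.com/myapp:1.0.0
```

A toleration only permits scheduling onto the tainted nodes; combining it with node affinity is the usual way to also attract the pod there.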
  68. Pod Affinity / Anti-Affinity: hostname, zone, region

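
A sketch that combines resource requests and limits with pod anti-affinity on the hostname topology, so that replicas of the same Deployment avoid sharing a node. The Deployment name, image, replica count, and resource numbers are illustrative assumptions:

```yaml
# Sketch only: names, image, replica count, and resource values
# are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      name: web
  template:
    metadata:
      labels:
        name: web
    spec:
      affinity:
        podAntiAffinity:
          # Never put two pods labeled name=web on the same node.
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                name: web
            topologyKey: kubernetes.io/hostname
      containers:
      - name: web
        image: example.com/web:1.0.0
        resources:
          # Requests drive scheduling; limits cap runtime usage.
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
```

With a hard anti-affinity rule, the replica count cannot exceed the number of eligible nodes; the `preferred...` variant relaxes that.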
  69. Use Namespaces to split up your cluster

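
A minimal sketch of carving out a namespace and capping what it can consume with a ResourceQuota. The namespace name and every quota value are hypothetical:

```yaml
# Sketch only: namespace name and quota values are hypothetical.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  # Aggregate ceilings across everything running in the namespace.
  hard:
    pods: "20"
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```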
  70. Role Based Access Control
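
A minimal sketch of RBAC in practice: a Role granting read-only access to pods in one namespace, bound to a single user. The namespace, role name, binding name, and user are hypothetical:

```yaml
# Sketch only: namespace, role/binding names, and user are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a
  name: pod-reader
rules:
# "" is the core API group, where pods live.
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: team-a
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```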

  71. Unleash the Chaos Monkey

  72. More Resources

    • http://blog.kubernetes.io/2016/08/security-best-practices-kubernetes-deployment.html
    • https://github.com/gravitational/workshop/blob/master/k8sprod.md
    • https://nodesource.com/blog/8-protips-to-start-killing-it-when-dockerizing-node-js/
    • https://www.ianlewis.org/en/using-kubernetes-health-checks
    • https://www.linux.com/learn/rolling-updates-and-rollbacks-using-kubernetes-deployments
    • https://kubernetes.io/docs/api-reference/v1.6/
  73. Questions? What best practices do you have?