Deploying Production Ready Kubernetes Clusters: Lessons Learnt

Rahul Mahale
November 06, 2017

Slides of my talk presented at DevopsDays CapeTown 2017

Transcript

  1. 2.

    $whoami
    Rahul Mahale
    FOSS Enthusiast. Sr. DevOps Engineer at BigBinary (https://www.bigbinary.com/), Kubernetes Partner.
    Rahul_Mahale
  2. 3.
  3. 4.

    Agenda
    • Kubernetes cluster provisioning considerations for production.
    • High-level requirements.
    • Deploying applications on a Kubernetes cluster.
  4. 5.
  5. 6.
  6. 8.
  7. 9.

    Container Orchestration
    • Every orchestration tool (Kubernetes, Docker Swarm, Mesos-Marathon) is opinionated when it comes to describing a containerized application.
    • This session focuses on Kubernetes.
  8. 11.

    Kubernetes terminology
    • Pods
    • Deployments
    • Services
    • ConfigMaps
    • Secrets
    • Jobs
    • Scheduled Jobs
    • HPA (Horizontal Pod Autoscaler)
  9. 13.

    A K8s cluster must be:
    • Highly available.
    • Secured (behind a VPN, with secured networking).
    • Auto-scalable.
  10. 14.

    How to do it?
    • Provision an HA cluster using Kops.
    • Private networking using Calico/Weave/Flannel.
    • Use the Cluster-autoscaler k8s addon.
    • Other tools:
      ◦ kubeadm, kubespray, etc.
  11. 15.

    Kubernetes Storage
    • Persistent volumes.
    • NFS storage.
    • HostPath.
    • Many more options with Kubernetes: Gluster, Ceph, OpenEBS, Rook.
  12. 16.

    Where to host the database?
    • On Kubernetes? Check StatefulSets.
    • We host our DBs on AWS RDS and on k8s using PVCs.
    • Pre-created on launch of the new application.
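
As a rough illustration of the PVC approach mentioned above, the sketch below builds a PersistentVolumeClaim manifest as a plain Python dict. The claim name, namespace, and storage size are hypothetical values chosen for the example, not the ones used in the talk.

```python
import json


def build_pvc(name, namespace, storage="10Gi", access_mode="ReadWriteOnce"):
    """Return a PersistentVolumeClaim manifest as a plain dict.

    All argument values are illustrative assumptions.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": storage}},
        },
    }


# Hypothetical claim for a database data directory.
print(json.dumps(build_pvc("postgres-data", "staging", storage="20Gi"), indent=2))
```

The JSON output can be fed to `kubectl apply -f -`, since kubectl accepts JSON as well as YAML manifests.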
  13. 18.

    Image Building
    Base images:
    • From public registries like DockerHub, Quay, GCR, etc.
    • Self-hosted.
    Generic best practices:
    • One layer for the base, one for user configuration, and one for the application.
    • Leverage the ‘USER’ directive to run programs inside the container as non-root.
    • Ensure regular scanning of images.
    • Use environment variables for runtime configuration.
  14. 19.

    Resource Management
    Use Kubernetes namespaces:
    • One namespace per user or group.
    • Separate namespaces for Dev/Test/Staging/Build.
    • Specify a Resource Quota (cpu, mem, #pods, #services, #RCs, #PersistentVolumeClaims) for each namespace.
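
A per-namespace quota covering the resources listed above could look roughly like the following sketch, which builds a ResourceQuota manifest as a Python dict. The numeric limits and the `<namespace>-quota` naming scheme are assumptions for illustration only.

```python
import json


def build_resource_quota(namespace, cpu="4", memory="8Gi", pods=20,
                         services=10, rcs=5, pvcs=10):
    """Return a ResourceQuota manifest capping a namespace.

    The default limits are placeholder values, not recommendations.
    """
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                # Object-count quotas are expressed as strings in the manifest.
                "pods": str(pods),
                "services": str(services),
                "replicationcontrollers": str(rcs),
                "persistentvolumeclaims": str(pvcs),
            }
        },
    }


# One quota per environment namespace (names are hypothetical).
for ns in ("dev", "test", "staging"):
    print(json.dumps(build_resource_quota(ns), indent=2))
```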
  15. 21.

    Zero downtime deployments
    Cluster upgrades:
    - Kops rolling update
    App deployments:
    - Health checks.
    - Readiness and liveness probes.
    - Container lifecycle hooks.
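
To make the probe and lifecycle-hook items above concrete, here is a minimal sketch of a container spec that sets a readiness probe, a liveness probe, and a preStop hook. The `/healthz` path, the port, the timings, and the `sleep 10` drain delay are all assumptions; real values depend on the application.

```python
import json


def build_container(name, image, port=8080, health_path="/healthz"):
    """Return a container spec with probes and a preStop hook.

    The health endpoint, port, and timings are illustrative assumptions.
    """
    http_check = {"httpGet": {"path": health_path, "port": port}}
    return {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        # Readiness gates traffic: the pod receives requests only once this passes.
        "readinessProbe": {**http_check, "initialDelaySeconds": 5, "periodSeconds": 5},
        # Liveness restarts the container if the check keeps failing.
        "livenessProbe": {**http_check, "initialDelaySeconds": 15, "periodSeconds": 10},
        # preStop runs before the container is terminated, giving in-flight
        # requests time to finish during a rolling update.
        "lifecycle": {"preStop": {"exec": {"command": ["sh", "-c", "sleep 10"]}}},
    }


print(json.dumps(build_container("web", "example/web:1.0"), indent=2))
```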
  16. 22.

    Ingress controller ➔ L7 load balancing
    - L7 load balancing
    - SSL termination
    - Path-based rules
    - Multiple host names
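
The Ingress features above (SSL termination, path-based rules, multiple host names) can be sketched as a single manifest. The example assumes the current `networking.k8s.io/v1` API rather than the version available at the time of the talk, and the host names, service names, and TLS secret name are hypothetical.

```python
import json


def build_ingress(name, tls_secret, routes):
    """Return an Ingress manifest.

    routes maps host -> {path: (service_name, service_port)}.
    Assumes the networking.k8s.io/v1 API; all names are illustrative.
    """
    rules = []
    for host, paths in routes.items():
        rules.append({
            "host": host,
            "http": {"paths": [
                {
                    "path": path,
                    "pathType": "Prefix",
                    "backend": {"service": {"name": svc, "port": {"number": port}}},
                }
                for path, (svc, port) in paths.items()
            ]},
        })
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            # TLS is terminated at the ingress controller for every listed host.
            "tls": [{"hosts": list(routes), "secretName": tls_secret}],
            "rules": rules,
        },
    }


routes = {
    "app.example.com": {"/": ("web", 80), "/api": ("api", 8080)},
    "admin.example.com": {"/": ("admin", 80)},
}
print(json.dumps(build_ingress("main", "example-tls", routes), indent=2))
```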
  17. 23.

    Auto-scaling applications on K8s
    • Horizontal Pod Autoscaler (CPU-based).
    • Memory-based autoscaler (our own).
    • Custom metrics autoscaling (v1.7.2).
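
The scaling rule the Horizontal Pod Autoscaler documents is desired = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1. The sketch below reproduces that rule for a CPU-utilization metric; the replica bounds and the 0.1 tolerance are example defaults, and this is not the autoscaler's actual source code.

```python
import math


def desired_replicas(current_replicas, current_cpu, target_cpu,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the documented HPA scaling rule for one metric.

    current_cpu and target_cpu are average utilization percentages.
    """
    ratio = current_cpu / target_cpu
    # Within the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, current_cpu=90, target_cpu=60))   # scale up
print(desired_replicas(10, current_cpu=30, target_cpu=60))  # scale down
```

A memory-based autoscaler can apply the same ratio to memory utilization instead of CPU.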
  18. 26.

    Kubernetes Cron Jobs
    • Enabling them requires restarting the API server with --runtime-config=batch/v2alpha1
    • Restart policy: restartPolicy: OnFailure
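
A cron job with the restart policy above might be described as in this sketch. The `api_version` default follows the `batch/v2alpha1` group named on the slide; newer clusters serve CronJob from `batch/v1`, so the value is a parameter. The job name, schedule, image, and command are hypothetical.

```python
import json


def build_cron_job(name, schedule, image, command, api_version="batch/v2alpha1"):
    """Return a CronJob manifest.

    api_version defaults to the group mentioned in the talk; pass "batch/v1"
    for current clusters. Other values are illustrative.
    """
    return {
        "apiVersion": api_version,
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,  # standard cron syntax
            "jobTemplate": {"spec": {"template": {"spec": {
                "containers": [{"name": name, "image": image, "command": command}],
                # Re-run the container in place if it exits with an error.
                "restartPolicy": "OnFailure",
            }}}},
        },
    }


job = build_cron_job("nightly-report", "0 2 * * *", "example/report:1.0",
                     ["sh", "-c", "echo generating report"])
print(json.dumps(job, indent=2))
```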
  19. 27.

    Known issues
    • K8s Issue #42164
      - Restart kubelet (docker) or drain and terminate the node.
      - Requests must be less than limits.
    • Pods with PVCs do not scale.
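
One way to catch the requests-versus-limits problem before deploying is a small pre-flight check over each container's `resources` block. The sketch below assumes the rule is that a request may not exceed its limit, and it parses only a simplified subset of Kubernetes quantity suffixes (`m` for CPU; `Ki`, `Mi`, `Gi`, `k`, `M`, `G` for memory). The function names are hypothetical.

```python
# Simplified subset of Kubernetes memory suffixes (assumption: no others needed).
_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
                 "k": 10**3, "M": 10**6, "G": 10**9}


def parse_cpu(value):
    """Convert a CPU quantity such as "500m" or "2" to cores."""
    value = str(value)
    if value.endswith("m"):
        return int(value[:-1]) / 1000
    return float(value)


def parse_memory(value):
    """Convert a memory quantity such as "256Mi" to bytes."""
    value = str(value)
    for suffix, factor in _MEMORY_UNITS.items():
        if value.endswith(suffix):
            return int(value[:-len(suffix)]) * factor
    return int(value)


def find_violations(resources):
    """Return the resource names whose request exceeds its limit."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    violations = []
    for key, parse in (("cpu", parse_cpu), ("memory", parse_memory)):
        if key in requests and key in limits:
            if parse(requests[key]) > parse(limits[key]):
                violations.append(key)
    return violations


bad = {"requests": {"cpu": "2", "memory": "1Gi"},
       "limits": {"cpu": "1500m", "memory": "512Mi"}}
print(find_violations(bad))
```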
  20. 28.

    Kops rolling upgrade
    • A Kops rolling upgrade might break if you are using Calico networking.
    • Check this document [1].
    • Cross-verify the kops-supported version for the k8s cluster.
    [1] https://github.com/kubernetes/kops/blob/master/docs/upgrade_from_kops_1.6_to_1.7_calico_cidr_migration.md
  21. 29.

    Monitoring the Kubernetes cluster and apps
    • The Heapster and InfluxDB addon is available.
      - kubectl top
    • Cluster monitoring using Prometheus and Grafana.
    • Configure Prometheus alerts to notify via Slack/email etc. using Alertmanager.
      - The Prometheus node-exporter and prometheus-core manifests must specify resources; otherwise they may consume more and more resources.
    Other tools:
    • Datadog/Sysdig/Weave Scope etc.
  22. 31.

    Automation
    • Create artifacts (database, secrets, etc.) using something like Ansible or your own tool.
    • Create deployment templates.
    • Helm is a good tool from the k8s community.
    • kubectl or the k8s API.
    • Label nodes script. Kops has an artifact to specify in cluster.yml.
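
A label-nodes script of the kind mentioned above could be as small as the following sketch, which only assembles the `kubectl label nodes` command line and leaves execution to the caller. The node name and labels are hypothetical, and this is not the script used in the talk.

```python
def label_node_command(node, labels, overwrite=True):
    """Build the argv list for `kubectl label nodes`.

    Labels are sorted so the generated command is deterministic.
    """
    cmd = ["kubectl", "label", "nodes", node]
    cmd += [f"{key}={value}" for key, value in sorted(labels.items())]
    if overwrite:
        # --overwrite lets the command replace an existing label value.
        cmd.append("--overwrite")
    return cmd


cmd = label_node_command("node-1", {"role": "worker", "env": "staging"})
print(" ".join(cmd))
```

To apply the labels, the list can be passed to `subprocess.run(cmd, check=True)` on a machine with cluster access.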
  23. 32.

    Backup and restore of the Kubernetes cluster
    • Kubernetes state is maintained in etcd, which is a distributed key-value store.
    • Deployment considerations for etcd:
      - Fault-tolerant cluster
      - Storage for etcd (network and IO latency directly affect etcd)
      - etcd data backup and restore
        - Reshifter (https://github.com/mhausenblas/reshifter)
        - Ark
    • Enable TLS
  24. 33.

    Don’t forget
    • Kubeval [1].
    • Kubernetes-dashboard.
    • Heapster.
    • kube-state-metrics, etc.
    [1] https://github.com/garethr/kubeval
  25. 34.