Deploying Production Ready Kubernetes Clusters: Lessons Learnt

Rahul Mahale
November 06, 2017

Slides of my talk presented at DevOpsDays Cape Town 2017

Transcript

  1. $whoami
    Rahul Mahale
    FOSS Enthusiast. Sr. DevOps Engineer at BigBinary (https://www.bigbinary.com/), Kubernetes Partner.
    Rahul_Mahale
  2. Agenda
    • Kubernetes cluster provisioning considerations for production.
    • High-level requirements.
    • Deploying applications on a Kubernetes cluster.
  3. Container Orchestration
    • Every orchestration software (Kubernetes, Docker Swarm, Mesos-Marathon) is opinionated when it comes to describing a containerized application.
    • This session is focused on Kubernetes.
  4. Kubernetes terminologies
    • Pods
    • Deployments
    • Services
    • ConfigMaps
    • Secrets
    • Jobs
    • Scheduled Jobs
    • HPA (Horizontal Pod Autoscaler)
  5. K8s cluster must be:
    • Highly available.
    • Secured (behind VPN, secured networking).
    • Auto-scalable.
  6. Ways to do it
    • Provision an HA cluster using Kops.
    • Private networking using Calico/Weave/Flannel.
    • Use the cluster-autoscaler k8s addon.
    • Other tools:
      ◦ kubeadm, kubespray etc.
  7. Kubernetes Storage
    • Persistent volumes.
    • NFS storage.
    • HostPath.
    • Many more options with Kubernetes: Gluster, Ceph, OpenEBS, Rook.
  8. Where to host the database?
    • On Kubernetes? Check StatefulSets.
    • We host our DB on AWS RDS and on k8s using PVC.
    • Pre-created on launch of the new application.
  9. Image Building
    Base images:
    • From public registries like DockerHub, Quay, GCR etc.
    • Self-hosted.
    Generic best practice:
    • One layer for base, one for user configuration and one for application.
    • Leverage the ‘USER’ directive to run programs inside the container as non-root.
    • Ensure regular scanning of images.
    • Use environment variables for runtime configuration.
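The last point on this slide, runtime configuration through environment variables, could look roughly like the sketch below on the application side. The variable names (`DATABASE_URL`, `PORT`, `DEBUG`) and their defaults are assumptions chosen for illustration and are not from the talk.

```python
import os


def load_config(env=None):
    """Read runtime settings from environment variables, with defaults.

    The variable names and default values are hypothetical examples.
    """
    env = os.environ if env is None else env
    return {
        "database_url": env.get("DATABASE_URL", "postgres://localhost:5432/app"),
        "port": int(env.get("PORT", "8080")),
        "debug": env.get("DEBUG", "false").lower() in ("1", "true", "yes"),
    }


if __name__ == "__main__":
    # The same image can run in Dev/Test/Staging with different settings.
    print(load_config())
```

Because nothing environment-specific is baked into the image, the values can be supplied per environment, for example from ConfigMaps and Secrets.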
  10. Resource Management
    Use Kubernetes namespaces:
    • One namespace per user or group.
    • Separate namespaces for Dev/Test/Staging/Build.
    • Specify a Resource Quota (cpu, mem, #pods, #services, #RCs, #PersistentVolumeClaims) for each namespace.
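A per-namespace quota like the one described above might be generated as follows. This is a minimal sketch: the helper name `resource_quota`, the quota name pattern and the sample values are assumptions, while the keys under `hard` are standard ResourceQuota resource names. The JSON output can be passed to `kubectl apply -f -`.

```python
import json


def resource_quota(namespace, cpu, memory, pods, services, rcs, pvcs):
    """Build a v1 ResourceQuota manifest for one namespace (hypothetical helper)."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "pods": str(pods),
                "services": str(services),
                "replicationcontrollers": str(rcs),
                "persistentvolumeclaims": str(pvcs),
            }
        },
    }


if __name__ == "__main__":
    # Sample values only; size them for your own workloads.
    quota = resource_quota("staging", "8", "16Gi", 40, 10, 20, 10)
    print(json.dumps(quota, indent=2))
```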
  11. Zero downtime deployments
    Cluster upgrades:
    - Kops rolling update.
    App deployments:
    - Healthcheck.
    - Readiness and liveness probes.
    - Container lifecycle hooks.
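As an illustration of how probes and a lifecycle hook fit into a Deployment, here is a sketch that builds a manifest with a rolling-update strategy. The helper name, the `/healthz` path, the port, the timings and the `sleep 10` preStop command are all assumptions; the `apps/v1` API group is what current clusters use and may differ on older versions.

```python
import json


def zero_downtime_deployment(name, image, port=8080, replicas=3):
    """Build a Deployment manifest with probes and a preStop hook (illustrative values)."""
    probe = {"httpGet": {"path": "/healthz", "port": port}}
    container = {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        # Traffic is only routed to the pod once the readiness probe passes.
        "readinessProbe": dict(probe, initialDelaySeconds=5, periodSeconds=5),
        # The container is restarted if the liveness probe keeps failing.
        "livenessProbe": dict(probe, initialDelaySeconds=15, periodSeconds=10),
        # Give in-flight requests time to drain before the container stops.
        "lifecycle": {"preStop": {"exec": {"command": ["sh", "-c", "sleep 10"]}}},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            # Never take an old pod down before a new one is ready.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(zero_downtime_deployment("web", "example/web:1.0"), indent=2))
```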
  12. Ingress controller ➔ L7 load balancing
    - L7 load balancing
    - SSL termination
    - Path-based rules
    - Multiple host names
  13. Auto-scaling applications on K8s
    • Horizontal Pod Autoscaler (CPU based).
    • Memory-based autoscaler (our own).
    • Custom metrics autoscaling (v1.7.2).
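To make the scaling decision concrete, the sketch below applies the rule given in the Kubernetes documentation for the Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance of 1.0. The function name and the default bounds are assumptions, and the real controller handles further cases (unready pods, missing metrics) that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA calculation; not the full controller logic."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
```

The same calculation applies whether the metric is CPU, memory or a custom metric; only the source of `current_metric` changes.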
  14. Kubernetes Cron Jobs
    • The API server must be restarted with --runtime-config=batch/v2alpha1 to enable them.
    • Restart policy: restartPolicy: OnFailure
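For reference, a CronJob manifest using the API version and restart policy named on this slide could be built as below. The helper name, job name, schedule, image and command are illustrative assumptions. Newer clusters serve CronJob under `batch/v1`, so the API version is a parameter.

```python
import json


def cron_job(name, schedule, image, command, api_version="batch/v2alpha1"):
    """Build a CronJob manifest (hypothetical helper, sample values)."""
    return {
        "apiVersion": api_version,
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "containers": [
                                {"name": name, "image": image, "command": command}
                            ],
                            # Re-run the container in place if it exits non-zero.
                            "restartPolicy": "OnFailure",
                        }
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    job = cron_job("nightly-report", "0 2 * * *", "example/report:1.0",
                   ["sh", "-c", "echo generating report"])
    print(json.dumps(job, indent=2))
```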
  15. Known issues
    • K8s Issue #42164
      - Restart kubelet (docker) or drain and terminate the node.
      - Requests must be less than limits.
    • Pods with PVCs do not scale.
  16. Kops rolling upgrade
    • Kops rolling upgrade might break if you are using Calico networking.
    • Check this document [1].
    • Cross-verify the Kops supported version for the k8s cluster.
    [1] https://github.com/kubernetes/kops/blob/master/docs/upgrade_from_kops_1.6_to_1.7_calico_cidr_migration.md
  17. Monitoring Kubernetes cluster and apps
    • Heapster and InfluxDB addon is available.
      - kubectl top
    • Cluster monitoring using Prometheus and Grafana.
    • Configure Prometheus alerts to notify on Slack/email etc. using Alertmanager.
      - The Prometheus node-exporter and prometheus-core manifests must specify resources; otherwise they may consume more and more resources.
    Other tools:
    • Datadog/Sysdig/Weave Scope etc.
  18. Automation
    • Create artifacts using something like Ansible or your own tool: create database, secrets etc.
    • Create deployment templates.
    • Helm is a good tool from the k8s community.
    • kubectl or k8s API.
    • Label nodes script. Kops has an artifact to specify in cluster.yml.
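A node-labelling script of the kind mentioned above might start from a helper like this one, which only builds the `kubectl label nodes` command line and leaves execution to the caller. The function name, node name and labels are assumptions for illustration.

```python
def label_node_command(node, labels, overwrite=True):
    """Return the kubectl argv that applies the given labels to one node."""
    cmd = ["kubectl", "label", "nodes", node]
    # Sort so the generated command is stable between runs.
    cmd += [f"{key}={value}" for key, value in sorted(labels.items())]
    if overwrite:
        # Replace existing label values instead of failing on them.
        cmd.append("--overwrite")
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it; pass the list to
    # subprocess.run(...) to apply it against a real cluster.
    print(" ".join(label_node_command("node-1", {"role": "worker", "env": "staging"})))
```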
  19. Backup and restore of Kubernetes Cluster
    • Kubernetes state is maintained in ETCD, which is a distributed key-value store.
    • Deployment considerations for ETCD:
      - Fault-tolerant cluster.
      - Storage for ETCD (network and IO latency directly affect ETCD).
      - ETCD data backup and restore:
        ◦ Reshifter (https://github.com/mhausenblas/reshifter)
        ◦ Ark
    • Enable TLS.
  20. Don’t Forget
    • Kubeval [1].
    • Kubernetes-dashboard.
    • Heapster.
    • kube-state-metrics etc.
    [1] https://github.com/garethr/kubeval