Kubernetes Storage Overview

Intro to Kubernetes, volumes, Persistent Volumes, Persistent Volume Claims, Dynamic Provisioning, and Storage Classes

Michelle Au

June 28, 2017

Transcript

  1. Agenda (Google Cloud Platform)

     1. Kubernetes Overview
     2. Kubernetes Volumes
        • Direct pod volume reference (and why it’s a bad idea)
        • Persistent volumes and claims (and why they’re a good idea)
     3. Dynamic Volume Provisioning

  2. Kubernetes

     Greek for “Helmsman”; also the root of the words “governor” and “cybernetic”.

     • Manages container clusters
     • Inspired and informed by Google’s experiences and internal systems
     • Supports multiple cloud and bare-metal environments
     • Supports multiple container runtimes
     • 100% open source, written in Go

     Manage applications, not machines.

  3. Kubernetes Goals

     • API and implementation 100% open
     • Modular and replaceable
     • Don’t force apps to know about concepts that are
       • Cloud provider specific
       • Kubernetes specific

     Workload portability: enable users to
     • Write once, run anywhere
     • Avoid vendor lock-in
     • Avoid coupling app to infrastructure

  4. Container cluster orchestration

     • Scheduling: Decide where my containers should run
     • Lifecycle and health: Keep my containers running despite failures
     • Scaling: Make sets of containers bigger or smaller
     • Naming and discovery: Find where my containers are now
     • Load balancing: Distribute traffic across a set of containers
     • Storage volumes: Provide data to containers
     • Logging and monitoring: Track what’s happening with my containers
     • Debugging and introspection: Enter or attach to containers
     • Identity and authorization: Control who can do things to my containers

  5. Pods

     Small group of containers & volumes, tightly coupled.
     The atom of scheduling & placement.

     Shared namespace
     • share IP address & localhost
     • share IPC, etc.

     Managed lifecycle
     • bound to a node, restart in place
     • can die, cannot be reborn with same ID

     Example: data puller & web server

     [Diagram labels: Consumers, Content Manager, File Puller, Web Server, Volume, Pod]

  6. Pods can be started/killed individually

     Controllers can manage pods for you.

     Workload-specific APIs
     • ReplicaSet: fungible replicas
     • StatefulSet: stateful applications
     • DaemonSet: cluster services
     • Job: batch workloads

     Layered on top of the public Pod API.
     You could write your own workload-specific controllers.

     [Diagram: a ReplicaSet (name = “my-rc”, template = { ... }, replicas = 4) talking to the API Server: “How many?” “3” “Start 1 more” “OK” “How many?” “4”]

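The exchange in the ReplicaSet diagram ("How many?", "3", "Start 1 more") is a reconciliation loop: compare the observed pod count with the desired count and act on the difference. The following is a minimal sketch of that idea only, not the Kubernetes implementation; `FakeCluster`, `start_pod`, `stop_pod`, and `reconcile` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeCluster:
    """Hypothetical in-memory stand-in for the API server's view of pods."""
    pods: list = field(default_factory=list)
    _ids: count = field(default_factory=count, repr=False)

    def start_pod(self, template: str) -> None:
        # Name each new pod after its template plus a running counter.
        self.pods.append(f"{template}-{next(self._ids)}")

    def stop_pod(self) -> None:
        self.pods.pop()


def reconcile(cluster: FakeCluster, template: str, replicas: int) -> int:
    """Move the observed pod count toward the desired count.

    Returns the number of pods started (positive) or stopped (negative).
    """
    diff = replicas - len(cluster.pods)
    for _ in range(diff):       # empty range when diff <= 0
        cluster.start_pod(template)
    for _ in range(-diff):      # empty range when diff >= 0
        cluster.stop_pod()
    return diff


cluster = FakeCluster()
for _ in range(3):
    cluster.start_pod("my-rc")                      # three replicas already running

changed = reconcile(cluster, "my-rc", replicas=4)   # "How many? 3. Start 1 more."
print(changed, len(cluster.pods))                   # prints: 1 4
print(reconcile(cluster, "my-rc", replicas=4))      # "How many? 4." prints: 0
```

Because the loop only looks at the difference between desired and observed state, running it again after a pod dies would start a replacement without any extra bookkeeping.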
  7. Problem

     • Can’t share files between containers
     • Files in containers are ephemeral
       • Can’t run stateful applications
       • Container crashes result in loss of data

     [Diagram labels: Consumers, Content Manager, File Puller, Web Server, ?, Pod]

  8. Volumes

     Kubernetes Volumes
     • Directory, possibly with some data in it
     • Accessible by all containers in pod
     • Lifetime same as the pod or longer
     • Volume plugins define
       • How directory is set up
       • Medium that backs it
       • Contents of the directory
     • http://kubernetes.io/docs/user-guide/volumes/

     [Diagram labels: Consumers, Content Manager, File Puller, Web Server, Volume, Pod]

  9. Volume Plugins

     Kubernetes has many volume plugins.

     Persistent
     • GCE Persistent Disk
     • AWS Elastic Block Store
     • Azure File Storage
     • Azure Data Disk
     • iSCSI
     • Flocker
     • NFS
     • vSphere
     • GlusterFS
     • Ceph File and RBD
     • Cinder
     • Quobyte Volume
     • FibreChannel
     • VMware Photon PD

     Ephemeral
     • Empty dir (and tmpfs)
     • Expose Kubernetes API
       • Secret
       • ConfigMap
       • DownwardAPI

     Other
     • Flex (exec a binary)
     • Host path

     Future
     • Local Storage (alpha)
     • CSI

  10. GCE PD Example: volume referenced directly (gcepd.yaml)

          apiVersion: v1
          kind: Pod
          metadata:
            name: sleepypod
          spec:
            volumes:
              - name: data
                gcePersistentDisk:
                  pdName: panda-disk
                  fsType: ext4
            containers:
              - name: sleepycontainer
                image: gcr.io/google_containers/busybox
                command:
                  - sleep
                  - "6000"
                volumeMounts:
                  - name: data
                    mountPath: /data
                    readOnly: false

  14. Kubernetes Goals

      • API and implementation 100% open
      • Modular and replaceable
      • Don’t force apps to know about concepts that are
        • Cloud provider specific
        • Kubernetes specific

      Workload portability: enable users to
      • Write once, run anywhere
      • Avoid vendor lock-in
      • Avoid coupling app to infrastructure

  15. PV/PVC Example

      • Abstracts details of how storage is provided from how it is consumed
      • PersistentVolume (PV) API object
        • Piece of networked storage in the cluster
        • Not used directly in pod
        • Lifecycle independent of any individual pod
      • PersistentVolumeClaim (PVC) API object
        • Request for storage by a user
        • Claims request specific size and access modes of storage
        • Pods reference claims

      [Diagram labels: User, PVClaim, Pod, Cluster Admin, PersistentVolumes]

  16. PV/PVC Example (pv.yaml)

          apiVersion: v1
          kind: PersistentVolume
          metadata:
            name: pv1
          spec:
            accessModes:
              - ReadWriteOnce
            capacity:
              storage: 10Gi
            persistentVolumeReclaimPolicy: Retain
            gcePersistentDisk:
              fsType: ext4
              pdName: panda-disk
          ---
          apiVersion: v1
          kind: PersistentVolume
          metadata:
            name: pv2
          spec:
            accessModes:
              - ReadWriteOnce
            capacity:
              storage: 100Gi
            persistentVolumeReclaimPolicy: Retain
            gcePersistentDisk:
              fsType: ext4
              pdName: panda-disk2

  17. PV/PVC Example

          $ kubectl create -f pv.yaml
          persistentvolume "pv1" created
          persistentvolume "pv2" created
          $ kubectl get pv
          NAME   CAPACITY   ACCESSMODES   STATUS      CLAIM   REASON   AGE
          pv1    10Gi       RWO           Available                    1m
          pv2    100Gi      RWO           Available                    1m

  18. PV/PVC Example (pvc.yaml)

          apiVersion: v1
          kind: PersistentVolumeClaim
          metadata:
            name: mypvc
            namespace: testns
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 100Gi

  19. PV/PVC Example

          $ kubectl create -f pv.yaml
          persistentvolume "pv1" created
          persistentvolume "pv2" created
          $ kubectl get pv
          NAME   CAPACITY   ACCESSMODES   STATUS      CLAIM          REASON   AGE
          pv1    10Gi       RWO           Available                           1m
          pv2    100Gi      RWO           Available                           1m
          $ kubectl create -f pvc.yaml
          persistentvolumeclaim "mypvc" created
          $ kubectl get pv
          NAME   CAPACITY   ACCESSMODES   STATUS      CLAIM          REASON   AGE
          pv1    10Gi       RWO           Available                           3m
          pv2    100Gi      RWO           Bound       testns/mypvc            3m

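The output above shows the 100Gi claim binding to pv2 rather than pv1. A rough sketch of the matching rule behind that result follows, assuming a simplified model in which a claim binds to the smallest unbound volume that satisfies its size and access modes. The real binder considers more (storage class, label selectors, and so on); the `PV`, `PVC`, `to_bytes`, and `bind` names here are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional

_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def to_bytes(quantity: str) -> int:
    """Parse a binary-suffixed quantity such as "10Gi" (other forms not handled)."""
    return int(quantity[:-2]) * _UNITS[quantity[-2:]]


@dataclass
class PV:
    name: str
    capacity: str
    access_modes: frozenset
    claim: Optional[str] = None     # "namespace/name" once bound


@dataclass
class PVC:
    namespace: str
    name: str
    request: str
    access_modes: frozenset


def bind(pvc: PVC, volumes: list) -> Optional[PV]:
    """Bind the claim to the smallest unbound volume that satisfies it."""
    candidates = [
        pv for pv in volumes
        if pv.claim is None
        and pvc.access_modes <= pv.access_modes             # all requested modes offered
        and to_bytes(pv.capacity) >= to_bytes(pvc.request)  # big enough
    ]
    if not candidates:
        return None                 # in this model the claim simply stays unbound
    best = min(candidates, key=lambda pv: to_bytes(pv.capacity))
    best.claim = f"{pvc.namespace}/{pvc.name}"
    return best


rwo = frozenset({"ReadWriteOnce"})
volumes = [PV("pv1", "10Gi", rwo), PV("pv2", "100Gi", rwo)]
bound = bind(PVC("testns", "mypvc", "100Gi", rwo), volumes)
print(bound.name, bound.claim)      # prints: pv2 testns/mypvc
```

pv1 is skipped because 10Gi is smaller than the 100Gi request, which matches the `kubectl get pv` listing where pv1 stays Available.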
  20. PV/PVC Example: volume referenced via PVC (gcepd.yaml)

          apiVersion: v1
          kind: Pod
          metadata:
            name: sleepypod
          spec:
            volumes:
              - name: data
                gcePersistentDisk:
                  pdName: panda-disk
                  fsType: ext4
            containers:
              - name: sleepycontainer
                image: gcr.io/google_containers/busybox
                command:
                  - sleep
                  - "6000"
                volumeMounts: ...

      The gcePersistentDisk entry under volumes is replaced with a reference to the claim:

          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: mypvc

  21. Dynamic Provisioning and Storage Classes

      • Allows storage to be created on demand (when requested by user).
      • Eliminates need for cluster administrators to pre-provision storage.
      • Alpha in Kubernetes 1.2
      • Beta in 1.4
      • GA in 1.6

      [Diagram labels: User, PVClaim, Cluster Admin, Storage Provider, Storage Class]

  22. Dynamic Provisioning and Storage Classes

      • Cluster/storage admins “enable” dynamic provisioning by creating StorageClass objects.
      • StorageClass objects define the parameters used during creation.
      • StorageClass parameters are opaque to Kubernetes, so storage providers can expose any number of custom parameters for the cluster admin to use.

      storage_class.yaml

          kind: StorageClass
          apiVersion: storage.k8s.io/v1
          metadata:
            name: slow
          provisioner: kubernetes.io/gce-pd
          parameters:
            type: pd-standard
          ---
          kind: StorageClass
          apiVersion: storage.k8s.io/v1
          metadata:
            name: fast
          provisioner: kubernetes.io/gce-pd
          parameters:
            type: pd-ssd

      [Diagram labels: Cluster Admin, Storage Class]

  23. Dynamic Provisioning and Storage Classes

      • Users consume storage the same way: PVC
      • “Selecting” a storage class in PVC triggers dynamic provisioning

      pvc.yaml (on k8s 1.5)

          apiVersion: v1
          kind: PersistentVolumeClaim
          metadata:
            name: mypvc
            namespace: testns
            annotations:
              volume.beta.kubernetes.io/storage-class: fast
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 100Gi

      pvc.yaml (on k8s 1.6+)

          apiVersion: v1
          kind: PersistentVolumeClaim
          metadata:
            name: mypvc
            namespace: testns
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 100Gi
            storageClassName: fast

      [Diagram labels: User, PVClaim]

  24. Dynamic Provisioning and Storage Classes

      • Users consume storage the same way: PVC
      • “Selecting” a storage class in PVC triggers dynamic provisioning

          $ kubectl create -f storage_class.yaml
          storageclass "fast" created
          $ kubectl create -f pvc.yaml
          persistentvolumeclaim "mypvc" created
          $ kubectl get pvc --all-namespaces
          NAMESPACE   NAME    STATUS   VOLUME                                     CAPACITY   ACCESSMODES   AGE
          testns      mypvc   Bound    pvc-331d7407-fe18-11e6-b7cd-42010a8000cd   100Gi      RWO           6s
          $ kubectl get pv pvc-331d7407-fe18-11e6-b7cd-42010a8000cd
          NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS   CLAIM          REASON   AGE
          pvc-331d7407-fe18-11e6-b7cd-42010a8000cd   100Gi      RWO           Delete          Bound    testns/mypvc            13m

      [Diagram label: Storage Provider]

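The flow in this output can be sketched as follows: a claim names a storage class, the class names a provisioner, and the provisioner receives the class parameters untouched and creates a volume sized to the claim. This is a toy model under stated assumptions, not the Kubernetes code path. `fake_gce_pd_provisioner`, the `diskType` key, and the `provision` helper are hypothetical; the `pvc-<uid>` name shape and the `Delete` reclaim policy simply mirror what the `kubectl get pv` output above shows.

```python
import uuid
from dataclasses import dataclass


@dataclass
class StorageClass:
    name: str
    provisioner: str
    parameters: dict    # opaque to the core; only the provisioner interprets it


@dataclass
class PV:
    name: str
    capacity: str
    reclaim_policy: str
    claim: str
    backing: dict


def fake_gce_pd_provisioner(parameters: dict, size: str) -> dict:
    """Hypothetical stand-in for a storage backend call; creates nothing real."""
    return {"diskType": parameters.get("type", "pd-standard"), "size": size}


# Registry keyed by the provisioner string used in storage_class.yaml.
PROVISIONERS = {"kubernetes.io/gce-pd": fake_gce_pd_provisioner}


def provision(namespace, claim_name, size, class_name, classes, claim_uid=None):
    """Create a volume on demand for a claim that names a storage class."""
    sc = classes[class_name]
    backing = PROVISIONERS[sc.provisioner](sc.parameters, size)
    claim_uid = claim_uid or str(uuid.uuid4())   # random UID when none is supplied
    return PV(f"pvc-{claim_uid}", size, "Delete", f"{namespace}/{claim_name}", backing)


classes = {
    "slow": StorageClass("slow", "kubernetes.io/gce-pd", {"type": "pd-standard"}),
    "fast": StorageClass("fast", "kubernetes.io/gce-pd", {"type": "pd-ssd"}),
}
pv = provision("testns", "mypvc", "100Gi", "fast", classes)
print(pv.name.startswith("pvc-"), pv.reclaim_policy, pv.backing["diskType"])
# prints: True Delete pd-ssd
```

Keeping `parameters` as a plain dictionary that only the provisioner reads is what lets each storage provider define its own options without changes to the claim side.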
  25. Dynamic Provisioning and Storage Classes: volume referenced via PVC (pod.yaml)

          apiVersion: v1
          kind: Pod
          metadata:
            name: sleepypod
          spec:
            volumes:
              - name: data
                persistentVolumeClaim:
                  claimName: mypvc
            containers:
              - name: sleepycontainer
                image: gcr.io/google_containers/busybox
                command:
                  - sleep
                  - "6000"
                volumeMounts:
                  - name: data
                    mountPath: /data
                    readOnly: false

  26. Summary

      • Kubernetes volumes allow data to be persisted and shared between containers in a pod.
      • Persistent Volumes and Persistent Volume Claims allow the application to be portable.
      • Dynamic provisioning and Storage Classes enable on-demand storage creation, simplifying the admin’s role.

  27. Kubernetes is Open

      • open community
      • open design
      • open source
      • open to ideas

      https://kubernetes.io
      Code: github.com/kubernetes/kubernetes
      Chat: slack.k8s.io
      Twitter: @kubernetesio