Slide 1

Slide 1 text

Google Cloud Platform
Overview of Kubernetes Storage
6/28/2017
Michelle Au, Software Engineer - Google
GitHub: @msau42

Slide 2

Slide 2 text

Agenda
1. Kubernetes Overview
2. Kubernetes Volumes
   • Direct pod volume reference (and why it’s a bad idea)
   • Persistent volumes and claims (and why they’re a good idea)
3. Dynamic Volume Provisioning

Slide 3

Slide 3 text

Kubernetes Overview

Slide 4

Slide 4 text

Kubernetes
Greek for “Helmsman”; also the root of the words “governor” and “cybernetic”
• Manages container clusters
• Inspired and informed by Google’s experiences and internal systems
• Supports multiple cloud and bare-metal environments
• Supports multiple container runtimes
• 100% open source, written in Go
Manage applications, not machines

Slide 5

Slide 5 text

Separation of Concerns
• Application (Dev)
• Cluster
• Kernel/OS (System)
• Hardware

Slide 6

Slide 6 text

Workload Portability
Kubernetes Goals
• API and implementation 100% open
• Modular and replaceable
• Don’t force apps to know about concepts that are
  • Cloud provider specific
  • Kubernetes specific
Enable Users To
• Write once, run anywhere
• Avoid vendor lock-in
• Avoid coupling app to infrastructure

Slide 7

Slide 7 text

Container cluster orchestration
● Scheduling: Decide where my containers should run
● Lifecycle and health: Keep my containers running despite failures
● Scaling: Make sets of containers bigger or smaller
● Naming and discovery: Find where my containers are now
● Load balancing: Distribute traffic across a set of containers
● Storage volumes: Provide data to containers
● Logging and monitoring: Track what’s happening with my containers
● Debugging and introspection: Enter or attach to containers
● Identity and authorization: Control who can do things to my containers

Slide 8

Slide 8 text

Pods
Small group of containers & volumes
Tightly coupled
The atom of scheduling & placement
Shared namespace
• share IP address & localhost
• share IPC, etc.
Managed lifecycle
• bound to a node, restart in place
• can die, cannot be reborn with same ID
Example: data puller & web server
[Diagram: Consumers, Content Manager, and a Pod containing File Puller, Web Server, and Volume]

Slide 9

Slide 9 text

Workload-specific Controllers
Pods can be started/killed individually
Controllers can manage pods for you
Workload-specific APIs
• ReplicaSet: fungible replicas
• StatefulSet: stateful applications
• DaemonSet: cluster services
• Job: batch workloads
Layered on top of the public Pod API
You could write your own
[Diagram: a ReplicaSet (name = “my-rc”, template = { ... }, replicas = 4) talking to the API Server: “How many?” “3”; “Start 1 more” “OK”; “How many?” “4”]
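
The exchange in the diagram (“How many?” “3”, “Start 1 more”, “How many?” “4”) is a reconciliation loop: compare the observed pod count with the desired replica count and act on the difference. A minimal Python sketch of that idea follows. FakeApiServer and reconcile are hypothetical names for illustration, not the real Kubernetes API or client library.

```python
class FakeApiServer:
    """Hypothetical in-memory stand-in for the API server (not a real client)."""

    def __init__(self):
        self.pods = []        # names of running pods
        self._next_id = 0

    def count_pods(self):
        return len(self.pods)

    def start_pod(self, template):
        name = f"{template}-{self._next_id}"
        self._next_id += 1
        self.pods.append(name)
        return name

    def stop_pod(self):
        # Stops the most recently started pod.
        return self.pods.pop()


def reconcile(api, template, replicas):
    """Drive the observed pod count toward the desired replica count."""
    actions = []
    while api.count_pods() < replicas:       # too few: start more
        actions.append(("start", api.start_pod(template)))
    while api.count_pods() > replicas:       # too many: stop extras
        actions.append(("stop", api.stop_pod()))
    return actions


api = FakeApiServer()
for _ in range(3):                           # three pods are already running
    api.start_pod("my-rc")
actions = reconcile(api, "my-rc", replicas=4)
print(actions)
print(api.count_pods())
```

Calling reconcile again with the same replica count returns no actions, which is the property that lets a controller run the loop repeatedly.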

Slide 10

Slide 10 text

Kubernetes Volumes

Slide 11

Slide 11 text

Problem
Can’t share files between containers
Files in containers are ephemeral
• Can’t run stateful applications
• Container crashes result in loss of data
[Diagram: Consumers, Content Manager, and a Pod containing File Puller, Web Server, and a “?” in place of the volume]

Slide 12

Slide 12 text

Volumes
Kubernetes Volumes
• Directory, possibly with some data in it
• Accessible by all containers in pod
• Lifetime same as the pod or longer
• Volume plugins define
  • How the directory is set up
  • Medium that backs it
  • Contents of the directory
• http://kubernetes.io/docs/user-guide/volumes/
[Diagram: Consumers, Content Manager, and a Pod containing File Puller, Web Server, and Volume]

Slide 13

Slide 13 text

Volume Plugins
Kubernetes has many volume plugins
Persistent
• GCE Persistent Disk
• AWS Elastic Block Store
• Azure File Storage
• Azure Data Disk
• iSCSI
• Flocker
• NFS
• vSphere
• GlusterFS
• Ceph File and RBD
• Cinder
• Quobyte Volume
• Fibre Channel
• VMware Photon PD
Ephemeral
• Empty dir (and tmpfs)
• Expose Kubernetes API
  • Secret
  • ConfigMap
  • DownwardAPI
Other
• Flex (exec a binary)
• Host path
Future
• Local Storage (alpha)
• CSI

Slide 14

Slide 14 text

GCE PD Example
Volume referenced directly

gcepd.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: sleepypod
spec:
  volumes:
    - name: data
      gcePersistentDisk:
        pdName: panda-disk
        fsType: ext4
  containers:
    - name: sleepycontainer
      image: gcr.io/google_containers/busybox
      command:
        - sleep
        - "6000"
      volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false

Slide 18

Slide 18 text

Workload Portability
Kubernetes Goals
• API and implementation 100% open
• Modular and replaceable
• Don’t force apps to know about concepts that are
  • Cloud provider specific
  • Kubernetes specific
Enable Users To
• Write once, run anywhere
• Avoid vendor lock-in
• Avoid coupling app to infrastructure

Slide 19

Slide 19 text

Persistent Volumes & Claims

Slide 20

Slide 20 text

PV/PVC Example
• Abstracts details of how storage is provided from how it is consumed
• PersistentVolume (PV) API Object
  • Piece of networked storage in the cluster
  • Not used directly in pod
  • Lifecycle independent of any individual pod
• PersistentVolumeClaim (PVC) API Object
  • Request for storage by a user
  • Claims request specific size and access modes of storage
  • Pods reference claims
[Diagram: User, PVClaim, Pod; Cluster Admin, PersistentVolumes]

Slide 21

Slide 21 text

PV/PVC Example

pv.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv1
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  persistentVolumeReclaimPolicy: Retain
  gcePersistentDisk:
    fsType: ext4
    pdName: panda-disk
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv2
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 100Gi
  persistentVolumeReclaimPolicy: Retain
  gcePersistentDisk:
    fsType: ext4
    pdName: panda-disk2

Slide 22

Slide 22 text

PV/PVC Example

$ kubectl create -f pv.yaml
persistentvolume "pv1" created
persistentvolume "pv2" created
$ kubectl get pv
NAME   CAPACITY   ACCESSMODES   STATUS      CLAIM   REASON   AGE
pv1    10Gi       RWO           Available                    1m
pv2    100Gi      RWO           Available                    1m

Slide 23

Slide 23 text

PV/PVC Example

pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
  namespace: testns
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi

Slide 24

Slide 24 text

PV/PVC Example

$ kubectl create -f pv.yaml
persistentvolume "pv1" created
persistentvolume "pv2" created
$ kubectl get pv
NAME   CAPACITY   ACCESSMODES   STATUS      CLAIM   REASON   AGE
pv1    10Gi       RWO           Available                    1m
pv2    100Gi      RWO           Available                    1m
$ kubectl create -f pvc.yaml
persistentvolumeclaim "mypvc" created
$ kubectl get pv
NAME   CAPACITY   ACCESSMODES   STATUS      CLAIM          REASON   AGE
pv1    10Gi       RWO           Available                           3m
pv2    100Gi      RWO           Bound       testns/mypvc            3m
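
Why the 100Gi claim lands on pv2 rather than pv1 can be illustrated with a short Python sketch. It assumes a simplified matching rule: pick the smallest unbound volume whose capacity and access modes satisfy the claim. The real binder takes more into account (storage classes and label selectors, for example), and the Volume, Claim, and bind names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    name: str
    capacity_gi: int
    access_modes: List[str]
    claim: Optional[str] = None    # "namespace/name" once bound


@dataclass
class Claim:
    namespace: str
    name: str
    request_gi: int
    access_modes: List[str]


def bind(claim: Claim, volumes: List[Volume]) -> Optional[Volume]:
    """Bind the claim to the smallest unbound volume that satisfies it."""
    candidates = [
        v for v in volumes
        if v.claim is None
        and v.capacity_gi >= claim.request_gi
        and set(claim.access_modes) <= set(v.access_modes)
    ]
    if not candidates:
        return None                # nothing fits: the claim stays unbound
    best = min(candidates, key=lambda v: v.capacity_gi)
    best.claim = f"{claim.namespace}/{claim.name}"
    return best


volumes = [
    Volume("pv1", 10, ["ReadWriteOnce"]),
    Volume("pv2", 100, ["ReadWriteOnce"]),
]
bound = bind(Claim("testns", "mypvc", 100, ["ReadWriteOnce"]), volumes)
print(bound.name, bound.claim)
```

In this sketch pv1 is skipped because 10Gi is smaller than the 100Gi request, so it stays available for a later, smaller claim.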

Slide 25

Slide 25 text

PV/PVC Example

gcepd.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: sleepypod
spec:
  volumes:
    - name: data
      gcePersistentDisk:
        pdName: panda-disk
        fsType: ext4
  containers:
    - name: sleepycontainer
      image: gcr.io/google_containers/busybox
      command:
        - sleep
        - "6000"
      volumeMounts: ...

Volume referenced via PVC. The volumes section above is replaced with:

  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: mypvc

Slide 26

Slide 26 text

Dynamic Provisioning & Storage Classes

Slide 27

Slide 27 text

Dynamic Provisioning and Storage Classes
• Allows storage to be created on-demand (when requested by user).
• Eliminates need for cluster administrators to pre-provision storage.
• Alpha in Kubernetes 1.2
• Beta in 1.4
• GA in 1.6
[Diagram: User, PVClaim; Cluster Admin, Storage Class; Storage Provider]

Slide 28

Slide 28 text

Dynamic Provisioning and Storage Classes
• Cluster/Storage admins “enable” dynamic provisioning by creating StorageClass objects
• StorageClass objects define the parameters used during creation.
• StorageClass parameters are opaque to Kubernetes, so storage providers can expose any number of custom parameters for the cluster admin to use.
[Diagram: Cluster Admin, Storage Class]

storage_class.yaml:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: slow
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd

Slide 29

Slide 29 text

Dynamic Provisioning and Storage Classes
• Users consume storage the same way: PVC
• “Selecting” a storage class in PVC triggers dynamic provisioning
[Diagram: User, PVClaim]

pvc.yaml (on k8s 1.5):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
  namespace: testns
  annotations:
    volume.beta.kubernetes.io/storage-class: fast
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi

pvc.yaml (on k8s 1.6+):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
  namespace: testns
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: fast

Slide 30

Slide 30 text

Dynamic Provisioning and Storage Classes
• Users consume storage the same way: PVC
• “Selecting” a storage class in PVC triggers dynamic provisioning
[Diagram: Storage Provider]

$ kubectl create -f storage_class.yaml
storageclass "fast" created
$ kubectl create -f pvc.yaml
persistentvolumeclaim "mypvc" created
$ kubectl get pvc --all-namespaces
NAMESPACE   NAME    STATUS   VOLUME                                     CAPACITY   ACCESSMODES   AGE
testns      mypvc   Bound    pvc-331d7407-fe18-11e6-b7cd-42010a8000cd   100Gi      RWO           6s
$ kubectl get pv pvc-331d7407-fe18-11e6-b7cd-42010a8000cd
NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS   CLAIM          REASON   AGE
pvc-331d7407-fe18-11e6-b7cd-42010a8000cd   100Gi      RWO           Delete          Bound    testns/mypvc            13m
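
The session above shows a volume appearing only after the claim is created. A rough Python sketch of that on-demand step follows, under stated assumptions: the provision function and ProvisionedVolume type are hypothetical, the volume name uses a fresh UUID rather than a real object UID, default storage classes are ignored, and a real provisioner would call the storage provider's API where this sketch only builds a record.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional

# Storage classes as a cluster admin might define them (mirrors storage_class.yaml).
STORAGE_CLASSES: Dict[str, Dict[str, str]] = {
    "slow": {"provisioner": "kubernetes.io/gce-pd", "type": "pd-standard"},
    "fast": {"provisioner": "kubernetes.io/gce-pd", "type": "pd-ssd"},
}


@dataclass
class ProvisionedVolume:
    name: str
    capacity_gi: int
    disk_type: str
    reclaim_policy: str
    claim: str


def provision(namespace: str, claim_name: str, request_gi: int,
              storage_class: Optional[str]) -> Optional[ProvisionedVolume]:
    """Create a volume on demand for a claim that names a storage class."""
    if storage_class is None:
        return None    # no class selected: nothing is provisioned in this sketch
    params = STORAGE_CLASSES[storage_class]    # raises KeyError for an unknown class
    return ProvisionedVolume(
        name=f"pvc-{uuid.uuid4()}",            # assumption: fresh UUID as the suffix
        capacity_gi=request_gi,                # sized to match the request
        disk_type=params["type"],              # opaque parameter passed to the provider
        reclaim_policy="Delete",               # matches RECLAIMPOLICY in the session
        claim=f"{namespace}/{claim_name}",
    )


pv = provision("testns", "mypvc", 100, "fast")
print(pv.name, pv.capacity_gi, pv.disk_type, pv.reclaim_policy, pv.claim)
```

The point of the sketch is the division of labor: the admin defines the classes once, and each claim that selects a class gets a volume sized to its request without anyone pre-creating it.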

Slide 31

Slide 31 text

Dynamic Provisioning and Storage Classes
Volume referenced via PVC

pod.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: sleepypod
spec:
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: mypvc
  containers:
    - name: sleepycontainer
      image: gcr.io/google_containers/busybox
      command:
        - sleep
        - "6000"
      volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false

Slide 32

Slide 32 text

Summary

Slide 33

Slide 33 text

Summary
• Kubernetes volumes allow data to be persisted and shared between containers in a pod
• Persistent Volumes and Persistent Volume Claims allow the application to be portable
• Dynamic provisioning and Storage Classes enable on-demand storage creation, simplifying the admin’s role

Slide 34

Slide 34 text

Kubernetes is Open
open community, open design, open source, open to ideas
https://kubernetes.io
Code: github.com/kubernetes/kubernetes
Chat: slack.k8s.io
Twitter: @kubernetesio