Slide 1

Slide 1 text

Google Cloud Platform

Local Persistent Storage
6/28/2017
Michelle Au
Software Engineer - Google
GitHub: @msau42

Slide 2

Slide 2 text

Agenda

1. Motivations and Use Cases
2. Hostpath Problems
3. Local Persistent Volume Solution
4. Demo
5. Future Roadmap
6. Documentation

Slide 3

Slide 3 text

Local Persistent Storage Motivations

Use Cases
• Using local storage is subject to node and storage availability
• Not suitable for all use cases!
• Data gravity (co-locating data and application)
• Distributed datastores and filesystems (Cassandra, GlusterFS, etc.)
• Large caches

Cost
• Increase disk utilization in bare-metal environments
• Reduce operator cost for managing distributed storage systems and supporting infrastructure (networking hardware, etc.)

Performance
• Local SSDs in cloud environments

Slide 4

Slide 4 text

Hostpath Volume Issues

    apiVersion: v1
    kind: Pod
    metadata:
      name: sleepypod
    spec:
      volumes:
      - name: data
        hostPath:
          path: /mnt/some-disk
      containers:
      - name: sleepycontainer
        image: gcr.io/google_containers/busybox
        command:
        - sleep
        - "6000"
        volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false

Slide 5

Slide 5 text

Hostpath Volume Issues

    apiVersion: v1
    kind: Pod
    metadata:
      name: sleepypod
    spec:
      nodeName: node-1
      volumes:
      - name: data
        hostPath:
          path: /mnt/some-disk
      containers:
      - name: sleepycontainer
        image: gcr.io/google_containers/busybox
        command:
        - sleep
        - "6000"
        volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false

Portability
• Path (and data) is specific to the node
• Need to manually schedule pods to specific nodes, bypassing scheduler
• Paths can change across clusters and different environments

Slide 6

Slide 6 text

Hostpath Volume Issues

    apiVersion: v1
    kind: Pod
    metadata:
      name: sleepypod
    spec:
      nodeName: node-1
      volumes:
      - name: data
        hostPath:
          path: /mnt/some-disk
      containers:
      - name: sleepycontainer
        image: gcr.io/google_containers/busybox
        command:
        - sleep
        - "6000"
        volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false

Accounting
• Path collisions with other pods
• Coordinate with other applications and manually schedule pods to specific nodes
• Lifecycle is unmanaged. Manual cleanup whenever application is finished

Slide 7

Slide 7 text

Hostpath Volume Issues

    apiVersion: v1
    kind: Pod
    metadata:
      name: sleepypod
    spec:
      nodeName: node-1
      volumes:
      - name: data
        hostPath:
          path: /mnt/some-disk
      containers:
      - name: sleepycontainer
        image: gcr.io/google_containers/busybox
        command:
        - sleep
        - "6000"
        volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false

Security
• Pod can specify any path!
• Hostpath often disabled through pod security policy
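As an illustration of the last point, a pod security policy can disable hostPath by leaving it out of the list of allowed volume types. The following is a minimal sketch, not a policy taken from the talk: the policy name and the permissive RunAsAny rules are assumptions chosen to keep the example short. The API group shown is the one PodSecurityPolicy used around Kubernetes 1.7; the resource was later removed entirely in Kubernetes 1.25.

```yaml
# Sketch only: "no-hostpath" is a hypothetical name and the RunAsAny
# rules are illustrative assumptions, not recommendations.
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: no-hostpath
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Allowed volume types. hostPath is deliberately absent, so pods
  # admitted under this policy cannot mount arbitrary node paths.
  volumes:
  - configMap
  - secret
  - emptyDir
  - persistentVolumeClaim
```

With a policy like this in force, the sleepypod spec above would be rejected at admission, while a pod that consumes storage through a persistentVolumeClaim would still be allowed.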

Slide 8

Slide 8 text

Local Persistent Volumes Solution

Portability
• Use Persistent Volumes abstraction to separate storage details from pod consumption

Accounting
• Only one Persistent Volume Claim can be bound to a Persistent Volume
• API objects with managed lifecycles

Security
• Only administrators can create Persistent Volumes

Slide 9

Slide 9 text

Local Persistent Volumes Example

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-pvc
    spec:
      resources:
        requests:
          storage: 100Gi
      storageClassName: local-fast
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: local-volume-1
    spec:
      capacity:
        storage: 100Gi
      storageClassName: local-fast
      local:
        path: /tmp/my-test1
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - node-1

Slide 10

Slide 10 text

Local Persistent Volumes Solution

    apiVersion: v1
    kind: Pod
    metadata:
      name: sleepypod
    spec:
      # nodeName: node-1          (dropped: no manual scheduling)
      volumes:
      - name: data
        # hostPath:               (replaced by the claim below)
        #   path: /mnt/somedisk
        persistentVolumeClaim:
          claimName: my-pvc
      containers:
      - name: sleepycontainer
        image: gcr.io/google_containers/busybox
        command:
        - sleep
        - "6000"
        volumeMounts: ...

Slide 11

Slide 11 text

1.7 Alpha Details

New volume type: “local” volume
• Can only be used as a Persistent Volume
• Scheduler is aware of volume’s node constraints

External static provisioner for local volumes
• Runs as a DaemonSet on every node
• Discovers local volumes mounted under configurable directories
• Automatically creates, cleans up, and destroys local Persistent Volumes

Slide 12

Slide 12 text

Demo

1. StatefulSet where each instance writes to a local volume
2. Reader pod that reads from one of the local volumes
3. The pods will always be scheduled to the same node that the volume is on
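The demo setup could look roughly like the manifests below. This is a sketch under stated assumptions, not the manifests used in the talk: the names local-test and local-reader, the replica count, and the write loop are hypothetical, and a headless Service named local-test plus matching local Persistent Volumes in the local-fast class are assumed to exist already. The apps/v1 API group is the current one; at the time of the 1.7 release StatefulSet lived under apps/v1beta1.

```yaml
# Sketch only: names, replica count, and commands are assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: local-test
spec:
  serviceName: local-test          # headless Service assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: local-test
  template:
    metadata:
      labels:
        app: local-test
    spec:
      containers:
      - name: writer
        image: gcr.io/google_containers/busybox
        command:
        - sh
        - -c
        - "while true; do date >> /data/out.txt; sleep 5; done"
        volumeMounts:
        - name: data
          mountPath: /data
  # One claim per instance; each binds to a separate local PV.
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: local-fast
      resources:
        requests:
          storage: 100Gi
---
# Reader pod that mounts the claim created for instance 0.
apiVersion: v1
kind: Pod
metadata:
  name: local-reader
spec:
  containers:
  - name: reader
    image: gcr.io/google_containers/busybox
    command:
    - sh
    - -c
    - "tail -f /data/out.txt"
    volumeMounts:
    - name: data
      mountPath: /data
      readOnly: true
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-local-test-0   # <template>-<statefulset>-<ordinal>
```

Neither manifest sets nodeName. The claim is bound to a local Persistent Volume that carries node affinity, so the reader lands on the same node as the writer whose volume it shares.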

Slide 13

Slide 13 text

1.7 Limitations

• Persistent Volume binding happens before pod scheduling
  • Binding doesn’t consider pod resource and scheduling requirements (e.g., CPU, pod affinity)
• Cannot specify multiple local volumes in a single pod spec
• External provisioner cannot correctly detect volume capacity for new volumes created after the provisioner has started

Slide 14

Slide 14 text

Roadmap

• Local block devices as a volume source, and for pod consumption
• Local volume health monitoring, taints and tolerations
• Inline PV (use local disk as ephemeral storage)
• Dynamic provisioning

Slide 15

Slide 15 text

Documentation

User guide
• https://github.com/kubernetes-incubator/external-storage/tree/master/local-volume

Implementation tracker
• https://github.com/kubernetes/kubernetes/issues/43640

Proposal
• https://github.com/kubernetes/community/pull/306