Local Persistent Storage

Use cases and motivations for local persistent storage, problems with hostPath volumes, Kubernetes 1.7 feature overview, and future roadmap.

Michelle Au

June 28, 2017

Transcript

  1. Agenda

    Google Cloud Platform

    1. Motivations and Use Cases
    2. Hostpath Problems
    3. Local Persistent Volume Solution
    4. Demo
    5. Future Roadmap
    6. Documentation
  2. Local Persistent Storage Motivations

    Use Cases
    • Using local storage is subject to node and storage availability
    • Not suitable for all use cases!
    • Data gravity (co-locating data and application)
    • Distributed datastores and filesystems (Cassandra, GlusterFS, etc.)
    • Large caches
    Cost
    • Increase disk utilization in bare-metal environments
    • Reduce operator cost for managing distributed storage systems and supporting infrastructure (networking hardware, etc.)
    Performance
    • Local SSDs in cloud environments
  3. Hostpath Volume Issues

    apiVersion: v1
    kind: Pod
    metadata:
      name: sleepypod
    spec:
      volumes:
      - name: data
        hostPath:
          path: /mnt/some-disk
      containers:
      - name: sleepycontainer
        image: gcr.io/google_containers/busybox
        command:
        - sleep
        - "6000"
        volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false
  4. Hostpath Volume Issues

    apiVersion: v1
    kind: Pod
    metadata:
      name: sleepypod
    spec:
      nodeName: node-1
      volumes:
      - name: data
        hostPath:
          path: /mnt/some-disk
      containers:
      - name: sleepycontainer
        image: gcr.io/google_containers/busybox
        command:
        - sleep
        - "6000"
        volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false

    Portability
    • Path (and data) is specific to the node
    • Need to manually schedule pods to specific nodes, bypassing the scheduler
    • Paths can change across clusters and different environments
  5. Hostpath Volume Issues

    (Pod spec repeated from slide 4: sleepypod pinned with nodeName: node-1 and a hostPath volume at /mnt/some-disk.)

    Accounting
    • Path collisions with other pods
    • Coordinate with other applications and manually schedule pods to specific nodes
    • Lifecycle is unmanaged. Manual cleanup whenever the application is finished
  6. Hostpath Volume Issues

    (Pod spec repeated from slide 4: sleepypod pinned with nodeName: node-1 and a hostPath volume at /mnt/some-disk.)

    Security
    • Pod can specify any path!
    • Hostpath is often disabled through pod security policy
  7. Local Persistent Volumes Solution

    Portability
    • Use the Persistent Volumes abstraction to separate storage details from pod consumption
    Accounting
    • Only one Persistent Volume Claim can be bound to a Persistent Volume
    • API objects with managed lifecycles
    Security
    • Only administrators can create Persistent Volumes
  8. Local Persistent Volumes Example

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-pvc
    spec:
      resources:
        requests:
          storage: 100Gi
      storageClassName: local-fast

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: local-volume-1
    spec:
      capacity:
        storage: 100Gi
      storageClassName: local-fast
      local:
        path: /tmp/my-test1
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - node-1
  9. Local Persistent Volumes Solution

    apiVersion: v1
    kind: Pod
    metadata:
      name: sleepypod
    spec:
      nodeName: node-1
      volumes:
      - name: data
        hostPath:
          path: /mnt/somedisk
      containers:
      - name: sleepycontainer
        image: gcr.io/google_containers/busybox
        command:
        - sleep
        - "6000"
        volumeMounts: ...

    The hostPath volume source is replaced by a claim reference:

      persistentVolumeClaim:
        claimName: my-pvc
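
Putting the slide's pieces together, a pod that consumes the local volume through the claim might look like the sketch below. It assumes the claim is named my-pvc, as in the PersistentVolumeClaim example, and it omits nodeName on the assumption that the scheduler places the pod from the bound volume's node constraints. The mount path /data is carried over from the hostPath example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sleepypod
spec:
  # No nodeName here: placement is assumed to follow from the node
  # affinity of the Persistent Volume bound to the claim.
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc   # assumed to match the claim from the example
  containers:
  - name: sleepycontainer
    image: gcr.io/google_containers/busybox
    command:
    - sleep
    - "6000"
    volumeMounts:
    - name: data
      mountPath: /data
```

Compared with the hostPath version, the pod no longer names a node or a host path; both details live in the Persistent Volume that an administrator created.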
  10. 1.7 Alpha Details

    New volume type: “local” volume
    • Can only be used as a Persistent Volume
    • Scheduler is aware of the volume’s node constraints
    External static provisioner for local volumes
    • Runs as a DaemonSet on every node
    • Discovers local volumes mounted under configurable directories
    • Automatically creates, cleans up, and destroys local Persistent Volumes
  11. Demo

    1. StatefulSet where each instance writes to a local volume
    2. Reader pod that reads from one of the local volumes
    3. The pods will always be scheduled to the same node that the volume is on
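
The demo manifests are not part of the transcript. As a rough sketch of the first step, a StatefulSet whose instances each claim a local volume could be written as follows. The names local-test and local-service, the replica count, the access mode, and the write loop are all hypothetical; only the busybox image and the local-fast storage class come from the slides.

```yaml
apiVersion: apps/v1            # the apps/v1beta1 group version was current at the 1.7 release
kind: StatefulSet
metadata:
  name: local-test             # hypothetical name
spec:
  serviceName: local-service   # hypothetical headless Service, not shown here
  replicas: 3                  # hypothetical; needs one available local PV per replica
  selector:
    matchLabels:
      app: local-test
  template:
    metadata:
      labels:
        app: local-test
    spec:
      containers:
      - name: writer
        image: gcr.io/google_containers/busybox
        command:
        - /bin/sh
        - -c
        # Append a timestamp to a file on the local volume every few seconds.
        - 'while true; do date >> /data/out.txt; sleep 5; done'
        volumeMounts:
        - name: data
          mountPath: /data
  # Each replica gets its own claim, which binds to one local Persistent Volume.
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: local-fast
      resources:
        requests:
          storage: 100Gi
```

Because each claim binds to a volume on one specific node, a restarted instance is expected to come back on the node that holds its data, which is the behavior the third demo step describes.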
  12. 1.7 Limitations

    • Persistent Volume binding happens before pod scheduling
    • Binding doesn’t consider pod resource and scheduling requirements (e.g., CPU, pod affinity)
    • Cannot specify multiple local volumes in a single pod spec
    • External provisioner cannot correctly detect volume capacity for new volumes created after the provisioner has started
  13. Roadmap

    • Local block devices as a volume source, and for pod consumption
    • Local volume health monitoring, taints and tolerations
    • Inline PV (use local disk as ephemeral storage)
    • Dynamic provisioning
  14. Documentation

    User guide
    • https://github.com/kubernetes-incubator/external-storage/tree/master/local-volume
    Implementation tracker
    • https://github.com/kubernetes/kubernetes/issues/43640
    Proposal
    • https://github.com/kubernetes/community/pull/306