Saad Ali
Senior Software Engineer, Google
May 22, 2019
Debunking the Myth: Kubernetes Storage is Hard
github.com/saad-ali
twitter.com/the_saad_ali
Slide 2
Slide 2 text
Magic of Containers
Benefits of Kubernetes
Self Healing
Intelligent Scheduling
Auto Scaling
Service Discovery
Load Balancing
Safer Deployment
App Portability
Slide 3
Slide 3 text
What about stateful apps?
Containers are inherently ephemeral.
Stateful apps need to persist bits across container restarts.
Slide 4
Slide 4 text
Kubernetes Storage Myths
It’s hard! Storage on Kubernetes is hard.
Don’t do it! Don’t run stateful workloads on Kubernetes.
Slide 5
Slide 5 text
Reality: Storage is complicated
Slide 6
Slide 6 text
Reality: “Sprinkling Kubernetes” on a hard problem won’t make it go away.
Slide 7
Slide 7 text
Separation of concerns
Makes large problems manageable.
Slide 8
Slide 8 text
Separate Storage Problems
01 Select: What storage should I use?
02 Deploy: How do I deploy and manage my storage?
03 Integrate: How do I make my deployed storage available in my cluster?
04 Consume: How does my stateful app provision and use available storage?
Slide 10
Slide 10 text
01 Select: What storage should I use?
Slide 11
Slide 11 text
What storage should I use?
File Storage: NFS, SMB, GlusterFS, CephFS
Block Storage: iSCSI, Fibre Channel, GCE Persistent Disks, Amazon EBS, Local Disks
Object Stores: Amazon S3, Google Cloud Storage (GCS), MinIO
SQL Databases: MySQL, PostgreSQL, SQL Server
NoSQL Databases (key-value or document based): MongoDB, Redis, Cassandra
Time Series Databases: InfluxDB, Prometheus, Graphite
Message Queues: Apache Kafka, RabbitMQ, Google Cloud Pub/Sub, Amazon SQS
1. Understand your options
2. Understand your requirements
3. Weigh tradeoffs
4. Make decision
Slide 12
Slide 12 text
What storage should I use?
1. Understand your options
2. Understand your requirements
3. Weigh tradeoffs
4. Make decision
What does your stateful app need?
What kind of data are you storing?
Where will you be accessing it?
How frequently will you need to access it?
What kind of data protection is required?
Slide 13
Slide 13 text
What storage should I use?
1. Understand your options
2. Understand your requirements
3. Weigh tradeoffs
4. Make decision
Availability, Durability, Performance, Cost
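One way to make the tradeoff step concrete is a small weighted scoring table. The sketch below is only an illustration: the candidate names, the weights, and the 1 to 5 scores are all invented placeholders, not measurements or recommendations from the talk. Replace them with your own requirements.

```python
# Hypothetical weights: how much each criterion matters to *your* app.
# Integers are used so the totals are exact.
WEIGHTS = {"availability": 3, "durability": 3, "performance": 2, "cost": 2}

# Hypothetical 1-5 scores for two made-up candidates (higher is better;
# for "cost", a higher score means cheaper).
CANDIDATES = {
    "option_a": {"availability": 4, "durability": 5, "performance": 3, "cost": 2},
    "option_b": {"availability": 3, "durability": 3, "performance": 5, "cost": 4},
}


def weighted_score(scores, weights):
    """Sum of weight * score over every criterion in `weights`."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank(candidates, weights):
    """Return candidate names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


for name in rank(CANDIDATES, WEIGHTS):
    print(name, weighted_score(CANDIDATES[name], WEIGHTS))
```

Changing the weights changes the ranking, which is the point: the decision depends on your requirements, not on the tool.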
Slide 14
Slide 14 text
What storage should I use?
1. Understand your options
2. Understand your requirements
3. Weigh tradeoffs
4. Make decision
“Sprinkling Kubernetes on it” doesn’t magically solve this.
You have to do your homework.
You have to make a decision about your architecture, based on your needs.
Slide 15
Slide 15 text
02 Deploy: How do I deploy and manage my storage?
Slide 16
Slide 16 text
How do I deploy my storage?
You do not have to deploy your storage on Kubernetes to use it in Kubernetes.
Slide 17
Slide 17 text
How do I deploy my storage?
Unmanaged: You deploy and manage.
Managed: Someone else deploys and manages.
Slide 18
Slide 18 text
How do I deploy my storage?
Storage deployed on top of Kubernetes is just another stateful application.
Slide 19
Slide 19 text
How do I deploy my storage?
Use an operator to deploy applications with complicated life cycles.
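The core idea behind an operator is a reconcile loop: compare the desired state with the observed state and take one step toward closing the gap. The sketch below simulates that loop in plain Python without talking to a real cluster. The `StorageCluster` type, its fields, and the action names are hypothetical and exist only for illustration; a real operator would watch a custom resource through the Kubernetes API.

```python
from dataclasses import dataclass


@dataclass
class StorageCluster:
    """Hypothetical resource: what the user asked for vs. what is running."""
    desired_replicas: int
    desired_version: str
    running_replicas: int = 0
    running_version: str = ""


def reconcile(cluster):
    """Take one step from observed state toward desired state; return the action."""
    if cluster.running_replicas == 0:
        cluster.running_version = cluster.desired_version
        cluster.running_replicas = 1
        return "bootstrap first replica"
    if cluster.running_version != cluster.desired_version:
        cluster.running_version = cluster.desired_version
        return "rolling upgrade"
    if cluster.running_replicas < cluster.desired_replicas:
        cluster.running_replicas += 1
        return "add replica"
    if cluster.running_replicas > cluster.desired_replicas:
        cluster.running_replicas -= 1
        return "remove replica"
    return "in sync"


def run_until_synced(cluster, max_steps=20):
    """Call reconcile repeatedly and collect the actions taken until in sync."""
    actions = []
    for _ in range(max_steps):
        action = reconcile(cluster)
        if action == "in sync":
            break
        actions.append(action)
    return actions
```

For example, `run_until_synced(StorageCluster(desired_replicas=3, desired_version="v2"))` bootstraps one replica and then adds two more. The life-cycle knowledge (bootstrap before scaling, upgrade before adding replicas) lives in the operator rather than in a runbook.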
Slide 20
Slide 20 text
How do I deploy my storage?
Software defined storage deployed on Kubernetes is just another stateful application.
Slide 22
Slide 22 text
Separate Storage Problems
01 Select: What storage should I use?
02 Deploy: How do I deploy and manage my storage?
03 Integrate: How do I make my deployed storage available in my cluster?
04 Consume: How does my stateful app provision and use available storage?
Slide 23
Slide 23 text
Data Services vs Block/File
File Storage: NFS, SMB, GlusterFS, CephFS
Block Storage: iSCSI, Fibre Channel, GCE Persistent Disks, Amazon EBS, Local Disks
Object Stores: Amazon S3, Google Cloud Storage (GCS), MinIO
SQL Databases: MySQL, PostgreSQL, SQL Server
NoSQL Databases (key-value or document based): MongoDB, Redis, Cassandra
Time Series Databases: InfluxDB, Prometheus, Graphite
Message Queues: Apache Kafka, RabbitMQ, Google Cloud Pub/Sub, Amazon SQS
Slide 25
Slide 25 text
Data Services vs Block/File
Stateful App: Your stateful app
Data Service: Object Store, SQL/NoSQL DB, Message Queue, etc.
Block/File Storage: NFS, iSCSI, Fibre Channel, etc.
Physical Storage: SSD/Flash Disk
Slide 26
Slide 26 text
03 Integrate: How do I make my deployed storage available in my cluster?
Slide 27
Slide 27 text
How do I integrate my storage with Kubernetes?
Block/File Storage: Use a Container Storage Interface (CSI) Driver.
Data Service: Your app must handle discovery and negotiation.
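For a data service, "discovery" can be as simple as the app looking up where the service lives. A minimal sketch, assuming the data service is exposed as a Kubernetes Service: Kubernetes injects `<NAME>_SERVICE_HOST` and `<NAME>_SERVICE_PORT` environment variables into pods for Services that existed when the pod started (cluster DNS is the other common route). The service name `my-db` and the helper function are hypothetical.

```python
import os


def service_endpoint(service_name, env=None):
    """Resolve (host, port) for a Kubernetes Service from injected env vars.

    `env` defaults to the process environment; passing a dict makes the
    function easy to exercise outside a cluster.
    """
    env = os.environ if env is None else env
    # Kubernetes upper-cases the Service name and turns dashes into underscores.
    prefix = service_name.upper().replace("-", "_")
    host = env.get(f"{prefix}_SERVICE_HOST")
    port = env.get(f"{prefix}_SERVICE_PORT")
    if host is None or port is None:
        raise LookupError(f"no endpoint found for service {service_name!r}")
    return host, int(port)
```

Inside a pod, `service_endpoint("my-db")` would return the cluster IP and port of that Service. Negotiation (credentials, protocol, schema) remains the app's job and is not shown here.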
Slide 28
Slide 28 text
How do I integrate my storage with Kubernetes?
Block/File Storage: Use a Container Storage Interface (CSI) Driver.
Without Kubernetes: Swapping storage requires rewriting your app to handle provisioning, attaching, and mounting.
With Kubernetes: Swapping storage is as easy as deploying a new CSI Driver and creating a new StorageClass API object.
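To give a sense of how small a StorageClass object is, the sketch below builds one as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The class name `fast`, the provisioner `csi.example.com`, and the `type` parameter are placeholders: the provisioner must match the name your CSI driver registers, and the parameters are driver-specific.

```python
import json


def storage_class(name, provisioner, parameters=None):
    """Build a StorageClass manifest for the given CSI provisioner name."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Must equal the driver name registered by the deployed CSI driver.
        "provisioner": provisioner,
        # Opaque, driver-specific settings; the keys here are hypothetical.
        "parameters": parameters or {},
        "reclaimPolicy": "Delete",
        # Delay provisioning until a pod using the claim is scheduled.
        "volumeBindingMode": "WaitForFirstConsumer",
    }


manifest = storage_class("fast", "csi.example.com", {"type": "ssd"})
print(json.dumps(manifest, indent=2))
```

Swapping the backing storage then means changing the provisioner and parameters here, while the application's own manifests stay the same.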
Slide 29
Slide 29 text
04 Consume: How does my stateful app use storage?
Slide 30
Slide 30 text
How does my stateful app use storage?
Block/File Storage: Use the Kubernetes Storage API (PVC, PV, etc.)
Data Service: Your app must handle
Slide 31
Slide 31 text
How does my stateful app use storage?
Block/File Storage: Use the Kubernetes Storage API (PVC, PV, etc.)
Without Kubernetes:
Manual provision, manual attach, manual mount.
With Kubernetes:
Automatic (intelligent) provisioning.
Intelligent scheduling based on storage.
Storage automatically available to correct node and pod.
Storage moved along with workload.
Portable Kubernetes Storage API -- write once run anywhere
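From the app's side, consuming block or file storage comes down to two objects: a PersistentVolumeClaim that asks for storage, and a pod that mounts the claim. The sketch below builds both as Python dictionaries and prints them as JSON. All names, the image, the size, and the mount path are hypothetical; the StorageClass named `fast` is assumed to exist in the cluster already.

```python
import json


def pvc(name, storage_class_name, size):
    """PersistentVolumeClaim requesting `size` from the given StorageClass."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


def pod_using_claim(name, image, claim_name, mount_path):
    """Pod that mounts the claim at `mount_path` inside its single container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": "app",
                "image": image,
                "volumeMounts": [{"name": "data", "mountPath": mount_path}],
            }],
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


claim = pvc("app-data", "fast", "10Gi")
pod = pod_using_claim("my-app", "example.com/my-app:1.0", "app-data", "/var/lib/app")
print(json.dumps([claim, pod], indent=2))
```

Neither object names a specific disk, node, or vendor. That indirection is what lets the same manifests run against different storage backends.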
Slide 32
Slide 32 text
Separate Storage Problems
01 Select: What storage should I use?
02 Deploy: How do I deploy and manage my storage?
03 Integrate: How do I make my deployed storage available in my cluster?
04 Consume: How does my stateful app provision and use available storage?
Slide 33
Slide 33 text
Storage is complicated...
Kubernetes makes it manageable!