This talk introduces Kubernetes and then focuses on topics relevant to data engineers: how to run stateful workloads on Kubernetes, and how to run machine learning workloads that use GPUs.
Kubernetes for Data Engineers
Software Engineer, Google Cloud
What is Kubernetes?
Focus on applications, not machines.
Declarative, not imperative.
Kubernetes for stateless applications
Deployment and ReplicaSet.
Rollouts and rollbacks.
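A minimal sketch of what "declarative" looks like for a stateless application: the desired state is written down as a Deployment object, and Kubernetes works toward it. The helper name `deployment_manifest`, the application name `web`, and the image `example.com/web:1.0` are placeholders chosen for illustration, not part of the talk.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    """Build an apps/v1 Deployment as a plain dict describing desired state."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            # How many identical Pods should exist; the Deployment's
            # ReplicaSet keeps the actual count at this number.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # Rolling update: add one new Pod at a time and never drop
            # below the requested replica count during a rollout.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical application name and image, for illustration only.
manifest = deployment_manifest("web", "example.com/web:1.0", replicas=3)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl apply -f`. Changing the image and re-applying triggers a rollout, and `kubectl rollout undo deployment/web` would roll it back.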
Applications that Data Engineers care about
Data processing frameworks.
Machine learning frameworks.
Running stateful applications
YARN: MapReduce, Hive, Spark, etc.
Rest of workloads: bespoke deployments.
Siloed clusters and underutilization.
No standard and management pain.
Kubernetes can help
Borg for the rest of the world.
Running stateful applications on Kubernetes
Stable, unique network identifiers.
Stable, persistent storage.
Ordered, graceful deployment and scaling.
Ordered, graceful termination.
Ordered, automated rolling updates.
Built-in, no need to reinvent.
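The guarantees above can be made concrete with a small sketch of the naming and ordering rules that a StatefulSet follows. The function names and the example values (`db`, `db-headless`, `data`, the `default` namespace) are assumptions for illustration; the naming patterns are the ones Kubernetes documents for StatefulSets.

```python
from typing import List


def pod_names(statefulset: str, replicas: int) -> List[str]:
    """Pods get a stable ordinal suffix: <statefulset>-0, <statefulset>-1, ..."""
    return [f"{statefulset}-{i}" for i in range(replicas)]


def pod_dns_names(statefulset: str, service: str, namespace: str,
                  replicas: int, cluster_domain: str = "cluster.local") -> List[str]:
    """Each Pod gets a stable DNS name under the governing headless Service."""
    return [
        f"{pod}.{service}.{namespace}.svc.{cluster_domain}"
        for pod in pod_names(statefulset, replicas)
    ]


def pvc_names(claim_template: str, statefulset: str, replicas: int) -> List[str]:
    """Each Pod keeps its own PersistentVolumeClaim: <template>-<pod name>."""
    return [f"{claim_template}-{pod}" for pod in pod_names(statefulset, replicas)]


def startup_order(statefulset: str, replicas: int) -> List[str]:
    """Pods are created in ordinal order, 0 first (default ordered policy)."""
    return pod_names(statefulset, replicas)


def termination_order(statefulset: str, replicas: int) -> List[str]:
    """Pods are terminated in reverse ordinal order, highest first."""
    return list(reversed(pod_names(statefulset, replicas)))


# Hypothetical three-replica database, for illustration only.
print(pod_dns_names("db", "db-headless", "default", 3))
print(pvc_names("data", "db", 3))
print(termination_order("db", 3))
```

Because a rescheduled Pod keeps both its name and its volume claim, a replica such as `db-1` comes back with the same identity and the same data.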
Encode domain-specific operational knowledge.
Control loops: observe, rectify, repeat.
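The observe, rectify, repeat cycle can be sketched as a toy reconcile loop. This is a simplified in-memory illustration, not the API of any Operator framework: `ClusterState`, `reconcile`, and `run_control_loop` are hypothetical names, and a real Operator would read and write cluster objects through the Kubernetes API instead of mutating a local value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    """Stand-in for the observed state of the world (here: a replica count)."""
    replicas: int


def reconcile(desired_replicas: int, observed: ClusterState) -> str:
    """Take one step that moves the observed state toward the desired state."""
    if observed.replicas < desired_replicas:
        observed.replicas += 1   # rectify: add one replica
        return "scaled up"
    if observed.replicas > desired_replicas:
        observed.replicas -= 1   # rectify: remove one replica
        return "scaled down"
    return "in sync"             # nothing to do


def run_control_loop(desired_replicas: int, observed: ClusterState,
                     max_iterations: int = 10) -> List[str]:
    """Observe, rectify, repeat until in sync or the iteration budget runs out."""
    actions = []
    for _ in range(max_iterations):
        action = reconcile(desired_replicas, observed)
        actions.append(action)
        if action == "in sync":
            break
    return actions


state = ClusterState(replicas=1)
actions = run_control_loop(desired_replicas=3, observed=state)
print(actions)  # ['scaled up', 'scaled up', 'in sync']
```

An Operator follows the same pattern, with the domain-specific knowledge (how to add a database member safely, when to take a backup) living in the rectify step.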
Lots of Operators
and so on...
Spark on Kubernetes.
(In progress) Airflow on Kubernetes.
Operators for TensorFlow, PyTorch, Caffe2, MXNet…
Lots of activity.
GPUs in Kubernetes
Support for NVIDIA GPUs.
Support for scheduling any device (GPUs, FPGAs, InfiniBand, etc.).
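As a rough sketch of how a workload asks for a GPU: devices are exposed to the scheduler as named resources, and a container requests them under `resources.limits`. The helper `gpu_pod_manifest`, the Pod name `train`, and the image `example.com/trainer:latest` are placeholders; `nvidia.com/gpu` is the resource name advertised by the NVIDIA device plugin, and other devices would use whatever resource name their own plugin advertises.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int,
                     resource_name: str = "nvidia.com/gpu") -> dict:
    """Build a v1 Pod whose single container requests `gpus` whole devices."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",  # run-to-completion job, e.g. training
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Devices are requested in whole units under limits;
                    # the Pod is only scheduled onto a node with enough
                    # free devices of this resource name.
                    "resources": {"limits": {resource_name: gpus}},
                }
            ],
        },
    }


# Hypothetical training job and image, for illustration only.
pod = gpu_pod_manifest("train", "example.com/trainer:latest", gpus=1)
print(json.dumps(pod, indent=2))
```

Passing a different `resource_name` is how the same manifest shape would target another device type, assuming a device plugin for it is installed on the cluster.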
Stateless > Deployment and ReplicaSet
Simple stateful > StatefulSet
Distributed databases > Operators
Spark/Airflow > Native integration
ML > Kubeflow
It’s not done yet.