Kubernetes for Data Engineers

The talk will give an introduction to Kubernetes in general and then focus on topics relevant to Data Engineers: in particular, we will talk about how to run stateful workloads on Kubernetes and how to run Machine Learning workloads that use GPUs on Kubernetes.

https://www.brighttalk.com/webcast/15789/321823/kubernetes-for-data-engineers

Rohit Agarwal

April 13, 2018

Transcript

  1. Kubernetes for Data Engineers
    Rohit Agarwal
    Software Engineer, Google Cloud
    @mindprince

  2. What is Kubernetes?
    Open source.
    Container orchestrator.
    Runs everywhere.
    Focus on applications, not machines.

  3. Why Kubernetes?
    Workload portability.
    Legacy compatible.
    Modular.
    Declarative, not imperative.

  4. Kubernetes for stateless applications
    Deployment and ReplicaSet.
    Self-healing.
    Autoscaling.
    Rollouts and rollbacks.
    De facto standard.
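
As a rough illustration of the declarative model behind Deployments, the sketch below builds a Deployment manifest as a plain Python dict and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The name `web`, the image `nginx:1.25`, and the replica count are placeholder assumptions, not values from the talk.

```python
import json


def deployment_manifest(name, image, replicas):
    """Return an apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            # Desired state: Kubernetes keeps this many Pods running.
            "replicas": replicas,
            # The selector must match the labels on the Pod template.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Placeholder values, chosen only for illustration.
manifest = deployment_manifest("web", "nginx:1.25", replicas=3)
print(json.dumps(manifest, indent=2))
```

Changing `replicas` or `image` and re-applying the manifest is what drives scaling and rollouts; the manifest only states the desired end state.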

  5. Applications that Data Engineers care about
    Stateful.
    Databases.
    Data processing frameworks.
    Machine learning frameworks.

  6. Running stateful applications
    YARN: MapReduce, Hive, Spark, etc.
    Other workloads: bespoke deployments.
    Siloed clusters and underutilization.
    No standard tooling; painful to manage.

  7. Kubernetes can help
    All workloads.
    Standardized tooling.
    Borg for the rest of the world.

  8. Running stateful applications on Kubernetes

  9. StatefulSet
    Stable, unique network identifiers.
    Stable, persistent storage.
    Ordered, graceful deployment and scaling.
    Ordered, graceful termination.
    Ordered, automated rolling updates.
    Built-in, no need to reinvent.
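
To make "stable, unique network identifiers" and "ordered" concrete, here is a small sketch of the naming scheme a StatefulSet uses: Pods are named `<statefulset>-<ordinal>`, and with a headless Service each Pod gets a DNS name of the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`. The names `db` and `db-headless` are hypothetical, and `cluster.local` is assumed as the cluster domain (the usual default).

```python
def pod_names(statefulset, replicas):
    """Pod names are the StatefulSet name plus an ordinal starting at 0."""
    return [f"{statefulset}-{i}" for i in range(replicas)]


def pod_dns(pod, service, namespace="default", cluster_domain="cluster.local"):
    """Stable DNS name of a Pod behind a headless Service."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


# Hypothetical StatefulSet "db" with 3 replicas and headless Service "db-headless".
pods = pod_names("db", 3)

# By default Pods are created in ordinal order and terminated in reverse order.
startup_order = list(pods)
shutdown_order = list(reversed(pods))

for pod in pods:
    print(pod_dns(pod, "db-headless"))
```

Because the names do not change when a Pod is rescheduled, peers in a database cluster can be configured with these addresses once.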

  10. Operators
    Extensions.
    Encode domain-specific operational knowledge.
    Control-loops: observe, rectify, repeat.
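
The "observe, rectify, repeat" loop can be sketched in a few lines. This is a toy, in-memory model and not how any particular Operator is implemented: the cluster is just a Python set of member names, and the names `etcd-0`, `etcd-1`, `etcd-2`, and `etcd-stale` are made up for illustration. A real Operator would watch the Kubernetes API and run indefinitely.

```python
def reconcile(desired, observed):
    """Return the actions needed to move the observed state toward the desired state."""
    actions = []
    for name in sorted(desired - observed):
        actions.append(("create", name))
    for name in sorted(observed - desired):
        actions.append(("delete", name))
    return actions


def control_loop(desired, cluster, max_iterations=10):
    """Observe, rectify, repeat until the cluster matches the desired state."""
    for _ in range(max_iterations):
        actions = reconcile(desired, set(cluster))  # observe
        if not actions:
            break  # nothing left to fix
        for verb, name in actions:  # rectify
            if verb == "create":
                cluster.add(name)
            else:
                cluster.discard(name)
    return cluster


# Toy state: one healthy member, one stale member, two members missing.
cluster = {"etcd-0", "etcd-stale"}
desired = {"etcd-0", "etcd-1", "etcd-2"}
control_loop(desired, cluster)
print(sorted(cluster))
```

The domain-specific knowledge of a real Operator lives in the equivalent of `reconcile`: for a database, that is where ordering, backups, and safe member replacement would be encoded.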

  11. Lots of Operators
    etcd.
    Prometheus.
    Kafka.
    Postgres.
    Elasticsearch.
    Redis.
    and so on...

  12. Native integration
    Spark on Kubernetes.
    JupyterHub.
    (In progress) Airflow on Kubernetes.
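
For Spark, native integration means `spark-submit` talks to the Kubernetes API server directly through a `k8s://` master URL. The sketch below only assembles such a command line; the API server address, container image, and jar path are placeholders, not values from the talk.

```python
import shlex


def spark_submit_args(api_server, image, main_class, app_jar, executors):
    """Assemble a spark-submit command line that targets a Kubernetes cluster."""
    return [
        "spark-submit",
        "--master", f"k8s://{api_server}",
        "--deploy-mode", "cluster",
        "--class", main_class,
        "--conf", f"spark.executor.instances={executors}",
        "--conf", f"spark.kubernetes.container.image={image}",
        app_jar,
    ]


# All values below are placeholders for illustration.
args = spark_submit_args(
    api_server="https://k8s.example.com:6443",
    image="example.com/spark:latest",
    main_class="org.apache.spark.examples.SparkPi",
    app_jar="local:///opt/spark/examples/jars/spark-examples.jar",
    executors=2,
)
print(" ".join(shlex.quote(a) for a in args))
```

In cluster mode the driver and executors run as Pods, so the job is scheduled by Kubernetes like any other workload.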

  13. ML workloads
    Kubeflow project.
    Operators for TensorFlow, PyTorch, Caffe2, MXNet…
    Lots of activity.

  14. GPUs in Kubernetes
    Support for NVIDIA GPUs.
    Support for scheduling any device (GPUs, FPGAs, InfiniBand, etc.).
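
GPUs are requested like any other resource, through the extended resource name `nvidia.com/gpu` in a container's resource limits. The sketch below builds such a Pod manifest as a Python dict; the Pod name `train` and the image `example.com/trainer:latest` are hypothetical, and the cluster is assumed to have the NVIDIA device plugin installed.

```python
import json


def gpu_pod_manifest(name, image, gpus):
    """Return a v1 Pod manifest whose container requests NVIDIA GPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are specified under limits, in whole units.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Placeholder name and image, for illustration only.
gpu_pod = gpu_pod_manifest("train", "example.com/trainer:latest", gpus=1)
print(json.dumps(gpu_pod, indent=2))
```

The scheduler then places the Pod only on a node that advertises at least that many free GPUs.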

  15. Recap
    Stateless → Deployment and ReplicaSet
    Simple stateful → StatefulSet
    Distributed databases → Operators
    Spark/Airflow → Native integration
    ML → Kubeflow

  16. Get involved
    It’s not done yet.
    #sig-big-data
    #wg-machine-learning

  17. Questions?

  18. Thank you!
