Slide 1

Slide 1 text

Bridging the gap in Machine Learning on Kubernetes
Michael Hausenblas (@mhausenblas)
Developer Advocate, Red Hat
2018-04-19, Microservices Zurich

Slide 2

Slide 2 text

$ whois mhausenblas
• Developer Advocate @ Red Hat (Go, Kubernetes, OpenShift)
• Developer Advocate @ Mesosphere (Mesos, Kubernetes)
• Chief Data Engineer @ MapR (Hadoop, HBase, Drill, ML)
• Applied research (IE/AT)
• Nowadays mainly Go (Python, Node, PHP, Java, C++)
• Dev turned ops
Hit me up on Twitter: @mhausenblas

Slide 3

Slide 3 text

data scientist, data engineer, developer, SRE/admin, architect, PM, PHB

Slide 4

Slide 4 text

Motivation

Slide 5

Slide 5 text

The double divide

Slide 6

Slide 6 text

Example microservices setup
[Figure label: ML]

Slide 7

Slide 7 text

Challenges
• interchanging models
• versioning of datasets and models
• building apps: integrating ML features
• deployments (local vs. at scale, GPU support)
data scientists, data engineers, developers, ops
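One way to picture the "versioning of datasets and models" challenge is content-addressed versioning: derive the version ID from the bytes themselves, so the same data always gets the same ID. The following is a minimal sketch under that assumption, not the mechanism of any tool named in this deck; `content_version`, `record_lineage`, and the sample payloads are hypothetical names chosen for illustration.

```python
import hashlib
import json


def content_version(payload: bytes) -> str:
    """Return a short, deterministic version ID derived from the bytes."""
    return hashlib.sha256(payload).hexdigest()[:12]


def record_lineage(dataset: bytes, model: bytes) -> dict:
    """Tie a model artifact to the exact dataset version it was trained on."""
    return {
        "dataset_version": content_version(dataset),
        "model_version": content_version(model),
    }


if __name__ == "__main__":
    # Hypothetical stand-ins for a training set and a serialized model.
    dataset = b"sepal_length,species\n5.1,setosa\n"
    model = b"weights-v1"
    print(json.dumps(record_lineage(dataset, model), indent=2))
```

Storing the lineage record next to the model lets a developer or ops engineer check which dataset version a deployed model came from.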

Slide 8

Slide 8 text

… demo time!

Slide 9

Slide 9 text

Demo
• Dotmesh via https://dotmesh.com/try-dotmesh/
• Kubeflow via https://www.katacoda.com/kubeflow/scenarios/deploying-kubeflow

Slide 10

Slide 10 text

Tools of the trade

Slide 11

Slide 11 text

Kubernetes
kubernetes.io
• Container lifecycle management
• Declarative, state-driven
• Extensible, modular API
• Robust, flexible, scalable
Kudos to Lucas Käldström for this figure (source)
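"Declarative, state-driven" means you state the desired end state and a controller works out the steps to get there. The sketch below illustrates that idea in plain Python. It is a toy, not Kubernetes code or its API; `reconcile` and the workload names are hypothetical.

```python
def reconcile(desired: dict, observed: dict) -> list:
    """Compare desired and observed replica counts; return actions to converge."""
    actions = []
    for name, want in sorted(desired.items()):
        have = observed.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    # Workloads that are running but no longer declared get removed.
    for name, have in sorted(observed.items()):
        if name not in desired and have > 0:
            actions.append(("stop", name, have))
    return actions


if __name__ == "__main__":
    desired = {"model-server": 3, "trainer": 1}
    observed = {"model-server": 1, "old-job": 2}
    for action in reconcile(desired, observed):
        print(action)
```

Running the comparison again after the actions have been applied yields an empty list, which is the point of the state-driven model: the loop is safe to repeat.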

Slide 12

Slide 12 text

Kubeflow
github.com/kubeflow/kubeflow
• Launched in late 2017 by Google
• JupyterHub
• TensorFlow Training Controller and Server
• Intel, Red Hat + growing community

Slide 13

Slide 13 text

Pachyderm
pachyderm.io
• Graph-oriented data pipeline
• Version control
• Clients for Python, Go, Scala, etc.
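A graph-oriented pipeline runs each stage only after the stages it depends on. The sketch below shows that ordering with Python's standard-library `graphlib`. It does not use Pachyderm's client or pipeline spec; the stage names in `PIPELINE` and the helper `execution_order` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each stage maps to the set of stages it depends on.
PIPELINE = {
    "clean": {"raw"},
    "features": {"clean"},
    "train": {"features"},
    "evaluate": {"train", "clean"},
}


def execution_order(graph: dict) -> list:
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(PIPELINE))
```

Because the graph records dependencies explicitly, a change to one input identifies exactly which downstream stages need to rerun.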

Slide 14

Slide 14 text

Binder
mybinder.org
• Turns a GitHub repo with Jupyter notebooks into interactive notebooks using Docker
• Serves via a JupyterHub server

Slide 15

Slide 15 text

Dotmesh
dotmesh.com
• Data state management across microservices
• Operating on a filesystem level
• Externalize snapshotting
• Troubleshooting, debugging
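The snapshotting idea can be pictured as "commit a labeled copy of the state, roll back to it when debugging". The sketch below is an in-memory toy for illustration only. Dotmesh itself operates at the filesystem level, and `SnapshotStore`, `commit`, and `rollback` are hypothetical names, not its API.

```python
import copy


class SnapshotStore:
    """Toy in-memory state with labeled snapshots (not Dotmesh's API)."""

    def __init__(self):
        self.state = {}
        self._snapshots = {}

    def commit(self, label: str) -> None:
        """Capture a deep copy of the current state under a label."""
        self._snapshots[label] = copy.deepcopy(self.state)

    def rollback(self, label: str) -> None:
        """Restore the state captured under a label."""
        self.state = copy.deepcopy(self._snapshots[label])


if __name__ == "__main__":
    store = SnapshotStore()
    store.state["orders"] = [1, 2]
    store.commit("before-bug")
    store.state["orders"].append(3)  # simulate a bad write
    store.rollback("before-bug")
    print(store.state)
```

Keeping snapshots outside the service is what makes it possible to reproduce a faulty state later, without the service having to cooperate.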

Slide 16

Slide 16 text

KAML-D
design.kamld.com

Slide 17

Slide 17 text

Resources

Slide 18

Slide 18 text

Engage!
• https://kube-machine-learning.rocks
• https://github.com/gaocegege/kubeflow-weekly
• Kubernetes Machine Learning WG
  • https://github.com/kubernetes/community/tree/master/wg-machine-learning
  • https://groups.google.com/forum/#!forum/kubernetes-wg-machine-learning
• OpenShift Machine Learning SIG
  • https://commons.openshift.org/sig/OpenshiftMachineLearning.html

Slide 19

Slide 19 text

Learn!
• https://developers.google.com/machine-learning/crash-course/
• https://github.com/Sarasra/models
• https://js.tensorflow.org/
• https://learn.openshift.com

Slide 20

Slide 20 text


Slide 21

Slide 21 text

plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews
learn.openshift.com