Slide 1

Slide 1 text

Machine Learning with Kubernetes
Christopher M Luciano - IBM

Slide 2

Slide 2 text

./speakerQuery.sh
• Christopher M Luciano
• Advisory Software Engineer @ IBM
• Part of Open Source Technology Team in IBM Digital Business Group
• Contributor to Kubernetes (SIG Network & SIG Node)
• Work on Cloud Native Computing Foundation (CNCF) projects
• @cmluciano_ on Twitter
• github.com/cmluciano

Slide 3

Slide 3 text

My blog

Slide 4

Slide 4 text

Machine Learning
Part of the @cmluciano_ What Where Why Series

Slide 5

Slide 5 text

Error Correction

Slide 6

Slide 6 text

The Goal
• Base Knowledge ->
• Points of Analysis ->
• Corpus of Unstructured ->
• Error Correction ->
• Rinse and Repeat
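The loop on this slide can be sketched as a toy perceptron. This is only an illustration, not the method behind any product mentioned in the talk: the feature names and the labeled examples are made up, loosely following the animal slides that come next. Each wrong guess nudges the weights (error correction), and the pass over the examples repeats until nothing is misclassified (rinse and repeat).

```python
# Points of analysis: hypothetical yes/no observations about an animal photo.
FEATURES = ["pointed_ears", "fur", "whiskers", "tail", "beak"]

# Base knowledge: a few labeled examples (1 = cat, 0 = not a cat).
# The values are invented for illustration only.
TRAINING = [
    ({"pointed_ears": 1, "fur": 1, "whiskers": 1, "tail": 1, "beak": 0}, 1),  # cat
    ({"pointed_ears": 0, "fur": 0, "whiskers": 0, "tail": 1, "beak": 1}, 0),  # duck
    ({"pointed_ears": 0, "fur": 1, "whiskers": 0, "tail": 0, "beak": 0}, 0),  # sloth
    ({"pointed_ears": 1, "fur": 1, "whiskers": 1, "tail": 0, "beak": 0}, 1),  # cat, tail hidden
]


def predict(weights, bias, sample):
    """Return 1 (cat) if the weighted feature score is positive, else 0."""
    score = bias + sum(weights[f] * sample[f] for f in FEATURES)
    return 1 if score > 0 else 0


def train(examples, epochs=20, learning_rate=1.0):
    """Perceptron training: correct the weights after every wrong guess."""
    weights = {f: 0.0 for f in FEATURES}
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for sample, label in examples:
            error = label - predict(weights, bias, sample)
            if error != 0:
                # Error correction: move each weight toward the right answer.
                mistakes += 1
                for f in FEATURES:
                    weights[f] += learning_rate * error * sample[f]
                bias += learning_rate * error
        if mistakes == 0:
            # A full pass with no mistakes: nothing left to correct.
            break
    return weights, bias


weights, bias = train(TRAINING)
for sample, label in TRAINING:
    print(label, predict(weights, bias, sample))
```

A real image classifier learns its features from pixels instead of taking them as hand-written flags, but the correct-and-repeat loop has the same shape.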

Slide 7

Slide 7 text

Sprinkles Distinct Features
• Ears
• Face
• Feet
• Tail

Slide 8

Slide 8 text

Sprinkles Features
• Face
• Ears
• Feet

Slide 9

Slide 9 text

Penguin Cat?
• What do we know?
• Do we see features?
• Conclusion

Slide 10

Slide 10 text

Duck
• Short snout
• Circular face
• Eyes not right
• No fur

Slide 11

Slide 11 text

Onyx Features?
• Ears
• Face
• Snout
• Feet

Slide 12

Slide 12 text

Sloth
• Circular face
• Short nose
• Narrow eyes
• Distinct mouth
• Hair
• Feet

Slide 13

Slide 13 text

Cats?
• Face
• Ears
• Nose
• Tail
• Feet

Slide 14

Slide 14 text

Watson

Slide 16

Slide 16 text

Watson Developer Cloud

Slide 17

Slide 17 text

IBM X-Force https://www.ibm.com/security/xforce/

Slide 18

Slide 18 text

How Can I Do This Myself?
• GPUs
• Kubernetes
• TensorFlow

Slide 19

Slide 19 text

Stacks on Stacks on Stacks
Bare-metal -> OpenStack -> Virtual Machines -> Runtime -> Kubernetes -> TensorFlow

Slide 20

Slide 20 text

Think of an Irish Breakfast

Slide 21

Slide 21 text

GPU Characteristics
• Multiple Video Cards
• Driver installation
• Heterogeneous model distribution
• Resource fragmentation
• Failure scenarios

Slide 22

Slide 22 text

Multiple Video Cards
http://www.softlayer.com/GPU

Slide 23

Slide 23 text

Driver Installation
• Host - container driver sets
• Kubernetes DaemonSets
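A DaemonSet runs one copy of a pod on every matching node, which is why it suits per-host driver setup. The sketch below builds such a manifest as a plain Python dictionary. The installer image and the host directory are hypothetical placeholders, not artifacts named in the talk, and `apps/v1` is the current DaemonSet API group (clusters from the Kubernetes 1.6 era used `extensions/v1beta1`).

```python
import json

# Hypothetical values: substitute a real installer image and driver path.
INSTALLER_IMAGE = "example.com/nvidia-driver-installer:latest"
HOST_DRIVER_DIR = "/opt/nvidia-drivers"


def driver_daemonset(name, installer_image, host_driver_dir):
    """Build a DaemonSet manifest that runs an installer pod on every node."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "installer",
                        "image": installer_image,
                        # Driver installation usually needs host-level access.
                        "securityContext": {"privileged": True},
                        "volumeMounts": [{
                            "name": "driver-dir",
                            "mountPath": host_driver_dir,
                        }],
                    }],
                    # hostPath shares the installed driver files with the host,
                    # so GPU containers on the same node can mount them later.
                    "volumes": [{
                        "name": "driver-dir",
                        "hostPath": {"path": host_driver_dir},
                    }],
                },
            },
        },
    }


manifest = driver_daemonset("gpu-driver-installer", INSTALLER_IMAGE, HOST_DRIVER_DIR)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts manifests in JSON as well as YAML.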

Slide 24

Slide 24 text

Heterogeneous Video Cards
• Node selectors
https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#api
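A node selector pins a pod to nodes that carry a given label, so a job can ask for a specific card model. The sketch below builds a pod manifest that requests GPUs and selects nodes by label. The label key `example.com/gpu-model` and the pod name are hypothetical and would have to match labels applied to the cluster's own nodes. `alpha.kubernetes.io/nvidia-gpu` is the alpha resource name from the Kubernetes 1.6 era; later releases expose GPUs through device plugins under names such as `nvidia.com/gpu`.

```python
import json

# Alpha GPU resource name from the Kubernetes 1.6 era.
GPU_RESOURCE = "alpha.kubernetes.io/nvidia-gpu"
# Hypothetical node label: an operator would apply it to each GPU node.
GPU_MODEL_LABEL = "example.com/gpu-model"


def gpu_pod_manifest(name, image, gpu_count, gpu_model):
    """Build a pod that asks for gpu_count GPUs on nodes labeled with gpu_model."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Only nodes whose label matches are candidates for this pod.
            "nodeSelector": {GPU_MODEL_LABEL: gpu_model},
            "containers": [{
                "name": name,
                "image": image,
                # GPUs are requested as a resource limit on the container.
                "resources": {"limits": {GPU_RESOURCE: gpu_count}},
            }],
        },
    }


pod = gpu_pod_manifest("tf-trainer", "tensorflow/tensorflow:latest-gpu", 2, "tesla-k80")
print(json.dumps(pod, indent=2))
```

Setting `gpu_count` above 1 is how a single pod asks for several cards at once.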

Slide 25

Slide 25 text

Resource Fragmentation
• Scheduler priorities
• Topology
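To see why scheduler priorities matter here, the toy simulation below places GPU requests on two nodes under two policies. It is not the Kubernetes scheduler, only a sketch of the idea: "spread" prefers the node with the most free GPUs, "pack" prefers the fullest node that still fits. The node names and request sizes are made up.

```python
def schedule(free_gpus, requests, prefer):
    """Place each request on a node chosen by `prefer` (the builtin min or max).

    min packs requests onto the node with the fewest free GPUs that still
    fits; max spreads them onto the node with the most free GPUs.
    """
    free = dict(free_gpus)  # copy so the caller's inventory is untouched
    placed, pending = [], []
    for request in requests:
        candidates = [node for node, gpus in free.items() if gpus >= request]
        if not candidates:
            pending.append(request)  # no single node has enough free GPUs
            continue
        # Ties on free GPU count are broken by node name for determinism.
        node = prefer(candidates, key=lambda n: (free[n], n))
        free[node] -= request
        placed.append((request, node))
    return placed, pending, free


nodes = {"node-a": 4, "node-b": 4}   # hypothetical cluster: two 4-GPU nodes
requests = [1, 1, 1, 1, 4]           # four small jobs, then one 4-GPU job

_, spread_pending, spread_free = schedule(nodes, requests, max)
_, pack_pending, pack_free = schedule(nodes, requests, min)

print("spread:", spread_pending, spread_free)
print("pack:  ", pack_pending, pack_free)
```

Under spreading, the four small jobs leave two GPUs free on each node, so the 4-GPU job has nowhere to land even though four GPUs are idle in total. Under packing, the small jobs fill one node and the large job takes the other.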

Slide 26

Slide 26 text

GPU Failure Scenarios
• Fan heat
• Insufficient power
• ...

Slide 27

Slide 27 text

Kubernetes 1.6 GPU Features
• Multi-GPU pod scheduling
• Video card discovery
• Basic failure recovery
• Only works with Docker

Slide 28

Slide 28 text

Where Are We Going?
• Advanced
  – Device recovery
  – Health checking features
  – Topology
  – Metrics, metrics, metrics
• Cleanup
  – Support in Container Runtime Interface (CRI)
• Container runtime independence
  – Use of NVML or libnvidia-container

Slide 29

Slide 29 text

Ask Me Anything on Kubernetes
• Use Cases for GPU/HPC (High Performance Computing)
• Kubernetes Networking
• Kubernetes Features
• Prometheus
• Other CNCF

Slide 30

Slide 30 text

If You Just Want to Talk
• Cars
• Coffee
• Cooking
• Fishing
• World Culture

Slide 31

Slide 31 text

Questions?
• GPUs
• Kubernetes
• ...

Slide 32

Slide 32 text

Thank You!
• Christopher M Luciano
• Advisory Software Engineer @ IBM
• Part of Open Source Technology Team in IBM Digital Business Group
• Contributor to Kubernetes (SIG Network & SIG Node)
• Work on Cloud Native Computing Foundation (CNCF) projects
• @cmluciano_ on Twitter
• github.com/cmluciano