AI and ML with HashiCorp Nomad

In this presentation, I explain how to use the NVIDIA Device Driver for HashiCorp Nomad to deploy AI and ML workloads.

This version of the talk was given at HashiConf Boston, in October 2024.

Kerim Satirli

October 16, 2024

Transcript

  1. Cost of Hardware

    - H100: 4000 TFLOPS, US$ 28,500
    - A100: 600 - 1200 TOPS, US$ 22,000
    - Jetson Orin series: 270 TOPS, $ 500 - 2,000
    - GeForce RTX 3090: 230 TOPS, US$ 1,000
    - Jetson Xavier series: 20 - 30 TOPS, $ 500 - 800
    - Raspberry Pi 5 with AI Accelerator: 10 TOPS, US$ 100
  3. [Diagram: eight 5GB memory slices and seven compute slices]
  4. Targeting Partitions (gapcloser.nomad.hcl)

        job "gapcloser-scrapers" {
          datacenters = ["*"]
          namespace   = "gapcloser"
          node_pool   = "gpu_instances"
          type        = "batch"

          group "scrapers" {
            count = 1

            task "code-scraper" {
              driver = "docker"

              config {
                image   = "nvidia/cuda:12.6.0-base"
                command = "nvidia-smi"
              }

              resources {
                cores  = 16
                memory = 8192

  5. Targeting Partitions (gapcloser.nomad.hcl)

              driver = "docker"

              config {
                image   = "nvidia/cuda:12.6.0-base"
                command = "nvidia-smi"
              }

              resources {
                cores  = 16
                memory = 8192

                device "nvidia/gpu/NVIDIA A100-SXM4-40GB MIG 1g.5gb" {
                  count = 3
                }

                device "nvidia/gpu/NVIDIA A100-SXM4-40GB MIG 3g.20gb" {
                  count = 1
                }
              }
            }
          }
        }

  8. Fingerprinting Options

    - Memory: MiB
    - Power: Watt
    - BAR1 size: MiB
    - Driver version: (Semantic) version string
    - PCI bandwidth: MB/s
    - Core and memory clock speed: MHz
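
The fingerprinted values listed on the last slide can also be used to select a GPU by its properties rather than by its full device name. The following is a minimal sketch and is not taken from the talk: the job, group, and task names are hypothetical, the resource sizes and thresholds are arbitrary, and it assumes the NVIDIA device plugin exposes GPU memory as `device.attr.memory`, as the Nomad `device` block documentation describes.

```hcl
# Hypothetical job: all names and sizes here are illustrative assumptions.
job "gpu-constraint-example" {
  datacenters = ["*"]
  type        = "batch"

  group "example" {
    task "smi" {
      driver = "docker"

      config {
        image   = "nvidia/cuda:12.6.0-base"
        command = "nvidia-smi"
      }

      resources {
        cores  = 2
        memory = 1024

        # Request any NVIDIA GPU instead of naming a specific model or partition.
        device "nvidia/gpu" {
          count = 1

          # Hard requirement: only devices with at least 4 GiB of memory qualify.
          constraint {
            attribute = "${device.attr.memory}"
            operator  = ">="
            value     = "4 GiB"
          }

          # Soft preference: favor devices with 16 GiB or more when available.
          affinity {
            attribute = "${device.attr.memory}"
            operator  = ">="
            value     = "16 GiB"
            weight    = 50
          }
        }
      }
    }
  }
}
```

Compared with the job file on the slides, which pins exact MIG profiles by name, this form trades precision for portability: the same job can be placed on any node whose fingerprinted GPU satisfies the constraint.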