Hybrid GPU Orchestration for Enterprise AI at Scale

In this presentation, I dive into how to think about hybrid GPU-enabled workloads and why HashiCorp Nomad is the right foundation for an Enterprise AI platform.

This version of the talk was given at IBM's booth at NVIDIA GTC in March 2026.

Kerim Satirli

March 17, 2026

Transcript

  1. Hybrid GPU Orchestration for Enterprise AI at Scale

    Kerim Satirli, Senior Developer Advocate II, HashiCorp, an IBM Company
    NVIDIA GTC 2026
  2. Workload Locations

    • Edge: mobile deployments and disconnected locations (NVIDIA Jetson Series, disposable compute)
    • On-Prem: traditional DCs and failover locations (Linux distributions, Windows Server)
    • Cloud: AWS, Azure, Google, IBM Cloud, and more (traditional compute, specialized compute)
  3. Workload Locations

    • Edge: mobile deployments and disconnected locations (NVIDIA Jetson Series, disposable compute)
    • On-Prem: traditional DCs and failover locations (Linux distributions, Windows Server)
    • Cloud: AWS, Azure, Google, IBM Cloud, and more (traditional compute, specialized compute)

    HashiCorp Nomad
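As a rough illustration of how one of these locations could be targeted, the sketch below pins a job to an edge site. The datacenter name, job name, and image are placeholders that do not appear in the talk; the `arm64` constraint assumes the edge nodes are ARM-based Jetson devices.

```hcl
# Hypothetical job for an edge location; every name here is illustrative.
job "edge-inference" {
  datacenters = ["edge-site-1"] # assumed datacenter name for an edge site
  type        = "service"

  # Only place this job on arm64 clients (assumption: Jetson-class nodes).
  constraint {
    attribute = "${attr.cpu.arch}"
    value     = "arm64"
  }

  group "inference" {
    task "model-server" {
      driver = "docker"

      config {
        image = "registry.example.com/model-server:1.0" # placeholder image
      }
    }
  }
}
```

The same job structure would apply to on-prem or cloud clients; only the `datacenters` list and the constraints would change.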
  4. Workload Support

    • Executables (traditional workloads): isolated and raw execution, Windows binaries, macOS binaries
    • Containers
    • Runtimes
  5. Workload Support

    • Executables (traditional workloads): isolated and raw execution, Windows binaries, macOS binaries
    • Containers (modern workloads): Docker, Podman, containerd
    • Runtimes
  6. Workload Support

    • Executables (traditional workloads): isolated and raw execution, Windows binaries, macOS binaries
    • Containers (modern workloads): Docker, Podman, containerd
    • Runtimes (specialized workloads): Java, QEMU, libvirt
  7. Workload Support

    • Executables (traditional workloads): isolated and raw execution, Windows binaries, macOS binaries
    • Containers (modern workloads): Docker, Podman, containerd
    • Runtimes (specialized workloads): Java, QEMU, libvirt

    HashiCorp Nomad
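In a Nomad job, the workload type is selected through each task's `driver`. The sketch below places one task per category side by side; the job name, binary path, image, and JAR path are hypothetical and not taken from the talk.

```hcl
# Illustrative only: three tasks, each using a different task driver.
job "mixed-workloads" {
  datacenters = ["dc1"]

  group "examples" {
    # Executable: isolated execution of a binary on the client.
    task "binary" {
      driver = "exec"

      config {
        command = "/usr/local/bin/report-generator" # hypothetical binary
        args    = ["--once"]
      }
    }

    # Container: run through the Podman task driver.
    task "container" {
      driver = "podman"

      config {
        image = "docker.io/library/redis:7" # placeholder image
      }
    }

    # Runtime: a JVM application. This assumes the JAR has already been
    # placed in the task's local/ directory, for example by an artifact block.
    task "jvm" {
      driver = "java"

      config {
        jar_path    = "local/app.jar"
        jvm_options = ["-Xmx256m"]
      }
    }
  }
}
```

Each driver has to be installed and enabled on the client nodes; Podman in particular is provided by a separate driver plugin.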
  8. Workload Specification

    • Job: defines basic job properties (datacenter and region, type of job, update strategy)
  9. Workload Specification

    • Job: defines basic job properties (datacenter and region, type of job, update strategy)
    • Group: defines how to co-locate tasks (network config, volume config, service discovery)
  10. Workload Specification

    • Job: defines basic job properties (datacenter and region, type of job, update strategy)
    • Group: defines how to co-locate tasks (network config, volume config, service discovery)
    • Task: defines atomic units of work (driver selection, task environment, resource requirements)
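The three levels map onto nested blocks in a job file. The skeleton below is a minimal sketch of where each listed property lives; the names, image, port, and volume are placeholders chosen for illustration.

```hcl
job "example" {
  # Job level: basic job properties.
  region      = "global"
  datacenters = ["dc1"]
  type        = "service"

  # Update strategy: roll out one allocation at a time.
  update {
    max_parallel = 1
  }

  group "app" {
    # Group level: everything shared by co-located tasks.
    network {
      port "http" {
        to = 8080
      }
    }

    # Assumes a host volume named "data" is configured on the client.
    volume "data" {
      type   = "host"
      source = "data"
    }

    # Service discovery registration using Nomad's built-in provider.
    service {
      name     = "app"
      port     = "http"
      provider = "nomad"
    }

    task "server" {
      # Task level: driver, environment, and resource requirements.
      driver = "docker"

      config {
        image = "registry.example.com/app:1.0" # placeholder image
        ports = ["http"]
      }

      env {
        LOG_LEVEL = "info"
      }

      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
```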
  11. GPU Workloads with Nomad

    docling.nomad.hcl

    job "docling" {
      datacenters = ["dc1"]
      type        = "service"

      # Only place this job on nodes whose node class is "linux".
      constraint {
        attribute = "${node.class}"
        value     = "linux"
      }

      group "api" {
        task "docling-serve" {
          driver = "podman"

          config {
            image   = "quay.io/docling-project/docling-serve:latest"
            command = "docling-serve"
          }

          # Nomad expects device requests inside the task's resources block.
          resources {
            device "nvidia/gpu" {
              count = 1
            }
          }
        }
      }
    }
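A device request can also carry its own constraints and affinities, which helps when a cluster mixes GPU models. The following is a sketch under assumptions: the memory thresholds and the weight are arbitrary example values, not figures from the talk.

```hcl
job "docling" {
  datacenters = ["dc1"]
  type        = "service"

  group "api" {
    task "docling-serve" {
      driver = "podman"

      config {
        image = "quay.io/docling-project/docling-serve:latest"
      }

      resources {
        device "nvidia/gpu" {
          count = 1

          # Hard requirement: skip GPUs with less memory than this
          # (threshold is an illustrative value).
          constraint {
            attribute = "${device.attr.memory}"
            operator  = ">="
            value     = "16 GiB"
          }

          # Soft preference: favor larger GPUs when one is available.
          affinity {
            attribute = "${device.attr.memory}"
            operator  = ">="
            value     = "40 GiB"
            weight    = 50
          }
        }
      }
    }
  }
}
```

The constraint filters out devices that do not qualify, while the affinity only influences scoring among the devices that remain.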
  14. GPU Workloads with Nomad

    NVIDIA H200 MIG profile: the GPU is split into seven instances, each with 18GB of memory and 1 compute slice.
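  18. GPU Workloads with Nomad

    docling.nomad.hcl

    job "docling" {
      # other config hidden

      group "api" {
        task "docling-serve" {
          # other config hidden

          # Whole-GPU request from the earlier slides, presumably
          # superseded by the MIG-specific request below:
          # device "nvidia/gpu" { count = 1 }

          resources {
            # Request a single MIG instance by its full device name.
            device "nvidia/gpu/NVIDIA H200 MIG 1g.18gb" {
              count = 1
            }
          }
        }
      }
    }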