[PyCon US 2026 Packaging Summit LT] More Variant, More Diversity for AI Accelerators

Joongi Kim

May 15, 2026

Transcript

  1. Same Analogy, Same Problem

    Python Package Index
    Host CPU Features / GPU Features / NPU Features
    pip install:
      • Query package list
      • Package index with variant props
      • Variant props from local variant providers
      • Download the matched wheel variant
    Variant props:
      x86_64 :: level :: v3
      nvidia :: cuda_version :: 12.8
      nvidia :: sm_arch :: 120_real
      rebellions :: arch :: atom-max
      furiosa :: arch :: renegade
  2. Same Analogy, Same Problem

    Python Package Index
    Host CPU Features / GPU Features / NPU Features
    pip install:
      • Query package list
      • Package index with variant props
      • Variant props from local variant providers
      • Download the matched wheel variant
    Variant props:
      x86_64 :: level :: v3
      nvidia :: cuda_version :: 12.8
      nvidia :: sm_arch :: 120_real
      rebellions :: arch :: atom_max
      furiosa :: arch :: renegade

    Container Registry
    Host CPU Features / GPU Features / NPU Features
    docker pull:
      • Query image list
      • Image index with variant props
      • Variant props from local variant providers
      • Pull the matched container variant
    Variant props:
      x86_64 :: level :: v3
      nvidia :: cuda_version :: 12.8
      nvidia :: sm_arch :: 120_real
      rebellions :: arch :: atom_max
      furiosa :: arch :: renegade
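The matching step shared by both flows above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a variant is treated as compatible when every property it requires is reported by the local providers, and the most specific compatible variant wins. It is not the selection algorithm specified in PEP 817, and `Variant`, `parse_prop`, and `select_variant` are hypothetical names, not part of pip or any registry client.

```python
from __future__ import annotations

from dataclasses import dataclass

# A variant property is a (namespace, feature, value) triple, written on the
# slides as "namespace :: feature :: value".
Prop = tuple[str, str, str]


def parse_prop(text: str) -> Prop:
    """Split 'nvidia :: cuda_version :: 12.8' into its three parts."""
    namespace, feature, value = (part.strip() for part in text.split("::"))
    return (namespace, feature, value)


@dataclass(frozen=True)
class Variant:
    name: str               # hypothetical artifact name (wheel or image)
    props: frozenset[Prop]  # properties the artifact requires


def select_variant(variants: list[Variant], local: set[Prop]) -> Variant | None:
    """Return the compatible variant that requires the most properties.

    Assumption: a variant is compatible when all of its required properties
    are reported locally. Ties keep the earliest entry in the index.
    """
    compatible = [v for v in variants if v.props <= local]
    if not compatible:
        return None
    return max(compatible, key=lambda v: len(v.props))


# Properties reported by the local variant providers (values from the slide).
local_props = {
    parse_prop("x86_64 :: level :: v3"),
    parse_prop("nvidia :: cuda_version :: 12.8"),
    parse_prop("nvidia :: sm_arch :: 120_real"),
}

# A made-up index with a generic build and two accelerator builds.
index = [
    Variant("generic", frozenset()),
    Variant("cuda", frozenset({parse_prop("nvidia :: cuda_version :: 12.8")})),
    Variant("atom", frozenset({parse_prop("rebellions :: arch :: atom_max")})),
]

chosen = select_variant(index, local_props)
```

The same function works whether the index entries describe wheels or container images, which is the point of the analogy: only the artifact type changes, not the matching logic.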
  3. • Platform variants in OCI (Open Container Initiative)

    https://specs.opencontainers.org/image-spec/image-index/#platform-variants
    Borrows Golang's build target variant expressions
    Current Status
      - No consideration for accelerators
  4. • Lablup's Backend.AI adopted a custom tagging ruleset.

    pytorch:2.12.0-ubuntu24.04-py312
    pytorch:2.12.0-ubuntu24.04-cuda13-py312
    pytorch:2.12.0-ubuntu22.04-atom-py313
    pytorch:2.12.0-ubuntu22.04-...
    Current Status
      - Manual client-side selection (exact, partial, ...?)
      - No standardized tag naming across different vendors
      - Tags become too long for multi-feature-compatible images
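To make the limitation concrete, here is a minimal sketch of what client-side selection over such tags involves. The layout `name:version-os[-accelerator]-python` is an assumption inferred only from the three complete tags above; it is not the actual Backend.AI ruleset, and `ImageTag` and `parse_tag` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImageTag:
    name: str
    version: str
    os: str
    accelerator: Optional[str]  # None for images without an accelerator segment
    python: str


def parse_tag(ref: str) -> ImageTag:
    """Parse 'name:version-os[-accelerator]-python' (assumed layout)."""
    name, _, tag = ref.partition(":")
    parts = tag.split("-")
    if len(parts) == 3:
        version, os_name, python = parts
        accelerator = None
    elif len(parts) == 4:
        version, os_name, accelerator, python = parts
    else:
        # Any additional feature needs another positional segment, which is
        # why tags grow long and parsing rules multiply.
        raise ValueError(f"unsupported tag layout: {ref!r}")
    return ImageTag(name, version, os_name, accelerator, python)


refs = [
    "pytorch:2.12.0-ubuntu24.04-py312",
    "pytorch:2.12.0-ubuntu24.04-cuda13-py312",
    "pytorch:2.12.0-ubuntu22.04-atom-py313",
]
tags = [parse_tag(r) for r in refs]

# Exact-match selection: the client must already know the vendor's spelling.
cuda_images = [t for t in tags if t.accelerator == "cuda13"]
```

Every consumer has to agree on the segment order and on spellings such as `cuda13` or `atom`, which is the standardization gap the slide points out.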
  5. • Kubernetes DRA (Dynamic Resource Allocation), stable in v1.35

    Prioritized List [stable in v1.35] & DRAListTypeAttributes [alpha in v1.36]
      • Consumed by CEL expressions to generalize matching conditions
      • Specifies alternative combinations of device properties.
      • Updated DRAListTypeAttributes introduces typed values (bools, ints, strings, versions).
    Current Status
      - Looks promising, but complexity is still left to users (manual CEL expression writing...)
  6. • Why not just reuse a community-driven, community-proven standard?

    PEP 817 Wheel Variant
    https://peps.python.org/pep-0817/
    From WheelNext Summit 2025 @ Meta HQ
  7. • Problem

      - Choose appropriate nodes having compatible accelerators for a given workload
      - Need to support multi-node jobs (aka gang scheduling)
    • Solution: full automation, no user intervention at cluster scale
      - Per-node variant provider + Per-workload variant labels + Variant matcher within the scheduler (Backend.AI Sokovan Scheduler)
      - Workload's variant spec is populated from container image labels (scanned & cached in advance)
    https://github.com/lablup/backend.ai
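The combination of per-node providers, per-workload labels, and a matcher can be sketched as follows. This is a toy model, not the Sokovan scheduler's implementation: it assumes each node reports a set of variant properties, the workload's spec is a set of required properties, and a multi-node job is placed all-or-nothing. It ignores resource capacity and priorities, and `compatible_nodes` and `gang_schedule` are hypothetical names.

```python
from __future__ import annotations

# A variant property as (namespace, feature, value).
Prop = tuple[str, str, str]


def compatible_nodes(
    node_props: dict[str, set[Prop]], required: set[Prop]
) -> list[str]:
    """Names of nodes whose reported properties cover the workload's spec."""
    return sorted(
        name for name, props in node_props.items() if required <= props
    )


def gang_schedule(
    node_props: dict[str, set[Prop]], required: set[Prop], num_nodes: int
) -> list[str] | None:
    """Pick num_nodes compatible nodes, or None if the gang cannot be formed.

    All-or-nothing: a partial placement is never returned.
    """
    candidates = compatible_nodes(node_props, required)
    if len(candidates) < num_nodes:
        return None
    return candidates[:num_nodes]


# Properties reported by a hypothetical per-node variant provider.
cluster = {
    "node-a": {("nvidia", "cuda_version", "12.8"), ("x86_64", "level", "v3")},
    "node-b": {("nvidia", "cuda_version", "12.8"), ("x86_64", "level", "v3")},
    "node-c": {("rebellions", "arch", "atom_max"), ("x86_64", "level", "v3")},
}

# Workload spec, as it might be read from container image labels.
workload = {("nvidia", "cuda_version", "12.8")}

placement = gang_schedule(cluster, workload, num_nodes=2)
too_large = gang_schedule(cluster, workload, num_nodes=3)
```

Because the workload's spec comes from image labels rather than from the user, the matcher can run without any manual selection step.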
  8. Q&A / Discussion

    joongi lablup.com
    Lablup Inc.: https://www.lablup.com
    Backend.AI: https://www.backend.ai
    Backend.AI GitHub: https://github.com/lablup/backend.ai
    Backend.AI Cloud: https://cloud.backend.ai