

[PyCon US 2026 Packaging Summit LT] Sharing malware scanning results of PyPI from multiple providers


Joongi Kim

May 15, 2026


Transcript

  1. Our Operation

    • My company, Lablup, offers Reservoir, a private mirror service for air-gapped AI cluster customers.
      Mirrored repositories: PyPI, Ubuntu, CentOS, CRAN
    • To satisfy security compliance rules and regulations, we are often asked to run security/malware scans on the artifacts before delivery.
      Delivery is usually done via hard disks carried by our FDE.
    • Delivery frequency: once per month
      We are combining access to Korea's local mirrors run by Kakao Corp.
  2. Current Status: Increasing Size

    • Size growth
      The sheer volume of PyPI artifacts: 40 TB in total now (doubled since 2024)
      https://pypi.org/stats/
      We (and our customers) are running out of disks!
  3. Current Status: Increasing Malware

    • Monthly malware scanning by Lablup
      AhnLab V3 (a Korean antivirus solution)
      ClamAV
    • Observation
      Rapid recent increase in supply chain attacks
      In recent days: copy-fail, dirty-frag, nginx-rift, ...
      The same is happening in PyPI, too
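As a rough illustration of the ClamAV half of the monthly scan above, the sketch below turns the text report of `clamscan` into a mapping of infected files. It relies on the standard `clamscan` report lines of the form `<path>: <signature> FOUND`. The function name, the mirror paths, and the sample report are made up for illustration, and this is not Lablup's actual pipeline.

```python
from __future__ import annotations


def parse_clamscan_output(output: str) -> dict[str, str]:
    """Map each infected file path to the signature name reported for it."""
    infected: dict[str, str] = {}
    for line in output.splitlines():
        line = line.strip()
        # Clean files end with "OK"; only "FOUND" lines indicate a detection.
        if not line.endswith(" FOUND"):
            continue
        # Split on the last ": " so that paths containing colons stay intact.
        path, _, signature = line[: -len(" FOUND")].rpartition(": ")
        if path:
            infected[path] = signature
    return infected


# Hypothetical report text; in practice this would be captured from clamscan.
sample = """\
/mirror/pypi/goodpkg-1.0.tar.gz: OK
/mirror/pypi/badpkg-0.1.tar.gz: Win.Test.EICAR_HDB-1 FOUND
"""
report = parse_clamscan_output(sample)
print(report)
```

A mirror builder could use such a mapping to drop the listed files from the disk image before delivery.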
  4. Towards Trustworthy PyPI

    • Registering Lablup as a malware scan result provider...?
    • Cooldown periods on the PyPI server side or the pip client side
      A core dev, Donghee Na, has set up a custom company-wide cooldown proxy at his company.
      How to protect other users?
      Things to consider:
        Structured skipping
        Urgent redistributions (e.g., responses to CVEs)
      (Discussed during Mike Fiedler's talk)
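The cooldown idea on the slide above can be sketched as a filter over release files: anything uploaded more recently than the cooldown window is held back, unless it is explicitly exempted as an urgent redistribution. This is a minimal sketch under assumptions. The `ReleaseFile` type, the `apply_cooldown` function, the 7-day default, and the filename-based exemption set are all hypothetical and do not describe how PyPI, pip, or the proxy mentioned in the talk works.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ReleaseFile:
    filename: str
    uploaded_at: datetime  # timezone-aware upload timestamp


def apply_cooldown(
    files: list[ReleaseFile],
    now: datetime,
    cooldown: timedelta = timedelta(days=7),  # arbitrary illustrative window
    urgent: frozenset[str] = frozenset(),  # filenames exempt from the cooldown
) -> list[ReleaseFile]:
    """Keep files older than the cooldown window, plus exempted urgent ones."""
    cutoff = now - cooldown
    return [f for f in files if f.uploaded_at <= cutoff or f.filename in urgent]


now = datetime(2026, 5, 15, tzinfo=timezone.utc)
files = [
    ReleaseFile("pkg-1.0.tar.gz", datetime(2026, 4, 1, tzinfo=timezone.utc)),
    ReleaseFile("pkg-2.0.tar.gz", datetime(2026, 5, 14, tzinfo=timezone.utc)),
    ReleaseFile("pkg-1.0.1.tar.gz", datetime(2026, 5, 14, tzinfo=timezone.utc)),
]
# pkg-1.0.1 stands in for an urgent CVE fix that should bypass the cooldown.
allowed = apply_cooldown(files, now, urgent=frozenset({"pkg-1.0.1.tar.gz"}))
print([f.filename for f in allowed])
```

Who gets to mark a release as urgent is the open question from the slide; the sketch only shows where such an exemption would plug in.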
  5. Towards Trustworthy PyPI Mirroring

    • Annotating each package with public scan results
      e.g., "This package was validated by a scan on 2025-06-01 by one or more scanning providers."
      Need to discuss/decide which metadata to include.
      Make it available through index APIs so that automated mirroring tools can decide whether to include or exclude flagged packages.
      Example: Google Assured Open Source SW (Java, Python packages)
      https://cloud.google.com/security/products/assured-open-source-software
      https://docs.cloud.google.com/security-command-center/docs/aoss-supported-packages-premium
      How to balance the open-source freedom to register new packages vs. providing trustworthy/validated packages?
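To make the annotation proposal above more concrete, here is one possible shape for per-file scan annotations and the include/exclude decision a mirroring tool might make from them. Everything here is an assumption for discussion: the `ScanResult` fields, the `"clean"`/`"flagged"` verdict values, the provider names, and the `should_mirror` policy are hypothetical, since the slide leaves the metadata to be decided and no such index API exists yet.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ScanResult:
    provider: str  # hypothetical scanning provider identifier
    scanned_at: str  # ISO 8601 date of the scan
    verdict: str  # assumed values: "clean" or "flagged"


def should_mirror(results: list[ScanResult], min_clean: int = 1) -> bool:
    """Include a file only if no provider flagged it and enough call it clean."""
    if any(r.verdict == "flagged" for r in results):
        return False
    clean_providers = {r.provider for r in results if r.verdict == "clean"}
    return len(clean_providers) >= min_clean


# Hypothetical annotations as a mirroring tool might read them from an index.
annotations = {
    "goodpkg-1.0.tar.gz": [
        ScanResult("provider-a", "2025-06-01", "clean"),
        ScanResult("provider-b", "2025-06-02", "clean"),
    ],
    "badpkg-0.1.tar.gz": [
        ScanResult("provider-a", "2025-06-01", "clean"),
        ScanResult("provider-b", "2025-06-02", "flagged"),
    ],
    "newpkg-0.0.1.tar.gz": [],  # not scanned by anyone yet
}
selected = [name for name, results in annotations.items() if should_mirror(results)]
print(selected)
```

In this sketch an unscanned file is excluded by default, which is one way to frame the freedom-versus-trust trade-off raised on the slide; a more permissive mirror could set `min_clean=0` and only drop flagged files.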
  6. Q&A / Discussion

    Lablup Inc.: https://www.lablup.com
    Backend.AI: https://www.backend.ai
    Backend.AI GitHub: https://github.com/lablup/backend.ai
    Backend.AI Cloud: https://cloud.backend.ai
    Daehyun Sung: dhsung lablup.com
    Joongi Kim: joongi lablup.com