[PyCon US 2026 Packaging Summit LT] Sharing malware scanning results of PyPI from multiple providers

Sharing Malware Scan Results of PyPI from Multiple Providers
Daehyun Sung, Joongi Kim @ Lablup Inc.

Our Operation
• My company Lablup offers Reservoir, a private mirror service to air-gapped AI cluster customers.
  • PyPI, Ubuntu, CentOS, CRAN
• To meet security compliance requirements and regulations, we are often requested to run security/malware scanning of the artifacts before delivery.
  • The delivery is usually done via hard disks brought by our FDE.
• Delivery frequency: once per month
  • We are combining access to Korea's local mirrors by Kakao Corp.

Current Status: Increasing Size
• Size Growth
  • The sheer volume of PyPI artifacts: 40 TB in total now (doubled since 2024)
    https://pypi.org/stats/
  • We (and our customers) are running out of disks!

Current Status: Increasing Malware
• Monthly malware scanning by Lablup
  • AhnLab V3 (a Korean antivirus solution)
  • ClamAV
• Observation
  • Recent rapid increase of supply chain attacks
  • Recent days: copy-fail, dirty-frag, nginx-rift, ...
  • The same is happening in PyPI, too
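The monthly ClamAV pass over a mirror directory could be scripted roughly as follows. This is a minimal sketch, not Lablup's actual pipeline: the function names and the mirror path are hypothetical, and it assumes the `clamscan` command-line tool is installed and prints infected files as `<path>: <signature> FOUND`.

```python
import subprocess
from pathlib import Path


def parse_clamscan_output(text: str) -> dict[str, str]:
    """Map each infected file path to the signature name ClamAV reported."""
    hits: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Infected files are reported as "<path>: <signature> FOUND".
        if not line.endswith(" FOUND"):
            continue
        path, _, signature = line[: -len(" FOUND")].rpartition(": ")
        if path:
            hits[path] = signature
    return hits


def scan_mirror(root: Path) -> dict[str, str]:
    """Run clamscan recursively over a mirror directory (hypothetical helper)."""
    proc = subprocess.run(
        ["clamscan", "--recursive", "--infected", "--no-summary", str(root)],
        capture_output=True,
        text=True,
        check=False,
    )
    # clamscan exits with 0 (nothing found), 1 (virus found), or 2 (error).
    if proc.returncode not in (0, 1):
        raise RuntimeError(f"clamscan failed: {proc.stderr.strip()}")
    return parse_clamscan_output(proc.stdout)
```

Keeping the parser separate from the subprocess call makes it possible to feed in saved scan logs, or the output of a second scanner converted to the same line format.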

Towards Trustworthy PyPI
• Registering Lablup as a malware scan result provider...?
• Cooldown periods in the PyPI server side or pip client side
  • Core dev Donghee Na has applied a custom company-wide cooldown proxy at his company.
  • How to protect other users?
  • Things to consider
    • Structured skipping
    • Urgent redistributions (e.g., responses to CVEs)
  • (Discussed during Mike Fiedler's talk)
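To make the cooldown idea concrete, here is a minimal sketch of the filtering a cooldown proxy or mirroring tool might apply. Everything in it is an assumption for illustration: the `ReleaseFile` type, the `fixes_cve` flag standing in for an "urgent redistribution" marker, and the 7-day default are hypothetical and do not describe any existing PyPI or pip feature.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ReleaseFile:
    filename: str
    upload_time: datetime
    # Hypothetical marker for urgent redistributions (e.g., a CVE response).
    fixes_cve: bool = False


def passes_cooldown(
    f: ReleaseFile,
    now: datetime,
    cooldown: timedelta = timedelta(days=7),  # assumed value, not a standard
) -> bool:
    """Return True if the file may be served or mirrored."""
    if f.fixes_cve:
        # Urgent fixes bypass the waiting period.
        return True
    return now - f.upload_time >= cooldown


def filter_release_files(
    files: list[ReleaseFile],
    now: datetime,
    cooldown: timedelta = timedelta(days=7),
) -> list[ReleaseFile]:
    """Keep only the files that have aged past the cooldown or are urgent."""
    return [f for f in files if passes_cooldown(f, now, cooldown)]


if __name__ == "__main__":
    now = datetime(2026, 5, 15, tzinfo=timezone.utc)
    files = [
        ReleaseFile("pkg-1.0.tar.gz", datetime(2026, 5, 1, tzinfo=timezone.utc)),
        ReleaseFile("pkg-1.1.tar.gz", datetime(2026, 5, 14, tzinfo=timezone.utc)),
    ]
    print([f.filename for f in filter_release_files(files, now)])
```

The open question from the slide remains in the bypass branch: who is allowed to set the urgent marker, and how a mirror can trust it.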

Towards Trustworthy PyPI Mirroring
• Annotating each package with public scan results
  • e.g., "This package was validated by a scan on 2025-06-01 by one or more scanning providers."
  • Need to discuss/decide which metadata to include.
  • Make it available through index APIs so that automated mirroring tools can decide whether to include/exclude the flagged packages.
  • Example: Google Assured Open Source SW (Java, Python packages)
    https://cloud.google.com/security/products/assured-open-source-software
    https://docs.cloud.google.com/security-command-center/docs/aoss-supported-packages-premium
• How to balance open-source freedom to register new packages vs. providing trustworthy/validated packages?
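As a starting point for the metadata discussion, the sketch below shows one possible shape for per-package scan annotations and how a mirroring tool might act on them. The `ScanResult` fields, the `"clean"`/`"flagged"` verdict values, and the `should_mirror` policy are all hypothetical; no such fields exist in the current PyPI index APIs.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ScanResult:
    """One provider's verdict on one artifact (hypothetical schema)."""
    provider: str
    scanned_on: date
    verdict: str  # assumed values: "clean" or "flagged"


def should_mirror(results: list[ScanResult], min_clean_providers: int = 1) -> bool:
    """Decide whether a mirroring tool should include the artifact.

    Policy assumed here: any "flagged" verdict excludes the artifact;
    otherwise enough distinct providers must have reported it clean.
    """
    if any(r.verdict == "flagged" for r in results):
        return False
    clean_providers = {r.provider for r in results if r.verdict == "clean"}
    return len(clean_providers) >= min_clean_providers


if __name__ == "__main__":
    results = [
        ScanResult("provider-a", date(2025, 6, 1), "clean"),
        ScanResult("provider-b", date(2025, 6, 1), "clean"),
    ]
    print(should_mirror(results, min_clean_providers=2))
```

Counting distinct providers rather than individual results keeps a single provider from satisfying a multi-provider threshold by scanning the same artifact repeatedly.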

Q&A / Discussion
Lablup Inc.: https://www.lablup.com
Backend.AI: https://www.backend.ai
Backend.AI GitHub: https://github.com/lablup/backend.ai
Backend.AI Cloud: https://cloud.backend.ai
Daehyun Sung: dhsung lablup.com
Joongi Kim: joongi lablup.com
