HPC STX Outbrief - SC25 S-HPC Workshop

Ian Lee
November 16, 2025

This outbrief summarizes key technical outcomes and community needs from the 2025 HPC Security Technical Exchange (HPC STX) meetings. The session covers practical security implementations being deployed across federal HPC environments, including PL2→PL3 reference architectures, crypto-shredding standard operating procedures, KMS integration patterns, and SELinux configurations for storage tenancy isolation. We'll discuss critical vendor requirements that emerged from the community, including full SELinux support in schedulers and filesystems, workable FIPS modes for HPC workloads, multi-tenant fabric security, and Kubernetes STIGs tuned specifically for HPC environments. The talk also highlights opportunities for cross-agency coordination on AI model allow/deny lists, eBPF rule sets, audit profiles, and the identification of high-value security events. These discussions reflect the collective experience of federal agencies, national laboratories, and industry partners working to secure mission-critical computational infrastructure while maintaining research agility.

Transcript

  1. 60 Security Professionals Walk Into A Room: Outbrief from the 3rd HPC Security Technical Exchange
    Ian Lee, ShorePoint, Inc, [email protected]
  2. What is the HPC Security Technical Exchange (STX)?
    • “An event to bring together experts, practitioners, and enthusiasts in government high-performance computing (HPC) security to share insights, discuss challenges, and explore innovative solutions.”
    • Not a conference, but a “moderated discussion”
    • ~60 attendees from ~15 different organizations
    2 – ShorePoint, Inc. Proprietary Information
  3. Why 2025 felt different
    • Administration change has brought shift
    • Originally scheduled in April (~100 registered), held in September
    • Zero Trust, multi-tenancy, and PL3 moving from theory to pilots
    • Vendors and AOs engaging on HPC-specific realities, not just enterprise controls
    • “HPC is no longer out of scope”
  4. Recasting Baselines for HPC
    • Orienting systems around NIST SP 800-223 zones
    • Assessments: DOE EA red/purple teaming, share lessons across sites
    • RADIX: automated STIG checklist creation, JSON outputs, amendments for deviations
  5. Multi-Tenancy, MLS, and PL3 Patterns
    • Rising system costs → Increased interest in Multi-Tenancy
    • Caution on MLS, preference for same-level multi-tenant HPC
    • Pattern: PL2 landing zone, non-interactive PL3 compute, strong KMS, crypto-shredding
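The crypto-shredding element of the pattern on slide 5 can be illustrated with a minimal sketch: data is only ever stored encrypted, the key lives in a separate key store, and destroying the key is what makes the stored ciphertext unrecoverable. Everything here is an assumption for illustration. `ToyKeyStore` is a hypothetical in-memory stand-in for a KMS, not the API of any product named in the deck, and a one-time XOR pad stands in for the FIPS-validated cipher a real deployment would use.

```python
import secrets


class ToyKeyStore:
    """Hypothetical in-memory stand-in for a KMS (illustration only)."""

    def __init__(self):
        self._keys = {}  # object_id -> key bytes

    def wrap(self, object_id, plaintext):
        # Fresh random pad per object; a real system would use a
        # validated cipher such as AES, not a one-time XOR pad.
        pad = secrets.token_bytes(len(plaintext))
        self._keys[object_id] = pad
        return bytes(p ^ k for p, k in zip(plaintext, pad))

    def unwrap(self, object_id, ciphertext):
        pad = self._keys[object_id]  # raises KeyError once shredded
        return bytes(c ^ k for c, k in zip(ciphertext, pad))

    def shred(self, object_id):
        # Crypto-shredding: destroy the key, leave the ciphertext alone.
        del self._keys[object_id]


store = ToyKeyStore()
stored = store.wrap("job-output-1", b"tenant results")
recovered = store.unwrap("job-output-1", stored)  # works while the key exists
store.shred("job-output-1")
# From here on, store.unwrap("job-output-1", stored) raises KeyError:
# the ciphertext is still on disk, but nothing can decrypt it.
```

The point of the design is that sanitizing a tenant's data reduces to a small, auditable key deletion rather than an overwrite of every block on a large parallel filesystem.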
  6. Encryption and Key Management, Performance Included
    • At rest: SEDs and FIPS 140-3 components; hardware vs software
    • In transit: trend to hardware offload and multi-tenant RDMA fabrics
    • KMS options: HashiCorp Vault, Fortanix, Entrust/HyTrust; WEKA client-side keying
    • Future Intersection: Quantum computing, Post Quantum Cryptography
  7. Logging, Monitoring, and Threat Hunting at Scale
    • eBPF telemetry: Lintap and Panhandle for deep host event details
    • Auditd realities: rule ordering matters, workload-dependent performance; align with AU-2 intent
    • SOC use cases: SSH/file forensics, SELinux validation, network-to-user attribution, threat intelligence correlation; where does AI fit in?
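The "rule ordering matters" point on slide 7 follows from how the Linux audit subsystem evaluates a filter list: rules are checked in order and the first match decides the outcome. The sketch below is a simplified Python model of that first-match behavior, not auditd itself. The `Rule` type, the rule names, and the sample event are hypothetical, and the "never" default for unmatched events is an assumption of the model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rule:
    name: str
    action: str  # "always" (log the event) or "never" (suppress it)
    matches: Callable[[dict], bool]


def first_match(rules: List[Rule], event: dict) -> Tuple[str, Optional[str]]:
    """Return the action and name of the first rule that matches."""
    for rule in rules:
        if rule.matches(event):
            return rule.action, rule.name
    return "never", None  # model assumption: unmatched events are not logged


# Hypothetical event: a job writing temporary files on a scratch filesystem.
event = {"syscall": "openat", "path": "/scratch/job42/tmp.dat", "uid": 5001}

log_opens = Rule("log-opens", "always", lambda e: e["syscall"] == "openat")
skip_scratch = Rule("skip-scratch", "never",
                    lambda e: e["path"].startswith("/scratch/"))

broad_first = first_match([log_opens, skip_scratch], event)
exclude_first = first_match([skip_scratch, log_opens], event)
print(broad_first)    # ('always', 'log-opens')
print(exclude_first)  # ('never', 'skip-scratch')
```

With the broad rule first, the exclusion never takes effect and every scratch-file open is logged, which is where the workload-dependent performance cost comes from. Placing narrow exclusions ahead of broad rules avoids that.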
  8. Containers and Kubernetes the HPC Way
    • Increasing adoption of containers by staff and users
    • Charliecloud unprivileged approach; strict subuid/setuid posture
    • Podman vs Singularity tradeoffs; approval favorability of OpenShift; persistent containers discouraged
  9. Cloud HPC, Data Movement, and Hybrid Friction
    • Cloud cost uplift and capacity constraints; GPU scarcity in GovCloud
    • Hybrid needs: locality, rack-awareness, coordination with CSPs
    • Data Movement: Performance impacts of (lack of) data locality, cost
  10. AI in HPC: Use, Risk, and Governance
    • Use patterns: inference services, dev support, document agents, computational model surrogates
    • Governance: infrastructure approvals vs weights-as-data, model allow/deny lists, guardrails like NVIDIA NeMo
    • What role should AI play in HPC operations?
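One way to read the "model allow/deny lists" item on slide 10 is as a simple admission check applied before a model is loaded into an inference service. The sketch below assumes a deny-overrides-allow policy with default deny. The function name, the policy, and the model identifiers are all hypothetical and are not taken from the deck or from any agency's actual list.

```python
def is_model_permitted(model_id, allow_list, deny_list):
    """Deny entries win; anything not explicitly allowed is rejected."""
    normalized = model_id.strip().lower()
    if normalized in deny_list:
        return False
    return normalized in allow_list


# Hypothetical identifiers, for illustration only.
ALLOW = {"example-org/model-a", "example-org/model-b"}
DENY = {"example-org/model-b"}  # revoked after initial approval

print(is_model_permitted("Example-Org/Model-A", ALLOW, DENY))  # True
print(is_model_permitted("example-org/model-b", ALLOW, DENY))  # False
print(is_model_permitted("unknown/model", ALLOW, DENY))        # False
```

Letting the deny list override the allow list means a shared revocation takes effect even if a site's local allow list has not been updated yet, which is one reason cross-agency sharing of these lists could be useful.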
  11. Procurement and Supply Chain Realities
    • FIPS storage lead times; electrical equipment up to 60 months
    • RFP specifics: require diagrams, WiFi/Bluetooth/GPS removal, staff citizenship/clearance, named OS/firmware support
    • SBOMs, private mirrors, MFA for package authors
  12. Bad Days and Preemptive Mitigations
    • Pain points: spillage, filesystem DoS, quotas/inodes, power/water, zero-day drops on Fridays / holidays, SSH agent forwarding
    • Mitigations: set (filesystem, use) quotas early, onboarding and AUPs, vendor briefings, HPC-specific IR playbooks
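The quotas/inodes pain point on slide 12 is often an inode problem before it is a capacity problem: a job that creates millions of small files can exhaust a filesystem that still has free space. As a minimal sketch of an early-warning check, the code below uses Python's standard `os.statvfs` (Unix only). The function names and the 90% threshold are assumptions for illustration, and a production site would more likely rely on its filesystem's own quota tooling.

```python
import os


def inode_usage_percent(path):
    """Percentage of inodes in use on the filesystem containing `path`.

    Returns None when the filesystem does not report an inode total.
    """
    stats = os.statvfs(path)
    if stats.f_files == 0:
        return None
    used = stats.f_files - stats.f_ffree
    return 100.0 * used / stats.f_files


def inode_alert(path, threshold=90.0):
    """True when inode usage is at or above the (assumed) threshold."""
    usage = inode_usage_percent(path)
    return usage is not None and usage >= threshold


usage = inode_usage_percent("/")
if usage is None:
    print("filesystem does not report inode counts")
else:
    print(f"inode usage on /: {usage:.1f}%")
```

A check like this run periodically against scratch and home filesystems gives operators time to contact a user before the filesystem becomes unusable for everyone.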
  13. Calls to Action for Community and Vendors
    • Standard artifacts: PL2→PL3 reference designs, crypto-shredding SOPs, KMS integration patterns, SELinux + storage tenancy
    • Vendor Asks: full SELinux (in schedulers/filesystems), workable FIPS modes, multi-tenant fabrics, K8s STIGs tuned for HPC
    • Coordination: share allow/deny AI model lists, eBPF rule sets, audit profiles; what events are the most valuable?
  14. Calls to Action for SC25 Activities
    • Government HPC Operator + Vendor Meeting
      • Tues Nov 18, Email Ian for details
    • The Intersection of HPC and AI Infrastructure
      • Tues Nov 18, 17:15 - 18:45 CST, Rooms 240-241-242
    • Zero Trust in HPC Centers
      • Wed Nov 19, 12:15 - 13:15 CST, Rooms 261-262-265-266
  15. What’s Next?
    • Expect NIST SP 800-234 early 2026; 800-223 updates may follow
    • DoD RMC direction replacing RMF mechanics
    • Next STX in Spring 2026; planning on having classified sessions again; call for case studies and vendor gap responses
    • Email Ian if you aren’t on the notification list!