ran a story on an implant supposedly found in Supermicro servers
• The story was met with skepticism – and the implant itself was never produced
• Much more interesting, Trammell Hudson showed in his “Modchips of the State” talk that an implant was technically possible…
was possible, Trammell’s talk also showed that an implant was broadly unnecessary…
• The (entirely proprietary) traditional BMC was so plagued by elementary security problems that an implant would be overkill
• Trammell pointed to the need for open source firmware – and a true root-of-trust à la Titan (Google), T2 (Apple) and Cerberus (Microsoft)
• This was all very inspiring to us at Oxide Computer Company, where we were taking a clean sheet of paper for the server ca. 2019…
to co-design software and hardware into a single, unified system to deliver elastic infrastructure
• We have not developed any of our own silicon, but have developed more or less everything else: our own rack design, our own compute sleds, our own switch (!) – and all of our own software
it with a microcontroller service processor (STM32H753) running our own operating system (Hubris!)
• We eliminated the TPM, replacing it with a dedicated root-of-trust (NXP LPC55S69), which also runs Hubris and measures and controls the SP
• We eliminated the UEFI BIOS (and SMM!), replacing it with host software that boots holistically, bringing up lowest-level units on the CPU directly
• We eliminated the switch operating system, controlling the switch as a (large, complicated!) PCIe device from an adjacent sled
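As a rough illustration of what "measures" means for the root-of-trust above: a measurement is a cryptographic digest of the image that is about to run, which is then compared against (or recorded alongside) an expected value. The sketch below is Python for brevity, not Hubris code. The names `measure` and `rot_check_sp`, the choice of SHA-256, and the "release only on match" policy are all assumptions made for illustration; the slides do not describe the actual LPC55S69 flow.

```python
import hashlib
import hmac


def measure(image: bytes) -> bytes:
    """Return a digest of a firmware image (SHA-256 is an assumed choice)."""
    return hashlib.sha256(image).digest()


def rot_check_sp(sp_image: bytes, expected: bytes) -> bool:
    """Hypothetical RoT policy: accept the SP image only if its
    measurement matches the expected digest (constant-time compare)."""
    return hmac.compare_digest(measure(sp_image), expected)


# Illustrative stand-ins for real firmware images.
good_image = b"sp firmware v1"
tampered_image = b"sp firmware v1 + implant"
expected_digest = measure(good_image)

print(rot_check_sp(good_image, expected_digest))
print(rot_check_sp(tampered_image, expected_digest))
```

The point of the sketch is only that any change to the measured image changes the digest, so a modified SP image no longer matches what the root-of-trust expects.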
Attacks can get very sophisticated (e.g., voltage glitching) – and making a system tamper-proof seems unlikely.
• Systems thinking is essential. Hardware/software co-design necessitates thinking across boundaries that are often insulated from one another – and allowed us to shrink each component to purpose.
• Eliminate needless abstractions. Systems thinking renders many abstractions vestigial; these should be eliminated wherever possible.
win. Embedding e.g. bitstreams and firmware into larger, signed artifacts (e.g., SP, host OS) greatly simplifies the system.
• Co-design allows for comprehensive software updates. When firmware exists in a holistic system, updates can be handled holistically – and update-based attacks can be considered and mitigated.
• The system is only as good as its key management. Provisioning an RoT in the factory required its own (very) complicated system – which consists of both human and IT components.
huge problem. We discovered two vulnerabilities on the LPC55 (CVE-2021-31532, CVE-2022-22819) – and we believe that both would have been found earlier if it were open.
• Hidden cores proliferate. There are an astonishing number of compute elements in the system, many of which are undocumented; these must be understood even if only to be turned off!
• Transparency is a constraint. Even well-behaved vendors are not transparent about low-level details – but it is absolutely essential to understand these details to build an effective system!
incentivize transparency? See, e.g., RFD 552, “Transparency in Hardware/Software Interfaces”
• Emerging compute needs like frontier models care about this problem; can we join forces? See, e.g., RAND’s “Securing AI Model Weights”
• Security-focused solutions often sacrifice performance (and, sadly, vice versa!); can we treat both as constraints?