Soul of a New Machine: Rethinking the Server-Side Computer

Talk given at Stanford's EE380 Computer Systems Colloquium on February 26, 2020, about what we're up to at Oxide Computer Company. Video: https://youtu.be/vvZA9n3e5pc

Bryan Cantrill

April 02, 2020

Transcript

  1. Soul of a New Machine
    Rethinking the Server-Side Computer
    Bryan Cantrill
    Oxide Computer Company

  2. Server-side computing, ca. 1961

  3. Server-side computing, ca. 1961 (cont.)
    Source: Martin Weik, A Third Survey of Domestic Electronic Digital Computing Systems (1961)

  4. Server-side computing, ca. 1975
    Source: Retro-computing Society of Rhode Island

  5. Server-side computing, ca. 1999

  6. Server-side computing, ca. 2009

  7. Hyperscale computing, ca. 2009
    Source: Stephen Shankland/CNET: Google uncloaks once-secret server (2009)

  8. Hyperscale computing, ca. 2020

  9. Server-side computing, ca. 2009

  10. Server-side computing, ca. 2020

  11. The problem
    ● There remain good reasons to own and operate one’s own computer!
    ● But the world has bifurcated: fit-for-purpose infrastructure for
    hyperscalers; rack-mounted personal computers for everyone else
    ● Worse, the commercial server world is split between software-agnostic
    hardware and putatively hardware-agnostic software
    ● Result is a cobbled-together system that the end-user is left to design,
    integrate, operate -- and support
    ● Problems are up and down the stack; we need a new approach

  12. Towards a solution: Hardware/firmware
    ● We need a real hardware root-of-trust, offering firmware attestation
    ● We need a fit-to-purpose BMC, with much less surface area
    ● We need host firmware confined to booting a host operating system
    ● We need a true rack-scale design in which a top-of-rack switch is
    co-designed with compute and storage
    ● We need this in a dense form factor that allows for efficient operation!
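
The firmware attestation mentioned in the first bullet can be illustrated with a measurement chain, in which each boot stage is folded into a running value before it is handed control. The following is only a minimal sketch under stated assumptions, not Oxide's design: the names `extend`, `measure_chain`, and `attest` are hypothetical, and `DefaultHasher` from the Rust standard library is a non-cryptographic stand-in used to keep the example self-contained. A real root of trust would use a cryptographic hash and sign the result.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Digest one firmware image.
/// ASSUMPTION: DefaultHasher is NOT cryptographic; it stands in for a
/// real hash function purely so the sketch needs no external crates.
fn digest(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(bytes);
    h.finish()
}

/// Fold the digest of `image` into the running measurement.
/// The new value depends on the old one, so order and content both matter.
pub fn extend(measurement: u64, image: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(&measurement.to_be_bytes());
    h.write(&digest(image).to_be_bytes());
    h.finish()
}

/// Measure a whole boot chain, starting from a zero measurement.
pub fn measure_chain(stages: &[&[u8]]) -> u64 {
    stages.iter().fold(0, |m, image| extend(m, image))
}

/// A verifier compares the reported measurement to a known-good value.
pub fn attest(stages: &[&[u8]], expected: u64) -> bool {
    measure_chain(stages) == expected
}
```

Under these assumptions, a verifier that knows the measurement of a good chain would reject a chain in which any stage has been modified or reordered, since every later value depends on every earlier one.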

  13. Towards a solution: Software
    ● Rack-scale designs necessitate integrated software: hypervisor, control
    plane, storage, ToR + API endpoints for both operator and developer
    ● But the era of proprietary infrastructure software is over: it must be
    fully open and attested!
    ● Much of what needs to be built is software, albeit at very low levels
    (hardware root-of-trust, service processor, boot software, etc.)

  14. Is a solution attainable?
    ● On the one hand, there is an outrageous amount to be done, with many
    different problems that need to be solved concurrently...
    ● But on the other, the solution can be tightly tailored: co-designing
    hardware with software allows for elimination of false generalities
    ● And there are several interesting hardware and software trends that
    make a solution more attainable than it has been historically…

  15. Trends: Hardware components
    ● The industry has recognized the need to collaborate on a hardware
    root-of-trust, e.g. Microsoft Cerberus, Google/lowRISC OpenTitan
    ● The open EDA movement has made FPGA design and implementation
    easier than ever, e.g. Yosys, Chisel, SpinalHDL, Bluespec
    ● RISC-V has allowed for free soft cores -- and the end of Moore’s Law has
    meant that these cores are viable for sophisticated software
    ● Open firmware is arriving (at long last!) and being encouraged by the
    Open Compute Project

  16. Trends: System software components
    ● Infrastructure software is entirely open source: many interesting
    production-grade components new in just the past few years!
    ● And -- perhaps surprisingly -- there’s been a very important
    development in programming languages...
    ● Rust may be the most important system software development in four
    decades: if C was portable assembly, Rust is safe C!
    ● Small Rust-based systems like Tock show particular promise...
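
As one small illustration of the "safe C" framing, consider reading from a buffer whose length is claimed by untrusted input. This is a sketch only; the function names and the length-prefixed packet layout are hypothetical and do not come from the talk. In C, indexing past the end of the buffer is undefined behavior, whereas in safe Rust the out-of-range case becomes an ordinary value that the caller has to handle.

```rust
/// Read the byte at `index`, or None if `index` is past the end.
/// The equivalent C expression `buf[index]` would be undefined
/// behavior for an out-of-range index.
pub fn read_byte(buf: &[u8], index: usize) -> Option<u8> {
    buf.get(index).copied()
}

/// Sum the payload of a length-prefixed packet (hypothetical layout:
/// the first byte is the claimed payload length).
/// Returns None if the packet is empty or claims more bytes than
/// were actually received, instead of reading adjacent memory.
pub fn checked_payload_sum(packet: &[u8]) -> Option<u32> {
    let (&len, rest) = packet.split_first()?;
    let payload = rest.get(..len as usize)?;
    Some(payload.iter().map(|&b| u32::from(b)).sum())
}
```

A packet that overstates its own length yields `None` here rather than an over-read, which is the class of bug the slide's comparison with C points at.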

  17. Soul of a new computer company
    ● We started Oxide, a computer company in Emeryville, California
    ● Raised seed capital end of 2019, ramping team now (Feb. 2020)
    ● Aiming for functional prototypes in 2021, customer systems in 2022
    ● We are looking for like minds and kindred spirits!
    ● Learn more about Oxide at https://oxide.computer
    ● Check out our podcast, On the Metal!
