Hardware/Software Co-design: The Coming Golden Age

Talk given at RailsConf 2021, Video: https://www.youtube.com/watch?v=nY07zWzhyn4

Bryan Cantrill

April 14, 2021

Transcript

  1. Hardware/Software Co-design:
    The Coming Golden Age
    Bryan Cantrill
    Oxide Computer Company

  2. The hardware/software divide
    ● The shift to public cloud computing over the last fifteen years has
    allowed software and hardware to become disconnected
    ● On the one hand, this can be empowering: a SaaS offering can be built
    with no real understanding of the hardware beneath it
    ● But there’s a risk of taking software-centric thinking too far -- and
    drawing the mistaken conclusion that hardware is irrelevant (or worse)
    ● This overshoot in thinking is epitomized by Marc Andreessen’s 2011
    essay, “Why Software Is Eating the World”

  3. Revisiting Andreessen
    ● Certainly, the essay makes an important observation on the importance
    of software in essentially every domain:

  4. Revisiting Andreessen
    ● And the effect of Moore’s Law + open source + public cloud computing
    has indisputably lowered the cost of delivering software:

  5. Revisiting Andreessen
    ● But the essay errs in fetishizing software, mistakenly viewing extant
    industries as likely to be disrupted by SaaS alone:

  6. Revisiting Andreessen
    ● Software is important -- but the essay conflates software companies
    with companies that in fact integrate software and hardware
    ● Companies that Andreessen cited that have thrived -- Amazon, Google,
    etc. -- have very significant hardware components!
    ● Many software-only companies that are cited have disappointed:
    Zynga, Rovio, Groupon, LivingSocial, Foursquare
    ● Andreessen is dismissive of Apple (up 15X) -- and entirely ignores
    companies like NVIDIA (57X), AMD (14X), or even Intel (3X)!

  7. Revisiting Andreessen

  8. Revisiting another famous essay

  9. Gordon Moore, ca. 1965

  10. Gordon Moore, ca. 1965

  11. Gordon Moore, ca. 1965

  12. Gordon Moore, ca. 1965

  13. Moore’s Law?

  14. Gordon Moore, ca. 1965

  15. Moore’s Law!

  16. Gordon Moore, ca. 1965

  17. Moore’s Law?!

  18. So… Moore’s Law?
    ● In his 1965 paper, there is no Moore’s Law per se — just a bunch of
    incredibly astute and prescient observations
    ● The term “Moore’s Law” would be coined by Carver Mead in 1971 as
    part of his work on determining ultimate physical limits
    ● Moore updated the law in 1975 to a doubling of transistor density every
    two years -- compounded in the sketch below (Dennard scaling would be
    outlined in detail in 1974)
    ● For many years, Moore’s Law could be taken to mean doublings of
    transistor density, speed, and economics
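
    As a rough illustration of what “a doubling every two years” compounds to,
    here is a minimal Rust sketch; the 1975 baseline density is a made-up
    illustrative figure, not a number from the talk:

    ```rust
    /// Moore's 1975 formulation: transistor density doubles roughly every two years.
    fn projected_density(base_density: f64, base_year: u32, year: u32) -> f64 {
        let doublings = (year - base_year) as f64 / 2.0;
        base_density * 2f64.powf(doublings)
    }

    fn main() {
        // Hypothetical 1975 baseline, purely for illustration.
        let base = 5_000.0; // transistors per mm^2
        for &year in &[1985u32, 1995, 2005, 2015] {
            println!("{}: ~{:.0} transistors/mm^2", year, projected_density(base, 1975, year));
        }
    }
    ```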

  19. Moore’s Law: Good old days?
    ● The 1980s and early 1990s were great for Moore’s Law — so much so
    that computers needed a “turbo button” to counteract its effects (!!)
    ● But even in those halcyon years, Moore’s Law was leaving DRAM
    behind: memory was becoming denser but no faster
    ● An increasing number of workloads began hitting the memory wall (a rough
    sense of the scale is sketched below)
    ● Caching was necessary but insufficient...
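
    To give a sense of scale for the memory wall, here is an illustrative
    sketch; the clock rates and DRAM latencies are my own ballpark figures, not
    numbers from the talk:

    ```rust
    /// Roughly how many CPU cycles a single uncached DRAM access costs.
    fn stall_cycles(cpu_hz: f64, dram_latency_ns: f64) -> f64 {
        cpu_hz * dram_latency_ns * 1e-9
    }

    fn main() {
        // ~1990: a 33 MHz CPU waiting on ~120 ns DRAM -- a handful of cycles.
        println!("~1990: {:.0} cycles per miss", stall_cycles(33.0e6, 120.0));
        // Today: a 3 GHz core waiting on ~100 ns DRAM -- hundreds of cycles.
        println!("today: {:.0} cycles per miss", stall_cycles(3.0e9, 100.0));
    }
    ```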

  20. Moore’s Law: The memory wall
    ● By the mid-1990s, it had become clear that symmetric multiprocessing
    was the path to deliver throughput on multi-threaded workloads
    ● ...but SMP did nothing for single-threaded performance
    ● Deep pipelining and VLIW were — largely — failed experiments
    ● For single-threaded workloads, microprocessors turned to out-of-order
    and speculative execution to hide memory latency
    ● Even in simpler times, scaling with Moore’s Law was a challenge!

  21. Moore’s Law: Architectural shifts
    ● Dennard scaling ended in ~2006 due to current leakage…
    ● ...but by then chip multiprocessing was clearly the trajectory
    ● CMP was enhanced by simultaneous multithreading (SMT), which
    offered up to another factor of two on throughput
    ● Thanks to the earlier software work on SMP, CMP/SMT was less of a
    software performance apocalypse than some feared — but more of a
    security apocalypse than anyone anticipated!
    ● And “dark silicon” greatly limits CMP!

  22. Moore’s Law: Deceleration
    ● In August 2018, GlobalFoundries suddenly stopped 7nm development,
    citing economics -- it was simply too expensive to stay competitive
    ● GlobalFoundries’ departure left TSMC and Samsung on 7nm -- and Intel
    on 14nm, struggling to get to 10nm
    ● Intel’s Cannon Lake was three years late and an unmitigated disaster --
    and for Ice Lake/Cascade Lake, Intel is intermixing 14nm and 10nm
    ● Moving to 3nm/5nm requires moving beyond FinFETs to GAAFETs --
    and to EUV photolithography; new nodes are very expensive!

  23. Aside: Process nodes
    ● You may well wonder: when a process node is “7nm” or “5nm”, what
    exactly is seven nanometers or five nanometers long? (And, um, how big
    is a silicon atom anyway?)
    ● Answer to the second question: ~210 picometers!
    ● Answer to the first question: nothing! Unbelievably, the name of the
    process node no longer measures anything at all (!!) -- it is merely a
    rough expression of transistor density (and implication of process)
    ● E.g., 7nm ≈ 100 MTr/mm² (but there are lots of caveats); a rough sanity
    check follows below
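
    To make “the name measures nothing” concrete, a rough sanity check (my own
    arithmetic, not from the slides): at ~100 MTr/mm², each transistor occupies
    about 10,000 nm² on average -- a footprint on the order of 100 nm on a side,
    nowhere near 7 nm, and still hundreds of silicon atoms across:

    ```rust
    fn main() {
        let density_per_mm2 = 100.0e6;                 // ~100 MTr/mm^2 ("7nm" class)
        let nm2_per_mm2 = 1.0e12;                      // 1 mm = 1e6 nm
        let area_nm2 = nm2_per_mm2 / density_per_mm2;  // ~10,000 nm^2 per transistor
        let pitch_nm = area_nm2.sqrt();                // ~100 nm equivalent pitch
        let atoms_across = pitch_nm / 0.21;            // silicon atom ~210 pm across

        println!("~{:.0} nm^2 per transistor", area_nm2);
        println!("~{:.0} nm effective pitch (not 7!)", pitch_nm);
        println!("~{:.0} silicon atoms across that pitch", atoms_across);
    }
    ```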

  24. Moore’s Law
    ● Increased transistor density continues to be possible, but at a greatly
    slowed pace -- and at outsized cost
    ● Economically, Moore’s Law is indisputably ending
    ● But is there another way of looking at it?

  25. Another essay, further back in time...

  26. Theodore Wright, ca. 1936

  27. Wright’s Law
    ● In 1936, Theodore Wright studied the costs of aircraft manufacturing,
    finding that the cost dropped with experience
    ● Over time, when volume doubled, unit costs dropped by 10-15% (see the
    sketch below)
    ● This phenomenon has been observed in other technological domains
    ● In 2013, Jessika Trancik et al. found that Wright’s Law has better
    predictive power for transistor cost than Moore’s Law!
    ● Wright’s Law seems to hold, especially for older process nodes
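
    Wright’s Law is conventionally written as a power law in cumulative volume:
    cost(n) = cost(1) · n^(-b), where a 15% drop per doubling corresponds to
    b = -log2(0.85) ≈ 0.23. A minimal sketch with my own illustrative numbers:

    ```rust
    /// Wright's Law as a learning curve: each doubling of cumulative volume
    /// multiplies unit cost by `progress_ratio` (e.g. 0.85 for a 15% drop).
    fn unit_cost(first_unit_cost: f64, cumulative_units: f64, progress_ratio: f64) -> f64 {
        let b = -progress_ratio.log2(); // learning-curve exponent, ~0.23 for 0.85
        first_unit_cost * cumulative_units.powf(-b)
    }

    fn main() {
        // Illustrative volumes only -- not data from the talk.
        for &n in &[1.0, 2.0, 4.0, 1024.0, 1.0e6] {
            println!("after {:>9} units: {:.3}x the first-unit cost", n, unit_cost(1.0, n, 0.85));
        }
    }
    ```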

  28. Wright on market creation

  29. Wright foreshadowing Moore

  30. One final essay...

  31. W. Stanley Jevons, ca. 1865

  32. W. Stanley Jevons, ca. 1865

  33. Jevons foreshadowing Wright

  34. Aside: Never say “never”

  35. Aside: A contemporary weighs in on Jevons?

  36. Back to computing!
    ● Andreessen’s 2011 piece, while containing some truisms, is overly
    software-centric and misses hardware’s role entirely
    ● Moore’s Law -- while prescient! -- is indisputably slowing
    ● Wright’s Law, however, may still be holding for transistors -- especially
    at older process nodes (22nm, 40nm, 90nm, etc.)
    ● The Jevons Paradox has proven again and again to apply to computing:
    when general purpose computation is cheaper, we find more to do
    ● We can expect more computation in more places

  37. Compute everywhere?
    ● More computation doesn’t just mean computers in new places (à la IoT);
    it means CPUs showing up in what we once thought of as mere components
    ● E.g., open 32-bit CPUs replacing hidden, closed 8-bit microcontrollers
    ● We are already seeing CPUs on the NIC (SmartNIC), CPUs next to flash
    (e.g., open-channel SSD) and on the spindle (e.g. WD’s SweRV)
    ● New opportunities for hardware/software co-design: keep hardware
    simple and put more sophistication into software and/or soft logic
    ● There are several trends acting as accelerants for this shift...

  38. Open instruction sets
    ● X86 and ARM -- the two market victors -- are both encumbered by
    history and licensing
    ● RISC-V is an attempt to learn from the ISA mistakes of the past, in a
    vessel that is entirely open -- and with open implementations
    ● RISC-V is very promising, but there remain many gaps to close
    ● To succeed, RISC-V must focus as much on the SoC as the ISA -- while
    remaining entirely open!

  39. Open FPGAs
    ● FPGA bitstreams have historically been entirely proprietary -- and one
    is therefore dependent upon proprietary tools to generate them
    ● The Lattice iCE40 bitstream format was reverse engineered in 2015 by
    Claire Wolf, and can be entirely synthesized with an open toolchain!
    ● While Xilinx (AMD) and Altera (Intel) retain proprietary components
    (e.g., for timing models), newcomers like QuickLogic are entirely open
    ● See, e.g., SymbiFlow, Verilog to Routing (VTR), Yosys, OpenFPGA, and
    the (new!) Open Source FPGA Foundation

  40. Open HDLs
    ● Hardware description languages have traditionally been dominated by
    Verilog and (later) SystemVerilog
    ● Compilers have historically been proprietary -- and the languages
    themselves are error-prone
    ● In recent years we have seen a wave of new, open HDLs, e.g.: Chisel,
    nMigen, Bluespec, SpinalHDL, Mamba (PyMTL 3), HardCaml
    ● Of these, one is particularly noteworthy...

  41. Open HDL: Bluespec
    ● Bluespec is a high-level HDL that takes its inspiration from formal
    specification languages -- and strongly typed languages like Haskell
    ● Bluespec uses the expressiveness of the language to move away from
    individual signals -- and to atomic rules and interfaces
    ● This allows the compiler to do the hard work of connecting modules
    and proving correctness, greatly reducing verification time!
    ● In the words of Oxide engineer Arjen Roodselaar, “Bluespec is to
    SystemVerilog what Rust is to assembly”

  42. Open HDL: Bluespec
    ● Bluespec was proprietary for 20 years; open sourced in early 2020!
    ● We at Oxide feel that Bluespec is a profoundly transformative
    technology -- but not one that is broadly understood or appreciated!
    ● More details:
    ○ https://github.com/B-Lang-org/Documentation
    ○ https://github.com/B-Lang-org/bsc
    ○ https://github.com/oxidecomputer/cobalt

  43. Open source EDA
    ● Proprietary software has historically dominated EDA…
    ● Open source alternatives have existed for years -- but one in particular,
    KiCad, has enjoyed sufficiently broad sponsorship to close the gaps with
    professional-grade software
    ● The maturity of KiCad, coupled with the rise of quick-turn PCB
    manufacturing/assembly, has allowed for astonishing speed:
    ○ From conception to manufacturer in hours
    ○ From manufacturer to shipping board in days

  44. Board economics
    ● Single board computers are very accessible!
    ○ An STM32 Nucleo-144 board with 400 MHz Cortex-M7 CPU + 2
    MB of flash + 1 MB of RAM + all I/O peripherals for less than $30
    ○ A BeagleBone Black -- with 1 GHz Cortex-A8 CPU + 4 GB of flash +
    512 MB DDR3 + HDMI for less than $60!
    ● All documentation available online and without NDA -- and the
    BeagleBone Black is (nearly) entirely open
    ● The BeagleBone Black can also be used as a logic analyzer via sigrok

  45. Open source firmware
    ● The software that runs closest to the hardware is increasingly open,
    with drivers nearly (nearly!) always open
    ● Increasingly, we are seeing the firmware of unseen parts of the system
    become open as well, viz. the Open Source Firmware Conference
    ● This trend is slower in the 7nm SoCs -- but it’s happening!
    ● However, even in putatively open architectures, there generally still
    remains proprietary software in the form of boot ROMs -- and this
    proprietary software remains a problem!

  46. Embedded Rust
    ● Rust has proven to be a revolution for systems software: its rich type
    system, algebraic data types, and ownership model allow for fast, correct code
    ● Slightly more surprising has been Rust’s ability to get small -- which,
    coupled with its lack of a runtime, lets it fit everywhere (see the sketch below)!
    ● With its safety and expressive power, Rust represents a quantum leap
    over C -- and without losing performance or sacrificing size
    ● We at Oxide are working on a de novo Rust operating system for the
    embedded use case that we will (naturally?) open source; stay tuned!
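
    For a feel of how small Rust can get, here is a minimal bare-metal skeleton
    of the kind that fits on a Cortex-M part -- a generic sketch assuming the
    cortex-m, cortex-m-rt, and panic-halt crates, and emphatically not a piece
    of Oxide’s forthcoming operating system:

    ```rust
    #![no_std]   // no standard library: no heap, no OS services assumed
    #![no_main]  // the runtime crate supplies the reset handler and entry glue

    use cortex_m_rt::entry; // vector table + minimal startup code
    use panic_halt as _;    // on panic, just halt

    #[entry]
    fn main() -> ! {
        // Clock, GPIO, and peripheral bring-up would go here via a HAL crate.
        loop {
            cortex_m::asm::wfi(); // sleep until the next interrupt
        }
    }
    ```

    Built for a target like thumbv7em-none-eabihf, a skeleton like this compiles
    to a very small binary, with no runtime to carry along.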

  47. “That changed everything”

  48. A new Golden Age!
    ● Thanks to Moore’s Law, Wright’s Law and the rise of open source, it is
    easier to build hardware than ever before!
    ● We are going to see computers in many more places, posing challenges
    to us all to develop reliable, secure, high-performing systems
    ● Software remains essential, but we must not think of it in isolation; we
    must co-design the hardware and the software in our systems!
    ● The systems are open, the communities are welcoming! Let’s build!
