Hardware/software Co-design: The Coming Golden Age

Talk given at RailsConf 2021. Video: https://www.youtube.com/watch?v=nY07zWzhyn4

Bryan Cantrill

April 14, 2021

Transcript

  1. Hardware/Software Co-design:
    The Coming Golden Age
    Bryan Cantrill
    Oxide Computer Company

  2. The hardware/software divide
    ● The shift to public cloud computing over the last fifteen years has
    allowed software and hardware to become disconnected
    ● On the one hand, this can be empowering: a SaaS offering can be built
    with no real understanding of the hardware beneath it
    ● But there’s a risk of taking software-centric thinking too far -- and
    drawing the mistaken conclusion that hardware is irrelevant (or worse)
    ● This overshot in thinking is epitomized by Marc Andreessen’s 2011
    essay, “Why Software is Eating the World”

  3. Revisiting Andreessen
    ● Certainly, the essay makes an important observation on the importance
    of software in essentially every domain:

  4. Revisiting Andreessen
    ● And the effect of Moore’s Law + open source + public cloud computing
    has indisputably lowered the cost of delivering software:

  5. Revisiting Andreessen
    ● But the essay errs in fetishizing software, mistakenly viewing extant
    industries as likely to be disrupted by SaaS alone:

  6. Revisiting Andreessen
    ● Software is important -- but the essay conflates software companies
    with companies that in fact integrate software and hardware
    ● Companies that Andreessen cited that have thrived -- Amazon, Google,
    etc. -- have very significant hardware components!
    ● Many software-only companies that are cited have disappointed:
    Zynga, Rovio, Groupon, LivingSocial, Foursquare
    ● Andreessen is dismissive of Apple (up 15X) -- and entirely ignores
    companies like NVIDIA (57X), AMD (14X), or even Intel (3X)!

  7. Revisiting Andreessen

  8. Revisiting another famous essay

  9. Gordon Moore, ca. 1965

  10. Gordon Moore, ca. 1965

  11. Gordon Moore, ca. 1965

  12. Gordon Moore, ca. 1965

  13. Moore’s Law?

  14. Gordon Moore, ca. 1965

  15. Moore’s Law!

  16. Gordon Moore, ca. 1965

  17. Moore’s Law?!

  18. So… Moore’s Law?
    ● In his 1965 paper, there is no Moore’s Law per se — just a bunch of
    incredibly astute and prescient observations
    ● The term “Moore’s Law” would be coined by Carver Mead in 1971 as
    part of his work on determining ultimate physical limits
    ● Moore updated the law in 1975 to be a doubling of transistor density
    every two years (Dennard scaling would be outlined in detail in 1974)
    ● For many years, Moore’s Law could be inferred to be doublings of
    transistor density, speed, and economics
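
As a rough illustration of what a fixed doubling period implies, the sketch below compounds density growth over time. Only the two-year doubling period comes from the slide; the function name `density_multiple` and the chosen time horizons are illustrative assumptions.

```python
def density_multiple(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor in transistor density after `years`,
    assuming density doubles once every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


# Hypothetical horizons, chosen only to show how quickly doublings compound.
for years in (2, 10, 20, 40):
    print(f"{years:>2} years -> {density_multiple(years):,.0f}x density")
```

Under this assumption, ten years is five doublings (32x) and twenty years is ten doublings (1,024x), which is why even a modest stretch in the doubling period matters so much.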

  19. Moore’s Law: Good old days?
    ● The 1980s and early 1990s were great for Moore’s Law — so much so
    that computers needed a “turbo button” to counteract its effects (!!)
    ● But even in those halcyon years, Moore’s Law was leaving DRAM
    behind: memory was becoming denser but no faster
    ● An increasing number of workloads began hitting the memory wall
    ● Caching was necessary but insufficient...
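
One way to see why caching alone falls short is the standard average-access-time estimate: hit cost plus miss rate times miss penalty. The sketch below uses made-up numbers (a 1-cycle hit, a 2% miss rate, and miss penalties of 10, 100, and 300 cycles) purely for illustration; none of them come from the talk.

```python
def average_access_cycles(hit_cycles: float, miss_rate: float,
                          miss_penalty_cycles: float) -> float:
    """Average cost of one memory access, in CPU cycles:
    every access pays the hit cost, and a fraction also pays the miss penalty."""
    return hit_cycles + miss_rate * miss_penalty_cycles


# Hypothetical miss penalties: as the CPU gets faster relative to DRAM,
# the same DRAM latency costs more and more CPU cycles.
for penalty in (10, 100, 300):
    avg = average_access_cycles(hit_cycles=1, miss_rate=0.02,
                                miss_penalty_cycles=penalty)
    print(f"miss penalty {penalty:>3} cycles -> average {avg:.1f} cycles/access")
```

Even with a 98% hit rate in this toy model, the average cost keeps climbing as the miss penalty grows, so a cache softens the memory wall without removing it.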

  20. Moore’s Law: The memory wall
    ● By the mid-1990s, it had become clear that symmetric multiprocessing
    was the path to deliver throughput on multi-threaded workloads
    ● ...but SMP did nothing for single-threaded performance
    ● Deep pipelining and VLIW were — largely — failed experiments
    ● For single-threaded workloads, microprocessors turned to out-of-order
    and speculative execution to hide memory latency
    ● Even in simpler times, scaling with Moore’s Law was a challenge!

  21. Moore’s Law: Architectural shifts
    ● Dennard scaling ended in ~2006 due to current leakage…
    ● ...but by then chip multiprocessing was clearly the trajectory
    ● CMP was enhanced by simultaneous multithreading (SMT), which
    offered up to another factor of two on throughput
    ● Thanks to the earlier software work on SMP, CMP/SMT was less of a
    software performance apocalypse than some feared — but more of a
    security apocalypse than anyone anticipated!
    ● And “dark silicon” greatly limits CMP!

  22. Moore’s Law: Deceleration
    ● In August 2018, GlobalFoundries suddenly stopped 7nm development,
    citing economics -- it was simply too expensive to stay competitive
    ● GlobalFoundries’ departure left TSMC and Samsung on 7nm -- and Intel
    on 14nm, struggling to get to 10nm
    ● Intel’s Cannon Lake was three years late and an unmitigated disaster --
    and for Ice Lake/Cascade Lake, Intel is intermixing 14nm and 10nm
    ● Moving to 3nm/5nm requires moving beyond FinFETs to GAAFETs --
    and to EUV photolithography; new nodes are very expensive!

  23. Aside: Process nodes
    ● You may well wonder: when a process node is “7nm” or “5nm”, what
    exactly is seven nanometers or five nanometers long? (And, um, how big
    is a silicon atom anyway?)
    ● Answer to the second question: ~210 picometers!
    ● Answer to the first question: nothing! Unbelievably, the name of the
    process node no longer measures anything at all (!!) -- it is merely a
    rough expression of transistor density (and implication of process)
    ● E.g. 7nm ≈ 100 MTr/mm² (but there are lots of caveats)
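
The two figures on this slide can be put side by side with some back-of-the-envelope arithmetic. The sketch below treats the quoted density as if every transistor occupied an equal square footprint, and treats ~210 pm as the size of one atom; both are simplifying assumptions, and the function names are illustrative.

```python
import math

NM2_PER_MM2 = 1e12  # 1 mm = 1e6 nm, so 1 mm^2 = 1e12 nm^2


def footprint_side_nm(mtr_per_mm2: float) -> float:
    """Side of the square each transistor would occupy, in nm,
    if the quoted density (millions of transistors per mm^2) were spread evenly."""
    transistors_per_mm2 = mtr_per_mm2 * 1e6
    area_per_transistor_nm2 = NM2_PER_MM2 / transistors_per_mm2
    return math.sqrt(area_per_transistor_nm2)


def atoms_across(length_nm: float, atom_size_pm: float = 210.0) -> float:
    """How many atoms of the given size fit end to end along `length_nm`."""
    return length_nm * 1000.0 / atom_size_pm


print(f"average footprint at 100 MTr/mm^2: {footprint_side_nm(100):.0f} nm per side")
print(f"atoms spanning 7 nm: {atoms_across(7):.1f}")
```

At ~100 MTr/mm², each transistor averages about 10,000 nm² (a square roughly 100 nm on a side), while 7 nm would span only about 33 atoms at the slide's figure. The gap between those two numbers is consistent with the node name no longer being a physical measurement.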

  24. Moore’s Law
    ● Increased transistor density is continuing to be possible, but at a greatly
    slowed pace -- and at outsized cost
    ● Economically, Moore’s Law is indisputably ending
    ● But is there another way of looking at it?

  25. Another essay, further back in time...

  26. Theodore Wright, ca. 1936

  27. Wright’s Law
    ● In 1936, Theodore Wright studied the costs of aircraft manufacturing,
    finding that the cost dropped with experience
    ● Over time, when volume doubled, unit costs dropped by 10-15%
    ● This phenomenon has been observed in other technological domains
    ● In 2013, Jessika Trancik et al. found Wright’s Law to hold better
    predictive power for transistor cost than Moore’s Law!
    ● Wright’s Law seems to hold, especially for older process nodes
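
The 10-15% figure can be turned into a simple cost curve. The sketch below assumes the usual power-law form of Wright's Law, in which each doubling of cumulative volume multiplies unit cost by a fixed factor; the function name `unit_cost`, the starting cost of 100, and the sample volumes are illustrative assumptions rather than data from the talk.

```python
import math


def unit_cost(first_unit_cost: float, cumulative_units: float,
              drop_per_doubling: float) -> float:
    """Unit cost after `cumulative_units` have been produced, assuming cost
    falls by the fraction `drop_per_doubling` each time cumulative volume doubles."""
    # Each doubling multiplies cost by (1 - drop), so after log2(n) doublings:
    #   cost(n) = cost(1) * n ** log2(1 - drop)
    exponent = math.log2(1.0 - drop_per_doubling)
    return first_unit_cost * cumulative_units ** exponent


# Hypothetical starting cost of 100 (arbitrary units), at both ends of the
# 10-15% range quoted on the slide.
for drop in (0.10, 0.15):
    for units in (1, 2, 1024):
        cost = unit_cost(100.0, units, drop)
        print(f"drop={drop:.0%} cumulative units={units:>4} unit cost={cost:6.2f}")
```

Under these assumptions, ten doublings (1,024x the cumulative volume) leave unit cost at roughly 35% of the first unit at a 10% rate, and roughly 20% at a 15% rate.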

  28. Wright on market creation

  29. Wright foreshadowing Moore

  30. One final essay...

  31. W. Stanley Jevons, ca. 1865

  32. W. Stanley Jevons, ca. 1865

  33. Jevons foreshadowing Wright

  34. Aside: Never say “never”

  35. Aside: A contemporary weighs in on Jevons?

  36. Back to computing!
    ● Andreessen’s 2011 piece, while containing some truisms, is overly
    software-centric and misses hardware’s role entirely
    ● Moore’s Law -- while prescient! -- is indisputably slowing
    ● Wright’s Law, however, may still be holding for transistors -- especially
    at older processing nodes (22nm, 40nm, 90nm, etc.)
    ● The Jevons Paradox has proven again and again to apply to computing:
    when general purpose computation is cheaper, we find more to do
    ● We can expect more computation in more places

  37. Compute everywhere?
    ● More computation doesn’t just mean computers in new places (à la IoT),
    it means CPUs present where we once thought of components
    ● E.g., open 32-bit CPUs replacing hidden, closed 8-bit microcontrollers
    ● We are already seeing CPUs on the NIC (SmartNIC), CPUs next to flash
    (e.g., open-channel SSD) and on the spindle (e.g. WD’s SweRV)
    ● New opportunities for hardware/software co-design: keep hardware
    simple and put more sophistication into software and/or soft logic
    ● There are several trends acting as accelerants for this shift...

  38. Open instruction sets
    ● x86 and ARM -- the two market victors -- are both encumbered by
    history and licensing
    ● RISC-V is an attempt to learn from the ISA mistakes of the past, in a
    vessel that is entirely open -- and with open implementations
    ● RISC-V is very promising, but there remain many gaps to close
    ● To succeed, RISC-V must focus as much on the SoC as the ISA -- while
    remaining entirely open!

  39. Open FPGAs
    ● FPGA bitstreams have historically been entirely proprietary -- and one
    is therefore dependent upon proprietary tools to generate them
    ● The Lattice iCE40 bitstream format was reverse engineered in 2015 by
    Claire Wolf, and can be entirely synthesized with an open toolchain!
    ● While Xilinx (AMD) and Altera (Intel) retain proprietary components
    (e.g., for timing models), newcomers like QuickLogic are entirely open
    ● See, e.g., SymbiFlow, Verilog to Routing (VTR), Yosys, OpenFPGA, and
    the (new!) Open Source FPGA Foundation

  40. Open HDLs
    ● Hardware description languages have traditionally been dominated by
    Verilog and (later) SystemVerilog
    ● Compilers have been historically proprietary -- and the languages
    themselves are error prone
    ● In recent years we have seen a wave of new, open HDLs, e.g.: Chisel,
    nMigen, Bluespec, SpinalHDL, Mamba (PyMTL 3), HardCaml
    ● Of these, one is particularly noteworthy...

  41. Open HDL: Bluespec
    ● Bluespec is a high-level HDL that takes its inspiration from formal
    specification languages -- and strongly typed languages like Haskell
    ● Bluespec uses the expressiveness of the language to move away from
    individual signals -- and to atomic rules and interfaces
    ● This allows for the compiler to do the hard work of connecting modules
    and proving correctness, greatly reducing verification time!
    ● In the words of Oxide engineer Arjen Roodselaar, “Bluespec is to
    SystemVerilog what Rust is to assembly”

  42. Open HDL: Bluespec
    ● Bluespec was proprietary for 20 years; open sourced in early 2020!
    ● We at Oxide feel that Bluespec is a profoundly transformative
    technology -- but not one that is broadly understood or appreciated!
    ● More details:
    ○ https://github.com/B-Lang-org/Documentation
    ○ https://github.com/B-Lang-org/bsc
    ○ https://github.com/oxidecomputer/cobalt

  43. Open source EDA
    ● Proprietary software has historically dominated EDA…
    ● Open source alternatives have existed for years -- but one in particular,
    KiCad, has enjoyed sufficiently broad sponsorship to close the gaps with
    professional-grade software
    ● The maturity of KiCad coupled with the rise of quick turn PCB
    manufacturing/assembly has allowed for astonishing speed:
    ○ From conception to manufacturer in hours
    ○ From manufacturer to shipping board in days

  44. Board economics
    ● Single board computers are very accessible!
    ○ An STM32 Nucleo-144 board with 400 MHz Cortex M7 CPU + 2
    MB of flash + 1 MB of RAM + all I/O peripherals for less than $30
    ○ A BeagleBone Black -- with 1 GHz Cortex A8 CPU + 4 GB of flash +
    512 MB DDR3 + HDMI for less than $60!
    ● All documentation available online and without NDA -- and the
    BeagleBone Black is (nearly) entirely open
    ● The BeagleBone Black can also be used as a logic analyzer via sigrok

  45. Open source firmware
    ● The software that runs closest to the hardware is increasingly open,
    with drivers nearly (nearly!) always open
    ● Increasingly, we are seeing the firmware of unseen parts of the system
    become open as well, viz. the Open Source Firmware Conference
    ● This trend is slower in the 7nm SoCs -- but it’s happening!
    ● However, even in putatively open architectures, there generally still
    remains proprietary software in the form of boot ROMs -- and this
    proprietary software remains a problem!

  46. Embedded Rust
    ● Rust has proven to be a revolution for systems software: rich type
    system, algebraic types, ownership model allow for fast, correct code
    ● Slightly more surprising has been Rust’s ability to get small -- which
    coupled with its lack of a runtime lets it fit everywhere!
    ● With its safety and expressive power, Rust represents a quantum leap
    over C -- and without losing performance or sacrificing size
    ● We at Oxide are working on a de novo Rust operating system for the
    embedded use case that we will (naturally?) open source; stay tuned!

  47. To sum...

  48. “That changed everything”

  49. A new Golden Age!
    ● Thanks to Moore’s Law, Wright’s Law and the rise of open source, it is
    easier to build hardware than ever before!
    ● We are going to see computers in many more places, posing challenges
    to us all to develop reliable, secure, high performing systems
    ● Software remains essential, but we must not think of it in isolation; we
    must co-design the hardware and the software in our systems!
    ● The systems are open, the communities are welcoming! Let’s build!
