
Hardware/software Co-design: The Coming Golden Age

Talk given at RailsConf 2021. Video: https://www.youtube.com/watch?v=nY07zWzhyn4

Bryan Cantrill

April 14, 2021

Transcript

  1. Hardware/Software Co-design: The Coming Golden Age
    Bryan Cantrill, Oxide Computer Company
  2. The hardware/software divide
    • The shift to public cloud computing over the last fifteen years has allowed software and hardware to become disconnected
    • On the one hand, this can be empowering: a SaaS offering can be built with no real understanding of the hardware beneath it
    • But there’s a risk of taking software-centric thinking too far -- and drawing the mistaken conclusion that hardware is irrelevant (or worse)
    • This overshot in thinking is epitomized by Marc Andreessen’s 2011 essay, “Why Software is Eating the World”
  3. Revisiting Andreessen
    • Certainly, the essay makes an important observation on the importance of software in essentially every domain:
  4. Revisiting Andreessen
    • And the effect of Moore’s Law + open source + public cloud computing has indisputably lowered the cost of delivering software:
  5. Revisiting Andreessen
    • But the essay errs in fetishizing software, mistakenly viewing extant industries as likely to be disrupted by SaaS alone:
  6. Revisiting Andreessen
    • Software is important -- but the essay conflates software companies with companies that in fact integrate software and hardware
    • Companies that Andreessen cited that have thrived -- Amazon, Google, etc. -- have very significant hardware components!
    • Many software-only companies that are cited have disappointed: Zynga, Rovio, Groupon, LivingSocial, Foursquare
    • Andreessen is dismissive of Apple (up 15X) -- and entirely ignores companies like NVIDIA (57X), AMD (14X), or even Intel (3X)!
  7. Revisiting Andreessen

  8. Revisiting another famous essay

  9. Gordon Moore, ca. 1965

  10. Gordon Moore, ca. 1965

  11. Gordon Moore, ca. 1965

  12. Gordon Moore, ca. 1965

  13. Moore’s Law?

  14. Gordon Moore, ca. 1965

  15. Moore’s Law!

  16. Gordon Moore, ca. 1965

  17. Moore’s Law?!

  18. So… Moore’s Law?
    • In his 1965 paper, there is no Moore’s Law per se -- just a bunch of incredibly astute and prescient observations
    • The term “Moore’s Law” would be coined by Carver Mead in 1971 as part of his work on determining ultimate physical limits
    • Moore updated the law in 1975 to be a doubling of transistor density every two years (Dennard scaling would be outlined in detail in 1974)
    • For many years, Moore’s Law could be inferred to be doublings of transistor density, speed, and economics
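To make the 1975 formulation concrete, here is a minimal sketch of what a doubling every two years compounds to. The function name `density_multiple` and its parameters are illustrative, not from the talk, and the model assumes a perfectly steady doubling period, which real process nodes never followed exactly.

```python
def density_multiple(years, doubling_period_years=2.0):
    """Relative transistor density after `years`, assuming an idealized,
    perfectly steady doubling every `doubling_period_years` (hypothetical model)."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    for years in (2, 10, 20):
        print(f"{years:>2} years -> {density_multiple(years):,.0f}x density")
```

Under that assumption a decade is five doublings (32x) and two decades is ten (1024x), which is why even a modest stretch of the doubling period changes the long-run picture so much.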
  19. Moore’s Law: Good old days?
    • The 1980s and early 1990s were great for Moore’s Law -- so much so that computers needed a “turbo button” to counteract its effects (!!)
    • But even in those halcyon years, Moore’s Law was leaving DRAM behind: memory was becoming denser but no faster
    • An increasing number of workloads began hitting the memory wall
    • Caching was necessary but insufficient...
  20. Moore’s Law: The memory wall
    • By the mid-1990s, it had become clear that symmetric multiprocessing was the path to deliver throughput on multi-threaded workloads
    • ...but SMP did nothing for single-threaded performance
    • Deep pipelining and VLIW were -- largely -- failed experiments
    • For single-threaded workloads, microprocessors turned to out-of-order and speculative execution to hide memory latency
    • Even in simpler times, scaling with Moore’s Law was a challenge!
  21. Moore’s Law: Architectural shifts
    • Dennard scaling ended in ~2006 due to current leakage…
    • ...but by then chip multiprocessing was clearly the trajectory
    • CMP was enhanced by simultaneous multithreading (SMT), which offered up to another factor of two on throughput
    • Thanks to the earlier software work on SMP, CMP/SMT was less of a software performance apocalypse than some feared -- but more of a security apocalypse than anyone anticipated!
    • And “dark silicon” greatly limits CMP!
  22. Moore’s Law: Deceleration
    • In August 2018, GlobalFoundries suddenly stopped 7nm development, citing economics -- it was simply too expensive to stay competitive
    • GlobalFoundries’ departure left TSMC and Samsung on 7nm -- and Intel on 14nm, struggling to get to 10nm
    • Intel’s Cannon Lake was three years late and an unmitigated disaster -- and for Ice Lake/Cascade Lake, Intel is intermixing 14nm and 10nm
    • Moving to 3nm/5nm requires moving beyond FinFETs to GAAFETs -- and to EUV photolithography; new nodes are very expensive!
  23. Aside: Process nodes
    • You may well wonder: when a process node is “7nm” or “5nm”, what exactly is seven nanometers or five nanometers long? (And, um, how big is a silicon atom anyway?)
    • Answer to the second question: ~210 picometers!
    • Answer to the first question: nothing! Unbelievably, the name of the process node no longer measures anything at all (!!) -- it is merely a rough expression of transistor density (and implication of process)
    • E.g. 7nm ≈ 100 MTr/mm² (but there are lots of caveats)
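The two figures on this slide can be turned into a quick back-of-the-envelope check. The sketch below uses only the slide's numbers (~210 pm for a silicon atom, ~100 MTr/mm² for "7nm"). The function names are illustrative, and treating each transistor as occupying an equal square of die area is a simplifying assumption, not a description of real layout.

```python
import math

SILICON_ATOM_PM = 210.0          # the slide's figure for the size of a silicon atom
DENSITY_7NM_MTR_PER_MM2 = 100.0  # the slide's rough density for a "7nm" node


def area_per_transistor_nm2(density_mtr_per_mm2):
    """Average die area per transistor in nm^2 (assumes an even spread)."""
    nm2_per_mm2 = (1e6) ** 2                       # 1 mm = 1e6 nm
    transistors_per_mm2 = density_mtr_per_mm2 * 1e6
    return nm2_per_mm2 / transistors_per_mm2


def atoms_across(length_nm, atom_pm=SILICON_ATOM_PM):
    """How many atoms of size `atom_pm` fit end to end in `length_nm`."""
    return length_nm * 1000.0 / atom_pm


if __name__ == "__main__":
    area = area_per_transistor_nm2(DENSITY_7NM_MTR_PER_MM2)
    print(f"area per transistor: {area:,.0f} nm^2")
    print(f"equivalent square side: {math.sqrt(area):.0f} nm")
    print(f"silicon atoms across 7 nm: {atoms_across(7.0):.1f}")
```

At the quoted density each transistor gets on the order of 10,000 nm², a square roughly 100 nm on a side, so nothing about the footprint is literally 7 nm; and a true 7 nm feature would span only a few dozen atoms at the slide's atom size.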
  24. Moore’s Law
    • Increased transistor density is continuing to be possible, but at a greatly slowed pace -- and at outsized cost
    • Economically, Moore’s Law is indisputably ending
    • But is there another way of looking at it?
  25. Another essay, further back in time...

  26. Theodore Wright, ca. 1936

  27. Wright’s Law
    • In 1936, Theodore Wright studied the costs of aircraft manufacturing, finding that the cost dropped with experience
    • Over time, when volume doubled, unit costs dropped by 10-15%
    • This phenomenon has been observed in other technological domains
    • In 2013, Jessika Trancik et al. found Wright’s Law to hold better predictive power for transistor cost than Moore’s Law!
    • Wright’s Law seems to hold, especially for older process nodes
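The "10-15% per doubling" observation is usually written as a power law in cumulative volume. The sketch below is one common way to express it, not something taken from the slides: the function name `wright_unit_cost`, the starting cost of 100, and the 15% learning rate are all illustrative assumptions.

```python
import math


def wright_unit_cost(first_unit_cost, cumulative_units, learning_rate):
    """Unit cost after `cumulative_units` have been built, assuming cost falls by
    `learning_rate` (e.g. 0.15 for 15%) each time cumulative volume doubles."""
    if cumulative_units < 1:
        raise ValueError("cumulative_units must be at least 1")
    if not 0.0 <= learning_rate < 1.0:
        raise ValueError("learning_rate must be in [0, 1)")
    # Each doubling multiplies cost by (1 - learning_rate), so the exponent on
    # volume is log2(1 - learning_rate), which is negative for any real learning.
    exponent = math.log2(1.0 - learning_rate)
    return first_unit_cost * cumulative_units ** exponent


if __name__ == "__main__":
    for units in (1, 2, 4, 1024):
        cost = wright_unit_cost(100.0, units, 0.15)
        print(f"unit #{units:>4}: {cost:6.2f}")
```

The point of the power-law form is that cost depends on how many units have been made rather than on calendar time, which is why it can keep holding for high-volume older nodes even as the leading edge slows.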
  28. Wright on market creation

  29. Wright foreshadowing Moore

  30. One final essay...

  31. W. Stanley Jevons, ca. 1865

  32. W. Stanley Jevons, ca. 1865

  33. Jevons foreshadowing Wright

  34. Aside: Never say “never”

  35. Aside: A contemporary weighs in on Jevons?

  36. Back to computing!
    • Andreessen’s 2011 piece, while containing some truisms, is overly software-centric and misses hardware’s role entirely
    • Moore’s Law -- while prescient! -- is indisputably slowing
    • Wright’s Law, however, may still be holding for transistors -- especially at older processing nodes (22nm, 40nm, 90nm, etc.)
    • The Jevons Paradox has proven again and again to apply to computing: when general purpose computation is cheaper, we find more to do
    • We can expect more computation in more places
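One way to see why cheaper computation can lead to more total computation (and more total spending on it) is a toy constant-elasticity demand curve. This is a modeling assumption for illustration only; the talk does not give a formula, and the names `demand` and `total_spend` and the elasticity values are hypothetical.

```python
def demand(price, elasticity, scale=1.0):
    """Quantity demanded at `price` under a toy constant-elasticity model."""
    if price <= 0:
        raise ValueError("price must be positive")
    return scale * price ** (-elasticity)


def total_spend(price, elasticity, scale=1.0):
    """Total spending (price times quantity) under the same toy model."""
    return price * demand(price, elasticity, scale)


if __name__ == "__main__":
    for elasticity in (0.5, 1.5):
        before = total_spend(1.0, elasticity)
        after = total_spend(0.5, elasticity)
        print(f"elasticity {elasticity}: spend {before:.2f} -> {after:.2f} "
              f"when price halves")
```

In this sketch, halving the price raises total spend only when elasticity exceeds 1, which is the Jevons-style regime: usage grows faster than the price falls. Whether computing actually sits in that regime at any given moment is an empirical question the model does not answer.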
  37. Compute everywhere?
    • More computation doesn’t just mean computers in new places (à la IoT), it means CPUs present where we once thought of components
    • E.g., open 32-bit CPUs replacing hidden, closed 8-bit microcontrollers
    • We are already seeing CPUs on the NIC (SmartNIC), CPUs next to flash (e.g., open-channel SSD) and on the spindle (e.g. WD’s SweRV)
    • New opportunities for hardware/software co-design: keep hardware simple and put more sophistication into software and/or soft logic
    • There are several trends acting as accelerants for this shift...
  38. Open instruction sets
    • x86 and ARM -- the two market victors -- are both encumbered by history and licensing
    • RISC-V is an attempt to learn from the ISA mistakes of the past, in a vessel that is entirely open -- and with open implementations
    • RISC-V is very promising, but there remain many gaps to close
    • To succeed, RISC-V must focus as much on the SoC as the ISA -- while remaining entirely open!
  39. Open FPGAs
    • FPGA bitstreams have historically been entirely proprietary -- and one is therefore dependent upon proprietary tools to generate them
    • The Lattice iCE40 bitstream format was reverse engineered in 2015 by Claire Wolf, and can be entirely synthesized with an open toolchain!
    • While Xilinx (AMD) and Altera (Intel) retain proprietary components (e.g., for timing models), newcomers like QuickLogic are entirely open
    • See, e.g., SymbiFlow, Verilog to Routing (VTR), Yosys, OpenFPGA, and the (new!) Open Source FPGA Foundation
  40. Open HDLs
    • Hardware description languages have traditionally been dominated by Verilog and (later) SystemVerilog
    • Compilers have been historically proprietary -- and the languages themselves are error prone
    • In recent years we have seen a wave of new, open HDLs, e.g.: Chisel, nMigen, Bluespec, SpinalHDL, Mamba (PyMTL 3), HardCaml
    • Of these, one is particularly noteworthy...
  41. Open HDL: Bluespec
    • Bluespec is a high-level HDL that takes its inspiration from formal specification languages -- and strongly typed languages like Haskell
    • Bluespec uses the expressiveness of the language to move away from individual signals -- and to atomic rules and interfaces
    • This allows for the compiler to do the hard work of connecting modules and proving correctness, greatly reducing verification time!
    • In the words of Oxide engineer Arjen Roodselaar, “Bluespec is to SystemVerilog what Rust is to assembly”
  42. Open HDL: Bluespec
    • Bluespec was proprietary for 20 years; open sourced in early 2020!
    • We at Oxide feel that Bluespec is a profoundly transformative technology -- but not one that is broadly understood or appreciated!
    • More details:
      ◦ https://github.com/B-Lang-org/Documentation
      ◦ https://github.com/B-Lang-org/bsc
      ◦ https://github.com/oxidecomputer/cobalt
  43. Open source EDA
    • Proprietary software has historically dominated EDA…
    • Open source alternatives have existed for years -- but one in particular, KiCad, has enjoyed sufficiently broad sponsorship to close the gaps with professional-grade software
    • The maturity of KiCad coupled with the rise of quick-turn PCB manufacturing/assembly has allowed for astonishing speed:
      ◦ From conception to manufacturer in hours
      ◦ From manufacturer to shipping board in days
  44. Board economics
    • Single board computers are very accessible!
      ◦ An STM32 Nucleo-144 board with 400 MHz Cortex M7 CPU + 2 MB of flash + 1 MB of RAM + all I/O peripherals for less than $30
      ◦ A BeagleBone Black -- with 1 GHz Cortex A8 CPU + 4 GB of flash + 512 MB DDR3 + HDMI for less than $60!
    • All documentation available online and without NDA -- and the BeagleBone Black is (nearly) entirely open
    • The BeagleBone Black can also be used as a logic analyzer via sigrok
  45. Open source firmware
    • The software that runs closest to the hardware is increasingly open, with drivers nearly (nearly!) always open
    • Increasingly, we are seeing the firmware of unseen parts of the system become open as well, viz. the Open Source Firmware Conference
    • This trend is slower in the 7nm SoCs -- but it’s happening!
    • However, even in putatively open architectures, there generally still remains proprietary software in the form of boot ROMs -- and this proprietary software remains a problem!
  46. Embedded Rust
    • Rust has proven to be a revolution for systems software: its rich type system, algebraic types, and ownership model allow for fast, correct code
    • Slightly more surprising has been Rust’s ability to get small -- which coupled with its lack of a runtime lets it fit everywhere!
    • With its safety and expressive power, Rust represents a quantum leap over C -- and without losing performance or sacrificing size
    • We at Oxide are working on a de novo Rust operating system for the embedded use case that we will (naturally?) open source; stay tuned!
  47. To sum...

  48. “That changed everything”

  49. A new Golden Age!
    • Thanks to Moore’s Law, Wright’s Law and the rise of open source, it is easier to build hardware than ever before!
    • We are going to see computers in many more places, posing challenges to us all to develop reliable, secure, high performing systems
    • Software remains essential, but we must not think of it in isolation; we must co-design the hardware and the software in our systems!
    • The systems are open, the communities are welcoming! Let’s build!