Hardware/software Co-design: The Coming Golden Age

Hardware/Software Co-design: The Coming Golden Age
Bryan Cantrill, Oxide Computer Company

The hardware/software divide
• The shift to public cloud computing over the last fifteen years has allowed software and hardware to become disconnected
• On the one hand, this can be empowering: a SaaS offering can be built with no real understanding of the hardware beneath it
• But there’s a risk of taking software-centric thinking too far -- and drawing the mistaken conclusion that hardware is irrelevant (or worse)
• This overshoot in thinking is epitomized by Marc Andreessen’s 2011 essay, “Why Software Is Eating the World”

Revisiting Andreessen
• Certainly, the essay makes an important observation about the role of software in essentially every domain:

Revisiting Andreessen
• And the combination of Moore’s Law + open source + public cloud computing has indisputably lowered the cost of delivering software:

Revisiting Andreessen
• But the essay errs in fetishizing software, mistakenly viewing extant industries as likely to be disrupted by SaaS alone:

Revisiting Andreessen
• Software is important -- but the essay conflates software companies with companies that in fact integrate software and hardware
• Companies that Andreessen cited that have thrived -- Amazon, Google, etc. -- have very significant hardware components!
• Many software-only companies that are cited have disappointed: Zynga, Rovio, Groupon, LivingSocial, Foursquare
• Andreessen is dismissive of Apple (up 15X) -- and entirely ignores companies like NVIDIA (57X), AMD (14X), or even Intel (3X)!

Revisiting Andreessen

Revisiting another famous essay

Gordon Moore, ca. 1965

Moore’s Law?

Moore’s Law!

Moore’s Law?!

So… Moore’s Law?
• In his 1965 paper, there is no Moore’s Law per se -- just a bunch of incredibly astute and prescient observations
• The term “Moore’s Law” would be coined by Carver Mead in 1971 as part of his work on determining ultimate physical limits
• Moore updated the law in 1975 to be a doubling of transistor density every two years (Dennard scaling would be outlined in detail in 1974)
• For many years, Moore’s Law could be inferred to be doublings of transistor density, speed, and economics
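
The 1975 formulation above (a doubling of transistor density every two years) can be sketched as simple compound growth. This is an illustrative calculation only: the function name, the normalized starting density, and the sample time spans are assumptions made for the example, not figures from the talk.

```python
def projected_density(initial_density, years, doubling_period_years=2.0):
    """Project transistor density assuming one doubling per fixed period.

    The two-year default follows the 1975 formulation; initial_density is
    in arbitrary units (hypothetical, for illustration).
    """
    return initial_density * 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Growth relative to a normalized starting density of 1.0.
    for years in (2, 10, 20):
        print(years, "years ->", projected_density(1.0, years), "x")
```

Under this idealized model, ten years gives five doublings, a 32-fold increase; the later slides argue that real process nodes no longer keep this pace.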

Moore’s Law: Good old days?
• The 1980s and early 1990s were great for Moore’s Law -- so much so that computers needed a “turbo button” to counteract its effects (!!)
• But even in those halcyon years, Moore’s Law was leaving DRAM behind: memory was becoming denser but no faster
• An increasing number of workloads began hitting the memory wall
• Caching was necessary but insufficient...

Moore’s Law: The memory wall
• By the mid-1990s, it had become clear that symmetric multiprocessing was the path to deliver throughput on multi-threaded workloads
• ...but SMP did nothing for single-threaded performance
• Deep pipelining and VLIW were -- largely -- failed experiments
• For single-threaded workloads, microprocessors turned to out-of-order and speculative execution to hide memory latency
• Even in simpler times, scaling with Moore’s Law was a challenge!

Moore’s Law: Architectural shifts
• Dennard scaling ended in ~2006 due to current leakage…
• ...but by then chip multiprocessing (CMP) was clearly the trajectory
• CMP was enhanced by simultaneous multithreading (SMT), which offered up to another factor of two on throughput
• Thanks to the earlier software work on SMP, CMP/SMT was less of a software performance apocalypse than some feared -- but more of a security apocalypse than anyone anticipated!
• And “dark silicon” greatly limits CMP!

Moore’s Law: Deceleration
• In August 2018, GlobalFoundries suddenly stopped 7nm development, citing economics -- it was simply too expensive to stay competitive
• GlobalFoundries’ departure left TSMC and Samsung on 7nm -- and Intel on 14nm, struggling to get to 10nm
• Intel’s Cannon Lake was three years late and an unmitigated disaster -- and for Ice Lake/Cascade Lake, Intel is intermixing 14nm and 10nm
• Moving to 3nm/5nm requires moving beyond FinFETs to GAAFETs -- and to EUV photolithography; new nodes are very expensive!

Aside: Process nodes
• You may well wonder: when a process node is “7nm” or “5nm”, what exactly is seven nanometers or five nanometers long? (And, um, how big is a silicon atom anyway?)
• Answer to the second question: ~210 picometers!
• Answer to the first question: nothing! Unbelievably, the name of the process node no longer measures anything at all (!!) -- it is merely a rough expression of transistor density (and implication of process)
• E.g. 7nm ≈ 100 MTr/mm² (but there are lots of caveats)
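
A little arithmetic makes the point above concrete. The sketch below uses only the two figures quoted on the slide (~210 pm per silicon atom, ~100 MTr/mm² for “7nm”); the function names are hypothetical, and treating each transistor as an equal square of die area is a simplifying assumption for illustration.

```python
import math

SILICON_ATOM_PM = 210.0  # approximate size quoted on the slide


def atoms_across(length_nm):
    """How many silicon atoms would span a length given in nanometers."""
    return length_nm * 1000.0 / SILICON_ATOM_PM


def footprint_side_nm(density_mtr_per_mm2):
    """Side of the square area each transistor occupies on average.

    Assumes (for illustration only) that die area is divided evenly
    among transistors. 1 mm^2 = 1e12 nm^2.
    """
    transistors_per_mm2 = density_mtr_per_mm2 * 1e6
    area_nm2 = 1e12 / transistors_per_mm2
    return math.sqrt(area_nm2)


if __name__ == "__main__":
    print("atoms across 7 nm:", round(atoms_across(7.0), 1))
    print("average footprint side at 100 MTr/mm^2 (nm):", footprint_side_nm(100.0))
```

At 100 MTr/mm², each transistor averages roughly a 100 nm by 100 nm patch of silicon, an order of magnitude larger than the “7nm” in the node name, which is consistent with the name being a density label rather than a measurement.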

Moore’s Law
• Increased transistor density is continuing to be possible, but at a greatly slowed pace -- and at outsized cost
• Economically, Moore’s Law is indisputably ending
• But is there another way of looking at it?

Another essay, further back in time...

Theodore Wright, ca. 1936

Wright’s Law
• In 1936, Theodore Wright studied the costs of aircraft manufacturing, finding that the cost dropped with experience
• Over time, when volume doubled, unit costs dropped by 10-15%
• This phenomenon has been observed in other technological domains
• In 2013, Jessika Trancik et al. found Wright’s Law to have better predictive power for transistor cost than Moore’s Law!
• Wright’s Law seems to hold, especially for older process nodes
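
The relationship described above (unit cost falling by a fixed fraction with each doubling of volume) can be sketched as follows. This is a minimal illustration, assuming the usual reading of Wright’s Law in terms of cumulative production; the function name and the sample costs and volumes are hypothetical, and only the 10-15% range comes from the slide.

```python
import math


def wright_unit_cost(initial_cost, initial_volume, cumulative_volume, learning_rate):
    """Unit cost after cumulative production grows from initial_volume.

    learning_rate is the fractional cost drop per doubling of volume
    (0.10 to 0.15 per the slide).
    """
    doublings = math.log2(cumulative_volume / initial_volume)
    return initial_cost * (1.0 - learning_rate) ** doublings


if __name__ == "__main__":
    # Hypothetical part costing 100 units at a cumulative volume of 1,000.
    for volume in (1_000, 2_000, 16_000, 1_024_000):
        cost = wright_unit_cost(100.0, 1_000, volume, 0.15)
        print(volume, "units ->", round(cost, 2))
```

Note that the driver here is volume, not time: in this model a mature process node that keeps shipping in quantity keeps getting cheaper, regardless of how fast the leading edge advances.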

Wright on market creation

Wright foreshadowing Moore

One final essay...

W. Stanley Jevons, ca. 1865

Jevons foreshadowing Wright

Aside: Never say “never”

Aside: A contemporary weighs in on Jevons?

Back to computing!
• Andreessen’s 2011 piece, while containing some truisms, is overly software-centric and misses hardware’s role entirely
• Moore’s Law -- while prescient! -- is indisputably slowing
• Wright’s Law, however, may still be holding for transistors -- especially at older process nodes (22nm, 40nm, 90nm, etc.)
• The Jevons Paradox has proven again and again to apply to computing: when general purpose computation is cheaper, we find more to do
• We can expect more computation in more places

Compute everywhere?
• More computation doesn’t just mean computers in new places (à la IoT), it means CPUs present where we once thought of components
• E.g., open 32-bit CPUs replacing hidden, closed 8-bit microcontrollers
• We are already seeing CPUs on the NIC (SmartNIC), CPUs next to flash (e.g., open-channel SSD) and on the spindle (e.g., WD’s SweRV)
• New opportunities for hardware/software co-design: keep hardware simple and put more sophistication into software and/or soft logic
• There are several trends acting as accelerants for this shift...

Open instruction sets
• x86 and ARM -- the two market victors -- are both encumbered by history and licensing
• RISC-V is an attempt to learn from the ISA mistakes of the past, in a vessel that is entirely open -- and with open implementations
• RISC-V is very promising, but there remain many gaps to close
• To succeed, RISC-V must focus as much on the SoC as the ISA -- while remaining entirely open!

Open FPGAs
• FPGA bitstreams have historically been entirely proprietary -- and one is therefore dependent upon proprietary tools to generate them
• The Lattice iCE40 bitstream format was reverse engineered in 2015 by Claire Wolf, and designs for it can be synthesized entirely with an open toolchain!
• While Xilinx (AMD) and Altera (Intel) retain proprietary components (e.g., for timing models), newcomers like QuickLogic are entirely open
• See, e.g., SymbiFlow, Verilog to Routing (VTR), Yosys, OpenFPGA, and the (new!) Open Source FPGA Foundation

Open HDLs
• Hardware description languages have traditionally been dominated by Verilog and (later) SystemVerilog
• Compilers have been historically proprietary -- and the languages themselves are error prone
• In recent years we have seen a wave of new, open HDLs, e.g.: Chisel, nMigen, Bluespec, SpinalHDL, Mamba (PyMTL 3), HardCaml
• Of these, one is particularly noteworthy...

Open HDL: Bluespec
• Bluespec is a high-level HDL that takes its inspiration from formal specification languages -- and strongly typed languages like Haskell
• Bluespec uses the expressiveness of the language to move away from individual signals -- and to atomic rules and interfaces
• This allows for the compiler to do the hard work of connecting modules and proving correctness, greatly reducing verification time!
• In the words of Oxide engineer Arjen Roodselaar, “Bluespec is to SystemVerilog what Rust is to assembly”

Open HDL: Bluespec
• Bluespec was proprietary for 20 years; open sourced in early 2020!
• We at Oxide feel that Bluespec is a profoundly transformative technology -- but not one that is broadly understood or appreciated!
• More details:
  ◦ https://github.com/B-Lang-org/Documentation
  ◦ https://github.com/B-Lang-org/bsc
  ◦ https://github.com/oxidecomputer/cobalt

Open source EDA
• Proprietary software has historically dominated EDA…
• Open source alternatives have existed for years -- but one in particular, KiCad, has enjoyed sufficiently broad sponsorship to close the gaps with professional-grade software
• The maturity of KiCad coupled with the rise of quick turn PCB manufacturing/assembly has allowed for astonishing speed:
  ◦ From conception to manufacturer in hours
  ◦ From manufacturer to shipping board in days

Board economics
• Single board computers are very accessible!
  ◦ An STM32 Nucleo-144 board with 400 MHz Cortex-M7 CPU + 2 MB of flash + 1 MB of RAM + all I/O peripherals for less than $30
  ◦ A BeagleBone Black -- with 1 GHz Cortex-A8 CPU + 4 GB of flash + 512 MB DDR3 + HDMI for less than $60!
• All documentation available online and without NDA -- and the BeagleBone Black is (nearly) entirely open
• The BeagleBone Black can also be used as a logic analyzer via sigrok

Open source firmware
• The software that runs closest to the hardware is increasingly open, with drivers nearly (nearly!) always open
• Increasingly, we are seeing the firmware of unseen parts of the system become open as well, viz. the Open Source Firmware Conference
• This trend is slower in the 7nm SoCs -- but it’s happening!
• However, even in putatively open architectures, there generally still remains proprietary software in the form of boot ROMs -- and this proprietary software remains a problem!

Embedded Rust
• Rust has proven to be a revolution for systems software: rich type system, algebraic types, ownership model allow for fast, correct code
• Slightly more surprising has been Rust’s ability to get small -- which coupled with its lack of a runtime lets it fit everywhere!
• With its safety and expressive power, Rust represents a quantum leap over C -- and without losing performance or sacrificing size
• We at Oxide are working on a de novo Rust operating system for the embedded use case that we will (naturally?) open source; stay tuned!

To sum...

“That changed everything”

A new Golden Age!
• Thanks to Moore’s Law, Wright’s Law and the rise of open source, it is easier to build hardware than ever before!
• We are going to see computers in many more places, posing challenges to us all to develop reliable, secure, high performing systems
• Software remains essential, but we must not think of it in isolation; we must co-design the hardware and the software in our systems!
• The systems are open, the communities are welcoming! Let’s build!
