Talk given at P99 CONF 2021. Video: https://www.youtube.com/watch?v=cuvp-e4ztC0
Rust, Wright's Law, and the
Future of Low-Latency
CTO, Oxide Computer Company
The Arbiter of Systems Performance
• If it needs to be said: software must execute within the confines of
hardware — and the hardware is the ultimate arbiter of performance
• The history of improving systems performance consists broadly of:
○ Process revolutions that propelled all designs
○ Architectural revolutions that optimized for certain use cases
○ Software revolutions that allowed us to better utilize hardware
• All of these revolutions must live within the confines of economics!
Process Revolution: Moore’s Law
• In his 1965 paper, Gordon Moore did not actually coin a law per se —
he just made a bunch of incredibly astute and prescient observations
• The term “Moore’s Law” would be coined by Carver Mead in 1971 as
part of his work on determining ultimate physical limits
• Moore updated the law in 1975 to be a doubling of transistor density
every two years (Dennard scaling would be outlined in detail in 1974)
• For many years, Moore’s Law could be inferred to be doublings of
transistor density, speed, and economics
Moore’s Law: The Good Old Days
• The 1980s and early 1990s were great for Moore’s Law — so much so
that computers needed a “turbo button” to counteract its effects (!!)
• But even in those halcyon years, Moore’s Law was leaving DRAM
behind: memory was becoming denser but no faster
• An increasing number of workloads began hitting the memory wall
• Caching — an essential architectural revolution born in the 1960s —
was necessary but insufficient...
Moore’s Law: The Good Old Days?
• By the mid-1990s, it had become clear that symmetric multiprocessing
was the path to deliver throughput on multi-threaded workloads
• SMP necessitated its own software revolution (multi-threaded systems),
but did little for single-threaded latency
• Deep pipelining and VLIW were — largely — failed experiments
• For single-threaded workloads, microprocessors turned to out-of-order
and speculative execution to hide memory latency
• Even in simpler times, scaling with Moore’s Law was a challenge!
Moore’s Law Deceleration
• In August 2018, GlobalFoundries suddenly stopped 7nm development,
citing economics — it was simply too expensive to stay competitive
• GlobalFoundries’ departure left TSMC and Samsung on 7nm — and
Intel on 14nm, struggling to get to 10nm
• Intel’s Cannon Lake was three years late and an unmitigated disaster —
and for Ice Lake/Cascade Lake, Intel is intermixing 14nm and 10nm
• Moving to 3nm/5nm requires moving beyond FinFETs to GAAFETs —
and to EUV photolithography; new nodes are very expensive!
Aside: Process nodes
• You may well wonder: when a process node is “7nm” or “5nm”, what
exactly is seven nanometers or five nanometers long? (And, um, how big
is a silicon atom anyway?)
• Answer to the second question: ~210 picometers!
• Answer to the first question: nothing! Unbelievably, the name of the
process node no longer measures anything at all (!!) — it is merely a
rough expression of transistor density (and implication of process)
• E.g. 7nm ≈ 100 MTr/mm² (but there are lots of caveats)
The End of Moore’s Law
• Increased transistor density is continuing to be possible, but at a greatly
slowed pace — and at outsized cost
• Moore’s Law has ceased to exist as an economic law
• But is there another way of looking at it?
Wright’s Law
• In 1936, Theodore Wright studied the costs of aircraft manufacturing,
finding that the cost dropped with experience
• Over time, when volume doubled, unit costs dropped by 10-15%
• This phenomenon has been observed in other technological domains
• In 2013, Jessika Trancik et al. found Wright’s Law to hold better
predictive power for transistor cost than Moore’s Law!
• Wright’s Law seems to hold, especially for older process nodes
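The arithmetic behind "unit costs dropped by 10-15% per doubling" can be sketched as a power law in cumulative volume. This is only an illustration: the function name `wright_unit_cost` and its parameters are hypothetical, and the learning rate is an input assumption rather than a figure fitted to any transistor data.

```rust
/// Unit cost under Wright's Law, assuming every doubling of cumulative
/// production cuts unit cost by `learning_rate` (e.g. 0.15 for 15%).
///
/// `volume_ratio` is cumulative volume now divided by cumulative volume
/// when the unit cost was `initial_cost` (2.0 means one doubling).
pub fn wright_unit_cost(initial_cost: f64, volume_ratio: f64, learning_rate: f64) -> f64 {
    // One doubling multiplies cost by (1 - learning_rate), so the
    // power-law exponent is log2(1 - learning_rate), which is negative.
    let exponent = (1.0 - learning_rate).log2();
    initial_cost * volume_ratio.powf(exponent)
}
```

Under an assumed 15% learning rate, a part costing 100 units falls to 85 after one doubling of cumulative volume, and to roughly 20 after ten doublings (a 1024x increase in cumulative volume).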
Wright’s Law: Ramifications
• If Wright’s Law continues to hold, compute will be economically viable in
more and more places that were previously confined to hard logic
• This is true even on die, where chiplets have made it easier than ever to
build a heterogeneous system — and where mixed process nodes have
demanded more sophistication
• Quick, how many cores are on your server? (Don’t forget the hidden
cores!)
Wright’s Law: Performance Ramifications
• Having more compute in many more places is particularly germane to
systems performance:
○ More compute close to data (SmartNICs, open-channel SSDs,
on-spindle compute) lowers latency
○ Bringing data to special-purpose compute (GPGPUs, FPGAs)
• But security and multi-tenancy cannot be an afterthought!
• We need to rethink our system software
Aside: A Researcher’s Call to Rethink
Timothy Roscoe, OSDI 2021 Keynote, It's Time for Operating Systems to Rediscover Hardware
The Needed Software Revolution
• Much of the coming compute is, at some level, special purpose
• These systems are much less balanced than our general-purpose
systems — with much less memory and/or non-volatile storage
• The overhead of dynamic environments (Java, Go, Python, etc.) is
unacceptably high, and the development benefit questionable
• Languages traditionally used in this domain (C and C++) both have
well-known challenges around safety and composability
• Enter Rust, and its killer feature...
Rust: no_std
• Rust is a revolutionary language in many respects, but one that may be
underappreciated is its ability to not depend on its own standard library
• Much of what is valuable about the language — sum types, ownership
model, traits, hygienic macros — is in core, not the standard library
• Crates marked “no_std” that do not opt into the alloc crate cannot
perform heap allocations: any such allocation is a compile-time error!
• But no_std crates can depend on other no_std crates — lending real
composability to a domain that has long been deprived of it
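To make the point about core concrete, here is a minimal sketch of the kind of code a no_std library might contain. The names `FixedQueue` and `PushError` are hypothetical and are not from any Oxide crate. It uses only `core`: a sum type for errors, a trait implementation, and a fixed-capacity queue stored inline with no heap allocation. The `#![no_std]` attribute is shown commented out so the sketch also builds as an ordinary hosted program.

```rust
// In a real library this attribute sits at the crate root; it is left
// commented out here so the sketch also compiles in a hosted environment.
// #![no_std]

use core::fmt;

/// Error expressed as a sum type; enums need nothing beyond `core`.
#[derive(Debug, PartialEq, Eq)]
pub enum PushError {
    Full,
}

impl fmt::Display for PushError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PushError::Full => write!(f, "queue is full"),
        }
    }
}

/// Fixed-capacity FIFO byte queue backed by an inline array (no heap).
pub struct FixedQueue<const N: usize> {
    buf: [u8; N],
    head: usize, // index of the oldest element
    len: usize,  // number of stored elements
}

impl<const N: usize> FixedQueue<N> {
    pub const fn new() -> Self {
        Self { buf: [0; N], head: 0, len: 0 }
    }

    /// Appends a byte, or reports `Full` instead of growing.
    pub fn push(&mut self, byte: u8) -> Result<(), PushError> {
        if self.len == N {
            return Err(PushError::Full);
        }
        let tail = (self.head + self.len) % N;
        self.buf[tail] = byte;
        self.len += 1;
        Ok(())
    }

    /// Removes and returns the oldest byte, if any.
    pub fn pop(&mut self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        let byte = self.buf[self.head];
        self.head = (self.head + 1) % N;
        self.len -= 1;
        Some(byte)
    }

    pub fn is_empty(&self) -> bool {
        self.len == 0
    }
}
```

Because capacity is a compile-time constant, a full queue surfaces as an ordinary `Result` that the caller must handle, rather than as a hidden allocation.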
Rust: no_std binaries
• Rust no_std binaries are stunningly small
○ E.g., at Oxide, we are developing a message-passing, memory
protected system entirely in Rust (Rust microkernel, Rust tasks);
minimal systems are 30K — and entirely realistic ones are < 200K!
• no_std is without real precedent in other languages or environments; it
allows Rust to be put in essentially arbitrarily confined contexts
• Rust is the first language since C to meaningfully exist at the boundary
of hardware and systems software!
Rust, Wright’s Law and the Future
• Wright’s Law will continue to hold, resulting in more compute in more
places — and this compute will be essential for systems performance
• These compute elements will be increasingly special-purpose, and are
going to require purpose-fit software
• Rust is proving to be an excellent fit for these use cases!
• We fully expect many more open source, de novo hardware-facing
Rust-based systems — and thanks to no_std they will be able to
leverage one another; the Rust revolution is here!