Slide 1

Slide 1 text

Intelligence is not enough The humanity of engineering Bryan Cantrill Oxide Computer Company

Slide 2

Slide 2 text

OXIDE It always starts with a tweet…

Slide 3

Slide 3 text

OXIDE It always starts with a tweet being trolled…

Slide 4

Slide 4 text

OXIDE It always starts with a tweet being trolled…

Slide 5

Slide 5 text

OXIDE It always starts with a tweet being trolled…

Slide 6

Slide 6 text

OXIDE It always starts with a tweet being trolled…

Slide 7

Slide 7 text

OXIDE “Serious”? • This tweet used the word “serious” three times, mainly to deride others • Not clear what “serious” means in the context of an argument that equates a computer program with nuclear weapons? • Or accuses anyone who disagrees with this assessment of “just vibes”? • Or one that puts the risk of human extinction at the (metaphorical!) hands of a computer program at 5%, with zero methodology? • So, a serious question: why treat this seriously at all?

Slide 8

Slide 8 text

OXIDE Reasons to treat this seriously • Fear of technology isn’t new – and isn’t always poorly founded! • New technologies often have unintended consequences and externalities that merit consideration and discussion • But in those who believe in AI-based extinction risk, the fear itself is alarming – in part because of the actions that it would justify • The “AI pause” – if implemented – would be brazenly authoritarian • The accompanying rhetoric is often disturbingly violent

Slide 9

Slide 9 text

OXIDE Concrete extinction risk • Most AGI-based extinction risk fears – when made concrete – hinge on: ○ A computer program getting ahold of nuclear weapons ○ A computer program making a novel bioweapon ○ A computer program developing novel molecular nanotechnology • We are going to leave aside nuclear weapons, as indisputably serious people have been thinking about it since the dawn of the atomic age • But the latter two have something important in common…

Slide 10

Slide 10 text

OXIDE Superintelligent engineering? • Whether stated explicitly or not, when we talk about the fear of a superintelligent AI actively killing not just some humans but all of them, we are talking about AI making weapons • Let us leave aside many questions about such scenarios (e.g., AI’s alignment, motivation, or means of production – and human adaptability, countermeasures, and resilience), and focus on one pillar… • It depends on AI applying the constraints of physical and mathematical reality to make new stuff – which is to say, engineering

Slide 11

Slide 11 text

OXIDE Engineering and intelligence • If our very existence is threatened by a superintelligence engaged in engineering, it prompts an important question… • Is engineering an act of intelligence alone? • I can’t speak to building novel bioweapons or the significant challenges in reviving otherwise moribund molecular nanotechnology… • …but we do have a bunch of recent experience building something big and new that is surely simpler than these domains

Slide 12

Slide 12 text

OXIDE What we built!

Slide 13

Slide 13 text

OXIDE Building a computer • In case it needs to be said: building a new computer + new network switch + high-speed backplane + all software from lowest levels of firmware to highest levels of control plane is hard and complicated • It is still, however, engineering not science • Engineering is the act of learning from failure: even when building anew, there will be many occasions when the system does not, in fact, work! • It is worth exploring a tiny fraction of the failures that we endured in building, as they are instructive as to the nature of engineering…

Slide 14

Slide 14 text

OXIDE Failure to bring CPU out of reset • Despite following the documented power sequencing to the CPU (AMD Milan), it was refusing to come out of reset, simply reinitiating the power-on sequence after 1.25 seconds of inactivity • Natural assumption was that power was marginal – but the power looked good (and making it extraordinary didn’t change anything) • Went down any number of blind alleys, performing directed experiments with respect to non-connected pins that shouldn’t make any difference • These experiments weren’t easy!

Slide 15

Slide 15 text

OXIDE Failure to bring CPU out of reset

Slide 16

Slide 16 text

OXIDE Failure to bring CPU out of reset • After several weeks of debugging, we discovered that our voltage regulator had a firmware bug: it adjusted voltage as requested by the CPU via SVI2 – but never sent a completion (VOTF Complete) • The CPU had no way of knowing that the power was in fact correct • AMD’s tool for verifying power (SDLE) did not check for this packet • Corrected regulator firmware resulted in the CPU coming out of reset!
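
The failure mode on this slide can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Regulator` class, the `cpu_power_on` function, the 1.1 V target, and the idea that the 1.25-second restart is a direct timeout on the missing completion are all hypothetical simplifications, not the real SVI2 protocol or the regulator's firmware.

```python
from dataclasses import dataclass
from typing import Tuple

# From the slides: the CPU reinitiated power-on after 1.25 s of inactivity.
RESET_TIMEOUT_S = 1.25


@dataclass
class Regulator:
    """Hypothetical regulator model (not the real part's firmware)."""
    sends_completion: bool
    voltage: float = 0.0

    def handle_request(self, target_v: float) -> bool:
        # The voltage is adjusted correctly in both the buggy and fixed cases.
        self.voltage = target_v
        # The buggy firmware never reports completion back to the CPU.
        return self.sends_completion


def cpu_power_on(reg: Regulator, target_v: float,
                 max_attempts: int = 3) -> Tuple[bool, float]:
    """Return (came_out_of_reset, simulated_seconds_elapsed)."""
    elapsed = 0.0
    for _ in range(max_attempts):
        if reg.handle_request(target_v):
            return True, elapsed
        # No completion seen: wait out the timeout, then restart the sequence.
        elapsed += RESET_TIMEOUT_S
    return False, elapsed


if __name__ == "__main__":
    for label, reg in (("buggy", Regulator(sends_completion=False)),
                       ("fixed", Regulator(sends_completion=True))):
        ok, seconds = cpu_power_on(reg, target_v=1.1)  # 1.1 V is an assumed value
        print(label, "out of reset:", ok, "rail:", reg.voltage, "V after", seconds, "s")
```

In the buggy case the rail sits at the requested voltage on every attempt, which is why measuring the power alone could not reveal the problem: the missing piece was the acknowledgement, not the voltage.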

Slide 17

Slide 17 text

OXIDE Failure to bring NIC out of reset • We could not get the Chelsio NIC to come out of reset • Extensive validation did not reveal any signal that was out of spec • Attempting to take a working add-in card (AIC) and destroy it revealed that one of the pinstrap resistors (to select the clock source) was incorrectly specified • We had a 1K ohm pull-down resistor, but this was in fact too weak – and a 499 ohm resistor was required to overcome an internal pull-up • Reworking with the correct resistor resulted in the NIC correctly starting!
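
Why a 1K ohm pull-down can be "too weak" follows from the voltage divider it forms with the internal pull-up. The sketch below uses assumed values throughout: the slides do not give the supply voltage, the internal pull-up resistance, or the input-low threshold, so `VDD`, `R_INTERNAL_UP`, and `V_IL_MAX` are hypothetical numbers chosen only to show the shape of the problem.

```python
def strap_voltage(vdd: float, r_pull_up: float, r_pull_down: float) -> float:
    """Voltage at a pin sitting between a pull-up to vdd and a pull-down to ground."""
    return vdd * r_pull_down / (r_pull_up + r_pull_down)


# All three values below are assumptions for illustration, not from the slides.
VDD = 3.3               # supply voltage, volts
R_INTERNAL_UP = 2000.0  # internal pull-up, ohms
V_IL_MAX = 0.8          # highest voltage still read as logic low, volts

if __name__ == "__main__":
    for r_down in (1000.0, 499.0):  # the two resistor values named on the slide
        v = strap_voltage(VDD, R_INTERNAL_UP, r_down)
        reads_low = v < V_IL_MAX
        print(f"{r_down:.0f} ohm pull-down: {v:.2f} V, reads low: {reads_low}")
```

Under these assumed values the 1K ohm strap leaves the pin at about 1.1 V, above the low threshold, while the 499 ohm strap pulls it to about 0.66 V. Every signal can be within spec and the strap can still select the wrong option.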

Slide 18

Slide 18 text

OXIDE NIC transiently failing to train all PCIe lanes • We have our own platform enablement layer (i.e., no BIOS); we are responsible for initializing devices at the lowest layer • With disconcerting frequency, some number of Chelsio NIC links did not train correctly for some of their lanes on boot • Decoding the Link Training and Status State Machine (LTSSM) on the CPU allowed us to better understand where it was failing, but not why • Discovered that a second PERST resulted in correct training – and moreover that this second PERST is present on legacy firmware!

Slide 19

Slide 19 text

OXIDE Failure to connect to U.2 NVMe drives • In a revision of our PCIe-to-U.2 passthrough card (Sharkfin), we had I2C connectivity – but no PCIe connectivity whatsoever • A previous version of this card had worked, but little had changed in the schematic and the layout – why were the new ones broken?! • Physical inspection revealed that one of the parts was simply wrong! • The wrong reel of parts had been loaded into a pick-and-place machine, and an inverter had been laid down instead of an AND gate (!) • Reworked ~1200 cards in ~96 hours!

Slide 20

Slide 20 text

OXIDE Random data corruption on software install • When installing OS boot images, sporadic (!) corruption was seen • Adding checksums to these images revealed corruption was rampant (!!) • Microprocessor was speculatively loading through a stowaway mapping from early boot, which was allocating in the TLB • If application address conflicted with address of stowaway mapping, kernel would incorrectly copy data from the wire to the wrong location • Eliminating stowaway mapping eliminated the corruption – but highlighted divergent perspectives on side-effects of speculative loads
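
The step of adding checksums to expose otherwise sporadic corruption can be sketched as below. This is a minimal illustration, not the mechanism actually used: the slides do not say which checksum or granularity was chosen, so SHA-256, the 4096-byte chunk size, and the function names are all assumptions.

```python
import hashlib
from typing import List


def sha256_hex(data: bytes) -> str:
    """Hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def find_corrupt_chunks(source: bytes, installed: bytes,
                        chunk_size: int = 4096) -> List[int]:
    """Return indices of chunks whose digests differ between the two copies."""
    longest = max(len(source), len(installed))
    n_chunks = (longest + chunk_size - 1) // chunk_size  # round up
    bad = []
    for i in range(n_chunks):
        lo, hi = i * chunk_size, (i + 1) * chunk_size
        if sha256_hex(source[lo:hi]) != sha256_hex(installed[lo:hi]):
            bad.append(i)
    return bad


if __name__ == "__main__":
    image = bytes(range(256)) * 64   # 16 KiB stand-in for a boot image
    damaged = bytearray(image)
    damaged[5000] ^= 0xFF            # flip one byte inside the second chunk
    print(find_corrupt_chunks(image, bytes(damaged)))
```

Checking every chunk on every install turns corruption that is only noticed when it breaks something into corruption that is counted each time it occurs, which is how a "sporadic" problem can turn out to be rampant.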

Slide 21

Slide 21 text

OXIDE What do these have in common? • Each posed an existential risk for the artifact: without solving them, we wouldn’t have something that’s impaired – we would have nothing • Each revealed an emergent property, often at an interface boundary • The breakthrough was often something that “shouldn’t” have worked • Intelligence alone does not solve problems like this • In all cases, we summoned other elements of our character: our resilience, our teamwork, our rigor, our optimism, our curiosity

Slide 22

Slide 22 text

OXIDE Values in engineering • These extra-intelligence values are so important to us that we have codified them – and use them very explicitly as a lens for hiring • To be clear, we are certainly seeking capable, intelligent people – but that intelligence is useless without these shared (human!) values • We may be more explicit about it than others, but many engineering teams are also implicitly hiring for shared values • Viz.: It is comical to think of an engineering team hiring based only on the results of a test – or any other linear measure of intelligence!

Slide 23

Slide 23 text

OXIDE The humanity in engineering • This humanity necessary to understand and resolve failure – so essential in designing and building – is hidden in the final artifact • This is the soul in Tracy Kidder’s Soul of a New Machine – and the perspiration in Edison’s proverbial 99% perspiration • Computer programs lack this humanity: they do not have willpower, desire, or drive – let alone the deeper human qualities required • Which doesn’t mean that AI can’t be useful to engineers, merely that it cannot engineer autonomously

Slide 24

Slide 24 text

OXIDE So, should we worry about AI? • Extinction risk due to AGI is de minimis – but we must not falsely dichotomize AI into posing existential risk or no risk whatsoever! • The risk that AI does pose may feel mundane – but it is much more about how it will be abused (deliberately or accidentally) by existing structures • AI ethics is exceedingly important, especially when it is being used to inform decisions that affect people’s lives! • By acknowledging that AI is and will be an important tool, we can move beyond fear to focus on enforcing existing regulatory regimes

Slide 25

Slide 25 text

OXIDE Further information (wells to fall down) • Richard Smalley/K. Eric Drexler debate on molecular nanotechnology • Lex Fridman interview with Marc Andreessen • Logan Bartlett interview with Eliezer Yudkowsky • Oxide and Friends podcast, especially Okay Doomer, Tales From the Bringup Lab and More Tales from the Bringup Lab