The Forgotten Operator

Talk given at the Southern California Linux Expo in March, 2023. Video: https://www.youtube.com/watch?v=H3YbsOCb4lc

Bryan Cantrill

March 17, 2023

Transcript

  1. OXIDE In the beginning… • In the beginning, computers were

    so expensive that they were shared by necessity – leading to the rise of a (brief) utility computing movement • But with Moore’s Law, computing became denser, faster – and cheaper • With each successive turn – minicomputers, servers, workstations, personal computers – computing became cheaper and easier to own • By the 1990s computing was only on-premises
  2. OXIDE The pain of on-premises compute • With the rise

    of the internet, compute needs exploded • All infrastructure was on-premises – it can’t just be spun up! • Physical infrastructure is capital and labor intensive • Adding insult to injury, it was all proprietary – hardware and software • Physical buildout was exceedingly painful • A confluence of trends began to give rise to an alternative…
  3. OXIDE The rise of cloud computing • Several factors in

    the 2000s came into confluence: ◦ Internet ubiquity + protocol maturity ◦ Rise of open source software at all layers of the stack ◦ Dominance of x86 + “commodity” hardware ◦ Startup ice age + financial crisis (emphasis on opex over capex) • Added up to cloud computing: shared, elastic, API-driven infrastructure
  4. OXIDE Myths of cloud computing • Cloud computing’s ubiquity in

    the 2010s gave rise to several myths… • Myth: Cloud computing is a low margin business • Reality: Cloud computing is a high margin business! • Myth: The economies of scale from operating a public cloud primarily accrue to purchasing power • Reality: Purchasing power is not unimportant – but the much greater dividend was the ability to invest in innovation!
  5. OXIDE Cloud computing divide • Cloud computing operators – hyperscalers

    – invested relentlessly in innovation, yielding an increasing divide • This innovation drove down their own costs, allowing them to bolster their own positions and continue to innovate • On-premises infrastructure providers didn’t understand the cloud, and increasingly focussed on those customers that shared their confusion • All of this served to accelerate the demise of on-premises compute • So… is anyone left on-prem?
  6. OXIDE The forgotten operator • There (emphatically!) remain good reasons

    to run on-prem! • If you are on-prem in 2023, the reasons are likely good • These include: risk management, regulatory compliance, latency, and (increasingly!) economics • This on-premises operator has been forgotten by everyone ◦ Vendors don’t understand their use case ◦ Fellow technologists act like they have never heard of the cloud!
  7. OXIDE The pain of the forgotten operator • The forgotten

    operator is in an extraordinary amount of pain: the abstractions for on-premises compute remain vestigial • Power, cooling, BMC, BIOS, ToR switch, all date from the PC era! • And this is to say nothing of the software! • These systems operate at cross-purposes: they were never designed together – and to the contrary • But do we care?
  8. OXIDE The mandate for rack-scale machines • Those repatriating onto

    on-premises infrastructure will (rightfully) expect API-driven elastic infrastructure • However, that’s not what they’re going to find • What they will find is, in fact, worse than they might remember • We believe that we must do better • We must design rack-scale machines that integrate hardware and software into a single, software-driven system!
  9. OXIDE Reasons for optimism • There are a couple of

    interesting trends that give optimism… ◦ Hardware is easier than ever before – and increasingly open ◦ There have been tremendous software advances, e.g. Rust and P4 ◦ Remote teams make it easier to ramp than ever before • Still, rack-scale design presents new challenges – and is a big build!
  10. OXIDE Oxide Computer Company • We have built a true

    rack-scale machine, with integrated hardware and software, allowing one to easily deploy cloud computing on-premises • After a three year build (!), we are on the cusp of shipping our first product • We can now say it unequivocally: the future demands rack-scale design!
  11. OXIDE Rescuing the forgotten operator • The public cloud will

    always play an important role – it’s not going away • But more and more operators will need to manage both on-prem and public cloud buildout • Those operators have the right to modernity, wherever they deploy! • To the forgotten operator: help is on the way! • Join us and learn more at https://oxide.computer – and check out our weekly Discord, “Oxide and Friends” (now also a podcast!)