Docker and the Future of Containers in Production

Bryan Cantrill
January 29, 2015

My presentation at the Docker meetup in Seattle on January 28, 2015. This was videoed, but the video seems to be lost; a close approximation of this can be found here: https://www.youtube.com/watch?v=Ll50EFquwSo


Transcript

  1. Docker and the
    Future of Containers in
    Production
    Bryan Cantrill
    CTO
    [email protected]
    @bcantrill

  2. Prehistory: Virtualization as cloud catalyst
    • In the 1960s — shortly after the dawn of computing! —
    pundits foresaw a compute utility that would be public
    and multi-tenant
    • The vision was four decades too early: it took the
    internet + commodity computing + virtualization to yield
    cloud computing
    • Virtualization is the essential ingredient for multi-tenant
    operation — but where in the stack to virtualize?
    • Choices around virtualization capture tensions between
    elasticity, tenancy, and performance
    • tl;dr: Virtualization choices drive economic tradeoffs

  3. • The historical answer — since the 1960s — has been to
    virtualize at the level of the hardware:
    • A virtual machine is presented upon which each
    tenant runs an operating system of their choosing
    • There are as many operating systems as tenants
    • The singular advantage of hardware virtualization: it can
    run entire legacy stacks unmodified
    • However, hardware virtualization exacts a heavy price:
    operating systems are not designed to share resources
    like DRAM, CPU, I/O devices or the network
    • Hardware virtualization limits tenancy, elasticity and
    performance
    Hardware-level virtualization?

  4. • Virtualizing at the application platform layer addresses
    the tenancy challenges of hardware virtualization
    • Added advantage of a much more nimble (& developer-
    friendly!) abstraction…
    • ...but at the cost of dictating abstraction to the developer
    • This creates the “Google App Engine problem”:
    developers are in a straitjacket where toy programs
    are easy — but sophisticated apps are impossible
    • Virtualizing at the application platform layer poses many
    other challenges with respect to security, containment
    and scalability
    Platform-level virtualization?

  5. • Virtualizing at the OS level hits the sweet spot:
    • Single OS (i.e., single kernel) allows for efficient use of
    hardware resources, maximizing tenancy and
    performance
    • Disjoint instances are securely compartmentalized by
    the operating system
    • Gives users what appears to be a virtual machine (albeit
    a very fast one) on which to run higher-level software
    • The ease of a PaaS with the generality of IaaS
    • Model was pioneered by FreeBSD jails and taken to its
    logical extreme by Solaris zones — and then aped by
    Linux containers
    OS-level virtualization!

  6. OS-level virtualization in the cloud
    • Joyent runs OS containers in the cloud via SmartOS
    (our illumos derivative) — and we have run containers in
    multi-tenant production since ~2006
    • Core SmartOS facilities are container-aware and
    optimized: Zones, ZFS, DTrace, Crossbow, SMF, etc.
    • SmartOS also supports hardware-level virtualization —
    but we have long advocated OS-level virtualization for
    new build out
    • We emphasized their operational characteristics
    (performance, elasticity, tenancy), and for many years
    we were a lone voice...

  7. Containers as PaaS foundation?
    • Some saw the power of OS containers to facilitate up-
    stack platform-as-a-service abstractions
    • For example, dotCloud — a platform-as-a-service
    provider — built their PaaS on OS containers
    • Hearing that many were interested in their container
    orchestration layer (but not their PaaS), dotCloud open
    sourced their container-based orchestration layer...

  8. ...and Docker was born

  9. Docker revolution
    • Docker has used the rapid provisioning + shared
    underlying filesystem of containers to allow developers
    to think operationally
    • Developers can encode dependencies and deployment
    practices into an image
    • Images can be layered, allowing for swift development
    • Images can be quickly deployed — and re-deployed
    • Docker will do to apt what apt did to tar
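    The layering described above can be sketched as a stack of copy-on-write layers. This is only an illustrative model (the file names and contents below are made up), not Docker's actual storage driver:

    ```python
    # Minimal sketch of layered-image file lookup: a new image layer
    # records only what changed, which is why layering allows for
    # swift development and quick re-deployment.

    def resolve(layers, path):
        """Return the contents of `path`, searching the topmost layer first.

        `layers` is a list of dicts, base layer first, topmost last;
        a value of None marks a file deleted ("whiteout") in that layer.
        """
        for layer in reversed(layers):
            if path in layer:
                if layer[path] is None:
                    raise FileNotFoundError(path)
                return layer[path]
        raise FileNotFoundError(path)

    base = {"/etc/os-release": "Ubuntu 14.04", "/usr/bin/python": "<binary>"}
    app = {"/app/server.py": "print('hi')"}      # app layer adds one file
    fix = {"/etc/os-release": "Ubuntu 14.04.1"}  # patch layer overrides one

    image = [base, app, fix]
    assert resolve(image, "/etc/os-release") == "Ubuntu 14.04.1"
    assert resolve(image, "/app/server.py") == "print('hi')"
    ```

    Deploying a new application version means shipping only the changed topmost layer; the shared base layers are reused.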

  10. Docker’s challenges
    • The Docker model is the future of containers
    • Docker’s challenges are largely around production
    deployment: security, network virtualization, persistence
    • Security concerns are real enough that for multi-tenancy,
    OS containers are currently running in hardware VMs (!!)
    • With SmartOS, we have spent a decade addressing these
    concerns — and are proven in production…
    • Could we combine the best of both worlds?
    • Could we somehow deploy Docker containers as
    SmartOS zones?

  11. Docker + SmartOS: Linux binaries?
    • First (obvious) problem: while it has been designed to
    be cross-platform, Docker is Linux-centric
    • While Docker could be ported, the encyclopedia of
    Docker images will likely forever remain Linux binaries
    • SmartOS is Unix — but it isn’t Linux…
    • Could we somehow natively emulate Linux — and run
    Linux binaries directly on the SmartOS kernel?

  12. OS emulation: An old idea
    • Operating systems have long employed system call
    emulation to allow binaries from one operating system to
    run on another on the same instruction set architecture
    • Combines the binary footprint of the emulated system
    with the operational advantages of the emulating system
    • Sun first did this with SunOS 4.x binaries on Solaris 2.x
    • In mid-2000s, Sun developed zone-based OS emulation
    for Solaris: branded zones
    • Several brands were developed — notably including an
    LX brand that allowed for Linux emulation
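    The mechanism can be sketched as a per-brand dispatch table that maps the emulated system's syscall numbers onto native handlers. The numbers and handlers below are purely illustrative, not the actual LX brand implementation:

    ```python
    # Conceptual sketch of branded-zone system call emulation: a brand
    # installs a table mapping the emulated OS's syscall numbers to
    # handlers that translate arguments and semantics onto the native
    # kernel. All numbers here are made up for illustration.

    def native_getpid():
        return 1234  # stand-in for the native kernel's getpid

    def lx_getpid():
        # A brand handler may adjust semantics (e.g., presenting the
        # zone's own process namespace) around the native call.
        return native_getpid()

    LX_SYSCALLS = {20: lx_getpid}  # emulated syscall number -> handler

    def brand_dispatch(table, number, *args):
        """Dispatch an emulated binary's syscall through the brand table."""
        handler = table.get(number)
        if handler is None:
            raise NotImplementedError(f"unhandled emulated syscall {number}")
        return handler(*args)

    assert brand_dispatch(LX_SYSCALLS, 20) == 1234
    ```

    The emulated binary runs unmodified; only the syscall boundary is translated, which is what combines the Linux binary footprint with the emulating system's operational advantages.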

  13. LX-branded zones: Life and death
    • The LX-branded zone worked for RHEL 3 (!): glibc 2.3.2
    + Linux 2.4
    • Remarkable amount of work was done to handle device
    pathing, signal handling, /proc — and arcana like TTY
    ioctls, ptrace, etc.
    • Worked for a surprising number of binaries!
    • But support was only for 2.4 kernels and only for 32-bit;
    2.6 + 64-bit appeared daunting…
    • Support was ripped out of the system on June 11, 2010
    • Fortunately, this was after the system was open sourced
    in June 2005 — and the source was out there...

  14. LX-branded zones: Resurrection!
    • In January 2014, David Mackay, an illumos community
    member, announced that he was able to resurrect the
    LX brand — and that it appeared to work!
    Linked below is a webrev which restores LX branded zones
    support to Illumos:
    http://cr.illumos.org/~webrev/DavidJX8P/lx-zones-restoration/
    I have been running OpenIndiana, using it daily on my
    workstation for over a month with the above webrev applied to
    the illumos-gate and built by myself.
    It would definitely raise interest in Illumos. Indeed, I have
    seen many people who are extremely interested in LX zones.
    The LX zones code is minimally invasive on Illumos itself, and
    is mostly segregated out.
    I hope you find this of interest.

  15. LX-branded zones: Revival
    • Encouraged that the LX-branded work was salvageable,
    Joyent engineer Jerry Jelinek reintegrated the LX brand
    into SmartOS on March 20, 2014...
    • ...and started the (substantial) work to modernize it
    • Guiding principles for LX-branded zone work:
    • Do it all in the open
    • Do it all on SmartOS master (illumos-joyent)
    • Add base illumos facilities wherever possible
    • Aim to upstream to illumos when we’re done

  16. LX-branded zones: Progress
    • Working assiduously over the course of 2014, progress
    was difficult but steady:
    • Ubuntu 10.04 booted in April
    • Ubuntu 12.04 booted in May
    • Ubuntu 14.04 booted in July
    • 64-bit Ubuntu 14.04 booted in October (!)
    • Going into 2015, it was becoming increasingly difficult to
    find Linux software that didn’t work...

  17. LX-branded zones: Working well...

  18. ...and, um, well received

  19. Docker + SmartOS: Provisioning?
    • With the binary problem being tackled, focus turned to
    the mechanics of integrating Docker with the SmartOS
    facilities for provisioning
    • Provisioning a SmartOS zone operates via the global
    zone that represents the control plane of the machine
    • docker is a single binary that functions as both client
    and server — and it has too much surface area to run in
    the global zone, especially for a public cloud
    • docker also embeds Go- and Linux-isms that we did
    not want in the global zone; we needed to find a
    different approach...

  20. Docker Remote API
    • While docker is a single binary that can run as either
    the client or the server, it does not run as both at once…
    • docker (the client) communicates with docker (the
    server) via the Docker Remote API
    • The Docker Remote API is expressive, modern and
    robust (i.e. versioned), allowing for docker to
    communicate with Docker backends that aren’t docker
    • The clear approach was therefore to implement a
    Docker Remote API endpoint for SmartDataCenter
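    As a sketch of what speaking the Remote API looks like, the hypothetical helper below builds the versioned request URL a client would GET; the host, port and version string are illustrative, not tied to any particular Docker release:

    ```python
    # Sketch of how a client addresses the Docker Remote API: plain,
    # versioned HTTP. The endpoint need not be docker itself; anything
    # that implements the API (such as an SDC-wide endpoint) can answer.

    API_VERSION = "v1.16"  # illustrative version string

    def api_url(host, path, **params):
        """Build a versioned Remote API URL, e.g. /v1.16/containers/json."""
        query = "&".join(f"{k}={v}" for k, v in sorted(params.items()))
        url = f"http://{host}/{API_VERSION}{path}"
        return f"{url}?{query}" if query else url

    # e.g. list all containers, as `docker ps -a` does:
    url = api_url("docker.example.com:2375", "/containers/json", all=1)
    assert url == "http://docker.example.com:2375/v1.16/containers/json?all=1"
    ```

    Because the version is part of the path, a non-docker backend can advertise exactly which API revision it implements — the robustness the slide refers to.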

  21. Aside: SmartDataCenter
    • Orchestration software for SmartOS-based clouds
    • Unlike other cloud stacks, not designed to run arbitrary
    hypervisors, sell legacy hardware or get 160 companies
    to agree on something
    • SmartDataCenter is designed to leverage the SmartOS
    differentiators: ZFS, DTrace and (esp.) zones
    • Runs both the Joyent Public Cloud and business-critical
    on-premises clouds at well-known brands
    • Born proprietary — but made entirely open source on
    November 6, 2014: http://github.com/joyent/sdc

  22. SmartDataCenter: Architecture
    [Slide diagram: a head-node runs the SDC 7 core services (Booter,
    AMQP broker, Binder/DNS, public API, customer portal, operator
    portal, firewall) and speaks DHCP/TFTP, AMQP and public HTTP to
    tens or hundreds of compute nodes per head-node. Each compute
    node runs the SmartOS kernel (network- or flash-booted), AMQP
    agents (Provisioner, Instrumenter, Heartbeater) and a ZFS-based
    multi-tenant filesystem hosting virtual SmartOS instances
    (OS virt.) and Linux/Windows guests (HW virt.), each with its
    own virtual NICs.]

  23. SmartDataCenter: Core Services
    [Slide diagram: the SDC7 core services on the head-node: Booter
    (DHCP/TFTP), AMQP broker, Binder (DNS), public API and customer
    portal (public HTTP), operator portal, analytics aggregator,
    Key/Value Service (Moray), Directory Service (UFDS), Workflow
    API, Virtual Machine API (VMAPI), Compute-Node API (CNAPI),
    Network API (NAPI), Firewall API (FWAPI), Designation API
    (DAPI), Image API, Packaging API (PAPI), Service API (SAPI)
    and Alerts & Monitoring (Amon), with links to operator services,
    Manta and other DCs. Service interdependencies are not shown
    for readability; other core services may be provisioned on
    compute nodes.]

  24. SmartDataCenter + Docker
    • Implementing an SDC-wide endpoint for the Docker
    remote API allows us to build in terms of our established
    core services: UFDS, CNAPI, VMAPI, Image API, etc.
    • Has the welcome side-effect of virtualizing the notion of
    the Docker host: Docker containers can be placed
    anywhere within the data center
    • From a developer perspective, one less thing to manage
    • From an operations perspective, allows for a flexible
    layer of management and control: Docker API endpoints
    become a potential administrative nexus
    • As such, virtualizing the Docker host is somewhat
    analogous to the way ZFS virtualized the filesystem...

  25. SmartDataCenter + Docker: Challenges
    • Some Docker constructs have (implicitly) encoded co-
    locality of Docker containers on a physical machine
    • Some of these constructs (e.g., --volumes-from) we
    will discourage but accommodate by co-scheduling
    • Others (e.g., host directory-based volumes) we are
    implementing via NFS backed by Manta, our (open
    source!) distributed object storage service
    • Moving forward, we are working with Docker to help
    assure that the Docker Remote API doesn’t create new
    implicit dependencies on physical locality

  26. SmartDataCenter + Docker: Networking
    • Parallel to our SmartOS and Docker work, we have
    been working on next-generation software-defined
    networking for SmartOS and SmartDataCenter
    • Goal was to use standard encapsulation/decapsulation
    protocols (i.e., VXLAN) for overlay networks
    • We have taken a kernel-based (and ARP-inspired)
    approach to assure scale
    • Complements SDC’s existing in-kernel, API-managed
    firewall facilities
    • All done in the open: on the dev-overlay branch of
    SmartOS (illumos-joyent) and as sdc-portolan
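    A minimal sketch of the encapsulation side: VXLAN (RFC 7348) wraps each Ethernet frame in an 8-byte header, carried over UDP, whose 24-bit VNI names the overlay network. This illustrates only the wire format, not the SmartOS kernel implementation:

    ```python
    # Pack/unpack the 8-byte VXLAN header (RFC 7348). The 24-bit VNI
    # (VXLAN Network Identifier) is what keeps tenants' overlay
    # networks disjoint on a shared physical network.

    import struct

    VXLAN_FLAG_VNI = 0x08  # "I" flag: the VNI field is valid

    def vxlan_header(vni):
        """Pack the VXLAN header for a 24-bit network identifier."""
        if not 0 <= vni < (1 << 24):
            raise ValueError("VNI must fit in 24 bits")
        # flags (1 byte), 3 reserved bytes, then VNI in the high 24 bits
        # of the final 32-bit word (low byte reserved).
        return struct.pack(">B3xI", VXLAN_FLAG_VNI, vni << 8)

    def vxlan_vni(header):
        """Recover the VNI from a VXLAN header (decapsulation side)."""
        flags, word = struct.unpack(">B3xI", header)
        assert flags & VXLAN_FLAG_VNI, "VNI flag not set"
        return word >> 8

    hdr = vxlan_header(5001)
    assert len(hdr) == 8
    assert vxlan_vni(hdr) == 5001
    ```

    The hard problem at scale is not the header but knowing which physical host hides behind a given overlay MAC — hence the ARP-inspired, kernel-based lookup approach mentioned above.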

  27. Putting it all together: sdc-docker
    • Our Docker engine for SDC, sdc-docker, implements
    the end points for the Docker Remote API
    • Work is young (started in earnest in early fall 2014), but
    because it takes advantage of a proven orchestration
    substrate, progress has been very quick…
    • We will be deploying it into early access production in
    the Joyent Public Cloud in Q1CY15
    • It’s open source: http://github.com/joyent/sdc-docker;
    you can install SDC (either on hardware or on VMware)
    and check it out for yourself!
    • A demo is worth a thousand slides...

  28. Future of containers in production
    • For nearly a decade, we at Joyent have believed that
    OS-virtualized containers are the future of computing
    • While the efficiency gains are tremendous, they have
    not alone been enough to propel containers into the
    mainstream
    • We believe that the developer ease of Docker combined
    with the proven production substrate of SmartOS and
    SmartDataCenter yields the best of all worlds
    • The future of containers is one without compromise:
    developer efficiency, operational elasticity, multi-tenant
    security and on-the-metal performance!

  29. Thank you!
    • Jerry Jelinek, @pfmooney, @jmclulow and @jperkin for
    their work on LX branded zones
    • @joshwilsdon, @trentmick, @cachafla and @orlandov
    for their work on sdc-docker
    • @rmustacc, @wayfaringrob, @fredfkuo and @notmatt
    for their work on SDC overlay networking
    • The countless engineers who have worked on or with
    illumos because they believed in OS-based virtualization
