Docker and the Future of Containers in Production

Bryan Cantrill
January 29, 2015
My presentation at the Docker meetup in Seattle on January 28, 2015. This was videoed, but the video seems to be lost; a close approximation of this can be found here: https://www.youtube.com/watch?v=Ll50EFquwSo

Transcript

  1. Docker and the Future of Containers in Production
    Bryan Cantrill
    CTO
    @bcantrill

  2. Prehistory: Virtualization as cloud catalyst
    • In the 1960s — shortly after the dawn of computing! —
    pundits foresaw a compute utility that would be public
    and multi-tenant
    • The vision was four decades too early: it took the
    internet + commodity computing + virtualization to yield
    cloud computing
    • Virtualization is the essential ingredient for multi-tenant
    operation — but where in the stack to virtualize?
    • Choices around virtualization capture tensions between
    elasticity, tenancy, and performance
    • tl;dr: Virtualization choices drive economic tradeoffs

  3. Hardware-level virtualization?
    • The historical answer, since the 1960s, has been to
    virtualize at the level of the hardware:
    • A virtual machine is presented upon which each
    tenant runs an operating system of their choosing
    • There are as many operating systems as tenants
    • The singular advantage of hardware virtualization: it can
    run entire legacy stacks unmodified
    • However, hardware virtualization exacts a heavy price:
    operating systems are not designed to share resources
    like DRAM, CPU, I/O devices or the network
    • Hardware virtualization limits tenancy, elasticity and
    performance

  4. Platform-level virtualization?
    • Virtualizing at the application platform layer addresses
    the tenancy challenges of hardware virtualization
    • Added advantage of a much more nimble (& developer-friendly!)
    abstraction…
    • ...but at the cost of dictating abstraction to the developer
    • This creates the “Google App Engine problem”:
    developers are in a straitjacket where toy programs
    are easy, but sophisticated apps are impossible
    • Virtualizing at the application platform layer poses many
    other challenges with respect to security, containment
    and scalability

  5. OS-level virtualization!
    • Virtualizing at the OS level hits the sweet spot:
    • Single OS (i.e., single kernel) allows for efficient use of
    hardware resources, maximizing tenancy and
    performance
    • Disjoint instances are securely compartmentalized by
    the operating system
    • Gives users what appears to be a virtual machine (albeit
    a very fast one) on which to run higher-level software
    • The ease of a PaaS with the generality of IaaS
    • Model was pioneered by FreeBSD jails and taken to
    its logical extreme by Solaris zones, and then aped
    by Linux containers

  6. OS-level virtualization in the cloud
    • Joyent runs OS containers in the cloud via SmartOS
    (our illumos derivative) — and we have run containers in
    multi-tenant production since ~2006
    • Core SmartOS facilities are container-aware and
    optimized: Zones, ZFS, DTrace, Crossbow, SMF, etc.
    • SmartOS also supports hardware-level virtualization —
    but we have long advocated OS-level virtualization for
    new build out
    • We emphasized their operational characteristics
    (performance, elasticity, tenancy), and for many years
    we were a lone voice...

  7. Containers as PaaS foundation?
    • Some saw the power of OS containers to facilitate up-
    stack platform-as-a-service abstractions
    • For example, dotCloud, a platform-as-a-service
    provider, built their PaaS on OS containers
    • Hearing that many were interested in their container
    orchestration layer (but not their PaaS), dotCloud open
    sourced their container-based orchestration layer...

  8. ...and Docker was born

  9. Docker revolution
    • Docker has used the rapid provisioning + shared
    underlying filesystem of containers to allow developers
    to think operationally
    • Developers can encode dependencies and deployment
    practices into an image
    • Images can be layered, allowing for swift development
    • Images can be quickly deployed — and re-deployed
    • Docker will do to apt what apt did to tar
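
A minimal sketch of why layered images allow swift development, assuming a toy model in which each layer is just a mapping from file path to contents. The layer names and file contents are made up for illustration, and real image layers also record deletions, which this model ignores.

```python
from collections import ChainMap

# Hypothetical layers: each maps a file path to its contents.
base_layer = {"/bin/sh": "shell", "/etc/os-release": "ubuntu 14.04"}
runtime_layer = {"/usr/bin/node": "node binary"}
app_layer = {"/app/server.js": "console.log('hi')", "/etc/os-release": "patched"}


def build_image(*layers):
    """Stack layers so that later (upper) layers shadow earlier ones."""
    # ChainMap searches its maps left to right, so the top layer goes first.
    return ChainMap(*reversed(layers))


image_v1 = build_image(base_layer, runtime_layer, app_layer)

# Rebuilding with a new top layer reuses the lower layers untouched.
app_layer_v2 = {"/app/server.js": "console.log('hello')"}
image_v2 = build_image(base_layer, runtime_layer, app_layer_v2)
```

Only the top layer differs between the two images, so a rebuild or redeploy needs to ship just that layer.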

  10. Docker’s challenges
    • The Docker model is the future of containers
    • Docker’s challenges are largely around production
    deployment: security, network virtualization, persistence
    • Security concerns are real enough that for multi-tenancy,
    OS containers are currently running in hardware VMs (!!)
    • With SmartOS, we have spent a decade addressing these
    concerns, and are proven in production…
    • Could we combine the best of both worlds?
    • Could we somehow deploy Docker containers as
    SmartOS zones?

  11. Docker + SmartOS: Linux binaries?
    • First (obvious) problem: while it has been designed to
    be cross-platform, Docker is Linux-centric
    • While Docker could be ported, the encyclopedia of
    Docker images will likely forever remain Linux binaries
    • SmartOS is Unix — but it isn’t Linux…
    • Could we somehow natively emulate Linux — and run
    Linux binaries directly on the SmartOS kernel?

  12. OS emulation: An old idea
    • Operating systems have long employed system call
    emulation to allow binaries from one operating system
    to run on another on the same instruction set architecture
    • Combines the binary footprint of the emulated system
    with the operational advantages of the emulating system
    • Sun first did this with SunOS 4.x binaries on Solaris 2.x
    • In the mid-2000s, Sun developed zone-based OS emulation
    for Solaris: branded zones
    • Several brands were developed — notably including an
    LX brand that allowed for Linux emulation
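
The idea of system call emulation can be pictured with a toy dispatcher. This is a user-space illustration only, not how the LX brand is implemented: the table, the handler names and `emulate_syscall` are hypothetical. The two system call numbers and the ENOSYS value are the Linux x86-64 ones.

```python
import os

LINUX_ENOSYS = 38  # Linux errno for "function not implemented"

# Linux x86-64 system call numbers for the two calls this toy handles.
LINUX_SYS_WRITE = 1
LINUX_SYS_GETPID = 39


def native_write(fd, data):
    """Service the foreign write(2) with the host's own write."""
    return os.write(fd, data)


def native_getpid():
    """Service the foreign getpid(2) with the host's own getpid."""
    return os.getpid()


# Translation table: foreign syscall number -> native implementation.
EMULATION_TABLE = {
    LINUX_SYS_WRITE: native_write,
    LINUX_SYS_GETPID: native_getpid,
}


def emulate_syscall(number, *args):
    """Dispatch a foreign system call to its native handler."""
    handler = EMULATION_TABLE.get(number)
    if handler is None:
        # Unhandled calls fail the way a Linux kernel reports them.
        return -LINUX_ENOSYS
    return handler(*args)
```

The hard part of a real emulation layer is everything this sketch leaves out: matching the semantics of each call, not just its number.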

  13. LX-branded zones: Life and death
    • The LX-branded zone worked for RHEL 3 (!): glibc 2.3.2
    + Linux 2.4
    • Remarkable amount of work was done to handle device
    pathing, signal handling, /proc — and arcana like TTY
    ioctls, ptrace, etc.
    • Worked for a surprising number of binaries!
    • But support was only for 2.4 kernels and only for 32-bit;
    2.6 + 64-bit appeared daunting…
    • Support was ripped out of the system on June 11, 2010
    • Fortunately, this was after the system was open sourced
    in June 2005 — and the source was out there...

  14. LX-branded zones: Resurrection!
    • In January 2014, David Mackay, an illumos community
    member, announced that he was able to resurrect the
    LX brand, and that it appeared to work:
    Linked below is a webrev which restores LX branded zones
    support to Illumos:
    http://cr.illumos.org/~webrev/DavidJX8P/lx-zones-restoration/
    I have been running OpenIndiana, using it daily on my
    workstation for over a month with the above webrev applied to
    the illumos-gate and built by myself.
    It would definitely raise interest in Illumos. Indeed, I have
    seen many people who are extremely interested in LX zones.
    The LX zones code is minimally invasive on Illumos itself, and
    is mostly segregated out.
    I hope you find this of interest.

  15. LX-branded zones: Revival
    • Encouraged that the LX-branded work was salvageable,
    Joyent engineer Jerry Jelinek reintegrated the LX brand
    into SmartOS on March 20, 2014...
    • ...and started the (substantial) work to modernize it
    • Guiding principles for LX-branded zone work:
    • Do it all in the open
    • Do it all on SmartOS master (illumos-joyent)
    • Add base illumos facilities wherever possible
    • Aim to upstream to illumos when we’re done

  16. LX-branded zones: Progress
    • Working assiduously over the course of 2014, progress
    was difficult but steady:
    • Ubuntu 10.04 booted in April
    • Ubuntu 12.04 booted in May
    • Ubuntu 14.04 booted in July
    • 64-bit Ubuntu 14.04 booted in October (!)
    • Going into 2015, it was becoming increasingly difficult to
    find Linux software that didn’t work...

  17. LX-branded zones: Working well...

  18. ...and, um, well received

  19. Docker + SmartOS: Provisioning?
    • With the binary problem being tackled, focus turned to
    the mechanics of integrating Docker with the SmartOS
    facilities for provisioning
    • Provisioning a SmartOS zone operates via the global
    zone that represents the control plane of the machine
    • docker is a single binary that functions as both client
    and server — and with too much surface area to run in
    the global zone, especially for a public cloud
    • docker has also embedded Go- and Linux-isms that
    we did not want in the global zone; we needed to find a
    different approach...

  20. Docker Remote API
    • While docker is a single binary that can run on the
    client or the server, it does not run in both at once…
    • docker (the client) communicates with docker (the
    server) via the Docker Remote API
    • The Docker Remote API is expressive, modern and
    robust (i.e. versioned), allowing for docker to
    communicate with Docker backends that aren’t docker
    • The clear approach was therefore to implement a
    Docker Remote API endpoint for SmartDataCenter
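
To make the idea of a Docker backend that isn't docker concrete, here is a minimal sketch of an HTTP server that answers two Remote API paths, `/_ping` and `/containers/json`, with or without a version prefix. It is a stand-in under stated assumptions, not sdc-docker: the handler class, `make_server` and the container record are hypothetical, and a real client expects far more of the API than this.

```python
import json
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical container record; a real backend would query its own services.
FAKE_CONTAINERS = [
    {"Id": "abc123", "Image": "ubuntu:14.04", "Status": "Up 5 minutes"}
]


class RemoteApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Clients may prefix paths with an API version, e.g. /v1.16/...
        path = re.sub(r"^/v\d+\.\d+", "", self.path.split("?", 1)[0])
        if path == "/_ping":
            self._reply(200, "text/plain", b"OK")
        elif path == "/containers/json":
            body = json.dumps(FAKE_CONTAINERS).encode()
            self._reply(200, "application/json", body)
        else:
            self._reply(404, "text/plain", b"not found")

    def _reply(self, status, content_type, body):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port=0):
    """Bind to localhost; port 0 asks the OS for any free port."""
    return HTTPServer(("127.0.0.1", port), RemoteApiHandler)
```

Calling `make_server().serve_forever()` starts the stand-in. Because the client only sees the API, what sits behind the endpoint is free to be something other than a single Docker host.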

  21. Aside: SmartDataCenter
    • Orchestration software for SmartOS-based clouds
    • Unlike other cloud stacks, not designed to run arbitrary
    hypervisors, sell legacy hardware or get 160 companies
    to agree on something
    • SmartDataCenter is designed to leverage the SmartOS
    differentiators: ZFS, DTrace and (esp.) zones
    • Runs both the Joyent Public Cloud and business-critical
    on-premises clouds at well-known brands
    • Born proprietary — but made entirely open source on
    November 6, 2014: http://github.com/joyent/sdc

  22. SmartDataCenter: Architecture
    [Architecture diagram; the labels that survive extraction are:
    • Head-node: SDC 7 core services; Booter (DHCP/TFTP);
    AMQP broker (AMQP); Binder (DNS); Public API; Customer
    portal; Operator portal; Public HTTP; Firewall
    • Compute node (tens/hundreds per head-node): AMQP agents
    (Provisioner, Instrumenter, Heartbeater); ZFS-based
    multi-tenant filesystem
    • Guests, each with virtual NICs: Virtual SmartOS (OS virt.);
    Linux Guest (HW virt.); Windows Guest (HW virt.); Virtual
    OS or Machine
    • SmartOS kernel (network booted); SmartOS kernel (flash
    booted)]

  23. SmartDataCenter: Core Services
    [Diagram of the SDC7 core services on the head-node; the
    labels that survive extraction are:
    • Analytics aggregator; Key/Value Service (Moray); Firewall
    API (FWAPI); Virtual Machine API (VMAPI); Directory Service
    (UFDS); Designation API (DAPI); Workflow API; Network API
    (NAPI); Compute-Node API (CNAPI); Image API; Alerts &
    Monitoring (Amon); Packaging API (PAPI); Service API (SAPI)
    • Booter (DHCP/TFTP); AMQP broker (AMQP); Binder (DNS)
    • Public API; Customer portal; Operator portal; Public HTTP
    • Operator Services; Manta; Other DCs
    • Note: Service interdependencies not shown for readability
    • Other core services may be provisioned on compute nodes]

  24. SmartDataCenter + Docker
    • Implementing an SDC-wide endpoint for the Docker
    Remote API allows us to build in terms of our established
    core services: UFDS, CNAPI, VMAPI, Image API, etc.
    • Has the welcome side-effect of virtualizing the notion of
    Docker host machine: Docker containers can be placed
    anywhere within the data center
    • From a developer perspective, one less thing to manage
    • From an operations perspective, allows for a flexible
    layer of management and control: Docker API endpoints
    become a potential administrative nexus
    • As such, virtualizing the Docker host is somewhat
    analogous to the way ZFS virtualized the filesystem...
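
A sketch of what virtualizing the Docker host means for placement, assuming a deliberately simple policy: put each container on whichever compute node has the most free memory. The `ComputeNode` type, the node names and the policy are hypothetical and say nothing about how SDC's own designation service actually chooses.

```python
from dataclasses import dataclass


@dataclass
class ComputeNode:
    name: str      # hypothetical node identifier
    free_mb: int   # memory still available on the node


def place(nodes, needed_mb):
    """Pick the node with the most free memory that can fit the container."""
    candidates = [n for n in nodes if n.free_mb >= needed_mb]
    if not candidates:
        raise RuntimeError("no compute node has enough free memory")
    chosen = max(candidates, key=lambda n: n.free_mb)
    chosen.free_mb -= needed_mb  # reserve the memory on the chosen node
    return chosen.name


nodes = [
    ComputeNode("cn-1", 2048),
    ComputeNode("cn-2", 8192),
    ComputeNode("cn-3", 4096),
]
```

The caller asks only for capacity, never for a machine, which is the sense in which there is one less thing for the developer to manage.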

  25. SmartDataCenter + Docker: Challenges
    • Some Docker constructs have (implicitly) encoded co-
    locality of Docker containers on a physical machine
    • Some of these constructs (e.g., --volumes-from) we
    will discourage but accommodate by co-scheduling
    • Others (e.g., host directory-based volumes) we are
    implementing via NFS backed by Manta, our (open
    source!) distributed object storage service
    • Moving forward, we are working with Docker to help
    assure that the Docker Remote API doesn’t create new
    implicit dependencies on physical locality
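
Co-scheduling for a construct like `--volumes-from` amounts to a placement constraint: the new container has to land on the node that already hosts the container whose volumes it shares. The sketch below assumes a hypothetical `choose_node` helper and made-up node and container names; it illustrates the constraint, not sdc-docker's code.

```python
def choose_node(placements, free_nodes, volumes_from=None):
    """Choose a node for a new container.

    placements: container name -> node name, for containers already placed.
    free_nodes: names of nodes with spare capacity, in preference order.
    volumes_from: name of the container whose volumes are shared, if any.
    """
    if volumes_from is not None:
        # Shared volumes imply physical co-location with the source container.
        required = placements[volumes_from]
        if required not in free_nodes:
            raise RuntimeError("node %s is full; cannot co-schedule" % required)
        return required
    # Without a locality constraint, any node with capacity will do.
    return free_nodes[0]
```

The constraint removes the scheduler's freedom to pick a node, which is one reason to discourage constructs that encode co-locality.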

  26. SmartDataCenter + Docker: Networking
    • Parallel to our SmartOS and Docker work, we have
    been working on next-generation software-defined
    networking for SmartOS and SmartDataCenter
    • Goal was to use standard encapsulation/decapsulation
    protocols (i.e., VXLAN) for overlay networks
    • We have taken a kernel-based (and ARP-inspired)
    approach to assure scale
    • Complements SDC’s existing in-kernel, API-managed
    firewall facilities
    • All done in the open: on the dev-overlay branch of
    SmartOS (illumos-joyent) and as sdc-portolan
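
For readers unfamiliar with VXLAN, a sketch of its encapsulation step: an 8-byte header carrying a 24-bit network identifier (VNI) is prepended to the tenant's Ethernet frame, and the result travels in a UDP datagram. The header layout follows the VXLAN specification (RFC 7348); the function names are hypothetical, and the outer UDP/IP headers and the in-kernel lookup work are omitted entirely.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def vxlan_encap(vni, inner_frame):
    """Prepend a VXLAN header to an inner Ethernet frame (bytes)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags byte, then 24 reserved bits.
    # Word 2: 24-bit VNI, then 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(packet):
    """Return (vni, inner_frame) from a VXLAN payload."""
    word1, word2 = struct.unpack("!II", packet[:8])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return word2 >> 8, packet[8:]
```

The 24-bit VNI is what lets many tenants' overlay networks share one physical network without seeing each other's traffic.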

  27. Putting it all together: sdc-docker
    • Our Docker engine for SDC, sdc-docker, implements
    the endpoints for the Docker Remote API
    • Work is young (started in earnest in early fall 2014), but
    because it takes advantage of a proven orchestration
    substrate, progress has been very quick…
    • We will be deploying it into early access production in
    the Joyent Public Cloud in Q1CY15
    • It’s open source: http://github.com/joyent/sdc-docker;
    you can install SDC (either on hardware or on VMware)
    and check it out for yourself!
    • A demo is worth a thousand slides...

  28. Future of containers in production
    • For nearly a decade, we at Joyent have believed that
    OS-virtualized containers are the future of computing
    • While the efficiency gains are tremendous, they have
    not alone been enough to propel containers into the
    mainstream
    • We believe that the developer ease of Docker combined
    with the proven production substrate of SmartOS and
    SmartDataCenter yields the best of all worlds
    • The future of containers is one without compromise:
    developer efficiency, operational elasticity, multi-tenant
    security and on-the-metal performance!

  29. Thank you!
    • Jerry Jelinek, @pfmooney, @jmclulow and @jperkin for
    their work on LX branded zones
    • @joshwilsdon, @trentmick, @cachafla and @orlandov
    for their work on sdc-docker
    • @rmustacc, @wayfaringrob, @fredfkuo and @notmatt
    for their work on SDC overlay networking
    • The countless engineers who have worked on or with
    illumos because they believed in OS-based virtualization
