Slide 1

Slide 1 text

Container in Mesos: Present and Future

Jie Yu, Timothy Chen

Slide 2

Slide 2 text

Outline

● Containerizer overview
● The unified containerizer
    ● Motivation
    ● Pluggable architecture
    ● Container image
    ● Container network
    ● Container storage
    ● Container security
    ● Extensions
● Moving forward
    ● Nested container
    ● VM support
    ● Unified fetching and caching
    ● Better abstraction for isolators

Slide 3

Slide 3 text

What is a containerizer?

Containerizer
● Sits between agents and containers
● Launches/updates/destroys containers
● Provides isolation between containers
● Reports container stats and status

[Diagram: Zookeeper and three Mesos Masters; Marathon and Cassandra frameworks; three Mesos Agents, each with a Containerizer managing a Container that runs an Executor with tasks T1 and T2]

Slide 4

Slide 4 text

Currently supported containerizers

Docker containerizer
● Delegate to Docker daemon

Mesos containerizer
● Using standard OS features (e.g., cgroups, namespaces)
● Pluggable architecture allowing customization and extension

Slide 5

Slide 5 text

Maintaining multiple containerizers is hard

New features require changes to both containerizers
● E.g., GPU support, persistent volumes, etc.
● Code duplication → bugs, maintenance issues

Coordination needed between the two containerizers
● Global resources like GPU, net_cls handles, etc.

Slide 6

Slide 6 text

Unified containerizer

Goal: Unifying containerizer implementations using a shared, pluggable and extensible architecture

Currently, it is based on Mesos containerizer

Slide 7

Slide 7 text

Why don’t we base it on the Docker containerizer?

Customization
● Only a few limited options with Docker (plugins)

Dependence on the Docker daemon
● Stability concern
● Extra dependency

Support for other image formats
● Appc, CVMFS, ...

Slide 8

Slide 8 text

Unified containerizer

● Pluggable architecture
● Container image
● Container network
● Container storage
● Container security
● Customization and extensions

Slide 9

Slide 9 text

Unified containerizer: Pluggable architecture

Unified containerizer
● Launcher: process management
● Isolators: container lifecycle hook
● Provisioner: container image support

Slide 10

Slide 10 text

Unified containerizer: Launcher

Responsible for process management
● Spawn containers
● Kill and wait for containers

Supported launchers:
● Posix launcher
● Linux launcher
● Systemd launcher (future)
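
To make the launcher's role concrete, here is a minimal Python sketch of the spawn/kill/wait responsibilities listed on the slide. It is not the Mesos launcher (which is written in C++); the class name `PosixLauncherSketch` and its method signatures are assumptions chosen for illustration only.

```python
import signal
import subprocess
import sys


class PosixLauncherSketch:
    """Hypothetical launcher: tracks one child process per container id."""

    def __init__(self):
        self._children = {}

    def spawn(self, container_id, argv):
        # start_new_session gives the child its own session, so it can be
        # signalled independently of the parent (the "agent" in this sketch).
        child = subprocess.Popen(argv, start_new_session=True)
        self._children[container_id] = child
        return child.pid

    def kill(self, container_id):
        self._children[container_id].send_signal(signal.SIGKILL)

    def wait(self, container_id):
        # On POSIX a negative return code means "terminated by that signal".
        return self._children.pop(container_id).wait()


launcher = PosixLauncherSketch()
pid = launcher.spawn("c1", [sys.executable, "-c", "import time; time.sleep(60)"])
launcher.kill("c1")
status = launcher.wait("c1")
```

A Linux-specific launcher would additionally deal with namespaces and cgroups, which this sketch deliberately leaves out.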

Slide 11

Slide 11 text

Unified containerizer: Isolator

Interface for extensions during the life cycle of a container
● Pre-launch - prepare()
● Post-launch (both in parent and child context) - isolate()
● Termination - cleanup()
● Resources update - update()
● Resources limitation reached - watch()
● Agent restart and recovery - recover()
● Stats and status pulling - usage()

Sufficient for most of the extensions!
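
The hook names above can be read as an interface. The following Python sketch mirrors those names only; the parameter lists, return values, and the `run_lifecycle` driver are assumptions for illustration and do not reproduce the actual C++ Isolator API in Mesos.

```python
class Isolator:
    """Lifecycle hooks named after the slide; signatures are assumptions."""

    def recover(self, container_ids):          # agent restart and recovery
        pass

    def prepare(self, container_id):           # pre-launch
        pass

    def isolate(self, container_id, pid):      # post-launch
        pass

    def watch(self, container_id):             # resource limitation reached
        pass

    def update(self, container_id, resources): # resources update
        pass

    def usage(self, container_id):             # stats and status
        return {}

    def cleanup(self, container_id):           # termination
        pass


class RecordingIsolator(Isolator):
    """Hypothetical isolator that only records which hooks were called."""

    def __init__(self):
        self.calls = []

    def prepare(self, container_id):
        self.calls.append("prepare")

    def isolate(self, container_id, pid):
        self.calls.append("isolate")

    def update(self, container_id, resources):
        self.calls.append("update")

    def usage(self, container_id):
        self.calls.append("usage")
        return {"mem_bytes": 0}

    def cleanup(self, container_id):
        self.calls.append("cleanup")


def run_lifecycle(isolator, container_id, pid, resources):
    # One possible ordering of the hooks for a single container.
    isolator.prepare(container_id)
    isolator.isolate(container_id, pid)
    isolator.update(container_id, resources)
    stats = isolator.usage(container_id)
    isolator.cleanup(container_id)
    return stats


recorder = RecordingIsolator()
stats = run_lifecycle(recorder, "c1", 1234, {"mem": 512})
```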

Slide 12

Slide 12 text

Unified containerizer: Isolator example: cgroups memory isolator

Container launch (Agent Process and Container Process)
● Agent: Script = Isolator::prepare()
    * Create a cgroup for the container in the memory cgroup hierarchy: /sys/fs/cgroup/memory/mesos/…
    * Start listening for OOM events
● Agent: Launcher creates a subprocess (the Container Process)
● Container Process: block on pipe
● Agent: Isolator::isolate(pid): move ‘pid’ to the memory cgroup just created
● Agent: signal the child to continue
● Container Process: invoke ‘Script’
● Container Process: exec the executor via execve()
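
The "block on pipe" handshake in this launch sequence can be sketched in Python as follows. This is an illustrative approximation, not Mesos code: `launch`, `isolate`, and `child_main` are hypothetical names, and the child returns an exit code where the real container process would run the script and call execve().

```python
import os


def launch(isolate, child_main):
    """Fork a child that waits on a pipe until the parent has isolated it."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: block until the parent writes one byte, then continue.
        os.close(write_fd)
        os.read(read_fd, 1)
        os.close(read_fd)
        # A real container process would exec the executor here.
        os._exit(child_main())
    # Parent: the child exists but has not started its payload yet,
    # so it can safely be moved into a cgroup before it runs anything.
    os.close(read_fd)
    isolate(pid)
    os.write(write_fd, b"x")  # signal the child to continue
    os.close(write_fd)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


isolated_pids = []
exit_code = launch(isolated_pids.append, lambda: 7)
```

The point of the pipe is ordering: the child cannot consume memory outside its cgroup, because it does nothing until the parent has finished `isolate(pid)`.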

Slide 13

Slide 13 text

Unified containerizer: Isolator example: cgroups memory isolator

Resource update (Agent Process and Container Process)
● Sending a new Task to the Executor changes the ‘resources’ of the Executor
● Agent: Isolator::update(): change cgroup control memory.limit_in_bytes
● Agent: send Task to Executor
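
As a rough illustration of the update step, the sketch below writes a new value to a `memory.limit_in_bytes` control file. To stay runnable without root privileges, a temporary directory stands in for the container's cgroup directory; the helper name `update_memory_limit` is an assumption, not a Mesos function.

```python
import os
import tempfile


def update_memory_limit(cgroup_dir, limit_bytes):
    """Write the new limit to the cgroup v1 memory control file."""
    path = os.path.join(cgroup_dir, "memory.limit_in_bytes")
    with open(path, "w") as control:
        control.write(str(limit_bytes))
    return path


# Stand-in for the container's directory in the memory cgroup hierarchy.
fake_cgroup = tempfile.mkdtemp()
control_path = update_memory_limit(fake_cgroup, 512 * 1024 * 1024)
with open(control_path) as control:
    written = control.read()
```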

Slide 14

Slide 14 text

Unified containerizer: Isolator example: cgroups memory isolator

Container teardown (Agent Process and Container Process)
● Agent: shutdown Executor or kill Task
● Agent: destroy container
● Container Process: container terminated
● Agent: Isolator::cleanup(): remove the memory cgroup associated with the container

Slide 15

Slide 15 text

Unified containerizer: Built-in isolators

● Cgroups isolators: cgroups/cpu, cgroups/mem, ...
● Disk isolators: disk/du, disk/xfs
● Filesystem isolators: filesystem/posix, filesystem/linux
● Volume isolators: docker/volume
● Network isolators: network/cni, network/port_mapping
● GPU isolators: gpu/nvidia
● ... and more!

Need your contribution!

Slide 16

Slide 16 text

Unified containerizer: Container image support

Starting from 0.28, you can run your Docker container on Mesos without a Docker daemon installed!
● One less dependency in your stack
● Agent restarts are handled gracefully; tasks are not affected
● Composes well with all existing isolators
● Easier to add extensions

Slide 17

Slide 17 text

Unified containerizer: Provisioner

Manage container images
● Store: fetch and cache image layers
● Backend: assemble rootfs from image layers
    ● E.g., copy, overlayfs, bind, aufs

Store can be extended
● Currently supported: Docker, Appc
● Plan to support: CVMFS (join the MesosCon talk!)
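
To illustrate what "assemble rootfs from image layers" means for a copy-style backend, here is a simplified Python sketch. It only copies layer directories on top of each other in order; it ignores whiteout files and everything else a real backend must handle, and `provision_copy` and the layer names are hypothetical.

```python
import os
import shutil
import tempfile


def provision_copy(layers, rootfs):
    """Copy each layer into rootfs in order; later layers overwrite earlier ones."""
    os.makedirs(rootfs, exist_ok=True)
    for layer in layers:
        shutil.copytree(layer, rootfs, dirs_exist_ok=True)
    return rootfs


def write_file(path, text):
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(text)


# Two made-up layers: the top layer overrides etc/motd from the base layer.
work = tempfile.mkdtemp()
base = os.path.join(work, "layer0")
top = os.path.join(work, "layer1")
write_file(os.path.join(base, "etc", "motd"), "base")
write_file(os.path.join(base, "etc", "hosts"), "hosts")
write_file(os.path.join(top, "etc", "motd"), "top")

rootfs = provision_copy([base, top], os.path.join(work, "rootfs"))
with open(os.path.join(rootfs, "etc", "motd")) as handle:
    motd = handle.read()
```

An overlay-style backend would mount the layers instead of copying them, trading provisioning time for a kernel dependency.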

Slide 18

Slide 18 text

Unified containerizer: Container image framework API

    message Image {
      enum Type {
        DOCKER = 1;
        APPC = 2;
      }

      required Type type;
      optional Appc appc;
      optional Docker docker;
      optional bool cached;

      message Docker { ... }
      message Appc { ... }
    }

    message TaskInfo {
      message ContainerInfo {
        enum Type {
          DOCKER = 1;
          MESOS = 2;
        }

        message MesosInfo {
          optional Image image;
        }

        required Type type;
        optional MesosInfo mesos;
      }

      optional ContainerInfo container;
    }

Slide 19

Slide 19 text

Unified containerizer: Example: launch a Docker container w/ unified containerizer

    TaskInfo {
      ...
      "container" : {
        "type" : "MESOS",
        "mesos" : {
          "image" : {
            "type" : "DOCKER",
            "docker" : {
              "name" : "busybox"
            }
          }
        }
      }
    }

Note: the container "type" is "MESOS" instead of "DOCKER", which uses the Docker containerizer.

More details can be found at: https://github.com/apache/mesos/blob/master/docs/container-image.md
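
A framework could build the same "container" fragment programmatically. The Python sketch below produces only the part of TaskInfo shown on the slide; the helper name `docker_image_container` is an assumption, and the remaining TaskInfo fields (elided as "..." on the slide) are intentionally left out.

```python
import json


def docker_image_container(image_name):
    """Build the "container" fragment for a Docker image run by the unified containerizer."""
    return {
        "container": {
            # "MESOS" selects the unified containerizer, not the Docker one.
            "type": "MESOS",
            "mesos": {
                "image": {
                    "type": "DOCKER",
                    "docker": {"name": image_name},
                }
            },
        }
    }


fragment = docker_image_container("busybox")
rendered = json.dumps(fragment, indent=2)
```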

Slide 20

Slide 20 text

Unified containerizer: Demo

Slide 21

Slide 21 text

Unified containerizer: Container network support

Mesos supports CNI starting in 1.0
● CNI: a container network spec proposed by CoreOS
● Simpler, with fewer dependencies than Docker CNM
● K8s supports it
● Rich plugins from network vendors

Main advantages:
● Clear separation between container and network management
● IPAM has its own pluggable interface

Slide 22

Slide 22 text

Unified containerizer: CNI

Existing CNI plugins
● ipvlan
● macvlan
● bridge
● flannel
● calico
● weave
● …

You can write your own plugin, and Mesos supports it!

Slide 23

Slide 23 text

Unified containerizer: CNI support in Mesos

Implemented as an isolator.

    mesos-agent --isolation=network/cni

    message NetworkInfo {
      optional string name;
      ...
    }

    message TaskInfo {
      message ContainerInfo {
        repeated NetworkInfo network_infos;
      }
    }

More details? Join the CNI talk!

Slide 24

Slide 24 text

Unified containerizer: Container storage support

Docker volume plugins are supported starting in 1.0
● Define the interface between container runtime and storage provider
● https://docs.docker.com/engine/extend/plugins_volume/

A variety of Docker volume plugins
● Flocker
● Rexray
● Convoy
● Glusterfs
● Ceph

Slide 25

Slide 25 text

Unified containerizer: Docker volume support in Mesos

Implemented as an isolator.

    mesos-agent --isolation=docker/volume

    message Volume {
      message Source {
        enum Type {
          DOCKER_VOLUME = 1;
        }

        message DockerVolume {
          optional string driver;
          required string name;
        }

        optional Type type;
        optional DockerVolume docker_volume;
      }

      optional Source source;
    }
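
As a hedged illustration of how a framework might fill in this message, the Python sketch below builds a JSON-style Volume that uses only the fields shown on the slide. The helper name `docker_volume` and the volume name "my-volume" are made up for the example; a complete Volume in Mesos carries additional fields that the slide does not show and that are omitted here.

```python
import json


def docker_volume(name, driver=None):
    """Build a Volume whose source is a Docker volume (slide fields only)."""
    docker = {"name": name}       # "name" is required in the message
    if driver is not None:
        docker["driver"] = driver # "driver" is optional
    return {
        "source": {
            "type": "DOCKER_VOLUME",
            "docker_volume": docker,
        }
    }


# "rexray" is one of the plugins named in the talk; "my-volume" is hypothetical.
volume = docker_volume("my-volume", driver="rexray")
rendered_volume = json.dumps(volume, indent=2)
```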

Slide 26

Slide 26 text

Unified containerizer: Container security

Linux capabilities
● Fine-grained access control
● Containers running as root have a restricted set of capabilities
● Containers running as non-root can have certain capabilities

User namespace
● Full privileges inside the user namespace (e.g., uid=0)
● Normal unprivileged user ID in the host user namespace

Slide 27

Slide 27 text

Unified containerizer: Extensions

Launcher
● Custom container processes management

Isolator
● Extension to the life cycle of a container

Provisioner
● New type of images
● Custom fetching and caching

Slide 28

Slide 28 text

Outline

● Containerizer overview
● The unified containerizer
    ● Motivation
    ● Pluggable architecture
    ● Container image
    ● Container network
    ● Container storage
    ● Container security
    ● Extensions
● Moving forward
    ● Nested container
    ● VM support
    ● Unified fetching and caching
    ● Better abstraction for isolators

Slide 29

Slide 29 text

Future of containerization in Mesos

Future: unified containerizer!

Make it awesome
● Nested container
● VM support
● Unified fetching and caching
● Better abstraction for isolators

Slide 30

Slide 30 text

Future work: Nested container

Custom executor wants to create sub-containers
● Isolation between sub-containers
● Sub-containers have container images (e.g., Docker)
● When executor dies, sub-containers will be destroyed

Use cases:
● K8s on Mesos
● Jenkins on Mesos
● Native POD support
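
The rule that sub-containers are destroyed when the executor dies can be modelled as a tree teardown. The Python sketch below is only a toy model of that idea; the `Container` class and the container names are hypothetical and say nothing about how Mesos would implement nesting.

```python
class Container:
    """Toy container node: destroying a parent destroys its children first."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.alive = True
        if parent is not None:
            parent.children.append(self)

    def destroy(self, log):
        for child in self.children:
            child.destroy(log)
        self.alive = False
        log.append(self.name)


executor = Container("executor")
web = Container("web", parent=executor)
db = Container("db", parent=executor)

destroyed = []
executor.destroy(destroyed)
```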

Slide 31

Slide 31 text

Future work: VM support

Goal: launching Mesos tasks/executors in VMs

Motivation and use cases
● More secure containers
● VM workload
● OpenStack integration

Possible implementations
● A new containerizer?
● A plugin to unified containerizer?

Slide 32

Slide 32 text

Future work: Unified fetching and caching

Problems:
● Different ways to fetch URIs and container images
● Cached in different places

Pluggable fetcher
● A fetcher for each URI scheme
● Allow URIs with custom scheme
● Fetcher is modularized and can be extended (e.g., p2p)

Unified caching
● All artifacts are cached the same way
● Content addressable storage
● Garbage collection
● Pre-fetching support
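
Content addressable storage with garbage collection, as proposed for unified caching, can be sketched in a few lines of Python. This is an illustration of the general technique under simple assumptions (SHA-256 digests as file names, a caller-supplied set of live digests); the class `ContentAddressableCache` is hypothetical and not part of Mesos.

```python
import hashlib
import os
import tempfile


class ContentAddressableCache:
    """Store each artifact under the SHA-256 digest of its content."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        path = os.path.join(self.root, digest)
        if not os.path.exists(path):  # identical content is stored once
            with open(path, "wb") as handle:
                handle.write(data)
        return digest

    def get(self, digest):
        with open(os.path.join(self.root, digest), "rb") as handle:
            return handle.read()

    def collect_garbage(self, live_digests):
        # Remove every cached artifact that is no longer referenced.
        removed = []
        for digest in sorted(os.listdir(self.root)):
            if digest not in live_digests:
                os.remove(os.path.join(self.root, digest))
                removed.append(digest)
        return removed


cache = ContentAddressableCache(tempfile.mkdtemp())
layer_a = cache.put(b"layer contents")
layer_a_again = cache.put(b"layer contents")
layer_b = cache.put(b"other artifact")
removed = cache.collect_garbage({layer_a})
```

Because the key is derived from the content, a URI artifact and an image layer with identical bytes end up as a single cache entry regardless of how they were fetched.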

Slide 33

Slide 33 text

Future work: Better abstraction for isolators

Problems with the existing abstraction
● Cannot specify dependencies between isolators
● Sharing information between isolators is not possible
● Upgrading isolators in a backward compatible way is hard

Potential solutions
● Explicit isolator dependency, both data and control
● Isolator versioning, and version checkpointing
● Isolator registry?
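
One way to picture "explicit isolator dependency" is a dependency graph that is topologically sorted before hooks run. The Python sketch below uses the standard library's `graphlib` for that; the isolator names come from the talk, but the dependency edges between them are assumptions made up for the example.

```python
from graphlib import CycleError, TopologicalSorter

# Each isolator maps to the set of isolators it depends on.
# These edges are illustrative assumptions, not taken from Mesos.
dependencies = {
    "filesystem/linux": set(),
    "docker/volume": {"filesystem/linux"},
    "network/cni": {"filesystem/linux"},
}

# Dependencies always appear before the isolators that need them.
order = list(TopologicalSorter(dependencies).static_order())

# A cyclic declaration is rejected instead of silently misordered.
cyclic = {"a": {"b"}, "b": {"a"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    cycle_detected = False
except CycleError:
    cycle_detected = True
```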

Slide 34

Slide 34 text

Summary

Unified containerizer
● Container images
● Container network
● Container storage
● Container security
● Pluggable architecture

Future: keep improving the unified containerizer!