is a container?
– Developer: create container images
– Operator: create isolated execution environment
• Containerization in Mesos focuses on the operator side
– Started from the very beginning (2011)

History of Mesos Containerization
(pre 0.10.0, 2011)
• Each container is a process session
• No resource isolation
[Diagram: an Agent running three Executors, each with one Task, each in its own process session]
support (0.10, 2012)
• Enabled CPU and memory isolation
• Freezer cgroup for process management
/sys/fs/cgroup/cpu/mesos/<cid>
/sys/fs/cgroup/memory/mesos/<cid>
/sys/fs/cgroup/freezer/mesos/<cid>
[Diagram: an Agent running three Executors, each with one Task, each in its own process session]
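The per-container cgroup layout above can be sketched as a small path builder. This is an illustrative helper, not Mesos code: the function name `container_cgroup_paths`, its parameters, and the sample container ID are assumptions; only the `/sys/fs/cgroup/<subsystem>/mesos/<cid>` layout comes from the slide.

```python
from pathlib import PurePosixPath

# Subsystems listed on the slide: cpu and memory for isolation,
# freezer for process management.
SUBSYSTEMS = ("cpu", "memory", "freezer")


def container_cgroup_paths(container_id, hierarchy="/sys/fs/cgroup", root="mesos"):
    """Return {subsystem: cgroup path} for one container (hypothetical helper)."""
    return {
        subsystem: str(PurePosixPath(hierarchy) / subsystem / root / container_id)
        for subsystem in SUBSYSTEMS
    }


if __name__ == "__main__":
    # "abc123" is a made-up container ID used only for illustration.
    for subsystem, path in container_cgroup_paths("abc123").items():
        print(f"{subsystem}: {path}")
```

Each container gets one directory per subsystem, so limits and process management can be applied to the whole process tree of that container rather than to a single PID.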
spec
• Scope
– How to package application bits into images
– How to package application configs into images
– How to store and transfer images
– How to unpack images to get application bits and configs
support OCI image spec (soon)
• Pluggable container image format
• Will be supported in MesosContainerizer
[Diagram: three Containers backed by an OCI Image Store, a Docker registry, and an Appc Image Store]
spec
• Scope
– How to connect containers
– How to allocate IP addresses
– How to enforce security policies
– How to isolate performance
– How to provide quality of service
– How to balance network traffic
Networking Interface
• A simple CLI-based interface
• Container orchestrator should invoke the CLI commands
– Before container starts
– After container terminates
• Adopted by major container orchestrators and network vendors
– Recently joined CNCF
– https://github.com/containernetworking/cni
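The "invoke a CLI before start and after termination" flow can be sketched as follows. The environment variable names (`CNI_COMMAND`, `CNI_CONTAINERID`, `CNI_NETNS`, `CNI_IFNAME`, `CNI_PATH`) and the JSON network configuration on stdin follow the CNI specification; the helper name `build_cni_call`, the container ID, the paths, and the network name are made-up values for illustration. The sketch only builds the invocation and does not execute a plugin.

```python
import json
import posixpath


def build_cni_call(command, container_id, netns, ifname, plugin_dir, net_conf):
    """Build (argv, env, stdin) for one CNI plugin invocation (hypothetical helper)."""
    if command not in ("ADD", "DEL"):
        raise ValueError(f"unsupported CNI command: {command}")
    # The plugin binary is looked up by the "type" field of the network config.
    argv = [posixpath.join(plugin_dir, net_conf["type"])]
    env = {
        "CNI_COMMAND": command,           # ADD before start, DEL after termination
        "CNI_CONTAINERID": container_id,
        "CNI_NETNS": netns,               # path to the container's network namespace
        "CNI_IFNAME": ifname,             # interface name to create in the container
        "CNI_PATH": plugin_dir,
    }
    # The network configuration is passed to the plugin on stdin as JSON.
    return argv, env, json.dumps(net_conf)


if __name__ == "__main__":
    # All values below are illustrative assumptions.
    net_conf = {"cniVersion": "0.3.1", "name": "example-net", "type": "bridge"}
    for command in ("ADD", "DEL"):
        argv, env, stdin = build_cni_call(
            command, "abc123", "/var/run/netns/abc123", "eth0", "/opt/cni/bin", net_conf
        )
        print(command, argv[0], env["CNI_NETNS"])
```

On a host with CNI plugins installed, the returned triple could be handed to `subprocess.run(argv, env=env, input=stdin, text=True, capture_output=True)`; the plugin reports its result as JSON on stdout.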
spec
• Scope
– How to create/destroy volumes
– How to attach/detach volumes
– How to mount/unmount volumes
– How to create snapshots
– How to restore snapshots
Storage Interface
• Joint work between major container orchestrators
– Mesos, Kubernetes, Docker, Cloud Foundry
– https://github.com/container-storage-interface
• The goal of CSI in v1.0
– One storage plugin works for all container orchestrators (COs)
– Support dynamic provisioning
– Support both local and remote storage
– Support Mount and Block volumes
What is an Executor?
• launched by the Mesos agent to execute tasks
• 1 : n mapping between an executor and tasks
[Diagram: the Mesos Agent calls waitpid() on an Executor that runs Task1 and Task2]
Executor API
Old
• Protobuf message passing over HTTP (non-standard, unversioned)
• Native library dependency: libmesos.so
New
• Versioned API (currently v1)
• Protobuf/JSON over HTTP 1.1
• No native library dependencies
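Because the new API is plain JSON (or Protobuf) over HTTP, an executor can talk to the agent with nothing but a standard HTTP client. The sketch below builds a `SUBSCRIBE` call for the v1 executor endpoint (`/api/v1/executor`), following my reading of the Mesos executor HTTP API documentation. The helper name, the agent address, and the IDs are assumptions; a real executor would take these from the environment the agent sets up. The request is built but not sent.

```python
import json
import urllib.request


def build_subscribe_request(agent_endpoint, framework_id, executor_id):
    """Build the HTTP request for a v1 executor SUBSCRIBE call (hypothetical helper)."""
    call = {
        "type": "SUBSCRIBE",
        "framework_id": {"value": framework_id},
        "executor_id": {"value": executor_id},
        # Empty on first subscription; filled in when re-subscribing after a disconnect.
        "subscribe": {"unacknowledged_tasks": [], "unacknowledged_updates": []},
    }
    return urllib.request.Request(
        url=f"http://{agent_endpoint}/api/v1/executor",
        data=json.dumps(call).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Address and IDs are made-up values for illustration.
    request = build_subscribe_request("127.0.0.1:5051", "framework-1", "executor-1")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` against a running agent would open the event stream; no `libmesos.so` is involved at any point.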
Executor
Command Executor*
• Old API
• Only supports launching a single task
• Agent flag --http_command_executor to use the new API
Docker Executor*
• Old API
• Only supports launching a single Docker container
Custom Executor
• Old API or V1 API
• Can launch multiple tasks or task groups**
Default Executor*
• V1 API
• Can launch multiple task groups
* Built-in executor
** Task groups are explained later
Why Do We Need Task Groups (aka Pods)?
• a sidecar/adapter container (e.g., logger, metrics) next to the main application controller
• Run a group of containers that share volumes and a network namespace, while some of them can have their own mount namespace
• Run a group of containers with the same lifecycle, e.g., one container’s failure causes all other containers to be cleaned up
Limitations of old API!
• of existing Scheduler/Executor APIs not allowing a group of tasks to be launched atomically
• A scheduler can launch multiple tasks in a single LAUNCH operation, but they are delivered one by one to the executor and might even be dropped in some cases due to a network partition!
• The newly introduced TaskGroup abstraction provides all-or-nothing semantics and ensures that a group of tasks is delivered atomically to the executor
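A task group is submitted as a single operation rather than as a list of independent tasks. The sketch below assembles the JSON for a `LAUNCH_GROUP` operation as it would appear inside a v1 scheduler `ACCEPT` call. Field names follow my reading of the Mesos v1 protobufs and should be checked against `mesos.proto`; the helper names, IDs, commands, and resource amounts are illustrative assumptions.

```python
import json


def scalar(name, value):
    """A scalar resource such as cpus or mem (hypothetical helper)."""
    return {"name": name, "type": "SCALAR", "scalar": {"value": value}}


def task(task_id, agent_id, shell_command, cpus, mem_mb):
    """One TaskInfo in the group (hypothetical helper)."""
    return {
        "name": task_id,
        "task_id": {"value": task_id},
        "agent_id": {"value": agent_id},
        "resources": [scalar("cpus", cpus), scalar("mem", mem_mb)],
        "command": {"value": shell_command},
    }


def launch_group_operation(executor_id, framework_id, tasks):
    """Wrap tasks in one LAUNCH_GROUP operation so they are delivered together."""
    return {
        "type": "LAUNCH_GROUP",
        "launch_group": {
            "executor": {
                "type": "DEFAULT",  # use the built-in default executor
                "executor_id": {"value": executor_id},
                "framework_id": {"value": framework_id},
                "resources": [scalar("cpus", 0.1), scalar("mem", 32), scalar("disk", 32)],
            },
            "task_group": {"tasks": list(tasks)},
        },
    }


if __name__ == "__main__":
    # IDs and commands are made-up values for illustration.
    group = [
        task("app", "agent-1", "./run-app", 1.0, 256),
        task("logger", "agent-1", "./ship-logs", 0.1, 64),
    ]
    print(json.dumps(launch_group_operation("executor-1", "framework-1", group), indent=2))
```

The key contrast with the plain `LAUNCH` operation is that the executor receives the whole `task_group` in one message, so it never observes a partially delivered group.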
Default Executor
• executor launches tasks in a task group as nested containers
• Tasks in a task group share resources (network namespace, volumes)
• No resource isolation between tasks or task groups within the executor
• Launches a nested container for every task in a task group
TaskGroup Lifecycle wrt Default Executor
• sidecar/adapter containers that don’t affect the lifecycle of the executor/main application:
– Run them in a separate task group; if they fail, the main container stays active
– Specify the same executor ID when launching subsequent task groups on the same executor
Health Checks
• built-in executors rely on the checker native library in `src/checks` in the Mesos codebase. Custom executors are encouraged to use it too!
• Command health checks are implemented via debug nested containers, since they need to run in the same mount namespace as the original container
• The executor container already shares the network namespace with the other containers in the task group, which makes HTTP/TCP checks easier
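The three check types mentioned above (command, HTTP, TCP) are declared on the task as a `HealthCheck` message. The sketch below builds the JSON form of each; field names follow my reading of the Mesos v1 `HealthCheck` protobuf and should be verified against `mesos.proto`. The helper names, ports, path, command, and timing values are assumptions chosen for illustration.

```python
import json

# Illustrative timing defaults; real values depend on the application.
TIMING = {
    "delay_seconds": 0.0,
    "interval_seconds": 10.0,
    "timeout_seconds": 5.0,
    "consecutive_failures": 3,
    "grace_period_seconds": 10.0,
}


def http_health_check(port, path):
    """HTTP check, run from the network namespace shared by the task group."""
    return {"type": "HTTP", "http": {"port": port, "path": path}, **TIMING}


def tcp_health_check(port):
    """TCP check: healthy if a connection to the port can be established."""
    return {"type": "TCP", "tcp": {"port": port}, **TIMING}


def command_health_check(shell_command):
    """Command check: healthy if the command exits with status 0."""
    return {"type": "COMMAND", "command": {"value": shell_command}, **TIMING}


if __name__ == "__main__":
    for check in (
        http_health_check(8080, "/health"),
        tcp_health_check(8080),
        command_health_check("test -f /tmp/ready"),
    ):
        print(json.dumps(check))
```

A task would carry one of these under its `health_check` field; the command variant is the one that needs a debug nested container, because the command must see the task's own filesystem.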
Probes aka “Checks”
• to health checks, but with no automatic response upon failure
• Allow the scheduler to probe the task and use the result according to its own business logic
Executor AuthN
• an executor submits calls, the agent cannot be certain that the calls are sent by the correct executor process
• To prevent executor impersonation, we need to authenticate executor requests
• The Default Executor supports AuthN since Mesos 1.3.0!
Future Work
• Termination Policy (MESOS-3545)
• Allow users to override the default termination policy of the default executor, including the ability to restart a failed task
• Resource Isolation for Nested Containers
• Executor AuthN
• Custom secret generator
Contributions Welcome!
Summary
• in Mesos
– Stable, in production for years
– Option to not rely on the Docker daemon
– Pluggable and extensible
– Embracing container standards
• Use the default executor for use cases that overlap with a custom executor