An overview of Mesos Containerization & The Default Executor

Presented at MesosCon NA 2017

Anand Mazumdar

September 16, 2017

Transcript

  1. © 2017 Mesosphere, Inc. All Rights Reserved. An Overview of Mesos Containerization & The Default Executor

    Gilbert Song, Anand Mazumdar
  2. History of Mesos Containerization

    • What is a container?
      – Developer: create container images
      – Operator: create isolated execution environment
    • Containerization in Mesos focuses on the operator side
      – Started from the very beginning (2011)
  3. Process based (pre 0.10.0, 2011)

    • Each container is a process session
    • No resource isolation
    (Diagram: the Agent runs three Executor/Task pairs, each in its own process session.)
  4. Linux cgroups support (0.10, 2012)

    • Enabled cpu, memory isolation
    • Freezer cgroup for process management
    /sys/fs/cgroup/cpu/mesos/<cid>
    /sys/fs/cgroup/memory/mesos/<cid>
    /sys/fs/cgroup/freezer/mesos/<cid>
    (Diagram: the Agent runs three Executor/Task pairs, each in its own process session.)
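The per-container cgroup layout on this slide can be sketched in a few lines of Python. This is only an illustration of the path scheme shown above: the function name `container_cgroups` and its parameters are hypothetical, and the hierarchy mount point and `mesos` root are taken from the slide rather than from any agent configuration.

```python
from pathlib import PurePosixPath

# Subsystems named on the slide: one cgroup per container in each of them.
SUBSYSTEMS = ("cpu", "memory", "freezer")


def container_cgroups(container_id, hierarchy="/sys/fs/cgroup", root="mesos"):
    """Return {subsystem: cgroup path} for one container ID (hypothetical helper)."""
    if not container_id or "/" in container_id:
        raise ValueError("container ID must be a non-empty single path component")
    base = PurePosixPath(hierarchy)
    # Path scheme from the slide: <hierarchy>/<subsystem>/mesos/<cid>
    return {s: str(base / s / root / container_id) for s in SUBSYSTEMS}
```

Calling `container_cgroups("abc")` yields one path per subsystem, for example `/sys/fs/cgroup/freezer/mesos/abc` for the freezer cgroup.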
  5. Containerizer and isolators (0.18, 2014)

    • Pluggable architecture
    • Isolators (lifecycle hooks)
      – cgroups/cpu
      – cgroups/mem
      – ...
    • Launchers (process mgmt)
      – linux (cgroups & ns)
      – posix
      – windows
    (Diagram: the Agent's Containerizer manages three Containers, each holding an Executor and its Task.)
  6. Docker engine integration (0.20, 2014)

    • Added a new Docker Containerizer
    • Shell out to docker commands
      – docker run
      – docker pull
      – docker stop
      – docker rm
    (Diagram: the Agent has a Docker Containerizer managing Docker Containers and a Mesos Containerizer managing Mesos Containers; each container holds an Executor and its Task.)
  7. Native Docker image support (0.28, 2016)

    • Isolators
      – ...
      – volume/host_path
      – linux/capabilities
      – posix/rlimits
      – docker/runtime
      – ...
    • Launchers
    • Provisioners
      – Docker image provisioner
      – Appc image provisioner
    (Diagram: the Agent has a Docker Containerizer managing Docker Containers and a Mesos Containerizer managing Mesos Containers; each container holds an Executor and its Task.)
  8. Adopting container standards

    • Container images
      – Docker
      – AppC
      – OCI image spec
    • Container network
      – CNI
    • Container storage
      – DVDI
      – CSI
    Supported through pluggable interfaces in MesosContainerizer
  9. De facto container standard

    • Volume Plugin (DVDI)
    • Network Plugin (libnetwork)
    • Registry API
  10. We need true container standards!

    • Stable interfaces
    • Backward compatibility
    • Multiple implementations
    • Vendor neutral
    • Interoperability
  11. Ideal world

    • Volume Plugin (DVDI) → Container Storage Spec
    • Network Plugin (libnetwork) → Container Network Spec
    • Registry API → Container Image Spec
  12. Standards we need for containers

    • Image
    • Networking
    • Storage
    • Runtime
    • Metrics
    • ...
  13. Container image spec

    • Scope
      – How to package application bits into images
      – How to package application configs into images
      – How to store and transfer images
      – How to unpack images to get application bits and configs
  14. OCI: Open Container Initiative

    • OCI image spec
      – https://github.com/opencontainers/image-spec
  15. Mesos will support OCI image spec (soon)

    • Will be supported in MesosContainerizer
    • Pluggable container image format
    (Diagram: three Containers backed by an OCI Image Store, a Docker registry, and an Appc Image Store.)
  16. Container networking spec

    • Scope
      – How to connect containers
      – How to allocate IP addresses
      – How to enforce security policies
      – How to isolate performance
      – How to provide quality of service
      – How to balance network traffic
  17. CNI: Container Network Interface

    • A simple CLI-based interface
    • The container orchestrator should invoke the CLI commands
      – Before the container starts
      – After the container terminates
    • Adopted by major container orchestrators and network vendors
      – Recently joined CNCF
      – https://github.com/containernetworking/cni
  18. CNI: Container Network Interface

    • Each plugin implements two CLI commands:
      – ADD: Attach network to the network namespace
      – DEL: Detach network from the network namespace
      – Pass config using arguments and environment variables
    (Diagram: the Container Runtime invokes a CNI Plugin with IPAM, which connects the Container to the Network over a veth.)
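As a rough sketch of how a runtime might prepare an ADD or DEL invocation, the helper below builds the environment for a plugin process. The `CNI_*` variable names come from the CNI specification, which also passes the network configuration as JSON on the plugin's standard input; the function name `cni_invocation` and all of its argument values are hypothetical.

```python
import json


def cni_invocation(command, container_id, netns, ifname, plugin_dirs, network_config):
    """Return (env, stdin) for one CNI plugin call (hypothetical helper)."""
    if command not in ("ADD", "DEL"):
        raise ValueError("this sketch only models the ADD and DEL commands")
    env = {
        "CNI_COMMAND": command,           # which operation the plugin should perform
        "CNI_CONTAINERID": container_id,  # runtime-assigned container ID
        "CNI_NETNS": netns,               # path to the container's network namespace
        "CNI_IFNAME": ifname,             # interface name to create inside the namespace
        "CNI_PATH": ":".join(plugin_dirs),  # where to look for plugin executables
    }
    # The network configuration itself is serialized as JSON for the plugin's stdin.
    return env, json.dumps(network_config)
```

A caller would then run the plugin executable with something like `subprocess.run([plugin_path], env=env, input=stdin, text=True, capture_output=True)`, once before the container starts (ADD) and once after it terminates (DEL).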
  19. Mesos supports CNI

    ... via an Isolator in MesosContainerizer: --isolation=network/cni,...
  20. Container storage spec

    • Scope
      – How to Create/Destroy volumes
      – How to Attach/Detach volumes
      – How to Mount/Unmount volumes
      – How to create snapshots
      – How to restore snapshots
  21. CSI: Container Storage Interface

    • Joint work between major container orchestrators
      – Mesos, Kubernetes, Docker, Cloud Foundry
      – https://github.com/container-storage-interface
    • The goal of CSI in v1.0
      – One storage plugin works for all COs
      – Support dynamic provisioning
      – Support both local and remote storage
      – Support Mount and Block volumes
  22. Latest new features

    • Nested Container
    • Debug Container
  23. Nested Container

    • Depth > 2!
    • Volume sharing with siblings
    • Fully compatible with other features
    (Diagram: Mesos Agent, Containerizer, and a Container holding the Executor; an Nginx container is LAUNCHed.)
  24. Debug Container

    (Diagram: Mesos Agent, Containerizer, and a Container holding the Executor and Nginx; a Debug container is LAUNCHed.)
  25. What is an Executor?

    • Process launched by the Mesos agent to execute tasks
    • 1 : n mapping between an executor and tasks
    (Diagram: the Mesos Agent launches an Executor, which runs Task1 and Task2 and waits on them with waitpid().)
  26. Executor API

    Old
    • Protobuf message passing over HTTP (non-standard, unversioned)
    • Native library dependency: libmesos.so
    New
    • Versioned API (currently v1)
    • Protobuf/JSON over HTTP 1.1
    • No native library dependencies
  27. Types of Executor

    Command Executor*
    • Old API
    • Only supports launching a single task
    • Agent flag --http_command_executor to use the new API
    Docker Executor*
    • Old API
    • Only supports launching a single docker container
    Custom Executor
    • Old API or V1 API
    • Can launch multiple tasks or task groups**
    Default Executor*
    • V1 API
    • Can launch multiple task groups
    * Built-in Executor
    ** Task Groups are explained later
  28. Why do we need Task Groups (aka Pods)?

    • Run a sidecar/adapter container (e.g., logger, metrics) next to the main application controller
    • Run a group of containers that share volumes and a network namespace, while some of them can have their own mount namespace
    • Run a group of containers with the same lifecycle, e.g., one container’s failure causes all other containers to be cleaned up
  29. Limitations of old API!

    • The existing Scheduler/Executor APIs do not allow launching a group of tasks atomically
    • A scheduler can launch multiple tasks in a single LAUNCH operation, but they are delivered one by one to the executor and might even be dropped in some cases due to a network partition!
    • The newly introduced TaskGroup abstraction provides all-or-nothing semantics and ensures that a group of tasks is delivered atomically to the executor
  30. Default Executor

    • Default executor launches tasks in a task group as nested containers
    • Tasks in a task group share resources (network namespace, volumes)
    • No resource isolation between tasks or task groups within the executor
    • Launches a nested container for every task in a task group
  31. Default Executor Features

    • Health checks and probes
    • Authentication
    • Custom Kill Policies
  32. WORKFLOW: SUBSCRIBE

    (Diagram: the Default Executor sends SUBSCRIBE to the Mesos Agent.)
    SUBSCRIBE Request (JSON):

      POST /api/v1/executor HTTP/1.1

      {
        "type": "SUBSCRIBE",
        "executor_id": { "value": "387aa966-8fc5-4428-a794-5a868a60d3eb" },
        "framework_id": { "value": "49154f1b-8cf6-4421-bf13-8bd11dccd1f1" },
        ...
      }
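A minimal sketch of how an executor could assemble this request with only the Python standard library follows. It builds the request object without sending it. The helper name `subscribe_request` and the agent address used in the usage note are assumptions, and the body contains only the fields visible on the slide; the fields the slide elides are left out.

```python
import json
import urllib.request


def subscribe_request(agent_url, framework_id, executor_id):
    """Build (but do not send) a SUBSCRIBE call for the v1 executor endpoint."""
    body = {
        "type": "SUBSCRIBE",
        "executor_id": {"value": executor_id},
        "framework_id": {"value": framework_id},
        # The slide elides further fields here; they are intentionally omitted.
    }
    return urllib.request.Request(
        agent_url + "/api/v1/executor",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
```

For example, `subscribe_request("http://localhost:5051", framework_id, executor_id)` targets a hypothetical local agent; passing the result to `urllib.request.urlopen` would open the event stream.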
  33. WORKFLOW: SUBSCRIBED

    (Diagram: the Mesos Agent sends SUBSCRIBED to the Default Executor.)
    SUBSCRIBE Response Event (JSON):

      HTTP/1.1 200 OK

      <event-length>
      {
        "type": "SUBSCRIBED",
        "subscribed": {
          "executor_info": {
            "executor_id": { "value": "387aa966-8fc5-4428-a794-5a868a60d3eb" },
            "command": { "value": "\/path\/to\/executor" },
            ...
      }
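The `<event-length>` prefix reflects the RecordIO framing that Mesos uses for streaming responses: each event is preceded by its length in bytes and a newline. A small sketch of a reader for that framing, assuming a buffered binary stream (the generator name `read_events` is hypothetical):

```python
import io
import json


def read_events(stream):
    """Yield JSON events from a binary stream of '<length>\\n<record>' frames."""
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of stream between two records
        length = int(header.strip())  # decimal byte count of the record that follows
        # Assumes a buffered stream (e.g. io.BytesIO or an HTTP response body),
        # where read(n) returns fewer than n bytes only at end of stream.
        record = stream.read(length)
        if len(record) < length:
            raise EOFError("stream ended in the middle of a record")
        yield json.loads(record)
```

An executor would iterate over `read_events(response)` and dispatch on each event's `"type"`, such as `SUBSCRIBED` or `LAUNCH_GROUP`.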
  34. WORKFLOW: LAUNCH_GROUP

    (Diagram: the Mesos Agent sends LAUNCH_GROUP to the Default Executor.)
    LAUNCH_GROUP Event (JSON):

      <event-length>
      {
        "type": "LAUNCH_GROUP",
        "launch_group": {
          "task_group": {
            "tasks": [
              {
                ....
                "command": { "value": "sleep" .... }
                ....
      }
  35. WORKFLOW: LAUNCH_NESTED_CONTAINER

    (Diagram: the Default Executor sends LAUNCH_NESTED_CONTAINER to the Mesos Agent.)
    LAUNCH_NESTED_CONTAINER HTTP Request (JSON):

      POST /api/v1 HTTP/1.1

      {
        "type": "LAUNCH_NESTED_CONTAINER",
        "launch_nested_container": {
          ....
          "command": { "value": "sleep" .... }
          ....
      }
  36. WORKFLOW: WAIT_NESTED_CONTAINER

    (Diagram: the Default Executor, now running Task 1 and Task 2, sends WAIT_NESTED_CONTAINER to the Mesos Agent.)
    WAIT_NESTED_CONTAINER HTTP Request (JSON):

      POST /api/v1 HTTP/1.1

      {
        "type": "WAIT_NESTED_CONTAINER",
        "wait_nested_container": {
          "container_id": {
            "parent": { "value": "6643b4be-583a-4dc3-bf23-a1ffb26dd452" },
            "value": "3192b9d1-db71-4699-ae25-e28dfbf42de1"
          }
        }
      }
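The `container_id` in this call nests its parent's ID, which is also how deeper nesting levels can be addressed. The sketch below builds that structure from an outermost-to-innermost list of IDs; both helper names are hypothetical, and the call body mirrors only the fields shown on the slide.

```python
def nested_container_id(chain):
    """Build a nested container ID dict from IDs ordered outermost to innermost."""
    if not chain:
        raise ValueError("at least one container ID is required")
    container_id = None
    for value in chain:
        node = {"value": value}
        if container_id is not None:
            node["parent"] = container_id  # each level wraps the one above it
        container_id = node
    return container_id


def wait_nested_container_call(chain):
    """Body of a WAIT_NESTED_CONTAINER call for the innermost container in chain."""
    return {
        "type": "WAIT_NESTED_CONTAINER",
        "wait_nested_container": {"container_id": nested_container_id(chain)},
    }
```

Passing the executor's container ID followed by the task's container ID reproduces the two-level body shown on the slide; a longer list yields a deeper `parent` chain.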
  37. TaskGroup Lifecycle wrt Default Executor

    (Diagram: the Mesos Agent and the Default Executor, which runs Task group 1 (Task 1, Task 2) and Task group 2 (Task 1, Task 2).)
  38. TaskGroup Lifecycle wrt Default Executor

    (Diagram: same as the previous slide; one task exits with status 1.)
  39. TaskGroup Lifecycle wrt Default Executor

    (Diagram: the Default Executor with Task group 1 (Task 1, Task 2) and Task group 2 (Task 1, Task 2).)
  40. TaskGroup Lifecycle wrt Default Executor

    (Diagram: only Task group 2 (Task 1, Task 2) remains; one task exits with status 0.)
  41. TaskGroup Lifecycle wrt Default Executor

    (Diagram: Task group 2 with only Task 1 remaining.)
  42. TaskGroup Lifecycle wrt Default Executor

    (Diagram: Task group 2 with only Task 1 remaining; it exits with status 0.)
  43. TaskGroup Lifecycle wrt Default Executor

    (Diagram: the Mesos Agent and the Default Executor with no task groups.)
    The executor terminates itself when no task groups remain active
  44. TaskGroup Lifecycle wrt Default Executor

    • For sidecar/adapter containers that don’t affect the lifecycle of the executor/main application:
    • Run them in a separate task group; their failure would still keep the main container active
    • Specify the same executor ID for launching subsequent task groups on the same executor
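The lifecycle walked through on these slides can be captured in a toy model: a failing task takes down the rest of its own task group, other groups keep running, and the executor terminates once no group is active. This is a simplified illustration under those assumptions, not the executor's actual implementation; the class and method names are hypothetical.

```python
class DefaultExecutorModel:
    """Toy model of task-group lifecycle handling (illustrative only)."""

    def __init__(self):
        self.groups = {}         # group name -> set of running task names
        self.terminated = False  # True once no task groups remain active

    def launch_group(self, name, tasks):
        self.groups[name] = set(tasks)

    def task_exited(self, group, task, status):
        """Record a task exit; return the sibling tasks killed as a result."""
        running = self.groups[group]
        running.discard(task)
        killed = []
        if status != 0:
            # A failure cleans up every other task in the same group only.
            killed = sorted(running)
            running.clear()
        if not running:
            del self.groups[group]
        if not self.groups:
            self.terminated = True
        return killed
```

Placing a sidecar in its own group means its failure leaves the main application's group untouched, which is the pattern recommended on the slide above.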
  45. Health Checks

    Specified in TaskInfo
    Includes:
    • Initial delay
    • Check interval
    • Timeout
    • Max failures
    • Grace period
    3 protocols:
    • HTTP
    • TCP
    • Command
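To make the interplay of these parameters concrete, here is a small model of how a checker might schedule checks and decide that a task is unhealthy. The field names are modelled on the parameters listed above, the default values are illustrative, and the rule that failures inside the grace period are ignored is an assumption of this sketch; the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HealthCheckPolicy:
    delay_seconds: float = 15.0         # initial delay before the first check
    interval_seconds: float = 10.0      # time between checks
    timeout_seconds: float = 20.0       # per-check timeout (not modelled below)
    consecutive_failures: int = 3       # max failures before declaring unhealthy
    grace_period_seconds: float = 10.0  # failures this soon after launch are ignored


def check_times(policy, count):
    """Seconds since launch at which the first `count` checks would run."""
    return [policy.delay_seconds + i * policy.interval_seconds for i in range(count)]


def first_unhealthy_time(policy, results):
    """results: iterable of (seconds_since_launch, passed). Return the time the
    task is first considered unhealthy, or None if that never happens."""
    failures = 0
    for elapsed, passed in results:
        if passed:
            failures = 0  # any success resets the consecutive-failure count
        elif elapsed >= policy.grace_period_seconds:
            failures += 1
            if failures >= policy.consecutive_failures:
                return elapsed
    return None
```

Feeding `first_unhealthy_time` the pairs produced by zipping `check_times` with observed pass/fail results shows how a longer grace period or a higher failure threshold delays the unhealthy verdict.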
  46. Health Checks

    • All built-in executors rely on the native checker library in `src/checks` in the Mesos codebase. Custom executors are encouraged to use it too!
    • Command health checks are implemented via debug nested containers, since they need to run in the same mount namespace as the original container
    • The executor container already shares the network namespace with the other containers in the task group, which makes HTTP/TCP checks easier
  47. Probes aka “Checks”

    • Similar to health checks, but with no automatic response upon failure
    • Allows the scheduler to probe the task and use the result according to its own business logic
  48. Executor AuthN

    • When an executor submits calls, the agent cannot be certain that the calls are sent by the correct executor process
    • To prevent executor impersonation, we need to authenticate executor requests
    • The Default Executor supports AuthN since Mesos 1.3.0!
  49. Executor AuthN

    (Diagram: Mesos Agent, Victim Executor, and a Malicious Process.)
    The malicious process subscribes with the victim's FrameworkID and ExecutorID
  50. Future Work

    • Custom Termination Policy (MESOS-3545)
      • Allow users to override the default termination policy of the default executor, including the ability to restart a failed task
    • Resource Isolation for Nested Containers
    • Executor AuthN
      • Custom secret generator
    Contributions Welcome!
  51. Summary

    • Containerization in Mesos
      – Stable, in production for years
      – Option to not rely on Docker daemon
      – Pluggable and extensible
      – Embracing container standards
    • Use the default executor for use cases that overlap with the custom executor