
Towards Running Stateful Applications on Nomad

Diptanu Choudhury
August 03, 2016

Transcript

  1. Quest to increase agility and reliability

    Develop, Build, Deploy, Production.
    Fast turnaround.
    High availability.
    Control plane for operations.
  2. Application, Configuration, Constraints

        job "redis" {
          datacenters = ["us-east-1"]

          group "cache" {
            count = 100

            task "redis" {
              driver = "docker"

              config {
                image = "redis:latest"
              }

              resources {
                cpu    = 500
                memory = 256

                network {
                  mbits         = 10
                  dynamic_ports = ["redis"]
                }
              }
            }
          }
        }
  3. Job file

        job "redis" {
          datacenters = ["us-east-1"]

          task "redis" {
            driver = "docker"

            config {
              image = "redis:latest"
            }

            resources {
              cpu    = 500
              memory = 256

              network {
                mbits         = 10
                dynamic_ports = ["redis"]
              }
            }
          }
        }
  4. Batch Scheduler, Service Scheduler, Restart Policies, System Scheduler, Consul Integration,

    Log Management, Runtime Stats, Job Plans, Vault Integration, TLS, Sticky Volumes, Disk Watchers, Volume Plugins, Network Plugins, ACL, Priorities, Quotas
  5. Nomad Jobs

    [Diagram labels: Job; Task Group, Task Group; Task A, Task B, Task C, Task D; Allocation, Allocation]
  6. Allocations

    Allocations are instances of a task group on a compute node.
    They are ephemeral in nature.
    Allocations manage the life cycle of the tasks within them.
    They provide the environment and file system for the tasks.
  7. Allocation Resources

    Task resources: CPU shares, memory, network ports and IPs.
    Shared resources: disk resources.
  8. Allocation Directory

    The allocation directory provides a shared data directory for tasks.
    The stdout and stderr streams of tasks are written to the logs directory.
    Each task has a task local directory.
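
A task can locate these directories through environment variables set by the Nomad client. The sketch below assumes `NOMAD_ALLOC_DIR` (the shared allocation directory) and `NOMAD_TASK_DIR` (the task local directory) are present in the task's environment and that shared data lives under a `data` subdirectory. The helper names and file names are hypothetical, chosen only for illustration.

```python
import os
from pathlib import Path


def write_snapshot(payload: bytes, name: str = "snapshot.bin") -> Path:
    """Write payload into the shared data directory of the allocation.

    Assumes NOMAD_ALLOC_DIR is set by the Nomad client. Every task in the
    same allocation can read what is written here.
    """
    shared_data = Path(os.environ["NOMAD_ALLOC_DIR"]) / "data"
    shared_data.mkdir(parents=True, exist_ok=True)
    target = shared_data / name
    target.write_bytes(payload)
    return target


def write_scratch(payload: bytes, name: str = "scratch.tmp") -> Path:
    """Write payload into the task local directory, private to this task.

    Assumes NOMAD_TASK_DIR is set by the Nomad client.
    """
    local_dir = Path(os.environ["NOMAD_TASK_DIR"])
    local_dir.mkdir(parents=True, exist_ok=True)
    target = local_dir / name
    target.write_bytes(payload)
    return target
```

In the `redis` and `backup-agent` pairing shown later in the deck, one task could write with something like `write_snapshot` while the other reads from the same shared directory.
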
  9. Allocations

    Allocations are ephemeral in nature.
    Allocations can be restarted on a different node.
    The allocation directory is not preserved when an allocation is restarted on the same node.
    Allocations are garbage collected by Nomad after they transition to a terminal state.
  10. Sticky Volumes

    Nomad prefers to restart an allocation on the same node to avoid replication.
    Otherwise it replicates the shared data dir and task local dirs of the allocation.
    Replication is best effort; this is not a distributed file system.
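
The placement preference on this slide can be sketched as a toy model. This is not Nomad's scheduler: `Node`, `Placement`, `place_sticky`, and the `needs_migration` flag are hypothetical names, and feasibility is reduced to free disk space in MB.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_disk_mb: int


@dataclass
class Placement:
    node: Node
    needs_migration: bool  # True when data must be copied from the old node


def place_sticky(nodes: List[Node], previous: Optional[str],
                 size_mb: int) -> Optional[Placement]:
    """Pick a node for a restarted allocation, preferring the previous one."""
    feasible = [n for n in nodes if n.free_disk_mb >= size_mb]
    if not feasible:
        return None  # nothing can hold the ephemeral disk
    for node in feasible:
        if node.name == previous:
            # Same node: the data is already there, so nothing is replicated.
            return Placement(node, needs_migration=False)
    # Fall back to the node with the most free disk; if there was a previous
    # node, its data has to be moved on a best-effort basis.
    best = max(feasible, key=lambda n: n.free_disk_mb)
    return Placement(best, needs_migration=previous is not None)
```

The fallback branch is where the best-effort nature matters: if the old node is gone, there is nothing to copy from, which is why this is not a substitute for a distributed file system.
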
  11. cache.nomad

        group "redis" {
          ephemeral_disk {
            sticky = true
            size   = 20000
          }

          task "redis" { … }

          task "backup-agent" { … }
        }
  12. Ephemeral Disk

    [Diagram: two allocations, each with a Shared Allocation Directory, Task A, and Task B, connected by an arrow labeled "Move"]
  13. Disk Watcher

    A watchdog process monitors disk usage.
    Allocations are killed if they use more disk resources than they were allocated.
    If an allocation exceeds its disk quota, the allocation fails permanently.
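
The watchdog idea can be illustrated with a minimal polling loop. This is a sketch under stated assumptions, not Nomad's implementation: the function names, the polling interval, and the `on_exceeded` callback are hypothetical, and usage is measured as the sum of apparent file sizes rather than allocated blocks.

```python
import os
import time
from typing import Callable, Optional


def dir_usage_bytes(root: str) -> int:
    """Sum the apparent sizes of all files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            try:
                total += os.lstat(path).st_size
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
    return total


def exceeds_quota(root: str, quota_mb: int) -> bool:
    """Return True when usage under root is above quota_mb megabytes."""
    return dir_usage_bytes(root) > quota_mb * 1024 * 1024


def watch(root: str, quota_mb: int, on_exceeded: Callable[[], None],
          interval_s: float = 5.0, max_checks: Optional[int] = None) -> bool:
    """Poll root until the quota is exceeded or max_checks is reached.

    Calls on_exceeded once and returns True when the quota is exceeded;
    returns False if max_checks polls pass without exceeding it.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        if exceeds_quota(root, quota_mb):
            on_exceeded()  # e.g. kill the tasks and mark the allocation failed
            return True
        checks += 1
        time.sleep(interval_s)
    return False
```

With the `size = 20000` setting from the `cache.nomad` slide, a call might look like `watch(alloc_dir, 20000, kill_allocation)`, where `alloc_dir` and `kill_allocation` are placeholders supplied by the caller.
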
  14. The Future

    Data volumes to add additional volumes to allocations.
    Volume plugins for materializing volumes on storage services like EBS, EFS, NFS, etc.
    File system drivers for supporting file systems such as ZFS.