HASHICORP
Schedulers map a set of work to a
set of resources
Slide 8
CPU Scheduler
[Diagram: the CPU scheduler maps work (Web Server threads 1 and 2, a Redis thread, a kernel thread) onto resources (CPU cores 1 and 2)]
Slide 10
Schedulers In the Wild
Type                     | Work             | Resources
CPU Scheduler            | Threads          | Physical Cores
AWS EC2 / OpenStack Nova | Virtual Machines | Hypervisors
Hadoop YARN              | MapReduce Jobs   | Client Nodes
Cluster Scheduler        | Applications     | Servers
Slide 11
Advantages
Higher Resource Utilization
Decouple Work from Resources
Better Quality of Service
Slide 12
Advantages
Higher Resource Utilization: Bin Packing, Over-Subscription, Job Queueing
Decouple Work from Resources
Better Quality of Service
Slide 13
Advantages
Higher Resource Utilization
Decouple Work from Resources: Abstraction, API Contracts, Standardization
Better Quality of Service
Slide 14
Advantages
Higher Resource Utilization
Decouple Work from Resources
Better Quality of Service: Priorities, Resource Isolation, Pre-emption
job "foobar" {
  group "api" {
    # Scale our service up
    count = 5
    …
  }
}
Slide 31
job "foobar" {
  group "api" {
    # Scale our service down
    count = 3
    …
  }
}
Slide 32
job "foobar" {
  group "hdfs-data-node" {
    # Ensure the scheduler does not put
    # multiple instances on one host
    constraint {
      distinct_hosts = true
    }
    …
  }
}
Slide 33
job "foobar" {
  group "hdfs-data-node" {
    # Attempt restart of tasks if they
    # fail unexpectedly
    restart {
      attempts = 5
      interval = "10m"
      delay    = "30s"
    }
    …
  }
}
Slide 34
job "foobar" {
  task "my-app" {
    # Ensure modern kernel available
    constraint {
      attribute = "kernel.version"
      version   = ">= 3.14"
    }
    …
  }
}
job "foobar" {
  task "my-app" {
    # Register with Consul for service
    # discovery and health checking
    service {
      port = "http"
      check {
        type     = "tcp"
        interval = "10s"
      }
    }
    …
  }
}
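Once the service stanza above registers the task with Consul, the service becomes discoverable through Consul's DNS interface. A hypothetical lookup, assuming a local Consul agent serving DNS on port 8600 and a registered service name of "foobar-my-app" (the actual name Nomad assigns is not shown on the slide):

```shell
# Assumes a local Consul agent with DNS on port 8600;
# the service name "foobar-my-app" is an illustrative guess
dig @127.0.0.1 -p 8600 foobar-my-app.service.consul SRV
```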
Slide 37
job "foobar" {
  # Make sure this task runs everywhere
  type = "system"
  # Nothing should evict our collector
  priority = 100
  task "stats-collector" {
    …
  }
}
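Taken together, the stanzas shown on these slides compose into a single job file. A minimal sketch under stated assumptions: the docker driver, image name, datacenter name, and resource figures below are illustrative, not from the slides.

```hcl
# example.nomad — illustrative sketch combining the stanzas above
job "foobar" {
  datacenters = ["dc1"]   # assumed datacenter name
  type = "service"

  group "api" {
    count = 3

    # Restart tasks that fail unexpectedly
    restart {
      attempts = 5
      interval = "10m"
      delay    = "30s"
    }

    task "my-app" {
      # Driver and image are assumptions for illustration
      driver = "docker"
      config {
        image = "hashicorp/http-echo"
      }

      # Ensure modern kernel available
      constraint {
        attribute = "kernel.version"
        version   = ">= 3.14"
      }

      # Resource ask used for bin packing (figures assumed)
      resources {
        cpu    = 500  # MHz
        memory = 256  # MB
      }
    }
  }
}
```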
Slide 38
Terminal
$ nomad agent -dev
==> Starting Nomad agent...
==> Nomad agent configuration:
Atlas:
Client: true
Log Level: DEBUG
Region: global (DC: dc1)
Server: true
==> Nomad agent started! Log data will stream in below:
[INFO] serf: EventMemberJoin: nomad.global 127.0.0.1
[INFO] nomad: starting 4 scheduling worker(s) for [service batch _core]
[INFO] raft: Node at 127.0.0.1:4647 [Follower] entering Follower state
[INFO] nomad: adding server nomad.global (Addr: 127.0.0.1:4647) (DC: dc1)
[DEBUG] client: applied fingerprints [storage arch cpu host memory]
[DEBUG] client: available drivers [docker exec]
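With the dev agent running, jobs can be submitted from a second terminal. A hypothetical session (the file and job names are assumed):

```shell
$ nomad init                # generates a starter example.nomad in the cwd
$ nomad run example.nomad   # submits the job and prints an evaluation ID
$ nomad status example      # shows the job's current status
```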
Slide 39
Nomad
Infrastructure As Code
Declarative Jobs
Desired State
Emergent State
Slide 40
Operationally Simple
Slide 41
Client / Server
Slide 42
Built for Scale
Slide 43
Built on Experience
Gossip, Consensus
Slide 44
Built on Research
Gossip, Consensus
Slide 45
Single Region Architecture
[Diagram: a single region with three servers (one LEADER, two FOLLOWERs); the leader replicates state to the followers, which forward requests to it; clients in DC1, DC2, and DC3 make RPCs to the servers]
Slide 46
Multi Region Architecture
[Diagram: two regions, A and B, each with three servers (one LEADER, two FOLLOWERs) replicating state and forwarding requests internally; the regions are federated via gossip, with region forwarding between them]
Slide 47
Nomad
Region is Isolation Domain
1-N Datacenters Per Region
Flexibility to do 1:1 (Consul)
Scheduling Boundary
Slide 48
Thousands of regions
Tens of thousands of clients per region
Thousands of jobs per region
Slide 49
Optimistically Concurrent
Slide 50
Data Model
Slide 51
Evaluations ~= State Change Event
Slide 52
Create / Update / Delete Job
Node Up / Node Down
Allocation Failed
Server Architecture
Omega Class Scheduler
Pluggable Logic
Internal Coordination and State
Multi-Region / Multi-Datacenter
Slide 61
Client Architecture
Broad OS Support
Host Fingerprinting
Pluggable Drivers
Slide 62
Fingerprinting
Type             | Examples
Operating System | Kernel, OS, Versions
Hardware         | CPU, Memory, Disk
Applications     | Java, Docker, Consul
Environment      | AWS, GCE
Slide 63
Fingerprinting
Constrain Placement and Bin Pack
Slide 64
Drivers
Execute Tasks
Provide Resource Isolation
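For example, the exec driver (one of the two the dev agent reported: "available drivers [docker exec]") runs a command directly on the host while isolating its resources. A sketch; the command path and resource figures are hypothetical placeholders:

```hcl
task "stats-collector" {
  # exec runs a host command with resource isolation
  driver = "exec"
  config {
    command = "/usr/local/bin/collect"  # hypothetical binary
  }
  # Resource ask used for isolation and bin packing (figures assumed)
  resources {
    cpu    = 250  # MHz
    memory = 128  # MB
  }
}
```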