Slide 1

Slide 1 text

Nomad and next-generation application architectures

Slide 2

Slide 2 text

Armon Dadgar Founder and CTO @armon

Slide 3

Slide 3 text

HashiCorp Suite
• CONNECT (infrastructure & applications): Consul, Consul Enterprise
• RUN (applications): Nomad, Nomad Enterprise
• SECURE (infrastructure & applications): Vault, Vault Enterprise
• PROVISION (infrastructure): Terraform, Terraform Enterprise
• OSS TOOL SUITE: Consul, Nomad, Terraform, Vault, Packer, Vagrant
• PRODUCT SUITE (FOR TEAMS): Consul Enterprise, Nomad Enterprise, Vault Enterprise, Terraform Enterprise

Slide 4

Slide 4 text

Nomad Cluster Manager Scheduler

Slide 5

Slide 5 text

Nomad Cluster Manager Scheduler

Slide 6

Slide 6 text

Schedulers map a set of work to a set of resources

Slide 7

Slide 7 text

CPU Scheduler
Work (Input): Web Server - Thread 1, Web Server - Thread 2, Redis - Thread 1, Kernel - Thread 1
Resources: CPU - Core 1, CPU - Core 2

Slide 8

Slide 8 text

CPU Scheduler
Work (Input): Web Server - Thread 1, Web Server - Thread 2, Redis - Thread 1, Kernel - Thread 1
Resources: CPU - Core 1, CPU - Core 2
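The CPU scheduler picture can be sketched in a few lines: work items go in, and each one comes out assigned to a resource. This is a minimal round-robin sketch for illustration only; the function name and the short thread and core labels are assumptions, and real CPU schedulers weigh priorities, fairness, and cache affinity.

```python
from itertools import cycle


def schedule_round_robin(work, resources):
    """Assign each work item to a resource, cycling through resources in order."""
    if not resources:
        raise ValueError("no resources to schedule onto")
    # zip stops when the finite work list is exhausted.
    return {item: resource for item, resource in zip(work, cycle(resources))}


# Hypothetical labels standing in for the threads and cores on the slide.
threads = ["web-1", "web-2", "redis-1", "kernel-1"]
cores = ["core-1", "core-2"]
placement = schedule_round_robin(threads, cores)
```

The same shape applies to every scheduler in this deck: only the kind of work and the kind of resource change.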

Slide 9

Slide 9 text

Schedulers in the Wild
Type                     | Work             | Resources
CPU Scheduler            | Threads          | Physical Cores
AWS EC2 / OpenStack Nova | Virtual Machines | Hypervisors
Hadoop YARN              | MapReduce Jobs   | Client Nodes
Cluster Scheduler        | Applications     | Servers

Slide 10

Slide 10 text

Advantages
• Higher Resource Utilization
• Decouple Work from Resources
• Better Quality of Service

Slide 11

Slide 11 text

Advantages
• Higher Resource Utilization: Bin Packing, Over-Subscription, Job Queueing
• Decouple Work from Resources
• Better Quality of Service
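Bin packing is what drives the utilization gain: tasks are placed onto as few nodes as possible instead of one node per application. The sketch below uses the generic first-fit-decreasing heuristic with a single resource dimension. It is not Nomad's placement algorithm, and the task names and MHz figures are made up for illustration.

```python
def first_fit_decreasing(tasks, node_capacity):
    """Greedily pack tasks onto nodes, largest task first.

    tasks: dict of task name -> required capacity (hypothetical MHz).
    Returns a list of nodes, each a dict with 'free' capacity and 'tasks'.
    """
    nodes = []
    for name, need in sorted(tasks.items(), key=lambda kv: kv[1], reverse=True):
        if need > node_capacity:
            raise ValueError(f"{name} needs {need}, more than one node offers")
        for node in nodes:
            if node["free"] >= need:
                break  # first node with enough room wins
        else:
            # No existing node fits, so bring a new one into use.
            node = {"free": node_capacity, "tasks": []}
            nodes.append(node)
        node["free"] -= need
        node["tasks"].append(name)
    return nodes


demo = {"web": 500, "redis": 500, "batch": 1500, "cron": 250, "agent": 250}
packed = first_fit_decreasing(demo, node_capacity=2000)
```

Here five tasks that might otherwise occupy five machines fit on two nodes.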

Slide 12

Slide 12 text

Advantages
• Higher Resource Utilization
• Decouple Work from Resources: Abstraction, API Contracts, Packaging
• Better Quality of Service

Slide 13

Slide 13 text

Advantages
• Higher Resource Utilization
• Decouple Work from Resources
• Better Quality of Service: Priorities, Resource Isolation, Pre-emption
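Priorities and pre-emption work together: when a node is full, a scheduler can evict lower-priority work to make room for something more important. The following is a generic single-node sketch, not a description of how any particular scheduler implements it; the tuple layout and task names are assumptions.

```python
def place_with_preemption(running, capacity, new_task):
    """Return (evicted_names, placed) for new_task on a single node.

    running: list of (name, priority, size) tuples already on the node.
    new_task: (name, priority, size). Only strictly lower-priority work
    may be evicted, lowest priority first.
    """
    _, new_prio, new_size = new_task
    free = capacity - sum(size for _, _, size in running)
    evicted = []
    candidates = sorted((t for t in running if t[1] < new_prio), key=lambda t: t[1])
    for name, _, size in candidates:
        if free >= new_size:
            break
        evicted.append(name)
        free += size
    if free < new_size:
        return [], False  # still does not fit, so evict nothing
    return evicted, True


running = [("batch-report", 20, 600), ("cache", 50, 300), ("api", 80, 100)]
result = place_with_preemption(running, 1000, ("checkout", 70, 500))
rejected = place_with_preemption(running, 1000, ("big", 30, 900))
```

In the first call only the lowest-priority task is evicted; in the second, eviction would not free enough room, so nothing is disturbed.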

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

 Nomad @armon

Slide 16

Slide 16 text

Nomad
Cluster Scheduler
• Easily Deploy Applications
• Operationally Simple
• Built for Scale
• API for Next-Gen Patterns

Slide 17

Slide 17 text

job "redis" {
  datacenters = ["us-east-1"]

  task "redis" {
    driver = "docker"

    config {
      image = "redis:latest"
    }

    resources {
      cpu    = 500 # MHz
      memory = 256 # MB

      network {
        mbits = 10
        port "redis" {}
      }
    }
  }
}

example.nomad

Slide 18

Slide 18 text

Declares what to run

Slide 19

Slide 19 text

Nomad determines where and manages how to run

Slide 20

Slide 20 text

Nomad abstracts work from resources

Slide 21

Slide 21 text

OS: Windows, Linux, BSD, Solaris
Workloads: Long Running Service, Short Lived Batch, Periodic Cron, System Agents
Drivers: Docker / Rkt / LXC, Qemu / KVM, “exec” cgroups+chroot, Static Binaries / Fat JARs

Slide 22

Slide 22 text

Nomad
• Declarative Jobs
• Infrastructure as Code
• Consul Integration
• Vault Integration
• Composable vs Platform

Slide 23

Slide 23 text

Empowers developers by de-coupling operators

Slide 24

Slide 24 text

Operationally Simple

Slide 25

Slide 25 text

Client Server

Slide 26

Slide 26 text

Built on Experience GOSSIP CONSENSUS

Slide 27

Slide 27 text

Serf
• Cluster Management
• Gossip Based (P2P)
• Membership
• Failure Detection
• Event System

Slide 28

Slide 28 text

Serf
• Large Scale
• Production Hardened
• Simple Clustering and Federation

Slide 29

Slide 29 text

Consul
• Service Discovery
• Configuration
• Coordination (Locking)
• Central Servers + Distributed Clients

Slide 30

Slide 30 text

Consul
• Multi-Datacenter
• Raft Consensus
• Large Scale
• Production Hardened

Slide 31

Slide 31 text

Nomad
• Single Binary
• No Dependencies
• Highly Available
• Multi-DC/Region Support

Slide 32

Slide 32 text

Built for Scale

Slide 33

Slide 33 text

Built on Experience
GOSSIP / CONSENSUS
• Mature Libraries
• Proven Design Patterns
• Lacking Scheduling Logic

Slide 34

Slide 34 text

Built on Research GOSSIP CONSENSUS

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

Single Region Architecture
[Diagram: three SERVERs, one LEADER and two FOLLOWERs. The leader handles REPLICATION to the followers, and followers handle FORWARDING to the leader. CLIENTs in DC1, DC2, and DC3 connect to the servers over RPC.]

Slide 37

Slide 37 text

Multi Region Architecture
[Diagram: REGION A and REGION B, each with three SERVERs (one LEADER, two FOLLOWERs) linked by REPLICATION and FORWARDING within the region. The two regions are connected by GOSSIP and REGION FORWARDING.]

Slide 38

Slide 38 text

• 100s of Regions
• 10,000s of Clients per Region
• 1,000s of Jobs per Region

Slide 39

Slide 39 text

Nomad
• Inspired by Google Omega
• Optimistic Concurrency
• Service & Batch workloads
• Pluggable Architecture
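Optimistic concurrency means schedulers plan in parallel against a snapshot of cluster state without taking locks, and conflicts are detected only when a plan is committed. The sketch below is a simplified, single-process illustration of that idea under assumed names (`ClusterState`, `commit`, `schedule`); it is not Nomad's or Omega's actual implementation.

```python
class ClusterState:
    """Authoritative state: free capacity per node plus a version counter."""

    def __init__(self, free):
        self.free = dict(free)
        self.version = 0

    def snapshot(self):
        """Return a private copy that a scheduler can plan against."""
        return self.version, dict(self.free)

    def commit(self, plan):
        """Apply plan (node -> amount) only if every node still has room."""
        if any(self.free.get(node, 0) < amount for node, amount in plan.items()):
            return False  # conflict: caller should re-plan on a fresh snapshot
        for node, amount in plan.items():
            self.free[node] -= amount
        self.version += 1
        return True


def schedule(state, amount, max_attempts=3):
    """Plan against a snapshot, then try to commit; retry on conflict."""
    for _ in range(max_attempts):
        _, view = state.snapshot()
        fits = [node for node, free in sorted(view.items()) if free >= amount]
        if not fits:
            return None
        if state.commit({fits[0]: amount}):
            return fits[0]
    return None


state = ClusterState({"node-a": 1000, "node-b": 1000})
# Two schedulers both planned 800 units on node-a from the same snapshot.
first = state.commit({"node-a": 800})
second = state.commit({"node-a": 800})  # stale plan, rejected at commit time
retry = schedule(state, 800)            # re-plans on fresh state
```

The rejected scheduler loses only the work of re-planning, which is the trade that lets many schedulers run concurrently.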

Slide 40

Slide 40 text

Nomad Million Container Challenge
• 1,000 Jobs
• 1,000 Tasks per Job
• 5,000 Hosts on GCE
• 1,000,000 Containers

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

“640 KB ought to be enough for anybody.” - Bill Gates

Slide 43

Slide 43 text

2nd Largest Hedge Fund
• 18K Cores
• 5 Hours
• 2,200 Containers/second

Slide 44

Slide 44 text

Next-Gen Patterns

Slide 45

Slide 45 text

Monolith Micro-Service SOA Spectrum

Slide 46

Slide 46 text

Monolith Micro-Service SOA Spectrum Utility of Service

Slide 47

Slide 47 text

Monoliths have high application complexity

Slide 48

Slide 48 text

Microservices have high operational complexity

Slide 49

Slide 49 text

Abstractions allow us to scale complexity

Slide 50

Slide 50 text

Frameworks :: Monoliths
Schedulers :: Services

Slide 51

Slide 51 text

Schedulers abstract details, focus on service composition

Slide 52

Slide 52 text

Side Cars
• Sidecar or Co-Process Pattern
• Application that runs alongside “main” process
• Nomad “Task Group”
• Borg “Alloc”
• Kubernetes “Pod”

Slide 53

Slide 53 text

Client Node
• Allocation #1: Routing Proxy, Log Shipper, App1
• Allocation #2: Routing Proxy, Log Shipper, App2
• Allocation #3: App3

Slide 54

Slide 54 text

Side Cars
• Configuration (Consul-Template)
• Logging Agents (Splunk, CloudWatch)
• Telemetry Agents (Datadog)
• Service Mesh (Envoy, Linkerd)
• Load Balancing (HAProxy, Nginx, Fabio)

Slide 55

Slide 55 text

Nomad Transparent Scheduling API awareness Dynamic Behavior

Slide 56

Slide 56 text

Queues

Slide 57

Slide 57 text

Queues
• Workers are online services doing batch work
• Workers provisioned in advance
• N+1 instances for high availability
• Typically idle or underutilized

Slide 58

Slide 58 text

Nomad Dispatch
• “Dispatch” a worker for each incoming event
• Consumer launched on-demand and terminates when done
• Publisher shielded from implementation detail
• Nomad job acts like a future, queues when busy
• Avoids underutilization

Slide 59

Slide 59 text

job "my-dispatch" {
  datacenters = ["dc1"]
  type        = "batch"

  parameterized {
    meta_required = ["input"]
  }

  task "worker" {
    driver = "docker"

    config {
      image = "myworker:latest"
      args  = ["--input", "${NOMAD_META_INPUT}"]
    }
  }
}

my-dispatch.job
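Once a parameterized job like this is registered, a publisher can dispatch one instance per event through Nomad's HTTP API (`POST /v1/job/:job_id/dispatch`, with a `Meta` map and a base64-encoded `Payload`). The sketch below only builds the request and does not send it. The helper name, the local agent address, and the example `input` value are assumptions for illustration.

```python
import base64
import json
import urllib.request


def build_dispatch_request(job_id, meta, payload=b"", addr="http://127.0.0.1:4646"):
    """Build (but do not send) a request that dispatches a parameterized job."""
    body = {
        "Meta": meta,  # must include every key listed in meta_required
        "Payload": base64.b64encode(payload).decode("ascii"),
    }
    return urllib.request.Request(
        url=f"{addr}/v1/job/{job_id}/dispatch",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical event: the "input" value is made up for this example.
req = build_dispatch_request("my-dispatch", {"input": "s3://bucket/object-1"})
# To dispatch for real: urllib.request.urlopen(req) against a reachable Nomad agent.
```

Keeping request construction separate from sending makes the publisher easy to test without a running cluster.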

Slide 60

Slide 60 text

Nomad Server Register Dispatch Job

Slide 61

Slide 61 text

Nomad Server Register Dispatch Job Web Server Dispatch

Slide 62

Slide 62 text

Nomad Server Register Dispatch Job Web Server Dispatch Worker 1 Schedule

Slide 63

Slide 63 text

Nomad Server Register Dispatch Job Web Server Dispatch Worker 1 Schedule Worker 2 Worker N

Slide 64

Slide 64 text

Function-as-a-Service
• AWS Lambda
• Small Granularity
• Low Volume, Latency Insensitive => Nomad Dispatch
• High Volume, Latency Sensitive => Setup Overhead Prohibitive

Slide 65

Slide 65 text

FaaS / Serverless
• Process multiple events per worker
• Dynamically scale workers
• Queue messages to avoid dropping

Slide 66

Slide 66 text

Nomad Server Register Dispatch Job Controller

Slide 67

Slide 67 text

Nomad Server Register Dispatch Job Web Server Push Controller Dispatch

Slide 68

Slide 68 text

Nomad Server Register Dispatch Job Web Server Push Worker 1 Schedule Controller Pull Dispatch

Slide 69

Slide 69 text

Nomad Server Register Dispatch Job Web Server Push Worker 1 Schedule Controller Pull Dispatch Deep Message Queue

Slide 70

Slide 70 text

Nomad Server Register Dispatch Job Web Server Push Worker 1 Schedule Worker 2 Worker N Controller Pull Dispatch
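The controller in these diagrams watches the message queue and dispatches more workers as it deepens. One way to sketch that decision is to size the worker pool from queue depth. The function names, the per-worker throughput figure, and the worker cap below are all assumptions; the deck does not specify a scaling policy.

```python
import math


def desired_workers(queue_depth, per_worker, min_workers=0, max_workers=50):
    """Workers needed to drain queue_depth if each handles per_worker messages."""
    if per_worker <= 0:
        raise ValueError("per_worker must be positive")
    wanted = math.ceil(queue_depth / per_worker)
    return max(min_workers, min(max_workers, wanted))


def workers_to_dispatch(queue_depth, running, per_worker, max_workers=50):
    """How many additional workers the controller should dispatch right now."""
    target = desired_workers(queue_depth, per_worker, max_workers=max_workers)
    return max(0, target - running)


# Hypothetical numbers: 250 queued messages, 1 worker running, 100 messages each.
extra = workers_to_dispatch(queue_depth=250, running=1, per_worker=100)
```

Because each worker pulls many messages, container setup cost is paid once per worker rather than once per event, and the queue absorbs bursts while new workers start.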

Slide 71

Slide 71 text

Big Data Processing
• Large scale batch workload
• Graph of processing steps
• Each phase dynamic size
• Programmatically setup/teardown workers
• Native Spark Integration!

Slide 72

Slide 72 text

Nomad Server Executor 1 Schedule Executor 2 Executor N Launch Executors Submit Job

Slide 73

Slide 73 text

Large-scale cluster management at Google with Borg. Abhishek Verma, Luis Pedrosa, Madhukar R. Korupolu, David Oppenheimer, Eric Tune, John Wilkes.

Slide 74

Slide 74 text

Scheduler API blurs line between Application and Infrastructure

Slide 75

Slide 75 text

Nomad enables dynamic behavior while optimizing utilization

Slide 76

Slide 76 text

Nomad
Cluster Scheduler
• Easily Deploy Applications
• Operationally Simple
• Built for Scale
• API for Next-Gen Patterns

Slide 77

Slide 77 text

 Thanks! @armon