Nomad and next generation application architectures

We provide an overview of HashiCorp Nomad, a cluster manager and scheduler. We explore what a cluster scheduler is and the benefits it provides, including higher resource utilization, decoupling of developers and operators, and better quality of service. We then explain Nomad's high-level design and how it supports running over a million containers. Nomad's scalability sets the stage for calling its API from applications to build next-generation applications that blur the line between application and infrastructure.

Armon Dadgar

June 21, 2017

Transcript

  1. Nomad and next-generation
    application architectures

  2. Armon Dadgar
    Founder and CTO
    @armon

  3. HashiCorp Suite
    Layer      Scope                          OSS tool suite              Product suite (for teams)
    CONNECT    Infrastructure & applications  Consul                      Consul Enterprise
    RUN        Applications                   Nomad                       Nomad Enterprise
    SECURE     Infrastructure & applications  Vault                       Vault Enterprise
    PROVISION  Infrastructure                 Terraform, Packer, Vagrant  Terraform Enterprise

  4. Nomad
    Cluster Manager
    Scheduler

  5. Nomad
    Cluster Manager
    Scheduler

  6. Schedulers map a set of
    work to a set of resources
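
As a minimal sketch of that definition, a scheduler can be written as a function from work items to resources. The round-robin policy and the thread and core names below are illustrative assumptions, not how any particular CPU or cluster scheduler decides placement.

```python
from itertools import cycle


def schedule(work, resources):
    """Map each unit of work to a resource, cycling through resources in order."""
    if not resources:
        raise ValueError("no resources available")
    # zip stops when the work list is exhausted; cycle repeats the resources.
    return {item: resource for item, resource in zip(work, cycle(resources))}


# Hypothetical inputs, loosely modeled on a CPU scheduler's threads and cores.
threads = ["web-1", "web-2", "redis-1", "kernel-1"]
cores = ["core-1", "core-2"]
placement = schedule(threads, cores)
```

Real schedulers replace the round-robin step with a policy that weighs priorities, constraints, and current load.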

  7. CPU Scheduler
    Work (Input): Web Server - Thread 1, Web Server - Thread 2, Redis - Thread 1, Kernel - Thread 1
    Scheduler: CPU Scheduler
    Resources: CPU - Core 1, CPU - Core 2

  8. CPU Scheduler
    Work (Input): Web Server - Thread 1, Web Server - Thread 2, Redis - Thread 1, Kernel - Thread 1
    Scheduler: CPU Scheduler
    Resources: CPU - Core 1, CPU - Core 2

  9. Schedulers in the Wild
    Type                      Work              Resources
    CPU Scheduler             Threads           Physical Cores
    AWS EC2 / OpenStack Nova  Virtual Machines  Hypervisors
    Hadoop YARN               MapReduce Jobs    Client Nodes
    Cluster Scheduler         Applications      Servers

  10. Advantages
    Higher Resource Utilization
    Decouple Work from Resources
    Better Quality of Service

  11. Advantages
    Higher Resource Utilization
    Decouple Work from Resources
    Better Quality of Service
    Bin Packing
    Over-Subscription
    Job Queueing
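
To make bin packing and job queueing concrete, here is a small sketch of a best-fit-decreasing placement. It is a textbook heuristic chosen for illustration, not Nomad's actual placement algorithm, and the node names, task names, and resource figures are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cpu_mhz: int      # free CPU remaining on the node
    memory_mb: int    # free memory remaining on the node
    tasks: list = field(default_factory=list)

    def fits(self, cpu, mem):
        return self.cpu_mhz >= cpu and self.memory_mb >= mem

    def place(self, task_name, cpu, mem):
        self.cpu_mhz -= cpu
        self.memory_mb -= mem
        self.tasks.append(task_name)


def bin_pack(tasks, nodes):
    """Place the largest tasks first, each on the feasible node that would be
    left with the least free CPU. Returns the names of tasks that must queue."""
    queued = []
    for name, cpu, mem in sorted(tasks, key=lambda t: t[1], reverse=True):
        candidates = [n for n in nodes if n.fits(cpu, mem)]
        if not candidates:
            queued.append(name)  # no capacity right now: wait in the queue
            continue
        best = min(candidates, key=lambda n: n.cpu_mhz - cpu)
        best.place(name, cpu, mem)
    return queued


nodes = [Node("node-a", 2000, 4096), Node("node-b", 2000, 4096)]
tasks = [
    ("redis", 500, 256),
    ("web", 1000, 512),
    ("batch", 1500, 1024),
    ("cache", 500, 256),
    ("big", 3000, 512),  # larger than any single node, so it queues
]
queued = bin_pack(tasks, nodes)
```

Packing tasks tightly onto fewer nodes is what raises utilization; the queue absorbs work that cannot be placed yet instead of rejecting it.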

  12. Advantages
    Higher Resource Utilization
    Decouple Work from Resources
    Better Quality of Service
    Abstraction
    API Contracts
    Packaging

  13. Advantages
    Higher Resource Utilization
    Decouple Work from Resources
    Better Quality of Service
    Priorities
    Resource Isolation
    Pre-emption

  14. Nomad
    @armon

  15. Nomad
    Cluster Scheduler
    Easily Deploy Applications
    Operationally Simple
    Built for Scale
    API for Next-Gen Patterns

  16. job "redis" {
        datacenters = ["us-east-1"]
        task "redis" {
          driver = "docker"
          config { image = "redis:latest" }
          resources {
            cpu    = 500 # MHz
            memory = 256 # MB
            network {
              mbits = 10
              port "redis" {} # dynamic port labeled "redis"
            }
          }
        }
      }
    example.nomad
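
Because the talk later builds on driving Nomad through its API, it may help to see roughly how the job above could look as a JSON request body. This is a sketch under assumptions: the field names follow the JSON job format of Nomad's HTTP API as best recalled and should be checked against the API documentation for your version, the `Type` and `Count` values are assumed defaults that the slide does not state, and the network stanza is left out.

```python
import json


def redis_job_payload():
    """Build a JSON-serializable body resembling what a job registration
    request to Nomad's HTTP API carries (field names are assumptions)."""
    return {
        "Job": {
            "ID": "redis",
            "Name": "redis",
            "Type": "service",               # assumed; not stated on the slide
            "Datacenters": ["us-east-1"],
            "TaskGroups": [
                {
                    "Name": "redis",
                    "Count": 1,              # assumed single instance
                    "Tasks": [
                        {
                            "Name": "redis",
                            "Driver": "docker",
                            "Config": {"image": "redis:latest"},
                            "Resources": {"CPU": 500, "MemoryMB": 256},
                        }
                    ],
                }
            ],
        }
    }


body = json.dumps(redis_job_payload(), indent=2)
```

The sketch only builds the body; nothing is sent to a cluster.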

  17. Declares what to run

  18. Nomad determines where
    and manages how to run

  19. Nomad abstracts
    work from resources

  20. OS / Workloads / Drivers
    OS: Windows, Linux, BSD, Solaris
    Workloads: Long Running Service, Short Lived Batch, Periodic Cron, System Agents
    Drivers: Docker / Rkt / LXC, Qemu / KVM, "exec" (cgroups + chroot), Static Binaries / Fat JARs

  21. Nomad
    Declarative Jobs
    Infrastructure as Code
    Consul Integration
    Vault Integration
    Composable vs Platform

  22. Empowers developers by
    de-coupling operators

  23. Operationally Simple

  24. Client Server

  25. Built on Experience
    GOSSIP CONSENSUS

  26. Serf
    Cluster Management
    Gossip Based (P2P)
    Membership
    Failure Detection
    Event System
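
To show why gossip scales for membership and event dissemination, here is a toy push-gossip simulation. It is not Serf's actual protocol: the fanout, seed, and node count are arbitrary assumptions, and failure detection is not modeled at all.

```python
import random


def gossip_rounds(n_nodes, fanout=3, seed=0, max_rounds=1000):
    """Simulate push gossip: each round, every informed node tells `fanout`
    randomly chosen nodes. Returns (rounds used, nodes informed)."""
    rng = random.Random(seed)
    informed = {0}  # node 0 originates the event
    rounds = 0
    while len(informed) < n_nodes and rounds < max_rounds:
        newly_informed = set()
        for _ in informed:
            peers = rng.sample(range(n_nodes), k=min(fanout, n_nodes))
            newly_informed.update(peers)
        informed |= newly_informed
        rounds += 1
    return rounds, len(informed)


rounds, reached = gossip_rounds(n_nodes=1000)
```

The informed set grows roughly geometrically in early rounds, which is why the number of rounds grows slowly even as the cluster gets large.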

  27. Serf
    Large Scale
    Production Hardened
    Simple Clustering and
    Federation

  28. Consul
    Service Discovery
    Configuration
    Coordination (Locking)
    Central Servers +
    Distributed Clients

  29. Consul
    Multi-Datacenter
    Raft Consensus
    Large Scale
    Production Hardened

  30. Nomad
    Single Binary
    No Dependencies
    Highly Available
    Multi-DC/Region Support

  31. Built for Scale

  32. Built on Experience
    GOSSIP CONSENSUS
    Mature Libraries Proven Design Patterns
    Lacking Scheduling Logic

  33. Built on Research
    GOSSIP CONSENSUS

  34. Single Region Architecture
    DC1: SERVER (FOLLOWER), CLIENT
    DC2: SERVER (LEADER), CLIENT
    DC3: SERVER (FOLLOWER), CLIENT
    Leader to followers: REPLICATION
    Followers to leader: FORWARDING
    Clients to servers: RPC

  35. Multi Region Architecture
    REGION A: SERVER (FOLLOWER), SERVER (LEADER), SERVER (FOLLOWER)
    REGION B: SERVER (FOLLOWER), SERVER (LEADER), SERVER (FOLLOWER)
    Within each region: REPLICATION and FORWARDING between leader and followers
    Between regions: GOSSIP and REGION FORWARDING

  36. 100’s of Regions
    10,000’s of Clients per Region
    1000’s of Jobs per Region

  37. Nomad
    Inspired by Google Omega
    Optimistic Concurrency
    Service & Batch workloads
    Pluggable Architecture
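
The optimistic concurrency idea can be sketched as follows: schedulers plan in parallel against a snapshot of cluster state, and a single commit point re-checks each plan before applying it. This is a simplified illustration of the pattern, not Nomad's implementation; the class, function, and node names are hypothetical.

```python
class ClusterState:
    def __init__(self, free_cpu):
        self.free_cpu = dict(free_cpu)  # node name -> free CPU in MHz
        self.index = 0                  # incremented on every applied plan

    def snapshot(self):
        """Return a copy that a scheduler can plan against without locking."""
        return dict(self.free_cpu)

    def apply(self, plan):
        """Serialized commit point: re-verify the plan against live state."""
        node, cpu = plan["node"], plan["cpu"]
        if self.free_cpu[node] < cpu:
            return False  # conflict: another plan claimed the capacity first
        self.free_cpu[node] -= cpu
        self.index += 1
        return True


def make_plan(snapshot, cpu):
    """Pick the feasible node with the least free CPU (tightest fit)."""
    fits = [node for node, free in snapshot.items() if free >= cpu]
    if not fits:
        return None
    return {"node": min(fits, key=lambda node: snapshot[node]), "cpu": cpu}


state = ClusterState({"node-a": 1000, "node-b": 800})

# Two schedulers plan concurrently from the same stale view of the cluster.
plan_1 = make_plan(state.snapshot(), 700)
plan_2 = make_plan(state.snapshot(), 700)

first_ok = state.apply(plan_1)    # succeeds
second_ok = state.apply(plan_2)   # rejected: node-b no longer has room

# The rejected scheduler refreshes its snapshot and tries again.
retry_plan = make_plan(state.snapshot(), 700)
retry_ok = state.apply(retry_plan)
```

Planning never blocks on a lock; only the cheap verification step is serialized, and conflicts are resolved by retrying.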

  38. Nomad
    Million Container
    Challenge
    1,000 Jobs
    1,000 Tasks per Job
    5,000 Hosts on GCE
    1,000,000 Containers
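
The arithmetic behind these figures is simple to check. The per-host density is derived here from the slide's numbers and assumes containers are spread evenly across hosts, which the slide does not state.

```python
jobs = 1_000
tasks_per_job = 1_000
hosts = 5_000

containers = jobs * tasks_per_job          # total containers scheduled
containers_per_host = containers // hosts  # average density, assuming an even spread
```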

  39. “640 KB ought to be enough for anybody.”
    - Bill Gates

  40. 2nd Largest Hedge Fund
    18K Cores
    5 Hours
    2,200 Containers/second

  41. Next-Gen Patterns

  42. Monolith
    Micro-Service
    SOA Spectrum

  43. Monolith
    Micro-Service
    SOA Spectrum
    Utility of Service

  44. Monoliths have high
    application complexity

  45. Microservices have high
    operational complexity

  46. Abstractions allow us to
    scale complexity

  47. Frameworks :: Monoliths
    Schedulers :: Services

  48. Schedulers abstract details,
    focus on service composition

  49. Side Cars
    • Sidecar or Co-Process Pattern
    • Application that runs alongside “main” process
    • Nomad “Task Group”
    • Borg “Alloc”
    • Kubernetes “Pod”

  50. Client Node
    Allocation #1: Routing Proxy, Log Shipper, App1
    Allocation #2: Routing Proxy, Log Shipper, App2
    Allocation #3: App3

  51. Side Cars
    • Configuration (Consul-Template)
    • Logging Agents (Splunk, CloudWatch)
    • Telemetry Agents (Datadog)
    • Service Mesh (Envoy, Linkerd)
    • Load Balancing (HAProxy, Nginx, Fabio)

  52. Nomad
    Transparent Scheduling
    API awareness
    Dynamic Behavior

  53. Queues
    • Workers are online service doing batch work
    • Workers provisioned in advance
    • N+1 instances for high availability
    • Typically idle or underutilized

  54. Nomad Dispatch
    • “Dispatch” a worker for each incoming event
    • Consumer launched on-demand and terminates when done
    • Publisher shielded from implementation detail
    • Nomad job acts like a future, queues when busy
    • Avoids underutilization

  55. job "my-dispatch" {
        datacenters = ["dc1"]
        type        = "batch"
        parameterized {
          meta_required = ["input"]
        }
        task "worker" {
          driver = "docker"
          config {
            image = "myworker:latest"
            args  = ["--input", "${NOMAD_META_INPUT}"]
          }
        }
      }
    my-dispatch.job
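
A publisher could trigger this parameterized job through Nomad's HTTP API. The sketch below only builds the request and does not send it. It assumes the job dispatch endpoint (`/v1/job/<job id>/dispatch` with a `Meta` map in the body) and an agent on the default local address; the input value is a made-up placeholder.

```python
import json
import urllib.request

NOMAD_ADDR = "http://127.0.0.1:4646"  # assumption: local agent on the default HTTP port


def build_dispatch_request(job_id, meta, addr=NOMAD_ADDR):
    """Build (but do not send) a request that dispatches one instance of a
    parameterized job, passing `meta` as the job's metadata."""
    body = json.dumps({"Meta": meta}).encode("utf-8")
    return urllib.request.Request(
        url=f"{addr}/v1/job/{job_id}/dispatch",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical input value; "input" matches meta_required in the job file.
req = build_dispatch_request("my-dispatch", {"input": "s3://example-bucket/video.mp4"})
# With a live agent, urllib.request.urlopen(req) would submit the dispatch.
```

Each dispatch creates a new instance of the job, and the worker would receive the value through `NOMAD_META_INPUT`.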

  56. Nomad
    Server
    Register
    Dispatch Job

  57. Nomad
    Server
    Register
    Dispatch Job
    Web Server
    Dispatch

  58. Nomad
    Server
    Register
    Dispatch Job
    Web Server
    Dispatch
    Worker 1
    Schedule

  59. Nomad
    Server
    Register
    Dispatch Job
    Web Server
    Dispatch
    Worker 1
    Schedule
    Worker 2
    Worker N

  60. Function-as-a-Service
    • AWS Lambda
    • Small Granularity
    • Low Volume, Latency Insensitive => Nomad Dispatch
    • High Volume, Latency Sensitive => Setup Overhead Prohibitive

  61. FaaS / Serverless
    • Process multiple events per worker
    • Dynamically scale workers
    • Queue messages to avoid dropping
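
The three points above suggest a small control loop: a controller watches queue depth and adjusts the number of workers. The sizing rule, limits, and action names below are hypothetical, chosen only to illustrate the pattern.

```python
import math


def desired_workers(queue_depth, events_per_worker, min_workers=0, max_workers=50):
    """Size the worker pool so each worker handles about `events_per_worker`
    queued events, clamped to the configured bounds."""
    needed = math.ceil(queue_depth / events_per_worker)
    return max(min_workers, min(max_workers, needed))


def reconcile(running, queue_depth, events_per_worker=100):
    """Return the action a controller would take this tick."""
    target = desired_workers(queue_depth, events_per_worker)
    if target > running:
        return ("dispatch", target - running)  # launch more workers
    if target < running:
        return ("retire", running - target)    # let surplus workers exit
    return ("hold", 0)


scale_up = reconcile(running=2, queue_depth=950)
scale_down = reconcile(running=10, queue_depth=0)
steady = reconcile(running=3, queue_depth=300)
```

In the architecture sketched on the following slides, the "dispatch" action would correspond to the controller dispatching more workers through Nomad while the queue buffers incoming messages.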

  62. Nomad
    Server
    Register
    Dispatch Job
    Controller

  63. Nomad
    Server
    Register
    Dispatch Job
    Web Server
    Push
    Controller
    Dispatch

  64. Nomad
    Server
    Register
    Dispatch Job
    Web Server
    Push
    Worker 1
    Schedule
    Controller
    Pull
    Dispatch

  65. Nomad
    Server
    Register
    Dispatch Job
    Web Server
    Push
    Worker 1
    Schedule
    Controller
    Pull
    Dispatch
    Deep Message Queue

  66. Nomad
    Server
    Register
    Dispatch Job
    Web Server
    Push
    Worker 1
    Schedule
    Worker 2
    Worker N
    Controller
    Pull
    Dispatch

  67. Big Data Processing
    • Large scale batch workload
    • Graph of processing steps
    • Each phase dynamic size
    • Programmatically setup/teardown workers
    • Native Spark Integration!
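
A graph of processing steps with a different worker count per phase can be sketched as a loop over a topological order. This is a generic illustration and not the Spark integration itself; the phase names, executor counts, and the `launch` and `teardown` callbacks are hypothetical stand-ins for calls to a scheduler. It uses `graphlib`, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases it depends on.
phases = {"load": set(), "map": {"load"}, "reduce": {"map"}}
# Each phase asks for a different number of executors.
executors = {"load": 4, "map": 64, "reduce": 8}


def run_pipeline(phases, executors, launch, teardown):
    """Run phases in dependency order, sizing the worker pool per phase."""
    order = []
    for phase in TopologicalSorter(phases).static_order():
        launch(phase, executors[phase])  # set up workers for this phase
        order.append(phase)
        teardown(phase)                  # release them before the next phase
    return order


events = []
order = run_pipeline(
    phases,
    executors,
    launch=lambda phase, count: events.append(("launch", phase, count)),
    teardown=lambda phase: events.append(("teardown", phase)),
)
```

Releasing each phase's workers as soon as it finishes is what lets the cluster reuse that capacity for other jobs.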

  68. Nomad
    Server
    Executor 1
    Schedule
    Executor 2
    Executor N
    Launch
    Executors
    Submit Job

  69. Large-scale cluster management at Google with Borg. Abhishek Verma,
    Luis Pedrosa, Madhukar R. Korupolu, David Oppenheimer, Eric Tune, John
    Wilkes.

  70. Scheduler API blurs line between
    Application and Infrastructure

  71. Nomad enables dynamic behavior
    while optimizing utilization

  72. Nomad
    Cluster Scheduler
    Easily Deploy Applications
    Operationally Simple
    Built for Scale
    API for Next-Gen Patterns

  73. Thanks!
    @armon
