Scheduling at Scale

Docker and rkt have made it easy to package and ship applications, but running them at scale remains a challenge. Also, not all organizations have the bandwidth to containerize their workloads. Nomad, a single-binary cluster scheduler, can be used to build a multi-region, self-healing production environment that runs a diverse set of workloads, including non-containerized applications. We will see how Nomad executes such applications natively, without requiring them to be packaged in a Docker image. We will also look at how Nomad and Vault can be integrated to provide dynamically generated TLS certificates and secrets to an application, how Consul and Consul Template allow us to provide configuration management and feature flagging, and how zero-downtime updates can be performed with blue/green and canary deployments. This talk will discuss the theory and also showcase a live demo of running an application on Nomad, highlighting how Nomad, Consul, and Vault can be used together to orchestrate your applications.

Anubhav Mishra

June 05, 2018

Transcript

  1. Provision, Secure and Run Any Infrastructure
     OSS tool suite: Nomad, Consul, Vault, Vagrant, Packer, Terraform
     Product suite: Nomad Enterprise, Consul Enterprise, Vault Enterprise, Terraform Enterprise
     Audiences: for individuals, for teams
     Run: Applications. Secure: Application Infrastructure. Provision: Infrastructure.
  2. Scheduling: Assigning an appropriate number of workers to the jobs during each day of work. [1] (Copyright © 2017 HashiCorp, @anubhavm)
     [1] Read more: http://www.businessdictionary.com/definition/scheduling.html
  3. Scheduler: A person or machine that helps scheduling during each day of work.
  4. Scheduler (Computing): A computer program that controls or manages the execution of jobs / processes / operations.
  5. CPU Scheduler (diagram labels: Apache, Redis, Bash; kernel; CPU scheduler; four cores)
  6. CPU Scheduler (diagram labels: Apache, Redis, Bash; kernel; CPU scheduler; two cores)
  7. CPU Scheduler (diagram labels: Apache, Redis, Bash; kernel; CPU scheduler; two cores)
  8. CPU Scheduler (diagram labels: Apache, Redis, Bash; kernel; CPU scheduler; two cores)
  9. Scheduler Advantages: Higher Resource Utilization; Decouple Work from Resources; Better Quality of Service
  10. Scheduler Advantages: Higher Resource Utilization (Bin Packing, Over-Subscription, Job Queueing); Decouple Work from Resources; Better Quality of Service
  11. Scheduler Advantages: Higher Resource Utilization; Decouple Work from Resources (Abstraction, API Contracts, Standardization); Better Quality of Service
  12. Scheduler Advantages: Higher Resource Utilization; Decouple Work from Resources; Better Quality of Service (Priorities, Resource Isolation, Pre-emption)
  13. job "redis" {
        datacenters = ["us-east-1"]

        task "redis" {
          driver = "docker"

          config {
            image = "redis:latest"
          }

          resources {
            cpu    = 500 # MHz
            memory = 256 # MB

            network {
              mbits = 10
              port "redis" {}
            }
          }
        }
      }
  14. job "webserver" {
        datacenters = ["us-east-1"]

        task "webserver" {
          driver = "exec"

          config {
            command = "yet-another-golang-webserver-linux_amd64"
          }

          artifact {
            source = "https://github.com/anubhavmishra/yet-another-golang-webserver/releases/download/v1.0.0/yet-another-golang-webserver-linux_amd64"
          }

          resources {
            cpu    = 500 # MHz
            memory = 128 # MB

            network {
              port "http" {
                static = 8080
              }
            }
          }
        }
      }
  15. Scaling Requirements: Thousands of regions; Tens of thousands of clients per region; Thousands of jobs per region
  16. Our Past Experience: Gossip; Consensus; Mature Libraries; Proven Design Patterns
  17. Our Past Experience: Gossip; Consensus; Mature Libraries; Proven Design Patterns; ?
  18. Optimistic vs Pessimistic; Internal vs External State; Single vs Multi Level; Fixed vs Pluggable; Service vs Batch Oriented
  19. Inspired by Google Omega; Optimistic Concurrency; State Coordination; Service & Batch workloads; Pluggable Architecture
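
Slide 19 lists both service and batch workloads. As a minimal sketch of how that distinction might appear in a job file, the job below sets `type = "batch"` so the task runs to completion instead of being kept running as a service. The job, group, and task names and the command are hypothetical and do not come from the talk.

```hcl
# Hypothetical batch job: runs to completion rather than as a long-lived service.
job "nightly-report" {
  datacenters = ["us-east-1"]

  # "service" is the default job type; "batch" marks a run-to-completion job.
  type = "batch"

  group "report" {
    task "generate" {
      driver = "exec"

      config {
        command = "/bin/sh"
        args    = ["-c", "echo generating report"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 64  # MB
      }
    }
  }
}
```

Omitting the `type` line would leave the job as a service, which the scheduler keeps running and restarts on failure.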
  20. Consul Cluster (diagram: two datacenters, each with three servers linked by replication; clients reach the servers over RPC and LAN gossip; the datacenters are joined by WAN gossip)
  21. Single Region Architecture (diagram: three servers, one leader and two followers, linked by replication and forwarding; clients in DC1, DC2, and DC3 connect over RPC)
  22. Single Region Architecture (diagram: Region A and Region B, each with one leader and two follower servers linked by replication and forwarding; the regions are connected by gossip and region forwarding)
  23. Region is Isolation Domain; 1-N Datacenters Per Region; Flexibility to do 1:1 (Consul); Scheduling Boundary
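
The region and datacenter relationship on slide 23 can be sketched in a job file: a job is submitted to a single region and may be placed in any of the datacenters it lists within that region. The region name `us` and the second datacenter are assumptions for illustration only.

```hcl
job "redis" {
  # The region is the scheduling boundary; Nomad uses "global" when it is omitted.
  region = "us" # hypothetical region name

  # One region may span several datacenters; the task can land in either.
  datacenters = ["us-east-1", "us-west-1"]

  group "cache" {
    task "redis" {
      driver = "docker"

      config {
        image = "redis:latest"
      }

      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
```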
  24. Server Architecture: Omega Class Scheduler; Pluggable Logic; Internal Coordination and State; Multi-Region / Multi-Datacenter
  25. Client Architecture: Broad OS Support; Host Fingerprinting; Pluggable Drivers
  26. Fingerprinting
      Type                  Examples
      Operating System      Kernel, OS, Version
      Hardware              CPU, Memory, Disk
      Apps (Capabilities)   Docker, Java, Consul
      Environment           AWS, GCE
  27. “Task Requires Linux, Docker, and PCI-Compliant Hardware” expressed as constraints in job file
  28. “Task needs 512MB RAM and 1 Core” expressed as resource in job file
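
The two statements on slides 27 and 28 might translate into a job file roughly as follows. This is a sketch under stated assumptions, not the job file from the talk: the job, group, and task names are made up, the `pci_compliant` metadata key is hypothetical and would have to be set by operators in each client's `meta` configuration, and the `cpu` value is an assumed stand-in for "1 core" because `cpu` is expressed in MHz.

```hcl
job "payments" {
  datacenters = ["us-east-1"]

  # "Task requires Linux": match the kernel name fingerprinted on each client.
  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  # "PCI-compliant hardware": relies on a hypothetical `pci_compliant` key
  # that operators would add to the `meta` block of qualifying clients.
  constraint {
    attribute = "${meta.pci_compliant}"
    value     = "true"
  }

  group "api" {
    task "api" {
      # "Task requires Docker": choosing the docker driver limits placement
      # to clients where the Docker driver was detected.
      driver = "docker"

      config {
        image = "redis:latest" # placeholder image
      }

      resources {
        memory = 512  # MB
        cpu    = 1000 # MHz; assumed approximation of "1 core"
      }
    }
  }
}
```

Constraints filter out clients that cannot run the task, while the `resources` block tells the scheduler how much capacity to reserve on the client it picks.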
  29. Containerized: Docker, rkt, LXC, Windows Server Containers
      Virtualized: Qemu / KVM, Hyper-V, Xen
      Standalone: Java Jar, Static Binaries, C#
  30. “640 KB ought to be enough for anybody.” - Bill Gates
  31. 2nd Largest Hedge Fund: 18K Cores; 5 Hours; 2,200 Containers/second
  32. 7+ Million Builds a Month; Sustain 400-1000 Jobs a Minute; Great Talk By Danielle Tomlinson: https://youtu.be/b8NQO_vFAYo
  33. Higher Resource Utilization; Decouple Work from Resources; Better Quality of Service