Slide 1

Slide 1 text

HASHICORP
Taming the modern public and private clouds with Nomad
Diptanu Gon Choudhury (@diptanu)
PhillyETE 2016

Slide 2

Slide 2 text

Evolution of compute infrastructure
Timeline: 1995, 2000, 2015

Slide 3

Slide 3 text

Evolution of compute infrastructure

Slide 4

Slide 4 text

Evolution of compute infrastructure
Global Public Cloud: AWS - US-West-2, AWS - US-East-1, GCP - US-Central-1
Private Clouds

Slide 5

Slide 5 text

Challenges of the modern cloud
Tens of thousands of compute nodes to manage
Compute clusters are spread across the globe
Static and offline partitioning of clusters is no longer efficient

Slide 6

Slide 6 text

Challenges of the modern cloud
Heterogeneous APIs for accessing compute infrastructure
Heterogeneous primitives for managing network, secrets, etc.

Slide 7

Slide 7 text

Evolution of application architecture
SOA and microservices are replacing monoliths
Distributed systems are the new normal

Slide 8

Slide 8 text

Challenges in running modern services
Orchestrated deployment and rollback strategies
More failure modes

Slide 9

Slide 9 text

Cluster Schedulers to the rescue
Decouple Work from Resources
Better Quality of Service
Higher Resource Utilization

Slide 10

Slide 10 text

Nomad
Multi-Datacenter
Multi-Region
Flexible Workloads
Job Priorities
Bin Packing
Large Scale
Operationally Simple

Slide 11

Slide 11 text

Nomad as the Cluster Scheduler
Higher Resource Utilization: Bin Packing, Job Queueing, Over-Subscription
Decouple Work from Resources
Better Quality of Service

Slide 12

Slide 12 text

Nomad as the Cluster Scheduler
Higher Resource Utilization
Decouple Work from Resources: Abstraction, API Contracts, Standardization
Better Quality of Service

Slide 13

Slide 13 text

Nomad as the Cluster Scheduler
Higher Resource Utilization
Decouple Work from Resources
Better Quality of Service: Priorities, Resource Isolation, Pre-emption
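
As a rough illustration of the Priorities item, a job file can carry a priority that the scheduler weighs when ordering work. The sketch below assumes the job-level `priority` attribute (an integer from 1 to 100, higher meaning more important). The job name, image, and the value 80 are made up for illustration, and whether lower-priority work is actually pre-empted depends on the Nomad version in use.

```hcl
# Hypothetical high-priority job; names and values are illustrative only.
job "billing-api" {
  datacenters = ["us-east-1"]

  # Assumed scale: 1-100, higher is more important (50 is the usual default).
  priority = 80

  group "api" {
    task "server" {
      driver = "docker"

      config {
        image = "example/billing-api:1.0" # hypothetical image name
      }

      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
```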

Slide 14

Slide 14 text

Job Specification
Declares what to run

Slide 15

Slide 15 text

example.nomad

    # Define our simple redis job
    job "redis" {
      # Run only in us-east-1
      datacenters = ["us-east-1"]

      # Define the single redis task using Docker
      task "redis" {
        driver = "docker"

        config {
          image = "redis:latest"
        }

        resources {
          cpu    = 500 # MHz
          memory = 256 # MB

          network {
            mbits         = 10
            dynamic_ports = ["redis"]
          }
        }
      }
    }

Slide 16

Slide 16 text

Job Specification
Nomad determines where and manages how to run

Slide 17

Slide 17 text

Job Specification
Abstract work from resources

Slide 18

Slide 18 text

Supports multiple Clouds, DCs and Regions
Resources across DCs are presented as a single pool
Developers can target multiple datacenters in the same job file
Unified interface for developers across clouds
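
Targeting several datacenters from one job file might look like the sketch below. It reuses the `datacenters` list from the redis example in this deck. The second datacenter name, the group name, and the count are assumptions added for illustration.

```hcl
# Sketch: one job, several datacenters. Nomad may place the two
# instances in either datacenter listed here.
job "redis" {
  # "us-west-2" is a hypothetical second datacenter name.
  datacenters = ["us-east-1", "us-west-2"]

  group "cache" {
    count = 2 # illustrative instance count

    task "redis" {
      driver = "docker"

      config {
        image = "redis:latest"
      }

      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
```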

Slide 19

Slide 19 text

Unified interface across hybrid clouds
Nomad Job Spec
AWS, GCP, Azure, On-Prem DC

Slide 20

Slide 20 text

Single Region Architecture
[Diagram: three servers (one leader, two followers) linked by replication and forwarding; clients in DC1, DC2 and DC3 reach the servers over RPC]

Slide 21

Slide 21 text

Multi Region Architecture
[Diagram: Region A and Region B each run three servers (one leader, two followers) with replication and forwarding inside the region; the two regions are linked by gossip and request forwarding]

Slide 22

Slide 22 text

Nomad
Region is Isolation Domain
1-N Datacenters Per Region
Flexibility to do 1:1 (Consul)
Scheduling Boundary
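
Because the region is the scheduling boundary, a job is submitted to exactly one region and can then span that region's datacenters. A minimal sketch, assuming the job-level `region` attribute; the region and datacenter names below are hypothetical.

```hcl
# Sketch: pin a job to one region, spread over that region's datacenters.
job "redis" {
  region      = "region-a"     # hypothetical region name
  datacenters = ["dc1", "dc2"] # hypothetical datacenters inside that region

  group "cache" {
    task "redis" {
      driver = "docker"

      config {
        image = "redis:latest"
      }

      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
```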

Slide 23

Slide 23 text

Data Model

Slide 24

Slide 24 text

Evaluations ~= State Change Event

Slide 25

Slide 25 text

Create / Update / Delete Job
Node Up / Node Down
Allocation Failed

Slide 26

Slide 26 text

External Event
Evaluation Creation
Evaluation Queuing
Evaluation Processing
Optimistic Coordination
State Updates

Slide 27

Slide 27 text

Scheduler Architecture
Concurrent and optimistic scheduling
Event-driven invocation of schedulers
No head-of-line blocking for different types of workloads

Slide 28

Slide 28 text

Client Architecture
Broad OS Support
Host Fingerprinting
Pluggable Drivers
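
Attributes discovered by host fingerprinting can be referenced from a job through constraints. The sketch below assumes the `constraint` block and the `${attr.kernel.name}` attribute; the interpolation syntax has changed between Nomad releases, so treat the exact form as an assumption to check against the version in use.

```hcl
# Sketch: only place this job on nodes whose fingerprint reports Linux.
job "redis" {
  datacenters = ["us-east-1"]

  constraint {
    attribute = "${attr.kernel.name}" # fingerprinted by the client
    value     = "linux"
  }

  group "cache" {
    task "redis" {
      driver = "docker"

      config {
        image = "redis:latest"
      }

      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
```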

Slide 29

Slide 29 text

Drivers
Execute Tasks
Provide Resource Isolation

Slide 30

Slide 30 text

Containerized: Docker, Rocket
Virtualized: Qemu / KVM
Standalone: Java Jar, Static Binaries
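
Switching drivers is mostly a matter of changing the `driver` name and its `config` block. As a sketch of the static-binary case, the job below assumes the `exec` driver with `command` and `args` keys; the job, group, and task names are hypothetical, and the `args` format has varied across Nomad versions.

```hcl
# Sketch: run a binary already present on the host with the exec driver.
job "sleeper" {
  datacenters = ["us-east-1"]

  group "demo" {
    task "sleep" {
      driver = "exec"

      config {
        command = "/bin/sleep"
        args    = ["60"] # list form; older releases used a different format
      }

      resources {
        cpu    = 100 # MHz
        memory = 64  # MB
      }
    }
  }
}
```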

Slide 31

Slide 31 text

Containerized: Docker, Rocket, Windows Server Containers
Virtualized: Qemu / KVM, Hyper-V, Xen
Standalone: Java Jar, Static Binaries, C#

Slide 32

Slide 32 text

Maintenance Primitives
First-class support for doing maintenance on nodes
Drain allocations running on a node

    nomad node-drain -enable 149cc920
    Are you sure you want to enable drain mode for node "149cc920"? [y/N]

Slide 33

Slide 33 text

Service Discovery Aware
Allows developers to define services exposed by a job
Keeps services and checks synced

Slide 34

Slide 34 text

example.nomad

    job "redis" {
      task "redis" {
        ………
        service {
          name = "binstore"
          tags = ["env:staging", "stack:beta"]
          port = "http"

          check {
            name     = "binstore-http"
            type     = "http"
            path     = "/status"
            interval = "30s"
            timeout  = "2s"
          }
        }
        …………
      }
    }

Slide 35

Slide 35 text

System Job Scheduler
Runs a job on every node in the cluster
Great for running monitoring, logging, and auditing software
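
A job opts into the system scheduler through its type. The sketch below assumes the job-level `type = "system"` attribute; the job name, group name, and image are hypothetical stand-ins for a log-shipping agent.

```hcl
# Sketch: one copy of this task on every eligible node.
job "log-shipper" {
  datacenters = ["us-east-1"]

  # Selects the system scheduler instead of the default service scheduler.
  type = "system"

  group "agents" {
    task "shipper" {
      driver = "docker"

      config {
        image = "example/log-shipper:1.0" # hypothetical image name
      }

      resources {
        cpu    = 100 # MHz
        memory = 64  # MB
      }
    }
  }
}
```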

Slide 36

Slide 36 text

Log Management
Takes care of rotating logs of services
Log forwarding coming soon
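
Rotation limits can be set per task. The sketch below assumes the task-level `logs` block with `max_files` and `max_file_size` (in MB); the values are illustrative, and availability of this block depends on the Nomad version.

```hcl
# Sketch: keep at most 5 rotated log files of 10 MB each for this task.
job "redis" {
  datacenters = ["us-east-1"]

  group "cache" {
    task "redis" {
      driver = "docker"

      config {
        image = "redis:latest"
      }

      logs {
        max_files     = 5  # illustrative value
        max_file_size = 10 # MB, illustrative value
      }

      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
```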

Slide 37

Slide 37 text

Thanks!
https://github.com/hashicorp/nomad
https://www.nomadproject.io/