When Mesos met Consul

Given to Mesos NYC June 17th, 2015

Combining Mesos and Consul to build out a powerful next-generation platform.

Steven Borrelli

June 17, 2015

Transcript

  1. WHEN MESOS MET CONSUL
    NYC MESOS MEETUP
    JUNE 2015
    Steven Borrelli
    @stevendborrelli

  2. ABOUT ME
    FOUNDED ASTERIS (2014)
    SYSTEMS ENGINEERING, HPC, BIG DATA & CLOUD
    FOCUS ON DISTRIBUTED COMPUTING, STREAMING DATA, AND CONTINUOUS DELIVERY

  3. Today we’re going to talk about

  4. BUILDING THE NEXT GENERATION OF INFRASTRUCTURE

  5. The Problem:
    A Rapidly Changing Computing
    Environment

  6. DIVERSE REQUIREMENTS
    Development Analytics
    Engineering

  7. EMERGING USE CASES
    Mobile First
    Microservices
    Multiple Languages
    Continuous Delivery
    Streaming Data
    Unstructured Data
    Multiple Data Stores
    ETL/Workflows
    Cloud
    DevOps

  8. Our Favorite Solution

  9. Combine Compute and Storage
    Resources
    Run Diverse Workloads

  10. The Strength of Mesos is in the
    Frameworks

  11. FRAMEWORKS
    App-specific:
    Generic:

  12. MESOS CHALLENGES
    • Deployment
    • Framework Development
    • Security & Management
    • Monitoring
    • Service Discovery

  13. Let’s Make Mesos Better

  14. Consul is Part of the HashiCorp Ecosystem:

  15. CONSUL FEATURES

  16. RUNNING CONSUL
    • Single binary (Go)
    • Run on every system
    • 1-7 servers per datacenter; the rest of the systems are clients
    • Config via .json files or CLI parameters
    • Optional Web UI

  17. DON’T RUN CONSUL IN DOCKER
    • ARP cache issues with Docker networking, need to
    install conntrack to flush.
    • PITA to mount volumes and open network ports
    • Health checks become more complex
    • Network latency seems to cause instability

  18. Clients:
    • Failure Detection
    • Health Checks
    • Respond to local requests
    Servers:
    • Leader Election
    • Forward Request
    • Replicate Data
    Consensus is achieved via gossip protocol (nodes) or raft (server data)

  19. Consul Architecture

  20. CONSUL SERVICES
    Serf
    http://progrium.com/blog/2014/08/20/consul-service-discovery-with-docker/

  21. CAP
    Consistency, Availability, Partition Tolerance
    Gossip
    Paxos/Raft
    Consul Agent
    Cassandra
    Zookeeper
    Consul K/V
    etcd

  22. CONSUL CONSISTENCY
    • Servers use raft for consistency (CP)
    • Loss of server quorum will cause availability failure
    • Run a small (odd) number of servers per DC
    • Agents use LAN gossip for node failure detection
    • WAN gossip is used across DCs, higher latency
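
The advice to run a small, odd number of servers follows from majority arithmetic. A minimal sketch, assuming the usual Raft rule that a quorum is a strict majority of the server set (the function names are illustrative, not part of Consul):

```python
def quorum_size(servers: int) -> int:
    """Smallest strict majority of a server set with `servers` members."""
    if servers < 1:
        raise ValueError("need at least one server")
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """Number of servers that can be lost while a majority still remains."""
    return servers - quorum_size(servers)


if __name__ == "__main__":
    # 1 to 7 servers per datacenter, as suggested on the "Running Consul" slide.
    for n in range(1, 8):
        print(f"{n} servers: quorum {quorum_size(n)}, "
              f"tolerates {failure_tolerance(n)} failure(s)")
```

Going from 3 to 4 servers raises the quorum without raising the number of tolerated failures, which is why even counts add cost but no resilience.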

  23. CONSUL API CONSISTENCY MODES
    • default: the leader answers using its lease; around an election a
    partitioned old leader can briefly serve stale values.
    • consistent: the leader confirms with a quorum that it is still leader
    before answering, so a leader must be elected.
    • stale: any server can respond, even non-leaders.
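
On the HTTP API the mode is chosen per request with a bare `?stale` or `?consistent` query flag; leaving both off gives the default mode. A small sketch of building such URLs, assuming the local agent address used elsewhere in this deck (`read_url` is a hypothetical helper, not a Consul client function):

```python
from urllib.parse import urlencode

CONSUL = "http://localhost:8500"  # local agent address, as on the other slides


def read_url(path: str, mode: str = "default", **params: str) -> str:
    """Build a read URL for the given consistency mode.

    `mode` is one of "default", "consistent" or "stale"; extra keyword
    arguments are appended as ordinary key=value query parameters.
    """
    if mode not in ("default", "consistent", "stale"):
        raise ValueError(f"unknown consistency mode: {mode}")
    parts = [] if mode == "default" else [mode]
    if params:
        parts.append(urlencode(params))
    url = f"{CONSUL}/v1/{path.lstrip('/')}"
    return f"{url}?{'&'.join(parts)}" if parts else url


if __name__ == "__main__":
    print(read_url("kv/web/key1"))
    print(read_url("health/service/chronos", "stale", pretty="true"))
```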

  24. Consul Service Discovery

  25. REGISTER A SERVICE
    Create a file called marathon.json:
    {
      "service": {
        "name": "marathon",
        "tags": [ "admin" ],
        "port": 8080,
        "check": {
          "script": "curl --silent --show-error --fail --dump-header /dev/stderr --retry 2 http://127.0.0.1:8080/ping",
          "interval": "10s"
        }
      }
    }
    "name": DNS Name
    "check": Optional Health Check
    HTTP API also supported
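
When many services need definitions like this, the file can be generated instead of hand-written. A minimal sketch that reproduces the layout of the slide's marathon.json; `service_definition` is a hypothetical helper, and the `script` check key simply mirrors the slide (other Consul releases may expect a different check field):

```python
import json
from typing import Optional, Sequence


def service_definition(name: str, port: int, tags: Sequence[str] = (),
                       check_script: Optional[str] = None,
                       interval: str = "10s") -> dict:
    """Return a service definition shaped like the slide's marathon.json."""
    service = {"name": name, "tags": list(tags), "port": port}
    if check_script is not None:
        # The health check is optional, as noted on the slide.
        service["check"] = {"script": check_script, "interval": interval}
    return {"service": service}


if __name__ == "__main__":
    definition = service_definition(
        "marathon", 8080, tags=["admin"],
        check_script="curl --silent --show-error --fail http://127.0.0.1:8080/ping",
    )
    print(json.dumps(definition, indent=2))
```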

  26. DNS REGISTRATION
    # consul reload
    # dig marathon.service.consul +short
    45.55.95.218
    45.55.95.215
    45.55.162.9
    If a health check fails, entry will not show in DNS.

  27. SERVICE TAGS
    Tags are supported in DNS:
    # dig admin.marathon.service.consul +short
    45.55.95.218
    45.55.95.215
    45.55.162.9

  28. DNS SRV RECORDS
    Get the port for any service:
    # dig zookeeper.service.consul SRV +short
    1 1 2181 mi-control-01.node.dc1.consul.
    1 1 2181 mi-control-03.node.dc1.consul.
    1 1 2181 mi-control-02.node.dc1.consul.
    Nodes are automatically registered in DNS. You can even query services
    and nodes in other DCs!
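
Each line of the short SRV output is priority, weight, port and target. A sketch of turning that output into host:port pairs and a Zookeeper connection string; the sample text is the output shown on this slide, while `parse_srv` and `zk_url` are illustrative names:

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv(dig_output: str) -> List[SrvRecord]:
    """Parse `dig <name> SRV +short` output into records."""
    records = []
    for line in dig_output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or unexpected lines
        priority, weight, port, target = fields
        records.append(SrvRecord(int(priority), int(weight), int(port),
                                 target.rstrip(".")))
    return records


def zk_url(records: List[SrvRecord], path: str = "/mesos") -> str:
    """Join the records into a zk://host:port,host:port/path string."""
    hosts = ",".join(f"{r.target}:{r.port}" for r in records)
    return f"zk://{hosts}{path}"


SAMPLE = """\
1 1 2181 mi-control-01.node.dc1.consul.
1 1 2181 mi-control-03.node.dc1.consul.
1 1 2181 mi-control-02.node.dc1.consul.
"""

if __name__ == "__main__":
    print(zk_url(parse_srv(SAMPLE)))
```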

  29. SIMPLIFY MESOS CONFIGURATION
    Zookeeper config string:
    zk://zookeeper.service.consul:2181/mesos
    Marathon config string:
    http://marathon.service.consul:8080
    Mesos config string (we’ll discuss leader later):
    mesos://leader.mesos.service.consul:5050

  30. BONUS!
    HEALTH CHECKS YOUR MESOS CLUSTER

  31. HEALTH CHECKS ARE RUN BY THE NODES, EXPOSE STATE VIA API
    curl -L http://localhost:8500/v1/health/service/chronos?pretty=true
    [
      {
        "Node": {
          "Node": "mi-control-01",
          "Address": "45.55.95.218"
        },
        "Service": {
          "ID": "chronos",
          "Service": "chronos",
          "Tags": [
            "chronos"
          ],
          "Address": "",
          "Port": 14400
        },
        "Checks": [
          {
            "Node": "mi-control-01",
            "CheckID": "service:chronos",
            "Name": "Service 'chronos' check",
            "Status": "critical",
            "Notes": "",
            "Output": "",
            "ServiceID": "chronos",
            "ServiceName": "chronos"
          },
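
A consumer of this endpoint usually wants only the instances whose checks pass. A minimal sketch that filters a response of the shape shown on this slide; the second sample entry (node name, address and passing status) is made up for illustration, and `passing_nodes` is a hypothetical helper:

```python
import json
from typing import List


def passing_nodes(entries: List[dict]) -> List[str]:
    """Return node addresses whose checks all report "passing".

    An entry with no checks at all is treated as healthy here; that is a
    choice made for this sketch.
    """
    healthy = []
    for entry in entries:
        checks = entry.get("Checks", [])
        if all(check.get("Status") == "passing" for check in checks):
            healthy.append(entry["Node"]["Address"])
    return healthy


SAMPLE = json.loads("""
[
  {"Node": {"Node": "mi-control-01", "Address": "45.55.95.218"},
   "Service": {"ID": "chronos", "Service": "chronos", "Port": 14400},
   "Checks": [{"CheckID": "service:chronos", "Status": "critical"}]},
  {"Node": {"Node": "mi-control-02", "Address": "45.55.95.215"},
   "Service": {"ID": "chronos", "Service": "chronos", "Port": 14400},
   "Checks": [{"CheckID": "service:chronos", "Status": "passing"}]}
]
""")

if __name__ == "__main__":
    print(passing_nodes(SAMPLE))
```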

  32. HEALTH CHECK EXIT CODES
    Consul Checks are compatible with Nagios/Sensu:
    Exit code 0 - Check is passing
    Exit code 1 - Check is warning
    Any other code - Check is critical
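
The exit-code convention is small enough to state as a function. A sketch of the mapping, plus a hypothetical `run_check` wrapper that runs a command and classifies it the same way (neither name comes from Consul):

```python
import subprocess
import sys
from typing import Sequence


def status_from_exit_code(code: int) -> str:
    """Map a check script's exit code to a check status."""
    if code == 0:
        return "passing"
    if code == 1:
        return "warning"
    return "critical"  # any other code


def run_check(argv: Sequence[str]) -> str:
    """Run a check command and classify its exit code."""
    return status_from_exit_code(subprocess.run(argv).returncode)


if __name__ == "__main__":
    # Use the current interpreter as a stand-in check that exits with 1.
    print(run_check([sys.executable, "-c", "raise SystemExit(1)"]))
```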

  33. Consul Key/Value Store
    Consul ACLs

  34. CONSUL K/V EXPOSED VIA API
    curl -X PUT -d 'test' http://localhost:8500/v1/kv/web/key1
    curl http://localhost:8500/v1/kv/?recurse
    [{"CreateIndex":97,"ModifyIndex":97,"Key":"web/key1","Flags":0,"Value":"dGVzdA=="},
    Or use consulkv:
    consulkv read --ssl nodes/config/test
    Hello World
    consulkv delete --ssl --consul=consul.service.consul:8500 --recurse nodes/config/test
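
Values come back from the K/V API base64-encoded, which is why the stored `test` appears as `dGVzdA==`. A minimal sketch of decoding a response body; the sample is the entry shown on this slide, closed off as a one-element list, and `decode_kv` is an illustrative name:

```python
import base64
import json


def decode_kv(response_body: str) -> dict:
    """Map each key in a /v1/kv response to its decoded string value."""
    decoded = {}
    for entry in json.loads(response_body):
        value = entry.get("Value")
        if value is None:
            continue  # keys stored without a value have nothing to decode
        decoded[entry["Key"]] = base64.b64decode(value).decode("utf-8")
    return decoded


SAMPLE = ('[{"CreateIndex":97,"ModifyIndex":97,"Key":"web/key1",'
          '"Flags":0,"Value":"dGVzdA=="}]')

if __name__ == "__main__":
    print(decode_kv(SAMPLE))
```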

  35. CONSUL ACLS
    • Only use in 0.5.2 or higher (upsert support)
    • Master tokens are used to create ACL entries
    • Every ACL entry has a token
    • read/write/deny policy on k/v and service endpoints
    • Can manage with API or
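
To make the read/write/deny policies concrete, here is a sketch of how a set of key-prefix rules could be evaluated. It assumes the most specific (longest) matching prefix wins and that write access implies read access; the rule set and function names are made up for illustration and this is not Consul's own code:

```python
def effective_policy(rules: dict, key: str, default: str = "deny") -> str:
    """Return the policy of the longest rule prefix that matches `key`."""
    matches = [prefix for prefix in rules if key.startswith(prefix)]
    if not matches:
        return default
    return rules[max(matches, key=len)]


def allowed(rules: dict, key: str, action: str, default: str = "deny") -> bool:
    """Check whether `action` ("read" or "write") is permitted on `key`."""
    policy = effective_policy(rules, key, default)
    if action == "read":
        return policy in ("read", "write")
    if action == "write":
        return policy == "write"
    raise ValueError(f"unknown action: {action}")


# Hypothetical rules: read everywhere, write under web/, nothing under web/secret/.
RULES = {"": "read", "web/": "write", "web/secret/": "deny"}

if __name__ == "__main__":
    for key in ("nodes/config/test", "web/key1", "web/secret/token"):
        print(key, effective_policy(RULES, key))
```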

  36. Consul Template

  37. CONSUL TEMPLATE
    • Reads data from Consul k/v and service catalog
    • Writes out text files based on Go text/template
    • Can be used to dynamically configure systems and applications
    {{range service "web@datacenter"}}
    server {{.Name}} {{.Address}}:{{.Port}}
    {{end}}
    Becomes
    server nyc_web_01 123.456.789.10:8080
    server nyc_web_02 456.789.101.213:8080
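
The expansion this slide describes is a loop over the service instances, emitting one line per instance. A Python stand-in for that loop (not Consul Template itself, which uses Go's text/template); the sample records reuse the placeholder names and addresses from the slide, and `render_servers` is an illustrative name:

```python
from typing import Iterable, Mapping


def render_servers(services: Iterable[Mapping[str, object]]) -> str:
    """Emit one "server <name> <address>:<port>" line per service instance."""
    return "\n".join(
        f"server {s['Name']} {s['Address']}:{s['Port']}" for s in services
    )


# Placeholder instances copied from the slide's rendered output.
SAMPLE = [
    {"Name": "nyc_web_01", "Address": "123.456.789.10", "Port": 8080},
    {"Name": "nyc_web_02", "Address": "456.789.101.213", "Port": 8080},
]

if __name__ == "__main__":
    print(render_servers(SAMPLE))
```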

  38. DYNAMIC ZOOKEEPER ENSEMBLE
    • Update zoo.cfg as ZK nodes come up/down
    • Writes out text files based on Go text/template
    • Restarts Zookeeper nodes
    • https://github.com/CiscoCloud/docker-zookeeper
    {{ with $s := env "CONSUL_QUERY" }}
    {{ range service $s "passing, warning" }}
    ZK_HOSTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]={{.Address}}
    ZK_CLIENT_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=2181
    ZK_PEER_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=2888
    ZK_ELECTION_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=3888
    {{end}}{{end}}
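
The key step in this template is the regular expression that pulls the Zookeeper ID out of the service ID. A Python sketch of the same extraction using the pattern from the slide; the service IDs and addresses in the sample are hypothetical, chosen only so that they end in `:zkid-<n>`:

```python
import re
from typing import Dict, Iterable, Mapping

ZKID_PATTERN = r".*:zkid-([0-9]*)"  # same pattern as in the template


def zk_id(service_id: str) -> str:
    """Replace the whole match with its first capture group (the ZK id)."""
    return re.sub(ZKID_PATTERN, r"\1", service_id)


def zk_hosts(services: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Map each extracted ZK id to the instance address."""
    return {zk_id(s["ID"]): s["Address"] for s in services}


# Hypothetical service instances for illustration.
SAMPLE = [
    {"ID": "node-a:zookeeper:zkid-1", "Address": "10.0.0.1"},
    {"ID": "node-b:zookeeper:zkid-2", "Address": "10.0.0.2"},
]

if __name__ == "__main__":
    for ident, address in zk_hosts(SAMPLE).items():
        print(f"ZK_HOSTS[{ident}]={address}")
```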

  39. Consul Integration with Mesos

  40. MESOS-CONSUL
    • Dynamically adds Mesos tasks to Consul
    • Located at https://github.com/CiscoCloud/mesos-consul
    • Easy to run as Docker container via Marathon:
    curl -X POST [email protected] -H "Content-Type: application/json" http://marathon.service.consul:8080/v2/apps
    • Mesos task shows up as:
    taskname.service.consul

  41. MESOS-CONSUL
    • Leader detection built-in. Use:
    leader.mesos.service.consul
    • Mesos doesn’t have an event bus. Mesos-consul needs to poll every few
    seconds.
    • Mesos (0.22.1 and earlier) doesn’t export Docker port mapping
    information, so all ports are registered to the same DNS name.

  42. MARATHON-CONSUL
    • Dynamically adds Marathon tasks to Consul K/V. Can be used to build
    proxy configurations
    • Located at https://github.com/CiscoCloud/marathon-consul
    • Easy to run as Docker container via Marathon
    • Listens to Marathon event bus:
    curl -X POST 'http://marathon.service.consul:8080/v2/eventSubscriptions?callbackUrl=http://marathon-consul.service.consul:4000/events'
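
The callback address travels as a query parameter, so a script that registers the subscription would normally percent-encode it. A sketch of building the subscription URL with the hostnames from this slide; `subscription_url` is a hypothetical helper and only constructs the URL, it does not send the POST:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

MARATHON = "http://marathon.service.consul:8080"
CALLBACK = "http://marathon-consul.service.consul:4000/events"


def subscription_url(marathon: str, callback: str) -> str:
    """Build the eventSubscriptions URL with an encoded callbackUrl."""
    query = urlencode({"callbackUrl": callback})
    return f"{marathon.rstrip('/')}/v2/eventSubscriptions?{query}"


if __name__ == "__main__":
    url = subscription_url(MARATHON, CALLBACK)
    print(url)
    # Decoding the query string recovers the original callback address.
    print(parse_qs(urlsplit(url).query)["callbackUrl"][0])
```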

  43. Putting It All Together

  44. LET’S BUILD THE FUTURE

  45. THE NEXT PLATFORM
    • Runs diverse workloads, from containers to big data
    • Resistant to failure
    • Can be deployed anywhere rapidly
    • Easy to configure and manage
    • Batteries included: service discovery, logging, security, etc.

  46. Microservices-Infrastructure
    (we need a new name)

  47. MICROSERVICES-INFRASTRUCTURE
    • Integrates Mesos + Consul
    • Easy deployment
    • Includes Logstash, collectd, Docker, Mesos, Marathon, Chronos (and more
    coming)
    • 1,200+ stars on GitHub (#6 trending, 500 stars this week)
    • Apache 2.0

  48. MICROSERVICES-INFRASTRUCTURE
    • Configures HA & Security
    • 0.3.1 released today:
      • DigitalOcean Support!
      • VMware vSphere Support!
      • Chronos!
      • Distributive added!

  49. It’s not a PaaS

  50. Think of it like a distribution for modern
    workloads.

  51. We’re integrating building blocks that
    you can compose in different ways.

  52. MICROSERVICES-INFRASTRUCTURE
    • Uses Terraform to provision to the following cloud providers:
      • AWS
      • Google Cloud
      • OpenStack
      • DigitalOcean
      • vSphere

  53. NEW FEATURE: DISTRIBUTIVE
    • Our new framework for distributed health checks
    • Single 4 MB binary, no gem or pip installs
    • Checks defined in .json format
    • Integrates with Consul, Nagios & Sensu
    • Will verify every node’s configuration
    • Cluster tests itself, no external tools needed

  54. ROADMAP
    • IP per Mesos task
    • DCOS Support
    • Easy deployment of Mesos frameworks
    • Kubernetes Support
    • Vault Support
    • Dynamic Configuration

  55. RELEASE SCHEDULE
    • 0.3: June 9th
    • 0.3.1: June 17th
    • 0.3.2: June 25th
    • 0.4.0: July 17th
    • 0.5.0 (Mesoscon)

  56. GETTING SUPPORT
    • Docs: http://microservices-infrastructure.readthedocs.org/en/latest/
    • GitHub Issues: https://github.com/CiscoCloud/microservices-infrastructure/issues
    • Gitter.im chat room: https://gitter.im/CiscoCloud/microservices-infrastructure
    • Bug reports and pull requests welcome!

  57. We Think it’s Awesome
    https://github.com/CiscoCloud/microservices-infrastructure

  58. THANK YOU!
    http://aster.is
