When Mesos met Consul

Given to Mesos NYC June 17th, 2015

Combining Mesos and Consul to build out a powerful next-generation platform.

Steven Borrelli

June 17, 2015

Transcript

  1. WHEN MESOS MET CONSUL

    NYC Mesos Meetup, June 2015
    Steven Borrelli (@stevendborrelli)
  2. ABOUT ME

    • Founded Asteris (2014)
    • Systems engineering, HPC, big data & cloud
    • Focus on distributed computing, streaming data, and continuous delivery
  3. BUILDING THE NEXT GENERATION OF INFRASTRUCTURE
  4. DIVERSE REQUIREMENTS

    • Development
    • Analytics
    • Engineering
  5. EMERGING USE CASES

    • Mobile First
    • Microservices
    • Multiple Languages
    • Continuous Delivery
    • Streaming Data
    • Unstructured Data
    • Multiple Data Stores
    • ETL/Workflows
    • Cloud
    • DevOps
  6. FRAMEWORKS

    App-specific: Generic:
  7. MESOS CHALLENGES

    • Deployment
    • Framework Development
    • Security & Management
    • Monitoring
    • Service Discovery
  8. RUNNING CONSUL

    • Single binary (golang)
    • Run on every system
    • 1-7 servers per datacenter; the rest of the systems are clients
    • Config via .json files or CLI parameters
    • Optional Web UI
  9. DON'T RUN CONSUL IN DOCKER

    • ARP cache issues with Docker networking; need to install conntrack to flush
    • PITA to mount volumes and open network ports
    • Health checks become more complex
    • Network latency seems to cause instability
  10. Clients: Failure Detection, Health Checks, Respond to local requests

    Servers: Leader Election, Forward Request, Replicate Data
    Consensus is achieved via the gossip protocol (nodes) or Raft (server data)
  11. CAP

    Consistency, Availability, Partition Tolerance
    Gossip: Consul Agent, Cassandra
    Paxos/Raft: Zookeeper, Consul K/V, etcd
  12. CONSUL CONSISTENCY

    • Servers use Raft for consistency (CP)
    • Loss of server quorum will cause availability failure
    • Run a small (odd) number of servers per DC
    • Agents use LAN gossip for node failure detection
    • WAN gossip is used across DCs, with higher latency
  13. CONSUL API CONSISTENCY MODES

    • default: server can serve requests during election. Possible stale values.
    • consistent: leader must be elected
    • stale: any server can respond, even non-leaders
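The three modes above are selected per request through a query parameter on the Consul HTTP API. The following is a minimal sketch, assuming a local agent on `http://localhost:8500` and the `/v1/catalog/service/<name>` endpoint; the helper name `service_query_url` is hypothetical and only builds the URL, it does not send the request.

```python
CONSUL = "http://localhost:8500"  # assumed address of the local Consul agent

MODES = ("default", "consistent", "stale")


def service_query_url(service, mode="default", base=CONSUL):
    """Build a catalog lookup URL for one of the three consistency modes."""
    if mode not in MODES:
        raise ValueError(f"unknown consistency mode: {mode}")
    url = f"{base}/v1/catalog/service/{service}"
    # 'default' needs no parameter; the other two modes are bare query flags.
    return url if mode == "default" else f"{url}?{mode}"


if __name__ == "__main__":
    for mode in MODES:
        print(mode, "->", service_query_url("marathon", mode))
```

A stale read is a reasonable choice for lookups that tolerate slightly old data, since any server can answer without contacting the leader.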
  14. REGISTER A SERVICE

    Create a file called marathon.json:

    {
      "service": {
        "name": "marathon",
        "tags": [ "admin" ],
        "port": 8080,
        "check": {
          "script": "curl --silent --show-error --fail --dump-header /dev/stderr --retry 2 http://127.0.0.1:8080/ping",
          "interval": "10s"
        }
      }
    }

    The "name" field becomes the DNS name. The "check" block is an optional health check. The HTTP API is also supported.
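Service definition files like the one on this slide can also be generated rather than written by hand. This is a minimal sketch that reproduces the same JSON layout; the function name `service_definition` is hypothetical, and the check command is a shortened form of the one on the slide.

```python
import json


def service_definition(name, port, tags=(), check_script=None, interval="10s"):
    """Return a dict in the layout of the marathon.json file on the slide."""
    service = {"name": name, "tags": list(tags), "port": port}
    if check_script is not None:
        # The health check is optional; omit the block when no script is given.
        service["check"] = {"script": check_script, "interval": interval}
    return {"service": service}


if __name__ == "__main__":
    definition = service_definition(
        "marathon",
        8080,
        tags=["admin"],
        check_script="curl --silent --show-error --fail http://127.0.0.1:8080/ping",
    )
    # Print the JSON; redirect it into the agent's config directory as needed.
    print(json.dumps(definition, indent=2))
```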
  15. DNS REGISTRATION

    # consul reload
    # dig marathon.service.consul +short
    45.55.95.218
    45.55.95.215
    45.55.162.9

    If a health check fails, the entry will not show up in DNS.
  16. SERVICE TAGS

    Tags are supported in DNS:

    # dig admin.marathon.service.consul +short
    45.55.95.218
    45.55.95.215
    45.55.162.9
  17. DNS SRV RECORDS

    Get the port for any service:

    # dig zookeeper.service.consul SRV +short
    1 1 2181 mi-control-01.node.dc1.consul.
    1 1 2181 mi-control-03.node.dc1.consul.
    1 1 2181 mi-control-02.node.dc1.consul.

    Nodes are automatically registered in DNS. You can even query services and nodes in other DCs!
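Each line of the short SRV output above carries priority, weight, port, and target, in that order. A minimal sketch of turning that output into structured records follows; `SrvRecord` and `parse_srv` are hypothetical names, and the sample text is copied from the slide.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv(output):
    """Parse `dig ... SRV +short` output into a list of SrvRecord tuples."""
    records = []
    for line in output.strip().splitlines():
        priority, weight, port, target = line.split()
        # Drop the trailing dot that marks a fully qualified DNS name.
        records.append(
            SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))
        )
    return records


SAMPLE = """\
1 1 2181 mi-control-01.node.dc1.consul.
1 1 2181 mi-control-03.node.dc1.consul.
1 1 2181 mi-control-02.node.dc1.consul.
"""

if __name__ == "__main__":
    for record in parse_srv(SAMPLE):
        print(f"{record.target}:{record.port}")
```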
  18. SIMPLIFY MESOS CONFIGURATION

    Zookeeper config string: zk://zookeeper.service.consul:2181/mesos
    Marathon config string: http://marathon.service.consul:8080
    Mesos config string (we'll discuss leader later): mesos://leader.mesos.service.consul:5050
  19. BONUS! HEALTH CHECKS YOUR MESOS CLUSTER
  20. HEALTH CHECKS ARE RUN BY THE NODES, EXPOSE STATE VIA API

    curl -L 'http://localhost:8500/v1/health/service/chronos?pretty=true'

    [
      {
        "Node": {
          "Node": "mi-control-01",
          "Address": "45.55.95.218"
        },
        "Service": {
          "ID": "chronos",
          "Service": "chronos",
          "Tags": [ "chronos" ],
          "Address": "",
          "Port": 14400
        },
        "Checks": [
          {
            "Node": "mi-control-01",
            "CheckID": "service:chronos",
            "Name": "Service 'chronos' check",
            "Status": "critical",
            "Notes": "",
            "Output": "",
            "ServiceID": "chronos",
            "ServiceName": "chronos"
          },
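A response from the health endpoint can be filtered for checks that are not passing. The sketch below assumes the response has already been fetched and decoded from JSON; `failing_checks` is a hypothetical helper, and `SAMPLE` is an illustrative record that reuses only the field names and values visible on the slide.

```python
def failing_checks(health_response):
    """Return (node, check_id, status) for every check that is not passing."""
    failures = []
    for entry in health_response:
        node = entry["Node"]["Node"]
        for check in entry["Checks"]:
            if check["Status"] != "passing":
                failures.append((node, check["CheckID"], check["Status"]))
    return failures


# Illustrative sample built from the fields shown on the slide.
SAMPLE = [
    {
        "Node": {"Node": "mi-control-01", "Address": "45.55.95.218"},
        "Service": {"ID": "chronos", "Service": "chronos", "Port": 14400},
        "Checks": [
            {
                "Node": "mi-control-01",
                "CheckID": "service:chronos",
                "Status": "critical",
                "ServiceName": "chronos",
            }
        ],
    }
]

if __name__ == "__main__":
    for node, check_id, status in failing_checks(SAMPLE):
        print(f"{node}: {check_id} is {status}")
```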
  21. HEALTH CHECK EXIT CODES

    Consul checks are compatible with Nagios/Sensu:

    Exit code 0 - Check is passing
    Exit code 1 - Check is warning
    Any other code - Check is critical
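The exit-code convention above is small enough to express directly. This is a minimal sketch of the mapping only; `check_status` is a hypothetical name and is not part of Consul.

```python
def check_status(exit_code):
    """Map a check script's exit code to the status named on the slide."""
    if exit_code == 0:
        return "passing"
    if exit_code == 1:
        return "warning"
    # Every other exit code is treated as critical.
    return "critical"


if __name__ == "__main__":
    for code in (0, 1, 2, 127):
        print(code, "->", check_status(code))
```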
  22. CONSUL K/V EXPOSED VIA API

    curl -X PUT -d 'test' http://localhost:8500/v1/kv/web/key1
    curl 'http://localhost:8500/v1/kv/?recurse'
    [{"CreateIndex":97,"ModifyIndex":97,"Key":"web/key1","Flags":0,"Value":"dGVzdA=="},

    Or use consulkv:

    consulkv read --ssl nodes/config/test
    Hello World
    consulkv delete --ssl --consul=consul.service.consul:8500 --recurse nodes/config/test
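Values come back from the K/V API base64-encoded, which is why the `test` value written above appears as `dGVzdA==`. A minimal sketch of decoding such a response follows; `decode_kv` is a hypothetical helper, and the sample entry is the one shown on the slide.

```python
import base64
import json


def decode_kv(entries):
    """Map each key to its decoded value, skipping keys with no value."""
    return {
        entry["Key"]: base64.b64decode(entry["Value"]).decode("utf-8")
        for entry in entries
        if entry["Value"] is not None
    }


RESPONSE = json.loads(
    '[{"CreateIndex":97,"ModifyIndex":97,"Key":"web/key1",'
    '"Flags":0,"Value":"dGVzdA=="}]'
)

if __name__ == "__main__":
    print(decode_kv(RESPONSE))  # {'web/key1': 'test'}
```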
  23. CONSUL ACLS

    • Only use in 0.5.2 or higher (upsert support)
    • Master tokens are used to create ACL entries
    • Every ACL entry has a token
    • read/write/deny policy on k/v and service endpoints
    • Can manage with API or
  24. CONSUL TEMPLATE

    • Reads data from Consul k/v and service catalog
    • Writes out text files based on go text/template
    • Can be used to dynamically configure systems and applications

    {{range service "web@datacenter"}}
    server {{.Name}} {{.Address}}:{{.Port}}
    {{end}}

    Becomes

    server nyc_web_01 123.456.789.10:8080
    server nyc_web_02 456.789.101.213:8080
  25. DYNAMIC ZOOKEEPER ENSEMBLE

    • Update zoo.cfg as ZK nodes come up/down
    • Writes out text files based on go text/template
    • Restarts Zookeeper nodes
    • https://github.com/CiscoCloud/docker-zookeeper

    {{ with $s := env "CONSUL_QUERY" }}
    {{ range service $s "passing, warning" }}
    ZK_HOSTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]={{.Address}}
    ZK_CLIENT_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=2181
    ZK_PEER_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=2888
    ZK_ELECTION_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=3888
    {{end}}{{end}}
  26. MESOS-CONSUL

    • Dynamically adds Mesos tasks to Consul
    • Located at https://github.com/CiscoCloud/mesos-consul
    • Easy to run as Docker container via Marathon:

    curl -X POST [email protected] -H "Content-Type: application/json" 'http://marathon.service.consul:8080/v2/apps'

    • Mesos task <taskname> shows up as: taskname.service.consul
  27. MESOS-CONSUL

    • Leader detection built-in. Use: leader.mesos.service.consul
    • Mesos doesn't have an event bus. Mesos-consul needs to poll every few seconds.
    • Mesos (0.22.1 and earlier) doesn't export Docker port mapping information, so all ports are registered to the same DNS name.
  28. MARATHON-CONSUL

    • Dynamically adds Marathon tasks to Consul K/V. Can be used to build proxy configurations
    • Located at https://github.com/CiscoCloud/marathon-consul
    • Easy to run as Docker container via Marathon
    • Listens to Marathon event bus:

    curl -X POST 'http://marathon.service.consul:8080/v2/eventSubscriptions?callbackUrl=http://marathon-consul.service.consul:4000/events'
  29. LET'S BUILD THE FUTURE
  30. THE NEXT PLATFORM

    • Runs diverse workloads, from containers to big data
    • Resistant to failure
    • Can be deployed anywhere rapidly
    • Easy to configure and manage
    • Batteries included: service discovery, logging, security, etc.
  31. MICROSERVICES-INFRASTRUCTURE

    • Integrates Mesos + Consul
    • Easy deployment
    • Includes Logstash, collectd, Docker, Mesos, Marathon, Chronos (and more coming)
    • 1,200+ stars on GitHub (#6 trending, 500 stars this week)
    • Apache 2.0
  32. MICROSERVICES-INFRASTRUCTURE

    • Configures HA & Security
    • 0.3.1 released today:
      • DigitalOcean support!
      • VMware vSphere support!
      • Chronos!
      • Distributive added!
  33. MICROSERVICES-INFRASTRUCTURE

    • Uses Terraform to provision to the following cloud providers:
      • AWS
      • Google Cloud
      • OpenStack
      • DigitalOcean
      • vSphere
  34. NEW FEATURE: DISTRIBUTIVE

    • Our new framework for distributed health checks
    • Single 4 MB binary, no gem or pip installs
    • Checks defined in .json format
    • Integrates with Consul, Nagios & Sensu
    • Will verify every node's configuration
    • Cluster tests itself, no external tools needed
  35. ROADMAP

    • IP per Mesos task
    • DCOS Support
    • Easy deployment of Mesos frameworks
    • Kubernetes Support
    • Vault Support
    • Dynamic Configuration
  36. RELEASE SCHEDULE

    • 0.3: June 9th
    • 0.3.1: June 17th
    • 0.3.2: June 25th
    • 0.4.0: July 17th
    • 0.5.0 (Mesoscon)
  37. GETTING SUPPORT

    • Docs: http://microservices-infrastructure.readthedocs.org/en/latest/
    • Github Issues: https://github.com/CiscoCloud/microservices-infrastructure/issues
    • Gitter.im chat room: https://gitter.im/CiscoCloud/microservices-infrastructure
    • Bug reports and pull requests welcome!