FOUNDED ASTERIS (2014)
SYSTEMS ENGINEERING, HPC, BIG DATA & CLOUD
FOCUS ON DISTRIBUTED COMPUTING, STREAMING DATA, AND CONTINUOUS DELIVERY
[...] CASES
• Mobile First
• Microservices
• Multiple Languages
• Continuous Delivery
• Streaming Data
• Unstructured Data
• Multiple Data Stores
• ETL/Workflows
• Cloud
• DevOps
CONSUL
• Single binary (golang)
• Run on every system
• 1-7 servers per datacenter, rest of systems are clients
• Config via .json files or CLI parameters
• Optional web UI
CONSUL IN DOCKER
• ARP cache issues with Docker networking; need to install conntrack to flush.
• PITA to mount volumes and open network ports
• Health checks become more complex
• Network latency seems to cause instability
CONSISTENCY
• Servers use Raft for consistency (CP)
• Loss of server quorum will cause availability failure
• Run a small (odd) number of servers per DC
• Agents use LAN gossip for node failure detection
• WAN gossip is used across DCs, with higher latency
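The advice to run a small, odd number of servers follows from Raft's majority rule. The sketch below only works through that arithmetic; the function names are illustrative and not part of Consul.

```python
def quorum_size(servers: int) -> int:
    """Smallest majority of a Raft peer set with the given number of servers."""
    if servers < 1:
        raise ValueError("need at least one server")
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """Number of servers that can fail while a majority is still reachable."""
    return servers - quorum_size(servers)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} servers: quorum={quorum_size(n)}, "
              f"tolerates {failure_tolerance(n)} failure(s)")
```

Going from 3 to 4 servers raises the quorum from 2 to 3 but still tolerates only one failure, which is why even-sized clusters add cost without adding resilience.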
CONSISTENCY MODES
• default: server can serve requests during election. Possible stale values.
• consistent: leader must be elected
• stale: any server can respond, even non-leaders.
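In Consul's HTTP API the mode is chosen per request by adding a `stale` or `consistent` query parameter. A minimal sketch of building such a URL for a key/value read follows; the helper name `kv_url` and the default base address are assumptions for illustration.

```python
from urllib.parse import quote

VALID_MODES = ("default", "consistent", "stale")


def kv_url(key: str, mode: str = "default",
           base: str = "http://localhost:8500") -> str:
    """Build a KV read URL; 'default' adds no query parameter."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown consistency mode: {mode!r}")
    url = f"{base}/v1/kv/{quote(key)}"
    if mode != "default":
        url += f"?{mode}"
    return url


if __name__ == "__main__":
    for m in VALID_MODES:
        print(m, "->", kv_url("web/key1", m))
```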
SERVICE
Create a file called: marathon.json
    {
      "service": {
        "name": "marathon",
        "tags": [ "admin" ],
        "port": 8080,
        "check": {
          "script": "curl --silent --show-error --fail --dump-header /dev/stderr --retry 2 http://127.0.0.1:8080/ping",
          "interval": "10s"
        }
      }
    }
• "name": DNS Name
• "check": Optional Health Check
• HTTP API also supported
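A service definition like the one above can also be generated rather than written by hand. The sketch below mirrors the slide's JSON layout, including the `script` check key it uses (check the documentation for your Consul version before relying on that key). The helper name `service_definition` is hypothetical.

```python
import json


def service_definition(name, port, tags=(), check_script=None, interval="10s"):
    """Return a dict shaped like the slide's marathon.json."""
    service = {"name": name, "tags": list(tags), "port": port}
    if check_script is not None:
        # The health check is optional, as the slide notes.
        service["check"] = {"script": check_script, "interval": interval}
    return {"service": service}


if __name__ == "__main__":
    definition = service_definition(
        "marathon",
        8080,
        tags=["admin"],
        check_script=("curl --silent --show-error --fail "
                      "--retry 2 http://127.0.0.1:8080/ping"),
    )
    print(json.dumps(definition, indent=2))
```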
[...]ATION
    # consul reload
    # dig marathon.service.consul +short
    45.55.95.218
    45.55.95.215
    45.55.162.9
If a health check fails, the entry will not show in DNS.
[...]DS
Get the port for any service:
    # dig zookeeper.service.consul SRV +short
    1 1 2181 mi-control-01.node.dc1.consul.
    1 1 2181 mi-control-03.node.dc1.consul.
    1 1 2181 mi-control-02.node.dc1.consul.
Nodes are automatically registered in DNS. You can even query services and nodes in other DCs!
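Each line of the `dig ... SRV +short` output has four fields: priority, weight, port, and target host. A small sketch of turning that text into structured records follows; the names `SrvRecord` and `parse_srv_short` are illustrative, and the sample text is copied from the slide.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_short(output: str) -> List[SrvRecord]:
    """Parse `dig <name> SRV +short` output, skipping malformed lines."""
    records = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        priority, weight, port, target = fields
        # Drop the trailing dot of the fully qualified name.
        records.append(SrvRecord(int(priority), int(weight), int(port),
                                 target.rstrip(".")))
    return records


SAMPLE = """\
1 1 2181 mi-control-01.node.dc1.consul.
1 1 2181 mi-control-03.node.dc1.consul.
1 1 2181 mi-control-02.node.dc1.consul.
"""

if __name__ == "__main__":
    for rec in parse_srv_short(SAMPLE):
        print(f"{rec.target}:{rec.port}")
```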
MESOS CONFIGURATION
Zookeeper config string:
    zk://zookeeper.service.consul:2181/mesos
Marathon config string:
    http://marathon.service.consul:8080
Mesos config string (we’ll discuss leader later):
    mesos://leader.mesos.service.consul:5050
[...] ARE RUN BY THE NODES, EXPOSE STATE VIA API
    curl -L http://localhost:8500/v1/health/service/chronos?pretty=true
    [
      {
        "Node": {
          "Node": "mi-control-01",
          "Address": "45.55.95.218"
        },
        "Service": {
          "ID": "chronos",
          "Service": "chronos",
          "Tags": [ "chronos" ],
          "Address": "",
          "Port": 14400
        },
        "Checks": [
          {
            "Node": "mi-control-01",
            "CheckID": "service:chronos",
            "Name": "Service 'chronos' check",
            "Status": "critical",
            "Notes": "",
            "Output": "",
            "ServiceID": "chronos",
            "ServiceName": "chronos"
          },
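A client of the health endpoint usually wants only the instances whose checks all pass. The sketch below filters a response of the shape shown on the slide. The first sample entry is taken from the slide; the second entry (a passing instance on mi-control-02) is made up for illustration, and the function name is hypothetical.

```python
import json


def passing_instances(health_json: str):
    """Return (node, address, port) for entries whose checks are all passing."""
    result = []
    for entry in json.loads(health_json):
        checks = entry.get("Checks", [])
        if all(check.get("Status") == "passing" for check in checks):
            service = entry["Service"]
            # An empty service address means "use the node's address".
            address = service.get("Address") or entry["Node"]["Address"]
            result.append((entry["Node"]["Node"], address, service["Port"]))
    return result


SAMPLE = json.dumps([
    {"Node": {"Node": "mi-control-01", "Address": "45.55.95.218"},
     "Service": {"ID": "chronos", "Service": "chronos", "Address": "", "Port": 14400},
     "Checks": [{"CheckID": "service:chronos", "Status": "critical"}]},
    # Hypothetical second entry, not from the slide.
    {"Node": {"Node": "mi-control-02", "Address": "45.55.95.215"},
     "Service": {"ID": "chronos", "Service": "chronos", "Address": "", "Port": 14400},
     "Checks": [{"CheckID": "service:chronos", "Status": "passing"}]},
])

if __name__ == "__main__":
    print(passing_instances(SAMPLE))
```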
EXIT CODES
Consul Checks are compatible with Nagios/Sensu:
• Exit code 0 - Check is passing
• Exit code 1 - Check is warning
• Any other code - Check is critical
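The exit-code convention above is easy to express in code. The following sketch runs a command and maps its exit code to a check status. It illustrates the convention only and is not how the Consul agent itself is implemented; both function names are assumptions.

```python
import subprocess
import sys


def status_from_exit_code(code: int) -> str:
    """Map a check script's exit code to a status, per the convention above."""
    if code == 0:
        return "passing"
    if code == 1:
        return "warning"
    return "critical"


def run_check(argv) -> str:
    """Run a check command (list of arguments) and return its status."""
    completed = subprocess.run(argv, stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL)
    return status_from_exit_code(completed.returncode)


if __name__ == "__main__":
    # Use the current Python interpreter as a stand-in check script.
    for code in (0, 1, 2):
        cmd = [sys.executable, "-c", f"import sys; sys.exit({code})"]
        print(code, "->", run_check(cmd))
```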
[...] EXPOSED VIA API
    curl -X PUT -d 'test' http://localhost:8500/v1/kv/web/key1
    curl http://localhost:8500/v1/kv/?recurse
    [{"CreateIndex":97,"ModifyIndex":97,"Key":"web/key1","Flags":0,"Value":"dGVzdA=="},
Or use consulkv:
    consulkv read --ssl nodes/config/test
    Hello World
    consulkv delete --ssl --consul=consul.service.consul:8500 --recurse nodes/config/test
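Note that the `Value` field in the response is base64-encoded: `dGVzdA==` is the string `test` that was written with the PUT. A minimal sketch of decoding such a response follows; the function name `decode_kv` is illustrative, and the sample is the slide's entry closed off as a complete JSON array.

```python
import base64
import json


def decode_kv(response_body: str) -> dict:
    """Map each key in a /v1/kv response to its decoded UTF-8 value."""
    decoded = {}
    for entry in json.loads(response_body):
        raw = entry.get("Value")
        # Keys stored without a value come back with a null Value.
        decoded[entry["Key"]] = (
            None if raw is None else base64.b64decode(raw).decode("utf-8")
        )
    return decoded


SAMPLE = ('[{"CreateIndex":97,"ModifyIndex":97,"Key":"web/key1",'
          '"Flags":0,"Value":"dGVzdA=="}]')

if __name__ == "__main__":
    print(decode_kv(SAMPLE))
```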
CONSUL ACLS
• Master tokens are used to create ACL entries
• Every ACL entry has a token
• read/write/deny policy on k/v and service endpoints
• Can manage with API or [...]
CONSUL TEMPLATE
• Writes out text files based on go text/template
• Can be used to dynamically configure systems and applications
    {{range service "web@datacenter"}}
    server {{.Name}} {{.Address}}:{{.Port}}{{end}}
Becomes:
    server nyc_web_01 123.456.789.10:8080
    server nyc_web_02 456.789.101.213:8080
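To make the rendering step concrete, here is a Python analogue of what the `range` loop does. consul-template itself uses Go's text/template and queries Consul for the service list; this sketch only mimics the loop over a hard-coded list built from the slide's placeholder output. The function name is hypothetical.

```python
def render_servers(services) -> str:
    """Emit one 'server NAME ADDRESS:PORT' line per service instance."""
    lines = [f"server {s['Name']} {s['Address']}:{s['Port']}" for s in services]
    return "\n".join(lines) + "\n"


# Placeholder instances copied from the slide's example output.
WEB_SERVICES = [
    {"Name": "nyc_web_01", "Address": "123.456.789.10", "Port": 8080},
    {"Name": "nyc_web_02", "Address": "456.789.101.213", "Port": 8080},
]

if __name__ == "__main__":
    print(render_servers(WEB_SERVICES), end="")
```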
DYNAMIC ZOOKEEPER ENSEMBLE
• [...] out text files based on go text/template
• Restarts Zookeeper nodes
• https://github.com/CiscoCloud/docker-zookeeper
    {{ with $s := env "CONSUL_QUERY" }}{{ range service $s "passing, warning" }}
    ZK_HOSTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]={{.Address}}
    ZK_CLIENT_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=2181
    ZK_PEER_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=2888
    ZK_ELECTION_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=3888
    {{end}}{{end}}
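The `regexReplaceAll` calls in the template extract the numeric ZooKeeper ID from each service ID. The Python sketch below reproduces that extraction with `re.sub`, where Go's `$1` becomes `\1`. The service ID format in the sample (`mi-control-01:zookeeper:zkid-1`) is an assumption, since the slide shows only the regex, and the function names are illustrative.

```python
import re

ZKID_PATTERN = r".*:zkid-([0-9]*)"


def zk_id(service_id: str) -> str:
    """Replace the matched part of the service ID with its captured number."""
    return re.sub(ZKID_PATTERN, r"\1", service_id)


def zk_env_lines(services) -> list:
    """Build the shell array assignments the template would write out."""
    lines = []
    for svc in services:
        n = zk_id(svc["ID"])
        lines.append(f"ZK_HOSTS[{n}]={svc['Address']}")
        lines.append(f"ZK_CLIENT_PORTS[{n}]=2181")
        lines.append(f"ZK_PEER_PORTS[{n}]=2888")
        lines.append(f"ZK_ELECTION_PORTS[{n}]=3888")
    return lines


if __name__ == "__main__":
    # Hypothetical service entry; the ID format is assumed.
    sample = [{"ID": "mi-control-01:zookeeper:zkid-1", "Address": "45.55.95.218"}]
    print("\n".join(zk_env_lines(sample)))
```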
MESOS-CONSUL
• https://github.com/CiscoCloud/mesos-consul
• Easy to run as Docker container via Marathon:
    curl -X POST [email protected] -H "Content-Type: application/json" 'http://marathon.service.consul:8080/v2/apps'
• Mesos task <taskname> shows up as: taskname.service.consul
MESOS-CONSUL
• [...] event bus. Mesos-consul needs to poll every few seconds.
• Mesos (0.22.1 and earlier) doesn’t export Docker port mapping information, so all ports are registered to the same DNS name.
    leader.mesos.service.consul
MARATHON-CONSUL
• [...] used to build proxy configurations
• Located at https://github.com/CiscoCloud/marathon-consul
• Easy to run as Docker container via Marathon
• Listens to Marathon event bus:
    curl -X POST 'http://marathon.service.consul:8080/v2/eventSubscriptions?callbackUrl=http://marathon-consul.service.consul:4000/events'
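The subscription request is just a POST to Marathon with the callback address passed in the `callbackUrl` query parameter. A sketch of assembling that URL follows, with the callback percent-encoded (the slide passes it unencoded; encoding is the more defensive choice when the callback contains its own query string). The function name is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def subscription_url(marathon: str, callback: str) -> str:
    """Build the eventSubscriptions URL with an encoded callbackUrl parameter."""
    query = urlencode({"callbackUrl": callback})
    return f"{marathon.rstrip('/')}/v2/eventSubscriptions?{query}"


if __name__ == "__main__":
    url = subscription_url(
        "http://marathon.service.consul:8080",
        "http://marathon-consul.service.consul:4000/events",
    )
    print(url)
    # Decoding the query string recovers the original callback address.
    print(parse_qs(urlsplit(url).query)["callbackUrl"][0])
```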
THE NEXT PLATFORM
• Resistant to failure
• Can be deployed anywhere rapidly
• Easy to configure and manage
• Batteries included: service discovery, logging, security, etc.
MICROSERVICES-INFRASTRUCTURE
• [...] Logstash, collectd, Docker, Mesos, Marathon, Chronos (and more coming)
• 1,200+ stars on GitHub (#6 trending, 500 stars this week)
• Apache 2.0
NEW FEATURE: DISTRIBUTIVE
• 4 MB binary, no gem or pip installs
• Checks defined in .json format
• Integrates with Consul, Nagios & Sensu
• Will verify every node’s configuration
• Cluster tests itself, no external tools needed