Slide 1

Slide 1 text

MESOS + CONSUL
Bay Area Mesos Meetup, July 2015
Steven Borrelli @stevendborrelli @Asteris_LLC

Slide 2

Slide 2 text

ABOUT ME
• Founded Asteris (2014)
• Systems engineering, HPC, big data & cloud
• Focus on continuous delivery, streaming data, and infrastructure software

Slide 3

Slide 3 text

+

Slide 4

Slide 4 text

Why? Build Great Experiences for Users

Slide 5

Slide 5 text

DIVERSE USER TYPES
• Development
• Analytics
• Engineering

Slide 6

Slide 6 text

EMERGING USE CASES
• Continuous Delivery
• Mobile First
• Microservices
• Multiple Languages
• Streaming Data
• Unstructured Data
• Multiple Data Stores
• Machine Learning
• Cloud
• DevOps

Slide 7

Slide 7 text

Our Favorite Solution

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

The Strength of Mesos is in the Frameworks

Slide 10

Slide 10 text

FRAMEWORKS
App-specific:
Generic:

Slide 11

Slide 11 text

MESOS CHALLENGES
• Deployment
• Framework Development
• Security & Management
• Monitoring
• Service Discovery

Slide 12

Slide 12 text

Service Discovery

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

Consul is Part of the HashiCorp Ecosystem:

Slide 15

Slide 15 text

CONSUL FEATURES

Slide 16

Slide 16 text

RUNNING CONSUL
• Single binary (Go)
• Run on every system
• 1-7 servers per datacenter; the rest of the systems are clients
• Config via .json files or CLI parameters
• Optional web UI

Slide 17

Slide 17 text

PROBLEMS RUNNING CONSUL IN DOCKER
• ARP cache issues with Docker networking; need to install conntrack to flush.
• PITA to mount volumes and open network ports
• Health checks become more complex
• Network latency seems to cause instability

Slide 18

Slide 18 text

Clients:
• Failure Detection
• Health Checks
• Respond to local requests
Servers:
• Leader Election
• Forward Requests
• Replicate Data
Nodes track membership and failures via the gossip protocol; servers reach consensus on their data via Raft.

Slide 19

Slide 19 text

Consul Architecture

Slide 20

Slide 20 text

CONSUL SERVICES
Serf
http://progrium.com/blog/2014/08/20/consul-service-discovery-with-docker/

Slide 21

Slide 21 text

CONSENSUS MODEL
CAP: Consistency, Availability, Partition Tolerance
Gossip: Consul Agent, Cassandra
Paxos/Raft: Zookeeper, Consul K/V, etcd

Slide 22

Slide 22 text

CONSUL CONSISTENCY
• Servers use Raft for consistency (CP)
• Loss of server quorum will cause availability failure
• Run a small (odd) number of servers per DC
• Agents use LAN gossip for node failure detection
• WAN gossip is used across DCs, higher latency
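The advice to run a small, odd number of servers follows from Raft's majority rule. A minimal sketch of the arithmetic, assuming the usual majority quorum of floor(n/2) + 1; the function names are illustrative and not part of Consul:

```python
def quorum(servers: int) -> int:
    """Smallest majority of a Raft peer set of the given size."""
    return servers // 2 + 1


def failures_tolerated(servers: int) -> int:
    """Servers that can be lost while a majority is still reachable."""
    return servers - quorum(servers)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} servers: quorum {quorum(n)}, "
              f"tolerates {failures_tolerated(n)} failure(s)")
```

Going from 3 to 4 servers raises the quorum but not the number of tolerated failures, which is why odd sizes are the usual choice.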

Slide 23

Slide 23 text

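CONSUL API CONSISTENCY MODES
• default: reads go to the leader, but around a leader election an old leader can still answer, so stale values are possible.
• consistent: the leader must confirm it is still the leader before answering; no stale reads, at the cost of extra latency.
• stale: any server can respond, even non-leaders.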
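On the HTTP API the mode is chosen per request with a bare `stale` or `consistent` query parameter. A small sketch that only builds the URLs, assuming the local agent address used elsewhere in these slides; `read_url` is a hypothetical helper, not a Consul client function:

```python
from urllib.parse import urlencode

CONSUL_ADDR = "http://localhost:8500"  # assumption: local agent, as in the slides


def read_url(path: str, mode: str = "default", **params: str) -> str:
    """Build a Consul HTTP read URL for the requested consistency mode."""
    if mode not in ("default", "consistent", "stale"):
        raise ValueError(f"unknown consistency mode: {mode}")
    query = urlencode(params)
    if mode != "default":
        # 'consistent' and 'stale' are bare flags on the query string
        query = f"{mode}&{query}" if query else mode
    return f"{CONSUL_ADDR}{path}" + (f"?{query}" if query else "")


if __name__ == "__main__":
    print(read_url("/v1/kv/web/key1"))
    print(read_url("/v1/kv/web/key1", "stale"))
    print(read_url("/v1/health/service/chronos", "consistent", pretty="true"))
```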

Slide 24

Slide 24 text

Consul Service Discovery

Slide 25

Slide 25 text

REGISTER A SERVICE
Create a file called marathon.json:
{
  "service": {
    "name": "marathon",
    "tags": [ "admin" ],
    "port": 8080,
    "check": {
      "script": "curl --silent --show-error --fail --dump-header /dev/stderr --retry 2 http://127.0.0.1:8080/ping",
      "interval": "10s"
    }
  }
}
• "name" becomes the DNS name
• "check" is an optional health check
• HTTP API also supported
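When many services need definitions like this, generating the file is less error-prone than hand-editing JSON. A minimal sketch that reproduces the structure shown on the slide; `service_definition` is a hypothetical helper, and the field names are only those that appear in marathon.json above:

```python
import json


def service_definition(name, port, tags=(), check_script=None, interval="10s"):
    """Return a service definition shaped like the slide's marathon.json."""
    service = {"name": name, "tags": list(tags), "port": port}
    if check_script is not None:
        # The health check is optional; omit the block when no script is given.
        service["check"] = {"script": check_script, "interval": interval}
    return {"service": service}


if __name__ == "__main__":
    ping = ("curl --silent --show-error --fail --dump-header /dev/stderr "
            "--retry 2 http://127.0.0.1:8080/ping")
    print(json.dumps(service_definition("marathon", 8080, ["admin"], ping), indent=2))
```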

Slide 26

Slide 26 text

DNS REGISTRATION
# consul reload
# dig marathon.service.consul +short
45.55.95.218
45.55.95.215
45.55.162.9
If a health check fails, the entry will not show in DNS.

Slide 27

Slide 27 text

SERVICE TAGS
Tags are supported in DNS:
# dig admin.marathon.service.consul +short
45.55.95.218
45.55.95.215
45.55.162.9

Slide 28

Slide 28 text

DNS SRV RECORDS
Get the port for any service:
# dig zookeeper.service.consul SRV +short
1 1 2181 mi-control-01.node.dc1.consul.
1 1 2181 mi-control-03.node.dc1.consul.
1 1 2181 mi-control-02.node.dc1.consul.
Nodes are automatically registered in DNS. You can even query services and nodes in other DCs!
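The names queried on the DNS slides follow the pattern `[tag.]<service>.service[.<datacenter>].<domain>`. A short sketch that builds such names and parses one line of `dig ... SRV +short` output; both helpers are illustrative, and the default domain `consul` is taken from the examples above:

```python
def service_dns_name(service, tag=None, datacenter=None, domain="consul"):
    """Build [tag.]<service>.service[.<datacenter>].<domain>."""
    parts = [tag] if tag else []
    parts += [service, "service"]
    if datacenter:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)


def parse_srv_line(line):
    """Parse one SRV answer line: priority, weight, port, target."""
    priority, weight, port, target = line.split()
    return {
        "priority": int(priority),
        "weight": int(weight),
        "port": int(port),
        "target": target.rstrip("."),  # drop the trailing root dot
    }


if __name__ == "__main__":
    print(service_dns_name("marathon", tag="admin"))
    print(parse_srv_line("1 1 2181 mi-control-01.node.dc1.consul."))
```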

Slide 29

Slide 29 text

SIMPLIFY MESOS CONFIGURATION
Zookeeper config string: zk://zookeeper.service.consul:2181/mesos
Marathon config string: http://marathon.service.consul:8080
Mesos config string (we’ll discuss leader later): mesos://leader.mesos.service.consul:5050

Slide 30

Slide 30 text

BONUS! HEALTH CHECKS YOUR MESOS CLUSTER

Slide 31

Slide 31 text

HEALTH CHECKS ARE RUN BY THE NODES, EXPOSE STATE VIA API
curl -L http://localhost:8500/v1/health/service/chronos?pretty=true
[
  {
    "Node": {
      "Node": "mi-control-01",
      "Address": "45.55.95.218"
    },
    "Service": {
      "ID": "chronos",
      "Service": "chronos",
      "Tags": [ "chronos" ],
      "Address": "",
      "Port": 14400
    },
    "Checks": [
      {
        "Node": "mi-control-01",
        "CheckID": "service:chronos",
        "Name": "Service 'chronos' check",
        "Status": "critical",
        "Notes": "",
        "Output": "",
        "ServiceID": "chronos",
        "ServiceName": "chronos"
      },
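A client of the health endpoint usually wants only the instances whose checks are in an acceptable state. A minimal sketch of that filter over a response shaped like the one on the slide; the second sample entry (mi-control-02, passing) is invented for contrast and `instances_with_status` is a hypothetical helper:

```python
import json


def instances_with_status(health_entries, wanted=("passing",)):
    """Return (node, address, port) for entries whose checks are all in `wanted`."""
    result = []
    for entry in health_entries:
        statuses = [check["Status"] for check in entry["Checks"]]
        if all(status in wanted for status in statuses):
            node = entry["Node"]
            result.append((node["Node"], node["Address"], entry["Service"]["Port"]))
    return result


# Trimmed sample; the first entry mirrors the slide, the second is made up.
SAMPLE = json.loads("""
[
  {"Node": {"Node": "mi-control-01", "Address": "45.55.95.218"},
   "Service": {"ID": "chronos", "Service": "chronos", "Port": 14400},
   "Checks": [{"CheckID": "service:chronos", "Status": "critical"}]},
  {"Node": {"Node": "mi-control-02", "Address": "45.55.95.215"},
   "Service": {"ID": "chronos", "Service": "chronos", "Port": 14400},
   "Checks": [{"CheckID": "service:chronos", "Status": "passing"}]}
]
""")

if __name__ == "__main__":
    print(instances_with_status(SAMPLE))
```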

Slide 32

Slide 32 text

HEALTH CHECK EXIT CODES
Consul checks are compatible with Nagios/Sensu:
Exit code 0 - Check is passing
Exit code 1 - Check is warning
Any other code - Check is critical
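The exit-code convention is easy to mirror when testing a check script locally before handing it to Consul. A small sketch, assuming nothing beyond the three rules listed on the slide; `status_from_exit_code` and `run_check` are illustrative names:

```python
import subprocess
import sys


def status_from_exit_code(code: int) -> str:
    """Map a check's exit code to the status named on the slide."""
    if code == 0:
        return "passing"
    if code == 1:
        return "warning"
    return "critical"


def run_check(argv):
    """Run a check command and return (status, combined output)."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    output = completed.stdout + completed.stderr
    return status_from_exit_code(completed.returncode), output


if __name__ == "__main__":
    # Stand-in check: a Python one-liner that exits with code 1.
    print(run_check([sys.executable, "-c", "import sys; sys.exit(1)"]))
```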

Slide 33

Slide 33 text

Consul Key/Value Store
Consul ACLs

Slide 34

Slide 34 text

CONSUL K/V EXPOSED VIA API
curl -X PUT -d 'test' http://localhost:8500/v1/kv/web/key1
curl http://localhost:8500/v1/kv/?recurse
[{"CreateIndex":97,"ModifyIndex":97,"Key":"web/key1","Flags":0,"Value":"dGVzdA=="},
Or use consul-cli:
consul-cli kv-read --ssl nodes/config/test
Hello World
consul-cli kv-delete --ssl --consul=consul.service.consul:8500 --recurse nodes/config/test
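Values come back from the K/V API base64-encoded, which is why the stored string 'test' appears as "dGVzdA==". A minimal decoding sketch over the response entry shown on the slide; `decode_kv` is a hypothetical helper:

```python
import base64
import json


def decode_kv(entries):
    """Map each key to its decoded value; the API returns values base64-encoded."""
    decoded = {}
    for entry in entries:
        raw = entry.get("Value")
        # A key can exist with no value, in which case Value is null.
        decoded[entry["Key"]] = None if raw is None else base64.b64decode(raw).decode("utf-8")
    return decoded


RESPONSE = '[{"CreateIndex":97,"ModifyIndex":97,"Key":"web/key1","Flags":0,"Value":"dGVzdA=="}]'

if __name__ == "__main__":
    print(decode_kv(json.loads(RESPONSE)))
```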

Slide 35

Slide 35 text

CONSUL ACLS
• Only use in 0.5.2 or higher (upsert support)
• Master tokens are used to create ACL entries
• Every ACL entry has a token
• read/write/deny policy on k/v and service endpoints
• Can manage with API or

Slide 36

Slide 36 text

CONSUL-CLI
• Released today! (7/22/2015)
• Wraps the Consul API with an easy-to-use CLI for scripting
• Manages ACLs, Checks, Locks, K/V, HealthChecks, Services, Sessions, Raft Status
• https://github.com/CiscoCloud/consul-cli

Slide 37

Slide 37 text

CONSUL-CLI
• Example: distributed lock
$ ./consul-cli kv-lock --ttl=0 test/locks
ba7c8cda-d197-a062-4e3e-f9a737237aa1
$ ./consul-cli kv-read --format=prettyjson test/locks
{
  "Key": "test/locks",
  "CreateIndex": 386,
  "ModifyIndex": 386,
  "LockIndex": 1,
  "Flags": 0,
  "Value": "",
  "Session": "ba7c8cda-d197-a062-4e3e-f9a737237aa1"
}
$ ./consul-cli kv-unlock \
    --session=ba7c8cda-d197-a062-4e3e-f9a737237aa1 test/locks

Slide 38

Slide 38 text

Consul Template

Slide 39

Slide 39 text

CONSUL TEMPLATE
• Reads data from Consul k/v and service catalog
• Writes out text files based on Go text/template
• Can be used to dynamically configure systems and applications
{{range service "web@datacenter"}}
server {{.Name}} {{.Address}}:{{.Port}}
{{end}}
Becomes:
server nyc_web_01 123.456.789.10:8080
server nyc_web_02 456.789.101.213:8080
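For readers unfamiliar with Go templates, the range loop above simply emits one line per service instance. A rough Python equivalent of that rendering step, using the placeholder names and addresses from the slide; this is an illustration of the output, not how consul-template is implemented:

```python
def render_servers(services):
    """Emit one `server` line per instance, like the template's range loop."""
    return "\n".join(
        f"server {s['Name']} {s['Address']}:{s['Port']}" for s in services
    )


# Placeholder data copied from the slide's example output.
WEB = [
    {"Name": "nyc_web_01", "Address": "123.456.789.10", "Port": 8080},
    {"Name": "nyc_web_02", "Address": "456.789.101.213", "Port": 8080},
]

if __name__ == "__main__":
    print(render_servers(WEB))
```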

Slide 40

Slide 40 text

DYNAMIC ZOOKEEPER ENSEMBLE
• Update zoo.cfg as ZK nodes come up/down
• Writes out text files based on Go text/template
• Restarts Zookeeper nodes
• https://github.com/CiscoCloud/docker-zookeeper
{{ with $s := env "CONSUL_QUERY" }}
{{ range service $s "passing, warning" }}
ZK_HOSTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]={{.Address}}
ZK_CLIENT_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=2181
ZK_PEER_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=2888
ZK_ELECTION_PORTS[{{.ID | regexReplaceAll ".*:zkid-([0-9]*)" "$1"}}]=3888
{{end}}{{end}}
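The regexReplaceAll calls pull a numeric Zookeeper ID out of each service ID and use it as the array index. The same extraction in Python, as a sketch; the service ID format in the example ("...:zkid-3") is an assumption inferred from the pattern, and Python writes the capture group as `\1` where the template uses `$1`:

```python
import re

# Same pattern as in the template above.
ZKID_PATTERN = re.compile(r".*:zkid-([0-9]*)")


def zk_id(service_id: str) -> str:
    """Replace the whole service ID with the digits after ':zkid-'."""
    return ZKID_PATTERN.sub(r"\1", service_id)


def zk_hosts(instances):
    """Build the ZK_HOSTS mapping: Zookeeper ID -> address."""
    return {zk_id(inst["ID"]): inst["Address"] for inst in instances}


if __name__ == "__main__":
    # Hypothetical service instances for illustration.
    print(zk_hosts([
        {"ID": "node-a:zookeeper:zkid-1", "Address": "10.0.0.1"},
        {"ID": "node-b:zookeeper:zkid-2", "Address": "10.0.0.2"},
    ]))
```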

Slide 41

Slide 41 text

Consul Integration with Mesos

Slide 42

Slide 42 text

MESOS-CONSUL
• Dynamically adds Mesos tasks to Consul
• Located at https://github.com/CiscoCloud/mesos-consul
• Easy to run as Docker container via Marathon:
curl -X POST [email protected] -H "Content-Type: application/json" 'http://marathon.service.consul:8080/v2/apps'
• Mesos task shows up as: taskname.service.consul

Slide 43

Slide 43 text

MESOS-CONSUL
• Leader detection built-in. Use: leader.mesos.service.consul
• Mesos doesn’t have an event bus, so mesos-consul needs to poll every few seconds.
• Mesos (0.22.1 and earlier) doesn’t export Docker port mapping information, so all ports are registered to the same DNS name.
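Without an event bus, a poller has to compare each snapshot of running tasks with the previous one to decide what to register and deregister. A minimal sketch of that reconciliation step; it is not mesos-consul's actual code, and the task IDs, addresses and the `diff_tasks` helper are invented for illustration:

```python
def diff_tasks(previous, current):
    """Compare two polls of {task_id: (address, port)}.

    Returns (to_register, to_deregister): tasks that are new or whose
    endpoint changed, and IDs of tasks that disappeared.
    """
    to_register = {tid: ep for tid, ep in current.items() if previous.get(tid) != ep}
    to_deregister = sorted(tid for tid in previous if tid not in current)
    return to_register, to_deregister


if __name__ == "__main__":
    before = {"web.1": ("10.0.0.1", 31000), "web.2": ("10.0.0.2", 31001)}
    after = {"web.2": ("10.0.0.2", 31001), "web.3": ("10.0.0.3", 31002)}
    print(diff_tasks(before, after))
```

Polling trades freshness for simplicity: a task that starts and stops between two polls is never seen at all.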

Slide 44

Slide 44 text

MARATHON-CONSUL
• Dynamically adds Marathon tasks to Consul K/V. Can be used to build proxy configurations.
• Located at https://github.com/CiscoCloud/marathon-consul
• Easy to run as Docker container via Marathon
• Listens to Marathon event bus:
curl -X POST 'http://marathon.service.consul:8080/v2/eventSubscriptions?callbackUrl=http://marathon-consul.service.consul:4000/events'

Slide 45

Slide 45 text

New Pattern: Using Consul to unit test Cluster configuration

Slide 46

Slide 46 text

DISTRIBUTIVE
• https://github.com/CiscoCloud/distributive
• Single 4 MB binary, no gem or pip installs
• Checks defined in .json format
• Integrates with Consul, Nagios & Sensu
• Will verify every node’s configuration
• Cluster tests itself, no external tools needed

Slide 47

Slide 47 text

PUTTING IT ALL TOGETHER

Slide 48

Slide 48 text

Microservices-Infrastructure (we need a new name)

Slide 49

Slide 49 text

MICROSERVICES-INFRASTRUCTURE
• Integrates Mesos + Consul
• Easy deployment
• Includes Logstash, collectd, Docker, Mesos, Marathon, Chronos (and more coming)
• 1,300+ stars on GitHub
• Apache 2.0

Slide 50

Slide 50 text

MICROSERVICES-INFRASTRUCTURE
• Uses Terraform to provision to the following cloud providers:
  • AWS
  • Google Cloud
  • OpenStack
  • Digital Ocean
  • vSphere

Slide 51

Slide 51 text

GETTING SUPPORT
• Docs: http://microservices-infrastructure.readthedocs.org/en/latest/
• GitHub Issues: https://github.com/CiscoCloud/microservices-infrastructure/issues
• Gitter.im chat room: https://gitter.im/CiscoCloud/microservices-infrastructure
• Bug reports and pull requests welcome!

Slide 52

Slide 52 text

We Think it’s Awesome https://github.com/CiscoCloud/microservices-infrastructure

Slide 53

Slide 53 text

THANK YOU!
http://aster.is @Asteris_LLC