Operating a 100-Server-Scale Web Service with Consul and Homegrown OSS

FUJIWARA Shunichiro
August 21, 2015

Transcript

  1. Operating a 100-Server-Scale Web Service with Consul and Homegrown OSS. YAPC::Asia 2015 @fujiwara

  2. FUJIWARA Shunichiro @fujiwara github.com/fujiwara sfujiwara.hatenablog.com Engineering Department

  3. Agenda Lobi Consul Consul & My OSS in Lobi

  4. Game & Community

  5. Lobi: communities specialized for smartphone games

  6. Lobi: communities specialized for smartphone games

  7. Lobi's servers

  8. History of Lobi's servers
    1. 2010~2011: AWS (US), 4 servers (?)
    2. 2011~2013: in-house servers, 4 to 20 servers
    3. 2013.11~: AWS (Tokyo), 20 to 100 servers
  9. None
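  10. Lobi's servers
    Many kinds of hosts run on EC2
    • app, sdk, stream, sdk-stream, db (3 shards), transcode, log (aggregate, analyze), batch, deploy...
    Implementation languages: Perl, Node.js, Go
    Assorted middleware
    • Nginx, MySQL, Starlet, Fluentd, Norikra, memcached, HAProxy, gearman, twemproxy, MHA, dnsmasq...
  11. Pain points after migrating to AWS (Tokyo) (2014)
    No internal DNS
    • /etc/hosts was generated with Chef
    • Running our own DNS servers is a hassle (Internal Route53 did not exist at the time)
    Hosts are added and removed frequently
    • We also wanted auto scaling
    Let's use Consul!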
  12. Consul?

  13. What's Consul www.consul.io - HashiCorp product • Service Discovery •

    Health Checking • Key/Value Store • Multi Datacenter
  14. Architecture

  15. None
  16. None
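  17. Agent
    A daemon that runs on every node in the cluster
    Runs in either Client mode or Server mode
    Provides DNS and HTTP interfaces to users
    • Users basically talk to the agent on localhost
    • Agents talk to each other over RPC (you rarely need to be aware of this)
    Written in Go and runs as a single binary (the CLI is the same binary)
  18. Server
    Runs on a small number of nodes in the cluster (an odd number of 3 or more is recommended)
    Exactly one server is the Leader at any time
    • The Leader is elected by the Raft algorithm
    Data is always written to the Leader
    Data is replicated from the Leader to the other Servers

  19. Raft
    Raft is a protocol for implementing distributed consensus.
    A consensus algorithm (protocol) for distributed environments
    Electing a Leader requires agreement from a majority of server nodes, so at least 3 nodes are needed
  20. Raft references
    thesecretlivesofdata.com/raft/
    • Easy to understand thanks to its animations
    www.slideshare.net/pfi/raft-36155398
    • Explanatory material in Japanese
    github.com/hashicorp/raft
    • The Go implementation that Consul uses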
  21. Service / Node Discovery

  22. Service / Node Discovery
    All nodes in the service form one cluster
    • Discover nodes and the services offered on each node
    • DNS, HTTP API
    Nodes that fail health monitoring are dropped from responses automatically
  23. None
  24. Node Discovery
    consul members lists the nodes in the cluster
    $ consul members
    Node             Address            Status  Type    Build  Protocol  DC
    my-app-i-123456  192.168.1.12:8301  alive   server  0.5.2  2         dc1
    my-app-i-234567  192.168.1.23:8301  alive   server  0.5.2  2         dc1
    my-app-i-345678  192.168.1.34:8301  alive   server  0.5.2  2         dc1
    my-db-i-456789   192.168.1.45:8301  alive   client  0.5.2  2         dc1
    my-db-i-567890   192.168.1.56:8301  alive   client  0.5.2  2         dc1
    my-db-i-678901   192.168.1.67:8301  alive   client  0.5.2  2         dc1
    my-app-i-987654  192.168.1.99:8301  failed  client  0.5.2  2         dc1
    my-app-i-876543  192.168.1.87:8301  left    client  0.5.2  2         dc1
    (In practice the order of the results is not fixed)
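The table printed by consul members is plain whitespace-separated text, so it is easy to post-process. The following is a minimal sketch, assuming the column layout shown on the slide; the function names and the shortened SAMPLE text are hypothetical and only for illustration.

```python
from typing import Dict, List


def parse_members(output: str) -> List[Dict[str, str]]:
    """Parse the whitespace-separated table printed by `consul members`."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()  # Node, Address, Status, Type, Build, Protocol, DC
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def nodes_with_status(output: str, status: str = "alive") -> List[str]:
    """Return the node names whose Status column equals `status`."""
    return [m["Node"] for m in parse_members(output) if m["Status"] == status]


# Shortened copy of the slide's output (hypothetical sample for illustration).
SAMPLE = """\
Node             Address            Status  Type    Build  Protocol  DC
my-app-i-123456  192.168.1.12:8301  alive   server  0.5.2  2         dc1
my-db-i-456789   192.168.1.45:8301  alive   client  0.5.2  2         dc1
my-app-i-987654  192.168.1.99:8301  failed  client  0.5.2  2         dc1
"""
```

In real use the text would come from running the command, for example via subprocess.run(["consul", "members"], capture_output=True, text=True).stdout.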
  25. Node Discovery
    Status=failed: a node whose agent failed health monitoring
    $ consul members -status failed
    Node             Address            Status  Type    Build  Protocol  DC
    my-app-i-987654  192.168.1.99:8301  failed  server  0.5.2  2         dc1
    Status=left: a node that left the cluster gracefully
    $ consul members -status left
    Node             Address            Status  Type    Build  Protocol  DC
    my-app-i-876543  192.168.1.87:8301  left    server  0.5.2  2         dc1
  26. Node Discovery via DNS interface
    Query the consul agent (127.0.0.1:8600)
    $ dig @127.0.0.1 -p 8600 my-app-i-123456.node.consul
    ;; QUESTION SECTION:
    ;my-app-i-123456.node.consul. IN A
    ;; ANSWER SECTION:
    my-app-i-123456.node.consul. 0 IN A 192.168.1.12
  27. Node Discovery via DNS interface
    Status=failed: the address still resolves via DNS
    • The node may only have dropped out temporarily because of an incident
    Status=left: the address no longer resolves via DNS
    • Forwarding name resolution for the .consul domain to the consul agent lets you use it as internal DNS
    • The domain name can be changed in the configuration
  28. Service Definition of App
    my-app-i-*:/etc/consul.d/app.json
    {
      "service": {
        "name": "app",
        "port": 3000
      }
    }
  29. Service Discovery of App via DNS interface
    Query the consul agent (127.0.0.1:8600)
    $ dig @127.0.0.1 -p 8600 app.service.consul
    ;; QUESTION SECTION:
    ;app.service.consul. IN A
    ;; ANSWER SECTION:
    app.service.consul. 0 IN A 192.168.1.12
    app.service.consul. 0 IN A 192.168.1.23
    app.service.consul. 0 IN A 192.168.1.34
  30. Service Discovery via DNS interface
    The order of the answers is random (≒ Round Robin)
    For UDP queries, only 3 addresses are returned when 4 or more match
    • Over TCP, all of them are returned
    TTL is configurable (default 0)
  31. Service Discovery of App via HTTP API
    Access http://127.0.0.1:8500
    $ curl http://127.0.0.1:8500/v1/catalog/service/app
    [
      { "Node": "my-app-i-123456", "Address": "192.168.1.12", "ServiceID": "app", "ServiceName": "app", "ServicePort": 3000, ... },
      { "Node": "my-app-i-234567", "Address": "192.168.1.23", ... },
      { "Node": "my-app-i-345678", "Address": "192.168.1.34", ... }
    ]
  32. Service Definition of DB (master)
    my-db-i-456789:/etc/consul.d/db.json
    {
      "service": {
        "name": "db",
        "port": 3306,
        "tags": ["master"]
      }
    }
  33. Service Definition of DB (slave)
    my-db-i-567890,678901:/etc/consul.d/db.json
    {
      "service": {
        "name": "db",
        "port": 3306,
        "tags": ["slave"]
      }
    }
  34. Service Discovery of DB (master/slave)
    Names resolve as {tag}.{service}.service.consul
    $ dig @127.0.0.1 -p 8600 master.db.service.consul
    master.db.service.consul. 0 IN A 192.168.1.45
    $ dig @127.0.0.1 -p 8600 slave.db.service.consul
    slave.db.service.consul. 0 IN A 192.168.1.67
    slave.db.service.consul. 0 IN A 192.168.1.56
    $ dig @127.0.0.1 -p 8600 db.service.consul
    db.service.consul. 0 IN A 192.168.1.56
    db.service.consul. 0 IN A 192.168.1.45
    db.service.consul. 0 IN A 192.168.1.67
  35. External Service
    Names and IP addresses that are defined in external DNS and not tied to a node can also be defined as services
    $ curl -X PUT -d '{
        "Node":"rds",
        "Address":"my-rds.xxxxx.ap-northeast-1.rds.amazonaws.com",
        "Service":{"Service": "rds"}
      }' http://127.0.0.1:8500/v1/catalog/register
    $ dig @127.0.0.1 -p 8600 rds.service.consul.
    ;; ANSWER SECTION:
    rds.service.consul. 0 IN CNAME my-rds.xxxxx.ap-northeast-1.rds.amazonaws.com.
    my-rds.xxxxx.ap-northeast-1.rds.amazonaws.com. 10 IN A 192.168.1.100
  36. External Service
    When registering a DNS name, set recursors (the addresses of DNS servers that resolve external names) in the consul configuration
    • Without it, only the CNAME is returned
    The consul agent does not cache the results of external name resolution
  37. Health Checking

  38. Health Checking
    Three types of health check can be defined for each service
    • Script
    • HTTP
    • TTL

  39. Health Checking by script
    Runs a user-defined health check command
    The exit code reports the state (Nagios/Sensu compatible)
    • 0 : success
    • 1 : warning
    • 2 : fail
    On fail, the service is dropped from DNS and HTTP responses
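A script check only has to turn an observation into one of those exit codes. The following is a minimal sketch of such a check, assuming disk usage as the thing being watched; the thresholds and function names are hypothetical and not taken from the talk.

```python
import shutil

# Exit codes as listed on the slide (Nagios/Sensu compatible).
SUCCESS, WARNING, FAIL = 0, 1, 2


def classify(used_percent: float, warn: float = 80.0, crit: float = 90.0) -> int:
    """Map a usage percentage to a health check exit code.

    The 80/90 thresholds are illustrative assumptions.
    """
    if used_percent >= crit:
        return FAIL
    if used_percent >= warn:
        return WARNING
    return SUCCESS


def disk_used_percent(path: str = "/") -> float:
    """Return how full the filesystem containing `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

A script registered as a check would finish with something like sys.exit(classify(disk_used_percent("/"))), so the agent sees 0, 1, or 2.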
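  40. Health Checking by HTTP
    The consul agent makes an HTTP request
    • HTTP 2xx : success
    • HTTP 429 : warning
    • Anything else : fail
  41. Health Checking by TTL
    The service periodically sends an HTTP PUT to the agent to report that it is alive
    If no PUT arrives before the TTL expires, it is treated as fail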

  42. Key/Value Store

  43. Key/Value Store
    A KVS that stores and retrieves arbitrary values
    $ curl -XPUT -d 'test' 'http://127.0.0.1:8500/v1/kv/web/key1'
    true
    $ curl http://127.0.0.1:8500/v1/kv/web/key1
    [{ "CreateIndex": 112, "ModifyIndex": 112, "LockIndex": 0, "Key": "web/key1", "Flags": 0, "Value": "dGVzdA==" }]
  44. Key/Value Store
    Metadata (flags) can be stored via a URL parameter
    It is a 64-bit int; what it means is up to the user
    $ curl -XPUT -d 'test' 'http://127.0.0.1:8500/v1/kv/web/key1?flags=123'
    true
    $ curl 'http://127.0.0.1:8500/v1/kv/web/key1'
    [{ ... "Key": "web/key1", "Flags": 123, // <------- this
       "Value": "dGVzdA==" }]
  45. Key/Value Store
    Values in the JSON response are Base64 encoded
    To get just the raw value instead of JSON, use the raw parameter
    $ curl "http://127.0.0.1:8500/v1/kv/web/key1?raw"
    test

  46. Key/Value Store
    To fetch the values under a given level recursively, use the recurse parameter
    $ curl -s "http://127.0.0.1:8500/v1/kv/web/?recurse"
    [
      {"CreateIndex":112,"ModifyIndex":115,"LockIndex":0, "Key":"web/key1","Flags":123,"Value":"dGVzdA=="},
      {"CreateIndex":122,"ModifyIndex":122,"LockIndex":0, "Key":"web/key2","Flags":0,"Value":"dGVzdDI="},
      {"CreateIndex":124,"ModifyIndex":124,"LockIndex":0, "Key":"web/test/1","Flags":0,"Value":"dGVzdDM="}
    ]
    This can also be used for backups
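Because every Value in a recurse response is Base64 encoded, a backup script has to decode each entry. The following is a minimal sketch of that step, assuming the values are UTF-8 text; the function name decode_kv_entries and the SAMPLE constant are hypothetical, and the sample body is the response shown on the slide.

```python
import base64
import json
from typing import Dict


def decode_kv_entries(body: str) -> Dict[str, str]:
    """Turn a `?recurse` JSON response into a {key: decoded value} mapping."""
    entries = json.loads(body)
    return {
        # Consul returns null for an empty value, so fall back to "".
        entry["Key"]: base64.b64decode(entry["Value"] or "").decode("utf-8")
        for entry in entries
    }


# Response body copied from the slide.
SAMPLE = """[
 {"CreateIndex":112,"ModifyIndex":115,"LockIndex":0,
  "Key":"web/key1","Flags":123,"Value":"dGVzdA=="},
 {"CreateIndex":122,"ModifyIndex":122,"LockIndex":0,
  "Key":"web/key2","Flags":0,"Value":"dGVzdDI="},
 {"CreateIndex":124,"ModifyIndex":124,"LockIndex":0,
  "Key":"web/test/1","Flags":0,"Value":"dGVzdDM="}
]"""
```

Writing the resulting mapping to a file with json.dump would give a simple snapshot; note that this sketch drops the Flags field, which a real backup would probably want to keep.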
  47. Key/Value Store Benchmark GET
    $ wrk -c 10 -d 10 -t 2 http://127.0.0.1:8500/v1/kv/web/key1
    Server (Leader): 41,832 requests/sec
    Server/Client (Follower): 17,281
    Server (Follower), stale mode: 37,013
    Client (Follower), stale mode: 16,938
    Consul v0.5.2 on EC2 c4.2xlarge, GOMAXPROCS=4
  48. Key/Value Store Benchmark PUT
    $ wrk -c 10 -d 10 -t 2 -s put.lua http://127.0.0.1:8500/v1/kv/web/key1
    -- put.lua
    wrk.method = "PUT"
    wrk.body = "test"
    wrk.headers["Content-Type"] = "application/x-www-form-urlencoded"
    Server/Client (Follower): 427.56 requests/sec
  49. Key/Value Store
    You can always just use 127.0.0.1:8500, which keeps things simple
    It is quite fast, so GET can be used freely
    By default, every query is handled by the Leader node
    • In stale mode, Servers other than the Leader can respond too (consistency is covered later)
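  50. Using Consul as internal DNS

  51. Using Consul as internal DNS
    Point node and service name resolution at the Consul Agent
    • resolv.conf cannot specify a port
    • Port 53 requires privileges → running the Agent as root just for that is undesirable
    • The Agent can (more or less) do recursive queries, but it has no cache
    → Forward only the .consul domain from dnsmasq, bind, or similar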
  52. None
  53. Using Consul as internal DNS
    Run dnsmasq on every host
    Name resolution for the .consul domain goes to the consul agent
    dnsmasq listens on 127.0.0.1:53
    # dnsmasq.conf
    server=/consul/127.0.0.1#8600
    bind-interfaces
    listen-address=127.0.0.1
  54. Using Consul as internal DNS
    Specify (node|service).consul as search domains in resolv.conf
    → You can connect using just the node name or service name
    # /etc/resolv.conf
    search node.consul service.consul
    nameserver 127.0.0.1    # dnsmasq
    nameserver 172.16.0.2   # VPC resolver
    nameserver 172.16.0.254 # Unbound on EC2
  55. Using Consul as internal DNS
    consul.io/docs/guides/forwarding.html
    An example that uses BIND

  56. Running Consul in production

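  57. Deployment Table
    Server nodes: at least 3 in production; 3 or 5 is recommended
    consul.io/docs/internals/consensus.html

  58. Resources a Server needs
    • CPU: 2 CPUs are enough
    • Running with GOMAXPROCS=2 or higher is strongly recommended
    • Memory: 20MB and up
    • Disk: 2MB and up
    Memory and disk depend on how heavily KV is used
  59. Does a Server need a dedicated host?
    The consul agent itself does not use many resources, so it can share a host, but...
    Raft heartbeats tend to fail when disk I/O is under heavy load
    • Timeout 500ms
    • A failed heartbeat triggers a Leader election
    • The election normally completes in 2 to 3 seconds
    • Writes cannot be processed during that time
  60. Daemonize
    The consul agent itself has no daemon mode
    • Daemontools
    • RPM & init script
    • github.com/tomhillable/consul-rpm
    • Systemd
    Use whichever you prefer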
  61. Bootstrapping
    After starting, the consul agent has to join the cluster
    1. Run the consul join command
    2. Specify start_join in the configuration file
    3. Atlas integration
  62. 1. consul join
    Join by specifying server addresses
    $ consul join 192.168.1.11 192.168.1.12 192.168.1.13
    Hard to automate, so we do not use it outside of testing
  63. 2. start_join
    Specify the server addresses in consul.conf
    { "start_join": [ "192.168.1.11", "192.168.1.12", "192.168.1.13"] }
    Give the consul servers fixed IP addresses, or make them resolvable through a separate DNS
  64. 3. Atlas integration
    Atlas - atlas.hashicorp.com
    A service that integrates Vagrant, Packer, Terraform, and Consul
    $ consul agent ... \
        -atlas=ATLAS_USERNAME/infrastructure \
        -atlas-join \
        -atlas-token="YOUR_ATLAS_TOKEN"
    • Easy, because you do not need to manage Server addresses
    • $40/node for 11 nodes or more
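  65. For high availability

  66. For high availability
    The number of nodes that can fail at the same time without causing problems depends on the number of Servers
    • 3 nodes → 1
    • 5 nodes → 2
    In a 3-node setup, if 2 go down and only 1 remains, no Leader can be elected
    For maintenance that takes a node out for a long time, temporarily adding Server nodes is an option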
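The figures above follow from Raft's majority rule: a Leader needs votes from more than half of the Server nodes. A minimal sketch of that arithmetic (the function names are hypothetical):

```python
def quorum(servers: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def tolerated_failures(servers: int) -> int:
    """How many servers can be down while a majority is still reachable."""
    return servers - quorum(servers)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, "servers: quorum", quorum(n), "tolerates", tolerated_failures(n))
```

Four servers tolerate only one failure, the same as three, which is one way to see why an odd number of Servers is recommended.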
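  67. For high availability
    Server node failover is automatic
    • Users only ever need to look at the agent on localhost

  68. Impact of a node failure
    • Not the Leader → no impact on other nodes
    • The Leader → Leader re-election
    By default, the Leader handles all reads and writes (strong consistency)
    No access (DNS, HTTP) until a Leader is decided
  69. Stale mode (DNS)
    Leader re-election normally completes in 2 to 3 seconds
    Want to keep resolving Node and Service names over DNS during that time?
    → Stale mode: can respond even when no Leader has been elected
    "dns_config":{
      "allow_stale": true, // default false
      "max_stale": "10s"   // default 5s
    }
    Results may be out of date (eventual consistency)
  70. DNS TTL
    The default TTL is 0 → nothing is cached
    TTLs can be set separately for nodes and services
    A DNS cache placed in front can then cache the answers
    "dns_config":{
      "node_ttl": "60s",
      "service_ttl": { "*": "15s" }
    }
  71. Stale mode (HTTP API)
    To use stale mode with the HTTP API, pass the stale parameter
    $ curl "http://127.0.0.1:8500/v1/kv/web/key1?stale"
    Access during a Leader election without the stale parameter → 500 Internal Server Error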
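  72. Upgrading while in operation
    consul.io/docs/upgrading.html
    consul.io/docs/upgrade-specific.html
    There are version-specific caveats, so check the documentation
    Rolling upgrades are possible by replacing the Agents one at a time
    (Replacing the Leader node does trigger a re-election)

  73. Stability
    In operation for over a year, since the v0.2 days
    The Agent process has never crashed
    • Taking down all Servers at once through an operator error is unrecoverable
    → Back up KV regularly (do not put user data in it!)

  74. Monitoring the Agent itself
    The Agent process itself has to be monitored separately
    • Process monitoring (consul agent)
    • TCP/UDP 8600 (DNS)
    • TCP 8500 (HTTP)
    • Detect content changes at http://127.0.0.1:8500/v1/status/leader
    • It changes when the Leader is lost and re-elected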
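  75. Handling servers that come and go in an auto scaling environment

  76. 2014.02~ Lobi Rec SDK released
    An SDK embedded in smartphone apps records gameplay video
    Videos are uploaded to the server, converted, and viewed
    The conversion feature was implemented with ElasticTranscoder

  77. ElasticTranscoder
    • It is a managed service, so it is easy to manage
    • Conversion capacity scales on its own
    • A bit pricey...
    • SD $0.017/min
    • HD $0.034/min
    4 patterns per post, 2 minutes on average = $0.204 ≒ 25 yen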
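The $0.204 figure can be reproduced from the per-minute prices. The slide does not say how the 4 output patterns split between SD and HD; the sketch below assumes 2 SD and 2 HD outputs, which is the split that yields the stated total. That split and the function name are assumptions.

```python
SD_PER_MIN = 0.017  # USD per minute, from the slide
HD_PER_MIN = 0.034  # USD per minute, from the slide


def cost_per_post(minutes: float, sd_outputs: int, hd_outputs: int) -> float:
    """Transcoding cost of one post in USD."""
    return minutes * (sd_outputs * SD_PER_MIN + hd_outputs * HD_PER_MIN)


# Assumed split: 2 SD + 2 HD outputs for a 2-minute video.
ASSUMED_COST = cost_per_post(minutes=2, sd_outputs=2, hd_outputs=2)
```

Under that assumption the cost is 2 × (2 × 0.017 + 2 × 0.034) = 0.204 USD per post.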
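  78. 2014.10 Monster Strike adopts the Lobi Rec SDK!

  79. Monster Strike adopts the Lobi Rec SDK
    Because it is a mega-hit game, a huge volume of videos...

  80. Video conversion on EC2 Spot Instances
    • A hassle to manage
    • Does not scale on its own
    • Overwhelmingly cheaper than ElasticTranscoder
    • ElasticTranscoder = $0.204/post
    • EC2 Spot cc2.8xlarge (32 cores) = $0.45/hour
  81. Auto scaling with Spot Instances

  82. Auto scaling with Spot Instances
    • Scale out: based on the number of queued jobs, not CPU load
    • CPU at 100% is fine as long as jobs are not piling up
    • Scale in: when CPU idle is 25% or more across the fleet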
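The scaling rule above can be written as a small decision function. This is a sketch only: the slide gives the 25% idle threshold but not how many queued jobs count as a backlog, so jobs_per_worker is a hypothetical parameter, as are the function name and return convention.

```python
def desired_change(queued_jobs: int, workers: int, cpu_idle_percent: float,
                   jobs_per_worker: int = 4) -> int:
    """Return +1 to scale out, -1 to scale in, 0 to stay put.

    jobs_per_worker is an assumed backlog threshold, not a value from the talk.
    """
    # Scale out on backlog, regardless of CPU load.
    if queued_jobs > workers * jobs_per_worker:
        return 1
    # Scale in when the fleet as a whole is at least 25% idle.
    if cpu_idle_percent >= 25.0:
        return -1
    # Busy CPUs with no backlog are fine: do nothing.
    return 0
```

Driving the decision from queue length rather than CPU avoids adding instances when the existing ones are merely fully used, which is the normal state for a transcoding worker.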
  83. Instance startup in an auto scaling environment
    1. Assign a unique hostname automatically
    2. consul join
    3. Provision with Chef
    4. Deploy the latest version of the application
    5. Register with Zabbix automatically and add to monitoring
  84. 1. Assign a unique hostname automatically
    Consul requires unique hostnames
    Set the type of host being launched with Cloud-Init
    #cloud-config
    runcmd:
      - [sh, -c, 'echo "HOSTNAME_PREFIX=transcode" > /etc/sysconfig/hostname-prefix']
    Load it in rc.local
    # /etc/rc.local
    if [ -f /etc/sysconfig/hostname-prefix ]; then
      . /etc/sysconfig/hostname-prefix
    fi
  85. 1. Assign a unique hostname automatically
    Set the hostname to prefix + InstanceID and apply it as the EC2 Name tag
    # /etc/rc.local
    instance_id=$(curl -s 169.254.169.254/latest/meta-data/instance-id)
    new_hostname="${HOSTNAME_PREFIX}-$instance_id"
    hostname $new_hostname
    aws ec2 create-tags \
      --resources $instance_id \
      --tags "Key=Name,Value=$new_hostname"
  86. 2. Join the Consul cluster
    consul agent -node $(hostname)
    The host automatically becomes resolvable through internal DNS

  87. 3. Provision with Chef
    Fetch the JSON for the host type from Consul KV and run it
    JSON=$(curl -s "localhost:8500/v1/kv/nodes/bootup/${HOSTNAME_PREFIX}.json?raw")
    echo "$JSON" > /tmp/chef-bootup.json
    chef-client -j /tmp/chef-bootup.json
    The host is registered with the Chef Server on the first Chef run
  88. 4. Deploy the latest version of the application
    Deploy with Stretcher (described later)
    5. Register with Zabbix automatically and add to monitoring
    Start zabbix-agent after the deploy has run → automatic registration
    Starting it before the deploy completes would fire alerts

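  89. Deploying with Stretcher
    Dan Zen https://www.flickr.com/photos/danzen/2288626158/

  90. What's Stretcher
    github.com/fujiwara/stretcher
    A deployment tool that works together with Consul / Serf

  91. Why Stretcher?
    Previously: deploys from a central host using Archer (rsync)
    • Push cannot cope with auto scaling
    • Pull with rsync from each host? → What if rsync runs in the middle of a build...
    • git pull from each host? → We do not want build artifacts from grunt, Go, etc. in the repository
    • With many hosts, both ssh+rsync and git pull are painful
    • We cannot wait for AMIs to be rebuilt and swapped in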
  92. Inspired by github.com/sorah/mamiya & AWS CodeDeploy

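  93. Design principles
    Works outside AWS too
    Keeps the existing flow of rsync & command execution
    Uses Consul events for event notification
    Loosely coupled with Consul (launched by consul watch)
    Written in Go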
  94. Architecture

  95. Consul event
    A mechanism for sending events to each node over the Gossip Protocol
    $ consul event -name EVENT_NAME [-node REGEX] PAYLOAD
    Event ID: 3b1f3199-6e69-4b82-4812-b35058864fdd
    Sends the payload under the given event name (to nodes matching the regular expression)
  96. Consul watch
    A mechanism for running a command when the specified event is received
    $ consul watch -type event -name EVENT_NAME COMMAND
    The payload is passed as JSON on standard input
    [{ "ID": "3b1f3199-6e69-4b82-4812-b35058864fdd", "Name": "test", "Payload": "TXkgcGF5bG9hZA==", ... }]
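A command launched by consul watch therefore has to read that JSON array from standard input and Base64-decode each Payload. The following is a minimal sketch of such a handler's parsing step, assuming UTF-8 text payloads; the function name parse_events is hypothetical, and this is not how Stretcher itself is implemented.

```python
import base64
import json
from typing import List, Tuple


def parse_events(stdin_text: str) -> List[Tuple[str, str]]:
    """Return (event name, decoded payload) pairs from consul watch input."""
    # Be defensive: treat empty input or a JSON null as "no events".
    events = json.loads(stdin_text) if stdin_text.strip() else []
    return [
        (event["Name"], base64.b64decode(event.get("Payload") or "").decode("utf-8"))
        for event in (events or [])
    ]
```

A handler script would call parse_events(sys.stdin.read()) and act on the pairs; for the sample on the slide the payload decodes to the text "My payload".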
  97. Deployment process
    1. Build the application and pack it into a tar.gz
       Bundle everything, including dependent CPAN modules
    2. Write the procedure (manifest)
    3. Upload the tar.gz and manifest to S3 (or an httpd)
    4. Announce the manifest URL with consul event
       consul event -name deploy s3://...
    ✄---------- everything up to here is outside stretcher ---------✄
  98. ✄------------ stretcher starts here ------------✄
    consul watch -type event -name deploy stretcher
    1. Get the manifest URL from the event
    2. Fetch the tar.gz and extract it into TMPDIR
    3. Update with rsync -av --delete
    4. Run commands
    • Restarting the application, etc.
  99. Manifest
    src: s3://example.com/app.tar.gz
    checksum: e0840daaa97cd2cf2175f9e5d133ffb3324a2b93
    dest: /home/stretcher/app
    commands:
      pre:
        - echo 'starting deploy'
      post:
        - echo 'deploy done'
      success:
        - cat >> /path/to/success.log
      failure:
        - cat >> /path/to/failure.log
    excludes:
      - "*.pid"
      - "*.socket"
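The checksum field lets each host confirm that it fetched the intended archive before rsync touches anything. The following is a sketch of such a check, not Stretcher's actual code: it assumes the hash algorithm is picked from the length of the hex digest (the 40-character value above would then be SHA-1), and the function names are hypothetical.

```python
import hashlib

# Assumption: infer the algorithm from the hex digest length.
_ALGO_BY_HEX_LEN = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}


def checksum_matches(data: bytes, expected_hex: str) -> bool:
    """Check `data` against a hex digest, choosing the algorithm by length."""
    algo = _ALGO_BY_HEX_LEN.get(len(expected_hex))
    if algo is None:
        raise ValueError("unsupported checksum length: %d" % len(expected_hex))
    return hashlib.new(algo, data).hexdigest() == expected_hex.lower()


def file_checksum_matches(path: str, expected_hex: str) -> bool:
    """Same check for a file on disk (reads the whole file into memory)."""
    with open(path, "rb") as f:
        return checksum_matches(f.read(), expected_hex)
```

For an archive of a few hundred megabytes, a real implementation would hash in chunks rather than read the file in one go.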
  100. Deploys at Lobi
    The tar.gz is about 200MB, about 400MB when extracted
    • CPAN modules 110MB
    • node_modules 10MB × 5
    • Go app binaries 8MB × 5
    • Static files (we would rather put these on S3)
  101. Deploys at Lobi
    1. Push to production branch
    2. Build (1 min~): carton install, grunt, npm install, go build ...
    3. Pack tar.gz & Upload (1 min)
    4. Deploy by Stretcher (10~30 sec)
  102. Deploys at Lobi
    Completes 10 to 20 seconds after the consul event is sent
    2015/08/05 14:58:30 Starting up stretcher agent
    2015/08/05 14:58:30 Waiting for consul events from STDIN...
    2015/08/05 14:58:30 Executing manifest: s3://...
    2015/08/05 14:58:33 Extract archive: /dev/shm/stretcher539962129 to /dev/shm/stretcher_src648982332
    2015/08/05 14:58:36 rsync [-av --delete --exclude-from /dev/shm/stretcher_src648982332/conf/rsync_exclude.web /dev/shm/stretcher_src648982332/ /home/xxx/web/]
    2015/08/05 14:58:36 sending incremental file list
    ...
    sent 787780 bytes received 5230 bytes 1586020.00 bytes/sec
    total size is 359702435 speedup is 453.59
    2015/08/05 14:58:36 invoking command: /home/xxx/web/refresh_services.sh
    2015/08/05 14:58:41 success.
    2015/08/05 14:58:41 Deploy manifest succeeded.
    Need to roll back? → Send the previous manifest as an event → back in 10 sec
  103. Waiting for a deploy to finish
    The host that sent the event cannot tell when execution has finished
    • Some hosts might fail
    • We want to wait until the deploy has completed on every running host
    ...and we want to see the logs too, but how, for 100 hosts?

  104. Consul KV Dashboard

  105. Chef, Serverspec, Deploy, etc...
    Things whose results on each host we want to check
    • Chef
    • Serverspec
    • Stretcher
    • etc
  106. Notify IRC / Slack?
    An in-house nopaste-cli command
    $ nopaste-cli -channel "lobi" -summary "Deploy done!" < deploy.log
    It POSTs to nopaste and announces the URL at the same time
    With 100 hosts, that means 100 notifications...
    We want to see them all in one list somewhere!
  107. Consul KV Dashboard
    github.com/fujiwara/consul-kv-dashboard
    A dashboard web application that uses Consul KV as its data store

  108. Registering data
    Send it directly with the Consul HTTP API
    $ curl -X PUT -d "message" \
      '127.0.0.1:8500/v1/kv/dashboard/example/myhostname?flags=1422607461000'
  109. Key structure
    /v1/kv/dashboard/{category}/{nodename}?flags=({unixtime} * 1000 + {status})
    • category: chef, serverspec, deploy...
    • nodename: the Consul node name
    • flags: unixtime * 1000 + status
    • status: 0=Success 1=Warning 2=Danger 3=Info
  110. Updating the screen in real time
    Use blocking queries
    consul.io/docs/agent/http.html
    $ curl -i 127.0.0.1:8500/v1/kv/dashboard/chef/myhost?recurse
    HTTP/1.1 200 OK
    Content-Type: application/json
    X-Consul-Index: 261975
    [{"CreateIndex":261891,"ModifyIndex":261975,"LockIndex":0, "Key":"dashboard/chef/myhost","Flags":1422602855000,"Value":".....
  111. Blocking query
    Pass the X-Consul-Index response header as a parameter of the next request
    $ curl -i 127.0.0.1:8500/v1/kv/dashboard/chef/myhost
    HTTP/1.1 200 OK
    X-Consul-Index: 261975
    ...
    $ curl 127.0.0.1:8500/v1/kv/dashboard/chef/myhost?index=261975
    The response is delayed until new data appears
    In other words, long polling
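The two curl calls above are one iteration of a polling loop: remember X-Consul-Index, then send it back as index. The following is a minimal sketch of that loop's building blocks using only the standard library; the function names are hypothetical, and the wait=30s cap (a standard blocking-query parameter) is a value chosen for illustration.

```python
import json
import urllib.request
from urllib.parse import urlencode


def blocking_url(base, key, index=None, wait="30s"):
    """Build the KV URL; add index/wait only once an index is known."""
    params = {"index": index, "wait": wait} if index is not None else {}
    query = urlencode(params)
    return f"{base}/v1/kv/{key}" + (f"?{query}" if query else "")


def poll_once(base, key, index=None):
    """Issue one (possibly blocking) request; return (new index, entries)."""
    with urllib.request.urlopen(blocking_url(base, key, index)) as resp:
        new_index = int(resp.headers["X-Consul-Index"])
        return new_index, json.loads(resp.read().decode("utf-8"))
```

A watcher would call poll_once in a loop, feeding each returned index into the next call and reacting only when the index changes; a missing key makes urlopen raise an HTTPError, which a real loop would need to handle.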
  112. Putting blocking queries to use
    Consul Template
    github.com/hashicorp/consul-template
    Reflects state changes in KV, nodes, services, etc. immediately
    Template → file update → command execution
    $ consul-template \
        -consul 127.0.0.1:8500 \
        -template "/tmp/template.ctmpl:/var/www/nginx.conf:service nginx restart"
  113. What consul-kv-dashboard does (1)
    A web application written in Go
    Serves HTML, CSS, and JavaScript
    • Files bundled with go-bindata are served by http.FileServer
    • Placing a single binary and starting it is enough to serve the static files too
  114. What consul-kv-dashboard does (2)
    A reverse proxy to the Consul HTTP API
    • /api/... → 127.0.0.1:8500/v1/kv/dashboard/...
    • Reshapes the response JSON
      {"Flags":1422608524001} → {"timestamp":"2015-01-30 18:02:04 +0900","status":"warning"}
    • Watches /v1/catalog/nodes with a blocking query and keeps only the data for nodes that exist
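The Flags value packs a Unix timestamp and a status code into one integer (unixtime * 1000 + status), so the reshaping step is a divmod. The following is a minimal Python sketch of that conversion; the dashboard itself is written in Go, so this is an illustration rather than its code, and the function names are hypothetical. The +0900 offset matches the example on the slide.

```python
from datetime import datetime, timedelta, timezone

STATUSES = ("success", "warning", "danger", "info")  # status codes 0..3
JST = timezone(timedelta(hours=9))  # the slide's example uses +0900


def encode_flags(unixtime: int, status: str) -> int:
    """Pack a Unix time and a status name into a Flags integer."""
    return unixtime * 1000 + STATUSES.index(status)


def decode_flags(flags: int, tz: timezone = JST) -> dict:
    """Unpack a Flags integer into the dashboard's timestamp/status form."""
    unixtime, code = divmod(flags, 1000)
    stamp = datetime.fromtimestamp(unixtime, tz).strftime("%Y-%m-%d %H:%M:%S %z")
    return {"timestamp": stamp, "status": STATUSES[code]}
```

Packing both fields into Flags keeps the Value free for the log text itself, at the cost of limiting the status to three decimal digits.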
  115. consul-kv-dashboard trigger
    $ consul-kv-dashboard -trigger COMMAND
    Can run a command whenever the state of a category changes (success → warning, etc.)
    JSON is passed on standard input
    {
      "category":"testing",
      "node":"web01",
      "address":"192.168.1.10",
      "timestamp":"2015-01-21 11:22:33 +0900",
      "status":"danger",
      "key":"","data":"failure!!"
    }
  116. Deploy tips for auto scaling

  117. Deploying the latest version after boot
    At deploy time, save the latest manifest URL in KV
    1. consul join
    2. Start stretcher
    3. Get the latest manifest URL from KV
    4. Send an event (manifest URL) to itself
    5. deploy
  118. Caveats with auto scaling
    • The old application left in the AMI starts up
      → Do not start unless it has the latest deploy ID
    At deploy time: issue a unique ID
    • Write it to a file and include it in the tar
    • Also store it in KV
    At startup: compare the ID in the local file with the one in KV
    • If they differ, sleep 10 && exit → restart
  119. ssh hostname completion with bash-completion

  120. ssh hostname completion with bash-completion
    By default, completion is based on ~/.ssh/known_hosts
    • Hosts you have SSHed into before show up
    • They may no longer exist
    • New hosts you have never SSHed into do not show up
  121. ssh hostname completion with bash-completion
    ~/.bash_profile
    _known_hosts_real() {
      local members=$(consul members -status=alive | awk '!/Node/{printf("%s ", $1)}')
      COMPREPLY=( $( \
        compgen -W "$members" \
        ${COMP_WORDS[COMP_CWORD]} \
      ) )
      return 0
    }
    Only hosts that are currently alive show up!
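  122. Summary

  123. Summary
    Consul is feature-rich... but you can use only the features you want
    • KV, Event, Health Check...
    Making full use of DNS and HTTP enables advanced automation
    • Dynamic LB configuration with consul-template

  124. Questions?
    • Architecture, Service Discovery, Health Checking, Key/Value Store
    • Internal DNS
    • Production operation / high availability
    • Auto scaling
    • Stretcher
    • Consul KV Dashboard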