Peaceful Consul Cluster Operations / consul-casual-1

Consul Casual Talks #1
http://connpass.com/event/35836/

FUJIWARA Shunichiro

August 01, 2016

Transcript

  1. Peaceful Consul cluster operations
    Consul Casual Talks #1 @fujiwara

  2. FUJIWARA Shunichiro
    @fujiwara
    github.com/fujiwara
    sfujiwara.hatenablog.com
    Engineering Department

  3. Game & Community

  4. Agenda
    Consul use cases
    Points for peaceful operation

  5. Consul use cases
    1. Internal DNS (node, service)
    2. Maintenance with maint
    3. Deployment and Chef runs with Stretcher
    4. Updating nginx configuration with consul-template
    5. Exclusive startup of a daemon that must run on only one host

  6. Internal DNS

  7. Internal DNS (node, service)
    Node names (e.g. kayac-web-i-1234567...)
    Service names (examples)
    • log-aggregator : Fluentd aggregation server
    • log-analyzer : Norikra
    • internal-proxy : Squid for outbound traffic
    • internal-mta : Postfix for outbound traffic

  8. Internal DNS (node, service)
    Run dnsmasq on every host
    Name resolution for the .consul domain is forwarded to the consul agent
    dnsmasq listens on 127.0.0.1:53
    # dnsmasq.conf
    server=/consul/127.0.0.1#8600
    bind-interfaces
    listen-address=127.0.0.1

  10. Internal DNS (node, service)
    Set (node|service).consul as search domains in resolv.conf
    → Hosts can be reached by node name or service name alone
    # /etc/resolv.conf
    search node.consul service.consul
    nameserver 127.0.0.1 # dnsmasq
    nameserver 172.16.0.2 # VPC resolver
    nameserver 172.16.0.254 # Unbound on EC2

  11. ssh hostname completion with bash-completion
    # ~/.bash_profile
    _known_hosts_real()
    {
        local members=$(consul members -status=alive | awk '!/Node/{printf("%s ", $1)}')
        COMPREPLY=( $( \
            compgen -W "$members" \
            ${COMP_WORDS[COMP_CWORD]} \
        ) )
        return 0
    }
    Only hosts that are alive become completion candidates
    http://qiita.com/sfujiwara/items/f4fa907ead53ed104e1a
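
The filtering step in the completion function above can be sketched in Python. This is a minimal illustration, not the speaker's code: it assumes `consul members` prints a header row followed by whitespace-separated columns with the node name first and the status third, and the sample text is made up to match that shape.

```python
def alive_nodes(members_output: str) -> list[str]:
    """Return node names whose Status column is 'alive'."""
    nodes = []
    for line in members_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Assumed column order: Node, Address, Status, ...
        if len(fields) >= 3 and fields[2] == "alive":
            nodes.append(fields[0])
    return nodes


# Made-up sample in the assumed shape of `consul members` output.
SAMPLE = """\
Node            Address         Status  Type    Build  Protocol  DC
xxx-app-i-aaaa  10.0.0.11:8301  alive   client  0.6.4  2         dc1
xxx-app-i-bbbb  10.0.0.12:8301  failed  client  0.6.4  2         dc1
xxx-web-i-cccc  10.0.0.13:8301  alive   server  0.6.4  2         dc1
"""

print(" ".join(alive_nodes(SAMPLE)))
```

The slide's version lets `consul members -status=alive` do the filtering; the sketch filters on the status column itself so that it can run without a cluster.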

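  12. Configuration for sending to the Fluentd aggregation servers
    Send round-robin to the DNS name provided by Consul

    type forward
    expire_dns_cache 15
    dns_round_robin true
    heartbeat_type tcp

    host log-aggregator.service.consul


    No need to enumerate destinations; failed servers are detached automatically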

  13. Maintenance with maint

  14. consul maint
    consul maint -enable [-reason "..."]
    Puts a node into maintenance mode
    → The node is removed from service name resolution
    → Node name resolution still works (e.g. for ssh)

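  15. consul maint usage examples
    Use maint when a Fluentd aggregation server needs maintenance
    → It drops out of DNS, so senders stop sending to it
    (requires the expire_dns_cache setting)
    When replacing a Norikra instance
    • Set up the new host with maint -enable
    • Run maint -enable on the old host and maint -disable on the new one
    • The DNS entry is swapped, so the destination switches over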

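  16. maint when building AMIs with Packer
    1. Join the consul cluster
    2. maint -enable (so the host is not put into service during the build)
    3. Build with Chef
    4. Create the AMI while still in maint state
    5. Instances launched from the AMI also start in maint
    6. Once the post-boot tasks finish, maint -disable → in service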

  17. Do not start while in maint
    daemontools run script
    #!/bin/bash
    maint=$(consul maint)
    if [[ $maint != "" ]]; then
        echo "$maint"
        sleep 10
        exit 1
    fi
    exec ...
    Controls daemons that should not start during maintenance
    (a running daemon is not stopped when maint -enable is set)

  18. Deployment with Stretcher

  19. Deployment with Stretcher
    github.com/fujiwara/stretcher
    A deployment tool that works together with Consul / Serf

  21. Running Chef with Stretcher
    Chef-Server → Stretcher + Chef-Solo
    • Chef-Server is no longer a SPOF / bottleneck
    • Distribute the same tar and event to every host → each node decides which json to apply
    # /etc/sysconfig/hostname-prefix
    HOSTNAME_PREFIX="xxx-app"
    → applies nodes/xxx-app.json
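
How a node might pick its own JSON from the prefix file can be sketched as follows. This is only an illustration of the mapping shown on the slide (HOSTNAME_PREFIX → nodes/<prefix>.json); the function name `node_json_path` is hypothetical and the actual Stretcher / Chef-Solo wiring is not shown in the talk.

```python
import re


def node_json_path(sysconfig_text: str) -> str:
    """Map HOSTNAME_PREFIX="xxx-app" to nodes/xxx-app.json (hypothetical helper)."""
    match = re.search(
        r'^HOSTNAME_PREFIX="?([^"\n]+)"?\s*$', sysconfig_text, re.MULTILINE
    )
    if match is None:
        raise ValueError("HOSTNAME_PREFIX is not set")
    return f"nodes/{match.group(1)}.json"


# Contents of /etc/sysconfig/hostname-prefix as shown on the slide.
print(node_json_path('HOSTNAME_PREFIX="xxx-app"\n'))
```

Because every host receives the same archive, the only per-host input in this sketch is the prefix file.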

  22. Chef role search via service definitions
    # /etc/consul.d/role.json
    {
      "service": {
        "name": "role",
        "tags": [ "batch-server", "db-client", ... ]
      }
    }
    Defined as a Service so that roles become searchable
    http://localhost:8500/v1/catalog/service/role?tag=db-client

  23. http://localhost:8500/v1/catalog/service/role?tag=internal-proxy
    [
      {
        "Node": "xxx-i-10bf0fe2",
        "Address": "10.0.0.123",
        "ServiceID": "role",
        "ServiceName": "role",
        ...
      },
      {
        "Node": "xxx-i-3c1b72b3",
        "Address": "10.0.1.234",
        "ServiceID": "role",
        "ServiceName": "role",
        ...
      }
    ]
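
A role lookup against the catalog endpoint above could look roughly like this in Python. It is a sketch under assumptions: the helper names are hypothetical, the agent is assumed to listen on localhost:8500 as on the slides, and the sample body is the slide's response with the elided fields left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CONSUL = "http://localhost:8500"  # local agent, as on the slides


def catalog_url(service: str, tag: str) -> str:
    """Build the catalog query URL for one service and tag."""
    return f"{CONSUL}/v1/catalog/service/{service}?{urlencode({'tag': tag})}"


def nodes_from_catalog(body: str) -> list[tuple[str, str]]:
    """Extract (Node, Address) pairs from a catalog response body."""
    return [(entry["Node"], entry["Address"]) for entry in json.loads(body)]


def nodes_with_role(tag: str) -> list[tuple[str, str]]:
    """Query the local agent; needs a running Consul agent."""
    with urlopen(catalog_url("role", tag)) as resp:
        return nodes_from_catalog(resp.read().decode("utf-8"))


# Response from the slide, trimmed to the fields that are shown.
SAMPLE = """[
  {"Node": "xxx-i-10bf0fe2", "Address": "10.0.0.123",
   "ServiceID": "role", "ServiceName": "role"},
  {"Node": "xxx-i-3c1b72b3", "Address": "10.0.1.234",
   "ServiceID": "role", "ServiceName": "role"}
]"""

print(nodes_from_catalog(SAMPLE))
```

Parsing is kept separate from the HTTP call so the extraction can be checked without a cluster.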

  24. Daemons managed by daemontools also get a service definition
    {
      "service": {
        "name": "daemontools",
        "tags": [ "app", "stretcher", "gunfish", ... ]
      }
    }

  25. Restarting a process managed by daemontools
    curl "http://localhost:8500/v1/catalog/service/daemontools?tag=gunfish" | jq -r ".[].Node"
    xxx-admin-i-0391d6162be552655
    xxx-app-i-01a7ff42f4796be4f
    xxx-app-i-05bd652734828b522
    xxx-batch-i-0095ac858fe87d8e5
    Build an optimized regular expression with Regexp::Trie, then consul exec
    consul exec -node '(?:xxx\-(?:a(?:dmin|pp)|batch))' "svc -h /service/gunfish"
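
The step from node names to a `-node` pattern can be approximated in Python. Note the substitution: the slide uses Perl's Regexp::Trie to produce a compact trie-shaped pattern, while this sketch emits a plain alternation of escaped prefixes. It also assumes node names follow the `<prefix>-i-<instance id>` shape seen above; both helper names are hypothetical.

```python
import re


def node_prefix(node: str) -> str:
    """Drop the trailing instance id: 'xxx-app-i-01a7...' -> 'xxx-app'."""
    return node.split("-i-", 1)[0]


def node_regex(nodes: list[str]) -> str:
    """Plain alternation of escaped prefixes (not a trie-optimized pattern)."""
    prefixes = sorted({node_prefix(n) for n in nodes})
    return "(?:" + "|".join(re.escape(p) for p in prefixes) + ")"


# Node names from the slide.
NODES = [
    "xxx-admin-i-0391d6162be552655",
    "xxx-app-i-01a7ff42f4796be4f",
    "xxx-app-i-05bd652734828b522",
    "xxx-batch-i-0095ac858fe87d8e5",
]

print(node_regex(NODES))
```

The result matches the same hosts as the slide's pattern for this input; a trie mainly helps keep the expression short when there are many prefixes.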

  26. Updating nginx configuration with consul-template

  27. consul-template
    https://github.com/hashicorp/consul-template
    • Triggered by Consul KV values, Service resolution results, and so on
    • Can re-render templates and kick arbitrary scripts

  28. Updating nginx configuration
    # config.hcl
    template {
      source = "/etc/nginx/spam.ip.conf.ctmpl"
      destination = "/etc/nginx/spam.ip.conf"
      command = "service nginx reload"
      perms = 0644
      backup = true
    }
    # spam.ip.conf.ctmpl
    {{key "spam_ips"}}
    A PUT to localhost:8500/v1/kv/spam_ips is all it takes to update the configuration

  29. Exclusive startup of a daemon that must run on only one host

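  30. Exclusive startup of a daemon that must run on only one host
    A Slack bot driven by WebSocket messages → running it on two or more hosts creates duplicate bots
    But we still want availability…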

  31. Exclusive startup of a daemon that must run on only one host
    Use consul lock
    A wrapper that runs the given command once the lock is acquired
    consul lock -n 1 nuko "/path/to/run-nuko.sh"
    Note: the lock is released when the Consul leader changes

  32. Points for peaceful operation

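  33. Points for peaceful operation
    Know Raft (a rough understanding is enough)
    http://thesecretlivesofdata.com/raft/
    A consensus algorithm for distributed environments
    • Electing a leader requires agreement from a majority
    • 2 servers = losing 1 leaves no majority (=2)
    • 3 servers = a majority (=2) remains even if 1 goes down
    • 4 servers = losing 2 leaves no majority (=3)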
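
The majority arithmetic in the bullets above can be worked through with a few lines of Python. This is a sketch of the standard majority rule (more than half of the servers), not code from the talk; the function names are made up for illustration.

```python
def quorum(servers: int) -> int:
    """Smallest number of servers that forms a majority."""
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can be lost while a majority still remains."""
    return servers - quorum(servers)


for n in range(1, 6):
    print(f"{n} servers: quorum={quorum(n)}, tolerates {failure_tolerance(n)} failure(s)")
```

Going from 3 to 4 servers raises the quorum without raising the tolerance, which is why odd cluster sizes are the usual choice.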

  34. Deployment Table
    At least 3 servers in production; 3 or 5 recommended
    consul.io/docs/internals/consensus.html

  35. Resources required for a Server
    • CPU: 2 CPUs are enough
    • Memory: from 20MB
    • Disk: from 2MB
    Memory and Disk depend on how heavily KV is used
    KV dump JSON 10MB, data_dir/raft 120MB
    → consul agent RSS 250MB

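  36. Do Servers need dedicated hosts?
    The consul agent itself does not use many resources
    Raft heartbeats tend to fail under heavy disk I/O
    • Timeout 500ms
    • A failed heartbeat triggers a leader election
    • The election normally completes in 2-3 seconds
    • Store Consul server data on tmpfs to avoid the impact of disk I/O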

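  37. For high availability
    The number of nodes that can fail at the same time without trouble depends on the number of Servers
    • 3 nodes → 1
    • 5 nodes → 2
    With 3 nodes, if 2 go down and only 1 remains, no leader can be elected
    For maintenance that detaches a node for a long time, temporarily adding Server nodes is an option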

  38. For high availability
    Server node failover is automatic
    Users only ever need to talk to the agent on localhost

  39. Impact of a node failure
    Not the leader → no impact on other nodes
    Leader → leader re-election
    By default, the leader handles all reads and writes (strong consistency)
    Inaccessible until a leader is elected (DNS, HTTP)

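  40. Stale mode (DNS)
    Leader re-election normally completes in 2-3 seconds
    Want to resolve Node and Service names over DNS during that window?
    → Stale mode: can respond even when no leader is elected
    "dns_config":{
      "allow_stale": true, // default false
      "max_stale": "10s" // default 5s
    }
    Results may be outdated (eventual consistency)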

  41. DNS TTL
    The default TTL is 0 → responses are not cached
    TTLs can be set separately for nodes and per service
    With a DNS cache (e.g. dnsmasq) in front, responses can be cached
    "dns_config":{
      "node_ttl": "60s",
      "service_ttl": {
        "*": "15s"
      }
    }

  42. Stale mode (HTTP API)
    To use stale mode with the HTTP API, add the stale parameter
    $ curl "http://127.0.0.1:8500/v1/kv/web/key1?stale"
    Accessing without the stale parameter during a leader election
    → 500 Internal Server Error
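
A client that cannot use stale reads has to cope with the 500 responses described above. One possible approach, sketched here as an assumption rather than anything shown in the talk, is a short retry loop. The names `kv_url`, `http_fetch` and `get_with_retry` are hypothetical, and the fetch function is passed in so the retry logic can be exercised without a cluster.

```python
import time
from typing import Callable, Tuple
from urllib.error import HTTPError
from urllib.request import urlopen


def kv_url(key: str, stale: bool = False) -> str:
    """KV read URL on the local agent, optionally in stale mode."""
    url = f"http://127.0.0.1:8500/v1/kv/{key}"
    return url + "?stale" if stale else url


def http_fetch(url: str) -> Tuple[int, bytes]:
    """Real fetch; returns (status, body) instead of raising on HTTP errors."""
    try:
        with urlopen(url) as resp:
            return resp.status, resp.read()
    except HTTPError as err:
        return err.code, err.read()


def get_with_retry(fetch: Callable[[str], Tuple[int, bytes]], url: str,
                   attempts: int = 5, delay: float = 1.0) -> bytes:
    """Retry while the agent answers 500; any other status is returned as-is."""
    for attempt in range(attempts):
        status, body = fetch(url)
        if status != 500:
            return body
        if attempt < attempts - 1:
            time.sleep(delay)  # elections usually finish within a few seconds
    raise RuntimeError(f"still 500 after {attempts} attempts: {url}")
```

With a running agent this would be called as `get_with_retry(http_fetch, kv_url("web/key1"))`; passing `stale=True` to `kv_url` avoids the retries when slightly old data is acceptable.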

  43. Upgrading in production
    consul.io/docs/upgrading.html
    consul.io/docs/upgrade-specific.html
    Each version has its own caveats, so read the documentation
    Rolling upgrades are possible by replacing Agents one at a time
    (replacing the Leader node does trigger a re-election)

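  44. Stability
    In operation for over 2 years, since the v0.2 days
    The Agent process has crashed only once (in the 0.4.1 days)
    EBS (gp2) credits exhausted → heavy IO wait
    → panic: Timeout starting MDB transaction
    Taking down too many Servers through operator error makes the cluster unrecoverable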

  45. Backing up KV
    Use recurse to fetch the values under a prefix recursively
    $ curl -s "http://127.0.0.1:8500/v1/kv/?recurse"
    [
      {"CreateIndex":112,"ModifyIndex":115,"LockIndex":0,
       "Key":"key1","Flags":123,"Value":"dGVzdA=="},
      {"CreateIndex":122,"ModifyIndex":122,"LockIndex":0,
       "Key":"key2","Flags":0,"Value":"dGVzdDI="},
      {"CreateIndex":124,"ModifyIndex":124,"LockIndex":0,
       "Key":"test/1","Flags":0,"Value":"dGVzdDM="}
    ]
    Restore by PUTting Key, Flags, and Value again
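
The restore step can be sketched as turning a saved dump back into PUT requests. This is a minimal illustration, not the speaker's tooling: it assumes the dump is the JSON shown above, that values are base64-encoded as in the KV API response, and that flags are passed with the `flags` query parameter. The function name `restore_requests` is hypothetical, and nothing is sent until the requests are handed to `urlopen`.

```python
import base64
import json
from urllib.parse import quote
from urllib.request import Request

KV_BASE = "http://127.0.0.1:8500/v1/kv/"

# The dump from the slide.
DUMP = """[
  {"CreateIndex":112,"ModifyIndex":115,"LockIndex":0,
   "Key":"key1","Flags":123,"Value":"dGVzdA=="},
  {"CreateIndex":122,"ModifyIndex":122,"LockIndex":0,
   "Key":"key2","Flags":0,"Value":"dGVzdDI="},
  {"CreateIndex":124,"ModifyIndex":124,"LockIndex":0,
   "Key":"test/1","Flags":0,"Value":"dGVzdDM="}
]"""


def restore_requests(dump: str) -> list[Request]:
    """Build one PUT request per dumped key, carrying the decoded value and flags."""
    requests = []
    for entry in json.loads(dump):
        value = base64.b64decode(entry["Value"] or "")  # null means empty value
        url = f"{KV_BASE}{quote(entry['Key'])}?flags={entry['Flags']}"
        requests.append(Request(url, data=value, method="PUT"))
    return requests


for req in restore_requests(DUMP):
    print(req.get_method(), req.full_url, req.data)
```

Against a live agent, each request would be sent with `urllib.request.urlopen(req)`. The index fields are not restored because Consul assigns them on write.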

  46. Logs from an actual leader change
    2016/07/30 10:07:28 [WARN] raft: Heartbeat timeout reached, starting election
    2016/07/30 10:07:28 [INFO] raft: Node at 10.0.2.132:8300 [Candidate] entering Candidate state
    2016/07/30 10:07:30 [WARN] raft: Election timeout reached, restarting election
    2016/07/30 10:07:30 [INFO] raft: Node at 10.0.2.132:8300 [Candidate] entering Candidate state
    2016/07/30 10:07:30 [INFO] raft: Election won. Tally: 3
    2016/07/30 10:07:30 [INFO] raft: Node at 10.0.2.132:8300 [Leader] entering Leader state
    2016/07/30 10:07:30 [INFO] consul: cluster leadership acquired
    2016/07/30 10:07:30 [INFO] consul: New leader elected: xxx-consul-i-ff26ca5a
    Recovered in about 2 seconds
    No service impact thanks to the DNS cache / stale mode
    HTTP API calls without an explicit stale return 500 → is hammering the API constantly a good idea…?

  47. Scary stories that really happened

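  48. Fetching large results with consul exec
    consul exec "cat /var/log/foo.log" | grep ...
    Tried to collect logs from every host with consul exec
    → consul exec stores results in KV temporarily, so memory and the DB bloated
    Recovered by restarting the servers one at a time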

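  49. Cluster collapse through operator error
    Wanted to upgrade
    Took down 1 server of a 3-server cluster and started it with the new binary
    (or so I thought)
    Took down the second server even though the first had not actually come up
    → Collapse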

  50. What to do after a collapse
    1. Calm down
    2. Stop all servers
    3. Delete all data (data_dir) as well
    4. Start the servers with -bootstrap-expect N
    • start_join, or join manually
    5. (If needed) restore KV from a backup