Design and Implementation of Distributed Resource Management Middleware

Visit to the Matsubara Laboratory, Future University Hakodate

2019/04/24
SAKURA Internet Inc.
SAKURA internet Research Center
Senior Researcher
Ryosuke Matsumoto / Matsumotory / @matsumotory

MATSUMOTO Ryosuke

April 25, 2019

Transcript

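 1. SAKURA Internet Inc.
  (C) Copyright 1996-2019 SAKURA Internet Inc
  SAKURA internet Research Center
  Design and Implementation of Distributed Resource Management Middleware
  2019/04/23 Senior Researcher Ryosuke Matsumoto
  Visit to the Matsubara Laboratory, Future University Hakodate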

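 2. Ryosuke Matsumoto / Matsumotory / @matsumotory
  • Senior Researcher, SAKURA internet Research Center
  • Technical Advisor, Forkwell (Grooves Inc.)
  • Visiting Researcher and Research Advisor, Pepabo Research and Development Institute
  • Lecturer, Security Camp
  • Committee member in various roles, IPSJ Special Interest Group on Internet and Operation Technology
  • Ph.D. (Informatics), Kyoto University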

 3. Table of contents
  1. Introduction
  2. Resource management middleware
  3. Design and implementation of distributing the resource management system
  4. Summary

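 4. Efforts toward the superorganism data center
  • Cloud native, multi-cloud, and removing dependence on any particular OSS
  • Distributing services, functions, and management methods at container granularity
  • Moving toward SOA-oriented design methods such as microservices
  • Including autonomous decentralization of organizations and teams
  • The superorganism data center initiative at SAKURA internet Research Center
  • Lower latency, and computing that blends into society
  • Steadily bringing the cloud's machine power closer to people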

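 5. Distributing operations technology
  • One direction is for each team to operate its own microservice
  • Evolution of the technology for operating increasingly autonomous, distributed services and containers
  • Management by an orchestration layer such as k8s
  • A world where k8s abstracts clouds and operating systems and becomes the cloud OS
  • Resource management for distributed container and VM environments that does not depend on k8s
  • Arguably the development of foundational technology that supports k8s from a lower layer?
  • Distributing operations technology so that it can be implemented in the orchestration layer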

 6. 1.
  Resource management middleware

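 7. k2i
  • Transparently obtains process information in the distributed systems surrounding containers
  • Process information on container hosts can be retrieved and inspected remotely
  • Running k2i on every container host makes that process information available transparently
  • Implemented as a Rust HTTP server plus a Rust binding for libprocps
  • Designed to run as fast as possible
  • https://github.com/matsumotory/k2i
  • https://github.com/matsumotory/procps-sys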
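The kind of record a process-information endpoint like k2i might return can be sketched as follows. This is a minimal illustration under stated assumptions, not k2i's code: k2i goes through a Rust binding for libprocps, whereas this sketch parses the text format of `/proc/<pid>/stat` directly. The names `ProcInfo`, `parse_stat`, and `to_json` are hypothetical.

```rust
// Hypothetical sketch: not k2i's actual code. k2i uses a Rust binding to
// libprocps; here the text of /proc/<pid>/stat is parsed directly instead.

/// A few fields of one process, as a remote caller might want them.
#[derive(Debug, PartialEq)]
struct ProcInfo {
    pid: u32,
    comm: String,
    state: char,
    utime_ticks: u64,
    stime_ticks: u64,
}

/// Parse one line in the /proc/<pid>/stat format.
/// The command name sits in parentheses and may contain spaces or
/// parentheses itself, so split on the LAST ')' rather than on whitespace.
fn parse_stat(line: &str) -> Option<ProcInfo> {
    let open = line.find('(')?;
    let close = line.rfind(')')?;
    if close < open {
        return None;
    }
    let pid = line[..open].trim().parse().ok()?;
    let comm = line[open + 1..close].to_string();
    // Fields after the command name: state is field 3, utime 14, stime 15.
    let rest: Vec<&str> = line[close + 1..].split_whitespace().collect();
    let state = rest.first()?.chars().next()?;
    let utime_ticks = rest.get(11)?.parse().ok()?;
    let stime_ticks = rest.get(12)?.parse().ok()?;
    Some(ProcInfo { pid, comm, state, utime_ticks, stime_ticks })
}

/// Render the record as a small JSON object for an HTTP response body.
/// The command name is not escaped here; a real server would escape it.
fn to_json(p: &ProcInfo) -> String {
    format!(
        "{{\"pid\":{},\"comm\":\"{}\",\"state\":\"{}\",\"utime\":{},\"stime\":{}}}",
        p.pid, p.comm, p.state, p.utime_ticks, p.stime_ticks
    )
}

fn main() {
    // Sample line in procfs format; the values are made up for illustration.
    let sample = "1234 (my app) S 1 1234 1234 0 -1 4194304 100 0 0 0 25 7 0 0 20 0 1 0 5000";
    match parse_stat(sample) {
        Some(info) => println!("{}", to_json(&info)),
        None => eprintln!("unparsable stat line"),
    }
}
```

An HTTP handler on each container host could call `parse_stat` on the file contents and return the JSON body, which is roughly the remote, transparent access the slide describes.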

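 8. drcond
  • Remotely controls the resource usage of processes on container hosts
  • A local-only CLI tool, rcon, was built earlier
  • If drcond is running on multiple container hosts, they can be controlled remotely
  • Also usable when middleware wants to control resources temporarily
  • Example: controlling only specific request processing, per HTTP request ※1
  • With drcond, the same control becomes generally available per SMTP session as well
  ※1 Ryosuke Matsumoto, Kentaro Kuribayashi, Yasuo Okabe, A Web server resource control architecture that virtually separates hardware resources per request (in Japanese), IPSJ Journal, Vol.59, No.3, pp.1016-1025, March 2018.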

 9. drcond
  • Use cases such as temporary limits per request or per session
  • The resource control itself must be fast enough not to become overhead
  • Uses Trusterd, a fast HTTP/2 server driven by mruby
  • https://github.com/matsumotory/trusterd
  • One benchmark showed it to be about 3 times faster than nginx: https://hb.matsumoto-r.jp/entry/2015/12/16/000114
  • In progress: embedding mruby-cgroup in trusterd and implementing rcond in Ruby
  • https://github.com/matsumotory/mruby-cgroup
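A temporary, per-request limit of the kind described above can be sketched as a guard that applies a CPU quota and lifts it again when the request ends. This is an illustration only, with several assumptions: it targets the cgroup v2 `cpu.max` file rather than whatever interface mruby-cgroup uses, it is written in Rust rather than mruby, and `CpuLimitGuard` and `cpu_max_value` are hypothetical names. A scratch directory stands in for the real cgroup directory so the sketch runs without root.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Format a cgroup v2 `cpu.max` value: "<quota_us> <period_us>".
/// `percent` is the share of ONE CPU the group may use.
fn cpu_max_value(percent: u32, period_us: u64) -> String {
    let quota_us = period_us * u64::from(percent) / 100;
    format!("{} {}", quota_us, period_us)
}

/// Temporary CPU limit: applied on creation, lifted when dropped.
struct CpuLimitGuard {
    file: PathBuf,
}

impl CpuLimitGuard {
    fn apply(cgroup_dir: &Path, percent: u32) -> io::Result<Self> {
        let file = cgroup_dir.join("cpu.max");
        fs::write(&file, cpu_max_value(percent, 100_000))?;
        Ok(CpuLimitGuard { file })
    }
}

impl Drop for CpuLimitGuard {
    fn drop(&mut self) {
        // "max" means no quota in cgroup v2; errors are ignored in this sketch.
        let _ = fs::write(&self.file, "max 100000");
    }
}

fn main() -> io::Result<()> {
    // Assumption: a scratch directory stands in for /sys/fs/cgroup/<group>.
    let dir = std::env::temp_dir().join("drcond_sketch");
    fs::create_dir_all(&dir)?;
    {
        let _guard = CpuLimitGuard::apply(&dir, 30)?;
        println!("during request: {}", fs::read_to_string(dir.join("cpu.max"))?);
    } // guard dropped here: the limit is lifted
    println!("after request: {}", fs::read_to_string(dir.join("cpu.max"))?);
    Ok(())
}
```

Tying the limit to a scope keeps the control path short, which matters given the slide's point that the control itself must not become overhead.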

 10. drcond is a distributed resource control middleware.

 11. distributed????


 12. 2.
  Distributing the resource management middleware

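 13. The middleware introduced so far
  • Middleware that is started on each container server
  • Starting it on many servers, or on automatically deployed VMs and physical servers, and then collecting information from each one or passing each one as an argument is not a distributed approach
  • This is operations technology for distributed systems, so the tools and middleware it needs should be distributed as well
  • Solve it nicely with a consensus algorithm!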

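 14. What I want from a consensus algorithm
  • Reach agreement on a command in the distributed system and run the same command on every node
  • To that end, treat each server as a cluster member and manage membership
  • Elect a single leader, so that a client request is redirected to the leader node whether it reaches the leader or a follower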

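 15. What I want from a consensus algorithm
  • Then a command sent to the leader is also executed on every follower
  • Distributed-transaction features and log persistence are lower priority for now
  • The commands expected for k2i and drcond are single commands unrelated to what comes before or after, so they may not need to be handled as transactions
  • But then again, leader election may end up needing transaction-like rules anyway?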

 16. Distributing k2i and drcond
  [Figure: a client and three k2i nodes, one leader and two followers]
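The request flow wanted here (followers redirect to the leader, the leader runs the command and passes it on to every follower) can be sketched as below. All names (`Role`, `Reply`, `Node`, `handle`) and the addresses are hypothetical, leader election is assumed to have already happened, and forwarding is only printed rather than sent over a network.

```rust
// Hypothetical sketch: names and types are illustrative, not from k2i or drcond.

/// Role of a node; how the leader gets elected is out of scope here.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    Leader,
    Follower { leader_addr: String },
}

#[derive(Debug, PartialEq)]
enum Reply {
    /// The command was accepted and dispatched to this many nodes.
    Dispatched { nodes: usize },
    /// The client should resend the command to this address.
    Redirect { leader_addr: String },
}

struct Node {
    role: Role,
    followers: Vec<String>, // follower addresses, used only by the leader
    local_log: Vec<String>, // commands this node has run itself
}

impl Node {
    fn handle(&mut self, command: &str) -> Reply {
        match &self.role {
            Role::Follower { leader_addr } => Reply::Redirect {
                leader_addr: leader_addr.clone(),
            },
            Role::Leader => {
                self.local_log.push(command.to_string());
                for addr in &self.followers {
                    // A real node would send the command over the network.
                    println!("forward {:?} to {}", command, addr);
                }
                Reply::Dispatched { nodes: 1 + self.followers.len() }
            }
        }
    }
}

fn main() {
    let mut leader = Node {
        role: Role::Leader,
        followers: vec!["10.0.0.2:8080".into(), "10.0.0.3:8080".into()],
        local_log: Vec::new(),
    };
    let mut follower = Node {
        role: Role::Follower { leader_addr: "10.0.0.1:8080".into() },
        followers: Vec::new(),
        local_log: Vec::new(),
    };
    println!("{:?}", follower.handle("ps"));
    println!("{:?}", leader.handle("ps"));
}
```

Because the client only ever needs one reachable node, it does not have to know the cluster layout in advance.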

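 17. Consensus algorithms and implementations
  • Paxos, Raft, Zab, 2PC, and others; implementations include etcd, consul, CockroachDB, and TiDB
  • Paxos
  • A long-established algorithm, but the theory is hard, so large numbers of "Paxos-based" implementations appear
  • As a result, it is often unclear whether consistency is actually guaranteed
  • Raft
  • "In Search of an Understandable Consensus Algorithm", USENIX ATC 2014
  • Improves on Paxos in performance and understandability, at a level that poses no problem in practice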

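 18. What makes Raft interesting
  • It builds its theory of complex distributed consensus on the premise of being good enough in practice
  • Even as an algorithm, it assumes real-world use, which is what makes it so interesting
  • For example, the problem of a leader never being decided is handled by randomizing timeouts, which avoids livelock well enough in practice for a leader to be elected
  • For example, log retransmission and overwriting step back and retry on failure, which looks inefficient at first glance but rarely causes real problems in practice
  • https://www.usenix.org/system/files/conference/atc14/atc14-paper-ongaro.pdf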
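The randomized timeout idea can be sketched in a few lines. This is a minimal illustration, not a Raft implementation: each node draws its election timeout from a range (the Raft paper suggests 150 to 300 ms), so nodes rarely time out together and one candidate usually wins before the others start an election. The small `XorShift64` generator is a stand-in chosen so the sketch needs no external crate, and seeding by node id is an assumption made for illustration.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crate.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from zero, so substitute a fixed constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Pick an election timeout in [min_ms, max_ms).
fn election_timeout_ms(rng: &mut XorShift64, min_ms: u64, max_ms: u64) -> u64 {
    min_ms + rng.next_u64() % (max_ms - min_ms)
}

fn main() {
    // Assumption for illustration: each node seeds its generator by node id.
    for node_id in 1..=3u64 {
        let mut rng = XorShift64::new(node_id);
        let timeout = election_timeout_ms(&mut rng, 150, 300);
        println!("node {} starts an election after {} ms without a heartbeat", node_id, timeout);
    }
}
```

A node would redraw its timeout after every election attempt, so a split vote in one round is unlikely to repeat in the next.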

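 19. Distributed transactions and log persistence
  • Important elements of a consensus algorithm
  • But perhaps not that necessary for my purposes
  • Logging commands to keep consistency matters less here, because k2i and drcond commands are themselves stateless
  • Even so, execution speed on each target varies with load, so this needs some consideration
  • Accumulating and compacting logs, as a KVS does, only needs to cover a short period
  • Raft performs even better when persistence to storage is left out ※1
  ※1 A survey of the distributed consensus algorithm Raft (in Japanese) http://db-event.jpn.org/deim2017/papers/386.pdf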

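 20. Application example: Koordinator
  • A Service Approach for Replicating Docker Containers in Kubernetes
  • Keeps replicas consistent when containers managed by k8s hold state
  • With multiple containers, a crash right after a write leaves one container's state different from the others
  • Adds a koordinator layer between the proxy and the containers
  • The koordinator layer uses a consensus algorithm to order commands and runs them on multiple containers simultaneously, in that order → stronger consistency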
  Hylson Vescovi Netto, Aldelir Fernando Luiz, Miguel Correia, Luciana de Oliveira Rech, Caio Pereira Oliveira,
  Koordinator: A Service Approach for Replicating Docker Containers in Kubernetes, 2018 IEEE Symposium on
  Computers and Communications (ISCC) Year: 2018, Volume: 1, Pages: 58-51


 21. Application example: Koordinator
  [Figure: two panels, "write request" and "read request". Each panel contains client, FW, three proxy instances, three koordinator instances, and three app instances. Labels: Consensus Algorithm, Ordering, Execution]
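Why ordering matters for replica consistency can be sketched as follows. This is an illustration of the general idea (apply the same commands in the same agreed order on every replica), not Koordinator's implementation: the consensus step is replaced by sorting on a sequence number, and `Command`, `Replica`, and `agreed_order` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Two kinds of write; the result of Add depends on what ran before it.
#[derive(Debug, Clone)]
enum Command {
    Set(String, i64),
    Add(String, i64),
}

#[derive(Debug, Default, PartialEq)]
struct Replica {
    state: BTreeMap<String, i64>,
}

impl Replica {
    fn apply(&mut self, cmd: &Command) {
        match cmd {
            Command::Set(key, value) => {
                self.state.insert(key.clone(), *value);
            }
            Command::Add(key, delta) => {
                *self.state.entry(key.clone()).or_insert(0) += *delta;
            }
        }
    }
}

/// Stand-in for the consensus step: the agreed order is simply the
/// sequence number attached to each write.
fn agreed_order(mut writes: Vec<(u64, Command)>) -> Vec<Command> {
    writes.sort_by_key(|(seq, _)| *seq);
    writes.into_iter().map(|(_, cmd)| cmd).collect()
}

fn main() {
    // Writes arrive out of order at the coordination layer.
    let arrived = vec![
        (2, Command::Add("x".into(), 5)),
        (1, Command::Set("x".into(), 10)),
        (3, Command::Set("y".into(), 1)),
    ];
    let log = agreed_order(arrived);

    // Every replica applies the same log in the same order.
    let mut replicas = vec![Replica::default(), Replica::default(), Replica::default()];
    for replica in replicas.iter_mut() {
        for cmd in &log {
            replica.apply(cmd);
        }
    }
    assert!(replicas.windows(2).all(|pair| pair[0] == pair[1]));
    println!("{:?}", replicas[0].state);
}
```

If one replica had applied `Add` before `Set`, its value for `x` would differ from the others, which is the divergence the ordering layer is there to prevent.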

 22. Distributing k2i and drcond: Multiraft
  [Figure: a client and three Raft groups, each with one leader and two followers]
  Related topics include Multiraft and distributing Raft clusters
  1. https://www.cockroachlabs.com/blog/scaling-raft/
  2. http://db-event.jpn.org/deim2017/papers/386.pdf
  3. http://sergeiturukin.com/2017/06/09/multiraft.html
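One way a client could pick among several Raft groups is to map each host to a group deterministically. The sketch below is only an assumption about how such a split might look; the slides do not say how hosts would be assigned, and `group_for` is a hypothetical helper. FNV-1a is written out by hand so the mapping stays the same across runs and builds.

```rust
/// FNV-1a (64-bit), written out so the mapping is deterministic across runs.
fn fnv1a(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for byte in data {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical helper: choose which Raft group is responsible for a host.
fn group_for(host: &str, group_count: u64) -> u64 {
    fnv1a(host.as_bytes()) % group_count
}

fn main() {
    // Host names are made up for illustration.
    let hosts = ["host-a", "host-b", "host-c", "host-d", "host-e", "host-f"];
    for host in hosts.iter() {
        println!("{} -> raft group {}", host, group_for(host, 3));
    }
}
```

Each group then elects its own leader, so commands for different hosts do not all pass through a single node.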

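 23. Implementation plan
  • There are many Raft implementations, in both C and Rust
  • drcond is written in mruby, so the plan is to use a C library
  • k2i is written in Rust, so simply using a Rust library seems natural
  • That said, the goal is not to build a KVS or a database on a consensus algorithm, so I want to use existing libraries and implement only the parts that fit the purpose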

 24. 3.
  Summary

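 25. Operations technology that supports distributed container systems
  • Operations tools should be distributed to match distributed systems
  • Introduced the design and implementation of distributed resource management middleware
  • Making each piece of middleware distributed by means of a consensus algorithm
  • How far should k8s go, and how much should operations tools handle?
  • How should k8s and distributed operations tools be positioned relative to each other?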