Linux building-block technologies learned through container research and development

Through acceptance at an IEEE Computer Society Flagship Conference

3-shake SRE Tech Talk #3, Ryosuke Matsumoto (matsumotory), technical advisor at 3-shake, 2022/03/18

MATSUMOTO Ryosuke

March 18, 2022

Transcript

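  1. 3-shake SRE Tech Talk #3, Ryosuke Matsumoto (matsumotory), technical advisor at 3-shake, 2022/03/18

    Linux building-block technologies learned through container research and development: through acceptance at an IEEE Computer Society Flagship Conference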
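  2. Ryosuke Matsumoto / matsumotory / @matsumotory

    • Technical advisor, 3-shake Inc.
    • Senior researcher, SAKURA internet Research Center; technical advisor to several other companies
    • Committee member and secretary, IOT and OS special interest groups, Information Processing Society of Japan (IPSJ)
    • Steering committee member, 163rd Committee on Internet Technology
    • Member of IEEE, ACM, and USENIX
    • Ph.D. in Informatics, Kyoto University
    • https://research.matsumoto-r.jp/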
  3. COMPSAC 2020 Regular Paper

    • COMPSAC: IEEE Computer Society Flagship International Conference
    • COMPSAC 2020, Message from the 2020 Program Chairs-in-Chief ※1
        • over 450 submissions this year, to both our conference tracks and associated workshops
        • accepted 69 regular papers and 69 short papers
        • 76 papers that were not accepted for the main conference were referred to COMPSAC workshops
        • An additional 146 papers were submitted directly to our associated workshops
    • Reading these figures, the Regular Paper acceptance rate is 69 / (450 - 146), that is, 23% or lower

    ※1 Message from the Program Chairs-in-Chief, https://ieeecompsac.computer.org/
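The acceptance-rate estimate on this slide can be reproduced with a few lines of arithmetic. This is only a sketch of the slide's own reasoning; it assumes that every submission not sent directly to a workshop counted as a main-conference submission, which the Program Chairs' message does not state explicitly.

```python
# Figures quoted from the Program Chairs' message on the slide.
total_submissions = 450     # "over 450", conference tracks plus workshops
direct_to_workshops = 146   # papers submitted directly to workshops
regular_accepted = 69       # accepted regular papers

# Assumption: the remainder were all main-conference submissions.
main_track = total_submissions - direct_to_workshops
rate = regular_accepted / main_track

print(f"{regular_accepted} / {main_track} = {rate:.1%}")  # prints "69 / 304 = 22.7%"
```

Because the message says "over 450", the real denominator is at least 304, which is why the slide states the rate as an upper bound.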
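  4. Related work on a homeostatic, change-resilient platform ※1

    FastContainer: a change-resilient virtualization platform that can pack instances at high density

    (1) General-purpose and diverse web applications such as WordPress can run on an instance
        • The aim is an inexpensive service that can be used without specialist knowledge
    (2) Instance state changes are fast
        • Stopping, starting, and scaling of instances (containers) cycle quickly
        • State is decided reactively, per request → toward a change-resilient platform
    (3) Hardware resources are used more efficiently
        • Instances that receive no requests are stopped after running for a fixed period

    ※1 Ryosuke Matsumoto, Uchio Kondo, Kentaro Kuribayashi, FastContainer: A Homeostatic System Architecture High-speed Adapting Execution Environment Changes, The 43rd Annual IEEE International Computers, Software, and Applications Conference (COMPSAC), July 2019.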
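Point (3) can be pictured as an idle-timeout rule. The sketch below is one possible reading of that bullet, not FastContainer's actual implementation: the `Instance` class, `handle_request`, `reap_idle`, and the `idle_limit` parameter are all hypothetical names, and time is passed in explicitly so the logic is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical record for one container instance."""
    name: str
    running: bool = False
    last_request_at: float = 0.0


def handle_request(inst: Instance, now: float) -> None:
    """Start the instance on demand and remember when it was last used."""
    if not inst.running:
        inst.running = True
    inst.last_request_at = now


def reap_idle(instances, now: float, idle_limit: float):
    """Stop running instances that saw no request for idle_limit seconds."""
    stopped = []
    for inst in instances:
        if inst.running and now - inst.last_request_at >= idle_limit:
            inst.running = False
            stopped.append(inst.name)
    return stopped
```

For example, with `idle_limit=60`, an instance last used at t=0 is stopped by a sweep at t=70, while one last used at t=50 keeps running.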
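  5. Characteristics of FastContainer and cloud service platforms

    • FastContainer ※1 decides instance state (start, stop, move, replicate, increase or decrease resources, and so on) reactively and quickly in response to HTTP requests
    • Service users simply use WordPress or another web application in the ordinary way
    • Reactive scaling according to the number of accesses is possible
    • Cloud service platforms start instances in advance and then process requests
    • During access spikes they need predictive, proactive scaling

    ※1 Ryosuke Matsumoto, Uchio Kondo, Kentaro Kuribayashi, FastContainer: A Homeostatic System Architecture High-speed Adapting Execution Environment Changes, The 43rd Annual IEEE International Computers, Software, and Applications Conference (COMPSAC), July 2019.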
  6. Availability of FastContainer and cloud platforms

    [Figure, two panels. Labels: Storage, Client, Server (instance A, instance B, instance C), Server (instance A), HTTP request. The second panel adds ✗ server failure.]
  7. Availability with the proposed method + FastContainer

    [Figure, two panels. Labels: Storage, Client, Server (instance A, instance B, instance C), Server, HTTP request. In the second panel, on server failure (✗), instance A is reactively relocated.]
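  8. Basic flow of FastContainer

    [Figure labels: Client, HTTP(S) request, Web Proxy (ngx_mruby), CMDB + API, Web Dispatcher (ngx_mruby), containers, Container Engine (haconiwa), container host A.]

    • Using the Hostname of the HTTP request as the key, fetch the container's information from the CMDB (configuration management DB)
    • Proxy the request to the container based on its IP address and port
    • If the container is not listening, obtain the container information from the CMDB and start it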
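The three steps of this flow can be sketched as a single dispatch function. This is an illustrative model only: the real system implements the flow in ngx_mruby with haconiwa as the container engine, whereas here the CMDB is a hypothetical in-memory dict, and `is_listening` and `start_container` are injected placeholder callables.

```python
from dataclasses import dataclass


@dataclass
class ContainerInfo:
    """Hypothetical CMDB record: where the container for a hostname lives."""
    ip: str
    port: int


# Hypothetical in-memory stand-in for the CMDB, keyed by hostname.
CMDB = {"blog.example.com": ContainerInfo("10.0.0.5", 8080)}


def dispatch(hostname, is_listening, start_container):
    """Return the (ip, port) to proxy to, starting the container if needed."""
    info = CMDB.get(hostname)            # step 1: look up by Hostname
    if info is None:
        return None                      # unknown host: nothing to proxy to
    if not is_listening(info.ip, info.port):
        start_container(info)            # step 3: start on demand
    return (info.ip, info.port)          # step 2: proxy target
```

Passing the liveness check and the start action in as functions keeps the sketch independent of any particular container engine.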
  9. blocking each request with mruby

    [Figure: three request / mruby / response timelines; send response and recv request happen at the same time.]

    • Other responses are delayed in proportion to the processing time of the blocking mruby code
    • Non-blocking middleware like nginx, in a single process
  11. non-blocking each request with mruby

    [Figure: three request / response timelines in which the blocking operations of each mruby handler overlap; send response and recv request happen at the same time.]

    • Non-blocking middleware like nginx, in a single process
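The contrast between the blocking and non-blocking slides can be reproduced in a single-process event loop. The sketch below uses Python's asyncio rather than nginx and mruby, so it illustrates the general effect and not ngx_mruby's mechanism; `slow_hook` is a hypothetical stand-in for a blocking operation inside a request hook.

```python
import asyncio
import time


def slow_hook(seconds):
    """Stand-in for a blocking operation inside a request hook."""
    time.sleep(seconds)


async def blocking_handler(seconds):
    # Runs on the event-loop thread, so every other request waits.
    slow_hook(seconds)


async def nonblocking_handler(seconds):
    # Offloads the blocking call to a worker thread; the loop stays free.
    await asyncio.to_thread(slow_hook, seconds)


async def serve(handler, n, seconds):
    """Handle n simultaneous requests and return the total elapsed time."""
    start = time.perf_counter()
    await asyncio.gather(*(handler(seconds) for _ in range(n)))
    return time.perf_counter() - start


def measure(n=3, seconds=0.1):
    blocking = asyncio.run(serve(blocking_handler, n, seconds))
    nonblocking = asyncio.run(serve(nonblocking_handler, n, seconds))
    return blocking, nonblocking


if __name__ == "__main__":
    b, nb = measure()
    print(f"blocking: {b:.2f}s, non-blocking: {nb:.2f}s")
```

With three requests of 0.1 s each, the blocking variant takes roughly the sum of the three delays, while the non-blocking variant takes roughly one delay, which is the difference the two slides illustrate.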
  13. [Figure labels: Client, Web Proxy, CMDB API, two Host OS boxes each with a Dispatcher and Containers, HTTP, ICMP or TCP, ✗ (failure mark).]

    The first relocation requires starting the container, but once started it keeps running for a fixed period.
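  14. Key points of the proposed method (ICMP/TCP monitoring)

    • A transient false positive in the ICMP/TCP threshold check has little impact
    • For TCP, the check uses a custom TCP stack and completes in a 3-packet round trip [3][4]
    • Because this is FastContainer, service continues even if a false positive triggers a relocation
    • A container started on another server because of a false positive stops after running for a fixed time
    • The container is relocated back to the original server, and in the CMDB requests flow only to the original server
    • Response-time thresholds and timeouts can therefore be tuned to the very limit

    [3] matsumotory, mruby-fast-remote-check, https://github.com/matsumotory/mruby-fast-remote-check
    [4] "Checking whether a port is listening faster than the approach that combines the Linux kernel TCP stack with system calls" (blog post, in Japanese), https://hb.matsumoto-r.jp/entry/…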
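A listen check with a tunable timeout can be sketched as follows. Note that this is deliberately not the slide's method: the talk uses a custom TCP stack (mruby-fast-remote-check) to finish in three packets, whereas this sketch uses the kernel's ordinary `connect()`, the slower baseline the slide compares against. The function name and default timeout are assumptions made for illustration.

```python
import socket


def is_listening(host, port, timeout=0.2):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Uses the kernel TCP stack via connect(); a refused connection, an
    unreachable host, or a timeout all count as "not listening".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Because a false "not listening" result only causes a temporary extra container under FastContainer, the `timeout` value can be set aggressively low.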
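  15. Starting FastContainer with CRIU + seccomp

    • Hook the moment right after the server process starts, in the container runtime, and take a Checkpoint
    • For reference: another approach monitors system calls with seccomp, pauses the process with ptrace, and then images it with a CRIU Checkpoint
    • CRIU itself uses seccomp internally, so a patch is required and the approach lacks generality
    • For example, CRIU's own functionality is used to stop the process on the seccomp event
    • For example, seccomp cannot be used because privileges are dropped after seccomp runs

    "Monitoring, at the OS layer, the system calls a web server executes at startup and imaging the process just before startup completes" (blog post, in Japanese), https://hb.matsumoto-r.jp/entry/…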
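  16. Monitoring system calls and imaging the process just before listen

    • The goal is to image a process whose web server startup processing is complete but which is not yet listening on the network
    • Configure seccomp to watch the listen() system call, then fork() and execv() the server process to be imaged
    • The parent process watches the target server process's seccomp events with ptrace(), and an event fires before listen() executes
    • When the event fires, image the process with CRIU and save it

    "Monitoring, at the OS layer, the system calls a web server executes at startup and imaging the process just before startup completes" (blog post, in Japanese), https://hb.matsumoto-r.jp/entry/…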
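The control flow of these steps can be modeled without touching the kernel. The sketch below is a pure simulation: it does not call seccomp, ptrace, or CRIU, the syscall trace is a hypothetical example, and `checkpoint` is a placeholder callback standing in for the CRIU dump.

```python
def trace_until_listen(syscalls, checkpoint):
    """Walk a syscall sequence and checkpoint just before the first listen().

    Returns the syscalls that ran before the checkpoint, or None if
    listen() never appeared in the sequence.
    """
    executed = []
    for name in syscalls:
        if name == "listen":
            # The traced process is paused here: listen() has NOT run yet,
            # so the saved image is fully initialized but not listening.
            checkpoint(executed)
            return executed
        executed.append(name)
    return None


# Hypothetical startup trace of a web server, for illustration only.
startup = ["execve", "open", "read", "socket", "bind", "listen", "accept"]

images = []
trace_until_listen(startup, lambda done: images.append(list(done)))
```

After the call, `images[0]` holds everything up to and including `bind`, which mirrors the slide's goal of capturing a process that has finished startup work but has not begun listening.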
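  18. Experiments

    • Preliminary experiment: relationship between CRIU Checkpoint/Restore and the memory size of the process
    • Response time at container relocation, using representative applications
    • Apache 2.4.18, PHP 7.3.0, WordPress 5.0.3 (default page)
    • Python 3.7.1, Django 2.1.4, gunicorn 19.9.0 ※1
    • Ruby 2.5.1, Rails 5.2.1, Puma 3.12.0 ※2

    ※1 https://mclolipop.zendesk.com/hc/ja/articles/…
    ※2 https://github.com/everyleaf/el-trainin…
    Applications that use a DB at a realistic scale (roughly personal or in-group use) were chosen.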
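  19. Imaging the server process (Checkpoint/Restore)

    [Figure: "Checkpoint/Restore Processing Time due to Memory Usage". x-axis: Memory usage per process [MB]; y-axis: Processing time [sec]; series: Checkpoint, Restore.]

    Caption (truncated in the transcript): what Checkpoint/Restore requires as a function of the memory usage of a single server process…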
  20. Response time at container relocation

    • Apache 2.4.18, PHP 7.3.0, WordPress 5.0.3
        • 3 processes; memory size (RSS) of a single process is 35 MBytes
    • Python 3.7.1, Django 2.1.4, gunicorn 19.9.0 ※1
        • 2 processes, 2 threads; RSS of a single process is 33 MBytes
    • Ruby 2.5.1, Rails 5.2.1, Puma 3.12.0 ※2
        • 2 processes, 14 threads; RSS of a single process is 89 MBytes
        • Also compared with bootsnap, which precompiles gems
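Since the preliminary experiment relates Checkpoint/Restore time to per-process memory, it can help to see the aggregate footprint of each stack. The sketch below simply multiplies the process count by the per-process RSS quoted on this slide. Treat the result as a rough upper bound: RSS counts pages shared between processes once per process, and the talk itself does not report these totals.

```python
# Process counts and per-process RSS (MB) as quoted on the slide.
stacks = {
    "Apache + PHP + WordPress": {"processes": 3, "rss_mb": 35},
    "Django + gunicorn": {"processes": 2, "rss_mb": 33},
    "Rails + Puma": {"processes": 2, "rss_mb": 89},
}

# Upper-bound estimate: shared pages are counted once per process.
totals = {name: s["processes"] * s["rss_mb"] for name, s in stacks.items()}

for name, mb in totals.items():
    print(f"{name}: at most about {mb} MB")
```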
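  21. Summary

    • Proposed a fast scheduling method that ensures availability with a single instance
    • Resource cost is low because multiple instances are not required
    • The experiments show relocation performance that is already at a practically usable level
    • Application in production environments
    • Availability at a level where users do not notice even a host failure
    • Containers can also be added seamlessly during autoscaling to handle load
    • Resource allocation can track access trends accurately
    • Scaling and resource allocation from the hardware pool are optimized as well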
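  22. The road to acceptance

    • Reasons the first submission, to WWW2020, was rejected
        • The positioning of the research was unclear, and the novelty was hard to see
            → Clarified the positioning and assumptions of this research
        • The related work to compare against was unclear
            → Expanded the related work to be compared with this research and made the differences explicit

    Lesson: in Internet and Web technology, the number of published papers and OSS projects and the pace of technical change are extremely high. It is therefore important to define the scope of the research and the latest open problems clearly, show the difference from prior work firmly, and show how much the problem matters in practice.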