Slides presented at the 4th Web System Architecture 研究会 (WSA研). https://websystemarchitecture.hatenablog.jp/entry/2019/02/26/100725
A Proposal for a General-Purpose Hybrid Storage System
株式会社ハートビーツ (Heartbeats)
Takamura Narimichi @nari_ex
2019/04/13, 4th Web System Architecture 研究会 | @nari_ex
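Self-introduction
• Takamura Narimichi / 高村 成道
• @nari_ex
• 株式会社ハートビーツ: Director, VPoE
• The University of Electro-Communications (電気通信大学)
  • Bachelor's degree, Department of Information and Communication Engineering, Faculty of Informatics and Engineering
• GLOBIS University, Graduate School of Management (グロービス経営大学院)
  • Master's degree (MBA), Graduate School of Management, major in management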
Outline
• Background and issues
• Proposal
• Implementation
• Usage patterns for the mechanism
• Summary
Background and issues
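Background and issues
• The spread of web services has caused data volumes to grow explosively
• The cost of the storage that holds this huge amount of data is growing
• Only a small fraction of the data is accessed frequently
• To improve cost-effectiveness, we want to cut the cost of infrequently used data
※ In this work, "data" refers to content files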
Supplement: definitions of data terms
This work classifies data into two types by usage frequency
• Hot data: data that is used frequently
• Cold data: data that is used infrequently
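Supplement: examples of cold data among content data
• Real estate listing data for properties with little demand
• Apparel data for items that have gone out of season
• Mail data that has been received and is no longer referenced
• Old log data on log delivery servers
※ All of these assume cases where file-level access is required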
Basic strategy for the issue
• Move cold data to cheaper storage by some means
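Conventional approach 1: archiving
• Storing data safely in a dedicated storage area for long-term retention
• Moving data to cheaper storage makes cost reduction possible
• Issue
  • Because the data moves to a different location, it is hard to use it the same way as before it was moved
=> Not suitable for content data, whose chance of being used never drops to zero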
Conventional approach 2: enterprise storage
• Automatically moves infrequently used data to cheaper storage
• Issues
  • Vendor lock-in
  • Requires a large investment
  • Difficult to deploy in cloud environments
Conventional approach 2: enterprise storage (continued)
Supplement: reduced storage flexibility that comes with moving to the cloud
Conventional approach 3: extending an existing file system
• Extending Btrfs [1]
  • Research that takes advantage of Btrfs's multi-device support
  • Data is moved at the generic block layer
• Issues
  • Moving data puts load on shared resources (memory, CPU)
  • File systems other than Btrfs cannot be used
    • The widely used ext4 and xfs cannot be used
[1] Hot Cold Data Tracking and Migration in btrfs.
Supplement: overview of Linux I/O
Summary of issues
• Archiving
  • Not suitable for content data
• Enterprise storage
  • Expensive
  • Vendor lock-in
  • Not suitable for the cloud
• Extending an existing file system
  • The file system cannot be chosen freely
  • Load during data movement affects the main workload
Summary of issues 2
• The environments where these approaches can be deployed are limited
• Load during data movement is a problem
Proposal
A proposal for a general-purpose hybrid storage system
• A hybrid storage system that can be used in a variety of environments
• Moving cold data does not interfere with the main workload
※ Assumes an implementation on Linux
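Goals
• High generality
  • Usable in cloud environments as well as on-premises
  • Built from OSS and runs on Linux
• No dependence on a specific file system at deployment time
  • A file system can be chosen separately for hot storage and for cold storage
• The load of moving cold data is sufficiently low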
Implementation
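Implementation issues and countermeasures
• On Linux, resource control is basically applied per process
• Bandwidth control should apply only during relocation
• => Separate the process that dispatches accesses from the process that relocates data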
Implementation approach
1. Access dispatch
  • Implemented as a userspace file system (FUSE)
  • Easy to deploy
  • Any file system can be used underneath
2. Data relocation
  • Implemented as a daemon process
1. Access dispatch
Multi-Temperature FileSystem (MTFS)
• A userspace file system
• Userspace daemon process: mtfsd
• Started by specifying one partition for hot data and one for cold data
• Controls the file operations that applications request
Overview of mtfsd processing
1. The application requests file access
2. mtfsd receives the system call through the FUSE library
3. mtfsd queries hot storage
  • If the file does not exist there, it queries cold storage
4. The retrieved data is returned to the application
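The lookup order in steps 3 and 4 can be sketched in Python as follows. This is only an illustration of the hot-then-cold fallback, not the mtfsd implementation: the names `resolve_path`, `read_file`, and `demo` and the directory layout are assumptions made for the example, and a real FUSE file system would run this logic inside its operation handlers.

```python
import os
import tempfile


def resolve_path(relative_path, hot_root, cold_root):
    """Return the backing path for relative_path, trying hot storage first.

    Hypothetical helper: it mirrors the lookup order on the slide only.
    """
    for root in (hot_root, cold_root):
        candidate = os.path.join(root, relative_path)
        if os.path.exists(candidate):
            return candidate
    raise FileNotFoundError(relative_path)


def read_file(relative_path, hot_root, cold_root):
    """Read a file's bytes from whichever tier currently holds it."""
    with open(resolve_path(relative_path, hot_root, cold_root), "rb") as f:
        return f.read()


def demo():
    """Build two temporary tiers and read one file from each."""
    with tempfile.TemporaryDirectory() as hot, tempfile.TemporaryDirectory() as cold:
        with open(os.path.join(hot, "index.html"), "wb") as f:
            f.write(b"hot")
        with open(os.path.join(cold, "old.log"), "wb") as f:
            f.write(b"cold")
        return (read_file("index.html", hot, cold), read_file("old.log", hot, cold))
```

Because the application only ever passes the relative path, it does not need to know which tier holds the file, which is the point of putting the dispatch behind a single mount point.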
Read flow
Write flow
2. Data relocation
Data relocation
• Userspace daemon process: mt-relocatord
• Runs single-threaded, because the work does not need to be done quickly
Implementation issues and countermeasures
• Controlling I/O bandwidth and throughput
  • => Control block I/O with cgroup2
    • IOPS control: limit with riops and wiops
    • Throughput control: limit with rbps and wbps
  • => Control the I/O scheduler with ioprio_set()
    • Specify CLASS_IDLE
• During relocation, cold data ends up consuming the disk cache
  • => Use posix_fadvise(POSIX_FADV_DONTNEED) to clear the cache for the cold data being moved
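Two of these countermeasures can be sketched in Python. This is a minimal sketch under stated assumptions, not the mt-relocatord implementation: `io_max_line`, `relocate`, and `demo` are hypothetical names, the device numbers and limits are illustrative, and the copy has no atomicity or error recovery. `io_max_line` formats a line in the form the cgroup v2 `io.max` file accepts, and `relocate` calls `os.posix_fadvise` with `POSIX_FADV_DONTNEED` after copying. `ioprio_set()` is left out because the Python standard library has no wrapper for it.

```python
import os
import shutil
import tempfile


def io_max_line(major, minor, rbps=None, wbps=None, riops=None, wiops=None):
    """Format one line for a cgroup v2 io.max file; unset limits become 'max'."""
    limits = {"rbps": rbps, "wbps": wbps, "riops": riops, "wiops": wiops}
    fields = " ".join(
        f"{key}={'max' if value is None else value}" for key, value in limits.items()
    )
    return f"{major}:{minor} {fields}"


def relocate(src, dst, chunk_size=1 << 20):
    """Copy src to dst, ask the kernel to drop cached pages, then remove src."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
        fout.flush()
        # Dirty pages cannot be dropped until they are written back.
        os.fsync(fout.fileno())
        for f in (fin, fout):
            # offset=0 and len=0 cover the whole file.
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
    os.remove(src)


def demo():
    """Move one file between two temporary directories standing in for the tiers."""
    with tempfile.TemporaryDirectory() as hot, tempfile.TemporaryDirectory() as cold:
        src = os.path.join(hot, "old.dat")
        dst = os.path.join(cold, "old.dat")
        with open(src, "wb") as f:
            f.write(b"x" * 4096)
        relocate(src, dst)
        with open(dst, "rb") as f:
            return os.path.exists(src), f.read()
```

The string returned by `io_max_line` would be written to the `io.max` file of the cgroup that contains only the relocation process, which is why dispatch and relocation run as separate processes.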
Where resource control applies in Linux I/O
Overview of relocation processing
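mt-relocatord settings
• Scheduling settings for relocation
  • Start time
  • Execution interval (in units of 1; the time unit is illegible in the source)
• Threshold settings for relocation
  • Number of accesses and number of updates per unit time
  • Elapsed time from the last access and the last update to the current time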
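The elapsed-time threshold can be sketched in Python as follows. This is an assumption-laden illustration rather than the mt-relocatord logic: `is_cold`, `find_cold_files`, `demo`, and the `max_idle_seconds` parameter are hypothetical, and the per-unit-time access and update counts are not covered because they would have to be recorded by the dispatch process. Note that last-access times depend on mount options such as `relatime` or `noatime`, so they may not reflect every read.

```python
import os
import tempfile
import time


def is_cold(path, max_idle_seconds, now=None):
    """Return True when both last access and last update exceed the threshold."""
    now = time.time() if now is None else now
    st = os.stat(path)
    newest_use = max(st.st_atime, st.st_mtime)
    return (now - newest_use) > max_idle_seconds


def find_cold_files(root, max_idle_seconds, now=None):
    """Walk root on a single thread and list the files that qualify as cold."""
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if is_cold(path, max_idle_seconds, now):
                cold.append(path)
    return sorted(cold)


def demo():
    """Create one stale and one fresh file and return the names judged cold."""
    with tempfile.TemporaryDirectory() as root:
        now = time.time()
        for name, age in (("old.dat", 1000), ("new.dat", 0)):
            path = os.path.join(root, name)
            with open(path, "wb"):
                pass
            # Set both atime and mtime to `age` seconds in the past.
            os.utime(path, (now - age, now - age))
        found = find_cold_files(root, max_idle_seconds=600, now=now)
        return [os.path.basename(p) for p in found]
```

A scan like this touches every file's metadata, so its cost grows with the number of files.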
Usage example
// Create partitions
# mkfs.xfs /dev/vda1
# mkfs.btrfs /dev/vdb1
// Create MTFS managed Volumes
# mtfsctl hot-volume create hv0 /export/sda1/www/
# mtfsctl cold-volume create cv0 /export/sdb1/www/
// Create mtfsd
# mtfsctl volume start hv0 cv0
# systemctl start mtfsd
// Start mt-relocatord
# systemctl start mt-relocatord
Usage patterns for the mechanism
Environment                    Hot Data Storage               Cold Data Storage
On-premises (Storage Device)   SSD                            HDD
AWS (Block Storage)            EBS Provisioned IOPS SSD       Cold HDD
AWS (Shared Storage)           EFS (Provisioned Throughput)   EFS (Infrequent Access Storage Class)
※ Except for Shared Storage, any file system can be used
Open issues
• Safety and performance of moving cold data
• Runtime overhead of FUSE
• Processing load on mt-relocatord as the number of files grows
• Whether mt-relocatord needs to have its own scheduling function
Summary
• Proposed a general-purpose tiered storage system
• Proposed an implementation that combines FUSE, cgroup2, ioprio, and posix_fadvise
• We would like to apply this mechanism to container environments as well
  • ex. Docker Volume Plugins, Kubernetes Storage Interface