General-purpose hybrid storage system

Presentation slides for the 4th Web System Architecture Workshop (第4回 Web System Architecture 研究会, WSA研).
https://websystemarchitecture.hatenablog.jp/entry/2019/02/26/100725

TAKAMURA Narimichi

April 13, 2019

Transcript

  1. Proposal of a General-Purpose Hybrid Storage System
     Heartbeats Inc. (株式会社ハートビーツ)
     Takamura Narimichi @nari_ex
     2019/04/13, 4th Web System Architecture Workshop (第4回 Web System Architecture 研究会)
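  2. About me
     • Takamura Narimichi / 高村 成道
     • @nari_ex
     • Heartbeats Inc. (株式会社ハートビーツ), Director and VPoE
     • The University of Electro-Communications (電気通信大学)
       • Bachelor's degree, Department of Communication Engineering and Informatics (情報・通信工学科), Faculty of Informatics and Engineering (情報理工学部)
     • GLOBIS University Graduate School of Management (グロービス経営大学院)
       • Master's degree (MBA), management major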
  3. Contents
     • Background and problems
     • Proposal
     • Implementation approach
     • Usage patterns for this mechanism
     • Summary
  4. Background and problems

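  5. Background and problems
     • The spread of web services has caused data volumes to grow explosively
     • The cost of the storage needed to hold this data keeps rising
     • Only a small fraction of all data is accessed frequently
     • To improve cost effectiveness, we want to cut the cost of storing rarely used data
     ※ In this study, "data" refers to content files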
  6. Supplement: definitions of data-related terms
     This study classifies data into two kinds by usage frequency
     • Hot data: data that is used frequently
     • Cold data: data that is used rarely
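  7. Supplement: examples of cold data among content data
     • Real-estate listing data for properties in low demand
     • Apparel product data for items that are out of season
     • Mail data that has been received and is no longer referenced
     • Old log data on a log distribution server
     ※ All of these assume cases where access at file granularity is required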
  8. Basic strategy for the problem
     • Move cold data to inexpensive storage by some means
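  9. Conventional approach 1: archiving
     • Storing data safely in a dedicated storage area for long-term retention
     • Moving data to inexpensive storage reduces cost
     • Problem
       • Because the data moves to a different location, it is hard to use it the same way as before it was moved
     => Unsuitable for content data whose chance of being used never drops to zero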
  10. Conventional approach 2: enterprise storage products
     • Automatically move rarely used data to inexpensive storage
     • Problems
       • Vendor lock-in
       • Large investment required
       • Hard to deploy in cloud environments
  11. Conventional approach 2: enterprise storage products
  12. Supplement: reduced freedom in the storage layer as systems move to the cloud
  13. Conventional approach 3: extending an existing file system
     • Btrfs extension [1]
       • Research that takes advantage of Btrfs's multi-device support
       • Data is moved at the generic block layer
     • Problems
       • Moving data puts load on shared resources (memory, CPU)
       • File systems other than Btrfs cannot be used
         • The widely used ext4 and xfs are not available
     [1] Hot Cold Data Tracking and Migration in btrfs.
  14. Supplement: overview of Linux I/O
  15. Summary of problems
     • Archiving
       • Unsuitable for content data
     • Enterprise products
       • Expensive
       • Vendor lock-in
       • Unsuitable for the cloud
     • Extending an existing file system
       • No choice of file system
       • The load during data movement affects the main workload
  16. Summary of problems 2
     • The environments where these approaches can be deployed are limited
     • The load during data movement is a problem
  17. Proposal

  18. Proposal of a general-purpose hybrid storage system
     • A hybrid storage system that can be used in a wide range of environments
     • Moving cold data does not interfere with the main workload
     ※ Assumes an implementation on Linux
  19. Goals
     • High generality
       • Usable in cloud environments as well as on-premises
       • Built from OSS and runs on Linux
       • Does not depend on a specific file system at deployment time
         • A file system can be chosen separately for hot storage and cold storage
     • The load of moving cold data is sufficiently low
  20. Implementation

  21. Implementation problems and countermeasures
     • On Linux, resource control is basically applied per process
     • Bandwidth control should apply only during relocation
     • => Separate access routing and data relocation into different processes
  22. Implementation policy
     1. Access routing
        • Implemented as a userspace file system (FUSE)
          • Easy to deploy
          • Any file system can be used underneath
     2. Data relocation
        • Implemented as a daemon process
  23. 1. Access routing
  24. Multi-Temperature FileSystem (MTFS)
     • A userspace file system
     • Userspace daemon process: mtfsd
     • Started with a partition for hot data and a partition for cold data, each specified separately
     • Controls the file operations requested by applications
  25. Overview of mtfsd processing
     1. The application requests file access
     2. mtfsd receives the system call through the FUSE library
     3. mtfsd queries the hot storage
        • If the file does not exist there, it queries the cold storage
     4. The retrieved data is returned to the application
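
The lookup order in steps 3 and 4 can be illustrated with a small sketch. This is not the mtfsd implementation and does not use FUSE; it only shows "try hot storage first, then fall back to cold storage" on two ordinary directories. The function names `resolve` and `read_file` and the directory arguments are assumptions made for illustration.

```python
import os
import tempfile


def resolve(rel_path, hot_root, cold_root):
    """Return the backing path for rel_path, preferring the hot tier.

    hot_root and cold_root are hypothetical mount points of the two tiers.
    """
    rel_path = rel_path.lstrip("/")  # keep os.path.join from discarding the root
    for root in (hot_root, cold_root):
        candidate = os.path.join(root, rel_path)
        if os.path.exists(candidate):
            return candidate
    raise FileNotFoundError(rel_path)


def read_file(rel_path, hot_root, cold_root):
    """Read a file through the hot-then-cold lookup and return its bytes."""
    with open(resolve(rel_path, hot_root, cold_root), "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as hot, tempfile.TemporaryDirectory() as cold:
        # The file exists only in the cold tier, so the lookup falls through.
        with open(os.path.join(cold, "old.log"), "wb") as f:
            f.write(b"archived entry")
        print(read_file("old.log", hot, cold))
```

A real FUSE daemon would run the same lookup inside its `open` and `getattr` handlers so that applications see a single merged directory tree.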
  26. (no text on this slide)

  27. How reads are handled

  28. How writes are handled

  29. 2. Data relocation
  30. Data relocation
     • Userspace daemon process: mt-relocatord
     • Runs single-threaded because the work does not need to finish quickly
  31. Implementation problems and countermeasures
     • Controlling I/O bandwidth and throughput
       • => Control block I/O with cgroup2
         • IOPS control: limit with riops and wiops
         • Throughput control: limit with rbps and wbps
       • => Control the I/O scheduler layer with ioprio_set()
         • Specify CLASS_IDLE
     • During relocation, cold data ends up consuming the disk cache
       • => Clear the cache for the cold data being moved with posix_fadvise(POSIX_FADV_DONTNEED)
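
The cache countermeasure can be sketched as follows. This is a minimal illustration, not the mt-relocatord code: it copies one file and then advises the kernel to drop the cached pages for both the source and the destination. The function name `copy_without_cache_pollution` and the chunk size are assumptions. Deleting the source after a successful copy and setting the I/O priority with `ioprio_set()` are left out, since the Python standard library has no wrapper for the latter.

```python
import os

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def copy_without_cache_pollution(src, dst):
    """Copy src to dst, then ask the kernel to drop cached pages for both.

    Returns the size of the destination file in bytes.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(CHUNK)
            if not buf:
                break
            fout.write(buf)
        fout.flush()
        # Dirty pages cannot be dropped, so write them back first.
        os.fsync(fout.fileno())
        for f in (fin, fout):
            # offset=0 and len=0 together mean "the whole file".
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
    return os.path.getsize(dst)
```

`POSIX_FADV_DONTNEED` is advice rather than a guarantee, so the kernel may keep some pages cached. `os.posix_fadvise` is available on Unix only, which matches the slide's assumption of a Linux implementation.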
  32. Where Linux I/O resources are controlled
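
For the cgroup2 control point, limits are set by writing a line to the `io.max` file of a cgroup, keyed by the block device's major and minor numbers. The sketch below only builds that line; the helper name `io_max_line`, the device numbers, and the cgroup path in the comment are assumptions, and the slides do not show how mt-relocatord actually configures its limits.

```python
def io_max_line(major, minor, rbps=None, wbps=None, riops=None, wiops=None):
    """Build one line for a cgroup v2 io.max file.

    A limit left as None is written as "max", which means unlimited.
    """
    limits = {"rbps": rbps, "wbps": wbps, "riops": riops, "wiops": wiops}
    parts = [f"{major}:{minor}"]
    for key, value in limits.items():
        parts.append(f"{key}={'max' if value is None else int(value)}")
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical usage (requires root and an existing cgroup):
    #   with open("/sys/fs/cgroup/relocator/io.max", "w") as f:
    #       f.write(io_max_line(8, 16, rbps=2 * 1024 * 1024, wiops=120))
    print(io_max_line(8, 16, rbps=2 * 1024 * 1024, wiops=120))
```

Placing only the relocation daemon's process in such a cgroup keeps the limit away from the access-routing process, which is the reason the two are separated.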
  33. Overview of relocation processing

  34. mt-relocatord settings
     • Scheduling of the relocation process
       • Start time
       • Execution interval (in days)
     • Relocation thresholds
       • Number of accesses and number of updates per unit of time
       • Time elapsed from the last access or last update until now
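
The elapsed-time threshold can be sketched with ordinary file timestamps. This is an assumption about how such a check might look, not the mt-relocatord logic: the function name `is_cold` and the `max_idle_seconds` parameter are hypothetical, and the per-unit-time access and update counts are not covered because they need a separate counter that the slides do not describe.

```python
import os
import time


def is_cold(path, max_idle_seconds, now=None):
    """Return True if path was neither read nor modified for max_idle_seconds.

    Uses st_atime and st_mtime; "now" can be injected to make checks repeatable.
    """
    now = time.time() if now is None else now
    st = os.stat(path)
    last_used = max(st.st_atime, st.st_mtime)
    return (now - last_used) >= max_idle_seconds
```

Access times depend on mount options: with `noatime` they are never updated, and with `relatime` they are updated only under certain conditions, so a check like this is only as accurate as the underlying file system's atime policy.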
  35. Usage example

      // Create partitions
      # mkfs.xfs /dev/vda1
      # mkfs.btrfs /dev/vdb1

      // Create MTFS managed volumes
      # mtfsctl hot-volume create hv0 /export/sda1/www/
      # mtfsctl cold-volume create cv0 /export/sdb1/www/

      // Create mtfsd
      # mtfsctl volume start hv0 cv0
      # systemctl start mtfsd

      // Start mt-relocatord
      # systemctl start mt-relocatord
  36. Usage patterns for this mechanism

      Environment                   Hot Data Storage               Cold Data Storage
      On-premises (Storage Device)  SSD                            HDD
      AWS (Block Storage)           EBS Provisioned IOPS SSD       Cold HDD
      AWS (Shared Storage)          EFS (Provisioned Throughput)   EFS (Infrequent Access Storage Class)

      ※ Except for Shared Storage, any file system can be used
  37. Open issues
     • Safety and performance of cold data movement
     • Runtime overhead of FUSE
     • Processing load on mt-relocatord as the number of files grows
     • Whether mt-relocatord needs its own scheduling function
  38. Summary
     • Proposed a general-purpose tiered storage system
     • Proposed an implementation that combines FUSE, cgroup2, ioprio, and posix_fadvise
     • We would like to apply this mechanism to container environments as well
       • e.g. Docker Volume Plugins, Kubernetes Storage Interface