Slide 1

Slide 1 text

dsync: Efficient Block-wise Synchronization of Multi-Gigabyte Binary Data
Thomas Knauth and Christof Fetzer, Technische Universität Dresden
LISA '13 Best Paper Award
id:y_uuki, 2014/05/22, Paper Reading Group #4

Slide 2

Slide 2 text

Agenda
• Backup
• Problem
• Implementation
• Device mapper
• Evaluation
• Discussion
• Conclusion

Slide 3

Slide 3 text

Better Backup
• Minimize network traffic
• No CPU cost for checksum computation
• Minimize disk reads/writes
• Minimize OS page cache pollution

Slide 4

Slide 4 text

Backup assumptions
• Back up regularly and as frequently as possible
• The data itself rarely changes
• Changed data is only a small fraction of the total data volume

Slide 5

Slide 5 text

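Problem
• nc/scp: copies all data
  • Heavy network traffic
  • 10 Gbps Ethernet, 100 GB: 83 sec (1.2 GB/s)
• rsync: transfers only the differences
  • Must read all data to compute the differences
  • High CPU cost for checksum computation
  • Pollutes the OS cache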
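The 83-second figure for a full copy can be reproduced with a one-line calculation. This is only a back-of-the-envelope check that takes the slide's 1.2 GB/s as the effective throughput; the variable names are illustrative.

```python
# Back-of-the-envelope time for copying everything with nc/scp.
# Assumption: 1.2 GB/s effective throughput on 10 Gbps Ethernet (slide figure).
data_gb = 100            # total data to copy, in GB
throughput_gb_s = 1.2    # effective throughput, in GB/s

full_copy_seconds = data_gb / throughput_gb_s
print(f"full copy: {full_copy_seconds:.0f} s")
```

Even on a fast link, the cost scales with the total data size rather than with the amount of changed data.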

Slide 6

Slide 6 text

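rsync
① Split the file to be synchronized into fixed-length blocks on both the sender and the receiver. The goal is to check for differences block by block and transfer only the blocks that differ.
② Compute a checksum for each block and send only the checksum instead of the block contents.
③ Compare the checksums to check whether there is a difference.
There are two checksums: a weak one (low computation cost) and a strong one (high computation cost). The weak checksum (rolling checksum) picks out the blocks that appear to differ, and the strong checksum then verifies the difference reliably.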

Slide 7

Slide 7 text

Idea
• Determining the changed locations at backup time...
  • leaves no option but to compare checksums
• Track the changed locations from the start

Slide 8

Slide 8 text

Implementation
• Keep tracking information for modified blocks at the block device level
• Tracking information: a per-block modified flag (1 bit / block)
• Total data: 4 TiB -> flags: 128 MiB
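The 128 MiB figure follows from simple arithmetic. The slide does not state the block size; the calculation below assumes 4 KiB blocks, which is the size that makes the numbers match.

```python
KIB = 1 << 10
MIB = 1 << 20
TIB = 1 << 40

total_bytes = 4 * TIB
block_size = 4 * KIB                      # assumption: not stated on the slide

num_blocks = total_bytes // block_size    # one flag (1 bit) per block
bitmap_bytes = num_blocks // 8            # 8 flags per byte

print(f"{num_blocks} blocks -> {bitmap_bytes // MIB} MiB of flags")
```

The tracking overhead is therefore 1 bit per 32,768 bits of data, or roughly 0.003% of the tracked capacity.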

Slide 9

Slide 9 text

Interface
• Userspace interface
• A corresponding file is created under /proc for each device
• /proc/mydev: the list of block numbers. Writing to it resets the bit vector
• dmextract: extracts the modified blocks; stdout: (block number, data)
• dmextract mydev | ssh remotehost dmmerge /dev/mapper/mydev
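The extract/merge pipeline can be illustrated with a small in-memory model. This is only a sketch of the idea: the real tools' stream format is not given in the slides, so the 8-byte block-number header, the 4 KiB block size, and the function names `extract` and `merge` are all assumptions, and `io.BytesIO` objects stand in for the devices and the ssh pipe.

```python
import io
import struct

BLOCK_SIZE = 4096             # assumption: block size is not stated on the slide
HEADER = struct.Struct(">Q")  # hypothetical wire format: 8-byte block number, then the data

def extract(device, dirty_blocks, out):
    """Write (block number, data) pairs for every modified block to `out`."""
    for block_no in sorted(dirty_blocks):
        device.seek(block_no * BLOCK_SIZE)
        out.write(HEADER.pack(block_no))
        out.write(device.read(BLOCK_SIZE))

def merge(stream, target):
    """Read (block number, data) pairs and write each block into place."""
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of stream
        (block_no,) = HEADER.unpack(header)
        target.seek(block_no * BLOCK_SIZE)
        target.write(stream.read(BLOCK_SIZE))

# Toy devices: four zeroed blocks each; block 2 is then modified on the source.
source = io.BytesIO(bytes(4 * BLOCK_SIZE))
backup = io.BytesIO(bytes(4 * BLOCK_SIZE))
source.seek(2 * BLOCK_SIZE)
source.write(b"x" * BLOCK_SIZE)

pipe = io.BytesIO()           # stands in for the ssh connection
extract(source, {2}, pipe)
pipe.seek(0)
merge(pipe, backup)
```

Only one block plus its header crosses the "network", regardless of how large the device is.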

Slide 10

Slide 10 text

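Block Device
[Figure: layered storage stack, top to bottom, abstracting the storage device]
• Application: file read/write requests
• File system: maps files to device blocks
• Page cache: caches read/write data in RAM
• Generic block device driver: e.g. reorders I/O requests into a form suited to the device
• Block device driver: processing specific to the individual hardware
• Storage device (HDD/SSD)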

Slide 11

Slide 11 text

Device mapper
http://lc.linux.or.jp/lc2009/slide/T-02-slide.pdf
[Figure: layered storage stack (application, file system, page cache, generic block device driver, block device driver, storage device (HDD/SSD)) with Device mapper added as a layer that transforms block read/write requests in various ways]

Slide 12

Slide 12 text

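Device mapper (1)
• Can bundle multiple physical block devices into a single logical device
  • Mirror, Stripe, Snapshot
  • RAID 0, 1, 5, 10
• Snapshot: redirects all changes destined for the physical device to a logical device (Copy on Write)
  • The physical and logical devices can be merged later
  • Could a backup be made by remotely redirecting from the physical device to a backup device??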

Slide 13

Slide 13 text

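Device mapper (2)
• Two aspects are missing -> not feasible with Snapshot
  • The logical device must temporarily buffer all changes; if the buffer overflows, data is lost
  • The original data must be merged at the backup destination
• Device mapper has all the information needed to track block changes
  • The 'map' function in linear mapping mode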
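Why the 'map' function is a natural hook can be shown with a userspace model. This is not kernel code: the function names, the 512-byte sector and 4 KiB block sizes, and the use of a Python set in place of the bit vector are all assumptions made for illustration.

```python
SECTOR_SIZE = 512   # assumption: requests are addressed in 512-byte sectors
BLOCK_SIZE = 4096   # assumption: tracking granularity

def blocks_touched(sector: int, num_sectors: int) -> range:
    """Return the tracking blocks covered by a request."""
    first = sector * SECTOR_SIZE // BLOCK_SIZE
    last = ((sector + num_sectors) * SECTOR_SIZE - 1) // BLOCK_SIZE
    return range(first, last + 1)

def map_request(dirty: set, is_write: bool, sector: int, num_sectors: int) -> int:
    """Model of a linear 'map' hook: record writes, then pass the request through."""
    if is_write:
        dirty.update(blocks_touched(sector, num_sectors))
    return sector  # linear mapping with offset 0: the sector is forwarded unchanged

dirty = set()
map_request(dirty, True, 7, 2)     # write spanning sectors 7-8, i.e. blocks 0 and 1
map_request(dirty, False, 100, 8)  # reads are not tracked
```

Every request to the logical device passes through this one function, so no write can bypass the tracking.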

Slide 14

Slide 14 text

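Architecture
• Instead of tracking a physical device, track a loopback device
• Loopback device: a facility for treating a regular file as if it were a block device
[Figure: two stacks, top to bottom. Left: application, Device mapper, block device. Right: application, Device mapper, loopback device, file system, block device]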

Slide 15

Slide 15 text

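Data Structure
• Change information is held in RAM, 1 bit per block
  • An array of 1-bit elements whose length is the number of blocks
• Memory allocation considerations
  • kmalloc(), __get_free_pages(), vmalloc()
  • Only vmalloc() can reliably allocate in units of megabytes
  • kmalloc() is subject to a slab object size limit (32 MiB); vmalloc() allocates in units of pages
• Because it is an in-memory data structure, the tracking information is lost at shutdown
  • Write the tracking information to disk at shutdown and read it back at startup
  • If it is corrupted, do a full sync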
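The bit vector and the save/restore behaviour can be modelled in a few lines. This is a userspace sketch, not the kernel module's code: the class and function names are hypothetical, and the CRC32 prefix is an assumed way to detect corruption (the slides only say that a corrupted vector leads to a full sync).

```python
import zlib

class DirtyBitmap:
    """One bit per block; a set bit means the block changed since the last sync."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def mark(self, block_no: int) -> None:
        self.bits[block_no >> 3] |= 1 << (block_no & 7)

    def mark_all(self) -> None:
        """Treat every block as modified, which forces a full sync."""
        self.bits = bytearray(b"\xff" * len(self.bits))

    def dirty_blocks(self) -> list[int]:
        return [n for n in range(self.num_blocks)
                if self.bits[n >> 3] & (1 << (n & 7))]

def save(bitmap: DirtyBitmap) -> bytes:
    """Serialize at shutdown; the CRC32 prefix is an assumption of this sketch."""
    payload = bytes(bitmap.bits)
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def load(blob: bytes, num_blocks: int) -> DirtyBitmap:
    """Restore at startup; fall back to a full sync if the data looks corrupted."""
    bitmap = DirtyBitmap(num_blocks)
    stored_crc, payload = blob[:4], blob[4:]
    intact = (len(payload) == len(bitmap.bits)
              and zlib.crc32(payload).to_bytes(4, "big") == stored_crc)
    if intact:
        bitmap.bits = bytearray(payload)
    else:
        bitmap.mark_all()
    return bitmap

bitmap = DirtyBitmap(20)
bitmap.mark(3)
bitmap.mark(17)
blob = save(bitmap)

restored = load(blob, 20)
corrupted = bytes([blob[0] ^ 0xFF]) + blob[1:]   # damage the stored checksum
fallback = load(corrupted, 20)
```

Falling back to "everything is dirty" is the safe choice: it costs one full synchronization but never silently skips a modified block.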

Slide 16

Slide 16 text

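Evaluation (tools)
• scp/nc
• rsync
• blockmd5sync
  • rsync without the rolling checksum
• ZFS
  • features: logical volumes, snapshots, extracting the difference between two snapshots
  • Information that is not accessible at the block device level: e.g. ignoring only /tmp
• dsync
  • Independent of the file system
  • Has limitations because file system information is unavailable (mtime, etc.)
  • Keeps a tracking status per block instead of computing checksums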

Slide 17

Slide 17 text

Evaluation (Benchmarks)
• 6-core AMD Phenom II processor
• 2 TB spinning disk (Samsung HD204UI)
• 128 GB SSD (Intel SSDSC2CT12)
• Connected by Gigabit Ethernet through a switch

Slide 18

Slide 18 text

Synchronization time (HDD/SSD)
rsync: 33 min, dsync: 7 min
[Figure: synchronization time plots; panels: HDD, SSD]

Slide 19

Slide 19 text

CPU utilization
[Figure: CPU utilization plot; annotations: rsync receiver-side checksum, rsync sender-side checksum, "a core is fully used"]

Slide 20

Slide 20 text

Network bandwidth

Slide 21

Slide 21 text

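Discussion
• Always dsync > rsync
  • rsync's work is a superset of dsync's
  • Like dsync, rsync reads/transmits/merges all updated blocks
  • In addition to what dsync does, rsync must read "all blocks" and compute checksums to determine which blocks were updated
• Overhead of updating the bit vector
  • Negligible because it is in memory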

Slide 22

Slide 22 text

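Conclusion
• Proposed and implemented an efficient method for periodically synchronizing huge binary data
• Tracks changes online instead of computing checksums
• An extension to the Linux kernel's Device mapper
• dmextract and dmmerge
• rsync vs dsync, 32 min vs 7 min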

Slide 23

Slide 23 text

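Impressions
• This comparison is rather unfair to rsync, so a comparison of synchronization times over the Internet, where the transfer volume matters, would be welcome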

Slide 24

Slide 24 text

Linux 3.2 kernel module patch https://bitbucket.org/tknauth/devicemapper/