Yuuki Tsubouchi (yuuk1)
May 23, 2014
dsync: Efficient Block-wise Synchronization of Multi-Gigabyte Binary Data
Paper Reading Group #4
On backups that are faster than rsync, implemented at the block device level
Transcript
dsync: Efficient Block-wise Synchronization of Multi-Gigabyte Binary Data
Thomas Knauth and Christof Fetzer, Technische Universität Dresden
LISA'13 Best Paper Award
id:y_uuki, 2014/05/22, Paper Reading Group #4
Agenda
• Backup
• Problem
• Implementation
• Device mapper
• Evaluation
• Discussion
• Conclusion
Better Backup
• Minimize network traffic
• No CPU cost for checksum computation
• Minimize disk reads/writes
• Minimize pollution of the OS page cache
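Backup assumptions
• Backups should run periodically and as frequently as possible
• The data itself rarely changes
• The changed data is only a small fraction of the total data volume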
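Problem
• nc/scp: copies all the data
  • High network traffic
  • 10 Gbps Ethernet, 100 GB: 83 sec (1.2 GB/s)
• rsync: transfers only the differences
  • Must read all the data to compute the differences
  • High CPU cost for checksum computation
  • Pollutes the OS cache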
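As a rough check of the nc/scp figure above, the transfer time is simply the data size divided by the sustained throughput. The sketch below assumes decimal units (1 GB = 10^9 bytes) and takes the slide's 1.2 GB/s as the effective rate of the 10 Gbps link; the function name is illustrative.

```python
def transfer_seconds(size_bytes: float, rate_bytes_per_sec: float) -> float:
    """Time needed to push size_bytes through a link at a sustained rate."""
    return size_bytes / rate_bytes_per_sec

GB = 10**9  # assumption: decimal gigabytes

# Full copy of 100 GB at the slide's effective rate of 1.2 GB/s.
full_copy = transfer_seconds(100 * GB, 1.2 * GB)
# Same copy at the raw 10 Gbps line rate (10e9 bits/s = 1.25 GB/s).
line_rate = transfer_seconds(100 * GB, 10e9 / 8)

print(f"{full_copy:.0f} s at 1.2 GB/s, {line_rate:.0f} s at line rate")
```

Even on a fast link, a full copy costs over a minute per 100 GB, which is why transferring only the changed blocks matters.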
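rsync
① Split the file to be synchronized into fixed-size blocks on both the sender and the receiver. The goal is to compare block by block and transfer only the blocks that differ.
② Compute a checksum for each block and send only the checksum instead of the block contents.
③ Compare the checksums to check whether a block differs.
Two checksums are used: a weak one (cheap to compute) and a strong one (expensive to compute). The weak (rolling) checksum picks out the blocks that differ, and the strong checksum confirms the difference reliably.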
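The three steps above can be sketched as follows. This is a simplified illustration, not rsync's actual implementation: it compares aligned blocks only and does not slide the rolling window across the file as real rsync does, both buffers live in one process instead of exchanging checksums over the network, and MD5 is just one possible choice of strong checksum. All function names are hypothetical.

```python
import hashlib
from typing import List

def weak_checksum(block: bytes) -> int:
    """rsync-style weak checksum: two 16-bit sums packed into 32 bits."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) & 0xFFFF
    return (b << 16) | a

def strong_checksum(block: bytes) -> bytes:
    """Expensive checksum used to confirm what the weak one suggests."""
    return hashlib.md5(block).digest()

def split_blocks(data: bytes, block_size: int) -> List[bytes]:
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def differing_blocks(sender: bytes, receiver: bytes, block_size: int) -> List[int]:
    """Return indices of sender blocks that must be transferred."""
    src = split_blocks(sender, block_size)
    dst = split_blocks(receiver, block_size)
    changed = []
    for index, block in enumerate(src):
        if index >= len(dst):
            changed.append(index)  # receiver has no such block yet
        elif weak_checksum(block) != weak_checksum(dst[index]):
            changed.append(index)  # cheap check already shows a difference
        elif strong_checksum(block) != strong_checksum(dst[index]):
            changed.append(index)  # weak checksums collided, strong ones differ
    return changed

old = b"aaaabbbbccccdddd"
new = b"aaaabbXbccccdddd"
print(differing_blocks(new, old, 4))
```

Note that every block on both sides is read and checksummed, even when only one block changed. This is the cost dsync avoids.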
Idea
• If the changed regions are determined at backup time...
  • comparing checksums is the only option
• Track the changed regions from the start instead
Implementation
• Keep tracking information about modified blocks at the block device level
• Tracking information: a per-block "modified" flag (1 bit / block)
• Total data: 4 TiB -> flags: 128 MiB
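The 128 MiB figure follows from one bit per block. The sketch below reproduces it under the assumption of a 4 KiB block size, which the slide does not state but which is the value that yields 128 MiB for 4 TiB; the function name is illustrative.

```python
def bitvector_bytes(device_bytes: int, block_bytes: int) -> int:
    """Bytes needed for a bit vector with one bit per device block."""
    blocks = -(-device_bytes // block_bytes)  # ceiling division
    return -(-blocks // 8)                    # 8 flags per byte

KiB, MiB, TiB = 2**10, 2**20, 2**40

# Assumption: 4 KiB blocks. 4 TiB / 4 KiB = 2^30 blocks = 2^30 bits = 128 MiB.
size = bitvector_bytes(4 * TiB, 4 * KiB)
print(size // MiB, "MiB")
```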
Interface
• Userspace interface
  • A corresponding file is created under /proc for each device
  • /proc/mydev: list of block numbers. Writing to it resets the bit vector
• dmextract: extracts the modified blocks and writes (block number, data) to stdout
• dmextract mydev | ssh remotehost dmmerge /dev/mapper/mydev
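The extract/merge pipeline above can be modelled in a few lines. This is only a userspace sketch of the idea, not the real dmextract and dmmerge tools: the record layout (8-byte big-endian block number followed by the block data), the 4 KiB block size, and the function names are all assumptions made for illustration, and in-memory buffers stand in for the devices and the ssh pipe.

```python
import io
import struct

BLOCK_SIZE = 4096              # assumption: 4 KiB blocks
HEADER = struct.Struct(">Q")   # hypothetical record header: block number

def extract(device, modified_blocks, out):
    """Write a (block number, data) record for each modified block."""
    for block_no in modified_blocks:
        device.seek(block_no * BLOCK_SIZE)
        out.write(HEADER.pack(block_no))
        out.write(device.read(BLOCK_SIZE))

def merge(stream, target):
    """Apply (block number, data) records to the backup target."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            break  # end of stream
        (block_no,) = HEADER.unpack(header)
        target.seek(block_no * BLOCK_SIZE)
        target.write(stream.read(BLOCK_SIZE))

# Source and backup start out identical; then block 2 changes on the source.
old = bytes(4 * BLOCK_SIZE)
source = io.BytesIO(old)
target = io.BytesIO(old)
source.seek(2 * BLOCK_SIZE)
source.write(b"\x01" * BLOCK_SIZE)

pipe = io.BytesIO()            # stands in for the ssh pipe
extract(source, [2], pipe)     # [2] is what the tracking data would report
pipe.seek(0)
merge(pipe, target)
print(target.getvalue() == source.getvalue())
```

Only one block plus an 8-byte header crosses the pipe, regardless of the device size.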
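Block Device
[Figure: the storage stack, with each component and its role]
• Application: issues file read/write requests
• File system: maps files to device blocks
• Page cache: caches read/write data in RAM
• Generic block device driver: reorders I/O requests into a form suited to the device, among other things
• Block device driver: processing specific to the individual hardware
• Storage device (HDD/SSD)
• The block layers together provide the storage device abstraction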
Device mapper
[Figure from http://lc.linux.or.jp/lc2009/slide/T-02-slide.pdf: the same storage stack as on the previous slide, with a Device mapper layer added. Device mapper transforms block read/write requests in various ways.]
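Device mapper (1)
• Bundles multiple physical block devices into a single logical device
  • Mirror, Stripe, Snapshot
  • RAID 0, 1, 5, 10
• Snapshot: redirects all changes made to a device to a separate device (copy on write)
  • The two devices can be merged later
• Could a backup be created by remotely redirecting the changes to a backup device??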
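Device mapper (2)
• Snapshot falls short in two respects, so it cannot be used for this
  • The device has to buffer all changes temporarily; if the buffer overflows, data is lost
  • The original data has to be merged at the backup destination
• Device mapper has all the information needed to track block changes
  • The 'map' function in linear mapping mode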
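Architecture
• Instead of tracking changes on a physical device, track them on a loopback device
• Loopback device: a feature that lets a regular file be treated as if it were a block device
[Figure: two stacks. Left: block device, Device mapper, application. Right: block device, file system, loopback device, Device mapper, application.]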
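Data Structure
• Change information is kept in RAM, 1 bit per block
  • An array with one 1-bit element per block
• Memory allocation options
  • kmalloc(), __get_free_pages(), vmalloc()
  • Only vmalloc() can reliably allocate memory in megabyte-sized units
  • kmalloc() is limited by the slab object size (32 MiB); vmalloc() allocates in units of pages
• Because the data structure is in memory, the tracking information is lost on shutdown
  • Write the tracking information to disk on shutdown and read it back at startup
  • If it is corrupted, do a full synchronization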
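The in-memory structure and its persistence can be modelled in userspace as below. This is an illustrative sketch only: the actual tracking lives in a kernel module, and the class name, method names, and the length check used to detect a bad saved vector are assumptions.

```python
from typing import List

class BlockBitVector:
    """One 'modified' bit per block, kept in memory."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def mark(self, block_no: int) -> None:
        self.bits[block_no >> 3] |= 1 << (block_no & 7)

    def mark_write(self, offset: int, length: int, block_size: int) -> None:
        """Mark every block touched by a write of `length` bytes at `offset`."""
        first = offset // block_size
        last = (offset + length - 1) // block_size
        for block_no in range(first, last + 1):
            self.mark(block_no)

    def is_modified(self, block_no: int) -> bool:
        return bool(self.bits[block_no >> 3] & (1 << (block_no & 7)))

    def modified_blocks(self) -> List[int]:
        return [n for n in range(self.num_blocks) if self.is_modified(n)]

    def clear(self) -> None:
        """Reset after a successful synchronization."""
        self.bits = bytearray(len(self.bits))

    def to_bytes(self) -> bytes:
        """Serialized form to write to disk on shutdown."""
        return bytes(self.bits)

    @classmethod
    def from_bytes(cls, num_blocks: int, raw: bytes) -> "BlockBitVector":
        """Restore at startup; a mismatch means the caller should full-sync."""
        vec = cls(num_blocks)
        if len(raw) != len(vec.bits):
            raise ValueError("saved bit vector does not match device size")
        vec.bits = bytearray(raw)
        return vec

vec = BlockBitVector(16)
vec.mark_write(offset=3 * 4096 + 100, length=5000, block_size=4096)
print(vec.modified_blocks())

restored = BlockBitVector.from_bytes(16, vec.to_bytes())
print(restored.modified_blocks() == vec.modified_blocks())
```

Marking a write is a couple of bit operations in memory, which is why the update overhead is small compared with reading and checksumming blocks.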
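Evaluation (tools)
• scp/nc
• rsync
• blockmd5sync
  • A version of rsync without the rolling checksum
• ZFS
  • Features: logical volumes, snapshots, extracting the difference between two snapshots
  • Can use information that is not accessible at the block device level, e.g. ignoring only /tmp
• dsync
  • Does not depend on the file system
  • Has limitations because it has no file system information (mtime, etc.)
  • Keeps a tracking status per block instead of computing checksums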
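Evaluation (Benchmarks)
• 6-core AMD Phenom II processor
• 2 TB spinning disk (Samsung HD204UI)
• 128 GB SSD (Intel SSDSC2CT12)
• Connected by Gigabit Ethernet through a switch
Synchronization time (HDD/SSD)
• rsync: 33 min, dsync: 7 min
[Figure: synchronization time, panels for HDD and SSD]
CPU utilization
[Figure annotations: rsync receiver-side checksum; rsync sender-side checksum; a core is fully used]
Network bandwidth
[Figure]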
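Discussion
• dsync always outperforms rsync
  • The work rsync does is a superset of the work dsync does
  • Like dsync, rsync reads, transmits, and merges all updated blocks
  • In addition, rsync has to read "all blocks" and compute checksums to determine which blocks were updated
• Overhead of updating the bit vector
  • Negligible, because it is in memory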
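Conclusion
• Proposes and implements an efficient method for periodically synchronizing huge binary data
• Tracks changes online instead of computing checksums
• Extends the Device mapper of the Linux kernel
• dmextract and dmmerge
• rsync vs dsync: 32 min vs 7 min
Impressions
• The comparison is rather unfair to rsync, so a comparison of synchronization time over the Internet, where the transfer volume matters, would be a welcome addition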
Linux 3.2 kernel module patch https://bitbucket.org/tknauth/devicemapper/