HDFS RAID
jiangbo
December 21, 2012
An introduction to HDFS RAID
Transcript
HDFS RAID ᇙത Friday, December 21, 12
Outline
• Overview
• Some details
• How to use
Why?
• $$$$$... and $$$$$...
How?
• Fewer replicas + no loss of safety = Erasure Code
Overview
DRFS (DistributedRaidFileSystem)
• A DFS that wraps HDFS and adds RAID-based repair.
• If reading block data from HDFS raises a CorruptionException or DecomissionException, the bad block is repaired from the parity file.
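The read path described on this slide can be sketched roughly as below. This is a minimal Python illustration, not the DistributedRaidFileSystem code: `CorruptionError`, `read_with_repair`, and the toy storage helpers are hypothetical names introduced only for this example.

```python
class CorruptionError(Exception):
    """Hypothetical stand-in for the exception raised on a corrupt block."""


def read_with_repair(read_block, reconstruct_from_parity, block_id):
    """Try a normal read; on corruption, rebuild the block from parity."""
    try:
        return read_block(block_id)
    except CorruptionError:
        # Fall back to the parity data to recover the block contents.
        return reconstruct_from_parity(block_id)


# Toy storage: block 1 is marked corrupt (None).
blocks = {0: b"aaaa", 1: None}


def toy_read(block_id):
    data = blocks[block_id]
    if data is None:
        raise CorruptionError(f"block {block_id} is corrupt")
    return data


def toy_reconstruct(block_id):
    # Pretend this was decoded from the parity file.
    return b"bbbb"
```

The caller only sees the recovered bytes; whether they came from the block itself or from parity is hidden behind the wrapper.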
RaidShell
• A client-side admin tool that provides raidFile, unraid, fixBlock, checkStatus, and similar functions.
• Sends its requests to the RaidNode and the NN (NameNode) over RPC.
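RaidNode
• The third master in DRFS, alongside the NN and the JT (JobTracker).
• Maintains a set of daemon threads that carry out raid, fix, purge, and other operations according to the raid requests (raid policies) submitted by users.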
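Threads on RaidNode
• TriggerMonitor: the core thread; schedules raid operations according to the policy configuration.
• BlockFixer: worker thread that repairs corrupt files.
• BlockCopier: worker thread that repairs decommissioning files.
• PurgeParityThread: thread that cleans up orphaned parity files.
• PlacementThread: thread that spreads out parity blocks that are too concentrated on a datanode.
• HarThread: thread that packs expired parity files into a har to reduce the number of parity files.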
Some details
• raid file
• fix file
Raid File
• hadoop raidshell -raidFile /path
• TriggerMonitor
RaidNode.doRaid
Encode
Erasure Code
• RS (Reed-Solomon): 10:4
• XOR: 10:1
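To make the 10:1 XOR scheme concrete, here is a minimal Python sketch, assuming a stripe of 10 equal-sized data blocks protected by one parity block. The function names are illustrative and are not taken from the HDFS RAID code. XOR parity can rebuild exactly one lost block per stripe; the 10:4 Reed-Solomon code adds four parity blocks per ten data blocks and needs a real RS implementation, which is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_parity(data_blocks):
    """Parity block = XOR of all data blocks in the stripe."""
    return xor_blocks(data_blocks)


def recover_block(surviving_blocks, parity):
    """XOR of the survivors and the parity yields the single missing block."""
    return xor_blocks(surviving_blocks + [parity])


# A toy stripe of 10 data blocks, 4 bytes each.
stripe = [bytes([i] * 4) for i in range(1, 11)]
parity = encode_parity(stripe)

# Lose one block and rebuild it from the other nine plus the parity.
lost_index = 3
survivors = stripe[:lost_index] + stripe[lost_index + 1:]
recovered = recover_block(survivors, parity)
```

Recovery works because XOR-ing a value with itself cancels out, so everything except the missing block disappears from the result.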
Fix File
• A DRFS read catches an exception.
• BlockIntegrityMonitor detects a bad file.
DRFS read
BlockIntegrityMonitor
• local
• dist
• Sorry... no diagram for this part...
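Local
The core of the LocalBlockIntegrity thread is a periodic call to the doFix method, which repairs corrupt files. The main flow is:
1. Get the corrupt file information through DFSck (HTTP access).
2. Filter out corrupt files that cannot be recovered (those with no parity file).
3. Sort the corrupt files by the following rules:
◦ parity files first, source files after
◦ among parity files, higher codec.priority first (codec.priority is configured through coder_priority in the JSON)
4. Recover the sorted list of corrupt files one by one through BlockReconstructor.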
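The filter-and-sort steps above could look roughly like this in Python. This is only a sketch of the ordering rules as stated on the slide; the `CorruptFile` record and its fields are hypothetical and do not mirror the real classes.

```python
from dataclasses import dataclass


@dataclass
class CorruptFile:
    path: str
    is_parity: bool          # True for a parity file, False for a source file
    recoverable: bool        # hypothetical flag: False when nothing exists to rebuild from
    codec_priority: int = 0  # only meaningful for parity files


def order_for_repair(files):
    """Drop unrecoverable files, then put parity files first, highest codec priority first."""
    candidates = [f for f in files if f.recoverable]

    def sort_key(f):
        if f.is_parity:
            return (0, -f.codec_priority)
        return (1, 0)  # source files keep their relative order (sorted is stable)

    return sorted(candidates, key=sort_key)


files = [
    CorruptFile("/data/a", is_parity=False, recoverable=True),
    CorruptFile("/raid/b", is_parity=True, recoverable=True, codec_priority=10),
    CorruptFile("/data/c", is_parity=False, recoverable=False),
    CorruptFile("/raidrs/a", is_parity=True, recoverable=True, codec_priority=20),
]
ordered_paths = [f.path for f in order_for_repair(files)]
```

The paths and priority values are made up for the example.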
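Dist
1. Check the repair jobs that are currently running. If the number of jobs already exceeds the job limit, wait for earlier jobs to finish (the limit is configured with raid.blockfix.maxpendingjobs; the default is 100).
2. Get the damaged file information through DFSck: the blockfixer thread gets the corrupt files, and blockCopier gets the decommissioning files.
3. Compute the priority of each damaged file.
4. Sort the file list by priority and use it as the parameter for building the repair job.
5. The job input is a sequence file containing the paths of all files that need repair. It is synced according to the value of raid.blockfix.filespertask, so that the job's split phase splits the input by that value (the default is 20).
6. The job's Mapper mainly uses the Reconstructor to recover the corresponding files on the task machine.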
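A rough Python sketch of steps 1 and 5: refuse to start a job when too many are already running, otherwise cut the prioritized file list into per-task groups. The function `plan_fix_job` is hypothetical; only the two defaults (100 and 20) come from the slide, and the real job writes a sequence file rather than returning lists.

```python
def plan_fix_job(running_jobs, files_by_priority,
                 max_pending_jobs=100, files_per_task=20):
    """Return per-task file groups, or None when the job limit is exceeded."""
    if running_jobs > max_pending_jobs:
        # Too many repair jobs in flight: the caller should wait and retry.
        return None
    return [
        files_by_priority[i:i + files_per_task]
        for i in range(0, len(files_by_priority), files_per_task)
    ]


# 45 files sorted by priority end up in three map tasks: 20, 20 and 5 files.
paths = [f"/data/file-{n}" for n in range(45)]
splits = plan_fix_job(running_jobs=3, files_by_priority=paths)
```

Keeping the groups in priority order means the highest-priority files land in the first splits.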
Priority calculation
1. Priority of a corrupt file (R is the file's replica count, C is the number of corrupt blocks in the file):
◦ default: LOW
◦ R>1 && C>0: HIGH
◦ R==1 && C>1: HIGH
◦ parity file corrupt && C>0: HIGH
2. Priority of a decommissioning file (D is the number of decommissioning blocks):
◦ default: LOW
◦ D>4: HIGH
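The rules on this slide translate directly into two small functions. The Python below is a sketch of the rules as written here, with hypothetical function names, not the actual implementation.

```python
def corrupt_priority(replication, corrupt_blocks, is_parity=False):
    """Priority of a corrupt file: R = replication, C = corrupt_blocks."""
    if replication > 1 and corrupt_blocks > 0:
        return "HIGH"
    if replication == 1 and corrupt_blocks > 1:
        return "HIGH"
    if is_parity and corrupt_blocks > 0:
        return "HIGH"
    return "LOW"  # default


def decommission_priority(decommissioning_blocks):
    """Priority of a decommissioning file: D = decommissioning_blocks."""
    return "HIGH" if decommissioning_blocks > 4 else "LOW"
```

For example, a source file with one replica and a single corrupt block stays at LOW, while the same file with two corrupt blocks is HIGH.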
BlockReconstructor
• File repairs triggered by BlockIntegrityMonitor and RaidShell are all ultimately carried out by BlockReconstructor.
• BlockReconstructor handles three kinds of files: har parity files, parity files, and source data files.
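Har parity file
1. Get the basic information and index of the har file.
2. Get the lost blocks in the har file and process each block as follows:
3. Create a temporary file for the block on the local file system.
4. For every parity file the block involves, get the corresponding source file and re-encode it with the Encoder, generating the parity locally.
5. Send the locally generated block data to a datanode. The datanode is picked at random from the cluster, excluding the node that held the original block. The block's meta file is generated during the send.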
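The datanode selection rule in step 5 is simple enough to show in a few lines. This Python sketch assumes node names are plain strings and that the nodes to avoid are passed in as a set; `choose_target_datanode` is a hypothetical helper, not a function from the HDFS RAID code.

```python
import random


def choose_target_datanode(cluster_nodes, original_nodes, rng=random):
    """Pick a random datanode, excluding the node(s) that held the original block."""
    candidates = [node for node in cluster_nodes if node not in original_nodes]
    if not candidates:
        raise ValueError("no datanode available other than the original one")
    return rng.choice(candidates)


nodes = ["dn1", "dn2", "dn3", "dn4"]
target = choose_target_datanode(nodes, {"dn2"})
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible in tests.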
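Parity File
Repairing a parity file is relatively simple:
1. Create temporary files locally for the lost blocks, one at a time.
2. Get the source file of the parity file and re-encode it with the Encoder, generating the parity file's block locally.
3. Pick a dn (same selection rule as for har parity file repair), send the block data to that dn, and generate the meta file at the same time.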
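Source data file
Recovering a source file is the reverse of repairing a parity file; it is a decode process:
1. Run the repair operation for each lost block in the file.
2. Create a temporary file for the block locally.
3. Recover the block data with the Decoder.
4. Pick a target dn, send the block data to the target dn, and generate the meta file at the same time.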
Decoder
How to use
http://wiki.aliyun-inc.com/projects/apsara/wiki/Yunti1/HDFS-Raid-deploy
http://wiki.apache.org/hadoop/HDFS-RAID
Some issues
• Raiding very large numbers of files and very large files
Resources
http://wiki.apache.org/hadoop/HDFS-RAID
http://hadoopblog.blogspot.com/2009/08/hdfs-and-erasure-codes-hdfs-raid.html
http://jiangbo.me/blog/2012/12/21/hdfs-raid/
THX