The Structure of Btrfs

naota
October 19, 2013

Transcript

 1. Btrfs features and characteristics
  What kind of file system is Btrfs?
  2013-06-14 1 / 51

 2. Btrfs features and characteristics

 3. Btrfs features and characteristics
  Copy On Write

 4. Btrfs features and characteristics
  Journal

 5. Btrfs features and characteristics

 6. Btrfs features and characteristics

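 7. Btrfs features and characteristics
  Weaknesses of journaling
  Everything has to be written twice: once to the journal and once to its final location.
  Journaling the data as well is inefficient.
  Journaling only the metadata is workable.

 8. Btrfs features and characteristics

 9. Btrfs features and characteristics
  Copy On Write
  Each update only needs to be written once.
  It can be used to keep both data and metadata consistent.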
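
The contrast between the journaling and Copy On Write slides can be illustrated with a toy model. This is only a sketch, not Btrfs code: `JournalingStore` and `CowStore` are hypothetical classes, and the counters track data writes only, ignoring the pointer and metadata updates a real file system also has to persist.

```python
class JournalingStore:
    """Toy store that logs each update to a journal, then writes it in place."""

    def __init__(self):
        self.journal = []
        self.blocks = {}
        self.writes = 0

    def update(self, block_no, data):
        self.journal.append((block_no, data))  # first write: the journal
        self.writes += 1
        self.blocks[block_no] = data           # second write: the final location
        self.writes += 1


class CowStore:
    """Toy store that writes new data to a fresh location, then switches a pointer."""

    def __init__(self):
        self.space = []     # append-only storage; old copies are never overwritten
        self.pointers = {}  # block_no -> index into self.space
        self.writes = 0

    def update(self, block_no, data):
        self.space.append(data)                        # the only data write
        self.writes += 1
        self.pointers[block_no] = len(self.space) - 1  # old copy stays intact

    def read(self, block_no):
        return self.space[self.pointers[block_no]]
```

Because the old copy is left untouched until the pointer is switched, an interrupted update in the Copy On Write model leaves the previous version readable, which is what lets one mechanism cover both data and metadata.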

 10. Btrfs features and characteristics
  Snapshots

 11. Btrfs features and characteristics
  Defragmentation

 12. Btrfs features and characteristics
  Quota
  A special kind of quota called QGroup.
  Because snapshots exist, some data is shared.
  Limits can be set on both the total size and the unshared size.
  Covered in last year's October issue of Software Design.
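
The two sizes mentioned on the quota slide can be made concrete with a small calculation. The sketch below assumes a simplified model in which each subvolume is just a mapping from extent IDs to sizes; `qgroup_usage` is a hypothetical helper, not part of any Btrfs tool. The "total size" corresponds to what Btrfs calls referenced, and the "unshared size" to exclusive.

```python
from collections import Counter


def qgroup_usage(subvolumes):
    """Return {name: (referenced, exclusive)} in bytes.

    subvolumes maps a subvolume name to a dict of extent_id -> size.
    An extent is exclusive if exactly one subvolume references it.
    """
    owners = Counter()
    for extents in subvolumes.values():
        owners.update(extents.keys())

    usage = {}
    for name, extents in subvolumes.items():
        referenced = sum(extents.values())
        exclusive = sum(size for ext, size in extents.items() if owners[ext] == 1)
        usage[name] = (referenced, exclusive)
    return usage


# A subvolume, a snapshot taken of it, and one extent written afterwards.
layout = {
    "subvol": {1: 4096, 2: 8192, 3: 4096},
    "snapshot": {1: 4096, 2: 8192},
}
usage = qgroup_usage(layout)
```

In this example the snapshot references 12 KiB but owns nothing exclusively, so deleting it would free no space, while the subvolume has 4 KiB that only it references.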

 13. Btrfs features and characteristics
  Converting from ext2/3
  ext2/3 data can be converted as is.
  If you don't like the result, you can roll back.

 14. Btrfs features and characteristics
  Other features
  Scales up to 16 EiB (twice as much as XFS!)
  Checksums make block integrity checks possible.
  Transparent compression saves space.
  RAID improves reliability.
  send/receive makes backups efficient.
  Hot add/remove makes disk replacement easy.
  dedup eliminates duplicate data.
  There are SSD optimizations too.

 15. Btrfs features and characteristics
  In the future
  Automatic dedup at write time
  hot data tracking to cache frequently accessed files on SSD
  fsck?

 16. Btrfs features and characteristics
  Starting to want to use it, aren't you?

 17. B-tree
  B-tree
  The data structure used almost everywhere in Btrfs.
  If you don't understand it, you can't understand Btrfs.
  After all, "BtrFS" = "B-Tree File System"!

 18. B-tree
  Binary search tree

 19. B-tree
  B-tree

 20. B-tree
  Insertion into a B-tree

 21. B-tree
  Split
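
Since the diagrams for insertion and splitting are not part of the transcript, here is a minimal sketch of the idea. It implements a textbook B-tree, not the Btrfs on-disk format; `MAX_KEYS`, `Node`, `insert`, and `in_order` are names chosen for illustration, and keys are assumed to be distinct.

```python
import bisect

MAX_KEYS = 3  # assumed tiny capacity so that splits happen quickly


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty for a leaf

    @property
    def is_leaf(self):
        return not self.children


def _insert(node, key):
    """Insert key below node; return (median, right_sibling) if node split."""
    i = bisect.bisect_left(node.keys, key)
    if node.is_leaf:
        node.keys.insert(i, key)
    else:
        split = _insert(node.children[i], key)
        if split is not None:
            median, right = split
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) <= MAX_KEYS:
        return None
    # Overflow: split around the median key and push the median up.
    mid = len(node.keys) // 2
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    median, right = split
    return Node([median], [root, right])  # the tree grows upward at the root


def in_order(node):
    """Return all keys in sorted order."""
    if node.is_leaf:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out.extend(in_order(child))
        if i < len(node.keys):
            out.append(node.keys[i])
    return out
```

Starting from `root = Node()` and calling `root = insert(root, k)` for each key keeps every leaf at the same depth, because the tree only gains a level when the root itself splits.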

 22. B-tree
  CoW insertion
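
CoW insertion means that no existing node is modified: every node on the path from the root to the insertion point is copied, and the copy of the root becomes the new tree. The sketch below shows this path copying on a binary search tree as a simpler stand-in for the B-tree; `Node`, `cow_insert`, and `keys` are hypothetical names, not Btrfs functions.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def cow_insert(node, key):
    """Return the root of a new tree containing key; the old tree is untouched."""
    if node is None:
        return Node(key)
    if key < node.key:
        return node._replace(left=cow_insert(node.left, key))   # copy this node
    if key > node.key:
        return node._replace(right=cow_insert(node.right, key))  # copy this node
    return node  # key already present: nothing to copy


def keys(node):
    """Return the keys of the tree rooted at node in sorted order."""
    if node is None:
        return []
    return keys(node.left) + [node.key] + keys(node.right)


old_root = None
for k in (5, 2, 8):
    old_root = cow_insert(old_root, k)
new_root = cow_insert(old_root, 1)
```

After the last call, `old_root` still describes the tree as it was, and the untouched right subtree is shared between both roots. Keeping the old root around is essentially what a snapshot does.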

 23. B-tree

 24. B-tree

 25. B-tree

 26. B-tree

 27. B-tree

 28. B-tree

 29. Btrfs trees
  The Btrfs B-tree
  Key
    Object ID
    Type
    Offset
  Nodes and leaves
    The bottom level consists of leaves; everything above is nodes.
    Leaves store the data corresponding to each key.
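
The three-part key can be modeled as a tuple compared field by field: object ID first, then type, then offset. The sketch below is an in-memory illustration only. The `Key` class, the `lookup` helper, and the item payloads are hypothetical, and the type and object ID numbers should be treated as illustrative values rather than a reference.

```python
import bisect
from typing import NamedTuple


class Key(NamedTuple):
    objectid: int
    type: int
    offset: int


# Illustrative type numbers (assumption for this sketch).
INODE_ITEM = 1
EXTENT_DATA = 108

# A leaf holds (key, data) pairs sorted by key; tuples compare field by field.
leaf = sorted([
    (Key(257, INODE_ITEM, 0), "inode of file B"),
    (Key(256, EXTENT_DATA, 4096), "file A, bytes 4096.."),
    (Key(256, INODE_ITEM, 0), "inode of file A"),
    (Key(256, EXTENT_DATA, 0), "file A, bytes 0.."),
])


def lookup(leaf_items, key):
    """Binary-search a sorted leaf for an exact key; return its data or None."""
    only_keys = [k for k, _ in leaf_items]
    i = bisect.bisect_left(only_keys, key)
    if i < len(only_keys) and only_keys[i] == key:
        return leaf_items[i][1]
    return None
```

Because the object ID is the most significant field, all items belonging to one object end up next to each other in the leaf, ordered by type and then by offset.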

 30. Btrfs trees

 31. Btrfs trees
  Various trees
    Root tree
    FS tree
    extent tree
    chunk tree
    device tree
    CSum tree

 32. Btrfs trees
  Root Tree
  The fundamental tree!
  Holds the roots of the other trees.
  Used to obtain the subvolume structure.

 33. Btrfs trees
  FS Tree
  Each subvolume and snapshot has its own FS tree root.
  Directory structure
  i-node information
  File data location (extent address)

 34. Btrfs trees
  chunk and extent

 35. Btrfs trees
  device tree and CSum tree
  device tree
    Manages the devices registered with Btrfs.
  CSum tree
    A checksum for every 4 KB.
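
Per-block checksums make it possible to tell which 4 KB block of a file is damaged. The sketch below illustrates the idea only: Btrfs uses CRC32C by default, whereas this example uses the standard library's `zlib.crc32` (plain CRC-32) as a stand-in, and `block_checksums` and `find_corrupt_blocks` are hypothetical helpers.

```python
import zlib

BLOCK_SIZE = 4096  # one checksum per 4 KB, as on the slide


def block_checksums(data):
    """Return one CRC per 4 KB block (plain CRC-32 as a stand-in for CRC32C)."""
    return [
        zlib.crc32(data[i:i + BLOCK_SIZE])
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def find_corrupt_blocks(data, stored_sums):
    """Return the indexes of blocks whose checksum no longer matches."""
    return [
        i
        for i, (now, stored) in enumerate(zip(block_checksums(data), stored_sums))
        if now != stored
    ]


original = bytes(3 * BLOCK_SIZE)
sums = block_checksums(original)

damaged = bytearray(original)
damaged[5000] ^= 0xFF  # flip bits inside the second block
```

Here `find_corrupt_blocks(bytes(damaged), sums)` reports only block 1, so a redundant copy of just that block would be enough to repair the file.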

 36. Btrfs trees
  Let's try a lookup

 37. bootstrap
  Btrfs trees and files are all accessed by extent address (logical address).
  The root tree is also accessed by extent address.
  So is the chunk tree itself.
  How, then, is the very first extent address mapped to a physical address?
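
To see why this is a chicken-and-egg problem, it helps to look at what the translation step does once chunk information is available. The sketch below assumes the simplest possible case, where each chunk maps one contiguous logical range to one contiguous range on one device; `Chunk` and `logical_to_physical` are hypothetical names, and striping and mirroring are ignored.

```python
from typing import NamedTuple

MIB = 1024 * 1024


class Chunk(NamedTuple):
    logical: int   # start of the logical (extent address) range
    length: int    # size of the range in bytes
    device: int    # device ID holding the data
    physical: int  # start offset on that device


def logical_to_physical(chunks, addr):
    """Translate an extent address to (device, physical offset)."""
    for c in chunks:
        if c.logical <= addr < c.logical + c.length:
            return c.device, c.physical + (addr - c.logical)
    raise LookupError(f"no chunk covers logical address {addr}")


# Made-up layout for illustration.
chunks = [
    Chunk(logical=0, length=4 * MIB, device=1, physical=1 * MIB),
    Chunk(logical=4 * MIB, length=8 * MIB, device=2, physical=16 * MIB),
]
```

Every read goes through a lookup like this, but the table it consults is stored in the chunk tree, which is itself addressed logically. Some chunk information therefore has to be reachable without any translation at all.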

 38. bootstrap
  superblock
  The only structure in btrfs whose physical address is fixed.
    Located at 64 KiB, 64 MiB, 256 GiB, and 1 PiB from the start of the partition.
  Extent addresses of the roots
    root tree
    chunk tree
  Chunk information for the system chunk
    Nodes and leaves of the chunk tree are allocated from the system chunk.
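
The four superblock locations follow a simple pattern: apart from the primary copy at 64 KiB, each one is 16 KiB shifted left by 12 bits per copy. The function below is a sketch of that arithmetic; `superblock_offset` is a hypothetical name, and the shift-based formula is my recollection of how the kernel derives the mirror offsets, so treat it as an assumption. The resulting values match the slide.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB
PIB = 1024 * 1024 * GIB


def superblock_offset(copy):
    """Return the byte offset of superblock copy 0, 1, 2, or 3."""
    if copy == 0:
        return 64 * KIB                # primary superblock
    return (16 * KIB) << (12 * copy)   # assumed pattern for the other copies


offsets = [superblock_offset(i) for i in range(4)]
```

Because these offsets are fixed, a reader can find a superblock without any address translation, then use the system chunk information stored in it to locate the chunk tree, and from there translate every other extent address.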

 39. bootstrap
  system chunk

 40. bootstrap

 41. Recent developments
  Finally, I'll talk about an assert that I have been hitting constantly for about the past month.

 42. Recent developments
  dump
  [42604.796633] BTRFS assertion failed: !memcmp_extent_buffer(
  b, &disk_key, offsetof(struct btrfs_leaf, items[0].key),
  sizeof(disk_key)), file: fs/btrfs/ctree.c, line: 2444

 43. Recent developments
  Key search optimization

 44. Recent developments
  snapshot aware defrag

 45. Recent developments

 46. Recent developments
  tree mod log
  Records the values that were present before each B-tree modification.
  Reset on every transaction.
  Kept in memory only; never written to disk.
  It is used to reconstruct the "old B-tree", but is that where the bug is?
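
The idea of logging old values so that an earlier state can be reconstructed can be sketched in a few lines. This is a toy model of the concept described on the slide, not the kernel's tree mod log: `LoggedNode`, its methods, and the dict-based node are all hypothetical.

```python
_MISSING = object()  # marks "slot did not exist before this change"


class LoggedNode:
    """Toy tree node that remembers old slot values in an in-memory log."""

    def __init__(self):
        self.slots = {}
        self.log = []  # (seq, slot, old_value); never persisted
        self.seq = 0

    def set(self, slot, value):
        """Change a slot, recording what it held before; return the new seq."""
        self.seq += 1
        self.log.append((self.seq, slot, self.slots.get(slot, _MISSING)))
        self.slots[slot] = value
        return self.seq

    def view_at(self, seq):
        """Rebuild the node as it looked right after change number seq."""
        old = dict(self.slots)
        for s, slot, old_value in reversed(self.log):
            if s <= seq:
                break
            if old_value is _MISSING:
                old.pop(slot, None)
            else:
                old[slot] = old_value
        return old

    def commit_transaction(self):
        """Drop the log, mirroring the per-transaction reset on the slide."""
        self.log.clear()
```

Rewinding replays the log backwards from the current state, so any mistake in what gets logged, or in the order it is undone, produces an "old" node that never actually existed. That is the kind of inconsistency an assertion on node contents would catch.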

 47. Recent developments
  The cause... who knows?

 48. Recent developments
  For now, I think you're fine as long as you keep autodefrag turned off (default: off).

 49. Recent developments
  The end

 50. Links
  References
  https://btrfs.wiki.kernel.org/
  https://events.linuxfoundation.org/sites/events/files/slides/LinuxCon_2013_NA_Eckermann_Filesystems_btrfs.pdf
  http://people.redhat.com/lczerner/files/btrfs_lczerner.pdf
  https://www.usenix.org/legacy/event/lsf07/tech/rodeh.pdf

 51. Links
  Images
  http://www.flickr.com/photos/artofphotography-ramsner/9592744354
  http://www.flickr.com/photos/surferbill/2506950772/
  http://www.flickr.com/photos/rore/1304728223/
  http://ja.wikipedia.org/wiki/%E3%83%95%E3%82%A1%E3%82%A4%E3%83%AB:Binary_search_tree.svg
  http://www.flickr.com/photos/protohiro/85504626/
  http://www.flickr.com/photos/rubyji/74176893/
  http://www.flickr.com/photos/rore/1304728223/
  http://www.flickr.com/photos/dolfiedream/5060030894/