Slide 13
HFile Format
• Only Sequential Writes, just append(key, value)
• Large Sequential Reads are faster than many small random reads
• Why group records into blocks?
  • Easy to split
  • Easy to read
  • Easy to cache
  • Easy to index (if records are sorted)
• Block Compression (snappy, lz4, gz, …)
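
The points above can be sketched in a few lines of Python. This is a minimal illustration, not HBase's actual writer: `BLOCK_SIZE`, `encode`, `write_blocks`, and `lookup` are hypothetical names, the 4-byte big-endian length prefix is an assumption, and zlib stands in for the codecs listed on the slide. Sorted records are appended to a block, each full block is compressed on its own, and an index of first keys lets a reader decompress only the one block that may contain a key.

```python
import bisect
import struct
import zlib

BLOCK_SIZE = 64  # hypothetical target size in bytes; tiny for illustration


def encode(key: bytes, value: bytes) -> bytes:
    # Length-prefixed record: two 4-byte big-endian ints, then key and value.
    return struct.pack(">ii", len(key), len(value)) + key + value


def write_blocks(records):
    """Append sorted records into compressed blocks; return (data, index)."""
    data = bytearray()
    index = []  # one (first_key, offset, compressed_length) entry per block
    block, first_key = bytearray(), None

    def flush():
        nonlocal block, first_key
        if block:
            packed = zlib.compress(bytes(block))
            index.append((first_key, len(data), len(packed)))
            data.extend(packed)
            block, first_key = bytearray(), None

    for key, value in sorted(records):
        if first_key is None:
            first_key = key
        block += encode(key, value)
        if len(block) >= BLOCK_SIZE:
            flush()
    flush()  # write the last, possibly partial, block
    return bytes(data), index


def lookup(data, index, key):
    """Find the one block that may hold key, decompress it, and scan it."""
    pos = bisect.bisect_right([first for first, _, _ in index], key) - 1
    if pos < 0:
        return None  # key sorts before the first block
    _, offset, length = index[pos]
    block = zlib.decompress(data[offset:offset + length])
    i = 0
    while i < len(block):
        klen, vlen = struct.unpack_from(">ii", block, i)
        i += 8
        k = block[i:i + klen]
        v = block[i + klen:i + klen + vlen]
        i += klen + vlen
        if k == key:
            return v
    return None


records = [(b"row%03d" % i, b"value%d" % i) for i in range(20)]
data, index = write_blocks(records)
print(len(index), lookup(data, index, b"row007"))
```

Because every block is compressed independently, a lookup touches one index entry and one block, which is also what makes a block a natural unit for caching.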
[Figure: HFile layout. Blocks, each made of a Header followed by Record 0 … Record N; then the block index entries Index 0 … Index N; then the Trailer.]
Key/Value (record):
Key Length : int
Value Length : int
Key : byte[]
Value : byte[]
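
The Key/Value record layout can be sketched as a length-prefixed byte string. This is an illustrative encoding rather than HBase's exact on-disk format: the slide only says "int", so the 4-byte big-endian width is an assumption (it matches a Java int), and `encode_record` and `decode_record` are hypothetical helper names.

```python
import struct

# Assumption: each length is a 4-byte big-endian signed int (">i"),
# matching Java's int width; the slide only says "int".
LENGTHS = struct.Struct(">ii")


def encode_record(key: bytes, value: bytes) -> bytes:
    # Key Length, Value Length, Key, Value, in that order.
    return LENGTHS.pack(len(key), len(value)) + key + value


def decode_record(buf: bytes, offset: int = 0):
    """Return (key, value, offset_of_next_record)."""
    klen, vlen = LENGTHS.unpack_from(buf, offset)
    start = offset + LENGTHS.size
    key = buf[start:start + klen]
    value = buf[start + klen:start + klen + vlen]
    return key, value, start + klen + vlen


# Records are simply appended one after another, so a reader walks a
# block by following the returned offset.
buf = encode_record(b"row1", b"hello") + encode_record(b"row2", b"world!")
offset = 0
while offset < len(buf):
    key, value, offset = decode_record(buf, offset)
    print(key, value)
```

Storing both lengths up front lets a reader skip a record without inspecting its key or value bytes.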