Optimizing LevelDB for Performance and Scale (RICON East 2013)

Optimizing leveldb for Performance and Scale

leveldb throughput (chart; axis labels not recoverable from the slide text)

leveldb throughput (charts; one panel labeled "tuned as a server", github.com/basho/leveldb)

key/value lifecycle
•  Write()
•  Skip list
•  Recovery log
•  Immutable memory
•  Level-0 .sst (overlapping)
•  Level-1 .sst (sorted/overlapping)
•  Level-2 .sst (sorted/overlapping)
•  Level-3 .sst (sorted)
•  Level-4 .sst (sorted)
•  Level-5 .sst (sorted)
•  Level-6 .sst (sorted)
•  MANIFEST
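The write path above can be sketched as a toy model. This is a minimal illustration, not basho/leveldb code: the class `TinyStore`, the entry-count threshold `memtable_limit`, and the use of a plain dict in place of the skip list are all assumptions made for the example. It shows only the order of events: log the write, apply it in memory, freeze a full table into immutable memory, then flush that table as one sorted Level-0 run.

```python
class TinyStore:
    """Toy model of the write lifecycle; all names here are hypothetical."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit  # assumed threshold, counted in entries
        self.log = []        # stands in for the recovery log of the active table
        self.memtable = {}   # stands in for the skip list
        self.imm = None      # (frozen table, its log) awaiting flush, or None
        self.level0 = []     # sorted runs whose key ranges may overlap

    def write(self, key, value):
        self.log.append((key, value))  # log first so a crash can be replayed
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.freeze()

    def freeze(self):
        # Only one immutable table is held; a real store would stall writers
        # here until the background flush finishes.
        if self.imm is not None:
            self.flush_imm()
        self.imm = (self.memtable, self.log)
        self.memtable, self.log = {}, []

    def flush_imm(self):
        if self.imm is None:
            return
        table, _log = self.imm
        self.level0.append(sorted(table.items()))  # one sorted Level-0 run
        self.imm = None  # the frozen table's log is no longer needed


store = TinyStore(memtable_limit=2)
for k, v in [("b", 1), ("a", 2), ("c", 3), ("a", 4), ("d", 5)]:
    store.write(k, v)
store.flush_imm()
print(store.level0)  # two runs, both containing key "a"
```

The two Level-0 runs both contain key "a", which is why Level-0 is marked as overlapping: a read has to consult every run, newest first.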

.sst file anatomy (from file position 0)
•  data block
•  data block
•  data block
•  filter table (bloom)
•  metadata index
•  block index
•  trailer
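As an illustration of why a reader starts at the trailer, here is a sketch of encoding and decoding one. It follows the table format documented for upstream LevelDB (where the trailer is called the footer): two block handles stored as varint64 offset and size pairs, padded to 40 bytes, then an 8-byte magic number. That the Basho fork keeps exactly this layout is an assumption, and the helper names are made up for the example.

```python
import struct

MAGIC = 0xDB4775248B80FB57  # table magic number in upstream LevelDB
FOOTER_LEN = 48             # 40 bytes of padded handles + 8-byte magic


def put_varint64(n):
    """Encode a non-negative integer as a little-endian base-128 varint."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def get_varint64(buf, pos):
    """Decode one varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7


def encode_footer(metaindex, index):
    """metaindex and index are (offset, size) pairs locating those blocks."""
    body = b"".join(put_varint64(v) for v in (*metaindex, *index))
    return body.ljust(FOOTER_LEN - 8, b"\x00") + struct.pack("<Q", MAGIC)


def decode_footer(footer):
    """Validate the trailer and return the two (offset, size) handles."""
    if len(footer) != FOOTER_LEN:
        raise ValueError("footer must be 48 bytes")
    (magic,) = struct.unpack("<Q", footer[-8:])
    if magic != MAGIC:
        raise ValueError("bad magic number: not an .sst file")
    vals, pos = [], 0
    for _ in range(4):
        v, pos = get_varint64(footer, pos)
        vals.append(v)
    return (vals[0], vals[1]), (vals[2], vals[3])


footer = encode_footer(metaindex=(1000, 37), index=(1042, 300))
print(decode_footer(footer))
```

Because the trailer has a fixed length, a reader can seek to the end of the file, validate it, and from there find the block index and metadata index without scanning the data blocks.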

stalls
•  imm (immutable memory)
•  level 0 full

compaction
(diagram of keys A through N, some in several versions such as C0 to C3, shown Before Compaction and After Compaction across an Overlap Level, a Sorted Level, and a Sorted Level+1)
•  Write Amplification: the silent performance killer
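A small worked example may help make write amplification concrete. The sketch below is an assumption-laden toy, not the compaction code in basho/leveldb: the keys and versions are invented, sizes are counted in entries rather than bytes, and the merge simply lets the newest version of a key win.

```python
def compact(overlap_files, older_sorted_run):
    """Merge overlapping runs (oldest first) into one sorted run.

    Later runs are newer, so their values overwrite earlier ones.
    """
    merged = dict(older_sorted_run)
    for run in overlap_files:
        merged.update(run)
    return sorted(merged.items())


def write_amplification(user_entries, disk_entries):
    """Entries written to disk per entry the user wrote."""
    return disk_entries / user_entries


# Hypothetical data: key "c" exists in three versions before compaction.
older_sorted_run = [("a", "a0"), ("c", "c0"), ("h", "h0")]
overlap_files = [[("c", "c1"), ("f", "f0")], [("c", "c2"), ("m", "m0")]]

next_level = compact(overlap_files, older_sorted_run)

user_entries = sum(len(run) for run in overlap_files)
# Each user entry hit the recovery log once and an overlap file once, and the
# compaction then rewrote every surviving entry, including untouched ones.
disk_entries = user_entries + user_entries + len(next_level)
amp = write_amplification(user_entries, disk_entries)
print(next_level, amp)
```

Here four user writes cost thirteen entry writes, and that is after a single compaction. Each further level an entry moves through adds another rewrite, which is how the cost stays silent until throughput drops.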

stall sources
•  Single Database
   •  Level 0 full and IMM compactions occur too often
   •  Level 0 full and blocked by any higher level compaction
•  Multiple Databases
   •  IMM / Level 0 full and blocked by any other active compaction
   •  IMM / Level 0 full and waiting on queue

compaction management
•  Global Thread block 1 (of 5)
•  Tiered Lock 0
•  Tiered Lock 1
•  IMM to Level 0 compaction thread
•  Level 0 to Level 1 compaction thread
•  Levels 1+ compaction thread
•  Backpressure: Write Throttle
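The idea of backpressure through a write throttle can be sketched as follows. This is not the formula used in basho/leveldb, which the slides do not give. The function name, the thresholds, and the linear ramp are all hypothetical, chosen only to contrast a gradual slowdown with a hard stall.

```python
def write_delay_ms(level0_files, slowdown_at=8, stop_at=12, max_delay_ms=10.0):
    """Return a per-write delay based on how far compaction has fallen behind.

    All parameters are illustrative assumptions:
      level0_files  number of Level 0 files waiting for compaction
      slowdown_at   backlog at which writes start being delayed
      stop_at       backlog at which writes must wait (a stall)
    Returns 0.0 when healthy, a delay in ms while throttling, None when stalled.
    """
    if level0_files < slowdown_at:
        return 0.0
    if level0_files >= stop_at:
        return None  # caller has to block until a compaction completes
    span = stop_at - slowdown_at
    return max_delay_ms * (level0_files - slowdown_at + 1) / span


for n in (3, 8, 9, 11, 12):
    print(n, write_delay_ms(n))
```

Spreading small delays across many writes gives compaction time to catch up, so individual callers see slightly slower writes instead of an occasional long stall.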

key/value retrieval
•  Get()
•  Skip list
•  Immutable memory
•  Use manifest to find files covering key range by level
•  File in file cache? (no: the open file song)
•  Bloom filter suggests exists
•  Use index to identify block with key range
•  Block in read cache? (no: see open file song, verse 4)
•  Sequentially walk block to find key
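The lookup order above can be sketched in a few lines. This is a simplified model under stated assumptions: `Bloom`, `SstFile`, and `get` are hypothetical names, the bloom filter uses SHA-256 purely for convenience, blocks are plain Python lists, and the file cache and read cache steps are left out.

```python
import bisect
import hashlib


class Bloom:
    """Tiny bloom filter; bit count and hash choice are arbitrary here."""

    def __init__(self, keys, bits=256, hashes=3):
        self.bits, self.hashes, self.field = bits, hashes, 0
        for key in keys:
            for pos in self._positions(key):
                self.field |= 1 << pos

    def _positions(self, key):
        digest = hashlib.sha256(key.encode()).digest()
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.bits
                for i in range(self.hashes)]

    def may_contain(self, key):
        return all(self.field >> pos & 1 for pos in self._positions(key))


class SstFile:
    """Sorted items split into blocks, plus an index of each block's last key."""

    def __init__(self, items, block_size=2):
        items = sorted(items)
        self.blocks = [items[i:i + block_size]
                       for i in range(0, len(items), block_size)]
        self.index = [block[-1][0] for block in self.blocks]
        self.smallest, self.largest = items[0][0], items[-1][0]
        self.bloom = Bloom(k for k, _ in items)

    def get(self, key):
        if not (self.smallest <= key <= self.largest):
            return None                      # file does not cover the key
        if not self.bloom.may_contain(key):
            return None                      # filter says definitely absent
        i = bisect.bisect_left(self.index, key)  # block whose range holds key
        for k, v in self.blocks[i]:          # sequential walk of one block
            if k == key:
                return v
        return None


def get(key, memtable, imm, levels):
    """Check memory first, then each level's files in the order given."""
    for table in (memtable, imm):
        if table and key in table:
            return table[key]
    for files in levels:
        for f in files:
            value = f.get(key)
            if value is not None:
                return value
    return None


sst = SstFile([("a", 1), ("c", 3), ("e", 5), ("g", 7)])
print(get("e", {"z": 26}, None, [[sst]]))
```

The bloom filter and the index exist to avoid the last step: walking a block is the only part that touches the data itself, so every earlier check that can rule a file out saves a block read.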

the open file song
•  Open .sst file
•  Read and validate trailer
•  Request block index (Chorus)
•  Request metadata index (Chorus)
•  Request bloom filter (Chorus)
•  Request data block (Chorus)
Chorus:
•  Read block to user space
•  CRC scan block
•  Compression’s checksum block scan
•  Decompress block into malloc memory
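The chorus is repeated for every block, so its cost is worth seeing in code. The sketch below uses stand-ins from the Python standard library: `zlib.crc32` in place of the CRC that leveldb uses, and zlib compression in place of its block compressor. The block layout (body, one type byte, four checksum bytes) and the function names are assumptions made for the example.

```python
import struct
import zlib

COMPRESSED = b"\x01"
RAW = b"\x00"


def write_block(payload, compress=True):
    """Build a block: body, 1-byte type, 4-byte CRC over body and type."""
    body = zlib.compress(payload) if compress else payload
    kind = COMPRESSED if compress else RAW
    crc = zlib.crc32(body + kind)
    return body + kind + struct.pack("<I", crc)


def read_block(raw):
    """One pass of the chorus over a block already read into user space."""
    body, kind = raw[:-5], raw[-5:-4]
    (stored_crc,) = struct.unpack("<I", raw[-4:])
    if zlib.crc32(body + kind) != stored_crc:       # CRC scan of the block
        raise ValueError("block checksum mismatch")
    if kind == COMPRESSED:
        # zlib verifies its own Adler-32 checksum while decompressing, which
        # is a second full scan, then allocates a new buffer for the output.
        return zlib.decompress(body)
    return body


block = write_block(b"key1=value1;" * 10)
print(read_block(block) == b"key1=value1;" * 10)
```

Each block is scanned more than once and copied into fresh memory before a single key can be compared, and opening a file sings this chorus several times. That is the reason the file cache and read cache checks come first on the retrieval path.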

time fillers
•  Q&A
•  Repair
•  Level directories for tiered storage
•  Linux and grace of posix_fadvise
•  Performance counters
•  Independent cache types
•  FusionIO / SSD / SATA / AWS

Basho Technologies
