Things you should know about Database Storage and Retrieval

Slide 1

Slide 1 text

things you should know about… @ordepdev Database Storage and Retrieval

Slide 2

Slide 2 text

Why should you care?

Slide 3

Slide 3 text

db_set() { echo “$1,$2” >> database; }

Slide 4

Slide 4 text

Log-Structured file 1991

Slide 5

Slide 5 text

A log-structured file system writes all modiﬁcations to disk sequentially in a log-like structure, thereby speeding up both file writing and crash recovery. ” “

Slide 6

Slide 6 text

Collect large amounts of new data in a file cache in main memory, then write the data to disk in a single large I/0.” “

Slide 7

Slide 7 text

How do we avoid running out of space?

Slide 8

Slide 8 text

C:1 A:1 B:1 B:2 A:2 A:3 A:4 B:3 Data File Segment Compactation

Slide 9

Slide 9 text

C:1 A:1 B:1 B:2 A:2 A:3 A:4 B:3 C:1 B:3 A:4 Data File Segment Compacted Segment Compactation

Slide 10

Slide 10 text

B:4 C:2 C:3 C:4 C:5 A:5 A:6 A:7 C:1 A:1 B:1 B:2 A:2 A:3 A:4 B:3 Data Segment 1 Data Segment 2 Merging & Compactation

Slide 11

Slide 11 text

B:4 C:2 C:3 C:4 C:5 A:5 A:6 A:7 C:1 A:1 B:1 B:2 A:2 A:3 A:4 B:3 B:4 C:5 A:7 + Data Segment 1 Data Segment 2 Compacted & Merged Segment Merging & Compactation

Slide 12

Slide 12 text

Why using an Append-only log? Sequential write operations are much more faster than random writes. Concurrency and crash recovery are much simpler. Merging old segments avoids fragmentation.

Slide 13

Slide 13 text

How do we find the value of a given key?

Slide 14

Slide 14 text

Index Additional structure that is derived from data. It keeps some additional metadata on the side that helps to locate the data. Maintaining such structures incurs overhead, especially on write!

Slide 15

Slide 15 text

The simplest possible indexing strategy is to keep an in-memory hash map where each key is mapped to a byte offset.” “

Slide 16

Slide 16 text

Hash Indexes 1 0 0 , { “ n a : “ P m e “ o r t o “ key byte offset 100 0 101 20 Log-structured ﬁle on disk In-memory hash map 0 1 , “ n a m e L i s b o a “ } } \n 1 “ : “ \n

Slide 17

Slide 17 text

The hash map is updated when a new key-value pair is appended to the file in order to reflect the current data offset.” “

Slide 18

Slide 18 text

A:2 B:2 C:2 A:1 B:1 C:1 D:1 E:1 F:1 G:1 H:1 Data Segment 1 Data Segment 2 2010 Hash Indexes

Slide 19

Slide 19 text

When a write occurs, the keydir is atomically updated with the location of the newest data. The old data is still present on disk, but any new reads will use the latest version available in the keydir.” “

Slide 20

Slide 20 text

Hash Indexes LIMITATIONS Not suitable for a very large number of keys, since the entire hash map must fit in memory! Scanning over a range of keys it’s not efficient — it would be necessary to look up each key individually in the hash maps.

Slide 21

Slide 21 text

SSTables A:2 B:2 C:2 A:1 B:1 C:1 D:1 E:1 F:1 G:1 H:1 Data Segment 1 Data Segment 2 2006

Slide 22

Slide 22 text

An SSTable provides a persistent, ordered immutable map from keys to values, where both keys and values are arbitrary byte strings.” “

Slide 23

Slide 23 text

A lookup can be performed by first finding the appropriate block with a binary search in the in-memory index, and then reading the appropriate block from disk.” “

Slide 24

Slide 24 text

sparse in-memory index A:1 B:1 C:1 D:2 E:1 F:1 G:9 H:1 B:4 C:5 A:7 Compacted Data Segment I:2 J:4 K:2 L:7 M:1 N:7 O:1 P:3 key byte offset A 100491 I 101201 M 103041 X 104204 Sorted segment ﬁle on disk ………… ………… In-memory index

Slide 25

Slide 25 text

Merging & Compactation A:2 B:2 C:2 A:1 B:1 C:1 D:1 E:1 F:1 G:1 H:1 Data Segment 1 Data Segment 2

Slide 26

Slide 26 text

Merging & Compactation A:2 B:2 C:2 A:1 B:1 C:1 D:1 E:1 F:1 G:1 H:1 B:4 C:5 A:7 + Data Segment 1 Data Segment 2 Compacted & Merged Segment A:2 B:2 C:2 D:1 E:1 F:1 G:1 H1

Slide 27

Slide 27 text

Storage engines that are based on this principle of merging and compacting sorted files are often called LSM storage engines.

Slide 28

Slide 28 text

LSM-Tree A:2 B:2 C:2 A:1 B:1 C:1 D:1 E:1 F:1 G:1 H:1 Data Segment 1 Data Segment 2 2006

Slide 29

Slide 29 text

The LSM-tree uses an algorithm that defers and batches index changes, cascading the changes from a memory-based component through one or more disk components in an efficient manner reminiscent of merge sort.” “

Slide 30

Slide 30 text

What about performance?

Slide 31

Slide 31 text

Bloom filters Memory-efficient data structure for approximating the contents of a set. “ ”

Slide 32

Slide 32 text

Bloom filters It can tell if a key does not exist in the database, saving many unnecessary disk reads for nonexistent keys.

Slide 33

Slide 33 text

Advantages over Hash indexes All values in one input segment are more recent than all values in the other segment. When multiple segments contain the same key, the value from the most recent segment is kept and older segments are discarded. In order to find a particular key in the file, there’s no longer need to keep the full index in memory!

Slide 34

Slide 34 text

B-TREES A:2 B:2 C:2 A:1 B:1 C:1 D:1 E:1 F:1 G:1 H:1 Data Segment 1 Data Segment 2 1970

Slide 35

Slide 35 text

B-TREES A:2 B:2 C:2 A:1 B:1 C:1 D:1 E:1 F:1 G:1 H:1 Data Segment 1 Data Segment 2 1979

Slide 36

Slide 36 text

The index is organized in pages of ﬁxed size capable of holding up to 2k keys, but pages need only be partially ﬁlled. ” “

Slide 37