Upgrade to Pro — share decks privately, control downloads, hide ads and more …

How shit works: Databases

How shit works: Databases

The last decade has seen a proliferation in data storage technologies, from MongoDB to ElasticSearch to Riak to Datomic and so many others: relational databases, document stores, key-value stores… oh my! So many choices, so much confusion. While fundamentally different products, all of these solve the same age-old problem: that of storing and managing data on a persistent medium. Yet despite using them every day, we are often confounded by how they operate, treating them as magical devices that require special care. No more! They say a little knowledge is a dangerous thing... care to find out?

Tomer Gabel

June 24, 2023
Tweet

More Decks by Tomer Gabel

Other Decks in Programming

Transcript

  1. Every journey begins with a grumpy CEO. Go make me

    some money! Photo: Grumpy Old Men by Michael Summers (CC)
  2. The pipeline churns… Product Manager Chief Architect Management Photos: Andrea

    Appiani, Oren Jack Turner, Philippe de Champaigne (Public Domain)
  3. ... and finally! Specification Management Product Engineering 1. You have

    a product to build 2. It is divided into sane chunks 3. Your team owns a chunk 4. You’re all set to implement it 5. Now you need a database
  4. It really does 1. But that’s not helpful 2. Too

    many paradigms 3. Too many options 4. Too many nuances 5. A little knowledge goes a long way. Photo: He Who Thinks by Damien Galban (CC)
  5. End-User License Agreement 1. I’m not an expert 2. We’ll

    barely scratch the surface 3. We’ll simplify things liberally Image: Public Domain Bullshit ahead!
  6. Beginning at the Beginning “In computing, a database is an

    organized collection of data stored and accessed electronically. Access to this data is usually provided by a "database management system" (DBMS).” -- Wikipedia
  7. What problems does a database solve? 1. Data modelling representation,

    schema 2. Data manipulation insert, update, delete 3. Data retrieval querying 4. Administration security, backup, ACL…
  8. There’s no avoiding it 1. Databases deal with data 2.

    Data must be stored 3. Stored where? 4. Stored how? Photos: D-Kuru, Evan-Amos, Ryse93 (Creative Commons)
  9. Common data representations 1. Relational 2. Document 3. Graph 4.

    Geospatial 5. Time-series Glorified K/V Stores
  10. 3 6 1 5 8 11 15 Remember Binary Search

    Trees? 1. A fundamentally simple idea 2. Keep your data sorted and lookup becomes cheap 3. Ill-suited to disk storage - Limited branching factor
  11. Remember Binary Search Trees? 1. A fundamentally simple idea 2.

    Keep your data sorted and lookup becomes cheap 3. Ill-suited to disk storage - Limited branching factor - Pathological behavior 5 6 3 1 11 8 15
  12. Balanced Trees 3 6 1 5 11 8 15 1.

    Optimized data structures that: - Limit the tree height - Reduce write cost - Avoid waste and fragmentation 2. Examples include: - AVL trees - Red/Black trees - Splay trees Height=2
  13. Hello B-Tree My Old Friend Order = 5 82 27

    54 5 11 13 20 91 99 30 31 37 42 67 72 78 80
  14. 82 27 54 5 11 13 20 26 91 99

    30 31 37 42 67 72 78 80 B-Tree Insertion Too many keys - needs split!
  15. 82 13 27 54 5 11 91 99 20 26

    30 31 37 42 B-Tree Insertion Median is ”pushed up”
  16. B-Trees: Summary 1. A self-balancing tree 2. O(logN) select/insert/delete 3.

    Designed for on-disk storage 4. Many variants 5. Optimized for reads
  17. Pretend it’s a product 1. Let’s build a social network!

    2. Specifically, the storage subsystem for likes/dislikes 3. Extremely frequent operation 4. Are B-Trees suitable?
  18. Not a Good Fit 1. Writes are costly! - Read-before-write

    - Write amplification - Wasted space 2. Data used in aggregate - Infrequent lookups - Optimizing for the wrong thing! Photo: Overloading of vehicles by Ore O.j (CC)
  19. LSM Trees 1. Writes go in a memory buffer 2.

    Keys are timestamped Key Value Timestamp C 1 15 B 7 21 C 2 28 D 5 32 MemTable
  20. LSM Trees 1. Writes go in a memory buffer 2.

    Keys are timestamped 3. At some threshold, buffer is: - Sorted by key - Flushed to disk (SSTable) 4. Append-only, no lookup! Key Value Timestamp C 1 15 B 7 21 C 2 28 D 5 32 MemTable Key Value Timestamp B 7 21 C 2 28 D 5 32 SSTable 1 (on disk)
  21. LSM Trees: Reads Key Value Timestamp C 70 1012 P

    15 990 MemTable Key Value Timestamp B 7 21 C 2 28 D 5 32 SSTable 1 Key Value Timestamp B 5 282 E 14 211 O 2 303 SSTable 2 Key Value Timestamp A 10 805 B 3 860 D 12 901 SSTable 3 What is the value of B?
  22. LSM Trees: Reads Key Value Timestamp C 70 1012 P

    15 990 MemTable Key Value Timestamp B 7 21 C 2 28 D 5 32 SSTable 1 Key Value Timestamp B 5 282 E 14 211 O 2 303 SSTable 2 What is the value of B? Key Value Timestamp A 10 805 B 3 860 D 12 901 SSTable 3
  23. LSM Trees: Reads Key Value Timestamp C 70 1012 P

    15 990 MemTable Key Value Timestamp B 7 21 C 2 28 D 5 32 SSTable 1 Key Value Timestamp B 5 282 E 14 211 O 2 303 SSTable 2 What is the value of B? = 3 Key Value Timestamp A 10 805 B 3 860 D 12 901 SSTable 3
  24. Naïve LSM Trees 1. You probably noticed issues 2. Old

    versions are retained - useful but wasteful 3. All SSTables must be queried - I/O proportional to age - Bloom filters usually used to skip entire lookups Key Value Timestamp B 7 21 C 2 28 D 5 32 B, C, D SSTable Bloom filter
  25. LSM Trees: Compaction Key Value Timestamp B 7 21 C

    2 28 D 5 32 Key Value Timestamp B 5 282 E 14 211 O 2 303 Key Value Timestamp A 10 805 B 3 860 D 12 901 Key Value Timestamp C 70 1012 P 15 990 Key Value Timestamp B 3 860 C 70 1012 D 12 901 E 14 211 O 2 303 P 15 990 Merged SSTable
  26. LSM Trees: Compaction 1. Compaction can happen concurrently 2. Controlling

    the number of SSTables is critical 3. Typically, merges happen on expontentially larger levels SSTable SSTable SSTable SSTable SSTable SSTable SSTable Level 0 Level 1 Level 2
  27. LSM Trees: Summary 1. Optimized for writes - Append-only, no

    lookups - No write amplification 2. Not ideal for read workloads - Can be optimized though
  28. Key Takeaways 1. Early on, the choice matters very little.

    Any off-the-shelf DMBS will probably work. 2. Base your choice on your opertional needs, data access patterns and finally scale. In that order. 3. The more you understand, the better your leverage.