Slide 1

Slide 1 text

Building Distributed RocksDB
Ashwanth Kumar (@_ashwanthkumar)

Slide 2

Slide 2 text

overview
- What’s RocksDB
- Why Distributed RocksDB
- Lessons from running it in production

Slide 3

Slide 3 text

rocksdb

Slide 4

Slide 4 text

rocksdb
- From Facebook
- Fast persistent KV store
- Server workloads
- Embeddable
- Optimized for SSDs
- Fork of LevelDB
- Modelled after BigTable
- LSM tree based
- SST files
- Written in C++
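To make "LSM tree based" and "SST files" concrete, here is a toy sketch of the idea in Python. It is not RocksDB code: the `ToyLSM` class, its `memtable_limit` parameter, and the in-memory lists standing in for SST files are all assumptions made for illustration. Writes go to a small in-memory table, which is frozen into a sorted, immutable run when it fills up; reads check the newest data first.

```python
import bisect


class ToyLSM:
    """Toy LSM tree: an in-memory memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}            # most recent writes
        self.sstables = []            # sorted (key, value) lists, oldest first
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into a sorted run (a stand-in for an SST file).
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newer runs shadow older ones, so search newest first.
        for run in reversed(self.sstables):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, keeping only the newest value per key.
        merged = {}
        for run in self.sstables:     # oldest first, so later runs overwrite
            merged.update(run)
        self.sstables = [sorted(merged.items())] if merged else []


db = ToyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # memtable is full, flushed as the first run
db.put("a", 3)
db.put("c", 4)   # flushed as the second run; "a" now exists in both runs
```

In this sketch a read for `"a"` finds the newer value in the second run, and `compact()` folds the two runs into one, which is roughly the trade-off compaction tuning is about: fewer runs to search on reads, at the cost of rewriting data.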

Slide 5

Slide 5 text

rocksdb
- Simple C++ API
- Has bindings in C, Java, Go, Python

Slide 6

Slide 6 text

but why build a distributed database?

Slide 7

Slide 7 text

data system paradigms

Slide 8

Slide 8 text

sum of numbers

Slide 9

Slide 9 text

sum of numbers
select col from table;

Slide 10

Slide 10 text

sum of numbers
select col from table;

Slide 11

Slide 11 text

sum of numbers
select col from table;

Slide 12

Slide 12 text

sum of numbers (attempt 2)

Slide 13

Slide 13 text

sum of numbers
select sum(col) from table;

Slide 14

Slide 14 text

sum of numbers
select sum(col) from table;

Slide 15

Slide 15 text

sum of numbers
select sum(col) from table;

Slide 16

Slide 16 text

data shipping: select col from table;
function shipping: select sum(col) from table;
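The difference between the two paradigms can be sketched in a few lines of Python. The `Node` class and its `scan` / `apply` methods are hypothetical names invented for this illustration, not part of any library mentioned in the talk. Each node holds one partition of the column; the sketch counts how many values cross the (simulated) network in each approach.

```python
class Node:
    """A storage node holding one partition of the column (hypothetical)."""

    def __init__(self, values):
        self.values = list(values)

    def scan(self):
        # Data shipping: return every row to the caller.
        return list(self.values)

    def apply(self, fn):
        # Function shipping: run fn next to the data, return only its result.
        return fn(self.values)


def sum_by_data_shipping(nodes):
    """Pull all rows to the client, then add them up there."""
    shipped = 0
    total = 0
    for node in nodes:
        rows = node.scan()
        shipped += len(rows)
        total += sum(rows)
    return total, shipped


def sum_by_function_shipping(nodes):
    """Send sum() to each node and combine the partial results."""
    partials = [node.apply(sum) for node in nodes]
    return sum(partials), len(partials)


nodes = [Node(range(0, 1000)), Node(range(1000, 2000)), Node(range(2000, 3000))]
data_total, data_shipped = sum_by_data_shipping(nodes)
fn_total, fn_shipped = sum_by_function_shipping(nodes)
```

Both approaches compute the same total, but data shipping moves 3000 values to the client while function shipping moves only 3 partial sums, one per node.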

Slide 17

Slide 17 text

but why ship functions?

Slide 18

Slide 18 text

data locality for low latency / data intensive applications

Slide 19

Slide 19 text

recursive reduction
aggregations in distributed systems

Slide 20

Slide 20 text

recursive reduction
select sum(col) from table;

Slide 21

Slide 21 text

recursive reduction
select sum(col) from table;

Slide 22

Slide 22 text

recursive reduction
select sum(col) from table;

Slide 23

Slide 23 text

recursive reduction
select sum(col) from table;

Slide 24

Slide 24 text

recursive reduction
- sum / multiplication
- (sorted) top-K elements
- operations on a graph, e.g. link reach on the Twitter graph
- function should be associative and optionally commutative
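The associativity requirement can be seen in a small Python sketch of a pairwise (tree-shaped) reduction. The helper names `tree_reduce` and `merge_top_k` are made up for this example. Because neighbours are combined in a different grouping than a left-to-right fold, the result only matches when the function is associative.

```python
import heapq
import operator
from functools import reduce


def tree_reduce(fn, values):
    """Reduce pairwise, level by level; correct only if fn is associative."""
    values = list(values)
    if not values:
        raise ValueError("tree_reduce needs at least one value")
    while len(values) > 1:
        paired = [fn(values[i], values[i + 1])
                  for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            paired.append(values[-1])  # odd one out moves up unchanged
        values = paired
    return values[0]


def merge_top_k(k):
    """Build a merge function that keeps the k largest items of two lists."""
    def merge(a, b):
        return heapq.nlargest(k, a + b)
    return merge


# Sum is associative: the tree and the left fold agree.
tree_sum = tree_reduce(operator.add, [1, 2, 3, 4, 5])
fold_sum = reduce(operator.add, [1, 2, 3, 4, 5])

# Top-K merges are associative too: each list is one node's partial result.
partials = [[9, 4, 1], [8, 7], [10, 2], [6, 5, 3]]
top3 = tree_reduce(merge_top_k(3), partials)

# Subtraction is not associative: the two groupings disagree.
tree_sub = tree_reduce(operator.sub, [1, 2, 3, 4])   # (1 - 2) - (3 - 4)
fold_sub = reduce(operator.sub, [1, 2, 3, 4])        # ((1 - 2) - 3) - 4
```

This sketch keeps the inputs in order, so associativity alone is enough. Commutativity becomes useful when partial results may arrive from nodes in any order.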

Slide 25

Slide 25 text

rocksdb @indix

Slide 26

Slide 26 text

rocksdb @indix
- Serving our API in production for 2+ years
- Search on hierarchical documents
  - Dynamic fields didn’t scale well on Solr
  - Brand / Store / Category counts for a filter
- Price History Service
  - Stores more than a billion prices and serves them online to REST queries

Slide 27

Slide 27 text

rocksdb @indix
- Stats (as Monoids) Storage System
  - All we wanted was approximate aggregates in real time
- HTML Archive System
  - Stores ~120TB of HTML pages indexed by URL and timestamp
- Real-time scheduler for our crawlers
  - Finds out which 20 URLs to crawl now out of 3+ billion URLs
  - Helps the crawlers crawl 20+ million URLs every day
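"Stats as monoids" means each statistic has an empty value and an associative combine operation, so partial results from different nodes or time windows can be merged in any grouping. The following is a minimal Python sketch of that idea; the `Stats` class and its fields are assumptions for illustration and use exact counts rather than the approximate aggregates the slide mentions.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Stats:
    """Hypothetical stats monoid: the default instance is the identity."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    @staticmethod
    def of(x):
        # Lift a single observation into the monoid.
        return Stats(1, x, x, x)

    def combine(self, other):
        # Associative merge of two partial results.
        return Stats(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def mean(self):
        return self.total / self.count if self.count else None


EMPTY = Stats()


def summarize(values):
    """Fold observations into one Stats value, starting from the identity."""
    return reduce(Stats.combine, map(Stats.of, values), EMPTY)


node_a = summarize([3, 1])       # partial result held on one node
node_b = summarize([10])         # partial result held on another
merged = node_a.combine(node_b)  # same as summarizing all values together
```

Because `combine` is associative and `EMPTY` is its identity, stored partial results can be merged on read or on write without revisiting the raw observations.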

Slide 28

Slide 28 text

suuchi
toolkit for building distributed function shipping applications
github.com/ashwanthkumar/suuchi

Slide 29

Slide 29 text

ops lessons running rocksdb in production

Slide 30

Slide 30 text

leveled compaction
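As a rough illustration of how leveled compaction sizes its levels, the Python sketch below computes per-level target sizes from a base size and a multiplier. RocksDB exposes these as the `max_bytes_for_level_base` and `max_bytes_for_level_multiplier` options; the values used here (256 MB and 10) are assumed for the example, and the function names are made up. The space ratio is a back-of-the-envelope estimate that assumes every level is exactly full, not a measurement.

```python
MB = 1024 * 1024


def level_targets(base_bytes, multiplier, num_levels):
    """Target size in bytes for levels L1..L(num_levels)."""
    return [base_bytes * multiplier ** i for i in range(num_levels)]


def full_levels_space_ratio(targets):
    """Total bytes across all levels divided by bytes in the last level."""
    return sum(targets) / targets[-1]


# Assumed example values: 256 MB base, multiplier of 10, four levels.
targets = level_targets(256 * MB, 10, 4)
ratio = full_levels_space_ratio(targets)

for level, size in enumerate(targets, start=1):
    print(f"L{level}: {size // MB} MB")
print(f"total / last level: {ratio:.3f}")
```

With a multiplier of 10, most of the data sits in the last level, which is why this estimate stays close to 1. A smaller multiplier means more levels hold a meaningful share of the data.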

Slide 31

Slide 31 text

compaction + other tuning
Tuning RocksDB for
- Read amplification
- Write amplification
- Space amplification
https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide

Slide 32

Slide 32 text

backup & restore
- Incremental backups
- Store backups in S3
- Sometimes high CPU during backups
- Restore happens outside the app lifecycle
https://github.com/indix/rocks

Slide 33

Slide 33 text

questions?
https://github.com/ashwanthkumar/distributed-rocksdb-talk