Lessons from Building Distributed RocksDB

Presented as part of Geeknight Chennai - November 2016

Video - https://www.youtube.com/watch?v=PSCa9_Avne0
References - https://github.com/ashwanthkumar/distributed-rocksdb-talk

Ashwanth Kumar

November 24, 2016

Transcript

  1. RocksDB
     - From Facebook
     - Fast persistent KV store
     - Server workloads
     - Embeddable
     - Optimized for SSDs
     - Fork of LevelDB
     - Modelled after BigTable
     - LSM Tree based
     - SST files
     - Written in C++
  2. Recursive reduction
     - sum / multiplication
     - (sorted) top-K elements
     - operations on a graph
       - e.g. link reach on the Twitter graph
     - function should be associative and optionally commutative
  3. RocksDB @ Indix
     - Serving our API in production for 2+ years
     - Search on hierarchical documents
       - Dynamic fields didn’t scale well on Solr
       - Brand / Store / Category counts for a filter
     - Price History Service
       - More than a billion prices, served online to REST queries
  4. RocksDB @ Indix
     - Stats (as Monoids) Storage System
       - All we wanted was approximate aggregates in real time
     - HTML Archive System
       - Stores ~120TB of HTML pages indexed by URL and timestamp
     - Real-time scheduler for our crawlers
       - Finds which 20 URLs to crawl now out of 3+ billion URLs
       - Helps the crawler crawl 20+ million URLs every day
  5. Backup & restore
     - Incremental backups
     - Store backups in S3
     - Sometimes high CPU during backups
     - Restore happens outside the app lifecycle
     - https://github.com/indix/rocks
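
The recursive reduction on slide 2 asks for a combining function that is associative. A minimal sketch of why that matters, using the slide's own top-K example: when the function is associative, partial results can be merged in any grouping (left to right, or as a tree across shards) and the answer is the same. The names `merge_top_k` and `tree_reduce`, the value of `K`, and the sample data are hypothetical and chosen only for illustration; this is not taken from the talk's code.

```python
from functools import reduce
from heapq import nlargest

K = 3  # hypothetical K, chosen for illustration only


def merge_top_k(left, right, k=K):
    """Combine two partial top-k lists into one list of the k largest values."""
    return nlargest(k, left + right)


def tree_reduce(fn, items, identity):
    """Reduce by recursively halving the input.

    Regrouping the calls like this is only safe because fn is associative.
    """
    if not items:
        return identity
    if len(items) == 1:
        return items[0]
    mid = len(items) // 2
    return fn(tree_reduce(fn, items[:mid], identity),
              tree_reduce(fn, items[mid:], identity))


# Partial top-K results, e.g. one list per shard or per batch (made-up values).
partials = [[9, 4, 1], [7, 7, 2], [10, 3], [5]]

left_to_right = reduce(merge_top_k, partials, [])
tree_shaped = tree_reduce(merge_top_k, partials, [])

print(left_to_right)  # [10, 9, 7]
print(tree_shaped)    # [10, 9, 7]
```

The empty list acts as the identity element, which is what makes top-K a monoid in the sense of the "Stats (as Monoids)" item on slide 4. Sum with identity 0 fits the same shape.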