Suuchi - Distributed System Primitives

Ashwanth Kumar

April 20, 2017

Transcript

  1. rise of KV stores: distributed, replicated, fault-tolerant (optionally sorted)
     - 2006: BigTable from Google
     - 2007: Dynamo from Amazon
     - 2008: Cassandra from Facebook
     - 2009: VoldemortDB from LinkedIn
  2. Distributed stats aggregation system @indix
     - DNS based load balancing: A record stats.service.ix resolves to 1.1.1.1, 1.1.1.2, 1.1.1.3
     - Incoming stats: Count("a", 1L), Unique("a", 1L), Count("c", 1L), Unique("a", 1L), Count("b", 1L)
     - Stats are merged with monoid.plus
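
The monoid.plus merge on this slide can be sketched as follows. This is a minimal illustration, not the Indix implementation: the `Count` and `Unique` classes and the `aggregate` helper are hypothetical, and `Unique` keeps an exact set here where a production system aiming for approximate aggregates would more likely use a sketch data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Count:
    """Occurrences of a key; plus adds the two counts."""
    key: str
    value: int = 0

    def plus(self, other: "Count") -> "Count":
        return Count(self.key, self.value + other.value)


@dataclass(frozen=True)
class Unique:
    """Distinct values seen for a key; plus is set union (exact, by assumption)."""
    key: str
    values: frozenset = frozenset()

    def plus(self, other: "Unique") -> "Unique":
        return Unique(self.key, self.values | other.values)


def aggregate(events):
    """Fold events into one stat per (type, key) using plus.

    Because plus is associative, partial results computed on different
    nodes can be merged again with the same function, in any grouping.
    """
    merged = {}
    for event in events:
        slot = (type(event).__name__, event.key)
        merged[slot] = merged[slot].plus(event) if slot in merged else event
    return merged
```

Since each node only ever calls `plus`, it does not matter which node a DNS lookup sends an event to: the per-node partial aggregates can be combined later without coordination.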
  3. distributed system problems
     - Communication
     - Request Routing
     - Sync / Async Replication
     - Data Sharding
     - Cluster Membership
     - Diagram labels: key="foo", key="bar", key="baz"
  4. primitives - sharding/routing
     + Consistent Hash Ring
     - Your own sharding technique?
     - Reference: "Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web"
     - Ring diagram: node 1, node 2, node 3, node 4
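
A consistent hash ring like the one in the diagram can be sketched as below. This is an illustrative sketch under stated assumptions, not Suuchi's actual code: the class name, the use of MD5, and the `vnodes` parameter (virtual points per node) are all choices made for this example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes=(), vnodes=64):
        self._vnodes = vnodes   # virtual points per node (assumed default)
        self._points = []       # sorted ring positions
        self._owners = {}       # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owners:
                continue  # position already taken; skip the rare collision
            bisect.insort(self._points, point)
            self._owners[point] = node

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def find(self, key: str) -> str:
        """Return the node owning the first ring position clockwise of the key."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]
```

The property that makes this attractive for sharding is that removing a node only reassigns the keys that node owned; every other key keeps its owner.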
  5. primitives - membership
     - static: scaling up/down needs downtime of the system
     - dynamic: fault tolerance in case of node/process failure
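
The difference between the two membership styles can be illustrated with a small sketch. Everything here is hypothetical: the deck does not say how dynamic membership is implemented, so the heartbeat-with-timeout scheme below is only one assumed way to detect node or process failure.

```python
class StaticMembership:
    """Fixed node list read once at start-up; changing it means a restart."""

    def __init__(self, nodes):
        self._nodes = tuple(nodes)

    def members(self, now=None):
        return list(self._nodes)


class HeartbeatMembership:
    """Dynamic membership (assumed scheme): a node counts as alive while its
    most recent heartbeat is no older than timeout_seconds."""

    def __init__(self, timeout_seconds: float):
        self._timeout = timeout_seconds
        self._last_seen = {}  # node name -> time of last heartbeat

    def heartbeat(self, node: str, now: float) -> None:
        # Nodes join simply by sending their first heartbeat.
        self._last_seen[node] = now

    def members(self, now: float):
        # Nodes that stopped sending heartbeats drop out without a restart.
        return sorted(
            node for node, seen in self._last_seen.items()
            if now - seen <= self._timeout
        )
```

The clock is passed in as `now` so the behaviour is easy to reason about; a real deployment would read a monotonic clock and exchange heartbeats over the network.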
  6. primitives - replication
     - sync: every request is successful only if all the replicas succeeded
     - async: provides very high availability for write systems at the cost of eventual consistency
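
The trade-off between the two replication modes can be sketched as follows. The `Replica`, `sync_put`, and `AsyncReplicator` names are invented for this example, and the in-process queue stands in for whatever transport a real system would use.

```python
from collections import deque


class Replica:
    """In-memory stand-in for a storage node."""

    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy
        self.data = {}

    def put(self, key, value) -> None:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unreachable")
        self.data[key] = value


def sync_put(replicas, key, value) -> bool:
    """Sync replication: the write succeeds only if every replica accepted it.

    Note: this sketch does not roll back replicas that already accepted
    the write before a later one failed.
    """
    try:
        for replica in replicas:
            replica.put(key, value)
    except ConnectionError:
        return False
    return True


class AsyncReplicator:
    """Async replication: acknowledge after the local write, ship to followers later."""

    def __init__(self, primary: Replica, followers):
        self.primary = primary
        self.followers = list(followers)
        self.pending = deque()

    def put(self, key, value) -> bool:
        self.primary.put(key, value)
        self.pending.append((key, value))
        return True  # followers may lag behind: eventual consistency

    def drain(self) -> None:
        """Deliver queued writes; a real system would do this in the background."""
        while self.pending:
            key, value = self.pending.popleft()
            for follower in self.followers:
                try:
                    follower.put(key, value)
                except ConnectionError:
                    pass  # a real system would retry; dropped here for brevity
```

With one unreachable replica, `sync_put` rejects the write, while `AsyncReplicator.put` still accepts it and the followers only catch up once `drain` runs.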
  7. primitives - storage
     + KeyValue
     + RocksDB: embedded KV store from FB for server workloads
     - Your Own Abstraction?
  8. getting started
     - define a gRPC service using proto2 (or proto3)
     - generate the stubs in java / scala
     - implement the services
     - connect them together using Suuchi - Server abstraction
  9. suuchi @indix
     - HTML Archive System
       - Handles 1000+ tps
       - Write heavy system
       - Stores ~120TB of url and timestamp indexed HTML pages
     - Stats (as Monoids) Storage System*
       - All we want is approximate aggregates in real time
     - Real-time scheduler for our crawlers*
       - Finds out which of the 20 urls to crawl now out of 3+ billion urls
       - Helps crawler crawl 20+ million urls every day
  10. idea behind suuchi: membership, request routing / sharding
     - 2011: Gizzard from Twitter
     - 2015: RingPop from Uber
     - 2016: Slicer from Google
     - 2016: Suuchi
  11. primitives - request routing
     - Reference: "Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web"
     - Ring diagram: node 1, node 2, node 3, node 4
  14. primitives - request routing
     - Reference: "Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web"
     - Ring diagram: node 2, node 3, node 4
  15. primitives - request routing
     - Peer to Peer system: no single point of contact
     - Each node handles or forwards requests transparently
     - Uses pluggable partitioner scheme
     - Can be customized as weighted distribution / Rendezvous Hash etc.