Suuchi - Distributed System Primitives

Ashwanth Kumar

April 20, 2017

Transcript

  1. rise of KV stores: distributed, replicated, fault-tolerant (optionally sorted)

    - 2006: BigTable from Google
    - 2007: Dynamo from Amazon
    - 2008: Cassandra from Facebook
    - 2009: VoldemortDB from LinkedIn
  2. Distributed stats aggregation system @indix

    - DNS based Load Balancing: the A record stats.service.ix resolves to 1.1.1.1, 1.1.1.2 and 1.1.1.3
    - Events shown: Count("a", 1L), Unique("a", 1L), Count("c", 1L), Unique("a", 1L), Count("b", 1L)
    - Events are combined with monoid.plus
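
The idea of combining stats with `monoid.plus` can be sketched as below. This is a minimal Python illustration, not the Indix implementation: the `Count` and `Unique` classes, the `merge` and `aggregate` helpers, and the reading of `Unique("a", 1L)` as "value 1 was seen for key a" are all assumptions made for the example. The point it tries to show is that partial aggregates built on different nodes can be merged in any grouping and give the same totals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Count:
    """Hypothetical counter; plus adds the two values."""
    key: str
    value: int

    def plus(self, other):
        return Count(self.key, self.value + other.value)


@dataclass(frozen=True)
class Unique:
    """Hypothetical distinct-value tracker; plus is set union."""
    key: str
    values: frozenset

    def plus(self, other):
        return Unique(self.key, self.values | other.values)


def merge(left, right):
    """Combine two partial aggregates slot by slot using plus."""
    merged = dict(left)
    for slot, value in right.items():
        merged[slot] = merged[slot].plus(value) if slot in merged else value
    return merged


def aggregate(events):
    """Fold a stream of events into one aggregate keyed by (type, key)."""
    totals = {}
    for event in events:
        totals = merge(totals, {(type(event).__name__, event.key): event})
    return totals


# The five events from the slide, split across two assumed nodes.
events = [
    Count("a", 1), Unique("a", frozenset({1})), Count("c", 1),
    Unique("a", frozenset({1})), Count("b", 1),
]
node_1 = aggregate(events[:3])
node_2 = aggregate(events[3:])
combined = merge(node_1, node_2)
```

An exact set is used for `Unique` only to keep the sketch short; a system that wants approximate aggregates would more likely plug in a probabilistic structure with the same `plus` shape.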
  3. distributed system problems

    - Communication ("पांग" [pong], "ப")
    - Request Routing (key="foo", key="bar", key="baz")
    - Data Sharding
    - Sync / Async Replication
    - Cluster Membership
  4. primitives - sharding/routing

    + Consistent Hash Ring
    - Your own sharding technique?

    Reference: "Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web"
    Ring diagram: node 1, node 2, node 3, node 4
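
A consistent hash ring of the kind this slide names can be sketched as follows. This is a minimal Python illustration under stated assumptions, not Suuchi's actual code: the `ConsistentHashRing` class, its `add` / `remove` / `find` methods, the use of MD5 for placement, and the 64 virtual points per node are all hypothetical choices made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a ring position (MD5 is used only for spread)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical ring in which each node owns several virtual points."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def find(self, key: str) -> str:
        """Return the owner of the first point clockwise from the key."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = ConsistentHashRing(["node 1", "node 2", "node 3", "node 4"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.find(k) for k in keys}

ring.remove("node 1")
after = {k: ring.find(k) for k in keys}

# Keys whose owner changed once node 1 left the ring.
moved = [k for k in keys if before[k] != after[k]]
```

In this sketch the only keys that change owner are the ones that were on the removed node; every other key keeps its placement, which is the property that makes the ring attractive for sharding.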
  5. primitives - membership

    - static: scaling up/down needs downtime of the system
    - dynamic: fault tolerance in case of node/process failure
  6. primitives - replication

    - sync: every request is successful only if all the replicas succeeded
    - async: provides very high availability for write systems at the cost of eventual consistency
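
The trade-off between the two replication modes can be sketched as below. This is a single-process Python illustration, not Suuchi's replication code: the `Replica` class, `sync_put`, and `AsyncReplicator` (with its `pending` queue and `drain` method) are hypothetical names, and there is no rollback or real network involved.

```python
from collections import deque


class Replica:
    """Hypothetical in-memory replica that can be marked unavailable."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.data = {}

    def put(self, key, value) -> bool:
        if not self.available:
            return False
        self.data[key] = value
        return True


def sync_put(replicas, key, value) -> bool:
    """Report success only if every replica accepted the write."""
    results = [replica.put(key, value) for replica in replicas]
    return all(results)  # note: no rollback in this sketch


class AsyncReplicator:
    """Acknowledge after the primary write; copy to followers later."""

    def __init__(self, primary, followers):
        self.primary = primary
        self.followers = followers
        self.pending = deque()  # (follower, key, value) not yet applied

    def put(self, key, value) -> bool:
        if not self.primary.put(key, value):
            return False
        for follower in self.followers:
            self.pending.append((follower, key, value))
        return True

    def drain(self) -> None:
        """Try each pending write once; keep the ones that still fail."""
        for _ in range(len(self.pending)):
            follower, key, value = self.pending.popleft()
            if not follower.put(key, value):
                self.pending.append((follower, key, value))


a, b, c = Replica("a"), Replica("b"), Replica("c", available=False)

sync_ok = sync_put([a, b, c], "foo", 1)    # replica c is down

replicator = AsyncReplicator(a, [b, c])
async_ok = replicator.put("bar", 2)        # acknowledged by the primary
stale_before = "bar" not in c.data         # c has not seen the write yet

c.available = True
replicator.drain()                         # followers catch up
```

With one replica down, the synchronous write is reported as failed while the asynchronous write is acknowledged right away and the lagging replica only converges after `drain` runs.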
  7. primitives - storage

    + KeyValue
    + RocksDB: embedded KV store from FB for server workloads
    - Your Own Abstraction?
  8. getting started

    - define a gRPC service using proto2 (or proto3)
    - generate the stubs in java / scala
    - implement the services
    - connect them together using Suuchi's Server abstraction
  9. suuchi @indix

    - HTML Archive System
      - Handles 1000+ TPS, write heavy system
      - Stores ~120TB of URL and timestamp indexed HTML pages
    - Stats (as Monoids) Storage System*
      - All we want is approximate aggregates in real time
    - Real-time scheduler for our crawlers*
      - Finds out which of the 20 URLs to crawl now out of 3+ billion URLs
      - Helps crawler crawl 20+ million URLs every day
  10. idea behind suuchi: membership, request routing / sharding

    - 2011: Gizzard from Twitter
    - 2015: RingPop from Uber
    - 2016: Slicer from Google
    - 2016: Suuchi
  11. primitives - request routing

    Reference: "Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web"
    Ring diagram: node 1, node 2, node 3, node 4
  14. primitives - request routing

    Reference: "Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web"
    Ring diagram: node 2, node 3, node 4 (node 1 is no longer on the ring)
  15. primitives - request routing

    - Peer to Peer system, no single point of contact
    - Each node handles or forwards requests transparently
    - Uses pluggable partitioner scheme
    - Can be customized as weighted distribution / Rendezvous Hash etc.