Talk on building distributed systems using a scala library called Suuchi. The slides are anonymized for submitting in https://2017.fullstackfest.com CFP.
- sharding/routing Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web node 2 node 1 node 3 node 4
heavy system - Stores ~120TB of url and timestamp indexed HTML pages - Stats (as Monoids) Storage System* - All we want is approximate aggregates real-time - Real-time scheduler for our crawlers* - Finds out which of the 20 urls to crawl now out of 3+ billion urls - Helps crawler crawl 20+ million urls everyday suuchi @indix