with Strongly Consistent Caching
Antonios Katsarakis, Vasilis Gavrielatos, Nicolai Oswald, Arpit Joshi, Boris Grot, Vijay Nagarajan
University of Edinburgh

[Figure labels: State of the Art; Our Solution; Symmetric Caching. Axes: Hit Rate vs. % Cache size (proportional to dataset)]

Emerging technologies
- Can be exploited to alleviate performance bottlenecks
  Remote Direct Memory Access (RDMA): Low-latency remote memory access
  In-Memory Storage: Avoids slow disk access

Need high performance
- Low latency: Response time is critical to user satisfaction
- High throughput: Must satisfy many concurrent requests

Skewed data accesses
- Real-world workloads exhibit skewed data accesses
- Leads to inter-server load imbalance
[Figure: Skewed data accesses, 128 Servers]

Observations
- Most large scale workloads are Read-Intensive!
- Writes: Performance vs Consistency tradeoff
  Stronger consistency: more network traffic
- Typical consistency protocols serialize via a directory
  Can lead to hot-spots due to skew

Large scale online services
- Massive datasets
- Many concurrent users
- Rely on multiple nodes for storage and performance

Fully Distributed Protocols
- Symmetric Caching does not need a directory
- Distributed write serialization via logical timestamps
  Directly execute hot writes on any node
- Two strong (per-key) consistency flavours
  Sequential Consistency (SC) & Linearizability (Lin)
- Efficient RDMA implementation

Enhance all servers with a cache
- Skew: hottest objects responsible for most accesses
- Small but effective cache
- 50% hit rate by caching just 0.1% of the dataset
- Less B/W: only cache misses require remote access
- Challenge: must keep the caches consistent
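The claim that a very small cache yields a high hit rate under skew can be checked with a short calculation. The sketch below assumes key popularity is Zipfian, with the i-th hottest key accessed with probability proportional to 1 / i^exponent, and that the cache holds exactly the hottest keys. The function name `zipf_hit_rate` and the dataset size of one million keys are illustrative assumptions, not values from the poster; only the skew exponent 0.99 is taken from it.

```python
def zipf_hit_rate(num_keys: int, cache_fraction: float, exponent: float = 0.99) -> float:
    """Fraction of accesses served by a cache holding the hottest keys.

    Assumption: the key of popularity rank i is accessed with probability
    proportional to 1 / i**exponent (a Zipfian distribution).
    """
    if not 0.0 <= cache_fraction <= 1.0:
        raise ValueError("cache_fraction must be between 0 and 1")

    cached_keys = round(num_keys * cache_fraction)
    total_weight = 0.0
    cached_weight = 0.0
    for rank in range(1, num_keys + 1):
        weight = rank ** -exponent
        total_weight += weight
        if rank <= cached_keys:
            # This key is among the hottest ones, so it lives in the cache.
            cached_weight += weight
    return cached_weight / total_weight


if __name__ == "__main__":
    # Hypothetical dataset size; the poster does not state one here.
    NUM_KEYS = 1_000_000
    for fraction in (0.0001, 0.001, 0.01):
        rate = zipf_hit_rate(NUM_KEYS, fraction)
        print(f"cache = {fraction:.2%} of dataset -> hit rate = {rate:.1%}")
```

Under these assumptions the hit rate for a 0.1% cache lands in the neighbourhood of the poster's 50% figure; the exact value depends on the dataset size and on how closely the real workload follows the assumed distribution.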
Symmetric Caching
- Symmetric: Store same hottest objects on all nodes
- Exploit skew: small but effective cache
- Throughput scales with number of servers
- Less network b/w: most requests served locally
- Challenge: must keep the caches consistent

- Uniformly distribute the accesses across all servers
- Servers use RDMA to access data within the cluster
- No locality: Most requests require inter-server communication
- Increased latency
- Bottlenecked by network b/w!

9 servers, 56 Gbit NICs, skew exponent = 0.99 (YCSB)

[Figure: NUMA Abstraction; labels: Overloaded, Local access, Remote access]
[Figure labels: >3×, 2.2×, 1.6×]

Contrary to conventional wisdom: High-Performance & Strong Consistency with aggressive replication
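The idea of serializing writes without a directory, using logical timestamps, can be illustrated with a small sketch. It is not the poster's protocol: it assumes a Lamport-style timestamp of the form (version, node_id) compared lexicographically, and it omits the networking and any additional messaging that the SC and Lin flavours require. The names `Node`, `CacheEntry`, `local_write` and `apply_update` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Assumed timestamp layout: (version, node_id), compared lexicographically,
# so ties on version are broken deterministically by node id.
Timestamp = Tuple[int, int]


@dataclass
class CacheEntry:
    value: str
    ts: Timestamp = (0, 0)


@dataclass
class Node:
    node_id: int
    cache: Dict[str, CacheEntry] = field(default_factory=dict)

    def local_write(self, key: str, value: str) -> Tuple[str, str, Timestamp]:
        """Execute a write locally and return the update to broadcast."""
        current = self.cache.get(key, CacheEntry(value=""))
        ts = (current.ts[0] + 1, self.node_id)
        self.cache[key] = CacheEntry(value, ts)
        return key, value, ts

    def apply_update(self, key: str, value: str, ts: Timestamp) -> None:
        """Apply a remote update only if it is newer than the cached copy."""
        current = self.cache.get(key)
        if current is None or ts > current.ts:
            self.cache[key] = CacheEntry(value, ts)


def run_concurrent_writes() -> Dict[int, str]:
    """Two nodes write the same key concurrently; a third only receives."""
    a, b, c = Node(1), Node(2), Node(3)
    update_a = a.local_write("k", "from-a")
    update_b = b.local_write("k", "from-b")
    # Deliver the updates in different orders at different nodes.
    a.apply_update(*update_b)
    b.apply_update(*update_a)
    c.apply_update(*update_a)
    c.apply_update(*update_b)
    return {node.node_id: node.cache["k"].value for node in (a, b, c)}


if __name__ == "__main__":
    print(run_concurrent_writes())
```

Because every node applies the same ordering rule, all caches settle on the same winner for the key regardless of delivery order, with no directory node in the path of the write.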