
Poster of Scale-Out ccNUMA [EuroSys '18]


Poster for the EuroSys '18 paper "Scale-Out ccNUMA: Exploiting Skew with Strongly Consistent Caching".

Antonios Katsarakis

April 23, 2018



Transcript

  1. Scale-Out ccNUMA: Exploiting Skew with Strongly Consistent Caching
     Antonios Katsarakis, Vasilis Gavrielatos, Nicolai Oswald, Arpit Joshi, Boris Grot, Vijay Nagarajan
     University of Edinburgh

     Motivation

     Large scale online services
     - Massive datasets
     - Many concurrent users
     - Rely on multiple nodes for storage and performance

     Need high performance
     - Low latency: response time is critical to user satisfaction
     - High throughput: must satisfy many concurrent requests

     Emerging technologies
     - Can be exploited to alleviate performance bottlenecks
     - In-Memory Storage: avoids slow disk access
     - Remote Direct Memory Access (RDMA): low-latency remote memory access

     Skewed data accesses
     - Real-world workloads exhibit skewed data accesses
     - Leads to inter-server load imbalance
     [Figure: skewed data accesses across 128 servers; some servers are marked "Overloaded"]

     State of the Art

     NUMA Abstraction
     - Uniformly distribute the accesses across all servers
     - Servers use RDMA to access data within the cluster
     - No locality: most requests require inter-server communication
     - Increased latency
     - Bottlenecked by network b/w!
     [Figure: NUMA abstraction, showing local access and remote access]

     Our Solution

     Symmetric Caching
     - Enhance all servers with a cache
     - Symmetric: store the same hottest objects on all nodes
     - Skew: the hottest objects are responsible for most accesses
     - Exploit skew: small but effective cache; 50% hit rate by caching just 0.1% of the dataset
     - Throughput scales with the number of servers
     - Less network b/w: most requests are served locally; only cache misses require remote access
     - Challenge: must keep the caches consistent
     [Figure: hit rate vs. % cache size (proportional to dataset)]

     Keeping the Caches Consistent

     Observations
     - Most large scale workloads are read-intensive!
     - Writes: performance vs. consistency tradeoff; stronger consistency means more network traffic
     - Typical consistency protocols serialize via a directory, which can lead to hot-spots due to skew

     Fully Distributed Protocols
     - Symmetric Caching does not need a directory
     - Distributed write serialization via logical timestamps: directly execute hot writes on any node
     - Two strong (per-key) consistency flavours: Sequential Consistency (SC) and Linearizability (Lin)
     - Efficient RDMA implementation

     Results

     Setup: 9 servers, 56 Gbit NICs, skew exponent = 0.99 (YCSB)
     [Figure labels: >3×, 2.2×, 1.6×]

     Contrary to conventional wisdom:
     High-Performance & Strong Consistency with aggressive replication
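
     The poster's claim that caching just 0.1% of the dataset gives roughly a 50% hit rate can be sanity-checked with a small calculation. The sketch below is not from the poster: it assumes key popularity follows an ideal Zipfian distribution with the stated skew exponent of 0.99, and it uses a hypothetical dataset of one million keys (the poster does not state the dataset size). The function name `zipf_hit_rate` is likewise made up for illustration.

```python
def zipf_hit_rate(num_keys: int, cached_keys: int, exponent: float = 0.99) -> float:
    """Expected hit rate of a cache that holds the `cached_keys` most popular keys.

    Assumption (not from the poster): the key at popularity rank r is accessed
    with probability proportional to r ** -exponent (ideal Zipfian skew).
    """
    if num_keys < 1 or not 0 <= cached_keys <= num_keys:
        raise ValueError("need num_keys >= 1 and 0 <= cached_keys <= num_keys")

    cached_weight = 0.0  # total popularity of the keys held in the cache
    total_weight = 0.0   # total popularity of all keys
    for rank in range(1, num_keys + 1):
        weight = rank ** -exponent
        total_weight += weight
        if rank <= cached_keys:
            cached_weight += weight
    return cached_weight / total_weight


if __name__ == "__main__":
    num_keys = 1_000_000         # hypothetical dataset size
    cached_keys = num_keys // 1000  # cache 0.1% of the dataset
    rate = zipf_hit_rate(num_keys, cached_keys)
    print(f"hit rate when caching 0.1% of {num_keys} keys: {rate:.1%}")
```

     Under these assumptions the hottest 0.1% of keys receive about half of all accesses. The exact figure shifts with the dataset size, so this should be read as a plausibility check rather than a reproduction of the poster's measurement.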