Let's explore the three main sharding strategies - Lookup, Range, and Hash - and dive into their unique characteristics, benefits, and ideal use cases with simple examples.

The Lookup Strategy distributes data across shards using a routing map that directs each request to a shard based on a specific key, allowing logical categorization and segregation of data.

Routing map:
  … -> Shard B
  Tenant2:Products -> Shard C
  Tenant2:Users -> Shard C
  Tenant2:* -> Shard B
  Tenant3:* -> Shard A
  Tenant4:* -> Shard C

Each key is routed by its tenant and table prefix. When the map has no exact entry for that prefix, the tenant's wildcard entry applies:
  "Tenant2:Users:Th30z:…" -> getShard("Tenant2:Users") = Shard C
  "Tenant3:Users:Foo:…" -> getShard("Tenant3:Users") = Shard A
  "Tenant2:Products:PC:…" -> getShard("Tenant2:Products") = Shard C
  "Tenant2:News:IPO:…" -> getShard("Tenant2:News") = Shard B
  "Tenant1:Products:Car:…" -> getShard("Tenant1:Products") = Shard B
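
The lookup above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `ROUTING_MAP` and `get_shard` are hypothetical names, the "exact entry first, then tenant wildcard" order is inferred from the examples, and the `Tenant1:*` entry is an assumption made only so that the `Tenant1:Products` example resolves to Shard B.

```python
# Routing map from the example. "Tenant1:*" is an assumption: the example
# only shows that Tenant1:Products resolves to Shard B.
ROUTING_MAP = {
    "Tenant1:*": "Shard B",
    "Tenant2:Products": "Shard C",
    "Tenant2:Users": "Shard C",
    "Tenant2:*": "Shard B",
    "Tenant3:*": "Shard A",
    "Tenant4:*": "Shard C",
}


def get_shard(key: str) -> str:
    """Return the shard for a key shaped like 'Tenant:Table:...'."""
    parts = key.split(":")
    if len(parts) < 2:
        raise ValueError(f"expected 'Tenant:Table:...', got {key!r}")
    tenant, table = parts[0], parts[1]
    # Try the most specific entry first, then the tenant-wide wildcard.
    for candidate in (f"{tenant}:{table}", f"{tenant}:*"):
        shard = ROUTING_MAP.get(candidate)
        if shard is not None:
            return shard
    raise KeyError(f"no routing entry for {key!r}")


if __name__ == "__main__":
    for key in (
        "Tenant2:Users:Th30z:...",
        "Tenant3:Users:Foo:...",
        "Tenant2:Products:PC:...",
        "Tenant2:News:IPO:...",
        "Tenant1:Products:Car:...",
    ):
        print(key, "->", get_shard(key))
```

Because the map is plain data, moving a tenant or a single table to another shard is a matter of changing one entry (plus migrating the data), which is what makes this strategy a natural fit for multi-tenant layouts.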

The Range Strategy distributes data across shards in sorted key order, allowing efficient sequential scans and range-based queries.

  Shard 1: aaaa-jjjj
  Shard 2: kkkk-ssss
  Shard 3: tttt-zzzz
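
A range router can be sketched as a binary search over the sorted lower bounds of each shard. The sketch below uses the three ranges from the example. The names `LOWER_BOUNDS`, `get_shard`, and `shards_for_range` are hypothetical, and it assumes each shard owns every key from its lower bound up to the next shard's lower bound, with the last shard open-ended.

```python
from bisect import bisect_right

# Sorted lower bounds of each range, aligned by index with SHARDS.
LOWER_BOUNDS = ["aaaa", "kkkk", "tttt"]
SHARDS = ["Shard 1", "Shard 2", "Shard 3"]


def shard_index(key: str) -> int:
    """Index of the shard whose range contains the key."""
    idx = bisect_right(LOWER_BOUNDS, key) - 1
    if idx < 0:
        raise KeyError(f"{key!r} sorts before the first range")
    return idx


def get_shard(key: str) -> str:
    return SHARDS[shard_index(key)]


def shards_for_range(start: str, end: str) -> list[str]:
    """Shards that must be visited to scan keys in [start, end]."""
    if start > end:
        raise ValueError("start must not sort after end")
    return SHARDS[shard_index(start):shard_index(end) + 1]


if __name__ == "__main__":
    print(get_shard("hello"))                  # falls in aaaa-jjjj
    print(shards_for_range("hello", "mango"))  # spans two adjacent shards
```

Since neighbouring keys live on the same or adjacent shards, a range query only touches the shards its bounds span, rather than all of them.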

The Hash Strategy distributes data across shards based on a hash function, which spreads data and load roughly evenly across the shards.

  Shard = hash(Key) % NumShards

Non-cryptographic hash functions: xxh3, murmur3, sipHash, spookyHash, city64, …

With three shards (Shard 0, Shard 1, Shard 2):
  "Hello" -> hash("Hello") % 3 = Shard 1
  "World" -> hash("World") % 3 = Shard 0
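
The formula above can be tried with nothing but the Python standard library. This sketch uses `zlib.crc32` as a stand-in for the hash function, which is an assumption made for portability: the functions named above (xxh3, murmur3, and so on) need third-party packages, and Python's built-in `hash()` is unsuitable for strings because it is randomized per process. Shard numbers produced here will therefore not necessarily match the example, and `get_shard` is a hypothetical name.

```python
import zlib
from collections import Counter


def get_shard(key: str, num_shards: int) -> int:
    """Shard = hash(Key) % NumShards, with CRC32 standing in for the hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_shards


if __name__ == "__main__":
    for key in ("Hello", "World", "Test"):
        print(key, "-> Shard", get_shard(key, 3))

    # Rough look at how evenly a batch of synthetic keys spreads out.
    counts = Counter(get_shard(f"key-{i}", 3) for i in range(3000))
    print(sorted(counts.items()))
```

One design consequence worth noting: because the shard depends on `num_shards`, changing the number of shards changes the result of the modulo for most keys, so resizing generally means moving a large share of the data.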