Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Sharding is hard.

Sharding is hard.

Rafał Malinowski

March 15, 2014
Tweet

Other Decks in Programming

Transcript

  1. denormalize users_links user_id link 1 phpers.pl 1 meetphp.pl 1 php3city.pl

    … … select count(*) from users_links where user_id=1; users user_id link 1 [email protected]
  2. users_links user_id link 1 phpers.pl 1 meetphp.pl 1 php3city.pl …

    … select links_count from users where user_id=1; users user_id link links_count 1 [email protected] 100. 000 denormalize
  3. Sharding is a type of database partitioning that separates very

    large databases into smaller parts called data shards The Internet
  4. db0 db1 4 5 6 7 0 1 2 3

    8 9 db0 db2 db1
  5. db0 db1 4 5 6 7 0 1 2 3

    8 9 db0 db2 db1 db2 db3
  6. shards moved rows 2 → 3 667 402 2 →

    4 500 000 2 → 5 800 468 3 → 4 750 357 3 → 5 800 060 4 → 5 800 247
  7. shards_counters shard_id count 1 0 2 0 SELECT `shard_id` FROM

    `shards_counters` ORDER BY `count` LIMIT 1;
  8. shards_counters shard_id count 1 1 2 0 shard_1.urls url_id goto

    date 1 meetphp.pl 2014-01-01 sharding_map url_id shard_id 1 1
  9. shards_counters shard_id count 1 1 2 0 SELECT `shard_id` FROM

    `sharding_map` WHERE `url_id` = 1; sharding_map url_id shard_id 1 1
  10. shards_counters shard_id count 1 1 2 0 SELECT `shard_id` FROM

    `shards_counters` ORDER BY `count` LIMIT 1;
  11. shards_counters shard_id count 1 1 2 1 shard_2.urls url_id goto

    date 2 phpers.pl 2014-01-01 sharding_map url_id shard_id 1 1 2 2
  12. shards_counters shard_id count 1 1 2 1 SELECT `shard_id` FROM

    `sharding_map` WHERE `url_id` = 2; sharding_map url_id shard_id 1 1 2 2
  13. SELECT * FROM `shard_2.urls ` WHERE `url_id` = 2; shard_2.urls

    url_id goto date 2 phpers.pl 2014-01-01
  14. $bucketsCount = 1024; ! for ($i = 0; $i <

    $bucketsCount; $i++) { buckets[i] = max(int) / $bucketsCount * i }
  15. 1 6