"Just" shard it

"Just" shard it

Given April 15, 2015, with Keyur Govande at Percona Live in Santa Clara, CA

A talk on Etsy's second take on its database sharding architecture.

Etsy started with a monolithic Postgres database and moved to a horizontally sharded MySQL store with a lookup table and master-master replication. Scaling your data store is never as easy as just adding more hardware, though, even with sharding.

In this talk, we'll go over the pains and perils of sharding databases and how we solved some of those problems in Etsy's second take on sharding, which we call logical sharding. With logical sharding, we put several databases on one mysqld instance, as opposed to one database per server. When a box hits a resource limit (CPU, memory, connections, etc.) that would previously have required migrating individual rows, we can instead do a file-level migration of a database: copy the files, replicate, then drop the old databases. This is very fast, and dropping a whole database is a much cleaner way to delete stale shard data. We'll also go over how logical sharding helps with daily operational work like schema changes and backups, and how it makes life easier for our downstream analytics consumers.
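The logical-sharding layout described above can be sketched as a small toy model: many logical shard databases live on each physical MySQL pair, so rebalancing reassigns whole databases rather than individual rows. All names here (db_pair_1, shard_1, …) are illustrative, not Etsy's actual topology.

```python
# Toy model of logical sharding: each physical db pair hosts
# several logical shard databases. Names are hypothetical.
db_pairs = {
    "db_pair_1": ["shard_1", "shard_2", "shard_3", "shard_4"],
    "db_pair_2": ["shard_5", "shard_6", "shard_7", "shard_8"],
}

def pair_for_shard(shard: str) -> str:
    """Find which physical pair currently serves a logical shard."""
    for pair, shards in db_pairs.items():
        if shard in shards:
            return pair
    raise KeyError(shard)
```

Because the unit of placement is a whole database, splitting a hot pair only means repointing some shard names at a new pair after a file-level copy, instead of migrating rows one at a time.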

Etsy is a fast-growing site with over 40 million members and 26 million items listed. Learn about Etsy's pain points in growing a 120TB+ sharded MySQL datastore and how the architecture and tooling evolved over time.


Maggie Zhou

April 15, 2015

Transcript

  1. “Just” shard it Logical sharding at Etsy Maggie Zhou @zmagg

    Percona Live 2015 Keyur Govande @keyurdg
  4. What’s the infrastructure?

  5. L A M P

  6. L A M P

  7. http://surge.omniti.com/2011/speakers/ross-snyder

  8. 2019 http://surge.omniti.com/2011/speakers/ross-snyder

  9. L A M P

  10. 10TB InnoDB buffer pool 150K+ QPS average ~3.5Gbps (outbound, plain text)

    99.9% queries <1ms
  11. Overview • How did Etsy do sharding the first time?

    • What were the problems with it? • How did we solve these problems?
  12. index shard 1 shard 2 shard 3 shard n tickets

  13. master-master replication DB side a side b

  14. shard 1 shard 2 shard 3 shard n UNIQUE IDS

    tickets …
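Slide 14's unique IDs come from a tickets database, following the Flickr ticket-server pattern linked on the Resources slide: two MySQL servers hand out interleaved auto-increment IDs (auto_increment_increment=2, with offsets 1 and 2), so either can keep issuing IDs alone. A toy simulation of just the counters — the class and names are illustrative, not Etsy's code:

```python
import itertools

class TicketServer:
    """Simulates one side of a two-server ticket pair.

    In MySQL this would be roughly:
      REPLACE INTO tickets (stub) VALUES ('a');
      SELECT LAST_INSERT_ID();
    with auto_increment_increment=2 and auto_increment_offset set
    per server, so the two sides never issue the same ID.
    """
    def __init__(self, offset: int, increment: int = 2):
        self._counter = itertools.count(start=offset, step=increment)

    def next_id(self) -> int:
        return next(self._counter)

side_a = TicketServer(offset=1)  # issues odd IDs
side_b = TicketServer(offset=2)  # issues even IDs
```

The interleaving is what makes the pair fault-tolerant: losing one server halves throughput but never produces a duplicate ID.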
  15. index shard 1 shard 2 shard 3 shard n SHARD

    LOOKUP tickets …
  16. … … shop_2 => shard N shop_1.favorites shop_2.favorites shop_2.sales shop_3.sales

    … index shard N
  17. index shard 1 shard 2 shard 3 shard n tickets

  18. Writing example tickets index shard 1 shard 2 shard 3

    shard n ORM … 1) 2) 3)
  19. Reading example tickets index shard 1 shard 2 shard 3

    shard n ORM … 1) 2)
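The numbered steps on slides 18–19 — write: get a ticket ID, record the shard assignment in the index, write to the shard; read: look up the shard in the index, then read from it — can be sketched with in-memory stand-ins for the tickets DB, the index, and the shards (all names hypothetical):

```python
# In-memory stand-ins for the components on slides 18-19.
shard_lookup = {}                      # index: object id -> shard number
shards = {n: {} for n in range(1, 5)}  # four stand-in shards
_next_ticket = iter(range(1, 10**9))   # stand-in for the tickets DB

def write(record: dict, shard: int) -> int:
    """1) get a unique id, 2) record the shard in the index, 3) write."""
    obj_id = next(_next_ticket)
    shard_lookup[obj_id] = shard
    shards[shard][obj_id] = record
    return obj_id

def read(obj_id: int) -> dict:
    """1) look up the shard in the index, 2) read from that shard."""
    shard = shard_lookup[obj_id]
    return shards[shard][obj_id]
```

In the real system the ORM hides these steps from application code, which is why the index lookup (rather than a static config file, as slide 20 asks) is what lets shards move.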
  20. Config file?

  21. tickets index shard 1 shard 2 shard 3 shard n

    shard n+1 …
  22. Capacity planning included setting aside 2 months for shard balancing

  23. 2 months??

  24. 2010’s solution is Not Scaling

  25. Why shard data rebalancing?

  26. writes locked index 1) … … shop_2 => shard N

  27. shop_1.favorites shop_2.favorites shop_2.sales shop_3.sales … shop_2.favorites shop_2.sales shard N shard N+1
  28. writes unlocked index 3) … … shop_2 => shard N+1

  29. Migrations were • locked shops & users out of changes

    for up to hours • error prone • arbitrary • developers had to be aware • created orphaned data • slow to administer
  30. Migrations were Error prone? We can fix the errors! We

    can make the script more robust!! Arbitrary? We can write tooling that figures out which rows are right to migrate off for optimal balance!! Developers had to be aware? We can write better interfaces!
  31. Orphaned data? • Deletes are expensive, so we didn’t do

    them. • Migrations created orphaned data on old pairs that was still picked up by full table scans (downstream systems: search, analytics).
  32. Migrations were • error prone • arbitrary • developers had

    to be aware • created orphaned data • locked shops & users out of changes for up to hours • slow to administer
  33. Let’s talk about slowness… What if we could move more

    than one row at a time?
  34. Well, okay, why didn’t we do this in the first

    place?
  35. You have to run your site to learn your data

    access patterns.
  36. photo from gizmodo http://gizmodo.com/5632095/justin-bieber-has-dedicated-servers-at-twitter

  37. listing from https://www.etsy.com/shop/NausicaaDistribution

  38. <<enter logical sharding>>

  39. db_pair_1 db_pair_2 db_pair_3 db_pair_N shard1 shard2 … shard10 shard11 shard12

    … shard20 shard21 shard22 … shard30 …
  40. db_pair_1 shard1 shard2 … shard10 db_pair_2 shard2

  41. A Splitting Example

  42. Etsy App Shard 1 Shard 2 Shard 3 Shard 4 db_pair_1 Shard 1 Shard 2 Shard 3 Shard 4

    A B
  43. Etsy App Shard 1 Shard 2 Shard 3 Shard 4 db_pair_1 Shard 1 Shard 2 Shard 3 Shard 4

    A B Shard 1 Shard 2 Shard 3 Shard 4 db_pair_31 Shard 1 Shard 2 Shard 3 Shard 4 A B
  44. Etsy App Shard 1 Shard 2 Shard 3 Shard 4 db_pair_1 Shard 1 Shard 2 Shard 3 Shard 4

    A B Shard 1 Shard 2 Shard 3 Shard 4 db_pair_31 Shard 1 Shard 2 Shard 3 Shard 4 A B
  45. Etsy App Shard 1 Shard 2 Shard 3 Shard 4 db_pair_1 Shard 1 Shard 2 Shard 3 Shard 4

    A B Shard 1 Shard 2 Shard 3 Shard 4 db_pair_31 Shard 1 Shard 2 Shard 3 Shard 4 A B
  46. Etsy App Shard 1 Shard 2 Shard 3 Shard 4 db_pair_1 Shard 1 Shard 2 Shard 3 Shard 4

    A B Shard 1 Shard 2 Shard 3 Shard 4 db_pair_31 Shard 1 Shard 2 Shard 3 Shard 4 A B
  47. Etsy App Shard 1 Shard 2 Shard 3 Shard 4 db_pair_1 Shard 1 Shard 2 Shard 3 Shard 4

    A B Shard 1 Shard 2 Shard 3 Shard 4 db_pair_31 Shard 1 Shard 2 Shard 3 Shard 4 A B
  48. Etsy App Shard 1 Shard 2 Shard 3 Shard 4 db_pair_1 Shard 1 Shard 2 Shard 3 Shard 4

    A B Shard 1 Shard 2 Shard 3 Shard 4 db_pair_31 Shard 1 Shard 2 Shard 3 Shard 4 A B
  49. Etsy App Shard 1 Shard 2 Shard 3 Shard 4 db_pair_1 Shard 1 Shard 2 Shard 3 Shard 4

    A B Shard 1 Shard 2 Shard 3 Shard 4 db_pair_31 Shard 1 Shard 2 Shard 3 Shard 4 A B
  50. Etsy App Shard 1 Shard 2 db_pair_1 Shard 1 Shard 2 A B

    Shard 3 Shard 4 db_pair_31 Shard 3 Shard 4 A B
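The split sequence on slides 42–50 — bring up the new pair as a full replica, repoint the moved shards in the app's connection map, then drop the databases each pair no longer serves — can be sketched as a runnable toy, with in-memory stand-ins for the servers. All names (connection_map, db_pair_31, etc.) are illustrative, not Etsy's actual tooling.

```python
# In-memory stand-ins: where the app connects for each shard,
# and which shard databases exist on each physical pair.
connection_map = {"shard_1": "db_pair_1", "shard_2": "db_pair_1",
                  "shard_3": "db_pair_1", "shard_4": "db_pair_1"}
databases = {"db_pair_1": {"shard_1", "shard_2", "shard_3", "shard_4"}}

def split_db_pair(old_pair: str, new_pair: str, shards_to_move: list) -> None:
    # 1) File-level copy: the new pair comes up as a full replica of
    #    the old pair (e.g. restored from a backup) and catches up.
    databases[new_pair] = set(databases[old_pair])
    # 2) Repoint the moved shards in the app's connection map; the
    #    rest keep pointing at the old pair.
    for shard in shards_to_move:
        connection_map[shard] = new_pair
    # 3) Drop the databases each side no longer serves -- DROP DATABASE
    #    is fast and leaves no orphaned rows behind.
    databases[old_pair] -= set(shards_to_move)
    databases[new_pair] &= set(shards_to_move)
```

The key property is that no step touches individual rows: the copy is file-level, the repoint is a map update, and the cleanup is a whole-database drop.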
  51. How did we move onto this new architecture?

  52. We used the old row-based migration framework, one last time.

  53. We migrated all the sharded data (120TB).

  54. It took us 5 months to move that data using

    the old way.
  55. Today it would take us a few hours.

  56. Nice side effects! • schema changes (alters) now run in

    parallel, significantly faster • downstream analytics systems replicate faster at the shard-by-shard level • no more orphaned data • multi-threaded replication w/ 5.6 • index files can be shipped
  57. What’d we build? • Snappy compression/decompression to xtrabackup • MySQL

    on-disk space allocation improvements
  60. DSN-related outage… Number of Responses

  61. Hardware…

  62. Summary • How did Etsy do sharding the first time?

    • What were the problems with it? • How did we solve these problems?
  63. 2019 http://surge.omniti.com/2011/speakers/ross-snyder

  64. Resources • Using a tickets database for ID generation: http://code.flickr.net/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/

    • Using master-master replication: https://codeascraft.com/2012/04/20/two-sides-for-salvation/ • Etsy’s shard architecture, the 2012 edition: http://www.percona.com/live/mysql-conference-2012/sessions/etsy-shard-architecture-starts-s-and-ends-hard • Scaling Etsy, 2011: http://surge.omniti.com/2011/speakers/ross-snyder • Instagram’s shard & id architecture: http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram • Scaling Pinterest: https://speakerdeck.com/yashh/scaling-pinterest
  65. Questions?

  66. “Just” shard it Logical sharding at Etsy Maggie Zhou @zmagg

    Percona Live 2015 Keyur Govande @keyurdg