"Just" shard it

"Just" shard it

Given April 15 2015 with Keyur Govande at Percona Live in Santa Clara, CA

A talk on Etsy's second take on our database sharding architecture.

Etsy started with a monolithic postgres database and moved to a horizontally sharded mysql store with lookup table and master-master replication. Scaling your data store is never as easy as just adding more hardware though, even with sharding.

In this talk, we'll go over the pains and perils of sharding databases and how we solved some of the problems in Etsy's second take on sharding, what we call logical sharding. Logical sharding is where we put several databases on one mysqld, as opposed to one database per server. Upon hitting a resource limit on a box (CPU, Memory, Connections, etc) that would normally require individual row migration, we can instead do file level migrations of a database, replicate, then drop previous databases, which is super fast and a cleaner way to delete old unnecessary shard data. We'll also go over how logical sharding benefits other daily operational challenges like schema changes and backups, as well as makes life easier for our downstream analytics consumers.

Etsy is a fast growing site with over 40 million members and 26 million items listed. Learn about Etsy's painpoints in growing a 120TB+ sharded MySQL datastore and how the architecture and tooling evolved over time.


Maggie Zhou

April 15, 2015