How to scale large database

A talk about techniques for scaling a large database.

duongkai

May 23, 2013

Transcript

  1. Overview • A first look at large databases • Typical scaling techniques • Database sharding • Database sharding in MySQL
  2. Example: Facebook (2010) • 400 million active users • 5 billion pieces of content per week • 3 billion photos uploaded per month
  3. Example: Twitter (2011) • 1 billion tweets per week • 140 million tweets sent per day • 456 tweets per second when Michael Jackson died • 6,939 tweets per second on New Year's Day
  4. Scaling topologies • Replication (Master – Slave): the client reads and writes on the master; the slave is read-only • Cluster (shared storage): the client connects to two masters that share one storage
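The master-slave split on this slide can be sketched in application code. This is only a minimal illustration, not the deck's implementation: the class `ReplicationRouter`, the server names, and the "starts with SELECT" rule for detecting reads are all assumptions.

```python
import itertools


class ReplicationRouter:
    """Pick a server for each statement in a master-slave setup (hypothetical helper)."""

    def __init__(self, master, slaves):
        self.master = master
        # Cycle through the slaves so reads are spread round-robin.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        # Assumption: anything that starts with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._slaves is not None:
            return next(self._slaves)
        # Writes (and reads when there is no slave) go to the master.
        return self.master


router = ReplicationRouter("master", ["slave1", "slave2"])
print(router.route("SELECT * FROM users"))         # a slave
print(router.route("INSERT INTO users VALUES (1)"))  # the master
```

With asynchronous replication a slave can lag behind the master, so a read routed this way may return slightly stale data.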
  5. What is database sharding? • Horizontal partitioning • Data is stored in small chunks and distributed across many computers • Often used together with replication
  6. Database sharding topology • Primary DB • Shard1, Shard2, Shard3 • Slave1, Slave2, Slave3
  7. Range sharding • Data is distributed by ranges of the primary key • Example – primary key: user_id (1..1000) → user_shard1 (1..500), user_shard2 (501..1000)
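The user_id example on this slide could be routed roughly as follows. It is a sketch under assumptions: the function name `range_shard` and the two lookup lists are hypothetical, and only the ranges and shard names come from the slide.

```python
import bisect

# Inclusive upper bound of each range, ascending, and the shard that owns it.
# Ranges and shard names follow the slide; the lists themselves are hypothetical.
RANGE_UPPER_BOUNDS = [500, 1000]
RANGE_SHARDS = ["user_shard1", "user_shard2"]


def range_shard(user_id):
    """Return the shard whose key range contains user_id."""
    if not 1 <= user_id <= RANGE_UPPER_BOUNDS[-1]:
        raise ValueError(f"user_id {user_id} is outside every range")
    # bisect_left finds the first upper bound that is >= user_id.
    return RANGE_SHARDS[bisect.bisect_left(RANGE_UPPER_BOUNDS, user_id)]


print(range_shard(42))
print(range_shard(501))
```

Adding a shard for new users only means appending one bound and one shard name, but a range that grows hot cannot be rebalanced without moving rows.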
  8. List sharding • Data is distributed by the value of an attribute of the data • Example: a database of people in Vietnam, sharded by city_name (Ha_Noi, Hai_Phong, Da_Nang,…)
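List sharding amounts to an explicit lookup table from attribute value to shard. The sketch below assumes one shard per city; the shard names (`people_shard1` and so on) and the function `list_shard` are hypothetical, and only the city names come from the slide.

```python
# Explicit value -> shard mapping. City names come from the slide;
# the shard names are hypothetical.
CITY_TO_SHARD = {
    "Ha_Noi": "people_shard1",
    "Hai_Phong": "people_shard2",
    "Da_Nang": "people_shard3",
}


def list_shard(city_name):
    """Return the shard that stores people living in city_name."""
    try:
        return CITY_TO_SHARD[city_name]
    except KeyError:
        # A value missing from the list has no home until the mapping is extended.
        raise ValueError(f"no shard configured for city {city_name!r}") from None


print(list_shard("Da_Nang"))
```

Several values can also point at the same shard, which is one way to keep small cities from each needing their own server.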
  9. Hash sharding (modulus) • Data is distributed by applying a hash function to the primary key • Example: primary_key mod N
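The "primary_key mod N" rule can be sketched as below. The value of `NUM_SHARDS` and the function name `hash_shard` are assumptions, and CRC32 is just one possible stable hash for non-integer keys, not necessarily what the talk had in mind.

```python
import zlib

NUM_SHARDS = 4  # N in "primary_key mod N"; the value 4 is an assumption


def hash_shard(primary_key, num_shards=NUM_SHARDS):
    """Return the shard index (0 .. num_shards - 1) for a primary key."""
    if isinstance(primary_key, int):
        # Integer keys: plain modulus, as on the slide.
        return primary_key % num_shards
    # Other keys: hash to an integer first. CRC32 is stable across processes,
    # unlike Python's built-in hash(), which is randomized for strings.
    return zlib.crc32(str(primary_key).encode("utf-8")) % num_shards


print(hash_shard(10))       # 10 mod 4
print(hash_shard("alice"))  # some index in 0..3
```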
  10. Pros of database sharding • Easy to scale (data, write I/O) • Runs on commodity hardware • Minimal impact when part of the system fails
  11. Cons of database sharding • You MUST implement it yourself • Operations are harder • Handling join operations is very difficult • Data denormalization → Don't do it just because it's COOL!
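To illustrate why joins get difficult, here is a sketch of a join done in application code across shards. Everything in it is hypothetical: the in-memory "shards", the sample rows, and the choice to shard users by user_id and orders by order_id.

```python
# Hypothetical data: users are sharded by user_id mod 2, orders by order_id mod 2,
# so an order and its user can live on different shards.
USER_SHARDS = [
    {2: "Binh"},  # user_id % 2 == 0
    {1: "An"},    # user_id % 2 == 1
]
ORDER_SHARDS = [
    [{"order_id": 10, "user_id": 1}, {"order_id": 12, "user_id": 2}],  # order_id % 2 == 0
    [{"order_id": 11, "user_id": 1}],                                  # order_id % 2 == 1
]


def orders_with_user_names():
    """Emulate `orders JOIN users` by hand, since no single server holds both tables."""
    rows = []
    for order_shard in ORDER_SHARDS:  # scatter: visit every order shard
        for order in order_shard:
            user_id = order["user_id"]
            # Second lookup on whichever shard owns this user.
            name = USER_SHARDS[user_id % len(USER_SHARDS)][user_id]
            rows.append((order["order_id"], name))
    return sorted(rows)  # gather: merge the partial results


print(orders_with_user_names())
```

Copying the user name into each order row (denormalization) would avoid the second lookup, at the cost of keeping the copies in sync.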
  12. Q&A