Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How to scale large database
Search
duongkai
May 23, 2013
Programming
3
200
How to scale large database
Bài nói về các kĩ thuật để mở rộng một database lớn.
duongkai
May 23, 2013
Tweet
Share
More Decks by duongkai
See All by duongkai
Common crypto flaws in finance mobile apps
duongkai
0
84
Tetcon-2015 Using TLS correctly
duongkai
2
370
How to use SSL/TLS correctly
duongkai
1
180
5S - Xây dựng và thực hiện
duongkai
0
160
Why Random Matters
duongkai
0
77
Crypto-101 @hackerspace 26/07/2013
duongkai
1
110
Trao đổi email
duongkai
0
160
+TetCon.2013_Hacking.Oracle.2012.pdf
duongkai
0
150
Other Decks in Programming
See All in Programming
Implementation Patterns
denyspoltorak
0
290
MDN Web Docs に日本語翻訳でコントリビュート
ohmori_yusuke
0
650
CSC307 Lecture 06
javiergs
PRO
0
680
AWS re:Invent 2025参加 直前 Seattle-Tacoma Airport(SEA)におけるハードウェア紛失インシデントLT
tetutetu214
2
110
AI & Enginnering
codelynx
0
110
なるべく楽してバックエンドに型をつけたい!(楽とは言ってない)
hibiki_cube
0
140
AIによる高速開発をどう制御するか? ガードレール設置で開発速度と品質を両立させたチームの事例
tonkotsuboy_com
7
2.3k
AI前提で考えるiOSアプリのモダナイズ設計
yuukiw00w
0
230
16年目のピクシブ百科事典を支える最新の技術基盤 / The Modern Tech Stack Powering Pixiv Encyclopedia in its 16th Year
ahuglajbclajep
5
1k
インターン生でもAuth0で認証基盤刷新が出来るのか
taku271
0
190
Rust 製のコードエディタ “Zed” を使ってみた
nearme_tech
PRO
0
170
OCaml 5でモダンな並列プログラミングを Enjoyしよう!
haochenx
0
140
Featured
See All Featured
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
508
140k
Jamie Indigo - Trashchat’s Guide to Black Boxes: Technical SEO Tactics for LLMs
techseoconnect
PRO
0
57
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
17k
Public Speaking Without Barfing On Your Shoes - THAT 2023
reverentgeek
1
300
Joys of Absence: A Defence of Solitary Play
codingconduct
1
290
How to make the Groovebox
asonas
2
1.9k
The Cost Of JavaScript in 2023
addyosmani
55
9.5k
Navigating Weather and Climate Data
rabernat
0
100
The Curious Case for Waylosing
cassininazir
0
230
Thoughts on Productivity
jonyablonski
74
5k
Evolving SEO for Evolving Search Engines
ryanjones
0
120
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
196
71k
Transcript
How To Scale Large Database Phạm Tùng Dương – CIO03
Course: Advanced Database
Overview • First glance about Large Database • Typical techniques
to scale • Database sharding • Database sharding in MySQL
First glance about Large Database
When You Talk about Large Database
Example Tumblr @2012
Example • 400 million active users • 5 billion pieces
of content per week • 3 billion photos uploaded per month Facebook@2010
Example • 1 billion tweets per week • 140 million
tweets sent per day • 456 tweets per second @MJ death • 6939 tweets per second on NY day Twitter@2011
What is The Large Database • Large working data sets
• I/O write intensive
Typical approaches
What is The Bottleneck? I/O, I/O and I/O
We have a job which is called Performance Tuning
Scale up • Adding more RAM, more CPU • High
I/O HDD
Scale topo Replication (Master – Slave) Master Slave Client Read/Write
Read Only Master Master Storage Client Cluster (shared storage)
Caching • Memcached • Redis
Finally, Everything in RAM is a Dream!
But, No Silver Bullet!
Database Sharding
What is Database Sharding • Horizontal Partitioning • Data is
stored in small chunks and distributed across many computers • Often use with Replication
Database sharding topo Primary DB Shard1 Shard2
Shard3 Slave1 Slave2 Slave3
3 types • Range sharding • List sharding (Lookup table)
• Hash sharding
Range sharding • Distributed by the range of Primary Key
• Example – Primary Key: user_id (1..1000) user_shard1 (1..500) user_shard2 (501..1000)
List sharding • Distributed data by the attribute of the
data • Example: database of people in VN – Sharded by the city_name (Ha_Noi, Hai_Phong, Da_Nang,…)
Hash sharding (modulus) • Distributed data by using a hash
function on primary key. • Example: primary_key mod N
Pros of Database Sharding • Easy to scale (data, write
I/O) • Using commodity hardware • Minimum effect when system failed
Cons of Database sharding • You MUST implement by yourselves
• Operation is harder • Handle join operation is very difficult • Data denormalization – > Don’t do it because it’s COOL!
Database Sharding in MySQL
Sharding Solutions • Application layer • Storage layer • Heavy
middleware • Lightweight middleware
Application layer • Hibernate Shards • HiveDB
Storage layer • MySQL Spider – Requires to change storage engine
of MySQL
Heavy Middleware • Twitter Gizzard • dbShards – Each db
has an agent
Lightweight Middleware • Acts like a proxy • Route the
request • Spock, CUBRID
You Will Do It Because You Have To … not
because it’s Cool!
Q&A