How to scale large database
duongkai
May 23, 2013
A talk about techniques for scaling a large database.
Transcript
How To Scale Large Database Phạm Tùng Dương – CIO03
Course: Advanced Database
Overview • A first look at large databases • Typical scaling techniques • Database sharding • Database sharding in MySQL
A First Look at Large Databases
When You Talk about a Large Database
Example Tumblr @2012
Example: Facebook @2010 • 400 million active users • 5 billion pieces of content per week • 3 billion photos uploaded per month
Example: Twitter @2011 • 1 billion tweets per week • 140 million tweets sent per day • 456 tweets per second when Michael Jackson died • 6,939 tweets per second on New Year's Day
What Is a Large Database? • Large working data sets • Write-intensive I/O
Typical approaches
What is The Bottleneck? I/O, I/O and I/O
We have a job for this, called Performance Tuning
Scale up • Add more RAM and more CPU • Use high-I/O HDDs
Scaling topologies • Replication (Master – Slave): the client reads and writes on the Master; the Slave is read only • Cluster (shared storage): the client connects to Masters that share one Storage
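The Master – Slave split above can be sketched in a few lines. This is only an illustration, not the setup used in the talk: the class name `ReplicatedRouter`, the server labels, and the rule "a statement starting with SELECT is a read" are all assumptions made for the example.

```python
import itertools


class ReplicatedRouter:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._readers = itertools.cycle(slaves or [master])

    def route(self, sql):
        # Simplifying assumption: only statements starting with SELECT are reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.master


router = ReplicatedRouter("master", ["slave1", "slave2"])
```

Calling `router.route("UPDATE users SET name = 'x'")` returns the master, while successive SELECT statements alternate between the two slaves. Replication lag, which can make a slave return stale rows, is ignored here.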
Caching • Memcached • Redis
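One common way to put a cache such as Memcached or Redis in front of a database is the cache-aside pattern. The sketch below is an assumption about how caching might be applied, not something shown in the slides; a plain dict stands in for the cache server, and `CacheAside` and `load_from_db` are hypothetical names.

```python
class CacheAside:
    """Hypothetical cache-aside wrapper around a database read function."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._cache = {}      # stand-in for Memcached or Redis
        self.db_reads = 0     # counts how often the database was hit

    def get(self, key):
        # Serve from the cache when possible.
        if key in self._cache:
            return self._cache[key]
        # Cache miss: read from the database and remember the result.
        self.db_reads += 1
        value = self._load(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        # Call after a write so the next read fetches fresh data.
        self._cache.pop(key, None)
```

Repeated reads of the same key reach the database only once, until the entry is invalidated. This relieves read I/O but does nothing for write I/O.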
Finally, Everything in RAM is a Dream!
But, No Silver Bullet!
Database Sharding
What Is Database Sharding? • Horizontal partitioning • Data is stored in small chunks and distributed across many computers • Often used together with replication
Database sharding topology • Primary DB • Shard1, Shard2, Shard3 • Slave1, Slave2, Slave3
Three types • Range sharding • List sharding (lookup table) • Hash sharding
Range sharding • Data is distributed by ranges of the primary key • Example – Primary key: user_id (1..1000) – user_shard1 (1..500) – user_shard2 (501..1000)
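The user_id example above can be written as a small lookup. This is a minimal sketch: the ranges and shard names come from the slide, while the function name `range_shard` and the use of `bisect` are choices made for illustration.

```python
import bisect

# Inclusive upper bound of each range, and the shard that owns it.
RANGE_UPPER_BOUNDS = [500, 1000]
RANGE_SHARDS = ["user_shard1", "user_shard2"]


def range_shard(user_id):
    """Return the shard holding user_id, for ids in 1..1000."""
    if user_id < 1:
        raise ValueError(f"user_id {user_id} is below the first range")
    # Index of the first upper bound that is >= user_id.
    index = bisect.bisect_left(RANGE_UPPER_BOUNDS, user_id)
    if index == len(RANGE_SHARDS):
        raise ValueError(f"user_id {user_id} is above the last range")
    return RANGE_SHARDS[index]
```

For example, `range_shard(500)` returns `"user_shard1"` and `range_shard(501)` returns `"user_shard2"`. Adding a shard means appending a new bound and name, without moving existing rows.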
List sharding • Data is distributed by an attribute of the data • Example: database of people in Vietnam – Sharded by city_name (Ha_Noi, Hai_Phong, Da_Nang, …)
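List sharding amounts to a lookup table from an attribute value to a shard. In the sketch below the city names come from the slide; the shard names, the grouping of cities onto shards, and the function name `list_shard` are assumptions for illustration only.

```python
# Hypothetical lookup table: which shard stores the people of each city.
CITY_TO_SHARD = {
    "Ha_Noi": "people_shard1",
    "Hai_Phong": "people_shard1",
    "Da_Nang": "people_shard2",
}


def list_shard(city_name):
    """Return the shard for city_name according to the lookup table."""
    try:
        return CITY_TO_SHARD[city_name]
    except KeyError:
        # A city missing from the table has no shard assigned yet.
        raise ValueError(f"no shard configured for {city_name!r}") from None
```

The table has to be maintained by hand: every new attribute value needs an entry before rows carrying it can be stored.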
Hash sharding (modulus) • Data is distributed by applying a hash function to the primary key • Example: primary_key mod N
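The "primary_key mod N" rule can be sketched as follows. Integer keys use the modulus directly, as on the slide. The handling of non-integer keys with an MD5 digest is an assumption added for the example (Python's built-in `hash()` for strings varies between processes, so it is unsuitable for routing), and `hash_shard` is a hypothetical name.

```python
import hashlib


def hash_shard(primary_key, shard_count):
    """Return the shard index (0..shard_count-1) for primary_key."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    if isinstance(primary_key, int):
        # The slide's rule: primary_key mod N.
        return primary_key % shard_count
    # Assumption: hash other key types with a digest that is stable across runs.
    digest = hashlib.md5(str(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

With three shards, `hash_shard(7, 3)` is 1 and `hash_shard(9, 3)` is 0. Note that changing N reassigns most keys to a different shard, so N should be chosen with future growth in mind.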
Pros of Database Sharding • Easy to scale (data, write I/O) • Runs on commodity hardware • Minimal impact when one system fails
Cons of Database Sharding • You MUST implement it yourself • Operations are harder • Handling join operations is very difficult • Data denormalization → Don’t do it just because it’s COOL!
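To see why queries get harder, consider a simple aggregate over a sharded table: no single database can answer it, so the application has to query every shard and merge the results itself. The sketch below uses in-memory SQLite databases as stand-ins for the shards; the `users` table, its columns, and the function names are assumptions for the example.

```python
import sqlite3


def make_shard(rows):
    """Create an in-memory database standing in for one shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, city TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    return conn


def count_by_city(shards):
    """Run the same GROUP BY on every shard and merge the partial counts."""
    totals = {}
    for conn in shards:
        query = "SELECT city, COUNT(*) FROM users GROUP BY city"
        for city, count in conn.execute(query):
            totals[city] = totals.get(city, 0) + count
    return totals


shards = [
    make_shard([(1, "Ha_Noi"), (2, "Da_Nang")]),
    make_shard([(501, "Ha_Noi")]),
]
```

Here `count_by_city(shards)` combines the per-shard counts into one total per city. A join between tables that live on different shards needs the same kind of application-side merging, which is one reason sharded schemas are often denormalized.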
Database Sharding in MySQL
Sharding Solutions • Application layer • Storage layer • Heavy middleware • Lightweight middleware
Application layer • Hibernate Shards • HiveDB
Storage layer • MySQL Spider – Requires changing the MySQL storage engine
Heavy Middleware • Twitter Gizzard • dbShards – Each database has an agent
Lightweight Middleware • Acts like a proxy • Routes the request • Spock, CUBRID
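The proxy idea can be sketched generically: the client talks to one object, which picks a backend from the shard key and forwards the request. This does not describe how Spock or CUBRID work internally; `ShardProxy`, the callable backends, and the modulus routing rule are all assumptions for illustration.

```python
class ShardProxy:
    """Hypothetical proxy: one entry point that forwards to the right shard."""

    def __init__(self, backends):
        # Each backend is a callable that executes a request on one shard.
        self._backends = list(backends)
        if not self._backends:
            raise ValueError("at least one backend is required")

    def execute(self, shard_key, request):
        # Route by shard_key mod N, then forward the request unchanged.
        backend = self._backends[shard_key % len(self._backends)]
        return backend(request)


proxy = ShardProxy([
    lambda request: ("shard0", request),
    lambda request: ("shard1", request),
])
```

A call such as `proxy.execute(3, "SELECT ...")` is forwarded to the second backend, and the client never needs to know how many shards exist.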
You Will Do It Because You Have To … not because it’s Cool!
Q&A