FooLab
http://foolab.ca
@foolabca
Speed up Your
Database
ConFoo - February 27, 2014
Slide 2
Slide 2 text
FooLab
Objectives
• What is slow?
• Indexes.
• Better queries.
• Checklist.
2
Slide 3
Slide 3 text
FooLab
Anna Filina
3
• I help projects ship on time.
• I train and mentor developers.
• I code and optimize.
Slide 4
Slide 4 text
FooLab
What is slow?
Slide 5
Slide 5 text
FooLab
Multitude of queries
• Loops:
• Reduce number of queries.
• Check for lazy loading in your ORM.
5
foreach (id in pictures) {
query_pic_details(id);
}
Slide 6
Slide 6 text
FooLab
Lazy loading
• Data loaded on demand.
• Use LEFT JOIN to get all related data in one go.
6
echo $picture->album->user->name;
FooLab
Count queries
• Manual: add custom code before queries.
• Auto: ORMs (like ActiveRecord) have listeners.
8
Slide 9
Slide 9 text
FooLab
Slow query log
• Open my.cnf
• Queries taking 0.5s or longer will be logged.
• Open the log file and analyze the queries.
9
slow_query_log=ON
long_query_time=0.5
slow_query_log_file=/path/to/file
Slide 10
Slide 10 text
FooLab
Example schema
10
200
users
100
albums each
100
pictures each
2,000,000 total pics
Slide 11
Slide 11 text
FooLab
Simple query
• Get all pictures from user #1
• 38 seconds.
11
SELECT picture.id, picture.title
FROM picture
LEFT JOIN album ON picture.album.id = album.id
WHERE album.user_id = 1;
Slide 12
Slide 12 text
FooLab
Why is it so slow?
Slide 13
Slide 13 text
FooLab
Explain
13
• Prefix with EXPLAIN:
• Each row = table scanned:
EXPLAIN SELECT picture.id, picture.title
FROM picture
LEFT JOIN album ON picture.album.id = album.id
WHERE album.user_id = 1;
+----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+
| 1 | SIMPLE | album | ALL | NULL | NULL | NULL | NULL | 20000 | Using where |
| 1 | SIMPLE | picture | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using join buffer |
+----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+
Slide 14
Slide 14 text
FooLab
Explain
14
• Details:
• Because of the join,
we’re scanning all albums for each picture.
• Total 40 billion rows scanned.
table : album
key : NULL
rows : 20,000
table : picture
key : NULL
rows : 2,000,000
Slide 15
Slide 15 text
FooLab
Indexes
Slide 16
Slide 16 text
FooLab
Metaphor
16
Flip through all pages of a address book...
... or pull on the letter tab.
Slide 17
Slide 17 text
FooLab
Add an index
17
• album.id
• Same query down to 2.38 sec.
• We just saved 36 sec on every request!
ALTER TABLE album ADD PRIMARY KEY(id);
Slide 18
Slide 18 text
FooLab
Under the hood
18
• Details:
• With the index,
we find our album right away.
• Total 2,000,000 rows scanned.
table : album
key : PRIMARY
rows : 1
table : picture
key : NULL
rows : 2,000,000
Slide 19
Slide 19 text
FooLab
Add a better index
19
• picture.album_id
• Now down to 0.12 sec.
• 317 times faster than original.
ALTER TABLE picture ADD INDEX(album_id);
Slide 20
Slide 20 text
FooLab
Under the hood
20
• Details:
• Total 200,000 rows scanned.
table : album
key : NULL
rows : 20,000
table : picture
key : album_id
rows : 100
Slide 21
Slide 21 text
FooLab
Add a better index
21
• album.user_id
• Now down to 0.10 sec.
• Gained another 17% in speed.
ALTER TABLE album ADD INDEX(user_id);
Slide 22
Slide 22 text
FooLab
Under the hood
22
• Details:
• With the second index,
we’re scanning 100 pics.
• Total 10,000 rows scanned.
table : album
key : user_id
rows : 100
table : picture
key : album_id
rows : 100
Slide 23
Slide 23 text
FooLab
Indexes used
23
• Primary and foreign keys make for great indexes.
• Also whatever you use in JOINs and WHEREs.
Slide 24
Slide 24 text
FooLab
Index criteria
24
• Change frequency
• Selectivity: how many different values?
• Good examples:
• Bad examples:
user.gender
picture.views
user.id
picture.date
Slide 25
Slide 25 text
FooLab
Shall we optimize
further?
We’re quite fast already.
Slide 26
Slide 26 text
FooLab
Airport customs
26
• 5 customs officers, 1 min to process a traveler.
• 5 travelers / min = fluid.
• 20 travelers / min = queue slowly builds.
• 1000 travellers / min = 3h wait if you arrive on 2nd
minute.
Slide 27
Slide 27 text
FooLab
Smaller indexes
27
• BIGINT as index is slower than TINYINT.
• Use UNSIGNED to double the maximum value.
Slide 28
Slide 28 text
FooLab
Tips for better queries.
Slide 29
Slide 29 text
FooLab
Functions
29
• Avoid functions on an indexed column:
• The index will be ignored here.
SELECT id, title FROM picture
WHERE YEAR(create_date) >= 2011;
Slide 30
Slide 30 text
FooLab
Limit
30
• Now down to 0.0009 sec.
• 100 times faster than the last query.
SELECT picture.id, picture.title
FROM picture
LEFT JOIN album ON picture.album.id = album.id
WHERE album.user_id = 1
LIMIT 25;
Slide 31
Slide 31 text
FooLab
Other constraints
31
• Range:
• Only makes sense if that column is indexed:
SELECT picture.id, picture.title
FROM picture
LEFT JOIN album ON picture.album.id = album.id
WHERE album.user_id = 1
AND picture.id BETWEEN 26 AND 50; # 0.023 sec
ALTER TABLE picture ADD INDEX(id);
Slide 32
Slide 32 text
FooLab
Other constraints
32
• IN clause:
SELECT picture.id, picture.title
FROM picture
LEFT JOIN album ON picture.album.id = album.id
WHERE album.user_id = 1
AND picture.id IN (15,26,29,32,45); # 0.048 sec
Slide 33
Slide 33 text
FooLab
Split tables
33
• MySQL can partition your tables transparently.
• Creates multiple files on disk.
• Shows data as a single table.
Slide 34
Slide 34 text
FooLab
Archives
34
• MySQL ARCHIVE storage engine:
• Doesn’t support UPDATE or DELETE
• Very fast for SELECT
• You can archive old transactions, news, etc.
Slide 35
Slide 35 text
FooLab
Split databases
35
• Horizontal partitionning, sharding.
• Users 1-1000 and their data in database #1
• Users 1001-2000 and their data in database #2
Slide 36
Slide 36 text
FooLab
Checklist
36
✓Count queries.
✓Log the slow ones.
✓Use EXPLAIN and check
number of rows scanned.
✓Try different indexes,
remove unused ones.
✓Use small indexes,
preferably numeric.
✓Don’t use functions on
indexed columns.
✓Limit number of results.
✓Split tables and
databases.