Slide 1

Slide 1 text

FooLab http://foolab.ca @foolabca Speed up Your Database ConFoo - February 27, 2014

Slide 2

Slide 2 text

FooLab Objectives • What is slow? • Indexes. • Better queries. • Checklist. 2

Slide 3

Slide 3 text

FooLab Anna Filina 3 • I help projects ship on time. • I train and mentor developers. • I code and optimize.

Slide 4

Slide 4 text

FooLab What is slow?

Slide 5

Slide 5 text

FooLab Multitude of queries • Loops: • Reduce number of queries. • Check for lazy loading in your ORM. 5 foreach (id in pictures) { query_pic_details(id); }

Slide 6

Slide 6 text

FooLab Lazy loading • Data loaded on demand. • Use LEFT JOIN to get all related data in one go. 6 echo $picture->album->user->name;

Slide 7

Slide 7 text

FooLab Multitude of queries • Deeply nested code. • Try get_first_thumb instead. 7

Slide 8

Slide 8 text

FooLab Count queries • Manual: add custom code before queries. • Auto: ORMs (like ActiveRecord) have listeners. 8

Slide 9

Slide 9 text

FooLab Slow query log • Open my.cnf • Queries taking 0.5s or longer will be logged. • Open the log file and analyze the queries. 9 slow_query_log=ON long_query_time=0.5 slow_query_log_file=/path/to/file

Slide 10

Slide 10 text

FooLab Example schema 10 200 users 100 albums each 100 pictures each 2,000,000 total pics

Slide 11

Slide 11 text

FooLab Simple query • Get all pictures from user #1 • 38 seconds. 11 SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1;

Slide 12

Slide 12 text

FooLab Why is it so slow?

Slide 13

Slide 13 text

FooLab Explain 13 • Prefix with EXPLAIN: • Each row = table scanned: EXPLAIN SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1; +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+ | 1 | SIMPLE | album | ALL | NULL | NULL | NULL | NULL | 20000 | Using where | | 1 | SIMPLE | picture | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using join buffer | +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+

Slide 14

Slide 14 text

FooLab Explain 14 • Details: • Because of the join, we’re scanning all albums for each picture. • Total 40 billion rows scanned. table : album key : NULL rows : 20,000 table : picture key : NULL rows : 2,000,000

Slide 15

Slide 15 text

FooLab Indexes

Slide 16

Slide 16 text

FooLab Metaphor 16 Flip through all pages of a address book... ... or pull on the letter tab.

Slide 17

Slide 17 text

FooLab Add an index 17 • album.id • Same query down to 2.38 sec. • We just saved 36 sec on every request! ALTER TABLE album ADD PRIMARY KEY(id);

Slide 18

Slide 18 text

FooLab Under the hood 18 • Details: • With the index, we find our album right away. • Total 2,000,000 rows scanned. table : album key : PRIMARY rows : 1 table : picture key : NULL rows : 2,000,000

Slide 19

Slide 19 text

FooLab Add a better index 19 • picture.album_id • Now down to 0.12 sec. • 317 times faster than original. ALTER TABLE picture ADD INDEX(album_id);

Slide 20

Slide 20 text

FooLab Under the hood 20 • Details: • Total 200,000 rows scanned. table : album key : NULL rows : 20,000 table : picture key : album_id rows : 100

Slide 21

Slide 21 text

FooLab Add a better index 21 • album.user_id • Now down to 0.10 sec. • Gained another 17% in speed. ALTER TABLE album ADD INDEX(user_id);

Slide 22

Slide 22 text

FooLab Under the hood 22 • Details: • With the second index, we’re scanning 100 pics. • Total 10,000 rows scanned. table : album key : user_id rows : 100 table : picture key : album_id rows : 100

Slide 23

Slide 23 text

FooLab Indexes used 23 • Primary and foreign keys make for great indexes. • Also whatever you use in JOINs and WHEREs.

Slide 24

Slide 24 text

FooLab Index criteria 24 • Change frequency • Selectivity: how many different values? • Good examples: • Bad examples: user.gender picture.views user.id picture.date

Slide 25

Slide 25 text

FooLab Shall we optimize further? We’re quite fast already.

Slide 26

Slide 26 text

FooLab Airport customs 26 • 5 customs officers, 1 min to process a traveler. • 5 travelers / min = fluid. • 20 travelers / min = queue slowly builds. • 1000 travellers / min = 3h wait if you arrive on 2nd minute.

Slide 27

Slide 27 text

FooLab Smaller indexes 27 • BIGINT as index is slower than TINYINT. • Use UNSIGNED to double the maximum value.

Slide 28

Slide 28 text

FooLab Tips for better queries.

Slide 29

Slide 29 text

FooLab Functions 29 • Avoid functions on an indexed column: • The index will be ignored here. SELECT id, title FROM picture WHERE YEAR(create_date) >= 2011;

Slide 30

Slide 30 text

FooLab Limit 30 • Now down to 0.0009 sec. • 100 times faster than the last query. SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1 LIMIT 25;

Slide 31

Slide 31 text

FooLab Other constraints 31 • Range: • Only makes sense if that column is indexed: SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1 AND picture.id BETWEEN 26 AND 50; # 0.023 sec ALTER TABLE picture ADD INDEX(id);

Slide 32

Slide 32 text

FooLab Other constraints 32 • IN clause: SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1 AND picture.id IN (15,26,29,32,45); # 0.048 sec

Slide 33

Slide 33 text

FooLab Split tables 33 • MySQL can partition your tables transparently. • Creates multiple files on disk. • Shows data as a single table.

Slide 34

Slide 34 text

FooLab Archives 34 • MySQL ARCHIVE storage engine: • Doesn’t support UPDATE or DELETE • Very fast for SELECT • You can archive old transactions, news, etc.

Slide 35

Slide 35 text

FooLab Split databases 35 • Horizontal partitionning, sharding. • Users 1-1000 and their data in database #1 • Users 1001-2000 and their data in database #2

Slide 36

Slide 36 text

FooLab Checklist 36 ✓Count queries. ✓Log the slow ones. ✓Use EXPLAIN and check number of rows scanned. ✓Try different indexes, remove unused ones. ✓Use small indexes, preferably numeric. ✓Don’t use functions on indexed columns. ✓Limit number of results. ✓Split tables and databases.

Slide 37

Slide 37 text

FooLab Further reading • Storage engines: http://www.linux.org/article/view/an- introduction-to-mysql-storage-engines • Sharding: http://www.jurriaanpersyn.com/archives/ 2009/02/12/database-sharding-at-netlog-with-mysql- and-php/ 37

Slide 38

Slide 38 text

FooLab Even more fun stuff 38 • Setup read slaves • Denormalize where appropriate • SSD hard drive