Slide 1

Slide 1 text

foolab.ca | @foolabca Speed up Your Database Tech4Africa - October 8, 2014

Slide 2

Slide 2 text

Objectives • What is slow? • Indexes. • Better queries. • Checklist. 2

Slide 3

Slide 3 text

Anna Filina 3 • Developer • Problem solver • Coach • Advisor

Slide 4

Slide 4 text

What is slow?

Slide 5

Slide 5 text

Multitude of queries • Loops: • Reduce number of queries. • Check for lazy loading in your ORM. 5 foreach ($pictures as $id) { query_pic_details($id); }

Slide 6

Slide 6 text

Lazy loading • Data loaded on demand. • Use LEFT JOIN to get all related data in one go. 6 echo $picture->album->user->name;

Slide 7

Slide 7 text

Multitude of queries • Deeply nested code. • Try get_first_thumb instead. 7

Slide 8

Slide 8 text

Count queries • Manual: add custom code before queries. • Auto: ORMs (like Doctrine) have listeners. 8

Slide 9

Slide 9 text

Slow query log • Open my.cnf • Queries taking 0.5s or longer will be logged. • Open the log file and analyze the queries. 9 slow_query_log=ON long_query_time=0.5 slow_query_log_file=/path/to/file

Slide 10

Slide 10 text

Example schema 10 200 users 100 albums each 100 pictures each 2,000,000 total pics

Slide 11

Slide 11 text

Simple query • Get all pictures from user #1 • 38 seconds. 11 SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1;

Slide 12

Slide 12 text

Why is it so slow?

Slide 13

Slide 13 text

Explain 13 • Prefix with EXPLAIN: • Each row = table scanned: EXPLAIN SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1; +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+ | 1 | SIMPLE | album | ALL | NULL | NULL | NULL | NULL | 20000 | Using where | | 1 | SIMPLE | picture | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using join buffer | +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+

Slide 14

Slide 14 text

Explain 14 • Details: • Because of the join, we’re scanning all albums for each picture. • Total 40 billion rows scanned. table : album key : NULL rows : 20,000 table : picture key : NULL rows : 2,000,000

Slide 15

Slide 15 text

Indexes

Slide 16

Slide 16 text

Metaphor 16 Flip through all pages of a address book... ... or pull on the letter tab.

Slide 17

Slide 17 text

Add an index 17 • album.id • Same query down to 2.38 sec. • We just saved 36 sec on every request! ALTER TABLE album ADD PRIMARY KEY(id);

Slide 18

Slide 18 text

Under the hood 18 • Details: • With the index, we find our album right away. • Total 2,000,000 rows scanned. table : album key : PRIMARY rows : 1 table : picture key : NULL rows : 2,000,000

Slide 19

Slide 19 text

Add a better index 19 • picture.album_id • Now down to 0.12 sec. • 317 times faster than original. ALTER TABLE picture ADD INDEX(album_id);

Slide 20

Slide 20 text

Under the hood 20 • Details: • Total 200,000 rows scanned. table : album key : NULL rows : 20,000 table : picture key : album_id rows : 100

Slide 21

Slide 21 text

Add a better index 21 • album.user_id • Now down to 0.10 sec. • Gained another 17% in speed. ALTER TABLE album ADD INDEX(user_id);

Slide 22

Slide 22 text

Under the hood 22 • Details: • With the second index, we’re scanning 100 pics. • Total 10,000 rows scanned. table : album key : user_id rows : 100 table : picture key : album_id rows : 100

Slide 23

Slide 23 text

Indexes used 23 • Primary and foreign keys make for great indexes. • Also whatever you use in JOINs and WHEREs.

Slide 24

Slide 24 text

Index criteria 24 • Change frequency • Selectivity: how many different values? • Good examples: • Bad examples: user.gender picture.views user.id picture.date

Slide 25

Slide 25 text

Shall we optimize further? We’re quite fast already.

Slide 26

Slide 26 text

Airport customs 26 • 5 customs officers, 1 min to process a traveler. • 5 travelers / min = fluid. • 20 travelers / min = queue slowly builds. • 1000 travellers / min = 3h wait if you arrive on 2nd minute.

Slide 27

Slide 27 text

Smaller indexes 27 • BIGINT as index is slower than TINYINT. • Use UNSIGNED to double the maximum value.

Slide 28

Slide 28 text

Tips for better queries. 28

Slide 29

Slide 29 text

Functions 29 • Avoid functions on an indexed column: • The index will be ignored here. SELECT id, title FROM picture WHERE YEAR(create_date) >= 2011;

Slide 30

Slide 30 text

Limit 30 • Now down to 0.0009 sec. • 100 times faster than the last query. SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1 LIMIT 25;

Slide 31

Slide 31 text

Other constraints 31 • Range: • Only makes sense if that column is indexed: SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1 AND picture.id BETWEEN 26 AND 50; # 0.023 sec ALTER TABLE picture ADD INDEX(id);

Slide 32

Slide 32 text

Other constraints 32 • IN clause: SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1 AND picture.id IN (15,26,29,32,45); # 0.048 sec

Slide 33

Slide 33 text

Split tables 33 • MySQL can partition your tables transparently. ◦ Creates multiple files on disk. ◦ Shows data as a single table.

Slide 34

Slide 34 text

Archives 34 • MySQL ARCHIVE storage engine: ◦ Doesn’t support UPDATE or DELETE ◦ Very fast for SELECT • You can archive old transactions, news, etc.

Slide 35

Slide 35 text

Split databases 35 • Horizontal partitionning, sharding. • Users 1-1000 and their data in database #1 • Users 1001-2000 and their data in database #2

Slide 36

Slide 36 text

Checklist 36 ✓Count queries. ✓Log the slow ones. ✓Use EXPLAIN and check number of rows scanned. ✓Try different indexes, remove unused ones. ✓Use small indexes, preferably numeric. ✓Don’t use functions on indexed columns. ✓Limit number of results. ✓Split tables and databases.

Slide 37

Slide 37 text

Further reading • Storage engines: http://www.linux.org/article/view/an- introduction-to-mysql-storage-engines • Sharding: http://www.jurriaanpersyn.com/archives/ 2009/02/12/database-sharding-at-netlog-with-mysql- and-php/ 37

Slide 38

Slide 38 text

Even more fun stuff 38 • Setup read slaves • Denormalize where appropriate • SSD hard drive

Slide 39

Slide 39 text

Services • Development: PHP, JS, etc. • Fix problems: bugs, performance, etc. • Coaching & workshops. • Advisor: testing strategy, architecture, etc. 39