Speed Up Your Database 300 Times

B3b2139e4f2c0eca4efe2379fcebc1c5?s=47 Anna Filina
January 30, 2016

Speed Up Your Database 300 Times

Are your queries slow? Learn how to speed them up through better SQL crafting and use of meaningful indices. You will understand what works well and what doesn’t, and will walk away with a checklist for faster databases. I expect that you will all be itching to analyze MySQL queries to see how much you can shave off.

B3b2139e4f2c0eca4efe2379fcebc1c5?s=128

Anna Filina

January 30, 2016
Tweet

Transcript

  1. 2.

    Objectives • What is slow? • Investigate performance issues. •

    Indexes. • Better queries. • Checklist. 2
  2. 5.

    Multitude of queries • Loops:
 
 
 • Reduce number

    of queries. • Check for lazy loading in your ORM.
 5 foreach ($pictures as $id) {
 query_pic_details($id); }
  3. 6.

    Lazy loading • Data loaded on demand.
 • Use JOIN

    to get all related data in one go.
 6 echo $picture->album->user->name;
  4. 8.

    Count queries • Manual: add custom code before queries. •

    Auto: ORMs (like Doctrine) have listeners or data collectors.
 8
  5. 9.

    Slow query log • Open my.cnf
 
 
 • Queries

    taking 0.5s or longer will be logged. • Open the log file and analyze the queries. 9 slow_query_log=ON long_query_time=0.5 slow_query_log_file=/path/to/file
  6. 11.

    Simple query • Get all pictures from user #1
 


    
 
 • 38 seconds. 11 SELECT picture.id, picture.title
 FROM picture
 LEFT JOIN album ON picture.album_id = album.id
 WHERE album.user_id = 1;
  7. 13.

    Explain 13 • Prefix with EXPLAIN:
 
 
 
 •

    Each row = table scanned:
 
 
 EXPLAIN SELECT picture.id, picture.title
 FROM picture
 LEFT JOIN album ON picture.album_id = album.id
 WHERE album.user_id = 1; +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+ | 1 | SIMPLE | album | ALL | NULL | NULL | NULL | NULL | 20000 | Using where | | 1 | SIMPLE | picture | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using join buffer | +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+
  8. 14.

    Explain 14 • Details:
 
 
 
 • Because of

    the join,
 we’re scanning all albums for each picture. • Total 40 billion rows scanned.
 table : album key : NULL rows : 20,000 table : picture key : NULL rows : 2,000,000
  9. 15.
  10. 16.
  11. 17.

    Add an index 17 • album.id
 • Same query down

    to 2.38 sec. • We just saved 36 sec on every request!
 ALTER TABLE album ADD PRIMARY KEY(id);
  12. 18.

    Under the hood 18 • Details:
 
 
 
 •

    With the index,
 we find our album right away. • Total 2,000,000 rows scanned.
 table : album key : PRIMARY rows : 1 table : picture key : NULL rows : 2,000,000
  13. 19.

    Add a better index 19 • picture.album_id
 
 • Now

    down to 0.12 sec. • 317 times faster than original. ALTER TABLE picture ADD INDEX(album_id);
  14. 20.

    Under the hood 20 • Details:
 
 
 
 •

    Total 200,000 rows scanned.
 table : album key : NULL rows : 20,000 table : picture key : album_id rows : 100
  15. 21.

    Add a better index 21 • album.user_id
 
 • Now

    down to 0.10 sec. • Gained another 17% in speed.
 ALTER TABLE album ADD INDEX(user_id);
  16. 22.

    Under the hood 22 • Details:
 
 
 
 •

    With the second index,
 we’re scanning 100 pics. • Total 10,000 rows scanned. table : album key : user_id rows : 100 table : picture key : album_id rows : 100
  17. 23.

    Don't index everything! There is an index on each field

    And all selects take long to yield. Try finding which ones you need, Remove the rest and feel the speed. -- Anna Filina 23
  18. 24.

    Indexes used 24 • Primary and foreign keys make for

    great indexes. • Also whatever you use in JOIN, WHERE, ORDER BY.
  19. 25.

    Index criteria 25 • Change frequency • Selectivity: how many

    different values? • Good examples: ◦ user.id ◦ picture.date • Bad examples: ◦ user.is_active ◦ picture.views

  20. 27.

    Airport customs 27 • 5 customs officers, 1 min to

    process a traveller. • 5 travellers / min = fluid. • 20 travellers / min = queue slowly builds. • 1000 travellers / min = 3h wait if you arrive on 2nd minute.
  21. 28.

    Smaller indexes 28 • BIGINT as index is slower than

    TINYINT. • Use UNSIGNED to double the maximum value.
  22. 30.

    Functions 30 • Avoid functions on an indexed column: •

    The index will be ignored here.
 SELECT id, title FROM picture
 WHERE YEAR(create_date) >= 2011;
  23. 31.

    Limit 31 • Now down to 0.0009 sec. • 100

    times faster than the last query. SELECT picture.id, picture.title
 FROM picture
 LEFT JOIN album ON picture.album.id = album.id
 WHERE album.user_id = 1
 LIMIT 25;
  24. 32.

    Other constraints 32 • Range:
 
 
 
 
 •

    Only makes sense if that column is indexed:
 
 
 SELECT picture.id, picture.title
 FROM picture
 LEFT JOIN album ON picture.album.id = album.id
 WHERE album.user_id = 1
 AND picture.id BETWEEN 26 AND 50; # 0.023 sec ALTER TABLE picture ADD INDEX(id);
  25. 33.

    Other constraints 33 • IN clause:
 
 
 
 


    
 
 
 
 
 SELECT picture.id, picture.title
 FROM picture
 LEFT JOIN album ON picture.album.id = album.id
 WHERE album.user_id = 1
 AND picture.id IN (15,26,29,32,45); # 0.048 sec
  26. 34.
  27. 35.

    Problem • New features released before sale • Index was

    not used • Creating too many tmp tables on disk • Slow queries • Long queue 35
  28. 36.

    How did I find the problem? 36 • Status:
 


    
 
 
 
 
 
 
 
 SHOW GLOBAL STATUS; +------------------------------------------+-------------+ | Variable_name | Value | +------------------------------------------+-------------+ | Aborted_clients | 0 | | Aborted_connects | 0 | | Binlog_cache_disk_use | 0 | | Binlog_cache_use | 0 | | Binlog_stmt_cache_disk_use | 0 | | Binlog_stmt_cache_use | 0 | | Bytes_received | 23378012 | | Bytes_sent | 2707328 |
  29. 37.

    How did I find the problem? 37 • Created_tmp_disk_tables ◦

    Temporary tables created on order by, group by, etc. ◦ Stored on disk when can't fit in RAM ◦ Maybe too big resultset. ◦ Are we using index? • Handler_read_rnd_next ◦ Full or partial table scan
  30. 38.

    Split tables 38 • MySQL can partition your tables transparently.

    ◦ Creates multiple files on disk. ◦ Shows data as a single table.

  31. 39.

    Archives 39 • MySQL ARCHIVE storage engine: ◦ Doesn’t support

    UPDATE or DELETE ◦ Very fast for SELECT • You can archive old transactions, news, etc.

  32. 40.

    Split databases 40 • Horizontal partitioning, sharding. • Users 1-1000

    and their data in database #1 • Users 1001-2000 and their data in database #2
  33. 41.

    Checklist 41 ✓Count queries. ✓Log the slow ones. ✓Use EXPLAIN

    and check number of rows scanned. ✓Try different indexes, remove unused ones. ✓Use small indexes, preferably numeric. ✓Don’t use functions on indexed columns. ✓Make sure indexes are used. ✓Limit number of results. ✓Split tables and databases.
  34. 42.

    Further reading • Storage engines: http://www.linux.org/article/view/an- introduction-to-mysql-storage-engines • Sharding: http://www.jurriaanpersyn.com/archives/

    2009/02/12/database-sharding-at-netlog-with-mysql- and-php/ • Optimization manual: http://dev.mysql.com/doc/refman/ 5.7/en/optimization.html 42
  35. 43.

    Even more fun stuff 43 • Setup read slaves •

    Denormalize where appropriate • SSD hard drive
  36. 44.

    Services • Development: PHP, JS, etc. • Fix problems: bugs,

    performance, etc. • Coaching & workshops. • Advisor: testing strategy, architecture, etc. 44