Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Speed Up Your Database 300 Times

Speed Up Your Database 300 Times

Are your queries slow? Learn how to speed them up through better SQL crafting and use of meaningful indices. You will understand what works well and what doesn’t, and will walk away with a checklist for faster databases. I expect that you will all be itching to analyze MySQL queries to see how much you can shave off.

Anna Filina

January 30, 2016
Tweet

More Decks by Anna Filina

Other Decks in Programming

Transcript

  1. foolab.ca | @foolabca
    Speed Up Your Database
    300 Times
    PHPBenelux - January 30, 2016

    View full-size slide

  2. Objectives
    • What is slow?
    • Investigate performance
    issues.
    • Indexes.
    • Better queries.
    • Checklist.
    2

    View full-size slide

  3. Anna Filina
    • Developer
    • Problem solver
    • Teacher
    • Advisor
    • FooLab + ConFoo
    3

    View full-size slide

  4. What is slow?

    View full-size slide

  5. Multitude of queries
    • Loops:



    • Reduce number of queries.
    • Check for lazy loading in your ORM.

    5
    foreach ($pictures as $id) {

    query_pic_details($id);
    }

    View full-size slide

  6. Lazy loading
    • Data loaded on demand.

    • Use JOIN to get all related data in one go.

    6
    echo $picture->album->user->name;

    View full-size slide

  7. Multitude of queries
    • Deeply nested code.
    • Try get_first_thumb instead.

    7

    View full-size slide

  8. Count queries
    • Manual: add custom code before queries.
    • Auto: ORMs (like Doctrine) have listeners or data
    collectors.

    8

    View full-size slide

  9. Slow query log
    • Open my.cnf



    • Queries taking 0.5s or longer will be logged.
    • Open the log file and analyze the queries.
    9
    slow_query_log=ON
    long_query_time=0.5
    slow_query_log_file=/path/to/file

    View full-size slide

  10. Example schema
    10
    200

    users
    100

    albums each
    100

    pictures each
    2,000,000 pictures

    View full-size slide

  11. Simple query
    • Get all pictures from user #1




    • 38 seconds.
    11
    SELECT picture.id, picture.title

    FROM picture

    LEFT JOIN album ON picture.album_id = album.id

    WHERE album.user_id = 1;

    View full-size slide

  12. Why is it so slow?

    View full-size slide

  13. Explain
    13
    • Prefix with EXPLAIN:




    • Each row = table scanned:



    EXPLAIN SELECT picture.id, picture.title

    FROM picture

    LEFT JOIN album ON picture.album_id = album.id

    WHERE album.user_id = 1;
    +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+
    | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+
    | 1 | SIMPLE | album | ALL | NULL | NULL | NULL | NULL | 20000 | Using where |
    | 1 | SIMPLE | picture | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using join buffer |
    +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+

    View full-size slide

  14. Explain
    14
    • Details:




    • Because of the join,

    we’re scanning all albums for each picture.
    • Total 40 billion rows scanned.

    table : album
    key : NULL
    rows : 20,000
    table : picture
    key : NULL
    rows : 2,000,000

    View full-size slide

  15. Metaphor
    16
    Flip through all pages of a address book...
    ... or pull on the letter tab.

    View full-size slide

  16. Add an index
    17
    • album.id

    • Same query down to 2.38 sec.
    • We just saved 36 sec on every request!

    ALTER TABLE album ADD PRIMARY KEY(id);

    View full-size slide

  17. Under the hood
    18
    • Details:




    • With the index,

    we find our album right away.
    • Total 2,000,000 rows scanned.

    table : album
    key : PRIMARY
    rows : 1
    table : picture
    key : NULL
    rows : 2,000,000

    View full-size slide

  18. Add a better index
    19
    • picture.album_id


    • Now down to 0.12 sec.
    • 317 times faster than original.
    ALTER TABLE picture ADD INDEX(album_id);

    View full-size slide

  19. Under the hood
    20
    • Details:




    • Total 200,000 rows scanned.

    table : album
    key : NULL
    rows : 20,000
    table : picture
    key : album_id
    rows : 100

    View full-size slide

  20. Add a better index
    21
    • album.user_id


    • Now down to 0.10 sec.
    • Gained another 17% in speed.

    ALTER TABLE album ADD INDEX(user_id);

    View full-size slide

  21. Under the hood
    22
    • Details:




    • With the second index,

    we’re scanning 100 pics.
    • Total 10,000 rows scanned.
    table : album
    key : user_id
    rows : 100
    table : picture
    key : album_id
    rows : 100

    View full-size slide

  22. Don't index everything!
    There is an index on each field
    And all selects take long to yield.
    Try finding which ones you need,
    Remove the rest and feel the speed.
    -- Anna Filina
    23

    View full-size slide

  23. Indexes used
    24
    • Primary and foreign keys make for great indexes.
    • Also whatever you use in JOIN, WHERE, ORDER BY.

    View full-size slide

  24. Index criteria
    25
    • Change frequency
    • Selectivity: how many different values?
    • Good examples:
    ◦ user.id
    ◦ picture.date
    • Bad examples:
    ◦ user.is_active
    ◦ picture.views


    View full-size slide

  25. Shall we optimize
    further?
    We’re quite fast already.

    View full-size slide

  26. Airport customs
    27
    • 5 customs officers, 1 min to process a traveller.
    • 5 travellers / min = fluid.
    • 20 travellers / min = queue slowly builds.
    • 1000 travellers / min = 3h wait if you arrive on 2nd
    minute.

    View full-size slide

  27. Smaller indexes
    28
    • BIGINT as index is slower than TINYINT.
    • Use UNSIGNED to double the maximum value.

    View full-size slide

  28. Tips for Better
    Queries

    View full-size slide

  29. Functions
    30
    • Avoid functions on an indexed column:
    • The index will be ignored here.

    SELECT id, title FROM picture

    WHERE YEAR(create_date) >= 2011;

    View full-size slide

  30. Limit
    31
    • Now down to 0.0009 sec.
    • 100 times faster than the last query.
    SELECT picture.id, picture.title

    FROM picture

    LEFT JOIN album ON picture.album.id = album.id

    WHERE album.user_id = 1

    LIMIT 25;

    View full-size slide

  31. Other constraints
    32
    • Range:





    • Only makes sense if that column is indexed:



    SELECT picture.id, picture.title

    FROM picture

    LEFT JOIN album ON picture.album.id = album.id

    WHERE album.user_id = 1

    AND picture.id BETWEEN 26 AND 50; # 0.023 sec
    ALTER TABLE picture ADD INDEX(id);

    View full-size slide

  32. Other constraints
    33
    • IN clause:










    SELECT picture.id, picture.title

    FROM picture

    LEFT JOIN album ON picture.album.id = album.id

    WHERE album.user_id = 1

    AND picture.id IN (15,26,29,32,45); # 0.048 sec

    View full-size slide

  33. Problem
    • New features released before sale
    • Index was not used
    • Creating too many tmp tables on disk
    • Slow queries
    • Long queue
    35

    View full-size slide

  34. How did I find the problem?
    36
    • Status:










    SHOW GLOBAL STATUS;
    +------------------------------------------+-------------+
    | Variable_name | Value |
    +------------------------------------------+-------------+
    | Aborted_clients | 0 |
    | Aborted_connects | 0 |
    | Binlog_cache_disk_use | 0 |
    | Binlog_cache_use | 0 |
    | Binlog_stmt_cache_disk_use | 0 |
    | Binlog_stmt_cache_use | 0 |
    | Bytes_received | 23378012 |
    | Bytes_sent | 2707328 |

    View full-size slide

  35. How did I find the problem?
    37
    • Created_tmp_disk_tables
    ◦ Temporary tables created on order by, group by, etc.
    ◦ Stored on disk when can't fit in RAM
    ◦ Maybe too big resultset.
    ◦ Are we using index?
    • Handler_read_rnd_next
    ◦ Full or partial table scan

    View full-size slide

  36. Split tables
    38
    • MySQL can partition your tables transparently.
    ◦ Creates multiple files on disk.
    ◦ Shows data as a single table.


    View full-size slide

  37. Archives
    39
    • MySQL ARCHIVE storage engine:
    ◦ Doesn’t support UPDATE or DELETE
    ◦ Very fast for SELECT
    • You can archive old transactions, news, etc.


    View full-size slide

  38. Split databases
    40
    • Horizontal partitioning, sharding.
    • Users 1-1000 and their data in database #1
    • Users 1001-2000 and their data in database #2

    View full-size slide

  39. Checklist
    41
    ✓Count queries.
    ✓Log the slow ones.
    ✓Use EXPLAIN and check
    number of rows scanned.
    ✓Try different indexes,
    remove unused ones.
    ✓Use small indexes,
    preferably numeric.
    ✓Don’t use functions on
    indexed columns.
    ✓Make sure indexes are
    used.
    ✓Limit number of results.
    ✓Split tables and
    databases.

    View full-size slide

  40. Further reading
    • Storage engines: http://www.linux.org/article/view/an-
    introduction-to-mysql-storage-engines
    • Sharding: http://www.jurriaanpersyn.com/archives/
    2009/02/12/database-sharding-at-netlog-with-mysql-
    and-php/
    • Optimization manual: http://dev.mysql.com/doc/refman/
    5.7/en/optimization.html
    42

    View full-size slide

  41. Even more fun stuff
    43
    • Setup read slaves
    • Denormalize where appropriate
    • SSD hard drive

    View full-size slide

  42. Services
    • Development: PHP, JS, etc.
    • Fix problems: bugs, performance, etc.
    • Coaching & workshops.
    • Advisor: testing strategy, architecture, etc.
    44

    View full-size slide

  43. @afilina afilina.com
    joind.in

    View full-size slide