$30 off During Our Annual Pro Sale. View Details »

Speed Up Your Database

Speed Up Your Database

Are your queries slow? Learn how to speed them up through better SQL crafting and use meaningful indices. Attendees will understand what works well and what doesn't, and will walk away with a checklist for faster databases.

Anna Filina
PRO

November 11, 2012
Tweet

More Decks by Anna Filina

Other Decks in Programming

Transcript

  1. FooLab
    http://foolab.ca
    @foolabca
    Speed up Your
    Database
    Pycon Canada - November 11, 2012

    View Slide

  2. FooLab
    Speed Matters
    • What is slow?
    • Indexes.
    • Better queries.
    • Checklist.
    2
    Indices*

    View Slide

  3. FooLab
    Anna Filina
    3
    User group
    Non profit
    conference
    FooLab
    Training
    Project rescue
    @afilina

    View Slide

  4. FooLab
    What is Slow?

    View Slide

  5. FooLab
    Multitude of Queries
    • Loops:
    • Reduce the number of queries.
    • Check for lazy loading in your ORM.
    • Query a set instead.
    5
    for id in pictures:
    query_pic_details(id)

    View Slide

  6. FooLab
    Multitude of Queries
    • Deeply nested code.
    • Try get_first_thumb instead.
    6

    View Slide

  7. FooLab
    Count Queries
    • Manual: add custom code before queries.
    • Auto: ORMs (like SQLAlchemy) have listeners.
    7

    View Slide

  8. FooLab
    Slow Query Log
    • Open my.cnf
    • Queries taking 0.5s or longer will be logged.
    • Open the log file and analyze the queries.
    8
    slow_query_log=ON
    long_query_time=0.5
    slow_query_log_file=/path/to/file

    View Slide

  9. FooLab
    Example Schema
    9
    200
    users
    100
    albums each
    100
    pictures each
    2,000,000 total pics

    View Slide

  10. FooLab
    Simple Query
    • Get all pictures from user #1
    • 38 seconds.
    10
    SELECT picture.id, picture.title
    FROM picture
    LEFT JOIN album ON picture.album.id = album.id
    WHERE album.user_id = 1;

    View Slide

  11. FooLab
    Let me do a space jump
    while it fetches my pics...
    11

    View Slide

  12. FooLab
    Why is it Slow?

    View Slide

  13. FooLab
    Explain
    13
    • Prefix with EXPLAIN:
    • Each row = table scanned:
    EXPLAIN SELECT picture.id, picture.title
    FROM picture
    LEFT JOIN album ON picture.album.id = album.id
    WHERE album.user_id = 1;
    +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+
    | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+
    | 1 | SIMPLE | album | ALL | NULL | NULL | NULL | NULL | 20000 | Using where |
    | 1 | SIMPLE | picture | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using join buffer |
    +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+

    View Slide

  14. FooLab
    Explain
    14
    • Details:
    • Because of the join,
    we’re scanning all albums for each picture.
    • Total 40 billion rows scanned.
    table : album
    key : NULL
    rows : 20,000
    table : picture
    key : NULL
    rows : 2,000,000

    View Slide

  15. FooLab
    Indexes

    View Slide

  16. FooLab
    Metaphor
    16
    Flip through all pages of a address book...
    ... or pull on the letter tab.

    View Slide

  17. FooLab
    Add Index
    17
    • album.id
    • Same query down to 2.38 sec.
    • We just saved 36 sec on every request!
    ALTER TABLE album ADD PRIMARY KEY(id);

    View Slide

  18. FooLab
    Under the Hood
    18
    • Details:
    • With the index,
    we find our album right away.
    • Total 2,000,000 rows scanned.
    table : album
    key : PRIMARY
    rows : 1
    table : picture
    key : NULL
    rows : 2,000,000

    View Slide

  19. FooLab
    Add Next Index
    19
    • picture.album_id
    • Now down to 0.12 sec.
    • 317 times faster than original.
    ALTER TABLE picture ADD INDEX(album_id);

    View Slide

  20. FooLab
    Under the Hood
    20
    • Details:
    • Total 200,000 rows scanned.
    table : album
    key : NULL
    rows : 20,000
    table : picture
    key : album_id
    rows : 100

    View Slide

  21. FooLab
    Add Last Index
    21
    • album.user_id
    • Now down to 0.10 sec.
    • Gained another 17% in speed.
    ALTER TABLE album ADD INDEX(user_id);

    View Slide

  22. FooLab
    Under the Hood
    22
    • Details:
    • With the second index,
    we’re scanning 100 pics.
    • Total 10,000 rows scanned.
    table : album
    key : user_id
    rows : 100
    table : picture
    key : album_id
    rows : 100

    View Slide

  23. FooLab
    Indexes Used
    23
    • Primary and foreign keys make for great indexes.
    • Also whatever you use in JOINs and WHEREs.

    View Slide

  24. FooLab
    Do We Need to
    Optimize Further?
    We’re quite fast already.

    View Slide

  25. FooLab
    Airport Customs
    25
    • 5 customs officers, 1 min to process a traveler.
    • 5 travelers / min = fluid.
    • 20 travelers / min = queue slowly builds.
    • 1000 travellers / min = 3h wait if you arrive on 2nd
    minute.

    View Slide

  26. FooLab
    Microseconds Matter
    26
    Evergrowing queues Alone at full speed

    View Slide

  27. FooLab
    Smaller Indexes
    27
    • BIGINT as index is slower than TINYINT.
    • Use UNSIGNED to double the maximum value.

    View Slide

  28. FooLab
    Tips for Better Queries

    View Slide

  29. FooLab
    Functions & Indexes
    29
    • Avoid functions on an indexed column:
    • The index won’t be used here.
    SELECT id, title FROM picture
    WHERE YEAR(create_date) >= 2011;

    View Slide

  30. FooLab
    Limit
    30
    • Stop picking up firewood when your hands are full:
    • Now down to 0.0009 sec.
    • 100 times faster than the last query.
    SELECT picture.id, picture.title
    FROM picture
    LEFT JOIN album ON picture.album.id = album.id
    WHERE album.user_id = 1
    LIMIT 25;

    View Slide

  31. FooLab
    Other Constraints
    31
    • Range:
    • Only makes sense if that column is indexed:
    SELECT picture.id, picture.title
    FROM picture
    LEFT JOIN album ON picture.album.id = album.id
    WHERE album.user_id = 1
    AND picture.id BETWEEN 26 AND 50; # 0.023 sec
    ALTER TABLE picture ADD INDEX(id);

    View Slide

  32. FooLab
    Other Constraints
    32
    • IN clause:
    SELECT picture.id, picture.title
    FROM picture
    LEFT JOIN album ON picture.album.id = album.id
    WHERE album.user_id = 1
    AND picture.id IN (15,26,29,32,45); # 0.048 sec

    View Slide

  33. FooLab
    Split Tables
    33
    • MySQL can partition your tables transparently.
    • Creates multiple files on disk.
    • Shows data as a single table.

    View Slide

  34. FooLab
    Archives
    34
    • Don’t underestimate archive tables.
    • MySQL ARCHIVE storage engine:
    • Doesn’t support UPDATE or DELETE
    • Very fast for SELECT
    • You can archive old transactions, news, etc.

    View Slide

  35. FooLab
    Split Databases
    35
    • Also called horizontal partitionning or sharding.
    • Users 1-1000 and their data in database #1
    • Users 1001-2000 and their data in database #2

    View Slide

  36. FooLab
    The Checklist
    36
    ✓Count your queries.
    ✓Log the slow ones.
    ✓Use EXPLAIN and check
    number of rows scanned.
    ✓Try different indexes,
    remove the unused ones.
    ✓Use small indexes,
    preferably numeric.
    ✓Don’t use functions on
    indexed columns.
    ✓Limit number of results.
    ✓Split tables and
    databases.

    View Slide

  37. FooLab
    Further Reading
    • SQLAlchemy profiling: http://www.sqlalchemy.org/trac/
    wiki/UsageRecipes/Profiling
    • Storage Engines: http://www.linux.org/article/view/an-
    introduction-to-mysql-storage-engines
    • Sharding: http://www.jurriaanpersyn.com/archives/
    2009/02/12/database-sharding-at-netlog-with-mysql-
    and-php/
    37

    View Slide

  38. FooLab
    Next Steps
    • I will tweet the slides:
    @afilina
    • E-mail your feedback:
    [email protected]
    38

    View Slide