Speed Up Your Database

B3b2139e4f2c0eca4efe2379fcebc1c5?s=47 Anna Filina
November 11, 2012

Speed Up Your Database

Are your queries slow? Learn how to speed them up through better SQL crafting and use meaningful indices. Attendees will understand what works well and what doesn't, and will walk away with a checklist for faster databases.

B3b2139e4f2c0eca4efe2379fcebc1c5?s=128

Anna Filina

November 11, 2012
Tweet

Transcript

  1. FooLab http://foolab.ca @foolabca Speed up Your Database Pycon Canada -

    November 11, 2012
  2. FooLab Speed Matters • What is slow? • Indexes. •

    Better queries. • Checklist. 2 Indices*
  3. FooLab Anna Filina 3 User group Non profit conference FooLab

    Training Project rescue @afilina
  4. FooLab What is Slow?

  5. FooLab Multitude of Queries • Loops: • Reduce the number

    of queries. • Check for lazy loading in your ORM. • Query a set instead. 5 for id in pictures: query_pic_details(id)
  6. FooLab Multitude of Queries • Deeply nested code. • Try

    get_first_thumb instead. 6
  7. FooLab Count Queries • Manual: add custom code before queries.

    • Auto: ORMs (like SQLAlchemy) have listeners. 7
  8. FooLab Slow Query Log • Open my.cnf • Queries taking

    0.5s or longer will be logged. • Open the log file and analyze the queries. 8 slow_query_log=ON long_query_time=0.5 slow_query_log_file=/path/to/file
  9. FooLab Example Schema 9 200 users 100 albums each 100

    pictures each 2,000,000 total pics
  10. FooLab Simple Query • Get all pictures from user #1

    • 38 seconds. 10 SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1;
  11. FooLab Let me do a space jump while it fetches

    my pics... 11
  12. FooLab Why is it Slow?

  13. FooLab Explain 13 • Prefix with EXPLAIN: • Each row

    = table scanned: EXPLAIN SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1; +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+ | 1 | SIMPLE | album | ALL | NULL | NULL | NULL | NULL | 20000 | Using where | | 1 | SIMPLE | picture | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using join buffer | +----+-------------+---------+------+---------------+------+---------+------+---------+--------------------------------+
  14. FooLab Explain 14 • Details: • Because of the join,

    we’re scanning all albums for each picture. • Total 40 billion rows scanned. table : album key : NULL rows : 20,000 table : picture key : NULL rows : 2,000,000
  15. FooLab Indexes

  16. FooLab Metaphor 16 Flip through all pages of a address

    book... ... or pull on the letter tab.
  17. FooLab Add Index 17 • album.id • Same query down

    to 2.38 sec. • We just saved 36 sec on every request! ALTER TABLE album ADD PRIMARY KEY(id);
  18. FooLab Under the Hood 18 • Details: • With the

    index, we find our album right away. • Total 2,000,000 rows scanned. table : album key : PRIMARY rows : 1 table : picture key : NULL rows : 2,000,000
  19. FooLab Add Next Index 19 • picture.album_id • Now down

    to 0.12 sec. • 317 times faster than original. ALTER TABLE picture ADD INDEX(album_id);
  20. FooLab Under the Hood 20 • Details: • Total 200,000

    rows scanned. table : album key : NULL rows : 20,000 table : picture key : album_id rows : 100
  21. FooLab Add Last Index 21 • album.user_id • Now down

    to 0.10 sec. • Gained another 17% in speed. ALTER TABLE album ADD INDEX(user_id);
  22. FooLab Under the Hood 22 • Details: • With the

    second index, we’re scanning 100 pics. • Total 10,000 rows scanned. table : album key : user_id rows : 100 table : picture key : album_id rows : 100
  23. FooLab Indexes Used 23 • Primary and foreign keys make

    for great indexes. • Also whatever you use in JOINs and WHEREs.
  24. FooLab Do We Need to Optimize Further? We’re quite fast

    already.
  25. FooLab Airport Customs 25 • 5 customs officers, 1 min

    to process a traveler. • 5 travelers / min = fluid. • 20 travelers / min = queue slowly builds. • 1000 travellers / min = 3h wait if you arrive on 2nd minute.
  26. FooLab Microseconds Matter 26 Evergrowing queues Alone at full speed

  27. FooLab Smaller Indexes 27 • BIGINT as index is slower

    than TINYINT. • Use UNSIGNED to double the maximum value.
  28. FooLab Tips for Better Queries

  29. FooLab Functions & Indexes 29 • Avoid functions on an

    indexed column: • The index won’t be used here. SELECT id, title FROM picture WHERE YEAR(create_date) >= 2011;
  30. FooLab Limit 30 • Stop picking up firewood when your

    hands are full: • Now down to 0.0009 sec. • 100 times faster than the last query. SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1 LIMIT 25;
  31. FooLab Other Constraints 31 • Range: • Only makes sense

    if that column is indexed: SELECT picture.id, picture.title FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1 AND picture.id BETWEEN 26 AND 50; # 0.023 sec ALTER TABLE picture ADD INDEX(id);
  32. FooLab Other Constraints 32 • IN clause: SELECT picture.id, picture.title

    FROM picture LEFT JOIN album ON picture.album.id = album.id WHERE album.user_id = 1 AND picture.id IN (15,26,29,32,45); # 0.048 sec
  33. FooLab Split Tables 33 • MySQL can partition your tables

    transparently. • Creates multiple files on disk. • Shows data as a single table.
  34. FooLab Archives 34 • Don’t underestimate archive tables. • MySQL

    ARCHIVE storage engine: • Doesn’t support UPDATE or DELETE • Very fast for SELECT • You can archive old transactions, news, etc.
  35. FooLab Split Databases 35 • Also called horizontal partitionning or

    sharding. • Users 1-1000 and their data in database #1 • Users 1001-2000 and their data in database #2
  36. FooLab The Checklist 36 ✓Count your queries. ✓Log the slow

    ones. ✓Use EXPLAIN and check number of rows scanned. ✓Try different indexes, remove the unused ones. ✓Use small indexes, preferably numeric. ✓Don’t use functions on indexed columns. ✓Limit number of results. ✓Split tables and databases.
  37. FooLab Further Reading • SQLAlchemy profiling: http://www.sqlalchemy.org/trac/ wiki/UsageRecipes/Profiling • Storage

    Engines: http://www.linux.org/article/view/an- introduction-to-mysql-storage-engines • Sharding: http://www.jurriaanpersyn.com/archives/ 2009/02/12/database-sharding-at-netlog-with-mysql- and-php/ 37
  38. FooLab Next Steps • I will tweet the slides: @afilina

    • E-mail your feedback: anna@foolab.ca 38