Slide 1

Slide 1 text

Optimising Queries Optimising Queries For Fun And Profit For Fun And Profit Andrew Godwin / @andrewgodwin Andrew Godwin / @andrewgodwin flickr.com/stuckincustoms flickr.com/stuckincustoms

Slide 2

Slide 2 text

· Django core dev · Django core dev · South · South · Lanyrd · Lanyrd

Slide 3

Slide 3 text

Databases are old. Databases are old. (and they're not magic!) (and they're not magic!) flickr.com/stuckincustoms flickr.com/stuckincustoms

Slide 4

Slide 4 text

· Computers are slow · Computers are slow Otherwise databases would be useless Otherwise databases would be useless

Slide 5

Slide 5 text

· Indexes, joins or replication · Indexes, joins or replication Things a database gives you over a flat file Things a database gives you over a flat file

Slide 6

Slide 6 text

Databases still need you. Databases still need you. They can't solve every problem without hints They can't solve every problem without hints flickr.com/96dpi flickr.com/96dpi

Slide 7

Slide 7 text

Basic Problems Basic Problems

Slide 8

Slide 8 text

· n+1 queries · n+1 queries Far too easy to do! Far too easy to do!

Slide 9

Slide 9 text

photos = [] photos = [] for person in People.objects.all(): for person in People.objects.all(): photos.append(person.photo.file) photos.append(person.photo.file)

Slide 10

Slide 10 text

photo_ids = People.objects.values_list( photo_ids = People.objects.values_list( "photo", flat=True) "photo", flat=True) photos = Photo.objects.filter(id__in=photo_ids) photos = Photo.objects.filter(id__in=photo_ids)

Slide 11

Slide 11 text

photos = [] photos = [] for person in People.objects.select_related("photo"): for person in People.objects.select_related("photo"): photos.append(person.photo.file) photos.append(person.photo.file)

Slide 12

Slide 12 text

· Unrestricted joins · Unrestricted joins Those exponentials add up Those exponentials add up

Slide 13

Slide 13 text

1 million rows JOINed with 1 million rows 1 million rows JOINed with 1 million rows ...is 1,000,000,000,000 potential comparisons ...is 1,000,000,000,000 potential comparisons

Slide 14

Slide 14 text

· Denormalise · Denormalise It'll save you a lot of time. It'll save you a lot of time.

Slide 15

Slide 15 text

Finer Optimisation Finer Optimisation

Slide 16

Slide 16 text

· Your goal: Less queries · Your goal: Less queries Databases are speedier than your code Databases are speedier than your code

Slide 17

Slide 17 text

· Log queries for each page · Log queries for each page Django Debug Toolbar works well for this Django Debug Toolbar works well for this

Slide 18

Slide 18 text

SELECT * FROM people_person WHERE age > 13 AND foo; SELECT * FROM people_person WHERE age > 13 AND foo; SELECT * FROM people_photo WHERE id = 42; SELECT * FROM people_photo WHERE id = 42; SELECT * FROM people_photo WHERE id = 31; SELECT * FROM people_photo WHERE id = 31; SELECT * FROM people_photo WHERE id = 214; SELECT * FROM people_photo WHERE id = 214; SELECT * FROM people_photo WHERE id = 84; SELECT * FROM people_photo WHERE id = 84; SELECT * FROM people_photo WHERE id = 28; SELECT * FROM people_photo WHERE id = 28; SELECT * FROM people_photo WHERE id = 90; SELECT * FROM people_photo WHERE id = 90;

Slide 19

Slide 19 text

SELECT * FROM people_person AS pr SELECT * FROM people_person AS pr INNER JOIN people_photo AS ph ON ph.id = pr.photo_id INNER JOIN people_photo AS ph ON ph.id = pr.photo_id WHERE age > 13 AND foo; WHERE age > 13 AND foo;

Slide 20

Slide 20 text

· At most 100 per page · At most 100 per page Preferably, you want 20 or less Preferably, you want 20 or less

Slide 21

Slide 21 text

· Log slow queries · Log slow queries There's settings for this in PostgreSQL/MySQL There's settings for this in PostgreSQL/MySQL

Slide 22

Slide 22 text

· Analyse slow queries · Analyse slow queries See if they need indexes See if they need indexes

Slide 23

Slide 23 text

# explain select * from aeblog_post where id = 50; # explain select * from aeblog_post where id = 50; QUERY PLAN QUERY PLAN --------------------------------------------------- --------------------------------------------------- Index Scan using aeblog_post_pkey on aeblog_post Index Scan using aeblog_post_pkey on aeblog_post (cost=0.00..8.27 rows=1 width=1812) (cost=0.00..8.27 rows=1 width=1812) Index Cond: (id = 50) Index Cond: (id = 50)

Slide 24

Slide 24 text

# explain select * from aeblog_post # explain select * from aeblog_post where section = 'trip-notes'; where section = 'trip-notes'; QUERY PLAN QUERY PLAN --------------------------------------------------- --------------------------------------------------- Seq Scan on aeblog_post (cost=0.00..14.21 rows=1 Seq Scan on aeblog_post (cost=0.00..14.21 rows=1 width=1812) width=1812) Filter: ((section)::text = 'trip-notes'::text) Filter: ((section)::text = 'trip-notes'::text)

Slide 25

Slide 25 text

· Don't fetch every column · Don't fetch every column Especially if you have big ones Especially if you have big ones

Slide 26

Slide 26 text

Less Downtime Less Downtime

Slide 27

Slide 27 text

· Watch out for locks · Watch out for locks While doing big operations (e.g. adding columns) While doing big operations (e.g. adding columns)

Slide 28

Slide 28 text

· Try not to remove columns · Try not to remove columns It's a lot easier for code to ignore new ones It's a lot easier for code to ignore new ones

Slide 29

Slide 29 text

· Have a warm failover · Have a warm failover But not automatic failover, unless you know why But not automatic failover, unless you know why

Slide 30

Slide 30 text

· Have a read-only mode · Have a read-only mode It means you never truly go down It means you never truly go down

Slide 31

Slide 31 text

Column Types Column Types

Slide 32

Slide 32 text

· Don't use text for everything · Don't use text for everything Especially numbers and dates Especially numbers and dates

Slide 33

Slide 33 text

· Store all times in UTC · Store all times in UTC Convert to and from in the code. Convert to and from in the code.

Slide 34

Slide 34 text

· Make columns NULLable · Make columns NULLable Rather than using blank strings. Rather than using blank strings.

Slide 35

Slide 35 text

· Think about custom types · Think about custom types Geometry, IP addresses, JSON... Geometry, IP addresses, JSON...

Slide 36

Slide 36 text

'NoSQL' 'NoSQL'

Slide 37

Slide 37 text

· Use non-relational wisely · Use non-relational wisely Otherwise you'll make everything slower Otherwise you'll make everything slower

Slide 38

Slide 38 text

· Have a single source of truth · Have a single source of truth So you can rebuild if something goes awry So you can rebuild if something goes awry

Slide 39

Slide 39 text

· Enforce constraints in code · Enforce constraints in code Before your data becomes useless Before your data becomes useless

Slide 40

Slide 40 text

· Postgres does schemaless too · Postgres does schemaless too hstore and now JSON column types hstore and now JSON column types

Slide 41

Slide 41 text

Basically... Basically... flickr.com/stuckincustoms flickr.com/stuckincustoms

Slide 42

Slide 42 text

· Less queries · Less queries

Slide 43

Slide 43 text

· Less queries · Less queries · Less JOINs · Less JOINs

Slide 44

Slide 44 text

· Less queries · Less queries · Less JOINs · Less JOINs · Better indexes · Better indexes

Slide 45

Slide 45 text

· Less queries · Less queries · Less JOINs · Less JOINs · Better indexes · Better indexes · (Faster servers) · (Faster servers)

Slide 46

Slide 46 text

Thanks. Thanks. Andrew Godwin / @andrewgodwin Andrew Godwin / @andrewgodwin flickr.com/oimax flickr.com/oimax