Slide 1

Slide 1 text

ANDREW GODWIN // @andrewgodwin T H E WONDERFUL DATABASES - O F - W O R L D

Slide 2

Slide 2 text

?

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

=

Slide 7

Slide 7 text

! Databases are important

Slide 8

Slide 8 text

Journaling / WAL Transactions Schema changes JOINs

Slide 9

Slide 9 text

Journaling / WAL Database Disk

Slide 10

Slide 10 text

Journaling / WAL Database OS Disk

Slide 11

Slide 11 text

Journaling / WAL Database OS Disk Controller Disk Platter

Slide 12

Slide 12 text

Journaling / WAL Database OS Disk Controller Disk Platter RAID controller

Slide 13

Slide 13 text

Journaling / WAL Database OS Network card Switch Network card

Slide 14

Slide 14 text

Transactions Read Write time → Read Write Read Read t₁ t₂ t₃

Slide 15

Slide 15 text

Transactions Read Write time → Read Write Read Read t₁ t₂ t₃

Slide 16

Slide 16 text

Altering Schema Schema Data ?

Slide 17

Slide 17 text

Altering Schema Schema Data ? ?

Slide 18

Slide 18 text

Altering Schema Schema Data ?

Slide 19

Slide 19 text

JOINs

Slide 20

Slide 20 text

JOINs

Slide 21

Slide 21 text

Table A Table B

Slide 22

Slide 22 text

Slide 23

Slide 23 text

n² Hash Joins Merge Joins Index scans

Slide 24

Slide 24 text

But wait! There's more!

Slide 25

Slide 25 text

Asynchronous Calls import psycopg2 import eventlet eventlet.monkey_patch(psycopg2=True) cur1 = psycopg2.connect().cursor() cur2 = psycopg2.connect().cursor() cur1.execute("BIG SELECT") cur2.execute("ANOTHER SELECT") results1 = cur1.fetchall() results2 = cur2.fetchall()

Slide 26

Slide 26 text

Document Stores • Easy to add data • You still need schema • Large updates can be slow

Slide 27

Slide 27 text

Distributed Databases Consistency Availability Partition Tolerance (pick any two)

Slide 28

Slide 28 text

Performance Tips

Slide 29

Slide 29 text

Don't use MySQL Really, don't.

Slide 30

Slide 30 text

Play to strengths Do search in a search engine. Don't put relational in MongoDB.

Slide 31

Slide 31 text

Use indexes! EXPLAIN is your friend.

Slide 32

Slide 32 text

# explain select * from aeblog_post where id = 50; QUERY PLAN --------------------------------------------------- Index Scan using aeblog_post_pkey on aeblog_post (cost=0.00..8.27 rows=1 width=1812) Index Cond: (id = 50)

Slide 33

Slide 33 text

# explain select * from aeblog_post where section = 'trip-notes'; QUERY PLAN --------------------------------------------------- Seq Scan on aeblog_post (cost=0.00..14.21 rows=1 width=1812) Filter: ((section)::text = 'trip-notes'::text)

Slide 34

Slide 34 text

Analytics. Everywhere. Slow queries. Number of queries.

Slide 35

Slide 35 text

Avoid giant IN queries Seriously. A JOIN is better.

Slide 36

Slide 36 text

Think. Learn.

Slide 37

Slide 37 text

Fin. Andrew Godwin @andrewgodwin http://aeracode.org Thanks.