Slide 1

Slide 1 text

Andrew Godwin @andrewgodwin Rapid schema development with PostgreSQL

Slide 2

Slide 2 text

Django core developer Andrew Godwin Long-time PostgreSQL user Specialise in schemas and storage Hi! I'm

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

as you think Schemas aren't quite as fixed they are

Slide 6

Slide 6 text

Easy to change Don't need to be strict Lock-free ALTER TABLE CREATE INDEX CONCURRENTLY hstore json

Slide 7

Slide 7 text

hstore

Slide 8

Slide 8 text

hstore Key/Value store as a column type id title data 1 Object One model → "4563-a" 2 Smaller One model → "4563-a" scale → 0.43 3 Another One model → "874-b" scale → 2 engraving → "TEST"

Slide 9

Slide 9 text

hstore example CREATE EXTENSION hstore; CREATE TABLE widgets (id int, title text, data hstore); INSERT INTO widgets (id, text, data) VALUES (1, 'Object One', 'model => "8973-b", scale => 2'); SELECT * FROM widgets WHERE data->'scale' = '2'; id | title | data ----+------------+--------------------------------- 1 | Object One | "model"=>"8973-b", "scale"=>"2"

Slide 10

Slide 10 text

hstore example SELECT title, data->'model' AS model FROM widgets WHERE data ?| ARRAY['model', 'diagram']; title | model ------------+-------- Object One | 8973-b CREATE INDEX ON widgets USING gin (data); CREATE INDEX ON widgets ((data->'scale'));

Slide 11

Slide 11 text

json

Slide 12

Slide 12 text

json Structured JSON as a column type id title data 1 My Sofa {"colour": "brown"} 2 Your Sofa {"color"; "red", "weight": 2.6} 3 Their Sofa {"color": "blue", "weight": 0.8, "size": [4.5, 2.1, 1]}

Slide 13

Slide 13 text

json example CREATE TABLE sofas (id int, title text, data json); INSERT INTO sofas (id, title, data) VALUES (1, 'My Sofa', '{"colour": "brown", "size": [3, 4, 5]}'); SELECT title, data->>'colour' AS colour FROM sofas WHERE data->>'colour' = 'brown'; title | colour ---------+-------- My Sofa | brown

Slide 14

Slide 14 text

json example SELECT title, data->'size'->>0 AS length FROM sofas; title | length ---------+-------- My Sofa | 3 CREATE INDEX ON sofas ((data->>'colour'));

Slide 15

Slide 15 text

Code on the Inside

Slide 16

Slide 16 text

Embedded Languages pgSQL, Tcl, Perl, Python JavaScript, Ruby, Scheme, Java, PHP, more... INCLUDED ADDONS

Slide 17

Slide 17 text

Example CREATE FUNCTION add_rating(id int, rating int) RETURNS float AS $$ var data = plv8.execute("SELECT count, total, average FROM ratings WHERE id = $1", [id])[0]; var newTotal = data.total + rating; var newCount = data.count + 1; var newAverage = (newTotal / newCount); plv8.execute("UPDATE ratings SET count = $1, total = $2, average = $3 WHERE id = $4", [newCount, newTotal, newAverage, id]); return newAverage; $$ LANGUAGE plv8; CREATE TABLE ratings (id int, count int, total int, average float);

Slide 18

Slide 18 text

Example INSERT INTO ratings VALUES (1, 0, 0, 0); SELECT add_rating(1, 5); add_rating ------------ 5 SELECT add_rating(1, 4); add_rating ------------ 4.5

Slide 19

Slide 19 text

Why? No round trips between SELECT/UPDATE Separates logic from storage Don't need to install language on client Language choice Encapsulation Speed

Slide 20

Slide 20 text

Why not? Need tool to handle updating code No interactivity apart from SQL console Debugging Versioning

Slide 21

Slide 21 text

Some things are too far... CREATE FUNCTION render_template(template text, context json) RETURNS text AS $$ import json from jinja2 import Template tinstance = Template(template) return tinstance.render(json.loads(context)) $$ LANGUAGE plpythonu; SELECT render_template('hello {{name}}', '{"name": "world"}'); render_template ----------------- hello world

Slide 22

Slide 22 text

Schema changes without downtime

Slide 23

Slide 23 text

Lock-free ALTER TABLE Only certain actions, including: ∙ Adding a column with NULL ∙ Dropping a column ∙ Converting very similar types (varchar → text)

Slide 24

Slide 24 text

CREATE INDEX CONCURRENTLY ∙ Slower, but doesn't lock ∙ Not allowed inside transactions ∙ Can leave unfinished indexes in error cases

Slide 25

Slide 25 text

Bringing it all together

Slide 26

Slide 26 text

Hybrid Schemas id title data 1 An Article text → "Article text" layout → "3-col" slug an-article common, indexed freeform, changed in bulk

Slide 27

Slide 27 text

Zero-downtime changes add NULL column add code populate remove code remove column use schemaless

Slide 28

Slide 28 text

Schema Schemaless Faster querying Data integrity Slightly too easy to JOIN Faster prototyping Cleaner for sparse data Encourages messy code

Slide 29

Slide 29 text

Version your schemas! (especially for schemaless stuff)

Slide 30

Slide 30 text

Fill in data in the background Either at the database level or your code

Slide 31

Slide 31 text

Put code in the database! But not too much. Only small single actions.

Slide 32

Slide 32 text

Experiment. You can install all this in < 1 minute.

Slide 33

Slide 33 text

Thanks! Andrew Godwin @andrewgodwin [email protected]