Rapid schema development with PostgreSQL

Andrew Godwin @andrewgodwin Rapid schema development with PostgreSQL

Django core developer Andrew Godwin Long-time PostgreSQL user Specialise in
schemas and storage Hi! I'm

as you think Schemas aren't quite as fixed they are

Easy to change Don't need to be strict Lock-free ALTER
TABLE CREATE INDEX CONCURRENTLY hstore json

hstore

hstore Key/Value store as a column type id title data
1 Object One model → "4563-a" 2 Smaller One model → "4563-a" scale → 0.43 3 Another One model → "874-b" scale → 2 engraving → "TEST"

hstore example CREATE EXTENSION hstore; CREATE TABLE widgets (id int,
title text, data hstore); INSERT INTO widgets (id, text, data) VALUES (1, 'Object One', 'model => "8973-b", scale => 2'); SELECT * FROM widgets WHERE data->'scale' = '2'; id | title | data ----+------------+--------------------------------- 1 | Object One | "model"=>"8973-b", "scale"=>"2"

hstore example SELECT title, data->'model' AS model FROM widgets WHERE
data ?| ARRAY['model', 'diagram']; title | model ------------+-------- Object One | 8973-b CREATE INDEX ON widgets USING gin (data); CREATE INDEX ON widgets ((data->'scale'));

json Structured JSON as a column type id title data
1 My Sofa {"colour": "brown"} 2 Your Sofa {"color"; "red", "weight": 2.6} 3 Their Sofa {"color": "blue", "weight": 0.8, "size": [4.5, 2.1, 1]}

json example CREATE TABLE sofas (id int, title text, data
json); INSERT INTO sofas (id, title, data) VALUES (1, 'My Sofa', '{"colour": "brown", "size": [3, 4, 5]}'); SELECT title, data->>'colour' AS colour FROM sofas WHERE data->>'colour' = 'brown'; title | colour ---------+-------- My Sofa | brown

json example SELECT title, data->'size'->>0 AS length FROM sofas; title
| length ---------+-------- My Sofa | 3 CREATE INDEX ON sofas ((data->>'colour'));

Code on the Inside

Embedded Languages pgSQL, Tcl, Perl, Python JavaScript, Ruby, Scheme, Java,
PHP, more... INCLUDED ADDONS

Example CREATE FUNCTION add_rating(id int, rating int) RETURNS float AS
$$ var data = plv8.execute("SELECT count, total, average FROM ratings WHERE id = $1", [id])[0]; var newTotal = data.total + rating; var newCount = data.count + 1; var newAverage = (newTotal / newCount); plv8.execute("UPDATE ratings SET count = $1, total = $2, average = $3 WHERE id = $4", [newCount, newTotal, newAverage, id]); return newAverage; $$ LANGUAGE plv8; CREATE TABLE ratings (id int, count int, total int, average float);

Example INSERT INTO ratings VALUES (1, 0, 0, 0); SELECT
add_rating(1, 5); add_rating ------------ 5 SELECT add_rating(1, 4); add_rating ------------ 4.5

Why? No round trips between SELECT/UPDATE Separates logic from storage
Don't need to install language on client Language choice Encapsulation Speed

Why not? Need tool to handle updating code No interactivity
apart from SQL console Debugging Versioning

Some things are too far... CREATE FUNCTION render_template(template text, context
json) RETURNS text AS $$ import json from jinja2 import Template tinstance = Template(template) return tinstance.render(json.loads(context)) $$ LANGUAGE plpythonu; SELECT render_template('hello {{name}}', '{"name": "world"}'); render_template ----------------- hello world

Schema changes without downtime

Lock-free ALTER TABLE Only certain actions, including: ∙ Adding a
column with NULL ∙ Dropping a column ∙ Converting very similar types (varchar → text)

CREATE INDEX CONCURRENTLY ∙ Slower, but doesn't lock ∙ Not
allowed inside transactions ∙ Can leave unfinished indexes in error cases

Bringing it all together

Hybrid Schemas id title data 1 An Article text →
"Article text" layout → "3-col" slug an-article common, indexed freeform, changed in bulk

Zero-downtime changes add NULL column add code populate remove code
remove column use schemaless

Schema Schemaless Faster querying Data integrity Slightly too easy to
JOIN Faster prototyping Cleaner for sparse data Encourages messy code

Version your schemas! (especially for schemaless stuff)

Fill in data in the background Either at the database
level or your code

Put code in the database! But not too much. Only
small single actions.

Experiment. You can install all this in < 1 minute.

Thanks! Andrew Godwin @andrewgodwin [email protected]

Rapid schema development with PostgreSQL

Rapid schema development with PostgreSQL

Andrew Godwin

More Decks by Andrew Godwin

Other Decks in Programming

Featured

Transcript

Andrew Godwin @andrewgodwin Rapid schema development with PostgreSQL

Django core developer Andrew Godwin Long-time PostgreSQL user Specialise in

as you think Schemas aren't quite as fixed they are

Easy to change Don't need to be strict Lock-free ALTER

hstore

hstore Key/Value store as a column type id title data

hstore example CREATE EXTENSION hstore; CREATE TABLE widgets (id int,

hstore example SELECT title, data->'model' AS model FROM widgets WHERE

json

json Structured JSON as a column type id title data

json example CREATE TABLE sofas (id int, title text, data

json example SELECT title, data->'size'->>0 AS length FROM sofas; title

Code on the Inside

Embedded Languages pgSQL, Tcl, Perl, Python JavaScript, Ruby, Scheme, Java,

Example CREATE FUNCTION add_rating(id int, rating int) RETURNS float AS

Example INSERT INTO ratings VALUES (1, 0, 0, 0); SELECT

Why? No round trips between SELECT/UPDATE Separates logic from storage

Why not? Need tool to handle updating code No interactivity

Some things are too far... CREATE FUNCTION render_template(template text, context

Schema changes without downtime

Lock-free ALTER TABLE Only certain actions, including: ∙ Adding a

CREATE INDEX CONCURRENTLY ∙ Slower, but doesn't lock ∙ Not

Bringing it all together

Hybrid Schemas id title data 1 An Article text →

Zero-downtime changes add NULL column add code populate remove code

Schema Schemaless Faster querying Data integrity Slightly too easy to

Version your schemas! (especially for schemaless stuff)

Fill in data in the background Either at the database

Put code in the database! But not too much. Only

Experiment. You can install all this in < 1 minute.

Thanks! Andrew Godwin @andrewgodwin [email protected]