"UPSERT" in PostgreSQL

“UPSERT” in PostgreSQL Peter Geoghegan Pronounced / e n/ ˈɡ
ɪɡə @petervgeoghegan SFPUG Tuesday, May 26, 2015

What is “UPSERT”? • INSERT row, or UPDATE existing one
to reconcile it with what you wanted to INSERT. Always one or the other of those two. No “ifs”, no “buts”. • Coming in PostgreSQL 9.5. • 9.5 also adds a DO NOTHING variant that never locks row. MySQL calls this INSERT IGNORE. • You more or less don't have to worry about concurrency/race conditions with the feature.

What is “UPSERT”? (cont.) • “Fundamental UPSERT property”. At READ
COMMITTED isolation level, you should always get an insert or update. – No unprincipled deadlocking. – No spurious unique constraint violations. • Only way to really ensure this is to make it driven by insert. Take alternative path (i.e., go update existing row) when would-be duplicate violation detected. • Particularly useful for OLTP + web apps. Great for ETL, too.

Syntax “UPSERT” some distributors: INSERT INTO distributors (did, dname)
VALUES (5, 'Gizmo transglobal'), (6, 'Associated Computing, inc') ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname; “INSERT IGNORE” variant: INSERT INTO distributors (did, dname) VALUES (7, 'Redline GmbH') ON CONFLICT (did) DO NOTHING; Consider not updating existing distributor: INSERT INTO distributors (did, dname)AS d VALUES (8, 'Anvil Distribution') ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname; WHERE d.zipcode != '21201';

EXCLUDED.* pseudo table “UPSERT” some distributors: INSERT INTO distributors
(did, dname) VALUES (5, 'Gizmo transglobal'), (6, 'Associated Computing, inc') ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname; “INSERT IGNORE” variant: INSERT INTO distributors (did, dname) VALUES (7, 'Redline GmbH') ON CONFLICT (did) DO NOTHING; Consider not updating existing distributor: INSERT INTO distributors (did, dname)AS d VALUES (8, 'Anvil Distribution') ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname; WHERE d.zipcode != '21201';

Inference Specification “UPSERT” some distributors: INSERT INTO distributors (did,
dname) VALUES (5, 'Gizmo transglobal'), (6, 'Associated Computing, inc') ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname; “INSERT IGNORE” variant: INSERT INTO distributors (did, dname) VALUES (7, 'Redline GmbH') ON CONFLICT (did) DO NOTHING; Consider not updating existing distributor: INSERT INTO distributors (did, dname)AS d VALUES (8, 'Anvil Distribution') ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname; WHERE d.zipcode != '21201';

More on inference “UPSERT” some distributors: INSERT INTO distributors
(did, dname) VALUES (5, 'Gizmo transglobal'), (6, 'Associated Computing, inc') ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname; • Must specify which unique index to take update action on. • Prefer to do this with inference clause, naming relevant column (or columns, or expressions, ...). • Alternatively, ON CONFLICT can name constraint directly, but avoid it (Okay for exclusion constraints, but not unique constraints). • Very flexible and forgiving. Borrows stuff from CREATE INDEX. • Optimizer throws error if it cannot find satisfactory unique index.

EXPLAIN 'ing an UPSERT postgres=# EXPLAIN INSERT INTO upsert (key,
val) VALUES(1, 'Baz') ON CONFLICT (key) DO UPDATE SET val = EXCLUDED.val WHERE EXCLUDED.val <> 'Contrived'; QUERY PLAN Insert on upsert (cost=0.00..0.01 rows=1 width=0) Conflict Resolution: UPDATE Conflict Arbiter Indexes: upsert_pkey Conflict Filter: (excluded.val <> 'Contrived'::text) > Result (cost=0.00..0.01 rows=1 width=0) (5 rows)

Why not SQL MERGE? • SQL standard MERGE statement implemented
by a few other systems. • Synchronizes two tables, inserting, updating, and deleting as it goes. Individual MERGE statement may lack WHEN NOT MATCHED THEN INSERT “handler”. In contrast, Postgres UPSERT is “driven by INSERT”. • Source table could be VALUES(), so appears UPSERT-like. • Lacks “fundamental UPSERT property”, because can (for example) raise duplicate violations for simple UPSERT cases. • ON CONFLICT DO UPDATE has some amount of MERGE-like flexibility. • Idea that MERGE is equivalent to UPSERT is unfortunate myth.

What feature needed to work with (incomplete list) • Data-modifying
common table expressions (WITH clauses). • And relatedly, RETURNING (projects updated rows, too). • Exclusion constraints (for DO NOTHING only). • Updatable views. • postgres_fdw (and foreign data wrappers generally). • And so on... – New row-level security stuff. – Older column privileges stuff. – User tables named “excluded”! – Inheritance. – Subqueries in nested SET list, and nested WHERE clause. • Simple. • Correlated. – Triggers. – Rules. – Partial indexes (for inference clause). – Alternative collations, non-default opclasses in inference clause (purely theoretical requirement).

Summary • UPSERT has very flexible syntax – far more
flexible (and far less error prone) than comparable MySQL feature, for example. • Complicated to implement, but not complicated to use. Involved small revision to chapter in documentation on MVCC rules, for example. • Very useful for ETL, too. Can resolve to not update based on what is found to already exist, so some MERGE-like capabilities here. • Unlike MERGE, has no race conditions. No duplicate violations from arbiter unique index. No “unprincipled deadlocks”. • However, unlike MERGE, inserting is one possible outcome that you must always be happy with.

"UPSERT" in PostgreSQL

"UPSERT" in PostgreSQL

Peter Geoghegan

More Decks by Peter Geoghegan

Other Decks in Programming

Featured

Transcript

“UPSERT” in PostgreSQL Peter Geoghegan Pronounced / e n/ ˈɡ

What is “UPSERT”? • INSERT row, or UPDATE existing one

What is “UPSERT”? (cont.) • “Fundamental UPSERT property”. At READ

Syntax “UPSERT” some distributors: INSERT INTO distributors (did, dname)

EXCLUDED.* pseudo table “UPSERT” some distributors: INSERT INTO distributors

Inference Specification “UPSERT” some distributors: INSERT INTO distributors (did,

More on inference “UPSERT” some distributors: INSERT INTO distributors

EXPLAIN 'ing an UPSERT postgres=# EXPLAIN INSERT INTO upsert (key,

Why not SQL MERGE? • SQL standard MERGE statement implemented

What feature needed to work with (incomplete list) • Data-modifying

Summary • UPSERT has very flexible syntax – far more