Concurrency in Postgres

A talk that examines how Postgres handles concurrency issues, and how Postgres 9.3 improves the situation surrounding foreign key locks.

(See http://postgresopen.org/2013/schedule/presentations/366/)

Peter Geoghegan

September 20, 2013

Transcript

  1. 1/43 Concurrency in Postgres Peter Geoghegan @sternocera

  2. 2/43 About me
    • Postgres major contributor – to date, have mostly worked on performance features like group commit and faster in-memory sorting
    • Also worked on pg_stat_statements (9.2 normalization stuff)
    • Work for Heroku as an engineer in the Department of Data – build internal infrastructure for Postgres database-as-a-service
  3. 3/43 Brief plug: pgConf.EU
    • I'm an organizer
    • In my hometown: Dublin, Ireland
    • Direct flights from Chicago!
  4. 4/43 What is this talk about?
    • More topics than it is sensible to discuss in a single hour – many things covered in passing could easily justify their own hour
    • Aspects of concurrency like locking, transaction isolation levels and MVCC, and issues surrounding concurrency for application developers
    • Mostly, how the database handles things so you often don't have to give concurrency issues any thought
    • Sometimes you do, though, so it's useful to at least be able to judge if it's one of those times
    • Ongoing development work, and other topical stuff
  5. 5/43 Two-phase locking (2PL)

  6. 6/43 Two-phase locking (2PL)
    • Two-phase locking (2PL) concurrency model. Used by IBM DB2 and SQL Server (strictly speaking they use SS2PL). Writers and readers may block each other. Can perform poorly, but behaviour is at least easy to reason about. Pessimistic. Locks more. Tacitly assumed by the SQL standard, which is occasionally evident in subtle ways. Classic model from the 1970s (System R).
    • 2PL seems really weird to me, because you frequently have situations where transactions block on writing data, just because existing data has been “sullied” by being read by some other still-current transaction. And vice-versa.
    • Big banks love 2PL!
  7. 7/43 MVCC (Multi-version concurrency control)

  8. 8/43 MVCC
    • MVCC (Multi-version concurrency control). Popular and influential concurrency model used by Postgres, Oracle, and many other systems.
    • Multiple versions of rows are stored at the same time. Each session has a snapshot which dictates which versions are visible to it. Readers don't block writers, and writers don't block readers. Locks much less than 2PL, but locks are still frequently taken (usually far lighter locks).
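One way to watch row versions being created is through the hidden system columns that every Postgres table carries. The following is a minimal sketch; the table `mvcc_demo` and its columns are hypothetical and not part of the talk.

```sql
-- Hypothetical demo table; name and columns are assumptions for illustration.
CREATE TABLE mvcc_demo (id int PRIMARY KEY, val text);
INSERT INTO mvcc_demo VALUES (1, 'first');

-- xmin is the ID of the transaction that created this row version.
-- xmax is 0 for a row version that nobody has tried to update, delete, or lock.
SELECT xmin, xmax, id, val FROM mvcc_demo;

-- An UPDATE writes a new row version instead of overwriting the old one,
-- so the visible version now carries the updating transaction's ID in xmin.
UPDATE mvcc_demo SET val = 'second' WHERE id = 1;
SELECT xmin, xmax, id, val FROM mvcc_demo;
```

The old version stays on disk until no snapshot can still see it, which is what lets readers carry on without blocking the writer.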
  9. 9/43 Locking (table-level locks)
    • Postgres frequently takes locks implicitly – it is generally fairly rare to have to take them explicitly yourself, but it can be done with SQL.
    • These locks last until end of transaction (with some obscure exceptions).
    • Just reading from a table (or an index) takes a very light lock on it. This just prevents some other session from dropping the table, for example (since that needs a very heavy lock that blocks all other locks).
    • Full taxonomy of Postgres locks at: http://www.postgresql.org/docs/current/static/explicit-locking.html
    • Includes useful conflict table – some don't block others.
    • Some locks block themselves (across multiple sessions)
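Taking a table-level lock explicitly might look like the sketch below. The `accounts` table is a hypothetical stand-in, and the lock mode is just one choice from the conflict table in the documentation linked above.

```sql
-- Hypothetical table, created here only so the example is self-contained.
CREATE TABLE IF NOT EXISTS accounts (id int PRIMARY KEY, balance numeric);

BEGIN;
-- SHARE ROW EXCLUSIVE conflicts with itself and with the ROW EXCLUSIVE lock
-- that INSERT, UPDATE and DELETE take, but not with the ACCESS SHARE lock
-- taken by a plain SELECT. Other writers wait; readers carry on.
LOCK TABLE accounts IN SHARE ROW EXCLUSIVE MODE;

-- ... check and modify the table knowing no other session can write to it ...

COMMIT;  -- the lock is released here, at end of transaction
```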
  10. 10/43 Locking (cont'd)
    • Locks held across the system are viewable in the pg_locks system view – it can be joined against pg_stat_activity to show the query involved and so on.
    • https://wiki.postgresql.org/wiki/Lock_Monitoring A useful resource for locking problems.
    • https://wiki.postgresql.org/wiki/Lock_dependency_information For lock dependencies.
    • Locks are automatically taken on newly inserted rows, or on a row when it is updated or deleted.
    • Unique constraints (indexes) sometimes block pending the outcome of other transactions. Sometimes the system can't be sure that your value (say, 5) really violates the constraint until the other session (that inserted the same value first) commits or aborts.
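The join mentioned above can be sketched as follows. This assumes Postgres 9.2 or later, where pg_stat_activity exposes `pid` and `query`; older releases name those columns `procpid` and `current_query`.

```sql
-- Show lock requests that are currently waiting, with the query behind each.
SELECT l.locktype,
       l.relation::regclass AS relation,  -- NULL for locks not on a relation
       l.mode,
       l.granted,
       a.pid,
       a.query
FROM pg_locks AS l
JOIN pg_stat_activity AS a ON a.pid = l.pid
WHERE NOT l.granted;
```

Dropping the `WHERE` clause lists every lock held or awaited, which is usually the first step in working out who is blocking whom.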
  11. 11/43 Other locks
    • System also takes other, lower-level locks that are not exposed to users.
    • For example, the btree code uses a technique sometimes called “latching” to ensure correct concurrent access. Takes tiny page-level locks for an instant (not duration of transaction) so that the data structure can be accessed correctly and efficiently by different sessions – no one gets to see a half-written page, for example.
    • Totally orthogonal to table-level locks – people often confuse the two.
  12. 12/43 Transaction isolation levels: Serializable, Repeatable read, Read committed

  13. 13/43 Transaction isolation
    • One of the most important jobs of Postgres is to let you pretend that your transaction is the only one running at the time. It makes your life much easier as an application developer if you don't have to worry about things disappearing from under you.
    • This is also the job of locks, which work pretty much the same at all isolation levels.
    • Transaction isolation is essentially the lengths that the database goes to in order to fool your application. In general, the database has to work harder for it to be more convincing so that you can worry less about getting some aspect of concurrency wrong.
    • Which particular trade-off is right for your particular application or even individual transaction is up to you.
    • In my experience, many application developers don't know what this is. Sometimes they pay for this, but mostly things accidentally work out fine.
    • On the other hand, if you listen to certain database geeks, they'd tell you that you'd be crazy to not use higher isolation levels.
  14. 14/43 Read committed (default)
    • Fine for many purposes.
    • Each statement executed sees rows that are consistent with its snapshot. Each statement gets a new snapshot.
    • Sometimes, you can have some anomalies occur where in the course of a transaction (but not during the execution of a single statement) you see rows (more strictly speaking, row versions) that differ from earlier in the transaction. These anomalies are called nonrepeatable reads and phantom reads.
    • You can only see data committed by other transactions, and never uncommitted data (“dirty data”) from another transaction.
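A nonrepeatable read under read committed can be reproduced with two psql sessions, as in the sketch below. The `stock` table is hypothetical, and the session labels in the comments mark which connection runs each statement.

```sql
-- Setup (hypothetical table, run once).
CREATE TABLE stock (item text PRIMARY KEY, qty int);
INSERT INTO stock VALUES ('widget', 10);

-- Session A:
BEGIN;                                        -- READ COMMITTED is the default
SELECT qty FROM stock WHERE item = 'widget';  -- sees 10

-- Session B (autocommit), while session A's transaction is still open:
UPDATE stock SET qty = 7 WHERE item = 'widget';

-- Session A again: a new statement gets a new snapshot.
SELECT qty FROM stock WHERE item = 'widget';  -- now sees 7
COMMIT;
```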
  15. 15/43 Repeatable read
    • Level up from read committed. This was called “serializable” in 9.0 and earlier, and Oracle implements the same semantics and calls it serializable, just as Postgres used to.
    • This is kind of similar to read committed, except you get one snapshot per transaction, not per statement. So you never see data that someone else has committed since the start of your transaction.
    • There is one big downside/difference (aside from some implications for performance): Postgres is now allowed to throw serialization errors if it cannot give you behavior consistent with the set of guarantees you asked for. This is only an issue with write statements when two transactions UPDATE or DELETE the same data concurrently – one “wins” (unless the lock-holding transaction aborts, in which case the conflict is averted).
  16. 16/43 Repeatable read (cont'd)
    • The really compelling case for repeatable read is when doing reporting, but it is certainly not uncommon to use it all the time.
    • Suppose you have a typical financial report with rows for sales across different departments or something, but also some totals at the bottom. Maybe the totals have to be queried separately by a different command.
    • If you did this against live data (i.e. other transactions modified the data and committed), and didn’t use at least repeatable read isolation level in Postgres, you might get a situation where the numbers don’t add up!
    • With repeatable read, that wouldn’t happen because both queries would use the same snapshot.
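The reporting case could be sketched like this. The `sales` table and its columns are hypothetical, chosen only to mirror the departments-plus-totals report described above.

```sql
-- Hypothetical reporting table.
CREATE TABLE sales (department text, amount numeric);

BEGIN ISOLATION LEVEL REPEATABLE READ;

-- The snapshot is taken when the first query runs ...
SELECT department, sum(amount) AS department_total
FROM sales
GROUP BY department
ORDER BY department;

-- ... and reused here, so the grand total always equals the sum of the
-- department totals, even if other sessions commit new sales in between.
SELECT sum(amount) AS grand_total FROM sales;

COMMIT;
```

A read-only report like this cannot hit the serialization errors mentioned earlier, because those only arise from conflicting writes.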
  17. 17/43 Brief(-ish) aside: Write skew anomalies

  18. 18/43 What's a write skew anomaly?
    • Remember how I said repeatable read’s behavior used to be called Serializable by Postgres, and what Oracle still calls serializable is equivalent (snapshot isolation)?
    • Well, this is free from all the anomalies that the SQL standard says serializable can’t have...but is it really serializable, in the general, everyday sense? In other words, can you rely on things just working as if your transaction was the only one (with maybe some blocking)?
  19. 19/43 Write skew anomalies (cont'd)
    • Will things always happen in a way that makes it look exactly like transactions occurred independently in a serial order to everyone?
    • Can you test your app with a single session and be confident it's correct?
    • It turns out that the answer is no!
  20. 20/43 Write skew anomalies (cont'd)
    • Recall that with MVCC, readers don’t block writers and writers don’t block readers.
    • Remember how I said banks like 2PL databases like DB2?
    • I didn’t just mean because it’s simple and easy to reason about, though that’s probably part of it.
    • Because 2PL can have select statements block on some writes in other transactions, it can’t have write-skew anomalies.
  21. 21/43 Write skew anomalies (cont'd)
    • Actually, write skews are pretty simple.
    • It turns out that 2PL systems might not be so dumb for considering data that another running transaction saw as “sullied”...
    • Still pretty dumb, though. Certainly, this doesn't matter 99%+ of the time.
    • But it will matter to you someday.
  22. 22/43 A write skew anomaly in action
    • I took this example from the Snapshot Isolation Wikipedia page: http://en.wikipedia.org/wiki/Snapshot_isolation
    • Consider the example of a person with two bank accounts at the same bank. The bank lets the person have a negative balance in one if the balance in the other is enough that in aggregate the person has a non-negative balance (>= 0).
    • So the application checks if the constraint holds, and lets the transaction go through. If not, it aborts with an error for the user.
    • What happens if two requests come in at approximately the same time with snapshot isolation?
  23. 23/43 Transactions involved in skew
    Transaction 1:
      BEGIN; (takes snapshot)
      checks account 1; $50
      checks account 2; $0
      Deducts $50 from account 2, leaving its balance at -$50
      Thinks aggregate balance is $0.
      COMMIT;
    Transaction 2 (running concurrently):
      BEGIN; (takes snapshot)
      checks account 1; $50
      checks account 2; $0
      Deducts $50 from account 1, leaving its balance at $0
      Thinks aggregate balance is $0.
      COMMIT;
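In SQL, the two transactions on this slide might look like the sketch below. The `accounts` table, its columns, and the owner name are hypothetical, and the session labels in the comments mark which connection runs each statement.

```sql
-- Setup (hypothetical table): account 1 holds $50, account 2 holds $0.
CREATE TABLE accounts (id int PRIMARY KEY, owner text, balance numeric);
INSERT INTO accounts VALUES (1, 'alice', 50), (2, 'alice', 0);

-- Session 1:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT sum(balance) FROM accounts WHERE owner = 'alice';  -- 50, so allow it
UPDATE accounts SET balance = balance - 50 WHERE id = 2;

-- Session 2, before session 1 commits:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT sum(balance) FROM accounts WHERE owner = 'alice';  -- 50 in its own snapshot
UPDATE accounts SET balance = balance - 50 WHERE id = 1;  -- different row, no blocking

-- Session 1:
COMMIT;
-- Session 2:
COMMIT;

-- Afterwards, from any session: the rule "aggregate >= 0" has been broken.
SELECT sum(balance) FROM accounts WHERE owner = 'alice';  -- -50
```

Neither transaction saw anything inconsistent in its own snapshot; the violation only exists in the combined result.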
  24. 24/43 Write skew anomalies (cont'd)
    • If there was only one bank account, or they happened to try to deduct from the same account, a lock implicitly acquired by UPDATE would have saved us here. Hence, the “skew”.
    • So what do I do? Accept this as the cost of having MVCC, and its guarantee that readers don't block writers and vice-versa? Artificially introduce a write dependency?
    • Even read-only transactions can be affected by these problems with snapshot isolation.
    • I'm very conservative when it comes to my data...should I think about a 2PL system instead?
    • No, just use Postgres. Version 9.1+ has something called Serializable Snapshot Isolation (SSI). This is what you get when you ask for serializable level on these versions.
  25. 25/43 Back to isolation levels: Serializable
    • “Serializable snapshot isolation”
    • Based on recent research. Postgres is the world's first implementation.
    • Sometimes SSI (like repeatable read) throws serialization failures. You have to retry the transaction.
    • It's much better than 2PL; it doesn't just throw failures where 2PL would block. But it is still slightly conservative in that sometimes it throws errors when not strictly necessary; it's clever, but not magic.
    • This system has Postgres keep track of read/write dependencies across transactions that cannot otherwise know what each other transaction is doing.
    • Fully equivalent to some serial execution order – doesn't promise which one.
    • Works okay with one session? Automatically works with many (if you can handle serialization conflicts).
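To see SSI reject a write skew, the sketch below runs two overlapping transactions at the serializable level on Postgres 9.1+. The `ssi_accounts` table is hypothetical, and exactly which statement raises the error depends on timing, so the comments only say that one of the two sessions fails.

```sql
-- Setup (hypothetical table): two accounts whose combined balance must stay >= 0.
CREATE TABLE ssi_accounts (id int PRIMARY KEY, balance numeric);
INSERT INTO ssi_accounts VALUES (1, 50), (2, 0);

-- Session 1:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT sum(balance) FROM ssi_accounts;                        -- 50
UPDATE ssi_accounts SET balance = balance - 50 WHERE id = 2;

-- Session 2, before session 1 commits:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT sum(balance) FROM ssi_accounts;                        -- 50
UPDATE ssi_accounts SET balance = balance - 50 WHERE id = 1;

-- Session 1:
COMMIT;
-- Session 2:
COMMIT;
-- One of the two sessions fails, at its UPDATE or at its COMMIT, with
-- SQLSTATE 40001:
--   ERROR:  could not serialize access due to read/write dependencies
--           among transactions
-- The application should roll back and rerun that whole transaction.
```

On retry, the failed transaction sees the other one's committed deduction, computes an aggregate of $0, and refuses the withdrawal.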
  26. 26/43 Race conditions

  27. 27/43 About race conditions
    • Race conditions occur when there is a tacit assumption that things execute in a certain order relative to each other, when in fact there is nothing to guarantee that’s the case.
    • Write-skew anomaly is a race condition.
    • Locks will get you pretty far when it comes to avoiding these.
    • A big offender for races is a trigger that enforces a business rule. “When you insert a new tuple, this other condition about this other table must hold; otherwise, abort transaction”.
  28. 28/43 Business rule enforcing triggers
    • Don’t do this if you can avoid it.
    • Lock the whole table first (obvious disadvantages).
    • Use a declarative constraint instead. With Postgres, this is possible surprisingly often. For example, exclusion constraints solve many problems that other systems need these kinds of triggers for. It’s always a big bonus to be able to do this, because it will perform better than ad-hoc methods, and you don’t have to worry about the correctness of your own implementation.
    • Always use SSI, if you can live with the impact on performance and don’t mind handling serialization failures.
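As an illustration of the declarative route, the sketch below uses an exclusion constraint to enforce "no overlapping bookings for the same room", a rule that would otherwise need a race-prone trigger. The table and column names are hypothetical; it assumes Postgres 9.2+ for range types and the btree_gist extension from contrib.

```sql
-- btree_gist lets a GiST index handle plain equality on an integer column.
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Hypothetical booking table.
CREATE TABLE room_booking (
    room_id int NOT NULL,
    during  tsrange NOT NULL,
    -- No two rows may have the same room_id AND overlapping time ranges.
    EXCLUDE USING gist (room_id WITH =, during WITH &&)
);

INSERT INTO room_booking VALUES (1, '[2013-09-20 10:00, 2013-09-20 11:00)');

-- Overlaps the row above for the same room, so it is rejected with an
-- exclusion constraint violation (SQLSTATE 23P01).
INSERT INTO room_booking VALUES (1, '[2013-09-20 10:30, 2013-09-20 11:30)');
```

The constraint is enforced correctly under concurrency at every isolation level, without the application taking any locks itself.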
  29. 29/43 Deadlocks

  30. 30/43 What's a deadlock?
    • Postgres always detects when transactions deadlock - when one transaction acquires a lock that another blocks on acquiring, while the blocked one itself has locks that the first transaction ends up itself requiring. Nothing can proceed - it’s a “deadly embrace”. In this scenario, Postgres randomly has one of the two transactions abort after a second or so.
    • The key to avoiding this situation is to acquire locks - including implicit locks from UPDATEs, or locks on values held when inserting into a table with a unique index constraint - in a consistent order.
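A consistent lock order can be as simple as always touching rows in ascending key order. The sketch below uses a hypothetical `ledger` table and a money transfer as the example.

```sql
-- Hypothetical table for the example.
CREATE TABLE ledger (id int PRIMARY KEY, balance numeric);
INSERT INTO ledger VALUES (1, 100), (2, 100);

-- Deadlock-prone pattern: one session updates id 1 then id 2 while another
-- updates id 2 then id 1. Each ends up holding the row lock the other needs.

-- Safer pattern: every session updates rows in ascending id order, whichever
-- direction the money flows, so the second session simply waits for the first.
BEGIN;
UPDATE ledger SET balance = balance - 10 WHERE id = 1;  -- lower id first
UPDATE ledger SET balance = balance + 10 WHERE id = 2;  -- higher id second
COMMIT;
```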
  31. 31/43 Improvements in 9.3
    • In Postgres, foreign keys are implemented internally as triggers. They acquire locks on referenced (or referencing in the case of the foreign table) rows.
    • This works much better in 9.3.
    • You need referenced rows to stick around for your whole transaction when inserting, updating or deleting.
    • These triggers use special “dirty snapshots” to see uncommitted data...so the general “trigger enforces business rule” caveats don't quite apply.
    • No, you do not want to use dirty snapshots for your app directly.
    • I guess you could if you really wanted to, and felt like writing your triggers in C, though. This is Postgres.
  32. 32/43 Foreign keys

  33. 33/43 More granular foreign key locks
    • There has been major work in this area in 9.3, which took a great deal of effort.
    • In 9.2, the foreign key triggers acquire SELECT FOR SHARE locks on rows (it actually used to be SELECT FOR UPDATE many years ago).
    • The type of lock involved doesn’t block other locks of the same kind, so you can insert a value referencing the same row in two concurrent transactions. But the lock taken does block some other locks, like those taken by UPDATE statements.
    • This is all needed for correct behavior; the transaction needs to do the rest of its work while knowing that the other row won’t go away. Everything it does is predicated on the row actually being around.
  34. 34/43 More granular foreign key locks (cont'd)
    • However, do UPDATEs really prevent the row from continuing to be around in any real sense?
    • Not usually - if you updated every single column in the row, you might consider that a whole new row, and then the lock enforcement makes sense. But those UPDATEs probably aren’t updating the primary key values of the referenced row most of the time.
  35. 35/43 More granular foreign key locks (cont'd)
    • The relevant triggers now use FOR KEY SHARE. You can even use these locks yourself from SQL (if you were so inclined).
    • UPDATE statements may use FOR NO KEY UPDATE.
    • This greatly improves the performance with many clients, because they will block each other far less frequently. This tends to help a lot with deadlocking, because in my experience deadlocking tends to involve locks acquired by foreign keys.
  36. 36/43
    -- Postgres 8.0 foreign keys:
    SELECT * FROM foreign_table WHERE key = 1 FOR UPDATE;

    -- Postgres 8.1 - 9.2 foreign keys:
    SELECT * FROM foreign_table WHERE key = 2 FOR SHARE;

    -- Postgres 9.3+ foreign keys:
    SELECT * FROM foreign_table WHERE key = 3 FOR KEY SHARE;

    -- Many updates just implicitly take
    -- a lock like this in 9.3:
    SELECT * FROM foreign_table WHERE key = 4 FOR NO KEY UPDATE;
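The practical effect of the new lock modes shows up with two sessions, as in this sketch. The `parent` and `child` tables are hypothetical, and the session labels in the comments mark which connection runs each statement.

```sql
-- Setup (hypothetical tables).
CREATE TABLE parent (id int PRIMARY KEY, note text);
CREATE TABLE child  (id int PRIMARY KEY, parent_id int REFERENCES parent (id));
INSERT INTO parent VALUES (1, 'original');

-- Session 1:
BEGIN;
-- The foreign key trigger locks the referenced parent row:
-- FOR KEY SHARE on 9.3+, FOR SHARE on 8.1 through 9.2.
INSERT INTO child VALUES (100, 1);

-- Session 2, while session 1 is still open:
-- This touches no key column, so on 9.3+ it takes FOR NO KEY UPDATE, which
-- does not conflict with FOR KEY SHARE, and it proceeds immediately.
-- On 9.2 and earlier it waits until session 1 commits or rolls back.
UPDATE parent SET note = 'changed' WHERE id = 1;

-- Session 1:
COMMIT;
```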
  37. 37/43 “Upsert”

  38. 38/43 What is “upsert”, anyway?
    • The ability for someone to write a simple atomic SQL statement that will either insert a row, or update it in the event of it already existing (that is, update a row found that has the same primary key value as the row proposed for insertion, or perhaps exclusion constraint values).
    • MySQL has INSERT...ON DUPLICATE KEY UPDATE
    • Postgres requires you to use subtransactions. In a loop. It's rather subtle to get all the details right.
    • Frankly, this is kind of a poor showing for Postgres, because it's such a common operation.
  39. 39/43 “Upsert” (cont'd)
    • I'm working on it – trying to get a patch into 9.4, released next year.
    • You can compose this in a writable common table expression to get a very flexible upsert.
  40. 40/43 What's a wCTE?
    WITH rej AS (
      INSERT INTO test(a, b)
      VALUES (123, 'Chicago'), (456, 'Dublin')
      ON DUPLICATE KEY LOCK FOR UPDATE
      RETURNING REJECTS *
    )
    UPDATE test SET b = rej.b
    FROM rej
    WHERE test.a = rej.a;
    • Think of a common table expression as like a temporary table that lasts only the duration of a statement.
    • In Postgres, these can contain DML (with RETURNING clause).
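The ON DUPLICATE KEY LOCK FOR UPDATE and RETURNING REJECTS clauses on this slide are the proposed syntax from the patch under development, not something a released Postgres accepts. For comparison, here is a sketch of a writable CTE that runs on Postgres 9.1+ today; the `events` and `events_archive` tables are hypothetical.

```sql
-- Hypothetical tables.
CREATE TABLE events         (id int PRIMARY KEY, created date);
CREATE TABLE events_archive (id int PRIMARY KEY, created date);

-- Move old rows to the archive in one statement: the DELETE inside the CTE
-- hands its deleted rows to the outer INSERT via RETURNING.
WITH moved AS (
    DELETE FROM events
    WHERE created < DATE '2013-01-01'
    RETURNING id, created
)
INSERT INTO events_archive (id, created)
SELECT id, created FROM moved;
```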
  41. 41/43 Okay...what if I can't wait until 2014?
    CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

    CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
    $$
    BEGIN
      LOOP
        -- first try to update the key
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
          RETURN;
        END IF;
        -- not there, so try to insert the key
        -- if someone else inserts the same key concurrently,
        -- we could get a unique-key failure
        BEGIN
          INSERT INTO db(a,b) VALUES (key, data);
          RETURN;
        EXCEPTION WHEN unique_violation THEN
          -- do nothing, and loop to try the UPDATE again
        END;
      END LOOP;
    END;
    $$
    LANGUAGE plpgsql;
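Assuming the `db` table and `merge_db` function from this slide have been created, calling the function twice with the same key exercises both paths. The values are arbitrary examples.

```sql
SELECT merge_db(1, 'david');   -- no row with a = 1 yet: the UPDATE finds nothing, so it inserts
SELECT merge_db(1, 'dennis');  -- the row exists now: the UPDATE succeeds on the first pass
SELECT a, b FROM db;           -- a single row: 1, dennis
```

The inner BEGIN ... EXCEPTION block is what creates the subtransaction, so a unique violation rolls back only the failed INSERT rather than the caller's whole transaction.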
  42. 42/43 Upsert thorny implementation details
    • Upsert needs to lock values (like the integer 5), then rows (a row in the table with a primary key value of 5), then release the value locks. Row locks persist until end of transaction.
    • Unique constraints are always btree indexes under the hood.
    • No good way to lock values. Had to develop “phased locking” approach.
    • Lock the values in indexes, then maybe lock rows and release index locks.
    • Otherwise, insert heap tuple, then pick up from first phase and actually go through with inserting. Then release index locks.
  43. 43/43 Thanks for listening! Questions?