The code to unlink dropped relations in FinishPreparedTransaction() was
[PostgreSQL.git] / doc / src / sgml / mvcc.sgml
blobb2b4b5bdf6a90e9598dc673c2783b5872028e68e
1 <!-- $PostgreSQL$ -->
3 <chapter id="mvcc">
4 <title>Concurrency Control</title>
6 <indexterm>
7 <primary>concurrency</primary>
8 </indexterm>
10 <para>
11 This chapter describes the behavior of the
12 <productname>PostgreSQL</productname> database system when two or
13 more sessions try to access the same data at the same time. The
14 goals in that situation are to allow efficient access for all
15 sessions while maintaining strict data integrity. Every developer
16 of database applications should be familiar with the topics covered
17 in this chapter.
18 </para>
20 <sect1 id="mvcc-intro">
21 <title>Introduction</title>
23 <indexterm>
24 <primary>MVCC</primary>
25 </indexterm>
27 <para>
28 <productname>PostgreSQL</productname> provides a rich set of tools
29 for developers to manage concurrent access to data. Internally,
30 data consistency is maintained by using a multiversion
31 model (Multiversion Concurrency Control, <acronym>MVCC</acronym>).
32 This means that while querying a database each transaction sees
33 a snapshot of data (a <firstterm>database version</firstterm>)
34 as it was some
35 time ago, regardless of the current state of the underlying data.
36 This protects the transaction from viewing inconsistent data that
37 could be caused by (other) concurrent transaction updates on the same
38 data rows, providing <firstterm>transaction isolation</firstterm>
39 for each database session. <acronym>MVCC</acronym>, by eschewing
40 explicit locking methodologies of traditional database systems,
41 minimizes lock contention in order to allow for reasonable
42 performance in multiuser environments.
43 </para>
45 <para>
46 The main advantage of using the <acronym>MVCC</acronym> model of
47 concurrency control rather than locking is that in
48 <acronym>MVCC</acronym> locks acquired for querying (reading) data
49 do not conflict with locks acquired for writing data, and so
50 reading never blocks writing and writing never blocks reading.
51 </para>
53 <para>
54 Table- and row-level locking facilities are also available in
55 <productname>PostgreSQL</productname> for applications that cannot
56 adapt easily to <acronym>MVCC</acronym> behavior. However, proper
57 use of <acronym>MVCC</acronym> will generally provide better
58 performance than locks. In addition, application-defined advisory
59 locks provide a mechanism for acquiring locks that are not tied
60 to a single transaction.
61 </para>
62 </sect1>
64 <sect1 id="transaction-iso">
65 <title>Transaction Isolation</title>
67 <indexterm>
68 <primary>transaction isolation</primary>
69 </indexterm>
71 <para>
72 The <acronym>SQL</acronym> standard defines four levels of
73 transaction isolation in terms of three phenomena that must be
74 prevented between concurrent transactions. These undesirable
75 phenomena are:
77 <variablelist>
78 <varlistentry>
79 <term>
80 dirty read
81 <indexterm><primary>dirty read</primary></indexterm>
82 </term>
83 <listitem>
84 <para>
85 A transaction reads data written by a concurrent uncommitted transaction.
86 </para>
87 </listitem>
88 </varlistentry>
90 <varlistentry>
91 <term>
92 nonrepeatable read
93 <indexterm><primary>nonrepeatable read</primary></indexterm>
94 </term>
95 <listitem>
96 <para>
97 A transaction re-reads data it has previously read and finds that data
98 has been modified by another transaction (that committed since the
99 initial read).
100 </para>
101 </listitem>
102 </varlistentry>
104 <varlistentry>
105 <term>
106 phantom read
107 <indexterm><primary>phantom read</primary></indexterm>
108 </term>
109 <listitem>
110 <para>
111 A transaction re-executes a query returning a set of rows that satisfy a
112 search condition and finds that the set of rows satisfying the condition
113 has changed due to another recently-committed transaction.
114 </para>
115 </listitem>
116 </varlistentry>
117 </variablelist>
118 </para>
120 <para>
121 <indexterm>
122 <primary>transaction isolation level</primary>
123 </indexterm>
124 The four transaction isolation levels and the corresponding
125 behaviors are described in <xref linkend="mvcc-isolevel-table">.
126 </para>
128 <table tocentry="1" id="mvcc-isolevel-table">
129 <title><acronym>SQL</acronym> Transaction Isolation Levels</title>
130 <tgroup cols="4">
131 <thead>
132 <row>
133 <entry>
134 Isolation Level
135 </entry>
136 <entry>
137 Dirty Read
138 </entry>
139 <entry>
140 Nonrepeatable Read
141 </entry>
142 <entry>
143 Phantom Read
144 </entry>
145 </row>
146 </thead>
147 <tbody>
148 <row>
149 <entry>
150 Read uncommitted
151 </entry>
152 <entry>
153 Possible
154 </entry>
155 <entry>
156 Possible
157 </entry>
158 <entry>
159 Possible
160 </entry>
161 </row>
163 <row>
164 <entry>
165 Read committed
166 </entry>
167 <entry>
168 Not possible
169 </entry>
170 <entry>
171 Possible
172 </entry>
173 <entry>
174 Possible
175 </entry>
176 </row>
178 <row>
179 <entry>
180 Repeatable read
181 </entry>
182 <entry>
183 Not possible
184 </entry>
185 <entry>
186 Not possible
187 </entry>
188 <entry>
189 Possible
190 </entry>
191 </row>
193 <row>
194 <entry>
195 Serializable
196 </entry>
197 <entry>
198 Not possible
199 </entry>
200 <entry>
201 Not possible
202 </entry>
203 <entry>
204 Not possible
205 </entry>
206 </row>
207 </tbody>
208 </tgroup>
209 </table>
211 <para>
212 In <productname>PostgreSQL</productname>, you can request any of the
213 four standard transaction isolation levels. But internally, there are
214 only two distinct isolation levels, which correspond to the levels Read
215 Committed and Serializable. When you select the level Read
216 Uncommitted you really get Read Committed, and when you select
217 Repeatable Read you really get Serializable, so the actual
218 isolation level might be stricter than what you select. This is
219 permitted by the SQL standard: the four isolation levels only
220 define which phenomena must not happen, they do not define which
221 phenomena must happen. The reason that <productname>PostgreSQL</>
222 only provides two isolation levels is that this is the only
223 sensible way to map the standard isolation levels to the multiversion
224 concurrency control architecture. The behavior of the available
225 isolation levels is detailed in the following subsections.
226 </para>
228 <para>
229 To set the transaction isolation level of a transaction, use the
230 command <xref linkend="sql-set-transaction" endterm="sql-set-transaction-title">.
231 </para>
233 <sect2 id="xact-read-committed">
234 <title>Read Committed Isolation Level</title>
236 <indexterm>
237 <primary>transaction isolation level</primary>
238 <secondary>read committed</secondary>
239 </indexterm>
241 <para>
242 <firstterm>Read Committed</firstterm> is the default isolation
243 level in <productname>PostgreSQL</productname>. When a transaction
244 uses this isolation level, a <command>SELECT</command> query
245 (without a <literal>FOR UPDATE/SHARE</> clause) sees only data
246 committed before the query began; it never sees either uncommitted
247 data or changes committed during query execution by concurrent
248 transactions. In effect, a <command>SELECT</command> query sees
249 a snapshot of the database as of the instant the query begins to
250 run. However, <command>SELECT</command> does see the effects
251 of previous updates executed within its own transaction, even
252 though they are not yet committed. Also note that two successive
253 <command>SELECT</command> commands can see different data, even
254 though they are within a single transaction, if other transactions
255 commit changes during execution of the first <command>SELECT</command>.
256 </para>
258 <para>
259 <command>UPDATE</command>, <command>DELETE</command>, <command>SELECT
260 FOR UPDATE</command>, and <command>SELECT FOR SHARE</command> commands
261 behave the same as <command>SELECT</command>
262 in terms of searching for target rows: they will only find target rows
263 that were committed as of the command start time. However, such a target
264 row might have already been updated (or deleted or locked) by
265 another concurrent transaction by the time it is found. In this case, the
266 would-be updater will wait for the first updating transaction to commit or
267 roll back (if it is still in progress). If the first updater rolls back,
268 then its effects are negated and the second updater can proceed with
269 updating the originally found row. If the first updater commits, the
270 second updater will ignore the row if the first updater deleted it,
271 otherwise it will attempt to apply its operation to the updated version of
272 the row. The search condition of the command (the <literal>WHERE</> clause) is
273 re-evaluated to see if the updated version of the row still matches the
274 search condition. If so, the second updater proceeds with its operation
275 using the updated version of the row. In the case of
276 <command>SELECT FOR UPDATE</command> and <command>SELECT FOR
277 SHARE</command>, this means it is the updated version of the row that is
278 locked and returned to the client.
279 </para>
281 <para>
282 Because of the above rule, it is possible for an updating command to see an
283 inconsistent snapshot: it can see the effects of concurrent updating
284 commands on the same rows it is trying to update, but it
285 does not see effects of those commands on other rows in the database.
286 This behavior makes Read Committed mode unsuitable for commands that
287 involve complex search conditions; however, it is just right for simpler
288 cases. For example, consider updating bank balances with transactions
289 like:
291 <screen>
292 BEGIN;
293 UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 12345;
294 UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 7534;
295 COMMIT;
296 </screen>
298 If two such transactions concurrently try to change the balance of account
299 12345, we clearly want the second transaction to start with the updated
300 version of the account's row. Because each command is affecting only a
301 predetermined row, letting it see the updated version of the row does
302 not create any troublesome inconsistency.
303 </para>
305 <para>
306 More complex usage can produce undesirable results in Read Committed
307 mode. For example, consider a <command>DELETE</command> command
308 operating on data that is being both added and removed from its
309 restriction criteria by another command, e.g., assume
310 <literal>website</literal> is a two-row table with
311 <literal>website.hits</literal> equaling <literal>9</literal> and
312 <literal>10</literal>:
314 <screen>
315 BEGIN;
316 UPDATE website SET hits = hits + 1;
317 -- run from another session: DELETE FROM website WHERE hits = 10;
318 COMMIT;
319 </screen>
321 The <command>DELETE</command> will have no effect even though
322 there is a <literal>website.hits = 10</literal> row before and
323 after the <command>UPDATE</command>. This occurs because the
324 pre-update row value <literal>9</> is skipped, and when the
325 <command>UPDATE</command> completes and <command>DELETE</command>
326 obtains a lock, the new row value is no longer <literal>10</> but
327 <literal>11</>, which no longer matches the criteria.
328 </para>
330 <para>
331 Because Read Committed mode starts each command with a new snapshot
332 that includes all transactions committed up to that instant,
333 subsequent commands in the same transaction will see the effects
334 of the committed concurrent transaction in any case. The point
335 at issue above is whether or not a <emphasis>single</> command
336 sees an absolutely consistent view of the database.
337 </para>
339 <para>
340 The partial transaction isolation provided by Read Committed mode
341 is adequate for many applications, and this mode is fast and simple
342 to use; however, it is not sufficient for all cases. Applications
343 that do complex queries and updates might require a more rigorously
344 consistent view of the database than Read Committed mode provides.
345 </para>
346 </sect2>
348 <sect2 id="xact-serializable">
349 <title>Serializable Isolation Level</title>
351 <indexterm>
352 <primary>transaction isolation level</primary>
353 <secondary>serializable</secondary>
354 </indexterm>
356 <para>
357 The <firstterm>Serializable</firstterm> isolation level provides the strictest transaction
358 isolation. This level emulates serial transaction execution,
359 as if transactions had been executed one after another, serially,
360 rather than concurrently. However, applications using this level must
361 be prepared to retry transactions due to serialization failures.
362 </para>
364 <para>
365 When a transaction is using the serializable level,
366 a <command>SELECT</command> query only sees data committed before the
367 transaction began; it never sees either uncommitted data or changes
368 committed
369 during transaction execution by concurrent transactions. (However,
370 the query does see the effects of previous updates
371 executed within its own transaction, even though they are not yet
372 committed.) This is different from Read Committed in that
373 a query in a serializable transaction
374 sees a snapshot as of the start of the <emphasis>transaction</>,
375 not as of the start
376 of the current query within the transaction. Thus, successive
377 <command>SELECT</command> commands within a <emphasis>single</>
378 transaction see the same data, i.e., they do not see changes made by
379 other transactions that committed after their own transaction started.
380 (This behavior can be ideal for reporting applications.)
381 </para>
383 <para>
384 <command>UPDATE</command>, <command>DELETE</command>, <command>SELECT
385 FOR UPDATE</command>, and <command>SELECT FOR SHARE</command> commands
386 behave the same as <command>SELECT</command>
387 in terms of searching for target rows: they will only find target rows
388 that were committed as of the transaction start time. However, such a
389 target
390 row might have already been updated (or deleted or locked) by
391 another concurrent transaction by the time it is found. In this case, the
392 serializable transaction will wait for the first updating transaction to commit or
393 roll back (if it is still in progress). If the first updater rolls back,
394 then its effects are negated and the serializable transaction can proceed
395 with updating the originally found row. But if the first updater commits
396 (and actually updated or deleted the row, not just locked it)
397 then the serializable transaction will be rolled back with the message
399 <screen>
400 ERROR: could not serialize access due to concurrent update
401 </screen>
403 because a serializable transaction cannot modify or lock rows changed by
404 other transactions after the serializable transaction began.
405 </para>
407 <para>
408 When an application receives this error message, it should abort
409 the current transaction and retry the whole transaction from
410 the beginning. The second time through, the transaction will see the
411 previously-committed change as part of its initial view of the database,
412 so there is no logical conflict in using the new version of the row
413 as the starting point for the new transaction's update.
414 </para>
416 <para>
417 Note that only updating transactions might need to be retried; read-only
418 transactions will never have serialization conflicts.
419 </para>
421 <para>
422 The Serializable mode provides a rigorous guarantee that each
423 transaction sees a wholly consistent view of the database. However,
424 the application has to be prepared to retry transactions when concurrent
425 updates make it impossible to sustain the illusion of serial execution.
426 Since the cost of redoing complex transactions can be significant,
427 serializable mode is recommended only when updating transactions contain logic
428 sufficiently complex that they might give wrong answers in Read
429 Committed mode. Most commonly, Serializable mode is necessary when
430 a transaction executes several successive commands that must see
431 identical views of the database.
432 </para>
434 <sect3 id="mvcc-serializability">
435 <title>Serializable Isolation versus True Serializability</title>
437 <indexterm>
438 <primary>serializability</primary>
439 </indexterm>
441 <indexterm>
442 <primary>predicate locking</primary>
443 </indexterm>
445 <para>
446 The intuitive meaning (and mathematical definition) of
447 <quote>serializable</> execution is that any two successfully committed
448 concurrent transactions will appear to have executed strictly serially,
449 one after the other &mdash; although which one appeared to occur first might
450 not be predictable in advance. It is important to realize that forbidding
451 the undesirable behaviors listed in <xref linkend="mvcc-isolevel-table">
452 is not sufficient to guarantee true serializability, and in fact
453 <productname>PostgreSQL</productname>'s Serializable mode <emphasis>does
454 not guarantee serializable execution in this sense</>. As an example,
455 consider a table <structname>mytab</>, initially containing:
456 <screen>
457 class | value
458 -------+-------
459 1 | 10
460 1 | 20
461 2 | 100
462 2 | 200
463 </screen>
464 Suppose that serializable transaction A computes:
465 <screen>
466 SELECT SUM(value) FROM mytab WHERE class = 1;
467 </screen>
468 and then inserts the result (30) as the <structfield>value</> in a
469 new row with <structfield>class</><literal> = 2</>. Concurrently, serializable
470 transaction B computes:
471 <screen>
472 SELECT SUM(value) FROM mytab WHERE class = 2;
473 </screen>
474 and obtains the result 300, which it inserts in a new row with
475 <structfield>class</><literal> = 1</>. Then both transactions commit. None of
476 the listed undesirable behaviors have occurred, yet we have a result
477 that could not have occurred in either order serially. If A had
478 executed before B, B would have computed the sum 330, not 300, and
479 similarly the other order would have resulted in a different sum
480 computed by A.
481 </para>
483 <para>
484 To guarantee true mathematical serializability, it is necessary for
485 a database system to enforce <firstterm>predicate locking</>, which
486 means that a transaction cannot insert or modify a row that would
487 have matched the <literal>WHERE</> condition of a query in another concurrent
488 transaction. For example, once transaction A has executed the query
489 <literal>SELECT ... WHERE class = 1</>, a predicate-locking system
490 would forbid transaction B from inserting any new row with class 1
491 until A has committed.
492 <footnote>
493 <para>
494 Essentially, a predicate-locking system prevents phantom reads
495 by restricting what is written, whereas MVCC prevents them by
496 restricting what is read.
497 </para>
498 </footnote>
499 Such a locking system is complex to
500 implement and extremely expensive in execution, since every session must
501 be aware of the details of every query executed by every concurrent
502 transaction. And this large expense is mostly wasted, since in
503 practice most applications do not do the sorts of things that could
504 result in problems. (Certainly the example above is rather contrived
505 and unlikely to represent real software.) For these reasons,
506 <productname>PostgreSQL</productname> does not implement predicate
507 locking.
508 </para>
510 <para>
511 In cases where the possibility of non-serializable execution
512 is a real hazard, problems can be prevented by appropriate use of
513 explicit locking. Further discussion appears in the following
514 sections.
515 </para>
516 </sect3>
517 </sect2>
518 </sect1>
520 <sect1 id="explicit-locking">
521 <title>Explicit Locking</title>
523 <indexterm>
524 <primary>lock</primary>
525 </indexterm>
527 <para>
528 <productname>PostgreSQL</productname> provides various lock modes
529 to control concurrent access to data in tables. These modes can
530 be used for application-controlled locking in situations where
531 <acronym>MVCC</acronym> does not give the desired behavior. Also,
532 most <productname>PostgreSQL</productname> commands automatically
533 acquire locks of appropriate modes to ensure that referenced
534 tables are not dropped or modified in incompatible ways while the
535 command executes. (For example, <command>ALTER TABLE</> cannot safely be
536 executed concurrently with other operations on the same table, so it
537 obtains an exclusive lock on the table to enforce that.)
538 </para>
540 <para>
541 To examine a list of the currently outstanding locks in a database
542 server, use the
543 <link linkend="view-pg-locks"><structname>pg_locks</structname></link>
544 system view. For more information on monitoring the status of the lock
545 manager subsystem, refer to <xref linkend="monitoring">.
546 </para>
548 <sect2 id="locking-tables">
549 <title>Table-Level Locks</title>
551 <indexterm zone="locking-tables">
552 <primary>LOCK</primary>
553 </indexterm>
555 <para>
556 The list below shows the available lock modes and the contexts in
557 which they are used automatically by
558 <productname>PostgreSQL</productname>. You can also acquire any
559 of these locks explicitly with the command <xref
560 linkend="sql-lock" endterm="sql-lock-title">.
561 Remember that all of these lock modes are table-level locks,
562 even if the name contains the word
563 <quote>row</quote>; the names of the lock modes are historical.
564 To some extent the names reflect the typical usage of each lock
565 mode &mdash; but the semantics are all the same. The only real difference
566 between one lock mode and another is the set of lock modes with
567 which each conflicts (see <xref linkend="table-lock-compatibility">).
568 . Two transactions cannot hold locks of conflicting
569 modes on the same table at the same time. (However, a transaction
570 never conflicts with itself. For example, it might acquire
571 <literal>ACCESS EXCLUSIVE</literal> lock and later acquire
572 <literal>ACCESS SHARE</literal> lock on the same table.) Non-conflicting
573 lock modes can be held concurrently by many transactions. Notice in
574 particular that some lock modes are self-conflicting (for example,
575 an <literal>ACCESS EXCLUSIVE</literal> lock cannot be held by more than one
576 transaction at a time) while others are not self-conflicting (for example,
577 an <literal>ACCESS SHARE</literal> lock can be held by multiple transactions).
578 </para>
580 <variablelist>
581 <title>Table-level lock modes</title>
582 <varlistentry>
583 <term>
584 <literal>ACCESS SHARE</literal>
585 </term>
586 <listitem>
587 <para>
588 Conflicts with the <literal>ACCESS EXCLUSIVE</literal> lock
589 mode only.
590 </para>
592 <para>
593 The <command>SELECT</command> command acquires a lock of this mode on
594 referenced tables. In general, any query that only <emphasis>reads</> a table
595 and does not modify it will acquire this lock mode.
596 </para>
597 </listitem>
598 </varlistentry>
600 <varlistentry>
601 <term>
602 <literal>ROW SHARE</literal>
603 </term>
604 <listitem>
605 <para>
606 Conflicts with the <literal>EXCLUSIVE</literal> and
607 <literal>ACCESS EXCLUSIVE</literal> lock modes.
608 </para>
610 <para>
611 The <command>SELECT FOR UPDATE</command> and
612 <command>SELECT FOR SHARE</command> commands acquire a
613 lock of this mode on the target table(s) (in addition to
614 <literal>ACCESS SHARE</literal> locks on any other tables
615 that are referenced but not selected
616 <option>FOR UPDATE/FOR SHARE</option>).
617 </para>
618 </listitem>
619 </varlistentry>
621 <varlistentry>
622 <term>
623 <literal>ROW EXCLUSIVE</literal>
624 </term>
625 <listitem>
626 <para>
627 Conflicts with the <literal>SHARE</literal>, <literal>SHARE ROW
628 EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
629 <literal>ACCESS EXCLUSIVE</literal> lock modes.
630 </para>
632 <para>
633 The commands <command>UPDATE</command>,
634 <command>DELETE</command>, and <command>INSERT</command>
635 acquire this lock mode on the target table (in addition to
636 <literal>ACCESS SHARE</literal> locks on any other referenced
637 tables). In general, this lock mode will be acquired by any
638 command that <emphasis>modifies data</> in a table.
639 </para>
640 </listitem>
641 </varlistentry>
643 <varlistentry>
644 <term>
645 <literal>SHARE UPDATE EXCLUSIVE</literal>
646 </term>
647 <listitem>
648 <para>
649 Conflicts with the <literal>SHARE UPDATE EXCLUSIVE</literal>,
650 <literal>SHARE</literal>, <literal>SHARE ROW
651 EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
652 <literal>ACCESS EXCLUSIVE</literal> lock modes.
653 This mode protects a table against
654 concurrent schema changes and <command>VACUUM</> runs.
655 </para>
657 <para>
658 Acquired by <command>VACUUM</command> (without <option>FULL</option>),
659 <command>ANALYZE</>, and <command>CREATE INDEX CONCURRENTLY</>.
660 </para>
661 </listitem>
662 </varlistentry>
664 <varlistentry>
665 <term>
666 <literal>SHARE</literal>
667 </term>
668 <listitem>
669 <para>
670 Conflicts with the <literal>ROW EXCLUSIVE</literal>,
671 <literal>SHARE UPDATE EXCLUSIVE</literal>, <literal>SHARE ROW
672 EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
673 <literal>ACCESS EXCLUSIVE</literal> lock modes.
674 This mode protects a table against concurrent data changes.
675 </para>
677 <para>
678 Acquired by <command>CREATE INDEX</command>
679 (without <option>CONCURRENTLY</option>).
680 </para>
681 </listitem>
682 </varlistentry>
684 <varlistentry>
685 <term>
686 <literal>SHARE ROW EXCLUSIVE</literal>
687 </term>
688 <listitem>
689 <para>
690 Conflicts with the <literal>ROW EXCLUSIVE</literal>,
691 <literal>SHARE UPDATE EXCLUSIVE</literal>,
692 <literal>SHARE</literal>, <literal>SHARE ROW
693 EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
694 <literal>ACCESS EXCLUSIVE</literal> lock modes.
695 </para>
697 <para>
698 This lock mode is not automatically acquired by any
699 <productname>PostgreSQL</productname> command.
700 </para>
701 </listitem>
702 </varlistentry>
704 <varlistentry>
705 <term>
706 <literal>EXCLUSIVE</literal>
707 </term>
708 <listitem>
709 <para>
710 Conflicts with the <literal>ROW SHARE</literal>, <literal>ROW
711 EXCLUSIVE</literal>, <literal>SHARE UPDATE
712 EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
713 ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
714 <literal>ACCESS EXCLUSIVE</literal> lock modes.
715 This mode allows only concurrent <literal>ACCESS SHARE</literal> locks,
716 i.e., only reads from the table can proceed in parallel with a
717 transaction holding this lock mode.
718 </para>
720 <para>
721 This lock mode is not automatically acquired on user tables by any
722 <productname>PostgreSQL</productname> command. However it is
723 acquired on certain system catalogs in some operations.
724 </para>
725 </listitem>
726 </varlistentry>
728 <varlistentry>
729 <term>
730 <literal>ACCESS EXCLUSIVE</literal>
731 </term>
732 <listitem>
733 <para>
734 Conflicts with locks of all modes (<literal>ACCESS
735 SHARE</literal>, <literal>ROW SHARE</literal>, <literal>ROW
736 EXCLUSIVE</literal>, <literal>SHARE UPDATE
737 EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
738 ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
739 <literal>ACCESS EXCLUSIVE</literal>).
740 This mode guarantees that the
741 holder is the only transaction accessing the table in any way.
742 </para>
744 <para>
745 Acquired by the <command>ALTER TABLE</command>, <command>DROP
746 TABLE</command>, <command>TRUNCATE</command>, <command>REINDEX</command>,
747 <command>CLUSTER</command>, and <command>VACUUM FULL</command>
748 commands. This is also the default lock mode for <command>LOCK
749 TABLE</command> statements that do not specify a mode explicitly.
750 </para>
751 </listitem>
752 </varlistentry>
753 </variablelist>
755 <tip>
756 <para>
757 Only an <literal>ACCESS EXCLUSIVE</literal> lock blocks a
758 <command>SELECT</command> (without <option>FOR UPDATE/SHARE</option>)
759 statement.
760 </para>
761 </tip>
763 <para>
764 Once acquired, a lock is normally held till end of transaction. But if a
765 lock is acquired after establishing a savepoint, the lock is released
766 immediately if the savepoint is rolled back to. This is consistent with
767 the principle that <command>ROLLBACK</> cancels all effects of the
768 commands since the savepoint. The same holds for locks acquired within a
769 <application>PL/pgSQL</> exception block: an error escape from the block
770 releases locks acquired within it.
771 </para>
775 <table tocentry="1" id="table-lock-compatibility">
776 <title> Conflicting lock modes</title>
777 <tgroup cols="9">
778 <colspec colnum="2" colname="lockst">
779 <colspec colnum="9" colname="lockend">
780 <spanspec namest="lockst" nameend="lockend" spanname="lockreq">
781 <thead>
782 <row>
783 <entry morerows="1">Requested Lock Mode</entry>
784 <entry spanname="lockreq">Current Lock Mode</entry>
785 </row>
786 <row>
787 <entry>ACCESS SHARE</entry>
788 <entry>ROW SHARE</entry>
789 <entry>ROW EXCLUSIVE</entry>
790 <entry>SHARE UPDATE EXCLUSIVE</entry>
791 <entry>SHARE</entry>
792 <entry>SHARE ROW EXCLUSIVE</entry>
793 <entry>EXCLUSIVE</entry>
794 <entry>ACCESS EXCLUSIVE</entry>
795 </row>
796 </thead>
797 <tbody>
798 <row>
799 <entry>ACCESS SHARE</entry>
800 <entry align="center"></entry>
801 <entry align="center"></entry>
802 <entry align="center"></entry>
803 <entry align="center"></entry>
804 <entry align="center"></entry>
805 <entry align="center"></entry>
806 <entry align="center"></entry>
807 <entry align="center">X</entry>
808 </row>
809 <row>
810 <entry>ROW SHARE</entry>
811 <entry align="center"></entry>
812 <entry align="center"></entry>
813 <entry align="center"></entry>
814 <entry align="center"></entry>
815 <entry align="center"></entry>
816 <entry align="center"></entry>
817 <entry align="center">X</entry>
818 <entry align="center">X</entry>
819 </row>
820 <row>
821 <entry>ROW EXCLUSIVE</entry>
822 <entry align="center"></entry>
823 <entry align="center"></entry>
824 <entry align="center"></entry>
825 <entry align="center"></entry>
826 <entry align="center">X</entry>
827 <entry align="center">X</entry>
828 <entry align="center">X</entry>
829 <entry align="center">X</entry>
830 </row>
831 <row>
832 <entry>SHARE UPDATE EXCLUSIVE</entry>
833 <entry align="center"></entry>
834 <entry align="center"></entry>
835 <entry align="center"></entry>
836 <entry align="center">X</entry>
837 <entry align="center">X</entry>
838 <entry align="center">X</entry>
839 <entry align="center">X</entry>
840 <entry align="center">X</entry>
841 </row>
842 <row>
843 <entry>SHARE</entry>
844 <entry align="center"></entry>
845 <entry align="center"></entry>
846 <entry align="center">X</entry>
847 <entry align="center">X</entry>
848 <entry align="center"></entry>
849 <entry align="center">X</entry>
850 <entry align="center">X</entry>
851 <entry align="center">X</entry>
852 </row>
853 <row>
854 <entry>SHARE ROW EXCLUSIVE</entry>
855 <entry align="center"></entry>
856 <entry align="center"></entry>
857 <entry align="center">X</entry>
858 <entry align="center">X</entry>
859 <entry align="center">X</entry>
860 <entry align="center">X</entry>
861 <entry align="center">X</entry>
862 <entry align="center">X</entry>
863 </row>
864 <row>
865 <entry>EXCLUSIVE</entry>
866 <entry align="center"></entry>
867 <entry align="center">X</entry>
868 <entry align="center">X</entry>
869 <entry align="center">X</entry>
870 <entry align="center">X</entry>
871 <entry align="center">X</entry>
872 <entry align="center">X</entry>
873 <entry align="center">X</entry>
874 </row>
875 <row>
876 <entry>ACCESS EXCLUSIVE</entry>
877 <entry align="center">X</entry>
878 <entry align="center">X</entry>
879 <entry align="center">X</entry>
880 <entry align="center">X</entry>
881 <entry align="center">X</entry>
882 <entry align="center">X</entry>
883 <entry align="center">X</entry>
884 <entry align="center">X</entry>
885 </row>
886 </tbody>
887 </tgroup>
888 </table>
889 </sect2>
891 <sect2 id="locking-rows">
892 <title>Row-Level Locks</title>
894 <para>
895 In addition to table-level locks, there are row-level locks, which
896 can be exclusive or shared locks. An exclusive row-level lock on a
897 specific row is automatically acquired when the row is updated or
898 deleted. The lock is held until the transaction commits or rolls
899 back, just like table-level locks. Row-level locks do
900 not affect data querying; they block only <emphasis>writers to the same
901 row</emphasis>.
902 </para>
904 <para>
905 To acquire an exclusive row-level lock on a row without actually
906 modifying the row, select the row with <command>SELECT FOR
907 UPDATE</command>. Note that once the row-level lock is acquired,
908 the transaction can update the row multiple times without
909 fear of conflicts.
910 </para>
912 <para>
913 To acquire a shared row-level lock on a row, select the row with
914 <command>SELECT FOR SHARE</command>. A shared lock does not prevent
915 other transactions from acquiring the same shared lock. However,
916 no transaction is allowed to update, delete, or exclusively lock a
917 row on which any other transaction holds a shared lock. Any attempt
918 to do so will block until the shared lock(s) have been released.
919 </para>
921 <para>
922 <productname>PostgreSQL</productname> doesn't remember any
923 information about modified rows in memory, so there is no limit on
924 the number of rows locked at one time. However, locking a row
925 might cause a disk write, e.g., <command>SELECT FOR
926 UPDATE</command> modifies selected rows to mark them locked, and so
927 will result in disk writes.
928 </para>
930 <para>
931 In addition to table and row locks, page-level share/exclusive locks are
932 used to control read/write access to table pages in the shared buffer
933 pool. These locks are released immediately after a row is fetched or
934 updated. Application developers normally need not be concerned with
935 page-level locks, but they are mentioned here for completeness.
936 </para>
938 </sect2>
940 <sect2 id="locking-deadlocks">
941 <title>Deadlocks</title>
943 <indexterm zone="locking-deadlocks">
944 <primary>deadlock</primary>
945 </indexterm>
947 <para>
948 The use of explicit locking can increase the likelihood of
949 <firstterm>deadlocks</>, wherein two (or more) transactions each
950 hold locks that the other wants. For example, if transaction 1
951 acquires an exclusive lock on table A and then tries to acquire
952 an exclusive lock on table B, while transaction 2 has already
953 exclusive-locked table B and now wants an exclusive lock on table
954 A, then neither one can proceed.
955 <productname>PostgreSQL</productname> automatically detects
956 deadlock situations and resolves them by aborting one of the
957 transactions involved, allowing the other(s) to complete.
958 (Exactly which transaction will be aborted is difficult to
959 predict and should not be relied upon.)
960 </para>
962 <para>
963 Note that deadlocks can also occur as the result of row-level
964 locks (and thus, they can occur even if explicit locking is not
965 used). Consider the case in which two concurrent
966 transactions modify a table. The first transaction executes:
968 <screen>
969 UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 11111;
970 </screen>
972 This acquires a row-level lock on the row with the specified
973 account number. Then, the second transaction executes:
975 <screen>
976 UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 22222;
977 UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 11111;
978 </screen>
980 The first <command>UPDATE</command> statement successfully
981 acquires a row-level lock on the specified row, so it succeeds in
982 updating that row. However, the second <command>UPDATE</command>
983 statement finds that the row it is attempting to update has
984 already been locked, so it waits for the transaction that
985 acquired the lock to complete. Transaction two is now waiting on
986 transaction one to complete before it continues execution. Now,
987 transaction one executes:
989 <screen>
990 UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
991 </screen>
993 Transaction one attempts to acquire a row-level lock on the
994 specified row, but it cannot: transaction two already holds such
995 a lock. So it waits for transaction two to complete. Thus,
996 transaction one is blocked on transaction two, and transaction
997 two is blocked on transaction one: a deadlock
998 condition. <productname>PostgreSQL</productname> will detect this
999 situation and abort one of the transactions.
1000 </para>
1002 <para>
1003 The best defense against deadlocks is generally to avoid them by
1004 being certain that all applications using a database acquire
1005 locks on multiple objects in a consistent order. In the example
1006 above, if both transactions
1007 had updated the rows in the same order, no deadlock would have
1008 occurred. One should also ensure that the first lock acquired on
1009 an object in a transaction is the most restrictive mode that will be
1010 needed for that object. If it is not feasible to verify this in
1011 advance, then deadlocks can be handled on-the-fly by retrying
1012 transactions that abort due to deadlocks.
1013 </para>
1015 <para>
1016 So long as no deadlock situation is detected, a transaction seeking
1017 either a table-level or row-level lock will wait indefinitely for
1018 conflicting locks to be released. This means it is a bad idea for
1019 applications to hold transactions open for long periods of time
1020 (e.g., while waiting for user input).
1021 </para>
1022 </sect2>
1024 <sect2 id="advisory-locks">
1025 <title>Advisory Locks</title>
1027 <indexterm zone="advisory-locks">
1028 <primary>lock</primary>
1029 <secondary>advisory</secondary>
1030 </indexterm>
1032 <para>
1033 <productname>PostgreSQL</productname> provides a means for
1034 creating locks that have application-defined meanings. These are
1035 called <firstterm>advisory locks</>, because the system does not
1036 enforce their use &mdash; it is up to the application to use them
1037 correctly. Advisory locks can be useful for locking strategies
1038 that are an awkward fit for the MVCC model. Once acquired, an
1039 advisory lock is held until explicitly released or the session ends.
1040 Unlike standard locks, advisory locks do not
1041 honor transaction semantics: a lock acquired during a
1042 transaction that is later rolled back will still be held following the
1043 rollback, and likewise an unlock is effective even if the calling
1044 transaction fails later. The same lock can be acquired multiple times by
1045 its owning process: for each lock request there must be a corresponding
1046 unlock request before the lock is actually released. (If a session
1047 already holds a given lock, additional requests will always succeed, even
1048 if other sessions are awaiting the lock.) Like all locks in
1049 <productname>PostgreSQL</productname>, a complete list of advisory
1050 locks currently held by any session can be found in the
1051 <link linkend="view-pg-locks"><structname>pg_locks</structname></link>
1052 system view.
1053 </para>
1055 <para>
1056 Advisory locks are allocated out of a shared memory pool whose size
1057 is defined by the configuration variables
1058 <xref linkend="guc-max-locks-per-transaction"> and
1059 <xref linkend="guc-max-connections">.
1060 Care must be taken not to exhaust this
1061 memory or the server will be unable to grant any locks at all.
1062 This imposes an upper limit on the number of advisory locks
1063 grantable by the server, typically in the tens to hundreds of thousands
1064 depending on how the server is configured.
1065 </para>
1067 <para>
1068 A common use of advisory locks is to emulate pessimistic locking
1069 strategies typical of so called <quote>flat file</> data management
1070 systems.
1071 While a flag stored in a table could be used for the same purpose,
1072 advisory locks are faster, avoid MVCC bloat, and are automatically
1073 cleaned up by the server at the end of the session.
1074 In certain cases using this advisory locking method, especially in queries
1075 involving explicit ordering and <literal>LIMIT</> clauses, care must be
1076 taken to control the locks acquired because of the order in which SQL
1077 expressions are evaluated. For example:
1078 <screen>
1079 SELECT pg_advisory_lock(id) FROM foo WHERE id = 12345; -- ok
1080 SELECT pg_advisory_lock(id) FROM foo WHERE id &gt; 12345 LIMIT 100; -- danger!
1081 SELECT pg_advisory_lock(q.id) FROM
1083 SELECT id FROM foo WHERE id &gt; 12345 LIMIT 100;
1084 ) q; -- ok
1085 </screen>
1086 In the above queries, the second form is dangerous because the
1087 <literal>LIMIT</> is not guaranteed to be applied before the locking
1088 function is executed. This might cause some locks to be acquired
1089 that the application was not expecting, and hence would fail to release
1090 (until it ends the session).
1091 From the point of view of the application, such locks
1092 would be dangling, although still viewable in
1093 <structname>pg_locks</structname>.
1094 </para>
1096 <para>
1097 The functions provided to manipulate advisory locks are described in
1098 <xref linkend="functions-advisory-locks">.
1099 </para>
1100 </sect2>
1102 </sect1>
1104 <sect1 id="applevel-consistency">
1105 <title>Data Consistency Checks at the Application Level</title>
1107 <para>
1108 Because readers in <productname>PostgreSQL</productname>
1109 do not lock data, regardless of
1110 transaction isolation level, data read by one transaction can be
1111 overwritten by another concurrent transaction. In other words,
1112 if a row is returned by <command>SELECT</command> it doesn't mean that
1113 the row is still current at the instant it is returned (i.e., sometime
1114 after the current query began). The row might have been modified or
1115 deleted by an already-committed transaction that committed after
1116 the <command>SELECT</command> started.
1117 Even if the row is still valid <quote>now</>, it could be changed or
1118 deleted
1119 before the current transaction does a commit or rollback.
1120 </para>
1122 <para>
1123 Another way to think about it is that each
1124 transaction sees a snapshot of the database contents, and concurrently
1125 executing transactions might very well see different snapshots. So the
1126 whole concept of <quote>now</quote> is somewhat ill-defined anyway.
1127 This is not normally
1128 a big problem if the client applications are isolated from each other,
1129 but if the clients can communicate via channels outside the database
1130 then serious confusion might ensue.
1131 </para>
1133 <para>
1134 To ensure the current validity of a row and protect it against
1135 concurrent updates one must use <command>SELECT FOR UPDATE</command>,
1136 <command>SELECT FOR SHARE</command>, or an appropriate <command>LOCK
1137 TABLE</command> statement. (<command>SELECT FOR UPDATE</command>
1138 and <command>SELECT FOR SHARE</command> lock just the
1139 returned rows against concurrent updates, while <command>LOCK
1140 TABLE</command> locks the whole table.) This should be taken into
1141 account when porting applications to
1142 <productname>PostgreSQL</productname> from other environments.
1143 </para>
1145 <para>
1146 Global validity checks require extra thought under <acronym>MVCC</acronym>.
1147 For example, a banking application might wish to check that the sum of
1148 all credits in one table equals the sum of debits in another table,
1149 when both tables are being actively updated. Comparing the results of two
1150 successive <literal>SELECT sum(...)</literal> commands will not work reliably in
1151 Read Committed mode, since the second query will likely include the results
1152 of transactions not counted by the first. Doing the two sums in a
1153 single serializable transaction will give an accurate picture of only the
1154 effects of transactions that committed before the serializable transaction
1155 started &mdash; but one might legitimately wonder whether the answer is still
1156 relevant by the time it is delivered. If the serializable transaction
1157 itself applied some changes before trying to make the consistency check,
1158 the usefulness of the check becomes even more debatable, since now it
1159 includes some but not all post-transaction-start changes. In such cases
1160 a careful person might wish to lock all tables needed for the check,
1161 in order to get an indisputable picture of current reality. A
1162 <literal>SHARE</> mode (or higher) lock guarantees that there are no
1163 uncommitted changes in the locked table, other than those of the current
1164 transaction.
1165 </para>
1167 <para>
1168 Note also that if one is relying on explicit locking to prevent concurrent
1169 changes, one should either use Read Committed mode, or in Serializable
1170 mode be careful to obtain
1171 locks before performing queries. A lock obtained by a
1172 serializable transaction guarantees that no other transactions modifying
1173 the table are still running, but if the snapshot seen by the
1174 transaction predates obtaining the lock, it might predate some now-committed
1175 changes in the table. A serializable transaction's snapshot is actually
1176 frozen at the start of its first query or data-modification command
1177 (<literal>SELECT</>, <literal>INSERT</>,
1178 <literal>UPDATE</>, or <literal>DELETE</>), so
1179 it is possible to obtain locks explicitly before the snapshot is
1180 frozen.
1181 </para>
1182 </sect1>
1184 <sect1 id="locking-indexes">
1185 <title>Locking and Indexes</title>
1187 <indexterm zone="locking-indexes">
1188 <primary>index</primary>
1189 <secondary>locks</secondary>
1190 </indexterm>
1192 <para>
1193 Though <productname>PostgreSQL</productname>
1194 provides nonblocking read/write access to table
1195 data, nonblocking read/write access is not currently offered for every
1196 index access method implemented
1197 in <productname>PostgreSQL</productname>.
1198 The various index types are handled as follows:
1200 <variablelist>
1201 <varlistentry>
1202 <term>
1203 B-tree and <acronym>GiST</acronym> indexes
1204 </term>
1205 <listitem>
1206 <para>
1207 Short-term share/exclusive page-level locks are used for
1208 read/write access. Locks are released immediately after each
1209 index row is fetched or inserted. These index types provide
1210 the highest concurrency without deadlock conditions.
1211 </para>
1212 </listitem>
1213 </varlistentry>
1215 <varlistentry>
1216 <term>
1217 Hash indexes
1218 </term>
1219 <listitem>
1220 <para>
1221 Share/exclusive hash-bucket-level locks are used for read/write
1222 access. Locks are released after the whole bucket is processed.
1223 Bucket-level locks provide better concurrency than index-level
1224 ones, but deadlock is possible since the locks are held longer
1225 than one index operation.
1226 </para>
1227 </listitem>
1228 </varlistentry>
1230 <varlistentry>
1231 <term>
1232 <acronym>GIN</acronym> indexes
1233 </term>
1234 <listitem>
1235 <para>
1236 Short-term share/exclusive page-level locks are used for
1237 read/write access. Locks are released immediately after each
1238 index row is fetched or inserted. But note that insertion of a
1239 GIN-indexed value usually produces several index key insertions
1240 per row, so GIN might do substantial work for a single value's
1241 insertion.
1242 </para>
1243 </listitem>
1244 </varlistentry>
1245 </variablelist>
1246 </para>
1248 <para>
1249 Currently, B-tree indexes offer the best performance for concurrent
1250 applications; since they also have more features than hash
1251 indexes, they are the recommended index type for concurrent
1252 applications that need to index scalar data. When dealing with
1253 non-scalar data, B-trees are not useful, and GiST or GIN indexes should
1254 be used instead.
1255 </para>
1256 </sect1>
1257 </chapter>