1 <!-- doc/src/sgml/amcheck.sgml -->
3 <sect1 id=
"amcheck" xreflabel=
"amcheck">
4 <title>amcheck
— tools to verify table and index consistency
</title>
6 <indexterm zone=
"amcheck">
7 <primary>amcheck
</primary>
11 The
<filename>amcheck
</filename> module provides functions that allow you to
12 verify the logical consistency of the structure of relations.
16 The B-Tree checking functions verify various
<emphasis>invariants
</emphasis> in the
17 structure of the representation of particular relations. The
18 correctness of the access method functions behind index scans and
19 other important operations relies on these invariants always
20 holding. For example, certain functions verify, among other things,
21 that all B-Tree pages have items in
<quote>logical
</quote> order (e.g.,
22 for B-Tree indexes on
<type>text
</type>, index tuples should be in
23 collated lexical order). If that particular invariant somehow fails
24 to hold, we can expect binary searches on the affected page to
25 incorrectly guide index scans, resulting in wrong answers to SQL
26 queries. If the structure appears to be valid, no error is raised.
27 While these checking functions are run, the
<xref
28 linkend=
"guc-search-path"/> is temporarily changed to
<literal>pg_catalog,
32 Verification is performed using the same procedures as those used by
33 index scans themselves, which may be user-defined operator class
34 code. For example, B-Tree index verification relies on comparisons
35 made with one or more B-Tree support function
1 routines. See
<xref
36 linkend=
"xindex-support"/> for details of operator class support
40 Unlike the B-Tree checking functions which report corruption by raising
41 errors, the heap checking function
<function>verify_heapam
</function> checks
42 a table and attempts to return a set of rows, one row per corruption
43 detected. Despite this, if facilities that
44 <function>verify_heapam
</function> relies upon are themselves corrupted, the
45 function may be unable to continue and may instead raise an error.
48 Permission to execute
<filename>amcheck
</filename> functions may be granted
49 to non-superusers, but before granting such permissions careful consideration
50 should be given to data security and privacy concerns. Although the
51 corruption reports generated by these functions do not focus on the contents
52 of the corrupted data so much as on the structure of that data and the nature
53 of the corruptions found, an attacker who gains permission to execute these
54 functions, particularly if the attacker can also induce corruption, might be
55 able to infer something of the data itself from such messages.
58 <sect2 id=
"amcheck-functions">
59 <title>Functions
</title>
64 <function>bt_index_check(index regclass, heapallindexed boolean, checkunique boolean) returns void
</function>
66 <primary>bt_index_check
</primary>
72 <function>bt_index_check
</function> tests that its target, a
73 B-Tree index, respects a variety of invariants. Example usage:
75 test=# SELECT bt_index_check(index =
> c.oid, heapallindexed =
> i.indisunique),
79 JOIN pg_opclass op ON i.indclass[
0] = op.oid
80 JOIN pg_am am ON op.opcmethod = am.oid
81 JOIN pg_class c ON i.indexrelid = c.oid
82 JOIN pg_namespace n ON c.relnamespace = n.oid
83 WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog'
84 -- Don't check temp tables, which may be from another session:
85 AND c.relpersistence != 't'
86 -- Function may throw an error when this is omitted:
87 AND c.relkind = 'i' AND i.indisready AND i.indisvalid
88 ORDER BY c.relpages DESC LIMIT
10;
89 bt_index_check | relname | relpages
90 ----------------+---------------------------------+----------
91 | pg_depend_reference_index |
43
92 | pg_depend_depender_index |
40
93 | pg_proc_proname_args_nsp_index |
31
94 | pg_description_o_c_o_index |
21
95 | pg_attribute_relid_attnam_index |
14
96 | pg_proc_oid_index |
10
97 | pg_attribute_relid_attnum_index |
9
98 | pg_amproc_fam_proc_index |
5
99 | pg_amop_opr_fam_index |
5
100 | pg_amop_fam_strat_index |
5
103 This example shows a session that performs verification of the
104 10 largest catalog indexes in the database
<quote>test
</quote>.
105 Verification of the presence of heap tuples as index tuples is
106 requested for the subset that are unique indexes. Since no
107 error is raised, all indexes tested appear to be logically
108 consistent. Naturally, this query could easily be changed to
109 call
<function>bt_index_check
</function> for every index in the
110 database where verification is supported.
113 <function>bt_index_check
</function> acquires an
<literal>AccessShareLock
</literal>
114 on the target index and the heap relation it belongs to. This lock mode
115 is the same lock mode acquired on relations by simple
116 <literal>SELECT
</literal> statements.
117 <function>bt_index_check
</function> does not verify invariants
118 that span child/parent relationships, but will verify the
119 presence of all heap tuples as index tuples within the index
120 when
<parameter>heapallindexed
</parameter> is
121 <literal>true
</literal>. When
<parameter>checkunique
</parameter>
122 is
<literal>true
</literal> <function>bt_index_check
</function> will
123 check that no more than one among duplicate entries in unique
124 index is visible. When a routine, lightweight test for
125 corruption is required in a live production environment, using
126 <function>bt_index_check
</function> often provides the best
127 trade-off between thoroughness of verification and limiting the
128 impact on application performance and availability.
135 <function>bt_index_parent_check(index regclass, heapallindexed boolean, rootdescend boolean, checkunique boolean) returns void
</function>
137 <primary>bt_index_parent_check
</primary>
143 <function>bt_index_parent_check
</function> tests that its
144 target, a B-Tree index, respects a variety of invariants.
145 Optionally, when the
<parameter>heapallindexed
</parameter>
146 argument is
<literal>true
</literal>, the function verifies the
147 presence of all heap tuples that should be found within the
148 index. When
<parameter>checkunique
</parameter>
149 is
<literal>true
</literal> <function>bt_index_parent_check
</function> will
150 check that no more than one among duplicate entries in unique
151 index is visible. When the optional
<parameter>rootdescend
</parameter>
152 argument is
<literal>true
</literal>, verification re-finds
153 tuples on the leaf level by performing a new search from the
154 root page for each tuple. The checks that can be performed by
155 <function>bt_index_parent_check
</function> are a superset of the
156 checks that can be performed by
<function>bt_index_check
</function>.
157 <function>bt_index_parent_check
</function> can be thought of as
158 a more thorough variant of
<function>bt_index_check
</function>:
159 unlike
<function>bt_index_check
</function>,
160 <function>bt_index_parent_check
</function> also checks
161 invariants that span parent/child relationships, including checking
162 that there are no missing downlinks in the index structure.
163 <function>bt_index_parent_check
</function> follows the general
164 convention of raising an error if it finds a logical
165 inconsistency or other problem.
168 A
<literal>ShareLock
</literal> is required on the target index by
169 <function>bt_index_parent_check
</function> (a
170 <literal>ShareLock
</literal> is also acquired on the heap relation).
171 These locks prevent concurrent data modification from
172 <command>INSERT
</command>,
<command>UPDATE
</command>, and
<command>DELETE
</command>
173 commands. The locks also prevent the underlying relation from
174 being concurrently processed by
<command>VACUUM
</command>, as well as
175 all other utility commands. Note that the function holds locks
176 only while running, not for the entire transaction.
179 <function>bt_index_parent_check
</function>'s additional
180 verification is more likely to detect various pathological
181 cases. These cases may involve an incorrectly implemented
182 B-Tree operator class used by the index that is checked, or,
183 hypothetically, undiscovered bugs in the underlying B-Tree index
184 access method code. Note that
185 <function>bt_index_parent_check
</function> cannot be used when
186 hot standby mode is enabled (i.e., on read-only physical
187 replicas), unlike
<function>bt_index_check
</function>.
194 <function>bt_index_check
</function> and
195 <function>bt_index_parent_check
</function> both output log
196 messages about the verification process at
197 <literal>DEBUG1
</literal> and
<literal>DEBUG2
</literal> severity
198 levels. These messages provide detailed information about the
199 verification process that may be of interest to
200 <productname>PostgreSQL
</productname> developers. Advanced users
201 may also find this information helpful, since it provides
202 additional context should verification actually detect an
203 inconsistency. Running:
205 SET client_min_messages = DEBUG1;
207 in an interactive
<application>psql
</application> session before
208 running a verification query will display messages about the
209 progress of verification with a manageable level of detail.
217 verify_heapam(relation regclass,
218 on_error_stop boolean,
232 Checks a table, sequence, or materialized view for structural corruption,
233 where pages in the relation contain data that is invalidly formatted, and
234 for logical corruption, where pages are structurally valid but
235 inconsistent with the rest of the database cluster.
238 The following optional arguments are recognized:
242 <term><literal>on_error_stop
</literal></term>
245 If true, corruption checking stops at the end of the first block in
246 which any corruptions are found.
254 <term><literal>check_toast
</literal></term>
257 If true, toasted values are checked against the target relation's
261 This option is known to be slow. Also, if the toast table or its
262 index is corrupt, checking it against toast values could conceivably
263 crash the server, although in many cases this would just produce an
272 <term><literal>skip
</literal></term>
275 If not
<literal>none
</literal>, corruption checking skips blocks that
276 are marked as all-visible or all-frozen, as specified.
277 Valid options are
<literal>all-visible
</literal>,
278 <literal>all-frozen
</literal> and
<literal>none
</literal>.
281 Defaults to
<literal>none
</literal>.
286 <term><literal>startblock
</literal></term>
289 If specified, corruption checking begins at the specified block,
290 skipping all previous blocks. It is an error to specify a
291 <parameter>startblock
</parameter> outside the range of blocks in the
295 By default, checking begins at the first block.
300 <term><literal>endblock
</literal></term>
303 If specified, corruption checking ends at the specified block,
304 skipping all remaining blocks. It is an error to specify an
305 <parameter>endblock
</parameter> outside the range of blocks in the target
309 By default, all blocks are checked.
315 For each corruption detected,
<function>verify_heapam
</function> returns
316 a row with the following columns:
320 <term><literal>blkno
</literal></term>
323 The number of the block containing the corrupt page.
328 <term><literal>offnum
</literal></term>
331 The OffsetNumber of the corrupt tuple.
336 <term><literal>attnum
</literal></term>
339 The attribute number of the corrupt column in the tuple, if the
340 corruption is specific to a column and not the tuple as a whole.
345 <term><literal>msg
</literal></term>
348 A message describing the problem detected.
358 <sect2 id=
"amcheck-optional-heapallindexed-verification">
359 <title>Optional
<parameter>heapallindexed
</parameter> Verification
</title>
361 When the
<parameter>heapallindexed
</parameter> argument to B-Tree
362 verification functions is
<literal>true
</literal>, an additional
363 phase of verification is performed against the table associated with
364 the target index relation. This consists of a
<quote>dummy
</quote>
365 <command>CREATE INDEX
</command> operation, which checks for the
366 presence of all hypothetical new index tuples against a temporary,
367 in-memory summarizing structure (this is built when needed during
368 the basic first phase of verification). The summarizing structure
369 <quote>fingerprints
</quote> every tuple found within the target
370 index. The high level principle behind
371 <parameter>heapallindexed
</parameter> verification is that a new
372 index that is equivalent to the existing, target index must only
373 have entries that can be found in the existing structure.
376 The additional
<parameter>heapallindexed
</parameter> phase adds
377 significant overhead: verification will typically take several times
378 longer. However, there is no change to the relation-level locks
379 acquired when
<parameter>heapallindexed
</parameter> verification is
383 The summarizing structure is bound in size by
384 <varname>maintenance_work_mem
</varname>. In order to ensure that
385 there is no more than a
2% probability of failure to detect an
386 inconsistency for each heap tuple that should be represented in the
387 index, approximately
2 bytes of memory are needed per tuple. As
388 less memory is made available per tuple, the probability of missing
389 an inconsistency slowly increases. This approach limits the
390 overhead of verification significantly, while only slightly reducing
391 the probability of detecting a problem, especially for installations
392 where verification is treated as a routine maintenance task. Any
393 single absent or malformed tuple has a new opportunity to be
394 detected with each new verification attempt.
399 <sect2 id=
"amcheck-using-amcheck-effectively">
400 <title>Using
<filename>amcheck
</filename> Effectively
</title>
403 <filename>amcheck
</filename> can be effective at detecting various types of
404 failure modes that
<link
405 linkend=
"app-initdb-data-checksums"><application>data
406 checksums
</application></link> will fail to catch. These include:
411 Structural inconsistencies caused by incorrect operator class
415 This includes issues caused by the comparison rules of operating
416 system collations changing. Comparisons of datums of a collatable
417 type like
<type>text
</type> must be immutable (just as all
418 comparisons used for B-Tree index scans must be immutable), which
419 implies that operating system collation rules must never change.
420 Though rare, updates to operating system collation rules can
421 cause these issues. More commonly, an inconsistency in the
422 collation order between a primary server and a standby server is
423 implicated, possibly because the
<emphasis>major
</emphasis> operating
424 system version in use is inconsistent. Such inconsistencies will
425 generally only arise on standby servers, and so can generally
426 only be detected on standby servers.
429 If a problem like this arises, it may not affect each individual
430 index that is ordered using an affected collation, simply because
431 <emphasis>indexed
</emphasis> values might happen to have the same
432 absolute ordering regardless of the behavioral inconsistency. See
433 <xref linkend=
"locale"/> and
<xref linkend=
"collation"/> for
434 further details about how
<productname>PostgreSQL
</productname> uses
435 operating system locales and collations.
440 Structural inconsistencies between indexes and the heap relations
441 that are indexed (when
<parameter>heapallindexed
</parameter>
442 verification is performed).
445 There is no cross-checking of indexes against their heap relation
446 during normal operation. Symptoms of heap corruption can be subtle.
451 Corruption caused by hypothetical undiscovered bugs in the
452 underlying
<productname>PostgreSQL
</productname> access method
453 code, sort code, or transaction management code.
456 Automatic verification of the structural integrity of indexes
457 plays a role in the general testing of new or proposed
458 <productname>PostgreSQL
</productname> features that could plausibly allow a
459 logical inconsistency to be introduced. Verification of table
460 structure and associated visibility and transaction status
461 information plays a similar role. One obvious testing strategy
462 is to call
<filename>amcheck
</filename> functions continuously
463 when running the standard regression tests. See
<xref
464 linkend=
"regress-run"/> for details on running the tests.
469 File system or storage subsystem faults where checksums happen to
470 simply not be enabled.
473 Note that
<filename>amcheck
</filename> examines a page as represented in some
474 shared memory buffer at the time of verification if there is only a
475 shared buffer hit when accessing the block. Consequently,
476 <filename>amcheck
</filename> does not necessarily examine data read from the
477 file system at the time of verification. Note that when checksums are
478 enabled,
<filename>amcheck
</filename> may raise an error due to a checksum
479 failure when a corrupt block is read into a buffer.
484 Corruption caused by faulty RAM, or the broader memory subsystem.
487 <productname>PostgreSQL
</productname> does not protect against correctable
488 memory errors and it is assumed you will operate using RAM that
489 uses industry standard Error Correcting Codes (ECC) or better
490 protection. However, ECC memory is typically only immune to
491 single-bit errors, and should not be assumed to provide
492 <emphasis>absolute
</emphasis> protection against failures that
493 result in memory corruption.
496 When
<parameter>heapallindexed
</parameter> verification is
497 performed, there is generally a greatly increased chance of
498 detecting single-bit errors, since strict binary equality is
499 tested, and the indexed attributes within the heap are tested.
506 Structural corruption can happen due to faulty storage hardware, or
507 relation files being overwritten or modified by unrelated software.
508 This kind of corruption can also be detected with
509 <link linkend=
"checksums"><application>data page
510 checksums
</application></link>.
514 Relation pages which are correctly formatted, internally consistent, and
515 correct relative to their own internal checksums may still contain
516 logical corruption. As such, this kind of corruption cannot be detected
517 with
<application>checksums
</application>. Examples include toasted
518 values in the main table which lack a corresponding entry in the toast
519 table, and tuples in the main table with a Transaction ID that is older
520 than the oldest valid Transaction ID in the database or cluster.
524 Multiple causes of logical corruption have been observed in production
525 systems, including bugs in the
<productname>PostgreSQL
</productname>
526 server software, faulty and ill-conceived backup and restore tools, and
531 Corrupt relations are most concerning in live production environments,
532 precisely the same environments where high risk activities are least
533 welcome. For this reason,
<function>verify_heapam
</function> has been
534 designed to diagnose corruption without undue risk. It cannot guard
535 against all causes of backend crashes, as even executing the calling
536 query could be unsafe on a badly corrupted system. Access to
<link
537 linkend=
"catalogs-overview">catalog tables
</link> is performed and could
538 be problematic if the catalogs themselves are corrupted.
542 In general,
<filename>amcheck
</filename> can only prove the presence of
543 corruption; it cannot prove its absence.
547 <sect2 id=
"amcheck-repairing-corruption">
548 <title>Repairing Corruption
</title>
550 No error concerning corruption raised by
<filename>amcheck
</filename> should
551 ever be a false positive.
<filename>amcheck
</filename> raises
552 errors in the event of conditions that, by definition, should never
553 happen, and so careful analysis of
<filename>amcheck
</filename>
554 errors is often required.
557 There is no general method of repairing problems that
558 <filename>amcheck
</filename> detects. An explanation for the root cause of
559 an invariant violation should be sought.
<xref
560 linkend=
"pageinspect"/> may play a useful role in diagnosing
561 corruption that
<filename>amcheck
</filename> detects. A
<command>REINDEX
</command>
562 may not be effective in repairing corruption.