doc/src/sgml/maintenance.sgml

   1 <!-- doc/src/sgml/maintenance.sgml -->
   2
   3 <chapter id="maintenance">
   4  <title>Routine Database Maintenance Tasks</title>
   5
   6  <indexterm zone="maintenance">
   7   <primary>maintenance</primary>
   8  </indexterm>
   9
  10  <indexterm zone="maintenance">
  11   <primary>routine maintenance</primary>
  12  </indexterm>
  13
  14   <para>
  15    <productname>PostgreSQL</productname>, like any database software, requires that certain tasks
  16    be performed regularly to achieve optimum performance. The tasks
  17    discussed here are <emphasis>required</emphasis>, but they
  18    are repetitive in nature and can easily be automated using standard
  19    tools such as <application>cron</application> scripts or
  20    Windows' <application>Task Scheduler</application>.  It is the database
  21    administrator's responsibility to set up appropriate scripts, and to
  22    check that they execute successfully.
  23   </para>
  24
  25   <para>
  26    One obvious maintenance task is the creation of backup copies of the data on a
  27    regular schedule.  Without a recent backup, you have no chance of recovery
  28    after a catastrophe (disk failure, fire, mistakenly dropping a critical
  29    table, etc.).  The backup and recovery mechanisms available in
  30    <productname>PostgreSQL</productname> are discussed at length in
  31    <xref linkend="backup"/>.
  32   </para>
  33
  34   <para>
  35    The other main category of maintenance task is periodic <quote>vacuuming</quote>
  36    of the database.  This activity is discussed in
  37    <xref linkend="routine-vacuuming"/>.  Closely related to this is updating
  38    the statistics that will be used by the query planner, as discussed in
  39    <xref linkend="vacuum-for-statistics"/>.
  40   </para>
  41
  42   <para>
  43    Another task that might need periodic attention is log file management.
  44    This is discussed in <xref linkend="logfile-maintenance"/>.
  45   </para>
  46
  47   <para>
  48    <ulink
  49    url="https://bucardo.org/check_postgres/"><application>check_postgres</application></ulink>
  50    is available for monitoring database health and reporting unusual
  51    conditions.  <application>check_postgres</application> integrates with
  52    Nagios and MRTG, but can be run standalone too.
  53   </para>
  54
  55   <para>
  56    <productname>PostgreSQL</productname> is low-maintenance compared
  57    to some other database management systems.  Nonetheless,
  58    appropriate attention to these tasks will go far towards ensuring a
  59    pleasant and productive experience with the system.
  60   </para>
  61
  62  <sect1 id="routine-vacuuming">
  63   <title>Routine Vacuuming</title>
  64
  65   <indexterm zone="routine-vacuuming">
  66    <primary>vacuum</primary>
  67   </indexterm>
  68
  69   <para>
  70    <productname>PostgreSQL</productname> databases require periodic
  71    maintenance known as <firstterm>vacuuming</firstterm>.  For many installations, it
  72    is sufficient to let vacuuming be performed by the <firstterm>autovacuum
  73    daemon</firstterm>, which is described in <xref linkend="autovacuum"/>.  You might
  74    need to adjust the autovacuuming parameters described there to obtain best
  75    results for your situation.  Some database administrators will want to
  76    supplement or replace the daemon's activities with manually-managed
  77    <command>VACUUM</command> commands, which typically are executed according to a
  78    schedule by <application>cron</application> or <application>Task
  79    Scheduler</application> scripts.  To set up manually-managed vacuuming properly,
  80    it is essential to understand the issues discussed in the next few
  81    subsections.  Administrators who rely on autovacuuming may still wish
  82    to skim this material to help them understand and adjust autovacuuming.
  83   </para>
  84
  85   <sect2 id="vacuum-basics">
  86    <title>Vacuuming Basics</title>
  87
  88    <para>
  89     <productname>PostgreSQL</productname>'s
  90     <link linkend="sql-vacuum"><command>VACUUM</command></link> command has to
  91     process each table on a regular basis for several reasons:
  92
  93     <orderedlist>
  94      <listitem>
  95       <simpara>To recover or reuse disk space occupied by updated or deleted
  96       rows.</simpara>
  97      </listitem>
  98
  99      <listitem>
 100       <simpara>To update data statistics used by the
 101       <productname>PostgreSQL</productname> query planner.</simpara>
 102      </listitem>
 103
 104      <listitem>
 105       <simpara>To update the visibility map, which speeds
 106       up <link linkend="indexes-index-only-scans">index-only
 107       scans</link>.</simpara>
 108      </listitem>
 109
 110      <listitem>
 111       <simpara>To protect against loss of very old data due to
 112       <firstterm>transaction ID wraparound</firstterm> or
 113       <firstterm>multixact ID wraparound</firstterm>.</simpara>
 114      </listitem>
 115     </orderedlist>
 116
 117     Each of these reasons dictates performing <command>VACUUM</command> operations
 118     of varying frequency and scope, as explained in the following subsections.
 119    </para>
 120
 121    <para>
 122     There are two variants of <command>VACUUM</command>: standard <command>VACUUM</command>
 123     and <command>VACUUM FULL</command>.  <command>VACUUM FULL</command> can reclaim more
 124     disk space but runs much more slowly.  Also,
 125     the standard form of <command>VACUUM</command> can run in parallel with production
 126     database operations.  (Commands such as <command>SELECT</command>,
 127     <command>INSERT</command>, <command>UPDATE</command>, and
 128     <command>DELETE</command> will continue to function normally, though you
 129     will not be able to modify the definition of a table with commands such as
 130     <command>ALTER TABLE</command> while it is being vacuumed.)
 131     <command>VACUUM FULL</command> requires an
 132     <literal>ACCESS EXCLUSIVE</literal> lock on the table it is
 133     working on, and therefore cannot be done in parallel with other use
 134     of the table.  Generally, therefore,
 135     administrators should strive to use standard <command>VACUUM</command> and
 136     avoid <command>VACUUM FULL</command>.
 137    </para>
 138
 139    <para>
 140     <command>VACUUM</command> creates a substantial amount of I/O
 141     traffic, which can cause poor performance for other active sessions.
 142     There are configuration parameters that can be adjusted to reduce the
 143     performance impact of background vacuuming &mdash; see
 144     <xref linkend="runtime-config-resource-vacuum-cost"/>.
 145    </para>
 146   </sect2>
 147
 148   <sect2 id="vacuum-for-space-recovery">
 149    <title>Recovering Disk Space</title>
 150
 151    <indexterm zone="vacuum-for-space-recovery">
 152     <primary>disk space</primary>
 153    </indexterm>
 154
 155    <para>
 156     In <productname>PostgreSQL</productname>, an
 157     <command>UPDATE</command> or <command>DELETE</command> of a row does not
 158     immediately remove the old version of the row.
 159     This approach is necessary to gain the benefits of multiversion
 160     concurrency control (<acronym>MVCC</acronym>, see <xref linkend="mvcc"/>): the row version
 161     must not be deleted while it is still potentially visible to other
 162     transactions. But eventually, an outdated or deleted row version is no
 163     longer of interest to any transaction. The space it occupies must then be
 164     reclaimed for reuse by new rows, to avoid unbounded growth of disk
 165     space requirements. This is done by running <command>VACUUM</command>.
 166    </para>
 167
 168    <para>
 169     The standard form of <command>VACUUM</command> removes dead row
 170     versions in tables and indexes and marks the space available for
 171     future reuse.  However, it will not return the space to the operating
 172     system, except in the special case where one or more pages at the
 173     end of a table become entirely free and an exclusive table lock can be
 174     easily obtained.  In contrast, <command>VACUUM FULL</command> actively compacts
 175     tables by writing a complete new version of the table file with no dead
 176     space.  This minimizes the size of the table, but can take a long time.
 177     It also requires extra disk space for the new copy of the table, until
 178     the operation completes.
 179    </para>
 180
 181    <para>
 182     The usual goal of routine vacuuming is to do standard <command>VACUUM</command>s
 183     often enough to avoid needing <command>VACUUM FULL</command>.  The
 184     autovacuum daemon attempts to work this way, and in fact will
 185     never issue <command>VACUUM FULL</command>.  In this approach, the idea
 186     is not to keep tables at their minimum size, but to maintain steady-state
 187     usage of disk space: each table occupies space equivalent to its
 188     minimum size plus however much space gets used up between vacuum runs.
 189     Although <command>VACUUM FULL</command> can be used to shrink a table back
 190     to its minimum size and return the disk space to the operating system,
 191     there is not much point in this if the table will just grow again in the
 192     future.  Thus, moderately-frequent standard <command>VACUUM</command> runs are a
 193     better approach than infrequent <command>VACUUM FULL</command> runs for
 194     maintaining heavily-updated tables.
 195    </para>
 196
 197    <para>
 198     Some administrators prefer to schedule vacuuming themselves, for example
 199     doing all the work at night when load is low.
 200     The difficulty with doing vacuuming according to a fixed schedule
 201     is that if a table has an unexpected spike in update activity, it may
 202     get bloated to the point that <command>VACUUM FULL</command> is really necessary
 203     to reclaim space.  Using the autovacuum daemon alleviates this problem,
 204     since the daemon schedules vacuuming dynamically in response to update
 205     activity.  It is unwise to disable the daemon completely unless you
 206     have an extremely predictable workload.  One possible compromise is
 207     to set the daemon's parameters so that it will only react to unusually
 208     heavy update activity, thus keeping things from getting out of hand,
 209     while scheduled <command>VACUUM</command>s are expected to do the bulk of the
 210     work when the load is typical.
 211    </para>
 212
 213    <para>
 214     For those not using autovacuum, a typical approach is to schedule a
 215     database-wide <command>VACUUM</command> once a day during a low-usage period,
 216     supplemented by more frequent vacuuming of heavily-updated tables as
 217     necessary. (Some installations with extremely high update rates vacuum
 218     their busiest tables as often as once every few minutes.) If you have
 219     multiple databases in a cluster, don't forget to
 220     <command>VACUUM</command> each one; the program <xref
 221     linkend="app-vacuumdb"/> might be helpful.
 222    </para>
 223
 224    <tip>
 225    <para>
 226     Plain <command>VACUUM</command> may not be satisfactory when
 227     a table contains large numbers of dead row versions as a result of
 228     massive update or delete activity.  If you have such a table and
 229     you need to reclaim the excess disk space it occupies, you will need
 230     to use <command>VACUUM FULL</command>, or alternatively
 231     <link linkend="sql-cluster"><command>CLUSTER</command></link>
 232     or one of the table-rewriting variants of
 233     <link linkend="sql-altertable"><command>ALTER TABLE</command></link>.
 234     These commands rewrite an entire new copy of the table and build
 235     new indexes for it.  All these options require an
 236     <literal>ACCESS EXCLUSIVE</literal> lock.  Note that
 237     they also temporarily use extra disk space approximately equal to the size
 238     of the table, since the old copies of the table and indexes can't be
 239     released until the new ones are complete.
 240    </para>
 241    </tip>
 242
 243    <tip>
 244    <para>
 245     If you have a table whose entire contents are deleted on a periodic
 246     basis, consider doing it with
 247     <link linkend="sql-truncate"><command>TRUNCATE</command></link> rather
 248     than using <command>DELETE</command> followed by
 249     <command>VACUUM</command>. <command>TRUNCATE</command> removes the
 250     entire content of the table immediately, without requiring a
 251     subsequent <command>VACUUM</command> or <command>VACUUM
 252     FULL</command> to reclaim the now-unused disk space.
 253     The disadvantage is that strict MVCC semantics are violated.
 254    </para>
 255    </tip>
 256   </sect2>
 257
 258   <sect2 id="vacuum-for-statistics">
 259    <title>Updating Planner Statistics</title>
 260
 261    <indexterm zone="vacuum-for-statistics">
 262     <primary>statistics</primary>
 263     <secondary>of the planner</secondary>
 264    </indexterm>
 265
 266    <indexterm zone="vacuum-for-statistics">
 267     <primary>ANALYZE</primary>
 268    </indexterm>
 269
 270    <para>
 271     The <productname>PostgreSQL</productname> query planner relies on
 272     statistical information about the contents of tables in order to
 273     generate good plans for queries.  These statistics are gathered by
 274     the <link linkend="sql-analyze"><command>ANALYZE</command></link> command,
 275     which can be invoked by itself or
 276     as an optional step in <command>VACUUM</command>.  It is important to have
 277     reasonably accurate statistics, otherwise poor choices of plans might
 278     degrade database performance.
 279    </para>
 280
 281    <para>
 282     The autovacuum daemon, if enabled, will automatically issue
 283     <command>ANALYZE</command> commands whenever the content of a table has
 284     changed sufficiently.  However, administrators might prefer to rely
 285     on manually-scheduled <command>ANALYZE</command> operations, particularly
 286     if it is known that update activity on a table will not affect the
 287     statistics of <quote>interesting</quote> columns.  The daemon schedules
 288     <command>ANALYZE</command> strictly as a function of the number of rows
 289     inserted or updated; it has no knowledge of whether that will lead
 290     to meaningful statistical changes.
 291    </para>
 292
 293    <para>
 294     Tuples changed in partitions and inheritance children do not trigger
 295     analyze on the parent table.  If the parent table is empty or rarely
 296     changed, it may never be processed by autovacuum, and the statistics for
 297     the inheritance tree as a whole won't be collected. It is necessary to
 298     run <command>ANALYZE</command> on the parent table manually in order to
 299     keep the statistics up to date.
 300    </para>
 301
 302    <para>
 303     As with vacuuming for space recovery, frequent updates of statistics
 304     are more useful for heavily-updated tables than for seldom-updated
 305     ones. But even for a heavily-updated table, there might be no need for
 306     statistics updates if the statistical distribution of the data is
 307     not changing much. A simple rule of thumb is to think about how much
 308     the minimum and maximum values of the columns in the table change.
 309     For example, a <type>timestamp</type> column that contains the time
 310     of row update will have a constantly-increasing maximum value as
 311     rows are added and updated; such a column will probably need more
 312     frequent statistics updates than, say, a column containing URLs for
 313     pages accessed on a website. The URL column might receive changes just
 314     as often, but the statistical distribution of its values probably
 315     changes relatively slowly.
 316    </para>
 317
 318    <para>
 319     It is possible to run <command>ANALYZE</command> on specific tables and even
 320     just specific columns of a table, so the flexibility exists to update some
 321     statistics more frequently than others if your application requires it.
 322     In practice, however, it is usually best to just analyze the entire
 323     database, because it is a fast operation.  <command>ANALYZE</command> uses a
 324     statistically random sampling of the rows of a table rather than reading
 325     every single row.
 326    </para>
 327
 328    <tip>
 329     <para>
 330      Although per-column tweaking of <command>ANALYZE</command> frequency might not be
 331      very productive, you might find it worthwhile to do per-column
 332      adjustment of the level of detail of the statistics collected by
 333      <command>ANALYZE</command>.  Columns that are heavily used in <literal>WHERE</literal>
 334      clauses and have highly irregular data distributions might require a
 335      finer-grain data histogram than other columns.  See <command>ALTER TABLE
 336      SET STATISTICS</command>, or change the database-wide default using the <xref
 337      linkend="guc-default-statistics-target"/> configuration parameter.
 338     </para>
 339
 340     <para>
 341      Also, by default there is limited information available about
 342      the selectivity of functions.  However, if you create a statistics
 343      object or an expression
 344      index that uses a function call, useful statistics will be
 345      gathered about the function, which can greatly improve query
 346      plans that use the expression index.
 347     </para>
 348    </tip>
 349
 350    <tip>
 351     <para>
 352      The autovacuum daemon does not issue <command>ANALYZE</command> commands for
 353      foreign tables, since it has no means of determining how often that
 354      might be useful.  If your queries require statistics on foreign tables
 355      for proper planning, it's a good idea to run manually-managed
 356      <command>ANALYZE</command> commands on those tables on a suitable schedule.
 357     </para>
 358    </tip>
 359
 360    <tip>
 361     <para>
 362      The autovacuum daemon does not issue <command>ANALYZE</command> commands
 363      for partitioned tables.  Inheritance parents will only be analyzed if the
 364      parent itself is changed - changes to child tables do not trigger
 365      autoanalyze on the parent table.  If your queries require statistics on
 366      parent tables for proper planning, it is necessary to periodically run
 367      a manual <command>ANALYZE</command> on those tables to keep the statistics
 368      up to date.
 369     </para>
 370    </tip>
 371
 372   </sect2>
 373
 374   <sect2 id="vacuum-for-visibility-map">
 375    <title>Updating the Visibility Map</title>
 376
 377    <para>
 378     Vacuum maintains a <link linkend="storage-vm">visibility map</link> for each
 379     table to keep track of which pages contain only tuples that are known to be
 380     visible to all active transactions (and all future transactions, until the
 381     page is again modified).  This has two purposes.  First, vacuum
 382     itself can skip such pages on the next run, since there is nothing to
 383     clean up.
 384    </para>
 385
 386    <para>
 387     Second, it allows <productname>PostgreSQL</productname> to answer some
 388     queries using only the index, without reference to the underlying table.
 389     Since <productname>PostgreSQL</productname> indexes don't contain tuple
 390     visibility information, a normal index scan fetches the heap tuple for each
 391     matching index entry, to check whether it should be seen by the current
 392     transaction.
 393     An <link linkend="indexes-index-only-scans"><firstterm>index-only
 394     scan</firstterm></link>, on the other hand, checks the visibility map first.
 395     If it's known that all tuples on the page are
 396     visible, the heap fetch can be skipped.  This is most useful on
 397     large data sets where the visibility map can prevent disk accesses.
 398     The visibility map is vastly smaller than the heap, so it can easily be
 399     cached even when the heap is very large.
 400    </para>
 401   </sect2>
 402
 403   <sect2 id="vacuum-for-wraparound">
 404    <title>Preventing Transaction ID Wraparound Failures</title>
 405
 406    <indexterm zone="vacuum-for-wraparound">
 407     <primary>transaction ID</primary>
 408     <secondary>wraparound</secondary>
 409    </indexterm>
 410
 411     <indexterm>
 412      <primary>wraparound</primary>
 413      <secondary>of transaction IDs</secondary>
 414     </indexterm>
 415
 416    <para>
 417     <productname>PostgreSQL</productname>'s
 418     <link linkend="mvcc-intro">MVCC</link> transaction semantics
 419     depend on being able to compare transaction ID (<acronym>XID</acronym>)
 420     numbers: a row version with an insertion XID greater than the current
 421     transaction's XID is <quote>in the future</quote> and should not be visible
 422     to the current transaction.  But since transaction IDs have limited size
 423     (32 bits) a cluster that runs for a long time (more
 424     than 4 billion transactions) would suffer <firstterm>transaction ID
 425     wraparound</firstterm>: the XID counter wraps around to zero, and all of a sudden
 426     transactions that were in the past appear to be in the future &mdash; which
 427     means their output become invisible.  In short, catastrophic data loss.
 428     (Actually the data is still there, but that's cold comfort if you cannot
 429     get at it.)  To avoid this, it is necessary to vacuum every table
 430     in every database at least once every two billion transactions.
 431    </para>
 432
 433    <para>
 434     The reason that periodic vacuuming solves the problem is that
 435     <command>VACUUM</command> will mark rows as <emphasis>frozen</emphasis>, indicating that
 436     they were inserted by a transaction that committed sufficiently far in
 437     the past that the effects of the inserting transaction are certain to be
 438     visible to all current and future transactions.
 439     Normal XIDs are
 440     compared using modulo-2<superscript>32</superscript> arithmetic. This means
 441     that for every normal XID, there are two billion XIDs that are
 442     <quote>older</quote> and two billion that are <quote>newer</quote>; another
 443     way to say it is that the normal XID space is circular with no
 444     endpoint. Therefore, once a row version has been created with a particular
 445     normal XID, the row version will appear to be <quote>in the past</quote> for
 446     the next two billion transactions, no matter which normal XID we are
 447     talking about. If the row version still exists after more than two billion
 448     transactions, it will suddenly appear to be in the future. To
 449     prevent this, <productname>PostgreSQL</productname> reserves a special XID,
 450     <literal>FrozenTransactionId</literal>, which does not follow the normal XID
 451     comparison rules and is always considered older
 452     than every normal XID.
 453     Frozen row versions are treated as if the inserting XID were
 454     <literal>FrozenTransactionId</literal>, so that they will appear to be
 455     <quote>in the past</quote> to all normal transactions regardless of wraparound
 456     issues, and so such row versions will be valid until deleted, no matter
 457     how long that is.
 458    </para>
 459
 460    <note>
 461     <para>
 462      In <productname>PostgreSQL</productname> versions before 9.4, freezing was
 463      implemented by actually replacing a row's insertion XID
 464      with <literal>FrozenTransactionId</literal>, which was visible in the
 465      row's <structname>xmin</structname> system column.  Newer versions just set a flag
 466      bit, preserving the row's original <structname>xmin</structname> for possible
 467      forensic use.  However, rows with <structname>xmin</structname> equal
 468      to <literal>FrozenTransactionId</literal> (2) may still be found
 469      in databases <application>pg_upgrade</application>'d from pre-9.4 versions.
 470     </para>
 471     <para>
 472      Also, system catalogs may contain rows with <structname>xmin</structname> equal
 473      to <literal>BootstrapTransactionId</literal> (1), indicating that they were
 474      inserted during the first phase of <application>initdb</application>.
 475      Like <literal>FrozenTransactionId</literal>, this special XID is treated as
 476      older than every normal XID.
 477     </para>
 478    </note>
 479
 480    <para>
 481     <xref linkend="guc-vacuum-freeze-min-age"/>
 482     controls how old an XID value has to be before rows bearing that XID will be
 483     frozen.  Increasing this setting may avoid unnecessary work if the
 484     rows that would otherwise be frozen will soon be modified again,
 485     but decreasing this setting increases
 486     the number of transactions that can elapse before the table must be
 487     vacuumed again.
 488    </para>
 489
 490    <para>
 491     <command>VACUUM</command> uses the <link linkend="storage-vm">visibility map</link>
 492     to determine which pages of a table must be scanned.  Normally, it
 493     will skip pages that don't have any dead row versions even if those pages
 494     might still have row versions with old XID values.  Therefore, normal
 495     <command>VACUUM</command>s won't always freeze every old row version in the table.
 496     When that happens, <command>VACUUM</command> will eventually need to perform an
 497     <firstterm>aggressive vacuum</firstterm>, which will freeze all eligible unfrozen
 498     XID and MXID values, including those from all-visible but not all-frozen pages.
 499     In practice most tables require periodic aggressive vacuuming.
 500     <xref linkend="guc-vacuum-freeze-table-age"/>
 501     controls when <command>VACUUM</command> does that: all-visible but not all-frozen
 502     pages are scanned if the number of transactions that have passed since the
 503     last such scan is greater than <varname>vacuum_freeze_table_age</varname> minus
 504     <varname>vacuum_freeze_min_age</varname>. Setting
 505     <varname>vacuum_freeze_table_age</varname> to 0 forces <command>VACUUM</command> to
 506     always use its aggressive strategy.
 507    </para>
 508
 509    <para>
 510     The maximum time that a table can go unvacuumed is two billion
 511     transactions minus the <varname>vacuum_freeze_min_age</varname> value at
 512     the time of the last aggressive vacuum. If it were to go
 513     unvacuumed for longer than
 514     that, data loss could result.  To ensure that this does not happen,
 515     autovacuum is invoked on any table that might contain unfrozen rows with
 516     XIDs older than the age specified by the configuration parameter <xref
 517     linkend="guc-autovacuum-freeze-max-age"/>.  (This will happen even if
 518     autovacuum is disabled.)
 519    </para>
 520
 521    <para>
 522     This implies that if a table is not otherwise vacuumed,
 523     autovacuum will be invoked on it approximately once every
 524     <varname>autovacuum_freeze_max_age</varname> minus
 525     <varname>vacuum_freeze_min_age</varname> transactions.
 526     For tables that are regularly vacuumed for space reclamation purposes,
 527     this is of little importance.  However, for static tables
 528     (including tables that receive inserts, but no updates or deletes),
 529     there is no need to vacuum for space reclamation, so it can
 530     be useful to try to maximize the interval between forced autovacuums
 531     on very large static tables.  Obviously one can do this either by
 532     increasing <varname>autovacuum_freeze_max_age</varname> or decreasing
 533     <varname>vacuum_freeze_min_age</varname>.
 534    </para>
 535
 536    <para>
 537     The effective maximum for <varname>vacuum_freeze_table_age</varname> is 0.95 *
 538     <varname>autovacuum_freeze_max_age</varname>; a setting higher than that will be
 539     capped to the maximum. A value higher than
 540     <varname>autovacuum_freeze_max_age</varname> wouldn't make sense because an
 541     anti-wraparound autovacuum would be triggered at that point anyway, and
 542     the 0.95 multiplier leaves some breathing room to run a manual
 543     <command>VACUUM</command> before that happens.  As a rule of thumb,
 544     <command>vacuum_freeze_table_age</command> should be set to a value somewhat
 545     below <varname>autovacuum_freeze_max_age</varname>, leaving enough gap so that
 546     a regularly scheduled <command>VACUUM</command> or an autovacuum triggered by
 547     normal delete and update activity is run in that window.  Setting it too
 548     close could lead to anti-wraparound autovacuums, even though the table
 549     was recently vacuumed to reclaim space, whereas lower values lead to more
 550     frequent aggressive vacuuming.
 551    </para>
 552
 553    <para>
 554     The sole disadvantage of increasing <varname>autovacuum_freeze_max_age</varname>
 555     (and <varname>vacuum_freeze_table_age</varname> along with it) is that
 556     the <filename>pg_xact</filename> and <filename>pg_commit_ts</filename>
 557     subdirectories of the database cluster will take more space, because it
 558     must store the commit status and (if <varname>track_commit_timestamp</varname> is
 559     enabled) timestamp of all transactions back to
 560     the <varname>autovacuum_freeze_max_age</varname> horizon.  The commit status uses
 561     two bits per transaction, so if
 562     <varname>autovacuum_freeze_max_age</varname> is set to its maximum allowed value
 563     of two billion, <filename>pg_xact</filename> can be expected to grow to about half
 564     a gigabyte and <filename>pg_commit_ts</filename> to about 20GB.  If this
 565     is trivial compared to your total database size,
 566     setting <varname>autovacuum_freeze_max_age</varname> to its maximum allowed value
 567     is recommended.  Otherwise, set it depending on what you are willing to
 568     allow for <filename>pg_xact</filename> and <filename>pg_commit_ts</filename> storage.
 569     (The default, 200 million transactions, translates to about 50MB
 570     of <filename>pg_xact</filename> storage and about 2GB of <filename>pg_commit_ts</filename>
 571     storage.)
 572    </para>
 573
 574    <para>
 575     One disadvantage of decreasing <varname>vacuum_freeze_min_age</varname> is that
 576     it might cause <command>VACUUM</command> to do useless work: freezing a row
 577     version is a waste of time if the row is modified
 578     soon thereafter (causing it to acquire a new XID).  So the setting should
 579     be large enough that rows are not frozen until they are unlikely to change
 580     any more.
 581    </para>
 582
 583    <para>
 584     To track the age of the oldest unfrozen XIDs in a database,
 585     <command>VACUUM</command> stores XID
 586     statistics in the system tables <structname>pg_class</structname> and
 587     <structname>pg_database</structname>.  In particular,
 588     the <structfield>relfrozenxid</structfield> column of a table's
 589     <structname>pg_class</structname> row contains the oldest remaining unfrozen
 590     XID at the end of the most recent <command>VACUUM</command> that successfully
 591     advanced <structfield>relfrozenxid</structfield> (typically the most recent
 592     aggressive VACUUM).  Similarly, the
 593     <structfield>datfrozenxid</structfield> column of a database's
 594     <structname>pg_database</structname> row is a lower bound on the unfrozen XIDs
 595     appearing in that database &mdash; it is just the minimum of the
 596     per-table <structfield>relfrozenxid</structfield> values within the database.
 597     A convenient way to
 598     examine this information is to execute queries such as:
 599
 600 <programlisting>
 601 SELECT c.oid::regclass as table_name,
 602        greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
 603 FROM pg_class c
 604 LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
 605 WHERE c.relkind IN ('r', 'm');
 606
 607 SELECT datname, age(datfrozenxid) FROM pg_database;
 608 </programlisting>
 609
 610     The <literal>age</literal> column measures the number of transactions from the
 611     cutoff XID to the current transaction's XID.
 612    </para>
 613
 614    <tip>
 615     <para>
 616      When the <command>VACUUM</command> command's <literal>VERBOSE</literal>
 617      parameter is specified, <command>VACUUM</command> prints various
 618      statistics about the table.  This includes information about how
 619      <structfield>relfrozenxid</structfield> and
 620      <structfield>relminmxid</structfield> advanced, and the number of
 621      newly frozen pages.  The same details appear in the server log when
 622      autovacuum logging (controlled by <xref
 623       linkend="guc-log-autovacuum-min-duration"/>) reports on a
 624      <command>VACUUM</command> operation executed by autovacuum.
 625     </para>
 626    </tip>
 627
 628    <para>
 629     <command>VACUUM</command> normally only scans pages that have been modified
 630     since the last vacuum, but <structfield>relfrozenxid</structfield> can only be
 631     advanced when every page of the table
 632     that might contain unfrozen XIDs is scanned.  This happens when
 633     <structfield>relfrozenxid</structfield> is more than
 634     <varname>vacuum_freeze_table_age</varname> transactions old, when
 635     <command>VACUUM</command>'s <literal>FREEZE</literal> option is used, or when all
 636     pages that are not already all-frozen happen to
 637     require vacuuming to remove dead row versions. When <command>VACUUM</command>
 638     scans every page in the table that is not already all-frozen, it should
 639     set <literal>age(relfrozenxid)</literal> to a value just a little more than the
 640     <varname>vacuum_freeze_min_age</varname> setting
 641     that was used (more by the number of transactions started since the
 642     <command>VACUUM</command> started).  <command>VACUUM</command>
 643     will set <structfield>relfrozenxid</structfield> to the oldest XID
 644     that remains in the table, so it's possible that the final value
 645     will be much more recent than strictly required.
 646     If no <structfield>relfrozenxid</structfield>-advancing
 647     <command>VACUUM</command> is issued on the table until
 648     <varname>autovacuum_freeze_max_age</varname> is reached, an autovacuum will soon
 649     be forced for the table.
 650    </para>
 651
 652    <para>
 653     If for some reason autovacuum fails to clear old XIDs from a table, the
 654     system will begin to emit warning messages like this when the database's
 655     oldest XIDs reach forty million transactions from the wraparound point:
 656
 657 <programlisting>
 658 WARNING:  database "mydb" must be vacuumed within 39985967 transactions
 659 HINT:  To avoid XID assignment failures, execute a database-wide VACUUM in that database.
 660 </programlisting>
 661
 662     (A manual <command>VACUUM</command> should fix the problem, as suggested by the
 663     hint; but note that the <command>VACUUM</command> should be performed by a
 664     superuser, else it will fail to process system catalogs, which prevent it from
 665     being able to advance the database's <structfield>datfrozenxid</structfield>.)
 666     If these warnings are ignored, the system will refuse to assign new XIDs once
 667     there are fewer than three million transactions left until wraparound:
 668
 669 <programlisting>
 670 ERROR:  database is not accepting commands that assign new XIDs to avoid wraparound data loss in database "mydb"
 671 HINT:  Execute a database-wide VACUUM in that database.
 672 </programlisting>
 673
 674     In this condition any transactions already in progress can continue,
 675     but only read-only transactions can be started. Operations that
 676     modify database records or truncate relations will fail.
 677     The <command>VACUUM</command> command can still be run normally.
 678     Note that, contrary to what was sometimes recommended in earlier releases,
 679     it is not necessary or desirable to stop the postmaster or enter single
 680     user-mode in order to restore normal operation.
 681     Instead, follow these steps:
 682
 683     <orderedlist>
 684      <listitem>
 685       <simpara>Resolve old prepared transactions. You can find these by checking
 686        <link linkend="view-pg-prepared-xacts">pg_prepared_xacts</link> for rows where
 687        <literal>age(transactionid)</literal> is large. Such transactions should be
 688        committed or rolled back.</simpara>
 689      </listitem>
 690      <listitem>
 691       <simpara>End long-running open transactions. You can find these by checking
 692        <link linkend="monitoring-pg-stat-activity-view">pg_stat_activity</link> for rows where
 693        <literal>age(backend_xid)</literal> or <literal>age(backend_xmin)</literal> is
 694        large. Such transactions should be committed or rolled back, or the session
 695        can be terminated using <literal>pg_terminate_backend</literal>.</simpara>
 696      </listitem>
 697      <listitem>
 698       <simpara>Drop any old replication slots. Use
 699        <link linkend="monitoring-pg-stat-replication-view">pg_stat_replication</link> to
 700        find slots where <literal>age(xmin)</literal> or <literal>age(catalog_xmin)</literal>
 701        is large. In many cases, such slots were created for replication to servers that no
 702        longer exist, or that have been down for a long time. If you drop a slot for a server
 703        that still exists and might still try to connect to that slot, that replica may
 704        need to be rebuilt.</simpara>
 705      </listitem>
 706      <listitem>
 707       <simpara>Execute <command>VACUUM</command> in the target database. A database-wide
 708        <literal>VACUUM</literal> is simplest; to reduce the time required, it as also possible
 709        to issue manual <command>VACUUM</command> commands on the tables where
 710        <structfield>relminxid</structfield> is oldest. Do not use <literal>VACUUM FULL</literal>
 711        in this scenario, because it requires an XID and will therefore fail, except in super-user
 712        mode, where it will instead consume an XID and thus increase the risk of transaction ID
 713        wraparound. Do not use <literal>VACUUM FREEZE</literal> either, because it will do
 714        more than the minimum amount of work required to restore normal operation.</simpara>
 715      </listitem>
 716      <listitem>
 717       <simpara>Once normal operation is restored, ensure that autovacuum is properly configured
 718        in the target database in order to avoid future problems.</simpara>
 719      </listitem>
 720     </orderedlist>
 721    </para>
 722
 723    <note>
 724     <para>
 725      In earlier versions, it was sometimes necessary to stop the postmaster and
 726      <command>VACUUM</command> the database in a single-user mode. In typical scenarios, this
 727      is no longer necessary, and should be avoided whenever possible, since it involves taking
 728      the system down. It is also riskier, since it disables transaction ID wraparound safeguards
 729      that are designed to prevent data loss.  The only reason to use single-user mode in this
 730      scenario is if you wish to <command>TRUNCATE</command> or <command>DROP</command> unneeded
 731      tables to avoid needing to <command>VACUUM</command> them.  The three-million-transaction
 732      safety margin exists to let the administrator do this. See the
 733      <xref linkend="app-postgres"/> reference page for details about using single-user mode.
 734     </para>
 735    </note>
 736
 737    <sect3 id="vacuum-for-multixact-wraparound">
 738     <title>Multixacts and Wraparound</title>
 739
 740     <indexterm>
 741      <primary>MultiXactId</primary>
 742     </indexterm>
 743
 744     <indexterm>
 745      <primary>wraparound</primary>
 746      <secondary>of multixact IDs</secondary>
 747     </indexterm>
 748
 749     <para>
 750      <firstterm>Multixact IDs</firstterm> are used to support row locking by
 751      multiple transactions.  Since there is only limited space in a tuple
 752      header to store lock information, that information is encoded as
 753      a <quote>multiple transaction ID</quote>, or multixact ID for short,
 754      whenever there is more than one transaction concurrently locking a
 755      row.  Information about which transaction IDs are included in any
 756      particular multixact ID is stored separately in
 757      the <filename>pg_multixact</filename> subdirectory, and only the multixact ID
 758      appears in the <structfield>xmax</structfield> field in the tuple header.
 759      Like transaction IDs, multixact IDs are implemented as a
 760      32-bit counter and corresponding storage, all of which requires
 761      careful aging management, storage cleanup, and wraparound handling.
 762      There is a separate storage area which holds the list of members in
 763      each multixact, which also uses a 32-bit counter and which must also
 764      be managed.
 765     </para>
 766
 767     <para>
 768      Whenever <command>VACUUM</command> scans any part of a table, it will replace
 769      any multixact ID it encounters which is older than
 770      <xref linkend="guc-vacuum-multixact-freeze-min-age"/>
 771      by a different value, which can be the zero value, a single
 772      transaction ID, or a newer multixact ID.  For each table,
 773      <structname>pg_class</structname>.<structfield>relminmxid</structfield> stores the oldest
 774      possible multixact ID still appearing in any tuple of that table.
 775      If this value is older than
 776      <xref linkend="guc-vacuum-multixact-freeze-table-age"/>, an aggressive
 777      vacuum is forced.  As discussed in the previous section, an aggressive
 778      vacuum means that only those pages which are known to be all-frozen will
 779      be skipped.  <function>mxid_age()</function> can be used on
 780      <structname>pg_class</structname>.<structfield>relminmxid</structfield> to find its age.
 781     </para>
 782
 783     <para>
 784      Aggressive <command>VACUUM</command>s, regardless of what causes
 785      them, are <emphasis>guaranteed</emphasis> to be able to advance
 786      the table's <structfield>relminmxid</structfield>.
 787      Eventually, as all tables in all databases are scanned and their
 788      oldest multixact values are advanced, on-disk storage for older
 789      multixacts can be removed.
 790     </para>
 791
 792     <para>
 793      As a safety device, an aggressive vacuum scan will
 794      occur for any table whose multixact-age is greater than <xref
 795      linkend="guc-autovacuum-multixact-freeze-max-age"/>.  Also, if the
 796      storage occupied by multixacts members exceeds 2GB, aggressive vacuum
 797      scans will occur more often for all tables, starting with those that
 798      have the oldest multixact-age.  Both of these kinds of aggressive
 799      scans will occur even if autovacuum is nominally disabled.
 800     </para>
 801
 802     <para>
 803      Similar to the XID case, if autovacuum fails to clear old MXIDs from a table, the
 804      system will begin to emit warning messages when the database's oldest MXIDs reach forty
 805      million transactions from the wraparound point.  And, just as in the XID case, if these
 806      warnings are ignored, the system will refuse to generate new MXIDs once there are fewer
 807      than three million left until wraparound.
 808     </para>
 809
 810     <para>
 811      Normal operation when MXIDs are exhausted can be restored in much the same way as
 812      when XIDs are exhausted. Follow the same steps in the previous section, but with the
 813      following differences:
 814
 815     <orderedlist>
 816      <listitem>
 817       <simpara>Running transactions and prepared transactions can be ignored if there
 818        is no chance that they might appear in a multixact.</simpara>
 819      </listitem>
 820      <listitem>
 821       <simpara>MXID information is not directly visible in system views such as
 822        <literal>pg_stat_activity</literal>; however, looking for old XIDs is still a good
 823        way of determining which transactions are causing MXID wraparound problems.</simpara>
 824      </listitem>
 825      <listitem>
 826       <simpara>XID exhaustion will block all write transactions, but MXID exhaustion will
 827        only block a subset of write transactions, specifically those that involve
 828        row locks that require an MXID.</simpara>
 829      </listitem>
 830     </orderedlist>
 831    </para>
 832
 833    </sect3>
 834   </sect2>
 835
 836   <sect2 id="autovacuum">
 837    <title>The Autovacuum Daemon</title>
 838
 839    <indexterm>
 840     <primary>autovacuum</primary>
 841     <secondary>general information</secondary>
 842    </indexterm>
 843    <para>
 844     <productname>PostgreSQL</productname> has an optional but highly
 845     recommended feature called <firstterm>autovacuum</firstterm>,
 846     whose purpose is to automate the execution of
 847     <command>VACUUM</command> and <command>ANALYZE</command> commands.
 848     When enabled, autovacuum checks for
 849     tables that have had a large number of inserted, updated or deleted
 850     tuples.  These checks use the statistics collection facility;
 851     therefore, autovacuum cannot be used unless <xref
 852     linkend="guc-track-counts"/> is set to <literal>true</literal>.
 853     In the default configuration, autovacuuming is enabled and the related
 854     configuration parameters are appropriately set.
 855    </para>
 856
 857    <para>
 858     The <quote>autovacuum daemon</quote> actually consists of multiple processes.
 859     There is a persistent daemon process, called the
 860     <firstterm>autovacuum launcher</firstterm>, which is in charge of starting
 861     <firstterm>autovacuum worker</firstterm> processes for all databases. The
 862     launcher will distribute the work across time, attempting to start one
 863     worker within each database every <xref linkend="guc-autovacuum-naptime"/>
 864     seconds.  (Therefore, if the installation has <replaceable>N</replaceable> databases,
 865     a new worker will be launched every
 866     <varname>autovacuum_naptime</varname>/<replaceable>N</replaceable> seconds.)
 867     A maximum of <xref linkend="guc-autovacuum-max-workers"/> worker processes
 868     are allowed to run at the same time. If there are more than
 869     <varname>autovacuum_max_workers</varname> databases to be processed,
 870     the next database will be processed as soon as the first worker finishes.
 871     Each worker process will check each table within its database and
 872     execute <command>VACUUM</command> and/or <command>ANALYZE</command> as needed.
 873     <xref linkend="guc-log-autovacuum-min-duration"/> can be set to monitor
 874     autovacuum workers' activity.
 875    </para>
 876
 877    <para>
 878     If several large tables all become eligible for vacuuming in a short
 879     amount of time, all autovacuum workers might become occupied with
 880     vacuuming those tables for a long period.  This would result
 881     in other tables and databases not being vacuumed until a worker becomes
 882     available. There is no limit on how many workers might be in a
 883     single database, but workers do try to avoid repeating work that has
 884     already been done by other workers. Note that the number of running
 885     workers does not count towards <xref linkend="guc-max-connections"/> or
 886     <xref linkend="guc-superuser-reserved-connections"/> limits.
 887    </para>
 888
 889    <para>
 890     Tables whose <structfield>relfrozenxid</structfield> value is more than
 891     <xref linkend="guc-autovacuum-freeze-max-age"/> transactions old are always
 892     vacuumed (this also applies to those tables whose freeze max age has
 893     been modified via storage parameters; see below).  Otherwise, if the
 894     number of tuples obsoleted since the last
 895     <command>VACUUM</command> exceeds the <quote>vacuum threshold</quote>, the
 896     table is vacuumed.  The vacuum threshold is defined as:
 897 <programlisting>
 898 vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
 899 </programlisting>
 900     where the vacuum base threshold is
 901     <xref linkend="guc-autovacuum-vacuum-threshold"/>,
 902     the vacuum scale factor is
 903     <xref linkend="guc-autovacuum-vacuum-scale-factor"/>,
 904     and the number of tuples is
 905     <structname>pg_class</structname>.<structfield>reltuples</structfield>.
 906    </para>
 907
 908    <para>
 909     The table is also vacuumed if the number of tuples inserted since the last
 910     vacuum has exceeded the defined insert threshold, which is defined as:
 911 <programlisting>
 912 vacuum insert threshold = vacuum base insert threshold + vacuum insert scale factor * number of tuples
 913 </programlisting>
 914     where the vacuum insert base threshold is
 915     <xref linkend="guc-autovacuum-vacuum-insert-threshold"/>,
 916     and vacuum insert scale factor is
 917     <xref linkend="guc-autovacuum-vacuum-insert-scale-factor"/>.
 918     Such vacuums may allow portions of the table to be marked as
 919     <firstterm>all visible</firstterm> and also allow tuples to be frozen, which
 920     can reduce the work required in subsequent vacuums.
 921     For tables which receive <command>INSERT</command> operations but no or
 922     almost no <command>UPDATE</command>/<command>DELETE</command> operations,
 923     it may be beneficial to lower the table's
 924     <xref linkend="reloption-autovacuum-freeze-min-age"/> as this may allow
 925     tuples to be frozen by earlier vacuums.  The number of obsolete tuples and
 926     the number of inserted tuples are obtained from the cumulative statistics system;
 927     it is an eventually-consistent count updated by each <command>UPDATE</command>,
 928     <command>DELETE</command> and <command>INSERT</command> operation.
 929     If the <structfield>relfrozenxid</structfield> value of the table
 930     is more than <varname>vacuum_freeze_table_age</varname> transactions old,
 931     an aggressive vacuum is performed to freeze old tuples and advance
 932     <structfield>relfrozenxid</structfield>; otherwise, only pages that have been modified
 933     since the last vacuum are scanned.
 934    </para>
 935
 936    <para>
 937     For analyze, a similar condition is used: the threshold, defined as:
 938 <programlisting>
 939 analyze threshold = analyze base threshold + analyze scale factor * number of tuples
 940 </programlisting>
 941     is compared to the total number of tuples inserted, updated, or deleted
 942     since the last <command>ANALYZE</command>.
 943    </para>
 944
 945    <para>
 946     Partitioned tables do not directly store tuples and consequently
 947     are not processed by autovacuum.  (Autovacuum does process table
 948     partitions just like other tables.)  Unfortunately, this means that
 949     autovacuum does  not run <command>ANALYZE</command> on partitioned
 950     tables, and this can cause suboptimal plans for queries that reference
 951     partitioned table statistics.  You can work around this problem by
 952     manually running <command>ANALYZE</command> on partitioned tables
 953     when they are first populated, and again whenever the distribution
 954     of data in their partitions changes significantly.
 955    </para>
 956
 957    <para>
 958     Temporary tables cannot be accessed by autovacuum.  Therefore,
 959     appropriate vacuum and analyze operations should be performed via
 960     session SQL commands.
 961    </para>
 962
 963    <para>
 964     The default thresholds and scale factors are taken from
 965     <filename>postgresql.conf</filename>, but it is possible to override them
 966     (and many other autovacuum control parameters) on a per-table basis; see
 967     <xref linkend="sql-createtable-storage-parameters"/> for more information.
 968     If a setting has been changed via a table's storage parameters, that value
 969     is used when processing that table; otherwise the global settings are
 970     used. See <xref linkend="runtime-config-autovacuum"/> for more details on
 971     the global settings.
 972    </para>
 973
 974    <para>
 975     When multiple workers are running, the autovacuum cost delay parameters
 976     (see <xref linkend="runtime-config-resource-vacuum-cost"/>) are
 977     <quote>balanced</quote> among all the running workers, so that the
 978     total I/O impact on the system is the same regardless of the number
 979     of workers actually running.  However, any workers processing tables whose
 980     per-table <literal>autovacuum_vacuum_cost_delay</literal> or
 981     <literal>autovacuum_vacuum_cost_limit</literal> storage parameters have been set
 982     are not considered in the balancing algorithm.
 983    </para>
 984
 985    <para>
 986     Autovacuum workers generally don't block other commands.  If a process
 987     attempts to acquire a lock that conflicts with the
 988     <literal>SHARE UPDATE EXCLUSIVE</literal> lock held by autovacuum, lock
 989     acquisition will interrupt the autovacuum.  For conflicting lock modes,
 990     see <xref linkend="table-lock-compatibility"/>.  However, if the autovacuum
 991     is running to prevent transaction ID wraparound (i.e., the autovacuum query
 992     name in the <structname>pg_stat_activity</structname> view ends with
 993     <literal>(to prevent wraparound)</literal>), the autovacuum is not
 994     automatically interrupted.
 995    </para>
 996
 997    <warning>
 998     <para>
 999      Regularly running commands that acquire locks conflicting with a
1000      <literal>SHARE UPDATE EXCLUSIVE</literal> lock (e.g., ANALYZE) can
1001      effectively prevent autovacuums from ever completing.
1002     </para>
1003    </warning>
1004   </sect2>
1005  </sect1>
1006
1007
1008  <sect1 id="routine-reindex">
1009   <title>Routine Reindexing</title>
1010
1011   <indexterm zone="routine-reindex">
1012    <primary>reindex</primary>
1013   </indexterm>
1014
1015   <para>
1016    In some situations it is worthwhile to rebuild indexes periodically
1017    with the <xref linkend="sql-reindex"/> command or a series of individual
1018    rebuilding steps.
1019
1020   </para>
1021
1022   <para>
1023    B-tree index pages that have become completely empty are reclaimed for
1024    re-use.  However, there is still a possibility
1025    of inefficient use of space: if all but a few index keys on a page have
1026    been deleted, the page remains allocated.  Therefore, a usage
1027    pattern in which most, but not all, keys in each range are eventually
1028    deleted will see poor use of space.  For such usage patterns,
1029    periodic reindexing is recommended.
1030   </para>
1031
1032   <para>
1033    The potential for bloat in non-B-tree indexes has not been well
1034    researched.  It is a good idea to periodically monitor the index's physical
1035    size when using any non-B-tree index type.
1036   </para>
1037
1038   <para>
1039    Also, for B-tree indexes, a freshly-constructed index is slightly faster to
1040    access than one that has been updated many times because logically
1041    adjacent pages are usually also physically adjacent in a newly built index.
1042    (This consideration does not apply to non-B-tree indexes.)  It
1043    might be worthwhile to reindex periodically just to improve access speed.
1044   </para>
1045
1046   <para>
1047    <xref linkend="sql-reindex"/> can be used safely and easily in all cases.
1048    This command requires an <literal>ACCESS EXCLUSIVE</literal> lock by
1049    default, hence it is often preferable to execute it with its
1050    <literal>CONCURRENTLY</literal> option, which requires only a
1051    <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
1052   </para>
1053  </sect1>
1054
1055
1056  <sect1 id="logfile-maintenance">
1057   <title>Log File Maintenance</title>
1058
1059   <indexterm zone="logfile-maintenance">
1060    <primary>server log</primary>
1061    <secondary>log file maintenance</secondary>
1062   </indexterm>
1063
1064   <para>
1065    It is a good idea to save the database server's log output
1066    somewhere, rather than just discarding it via <filename>/dev/null</filename>.
1067    The log output is invaluable when diagnosing
1068    problems.
1069   </para>
1070
1071   <note>
1072    <para>
1073     The server log can contain sensitive information and needs to be protected,
1074     no matter how or where it is stored, or the destination to which it is routed.
1075     For example, some DDL statements might contain plaintext passwords or other
1076     authentication details. Logged statements at the <literal>ERROR</literal>
1077     level might show the SQL source code for applications
1078     and might also contain some parts of data rows. Recording data, events and
1079     related information is the intended function of this facility, so this is
1080     not a leakage or a bug. Please ensure the server logs are visible only to
1081     appropriately authorized people.
1082    </para>
1083   </note>
1084
1085   <para>
1086    Log output tends to be voluminous
1087    (especially at higher debug levels) so you won't want to save it
1088    indefinitely.  You need to <emphasis>rotate</emphasis> the log files so that
1089    new log files are started and old ones removed after a reasonable
1090    period of time.
1091   </para>
1092
1093   <para>
1094    If you simply direct the <systemitem>stderr</systemitem> of
1095    <command>postgres</command> into a
1096    file, you will have log output, but
1097    the only way to truncate the log file is to stop and restart
1098    the server. This might be acceptable if you are using
1099    <productname>PostgreSQL</productname> in a development environment,
1100    but few production servers would find this behavior acceptable.
1101   </para>
1102
1103   <para>
1104    A better approach is to send the server's
1105    <systemitem>stderr</systemitem> output to some type of log rotation program.
1106    There is a built-in log rotation facility, which you can use by
1107    setting the configuration parameter <varname>logging_collector</varname> to
1108    <literal>true</literal> in <filename>postgresql.conf</filename>.  The control
1109    parameters for this program are described in <xref
1110    linkend="runtime-config-logging-where"/>. You can also use this approach
1111    to capture the log data in machine readable <acronym>CSV</acronym>
1112    (comma-separated values) format.
1113   </para>
1114
1115   <para>
1116    Alternatively, you might prefer to use an external log rotation
1117    program if you have one that you are already using with other
1118    server software. For example, the <application>rotatelogs</application>
1119    tool included in the <productname>Apache</productname> distribution
1120    can be used with <productname>PostgreSQL</productname>.  One way to
1121    do this is to pipe the server's
1122    <systemitem>stderr</systemitem> output to the desired program.
1123    If you start the server with
1124    <command>pg_ctl</command>, then <systemitem>stderr</systemitem>
1125    is already redirected to <systemitem>stdout</systemitem>, so you just need a
1126    pipe command, for example:
1127
1128 <programlisting>
1129 pg_ctl start | rotatelogs /var/log/pgsql_log 86400
1130 </programlisting>
1131   </para>
1132
1133   <para>
1134    You can combine these approaches by setting up <application>logrotate</application>
1135    to collect log files produced by <productname>PostgreSQL</productname> built-in
1136    logging collector.  In this case, the logging collector defines the names and
1137    location of the log files, while <application>logrotate</application>
1138    periodically archives these files.  When initiating log rotation,
1139    <application>logrotate</application> must ensure that the application
1140    sends further output to the new file.  This is commonly done with a
1141    <literal>postrotate</literal> script that sends a <literal>SIGHUP</literal>
1142    signal to the application, which then reopens the log file.
1143    In <productname>PostgreSQL</productname>, you can run <command>pg_ctl</command>
1144    with the <literal>logrotate</literal> option instead.  When the server receives
1145    this command, the server either switches to a new log file or reopens the
1146    existing file, depending on the logging configuration
1147    (see <xref linkend="runtime-config-logging-where"/>).
1148   </para>
1149
1150   <note>
1151    <para>
1152     When using static log file names, the server might fail to reopen the log
1153     file if the max open file limit is reached or a file table overflow occurs.
1154     In this case, log messages are sent to the old log file until a
1155     successful log rotation. If <application>logrotate</application> is
1156     configured to compress the log file and delete it, the server may lose
1157     the messages logged in this time frame. To avoid this issue, you can
1158     configure the logging collector to dynamically assign log file names
1159     and use a <literal>prerotate</literal> script to ignore open log files.
1160     </para>
1161   </note>
1162
1163   <para>
1164    Another production-grade approach to managing log output is to
1165    send it to <application>syslog</application> and let
1166    <application>syslog</application> deal with file rotation. To do this, set the
1167    configuration parameter <varname>log_destination</varname> to <literal>syslog</literal>
1168    (to log to <application>syslog</application> only) in
1169    <filename>postgresql.conf</filename>. Then you can send a <literal>SIGHUP</literal>
1170    signal to the <application>syslog</application> daemon whenever you want to force it
1171    to start writing a new log file.  If you want to automate log
1172    rotation, the <application>logrotate</application> program can be
1173    configured to work with log files from
1174    <application>syslog</application>.
1175   </para>
1176
1177   <para>
1178    On many systems, however, <application>syslog</application> is not very reliable,
1179    particularly with large log messages; it might truncate or drop messages
1180    just when you need them the most.  Also, on <productname>Linux</productname>,
1181    <application>syslog</application> will flush each message to disk, yielding poor
1182    performance.  (You can use a <quote><literal>-</literal></quote> at the start of the file name
1183    in the <application>syslog</application> configuration file to disable syncing.)
1184   </para>
1185
1186   <para>
1187    Note that all the solutions described above take care of starting new
1188    log files at configurable intervals, but they do not handle deletion
1189    of old, no-longer-useful log files.  You will probably want to set
1190    up a batch job to periodically delete old log files.  Another possibility
1191    is to configure the rotation program so that old log files are overwritten
1192    cyclically.
1193   </para>
1194
1195   <para>
1196    <ulink url="https://pgbadger.darold.net/"><productname>pgBadger</productname></ulink>
1197    is an external project that does sophisticated log file analysis.
1198    <ulink
1199    url="https://bucardo.org/check_postgres/"><productname>check_postgres</productname></ulink>
1200    provides Nagios alerts when important messages appear in the log
1201    files, as well as detection of many other extraordinary conditions.
1202   </para>
1203  </sect1>
1204 </chapter>