Repair memory leaks in plpython.
[pgsql.git] / doc / src / sgml / maintenance.sgml
blob0be90bdc7efe1b3aeda5ecf3bbd88315675d6b45
1 <!-- doc/src/sgml/maintenance.sgml -->
3 <chapter id="maintenance">
4 <title>Routine Database Maintenance Tasks</title>
6 <indexterm zone="maintenance">
7 <primary>maintenance</primary>
8 </indexterm>
10 <indexterm zone="maintenance">
11 <primary>routine maintenance</primary>
12 </indexterm>
14 <para>
15 <productname>PostgreSQL</productname>, like any database software, requires that certain tasks
16 be performed regularly to achieve optimum performance. The tasks
17 discussed here are <emphasis>required</emphasis>, but they
18 are repetitive in nature and can easily be automated using standard
19 tools such as <application>cron</application> scripts or
20 Windows' <application>Task Scheduler</application>. It is the database
21 administrator's responsibility to set up appropriate scripts, and to
22 check that they execute successfully.
23 </para>
25 <para>
26 One obvious maintenance task is the creation of backup copies of the data on a
27 regular schedule. Without a recent backup, you have no chance of recovery
28 after a catastrophe (disk failure, fire, mistakenly dropping a critical
29 table, etc.). The backup and recovery mechanisms available in
30 <productname>PostgreSQL</productname> are discussed at length in
31 <xref linkend="backup"/>.
32 </para>
34 <para>
35 The other main category of maintenance task is periodic <quote>vacuuming</quote>
36 of the database. This activity is discussed in
37 <xref linkend="routine-vacuuming"/>. Closely related to this is updating
38 the statistics that will be used by the query planner, as discussed in
39 <xref linkend="vacuum-for-statistics"/>.
40 </para>
42 <para>
43 Another task that might need periodic attention is log file management.
44 This is discussed in <xref linkend="logfile-maintenance"/>.
45 </para>
47 <para>
48 <ulink
49 url="https://bucardo.org/check_postgres/"><application>check_postgres</application></ulink>
50 is available for monitoring database health and reporting unusual
51 conditions. <application>check_postgres</application> integrates with
52 Nagios and MRTG, but can be run standalone too.
53 </para>
55 <para>
56 <productname>PostgreSQL</productname> is low-maintenance compared
57 to some other database management systems. Nonetheless,
58 appropriate attention to these tasks will go far towards ensuring a
59 pleasant and productive experience with the system.
60 </para>
62 <sect1 id="routine-vacuuming">
63 <title>Routine Vacuuming</title>
65 <indexterm zone="routine-vacuuming">
66 <primary>vacuum</primary>
67 </indexterm>
69 <para>
70 <productname>PostgreSQL</productname> databases require periodic
71 maintenance known as <firstterm>vacuuming</firstterm>. For many installations, it
72 is sufficient to let vacuuming be performed by the <firstterm>autovacuum
73 daemon</firstterm>, which is described in <xref linkend="autovacuum"/>. You might
74 need to adjust the autovacuuming parameters described there to obtain best
75 results for your situation. Some database administrators will want to
76 supplement or replace the daemon's activities with manually-managed
77 <command>VACUUM</command> commands, which typically are executed according to a
78 schedule by <application>cron</application> or <application>Task
79 Scheduler</application> scripts. To set up manually-managed vacuuming properly,
80 it is essential to understand the issues discussed in the next few
81 subsections. Administrators who rely on autovacuuming may still wish
82 to skim this material to help them understand and adjust autovacuuming.
83 </para>
85 <sect2 id="vacuum-basics">
86 <title>Vacuuming Basics</title>
88 <para>
89 <productname>PostgreSQL</productname>'s
90 <link linkend="sql-vacuum"><command>VACUUM</command></link> command has to
91 process each table on a regular basis for several reasons:
93 <orderedlist>
94 <listitem>
95 <simpara>To recover or reuse disk space occupied by updated or deleted
96 rows.</simpara>
97 </listitem>
99 <listitem>
100 <simpara>To update data statistics used by the
101 <productname>PostgreSQL</productname> query planner.</simpara>
102 </listitem>
104 <listitem>
105 <simpara>To update the visibility map, which speeds
106 up <link linkend="indexes-index-only-scans">index-only
107 scans</link>.</simpara>
108 </listitem>
110 <listitem>
111 <simpara>To protect against loss of very old data due to
112 <firstterm>transaction ID wraparound</firstterm> or
113 <firstterm>multixact ID wraparound</firstterm>.</simpara>
114 </listitem>
115 </orderedlist>
117 Each of these reasons dictates performing <command>VACUUM</command> operations
118 of varying frequency and scope, as explained in the following subsections.
119 </para>
121 <para>
122 There are two variants of <command>VACUUM</command>: standard <command>VACUUM</command>
123 and <command>VACUUM FULL</command>. <command>VACUUM FULL</command> can reclaim more
124 disk space but runs much more slowly. Also,
125 the standard form of <command>VACUUM</command> can run in parallel with production
126 database operations. (Commands such as <command>SELECT</command>,
127 <command>INSERT</command>, <command>UPDATE</command>, and
128 <command>DELETE</command> will continue to function normally, though you
129 will not be able to modify the definition of a table with commands such as
130 <command>ALTER TABLE</command> while it is being vacuumed.)
131 <command>VACUUM FULL</command> requires an
132 <literal>ACCESS EXCLUSIVE</literal> lock on the table it is
133 working on, and therefore cannot be done in parallel with other use
134 of the table. Generally, therefore,
135 administrators should strive to use standard <command>VACUUM</command> and
136 avoid <command>VACUUM FULL</command>.
137 </para>
139 <para>
140 <command>VACUUM</command> creates a substantial amount of I/O
141 traffic, which can cause poor performance for other active sessions.
142 There are configuration parameters that can be adjusted to reduce the
143 performance impact of background vacuuming &mdash; see
144 <xref linkend="runtime-config-resource-vacuum-cost"/>.
145 </para>
146 </sect2>
148 <sect2 id="vacuum-for-space-recovery">
149 <title>Recovering Disk Space</title>
151 <indexterm zone="vacuum-for-space-recovery">
152 <primary>disk space</primary>
153 </indexterm>
155 <para>
156 In <productname>PostgreSQL</productname>, an
157 <command>UPDATE</command> or <command>DELETE</command> of a row does not
158 immediately remove the old version of the row.
159 This approach is necessary to gain the benefits of multiversion
160 concurrency control (<acronym>MVCC</acronym>, see <xref linkend="mvcc"/>): the row version
161 must not be deleted while it is still potentially visible to other
162 transactions. But eventually, an outdated or deleted row version is no
163 longer of interest to any transaction. The space it occupies must then be
164 reclaimed for reuse by new rows, to avoid unbounded growth of disk
165 space requirements. This is done by running <command>VACUUM</command>.
166 </para>
168 <para>
169 The standard form of <command>VACUUM</command> removes dead row
170 versions in tables and indexes and marks the space available for
171 future reuse. However, it will not return the space to the operating
172 system, except in the special case where one or more pages at the
173 end of a table become entirely free and an exclusive table lock can be
174 easily obtained. In contrast, <command>VACUUM FULL</command> actively compacts
175 tables by writing a complete new version of the table file with no dead
176 space. This minimizes the size of the table, but can take a long time.
177 It also requires extra disk space for the new copy of the table, until
178 the operation completes.
179 </para>
181 <para>
182 The usual goal of routine vacuuming is to do standard <command>VACUUM</command>s
183 often enough to avoid needing <command>VACUUM FULL</command>. The
184 autovacuum daemon attempts to work this way, and in fact will
185 never issue <command>VACUUM FULL</command>. In this approach, the idea
186 is not to keep tables at their minimum size, but to maintain steady-state
187 usage of disk space: each table occupies space equivalent to its
188 minimum size plus however much space gets used up between vacuum runs.
189 Although <command>VACUUM FULL</command> can be used to shrink a table back
190 to its minimum size and return the disk space to the operating system,
191 there is not much point in this if the table will just grow again in the
192 future. Thus, moderately-frequent standard <command>VACUUM</command> runs are a
193 better approach than infrequent <command>VACUUM FULL</command> runs for
194 maintaining heavily-updated tables.
195 </para>
197 <para>
198 Some administrators prefer to schedule vacuuming themselves, for example
199 doing all the work at night when load is low.
200 The difficulty with doing vacuuming according to a fixed schedule
201 is that if a table has an unexpected spike in update activity, it may
202 get bloated to the point that <command>VACUUM FULL</command> is really necessary
203 to reclaim space. Using the autovacuum daemon alleviates this problem,
204 since the daemon schedules vacuuming dynamically in response to update
205 activity. It is unwise to disable the daemon completely unless you
206 have an extremely predictable workload. One possible compromise is
207 to set the daemon's parameters so that it will only react to unusually
208 heavy update activity, thus keeping things from getting out of hand,
209 while scheduled <command>VACUUM</command>s are expected to do the bulk of the
210 work when the load is typical.
211 </para>
213 <para>
214 For those not using autovacuum, a typical approach is to schedule a
215 database-wide <command>VACUUM</command> once a day during a low-usage period,
216 supplemented by more frequent vacuuming of heavily-updated tables as
217 necessary. (Some installations with extremely high update rates vacuum
218 their busiest tables as often as once every few minutes.) If you have
219 multiple databases in a cluster, don't forget to
220 <command>VACUUM</command> each one; the program <xref
221 linkend="app-vacuumdb"/> might be helpful.
222 </para>
224 <tip>
225 <para>
226 Plain <command>VACUUM</command> may not be satisfactory when
227 a table contains large numbers of dead row versions as a result of
228 massive update or delete activity. If you have such a table and
229 you need to reclaim the excess disk space it occupies, you will need
230 to use <command>VACUUM FULL</command>, or alternatively
231 <link linkend="sql-cluster"><command>CLUSTER</command></link>
232 or one of the table-rewriting variants of
233 <link linkend="sql-altertable"><command>ALTER TABLE</command></link>.
234 These commands rewrite an entire new copy of the table and build
235 new indexes for it. All these options require an
236 <literal>ACCESS EXCLUSIVE</literal> lock. Note that
237 they also temporarily use extra disk space approximately equal to the size
238 of the table, since the old copies of the table and indexes can't be
239 released until the new ones are complete.
240 </para>
241 </tip>
243 <tip>
244 <para>
245 If you have a table whose entire contents are deleted on a periodic
246 basis, consider doing it with
247 <link linkend="sql-truncate"><command>TRUNCATE</command></link> rather
248 than using <command>DELETE</command> followed by
249 <command>VACUUM</command>. <command>TRUNCATE</command> removes the
250 entire content of the table immediately, without requiring a
251 subsequent <command>VACUUM</command> or <command>VACUUM
252 FULL</command> to reclaim the now-unused disk space.
253 The disadvantage is that strict MVCC semantics are violated.
254 </para>
255 </tip>
256 </sect2>
258 <sect2 id="vacuum-for-statistics">
259 <title>Updating Planner Statistics</title>
261 <indexterm zone="vacuum-for-statistics">
262 <primary>statistics</primary>
263 <secondary>of the planner</secondary>
264 </indexterm>
266 <indexterm zone="vacuum-for-statistics">
267 <primary>ANALYZE</primary>
268 </indexterm>
270 <para>
271 The <productname>PostgreSQL</productname> query planner relies on
272 statistical information about the contents of tables in order to
273 generate good plans for queries. These statistics are gathered by
274 the <link linkend="sql-analyze"><command>ANALYZE</command></link> command,
275 which can be invoked by itself or
276 as an optional step in <command>VACUUM</command>. It is important to have
277 reasonably accurate statistics, otherwise poor choices of plans might
278 degrade database performance.
279 </para>
281 <para>
282 The autovacuum daemon, if enabled, will automatically issue
283 <command>ANALYZE</command> commands whenever the content of a table has
284 changed sufficiently. However, administrators might prefer to rely
285 on manually-scheduled <command>ANALYZE</command> operations, particularly
286 if it is known that update activity on a table will not affect the
287 statistics of <quote>interesting</quote> columns. The daemon schedules
288 <command>ANALYZE</command> strictly as a function of the number of rows
289 inserted or updated; it has no knowledge of whether that will lead
290 to meaningful statistical changes.
291 </para>
293 <para>
294 Tuples changed in partitions and inheritance children do not trigger
295 analyze on the parent table. If the parent table is empty or rarely
296 changed, it may never be processed by autovacuum, and the statistics for
297 the inheritance tree as a whole won't be collected. It is necessary to
298 run <command>ANALYZE</command> on the parent table manually in order to
299 keep the statistics up to date.
300 </para>
302 <para>
303 As with vacuuming for space recovery, frequent updates of statistics
304 are more useful for heavily-updated tables than for seldom-updated
305 ones. But even for a heavily-updated table, there might be no need for
306 statistics updates if the statistical distribution of the data is
307 not changing much. A simple rule of thumb is to think about how much
308 the minimum and maximum values of the columns in the table change.
309 For example, a <type>timestamp</type> column that contains the time
310 of row update will have a constantly-increasing maximum value as
311 rows are added and updated; such a column will probably need more
312 frequent statistics updates than, say, a column containing URLs for
313 pages accessed on a website. The URL column might receive changes just
314 as often, but the statistical distribution of its values probably
315 changes relatively slowly.
316 </para>
318 <para>
319 It is possible to run <command>ANALYZE</command> on specific tables and even
320 just specific columns of a table, so the flexibility exists to update some
321 statistics more frequently than others if your application requires it.
322 In practice, however, it is usually best to just analyze the entire
323 database, because it is a fast operation. <command>ANALYZE</command> uses a
324 statistically random sampling of the rows of a table rather than reading
325 every single row.
326 </para>
328 <tip>
329 <para>
330 Although per-column tweaking of <command>ANALYZE</command> frequency might not be
331 very productive, you might find it worthwhile to do per-column
332 adjustment of the level of detail of the statistics collected by
333 <command>ANALYZE</command>. Columns that are heavily used in <literal>WHERE</literal>
334 clauses and have highly irregular data distributions might require a
335 finer-grain data histogram than other columns. See <command>ALTER TABLE
336 SET STATISTICS</command>, or change the database-wide default using the <xref
337 linkend="guc-default-statistics-target"/> configuration parameter.
338 </para>
340 <para>
341 Also, by default there is limited information available about
342 the selectivity of functions. However, if you create a statistics
343 object or an expression
344 index that uses a function call, useful statistics will be
345 gathered about the function, which can greatly improve query
346 plans that use the expression index.
347 </para>
348 </tip>
350 <tip>
351 <para>
352 The autovacuum daemon does not issue <command>ANALYZE</command> commands for
353 foreign tables, since it has no means of determining how often that
354 might be useful. If your queries require statistics on foreign tables
355 for proper planning, it's a good idea to run manually-managed
356 <command>ANALYZE</command> commands on those tables on a suitable schedule.
357 </para>
358 </tip>
360 <tip>
361 <para>
362 The autovacuum daemon does not issue <command>ANALYZE</command> commands
363 for partitioned tables. Inheritance parents will only be analyzed if the
364 parent itself is changed - changes to child tables do not trigger
365 autoanalyze on the parent table. If your queries require statistics on
366 parent tables for proper planning, it is necessary to periodically run
367 a manual <command>ANALYZE</command> on those tables to keep the statistics
368 up to date.
369 </para>
370 </tip>
372 </sect2>
374 <sect2 id="vacuum-for-visibility-map">
375 <title>Updating the Visibility Map</title>
377 <para>
378 Vacuum maintains a <link linkend="storage-vm">visibility map</link> for each
379 table to keep track of which pages contain only tuples that are known to be
380 visible to all active transactions (and all future transactions, until the
381 page is again modified). This has two purposes. First, vacuum
382 itself can skip such pages on the next run, since there is nothing to
383 clean up.
384 </para>
386 <para>
387 Second, it allows <productname>PostgreSQL</productname> to answer some
388 queries using only the index, without reference to the underlying table.
389 Since <productname>PostgreSQL</productname> indexes don't contain tuple
390 visibility information, a normal index scan fetches the heap tuple for each
391 matching index entry, to check whether it should be seen by the current
392 transaction.
393 An <link linkend="indexes-index-only-scans"><firstterm>index-only
394 scan</firstterm></link>, on the other hand, checks the visibility map first.
395 If it's known that all tuples on the page are
396 visible, the heap fetch can be skipped. This is most useful on
397 large data sets where the visibility map can prevent disk accesses.
398 The visibility map is vastly smaller than the heap, so it can easily be
399 cached even when the heap is very large.
400 </para>
401 </sect2>
403 <sect2 id="vacuum-for-wraparound">
404 <title>Preventing Transaction ID Wraparound Failures</title>
406 <indexterm zone="vacuum-for-wraparound">
407 <primary>transaction ID</primary>
408 <secondary>wraparound</secondary>
409 </indexterm>
411 <indexterm>
412 <primary>wraparound</primary>
413 <secondary>of transaction IDs</secondary>
414 </indexterm>
416 <para>
417 <productname>PostgreSQL</productname>'s
418 <link linkend="mvcc-intro">MVCC</link> transaction semantics
419 depend on being able to compare transaction ID (<acronym>XID</acronym>)
420 numbers: a row version with an insertion XID greater than the current
421 transaction's XID is <quote>in the future</quote> and should not be visible
422 to the current transaction. But since transaction IDs have limited size
423 (32 bits) a cluster that runs for a long time (more
424 than 4 billion transactions) would suffer <firstterm>transaction ID
425 wraparound</firstterm>: the XID counter wraps around to zero, and all of a sudden
426 transactions that were in the past appear to be in the future &mdash; which
427 means their output become invisible. In short, catastrophic data loss.
428 (Actually the data is still there, but that's cold comfort if you cannot
429 get at it.) To avoid this, it is necessary to vacuum every table
430 in every database at least once every two billion transactions.
431 </para>
433 <para>
434 The reason that periodic vacuuming solves the problem is that
435 <command>VACUUM</command> will mark rows as <emphasis>frozen</emphasis>, indicating that
436 they were inserted by a transaction that committed sufficiently far in
437 the past that the effects of the inserting transaction are certain to be
438 visible to all current and future transactions.
439 Normal XIDs are
440 compared using modulo-2<superscript>32</superscript> arithmetic. This means
441 that for every normal XID, there are two billion XIDs that are
442 <quote>older</quote> and two billion that are <quote>newer</quote>; another
443 way to say it is that the normal XID space is circular with no
444 endpoint. Therefore, once a row version has been created with a particular
445 normal XID, the row version will appear to be <quote>in the past</quote> for
446 the next two billion transactions, no matter which normal XID we are
447 talking about. If the row version still exists after more than two billion
448 transactions, it will suddenly appear to be in the future. To
449 prevent this, <productname>PostgreSQL</productname> reserves a special XID,
450 <literal>FrozenTransactionId</literal>, which does not follow the normal XID
451 comparison rules and is always considered older
452 than every normal XID.
453 Frozen row versions are treated as if the inserting XID were
454 <literal>FrozenTransactionId</literal>, so that they will appear to be
455 <quote>in the past</quote> to all normal transactions regardless of wraparound
456 issues, and so such row versions will be valid until deleted, no matter
457 how long that is.
458 </para>
460 <note>
461 <para>
462 In <productname>PostgreSQL</productname> versions before 9.4, freezing was
463 implemented by actually replacing a row's insertion XID
464 with <literal>FrozenTransactionId</literal>, which was visible in the
465 row's <structname>xmin</structname> system column. Newer versions just set a flag
466 bit, preserving the row's original <structname>xmin</structname> for possible
467 forensic use. However, rows with <structname>xmin</structname> equal
468 to <literal>FrozenTransactionId</literal> (2) may still be found
469 in databases <application>pg_upgrade</application>'d from pre-9.4 versions.
470 </para>
471 <para>
472 Also, system catalogs may contain rows with <structname>xmin</structname> equal
473 to <literal>BootstrapTransactionId</literal> (1), indicating that they were
474 inserted during the first phase of <application>initdb</application>.
475 Like <literal>FrozenTransactionId</literal>, this special XID is treated as
476 older than every normal XID.
477 </para>
478 </note>
480 <para>
481 <xref linkend="guc-vacuum-freeze-min-age"/>
482 controls how old an XID value has to be before rows bearing that XID will be
483 frozen. Increasing this setting may avoid unnecessary work if the
484 rows that would otherwise be frozen will soon be modified again,
485 but decreasing this setting increases
486 the number of transactions that can elapse before the table must be
487 vacuumed again.
488 </para>
490 <para>
491 <command>VACUUM</command> uses the <link linkend="storage-vm">visibility map</link>
492 to determine which pages of a table must be scanned. Normally, it
493 will skip pages that don't have any dead row versions even if those pages
494 might still have row versions with old XID values. Therefore, normal
495 <command>VACUUM</command>s won't always freeze every old row version in the table.
496 When that happens, <command>VACUUM</command> will eventually need to perform an
497 <firstterm>aggressive vacuum</firstterm>, which will freeze all eligible unfrozen
498 XID and MXID values, including those from all-visible but not all-frozen pages.
499 In practice most tables require periodic aggressive vacuuming.
500 <xref linkend="guc-vacuum-freeze-table-age"/>
501 controls when <command>VACUUM</command> does that: all-visible but not all-frozen
502 pages are scanned if the number of transactions that have passed since the
503 last such scan is greater than <varname>vacuum_freeze_table_age</varname> minus
504 <varname>vacuum_freeze_min_age</varname>. Setting
505 <varname>vacuum_freeze_table_age</varname> to 0 forces <command>VACUUM</command> to
506 always use its aggressive strategy.
507 </para>
509 <para>
510 The maximum time that a table can go unvacuumed is two billion
511 transactions minus the <varname>vacuum_freeze_min_age</varname> value at
512 the time of the last aggressive vacuum. If it were to go
513 unvacuumed for longer than
514 that, data loss could result. To ensure that this does not happen,
515 autovacuum is invoked on any table that might contain unfrozen rows with
516 XIDs older than the age specified by the configuration parameter <xref
517 linkend="guc-autovacuum-freeze-max-age"/>. (This will happen even if
518 autovacuum is disabled.)
519 </para>
521 <para>
522 This implies that if a table is not otherwise vacuumed,
523 autovacuum will be invoked on it approximately once every
524 <varname>autovacuum_freeze_max_age</varname> minus
525 <varname>vacuum_freeze_min_age</varname> transactions.
526 For tables that are regularly vacuumed for space reclamation purposes,
527 this is of little importance. However, for static tables
528 (including tables that receive inserts, but no updates or deletes),
529 there is no need to vacuum for space reclamation, so it can
530 be useful to try to maximize the interval between forced autovacuums
531 on very large static tables. Obviously one can do this either by
532 increasing <varname>autovacuum_freeze_max_age</varname> or decreasing
533 <varname>vacuum_freeze_min_age</varname>.
534 </para>
536 <para>
537 The effective maximum for <varname>vacuum_freeze_table_age</varname> is 0.95 *
538 <varname>autovacuum_freeze_max_age</varname>; a setting higher than that will be
539 capped to the maximum. A value higher than
540 <varname>autovacuum_freeze_max_age</varname> wouldn't make sense because an
541 anti-wraparound autovacuum would be triggered at that point anyway, and
542 the 0.95 multiplier leaves some breathing room to run a manual
543 <command>VACUUM</command> before that happens. As a rule of thumb,
544 <command>vacuum_freeze_table_age</command> should be set to a value somewhat
545 below <varname>autovacuum_freeze_max_age</varname>, leaving enough gap so that
546 a regularly scheduled <command>VACUUM</command> or an autovacuum triggered by
547 normal delete and update activity is run in that window. Setting it too
548 close could lead to anti-wraparound autovacuums, even though the table
549 was recently vacuumed to reclaim space, whereas lower values lead to more
550 frequent aggressive vacuuming.
551 </para>
553 <para>
554 The sole disadvantage of increasing <varname>autovacuum_freeze_max_age</varname>
555 (and <varname>vacuum_freeze_table_age</varname> along with it) is that
556 the <filename>pg_xact</filename> and <filename>pg_commit_ts</filename>
557 subdirectories of the database cluster will take more space, because it
558 must store the commit status and (if <varname>track_commit_timestamp</varname> is
559 enabled) timestamp of all transactions back to
560 the <varname>autovacuum_freeze_max_age</varname> horizon. The commit status uses
561 two bits per transaction, so if
562 <varname>autovacuum_freeze_max_age</varname> is set to its maximum allowed value
563 of two billion, <filename>pg_xact</filename> can be expected to grow to about half
564 a gigabyte and <filename>pg_commit_ts</filename> to about 20GB. If this
565 is trivial compared to your total database size,
566 setting <varname>autovacuum_freeze_max_age</varname> to its maximum allowed value
567 is recommended. Otherwise, set it depending on what you are willing to
568 allow for <filename>pg_xact</filename> and <filename>pg_commit_ts</filename> storage.
569 (The default, 200 million transactions, translates to about 50MB
570 of <filename>pg_xact</filename> storage and about 2GB of <filename>pg_commit_ts</filename>
571 storage.)
572 </para>
574 <para>
575 One disadvantage of decreasing <varname>vacuum_freeze_min_age</varname> is that
576 it might cause <command>VACUUM</command> to do useless work: freezing a row
577 version is a waste of time if the row is modified
578 soon thereafter (causing it to acquire a new XID). So the setting should
579 be large enough that rows are not frozen until they are unlikely to change
580 any more.
581 </para>
583 <para>
584 To track the age of the oldest unfrozen XIDs in a database,
585 <command>VACUUM</command> stores XID
586 statistics in the system tables <structname>pg_class</structname> and
587 <structname>pg_database</structname>. In particular,
588 the <structfield>relfrozenxid</structfield> column of a table's
589 <structname>pg_class</structname> row contains the oldest remaining unfrozen
590 XID at the end of the most recent <command>VACUUM</command> that successfully
591 advanced <structfield>relfrozenxid</structfield> (typically the most recent
592 aggressive VACUUM). Similarly, the
593 <structfield>datfrozenxid</structfield> column of a database's
594 <structname>pg_database</structname> row is a lower bound on the unfrozen XIDs
595 appearing in that database &mdash; it is just the minimum of the
596 per-table <structfield>relfrozenxid</structfield> values within the database.
597 A convenient way to
598 examine this information is to execute queries such as:
600 <programlisting>
601 SELECT c.oid::regclass as table_name,
602 greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
603 FROM pg_class c
604 LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
605 WHERE c.relkind IN ('r', 'm');
607 SELECT datname, age(datfrozenxid) FROM pg_database;
608 </programlisting>
610 The <literal>age</literal> column measures the number of transactions from the
611 cutoff XID to the current transaction's XID.
612 </para>
614 <tip>
615 <para>
616 When the <command>VACUUM</command> command's <literal>VERBOSE</literal>
617 parameter is specified, <command>VACUUM</command> prints various
618 statistics about the table. This includes information about how
619 <structfield>relfrozenxid</structfield> and
620 <structfield>relminmxid</structfield> advanced, and the number of
621 newly frozen pages. The same details appear in the server log when
622 autovacuum logging (controlled by <xref
623 linkend="guc-log-autovacuum-min-duration"/>) reports on a
624 <command>VACUUM</command> operation executed by autovacuum.
625 </para>
626 </tip>
628 <para>
629 <command>VACUUM</command> normally only scans pages that have been modified
630 since the last vacuum, but <structfield>relfrozenxid</structfield> can only be
631 advanced when every page of the table
632 that might contain unfrozen XIDs is scanned. This happens when
633 <structfield>relfrozenxid</structfield> is more than
634 <varname>vacuum_freeze_table_age</varname> transactions old, when
635 <command>VACUUM</command>'s <literal>FREEZE</literal> option is used, or when all
636 pages that are not already all-frozen happen to
637 require vacuuming to remove dead row versions. When <command>VACUUM</command>
638 scans every page in the table that is not already all-frozen, it should
639 set <literal>age(relfrozenxid)</literal> to a value just a little more than the
640 <varname>vacuum_freeze_min_age</varname> setting
641 that was used (more by the number of transactions started since the
642 <command>VACUUM</command> started). <command>VACUUM</command>
643 will set <structfield>relfrozenxid</structfield> to the oldest XID
644 that remains in the table, so it's possible that the final value
645 will be much more recent than strictly required.
646 If no <structfield>relfrozenxid</structfield>-advancing
647 <command>VACUUM</command> is issued on the table until
648 <varname>autovacuum_freeze_max_age</varname> is reached, an autovacuum will soon
649 be forced for the table.
650 </para>
652 <para>
653 If for some reason autovacuum fails to clear old XIDs from a table, the
654 system will begin to emit warning messages like this when the database's
655 oldest XIDs reach forty million transactions from the wraparound point:
657 <programlisting>
658 WARNING: database "mydb" must be vacuumed within 39985967 transactions
659 HINT: To avoid XID assignment failures, execute a database-wide VACUUM in that database.
660 </programlisting>
662 (A manual <command>VACUUM</command> should fix the problem, as suggested by the
663 hint; but note that the <command>VACUUM</command> should be performed by a
664 superuser, else it will fail to process system catalogs, which prevent it from
665 being able to advance the database's <structfield>datfrozenxid</structfield>.)
666 If these warnings are ignored, the system will refuse to assign new XIDs once
667 there are fewer than three million transactions left until wraparound:
669 <programlisting>
670 ERROR: database is not accepting commands that assign new XIDs to avoid wraparound data loss in database "mydb"
671 HINT: Execute a database-wide VACUUM in that database.
672 </programlisting>
674 In this condition any transactions already in progress can continue,
675 but only read-only transactions can be started. Operations that
676 modify database records or truncate relations will fail.
677 The <command>VACUUM</command> command can still be run normally.
678 Note that, contrary to what was sometimes recommended in earlier releases,
679 it is not necessary or desirable to stop the postmaster or enter single
680 user-mode in order to restore normal operation.
681 Instead, follow these steps:
683 <orderedlist>
684 <listitem>
685 <simpara>Resolve old prepared transactions. You can find these by checking
686 <link linkend="view-pg-prepared-xacts">pg_prepared_xacts</link> for rows where
687 <literal>age(transactionid)</literal> is large. Such transactions should be
688 committed or rolled back.</simpara>
689 </listitem>
690 <listitem>
691 <simpara>End long-running open transactions. You can find these by checking
692 <link linkend="monitoring-pg-stat-activity-view">pg_stat_activity</link> for rows where
693 <literal>age(backend_xid)</literal> or <literal>age(backend_xmin)</literal> is
694 large. Such transactions should be committed or rolled back, or the session
695 can be terminated using <literal>pg_terminate_backend</literal>.</simpara>
696 </listitem>
697 <listitem>
698 <simpara>Drop any old replication slots. Use
699 <link linkend="monitoring-pg-stat-replication-view">pg_stat_replication</link> to
700 find slots where <literal>age(xmin)</literal> or <literal>age(catalog_xmin)</literal>
701 is large. In many cases, such slots were created for replication to servers that no
702 longer exist, or that have been down for a long time. If you drop a slot for a server
703 that still exists and might still try to connect to that slot, that replica may
704 need to be rebuilt.</simpara>
705 </listitem>
706 <listitem>
707 <simpara>Execute <command>VACUUM</command> in the target database. A database-wide
708 <literal>VACUUM</literal> is simplest; to reduce the time required, it as also possible
709 to issue manual <command>VACUUM</command> commands on the tables where
710 <structfield>relminxid</structfield> is oldest. Do not use <literal>VACUUM FULL</literal>
711 in this scenario, because it requires an XID and will therefore fail, except in super-user
712 mode, where it will instead consume an XID and thus increase the risk of transaction ID
713 wraparound. Do not use <literal>VACUUM FREEZE</literal> either, because it will do
714 more than the minimum amount of work required to restore normal operation.</simpara>
715 </listitem>
716 <listitem>
717 <simpara>Once normal operation is restored, ensure that autovacuum is properly configured
718 in the target database in order to avoid future problems.</simpara>
719 </listitem>
720 </orderedlist>
721 </para>
723 <note>
724 <para>
725 In earlier versions, it was sometimes necessary to stop the postmaster and
726 <command>VACUUM</command> the database in a single-user mode. In typical scenarios, this
727 is no longer necessary, and should be avoided whenever possible, since it involves taking
728 the system down. It is also riskier, since it disables transaction ID wraparound safeguards
729 that are designed to prevent data loss. The only reason to use single-user mode in this
730 scenario is if you wish to <command>TRUNCATE</command> or <command>DROP</command> unneeded
731 tables to avoid needing to <command>VACUUM</command> them. The three-million-transaction
732 safety margin exists to let the administrator do this. See the
733 <xref linkend="app-postgres"/> reference page for details about using single-user mode.
734 </para>
735 </note>
737 <sect3 id="vacuum-for-multixact-wraparound">
738 <title>Multixacts and Wraparound</title>
740 <indexterm>
741 <primary>MultiXactId</primary>
742 </indexterm>
744 <indexterm>
745 <primary>wraparound</primary>
746 <secondary>of multixact IDs</secondary>
747 </indexterm>
749 <para>
750 <firstterm>Multixact IDs</firstterm> are used to support row locking by
751 multiple transactions. Since there is only limited space in a tuple
752 header to store lock information, that information is encoded as
753 a <quote>multiple transaction ID</quote>, or multixact ID for short,
754 whenever there is more than one transaction concurrently locking a
755 row. Information about which transaction IDs are included in any
756 particular multixact ID is stored separately in
757 the <filename>pg_multixact</filename> subdirectory, and only the multixact ID
758 appears in the <structfield>xmax</structfield> field in the tuple header.
759 Like transaction IDs, multixact IDs are implemented as a
760 32-bit counter and corresponding storage, all of which requires
761 careful aging management, storage cleanup, and wraparound handling.
762 There is a separate storage area which holds the list of members in
763 each multixact, which also uses a 32-bit counter and which must also
764 be managed.
765 </para>
767 <para>
768 Whenever <command>VACUUM</command> scans any part of a table, it will replace
769 any multixact ID it encounters which is older than
770 <xref linkend="guc-vacuum-multixact-freeze-min-age"/>
771 by a different value, which can be the zero value, a single
772 transaction ID, or a newer multixact ID. For each table,
773 <structname>pg_class</structname>.<structfield>relminmxid</structfield> stores the oldest
774 possible multixact ID still appearing in any tuple of that table.
775 If this value is older than
776 <xref linkend="guc-vacuum-multixact-freeze-table-age"/>, an aggressive
777 vacuum is forced. As discussed in the previous section, an aggressive
778 vacuum means that only those pages which are known to be all-frozen will
779 be skipped. <function>mxid_age()</function> can be used on
780 <structname>pg_class</structname>.<structfield>relminmxid</structfield> to find its age.
781 </para>
783 <para>
784 Aggressive <command>VACUUM</command>s, regardless of what causes
785 them, are <emphasis>guaranteed</emphasis> to be able to advance
786 the table's <structfield>relminmxid</structfield>.
787 Eventually, as all tables in all databases are scanned and their
788 oldest multixact values are advanced, on-disk storage for older
789 multixacts can be removed.
790 </para>
792 <para>
793 As a safety device, an aggressive vacuum scan will
794 occur for any table whose multixact-age is greater than <xref
795 linkend="guc-autovacuum-multixact-freeze-max-age"/>. Also, if the
796 storage occupied by multixacts members exceeds 2GB, aggressive vacuum
797 scans will occur more often for all tables, starting with those that
798 have the oldest multixact-age. Both of these kinds of aggressive
799 scans will occur even if autovacuum is nominally disabled.
800 </para>
802 <para>
803 Similar to the XID case, if autovacuum fails to clear old MXIDs from a table, the
804 system will begin to emit warning messages when the database's oldest MXIDs reach forty
805 million transactions from the wraparound point. And, just as in the XID case, if these
806 warnings are ignored, the system will refuse to generate new MXIDs once there are fewer
807 than three million left until wraparound.
808 </para>
810 <para>
811 Normal operation when MXIDs are exhausted can be restored in much the same way as
812 when XIDs are exhausted. Follow the same steps in the previous section, but with the
813 following differences:
815 <orderedlist>
816 <listitem>
817 <simpara>Running transactions and prepared transactions can be ignored if there
818 is no chance that they might appear in a multixact.</simpara>
819 </listitem>
820 <listitem>
821 <simpara>MXID information is not directly visible in system views such as
822 <literal>pg_stat_activity</literal>; however, looking for old XIDs is still a good
823 way of determining which transactions are causing MXID wraparound problems.</simpara>
824 </listitem>
825 <listitem>
826 <simpara>XID exhaustion will block all write transactions, but MXID exhaustion will
827 only block a subset of write transactions, specifically those that involve
828 row locks that require an MXID.</simpara>
829 </listitem>
830 </orderedlist>
831 </para>
833 </sect3>
834 </sect2>
836 <sect2 id="autovacuum">
837 <title>The Autovacuum Daemon</title>
839 <indexterm>
840 <primary>autovacuum</primary>
841 <secondary>general information</secondary>
842 </indexterm>
843 <para>
844 <productname>PostgreSQL</productname> has an optional but highly
845 recommended feature called <firstterm>autovacuum</firstterm>,
846 whose purpose is to automate the execution of
847 <command>VACUUM</command> and <command>ANALYZE</command> commands.
848 When enabled, autovacuum checks for
849 tables that have had a large number of inserted, updated or deleted
850 tuples. These checks use the statistics collection facility;
851 therefore, autovacuum cannot be used unless <xref
852 linkend="guc-track-counts"/> is set to <literal>true</literal>.
853 In the default configuration, autovacuuming is enabled and the related
854 configuration parameters are appropriately set.
855 </para>
857 <para>
858 The <quote>autovacuum daemon</quote> actually consists of multiple processes.
859 There is a persistent daemon process, called the
860 <firstterm>autovacuum launcher</firstterm>, which is in charge of starting
861 <firstterm>autovacuum worker</firstterm> processes for all databases. The
862 launcher will distribute the work across time, attempting to start one
863 worker within each database every <xref linkend="guc-autovacuum-naptime"/>
864 seconds. (Therefore, if the installation has <replaceable>N</replaceable> databases,
865 a new worker will be launched every
866 <varname>autovacuum_naptime</varname>/<replaceable>N</replaceable> seconds.)
867 A maximum of <xref linkend="guc-autovacuum-max-workers"/> worker processes
868 are allowed to run at the same time. If there are more than
869 <varname>autovacuum_max_workers</varname> databases to be processed,
870 the next database will be processed as soon as the first worker finishes.
871 Each worker process will check each table within its database and
872 execute <command>VACUUM</command> and/or <command>ANALYZE</command> as needed.
873 <xref linkend="guc-log-autovacuum-min-duration"/> can be set to monitor
874 autovacuum workers' activity.
875 </para>
877 <para>
878 If several large tables all become eligible for vacuuming in a short
879 amount of time, all autovacuum workers might become occupied with
880 vacuuming those tables for a long period. This would result
881 in other tables and databases not being vacuumed until a worker becomes
882 available. There is no limit on how many workers might be in a
883 single database, but workers do try to avoid repeating work that has
884 already been done by other workers. Note that the number of running
885 workers does not count towards <xref linkend="guc-max-connections"/> or
886 <xref linkend="guc-superuser-reserved-connections"/> limits.
887 </para>
889 <para>
890 Tables whose <structfield>relfrozenxid</structfield> value is more than
891 <xref linkend="guc-autovacuum-freeze-max-age"/> transactions old are always
892 vacuumed (this also applies to those tables whose freeze max age has
893 been modified via storage parameters; see below). Otherwise, if the
894 number of tuples obsoleted since the last
895 <command>VACUUM</command> exceeds the <quote>vacuum threshold</quote>, the
896 table is vacuumed. The vacuum threshold is defined as:
897 <programlisting>
898 vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
899 </programlisting>
900 where the vacuum base threshold is
901 <xref linkend="guc-autovacuum-vacuum-threshold"/>,
902 the vacuum scale factor is
903 <xref linkend="guc-autovacuum-vacuum-scale-factor"/>,
904 and the number of tuples is
905 <structname>pg_class</structname>.<structfield>reltuples</structfield>.
906 </para>
908 <para>
909 The table is also vacuumed if the number of tuples inserted since the last
910 vacuum has exceeded the defined insert threshold, which is defined as:
911 <programlisting>
912 vacuum insert threshold = vacuum base insert threshold + vacuum insert scale factor * number of tuples
913 </programlisting>
914 where the vacuum insert base threshold is
915 <xref linkend="guc-autovacuum-vacuum-insert-threshold"/>,
916 and vacuum insert scale factor is
917 <xref linkend="guc-autovacuum-vacuum-insert-scale-factor"/>.
918 Such vacuums may allow portions of the table to be marked as
919 <firstterm>all visible</firstterm> and also allow tuples to be frozen, which
920 can reduce the work required in subsequent vacuums.
921 For tables which receive <command>INSERT</command> operations but no or
922 almost no <command>UPDATE</command>/<command>DELETE</command> operations,
923 it may be beneficial to lower the table's
924 <xref linkend="reloption-autovacuum-freeze-min-age"/> as this may allow
925 tuples to be frozen by earlier vacuums. The number of obsolete tuples and
926 the number of inserted tuples are obtained from the cumulative statistics system;
927 it is an eventually-consistent count updated by each <command>UPDATE</command>,
928 <command>DELETE</command> and <command>INSERT</command> operation.
929 If the <structfield>relfrozenxid</structfield> value of the table
930 is more than <varname>vacuum_freeze_table_age</varname> transactions old,
931 an aggressive vacuum is performed to freeze old tuples and advance
932 <structfield>relfrozenxid</structfield>; otherwise, only pages that have been modified
933 since the last vacuum are scanned.
934 </para>
936 <para>
937 For analyze, a similar condition is used: the threshold, defined as:
938 <programlisting>
939 analyze threshold = analyze base threshold + analyze scale factor * number of tuples
940 </programlisting>
941 is compared to the total number of tuples inserted, updated, or deleted
942 since the last <command>ANALYZE</command>.
943 </para>
945 <para>
946 Partitioned tables do not directly store tuples and consequently
947 are not processed by autovacuum. (Autovacuum does process table
948 partitions just like other tables.) Unfortunately, this means that
949 autovacuum does not run <command>ANALYZE</command> on partitioned
950 tables, and this can cause suboptimal plans for queries that reference
951 partitioned table statistics. You can work around this problem by
952 manually running <command>ANALYZE</command> on partitioned tables
953 when they are first populated, and again whenever the distribution
954 of data in their partitions changes significantly.
955 </para>
957 <para>
958 Temporary tables cannot be accessed by autovacuum. Therefore,
959 appropriate vacuum and analyze operations should be performed via
960 session SQL commands.
961 </para>
963 <para>
964 The default thresholds and scale factors are taken from
965 <filename>postgresql.conf</filename>, but it is possible to override them
966 (and many other autovacuum control parameters) on a per-table basis; see
967 <xref linkend="sql-createtable-storage-parameters"/> for more information.
968 If a setting has been changed via a table's storage parameters, that value
969 is used when processing that table; otherwise the global settings are
970 used. See <xref linkend="runtime-config-autovacuum"/> for more details on
971 the global settings.
972 </para>
974 <para>
975 When multiple workers are running, the autovacuum cost delay parameters
976 (see <xref linkend="runtime-config-resource-vacuum-cost"/>) are
977 <quote>balanced</quote> among all the running workers, so that the
978 total I/O impact on the system is the same regardless of the number
979 of workers actually running. However, any workers processing tables whose
980 per-table <literal>autovacuum_vacuum_cost_delay</literal> or
981 <literal>autovacuum_vacuum_cost_limit</literal> storage parameters have been set
982 are not considered in the balancing algorithm.
983 </para>
985 <para>
986 Autovacuum workers generally don't block other commands. If a process
987 attempts to acquire a lock that conflicts with the
988 <literal>SHARE UPDATE EXCLUSIVE</literal> lock held by autovacuum, lock
989 acquisition will interrupt the autovacuum. For conflicting lock modes,
990 see <xref linkend="table-lock-compatibility"/>. However, if the autovacuum
991 is running to prevent transaction ID wraparound (i.e., the autovacuum query
992 name in the <structname>pg_stat_activity</structname> view ends with
993 <literal>(to prevent wraparound)</literal>), the autovacuum is not
994 automatically interrupted.
995 </para>
997 <warning>
998 <para>
999 Regularly running commands that acquire locks conflicting with a
1000 <literal>SHARE UPDATE EXCLUSIVE</literal> lock (e.g., ANALYZE) can
1001 effectively prevent autovacuums from ever completing.
1002 </para>
1003 </warning>
1004 </sect2>
1005 </sect1>
1008 <sect1 id="routine-reindex">
1009 <title>Routine Reindexing</title>
1011 <indexterm zone="routine-reindex">
1012 <primary>reindex</primary>
1013 </indexterm>
1015 <para>
1016 In some situations it is worthwhile to rebuild indexes periodically
1017 with the <xref linkend="sql-reindex"/> command or a series of individual
1018 rebuilding steps.
1020 </para>
1022 <para>
1023 B-tree index pages that have become completely empty are reclaimed for
1024 re-use. However, there is still a possibility
1025 of inefficient use of space: if all but a few index keys on a page have
1026 been deleted, the page remains allocated. Therefore, a usage
1027 pattern in which most, but not all, keys in each range are eventually
1028 deleted will see poor use of space. For such usage patterns,
1029 periodic reindexing is recommended.
1030 </para>
1032 <para>
1033 The potential for bloat in non-B-tree indexes has not been well
1034 researched. It is a good idea to periodically monitor the index's physical
1035 size when using any non-B-tree index type.
1036 </para>
1038 <para>
1039 Also, for B-tree indexes, a freshly-constructed index is slightly faster to
1040 access than one that has been updated many times because logically
1041 adjacent pages are usually also physically adjacent in a newly built index.
1042 (This consideration does not apply to non-B-tree indexes.) It
1043 might be worthwhile to reindex periodically just to improve access speed.
1044 </para>
1046 <para>
1047 <xref linkend="sql-reindex"/> can be used safely and easily in all cases.
1048 This command requires an <literal>ACCESS EXCLUSIVE</literal> lock by
1049 default, hence it is often preferable to execute it with its
1050 <literal>CONCURRENTLY</literal> option, which requires only a
1051 <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
1052 </para>
1053 </sect1>
1056 <sect1 id="logfile-maintenance">
1057 <title>Log File Maintenance</title>
1059 <indexterm zone="logfile-maintenance">
1060 <primary>server log</primary>
1061 <secondary>log file maintenance</secondary>
1062 </indexterm>
1064 <para>
1065 It is a good idea to save the database server's log output
1066 somewhere, rather than just discarding it via <filename>/dev/null</filename>.
1067 The log output is invaluable when diagnosing
1068 problems.
1069 </para>
1071 <note>
1072 <para>
1073 The server log can contain sensitive information and needs to be protected,
1074 no matter how or where it is stored, or the destination to which it is routed.
1075 For example, some DDL statements might contain plaintext passwords or other
1076 authentication details. Logged statements at the <literal>ERROR</literal>
1077 level might show the SQL source code for applications
1078 and might also contain some parts of data rows. Recording data, events and
1079 related information is the intended function of this facility, so this is
1080 not a leakage or a bug. Please ensure the server logs are visible only to
1081 appropriately authorized people.
1082 </para>
1083 </note>
1085 <para>
1086 Log output tends to be voluminous
1087 (especially at higher debug levels) so you won't want to save it
1088 indefinitely. You need to <emphasis>rotate</emphasis> the log files so that
1089 new log files are started and old ones removed after a reasonable
1090 period of time.
1091 </para>
1093 <para>
1094 If you simply direct the <systemitem>stderr</systemitem> of
1095 <command>postgres</command> into a
1096 file, you will have log output, but
1097 the only way to truncate the log file is to stop and restart
1098 the server. This might be acceptable if you are using
1099 <productname>PostgreSQL</productname> in a development environment,
1100 but few production servers would find this behavior acceptable.
1101 </para>
1103 <para>
1104 A better approach is to send the server's
1105 <systemitem>stderr</systemitem> output to some type of log rotation program.
1106 There is a built-in log rotation facility, which you can use by
1107 setting the configuration parameter <varname>logging_collector</varname> to
1108 <literal>true</literal> in <filename>postgresql.conf</filename>. The control
1109 parameters for this program are described in <xref
1110 linkend="runtime-config-logging-where"/>. You can also use this approach
1111 to capture the log data in machine readable <acronym>CSV</acronym>
1112 (comma-separated values) format.
1113 </para>
1115 <para>
1116 Alternatively, you might prefer to use an external log rotation
1117 program if you have one that you are already using with other
1118 server software. For example, the <application>rotatelogs</application>
1119 tool included in the <productname>Apache</productname> distribution
1120 can be used with <productname>PostgreSQL</productname>. One way to
1121 do this is to pipe the server's
1122 <systemitem>stderr</systemitem> output to the desired program.
1123 If you start the server with
1124 <command>pg_ctl</command>, then <systemitem>stderr</systemitem>
1125 is already redirected to <systemitem>stdout</systemitem>, so you just need a
1126 pipe command, for example:
1128 <programlisting>
1129 pg_ctl start | rotatelogs /var/log/pgsql_log 86400
1130 </programlisting>
1131 </para>
1133 <para>
1134 You can combine these approaches by setting up <application>logrotate</application>
1135 to collect log files produced by <productname>PostgreSQL</productname> built-in
1136 logging collector. In this case, the logging collector defines the names and
1137 location of the log files, while <application>logrotate</application>
1138 periodically archives these files. When initiating log rotation,
1139 <application>logrotate</application> must ensure that the application
1140 sends further output to the new file. This is commonly done with a
1141 <literal>postrotate</literal> script that sends a <literal>SIGHUP</literal>
1142 signal to the application, which then reopens the log file.
1143 In <productname>PostgreSQL</productname>, you can run <command>pg_ctl</command>
1144 with the <literal>logrotate</literal> option instead. When the server receives
1145 this command, the server either switches to a new log file or reopens the
1146 existing file, depending on the logging configuration
1147 (see <xref linkend="runtime-config-logging-where"/>).
1148 </para>
1150 <note>
1151 <para>
1152 When using static log file names, the server might fail to reopen the log
1153 file if the max open file limit is reached or a file table overflow occurs.
1154 In this case, log messages are sent to the old log file until a
1155 successful log rotation. If <application>logrotate</application> is
1156 configured to compress the log file and delete it, the server may lose
1157 the messages logged in this time frame. To avoid this issue, you can
1158 configure the logging collector to dynamically assign log file names
1159 and use a <literal>prerotate</literal> script to ignore open log files.
1160 </para>
1161 </note>
1163 <para>
1164 Another production-grade approach to managing log output is to
1165 send it to <application>syslog</application> and let
1166 <application>syslog</application> deal with file rotation. To do this, set the
1167 configuration parameter <varname>log_destination</varname> to <literal>syslog</literal>
1168 (to log to <application>syslog</application> only) in
1169 <filename>postgresql.conf</filename>. Then you can send a <literal>SIGHUP</literal>
1170 signal to the <application>syslog</application> daemon whenever you want to force it
1171 to start writing a new log file. If you want to automate log
1172 rotation, the <application>logrotate</application> program can be
1173 configured to work with log files from
1174 <application>syslog</application>.
1175 </para>
1177 <para>
1178 On many systems, however, <application>syslog</application> is not very reliable,
1179 particularly with large log messages; it might truncate or drop messages
1180 just when you need them the most. Also, on <productname>Linux</productname>,
1181 <application>syslog</application> will flush each message to disk, yielding poor
1182 performance. (You can use a <quote><literal>-</literal></quote> at the start of the file name
1183 in the <application>syslog</application> configuration file to disable syncing.)
1184 </para>
1186 <para>
1187 Note that all the solutions described above take care of starting new
1188 log files at configurable intervals, but they do not handle deletion
1189 of old, no-longer-useful log files. You will probably want to set
1190 up a batch job to periodically delete old log files. Another possibility
1191 is to configure the rotation program so that old log files are overwritten
1192 cyclically.
1193 </para>
1195 <para>
1196 <ulink url="https://pgbadger.darold.net/"><productname>pgBadger</productname></ulink>
1197 is an external project that does sophisticated log file analysis.
1198 <ulink
1199 url="https://bucardo.org/check_postgres/"><productname>check_postgres</productname></ulink>
1200 provides Nagios alerts when important messages appear in the log
1201 files, as well as detection of many other extraordinary conditions.
1202 </para>
1203 </sect1>
1204 </chapter>