1 <!-- doc/src/sgml/backup.sgml -->
3 <chapter id="backup">
4 <title>Backup and Restore</title>
6 <indexterm zone="backup"><primary>backup</primary></indexterm>
8 <para>
9 As with everything that contains valuable data, <productname>PostgreSQL</productname>
10 databases should be backed up regularly. While the procedure is
11 essentially simple, it is important to have a clear understanding of
12 the underlying techniques and assumptions.
13 </para>
15 <para>
16 There are three fundamentally different approaches to backing up
17 <productname>PostgreSQL</productname> data:
18 <itemizedlist>
19 <listitem><para><acronym>SQL</acronym> dump</para></listitem>
20 <listitem><para>File system level backup</para></listitem>
21 <listitem><para>Continuous archiving</para></listitem>
22 </itemizedlist>
23 Each has its own strengths and weaknesses; each is discussed in turn
24 in the following sections.
25 </para>
27 <sect1 id="backup-dump">
28 <title><acronym>SQL</acronym> Dump</title>
30 <para>
31 The idea behind this dump method is to generate a file with SQL
32 commands that, when fed back to the server, will recreate the
33 database in the same state as it was at the time of the dump.
34 <productname>PostgreSQL</productname> provides the utility program
35 <xref linkend="app-pgdump"/> for this purpose. The basic usage of this
36 command is:
37 <synopsis>
38 pg_dump <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">dumpfile</replaceable>
39 </synopsis>
40 As you see, <application>pg_dump</application> writes its result to the
41 standard output. We will see below how this can be useful.
42 While the above command creates a text file, <application>pg_dump</application>
43 can create files in other formats that allow for parallelism and more
44 fine-grained control of object restoration.
45 </para>
47 <para>
48 <application>pg_dump</application> is a regular <productname>PostgreSQL</productname>
49 client application (albeit a particularly clever one). This means
50 that you can perform this backup procedure from any remote host that has
51 access to the database. But remember that <application>pg_dump</application>
52 does not operate with special permissions. In particular, it must
53 have read access to all tables that you want to back up, so in order
54 to back up the entire database you almost always have to run it as a
55 database superuser. (If you do not have sufficient privileges to back up
56 the entire database, you can still back up portions of the database to which
57 you do have access using options such as
58 <option>-n <replaceable>schema</replaceable></option>
59 or <option>-t <replaceable>table</replaceable></option>.)
60 </para>
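 <para>
  For example, a user without superuser rights could dump just the objects
  they have access to; the schema, table, and output file names below are
  placeholders only:
<programlisting>
pg_dump -n <replaceable class="parameter">schema</replaceable> <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">schema_dump</replaceable>
pg_dump -t <replaceable class="parameter">table</replaceable> <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">table_dump</replaceable>
</programlisting>
 </para>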
62 <para>
63 To specify which database server <application>pg_dump</application> should
64 contact, use the command line options <option>-h
65 <replaceable>host</replaceable></option> and <option>-p <replaceable>port</replaceable></option>. The
66 default host is the local host or whatever your
67 <envar>PGHOST</envar> environment variable specifies. Similarly,
68 the default port is indicated by the <envar>PGPORT</envar>
69 environment variable or, failing that, by the compiled-in default.
70 (Conveniently, the server will normally have the same compiled-in
71 default.)
72 </para>
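 <para>
  For example, to dump a database running on another host and a non-default
  port (the host, port, database, and file names shown are placeholders):
<programlisting>
pg_dump -h <replaceable class="parameter">host</replaceable> -p <replaceable class="parameter">port</replaceable> <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">dumpfile</replaceable>
</programlisting>
 </para>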
74 <para>
75 Like any other <productname>PostgreSQL</productname> client application,
76 <application>pg_dump</application> will by default connect with the database
77 user name that is equal to the current operating system user name. To override
78 this, either specify the <option>-U</option> option or set the
79 environment variable <envar>PGUSER</envar>. Remember that
80 <application>pg_dump</application> connections are subject to the normal
81 client authentication mechanisms (which are described in <xref
82 linkend="client-authentication"/>).
83 </para>
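 <para>
  For example, either of the following connects as the database user
  <replaceable>dbuser</replaceable> (a placeholder name) rather than as the
  current operating system user; the second form relies on a Bourne-style
  shell to set the environment variable for the command:
<programlisting>
pg_dump -U <replaceable class="parameter">dbuser</replaceable> <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">dumpfile</replaceable>
PGUSER=<replaceable class="parameter">dbuser</replaceable> pg_dump <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">dumpfile</replaceable>
</programlisting>
 </para>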
85 <para>
86 An important advantage of <application>pg_dump</application> over the other backup
87 methods described later is that <application>pg_dump</application>'s output can
88 generally be re-loaded into newer versions of <productname>PostgreSQL</productname>,
89 whereas file-level backups and continuous archiving are both extremely
90 server-version-specific. <application>pg_dump</application> is also the only method
91 that will work when transferring a database to a different machine
92 architecture, such as going from a 32-bit to a 64-bit server.
93 </para>
95 <para>
96 Dumps created by <application>pg_dump</application> are internally consistent,
97 meaning that the dump represents a snapshot of the database as of the time
98 <application>pg_dump</application> began running. <application>pg_dump</application> does not
99 block other operations on the database while it is working.
100 (Exceptions are those operations that need to operate with an
101 exclusive lock, such as most forms of <command>ALTER TABLE</command>.)
102 </para>
104 <sect2 id="backup-dump-restore">
105 <title>Restoring the Dump</title>
107 <para>
108 Text files created by <application>pg_dump</application> are intended to
109 be read in by the <application>psql</application> program. The
110 general command form to restore a dump is
111 <synopsis>
112 psql <replaceable class="parameter">dbname</replaceable> &lt; <replaceable class="parameter">dumpfile</replaceable>
113 </synopsis>
114 where <replaceable class="parameter">dumpfile</replaceable> is the
115 file output by the <application>pg_dump</application> command. The database <replaceable
116 class="parameter">dbname</replaceable> will not be created by this
117 command, so you must create it yourself from <literal>template0</literal>
118 before executing <application>psql</application> (e.g., with
119 <literal>createdb -T template0 <replaceable
120 class="parameter">dbname</replaceable></literal>). <application>psql</application>
121 supports options similar to <application>pg_dump</application> for specifying
122 the database server to connect to and the user name to use. See
123 the <xref linkend="app-psql"/> reference page for more information.
124 Non-text file dumps are restored using the <xref
125 linkend="app-pgrestore"/> utility.
126 </para>
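 <para>
  Putting this together, a typical restore of a text dump might look like the
  following (the database and file names are placeholders):
<programlisting>
createdb -T template0 <replaceable class="parameter">dbname</replaceable>
psql <replaceable class="parameter">dbname</replaceable> &lt; <replaceable class="parameter">dumpfile</replaceable>
</programlisting>
 </para>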
128 <para>
129 Before restoring an SQL dump, all the users who own objects or were
130 granted permissions on objects in the dumped database must already
131 exist. If they do not, the restore will fail to recreate the
132 objects with the original ownership and/or permissions.
133 (Sometimes this is what you want, but usually it is not.)
134 </para>
136 <para>
137 By default, the <application>psql</application> script will continue to
138 execute after an SQL error is encountered. You might wish to run
139 <application>psql</application> with
140 the <literal>ON_ERROR_STOP</literal> variable set to alter that
141 behavior and have <application>psql</application> exit with an
142 exit status of 3 if an SQL error occurs:
143 <programlisting>
144 psql --set ON_ERROR_STOP=on <replaceable>dbname</replaceable> &lt; <replaceable>dumpfile</replaceable>
145 </programlisting>
146 Either way, you will only have a partially restored database.
147 Alternatively, you can specify that the whole dump should be
148 restored as a single transaction, so the restore is either fully
149 completed or fully rolled back. This mode can be specified by
150 passing the <option>-1</option> or <option>--single-transaction</option>
151 command-line options to <application>psql</application>. When using this
152 mode, be aware that even a minor error can roll back a
153 restore that has already run for many hours. However, that might
154 still be preferable to manually cleaning up a complex database
155 after a partially restored dump.
156 </para>
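 <para>
  The two options can also be combined, so that the restore stops at the
  first error and everything done so far is rolled back, for example:
<programlisting>
psql --set ON_ERROR_STOP=on --single-transaction <replaceable>dbname</replaceable> &lt; <replaceable>dumpfile</replaceable>
</programlisting>
 </para>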
158 <para>
159 The ability of <application>pg_dump</application> and <application>psql</application> to
160 write to or read from pipes makes it possible to dump a database
161 directly from one server to another, for example:
162 <programlisting>
163 pg_dump -h <replaceable>host1</replaceable> <replaceable>dbname</replaceable> | psql -h <replaceable>host2</replaceable> <replaceable>dbname</replaceable>
164 </programlisting>
165 </para>
167 <important>
168 <para>
169 The dumps produced by <application>pg_dump</application> are relative to
170 <literal>template0</literal>. This means that any languages, procedures,
171 etc. added via <literal>template1</literal> will also be dumped by
172 <application>pg_dump</application>. As a result, when restoring, if you are
173 using a customized <literal>template1</literal>, you must create the
174 empty database from <literal>template0</literal>, as in the example
175 above.
176 </para>
177 </important>
179 <para>
180 After restoring a backup, it is wise to run <link
181 linkend="sql-analyze"><command>ANALYZE</command></link> on each
182 database so the query optimizer has useful statistics;
183 see <xref linkend="vacuum-for-statistics"/>
184 and <xref linkend="autovacuum"/> for more information.
185 For more advice on how to load large amounts of data
186 into <productname>PostgreSQL</productname> efficiently, refer to <xref
187 linkend="populate"/>.
188 </para>
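 <para>
  For example, after restoring into <replaceable>dbname</replaceable> you
  could simply run:
<programlisting>
psql -c "ANALYZE" <replaceable class="parameter">dbname</replaceable>
</programlisting>
  The <application>vacuumdb</application> program can be used for the same
  purpose, and can process several databases in one invocation.
 </para>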
189 </sect2>
191 <sect2 id="backup-dump-all">
192 <title>Using <application>pg_dumpall</application></title>
194 <para>
195 <application>pg_dump</application> dumps only a single database at a time,
196 and it does not dump information about roles or tablespaces
197 (because those are cluster-wide rather than per-database).
198 To support convenient dumping of the entire contents of a database
199 cluster, the <xref linkend="app-pg-dumpall"/> program is provided.
200 <application>pg_dumpall</application> backs up each database in a given
201 cluster, and also preserves cluster-wide data such as role and
202 tablespace definitions. The basic usage of this command is:
203 <synopsis>
204 pg_dumpall &gt; <replaceable>dumpfile</replaceable>
205 </synopsis>
206 The resulting dump can be restored with <application>psql</application>:
207 <synopsis>
208 psql -f <replaceable class="parameter">dumpfile</replaceable> postgres
209 </synopsis>
210 (Actually, you can specify any existing database name to start from,
211 but if you are loading into an empty cluster then <literal>postgres</literal>
212 should usually be used.) It is always necessary to have
213 database superuser access when restoring a <application>pg_dumpall</application>
214 dump, as that is required to restore the role and tablespace information.
215 If you use tablespaces, make sure that the tablespace paths in the
216 dump are appropriate for the new installation.
217 </para>
219 <para>
220 <application>pg_dumpall</application> works by emitting commands to re-create
221 roles, tablespaces, and empty databases, then invoking
222 <application>pg_dump</application> for each database. This means that while
223 each database will be internally consistent, the snapshots of
224 different databases are not synchronized.
225 </para>
227 <para>
228 Cluster-wide data can be dumped alone using the
229 <application>pg_dumpall</application> <option>--globals-only</option> option.
230 This is necessary to fully back up the cluster if you are running the
231 <application>pg_dump</application> command on individual databases.
232 </para>
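 <para>
  For example, the cluster-wide definitions could be saved and later restored
  (connecting to an existing database such as <literal>postgres</literal>)
  with commands along these lines; the file name is a placeholder:
<programlisting>
pg_dumpall --globals-only &gt; <replaceable class="parameter">globals.sql</replaceable>
psql -f <replaceable class="parameter">globals.sql</replaceable> postgres
</programlisting>
 </para>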
233 </sect2>
235 <sect2 id="backup-dump-large">
236 <title>Handling Large Databases</title>
238 <para>
239 Some operating systems have maximum file size limits that cause
240 problems when creating large <application>pg_dump</application> output files.
241 Fortunately, <application>pg_dump</application> can write to the standard
242 output, so you can use standard Unix tools to work around this
243 potential problem. There are several possible methods:
244 </para>
246 <formalpara>
247 <title>Use compressed dumps.</title>
248 <para>
249 You can use your favorite compression program, for example
250 <application>gzip</application>:
252 <programlisting>
253 pg_dump <replaceable class="parameter">dbname</replaceable> | gzip &gt; <replaceable class="parameter">filename</replaceable>.gz
254 </programlisting>
256 Reload with:
258 <programlisting>
259 gunzip -c <replaceable class="parameter">filename</replaceable>.gz | psql <replaceable class="parameter">dbname</replaceable>
260 </programlisting>
262 or:
264 <programlisting>
265 cat <replaceable class="parameter">filename</replaceable>.gz | gunzip | psql <replaceable class="parameter">dbname</replaceable>
266 </programlisting>
267 </para>
268 </formalpara>
270 <formalpara>
271 <title>Use <command>split</command>.</title>
272 <para>
273 The <command>split</command> command
274 allows you to split the output into smaller files that are
275 acceptable in size to the underlying file system. For example, to
276 make 2 gigabyte chunks:
278 <programlisting>
279 pg_dump <replaceable class="parameter">dbname</replaceable> | split -b 2G - <replaceable class="parameter">filename</replaceable>
280 </programlisting>
282 Reload with:
284 <programlisting>
285 cat <replaceable class="parameter">filename</replaceable>* | psql <replaceable class="parameter">dbname</replaceable>
286 </programlisting>
288 If using GNU <application>split</application>, it is possible to
289 use it and <application>gzip</application> together:
291 <programlisting>
292 pg_dump <replaceable class="parameter">dbname</replaceable> | split -b 2G --filter='gzip > $FILE.gz'
293 </programlisting>
295 It can be restored using <command>zcat</command>.
296 </para>
297 </formalpara>
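 <para>
  With the invocation shown above, GNU <application>split</application> uses
  its default output prefix of <literal>x</literal>, so (assuming no unrelated
  files match the pattern) a reload might look like this:
<programlisting>
zcat x*.gz | psql <replaceable class="parameter">dbname</replaceable>
</programlisting>
 </para>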
299 <formalpara>
300 <title>Use <application>pg_dump</application>'s custom dump format.</title>
301 <para>
302 If <productname>PostgreSQL</productname> was built on a system with the
303 <application>zlib</application> compression library installed, the custom dump
304 format will compress data as it writes it to the output file. This will
305 produce dump file sizes similar to using <command>gzip</command>, but it
306 has the added advantage that tables can be restored selectively. The
307 following command dumps a database using the custom dump format:
309 <programlisting>
310 pg_dump -Fc <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">filename</replaceable>
311 </programlisting>
313 A custom-format dump is not a script for <application>psql</application>, but
314 instead must be restored with <application>pg_restore</application>, for example:
316 <programlisting>
317 pg_restore -d <replaceable class="parameter">dbname</replaceable> <replaceable class="parameter">filename</replaceable>
318 </programlisting>
320 See the <xref linkend="app-pgdump"/> and <xref
321 linkend="app-pgrestore"/> reference pages for details.
322 </para>
323 </formalpara>
325 <para>
326 For very large databases, you might need to combine <command>split</command>
327 with one of the other two approaches.
328 </para>
330 <formalpara>
331 <title>Use <application>pg_dump</application>'s parallel dump feature.</title>
332 <para>
333 To speed up the dump of a large database, you can use
334 <application>pg_dump</application>'s parallel mode. This will dump
335 multiple tables at the same time. You can control the degree of
336 parallelism with the <command>-j</command> parameter. Parallel dumps
337 are only supported for the "directory" archive format.
339 <programlisting>
340 pg_dump -j <replaceable class="parameter">num</replaceable> -F d -f <replaceable class="parameter">out.dir</replaceable> <replaceable class="parameter">dbname</replaceable>
341 </programlisting>
343 You can use <command>pg_restore -j</command> to restore a dump in parallel.
344 This will work for any archive of either the "custom" or the "directory"
345 archive mode, whether or not it has been created with <command>pg_dump -j</command>.
346 </para>
347 </formalpara>
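 <para>
  For example, the directory-format dump created above could be restored into
  an existing (for instance, freshly created) database with:
<programlisting>
pg_restore -j <replaceable class="parameter">num</replaceable> -d <replaceable class="parameter">dbname</replaceable> <replaceable class="parameter">out.dir</replaceable>
</programlisting>
 </para>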
348 </sect2>
349 </sect1>
351 <sect1 id="backup-file">
352 <title>File System Level Backup</title>
354 <para>
355 An alternative backup strategy is to directly copy the files that
356 <productname>PostgreSQL</productname> uses to store the data in the database;
357 <xref linkend="creating-cluster"/> explains where these files
358 are located. You can use whatever method you prefer
359 for doing file system backups; for example:
361 <programlisting>
362 tar -cf backup.tar /usr/local/pgsql/data
363 </programlisting>
364 </para>
366 <para>
367 There are two restrictions, however, which make this method
368 impractical, or at least inferior to the <application>pg_dump</application>
369 method:
371 <orderedlist>
372 <listitem>
373 <para>
374 The database server <emphasis>must</emphasis> be shut down in order to
375 get a usable backup. Half-way measures such as disallowing all
376 connections will <emphasis>not</emphasis> work
377 (in part because <command>tar</command> and similar tools do not take
378 an atomic snapshot of the state of the file system,
379 but also because of internal buffering within the server).
380 Information about stopping the server can be found in
381 <xref linkend="server-shutdown"/>. Needless to say, you
382 also need to shut down the server before restoring the data.
383 </para>
384 </listitem>
386 <listitem>
387 <para>
388 If you have dug into the details of the file system layout of the
389 database, you might be tempted to try to back up or restore only certain
390 individual tables or databases from their respective files or
391 directories. This will <emphasis>not</emphasis> work because the
392 information contained in these files is not usable without
393 the commit log files,
394 <filename>pg_xact/*</filename>, which contain the commit status of
395 all transactions. A table file is only usable with this
396 information. Of course it is also impossible to restore only a
397 table and the associated <filename>pg_xact</filename> data
398 because that would render all other tables in the database
399 cluster useless. So file system backups only work for complete
400 backup and restoration of an entire database cluster.
401 </para>
402 </listitem>
403 </orderedlist>
404 </para>
406 <para>
407 An alternative file-system backup approach is to make a
408 <quote>consistent snapshot</quote> of the data directory, if the
409 file system supports that functionality (and you are willing to
410 trust that it is implemented correctly). The typical procedure is
411 to make a <quote>frozen snapshot</quote> of the volume containing the
412 database, then copy the whole data directory (not just parts, see
413 above) from the snapshot to a backup device, then release the frozen
414 snapshot. This will work even while the database server is running.
415 However, a backup created in this way saves
416 the database files in a state as if the database server was not
417 properly shut down; therefore, when you start the database server
418 on the backed-up data, it will think the previous server instance
419 crashed and will replay the WAL log. This is not a problem; just
420 be aware of it (and be sure to include the WAL files in your backup).
421 You can perform a <command>CHECKPOINT</command> before taking the
422 snapshot to reduce recovery time.
423 </para>
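 <para>
  As a rough illustration only, on a Linux system using
  <application>LVM</application> the procedure might look something like the
  following; the volume group, volume and directory names are placeholders,
  and the details (including any file-system-specific mount options) depend
  entirely on your storage stack:
<programlisting>
lvcreate --snapshot --size 10G --name pgsnap /dev/vg0/pgdata   # freeze
mount /dev/vg0/pgsnap /mnt/pgsnap
tar -cf /backup/pgdata-snapshot.tar -C /mnt/pgsnap .           # copy everything
umount /mnt/pgsnap
lvremove -f /dev/vg0/pgsnap                                    # release
</programlisting>
 </para>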
425 <para>
426 If your database is spread across multiple file systems, there might not
427 be any way to obtain exactly-simultaneous frozen snapshots of all
428 the volumes. For example, if your data files and WAL log are on different
429 disks, or if tablespaces are on different file systems, it might
430 not be possible to use snapshot backup because the snapshots
431 <emphasis>must</emphasis> be simultaneous.
432 Read your file system documentation very carefully before trusting
433 the consistent-snapshot technique in such situations.
434 </para>
436 <para>
437 If simultaneous snapshots are not possible, one option is to shut down
438 the database server long enough to establish all the frozen snapshots.
439 Another option is to perform a continuous archiving base backup (<xref
440 linkend="backup-base-backup"/>) because such backups are immune to file
441 system changes during the backup. This requires enabling continuous
442 archiving just during the backup process; restore is done using
443 continuous archive recovery (<xref linkend="backup-pitr-recovery"/>).
444 </para>
446 <para>
447 Another option is to use <application>rsync</application> to perform a file
448 system backup. This is done by first running <application>rsync</application>
449 while the database server is running, then shutting down the database
450 server long enough to do an <command>rsync --checksum</command>.
451 (<option>--checksum</option> is necessary because <command>rsync</command> only
452 has file modification-time granularity of one second.) The
453 second <application>rsync</application> will be quicker than the first,
454 because it has relatively little data to transfer, and the end result
455 will be consistent because the server was down. This method
456 allows a file system backup to be performed with minimal downtime.
457 </para>
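 <para>
  A sketch of this approach, with placeholder paths and using
  <application>pg_ctl</application> to stop and start the server:
<programlisting>
rsync -a /usr/local/pgsql/data/ /backup/pgsql/data/              # first pass, server running
pg_ctl stop -D /usr/local/pgsql/data
rsync -a --checksum /usr/local/pgsql/data/ /backup/pgsql/data/   # second pass, server stopped
pg_ctl start -D /usr/local/pgsql/data
</programlisting>
 </para>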
459 <para>
460 Note that a file system backup will typically be larger
461 than an SQL dump. (<application>pg_dump</application> does not need to dump
462 the contents of indexes for example, just the commands to recreate
463 them.) However, taking a file system backup might be faster.
464 </para>
465 </sect1>
467 <sect1 id="continuous-archiving">
468 <title>Continuous Archiving and Point-in-Time Recovery (PITR)</title>
470 <indexterm zone="backup">
471 <primary>continuous archiving</primary>
472 </indexterm>
474 <indexterm zone="backup">
475 <primary>point-in-time recovery</primary>
476 </indexterm>
478 <indexterm zone="backup">
479 <primary>PITR</primary>
480 </indexterm>
482 <para>
483 At all times, <productname>PostgreSQL</productname> maintains a
484 <firstterm>write ahead log</firstterm> (WAL) in the <filename>pg_wal/</filename>
485 subdirectory of the cluster's data directory. The log records
486 every change made to the database's data files. This log exists
487 primarily for crash-safety purposes: if the system crashes, the
488 database can be restored to consistency by <quote>replaying</quote> the
489 log entries made since the last checkpoint. However, the existence
490 of the log makes it possible to use a third strategy for backing up
491 databases: we can combine a file-system-level backup with backup of
492 the WAL files. If recovery is needed, we restore the file system backup and
493 then replay from the backed-up WAL files to bring the system to a
494 current state. This approach is more complex to administer than
495 either of the previous approaches, but it has some significant
496 benefits:
497 <itemizedlist>
498 <listitem>
499 <para>
500 We do not need a perfectly consistent file system backup as the starting point.
501 Any internal inconsistency in the backup will be corrected by log
502 replay (this is not significantly different from what happens during
503 crash recovery). So we do not need a file system snapshot capability,
504 just <application>tar</application> or a similar archiving tool.
505 </para>
506 </listitem>
507 <listitem>
508 <para>
509 Since we can combine an indefinitely long sequence of WAL files
510 for replay, continuous backup can be achieved simply by continuing to archive
511 the WAL files. This is particularly valuable for large databases, where
512 it might not be convenient to take a full backup frequently.
513 </para>
514 </listitem>
515 <listitem>
516 <para>
517 It is not necessary to replay the WAL entries all the
518 way to the end. We could stop the replay at any point and have a
519 consistent snapshot of the database as it was at that time. Thus,
520 this technique supports <firstterm>point-in-time recovery</firstterm>: it is
521 possible to restore the database to its state at any time since your base
522 backup was taken.
523 </para>
524 </listitem>
525 <listitem>
526 <para>
527 If we continuously feed the series of WAL files to another
528 machine that has been loaded with the same base backup file, we
529 have a <firstterm>warm standby</firstterm> system: at any point we can bring up
530 the second machine and it will have a nearly-current copy of the
531 database.
532 </para>
533 </listitem>
534 </itemizedlist>
535 </para>
537 <note>
538 <para>
539 <application>pg_dump</application> and
540 <application>pg_dumpall</application> do not produce file-system-level
541 backups and cannot be used as part of a continuous-archiving solution.
542 Such dumps are <emphasis>logical</emphasis> and do not contain enough
543 information to be used by WAL replay.
544 </para>
545 </note>
547 <para>
548 As with the plain file-system-backup technique, this method can only
549 support restoration of an entire database cluster, not a subset.
550 Also, it requires a lot of archival storage: the base backup might be bulky,
551 and a busy system will generate many megabytes of WAL traffic that
552 have to be archived. Still, it is the preferred backup technique in
553 many situations where high reliability is needed.
554 </para>
556 <para>
557 To recover successfully using continuous archiving (also called
558 <quote>online backup</quote> by many database vendors), you need a continuous
559 sequence of archived WAL files that extends back at least as far as the
560 start time of your backup. So to get started, you should set up and test
561 your procedure for archiving WAL files <emphasis>before</emphasis> you take your
562 first base backup. Accordingly, we first discuss the mechanics of
563 archiving WAL files.
564 </para>
566 <sect2 id="backup-archiving-wal">
567 <title>Setting Up WAL Archiving</title>
569 <para>
570 In an abstract sense, a running <productname>PostgreSQL</productname> system
571 produces an indefinitely long sequence of WAL records. The system
572 physically divides this sequence into WAL <firstterm>segment
573 files</firstterm>, which are normally 16MB apiece (although the segment size
574 can be altered during <application>initdb</application>). The segment
575 files are given numeric names that reflect their position in the
576 abstract WAL sequence. When not using WAL archiving, the system
577 normally creates just a few segment files and then
578 <quote>recycles</quote> them by renaming no-longer-needed segment files
579 to higher segment numbers. It's assumed that segment files whose
580 contents precede the last checkpoint are no longer of
581 interest and can be recycled.
582 </para>
584 <para>
585 When archiving WAL data, we need to capture the contents of each segment
586 file once it is filled, and save that data somewhere before the segment
587 file is recycled for reuse. Depending on the application and the
588 available hardware, there could be many different ways of <quote>saving
589 the data somewhere</quote>: we could copy the segment files to an NFS-mounted
590 directory on another machine, write them onto a tape drive (ensuring that
591 you have a way of identifying the original name of each file), or batch
592 them together and burn them onto CDs, or something else entirely. To
593 provide the database administrator with flexibility,
594 <productname>PostgreSQL</productname> tries not to make any assumptions about how
595 the archiving will be done. Instead, <productname>PostgreSQL</productname> lets
596 the administrator specify a shell command or an archive library to be executed to copy a
597 completed segment file to wherever it needs to go. This could be as simple
598 as a shell command that uses <literal>cp</literal>, or it could invoke a
599 complex C function &mdash; it's all up to you.
600 </para>
602 <para>
603 To enable WAL archiving, set the <xref linkend="guc-wal-level"/>
604 configuration parameter to <literal>replica</literal> or higher,
605 <xref linkend="guc-archive-mode"/> to <literal>on</literal>,
606 specify the shell command to use in the <xref
607 linkend="guc-archive-command"/> configuration parameter
608 or specify the library to use in the <xref
609 linkend="guc-archive-library"/> configuration parameter. In practice
610 these settings will always be placed in the
611 <filename>postgresql.conf</filename> file.
612 </para>
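 <para>
  A minimal <filename>postgresql.conf</filename> fragment enabling shell-based
  archiving might therefore look like this (the archive directory is the same
  placeholder used in the examples below, and must exist and be writable by
  the server):
<programlisting>
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /mnt/server/archivedir/%f &amp;&amp; cp %p /mnt/server/archivedir/%f'
</programlisting>
  To use an archive module instead, leave <varname>archive_command</varname>
  unset and set <varname>archive_library</varname> to the module's name.
 </para>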
614 <para>
615 In <varname>archive_command</varname>,
616 <literal>%p</literal> is replaced by the path name of the file to
617 archive, while <literal>%f</literal> is replaced by only the file name.
618 (The path name is relative to the current working directory,
619 i.e., the cluster's data directory.)
620 Use <literal>%%</literal> if you need to embed an actual <literal>%</literal>
621 character in the command. The simplest useful command is something
622 like:
623 <programlisting>
624 archive_command = 'test ! -f /mnt/server/archivedir/%f &amp;&amp; cp %p /mnt/server/archivedir/%f' # Unix
625 archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
626 </programlisting>
627 which will copy archivable WAL segments to the directory
628 <filename>/mnt/server/archivedir</filename>. (This is an example, not a
629 recommendation, and might not work on all platforms.) After the
630 <literal>%p</literal> and <literal>%f</literal> parameters have been replaced,
631 the actual command executed might look like this:
632 <programlisting>
633 test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065
634 </programlisting>
635 A similar command will be generated for each new file to be archived.
636 </para>
638 <para>
639 The archive command will be executed under the ownership of the same
640 user that the <productname>PostgreSQL</productname> server is running as. Since
641 the series of WAL files being archived contains effectively everything
642 in your database, you will want to be sure that the archived data is
643 protected from prying eyes; for example, archive into a directory that
644 does not have group or world read access.
645 </para>
647 <para>
648 It is important that the archive command return zero exit status if and
649 only if it succeeds. Upon getting a zero result,
650 <productname>PostgreSQL</productname> will assume that the file has been
651 successfully archived, and will remove or recycle it. However, a nonzero
652 status tells <productname>PostgreSQL</productname> that the file was not archived;
653 it will try again periodically until it succeeds.
654 </para>
656 <para>
657 Another way to archive is to use a custom archive module as the
658 <varname>archive_library</varname>. Since such modules are written in
659 <literal>C</literal>, creating your own may require considerably more effort
660 than writing a shell command. However, archive modules can be more
661 performant than archiving via shell, and they will have access to many
662 useful server resources. For more information about archive modules, see
663 <xref linkend="archive-modules"/>.
664 </para>
666 <para>
667 When the archive command is terminated by a signal (other than
668 <systemitem>SIGTERM</systemitem>, which is used as part of a server
669 shutdown) or an error by the shell with an exit status greater than
670 125 (such as command not found), or if the archive function emits an
671 <literal>ERROR</literal> or <literal>FATAL</literal>, the archiver process
672 aborts and gets restarted by the postmaster. In such cases, the failure is
673 not reported in <xref linkend="pg-stat-archiver-view"/>.
674 </para>
676 <para>
677 Archive commands and libraries should generally be designed to refuse to overwrite
678 any pre-existing archive file. This is an important safety feature to
679 preserve the integrity of your archive in case of administrator error
680 (such as sending the output of two different servers to the same archive
681 directory). It is advisable to test your proposed archive library to ensure
682 that it does not overwrite an existing file.
683 </para>
685 <para>
686 In rare cases, <productname>PostgreSQL</productname> may attempt to
687 re-archive a WAL file that was previously archived. For example, if the
688 system crashes before the server makes a durable record of archival
689 success, the server will attempt to archive the file again after
690 restarting (provided archiving is still enabled). When an archive command or library
691 encounters a pre-existing file, it should return a zero status or <literal>true</literal>, respectively,
692 if the WAL file has identical contents to the pre-existing archive and the
693 pre-existing archive is fully persisted to storage. If a pre-existing
694 file contains different contents than the WAL file being archived, the
695 archive command or library <emphasis>must</emphasis> return a nonzero status or
696 <literal>false</literal>, respectively.
697 </para>
699 <para>
700 The example command above for Unix avoids overwriting a pre-existing archive
701 by including a separate
702 <command>test</command> step. On some Unix platforms, <command>cp</command> has
703 switches such as <option>-i</option> that can be used to do the same thing
704 less verbosely, but you should not rely on these without verifying that
705 the right exit status is returned. (In particular, GNU <command>cp</command>
706 will return status zero when <option>-i</option> is used and the target file
707 already exists, which is <emphasis>not</emphasis> the desired behavior.)
708 </para>
710 <para>
711 While designing your archiving setup, consider what will happen if
712 the archive command or library fails repeatedly because some aspect requires
713 operator intervention or the archive runs out of space. For example, this
714 could occur if you write to tape without an autochanger; when the tape
715 fills, nothing further can be archived until the tape is swapped.
716 You should ensure that any error condition or request to a human operator
717 is reported appropriately so that the situation can be
718 resolved reasonably quickly. The <filename>pg_wal/</filename> directory will
719 continue to fill with WAL segment files until the situation is resolved.
720 (If the file system containing <filename>pg_wal/</filename> fills up,
721 <productname>PostgreSQL</productname> will do a PANIC shutdown. No committed
722 transactions will be lost, but the database will remain offline until
723 you free some space.)
724 </para>
726 <para>
727 The speed of the archive command or library is unimportant as long as it can keep up
728 with the average rate at which your server generates WAL data. Normal
729 operation continues even if the archiving process falls a little behind.
730 If archiving falls significantly behind, this will increase the amount of
731 data that would be lost in the event of a disaster. It will also mean that
732 the <filename>pg_wal/</filename> directory will contain large numbers of
733 not-yet-archived segment files, which could eventually exceed available
734 disk space. You are advised to monitor the archiving process to ensure that
735 it is working as you intend.
736 </para>
738 <para>
739 In writing your archive command or library, you should assume that the file names to
740 be archived can be up to 64 characters long and can contain any
741 combination of ASCII letters, digits, and dots. It is not necessary to
742 preserve the original relative path (<literal>%p</literal>) but it is necessary to
743 preserve the file name (<literal>%f</literal>).
744 </para>
746 <para>
747 Note that although WAL archiving will allow you to restore any
748 modifications made to the data in your <productname>PostgreSQL</productname> database,
749 it will not restore changes made to configuration files (that is,
750 <filename>postgresql.conf</filename>, <filename>pg_hba.conf</filename> and
751 <filename>pg_ident.conf</filename>), since those are edited manually rather
752 than through SQL operations.
753 You might wish to keep the configuration files in a location that will
754 be backed up by your regular file system backup procedures. See
755 <xref linkend="runtime-config-file-locations"/> for how to relocate the
756 configuration files.
757 </para>
759 <para>
760 The archive command or function is only invoked on completed WAL segments. Hence,
761 if your server generates little WAL traffic (or has slack periods
762 during which it does so), there could be a long delay between the completion
763 of a transaction and its safe recording in archive storage. To put
764 a limit on how old unarchived data can be, you can set
765 <xref linkend="guc-archive-timeout"/> to force the server to switch
766 to a new WAL segment file at least that often. Note that archived
767 files that are archived early due to a forced switch are still the same
768 length as completely full files. It is therefore unwise to set a very
769 short <varname>archive_timeout</varname> &mdash; it will bloat your archive
770 storage. <varname>archive_timeout</varname> settings of a minute or so are
771 usually reasonable.
772 </para>
774 <para>
775 Also, you can force a segment switch manually with
776 <function>pg_switch_wal</function> if you want to ensure that a
777 just-finished transaction is archived as soon as possible. Other utility
778 functions related to WAL management are listed in <xref
779 linkend="functions-admin-backup-table"/>.
780 </para>
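 <para>
  For example, from <application>psql</application>:
<programlisting>
SELECT pg_switch_wal();
</programlisting>
 </para>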
782 <para>
783 When <varname>wal_level</varname> is <literal>minimal</literal> some SQL commands
784 are optimized to avoid WAL logging, as described in <xref
785 linkend="populate-pitr"/>. If archiving or streaming replication were
786 turned on during execution of one of these statements, WAL would not
787 contain enough information for archive recovery. (Crash recovery is
788 unaffected.) For this reason, <varname>wal_level</varname> can only be changed at
789 server start. However, <varname>archive_command</varname> and <varname>archive_library</varname> can be changed with a
790 configuration file reload. If you are archiving via shell and wish to
791 temporarily stop archiving,
792 one way to do it is to set <varname>archive_command</varname> to the empty
793 string (<literal>''</literal>).
794 This will cause WAL files to accumulate in <filename>pg_wal/</filename> until a
795 working <varname>archive_command</varname> is re-established.
796 </para>
797 </sect2>
799 <sect2 id="backup-base-backup">
800 <title>Making a Base Backup</title>
802 <para>
803 The easiest way to perform a base backup is to use the
804 <xref linkend="app-pgbasebackup"/> tool. It can create
805 a base backup either as regular files or as a tar archive. If more
806 flexibility than <xref linkend="app-pgbasebackup"/> can provide is
807 required, you can also make a base backup using the low level API
808 (see <xref linkend="backup-lowlevel-base-backup"/>).
809 </para>
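 <para>
  For example, a compressed tar-format base backup could be written to a
  placeholder directory like this:
<programlisting>
pg_basebackup -D /backup/base -Ft -z -P
</programlisting>
  Here <option>-D</option> selects the output directory, <option>-Ft</option>
  the tar output format, <option>-z</option> gzip compression, and
  <option>-P</option> progress reporting; see <xref linkend="app-pgbasebackup"/>
  for the full list of options.
 </para>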
811 <para>
812 It is not necessary to be concerned about the amount of time it takes
813 to make a base backup. However, if you normally run the
814 server with <varname>full_page_writes</varname> disabled, you might notice a drop
815 in performance while the backup runs since <varname>full_page_writes</varname> is
816 effectively forced on during backup mode.
817 </para>
819 <para>
820 To make use of the backup, you will need to keep all the WAL
821 segment files generated during and after the file system backup.
822 To aid you in doing this, the base backup process
823 creates a <firstterm>backup history file</firstterm> that is immediately
824 stored into the WAL archive area. This file is named after the first
825 WAL segment file that you need for the file system backup.
826 For example, if the starting WAL file is
827 <literal>0000000100001234000055CD</literal> the backup history file will be
828 named something like
829 <literal>0000000100001234000055CD.007C9330.backup</literal>. (The second
830 part of the file name stands for an exact position within the WAL
831 file, and can ordinarily be ignored.) Once you have safely archived
832 the file system backup and the WAL segment files used during the
833 backup (as specified in the backup history file), all archived WAL
834 segments with names numerically less are no longer needed to recover
835 the file system backup and can be deleted. However, you should
836 consider keeping several backup sets to be absolutely certain that
837 you can recover your data.
838 </para>
840 <para>
841 The backup history file is just a small text file. It contains the
842 label string you gave to <xref linkend="app-pgbasebackup"/>, as well as
843 the starting and ending times and WAL segments of the backup.
844 If you used the label to identify the associated dump file,
845 then the archived history file is enough to tell you which dump file to
846 restore.
847 </para>
849 <para>
850 Since you have to keep around all the archived WAL files back to your
851 last base backup, the interval between base backups should usually be
852 chosen based on how much storage you want to expend on archived WAL
853 files. You should also consider how long you are prepared to spend
854 recovering, if recovery should be necessary &mdash; the system will have to
855 replay all those WAL segments, and that could take a while if it has
856 been a long time since the last base backup.
857 </para>
858 </sect2>
860 <sect2 id="backup-incremental-backup">
861 <title>Making an Incremental Backup</title>
863 <para>
864 You can use <xref linkend="app-pgbasebackup"/> to take an incremental
865 backup by specifying the <literal>--incremental</literal> option. You must
866 supply, as an argument to <literal>--incremental</literal>, the backup
867 manifest to an earlier backup from the same server. In the resulting
868 backup, non-relation files will be included in their entirety, but some
869 relation files may be replaced by smaller incremental files which contain
870 only the blocks which have been changed since the earlier backup and enough
871 metadata to reconstruct the current version of the file.
872 </para>
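 <para>
  For example, assuming an earlier full backup was taken in plain format into
  <filename>/backup/full</filename> (so that its manifest is available at
  <filename>/backup/full/backup_manifest</filename>), an incremental backup
  could be taken with:
<programlisting>
pg_basebackup --incremental=/backup/full/backup_manifest -D /backup/incr1
</programlisting>
 </para>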
874 <para>
875 To figure out which blocks need to be backed up, the server uses WAL
876 summaries, which are stored in the data directory, inside the directory
877 <literal>pg_wal/summaries</literal>. If the required summary files are not
878 present, an attempt to take an incremental backup will fail. The summaries
879 present in this directory must cover all LSNs from the start LSN of the
880 prior backup to the start LSN of the current backup. Since the server looks
881 for WAL summaries just after establishing the start LSN of the current
882 backup, the necessary summary files probably won't be instantly present
883 on disk, but the server will wait for any missing files to show up.
884 This also helps if the WAL summarization process has fallen behind.
885 However, if the necessary files have already been removed, or if the WAL
886 summarizer doesn't catch up quickly enough, the incremental backup will
887 fail.
888 </para>
890 <para>
891 When restoring an incremental backup, it will be necessary to have not
892 only the incremental backup itself but also all earlier backups that
893 are required to supply the blocks omitted from the incremental backup.
894 See <xref linkend="app-pgcombinebackup"/> for further information about
895 this requirement. Note that there are restrictions on the use of
896 <literal>pg_combinebackup</literal> when the checksum status of the
897 cluster has been changed; see
898 <link linkend="app-pgcombinebackup-limitations">pg_combinebackup
899 limitations</link>.
900 </para>
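 <para>
  Continuing the example above, a synthetic full backup could later be
  reconstructed like this; the directory names are placeholders, and the
  backups are listed from oldest to newest:
<programlisting>
pg_combinebackup -o /restore/data /backup/full /backup/incr1
</programlisting>
 </para>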
902 <para>
903 Note that all of the requirements for making use of a full backup also
904 apply to an incremental backup. For instance, you still need all of the
905 WAL segment files generated during and after the file system backup, and
906 any relevant WAL history files. And you still need to create a
907 <literal>recovery.signal</literal> (or <literal>standby.signal</literal>)
908 and perform recovery, as described in
909 <xref linkend="backup-pitr-recovery" />. The requirement to have earlier
910 backups available at restore time and to use
911 <literal>pg_combinebackup</literal> is an additional requirement on top of
912 everything else. Keep in mind that <application>PostgreSQL</application>
913 has no built-in mechanism to figure out which backups are still needed as
914 a basis for restoring later incremental backups. You must keep track of
915 the relationships between your full and incremental backups on your own,
916 and be certain not to remove earlier backups if they might be needed when
917 restoring later incremental backups.
918 </para>
920 <para>
921 Incremental backups typically only make sense for relatively large
922 databases where a significant portion of the data does not change, or only
923 changes slowly. For a small database, it's easier to ignore the existence
924 of incremental backups and simply take full backups, which are simpler
925 to manage. For a large database all of which is heavily modified,
926 incremental backups won't be much smaller than full backups.
927 </para>
929 <para>
930 An incremental backup is only possible if replay would begin from a later
931 checkpoint than for the previous backup upon which it depends. If you
932 take the incremental backup on the primary, this condition is always
933 satisfied, because each backup triggers a new checkpoint. On a standby,
934 replay begins from the most recent restartpoint. Therefore, an
935 incremental backup of a standby server can fail if there has been very
936 little activity since the previous backup, since no new restartpoint might
937 have been created.
938 </para>
939 </sect2>
941 <sect2 id="backup-lowlevel-base-backup">
942 <title>Making a Base Backup Using the Low Level API</title>
943 <para>
944 Instead of taking a full or incremental base backup using
945 <xref linkend="app-pgbasebackup"/>, you can take a base backup using the
946 low-level API. This procedure contains a few more steps than
947 the <application>pg_basebackup</application> method, but is relatively
948 simple. It is very important that these steps are executed in
949 sequence, and that the success of a step is verified before
950 proceeding to the next step.
951 </para>
952 <para>
953 Multiple backups can be run concurrently (both those
954 started using this backup API and those started using
955 <xref linkend="app-pgbasebackup"/>).
956 </para>
957 <para>
958 <orderedlist>
959 <listitem>
960 <para>
961 Ensure that WAL archiving is enabled and working.
962 </para>
963 </listitem>
964 <listitem>
965 <para>
966 Connect to the server (it does not matter which database) as a user with
967 rights to run <function>pg_backup_start</function> (superuser,
968 or a user who has been granted <literal>EXECUTE</literal> on the
969 function) and issue the command:
970 <programlisting>
971 SELECT pg_backup_start(label => 'label', fast => false);
972 </programlisting>
973 where <literal>label</literal> is any string you want to use to uniquely
974 identify this backup operation. The connection
975 calling <function>pg_backup_start</function> must be maintained until the end of
976 the backup, or the backup will be automatically aborted.
977 </para>
979 <para>
980 Online backups are always started at the beginning of a checkpoint.
981 By default, <function>pg_backup_start</function> will wait for the next
982 regularly scheduled checkpoint to complete, which may take a long time (see the
983 configuration parameters <xref linkend="guc-checkpoint-timeout"/> and
984 <xref linkend="guc-checkpoint-completion-target"/>). This is
985 usually preferable as it minimizes the impact on the running system. If you
986 want to start the backup as soon as possible, pass <literal>true</literal> as
987 the second parameter to <function>pg_backup_start</function> and it will
988 request an immediate checkpoint, which will finish as fast as possible using
989 as much I/O as possible.
990 </para>
992 </listitem>
993 <listitem>
994 <para>
995 Perform the backup, using any convenient file-system-backup tool
996 such as <application>tar</application> or <application>cpio</application> (not
997 <application>pg_dump</application> or
998 <application>pg_dumpall</application>). It is neither
999 necessary nor desirable to stop normal operation of the database
1000 while you do this. See
1001 <xref linkend="backup-lowlevel-base-backup-data"/> for things to
1002 consider during this backup.
1003 </para>
1004 </listitem>
1005 <listitem>
1006 <para>
1007 In the same connection as before, issue the command:
1008 <programlisting>
1009 SELECT * FROM pg_backup_stop(wait_for_archive => true);
1010 </programlisting>
1011 This terminates backup mode. On a primary, it also performs an automatic
1012 switch to the next WAL segment. On a standby, it is not possible to
1013 automatically switch WAL segments, so you may wish to run
1014 <function>pg_switch_wal</function> on the primary to perform a manual
1015 switch. The reason for the switch is to arrange for
1016 the last WAL segment file written during the backup interval to be
1017 ready to archive.
1018 </para>
1019 <para>
1020 <function>pg_backup_stop</function> will return one row with three
1021 values. The second of these fields should be written to a file named
1022 <filename>backup_label</filename> in the root directory of the backup. The
1023 third field should be written to a file named
1024 <filename>tablespace_map</filename> unless the field is empty. These files are
1025 vital to the backup working and must be written byte for byte without
1026 modification, which may require opening the file in binary mode.
1027 </para>
1028 </listitem>
1029 <listitem>
1030 <para>
1031 Once the WAL segment files active during the backup are archived, you are
1032 done. The file identified by <function>pg_backup_stop</function>'s first return
1033 value is the last segment that is required to form a complete set of
1034 backup files. On a primary, if <varname>archive_mode</varname> is enabled and the
1035 <literal>wait_for_archive</literal> parameter is <literal>true</literal>,
1036 <function>pg_backup_stop</function> does not return until the last segment has
1037 been archived.
1038 On a standby, <varname>archive_mode</varname> must be <literal>always</literal> in order
1039 for <function>pg_backup_stop</function> to wait.
1040 Archiving of these files happens automatically since you have
1041 already configured <varname>archive_command</varname> or <varname>archive_library</varname>.
1042 In most cases this happens quickly, but you are advised to monitor your
1043 archive system to ensure there are no delays.
1044 If the archive process has fallen behind because of failures of the
1045 archive command or library, it will keep retrying
1046 until the archive succeeds and the backup is complete.
1047 If you wish to place a time limit on the execution of
1048 <function>pg_backup_stop</function>, set an appropriate
1049 <varname>statement_timeout</varname> value, but make note that if
1050 <function>pg_backup_stop</function> terminates because of this your backup
1051 may not be valid.
1052 </para>
1053 <para>
1054 If the backup process monitors and ensures that all WAL segment files
1055 required for the backup are successfully archived then the
1056 <literal>wait_for_archive</literal> parameter (which defaults to true) can be set
1057 to false to have
1058 <function>pg_backup_stop</function> return as soon as the stop backup record is
1059 written to the WAL. By default, <function>pg_backup_stop</function> will wait
1060 until all WAL has been archived, which can take some time. This option
1061 must be used with caution: if WAL archiving is not monitored correctly
1062 then the backup might not include all of the WAL files and will
1063 therefore be incomplete and impossible to restore.
1064 </para>
1065 </listitem>
1066 </orderedlist>
1067 </para>
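  <para>
   As a condensed illustration of the steps above, a low-level backup session
   might look like this; the label, copy step, and file handling shown in the
   comments are placeholders, and the copy must be performed (for example,
   from another terminal) while the <application>psql</application> session
   below stays connected:
<programlisting>
-- keep this psql session open for the whole backup
SELECT pg_backup_start(label => 'weekly_base', fast => false);
-- ... copy the data directory with tar, rsync, etc. ...
SELECT * FROM pg_backup_stop(wait_for_archive => true);
-- write the labelfile column to a file named backup_label, and the
-- spcmapfile column (if not empty) to tablespace_map, inside the backup
</programlisting>
  </para>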
1068 <sect3 id="backup-lowlevel-base-backup-data">
1069 <title>Backing Up the Data Directory</title>
1070 <para>
1071 Some file system backup tools emit warnings or errors
1072 if the files they are trying to copy change while the copy proceeds.
1073 When taking a base backup of an active database, this situation is normal
1074 and not an error. However, you need to ensure that you can distinguish
1075 complaints of this sort from real errors. For example, some versions
1076 of <application>rsync</application> return a separate exit code for
1077 <quote>vanished source files</quote>, and you can write a driver script to
1078 accept this exit code as a non-error case. Also, some versions of
1079 GNU <application>tar</application> return an error code indistinguishable from
1080 a fatal error if a file was truncated while <application>tar</application> was
1081 copying it. Fortunately, GNU <application>tar</application> versions 1.16 and
1082 later exit with 1 if a file was changed during the backup,
1083 and 2 for other errors. With GNU <application>tar</application> version 1.23 and
1084 later, you can use the warning options <literal>--warning=no-file-changed
1085 --warning=no-file-removed</literal> to hide the related warning messages.
1086 </para>
1088 <para>
1089 Be certain that your backup includes all of the files under
1090 the database cluster directory (e.g., <filename>/usr/local/pgsql/data</filename>).
1091 If you are using tablespaces that do not reside underneath this directory,
1092 be careful to include them as well (and be sure that your backup
1093 archives symbolic links as links, otherwise the restore will corrupt
1094 your tablespaces).
1095 </para>
1097 <para>
1098 You should, however, omit from the backup the files within the
1099 cluster's <filename>pg_wal/</filename> subdirectory. This
1100 slight adjustment is worthwhile because it reduces the risk
1101 of mistakes when restoring. This is easy to arrange if
1102 <filename>pg_wal/</filename> is a symbolic link pointing to someplace outside
1103 the cluster directory, which is a common setup anyway for performance
1104 reasons. You might also want to exclude <filename>postmaster.pid</filename>
1105 and <filename>postmaster.opts</filename>, which record information
1106 about the running <application>postmaster</application>, not about the
1107 <application>postmaster</application> which will eventually use this backup.
1108 (These files can confuse <application>pg_ctl</application>.)
1109 </para>
1111 <para>
1112 It is often a good idea to also omit from the backup the files
1113 within the cluster's <filename>pg_replslot/</filename> directory, so that
1114 replication slots that exist on the primary do not become part of the
1115 backup. Otherwise, the subsequent use of the backup to create a standby
1116 may result in indefinite retention of WAL files on the standby, and
1117 possibly bloat on the primary if hot standby feedback is enabled, because
1118 the clients that are using those replication slots will still be connecting
1119 to and updating the slots on the primary, not the standby. Even if the
1120 backup is only intended for use in creating a new primary, copying the
1121 replication slots isn't expected to be particularly useful, since the
1122 contents of those slots will likely be badly out of date by the time
1123 the new primary comes on line.
1124 </para>
1126 <para>
1127 The contents of the directories <filename>pg_dynshmem/</filename>,
1128 <filename>pg_notify/</filename>, <filename>pg_serial/</filename>,
1129 <filename>pg_snapshots/</filename>, <filename>pg_stat_tmp/</filename>,
1130 and <filename>pg_subtrans/</filename> (but not the directories themselves) can be
1131 omitted from the backup as they will be initialized on postmaster startup.
1132 </para>
1134 <para>
1135 Any file or directory beginning with <filename>pgsql_tmp</filename> can be
1136 omitted from the backup. These files are removed on postmaster start and
1137 the directories will be recreated as needed.
1138 </para>
1140 <para>
1141 <filename>pg_internal.init</filename> files can be omitted from the
1142 backup whenever a file of that name is found. These files contain
1143 relation cache data that is always rebuilt when recovering.
1144 </para>
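   <para>
    Putting these exclusions together, a much-simplified sketch of copying the
    data directory with GNU <application>tar</application> might look like
    this; the paths are placeholders, tablespaces outside the data directory
    are not handled, and the optional exclusions mentioned above are left out
    for brevity:
<programlisting>
tar -cf /backup/base.tar \
    --exclude='pg_wal/*' \
    --exclude='pg_replslot/*' \
    --exclude=postmaster.pid \
    --exclude=postmaster.opts \
    /usr/local/pgsql/data
</programlisting>
   </para>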
1146 <para>
1147 The backup label
1148 file includes the label string you gave to <function>pg_backup_start</function>,
1149 as well as the time at which <function>pg_backup_start</function> was run, and
1150 the name of the starting WAL file. In case of confusion it is therefore
1151 possible to look inside a backup file and determine exactly which
1152 backup session the dump file came from. The tablespace map file includes
1153 the symbolic link names as they exist in the directory
1154 <filename>pg_tblspc/</filename> and the full path of each symbolic link.
1155 These files are not merely for your information; their presence and
1156 contents are critical to the proper operation of the system's recovery
1157 process.
1158 </para>
1160 <para>
1161 It is also possible to make a backup while the server is
1162 stopped. In this case, you obviously cannot use
1163 <function>pg_backup_start</function> or <function>pg_backup_stop</function>, and
1164 you will therefore be left to your own devices to keep track of which
1165 backup is which and how far back the associated WAL files go.
1166 It is generally better to follow the continuous archiving procedure above.
1167 </para>
1168 </sect3>
1169 </sect2>
1171 <sect2 id="backup-pitr-recovery">
1172 <title>Recovering Using a Continuous Archive Backup</title>
1174 <para>
1175 Okay, the worst has happened and you need to recover from your backup.
1176 Here is the procedure:
1177 <orderedlist>
1178 <listitem>
1179 <para>
1180 Stop the server, if it's running.
1181 </para>
1182 </listitem>
1183 <listitem>
1184 <para>
1185 If you have the space to do so,
1186 copy the whole cluster data directory and any tablespaces to a temporary
1187 location in case you need them later. Note that this precaution will
1188 require that you have enough free space on your system to hold two
1189 copies of your existing database. If you do not have enough space,
1190 you should at least save the contents of the cluster's <filename>pg_wal</filename>
1191 subdirectory, as it might contain WAL files which
1192 were not archived before the system went down.
1193 </para>
1194 </listitem>
1195 <listitem>
1196 <para>
1197 Remove all existing files and subdirectories under the cluster data
1198 directory and under the root directories of any tablespaces you are using.
1199 </para>
1200 </listitem>
1201 <listitem>
1202 <para>
1203 If you're restoring a full backup, you can restore the database files
1204 directly into the target directories. Be sure that they
1205 are restored with the right ownership (the database system user, not
1206 <literal>root</literal>!) and with the right permissions. If you are using
1207 tablespaces,
1208 you should verify that the symbolic links in <filename>pg_tblspc/</filename>
1209 were correctly restored.
1210 </para>
1211 </listitem>
1212 <listitem>
1213 <para>
1214 If you're restoring an incremental backup, you'll need to restore the
1215 incremental backup and all earlier backups upon which it directly or
1216 indirectly depends to the machine where you are performing the restore.
1217 These backups will need to be placed in separate directories, not the
1218 target directories where you want the running server to end up.
1219 Once this is done, use <xref linkend="app-pgcombinebackup"/> to pull
1220 data from the full backup and all of the subsequent incremental backups
1221 and write out a synthetic full backup to the target directories (see the sketch
1222 after this list). As above, verify that permissions and tablespace links are correct.
1223 </para>
1224 </listitem>
1225 <listitem>
1226 <para>
1227 Remove any files present in <filename>pg_wal/</filename>; these came from the
1228 file system backup and are therefore probably obsolete rather than current.
1229 If you didn't archive <filename>pg_wal/</filename> at all, then recreate
1230 it with proper permissions,
1231 being careful to ensure that you re-establish it as a symbolic link
1232 if you had it set up that way before.
1233 </para>
1234 </listitem>
1235 <listitem>
1236 <para>
1237 If you have unarchived WAL segment files that you saved in step 2,
1238 copy them into <filename>pg_wal/</filename>. (It is best to copy them,
1239 not move them, so you still have the unmodified files if a
1240 problem occurs and you have to start over.)
1241 </para>
1242 </listitem>
1243 <listitem>
1244 <para>
1245 Set recovery configuration settings in
1246 <filename>postgresql.conf</filename> (see <xref
1247 linkend="runtime-config-wal-archive-recovery"/>) and create a file
1248 <filename>recovery.signal</filename> in the cluster
1249 data directory. You might
1250 also want to temporarily modify <filename>pg_hba.conf</filename> to prevent
1251 ordinary users from connecting until you are sure the recovery was successful.
1252 </para>
1253 </listitem>
1254 <listitem>
1255 <para>
1256 Start the server. The server will go into recovery mode and
1257 proceed to read through the archived WAL files it needs. Should the
1258 recovery be terminated because of an external error, the server can
1259 simply be restarted and it will continue recovery. Upon completion
1260 of the recovery process, the server will remove
1261 <filename>recovery.signal</filename> (to prevent
1262 accidentally re-entering recovery mode later) and then
1263 commence normal database operations.
1264 </para>
1265 </listitem>
1266 <listitem>
1267 <para>
1268 Inspect the contents of the database to ensure you have recovered to
1269 the desired state. If not, return to step 1. If all is well,
1270 allow your users to connect by restoring <filename>pg_hba.conf</filename> to normal.
1271 </para>
1272 </listitem>
1273 </orderedlist>
1274 </para>
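  <para>
   As an illustration of steps 5 and 8 above, suppose that a full backup
   and one incremental backup were saved under
   <filename>/mnt/server/backups/</filename> and that the target data
   directory is <filename>/usr/local/pgsql/data</filename>.  These names
   are only examples, but the commands might look roughly like this:
<programlisting>
# Combine the backup chain (oldest first) into a synthetic full backup
# written to the target data directory.
pg_combinebackup -o /usr/local/pgsql/data \
    /mnt/server/backups/full /mnt/server/backups/incr1

# Tell the server to perform archive recovery when it is next started.
touch /usr/local/pgsql/data/recovery.signal
</programlisting>
   The recovery settings themselves, beginning with
   <varname>restore_command</varname>, go into
   <filename>postgresql.conf</filename> as described below.
  </para>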
1276 <para>
1277 The key part of all this is to set up a recovery configuration that
1278 describes how you want to recover and how far the recovery should
1279 run. The one thing that you absolutely must specify is the <varname>restore_command</varname>,
1280 which tells <productname>PostgreSQL</productname> how to retrieve archived
1281 WAL file segments. Like the <varname>archive_command</varname>, this is
1282 a shell command string. It can contain <literal>%f</literal>, which is
1283 replaced by the name of the desired WAL file, and <literal>%p</literal>,
1284 which is replaced by the path name to copy the WAL file to.
1285 (The path name is relative to the current working directory,
1286 i.e., the cluster's data directory.)
1287 Write <literal>%%</literal> if you need to embed an actual <literal>%</literal>
1288 character in the command. The simplest useful command is
1289 something like:
1290 <programlisting>
1291 restore_command = 'cp /mnt/server/archivedir/%f %p'
1292 </programlisting>
1293 which will copy previously archived WAL segments from the directory
1294 <filename>/mnt/server/archivedir</filename>. Of course, you can use something
1295 much more complicated, perhaps even a shell script that requests the
1296 operator to mount an appropriate tape.
1297 </para>
1299 <para>
1300 It is important that the command return nonzero exit status on failure.
1301 The command <emphasis>will</emphasis> be called requesting files that are not
1302 present in the archive; it must return nonzero when so asked. This is not
1303 an error condition. An exception is that if the command was terminated by
1304 a signal (other than <systemitem>SIGTERM</systemitem>, which is used as
1305 part of a database server shutdown) or an error by the shell (such as
1306 command not found), then recovery will abort and the server will not start up.
1308 </para>
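  <para>
   If your <varname>restore_command</varname> is a script rather than a
   simple <command>cp</command>, take care to preserve these exit-status
   semantics.  A minimal sketch, using a hypothetical script installed as
   <filename>/usr/local/bin/restore_wal.sh</filename> and configured as
   <literal>restore_command = '/usr/local/bin/restore_wal.sh %f %p'</literal>:
<programlisting>
#!/bin/sh
# Hypothetical restore script; called with the WAL file name and the
# destination path, in that order.
archivedir=/mnt/server/archivedir

# A missing file is expected at the end of recovery; report it with a
# nonzero exit status rather than treating it as a hard error.
test -f "$archivedir/$1" || exit 1
exec cp "$archivedir/$1" "$2"
</programlisting>
  </para>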
1310 <para>
1311 Not all of the requested files will be WAL segment
1312 files; you should also expect requests for files with a suffix of
1313 <literal>.history</literal>. Also be aware that
1314 the base name of the <literal>%p</literal> path will be different from
1315 <literal>%f</literal>; do not expect them to be interchangeable.
1316 </para>
1318 <para>
1319 WAL segments that cannot be found in the archive will be sought in
1320 <filename>pg_wal/</filename>; this allows use of recent un-archived segments.
1321 However, segments that are available from the archive will be used in
1322 preference to files in <filename>pg_wal/</filename>.
1323 </para>
1325 <para>
1326 Normally, recovery will proceed through all available WAL segments,
1327 thereby restoring the database to the current point in time (or as
1328 close as possible given the available WAL segments). Therefore, a normal
1329 recovery will end with a <quote>file not found</quote> message, the exact text
1330 of the error message depending upon your choice of
1331 <varname>restore_command</varname>. You may also see an error message
1332 at the start of recovery for a file named something like
1333 <filename>00000001.history</filename>. This is also normal and does not
1334 indicate a problem in simple recovery situations; see
1335 <xref linkend="backup-timelines"/> for discussion.
1336 </para>
1338 <para>
1339 If you want to recover to some previous point in time (say, right before
1340 the junior DBA dropped your main transaction table), just specify the
1341 required <link linkend="runtime-config-wal-recovery-target">stopping point</link>. You can specify
1342 the stop point, known as the <quote>recovery target</quote>, either by
1343 date/time, by named restore point, or by completion of a specific transaction
1344 ID. As of this writing only the date/time and named restore point options
1345 are very usable, since there are no tools to help you identify with any
1346 accuracy which transaction ID to use.
1347 </para>
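  <para>
   For the scenario just described, the relevant settings in
   <filename>postgresql.conf</filename> might look like this (the
   timestamp and restore point name are, of course, only examples):
<programlisting>
restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2024-05-14 17:14:00'
# or, if a restore point had been created earlier with pg_create_restore_point():
# recovery_target_name = 'before_maintenance'
</programlisting>
   Only one recovery target can be specified at a time.
  </para>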
1349 <note>
1350 <para>
1351 The stop point must be after the ending time of the base backup, i.e.,
1352 the end time of <function>pg_backup_stop</function>. You cannot use a base backup
1353 to recover to a time when that backup was in progress. (To
1354 recover to such a time, you must go back to your previous base backup
1355 and roll forward from there.)
1356 </para>
1357 </note>
1359 <para>
1360 If recovery finds corrupted WAL data, recovery will
1361 halt at that point and the server will not start. In such a case the
1362 recovery process could be re-run from the beginning, specifying a
1363 <quote>recovery target</quote> before the point of corruption so that recovery
1364 can complete normally.
1365 If recovery fails for an external reason, such as a system crash or
1366 if the WAL archive has become inaccessible, then the recovery can simply
1367 be restarted and it will restart almost from where it failed.
1368 Recovery restart works much like checkpointing in normal operation:
1369 the server periodically forces all its state to disk, and then updates
1370 the <filename>pg_control</filename> file to indicate that the already-processed
1371 WAL data need not be scanned again.
1372 </para>
1374 </sect2>
1376 <sect2 id="backup-timelines">
1377 <title>Timelines</title>
1379 <indexterm zone="backup">
1380 <primary>timelines</primary>
1381 </indexterm>
1383 <para>
1384 The ability to restore the database to a previous point in time creates
1385 some complexities that are akin to science-fiction stories about time
1386 travel and parallel universes. For example, in the original history of the database,
1387 suppose you dropped a critical table at 5:15PM on Tuesday evening, but
1388 didn't realize your mistake until Wednesday noon.
1389 Unfazed, you get out your backup, restore to the point-in-time 5:14PM
1390 Tuesday evening, and are up and running. In <emphasis>this</emphasis> history of
1391 the database universe, you never dropped the table. But suppose
1392 you later realize this wasn't such a great idea, and would like
1393 to return to sometime Wednesday morning in the original history.
1394 You won't be able
1395 to if, while your database was up-and-running, it overwrote some of the
1396 WAL segment files that led up to the time you now wish you
1397 could get back to. Thus, to avoid this, you need to distinguish the series of
1398 WAL records generated after you've done a point-in-time recovery from
1399 those that were generated in the original database history.
1400 </para>
1402 <para>
1403 To deal with this problem, <productname>PostgreSQL</productname> has a notion
1404 of <firstterm>timelines</firstterm>. Whenever an archive recovery completes,
1405 a new timeline is created to identify the series of WAL records
1406 generated after that recovery. The timeline
1407 ID number is part of WAL segment file names so a new timeline does
1408 not overwrite the WAL data generated by previous timelines.
1409 For example, in the WAL file name
1410 <filename>0000000100001234000055CD</filename>, the leading
1411 <literal>00000001</literal> is the timeline ID in hexadecimal. (Note that
1412 in other contexts, such as server log messages, timeline IDs are
1413 usually printed in decimal.)
1414 </para>
1416 <para>
1417 It is
1418 in fact possible to archive many different timelines. While that might
1419 seem like a useless feature, it's often a lifesaver. Consider the
1420 situation where you aren't quite sure what point-in-time to recover to,
1421 and so have to do several point-in-time recoveries by trial and error
1422 until you find the best place to branch off from the old history. Without
1423 timelines this process would soon generate an unmanageable mess. With
1424 timelines, you can recover to <emphasis>any</emphasis> prior state, including
1425 states in timeline branches that you abandoned earlier.
1426 </para>
1428 <para>
1429 Every time a new timeline is created, <productname>PostgreSQL</productname> creates
1430 a <quote>timeline history</quote> file that shows which timeline it branched
1431 off from and when. These history files are necessary to allow the system
1432 to pick the right WAL segment files when recovering from an archive that
1433 contains multiple timelines. Therefore, they are archived into the WAL
1434 archive area just like WAL segment files. The history files are just
1435 small text files, so it's cheap and appropriate to keep them around
1436 indefinitely (unlike the segment files which are large). You can, if
1437 you like, add comments to a history file to record your own notes about
1438 how and why this particular timeline was created. Such comments will be
1439 especially valuable when you have a thicket of different timelines as
1440 a result of experimentation.
1441 </para>
1443 <para>
1444 The default behavior of recovery is to recover to the latest timeline found
1445 in the archive. If you wish to recover to the timeline that was current
1446 when the base backup was taken or into a specific child timeline (that
1447 is, you want to return to some state that was itself generated after a
1448 recovery attempt), you need to specify <literal>current</literal> or the
1449 target timeline ID in <xref linkend="guc-recovery-target-timeline"/>. You
1450 cannot recover into timelines that branched off earlier than the base backup.
1451 </para>
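  <para>
   For example, to recover into a particular timeline you might set
   (the timeline ID shown here is hypothetical):
<programlisting>
recovery_target_timeline = '3'         # a specific timeline ID
# recovery_target_timeline = 'current' # the timeline of the base backup
# recovery_target_timeline = 'latest'  # the default behavior
</programlisting>
  </para>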
1452 </sect2>
1454 <sect2 id="backup-tips">
1455 <title>Tips and Examples</title>
1457 <para>
1458 Some tips for configuring continuous archiving are given here.
1459 </para>
1461 <sect3 id="backup-standalone">
1462 <title>Standalone Hot Backups</title>
1464 <para>
1465 It is possible to use <productname>PostgreSQL</productname>'s backup facilities to
1466 produce standalone hot backups. These are backups that cannot be used
1467 for point-in-time recovery, yet are typically much faster to back up and
1468 restore than <application>pg_dump</application> dumps. (They are also much larger
1469 than <application>pg_dump</application> dumps, so in some cases the speed advantage
1470 might be negated.)
1471 </para>
1473 <para>
1474 As with base backups, the easiest way to produce a standalone
1475 hot backup is to use the <xref linkend="app-pgbasebackup"/>
1476 tool. If you include the <literal>-X</literal> parameter when calling
1477 it, all the write-ahead log required to use the backup will be
1478 included in the backup automatically, and no special action is
1479 required to restore the backup.
1480 </para>
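  <para>
   A minimal sketch of such an invocation (the destination directory is
   only an example) might be:
<programlisting>
pg_basebackup -D /mnt/server/backups/standalone -X stream -Ft -z
</programlisting>
   Here <option>-Ft</option> and <option>-z</option> produce compressed
   tar-format output, which is convenient for a standalone backup that
   will simply be stored and unpacked later.
  </para>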
1481 </sect3>
1483 <sect3 id="compressed-archive-logs">
1484 <title>Compressed Archive Logs</title>
1486 <para>
1487 If archive storage size is a concern, you can use
1488 <application>gzip</application> to compress the archive files:
1489 <programlisting>
1490 archive_command = 'gzip &lt; %p &gt; /mnt/server/archivedir/%f.gz'
1491 </programlisting>
1492 You will then need to use <application>gunzip</application> during recovery:
1493 <programlisting>
1494 restore_command = 'gunzip &lt; /mnt/server/archivedir/%f.gz &gt; %p'
1495 </programlisting>
1496 </para>
1497 </sect3>
1499 <sect3 id="backup-scripts">
1500 <title><varname>archive_command</varname> Scripts</title>
1502 <para>
1503 Many people choose to use scripts to define their
1504 <varname>archive_command</varname>, so that their
1505 <filename>postgresql.conf</filename> entry looks very simple:
1506 <programlisting>
1507 archive_command = 'local_backup_script.sh "%p" "%f"'
1508 </programlisting>
1509 Using a separate script file is advisable any time you want to use
1510 more than a single command in the archiving process.
1511 This allows all complexity to be managed within the script, which
1512 can be written in a popular scripting language such as
1513 <application>bash</application> or <application>perl</application>.
1514 </para>
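  <para>
   Such a script must still honor the usual requirements for
   <varname>archive_command</varname>: it should refuse to overwrite a
   pre-existing archive file, and it should return a nonzero exit status
   on failure so the server will retry the file later.  A minimal sketch
   (the script name and archive directory are hypothetical):
<programlisting>
#!/bin/sh
# Hypothetical local_backup_script.sh, invoked as:
#   local_backup_script.sh "%p" "%f"
p="$1"    # path of the WAL file to archive
f="$2"    # file name of the WAL file
archivedir=/mnt/server/archivedir

# Never overwrite an existing archive file; report failure so the
# server keeps the WAL segment and retries later.
test ! -f "$archivedir/$f" || exit 1
cp "$p" "$archivedir/$f"
</programlisting>
  </para>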
1516 <para>
1517 Examples of requirements that might be solved within a script include:
1518 <itemizedlist>
1519 <listitem>
1520 <para>
1521 Copying data to secure off-site data storage
1522 </para>
1523 </listitem>
1524 <listitem>
1525 <para>
1526 Batching WAL files so that they are transferred every three hours,
1527 rather than one at a time
1528 </para>
1529 </listitem>
1530 <listitem>
1531 <para>
1532 Interfacing with other backup and recovery software
1533 </para>
1534 </listitem>
1535 <listitem>
1536 <para>
1537 Interfacing with monitoring software to report errors
1538 </para>
1539 </listitem>
1540 </itemizedlist>
1541 </para>
1543 <tip>
1544 <para>
1545 When using an <varname>archive_command</varname> script, it's desirable
1546 to enable <xref linkend="guc-logging-collector"/>.
1547 Any messages written to <systemitem>stderr</systemitem> from the script will then
1548 appear in the database server log, allowing complex configurations to
1549 be diagnosed easily if they fail.
1550 </para>
1551 </tip>
1552 </sect3>
1553 </sect2>
1555 <sect2 id="continuous-archiving-caveats">
1556 <title>Caveats</title>
1558 <para>
1559 At this writing, there are several limitations of the continuous archiving
1560 technique. These will probably be fixed in future releases:
1562 <itemizedlist>
1563 <listitem>
1564 <para>
1565 If a <link linkend="sql-createdatabase"><command>CREATE DATABASE</command></link>
1566 command is executed while a base backup is being taken, and then
1567 the template database that the <command>CREATE DATABASE</command> copied
1568 is modified while the base backup is still in progress, it is
1569 possible that recovery will cause those modifications to be
1570 propagated into the created database as well. This is of course
1571 undesirable. To avoid this risk, it is best not to modify any
1572 template databases while taking a base backup.
1573 </para>
1574 </listitem>
1576 <listitem>
1577 <para>
1578 <link linkend="sql-createtablespace"><command>CREATE TABLESPACE</command></link>
1579 commands are WAL-logged with the literal absolute path, and will
1580 therefore be replayed as tablespace creations with the same
1581 absolute path. This might be undesirable if the WAL is being
1582 replayed on a different machine. It can be dangerous even if the
1583 WAL is being replayed on the same machine, but into a new data
1584 directory: the replay will still overwrite the contents of the
1585 original tablespace. To avoid potential gotchas of this sort,
1586 the best practice is to take a new base backup after creating or
1587 dropping tablespaces.
1588 </para>
1589 </listitem>
1590 </itemizedlist>
1591 </para>
1593 <para>
1594 It should also be noted that the default <acronym>WAL</acronym>
1595 format is fairly bulky since it includes many disk page snapshots.
1596 These page snapshots are designed to support crash recovery, since
1597 we might need to fix partially-written disk pages. Depending on
1598 your system hardware and software, the risk of partial writes might
1599 be small enough to ignore, in which case you can significantly
1600 reduce the total volume of archived WAL files by turning off page
1601 snapshots using the <xref linkend="guc-full-page-writes"/>
1602 parameter. (Read the notes and warnings in <xref linkend="wal"/>
1603 before you do so.) Turning off page snapshots does not prevent
1604 use of the WAL for PITR operations. An area for future
1605 development is to compress archived WAL data by removing
1606 unnecessary page copies even when <varname>full_page_writes</varname> is
1607 on. In the meantime, administrators might wish to reduce the number
1608 of page snapshots included in WAL by increasing the checkpoint
1609 interval parameters as much as feasible.
1610 </para>
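  <para>
   As a sketch of the two alternatives just described, with arbitrary
   example values rather than recommendations:
<programlisting>
# Either: disable page snapshots entirely.  Do this only if your storage
# rules out partial page writes; see the warnings referenced above.
full_page_writes = off

# Or: leave full_page_writes on but checkpoint less often, so that fewer
# page snapshots are written into WAL.
checkpoint_timeout = 30min
max_wal_size = 4GB
</programlisting>
  </para>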
1611 </sect2>
1612 </sect1>
1614 </chapter>