1 <!-- doc/src/sgml/backup.sgml -->
3 <chapter id="backup">
4 <title>Backup and Restore</title>
6 <indexterm zone="backup"><primary>backup</primary></indexterm>
8 <para>
9 As with everything that contains valuable data, <productname>PostgreSQL</productname>
10 databases should be backed up regularly. While the procedure is
11 essentially simple, it is important to have a clear understanding of
12 the underlying techniques and assumptions.
13 </para>
15 <para>
16 There are three fundamentally different approaches to backing up
17 <productname>PostgreSQL</productname> data:
18 <itemizedlist>
19 <listitem><para><acronym>SQL</acronym> dump</para></listitem>
20 <listitem><para>File system level backup</para></listitem>
21 <listitem><para>Continuous archiving</para></listitem>
22 </itemizedlist>
23 Each has its own strengths and weaknesses; each is discussed in turn
24 in the following sections.
25 </para>
27 <sect1 id="backup-dump">
28 <title><acronym>SQL</acronym> Dump</title>
30 <para>
31 The idea behind this dump method is to generate a file with SQL
32 commands that, when fed back to the server, will recreate the
33 database in the same state as it was at the time of the dump.
34 <productname>PostgreSQL</productname> provides the utility program
35 <xref linkend="app-pgdump"/> for this purpose. The basic usage of this
36 command is:
37 <synopsis>
38 pg_dump <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">dumpfile</replaceable>
39 </synopsis>
40 As you see, <application>pg_dump</application> writes its result to the
41 standard output. We will see below how this can be useful.
42 While the above command creates a text file, <application>pg_dump</application>
43 can create files in other formats that allow for parallelism and more
44 fine-grained control of object restoration.
45 </para>
47 <para>
48 <application>pg_dump</application> is a regular <productname>PostgreSQL</productname>
49 client application (albeit a particularly clever one). This means
50 that you can perform this backup procedure from any remote host that has
51 access to the database. But remember that <application>pg_dump</application>
52 does not operate with special permissions. In particular, it must
53 have read access to all tables that you want to back up, so in order
54 to back up the entire database you almost always have to run it as a
55 database superuser. (If you do not have sufficient privileges to back up
56 the entire database, you can still back up portions of the database to which
57 you do have access using options such as
58 <option>-n <replaceable>schema</replaceable></option>
59 or <option>-t <replaceable>table</replaceable></option>.)
60 </para>
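 <para>
  For example, a user without superuser rights could dump just the objects
  they have access to; the schema, table, and output file names below are
  placeholders only:
<programlisting>
pg_dump -n <replaceable class="parameter">schema</replaceable> <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">schema_dump</replaceable>
pg_dump -t <replaceable class="parameter">table</replaceable> <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">table_dump</replaceable>
</programlisting>
 </para>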
62 <para>
63 To specify which database server <application>pg_dump</application> should
64 contact, use the command line options <option>-h
65 <replaceable>host</replaceable></option> and <option>-p <replaceable>port</replaceable></option>. The
66 default host is the local host or whatever your
67 <envar>PGHOST</envar> environment variable specifies. Similarly,
68 the default port is indicated by the <envar>PGPORT</envar>
69 environment variable or, failing that, by the compiled-in default.
70 (Conveniently, the server will normally have the same compiled-in
71 default.)
72 </para>
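 <para>
  For example, to dump a database running on another host and a non-default
  port (the host, port, database, and file names shown are placeholders):
<programlisting>
pg_dump -h <replaceable class="parameter">host</replaceable> -p <replaceable class="parameter">port</replaceable> <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">dumpfile</replaceable>
</programlisting>
 </para>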
74 <para>
75 Like any other <productname>PostgreSQL</productname> client application,
76 <application>pg_dump</application> will by default connect with the database
77 user name that is equal to the current operating system user name. To override
78 this, either specify the <option>-U</option> option or set the
79 environment variable <envar>PGUSER</envar>. Remember that
80 <application>pg_dump</application> connections are subject to the normal
81 client authentication mechanisms (which are described in <xref
82 linkend="client-authentication"/>).
83 </para>
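 <para>
  For example, either of the following connects as the database user
  <replaceable>dbuser</replaceable> (a placeholder name) rather than as the
  current operating system user; the second form relies on a Bourne-style
  shell to set the environment variable for the command:
<programlisting>
pg_dump -U <replaceable class="parameter">dbuser</replaceable> <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">dumpfile</replaceable>
PGUSER=<replaceable class="parameter">dbuser</replaceable> pg_dump <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">dumpfile</replaceable>
</programlisting>
 </para>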
85 <para>
86 An important advantage of <application>pg_dump</application> over the other backup
87 methods described later is that <application>pg_dump</application>'s output can
88 generally be re-loaded into newer versions of <productname>PostgreSQL</productname>,
89 whereas file-level backups and continuous archiving are both extremely
90 server-version-specific. <application>pg_dump</application> is also the only method
91 that will work when transferring a database to a different machine
92 architecture, such as going from a 32-bit to a 64-bit server.
93 </para>
95 <para>
96 Dumps created by <application>pg_dump</application> are internally consistent,
97 meaning that the dump represents a snapshot of the database as of the time
98 <application>pg_dump</application> began running. <application>pg_dump</application> does not
99 block other operations on the database while it is working.
100 (Exceptions are those operations that need to operate with an
101 exclusive lock, such as most forms of <command>ALTER TABLE</command>.)
102 </para>
104 <sect2 id="backup-dump-restore">
105 <title>Restoring the Dump</title>
107 <para>
108 Text files created by <application>pg_dump</application> are intended to
109 be read in by the <application>psql</application> program. The
110 general command form to restore a dump is
111 <synopsis>
112 psql <replaceable class="parameter">dbname</replaceable> &lt; <replaceable class="parameter">dumpfile</replaceable>
113 </synopsis>
114 where <replaceable class="parameter">dumpfile</replaceable> is the
115 file output by the <application>pg_dump</application> command. The database <replaceable
116 class="parameter">dbname</replaceable> will not be created by this
117 command, so you must create it yourself from <literal>template0</literal>
118 before executing <application>psql</application> (e.g., with
119 <literal>createdb -T template0 <replaceable
120 class="parameter">dbname</replaceable></literal>). <application>psql</application>
121 supports options similar to <application>pg_dump</application> for specifying
122 the database server to connect to and the user name to use. See
123 the <xref linkend="app-psql"/> reference page for more information.
124 Non-text file dumps are restored using the <xref
125 linkend="app-pgrestore"/> utility.
126 </para>
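 <para>
  Putting this together, a typical restore of a text dump might look like the
  following (the database and file names are placeholders):
<programlisting>
createdb -T template0 <replaceable class="parameter">dbname</replaceable>
psql <replaceable class="parameter">dbname</replaceable> &lt; <replaceable class="parameter">dumpfile</replaceable>
</programlisting>
 </para>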
128 <para>
129 Before restoring an SQL dump, all the users who own objects or were
130 granted permissions on objects in the dumped database must already
131 exist. If they do not, the restore will fail to recreate the
132 objects with the original ownership and/or permissions.
133 (Sometimes this is what you want, but usually it is not.)
134 </para>
136 <para>
137 By default, the <application>psql</application> script will continue to
138 execute after an SQL error is encountered. You might wish to run
139 <application>psql</application> with
140 the <literal>ON_ERROR_STOP</literal> variable set to alter that
141 behavior and have <application>psql</application> exit with an
142 exit status of 3 if an SQL error occurs:
143 <programlisting>
144 psql --set ON_ERROR_STOP=on <replaceable>dbname</replaceable> &lt; <replaceable>dumpfile</replaceable>
145 </programlisting>
146 Either way, you will only have a partially restored database.
147 Alternatively, you can specify that the whole dump should be
148 restored as a single transaction, so the restore is either fully
149 completed or fully rolled back. This mode can be specified by
150 passing the <option>-1</option> or <option>--single-transaction</option>
151 command-line options to <application>psql</application>. When using this
152 mode, be aware that even a minor error can roll back a
153 restore that has already run for many hours. However, that might
154 still be preferable to manually cleaning up a complex database
155 after a partially restored dump.
156 </para>
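 <para>
  The two options can also be combined, so that the restore stops at the
  first error and everything done so far is rolled back, for example:
<programlisting>
psql --set ON_ERROR_STOP=on --single-transaction <replaceable>dbname</replaceable> &lt; <replaceable>dumpfile</replaceable>
</programlisting>
 </para>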
158 <para>
159 The ability of <application>pg_dump</application> and <application>psql</application> to
160 write to or read from pipes makes it possible to dump a database
161 directly from one server to another, for example:
162 <programlisting>
163 pg_dump -h <replaceable>host1</replaceable> <replaceable>dbname</replaceable> | psql -h <replaceable>host2</replaceable> <replaceable>dbname</replaceable>
164 </programlisting>
165 </para>
167 <important>
168 <para>
169 The dumps produced by <application>pg_dump</application> are relative to
170 <literal>template0</literal>. This means that any languages, procedures,
171 etc. added via <literal>template1</literal> will also be dumped by
172 <application>pg_dump</application>. As a result, when restoring, if you are
173 using a customized <literal>template1</literal>, you must create the
174 empty database from <literal>template0</literal>, as in the example
175 above.
176 </para>
177 </important>
179 <para>
180 After restoring a backup, it is wise to run <link
181 linkend="sql-analyze"><command>ANALYZE</command></link> on each
182 database so the query optimizer has useful statistics;
183 see <xref linkend="vacuum-for-statistics"/>
184 and <xref linkend="autovacuum"/> for more information.
185 For more advice on how to load large amounts of data
186 into <productname>PostgreSQL</productname> efficiently, refer to <xref
187 linkend="populate"/>.
188 </para>
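 <para>
  For example, after restoring into <replaceable>dbname</replaceable> you
  could simply run:
<programlisting>
psql -c "ANALYZE" <replaceable class="parameter">dbname</replaceable>
</programlisting>
  The <application>vacuumdb</application> program can be used for the same
  purpose, and can process several databases in one invocation.
 </para>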
189 </sect2>
191 <sect2 id="backup-dump-all">
192 <title>Using <application>pg_dumpall</application></title>
194 <para>
195 <application>pg_dump</application> dumps only a single database at a time,
196 and it does not dump information about roles or tablespaces
197 (because those are cluster-wide rather than per-database).
198 To support convenient dumping of the entire contents of a database
199 cluster, the <xref linkend="app-pg-dumpall"/> program is provided.
200 <application>pg_dumpall</application> backs up each database in a given
201 cluster, and also preserves cluster-wide data such as role and
202 tablespace definitions. The basic usage of this command is:
203 <synopsis>
204 pg_dumpall &gt; <replaceable>dumpfile</replaceable>
205 </synopsis>
206 The resulting dump can be restored with <application>psql</application>:
207 <synopsis>
208 psql -f <replaceable class="parameter">dumpfile</replaceable> postgres
209 </synopsis>
210 (Actually, you can specify any existing database name to start from,
211 but if you are loading into an empty cluster then <literal>postgres</literal>
212 should usually be used.) It is always necessary to have
213 database superuser access when restoring a <application>pg_dumpall</application>
214 dump, as that is required to restore the role and tablespace information.
215 If you use tablespaces, make sure that the tablespace paths in the
216 dump are appropriate for the new installation.
217 </para>
219 <para>
220 <application>pg_dumpall</application> works by emitting commands to re-create
221 roles, tablespaces, and empty databases, then invoking
222 <application>pg_dump</application> for each database. This means that while
223 each database will be internally consistent, the snapshots of
224 different databases are not synchronized.
225 </para>
227 <para>
228 Cluster-wide data can be dumped alone using the
229 <application>pg_dumpall</application> <option>--globals-only</option> option.
230 This is necessary to fully back up the cluster if you are running the
231 <application>pg_dump</application> command on individual databases.
232 </para>
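 <para>
  For example, the cluster-wide definitions could be saved and later restored
  (connecting to an existing database such as <literal>postgres</literal>)
  with commands along these lines; the file name is a placeholder:
<programlisting>
pg_dumpall --globals-only &gt; <replaceable class="parameter">globals.sql</replaceable>
psql -f <replaceable class="parameter">globals.sql</replaceable> postgres
</programlisting>
 </para>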
233 </sect2>
235 <sect2 id="backup-dump-large">
236 <title>Handling Large Databases</title>
238 <para>
239 Some operating systems have maximum file size limits that cause
240 problems when creating large <application>pg_dump</application> output files.
241 Fortunately, <application>pg_dump</application> can write to the standard
242 output, so you can use standard Unix tools to work around this
243 potential problem. There are several possible methods:
244 </para>
246 <formalpara>
247 <title>Use compressed dumps.</title>
248 <para>
249 You can use your favorite compression program, for example
250 <application>gzip</application>:
252 <programlisting>
253 pg_dump <replaceable class="parameter">dbname</replaceable> | gzip &gt; <replaceable class="parameter">filename</replaceable>.gz
254 </programlisting>
256 Reload with:
258 <programlisting>
259 gunzip -c <replaceable class="parameter">filename</replaceable>.gz | psql <replaceable class="parameter">dbname</replaceable>
260 </programlisting>
262 or:
264 <programlisting>
265 cat <replaceable class="parameter">filename</replaceable>.gz | gunzip | psql <replaceable class="parameter">dbname</replaceable>
266 </programlisting>
267 </para>
268 </formalpara>
270 <formalpara>
271 <title>Use <command>split</command>.</title>
272 <para>
273 The <command>split</command> command
274 allows you to split the output into smaller files that are
275 acceptable in size to the underlying file system. For example, to
276 make 2 gigabyte chunks:
278 <programlisting>
279 pg_dump <replaceable class="parameter">dbname</replaceable> | split -b 2G - <replaceable class="parameter">filename</replaceable>
280 </programlisting>
282 Reload with:
284 <programlisting>
285 cat <replaceable class="parameter">filename</replaceable>* | psql <replaceable class="parameter">dbname</replaceable>
286 </programlisting>
288 If using GNU <application>split</application>, it is possible to
289 use it and <application>gzip</application> together:
291 <programlisting>
292 pg_dump <replaceable class="parameter">dbname</replaceable> | split -b 2G --filter='gzip > $FILE.gz'
293 </programlisting>
295 It can be restored using <command>zcat</command>.
296 </para>
297 </formalpara>
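 <para>
  With the invocation shown above, GNU <application>split</application> uses
  its default output prefix of <literal>x</literal>, so (assuming no unrelated
  files match the pattern) a reload might look like this:
<programlisting>
zcat x*.gz | psql <replaceable class="parameter">dbname</replaceable>
</programlisting>
 </para>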
299 <formalpara>
300 <title>Use <application>pg_dump</application>'s custom dump format.</title>
301 <para>
302 If <productname>PostgreSQL</productname> was built on a system with the
303 <application>zlib</application> compression library installed, the custom dump
304 format will compress data as it writes it to the output file. This will
305 produce dump file sizes similar to using <command>gzip</command>, but it
306 has the added advantage that tables can be restored selectively. The
307 following command dumps a database using the custom dump format:
309 <programlisting>
310 pg_dump -Fc <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">filename</replaceable>
311 </programlisting>
313 A custom-format dump is not a script for <application>psql</application>, but
314 instead must be restored with <application>pg_restore</application>, for example:
316 <programlisting>
317 pg_restore -d <replaceable class="parameter">dbname</replaceable> <replaceable class="parameter">filename</replaceable>
318 </programlisting>
320 See the <xref linkend="app-pgdump"/> and <xref
321 linkend="app-pgrestore"/> reference pages for details.
322 </para>
323 </formalpara>
325 <para>
326 For very large databases, you might need to combine <command>split</command>
327 with one of the other two approaches.
328 </para>
330 <formalpara>
331 <title>Use <application>pg_dump</application>'s parallel dump feature.</title>
332 <para>
333 To speed up the dump of a large database, you can use
334 <application>pg_dump</application>'s parallel mode. This will dump
335 multiple tables at the same time. You can control the degree of
336 parallelism with the <command>-j</command> parameter. Parallel dumps
337 are only supported for the "directory" archive format.
339 <programlisting>
340 pg_dump -j <replaceable class="parameter">num</replaceable> -F d -f <replaceable class="parameter">out.dir</replaceable> <replaceable class="parameter">dbname</replaceable>
341 </programlisting>
343 You can use <command>pg_restore -j</command> to restore a dump in parallel.
344 This will work for any archive of either the "custom" or the "directory"
345 archive mode, whether or not it has been created with <command>pg_dump -j</command>.
346 </para>
347 </formalpara>
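 <para>
  For example, the directory-format dump created above could be restored into
  an existing (for instance, freshly created) database with:
<programlisting>
pg_restore -j <replaceable class="parameter">num</replaceable> -d <replaceable class="parameter">dbname</replaceable> <replaceable class="parameter">out.dir</replaceable>
</programlisting>
 </para>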
348 </sect2>
349 </sect1>
351 <sect1 id="backup-file">
352 <title>File System Level Backup</title>
354 <para>
355 An alternative backup strategy is to directly copy the files that
356 <productname>PostgreSQL</productname> uses to store the data in the database;
357 <xref linkend="creating-cluster"/> explains where these files
358 are located. You can use whatever method you prefer
359 for doing file system backups; for example:
361 <programlisting>
362 tar -cf backup.tar /usr/local/pgsql/data
363 </programlisting>
364 </para>
366 <para>
367 There are two restrictions, however, which make this method
368 impractical, or at least inferior to the <application>pg_dump</application>
369 method:
371 <orderedlist>
372 <listitem>
373 <para>
374 The database server <emphasis>must</emphasis> be shut down in order to
375 get a usable backup. Half-way measures such as disallowing all
376 connections will <emphasis>not</emphasis> work
377 (in part because <command>tar</command> and similar tools do not take
378 an atomic snapshot of the state of the file system,
379 but also because of internal buffering within the server).
380 Information about stopping the server can be found in
381 <xref linkend="server-shutdown"/>. Needless to say, you
382 also need to shut down the server before restoring the data.
383 </para>
384 </listitem>
386 <listitem>
387 <para>
388 If you have dug into the details of the file system layout of the
389 database, you might be tempted to try to back up or restore only certain
390 individual tables or databases from their respective files or
391 directories. This will <emphasis>not</emphasis> work because the
392 information contained in these files is not usable without
393 the commit log files,
394 <filename>pg_xact/*</filename>, which contain the commit status of
395 all transactions. A table file is only usable with this
396 information. Of course it is also impossible to restore only a
397 table and the associated <filename>pg_xact</filename> data
398 because that would render all other tables in the database
399 cluster useless. So file system backups only work for complete
400 backup and restoration of an entire database cluster.
401 </para>
402 </listitem>
403 </orderedlist>
404 </para>
406 <para>
407 An alternative file-system backup approach is to make a
408 <quote>consistent snapshot</quote> of the data directory, if the
409 file system supports that functionality (and you are willing to
410 trust that it is implemented correctly). The typical procedure is
411 to make a <quote>frozen snapshot</quote> of the volume containing the
412 database, then copy the whole data directory (not just parts, see
413 above) from the snapshot to a backup device, then release the frozen
414 snapshot. This will work even while the database server is running.
415 However, a backup created in this way saves
416 the database files in a state as if the database server was not
417 properly shut down; therefore, when you start the database server
418 on the backed-up data, it will think the previous server instance
419 crashed and will replay the WAL log. This is not a problem; just
420 be aware of it (and be sure to include the WAL files in your backup).
421 You can perform a <command>CHECKPOINT</command> before taking the
422 snapshot to reduce recovery time.
423 </para>
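 <para>
  As a rough illustration only, on a Linux system using
  <application>LVM</application> the procedure might look something like the
  following; the volume group, volume and directory names are placeholders,
  and the details (including any file-system-specific mount options) depend
  entirely on your storage stack:
<programlisting>
lvcreate --snapshot --size 10G --name pgsnap /dev/vg0/pgdata   # freeze
mount /dev/vg0/pgsnap /mnt/pgsnap
tar -cf /backup/pgdata-snapshot.tar -C /mnt/pgsnap .           # copy everything
umount /mnt/pgsnap
lvremove -f /dev/vg0/pgsnap                                    # release
</programlisting>
 </para>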
425 <para>
426 If your database is spread across multiple file systems, there might not
427 be any way to obtain exactly-simultaneous frozen snapshots of all
428 the volumes. For example, if your data files and WAL log are on different
429 disks, or if tablespaces are on different file systems, it might
430 not be possible to use snapshot backup because the snapshots
431 <emphasis>must</emphasis> be simultaneous.
432 Read your file system documentation very carefully before trusting
433 the consistent-snapshot technique in such situations.
434 </para>
436 <para>
437 If simultaneous snapshots are not possible, one option is to shut down
438 the database server long enough to establish all the frozen snapshots.
439 Another option is to perform a continuous archiving base backup (<xref
440 linkend="backup-base-backup"/>) because such backups are immune to file
441 system changes during the backup. This requires enabling continuous
442 archiving just during the backup process; restore is done using
443 continuous archive recovery (<xref linkend="backup-pitr-recovery"/>).
444 </para>
446 <para>
447 Another option is to use <application>rsync</application> to perform a file
448 system backup. This is done by first running <application>rsync</application>
449 while the database server is running, then shutting down the database
450 server long enough to do an <command>rsync --checksum</command>.
451 (<option>--checksum</option> is necessary because <command>rsync</command> only
452 has file modification-time granularity of one second.) The
453 second <application>rsync</application> will be quicker than the first,
454 because it has relatively little data to transfer, and the end result
455 will be consistent because the server was down. This method
456 allows a file system backup to be performed with minimal downtime.
457 </para>
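 <para>
  A sketch of this approach, with placeholder paths and using
  <application>pg_ctl</application> to stop and start the server:
<programlisting>
rsync -a /usr/local/pgsql/data/ /backup/pgsql/data/              # first pass, server running
pg_ctl stop -D /usr/local/pgsql/data
rsync -a --checksum /usr/local/pgsql/data/ /backup/pgsql/data/   # second pass, server stopped
pg_ctl start -D /usr/local/pgsql/data
</programlisting>
 </para>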
459 <para>
460 Note that a file system backup will typically be larger
461 than an SQL dump. (<application>pg_dump</application> does not need to dump
462 the contents of indexes for example, just the commands to recreate
463 them.) However, taking a file system backup might be faster.
464 </para>
465 </sect1>
467 <sect1 id="continuous-archiving">
468 <title>Continuous Archiving and Point-in-Time Recovery (PITR)</title>
470 <indexterm zone="backup">
471 <primary>continuous archiving</primary>
472 </indexterm>
474 <indexterm zone="backup">
475 <primary>point-in-time recovery</primary>
476 </indexterm>
478 <indexterm zone="backup">
479 <primary>PITR</primary>
480 </indexterm>
482 <para>
483 At all times, <productname>PostgreSQL</productname> maintains a
484 <firstterm>write ahead log</firstterm> (WAL) in the <filename>pg_wal/</filename>
485 subdirectory of the cluster's data directory. The log records
486 every change made to the database's data files. This log exists
487 primarily for crash-safety purposes: if the system crashes, the
488 database can be restored to consistency by <quote>replaying</quote> the
489 log entries made since the last checkpoint. However, the existence
490 of the log makes it possible to use a third strategy for backing up
491 databases: we can combine a file-system-level backup with backup of
492 the WAL files. If recovery is needed, we restore the file system backup and
493 then replay from the backed-up WAL files to bring the system to a
494 current state. This approach is more complex to administer than
495 either of the previous approaches, but it has some significant
496 benefits:
497 <itemizedlist>
498 <listitem>
499 <para>
500 We do not need a perfectly consistent file system backup as the starting point.
501 Any internal inconsistency in the backup will be corrected by log
502 replay (this is not significantly different from what happens during
503 crash recovery). So we do not need a file system snapshot capability,
504 just <application>tar</application> or a similar archiving tool.
505 </para>
506 </listitem>
507 <listitem>
508 <para>
509 Since we can combine an indefinitely long sequence of WAL files
510 for replay, continuous backup can be achieved simply by continuing to archive
511 the WAL files. This is particularly valuable for large databases, where
512 it might not be convenient to take a full backup frequently.
513 </para>
514 </listitem>
515 <listitem>
516 <para>
517 It is not necessary to replay the WAL entries all the
518 way to the end. We could stop the replay at any point and have a
519 consistent snapshot of the database as it was at that time. Thus,
520 this technique supports <firstterm>point-in-time recovery</firstterm>: it is
521 possible to restore the database to its state at any time since your base
522 backup was taken.
523 </para>
524 </listitem>
525 <listitem>
526 <para>
527 If we continuously feed the series of WAL files to another
528 machine that has been loaded with the same base backup file, we
529 have a <firstterm>warm standby</firstterm> system: at any point we can bring up
530 the second machine and it will have a nearly-current copy of the
531 database.
532 </para>
533 </listitem>
534 </itemizedlist>
535 </para>
537 <note>
538 <para>
539 <application>pg_dump</application> and
540 <application>pg_dumpall</application> do not produce file-system-level
541 backups and cannot be used as part of a continuous-archiving solution.
542 Such dumps are <emphasis>logical</emphasis> and do not contain enough
543 information to be used by WAL replay.
544 </para>
545 </note>
547 <para>
548 As with the plain file-system-backup technique, this method can only
549 support restoration of an entire database cluster, not a subset.
550 Also, it requires a lot of archival storage: the base backup might be bulky,
551 and a busy system will generate many megabytes of WAL traffic that
552 have to be archived. Still, it is the preferred backup technique in
553 many situations where high reliability is needed.
554 </para>
556 <para>
557 To recover successfully using continuous archiving (also called
558 <quote>online backup</quote> by many database vendors), you need a continuous
559 sequence of archived WAL files that extends back at least as far as the
560 start time of your backup. So to get started, you should set up and test
561 your procedure for archiving WAL files <emphasis>before</emphasis> you take your
562 first base backup. Accordingly, we first discuss the mechanics of
563 archiving WAL files.
564 </para>
566 <sect2 id="backup-archiving-wal">
567 <title>Setting Up WAL Archiving</title>
569 <para>
570 In an abstract sense, a running <productname>PostgreSQL</productname> system
571 produces an indefinitely long sequence of WAL records. The system
572 physically divides this sequence into WAL <firstterm>segment
573 files</firstterm>, which are normally 16MB apiece (although the segment size
574 can be altered during <application>initdb</application>). The segment
575 files are given numeric names that reflect their position in the
576 abstract WAL sequence. When not using WAL archiving, the system
577 normally creates just a few segment files and then
578 <quote>recycles</quote> them by renaming no-longer-needed segment files
579 to higher segment numbers. It's assumed that segment files whose
580 contents precede the last checkpoint are no longer of
581 interest and can be recycled.
582 </para>
584 <para>
585 When archiving WAL data, we need to capture the contents of each segment
586 file once it is filled, and save that data somewhere before the segment
587 file is recycled for reuse. Depending on the application and the
588 available hardware, there could be many different ways of <quote>saving
589 the data somewhere</quote>: we could copy the segment files to an NFS-mounted
590 directory on another machine, write them onto a tape drive (ensuring that
591 you have a way of identifying the original name of each file), or batch
592 them together and burn them onto CDs, or something else entirely. To
593 provide the database administrator with flexibility,
594 <productname>PostgreSQL</productname> tries not to make any assumptions about how
595 the archiving will be done. Instead, <productname>PostgreSQL</productname> lets
596 the administrator specify a shell command or an archive library to be executed to copy a
597 completed segment file to wherever it needs to go. This could be as simple
598 as a shell command that uses <literal>cp</literal>, or it could invoke a
599 complex C function &mdash; it's all up to you.
600 </para>
602 <para>
603 To enable WAL archiving, set the <xref linkend="guc-wal-level"/>
604 configuration parameter to <literal>replica</literal> or higher,
605 <xref linkend="guc-archive-mode"/> to <literal>on</literal>,
606 specify the shell command to use in the <xref
607 linkend="guc-archive-command"/> configuration parameter
608 or specify the library to use in the <xref
609 linkend="guc-archive-library"/> configuration parameter. In practice
610 these settings will always be placed in the
611 <filename>postgresql.conf</filename> file.
612 </para>
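 <para>
  A minimal <filename>postgresql.conf</filename> fragment enabling shell-based
  archiving might therefore look like this (the archive directory is the same
  placeholder used in the examples below, and must exist and be writable by
  the server):
<programlisting>
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /mnt/server/archivedir/%f &amp;&amp; cp %p /mnt/server/archivedir/%f'
</programlisting>
  To use an archive module instead, leave <varname>archive_command</varname>
  unset and set <varname>archive_library</varname> to the module's name.
 </para>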
614 <para>
615 In <varname>archive_command</varname>,
616 <literal>%p</literal> is replaced by the path name of the file to
617 archive, while <literal>%f</literal> is replaced by only the file name.
618 (The path name is relative to the current working directory,
619 i.e., the cluster's data directory.)
620 Use <literal>%%</literal> if you need to embed an actual <literal>%</literal>
621 character in the command. The simplest useful command is something
622 like:
623 <programlisting>
624 archive_command = 'test ! -f /mnt/server/archivedir/%f &amp;&amp; cp %p /mnt/server/archivedir/%f' # Unix
625 archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
626 </programlisting>
627 which will copy archivable WAL segments to the directory
628 <filename>/mnt/server/archivedir</filename>. (This is an example, not a
629 recommendation, and might not work on all platforms.) After the
630 <literal>%p</literal> and <literal>%f</literal> parameters have been replaced,
631 the actual command executed might look like this:
632 <programlisting>
633 test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065
634 </programlisting>
635 A similar command will be generated for each new file to be archived.
636 </para>
638 <para>
639 The archive command will be executed under the ownership of the same
640 user that the <productname>PostgreSQL</productname> server is running as. Since
641 the series of WAL files being archived contains effectively everything
642 in your database, you will want to be sure that the archived data is
643 protected from prying eyes; for example, archive into a directory that
644 does not have group or world read access.
645 </para>
647 <para>
648 It is important that the archive command return zero exit status if and
649 only if it succeeds. Upon getting a zero result,
650 <productname>PostgreSQL</productname> will assume that the file has been
651 successfully archived, and will remove or recycle it. However, a nonzero
652 status tells <productname>PostgreSQL</productname> that the file was not archived;
653 it will try again periodically until it succeeds.
654 </para>
656 <para>
657 Another way to archive is to use a custom archive module as the
658 <varname>archive_library</varname>. Since such modules are written in
659 <literal>C</literal>, creating your own may require considerably more effort
660 than writing a shell command. However, archive modules can be more
661 performant than archiving via shell, and they will have access to many
662 useful server resources. For more information about archive modules, see
663 <xref linkend="archive-modules"/>.
664 </para>
666 <para>
667 When the archive command is terminated by a signal (other than
668 <systemitem>SIGTERM</systemitem>, which is used as part of a server
669 shutdown) or an error by the shell with an exit status greater than
670 125 (such as command not found), or if the archive function emits an
671 <literal>ERROR</literal> or <literal>FATAL</literal>, the archiver process
672 aborts and gets restarted by the postmaster. In such cases, the failure is
673 not reported in <xref linkend="pg-stat-archiver-view"/>.
674 </para>
676 <para>
677 Archive commands and libraries should generally be designed to refuse to overwrite
678 any pre-existing archive file. This is an important safety feature to
679 preserve the integrity of your archive in case of administrator error
680 (such as sending the output of two different servers to the same archive
681 directory). It is advisable to test your proposed archive library to ensure
682 that it does not overwrite an existing file.
683 </para>
685 <para>
686 In rare cases, <productname>PostgreSQL</productname> may attempt to
687 re-archive a WAL file that was previously archived. For example, if the
688 system crashes before the server makes a durable record of archival
689 success, the server will attempt to archive the file again after
690 restarting (provided archiving is still enabled). When an archive command or library
691 encounters a pre-existing file, it should return a zero status or <literal>true</literal>, respectively,
692 if the WAL file has identical contents to the pre-existing archive and the
693 pre-existing archive is fully persisted to storage. If a pre-existing
694 file contains different contents than the WAL file being archived, the
695 archive command or library <emphasis>must</emphasis> return a nonzero status or
696 <literal>false</literal>, respectively.
697 </para>
699 <para>
700 The example command above for Unix avoids overwriting a pre-existing archive
701 by including a separate
702 <command>test</command> step. On some Unix platforms, <command>cp</command> has
703 switches such as <option>-i</option> that can be used to do the same thing
704 less verbosely, but you should not rely on these without verifying that
705 the right exit status is returned. (In particular, GNU <command>cp</command>
706 will return status zero when <option>-i</option> is used and the target file
707 already exists, which is <emphasis>not</emphasis> the desired behavior.)
708 </para>
710 <para>
711 While designing your archiving setup, consider what will happen if
712 the archive command or library fails repeatedly because some aspect requires
713 operator intervention or the archive runs out of space. For example, this
714 could occur if you write to tape without an autochanger; when the tape
715 fills, nothing further can be archived until the tape is swapped.
716 You should ensure that any error condition or request to a human operator
717 is reported appropriately so that the situation can be
718 resolved reasonably quickly. The <filename>pg_wal/</filename> directory will
719 continue to fill with WAL segment files until the situation is resolved.
720 (If the file system containing <filename>pg_wal/</filename> fills up,
721 <productname>PostgreSQL</productname> will do a PANIC shutdown. No committed
722 transactions will be lost, but the database will remain offline until
723 you free some space.)
724 </para>
726 <para>
727 The speed of the archive command or library is unimportant as long as it can keep up
728 with the average rate at which your server generates WAL data. Normal
729 operation continues even if the archiving process falls a little behind.
730 If archiving falls significantly behind, this will increase the amount of
731 data that would be lost in the event of a disaster. It will also mean that
732 the <filename>pg_wal/</filename> directory will contain large numbers of
733 not-yet-archived segment files, which could eventually exceed available
734 disk space. You are advised to monitor the archiving process to ensure that
735 it is working as you intend.
736 </para>
738 <para>
739 In writing your archive command or library, you should assume that the file names to
740 be archived can be up to 64 characters long and can contain any
741 combination of ASCII letters, digits, and dots. It is not necessary to
742 preserve the original relative path (<literal>%p</literal>) but it is necessary to
743 preserve the file name (<literal>%f</literal>).
744 </para>
746 <para>
747 Note that although WAL archiving will allow you to restore any
748 modifications made to the data in your <productname>PostgreSQL</productname> database,
749 it will not restore changes made to configuration files (that is,
750 <filename>postgresql.conf</filename>, <filename>pg_hba.conf</filename> and
751 <filename>pg_ident.conf</filename>), since those are edited manually rather
752 than through SQL operations.
753 You might wish to keep the configuration files in a location that will
754 be backed up by your regular file system backup procedures. See
755 <xref linkend="runtime-config-file-locations"/> for how to relocate the
756 configuration files.
757 </para>
759 <para>
760 The archive command or function is only invoked on completed WAL segments. Hence,
761 if your server generates little WAL traffic (or has slack periods
762 during which it does so), there could be a long delay between the completion
763 of a transaction and its safe recording in archive storage. To put
764 a limit on how old unarchived data can be, you can set
765 <xref linkend="guc-archive-timeout"/> to force the server to switch
766 to a new WAL segment file at least that often. Note that archived
767 files that are archived early due to a forced switch are still the same
768 length as completely full files. It is therefore unwise to set a very
769 short <varname>archive_timeout</varname> &mdash; it will bloat your archive
770 storage. <varname>archive_timeout</varname> settings of a minute or so are
771 usually reasonable.
772 </para>
774 <para>
775 Also, you can force a segment switch manually with
776 <function>pg_switch_wal</function> if you want to ensure that a
777 just-finished transaction is archived as soon as possible. Other utility
778 functions related to WAL management are listed in <xref
779 linkend="functions-admin-backup-table"/>.
780 </para>
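 <para>
  For example, from <application>psql</application>:
<programlisting>
SELECT pg_switch_wal();
</programlisting>
 </para>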
782 <para>
783 When <varname>wal_level</varname> is <literal>minimal</literal> some SQL commands
784 are optimized to avoid WAL logging, as described in <xref
785 linkend="populate-pitr"/>. If archiving or streaming replication were
786 turned on during execution of one of these statements, WAL would not
787 contain enough information for archive recovery. (Crash recovery is
788 unaffected.) For this reason, <varname>wal_level</varname> can only be changed at
789 server start. However, <varname>archive_command</varname> and <varname>archive_library</varname> can be changed with a
790 configuration file reload. If you are archiving via shell and wish to
791 temporarily stop archiving,
792 one way to do it is to set <varname>archive_command</varname> to the empty
793 string (<literal>''</literal>).
794 This will cause WAL files to accumulate in <filename>pg_wal/</filename> until a
795 working <varname>archive_command</varname> is re-established.
796 </para>
797 </sect2>
799 <sect2 id="backup-base-backup">
800 <title>Making a Base Backup</title>
802 <para>
803 The easiest way to perform a base backup is to use the
804 <xref linkend="app-pgbasebackup"/> tool. It can create
805 a base backup either as regular files or as a tar archive. If more
806 flexibility than <xref linkend="app-pgbasebackup"/> can provide is
807 required, you can also make a base backup using the low level API
808 (see <xref linkend="backup-lowlevel-base-backup"/>).
809 </para>
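 <para>
  For example, a compressed tar-format base backup could be written to a
  placeholder directory like this:
<programlisting>
pg_basebackup -D /backup/base -Ft -z -P
</programlisting>
  Here <option>-D</option> selects the output directory, <option>-Ft</option>
  the tar output format, <option>-z</option> gzip compression, and
  <option>-P</option> progress reporting; see <xref linkend="app-pgbasebackup"/>
  for the full list of options.
 </para>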
811 <para>
812 It is not necessary to be concerned about the amount of time it takes
813 to make a base backup. However, if you normally run the
814 server with <varname>full_page_writes</varname> disabled, you might notice a drop
815 in performance while the backup runs since <varname>full_page_writes</varname> is
816 effectively forced on during backup mode.
817 </para>
819 <para>
820 To make use of the backup, you will need to keep all the WAL
821 segment files generated during and after the file system backup.
822 To aid you in doing this, the base backup process
823 creates a <firstterm>backup history file</firstterm> that is immediately
824 stored into the WAL archive area. This file is named after the first
825 WAL segment file that you need for the file system backup.
826 For example, if the starting WAL file is
827 <literal>0000000100001234000055CD</literal> the backup history file will be
828 named something like
829 <literal>0000000100001234000055CD.007C9330.backup</literal>. (The second
830 part of the file name stands for an exact position within the WAL
831 file, and can ordinarily be ignored.) Once you have safely archived
832 the file system backup and the WAL segment files used during the
833 backup (as specified in the backup history file), all archived WAL
834 segments with names numerically less are no longer needed to recover
835 the file system backup and can be deleted. However, you should
836 consider keeping several backup sets to be absolutely certain that
837 you can recover your data.
838 </para>
840 <para>
841 The backup history file is just a small text file. It contains the
842 label string you gave to <xref linkend="app-pgbasebackup"/>, as well as
843 the starting and ending times and WAL segments of the backup.
844 If you used the label to identify the associated dump file,
845 then the archived history file is enough to tell you which dump file to
846 restore.
847 </para>
849 <para>
850 Since you have to keep around all the archived WAL files back to your
851 last base backup, the interval between base backups should usually be
852 chosen based on how much storage you want to expend on archived WAL
853 files. You should also consider how long you are prepared to spend
854 recovering, if recovery should be necessary &mdash; the system will have to
855 replay all those WAL segments, and that could take a while if it has
856 been a long time since the last base backup.
857 </para>
858 </sect2>
860 <sect2 id="backup-incremental-backup">
861 <title>Making an Incremental Backup</title>
863 <para>
864 You can use <xref linkend="app-pgbasebackup"/> to take an incremental
865 backup by specifying the <literal>--incremental</literal> option. You must
866 supply, as an argument to <literal>--incremental</literal>, the backup
867 manifest to an earlier backup from the same server. In the resulting
868 backup, non-relation files will be included in their entirety, but some
869 relation files may be replaced by smaller incremental files which contain
870 only the blocks which have been changed since the earlier backup and enough
871 metadata to reconstruct the current version of the file.
872 </para>
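 <para>
  For example, assuming an earlier full backup was taken in plain format into
  <filename>/backup/full</filename> (so that its manifest is available at
  <filename>/backup/full/backup_manifest</filename>), an incremental backup
  could be taken with:
<programlisting>
pg_basebackup --incremental=/backup/full/backup_manifest -D /backup/incr1
</programlisting>
 </para>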
874 <para>
875 To figure out which blocks need to be backed up, the server uses WAL
876 summaries, which are stored in the data directory, inside the directory
877 <literal>pg_wal/summaries</literal>. If the required summary files are not
878 present, an attempt to take an incremental backup will fail. The summaries
879 present in this directory must cover all LSNs from the start LSN of the
880 prior backup to the start LSN of the current backup. Since the server looks
881 for WAL summaries just after establishing the start LSN of the current
882 backup, the necessary summary files probably won't be instantly present
883 on disk, but the server will wait for any missing files to show up.
884 This also helps if the WAL summarization process has fallen behind.
885 However, if the necessary files have already been removed, or if the WAL
886 summarizer doesn't catch up quickly enough, the incremental backup will
887 fail.
888 </para>
890 <para>
891 When restoring an incremental backup, it will be necessary to have not
892 only the incremental backup itself but also all earlier backups that
893 are required to supply the blocks omitted from the incremental backup.
894 See <xref linkend="app-pgcombinebackup"/> for further information about
895 this requirement. Note that there are restrictions on the use of
896 <literal>pg_combinebackup</literal> when the checksum status of the
897 cluster has been changed; see
898 <link linkend="app-pgcombinebackup-limitations">pg_combinebackup
899 limitations</link>.
900 </para>
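 <para>
  Continuing the example above, a synthetic full backup could later be
  reconstructed like this; the directory names are placeholders, and the
  backups are listed from oldest to newest:
<programlisting>
pg_combinebackup -o /restore/data /backup/full /backup/incr1
</programlisting>
 </para>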
902 <para>
903 Note that all of the requirements for making use of a full backup also
904 apply to an incremental backup. For instance, you still need all of the
905 WAL segment files generated during and after the file system backup, and
906 any relevant WAL history files. And you still need to create a
907 <literal>recovery.signal</literal> (or <literal>standby.signal</literal>)
908 and perform recovery, as described in
909 <xref linkend="backup-pitr-recovery" />. The requirement to have earlier
910 backups available at restore time and to use
911 <literal>pg_combinebackup</literal> is an additional requirement on top of
912 everything else. Keep in mind that <application>PostgreSQL</application>
913 has no built-in mechanism to figure out which backups are still needed as
914 a basis for restoring later incremental backups. You must keep track of
915 the relationships between your full and incremental backups on your own,
916 and be certain not to remove earlier backups if they might be needed when
917 restoring later incremental backups.
918 </para>
920 <para>
921 Incremental backups typically only make sense for relatively large
922 databases where a significant portion of the data does not change, or only
923 changes slowly. For a small database, it's easier to ignore the existence
924 of incremental backups and simply take full backups, which are simpler
925 to manage. For a large database all of which is heavily modified,
926 incremental backups won't be much smaller than full backups.
927 </para>
929 <para>
930 An incremental backup is only possible if replay would begin from a later
931 checkpoint than for the previous backup upon which it depends. If you
932 take the incremental backup on the primary, this condition is always
933 satisfied, because each backup triggers a new checkpoint. On a standby,
934 replay begins from the most recent restartpoint. Therefore, an
935 incremental backup of a standby server can fail if there has been very
936 little activity since the previous backup, since no new restartpoint might
937 have been created.
938 </para>
939 </sect2>
941 <sect2 id="backup-lowlevel-base-backup">
942 <title>Making a Base Backup Using the Low Level API</title>
943 <para>
944 Instead of taking a full or incremental base backup using
945 <xref linkend="app-pgbasebackup"/>, you can take a base backup using the
946 low-level API. This procedure contains a few more steps than
947 the <application>pg_basebackup</application> method, but is relatively
948 simple. It is very important that these steps are executed in
949 sequence, and that the success of a step is verified before
950 proceeding to the next step.
951 </para>
952 <para>
953 Multiple backups can be run concurrently (both those
954 started using this backup API and those started using
955 <xref linkend="app-pgbasebackup"/>).
956 </para>
957 <para>
958 <orderedlist>
959 <listitem>
960 <para>
961 Ensure that WAL archiving is enabled and working.
962 </para>
963 </listitem>
964 <listitem>
965 <para>
966 Connect to the server (it does not matter which database) as a user with
967 rights to run <function>pg_backup_start</function> (superuser,
968 or a user who has been granted <literal>EXECUTE</literal> on the
969 function) and issue the command:
970 <programlisting>
971 SELECT pg_backup_start(label => 'label', fast => false);
972 </programlisting>
973 where <literal>label</literal> is any string you want to use to uniquely
974 identify this backup operation. The connection
975 calling <function>pg_backup_start</function> must be maintained until the end of
976 the backup, or the backup will be automatically aborted.
977 </para>
979 <para>
980 Online backups are always started at the beginning of a checkpoint.
981 By default, <function>pg_backup_start</function> will wait for the next
982 regularly scheduled checkpoint to complete, which may take a long time (see the
983 configuration parameters <xref linkend="guc-checkpoint-timeout"/> and
984 <xref linkend="guc-checkpoint-completion-target"/>). This is
985 usually preferable as it minimizes the impact on the running system. If you
986 want to start the backup as soon as possible, pass <literal>true</literal> as
987 the second parameter to <function>pg_backup_start</function> and it will
988 request an immediate checkpoint, which will finish as fast as possible using
989 as much I/O as possible.
990 </para>
992 </listitem>
993 <listitem>
994 <para>
995 Perform the backup, using any convenient file-system-backup tool
996 such as <application>tar</application> or <application>cpio</application> (not
997 <application>pg_dump</application> or
998 <application>pg_dumpall</application>). It is neither
999 necessary nor desirable to stop normal operation of the database
1000 while you do this. See
1001 <xref linkend="backup-lowlevel-base-backup-data"/> for things to
1002 consider during this backup.
1003 </para>
1004 </listitem>
1005 <listitem>
1006 <para>
1007 In the same connection as before, issue the command:
1008 <programlisting>
1009 SELECT * FROM pg_backup_stop(wait_for_archive => true);
1010 </programlisting>
1011 This terminates backup mode. On a primary, it also performs an automatic
1012 switch to the next WAL segment. On a standby, it is not possible to
1013 automatically switch WAL segments, so you may wish to run
1014 <function>pg_switch_wal</function> on the primary to perform a manual
1015 switch. The reason for the switch is to arrange for
1016 the last WAL segment file written during the backup interval to be
1017 ready to archive.
1018 </para>
1019 <para>
1020 <function>pg_backup_stop</function> will return one row with three
1021 values. The second of these fields should be written to a file named
1022 <filename>backup_label</filename> in the root directory of the backup. The
1023 third field should be written to a file named
1024 <filename>tablespace_map</filename> unless the field is empty. These files are
1025 vital to the backup working and must be written byte for byte without
1026 modification, which may require opening the file in binary mode.
1027 </para>
1028 </listitem>
1029 <listitem>
1030 <para>
1031 Once the WAL segment files active during the backup are archived, you are
1032 done. The file identified by <function>pg_backup_stop</function>'s first return
1033 value is the last segment that is required to form a complete set of
1034 backup files. On a primary, if <varname>archive_mode</varname> is enabled and the
1035 <literal>wait_for_archive</literal> parameter is <literal>true</literal>,
1036 <function>pg_backup_stop</function> does not return until the last segment has
1037 been archived.
1038 On a standby, <varname>archive_mode</varname> must be <literal>always</literal> in order
1039 for <function>pg_backup_stop</function> to wait.
1040 Archiving of these files happens automatically since you have
1041 already configured <varname>archive_command</varname> or <varname>archive_library</varname>.
1042 In most cases this happens quickly, but you are advised to monitor your
1043 archive system to ensure there are no delays.
1044 If the archive process has fallen behind because of failures of the
1045 archive command or library, it will keep retrying
1046 until the archive succeeds and the backup is complete.
1047 If you wish to place a time limit on the execution of
1048 <function>pg_backup_stop</function>, set an appropriate
1049 <varname>statement_timeout</varname> value, but make note that if
1050 <function>pg_backup_stop</function> terminates because of this your backup
1051 may not be valid.
1052 </para>
1053 <para>
1054 If the backup process monitors and ensures that all WAL segment files
1055 required for the backup are successfully archived then the
1056 <literal>wait_for_archive</literal> parameter (which defaults to true) can be set
1057 to false to have
1058 <function>pg_backup_stop</function> return as soon as the stop backup record is
1059 written to the WAL. By default, <function>pg_backup_stop</function> will wait
1060 until all WAL has been archived, which can take some time. This option
1061 must be used with caution: if WAL archiving is not monitored correctly
1062 then the backup might not include all of the WAL files and will
1063 therefore be incomplete and impossible to restore.
1064 </para>
1065 </listitem>
1066 </orderedlist>
1067 </para>
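  <para>
   As a condensed illustration of the steps above, a low-level backup session
   might look like this; the label, copy step, and file handling shown in the
   comments are placeholders, and the copy must be performed (for example,
   from another terminal) while the <application>psql</application> session
   below stays connected:
<programlisting>
-- keep this psql session open for the whole backup
SELECT pg_backup_start(label => 'weekly_base', fast => false);
-- ... copy the data directory with tar, rsync, etc. ...
SELECT * FROM pg_backup_stop(wait_for_archive => true);
-- write the labelfile column to a file named backup_label, and the
-- spcmapfile column (if not empty) to tablespace_map, inside the backup
</programlisting>
  </para>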
1068 <sect3 id="backup-lowlevel-base-backup-data">
1069 <title>Backing Up the Data Directory</title>
1070 <para>
1071 Some file system backup tools emit warnings or errors
1072 if the files they are trying to copy change while the copy proceeds.
1073 When taking a base backup of an active database, this situation is normal
1074 and not an error. However, you need to ensure that you can distinguish
1075 complaints of this sort from real errors. For example, some versions
1076 of <application>rsync</application> return a separate exit code for
1077 <quote>vanished source files</quote>, and you can write a driver script to
1078 accept this exit code as a non-error case. Also, some versions of
1079 GNU <application>tar</application> return an error code indistinguishable from
1080 a fatal error if a file was truncated while <application>tar</application> was
1081 copying it. Fortunately, GNU <application>tar</application> versions 1.16 and
1082 later exit with 1 if a file was changed during the backup,
1083 and 2 for other errors. With GNU <application>tar</application> version 1.23 and
1084 later, you can use the warning options <literal>--warning=no-file-changed
1085 --warning=no-file-removed</literal> to hide the related warning messages.
1086 </para>
1088 <para>
1089 Be certain that your backup includes all of the files under
1090 the database cluster directory (e.g., <filename>/usr/local/pgsql/data</filename>).
1091 If you are using tablespaces that do not reside underneath this directory,
1092 be careful to include them as well (and be sure that your backup
1093 archives symbolic links as links, otherwise the restore will corrupt
1094 your tablespaces).
1095 </para>
1097 <para>
1098 You should, however, omit from the backup the files within the
1099 cluster's <filename>pg_wal/</filename> subdirectory. This
1100 slight adjustment is worthwhile because it reduces the risk
1101 of mistakes when restoring. This is easy to arrange if
1102 <filename>pg_wal/</filename> is a symbolic link pointing to someplace outside
1103 the cluster directory, which is a common setup anyway for performance
1104 reasons. You might also want to exclude <filename>postmaster.pid</filename>
1105 and <filename>postmaster.opts</filename>, which record information
1106 about the running <application>postmaster</application>, not about the
1107 <application>postmaster</application> which will eventually use this backup.
1108 (These files can confuse <application>pg_ctl</application>.)
1109 </para>
1111 <para>
1112 It is often a good idea to also omit from the backup the files
1113 within the cluster's <filename>pg_replslot/</filename> directory, so that
1114 replication slots that exist on the primary do not become part of the
1115 backup. Otherwise, the subsequent use of the backup to create a standby
1116 may result in indefinite retention of WAL files on the standby, and
1117 possibly bloat on the primary if hot standby feedback is enabled, because
1118 the clients that are using those replication slots will still be connecting
1119 to and updating the slots on the primary, not the standby. Even if the
1120 backup is only intended for use in creating a new primary, copying the
1121 replication slots isn't expected to be particularly useful, since the
1122 contents of those slots will likely be badly out of date by the time
1123 the new primary comes on line.
1124 </para>
1126 <para>
1127 The contents of the directories <filename>pg_dynshmem/</filename>,
1128 <filename>pg_notify/</filename>, <filename>pg_serial/</filename>,
1129 <filename>pg_snapshots/</filename>, <filename>pg_stat_tmp/</filename>,
1130 and <filename>pg_subtrans/</filename> (but not the directories themselves) can be
1131 omitted from the backup as they will be initialized on postmaster startup.
1132 </para>
1134 <para>
1135 Any file or directory beginning with <filename>pgsql_tmp</filename> can be
1136 omitted from the backup. These files are removed on postmaster start and
1137 the directories will be recreated as needed.
1138 </para>
1140 <para>
1141 <filename>pg_internal.init</filename> files can be omitted from the
1142 backup whenever a file of that name is found. These files contain
1143 relation cache data that is always rebuilt when recovering.
1144 </para>
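   <para>
    Putting these exclusions together, a much-simplified sketch of copying the
    data directory with GNU <application>tar</application> might look like
    this; the paths are placeholders, tablespaces outside the data directory
    are not handled, and the optional exclusions mentioned above are left out
    for brevity:
<programlisting>
tar -cf /backup/base.tar \
    --exclude='pg_wal/*' \
    --exclude='pg_replslot/*' \
    --exclude=postmaster.pid \
    --exclude=postmaster.opts \
    /usr/local/pgsql/data
</programlisting>
   </para>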
1146 <para>
1147 The backup label
1148 file includes the label string you gave to <function>pg_backup_start</function>,
1149 as well as the time at which <function>pg_backup_start</function> was run, and
1150 the name of the starting WAL file. In case of confusion it is therefore
1151 possible to look inside a backup file and determine exactly which
1152 backup session the dump file came from. The tablespace map file includes
1153 the symbolic link names as they exist in the directory
1154 <filename>pg_tblspc/</filename> and the full path of each symbolic link.
1155 These files are not merely for your information; their presence and
1156 contents are critical to the proper operation of the system's recovery
1157 process.
1158 </para>
1160 <para>
1161 It is also possible to make a backup while the server is
1162 stopped. In this case, you obviously cannot use
1163 <function>pg_backup_start</function> or <function>pg_backup_stop</function>, and
1164 you will therefore be left to your own devices to keep track of which
1165 backup is which and how far back the associated WAL files go.
1166 It is generally better to follow the continuous archiving procedure above.
1167 </para>
1168 </sect3>
1169 </sect2>
1171 <sect2 id="backup-pitr-recovery">
1172 <title>Recovering Using a Continuous Archive Backup</title>
1174 <para>
1175 Okay, the worst has happened and you need to recover from your backup.
1176 Here is the procedure:
1177 <orderedlist>
1178 <listitem>
1179 <para>
1180 Stop the server, if it's running.
1181 </para>
1182 </listitem>
1183 <listitem>
1184 <para>
1185 If you have the space to do so,
1186 copy the whole cluster data directory and any tablespaces to a temporary
1187 location in case you need them later. Note that this precaution will
1188 require that you have enough free space on your system to hold two
1189 copies of your existing database. If you do not have enough space,
1190 you should at least save the contents of the cluster's <filename>pg_wal</filename>
1191 subdirectory, as it might contain WAL files which
1192 were not archived before the system went down.
1193 </para>
1194 </listitem>
1195 <listitem>
1196 <para>
1197 Remove all existing files and subdirectories under the cluster data
1198 directory and under the root directories of any tablespaces you are using.
1199 </para>
1200 </listitem>
1201 <listitem>
1202 <para>
1203 If you're restoring a full backup, you can restore the database files
1204 directly into the target directories. Be sure that they
1205 are restored with the right ownership (the database system user, not
1206 <literal>root</literal>!) and with the right permissions. If you are using
1207 tablespaces,
1208 you should verify that the symbolic links in <filename>pg_tblspc/</filename>
1209 were correctly restored.
1210 </para>
1211 </listitem>
1212 <listitem>
1213 <para>
1214 If you're restoring an incremental backup, you'll need to restore the
1215 incremental backup and all earlier backups upon which it directly or
1216 indirectly depends to the machine where you are performing the restore.
1217 These backups will need to be placed in separate directories, not the
1218 target directories where you want the running server to end up.
1219 Once this is done, use <xref linkend="app-pgcombinebackup"/> to pull
1220 data from the full backup and all of the subsequent incremental backups
1221 and write out a synthetic full backup to the target directories (see the sketch
1222 after this list). As above, verify that permissions and tablespace links are correct.
1223 </para>
1224 </listitem>
1225 <listitem>
1226 <para>
1227 Remove any files present in <filename>pg_wal/</filename>; these came from the
1228 file system backup and are therefore probably obsolete rather than current.
1229 If you didn't archive <filename>pg_wal/</filename> at all, then recreate
1230 it with proper permissions,
1231 being careful to ensure that you re-establish it as a symbolic link
1232 if you had it set up that way before.
1233 </para>
1234 </listitem>
1235 <listitem>
1236 <para>
1237 If you have unarchived WAL segment files that you saved in step 2,
1238 copy them into <filename>pg_wal/</filename>. (It is best to copy them,
1239 not move them, so you still have the unmodified files if a
1240 problem occurs and you have to start over.)
1241 </para>
1242 </listitem>
1243 <listitem>
1244 <para>
1245 Set recovery configuration settings in
1246 <filename>postgresql.conf</filename> (see <xref
1247 linkend="runtime-config-wal-archive-recovery"/>) and create a file
1248 <filename>recovery.signal</filename> in the cluster
1249 data directory. You might
1250 also want to temporarily modify <filename>pg_hba.conf</filename> to prevent
1251 ordinary users from connecting until you are sure the recovery was successful.
1252 </para>
1253 </listitem>
1254 <listitem>
1255 <para>
1256 Start the server. The server will go into recovery mode and
1257 proceed to read through the archived WAL files it needs. Should the
1258 recovery be terminated because of an external error, the server can
1259 simply be restarted and it will continue recovery. Upon completion
1260 of the recovery process, the server will remove
1261 <filename>recovery.signal</filename> (to prevent
1262 accidentally re-entering recovery mode later) and then
1263 commence normal database operations.
1264 </para>
1265 </listitem>
1266 <listitem>
1267 <para>
1268 Inspect the contents of the database to ensure you have recovered to
1269 the desired state. If not, return to step 1. If all is well,
1270 allow your users to connect by restoring <filename>pg_hba.conf</filename> to normal.
1271 </para>
1272 </listitem>
1273 </orderedlist>
1274 </para>
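  <para>
   As an illustration of steps 5 and 8 above, suppose that a full backup
   and one incremental backup were saved under
   <filename>/mnt/server/backups/</filename> and that the target data
   directory is <filename>/usr/local/pgsql/data</filename>.  These names
   are only examples, but the commands might look roughly like this:
<programlisting>
# Combine the backup chain (oldest first) into a synthetic full backup
# written to the target data directory.
pg_combinebackup -o /usr/local/pgsql/data \
    /mnt/server/backups/full /mnt/server/backups/incr1

# Tell the server to perform archive recovery when it is next started.
touch /usr/local/pgsql/data/recovery.signal
</programlisting>
   The recovery settings themselves, beginning with
   <varname>restore_command</varname>, go into
   <filename>postgresql.conf</filename> as described below.
  </para>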
1276 <para>
1277 The key part of all this is to set up a recovery configuration that
1278 describes how you want to recover and how far the recovery should
1279 run. The one thing that you absolutely must specify is the <varname>restore_command</varname>,
1280 which tells <productname>PostgreSQL</productname> how to retrieve archived
1281 WAL file segments. Like the <varname>archive_command</varname>, this is
1282 a shell command string. It can contain <literal>%f</literal>, which is
1283 replaced by the name of the desired WAL file, and <literal>%p</literal>,
1284 which is replaced by the path name to copy the WAL file to.
1285 (The path name is relative to the current working directory,
1286 i.e., the cluster's data directory.)
1287 Write <literal>%%</literal> if you need to embed an actual <literal>%</literal>
1288 character in the command. The simplest useful command is
1289 something like:
1290 <programlisting>
1291 restore_command = 'cp /mnt/server/archivedir/%f %p'
1292 </programlisting>
1293 which will copy previously archived WAL segments from the directory
1294 <filename>/mnt/server/archivedir</filename>. Of course, you can use something
1295 much more complicated, perhaps even a shell script that requests the
1296 operator to mount an appropriate tape.
1297 </para>
1299 <para>
1300 It is important that the command return nonzero exit status on failure.
1301 The command <emphasis>will</emphasis> be called requesting files that are not
1302 present in the archive; it must return nonzero when so asked. This is not
1303 an error condition. An exception is that if the command was terminated by
1304 a signal (other than <systemitem>SIGTERM</systemitem>, which is used as
1305 part of a database server shutdown) or an error by the shell (such as
1306 command not found), then recovery will abort and the server will not start up.
1308 </para>
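  <para>
   If your <varname>restore_command</varname> is a script rather than a
   simple <command>cp</command>, take care to preserve these exit-status
   semantics.  A minimal sketch, using a hypothetical script installed as
   <filename>/usr/local/bin/restore_wal.sh</filename> and configured as
   <literal>restore_command = '/usr/local/bin/restore_wal.sh %f %p'</literal>:
<programlisting>
#!/bin/sh
# Hypothetical restore script; called with the WAL file name and the
# destination path, in that order.
archivedir=/mnt/server/archivedir

# A missing file is expected at the end of recovery; report it with a
# nonzero exit status rather than treating it as a hard error.
test -f "$archivedir/$1" || exit 1
exec cp "$archivedir/$1" "$2"
</programlisting>
  </para>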
1310 <para>
1311 Not all of the requested files will be WAL segment
1312 files; you should also expect requests for files with a suffix of
1313 <literal>.history</literal>. Also be aware that
1314 the base name of the <literal>%p</literal> path will be different from
1315 <literal>%f</literal>; do not expect them to be interchangeable.
1316 </para>
1318 <para>
1319 WAL segments that cannot be found in the archive will be sought in
1320 <filename>pg_wal/</filename>; this allows use of recent un-archived segments.
1321 However, segments that are available from the archive will be used in
1322 preference to files in <filename>pg_wal/</filename>.
1323 </para>
1325 <para>
1326 Normally, recovery will proceed through all available WAL segments,
1327 thereby restoring the database to the current point in time (or as
1328 close as possible given the available WAL segments). Therefore, a normal
1329 recovery will end with a <quote>file not found</quote> message, the exact text
1330 of the error message depending upon your choice of
1331 <varname>restore_command</varname>. You may also see an error message
1332 at the start of recovery for a file named something like
1333 <filename>00000001.history</filename>. This is also normal and does not
1334 indicate a problem in simple recovery situations; see
1335 <xref linkend="backup-timelines"/> for discussion.
1336 </para>
1338 <para>
1339 If you want to recover to some previous point in time (say, right before
1340 the junior DBA dropped your main transaction table), just specify the
1341 required <link linkend="runtime-config-wal-recovery-target">stopping point</link>. You can specify
1342 the stop point, known as the <quote>recovery target</quote>, either by
1343 date/time, by named restore point, or by completion of a specific transaction
1344 ID. As of this writing only the date/time and named restore point options
1345 are very usable, since there are no tools to help you identify with any
1346 accuracy which transaction ID to use.
1347 </para>
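  <para>
   For the scenario just described, the relevant settings in
   <filename>postgresql.conf</filename> might look like this (the
   timestamp and restore point name are, of course, only examples):
<programlisting>
restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2024-05-14 17:14:00'
# or, if a restore point had been created earlier with pg_create_restore_point():
# recovery_target_name = 'before_maintenance'
</programlisting>
   Only one recovery target can be specified at a time.
  </para>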
1349 <note>
1350 <para>
1351 The stop point must be after the ending time of the base backup, i.e.,
1352 the end time of <function>pg_backup_stop</function>. You cannot use a base backup
1353 to recover to a time when that backup was in progress. (To
1354 recover to such a time, you must go back to your previous base backup
1355 and roll forward from there.)
1356 </para>
1357 </note>
1359 <para>
1360 If recovery finds corrupted WAL data, recovery will
1361 halt at that point and the server will not start. In such a case the
1362 recovery process could be re-run from the beginning, specifying a
1363 <quote>recovery target</quote> before the point of corruption so that recovery
1364 can complete normally.
1365 If recovery fails for an external reason, such as a system crash or
1366 if the WAL archive has become inaccessible, then the recovery can simply
1367 be restarted and it will restart almost from where it failed.
1368 Recovery restart works much like checkpointing in normal operation:
1369 the server periodically forces all its state to disk, and then updates
1370 the <filename>pg_control</filename> file to indicate that the already-processed
1371 WAL data need not be scanned again.
1372 </para>
1374 </sect2>
1376 <sect2 id="backup-timelines">
1377 <title>Timelines</title>
1379 <indexterm zone="backup">
1380 <primary>timelines</primary>
1381 </indexterm>
1383 <para>
1384 The ability to restore the database to a previous point in time creates
1385 some complexities that are akin to science-fiction stories about time
1386 travel and parallel universes. For example, in the original history of the database,
1387 suppose you dropped a critical table at 5:15PM on Tuesday evening, but
1388 didn't realize your mistake until Wednesday noon.
1389 Unfazed, you get out your backup, restore to the point-in-time 5:14PM
1390 Tuesday evening, and are up and running. In <emphasis>this</emphasis> history of
1391 the database universe, you never dropped the table. But suppose
1392 you later realize this wasn't such a great idea, and would like
1393 to return to sometime Wednesday morning in the original history.
1394 You won't be able
1395 to if, while your database was up-and-running, it overwrote some of the
1396 WAL segment files that led up to the time you now wish you
1397 could get back to. Thus, to avoid this, you need to distinguish the series of
1398 WAL records generated after you've done a point-in-time recovery from
1399 those that were generated in the original database history.
1400 </para>
1402 <para>
1403 To deal with this problem, <productname>PostgreSQL</productname> has a notion
1404 of <firstterm>timelines</firstterm>. Whenever an archive recovery completes,
1405 a new timeline is created to identify the series of WAL records
1406 generated after that recovery. The timeline
1407 ID number is part of WAL segment file names so a new timeline does
1408 not overwrite the WAL data generated by previous timelines.
1409 For example, in the WAL file name
1410 <filename>0000000100001234000055CD</filename>, the leading
1411 <literal>00000001</literal> is the timeline ID in hexadecimal. (Note that
1412 in other contexts, such as server log messages, timeline IDs are
1413 usually printed in decimal.)
1414 </para>
1416 <para>
1417 It is
1418 in fact possible to archive many different timelines. While that might
1419 seem like a useless feature, it's often a lifesaver. Consider the
1420 situation where you aren't quite sure what point-in-time to recover to,
1421 and so have to do several point-in-time recoveries by trial and error
1422 until you find the best place to branch off from the old history. Without
1423 timelines this process would soon generate an unmanageable mess. With
1424 timelines, you can recover to <emphasis>any</emphasis> prior state, including
1425 states in timeline branches that you abandoned earlier.
1426 </para>
1428 <para>
1429 Every time a new timeline is created, <productname>PostgreSQL</productname> creates
1430 a <quote>timeline history</quote> file that shows which timeline it branched
1431 off from and when. These history files are necessary to allow the system
1432 to pick the right WAL segment files when recovering from an archive that
1433 contains multiple timelines. Therefore, they are archived into the WAL
1434 archive area just like WAL segment files. The history files are just
1435 small text files, so it's cheap and appropriate to keep them around
1436 indefinitely (unlike the segment files which are large). You can, if
1437 you like, add comments to a history file to record your own notes about
1438 how and why this particular timeline was created. Such comments will be
1439 especially valuable when you have a thicket of different timelines as
1440 a result of experimentation.
1441 </para>
1443 <para>
1444 The default behavior of recovery is to recover to the latest timeline found
1445 in the archive. If you wish to recover to the timeline that was current
1446 when the base backup was taken or into a specific child timeline (that
1447 is, you want to return to some state that was itself generated after a
1448 recovery attempt), you need to specify <literal>current</literal> or the
1449 target timeline ID in <xref linkend="guc-recovery-target-timeline"/>. You
1450 cannot recover into timelines that branched off earlier than the base backup.
1451 </para>
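  <para>
   For example, to recover into a particular timeline you might set
   (the timeline ID shown here is hypothetical):
<programlisting>
recovery_target_timeline = '3'         # a specific timeline ID
# recovery_target_timeline = 'current' # the timeline of the base backup
# recovery_target_timeline = 'latest'  # the default behavior
</programlisting>
  </para>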
1452 </sect2>
1454 <sect2 id="backup-tips">
1455 <title>Tips and Examples</title>
1457 <para>
1458 Some tips for configuring continuous archiving are given here.
1459 </para>
1461 <sect3 id="backup-standalone">
1462 <title>Standalone Hot Backups</title>
1464 <para>
1465 It is possible to use <productname>PostgreSQL</productname>'s backup facilities to
1466 produce standalone hot backups. These are backups that cannot be used
1467 for point-in-time recovery, yet are typically much faster to back up and
1468 restore than <application>pg_dump</application> dumps. (They are also much larger
1469 than <application>pg_dump</application> dumps, so in some cases the speed advantage
1470 might be negated.)
1471 </para>
1473 <para>
1474 As with base backups, the easiest way to produce a standalone
1475 hot backup is to use the <xref linkend="app-pgbasebackup"/>
1476 tool. If you include the <literal>-X</literal> parameter when calling
1477 it, all the write-ahead log required to use the backup will be
1478 included in the backup automatically, and no special action is
1479 required to restore the backup.
1480 </para>
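  <para>
   A minimal sketch of such an invocation (the destination directory is
   only an example) might be:
<programlisting>
pg_basebackup -D /mnt/server/backups/standalone -X stream -Ft -z
</programlisting>
   Here <option>-Ft</option> and <option>-z</option> produce compressed
   tar-format output, which is convenient for a standalone backup that
   will simply be stored and unpacked later.
  </para>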
1481 </sect3>
1483 <sect3 id="compressed-archive-logs">
1484 <title>Compressed Archive Logs</title>
1486 <para>
1487 If archive storage size is a concern, you can use
1488 <application>gzip</application> to compress the archive files:
1489 <programlisting>
1490 archive_command = 'gzip &lt; %p &gt; /mnt/server/archivedir/%f.gz'
1491 </programlisting>
1492 You will then need to use <application>gunzip</application> during recovery:
1493 <programlisting>
1494 restore_command = 'gunzip &lt; /mnt/server/archivedir/%f.gz &gt; %p'
1495 </programlisting>
1496 </para>
1497 </sect3>
1499 <sect3 id="backup-scripts">
1500 <title><varname>archive_command</varname> Scripts</title>
1502 <para>
1503 Many people choose to use scripts to define their
1504 <varname>archive_command</varname>, so that their
1505 <filename>postgresql.conf</filename> entry looks very simple:
1506 <programlisting>
1507 archive_command = 'local_backup_script.sh "%p" "%f"'
1508 </programlisting>
1509 Using a separate script file is advisable any time you want to use
1510 more than a single command in the archiving process.
1511 This allows all complexity to be managed within the script, which
1512 can be written in a popular scripting language such as
1513 <application>bash</application> or <application>perl</application>.
1514 </para>
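  <para>
   Such a script must still honor the usual requirements for
   <varname>archive_command</varname>: it should refuse to overwrite a
   pre-existing archive file, and it should return a nonzero exit status
   on failure so the server will retry the file later.  A minimal sketch
   (the script name and archive directory are hypothetical):
<programlisting>
#!/bin/sh
# Hypothetical local_backup_script.sh, invoked as:
#   local_backup_script.sh "%p" "%f"
p="$1"    # path of the WAL file to archive
f="$2"    # file name of the WAL file
archivedir=/mnt/server/archivedir

# Never overwrite an existing archive file; report failure so the
# server keeps the WAL segment and retries later.
test ! -f "$archivedir/$f" || exit 1
cp "$p" "$archivedir/$f"
</programlisting>
  </para>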
1516 <para>
1517 Examples of requirements that might be solved within a script include:
1518 <itemizedlist>
1519 <listitem>
1520 <para>
1521 Copying data to secure off-site data storage
1522 </para>
1523 </listitem>
1524 <listitem>
1525 <para>
1526 Batching WAL files so that they are transferred every three hours,
1527 rather than one at a time
1528 </para>
1529 </listitem>
1530 <listitem>
1531 <para>
1532 Interfacing with other backup and recovery software
1533 </para>
1534 </listitem>
1535 <listitem>
1536 <para>
1537 Interfacing with monitoring software to report errors
1538 </para>
1539 </listitem>
1540 </itemizedlist>
1541 </para>
1543 <tip>
1544 <para>
1545 When using an <varname>archive_command</varname> script, it's desirable
1546 to enable <xref linkend="guc-logging-collector"/>.
1547 Any messages written to <systemitem>stderr</systemitem> from the script will then
1548 appear in the database server log, allowing complex configurations to
1549 be diagnosed easily if they fail.
1550 </para>
1551 </tip>
1552 </sect3>
1553 </sect2>
1555 <sect2 id="continuous-archiving-caveats">
1556 <title>Caveats</title>
1558 <para>
1559 At this writing, there are several limitations of the continuous archiving
1560 technique. These will probably be fixed in future releases:
1562 <itemizedlist>
1563 <listitem>
1564 <para>
1565 If a <link linkend="sql-createdatabase"><command>CREATE DATABASE</command></link>
1566 command is executed while a base backup is being taken, and then
1567 the template database that the <command>CREATE DATABASE</command> copied
1568 is modified while the base backup is still in progress, it is
1569 possible that recovery will cause those modifications to be
1570 propagated into the created database as well. This is of course
1571 undesirable. To avoid this risk, it is best not to modify any
1572 template databases while taking a base backup.
1573 </para>
1574 </listitem>
1576 <listitem>
1577 <para>
1578 <link linkend="sql-createtablespace"><command>CREATE TABLESPACE</command></link>
1579 commands are WAL-logged with the literal absolute path, and will
1580 therefore be replayed as tablespace creations with the same
1581 absolute path. This might be undesirable if the WAL is being
1582 replayed on a different machine. It can be dangerous even if the
1583 WAL is being replayed on the same machine, but into a new data
1584 directory: the replay will still overwrite the contents of the
1585 original tablespace. To avoid potential gotchas of this sort,
1586 the best practice is to take a new base backup after creating or
1587 dropping tablespaces.
1588 </para>
1589 </listitem>
1590 </itemizedlist>
1591 </para>
1593 <para>
1594 It should also be noted that the default <acronym>WAL</acronym>
1595 format is fairly bulky since it includes many disk page snapshots.
1596 These page snapshots are designed to support crash recovery, since
1597 we might need to fix partially-written disk pages. Depending on
1598 your system hardware and software, the risk of partial writes might
1599 be small enough to ignore, in which case you can significantly
1600 reduce the total volume of archived WAL files by turning off page
1601 snapshots using the <xref linkend="guc-full-page-writes"/>
1602 parameter. (Read the notes and warnings in <xref linkend="wal"/>
1603 before you do so.) Turning off page snapshots does not prevent
1604 use of the WAL for PITR operations. An area for future
1605 development is to compress archived WAL data by removing
1606 unnecessary page copies even when <varname>full_page_writes</varname> is
1607 on. In the meantime, administrators might wish to reduce the number
1608 of page snapshots included in WAL by increasing the checkpoint
1609 interval parameters as much as feasible.
1610 </para>
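  <para>
   As a sketch of the two alternatives just described, with arbitrary
   example values rather than recommendations:
<programlisting>
# Either: disable page snapshots entirely.  Do this only if your storage
# rules out partial page writes; see the warnings referenced above.
full_page_writes = off

# Or: leave full_page_writes on but checkpoint less often, so that fewer
# page snapshots are written into WAL.
checkpoint_timeout = 30min
max_wal_size = 4GB
</programlisting>
  </para>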
1611 </sect2>
1612 </sect1>
1614 </chapter>