<!-- doc/src/sgml/bgworker.sgml -->
<chapter id="bgworker">
 <title>Background Worker Processes</title>
 <indexterm zone="bgworker">
  <primary>Background workers</primary>
 </indexterm>
 <para>
  PostgreSQL can be extended to run user-supplied code in separate processes.
  Such processes are started, stopped and monitored by <command>postgres</command>,
  which permits them to have a lifetime closely linked to the server's status.
  These processes are attached to <productname>PostgreSQL</productname>'s
  shared memory area and have the option to connect to databases internally; they can also run
  multiple transactions serially, just like a regular client-connected server
  process.  Also, by linking to <application>libpq</application> they can connect to the
  server and behave like a regular client application.
 </para>
 <warning>
  <para>
   There are considerable robustness and security risks in using background
   worker processes because, being written in the <literal>C</literal> language,
   they have unrestricted access to data.  Administrators wishing to enable
   modules that include background worker processes should exercise extreme
   caution.  Only carefully audited modules should be permitted to run
   background worker processes.
  </para>
 </warning>
 <para>
  Background workers can be initialized at the time that
  <productname>PostgreSQL</productname> is started by including the module name in
  <varname>shared_preload_libraries</varname>.  A module wishing to run a background
  worker can register it by calling
  <function>RegisterBackgroundWorker(<type>BackgroundWorker</type>
  *<parameter>worker</parameter>)</function>
  from its <function>_PG_init()</function> function.
  Background workers can also be started
  after the system is up and running by calling
  <function>RegisterDynamicBackgroundWorker(<type>BackgroundWorker</type>
  *<parameter>worker</parameter>, <type>BackgroundWorkerHandle</type>
  **<parameter>handle</parameter>)</function>.  Unlike
  <function>RegisterBackgroundWorker</function>, which can only be called from
  within the postmaster process,
  <function>RegisterDynamicBackgroundWorker</function> must be called
  from a regular backend or another background worker.
 </para>
 <para>
  The structure <structname>BackgroundWorker</structname> is defined thus:
<programlisting>
typedef void (*bgworker_main_type)(Datum main_arg);
typedef struct BackgroundWorker
{
    char        bgw_name[BGW_MAXLEN];
    char        bgw_type[BGW_MAXLEN];
    int         bgw_flags;
    BgWorkerStartTime bgw_start_time;
    int         bgw_restart_time;       /* in seconds, or BGW_NEVER_RESTART */
    char        bgw_library_name[MAXPGPATH];
    char        bgw_function_name[BGW_MAXLEN];
    Datum       bgw_main_arg;
    char        bgw_extra[BGW_EXTRALEN];
    pid_t       bgw_notify_pid;
} BackgroundWorker;
</programlisting>
 </para>
 <para>
  <structfield>bgw_name</structfield> and <structfield>bgw_type</structfield> are
  strings to be used in log messages, process listings and similar contexts.
  <structfield>bgw_type</structfield> should be the same for all background
  workers of the same type, so that it is possible to group such workers in a
  process listing, for example.  <structfield>bgw_name</structfield> on the
  other hand can contain additional information about the specific process.
  (Typically, the string for <structfield>bgw_name</structfield> will contain
  the type somehow, but that is not strictly required.)
 </para>
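 <para>
  As a minimal sketch of the naming convention described above, the
  following helper fills both strings so that the name contains the type plus
  a per-process index.  The helper <function>set_worker_names</function> and
  the name format are hypothetical, and <literal>BGW_MAXLEN</literal> is
  redefined locally (with an assumed value) only so that the fragment compiles
  without the server headers.
 </para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the constant from bgworker.h; the value is an assumption. */
#define BGW_MAXLEN 96

/*
 * Hypothetical helper: store the shared type string in bgw_type and a
 * more specific "<type> worker <index>" string in bgw_name.  Both buffers
 * must be BGW_MAXLEN bytes long.  Returns false if the name was truncated.
 */
bool
set_worker_names(char *bgw_name, char *bgw_type, const char *type, int index)
{
    int         n;

    assert(bgw_name != NULL && bgw_type != NULL && type != NULL);

    /* Start from zeroed buffers, as one would for the whole struct. */
    memset(bgw_name, 0, BGW_MAXLEN);
    memset(bgw_type, 0, BGW_MAXLEN);

    /* snprintf never writes past the buffer and always NUL-terminates. */
    snprintf(bgw_type, BGW_MAXLEN, "%s", type);
    n = snprintf(bgw_name, BGW_MAXLEN, "%s worker %d", type, index);

    return n >= 0 && n < BGW_MAXLEN;
}
```

 <para>
  In a real module the two destination buffers would be the
  <structfield>bgw_name</structfield> and <structfield>bgw_type</structfield>
  members of a zero-initialized <structname>BackgroundWorker</structname>.
 </para>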
 <para>
  <structfield>bgw_flags</structfield> is a bitwise-or'd bit mask indicating the
  capabilities that the module wants.  Possible values are:
  <variablelist>
   <varlistentry>
    <term><literal>BGWORKER_SHMEM_ACCESS</literal></term>
    <listitem>
     <para>
      <indexterm><primary>BGWORKER_SHMEM_ACCESS</primary></indexterm>
      Requests shared memory access.  This flag is required.
     </para>
    </listitem>
   </varlistentry>
   <varlistentry>
    <term><literal>BGWORKER_BACKEND_DATABASE_CONNECTION</literal></term>
    <listitem>
     <para>
      <indexterm><primary>BGWORKER_BACKEND_&zwsp;DATABASE_CONNECTION</primary></indexterm>
      Requests the ability to establish a database connection through which it
      can later run transactions and queries.  A background worker using
      <literal>BGWORKER_BACKEND_DATABASE_CONNECTION</literal> to connect to a
      database must also attach shared memory using
      <literal>BGWORKER_SHMEM_ACCESS</literal>, or worker start-up will fail.
     </para>
    </listitem>
   </varlistentry>
  </variablelist>
 </para>
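 <para>
  The two rules above can be checked before registration.  The following is
  only a sketch: <function>check_bgw_flags</function> is a hypothetical
  helper, not a server function, and the flag values are local stand-ins
  (assumed to match <filename>bgworker.h</filename>) so that the fragment
  compiles on its own.
 </para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the flag bits from bgworker.h; the values are assumptions. */
#define BGWORKER_SHMEM_ACCESS                0x0001
#define BGWORKER_BACKEND_DATABASE_CONNECTION 0x0002

/* The flags are combined with bitwise OR, so their bits must be distinct. */
static_assert((BGWORKER_SHMEM_ACCESS &
               BGWORKER_BACKEND_DATABASE_CONNECTION) == 0,
              "flag bits must not overlap");

/*
 * Hypothetical helper: returns NULL if bgw_flags satisfies the rules
 * described above, otherwise a short description of the problem.
 */
const char *
check_bgw_flags(int bgw_flags)
{
    bool        wants_shmem = (bgw_flags & BGWORKER_SHMEM_ACCESS) != 0;
    bool        wants_db =
        (bgw_flags & BGWORKER_BACKEND_DATABASE_CONNECTION) != 0;

    if (wants_db && !wants_shmem)
        return "database connection requires shared memory access";
    if (!wants_shmem)
        return "shared memory access is required";
    return NULL;
}
```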
 <para>
  <structfield>bgw_start_time</structfield> is the server state during which
  <command>postgres</command> should start the process; it can be one of
  <literal>BgWorkerStart_PostmasterStart</literal> (start as soon as
  <command>postgres</command> itself has finished its own initialization; processes
  requesting this are not eligible for database connections),
  <literal>BgWorkerStart_ConsistentState</literal> (start as soon as a consistent state
  has been reached in a hot standby, allowing processes to connect to
  databases and run read-only queries), and
  <literal>BgWorkerStart_RecoveryFinished</literal> (start as soon as the system has
  entered normal read-write state).  The last two values are equivalent
  in a server that is not a hot standby.  This setting only indicates
  when the processes are to be started; they do not stop when a different state
  is reached.
 </para>
 <para>
  <structfield>bgw_restart_time</structfield> is the interval, in seconds, that
  <command>postgres</command> should wait before restarting the process in
  the event that it crashes.  It can be any positive value,
  or <literal>BGW_NEVER_RESTART</literal>, indicating not to restart the
  process in case of a crash.
 </para>
 <para>
  <structfield>bgw_library_name</structfield> is the name of a library in
  which the initial entry point for the background worker should be sought.
  The named library will be dynamically loaded by the worker process and
  <structfield>bgw_function_name</structfield> will be used to identify the
  function to be called.  If calling a function in the core code, this must
  be set to <literal>"postgres"</literal>.
 </para>
 <para>
  <structfield>bgw_function_name</structfield> is the name of the function
  to use as the initial entry point for the new background worker.  If
  this function is in a dynamically loaded library, it must be marked
  <literal>PGDLLEXPORT</literal> (and not <literal>static</literal>).
 </para>
 <para>
  <structfield>bgw_main_arg</structfield> is the <type>Datum</type> argument
  to the background worker main function.  This main function should take a
  single argument of type <type>Datum</type> and return <type>void</type>.
  <structfield>bgw_main_arg</structfield> will be passed as the argument.
  In addition, the global variable <literal>MyBgworkerEntry</literal>
  points to a copy of the <structname>BackgroundWorker</structname> structure
  passed at registration time; the worker may find it helpful to examine
  this structure.
 </para>
 <para>
  On Windows (and anywhere else where <literal>EXEC_BACKEND</literal> is
  defined) or in dynamic background workers it is not safe to pass a
  <type>Datum</type> by reference, only by value.  If an argument is required, it
  is safest to pass an <type>int32</type> or other small value and use that as an index
  into an array allocated in shared memory.  If a by-reference value such as a
  <type>cstring</type> or <type>text</type> is passed, the pointer will not be
  valid in the new background worker process.
 </para>
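 <para>
  The index-passing approach might look roughly like the following sketch.
  Everything here is illustrative: <type>Datum</type> and the two conversion
  macros are local stand-ins for the server definitions, the slot table and
  both helper functions are hypothetical, and an ordinary static array takes
  the place of an array that a real module would allocate in shared memory.
 </para>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for server definitions (assumptions, not the real headers). */
typedef uintptr_t Datum;
#define Int32GetDatum(x) ((Datum) (int32_t) (x))
#define DatumGetInt32(d) ((int32_t) (d))

#define MAX_SLOTS     8
#define SLOT_TEXT_LEN 64

/* One entry per worker; holds the value that cannot be passed by pointer. */
typedef struct WorkerSlot
{
    char        text[SLOT_TEXT_LEN];
} WorkerSlot;

/* In a real module this array would live in shared memory. */
static WorkerSlot slots[MAX_SLOTS];

/* Registering side: copy the string into a slot, pass only the index. */
Datum
pack_worker_arg(int32_t slot, const char *text)
{
    assert(slot >= 0 && slot < MAX_SLOTS);
    assert(text != NULL && strlen(text) < SLOT_TEXT_LEN);

    strcpy(slots[slot].text, text);
    return Int32GetDatum(slot);     /* a by-value Datum, safe to pass */
}

/* Worker side: turn the by-value argument back into the stored string. */
const char *
unpack_worker_arg(Datum main_arg)
{
    int32_t     slot = DatumGetInt32(main_arg);

    assert(slot >= 0 && slot < MAX_SLOTS);
    return slots[slot].text;
}
```

 <para>
  The registering process would store the result of
  <function>pack_worker_arg</function> in
  <structfield>bgw_main_arg</structfield>, and the worker's main function
  would call <function>unpack_worker_arg</function> on its argument.
 </para>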
 <para>
  <structfield>bgw_extra</structfield> can contain extra data to be passed
  to the background worker.  Unlike <structfield>bgw_main_arg</structfield>, this data
  is not passed as an argument to the worker's main function, but it can be
  accessed via <literal>MyBgworkerEntry</literal>, as discussed above.
 </para>
 <para>
  <structfield>bgw_notify_pid</structfield> is the PID of a PostgreSQL
  backend process to which the postmaster should send <literal>SIGUSR1</literal>
  when the process is started or exits.  It should be 0 for workers registered
  at postmaster startup time, or when the backend registering the worker does
  not wish to wait for the worker to start up.  Otherwise, it should be
  initialized to <literal>MyProcPid</literal>.
 </para>
 <para>Once running, the process can connect to a database by calling
  <function>BackgroundWorkerInitializeConnection(<parameter>char *dbname</parameter>, <parameter>char *username</parameter>, <parameter>uint32 flags</parameter>)</function> or
  <function>BackgroundWorkerInitializeConnectionByOid(<parameter>Oid dboid</parameter>, <parameter>Oid useroid</parameter>, <parameter>uint32 flags</parameter>)</function>.
  This allows the process to run transactions and queries using the
  <literal>SPI</literal> interface.  If <varname>dbname</varname> is NULL or
  <varname>dboid</varname> is <literal>InvalidOid</literal>, the session is not connected
  to any particular database, but shared catalogs can be accessed.
  If <varname>username</varname> is NULL or <varname>useroid</varname> is
  <literal>InvalidOid</literal>, the process will run as the superuser created
  during <command>initdb</command>.  If <literal>BGWORKER_BYPASS_ALLOWCONN</literal>
  is specified in <varname>flags</varname>, the worker can connect to
  databases that do not allow user connections.
  If <literal>BGWORKER_BYPASS_ROLELOGINCHECK</literal> is specified in
  <varname>flags</varname>, the login check for the role used to connect
  to databases is bypassed.
  A background worker can only call one of these two functions, and only
  once.  It is not possible to switch databases.
 </para>
 <para>
  Signals are initially blocked when control reaches the
  background worker's main function, and must be unblocked by it; this is to
  allow the process to customize its signal handlers, if necessary.
  Signals can be unblocked in the new process by calling
  <function>BackgroundWorkerUnblockSignals</function> and blocked by calling
  <function>BackgroundWorkerBlockSignals</function>.
 </para>
 <para>
  If <structfield>bgw_restart_time</structfield> for a background worker is
  configured as <literal>BGW_NEVER_RESTART</literal>, or if it exits with an exit
  code of 0 or is terminated by <function>TerminateBackgroundWorker</function>,
  it will be automatically unregistered by the postmaster on exit.
  Otherwise, it will be restarted after the time period configured via
  <structfield>bgw_restart_time</structfield>, or immediately if the postmaster
  reinitializes the cluster due to a backend failure.  A background worker that
  needs to suspend execution only temporarily should use an interruptible sleep
  rather than exiting; this can be achieved by calling
  <function>WaitLatch()</function>.  Make sure the
  <literal>WL_POSTMASTER_DEATH</literal> flag is set when calling that function, and
  check the return value so that the worker exits promptly in the emergency case that
  <command>postgres</command> itself has terminated.
 </para>
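 <para>
  The unregister-or-restart rule described above can be summarized as a small
  decision function.  This is a model of the prose only, not the postmaster's
  actual code: the enum, the function
  <function>action_after_exit</function> and its parameters are hypothetical,
  and <literal>BGW_NEVER_RESTART</literal> is redefined locally with an
  assumed value so that the fragment compiles on its own.
 </para>

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the constant from bgworker.h; the value is an assumption. */
#define BGW_NEVER_RESTART (-1)

/* The sentinel must not collide with a valid (positive) interval. */
static_assert(BGW_NEVER_RESTART < 0, "sentinel must be negative");

/* Hypothetical result type for this sketch. */
typedef enum WorkerExitAction
{
    WORKER_UNREGISTER,              /* forget the worker */
    WORKER_RESTART_AFTER_DELAY,     /* wait bgw_restart_time seconds */
    WORKER_RESTART_IMMEDIATELY      /* cluster was reinitialized */
} WorkerExitAction;

/*
 * Hypothetical model of what happens to a worker once it has exited,
 * following the rules stated in the paragraph above.
 */
WorkerExitAction
action_after_exit(int bgw_restart_time, int exit_code,
                  bool terminated_by_request, bool cluster_reinitialized)
{
    /* Any of these three conditions leads to unregistration. */
    if (bgw_restart_time == BGW_NEVER_RESTART ||
        exit_code == 0 ||
        terminated_by_request)
        return WORKER_UNREGISTER;

    /* After a backend failure the restart interval is not waited out. */
    if (cluster_reinitialized)
        return WORKER_RESTART_IMMEDIATELY;

    return WORKER_RESTART_AFTER_DELAY;
}
```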
 <para>
  When a background worker is registered using the
  <function>RegisterDynamicBackgroundWorker</function> function, it is
  possible for the backend performing the registration to obtain information
  regarding the status of the worker.  Backends wishing to do this should
  pass the address of a <type>BackgroundWorkerHandle *</type> as the second
  argument to <function>RegisterDynamicBackgroundWorker</function>.  If the
  worker is successfully registered, this pointer will be initialized with an
  opaque handle that can subsequently be passed to
  <function>GetBackgroundWorkerPid(<parameter>BackgroundWorkerHandle *</parameter>, <parameter>pid_t *</parameter>)</function> or
  <function>TerminateBackgroundWorker(<parameter>BackgroundWorkerHandle *</parameter>)</function>.
  <function>GetBackgroundWorkerPid</function> can be used to poll the status of the
  worker: a return value of <literal>BGWH_NOT_YET_STARTED</literal> indicates that
  the worker has not yet been started by the postmaster;
  <literal>BGWH_STOPPED</literal> indicates that it has been started but is
  no longer running; and <literal>BGWH_STARTED</literal> indicates that it is
  currently running.  In this last case, the PID will also be returned via the
  second argument.
  <function>TerminateBackgroundWorker</function> causes the postmaster to send
  <literal>SIGTERM</literal> to the worker if it is running, and to unregister it
  as soon as it is not.
 </para>
 <para>
  In some cases, a process which registers a background worker may wish to
  wait for the worker to start up.  This can be accomplished by initializing
  <structfield>bgw_notify_pid</structfield> to <literal>MyProcPid</literal> and
  then passing the <type>BackgroundWorkerHandle *</type> obtained at
  registration time to the
  <function>WaitForBackgroundWorkerStartup(<parameter>BackgroundWorkerHandle
  *handle</parameter>, <parameter>pid_t *</parameter>)</function> function.
  This function will block until the postmaster has attempted to start the
  background worker, or until the postmaster dies.  If the background worker
  is running, the return value will be <literal>BGWH_STARTED</literal>, and
  the PID will be written to the provided address.  Otherwise, the return
  value will be <literal>BGWH_STOPPED</literal> or
  <literal>BGWH_POSTMASTER_DIED</literal>.
 </para>
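 <para>
  A caller typically has to distinguish the status codes named above before
  using the returned PID.  The following sketch shows one way to do that.  The
  enum is a local stand-in that mirrors the status names used in this chapter
  (its numeric values are an assumption), and
  <function>worker_started</function> is a hypothetical helper rather than
  part of the server API.
 </para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the status type from bgworker.h (values are assumptions). */
typedef enum BgwHandleStatus
{
    BGWH_STARTED,
    BGWH_NOT_YET_STARTED,
    BGWH_STOPPED,
    BGWH_POSTMASTER_DIED
} BgwHandleStatus;

/*
 * Hypothetical helper: returns true only if the status says the worker is
 * running, which is the only case in which the PID output is meaningful.
 * *reason receives a short description suitable for a log message.
 */
bool
worker_started(BgwHandleStatus status, const char **reason)
{
    assert(reason != NULL);

    switch (status)
    {
        case BGWH_STARTED:
            *reason = "worker is running";
            return true;
        case BGWH_NOT_YET_STARTED:
            *reason = "worker has not been started yet";
            return false;
        case BGWH_STOPPED:
            *reason = "worker was started but is no longer running";
            return false;
        case BGWH_POSTMASTER_DIED:
            *reason = "postmaster died";
            return false;
    }

    *reason = "unknown status";
    return false;
}
```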
 <para>
  A process can also wait for a background worker to shut down, by using the
  <function>WaitForBackgroundWorkerShutdown(<parameter>BackgroundWorkerHandle
  *handle</parameter>)</function> function and passing the
  <type>BackgroundWorkerHandle *</type> obtained at registration.  This
  function will block until the background worker exits or the postmaster dies.
  When the background worker exits, the return value is
  <literal>BGWH_STOPPED</literal>; if the postmaster dies, it is
  <literal>BGWH_POSTMASTER_DIED</literal>.
 </para>
 <para>
  Background workers can send asynchronous notification messages, either by
  using the <command>NOTIFY</command> command via <acronym>SPI</acronym>,
  or directly via <function>Async_Notify()</function>.  Such notifications
  will be sent at transaction commit.
  Background workers should not register to receive asynchronous
  notifications with the <command>LISTEN</command> command, as there is no
  infrastructure for a worker to consume such notifications.
 </para>
 <para>
  The <filename>src/test/modules/worker_spi</filename> module
  contains a working example,
  which demonstrates some useful techniques.
 </para>
 <para>
  The maximum number of registered background workers is limited by
  <xref linkend="guc-max-worker-processes"/>.
 </para>
</chapter>