Repair memory leaks in plpython.
[pgsql.git] / doc / src / sgml / plpython.sgml
blobbee817ea822a29e30a83415ee87addb03ecbb9e4
1 <!-- doc/src/sgml/plpython.sgml -->
3 <chapter id="plpython">
4 <title>PL/Python &mdash; Python Procedural Language</title>
6 <indexterm zone="plpython"><primary>PL/Python</primary></indexterm>
7 <indexterm zone="plpython"><primary>Python</primary></indexterm>
9 <para>
10 The <application>PL/Python</application> procedural language allows
11 <productname>PostgreSQL</productname> functions and procedures to be written in the
12 <ulink url="https://www.python.org">Python language</ulink>.
13 </para>
15 <para>
16 To install PL/Python in a particular database, use
17 <literal>CREATE EXTENSION plpython3u</literal>.
18 </para>
20 <tip>
21 <para>
22 If a language is installed into <literal>template1</literal>, all subsequently
23 created databases will have the language installed automatically.
24 </para>
25 </tip>
27 <para>
28 PL/Python is only available as an <quote>untrusted</quote> language, meaning
29 it does not offer any way of restricting what users can do in it and
30 is therefore named <literal>plpython3u</literal>. A trusted
31 variant <literal>plpython</literal> might become available in the future
32 if a secure execution mechanism is developed in Python. The
33 writer of a function in untrusted PL/Python must take care that the
34 function cannot be used to do anything unwanted, since it will be
35 able to do anything that could be done by a user logged in as the
36 database administrator. Only superusers can create functions in
37 untrusted languages such as <literal>plpython3u</literal>.
38 </para>
40 <note>
41 <para>
42 Users of source packages must specially enable the build of
43 PL/Python during the installation process. (Refer to the
44 installation instructions for more information.) Users of binary
45 packages might find PL/Python in a separate subpackage.
46 </para>
47 </note>
49 <sect1 id="plpython-funcs">
50 <title>PL/Python Functions</title>
52 <para>
53 Functions in PL/Python are declared via the
54 standard <xref linkend="sql-createfunction"/> syntax:
56 <programlisting>
57 CREATE FUNCTION <replaceable>funcname</replaceable> (<replaceable>argument-list</replaceable>)
58 RETURNS <replaceable>return-type</replaceable>
59 AS $$
60 # PL/Python function body
61 $$ LANGUAGE plpython3u;
62 </programlisting>
63 </para>
65 <para>
66 The body of a function is simply a Python script. When the function
67 is called, its arguments are passed as elements of the list
68 <varname>args</varname>; named arguments are also passed as
69 ordinary variables to the Python script. Use of named arguments is
70 usually more readable. The result is returned from the Python code
71 in the usual way, with <literal>return</literal> or
72 <literal>yield</literal> (in case of a result-set statement). If
73 you do not provide a return value, Python returns the default
74 <symbol>None</symbol>. <application>PL/Python</application> translates
75 Python's <symbol>None</symbol> into the SQL null value. In a procedure,
76 the result from the Python code must be <symbol>None</symbol> (typically
77 achieved by ending the procedure without a <literal>return</literal>
78 statement or by using a <literal>return</literal> statement without
79 argument); otherwise, an error will be raised.
80 </para>
82 <para>
83 For example, a function to return the greater of two integers can be
84 defined as:
86 <programlisting>
87 CREATE FUNCTION pymax (a integer, b integer)
88 RETURNS integer
89 AS $$
90 if a &gt; b:
91 return a
92 return b
93 $$ LANGUAGE plpython3u;
94 </programlisting>
96 The Python code that is given as the body of the function definition
97 is transformed into a Python function. For example, the above results in:
99 <programlisting>
100 def __plpython_procedure_pymax_23456():
101 if a &gt; b:
102 return a
103 return b
104 </programlisting>
106 assuming that 23456 is the OID assigned to the function by
107 <productname>PostgreSQL</productname>.
108 </para>
110 <para>
111 The arguments are set as global variables. Because of the scoping
112 rules of Python, this has the subtle consequence that an argument
113 variable cannot be reassigned inside the function to the value of
114 an expression that involves the variable name itself, unless the
115 variable is redeclared as global in the block. For example, the
116 following won't work:
117 <programlisting>
118 CREATE FUNCTION pystrip(x text)
119 RETURNS text
120 AS $$
121 x = x.strip() # error
122 return x
123 $$ LANGUAGE plpython3u;
124 </programlisting>
125 because assigning to <varname>x</varname>
126 makes <varname>x</varname> a local variable for the entire block,
127 and so the <varname>x</varname> on the right-hand side of the
128 assignment refers to a not-yet-assigned local
129 variable <varname>x</varname>, not the PL/Python function
130 parameter. Using the <literal>global</literal> statement, this can
131 be made to work:
132 <programlisting>
133 CREATE FUNCTION pystrip(x text)
134 RETURNS text
135 AS $$
136 global x
137 x = x.strip() # ok now
138 return x
139 $$ LANGUAGE plpython3u;
140 </programlisting>
141 But it is advisable not to rely on this implementation detail of
142 PL/Python. It is better to treat the function parameters as
143 read-only.
144 </para>
145 </sect1>
147 <sect1 id="plpython-data">
148 <title>Data Values</title>
149 <para>
150 Generally speaking, the aim of PL/Python is to provide
151 a <quote>natural</quote> mapping between the PostgreSQL and the
152 Python worlds. This informs the data mapping rules described
153 below.
154 </para>
156 <sect2 id="plpython-data-type-mapping">
157 <title>Data Type Mapping</title>
158 <para>
159 When a PL/Python function is called, its arguments are converted from
160 their PostgreSQL data type to a corresponding Python type:
162 <itemizedlist>
163 <listitem>
164 <para>
165 PostgreSQL <type>boolean</type> is converted to Python <type>bool</type>.
166 </para>
167 </listitem>
169 <listitem>
170 <para>
171 PostgreSQL <type>smallint</type>, <type>int</type>, <type>bigint</type>
172 and <type>oid</type> are converted to Python <type>int</type>.
173 </para>
174 </listitem>
176 <listitem>
177 <para>
178 PostgreSQL <type>real</type> and <type>double</type> are converted to
179 Python <type>float</type>.
180 </para>
181 </listitem>
183 <listitem>
184 <para>
185 PostgreSQL <type>numeric</type> is converted to
186 Python <type>Decimal</type>. This type is imported from
187 the <literal>cdecimal</literal> package if that is available.
188 Otherwise,
189 <literal>decimal.Decimal</literal> from the standard library will be
190 used. <literal>cdecimal</literal> is significantly faster
191 than <literal>decimal</literal>. In Python 3.3 and up,
192 however, <literal>cdecimal</literal> has been integrated into the
193 standard library under the name <literal>decimal</literal>, so there is
194 no longer any difference.
195 </para>
196 </listitem>
198 <listitem>
199 <para>
200 PostgreSQL <type>bytea</type> is converted to Python <type>bytes</type>.
201 </para>
202 </listitem>
204 <listitem>
205 <para>
206 All other data types, including the PostgreSQL character string types,
207 are converted to a Python <type>str</type> (in Unicode like all Python
208 strings).
209 </para>
210 </listitem>
212 <listitem>
213 <para>
214 For nonscalar data types, see below.
215 </para>
216 </listitem>
217 </itemizedlist>
218 </para>
220 <para>
221 When a PL/Python function returns, its return value is converted to the
222 function's declared PostgreSQL return data type as follows:
224 <itemizedlist>
225 <listitem>
226 <para>
227 When the PostgreSQL return type is <type>boolean</type>, the
228 return value will be evaluated for truth according to the
229 <emphasis>Python</emphasis> rules. That is, 0 and empty string
230 are false, but notably <literal>'f'</literal> is true.
231 </para>
232 </listitem>
234 <listitem>
235 <para>
236 When the PostgreSQL return type is <type>bytea</type>, the return value
237 will be converted to Python <type>bytes</type> using the respective
238 Python built-ins, with the result being converted to
239 <type>bytea</type>.
240 </para>
241 </listitem>
243 <listitem>
244 <para>
245 For all other PostgreSQL return types, the return value is converted
246 to a string using the Python built-in <literal>str</literal>, and the
247 result is passed to the input function of the PostgreSQL data type.
248 (If the Python value is a <type>float</type>, it is converted using
249 the <literal>repr</literal> built-in instead of <literal>str</literal>, to
250 avoid loss of precision.)
251 </para>
253 <para>
254 Strings are automatically converted to the PostgreSQL server encoding
255 when they are passed to PostgreSQL.
256 </para>
257 </listitem>
259 <listitem>
260 <para>
261 For nonscalar data types, see below.
262 </para>
263 </listitem>
264 </itemizedlist>
266 Note that logical mismatches between the declared PostgreSQL
267 return type and the Python data type of the actual return object
268 are not flagged; the value will be converted in any case.
269 </para>
270 </sect2>
272 <sect2 id="plpython-data-null">
273 <title>Null, None</title>
274 <para>
275 If an SQL null value<indexterm><primary>null value</primary><secondary
276 sortas="PL/Python">in PL/Python</secondary></indexterm> is passed to a
277 function, the argument value will appear as <symbol>None</symbol> in
278 Python. For example, the function definition of <function>pymax</function>
279 shown in <xref linkend="plpython-funcs"/> will return the wrong answer for null
280 inputs. We could add <literal>STRICT</literal> to the function definition
281 to make <productname>PostgreSQL</productname> do something more reasonable:
282 if a null value is passed, the function will not be called at all,
283 but will just return a null result automatically. Alternatively,
284 we could check for null inputs in the function body:
286 <programlisting>
287 CREATE FUNCTION pymax (a integer, b integer)
288 RETURNS integer
289 AS $$
290 if (a is None) or (b is None):
291 return None
292 if a &gt; b:
293 return a
294 return b
295 $$ LANGUAGE plpython3u;
296 </programlisting>
298 As shown above, to return an SQL null value from a PL/Python
299 function, return the value <symbol>None</symbol>. This can be done whether the
300 function is strict or not.
301 </para>
302 </sect2>
304 <sect2 id="plpython-arrays">
305 <title>Arrays, Lists</title>
306 <para>
307 SQL array values are passed into PL/Python as a Python list. To
308 return an SQL array value out of a PL/Python function, return a
309 Python list:
311 <programlisting>
312 CREATE FUNCTION return_arr()
313 RETURNS int[]
314 AS $$
315 return [1, 2, 3, 4, 5]
316 $$ LANGUAGE plpython3u;
318 SELECT return_arr();
319 return_arr
320 -------------
321 {1,2,3,4,5}
322 (1 row)
323 </programlisting>
325 Multidimensional arrays are passed into PL/Python as nested Python lists.
326 A 2-dimensional array is a list of lists, for example. When returning
327 a multi-dimensional SQL array out of a PL/Python function, the inner
328 lists at each level must all be of the same size. For example:
330 <programlisting>
331 CREATE FUNCTION test_type_conversion_array_int4(x int4[]) RETURNS int4[] AS $$
332 plpy.info(x, type(x))
333 return x
334 $$ LANGUAGE plpython3u;
336 SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
337 INFO: ([[1, 2, 3], [4, 5, 6]], &lt;type 'list'&gt;)
338 test_type_conversion_array_int4
339 ---------------------------------
340 {{1,2,3},{4,5,6}}
341 (1 row)
342 </programlisting>
344 Other Python sequences, like tuples, are also accepted for
345 backwards-compatibility with PostgreSQL versions 9.6 and below, when
346 multi-dimensional arrays were not supported. However, they are always
347 treated as one-dimensional arrays, because they are ambiguous with
348 composite types. For the same reason, when a composite type is used in a
349 multi-dimensional array, it must be represented by a tuple, rather than a
350 list.
351 </para>
352 <para>
353 Note that in Python, strings are sequences, which can have
354 undesirable effects that might be familiar to Python programmers:
356 <programlisting>
357 CREATE FUNCTION return_str_arr()
358 RETURNS varchar[]
359 AS $$
360 return "hello"
361 $$ LANGUAGE plpython3u;
363 SELECT return_str_arr();
364 return_str_arr
365 ----------------
366 {h,e,l,l,o}
367 (1 row)
368 </programlisting>
369 </para>
370 </sect2>
372 <sect2 id="plpython-data-composite-types">
373 <title>Composite Types</title>
374 <para>
375 Composite-type arguments are passed to the function as Python mappings. The
376 element names of the mapping are the attribute names of the composite type.
377 If an attribute in the passed row has the null value, it has the value
378 <symbol>None</symbol> in the mapping. Here is an example:
380 <programlisting>
381 CREATE TABLE employee (
382 name text,
383 salary integer,
384 age integer
387 CREATE FUNCTION overpaid (e employee)
388 RETURNS boolean
389 AS $$
390 if e["salary"] &gt; 200000:
391 return True
392 if (e["age"] &lt; 30) and (e["salary"] &gt; 100000):
393 return True
394 return False
395 $$ LANGUAGE plpython3u;
396 </programlisting>
397 </para>
399 <para>
400 There are multiple ways to return row or composite types from a Python
401 function. The following examples assume we have:
403 <programlisting>
404 CREATE TYPE named_value AS (
405 name text,
406 value integer
408 </programlisting>
410 A composite result can be returned as a:
412 <variablelist>
413 <varlistentry>
414 <term>Sequence type (a tuple or list, but not a set because
415 it is not indexable)</term>
416 <listitem>
417 <para>
418 Returned sequence objects must have the same number of items as the
419 composite result type has fields. The item with index 0 is assigned to
420 the first field of the composite type, 1 to the second and so on. For
421 example:
423 <programlisting>
424 CREATE FUNCTION make_pair (name text, value integer)
425 RETURNS named_value
426 AS $$
427 return ( name, value )
428 # or alternatively, as list: return [ name, value ]
429 $$ LANGUAGE plpython3u;
430 </programlisting>
432 To return an SQL null for any column, insert <symbol>None</symbol> at
433 the corresponding position.
434 </para>
435 <para>
436 When an array of composite types is returned, it cannot be returned as a list,
437 because it is ambiguous whether the Python list represents a composite type,
438 or another array dimension.
439 </para>
440 </listitem>
441 </varlistentry>
443 <varlistentry>
444 <term>Mapping (dictionary)</term>
445 <listitem>
446 <para>
447 The value for each result type column is retrieved from the mapping
448 with the column name as key. Example:
450 <programlisting>
451 CREATE FUNCTION make_pair (name text, value integer)
452 RETURNS named_value
453 AS $$
454 return { "name": name, "value": value }
455 $$ LANGUAGE plpython3u;
456 </programlisting>
458 Any extra dictionary key/value pairs are ignored. Missing keys are
459 treated as errors.
460 To return an SQL null value for any column, insert
461 <symbol>None</symbol> with the corresponding column name as the key.
462 </para>
463 </listitem>
464 </varlistentry>
466 <varlistentry>
467 <term>Object (any object providing method <literal>__getattr__</literal>)</term>
468 <listitem>
469 <para>
470 This works the same as a mapping.
471 Example:
473 <programlisting>
474 CREATE FUNCTION make_pair (name text, value integer)
475 RETURNS named_value
476 AS $$
477 class named_value:
478 def __init__ (self, n, v):
479 self.name = n
480 self.value = v
481 return named_value(name, value)
483 # or simply
484 class nv: pass
485 nv.name = name
486 nv.value = value
487 return nv
488 $$ LANGUAGE plpython3u;
489 </programlisting>
490 </para>
491 </listitem>
492 </varlistentry>
493 </variablelist>
494 </para>
496 <para>
497 Functions with <literal>OUT</literal> parameters are also supported. For example:
498 <programlisting>
499 CREATE FUNCTION multiout_simple(OUT i integer, OUT j integer) AS $$
500 return (1, 2)
501 $$ LANGUAGE plpython3u;
503 SELECT * FROM multiout_simple();
504 </programlisting>
505 </para>
507 <para>
508 Output parameters of procedures are passed back the same way. For example:
509 <programlisting>
510 CREATE PROCEDURE python_triple(INOUT a integer, INOUT b integer) AS $$
511 return (a * 3, b * 3)
512 $$ LANGUAGE plpython3u;
514 CALL python_triple(5, 10);
515 </programlisting>
516 </para>
517 </sect2>
519 <sect2 id="plpython-data-set-returning-funcs">
520 <title>Set-Returning Functions</title>
521 <para>
522 A <application>PL/Python</application> function can also return sets of
523 scalar or composite types. There are several ways to achieve this because
524 the returned object is internally turned into an iterator. The following
525 examples assume we have composite type:
527 <programlisting>
528 CREATE TYPE greeting AS (
529 how text,
530 who text
532 </programlisting>
534 A set result can be returned from a:
536 <variablelist>
537 <varlistentry>
538 <term>Sequence type (tuple, list, set)</term>
539 <listitem>
540 <para>
541 <programlisting>
542 CREATE FUNCTION greet (how text)
543 RETURNS SETOF greeting
544 AS $$
545 # return tuple containing lists as composite types
546 # all other combinations work also
547 return ( [ how, "World" ], [ how, "PostgreSQL" ], [ how, "PL/Python" ] )
548 $$ LANGUAGE plpython3u;
549 </programlisting>
550 </para>
551 </listitem>
552 </varlistentry>
554 <varlistentry>
555 <term>Iterator (any object providing <symbol>__iter__</symbol> and
556 <symbol>__next__</symbol> methods)</term>
557 <listitem>
558 <para>
559 <programlisting>
560 CREATE FUNCTION greet (how text)
561 RETURNS SETOF greeting
562 AS $$
563 class producer:
564 def __init__ (self, how, who):
565 self.how = how
566 self.who = who
567 self.ndx = -1
569 def __iter__ (self):
570 return self
572 def __next__(self):
573 self.ndx += 1
574 if self.ndx == len(self.who):
575 raise StopIteration
576 return ( self.how, self.who[self.ndx] )
578 return producer(how, [ "World", "PostgreSQL", "PL/Python" ])
579 $$ LANGUAGE plpython3u;
580 </programlisting>
581 </para>
582 </listitem>
583 </varlistentry>
585 <varlistentry>
586 <term>Generator (<literal>yield</literal>)</term>
587 <listitem>
588 <para>
589 <programlisting>
590 CREATE FUNCTION greet (how text)
591 RETURNS SETOF greeting
592 AS $$
593 for who in [ "World", "PostgreSQL", "PL/Python" ]:
594 yield ( how, who )
595 $$ LANGUAGE plpython3u;
596 </programlisting>
598 </para>
599 </listitem>
600 </varlistentry>
601 </variablelist>
602 </para>
604 <para>
605 Set-returning functions with <literal>OUT</literal> parameters
606 (using <literal>RETURNS SETOF record</literal>) are also
607 supported. For example:
608 <programlisting>
609 CREATE FUNCTION multiout_simple_setof(n integer, OUT integer, OUT integer) RETURNS SETOF record AS $$
610 return [(1, 2)] * n
611 $$ LANGUAGE plpython3u;
613 SELECT * FROM multiout_simple_setof(3);
614 </programlisting>
615 </para>
616 </sect2>
617 </sect1>
619 <sect1 id="plpython-sharing">
620 <title>Sharing Data</title>
621 <para>
622 The global dictionary <varname>SD</varname> is available to store
623 private data between repeated calls to the same function.
624 The global dictionary <varname>GD</varname> is public data,
625 that is available to all Python functions within a session; use with
626 care.<indexterm><primary>global data</primary>
627 <secondary>in PL/Python</secondary></indexterm>
628 </para>
630 <para>
631 Each function gets its own execution environment in the
632 Python interpreter, so that global data and function arguments from
633 <function>myfunc</function> are not available to
634 <function>myfunc2</function>. The exception is the data in the
635 <varname>GD</varname> dictionary, as mentioned above.
636 </para>
637 </sect1>
639 <sect1 id="plpython-do">
640 <title>Anonymous Code Blocks</title>
642 <para>
643 PL/Python also supports anonymous code blocks called with the
644 <xref linkend="sql-do"/> statement:
646 <programlisting>
647 DO $$
648 # PL/Python code
649 $$ LANGUAGE plpython3u;
650 </programlisting>
652 An anonymous code block receives no arguments, and whatever value it
653 might return is discarded. Otherwise it behaves just like a function.
654 </para>
655 </sect1>
657 <sect1 id="plpython-trigger">
658 <title>Trigger Functions</title>
660 <indexterm zone="plpython-trigger">
661 <primary>trigger</primary>
662 <secondary>in PL/Python</secondary>
663 </indexterm>
665 <para>
666 When a function is used as a trigger, the dictionary
667 <literal>TD</literal> contains trigger-related values:
668 <variablelist>
669 <varlistentry>
670 <term><literal>TD["event"]</literal></term>
671 <listitem>
672 <para>
673 contains the event as a string:
674 <literal>INSERT</literal>, <literal>UPDATE</literal>,
675 <literal>DELETE</literal>, or <literal>TRUNCATE</literal>.
676 </para>
677 </listitem>
678 </varlistentry>
680 <varlistentry>
681 <term><literal>TD["when"]</literal></term>
682 <listitem>
683 <para>
684 contains one of <literal>BEFORE</literal>, <literal>AFTER</literal>, or
685 <literal>INSTEAD OF</literal>.
686 </para>
687 </listitem>
688 </varlistentry>
690 <varlistentry>
691 <term><literal>TD["level"]</literal></term>
692 <listitem>
693 <para>
694 contains <literal>ROW</literal> or <literal>STATEMENT</literal>.
695 </para>
696 </listitem>
697 </varlistentry>
699 <varlistentry>
700 <term><literal>TD["new"]</literal></term>
701 <term><literal>TD["old"]</literal></term>
702 <listitem>
703 <para>
704 For a row-level trigger, one or both of these fields contain
705 the respective trigger rows, depending on the trigger event.
706 </para>
707 </listitem>
708 </varlistentry>
710 <varlistentry>
711 <term><literal>TD["name"]</literal></term>
712 <listitem>
713 <para>
714 contains the trigger name.
715 </para>
716 </listitem>
717 </varlistentry>
719 <varlistentry>
720 <term><literal>TD["table_name"]</literal></term>
721 <listitem>
722 <para>
723 contains the name of the table on which the trigger occurred.
724 </para>
725 </listitem>
726 </varlistentry>
728 <varlistentry>
729 <term><literal>TD["table_schema"]</literal></term>
730 <listitem>
731 <para>
732 contains the schema of the table on which the trigger occurred.
733 </para>
734 </listitem>
735 </varlistentry>
737 <varlistentry>
738 <term><literal>TD["relid"]</literal></term>
739 <listitem>
740 <para>
741 contains the OID of the table on which the trigger occurred.
742 </para>
743 </listitem>
744 </varlistentry>
746 <varlistentry>
747 <term><literal>TD["args"]</literal></term>
748 <listitem>
749 <para>
750 If the <command>CREATE TRIGGER</command> command
751 included arguments, they are available in <literal>TD["args"][0]</literal> to
752 <literal>TD["args"][<replaceable>n</replaceable>-1]</literal>.
753 </para>
754 </listitem>
755 </varlistentry>
756 </variablelist>
757 </para>
759 <para>
760 If <literal>TD["when"]</literal> is <literal>BEFORE</literal> or
761 <literal>INSTEAD OF</literal> and
762 <literal>TD["level"]</literal> is <literal>ROW</literal>, you can
763 return <literal>None</literal> or <literal>"OK"</literal> from the
764 Python function to indicate the row is unmodified,
765 <literal>"SKIP"</literal> to abort the event, or if <literal>TD["event"]</literal>
766 is <command>INSERT</command> or <command>UPDATE</command> you can return
767 <literal>"MODIFY"</literal> to indicate you've modified the new row.
768 Otherwise the return value is ignored.
769 </para>
770 </sect1>
772 <sect1 id="plpython-database">
773 <title>Database Access</title>
775 <para>
776 The PL/Python language module automatically imports a Python module
777 called <literal>plpy</literal>. The functions and constants in
778 this module are available to you in the Python code as
779 <literal>plpy.<replaceable>foo</replaceable></literal>.
780 </para>
782 <sect2 id="plpython-database-access-funcs">
783 <title>Database Access Functions</title>
785 <para>
786 The <literal>plpy</literal> module provides several functions to execute
787 database commands:
788 </para>
790 <variablelist>
791 <varlistentry>
792 <term><literal>plpy.<function>execute</function>(<replaceable>query</replaceable> [, <replaceable>limit</replaceable>])</literal></term>
793 <listitem>
794 <para>
795 Calling <function>plpy.execute</function> with a query string and an
796 optional row limit argument causes that query to be run and the result to
797 be returned in a result object.
798 </para>
800 <para>
801 If <replaceable>limit</replaceable> is specified and is greater than
802 zero, then <function>plpy.execute</function> retrieves at
803 most <replaceable>limit</replaceable> rows, much as if the query
804 included a <literal>LIMIT</literal>
805 clause. Omitting <replaceable>limit</replaceable> or specifying it as
806 zero results in no row limit.
807 </para>
809 <para>
810 The result object emulates a list or dictionary object. The result
811 object can be accessed by row number and column name. For example:
812 <programlisting>
813 rv = plpy.execute("SELECT * FROM my_table", 5)
814 </programlisting>
815 returns up to 5 rows from <literal>my_table</literal>. If
816 <literal>my_table</literal> has a column
817 <literal>my_column</literal>, it would be accessed as:
818 <programlisting>
819 foo = rv[i]["my_column"]
820 </programlisting>
821 The number of rows returned can be obtained using the built-in
822 <function>len</function> function.
823 </para>
825 <para>
826 The result object has these additional methods:
827 <variablelist>
828 <varlistentry>
829 <term><literal><function>nrows</function>()</literal></term>
830 <listitem>
831 <para>
832 Returns the number of rows processed by the command. Note that this
833 is not necessarily the same as the number of rows returned. For
834 example, an <command>UPDATE</command> command will set this value but
835 won't return any rows (unless <literal>RETURNING</literal> is used).
836 </para>
837 </listitem>
838 </varlistentry>
840 <varlistentry>
841 <term><literal><function>status</function>()</literal></term>
842 <listitem>
843 <para>
844 The <function>SPI_execute()</function> return value.
845 </para>
846 </listitem>
847 </varlistentry>
849 <varlistentry>
850 <term><literal><function>colnames</function>()</literal></term>
851 <term><literal><function>coltypes</function>()</literal></term>
852 <term><literal><function>coltypmods</function>()</literal></term>
853 <listitem>
854 <para>
855 Return a list of column names, list of column type OIDs, and list of
856 type-specific type modifiers for the columns, respectively.
857 </para>
859 <para>
860 These methods raise an exception when called on a result object from
861 a command that did not produce a result set, e.g.,
862 <command>UPDATE</command> without <literal>RETURNING</literal>, or
863 <command>DROP TABLE</command>. But it is OK to use these methods on
864 a result set containing zero rows.
865 </para>
866 </listitem>
867 </varlistentry>
869 <varlistentry>
870 <term><literal><function>__str__</function>()</literal></term>
871 <listitem>
872 <para>
873 The standard <literal>__str__</literal> method is defined so that it
874 is possible for example to debug query execution results
875 using <literal>plpy.debug(rv)</literal>.
876 </para>
877 </listitem>
878 </varlistentry>
879 </variablelist>
880 </para>
882 <para>
883 The result object can be modified.
884 </para>
886 <para>
887 Note that calling <literal>plpy.execute</literal> will cause the entire
888 result set to be read into memory. Only use that function when you are
889 sure that the result set will be relatively small. If you don't want to
890 risk excessive memory usage when fetching large results,
891 use <literal>plpy.cursor</literal> rather
892 than <literal>plpy.execute</literal>.
893 </para>
894 </listitem>
895 </varlistentry>
897 <varlistentry>
898 <term><literal>plpy.<function>prepare</function>(<replaceable>query</replaceable> [, <replaceable>argtypes</replaceable>])</literal></term>
899 <term><literal>plpy.<function>execute</function>(<replaceable>plan</replaceable> [, <replaceable>arguments</replaceable> [, <replaceable>limit</replaceable>]])</literal></term>
900 <listitem>
901 <para>
902 <indexterm><primary>preparing a query</primary><secondary>in PL/Python</secondary></indexterm>
903 <function>plpy.prepare</function> prepares the execution plan for a
904 query. It is called with a query string and a list of parameter types,
905 if you have parameter references in the query. For example:
906 <programlisting>
907 plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", ["text"])
908 </programlisting>
909 <literal>text</literal> is the type of the variable you will be passing
910 for <literal>$1</literal>. The second argument is optional if you don't
911 want to pass any parameters to the query.
912 </para>
913 <para>
914 After preparing a statement, you use a variant of the
915 function <function>plpy.execute</function> to run it:
916 <programlisting>
917 rv = plpy.execute(plan, ["name"], 5)
918 </programlisting>
919 Pass the plan as the first argument (instead of the query string), and a
920 list of values to substitute into the query as the second argument. The
921 second argument is optional if the query does not expect any parameters.
922 The third argument is the optional row limit as before.
923 </para>
925 <para>
926 Alternatively, you can call the <function>execute</function> method on
927 the plan object:
928 <programlisting>
929 rv = plan.execute(["name"], 5)
930 </programlisting>
931 </para>
933 <para>
934 Query parameters and result row fields are converted between PostgreSQL
935 and Python data types as described in <xref linkend="plpython-data"/>.
936 </para>
938 <para>
939 When you prepare a plan using the PL/Python module it is automatically
940 saved. Read the SPI documentation (<xref linkend="spi"/>) for a
941 description of what this means. In order to make effective use of this
942 across function calls one needs to use one of the persistent storage
943 dictionaries <literal>SD</literal> or <literal>GD</literal> (see
944 <xref linkend="plpython-sharing"/>). For example:
945 <programlisting>
946 CREATE FUNCTION usesavedplan() RETURNS trigger AS $$
947 if "plan" in SD:
948 plan = SD["plan"]
949 else:
950 plan = plpy.prepare("SELECT 1")
951 SD["plan"] = plan
952 # rest of function
953 $$ LANGUAGE plpython3u;
954 </programlisting>
955 </para>
956 </listitem>
957 </varlistentry>
959 <varlistentry>
960 <term><literal>plpy.<function>cursor</function>(<replaceable>query</replaceable>)</literal></term>
961 <term><literal>plpy.<function>cursor</function>(<replaceable>plan</replaceable> [, <replaceable>arguments</replaceable>])</literal></term>
962 <listitem>
963 <para>
964 The <literal>plpy.cursor</literal> function accepts the same arguments
965 as <literal>plpy.execute</literal> (except for the row limit) and returns
966 a cursor object, which allows you to process large result sets in smaller
967 chunks. As with <literal>plpy.execute</literal>, either a query string
968 or a plan object along with a list of arguments can be used, or
969 the <function>cursor</function> function can be called as a method of
970 the plan object.
971 </para>
973 <para>
974 The cursor object provides a <literal>fetch</literal> method that accepts
975 an integer parameter and returns a result object. Each time you
976 call <literal>fetch</literal>, the returned object will contain the next
977 batch of rows, never larger than the parameter value. Once all rows are
978 exhausted, <literal>fetch</literal> starts returning an empty result
979 object. Cursor objects also provide an
980 <ulink url="https://docs.python.org/library/stdtypes.html#iterator-types">iterator
981 interface</ulink>, yielding one row at a time until all rows are
982 exhausted. Data fetched that way is not returned as result objects, but
983 rather as dictionaries, each dictionary corresponding to a single result
984 row.
985 </para>
987 <para>
988 An example of two ways of processing data from a large table is:
989 <programlisting>
990 CREATE FUNCTION count_odd_iterator() RETURNS integer AS $$
991 odd = 0
992 for row in plpy.cursor("select num from largetable"):
993 if row['num'] % 2:
994 odd += 1
995 return odd
996 $$ LANGUAGE plpython3u;
998 CREATE FUNCTION count_odd_fetch(batch_size integer) RETURNS integer AS $$
999 odd = 0
1000 cursor = plpy.cursor("select num from largetable")
1001 while True:
1002 rows = cursor.fetch(batch_size)
1003 if not rows:
1004 break
1005 for row in rows:
1006 if row['num'] % 2:
1007 odd += 1
1008 return odd
1009 $$ LANGUAGE plpython3u;
1011 CREATE FUNCTION count_odd_prepared() RETURNS integer AS $$
1012 odd = 0
1013 plan = plpy.prepare("select num from largetable where num % $1 &lt;&gt; 0", ["integer"])
1014 rows = list(plpy.cursor(plan, [2])) # or: = list(plan.cursor([2]))
1016 return len(rows)
1017 $$ LANGUAGE plpython3u;
1018 </programlisting>
1019 </para>
1021 <para>
1022 Cursors are automatically disposed of. But if you want to explicitly
1023 release all resources held by a cursor, use the <literal>close</literal>
1024 method. Once closed, a cursor cannot be fetched from anymore.
1025 </para>
1027 <tip>
1028 <para>
1029 Do not confuse objects created by <literal>plpy.cursor</literal> with
1030 DB-API cursors as defined by
1031 the <ulink url="https://www.python.org/dev/peps/pep-0249/">Python
1032 Database API specification</ulink>. They don't have anything in common
1033 except for the name.
1034 </para>
1035 </tip>
1036 </listitem>
1037 </varlistentry>
1038 </variablelist>
1040 </sect2>
1042 <sect2 id="plpython-trapping">
1043 <title>Trapping Errors</title>
1045 <para>
1046 Functions accessing the database might encounter errors, which
1047 will cause them to abort and raise an exception. Both
1048 <function>plpy.execute</function> and
1049 <function>plpy.prepare</function> can raise an instance of a subclass of
1050 <literal>plpy.SPIError</literal>, which by default will terminate
1051 the function. This error can be handled just like any other
1052 Python exception, by using the <literal>try/except</literal>
1053 construct. For example:
1054 <programlisting>
1055 CREATE FUNCTION try_adding_joe() RETURNS text AS $$
1056 try:
1057 plpy.execute("INSERT INTO users(username) VALUES ('joe')")
1058 except plpy.SPIError:
1059 return "something went wrong"
1060 else:
1061 return "Joe added"
1062 $$ LANGUAGE plpython3u;
1063 </programlisting>
1064 </para>
1066 <para>
1067 The actual class of the exception being raised corresponds to the
1068 specific condition that caused the error. Refer
1069 to <xref linkend="errcodes-table"/> for a list of possible
1070 conditions. The module
1071 <literal>plpy.spiexceptions</literal> defines an exception class
1072 for each <productname>PostgreSQL</productname> condition, deriving
1073 their names from the condition name. For
1074 instance, <literal>division_by_zero</literal>
1075 becomes <literal>DivisionByZero</literal>, <literal>unique_violation</literal>
1076 becomes <literal>UniqueViolation</literal>, <literal>fdw_error</literal>
1077 becomes <literal>FdwError</literal>, and so on. Each of these
1078 exception classes inherits from <literal>SPIError</literal>. This
1079 separation makes it easier to handle specific errors, for
1080 instance:
1081 <programlisting>
1082 CREATE FUNCTION insert_fraction(numerator int, denominator int) RETURNS text AS $$
1083 from plpy import spiexceptions
1084 try:
1085 plan = plpy.prepare("INSERT INTO fractions (frac) VALUES ($1 / $2)", ["int", "int"])
1086 plpy.execute(plan, [numerator, denominator])
1087 except spiexceptions.DivisionByZero:
1088 return "denominator cannot equal zero"
1089 except spiexceptions.UniqueViolation:
1090 return "already have that fraction"
1091 except plpy.SPIError as e:
1092 return "other error, SQLSTATE %s" % e.sqlstate
1093 else:
1094 return "fraction inserted"
1095 $$ LANGUAGE plpython3u;
1096 </programlisting>
1097 Note that because all exceptions from
1098 the <literal>plpy.spiexceptions</literal> module inherit
1099 from <literal>SPIError</literal>, an <literal>except</literal>
1100 clause handling it will catch any database access error.
1101 </para>
1103 <para>
1104 As an alternative way of handling different error conditions, you
1105 can catch the <literal>SPIError</literal> exception and determine
1106 the specific error condition inside the <literal>except</literal>
1107 block by looking at the <literal>sqlstate</literal> attribute of
1108 the exception object. This attribute is a string value containing
1109 the <quote>SQLSTATE</quote> error code. This approach provides
1110 approximately the same functionality
1111 </para>
1112 </sect2>
1113 </sect1>
1115 <sect1 id="plpython-subtransaction">
1116 <title>Explicit Subtransactions</title>
1118 <para>
1119 Recovering from errors caused by database access as described in
1120 <xref linkend="plpython-trapping"/> can lead to an undesirable
1121 situation where some operations succeed before one of them fails,
1122 and after recovering from that error the data is left in an
1123 inconsistent state. PL/Python offers a solution to this problem in
1124 the form of explicit subtransactions.
1125 </para>
1127 <sect2 id="plpython-subtransaction-context-managers">
1128 <title>Subtransaction Context Managers</title>
1130 <para>
1131 Consider a function that implements a transfer between two
1132 accounts:
1133 <programlisting>
1134 CREATE FUNCTION transfer_funds() RETURNS void AS $$
1135 try:
1136 plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
1137 plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
1138 except plpy.SPIError as e:
1139 result = "error transferring funds: %s" % e.args
1140 else:
1141 result = "funds transferred correctly"
1142 plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
1143 plpy.execute(plan, [result])
1144 $$ LANGUAGE plpython3u;
1145 </programlisting>
1146 If the second <literal>UPDATE</literal> statement results in an
1147 exception being raised, this function will report the error, but
1148 the result of the first <literal>UPDATE</literal> will
1149 nevertheless be committed. In other words, the funds will be
1150 withdrawn from Joe's account, but will not be transferred to
1151 Mary's account.
1152 </para>
1154 <para>
1155 To avoid such issues, you can wrap your
1156 <literal>plpy.execute</literal> calls in an explicit
1157 subtransaction. The <literal>plpy</literal> module provides a
1158 helper object to manage explicit subtransactions that gets created
1159 with the <literal>plpy.subtransaction()</literal> function.
1160 Objects created by this function implement the
1161 <ulink url="https://docs.python.org/library/stdtypes.html#context-manager-types">
1162 context manager interface</ulink>. Using explicit subtransactions
1163 we can rewrite our function as:
1164 <programlisting>
1165 CREATE FUNCTION transfer_funds2() RETURNS void AS $$
1166 try:
1167 with plpy.subtransaction():
1168 plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
1169 plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
1170 except plpy.SPIError as e:
1171 result = "error transferring funds: %s" % e.args
1172 else:
1173 result = "funds transferred correctly"
1174 plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
1175 plpy.execute(plan, [result])
1176 $$ LANGUAGE plpython3u;
1177 </programlisting>
1178 Note that the use of <literal>try</literal>/<literal>except</literal> is still
1179 required. Otherwise the exception would propagate to the top of
1180 the Python stack and would cause the whole function to abort with
1181 a <productname>PostgreSQL</productname> error, so that the
1182 <literal>operations</literal> table would not have any row
1183 inserted into it. The subtransaction context manager does not
1184 trap errors, it only assures that all database operations executed
1185 inside its scope will be atomically committed or rolled back. A
1186 rollback of the subtransaction block occurs on any kind of
1187 exception exit, not only ones caused by errors originating from
1188 database access. A regular Python exception raised inside an
1189 explicit subtransaction block would also cause the subtransaction
1190 to be rolled back.
1191 </para>
1192 </sect2>
1193 </sect1>
1195 <sect1 id="plpython-transactions">
1196 <title>Transaction Management</title>
1198 <para>
1199 In a procedure called from the top level or an anonymous code block
1200 (<command>DO</command> command) called from the top level it is possible to
1201 control transactions. To commit the current transaction, call
1202 <literal>plpy.commit()</literal>. To roll back the current transaction,
1203 call <literal>plpy.rollback()</literal>. (Note that it is not possible to
1204 run the SQL commands <command>COMMIT</command> or
1205 <command>ROLLBACK</command> via <function>plpy.execute</function> or
1206 similar. It has to be done using these functions.) After a transaction is
1207 ended, a new transaction is automatically started, so there is no separate
1208 function for that.
1209 </para>
1211 <para>
1212 Here is an example:
1213 <programlisting>
1214 CREATE PROCEDURE transaction_test1()
1215 LANGUAGE plpython3u
1216 AS $$
1217 for i in range(0, 10):
1218 plpy.execute("INSERT INTO test1 (a) VALUES (%d)" % i)
1219 if i % 2 == 0:
1220 plpy.commit()
1221 else:
1222 plpy.rollback()
1225 CALL transaction_test1();
1226 </programlisting>
1227 </para>
1229 <para>
1230 Transactions cannot be ended when an explicit subtransaction is active.
1231 </para>
1232 </sect1>
1234 <sect1 id="plpython-util">
1235 <title>Utility Functions</title>
1236 <para>
1237 The <literal>plpy</literal> module also provides the functions
1238 <simplelist>
1239 <member><literal>plpy.debug(<replaceable>msg, **kwargs</replaceable>)</literal></member>
1240 <member><literal>plpy.log(<replaceable>msg, **kwargs</replaceable>)</literal></member>
1241 <member><literal>plpy.info(<replaceable>msg, **kwargs</replaceable>)</literal></member>
1242 <member><literal>plpy.notice(<replaceable>msg, **kwargs</replaceable>)</literal></member>
1243 <member><literal>plpy.warning(<replaceable>msg, **kwargs</replaceable>)</literal></member>
1244 <member><literal>plpy.error(<replaceable>msg, **kwargs</replaceable>)</literal></member>
1245 <member><literal>plpy.fatal(<replaceable>msg, **kwargs</replaceable>)</literal></member>
1246 </simplelist>
1247 <indexterm><primary>elog</primary><secondary>in PL/Python</secondary></indexterm>
1248 <function>plpy.error</function> and <function>plpy.fatal</function>
1249 actually raise a Python exception which, if uncaught, propagates out to
1250 the calling query, causing the current transaction or subtransaction to
1251 be aborted. <literal>raise plpy.Error(<replaceable>msg</replaceable>)</literal> and
1252 <literal>raise plpy.Fatal(<replaceable>msg</replaceable>)</literal> are
1253 equivalent to calling <literal>plpy.error(<replaceable>msg</replaceable>)</literal> and
1254 <literal>plpy.fatal(<replaceable>msg</replaceable>)</literal>, respectively but
1255 the <literal>raise</literal> form does not allow passing keyword arguments.
1256 The other functions only generate messages of different priority levels.
1257 Whether messages of a particular priority are reported to the client,
1258 written to the server log, or both is controlled by the
1259 <xref linkend="guc-log-min-messages"/> and
1260 <xref linkend="guc-client-min-messages"/> configuration
1261 variables. See <xref linkend="runtime-config"/> for more information.
1262 </para>
1264 <para>
1265 The <replaceable>msg</replaceable> argument is given as a positional argument. For
1266 backward compatibility, more than one positional argument can be given. In
1267 that case, the string representation of the tuple of positional arguments
1268 becomes the message reported to the client.
1269 </para>
1271 <para>
1272 The following keyword-only arguments are accepted:
1273 <simplelist>
1274 <member><literal>detail</literal></member>
1275 <member><literal>hint</literal></member>
1276 <member><literal>sqlstate</literal></member>
1277 <member><literal>schema_name</literal></member>
1278 <member><literal>table_name</literal></member>
1279 <member><literal>column_name</literal></member>
1280 <member><literal>datatype_name</literal></member>
1281 <member><literal>constraint_name</literal></member>
1282 </simplelist>
1283 The string representation of the objects passed as keyword-only arguments
1284 is used to enrich the messages reported to the client. For example:
1286 <programlisting>
1287 CREATE FUNCTION raise_custom_exception() RETURNS void AS $$
1288 plpy.error("custom exception message",
1289 detail="some info about exception",
1290 hint="hint for users")
1291 $$ LANGUAGE plpython3u;
1293 =# SELECT raise_custom_exception();
1294 ERROR: plpy.Error: custom exception message
1295 DETAIL: some info about exception
1296 HINT: hint for users
1297 CONTEXT: Traceback (most recent call last):
1298 PL/Python function "raise_custom_exception", line 4, in &lt;module&gt;
1299 hint="hint for users")
1300 PL/Python function "raise_custom_exception"
1301 </programlisting>
1302 </para>
1304 <para>
1305 Another set of utility functions are
1306 <literal>plpy.quote_literal(<replaceable>string</replaceable>)</literal>,
1307 <literal>plpy.quote_nullable(<replaceable>string</replaceable>)</literal>, and
1308 <literal>plpy.quote_ident(<replaceable>string</replaceable>)</literal>. They
1309 are equivalent to the built-in quoting functions described in <xref
1310 linkend="functions-string"/>. They are useful when constructing
1311 ad-hoc queries. A PL/Python equivalent of dynamic SQL from <xref
1312 linkend="plpgsql-quote-literal-example"/> would be:
1313 <programlisting>
1314 plpy.execute("UPDATE tbl SET %s = %s WHERE key = %s" % (
1315 plpy.quote_ident(colname),
1316 plpy.quote_nullable(newvalue),
1317 plpy.quote_literal(keyvalue)))
1318 </programlisting>
1319 </para>
1320 </sect1>
1322 <sect1 id="plpython-python23">
1323 <title>Python 2 vs. Python 3</title>
1325 <para>
1326 PL/Python supports only Python 3. Past versions of
1327 <productname>PostgreSQL</productname> supported Python 2, using the
1328 <literal>plpythonu</literal> and <literal>plpython2u</literal> language
1329 names.
1330 </para>
1331 </sect1>
1333 <sect1 id="plpython-envar">
1334 <title>Environment Variables</title>
1336 <para>
1337 Some of the environment variables that are accepted by the Python
1338 interpreter can also be used to affect PL/Python behavior. They
1339 would need to be set in the environment of the main PostgreSQL
1340 server process, for example in a start script. The available
1341 environment variables depend on the version of Python; see the
1342 Python documentation for details. At the time of this writing, the
1343 following environment variables have an affect on PL/Python,
1344 assuming an adequate Python version:
1345 <itemizedlist>
1346 <listitem>
1347 <para><envar>PYTHONHOME</envar></para>
1348 </listitem>
1350 <listitem>
1351 <para><envar>PYTHONPATH</envar></para>
1352 </listitem>
1354 <listitem>
1355 <para><envar>PYTHONY2K</envar></para>
1356 </listitem>
1358 <listitem>
1359 <para><envar>PYTHONOPTIMIZE</envar></para>
1360 </listitem>
1362 <listitem>
1363 <para><envar>PYTHONDEBUG</envar></para>
1364 </listitem>
1366 <listitem>
1367 <para><envar>PYTHONVERBOSE</envar></para>
1368 </listitem>
1370 <listitem>
1371 <para><envar>PYTHONCASEOK</envar></para>
1372 </listitem>
1374 <listitem>
1375 <para><envar>PYTHONDONTWRITEBYTECODE</envar></para>
1376 </listitem>
1378 <listitem>
1379 <para><envar>PYTHONIOENCODING</envar></para>
1380 </listitem>
1382 <listitem>
1383 <para><envar>PYTHONUSERBASE</envar></para>
1384 </listitem>
1386 <listitem>
1387 <para><envar>PYTHONHASHSEED</envar></para>
1388 </listitem>
1389 </itemizedlist>
1391 (It appears to be a Python implementation detail beyond the control
1392 of PL/Python that some of the environment variables listed on
1393 the <command>python</command> man page are only effective in a
1394 command-line interpreter and not an embedded Python interpreter.)
1395 </para>
1396 </sect1>
1397 </chapter>