The code to unlink dropped relations in FinishPreparedTransaction() was
[PostgreSQL.git] / doc / src / sgml / xfunc.sgml
blob3c8ce438ebbcfc3fe55ef62fb432be0becc13ad5
1 <!-- $PostgreSQL$ -->
3 <sect1 id="xfunc">
4 <title>User-Defined Functions</title>
6 <indexterm zone="xfunc">
7 <primary>function</primary>
8 <secondary>user-defined</secondary>
9 </indexterm>
11 <para>
12 <productname>PostgreSQL</productname> provides four kinds of
13 functions:
15 <itemizedlist>
16 <listitem>
17 <para>
18 query language functions (functions written in
19 <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
20 </para>
21 </listitem>
22 <listitem>
23 <para>
24 procedural language functions (functions written in, for
25 example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
26 (<xref linkend="xfunc-pl">)
27 </para>
28 </listitem>
29 <listitem>
30 <para>
31 internal functions (<xref linkend="xfunc-internal">)
32 </para>
33 </listitem>
34 <listitem>
35 <para>
36 C-language functions (<xref linkend="xfunc-c">)
37 </para>
38 </listitem>
39 </itemizedlist>
40 </para>
42 <para>
43 Every kind
44 of function can take base types, composite types, or
45 combinations of these as arguments (parameters). In addition,
46 every kind of function can return a base type or
47 a composite type. Functions can also be defined to return
48 sets of base or composite values.
49 </para>
51 <para>
52 Many kinds of functions can take or return certain pseudo-types
53 (such as polymorphic types), but the available facilities vary.
54 Consult the description of each kind of function for more details.
55 </para>
57 <para>
58 It's easiest to define <acronym>SQL</acronym>
59 functions, so we'll start by discussing those.
60 Most of the concepts presented for <acronym>SQL</acronym> functions
61 will carry over to the other types of functions.
62 </para>
64 <para>
65 Throughout this chapter, it can be useful to look at the reference
66 page of the <xref linkend="sql-createfunction"
67 endterm="sql-createfunction-title"> command to
68 understand the examples better. Some examples from this chapter
69 can be found in <filename>funcs.sql</filename> and
70 <filename>funcs.c</filename> in the <filename>src/tutorial</>
71 directory in the <productname>PostgreSQL</productname> source
72 distribution.
73 </para>
74 </sect1>
76 <sect1 id="xfunc-sql">
77 <title>Query Language (<acronym>SQL</acronym>) Functions</title>
79 <indexterm zone="xfunc-sql">
80 <primary>function</primary>
81 <secondary>user-defined</secondary>
82 <tertiary>in SQL</tertiary>
83 </indexterm>
85 <para>
86 SQL functions execute an arbitrary list of SQL statements, returning
87 the result of the last query in the list.
88 In the simple (non-set)
89 case, the first row of the last query's result will be returned.
90 (Bear in mind that <quote>the first row</quote> of a multirow
91 result is not well-defined unless you use <literal>ORDER BY</>.)
92 If the last query happens
93 to return no rows at all, the null value will be returned.
94 </para>
96 <para>
97 Alternatively, an SQL function can be declared to return a set,
98 by specifying the function's return type as <literal>SETOF
99 <replaceable>sometype</></literal>, or equivalently by declaring it as
100 <literal>RETURNS TABLE(<replaceable>columns</>)</literal>. In this case
101 all rows of the last query's result are returned. Further details appear
102 below.
103 </para>
105 <para>
106 The body of an SQL function must be a list of SQL
107 statements separated by semicolons. A semicolon after the last
108 statement is optional. Unless the function is declared to return
109 <type>void</>, the last statement must be a <command>SELECT</>,
110 or an <command>INSERT</>, <command>UPDATE</>, or <command>DELETE</>
111 that has a <literal>RETURNING</> clause.
112 </para>
114 <para>
115 Any collection of commands in the <acronym>SQL</acronym>
116 language can be packaged together and defined as a function.
117 Besides <command>SELECT</command> queries, the commands can include data
118 modification queries (<command>INSERT</command>,
119 <command>UPDATE</command>, and <command>DELETE</command>), as well as
120 other SQL commands. (The only exception is that you cannot put
121 <command>BEGIN</>, <command>COMMIT</>, <command>ROLLBACK</>, or
122 <command>SAVEPOINT</> commands into a <acronym>SQL</acronym> function.)
123 However, the final command
124 must be a <command>SELECT</command> or have a <literal>RETURNING</>
125 clause that returns whatever is
126 specified as the function's return type. Alternatively, if you
127 want to define a SQL function that performs actions but has no
128 useful value to return, you can define it as returning <type>void</>.
129 For example, this function removes rows with negative salaries from
130 the <literal>emp</> table:
132 <screen>
133 CREATE FUNCTION clean_emp() RETURNS void AS '
134 DELETE FROM emp
135 WHERE salary &lt; 0;
136 ' LANGUAGE SQL;
138 SELECT clean_emp();
140 clean_emp
141 -----------
143 (1 row)
144 </screen>
145 </para>
147 <para>
148 The syntax of the <command>CREATE FUNCTION</command> command requires
149 the function body to be written as a string constant. It is usually
150 most convenient to use dollar quoting (see <xref
151 linkend="sql-syntax-dollar-quoting">) for the string constant.
152 If you choose to use regular single-quoted string constant syntax,
153 you must double single quote marks (<literal>'</>) and backslashes
154 (<literal>\</>) (assuming escape string syntax) in the body of
155 the function (see <xref linkend="sql-syntax-strings">).
156 </para>
158 <para>
159 Arguments to the SQL function are referenced in the function
160 body using the syntax <literal>$<replaceable>n</></>: <literal>$1</>
161 refers to the first argument, <literal>$2</> to the second, and so on.
162 If an argument is of a composite type, then the dot notation,
163 e.g., <literal>$1.name</literal>, can be used to access attributes
164 of the argument. The arguments can only be used as data values,
165 not as identifiers. Thus for example this is reasonable:
166 <programlisting>
167 INSERT INTO mytable VALUES ($1);
168 </programlisting>
169 but this will not work:
170 <programlisting>
171 INSERT INTO $1 VALUES (42);
172 </programlisting>
173 </para>
175 <sect2 id="xfunc-sql-base-functions">
176 <title><acronym>SQL</acronym> Functions on Base Types</title>
178 <para>
179 The simplest possible <acronym>SQL</acronym> function has no arguments and
180 simply returns a base type, such as <type>integer</type>:
182 <screen>
183 CREATE FUNCTION one() RETURNS integer AS $$
184 SELECT 1 AS result;
185 $$ LANGUAGE SQL;
187 -- Alternative syntax for string literal:
188 CREATE FUNCTION one() RETURNS integer AS '
189 SELECT 1 AS result;
190 ' LANGUAGE SQL;
192 SELECT one();
195 -----
197 </screen>
198 </para>
200 <para>
201 Notice that we defined a column alias within the function body for the result of the function
202 (with the name <literal>result</>), but this column alias is not visible
203 outside the function. Hence, the result is labeled <literal>one</>
204 instead of <literal>result</>.
205 </para>
207 <para>
208 It is almost as easy to define <acronym>SQL</acronym> functions
209 that take base types as arguments. In the example below, notice
210 how we refer to the arguments within the function as <literal>$1</>
211 and <literal>$2</>.
213 <screen>
214 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
215 SELECT $1 + $2;
216 $$ LANGUAGE SQL;
218 SELECT add_em(1, 2) AS answer;
220 answer
221 --------
223 </screen>
224 </para>
226 <para>
227 Here is a more useful function, which might be used to debit a
228 bank account:
230 <programlisting>
231 CREATE FUNCTION tf1 (integer, numeric) RETURNS integer AS $$
232 UPDATE bank
233 SET balance = balance - $2
234 WHERE accountno = $1;
235 SELECT 1;
236 $$ LANGUAGE SQL;
237 </programlisting>
239 A user could execute this function to debit account 17 by $100.00 as
240 follows:
242 <programlisting>
243 SELECT tf1(17, 100.0);
244 </programlisting>
245 </para>
247 <para>
248 In practice one would probably like a more useful result from the
249 function than a constant 1, so a more likely definition
252 <programlisting>
253 CREATE FUNCTION tf1 (integer, numeric) RETURNS numeric AS $$
254 UPDATE bank
255 SET balance = balance - $2
256 WHERE accountno = $1;
257 SELECT balance FROM bank WHERE accountno = $1;
258 $$ LANGUAGE SQL;
259 </programlisting>
261 which adjusts the balance and returns the new balance.
262 The same thing could be done in one command using <literal>RETURNING</>:
264 <programlisting>
265 CREATE FUNCTION tf1 (integer, numeric) RETURNS numeric AS $$
266 UPDATE bank
267 SET balance = balance - $2
268 WHERE accountno = $1
269 RETURNING balance;
270 $$ LANGUAGE SQL;
271 </programlisting>
272 </para>
273 </sect2>
275 <sect2>
276 <title><acronym>SQL</acronym> Functions on Composite Types</title>
278 <para>
279 When writing functions with arguments of composite
280 types, we must not only specify which
281 argument we want (as we did above with <literal>$1</> and <literal>$2</literal>) but
282 also the desired attribute (field) of that argument. For example,
283 suppose that
284 <type>emp</type> is a table containing employee data, and therefore
285 also the name of the composite type of each row of the table. Here
286 is a function <function>double_salary</function> that computes what someone's
287 salary would be if it were doubled:
289 <screen>
290 CREATE TABLE emp (
291 name text,
292 salary numeric,
293 age integer,
294 cubicle point
297 CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
298 SELECT $1.salary * 2 AS salary;
299 $$ LANGUAGE SQL;
301 SELECT name, double_salary(emp.*) AS dream
302 FROM emp
303 WHERE emp.cubicle ~= point '(2,1)';
305 name | dream
306 ------+-------
307 Bill | 8400
308 </screen>
309 </para>
311 <para>
312 Notice the use of the syntax <literal>$1.salary</literal>
313 to select one field of the argument row value. Also notice
314 how the calling <command>SELECT</> command uses <literal>*</>
315 to select
316 the entire current row of a table as a composite value. The table
317 row can alternatively be referenced using just the table name,
318 like this:
319 <screen>
320 SELECT name, double_salary(emp) AS dream
321 FROM emp
322 WHERE emp.cubicle ~= point '(2,1)';
323 </screen>
324 but this usage is deprecated since it's easy to get confused.
325 </para>
327 <para>
328 Sometimes it is handy to construct a composite argument value
329 on-the-fly. This can be done with the <literal>ROW</> construct.
330 For example, we could adjust the data being passed to the function:
331 <screen>
332 SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
333 FROM emp;
334 </screen>
335 </para>
337 <para>
338 It is also possible to build a function that returns a composite type.
339 This is an example of a function
340 that returns a single <type>emp</type> row:
342 <programlisting>
343 CREATE FUNCTION new_emp() RETURNS emp AS $$
344 SELECT text 'None' AS name,
345 1000.0 AS salary,
346 25 AS age,
347 point '(2,2)' AS cubicle;
348 $$ LANGUAGE SQL;
349 </programlisting>
351 In this example we have specified each of the attributes
352 with a constant value, but any computation
353 could have been substituted for these constants.
354 </para>
356 <para>
357 Note two important things about defining the function:
359 <itemizedlist>
360 <listitem>
361 <para>
362 The select list order in the query must be exactly the same as
363 that in which the columns appear in the table associated
364 with the composite type. (Naming the columns, as we did above,
365 is irrelevant to the system.)
366 </para>
367 </listitem>
368 <listitem>
369 <para>
370 You must typecast the expressions to match the
371 definition of the composite type, or you will get errors like this:
372 <screen>
373 <computeroutput>
374 ERROR: function declared to return emp returns varchar instead of text at column 1
375 </computeroutput>
376 </screen>
377 </para>
378 </listitem>
379 </itemizedlist>
380 </para>
382 <para>
383 A different way to define the same function is:
385 <programlisting>
386 CREATE FUNCTION new_emp() RETURNS emp AS $$
387 SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
388 $$ LANGUAGE SQL;
389 </programlisting>
391 Here we wrote a <command>SELECT</> that returns just a single
392 column of the correct composite type. This isn't really better
393 in this situation, but it is a handy alternative in some cases
394 &mdash; for example, if we need to compute the result by calling
395 another function that returns the desired composite value.
396 </para>
398 <para>
399 We could call this function directly in either of two ways:
401 <screen>
402 SELECT new_emp();
404 new_emp
405 --------------------------
406 (None,1000.0,25,"(2,2)")
408 SELECT * FROM new_emp();
410 name | salary | age | cubicle
411 ------+--------+-----+---------
412 None | 1000.0 | 25 | (2,2)
413 </screen>
415 The second way is described more fully in <xref
416 linkend="xfunc-sql-table-functions">.
417 </para>
419 <para>
420 When you use a function that returns a composite type,
421 you might want only one field (attribute) from its result.
422 You can do that with syntax like this:
424 <screen>
425 SELECT (new_emp()).name;
427 name
428 ------
429 None
430 </screen>
432 The extra parentheses are needed to keep the parser from getting
433 confused. If you try to do it without them, you get something like this:
435 <screen>
436 SELECT new_emp().name;
437 ERROR: syntax error at or near "."
438 LINE 1: SELECT new_emp().name;
440 </screen>
441 </para>
443 <para>
444 Another option is to use
445 functional notation for extracting an attribute. The simple way
446 to explain this is that we can use the
447 notations <literal>attribute(table)</> and <literal>table.attribute</>
448 interchangeably.
450 <screen>
451 SELECT name(new_emp());
453 name
454 ------
455 None
456 </screen>
458 <screen>
459 -- This is the same as:
460 -- SELECT emp.name AS youngster FROM emp WHERE emp.age &lt; 30;
462 SELECT name(emp) AS youngster FROM emp WHERE age(emp) &lt; 30;
464 youngster
465 -----------
467 Andy
468 </screen>
469 </para>
471 <tip>
472 <para>
473 The equivalence between functional notation and attribute notation
474 makes it possible to use functions on composite types to emulate
475 <quote>computed fields</>.
476 <indexterm>
477 <primary>computed field</primary>
478 </indexterm>
479 <indexterm>
480 <primary>field</primary>
481 <secondary>computed</secondary>
482 </indexterm>
483 For example, using the previous definition
484 for <literal>double_salary(emp)</>, we can write
486 <screen>
487 SELECT emp.name, emp.double_salary FROM emp;
488 </screen>
490 An application using this wouldn't need to be directly aware that
491 <literal>double_salary</> isn't a real column of the table.
492 (You can also emulate computed fields with views.)
493 </para>
494 </tip>
496 <para>
497 Another way to use a function returning a composite type is to pass the
498 result to another function that accepts the correct row type as input:
500 <screen>
501 CREATE FUNCTION getname(emp) RETURNS text AS $$
502 SELECT $1.name;
503 $$ LANGUAGE SQL;
505 SELECT getname(new_emp());
506 getname
507 ---------
508 None
509 (1 row)
510 </screen>
511 </para>
513 <para>
514 Still another way to use a function that returns a composite type is to
515 call it as a table function, as described in <xref
516 linkend="xfunc-sql-table-functions">.
517 </para>
518 </sect2>
520 <sect2 id="xfunc-output-parameters">
521 <title><acronym>SQL</> Functions with Output Parameters</title>
523 <indexterm>
524 <primary>function</primary>
525 <secondary>output parameter</secondary>
526 </indexterm>
528 <para>
529 An alternative way of describing a function's results is to define it
530 with <firstterm>output parameters</>, as in this example:
532 <screen>
533 CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
534 AS 'SELECT $1 + $2'
535 LANGUAGE SQL;
537 SELECT add_em(3,7);
538 add_em
539 --------
541 (1 row)
542 </screen>
544 This is not essentially different from the version of <literal>add_em</>
545 shown in <xref linkend="xfunc-sql-base-functions">. The real value of
546 output parameters is that they provide a convenient way of defining
547 functions that return several columns. For example,
549 <screen>
550 CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
551 AS 'SELECT $1 + $2, $1 * $2'
552 LANGUAGE SQL;
554 SELECT * FROM sum_n_product(11,42);
555 sum | product
556 -----+---------
557 53 | 462
558 (1 row)
559 </screen>
561 What has essentially happened here is that we have created an anonymous
562 composite type for the result of the function. The above example has
563 the same end result as
565 <screen>
566 CREATE TYPE sum_prod AS (sum int, product int);
568 CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
569 AS 'SELECT $1 + $2, $1 * $2'
570 LANGUAGE SQL;
571 </screen>
573 but not having to bother with the separate composite type definition
574 is often handy.
575 </para>
577 <para>
578 Notice that output parameters are not included in the calling argument
579 list when invoking such a function from SQL. This is because
580 <productname>PostgreSQL</productname> considers only the input
581 parameters to define the function's calling signature. That means
582 also that only the input parameters matter when referencing the function
583 for purposes such as dropping it. We could drop the above function
584 with either of
586 <screen>
587 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
588 DROP FUNCTION sum_n_product (int, int);
589 </screen>
590 </para>
592 <para>
593 Parameters can be marked as <literal>IN</> (the default),
594 <literal>OUT</>, <literal>INOUT</>, or <literal>VARIADIC</>.
595 An <literal>INOUT</>
596 parameter serves as both an input parameter (part of the calling
597 argument list) and an output parameter (part of the result record type).
598 <literal>VARIADIC</> parameters are input parameters, but are treated
599 specially as described next.
600 </para>
601 </sect2>
603 <sect2 id="xfunc-sql-variadic-functions">
604 <title><acronym>SQL</> Functions with Variable Numbers of Arguments</title>
606 <indexterm>
607 <primary>function</primary>
608 <secondary>variadic</secondary>
609 </indexterm>
611 <indexterm>
612 <primary>variadic function</primary>
613 </indexterm>
615 <para>
616 <acronym>SQL</acronym> functions can be declared to accept
617 variable numbers of arguments, so long as all the <quote>optional</>
618 arguments are of the same data type. The optional arguments will be
619 passed to the function as an array. The function is declared by
620 marking the last parameter as <literal>VARIADIC</>; this parameter
621 must be declared as being of an array type. For example:
623 <screen>
624 CREATE FUNCTION mleast(VARIADIC numeric[]) RETURNS numeric AS $$
625 SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
626 $$ LANGUAGE SQL;
628 SELECT mleast(10, -1, 5, 4.4);
629 mleast
630 --------
632 (1 row)
633 </screen>
635 Effectively, all the actual arguments at or beyond the
636 <literal>VARIADIC</> position are gathered up into a one-dimensional
637 array, as if you had written
639 <screen>
640 SELECT mleast(ARRAY[10, -1, 5, 4.4]); -- doesn't work
641 </screen>
643 You can't actually write that, though &mdash; or at least, it will
644 not match this function definition. A parameter marked
645 <literal>VARIADIC</> matches one or more occurrences of its element
646 type, not of its own type.
647 </para>
649 <para>
650 Sometimes it is useful to be able to pass an already-constructed array
651 to a variadic function; this is particularly handy when one variadic
652 function wants to pass on its array parameter to another one. You can
653 do that by specifying <literal>VARIADIC</> in the call:
655 <screen>
656 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
657 </screen>
659 This prevents expansion of the function's variadic parameter into its
660 element type, thereby allowing the array argument value to match
661 normally. <literal>VARIADIC</> can only be attached to the last
662 actual argument of a function call.
663 </para>
664 </sect2>
666 <sect2 id="xfunc-sql-parameter-defaults">
667 <title><acronym>SQL</> Functions with Default Values for Arguments</title>
669 <indexterm>
670 <primary>function</primary>
671 <secondary>default values for arguments</secondary>
672 </indexterm>
674 <para>
675 Functions can be declared with default values for some or all input
676 arguments. The default values are inserted whenever the function is
677 called with insufficiently many actual arguments. Since arguments
678 can only be omitted from the end of the actual argument list, all
679 parameters after a parameter with a default value have to have
680 default values as well.
681 </para>
683 <para>
684 For example:
685 <screen>
686 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
687 RETURNS int
688 LANGUAGE SQL
689 AS $$
690 SELECT $1 + $2 + $3;
693 SELECT foo(10, 20, 30);
694 foo
695 -----
697 (1 row)
699 SELECT foo(10, 20);
700 foo
701 -----
703 (1 row)
705 SELECT foo(10);
706 foo
707 -----
709 (1 row)
711 SELECT foo(); -- fails since there is no default for the first argument
712 ERROR: function foo() does not exist
713 </screen>
714 The <literal>=</literal> sign can also be used in place of the
715 key word <literal>DEFAULT</literal>,
716 </para>
717 </sect2>
719 <sect2 id="xfunc-sql-table-functions">
720 <title><acronym>SQL</acronym> Functions as Table Sources</title>
722 <para>
723 All SQL functions can be used in the <literal>FROM</> clause of a query,
724 but it is particularly useful for functions returning composite types.
725 If the function is defined to return a base type, the table function
726 produces a one-column table. If the function is defined to return
727 a composite type, the table function produces a column for each attribute
728 of the composite type.
729 </para>
731 <para>
732 Here is an example:
734 <screen>
735 CREATE TABLE foo (fooid int, foosubid int, fooname text);
736 INSERT INTO foo VALUES (1, 1, 'Joe');
737 INSERT INTO foo VALUES (1, 2, 'Ed');
738 INSERT INTO foo VALUES (2, 1, 'Mary');
740 CREATE FUNCTION getfoo(int) RETURNS foo AS $$
741 SELECT * FROM foo WHERE fooid = $1;
742 $$ LANGUAGE SQL;
744 SELECT *, upper(fooname) FROM getfoo(1) AS t1;
746 fooid | foosubid | fooname | upper
747 -------+----------+---------+-------
748 1 | 1 | Joe | JOE
749 (1 row)
750 </screen>
752 As the example shows, we can work with the columns of the function's
753 result just the same as if they were columns of a regular table.
754 </para>
756 <para>
757 Note that we only got one row out of the function. This is because
758 we did not use <literal>SETOF</>. That is described in the next section.
759 </para>
760 </sect2>
762 <sect2 id="xfunc-sql-functions-returning-set">
763 <title><acronym>SQL</acronym> Functions Returning Sets</title>
765 <indexterm>
766 <primary>function</primary>
767 <secondary>with SETOF</secondary>
768 </indexterm>
770 <para>
771 When an SQL function is declared as returning <literal>SETOF
772 <replaceable>sometype</></literal>, the function's final
773 query is executed to completion, and each row it
774 outputs is returned as an element of the result set.
775 </para>
777 <para>
778 This feature is normally used when calling the function in the <literal>FROM</>
779 clause. In this case each row returned by the function becomes
780 a row of the table seen by the query. For example, assume that
781 table <literal>foo</> has the same contents as above, and we say:
783 <programlisting>
784 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
785 SELECT * FROM foo WHERE fooid = $1;
786 $$ LANGUAGE SQL;
788 SELECT * FROM getfoo(1) AS t1;
789 </programlisting>
791 Then we would get:
792 <screen>
793 fooid | foosubid | fooname
794 -------+----------+---------
795 1 | 1 | Joe
796 1 | 2 | Ed
797 (2 rows)
798 </screen>
799 </para>
801 <para>
802 It is also possible to return multiple rows with the columns defined by
803 output parameters, like this:
805 <programlisting>
806 CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int) RETURNS SETOF record AS $$
807 SELECT $1 + tab.y, $1 * tab.y FROM tab;
808 $$ LANGUAGE SQL;
809 </programlisting>
811 The key point here is that you must write <literal>RETURNS SETOF record</>
812 to indicate that the function returns multiple rows instead of just one.
813 If there is only one output parameter, write that parameter's type
814 instead of <type>record</>.
815 </para>
817 <para>
818 Currently, functions returning sets can also be called in the select list
819 of a query. For each row that the query
820 generates by itself, the function returning set is invoked, and an output
821 row is generated for each element of the function's result set. Note,
822 however, that this capability is deprecated and might be removed in future
823 releases. The following is an example function returning a set from the
824 select list:
826 <screen>
827 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
828 SELECT name FROM nodes WHERE parent = $1
829 $$ LANGUAGE SQL;
831 SELECT * FROM nodes;
832 name | parent
833 -----------+--------
834 Top |
835 Child1 | Top
836 Child2 | Top
837 Child3 | Top
838 SubChild1 | Child1
839 SubChild2 | Child1
840 (6 rows)
842 SELECT listchildren('Top');
843 listchildren
844 --------------
845 Child1
846 Child2
847 Child3
848 (3 rows)
850 SELECT name, listchildren(name) FROM nodes;
851 name | listchildren
852 --------+--------------
853 Top | Child1
854 Top | Child2
855 Top | Child3
856 Child1 | SubChild1
857 Child1 | SubChild2
858 (5 rows)
859 </screen>
861 In the last <command>SELECT</command>,
862 notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
863 This happens because <function>listchildren</function> returns an empty set
864 for those arguments, so no result rows are generated.
865 </para>
867 <note>
868 <para>
869 If a function's last command is <command>INSERT</>, <command>UPDATE</>,
870 or <command>DELETE</> with <literal>RETURNING</>, that command will
871 always be executed to completion, even if the function is not declared
872 with <literal>SETOF</> or the calling query does not fetch all the
873 result rows. Any extra rows produced by the <literal>RETURNING</>
874 clause are silently dropped, but the commanded table modifications
875 still happen (and are all completed before returning from the function).
876 </para>
877 </note>
878 </sect2>
880 <sect2 id="xfunc-sql-functions-returning-table">
881 <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</></title>
883 <indexterm>
884 <primary>function</primary>
885 <secondary>RETURNS TABLE</secondary>
886 </indexterm>
888 <para>
889 There is another way to declare a function as returning a set,
890 which is to use the syntax
891 <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.
892 This is equivalent to using one or more <literal>OUT</> parameters plus
893 marking the function as returning <literal>SETOF record</> (or
894 <literal>SETOF</> a single output parameter's type, as appropriate).
895 This notation is specified in recent versions of the SQL standard, and
896 thus may be more portable than using <literal>SETOF</>.
897 </para>
899 <para>
900 For example, the preceding sum-and-product example could also be
901 done this way:
903 <programlisting>
904 CREATE FUNCTION sum_n_product_with_tab (x int) RETURNS TABLE(sum int, product int) AS $$
905 SELECT $1 + tab.y, $1 * tab.y FROM tab;
906 $$ LANGUAGE SQL;
907 </programlisting>
909 It is not allowed to use explicit <literal>OUT</> or <literal>INOUT</>
910 parameters with the <literal>RETURNS TABLE</> notation &mdash; you must
911 put all the output columns in the <literal>TABLE</> list.
912 </para>
913 </sect2>
915 <sect2>
916 <title>Polymorphic <acronym>SQL</acronym> Functions</title>
918 <para>
919 <acronym>SQL</acronym> functions can be declared to accept and
920 return the polymorphic types <type>anyelement</type>,
921 <type>anyarray</type>, <type>anynonarray</type>, and
922 <type>anyenum</type>. See <xref
923 linkend="extend-types-polymorphic"> for a more detailed
924 explanation of polymorphic functions. Here is a polymorphic
925 function <function>make_array</function> that builds up an array
926 from two arbitrary data type elements:
927 <screen>
928 CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
929 SELECT ARRAY[$1, $2];
930 $$ LANGUAGE SQL;
932 SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
933 intarray | textarray
934 ----------+-----------
935 {1,2} | {a,b}
936 (1 row)
937 </screen>
938 </para>
940 <para>
941 Notice the use of the typecast <literal>'a'::text</literal>
942 to specify that the argument is of type <type>text</type>. This is
943 required if the argument is just a string literal, since otherwise
944 it would be treated as type
945 <type>unknown</type>, and array of <type>unknown</type> is not a valid
946 type.
947 Without the typecast, you will get errors like this:
948 <screen>
949 <computeroutput>
950 ERROR: could not determine polymorphic type because input has type "unknown"
951 </computeroutput>
952 </screen>
953 </para>
955 <para>
956 It is permitted to have polymorphic arguments with a fixed
957 return type, but the converse is not. For example:
958 <screen>
959 CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
960 SELECT $1 &gt; $2;
961 $$ LANGUAGE SQL;
963 SELECT is_greater(1, 2);
964 is_greater
965 ------------
967 (1 row)
969 CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
970 SELECT 1;
971 $$ LANGUAGE SQL;
972 ERROR: cannot determine result data type
973 DETAIL: A function returning a polymorphic type must have at least one polymorphic argument.
974 </screen>
975 </para>
977 <para>
978 Polymorphism can be used with functions that have output arguments.
979 For example:
980 <screen>
981 CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
982 AS 'select $1, array[$1,$1]' LANGUAGE SQL;
984 SELECT * FROM dup(22);
985 f2 | f3
986 ----+---------
987 22 | {22,22}
988 (1 row)
989 </screen>
990 </para>
992 <para>
993 Polymorphism can also be used with variadic functions.
994 For example:
995 <screen>
996 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
997 SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
998 $$ LANGUAGE SQL;
1000 SELECT anyleast(10, -1, 5, 4);
1001 anyleast
1002 ----------
1004 (1 row)
1006 SELECT anyleast('abc'::text, 'def');
1007 anyleast
1008 ----------
1010 (1 row)
1012 CREATE FUNCTION concat(text, VARIADIC anyarray) RETURNS text AS $$
1013 SELECT array_to_string($2, $1);
1014 $$ LANGUAGE SQL;
1016 SELECT concat('|', 1, 4, 2);
1017 concat
1018 --------
1019 1|4|2
1020 (1 row)
1021 </screen>
1022 </para>
1023 </sect2>
1024 </sect1>
1026 <sect1 id="xfunc-overload">
1027 <title>Function Overloading</title>
1029 <indexterm zone="xfunc-overload">
1030 <primary>overloading</primary>
1031 <secondary>functions</secondary>
1032 </indexterm>
1034 <para>
1035 More than one function can be defined with the same SQL name, so long
1036 as the arguments they take are different. In other words,
1037 function names can be <firstterm>overloaded</firstterm>. When a
1038 query is executed, the server will determine which function to
1039 call from the data types and the number of the provided arguments.
1040 Overloading can also be used to simulate functions with a variable
1041 number of arguments, up to a finite maximum number.
1042 </para>
1044 <para>
1045 When creating a family of overloaded functions, one should be
1046 careful not to create ambiguities. For instance, given the
1047 functions:
1048 <programlisting>
1049 CREATE FUNCTION test(int, real) RETURNS ...
1050 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1051 </programlisting>
1052 it is not immediately clear which function would be called with
1053 some trivial input like <literal>test(1, 1.5)</literal>. The
1054 currently implemented resolution rules are described in
1055 <xref linkend="typeconv">, but it is unwise to design a system that subtly
1056 relies on this behavior.
1057 </para>
1059 <para>
1060 A function that takes a single argument of a composite type should
1061 generally not have the same name as any attribute (field) of that type.
1062 Recall that <literal>attribute(table)</literal> is considered equivalent
1063 to <literal>table.attribute</literal>. In the case that there is an
1064 ambiguity between a function on a composite type and an attribute of
1065 the composite type, the attribute will always be used. It is possible
1066 to override that choice by schema-qualifying the function name
1067 (that is, <literal>schema.func(table)</literal>) but it's better to
1068 avoid the problem by not choosing conflicting names.
1069 </para>
1071 <para>
1072 Another possible conflict is between variadic and non-variadic functions.
1073 For instance, it is possible to create both <literal>foo(numeric)</> and
1074 <literal>foo(VARIADIC numeric[])</>. In this case it is unclear which one
1075 should be matched to a call providing a single numeric argument, such as
1076 <literal>foo(10.1)</>. The rule is that the function appearing
1077 earlier in the search path is used, or if the two functions are in the
1078 same schema, the non-variadic one is preferred.
1079 </para>
1081 <para>
1082 When overloading C-language functions, there is an additional
1083 constraint: The C name of each function in the family of
1084 overloaded functions must be different from the C names of all
1085 other functions, either internal or dynamically loaded. If this
1086 rule is violated, the behavior is not portable. You might get a
1087 run-time linker error, or one of the functions will get called
1088 (usually the internal one). The alternative form of the
1089 <literal>AS</> clause for the SQL <command>CREATE
1090 FUNCTION</command> command decouples the SQL function name from
1091 the function name in the C source code. For instance:
1092 <programlisting>
1093 CREATE FUNCTION test(int) RETURNS int
1094 AS '<replaceable>filename</>', 'test_1arg'
1095 LANGUAGE C;
1096 CREATE FUNCTION test(int, int) RETURNS int
1097 AS '<replaceable>filename</>', 'test_2arg'
1098 LANGUAGE C;
1099 </programlisting>
1100 The names of the C functions here reflect one of many possible conventions.
1101 </para>
1102 </sect1>
1104 <sect1 id="xfunc-volatility">
1105 <title>Function Volatility Categories</title>
1107 <indexterm zone="xfunc-volatility">
1108 <primary>volatility</primary>
1109 <secondary>functions</secondary>
1110 </indexterm>
1111 <indexterm zone="xfunc-volatility">
1112 <primary>VOLATILE</primary>
1113 </indexterm>
1114 <indexterm zone="xfunc-volatility">
1115 <primary>STABLE</primary>
1116 </indexterm>
1117 <indexterm zone="xfunc-volatility">
1118 <primary>IMMUTABLE</primary>
1119 </indexterm>
1121 <para>
1122 Every function has a <firstterm>volatility</> classification, with
1123 the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
1124 <literal>IMMUTABLE</>. <literal>VOLATILE</> is the default if the
1125 <xref linkend="sql-createfunction" endterm="sql-createfunction-title">
1126 command does not specify a category. The volatility category is a
1127 promise to the optimizer about the behavior of the function:
1129 <itemizedlist>
1130 <listitem>
1131 <para>
1132 A <literal>VOLATILE</> function can do anything, including modifying
1133 the database. It can return different results on successive calls with
1134 the same arguments. The optimizer makes no assumptions about the
1135 behavior of such functions. A query using a volatile function will
1136 re-evaluate the function at every row where its value is needed.
1137 </para>
1138 </listitem>
1139 <listitem>
1140 <para>
1141 A <literal>STABLE</> function cannot modify the database and is
1142 guaranteed to return the same results given the same arguments
1143 for all rows within a single statement. This category allows the
1144 optimizer to optimize multiple calls of the function to a single
1145 call. In particular, it is safe to use an expression containing
1146 such a function in an index scan condition. (Since an index scan
1147 will evaluate the comparison value only once, not once at each
1148 row, it is not valid to use a <literal>VOLATILE</> function in an
1149 index scan condition.)
1150 </para>
1151 </listitem>
1152 <listitem>
1153 <para>
1154 An <literal>IMMUTABLE</> function cannot modify the database and is
1155 guaranteed to return the same results given the same arguments forever.
1156 This category allows the optimizer to pre-evaluate the function when
1157 a query calls it with constant arguments. For example, a query like
1158 <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
1159 <literal>SELECT ... WHERE x = 4</>, because the function underlying
1160 the integer addition operator is marked <literal>IMMUTABLE</>.
1161 </para>
1162 </listitem>
1163 </itemizedlist>
1164 </para>
1166 <para>
1167 For best optimization results, you should label your functions with the
1168 strictest volatility category that is valid for them.
1169 </para>
1171 <para>
1172 Any function with side-effects <emphasis>must</> be labeled
1173 <literal>VOLATILE</>, so that calls to it cannot be optimized away.
1174 Even a function with no side-effects needs to be labeled
1175 <literal>VOLATILE</> if its value can change within a single query;
1176 some examples are <literal>random()</>, <literal>currval()</>,
1177 <literal>timeofday()</>.
1178 </para>
1180 <para>
1181 Another important example is that the <function>current_timestamp</>
1182 family of functions qualify as <literal>STABLE</>, since their values do
1183 not change within a transaction.
1184 </para>
1186 <para>
1187 There is relatively little difference between <literal>STABLE</> and
1188 <literal>IMMUTABLE</> categories when considering simple interactive
1189 queries that are planned and immediately executed: it doesn't matter
1190 a lot whether a function is executed once during planning or once during
1191 query execution startup. But there is a big difference if the plan is
1192 saved and reused later. Labeling a function <literal>IMMUTABLE</> when
1193 it really isn't might allow it to be prematurely folded to a constant during
1194 planning, resulting in a stale value being re-used during subsequent uses
1195 of the plan. This is a hazard when using prepared statements or when
1196 using function languages that cache plans (such as
1197 <application>PL/pgSQL</>).
1198 </para>
1200 <para>
1201 For functions written in SQL or in any of the standard procedural
1202 languages, there is a second important property determined by the
1203 volatility category, namely the visibility of any data changes that have
1204 been made by the SQL command that is calling the function. A
1205 <literal>VOLATILE</> function will see such changes, a <literal>STABLE</>
1206 or <literal>IMMUTABLE</> function will not. This behavior is implemented
1207 using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1208 <literal>STABLE</> and <literal>IMMUTABLE</> functions use a snapshot
1209 established as of the start of the calling query, whereas
1210 <literal>VOLATILE</> functions obtain a fresh snapshot at the start of
1211 each query they execute.
1212 </para>
1214 <note>
1215 <para>
1216 Functions written in C can manage snapshots however they want, but it's
1217 usually a good idea to make C functions work this way too.
1218 </para>
1219 </note>
1221 <para>
1222 Because of this snapshotting behavior,
1223 a function containing only <command>SELECT</> commands can safely be
1224 marked <literal>STABLE</>, even if it selects from tables that might be
1225 undergoing modifications by concurrent queries.
1226 <productname>PostgreSQL</productname> will execute all commands of a
1227 <literal>STABLE</> function using the snapshot established for the
1228 calling query, and so it will see a fixed view of the database throughout
1229 that query.
1230 </para>
1232 <para>
1233 The same snapshotting behavior is used for <command>SELECT</> commands
1234 within <literal>IMMUTABLE</> functions. It is generally unwise to select
1235 from database tables within an <literal>IMMUTABLE</> function at all,
1236 since the immutability will be broken if the table contents ever change.
1237 However, <productname>PostgreSQL</productname> does not enforce that you
1238 do not do that.
1239 </para>
1241 <para>
1242 A common error is to label a function <literal>IMMUTABLE</> when its
1243 results depend on a configuration parameter. For example, a function
1244 that manipulates timestamps might well have results that depend on the
1245 <xref linkend="guc-timezone"> setting. For safety, such functions should
1246 be labeled <literal>STABLE</> instead.
1247 </para>
1249 <note>
1250 <para>
1251 Before <productname>PostgreSQL</productname> release 8.0, the requirement
1252 that <literal>STABLE</> and <literal>IMMUTABLE</> functions cannot modify
1253 the database was not enforced by the system. Releases 8.0 and later enforce it
1254 by requiring SQL functions and procedural language functions of these
1255 categories to contain no SQL commands other than <command>SELECT</>.
1256 (This is not a completely bulletproof test, since such functions could
1257 still call <literal>VOLATILE</> functions that modify the database.
1258 If you do that, you will find that the <literal>STABLE</> or
1259 <literal>IMMUTABLE</> function does not notice the database changes
1260 applied by the called function, since they are hidden from its snapshot.)
1261 </para>
1262 </note>
1263 </sect1>
1265 <sect1 id="xfunc-pl">
1266 <title>Procedural Language Functions</title>
1268 <para>
1269 <productname>PostgreSQL</productname> allows user-defined functions
1270 to be written in other languages besides SQL and C. These other
1271 languages are generically called <firstterm>procedural
1272 languages</firstterm> (<acronym>PL</>s).
1273 Procedural languages aren't built into the
1274 <productname>PostgreSQL</productname> server; they are offered
1275 by loadable modules.
1276 See <xref linkend="xplang"> and following chapters for more
1277 information.
1278 </para>
1279 </sect1>
1281 <sect1 id="xfunc-internal">
1282 <title>Internal Functions</title>
1284 <indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
1286 <para>
1287 Internal functions are functions written in C that have been statically
1288 linked into the <productname>PostgreSQL</productname> server.
1289 The <quote>body</quote> of the function definition
1290 specifies the C-language name of the function, which need not be the
1291 same as the name being declared for SQL use.
1292 (For reasons of backwards compatibility, an empty body
1293 is accepted as meaning that the C-language function name is the
1294 same as the SQL name.)
1295 </para>
1297 <para>
1298 Normally, all internal functions present in the
1299 server are declared during the initialization of the database cluster (<command>initdb</command>),
1300 but a user could use <command>CREATE FUNCTION</command>
1301 to create additional alias names for an internal function.
1302 Internal functions are declared in <command>CREATE FUNCTION</command>
1303 with language name <literal>internal</literal>. For instance, to
1304 create an alias for the <function>sqrt</function> function:
1305 <programlisting>
1306 CREATE FUNCTION square_root(double precision) RETURNS double precision
1307 AS 'dsqrt'
1308 LANGUAGE internal
1309 STRICT;
1310 </programlisting>
1311 (Most internal functions expect to be declared <quote>strict</quote>.)
1312 </para>
1314 <note>
1315 <para>
1316 Not all <quote>predefined</quote> functions are
1317 <quote>internal</quote> in the above sense. Some predefined
1318 functions are written in SQL.
1319 </para>
1320 </note>
1321 </sect1>
1323 <sect1 id="xfunc-c">
1324 <title>C-Language Functions</title>
1326 <indexterm zone="xfunc-c">
1327 <primary>function</primary>
1328 <secondary>user-defined</secondary>
1329 <tertiary>in C</tertiary>
1330 </indexterm>
1332 <para>
1333 User-defined functions can be written in C (or a language that can
1334 be made compatible with C, such as C++). Such functions are
1335 compiled into dynamically loadable objects (also called shared
1336 libraries) and are loaded by the server on demand. The dynamic
1337 loading feature is what distinguishes <quote>C language</> functions
1338 from <quote>internal</> functions &mdash; the actual coding conventions
1339 are essentially the same for both. (Hence, the standard internal
1340 function library is a rich source of coding examples for user-defined
1341 C functions.)
1342 </para>
1344 <para>
1345 Two different calling conventions are currently used for C functions.
1346 The newer <quote>version 1</quote> calling convention is indicated by writing
1347 a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
1348 as illustrated below. Lack of such a macro indicates an old-style
1349 (<quote>version 0</quote>) function. The language name specified in <command>CREATE FUNCTION</command>
1350 is <literal>C</literal> in either case. Old-style functions are now deprecated
1351 because of portability problems and lack of functionality, but they
1352 are still supported for compatibility reasons.
1353 </para>
1355 <sect2 id="xfunc-c-dynload">
1356 <title>Dynamic Loading</title>
1358 <indexterm zone="xfunc-c-dynload">
1359 <primary>dynamic loading</primary>
1360 </indexterm>
1362 <para>
1363 The first time a user-defined function in a particular
1364 loadable object file is called in a session,
1365 the dynamic loader loads that object file into memory so that the
1366 function can be called. The <command>CREATE FUNCTION</command>
1367 for a user-defined C function must therefore specify two pieces of
1368 information for the function: the name of the loadable
1369 object file, and the C name (link symbol) of the specific function to call
1370 within that object file. If the C name is not explicitly specified then
1371 it is assumed to be the same as the SQL function name.
1372 </para>
1374 <para>
1375 The following algorithm is used to locate the shared object file
1376 based on the name given in the <command>CREATE FUNCTION</command>
1377 command:
1379 <orderedlist>
1380 <listitem>
1381 <para>
1382 If the name is an absolute path, the given file is loaded.
1383 </para>
1384 </listitem>
1386 <listitem>
1387 <para>
1388 If the name starts with the string <literal>$libdir</literal>,
1389 that part is replaced by the <productname>PostgreSQL</> package
1390 library directory
1391 name, which is determined at build time.<indexterm><primary>$libdir</></>
1392 </para>
1393 </listitem>
1395 <listitem>
1396 <para>
1397 If the name does not contain a directory part, the file is
1398 searched for in the path specified by the configuration variable
1399 <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1400 </para>
1401 </listitem>
1403 <listitem>
1404 <para>
1405 Otherwise (the file was not found in the path, or it contains a
1406 non-absolute directory part), the dynamic loader will try to
1407 take the name as given, which will most likely fail. (It is
1408 unreliable to depend on the current working directory.)
1409 </para>
1410 </listitem>
1411 </orderedlist>
1413 If this sequence does not work, the platform-specific shared
1414 library file name extension (often <filename>.so</filename>) is
1415 appended to the given name and this sequence is tried again. If
1416 that fails as well, the load will fail.
1417 </para>
1419 <para>
1420 It is recommended to locate shared libraries either relative to
1421 <literal>$libdir</literal> or through the dynamic library path.
1422 This simplifies version upgrades if the new installation is at a
1423 different location. The actual directory that
1424 <literal>$libdir</literal> stands for can be found out with the
1425 command <literal>pg_config --pkglibdir</literal>.
1426 </para>
1428 <para>
1429 The user ID the <productname>PostgreSQL</productname> server runs
1430 as must be able to traverse the path to the file you intend to
1431 load. Making the file or a higher-level directory not readable
1432 and/or not executable by the <systemitem>postgres</systemitem>
1433 user is a common mistake.
1434 </para>
1436 <para>
1437 In any case, the file name that is given in the
1438 <command>CREATE FUNCTION</command> command is recorded literally
1439 in the system catalogs, so if the file needs to be loaded again
1440 the same procedure is applied.
1441 </para>
1443 <note>
1444 <para>
1445 <productname>PostgreSQL</productname> will not compile a C function
1446 automatically. The object file must be compiled before it is referenced
1447 in a <command>CREATE
1448 FUNCTION</> command. See <xref linkend="dfunc"> for additional
1449 information.
1450 </para>
1451 </note>
1453 <indexterm zone="xfunc-c-dynload">
1454 <primary>magic block</primary>
1455 </indexterm>
1457 <para>
1458 To ensure that a dynamically loaded object file is not loaded into an
1459 incompatible server, <productname>PostgreSQL</productname> checks that the
1460 file contains a <quote>magic block</> with the appropriate contents.
1461 This allows the server to detect obvious incompatibilities, such as code
1462 compiled for a different major version of
1463 <productname>PostgreSQL</productname>. A magic block is required as of
1464 <productname>PostgreSQL</productname> 8.2. To include a magic block,
1465 write this in one (and only one) of the module source files, after having
1466 included the header <filename>fmgr.h</>:
1468 <programlisting>
1469 #ifdef PG_MODULE_MAGIC
1470 PG_MODULE_MAGIC;
1471 #endif
1472 </programlisting>
1474 The <literal>#ifdef</> test can be omitted if the code doesn't
1475 need to compile against pre-8.2 <productname>PostgreSQL</productname>
1476 releases.
1477 </para>
1479 <para>
1480 After it is used for the first time, a dynamically loaded object
1481 file is retained in memory. Future calls in the same session to
1482 the function(s) in that file will only incur the small overhead of
1483 a symbol table lookup. If you need to force a reload of an object
1484 file, for example after recompiling it, use the <xref
1485 linkend="sql-load" endterm="sql-load-title"> command or begin a
1486 fresh session.
1487 </para>
1489 <indexterm zone="xfunc-c-dynload">
1490 <primary>_PG_init</primary>
1491 </indexterm>
1492 <indexterm zone="xfunc-c-dynload">
1493 <primary>_PG_fini</primary>
1494 </indexterm>
1495 <indexterm zone="xfunc-c-dynload">
1496 <primary>library initialization function</primary>
1497 </indexterm>
1498 <indexterm zone="xfunc-c-dynload">
1499 <primary>library finalization function</primary>
1500 </indexterm>
1502 <para>
1503 Optionally, a dynamically loaded file can contain initialization and
1504 finalization functions. If the file includes a function named
1505 <function>_PG_init</>, that function will be called immediately after
1506 loading the file. The function receives no parameters and should
1507 return void. If the file includes a function named
1508 <function>_PG_fini</>, that function will be called immediately before
1509 unloading the file. Likewise, the function receives no parameters and
1510 should return void. Note that <function>_PG_fini</> will only be called
1511 during an unload of the file, not during process termination.
1512 (Presently, an unload only happens in the context of re-loading
1513 the file due to an explicit <command>LOAD</> command.)
1514 </para>
1516 </sect2>
1518 <sect2 id="xfunc-c-basetype">
1519 <title>Base Types in C-Language Functions</title>
1521 <indexterm zone="xfunc-c-basetype">
1522 <primary>data type</primary>
1523 <secondary>internal organization</secondary>
1524 </indexterm>
1526 <para>
1527 To know how to write C-language functions, you need to know how
1528 <productname>PostgreSQL</productname> internally represents base
1529 data types and how they can be passed to and from functions.
1530 Internally, <productname>PostgreSQL</productname> regards a base
1531 type as a <quote>blob of memory</quote>. The user-defined
1532 functions that you define over a type in turn define the way that
1533 <productname>PostgreSQL</productname> can operate on it. That
1534 is, <productname>PostgreSQL</productname> will only store and
1535 retrieve the data from disk and use your user-defined functions
1536 to input, process, and output the data.
1537 </para>
1539 <para>
1540 Base types can have one of three internal formats:
1542 <itemizedlist>
1543 <listitem>
1544 <para>
1545 pass by value, fixed-length
1546 </para>
1547 </listitem>
1548 <listitem>
1549 <para>
1550 pass by reference, fixed-length
1551 </para>
1552 </listitem>
1553 <listitem>
1554 <para>
1555 pass by reference, variable-length
1556 </para>
1557 </listitem>
1558 </itemizedlist>
1559 </para>
1561 <para>
1562 By-value types can only be 1, 2, or 4 bytes in length
1563 (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1564 You should be careful to define your types such that they will be the
1565 same size (in bytes) on all architectures. For example, the
1566 <literal>long</literal> type is dangerous because it is 4 bytes on some
1567 machines and 8 bytes on others, whereas <type>int</type> type is 4 bytes
1568 on most Unix machines. A reasonable implementation of the
1569 <type>int4</type> type on Unix machines might be:
1571 <programlisting>
1572 /* 4-byte integer, passed by value */
1573 typedef int int4;
1574 </programlisting>
1575 </para>
1577 <para>
1578 On the other hand, fixed-length types of any size can
1579 be passed by-reference. For example, here is a sample
1580 implementation of a <productname>PostgreSQL</productname> type:
1582 <programlisting>
1583 /* 16-byte structure, passed by reference */
1584 typedef struct
1586 double x, y;
1587 } Point;
1588 </programlisting>
1590 Only pointers to such types can be used when passing
1591 them in and out of <productname>PostgreSQL</productname> functions.
1592 To return a value of such a type, allocate the right amount of
1593 memory with <literal>palloc</literal>, fill in the allocated memory,
1594 and return a pointer to it. (Also, if you just want to return the
1595 same value as one of your input arguments that's of the same data type,
1596 you can skip the extra <literal>palloc</literal> and just return the
1597 pointer to the input value.)
1598 </para>
1600 <para>
1601 Finally, all variable-length types must also be passed
1602 by reference. All variable-length types must begin
1603 with a length field of exactly 4 bytes, and all data to
1604 be stored within that type must be located in the memory
1605 immediately following that length field. The
1606 length field contains the total length of the structure,
1607 that is, it includes the size of the length field
1608 itself.
1609 </para>
1611 <warning>
1612 <para>
1613 <emphasis>Never</> modify the contents of a pass-by-reference input
1614 value. If you do so you are likely to corrupt on-disk data, since
1615 the pointer you are given might point directly into a disk buffer.
1616 The sole exception to this rule is explained in
1617 <xref linkend="xaggr">.
1618 </para>
1619 </warning>
1621 <para>
1622 As an example, we can define the type <type>text</type> as
1623 follows:
1625 <programlisting>
1626 typedef struct {
1627 int4 length;
1628 char data[1];
1629 } text;
1630 </programlisting>
1632 Obviously, the data field declared here is not long enough to hold
1633 all possible strings. Since it's impossible to declare a variable-size
1634 structure in <acronym>C</acronym>, we rely on the knowledge that the
1635 <acronym>C</acronym> compiler won't range-check array subscripts. We
1636 just allocate the necessary amount of space and then access the array as
1637 if it were declared the right length. (This is a common trick, which
1638 you can read about in many textbooks about C.)
1639 </para>
1641 <para>
1642 When manipulating
1643 variable-length types, we must be careful to allocate
1644 the correct amount of memory and set the length field correctly.
1645 For example, if we wanted to store 40 bytes in a <structname>text</>
1646 structure, we might use a code fragment like this:
1648 <programlisting><![CDATA[
1649 #include "postgres.h"
1651 char buffer[40]; /* our source data */
1653 text *destination = (text *) palloc(VARHDRSZ + 40);
1654 destination->length = VARHDRSZ + 40;
1655 memcpy(destination->data, buffer, 40);
1658 </programlisting>
1660 <literal>VARHDRSZ</> is the same as <literal>sizeof(int4)</>, but
1661 it's considered good style to use the macro <literal>VARHDRSZ</>
1662 to refer to the size of the overhead for a variable-length type.
1663 </para>
1665 <para>
1666 <xref linkend="xfunc-c-type-table"> specifies which C type
1667 corresponds to which SQL type when writing a C-language function
1668 that uses a built-in type of <productname>PostgreSQL</>.
1669 The <quote>Defined In</quote> column gives the header file that
1670 needs to be included to get the type definition. (The actual
1671 definition might be in a different file that is included by the
1672 listed file. It is recommended that users stick to the defined
1673 interface.) Note that you should always include
1674 <filename>postgres.h</filename> first in any source file, because
1675 it declares a number of things that you will need anyway.
1676 </para>
1678 <table tocentry="1" id="xfunc-c-type-table">
1679 <title>Equivalent C Types for Built-In SQL Types</title>
1680 <tgroup cols="3">
1681 <thead>
1682 <row>
1683 <entry>
1684 SQL Type
1685 </entry>
1686 <entry>
1687 C Type
1688 </entry>
1689 <entry>
1690 Defined In
1691 </entry>
1692 </row>
1693 </thead>
1694 <tbody>
1695 <row>
1696 <entry><type>abstime</type></entry>
1697 <entry><type>AbsoluteTime</type></entry>
1698 <entry><filename>utils/nabstime.h</filename></entry>
1699 </row>
1700 <row>
1701 <entry><type>boolean</type></entry>
1702 <entry><type>bool</type></entry>
1703 <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
1704 </row>
1705 <row>
1706 <entry><type>box</type></entry>
1707 <entry><type>BOX*</type></entry>
1708 <entry><filename>utils/geo_decls.h</filename></entry>
1709 </row>
1710 <row>
1711 <entry><type>bytea</type></entry>
1712 <entry><type>bytea*</type></entry>
1713 <entry><filename>postgres.h</filename></entry>
1714 </row>
1715 <row>
1716 <entry><type>"char"</type></entry>
1717 <entry><type>char</type></entry>
1718 <entry>(compiler built-in)</entry>
1719 </row>
1720 <row>
1721 <entry><type>character</type></entry>
1722 <entry><type>BpChar*</type></entry>
1723 <entry><filename>postgres.h</filename></entry>
1724 </row>
1725 <row>
1726 <entry><type>cid</type></entry>
1727 <entry><type>CommandId</type></entry>
1728 <entry><filename>postgres.h</filename></entry>
1729 </row>
1730 <row>
1731 <entry><type>date</type></entry>
1732 <entry><type>DateADT</type></entry>
1733 <entry><filename>utils/date.h</filename></entry>
1734 </row>
1735 <row>
1736 <entry><type>smallint</type> (<type>int2</type>)</entry>
1737 <entry><type>int2</type> or <type>int16</type></entry>
1738 <entry><filename>postgres.h</filename></entry>
1739 </row>
1740 <row>
1741 <entry><type>int2vector</type></entry>
1742 <entry><type>int2vector*</type></entry>
1743 <entry><filename>postgres.h</filename></entry>
1744 </row>
1745 <row>
1746 <entry><type>integer</type> (<type>int4</type>)</entry>
1747 <entry><type>int4</type> or <type>int32</type></entry>
1748 <entry><filename>postgres.h</filename></entry>
1749 </row>
1750 <row>
1751 <entry><type>real</type> (<type>float4</type>)</entry>
1752 <entry><type>float4*</type></entry>
1753 <entry><filename>postgres.h</filename></entry>
1754 </row>
1755 <row>
1756 <entry><type>double precision</type> (<type>float8</type>)</entry>
1757 <entry><type>float8*</type></entry>
1758 <entry><filename>postgres.h</filename></entry>
1759 </row>
1760 <row>
1761 <entry><type>interval</type></entry>
1762 <entry><type>Interval*</type></entry>
1763 <entry><filename>utils/timestamp.h</filename></entry>
1764 </row>
1765 <row>
1766 <entry><type>lseg</type></entry>
1767 <entry><type>LSEG*</type></entry>
1768 <entry><filename>utils/geo_decls.h</filename></entry>
1769 </row>
1770 <row>
1771 <entry><type>name</type></entry>
1772 <entry><type>Name</type></entry>
1773 <entry><filename>postgres.h</filename></entry>
1774 </row>
1775 <row>
1776 <entry><type>oid</type></entry>
1777 <entry><type>Oid</type></entry>
1778 <entry><filename>postgres.h</filename></entry>
1779 </row>
1780 <row>
1781 <entry><type>oidvector</type></entry>
1782 <entry><type>oidvector*</type></entry>
1783 <entry><filename>postgres.h</filename></entry>
1784 </row>
1785 <row>
1786 <entry><type>path</type></entry>
1787 <entry><type>PATH*</type></entry>
1788 <entry><filename>utils/geo_decls.h</filename></entry>
1789 </row>
1790 <row>
1791 <entry><type>point</type></entry>
1792 <entry><type>POINT*</type></entry>
1793 <entry><filename>utils/geo_decls.h</filename></entry>
1794 </row>
1795 <row>
1796 <entry><type>regproc</type></entry>
1797 <entry><type>regproc</type></entry>
1798 <entry><filename>postgres.h</filename></entry>
1799 </row>
1800 <row>
1801 <entry><type>reltime</type></entry>
1802 <entry><type>RelativeTime</type></entry>
1803 <entry><filename>utils/nabstime.h</filename></entry>
1804 </row>
1805 <row>
1806 <entry><type>text</type></entry>
1807 <entry><type>text*</type></entry>
1808 <entry><filename>postgres.h</filename></entry>
1809 </row>
1810 <row>
1811 <entry><type>tid</type></entry>
1812 <entry><type>ItemPointer</type></entry>
1813 <entry><filename>storage/itemptr.h</filename></entry>
1814 </row>
1815 <row>
1816 <entry><type>time</type></entry>
1817 <entry><type>TimeADT</type></entry>
1818 <entry><filename>utils/date.h</filename></entry>
1819 </row>
1820 <row>
1821 <entry><type>time with time zone</type></entry>
1822 <entry><type>TimeTzADT</type></entry>
1823 <entry><filename>utils/date.h</filename></entry>
1824 </row>
1825 <row>
1826 <entry><type>timestamp</type></entry>
1827 <entry><type>Timestamp*</type></entry>
1828 <entry><filename>utils/timestamp.h</filename></entry>
1829 </row>
1830 <row>
1831 <entry><type>tinterval</type></entry>
1832 <entry><type>TimeInterval</type></entry>
1833 <entry><filename>utils/nabstime.h</filename></entry>
1834 </row>
1835 <row>
1836 <entry><type>varchar</type></entry>
1837 <entry><type>VarChar*</type></entry>
1838 <entry><filename>postgres.h</filename></entry>
1839 </row>
1840 <row>
1841 <entry><type>xid</type></entry>
1842 <entry><type>TransactionId</type></entry>
1843 <entry><filename>postgres.h</filename></entry>
1844 </row>
1845 </tbody>
1846 </tgroup>
1847 </table>
1849 <para>
1850 Now that we've gone over all of the possible structures
1851 for base types, we can show some examples of real functions.
1852 </para>
1853 </sect2>
1855 <sect2>
1856 <title>Version 0 Calling Conventions</title>
1858 <para>
1859 We present the <quote>old style</quote> calling convention first &mdash; although
1860 this approach is now deprecated, it's easier to get a handle on
1861 initially. In the version-0 method, the arguments and result
1862 of the C function are just declared in normal C style, but being
1863 careful to use the C representation of each SQL data type as shown
1864 above.
1865 </para>
1867 <para>
1868 Here are some examples:
1870 <programlisting><![CDATA[
1871 #include "postgres.h"
1872 #include <string.h>
1874 /* by value */
1877 add_one(int arg)
1879 return arg + 1;
1882 /* by reference, fixed length */
1884 float8 *
1885 add_one_float8(float8 *arg)
1887 float8 *result = (float8 *) palloc(sizeof(float8));
1889 *result = *arg + 1.0;
1891 return result;
1894 Point *
1895 makepoint(Point *pointx, Point *pointy)
1897 Point *new_point = (Point *) palloc(sizeof(Point));
1899 new_point->x = pointx->x;
1900 new_point->y = pointy->y;
1902 return new_point;
1905 /* by reference, variable length */
1907 text *
1908 copytext(text *t)
1911 * VARSIZE is the total size of the struct in bytes.
1913 text *new_t = (text *) palloc(VARSIZE(t));
1914 SET_VARSIZE(new_t, VARSIZE(t));
1916 * VARDATA is a pointer to the data region of the struct.
1918 memcpy((void *) VARDATA(new_t), /* destination */
1919 (void *) VARDATA(t), /* source */
1920 VARSIZE(t) - VARHDRSZ); /* how many bytes */
1921 return new_t;
1924 text *
1925 concat_text(text *arg1, text *arg2)
1927 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
1928 text *new_text = (text *) palloc(new_text_size);
1930 SET_VARSIZE(new_text, new_text_size);
1931 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
1932 memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
1933 VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
1934 return new_text;
1937 </programlisting>
1938 </para>
1940 <para>
1941 Supposing that the above code has been prepared in file
1942 <filename>funcs.c</filename> and compiled into a shared object,
1943 we could define the functions to <productname>PostgreSQL</productname>
1944 with commands like this:
1946 <programlisting>
1947 CREATE FUNCTION add_one(integer) RETURNS integer
1948 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
1949 LANGUAGE C STRICT;
1951 -- note overloading of SQL function name "add_one"
1952 CREATE FUNCTION add_one(double precision) RETURNS double precision
1953 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
1954 LANGUAGE C STRICT;
1956 CREATE FUNCTION makepoint(point, point) RETURNS point
1957 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
1958 LANGUAGE C STRICT;
1960 CREATE FUNCTION copytext(text) RETURNS text
1961 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
1962 LANGUAGE C STRICT;
1964 CREATE FUNCTION concat_text(text, text) RETURNS text
1965 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
1966 LANGUAGE C STRICT;
1967 </programlisting>
1968 </para>
1970 <para>
1971 Here, <replaceable>DIRECTORY</replaceable> stands for the
1972 directory of the shared library file (for instance the
1973 <productname>PostgreSQL</productname> tutorial directory, which
1974 contains the code for the examples used in this section).
1975 (Better style would be to use just <literal>'funcs'</> in the
1976 <literal>AS</> clause, after having added
1977 <replaceable>DIRECTORY</replaceable> to the search path. In any
1978 case, we can omit the system-specific extension for a shared
1979 library, commonly <literal>.so</literal> or
1980 <literal>.sl</literal>.)
1981 </para>
1983 <para>
1984 Notice that we have specified the functions as <quote>strict</quote>,
1985 meaning that
1986 the system should automatically assume a null result if any input
1987 value is null. By doing this, we avoid having to check for null inputs
1988 in the function code. Without this, we'd have to check for null values
1989 explicitly, by checking for a null pointer for each
1990 pass-by-reference argument. (For pass-by-value arguments, we don't
1991 even have a way to check!)
1992 </para>
1994 <para>
1995 Although this calling convention is simple to use,
1996 it is not very portable; on some architectures there are problems
1997 with passing data types that are smaller than <type>int</type> this way. Also, there is
1998 no simple way to return a null result, nor to cope with null arguments
1999 in any way other than making the function strict. The version-1
2000 convention, presented next, overcomes these objections.
2001 </para>
2002 </sect2>
2004 <sect2>
2005 <title>Version 1 Calling Conventions</title>
2007 <para>
2008 The version-1 calling convention relies on macros to suppress most
2009 of the complexity of passing arguments and results. The C declaration
2010 of a version-1 function is always:
2011 <programlisting>
2012 Datum funcname(PG_FUNCTION_ARGS)
2013 </programlisting>
2014 In addition, the macro call:
2015 <programlisting>
2016 PG_FUNCTION_INFO_V1(funcname);
2017 </programlisting>
2018 must appear in the same source file. (Conventionally. it's
2019 written just before the function itself.) This macro call is not
2020 needed for <literal>internal</>-language functions, since
2021 <productname>PostgreSQL</> assumes that all internal functions
2022 use the version-1 convention. It is, however, required for
2023 dynamically-loaded functions.
2024 </para>
2026 <para>
2027 In a version-1 function, each actual argument is fetched using a
2028 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2029 macro that corresponds to the argument's data type, and the
2030 result is returned using a
2031 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2032 macro for the return type.
2033 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2034 takes as its argument the number of the function argument to
2035 fetch, where the count starts at 0.
2036 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2037 takes as its argument the actual value to return.
2038 </para>
2040 <para>
2041 Here we show the same functions as above, coded in version-1 style:
2043 <programlisting><![CDATA[
2044 #include "postgres.h"
2045 #include <string.h>
2046 #include "fmgr.h"
2048 /* by value */
2050 PG_FUNCTION_INFO_V1(add_one);
2052 Datum
2053 add_one(PG_FUNCTION_ARGS)
2055 int32 arg = PG_GETARG_INT32(0);
2057 PG_RETURN_INT32(arg + 1);
2060 /* by reference, fixed length */
2062 PG_FUNCTION_INFO_V1(add_one_float8);
2064 Datum
2065 add_one_float8(PG_FUNCTION_ARGS)
2067 /* The macros for FLOAT8 hide its pass-by-reference nature. */
2068 float8 arg = PG_GETARG_FLOAT8(0);
2070 PG_RETURN_FLOAT8(arg + 1.0);
2073 PG_FUNCTION_INFO_V1(makepoint);
2075 Datum
2076 makepoint(PG_FUNCTION_ARGS)
2078 /* Here, the pass-by-reference nature of Point is not hidden. */
2079 Point *pointx = PG_GETARG_POINT_P(0);
2080 Point *pointy = PG_GETARG_POINT_P(1);
2081 Point *new_point = (Point *) palloc(sizeof(Point));
2083 new_point->x = pointx->x;
2084 new_point->y = pointy->y;
2086 PG_RETURN_POINT_P(new_point);
2089 /* by reference, variable length */
2091 PG_FUNCTION_INFO_V1(copytext);
2093 Datum
2094 copytext(PG_FUNCTION_ARGS)
2096 text *t = PG_GETARG_TEXT_P(0);
2098 * VARSIZE is the total size of the struct in bytes.
2100 text *new_t = (text *) palloc(VARSIZE(t));
2101 SET_VARSIZE(new_t, VARSIZE(t));
2103 * VARDATA is a pointer to the data region of the struct.
2105 memcpy((void *) VARDATA(new_t), /* destination */
2106 (void *) VARDATA(t), /* source */
2107 VARSIZE(t) - VARHDRSZ); /* how many bytes */
2108 PG_RETURN_TEXT_P(new_t);
2111 PG_FUNCTION_INFO_V1(concat_text);
2113 Datum
2114 concat_text(PG_FUNCTION_ARGS)
2116 text *arg1 = PG_GETARG_TEXT_P(0);
2117 text *arg2 = PG_GETARG_TEXT_P(1);
2118 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
2119 text *new_text = (text *) palloc(new_text_size);
2121 SET_VARSIZE(new_text, new_text_size);
2122 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
2123 memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
2124 VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
2125 PG_RETURN_TEXT_P(new_text);
2128 </programlisting>
2129 </para>
2131 <para>
2132 The <command>CREATE FUNCTION</command> commands are the same as
2133 for the version-0 equivalents.
2134 </para>
2136 <para>
2137 At first glance, the version-1 coding conventions might appear to
2138 be just pointless obscurantism. They do, however, offer a number
2139 of improvements, because the macros can hide unnecessary detail.
2140 An example is that in coding <function>add_one_float8</>, we no longer need to
2141 be aware that <type>float8</type> is a pass-by-reference type. Another
2142 example is that the <literal>GETARG</> macros for variable-length types allow
2143 for more efficient fetching of <quote>toasted</quote> (compressed or
2144 out-of-line) values.
2145 </para>
2147 <para>
2148 One big improvement in version-1 functions is better handling of null
2149 inputs and results. The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
2150 allows a function to test whether each input is null. (Of course, doing
2151 this is only necessary in functions not declared <quote>strict</>.)
2152 As with the
2153 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
2154 the input arguments are counted beginning at zero. Note that one
2155 should refrain from executing
2156 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2157 one has verified that the argument isn't null.
2158 To return a null result, execute <function>PG_RETURN_NULL()</function>;
2159 this works in both strict and nonstrict functions.
2160 </para>
2162 <para>
2163 Other options provided in the new-style interface are two
2164 variants of the
2165 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2166 macros. The first of these,
2167 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2168 guarantees to return a copy of the specified argument that is
2169 safe for writing into. (The normal macros will sometimes return a
2170 pointer to a value that is physically stored in a table, which
2171 must not be written to. Using the
2172 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2173 macros guarantees a writable result.)
2174 The second variant consists of the
2175 <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
2176 macros which take three arguments. The first is the number of the
2177 function argument (as above). The second and third are the offset and
2178 length of the segment to be returned. Offsets are counted from
2179 zero, and a negative length requests that the remainder of the
2180 value be returned. These macros provide more efficient access to
2181 parts of large values in the case where they have storage type
2182 <quote>external</quote>. (The storage type of a column can be specified using
2183 <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
2184 COLUMN <replaceable>colname</replaceable> SET STORAGE
2185 <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
2186 <literal>plain</>, <literal>external</>, <literal>extended</literal>,
2187 or <literal>main</>.)
2188 </para>
2190 <para>
2191 Finally, the version-1 function call conventions make it possible
2192 to return set results (<xref linkend="xfunc-c-return-set">) and
2193 implement trigger functions (<xref linkend="triggers">) and
2194 procedural-language call handlers (<xref
2195 linkend="plhandler">). Version-1 code is also more
2196 portable than version-0, because it does not break restrictions
2197 on function call protocol in the C standard. For more details
2198 see <filename>src/backend/utils/fmgr/README</filename> in the
2199 source distribution.
2200 </para>
2201 </sect2>
2203 <sect2>
2204 <title>Writing Code</title>
2206 <para>
2207 Before we turn to the more advanced topics, we should discuss
2208 some coding rules for <productname>PostgreSQL</productname>
2209 C-language functions. While it might be possible to load functions
2210 written in languages other than C into
2211 <productname>PostgreSQL</productname>, this is usually difficult
2212 (when it is possible at all) because other languages, such as
2213 C++, FORTRAN, or Pascal often do not follow the same calling
2214 convention as C. That is, other languages do not pass argument
2215 and return values between functions in the same way. For this
2216 reason, we will assume that your C-language functions are
2217 actually written in C.
2218 </para>
2220 <para>
2221 The basic rules for writing and building C functions are as follows:
2223 <itemizedlist>
2224 <listitem>
2225 <para>
2226 Use <literal>pg_config
2227 --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
2228 to find out where the <productname>PostgreSQL</> server header
2229 files are installed on your system (or the system that your
2230 users will be running on).
2231 </para>
2232 </listitem>
2234 <listitem>
2235 <para>
2236 Compiling and linking your code so that it can be dynamically
2237 loaded into <productname>PostgreSQL</productname> always
2238 requires special flags. See <xref linkend="dfunc"> for a
2239 detailed explanation of how to do it for your particular
2240 operating system.
2241 </para>
2242 </listitem>
2244 <listitem>
2245 <para>
2246 Remember to define a <quote>magic block</> for your shared library,
2247 as described in <xref linkend="xfunc-c-dynload">.
2248 </para>
2249 </listitem>
2251 <listitem>
2252 <para>
2253 When allocating memory, use the
2254 <productname>PostgreSQL</productname> functions
2255 <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
2256 instead of the corresponding C library functions
2257 <function>malloc</function> and <function>free</function>.
2258 The memory allocated by <function>palloc</function> will be
2259 freed automatically at the end of each transaction, preventing
2260 memory leaks.
2261 </para>
2262 </listitem>
2264 <listitem>
2265 <para>
2266 Always zero the bytes of your structures using
2267 <function>memset</function>. Without this, it's difficult to
2268 support hash indexes or hash joins, as you must pick out only
2269 the significant bits of your data structure to compute a hash.
2270 Even if you initialize all fields of your structure, there might be
2271 alignment padding (holes in the structure) that contain
2272 garbage values.
2273 </para>
2274 </listitem>
2276 <listitem>
2277 <para>
2278 Most of the internal <productname>PostgreSQL</productname>
2279 types are declared in <filename>postgres.h</filename>, while
2280 the function manager interfaces
2281 (<symbol>PG_FUNCTION_ARGS</symbol>, etc.) are in
2282 <filename>fmgr.h</filename>, so you will need to include at
2283 least these two files. For portability reasons it's best to
2284 include <filename>postgres.h</filename> <emphasis>first</>,
2285 before any other system or user header files. Including
2286 <filename>postgres.h</filename> will also include
2287 <filename>elog.h</filename> and <filename>palloc.h</filename>
2288 for you.
2289 </para>
2290 </listitem>
2292 <listitem>
2293 <para>
2294 Symbol names defined within object files must not conflict
2295 with each other or with symbols defined in the
2296 <productname>PostgreSQL</productname> server executable. You
2297 will have to rename your functions or variables if you get
2298 error messages to this effect.
2299 </para>
2300 </listitem>
2301 </itemizedlist>
2302 </para>
2303 </sect2>
2305 &dfunc;
2307 <sect2 id="xfunc-c-pgxs">
2308 <title>Extension Building Infrastructure</title>
2310 <indexterm zone="xfunc-c-pgxs">
2311 <primary>pgxs</primary>
2312 </indexterm>
2314 <para>
2315 If you are thinking about distributing your
2316 <productname>PostgreSQL</> extension modules, setting up a
2317 portable build system for them can be fairly difficult. Therefore
2318 the <productname>PostgreSQL</> installation provides a build
2319 infrastructure for extensions, called <acronym>PGXS</acronym>, so
2320 that simple extension modules can be built simply against an
2321 already installed server. Note that this infrastructure is not
2322 intended to be a universal build system framework that can be used
2323 to build all software interfacing to <productname>PostgreSQL</>;
2324 it simply automates common build rules for simple server extension
2325 modules. For more complicated packages, you need to write your
2326 own build system.
2327 </para>
2329 <para>
2330 To use the infrastructure for your extension, you must write a
2331 simple makefile. In that makefile, you need to set some variables
2332 and finally include the global <acronym>PGXS</acronym> makefile.
2333 Here is an example that builds an extension module named
2334 <literal>isbn_issn</literal> consisting of a shared library, an
2335 SQL script, and a documentation text file:
2336 <programlisting>
2337 MODULES = isbn_issn
2338 DATA_built = isbn_issn.sql
2339 DOCS = README.isbn_issn
2341 PG_CONFIG = pg_config
2342 PGXS := $(shell $(PG_CONFIG) --pgxs)
2343 include $(PGXS)
2344 </programlisting>
2345 The last three lines should always be the same. Earlier in the
2346 file, you assign variables or add custom
2347 <application>make</application> rules.
2348 </para>
2350 <para>
2351 The following variables can be set:
2353 <variablelist>
2354 <varlistentry>
2355 <term><varname>MODULES</varname></term>
2356 <listitem>
2357 <para>
2358 list of shared objects to be built from source file with same
2359 stem (do not include suffix in this list)
2360 </para>
2361 </listitem>
2362 </varlistentry>
2364 <varlistentry>
2365 <term><varname>DATA</varname></term>
2366 <listitem>
2367 <para>
2368 random files to install into <literal><replaceable>prefix</replaceable>/share/contrib</literal>
2369 </para>
2370 </listitem>
2371 </varlistentry>
2373 <varlistentry>
2374 <term><varname>DATA_built</varname></term>
2375 <listitem>
2376 <para>
2377 random files to install into
2378 <literal><replaceable>prefix</replaceable>/share/contrib</literal>,
2379 which need to be built first
2380 </para>
2381 </listitem>
2382 </varlistentry>
2384 <varlistentry>
2385 <term><varname>DOCS</varname></term>
2386 <listitem>
2387 <para>
2388 random files to install under
2389 <literal><replaceable>prefix</replaceable>/doc/contrib</literal>
2390 </para>
2391 </listitem>
2392 </varlistentry>
2394 <varlistentry>
2395 <term><varname>SCRIPTS</varname></term>
2396 <listitem>
2397 <para>
2398 script files (not binaries) to install into
2399 <literal><replaceable>prefix</replaceable>/bin</literal>
2400 </para>
2401 </listitem>
2402 </varlistentry>
2404 <varlistentry>
2405 <term><varname>SCRIPTS_built</varname></term>
2406 <listitem>
2407 <para>
2408 script files (not binaries) to install into
2409 <literal><replaceable>prefix</replaceable>/bin</literal>,
2410 which need to be built first
2411 </para>
2412 </listitem>
2413 </varlistentry>
2415 <varlistentry>
2416 <term><varname>REGRESS</varname></term>
2417 <listitem>
2418 <para>
2419 list of regression test cases (without suffix), see below
2420 </para>
2421 </listitem>
2422 </varlistentry>
2423 </variablelist>
2425 or at most one of these two:
2427 <variablelist>
2428 <varlistentry>
2429 <term><varname>PROGRAM</varname></term>
2430 <listitem>
2431 <para>
2432 a binary program to build (list objects files in <varname>OBJS</varname>)
2433 </para>
2434 </listitem>
2435 </varlistentry>
2437 <varlistentry>
2438 <term><varname>MODULE_big</varname></term>
2439 <listitem>
2440 <para>
2441 a shared object to build (list object files in <varname>OBJS</varname>)
2442 </para>
2443 </listitem>
2444 </varlistentry>
2445 </variablelist>
2447 The following can also be set:
2449 <variablelist>
2451 <varlistentry>
2452 <term><varname>EXTRA_CLEAN</varname></term>
2453 <listitem>
2454 <para>
2455 extra files to remove in <literal>make clean</literal>
2456 </para>
2457 </listitem>
2458 </varlistentry>
2460 <varlistentry>
2461 <term><varname>PG_CPPFLAGS</varname></term>
2462 <listitem>
2463 <para>
2464 will be added to <varname>CPPFLAGS</varname>
2465 </para>
2466 </listitem>
2467 </varlistentry>
2469 <varlistentry>
2470 <term><varname>PG_LIBS</varname></term>
2471 <listitem>
2472 <para>
2473 will be added to <varname>PROGRAM</varname> link line
2474 </para>
2475 </listitem>
2476 </varlistentry>
2478 <varlistentry>
2479 <term><varname>SHLIB_LINK</varname></term>
2480 <listitem>
2481 <para>
2482 will be added to <varname>MODULE_big</varname> link line
2483 </para>
2484 </listitem>
2485 </varlistentry>
2487 <varlistentry>
2488 <term><varname>PG_CONFIG</varname></term>
2489 <listitem>
2490 <para>
2491 path to <application>pg_config</> program for the
2492 <productname>PostgreSQL</productname> installation to build against
2493 (typically just <literal>pg_config</> to use the first one in your
2494 <varname>PATH</>)
2495 </para>
2496 </listitem>
2497 </varlistentry>
2498 </variablelist>
2499 </para>
2501 <para>
2502 Put this makefile as <literal>Makefile</literal> in the directory
2503 which holds your extension. Then you can do
2504 <literal>make</literal> to compile, and later <literal>make
2505 install</literal> to install your module. By default, the extension is
2506 compiled and installed for the
2507 <productname>PostgreSQL</productname> installation that
2508 corresponds to the first <command>pg_config</command> program
2509 found in your path. You can use a different installation by
2510 setting <varname>PG_CONFIG</varname> to point to its
2511 <command>pg_config</command> program, either within the makefile
2512 or on the <literal>make</literal> command line.
2513 </para>
2515 <caution>
2516 <para>
2517 Changing <varname>PG_CONFIG</varname> only works when building
2518 against <productname>PostgreSQL</productname> 8.3 or later.
2519 With older releases it does not work to set it to anything except
2520 <literal>pg_config</>; you must alter your <varname>PATH</>
2521 to select the installation to build against.
2522 </para>
2523 </caution>
2525 <para>
2526 The scripts listed in the <varname>REGRESS</> variable are used for
2527 regression testing of your module, just like <literal>make
2528 installcheck</literal> is used for the main
2529 <productname>PostgreSQL</productname> server. For this to work you need
2530 to have a subdirectory named <literal>sql/</literal> in your extension's
2531 directory, within which you put one file for each group of tests you want
2532 to run. The files should have extension <literal>.sql</literal>, which
2533 should not be included in the <varname>REGRESS</varname> list in the
2534 makefile. For each test there should be a file containing the expected
2535 result in a subdirectory named <literal>expected/</literal>, with extension
2536 <literal>.out</literal>. The tests are run by executing <literal>make
2537 installcheck</literal>, and the resulting output will be compared to the
2538 expected files. The differences will be written to the file
2539 <literal>regression.diffs</literal> in <command>diff -c</command> format.
2540 Note that trying to run a test which is missing the expected file will be
2541 reported as <quote>trouble</quote>, so make sure you have all expected
2542 files.
2543 </para>
2545 <tip>
2546 <para>
2547 The easiest way of creating the expected files is creating empty files,
2548 then carefully inspecting the result files after a test run (to be found
2549 in the <literal>results/</literal> directory), and copying them to
2550 <literal>expected/</literal> if they match what you want from the test.
2551 </para>
2553 </tip>
2554 </sect2>
2557 <sect2>
2558 <title>Composite-Type Arguments</title>
2560 <para>
2561 Composite types do not have a fixed layout like C structures.
2562 Instances of a composite type can contain null fields. In
2563 addition, composite types that are part of an inheritance
2564 hierarchy can have different fields than other members of the
2565 same inheritance hierarchy. Therefore,
2566 <productname>PostgreSQL</productname> provides a function
2567 interface for accessing fields of composite types from C.
2568 </para>
2570 <para>
2571 Suppose we want to write a function to answer the query:
2573 <programlisting>
2574 SELECT name, c_overpaid(emp, 1500) AS overpaid
2575 FROM emp
2576 WHERE name = 'Bill' OR name = 'Sam';
2577 </programlisting>
2579 Using call conventions version 0, we can define
2580 <function>c_overpaid</> as:
2582 <programlisting><![CDATA[
2583 #include "postgres.h"
2584 #include "executor/executor.h" /* for GetAttributeByName() */
2586 bool
2587 c_overpaid(HeapTupleHeader t, /* the current row of emp */
2588 int32 limit)
2590 bool isnull;
2591 int32 salary;
2593 salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
2594 if (isnull)
2595 return false;
2596 return salary > limit;
2599 </programlisting>
2601 In version-1 coding, the above would look like this:
2603 <programlisting><![CDATA[
2604 #include "postgres.h"
2605 #include "executor/executor.h" /* for GetAttributeByName() */
2607 PG_FUNCTION_INFO_V1(c_overpaid);
2609 Datum
2610 c_overpaid(PG_FUNCTION_ARGS)
2612 HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0);
2613 int32 limit = PG_GETARG_INT32(1);
2614 bool isnull;
2615 Datum salary;
2617 salary = GetAttributeByName(t, "salary", &isnull);
2618 if (isnull)
2619 PG_RETURN_BOOL(false);
2620 /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
2622 PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
2625 </programlisting>
2626 </para>
2628 <para>
2629 <function>GetAttributeByName</function> is the
2630 <productname>PostgreSQL</productname> system function that
2631 returns attributes out of the specified row. It has
2632 three arguments: the argument of type <type>HeapTupleHeader</type> passed
2633 into
2634 the function, the name of the desired attribute, and a
2635 return parameter that tells whether the attribute
2636 is null. <function>GetAttributeByName</function> returns a <type>Datum</type>
2637 value that you can convert to the proper data type by using the
2638 appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2639 macro. Note that the return value is meaningless if the null flag is
2640 set; always check the null flag before trying to do anything with the
2641 result.
2642 </para>
2644 <para>
2645 There is also <function>GetAttributeByNum</function>, which selects
2646 the target attribute by column number instead of name.
2647 </para>
2649 <para>
2650 The following command declares the function
2651 <function>c_overpaid</function> in SQL:
2653 <programlisting>
2654 CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
2655 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
2656 LANGUAGE C STRICT;
2657 </programlisting>
2659 Notice we have used <literal>STRICT</> so that we did not have to
2660 check whether the input arguments were NULL.
2661 </para>
2662 </sect2>
2664 <sect2>
2665 <title>Returning Rows (Composite Types)</title>
2667 <para>
2668 To return a row or composite-type value from a C-language
2669 function, you can use a special API that provides macros and
2670 functions to hide most of the complexity of building composite
2671 data types. To use this API, the source file must include:
2672 <programlisting>
2673 #include "funcapi.h"
2674 </programlisting>
2675 </para>
2677 <para>
2678 There are two ways you can build a composite data value (henceforth
2679 a <quote>tuple</>): you can build it from an array of Datum values,
2680 or from an array of C strings that can be passed to the input
2681 conversion functions of the tuple's column data types. In either
2682 case, you first need to obtain or construct a <structname>TupleDesc</>
2683 descriptor for the tuple structure. When working with Datums, you
2684 pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2685 and then call <function>heap_form_tuple</> for each row. When working
2686 with C strings, you pass the <structname>TupleDesc</> to
2687 <function>TupleDescGetAttInMetadata</>, and then call
2688 <function>BuildTupleFromCStrings</> for each row. In the case of a
2689 function returning a set of tuples, the setup steps can all be done
2690 once during the first call of the function.
2691 </para>
2693 <para>
2694 Several helper functions are available for setting up the needed
2695 <structname>TupleDesc</>. The recommended way to do this in most
2696 functions returning composite values is to call:
2697 <programlisting>
2698 TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
2699 Oid *resultTypeId,
2700 TupleDesc *resultTupleDesc)
2701 </programlisting>
2702 passing the same <literal>fcinfo</> struct passed to the calling function
2703 itself. (This of course requires that you use the version-1
2704 calling conventions.) <varname>resultTypeId</> can be specified
2705 as <literal>NULL</> or as the address of a local variable to receive the
2706 function's result type OID. <varname>resultTupleDesc</> should be the
2707 address of a local <structname>TupleDesc</> variable. Check that the
2708 result is <literal>TYPEFUNC_COMPOSITE</>; if so,
2709 <varname>resultTupleDesc</> has been filled with the needed
2710 <structname>TupleDesc</>. (If it is not, you can report an error along
2711 the lines of <quote>function returning record called in context that
2712 cannot accept type record</quote>.)
2713 </para>
2715 <tip>
2716 <para>
2717 <function>get_call_result_type</> can resolve the actual type of a
2718 polymorphic function result; so it is useful in functions that return
2719 scalar polymorphic results, not only functions that return composites.
2720 The <varname>resultTypeId</> output is primarily useful for functions
2721 returning polymorphic scalars.
2722 </para>
2723 </tip>
2725 <note>
2726 <para>
2727 <function>get_call_result_type</> has a sibling
2728 <function>get_expr_result_type</>, which can be used to resolve the
2729 expected output type for a function call represented by an expression
2730 tree. This can be used when trying to determine the result type from
2731 outside the function itself. There is also
2732 <function>get_func_result_type</>, which can be used when only the
2733 function's OID is available. However these functions are not able
2734 to deal with functions declared to return <structname>record</>, and
2735 <function>get_func_result_type</> cannot resolve polymorphic types,
2736 so you should preferentially use <function>get_call_result_type</>.
2737 </para>
2738 </note>
2740 <para>
2741 Older, now-deprecated functions for obtaining
2742 <structname>TupleDesc</>s are:
2743 <programlisting>
2744 TupleDesc RelationNameGetTupleDesc(const char *relname)
2745 </programlisting>
2746 to get a <structname>TupleDesc</> for the row type of a named relation,
2747 and:
2748 <programlisting>
2749 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2750 </programlisting>
2751 to get a <structname>TupleDesc</> based on a type OID. This can
2752 be used to get a <structname>TupleDesc</> for a base or
2753 composite type. It will not work for a function that returns
2754 <structname>record</>, however, and it cannot resolve polymorphic
2755 types.
2756 </para>
2758 <para>
2759 Once you have a <structname>TupleDesc</>, call:
2760 <programlisting>
2761 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2762 </programlisting>
2763 if you plan to work with Datums, or:
2764 <programlisting>
2765 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2766 </programlisting>
2767 if you plan to work with C strings. If you are writing a function
2768 returning set, you can save the results of these functions in the
2769 <structname>FuncCallContext</> structure &mdash; use the
2770 <structfield>tuple_desc</> or <structfield>attinmeta</> field
2771 respectively.
2772 </para>
2774 <para>
2775 When working with Datums, use:
2776 <programlisting>
2777 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2778 </programlisting>
2779 to build a <structname>HeapTuple</> given user data in Datum form.
2780 </para>
2782 <para>
2783 When working with C strings, use:
2784 <programlisting>
2785 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2786 </programlisting>
2787 to build a <structname>HeapTuple</> given user data
2788 in C string form. <literal>values</literal> is an array of C strings,
2789 one for each attribute of the return row. Each C string should be in
2790 the form expected by the input function of the attribute data
2791 type. In order to return a null value for one of the attributes,
2792 the corresponding pointer in the <parameter>values</> array
2793 should be set to <symbol>NULL</>. This function will need to
2794 be called again for each row you return.
2795 </para>
2797 <para>
2798 Once you have built a tuple to return from your function, it
2799 must be converted into a <type>Datum</>. Use:
2800 <programlisting>
2801 HeapTupleGetDatum(HeapTuple tuple)
2802 </programlisting>
2803 to convert a <structname>HeapTuple</> into a valid Datum. This
2804 <type>Datum</> can be returned directly if you intend to return
2805 just a single row, or it can be used as the current return value
2806 in a set-returning function.
2807 </para>
2809 <para>
2810 An example appears in the next section.
2811 </para>
2813 </sect2>
2815 <sect2 id="xfunc-c-return-set">
2816 <title>Returning Sets</title>
2818 <para>
2819 There is also a special API that provides support for returning
2820 sets (multiple rows) from a C-language function. A set-returning
2821 function must follow the version-1 calling conventions. Also,
2822 source files must include <filename>funcapi.h</filename>, as
2823 above.
2824 </para>
2826 <para>
2827 A set-returning function (<acronym>SRF</>) is called
2828 once for each item it returns. The <acronym>SRF</> must
2829 therefore save enough state to remember what it was doing and
2830 return the next item on each call.
2831 The structure <structname>FuncCallContext</> is provided to help
2832 control this process. Within a function, <literal>fcinfo-&gt;flinfo-&gt;fn_extra</>
2833 is used to hold a pointer to <structname>FuncCallContext</>
2834 across calls.
2835 <programlisting>
2836 typedef struct
2839 * Number of times we've been called before
2841 * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2842 * incremented for you every time SRF_RETURN_NEXT() is called.
2844 uint32 call_cntr;
2847 * OPTIONAL maximum number of calls
2849 * max_calls is here for convenience only and setting it is optional.
2850 * If not set, you must provide alternative means to know when the
2851 * function is done.
2853 uint32 max_calls;
2856 * OPTIONAL pointer to result slot
2858 * This is obsolete and only present for backwards compatibility, viz,
2859 * user-defined SRFs that use the deprecated TupleDescGetSlot().
2861 TupleTableSlot *slot;
2864 * OPTIONAL pointer to miscellaneous user-provided context information
2866 * user_fctx is for use as a pointer to your own data to retain
2867 * arbitrary context information between calls of your function.
2869 void *user_fctx;
2872 * OPTIONAL pointer to struct containing attribute type input metadata
2874 * attinmeta is for use when returning tuples (i.e., composite data types)
2875 * and is not used when returning base data types. It is only needed
2876 * if you intend to use BuildTupleFromCStrings() to create the return
2877 * tuple.
2879 AttInMetadata *attinmeta;
2882 * memory context used for structures that must live for multiple calls
2884 * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2885 * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2886 * context for any memory that is to be reused across multiple calls
2887 * of the SRF.
2889 MemoryContext multi_call_memory_ctx;
2892 * OPTIONAL pointer to struct containing tuple description
2894 * tuple_desc is for use when returning tuples (i.e., composite data types)
2895 * and is only needed if you are going to build the tuples with
2896 * heap_form_tuple() rather than with BuildTupleFromCStrings(). Note that
2897 * the TupleDesc pointer stored here should usually have been run through
2898 * BlessTupleDesc() first.
2900 TupleDesc tuple_desc;
2902 } FuncCallContext;
2903 </programlisting>
2904 </para>
2906 <para>
2907 An <acronym>SRF</> uses several functions and macros that
2908 automatically manipulate the <structname>FuncCallContext</>
2909 structure (and expect to find it via <literal>fn_extra</>). Use:
2910 <programlisting>
2911 SRF_IS_FIRSTCALL()
2912 </programlisting>
2913 to determine if your function is being called for the first or a
2914 subsequent time. On the first call (only) use:
2915 <programlisting>
2916 SRF_FIRSTCALL_INIT()
2917 </programlisting>
2918 to initialize the <structname>FuncCallContext</>. On every function call,
2919 including the first, use:
2920 <programlisting>
2921 SRF_PERCALL_SETUP()
2922 </programlisting>
2923 to properly set up for using the <structname>FuncCallContext</>
2924 and clearing any previously returned data left over from the
2925 previous pass.
2926 </para>
2928 <para>
2929 If your function has data to return, use:
2930 <programlisting>
2931 SRF_RETURN_NEXT(funcctx, result)
2932 </programlisting>
2933 to return it to the caller. (<literal>result</> must be of type
2934 <type>Datum</>, either a single value or a tuple prepared as
2935 described above.) Finally, when your function is finished
2936 returning data, use:
2937 <programlisting>
2938 SRF_RETURN_DONE(funcctx)
2939 </programlisting>
2940 to clean up and end the <acronym>SRF</>.
2941 </para>
2943 <para>
2944 The memory context that is current when the <acronym>SRF</> is called is
2945 a transient context that will be cleared between calls. This means
2946 that you do not need to call <function>pfree</> on everything
2947 you allocated using <function>palloc</>; it will go away anyway. However, if you want to allocate
2948 any data structures to live across calls, you need to put them somewhere
2949 else. The memory context referenced by
2950 <structfield>multi_call_memory_ctx</> is a suitable location for any
2951 data that needs to survive until the <acronym>SRF</> is finished running. In most
2952 cases, this means that you should switch into
2953 <structfield>multi_call_memory_ctx</> while doing the first-call setup.
2954 </para>
2956 <para>
2957 A complete pseudo-code example looks like the following:
2958 <programlisting>
2959 Datum
2960 my_set_returning_function(PG_FUNCTION_ARGS)
2962 FuncCallContext *funcctx;
2963 Datum result;
2964 MemoryContext oldcontext;
2965 <replaceable>further declarations as needed</replaceable>
2967 if (SRF_IS_FIRSTCALL())
2969 funcctx = SRF_FIRSTCALL_INIT();
2970 oldcontext = MemoryContextSwitchTo(funcctx-&gt;multi_call_memory_ctx);
2971 /* One-time setup code appears here: */
2972 <replaceable>user code</replaceable>
2973 <replaceable>if returning composite</replaceable>
2974 <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
2975 <replaceable>endif returning composite</replaceable>
2976 <replaceable>user code</replaceable>
2977 MemoryContextSwitchTo(oldcontext);
2980 /* Each-time setup code appears here: */
2981 <replaceable>user code</replaceable>
2982 funcctx = SRF_PERCALL_SETUP();
2983 <replaceable>user code</replaceable>
2985 /* this is just one way we might test whether we are done: */
2986 if (funcctx-&gt;call_cntr &lt; funcctx-&gt;max_calls)
2988 /* Here we want to return another item: */
2989 <replaceable>user code</replaceable>
2990 <replaceable>obtain result Datum</replaceable>
2991 SRF_RETURN_NEXT(funcctx, result);
2993 else
2995 /* Here we are done returning items and just need to clean up: */
2996 <replaceable>user code</replaceable>
2997 SRF_RETURN_DONE(funcctx);
3000 </programlisting>
3001 </para>
3003 <para>
3004 A complete example of a simple <acronym>SRF</> returning a composite type
3005 looks like:
3006 <programlisting><![CDATA[
3007 PG_FUNCTION_INFO_V1(retcomposite);
3009 Datum
3010 retcomposite(PG_FUNCTION_ARGS)
3012 FuncCallContext *funcctx;
3013 int call_cntr;
3014 int max_calls;
3015 TupleDesc tupdesc;
3016 AttInMetadata *attinmeta;
3018 /* stuff done only on the first call of the function */
3019 if (SRF_IS_FIRSTCALL())
3021 MemoryContext oldcontext;
3023 /* create a function context for cross-call persistence */
3024 funcctx = SRF_FIRSTCALL_INIT();
3026 /* switch to memory context appropriate for multiple function calls */
3027 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
3029 /* total number of tuples to be returned */
3030 funcctx->max_calls = PG_GETARG_UINT32(0);
3032 /* Build a tuple descriptor for our result type */
3033 if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
3034 ereport(ERROR,
3035 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
3036 errmsg("function returning record called in context "
3037 "that cannot accept type record")));
3040 * generate attribute metadata needed later to produce tuples from raw
3041 * C strings
3043 attinmeta = TupleDescGetAttInMetadata(tupdesc);
3044 funcctx->attinmeta = attinmeta;
3046 MemoryContextSwitchTo(oldcontext);
3049 /* stuff done on every call of the function */
3050 funcctx = SRF_PERCALL_SETUP();
3052 call_cntr = funcctx->call_cntr;
3053 max_calls = funcctx->max_calls;
3054 attinmeta = funcctx->attinmeta;
3056 if (call_cntr < max_calls) /* do when there is more left to send */
3058 char **values;
3059 HeapTuple tuple;
3060 Datum result;
3063 * Prepare a values array for building the returned tuple.
3064 * This should be an array of C strings which will
3065 * be processed later by the type input functions.
3067 values = (char **) palloc(3 * sizeof(char *));
3068 values[0] = (char *) palloc(16 * sizeof(char));
3069 values[1] = (char *) palloc(16 * sizeof(char));
3070 values[2] = (char *) palloc(16 * sizeof(char));
3072 snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
3073 snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
3074 snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
3076 /* build a tuple */
3077 tuple = BuildTupleFromCStrings(attinmeta, values);
3079 /* make the tuple into a datum */
3080 result = HeapTupleGetDatum(tuple);
3082 /* clean up (this is not really necessary) */
3083 pfree(values[0]);
3084 pfree(values[1]);
3085 pfree(values[2]);
3086 pfree(values);
3088 SRF_RETURN_NEXT(funcctx, result);
3090 else /* do when there is no more left */
3092 SRF_RETURN_DONE(funcctx);
3096 </programlisting>
3098 One way to declare this function in SQL is:
3099 <programlisting>
3100 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
3102 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
3103 RETURNS SETOF __retcomposite
3104 AS '<replaceable>filename</>', 'retcomposite'
3105 LANGUAGE C IMMUTABLE STRICT;
3106 </programlisting>
3107 A different way is to use OUT parameters:
3108 <programlisting>
3109 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
3110 OUT f1 integer, OUT f2 integer, OUT f3 integer)
3111 RETURNS SETOF record
3112 AS '<replaceable>filename</>', 'retcomposite'
3113 LANGUAGE C IMMUTABLE STRICT;
3114 </programlisting>
3115 Notice that in this method the output type of the function is formally
3116 an anonymous <structname>record</> type.
3117 </para>
3119 <para>
3120 The directory <filename>contrib/tablefunc</> in the source
3121 distribution contains more examples of set-returning functions.
3122 </para>
3123 </sect2>
3125 <sect2>
3126 <title>Polymorphic Arguments and Return Types</title>
3128 <para>
3129 C-language functions can be declared to accept and
3130 return the polymorphic types
3131 <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3132 and <type>anyenum</type>.
3133 See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
3134 of polymorphic functions. When function arguments or return types
3135 are defined as polymorphic types, the function author cannot know
3136 in advance what data type it will be called with, or
3137 need to return. There are two routines provided in <filename>fmgr.h</>
3138 to allow a version-1 C function to discover the actual data types
3139 of its arguments and the type it is expected to return. The routines are
3140 called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
3141 <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
3142 They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3143 information is not available.
3144 The structure <literal>flinfo</> is normally accessed as
3145 <literal>fcinfo-&gt;flinfo</>. The parameter <literal>argnum</>
3146 is zero based. <function>get_call_result_type</> can also be used
3147 as an alternative to <function>get_fn_expr_rettype</>.
3148 </para>
3150 <para>
3151 For example, suppose we want to write a function to accept a single
3152 element of any type, and return a one-dimensional array of that type:
3154 <programlisting>
3155 PG_FUNCTION_INFO_V1(make_array);
3156 Datum
3157 make_array(PG_FUNCTION_ARGS)
3159 ArrayType *result;
3160 Oid element_type = get_fn_expr_argtype(fcinfo-&gt;flinfo, 0);
3161 Datum element;
3162 bool isnull;
3163 int16 typlen;
3164 bool typbyval;
3165 char typalign;
3166 int ndims;
3167 int dims[MAXDIM];
3168 int lbs[MAXDIM];
3170 if (!OidIsValid(element_type))
3171 elog(ERROR, "could not determine data type of input");
3173 /* get the provided element, being careful in case it's NULL */
3174 isnull = PG_ARGISNULL(0);
3175 if (isnull)
3176 element = (Datum) 0;
3177 else
3178 element = PG_GETARG_DATUM(0);
3180 /* we have one dimension */
3181 ndims = 1;
3182 /* and one element */
3183 dims[0] = 1;
3184 /* and lower bound is 1 */
3185 lbs[0] = 1;
3187 /* get required info about the element type */
3188 get_typlenbyvalalign(element_type, &amp;typlen, &amp;typbyval, &amp;typalign);
3190 /* now build the array */
3191 result = construct_md_array(&amp;element, &amp;isnull, ndims, dims, lbs,
3192 element_type, typlen, typbyval, typalign);
3194 PG_RETURN_ARRAYTYPE_P(result);
3196 </programlisting>
3197 </para>
3199 <para>
3200 The following command declares the function
3201 <function>make_array</function> in SQL:
3203 <programlisting>
3204 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
3205 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
3206 LANGUAGE C IMMUTABLE;
3207 </programlisting>
3208 </para>
3210 <para>
3211 There is a variant of polymorphism that is only available to C-language
3212 functions: they can be declared to take parameters of type
3213 <literal>"any"</>. (Note that this type name must be double-quoted,
3214 since it's also a SQL reserved word.) This works like
3215 <type>anyelement</> except that it does not constrain different
3216 <literal>"any"</> arguments to be the same type, nor do they help
3217 determine the function's result type. A C-language function can also
3218 declare its final parameter to be <literal>VARIADIC "any"</>. This will
3219 match one or more actual arguments of any type (not necessarily the same
3220 type). These arguments will <emphasis>not</> be gathered into an array
3221 as happens with normal variadic functions; they will just be passed to
3222 the function separately. The <function>PG_NARGS()</> macro and the
3223 methods described above must be used to determine the number of actual
3224 arguments and their types when using this feature.
3225 </para>
3226 </sect2>
3228 <sect2>
3229 <title>Shared Memory and LWLocks</title>
3231 <para>
3232 Add-ins can reserve LWLocks and an allocation of shared memory on server
3233 startup. The add-in's shared library must be preloaded by specifying
3234 it in
3235 <xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared-preload-libraries</></>.
3236 Shared memory is reserved by calling:
3237 <programlisting>
3238 void RequestAddinShmemSpace(int size)
3239 </programlisting>
3240 from your <function>_PG_init</> function.
3241 </para>
3242 <para>
3243 LWLocks are reserved by calling:
3244 <programlisting>
3245 void RequestAddinLWLocks(int n)
3246 </programlisting>
3247 from <function>_PG_init</>.
3248 </para>
3249 <para>
3250 To avoid possible race-conditions, each backend should use the LWLock
3251 <function>AddinShmemInitLock</> when connecting to and initializing
3252 its allocation of shared memory, as shown here:
3253 <programlisting>
3254 static mystruct *ptr = NULL;
3256 if (!ptr)
3258 bool found;
3260 LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
3261 ptr = ShmemInitStruct("my struct name", size, &amp;found);
3262 if (!ptr)
3263 elog(ERROR, "out of shared memory");
3264 if (!found)
3266 initialize contents of shmem area;
3267 acquire any requested LWLocks using:
3268 ptr->mylockid = LWLockAssign();
3270 LWLockRelease(AddinShmemInitLock);
3272 </programlisting>
3273 </para>
3274 </sect2>
3275 </sect1>