doc/src/sgml/xfunc.sgml

   1 <!-- $PostgreSQL$ -->
   2
   3  <sect1 id="xfunc">
   4   <title>User-Defined Functions</title>
   5
   6   <indexterm zone="xfunc">
   7    <primary>function</primary>
   8    <secondary>user-defined</secondary>
   9   </indexterm>
  10
  11   <para>
  12    <productname>PostgreSQL</productname> provides four kinds of
  13    functions:
  14
  15    <itemizedlist>
  16     <listitem>
  17      <para>
  18       query language functions (functions written in
  19       <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
  20      </para>
  21     </listitem>
  22     <listitem>
  23      <para>
  24       procedural language functions (functions written in, for
  25       example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
  26       (<xref linkend="xfunc-pl">)
  27      </para>
  28     </listitem>
  29     <listitem>
  30      <para>
  31       internal functions (<xref linkend="xfunc-internal">)
  32      </para>
  33     </listitem>
  34     <listitem>
  35      <para>
  36       C-language functions (<xref linkend="xfunc-c">)
  37      </para>
  38     </listitem>
  39    </itemizedlist>
  40   </para>
  41
  42   <para>
  43    Every kind
  44    of function can take base types, composite types, or
  45    combinations of these as arguments (parameters). In addition,
  46    every kind of function can return a base type or
  47    a composite type.  Functions can also be defined to return
  48    sets of base or composite values.
  49   </para>
  50
  51   <para>
  52    Many kinds of functions can take or return certain pseudo-types
  53    (such as polymorphic types), but the available facilities vary.
  54    Consult the description of each kind of function for more details.
  55   </para>
  56
  57   <para>
  58    It's easiest to define <acronym>SQL</acronym>
  59    functions, so we'll start by discussing those.
  60    Most of the concepts presented for <acronym>SQL</acronym> functions
  61    will carry over to the other types of functions.
  62   </para>
  63
  64   <para>
  65    Throughout this chapter, it can be useful to look at the reference
  66    page of the <xref linkend="sql-createfunction"
  67    endterm="sql-createfunction-title"> command to
  68    understand the examples better.  Some examples from this chapter
  69    can be found in <filename>funcs.sql</filename> and
  70    <filename>funcs.c</filename> in the <filename>src/tutorial</>
  71    directory in the <productname>PostgreSQL</productname> source
  72    distribution.
  73   </para>
  74   </sect1>
  75
  76   <sect1 id="xfunc-sql">
  77    <title>Query Language (<acronym>SQL</acronym>) Functions</title>
  78
  79    <indexterm zone="xfunc-sql">
  80     <primary>function</primary>
  81     <secondary>user-defined</secondary>
  82     <tertiary>in SQL</tertiary>
  83    </indexterm>
  84
  85    <para>
  86     SQL functions execute an arbitrary list of SQL statements, returning
  87     the result of the last query in the list.
  88     In the simple (non-set)
  89     case, the first row of the last query's result will be returned.
  90     (Bear in mind that <quote>the first row</quote> of a multirow
  91     result is not well-defined unless you use <literal>ORDER BY</>.)
  92     If the last query happens
  93     to return no rows at all, the null value will be returned.
  94    </para>
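   <para>
    As a minimal sketch of the point about row ordering, the following
    function uses <literal>ORDER BY</literal> so that <quote>the first
    row</quote> is well-defined.  The name <function>highest_paid</function>
    is hypothetical, and the sketch assumes the <literal>emp</literal> table
    used in the examples later in this section already exists:

<programlisting>
CREATE FUNCTION highest_paid() RETURNS text AS $$
    -- ORDER BY makes the choice of "first row" predictable;
    -- NULLS LAST keeps rows without a salary from sorting first,
    -- and name serves as a tie-breaker for equal salaries
    SELECT name FROM emp ORDER BY salary DESC NULLS LAST, name;
$$ LANGUAGE SQL;
</programlisting>

    If <literal>emp</literal> is empty, the query returns no rows and the
    function returns the null value.
   </para>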
  95
  96    <para>
  97     Alternatively, an SQL function can be declared to return a set,
  98     by specifying the function's return type as <literal>SETOF
  99     <replaceable>sometype</></literal>, or equivalently by declaring it as
 100     <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.  In this case
 101     all rows of the last query's result are returned.  Further details appear
 102     below.
 103    </para>
 104
 105    <para>
 106     The body of an SQL function must be a list of SQL
 107     statements separated by semicolons.  A semicolon after the last
 108     statement is optional.  Unless the function is declared to return
 109     <type>void</>, the last statement must be a <command>SELECT</>,
 110     or an <command>INSERT</>, <command>UPDATE</>, or <command>DELETE</>
 111     that has a <literal>RETURNING</> clause.
 112    </para>
 113
 114     <para>
 115      Any collection of commands in the <acronym>SQL</acronym>
 116      language can be packaged together and defined as a function.
 117      Besides <command>SELECT</command> queries, the commands can include data
 118      modification queries (<command>INSERT</command>,
 119      <command>UPDATE</command>, and <command>DELETE</command>), as well as
 120      other SQL commands. (The only exception is that you cannot put
 121      <command>BEGIN</>, <command>COMMIT</>, <command>ROLLBACK</>, or
 122      <command>SAVEPOINT</> commands into an <acronym>SQL</acronym> function.)
 123      However, the final command
 124      must be a <command>SELECT</command> or have a <literal>RETURNING</>
 125      clause that returns whatever is
 126      specified as the function's return type.  Alternatively, if you
 127      want to define an SQL function that performs actions but has no
 128      useful value to return, you can define it as returning <type>void</>.
 129      For example, this function removes rows with negative salaries from
 130      the <literal>emp</> table:
 131
 132 <screen>
 133 CREATE FUNCTION clean_emp() RETURNS void AS '
 134     DELETE FROM emp
 135         WHERE salary &lt; 0;
 136 ' LANGUAGE SQL;
 137
 138 SELECT clean_emp();
 139
 140  clean_emp
 141 -----------
 142
 143 (1 row)
 144 </screen>
 145     </para>
 146
 147    <para>
 148     The syntax of the <command>CREATE FUNCTION</command> command requires
 149     the function body to be written as a string constant.  It is usually
 150     most convenient to use dollar quoting (see <xref
 151     linkend="sql-syntax-dollar-quoting">) for the string constant.
 152     If you choose to use regular single-quoted string constant syntax,
 153     you must double single quote marks (<literal>'</>) and backslashes
 154     (<literal>\</>) (assuming escape string syntax) in the body of
 155     the function (see <xref linkend="sql-syntax-strings">).
 156    </para>
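   <para>
    To illustrate the difference, here is a sketch of one function body
    written with each quoting style.  The function name
    <function>default_name</function> is hypothetical and chosen only for
    this illustration:

<programlisting>
-- Dollar quoting: the body is written exactly as it would be run
CREATE FUNCTION default_name() RETURNS text AS $$
    SELECT 'None'::text;
$$ LANGUAGE SQL;

-- Single-quoted body: every quote mark inside the body is doubled
CREATE OR REPLACE FUNCTION default_name() RETURNS text AS '
    SELECT ''None''::text;
' LANGUAGE SQL;
</programlisting>

    Both definitions should store the same function body; the second one
    merely needs the extra quote marks to get it through the string parser.
   </para>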
 157
 158    <para>
 159     Arguments to the SQL function are referenced in the function
 160     body using the syntax <literal>$<replaceable>n</></>: <literal>$1</>
 161     refers to the first argument, <literal>$2</> to the second, and so on.
 162     If an argument is of a composite type, then the dot notation,
 163     e.g., <literal>$1.name</literal>, can be used to access attributes
 164     of the argument.  The arguments can only be used as data values,
 165     not as identifiers.  Thus for example this is reasonable:
 166 <programlisting>
 167 INSERT INTO mytable VALUES ($1);
 168 </programlisting>
 169 but this will not work:
 170 <programlisting>
 171 INSERT INTO $1 VALUES (42);
 172 </programlisting>
 173    </para>
 174
 175    <sect2 id="xfunc-sql-base-functions">
 176     <title><acronym>SQL</acronym> Functions on Base Types</title>
 177
 178     <para>
 179      The simplest possible <acronym>SQL</acronym> function has no arguments and
 180      simply returns a base type, such as <type>integer</type>:
 181
 182 <screen>
 183 CREATE FUNCTION one() RETURNS integer AS $$
 184     SELECT 1 AS result;
 185 $$ LANGUAGE SQL;
 186
 187 -- Alternative syntax for string literal:
 188 CREATE FUNCTION one() RETURNS integer AS '
 189     SELECT 1 AS result;
 190 ' LANGUAGE SQL;
 191
 192 SELECT one();
 193
 194  one
 195 -----
 196    1
 197 </screen>
 198     </para>
 199
 200     <para>
 201      Notice that we defined a column alias within the function body for
 202      the result of the function (with the name <literal>result</>), but this
 203      column alias is not visible outside the function.  Hence, the result is
 204      labeled <literal>one</> instead of <literal>result</>.
 205     </para>
 206
 207     <para>
 208      It is almost as easy to define <acronym>SQL</acronym> functions
 209      that take base types as arguments.  In the example below, notice
 210      how we refer to the arguments within the function as <literal>$1</>
 211      and <literal>$2</>.
 212
 213 <screen>
 214 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
 215     SELECT $1 + $2;
 216 $$ LANGUAGE SQL;
 217
 218 SELECT add_em(1, 2) AS answer;
 219
 220  answer
 221 --------
 222       3
 223 </screen>
 224     </para>
 225
 226     <para>
 227      Here is a more useful function, which might be used to debit a
 228      bank account:
 229
 230 <programlisting>
 231 CREATE FUNCTION tf1 (integer, numeric) RETURNS integer AS $$
 232     UPDATE bank
 233         SET balance = balance - $2
 234         WHERE accountno = $1;
 235     SELECT 1;
 236 $$ LANGUAGE SQL;
 237 </programlisting>
 238
 239      A user could execute this function to debit account 17 by $100.00 as
 240      follows:
 241
 242 <programlisting>
 243 SELECT tf1(17, 100.0);
 244 </programlisting>
 245     </para>
 246
 247     <para>
 248      In practice one would probably like a more useful result from the
 249      function than a constant 1, so a more likely definition
 250      is:
 251
 252 <programlisting>
 253 CREATE FUNCTION tf1 (integer, numeric) RETURNS numeric AS $$
 254     UPDATE bank
 255         SET balance = balance - $2
 256         WHERE accountno = $1;
 257     SELECT balance FROM bank WHERE accountno = $1;
 258 $$ LANGUAGE SQL;
 259 </programlisting>
 260
 261      which adjusts the balance and returns the new balance.
 262      The same thing could be done in one command using <literal>RETURNING</>:
 263
 264 <programlisting>
 265 CREATE FUNCTION tf1 (integer, numeric) RETURNS numeric AS $$
 266     UPDATE bank
 267         SET balance = balance - $2
 268         WHERE accountno = $1
 269     RETURNING balance;
 270 $$ LANGUAGE SQL;
 271 </programlisting>
 272     </para>
 273    </sect2>
 274
 275    <sect2>
 276     <title><acronym>SQL</acronym> Functions on Composite Types</title>
 277
 278     <para>
 279      When writing functions with arguments of composite
 280      types, we must not only specify which
 281      argument we want (as we did above with <literal>$1</> and <literal>$2</literal>) but
 282      also the desired attribute (field) of that argument.  For example,
 283      suppose that
 284      <type>emp</type> is a table containing employee data, and therefore
 285      also the name of the composite type of each row of the table.  Here
 286      is a function <function>double_salary</function> that computes what someone's
 287      salary would be if it were doubled:
 288
 289 <screen>
 290 CREATE TABLE emp (
 291     name        text,
 292     salary      numeric,
 293     age         integer,
 294     cubicle     point
 295 );
 296
 297 CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
 298     SELECT $1.salary * 2 AS salary;
 299 $$ LANGUAGE SQL;
 300
 301 SELECT name, double_salary(emp.*) AS dream
 302     FROM emp
 303     WHERE emp.cubicle ~= point '(2,1)';
 304
 305  name | dream
 306 ------+-------
 307  Bill |  8400
 308 </screen>
 309     </para>
 310
 311     <para>
 312      Notice the use of the syntax <literal>$1.salary</literal>
 313      to select one field of the argument row value.  Also notice
 314      how the calling <command>SELECT</> command uses <literal>*</>
 315      to select
 316      the entire current row of a table as a composite value.  The table
 317      row can alternatively be referenced using just the table name,
 318      like this:
 319 <screen>
 320 SELECT name, double_salary(emp) AS dream
 321     FROM emp
 322     WHERE emp.cubicle ~= point '(2,1)';
 323 </screen>
 324      but this usage is deprecated since it's easy to get confused.
 325     </para>
 326
 327     <para>
 328      Sometimes it is handy to construct a composite argument value
 329      on-the-fly.  This can be done with the <literal>ROW</> construct.
 330      For example, we could adjust the data being passed to the function:
 331 <screen>
 332 SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
 333     FROM emp;
 334 </screen>
 335     </para>
 336
 337     <para>
 338      It is also possible to build a function that returns a composite type.
 339      This is an example of a function
 340      that returns a single <type>emp</type> row:
 341
 342 <programlisting>
 343 CREATE FUNCTION new_emp() RETURNS emp AS $$
 344     SELECT text 'None' AS name,
 345         1000.0 AS salary,
 346         25 AS age,
 347         point '(2,2)' AS cubicle;
 348 $$ LANGUAGE SQL;
 349 </programlisting>
 350
 351      In this example we have specified each of the attributes
 352      with a constant value, but any computation
 353      could have been substituted for these constants.
 354     </para>
 355
 356     <para>
 357      Note two important things about defining the function:
 358
 359      <itemizedlist>
 360       <listitem>
 361        <para>
 362         The select list order in the query must be exactly the same as
 363         that in which the columns appear in the table associated
 364         with the composite type.  (Naming the columns, as we did above,
 365         is irrelevant to the system.)
 366        </para>
 367       </listitem>
 368       <listitem>
 369        <para>
 370         You must typecast the expressions to match the
 371         definition of the composite type, or you will get errors like this:
 372 <screen>
 373 <computeroutput>
 374 ERROR:  function declared to return emp returns varchar instead of text at column 1
 375 </computeroutput>
 376 </screen>
 377        </para>
 378       </listitem>
 379      </itemizedlist>
 380     </para>
 381
 382     <para>
 383      A different way to define the same function is:
 384
 385 <programlisting>
 386 CREATE FUNCTION new_emp() RETURNS emp AS $$
 387     SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
 388 $$ LANGUAGE SQL;
 389 </programlisting>
 390
 391      Here we wrote a <command>SELECT</> that returns just a single
 392      column of the correct composite type.  This isn't really better
 393      in this situation, but it is a handy alternative in some cases
 394      &mdash; for example, if we need to compute the result by calling
 395      another function that returns the desired composite value.
 396     </para>
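    <para>
     A sketch of that last case, assuming the <function>new_emp</function>
     function defined above already exists; the wrapper name
     <function>default_emp</function> is hypothetical:

<programlisting>
CREATE FUNCTION default_emp() RETURNS emp AS $$
    -- The single output column is already of composite type emp,
    -- so no column-by-column select list is needed
    SELECT new_emp();
$$ LANGUAGE SQL;
</programlisting>
    </para>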
 397
 398     <para>
 399      We could call this function directly in either of two ways:
 400
 401 <screen>
 402 SELECT new_emp();
 403
 404          new_emp
 405 --------------------------
 406  (None,1000.0,25,"(2,2)")
 407
 408 SELECT * FROM new_emp();
 409
 410  name | salary | age | cubicle
 411 ------+--------+-----+---------
 412  None | 1000.0 |  25 | (2,2)
 413 </screen>
 414
 415      The second way is described more fully in <xref
 416      linkend="xfunc-sql-table-functions">.
 417     </para>
 418
 419     <para>
 420      When you use a function that returns a composite type,
 421      you might want only one field (attribute) from its result.
 422      You can do that with syntax like this:
 423
 424 <screen>
 425 SELECT (new_emp()).name;
 426
 427  name
 428 ------
 429  None
 430 </screen>
 431
 432      The extra parentheses are needed to keep the parser from getting
 433      confused.  If you try to do it without them, you get something like this:
 434
 435 <screen>
 436 SELECT new_emp().name;
 437 ERROR:  syntax error at or near "."
 438 LINE 1: SELECT new_emp().name;
 439                         ^
 440 </screen>
 441     </para>
 442
 443     <para>
 444      Another option is to use
 445      functional notation for extracting an attribute.  The simple way
 446      to explain this is that we can use the
 447      notations <literal>attribute(table)</> and <literal>table.attribute</>
 448      interchangeably.
 449
 450 <screen>
 451 SELECT name(new_emp());
 452
 453  name
 454 ------
 455  None
 456 </screen>
 457
 458 <screen>
 459 -- This is the same as:
 460 -- SELECT emp.name AS youngster FROM emp WHERE emp.age &lt; 30;
 461
 462 SELECT name(emp) AS youngster FROM emp WHERE age(emp) &lt; 30;
 463
 464  youngster
 465 -----------
 466  Sam
 467  Andy
 468 </screen>
 469     </para>
 470
 471     <tip>
 472      <para>
 473       The equivalence between functional notation and attribute notation
 474       makes it possible to use functions on composite types to emulate
 475       <quote>computed fields</>.
 476       <indexterm>
 477        <primary>computed field</primary>
 478       </indexterm>
 479       <indexterm>
 480        <primary>field</primary>
 481        <secondary>computed</secondary>
 482       </indexterm>
 483       For example, using the previous definition
 484       for <literal>double_salary(emp)</>, we can write
 485
 486 <screen>
 487 SELECT emp.name, emp.double_salary FROM emp;
 488 </screen>
 489
 490       An application using this wouldn't need to be directly aware that
 491       <literal>double_salary</> isn't a real column of the table.
 492       (You can also emulate computed fields with views.)
 493      </para>
 494     </tip>
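    <para>
     The view-based alternative mentioned in the tip might look like the
     following sketch.  The view name <literal>emp_extended</literal> is
     hypothetical, and the <literal>emp</literal> table is assumed to be
     defined as shown earlier:

<programlisting>
CREATE VIEW emp_extended AS
    SELECT name, salary, age, cubicle,
           salary * 2 AS double_salary   -- the computed field
    FROM emp;

SELECT name, double_salary FROM emp_extended;
</programlisting>

     Unlike the function-based approach, the computed field here is an
     ordinary column of the view, so it also appears in
     <literal>SELECT *</literal>.
    </para>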
 495
 496     <para>
 497      Another way to use a function returning a composite type is to pass the
 498      result to another function that accepts the correct row type as input:
 499
 500 <screen>
 501 CREATE FUNCTION getname(emp) RETURNS text AS $$
 502     SELECT $1.name;
 503 $$ LANGUAGE SQL;
 504
 505 SELECT getname(new_emp());
 506  getname
 507 ---------
 508  None
 509 (1 row)
 510 </screen>
 511     </para>
 512
 513     <para>
 514      Still another way to use a function that returns a composite type is to
 515      call it as a table function, as described in <xref
 516      linkend="xfunc-sql-table-functions">.
 517     </para>
 518    </sect2>
 519
 520    <sect2 id="xfunc-output-parameters">
 521     <title><acronym>SQL</> Functions with Output Parameters</title>
 522
 523    <indexterm>
 524     <primary>function</primary>
 525     <secondary>output parameter</secondary>
 526    </indexterm>
 527
 528     <para>
 529      An alternative way of describing a function's results is to define it
 530      with <firstterm>output parameters</>, as in this example:
 531
 532 <screen>
 533 CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
 534 AS 'SELECT $1 + $2'
 535 LANGUAGE SQL;
 536
 537 SELECT add_em(3,7);
 538  add_em
 539 --------
 540      10
 541 (1 row)
 542 </screen>
 543
 544      This is not essentially different from the version of <literal>add_em</>
 545      shown in <xref linkend="xfunc-sql-base-functions">.  The real value of
 546      output parameters is that they provide a convenient way of defining
 547      functions that return several columns.  For example,
 548
 549 <screen>
 550 CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
 551 AS 'SELECT $1 + $2, $1 * $2'
 552 LANGUAGE SQL;
 553
 554 SELECT * FROM sum_n_product(11,42);
 555  sum | product
 556 -----+---------
 557   53 |     462
 558 (1 row)
 559 </screen>
 560
 561      What has essentially happened here is that we have created an anonymous
 562      composite type for the result of the function.  The above example has
 563      the same end result as
 564
 565 <screen>
 566 CREATE TYPE sum_prod AS (sum int, product int);
 567
 568 CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
 569 AS 'SELECT $1 + $2, $1 * $2'
 570 LANGUAGE SQL;
 571 </screen>
 572
 573      but not having to bother with the separate composite type definition
 574      is often handy.
 575     </para>
 576
 577     <para>
 578      Notice that output parameters are not included in the calling argument
 579      list when invoking such a function from SQL.  This is because
 580      <productname>PostgreSQL</productname> considers only the input
 581      parameters to define the function's calling signature.  That means
 582      also that only the input parameters matter when referencing the function
 583      for purposes such as dropping it.  We could drop the above function
 584      with either of
 585
 586 <screen>
 587 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
 588 DROP FUNCTION sum_n_product (int, int);
 589 </screen>
 590     </para>
 591
 592     <para>
 593      Parameters can be marked as <literal>IN</> (the default),
 594      <literal>OUT</>, <literal>INOUT</>, or <literal>VARIADIC</>.
 595      An <literal>INOUT</>
 596      parameter serves as both an input parameter (part of the calling
 597      argument list) and an output parameter (part of the result record type).
 598      <literal>VARIADIC</> parameters are input parameters, but are treated
 599      specially as described next.
 600     </para>
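    <para>
     As a small sketch of an <literal>INOUT</literal> parameter, the single
     parameter below is both part of the calling argument list and the
     function's result.  The function name <function>double_it</function>
     is hypothetical:

<screen>
CREATE FUNCTION double_it (INOUT x int)
AS 'SELECT $1 * 2'
LANGUAGE SQL;

SELECT double_it(21);
 double_it
-----------
        42
(1 row)
</screen>
    </para>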
 601    </sect2>
 602
 603    <sect2 id="xfunc-sql-variadic-functions">
 604     <title><acronym>SQL</> Functions with Variable Numbers of Arguments</title>
 605
 606     <indexterm>
 607      <primary>function</primary>
 608      <secondary>variadic</secondary>
 609     </indexterm>
 610
 611     <indexterm>
 612      <primary>variadic function</primary>
 613     </indexterm>
 614
 615     <para>
 616      <acronym>SQL</acronym> functions can be declared to accept
 617      variable numbers of arguments, so long as all the <quote>optional</>
 618      arguments are of the same data type.  The optional arguments will be
 619      passed to the function as an array.  The function is declared by
 620      marking the last parameter as <literal>VARIADIC</>; this parameter
 621      must be declared as being of an array type.  For example:
 622
 623 <screen>
 624 CREATE FUNCTION mleast(VARIADIC numeric[]) RETURNS numeric AS $$
 625     SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
 626 $$ LANGUAGE SQL;
 627
 628 SELECT mleast(10, -1, 5, 4.4);
 629  mleast
 630 --------
 631      -1
 632 (1 row)
 633 </screen>
 634
 635      Effectively, all the actual arguments at or beyond the
 636      <literal>VARIADIC</> position are gathered up into a one-dimensional
 637      array, as if you had written
 638
 639 <screen>
 640 SELECT mleast(ARRAY[10, -1, 5, 4.4]);    -- doesn't work
 641 </screen>
 642
 643      You can't actually write that, though &mdash; or at least, it will
 644      not match this function definition.  A parameter marked
 645      <literal>VARIADIC</> matches one or more occurrences of its element
 646      type, not of its own type.
 647     </para>
 648
 649     <para>
 650      Sometimes it is useful to be able to pass an already-constructed array
 651      to a variadic function; this is particularly handy when one variadic
 652      function wants to pass on its array parameter to another one.  You can
 653      do that by specifying <literal>VARIADIC</> in the call:
 654
 655 <screen>
 656 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
 657 </screen>
 658
 659      This prevents expansion of the function's variadic parameter into its
 660      element type, thereby allowing the array argument value to match
 661      normally.  <literal>VARIADIC</> can only be attached to the last
 662      actual argument of a function call.
 663     </para>
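    <para>
     The pass-through case can be sketched as follows, assuming the
     <function>mleast</function> function defined above.  The wrapper name
     <function>mleast_abs</function> is hypothetical:

<programlisting>
CREATE FUNCTION mleast_abs(VARIADIC numeric[]) RETURNS numeric AS $$
    -- $1 is already a numeric[] array, so it is handed on with
    -- VARIADIC instead of being expanded into separate arguments
    SELECT abs(mleast(VARIADIC $1));
$$ LANGUAGE SQL;

SELECT mleast_abs(10, -1, 5, 4.4);
</programlisting>

     With these arguments <function>mleast</function> yields -1, so the
     wrapper returns 1.
    </para>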
 664    </sect2>
 665
 666    <sect2 id="xfunc-sql-parameter-defaults">
 667     <title><acronym>SQL</> Functions with Default Values for Arguments</title>
 668
 669     <indexterm>
 670      <primary>function</primary>
 671      <secondary>default values for arguments</secondary>
 672     </indexterm>
 673
 674     <para>
 675      Functions can be declared with default values for some or all input
 676      arguments.  The default values are inserted whenever the function is
 677      called with fewer actual arguments than it has input parameters.
 678      Since arguments can only be omitted from the end of the actual
 679      argument list, all parameters after a parameter with a default value
 680      must have default values as well.
 681     </para>
 682
 683     <para>
 684      For example:
 685 <screen>
 686 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
 687 RETURNS int
 688 LANGUAGE SQL
 689 AS $$
 690     SELECT $1 + $2 + $3;
 691 $$;
 692
 693 SELECT foo(10, 20, 30);
 694  foo
 695 -----
 696   60
 697 (1 row)
 698
 699 SELECT foo(10, 20);
 700  foo
 701 -----
 702   33
 703 (1 row)
 704
 705 SELECT foo(10);
 706  foo
 707 -----
 708   15
 709 (1 row)
 710
 711 SELECT foo();  -- fails since there is no default for the first argument
 712 ERROR:  function foo() does not exist
 713 </screen>
 714      The <literal>=</literal> sign can also be used in place of the
 715      key word <literal>DEFAULT</literal>.
 716     </para>
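    <para>
     As a sketch of the <literal>=</literal> spelling, the following
     declares the same defaults as the example above.  The name
     <function>foo_eq</function> is hypothetical and is used only to avoid
     clashing with <function>foo</function>:

<programlisting>
CREATE FUNCTION foo_eq(a int, b int = 2, c int = 3)
RETURNS int
LANGUAGE SQL
AS $$
    SELECT $1 + $2 + $3;
$$;

SELECT foo_eq(10);   -- b and c take their defaults: 10 + 2 + 3
</programlisting>
    </para>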
 717    </sect2>
 718
 719    <sect2 id="xfunc-sql-table-functions">
 720     <title><acronym>SQL</acronym> Functions as Table Sources</title>
 721
 722     <para>
 723      All SQL functions can be used in the <literal>FROM</> clause of a query,
 724      but this is particularly useful for functions returning composite types.
 725      If the function is defined to return a base type, the table function
 726      produces a one-column table.  If the function is defined to return
 727      a composite type, the table function produces a column for each attribute
 728      of the composite type.
 729     </para>
 730
 731     <para>
 732      Here is an example:
 733
 734 <screen>
 735 CREATE TABLE foo (fooid int, foosubid int, fooname text);
 736 INSERT INTO foo VALUES (1, 1, 'Joe');
 737 INSERT INTO foo VALUES (1, 2, 'Ed');
 738 INSERT INTO foo VALUES (2, 1, 'Mary');
 739
 740 CREATE FUNCTION getfoo(int) RETURNS foo AS $$
 741     SELECT * FROM foo WHERE fooid = $1;
 742 $$ LANGUAGE SQL;
 743
 744 SELECT *, upper(fooname) FROM getfoo(1) AS t1;
 745
 746  fooid | foosubid | fooname | upper
 747 -------+----------+---------+-------
 748      1 |        1 | Joe     | JOE
 749 (1 row)
 750 </screen>
 751
 752      As the example shows, we can work with the columns of the function's
 753      result just the same as if they were columns of a regular table.
 754     </para>
 755
 756     <para>
 757      Note that we only got one row out of the function.  This is because
 758      we did not use <literal>SETOF</>.  That is described in the next section.
 759     </para>
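    <para>
     The base-type case can be sketched with the <function>one</function>
     function from <xref linkend="xfunc-sql-base-functions">, assuming it
     is still defined.  The resulting one-column table takes its column
     name from the function:

<screen>
SELECT * FROM one();
 one
-----
   1
(1 row)
</screen>
    </para>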
 760    </sect2>
 761
 762    <sect2 id="xfunc-sql-functions-returning-set">
 763     <title><acronym>SQL</acronym> Functions Returning Sets</title>
 764
 765     <indexterm>
 766      <primary>function</primary>
 767      <secondary>with SETOF</secondary>
 768     </indexterm>
 769
 770     <para>
 771      When an SQL function is declared as returning <literal>SETOF
 772      <replaceable>sometype</></literal>, the function's final
 773      query is executed to completion, and each row it
 774      outputs is returned as an element of the result set.
 775     </para>
 776
 777     <para>
 778      This feature is normally used when calling the function in the <literal>FROM</>
 779      clause.  In this case each row returned by the function becomes
 780      a row of the table seen by the query.  For example, assume that
 781      table <literal>foo</> has the same contents as above, and we say:
 782
 783 <programlisting>
 784 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
 785     SELECT * FROM foo WHERE fooid = $1;
 786 $$ LANGUAGE SQL;
 787
 788 SELECT * FROM getfoo(1) AS t1;
 789 </programlisting>
 790
 791      Then we would get:
 792 <screen>
 793  fooid | foosubid | fooname
 794 -------+----------+---------
 795      1 |        1 | Joe
 796      1 |        2 | Ed
 797 (2 rows)
 798 </screen>
 799     </para>
 800
 801     <para>
 802      It is also possible to return multiple rows with the columns defined by
 803      output parameters, like this:
 804
 805 <programlisting>
 806 CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int) RETURNS SETOF record AS $$
 807     SELECT $1 + tab.y, $1 * tab.y FROM tab;
 808 $$ LANGUAGE SQL;
 809 </programlisting>
 810
 811      The key point here is that you must write <literal>RETURNS SETOF record</>
 812      to indicate that the function returns multiple rows instead of just one.
 813      If there is only one output parameter, write that parameter's type
 814      instead of <type>record</>.
 815     </para>
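    <para>
     A sketch of the single-output-parameter case follows.  The function
     name <function>sum_with_tab</function> is hypothetical, and, as in the
     example above, a table <literal>tab</literal> with an integer column
     <literal>y</literal> is assumed:

<programlisting>
-- Only one OUT parameter, so the return type is SETOF int
-- rather than SETOF record
CREATE FUNCTION sum_with_tab (x int, OUT sum int) RETURNS SETOF int AS $$
    SELECT $1 + tab.y FROM tab;
$$ LANGUAGE SQL;
</programlisting>
    </para>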
 816
 817     <para>
 818      Currently, functions returning sets can also be called in the select list
 819      of a query.  For each row that the query
 820      generates by itself, the function returning set is invoked, and an output
 821      row is generated for each element of the function's result set. Note,
 822      however, that this capability is deprecated and might be removed in future
 823      releases. The following is an example function returning a set from the
 824      select list:
 825
 826 <screen>
 827 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
 828     SELECT name FROM nodes WHERE parent = $1
 829 $$ LANGUAGE SQL;
 830
 831 SELECT * FROM nodes;
 832    name    | parent
 833 -----------+--------
 834  Top       |
 835  Child1    | Top
 836  Child2    | Top
 837  Child3    | Top
 838  SubChild1 | Child1
 839  SubChild2 | Child1
 840 (6 rows)
 841
 842 SELECT listchildren('Top');
 843  listchildren
 844 --------------
 845  Child1
 846  Child2
 847  Child3
 848 (3 rows)
 849
 850 SELECT name, listchildren(name) FROM nodes;
 851   name  | listchildren
 852 --------+--------------
 853  Top    | Child1
 854  Top    | Child2
 855  Top    | Child3
 856  Child1 | SubChild1
 857  Child1 | SubChild2
 858 (5 rows)
 859 </screen>
 860
 861      In the last <command>SELECT</command>,
 862      notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
 863      This happens because <function>listchildren</function> returns an empty set
 864      for those arguments, so no result rows are generated.
 865     </para>
 866
 867     <note>
 868      <para>
 869       If a function's last command is <command>INSERT</>, <command>UPDATE</>,
 870       or <command>DELETE</> with <literal>RETURNING</>, that command will
 871       always be executed to completion, even if the function is not declared
 872       with <literal>SETOF</> or the calling query does not fetch all the
 873       result rows.  Any extra rows produced by the <literal>RETURNING</>
 874       clause are silently dropped, but the commanded table modifications
 875       still happen (and are all completed before returning from the function).
 876      </para>
 877     </note>
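    <para>
     The behavior described in the note can be sketched with the
     <literal>bank</literal> table from the earlier examples.  The function
     name <function>clear_overdrafts</function> is hypothetical, and the
     sketch assumes that <literal>accountno</literal> is an
     <type>integer</type> column:

<programlisting>
CREATE FUNCTION clear_overdrafts() RETURNS integer AS $$
    -- Not declared SETOF: only the first RETURNING row is handed back,
    -- but every matching row is still updated
    UPDATE bank
        SET balance = 0
        WHERE balance &lt; 0
    RETURNING accountno;
$$ LANGUAGE SQL;
</programlisting>

     If several accounts are overdrawn, all of them are reset, although
     the caller sees only one account number.
    </para>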
 878    </sect2>
 879
 880    <sect2 id="xfunc-sql-functions-returning-table">
 881     <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</></title>
 882
 883     <indexterm>
 884      <primary>function</primary>
 885      <secondary>RETURNS TABLE</secondary>
 886     </indexterm>
 887
 888     <para>
 889      There is another way to declare a function as returning a set,
 890      which is to use the syntax
 891      <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.
 892      This is equivalent to using one or more <literal>OUT</> parameters plus
 893      marking the function as returning <literal>SETOF record</> (or
 894      <literal>SETOF</> a single output parameter's type, as appropriate).
 895      This notation is specified in recent versions of the SQL standard, and
 896      thus may be more portable than using <literal>SETOF</>.
 897     </para>
 898
 899     <para>
 900      For example, the preceding sum-and-product example could also be
 901      done this way:
 902
 903 <programlisting>
 904 CREATE FUNCTION sum_n_product_with_tab (x int) RETURNS TABLE(sum int, product int) AS $$
 905     SELECT $1 + tab.y, $1 * tab.y FROM tab;
 906 $$ LANGUAGE SQL;
 907 </programlisting>
 908
 909      It is not allowed to use explicit <literal>OUT</> or <literal>INOUT</>
 910      parameters with the <literal>RETURNS TABLE</> notation &mdash; you must
 911      put all the output columns in the <literal>TABLE</> list.
 912     </para>
 913    </sect2>
 914
 915    <sect2>
 916     <title>Polymorphic <acronym>SQL</acronym> Functions</title>
 917
 918     <para>
 919      <acronym>SQL</acronym> functions can be declared to accept and
 920      return the polymorphic types <type>anyelement</type>,
 921      <type>anyarray</type>, <type>anynonarray</type>, and
 922      <type>anyenum</type>.  See <xref
 923      linkend="extend-types-polymorphic"> for a more detailed
     explanation of polymorphic functions. Here is a polymorphic
     function <function>make_array</function> that builds an array
     from two elements of an arbitrary data type:
<screen>
CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
    SELECT ARRAY[$1, $2];
$$ LANGUAGE SQL;

SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
 intarray | textarray
----------+-----------
 {1,2}    | {a,b}
(1 row)
</screen>
 938     </para>
 939
 940     <para>
 941      Notice the use of the typecast <literal>'a'::text</literal>
 942      to specify that the argument is of type <type>text</type>. This is
 943      required if the argument is just a string literal, since otherwise
 944      it would be treated as type
 945      <type>unknown</type>, and array of <type>unknown</type> is not a valid
 946      type.
 947      Without the typecast, you will get errors like this:
<screen>
<computeroutput>
ERROR:  could not determine polymorphic type because input has type "unknown"
</computeroutput>
</screen>
 953     </para>
 954
 955     <para>
 956      It is permitted to have polymorphic arguments with a fixed
 957      return type, but the converse is not. For example:
<screen>
CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
    SELECT $1 &gt; $2;
$$ LANGUAGE SQL;

SELECT is_greater(1, 2);
 is_greater
------------
 f
(1 row)

CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
    SELECT 1;
$$ LANGUAGE SQL;
ERROR:  cannot determine result data type
DETAIL:  A function returning a polymorphic type must have at least one polymorphic argument.
</screen>
 975     </para>
 976
 977     <para>
 978      Polymorphism can be used with functions that have output arguments.
 979      For example:
<screen>
CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
AS 'select $1, array[$1,$1]' LANGUAGE SQL;

SELECT * FROM dup(22);
 f2 |   f3
----+---------
 22 | {22,22}
(1 row)
</screen>
 990     </para>
 991
 992     <para>
 993      Polymorphism can also be used with variadic functions.
 994      For example:
<screen>
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
    SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;

SELECT anyleast(10, -1, 5, 4);
 anyleast
----------
       -1
(1 row)

SELECT anyleast('abc'::text, 'def');
 anyleast
----------
 abc
(1 row)

CREATE FUNCTION concat(text, VARIADIC anyarray) RETURNS text AS $$
    SELECT array_to_string($2, $1);
$$ LANGUAGE SQL;

SELECT concat('|', 1, 4, 2);
 concat
--------
 1|4|2
(1 row)
</screen>
1022     </para>
1023    </sect2>
1024   </sect1>
1025
1026   <sect1 id="xfunc-overload">
1027    <title>Function Overloading</title>
1028
1029    <indexterm zone="xfunc-overload">
1030     <primary>overloading</primary>
1031     <secondary>functions</secondary>
1032    </indexterm>
1033
1034    <para>
1035     More than one function can be defined with the same SQL name, so long
1036     as the arguments they take are different.  In other words,
1037     function names can be <firstterm>overloaded</firstterm>.  When a
1038     query is executed, the server will determine which function to
1039     call from the data types and the number of the provided arguments.
1040     Overloading can also be used to simulate functions with a variable
1041     number of arguments, up to a finite maximum number.
1042    </para>
1043
1044    <para>
1045     When creating a family of overloaded functions, one should be
1046     careful not to create ambiguities.  For instance, given the
1047     functions:
<programlisting>
CREATE FUNCTION test(int, real) RETURNS ...
CREATE FUNCTION test(smallint, double precision) RETURNS ...
</programlisting>
1052     it is not immediately clear which function would be called with
1053     some trivial input like <literal>test(1, 1.5)</literal>.  The
1054     currently implemented resolution rules are described in
1055     <xref linkend="typeconv">, but it is unwise to design a system that subtly
1056     relies on this behavior.
1057    </para>
1058
1059    <para>
1060     A function that takes a single argument of a composite type should
1061     generally not have the same name as any attribute (field) of that type.
1062     Recall that <literal>attribute(table)</literal> is considered equivalent
1063     to <literal>table.attribute</literal>.  In the case that there is an
1064     ambiguity between a function on a composite type and an attribute of
1065     the composite type, the attribute will always be used.  It is possible
1066     to override that choice by schema-qualifying the function name
1067     (that is, <literal>schema.func(table)</literal>) but it's better to
1068     avoid the problem by not choosing conflicting names.
1069    </para>
1070
1071    <para>
1072     Another possible conflict is between variadic and non-variadic functions.
1073     For instance, it is possible to create both <literal>foo(numeric)</> and
1074     <literal>foo(VARIADIC numeric[])</>.  In this case it is unclear which one
1075     should be matched to a call providing a single numeric argument, such as
1076     <literal>foo(10.1)</>.  The rule is that the function appearing
1077     earlier in the search path is used, or if the two functions are in the
1078     same schema, the non-variadic one is preferred.
1079    </para>
1080
1081    <para>
1082     When overloading C-language functions, there is an additional
1083     constraint: The C name of each function in the family of
1084     overloaded functions must be different from the C names of all
1085     other functions, either internal or dynamically loaded.  If this
1086     rule is violated, the behavior is not portable.  You might get a
1087     run-time linker error, or one of the functions will get called
1088     (usually the internal one).  The alternative form of the
1089     <literal>AS</> clause for the SQL <command>CREATE
1090     FUNCTION</command> command decouples the SQL function name from
1091     the function name in the C source code.  For instance:
<programlisting>
CREATE FUNCTION test(int) RETURNS int
    AS '<replaceable>filename</>', 'test_1arg'
    LANGUAGE C;
CREATE FUNCTION test(int, int) RETURNS int
    AS '<replaceable>filename</>', 'test_2arg'
    LANGUAGE C;
</programlisting>
1100     The names of the C functions here reflect one of many possible conventions.
1101    </para>
1102   </sect1>
1103
1104   <sect1 id="xfunc-volatility">
1105    <title>Function Volatility Categories</title>
1106
1107    <indexterm zone="xfunc-volatility">
1108     <primary>volatility</primary>
1109     <secondary>functions</secondary>
1110    </indexterm>
1111    <indexterm zone="xfunc-volatility">
1112     <primary>VOLATILE</primary>
1113    </indexterm>
1114    <indexterm zone="xfunc-volatility">
1115     <primary>STABLE</primary>
1116    </indexterm>
1117    <indexterm zone="xfunc-volatility">
1118     <primary>IMMUTABLE</primary>
1119    </indexterm>
1120
1121    <para>
1122     Every function has a <firstterm>volatility</> classification, with
1123     the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
1124     <literal>IMMUTABLE</>.  <literal>VOLATILE</> is the default if the
1125     <xref linkend="sql-createfunction" endterm="sql-createfunction-title">
1126     command does not specify a category.  The volatility category is a
1127     promise to the optimizer about the behavior of the function:
1128
1129    <itemizedlist>
1130     <listitem>
1131      <para>
1132       A <literal>VOLATILE</> function can do anything, including modifying
1133       the database.  It can return different results on successive calls with
1134       the same arguments.  The optimizer makes no assumptions about the
1135       behavior of such functions.  A query using a volatile function will
1136       re-evaluate the function at every row where its value is needed.
1137      </para>
1138     </listitem>
1139     <listitem>
1140      <para>
1141       A <literal>STABLE</> function cannot modify the database and is
1142       guaranteed to return the same results given the same arguments
1143       for all rows within a single statement. This category allows the
1144       optimizer to optimize multiple calls of the function to a single
1145       call. In particular, it is safe to use an expression containing
1146       such a function in an index scan condition. (Since an index scan
1147       will evaluate the comparison value only once, not once at each
1148       row, it is not valid to use a <literal>VOLATILE</> function in an
1149       index scan condition.)
1150      </para>
1151     </listitem>
1152     <listitem>
1153      <para>
1154       An <literal>IMMUTABLE</> function cannot modify the database and is
1155       guaranteed to return the same results given the same arguments forever.
1156       This category allows the optimizer to pre-evaluate the function when
1157       a query calls it with constant arguments.  For example, a query like
1158       <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
1159       <literal>SELECT ... WHERE x = 4</>, because the function underlying
1160       the integer addition operator is marked <literal>IMMUTABLE</>.
1161      </para>
1162     </listitem>
1163    </itemizedlist>
1164    </para>
1165
1166    <para>
1167     For best optimization results, you should label your functions with the
1168     strictest volatility category that is valid for them.
1169    </para>
1170
1171    <para>
1172     Any function with side-effects <emphasis>must</> be labeled
1173     <literal>VOLATILE</>, so that calls to it cannot be optimized away.
1174     Even a function with no side-effects needs to be labeled
1175     <literal>VOLATILE</> if its value can change within a single query;
    some examples are <literal>random()</>, <literal>currval()</>, and
    <literal>timeofday()</>.
1178    </para>
1179
1180    <para>
1181     Another important example is that the <function>current_timestamp</>
    family of functions qualifies as <literal>STABLE</>, since their values do
1183     not change within a transaction.
1184    </para>
1185
1186    <para>
1187     There is relatively little difference between <literal>STABLE</> and
1188     <literal>IMMUTABLE</> categories when considering simple interactive
1189     queries that are planned and immediately executed: it doesn't matter
1190     a lot whether a function is executed once during planning or once during
1191     query execution startup.  But there is a big difference if the plan is
1192     saved and reused later.  Labeling a function <literal>IMMUTABLE</> when
1193     it really isn't might allow it to be prematurely folded to a constant during
1194     planning, resulting in a stale value being re-used during subsequent uses
1195     of the plan.  This is a hazard when using prepared statements or when
1196     using function languages that cache plans (such as
1197     <application>PL/pgSQL</>).
1198    </para>
1199
1200    <para>
1201     For functions written in SQL or in any of the standard procedural
1202     languages, there is a second important property determined by the
1203     volatility category, namely the visibility of any data changes that have
1204     been made by the SQL command that is calling the function.  A
1205     <literal>VOLATILE</> function will see such changes, a <literal>STABLE</>
1206     or <literal>IMMUTABLE</> function will not.  This behavior is implemented
1207     using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1208     <literal>STABLE</> and <literal>IMMUTABLE</> functions use a snapshot
1209     established as of the start of the calling query, whereas
1210     <literal>VOLATILE</> functions obtain a fresh snapshot at the start of
1211     each query they execute.
1212    </para>
1213
1214    <note>
1215     <para>
1216      Functions written in C can manage snapshots however they want, but it's
1217      usually a good idea to make C functions work this way too.
1218     </para>
1219    </note>
1220
1221    <para>
1222     Because of this snapshotting behavior,
1223     a function containing only <command>SELECT</> commands can safely be
1224     marked <literal>STABLE</>, even if it selects from tables that might be
1225     undergoing modifications by concurrent queries.
1226     <productname>PostgreSQL</productname> will execute all commands of a
1227     <literal>STABLE</> function using the snapshot established for the
1228     calling query, and so it will see a fixed view of the database throughout
1229     that query.
1230    </para>
1231
1232    <para>
1233     The same snapshotting behavior is used for <command>SELECT</> commands
1234     within <literal>IMMUTABLE</> functions.  It is generally unwise to select
1235     from database tables within an <literal>IMMUTABLE</> function at all,
1236     since the immutability will be broken if the table contents ever change.
    However, <productname>PostgreSQL</productname> does not prevent you
    from doing so.
1239    </para>
1240
1241    <para>
1242     A common error is to label a function <literal>IMMUTABLE</> when its
1243     results depend on a configuration parameter.  For example, a function
1244     that manipulates timestamps might well have results that depend on the
1245     <xref linkend="guc-timezone"> setting.  For safety, such functions should
1246     be labeled <literal>STABLE</> instead.
1247    </para>
1248
1249    <note>
1250     <para>
1251      Before <productname>PostgreSQL</productname> release 8.0, the requirement
1252      that <literal>STABLE</> and <literal>IMMUTABLE</> functions cannot modify
1253      the database was not enforced by the system.  Releases 8.0 and later enforce it
1254      by requiring SQL functions and procedural language functions of these
1255      categories to contain no SQL commands other than <command>SELECT</>.
1256      (This is not a completely bulletproof test, since such functions could
1257      still call <literal>VOLATILE</> functions that modify the database.
1258      If you do that, you will find that the <literal>STABLE</> or
1259      <literal>IMMUTABLE</> function does not notice the database changes
1260      applied by the called function, since they are hidden from its snapshot.)
1261     </para>
1262    </note>
1263   </sect1>
1264
1265   <sect1 id="xfunc-pl">
1266    <title>Procedural Language Functions</title>
1267
1268    <para>
1269     <productname>PostgreSQL</productname> allows user-defined functions
1270     to be written in other languages besides SQL and C.  These other
1271     languages are generically called <firstterm>procedural
1272     languages</firstterm> (<acronym>PL</>s).
1273     Procedural languages aren't built into the
1274     <productname>PostgreSQL</productname> server; they are offered
1275     by loadable modules.
1276     See <xref linkend="xplang"> and following chapters for more
1277     information.
1278    </para>
1279   </sect1>
1280
1281   <sect1 id="xfunc-internal">
1282    <title>Internal Functions</title>
1283
1284    <indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
1285
1286    <para>
1287     Internal functions are functions written in C that have been statically
1288     linked into the <productname>PostgreSQL</productname> server.
1289     The <quote>body</quote> of the function definition
1290     specifies the C-language name of the function, which need not be the
1291     same as the name being declared for SQL use.
1292     (For reasons of backwards compatibility, an empty body
1293     is accepted as meaning that the C-language function name is the
1294     same as the SQL name.)
1295    </para>
1296
1297    <para>
1298     Normally, all internal functions present in the
1299     server are declared during the initialization of the database cluster (<command>initdb</command>),
1300     but a user could use <command>CREATE FUNCTION</command>
1301     to create additional alias names for an internal function.
1302     Internal functions are declared in <command>CREATE FUNCTION</command>
1303     with language name <literal>internal</literal>.  For instance, to
1304     create an alias for the <function>sqrt</function> function:
<programlisting>
CREATE FUNCTION square_root(double precision) RETURNS double precision
    AS 'dsqrt'
    LANGUAGE internal
    STRICT;
</programlisting>
1311     (Most internal functions expect to be declared <quote>strict</quote>.)
1312    </para>
1313
1314    <note>
1315     <para>
1316      Not all <quote>predefined</quote> functions are
1317      <quote>internal</quote> in the above sense.  Some predefined
1318      functions are written in SQL.
1319     </para>
1320    </note>
1321   </sect1>
1322
1323   <sect1 id="xfunc-c">
1324    <title>C-Language Functions</title>
1325
1326    <indexterm zone="xfunc-c">
1327     <primary>function</primary>
1328     <secondary>user-defined</secondary>
1329     <tertiary>in C</tertiary>
1330    </indexterm>
1331
1332    <para>
1333     User-defined functions can be written in C (or a language that can
1334     be made compatible with C, such as C++).  Such functions are
1335     compiled into dynamically loadable objects (also called shared
1336     libraries) and are loaded by the server on demand.  The dynamic
1337     loading feature is what distinguishes <quote>C language</> functions
1338     from <quote>internal</> functions &mdash; the actual coding conventions
1339     are essentially the same for both.  (Hence, the standard internal
1340     function library is a rich source of coding examples for user-defined
1341     C functions.)
1342    </para>
1343
1344    <para>
1345     Two different calling conventions are currently used for C functions.
1346     The newer <quote>version 1</quote> calling convention is indicated by writing
1347     a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
1348     as illustrated below.  Lack of such a macro indicates an old-style
1349     (<quote>version 0</quote>) function.  The language name specified in <command>CREATE FUNCTION</command>
1350     is <literal>C</literal> in either case.  Old-style functions are now deprecated
1351     because of portability problems and lack of functionality, but they
1352     are still supported for compatibility reasons.
1353    </para>
1354
1355   <sect2 id="xfunc-c-dynload">
1356    <title>Dynamic Loading</title>
1357
1358    <indexterm zone="xfunc-c-dynload">
1359     <primary>dynamic loading</primary>
1360    </indexterm>
1361
1362    <para>
1363     The first time a user-defined function in a particular
1364     loadable object file is called in a session,
1365     the dynamic loader loads that object file into memory so that the
1366     function can be called.  The <command>CREATE FUNCTION</command>
1367     for a user-defined C function must therefore specify two pieces of
1368     information for the function: the name of the loadable
1369     object file, and the C name (link symbol) of the specific function to call
1370     within that object file.  If the C name is not explicitly specified then
1371     it is assumed to be the same as the SQL function name.
1372    </para>
1373
1374    <para>
1375     The following algorithm is used to locate the shared object file
1376     based on the name given in the <command>CREATE FUNCTION</command>
1377     command:
1378
1379     <orderedlist>
1380      <listitem>
1381       <para>
1382        If the name is an absolute path, the given file is loaded.
1383       </para>
1384      </listitem>
1385
1386      <listitem>
1387       <para>
1388        If the name starts with the string <literal>$libdir</literal>,
1389        that part is replaced by the <productname>PostgreSQL</> package
1390         library directory
1391        name, which is determined at build time.<indexterm><primary>$libdir</></>
1392       </para>
1393      </listitem>
1394
1395      <listitem>
1396       <para>
1397        If the name does not contain a directory part, the file is
1398        searched for in the path specified by the configuration variable
1399        <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1400       </para>
1401      </listitem>
1402
1403      <listitem>
1404       <para>
1405        Otherwise (the file was not found in the path, or it contains a
1406        non-absolute directory part), the dynamic loader will try to
1407        take the name as given, which will most likely fail.  (It is
1408        unreliable to depend on the current working directory.)
1409       </para>
1410      </listitem>
1411     </orderedlist>
1412
1413     If this sequence does not work, the platform-specific shared
1414     library file name extension (often <filename>.so</filename>) is
1415     appended to the given name and this sequence is tried again.  If
1416     that fails as well, the load will fail.
1417    </para>
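
   <para>
    As a rough illustration, the lookup sequence above can be modeled in
    standalone C.  This is a sketch under stated assumptions, not the
    server's implementation: the names <function>resolve_once</function>,
    <function>resolve_library</function>, and
    <literal>SKETCH_DLSUFFIX</literal> are hypothetical, the search path is
    assumed to be a colon-separated list of already-expanded directories,
    and file existence is checked through a caller-supplied callback so
    that the example runs without touching the file system:

<programlisting><![CDATA[
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_LIBDIR_MACRO "$libdir"
#define SKETCH_DLSUFFIX ".so"       /* assumption: platform extension */

/* Caller-supplied check; returns true if the file could be loaded. */
typedef bool (*exists_fn) (const char *path);

/* Apply steps 1-4 once; the last candidate tried is left in out. */
static bool
resolve_once(const char *name, const char *pkglibdir,
             const char *search_path, exists_fn exists,
             char *out, size_t outsz)
{
    size_t      maclen = strlen(SKETCH_LIBDIR_MACRO);

    /* Step 1: an absolute path is used as given. */
    if (name[0] == '/')
    {
        snprintf(out, outsz, "%s", name);
        return exists(out);
    }

    /* Step 2: a leading $libdir becomes the package library directory. */
    if (strncmp(name, SKETCH_LIBDIR_MACRO, maclen) == 0)
    {
        snprintf(out, outsz, "%s%s", pkglibdir, name + maclen);
        return exists(out);
    }

    /* Step 3: no directory part, so try each directory in the path. */
    if (strchr(name, '/') == NULL)
    {
        const char *p = search_path;

        while (*p != '\0')
        {
            size_t      len = strcspn(p, ":");

            if (len > 0)
            {
                snprintf(out, outsz, "%.*s/%s", (int) len, p, name);
                if (exists(out))
                    return true;
            }
            p += len;
            if (*p == ':')
                p++;
        }
    }

    /* Step 4: fall back to the name exactly as given. */
    snprintf(out, outsz, "%s", name);
    return exists(out);
}

/* Run the sequence, then retry once with the extension appended. */
static bool
resolve_library(const char *name, const char *pkglibdir,
                const char *search_path, exists_fn exists,
                char *out, size_t outsz)
{
    char        with_suffix[512];

    if (resolve_once(name, pkglibdir, search_path, exists, out, outsz))
        return true;
    snprintf(with_suffix, sizeof(with_suffix), "%s%s", name, SKETCH_DLSUFFIX);
    return resolve_once(with_suffix, pkglibdir, search_path, exists,
                        out, outsz);
}

/* Pretend file system holding two libraries (made-up paths). */
static bool
fake_exists(const char *path)
{
    return strcmp(path, "/usr/lib/postgresql/funcs.so") == 0 ||
        strcmp(path, "/opt/extra/other.so") == 0;
}

int
main(void)
{
    char        out[512];
    const char *libdir = "/usr/lib/postgresql";
    const char *path = "/usr/lib/postgresql:/opt/extra";
    bool        found;

    found = resolve_library("$libdir/funcs", libdir, path, fake_exists,
                            out, sizeof(out));
    assert(found);
    printf("$libdir/funcs -> %s\n", out);

    found = resolve_library("other", libdir, path, fake_exists,
                            out, sizeof(out));
    assert(found);
    printf("other -> %s\n", out);

    found = resolve_library("missing", libdir, path, fake_exists,
                            out, sizeof(out));
    printf("missing -> %s\n", found ? out : "load fails");
    return 0;
}
]]>
</programlisting>

    Passing the existence check as a callback keeps the sketch easy to
    exercise; a real loader would ask the operating system instead.
   </para>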
1418
1419    <para>
1420     It is recommended to locate shared libraries either relative to
1421     <literal>$libdir</literal> or through the dynamic library path.
1422     This simplifies version upgrades if the new installation is at a
1423     different location.  The actual directory that
1424     <literal>$libdir</literal> stands for can be found out with the
1425     command <literal>pg_config --pkglibdir</literal>.
1426    </para>
1427
1428    <para>
1429     The user ID the <productname>PostgreSQL</productname> server runs
1430     as must be able to traverse the path to the file you intend to
1431     load.  Making the file or a higher-level directory not readable
1432     and/or not executable by the <systemitem>postgres</systemitem>
1433     user is a common mistake.
1434    </para>
1435
1436    <para>
1437     In any case, the file name that is given in the
1438     <command>CREATE FUNCTION</command> command is recorded literally
1439     in the system catalogs, so if the file needs to be loaded again
1440     the same procedure is applied.
1441    </para>
1442
1443    <note>
1444     <para>
1445      <productname>PostgreSQL</productname> will not compile a C function
1446      automatically.  The object file must be compiled before it is referenced
1447      in a <command>CREATE
1448      FUNCTION</> command.  See <xref linkend="dfunc"> for additional
1449      information.
1450     </para>
1451    </note>
1452
1453    <indexterm zone="xfunc-c-dynload">
1454     <primary>magic block</primary>
1455    </indexterm>
1456
1457    <para>
1458     To ensure that a dynamically loaded object file is not loaded into an
1459     incompatible server, <productname>PostgreSQL</productname> checks that the
1460     file contains a <quote>magic block</> with the appropriate contents.
1461     This allows the server to detect obvious incompatibilities, such as code
1462     compiled for a different major version of
1463     <productname>PostgreSQL</productname>.  A magic block is required as of
1464     <productname>PostgreSQL</productname> 8.2.  To include a magic block,
1465     write this in one (and only one) of the module source files, after having
1466     included the header <filename>fmgr.h</>:
1467
<programlisting>
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
</programlisting>
1473
1474     The <literal>#ifdef</> test can be omitted if the code doesn't
1475     need to compile against pre-8.2 <productname>PostgreSQL</productname>
1476     releases.
1477    </para>
1478
1479    <para>
1480     After it is used for the first time, a dynamically loaded object
1481     file is retained in memory.  Future calls in the same session to
1482     the function(s) in that file will only incur the small overhead of
1483     a symbol table lookup.  If you need to force a reload of an object
1484     file, for example after recompiling it, use the <xref
1485     linkend="sql-load" endterm="sql-load-title"> command or begin a
1486     fresh session.
1487    </para>
1488
1489    <indexterm zone="xfunc-c-dynload">
1490     <primary>_PG_init</primary>
1491    </indexterm>
1492    <indexterm zone="xfunc-c-dynload">
1493     <primary>_PG_fini</primary>
1494    </indexterm>
1495    <indexterm zone="xfunc-c-dynload">
1496     <primary>library initialization function</primary>
1497    </indexterm>
1498    <indexterm zone="xfunc-c-dynload">
1499     <primary>library finalization function</primary>
1500    </indexterm>
1501
1502    <para>
1503     Optionally, a dynamically loaded file can contain initialization and
1504     finalization functions.  If the file includes a function named
1505     <function>_PG_init</>, that function will be called immediately after
1506     loading the file.  The function receives no parameters and should
1507     return void.  If the file includes a function named
1508     <function>_PG_fini</>, that function will be called immediately before
1509     unloading the file.  Likewise, the function receives no parameters and
1510     should return void.  Note that <function>_PG_fini</> will only be called
1511     during an unload of the file, not during process termination.
1512     (Presently, an unload only happens in the context of re-loading
1513     the file due to an explicit <command>LOAD</> command.)
1514    </para>
1515
1516   </sect2>
1517
1518    <sect2 id="xfunc-c-basetype">
1519     <title>Base Types in C-Language Functions</title>
1520
1521     <indexterm zone="xfunc-c-basetype">
1522      <primary>data type</primary>
1523      <secondary>internal organization</secondary>
1524     </indexterm>
1525
1526     <para>
1527      To know how to write C-language functions, you need to know how
1528      <productname>PostgreSQL</productname> internally represents base
1529      data types and how they can be passed to and from functions.
1530      Internally, <productname>PostgreSQL</productname> regards a base
1531      type as a <quote>blob of memory</quote>.  The user-defined
1532      functions that you define over a type in turn define the way that
1533      <productname>PostgreSQL</productname> can operate on it.  That
1534      is, <productname>PostgreSQL</productname> will only store and
1535      retrieve the data from disk and use your user-defined functions
1536      to input, process, and output the data.
1537     </para>
1538
1539     <para>
1540      Base types can have one of three internal formats:
1541
1542      <itemizedlist>
1543       <listitem>
1544        <para>
1545         pass by value, fixed-length
1546        </para>
1547       </listitem>
1548       <listitem>
1549        <para>
1550         pass by reference, fixed-length
1551        </para>
1552       </listitem>
1553       <listitem>
1554        <para>
1555         pass by reference, variable-length
1556        </para>
1557       </listitem>
1558      </itemizedlist>
1559     </para>
1560
1561     <para>
     By-value types can only be 1, 2, or 4 bytes in length
     (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
     You should be careful to define your types such that they will be the
     same size (in bytes) on all architectures.  For example, the
     <literal>long</literal> type is dangerous because it is 4 bytes on some
     machines and 8 bytes on others, whereas the <type>int</type> type is 4 bytes
     on most Unix machines.  A reasonable implementation of the
     <type>int4</type> type on Unix machines might be:
1570
<programlisting>
/* 4-byte integer, passed by value */
typedef int int4;
</programlisting>
1575     </para>
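
    <para>
     One way to guard against the size differences mentioned above is to
     check the size at compile time.  The following standalone sketch
     assumes a C99 compiler providing <filename>stdint.h</filename> and a
     platform with 8-bit bytes; the type name <type>my_int4</type> is
     hypothetical, and this is not how the server headers define their
     integer types:

<programlisting><![CDATA[
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 4-byte integer built on a fixed-width C99 type. */
typedef int32_t my_int4;

/*
 * Compile-time check: the array size becomes -1, which is a compile
 * error, if my_int4 is not exactly 4 bytes on this platform.
 */
typedef char my_int4_must_be_4_bytes[(sizeof(my_int4) == 4) ? 1 : -1];

int
main(void)
{
    /* long varies between platforms; my_int4 does not. */
    printf("sizeof(long)    = %u\n", (unsigned) sizeof(long));
    printf("sizeof(my_int4) = %u\n", (unsigned) sizeof(my_int4));
    assert(sizeof(my_int4) * CHAR_BIT == 32);
    return 0;
}
]]>
</programlisting>

     With this check, a build on a platform where the assumption fails stops
     at compile time instead of producing a type of the wrong size.
    </para>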
1576
1577     <para>
     On the other hand, fixed-length types of any size can
     be passed by reference.  For example, here is a sample
     implementation of a <productname>PostgreSQL</productname> type:
1581
<programlisting>
/* 16-byte structure, passed by reference */
typedef struct
{
    double  x, y;
} Point;
</programlisting>
1589
     Only pointers to such types can be used when passing
     them in and out of <productname>PostgreSQL</productname> functions.
1592      To return a value of such a type, allocate the right amount of
1593      memory with <literal>palloc</literal>, fill in the allocated memory,
1594      and return a pointer to it.  (Also, if you just want to return the
1595      same value as one of your input arguments that's of the same data type,
1596      you can skip the extra <literal>palloc</literal> and just return the
1597      pointer to the input value.)
1598     </para>
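
    <para>
     The allocate, fill in, and return pattern described above might look
     like the following standalone sketch.  It assumes
     <function>malloc</function> as a stand-in for
     <function>palloc</function>, which is only available inside the
     server; <function>sketch_alloc</function> and
     <function>point_add</function> are hypothetical names:

<programlisting><![CDATA[
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* 16-byte structure, passed by reference */
typedef struct
{
    double      x, y;
} Point;

/* Assumption: stands in for palloc so the sketch builds on its own. */
static void *
sketch_alloc(size_t size)
{
    void       *p = malloc(size);

    if (p == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }
    return p;
}

/*
 * Return a newly allocated Point.  The inputs are const: a
 * pass-by-reference argument must never be modified in place.
 */
static Point *
point_add(const Point *a, const Point *b)
{
    Point      *result = (Point *) sketch_alloc(sizeof(Point));

    result->x = a->x + b->x;
    result->y = a->y + b->y;
    return result;
}

int
main(void)
{
    Point       a = {1.0, 2.0};
    Point       b = {0.5, 4.0};
    Point      *sum = point_add(&a, &b);

    assert(sum != &a && sum != &b);     /* result is a fresh allocation */
    printf("(%g, %g)\n", sum->x, sum->y);
    free(sum);
    return 0;
}
]]>
</programlisting>

     Declaring the inputs <literal>const</literal> lets the compiler reject
     accidental writes to an argument.
    </para>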
1599
1600     <para>
     Finally, all variable-length types must also be passed
     by reference.  All variable-length types must begin
     with a length field of exactly 4 bytes, and all data to
     be stored within that type must be located in the memory
     immediately following that length field.  The
     length field contains the total length of the structure,
     that is, it includes the size of the length field
     itself.
1609     </para>
1610
1611     <warning>
1612      <para>
1613       <emphasis>Never</> modify the contents of a pass-by-reference input
1614       value.  If you do so you are likely to corrupt on-disk data, since
1615       the pointer you are given might point directly into a disk buffer.
1616       The sole exception to this rule is explained in
1617       <xref linkend="xaggr">.
1618      </para>
1619     </warning>
1620
1621     <para>
1622      As an example, we can define the type <type>text</type> as
1623      follows:
1624
<programlisting>
typedef struct {
    int4 length;
    char data[1];
} text;
</programlisting>
1631
     Obviously, the data field declared here is not long enough to hold
1633      all possible strings.  Since it's impossible to declare a variable-size
1634      structure in <acronym>C</acronym>, we rely on the knowledge that the
1635      <acronym>C</acronym> compiler won't range-check array subscripts.  We
1636      just allocate the necessary amount of space and then access the array as
1637      if it were declared the right length.  (This is a common trick, which
1638      you can read about in many textbooks about C.)
1639     </para>
1640
1641     <para>
     When manipulating
     variable-length types, we must be careful to allocate
     the correct amount of memory and set the length field correctly.
     For example, if we wanted to store 40 bytes in a <structname>text</>
     structure, we might use a code fragment like this:
1647
<programlisting><![CDATA[
#include "postgres.h"
...
char buffer[40]; /* our source data */
...
text *destination = (text *) palloc(VARHDRSZ + 40);
destination->length = VARHDRSZ + 40;
memcpy(destination->data, buffer, 40);
...
]]>
</programlisting>
1659
1660      <literal>VARHDRSZ</> is the same as <literal>sizeof(int4)</>, but
1661      it's considered good style to use the macro <literal>VARHDRSZ</>
1662      to refer to the size of the overhead for a variable-length type.
1663     </para>
1664
1665     <para>
1666      <xref linkend="xfunc-c-type-table"> specifies which C type
1667      corresponds to which SQL type when writing a C-language function
1668      that uses a built-in type of <productname>PostgreSQL</>.
1669      The <quote>Defined In</quote> column gives the header file that
1670      needs to be included to get the type definition.  (The actual
1671      definition might be in a different file that is included by the
1672      listed file.  It is recommended that users stick to the defined
1673      interface.)  Note that you should always include
1674      <filename>postgres.h</filename> first in any source file, because
1675      it declares a number of things that you will need anyway.
1676     </para>
1677
1678      <table tocentry="1" id="xfunc-c-type-table">
1679       <title>Equivalent C Types for Built-In SQL Types</title>
1680       <tgroup cols="3">
1681        <thead>
1682         <row>
1683          <entry>
1684           SQL Type
1685          </entry>
1686          <entry>
1687           C Type
1688          </entry>
1689          <entry>
1690           Defined In
1691          </entry>
1692         </row>
1693        </thead>
1694        <tbody>
1695         <row>
1696          <entry><type>abstime</type></entry>
1697          <entry><type>AbsoluteTime</type></entry>
1698          <entry><filename>utils/nabstime.h</filename></entry>
1699         </row>
1700         <row>
1701          <entry><type>boolean</type></entry>
1702          <entry><type>bool</type></entry>
1703          <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
1704         </row>
1705         <row>
1706          <entry><type>box</type></entry>
1707          <entry><type>BOX*</type></entry>
1708          <entry><filename>utils/geo_decls.h</filename></entry>
1709         </row>
1710         <row>
1711          <entry><type>bytea</type></entry>
1712          <entry><type>bytea*</type></entry>
1713          <entry><filename>postgres.h</filename></entry>
1714         </row>
1715         <row>
1716          <entry><type>"char"</type></entry>
1717          <entry><type>char</type></entry>
1718          <entry>(compiler built-in)</entry>
1719         </row>
1720         <row>
1721          <entry><type>character</type></entry>
1722          <entry><type>BpChar*</type></entry>
1723          <entry><filename>postgres.h</filename></entry>
1724         </row>
1725         <row>
1726          <entry><type>cid</type></entry>
1727          <entry><type>CommandId</type></entry>
1728          <entry><filename>postgres.h</filename></entry>
1729         </row>
1730         <row>
1731          <entry><type>date</type></entry>
1732          <entry><type>DateADT</type></entry>
1733          <entry><filename>utils/date.h</filename></entry>
1734         </row>
1735         <row>
1736          <entry><type>smallint</type> (<type>int2</type>)</entry>
1737          <entry><type>int2</type> or <type>int16</type></entry>
1738          <entry><filename>postgres.h</filename></entry>
1739         </row>
1740         <row>
1741          <entry><type>int2vector</type></entry>
1742          <entry><type>int2vector*</type></entry>
1743          <entry><filename>postgres.h</filename></entry>
1744         </row>
1745         <row>
1746          <entry><type>integer</type> (<type>int4</type>)</entry>
1747          <entry><type>int4</type> or <type>int32</type></entry>
1748          <entry><filename>postgres.h</filename></entry>
1749         </row>
1750         <row>
1751          <entry><type>real</type> (<type>float4</type>)</entry>
1752          <entry><type>float4*</type></entry>
         <entry><filename>postgres.h</filename></entry>
1754         </row>
1755         <row>
1756          <entry><type>double precision</type> (<type>float8</type>)</entry>
1757          <entry><type>float8*</type></entry>
1758          <entry><filename>postgres.h</filename></entry>
1759         </row>
1760         <row>
1761          <entry><type>interval</type></entry>
1762          <entry><type>Interval*</type></entry>
1763          <entry><filename>utils/timestamp.h</filename></entry>
1764         </row>
1765         <row>
1766          <entry><type>lseg</type></entry>
1767          <entry><type>LSEG*</type></entry>
1768          <entry><filename>utils/geo_decls.h</filename></entry>
1769         </row>
1770         <row>
1771          <entry><type>name</type></entry>
1772          <entry><type>Name</type></entry>
1773          <entry><filename>postgres.h</filename></entry>
1774         </row>
1775         <row>
1776          <entry><type>oid</type></entry>
1777          <entry><type>Oid</type></entry>
1778          <entry><filename>postgres.h</filename></entry>
1779         </row>
1780         <row>
1781          <entry><type>oidvector</type></entry>
1782          <entry><type>oidvector*</type></entry>
1783          <entry><filename>postgres.h</filename></entry>
1784         </row>
1785         <row>
1786          <entry><type>path</type></entry>
1787          <entry><type>PATH*</type></entry>
1788          <entry><filename>utils/geo_decls.h</filename></entry>
1789         </row>
1790         <row>
1791          <entry><type>point</type></entry>
         <entry><type>Point*</type></entry>
1793          <entry><filename>utils/geo_decls.h</filename></entry>
1794         </row>
1795         <row>
1796          <entry><type>regproc</type></entry>
1797          <entry><type>regproc</type></entry>
1798          <entry><filename>postgres.h</filename></entry>
1799         </row>
1800         <row>
1801          <entry><type>reltime</type></entry>
1802          <entry><type>RelativeTime</type></entry>
1803          <entry><filename>utils/nabstime.h</filename></entry>
1804         </row>
1805         <row>
1806          <entry><type>text</type></entry>
1807          <entry><type>text*</type></entry>
1808          <entry><filename>postgres.h</filename></entry>
1809         </row>
1810         <row>
1811          <entry><type>tid</type></entry>
1812          <entry><type>ItemPointer</type></entry>
1813          <entry><filename>storage/itemptr.h</filename></entry>
1814         </row>
1815         <row>
1816          <entry><type>time</type></entry>
1817          <entry><type>TimeADT</type></entry>
1818          <entry><filename>utils/date.h</filename></entry>
1819         </row>
1820         <row>
1821          <entry><type>time with time zone</type></entry>
1822          <entry><type>TimeTzADT</type></entry>
1823          <entry><filename>utils/date.h</filename></entry>
1824         </row>
1825         <row>
1826          <entry><type>timestamp</type></entry>
1827          <entry><type>Timestamp*</type></entry>
1828          <entry><filename>utils/timestamp.h</filename></entry>
1829         </row>
1830         <row>
1831          <entry><type>tinterval</type></entry>
1832          <entry><type>TimeInterval</type></entry>
1833          <entry><filename>utils/nabstime.h</filename></entry>
1834         </row>
1835         <row>
1836          <entry><type>varchar</type></entry>
1837          <entry><type>VarChar*</type></entry>
1838          <entry><filename>postgres.h</filename></entry>
1839         </row>
1840         <row>
1841          <entry><type>xid</type></entry>
1842          <entry><type>TransactionId</type></entry>
1843          <entry><filename>postgres.h</filename></entry>
1844         </row>
1845        </tbody>
1846       </tgroup>
1847      </table>
1848
1849     <para>
1850      Now that we've gone over all of the possible structures
1851      for base types, we can show some examples of real functions.
1852     </para>
1853    </sect2>
1854
1855    <sect2>
1856     <title>Version 0 Calling Conventions</title>
1857
1858     <para>
     We present the <quote>old style</quote> calling convention first.
     Although this approach is now deprecated, it's easier to get a
     handle on initially.  In the version-0 method, the arguments and
     result of the C function are simply declared in normal C style,
     taking care to use the C representation of each SQL data type as
     shown above.
1865     </para>
1866
1867     <para>
1868      Here are some examples:
1869
1870 <programlisting><![CDATA[
#include "postgres.h"
#include <string.h>
#include "utils/geo_decls.h"    /* for Point */
1873
1874 /* by value */
1875
1876 int
1877 add_one(int arg)
1878 {
1879     return arg + 1;
1880 }
1881
1882 /* by reference, fixed length */
1883
1884 float8 *
1885 add_one_float8(float8 *arg)
1886 {
1887     float8    *result = (float8 *) palloc(sizeof(float8));
1888
1889     *result = *arg + 1.0;
1890
1891     return result;
1892 }
1893
1894 Point *
1895 makepoint(Point *pointx, Point *pointy)
1896 {
1897     Point     *new_point = (Point *) palloc(sizeof(Point));
1898
1899     new_point->x = pointx->x;
1900     new_point->y = pointy->y;
1901
1902     return new_point;
1903 }
1904
1905 /* by reference, variable length */
1906
1907 text *
1908 copytext(text *t)
1909 {
1910     /*
1911      * VARSIZE is the total size of the struct in bytes.
1912      */
1913     text *new_t = (text *) palloc(VARSIZE(t));
1914     SET_VARSIZE(new_t, VARSIZE(t));
1915     /*
1916      * VARDATA is a pointer to the data region of the struct.
1917      */
1918     memcpy((void *) VARDATA(new_t), /* destination */
1919            (void *) VARDATA(t),     /* source */
1920            VARSIZE(t) - VARHDRSZ);  /* how many bytes */
1921     return new_t;
1922 }
1923
1924 text *
1925 concat_text(text *arg1, text *arg2)
1926 {
1927     int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
1928     text *new_text = (text *) palloc(new_text_size);
1929
1930     SET_VARSIZE(new_text, new_text_size);
1931     memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
1932     memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
1933            VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
1934     return new_text;
1935 }
1936 ]]>
1937 </programlisting>
1938     </para>
1939
1940     <para>
1941      Supposing that the above code has been prepared in file
1942      <filename>funcs.c</filename> and compiled into a shared object,
1943      we could define the functions to <productname>PostgreSQL</productname>
1944      with commands like this:
1945
1946 <programlisting>
1947 CREATE FUNCTION add_one(integer) RETURNS integer
1948      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
1949      LANGUAGE C STRICT;
1950
1951 -- note overloading of SQL function name "add_one"
1952 CREATE FUNCTION add_one(double precision) RETURNS double precision
1953      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
1954      LANGUAGE C STRICT;
1955
1956 CREATE FUNCTION makepoint(point, point) RETURNS point
1957      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
1958      LANGUAGE C STRICT;
1959
1960 CREATE FUNCTION copytext(text) RETURNS text
1961      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
1962      LANGUAGE C STRICT;
1963
1964 CREATE FUNCTION concat_text(text, text) RETURNS text
1965      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
1966      LANGUAGE C STRICT;
1967 </programlisting>
1968     </para>
1969
1970     <para>
1971      Here, <replaceable>DIRECTORY</replaceable> stands for the
1972      directory of the shared library file (for instance the
1973      <productname>PostgreSQL</productname> tutorial directory, which
1974      contains the code for the examples used in this section).
1975      (Better style would be to use just <literal>'funcs'</> in the
1976      <literal>AS</> clause, after having added
1977      <replaceable>DIRECTORY</replaceable> to the search path.  In any
1978      case, we can omit the system-specific extension for a shared
1979      library, commonly <literal>.so</literal> or
1980      <literal>.sl</literal>.)
1981     </para>
1982
1983     <para>
1984      Notice that we have specified the functions as <quote>strict</quote>,
1985      meaning that
1986      the system should automatically assume a null result if any input
1987      value is null.  By doing this, we avoid having to check for null inputs
1988      in the function code.  Without this, we'd have to check for null values
1989      explicitly, by checking for a null pointer for each
1990      pass-by-reference argument.  (For pass-by-value arguments, we don't
1991      even have a way to check!)
1992     </para>
1993
1994     <para>
1995      Although this calling convention is simple to use,
1996      it is not very portable; on some architectures there are problems
1997      with passing data types that are smaller than <type>int</type> this way.  Also, there is
1998      no simple way to return a null result, nor to cope with null arguments
1999      in any way other than making the function strict.  The version-1
2000      convention, presented next, overcomes these objections.
2001     </para>
2002    </sect2>
2003
2004    <sect2>
2005     <title>Version 1 Calling Conventions</title>
2006
2007     <para>
2008      The version-1 calling convention relies on macros to suppress most
2009      of the complexity of passing arguments and results.  The C declaration
2010      of a version-1 function is always:
2011 <programlisting>
2012 Datum funcname(PG_FUNCTION_ARGS)
2013 </programlisting>
2014      In addition, the macro call:
2015 <programlisting>
2016 PG_FUNCTION_INFO_V1(funcname);
2017 </programlisting>
     must appear in the same source file.  (Conventionally, it's
2019      written just before the function itself.)  This macro call is not
2020      needed for <literal>internal</>-language functions, since
2021      <productname>PostgreSQL</> assumes that all internal functions
2022      use the version-1 convention.  It is, however, required for
2023      dynamically-loaded functions.
2024     </para>
2025
2026     <para>
2027      In a version-1 function, each actual argument is fetched using a
2028      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2029      macro that corresponds to the argument's data type, and the
2030      result is returned using a
2031      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2032      macro for the return type.
2033      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2034      takes as its argument the number of the function argument to
2035      fetch, where the count starts at 0.
2036      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2037      takes as its argument the actual value to return.
2038     </para>
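
    <para>
     To make the roles of the fetch and return macros concrete, here is
     a toy model that compiles without the server headers.  Everything
     named <literal>Toy</literal> or <literal>TOY_</literal> is
     hypothetical and far simpler than the real definitions in
     <filename>fmgr.h</filename>.  It only illustrates the idea that
     every function receives one call-information structure, fetches
     its arguments by zero-based position, and hands back a single
     generic value.

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-ins only; the real definitions are in postgres.h and fmgr.h. */
typedef uintptr_t ToyDatum;     /* wide enough for a value or a pointer */

typedef struct
{
    ToyDatum arg[4];            /* arguments, numbered from 0 */
} ToyCallInfo;

/* Every toy function has this signature, whatever its SQL-level arguments. */
#define TOY_FUNCTION_ARGS    ToyCallInfo *fcinfo

/* Pass-by-value: the 32-bit integer is stored in the datum itself. */
#define TOY_GETARG_INT32(n)  ((int32_t) (uint32_t) fcinfo->arg[n])
#define TOY_RETURN_INT32(x)  return (ToyDatum) (uint32_t) (x)

/* Pass-by-reference: the datum holds a pointer; the macros dereference it. */
#define TOY_GETARG_FLOAT8(n) (*(const double *) fcinfo->arg[n])
#define TOY_RETURN_FLOAT8(x) return toy_float8_get_datum(x)

/* Box a double in fresh memory (a server function would use palloc). */
static ToyDatum
toy_float8_get_datum(double x)
{
    double *p = malloc(sizeof(double));

    assert(p != NULL);
    *p = x;
    return (ToyDatum) p;
}

static ToyDatum
toy_add_one(TOY_FUNCTION_ARGS)
{
    int32_t arg = TOY_GETARG_INT32(0);

    TOY_RETURN_INT32(arg + 1);
}

static ToyDatum
toy_add_one_float8(TOY_FUNCTION_ARGS)
{
    double arg = TOY_GETARG_FLOAT8(0);

    TOY_RETURN_FLOAT8(arg + 1.0);
}
```
]]>
</programlisting>

     Both function bodies look alike even though one argument travels
     by value and the other by pointer; that difference is confined to
     the macros.
    </para>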
2039
2040     <para>
2041      Here we show the same functions as above, coded in version-1 style:
2042
2043 <programlisting><![CDATA[
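#include "postgres.h"
#include <string.h>
#include "fmgr.h"
#include "utils/geo_decls.h"    /* for Point */

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;                /* required once per shared library */
#endif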
2047
2048 /* by value */
2049
2050 PG_FUNCTION_INFO_V1(add_one);
2051
2052 Datum
2053 add_one(PG_FUNCTION_ARGS)
2054 {
2055     int32   arg = PG_GETARG_INT32(0);
2056
2057     PG_RETURN_INT32(arg + 1);
2058 }
2059
2060 /* by reference, fixed length */
2061
2062 PG_FUNCTION_INFO_V1(add_one_float8);
2063
2064 Datum
2065 add_one_float8(PG_FUNCTION_ARGS)
2066 {
2067     /* The macros for FLOAT8 hide its pass-by-reference nature. */
2068     float8   arg = PG_GETARG_FLOAT8(0);
2069
2070     PG_RETURN_FLOAT8(arg + 1.0);
2071 }
2072
2073 PG_FUNCTION_INFO_V1(makepoint);
2074
2075 Datum
2076 makepoint(PG_FUNCTION_ARGS)
2077 {
2078     /* Here, the pass-by-reference nature of Point is not hidden. */
2079     Point     *pointx = PG_GETARG_POINT_P(0);
2080     Point     *pointy = PG_GETARG_POINT_P(1);
2081     Point     *new_point = (Point *) palloc(sizeof(Point));
2082
2083     new_point->x = pointx->x;
2084     new_point->y = pointy->y;
2085
2086     PG_RETURN_POINT_P(new_point);
2087 }
2088
2089 /* by reference, variable length */
2090
2091 PG_FUNCTION_INFO_V1(copytext);
2092
2093 Datum
2094 copytext(PG_FUNCTION_ARGS)
2095 {
2096     text     *t = PG_GETARG_TEXT_P(0);
2097     /*
2098      * VARSIZE is the total size of the struct in bytes.
2099      */
2100     text     *new_t = (text *) palloc(VARSIZE(t));
2101     SET_VARSIZE(new_t, VARSIZE(t));
2102     /*
2103      * VARDATA is a pointer to the data region of the struct.
2104      */
2105     memcpy((void *) VARDATA(new_t), /* destination */
2106            (void *) VARDATA(t),     /* source */
2107            VARSIZE(t) - VARHDRSZ);  /* how many bytes */
2108     PG_RETURN_TEXT_P(new_t);
2109 }
2110
2111 PG_FUNCTION_INFO_V1(concat_text);
2112
2113 Datum
2114 concat_text(PG_FUNCTION_ARGS)
2115 {
2116     text  *arg1 = PG_GETARG_TEXT_P(0);
2117     text  *arg2 = PG_GETARG_TEXT_P(1);
2118     int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
2119     text *new_text = (text *) palloc(new_text_size);
2120
2121     SET_VARSIZE(new_text, new_text_size);
2122     memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
2123     memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
2124            VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
2125     PG_RETURN_TEXT_P(new_text);
2126 }
2127 ]]>
2128 </programlisting>
2129     </para>
2130
2131     <para>
2132      The <command>CREATE FUNCTION</command> commands are the same as
2133      for the version-0 equivalents.
2134     </para>
2135
2136     <para>
2137      At first glance, the version-1 coding conventions might appear to
2138      be just pointless obscurantism.  They do, however, offer a number
2139      of improvements, because the macros can hide unnecessary detail.
2140      An example is that in coding <function>add_one_float8</>, we no longer need to
2141      be aware that <type>float8</type> is a pass-by-reference type.  Another
2142      example is that the <literal>GETARG</> macros for variable-length types allow
2143      for more efficient fetching of <quote>toasted</quote> (compressed or
2144      out-of-line) values.
2145     </para>
2146
2147     <para>
2148      One big improvement in version-1 functions is better handling of null
2149      inputs and results.  The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
2150      allows a function to test whether each input is null.  (Of course, doing
2151      this is only necessary in functions not declared <quote>strict</>.)
2152      As with the
2153      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
2154      the input arguments are counted beginning at zero.  Note that one
2155      should refrain from executing
2156      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2157      one has verified that the argument isn't null.
2158      To return a null result, execute <function>PG_RETURN_NULL()</function>;
2159      this works in both strict and nonstrict functions.
2160     </para>
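
    <para>
     The order of operations described above (test for null first,
     fetch second) can be sketched with a toy model that compiles
     without the server headers.  All names beginning with
     <literal>Toy</literal> or <literal>TOY_</literal> are hypothetical
     simplifications, not the definitions from
     <filename>fmgr.h</filename>, and
     <function>toy_first_non_null</function> is an invented example of
     a function that would not be declared <quote>strict</quote>.

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins only; the real definitions are in postgres.h and fmgr.h. */
typedef uintptr_t ToyDatum;

typedef struct
{
    ToyDatum arg[4];        /* argument values, numbered from 0 */
    bool     argnull[4];    /* true if the corresponding argument is null */
    bool     isnull;        /* set by the function to report a null result */
} ToyCallInfo;

#define TOY_FUNCTION_ARGS   ToyCallInfo *fcinfo
#define TOY_ARGISNULL(n)    (fcinfo->argnull[n])
#define TOY_GETARG_INT32(n) ((int32_t) (uint32_t) fcinfo->arg[n])
#define TOY_RETURN_INT32(x) return (ToyDatum) (uint32_t) (x)
#define TOY_RETURN_NULL() \
    do { fcinfo->isnull = true; return (ToyDatum) 0; } while (0)

/*
 * Non-strict function: returns the first argument, or the second if
 * the first is null, or null if both are null.
 */
static ToyDatum
toy_first_non_null(TOY_FUNCTION_ARGS)
{
    /* Test before fetching; the datum of a null argument is meaningless. */
    if (!TOY_ARGISNULL(0))
        TOY_RETURN_INT32(TOY_GETARG_INT32(0));
    if (!TOY_ARGISNULL(1))
        TOY_RETURN_INT32(TOY_GETARG_INT32(1));
    TOY_RETURN_NULL();
}
```
]]>
</programlisting>

     In this model the caller clears <structfield>isnull</structfield>
     before each call, so a function that returns normally reports a
     non-null result without any extra work.
    </para>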
2161
2162     <para>
2163      Other options provided in the new-style interface are two
2164      variants of the
2165      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2166      macros. The first of these,
2167      <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2168      guarantees to return a copy of the specified argument that is
2169      safe for writing into. (The normal macros will sometimes return a
2170      pointer to a value that is physically stored in a table, which
2171      must not be written to. Using the
2172      <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2173      macros guarantees a writable result.)
     The second variant consists of the
     <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
     macros, which take three arguments. The first is the number of the
     function argument (as above). The second and third are the offset and
     length of the segment to be returned. Offsets are counted from
     zero, and a negative length requests that the remainder of the
     value be returned. These macros provide more efficient access to
     parts of large values in the case where they have storage type
     <quote>external</quote>. (The storage type of a column can be specified using
     <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
     COLUMN <replaceable>colname</replaceable> SET STORAGE
     <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
     <literal>plain</>, <literal>external</>, <literal>extended</literal>,
2187      or <literal>main</>.)
2188     </para>
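
    <para>
     The offset and length rules for slices can be modelled on a plain
     byte buffer.  The function below is a hypothetical sketch, not the
     server's implementation: it shows only the arithmetic (zero-based
     offset, negative length meaning <quote>the rest</quote>), it does
     not show how the server avoids fetching the unneeded part of an
     externally stored value, and its clamping of out-of-range requests
     is an assumption.

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy a slice of the value_len bytes at value into fresh memory.
 * offset counts from zero; a negative length means "through the end".
 * Requests that run past the end are clamped (an assumption here).
 * Stores the slice size in *out_len and returns the copy, which the
 * caller frees; returns NULL only if the allocation fails.
 */
static char *
slice_copy(const char *value, size_t value_len,
           size_t offset, int32_t length, size_t *out_len)
{
    size_t  avail;
    size_t  n;
    char   *result;

    assert(value != NULL && out_len != NULL);

    /* Bytes available from offset onward; none if offset is past the end. */
    avail = (offset < value_len) ? value_len - offset : 0;

    /* Negative length: take everything that remains. */
    n = (length < 0 || (size_t) length > avail) ? avail : (size_t) length;

    result = malloc(n + 1);
    if (result == NULL)
        return NULL;

    if (n > 0)
        memcpy(result, value + offset, n);
    result[n] = '\0';       /* terminator is a convenience for C callers */
    *out_len = n;
    return result;
}
```
]]>
</programlisting>
    </para>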
2189
2190     <para>
2191      Finally, the version-1 function call conventions make it possible
2192      to return set results (<xref linkend="xfunc-c-return-set">) and
2193      implement trigger functions (<xref linkend="triggers">) and
2194      procedural-language call handlers (<xref
2195      linkend="plhandler">).  Version-1 code is also more
2196      portable than version-0, because it does not break restrictions
2197      on function call protocol in the C standard.  For more details
2198      see <filename>src/backend/utils/fmgr/README</filename> in the
2199      source distribution.
2200     </para>
2201    </sect2>
2202
2203    <sect2>
2204     <title>Writing Code</title>
2205
2206     <para>
2207      Before we turn to the more advanced topics, we should discuss
2208      some coding rules for <productname>PostgreSQL</productname>
2209      C-language functions.  While it might be possible to load functions
2210      written in languages other than C into
2211      <productname>PostgreSQL</productname>, this is usually difficult
2212      (when it is possible at all) because other languages, such as
     C++, FORTRAN, or Pascal, often do not follow the same calling
2214      convention as C.  That is, other languages do not pass argument
2215      and return values between functions in the same way.  For this
2216      reason, we will assume that your C-language functions are
2217      actually written in C.
2218     </para>
2219
2220     <para>
2221      The basic rules for writing and building C functions are as follows:
2222
2223      <itemizedlist>
2224       <listitem>
2225        <para>
2226         Use <literal>pg_config
2227         --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
2228         to find out where the <productname>PostgreSQL</> server header
2229         files are installed on your system (or the system that your
2230         users will be running on).
2231        </para>
2232       </listitem>
2233
2234       <listitem>
2235        <para>
2236         Compiling and linking your code so that it can be dynamically
2237         loaded into <productname>PostgreSQL</productname> always
2238         requires special flags.  See <xref linkend="dfunc"> for a
2239         detailed explanation of how to do it for your particular
2240         operating system.
2241        </para>
2242       </listitem>
2243
2244       <listitem>
2245        <para>
2246         Remember to define a <quote>magic block</> for your shared library,
2247         as described in <xref linkend="xfunc-c-dynload">.
2248        </para>
2249       </listitem>
2250
2251       <listitem>
2252        <para>
2253         When allocating memory, use the
2254         <productname>PostgreSQL</productname> functions
2255         <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
2256         instead of the corresponding C library functions
2257         <function>malloc</function> and <function>free</function>.
2258         The memory allocated by <function>palloc</function> will be
2259         freed automatically at the end of each transaction, preventing
2260         memory leaks.
2261        </para>
2262       </listitem>
2263
2264       <listitem>
2265        <para>
2266         Always zero the bytes of your structures using
2267         <function>memset</function>.  Without this, it's difficult to
2268         support hash indexes or hash joins, as you must pick out only
2269         the significant bits of your data structure to compute a hash.
2270         Even if you initialize all fields of your structure, there might be
        alignment padding (holes in the structure) that contains
2272         garbage values.
2273        </para>
2274       </listitem>
2275
2276       <listitem>
2277        <para>
2278         Most of the internal <productname>PostgreSQL</productname>
2279         types are declared in <filename>postgres.h</filename>, while
2280         the function manager interfaces
2281         (<symbol>PG_FUNCTION_ARGS</symbol>, etc.)  are in
2282         <filename>fmgr.h</filename>, so you will need to include at
2283         least these two files.  For portability reasons it's best to
2284         include <filename>postgres.h</filename> <emphasis>first</>,
2285         before any other system or user header files.  Including
2286         <filename>postgres.h</filename> will also include
2287         <filename>elog.h</filename> and <filename>palloc.h</filename>
2288         for you.
2289        </para>
2290       </listitem>
2291
2292       <listitem>
2293        <para>
2294         Symbol names defined within object files must not conflict
2295         with each other or with symbols defined in the
2296         <productname>PostgreSQL</productname> server executable.  You
2297         will have to rename your functions or variables if you get
2298         error messages to this effect.
2299        </para>
2300       </listitem>
2301      </itemizedlist>
2302     </para>
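
    <para>
     The advice about zeroing structures can be seen in a small
     standalone sketch.  The type <type>demo_item</type> and both
     functions are hypothetical, and the hash is a generic FNV-1a over
     the raw bytes rather than anything
     <productname>PostgreSQL</productname> uses; the point is only that
     a byte-wise hash or comparison sees the padding too.

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-length type; most ABIs insert padding after "tag". */
typedef struct
{
    char    tag;
    int32_t value;
} demo_item;

/* Initialize an item, zeroing every byte first so padding is deterministic. */
static void
demo_item_init(demo_item *item, char tag, int32_t value)
{
    assert(item != NULL);
    memset(item, 0, sizeof(demo_item));
    item->tag = tag;
    item->value = value;
}

/* FNV-1a over the raw bytes of the struct, padding included. */
static uint32_t
demo_item_hash(const demo_item *item)
{
    const unsigned char *p = (const unsigned char *) item;
    uint32_t    h = 2166136261u;
    size_t      i;

    for (i = 0; i < sizeof(demo_item); i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}
```
]]>
</programlisting>

     Without the <function>memset</function> call, two items with equal
     fields could still differ in their padding bytes and therefore
     hash differently.
    </para>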
2303    </sect2>
2304
2305 &dfunc;
2306
2307    <sect2 id="xfunc-c-pgxs">
2308     <title>Extension Building Infrastructure</title>
2309
2310    <indexterm zone="xfunc-c-pgxs">
2311     <primary>pgxs</primary>
2312    </indexterm>
2313
2314    <para>
2315     If you are thinking about distributing your
2316     <productname>PostgreSQL</> extension modules, setting up a
2317     portable build system for them can be fairly difficult.  Therefore
2318     the <productname>PostgreSQL</> installation provides a build
2319     infrastructure for extensions, called <acronym>PGXS</acronym>, so
    that simple extension modules can be built easily against an
2321     already installed server.  Note that this infrastructure is not
2322     intended to be a universal build system framework that can be used
2323     to build all software interfacing to <productname>PostgreSQL</>;
2324     it simply automates common build rules for simple server extension
2325     modules.  For more complicated packages, you need to write your
2326     own build system.
2327    </para>
2328
2329    <para>
2330     To use the infrastructure for your extension, you must write a
2331     simple makefile.  In that makefile, you need to set some variables
2332     and finally include the global <acronym>PGXS</acronym> makefile.
2333     Here is an example that builds an extension module named
2334     <literal>isbn_issn</literal> consisting of a shared library, an
2335     SQL script, and a documentation text file:
2336 <programlisting>
2337 MODULES = isbn_issn
2338 DATA_built = isbn_issn.sql
2339 DOCS = README.isbn_issn
2340
2341 PG_CONFIG = pg_config
2342 PGXS := $(shell $(PG_CONFIG) --pgxs)
2343 include $(PGXS)
2344 </programlisting>
2345     The last three lines should always be the same.  Earlier in the
2346     file, you assign variables or add custom
2347     <application>make</application> rules.
2348    </para>
2349
2350    <para>
2351     The following variables can be set:
2352
2353     <variablelist>
2354      <varlistentry>
2355       <term><varname>MODULES</varname></term>
2356       <listitem>
2357        <para>
        list of shared objects to be built from source files with the same
        stem (do not include the suffix in this list)
2360        </para>
2361       </listitem>
2362      </varlistentry>
2363
2364      <varlistentry>
2365       <term><varname>DATA</varname></term>
2366       <listitem>
2367        <para>
        arbitrary files to install into <literal><replaceable>prefix</replaceable>/share/contrib</literal>
2369        </para>
2370       </listitem>
2371      </varlistentry>
2372
2373      <varlistentry>
2374       <term><varname>DATA_built</varname></term>
2375       <listitem>
2376        <para>
        arbitrary files to install into
2378         <literal><replaceable>prefix</replaceable>/share/contrib</literal>,
2379         which need to be built first
2380        </para>
2381       </listitem>
2382      </varlistentry>
2383
2384      <varlistentry>
2385       <term><varname>DOCS</varname></term>
2386       <listitem>
2387        <para>
        arbitrary files to install under
2389         <literal><replaceable>prefix</replaceable>/doc/contrib</literal>
2390        </para>
2391       </listitem>
2392      </varlistentry>
2393
2394      <varlistentry>
2395       <term><varname>SCRIPTS</varname></term>
2396       <listitem>
2397        <para>
2398         script files (not binaries) to install into
2399         <literal><replaceable>prefix</replaceable>/bin</literal>
2400        </para>
2401       </listitem>
2402      </varlistentry>
2403
2404      <varlistentry>
2405       <term><varname>SCRIPTS_built</varname></term>
2406       <listitem>
2407        <para>
2408         script files (not binaries) to install into
2409         <literal><replaceable>prefix</replaceable>/bin</literal>,
2410         which need to be built first
2411        </para>
2412       </listitem>
2413      </varlistentry>
2414
2415      <varlistentry>
2416       <term><varname>REGRESS</varname></term>
2417       <listitem>
2418        <para>
2419         list of regression test cases (without suffix), see below
2420        </para>
2421       </listitem>
2422      </varlistentry>
2423     </variablelist>
2424
2425     or at most one of these two:
2426
2427     <variablelist>
2428      <varlistentry>
2429       <term><varname>PROGRAM</varname></term>
2430       <listitem>
2431        <para>
        a binary program to build (list object files in <varname>OBJS</varname>)
2433        </para>
2434       </listitem>
2435      </varlistentry>
2436
2437      <varlistentry>
2438       <term><varname>MODULE_big</varname></term>
2439       <listitem>
2440        <para>
2441         a shared object to build (list object files in <varname>OBJS</varname>)
2442        </para>
2443       </listitem>
2444      </varlistentry>
2445     </variablelist>
2446
2447     The following can also be set:
2448
2449     <variablelist>
2450
2451      <varlistentry>
2452       <term><varname>EXTRA_CLEAN</varname></term>
2453       <listitem>
2454        <para>
2455         extra files to remove in <literal>make clean</literal>
2456        </para>
2457       </listitem>
2458      </varlistentry>
2459
2460      <varlistentry>
2461       <term><varname>PG_CPPFLAGS</varname></term>
2462       <listitem>
2463        <para>
2464         will be added to <varname>CPPFLAGS</varname>
2465        </para>
2466       </listitem>
2467      </varlistentry>
2468
2469      <varlistentry>
2470       <term><varname>PG_LIBS</varname></term>
2471       <listitem>
2472        <para>
2473         will be added to <varname>PROGRAM</varname> link line
2474        </para>
2475       </listitem>
2476      </varlistentry>
2477
2478      <varlistentry>
2479       <term><varname>SHLIB_LINK</varname></term>
2480       <listitem>
2481        <para>
2482         will be added to <varname>MODULE_big</varname> link line
2483        </para>
2484       </listitem>
2485      </varlistentry>
2486
2487      <varlistentry>
2488       <term><varname>PG_CONFIG</varname></term>
2489       <listitem>
2490        <para>
2491         path to <application>pg_config</> program for the
2492         <productname>PostgreSQL</productname> installation to build against
2493         (typically just <literal>pg_config</> to use the first one in your
2494         <varname>PATH</>)
2495        </para>
2496       </listitem>
2497      </varlistentry>
2498     </variablelist>
2499    </para>
2500
2501    <para>
    Save this makefile as <literal>Makefile</literal> in the directory
    that holds your extension.  Then run
    <literal>make</literal> to compile, and later <literal>make
    install</literal> to install your module.  By default, the extension is
2506     compiled and installed for the
2507     <productname>PostgreSQL</productname> installation that
2508     corresponds to the first <command>pg_config</command> program
2509     found in your path.  You can use a different installation by
2510     setting <varname>PG_CONFIG</varname> to point to its
2511     <command>pg_config</command> program, either within the makefile
2512     or on the <literal>make</literal> command line.
2513    </para>
2514
2515    <caution>
2516     <para>
2517      Changing <varname>PG_CONFIG</varname> only works when building
2518      against <productname>PostgreSQL</productname> 8.3 or later.
     With older releases, setting it to anything other than
     <literal>pg_config</> does not work; you must alter your <varname>PATH</>
     to select the installation to build against.
2522     </para>
2523    </caution>
2524
2525    <para>
2526     The scripts listed in the <varname>REGRESS</> variable are used for
2527     regression testing of your module, just like <literal>make
2528     installcheck</literal> is used for the main
2529     <productname>PostgreSQL</productname> server.  For this to work you need
2530     to have a subdirectory named <literal>sql/</literal> in your extension's
2531     directory, within which you put one file for each group of tests you want
2532     to run.  The files should have extension <literal>.sql</literal>, which
2533     should not be included in the <varname>REGRESS</varname> list in the
2534     makefile.  For each test there should be a file containing the expected
2535     result in a subdirectory named <literal>expected/</literal>, with extension
2536     <literal>.out</literal>.  The tests are run by executing <literal>make
2537     installcheck</literal>, and the resulting output will be compared to the
2538     expected files.  The differences will be written to the file
2539     <literal>regression.diffs</literal> in <command>diff -c</command> format.
2540     Note that trying to run a test which is missing the expected file will be
2541     reported as <quote>trouble</quote>, so make sure you have all expected
2542     files.
2543    </para>
2544
2545    <tip>
2546     <para>
     The easiest way to create the expected files is to create empty files,
     then carefully inspect the result files after a test run (to be found
     in the <literal>results/</literal> directory), and copy them to
     <literal>expected/</literal> if they match what you want from the test.
2551     </para>
2552
2553    </tip>
2554   </sect2>
2555
2556
2557    <sect2>
2558     <title>Composite-Type Arguments</title>
2559
2560     <para>
2561      Composite types do not have a fixed layout like C structures.
2562      Instances of a composite type can contain null fields.  In
2563      addition, composite types that are part of an inheritance
2564      hierarchy can have different fields than other members of the
2565      same inheritance hierarchy.  Therefore,
2566      <productname>PostgreSQL</productname> provides a function
2567      interface for accessing fields of composite types from C.
2568     </para>
2569
2570     <para>
2571      Suppose we want to write a function to answer the query:
2572
2573 <programlisting>
2574 SELECT name, c_overpaid(emp, 1500) AS overpaid
2575     FROM emp
2576     WHERE name = 'Bill' OR name = 'Sam';
2577 </programlisting>
2578
     Using the version-0 calling conventions, we can define
2580      <function>c_overpaid</> as:
2581
2582 <programlisting><![CDATA[
2583 #include "postgres.h"
2584 #include "executor/executor.h"  /* for GetAttributeByName() */
2585
2586 bool
2587 c_overpaid(HeapTupleHeader t, /* the current row of emp */
2588            int32 limit)
2589 {
2590     bool isnull;
2591     int32 salary;
2592
2593     salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
2594     if (isnull)
2595         return false;
2596     return salary > limit;
2597 }
2598 ]]>
2599 </programlisting>
2600
2601      In version-1 coding, the above would look like this:
2602
2603 <programlisting><![CDATA[
2604 #include "postgres.h"
2605 #include "executor/executor.h"  /* for GetAttributeByName() */
2606
2607 PG_FUNCTION_INFO_V1(c_overpaid);
2608
2609 Datum
2610 c_overpaid(PG_FUNCTION_ARGS)
2611 {
2612     HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
2613     int32            limit = PG_GETARG_INT32(1);
2614     bool isnull;
2615     Datum salary;
2616
2617     salary = GetAttributeByName(t, "salary", &isnull);
2618     if (isnull)
2619         PG_RETURN_BOOL(false);
2620     /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
2621
2622     PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
2623 }
2624 ]]>
2625 </programlisting>
2626     </para>
2627
2628     <para>
2629      <function>GetAttributeByName</function> is the
2630      <productname>PostgreSQL</productname> system function that
2631      returns attributes out of the specified row.  It has
2632      three arguments: the argument of type <type>HeapTupleHeader</type> passed
     into the function, the name of the desired attribute, and a
     return parameter that tells whether the attribute
     is null.  <function>GetAttributeByName</function> returns a <type>Datum</type>
2637      value that you can convert to the proper data type by using the
2638      appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2639      macro.  Note that the return value is meaningless if the null flag is
2640      set; always check the null flag before trying to do anything with the
2641      result.
2642     </para>
2643
2644     <para>
2645      There is also <function>GetAttributeByNum</function>, which selects
2646      the target attribute by column number instead of name.
2647     </para>
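
    <para>
     As a rough sketch of the by-number variant, the version-1 function
     above could be rewritten as follows.  The function name
     <function>c_overpaid_bynum</function> and the column position are
     assumptions made for illustration: the code assumes that
     <literal>salary</literal> is the second column of
     <literal>emp</literal>, and attribute numbers count from 1.

<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"
#include "executor/executor.h"  /* for GetAttributeByNum() */

PG_FUNCTION_INFO_V1(c_overpaid_bynum);

Datum
c_overpaid_bynum(PG_FUNCTION_ARGS)
{
    HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
    int32            limit = PG_GETARG_INT32(1);
    bool             isnull;
    Datum            salary;

    /* ASSUMPTION: salary is column number 2 of the emp row type */
    salary = GetAttributeByNum(t, 2, &isnull);

    /* the returned Datum is meaningless when the null flag is set */
    if (isnull)
        PG_RETURN_BOOL(false);

    PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
}
]]>
</programlisting>

     Looking up a column by number avoids a name search on every call,
     but the function silently reads the wrong column if the table
     definition changes, so the by-name form is usually the safer choice.
    </para>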
2648
2649     <para>
2650      The following command declares the function
2651      <function>c_overpaid</function> in SQL:
2652
2653 <programlisting>
2654 CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
2655     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
2656     LANGUAGE C STRICT;
2657 </programlisting>
2658
2659      Notice we have used <literal>STRICT</> so that we did not have to
2660      check whether the input arguments were NULL.
2661     </para>
2662    </sect2>
2663
2664    <sect2>
2665     <title>Returning Rows (Composite Types)</title>
2666
2667     <para>
2668      To return a row or composite-type value from a C-language
2669      function, you can use a special API that provides macros and
2670      functions to hide most of the complexity of building composite
2671      data types.  To use this API, the source file must include:
2672 <programlisting>
2673 #include "funcapi.h"
2674 </programlisting>
2675     </para>
2676
2677     <para>
2678      There are two ways you can build a composite data value (henceforth
2679      a <quote>tuple</>): you can build it from an array of Datum values,
2680      or from an array of C strings that can be passed to the input
2681      conversion functions of the tuple's column data types.  In either
2682      case, you first need to obtain or construct a <structname>TupleDesc</>
2683      descriptor for the tuple structure.  When working with Datums, you
2684      pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2685      and then call <function>heap_form_tuple</> for each row.  When working
2686      with C strings, you pass the <structname>TupleDesc</> to
2687      <function>TupleDescGetAttInMetadata</>, and then call
2688      <function>BuildTupleFromCStrings</> for each row.  In the case of a
2689      function returning a set of tuples, the setup steps can all be done
2690      once during the first call of the function.
2691     </para>
2692
2693     <para>
2694      Several helper functions are available for setting up the needed
2695      <structname>TupleDesc</>.  The recommended way to do this in most
2696      functions returning composite values is to call:
2697 <programlisting>
2698 TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
2699                                    Oid *resultTypeId,
2700                                    TupleDesc *resultTupleDesc)
2701 </programlisting>
2702      passing the same <literal>fcinfo</> struct passed to the calling function
2703      itself.  (This of course requires that you use the version-1
2704      calling conventions.)  <varname>resultTypeId</> can be specified
2705      as <literal>NULL</> or as the address of a local variable to receive the
2706      function's result type OID.  <varname>resultTupleDesc</> should be the
2707      address of a local <structname>TupleDesc</> variable.  Check that the
2708      result is <literal>TYPEFUNC_COMPOSITE</>; if so,
2709      <varname>resultTupleDesc</> has been filled with the needed
2710      <structname>TupleDesc</>.  (If it is not, you can report an error along
2711      the lines of <quote>function returning record called in context that
2712      cannot accept type record</quote>.)
2713     </para>
2714
2715     <tip>
2716      <para>
2717       <function>get_call_result_type</> can resolve the actual type of a
2718       polymorphic function result; so it is useful in functions that return
2719       scalar polymorphic results, not only functions that return composites.
2720       The <varname>resultTypeId</> output is primarily useful for functions
2721       returning polymorphic scalars.
2722      </para>
2723     </tip>
2724
2725     <note>
2726      <para>
2727       <function>get_call_result_type</> has a sibling
2728       <function>get_expr_result_type</>, which can be used to resolve the
2729       expected output type for a function call represented by an expression
2730       tree.  This can be used when trying to determine the result type from
      function's OID is available.  However, these functions are not able
2732       <function>get_func_result_type</>, which can be used when only the
2733       function's OID is available.  However these functions are not able
2734       to deal with functions declared to return <structname>record</>, and
2735       <function>get_func_result_type</> cannot resolve polymorphic types,
2736       so you should preferentially use <function>get_call_result_type</>.
2737      </para>
2738     </note>
2739
2740     <para>
2741      Older, now-deprecated functions for obtaining
2742      <structname>TupleDesc</>s are:
2743 <programlisting>
2744 TupleDesc RelationNameGetTupleDesc(const char *relname)
2745 </programlisting>
2746      to get a <structname>TupleDesc</> for the row type of a named relation,
2747      and:
2748 <programlisting>
2749 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2750 </programlisting>
2751      to get a <structname>TupleDesc</> based on a type OID. This can
2752      be used to get a <structname>TupleDesc</> for a base or
2753      composite type.  It will not work for a function that returns
2754      <structname>record</>, however, and it cannot resolve polymorphic
2755      types.
2756     </para>
2757
2758     <para>
2759      Once you have a <structname>TupleDesc</>, call:
2760 <programlisting>
2761 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2762 </programlisting>
2763      if you plan to work with Datums, or:
2764 <programlisting>
2765 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2766 </programlisting>
     if you plan to work with C strings.  If you are writing a
     set-returning function, you can save the results of these functions in the
2769      <structname>FuncCallContext</> structure &mdash; use the
2770      <structfield>tuple_desc</> or <structfield>attinmeta</> field
2771      respectively.
2772     </para>
2773
2774     <para>
2775      When working with Datums, use:
2776 <programlisting>
2777 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2778 </programlisting>
2779      to build a <structname>HeapTuple</> given user data in Datum form.
2780     </para>
2781
2782     <para>
2783      When working with C strings, use:
2784 <programlisting>
2785 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2786 </programlisting>
2787      to build a <structname>HeapTuple</> given user data
2788      in C string form.  <literal>values</literal> is an array of C strings,
2789      one for each attribute of the return row. Each C string should be in
2790      the form expected by the input function of the attribute data
2791      type. In order to return a null value for one of the attributes,
2792      the corresponding pointer in the <parameter>values</> array
2793      should be set to <symbol>NULL</>.  This function will need to
2794      be called again for each row you return.
2795     </para>
2796
2797     <para>
2798      Once you have built a tuple to return from your function, it
2799      must be converted into a <type>Datum</>. Use:
2800 <programlisting>
2801 HeapTupleGetDatum(HeapTuple tuple)
2802 </programlisting>
2803      to convert a <structname>HeapTuple</> into a valid Datum.  This
2804      <type>Datum</> can be returned directly if you intend to return
2805      just a single row, or it can be used as the current return value
2806      in a set-returning function.
2807     </para>
2808
2809     <para>
2810      An example appears in the next section.
2811     </para>
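
    <para>
     The example in the next section builds its tuples from C strings.
     For comparison, here is a minimal sketch of the Datum-based route for
     a function that returns a single row.  The function name
     <function>sum_and_product</function> is hypothetical; the sketch
     assumes it is declared in SQL as <literal>STRICT</literal>, with two
     <type>integer</type> inputs and two <type>integer</type>
     <literal>OUT</literal> parameters, and it does not guard against
     integer overflow.

<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"
/*
 * Depending on the server version, heap_form_tuple() may need an
 * additional header from the access/ directory; check your installation.
 */

PG_FUNCTION_INFO_V1(sum_and_product);

Datum
sum_and_product(PG_FUNCTION_ARGS)
{
    int32       a = PG_GETARG_INT32(0);
    int32       b = PG_GETARG_INT32(1);
    TupleDesc   tupdesc;
    Datum       values[2];
    bool        isnull[2] = {false, false};
    HeapTuple   tuple;

    /* obtain the descriptor for our declared result row type */
    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("function returning record called in context "
                        "that cannot accept type record")));

    /* required before forming tuples from Datums */
    tupdesc = BlessTupleDesc(tupdesc);

    /* one Datum per output column, in column order */
    values[0] = Int32GetDatum(a + b);
    values[1] = Int32GetDatum(a * b);

    tuple = heap_form_tuple(tupdesc, values, isnull);

    PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
}
]]>
</programlisting>

     Working with Datums skips the text conversion that
     <function>BuildTupleFromCStrings</> performs through each column's
     input function, which is usually preferable when the values are
     already available in their internal form.
    </para>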
2812
2813    </sect2>
2814
2815    <sect2 id="xfunc-c-return-set">
2816     <title>Returning Sets</title>
2817
2818     <para>
2819      There is also a special API that provides support for returning
2820      sets (multiple rows) from a C-language function.  A set-returning
2821      function must follow the version-1 calling conventions.  Also,
2822      source files must include <filename>funcapi.h</filename>, as
2823      above.
2824     </para>
2825
2826     <para>
2827      A set-returning function (<acronym>SRF</>) is called
2828      once for each item it returns.  The <acronym>SRF</> must
2829      therefore save enough state to remember what it was doing and
2830      return the next item on each call.
2831      The structure <structname>FuncCallContext</> is provided to help
2832      control this process.  Within a function, <literal>fcinfo-&gt;flinfo-&gt;fn_extra</>
2833      is used to hold a pointer to <structname>FuncCallContext</>
2834      across calls.
2835 <programlisting>
2836 typedef struct
2837 {
2838     /*
2839      * Number of times we've been called before
2840      *
2841      * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2842      * incremented for you every time SRF_RETURN_NEXT() is called.
2843      */
2844     uint32 call_cntr;
2845
2846     /*
2847      * OPTIONAL maximum number of calls
2848      *
2849      * max_calls is here for convenience only and setting it is optional.
2850      * If not set, you must provide alternative means to know when the
2851      * function is done.
2852      */
2853     uint32 max_calls;
2854
2855     /*
2856      * OPTIONAL pointer to result slot
2857      *
2858      * This is obsolete and only present for backwards compatibility, viz,
2859      * user-defined SRFs that use the deprecated TupleDescGetSlot().
2860      */
2861     TupleTableSlot *slot;
2862
2863     /*
2864      * OPTIONAL pointer to miscellaneous user-provided context information
2865      *
2866      * user_fctx is for use as a pointer to your own data to retain
2867      * arbitrary context information between calls of your function.
2868      */
2869     void *user_fctx;
2870
2871     /*
2872      * OPTIONAL pointer to struct containing attribute type input metadata
2873      *
2874      * attinmeta is for use when returning tuples (i.e., composite data types)
2875      * and is not used when returning base data types. It is only needed
2876      * if you intend to use BuildTupleFromCStrings() to create the return
2877      * tuple.
2878      */
2879     AttInMetadata *attinmeta;
2880
2881     /*
2882      * memory context used for structures that must live for multiple calls
2883      *
2884      * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2885      * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2886      * context for any memory that is to be reused across multiple calls
2887      * of the SRF.
2888      */
2889     MemoryContext multi_call_memory_ctx;
2890
2891     /*
2892      * OPTIONAL pointer to struct containing tuple description
2893      *
2894      * tuple_desc is for use when returning tuples (i.e., composite data types)
2895      * and is only needed if you are going to build the tuples with
2896      * heap_form_tuple() rather than with BuildTupleFromCStrings().  Note that
2897      * the TupleDesc pointer stored here should usually have been run through
2898      * BlessTupleDesc() first.
2899      */
2900     TupleDesc tuple_desc;
2901
2902 } FuncCallContext;
2903 </programlisting>
2904     </para>
2905
2906     <para>
2907      An <acronym>SRF</> uses several functions and macros that
2908      automatically manipulate the <structname>FuncCallContext</>
2909      structure (and expect to find it via <literal>fn_extra</>).  Use:
2910 <programlisting>
2911 SRF_IS_FIRSTCALL()
2912 </programlisting>
2913      to determine if your function is being called for the first or a
2914      subsequent time. On the first call (only) use:
2915 <programlisting>
2916 SRF_FIRSTCALL_INIT()
2917 </programlisting>
2918      to initialize the <structname>FuncCallContext</>. On every function call,
2919      including the first, use:
2920 <programlisting>
2921 SRF_PERCALL_SETUP()
2922 </programlisting>
2923      to properly set up for using the <structname>FuncCallContext</>
2924      and clearing any previously returned data left over from the
2925      previous pass.
2926     </para>
2927
2928     <para>
2929      If your function has data to return, use:
2930 <programlisting>
2931 SRF_RETURN_NEXT(funcctx, result)
2932 </programlisting>
2933      to return it to the caller.  (<literal>result</> must be of type
2934      <type>Datum</>, either a single value or a tuple prepared as
2935      described above.)  Finally, when your function is finished
2936      returning data, use:
2937 <programlisting>
2938 SRF_RETURN_DONE(funcctx)
2939 </programlisting>
2940      to clean up and end the <acronym>SRF</>.
2941     </para>
2942
2943     <para>
2944      The memory context that is current when the <acronym>SRF</> is called is
2945      a transient context that will be cleared between calls.  This means
2946      that you do not need to call <function>pfree</> on everything
2947      you allocated using <function>palloc</>; it will go away anyway.  However, if you want to allocate
2948      any data structures to live across calls, you need to put them somewhere
2949      else.  The memory context referenced by
2950      <structfield>multi_call_memory_ctx</> is a suitable location for any
2951      data that needs to survive until the <acronym>SRF</> is finished running.  In most
2952      cases, this means that you should switch into
2953      <structfield>multi_call_memory_ctx</> while doing the first-call setup.
2954     </para>
2955
2956     <para>
2957      A complete pseudo-code example looks like the following:
2958 <programlisting>
2959 Datum
2960 my_set_returning_function(PG_FUNCTION_ARGS)
2961 {
2962     FuncCallContext  *funcctx;
2963     Datum             result;
2964     MemoryContext     oldcontext;
2965     <replaceable>further declarations as needed</replaceable>
2966
2967     if (SRF_IS_FIRSTCALL())
2968     {
2969         funcctx = SRF_FIRSTCALL_INIT();
2970         oldcontext = MemoryContextSwitchTo(funcctx-&gt;multi_call_memory_ctx);
2971         /* One-time setup code appears here: */
2972         <replaceable>user code</replaceable>
2973         <replaceable>if returning composite</replaceable>
2974             <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
2975         <replaceable>endif returning composite</replaceable>
2976         <replaceable>user code</replaceable>
2977         MemoryContextSwitchTo(oldcontext);
2978     }
2979
2980     /* Each-time setup code appears here: */
2981     <replaceable>user code</replaceable>
2982     funcctx = SRF_PERCALL_SETUP();
2983     <replaceable>user code</replaceable>
2984
2985     /* this is just one way we might test whether we are done: */
2986     if (funcctx-&gt;call_cntr &lt; funcctx-&gt;max_calls)
2987     {
2988         /* Here we want to return another item: */
2989         <replaceable>user code</replaceable>
2990         <replaceable>obtain result Datum</replaceable>
2991         SRF_RETURN_NEXT(funcctx, result);
2992     }
2993     else
2994     {
2995         /* Here we are done returning items and just need to clean up: */
2996         <replaceable>user code</replaceable>
2997         SRF_RETURN_DONE(funcctx);
2998     }
2999 }
3000 </programlisting>
3001     </para>
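
    <para>
     The pseudo-code above can be filled in for the simplest case, an
     <acronym>SRF</> returning a base type.  The following sketch returns
     the integers from 1 up to its argument.  The function name
     <function>count_up_to</function> is hypothetical, and the code assumes
     the function is declared <literal>STRICT</literal> with one
     <type>integer</type> argument and a result type of
     <literal>SETOF integer</literal>.

<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"

PG_FUNCTION_INFO_V1(count_up_to);

Datum
count_up_to(PG_FUNCTION_ARGS)
{
    FuncCallContext *funcctx;

    if (SRF_IS_FIRSTCALL())
    {
        int32       n = PG_GETARG_INT32(0);

        funcctx = SRF_FIRSTCALL_INIT();

        /*
         * All the state we need fits in max_calls, so nothing has to be
         * allocated in multi_call_memory_ctx.  A non-positive argument
         * yields an empty set.
         */
        funcctx->max_calls = (n > 0) ? (uint32) n : 0;
    }

    funcctx = SRF_PERCALL_SETUP();

    if (funcctx->call_cntr < funcctx->max_calls)
    {
        /* call_cntr is the number of previous calls, so it starts at 0 */
        Datum       result = Int32GetDatum((int32) funcctx->call_cntr + 1);

        SRF_RETURN_NEXT(funcctx, result);
    }
    else
    {
        SRF_RETURN_DONE(funcctx);
    }
}
]]>
</programlisting>

     Because <structfield>call_cntr</> is incremented by
     <literal>SRF_RETURN_NEXT()</>, the function needs no counter of
     its own.
    </para>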
3002
3003     <para>
3004      A complete example of a simple <acronym>SRF</> returning a composite type
3005      looks like:
3006 <programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"

PG_FUNCTION_INFO_V1(retcomposite);
3008
3009 Datum
3010 retcomposite(PG_FUNCTION_ARGS)
3011 {
3012     FuncCallContext     *funcctx;
3013     int                  call_cntr;
3014     int                  max_calls;
3015     TupleDesc            tupdesc;
3016     AttInMetadata       *attinmeta;
3017
    /* stuff done only on the first call of the function */
    if (SRF_IS_FIRSTCALL())
    {
3021         MemoryContext   oldcontext;
3022
3023         /* create a function context for cross-call persistence */
3024         funcctx = SRF_FIRSTCALL_INIT();
3025
3026         /* switch to memory context appropriate for multiple function calls */
3027         oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
3028
3029         /* total number of tuples to be returned */
3030         funcctx->max_calls = PG_GETARG_UINT32(0);
3031
3032         /* Build a tuple descriptor for our result type */
3033         if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
3034             ereport(ERROR,
3035                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
3036                      errmsg("function returning record called in context "
3037                             "that cannot accept type record")));
3038
3039         /*
3040          * generate attribute metadata needed later to produce tuples from raw
3041          * C strings
3042          */
3043         attinmeta = TupleDescGetAttInMetadata(tupdesc);
3044         funcctx->attinmeta = attinmeta;
3045
3046         MemoryContextSwitchTo(oldcontext);
3047     }
3048
3049     /* stuff done on every call of the function */
3050     funcctx = SRF_PERCALL_SETUP();
3051
3052     call_cntr = funcctx->call_cntr;
3053     max_calls = funcctx->max_calls;
3054     attinmeta = funcctx->attinmeta;
3055
3056     if (call_cntr < max_calls)    /* do when there is more left to send */
3057     {
3058         char       **values;
3059         HeapTuple    tuple;
3060         Datum        result;
3061
3062         /*
3063          * Prepare a values array for building the returned tuple.
3064          * This should be an array of C strings which will
3065          * be processed later by the type input functions.
3066          */
3067         values = (char **) palloc(3 * sizeof(char *));
3068         values[0] = (char *) palloc(16 * sizeof(char));
3069         values[1] = (char *) palloc(16 * sizeof(char));
3070         values[2] = (char *) palloc(16 * sizeof(char));
3071
3072         snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
3073         snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
3074         snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
3075
3076         /* build a tuple */
3077         tuple = BuildTupleFromCStrings(attinmeta, values);
3078
3079         /* make the tuple into a datum */
3080         result = HeapTupleGetDatum(tuple);
3081
3082         /* clean up (this is not really necessary) */
3083         pfree(values[0]);
3084         pfree(values[1]);
3085         pfree(values[2]);
3086         pfree(values);
3087
3088         SRF_RETURN_NEXT(funcctx, result);
3089     }
3090     else    /* do when there is no more left */
3091     {
3092         SRF_RETURN_DONE(funcctx);
3093     }
3094 }
3095 ]]>
3096 </programlisting>
3097
3098      One way to declare this function in SQL is:
3099 <programlisting>
3100 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
3101
3102 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
3103     RETURNS SETOF __retcomposite
3104     AS '<replaceable>filename</>', 'retcomposite'
3105     LANGUAGE C IMMUTABLE STRICT;
3106 </programlisting>
3107      A different way is to use OUT parameters:
3108 <programlisting>
3109 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
3110     OUT f1 integer, OUT f2 integer, OUT f3 integer)
3111     RETURNS SETOF record
3112     AS '<replaceable>filename</>', 'retcomposite'
3113     LANGUAGE C IMMUTABLE STRICT;
3114 </programlisting>
3115      Notice that in this method the output type of the function is formally
3116      an anonymous <structname>record</> type.
3117     </para>
3118
3119     <para>
3120      The directory <filename>contrib/tablefunc</> in the source
3121      distribution contains more examples of set-returning functions.
3122     </para>
3123    </sect2>
3124
3125    <sect2>
3126     <title>Polymorphic Arguments and Return Types</title>
3127
3128     <para>
3129      C-language functions can be declared to accept and
3130      return the polymorphic types
3131      <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3132      and <type>anyenum</type>.
     are defined as polymorphic types, the function author cannot know
     in advance what data type the function will be called with, or
     what type it will need to return.  There are two routines provided in <filename>fmgr.h</>
3136      in advance what data type it will be called with, or
3137      need to return. There are two routines provided in <filename>fmgr.h</>
3138      to allow a version-1 C function to discover the actual data types
3139      of its arguments and the type it is expected to return. The routines are
3140      called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
3141      <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
3142      They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3143      information is not available.
3144      The structure <literal>flinfo</> is normally accessed as
3145      <literal>fcinfo-&gt;flinfo</>. The parameter <literal>argnum</>
3146      is zero based.  <function>get_call_result_type</> can also be used
3147      as an alternative to <function>get_fn_expr_rettype</>.
3148     </para>
3149
3150     <para>
3151      For example, suppose we want to write a function to accept a single
3152      element of any type, and return a one-dimensional array of that type:
3153
3154 <programlisting>
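#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"        /* for ArrayType, construct_md_array() */
#include "utils/lsyscache.h"    /* for get_typlenbyvalalign() */

PG_FUNCTION_INFO_V1(make_array);

Datum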
3157 make_array(PG_FUNCTION_ARGS)
3158 {
3159     ArrayType  *result;
3160     Oid         element_type = get_fn_expr_argtype(fcinfo-&gt;flinfo, 0);
3161     Datum       element;
3162     bool        isnull;
3163     int16       typlen;
3164     bool        typbyval;
3165     char        typalign;
3166     int         ndims;
3167     int         dims[MAXDIM];
3168     int         lbs[MAXDIM];
3169
3170     if (!OidIsValid(element_type))
3171         elog(ERROR, "could not determine data type of input");
3172
3173     /* get the provided element, being careful in case it's NULL */
3174     isnull = PG_ARGISNULL(0);
3175     if (isnull)
3176         element = (Datum) 0;
3177     else
3178         element = PG_GETARG_DATUM(0);
3179
3180     /* we have one dimension */
3181     ndims = 1;
3182     /* and one element */
3183     dims[0] = 1;
3184     /* and lower bound is 1 */
3185     lbs[0] = 1;
3186
3187     /* get required info about the element type */
3188     get_typlenbyvalalign(element_type, &amp;typlen, &amp;typbyval, &amp;typalign);
3189
3190     /* now build the array */
3191     result = construct_md_array(&amp;element, &amp;isnull, ndims, dims, lbs,
3192                                 element_type, typlen, typbyval, typalign);
3193
3194     PG_RETURN_ARRAYTYPE_P(result);
3195 }
3196 </programlisting>
3197     </para>
3198
3199     <para>
3200      The following command declares the function
3201      <function>make_array</function> in SQL:
3202
3203 <programlisting>
3204 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
3205     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
3206     LANGUAGE C IMMUTABLE;
3207 </programlisting>
3208     </para>
3209
3210     <para>
3211      There is a variant of polymorphism that is only available to C-language
3212      functions: they can be declared to take parameters of type
3213      <literal>"any"</>.  (Note that this type name must be double-quoted,
3214      since it's also a SQL reserved word.)  This works like
3215      <type>anyelement</> except that it does not constrain different
3216      <literal>"any"</> arguments to be the same type, nor do they help
3217      determine the function's result type.  A C-language function can also
3218      declare its final parameter to be <literal>VARIADIC "any"</>.  This will
3219      match one or more actual arguments of any type (not necessarily the same
3220      type).  These arguments will <emphasis>not</> be gathered into an array
3221      as happens with normal variadic functions; they will just be passed to
3222      the function separately.  The <function>PG_NARGS()</> macro and the
3223      methods described above must be used to determine the number of actual
3224      arguments and their types when using this feature.
3225     </para>
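
    <para>
     As an illustration of the <literal>VARIADIC "any"</> case, the
     following sketch counts how many of its arguments are not null.  The
     function name <function>count_nonnull_args</function> is hypothetical.
     The code assumes the function is declared with a single
     <literal>VARIADIC "any"</> parameter and a result type of
     <type>integer</type>, that it is not declared <literal>STRICT</>
     (otherwise it would never see a null argument), and that it is called
     with separate arguments.

<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(count_nonnull_args);

Datum
count_nonnull_args(PG_FUNCTION_ARGS)
{
    int32       count = 0;
    int         i;

    /* the arguments arrive separately, not gathered into an array */
    for (i = 0; i < PG_NARGS(); i++)
    {
        /* argnum is zero based; each argument may have a different type */
        Oid         argtype = get_fn_expr_argtype(fcinfo->flinfo, i);

        if (!OidIsValid(argtype))
            elog(ERROR, "could not determine data type of argument %d",
                 i + 1);

        /*
         * A real function would choose how to interpret the Datum from
         * argtype; this sketch only needs the null flag.
         */
        if (!PG_ARGISNULL(i))
            count++;
    }

    PG_RETURN_INT32(count);
}
]]>
</programlisting>
    </para>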
3226    </sect2>
3227
3228    <sect2>
3229     <title>Shared Memory and LWLocks</title>
3230
3231     <para>
3232      Add-ins can reserve LWLocks and an allocation of shared memory on server
3233      startup.  The add-in's shared library must be preloaded by specifying
3234      it in
3235      <xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared-preload-libraries</></>.
3236      Shared memory is reserved by calling:
3237 <programlisting>
3238 void RequestAddinShmemSpace(int size)
3239 </programlisting>
3240      from your <function>_PG_init</> function.
3241     </para>
3242     <para>
3243      LWLocks are reserved by calling:
3244 <programlisting>
3245 void RequestAddinLWLocks(int n)
3246 </programlisting>
3247      from <function>_PG_init</>.
3248     </para>
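    <para>
     Taken together, the two requests might look like the following sketch
     of a <function>_PG_init</> function.  The structure
     <structname>MySharedState</> and its fields are hypothetical
     placeholders for whatever the add-in keeps in shared memory; only the
     two request calls come from the description above.

<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"
#include "storage/ipc.h"        /* for RequestAddinShmemSpace() */
#include "storage/lwlock.h"     /* for RequestAddinLWLocks() */

/* must appear exactly once per loadable module; omit if already present */
PG_MODULE_MAGIC;

/* HYPOTHETICAL layout of this add-in's shared state */
typedef struct MySharedState
{
    LWLockId    mylockid;       /* filled in later by LWLockAssign() */
    int         counter;        /* example payload protected by the lock */
} MySharedState;

void        _PG_init(void);

void
_PG_init(void)
{
    /*
     * Both requests only take effect while the library is being preloaded
     * at server start.
     */
    RequestAddinShmemSpace(sizeof(MySharedState));
    RequestAddinLWLocks(1);
}
]]>
</programlisting>

     The requests only reserve resources.  The memory itself is attached
     and initialized later, when a backend first needs it.
    </para>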
3249     <para>
     To avoid possible race conditions, each backend should use the LWLock
3251      <function>AddinShmemInitLock</> when connecting to and initializing
3252      its allocation of shared memory, as shown here:
3253 <programlisting>
static mystruct *ptr = NULL;

if (!ptr)
{
    bool    found;

    LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
    ptr = ShmemInitStruct("my struct name", size, &amp;found);
    if (!ptr)
        elog(ERROR, "out of shared memory");
    if (!found)
    {
        <replaceable>initialize contents of shmem area</replaceable>
        /* acquire any requested LWLocks using: */
        ptr-&gt;mylockid = LWLockAssign();
    }
    LWLockRelease(AddinShmemInitLock);
}
3272 </programlisting>
3273     </para>
3274    </sect2>
3275   </sect1>