Repair memory leaks in plpython.
[pgsql.git] / doc / src / sgml / syntax.sgml
blob916189a7d68ce9079be51e3f41b6f394fd3aac80
1 <!-- doc/src/sgml/syntax.sgml -->
3 <chapter id="sql-syntax">
4 <title>SQL Syntax</title>
6 <indexterm zone="sql-syntax">
7 <primary>syntax</primary>
8 <secondary>SQL</secondary>
9 </indexterm>
11 <para>
12 This chapter describes the syntax of SQL. It forms the foundation
13 for understanding the following chapters which will go into detail
14 about how SQL commands are applied to define and modify data.
15 </para>
17 <para>
18 We also advise users who are already familiar with SQL to read this
19 chapter carefully because it contains several rules and concepts that
20 are implemented inconsistently among SQL databases or that are
21 specific to <productname>PostgreSQL</productname>.
22 </para>
24 <sect1 id="sql-syntax-lexical">
25 <title>Lexical Structure</title>
27 <indexterm>
28 <primary>token</primary>
29 </indexterm>
31 <para>
32 SQL input consists of a sequence of
33 <firstterm>commands</firstterm>. A command is composed of a
34 sequence of <firstterm>tokens</firstterm>, terminated by a
35 semicolon (<quote>;</quote>). The end of the input stream also
36 terminates a command. Which tokens are valid depends on the syntax
37 of the particular command.
38 </para>
40 <para>
41 A token can be a <firstterm>key word</firstterm>, an
42 <firstterm>identifier</firstterm>, a <firstterm>quoted
43 identifier</firstterm>, a <firstterm>literal</firstterm> (or
44 constant), or a special character symbol. Tokens are normally
45 separated by whitespace (space, tab, newline), but need not be if
46 there is no ambiguity (which is generally only the case if a
47 special character is adjacent to some other token type).
48 </para>
50 <para>
51 For example, the following is (syntactically) valid SQL input:
52 <programlisting>
53 SELECT * FROM MY_TABLE;
54 UPDATE MY_TABLE SET A = 5;
55 INSERT INTO MY_TABLE VALUES (3, 'hi there');
56 </programlisting>
57 This is a sequence of three commands, one per line (although this
58 is not required; more than one command can be on a line, and
59 commands can usefully be split across lines).
60 </para>
62 <para>
63 Additionally, <firstterm>comments</firstterm> can occur in SQL
64 input. They are not tokens, they are effectively equivalent to
65 whitespace.
66 </para>
68 <para>
69 The SQL syntax is not very consistent regarding what tokens
70 identify commands and which are operands or parameters. The first
71 few tokens are generally the command name, so in the above example
72 we would usually speak of a <quote>SELECT</quote>, an
73 <quote>UPDATE</quote>, and an <quote>INSERT</quote> command. But
74 for instance the <command>UPDATE</command> command always requires
75 a <token>SET</token> token to appear in a certain position, and
76 this particular variation of <command>INSERT</command> also
77 requires a <token>VALUES</token> in order to be complete. The
78 precise syntax rules for each command are described in <xref linkend="reference"/>.
79 </para>
81 <sect2 id="sql-syntax-identifiers">
82 <title>Identifiers and Key Words</title>
84 <indexterm zone="sql-syntax-identifiers">
85 <primary>identifier</primary>
86 <secondary>syntax of</secondary>
87 </indexterm>
89 <indexterm zone="sql-syntax-identifiers">
90 <primary>name</primary>
91 <secondary>syntax of</secondary>
92 </indexterm>
94 <indexterm zone="sql-syntax-identifiers">
95 <primary>key word</primary>
96 <secondary>syntax of</secondary>
97 </indexterm>
99 <para>
100 Tokens such as <token>SELECT</token>, <token>UPDATE</token>, or
101 <token>VALUES</token> in the example above are examples of
102 <firstterm>key words</firstterm>, that is, words that have a fixed
103 meaning in the SQL language. The tokens <token>MY_TABLE</token>
104 and <token>A</token> are examples of
105 <firstterm>identifiers</firstterm>. They identify names of
106 tables, columns, or other database objects, depending on the
107 command they are used in. Therefore they are sometimes simply
108 called <quote>names</quote>. Key words and identifiers have the
109 same lexical structure, meaning that one cannot know whether a
110 token is an identifier or a key word without knowing the language.
111 A complete list of key words can be found in <xref
112 linkend="sql-keywords-appendix"/>.
113 </para>
115 <para>
116 SQL identifiers and key words must begin with a letter
117 (<literal>a</literal>-<literal>z</literal>, but also letters with
118 diacritical marks and non-Latin letters) or an underscore
119 (<literal>_</literal>). Subsequent characters in an identifier or
120 key word can be letters, underscores, digits
121 (<literal>0</literal>-<literal>9</literal>), or dollar signs
122 (<literal>$</literal>). Note that dollar signs are not allowed in identifiers
123 according to the letter of the SQL standard, so their use might render
124 applications less portable.
125 The SQL standard will not define a key word that contains
126 digits or starts or ends with an underscore, so identifiers of this
127 form are safe against possible conflict with future extensions of the
128 standard.
129 </para>
131 <para>
132 <indexterm><primary>identifier</primary><secondary>length</secondary></indexterm>
133 The system uses no more than <symbol>NAMEDATALEN</symbol>-1
134 bytes of an identifier; longer names can be written in
135 commands, but they will be truncated. By default,
136 <symbol>NAMEDATALEN</symbol> is 64 so the maximum identifier
137 length is 63 bytes. If this limit is problematic, it can be raised by
138 changing the <symbol>NAMEDATALEN</symbol> constant in
139 <filename>src/include/pg_config_manual.h</filename>.
140 </para>
142 <para>
143 <indexterm>
144 <primary>case sensitivity</primary>
145 <secondary>of SQL commands</secondary>
146 </indexterm>
147 Key words and unquoted identifiers are case-insensitive. Therefore:
148 <programlisting>
149 UPDATE MY_TABLE SET A = 5;
150 </programlisting>
151 can equivalently be written as:
152 <programlisting>
153 uPDaTE my_TabLE SeT a = 5;
154 </programlisting>
155 A convention often used is to write key words in upper
156 case and names in lower case, e.g.:
157 <programlisting>
158 UPDATE my_table SET a = 5;
159 </programlisting>
160 </para>
162 <para>
163 <indexterm>
164 <primary>quotation marks</primary>
165 <secondary>and identifiers</secondary>
166 </indexterm>
167 There is a second kind of identifier: the <firstterm>delimited
168 identifier</firstterm> or <firstterm>quoted
169 identifier</firstterm>. It is formed by enclosing an arbitrary
170 sequence of characters in double-quotes
171 (<literal>"</literal>). <!-- " font-lock mania --> A delimited
172 identifier is always an identifier, never a key word. So
173 <literal>"select"</literal> could be used to refer to a column or
174 table named <quote>select</quote>, whereas an unquoted
175 <literal>select</literal> would be taken as a key word and
176 would therefore provoke a parse error when used where a table or
177 column name is expected. The example can be written with quoted
178 identifiers like this:
179 <programlisting>
180 UPDATE "my_table" SET "a" = 5;
181 </programlisting>
182 </para>
184 <para>
185 Quoted identifiers can contain any character, except the character
186 with code zero. (To include a double quote, write two double quotes.)
187 This allows constructing table or column names that would
188 otherwise not be possible, such as ones containing spaces or
189 ampersands. The length limitation still applies.
190 </para>
192 <para>
193 Quoting an identifier also makes it case-sensitive, whereas
194 unquoted names are always folded to lower case. For example, the
195 identifiers <literal>FOO</literal>, <literal>foo</literal>, and
196 <literal>"foo"</literal> are considered the same by
197 <productname>PostgreSQL</productname>, but
198 <literal>"Foo"</literal> and <literal>"FOO"</literal> are
199 different from these three and each other. (The folding of
200 unquoted names to lower case in <productname>PostgreSQL</productname> is
201 incompatible with the SQL standard, which says that unquoted names
202 should be folded to upper case. Thus, <literal>foo</literal>
203 should be equivalent to <literal>"FOO"</literal> not
204 <literal>"foo"</literal> according to the standard. If you want
205 to write portable applications you are advised to always quote a
206 particular name or never quote it.)
207 </para>
209 <indexterm>
210 <primary>Unicode escape</primary>
211 <secondary>in identifiers</secondary>
212 </indexterm>
214 <para>
215 A variant of quoted
216 identifiers allows including escaped Unicode characters identified
217 by their code points. This variant starts
218 with <literal>U&amp;</literal> (upper or lower case U followed by
219 ampersand) immediately before the opening double quote, without
220 any spaces in between, for example <literal>U&amp;"foo"</literal>.
221 (Note that this creates an ambiguity with the
222 operator <literal>&amp;</literal>. Use spaces around the operator to
223 avoid this problem.) Inside the quotes, Unicode characters can be
224 specified in escaped form by writing a backslash followed by the
225 four-digit hexadecimal code point number or alternatively a
226 backslash followed by a plus sign followed by a six-digit
227 hexadecimal code point number. For example, the
228 identifier <literal>"data"</literal> could be written as
229 <programlisting>
230 U&amp;"d\0061t\+000061"
231 </programlisting>
232 The following less trivial example writes the Russian
233 word <quote>slon</quote> (elephant) in Cyrillic letters:
234 <programlisting>
235 U&amp;"\0441\043B\043E\043D"
236 </programlisting>
237 </para>
239 <para>
240 If a different escape character than backslash is desired, it can
241 be specified using
242 the <literal>UESCAPE</literal><indexterm><primary>UESCAPE</primary></indexterm>
243 clause after the string, for example:
244 <programlisting>
245 U&amp;"d!0061t!+000061" UESCAPE '!'
246 </programlisting>
247 The escape character can be any single character other than a
248 hexadecimal digit, the plus sign, a single quote, a double quote,
249 or a whitespace character. Note that the escape character is
250 written in single quotes, not double quotes,
251 after <literal>UESCAPE</literal>.
252 </para>
254 <para>
255 To include the escape character in the identifier literally, write
256 it twice.
257 </para>
259 <para>
260 Either the 4-digit or the 6-digit escape form can be used to
261 specify UTF-16 surrogate pairs to compose characters with code
262 points larger than U+FFFF, although the availability of the
263 6-digit form technically makes this unnecessary. (Surrogate
264 pairs are not stored directly, but are combined into a single
265 code point.)
266 </para>
268 <para>
269 If the server encoding is not UTF-8, the Unicode code point identified
270 by one of these escape sequences is converted to the actual server
271 encoding; an error is reported if that's not possible.
272 </para>
273 </sect2>
276 <sect2 id="sql-syntax-constants">
277 <title>Constants</title>
279 <indexterm zone="sql-syntax-constants">
280 <primary>constant</primary>
281 </indexterm>
283 <para>
284 There are three kinds of <firstterm>implicitly-typed
285 constants</firstterm> in <productname>PostgreSQL</productname>:
286 strings, bit strings, and numbers.
287 Constants can also be specified with explicit types, which can
288 enable more accurate representation and more efficient handling by
289 the system. These alternatives are discussed in the following
290 subsections.
291 </para>
293 <sect3 id="sql-syntax-strings">
294 <title>String Constants</title>
296 <indexterm zone="sql-syntax-strings">
297 <primary>character string</primary>
298 <secondary>constant</secondary>
299 </indexterm>
301 <para>
302 <indexterm>
303 <primary>quotation marks</primary>
304 <secondary>escaping</secondary>
305 </indexterm>
306 A string constant in SQL is an arbitrary sequence of characters
307 bounded by single quotes (<literal>'</literal>), for example
308 <literal>'This is a string'</literal>. To include
309 a single-quote character within a string constant,
310 write two adjacent single quotes, e.g.,
311 <literal>'Dianne''s horse'</literal>.
312 Note that this is <emphasis>not</emphasis> the same as a double-quote
313 character (<literal>"</literal>). <!-- font-lock sanity: " -->
314 </para>
316 <para>
317 Two string constants that are only separated by whitespace
318 <emphasis>with at least one newline</emphasis> are concatenated
319 and effectively treated as if the string had been written as one
320 constant. For example:
321 <programlisting>
322 SELECT 'foo'
323 'bar';
324 </programlisting>
325 is equivalent to:
326 <programlisting>
327 SELECT 'foobar';
328 </programlisting>
329 but:
330 <programlisting>
331 SELECT 'foo' 'bar';
332 </programlisting>
333 is not valid syntax. (This slightly bizarre behavior is specified
334 by <acronym>SQL</acronym>; <productname>PostgreSQL</productname> is
335 following the standard.)
336 </para>
337 </sect3>
339 <sect3 id="sql-syntax-strings-escape">
340 <title>String Constants with C-Style Escapes</title>
342 <indexterm zone="sql-syntax-strings-escape">
343 <primary>escape string syntax</primary>
344 </indexterm>
345 <indexterm zone="sql-syntax-strings-escape">
346 <primary>backslash escapes</primary>
347 </indexterm>
349 <para>
350 <productname>PostgreSQL</productname> also accepts <quote>escape</quote>
351 string constants, which are an extension to the SQL standard.
352 An escape string constant is specified by writing the letter
353 <literal>E</literal> (upper or lower case) just before the opening single
354 quote, e.g., <literal>E'foo'</literal>. (When continuing an escape string
355 constant across lines, write <literal>E</literal> only before the first opening
356 quote.)
357 Within an escape string, a backslash character (<literal>\</literal>) begins a
358 C-like <firstterm>backslash escape</firstterm> sequence, in which the combination
359 of backslash and following character(s) represent a special byte
360 value, as shown in <xref linkend="sql-backslash-table"/>.
361 </para>
363 <table id="sql-backslash-table">
364 <title>Backslash Escape Sequences</title>
365 <tgroup cols="2">
366 <thead>
367 <row>
368 <entry>Backslash Escape Sequence</entry>
369 <entry>Interpretation</entry>
370 </row>
371 </thead>
373 <tbody>
374 <row>
375 <entry><literal>\b</literal></entry>
376 <entry>backspace</entry>
377 </row>
378 <row>
379 <entry><literal>\f</literal></entry>
380 <entry>form feed</entry>
381 </row>
382 <row>
383 <entry><literal>\n</literal></entry>
384 <entry>newline</entry>
385 </row>
386 <row>
387 <entry><literal>\r</literal></entry>
388 <entry>carriage return</entry>
389 </row>
390 <row>
391 <entry><literal>\t</literal></entry>
392 <entry>tab</entry>
393 </row>
394 <row>
395 <entry>
396 <literal>\<replaceable>o</replaceable></literal>,
397 <literal>\<replaceable>oo</replaceable></literal>,
398 <literal>\<replaceable>ooo</replaceable></literal>
399 (<replaceable>o</replaceable> = 0&ndash;7)
400 </entry>
401 <entry>octal byte value</entry>
402 </row>
403 <row>
404 <entry>
405 <literal>\x<replaceable>h</replaceable></literal>,
406 <literal>\x<replaceable>hh</replaceable></literal>
407 (<replaceable>h</replaceable> = 0&ndash;9, A&ndash;F)
408 </entry>
409 <entry>hexadecimal byte value</entry>
410 </row>
411 <row>
412 <entry>
413 <literal>\u<replaceable>xxxx</replaceable></literal>,
414 <literal>\U<replaceable>xxxxxxxx</replaceable></literal>
415 (<replaceable>x</replaceable> = 0&ndash;9, A&ndash;F)
416 </entry>
417 <entry>16 or 32-bit hexadecimal Unicode character value</entry>
418 </row>
419 </tbody>
420 </tgroup>
421 </table>
423 <para>
424 Any other
425 character following a backslash is taken literally. Thus, to
426 include a backslash character, write two backslashes (<literal>\\</literal>).
427 Also, a single quote can be included in an escape string by writing
428 <literal>\'</literal>, in addition to the normal way of <literal>''</literal>.
429 </para>
431 <para>
432 It is your responsibility that the byte sequences you create,
433 especially when using the octal or hexadecimal escapes, compose
434 valid characters in the server character set encoding.
435 A useful alternative is to use Unicode escapes or the
436 alternative Unicode escape syntax, explained
437 in <xref linkend="sql-syntax-strings-uescape"/>; then the server
438 will check that the character conversion is possible.
439 </para>
441 <caution>
442 <para>
443 If the configuration parameter
444 <xref linkend="guc-standard-conforming-strings"/> is <literal>off</literal>,
445 then <productname>PostgreSQL</productname> recognizes backslash escapes
446 in both regular and escape string constants. However, as of
447 <productname>PostgreSQL</productname> 9.1, the default is <literal>on</literal>, meaning
448 that backslash escapes are recognized only in escape string constants.
449 This behavior is more standards-compliant, but might break applications
450 which rely on the historical behavior, where backslash escapes
451 were always recognized. As a workaround, you can set this parameter
452 to <literal>off</literal>, but it is better to migrate away from using backslash
453 escapes. If you need to use a backslash escape to represent a special
454 character, write the string constant with an <literal>E</literal>.
455 </para>
457 <para>
458 In addition to <varname>standard_conforming_strings</varname>, the configuration
459 parameters <xref linkend="guc-escape-string-warning"/> and
460 <xref linkend="guc-backslash-quote"/> govern treatment of backslashes
461 in string constants.
462 </para>
463 </caution>
465 <para>
466 The character with the code zero cannot be in a string constant.
467 </para>
468 </sect3>
470 <sect3 id="sql-syntax-strings-uescape">
471 <title>String Constants with Unicode Escapes</title>
473 <indexterm zone="sql-syntax-strings-uescape">
474 <primary>Unicode escape</primary>
475 <secondary>in string constants</secondary>
476 </indexterm>
478 <para>
479 <productname>PostgreSQL</productname> also supports another type
480 of escape syntax for strings that allows specifying arbitrary
481 Unicode characters by code point. A Unicode escape string
482 constant starts with <literal>U&amp;</literal> (upper or lower case
483 letter U followed by ampersand) immediately before the opening
484 quote, without any spaces in between, for
485 example <literal>U&amp;'foo'</literal>. (Note that this creates an
486 ambiguity with the operator <literal>&amp;</literal>. Use spaces
487 around the operator to avoid this problem.) Inside the quotes,
488 Unicode characters can be specified in escaped form by writing a
489 backslash followed by the four-digit hexadecimal code point
490 number or alternatively a backslash followed by a plus sign
491 followed by a six-digit hexadecimal code point number. For
492 example, the string <literal>'data'</literal> could be written as
493 <programlisting>
494 U&amp;'d\0061t\+000061'
495 </programlisting>
496 The following less trivial example writes the Russian
497 word <quote>slon</quote> (elephant) in Cyrillic letters:
498 <programlisting>
499 U&amp;'\0441\043B\043E\043D'
500 </programlisting>
501 </para>
503 <para>
504 If a different escape character than backslash is desired, it can
505 be specified using
506 the <literal>UESCAPE</literal><indexterm><primary>UESCAPE</primary></indexterm>
507 clause after the string, for example:
508 <programlisting>
509 U&amp;'d!0061t!+000061' UESCAPE '!'
510 </programlisting>
511 The escape character can be any single character other than a
512 hexadecimal digit, the plus sign, a single quote, a double quote,
513 or a whitespace character.
514 </para>
516 <para>
517 To include the escape character in the string literally, write
518 it twice.
519 </para>
521 <para>
522 Either the 4-digit or the 6-digit escape form can be used to
523 specify UTF-16 surrogate pairs to compose characters with code
524 points larger than U+FFFF, although the availability of the
525 6-digit form technically makes this unnecessary. (Surrogate
526 pairs are not stored directly, but are combined into a single
527 code point.)
528 </para>
530 <para>
531 If the server encoding is not UTF-8, the Unicode code point identified
532 by one of these escape sequences is converted to the actual server
533 encoding; an error is reported if that's not possible.
534 </para>
536 <para>
537 Also, the Unicode escape syntax for string constants only works
538 when the configuration
539 parameter <xref linkend="guc-standard-conforming-strings"/> is
540 turned on. This is because otherwise this syntax could confuse
541 clients that parse the SQL statements to the point that it could
542 lead to SQL injections and similar security issues. If the
543 parameter is set to off, this syntax will be rejected with an
544 error message.
545 </para>
546 </sect3>
548 <sect3 id="sql-syntax-dollar-quoting">
549 <title>Dollar-Quoted String Constants</title>
551 <indexterm>
552 <primary>dollar quoting</primary>
553 </indexterm>
555 <para>
556 While the standard syntax for specifying string constants is usually
557 convenient, it can be difficult to understand when the desired string
558 contains many single quotes, since each of those must
559 be doubled. To allow more readable queries in such situations,
560 <productname>PostgreSQL</productname> provides another way, called
561 <quote>dollar quoting</quote>, to write string constants.
562 A dollar-quoted string constant
563 consists of a dollar sign (<literal>$</literal>), an optional
564 <quote>tag</quote> of zero or more characters, another dollar
565 sign, an arbitrary sequence of characters that makes up the
566 string content, a dollar sign, the same tag that began this
567 dollar quote, and a dollar sign. For example, here are two
568 different ways to specify the string <quote>Dianne's horse</quote>
569 using dollar quoting:
570 <programlisting>
571 $$Dianne's horse$$
572 $SomeTag$Dianne's horse$SomeTag$
573 </programlisting>
574 Notice that inside the dollar-quoted string, single quotes can be
575 used without needing to be escaped. Indeed, no characters inside
576 a dollar-quoted string are ever escaped: the string content is always
577 written literally. Backslashes are not special, and neither are
578 dollar signs, unless they are part of a sequence matching the opening
579 tag.
580 </para>
582 <para>
583 It is possible to nest dollar-quoted string constants by choosing
584 different tags at each nesting level. This is most commonly used in
585 writing function definitions. For example:
586 <programlisting>
587 $function$
588 BEGIN
589 RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
590 END;
591 $function$
592 </programlisting>
593 Here, the sequence <literal>$q$[\t\r\n\v\\]$q$</literal> represents a
594 dollar-quoted literal string <literal>[\t\r\n\v\\]</literal>, which will
595 be recognized when the function body is executed by
596 <productname>PostgreSQL</productname>. But since the sequence does not match
597 the outer dollar quoting delimiter <literal>$function$</literal>, it is
598 just some more characters within the constant so far as the outer
599 string is concerned.
600 </para>
602 <para>
603 The tag, if any, of a dollar-quoted string follows the same rules
604 as an unquoted identifier, except that it cannot contain a dollar sign.
605 Tags are case sensitive, so <literal>$tag$String content$tag$</literal>
606 is correct, but <literal>$TAG$String content$tag$</literal> is not.
607 </para>
609 <para>
610 A dollar-quoted string that follows a keyword or identifier must
611 be separated from it by whitespace; otherwise the dollar quoting
612 delimiter would be taken as part of the preceding identifier.
613 </para>
615 <para>
616 Dollar quoting is not part of the SQL standard, but it is often a more
617 convenient way to write complicated string literals than the
618 standard-compliant single quote syntax. It is particularly useful when
619 representing string constants inside other constants, as is often needed
620 in procedural function definitions. With single-quote syntax, each
621 backslash in the above example would have to be written as four
622 backslashes, which would be reduced to two backslashes in parsing the
623 original string constant, and then to one when the inner string constant
624 is re-parsed during function execution.
625 </para>
626 </sect3>
628 <sect3 id="sql-syntax-bit-strings">
629 <title>Bit-String Constants</title>
631 <indexterm zone="sql-syntax-bit-strings">
632 <primary>bit string</primary>
633 <secondary>constant</secondary>
634 </indexterm>
636 <para>
637 Bit-string constants look like regular string constants with a
638 <literal>B</literal> (upper or lower case) immediately before the
639 opening quote (no intervening whitespace), e.g.,
640 <literal>B'1001'</literal>. The only characters allowed within
641 bit-string constants are <literal>0</literal> and
642 <literal>1</literal>.
643 </para>
645 <para>
646 Alternatively, bit-string constants can be specified in hexadecimal
647 notation, using a leading <literal>X</literal> (upper or lower case),
648 e.g., <literal>X'1FF'</literal>. This notation is equivalent to
649 a bit-string constant with four binary digits for each hexadecimal digit.
650 </para>
652 <para>
653 Both forms of bit-string constant can be continued
654 across lines in the same way as regular string constants.
655 Dollar quoting cannot be used in a bit-string constant.
656 </para>
657 </sect3>
659 <sect3 id="sql-syntax-constants-numeric">
660 <title>Numeric Constants</title>
662 <indexterm>
663 <primary>number</primary>
664 <secondary>constant</secondary>
665 </indexterm>
667 <para>
668 Numeric constants are accepted in these general forms:
669 <synopsis>
670 <replaceable>digits</replaceable>
671 <replaceable>digits</replaceable>.<optional><replaceable>digits</replaceable></optional><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
672 <optional><replaceable>digits</replaceable></optional>.<replaceable>digits</replaceable><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
673 <replaceable>digits</replaceable>e<optional>+-</optional><replaceable>digits</replaceable>
674 </synopsis>
675 where <replaceable>digits</replaceable> is one or more decimal
676 digits (0 through 9). At least one digit must be before or after the
677 decimal point, if one is used. At least one digit must follow the
678 exponent marker (<literal>e</literal>), if one is present.
679 There cannot be any spaces or other characters embedded in the
680 constant, except for underscores, which can be used for visual grouping as
681 described below. Note that any leading plus or minus sign is not actually
682 considered part of the constant; it is an operator applied to the
683 constant.
684 </para>
686 <para>
687 These are some examples of valid numeric constants:
688 <literallayout>
692 .001
694 1.925e-3
695 </literallayout>
696 </para>
698 <para>
699 Additionally, non-decimal integer constants are accepted in these forms:
700 <synopsis>
701 0x<replaceable>hexdigits</replaceable>
702 0o<replaceable>octdigits</replaceable>
703 0b<replaceable>bindigits</replaceable>
704 </synopsis>
705 where <replaceable>hexdigits</replaceable> is one or more hexadecimal digits
706 (0-9, A-F), <replaceable>octdigits</replaceable> is one or more octal
707 digits (0-7), and <replaceable>bindigits</replaceable> is one or more binary
708 digits (0 or 1). Hexadecimal digits and the radix prefixes can be in
709 upper or lower case. Note that only integers can have non-decimal forms,
710 not numbers with fractional parts.
711 </para>
713 <para>
714 These are some examples of valid non-decimal integer constants:
715 <literallayout>
716 0b100101
717 0B10011001
718 0o273
719 0O755
720 0x42f
721 0XFFFF
722 </literallayout>
723 </para>
725 <para>
726 For visual grouping, underscores can be inserted between digits. These
727 have no further effect on the value of the constant. For example:
728 <literallayout>
729 1_500_000_000
730 0b10001000_00000000
731 0o_1_755
732 0xFFFF_FFFF
733 1.618_034
734 </literallayout>
735 Underscores are not allowed at the start or end of a numeric constant or
736 a group of digits (that is, immediately before or after the decimal point
737 or the exponent marker), and more than one underscore in a row is not
738 allowed.
739 </para>
741 <para>
742 <indexterm><primary>integer</primary></indexterm>
743 <indexterm><primary>bigint</primary></indexterm>
744 <indexterm><primary>numeric</primary></indexterm>
745 A numeric constant that contains neither a decimal point nor an
746 exponent is initially presumed to be type <type>integer</type> if its
747 value fits in type <type>integer</type> (32 bits); otherwise it is
748 presumed to be type <type>bigint</type> if its
749 value fits in type <type>bigint</type> (64 bits); otherwise it is
750 taken to be type <type>numeric</type>. Constants that contain decimal
751 points and/or exponents are always initially presumed to be type
752 <type>numeric</type>.
753 </para>
755 <para>
756 The initially assigned data type of a numeric constant is just a
757 starting point for the type resolution algorithms. In most cases
758 the constant will be automatically coerced to the most
759 appropriate type depending on context. When necessary, you can
760 force a numeric value to be interpreted as a specific data type
761 by casting it.<indexterm><primary>type cast</primary></indexterm>
762 For example, you can force a numeric value to be treated as type
763 <type>real</type> (<type>float4</type>) by writing:
765 <programlisting>
766 REAL '1.23' -- string style
767 1.23::REAL -- PostgreSQL (historical) style
768 </programlisting>
770 These are actually just special cases of the general casting
771 notations discussed next.
772 </para>
773 </sect3>
775 <sect3 id="sql-syntax-constants-generic">
776 <title>Constants of Other Types</title>
778 <indexterm>
779 <primary>data type</primary>
780 <secondary>constant</secondary>
781 </indexterm>
783 <para>
784 A constant of an <emphasis>arbitrary</emphasis> type can be
785 entered using any one of the following notations:
786 <synopsis>
787 <replaceable>type</replaceable> '<replaceable>string</replaceable>'
788 '<replaceable>string</replaceable>'::<replaceable>type</replaceable>
789 CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
790 </synopsis>
791 The string constant's text is passed to the input conversion
792 routine for the type called <replaceable>type</replaceable>. The
793 result is a constant of the indicated type. The explicit type
794 cast can be omitted if there is no ambiguity as to the type the
795 constant must be (for example, when it is assigned directly to a
796 table column), in which case it is automatically coerced.
797 </para>
799 <para>
800 The string constant can be written using either regular SQL
801 notation or dollar-quoting.
802 </para>
804 <para>
805 It is also possible to specify a type coercion using a function-like
806 syntax:
807 <synopsis>
808 <replaceable>typename</replaceable> ( '<replaceable>string</replaceable>' )
809 </synopsis>
810 but not all type names can be used in this way; see <xref
811 linkend="sql-syntax-type-casts"/> for details.
812 </para>
814 <para>
815 The <literal>::</literal>, <literal>CAST()</literal>, and
816 function-call syntaxes can also be used to specify run-time type
817 conversions of arbitrary expressions, as discussed in <xref
818 linkend="sql-syntax-type-casts"/>. To avoid syntactic ambiguity, the
819 <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
820 syntax can only be used to specify the type of a simple literal constant.
821 Another restriction on the
822 <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
823 syntax is that it does not work for array types; use <literal>::</literal>
824 or <literal>CAST()</literal> to specify the type of an array constant.
825 </para>
827 <para>
828 The <literal>CAST()</literal> syntax conforms to SQL. The
829 <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
830 syntax is a generalization of the standard: SQL specifies this syntax only
831 for a few data types, but <productname>PostgreSQL</productname> allows it
832 for all types. The syntax with
833 <literal>::</literal> is historical <productname>PostgreSQL</productname>
834 usage, as is the function-call syntax.
835 </para>
836 </sect3>
837 </sect2>
839 <sect2 id="sql-syntax-operators">
840 <title>Operators</title>
842 <indexterm zone="sql-syntax-operators">
843 <primary>operator</primary>
844 <secondary>syntax</secondary>
845 </indexterm>
847 <para>
848 An operator name is a sequence of up to <symbol>NAMEDATALEN</symbol>-1
849 (63 by default) characters from the following list:
850 <literallayout>
851 + - * / &lt; &gt; = ~ ! @ # % ^ &amp; | ` ?
852 </literallayout>
854 There are a few restrictions on operator names, however:
855 <itemizedlist>
856 <listitem>
857 <para>
858 <literal>--</literal> and <literal>/*</literal> cannot appear
859 anywhere in an operator name, since they will be taken as the
860 start of a comment.
861 </para>
862 </listitem>
864 <listitem>
865 <para>
866 A multiple-character operator name cannot end in <literal>+</literal> or <literal>-</literal>,
867 unless the name also contains at least one of these characters:
868 <literallayout>
869 ~ ! @ # % ^ &amp; | ` ?
870 </literallayout>
871 For example, <literal>@-</literal> is an allowed operator name,
872 but <literal>*-</literal> is not. This restriction allows
873 <productname>PostgreSQL</productname> to parse SQL-compliant
874 queries without requiring spaces between tokens.
875 </para>
876 </listitem>
877 </itemizedlist>
878 </para>
880 <para>
881 When working with non-SQL-standard operator names, you will usually
882 need to separate adjacent operators with spaces to avoid ambiguity.
883 For example, if you have defined a prefix operator named <literal>@</literal>,
884 you cannot write <literal>X*@Y</literal>; you must write
885 <literal>X* @Y</literal> to ensure that
886 <productname>PostgreSQL</productname> reads it as two operator names
887 not one.
888 </para>
889 </sect2>
891 <sect2 id="sql-syntax-special-chars">
892 <title>Special Characters</title>
894 <para>
895 Some characters that are not alphanumeric have a special meaning
896 that is different from being an operator. Details on the usage can
897 be found at the location where the respective syntax element is
898 described. This section only exists to advise the existence and
899 summarize the purposes of these characters.
901 <itemizedlist>
902 <listitem>
903 <para>
904 A dollar sign (<literal>$</literal>) followed by digits is used
905 to represent a positional parameter in the body of a function
906 definition or a prepared statement. In other contexts the
907 dollar sign can be part of an identifier or a dollar-quoted string
908 constant.
909 </para>
910 </listitem>
912 <listitem>
913 <para>
914 Parentheses (<literal>()</literal>) have their usual meaning to
915 group expressions and enforce precedence. In some cases
916 parentheses are required as part of the fixed syntax of a
917 particular SQL command.
918 </para>
919 </listitem>
921 <listitem>
922 <para>
923 Brackets (<literal>[]</literal>) are used to select the elements
924 of an array. See <xref linkend="arrays"/> for more information
925 on arrays.
926 </para>
927 </listitem>
929 <listitem>
930 <para>
931 Commas (<literal>,</literal>) are used in some syntactical
932 constructs to separate the elements of a list.
933 </para>
934 </listitem>
936 <listitem>
937 <para>
938 The semicolon (<literal>;</literal>) terminates an SQL command.
939 It cannot appear anywhere within a command, except within a
940 string constant or quoted identifier.
941 </para>
942 </listitem>
944 <listitem>
945 <para>
946 The colon (<literal>:</literal>) is used to select
947 <quote>slices</quote> from arrays. (See <xref
948 linkend="arrays"/>.) In certain SQL dialects (such as Embedded
949 SQL), the colon is used to prefix variable names.
950 </para>
951 </listitem>
953 <listitem>
954 <para>
955 The asterisk (<literal>*</literal>) is used in some contexts to denote
956 all the fields of a table row or composite value. It also
957 has a special meaning when used as the argument of an
958 aggregate function, namely that the aggregate does not require
959 any explicit parameter.
960 </para>
961 </listitem>
963 <listitem>
964 <para>
965 The period (<literal>.</literal>) is used in numeric
966 constants, and to separate schema, table, and column names.
967 </para>
968 </listitem>
969 </itemizedlist>
971 </para>
972 </sect2>
974 <sect2 id="sql-syntax-comments">
975 <title>Comments</title>
977 <indexterm zone="sql-syntax-comments">
978 <primary>comment</primary>
979 <secondary sortas="SQL">in SQL</secondary>
980 </indexterm>
982 <para>
983 A comment is a sequence of characters beginning with
984 double dashes and extending to the end of the line, e.g.:
985 <programlisting>
986 -- This is a standard SQL comment
987 </programlisting>
988 </para>
990 <para>
991 Alternatively, C-style block comments can be used:
992 <programlisting>
993 /* multiline comment
994 * with nesting: /* nested block comment */
996 </programlisting>
997 where the comment begins with <literal>/*</literal> and extends to
998 the matching occurrence of <literal>*/</literal>. These block
999 comments nest, as specified in the SQL standard but unlike C, so that one can
1000 comment out larger blocks of code that might contain existing block
1001 comments.
1002 </para>
1004 <para>
1005 A comment is removed from the input stream before further syntax
1006 analysis and is effectively replaced by whitespace.
1007 </para>
1008 </sect2>
1010 <sect2 id="sql-precedence">
1011 <title>Operator Precedence</title>
1013 <indexterm zone="sql-precedence">
1014 <primary>operator</primary>
1015 <secondary>precedence</secondary>
1016 </indexterm>
1018 <para>
1019 <xref linkend="sql-precedence-table"/> shows the precedence and
1020 associativity of the operators in <productname>PostgreSQL</productname>.
1021 Most operators have the same precedence and are left-associative.
1022 The precedence and associativity of the operators is hard-wired
1023 into the parser.
1024 Add parentheses if you want an expression with multiple operators
1025 to be parsed in some other way than what the precedence rules imply.
1026 </para>
1028 <table id="sql-precedence-table">
1029 <title>Operator Precedence (highest to lowest)</title>
1031 <tgroup cols="3">
1032 <colspec colname="col1" colwidth="2*"/>
1033 <colspec colname="col2" colwidth="1*"/>
1034 <colspec colname="col3" colwidth="2*"/>
1035 <thead>
1036 <row>
1037 <entry>Operator/Element</entry>
1038 <entry>Associativity</entry>
1039 <entry>Description</entry>
1040 </row>
1041 </thead>
1043 <tbody>
1044 <row>
1045 <entry><token>.</token></entry>
1046 <entry>left</entry>
1047 <entry>table/column name separator</entry>
1048 </row>
1050 <row>
1051 <entry><token>::</token></entry>
1052 <entry>left</entry>
1053 <entry><productname>PostgreSQL</productname>-style typecast</entry>
1054 </row>
1056 <row>
1057 <entry><token>[</token> <token>]</token></entry>
1058 <entry>left</entry>
1059 <entry>array element selection</entry>
1060 </row>
1062 <row>
1063 <entry><token>+</token> <token>-</token></entry>
1064 <entry>right</entry>
1065 <entry>unary plus, unary minus</entry>
1066 </row>
1068 <row>
1069 <entry><token>COLLATE</token></entry>
1070 <entry>left</entry>
1071 <entry>collation selection</entry>
1072 </row>
1074 <row>
1075 <entry><token>AT</token></entry>
1076 <entry>left</entry>
1077 <entry><literal>AT TIME ZONE</literal>, <literal>AT LOCAL</literal></entry>
1078 </row>
1080 <row>
1081 <entry><token>^</token></entry>
1082 <entry>left</entry>
1083 <entry>exponentiation</entry>
1084 </row>
1086 <row>
1087 <entry><token>*</token> <token>/</token> <token>%</token></entry>
1088 <entry>left</entry>
1089 <entry>multiplication, division, modulo</entry>
1090 </row>
1092 <row>
1093 <entry><token>+</token> <token>-</token></entry>
1094 <entry>left</entry>
1095 <entry>addition, subtraction</entry>
1096 </row>
1098 <row>
1099 <entry>(any other operator)</entry>
1100 <entry>left</entry>
1101 <entry>all other native and user-defined operators</entry>
1102 </row>
1104 <row>
1105 <entry><token>BETWEEN</token> <token>IN</token> <token>LIKE</token> <token>ILIKE</token> <token>SIMILAR</token></entry>
1106 <entry></entry>
1107 <entry>range containment, set membership, string matching</entry>
1108 </row>
1110 <row>
1111 <entry><token>&lt;</token> <token>&gt;</token> <token>=</token> <token>&lt;=</token> <token>&gt;=</token> <token>&lt;&gt;</token>
1112 </entry>
1113 <entry></entry>
1114 <entry>comparison operators</entry>
1115 </row>
1117 <row>
1118 <entry><token>IS</token> <token>ISNULL</token> <token>NOTNULL</token></entry>
1119 <entry></entry>
1120 <entry><literal>IS TRUE</literal>, <literal>IS FALSE</literal>, <literal>IS
1121 NULL</literal>, <literal>IS DISTINCT FROM</literal>, etc.</entry>
1122 </row>
1124 <row>
1125 <entry><token>NOT</token></entry>
1126 <entry>right</entry>
1127 <entry>logical negation</entry>
1128 </row>
1130 <row>
1131 <entry><token>AND</token></entry>
1132 <entry>left</entry>
1133 <entry>logical conjunction</entry>
1134 </row>
1136 <row>
1137 <entry><token>OR</token></entry>
1138 <entry>left</entry>
1139 <entry>logical disjunction</entry>
1140 </row>
1141 </tbody>
1142 </tgroup>
1143 </table>
1145 <para>
1146 Note that the operator precedence rules also apply to user-defined
1147 operators that have the same names as the built-in operators
1148 mentioned above. For example, if you define a
1149 <quote>+</quote> operator for some custom data type it will have
1150 the same precedence as the built-in <quote>+</quote> operator, no
1151 matter what yours does.
1152 </para>
1154 <para>
1155 When a schema-qualified operator name is used in the
1156 <literal>OPERATOR</literal> syntax, as for example in:
1157 <programlisting>
1158 SELECT 3 OPERATOR(pg_catalog.+) 4;
1159 </programlisting>
1160 the <literal>OPERATOR</literal> construct is taken to have the default precedence
1161 shown in <xref linkend="sql-precedence-table"/> for
1162 <quote>any other operator</quote>. This is true no matter
1163 which specific operator appears inside <literal>OPERATOR()</literal>.
1164 </para>
1166 <note>
1167 <para>
1168 <productname>PostgreSQL</productname> versions before 9.5 used slightly different
1169 operator precedence rules. In particular, <token>&lt;=</token>
1170 <token>&gt;=</token> and <token>&lt;&gt;</token> used to be treated as
1171 generic operators; <literal>IS</literal> tests used to have higher priority;
1172 and <literal>NOT BETWEEN</literal> and related constructs acted inconsistently,
1173 being taken in some cases as having the precedence of <literal>NOT</literal>
1174 rather than <literal>BETWEEN</literal>. These rules were changed for better
1175 compliance with the SQL standard and to reduce confusion from
1176 inconsistent treatment of logically equivalent constructs. In most
1177 cases, these changes will result in no behavioral change, or perhaps
1178 in <quote>no such operator</quote> failures which can be resolved by adding
1179 parentheses. However there are corner cases in which a query might
1180 change behavior without any parsing error being reported.
1181 </para>
1182 </note>
1183 </sect2>
1184 </sect1>
1186 <sect1 id="sql-expressions">
1187 <title>Value Expressions</title>
1189 <indexterm zone="sql-expressions">
1190 <primary>expression</primary>
1191 <secondary>syntax</secondary>
1192 </indexterm>
1194 <indexterm zone="sql-expressions">
1195 <primary>value expression</primary>
1196 </indexterm>
1198 <indexterm>
1199 <primary>scalar</primary>
1200 <see>expression</see>
1201 </indexterm>
1203 <para>
1204 Value expressions are used in a variety of contexts, such
1205 as in the target list of the <command>SELECT</command> command, as
1206 new column values in <command>INSERT</command> or
1207 <command>UPDATE</command>, or in search conditions in a number of
1208 commands. The result of a value expression is sometimes called a
1209 <firstterm>scalar</firstterm>, to distinguish it from the result of
1210 a table expression (which is a table). Value expressions are
1211 therefore also called <firstterm>scalar expressions</firstterm> (or
1212 even simply <firstterm>expressions</firstterm>). The expression
1213 syntax allows the calculation of values from primitive parts using
1214 arithmetic, logical, set, and other operations.
1215 </para>
1217 <para>
1218 A value expression is one of the following:
1220 <itemizedlist>
1221 <listitem>
1222 <para>
1223 A constant or literal value
1224 </para>
1225 </listitem>
1227 <listitem>
1228 <para>
1229 A column reference
1230 </para>
1231 </listitem>
1233 <listitem>
1234 <para>
1235 A positional parameter reference, in the body of a function definition
1236 or prepared statement
1237 </para>
1238 </listitem>
1240 <listitem>
1241 <para>
1242 A subscripted expression
1243 </para>
1244 </listitem>
1246 <listitem>
1247 <para>
1248 A field selection expression
1249 </para>
1250 </listitem>
1252 <listitem>
1253 <para>
1254 An operator invocation
1255 </para>
1256 </listitem>
1258 <listitem>
1259 <para>
1260 A function call
1261 </para>
1262 </listitem>
1264 <listitem>
1265 <para>
1266 An aggregate expression
1267 </para>
1268 </listitem>
1270 <listitem>
1271 <para>
1272 A window function call
1273 </para>
1274 </listitem>
1276 <listitem>
1277 <para>
1278 A type cast
1279 </para>
1280 </listitem>
1282 <listitem>
1283 <para>
1284 A collation expression
1285 </para>
1286 </listitem>
1288 <listitem>
1289 <para>
1290 A scalar subquery
1291 </para>
1292 </listitem>
1294 <listitem>
1295 <para>
1296 An array constructor
1297 </para>
1298 </listitem>
1300 <listitem>
1301 <para>
1302 A row constructor
1303 </para>
1304 </listitem>
1306 <listitem>
1307 <para>
1308 Another value expression in parentheses (used to group
1309 subexpressions and override
1310 precedence<indexterm><primary>parenthesis</primary></indexterm>)
1311 </para>
1312 </listitem>
1313 </itemizedlist>
1314 </para>
1316 <para>
1317 In addition to this list, there are a number of constructs that can
1318 be classified as an expression but do not follow any general syntax
1319 rules. These generally have the semantics of a function or
1320 operator and are explained in the appropriate location in <xref
1321 linkend="functions"/>. An example is the <literal>IS NULL</literal>
1322 clause.
1323 </para>
1325 <para>
1326 We have already discussed constants in <xref
1327 linkend="sql-syntax-constants"/>. The following sections discuss
1328 the remaining options.
1329 </para>
1331 <sect2 id="sql-expressions-column-refs">
1332 <title>Column References</title>
1334 <indexterm>
1335 <primary>column reference</primary>
1336 </indexterm>
1338 <para>
1339 A column can be referenced in the form:
1340 <synopsis>
1341 <replaceable>correlation</replaceable>.<replaceable>columnname</replaceable>
1342 </synopsis>
1343 </para>
1345 <para>
1346 <replaceable>correlation</replaceable> is the name of a
1347 table (possibly qualified with a schema name), or an alias for a table
1348 defined by means of a <literal>FROM</literal> clause.
1349 The correlation name and separating dot can be omitted if the column name
1350 is unique across all the tables being used in the current query. (See also <xref linkend="queries"/>.)
1351 </para>
1352 </sect2>
1354 <sect2 id="sql-expressions-parameters-positional">
1355 <title>Positional Parameters</title>
1357 <indexterm>
1358 <primary>parameter</primary>
1359 <secondary>syntax</secondary>
1360 </indexterm>
1362 <indexterm>
1363 <primary>$</primary>
1364 </indexterm>
1366 <para>
1367 A positional parameter reference is used to indicate a value
1368 that is supplied externally to an SQL statement. Parameters are
1369 used in SQL function definitions and in prepared queries. Some
1370 client libraries also support specifying data values separately
1371 from the SQL command string, in which case parameters are used to
1372 refer to the out-of-line data values.
1373 The form of a parameter reference is:
1374 <synopsis>
1375 $<replaceable>number</replaceable>
1376 </synopsis>
1377 </para>
1379 <para>
1380 For example, consider the definition of a function,
1381 <function>dept</function>, as:
1383 <programlisting>
1384 CREATE FUNCTION dept(text) RETURNS dept
1385 AS $$ SELECT * FROM dept WHERE name = $1 $$
1386 LANGUAGE SQL;
1387 </programlisting>
1389 Here the <literal>$1</literal> references the value of the first
1390 function argument whenever the function is invoked.
1391 </para>
1392 </sect2>
1394 <sect2 id="sql-expressions-subscripts">
1395 <title>Subscripts</title>
1397 <indexterm>
1398 <primary>subscript</primary>
1399 </indexterm>
1401 <para>
1402 If an expression yields a value of an array type, then a specific
1403 element of the array value can be extracted by writing
1404 <synopsis>
1405 <replaceable>expression</replaceable>[<replaceable>subscript</replaceable>]
1406 </synopsis>
1407 or multiple adjacent elements (an <quote>array slice</quote>) can be extracted
1408 by writing
1409 <synopsis>
1410 <replaceable>expression</replaceable>[<replaceable>lower_subscript</replaceable>:<replaceable>upper_subscript</replaceable>]
1411 </synopsis>
1412 (Here, the brackets <literal>[ ]</literal> are meant to appear literally.)
1413 Each <replaceable>subscript</replaceable> is itself an expression,
1414 which will be rounded to the nearest integer value.
1415 </para>
1417 <para>
1418 In general the array <replaceable>expression</replaceable> must be
1419 parenthesized, but the parentheses can be omitted when the expression
1420 to be subscripted is just a column reference or positional parameter.
1421 Also, multiple subscripts can be concatenated when the original array
1422 is multidimensional.
1423 For example:
1425 <programlisting>
1426 mytable.arraycolumn[4]
1427 mytable.two_d_column[17][34]
1428 $1[10:42]
1429 (arrayfunction(a,b))[42]
1430 </programlisting>
1432 The parentheses in the last example are required.
1433 See <xref linkend="arrays"/> for more about arrays.
1434 </para>
1435 </sect2>
1437 <sect2 id="field-selection">
1438 <title>Field Selection</title>
1440 <indexterm>
1441 <primary>field selection</primary>
1442 </indexterm>
1444 <para>
1445 If an expression yields a value of a composite type (row type), then a
1446 specific field of the row can be extracted by writing
1447 <synopsis>
1448 <replaceable>expression</replaceable>.<replaceable>fieldname</replaceable>
1449 </synopsis>
1450 </para>
1452 <para>
1453 In general the row <replaceable>expression</replaceable> must be
1454 parenthesized, but the parentheses can be omitted when the expression
1455 to be selected from is just a table reference or positional parameter.
1456 For example:
1458 <programlisting>
1459 mytable.mycolumn
1460 $1.somecolumn
1461 (rowfunction(a,b)).col3
1462 </programlisting>
1464 (Thus, a qualified column reference is actually just a special case
1465 of the field selection syntax.) An important special case is
1466 extracting a field from a table column that is of a composite type:
1468 <programlisting>
1469 (compositecol).somefield
1470 (mytable.compositecol).somefield
1471 </programlisting>
1473 The parentheses are required here to show that
1474 <structfield>compositecol</structfield> is a column name not a table name,
1475 or that <structname>mytable</structname> is a table name not a schema name
1476 in the second case.
1477 </para>
1479 <para>
1480 You can ask for all fields of a composite value by
1481 writing <literal>.*</literal>:
1482 <programlisting>
1483 (compositecol).*
1484 </programlisting>
1485 This notation behaves differently depending on context;
1486 see <xref linkend="rowtypes-usage"/> for details.
1487 </para>
1488 </sect2>
1490 <sect2 id="sql-expressions-operator-calls">
1491 <title>Operator Invocations</title>
1493 <indexterm>
1494 <primary>operator</primary>
1495 <secondary>invocation</secondary>
1496 </indexterm>
1498 <para>
1499 There are two possible syntaxes for an operator invocation:
1500 <simplelist>
1501 <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> <replaceable>expression</replaceable> (binary infix operator)</member>
1502 <member><replaceable>operator</replaceable> <replaceable>expression</replaceable> (unary prefix operator)</member>
1503 </simplelist>
1504 where the <replaceable>operator</replaceable> token follows the syntax
1505 rules of <xref linkend="sql-syntax-operators"/>, or is one of the
1506 key words <token>AND</token>, <token>OR</token>, and
1507 <token>NOT</token>, or is a qualified operator name in the form:
1508 <synopsis>
1509 <literal>OPERATOR(</literal><replaceable>schema</replaceable><literal>.</literal><replaceable>operatorname</replaceable><literal>)</literal>
1510 </synopsis>
1511 Which particular operators exist and whether
1512 they are unary or binary depends on what operators have been
1513 defined by the system or the user. <xref linkend="functions"/>
1514 describes the built-in operators.
1515 </para>
1516 </sect2>
1518 <sect2 id="sql-expressions-function-calls">
1519 <title>Function Calls</title>
1521 <indexterm>
1522 <primary>function</primary>
1523 <secondary>invocation</secondary>
1524 </indexterm>
1526 <para>
1527 The syntax for a function call is the name of a function
1528 (possibly qualified with a schema name), followed by its argument list
1529 enclosed in parentheses:
1531 <synopsis>
1532 <replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
1533 </synopsis>
1534 </para>
1536 <para>
1537 For example, the following computes the square root of 2:
1538 <programlisting>
1539 sqrt(2)
1540 </programlisting>
1541 </para>
1543 <para>
1544 The list of built-in functions is in <xref linkend="functions"/>.
1545 Other functions can be added by the user.
1546 </para>
1548 <para>
1549 When issuing queries in a database where some users mistrust other users,
1550 observe security precautions from <xref linkend="typeconv-func"/> when
1551 writing function calls.
1552 </para>
1554 <para>
1555 The arguments can optionally have names attached.
1556 See <xref linkend="sql-syntax-calling-funcs"/> for details.
1557 </para>
1559 <note>
1560 <para>
1561 A function that takes a single argument of composite type can
1562 optionally be called using field-selection syntax, and conversely
1563 field selection can be written in functional style. That is, the
1564 notations <literal>col(table)</literal> and <literal>table.col</literal> are
1565 interchangeable. This behavior is not SQL-standard but is provided
1566 in <productname>PostgreSQL</productname> because it allows use of functions to
1567 emulate <quote>computed fields</quote>. For more information see
1568 <xref linkend="rowtypes-usage"/>.
1569 </para>
1570 </note>
1571 </sect2>
1573 <sect2 id="syntax-aggregates">
1574 <title>Aggregate Expressions</title>
1576 <indexterm zone="syntax-aggregates">
1577 <primary>aggregate function</primary>
1578 <secondary>invocation</secondary>
1579 </indexterm>
1581 <indexterm zone="syntax-aggregates">
1582 <primary>ordered-set aggregate</primary>
1583 </indexterm>
1585 <indexterm zone="syntax-aggregates">
1586 <primary>WITHIN GROUP</primary>
1587 </indexterm>
1589 <indexterm zone="syntax-aggregates">
1590 <primary>FILTER</primary>
1591 </indexterm>
1593 <para>
1594 An <firstterm>aggregate expression</firstterm> represents the
1595 application of an aggregate function across the rows selected by a
1596 query. An aggregate function reduces multiple inputs to a single
1597 output value, such as the sum or average of the inputs. The
1598 syntax of an aggregate expression is one of the following:
1600 <synopsis>
1601 <replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
1602 <replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
1603 <replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
1604 <replaceable>aggregate_name</replaceable> ( * ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
1605 <replaceable>aggregate_name</replaceable> ( [ <replaceable>expression</replaceable> [ , ... ] ] ) WITHIN GROUP ( <replaceable>order_by_clause</replaceable> ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
1606 </synopsis>
1608 where <replaceable>aggregate_name</replaceable> is a previously
1609 defined aggregate (possibly qualified with a schema name) and
1610 <replaceable>expression</replaceable> is
1611 any value expression that does not itself contain an aggregate
1612 expression or a window function call. The optional
1613 <replaceable>order_by_clause</replaceable> and
1614 <replaceable>filter_clause</replaceable> are described below.
1615 </para>
1617 <para>
1618 The first form of aggregate expression invokes the aggregate
1619 once for each input row.
1620 The second form is the same as the first, since
1621 <literal>ALL</literal> is the default.
1622 The third form invokes the aggregate once for each distinct value
1623 of the expression (or distinct set of values, for multiple expressions)
1624 found in the input rows.
1625 The fourth form invokes the aggregate once for each input row; since no
1626 particular input value is specified, it is generally only useful
1627 for the <function>count(*)</function> aggregate function.
1628 The last form is used with <firstterm>ordered-set</firstterm> aggregate
1629 functions, which are described below.
1630 </para>
1632 <para>
1633 Most aggregate functions ignore null inputs, so that rows in which
1634 one or more of the expression(s) yield null are discarded. This
1635 can be assumed to be true, unless otherwise specified, for all
1636 built-in aggregates.
1637 </para>
1639 <para>
1640 For example, <literal>count(*)</literal> yields the total number
1641 of input rows; <literal>count(f1)</literal> yields the number of
1642 input rows in which <literal>f1</literal> is non-null, since
1643 <function>count</function> ignores nulls; and
1644 <literal>count(distinct f1)</literal> yields the number of
1645 distinct non-null values of <literal>f1</literal>.
1646 </para>
1648 <para>
1649 Ordinarily, the input rows are fed to the aggregate function in an
1650 unspecified order. In many cases this does not matter; for example,
1651 <function>min</function> produces the same result no matter what order it
1652 receives the inputs in. However, some aggregate functions
1653 (such as <function>array_agg</function> and <function>string_agg</function>) produce
1654 results that depend on the ordering of the input rows. When using
1655 such an aggregate, the optional <replaceable>order_by_clause</replaceable> can be
1656 used to specify the desired ordering. The <replaceable>order_by_clause</replaceable>
1657 has the same syntax as for a query-level <literal>ORDER BY</literal> clause, as
1658 described in <xref linkend="queries-order"/>, except that its expressions
1659 are always just expressions and cannot be output-column names or numbers.
1660 For example:
1661 <programlisting>
1662 WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
1663 SELECT array_agg(v ORDER BY v DESC) FROM vals;
1664 array_agg
1665 -------------
1666 {4,3,3,2,1}
1667 </programlisting>
1668 Since <type>jsonb</type> only keeps the last matching key, ordering
1669 of its keys can be significant:
1670 <programlisting>
1671 WITH vals (k, v) AS ( VALUES ('key0','1'), ('key1','3'), ('key1','2') )
1672 SELECT jsonb_object_agg(k, v ORDER BY v) FROM vals;
1673 jsonb_object_agg
1674 ----------------------------
1675 {"key0": "1", "key1": "3"}
1676 </programlisting>
1677 </para>
1679 <para>
1680 When dealing with multiple-argument aggregate functions, note that the
1681 <literal>ORDER BY</literal> clause goes after all the aggregate arguments.
1682 For example, write this:
1683 <programlisting>
1684 SELECT string_agg(a, ',' ORDER BY a) FROM table;
1685 </programlisting>
1686 not this:
1687 <programlisting>
1688 SELECT string_agg(a ORDER BY a, ',') FROM table; -- incorrect
1689 </programlisting>
1690 The latter is syntactically valid, but it represents a call of a
1691 single-argument aggregate function with two <literal>ORDER BY</literal> keys
1692 (the second one being rather useless since it's a constant).
1693 </para>
1695 <para>
1696 If <literal>DISTINCT</literal> is specified with an
1697 <replaceable>order_by_clause</replaceable>, <literal>ORDER
1698 BY</literal> expressions can only reference columns in the
1699 <literal>DISTINCT</literal> list. For example:
1700 <programlisting>
1701 WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
1702 SELECT array_agg(DISTINCT v ORDER BY v DESC) FROM vals;
1703 array_agg
1704 -----------
1705 {4,3,2,1}
1706 </programlisting>
1707 </para>
1709 <para>
1710 Placing <literal>ORDER BY</literal> within the aggregate's regular argument
1711 list, as described so far, is used when ordering the input rows for
1712 general-purpose and statistical aggregates, for which ordering is
1713 optional. There is a
1714 subclass of aggregate functions called <firstterm>ordered-set
1715 aggregates</firstterm> for which an <replaceable>order_by_clause</replaceable>
1716 is <emphasis>required</emphasis>, usually because the aggregate's computation is
1717 only sensible in terms of a specific ordering of its input rows.
1718 Typical examples of ordered-set aggregates include rank and percentile
1719 calculations. For an ordered-set aggregate,
1720 the <replaceable>order_by_clause</replaceable> is written
1721 inside <literal>WITHIN GROUP (...)</literal>, as shown in the final syntax
1722 alternative above. The expressions in
1723 the <replaceable>order_by_clause</replaceable> are evaluated once per
1724 input row just like regular aggregate arguments, sorted as per
1725 the <replaceable>order_by_clause</replaceable>'s requirements, and fed
1726 to the aggregate function as input arguments. (This is unlike the case
1727 for a non-<literal>WITHIN GROUP</literal> <replaceable>order_by_clause</replaceable>,
1728 which is not treated as argument(s) to the aggregate function.) The
1729 argument expressions preceding <literal>WITHIN GROUP</literal>, if any, are
1730 called <firstterm>direct arguments</firstterm> to distinguish them from
1731 the <firstterm>aggregated arguments</firstterm> listed in
1732 the <replaceable>order_by_clause</replaceable>. Unlike regular aggregate
1733 arguments, direct arguments are evaluated only once per aggregate call,
1734 not once per input row. This means that they can contain variables only
1735 if those variables are grouped by <literal>GROUP BY</literal>; this restriction
1736 is the same as if the direct arguments were not inside an aggregate
1737 expression at all. Direct arguments are typically used for things like
1738 percentile fractions, which only make sense as a single value per
1739 aggregation calculation. The direct argument list can be empty; in this
1740 case, write just <literal>()</literal> not <literal>(*)</literal>.
1741 (<productname>PostgreSQL</productname> will actually accept either spelling, but
1742 only the first way conforms to the SQL standard.)
1743 </para>
1745 <para>
1746 <indexterm>
1747 <primary>median</primary>
1748 <seealso>percentile</seealso>
1749 </indexterm>
1750 An example of an ordered-set aggregate call is:
1752 <programlisting>
1753 SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY income) FROM households;
1754 percentile_cont
1755 -----------------
1756 50489
1757 </programlisting>
1759 which obtains the 50th percentile, or median, value of
1760 the <structfield>income</structfield> column from table <structname>households</structname>.
1761 Here, <literal>0.5</literal> is a direct argument; it would make no sense
1762 for the percentile fraction to be a value varying across rows.
1763 </para>
1765 <para>
1766 If <literal>FILTER</literal> is specified, then only the input
1767 rows for which the <replaceable>filter_clause</replaceable>
1768 evaluates to true are fed to the aggregate function; other rows
1769 are discarded. For example:
1770 <programlisting>
1771 SELECT
1772 count(*) AS unfiltered,
1773 count(*) FILTER (WHERE i &lt; 5) AS filtered
1774 FROM generate_series(1,10) AS s(i);
1775 unfiltered | filtered
1776 ------------+----------
1777 10 | 4
1778 (1 row)
1779 </programlisting>
1780 </para>
1782 <para>
1783 The predefined aggregate functions are described in <xref
1784 linkend="functions-aggregate"/>. Other aggregate functions can be added
1785 by the user.
1786 </para>
1788 <para>
1789 An aggregate expression can only appear in the result list or
1790 <literal>HAVING</literal> clause of a <command>SELECT</command> command.
1791 It is forbidden in other clauses, such as <literal>WHERE</literal>,
1792 because those clauses are logically evaluated before the results
1793 of aggregates are formed.
1794 </para>
1796 <para>
1797 When an aggregate expression appears in a subquery (see
1798 <xref linkend="sql-syntax-scalar-subqueries"/> and
1799 <xref linkend="functions-subquery"/>), the aggregate is normally
1800 evaluated over the rows of the subquery. But an exception occurs
1801 if the aggregate's arguments (and <replaceable>filter_clause</replaceable>
1802 if any) contain only outer-level variables:
1803 the aggregate then belongs to the nearest such outer level, and is
1804 evaluated over the rows of that query. The aggregate expression
1805 as a whole is then an outer reference for the subquery it appears in,
1806 and acts as a constant over any one evaluation of that subquery.
1807 The restriction about
1808 appearing only in the result list or <literal>HAVING</literal> clause
1809 applies with respect to the query level that the aggregate belongs to.
1810 </para>
1811 </sect2>
1813 <sect2 id="syntax-window-functions">
1814 <title>Window Function Calls</title>
1816 <indexterm zone="syntax-window-functions">
1817 <primary>window function</primary>
1818 <secondary>invocation</secondary>
1819 </indexterm>
1821 <indexterm zone="syntax-window-functions">
1822 <primary>OVER clause</primary>
1823 </indexterm>
1825 <para>
1826 A <firstterm>window function call</firstterm> represents the application
1827 of an aggregate-like function over some portion of the rows selected
1828 by a query. Unlike non-window aggregate calls, this is not tied
1829 to grouping of the selected rows into a single output row &mdash; each
1830 row remains separate in the query output. However the window function
1831 has access to all the rows that would be part of the current row's
1832 group according to the grouping specification (<literal>PARTITION BY</literal>
1833 list) of the window function call.
1834 The syntax of a window function call is one of the following:
1836 <synopsis>
1837 <replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ] OVER <replaceable>window_name</replaceable>
1838 <replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ] OVER ( <replaceable class="parameter">window_definition</replaceable> )
1839 <replaceable>function_name</replaceable> ( * ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ] OVER <replaceable>window_name</replaceable>
1840 <replaceable>function_name</replaceable> ( * ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ] OVER ( <replaceable class="parameter">window_definition</replaceable> )
1841 </synopsis>
1842 where <replaceable class="parameter">window_definition</replaceable>
1843 has the syntax
1844 <synopsis>
1845 [ <replaceable class="parameter">existing_window_name</replaceable> ]
1846 [ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
1847 [ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
1848 [ <replaceable class="parameter">frame_clause</replaceable> ]
1849 </synopsis>
1850 The optional <replaceable class="parameter">frame_clause</replaceable>
1851 can be one of
1852 <synopsis>
1853 { RANGE | ROWS | GROUPS } <replaceable>frame_start</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
1854 { RANGE | ROWS | GROUPS } BETWEEN <replaceable>frame_start</replaceable> AND <replaceable>frame_end</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
1855 </synopsis>
1856 where <replaceable>frame_start</replaceable>
1857 and <replaceable>frame_end</replaceable> can be one of
1858 <synopsis>
1859 UNBOUNDED PRECEDING
1860 <replaceable>offset</replaceable> PRECEDING
1861 CURRENT ROW
1862 <replaceable>offset</replaceable> FOLLOWING
1863 UNBOUNDED FOLLOWING
1864 </synopsis>
1865 and <replaceable>frame_exclusion</replaceable> can be one of
1866 <synopsis>
1867 EXCLUDE CURRENT ROW
1868 EXCLUDE GROUP
1869 EXCLUDE TIES
1870 EXCLUDE NO OTHERS
1871 </synopsis>
1872 </para>
1874 <para>
1875 Here, <replaceable>expression</replaceable> represents any value
1876 expression that does not itself contain window function calls.
1877 </para>
1879 <para>
1880 <replaceable>window_name</replaceable> is a reference to a named window
1881 specification defined in the query's <literal>WINDOW</literal> clause.
1882 Alternatively, a full <replaceable>window_definition</replaceable> can
1883 be given within parentheses, using the same syntax as for defining a
1884 named window in the <literal>WINDOW</literal> clause; see the
1885 <xref linkend="sql-select"/> reference page for details. It's worth
1886 pointing out that <literal>OVER wname</literal> is not exactly equivalent to
1887 <literal>OVER (wname ...)</literal>; the latter implies copying and modifying the
1888 window definition, and will be rejected if the referenced window
1889 specification includes a frame clause.
1890 </para>
1892 <para>
1893 The <literal>PARTITION BY</literal> clause groups the rows of the query into
1894 <firstterm>partitions</firstterm>, which are processed separately by the window
1895 function. <literal>PARTITION BY</literal> works similarly to a query-level
1896 <literal>GROUP BY</literal> clause, except that its expressions are always just
1897 expressions and cannot be output-column names or numbers.
1898 Without <literal>PARTITION BY</literal>, all rows produced by the query are
1899 treated as a single partition.
1900 The <literal>ORDER BY</literal> clause determines the order in which the rows
1901 of a partition are processed by the window function. It works similarly
1902 to a query-level <literal>ORDER BY</literal> clause, but likewise cannot use
1903 output-column names or numbers. Without <literal>ORDER BY</literal>, rows are
1904 processed in an unspecified order.
1905 </para>
1907 <para>
1908 The <replaceable class="parameter">frame_clause</replaceable> specifies
1909 the set of rows constituting the <firstterm>window frame</firstterm>, which is a
1910 subset of the current partition, for those window functions that act on
1911 the frame instead of the whole partition. The set of rows in the frame
1912 can vary depending on which row is the current row. The frame can be
1913 specified in <literal>RANGE</literal>, <literal>ROWS</literal>
1914 or <literal>GROUPS</literal> mode; in each case, it runs from
1915 the <replaceable>frame_start</replaceable> to
1916 the <replaceable>frame_end</replaceable>.
1917 If <replaceable>frame_end</replaceable> is omitted, the end defaults
1918 to <literal>CURRENT ROW</literal>.
1919 </para>
1921 <para>
1922 A <replaceable>frame_start</replaceable> of <literal>UNBOUNDED PRECEDING</literal> means
1923 that the frame starts with the first row of the partition, and similarly
1924 a <replaceable>frame_end</replaceable> of <literal>UNBOUNDED FOLLOWING</literal> means
1925 that the frame ends with the last row of the partition.
1926 </para>
1928 <para>
1929 In <literal>RANGE</literal> or <literal>GROUPS</literal> mode,
1930 a <replaceable>frame_start</replaceable> of
1931 <literal>CURRENT ROW</literal> means the frame starts with the current
1932 row's first <firstterm>peer</firstterm> row (a row that the
1933 window's <literal>ORDER BY</literal> clause sorts as equivalent to the
1934 current row), while a <replaceable>frame_end</replaceable> of
1935 <literal>CURRENT ROW</literal> means the frame ends with the current
1936 row's last peer row.
1937 In <literal>ROWS</literal> mode, <literal>CURRENT ROW</literal> simply
1938 means the current row.
1939 </para>
1941 <para>
1942 In the <replaceable>offset</replaceable> <literal>PRECEDING</literal>
1943 and <replaceable>offset</replaceable> <literal>FOLLOWING</literal> frame
1944 options, the <replaceable>offset</replaceable> must be an expression not
1945 containing any variables, aggregate functions, or window functions.
1946 The meaning of the <replaceable>offset</replaceable> depends on the
1947 frame mode:
1948 <itemizedlist>
1949 <listitem>
1950 <para>
1951 In <literal>ROWS</literal> mode,
1952 the <replaceable>offset</replaceable> must yield a non-null,
1953 non-negative integer, and the option means that the frame starts or
1954 ends the specified number of rows before or after the current row.
1955 </para>
1956 </listitem>
1957 <listitem>
1958 <para>
1959 In <literal>GROUPS</literal> mode,
1960 the <replaceable>offset</replaceable> again must yield a non-null,
1961 non-negative integer, and the option means that the frame starts or
1962 ends the specified number of <firstterm>peer groups</firstterm>
1963 before or after the current row's peer group, where a peer group is a
1964 set of rows that are equivalent in the <literal>ORDER BY</literal>
1965 ordering. (There must be an <literal>ORDER BY</literal> clause
1966 in the window definition to use <literal>GROUPS</literal> mode.)
1967 </para>
1968 </listitem>
1969 <listitem>
1970 <para>
1971 In <literal>RANGE</literal> mode, these options require that
1972 the <literal>ORDER BY</literal> clause specify exactly one column.
1973 The <replaceable>offset</replaceable> specifies the maximum
1974 difference between the value of that column in the current row and
1975 its value in preceding or following rows of the frame. The data type
1976 of the <replaceable>offset</replaceable> expression varies depending
1977 on the data type of the ordering column. For numeric ordering
1978 columns it is typically of the same type as the ordering column,
1979 but for datetime ordering columns it is an <type>interval</type>.
1980 For example, if the ordering column is of type <type>date</type>
1981 or <type>timestamp</type>, one could write <literal>RANGE BETWEEN
1982 '1 day' PRECEDING AND '10 days' FOLLOWING</literal>.
1983 The <replaceable>offset</replaceable> is still required to be
1984 non-null and non-negative, though the meaning
1985 of <quote>non-negative</quote> depends on its data type.
1986 </para>
1987 </listitem>
1988 </itemizedlist>
1989 In any case, the distance to the end of the frame is limited by the
1990 distance to the end of the partition, so that for rows near the partition
1991 ends the frame might contain fewer rows than elsewhere.
1992 </para>
1994 <para>
1995 Notice that in both <literal>ROWS</literal> and <literal>GROUPS</literal>
1996 mode, <literal>0 PRECEDING</literal> and <literal>0 FOLLOWING</literal>
1997 are equivalent to <literal>CURRENT ROW</literal>. This normally holds
1998 in <literal>RANGE</literal> mode as well, for an appropriate
1999 data-type-specific meaning of <quote>zero</quote>.
2000 </para>
2002 <para>
2003 The <replaceable>frame_exclusion</replaceable> option allows rows around
2004 the current row to be excluded from the frame, even if they would be
2005 included according to the frame start and frame end options.
2006 <literal>EXCLUDE CURRENT ROW</literal> excludes the current row from the
2007 frame.
2008 <literal>EXCLUDE GROUP</literal> excludes the current row and its
2009 ordering peers from the frame.
2010 <literal>EXCLUDE TIES</literal> excludes any peers of the current
2011 row from the frame, but not the current row itself.
2012 <literal>EXCLUDE NO OTHERS</literal> simply specifies explicitly the
2013 default behavior of not excluding the current row or its peers.
2014 </para>
2016 <para>
2017 The default framing option is <literal>RANGE UNBOUNDED PRECEDING</literal>,
2018 which is the same as <literal>RANGE BETWEEN UNBOUNDED PRECEDING AND
2019 CURRENT ROW</literal>. With <literal>ORDER BY</literal>, this sets the frame to be
2020 all rows from the partition start up through the current row's last
2021 <literal>ORDER BY</literal> peer. Without <literal>ORDER BY</literal>,
2022 this means all rows of the partition are included in the window frame,
2023 since all rows become peers of the current row.
2024 </para>
2026 <para>
2027 Restrictions are that
2028 <replaceable>frame_start</replaceable> cannot be <literal>UNBOUNDED FOLLOWING</literal>,
2029 <replaceable>frame_end</replaceable> cannot be <literal>UNBOUNDED PRECEDING</literal>,
2030 and the <replaceable>frame_end</replaceable> choice cannot appear earlier in the
2031 above list of <replaceable>frame_start</replaceable>
2032 and <replaceable>frame_end</replaceable> options than
2033 the <replaceable>frame_start</replaceable> choice does &mdash; for example
2034 <literal>RANGE BETWEEN CURRENT ROW AND <replaceable>offset</replaceable>
2035 PRECEDING</literal> is not allowed.
2036 But, for example, <literal>ROWS BETWEEN 7 PRECEDING AND 8
2037 PRECEDING</literal> is allowed, even though it would never select any
2038 rows.
2039 </para>
2041 <para>
2042 If <literal>FILTER</literal> is specified, then only the input
2043 rows for which the <replaceable>filter_clause</replaceable>
2044 evaluates to true are fed to the window function; other rows
2045 are discarded. Only window functions that are aggregates accept
2046 a <literal>FILTER</literal> clause.
2047 </para>
2049 <para>
2050 The built-in window functions are described in <xref
2051 linkend="functions-window-table"/>. Other window functions can be added by
2052 the user. Also, any built-in or user-defined general-purpose or
2053 statistical aggregate can be used as a window function. (Ordered-set
2054 and hypothetical-set aggregates cannot presently be used as window functions.)
2055 </para>
2057 <para>
2058 The syntaxes using <literal>*</literal> are used for calling parameter-less
2059 aggregate functions as window functions, for example
2060 <literal>count(*) OVER (PARTITION BY x ORDER BY y)</literal>.
2061 The asterisk (<literal>*</literal>) is customarily not used for
2062 window-specific functions. Window-specific functions do not
2063 allow <literal>DISTINCT</literal> or <literal>ORDER BY</literal> to be used within the
2064 function argument list.
2065 </para>
2067 <para>
2068 Window function calls are permitted only in the <literal>SELECT</literal>
2069 list and the <literal>ORDER BY</literal> clause of the query.
2070 </para>
2072 <para>
2073 More information about window functions can be found in
2074 <xref linkend="tutorial-window"/>,
2075 <xref linkend="functions-window"/>, and
2076 <xref linkend="queries-window"/>.
2077 </para>
2078 </sect2>
2080 <sect2 id="sql-syntax-type-casts">
2081 <title>Type Casts</title>
2083 <indexterm>
2084 <primary>data type</primary>
2085 <secondary>type cast</secondary>
2086 </indexterm>
2088 <indexterm>
2089 <primary>type cast</primary>
2090 </indexterm>
2092 <indexterm>
2093 <primary>::</primary>
2094 </indexterm>
2096 <para>
2097 A type cast specifies a conversion from one data type to another.
2098 <productname>PostgreSQL</productname> accepts two equivalent syntaxes
2099 for type casts:
2100 <synopsis>
2101 CAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable> )
2102 <replaceable>expression</replaceable>::<replaceable>type</replaceable>
2103 </synopsis>
2104 The <literal>CAST</literal> syntax conforms to SQL; the syntax with
2105 <literal>::</literal> is historical <productname>PostgreSQL</productname>
2106 usage.
2107 </para>
2109 <para>
2110 When a cast is applied to a value expression of a known type, it
2111 represents a run-time type conversion. The cast will succeed only
2112 if a suitable type conversion operation has been defined. Notice that this
2113 is subtly different from the use of casts with constants, as shown in
2114 <xref linkend="sql-syntax-constants-generic"/>. A cast applied to an
2115 unadorned string literal represents the initial assignment of a type
2116 to a literal constant value, and so it will succeed for any type
2117 (if the contents of the string literal are acceptable input syntax for the
2118 data type).
2119 </para>
2121 <para>
2122 An explicit type cast can usually be omitted if there is no ambiguity as
2123 to the type that a value expression must produce (for example, when it is
2124 assigned to a table column); the system will automatically apply a
2125 type cast in such cases. However, automatic casting is only done for
2126 casts that are marked <quote>OK to apply implicitly</quote>
2127 in the system catalogs. Other casts must be invoked with
2128 explicit casting syntax. This restriction is intended to prevent
2129 surprising conversions from being applied silently.
2130 </para>
2132 <para>
2133 It is also possible to specify a type cast using a function-like
2134 syntax:
2135 <synopsis>
2136 <replaceable>typename</replaceable> ( <replaceable>expression</replaceable> )
2137 </synopsis>
2138 However, this only works for types whose names are also valid as
2139 function names. For example, <literal>double precision</literal>
2140 cannot be used this way, but the equivalent <literal>float8</literal>
2141 can. Also, the names <literal>interval</literal>, <literal>time</literal>, and
2142 <literal>timestamp</literal> can only be used in this fashion if they are
2143 double-quoted, because of syntactic conflicts. Therefore, the use of
2144 the function-like cast syntax leads to inconsistencies and should
2145 probably be avoided.
2146 </para>
2148 <note>
2149 <para>
2150 The function-like syntax is in fact just a function call. When
2151 one of the two standard cast syntaxes is used to do a run-time
2152 conversion, it will internally invoke a registered function to
2153 perform the conversion. By convention, these conversion functions
2154 have the same name as their output type, and thus the <quote>function-like
2155 syntax</quote> is nothing more than a direct invocation of the underlying
2156 conversion function. Obviously, this is not something that a portable
2157 application should rely on. For further details see
2158 <xref linkend="sql-createcast"/>.
2159 </para>
2160 </note>
2161 </sect2>
2163 <sect2 id="sql-syntax-collate-exprs">
2164 <title>Collation Expressions</title>
2166 <indexterm>
2167 <primary>COLLATE</primary>
2168 </indexterm>
2170 <para>
2171 The <literal>COLLATE</literal> clause overrides the collation of
2172 an expression. It is appended to the expression it applies to:
2173 <synopsis>
2174 <replaceable>expr</replaceable> COLLATE <replaceable>collation</replaceable>
2175 </synopsis>
2176 where <replaceable>collation</replaceable> is a possibly
2177 schema-qualified identifier. The <literal>COLLATE</literal>
2178 clause binds tighter than operators; parentheses can be used when
2179 necessary.
2180 </para>
2182 <para>
2183 If no collation is explicitly specified, the database system
2184 either derives a collation from the columns involved in the
2185 expression, or it defaults to the default collation of the
2186 database if no column is involved in the expression.
2187 </para>
2189 <para>
2190 The two common uses of the <literal>COLLATE</literal> clause are
2191 overriding the sort order in an <literal>ORDER BY</literal> clause, for
2192 example:
2193 <programlisting>
2194 SELECT a, b, c FROM tbl WHERE ... ORDER BY a COLLATE "C";
2195 </programlisting>
2196 and overriding the collation of a function or operator call that
2197 has locale-sensitive results, for example:
2198 <programlisting>
2199 SELECT * FROM tbl WHERE a &gt; 'foo' COLLATE "C";
2200 </programlisting>
2201 Note that in the latter case the <literal>COLLATE</literal> clause is
2202 attached to an input argument of the operator we wish to affect.
2203 It doesn't matter which argument of the operator or function call the
2204 <literal>COLLATE</literal> clause is attached to, because the collation that is
2205 applied by the operator or function is derived by considering all
2206 arguments, and an explicit <literal>COLLATE</literal> clause will override the
2207 collations of all other arguments. (Attaching non-matching
2208 <literal>COLLATE</literal> clauses to more than one argument, however, is an
2209 error. For more details see <xref linkend="collation"/>.)
2210 Thus, this gives the same result as the previous example:
2211 <programlisting>
2212 SELECT * FROM tbl WHERE a COLLATE "C" &gt; 'foo';
2213 </programlisting>
2214 But this is an error:
2215 <programlisting>
2216 SELECT * FROM tbl WHERE (a &gt; 'foo') COLLATE "C";
2217 </programlisting>
2218 because it attempts to apply a collation to the result of the
2219 <literal>&gt;</literal> operator, which is of the non-collatable data type
2220 <type>boolean</type>.
2221 </para>
2222 </sect2>
2224 <sect2 id="sql-syntax-scalar-subqueries">
2225 <title>Scalar Subqueries</title>
2227 <indexterm>
2228 <primary>subquery</primary>
2229 </indexterm>
2231 <para>
2232 A scalar subquery is an ordinary
2233 <command>SELECT</command> query in parentheses that returns exactly one
2234 row with one column. (See <xref linkend="queries"/> for information about writing queries.)
2235 The <command>SELECT</command> query is executed
2236 and the single returned value is used in the surrounding value expression.
2237 It is an error to use a query that
2238 returns more than one row or more than one column as a scalar subquery.
2239 (But if, during a particular execution, the subquery returns no rows,
2240 there is no error; the scalar result is taken to be null.)
2241 The subquery can refer to variables from the surrounding query,
2242 which will act as constants during any one evaluation of the subquery.
2243 See also <xref linkend="functions-subquery"/> for other expressions involving subqueries.
2244 </para>
2246 <para>
2247 For example, the following finds the largest city population in each
2248 state:
2249 <programlisting>
2250 SELECT name, (SELECT max(pop) FROM cities WHERE cities.state = states.name)
2251 FROM states;
2252 </programlisting>
2253 </para>
2254 </sect2>
2256 <sect2 id="sql-syntax-array-constructors">
2257 <title>Array Constructors</title>
2259 <indexterm>
2260 <primary>array</primary>
2261 <secondary>constructor</secondary>
2262 </indexterm>
2264 <indexterm>
2265 <primary>ARRAY</primary>
2266 </indexterm>
2268 <para>
2269 An array constructor is an expression that builds an
2270 array value using values for its member elements. A simple array
2271 constructor
2272 consists of the key word <literal>ARRAY</literal>, a left square bracket
2273 <literal>[</literal>, a list of expressions (separated by commas) for the
2274 array element values, and finally a right square bracket <literal>]</literal>.
2275 For example:
2276 <programlisting>
2277 SELECT ARRAY[1,2,3+4];
2278 array
2279 ---------
2280 {1,2,7}
2281 (1 row)
2282 </programlisting>
2283 By default,
2284 the array element type is the common type of the member expressions,
2285 determined using the same rules as for <literal>UNION</literal> or
2286 <literal>CASE</literal> constructs (see <xref linkend="typeconv-union-case"/>).
2287 You can override this by explicitly casting the array constructor to the
2288 desired type, for example:
2289 <programlisting>
2290 SELECT ARRAY[1,2,22.7]::integer[];
2291 array
2292 ----------
2293 {1,2,23}
2294 (1 row)
2295 </programlisting>
2296 This has the same effect as casting each expression to the array
2297 element type individually.
2298 For more on casting, see <xref linkend="sql-syntax-type-casts"/>.
2299 </para>
2301 <para>
2302 Multidimensional array values can be built by nesting array
2303 constructors.
2304 In the inner constructors, the key word <literal>ARRAY</literal> can
2305 be omitted. For example, these produce the same result:
2307 <programlisting>
2308 SELECT ARRAY[ARRAY[1,2], ARRAY[3,4]];
2309 array
2310 ---------------
2311 {{1,2},{3,4}}
2312 (1 row)
2314 SELECT ARRAY[[1,2],[3,4]];
2315 array
2316 ---------------
2317 {{1,2},{3,4}}
2318 (1 row)
2319 </programlisting>
2321 Since multidimensional arrays must be rectangular, inner constructors
2322 at the same level must produce sub-arrays of identical dimensions.
2323 Any cast applied to the outer <literal>ARRAY</literal> constructor propagates
2324 automatically to all the inner constructors.
2325 </para>
2327 <para>
2328 Multidimensional array constructor elements can be anything yielding
2329 an array of the proper kind, not only a sub-<literal>ARRAY</literal> construct.
2330 For example:
2331 <programlisting>
2332 CREATE TABLE arr(f1 int[], f2 int[]);
2334 INSERT INTO arr VALUES (ARRAY[[1,2],[3,4]], ARRAY[[5,6],[7,8]]);
2336 SELECT ARRAY[f1, f2, '{{9,10},{11,12}}'::int[]] FROM arr;
2337 array
2338 ------------------------------------------------
2339 {{{1,2},{3,4}},{{5,6},{7,8}},{{9,10},{11,12}}}
2340 (1 row)
2341 </programlisting>
2342 </para>
2344 <para>
2345 You can construct an empty array, but since it's impossible to have an
2346 array with no type, you must explicitly cast your empty array to the
2347 desired type. For example:
2348 <programlisting>
2349 SELECT ARRAY[]::integer[];
2350 array
2351 -------
2353 (1 row)
2354 </programlisting>
2355 </para>
2357 <para>
2358 It is also possible to construct an array from the results of a
2359 subquery. In this form, the array constructor is written with the
2360 key word <literal>ARRAY</literal> followed by a parenthesized (not
2361 bracketed) subquery. For example:
2362 <programlisting>
2363 SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
2364 array
2365 ------------------------------------------------------------------
2366 {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412}
2367 (1 row)
2369 SELECT ARRAY(SELECT ARRAY[i, i*2] FROM generate_series(1,5) AS a(i));
2370 array
2371 ----------------------------------
2372 {{1,2},{2,4},{3,6},{4,8},{5,10}}
2373 (1 row)
2374 </programlisting>
2375 The subquery must return a single column.
2376 If the subquery's output column is of a non-array type, the resulting
2377 one-dimensional array will have an element for each row in the
2378 subquery result, with an element type matching that of the
2379 subquery's output column.
2380 If the subquery's output column is of an array type, the result will be
2381 an array of the same type but one higher dimension; in this case all
2382 the subquery rows must yield arrays of identical dimensionality, else
2383 the result would not be rectangular.
2384 </para>
2386 <para>
2387 The subscripts of an array value built with <literal>ARRAY</literal>
2388 always begin with one. For more information about arrays, see
2389 <xref linkend="arrays"/>.
2390 </para>
2392 </sect2>
2394 <sect2 id="sql-syntax-row-constructors">
2395 <title>Row Constructors</title>
2397 <indexterm>
2398 <primary>composite type</primary>
2399 <secondary>constructor</secondary>
2400 </indexterm>
2402 <indexterm>
2403 <primary>row type</primary>
2404 <secondary>constructor</secondary>
2405 </indexterm>
2407 <indexterm>
2408 <primary>ROW</primary>
2409 </indexterm>
2411 <para>
2412 A row constructor is an expression that builds a row value (also
2413 called a composite value) using values
2414 for its member fields. A row constructor consists of the key word
2415 <literal>ROW</literal>, a left parenthesis, zero or more
2416 expressions (separated by commas) for the row field values, and finally
2417 a right parenthesis. For example:
2418 <programlisting>
2419 SELECT ROW(1,2.5,'this is a test');
2420 </programlisting>
2421 The key word <literal>ROW</literal> is optional when there is more than one
2422 expression in the list.
2423 </para>
2425 <para>
2426 A row constructor can include the syntax
2427 <replaceable>rowvalue</replaceable><literal>.*</literal>,
2428 which will be expanded to a list of the elements of the row value,
2429 just as occurs when the <literal>.*</literal> syntax is used at the top level
2430 of a <command>SELECT</command> list (see <xref linkend="rowtypes-usage"/>).
2431 For example, if table <literal>t</literal> has
2432 columns <literal>f1</literal> and <literal>f2</literal>, these are the same:
2433 <programlisting>
2434 SELECT ROW(t.*, 42) FROM t;
2435 SELECT ROW(t.f1, t.f2, 42) FROM t;
2436 </programlisting>
2437 </para>
2439 <note>
2440 <para>
2441 Before <productname>PostgreSQL</productname> 8.2, the
2442 <literal>.*</literal> syntax was not expanded in row constructors, so
2443 that writing <literal>ROW(t.*, 42)</literal> created a two-field row whose first
2444 field was another row value. The new behavior is usually more useful.
2445 If you need the old behavior of nested row values, write the inner
2446 row value without <literal>.*</literal>, for instance
2447 <literal>ROW(t, 42)</literal>.
2448 </para>
2449 </note>
2451 <para>
2452 By default, the value created by a <literal>ROW</literal> expression is of
2453 an anonymous record type. If necessary, it can be cast to a named
2454 composite type &mdash; either the row type of a table, or a composite type
2455 created with <command>CREATE TYPE AS</command>. An explicit cast might be needed
2456 to avoid ambiguity. For example:
2457 <programlisting>
2458 CREATE TABLE mytable(f1 int, f2 float, f3 text);
2460 CREATE FUNCTION getf1(mytable) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
2462 -- No cast needed since only one getf1() exists
2463 SELECT getf1(ROW(1,2.5,'this is a test'));
2464 getf1
2465 -------
2467 (1 row)
2469 CREATE TYPE myrowtype AS (f1 int, f2 text, f3 numeric);
2471 CREATE FUNCTION getf1(myrowtype) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
2473 -- Now we need a cast to indicate which function to call:
2474 SELECT getf1(ROW(1,2.5,'this is a test'));
2475 ERROR: function getf1(record) is not unique
2477 SELECT getf1(ROW(1,2.5,'this is a test')::mytable);
2478 getf1
2479 -------
2481 (1 row)
2483 SELECT getf1(CAST(ROW(11,'this is a test',2.5) AS myrowtype));
2484 getf1
2485 -------
2487 (1 row)
2488 </programlisting>
2489 </para>
2491 <para>
2492 Row constructors can be used to build composite values to be stored
2493 in a composite-type table column, or to be passed to a function that
2494 accepts a composite parameter. Also, it is possible to test rows
2495 using the standard comparison operators as described in <xref
2496 linkend="functions-comparison"/>, to compare one row against another
2497 as described in <xref linkend="functions-comparisons"/>, and to
2498 use them in connection with subqueries, as discussed in <xref
2499 linkend="functions-subquery"/>.
2500 </para>
2502 </sect2>
2504 <sect2 id="syntax-express-eval">
2505 <title>Expression Evaluation Rules</title>
2507 <indexterm>
2508 <primary>expression</primary>
2509 <secondary>order of evaluation</secondary>
2510 </indexterm>
2512 <para>
2513 The order of evaluation of subexpressions is not defined. In
2514 particular, the inputs of an operator or function are not necessarily
2515 evaluated left-to-right or in any other fixed order.
2516 </para>
2518 <para>
2519 Furthermore, if the result of an expression can be determined by
2520 evaluating only some parts of it, then other subexpressions
2521 might not be evaluated at all. For instance, if one wrote:
2522 <programlisting>
2523 SELECT true OR somefunc();
2524 </programlisting>
2525 then <literal>somefunc()</literal> would (probably) not be called
2526 at all. The same would be the case if one wrote:
2527 <programlisting>
2528 SELECT somefunc() OR true;
2529 </programlisting>
2530 Note that this is not the same as the left-to-right
2531 <quote>short-circuiting</quote> of Boolean operators that is found
2532 in some programming languages.
2533 </para>
2535 <para>
2536 As a consequence, it is unwise to use functions with side effects
2537 as part of complex expressions. It is particularly dangerous to
2538 rely on side effects or evaluation order in <literal>WHERE</literal> and <literal>HAVING</literal> clauses,
2539 since those clauses are extensively reprocessed as part of
2540 developing an execution plan. Boolean
2541 expressions (<literal>AND</literal>/<literal>OR</literal>/<literal>NOT</literal> combinations) in those clauses can be reorganized
2542 in any manner allowed by the laws of Boolean algebra.
2543 </para>
2545 <para>
2546 When it is essential to force evaluation order, a <literal>CASE</literal>
2547 construct (see <xref linkend="functions-conditional"/>) can be
2548 used. For example, this is an untrustworthy way of trying to
2549 avoid division by zero in a <literal>WHERE</literal> clause:
2550 <programlisting>
2551 SELECT ... WHERE x &gt; 0 AND y/x &gt; 1.5;
2552 </programlisting>
2553 But this is safe:
2554 <programlisting>
2555 SELECT ... WHERE CASE WHEN x &gt; 0 THEN y/x &gt; 1.5 ELSE false END;
2556 </programlisting>
2557 A <literal>CASE</literal> construct used in this fashion will defeat optimization
2558 attempts, so it should only be done when necessary. (In this particular
2559 example, it would be better to sidestep the problem by writing
2560 <literal>y &gt; 1.5*x</literal> instead.)
2561 </para>
2563 <para>
2564 <literal>CASE</literal> is not a cure-all for such issues, however.
2565 One limitation of the technique illustrated above is that it does not
2566 prevent early evaluation of constant subexpressions.
2567 As described in <xref linkend="xfunc-volatility"/>, functions and
2568 operators marked <literal>IMMUTABLE</literal> can be evaluated when
2569 the query is planned rather than when it is executed. Thus for example
2570 <programlisting>
2571 SELECT CASE WHEN x &gt; 0 THEN x ELSE 1/0 END FROM tab;
2572 </programlisting>
2573 is likely to result in a division-by-zero failure due to the planner
2574 trying to simplify the constant subexpression,
2575 even if every row in the table has <literal>x &gt; 0</literal> so that the
2576 <literal>ELSE</literal> arm would never be entered at run time.
2577 </para>
2579 <para>
2580 While that particular example might seem silly, related cases that don't
2581 obviously involve constants can occur in queries executed within
2582 functions, since the values of function arguments and local variables
2583 can be inserted into queries as constants for planning purposes.
2584 Within <application>PL/pgSQL</application> functions, for example, using an
2585 <literal>IF</literal>-<literal>THEN</literal>-<literal>ELSE</literal> statement to protect
2586 a risky computation is much safer than just nesting it in a
2587 <literal>CASE</literal> expression.
2588 </para>
2590 <para>
2591 Another limitation of the same kind is that a <literal>CASE</literal> cannot
2592 prevent evaluation of an aggregate expression contained within it,
2593 because aggregate expressions are computed before other
2594 expressions in a <literal>SELECT</literal> list or <literal>HAVING</literal> clause
2595 are considered. For example, the following query can cause a
2596 division-by-zero error despite seemingly having protected against it:
2597 <programlisting>
2598 SELECT CASE WHEN min(employees) > 0
2599 THEN avg(expenses / employees)
2601 FROM departments;
2602 </programlisting>
2603 The <function>min()</function> and <function>avg()</function> aggregates are computed
2604 concurrently over all the input rows, so if any row
2605 has <structfield>employees</structfield> equal to zero, the division-by-zero error
2606 will occur before there is any opportunity to test the result of
2607 <function>min()</function>. Instead, use a <literal>WHERE</literal>
2608 or <literal>FILTER</literal> clause to prevent problematic input rows from
2609 reaching an aggregate function in the first place.
2610 </para>
2611 </sect2>
2612 </sect1>
2614 <sect1 id="sql-syntax-calling-funcs">
2615 <title>Calling Functions</title>
2617 <indexterm zone="sql-syntax-calling-funcs">
2618 <primary>notation</primary>
2619 <secondary>functions</secondary>
2620 </indexterm>
2622 <para>
2623 <productname>PostgreSQL</productname> allows functions that have named
2624 parameters to be called using either <firstterm>positional</firstterm> or
2625 <firstterm>named</firstterm> notation. Named notation is especially
2626 useful for functions that have a large number of parameters, since it
2627 makes the associations between parameters and actual arguments more
2628 explicit and reliable.
2629 In positional notation, a function call is written with
2630 its argument values in the same order as they are defined in the function
2631 declaration. In named notation, the arguments are matched to the
2632 function parameters by name and can be written in any order.
2633 For each notation, also consider the effect of function argument types,
2634 documented in <xref linkend="typeconv-func"/>.
2635 </para>
2637 <para>
2638 In either notation, parameters that have default values given in the
2639 function declaration need not be written in the call at all. But this
2640 is particularly useful in named notation, since any combination of
2641 parameters can be omitted; while in positional notation parameters can
2642 only be omitted from right to left.
2643 </para>
2645 <para>
2646 <productname>PostgreSQL</productname> also supports
2647 <firstterm>mixed</firstterm> notation, which combines positional and
2648 named notation. In this case, positional parameters are written first
2649 and named parameters appear after them.
2650 </para>
2652 <para>
2653 The following examples will illustrate the usage of all three
2654 notations, using the following function definition:
2655 <programlisting>
2656 CREATE FUNCTION concat_lower_or_upper(a text, b text, uppercase boolean DEFAULT false)
2657 RETURNS text
2660 SELECT CASE
2661 WHEN $3 THEN UPPER($1 || ' ' || $2)
2662 ELSE LOWER($1 || ' ' || $2)
2663 END;
2665 LANGUAGE SQL IMMUTABLE STRICT;
2666 </programlisting>
2667 Function <function>concat_lower_or_upper</function> has two mandatory
2668 parameters, <literal>a</literal> and <literal>b</literal>. Additionally
2669 there is one optional parameter <literal>uppercase</literal> which defaults
2670 to <literal>false</literal>. The <literal>a</literal> and
2671 <literal>b</literal> inputs will be concatenated, and forced to either
2672 upper or lower case depending on the <literal>uppercase</literal>
2673 parameter. The remaining details of this function
2674 definition are not important here (see <xref linkend="extend"/> for
2675 more information).
2676 </para>
2678 <sect2 id="sql-syntax-calling-funcs-positional">
2679 <title>Using Positional Notation</title>
2681 <indexterm>
2682 <primary>function</primary>
2683 <secondary>positional notation</secondary>
2684 </indexterm>
2686 <para>
2687 Positional notation is the traditional mechanism for passing arguments
2688 to functions in <productname>PostgreSQL</productname>. An example is:
2689 <screen>
2690 SELECT concat_lower_or_upper('Hello', 'World', true);
2691 concat_lower_or_upper
2692 -----------------------
2693 HELLO WORLD
2694 (1 row)
2695 </screen>
2696 All arguments are specified in order. The result is upper case since
2697 <literal>uppercase</literal> is specified as <literal>true</literal>.
2698 Another example is:
2699 <screen>
2700 SELECT concat_lower_or_upper('Hello', 'World');
2701 concat_lower_or_upper
2702 -----------------------
2703 hello world
2704 (1 row)
2705 </screen>
2706 Here, the <literal>uppercase</literal> parameter is omitted, so it
2707 receives its default value of <literal>false</literal>, resulting in
2708 lower case output. In positional notation, arguments can be omitted
2709 from right to left so long as they have defaults.
2710 </para>
2711 </sect2>
2713 <sect2 id="sql-syntax-calling-funcs-named">
2714 <title>Using Named Notation</title>
2716 <indexterm>
2717 <primary>function</primary>
2718 <secondary>named notation</secondary>
2719 </indexterm>
2721 <para>
2722 In named notation, each argument's name is specified using
2723 <literal>=&gt;</literal> to separate it from the argument expression.
2724 For example:
2725 <screen>
2726 SELECT concat_lower_or_upper(a =&gt; 'Hello', b =&gt; 'World');
2727 concat_lower_or_upper
2728 -----------------------
2729 hello world
2730 (1 row)
2731 </screen>
2732 Again, the argument <literal>uppercase</literal> was omitted
2733 so it is set to <literal>false</literal> implicitly. One advantage of
2734 using named notation is that the arguments may be specified in any
2735 order, for example:
2736 <screen>
2737 SELECT concat_lower_or_upper(a =&gt; 'Hello', b =&gt; 'World', uppercase =&gt; true);
2738 concat_lower_or_upper
2739 -----------------------
2740 HELLO WORLD
2741 (1 row)
2743 SELECT concat_lower_or_upper(a =&gt; 'Hello', uppercase =&gt; true, b =&gt; 'World');
2744 concat_lower_or_upper
2745 -----------------------
2746 HELLO WORLD
2747 (1 row)
2748 </screen>
2749 </para>
2751 <para>
2752 An older syntax based on ":=" is supported for backward compatibility:
2753 <screen>
2754 SELECT concat_lower_or_upper(a := 'Hello', uppercase := true, b := 'World');
2755 concat_lower_or_upper
2756 -----------------------
2757 HELLO WORLD
2758 (1 row)
2759 </screen>
2760 </para>
2761 </sect2>
2763 <sect2 id="sql-syntax-calling-funcs-mixed">
2764 <title>Using Mixed Notation</title>
2766 <indexterm>
2767 <primary>function</primary>
2768 <secondary>mixed notation</secondary>
2769 </indexterm>
2771 <para>
2772 The mixed notation combines positional and named notation. However, as
2773 already mentioned, named arguments cannot precede positional arguments.
2774 For example:
2775 <screen>
2776 SELECT concat_lower_or_upper('Hello', 'World', uppercase =&gt; true);
2777 concat_lower_or_upper
2778 -----------------------
2779 HELLO WORLD
2780 (1 row)
2781 </screen>
2782 In the above query, the arguments <literal>a</literal> and
2783 <literal>b</literal> are specified positionally, while
2784 <literal>uppercase</literal> is specified by name. In this example,
2785 that adds little except documentation. With a more complex function
2786 having numerous parameters that have default values, named or mixed
2787 notation can save a great deal of writing and reduce chances for error.
2788 </para>
2790 <note>
2791 <para>
2792 Named and mixed call notations currently cannot be used when calling an
2793 aggregate function (but they do work when an aggregate function is used
2794 as a window function).
2795 </para>
2796 </note>
2797 </sect2>
2798 </sect1>
2800 </chapter>