   1 <!-- doc/src/sgml/syntax.sgml -->
   2
   3 <chapter id="sql-syntax">
   4  <title>SQL Syntax</title>
   5
   6  <indexterm zone="sql-syntax">
   7   <primary>syntax</primary>
   8   <secondary>SQL</secondary>
   9  </indexterm>
  10
  11  <para>
  12   This chapter describes the syntax of SQL.  It forms the foundation
  13   for understanding the following chapters which will go into detail
  14   about how SQL commands are applied to define and modify data.
  15  </para>
  16
  17  <para>
  18   We also advise users who are already familiar with SQL to read this
  19   chapter carefully because it contains several rules and concepts that
  20   are implemented inconsistently among SQL databases or that are
  21   specific to <productname>PostgreSQL</productname>.
  22  </para>
  23
  24  <sect1 id="sql-syntax-lexical">
  25   <title>Lexical Structure</title>
  26
  27   <indexterm>
  28    <primary>token</primary>
  29   </indexterm>
  30
  31   <para>
  32    SQL input consists of a sequence of
  33    <firstterm>commands</firstterm>.  A command is composed of a
  34    sequence of <firstterm>tokens</firstterm>, terminated by a
  35    semicolon (<quote>;</quote>).  The end of the input stream also
  36    terminates a command.  Which tokens are valid depends on the syntax
  37    of the particular command.
  38   </para>
  39
  40   <para>
  41    A token can be a <firstterm>key word</firstterm>, an
  42    <firstterm>identifier</firstterm>, a <firstterm>quoted
  43    identifier</firstterm>, a <firstterm>literal</firstterm> (or
  44    constant), or a special character symbol.  Tokens are normally
  45    separated by whitespace (space, tab, newline), but need not be if
  46    there is no ambiguity (which is generally only the case if a
  47    special character is adjacent to some other token type).
  48   </para>
  49
  50    <para>
  51     For example, the following is (syntactically) valid SQL input:
  52 <programlisting>
  53 SELECT * FROM MY_TABLE;
  54 UPDATE MY_TABLE SET A = 5;
  55 INSERT INTO MY_TABLE VALUES (3, 'hi there');
  56 </programlisting>
  57     This is a sequence of three commands, one per line (although this
  58     is not required; more than one command can be on a line, and
  59     commands can usefully be split across lines).
  60    </para>
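   <para>
    As an illustrative sketch of that flexibility (reusing the
    hypothetical table <literal>MY_TABLE</literal> from the example
    above), the same three commands could also be laid out like this:
<programlisting>
SELECT * FROM MY_TABLE; UPDATE MY_TABLE SET A = 5;
INSERT INTO MY_TABLE
    VALUES (3, 'hi there');
</programlisting>
    The first line holds two commands, and the third command is split
    across two lines.  The semicolons, not the line breaks, mark where
    each command ends.
   </para>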
  61
  62   <para>
   Additionally, <firstterm>comments</firstterm> can occur in SQL
   input.  They are not tokens; they are effectively equivalent to
   whitespace.
  66   </para>
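  <para>
   As a brief sketch, assuming the two usual comment forms (a double
   dash running to the end of the line, and a C-style block comment),
   the following input contains exactly one command:
<programlisting>
-- a comment that runs to the end of the line
SELECT * /* a block comment between two tokens */ FROM MY_TABLE;
</programlisting>
   Both comments are discarded before the command is analyzed, just as
   whitespace would be.
  </para>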
  67
  68   <para>
   The SQL syntax is not very consistent regarding which tokens
   identify commands and which are operands or parameters.  The first
  71    few tokens are generally the command name, so in the above example
  72    we would usually speak of a <quote>SELECT</quote>, an
  73    <quote>UPDATE</quote>, and an <quote>INSERT</quote> command.  But
  74    for instance the <command>UPDATE</command> command always requires
  75    a <token>SET</token> token to appear in a certain position, and
  76    this particular variation of <command>INSERT</command> also
  77    requires a <token>VALUES</token> in order to be complete.  The
  78    precise syntax rules for each command are described in <xref linkend="reference"/>.
  79   </para>
  80
  81   <sect2 id="sql-syntax-identifiers">
  82    <title>Identifiers and Key Words</title>
  83
  84    <indexterm zone="sql-syntax-identifiers">
  85     <primary>identifier</primary>
  86     <secondary>syntax of</secondary>
  87    </indexterm>
  88
  89    <indexterm zone="sql-syntax-identifiers">
  90     <primary>name</primary>
  91     <secondary>syntax of</secondary>
  92    </indexterm>
  93
  94    <indexterm zone="sql-syntax-identifiers">
  95     <primary>key word</primary>
  96     <secondary>syntax of</secondary>
  97    </indexterm>
  98
  99    <para>
 100     Tokens such as <token>SELECT</token>, <token>UPDATE</token>, or
 101     <token>VALUES</token> in the example above are examples of
 102     <firstterm>key words</firstterm>, that is, words that have a fixed
 103     meaning in the SQL language.  The tokens <token>MY_TABLE</token>
 104     and <token>A</token> are examples of
 105     <firstterm>identifiers</firstterm>.  They identify names of
 106     tables, columns, or other database objects, depending on the
 107     command they are used in.  Therefore they are sometimes simply
 108     called <quote>names</quote>.  Key words and identifiers have the
 109     same lexical structure, meaning that one cannot know whether a
 110     token is an identifier or a key word without knowing the language.
 111     A complete list of key words can be found in <xref
 112     linkend="sql-keywords-appendix"/>.
 113    </para>
 114
 115    <para>
 116     SQL identifiers and key words must begin with a letter
 117     (<literal>a</literal>-<literal>z</literal>, but also letters with
 118     diacritical marks and non-Latin letters) or an underscore
 119     (<literal>_</literal>).  Subsequent characters in an identifier or
 120     key word can be letters, underscores, digits
 121     (<literal>0</literal>-<literal>9</literal>), or dollar signs
 122     (<literal>$</literal>).  Note that dollar signs are not allowed in identifiers
 123     according to the letter of the SQL standard, so their use might render
 124     applications less portable.
 125     The SQL standard will not define a key word that contains
 126     digits or starts or ends with an underscore, so identifiers of this
 127     form are safe against possible conflict with future extensions of the
 128     standard.
 129    </para>
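   <para>
    The following sketch applies these rules to a few hypothetical
    names; none of them come from a real schema:
<programlisting>
CREATE TABLE _scratch1 (amount$ integer);  -- valid, but the dollar sign is not portable
CREATE TABLE café_2 (id integer);          -- valid: letters with diacritical marks are allowed
CREATE TABLE 2fast (id integer);           -- error: a name cannot begin with a digit
</programlisting>
   </para>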
 130
 131    <para>
 132     <indexterm><primary>identifier</primary><secondary>length</secondary></indexterm>
 133     The system uses no more than <symbol>NAMEDATALEN</symbol>-1
 134     bytes of an identifier; longer names can be written in
 135     commands, but they will be truncated.  By default,
 136     <symbol>NAMEDATALEN</symbol> is 64 so the maximum identifier
 137     length is 63 bytes. If this limit is problematic, it can be raised by
 138     changing the <symbol>NAMEDATALEN</symbol> constant in
 139     <filename>src/include/pg_config_manual.h</filename>.
 140    </para>
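   <para>
    As a sketch of the truncation, assuming the default
    <symbol>NAMEDATALEN</symbol> of 64, the hypothetical 70-character
    table name below is stored as its first 63 characters, so the
    second command refers to the same table:
<programlisting>
CREATE TABLE a123456789b123456789c123456789d123456789e123456789f123456789g123456789 (id integer);
SELECT * FROM a123456789b123456789c123456789d123456789e123456789f123456789g12;
</programlisting>
    Because the limit is counted in bytes, a name containing multibyte
    characters is cut off after fewer than 63 characters.
   </para>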
 141
 142    <para>
 143     <indexterm>
 144      <primary>case sensitivity</primary>
 145      <secondary>of SQL commands</secondary>
 146     </indexterm>
 147     Key words and unquoted identifiers are case-insensitive.  Therefore:
 148 <programlisting>
 149 UPDATE MY_TABLE SET A = 5;
 150 </programlisting>
 151     can equivalently be written as:
 152 <programlisting>
 153 uPDaTE my_TabLE SeT a = 5;
 154 </programlisting>
 155     A convention often used is to write key words in upper
 156     case and names in lower case, e.g.:
 157 <programlisting>
 158 UPDATE my_table SET a = 5;
 159 </programlisting>
 160    </para>
 161
 162    <para>
 163     <indexterm>
 164      <primary>quotation marks</primary>
 165      <secondary>and identifiers</secondary>
 166     </indexterm>
 167     There is a second kind of identifier:  the <firstterm>delimited
 168     identifier</firstterm> or <firstterm>quoted
 169     identifier</firstterm>.  It is formed by enclosing an arbitrary
 170     sequence of characters in double-quotes
 171     (<literal>"</literal>). <!-- " font-lock mania --> A delimited
 172     identifier is always an identifier, never a key word.  So
 173     <literal>"select"</literal> could be used to refer to a column or
 174     table named <quote>select</quote>, whereas an unquoted
 175     <literal>select</literal> would be taken as a key word and
 176     would therefore provoke a parse error when used where a table or
 177     column name is expected.  The example can be written with quoted
 178     identifiers like this:
 179 <programlisting>
 180 UPDATE "my_table" SET "a" = 5;
 181 </programlisting>
 182    </para>
 183
 184    <para>
 185     Quoted identifiers can contain any character, except the character
 186     with code zero.  (To include a double quote, write two double quotes.)
 187     This allows constructing table or column names that would
 188     otherwise not be possible, such as ones containing spaces or
 189     ampersands.  The length limitation still applies.
 190    </para>
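   <para>
    For instance, the following sketch (all names are hypothetical)
    uses a key word, a space, an ampersand, and an embedded double
    quote in names:
<programlisting>
CREATE TABLE "select" (
    "unit price" numeric,
    "R&amp;D" text,
    "say ""hi""" text     -- the column name is: say "hi"
);
SELECT "unit price", "R&amp;D", "say ""hi""" FROM "select";
</programlisting>
   </para>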
 191
 192    <para>
 193     Quoting an identifier also makes it case-sensitive, whereas
 194     unquoted names are always folded to lower case.  For example, the
 195     identifiers <literal>FOO</literal>, <literal>foo</literal>, and
 196     <literal>"foo"</literal> are considered the same by
 197     <productname>PostgreSQL</productname>, but
 198     <literal>"Foo"</literal> and <literal>"FOO"</literal> are
 199     different from these three and each other.  (The folding of
 200     unquoted names to lower case in <productname>PostgreSQL</productname> is
 201     incompatible with the SQL standard, which says that unquoted names
 202     should be folded to upper case.  Thus, <literal>foo</literal>
 203     should be equivalent to <literal>"FOO"</literal> not
 204     <literal>"foo"</literal> according to the standard.  If you want
 205     to write portable applications you are advised to always quote a
 206     particular name or never quote it.)
 207    </para>
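   <para>
    The folding rule can be sketched with a hypothetical table as
    follows:
<programlisting>
CREATE TABLE Accounts (id integer);  -- the name is folded to accounts
SELECT * FROM ACCOUNTS;              -- works: also folded to accounts
SELECT * FROM "accounts";            -- works: exact match of the stored name
SELECT * FROM "Accounts";            -- fails: no table has that exact name
</programlisting>
   </para>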
 208
 209    <indexterm>
 210      <primary>Unicode escape</primary>
 211      <secondary>in identifiers</secondary>
 212    </indexterm>
 213
 214    <para>
 215     A variant of quoted
 216     identifiers allows including escaped Unicode characters identified
 217     by their code points.  This variant starts
 218     with <literal>U&amp;</literal> (upper or lower case U followed by
 219     ampersand) immediately before the opening double quote, without
 220     any spaces in between, for example <literal>U&amp;"foo"</literal>.
 221     (Note that this creates an ambiguity with the
 222     operator <literal>&amp;</literal>.  Use spaces around the operator to
 223     avoid this problem.)  Inside the quotes, Unicode characters can be
 224     specified in escaped form by writing a backslash followed by the
 225     four-digit hexadecimal code point number or alternatively a
 226     backslash followed by a plus sign followed by a six-digit
 227     hexadecimal code point number.  For example, the
 228     identifier <literal>"data"</literal> could be written as
 229 <programlisting>
 230 U&amp;"d\0061t\+000061"
 231 </programlisting>
 232     The following less trivial example writes the Russian
 233     word <quote>slon</quote> (elephant) in Cyrillic letters:
 234 <programlisting>
 235 U&amp;"\0441\043B\043E\043D"
 236 </programlisting>
 237    </para>
 238
 239    <para>
    If an escape character other than backslash is desired, it can
 241     be specified using
 242     the <literal>UESCAPE</literal><indexterm><primary>UESCAPE</primary></indexterm>
 243     clause after the string, for example:
 244 <programlisting>
 245 U&amp;"d!0061t!+000061" UESCAPE '!'
 246 </programlisting>
 247     The escape character can be any single character other than a
 248     hexadecimal digit, the plus sign, a single quote, a double quote,
 249     or a whitespace character.  Note that the escape character is
 250     written in single quotes, not double quotes,
 251     after <literal>UESCAPE</literal>.
 252    </para>
 253
 254    <para>
 255     To include the escape character in the identifier literally, write
 256     it twice.
 257    </para>
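   <para>
    For example, assuming the default backslash escape character, this
    sketch writes an identifier consisting of the three characters
    <literal>a\b</literal>:
<programlisting>
U&amp;"a\\b"
</programlisting>
   </para>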
 258
 259    <para>
 260     Either the 4-digit or the 6-digit escape form can be used to
 261     specify UTF-16 surrogate pairs to compose characters with code
 262     points larger than U+FFFF, although the availability of the
 263     6-digit form technically makes this unnecessary.  (Surrogate
 264     pairs are not stored directly, but are combined into a single
 265     code point.)
 266    </para>
 267
 268    <para>
 269     If the server encoding is not UTF-8, the Unicode code point identified
 270     by one of these escape sequences is converted to the actual server
 271     encoding; an error is reported if that's not possible.
 272    </para>
 273   </sect2>
 274
 275
 276   <sect2 id="sql-syntax-constants">
 277    <title>Constants</title>
 278
 279    <indexterm zone="sql-syntax-constants">
 280     <primary>constant</primary>
 281    </indexterm>
 282
 283    <para>
 284     There are three kinds of <firstterm>implicitly-typed
 285     constants</firstterm> in <productname>PostgreSQL</productname>:
 286     strings, bit strings, and numbers.
 287     Constants can also be specified with explicit types, which can
 288     enable more accurate representation and more efficient handling by
 289     the system. These alternatives are discussed in the following
 290     subsections.
 291    </para>
 292
 293    <sect3 id="sql-syntax-strings">
 294     <title>String Constants</title>
 295
 296     <indexterm zone="sql-syntax-strings">
 297      <primary>character string</primary>
 298      <secondary>constant</secondary>
 299     </indexterm>
 300
 301     <para>
 302      <indexterm>
 303       <primary>quotation marks</primary>
 304       <secondary>escaping</secondary>
 305      </indexterm>
 306      A string constant in SQL is an arbitrary sequence of characters
 307      bounded by single quotes (<literal>'</literal>), for example
 308      <literal>'This is a string'</literal>.  To include
 309      a single-quote character within a string constant,
 310      write two adjacent single quotes, e.g.,
 311      <literal>'Dianne''s horse'</literal>.
 312      Note that this is <emphasis>not</emphasis> the same as a double-quote
 313      character (<literal>"</literal>). <!-- font-lock sanity: " -->
 314     </para>
 315
 316     <para>
 317      Two string constants that are only separated by whitespace
 318      <emphasis>with at least one newline</emphasis> are concatenated
 319      and effectively treated as if the string had been written as one
 320      constant.  For example:
 321 <programlisting>
 322 SELECT 'foo'
 323 'bar';
 324 </programlisting>
 325      is equivalent to:
 326 <programlisting>
 327 SELECT 'foobar';
 328 </programlisting>
 329      but:
 330 <programlisting>
 331 SELECT 'foo'      'bar';
 332 </programlisting>
 333      is not valid syntax.  (This slightly bizarre behavior is specified
 334      by <acronym>SQL</acronym>; <productname>PostgreSQL</productname> is
 335      following the standard.)
 336     </para>
 337    </sect3>
 338
 339    <sect3 id="sql-syntax-strings-escape">
 340     <title>String Constants with C-Style Escapes</title>
 341
 342      <indexterm zone="sql-syntax-strings-escape">
 343       <primary>escape string syntax</primary>
 344      </indexterm>
 345      <indexterm zone="sql-syntax-strings-escape">
 346       <primary>backslash escapes</primary>
 347      </indexterm>
 348
 349     <para>
 350      <productname>PostgreSQL</productname> also accepts <quote>escape</quote>
 351      string constants, which are an extension to the SQL standard.
 352      An escape string constant is specified by writing the letter
 353      <literal>E</literal> (upper or lower case) just before the opening single
 354      quote, e.g., <literal>E'foo'</literal>.  (When continuing an escape string
 355      constant across lines, write <literal>E</literal> only before the first opening
 356      quote.)
 357      Within an escape string, a backslash character (<literal>\</literal>) begins a
     C-like <firstterm>backslash escape</firstterm> sequence, in which the combination
     of backslash and following character(s) represents a special byte
 360      value, as shown in <xref linkend="sql-backslash-table"/>.
 361     </para>
 362
 363      <table id="sql-backslash-table">
 364       <title>Backslash Escape Sequences</title>
 365       <tgroup cols="2">
 366       <thead>
 367        <row>
 368         <entry>Backslash Escape Sequence</entry>
 369         <entry>Interpretation</entry>
 370        </row>
 371       </thead>
 372
 373       <tbody>
 374        <row>
 375         <entry><literal>\b</literal></entry>
 376         <entry>backspace</entry>
 377        </row>
 378        <row>
 379         <entry><literal>\f</literal></entry>
 380         <entry>form feed</entry>
 381        </row>
 382        <row>
 383         <entry><literal>\n</literal></entry>
 384         <entry>newline</entry>
 385        </row>
 386        <row>
 387         <entry><literal>\r</literal></entry>
 388         <entry>carriage return</entry>
 389        </row>
 390        <row>
 391         <entry><literal>\t</literal></entry>
 392         <entry>tab</entry>
 393        </row>
 394        <row>
 395         <entry>
 396          <literal>\<replaceable>o</replaceable></literal>,
 397          <literal>\<replaceable>oo</replaceable></literal>,
 398          <literal>\<replaceable>ooo</replaceable></literal>
 399          (<replaceable>o</replaceable> = 0&ndash;7)
 400         </entry>
 401         <entry>octal byte value</entry>
 402        </row>
 403        <row>
 404         <entry>
 405          <literal>\x<replaceable>h</replaceable></literal>,
 406          <literal>\x<replaceable>hh</replaceable></literal>
 407          (<replaceable>h</replaceable> = 0&ndash;9, A&ndash;F)
 408         </entry>
 409         <entry>hexadecimal byte value</entry>
 410        </row>
 411        <row>
 412         <entry>
 413          <literal>\u<replaceable>xxxx</replaceable></literal>,
 414          <literal>\U<replaceable>xxxxxxxx</replaceable></literal>
 415          (<replaceable>x</replaceable> = 0&ndash;9, A&ndash;F)
 416         </entry>
        <entry>16- or 32-bit hexadecimal Unicode character value</entry>
 418        </row>
 419       </tbody>
 420       </tgroup>
 421      </table>
 422
 423     <para>
 424      Any other
 425      character following a backslash is taken literally. Thus, to
 426      include a backslash character, write two backslashes (<literal>\\</literal>).
 427      Also, a single quote can be included in an escape string by writing
 428      <literal>\'</literal>, in addition to the normal way of <literal>''</literal>.
 429     </para>
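    <para>
     The following sketch combines several of these escapes.  It
     assumes default settings; in particular, whether
     <literal>\'</literal> is accepted is subject to the
     <varname>backslash_quote</varname> parameter.
<programlisting>
SELECT E'one\ttwo',         -- a tab between the two words
       E'C:\\temp',         -- a single backslash: C:\temp
       E'Dianne\'s horse',  -- the same string as 'Dianne''s horse'
       E'\x41\101\u0041';   -- hexadecimal, octal, and Unicode escapes: AAA
</programlisting>
    </para>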
 430
 431     <para>
     It is your responsibility to ensure that the byte sequences you
     create, especially when using the octal or hexadecimal escapes,
     compose valid characters in the server character set encoding.
 435      A useful alternative is to use Unicode escapes or the
 436      alternative Unicode escape syntax, explained
 437      in <xref linkend="sql-syntax-strings-uescape"/>; then the server
 438      will check that the character conversion is possible.
 439     </para>
 440
 441     <caution>
 442     <para>
 443      If the configuration parameter
 444      <xref linkend="guc-standard-conforming-strings"/> is <literal>off</literal>,
 445      then <productname>PostgreSQL</productname> recognizes backslash escapes
 446      in both regular and escape string constants.  However, as of
 447      <productname>PostgreSQL</productname> 9.1, the default is <literal>on</literal>, meaning
 448      that backslash escapes are recognized only in escape string constants.
 449      This behavior is more standards-compliant, but might break applications
 450      which rely on the historical behavior, where backslash escapes
 451      were always recognized.  As a workaround, you can set this parameter
 452      to <literal>off</literal>, but it is better to migrate away from using backslash
 453      escapes.  If you need to use a backslash escape to represent a special
 454      character, write the string constant with an <literal>E</literal>.
 455     </para>
 456
 457     <para>
 458      In addition to <varname>standard_conforming_strings</varname>, the configuration
 459      parameters <xref linkend="guc-escape-string-warning"/> and
 460      <xref linkend="guc-backslash-quote"/> govern treatment of backslashes
 461      in string constants.
 462     </para>
 463     </caution>
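    <para>
     As a sketch of the difference, assuming
     <varname>standard_conforming_strings</varname> is
     <literal>on</literal>, the regular string below keeps the
     backslash and the letter <literal>n</literal> as two separate
     characters, while the escape string turns them into a single
     newline:
<programlisting>
SELECT length('a\nb')  AS regular_len,  -- 4
       length(E'a\nb') AS escape_len;   -- 3
</programlisting>
    </para>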
 464
 465     <para>
 466      The character with the code zero cannot be in a string constant.
 467     </para>
 468    </sect3>
 469
 470    <sect3 id="sql-syntax-strings-uescape">
 471     <title>String Constants with Unicode Escapes</title>
 472
 473     <indexterm  zone="sql-syntax-strings-uescape">
 474      <primary>Unicode escape</primary>
 475      <secondary>in string constants</secondary>
 476     </indexterm>
 477
 478     <para>
 479      <productname>PostgreSQL</productname> also supports another type
 480      of escape syntax for strings that allows specifying arbitrary
 481      Unicode characters by code point.  A Unicode escape string
 482      constant starts with <literal>U&amp;</literal> (upper or lower case
 483      letter U followed by ampersand) immediately before the opening
 484      quote, without any spaces in between, for
 485      example <literal>U&amp;'foo'</literal>.  (Note that this creates an
 486      ambiguity with the operator <literal>&amp;</literal>.  Use spaces
 487      around the operator to avoid this problem.)  Inside the quotes,
 488      Unicode characters can be specified in escaped form by writing a
 489      backslash followed by the four-digit hexadecimal code point
 490      number or alternatively a backslash followed by a plus sign
 491      followed by a six-digit hexadecimal code point number.  For
 492      example, the string <literal>'data'</literal> could be written as
 493 <programlisting>
 494 U&amp;'d\0061t\+000061'
 495 </programlisting>
 496      The following less trivial example writes the Russian
 497      word <quote>slon</quote> (elephant) in Cyrillic letters:
 498 <programlisting>
 499 U&amp;'\0441\043B\043E\043D'
 500 </programlisting>
 501     </para>
 502
 503     <para>
     If an escape character other than backslash is desired, it can
 505      be specified using
 506      the <literal>UESCAPE</literal><indexterm><primary>UESCAPE</primary></indexterm>
 507      clause after the string, for example:
 508 <programlisting>
 509 U&amp;'d!0061t!+000061' UESCAPE '!'
 510 </programlisting>
 511      The escape character can be any single character other than a
 512      hexadecimal digit, the plus sign, a single quote, a double quote,
 513      or a whitespace character.
 514     </para>
 515
 516     <para>
 517      To include the escape character in the string literally, write
 518      it twice.
 519     </para>
 520
 521     <para>
 522      Either the 4-digit or the 6-digit escape form can be used to
 523      specify UTF-16 surrogate pairs to compose characters with code
 524      points larger than U+FFFF, although the availability of the
 525      6-digit form technically makes this unnecessary.  (Surrogate
 526      pairs are not stored directly, but are combined into a single
 527      code point.)
 528     </para>
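    <para>
     The two preceding rules can be sketched as follows, assuming a
     UTF-8 server encoding and the default backslash escape character.
     The code point U+1F600 is chosen only for illustration:
<programlisting>
SELECT U&amp;'a\\b' AS literal_backslash,                   -- the string a\b
       U&amp;'\D83D\DE00' = U&amp;'\+01F600' AS same_character;  -- true
</programlisting>
     The surrogate pair and the six-digit form denote the same single
     code point, so the comparison holds.
    </para>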
 529
 530     <para>
 531      If the server encoding is not UTF-8, the Unicode code point identified
 532      by one of these escape sequences is converted to the actual server
 533      encoding; an error is reported if that's not possible.
 534     </para>
 535
 536     <para>
 537      Also, the Unicode escape syntax for string constants only works
 538      when the configuration
 539      parameter <xref linkend="guc-standard-conforming-strings"/> is
 540      turned on.  This is because otherwise this syntax could confuse
 541      clients that parse the SQL statements to the point that it could
 542      lead to SQL injections and similar security issues.  If the
 543      parameter is set to off, this syntax will be rejected with an
 544      error message.
 545     </para>
 546    </sect3>
 547
 548    <sect3 id="sql-syntax-dollar-quoting">
 549     <title>Dollar-Quoted String Constants</title>
 550
 551      <indexterm>
 552       <primary>dollar quoting</primary>
 553      </indexterm>
 554
 555     <para>
 556      While the standard syntax for specifying string constants is usually
 557      convenient, it can be difficult to understand when the desired string
 558      contains many single quotes, since each of those must
 559      be doubled. To allow more readable queries in such situations,
 560      <productname>PostgreSQL</productname> provides another way, called
 561      <quote>dollar quoting</quote>, to write string constants.
 562      A dollar-quoted string constant
 563      consists of a dollar sign (<literal>$</literal>), an optional
 564      <quote>tag</quote> of zero or more characters, another dollar
 565      sign, an arbitrary sequence of characters that makes up the
 566      string content, a dollar sign, the same tag that began this
 567      dollar quote, and a dollar sign. For example, here are two
 568      different ways to specify the string <quote>Dianne's horse</quote>
 569      using dollar quoting:
 570 <programlisting>
 571 $$Dianne's horse$$
 572 $SomeTag$Dianne's horse$SomeTag$
 573 </programlisting>
 574      Notice that inside the dollar-quoted string, single quotes can be
 575      used without needing to be escaped.  Indeed, no characters inside
 576      a dollar-quoted string are ever escaped: the string content is always
 577      written literally.  Backslashes are not special, and neither are
 578      dollar signs, unless they are part of a sequence matching the opening
 579      tag.
 580     </para>
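    <para>
     As a short sketch, the following query compares the dollar-quoted
     form with the standard form and shows that backslashes and lone
     dollar signs are kept as written (the tag name is arbitrary):
<programlisting>
SELECT $$Dianne's horse$$ = 'Dianne''s horse' AS same_string,  -- true
       $tag$a \ and a $ are literal here$tag$ AS untouched;
</programlisting>
    </para>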
 581
 582     <para>
 583      It is possible to nest dollar-quoted string constants by choosing
 584      different tags at each nesting level.  This is most commonly used in
 585      writing function definitions.  For example:
 586 <programlisting>
 587 $function$
 588 BEGIN
 589     RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
 590 END;
 591 $function$
 592 </programlisting>
 593      Here, the sequence <literal>$q$[\t\r\n\v\\]$q$</literal> represents a
 594      dollar-quoted literal string <literal>[\t\r\n\v\\]</literal>, which will
 595      be recognized when the function body is executed by
 596      <productname>PostgreSQL</productname>.  But since the sequence does not match
 597      the outer dollar quoting delimiter <literal>$function$</literal>, it is
 598      just some more characters within the constant so far as the outer
 599      string is concerned.
 600     </para>
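    <para>
     The body shown above only becomes a command once it is attached
     to a function definition.  A minimal sketch, assuming PL/pgSQL and
     a hypothetical function name <function>has_special_char</function>,
     might read:
<programlisting>
CREATE FUNCTION has_special_char(text) RETURNS boolean
LANGUAGE plpgsql
AS $function$
BEGIN
    RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
END;
$function$;

-- The argument contains a tab, so the pattern should match.
SELECT has_special_char(E'a\tb');
</programlisting>
    </para>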
 601
 602     <para>
 603      The tag, if any, of a dollar-quoted string follows the same rules
 604      as an unquoted identifier, except that it cannot contain a dollar sign.
 605      Tags are case sensitive, so <literal>$tag$String content$tag$</literal>
 606      is correct, but <literal>$TAG$String content$tag$</literal> is not.
 607     </para>
 608
 609     <para>
     A dollar-quoted string that follows a key word or identifier must
 611      be separated from it by whitespace; otherwise the dollar quoting
 612      delimiter would be taken as part of the preceding identifier.
 613     </para>
 614
 615     <para>
 616      Dollar quoting is not part of the SQL standard, but it is often a more
 617      convenient way to write complicated string literals than the
 618      standard-compliant single quote syntax.  It is particularly useful when
 619      representing string constants inside other constants, as is often needed
 620      in procedural function definitions.  With single-quote syntax, each
 621      backslash in the above example would have to be written as four
 622      backslashes, which would be reduced to two backslashes in parsing the
 623      original string constant, and then to one when the inner string constant
 624      is re-parsed during function execution.
 625     </para>
 626    </sect3>
 627
 628    <sect3 id="sql-syntax-bit-strings">
 629     <title>Bit-String Constants</title>
 630
 631     <indexterm zone="sql-syntax-bit-strings">
 632      <primary>bit string</primary>
 633      <secondary>constant</secondary>
 634     </indexterm>
 635
 636     <para>
 637      Bit-string constants look like regular string constants with a
 638      <literal>B</literal> (upper or lower case) immediately before the
 639      opening quote (no intervening whitespace), e.g.,
 640      <literal>B'1001'</literal>.  The only characters allowed within
 641      bit-string constants are <literal>0</literal> and
 642      <literal>1</literal>.
 643     </para>
 644
 645     <para>
 646      Alternatively, bit-string constants can be specified in hexadecimal
 647      notation, using a leading <literal>X</literal> (upper or lower case),
 648      e.g., <literal>X'1FF'</literal>.  This notation is equivalent to
 649      a bit-string constant with four binary digits for each hexadecimal digit.
 650     </para>
 651
 652     <para>
 653      Both forms of bit-string constant can be continued
 654      across lines in the same way as regular string constants.
 655      Dollar quoting cannot be used in a bit-string constant.
 656     </para>
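    <para>
     As a sketch of the equivalence between the two notations, each of
     the three hexadecimal digits below expands to four bits, giving a
     twelve-bit string:
<programlisting>
SELECT B'1001' AS four_bits,
       X'1FF' = B'000111111111' AS same_bits;  -- true
</programlisting>
    </para>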
 657    </sect3>
 658
 659    <sect3 id="sql-syntax-constants-numeric">
 660     <title>Numeric Constants</title>
 661
 662     <indexterm>
 663      <primary>number</primary>
 664      <secondary>constant</secondary>
 665     </indexterm>
 666
 667     <para>
 668      Numeric constants are accepted in these general forms:
 669 <synopsis>
 670 <replaceable>digits</replaceable>
 671 <replaceable>digits</replaceable>.<optional><replaceable>digits</replaceable></optional><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 672 <optional><replaceable>digits</replaceable></optional>.<replaceable>digits</replaceable><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 673 <replaceable>digits</replaceable>e<optional>+-</optional><replaceable>digits</replaceable>
 674 </synopsis>
 675      where <replaceable>digits</replaceable> is one or more decimal
 676      digits (0 through 9).  At least one digit must be before or after the
 677      decimal point, if one is used.  At least one digit must follow the
 678      exponent marker (<literal>e</literal>), if one is present.
 679      There cannot be any spaces or other characters embedded in the
 680      constant, except for underscores, which can be used for visual grouping as
 681      described below.  Note that any leading plus or minus sign is not actually
 682      considered part of the constant; it is an operator applied to the
 683      constant.
 684     </para>
 685
 686     <para>
 687      These are some examples of valid numeric constants:
 688 <literallayout>
 689 42
 690 3.5
 691 4.
 692 .001
 693 5e2
 694 1.925e-3
 695 </literallayout>
 696     </para>
 697
 698     <para>
 699      Additionally, non-decimal integer constants are accepted in these forms:
 700 <synopsis>
 701 0x<replaceable>hexdigits</replaceable>
 702 0o<replaceable>octdigits</replaceable>
 703 0b<replaceable>bindigits</replaceable>
 704 </synopsis>
 705      where <replaceable>hexdigits</replaceable> is one or more hexadecimal digits
 706      (0-9, A-F), <replaceable>octdigits</replaceable> is one or more octal
 707      digits (0-7), and <replaceable>bindigits</replaceable> is one or more binary
 708      digits (0 or 1).  Hexadecimal digits and the radix prefixes can be in
 709      upper or lower case.  Note that only integers can have non-decimal forms,
 710      not numbers with fractional parts.
 711     </para>
 712
 713     <para>
 714      These are some examples of valid non-decimal integer constants:
 715 <literallayout>
 716 0b100101
 717 0B10011001
 718 0o273
 719 0O755
 720 0x42f
 721 0XFFFF
 722 </literallayout>
 723     </para>
 724
 725     <para>
 726      For visual grouping, underscores can be inserted between digits.  These
 727      have no further effect on the value of the constant.  For example:
 728 <literallayout>
 729 1_500_000_000
 730 0b10001000_00000000
 731 0o_1_755
 732 0xFFFF_FFFF
 733 1.618_034
 734 </literallayout>
 735      Underscores are not allowed at the start or end of a numeric constant or
 736      a group of digits (that is, immediately before or after the decimal point
 737      or the exponent marker), and more than one underscore in a row is not
 738      allowed.
 739     </para>
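
    <para>
     As a quick illustration of the forms above (a sketch that assumes a
     server version accepting non-decimal constants and underscores; the
     column aliases are arbitrary), each pair of constants below denotes the
     same value, so every comparison yields true:
<programlisting>
SELECT 0x42f = 1071               AS hex_ok,
       0o273 = 187                AS oct_ok,
       0b100101 = 37              AS bin_ok,
       1_500_000_000 = 1500000000 AS grouped_ok;
</programlisting>
    </para>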
 740
 741     <para>
 742      <indexterm><primary>integer</primary></indexterm>
 743      <indexterm><primary>bigint</primary></indexterm>
 744      <indexterm><primary>numeric</primary></indexterm>
 745      A numeric constant that contains neither a decimal point nor an
 746      exponent is initially presumed to be type <type>integer</type> if its
 747      value fits in type <type>integer</type> (32 bits); otherwise it is
 748      presumed to be type <type>bigint</type> if its
 749      value fits in type <type>bigint</type> (64 bits); otherwise it is
 750      taken to be type <type>numeric</type>.  Constants that contain decimal
 751      points and/or exponents are always initially presumed to be type
 752      <type>numeric</type>.
 753     </para>
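
    <para>
     One way to observe the initially presumed type is to pass a constant to
     <function>pg_typeof</function>.  The following sketch uses arbitrary
     values chosen to fall on either side of the 32-bit and 64-bit limits:
<programlisting>
SELECT pg_typeof(42)                   AS fits_integer,  -- integer
       pg_typeof(3000000000)           AS fits_bigint,   -- bigint
       pg_typeof(10000000000000000000) AS too_large,     -- numeric
       pg_typeof(1.5)                  AS has_point,     -- numeric
       pg_typeof(5e2)                  AS has_exponent;  -- numeric
</programlisting>
    </para>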
 754
 755     <para>
 756      The initially assigned data type of a numeric constant is just a
 757      starting point for the type resolution algorithms.  In most cases
 758      the constant will be automatically coerced to the most
 759      appropriate type depending on context.  When necessary, you can
 760      force a numeric value to be interpreted as a specific data type
 761      by casting it.<indexterm><primary>type cast</primary></indexterm>
 762      For example, you can force a numeric value to be treated as type
 763      <type>real</type> (<type>float4</type>) by writing:
 764
 765 <programlisting>
 766 REAL '1.23'  -- string style
 767 1.23::REAL   -- PostgreSQL (historical) style
 768 </programlisting>
 769
 770      These are actually just special cases of the general casting
 771      notations discussed next.
 772     </para>
 773    </sect3>
 774
 775    <sect3 id="sql-syntax-constants-generic">
 776     <title>Constants of Other Types</title>
 777
 778     <indexterm>
 779      <primary>data type</primary>
 780      <secondary>constant</secondary>
 781     </indexterm>
 782
 783     <para>
 784      A constant of an <emphasis>arbitrary</emphasis> type can be
 785      entered using any one of the following notations:
 786 <synopsis>
 787 <replaceable>type</replaceable> '<replaceable>string</replaceable>'
 788 '<replaceable>string</replaceable>'::<replaceable>type</replaceable>
 789 CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
 790 </synopsis>
 791      The string constant's text is passed to the input conversion
 792      routine for the type called <replaceable>type</replaceable>. The
 793      result is a constant of the indicated type.  The explicit type
 794      cast can be omitted if there is no ambiguity as to the type the
 795      constant must be (for example, when it is assigned directly to a
 796      table column), in which case it is automatically coerced.
 797     </para>
 798
 799     <para>
 800      The string constant can be written using either regular SQL
 801      notation or dollar-quoting.
 802     </para>
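
    <para>
     For instance, the following expressions should all produce the same
     <type>date</type> constant.  The date value is arbitrary and chosen only
     for illustration; the last expression uses dollar-quoting for the string:
<programlisting>
SELECT DATE '2001-02-03',
       '2001-02-03'::date,
       CAST('2001-02-03' AS date),
       $$2001-02-03$$::date;
</programlisting>
    </para>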
 803
 804     <para>
 805      It is also possible to specify a type coercion using a function-like
 806      syntax:
 807 <synopsis>
 808 <replaceable>typename</replaceable> ( '<replaceable>string</replaceable>' )
 809 </synopsis>
 810      but not all type names can be used in this way; see <xref
 811      linkend="sql-syntax-type-casts"/> for details.
 812     </para>
 813
 814     <para>
 815      The <literal>::</literal>, <literal>CAST()</literal>, and
 816      function-call syntaxes can also be used to specify run-time type
 817      conversions of arbitrary expressions, as discussed in <xref
 818      linkend="sql-syntax-type-casts"/>.  To avoid syntactic ambiguity, the
 819      <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
 820      syntax can only be used to specify the type of a simple literal constant.
 821      Another restriction on the
 822      <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
 823      syntax is that it does not work for array types; use <literal>::</literal>
 824      or <literal>CAST()</literal> to specify the type of an array constant.
 825     </para>
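
    <para>
     As a small sketch of the array restriction just mentioned, an array
     constant can be given its type with either of the two remaining
     notations (the element values are arbitrary):
<programlisting>
SELECT '{1,2,3}'::integer[],
       CAST('{1,2,3}' AS integer[]);
</programlisting>
    </para>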
 826
 827     <para>
 828      The <literal>CAST()</literal> syntax conforms to SQL.  The
 829      <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
 830      syntax is a generalization of the standard: SQL specifies this syntax only
 831      for a few data types, but <productname>PostgreSQL</productname> allows it
 832      for all types.  The syntax with
 833      <literal>::</literal> is historical <productname>PostgreSQL</productname>
 834      usage, as is the function-call syntax.
 835     </para>
 836    </sect3>
 837   </sect2>
 838
 839   <sect2 id="sql-syntax-operators">
 840    <title>Operators</title>
 841
 842    <indexterm zone="sql-syntax-operators">
 843     <primary>operator</primary>
 844     <secondary>syntax</secondary>
 845    </indexterm>
 846
 847    <para>
 848     An operator name is a sequence of up to <symbol>NAMEDATALEN</symbol>-1
 849     (63 by default) characters from the following list:
 850 <literallayout>
 851 + - * / &lt; &gt; = ~ ! @ # % ^ &amp; | ` ?
 852 </literallayout>
 853
 854     There are a few restrictions on operator names, however:
 855     <itemizedlist>
 856      <listitem>
 857       <para>
 858        <literal>--</literal> and <literal>/*</literal> cannot appear
 859        anywhere in an operator name, since they will be taken as the
 860        start of a comment.
 861       </para>
 862      </listitem>
 863
 864      <listitem>
 865       <para>
 866        A multiple-character operator name cannot end in <literal>+</literal> or <literal>-</literal>,
 867        unless the name also contains at least one of these characters:
 868 <literallayout>
 869 ~ ! @ # % ^ &amp; | ` ?
 870 </literallayout>
 871        For example, <literal>@-</literal> is an allowed operator name,
 872        but <literal>*-</literal> is not.  This restriction allows
 873        <productname>PostgreSQL</productname> to parse SQL-compliant
 874        queries without requiring spaces between tokens.
 875       </para>
 876      </listitem>
 877     </itemizedlist>
 878    </para>
 879
 880    <para>
 881     When working with non-SQL-standard operator names, you will usually
 882     need to separate adjacent operators with spaces to avoid ambiguity.
 883     For example, if you have defined a prefix operator named <literal>@</literal>,
 884     you cannot write <literal>X*@Y</literal>; you must write
    <literal>X* @Y</literal> to ensure that
    <productname>PostgreSQL</productname> reads it as two operator names,
    not one.
 888    </para>
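
   <para>
    The two rules can be seen side by side in the following sketch, which
    relies only on built-in operators (including the built-in prefix
    operator <literal>@</literal> for absolute value).  Because
    <literal>*-</literal> is not a valid operator name, the first expression
    needs no spaces; the second needs a space after <literal>*</literal> and
    after <literal>@</literal> so that each is read as a separate operator:
<programlisting>
SELECT 2*-3    AS product,      -- read as 2 * (-3), giving -6
       2* @ -3 AS with_prefix;  -- read as 2 * (@ (-3)), giving 6
</programlisting>
   </para>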
 889   </sect2>
 890
 891   <sect2 id="sql-syntax-special-chars">
 892    <title>Special Characters</title>
 893
 894   <para>
 895    Some characters that are not alphanumeric have a special meaning
 896    that is different from being an operator.  Details on the usage can
 897    be found at the location where the respective syntax element is
   described.  This section exists only to point out these characters
   and summarize their purposes.
 900
 901    <itemizedlist>
 902     <listitem>
 903      <para>
 904       A dollar sign (<literal>$</literal>) followed by digits is used
 905       to represent a positional parameter in the body of a function
 906       definition or a prepared statement.  In other contexts the
 907       dollar sign can be part of an identifier or a dollar-quoted string
 908       constant.
 909      </para>
 910     </listitem>
 911
 912     <listitem>
 913      <para>
 914       Parentheses (<literal>()</literal>) have their usual meaning to
 915       group expressions and enforce precedence.  In some cases
 916       parentheses are required as part of the fixed syntax of a
 917       particular SQL command.
 918      </para>
 919     </listitem>
 920
 921     <listitem>
 922      <para>
 923       Brackets (<literal>[]</literal>) are used to select the elements
 924       of an array.  See <xref linkend="arrays"/> for more information
 925       on arrays.
 926      </para>
 927     </listitem>
 928
 929     <listitem>
 930      <para>
 931       Commas (<literal>,</literal>) are used in some syntactical
 932       constructs to separate the elements of a list.
 933      </para>
 934     </listitem>
 935
 936     <listitem>
 937      <para>
 938       The semicolon (<literal>;</literal>) terminates an SQL command.
 939       It cannot appear anywhere within a command, except within a
 940       string constant or quoted identifier.
 941      </para>
 942     </listitem>
 943
 944     <listitem>
 945      <para>
 946       The colon (<literal>:</literal>) is used to select
 947       <quote>slices</quote> from arrays. (See <xref
 948       linkend="arrays"/>.)  In certain SQL dialects (such as Embedded
 949       SQL), the colon is used to prefix variable names.
 950      </para>
 951     </listitem>
 952
 953     <listitem>
 954      <para>
 955       The asterisk (<literal>*</literal>) is used in some contexts to denote
 956       all the fields of a table row or composite value.  It also
 957       has a special meaning when used as the argument of an
 958       aggregate function, namely that the aggregate does not require
 959       any explicit parameter.
 960      </para>
 961     </listitem>
 962
 963     <listitem>
 964      <para>
 965       The period (<literal>.</literal>) is used in numeric
 966       constants, and to separate schema, table, and column names.
 967      </para>
 968     </listitem>
 969    </itemizedlist>
 970
 971    </para>
 972   </sect2>
 973
 974   <sect2 id="sql-syntax-comments">
 975    <title>Comments</title>
 976
 977    <indexterm zone="sql-syntax-comments">
 978     <primary>comment</primary>
 979     <secondary sortas="SQL">in SQL</secondary>
 980    </indexterm>
 981
 982    <para>
 983     A comment is a sequence of characters beginning with
 984     double dashes and extending to the end of the line, e.g.:
 985 <programlisting>
 986 -- This is a standard SQL comment
 987 </programlisting>
 988    </para>
 989
 990    <para>
 991     Alternatively, C-style block comments can be used:
 992 <programlisting>
 993 /* multiline comment
 994  * with nesting: /* nested block comment */
 995  */
 996 </programlisting>
 997     where the comment begins with <literal>/*</literal> and extends to
 998     the matching occurrence of <literal>*/</literal>. These block
 999     comments nest, as specified in the SQL standard but unlike C, so that one can
1000     comment out larger blocks of code that might contain existing block
1001     comments.
1002    </para>
1003
1004    <para>
1005     A comment is removed from the input stream before further syntax
1006     analysis and is effectively replaced by whitespace.
1007    </para>
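
   <para>
    As an illustrative sketch, both comment styles can appear inside a
    single command, and the comments behave like whitespace between tokens
    (the column aliases are arbitrary):
<programlisting>
SELECT 1 AS one  -- comment running to the end of the line
/* outer comment /* nested comment */ still inside the outer comment */
     , 2 AS two;
</programlisting>
   </para>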
1008   </sect2>
1009
1010   <sect2 id="sql-precedence">
1011    <title>Operator Precedence</title>
1012
1013    <indexterm zone="sql-precedence">
1014     <primary>operator</primary>
1015     <secondary>precedence</secondary>
1016    </indexterm>
1017
1018    <para>
1019     <xref linkend="sql-precedence-table"/> shows the precedence and
1020     associativity of the operators in <productname>PostgreSQL</productname>.
1021     Most operators have the same precedence and are left-associative.
    The precedence and associativity of the operators are hard-wired
    into the parser.
1024     Add parentheses if you want an expression with multiple operators
1025     to be parsed in some other way than what the precedence rules imply.
1026    </para>
1027
1028    <table id="sql-precedence-table">
1029     <title>Operator Precedence (highest to lowest)</title>
1030
1031     <tgroup cols="3">
1032      <colspec colname="col1" colwidth="2*"/>
1033      <colspec colname="col2" colwidth="1*"/>
1034      <colspec colname="col3" colwidth="2*"/>
1035      <thead>
1036       <row>
1037        <entry>Operator/Element</entry>
1038        <entry>Associativity</entry>
1039        <entry>Description</entry>
1040       </row>
1041      </thead>
1042
1043      <tbody>
1044       <row>
1045        <entry><token>.</token></entry>
1046        <entry>left</entry>
1047        <entry>table/column name separator</entry>
1048       </row>
1049
1050       <row>
1051        <entry><token>::</token></entry>
1052        <entry>left</entry>
1053        <entry><productname>PostgreSQL</productname>-style typecast</entry>
1054       </row>
1055
1056       <row>
1057        <entry><token>[</token> <token>]</token></entry>
1058        <entry>left</entry>
1059        <entry>array element selection</entry>
1060       </row>
1061
1062       <row>
1063        <entry><token>+</token> <token>-</token></entry>
1064        <entry>right</entry>
1065        <entry>unary plus, unary minus</entry>
1066       </row>
1067
1068       <row>
1069        <entry><token>COLLATE</token></entry>
1070        <entry>left</entry>
1071        <entry>collation selection</entry>
1072       </row>
1073
1074       <row>
1075        <entry><token>AT</token></entry>
1076        <entry>left</entry>
1077        <entry><literal>AT TIME ZONE</literal>, <literal>AT LOCAL</literal></entry>
1078       </row>
1079
1080       <row>
1081        <entry><token>^</token></entry>
1082        <entry>left</entry>
1083        <entry>exponentiation</entry>
1084       </row>
1085
1086       <row>
1087        <entry><token>*</token> <token>/</token> <token>%</token></entry>
1088        <entry>left</entry>
1089        <entry>multiplication, division, modulo</entry>
1090       </row>
1091
1092       <row>
1093        <entry><token>+</token> <token>-</token></entry>
1094        <entry>left</entry>
1095        <entry>addition, subtraction</entry>
1096       </row>
1097
1098       <row>
1099        <entry>(any other operator)</entry>
1100        <entry>left</entry>
1101        <entry>all other native and user-defined operators</entry>
1102       </row>
1103
1104       <row>
1105        <entry><token>BETWEEN</token> <token>IN</token> <token>LIKE</token> <token>ILIKE</token> <token>SIMILAR</token></entry>
1106        <entry></entry>
1107        <entry>range containment, set membership, string matching</entry>
1108       </row>
1109
1110       <row>
1111        <entry><token>&lt;</token> <token>&gt;</token> <token>=</token> <token>&lt;=</token> <token>&gt;=</token> <token>&lt;&gt;</token>
1112 </entry>
1113        <entry></entry>
1114        <entry>comparison operators</entry>
1115       </row>
1116
1117       <row>
1118        <entry><token>IS</token> <token>ISNULL</token> <token>NOTNULL</token></entry>
1119        <entry></entry>
1120        <entry><literal>IS TRUE</literal>, <literal>IS FALSE</literal>, <literal>IS
1121        NULL</literal>, <literal>IS DISTINCT FROM</literal>, etc.</entry>
1122       </row>
1123
1124       <row>
1125        <entry><token>NOT</token></entry>
1126        <entry>right</entry>
1127        <entry>logical negation</entry>
1128       </row>
1129
1130       <row>
1131        <entry><token>AND</token></entry>
1132        <entry>left</entry>
1133        <entry>logical conjunction</entry>
1134       </row>
1135
1136       <row>
1137        <entry><token>OR</token></entry>
1138        <entry>left</entry>
1139        <entry>logical disjunction</entry>
1140       </row>
1141      </tbody>
1142     </tgroup>
1143    </table>
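
   <para>
    A few expressions using only built-in arithmetic operators illustrate
    how the table is applied; the operands are arbitrary:
<programlisting>
SELECT 2 + 3 * 4   AS a,  -- 14: * binds more tightly than +
       (2 + 3) * 4 AS b,  -- 20: parentheses override precedence
       2 ^ 3 ^ 2   AS c,  -- 64: ^ is left-associative, so (2 ^ 3) ^ 2
       -2 ^ 2      AS d;  -- 4: unary minus binds more tightly than ^
</programlisting>
   </para>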
1144
1145    <para>
1146     Note that the operator precedence rules also apply to user-defined
1147     operators that have the same names as the built-in operators
1148     mentioned above.  For example, if you define a
1149     <quote>+</quote> operator for some custom data type it will have
1150     the same precedence as the built-in <quote>+</quote> operator, no
1151     matter what yours does.
1152    </para>
1153
1154    <para>
1155     When a schema-qualified operator name is used in the
1156     <literal>OPERATOR</literal> syntax, as for example in:
1157 <programlisting>
1158 SELECT 3 OPERATOR(pg_catalog.+) 4;
1159 </programlisting>
1160     the <literal>OPERATOR</literal> construct is taken to have the default precedence
1161     shown in <xref linkend="sql-precedence-table"/> for
1162     <quote>any other operator</quote>.  This is true no matter
1163     which specific operator appears inside <literal>OPERATOR()</literal>.
1164    </para>
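
   <para>
    A consequence worth checking with a small example is that wrapping a
    built-in operator in <literal>OPERATOR()</literal> can change how an
    expression groups.  In this sketch the qualified multiplication has
    lower precedence than <literal>+</literal>:
<programlisting>
SELECT 2 + 3 * 4                      AS plain,      -- 14
       2 + 3 OPERATOR(pg_catalog.*) 4 AS qualified;  -- 20, parsed as (2 + 3) * 4
</programlisting>
   </para>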
1165
1166    <note>
1167     <para>
1168      <productname>PostgreSQL</productname> versions before 9.5 used slightly different
1169      operator precedence rules.  In particular, <token>&lt;=</token>
1170      <token>&gt;=</token> and <token>&lt;&gt;</token> used to be treated as
1171      generic operators; <literal>IS</literal> tests used to have higher priority;
1172      and <literal>NOT BETWEEN</literal> and related constructs acted inconsistently,
1173      being taken in some cases as having the precedence of <literal>NOT</literal>
1174      rather than <literal>BETWEEN</literal>.  These rules were changed for better
1175      compliance with the SQL standard and to reduce confusion from
1176      inconsistent treatment of logically equivalent constructs.  In most
1177      cases, these changes will result in no behavioral change, or perhaps
1178      in <quote>no such operator</quote> failures which can be resolved by adding
     parentheses.  However, there are corner cases in which a query might
     change behavior without any parsing error being reported.
1181     </para>
1182    </note>
1183   </sect2>
1184  </sect1>
1185
1186  <sect1 id="sql-expressions">
1187   <title>Value Expressions</title>
1188
1189   <indexterm zone="sql-expressions">
1190    <primary>expression</primary>
1191    <secondary>syntax</secondary>
1192   </indexterm>
1193
1194   <indexterm zone="sql-expressions">
1195    <primary>value expression</primary>
1196   </indexterm>
1197
1198   <indexterm>
1199    <primary>scalar</primary>
1200    <see>expression</see>
1201   </indexterm>
1202
1203   <para>
1204    Value expressions are used in a variety of contexts, such
1205    as in the target list of the <command>SELECT</command> command, as
1206    new column values in <command>INSERT</command> or
1207    <command>UPDATE</command>, or in search conditions in a number of
1208    commands.  The result of a value expression is sometimes called a
1209    <firstterm>scalar</firstterm>, to distinguish it from the result of
1210    a table expression (which is a table).  Value expressions are
1211    therefore also called <firstterm>scalar expressions</firstterm> (or
1212    even simply <firstterm>expressions</firstterm>).  The expression
1213    syntax allows the calculation of values from primitive parts using
1214    arithmetic, logical, set, and other operations.
1215   </para>
1216
1217   <para>
1218    A value expression is one of the following:
1219
1220    <itemizedlist>
1221     <listitem>
1222      <para>
1223       A constant or literal value
1224      </para>
1225     </listitem>
1226
1227     <listitem>
1228      <para>
1229       A column reference
1230      </para>
1231     </listitem>
1232
1233     <listitem>
1234      <para>
1235       A positional parameter reference, in the body of a function definition
1236       or prepared statement
1237      </para>
1238     </listitem>
1239
1240     <listitem>
1241      <para>
1242       A subscripted expression
1243      </para>
1244     </listitem>
1245
1246     <listitem>
1247      <para>
1248       A field selection expression
1249      </para>
1250     </listitem>
1251
1252     <listitem>
1253      <para>
1254       An operator invocation
1255      </para>
1256     </listitem>
1257
1258     <listitem>
1259      <para>
1260       A function call
1261      </para>
1262     </listitem>
1263
1264     <listitem>
1265      <para>
1266       An aggregate expression
1267      </para>
1268     </listitem>
1269
1270     <listitem>
1271      <para>
1272       A window function call
1273      </para>
1274     </listitem>
1275
1276     <listitem>
1277      <para>
1278       A type cast
1279      </para>
1280     </listitem>
1281
1282     <listitem>
1283      <para>
1284       A collation expression
1285      </para>
1286     </listitem>
1287
1288     <listitem>
1289      <para>
1290       A scalar subquery
1291      </para>
1292     </listitem>
1293
1294     <listitem>
1295      <para>
1296       An array constructor
1297      </para>
1298     </listitem>
1299
1300     <listitem>
1301      <para>
1302       A row constructor
1303      </para>
1304     </listitem>
1305
1306     <listitem>
1307      <para>
1308       Another value expression in parentheses (used to group
1309       subexpressions and override
1310       precedence<indexterm><primary>parenthesis</primary></indexterm>)
1311      </para>
1312     </listitem>
1313    </itemizedlist>
1314   </para>
1315
1316   <para>
1317    In addition to this list, there are a number of constructs that can
1318    be classified as an expression but do not follow any general syntax
1319    rules.  These generally have the semantics of a function or
1320    operator and are explained in the appropriate location in <xref
1321    linkend="functions"/>.  An example is the <literal>IS NULL</literal>
1322    clause.
1323   </para>
1324
1325   <para>
1326    We have already discussed constants in <xref
1327    linkend="sql-syntax-constants"/>.  The following sections discuss
1328    the remaining options.
1329   </para>
1330
1331   <sect2 id="sql-expressions-column-refs">
1332    <title>Column References</title>
1333
1334    <indexterm>
1335     <primary>column reference</primary>
1336    </indexterm>
1337
1338    <para>
1339     A column can be referenced in the form:
1340 <synopsis>
1341 <replaceable>correlation</replaceable>.<replaceable>columnname</replaceable>
1342 </synopsis>
1343    </para>
1344
1345    <para>
1346     <replaceable>correlation</replaceable> is the name of a
1347     table (possibly qualified with a schema name), or an alias for a table
1348     defined by means of a <literal>FROM</literal> clause.
1349     The correlation name and separating dot can be omitted if the column name
1350     is unique across all the tables being used in the current query.  (See also <xref linkend="queries"/>.)
1351    </para>
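
   <para>
    As a sketch, suppose two hypothetical tables
    <structname>employees</structname> and
    <structname>departments</structname> that both have a column called
    <structfield>name</structfield>.  That column must then be qualified,
    here with aliases, whereas a column appearing in only one table (the
    assumed <structfield>salary</structfield>) can be written bare:
<programlisting>
SELECT e.name, d.name, salary
    FROM employees AS e
    JOIN departments AS d ON e.dept_id = d.id;
</programlisting>
   </para>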
1352   </sect2>
1353
1354   <sect2 id="sql-expressions-parameters-positional">
1355    <title>Positional Parameters</title>
1356
1357    <indexterm>
1358     <primary>parameter</primary>
1359     <secondary>syntax</secondary>
1360    </indexterm>
1361
1362    <indexterm>
1363     <primary>$</primary>
1364    </indexterm>
1365
1366    <para>
1367     A positional parameter reference is used to indicate a value
1368     that is supplied externally to an SQL statement.  Parameters are
    used in SQL function definitions and in prepared statements.  Some
1370     client libraries also support specifying data values separately
1371     from the SQL command string, in which case parameters are used to
1372     refer to the out-of-line data values.
1373     The form of a parameter reference is:
1374 <synopsis>
1375 $<replaceable>number</replaceable>
1376 </synopsis>
1377    </para>
1378
1379    <para>
1380     For example, consider the definition of a function,
1381     <function>dept</function>, as:
1382
1383 <programlisting>
1384 CREATE FUNCTION dept(text) RETURNS dept
1385     AS $$ SELECT * FROM dept WHERE name = $1 $$
1386     LANGUAGE SQL;
1387 </programlisting>
1388
1389     Here the <literal>$1</literal> references the value of the first
1390     function argument whenever the function is invoked.
1391    </para>
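
   <para>
    Parameters work the same way in a prepared statement.  The following
    sketch reuses the <structname>dept</structname> table from the example
    above; the statement name <literal>dept_by_name</literal> and the
    value <literal>'sales'</literal> are made up for illustration:
<programlisting>
PREPARE dept_by_name(text) AS
    SELECT * FROM dept WHERE name = $1;

EXECUTE dept_by_name('sales');
</programlisting>
   </para>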
1392   </sect2>
1393
1394   <sect2 id="sql-expressions-subscripts">
1395    <title>Subscripts</title>
1396
1397    <indexterm>
1398     <primary>subscript</primary>
1399    </indexterm>
1400
1401    <para>
1402     If an expression yields a value of an array type, then a specific
1403     element of the array value can be extracted by writing
1404 <synopsis>
1405 <replaceable>expression</replaceable>[<replaceable>subscript</replaceable>]
1406 </synopsis>
1407     or multiple adjacent elements (an <quote>array slice</quote>) can be extracted
1408     by writing
1409 <synopsis>
1410 <replaceable>expression</replaceable>[<replaceable>lower_subscript</replaceable>:<replaceable>upper_subscript</replaceable>]
1411 </synopsis>
1412     (Here, the brackets <literal>[ ]</literal> are meant to appear literally.)
1413     Each <replaceable>subscript</replaceable> is itself an expression,
1414     which will be rounded to the nearest integer value.
1415    </para>
1416
1417    <para>
1418     In general the array <replaceable>expression</replaceable> must be
1419     parenthesized, but the parentheses can be omitted when the expression
1420     to be subscripted is just a column reference or positional parameter.
1421     Also, multiple subscripts can be concatenated when the original array
1422     is multidimensional.
1423     For example:
1424
1425 <programlisting>
1426 mytable.arraycolumn[4]
1427 mytable.two_d_column[17][34]
1428 $1[10:42]
1429 (arrayfunction(a,b))[42]
1430 </programlisting>
1431
1432     The parentheses in the last example are required.
1433     See <xref linkend="arrays"/> for more about arrays.
1434    </para>
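
   <para>
    A self-contained sketch using an array constructor, which must be
    parenthesized before it is subscripted (the element values are
    arbitrary, and the default lower bound of one is assumed):
<programlisting>
SELECT (ARRAY[10, 20, 30, 40])[2]   AS element,  -- 20
       (ARRAY[10, 20, 30, 40])[2:3] AS slice;    -- {20,30}
</programlisting>
   </para>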
1435   </sect2>
1436
1437   <sect2 id="field-selection">
1438    <title>Field Selection</title>
1439
1440    <indexterm>
1441     <primary>field selection</primary>
1442    </indexterm>
1443
1444    <para>
1445     If an expression yields a value of a composite type (row type), then a
1446     specific field of the row can be extracted by writing
1447 <synopsis>
1448 <replaceable>expression</replaceable>.<replaceable>fieldname</replaceable>
1449 </synopsis>
1450    </para>
1451
1452    <para>
1453     In general the row <replaceable>expression</replaceable> must be
1454     parenthesized, but the parentheses can be omitted when the expression
1455     to be selected from is just a table reference or positional parameter.
1456     For example:
1457
1458 <programlisting>
1459 mytable.mycolumn
1460 $1.somecolumn
1461 (rowfunction(a,b)).col3
1462 </programlisting>
1463
1464     (Thus, a qualified column reference is actually just a special case
1465     of the field selection syntax.)  An important special case is
1466     extracting a field from a table column that is of a composite type:
1467
1468 <programlisting>
1469 (compositecol).somefield
1470 (mytable.compositecol).somefield
1471 </programlisting>
1472
    The parentheses are required here to show that
    <structfield>compositecol</structfield> is a column name, not a table name,
    or, in the second case, that <structname>mytable</structname> is a table
    name, not a schema name.
1477    </para>
1478
1479    <para>
1480     You can ask for all fields of a composite value by
1481     writing <literal>.*</literal>:
1482 <programlisting>
1483 (compositecol).*
1484 </programlisting>
1485     This notation behaves differently depending on context;
1486     see <xref linkend="rowtypes-usage"/> for details.
1487    </para>
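
   <para>
    To make the composite-column case concrete, here is a minimal sketch.
    The type <type>complex</type> and its fields are hypothetical, and
    <structname>mytable</structname> and
    <structfield>compositecol</structfield> simply reuse the names from the
    examples above:
<programlisting>
CREATE TYPE complex AS (r double precision, i double precision);
CREATE TABLE mytable (compositecol complex);

SELECT (compositecol).r,
       (mytable.compositecol).i
    FROM mytable;
</programlisting>
   </para>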
1488   </sect2>
1489
1490   <sect2 id="sql-expressions-operator-calls">
1491    <title>Operator Invocations</title>
1492
1493    <indexterm>
1494     <primary>operator</primary>
1495     <secondary>invocation</secondary>
1496    </indexterm>
1497
1498    <para>
1499     There are two possible syntaxes for an operator invocation:
1500     <simplelist>
1501      <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> <replaceable>expression</replaceable> (binary infix operator)</member>
1502      <member><replaceable>operator</replaceable> <replaceable>expression</replaceable> (unary prefix operator)</member>
1503     </simplelist>
1504     where the <replaceable>operator</replaceable> token follows the syntax
1505     rules of <xref linkend="sql-syntax-operators"/>, or is one of the
1506     key words <token>AND</token>, <token>OR</token>, and
1507     <token>NOT</token>, or is a qualified operator name in the form:
1508 <synopsis>
1509 <literal>OPERATOR(</literal><replaceable>schema</replaceable><literal>.</literal><replaceable>operatorname</replaceable><literal>)</literal>
1510 </synopsis>
1511     Which particular operators exist and whether
1512     they are unary or binary depends on what operators have been
1513     defined by the system or the user.  <xref linkend="functions"/>
1514     describes the built-in operators.
1515    </para>
1516   </sect2>
1517
1518   <sect2 id="sql-expressions-function-calls">
1519    <title>Function Calls</title>
1520
1521    <indexterm>
1522     <primary>function</primary>
1523     <secondary>invocation</secondary>
1524    </indexterm>
1525
1526    <para>
1527     The syntax for a function call is the name of a function
1528     (possibly qualified with a schema name), followed by its argument list
1529     enclosed in parentheses:
1530
1531 <synopsis>
1532 <replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
1533 </synopsis>
1534    </para>
1535
1536    <para>
1537     For example, the following computes the square root of 2:
1538 <programlisting>
1539 sqrt(2)
1540 </programlisting>
1541    </para>
1542
1543    <para>
1544     The list of built-in functions is in <xref linkend="functions"/>.
1545     Other functions can be added by the user.
1546    </para>
1547
1548    <para>
1549     When issuing queries in a database where some users mistrust other users,
1550     observe security precautions from <xref linkend="typeconv-func"/> when
1551     writing function calls.
1552    </para>
1553
1554    <para>
1555     The arguments can optionally have names attached.
1556     See <xref linkend="sql-syntax-calling-funcs"/> for details.
1557    </para>
1558
1559    <note>
1560     <para>
1561      A function that takes a single argument of composite type can
1562      optionally be called using field-selection syntax, and conversely
1563      field selection can be written in functional style.  That is, the
1564      notations <literal>col(table)</literal> and <literal>table.col</literal> are
1565      interchangeable.  This behavior is not SQL-standard but is provided
1566      in <productname>PostgreSQL</productname> because it allows use of functions to
1567      emulate <quote>computed fields</quote>.  For more information see
1568      <xref linkend="rowtypes-usage"/>.
1569     </para>
1570    </note>
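
   <para>
    The interchangeable notations described in the note can be sketched as
    follows.  The table <structname>people</structname> and the function
    <function>full_name</function> are hypothetical names introduced only
    for this example:
<programlisting>
CREATE TABLE people (first_name text, last_name text);

CREATE FUNCTION full_name(people) RETURNS text
    AS $$ SELECT $1.first_name || ' ' || $1.last_name $$
    LANGUAGE SQL;

-- Both output columns invoke the same function.
SELECT full_name(p), p.full_name FROM people AS p;
</programlisting>
   </para>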
1571   </sect2>
1572
1573   <sect2 id="syntax-aggregates">
1574    <title>Aggregate Expressions</title>
1575
1576    <indexterm zone="syntax-aggregates">
1577     <primary>aggregate function</primary>
1578     <secondary>invocation</secondary>
1579    </indexterm>
1580
1581    <indexterm zone="syntax-aggregates">
1582     <primary>ordered-set aggregate</primary>
1583    </indexterm>
1584
1585    <indexterm zone="syntax-aggregates">
1586     <primary>WITHIN GROUP</primary>
1587    </indexterm>
1588
1589    <indexterm zone="syntax-aggregates">
1590     <primary>FILTER</primary>
1591    </indexterm>
1592
1593    <para>
1594     An <firstterm>aggregate expression</firstterm> represents the
1595     application of an aggregate function across the rows selected by a
1596     query.  An aggregate function reduces multiple inputs to a single
1597     output value, such as the sum or average of the inputs.  The
1598     syntax of an aggregate expression is one of the following:
1599
1600 <synopsis>
1601 <replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
1602 <replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
1603 <replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
1604 <replaceable>aggregate_name</replaceable> ( * ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
1605 <replaceable>aggregate_name</replaceable> ( [ <replaceable>expression</replaceable> [ , ... ] ] ) WITHIN GROUP ( <replaceable>order_by_clause</replaceable> ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ]
1606 </synopsis>
1607
1608     where <replaceable>aggregate_name</replaceable> is a previously
1609     defined aggregate (possibly qualified with a schema name) and
1610     <replaceable>expression</replaceable> is
1611     any value expression that does not itself contain an aggregate
1612     expression or a window function call.  The optional
1613     <replaceable>order_by_clause</replaceable> and
1614     <replaceable>filter_clause</replaceable> are described below.
1615    </para>
1616
1617    <para>
1618     The first form of aggregate expression invokes the aggregate
1619     once for each input row.
1620     The second form is the same as the first, since
1621     <literal>ALL</literal> is the default.
1622     The third form invokes the aggregate once for each distinct value
1623     of the expression (or distinct set of values, for multiple expressions)
1624     found in the input rows.
1625     The fourth form invokes the aggregate once for each input row; since no
1626     particular input value is specified, it is generally only useful
1627     for the <function>count(*)</function> aggregate function.
1628     The last form is used with <firstterm>ordered-set</firstterm> aggregate
1629     functions, which are described below.
1630    </para>
1631
1632    <para>
1633     Most aggregate functions ignore null inputs, so that rows in which
1634     one or more of the expression(s) yield null are discarded.  This
1635     can be assumed to be true, unless otherwise specified, for all
1636     built-in aggregates.
1637    </para>
1638
1639    <para>
1640     For example, <literal>count(*)</literal> yields the total number
1641     of input rows; <literal>count(f1)</literal> yields the number of
1642     input rows in which <literal>f1</literal> is non-null, since
1643     <function>count</function> ignores nulls; and
1644     <literal>count(distinct f1)</literal> yields the number of
1645     distinct non-null values of <literal>f1</literal>.
1646    </para>
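   <para>
    The three <function>count</function> variants can be compared side by
    side.  The following sketch uses an inline <literal>VALUES</literal>
    list (an assumption made to keep the example self-contained) holding
    one duplicate and one null:
<programlisting>
WITH t (f1) AS ( VALUES (1), (1), (2), (NULL) )
SELECT count(*)           AS all_rows,
       count(f1)          AS non_null,
       count(DISTINCT f1) AS distinct_non_null
FROM t;
 all_rows | non_null | distinct_non_null
----------+----------+-------------------
        4 |        3 |                 2
</programlisting>
   </para>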
1647
1648    <para>
1649     Ordinarily, the input rows are fed to the aggregate function in an
1650     unspecified order.  In many cases this does not matter; for example,
1651     <function>min</function> produces the same result no matter what order it
1652     receives the inputs in.  However, some aggregate functions
1653     (such as <function>array_agg</function> and <function>string_agg</function>) produce
1654     results that depend on the ordering of the input rows.  When using
1655     such an aggregate, the optional <replaceable>order_by_clause</replaceable> can be
1656     used to specify the desired ordering.  The <replaceable>order_by_clause</replaceable>
1657     has the same syntax as for a query-level <literal>ORDER BY</literal> clause, as
1658     described in <xref linkend="queries-order"/>, except that its expressions
1659     are always just expressions and cannot be output-column names or numbers.
1660     For example:
1661 <programlisting>
1662 WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
1663 SELECT array_agg(v ORDER BY v DESC) FROM vals;
1664   array_agg
1665 -------------
1666  {4,3,3,2,1}
1667 </programlisting>
1668     Because <type>jsonb</type> keeps only the last value given for a duplicate
1669     key, the order in which the inputs are aggregated can be significant:
1670 <programlisting>
1671 WITH vals (k, v) AS ( VALUES ('key0','1'), ('key1','3'), ('key1','2') )
1672 SELECT jsonb_object_agg(k, v ORDER BY v) FROM vals;
1673       jsonb_object_agg
1674 ----------------------------
1675  {"key0": "1", "key1": "3"}
1676 </programlisting>
1677    </para>
1678
1679    <para>
1680     When dealing with multiple-argument aggregate functions, note that the
1681     <literal>ORDER BY</literal> clause goes after all the aggregate arguments.
1682     For example, write this:
1683 <programlisting>
1684 SELECT string_agg(a, ',' ORDER BY a) FROM tbl;
1685 </programlisting>
1686     not this:
1687 <programlisting>
1688 SELECT string_agg(a ORDER BY a, ',') FROM tbl;  -- incorrect
1689 </programlisting>
1690     The latter is syntactically valid, but it represents a call of a
1691     single-argument aggregate function with two <literal>ORDER BY</literal> keys
1692     (the second one being rather useless since it's a constant).
1693    </para>
1694
1695    <para>
1696     If <literal>DISTINCT</literal> is specified with an
1697     <replaceable>order_by_clause</replaceable>, <literal>ORDER
1698     BY</literal> expressions can only reference columns in the
1699     <literal>DISTINCT</literal> list.  For example:
1700 <programlisting>
1701 WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
1702 SELECT array_agg(DISTINCT v ORDER BY v DESC) FROM vals;
1703  array_agg
1704 -----------
1705  {4,3,2,1}
1706 </programlisting>
1707    </para>
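   <para>
    To illustrate the restriction, the sketch below adds a second column
    <literal>w</literal> (hypothetical, not part of the example above) and
    tries to sort by it:
<programlisting>
-- Rejected: w does not appear in the DISTINCT argument list.
WITH vals (v, w) AS ( VALUES (1, 'b'), (2, 'a'), (2, 'c') )
SELECT array_agg(DISTINCT v ORDER BY w) FROM vals;

-- Accepted: the sort expression is the DISTINCT argument itself.
WITH vals (v, w) AS ( VALUES (1, 'b'), (2, 'a'), (2, 'c') )
SELECT array_agg(DISTINCT v ORDER BY v) FROM vals;
</programlisting>
    The first form is refused because a single distinct value of
    <literal>v</literal> can be paired with several values of
    <literal>w</literal>, so no unique ordering exists.
   </para>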
1708
1709    <para>
1710     Placing <literal>ORDER BY</literal> within the aggregate's regular argument
1711     list, as described so far, is used when ordering the input rows for
1712     general-purpose and statistical aggregates, for which ordering is
1713     optional.  There is a
1714     subclass of aggregate functions called <firstterm>ordered-set
1715     aggregates</firstterm> for which an <replaceable>order_by_clause</replaceable>
1716     is <emphasis>required</emphasis>, usually because the aggregate's computation is
1717     only sensible in terms of a specific ordering of its input rows.
1718     Typical examples of ordered-set aggregates include rank and percentile
1719     calculations.  For an ordered-set aggregate,
1720     the <replaceable>order_by_clause</replaceable> is written
1721     inside <literal>WITHIN GROUP (...)</literal>, as shown in the final syntax
1722     alternative above.  The expressions in
1723     the <replaceable>order_by_clause</replaceable> are evaluated once per
1724     input row just like regular aggregate arguments, sorted as per
1725     the <replaceable>order_by_clause</replaceable>'s requirements, and fed
1726     to the aggregate function as input arguments.  (This is unlike the case
1727     for a non-<literal>WITHIN GROUP</literal> <replaceable>order_by_clause</replaceable>,
1728     which is not treated as argument(s) to the aggregate function.)  The
1729     argument expressions preceding <literal>WITHIN GROUP</literal>, if any, are
1730     called <firstterm>direct arguments</firstterm> to distinguish them from
1731     the <firstterm>aggregated arguments</firstterm> listed in
1732     the <replaceable>order_by_clause</replaceable>.  Unlike regular aggregate
1733     arguments, direct arguments are evaluated only once per aggregate call,
1734     not once per input row.  This means that they can contain variables only
1735     if those variables are grouped by <literal>GROUP BY</literal>; this restriction
1736     is the same as if the direct arguments were not inside an aggregate
1737     expression at all.  Direct arguments are typically used for things like
1738     percentile fractions, which only make sense as a single value per
1739     aggregation calculation.  The direct argument list can be empty; in this
1740     case, write just <literal>()</literal> not <literal>(*)</literal>.
1741     (<productname>PostgreSQL</productname> will actually accept either spelling, but
1742     only the first way conforms to the SQL standard.)
1743    </para>
1744
1745    <para>
1746     <indexterm>
1747      <primary>median</primary>
1748      <seealso>percentile</seealso>
1749     </indexterm>
1750     An example of an ordered-set aggregate call is:
1751
1752 <programlisting>
1753 SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY income) FROM households;
1754  percentile_cont
1755 -----------------
1756            50489
1757 </programlisting>
1758
1759    which obtains the 50th percentile, or median, value of
1760    the <structfield>income</structfield> column from table <structname>households</structname>.
1761    Here, <literal>0.5</literal> is a direct argument; it would make no sense
1762    for the percentile fraction to be a value varying across rows.
1763    </para>
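   <para>
    An ordered-set aggregate can be combined with <literal>GROUP BY</literal>
    like any other aggregate.  The following sketch replaces the
    <structname>households</structname> table with a small inline data set
    and adds a <structfield>region</structfield> column; both are
    assumptions made for illustration:
<programlisting>
WITH households (region, income) AS (
    VALUES ('north', 10), ('north', 20), ('north', 30),
           ('south', 40), ('south', 60)
)
SELECT region,
       percentile_cont(0.5) WITHIN GROUP (ORDER BY income) AS median
FROM households
GROUP BY region
ORDER BY region;
 region | median
--------+--------
 north  |     20
 south  |     50
</programlisting>
    The direct argument <literal>0.5</literal> is evaluated once per group,
    while <structfield>income</structfield> is evaluated once per input row.
   </para>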
1764
1765    <para>
1766     If <literal>FILTER</literal> is specified, then only the input
1767     rows for which the <replaceable>filter_clause</replaceable>
1768     evaluates to true are fed to the aggregate function; other rows
1769     are discarded.  For example:
1770 <programlisting>
1771 SELECT
1772     count(*) AS unfiltered,
1773     count(*) FILTER (WHERE i &lt; 5) AS filtered
1774 FROM generate_series(1,10) AS s(i);
1775  unfiltered | filtered
1776 ------------+----------
1777          10 |        4
1778 (1 row)
1779 </programlisting>
1780    </para>
1781
1782    <para>
1783     The predefined aggregate functions are described in <xref
1784     linkend="functions-aggregate"/>.  Other aggregate functions can be added
1785     by the user.
1786    </para>
1787
1788    <para>
1789     An aggregate expression can only appear in the result list or
1790     <literal>HAVING</literal> clause of a <command>SELECT</command> command.
1791     It is forbidden in other clauses, such as <literal>WHERE</literal>,
1792     because those clauses are logically evaluated before the results
1793     of aggregates are formed.
1794    </para>
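   <para>
    As a sketch of this rule, assume a table <structname>cities</structname>
    with columns <structfield>state</structfield> and
    <structfield>pop</structfield> (the same hypothetical table used in the
    scalar-subquery example later in this chapter):
<programlisting>
-- Rejected: WHERE is evaluated before the aggregate results exist.
SELECT state FROM cities WHERE max(pop) &gt; 1000000 GROUP BY state;

-- Accepted: HAVING is evaluated after grouping and aggregation.
SELECT state FROM cities GROUP BY state HAVING max(pop) &gt; 1000000;
</programlisting>
   </para>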
1795
1796    <para>
1797     When an aggregate expression appears in a subquery (see
1798     <xref linkend="sql-syntax-scalar-subqueries"/> and
1799     <xref linkend="functions-subquery"/>), the aggregate is normally
1800     evaluated over the rows of the subquery.  But an exception occurs
1801     if the aggregate's arguments (and <replaceable>filter_clause</replaceable>
1802     if any) contain only outer-level variables:
1803     the aggregate then belongs to the nearest such outer level, and is
1804     evaluated over the rows of that query.  The aggregate expression
1805     as a whole is then an outer reference for the subquery it appears in,
1806     and acts as a constant over any one evaluation of that subquery.
1807     The restriction about
1808     appearing only in the result list or <literal>HAVING</literal> clause
1809     applies with respect to the query level that the aggregate belongs to.
1810    </para>
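   <para>
    The following sketch shows an aggregate that belongs to an outer query
    level.  The inline tables <structname>cities</structname> and
    <structname>size_classes</structname> and all of their contents are
    hypothetical:
<programlisting>
WITH cities (name, state, pop) AS (
    VALUES ('A', 'X', 100), ('B', 'X', 5000), ('C', 'Y', 300)
), size_classes (label, lo, hi) AS (
    VALUES ('small', 0, 999), ('large', 1000, 999999)
)
SELECT c.state,
       -- max(c.pop) references only the outer table c, so it is computed
       -- by the outer query, once per state, and is a constant inside the
       -- subquery.
       (SELECT sc.label
        FROM size_classes sc
        WHERE max(c.pop) BETWEEN sc.lo AND sc.hi) AS largest_city_class
FROM cities c
GROUP BY c.state;
</programlisting>
    Although the aggregate is written inside the subquery's
    <literal>WHERE</literal> clause, it is acceptable here because, at the
    query level it belongs to, it appears within the result list.
   </para>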
1811   </sect2>
1812
1813   <sect2 id="syntax-window-functions">
1814    <title>Window Function Calls</title>
1815
1816    <indexterm zone="syntax-window-functions">
1817     <primary>window function</primary>
1818     <secondary>invocation</secondary>
1819    </indexterm>
1820
1821    <indexterm zone="syntax-window-functions">
1822     <primary>OVER clause</primary>
1823    </indexterm>
1824
1825    <para>
1826     A <firstterm>window function call</firstterm> represents the application
1827     of an aggregate-like function over some portion of the rows selected
1828     by a query.  Unlike non-window aggregate calls, this is not tied
1829     to grouping of the selected rows into a single output row &mdash; each
1830     row remains separate in the query output.  However, the window function
1831     has access to all the rows that would be part of the current row's
1832     group according to the grouping specification (<literal>PARTITION BY</literal>
1833     list) of the window function call.
1834     The syntax of a window function call is one of the following:
1835
1836 <synopsis>
1837 <replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ] OVER <replaceable>window_name</replaceable>
1838 <replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ] OVER ( <replaceable class="parameter">window_definition</replaceable> )
1839 <replaceable>function_name</replaceable> ( * ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ] OVER <replaceable>window_name</replaceable>
1840 <replaceable>function_name</replaceable> ( * ) [ FILTER ( WHERE <replaceable>filter_clause</replaceable> ) ] OVER ( <replaceable class="parameter">window_definition</replaceable> )
1841 </synopsis>
1842     where <replaceable class="parameter">window_definition</replaceable>
1843     has the syntax
1844 <synopsis>
1845 [ <replaceable class="parameter">existing_window_name</replaceable> ]
1846 [ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
1847 [ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
1848 [ <replaceable class="parameter">frame_clause</replaceable> ]
1849 </synopsis>
1850     The optional <replaceable class="parameter">frame_clause</replaceable>
1851     can be one of
1852 <synopsis>
1853 { RANGE | ROWS | GROUPS } <replaceable>frame_start</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
1854 { RANGE | ROWS | GROUPS } BETWEEN <replaceable>frame_start</replaceable> AND <replaceable>frame_end</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
1855 </synopsis>
1856     where <replaceable>frame_start</replaceable>
1857     and <replaceable>frame_end</replaceable> can be one of
1858 <synopsis>
1859 UNBOUNDED PRECEDING
1860 <replaceable>offset</replaceable> PRECEDING
1861 CURRENT ROW
1862 <replaceable>offset</replaceable> FOLLOWING
1863 UNBOUNDED FOLLOWING
1864 </synopsis>
1865     and <replaceable>frame_exclusion</replaceable> can be one of
1866 <synopsis>
1867 EXCLUDE CURRENT ROW
1868 EXCLUDE GROUP
1869 EXCLUDE TIES
1870 EXCLUDE NO OTHERS
1871 </synopsis>
1872    </para>
1873
1874    <para>
1875     Here, <replaceable>expression</replaceable> represents any value
1876     expression that does not itself contain window function calls.
1877    </para>
1878
1879    <para>
1880     <replaceable>window_name</replaceable> is a reference to a named window
1881     specification defined in the query's <literal>WINDOW</literal> clause.
1882     Alternatively, a full <replaceable>window_definition</replaceable> can
1883     be given within parentheses, using the same syntax as for defining a
1884     named window in the <literal>WINDOW</literal> clause; see the
1885     <xref linkend="sql-select"/> reference page for details.  Note that
1886     <literal>OVER wname</literal> is not exactly equivalent to
1887     <literal>OVER (wname ...)</literal>; the latter implies copying and modifying the
1888     window definition, and will be rejected if the referenced window
1889     specification includes a frame clause.
1890    </para>
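   <para>
    The difference between the two spellings can be sketched as follows,
    assuming a table <structname>empsalary</structname> with columns
    <structfield>depname</structfield>, <structfield>empno</structfield>
    and <structfield>salary</structfield> (an assumption; the table is not
    defined in this chapter):
<programlisting>
SELECT depname, empno, salary,
       -- Uses the named window exactly as defined.
       sum(salary) OVER w AS dept_total,
       -- Copies w and adds an ORDER BY to the copy.
       rank() OVER (w ORDER BY salary DESC) AS dept_rank
FROM empsalary
WINDOW w AS (PARTITION BY depname);
</programlisting>
    If <literal>w</literal> were defined with a frame clause, the first call
    would still be accepted but the second would be rejected.
   </para>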
1891
1892    <para>
1893     The <literal>PARTITION BY</literal> clause groups the rows of the query into
1894     <firstterm>partitions</firstterm>, which are processed separately by the window
1895     function.  <literal>PARTITION BY</literal> works similarly to a query-level
1896     <literal>GROUP BY</literal> clause, except that its expressions are always just
1897     expressions and cannot be output-column names or numbers.
1898     Without <literal>PARTITION BY</literal>, all rows produced by the query are
1899     treated as a single partition.
1900     The <literal>ORDER BY</literal> clause determines the order in which the rows
1901     of a partition are processed by the window function.  It works similarly
1902     to a query-level <literal>ORDER BY</literal> clause, but likewise cannot use
1903     output-column names or numbers.  Without <literal>ORDER BY</literal>, rows are
1904     processed in an unspecified order.
1905    </para>
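   <para>
    A small sketch of partitioning and ordering, using
    <function>generate_series</function> as a stand-in data source:
<programlisting>
SELECT i,
       i % 2 AS parity,
       count(*) OVER () AS all_rows,
       count(*) OVER (PARTITION BY i % 2) AS in_partition,
       row_number() OVER (PARTITION BY i % 2 ORDER BY i) AS pos
FROM generate_series(1, 5) AS s(i)
ORDER BY i;
 i | parity | all_rows | in_partition | pos
---+--------+----------+--------------+-----
 1 |      1 |        5 |            3 |   1
 2 |      0 |        5 |            2 |   1
 3 |      1 |        5 |            3 |   2
 4 |      0 |        5 |            2 |   2
 5 |      1 |        5 |            3 |   3
</programlisting>
    With an empty <literal>OVER ()</literal> every row sees the whole result
    set; with <literal>PARTITION BY</literal> each row sees only the rows
    sharing its parity.
   </para>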
1906
1907    <para>
1908     The <replaceable class="parameter">frame_clause</replaceable> specifies
1909     the set of rows constituting the <firstterm>window frame</firstterm>, which is a
1910     subset of the current partition, for those window functions that act on
1911     the frame instead of the whole partition.  The set of rows in the frame
1912     can vary depending on which row is the current row.  The frame can be
1913     specified in <literal>RANGE</literal>, <literal>ROWS</literal>
1914     or <literal>GROUPS</literal> mode; in each case, it runs from
1915     the <replaceable>frame_start</replaceable> to
1916     the <replaceable>frame_end</replaceable>.
1917     If <replaceable>frame_end</replaceable> is omitted, the end defaults
1918     to <literal>CURRENT ROW</literal>.
1919    </para>
1920
1921    <para>
1922     A <replaceable>frame_start</replaceable> of <literal>UNBOUNDED PRECEDING</literal> means
1923     that the frame starts with the first row of the partition, and similarly
1924     a <replaceable>frame_end</replaceable> of <literal>UNBOUNDED FOLLOWING</literal> means
1925     that the frame ends with the last row of the partition.
1926    </para>
1927
1928    <para>
1929     In <literal>RANGE</literal> or <literal>GROUPS</literal> mode,
1930     a <replaceable>frame_start</replaceable> of
1931     <literal>CURRENT ROW</literal> means the frame starts with the current
1932     row's first <firstterm>peer</firstterm> row (a row that the
1933     window's <literal>ORDER BY</literal> clause sorts as equivalent to the
1934     current row), while a <replaceable>frame_end</replaceable> of
1935     <literal>CURRENT ROW</literal> means the frame ends with the current
1936     row's last peer row.
1937     In <literal>ROWS</literal> mode, <literal>CURRENT ROW</literal> simply
1938     means the current row.
1939    </para>
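   <para>
    The effect of peer rows on <literal>CURRENT ROW</literal> can be seen
    by running the same sum in <literal>ROWS</literal> and
    <literal>RANGE</literal> mode over data containing a tie.  The inline
    data set is an assumption made for illustration:
<programlisting>
WITH vals (v) AS ( VALUES (1), (2), (2), (3) )
SELECT v,
       sum(v) OVER (ORDER BY v
                    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum,
       sum(v) OVER (ORDER BY v
                    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum
FROM vals
ORDER BY v, rows_sum;
 v | rows_sum | range_sum
---+----------+-----------
 1 |        1 |         1
 2 |        3 |         5
 2 |        5 |         5
 3 |        8 |         8
</programlisting>
    In <literal>RANGE</literal> mode both rows with <literal>v = 2</literal>
    are peers, so each frame extends through the last of them.
   </para>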
1940
1941    <para>
1942     In the <replaceable>offset</replaceable> <literal>PRECEDING</literal>
1943     and <replaceable>offset</replaceable> <literal>FOLLOWING</literal> frame
1944     options, the <replaceable>offset</replaceable> must be an expression not
1945     containing any variables, aggregate functions, or window functions.
1946     The meaning of the <replaceable>offset</replaceable> depends on the
1947     frame mode:
1948     <itemizedlist>
1949      <listitem>
1950       <para>
1951        In <literal>ROWS</literal> mode,
1952        the <replaceable>offset</replaceable> must yield a non-null,
1953        non-negative integer, and the option means that the frame starts or
1954        ends the specified number of rows before or after the current row.
1955       </para>
1956      </listitem>
1957      <listitem>
1958       <para>
1959        In <literal>GROUPS</literal> mode,
1960        the <replaceable>offset</replaceable> again must yield a non-null,
1961        non-negative integer, and the option means that the frame starts or
1962        ends the specified number of <firstterm>peer groups</firstterm>
1963        before or after the current row's peer group, where a peer group is a
1964        set of rows that are equivalent in the <literal>ORDER BY</literal>
1965        ordering.  (There must be an <literal>ORDER BY</literal> clause
1966        in the window definition to use <literal>GROUPS</literal> mode.)
1967       </para>
1968      </listitem>
1969      <listitem>
1970       <para>
1971        In <literal>RANGE</literal> mode, these options require that
1972        the <literal>ORDER BY</literal> clause specify exactly one column.
1973        The <replaceable>offset</replaceable> specifies the maximum
1974        difference between the value of that column in the current row and
1975        its value in preceding or following rows of the frame.  The data type
1976        of the <replaceable>offset</replaceable> expression varies depending
1977        on the data type of the ordering column.  For numeric ordering
1978        columns it is typically of the same type as the ordering column,
1979        but for datetime ordering columns it is an <type>interval</type>.
1980        For example, if the ordering column is of type <type>date</type>
1981        or <type>timestamp</type>, one could write <literal>RANGE BETWEEN
1982        '1 day' PRECEDING AND '10 days' FOLLOWING</literal>.
1983        The <replaceable>offset</replaceable> is still required to be
1984        non-null and non-negative, though the meaning
1985        of <quote>non-negative</quote> depends on its data type.
1986       </para>
1987      </listitem>
1988     </itemizedlist>
1989     In any case, the distance to the end of the frame is limited by the
1990     distance to the end of the partition, so that for rows near the partition
1991     ends the frame might contain fewer rows than elsewhere.
1992    </para>
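   <para>
    The three interpretations of an <replaceable>offset</replaceable> of
    <literal>1</literal> can be compared in a single query.  This is a
    sketch over an inline data set chosen for illustration:
<programlisting>
WITH vals (v) AS ( VALUES (1), (2), (2), (5) )
SELECT v,
       sum(v) OVER (w ROWS   BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS rows_sum,
       sum(v) OVER (w GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS groups_sum,
       sum(v) OVER (w RANGE  BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS range_sum
FROM vals
WINDOW w AS (ORDER BY v)
ORDER BY v, rows_sum;
 v | rows_sum | groups_sum | range_sum
---+----------+------------+-----------
 1 |        3 |          5 |         5
 2 |        5 |         10 |         5
 2 |        9 |         10 |         5
 5 |        7 |          9 |         5
</programlisting>
    <literal>ROWS</literal> counts physical rows, <literal>GROUPS</literal>
    counts peer groups, and <literal>RANGE</literal> includes rows whose
    <literal>v</literal> lies within 1 of the current row's value.
   </para>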
1993
1994    <para>
1995     Notice that in both <literal>ROWS</literal> and <literal>GROUPS</literal>
1996     mode, <literal>0 PRECEDING</literal> and <literal>0 FOLLOWING</literal>
1997     are equivalent to <literal>CURRENT ROW</literal>.  This normally holds
1998     in <literal>RANGE</literal> mode as well, for an appropriate
1999     data-type-specific meaning of <quote>zero</quote>.
2000    </para>
2001
2002    <para>
2003     The <replaceable>frame_exclusion</replaceable> option allows rows around
2004     the current row to be excluded from the frame, even if they would be
2005     included according to the frame start and frame end options.
2006     <literal>EXCLUDE CURRENT ROW</literal> excludes the current row from the
2007     frame.
2008     <literal>EXCLUDE GROUP</literal> excludes the current row and its
2009     ordering peers from the frame.
2010     <literal>EXCLUDE TIES</literal> excludes any peers of the current
2011     row from the frame, but not the current row itself.
2012     <literal>EXCLUDE NO OTHERS</literal> simply specifies explicitly the
2013     default behavior of not excluding the current row or its peers.
2014    </para>
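   <para>
    The exclusion options can be sketched with a frame that otherwise
    covers the whole partition, so that any difference in the sums is due
    to the exclusion alone.  The inline data set is an assumption:
<programlisting>
WITH vals (v) AS ( VALUES (1), (2), (2), (3) )
SELECT v,
       sum(v) OVER (w ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                      EXCLUDE CURRENT ROW) AS excl_current,
       sum(v) OVER (w ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                      EXCLUDE GROUP) AS excl_group,
       sum(v) OVER (w ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                      EXCLUDE TIES) AS excl_ties
FROM vals
WINDOW w AS (ORDER BY v)
ORDER BY v;
 v | excl_current | excl_group | excl_ties
---+--------------+------------+-----------
 1 |            7 |          7 |         8
 2 |            6 |          4 |         6
 2 |            6 |          4 |         6
 3 |            5 |          5 |         8
</programlisting>
   </para>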
2015
2016    <para>
2017     The default framing option is <literal>RANGE UNBOUNDED PRECEDING</literal>,
2018     which is the same as <literal>RANGE BETWEEN UNBOUNDED PRECEDING AND
2019     CURRENT ROW</literal>.  With <literal>ORDER BY</literal>, this sets the frame to be
2020     all rows from the partition start up through the current row's last
2021     <literal>ORDER BY</literal> peer.  Without <literal>ORDER BY</literal>,
2022     this means all rows of the partition are included in the window frame,
2023     since all rows become peers of the current row.
2024    </para>
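   <para>
    The default frame explains why adding <literal>ORDER BY</literal> turns
    a total into a running total.  A minimal sketch, using
    <function>generate_series</function> as a stand-in data source:
<programlisting>
SELECT i,
       sum(i) OVER (ORDER BY i) AS running_total,
       sum(i) OVER ()           AS grand_total
FROM generate_series(1, 4) AS s(i)
ORDER BY i;
 i | running_total | grand_total
---+---------------+-------------
 1 |             1 |          10
 2 |             3 |          10
 3 |             6 |          10
 4 |            10 |          10
</programlisting>
   </para>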
2025
2026    <para>
2027     Restrictions are that
2028     <replaceable>frame_start</replaceable> cannot be <literal>UNBOUNDED FOLLOWING</literal>,
2029     <replaceable>frame_end</replaceable> cannot be <literal>UNBOUNDED PRECEDING</literal>,
2030     and the <replaceable>frame_end</replaceable> choice cannot appear earlier in the
2031     above list of <replaceable>frame_start</replaceable>
2032     and <replaceable>frame_end</replaceable> options than
2033     the <replaceable>frame_start</replaceable> choice does &mdash; for example
2034     <literal>RANGE BETWEEN CURRENT ROW AND <replaceable>offset</replaceable>
2035     PRECEDING</literal> is not allowed.
2036     But, for example, <literal>ROWS BETWEEN 7 PRECEDING AND 8
2037     PRECEDING</literal> is allowed, even though it would never select any
2038     rows.
2039    </para>
2040
2041    <para>
2042     If <literal>FILTER</literal> is specified, then only the input
2043     rows for which the <replaceable>filter_clause</replaceable>
2044     evaluates to true are fed to the window function; other rows
2045     are discarded.  Only window functions that are aggregates accept
2046     a <literal>FILTER</literal> clause.
2047    </para>
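   <para>
    For instance, the following sketch counts only the even values seen so
    far; the data source is again <function>generate_series</function>:
<programlisting>
SELECT i,
       count(*) FILTER (WHERE i % 2 = 0) OVER (ORDER BY i) AS evens_so_far
FROM generate_series(1, 5) AS s(i)
ORDER BY i;
 i | evens_so_far
---+--------------
 1 |            0
 2 |            1
 3 |            1
 4 |            2
 5 |            2
</programlisting>
   </para>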
2048
2049    <para>
2050     The built-in window functions are described in <xref
2051     linkend="functions-window-table"/>.  Other window functions can be added by
2052     the user.  Also, any built-in or user-defined general-purpose or
2053     statistical aggregate can be used as a window function.  (Ordered-set
2054     and hypothetical-set aggregates cannot presently be used as window functions.)
2055    </para>
2056
2057    <para>
2058     The syntaxes using <literal>*</literal> are used for calling parameter-less
2059     aggregate functions as window functions, for example
2060     <literal>count(*) OVER (PARTITION BY x ORDER BY y)</literal>.
2061     The asterisk (<literal>*</literal>) is customarily not used for
2062     window-specific functions.  Window-specific functions do not
2063     allow <literal>DISTINCT</literal> or <literal>ORDER BY</literal> to be used within the
2064     function argument list.
2065    </para>
2066
2067    <para>
2068     Window function calls are permitted only in the <literal>SELECT</literal>
2069     list and the <literal>ORDER BY</literal> clause of the query.
2070    </para>
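   <para>
    One common way to filter on the result of a window function despite
    this restriction is to compute it in a sub-select and filter in the
    outer query.  A sketch, with the alias <literal>ss</literal> and column
    name <literal>pos</literal> chosen for illustration:
<programlisting>
SELECT i, pos
FROM (SELECT i,
             row_number() OVER (ORDER BY i DESC) AS pos
      FROM generate_series(1, 5) AS s(i)) AS ss
WHERE pos &lt;= 2
ORDER BY pos;
 i | pos
---+-----
 5 |   1
 4 |   2
</programlisting>
   </para>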
2071
2072    <para>
2073     More information about window functions can be found in
2074     <xref linkend="tutorial-window"/>,
2075     <xref linkend="functions-window"/>, and
2076     <xref linkend="queries-window"/>.
2077    </para>
2078   </sect2>
2079
2080   <sect2 id="sql-syntax-type-casts">
2081    <title>Type Casts</title>
2082
2083    <indexterm>
2084     <primary>data type</primary>
2085     <secondary>type cast</secondary>
2086    </indexterm>
2087
2088    <indexterm>
2089     <primary>type cast</primary>
2090    </indexterm>
2091
2092    <indexterm>
2093     <primary>::</primary>
2094    </indexterm>
2095
2096    <para>
2097     A type cast specifies a conversion from one data type to another.
2098     <productname>PostgreSQL</productname> accepts two equivalent syntaxes
2099     for type casts:
2100 <synopsis>
2101 CAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable> )
2102 <replaceable>expression</replaceable>::<replaceable>type</replaceable>
2103 </synopsis>
2104     The <literal>CAST</literal> syntax conforms to SQL; the syntax with
2105     <literal>::</literal> is historical <productname>PostgreSQL</productname>
2106     usage.
2107    </para>
2108
2109    <para>
2110     When a cast is applied to a value expression of a known type, it
2111     represents a run-time type conversion.  The cast will succeed only
2112     if a suitable type conversion operation has been defined.  Notice that this
2113     is subtly different from the use of casts with constants, as shown in
2114     <xref linkend="sql-syntax-constants-generic"/>.  A cast applied to an
2115     unadorned string literal represents the initial assignment of a type
2116     to a literal constant value, and so it will succeed for any type
2117     (if the contents of the string literal are acceptable input syntax for the
2118     data type).
2119    </para>
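   <para>
    The two situations can be sketched as follows; the literal values are
    arbitrary:
<programlisting>
-- Run-time conversion of a value of known type (numeric to integer).
SELECT CAST(3.7 AS integer) AS via_cast, 3.7::integer AS via_colons;
 via_cast | via_colons
----------+------------
        4 |          4

-- Initial typing of an unadorned string literal: the string only has to
-- be acceptable input syntax for the target type.
SELECT CAST('2024-02-29' AS date) AS d;
</programlisting>
   </para>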
2120
2121    <para>
2122     An explicit type cast can usually be omitted if there is no ambiguity as
2123     to the type that a value expression must produce (for example, when it is
2124     assigned to a table column); the system will automatically apply a
2125     type cast in such cases.  However, automatic casting is only done for
2126     casts that are marked <quote>OK to apply implicitly</quote>
2127     in the system catalogs.  Other casts must be invoked with
2128     explicit casting syntax.  This restriction is intended to prevent
2129     surprising conversions from being applied silently.
2130    </para>
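   <para>
    As an illustration of the distinction, the following sketch contrasts
    a conversion that is applied automatically with one that is not:
<programlisting>
-- Accepted: the integer argument is converted implicitly to a type
-- that sqrt accepts.
SELECT sqrt(4);

-- Rejected: integer is not converted to text implicitly, so no matching
-- length function is found.
SELECT length(42);

-- Accepted: the conversion is requested explicitly.
SELECT length(42::text);
</programlisting>
   </para>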
2131
2132    <para>
2133     It is also possible to specify a type cast using a function-like
2134     syntax:
2135 <synopsis>
2136 <replaceable>typename</replaceable> ( <replaceable>expression</replaceable> )
2137 </synopsis>
2138     However, this only works for types whose names are also valid as
2139     function names.  For example, <literal>double precision</literal>
2140     cannot be used this way, but the equivalent <literal>float8</literal>
2141     can.  Also, the names <literal>interval</literal>, <literal>time</literal>, and
2142     <literal>timestamp</literal> can only be used in this fashion if they are
2143     double-quoted, because of syntactic conflicts.  Because of these
2144     inconsistencies, the function-like cast syntax is best avoided in favor
2145     of one of the two standard cast syntaxes.
2146    </para>
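   <para>
    The following sketch shows the cases just described:
<programlisting>
-- Accepted: float8 is valid as a function name.
SELECT float8(42);

-- Accepted only because the type name is double-quoted.
SELECT "timestamp"(current_date);

-- Not accepted: a two-word type name cannot be used as a function name.
-- SELECT double precision(42);
</programlisting>
   </para>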
2147
2148    <note>
2149     <para>
2150      The function-like syntax is in fact just a function call.  When
2151      one of the two standard cast syntaxes is used to do a run-time
2152      conversion, it will internally invoke a registered function to
2153      perform the conversion.  By convention, these conversion functions
2154      have the same name as their output type, and thus the <quote>function-like
2155      syntax</quote> is nothing more than a direct invocation of the underlying
2156      conversion function.  A portable application should not rely on
2157      this behavior.  For further details see
2158      <xref linkend="sql-createcast"/>.
2159     </para>
2160    </note>
2161   </sect2>
2162
2163   <sect2 id="sql-syntax-collate-exprs">
2164    <title>Collation Expressions</title>
2165
2166    <indexterm>
2167     <primary>COLLATE</primary>
2168    </indexterm>
2169
2170    <para>
2171     The <literal>COLLATE</literal> clause overrides the collation of
2172     an expression.  It is appended to the expression it applies to:
2173 <synopsis>
2174 <replaceable>expr</replaceable> COLLATE <replaceable>collation</replaceable>
2175 </synopsis>
2176     where <replaceable>collation</replaceable> is a possibly
2177     schema-qualified identifier.  The <literal>COLLATE</literal>
2178     clause binds tighter than operators; parentheses can be used when
2179     necessary.
2180    </para>
2181
2182    <para>
2183     If no collation is explicitly specified, the database system
2184     either derives a collation from the columns involved in the
2185     expression, or it defaults to the default collation of the
2186     database if no column is involved in the expression.
2187    </para>
2188
2189    <para>
2190     The two common uses of the <literal>COLLATE</literal> clause are
2191     overriding the sort order in an <literal>ORDER BY</literal> clause, for
2192     example:
2193 <programlisting>
2194 SELECT a, b, c FROM tbl WHERE ... ORDER BY a COLLATE "C";
2195 </programlisting>
2196     and overriding the collation of a function or operator call that
2197     has locale-sensitive results, for example:
2198 <programlisting>
2199 SELECT * FROM tbl WHERE a &gt; 'foo' COLLATE "C";
2200 </programlisting>
2201     Note that in the latter case the <literal>COLLATE</literal> clause is
2202     attached to an input argument of the operator we wish to affect.
2203     It doesn't matter which argument of the operator or function call the
2204     <literal>COLLATE</literal> clause is attached to, because the collation that is
2205     applied by the operator or function is derived by considering all
2206     arguments, and an explicit <literal>COLLATE</literal> clause will override the
2207     collations of all other arguments.  (Attaching non-matching
2208     <literal>COLLATE</literal> clauses to more than one argument, however, is an
2209     error.  For more details see <xref linkend="collation"/>.)
2210     Thus, this gives the same result as the previous example:
2211 <programlisting>
2212 SELECT * FROM tbl WHERE a COLLATE "C" &gt; 'foo';
2213 </programlisting>
2214     But this is an error:
2215 <programlisting>
2216 SELECT * FROM tbl WHERE (a &gt; 'foo') COLLATE "C";
2217 </programlisting>
2218     because it attempts to apply a collation to the result of the
2219     <literal>&gt;</literal> operator, which is of the non-collatable data type
2220     <type>boolean</type>.
2221    </para>
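   <para>
    The rule about non-matching explicit collations can be sketched with
    the <literal>"C"</literal> and <literal>"POSIX"</literal> collations;
    the string literals are arbitrary:
<programlisting>
-- Rejected: the two arguments carry different explicit collations.
SELECT 'a' COLLATE "C" &lt; 'b' COLLATE "POSIX";

-- Accepted: only one argument carries an explicit collation, and it
-- determines the collation used by the operator.
SELECT 'a' &lt; ('b' COLLATE "C");
</programlisting>
   </para>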
2222   </sect2>
2223
2224   <sect2 id="sql-syntax-scalar-subqueries">
2225    <title>Scalar Subqueries</title>
2226
2227    <indexterm>
2228     <primary>subquery</primary>
2229    </indexterm>
2230
2231    <para>
2232     A scalar subquery is an ordinary
2233     <command>SELECT</command> query in parentheses that returns exactly one
2234     row with one column.  (See <xref linkend="queries"/> for information about writing queries.)
2235     The <command>SELECT</command> query is executed
2236     and the single returned value is used in the surrounding value expression.
2237     It is an error to use a query that
2238     returns more than one row or more than one column as a scalar subquery.
2239     (But if, during a particular execution, the subquery returns no rows,
2240     there is no error; the scalar result is taken to be null.)
2241     The subquery can refer to variables from the surrounding query,
2242     which will act as constants during any one evaluation of the subquery.
2243     See also <xref linkend="functions-subquery"/> for other expressions involving subqueries.
2244    </para>
2245
2246    <para>
2247     For example, the following finds the largest city population in each
2248     state:
2249 <programlisting>
2250 SELECT name, (SELECT max(pop) FROM cities WHERE cities.state = states.name)
2251     FROM states;
2252 </programlisting>
2253    </para>
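
   <para>
    The row-count rules can be tried without any tables.  In this sketch,
    the first subquery returns no rows and so yields null, while the second
    returns two rows and is expected to fail at run time:
<programlisting>
-- No rows: the scalar result is null, so this returns true
SELECT (SELECT 1 WHERE false) IS NULL;

-- Two rows: expected to raise an error when executed
SELECT (SELECT x FROM (VALUES (1), (2)) AS v(x));
</programlisting>
   </para>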
2254   </sect2>
2255
2256   <sect2 id="sql-syntax-array-constructors">
2257    <title>Array Constructors</title>
2258
2259    <indexterm>
2260     <primary>array</primary>
2261     <secondary>constructor</secondary>
2262    </indexterm>
2263
2264    <indexterm>
2265     <primary>ARRAY</primary>
2266    </indexterm>
2267
2268    <para>
2269     An array constructor is an expression that builds an
2270     array value using values for its member elements.  A simple array
2271     constructor
2272     consists of the key word <literal>ARRAY</literal>, a left square bracket
2273     <literal>[</literal>, a list of expressions (separated by commas) for the
2274     array element values, and finally a right square bracket <literal>]</literal>.
2275     For example:
2276 <programlisting>
2277 SELECT ARRAY[1,2,3+4];
2278   array
2279 ---------
2280  {1,2,7}
2281 (1 row)
2282 </programlisting>
2283     By default,
2284     the array element type is the common type of the member expressions,
2285     determined using the same rules as for <literal>UNION</literal> or
2286     <literal>CASE</literal> constructs (see <xref linkend="typeconv-union-case"/>).
2287     You can override this by explicitly casting the array constructor to the
2288     desired type, for example:
2289 <programlisting>
2290 SELECT ARRAY[1,2,22.7]::integer[];
2291   array
2292 ----------
2293  {1,2,23}
2294 (1 row)
2295 </programlisting>
2296     This has the same effect as casting each expression to the array
2297     element type individually.
2298     For more on casting, see <xref linkend="sql-syntax-type-casts"/>.
2299    </para>
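
   <para>
    To check which element type was chosen, <function>pg_typeof</function>
    can be applied to the constructor.  In this sketch the members are an
    integer and a numeric constant, so the common type should be
    <type>numeric</type>:
<programlisting>
SELECT pg_typeof(ARRAY[1, 2.5]);
 pg_typeof
-----------
 numeric[]
(1 row)
</programlisting>
   </para>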
2300
2301    <para>
2302     Multidimensional array values can be built by nesting array
2303     constructors.
2304     In the inner constructors, the key word <literal>ARRAY</literal> can
2305     be omitted.  For example, these produce the same result:
2306
2307 <programlisting>
2308 SELECT ARRAY[ARRAY[1,2], ARRAY[3,4]];
2309      array
2310 ---------------
2311  {{1,2},{3,4}}
2312 (1 row)
2313
2314 SELECT ARRAY[[1,2],[3,4]];
2315      array
2316 ---------------
2317  {{1,2},{3,4}}
2318 (1 row)
2319 </programlisting>
2320
2321     Since multidimensional arrays must be rectangular, inner constructors
2322     at the same level must produce sub-arrays of identical dimensions.
2323     Any cast applied to the outer <literal>ARRAY</literal> constructor propagates
2324     automatically to all the inner constructors.
2325   </para>
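
  <para>
   As an illustration of both rules, the first query below relies on the
   outer cast reaching the inner constructors, and the second is expected
   to be rejected because its sub-arrays have different lengths:
<programlisting>
SELECT ARRAY[[1,2],[3.7,4]]::integer[];
     array
---------------
 {{1,2},{4,4}}
(1 row)

-- Expected to fail: inner constructors of unequal length
SELECT ARRAY[[1,2],[3]];
</programlisting>
  </para>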
2326
2327   <para>
2328     Multidimensional array constructor elements can be anything yielding
2329     an array of the proper kind, not only a sub-<literal>ARRAY</literal> construct.
2330     For example:
2331 <programlisting>
2332 CREATE TABLE arr(f1 int[], f2 int[]);
2333
2334 INSERT INTO arr VALUES (ARRAY[[1,2],[3,4]], ARRAY[[5,6],[7,8]]);
2335
2336 SELECT ARRAY[f1, f2, '{{9,10},{11,12}}'::int[]] FROM arr;
2337                      array
2338 ------------------------------------------------
2339  {{{1,2},{3,4}},{{5,6},{7,8}},{{9,10},{11,12}}}
2340 (1 row)
2341 </programlisting>
2342   </para>
2343
2344   <para>
2345    You can construct an empty array, but since it's impossible to have an
2346    array with no type, you must explicitly cast your empty array to the
2347    desired type.  For example:
2348 <programlisting>
2349 SELECT ARRAY[]::integer[];
2350  array
2351 -------
2352  {}
2353 (1 row)
2354 </programlisting>
2355   </para>
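
  <para>
   For instance, omitting the cast is expected to fail, while a typed
   empty array behaves like any other array value.  This sketch uses
   <function>cardinality</function> to count its elements:
<programlisting>
-- Expected to fail: the element type cannot be determined
SELECT ARRAY[];

SELECT cardinality(ARRAY[]::text[]);
 cardinality
-------------
           0
(1 row)
</programlisting>
  </para>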
2356
2357   <para>
2358    It is also possible to construct an array from the results of a
2359    subquery.  In this form, the array constructor is written with the
2360    key word <literal>ARRAY</literal> followed by a parenthesized (not
2361    bracketed) subquery.  For example:
2362 <programlisting>
2363 SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
2364                               array
2365 ------------------------------------------------------------------
2366  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412}
2367 (1 row)
2368
2369 SELECT ARRAY(SELECT ARRAY[i, i*2] FROM generate_series(1,5) AS a(i));
2370               array
2371 ----------------------------------
2372  {{1,2},{2,4},{3,6},{4,8},{5,10}}
2373 (1 row)
2374 </programlisting>
2375    The subquery must return a single column.
2376    If the subquery's output column is of a non-array type, the resulting
2377    one-dimensional array will have an element for each row in the
2378    subquery result, with an element type matching that of the
2379    subquery's output column.
2380    If the subquery's output column is of an array type, the result will be
2381    an array of the same type but one higher dimension; in this case all
2382    the subquery rows must yield arrays of identical dimensionality, else
2383    the result would not be rectangular.
2384   </para>
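
  <para>
   The elements appear in the order in which the subquery returns its rows,
   so an <literal>ORDER BY</literal> inside the subquery can be used to
   control the element order.  A small sketch:
<programlisting>
SELECT ARRAY(SELECT i FROM generate_series(1,5) AS g(i) ORDER BY i DESC);
    array
-------------
 {5,4,3,2,1}
(1 row)
</programlisting>
  </para>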
2385
2386   <para>
2387    The subscripts of an array value built with <literal>ARRAY</literal>
2388    always begin with one.  For more information about arrays, see
2389    <xref linkend="arrays"/>.
2390   </para>
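
  <para>
   For example, the first element of a constructed array is addressed with
   subscript 1, and <function>array_lower</function> reports the same
   lower bound.  The column aliases here are illustrative only:
<programlisting>
SELECT (ARRAY[10,20,30])[1] AS first_elem,
       array_lower(ARRAY[10,20,30], 1) AS lower_bound;
 first_elem | lower_bound
------------+-------------
         10 |           1
(1 row)
</programlisting>
  </para>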
2391
2392   </sect2>
2393
2394   <sect2 id="sql-syntax-row-constructors">
2395    <title>Row Constructors</title>
2396
2397    <indexterm>
2398     <primary>composite type</primary>
2399     <secondary>constructor</secondary>
2400    </indexterm>
2401
2402    <indexterm>
2403     <primary>row type</primary>
2404     <secondary>constructor</secondary>
2405    </indexterm>
2406
2407    <indexterm>
2408     <primary>ROW</primary>
2409    </indexterm>
2410
2411    <para>
2412     A row constructor is an expression that builds a row value (also
2413     called a composite value) using values
2414     for its member fields.  A row constructor consists of the key word
2415     <literal>ROW</literal>, a left parenthesis, zero or more
2416     expressions (separated by commas) for the row field values, and finally
2417     a right parenthesis.  For example:
2418 <programlisting>
2419 SELECT ROW(1,2.5,'this is a test');
2420 </programlisting>
2421     The key word <literal>ROW</literal> is optional when there is more than one
2422     expression in the list.
2423    </para>
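
   <para>
    A short sketch of when the key word can be dropped: the first two
    queries build the same three-field row, whereas a single parenthesized
    expression is just that expression, so <literal>ROW</literal> is needed
    to build a one-field row:
<programlisting>
SELECT ROW(1, 2.5, 'this is a test');
SELECT (1, 2.5, 'this is a test');

SELECT (1);       -- an integer, not a row
SELECT ROW(1);    -- a row with one field
</programlisting>
   </para>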
2424
2425    <para>
2426     A row constructor can include the syntax
2427     <replaceable>rowvalue</replaceable><literal>.*</literal>,
2428     which will be expanded to a list of the elements of the row value,
2429     just as occurs when the <literal>.*</literal> syntax is used at the top level
2430     of a <command>SELECT</command> list (see <xref linkend="rowtypes-usage"/>).
2431     For example, if table <literal>t</literal> has
2432     columns <literal>f1</literal> and <literal>f2</literal>, these are the same:
2433 <programlisting>
2434 SELECT ROW(t.*, 42) FROM t;
2435 SELECT ROW(t.f1, t.f2, 42) FROM t;
2436 </programlisting>
2437    </para>
2438
2439    <note>
2440     <para>
2441      Before <productname>PostgreSQL</productname> 8.2, the
2442      <literal>.*</literal> syntax was not expanded in row constructors, so
2443      that writing <literal>ROW(t.*, 42)</literal> created a two-field row whose first
2444      field was another row value.  The current behavior is usually more useful.
2445      If you need the old behavior of nested row values, write the inner
2446      row value without <literal>.*</literal>, for instance
2447      <literal>ROW(t, 42)</literal>.
2448     </para>
2449    </note>
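
   <para>
    The difference between the expanded and nested forms can be seen in a
    self-contained sketch that uses a <literal>VALUES</literal> list in
    place of a real table.  The alias <literal>t</literal>, its column
    names, and the output aliases are illustrative only:
<programlisting>
SELECT ROW(t.*, 42) AS expanded, ROW(t, 42) AS nested
    FROM (VALUES (1, 'a')) AS t(f1, f2);
 expanded |    nested
----------+--------------
 (1,a,42) | ("(1,a)",42)
(1 row)
</programlisting>
    The first result has three fields, while the second has two, the first
    of which is itself a row value.
   </para>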
2450
2451    <para>
2452     By default, the value created by a <literal>ROW</literal> expression is of
2453     an anonymous record type.  If necessary, it can be cast to a named
2454     composite type &mdash; either the row type of a table, or a composite type
2455     created with <command>CREATE TYPE AS</command>.  An explicit cast might be needed
2456     to avoid ambiguity.  For example:
2457 <programlisting>
2458 CREATE TABLE mytable(f1 int, f2 float, f3 text);
2459
2460 CREATE FUNCTION getf1(mytable) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
2461
2462 -- No cast needed since only one getf1() exists
2463 SELECT getf1(ROW(1,2.5,'this is a test'));
2464  getf1
2465 -------
2466      1
2467 (1 row)
2468
2469 CREATE TYPE myrowtype AS (f1 int, f2 text, f3 numeric);
2470
2471 CREATE FUNCTION getf1(myrowtype) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
2472
2473 -- Now we need a cast to indicate which function to call:
2474 SELECT getf1(ROW(1,2.5,'this is a test'));
2475 ERROR:  function getf1(record) is not unique
2476
2477 SELECT getf1(ROW(1,2.5,'this is a test')::mytable);
2478  getf1
2479 -------
2480      1
2481 (1 row)
2482
2483 SELECT getf1(CAST(ROW(11,'this is a test',2.5) AS myrowtype));
2484  getf1
2485 -------
2486     11
2487 (1 row)
2488 </programlisting>
2489   </para>
2490
2491   <para>
2492    Row constructors can be used to build composite values to be stored
2493    in a composite-type table column, or to be passed to a function that
2494    accepts a composite parameter.  Also, it is possible to test rows
2495    using the standard comparison operators as described in <xref
2496    linkend="functions-comparison"/>, to compare one row against another
2497    as described in <xref linkend="functions-comparisons"/>, and to
2498    use them in connection with subqueries, as discussed in <xref
2499    linkend="functions-subquery"/>.
2500   </para>
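
  <para>
   As a brief sketch of row comparison, the following compares row
   constructors field by field from left to right:
<programlisting>
SELECT ROW(1,2) = ROW(1,2) AS equal, ROW(1,2) &lt; ROW(1,3) AS less;
 equal | less
-------+------
 t     | t
(1 row)
</programlisting>
  </para>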
2501
2502   </sect2>
2503
2504   <sect2 id="syntax-express-eval">
2505    <title>Expression Evaluation Rules</title>
2506
2507    <indexterm>
2508     <primary>expression</primary>
2509     <secondary>order of evaluation</secondary>
2510    </indexterm>
2511
2512    <para>
2513     The order of evaluation of subexpressions is not defined.  In
2514     particular, the inputs of an operator or function are not necessarily
2515     evaluated left-to-right or in any other fixed order.
2516    </para>
2517
2518    <para>
2519     Furthermore, if the result of an expression can be determined by
2520     evaluating only some parts of it, then other subexpressions
2521     might not be evaluated at all.  For instance, if one wrote:
2522 <programlisting>
2523 SELECT true OR somefunc();
2524 </programlisting>
2525     then <literal>somefunc()</literal> would (probably) not be called
2526     at all. The same would be the case if one wrote:
2527 <programlisting>
2528 SELECT somefunc() OR true;
2529 </programlisting>
2530     Note that this is not the same as the left-to-right
2531     <quote>short-circuiting</quote> of Boolean operators that is found
2532     in some programming languages.
2533    </para>
2534
2535    <para>
2536     As a consequence, it is unwise to use functions with side effects
2537     as part of complex expressions.  It is particularly dangerous to
2538     rely on side effects or evaluation order in <literal>WHERE</literal> and <literal>HAVING</literal> clauses,
2539     since those clauses are extensively reprocessed as part of
2540     developing an execution plan.  Boolean
2541     expressions (<literal>AND</literal>/<literal>OR</literal>/<literal>NOT</literal> combinations) in those clauses can be reorganized
2542     in any manner allowed by the laws of Boolean algebra.
2543    </para>
2544
2545    <para>
2546     When it is essential to force evaluation order, a <literal>CASE</literal>
2547     construct (see <xref linkend="functions-conditional"/>) can be
2548     used.  For example, this is an untrustworthy way of trying to
2549     avoid division by zero in a <literal>WHERE</literal> clause:
2550 <programlisting>
2551 SELECT ... WHERE x &gt; 0 AND y/x &gt; 1.5;
2552 </programlisting>
2553     But this is safe:
2554 <programlisting>
2555 SELECT ... WHERE CASE WHEN x &gt; 0 THEN y/x &gt; 1.5 ELSE false END;
2556 </programlisting>
2557     A <literal>CASE</literal> construct used in this fashion will defeat optimization
2558     attempts, so it should only be done when necessary.  (In this particular
2559     example, it would be better to sidestep the problem by writing
2560     <literal>y &gt; 1.5*x</literal> instead.)
2561    </para>
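
   <para>
    The protected form can be tried on inline data.  In this sketch the
    column names follow the example above and the values are made up; the
    row with <literal>x = 0</literal> is filtered out without the division
    being attempted for it:
<programlisting>
SELECT x, y
    FROM (VALUES (0, 1.0), (2, 4.0)) AS v(x, y)
    WHERE CASE WHEN x &gt; 0 THEN y/x &gt; 1.5 ELSE false END;
 x |  y
---+-----
 2 | 4.0
(1 row)
</programlisting>
   </para>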
2562
2563    <para>
2564     <literal>CASE</literal> is not a cure-all for such issues, however.
2565     One limitation of the technique illustrated above is that it does not
2566     prevent early evaluation of constant subexpressions.
2567     As described in <xref linkend="xfunc-volatility"/>, functions and
2568     operators marked <literal>IMMUTABLE</literal> can be evaluated when
2569     the query is planned rather than when it is executed.  Thus for example
2570 <programlisting>
2571 SELECT CASE WHEN x &gt; 0 THEN x ELSE 1/0 END FROM tab;
2572 </programlisting>
2573     is likely to result in a division-by-zero failure due to the planner
2574     trying to simplify the constant subexpression,
2575     even if every row in the table has <literal>x &gt; 0</literal> so that the
2576     <literal>ELSE</literal> arm would never be entered at run time.
2577    </para>
2578
2579    <para>
2580     While that particular example might seem silly, related cases that don't
2581     obviously involve constants can occur in queries executed within
2582     functions, since the values of function arguments and local variables
2583     can be inserted into queries as constants for planning purposes.
2584     Within <application>PL/pgSQL</application> functions, for example, using an
2585     <literal>IF</literal>-<literal>THEN</literal>-<literal>ELSE</literal> statement to protect
2586     a risky computation is much safer than just nesting it in a
2587     <literal>CASE</literal> expression.
2588    </para>
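
   <para>
    A minimal sketch of that approach follows.  The function name
    <function>safe_ratio</function> and its parameters are hypothetical;
    the point is only that the division sits in a branch that is not
    executed unless the test succeeds:
<programlisting>
CREATE FUNCTION safe_ratio(y numeric, x numeric) RETURNS numeric AS $$
BEGIN
    IF x &gt; 0 THEN
        RETURN y / x;       -- reached only when x is positive
    ELSE
        RETURN NULL;
    END IF;
END;
$$ LANGUAGE plpgsql;

SELECT safe_ratio(3, 0);    -- returns null instead of raising an error
</programlisting>
   </para>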
2589
2590    <para>
2591     Another limitation of the same kind is that a <literal>CASE</literal> cannot
2592     prevent evaluation of an aggregate expression contained within it,
2593     because aggregate expressions are computed before other
2594     expressions in a <literal>SELECT</literal> list or <literal>HAVING</literal> clause
2595     are considered.  For example, the following query can cause a
2596     division-by-zero error despite seemingly having protected against it:
2597 <programlisting>
2598 SELECT CASE WHEN min(employees) &gt; 0
2599             THEN avg(expenses / employees)
2600        END
2601     FROM departments;
2602 </programlisting>
2603     The <function>min()</function> and <function>avg()</function> aggregates are computed
2604     concurrently over all the input rows, so if any row
2605     has <structfield>employees</structfield> equal to zero, the division-by-zero error
2606     will occur before there is any opportunity to test the result of
2607     <function>min()</function>.  Instead, use a <literal>WHERE</literal>
2608     or <literal>FILTER</literal> clause to prevent problematic input rows from
2609     reaching an aggregate function in the first place.
2610    </para>
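
   <para>
    For instance, assuming the same <structname>departments</structname>
    table, either of the following keeps rows with zero
    <structfield>employees</structfield> away from the division.  Note that
    these sketches average over the remaining rows, which is not the same
    result the <literal>CASE</literal> version was aiming for:
<programlisting>
SELECT avg(expenses / employees)
    FROM departments
    WHERE employees &gt; 0;

SELECT avg(expenses / employees) FILTER (WHERE employees &gt; 0)
    FROM departments;
</programlisting>
   </para>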
2611   </sect2>
2612  </sect1>
2613
2614  <sect1 id="sql-syntax-calling-funcs">
2615   <title>Calling Functions</title>
2616
2617    <indexterm zone="sql-syntax-calling-funcs">
2618     <primary>notation</primary>
2619     <secondary>functions</secondary>
2620    </indexterm>
2621
2622    <para>
2623     <productname>PostgreSQL</productname> allows functions that have named
2624     parameters to be called using either <firstterm>positional</firstterm> or
2625     <firstterm>named</firstterm> notation.  Named notation is especially
2626     useful for functions that have a large number of parameters, since it
2627     makes the associations between parameters and actual arguments more
2628     explicit and reliable.
2629     In positional notation, a function call is written with
2630     its argument values in the same order as they are defined in the function
2631     declaration.  In named notation, the arguments are matched to the
2632     function parameters by name and can be written in any order.
2633     For each notation, also consider the effect of function argument types,
2634     documented in <xref linkend="typeconv-func"/>.
2635    </para>
2636
2637    <para>
2638     In either notation, parameters that have default values given in the
2639     function declaration need not be written in the call at all.  But this
2640     is particularly useful in named notation, since any combination of
2641     parameters can be omitted; while in positional notation parameters can
2642     only be omitted from right to left.
2643    </para>
2644
2645    <para>
2646     <productname>PostgreSQL</productname> also supports
2647     <firstterm>mixed</firstterm> notation, which combines positional and
2648     named notation.  In this case, positional parameters are written first
2649     and named parameters appear after them.
2650    </para>
2651
2652    <para>
2653     The examples in the following sections illustrate all three
2654     notations, using this function definition:
2655 <programlisting>
2656 CREATE FUNCTION concat_lower_or_upper(a text, b text, uppercase boolean DEFAULT false)
2657 RETURNS text
2658 AS
2659 $$
2660  SELECT CASE
2661         WHEN $3 THEN UPPER($1 || ' ' || $2)
2662         ELSE LOWER($1 || ' ' || $2)
2663         END;
2664 $$
2665 LANGUAGE SQL IMMUTABLE STRICT;
2666 </programlisting>
2667     Function <function>concat_lower_or_upper</function> has two mandatory
2668     parameters, <literal>a</literal> and <literal>b</literal>.  Additionally
2669     there is one optional parameter <literal>uppercase</literal> which defaults
2670     to <literal>false</literal>.  The <literal>a</literal> and
2671     <literal>b</literal> inputs will be concatenated, and forced to either
2672     upper or lower case depending on the <literal>uppercase</literal>
2673     parameter.  The remaining details of this function
2674     definition are not important here (see <xref linkend="extend"/> for
2675     more information).
2676    </para>
2677
2678    <sect2 id="sql-syntax-calling-funcs-positional">
2679     <title>Using Positional Notation</title>
2680
2681     <indexterm>
2682      <primary>function</primary>
2683      <secondary>positional notation</secondary>
2684     </indexterm>
2685
2686     <para>
2687      Positional notation is the traditional mechanism for passing arguments
2688      to functions in <productname>PostgreSQL</productname>.  An example is:
2689 <screen>
2690 SELECT concat_lower_or_upper('Hello', 'World', true);
2691  concat_lower_or_upper
2692 -----------------------
2693  HELLO WORLD
2694 (1 row)
2695 </screen>
2696      All arguments are specified in order.  The result is upper case since
2697      <literal>uppercase</literal> is specified as <literal>true</literal>.
2698      Another example is:
2699 <screen>
2700 SELECT concat_lower_or_upper('Hello', 'World');
2701  concat_lower_or_upper
2702 -----------------------
2703  hello world
2704 (1 row)
2705 </screen>
2706      Here, the <literal>uppercase</literal> parameter is omitted, so it
2707      receives its default value of <literal>false</literal>, resulting in
2708      lower case output.  In positional notation, arguments can be omitted
2709      from right to left so long as they have defaults.
2710     </para>
2711    </sect2>
2712
2713    <sect2 id="sql-syntax-calling-funcs-named">
2714     <title>Using Named Notation</title>
2715
2716     <indexterm>
2717      <primary>function</primary>
2718      <secondary>named notation</secondary>
2719     </indexterm>
2720
2721     <para>
2722      In named notation, each argument's name is specified using
2723      <literal>=&gt;</literal> to separate it from the argument expression.
2724      For example:
2725 <screen>
2726 SELECT concat_lower_or_upper(a =&gt; 'Hello', b =&gt; 'World');
2727  concat_lower_or_upper
2728 -----------------------
2729  hello world
2730 (1 row)
2731 </screen>
2732      Again, the argument <literal>uppercase</literal> was omitted
2733      so it is set to <literal>false</literal> implicitly.  One advantage of
2734      using named notation is that the arguments may be specified in any
2735      order, for example:
2736 <screen>
2737 SELECT concat_lower_or_upper(a =&gt; 'Hello', b =&gt; 'World', uppercase =&gt; true);
2738  concat_lower_or_upper
2739 -----------------------
2740  HELLO WORLD
2741 (1 row)
2742
2743 SELECT concat_lower_or_upper(a =&gt; 'Hello', uppercase =&gt; true, b =&gt; 'World');
2744  concat_lower_or_upper
2745 -----------------------
2746  HELLO WORLD
2747 (1 row)
2748 </screen>
2749     </para>
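
     <para>
      Named notation also allows a defaulted parameter in the middle of
      the parameter list to be skipped.  As a sketch, consider a
      hypothetical function with two defaulted parameters (the function
      <function>greet</function> and its parameter names are invented for
      this illustration).  The call omits <literal>greeting</literal> but
      still supplies <literal>punct</literal>, which positional notation
      cannot express:
<screen>
CREATE FUNCTION greet(who text, greeting text DEFAULT 'Hello', punct text DEFAULT '!')
RETURNS text
AS $$ SELECT greeting || ' ' || who || punct $$
LANGUAGE SQL;

SELECT greet(who =&gt; 'World', punct =&gt; '?');
    greet
--------------
 Hello World?
(1 row)
</screen>
     </para>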
2750
2751     <para>
2752       An older syntax based on <literal>:=</literal> is supported for backward compatibility:
2753 <screen>
2754 SELECT concat_lower_or_upper(a := 'Hello', uppercase := true, b := 'World');
2755  concat_lower_or_upper
2756 -----------------------
2757  HELLO WORLD
2758 (1 row)
2759 </screen>
2760     </para>
2761    </sect2>
2762
2763   <sect2 id="sql-syntax-calling-funcs-mixed">
2764    <title>Using Mixed Notation</title>
2765
2766    <indexterm>
2767     <primary>function</primary>
2768     <secondary>mixed notation</secondary>
2769    </indexterm>
2770
2771    <para>
2772     The mixed notation combines positional and named notation. However, as
2773     already mentioned, named arguments cannot precede positional arguments.
2774     For example:
2775 <screen>
2776 SELECT concat_lower_or_upper('Hello', 'World', uppercase =&gt; true);
2777  concat_lower_or_upper
2778 -----------------------
2779  HELLO WORLD
2780 (1 row)
2781 </screen>
2782     In the above query, the arguments <literal>a</literal> and
2783     <literal>b</literal> are specified positionally, while
2784     <literal>uppercase</literal> is specified by name.  In this example,
2785     that adds little except documentation.  With a more complex function
2786     having numerous parameters that have default values, named or mixed
2787     notation can save a great deal of writing and reduce chances for error.
2788    </para>
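
   <para>
    Conversely, a call that places a named argument before a positional
    one is expected to be rejected, as in this sketch:
<programlisting>
-- Expected to fail: positional argument after a named one
SELECT concat_lower_or_upper(a =&gt; 'Hello', 'World');
</programlisting>
   </para>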
2789
2790    <note>
2791     <para>
2792      Named and mixed call notations currently cannot be used when calling an
2793      aggregate function (but they do work when an aggregate function is used
2794      as a window function).
2795     </para>
2796    </note>
2797   </sect2>
2798  </sect1>
2799
2800 </chapter>