Repair memory leaks in plpython.
[pgsql.git] / doc / src / sgml / rangetypes.sgml
blob92ea0e83dab7b4294e58c5e637ede51394db0710
1 <!-- doc/src/sgml/rangetypes.sgml -->
3 <sect1 id="rangetypes">
4 <title>Range Types</title>
6 <indexterm>
7 <primary>range type</primary>
8 </indexterm>
10 <indexterm>
11 <primary>multirange type</primary>
12 </indexterm>
14 <para>
15 Range types are data types representing a range of values of some
16 element type (called the range's <firstterm>subtype</firstterm>).
17 For instance, ranges
18 of <type>timestamp</type> might be used to represent the ranges of
19 time that a meeting room is reserved. In this case the data type
20 is <type>tsrange</type> (short for <quote>timestamp range</quote>),
21 and <type>timestamp</type> is the subtype. The subtype must have
22 a total order so that it is well-defined whether element values are
23 within, before, or after a range of values.
24 </para>
26 <para>
27 Range types are useful because they represent many element values in a
28 single range value, and because concepts such as overlapping ranges can
29 be expressed clearly. The use of time and date ranges for scheduling
30 purposes is the clearest example; but price ranges, measurement
31 ranges from an instrument, and so forth can also be useful.
32 </para>
34 <para>
35 Every range type has a corresponding multirange type. A multirange is
36 an ordered list of non-contiguous, non-empty, non-null ranges. Most
37 range operators also work on multiranges, and they have a few functions
38 of their own.
39 </para>
41 <sect2 id="rangetypes-builtin">
42 <title>Built-in Range and Multirange Types</title>
44 <para>
45 PostgreSQL comes with the following built-in range types:
46 <itemizedlist>
47 <listitem>
48 <para>
49 <type>int4range</type> &mdash; Range of <type>integer</type>,
50 <type>int4multirange</type> &mdash; corresponding Multirange
51 </para>
52 </listitem>
53 <listitem>
54 <para>
55 <type>int8range</type> &mdash; Range of <type>bigint</type>,
56 <type>int8multirange</type> &mdash; corresponding Multirange
57 </para>
58 </listitem>
59 <listitem>
60 <para>
61 <type>numrange</type> &mdash; Range of <type>numeric</type>,
62 <type>nummultirange</type> &mdash; corresponding Multirange
63 </para>
64 </listitem>
65 <listitem>
66 <para>
67 <type>tsrange</type> &mdash; Range of <type>timestamp without time zone</type>,
68 <type>tsmultirange</type> &mdash; corresponding Multirange
69 </para>
70 </listitem>
71 <listitem>
72 <para>
73 <type>tstzrange</type> &mdash; Range of <type>timestamp with time zone</type>,
74 <type>tstzmultirange</type> &mdash; corresponding Multirange
75 </para>
76 </listitem>
77 <listitem>
78 <para>
79 <type>daterange</type> &mdash; Range of <type>date</type>,
80 <type>datemultirange</type> &mdash; corresponding Multirange
81 </para>
82 </listitem>
83 </itemizedlist>
84 In addition, you can define your own range types;
85 see <xref linkend="sql-createtype"/> for more information.
86 </para>
87 </sect2>
89 <sect2 id="rangetypes-examples">
90 <title>Examples</title>
92 <para>
93 <programlisting>
94 CREATE TABLE reservation (room int, during tsrange);
95 INSERT INTO reservation VALUES
96 (1108, '[2010-01-01 14:30, 2010-01-01 15:30)');
98 -- Containment
99 SELECT int4range(10, 20) @&gt; 3;
101 -- Overlaps
102 SELECT numrange(11.1, 22.2) &amp;&amp; numrange(20.0, 30.0);
104 -- Extract the upper bound
105 SELECT upper(int8range(15, 25));
107 -- Compute the intersection
108 SELECT int4range(10, 20) * int4range(15, 25);
110 -- Is the range empty?
111 SELECT isempty(numrange(1, 5));
112 </programlisting>
114 See <xref linkend="range-operators-table"/>
115 and <xref linkend="range-functions-table"/> for complete lists of
116 operators and functions on range types.
117 </para>
118 </sect2>
120 <sect2 id="rangetypes-inclusivity">
121 <title>Inclusive and Exclusive Bounds</title>
123 <para>
124 Every non-empty range has two bounds, the lower bound and the upper
125 bound. All points between these values are included in the range. An
126 inclusive bound means that the boundary point itself is included in
127 the range as well, while an exclusive bound means that the boundary
128 point is not included in the range.
129 </para>
131 <para>
132 In the text form of a range, an inclusive lower bound is represented by
133 <quote><literal>[</literal></quote> while an exclusive lower bound is
134 represented by <quote><literal>(</literal></quote>. Likewise, an inclusive upper bound is represented by
135 <quote><literal>]</literal></quote>, while an exclusive upper bound is
136 represented by <quote><literal>)</literal></quote>.
137 (See <xref linkend="rangetypes-io"/> for more details.)
138 </para>
140 <para>
141 The functions <literal>lower_inc</literal>
142 and <literal>upper_inc</literal> test the inclusivity of the lower
143 and upper bounds of a range value, respectively.
144 </para>
145 </sect2>
147 <sect2 id="rangetypes-infinite">
148 <title>Infinite (Unbounded) Ranges</title>
150 <para>
151 The lower bound of a range can be omitted, meaning that all
152 values less than the upper bound are included in the range, e.g.,
153 <literal>(,3]</literal>. Likewise, if the upper bound of the range
154 is omitted, then all values greater than the lower bound are included
155 in the range. If both lower and upper bounds are omitted, all values
156 of the element type are considered to be in the range. Specifying a
157 missing bound as inclusive is automatically converted to exclusive,
158 e.g., <literal>[,]</literal> is converted to <literal>(,)</literal>.
159 You can think of these missing values as +/-infinity, but they are
160 special range type values and are considered to be beyond any range
161 element type's +/-infinity values.
162 </para>
164 <para>
165 Element types that have the notion of <quote>infinity</quote> can
166 use them as explicit bound values. For example, with timestamp
167 ranges, <literal>[today,infinity)</literal> excludes the special
168 <type>timestamp</type> value <literal>infinity</literal>,
169 while <literal>[today,infinity]</literal> include it, as does
170 <literal>[today,)</literal> and <literal>[today,]</literal>.
171 </para>
173 <para>
174 The functions <literal>lower_inf</literal>
175 and <literal>upper_inf</literal> test for infinite lower
176 and upper bounds of a range, respectively.
177 </para>
178 </sect2>
180 <sect2 id="rangetypes-io">
181 <title>Range Input/Output</title>
183 <para>
184 The input for a range value must follow one of the following patterns:
185 <synopsis>
186 (<replaceable>lower-bound</replaceable>,<replaceable>upper-bound</replaceable>)
187 (<replaceable>lower-bound</replaceable>,<replaceable>upper-bound</replaceable>]
188 [<replaceable>lower-bound</replaceable>,<replaceable>upper-bound</replaceable>)
189 [<replaceable>lower-bound</replaceable>,<replaceable>upper-bound</replaceable>]
190 empty
191 </synopsis>
192 The parentheses or brackets indicate whether the lower and upper bounds
193 are exclusive or inclusive, as described previously.
194 Notice that the final pattern is <literal>empty</literal>, which
195 represents an empty range (a range that contains no points).
196 </para>
198 <para>
199 The <replaceable>lower-bound</replaceable> may be either a string
200 that is valid input for the subtype, or empty to indicate no
201 lower bound. Likewise, <replaceable>upper-bound</replaceable> may be
202 either a string that is valid input for the subtype, or empty to
203 indicate no upper bound.
204 </para>
206 <para>
207 Each bound value can be quoted using <literal>"</literal> (double quote)
208 characters. This is necessary if the bound value contains parentheses,
209 brackets, commas, double quotes, or backslashes, since these characters
210 would otherwise be taken as part of the range syntax. To put a double
211 quote or backslash in a quoted bound value, precede it with a
212 backslash. (Also, a pair of double quotes within a double-quoted bound
213 value is taken to represent a double quote character, analogously to the
214 rules for single quotes in SQL literal strings.) Alternatively, you can
215 avoid quoting and use backslash-escaping to protect all data characters
216 that would otherwise be taken as range syntax. Also, to write a bound
217 value that is an empty string, write <literal>""</literal>, since writing
218 nothing means an infinite bound.
219 </para>
221 <para>
222 Whitespace is allowed before and after the range value, but any whitespace
223 between the parentheses or brackets is taken as part of the lower or upper
224 bound value. (Depending on the element type, it might or might not be
225 significant.)
226 </para>
228 <note>
229 <para>
230 These rules are very similar to those for writing field values in
231 composite-type literals. See <xref linkend="rowtypes-io-syntax"/> for
232 additional commentary.
233 </para>
234 </note>
236 <para>
237 Examples:
238 <programlisting>
239 -- includes 3, does not include 7, and does include all points in between
240 SELECT '[3,7)'::int4range;
242 -- does not include either 3 or 7, but includes all points in between
243 SELECT '(3,7)'::int4range;
245 -- includes only the single point 4
246 SELECT '[4,4]'::int4range;
248 -- includes no points (and will be normalized to 'empty')
249 SELECT '[4,4)'::int4range;
250 </programlisting>
251 </para>
253 <para>
254 The input for a multirange is curly brackets (<literal>{</literal> and
255 <literal>}</literal>) containing zero or more valid ranges,
256 separated by commas. Whitespace is permitted around the brackets and
257 commas. This is intended to be reminiscent of array syntax, although
258 multiranges are much simpler: they have just one dimension and there is
259 no need to quote their contents. (The bounds of their ranges may be
260 quoted as above however.)
261 </para>
263 <para>
264 Examples:
265 <programlisting>
266 SELECT '{}'::int4multirange;
267 SELECT '{[3,7)}'::int4multirange;
268 SELECT '{[3,7), [8,9)}'::int4multirange;
269 </programlisting>
270 </para>
272 </sect2>
274 <sect2 id="rangetypes-construct">
275 <title>Constructing Ranges and Multiranges</title>
277 <para>
278 Each range type has a constructor function with the same name as the range
279 type. Using the constructor function is frequently more convenient than
280 writing a range literal constant, since it avoids the need for extra
281 quoting of the bound values. The constructor function
282 accepts two or three arguments. The two-argument form constructs a range
283 in standard form (lower bound inclusive, upper bound exclusive), while
284 the three-argument form constructs a range with bounds of the form
285 specified by the third argument.
286 The third argument must be one of the strings
287 <quote><literal>()</literal></quote>,
288 <quote><literal>(]</literal></quote>,
289 <quote><literal>[)</literal></quote>, or
290 <quote><literal>[]</literal></quote>.
291 For example:
293 <programlisting>
294 -- The full form is: lower bound, upper bound, and text argument indicating
295 -- inclusivity/exclusivity of bounds.
296 SELECT numrange(1.0, 14.0, '(]');
298 -- If the third argument is omitted, '[)' is assumed.
299 SELECT numrange(1.0, 14.0);
301 -- Although '(]' is specified here, on display the value will be converted to
302 -- canonical form, since int8range is a discrete range type (see below).
303 SELECT int8range(1, 14, '(]');
305 -- Using NULL for either bound causes the range to be unbounded on that side.
306 SELECT numrange(NULL, 2.2);
307 </programlisting>
308 </para>
310 <para>
311 Each range type also has a multirange constructor with the same name as the
312 multirange type. The constructor function takes zero or more arguments
313 which are all ranges of the appropriate type.
314 For example:
316 <programlisting>
317 SELECT nummultirange();
318 SELECT nummultirange(numrange(1.0, 14.0));
319 SELECT nummultirange(numrange(1.0, 14.0), numrange(20.0, 25.0));
320 </programlisting>
321 </para>
322 </sect2>
324 <sect2 id="rangetypes-discrete">
325 <title>Discrete Range Types</title>
327 <para>
328 A discrete range is one whose element type has a well-defined
329 <quote>step</quote>, such as <type>integer</type> or <type>date</type>.
330 In these types two elements can be said to be adjacent, when there are
331 no valid values between them. This contrasts with continuous ranges,
332 where it's always (or almost always) possible to identify other element
333 values between two given values. For example, a range over the
334 <type>numeric</type> type is continuous, as is a range over <type>timestamp</type>.
335 (Even though <type>timestamp</type> has limited precision, and so could
336 theoretically be treated as discrete, it's better to consider it continuous
337 since the step size is normally not of interest.)
338 </para>
340 <para>
341 Another way to think about a discrete range type is that there is a clear
342 idea of a <quote>next</quote> or <quote>previous</quote> value for each element value.
343 Knowing that, it is possible to convert between inclusive and exclusive
344 representations of a range's bounds, by choosing the next or previous
345 element value instead of the one originally given.
346 For example, in an integer range type <literal>[4,8]</literal> and
347 <literal>(3,9)</literal> denote the same set of values; but this would not be so
348 for a range over numeric.
349 </para>
351 <para>
352 A discrete range type should have a <firstterm>canonicalization</firstterm>
353 function that is aware of the desired step size for the element type.
354 The canonicalization function is charged with converting equivalent values
355 of the range type to have identical representations, in particular
356 consistently inclusive or exclusive bounds.
357 If a canonicalization function is not specified, then ranges with different
358 formatting will always be treated as unequal, even though they might
359 represent the same set of values in reality.
360 </para>
362 <para>
363 The built-in range types <type>int4range</type>, <type>int8range</type>,
364 and <type>daterange</type> all use a canonical form that includes
365 the lower bound and excludes the upper bound; that is,
366 <literal>[)</literal>. User-defined range types can use other conventions,
367 however.
368 </para>
369 </sect2>
371 <sect2 id="rangetypes-defining">
372 <title>Defining New Range Types</title>
374 <para>
375 Users can define their own range types. The most common reason to do
376 this is to use ranges over subtypes not provided among the built-in
377 range types.
378 For example, to define a new range type of subtype <type>float8</type>:
380 <programlisting>
381 CREATE TYPE floatrange AS RANGE (
382 subtype = float8,
383 subtype_diff = float8mi
386 SELECT '[1.234, 5.678]'::floatrange;
387 </programlisting>
389 Because <type>float8</type> has no meaningful
390 <quote>step</quote>, we do not define a canonicalization
391 function in this example.
392 </para>
394 <para>
395 When you define your own range you automatically get a corresponding
396 multirange type.
397 </para>
399 <para>
400 Defining your own range type also allows you to specify a different
401 subtype B-tree operator class or collation to use, so as to change the sort
402 ordering that determines which values fall into a given range.
403 </para>
405 <para>
406 If the subtype is considered to have discrete rather than continuous
407 values, the <command>CREATE TYPE</command> command should specify a
408 <literal>canonical</literal> function.
409 The canonicalization function takes an input range value, and must return
410 an equivalent range value that may have different bounds and formatting.
411 The canonical output for two ranges that represent the same set of values,
412 for example the integer ranges <literal>[1, 7]</literal> and <literal>[1,
413 8)</literal>, must be identical. It doesn't matter which representation
414 you choose to be the canonical one, so long as two equivalent values with
415 different formattings are always mapped to the same value with the same
416 formatting. In addition to adjusting the inclusive/exclusive bounds
417 format, a canonicalization function might round off boundary values, in
418 case the desired step size is larger than what the subtype is capable of
419 storing. For instance, a range type over <type>timestamp</type> could be
420 defined to have a step size of an hour, in which case the canonicalization
421 function would need to round off bounds that weren't a multiple of an hour,
422 or perhaps throw an error instead.
423 </para>
425 <para>
426 In addition, any range type that is meant to be used with GiST or SP-GiST
427 indexes should define a subtype difference, or <literal>subtype_diff</literal>,
428 function. (The index will still work without <literal>subtype_diff</literal>,
429 but it is likely to be considerably less efficient than if a difference
430 function is provided.) The subtype difference function takes two input
431 values of the subtype, and returns their difference
432 (i.e., <replaceable>X</replaceable> minus <replaceable>Y</replaceable>) represented as
433 a <type>float8</type> value. In our example above, the
434 function <function>float8mi</function> that underlies the regular <type>float8</type>
435 minus operator can be used; but for any other subtype, some type
436 conversion would be necessary. Some creative thought about how to
437 represent differences as numbers might be needed, too. To the greatest
438 extent possible, the <literal>subtype_diff</literal> function should agree with
439 the sort ordering implied by the selected operator class and collation;
440 that is, its result should be positive whenever its first argument is
441 greater than its second according to the sort ordering.
442 </para>
444 <para>
445 A less-oversimplified example of a <literal>subtype_diff</literal> function is:
446 </para>
448 <programlisting>
449 CREATE FUNCTION time_subtype_diff(x time, y time) RETURNS float8 AS
450 'SELECT EXTRACT(EPOCH FROM (x - y))' LANGUAGE sql STRICT IMMUTABLE;
452 CREATE TYPE timerange AS RANGE (
453 subtype = time,
454 subtype_diff = time_subtype_diff
457 SELECT '[11:10, 23:00]'::timerange;
458 </programlisting>
460 <para>
461 See <xref linkend="sql-createtype"/> for more information about creating
462 range types.
463 </para>
464 </sect2>
466 <sect2 id="rangetypes-indexing">
467 <title>Indexing</title>
469 <indexterm>
470 <primary>range type</primary>
471 <secondary>indexes on</secondary>
472 </indexterm>
474 <para>
475 GiST and SP-GiST indexes can be created for table columns of range types.
476 GiST indexes can be also created for table columns of multirange types.
477 For instance, to create a GiST index:
478 <programlisting>
479 CREATE INDEX reservation_idx ON reservation USING GIST (during);
480 </programlisting>
481 A GiST or SP-GiST index on ranges can accelerate queries involving these
482 range operators:
483 <literal>=</literal>,
484 <literal>&amp;&amp;</literal>,
485 <literal>&lt;@</literal>,
486 <literal>@&gt;</literal>,
487 <literal>&lt;&lt;</literal>,
488 <literal>&gt;&gt;</literal>,
489 <literal>-|-</literal>,
490 <literal>&amp;&lt;</literal>, and
491 <literal>&amp;&gt;</literal>.
492 A GiST index on multiranges can accelerate queries involving the same
493 set of multirange operators.
494 A GiST index on ranges and GiST index on multiranges can also accelerate
495 queries involving these cross-type range to multirange and multirange to
496 range operators correspondingly:
497 <literal>&amp;&amp;</literal>,
498 <literal>&lt;@</literal>,
499 <literal>@&gt;</literal>,
500 <literal>&lt;&lt;</literal>,
501 <literal>&gt;&gt;</literal>,
502 <literal>-|-</literal>,
503 <literal>&amp;&lt;</literal>, and
504 <literal>&amp;&gt;</literal>.
505 See <xref linkend="range-operators-table"/> for more information.
506 </para>
508 <para>
509 In addition, B-tree and hash indexes can be created for table columns of
510 range types. For these index types, basically the only useful range
511 operation is equality. There is a B-tree sort ordering defined for range
512 values, with corresponding <literal>&lt;</literal> and <literal>&gt;</literal> operators,
513 but the ordering is rather arbitrary and not usually useful in the real
514 world. Range types' B-tree and hash support is primarily meant to
515 allow sorting and hashing internally in queries, rather than creation of
516 actual indexes.
517 </para>
518 </sect2>
520 <sect2 id="rangetypes-constraint">
521 <title>Constraints on Ranges</title>
523 <indexterm>
524 <primary>range type</primary>
525 <secondary>exclude</secondary>
526 </indexterm>
528 <para>
529 While <literal>UNIQUE</literal> is a natural constraint for scalar
530 values, it is usually unsuitable for range types. Instead, an
531 exclusion constraint is often more appropriate
532 (see <link linkend="sql-createtable-exclude">CREATE TABLE
533 ... CONSTRAINT ... EXCLUDE</link>). Exclusion constraints allow the
534 specification of constraints such as <quote>non-overlapping</quote> on a
535 range type. For example:
537 <programlisting>
538 CREATE TABLE reservation (
539 during tsrange,
540 EXCLUDE USING GIST (during WITH &amp;&amp;)
542 </programlisting>
544 That constraint will prevent any overlapping values from existing
545 in the table at the same time:
547 <programlisting>
548 INSERT INTO reservation VALUES
549 ('[2010-01-01 11:30, 2010-01-01 15:00)');
550 INSERT 0 1
552 INSERT INTO reservation VALUES
553 ('[2010-01-01 14:45, 2010-01-01 15:45)');
554 ERROR: conflicting key value violates exclusion constraint "reservation_during_excl"
555 DETAIL: Key (during)=(["2010-01-01 14:45:00","2010-01-01 15:45:00")) conflicts
556 with existing key (during)=(["2010-01-01 11:30:00","2010-01-01 15:00:00")).
557 </programlisting>
558 </para>
560 <para>
561 You can use the <link linkend="btree-gist"><literal>btree_gist</literal></link>
562 extension to define exclusion constraints on plain scalar data types, which
563 can then be combined with range exclusions for maximum flexibility. For
564 example, after <literal>btree_gist</literal> is installed, the following
565 constraint will reject overlapping ranges only if the meeting room numbers
566 are equal:
568 <programlisting>
569 CREATE EXTENSION btree_gist;
570 CREATE TABLE room_reservation (
571 room text,
572 during tsrange,
573 EXCLUDE USING GIST (room WITH =, during WITH &amp;&amp;)
576 INSERT INTO room_reservation VALUES
577 ('123A', '[2010-01-01 14:00, 2010-01-01 15:00)');
578 INSERT 0 1
580 INSERT INTO room_reservation VALUES
581 ('123A', '[2010-01-01 14:30, 2010-01-01 15:30)');
582 ERROR: conflicting key value violates exclusion constraint "room_reservation_room_during_excl"
583 DETAIL: Key (room, during)=(123A, ["2010-01-01 14:30:00","2010-01-01 15:30:00")) conflicts
584 with existing key (room, during)=(123A, ["2010-01-01 14:00:00","2010-01-01 15:00:00")).
586 INSERT INTO room_reservation VALUES
587 ('123B', '[2010-01-01 14:30, 2010-01-01 15:30)');
588 INSERT 0 1
589 </programlisting>
590 </para>
591 </sect2>
592 </sect1>