1 <!-- doc/src/sgml/intarray.sgml -->
3 <sect1 id=
"intarray" xreflabel=
"intarray">
4 <title>intarray
— manipulate arrays of integers
</title>
6 <indexterm zone=
"intarray">
7 <primary>intarray
</primary>
11 The
<filename>intarray
</filename> module provides a number of useful functions
12 and operators for manipulating null-free arrays of integers.
13 There is also support for indexed searches using some of the operators.
17 All of these operations will throw an error if a supplied array contains any
22 Many of these operations are only sensible for one-dimensional arrays.
23 Although they will accept input arrays of more dimensions, the data is
24 treated as though it were a linear array in storage order.
28 This module is considered
<quote>trusted
</quote>, that is, it can be
29 installed by non-superusers who have
<literal>CREATE
</literal> privilege
30 on the current database.
33 <sect2 id=
"intarray-funcs-ops">
34 <title><filename>intarray
</filename> Functions and Operators
</title>
37 The functions provided by the
<filename>intarray
</filename> module
38 are shown in
<xref linkend=
"intarray-func-table"/>, the operators
39 in
<xref linkend=
"intarray-op-table"/>.
42 <table id=
"intarray-func-table">
43 <title><filename>intarray
</filename> Functions
</title>
47 <entry role=
"func_table_entry"><para role=
"func_signature">
61 <entry role=
"func_table_entry"><para role=
"func_signature">
62 <indexterm><primary>icount
</primary></indexterm>
63 <function>icount
</function> (
<type>integer[]
</type> )
64 <returnvalue>integer
</returnvalue>
67 Returns the number of elements in the array.
70 <literal>icount('{
1,
2,
3}'::integer[])
</literal>
71 <returnvalue>3</returnvalue>
76 <entry role=
"func_table_entry"><para role=
"func_signature">
77 <indexterm><primary>sort
</primary></indexterm>
78 <function>sort
</function> (
<type>integer[]
</type>,
<parameter>dir
</parameter> <type>text
</type> )
79 <returnvalue>integer[]
</returnvalue>
82 Sorts the array in either ascending or descending order.
83 <parameter>dir
</parameter> must be
<literal>asc
</literal>
84 or
<literal>desc
</literal>.
87 <literal>sort('{
1,
3,
2}'::integer[], 'desc')
</literal>
88 <returnvalue>{
3,
2,
1}
</returnvalue>
93 <entry role=
"func_table_entry"><para role=
"func_signature">
94 <function>sort
</function> (
<type>integer[]
</type> )
95 <returnvalue>integer[]
</returnvalue>
97 <para role=
"func_signature">
98 <indexterm><primary>sort_asc
</primary></indexterm>
99 <function>sort_asc
</function> (
<type>integer[]
</type> )
100 <returnvalue>integer[]
</returnvalue>
103 Sorts in ascending order.
106 <literal>sort(array[
11,
77,
44])
</literal>
107 <returnvalue>{
11,
44,
77}
</returnvalue>
112 <entry role=
"func_table_entry"><para role=
"func_signature">
113 <indexterm><primary>sort_desc
</primary></indexterm>
114 <function>sort_desc
</function> (
<type>integer[]
</type> )
115 <returnvalue>integer[]
</returnvalue>
118 Sorts in descending order.
121 <literal>sort_desc(array[
11,
77,
44])
</literal>
122 <returnvalue>{
77,
44,
11}
</returnvalue>
127 <entry role=
"func_table_entry"><para role=
"func_signature">
128 <indexterm><primary>uniq
</primary></indexterm>
129 <function>uniq
</function> (
<type>integer[]
</type> )
130 <returnvalue>integer[]
</returnvalue>
133 Removes adjacent duplicates.
134 Often used with
<function>sort
</function> to remove all duplicates.
137 <literal>uniq('{
1,
2,
2,
3,
1,
1}'::integer[])
</literal>
138 <returnvalue>{
1,
2,
3,
1}
</returnvalue>
141 <literal>uniq(sort('{
1,
2,
3,
2,
1}'::integer[]))
</literal>
142 <returnvalue>{
1,
2,
3}
</returnvalue>
147 <entry role=
"func_table_entry"><para role=
"func_signature">
148 <indexterm><primary>idx
</primary></indexterm>
149 <function>idx
</function> (
<type>integer[]
</type>,
<parameter>item
</parameter> <type>integer
</type> )
150 <returnvalue>integer
</returnvalue>
153 Returns index of the first array element
154 matching
<parameter>item
</parameter>, or
0 if no match.
157 <literal>idx(array[
11,
22,
33,
22,
11],
22)
</literal>
158 <returnvalue>2</returnvalue>
163 <entry role=
"func_table_entry"><para role=
"func_signature">
164 <indexterm><primary>subarray
</primary></indexterm>
165 <function>subarray
</function> (
<type>integer[]
</type>,
<parameter>start
</parameter> <type>integer
</type>,
<parameter>len
</parameter> <type>integer
</type> )
166 <returnvalue>integer[]
</returnvalue>
169 Extracts the portion of the array starting at
170 position
<parameter>start
</parameter>, with
<parameter>len
</parameter>
174 <literal>subarray('{
1,
2,
3,
2,
1}'::integer[],
2,
3)
</literal>
175 <returnvalue>{
2,
3,
2}
</returnvalue>
180 <entry role=
"func_table_entry"><para role=
"func_signature">
181 <function>subarray
</function> (
<type>integer[]
</type>,
<parameter>start
</parameter> <type>integer
</type> )
182 <returnvalue>integer[]
</returnvalue>
185 Extracts the portion of the array starting at
186 position
<parameter>start
</parameter>.
189 <literal>subarray('{
1,
2,
3,
2,
1}'::integer[],
2)
</literal>
190 <returnvalue>{
2,
3,
2,
1}
</returnvalue>
195 <entry role=
"func_table_entry"><para role=
"func_signature">
196 <indexterm><primary>intset
</primary></indexterm>
197 <function>intset
</function> (
<type>integer
</type> )
198 <returnvalue>integer[]
</returnvalue>
201 Makes a single-element array.
204 <literal>intset(
42)
</literal>
205 <returnvalue>{
42}
</returnvalue>
212 <table id=
"intarray-op-table">
213 <title><filename>intarray
</filename> Operators
</title>
217 <entry role=
"func_table_entry"><para role=
"func_signature">
228 <entry role=
"func_table_entry"><para role=
"func_signature">
229 <type>integer[]
</type> <literal>&&</literal> <type>integer[]
</type>
230 <returnvalue>boolean
</returnvalue>
233 Do arrays overlap (have at least one element in common)?
238 <entry role=
"func_table_entry"><para role=
"func_signature">
239 <type>integer[]
</type> <literal>@
></literal> <type>integer[]
</type>
240 <returnvalue>boolean
</returnvalue>
243 Does left array contain right array?
248 <entry role=
"func_table_entry"><para role=
"func_signature">
249 <type>integer[]
</type> <literal><@
</literal> <type>integer[]
</type>
250 <returnvalue>boolean
</returnvalue>
253 Is left array contained in right array?
258 <entry role=
"func_table_entry"><para role=
"func_signature">
259 <type></type> <literal>#
</literal> <type>integer[]
</type>
260 <returnvalue>integer
</returnvalue>
263 Returns the number of elements in the array.
268 <entry role=
"func_table_entry"><para role=
"func_signature">
269 <type>integer[]
</type> <literal>#
</literal> <type>integer
</type>
270 <returnvalue>integer
</returnvalue>
273 Returns index of the first array element
274 matching the right argument, or
0 if no match.
275 (Same as
<function>idx
</function> function.)
280 <entry role=
"func_table_entry"><para role=
"func_signature">
281 <type>integer[]
</type> <literal>+
</literal> <type>integer
</type>
282 <returnvalue>integer[]
</returnvalue>
285 Adds element to end of array.
290 <entry role=
"func_table_entry"><para role=
"func_signature">
291 <type>integer[]
</type> <literal>+
</literal> <type>integer[]
</type>
292 <returnvalue>integer[]
</returnvalue>
295 Concatenates the arrays.
300 <entry role=
"func_table_entry"><para role=
"func_signature">
301 <type>integer[]
</type> <literal>-
</literal> <type>integer
</type>
302 <returnvalue>integer[]
</returnvalue>
305 Removes entries matching the right argument from the array.
310 <entry role=
"func_table_entry"><para role=
"func_signature">
311 <type>integer[]
</type> <literal>-
</literal> <type>integer[]
</type>
312 <returnvalue>integer[]
</returnvalue>
315 Removes elements of the right array from the left array.
320 <entry role=
"func_table_entry"><para role=
"func_signature">
321 <type>integer[]
</type> <literal>|
</literal> <type>integer
</type>
322 <returnvalue>integer[]
</returnvalue>
325 Computes the union of the arguments.
330 <entry role=
"func_table_entry"><para role=
"func_signature">
331 <type>integer[]
</type> <literal>|
</literal> <type>integer[]
</type>
332 <returnvalue>integer[]
</returnvalue>
335 Computes the union of the arguments.
340 <entry role=
"func_table_entry"><para role=
"func_signature">
341 <type>integer[]
</type> <literal>&</literal> <type>integer[]
</type>
342 <returnvalue>integer[]
</returnvalue>
345 Computes the intersection of the arguments.
350 <entry role=
"func_table_entry"><para role=
"func_signature">
351 <type>integer[]
</type> <literal>@@
</literal> <type>query_int
</type>
352 <returnvalue>boolean
</returnvalue>
355 Does array satisfy query? (see below)
360 <entry role=
"func_table_entry"><para role=
"func_signature">
361 <type>query_int
</type> <literal>~~
</literal> <type>integer[]
</type>
362 <returnvalue>boolean
</returnvalue>
365 Does array satisfy query? (commutator of
<literal>@@
</literal>)
373 The operators
<literal>&&</literal>,
<literal>@
></literal> and
374 <literal><@
</literal> are equivalent to
<productname>PostgreSQL
</productname>'s built-in
375 operators of the same names, except that they work only on integer arrays
376 that do not contain nulls, while the built-in operators work for any array
377 type. This restriction makes them faster than the built-in operators
382 The
<literal>@@
</literal> and
<literal>~~
</literal> operators test whether an array
383 satisfies a
<firstterm>query
</firstterm>, which is expressed as a value of a
384 specialized data type
<type>query_int
</type>. A
<firstterm>query
</firstterm>
385 consists of integer values that are checked against the elements of
386 the array, possibly combined using the operators
<literal>&</literal>
387 (AND),
<literal>|
</literal> (OR), and
<literal>!
</literal> (NOT). Parentheses
388 can be used as needed. For example,
389 the query
<literal>1&(
2|
3)
</literal> matches arrays that contain
1
390 and also contain either
2 or
3.
394 <sect2 id=
"intarray-index">
395 <title>Index Support
</title>
398 <filename>intarray
</filename> provides index support for the
399 <literal>&&</literal>,
<literal>@
></literal>,
400 and
<literal>@@
</literal> operators, as well as regular array equality.
404 Two parameterized GiST index operator classes are provided:
405 <literal>gist__int_ops
</literal> (used by default) is suitable for
406 small- to medium-size data sets, while
407 <literal>gist__intbig_ops
</literal> uses a larger signature and is more
408 suitable for indexing large data sets (i.e., columns containing
409 a large number of distinct array values).
410 The implementation uses an RD-tree data structure with
411 built-in lossy compression.
415 <literal>gist__int_ops
</literal> approximates an integer set as an array of
416 integer ranges. Its optional integer parameter
<literal>numranges
</literal>
417 determines the maximum number of ranges in
418 one index key. The default value of
<literal>numranges
</literal> is
100.
419 Valid values are between
1 and
253. Using larger arrays as GiST index
420 keys leads to a more precise search (scanning a smaller fraction of the index and
421 fewer heap pages), at the cost of a larger index.
425 <literal>gist__intbig_ops
</literal> approximates an integer set as a bitmap
426 signature. Its optional integer parameter
<literal>siglen
</literal>
427 determines the signature length in bytes.
428 The default signature length is
16 bytes. Valid values of signature length
429 are between
1 and
2024 bytes. Longer signatures lead to a more precise
430 search (scanning a smaller fraction of the index and fewer heap pages), at
431 the cost of a larger index.
435 There is also a non-default GIN operator class
436 <literal>gin__int_ops
</literal>, which supports these operators as well
437 as
<literal><@
</literal>.
441 The choice between GiST and GIN indexing depends on the relative
442 performance characteristics of GiST and GIN, which are discussed elsewhere.
446 <sect2 id=
"intarray-example">
447 <title>Example
</title>
450 -- a message can be in one or more
<quote>sections
</quote>
451 CREATE TABLE message (mid INT PRIMARY KEY, sections INT[], ...);
453 -- create specialized index with signature length of
32 bytes
454 CREATE INDEX message_rdtree_idx ON message USING GIST (sections gist__intbig_ops (siglen =
32));
456 -- select messages in section
1 OR
2 - OVERLAP operator
457 SELECT message.mid FROM message WHERE message.sections
&& '{
1,
2}';
459 -- select messages in sections
1 AND
2 - CONTAINS operator
460 SELECT message.mid FROM message WHERE message.sections @
> '{
1,
2}';
462 -- the same, using QUERY operator
463 SELECT message.mid FROM message WHERE message.sections @@ '
1&2'::query_int;
467 <sect2 id=
"intarray-benchmark">
468 <title>Benchmark
</title>
471 The source directory
<filename>contrib/intarray/bench
</filename> contains a
472 benchmark test suite, which can be run against an installed
473 <productname>PostgreSQL
</productname> server. (It also requires
<filename>DBD::Pg
</filename>
474 to be installed.) To run:
478 cd .../contrib/intarray/bench
480 psql -c
"CREATE EXTENSION intarray" TEST
481 ./create_test.pl | psql TEST
486 The
<filename>bench.pl
</filename> script has numerous options, which
487 are displayed when it is run without any arguments.
491 <sect2 id=
"intarray-Authors">
492 <title>Authors
</title>
495 All work was done by Teodor Sigaev (
<email>teodor@sigaev.ru
</email>) and
496 Oleg Bartunov (
<email>oleg@sai.msu.su
</email>). See
497 <ulink url=
"http://www.sai.msu.su/~megera/postgres/gist/"></ulink> for
498 additional information. Andrey Oktyabrski did a great work on adding new
499 functions and operations.