At update of non-LP_NORMAL TID, fail instead of corrupting page header.
[pgsql.git] / doc / src / sgml / intarray.sgml
blobc72d49b01d89809428bf2793ac8896081cbd36e4
1 <!-- doc/src/sgml/intarray.sgml -->
3 <sect1 id="intarray" xreflabel="intarray">
4 <title>intarray &mdash; manipulate arrays of integers</title>
6 <indexterm zone="intarray">
7 <primary>intarray</primary>
8 </indexterm>
10 <para>
11 The <filename>intarray</filename> module provides a number of useful functions
12 and operators for manipulating null-free arrays of integers.
13 There is also support for indexed searches using some of the operators.
14 </para>
16 <para>
17 All of these operations will throw an error if a supplied array contains any
18 NULL elements.
19 </para>
21 <para>
22 Many of these operations are only sensible for one-dimensional arrays.
23 Although they will accept input arrays of more dimensions, the data is
24 treated as though it were a linear array in storage order.
25 </para>
27 <para>
28 This module is considered <quote>trusted</quote>, that is, it can be
29 installed by non-superusers who have <literal>CREATE</literal> privilege
30 on the current database.
31 </para>
33 <sect2 id="intarray-funcs-ops">
34 <title><filename>intarray</filename> Functions and Operators</title>
36 <para>
37 The functions provided by the <filename>intarray</filename> module
38 are shown in <xref linkend="intarray-func-table"/>, the operators
39 in <xref linkend="intarray-op-table"/>.
40 </para>
42 <table id="intarray-func-table">
43 <title><filename>intarray</filename> Functions</title>
44 <tgroup cols="1">
45 <thead>
46 <row>
47 <entry role="func_table_entry"><para role="func_signature">
48 Function
49 </para>
50 <para>
51 Description
52 </para>
53 <para>
54 Example(s)
55 </para></entry>
56 </row>
57 </thead>
59 <tbody>
60 <row>
61 <entry role="func_table_entry"><para role="func_signature">
62 <indexterm><primary>icount</primary></indexterm>
63 <function>icount</function> ( <type>integer[]</type> )
64 <returnvalue>integer</returnvalue>
65 </para>
66 <para>
67 Returns the number of elements in the array.
68 </para>
69 <para>
70 <literal>icount('{1,2,3}'::integer[])</literal>
71 <returnvalue>3</returnvalue>
72 </para></entry>
73 </row>
75 <row>
76 <entry role="func_table_entry"><para role="func_signature">
77 <indexterm><primary>sort</primary></indexterm>
78 <function>sort</function> ( <type>integer[]</type>, <parameter>dir</parameter> <type>text</type> )
79 <returnvalue>integer[]</returnvalue>
80 </para>
81 <para>
82 Sorts the array in either ascending or descending order.
83 <parameter>dir</parameter> must be <literal>asc</literal>
84 or <literal>desc</literal>.
85 </para>
86 <para>
87 <literal>sort('{1,3,2}'::integer[], 'desc')</literal>
88 <returnvalue>{3,2,1}</returnvalue>
89 </para></entry>
90 </row>
92 <row>
93 <entry role="func_table_entry"><para role="func_signature">
94 <function>sort</function> ( <type>integer[]</type> )
95 <returnvalue>integer[]</returnvalue>
96 </para>
97 <para role="func_signature">
98 <indexterm><primary>sort_asc</primary></indexterm>
99 <function>sort_asc</function> ( <type>integer[]</type> )
100 <returnvalue>integer[]</returnvalue>
101 </para>
102 <para>
103 Sorts in ascending order.
104 </para>
105 <para>
106 <literal>sort(array[11,77,44])</literal>
107 <returnvalue>{11,44,77}</returnvalue>
108 </para></entry>
109 </row>
111 <row>
112 <entry role="func_table_entry"><para role="func_signature">
113 <indexterm><primary>sort_desc</primary></indexterm>
114 <function>sort_desc</function> ( <type>integer[]</type> )
115 <returnvalue>integer[]</returnvalue>
116 </para>
117 <para>
118 Sorts in descending order.
119 </para>
120 <para>
121 <literal>sort_desc(array[11,77,44])</literal>
122 <returnvalue>{77,44,11}</returnvalue>
123 </para></entry>
124 </row>
126 <row>
127 <entry role="func_table_entry"><para role="func_signature">
128 <indexterm><primary>uniq</primary></indexterm>
129 <function>uniq</function> ( <type>integer[]</type> )
130 <returnvalue>integer[]</returnvalue>
131 </para>
132 <para>
133 Removes adjacent duplicates.
134 Often used with <function>sort</function> to remove all duplicates.
135 </para>
136 <para>
137 <literal>uniq('{1,2,2,3,1,1}'::integer[])</literal>
138 <returnvalue>{1,2,3,1}</returnvalue>
139 </para>
140 <para>
141 <literal>uniq(sort('{1,2,3,2,1}'::integer[]))</literal>
142 <returnvalue>{1,2,3}</returnvalue>
143 </para></entry>
144 </row>
146 <row>
147 <entry role="func_table_entry"><para role="func_signature">
148 <indexterm><primary>idx</primary></indexterm>
149 <function>idx</function> ( <type>integer[]</type>, <parameter>item</parameter> <type>integer</type> )
150 <returnvalue>integer</returnvalue>
151 </para>
152 <para>
153 Returns index of the first array element
154 matching <parameter>item</parameter>, or 0 if no match.
155 </para>
156 <para>
157 <literal>idx(array[11,22,33,22,11], 22)</literal>
158 <returnvalue>2</returnvalue>
159 </para></entry>
160 </row>
162 <row>
163 <entry role="func_table_entry"><para role="func_signature">
164 <indexterm><primary>subarray</primary></indexterm>
165 <function>subarray</function> ( <type>integer[]</type>, <parameter>start</parameter> <type>integer</type>, <parameter>len</parameter> <type>integer</type> )
166 <returnvalue>integer[]</returnvalue>
167 </para>
168 <para>
169 Extracts the portion of the array starting at
170 position <parameter>start</parameter>, with <parameter>len</parameter>
171 elements.
172 </para>
173 <para>
174 <literal>subarray('{1,2,3,2,1}'::integer[], 2, 3)</literal>
175 <returnvalue>{2,3,2}</returnvalue>
176 </para></entry>
177 </row>
179 <row>
180 <entry role="func_table_entry"><para role="func_signature">
181 <function>subarray</function> ( <type>integer[]</type>, <parameter>start</parameter> <type>integer</type> )
182 <returnvalue>integer[]</returnvalue>
183 </para>
184 <para>
185 Extracts the portion of the array starting at
186 position <parameter>start</parameter>.
187 </para>
188 <para>
189 <literal>subarray('{1,2,3,2,1}'::integer[], 2)</literal>
190 <returnvalue>{2,3,2,1}</returnvalue>
191 </para></entry>
192 </row>
194 <row>
195 <entry role="func_table_entry"><para role="func_signature">
196 <indexterm><primary>intset</primary></indexterm>
197 <function>intset</function> ( <type>integer</type> )
198 <returnvalue>integer[]</returnvalue>
199 </para>
200 <para>
201 Makes a single-element array.
202 </para>
203 <para>
204 <literal>intset(42)</literal>
205 <returnvalue>{42}</returnvalue>
206 </para></entry>
207 </row>
208 </tbody>
209 </tgroup>
210 </table>
212 <table id="intarray-op-table">
213 <title><filename>intarray</filename> Operators</title>
214 <tgroup cols="1">
215 <thead>
216 <row>
217 <entry role="func_table_entry"><para role="func_signature">
218 Operator
219 </para>
220 <para>
221 Description
222 </para></entry>
223 </row>
224 </thead>
226 <tbody>
227 <row>
228 <entry role="func_table_entry"><para role="func_signature">
229 <type>integer[]</type> <literal>&amp;&amp;</literal> <type>integer[]</type>
230 <returnvalue>boolean</returnvalue>
231 </para>
232 <para>
233 Do arrays overlap (have at least one element in common)?
234 </para></entry>
235 </row>
237 <row>
238 <entry role="func_table_entry"><para role="func_signature">
239 <type>integer[]</type> <literal>@&gt;</literal> <type>integer[]</type>
240 <returnvalue>boolean</returnvalue>
241 </para>
242 <para>
243 Does left array contain right array?
244 </para></entry>
245 </row>
247 <row>
248 <entry role="func_table_entry"><para role="func_signature">
249 <type>integer[]</type> <literal>&lt;@</literal> <type>integer[]</type>
250 <returnvalue>boolean</returnvalue>
251 </para>
252 <para>
253 Is left array contained in right array?
254 </para></entry>
255 </row>
257 <row>
258 <entry role="func_table_entry"><para role="func_signature">
259 <type></type> <literal>#</literal> <type>integer[]</type>
260 <returnvalue>integer</returnvalue>
261 </para>
262 <para>
263 Returns the number of elements in the array.
264 </para></entry>
265 </row>
267 <row>
268 <entry role="func_table_entry"><para role="func_signature">
269 <type>integer[]</type> <literal>#</literal> <type>integer</type>
270 <returnvalue>integer</returnvalue>
271 </para>
272 <para>
273 Returns index of the first array element
274 matching the right argument, or 0 if no match.
275 (Same as <function>idx</function> function.)
276 </para></entry>
277 </row>
279 <row>
280 <entry role="func_table_entry"><para role="func_signature">
281 <type>integer[]</type> <literal>+</literal> <type>integer</type>
282 <returnvalue>integer[]</returnvalue>
283 </para>
284 <para>
285 Adds element to end of array.
286 </para></entry>
287 </row>
289 <row>
290 <entry role="func_table_entry"><para role="func_signature">
291 <type>integer[]</type> <literal>+</literal> <type>integer[]</type>
292 <returnvalue>integer[]</returnvalue>
293 </para>
294 <para>
295 Concatenates the arrays.
296 </para></entry>
297 </row>
299 <row>
300 <entry role="func_table_entry"><para role="func_signature">
301 <type>integer[]</type> <literal>-</literal> <type>integer</type>
302 <returnvalue>integer[]</returnvalue>
303 </para>
304 <para>
305 Removes entries matching the right argument from the array.
306 </para></entry>
307 </row>
309 <row>
310 <entry role="func_table_entry"><para role="func_signature">
311 <type>integer[]</type> <literal>-</literal> <type>integer[]</type>
312 <returnvalue>integer[]</returnvalue>
313 </para>
314 <para>
315 Removes elements of the right array from the left array.
316 </para></entry>
317 </row>
319 <row>
320 <entry role="func_table_entry"><para role="func_signature">
321 <type>integer[]</type> <literal>|</literal> <type>integer</type>
322 <returnvalue>integer[]</returnvalue>
323 </para>
324 <para>
325 Computes the union of the arguments.
326 </para></entry>
327 </row>
329 <row>
330 <entry role="func_table_entry"><para role="func_signature">
331 <type>integer[]</type> <literal>|</literal> <type>integer[]</type>
332 <returnvalue>integer[]</returnvalue>
333 </para>
334 <para>
335 Computes the union of the arguments.
336 </para></entry>
337 </row>
339 <row>
340 <entry role="func_table_entry"><para role="func_signature">
341 <type>integer[]</type> <literal>&amp;</literal> <type>integer[]</type>
342 <returnvalue>integer[]</returnvalue>
343 </para>
344 <para>
345 Computes the intersection of the arguments.
346 </para></entry>
347 </row>
349 <row>
350 <entry role="func_table_entry"><para role="func_signature">
351 <type>integer[]</type> <literal>@@</literal> <type>query_int</type>
352 <returnvalue>boolean</returnvalue>
353 </para>
354 <para>
355 Does array satisfy query? (see below)
356 </para></entry>
357 </row>
359 <row>
360 <entry role="func_table_entry"><para role="func_signature">
361 <type>query_int</type> <literal>~~</literal> <type>integer[]</type>
362 <returnvalue>boolean</returnvalue>
363 </para>
364 <para>
365 Does array satisfy query? (commutator of <literal>@@</literal>)
366 </para></entry>
367 </row>
368 </tbody>
369 </tgroup>
370 </table>
372 <para>
373 The operators <literal>&amp;&amp;</literal>, <literal>@&gt;</literal> and
374 <literal>&lt;@</literal> are equivalent to <productname>PostgreSQL</productname>'s built-in
375 operators of the same names, except that they work only on integer arrays
376 that do not contain nulls, while the built-in operators work for any array
377 type. This restriction makes them faster than the built-in operators
378 in many cases.
379 </para>
381 <para>
382 The <literal>@@</literal> and <literal>~~</literal> operators test whether an array
383 satisfies a <firstterm>query</firstterm>, which is expressed as a value of a
384 specialized data type <type>query_int</type>. A <firstterm>query</firstterm>
385 consists of integer values that are checked against the elements of
386 the array, possibly combined using the operators <literal>&amp;</literal>
387 (AND), <literal>|</literal> (OR), and <literal>!</literal> (NOT). Parentheses
388 can be used as needed. For example,
389 the query <literal>1&amp;(2|3)</literal> matches arrays that contain 1
390 and also contain either 2 or 3.
391 </para>
392 </sect2>
394 <sect2 id="intarray-index">
395 <title>Index Support</title>
397 <para>
398 <filename>intarray</filename> provides index support for the
399 <literal>&amp;&amp;</literal>, <literal>@&gt;</literal>,
400 and <literal>@@</literal> operators, as well as regular array equality.
401 </para>
403 <para>
404 Two parameterized GiST index operator classes are provided:
405 <literal>gist__int_ops</literal> (used by default) is suitable for
406 small- to medium-size data sets, while
407 <literal>gist__intbig_ops</literal> uses a larger signature and is more
408 suitable for indexing large data sets (i.e., columns containing
409 a large number of distinct array values).
410 The implementation uses an RD-tree data structure with
411 built-in lossy compression.
412 </para>
414 <para>
415 <literal>gist__int_ops</literal> approximates an integer set as an array of
416 integer ranges. Its optional integer parameter <literal>numranges</literal>
417 determines the maximum number of ranges in
418 one index key. The default value of <literal>numranges</literal> is 100.
419 Valid values are between 1 and 253. Using larger arrays as GiST index
420 keys leads to a more precise search (scanning a smaller fraction of the index and
421 fewer heap pages), at the cost of a larger index.
422 </para>
424 <para>
425 <literal>gist__intbig_ops</literal> approximates an integer set as a bitmap
426 signature. Its optional integer parameter <literal>siglen</literal>
427 determines the signature length in bytes.
428 The default signature length is 16 bytes. Valid values of signature length
429 are between 1 and 2024 bytes. Longer signatures lead to a more precise
430 search (scanning a smaller fraction of the index and fewer heap pages), at
431 the cost of a larger index.
432 </para>
434 <para>
435 There is also a non-default GIN operator class
436 <literal>gin__int_ops</literal>, which supports these operators as well
437 as <literal>&lt;@</literal>.
438 </para>
440 <para>
441 The choice between GiST and GIN indexing depends on the relative
442 performance characteristics of GiST and GIN, which are discussed elsewhere.
443 </para>
444 </sect2>
446 <sect2 id="intarray-example">
447 <title>Example</title>
449 <programlisting>
450 -- a message can be in one or more <quote>sections</quote>
451 CREATE TABLE message (mid INT PRIMARY KEY, sections INT[], ...);
453 -- create specialized index with signature length of 32 bytes
454 CREATE INDEX message_rdtree_idx ON message USING GIST (sections gist__intbig_ops (siglen = 32));
456 -- select messages in section 1 OR 2 - OVERLAP operator
457 SELECT message.mid FROM message WHERE message.sections &amp;&amp; '{1,2}';
459 -- select messages in sections 1 AND 2 - CONTAINS operator
460 SELECT message.mid FROM message WHERE message.sections @&gt; '{1,2}';
462 -- the same, using QUERY operator
463 SELECT message.mid FROM message WHERE message.sections @@ '1&amp;2'::query_int;
464 </programlisting>
465 </sect2>
467 <sect2 id="intarray-benchmark">
468 <title>Benchmark</title>
470 <para>
471 The source directory <filename>contrib/intarray/bench</filename> contains a
472 benchmark test suite, which can be run against an installed
473 <productname>PostgreSQL</productname> server. (It also requires <filename>DBD::Pg</filename>
474 to be installed.) To run:
475 </para>
477 <programlisting>
478 cd .../contrib/intarray/bench
479 createdb TEST
480 psql -c "CREATE EXTENSION intarray" TEST
481 ./create_test.pl | psql TEST
482 ./bench.pl
483 </programlisting>
485 <para>
486 The <filename>bench.pl</filename> script has numerous options, which
487 are displayed when it is run without any arguments.
488 </para>
489 </sect2>
491 <sect2 id="intarray-Authors">
492 <title>Authors</title>
494 <para>
495 All work was done by Teodor Sigaev (<email>teodor@sigaev.ru</email>) and
496 Oleg Bartunov (<email>oleg@sai.msu.su</email>). See
497 <ulink url="http://www.sai.msu.su/~megera/postgres/gist/"></ulink> for
498 additional information. Andrey Oktyabrski did a great work on adding new
499 functions and operations.
500 </para>
501 </sect2>
503 </sect1>