1 <!-- doc/src/sgml/tablesample-method.sgml -->
3 <chapter id=
"tablesample-method">
4 <title>Writing a Table Sampling Method
</title>
6 <indexterm zone=
"tablesample-method">
7 <primary>table sampling method
</primary>
10 <indexterm zone=
"tablesample-method">
11 <primary><literal>TABLESAMPLE
</literal> method
</primary>
15 <productname>PostgreSQL
</productname>'s implementation of the
<literal>TABLESAMPLE
</literal>
16 clause supports custom table sampling methods, in addition to
17 the
<literal>BERNOULLI
</literal> and
<literal>SYSTEM
</literal> methods that are required
18 by the SQL standard. The sampling method determines which rows of the
19 table will be selected when the
<literal>TABLESAMPLE
</literal> clause is used.
23 At the SQL level, a table sampling method is represented by a single SQL
24 function, typically implemented in C, having the signature
26 method_name(internal) RETURNS tsm_handler
28 The name of the function is the same method name appearing in the
29 <literal>TABLESAMPLE
</literal> clause. The
<type>internal
</type> argument is a dummy
30 (always having value zero) that simply serves to prevent this function from
31 being called directly from an SQL command.
32 The result of the function must be a palloc'd struct of
33 type
<type>TsmRoutine
</type>, which contains pointers to support functions for
34 the sampling method. These support functions are plain C functions and
35 are not visible or callable at the SQL level. The support functions are
36 described in
<xref linkend=
"tablesample-support-functions"/>.
40 In addition to function pointers, the
<type>TsmRoutine
</type> struct must
41 provide these additional fields:
46 <term><literal>List *parameterTypes
</literal></term>
49 This is an OID list containing the data type OIDs of the parameter(s)
50 that will be accepted by the
<literal>TABLESAMPLE
</literal> clause when this
51 sampling method is used. For example, for the built-in methods, this
52 list contains a single item with value
<literal>FLOAT4OID
</literal>, which
53 represents the sampling percentage. Custom sampling methods can have
54 more or different parameters.
60 <term><literal>bool repeatable_across_queries
</literal></term>
63 If
<literal>true
</literal>, the sampling method can deliver identical samples
64 across successive queries, if the same parameters
65 and
<literal>REPEATABLE
</literal> seed value are supplied each time and the
66 table contents have not changed. When this is
<literal>false
</literal>,
67 the
<literal>REPEATABLE
</literal> clause is not accepted for use with the
74 <term><literal>bool repeatable_across_scans
</literal></term>
77 If
<literal>true
</literal>, the sampling method can deliver identical samples
78 across successive scans in the same query (assuming unchanging
79 parameters, seed value, and snapshot).
80 When this is
<literal>false
</literal>, the planner will not select plans that
81 would require scanning the sampled table more than once, since that
82 might result in inconsistent query output.
89 The
<type>TsmRoutine
</type> struct type is declared
90 in
<filename>src/include/access/tsmapi.h
</filename>, which see for additional
95 The table sampling methods included in the standard distribution are good
96 references when trying to write your own. Look into
97 the
<filename>src/backend/access/tablesample
</filename> subdirectory of the source
98 tree for the built-in sampling methods, and into the
<filename>contrib
</filename>
99 subdirectory for add-on methods.
102 <sect1 id=
"tablesample-support-functions">
103 <title>Sampling Method Support Functions
</title>
106 The TSM handler function returns a palloc'd
<type>TsmRoutine
</type> struct
107 containing pointers to the support functions described below. Most of
108 the functions are required, but some are optional, and those pointers can
115 SampleScanGetSampleSize (PlannerInfo *root,
122 This function is called during planning. It must estimate the number of
123 relation pages that will be read during a sample scan, and the number of
124 tuples that will be selected by the scan. (For example, these might be
125 determined by estimating the sampling fraction, and then multiplying
126 the
<literal>baserel-
>pages
</literal> and
<literal>baserel-
>tuples
</literal>
127 numbers by that, being sure to round the results to integral values.)
128 The
<literal>paramexprs
</literal> list holds the expression(s) that are
129 parameters to the
<literal>TABLESAMPLE
</literal> clause. It is recommended to
130 use
<function>estimate_expression_value()
</function> to try to reduce these
131 expressions to constants, if their values are needed for estimation
132 purposes; but the function must provide size estimates even if they cannot
133 be reduced, and it should not fail even if the values appear invalid
134 (remember that they're only estimates of what the run-time values will be).
135 The
<literal>pages
</literal> and
<literal>tuples
</literal> parameters are outputs.
141 InitSampleScan (SampleScanState *node,
145 Initialize for execution of a SampleScan plan node.
146 This is called during executor startup.
147 It should perform any initialization needed before processing can start.
148 The
<structname>SampleScanState
</structname> node has already been created, but
149 its
<structfield>tsm_state
</structfield> field is NULL.
150 The
<function>InitSampleScan
</function> function can palloc whatever internal
151 state data is needed by the sampling method, and store a pointer to
152 it in
<literal>node-
>tsm_state
</literal>.
153 Information about the table to scan is accessible through other fields
154 of the
<structname>SampleScanState
</structname> node (but note that the
155 <literal>node-
>ss.ss_currentScanDesc
</literal> scan descriptor is not set
157 <literal>eflags
</literal> contains flag bits describing the executor's
158 operating mode for this plan node.
162 When
<literal>(eflags
& EXEC_FLAG_EXPLAIN_ONLY)
</literal> is true,
163 the scan will not actually be performed, so this function should only do
164 the minimum required to make the node state valid for
<command>EXPLAIN
</command>
165 and
<function>EndSampleScan
</function>.
169 This function can be omitted (set the pointer to NULL), in which case
170 <function>BeginSampleScan
</function> must perform all initialization needed
171 by the sampling method.
177 BeginSampleScan (SampleScanState *node,
183 Begin execution of a sampling scan.
184 This is called just before the first attempt to fetch a tuple, and
185 may be called again if the scan needs to be restarted.
186 Information about the table to scan is accessible through fields
187 of the
<structname>SampleScanState
</structname> node (but note that the
188 <literal>node-
>ss.ss_currentScanDesc
</literal> scan descriptor is not set
190 The
<literal>params
</literal> array, of length
<literal>nparams
</literal>, contains the
191 values of the parameters supplied in the
<literal>TABLESAMPLE
</literal> clause.
192 These will have the number and types specified in the sampling
193 method's
<literal>parameterTypes
</literal> list, and have been checked
195 <literal>seed
</literal> contains a seed to use for any random numbers generated
196 within the sampling method; it is either a hash derived from the
197 <literal>REPEATABLE
</literal> value if one was given, or the result
198 of
<literal>random()
</literal> if not.
202 This function may adjust the fields
<literal>node-
>use_bulkread
</literal>
203 and
<literal>node-
>use_pagemode
</literal>.
204 If
<literal>node-
>use_bulkread
</literal> is
<literal>true
</literal>, which it is by
205 default, the scan will use a buffer access strategy that encourages
206 recycling buffers after use. It might be reasonable to set this
207 to
<literal>false
</literal> if the scan will visit only a small fraction of the
209 If
<literal>node-
>use_pagemode
</literal> is
<literal>true
</literal>, which it is by
210 default, the scan will perform visibility checking in a single pass for
211 all tuples on each visited page. It might be reasonable to set this
212 to
<literal>false
</literal> if the scan will select only a small fraction of the
213 tuples on each visited page. That will result in fewer tuple visibility
214 checks being performed, though each one will be more expensive because it
215 will require more locking.
219 If the sampling method is
220 marked
<literal>repeatable_across_scans
</literal>, it must be able to
221 select the same set of tuples during a rescan as it did originally, that is
222 a fresh call of
<function>BeginSampleScan
</function> must lead to selecting the
223 same tuples as before (if the
<literal>TABLESAMPLE
</literal> parameters
224 and seed don't change).
230 NextSampleBlock (SampleScanState *node, BlockNumber nblocks);
233 Returns the block number of the next page to be scanned, or
234 <literal>InvalidBlockNumber
</literal> if no pages remain to be scanned.
238 This function can be omitted (set the pointer to NULL), in which case
239 the core code will perform a sequential scan of the entire relation.
240 Such a scan can use synchronized scanning, so that the sampling method
241 cannot assume that the relation pages are visited in the same order on
248 NextSampleTuple (SampleScanState *node,
250 OffsetNumber maxoffset);
253 Returns the offset number of the next tuple to be sampled on the
254 specified page, or
<literal>InvalidOffsetNumber
</literal> if no tuples remain to
255 be sampled.
<literal>maxoffset
</literal> is the largest offset number in use
261 <function>NextSampleTuple
</function> is not explicitly told which of the offset
262 numbers in the range
<literal>1 .. maxoffset
</literal> actually contain valid
263 tuples. This is not normally a problem since the core code ignores
264 requests to sample missing or invisible tuples; that should not result in
265 any bias in the sample. However, if necessary, the function can use
266 <literal>node-
>donetuples
</literal> to examine how many of the tuples
267 it returned were valid and visible.
273 <function>NextSampleTuple
</function> must
<emphasis>not
</emphasis> assume
274 that
<literal>blockno
</literal> is the same page number returned by the most
275 recent
<function>NextSampleBlock
</function> call. It was returned by some
276 previous
<function>NextSampleBlock
</function> call, but the core code is allowed
277 to call
<function>NextSampleBlock
</function> in advance of actually scanning
278 pages, so as to support prefetching. It is OK to assume that once
279 sampling of a given page begins, successive
<function>NextSampleTuple
</function>
280 calls all refer to the same page until
<literal>InvalidOffsetNumber
</literal> is
288 EndSampleScan (SampleScanState *node);
291 End the scan and release resources. It is normally not important
292 to release palloc'd memory, but any externally-visible resources
293 should be cleaned up.
294 This function can be omitted (set the pointer to NULL) in the common
295 case where no such resources exist.