1 <!-- doc/src/sgml/rules.sgml -->
4 <title>The Rule System
</title>
6 <indexterm zone=
"rules">
7 <primary>rule
</primary>
11 This chapter discusses the rule system in
12 <productname>PostgreSQL
</productname>. Production rule systems
13 are conceptually simple, but there are many subtle points
14 involved in actually using them.
18 Some other database systems define active database rules, which
19 are usually stored procedures and triggers. In
20 <productname>PostgreSQL
</productname>, these can be implemented
21 using functions and triggers as well.
25 The rule system (more precisely speaking, the query rewrite rule
26 system) is totally different from stored procedures and triggers.
27 It modifies queries to take rules into consideration, and then
28 passes the modified query to the query planner for planning and
29 execution. It is very powerful, and can be used for many things
30 such as query language procedures, views, and versions. The
31 theoretical foundations and the power of this rule system are
32 also discussed in
<xref linkend=
"ston90b"/> and
<xref
36 <sect1 id=
"querytree">
37 <title>The Query Tree
</title>
39 <indexterm zone=
"querytree">
40 <primary>query tree
</primary>
44 To understand how the rule system works it is necessary to know
45 when it is invoked and what its input and results are.
49 The rule system is located between the parser and the planner.
50 It takes the output of the parser, one query tree, and the user-defined
51 rewrite rules, which are also
52 query trees with some extra information, and creates zero or more
53 query trees as result. So its input and output are always things
54 the parser itself could have produced and thus, anything it sees
55 is basically representable as an
<acronym>SQL
</acronym> statement.
59 Now what is a query tree? It is an internal representation of an
60 <acronym>SQL
</acronym> statement where the single parts that it is
61 built from are stored separately. These query trees can be shown
62 in the server log if you set the configuration parameters
63 <varname>debug_print_parse
</varname>,
64 <varname>debug_print_rewritten
</varname>, or
65 <varname>debug_print_plan
</varname>. The rule actions are also
66 stored as query trees, in the system catalog
67 <structname>pg_rewrite
</structname>. They are not formatted like
68 the log output, but they contain exactly the same information.
72 Reading a raw query tree requires some experience. But since
73 <acronym>SQL
</acronym> representations of query trees are
74 sufficient to understand the rule system, this chapter will not
75 teach how to read them.
79 When reading the
<acronym>SQL
</acronym> representations of the
80 query trees in this chapter it is necessary to be able to identify
81 the parts the statement is broken into when it is in the query tree
82 structure. The parts of a query tree are
91 This is a simple value telling which command
92 (
<command>SELECT
</command>,
<command>INSERT
</command>,
93 <command>UPDATE
</command>,
<command>DELETE
</command>) produced
102 <indexterm><primary>range table
</primary></indexterm>
106 The range table is a list of relations that are used in the query.
107 In a
<command>SELECT
</command> statement these are the relations given after
108 the
<literal>FROM
</literal> key word.
112 Every range table entry identifies a table or view and tells
113 by which name it is called in the other parts of the query.
114 In the query tree, the range table entries are referenced by
115 number rather than by name, so here it doesn't matter if there
116 are duplicate names as it would in an
<acronym>SQL
</acronym>
117 statement. This can happen after the range tables of rules
118 have been merged in. The examples in this chapter will not have
130 This is an index into the range table that identifies the
131 relation where the results of the query go.
135 <command>SELECT
</command> queries don't have a result
136 relation. (The special case of
<command>SELECT INTO
</command> is
137 mostly identical to
<command>CREATE TABLE
</command> followed by
138 <literal>INSERT ... SELECT
</literal>, and is not discussed
143 For
<command>INSERT
</command>,
<command>UPDATE
</command>, and
144 <command>DELETE
</command> commands, the result relation is the table
145 (or view!) where the changes are to take effect.
153 <indexterm><primary>target list
</primary></indexterm>
157 The target list is a list of expressions that define the
158 result of the query. In the case of a
159 <command>SELECT
</command>, these expressions are the ones that
160 build the final output of the query. They correspond to the
161 expressions between the key words
<command>SELECT
</command>
162 and
<command>FROM
</command>. (
<literal>*
</literal> is just an
163 abbreviation for all the column names of a relation. It is
164 expanded by the parser into the individual columns, so the
165 rule system never sees it.)
169 <command>DELETE
</command> commands don't need a normal target list
170 because they don't produce any result. Instead, the planner
171 adds a special
<acronym>CTID
</acronym> entry to the empty target list,
172 to allow the executor to find the row to be deleted.
173 (
<acronym>CTID
</acronym> is added when the result relation is an ordinary
174 table. If it is a view, a whole-row variable is added instead, by
175 the rule system, as described in
<xref linkend=
"rules-views-update"/>.)
179 For
<command>INSERT
</command> commands, the target list describes
180 the new rows that should go into the result relation. It consists of the
181 expressions in the
<literal>VALUES
</literal> clause or the ones from the
182 <command>SELECT
</command> clause in
<literal>INSERT
183 ... SELECT
</literal>. The first step of the rewrite process adds
184 target list entries for any columns that were not assigned to by
185 the original command but have defaults. Any remaining columns (with
186 neither a given value nor a default) will be filled in by the
187 planner with a constant null expression.
191 For
<command>UPDATE
</command> commands, the target list
192 describes the new rows that should replace the old ones. In the
193 rule system, it contains just the expressions from the
<literal>SET
194 column = expression
</literal> part of the command. The planner will
195 handle missing columns by inserting expressions that copy the values
196 from the old row into the new one. Just as for
<command>DELETE
</command>,
197 a
<acronym>CTID
</acronym> or whole-row variable is added so that
198 the executor can identify the old row to be updated.
202 Every entry in the target list contains an expression that can
203 be a constant value, a variable pointing to a column of one
204 of the relations in the range table, a parameter, or an expression
205 tree made of function calls, constants, variables, operators, etc.
216 The query's qualification is an expression much like one of
217 those contained in the target list entries. The result value of
218 this expression is a Boolean that tells whether the operation
219 (
<command>INSERT
</command>,
<command>UPDATE
</command>,
220 <command>DELETE
</command>, or
<command>SELECT
</command>) for the
221 final result row should be executed or not. It corresponds to the
<literal>WHERE
</literal> clause
222 of an
<acronym>SQL
</acronym> statement.
233 The query's join tree shows the structure of the
<literal>FROM
</literal> clause.
234 For a simple query like
<literal>SELECT ... FROM a, b, c
</literal>, the join tree is just
235 a list of the
<literal>FROM
</literal> items, because we are allowed to join them in
236 any order. But when
<literal>JOIN
</literal> expressions, particularly outer joins,
237 are used, we have to join in the order shown by the joins.
238 In that case, the join tree shows the structure of the
<literal>JOIN
</literal> expressions. The
239 restrictions associated with particular
<literal>JOIN
</literal> clauses (from
<literal>ON
</literal> or
240 <literal>USING
</literal> expressions) are stored as qualification expressions attached
241 to those join-tree nodes. It turns out to be convenient to store
242 the top-level
<literal>WHERE
</literal> expression as a qualification attached to the
243 top-level join-tree item, too. So really the join tree represents
244 both the
<literal>FROM
</literal> and
<literal>WHERE
</literal> clauses of a
<command>SELECT
</command>.
255 The other parts of the query tree like the
<literal>ORDER BY
</literal>
256 clause aren't of interest here. The rule system
257 substitutes some entries there while applying rules, but that
258 doesn't have much to do with the fundamentals of the rule
268 <sect1 id=
"rules-views">
269 <title>Views and the Rule System
</title>
271 <indexterm zone=
"rules-views">
272 <primary>rule
</primary>
273 <secondary>and views
</secondary>
276 <indexterm zone=
"rules-views">
277 <primary>view
</primary>
278 <secondary>implementation through rules
</secondary>
282 Views in
<productname>PostgreSQL
</productname> are implemented
283 using the rule system. A view is basically an empty table (having no
284 actual storage) with an
<literal>ON SELECT DO INSTEAD
</literal> rule.
285 Conventionally, that rule is named
<literal>_RETURN
</literal>.
289 CREATE VIEW myview AS SELECT * FROM mytab;
292 is very nearly the same thing as
295 CREATE TABLE myview (
<replaceable>same column list as mytab
</replaceable>);
296 CREATE RULE
"_RETURN" AS ON SELECT TO myview DO INSTEAD
300 although you can't actually write that, because tables are not
301 allowed to have
<literal>ON SELECT
</literal> rules.
305 A view can also have other kinds of
<literal>DO INSTEAD
</literal>
306 rules, allowing
<command>INSERT
</command>,
<command>UPDATE
</command>,
307 or
<command>DELETE
</command> commands to be performed on the view
308 despite its lack of underlying storage.
309 This is discussed further below, in
310 <xref linkend=
"rules-views-update"/>.
313 <sect2 id=
"rules-select">
314 <title>How
<command>SELECT
</command> Rules Work
</title>
316 <indexterm zone=
"rules-select">
317 <primary>rule
</primary>
318 <secondary sortas=
"SELECT">for SELECT
</secondary>
322 Rules
<literal>ON SELECT
</literal> are applied to all queries as the last step, even
323 if the command given is an
<command>INSERT
</command>,
324 <command>UPDATE
</command> or
<command>DELETE
</command>. And they
325 have different semantics from rules on the other command types in that they modify the
326 query tree in place instead of creating a new one. So
327 <command>SELECT
</command> rules are described first.
331 Currently, there can be only one action in an
<literal>ON SELECT
</literal> rule, and it must
332 be an unconditional
<command>SELECT
</command> action that is
<literal>INSTEAD
</literal>. This restriction was
333 required to make rules safe enough to open them for ordinary users, and
334 it restricts
<literal>ON SELECT
</literal> rules to act like views.
338 The examples for this chapter are two join views that do some
339 calculations and some more views using them in turn. One of the
340 two first views is customized later by adding rules for
341 <command>INSERT
</command>,
<command>UPDATE
</command>, and
342 <command>DELETE
</command> operations so that the final result will
343 be a view that behaves like a real table with some magic
344 functionality. This is not such a simple example to start from and
345 this makes things harder to get into. But it's better to have one
346 example that covers all the points discussed step by step rather
347 than having many different ones that might mix up in mind.
351 The real tables we need in the first two rule system descriptions
355 CREATE TABLE shoe_data (
356 shoename text, -- primary key
357 sh_avail integer, -- available number of pairs
358 slcolor text, -- preferred shoelace color
359 slminlen real, -- minimum shoelace length
360 slmaxlen real, -- maximum shoelace length
361 slunit text -- length unit
364 CREATE TABLE shoelace_data (
365 sl_name text, -- primary key
366 sl_avail integer, -- available number of pairs
367 sl_color text, -- shoelace color
368 sl_len real, -- shoelace length
369 sl_unit text -- length unit
373 un_name text, -- primary key
374 un_fact real -- factor to transform to cm
378 As you can see, they represent shoe-store data.
382 The views are created as:
390 sh.slminlen * un.un_fact AS slminlen_cm,
392 sh.slmaxlen * un.un_fact AS slmaxlen_cm,
394 FROM shoe_data sh, unit un
395 WHERE sh.slunit = un.un_name;
397 CREATE VIEW shoelace AS
403 s.sl_len * u.un_fact AS sl_len_cm
404 FROM shoelace_data s, unit u
405 WHERE s.sl_unit = u.un_name;
407 CREATE VIEW shoe_ready AS
412 least(rsh.sh_avail, rsl.sl_avail) AS total_avail
413 FROM shoe rsh, shoelace rsl
414 WHERE rsl.sl_color = rsh.slcolor
415 AND rsl.sl_len_cm
>= rsh.slminlen_cm
416 AND rsl.sl_len_cm
<= rsh.slmaxlen_cm;
419 The
<command>CREATE VIEW
</command> command for the
420 <literal>shoelace
</literal> view (which is the simplest one we
421 have) will create a relation
<literal>shoelace
</literal> and an entry in
422 <structname>pg_rewrite
</structname> that tells that there is a
423 rewrite rule that must be applied whenever the relation
<literal>shoelace
</literal>
424 is referenced in a query's range table. The rule has no rule
425 qualification (discussed later, with the non-
<command>SELECT
</command> rules, since
426 <command>SELECT
</command> rules currently cannot have them) and it is
<literal>INSTEAD
</literal>. Note
427 that rule qualifications are not the same as query qualifications.
428 The action of our rule has a query qualification.
429 The action of the rule is one query tree that is a copy of the
430 <command>SELECT
</command> statement in the view creation command.
436 table entries for
<literal>NEW
</literal> and
<literal>OLD
</literal> that you can see in
437 the
<structname>pg_rewrite
</structname> entry aren't of interest
438 for
<command>SELECT
</command> rules.
443 Now we populate
<literal>unit
</literal>,
<literal>shoe_data
</literal>
444 and
<literal>shoelace_data
</literal> and run a simple query on a view:
447 INSERT INTO unit VALUES ('cm',
1.0);
448 INSERT INTO unit VALUES ('m',
100.0);
449 INSERT INTO unit VALUES ('inch',
2.54);
451 INSERT INTO shoe_data VALUES ('sh1',
2, 'black',
70.0,
90.0, 'cm');
452 INSERT INTO shoe_data VALUES ('sh2',
0, 'black',
30.0,
40.0, 'inch');
453 INSERT INTO shoe_data VALUES ('sh3',
4, 'brown',
50.0,
65.0, 'cm');
454 INSERT INTO shoe_data VALUES ('sh4',
3, 'brown',
40.0,
50.0, 'inch');
456 INSERT INTO shoelace_data VALUES ('sl1',
5, 'black',
80.0, 'cm');
457 INSERT INTO shoelace_data VALUES ('sl2',
6, 'black',
100.0, 'cm');
458 INSERT INTO shoelace_data VALUES ('sl3',
0, 'black',
35.0 , 'inch');
459 INSERT INTO shoelace_data VALUES ('sl4',
8, 'black',
40.0 , 'inch');
460 INSERT INTO shoelace_data VALUES ('sl5',
4, 'brown',
1.0 , 'm');
461 INSERT INTO shoelace_data VALUES ('sl6',
0, 'brown',
0.9 , 'm');
462 INSERT INTO shoelace_data VALUES ('sl7',
7, 'brown',
60 , 'cm');
463 INSERT INTO shoelace_data VALUES ('sl8',
1, 'brown',
40 , 'inch');
465 SELECT * FROM shoelace;
467 sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
468 -----------+----------+----------+--------+---------+-----------
469 sl1 |
5 | black |
80 | cm |
80
470 sl2 |
6 | black |
100 | cm |
100
471 sl7 |
7 | brown |
60 | cm |
60
472 sl3 |
0 | black |
35 | inch |
88.9
473 sl4 |
8 | black |
40 | inch |
101.6
474 sl8 |
1 | brown |
40 | inch |
101.6
475 sl5 |
4 | brown |
1 | m |
100
476 sl6 |
0 | brown |
0.9 | m |
90
482 This is the simplest
<command>SELECT
</command> you can do on our
483 views, so we take this opportunity to explain the basics of view
484 rules. The
<literal>SELECT * FROM shoelace
</literal> was
485 interpreted by the parser and produced the query tree:
488 SELECT shoelace.sl_name, shoelace.sl_avail,
489 shoelace.sl_color, shoelace.sl_len,
490 shoelace.sl_unit, shoelace.sl_len_cm
491 FROM shoelace shoelace;
494 and this is given to the rule system. The rule system walks through the
495 range table and checks if there are rules
496 for any relation. When processing the range table entry for
497 <literal>shoelace
</literal> (the only one up to now) it finds the
498 <literal>_RETURN
</literal> rule with the query tree:
501 SELECT s.sl_name, s.sl_avail,
502 s.sl_color, s.sl_len, s.sl_unit,
503 s.sl_len * u.un_fact AS sl_len_cm
504 FROM shoelace old, shoelace new,
505 shoelace_data s, unit u
506 WHERE s.sl_unit = u.un_name;
511 To expand the view, the rewriter simply creates a subquery range-table
512 entry containing the rule's action query tree, and substitutes this
513 range table entry for the original one that referenced the view. The
514 resulting rewritten query tree is almost the same as if you had typed:
517 SELECT shoelace.sl_name, shoelace.sl_avail,
518 shoelace.sl_color, shoelace.sl_len,
519 shoelace.sl_unit, shoelace.sl_len_cm
520 FROM (SELECT s.sl_name,
525 s.sl_len * u.un_fact AS sl_len_cm
526 FROM shoelace_data s, unit u
527 WHERE s.sl_unit = u.un_name) shoelace;
530 There is one difference however: the subquery's range table has two
531 extra entries
<literal>shoelace old
</literal> and
<literal>shoelace new
</literal>. These entries don't
532 participate directly in the query, since they aren't referenced by
533 the subquery's join tree or target list. The rewriter uses them
534 to store the access privilege check information that was originally present
535 in the range-table entry that referenced the view. In this way, the
536 executor will still check that the user has proper privileges to access
537 the view, even though there's no direct use of the view in the rewritten
542 That was the first rule applied. The rule system will continue checking
543 the remaining range-table entries in the top query (in this example there
544 are no more), and it will recursively check the range-table entries in
545 the added subquery to see if any of them reference views. (But it
546 won't expand
<literal>old
</literal> or
<literal>new
</literal> — otherwise we'd have infinite recursion!)
547 In this example, there are no rewrite rules for
<literal>shoelace_data
</literal> or
<literal>unit
</literal>,
548 so rewriting is complete and the above is the final result given to
553 Now we want to write a query that finds out for which shoes currently in the store
554 we have the matching shoelaces (color and length) and where the
555 total number of exactly matching pairs is greater than or equal to two.
558 SELECT * FROM shoe_ready WHERE total_avail
>=
2;
560 shoename | sh_avail | sl_name | sl_avail | total_avail
561 ----------+----------+---------+----------+-------------
562 sh1 |
2 | sl1 |
5 |
2
563 sh3 |
4 | sl7 |
7 |
4
569 The output of the parser this time is the query tree:
572 SELECT shoe_ready.shoename, shoe_ready.sh_avail,
573 shoe_ready.sl_name, shoe_ready.sl_avail,
574 shoe_ready.total_avail
575 FROM shoe_ready shoe_ready
576 WHERE shoe_ready.total_avail
>=
2;
579 The first rule applied will be the one for the
580 <literal>shoe_ready
</literal> view and it results in the
584 SELECT shoe_ready.shoename, shoe_ready.sh_avail,
585 shoe_ready.sl_name, shoe_ready.sl_avail,
586 shoe_ready.total_avail
587 FROM (SELECT rsh.shoename,
591 least(rsh.sh_avail, rsl.sl_avail) AS total_avail
592 FROM shoe rsh, shoelace rsl
593 WHERE rsl.sl_color = rsh.slcolor
594 AND rsl.sl_len_cm
>= rsh.slminlen_cm
595 AND rsl.sl_len_cm
<= rsh.slmaxlen_cm) shoe_ready
596 WHERE shoe_ready.total_avail
>=
2;
599 Similarly, the rules for
<literal>shoe
</literal> and
600 <literal>shoelace
</literal> are substituted into the range table of
601 the subquery, leading to a three-level final query tree:
604 SELECT shoe_ready.shoename, shoe_ready.sh_avail,
605 shoe_ready.sl_name, shoe_ready.sl_avail,
606 shoe_ready.total_avail
607 FROM (SELECT rsh.shoename,
611 least(rsh.sh_avail, rsl.sl_avail) AS total_avail
612 FROM (SELECT sh.shoename,
616 sh.slminlen * un.un_fact AS slminlen_cm,
618 sh.slmaxlen * un.un_fact AS slmaxlen_cm,
620 FROM shoe_data sh, unit un
621 WHERE sh.slunit = un.un_name) rsh,
627 s.sl_len * u.un_fact AS sl_len_cm
628 FROM shoelace_data s, unit u
629 WHERE s.sl_unit = u.un_name) rsl
630 WHERE rsl.sl_color = rsh.slcolor
631 AND rsl.sl_len_cm
>= rsh.slminlen_cm
632 AND rsl.sl_len_cm
<= rsh.slmaxlen_cm) shoe_ready
633 WHERE shoe_ready.total_avail
> 2;
638 This might look inefficient, but the planner will collapse this into a
639 single-level query tree by
<quote>pulling up
</quote> the subqueries,
640 and then it will plan the joins just as if we'd written them out
641 manually. So collapsing the query tree is an optimization that the
642 rewrite system doesn't have to concern itself with.
646 <sect2 id=
"rules-views-non-select">
647 <title>View Rules in Non-
<command>SELECT
</command> Statements
</title>
650 Two details of the query tree aren't touched in the description of
651 view rules above. These are the command type and the result relation.
652 In fact, the command type is not needed by view rules, but the result
653 relation may affect the way in which the query rewriter works, because
654 special care needs to be taken if the result relation is a view.
658 There are only a few differences between a query tree for a
659 <command>SELECT
</command> and one for any other
660 command. Obviously, they have a different command type and for a
661 command other than a
<command>SELECT
</command>, the result
662 relation points to the range-table entry where the result should
663 go. Everything else is absolutely the same. So having two tables
664 <literal>t1
</literal> and
<literal>t2
</literal> with columns
<literal>a
</literal> and
665 <literal>b
</literal>, the query trees for the two statements:
668 SELECT t2.b FROM t1, t2 WHERE t1.a = t2.a;
670 UPDATE t1 SET b = t2.b FROM t2 WHERE t1.a = t2.a;
673 are nearly identical. In particular:
678 The range tables contain entries for the tables
<literal>t1
</literal> and
<literal>t2
</literal>.
684 The target lists contain one variable that points to column
685 <literal>b
</literal> of the range table entry for table
<literal>t2
</literal>.
691 The qualification expressions compare the columns
<literal>a
</literal> of both
692 range-table entries for equality.
698 The join trees show a simple join between
<literal>t1
</literal> and
<literal>t2
</literal>.
705 The consequence is, that both query trees result in similar
706 execution plans: They are both joins over the two tables. For the
707 <command>UPDATE
</command> the missing columns from
<literal>t1
</literal> are added to
708 the target list by the planner and the final query tree will read
712 UPDATE t1 SET a = t1.a, b = t2.b FROM t2 WHERE t1.a = t2.a;
715 and thus the executor run over the join will produce exactly the
719 SELECT t1.a, t2.b FROM t1, t2 WHERE t1.a = t2.a;
722 But there is a little problem in
723 <command>UPDATE
</command>: the part of the executor plan that does
724 the join does not care what the results from the join are
725 meant for. It just produces a result set of rows. The fact that
726 one is a
<command>SELECT
</command> command and the other is an
727 <command>UPDATE
</command> is handled higher up in the executor, where
728 it knows that this is an
<command>UPDATE
</command>, and it knows that
729 this result should go into table
<literal>t1
</literal>. But which of the rows
730 that are there has to be replaced by the new row?
734 To resolve this problem, another entry is added to the target list
735 in
<command>UPDATE
</command> (and also in
736 <command>DELETE
</command>) statements: the current tuple ID
737 (
<acronym>CTID
</acronym>).
<indexterm><primary>CTID
</primary></indexterm>
738 This is a system column containing the
739 file block number and position in the block for the row. Knowing
740 the table, the
<acronym>CTID
</acronym> can be used to retrieve the
741 original row of
<literal>t1
</literal> to be updated. After adding the
742 <acronym>CTID
</acronym> to the target list, the query actually looks like:
745 SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;
748 Now another detail of
<productname>PostgreSQL
</productname> enters
749 the stage. Old table rows aren't overwritten, and this
750 is why
<command>ROLLBACK
</command> is fast. In an
<command>UPDATE
</command>,
751 the new result row is inserted into the table (after stripping the
752 <acronym>CTID
</acronym>) and in the row header of the old row, which the
753 <acronym>CTID
</acronym> pointed to, the
<literal>cmax
</literal> and
754 <literal>xmax
</literal> entries are set to the current command counter
755 and current transaction ID. Thus the old row is hidden, and after
756 the transaction commits the vacuum cleaner can eventually remove
761 Knowing all that, we can simply apply view rules in absolutely
762 the same way to any command. There is no difference.
766 <sect2 id=
"rules-views-power">
767 <title>The Power of Views in
<productname>PostgreSQL
</productname></title>
770 The above demonstrates how the rule system incorporates view
771 definitions into the original query tree. In the second example, a
772 simple
<command>SELECT
</command> from one view created a final
773 query tree that is a join of
4 tables (
<literal>unit
</literal> was used twice with
778 The benefit of implementing views with the rule system is
779 that the planner has all
780 the information about which tables have to be scanned plus the
781 relationships between these tables plus the restrictive
782 qualifications from the views plus the qualifications from
784 in one single query tree. And this is still the situation
785 when the original query is already a join over views.
786 The planner has to decide which is
787 the best path to execute the query, and the more information
788 the planner has, the better this decision can be. And
789 the rule system as implemented in
<productname>PostgreSQL
</productname>
790 ensures that this is all information available about the query
795 <sect2 id=
"rules-views-update">
796 <title>Updating a View
</title>
799 What happens if a view is named as the target relation for an
800 <command>INSERT
</command>,
<command>UPDATE
</command>,
801 <command>DELETE
</command>, or
<command>MERGE
</command>? Doing the
802 substitutions described above would give a query tree in which the result
803 relation points at a subquery range-table entry, which will not
804 work. There are several ways in which
<productname>PostgreSQL
</productname>
805 can support the appearance of updating a view, however.
806 In order of user-experienced complexity those are: automatically substitute
807 in the underlying table for the view, execute a user-defined trigger,
808 or rewrite the query per a user-defined rule.
809 These options are discussed below.
813 If the subquery selects from a single base relation and is simple
814 enough, the rewriter can automatically replace the subquery with the
815 underlying base relation so that the
<command>INSERT
</command>,
816 <command>UPDATE
</command>,
<command>DELETE
</command>, or
817 <command>MERGE
</command> is applied to the base relation in the
818 appropriate way. Views that are
<quote>simple enough
</quote> for this
819 are called
<firstterm>automatically updatable
</firstterm>. For detailed
820 information on the kinds of view that can be automatically updated, see
821 <xref linkend=
"sql-createview"/>.
825 Alternatively, the operation may be handled by a user-provided
826 <literal>INSTEAD OF
</literal> trigger on the view
827 (see
<xref linkend=
"sql-createtrigger"/>).
828 Rewriting works slightly differently
829 in this case. For
<command>INSERT
</command>, the rewriter does
830 nothing at all with the view, leaving it as the result relation
831 for the query. For
<command>UPDATE
</command>,
<command>DELETE
</command>,
832 and
<command>MERGE
</command>, it's still necessary to expand the
833 view query to produce the
<quote>old
</quote> rows that the command will
834 attempt to update, delete, or merge. So the view is expanded as normal,
835 but another unexpanded range-table entry is added to the query
836 to represent the view in its capacity as the result relation.
840 The problem that now arises is how to identify the rows to be
841 updated in the view. Recall that when the result relation
842 is a table, a special
<acronym>CTID
</acronym> entry is added to the target
843 list to identify the physical locations of the rows to be updated.
844 This does not work if the result relation is a view, because a view
845 does not have any
<acronym>CTID
</acronym>, since its rows do not have
846 actual physical locations. Instead, for an
<command>UPDATE
</command>,
847 <command>DELETE
</command>, or
<command>MERGE
</command> operation, a
848 special
<literal>wholerow
</literal> entry is added to the target list,
849 which expands to include all columns from the view. The executor uses this
850 value to supply the
<quote>old
</quote> row to the
851 <literal>INSTEAD OF
</literal> trigger. It is up to the trigger to work
852 out what to update based on the old and new row values.
856 Another possibility is for the user to define
<literal>INSTEAD
</literal>
857 rules that specify substitute actions for
<command>INSERT
</command>,
858 <command>UPDATE
</command>, and
<command>DELETE
</command> commands on
859 a view. These rules will rewrite the command, typically into a command
860 that updates one or more tables, rather than views. That is the topic
861 of
<xref linkend=
"rules-update"/>. Note that this will not work with
862 <command>MERGE
</command>, which currently does not support rules on
863 the target relation other than
<command>SELECT
</command> rules.
867 Note that rules are evaluated first, rewriting the original query
868 before it is planned and executed. Therefore, if a view has
869 <literal>INSTEAD OF
</literal> triggers as well as rules on
<command>INSERT
</command>,
870 <command>UPDATE
</command>, or
<command>DELETE
</command>, then the rules will be
871 evaluated first, and depending on the result, the triggers may not be
876 Automatic rewriting of an
<command>INSERT
</command>,
877 <command>UPDATE
</command>,
<command>DELETE
</command>, or
878 <command>MERGE
</command> query on a
879 simple view is always tried last. Therefore, if a view has rules or
880 triggers, they will override the default behavior of automatically
885 If there are no
<literal>INSTEAD
</literal> rules or
<literal>INSTEAD OF
</literal>
886 triggers for the view, and the rewriter cannot automatically rewrite
887 the query as an update on the underlying base relation, an error will
888 be thrown because the executor cannot update a view as such.
895 <sect1 id=
"rules-materializedviews">
896 <title>Materialized Views
</title>
898 <indexterm zone=
"rules-materializedviews">
899 <primary>rule
</primary>
900 <secondary>and materialized views
</secondary>
903 <indexterm zone=
"rules-materializedviews">
904 <primary>materialized view
</primary>
905 <secondary>implementation through rules
</secondary>
908 <indexterm zone=
"rules-materializedviews">
909 <primary>view
</primary>
910 <secondary>materialized
</secondary>
914 Materialized views in
<productname>PostgreSQL
</productname> use the
915 rule system like views do, but persist the results in a table-like form.
916 The main differences between:
919 CREATE MATERIALIZED VIEW mymatview AS SELECT * FROM mytab;
925 CREATE TABLE mymatview AS SELECT * FROM mytab;
928 are that the materialized view cannot subsequently be directly updated
929 and that the query used to create the materialized view is stored in
930 exactly the same way that a view's query is stored, so that fresh data
931 can be generated for the materialized view with:
934 REFRESH MATERIALIZED VIEW mymatview;
937 The information about a materialized view in the
938 <productname>PostgreSQL
</productname> system catalogs is exactly
939 the same as it is for a table or view. So for the parser, a
940 materialized view is a relation, just like a table or a view. When
941 a materialized view is referenced in a query, the data is returned
942 directly from the materialized view, like from a table; the rule is
943 only used for populating the materialized view.
947 While access to the data stored in a materialized view is often much
948 faster than accessing the underlying tables directly or through a view,
949 the data is not always current; yet sometimes current data is not needed.
950 Consider a table which records sales:
953 CREATE TABLE invoice (
954 invoice_no integer PRIMARY KEY,
955 seller_no integer, -- ID of salesperson
956 invoice_date date, -- date of sale
957 invoice_amt numeric(
13,
2) -- amount of sale
961 If people want to be able to quickly graph historical sales data, they
962 might want to summarize, and they may not care about the incomplete data
963 for the current date:
966 CREATE MATERIALIZED VIEW sales_summary AS
970 sum(invoice_amt)::numeric(
13,
2) as sales_amt
972 WHERE invoice_date
< CURRENT_DATE
977 CREATE UNIQUE INDEX sales_summary_seller
978 ON sales_summary (seller_no, invoice_date);
981 This materialized view might be useful for displaying a graph in the
982 dashboard created for salespeople. A job could be scheduled to update
983 the statistics each night using this SQL statement:
986 REFRESH MATERIALIZED VIEW sales_summary;
991 Another use for a materialized view is to allow faster access to data
992 brought across from a remote system through a foreign data wrapper.
993 A simple example using
<literal>file_fdw
</literal> is below, with timings,
994 but since this is using cache on the local system the performance
995 difference compared to access to a remote system would usually be greater
996 than shown here. Notice we are also exploiting the ability to put an
997 index on the materialized view, whereas
<literal>file_fdw
</literal> does
998 not support indexes; this advantage might not apply for other sorts of
1006 CREATE EXTENSION file_fdw;
1007 CREATE SERVER local_file FOREIGN DATA WRAPPER file_fdw;
1008 CREATE FOREIGN TABLE words (word text NOT NULL)
1010 OPTIONS (filename '/usr/share/dict/words');
1011 CREATE MATERIALIZED VIEW wrd AS SELECT * FROM words;
1012 CREATE UNIQUE INDEX wrd_word ON wrd (word);
1013 CREATE EXTENSION pg_trgm;
1014 CREATE INDEX wrd_trgm ON wrd USING gist (word gist_trgm_ops);
1018 Now let's spell-check a word. Using
<literal>file_fdw
</literal> directly:
1021 SELECT count(*) FROM words WHERE word = 'caterpiler';
1029 With
<command>EXPLAIN ANALYZE
</command>, we see:
1032 Aggregate (cost=
21763.99.
.21764.00 rows=
1 width=
0) (actual time=
188.180.
.188.181 rows=
1 loops=
1)
1033 -
> Foreign Scan on words (cost=
0.00.
.21761.41 rows=
1032 width=
0) (actual time=
188.177.
.188.177 rows=
0 loops=
1)
1034 Filter: (word = 'caterpiler'::text)
1035 Rows Removed by Filter:
479829
1036 Foreign File: /usr/share/dict/words
1037 Foreign File Size:
4953699
1038 Planning time:
0.118 ms
1039 Execution time:
188.273 ms
1042 If the materialized view is used instead, the query is much faster:
1045 Aggregate (cost=
4.44.
.4.45 rows=
1 width=
0) (actual time=
0.042.
.0.042 rows=
1 loops=
1)
1046 -
> Index Only Scan using wrd_word on wrd (cost=
0.42.
.4.44 rows=
1 width=
0) (actual time=
0.039.
.0.039 rows=
0 loops=
1)
1047 Index Cond: (word = 'caterpiler'::text)
1049 Planning time:
0.164 ms
1050 Execution time:
0.117 ms
1053 Either way, the word is spelled wrong, so let's look for what we might
1054 have wanted. Again using
<literal>file_fdw
</literal> and
1055 <literal>pg_trgm
</literal>:
1058 SELECT word FROM words ORDER BY word
<-
> 'caterpiler' LIMIT
10;
1076 Limit (cost=
11583.61.
.11583.64 rows=
10 width=
32) (actual time=
1431.591.
.1431.594 rows=
10 loops=
1)
1077 -
> Sort (cost=
11583.61.
.11804.76 rows=
88459 width=
32) (actual time=
1431.589.
.1431.591 rows=
10 loops=
1)
1078 Sort Key: ((word
<-
> 'caterpiler'::text))
1079 Sort Method: top-N heapsort Memory:
25kB
1080 -
> Foreign Scan on words (cost=
0.00.
.9672.05 rows=
88459 width=
32) (actual time=
0.057.
.1286.455 rows=
479829 loops=
1)
1081 Foreign File: /usr/share/dict/words
1082 Foreign File Size:
4953699
1083 Planning time:
0.128 ms
1084 Execution time:
1431.679 ms
1087 Using the materialized view:
1090 Limit (cost=
0.29.
.1.06 rows=
10 width=
10) (actual time=
187.222.
.188.257 rows=
10 loops=
1)
1091 -
> Index Scan using wrd_trgm on wrd (cost=
0.29.
.37020.87 rows=
479829 width=
10) (actual time=
187.219.
.188.252 rows=
10 loops=
1)
1092 Order By: (word
<-
> 'caterpiler'::text)
1093 Planning time:
0.196 ms
1094 Execution time:
198.640 ms
1097 If you can tolerate periodic update of the remote data to the local
1098 database, the performance benefit can be substantial.
1103 <sect1 id=
"rules-update">
1104 <title>Rules on
<command>INSERT
</command>,
<command>UPDATE
</command>, and
<command>DELETE
</command></title>
1106 <indexterm zone=
"rules-update">
1107 <primary>rule
</primary>
1108 <secondary sortas=
"INSERT">for INSERT
</secondary>
1111 <indexterm zone=
"rules-update">
1112 <primary>rule
</primary>
1113 <secondary sortas=
"UPDATE">for UPDATE
</secondary>
1116 <indexterm zone=
"rules-update">
1117 <primary>rule
</primary>
1118 <secondary sortas=
"DELETE">for DELETE
</secondary>
1122 Rules that are defined on
<command>INSERT
</command>,
<command>UPDATE
</command>,
1123 and
<command>DELETE
</command> are significantly different from the view rules
1124 described in the previous sections. First, their
<command>CREATE
1125 RULE
</command> command allows more:
1130 They are allowed to have no action.
1136 They can have multiple actions.
1142 They can be
<literal>INSTEAD
</literal> or
<literal>ALSO
</literal> (the default).
1148 The pseudorelations
<literal>NEW
</literal> and
<literal>OLD
</literal> become useful.
1154 They can have rule qualifications.
1159 Second, they don't modify the query tree in place. Instead they
1160 create zero or more new query trees and can throw away the
1166 In many cases, tasks that could be performed by rules
1167 on
<command>INSERT
</command>/
<command>UPDATE
</command>/
<command>DELETE
</command> are better done
1168 with triggers. Triggers are notationally a bit more complicated, but their
1169 semantics are much simpler to understand. Rules tend to have surprising
1170 results when the original query contains volatile functions: volatile
1171 functions may get executed more times than expected in the process of
1172 carrying out the rules.
1176 Also, there are some cases that are not supported by these types of rules at
1177 all, notably including
<literal>WITH
</literal> clauses in the original query and
1178 multiple-assignment sub-
<literal>SELECT
</literal>s in the
<literal>SET
</literal> list
1179 of
<command>UPDATE
</command> queries. This is because copying these constructs
1180 into a rule query would result in multiple evaluations of the sub-query,
1181 contrary to the express intent of the query's author.
1185 <sect2 id=
"rules-update-how">
1186 <title>How Update Rules Work
</title>
1192 CREATE [ OR REPLACE ] RULE
<replaceable class=
"parameter">name
</replaceable> AS ON
<replaceable class=
"parameter">event
</replaceable>
1193 TO
<replaceable class=
"parameter">table
</replaceable> [ WHERE
<replaceable class=
"parameter">condition
</replaceable> ]
1194 DO [ ALSO | INSTEAD ] { NOTHING |
<replaceable class=
"parameter">command
</replaceable> | (
<replaceable class=
"parameter">command
</replaceable> ;
<replaceable class=
"parameter">command
</replaceable> ... ) }
1198 In the following,
<firstterm>update rules
</firstterm> means rules that are defined
1199 on
<command>INSERT
</command>,
<command>UPDATE
</command>, or
<command>DELETE
</command>.
1203 Update rules get applied by the rule system when the result
1204 relation and the command type of a query tree are equal to the
1205 object and event given in the
<command>CREATE RULE
</command> command.
1206 For update rules, the rule system creates a list of query trees.
1207 Initially the query-tree list is empty.
1208 There can be zero (
<literal>NOTHING
</literal> key word), one, or multiple actions.
1209 To simplify, we will look at a rule with one action. This rule
1210 can have a qualification or not and it can be
<literal>INSTEAD
</literal> or
1211 <literal>ALSO
</literal> (the default).
1215 What is a rule qualification? It is a restriction that tells
1216 when the actions of the rule should be done and when not. This
1217 qualification can only reference the pseudorelations
<literal>NEW
</literal> and/or
<literal>OLD
</literal>,
1218 which basically represent the relation that was given as object (but with a
1223 So we have three cases that produce the following query trees for
1228 <term>No qualification, with either
<literal>ALSO
</literal> or
1229 <literal>INSTEAD
</literal></term>
1232 the query tree from the rule action with the original query
1233 tree's qualification added
1239 <term>Qualification given and
<literal>ALSO
</literal></term>
1242 the query tree from the rule action with the rule
1243 qualification and the original query tree's qualification
1250 <term>Qualification given and
<literal>INSTEAD
</literal></term>
1253 the query tree from the rule action with the rule
1254 qualification and the original query tree's qualification; and
1255 the original query tree with the negated rule qualification
1262 Finally, if the rule is
<literal>ALSO
</literal>, the unchanged original query tree is
1263 added to the list. Since only qualified
<literal>INSTEAD
</literal> rules already add the
1264 original query tree, we end up with either one or two output query trees
1265 for a rule with one action.
1269 For
<literal>ON INSERT
</literal> rules, the original query (if not suppressed by
<literal>INSTEAD
</literal>)
1270 is done before any actions added by rules. This allows the actions to
1271 see the inserted row(s). But for
<literal>ON UPDATE
</literal> and
<literal>ON
1272 DELETE
</literal> rules, the original query is done after the actions added by rules.
1273 This ensures that the actions can see the to-be-updated or to-be-deleted
1274 rows; otherwise, the actions might do nothing because they find no rows
1275 matching their qualifications.
1279 The query trees generated from rule actions are thrown into the
1280 rewrite system again, and maybe more rules get applied resulting
1281 in additional or fewer query trees.
1282 So a rule's actions must have either a different
1283 command type or a different result relation than the rule itself is
1284 on, otherwise this recursive process will end up in an infinite loop.
1285 (Recursive expansion of a rule will be detected and reported as an
1290 The query trees found in the actions of the
1291 <structname>pg_rewrite
</structname> system catalog are only
1292 templates. Since they can reference the range-table entries for
1293 <literal>NEW
</literal> and
<literal>OLD
</literal>, some substitutions have to be made before they can be
1294 used. For any reference to
<literal>NEW
</literal>, the target list of the original
1295 query is searched for a corresponding entry. If found, that
1296 entry's expression replaces the reference. Otherwise,
<literal>NEW
</literal> means the
1297 same as
<literal>OLD
</literal> (for an
<command>UPDATE
</command>) or is replaced by
1298 a null value (for an
<command>INSERT
</command>). Any reference to
<literal>OLD
</literal> is
1299 replaced by a reference to the range-table entry that is the
1304 After the system is done applying update rules, it applies view rules to the
1305 produced query tree(s). Views cannot insert new update actions so
1306 there is no need to apply update rules to the output of view rewriting.
1309 <sect3 id=
"rules-update-how-first">
1310 <title>A First Rule Step by Step
</title>
1313 Say we want to trace changes to the
<literal>sl_avail
</literal> column in the
1314 <literal>shoelace_data
</literal> relation. So we set up a log table
1315 and a rule that conditionally writes a log entry when an
1316 <command>UPDATE
</command> is performed on
1317 <literal>shoelace_data
</literal>.
1320 CREATE TABLE shoelace_log (
1321 sl_name text, -- shoelace changed
1322 sl_avail integer, -- new available value
1323 log_who text, -- who did it
1324 log_when timestamp -- when
1327 CREATE RULE log_shoelace AS ON UPDATE TO shoelace_data
1328 WHERE NEW.sl_avail
<> OLD.sl_avail
1329 DO INSERT INTO shoelace_log VALUES (
1342 UPDATE shoelace_data SET sl_avail =
6 WHERE sl_name = 'sl7';
1345 and we look at the log table:
1348 SELECT * FROM shoelace_log;
1350 sl_name | sl_avail | log_who | log_when
1351 ---------+----------+---------+----------------------------------
1352 sl7 |
6 | Al | Tue Oct
20 16:
14:
45 1998 MET DST
1358 That's what we expected. What happened in the background is the following.
1359 The parser created the query tree:
1362 UPDATE shoelace_data SET sl_avail =
6
1363 FROM shoelace_data shoelace_data
1364 WHERE shoelace_data.sl_name = 'sl7';
1367 There is a rule
<literal>log_shoelace
</literal> that is
<literal>ON UPDATE
</literal> with the rule
1368 qualification expression:
1371 NEW.sl_avail
<> OLD.sl_avail
1377 INSERT INTO shoelace_log VALUES (
1378 new.sl_name, new.sl_avail,
1379 current_user, current_timestamp )
1380 FROM shoelace_data new, shoelace_data old;
1383 (This looks a little strange since you cannot normally write
1384 <literal>INSERT ... VALUES ... FROM
</literal>. The
<literal>FROM
</literal>
1385 clause here is just to indicate that there are range-table entries
1386 in the query tree for
<literal>new
</literal> and
<literal>old
</literal>.
1387 These are needed so that they can be referenced by variables in
1388 the
<command>INSERT
</command> command's query tree.)
1392 The rule is a qualified
<literal>ALSO
</literal> rule, so the rule system
1393 has to return two query trees: the modified rule action and the original
1394 query tree. In step
1, the range table of the original query is
1395 incorporated into the rule's action query tree. This results in:
1398 INSERT INTO shoelace_log VALUES (
1399 new.sl_name, new.sl_avail,
1400 current_user, current_timestamp )
1401 FROM shoelace_data new, shoelace_data old,
1402 <emphasis>shoelace_data shoelace_data
</emphasis>;
1405 In step
2, the rule qualification is added to it, so the result set
1406 is restricted to rows where
<literal>sl_avail
</literal> changes:
1409 INSERT INTO shoelace_log VALUES (
1410 new.sl_name, new.sl_avail,
1411 current_user, current_timestamp )
1412 FROM shoelace_data new, shoelace_data old,
1413 shoelace_data shoelace_data
1414 <emphasis>WHERE new.sl_avail
<> old.sl_avail
</emphasis>;
1417 (This looks even stranger, since
<literal>INSERT ... VALUES
</literal> doesn't have
1418 a
<literal>WHERE
</literal> clause either, but the planner and executor will have no
1419 difficulty with it. They need to support this same functionality
1420 anyway for
<literal>INSERT ... SELECT
</literal>.)
1424 In step
3, the original query tree's qualification is added,
1425 restricting the result set further to only the rows that would have been touched
1426 by the original query:
1429 INSERT INTO shoelace_log VALUES (
1430 new.sl_name, new.sl_avail,
1431 current_user, current_timestamp )
1432 FROM shoelace_data new, shoelace_data old,
1433 shoelace_data shoelace_data
1434 WHERE new.sl_avail
<> old.sl_avail
1435 <emphasis>AND shoelace_data.sl_name = 'sl7'
</emphasis>;
1440 Step
4 replaces references to
<literal>NEW
</literal> by the target list entries from the
1441 original query tree or by the matching variable references
1442 from the result relation:
1445 INSERT INTO shoelace_log VALUES (
1446 <emphasis>shoelace_data.sl_name
</emphasis>,
<emphasis>6</emphasis>,
1447 current_user, current_timestamp )
1448 FROM shoelace_data new, shoelace_data old,
1449 shoelace_data shoelace_data
1450 WHERE
<emphasis>6</emphasis> <> old.sl_avail
1451 AND shoelace_data.sl_name = 'sl7';
1457 Step
5 changes
<literal>OLD
</literal> references into result relation references:
1460 INSERT INTO shoelace_log VALUES (
1461 shoelace_data.sl_name,
6,
1462 current_user, current_timestamp )
1463 FROM shoelace_data new, shoelace_data old,
1464 shoelace_data shoelace_data
1465 WHERE
6 <> <emphasis>shoelace_data.sl_avail
</emphasis>
1466 AND shoelace_data.sl_name = 'sl7';
1471 That's it. Since the rule is
<literal>ALSO
</literal>, we also output the
1472 original query tree. In short, the output from the rule system
1473 is a list of two query trees that correspond to these statements:
1476 INSERT INTO shoelace_log VALUES (
1477 shoelace_data.sl_name,
6,
1478 current_user, current_timestamp )
1480 WHERE
6 <> shoelace_data.sl_avail
1481 AND shoelace_data.sl_name = 'sl7';
1483 UPDATE shoelace_data SET sl_avail =
6
1484 WHERE sl_name = 'sl7';
1487 These are executed in this order, and that is exactly what
1488 the rule was meant to do.
1492 The substitutions and the added qualifications
1493 ensure that, if the original query would be, say:
1496 UPDATE shoelace_data SET sl_color = 'green'
1497 WHERE sl_name = 'sl7';
1500 no log entry would get written. In that case, the original query
1501 tree does not contain a target list entry for
1502 <literal>sl_avail
</literal>, so
<literal>NEW.sl_avail
</literal> will get
1503 replaced by
<literal>shoelace_data.sl_avail
</literal>. Thus, the extra
1504 command generated by the rule is:
1507 INSERT INTO shoelace_log VALUES (
1508 shoelace_data.sl_name,
<emphasis>shoelace_data.sl_avail
</emphasis>,
1509 current_user, current_timestamp )
1511 WHERE
<emphasis>shoelace_data.sl_avail
</emphasis> <> shoelace_data.sl_avail
1512 AND shoelace_data.sl_name = 'sl7';
1515 and that qualification will never be true.
1519 It will also work if the original query modifies multiple rows. So
1520 if someone issued the command:
1523 UPDATE shoelace_data SET sl_avail =
0
1524 WHERE sl_color = 'black';
1527 four rows in fact get updated (
<literal>sl1
</literal>,
<literal>sl2
</literal>,
<literal>sl3
</literal>, and
<literal>sl4
</literal>).
1528 But
<literal>sl3
</literal> already has
<literal>sl_avail =
0</literal>. In this case, the original
1529 query trees qualification is different and that results
1530 in the extra query tree:
1533 INSERT INTO shoelace_log
1534 SELECT shoelace_data.sl_name,
0,
1535 current_user, current_timestamp
1537 WHERE
0 <> shoelace_data.sl_avail
1538 AND
<emphasis>shoelace_data.sl_color = 'black'
</emphasis>;
1541 being generated by the rule. This query tree will surely insert
1542 three new log entries. And that's absolutely correct.
1546 Here we can see why it is important that the original query tree
1547 is executed last. If the
<command>UPDATE
</command> had been
1548 executed first, all the rows would have already been set to zero, so the
1549 logging
<command>INSERT
</command> would not find any row where
1550 <literal>0 <> shoelace_data.sl_avail
</literal>.
1556 <sect2 id=
"rules-update-views">
1557 <title>Cooperation with Views
</title>
1559 <indexterm zone=
"rules-update-views"><primary>view
</primary><secondary>updating
</secondary></indexterm>
1562 A simple way to protect view relations from the mentioned
1563 possibility that someone can try to run
<command>INSERT
</command>,
1564 <command>UPDATE
</command>, or
<command>DELETE
</command> on them is
1565 to let those query trees get thrown away. So we could create the rules:
1568 CREATE RULE shoe_ins_protect AS ON INSERT TO shoe
1570 CREATE RULE shoe_upd_protect AS ON UPDATE TO shoe
1572 CREATE RULE shoe_del_protect AS ON DELETE TO shoe
1576 If someone now tries to do any of these operations on the view
1577 relation
<literal>shoe
</literal>, the rule system will
1578 apply these rules. Since the rules have
1579 no actions and are
<literal>INSTEAD
</literal>, the resulting list of
1580 query trees will be empty and the whole query will become
1581 nothing because there is nothing left to be optimized or
1582 executed after the rule system is done with it.
1586 A more sophisticated way to use the rule system is to
1587 create rules that rewrite the query tree into one that
1588 does the right operation on the real tables. To do that
1589 on the
<literal>shoelace
</literal> view, we create
1590 the following rules:
1593 CREATE RULE shoelace_ins AS ON INSERT TO shoelace
1595 INSERT INTO shoelace_data VALUES (
1603 CREATE RULE shoelace_upd AS ON UPDATE TO shoelace
1605 UPDATE shoelace_data
1606 SET sl_name = NEW.sl_name,
1607 sl_avail = NEW.sl_avail,
1608 sl_color = NEW.sl_color,
1609 sl_len = NEW.sl_len,
1610 sl_unit = NEW.sl_unit
1611 WHERE sl_name = OLD.sl_name;
1613 CREATE RULE shoelace_del AS ON DELETE TO shoelace
1615 DELETE FROM shoelace_data
1616 WHERE sl_name = OLD.sl_name;
1621 If you want to support
<literal>RETURNING
</literal> queries on the view,
1622 you need to make the rules include
<literal>RETURNING
</literal> clauses that
1623 compute the view rows. This is usually pretty trivial for views on a
1624 single table, but it's a bit tedious for join views such as
1625 <literal>shoelace
</literal>. An example for the insert case is:
1628 CREATE RULE shoelace_ins AS ON INSERT TO shoelace
1630 INSERT INTO shoelace_data VALUES (
1639 (SELECT shoelace_data.sl_len * u.un_fact
1640 FROM unit u WHERE shoelace_data.sl_unit = u.un_name);
1643 Note that this one rule supports both
<command>INSERT
</command> and
1644 <command>INSERT RETURNING
</command> queries on the view
— the
1645 <literal>RETURNING
</literal> clause is simply ignored for
<command>INSERT
</command>.
1649 Note that in the
<literal>RETURNING
</literal> clause of a rule,
1650 <literal>OLD
</literal> and
<literal>NEW
</literal> refer to the
1651 pseudorelations added as extra range table entries to the rewritten
1652 query, rather than old/new rows in the result relation. Thus, for
1653 example, in a rule supporting
<command>UPDATE
</command> queries on this
1654 view, if the
<literal>RETURNING
</literal> clause contained
1655 <literal>old.sl_name
</literal>, the old name would always be returned,
1656 regardless of whether the
<literal>RETURNING
</literal> clause in the
1657 query on the view specified
<literal>OLD
</literal> or
<literal>NEW
</literal>,
1658 which might be confusing. To avoid this confusion, and support returning
1659 old and new values in queries on the view, the
<literal>RETURNING
</literal>
1660 clause in the rule definition should refer to entries from the result
1661 relation such as
<literal>shoelace_data.sl_name
</literal>, without
1662 specifying
<literal>OLD
</literal> or
<literal>NEW
</literal>.
1666 Now assume that once in a while, a pack of shoelaces arrives at
1667 the shop and a big parts list along with it. But you don't want
1668 to manually update the
<literal>shoelace
</literal> view every
1669 time. Instead we set up two little tables: one where you can
1670 insert the items from the part list, and one with a special
1671 trick. The creation commands for these are:
1674 CREATE TABLE shoelace_arrive (
1679 CREATE TABLE shoelace_ok (
1684 CREATE RULE shoelace_ok_ins AS ON INSERT TO shoelace_ok
1687 SET sl_avail = sl_avail + NEW.ok_quant
1688 WHERE sl_name = NEW.ok_name;
1691 Now you can fill the table
<literal>shoelace_arrive
</literal> with
1692 the data from the parts list:
1695 SELECT * FROM shoelace_arrive;
1697 arr_name | arr_quant
1698 ----------+-----------
1705 Take a quick look at the current data:
1708 SELECT * FROM shoelace;
1710 sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1711 ----------+----------+----------+--------+---------+-----------
1712 sl1 |
5 | black |
80 | cm |
80
1713 sl2 |
6 | black |
100 | cm |
100
1714 sl7 |
6 | brown |
60 | cm |
60
1715 sl3 |
0 | black |
35 | inch |
88.9
1716 sl4 |
8 | black |
40 | inch |
101.6
1717 sl8 |
1 | brown |
40 | inch |
101.6
1718 sl5 |
4 | brown |
1 | m |
100
1719 sl6 |
0 | brown |
0.9 | m |
90
1723 Now move the arrived shoelaces in:
1726 INSERT INTO shoelace_ok SELECT * FROM shoelace_arrive;
1729 and check the results:
1732 SELECT * FROM shoelace ORDER BY sl_name;
1734 sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1735 ----------+----------+----------+--------+---------+-----------
1736 sl1 |
5 | black |
80 | cm |
80
1737 sl2 |
6 | black |
100 | cm |
100
1738 sl7 |
6 | brown |
60 | cm |
60
1739 sl4 |
8 | black |
40 | inch |
101.6
1740 sl3 |
10 | black |
35 | inch |
88.9
1741 sl8 |
21 | brown |
40 | inch |
101.6
1742 sl5 |
4 | brown |
1 | m |
100
1743 sl6 |
20 | brown |
0.9 | m |
90
1746 SELECT * FROM shoelace_log;
1748 sl_name | sl_avail | log_who| log_when
1749 ---------+----------+--------+----------------------------------
1750 sl7 |
6 | Al | Tue Oct
20 19:
14:
45 1998 MET DST
1751 sl3 |
10 | Al | Tue Oct
20 19:
25:
16 1998 MET DST
1752 sl6 |
20 | Al | Tue Oct
20 19:
25:
16 1998 MET DST
1753 sl8 |
21 | Al | Tue Oct
20 19:
25:
16 1998 MET DST
1759 It's a long way from the one
<literal>INSERT ... SELECT
</literal>
1760 to these results. And the description of the query-tree
1761 transformation will be the last in this chapter. First, there is
1762 the parser's output:
1765 INSERT INTO shoelace_ok
1766 SELECT shoelace_arrive.arr_name, shoelace_arrive.arr_quant
1767 FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok;
1770 Now the first rule
<literal>shoelace_ok_ins
</literal> is applied and turns this
1775 SET sl_avail = shoelace.sl_avail + shoelace_arrive.arr_quant
1776 FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1777 shoelace_ok old, shoelace_ok new,
1779 WHERE shoelace.sl_name = shoelace_arrive.arr_name;
1782 and throws away the original
<command>INSERT
</command> on
1783 <literal>shoelace_ok
</literal>. This rewritten query is passed to
1784 the rule system again, and the second applied rule
1785 <literal>shoelace_upd
</literal> produces:
1788 UPDATE shoelace_data
1789 SET sl_name = shoelace.sl_name,
1790 sl_avail = shoelace.sl_avail + shoelace_arrive.arr_quant,
1791 sl_color = shoelace.sl_color,
1792 sl_len = shoelace.sl_len,
1793 sl_unit = shoelace.sl_unit
1794 FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1795 shoelace_ok old, shoelace_ok new,
1796 shoelace shoelace, shoelace old,
1797 shoelace new, shoelace_data shoelace_data
1798 WHERE shoelace.sl_name = shoelace_arrive.arr_name
1799 AND shoelace_data.sl_name = shoelace.sl_name;
1802 Again it's an
<literal>INSTEAD
</literal> rule and the previous query tree is trashed.
1803 Note that this query still uses the view
<literal>shoelace
</literal>.
1804 But the rule system isn't finished with this step, so it continues
1805 and applies the
<literal>_RETURN
</literal> rule on it, and we get:
1808 UPDATE shoelace_data
1809 SET sl_name = s.sl_name,
1810 sl_avail = s.sl_avail + shoelace_arrive.arr_quant,
1811 sl_color = s.sl_color,
1814 FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1815 shoelace_ok old, shoelace_ok new,
1816 shoelace shoelace, shoelace old,
1817 shoelace new, shoelace_data shoelace_data,
1818 shoelace old, shoelace new,
1819 shoelace_data s, unit u
1820 WHERE s.sl_name = shoelace_arrive.arr_name
1821 AND shoelace_data.sl_name = s.sl_name;
1824 Finally, the rule
<literal>log_shoelace
</literal> gets applied,
1825 producing the extra query tree:
1828 INSERT INTO shoelace_log
1830 s.sl_avail + shoelace_arrive.arr_quant,
1833 FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1834 shoelace_ok old, shoelace_ok new,
1835 shoelace shoelace, shoelace old,
1836 shoelace new, shoelace_data shoelace_data,
1837 shoelace old, shoelace new,
1838 shoelace_data s, unit u,
1839 shoelace_data old, shoelace_data new
1840 shoelace_log shoelace_log
1841 WHERE s.sl_name = shoelace_arrive.arr_name
1842 AND shoelace_data.sl_name = s.sl_name
1843 AND (s.sl_avail + shoelace_arrive.arr_quant)
<> s.sl_avail;
1846 After that the rule system runs out of rules and returns the
1847 generated query trees.
1851 So we end up with two final query trees that are equivalent to the
1852 <acronym>SQL
</acronym> statements:
1855 INSERT INTO shoelace_log
1857 s.sl_avail + shoelace_arrive.arr_quant,
1860 FROM shoelace_arrive shoelace_arrive, shoelace_data shoelace_data,
1862 WHERE s.sl_name = shoelace_arrive.arr_name
1863 AND shoelace_data.sl_name = s.sl_name
1864 AND s.sl_avail + shoelace_arrive.arr_quant
<> s.sl_avail;
1866 UPDATE shoelace_data
1867 SET sl_avail = shoelace_data.sl_avail + shoelace_arrive.arr_quant
1868 FROM shoelace_arrive shoelace_arrive,
1869 shoelace_data shoelace_data,
1871 WHERE s.sl_name = shoelace_arrive.sl_name
1872 AND shoelace_data.sl_name = s.sl_name;
1875 The result is that data coming from one relation inserted into another,
1876 changed into updates on a third, changed into updating
1877 a fourth plus logging that final update in a fifth
1878 gets reduced into two queries.
1882 There is a little detail that's a bit ugly. Looking at the two
1883 queries, it turns out that the
<literal>shoelace_data
</literal>
1884 relation appears twice in the range table where it could
1885 definitely be reduced to one. The planner does not handle it and
1886 so the execution plan for the rule systems output of the
1887 <command>INSERT
</command> will be
1889 <literallayout class=
"monospaced">
1897 -
> Seq Scan on shoelace_arrive
1898 -
> Seq Scan on shoelace_data
1901 while omitting the extra range table entry would result in a
1903 <literallayout class=
"monospaced">
1910 -
> Seq Scan on shoelace_arrive
1913 which produces exactly the same entries in the log table. Thus,
1914 the rule system caused one extra scan on the table
1915 <literal>shoelace_data
</literal> that is absolutely not
1916 necessary. And the same redundant scan is done once more in the
1917 <command>UPDATE
</command>. But it was a really hard job to make
1918 that all possible at all.
1922 Now we make a final demonstration of the
1923 <productname>PostgreSQL
</productname> rule system and its power.
1924 Say you add some shoelaces with extraordinary colors to your
1928 INSERT INTO shoelace VALUES ('sl9',
0, 'pink',
35.0, 'inch',
0.0);
1929 INSERT INTO shoelace VALUES ('sl10',
1000, 'magenta',
40.0, 'inch',
0.0);
1932 We would like to make a view to check which
1933 <literal>shoelace
</literal> entries do not fit any shoe in color.
1934 The view for this is:
1937 CREATE VIEW shoelace_mismatch AS
1938 SELECT * FROM shoelace WHERE NOT EXISTS
1939 (SELECT shoename FROM shoe WHERE slcolor = sl_color);
1945 SELECT * FROM shoelace_mismatch;
1947 sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1948 ---------+----------+----------+--------+---------+-----------
1949 sl9 |
0 | pink |
35 | inch |
88.9
1950 sl10 |
1000 | magenta |
40 | inch |
101.6
1955 Now we want to set it up so that mismatching shoelaces that are
1956 not in stock are deleted from the database.
1957 To make it a little harder for
<productname>PostgreSQL
</productname>,
1958 we don't delete it directly. Instead we create one more view:
1961 CREATE VIEW shoelace_can_delete AS
1962 SELECT * FROM shoelace_mismatch WHERE sl_avail =
0;
1968 DELETE FROM shoelace WHERE EXISTS
1969 (SELECT * FROM shoelace_can_delete
1970 WHERE sl_name = shoelace.sl_name);
1976 SELECT * FROM shoelace;
1978 sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1979 ---------+----------+----------+--------+---------+-----------
1980 sl1 |
5 | black |
80 | cm |
80
1981 sl2 |
6 | black |
100 | cm |
100
1982 sl7 |
6 | brown |
60 | cm |
60
1983 sl4 |
8 | black |
40 | inch |
101.6
1984 sl3 |
10 | black |
35 | inch |
88.9
1985 sl8 |
21 | brown |
40 | inch |
101.6
1986 sl10 |
1000 | magenta |
40 | inch |
101.6
1987 sl5 |
4 | brown |
1 | m |
100
1988 sl6 |
20 | brown |
0.9 | m |
90
1994 A
<command>DELETE
</command> on a view, with a subquery qualification that
1995 in total uses
4 nesting/joined views, where one of them
1996 itself has a subquery qualification containing a view
1997 and where calculated view columns are used,
1999 one single query tree that deletes the requested data
2004 There are probably only a few situations out in the real world
2005 where such a construct is necessary. But it makes you feel
2006 comfortable that it works.
2012 <sect1 id=
"rules-privileges">
2013 <title>Rules and Privileges
</title>
2015 <indexterm zone=
"rules-privileges">
2016 <primary>privilege
</primary>
2017 <secondary sortas=
"Regeln">with rules
</secondary>
2020 <indexterm zone=
"rules-privileges">
2021 <primary>privilege
</primary>
2022 <secondary sortas=
"Sichten">with views
</secondary>
2026 Due to rewriting of queries by the
<productname>PostgreSQL
</productname>
2027 rule system, other tables/views than those used in the original
2028 query get accessed. When update rules are used, this can include write access
2033 Rewrite rules don't have a separate owner. The owner of
2034 a relation (table or view) is automatically the owner of the
2035 rewrite rules that are defined for it.
2036 The
<productname>PostgreSQL
</productname> rule system changes the
2037 behavior of the default access control system. With the exception of
2038 <literal>SELECT
</literal> rules associated with security invoker views
2039 (see
<link linkend=
"sql-createview"><command>CREATE VIEW
</command></link>),
2040 all relations that are used due to rules get checked against the
2041 privileges of the rule owner, not the user invoking the rule.
2042 This means that, except for security invoker views, users only need the
2043 required privileges for the tables/views that are explicitly named in
2048 For example: A user has a list of phone numbers where some of
2049 them are private, the others are of interest for the assistant of the office.
2050 The user can construct the following:
2053 CREATE TABLE phone_data (person text, phone text, private boolean);
2054 CREATE VIEW phone_number AS
2055 SELECT person, CASE WHEN NOT private THEN phone END AS phone
2057 GRANT SELECT ON phone_number TO assistant;
2060 Nobody except that user (and the database superusers) can access the
2061 <literal>phone_data
</literal> table. But because of the
<command>GRANT
</command>,
2062 the assistant can run a
<command>SELECT
</command> on the
2063 <literal>phone_number
</literal> view. The rule system will rewrite the
2064 <command>SELECT
</command> from
<literal>phone_number
</literal> into a
2065 <command>SELECT
</command> from
<literal>phone_data
</literal>.
2066 Since the user is the owner of
2067 <literal>phone_number
</literal> and therefore the owner of the rule, the
2068 read access to
<literal>phone_data
</literal> is now checked against the user's
2069 privileges and the query is permitted. The check for accessing
2070 <literal>phone_number
</literal> is also performed, but this is done
2071 against the invoking user, so nobody but the user and the
2072 assistant can use it.
2076 The privileges are checked rule by rule. So the assistant is for now the
2077 only one who can see the public phone numbers. But the assistant can set up
2078 another view and grant access to that to the public. Then, anyone
2079 can see the
<literal>phone_number
</literal> data through the assistant's view.
2080 What the assistant cannot do is to create a view that directly
2081 accesses
<literal>phone_data
</literal>. (Actually the assistant can, but it will not work since
2082 every access will be denied during the permission checks.)
2083 And as soon as the user notices that the assistant opened
2084 their
<literal>phone_number
</literal> view, the user can revoke the assistant's access. Immediately, any
2085 access to the assistant's view would fail.
2089 One might think that this rule-by-rule checking is a security
2090 hole, but in fact it isn't. But if it did not work this way, the assistant
2091 could set up a table with the same columns as
<literal>phone_number
</literal> and
2092 copy the data to there once per day. Then it's the assistant's own data and
2093 the assistant can grant access to everyone they want. A
2094 <command>GRANT
</command> command means,
<quote>I trust you
</quote>.
2095 If someone you trust does the thing above, it's time to
2096 think it over and then use
<command>REVOKE
</command>.
2100 Note that while views can be used to hide the contents of certain
2101 columns using the technique shown above, they cannot be used to reliably
2102 conceal the data in unseen rows unless the
2103 <literal>security_barrier
</literal> flag has been set. For example,
2104 the following view is insecure:
2106 CREATE VIEW phone_number AS
2107 SELECT person, phone FROM phone_data WHERE phone NOT LIKE '
412%';
2109 This view might seem secure, since the rule system will rewrite any
2110 <command>SELECT
</command> from
<literal>phone_number
</literal> into a
2111 <command>SELECT
</command> from
<literal>phone_data
</literal> and add the
2112 qualification that only entries where
<literal>phone
</literal> does not begin
2113 with
412 are wanted. But if the user can create their own functions,
2114 it is not difficult to convince the planner to execute the user-defined
2115 function prior to the
<function>NOT LIKE
</function> expression.
2118 CREATE FUNCTION tricky(text, text) RETURNS bool AS $$
2120 RAISE NOTICE '% =
> %', $
1, $
2;
2123 $$ LANGUAGE plpgsql COST
0.0000000000000000000001;
2125 SELECT * FROM phone_number WHERE tricky(person, phone);
2127 Every person and phone number in the
<literal>phone_data
</literal> table will be
2128 printed as a
<literal>NOTICE
</literal>, because the planner will choose to
2129 execute the inexpensive
<function>tricky
</function> function before the
2130 more expensive
<function>NOT LIKE
</function>. Even if the user is
2131 prevented from defining new functions, built-in functions can be used in
2132 similar attacks. (For example, most casting functions include their
2133 input values in the error messages they produce.)
2137 Similar considerations apply to update rules. In the examples of
2138 the previous section, the owner of the tables in the example
2139 database could grant the privileges
<literal>SELECT
</literal>,
2140 <literal>INSERT
</literal>,
<literal>UPDATE
</literal>, and
<literal>DELETE
</literal> on
2141 the
<literal>shoelace
</literal> view to someone else, but only
2142 <literal>SELECT
</literal> on
<literal>shoelace_log
</literal>. The rule action to
2143 write log entries will still be executed successfully, and that
2144 other user could see the log entries. But they could not create fake
2145 entries, nor could they manipulate or remove existing ones. In this
2146 case, there is no possibility of subverting the rules by convincing
2147 the planner to alter the order of operations, because the only rule
2148 which references
<literal>shoelace_log
</literal> is an unqualified
2149 <literal>INSERT
</literal>. This might not be true in more complex scenarios.
2153 When it is necessary for a view to provide row-level security, the
2154 <literal>security_barrier
</literal> attribute should be applied to
2155 the view. This prevents maliciously-chosen functions and operators from
2156 being passed values from rows until after the view has done its work. For
2157 example, if the view shown above had been created like this, it would
2160 CREATE VIEW phone_number WITH (security_barrier) AS
2161 SELECT person, phone FROM phone_data WHERE phone NOT LIKE '
412%';
2163 Views created with the
<literal>security_barrier
</literal> may perform
2164 far worse than views created without this option. In general, there is
2165 no way to avoid this: the fastest possible plan must be rejected
2166 if it may compromise security. For this reason, this option is not
2171 The query planner has more flexibility when dealing with functions that
2172 have no side effects. Such functions are referred to as
<literal>LEAKPROOF
</literal>, and
2173 include many simple, commonly used operators, such as many equality
2174 operators. The query planner can safely allow such functions to be evaluated
2175 at any point in the query execution process, since invoking them on rows
2176 invisible to the user will not leak any information about the unseen rows.
2177 Further, functions which do not take arguments or which are not passed any
2178 arguments from the security barrier view do not have to be marked as
2179 <literal>LEAKPROOF
</literal> to be pushed down, as they never receive data
2180 from the view. In contrast, a function that might throw an error depending
2181 on the values received as arguments (such as one that throws an error in the
2182 event of overflow or division by zero) is not leakproof, and could provide
2183 significant information about the unseen rows if applied before the security
2188 For example, an index scan cannot be selected for queries on security
2189 barrier views (or tables with row-level security policies) if an
2190 operator used in the
<literal>WHERE
</literal> clause is associated with the
2191 operator family of the index, but its underlying function is not marked
2192 <literal>LEAKPROOF
</literal>. The
<xref linkend=
"app-psql"/> program's
2193 <command><link linkend=
"app-psql-meta-command-dao">\dAo+
</link></command>
2194 meta-command is useful to list operator families and determine which of
2195 their operators are marked as leakproof.
2199 It is important to understand that even a view created with the
2200 <literal>security_barrier
</literal> option is intended to be secure only
2201 in the limited sense that the contents of the invisible tuples will not be
2202 passed to possibly-insecure functions. The user may well have other means
2203 of making inferences about the unseen data; for example, they can see the
2204 query plan using
<command>EXPLAIN
</command>, or measure the run time of
2205 queries against the view. A malicious attacker might be able to infer
2206 something about the amount of unseen data, or even gain some information
2207 about the data distribution or most common values (since these things may
2208 affect the run time of the plan; or even, since they are also reflected in
2209 the optimizer statistics, the choice of plan). If these types of
"covert
2210 channel" attacks are of concern, it is probably unwise to grant any access
2215 <sect1 id=
"rules-status">
2216 <title>Rules and Command Status
</title>
2219 The
<productname>PostgreSQL
</productname> server returns a command
2220 status string, such as
<literal>INSERT
149592 1</literal>, for each
2221 command it receives. This is simple enough when there are no rules
2222 involved, but what happens when the query is rewritten by rules?
2226 Rules affect the command status as follows:
2231 If there is no unconditional
<literal>INSTEAD
</literal> rule for the query, then
2232 the originally given query will be executed, and its command
2233 status will be returned as usual. (But note that if there were
2234 any conditional
<literal>INSTEAD
</literal> rules, the negation of their qualifications
2235 will have been added to the original query. This might reduce the
2236 number of rows it processes, and if so the reported status will
2243 If there is any unconditional
<literal>INSTEAD
</literal> rule for the query, then
2244 the original query will not be executed at all. In this case,
2245 the server will return the command status for the last query
2246 that was inserted by an
<literal>INSTEAD
</literal> rule (conditional or
2247 unconditional) and is of the same command type
2248 (
<command>INSERT
</command>,
<command>UPDATE
</command>, or
2249 <command>DELETE
</command>) as the original query. If no query
2250 meeting those requirements is added by any rule, then the
2251 returned command status shows the original query type and
2252 zeroes for the row-count and OID fields.
2259 The programmer can ensure that any desired
<literal>INSTEAD
</literal> rule is the one
2260 that sets the command status in the second case, by giving it the
2261 alphabetically last rule name among the active rules, so that it
2266 <sect1 id=
"rules-triggers">
2267 <title>Rules Versus Triggers
</title>
2269 <indexterm zone=
"rules-triggers">
2270 <primary>rule
</primary>
2271 <secondary sortas=
"Trigger">compared with triggers
</secondary>
2274 <indexterm zone=
"rules-triggers">
2275 <primary>trigger
</primary>
2276 <secondary sortas=
"Regeln">compared with rules
</secondary>
2280 Many things that can be done using triggers can also be
2281 implemented using the
<productname>PostgreSQL
</productname>
2282 rule system. One of the things that cannot be implemented by
2283 rules are some kinds of constraints, especially foreign keys. It is possible
2284 to place a qualified rule that rewrites a command to
<literal>NOTHING
</literal>
2285 if the value of a column does not appear in another table.
2286 But then the data is silently thrown away and that's
2287 not a good idea. If checks for valid values are required,
2288 and in the case of an invalid value an error message should
2289 be generated, it must be done by a trigger.
2293 In this chapter, we focused on using rules to update views. All of
2294 the update rule examples in this chapter can also be implemented
2295 using
<literal>INSTEAD OF
</literal> triggers on the views. Writing such
2296 triggers is often easier than writing rules, particularly if complex
2297 logic is required to perform the update.
2301 For the things that can be implemented by both, which is best
2302 depends on the usage of the database.
2303 A trigger is fired once for each affected row. A rule modifies
2304 the query or generates an additional query. So if many
2305 rows are affected in one statement, a rule issuing one extra
2306 command is likely to be faster than a trigger that is
2307 called for every single row and must re-determine what to do
2308 many times. However, the trigger approach is conceptually far
2309 simpler than the rule approach, and is easier for novices to get right.
2313 Here we show an example of how the choice of rules versus triggers
2314 plays out in one situation. There are two tables:
2317 CREATE TABLE computer (
2318 hostname text, -- indexed
2319 manufacturer text -- indexed
2322 CREATE TABLE software (
2323 software text, -- indexed
2324 hostname text -- indexed
2328 Both tables have many thousands of rows and the indexes on
2329 <structfield>hostname
</structfield> are unique. The rule or trigger should
2330 implement a constraint that deletes rows from
<literal>software
</literal>
2331 that reference a deleted computer. The trigger would use this command:
2334 DELETE FROM software WHERE hostname = $
1;
2337 Since the trigger is called for each individual row deleted from
2338 <literal>computer
</literal>, it can prepare and save the plan for this
2339 command and pass the
<structfield>hostname
</structfield> value in the
2340 parameter. The rule would be written as:
2343 CREATE RULE computer_del AS ON DELETE TO computer
2344 DO DELETE FROM software WHERE hostname = OLD.hostname;
2349 Now we look at different types of deletes. In the case of a:
2352 DELETE FROM computer WHERE hostname = 'mypc.local.net';
2355 the table
<literal>computer
</literal> is scanned by index (fast), and the
2356 command issued by the trigger would also use an index scan (also fast).
2357 The extra command from the rule would be:
2360 DELETE FROM software WHERE computer.hostname = 'mypc.local.net'
2361 AND software.hostname = computer.hostname;
2364 Since there are appropriate indexes set up, the planner
2365 will create a plan of
2367 <literallayout class=
"monospaced">
2369 -
> Index Scan using comp_hostidx on computer
2370 -
> Index Scan using soft_hostidx on software
2373 So there would be not that much difference in speed between
2374 the trigger and the rule implementation.
2378 With the next delete we want to get rid of all the
2000 computers
2379 where the
<structfield>hostname
</structfield> starts with
2380 <literal>old
</literal>. There are two possible commands to do that. One
2384 DELETE FROM computer WHERE hostname
>= 'old'
2385 AND hostname
< 'ole'
2388 The command added by the rule will be:
2391 DELETE FROM software WHERE computer.hostname
>= 'old' AND computer.hostname
< 'ole'
2392 AND software.hostname = computer.hostname;
2397 <literallayout class=
"monospaced">
2399 -
> Seq Scan on software
2401 -
> Index Scan using comp_hostidx on computer
2404 The other possible command is:
2407 DELETE FROM computer WHERE hostname ~ '^old';
2410 which results in the following executing plan for the command
2413 <literallayout class=
"monospaced">
2415 -
> Index Scan using comp_hostidx on computer
2416 -
> Index Scan using soft_hostidx on software
2419 This shows, that the planner does not realize that the
2420 qualification for
<structfield>hostname
</structfield> in
2421 <literal>computer
</literal> could also be used for an index scan on
2422 <literal>software
</literal> when there are multiple qualification
2423 expressions combined with
<literal>AND
</literal>, which is what it does
2424 in the regular-expression version of the command. The trigger will
2425 get invoked once for each of the
2000 old computers that have to be
2426 deleted, and that will result in one index scan over
2427 <literal>computer
</literal> and
2000 index scans over
2428 <literal>software
</literal>. The rule implementation will do it with two
2429 commands that use indexes. And it depends on the overall size of
2430 the table
<literal>software
</literal> whether the rule will still be faster in the
2431 sequential scan situation.
2000 command executions from the trigger over the SPI
2432 manager take some time, even if all the index blocks will soon be in the cache.
2436 The last command we look at is:
2439 DELETE FROM computer WHERE manufacturer = 'bim';
2442 Again this could result in many rows to be deleted from
2443 <literal>computer
</literal>. So the trigger will again run many commands
2444 through the executor. The command generated by the rule will be:
2447 DELETE FROM software WHERE computer.manufacturer = 'bim'
2448 AND software.hostname = computer.hostname;
2451 The plan for that command will again be the nested loop over two
2452 index scans, only using a different index on
<literal>computer
</literal>:
2456 -
> Index Scan using comp_manufidx on computer
2457 -
> Index Scan using soft_hostidx on software
2460 In any of these cases, the extra commands from the rule system
2461 will be more or less independent from the number of affected rows
2466 The summary is, rules will only be significantly slower than
2467 triggers if their actions result in large and badly qualified
2468 joins, a situation where the planner fails.