Fix xslt_process() to ensure that it inserts a NULL terminator after the
[PostgreSQL.git] / doc / src / sgml / query.sgml
bloba3b35c8dedfb57589117be76d41004fe43f65a77
1 <!-- $PostgreSQL$ -->
3 <chapter id="tutorial-sql">
4 <title>The <acronym>SQL</acronym> Language</title>
6 <sect1 id="tutorial-sql-intro">
7 <title>Introduction</title>
9 <para>
10 This chapter provides an overview of how to use
11 <acronym>SQL</acronym> to perform simple operations. This
12 tutorial is only intended to give you an introduction and is in no
13 way a complete tutorial on <acronym>SQL</acronym>. Numerous books
14 have been written on <acronym>SQL</acronym>, including <xref
15 linkend="MELT93"> and <xref linkend="DATE97">.
16 You should be aware that some <productname>PostgreSQL</productname>
17 language features are extensions to the standard.
18 </para>
20 <para>
21 In the examples that follow, we assume that you have created a
22 database named <literal>mydb</literal>, as described in the previous
23 chapter, and have been able to start <application>psql</application>.
24 </para>
26 <para>
27 Examples in this manual can also be found in the
28 <productname>PostgreSQL</productname> source distribution
29 in the directory <filename>src/tutorial/</filename>. To use those
30 files, first change to that directory and run <application>make</>:
32 <screen>
33 <prompt>$</prompt> <userinput>cd <replaceable>....</replaceable>/src/tutorial</userinput>
34 <prompt>$</prompt> <userinput>make</userinput>
35 </screen>
37 This creates the scripts and compiles the C files containing user-defined
38 functions and types. (If you installed a pre-packaged version of
39 <productname>PostgreSQL</productname> rather than building from source,
40 look for a directory named <filename>tutorial</> within the
41 <productname>PostgreSQL</productname> distribution. The <quote>make</>
42 part should already have been done for you.)
43 Then, to start the tutorial, do the following:
45 <screen>
46 <prompt>$</prompt> <userinput>cd <replaceable>....</replaceable>/tutorial</userinput>
47 <prompt>$</prompt> <userinput>psql -s mydb</userinput>
48 <computeroutput>
49 ...
50 </computeroutput>
52 <prompt>mydb=&gt;</prompt> <userinput>\i basics.sql</userinput>
53 </screen>
55 The <literal>\i</literal> command reads in commands from the
56 specified file. <command>psql</command>'s <literal>-s</> option puts you in
57 single step mode which pauses before sending each statement to the
58 server. The commands used in this section are in the file
59 <filename>basics.sql</filename>.
60 </para>
61 </sect1>
64 <sect1 id="tutorial-concepts">
65 <title>Concepts</title>
67 <para>
68 <indexterm><primary>relational database</primary></indexterm>
69 <indexterm><primary>hierarchical database</primary></indexterm>
70 <indexterm><primary>object-oriented database</primary></indexterm>
71 <indexterm><primary>relation</primary></indexterm>
72 <indexterm><primary>table</primary></indexterm>
74 <productname>PostgreSQL</productname> is a <firstterm>relational
75 database management system</firstterm> (<acronym>RDBMS</acronym>).
76 That means it is a system for managing data stored in
77 <firstterm>relations</firstterm>. Relation is essentially a
78 mathematical term for <firstterm>table</firstterm>. The notion of
79 storing data in tables is so commonplace today that it might
80 seem inherently obvious, but there are a number of other ways of
81 organizing databases. Files and directories on Unix-like
82 operating systems form an example of a hierarchical database. A
83 more modern development is the object-oriented database.
84 </para>
86 <para>
87 <indexterm><primary>row</primary></indexterm>
88 <indexterm><primary>column</primary></indexterm>
90 Each table is a named collection of <firstterm>rows</firstterm>.
91 Each row of a given table has the same set of named
92 <firstterm>columns</firstterm>,
93 and each column is of a specific data type. Whereas columns have
94 a fixed order in each row, it is important to remember that SQL
95 does not guarantee the order of the rows within the table in any
96 way (although they can be explicitly sorted for display).
97 </para>
99 <para>
100 <indexterm><primary>database cluster</primary></indexterm>
101 <indexterm><primary>cluster</primary><secondary>of databases</secondary><see>database cluster</see></indexterm>
103 Tables are grouped into databases, and a collection of databases
104 managed by a single <productname>PostgreSQL</productname> server
105 instance constitutes a database <firstterm>cluster</firstterm>.
106 </para>
107 </sect1>
110 <sect1 id="tutorial-table">
111 <title>Creating a New Table</title>
113 <indexterm zone="tutorial-table">
114 <primary>CREATE TABLE</primary>
115 </indexterm>
117 <para>
118 You can create a new table by specifying the table
119 name, along with all column names and their types:
121 <programlisting>
122 CREATE TABLE weather (
123 city varchar(80),
124 temp_lo int, -- low temperature
125 temp_hi int, -- high temperature
126 prcp real, -- precipitation
127 date date
129 </programlisting>
131 You can enter this into <command>psql</command> with the line
132 breaks. <command>psql</command> will recognize that the command
133 is not terminated until the semicolon.
134 </para>
136 <para>
137 White space (i.e., spaces, tabs, and newlines) can be used freely
138 in SQL commands. That means you can type the command aligned
139 differently than above, or even all on one line. Two dashes
140 (<quote><literal>--</literal></quote>) introduce comments.
141 Whatever follows them is ignored up to the end of the line. SQL
142 is case insensitive about key words and identifiers, except
143 when identifiers are double-quoted to preserve the case (not done
144 above).
145 </para>
147 <para>
148 <type>varchar(80)</type> specifies a data type that can store
149 arbitrary character strings up to 80 characters in length.
150 <type>int</type> is the normal integer type. <type>real</type> is
151 a type for storing single precision floating-point numbers.
152 <type>date</type> should be self-explanatory. (Yes, the column of
153 type <type>date</type> is also named <structfield>date</structfield>.
154 This might be convenient or confusing &mdash; you choose.)
155 </para>
157 <para>
158 <productname>PostgreSQL</productname> supports the standard
159 <acronym>SQL</acronym> types <type>int</type>,
160 <type>smallint</type>, <type>real</type>, <type>double
161 precision</type>, <type>char(<replaceable>N</>)</type>,
162 <type>varchar(<replaceable>N</>)</type>, <type>date</type>,
163 <type>time</type>, <type>timestamp</type>, and
164 <type>interval</type>, as well as other types of general utility
165 and a rich set of geometric types.
166 <productname>PostgreSQL</productname> can be customized with an
167 arbitrary number of user-defined data types. Consequently, type
168 names are not key words in the syntax, except where required to
169 support special cases in the <acronym>SQL</acronym> standard.
170 </para>
172 <para>
173 The second example will store cities and their associated
174 geographical location:
175 <programlisting>
176 CREATE TABLE cities (
177 name varchar(80),
178 location point
180 </programlisting>
181 The <type>point</type> type is an example of a
182 <productname>PostgreSQL</productname>-specific data type.
183 </para>
185 <para>
186 <indexterm>
187 <primary>DROP TABLE</primary>
188 </indexterm>
190 Finally, it should be mentioned that if you don't need a table any
191 longer or want to recreate it differently you can remove it using
192 the following command:
193 <synopsis>
194 DROP TABLE <replaceable>tablename</replaceable>;
195 </synopsis>
196 </para>
197 </sect1>
200 <sect1 id="tutorial-populate">
201 <title>Populating a Table With Rows</title>
203 <indexterm zone="tutorial-populate">
204 <primary>INSERT</primary>
205 </indexterm>
207 <para>
208 The <command>INSERT</command> statement is used to populate a table with
209 rows:
211 <programlisting>
212 INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');
213 </programlisting>
215 Note that all data types use rather obvious input formats.
216 Constants that are not simple numeric values usually must be
217 surrounded by single quotes (<literal>'</>), as in the example.
219 <type>date</type> type is actually quite flexible in what it
220 accepts, but for this tutorial we will stick to the unambiguous
221 format shown here.
222 </para>
224 <para>
225 The <type>point</type> type requires a coordinate pair as input,
226 as shown here:
227 <programlisting>
228 INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');
229 </programlisting>
230 </para>
232 <para>
233 The syntax used so far requires you to remember the order of the
234 columns. An alternative syntax allows you to list the columns
235 explicitly:
236 <programlisting>
237 INSERT INTO weather (city, temp_lo, temp_hi, prcp, date)
238 VALUES ('San Francisco', 43, 57, 0.0, '1994-11-29');
239 </programlisting>
240 You can list the columns in a different order if you wish or
241 even omit some columns, e.g., if the precipitation is unknown:
242 <programlisting>
243 INSERT INTO weather (date, city, temp_hi, temp_lo)
244 VALUES ('1994-11-29', 'Hayward', 54, 37);
245 </programlisting>
246 Many developers consider explicitly listing the columns better
247 style than relying on the order implicitly.
248 </para>
250 <para>
251 Please enter all the commands shown above so you have some data to
252 work with in the following sections.
253 </para>
255 <para>
256 <indexterm>
257 <primary>COPY</primary>
258 </indexterm>
260 You could also have used <command>COPY</command> to load large
261 amounts of data from flat-text files. This is usually faster
262 because the <command>COPY</command> command is optimized for this
263 application while allowing less flexibility than
264 <command>INSERT</command>. An example would be:
266 <programlisting>
267 COPY weather FROM '/home/user/weather.txt';
268 </programlisting>
270 where the file name for the source file must be available to the
271 backend server machine, not the client, since the backend server
272 reads the file directly. You can read more about the
273 <command>COPY</command> command in <xref linkend="sql-copy"
274 endterm="sql-copy-title">.
275 </para>
276 </sect1>
279 <sect1 id="tutorial-select">
280 <title>Querying a Table</title>
282 <para>
283 <indexterm><primary>query</primary></indexterm>
284 <indexterm><primary>SELECT</primary></indexterm>
286 To retrieve data from a table, the table is
287 <firstterm>queried</firstterm>. An <acronym>SQL</acronym>
288 <command>SELECT</command> statement is used to do this. The
289 statement is divided into a select list (the part that lists the
290 columns to be returned), a table list (the part that lists the
291 tables from which to retrieve the data), and an optional
292 qualification (the part that specifies any restrictions). For
293 example, to retrieve all the rows of table
294 <structname>weather</structname>, type:
295 <programlisting>
296 SELECT * FROM weather;
297 </programlisting>
298 Here <literal>*</literal> is a shorthand for <quote>all columns</quote>.
299 <footnote>
300 <para>
301 While <literal>SELECT *</literal> is useful for off-the-cuff
302 queries, it is widely considered bad style in production code,
303 since adding a column to the table would change the results.
304 </para>
305 </footnote>
306 So the same result would be had with:
307 <programlisting>
308 SELECT city, temp_lo, temp_hi, prcp, date FROM weather;
309 </programlisting>
311 The output should be:
313 <screen>
314 city | temp_lo | temp_hi | prcp | date
315 ---------------+---------+---------+------+------------
316 San Francisco | 46 | 50 | 0.25 | 1994-11-27
317 San Francisco | 43 | 57 | 0 | 1994-11-29
318 Hayward | 37 | 54 | | 1994-11-29
319 (3 rows)
320 </screen>
321 </para>
323 <para>
324 You can write expressions, not just simple column references, in the
325 select list. For example, you can do:
326 <programlisting>
327 SELECT city, (temp_hi+temp_lo)/2 AS temp_avg, date FROM weather;
328 </programlisting>
329 This should give:
330 <screen>
331 city | temp_avg | date
332 ---------------+----------+------------
333 San Francisco | 48 | 1994-11-27
334 San Francisco | 50 | 1994-11-29
335 Hayward | 45 | 1994-11-29
336 (3 rows)
337 </screen>
338 Notice how the <literal>AS</literal> clause is used to relabel the
339 output column. (The <literal>AS</literal> clause is optional.)
340 </para>
342 <para>
343 A query can be <quote>qualified</> by adding a <literal>WHERE</>
344 clause that specifies which rows are wanted. The <literal>WHERE</>
345 clause contains a Boolean (truth value) expression, and only rows for
346 which the Boolean expression is true are returned. The usual
347 Boolean operators (<literal>AND</literal>,
348 <literal>OR</literal>, and <literal>NOT</literal>) are allowed in
349 the qualification. For example, the following
350 retrieves the weather of San Francisco on rainy days:
352 <programlisting>
353 SELECT * FROM weather
354 WHERE city = 'San Francisco' AND prcp &gt; 0.0;
355 </programlisting>
356 Result:
357 <screen>
358 city | temp_lo | temp_hi | prcp | date
359 ---------------+---------+---------+------+------------
360 San Francisco | 46 | 50 | 0.25 | 1994-11-27
361 (1 row)
362 </screen>
363 </para>
365 <para>
366 <indexterm><primary>ORDER BY</primary></indexterm>
368 You can request that the results of a query
369 be returned in sorted order:
371 <programlisting>
372 SELECT * FROM weather
373 ORDER BY city;
374 </programlisting>
376 <screen>
377 city | temp_lo | temp_hi | prcp | date
378 ---------------+---------+---------+------+------------
379 Hayward | 37 | 54 | | 1994-11-29
380 San Francisco | 43 | 57 | 0 | 1994-11-29
381 San Francisco | 46 | 50 | 0.25 | 1994-11-27
382 </screen>
384 In this example, the sort order isn't fully specified, and so you
385 might get the San Francisco rows in either order. But you'd always
386 get the results shown above if you do:
388 <programlisting>
389 SELECT * FROM weather
390 ORDER BY city, temp_lo;
391 </programlisting>
392 </para>
394 <para>
395 <indexterm><primary>DISTINCT</primary></indexterm>
396 <indexterm><primary>duplicate</primary></indexterm>
398 You can request that duplicate rows be removed from the result of
399 a query:
401 <programlisting>
402 SELECT DISTINCT city
403 FROM weather;
404 </programlisting>
406 <screen>
407 city
408 ---------------
409 Hayward
410 San Francisco
411 (2 rows)
412 </screen>
414 Here again, the result row ordering might vary.
415 You can ensure consistent results by using <literal>DISTINCT</literal> and
416 <literal>ORDER BY</literal> together:
417 <footnote>
418 <para>
419 In some database systems, including older versions of
420 <productname>PostgreSQL</productname>, the implementation of
421 <literal>DISTINCT</literal> automatically orders the rows and
422 so <literal>ORDER BY</literal> is unnecessary. But this is not
423 required by the SQL standard, and current
424 <productname>PostgreSQL</productname> does not guarantee that
425 <literal>DISTINCT</literal> causes the rows to be ordered.
426 </para>
427 </footnote>
429 <programlisting>
430 SELECT DISTINCT city
431 FROM weather
432 ORDER BY city;
433 </programlisting>
434 </para>
435 </sect1>
438 <sect1 id="tutorial-join">
439 <title>Joins Between Tables</title>
441 <indexterm zone="tutorial-join">
442 <primary>join</primary>
443 </indexterm>
445 <para>
446 Thus far, our queries have only accessed one table at a time.
447 Queries can access multiple tables at once, or access the same
448 table in such a way that multiple rows of the table are being
449 processed at the same time. A query that accesses multiple rows
450 of the same or different tables at one time is called a
451 <firstterm>join</firstterm> query. As an example, say you wish to
452 list all the weather records together with the location of the
453 associated city. To do that, we need to compare the <structfield>city</>
454 column of each row of the <structname>weather</> table with the
455 <structfield>name</> column of all rows in the <structname>cities</>
456 table, and select the pairs of rows where these values match.
457 <note>
458 <para>
459 This is only a conceptual model. The join is usually performed
460 in a more efficient manner than actually comparing each possible
461 pair of rows, but this is invisible to the user.
462 </para>
463 </note>
464 This would be accomplished by the following query:
466 <programlisting>
467 SELECT *
468 FROM weather, cities
469 WHERE city = name;
470 </programlisting>
472 <screen>
473 city | temp_lo | temp_hi | prcp | date | name | location
474 ---------------+---------+---------+------+------------+---------------+-----------
475 San Francisco | 46 | 50 | 0.25 | 1994-11-27 | San Francisco | (-194,53)
476 San Francisco | 43 | 57 | 0 | 1994-11-29 | San Francisco | (-194,53)
477 (2 rows)
478 </screen>
480 </para>
482 <para>
483 Observe two things about the result set:
484 <itemizedlist>
485 <listitem>
486 <para>
487 There is no result row for the city of Hayward. This is
488 because there is no matching entry in the
489 <structname>cities</structname> table for Hayward, so the join
490 ignores the unmatched rows in the <structname>weather</> table. We will see
491 shortly how this can be fixed.
492 </para>
493 </listitem>
495 <listitem>
496 <para>
497 There are two columns containing the city name. This is
498 correct because the lists of columns from the
499 <structname>weather</structname> and
500 <structname>cities</structname> tables are concatenated. In
501 practice this is undesirable, though, so you will probably want
502 to list the output columns explicitly rather than using
503 <literal>*</literal>:
504 <programlisting>
505 SELECT city, temp_lo, temp_hi, prcp, date, location
506 FROM weather, cities
507 WHERE city = name;
508 </programlisting>
509 </para>
510 </listitem>
511 </itemizedlist>
512 </para>
514 <formalpara>
515 <title>Exercise:</title>
517 <para>
518 Attempt to determine the semantics of this query when the
519 <literal>WHERE</literal> clause is omitted.
520 </para>
521 </formalpara>
523 <para>
524 Since the columns all had different names, the parser
525 automatically found which table they belong to. If there
526 were duplicate column names in the two tables you'd need to
527 <firstterm>qualify</> the column names to show which one you
528 meant, as in:
530 <programlisting>
531 SELECT weather.city, weather.temp_lo, weather.temp_hi,
532 weather.prcp, weather.date, cities.location
533 FROM weather, cities
534 WHERE cities.name = weather.city;
535 </programlisting>
537 It is widely considered good style to qualify all column names
538 in a join query, so that the query won't fail if a duplicate
539 column name is later added to one of the tables.
540 </para>
542 <para>
543 Join queries of the kind seen thus far can also be written in this
544 alternative form:
546 <programlisting>
547 SELECT *
548 FROM weather INNER JOIN cities ON (weather.city = cities.name);
549 </programlisting>
551 This syntax is not as commonly used as the one above, but we show
552 it here to help you understand the following topics.
553 </para>
555 <para>
556 <indexterm><primary>join</primary><secondary>outer</secondary></indexterm>
558 Now we will figure out how we can get the Hayward records back in.
559 What we want the query to do is to scan the
560 <structname>weather</structname> table and for each row to find the
561 matching <structname>cities</structname> row(s). If no matching row is
562 found we want some <quote>empty values</quote> to be substituted
563 for the <structname>cities</structname> table's columns. This kind
564 of query is called an <firstterm>outer join</firstterm>. (The
565 joins we have seen so far are inner joins.) The command looks
566 like this:
568 <programlisting>
569 SELECT *
570 FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name);
572 city | temp_lo | temp_hi | prcp | date | name | location
573 ---------------+---------+---------+------+------------+---------------+-----------
574 Hayward | 37 | 54 | | 1994-11-29 | |
575 San Francisco | 46 | 50 | 0.25 | 1994-11-27 | San Francisco | (-194,53)
576 San Francisco | 43 | 57 | 0 | 1994-11-29 | San Francisco | (-194,53)
577 (3 rows)
578 </programlisting>
580 This query is called a <firstterm>left outer
581 join</firstterm> because the table mentioned on the left of the
582 join operator will have each of its rows in the output at least
583 once, whereas the table on the right will only have those rows
584 output that match some row of the left table. When outputting a
585 left-table row for which there is no right-table match, empty (null)
586 values are substituted for the right-table columns.
587 </para>
589 <formalpara>
590 <title>Exercise:</title>
592 <para>
593 There are also right outer joins and full outer joins. Try to
594 find out what those do.
595 </para>
596 </formalpara>
598 <para>
599 <indexterm><primary>join</primary><secondary>self</secondary></indexterm>
600 <indexterm><primary>alias</primary><secondary>for table name in query</secondary></indexterm>
602 We can also join a table against itself. This is called a
603 <firstterm>self join</firstterm>. As an example, suppose we wish
604 to find all the weather records that are in the temperature range
605 of other weather records. So we need to compare the
606 <structfield>temp_lo</> and <structfield>temp_hi</> columns of
607 each <structname>weather</structname> row to the
608 <structfield>temp_lo</structfield> and
609 <structfield>temp_hi</structfield> columns of all other
610 <structname>weather</structname> rows. We can do this with the
611 following query:
613 <programlisting>
614 SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
615 W2.city, W2.temp_lo AS low, W2.temp_hi AS high
616 FROM weather W1, weather W2
617 WHERE W1.temp_lo &lt; W2.temp_lo
618 AND W1.temp_hi &gt; W2.temp_hi;
620 city | low | high | city | low | high
621 ---------------+-----+------+---------------+-----+------
622 San Francisco | 43 | 57 | San Francisco | 46 | 50
623 Hayward | 37 | 54 | San Francisco | 46 | 50
624 (2 rows)
625 </programlisting>
627 Here we have relabeled the weather table as <literal>W1</> and
628 <literal>W2</> to be able to distinguish the left and right side
629 of the join. You can also use these kinds of aliases in other
630 queries to save some typing, e.g.:
631 <programlisting>
632 SELECT *
633 FROM weather w, cities c
634 WHERE w.city = c.name;
635 </programlisting>
636 You will encounter this style of abbreviating quite frequently.
637 </para>
638 </sect1>
641 <sect1 id="tutorial-agg">
642 <title>Aggregate Functions</title>
644 <indexterm zone="tutorial-agg">
645 <primary>aggregate function</primary>
646 </indexterm>
648 <para>
649 <indexterm><primary>average</primary></indexterm>
650 <indexterm><primary>count</primary></indexterm>
651 <indexterm><primary>max</primary></indexterm>
652 <indexterm><primary>min</primary></indexterm>
653 <indexterm><primary>sum</primary></indexterm>
655 Like most other relational database products,
656 <productname>PostgreSQL</productname> supports
657 <firstterm>aggregate functions</>.
658 An aggregate function computes a single result from multiple input rows.
659 For example, there are aggregates to compute the
660 <function>count</function>, <function>sum</function>,
661 <function>avg</function> (average), <function>max</function> (maximum) and
662 <function>min</function> (minimum) over a set of rows.
663 </para>
665 <para>
666 As an example, we can find the highest low-temperature reading anywhere
667 with:
669 <programlisting>
670 SELECT max(temp_lo) FROM weather;
671 </programlisting>
673 <screen>
675 -----
677 (1 row)
678 </screen>
679 </para>
681 <para>
682 <indexterm><primary>subquery</primary></indexterm>
684 If we wanted to know what city (or cities) that reading occurred in,
685 we might try:
687 <programlisting>
688 SELECT city FROM weather WHERE temp_lo = max(temp_lo); <lineannotation>WRONG</lineannotation>
689 </programlisting>
691 but this will not work since the aggregate
692 <function>max</function> cannot be used in the
693 <literal>WHERE</literal> clause. (This restriction exists because
694 the <literal>WHERE</literal> clause determines which rows will be
695 included in the aggregate calculation; so obviously it has to be evaluated
696 before aggregate functions are computed.)
697 However, as is often the case
698 the query can be restated to accomplish the desired result, here
699 by using a <firstterm>subquery</firstterm>:
701 <programlisting>
702 SELECT city FROM weather
703 WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
704 </programlisting>
706 <screen>
707 city
708 ---------------
709 San Francisco
710 (1 row)
711 </screen>
713 This is OK because the subquery is an independent computation
714 that computes its own aggregate separately from what is happening
715 in the outer query.
716 </para>
718 <para>
719 <indexterm><primary>GROUP BY</primary></indexterm>
720 <indexterm><primary>HAVING</primary></indexterm>
722 Aggregates are also very useful in combination with <literal>GROUP
723 BY</literal> clauses. For example, we can get the maximum low
724 temperature observed in each city with:
726 <programlisting>
727 SELECT city, max(temp_lo)
728 FROM weather
729 GROUP BY city;
730 </programlisting>
732 <screen>
733 city | max
734 ---------------+-----
735 Hayward | 37
736 San Francisco | 46
737 (2 rows)
738 </screen>
740 which gives us one output row per city. Each aggregate result is
741 computed over the table rows matching that city.
742 We can filter these grouped
743 rows using <literal>HAVING</literal>:
745 <programlisting>
746 SELECT city, max(temp_lo)
747 FROM weather
748 GROUP BY city
749 HAVING max(temp_lo) &lt; 40;
750 </programlisting>
752 <screen>
753 city | max
754 ---------+-----
755 Hayward | 37
756 (1 row)
757 </screen>
759 which gives us the same results for only the cities that have all
760 <structfield>temp_lo</> values below 40. Finally, if we only care about
761 cities whose
762 names begin with <quote><literal>S</literal></quote>, we might do:
764 <programlisting>
765 SELECT city, max(temp_lo)
766 FROM weather
767 WHERE city LIKE 'S%'<co id="co.tutorial-agg-like">
768 GROUP BY city
769 HAVING max(temp_lo) &lt; 40;
770 </programlisting>
771 <calloutlist>
772 <callout arearefs="co.tutorial-agg-like">
773 <para>
774 The <literal>LIKE</literal> operator does pattern matching and
775 is explained in <xref linkend="functions-matching">.
776 </para>
777 </callout>
778 </calloutlist>
779 </para>
781 <para>
782 It is important to understand the interaction between aggregates and
783 <acronym>SQL</acronym>'s <literal>WHERE</literal> and <literal>HAVING</literal> clauses.
784 The fundamental difference between <literal>WHERE</literal> and
785 <literal>HAVING</literal> is this: <literal>WHERE</literal> selects
786 input rows before groups and aggregates are computed (thus, it controls
787 which rows go into the aggregate computation), whereas
788 <literal>HAVING</literal> selects group rows after groups and
789 aggregates are computed. Thus, the
790 <literal>WHERE</literal> clause must not contain aggregate functions;
791 it makes no sense to try to use an aggregate to determine which rows
792 will be inputs to the aggregates. On the other hand, the
793 <literal>HAVING</literal> clause always contains aggregate functions.
794 (Strictly speaking, you are allowed to write a <literal>HAVING</literal>
795 clause that doesn't use aggregates, but it's seldom useful. The same
796 condition could be used more efficiently at the <literal>WHERE</literal>
797 stage.)
798 </para>
800 <para>
801 In the previous example, we can apply the city name restriction in
802 <literal>WHERE</literal>, since it needs no aggregate. This is
803 more efficient than adding the restriction to <literal>HAVING</literal>,
804 because we avoid doing the grouping and aggregate calculations
805 for all rows that fail the <literal>WHERE</literal> check.
806 </para>
807 </sect1>
810 <sect1 id="tutorial-update">
811 <title>Updates</title>
813 <indexterm zone="tutorial-update">
814 <primary>UPDATE</primary>
815 </indexterm>
817 <para>
818 You can update existing rows using the
819 <command>UPDATE</command> command.
820 Suppose you discover the temperature readings are
821 all off by 2 degrees after November 28. You can correct the
822 data as follows:
824 <programlisting>
825 UPDATE weather
826 SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2
827 WHERE date &gt; '1994-11-28';
828 </programlisting>
829 </para>
831 <para>
832 Look at the new state of the data:
833 <programlisting>
834 SELECT * FROM weather;
836 city | temp_lo | temp_hi | prcp | date
837 ---------------+---------+---------+------+------------
838 San Francisco | 46 | 50 | 0.25 | 1994-11-27
839 San Francisco | 41 | 55 | 0 | 1994-11-29
840 Hayward | 35 | 52 | | 1994-11-29
841 (3 rows)
842 </programlisting>
843 </para>
844 </sect1>
846 <sect1 id="tutorial-delete">
847 <title>Deletions</title>
849 <indexterm zone="tutorial-delete">
850 <primary>DELETE</primary>
851 </indexterm>
853 <para>
854 Rows can be removed from a table using the <command>DELETE</command>
855 command.
856 Suppose you are no longer interested in the weather of Hayward.
857 Then you can do the following to delete those rows from the table:
858 <programlisting>
859 DELETE FROM weather WHERE city = 'Hayward';
860 </programlisting>
862 All weather records belonging to Hayward are removed.
864 <programlisting>
865 SELECT * FROM weather;
866 </programlisting>
868 <screen>
869 city | temp_lo | temp_hi | prcp | date
870 ---------------+---------+---------+------+------------
871 San Francisco | 46 | 50 | 0.25 | 1994-11-27
872 San Francisco | 41 | 55 | 0 | 1994-11-29
873 (2 rows)
874 </screen>
875 </para>
877 <para>
878 One should be wary of statements of the form
879 <synopsis>
880 DELETE FROM <replaceable>tablename</replaceable>;
881 </synopsis>
883 Without a qualification, <command>DELETE</command> will
884 remove <emphasis>all</> rows from the given table, leaving it
885 empty. The system will not request confirmation before
886 doing this!
887 </para>
888 </sect1>
890 </chapter>