<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!-- This collection of hypertext pages is Copyright 1995-7 by Steve Summit. -->
<!-- This material may be freely redistributed and used -->
<!-- but may not be republished or sold without permission. -->
<html>
<head>
<link rev="owner" href="mailto:scs@eskimo.com">
<link rev="made" href="mailto:scs@eskimo.com">
<title>7.3 Order of Evaluation</title>
<link href="sx7b.html" rev=precedes>
<link href="sx8.html" rel=precedes>
<link href="sx7.html" rev=subdocument>
</head>
<body>
<H2>7.3 Order of Evaluation</H2>
<p>[This section corresponds to K&amp;R Sec. 2.12]
</p><p>When you start using
the <TT>++</TT> and <TT>--</TT> operators
in larger expressions,
you end up with expressions which do several things at once,
i.e., they modify several different variables at more or less the same time.
When you write such an expression,
you must be careful not to have the expression
``pull the rug out from under itself''
by assigning two different values to the same variable,
or by assigning a new value to a variable
at the same time that another part of the expression
is trying to use the value of that variable.
</p><p>Actually,
we had already started writing expressions which did several things at once
even before we met the <TT>++</TT> and <TT>--</TT> operators.
The expression
<pre>
(c = getchar()) != EOF
</pre>
assigns <TT>getchar</TT>'s return value to <TT>c</TT>,
<em>and</em> compares it to <TT>EOF</TT>.
The <TT>++</TT> and <TT>--</TT> operators make it much easier
to cram a lot into a small expression:
the example
<pre>
line[nch++] = c;
</pre>
from the previous section
assigned <TT>c</TT> to <TT>line[nch]</TT>,
<em>and</em> incremented <TT>nch</TT>.
We'll eventually meet expressions which do <em>three</em> things at once,
such as
<pre>
a[i++] = b[j++];
</pre>
which assigns <TT>b[j]</TT> to <TT>a[i]</TT>,
and increments <TT>i</TT>,
<em>and</em> increments <TT>j</TT>.
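As a small illustration, here is a sketch of a copy loop built around that last expression.
The function name <TT>copy_ints</TT> and its parameters are invented for this example.
The expression is safe here because <TT>i</TT> and <TT>j</TT> are different variables,
and each is modified only once.

```c
#include <stddef.h>

/* Copy n ints from b to a, one element per trip through the loop.
 * copy_ints is a hypothetical helper, not something from K&R. */
void copy_ints(int a[], const int b[], size_t n)
{
    size_t i = 0, j = 0;

    while (j < n)
        a[i++] = b[j++];    /* assign, step i, step j: three things at once */
}
```

Calling <TT>copy_ints(dst, src, 5)</TT> on two five-element arrays
leaves <TT>dst</TT> holding the same five values as <TT>src</TT>.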
</p><p>If you're not careful, though,
it's easy for this sort of thing to get out of hand.
Can you figure out exactly what the expression
<pre>
a[i++] = b[i++];	/* WRONG */
</pre>
should do?
I can't,
and here's the important part:
<em>neither can the compiler</em>.
We know that the definition of postfix <TT>++</TT> is that the
former value, before the increment, is what goes on to
participate in the rest of the expression,
but the expression <TT>a[i++] = b[i++]</TT>
contains <em>two</em> <TT>++</TT> operators.
Which of them happens first?
Does this expression assign the old <TT>i</TT>th element of <TT>b</TT>
to the new <TT>i</TT>th element of <TT>a</TT>, or vice versa?
No one knows.
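If one of those two readings is what we actually want,
we can say so explicitly by splitting the work into separate statements.
The following is only a sketch:
the function names are invented for this example,
and neither version is what <TT>a[i++] = b[i++]</TT> ``really'' means,
since the original has no defined meaning.

```c
/* Reading 1: the right-hand i++ goes first, so the old i indexes b
 * and the new i indexes a.  Returns the final value of i. */
int right_side_first(int a[], const int b[], int i)
{
    a[i + 1] = b[i];
    return i + 2;       /* i has been incremented twice */
}

/* Reading 2: the left-hand i++ goes first, so the old i indexes a
 * and the new i indexes b.  Returns the final value of i. */
int left_side_first(int a[], const int b[], int i)
{
    a[i] = b[i + 1];
    return i + 2;
}
```

Either function is well-defined,
because every use of <TT>i</TT> is a plain read
and the new value is computed separately.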
</p><p>When the order of evaluation matters but is not well-defined
(that is, when we can't say for sure which order the compiler
will evaluate the various dependent parts in)
we say that the meaning of the expression is <dfn>undefined</dfn>,
and if we're smart we won't write the expression in the first place.
(Why would anyone ever write an ``undefined'' expression?
Because sometimes,
the compiler happens to evaluate it in the order a programmer wanted,
and the programmer assumes that since it works,
it must be okay.)
</p><p>For example, suppose we carelessly wrote this loop:
<pre>
int i, a[10];
i = 0;
while(i &lt; 10)
	a[i] = i++;	/* WRONG */
</pre>
It looks like we're trying to set <TT>a[0]</TT> to 0,
<TT>a[1]</TT> to 1, etc.
But what if the increment <TT>i++</TT> happens
before the compiler decides which cell of the array <TT>a</TT>
to store the (unincremented) result in?
We might end up setting
<TT>a[1]</TT> to 0, <TT>a[2]</TT> to 1, etc.,
instead.
Since, in this case, we can't be sure which order things would happen in,
we simply shouldn't write code like this.
In this case, what we're doing matches the pattern of a
<TT>for</TT> loop, anyway, which would be a better choice:
<pre>
for(i = 0; i &lt; 10; i++)
	a[i] = i;
</pre>
Now that the increment <TT>i++</TT> isn't crammed
into the same expression that's setting <TT>a[i]</TT>,
the code is perfectly well-defined,
and is guaranteed to do what we want.
</p><p>In general,
you should be wary of ever trying to
second-guess the order an expression will be evaluated in,
with two exceptions:
<OL><li>You can obviously assume that precedence will dictate
the order in which binary operators are applied.
This typically determines not just what order things happen in,
but also what the expression actually <em>means</em>.
(In other words,
the precedence of <TT>*</TT> over <TT>+</TT>
says more than that the multiplication ``happens first''
in <TT>1 + 2 * 3</TT>;
it says that the answer is 7, not 9.)
<li>Although we haven't mentioned it yet,
it is guaranteed that the logical operators
<TT>&amp;&amp;</TT> and <TT>||</TT>
are evaluated left-to-right,
and that the right-hand side is not evaluated at all
if the left-hand side determines the outcome.
</OL></p><p>To look at one more example,
it might seem that the code
<pre>
int i = 7;
printf("%d\n", i++ * i++);
</pre>
would have to print 56, because no matter which order the
increments happen in, 7<TT>*</TT>8 is 8<TT>*</TT>7 is 56.
But <TT>++</TT> just says that the increment happens later,
not that it happens immediately,
so this code could print 49
(if the compiler chose to perform the multiplication first,
and both increments later).
And,
it turns out that ambiguous expressions like this are such a
bad idea that the ANSI C Standard does not require compilers to
do anything reasonable with them at all.
Theoretically,
the above code could end up
printing 42, or 8923409342, or 0, or crashing
your computer.
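If the intent really was ``multiply 7 by 8 and leave <TT>i</TT> at 9,''
a sketch like the following says so without relying on any particular ordering.
The helper name <TT>product_then_advance</TT> is hypothetical,
and this is just one way of spelling out that intent.

```c
/* Multiply the value at *ip by its successor, then advance *ip by 2.
 * Each step is its own statement, so the code itself fixes the order. */
int product_then_advance(int *ip)
{
    int first = *ip;        /* the value before any increment */
    int second = *ip + 1;   /* the value after one increment */

    *ip += 2;               /* both increments, applied after the reads */
    return first * second;
}
```

With <TT>i</TT> set to 7,
<TT>product_then_advance(&amp;i)</TT> returns 56 and leaves <TT>i</TT> equal to 9,
on any conforming compiler.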
</p><p>Programmers sometimes mistakenly imagine
that they can write an expression which tries to do too much at once
and then predict exactly how it will behave
based on ``order of evaluation.''
For example, we know that multiplication has higher
<dfn>precedence</dfn> than addition, which means that in the
expression
<pre>
i + j * k
</pre>
<TT>j</TT> will be multiplied by <TT>k</TT>,
and then <TT>i</TT> will be added to the result.
Informally, we often say that the multiplication happens
``before'' the addition.
That's true in this case, but it doesn't say as much as we
might think about a more complicated expression,
such as
<pre>
i++ + j++ * k++
</pre>
In this case, besides the addition and multiplication,
<TT>i</TT>, <TT>j</TT>, and <TT>k</TT> are all being incremented.
We can <em>not</em> say which of them will be incremented first;
it's the compiler's choice.
(In particular, it is <em>not</em> necessarily the case
that <TT>j++</TT> or <TT>k++</TT> will happen first;
the compiler might choose to save <TT>i</TT>'s value somewhere and
increment <TT>i</TT> first,
even though it will have to keep the old value around until
after it has done the multiplication.)
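The expression is still perfectly legal, though,
because each of the three variables is modified exactly once
and is not used anywhere else in the expression.
The sketch below (with an invented function name, and pointer parameters
only so that the caller can see the updated values)
shows that the result depends on the old values alone,
whichever increment the compiler performs first.

```c
/* Evaluate i++ + j++ * k++ on three distinct variables.
 * The order of the three increments is the compiler's choice,
 * but it cannot affect the result or the final values. */
int sum_of_product(int *ip, int *jp, int *kp)
{
    int i = *ip, j = *jp, k = *kp;
    int result = i++ + j++ * k++;   /* uses the old i, j, and k */

    *ip = i;    /* hand the incremented values back to the caller */
    *jp = j;
    *kp = k;
    return result;
}
```

Starting from 1, 2, and 3, the function returns 7 (that is, 1 + 2 * 3)
and leaves the three variables at 2, 3, and 4.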
</p><p>In the preceding example,
it probably doesn't matter which variable is incremented first.
It's not too hard, though,
to write an expression where it does matter.
In fact,
we've seen one already:
the ambiguous assignment
<TT>a[i++] = b[i++]</TT>.
We still don't know which <TT>i++</TT> happens first.
(We can <em>not</em> assume,
based on the right-to-left behavior of the <TT>=</TT> operator,
that the right-hand <TT>i++</TT> will happen first.)
But if we had to know what <TT>a[i++] = b[i++]</TT>
really did, we'd have to know which <TT>i++</TT> happened first.
</p><p>Finally,
note that parentheses don't dictate overall evaluation
order any more than precedence does.
Parentheses override precedence
and say which operands go with which operators,
and they therefore affect the overall meaning of an expression,
but they don't say anything about the order of subexpressions or side effects.
We could not ``fix'' the evaluation order of
any of the expressions we've been discussing
by adding parentheses.
If we wrote
<pre>
i++ + (j++ * k++)
</pre>
we still wouldn't know
which of the increments would happen first.
(The parentheses would force the multiplication to happen before the addition,
but precedence already would have forced that, anyway.)
If we wrote
<pre>
(i++) * (i++)
</pre>
the parentheses wouldn't force the increments to happen before
the multiplication or in any well-defined order;
this parenthesized version would be just as undefined as
<TT>i++ * i++</TT> was.
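What parentheses <em>do</em> control is grouping,
and that can be seen in a side-effect-free sketch such as the following.
The three function names are made up for this example.

```c
/* No parentheses: precedence groups j * k first. */
int default_grouping(int i, int j, int k)
{
    return i + j * k;
}

/* Parentheses that merely restate what precedence already says. */
int explicit_grouping(int i, int j, int k)
{
    return i + (j * k);
}

/* Parentheses that override precedence and change the meaning. */
int regrouped(int i, int j, int k)
{
    return (i + j) * k;
}
```

With arguments 1, 2, and 3, the first two functions both return 7
and the third returns 9.
None of them says anything about when side effects would take place,
because there are no side effects here to order.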
</p><p>There's a line from Kernighan &amp; Ritchie, which I am fond of
quoting when discussing these issues
[Sec. 2.12, p. 54]:
<blockquote>The moral is that writing code
that depends on order of evaluation
is a bad programming practice in any language.
Naturally,
it is necessary to know what things to avoid,
but if you don't know
<I>how</I>
they are done on various machines,
you won't be tempted to take advantage of a particular implementation.
</blockquote></p><p>The first edition of K&amp;R said
<blockquote>...if you don't know
<I>how</I>
they are done on various machines,
that innocence may help to protect you.
</blockquote>I actually prefer the first edition wording.
Many textbooks encourage you to write small programs
to find out how your compiler
implements some of these ambiguous expressions,
but it's just one step from writing a small program to find out,
to writing a real program which makes use of what you've just learned.
But you <em>don't</em> want to write programs
that work only under one particular compiler,
that take advantage of the way that one compiler
(but perhaps no other)
happens to implement the undefined expressions.
It's fine to be curious about what goes on ``under the hood,''
and many of you will be curious enough about what's going on
with these ``forbidden'' expressions that you'll want to
investigate them,
but please keep very firmly in mind that,
for real programs,
the very easiest way of dealing with ambiguous, undefined expressions
(which one compiler interprets one way
and another interprets another way
and a third crashes on)
is not to write them in the first place.
</p><hr>
<p>
Read sequentially:
<a href="sx7b.html" rev=precedes>prev</a>
<a href="sx8.html" rel=precedes>next</a>
<a href="sx7.html" rev=subdocument>up</a>
<a href="top.html">top</a>
</p>
<p>
This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
// <a href="copyright.html">Copyright</a> 1995-1997
// <a href="mailto:scs@eskimo.com">mail feedback</a>
</p>
</body>
</html>