<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!-- This collection of hypertext pages is Copyright 1995, 1996 by Steve Summit. -->
<!-- This material may be freely redistributed and used -->
<!-- but may not be republished or sold without permission. -->
<html>
<head>
<link rev="owner" href="mailto:scs@eskimo.com">
<link rev="made" href="mailto:scs@eskimo.com">
<title>section 2.7: Type Conversions</title>
<link href="sx5f.html" rev=precedes>
<link href="sx5h.html" rel=precedes>
<link href="sx5.html" rev=subdocument>
</head>
<body>
<H2>section 2.7: Type Conversions</H2>
<p>The conversion rules described here and on page 44 are straightforward,
but they're quite important,
so you'll need to learn them well.
Usually,
conversions happen automatically and when you want them to,
but not always,
so it's important to keep
the rules in mind.
(Recall the discussion of <TT>5/9</TT> on page 12.)
</p><p>Deep sentence:
<blockquote>A <TT>char</TT> is just a small integer,
so <TT>char</TT>s may be freely used in arithmetic expressions.
</blockquote>Whether you treat a ``small integer'' as a character
or an integer is pretty much up to you.
As we saw earlier,
in the ASCII character set,
the character <TT>'0'</TT> has the value 48.
Therefore, saying
<pre> int i = '0';
</pre>is the same as saying
<pre> int i = 48;
</pre>If you print <TT>i</TT> out as a character,
using
<pre> putchar(i);
</pre>or
<pre> printf("%c", i);
</pre>(the <TT>%c</TT> format prints characters; see page 13),
you'll see the character <TT>'0'</TT>.
If you print it out as a number:
<pre> printf("%d", i);
</pre>you'll see the value 48.
</p><p>Most of the time,
you'll use whatever notation matches what you're trying to do.
If you want the character <TT>'0'</TT>,
you'll use <TT>'0'</TT>.
If you want the value 48
(as the number of months in four years, or something),
you'll use <TT>48</TT>.
If you want to print characters,
you'll use <TT>putchar</TT> or <TT>printf</TT> <TT>%c</TT>,
and if you want to print integers,
you'll use <TT>printf</TT> <TT>%d</TT>.
Occasionally, you'll cross over between
thinking of characters as characters and as values,
such as in the character-counting program in section 1.6 on page 22,
or in the <TT>atoi</TT> function we'll look at next.
(You should never have to know that <TT>'0'</TT> has the value 48,
and you should never have to write code which depends on it.)
</p><p>page 43
</p><p>To illustrate the ``schizophrenic'' nature of characters
(are they characters, or are they small integer values?),
it's useful to look at an implementation of the standard
library function <TT>atoi</TT>.
(If you're getting overwhelmed, though, you may skip this
example for now, and come back to it later.)
The <TT>atoi</TT> routine converts a string like <TT>"123"</TT>
into an integer having the corresponding value.
</p><p>As you study the <TT>atoi</TT> code at the top of page 43,
figure out why it does <em>not</em> seem to explicitly check
for the terminating <TT>'\0'</TT> character.
</p><p>The expression
<pre> s[i] - '0'
</pre>is an example of the ``crossing over''
between thinking about a character and its value.
Since the value of the character <TT>'0'</TT> is not zero
(and, similarly, the other numeric characters
don't have their ``obvious'' values, either),
we have to do a little conversion
to get the value 0 from the character <TT>'0'</TT>,
the value 1 from the character <TT>'1'</TT>, etc.
Since
the character set values for
the digit characters <TT>'0'</TT> to <TT>'9'</TT> are contiguous
(48-57, if you must know),
the conversion involves simply subtracting an offset,
and the offset
(if you think about it)
is simply the value of the character <TT>'0'</TT>.
We could write
<pre> s[i] - 48
</pre>if we really wanted to,
but that would require knowing what the value actually is.
We shouldn't have to know
(and it might be different in some other character set),
so we can let the compiler do the dirty work
by using <TT>'0'</TT> as the offset
(since subtracting <TT>'0'</TT> is,
by definition,
the same as subtracting the value of the character <TT>'0'</TT>).
</p><p>The functions from <TT>&lt;ctype.h&gt;</TT> are being introduced
here without a lot of fanfare.
Here is the main loop of the <TT>atoi</TT> routine,
rewritten to use <TT>isdigit</TT>:
<pre> for (i = 0; isdigit(s[i]); ++i)
     n = 10 * n + (s[i] - '0');
</pre></p><p>Don't worry too much about the discussion of
signed vs. unsigned characters for now.
(Don't forget about it completely, though;
eventually,
you'll find yourself working with a program where the issue is significant.)
For now, just remember:
<OL><li>Use <TT>int</TT> as the type of any variable
which receives the return value from <TT>getchar</TT>,
as discussed in section 1.5.1 on page 16.
<li>If you're ever dealing with arbitrary ``bytes'' of binary data,
you'll usually want to use <TT>unsigned char</TT>.
</OL></p><p>page 44
</p><p>As we saw in section 2.6 on page 44,
relational and logical operators always ``return'' 1 for ``true''
and 0 for ``false.''
However,
when C wants to know whether something is true or false,
it just looks at whether it's nonzero or zero,
so any nonzero value is considered ``true.''
Finally,
some functions which return true/false values
(the text mentions <TT>isdigit</TT>)
may return ``true'' values of other than 1.
</p><p>You don't have to worry about these distinctions too much,
and you also don't have to worry about the fragment
<pre> d = c &gt;= '0' &amp;&amp; c &lt;= '9'
</pre>as long as you write conditionals in a sensible way.
If you wanted to see
whether two variables <TT>a</TT> and <TT>b</TT> were equal,
you'd never write
<pre> if((a == b) == 1)
</pre>(although it <em>would</em> work:
the <TT>==</TT> operator ``returns'' 1 if
they're equal).
Similarly,
you don't want to write
<pre> if(isdigit(c) == 1)
</pre>because it's equally silly-looking,
and in this case it might <em>not</em> work.
Just write things like
<pre> if(a == b)
</pre>and
<pre> if(isdigit(c))
</pre>and you'll steer clear of most problems.
(Make sure, though, that you never try something like
<TT>if('0' &lt;= c &lt;= '9')</TT>,
since this wouldn't do at all what it looks like it's supposed to.)
</p><p>The set of implicit conversions on page 44,
though informally stated,
is exactly the set to remember for now.
They're easy to remember if you notice that,
as the authors say,
``the `lower' type is <dfn>promoted</dfn> to the `higher' type,''
where the ``order'' of the types is
<pre> char &lt; short int &lt; int &lt; long int &lt; float &lt; double &lt; long double
</pre>(We won't be using <TT>long double</TT>,
so you don't need to worry about it.)
We'll have more to say about these rules on the next page.
</p><p>Don't worry too much for now
about the additional rules for <TT>unsigned</TT> values,
because we won't be using them at first.
</p><p>Do notice that implicit (automatic) conversions do happen
across assignments.
It's perfectly acceptable to
assign a <TT>char</TT> to an <TT>int</TT> or vice versa,
or
assign an <TT>int</TT> to a <TT>float</TT> or vice versa
(or any other combination).
Obviously, when you assign a value from a larger type to a
smaller one,
there's a chance that it might not fit.
Therefore, compilers will often warn you about such assignments.
</p><p>page 45
</p><p><dfn>Casts</dfn> can be a bit confusing at first.
A <dfn>cast</dfn> is the syntax used to request an explicit type conversion;
<dfn>coercion</dfn> is just a more formal word for ``conversion.''
A cast consists of a type name in parentheses
and is used as a unary operator.
You may have used languages
which had conversion operators
which looked more like function calls:
<pre> integer i = 2;
 floating f = floating(i); /* not C */
 integer i2 = integer(f); /* not C */
</pre>
In C,
you accomplish the same thing with casts:
<pre> int i = 2;
 float f = (float)i;
 int i2 = (int)f;
</pre>(Actually, in C,
we wouldn't need casts in those initializations at all,
because conversions between <TT>int</TT> and <TT>float</TT>
are some of the ones that C performs automatically.)
</p><p>To further understand how both implicit conversions
and explicit casts work,
let's study how the implicit conversions would look
if we wrote them out explicitly.
First we'll declare a few variables of various types:
<pre> char c1, c2;
 int i1, i2;
 long int L1, L2;
 float f1, f2;
 double d1, d2;
</pre>Next we'll look at the kinds of conversions which C
automatically performs when performing arithmetic on two
dissimilar types, or when assigning a value to a dissimilar type.
The rules are straightforward:
when performing arithmetic on two dissimilar types,
C converts one or both sides to a common type;
and when assigning a value,
C converts it to the type of the variable being assigned to.
</p><p>If we add a <TT>char</TT> to an <TT>int</TT>:
<pre> i2 = c1 + i1;
</pre>the fourth rule on page 44 tells us to convert the <TT>char</TT> to an <TT>int</TT>,
as if we'd written
<pre> i2 = (int)c1 + i1;
</pre>If we multiply a <TT>long int</TT> and a <TT>double</TT>:
<pre> d2 = L1 * d1;
</pre>the second rule tells us to convert the <TT>long int</TT> to a <TT>double</TT>,
as if we'd written
<pre> d2 = (double)L1 * d1;
</pre>An assignment of a <TT>char</TT> to an <TT>int</TT>
<pre> i1 = c1;
</pre>is as if we'd written
<pre> i1 = (int)c1;
</pre>and
an assignment of a <TT>float</TT> to an <TT>int</TT>
<pre> i1 = f1;
</pre>is as if we'd written
<pre> i1 = (int)f1;
</pre></p><p>Some programmers worry that implicit conversions are somehow unreliable
and prefer to insert lots of explicit conversions.
I recommend that you get comfortable with implicit
conversions--they're quite useful--and don't clutter your
code with extra casts.
</p><p>There are a few places where you do need casts, however.
Consider the code
<pre> i1 = 200;
 i2 = 400;
 L1 = i1 * i2;
</pre>The product
200 x 400 is 80000,
which is not guaranteed to fit into an <TT>int</TT>.
(Remember that an <TT>int</TT> is only guaranteed to hold
values up to 32767.)
Since 80000 <em>will</em> fit into a <TT>long int</TT>,
you might think that you're okay, but you're not:
the two sides of the multiplication are of the same type,
so the compiler doesn't see the need to perform any automatic conversions
(none of the rules on page 44 apply).
Therefore, the
multiplication is carried out as an <TT>int</TT>,
which overflows with unpredictable results,
and only after the damage has been done is the unpredictable
value converted to a <TT>long int</TT> for assignment to <TT>L1</TT>.
To get a multiplication like this to work,
you have to explicitly convert at least one of the <TT>int</TT>'s
to <TT>long int</TT>:
<pre> L1 = (long int)i1 * i2;
</pre>Now,
the two sides of the <TT>*</TT> are of different types,
so they're <em>both</em> converted to <TT>long int</TT>
(by the fifth rule on page 44),
and the multiplication is carried out as a <TT>long int</TT>.
If it makes you feel safer, you can use two casts:
<pre> L1 = (long int)i1 * (long int)i2;
</pre>but only one is strictly required.
</p><p>A similar problem arises when two integers are being divided.
The code
<pre> i1 = 1;
 f1 = i1 / 2;
</pre>does not set <TT>f1</TT> to 0.5; it sets it to 0.
Again,
the two operands of the <TT>/</TT> operator are already of the same type
(the rules on page 44 still don't apply),
so an integer division is performed,
which discards any fractional part.
(We saw a similar problem in section 1.2 on page 12.)
Again, an explicit conversion saves the day:
<pre> f1 = (float)i1 / 2;
</pre>Alternatively,
in a case like this,
you can use a floating-point constant:
<pre> f1 = i1 / 2.0;
</pre>In either case,
as soon as one of the operands is floating point,
the division is carried out in floating point,
and you get the result you expect.
</p><p>Implicit conversions always happen during arithmetic and
assignment to variables.
The situation is a bit more complicated when functions are
being called, however.
</p><p>The authors use the example of the <TT>sqrt</TT> function,
which is as good an example as any.
<TT>sqrt</TT> accepts an argument of type <TT>double</TT>
and returns a value of type <TT>double</TT>.
If the compiler didn't know that <TT>sqrt</TT> took a <TT>double</TT>,
and if you called
<pre> sqrt(4);
</pre>or
<pre> int n = 4;
 sqrt(n);
</pre>the compiler would pass an <TT>int</TT> to <TT>sqrt</TT>.
Since <TT>sqrt</TT> expects a <TT>double</TT>,
it will not work correctly if it receives an <TT>int</TT>.
Therefore,
it was once
always
necessary to use explicit conversions in cases like
this,
by calling
<pre> sqrt((double)4)
</pre>or
<pre> sqrt((double)n)
</pre>or
<pre> sqrt(4.0)
</pre></p><p>However,
it is now possible,
with a <dfn>function prototype</dfn>,
to tell the compiler what types of arguments a function expects.
The prototype for <TT>sqrt</TT> is
<pre> double sqrt(double);
</pre>and as long as a prototype is in effect
(``in scope,'' as the cognoscenti would say),
you can call <TT>sqrt</TT> without worrying about conversions.
When a prototype is in effect,
the compiler performs implicit conversions during function calls
(specifically, while passing the arguments)
exactly as it does during simple assignments.
</p><p>Obviously, using prototypes makes for much safer programming,
and it is recommended that
you use them
whenever possible.
For the standard library functions
(the ones already written for you),
you get prototypes automatically
when you include the <dfn>header files</dfn> which describe
sets of library functions.
For example,
you can
get prototypes for all of C's built-in math functions
by putting the line
<pre> #include &lt;math.h&gt;
</pre>at the top of your program.
For functions that you write,
you can supply your own prototypes,
which
we'll be learning more about later.
</p><p>However, there are a few situations
(we'll talk about them later)
where prototypes do not apply,
so it's important to remember that function calls are a bit
different
and that
explicit conversions (i.e. casts) may
occasionally be required.
Don't imagine that prototypes are a panacea.
</p><p>page 46
</p><p>Don't worry about the <TT>rand</TT> example.
</p><hr>
<p>
Read sequentially:
<a href="sx5f.html" rev=precedes>prev</a>
<a href="sx5h.html" rel=precedes>next</a>
<a href="sx5.html" rev=subdocument>up</a>
<a href="top.html">top</a>
</p>
<p>
This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
// <a href="copyright.html">Copyright</a> 1995, 1996
// <a href="mailto:scs@eskimo.com">mail feedback</a>
</p>
</body>
</html>