<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!-- This collection of hypertext pages is Copyright 1995, 1996 by Steve Summit. -->
<!-- This material may be freely redistributed and used -->
<!-- but may not be republished or sold without permission. -->
<html>
<head>
<link rev="owner" href="mailto:scs@eskimo.com">
<link rev="made" href="mailto:scs@eskimo.com">
<title>section 7.4: Formatted Input -- Scanf</title>
<link href="sx10c.html" rev=precedes>
<link href="sx10e.html" rel=precedes>
<link href="sx10.html" rev=subdocument>
</head>
<body>
<H2>section 7.4: Formatted Input -- Scanf</H2>
page 157
<p>Somehow we've managed to make it through six chapters
without meeting <TT>scanf</TT>,
which it turns out is just as well.
</p><p>In the examples in this book so far,
all input
(from the user, or otherwise)
has been done with <TT>getchar</TT> or <TT>getline</TT>.
If we needed to input a number, we did things like
<pre>	char line[MAXLINE];
	int number;
	getline(line, MAXLINE);
	number = atoi(line);
</pre>Using <TT>scanf</TT>,
we could ``simplify'' this to
<pre>	int number;
	scanf("%d", &amp;number);
</pre>This simplification is convenient and superficially attractive,
and it works, as far as it goes.
The problem is that <TT>scanf</TT> does not work well in more
complicated situations.
In section 7.1,
we said that
calls to <TT>putchar</TT> and <TT>printf</TT> could be interleaved.
The same is <em>not</em> always true of <TT>scanf</TT>:
you can have baffling problems
if you try to intermix calls to <TT>scanf</TT>
with calls to <TT>getchar</TT> or <TT>getline</TT>.
Worse,
it turns out that <TT>scanf</TT>'s error handling
is inadequate for many purposes.
It tells you whether a conversion succeeded or not
(more precisely, it tells you how many conversions succeeded),
but it doesn't tell you anything more than that
(unless you ask very carefully).
Like <TT>atoi</TT> and <TT>atof</TT>,
<TT>scanf</TT> stops reading characters
when it's processing a <TT>%d</TT> or <TT>%f</TT> input
and it finds a non-numeric character.
Suppose you've prompted the user to enter a number,
and the user accidentally types the letter `x'.
<TT>scanf</TT> might return 0, indicating that it couldn't
convert a number,
but the unconvertible text (the `x')
remains on the input stream
unless you figure out some other way to remove it.
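</p><p>The stuck-character problem can be seen in a small sketch.
This is only an illustration, not code from the book:
the helper name <TT>read_int</TT> is made up,
and it uses <TT>fscanf</TT> on a <TT>FILE *</TT>
(rather than <TT>scanf</TT> on standard input)
so that it can be tried on any stream.
After a failed conversion it discards the rest of the line,
which is one possible way (among others) of removing the unconvertible text.
</p>

```c
#include <stdio.h>

/* Hypothetical helper: try to read one int from fp.
 * Returns 1 on success, 0 if the next input was not a number,
 * and EOF at end of input.  After a failed conversion the
 * offending text is still on the stream, so throw away the rest
 * of the line before the caller tries again. */
int read_int(FILE *fp, int *out)
{
    int rc = fscanf(fp, "%d", out);

    if (rc == 0) {
        int c;
        while ((c = getc(fp)) != EOF && c != '\n')
            ;   /* discard the unconvertible text */
    }
    return rc;
}
```

<p>Without the discarding loop,
a caller that simply retried after a return value of 0
would be handed the same `x' again on every call.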
</p><p>For these reasons
(and several others, which I won't bother to mention)
it's generally recommended that <TT>scanf</TT> not be used
for unstructured input such as user prompts.
It's much better to read entire lines with something like
<TT>getline</TT>
(as we've been doing all along)
and then process the line somehow.
If the line is supposed to be a single number,
you can use <TT>atoi</TT> or <TT>atof</TT> to convert it.
If the line has more complicated structure,
you can use <TT>sscanf</TT>
(which we'll meet in a minute)
to parse it.
(It's better to use <TT>sscanf</TT> than <TT>scanf</TT>
because when <TT>sscanf</TT> fails,
you have complete control over what you do next.
When <TT>scanf</TT> fails,
on the other hand,
you're at the mercy of where in the input stream it has left you.)
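</p><p>Here is a rough sketch of the read-a-line-then-parse approach.
The names <TT>parse_number</TT> and <TT>read_number</TT> are my own inventions,
the 100-character buffer is an arbitrary assumption,
and the standard <TT>fgets</TT> stands in for the book's <TT>getline</TT>.
The extra <TT>%c</TT> in the format is just one way of noticing
trailing junk such as <TT>12abc</TT>,
which <TT>atoi</TT> would quietly accept as 12.
</p>

```c
#include <stdio.h>

/* Hypothetical helper: the line must hold one integer and nothing
 * else but whitespace.  Returns 1 on success, 0 otherwise. */
int parse_number(const char *line, int *out)
{
    char extra;

    /* If anything non-blank follows the number, %c converts it and
     * sscanf returns 2, so only a return of exactly 1 is accepted. */
    return sscanf(line, "%d %c", out, &extra) == 1;
}

/* Hypothetical helper: read one whole line, then parse it.
 * Returns 1 on success, 0 for a bad line, EOF at end of input. */
int read_number(FILE *fp, int *out)
{
    char line[100];   /* assumed to be long enough for one line */

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;
    return parse_number(line, out);
}
```

<p>Because a bad line has already been consumed in its entirety,
the caller can just print a complaint and call <TT>read_number</TT> again.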
</p><p>With that little diatribe against <TT>scanf</TT> out of the way,
here are a few comments on individual points made in section 7.4.
</p><p>We've met a few functions
(e.g. <TT>getline</TT>,
<TT>month_day</TT> in section 5.7 on page 111)
which return more than one value;
the way they do so is to accept a pointer argument that tells them where
(in the caller)
to write the returned value.
<TT>scanf</TT> is the epitome of such functions:
it returns potentially many values
(one for each <TT>%</TT>-specifier in its format string),
and for each value converted and returned,
it needs a pointer argument.
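</p><p>A small sketch may make the pointer arguments
and the conversion count concrete.
The wrapper name <TT>scan_three</TT> is made up for this illustration,
and <TT>sscanf</TT> is used so that the input is just a string.
</p>

```c
#include <stdio.h>

/* Hypothetical wrapper: try to convert three integers from s.
 * The return value is the number of conversions that succeeded
 * (or EOF if the input ran out before the first one); only that
 * many of *a, *b, *c have been written. */
int scan_three(const char *s, int *a, int *b, int *c)
{
    return sscanf(s, "%d %d %d", a, b, c);
}
```

<p>Given <TT>"10 20 abc"</TT>, for instance,
the call returns 2 and leaves the third variable untouched,
so the return value is the only reliable guide
to which variables now hold converted values.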
</p><p>The statement on page 157 that
``blanks or tabs'' in the format string ``are ignored''
(which is repeated on page 159)
is a simplification:
in actuality,
a blank or tab
(or newline; actually any whitespace)
in the format string causes <TT>scanf</TT> to skip whitespace
(blanks, tabs, etc.)
in the input stream.
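</p><p>The difference a blank in the format string makes
shows up most clearly with <TT>%c</TT>,
which, unlike <TT>%d</TT>, does not skip leading whitespace on its own.
The two function names below are made up for this sketch.
</p>

```c
#include <stdio.h>

/* Blank in the format: skip any run of whitespace in the input,
 * then read the next character. */
int with_blank(const char *s, int *n, char *c)
{
    return sscanf(s, "%d %c", n, c);
}

/* No blank: %c takes the very next character, whitespace or not. */
int without_blank(const char *s, int *n, char *c)
{
    return sscanf(s, "%d%c", n, c);
}
```

<p>On the input <TT>"5 x"</TT>,
the first version stores <TT>'x'</TT>
and the second stores the blank that follows the 5.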
</p><p>A <TT>*</TT> character in a <TT>scanf</TT> conversion specifier
means something completely different from what it means for <TT>printf</TT>:
for <TT>scanf</TT>,
it means to suppress assignment
(i.e. for that conversion specifier,
there isn't a pointer in the argument list to receive the converted value,
so the converted value is discarded).
With <TT>scanf</TT>,
there is no direct way of taking a field width from the argument list,
as <TT>*</TT> does for <TT>printf</TT>.
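</p><p>Both points can be sketched briefly.
The function names are invented for the illustration,
and building the format string at run time with <TT>sprintf</TT>
is merely one indirect workaround I'm suggesting,
not something the book describes.
The caller is assumed to pass a positive <TT>width</TT>
and a buffer of at least <TT>width + 1</TT> characters.
</p>

```c
#include <stdio.h>

/* Hypothetical helper: pick the first and third fields out of
 * "int word int".  The middle word is matched by %*s but not
 * stored, and it is not counted in the return value. */
int first_and_third(const char *s, int *a, int *b)
{
    return sscanf(s, "%d %*s %d", a, b);   /* 2 on success */
}

/* Hypothetical helper: an indirect field width.  The width cannot
 * come from the argument list, so it is written into the format
 * string instead.  Assumes width > 0 and that buf has room for
 * width + 1 characters. */
int read_word(const char *s, char *buf, int width)
{
    char fmt[32];

    sprintf(fmt, "%%%ds", width);   /* width 4 gives "%4s" */
    return sscanf(s, fmt, buf);
}
```

<p>Note that <TT>first_and_third</TT> passes only two pointers
even though its format string contains three conversion specifiers.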
</p><p>Conversion specifiers like <TT>%d</TT> and <TT>%f</TT>
automatically skip leading whitespace
while looking for something to convert.
This means that the format strings <TT>"%d&nbsp;%d"</TT> and <TT>"%d%d"</TT>
act exactly the same--the
whitespace in the first format string
causes whitespace to be skipped before the second <TT>%d</TT>,
but the second <TT>%d</TT> would have skipped that whitespace anyway.
(Yet another <TT>scanf</TT> foible
is that the innocuous-looking format string <TT>"%d\n"</TT>
converts a number and then skips whitespace,
which means that it will gobble up
not only a newline following the number it converts,
but any number of newlines or whitespace,
and in fact
it will <em>keep</em> reading
until it finds a non-whitespace character,
which it then won't read.
This sounds confusing,
but so is <TT>scanf</TT>'s behavior
when given a format string like <TT>"%d\n"</TT>.
The moral is simple:
don't use trailing <TT>\n</TT>'s in <TT>scanf</TT> format strings.)
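</p><p>The effect of a trailing <TT>\n</TT> can be observed
by looking at which character is left on the stream after the conversion.
The two function names below are hypothetical,
and <TT>fscanf</TT> is used on a <TT>FILE *</TT>
so that the sketch can be run against a file rather than a keyboard.
</p>

```c
#include <stdio.h>

/* Convert a number with "%d", then report the next character
 * still waiting on fp (EOF if the conversion failed). */
int next_after_plain(FILE *fp, int *n)
{
    if (fscanf(fp, "%d", n) != 1)
        return EOF;
    return getc(fp);
}

/* Same, but with a trailing \n in the format: that \n skips
 * every whitespace character, not just one newline, and stops
 * only at the next non-whitespace character. */
int next_after_newline(FILE *fp, int *n)
{
    if (fscanf(fp, "%d\n", n) != 1)
        return EOF;
    return getc(fp);
}
```

<p>If the input is a 1 followed by two newlines, some blanks, and a 2,
the first function reports the newline right after the 1,
while the second reports the <TT>'2'</TT>.
At a keyboard,
this is why a call with <TT>"%d\n"</TT> seems to hang
until the user types something other than whitespace.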
</p><p>page 158
</p><p>Notice that,
for <TT>scanf</TT>,
the <TT>%e</TT>, <TT>%f</TT>, and <TT>%g</TT> formats
are all the same,
and signify conversion of a <TT>float</TT> value
(they accept a pointer argument of type <TT>float *</TT>).
To convert a <TT>double</TT>,
you need to use <TT>%le</TT>, <TT>%lf</TT>, or <TT>%lg</TT>.
(This is quite different from the <TT>printf</TT> family,
which uses <TT>%e</TT>, <TT>%f</TT>, and <TT>%g</TT>
for <TT>float</TT>s <em>and</em> <TT>double</TT>s,
though all three request different formats.
Furthermore,
<TT>%le</TT>, <TT>%lf</TT>, and <TT>%lg</TT>
are technically incorrect for <TT>printf</TT>
under the original ANSI C standard,
though most compilers
probably accept them,
and C99 later defined the <TT>l</TT> as having no effect there.)
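</p><p>A short sketch of the asymmetry follows.
The function names are made up;
the only point is which format letters go with which types
on the <TT>scanf</TT> side and on the <TT>printf</TT> side.
</p>

```c
#include <stdio.h>

/* Hypothetical helper: scanf family.  %f needs a float *,
 * %lf needs a double *. */
int scan_float_and_double(const char *s, float *f, double *d)
{
    return sscanf(s, "%f %lf", f, d);
}

/* Hypothetical helper: printf family.  Plain %f serves for both,
 * because a float argument is promoted to double when passed in
 * a variable-length argument list. */
int format_both(char *buf, float f, double d)
{
    return sprintf(buf, "%f %f", f, d);
}
```

<p>Mixing these up on the <TT>scanf</TT> side
(say, <TT>%f</TT> with a <TT>double *</TT>)
is not diagnosed by the language
and typically stores garbage.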
</p><p>page 159
</p><p>More precisely,
the reason that you don't need to use a <TT>&amp;</TT> with <TT>monthname</TT>
is that an array,
when it appears in an expression like this,
is automatically converted to a pointer.
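</p><p>To see that the array name really does act as a pointer to its first element,
here is a tiny sketch.
The function name is invented,
and the <TT>%19s</TT> width is my own precaution against overflowing
the 20-character arrays, not something taken from the book.
</p>

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical demonstration: read the first word of s twice,
 * once passing the array name and once passing an explicit
 * pointer to its first element.  Returns 1 if both calls
 * stored the same word, 0 otherwise. */
int same_either_way(const char *s)
{
    char a[20], b[20];

    if (sscanf(s, "%19s", a) != 1)       /* array name, no & */
        return 0;
    if (sscanf(s, "%19s", &b[0]) != 1)   /* explicit char * */
        return 0;
    return strcmp(a, b) == 0;
}
```

<p>Writing <TT>&amp;a</TT> instead would yield a pointer to the whole array,
which is not the <TT>char *</TT> that <TT>%s</TT> expects.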
</p><p>The dual-format date conversion example in the middle of page 159
is a nice example of the advantages of calling
<TT>getline</TT> and then <TT>sscanf</TT>.
At the beginning of this section,
I said that
``when <TT>sscanf</TT> fails,
you have complete control over what you do next.''
Here, ``what you do next''
is try calling <TT>sscanf</TT> again,
on the very same input string
(thus effectively backing up to the very beginning of it),
using a different format string,
to try parsing the input a different way.
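</p><p>The try-again idea can be packaged as a function,
roughly along the lines of the book's example.
This is only a sketch:
the <TT>enum</TT>, the name <TT>classify_date</TT>,
and the <TT>%19s</TT> width are my own additions.
</p>

```c
#include <stdio.h>

enum date_form { DATE_INVALID, DATE_DAY_MONTH_YEAR, DATE_NUMERIC };

/* Hypothetical helper: decide whether line looks like
 * "25 Dec 1988", like "mm/dd/yy", or like neither, by trying one
 * sscanf format and, if that fails, another on the same string. */
enum date_form classify_date(const char *line)
{
    int day, month, year;
    char monthname[20];

    if (sscanf(line, "%d %19s %d", &day, monthname, &year) == 3)
        return DATE_DAY_MONTH_YEAR;
    /* The first attempt consumed nothing permanently: line is
     * still intact, so the second attempt starts from the top. */
    if (sscanf(line, "%d/%d/%d", &month, &day, &year) == 3)
        return DATE_NUMERIC;
    return DATE_INVALID;
}
```

<p>Doing the same thing with <TT>scanf</TT> directly
would be much harder,
since the first failed attempt would already have consumed
part of the input.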
</p><hr>
<p>
Read sequentially:
<a href="sx10c.html" rev=precedes>prev</a>
<a href="sx10e.html" rel=precedes>next</a>
<a href="sx10.html" rev=subdocument>up</a>
<a href="top.html">top</a>
</p>
<p>
This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
// <a href="copyright.html">Copyright</a> 1995, 1996
// <a href="mailto:scs@eskimo.com">mail feedback</a>
</p>
</body>
</html>