<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!-- This collection of hypertext pages is Copyright 1995, 1996 by Steve Summit. -->
<!-- This material may be freely redistributed and used -->
<!-- but may not be republished or sold without permission. -->
<link rev="owner" href="mailto:scs@eskimo.com">
<link rev="made" href="mailto:scs@eskimo.com">
<title>section 7.4: Formatted Input -- Scanf</title>
<link href="sx10c.html" rev=precedes>
<link href="sx10e.html" rel=precedes>
<link href="sx10.html" rev=subdocument>
<H2>section 7.4: Formatted Input -- Scanf</H2>
<p>Somehow we've managed to make it through six chapters
without meeting <TT>scanf</TT>,
which it turns out is just as well.
</p><p>In the examples in this book so far,
all input
(from the user, or otherwise)
has been done with <TT>getchar</TT> or <TT>getline</TT>.
If we needed to input a number, we did things like
<pre>	char line[MAXLINE];
	int number;
	getline(line, MAXLINE);
	number = atoi(line);
</pre>Using <TT>scanf</TT>,
we could ``simplify'' this to
<pre>	int number;
	scanf("%d", &amp;number);
</pre>This simplification is convenient and superficially attractive,
and it works, as far as it goes.
The problem is that <TT>scanf</TT> does not work well in more
complicated situations.
</p><p>Calls to <TT>putchar</TT> and <TT>printf</TT> can be freely interleaved.
The same is <em>not</em> always true of <TT>scanf</TT>:
you can have baffling problems
if you try to intermix calls to <TT>scanf</TT>
with calls to <TT>getchar</TT> or <TT>getline</TT>.
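</p><p>As a rough illustration of the kind of problem meant here, the sketch below reads a number with <TT>fscanf</TT> and then a line with <TT>fgets</TT> from the same stream.
The function name <TT>line_length_after_number</TT> is made up for this example, and <TT>fgets</TT> stands in for the book's <TT>getline</TT>.
The point is that the newline typed after the number is still waiting in the input, so the "next line" comes back empty.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not from the book: reads a number with fscanf,
   then a line with fgets, from the same stream.
   Returns the length of the line fgets saw (newline excluded),
   or -1 if either read failed. */
int line_length_after_number(FILE *fp, int *number)
{
    char line[100];

    if (fscanf(fp, "%d", number) != 1)
        return -1;
    /* fscanf stopped just before the newline that ended the number,
       so fgets picks up only the rest of that same line */
    if (fgets(line, sizeof line, fp) == NULL)
        return -1;
    line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline */
    return (int)strlen(line);
}
```

Given the input lines <TT>42</TT> and <TT>hello</TT>, the function stores 42 but reports a line of length 0; the text <TT>hello</TT> is still unread.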
</p><p>It also turns out that <TT>scanf</TT>'s error handling
is inadequate for many purposes.
It tells you whether a conversion succeeded or not
(more precisely, it tells you how many conversions succeeded),
but it doesn't tell you anything more than that
(unless you ask very carefully).
Like <TT>atoi</TT> and <TT>atof</TT>,
<TT>scanf</TT> stops reading characters
when it's processing a <TT>%d</TT> or <TT>%f</TT> input
and it finds a non-numeric character.
Suppose you've prompted the user to enter a number,
and the user accidentally types the letter `x'.
<TT>scanf</TT> might return 0, indicating that it couldn't
convert a number,
but the unconvertible text (the `x')
remains on the input stream
unless you figure out some other way to remove it.
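</p><p>The stuck `x' can be observed directly.
The following is only a sketch: <TT>try_read_int</TT> is a name invented for this example, and it uses <TT>fscanf</TT> on an arbitrary stream so that it can be tried on a file as well as on <TT>stdin</TT>.
After attempting the conversion, it peeks at the next unread character and pushes it back.

```c
#include <stdio.h>

/* Hypothetical helper, not from the book: attempts one %d conversion
   and reports which character is next in the stream afterwards.
   Returns whatever fscanf returned. */
int try_read_int(FILE *fp, int *number, int *next_char)
{
    int converted = fscanf(fp, "%d", number);

    *next_char = getc(fp);          /* peek at what is still unread */
    if (*next_char != EOF)
        ungetc(*next_char, fp);     /* put it back where we found it */
    return converted;
}
```

If the input is the letter `x', every call returns 0 and reports `x' as the next character, no matter how many times it is called; a loop that simply retries would never make progress.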
</p><p>For these reasons
(and several others, which I won't bother to mention)
it's generally recommended that <TT>scanf</TT> not be used
for unstructured input such as user prompts.
It's much better to read entire lines with something like <TT>getline</TT>
(as we've been doing all along)
and then process the line somehow.
If the line is supposed to be a single number,
you can use <TT>atoi</TT> or <TT>atof</TT> to convert it.
If the line has more complicated structure,
you can use <TT>sscanf</TT>
(which we'll meet in a minute)
to parse it.
(It's better to use <TT>sscanf</TT> than <TT>scanf</TT>
because when <TT>sscanf</TT> fails,
you have complete control over what you do next.
When <TT>scanf</TT> fails,
you're at the mercy of where in the input stream it has left you.)
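</p><p>One way the read-a-line-then-parse approach might look is sketched below.
The name <TT>read_int</TT> is hypothetical, and <TT>fgets</TT> is used in place of the book's <TT>getline</TT> so the fragment stands on its own.
Because each bad line has already been consumed in full, retrying is simply a matter of reading the next line.

```c
#include <stdio.h>

#define MAXLINE 100

/* Hypothetical helper, not from the book: reads lines from fp until one
   begins with an integer, and stores that integer in *out.
   Returns 1 on success, 0 if the input ran out first.
   (A line longer than MAXLINE-1 characters is seen as several lines.) */
int read_int(FILE *fp, int *out)
{
    char line[MAXLINE];

    while (fgets(line, MAXLINE, fp) != NULL) {
        if (sscanf(line, "%d", out) == 1)
            return 1;
        /* the offending line is gone already; just try the next one */
    }
    return 0;
}
```

A caller would typically print a prompt before each call; here an input of `x' followed by `17' yields 17 on the first call.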
</p><p>With that little diatribe against <TT>scanf</TT> out of the way,
here are a few comments on individual points made in section 7.4.
</p><p>We've met a few functions
(e.g. <TT>getline</TT>,
<TT>month_day</TT> in section 5.7 on page 111)
which return more than one value;
the way they do so is to accept a pointer argument that tells them where
to write the returned value.
<TT>scanf</TT> is the epitome of such functions:
it returns potentially many values
(one for each <TT>%</TT>-specifier in its format string),
and for each value converted and returned,
it needs a pointer argument.
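</p><p>To make the one-pointer-per-conversion rule concrete, here is a small sketch using <TT>sscanf</TT>.
The <TT>struct date</TT> type and the function <TT>parse_date</TT> are invented for this illustration.
Note that the array member is passed without an address-of operator, while the two <TT>int</TT> members need one.

```c
#include <stdio.h>

/* Hypothetical type, not from the book */
struct date {
    int day;
    char month[20];
    int year;
};

/* Hypothetical helper: parses text such as "25 Dec 1988".
   Returns the number of fields converted (3 means complete success). */
int parse_date(const char *s, struct date *d)
{
    /* three conversions, three pointer arguments;
       %19s leaves room in month[] for the terminating '\0' */
    return sscanf(s, "%d %19s %d", &d->day, d->month, &d->year);
}
```

The return value is the count of successful conversions, so a caller can tell a complete date from a partial one.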
</p><p>The statement on page 157 that
``blanks or tabs'' in the format string ``are ignored''
(which is repeated on page 159)
is a simplification: a blank or tab
(or newline; actually any whitespace)
in the format string causes <TT>scanf</TT> to skip whitespace
in the input.
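</p><p>The effect is easiest to see with <TT>%c</TT>, which (unlike <TT>%d</TT>) does not skip leading whitespace on its own.
The two functions below are hypothetical names chosen for this sketch; they differ only in whether the format string contains a blank.

```c
#include <stdio.h>

/* Hypothetical helpers, not from the book.
   Each returns the number of characters converted. */

/* Blank in the format: whitespace in the input is skipped before
   the second %c. */
int two_chars_spaced(const char *s, char *a, char *b)
{
    return sscanf(s, "%c %c", a, b);
}

/* No blank in the format: the second %c takes whatever character
   comes next, whitespace included. */
int two_chars_unspaced(const char *s, char *a, char *b)
{
    return sscanf(s, "%c%c", a, b);
}
```

On the input <TT>"x   y"</TT> the first version stores `x' and `y', while the second stores `x' and a blank.
The blank in the format also matches no whitespace at all, so the first version accepts <TT>"xy"</TT> too.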
</p><p>A <TT>*</TT> character in a <TT>scanf</TT> conversion specifier
means something completely different from what it means for <TT>printf</TT>:
it means to suppress assignment
(i.e. for that conversion specifier,
there isn't a pointer in the argument list to receive the converted value,
so the converted value is discarded).
With <TT>scanf</TT>,
there is no direct way of taking a field width from the argument list,
as <TT>*</TT> does for <TT>printf</TT>.
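</p><p>A short sketch of assignment suppression follows; the function name <TT>first_and_third</TT> is made up for the example.
The middle number is converted and thrown away, so only two pointers are passed, and the suppressed field is not counted in the return value.

```c
#include <stdio.h>

/* Hypothetical helper, not from the book: picks the first and third
   integers out of a string such as "10 20 30".
   Returns the number of values actually stored (2 on success). */
int first_and_third(const char *s, int *first, int *third)
{
    /* %*d reads an integer but has no matching pointer argument */
    return sscanf(s, "%d %*d %d", first, third);
}
```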
</p><p>Conversion specifiers like <TT>%d</TT> and <TT>%f</TT>
automatically skip leading whitespace
while looking for something to convert.
This means that the format strings <TT>"%d %d"</TT> and <TT>"%d%d"</TT>
act exactly the same--the
whitespace in the first format string
causes whitespace to be skipped before the second <TT>%d</TT>,
but the second <TT>%d</TT> would have skipped that whitespace anyway.
(Yet another <TT>scanf</TT> foible
is that the innocuous-looking format string <TT>"%d\n"</TT>
converts a number and then skips whitespace,
which means that it will gobble up
not only a newline following the number it converts,
but any number of newlines or other whitespace characters,
and in fact
it will <em>keep</em> reading
until it finds a non-whitespace character,
which it then won't read.
This sounds confusing,
but so is <TT>scanf</TT>'s behavior
when given a format string like <TT>"%d\n"</TT>.
The upshot:
don't use trailing <TT>\n</TT>'s in <TT>scanf</TT> format strings.)
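</p><p>The equivalence of <TT>"%d %d"</TT> and <TT>"%d%d"</TT> can be checked with a pair of small functions.
Both names are hypothetical, and <TT>sscanf</TT> is used so the comparison can be run on fixed strings.

```c
#include <stdio.h>

/* Hypothetical helpers, not from the book.
   Each returns the number of integers converted. */

int pair_spaced(const char *s, int *a, int *b)
{
    return sscanf(s, "%d %d", a, b);
}

int pair_unspaced(const char *s, int *a, int *b)
{
    /* the second %d skips leading whitespace by itself */
    return sscanf(s, "%d%d", a, b);
}
```

Both versions convert 12 and 34 from <TT>"12   34"</TT>, and also from a string in which a newline separates the two numbers.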
</p><p>For <TT>scanf</TT>,
the <TT>%e</TT>, <TT>%f</TT>, and <TT>%g</TT> formats
are all equivalent,
and signify conversion of a <TT>float</TT> value
(they accept a pointer argument of type <TT>float *</TT>).
To convert a <TT>double</TT>,
you need to use <TT>%le</TT>, <TT>%lf</TT>, or <TT>%lg</TT>.
(This is quite different from the <TT>printf</TT> family,
which uses <TT>%e</TT>, <TT>%f</TT>, and <TT>%g</TT>
for <TT>float</TT>s <em>and</em> <TT>double</TT>s,
though all three request different formats.
Furthermore,
<TT>%le</TT>, <TT>%lf</TT>, and <TT>%lg</TT>
were technically incorrect for <TT>printf</TT> under the original ANSI C standard,
though most compilers
accept them, and C99 and later define the <TT>l</TT> as having no effect there.)
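</p><p>The pairing of format and pointer type can be sketched as follows; <TT>parse_float</TT> and <TT>parse_double</TT> are names invented for the example.
Mixing them up (say, <TT>%f</TT> with a <TT>double *</TT>) is an error the compiler may or may not warn about.

```c
#include <stdio.h>

/* Hypothetical helpers, not from the book.
   Each returns 1 if a number was converted. */

int parse_float(const char *s, float *f)
{
    return sscanf(s, "%f", f);      /* %f expects a float * */
}

int parse_double(const char *s, double *d)
{
    return sscanf(s, "%lf", d);     /* %lf expects a double * */
}
```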
</p><p>More precisely,
the reason that you don't need to use a <TT>&amp;</TT> with <TT>monthname</TT>
is that an array,
when it appears in an expression like this,
is automatically converted to a pointer.
</p><p>The dual-format date conversion example in the middle of page 159
is a nice example of the advantages of calling
<TT>getline</TT> and then <TT>sscanf</TT>.
At the beginning of this section,
I said that
``when <TT>sscanf</TT> fails,
you have complete control over what you do next.''
Here, ``what you do next''
is try calling <TT>sscanf</TT> again,
on the very same input string
(thus effectively backing up to the very beginning of it),
using a different format string,
to try parsing the input a different way.
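</p><p>The retry idea might be packaged as a function along these lines.
This is a sketch in the spirit of the book's example rather than a copy of it: the name <TT>classify_date</TT> and its return codes are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper, not from the book: decides which of two date
   layouts a line uses.
   Returns 1 for the "25 Dec 1988" form, 2 for the "12/25/88" form,
   and 0 if neither format matches. */
int classify_date(const char *line)
{
    int day, month, year;
    char monthname[20];

    if (sscanf(line, "%d %19s %d", &day, monthname, &year) == 3)
        return 1;
    /* the first attempt failed, but line is untouched,
       so a second attempt starts again from its beginning */
    if (sscanf(line, "%d/%d/%d", &month, &day, &year) == 3)
        return 2;
    return 0;
}
```

With plain <TT>scanf</TT> the second attempt would begin wherever the first one gave up, which is rarely where you want it.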
</p><p>
<a href="sx10c.html" rev=precedes>prev</a>
<a href="sx10e.html" rel=precedes>next</a>
<a href="sx10.html" rev=subdocument>up</a>
<a href="top.html">top</a>
</p><p>
This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
// <a href="copyright.html">Copyright</a> 1995, 1996
// <a href="mailto:scs@eskimo.com">mail feedback</a>
</p>