1 <chapter xmlns="http://docbook.org/ns/docbook" version="5.0"
2 xml:id="std.io" xreflabel="Input and Output">
3 <?dbhtml filename="io.html"?>
7 <indexterm><primary>Input and Output</primary></indexterm>
10 <keyword>ISO C++</keyword>
11 <keyword>library</keyword>
17 <!-- Sect1 01 : Iostream Objects -->
18 <section xml:id="std.io.objects" xreflabel="IO Objects"><info><title>Iostream Objects</title></info>
19 <?dbhtml filename="iostream_objects.html"?>
22 <para>To minimize the time you have to wait on the compiler, it's good to
23 only include the headers you really need. Many people simply include
24 <filename class="headerfile"><iostream></filename> when they don't
25 need to -- and that can <emphasis>penalize your runtime as well.</emphasis>
26 Here are some tips on which header to use
27 for which situations, starting with the simplest.
<para><emphasis><filename class="headerfile"><iosfwd></filename></emphasis>
should be included whenever you simply need the <emphasis>name</emphasis>
of an I/O-related class, such as "<classname>ofstream</classname>" or
"<classname>basic_streambuf</classname>".
As the name implies, these are forward declarations.
(A word to all you fellow old school programmers:
trying to forward declare classes like "<code>class istream;</code>"
won't work, because those names are typedefs for specializations of
class templates.
Look in the <filename class="headerfile"><iosfwd></filename> header
if you'd like to see the details.) For example,
</para>
<programlisting>
#include <iosfwd>
class MyClass
{
    std::ifstream& input_file;   // a reference member only needs the name
};
extern std::ostream& operator<< (std::ostream&, MyClass&);
</programlisting>
<para><emphasis><filename class="headerfile"><ios></filename></emphasis>
declares the base classes for the entire I/O stream hierarchy,
<classname>std::ios_base</classname> and <classname>std::basic_ios<charT></classname>,
the counting types <type>std::streamoff</type> and <type>std::streamsize</type>,
the file positioning type <type>std::fpos</type>,
and the various manipulators such as <function>std::hex</function>,
<function>std::fixed</function>, and <function>std::noshowbase</function>.
</para>
<para>The <classname>ios_base</classname> class is what holds the format
flags, the state flags, and the functions which change them
(<function>setf()</function>, <function>width()</function>,
<function>precision()</function>, etc).
You can also store extra data and register callback functions
through <classname>ios_base</classname>, but that has been historically
underused. Anything
which doesn't depend on the type of characters stored is consolidated
here.
</para>
<para>The class template <classname>basic_ios</classname> is the highest
class template in the
hierarchy; it is the first one depending on the character type, and
holds all general state associated with that type: the pointer to the
polymorphic stream buffer, the facet information, etc.
</para>
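<para>As a rough illustration of what lives in
<classname>ios_base</classname>, the sketch below sets format flags through
a plain <code>std::ios_base&</code> and reserves a slot of extra
per-stream data with <function>xalloc()</function> and
<function>iword()</function>. The helper names
(<function>use_hex_with_prefix</function>, <function>user_slot</function>,
<function>format_hex</function>) are invented for this example.</para>

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: changes formatting state through the ios_base
// interface only, so it does not care about the stream's character type.
void use_hex_with_prefix(std::ios_base& base)
{
    base.setf(std::ios_base::hex, std::ios_base::basefield);
    base.setf(std::ios_base::showbase);
}

// Hypothetical helper: xalloc() hands out a program-wide unique index,
// which can then be passed to iword() on any stream.
int user_slot()
{
    static const int index = std::ios_base::xalloc();
    return index;
}

// Hypothetical helper: formats one integer using the settings above.
std::string format_hex(int value)
{
    std::ostringstream os;
    use_hex_with_prefix(os);   // an ostringstream is also an ios_base
    os.width(8);               // applies to the next formatted output only
    os << value;
    return os.str();
}
```

<para>Each stream keeps its own copy of the extra slot, so
<code>a.iword(user_slot()) = 42;</code> on one stream leaves the slot of
every other stream at its initial value of zero.</para>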
76 <para><emphasis><filename class="headerfile"><streambuf></filename></emphasis>
77 declares the class template <classname>basic_streambuf</classname>, and
78 two standard instantiations, <type>streambuf</type> and
79 <type>wstreambuf</type>. If you need to work with the vastly useful and
80 capable stream buffer classes, e.g., to create a new form of storage
81 transport, this header is the one to include.
83 <para><emphasis><filename class="headerfile"><istream></filename></emphasis>
84 and <emphasis><filename class="headerfile"><ostream></filename></emphasis>
85 are the headers to include when you are using the overloaded
86 <code>>></code> and <code><<</code> operators,
87 or any of the other abstract stream formatting functions.
<programlisting>
#include <ostream>

std::ostream& operator<< (std::ostream& os, MyClass& c)
{
    return os << c.data1() << c.data2();
}
</programlisting>
98 <para>The <type>std::istream</type> and <type>std::ostream</type> classes
99 are the abstract parents of
100 the various concrete implementations. If you are only using the
101 interfaces, then you only need to use the appropriate interface header.
103 <para><emphasis><filename class="headerfile"><iomanip></filename></emphasis>
104 provides "extractors and inserters that alter information maintained by
105 class <classname>ios_base</classname> and its derived classes,"
106 such as <function>std::setprecision</function> and
107 <function>std::setw</function>. If you need
108 to write expressions like <code>os << setw(3);</code> or
109 <code>is >> setbase(8);</code>, you must include
110 <filename class="headerfile"><iomanip></filename>.
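<para>A small sketch of both directions, using string streams so the
result is easy to inspect. The helper names
<function>format_row</function> and <function>parse_octal</function> are
assumptions made up for this example.</para>

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: prints a label and a value in two 8-character columns.
std::string format_row(const std::string& label, double value)
{
    std::ostringstream os;
    os << std::left  << std::setw(8) << label
       << std::right << std::fixed << std::setprecision(2)
       << std::setw(8) << value;
    return os.str();
}

// Hypothetical helper: reads an integer written in octal notation.
int parse_octal(const std::string& text)
{
    std::istringstream is(text);
    int value = 0;
    is >> std::setbase(8) >> value;
    return value;
}
```

<para>Note that <function>std::setw</function> only affects the next
formatted item, which is why it appears once per column, while the
precision and the base stay in effect until changed.</para>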
112 <para><emphasis><filename class="headerfile"><sstream></filename></emphasis>
113 and <emphasis><filename class="headerfile"><fstream></filename></emphasis>
114 declare the six stringstream and fstream classes. As they are the
115 standard concrete descendants of <type>istream</type> and <type>ostream</type>,
116 you will already know about them.
118 <para>Finally, <emphasis><filename class="headerfile"><iostream></filename></emphasis>
119 provides the eight standard global objects
120 (<code>cin</code>, <code>cout</code>, etc). To do this correctly, this
121 header also provides the contents of the
122 <filename class="headerfile"><istream></filename> and
123 <filename class="headerfile"><ostream></filename>
124 headers, but nothing else. The contents of this header look like:
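<programlisting>
#include <ostream>
#include <istream>
namespace std
{
    // declarations of cin, cout, and the other six stream objects
    // this is explained below
    <emphasis>static ios_base::Init __foo;</emphasis>    // not its real name
}
</programlisting>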
140 <para>Now, the runtime penalty mentioned previously: the global objects
141 must be initialized before any of your own code uses them; this is
142 guaranteed by the standard. Like any other global object, they must
143 be initialized once and only once. This is typically done with a
144 construct like the one above, and the nested class
145 <classname>ios_base::Init</classname> is
146 specified in the standard for just this reason.
<para>How does it work? Because the header is included before any of your
code, the <emphasis>__foo</emphasis> object is constructed before any of
your objects. (Within a single translation unit, global objects are built
in the order in which they are defined, and destroyed in reverse order.)
The first time the constructor runs, the eight stream objects are set up.
</para>
154 <para>The <code>static</code> keyword means that each object file compiled
155 from a source file containing
156 <filename class="headerfile"><iostream></filename> will have its own
157 private copy of <emphasis>__foo</emphasis>. There is no specified order
158 of construction across object files (it's one of those pesky NP complete
159 problems that make life so interesting), so one copy in each object
160 file means that the stream objects are guaranteed to be set up before
161 any of your code which uses them could run, thereby meeting the
162 requirements of the standard.
164 <para>The penalty, of course, is that after the first copy of
165 <emphasis>__foo</emphasis> is constructed, all the others are just wasted
166 processor time. The time spent is merely for an increment-and-test
167 inside a function call, but over several dozen or hundreds of object
168 files, that time can add up. (It's not in a tight loop, either.)
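<para>The increment-and-test can be pictured with the sketch below. It is
a made-up stand-in (<classname>StreamInit</classname> is not a library
name) and only shows the counting idiom, not how libstdc++ actually
implements <classname>ios_base::Init</classname>.</para>

```cpp
// Hypothetical stand-in for ios_base::Init. Every translation unit that
// includes the header would get its own static instance, but only the
// first constructor call (and the last destructor call) does real work.
class StreamInit
{
public:
    StreamInit()
    {
        if (live_count++ == 0)   // the increment-and-test from the text
            ++setup_runs;        // placeholder for setting up cin, cout, ...
    }

    ~StreamInit()
    {
        if (--live_count == 0)
            ++teardown_runs;     // placeholder for flushing the streams
    }

    static int live_count;
    static int setup_runs;
    static int teardown_runs;
};

int StreamInit::live_count = 0;
int StreamInit::setup_runs = 0;
int StreamInit::teardown_runs = 0;
```

<para>Constructing three of these objects runs the setup branch once; the
other two constructor calls are exactly the wasted work described
above.</para>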
<para>The lesson? Only include
<filename class="headerfile"><iostream></filename> when you need
to use one of
the standard objects in that source file; you'll pay less startup
time. Only include the header files you need to in general; your
compile times will go down when there's less parsing work to do.
</para>
180 <!-- Sect1 02 : Stream Buffers -->
181 <section xml:id="std.io.streambufs" xreflabel="Stream Buffers"><info><title>Stream Buffers</title></info>
182 <?dbhtml filename="streambufs.html"?>
185 <section xml:id="io.streambuf.derived" xreflabel="Derived streambuf Classes"><info><title>Derived streambuf Classes</title></info>
<para>Creating your own stream buffers for I/O can be remarkably easy.
If you are interested in doing so, we highly recommend two
excellent books:
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://angelikalanger.com/iostreams.html">Standard C++
IOStreams and Locales</link> by Langer and Kreft, ISBN 0-201-18395-1, and
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.josuttis.com/libbook/">The C++ Standard Library</link>
by Nicolai Josuttis, ISBN 0-201-37926-0. Both are published by
Addison-Wesley, who isn't paying us a cent for saying that, honest.
</para>
199 <para>Here is a simple example, io/outbuf1, from the Josuttis text. It
200 transforms everything sent through it to uppercase. This version
201 assumes many things about the nature of the character type being
202 used (for more information, read the books or the newsgroups):
<programlisting>
#include <iostream>
#include <streambuf>
#include <locale>
#include <cstdio>

class outbuf : public std::streambuf
{
  protected:
    /* central output function
     * - print characters in uppercase mode
     */
    virtual int_type overflow (int_type c) {
        if (c != EOF) {
            // convert lowercase to uppercase
            c = std::toupper(static_cast<char>(c),getloc());
            // and write the character to the standard output
            if (putchar(c) == EOF) {
                return EOF;
            }
        }
        return c;
    }
};

int main()
{
    // create special output buffer
    outbuf ob;
    // initialize output stream with that output buffer
    std::ostream out(&ob);
    out << "31 hexadecimal: "
        << std::hex << 31 << std::endl;
}
</programlisting>
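<para>If you would rather check the effect than watch it scroll by, here
is a variation that collects the converted characters in a
<classname>std::string</classname> instead of calling
<function>putchar</function>. This is our own sketch, not code from the
Josuttis text; <classname>upper_stringbuf</classname> and
<function>to_upper_via_stream</function> are made-up names, and it makes
the same assumptions about the character type as the example above.</para>

```cpp
#include <ios>
#include <locale>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical buffer: uppercases every character and stores it in a string.
class upper_stringbuf : public std::streambuf
{
public:
    const std::string& str() const { return text_; }

protected:
    // With no put area set up, every character arrives here one at a time.
    int_type overflow(int_type c) override
    {
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            text_ += std::toupper(traits_type::to_char_type(c), getloc());
        return traits_type::not_eof(c);
    }

private:
    std::string text_;
};

// Hypothetical helper: formats text and a hexadecimal number through the buffer.
std::string to_upper_via_stream(const std::string& text, int number)
{
    upper_stringbuf buf;
    std::ostream out(&buf);
    out << text << std::hex << number;
    return buf.str();
}
```

<para>Because the conversion happens in the buffer, the digits produced by
the <code>std::hex</code> formatting are uppercased too.</para>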
<para>Try it yourself! More examples can be found in 3.1.x code, in
<filename>include/ext/*_filebuf.h</filename>, and in the article
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gabisoft.free.fr/articles/fltrsbf1.html">Filtering
Streambufs</link> by James Kanze.
</para>
251 <section xml:id="io.streambuf.buffering" xreflabel="Buffering"><info><title>Buffering</title></info>
253 <para>First, are you sure that you understand buffering? Particularly
254 the fact that C++ may not, in fact, have anything to do with it?
256 <para>The rules for buffering can be a little odd, but they aren't any
257 different from those of C. (Maybe that's why they can be a bit
258 odd.) Many people think that writing a newline to an output
259 stream automatically flushes the output buffer. This is true only
260 when the output stream is, in fact, a terminal and not a file
261 or some other device -- and <emphasis>that</emphasis> may not even be true
262 since C++ says nothing about files nor terminals. All of that is
263 system-dependent. (The "newline-buffer-flushing only occurring
264 on terminals" thing is mostly true on Unix systems, though.)
266 <para>Some people also believe that sending <code>endl</code> down an
267 output stream only writes a newline. This is incorrect; after a
268 newline is written, the buffer is also flushed. Perhaps this
269 is the effect you want when writing to a screen -- get the text
270 out as soon as possible, etc -- but the buffering is largely
271 wasted when doing this to a file:
<programlisting>
output << "a line of text" << endl;
output << some_data_variable << endl;
output << "another line of text" << endl; </programlisting>
<para>The proper thing to do in this case is to just write the data out
and let the libraries and the system worry about the buffering.
If you need a newline, just write a newline:
</para>
<programlisting>
output << "a line of text\n"
       << some_data_variable << '\n'
       << "another line of text\n"; </programlisting>
285 <para>I have also joined the output statements into a single statement.
286 You could make the code prettier by moving the single newline to
287 the start of the quoted text on the last line, for example.
<para>If you do need to flush the buffer above, you can send an
<code>endl</code> if you also need a newline, or just flush the buffer
yourself:
</para>
<programlisting>
output << ...... << flush;    // can use std::flush manipulator
output.flush();               // or call a member fn </programlisting>
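<para>To see the difference between <code>endl</code> and a plain newline
without involving a real file, one can count how often the stream asks its
buffer to synchronize. The sketch below does that with a made-up buffer
class (<classname>counting_buf</classname>); the two helper functions are
likewise only for illustration.</para>

```cpp
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical buffer: stores characters and counts flush requests.
class counting_buf : public std::streambuf
{
public:
    int flushes() const { return flushes_; }
    const std::string& text() const { return text_; }

protected:
    int_type overflow(int_type c) override
    {
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            text_ += traits_type::to_char_type(c);
        return traits_type::not_eof(c);
    }

    // Called by ostream::flush(), and therefore by std::endl and std::flush.
    int sync() override { ++flushes_; return 0; }

private:
    std::string text_;
    int flushes_ = 0;
};

int flushes_using_endl()
{
    counting_buf buf;
    std::ostream out(&buf);
    out << "a line of text" << std::endl
        << "another line of text" << std::endl;
    return buf.flushes();
}

int flushes_using_newline()
{
    counting_buf buf;
    std::ostream out(&buf);
    out << "a line of text\n"
        << "another line of text\n";
    return buf.flushes();
}
```

<para>With a real <classname>filebuf</classname> each of those flush
requests can turn into a system call, which is where the wasted
buffering comes from.</para>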
296 <para>On the other hand, there are times when writing to a file should
297 be like writing to standard error; no buffering should be done
298 because the data needs to appear quickly (a prime example is a
299 log file for security-related information). The way to do this is
300 just to turn off the buffering <emphasis>before any I/O operations at
301 all</emphasis> have been done (note that opening counts as an I/O operation):
<programlisting>
#include <fstream>
std::ofstream    os;
std::ifstream    is;
int   i;

os.rdbuf()->pubsetbuf(0,0);
is.rdbuf()->pubsetbuf(0,0);
os.open("/foo/bar/baz");
is.open("/qux/quux/quuux");

os << "this data is written immediately\n";
is >> i;   // and this will probably cause a disk read </programlisting>
316 <para>Since all aspects of buffering are handled by a streambuf-derived
317 member, it is necessary to get at that member with <code>rdbuf()</code>.
318 Then the public version of <code>setbuf</code> can be called. The
319 arguments are the same as those for the Standard C I/O Library
320 function (a buffer area followed by its size).
322 <para>A great deal of this is implementation-dependent. For example,
323 <code>streambuf</code> does not specify any actions for its own
324 <code>setbuf()</code>-ish functions; the classes derived from
325 <code>streambuf</code> each define behavior that "makes
326 sense" for that class: an argument of (0,0) turns off buffering
327 for <code>filebuf</code> but does nothing at all for its siblings
328 <code>stringbuf</code> and <code>strstreambuf</code>, and specifying
329 anything other than (0,0) has varying effects.
330 User-defined classes derived from <code>streambuf</code> can
331 do whatever they want. (For <code>filebuf</code> and arguments for
332 <code>(p,s)</code> other than zeros, libstdc++ does what you'd expect:
333 the first <code>s</code> bytes of <code>p</code> are used as a buffer,
334 which you must allocate and deallocate.)
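<para>For the non-zero case, a sketch along the following lines shows the
ordering and lifetime rules: the storage is created first, handed to the
<classname>filebuf</classname> before the file is opened, and destroyed
only after the stream is gone. The helper name
<function>write_with_own_buffer</function> and its parameters are
assumptions for this example, and what the buffer is actually used for
remains up to the implementation, as described above.</para>

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: writes text to path through a caller-sized buffer.
bool write_with_own_buffer(const std::string& path, const std::string& text,
                           std::size_t buffer_size)
{
    // Declared before the stream, so it is destroyed after the stream.
    std::vector<char> storage(buffer_size);

    std::ofstream os;
    // Hand over the buffer before open(), i.e. before any I/O operation.
    os.rdbuf()->pubsetbuf(storage.data(),
                          static_cast<std::streamsize>(storage.size()));
    os.open(path.c_str());
    os << text;
    os.close();
    return !os.fail();
}
```

<para>A call such as
<code>write_with_own_buffer("demo.txt", "hello\n", 4096)</code> would write
the line through a 4096-byte buffer owned by the function.</para>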
336 <para>A last reminder: there are usually more buffers involved than
337 just those at the language/library level. Kernel buffers, disk
338 buffers, and the like will also have an effect. Inspecting and
339 changing those are system-dependent.
345 <!-- Sect1 03 : Memory-based Streams -->
346 <section xml:id="std.io.memstreams" xreflabel="Memory Streams"><info><title>Memory Based Streams</title></info>
347 <?dbhtml filename="stringstreams.html"?>
349 <section xml:id="std.io.memstreams.compat" xreflabel="Compatibility strstream"><info><title>Compatibility With strstream</title></info>
353 <para>Stringstreams (defined in the header <code><sstream></code>)
354 are in this author's opinion one of the coolest things since
355 sliced time. An example of their use is in the Received Wisdom
356 section for Sect1 21 (Strings),
357 <link linkend="strings.string.Cstring"> describing how to
358 format strings</link>.
<para>The quick definition is: they are siblings of ifstream and ofstream,
and they do for <code>std::string</code> what their siblings do for
files. All that work you put into writing <code><<</code> and
<code>>></code> functions for your classes now pays off
<emphasis>again!</emphasis> Need to format a string before passing the string
to a function? Send your stuff via <code><<</code> to an
ostringstream. You've read a string as input and need to parse it?
Initialize an istringstream with that string, and then pull pieces
out of it with <code>>></code>. Have a stringstream and need to
get a copy of the string inside? Just call the <code>str()</code>
member function.
</para>
372 <para>This only works if you've written your
373 <code><<</code>/<code>>></code> functions correctly, though,
374 and correctly means that they take istreams and ostreams as
375 parameters, not i<emphasis>f</emphasis>streams and o<emphasis>f</emphasis>streams. If they
376 take the latter, then your I/O operators will work fine with
377 file streams, but with nothing else -- including stringstreams.
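<para>Here is a minimal sketch of operators written the "correct" way and
then reused with string streams. The <classname>Point</classname> type and
the helpers <function>to_text</function> and
<function>from_text</function> are invented for this example.</para>

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical value type used only for this illustration.
struct Point
{
    int x;
    int y;
};

// Takes the base class std::ostream, so any output stream will do.
std::ostream& operator<<(std::ostream& os, const Point& p)
{
    return os << p.x << ' ' << p.y;
}

// Takes the base class std::istream, so any input stream will do.
std::istream& operator>>(std::istream& is, Point& p)
{
    return is >> p.x >> p.y;
}

// Formats a Point into a string through an ostringstream.
std::string to_text(const Point& p)
{
    std::ostringstream os;
    os << p;
    return os.str();
}

// Parses a Point out of a string; returns false if the text does not fit.
bool from_text(const std::string& text, Point& p)
{
    std::istringstream is(text);
    return static_cast<bool>(is >> p);
}
```

<para>The same two operators keep working unchanged with
<classname>ofstream</classname>, <classname>ifstream</classname>, and the
standard stream objects.</para>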
379 <para>If you are a user of the strstream classes, you need to update
380 your code. You don't have to explicitly append <code>ends</code> to
381 terminate the C-style character array, you don't have to mess with
382 "freezing" functions, and you don't have to manage the
383 memory yourself. The strstreams have been officially deprecated,
384 which means that 1) future revisions of the C++ Standard won't
385 support them, and 2) if you use them, people will laugh at you.
392 <!-- Sect1 04 : File-based Streams -->
393 <section xml:id="std.io.filestreams" xreflabel="File Streams"><info><title>File Based Streams</title></info>
394 <?dbhtml filename="fstreams.html"?>
397 <section xml:id="std.io.filestreams.copying_a_file" xreflabel="Copying a File"><info><title>Copying a File</title></info>
402 <para>So you want to copy a file quickly and easily, and most important,
403 completely portably. And since this is C++, you have an open
404 ifstream (call it IN) and an open ofstream (call it OUT):
<programlisting>
#include <fstream>

std::ifstream  IN ("input_file");
std::ofstream  OUT ("output_file"); </programlisting>
411 <para>Here's the easiest way to get it completely wrong:
<programlisting>
OUT << IN;</programlisting>
<para>For those of you who don't already know why this doesn't work
(probably from having done it before), I invite you to quickly
create a simple text file called "input_file" containing
the sentence
</para>
<programlisting>
The quick brown fox jumped over the lazy dog.</programlisting>
422 <para>surrounded by blank lines. Code it up and try it. The contents
423 of "output_file" may surprise you.
425 <para>Seriously, go do it. Get surprised, then come back. It's worth it.
427 <para>The thing to remember is that the <code>basic_[io]stream</code> classes
428 handle formatting, nothing else. In particular, they break up on
429 whitespace. The actual reading, writing, and storing of data is
430 handled by the <code>basic_streambuf</code> family. Fortunately, the
431 <code>operator<<</code> is overloaded to take an ostream and
432 a pointer-to-streambuf, in order to help with just this kind of
433 "dump the data verbatim" situation.
<para>Why a <emphasis>pointer</emphasis> to streambuf and not just a streambuf? Well,
the [io]streams hold pointers (or references, depending on the
implementation) to their buffers, not the actual
buffers. This allows polymorphic behavior on the part of the buffers
as well as the streams themselves. The pointer is easily retrieved
using the <code>rdbuf()</code> member function. Therefore, the easiest
way to copy the file is:
</para>
<programlisting>
OUT << IN.rdbuf();</programlisting>
<para>So what <emphasis>was</emphasis> happening with OUT<<IN? Nothing
you can rely on, since that particular << isn't defined by the Standard.
I have seen instances where it is implemented, but the character
extraction process removes all the whitespace, leaving you with no
blank lines and only "Thequickbrownfox...". With
libraries that do not define that operator, IN (or one of IN's
member pointers) sometimes gets converted to a void*, and the output
file then contains a perfect text representation of a hexadecimal
address (quite a big surprise). Others don't compile at all.
</para>
<para>Also note that none of this is specific to o<emphasis>f</emphasis>streams.
The operators shown above are all defined in the parent
basic_ostream class and are therefore available with all possible
descendants.
</para>
463 <section xml:id="std.io.filestreams.binary" xreflabel="Binary Input and Output"><info><title>Binary Input and Output</title></info>
467 <para>The first and most important thing to remember about binary I/O is
468 that opening a file with <code>ios::binary</code> is not, repeat
469 <emphasis>not</emphasis>, the only thing you have to do. It is not a silver
470 bullet, and will not allow you to use the <code><</>></code>
471 operators of the normal fstreams to do binary I/O.
473 <para>Sorry. Them's the breaks.
<para>This isn't going to try to be a complete tutorial on reading and
writing binary files (because "binary"
covers a lot of ground), but we will try to clear
up a couple of misconceptions and common errors.
</para>
<para>First, <code>ios::binary</code> has exactly one defined effect, no more
and no less. Normal text mode has to be concerned with the newline
characters, and the runtime system will translate between (for
example) '\n' and the appropriate end-of-line sequence (LF on Unix,
CRLF on DOS, CR on Macintosh, etc). (There are other things that
normal mode does, but that's the most obvious.) Opening a file in
binary mode disables this conversion, so reading a CRLF sequence
under Windows won't accidentally get mapped to a '\n' character, etc.
Binary mode is not supposed to suddenly give you a bitstream, and
if it is doing so in your program then you've discovered a bug in
your vendor's compiler (or some other part of the C++ implementation,
possibly the runtime system).
</para>
<para>Second, using <code><<</code> to write and <code>>></code> to
read isn't going to work with the standard file stream classes, even
if you use <code>noskipws</code> during reading. Why not? Because
ifstream and ofstream exist for the purpose of <emphasis>formatting</emphasis>,
not reading and writing. Their job is to interpret the data into
text characters, and that's exactly what you don't want to happen
during binary I/O.
</para>
<para>Third, using the <code>get()</code> and <code>put()/write()</code> member
functions still isn't guaranteed to help you. These are
"unformatted" I/O functions, but still character-based.
(This may or may not be what you want, see below.)
</para>
506 <para>Notice how all the problems here are due to the inappropriate use
507 of <emphasis>formatting</emphasis> functions and classes to perform something
508 which <emphasis>requires</emphasis> that formatting not be done? There are a
509 seemingly infinite number of solutions, and a few are listed here:
513 <para><quote>Derive your own fstream-type classes and write your own
514 <</>> operators to do binary I/O on whatever data
515 types you're using.</quote>
518 This is a Bad Thing, because while
519 the compiler would probably be just fine with it, other humans
520 are going to be confused. The overloaded bitshift operators
521 have a well-defined meaning (formatting), and this breaks it.
<quote>Build the file structure in memory, then
<code>mmap()</code> the file and copy the
structure.</quote>
533 pretty equivalent to using <code>::read()</code> and
534 <code>::write()</code> directly, and makes no use of the
535 iostream library at all...
540 <quote>Use streambufs, that's what they're there for.</quote>
543 While not trivial for the beginner, this is the best of all
544 solutions. The streambuf/filebuf layer is the layer that is
545 responsible for actual I/O. If you want to use the C++
546 library for binary I/O, this is where you start.
550 <para>How to go about using streambufs is a bit beyond the scope of this
551 document (at least for now), but while streambufs go a long way,
552 they still leave a couple of things up to you, the programmer.
553 As an example, byte ordering is completely between you and the
554 operating system, and you have to handle it yourself.
556 <para>Deriving a streambuf or filebuf
557 class from the standard ones, one that is specific to your data
558 types (or an abstraction thereof) is probably a good idea, and
559 lots of examples exist in journals and on Usenet. Using the
560 standard filebufs directly (either by declaring your own or by
561 using the pointer returned from an fstream's <code>rdbuf()</code>)
562 is certainly feasible as well.
564 <para>One area that causes problems is trying to do bit-by-bit operations
565 with filebufs. C++ is no different from C in this respect: I/O
566 must be done at the byte level. If you're trying to read or write
567 a few bits at a time, you're going about it the wrong way. You
568 must read/write an integral number of bytes and then process the
569 bytes. (For example, the streambuf functions take and return
570 variables of type <code>int_type</code>.)
572 <para>Another area of problems is opening text files in binary mode.
573 Generally, binary mode is intended for binary files, and opening
574 text files in binary mode means that you now have to deal with all of
575 those end-of-line and end-of-file problems that we mentioned before.
<para>
An instructive thread from comp.lang.c++.moderated delved into
this topic starting more or less at
<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://groups.google.com/forum/#!topic/comp.std.c++/D4e0q9eVSoc">this post</link>
and continuing to the end of the thread. (The subject heading is "binary iostreams" on both comp.std.c++
and comp.lang.c++.moderated.) Take special note of the replies by James Kanze and Dietmar Kühl.
</para>
584 <para>Briefly, the problems of byte ordering and type sizes mean that
585 the unformatted functions like <code>ostream::put()</code> and
586 <code>istream::get()</code> cannot safely be used to communicate
587 between arbitrary programs, or across a network, or from one
588 invocation of a program to another invocation of the same program
589 on a different platform, etc.
<!-- Sect1 05 : Interacting with C -->
596 <section xml:id="std.io.c" xreflabel="Interacting with C"><info><title>Interacting with C</title></info>
597 <?dbhtml filename="io_and_c.html"?>
601 <section xml:id="std.io.c.FILE" xreflabel="Using FILE* and file descriptors"><info><title>Using FILE* and file descriptors</title></info>
604 See the <link linkend="manual.ext.io">extensions</link> for using
605 <type>FILE</type> and <type>file descriptors</type> with
606 <classname>ofstream</classname> and
607 <classname>ifstream</classname>.
611 <section xml:id="std.io.c.sync" xreflabel="Performance Issues"><info><title>Performance</title></info>
614 Pathetic Performance? Ditch C.
616 <para>It sounds like a flame on C, but it isn't. Really. Calm down.
617 I'm just saying it to get your attention.
619 <para>Because the C++ library includes the C library, both C-style and
620 C++-style I/O have to work at the same time. For example:
<programlisting>
#include <iostream>
#include <cstdio>

int main()
{
    std::cout << "Hel";
    std::printf ("lo, worl");
    std::cout << "d!\n";
}
</programlisting>
630 <para>This must do what you think it does.
632 <para>Alert members of the audience will immediately notice that buffering
633 is going to make a hash of the output unless special steps are taken.
635 <para>The special steps taken by libstdc++, at least for version 3.0,
636 involve doing very little buffering for the standard streams, leaving
637 most of the buffering to the underlying C library. (This kind of
638 thing is tricky to get right.)
639 The upside is that correctness is ensured. The downside is that
640 writing through <code>cout</code> can quite easily lead to awful
641 performance when the C++ I/O library is layered on top of the C I/O
642 library (as it is for 3.0 by default). Some patches have been applied
643 which improve the situation for 3.1.
645 <para>However, the C and C++ standard streams only need to be kept in sync
646 when both libraries' facilities are in use. If your program only uses
647 C++ I/O, then there's no need to sync with the C streams. The right
648 thing to do in this case is to call
<programlisting>
#include <emphasis>any of the I/O headers such as ios, iostream, etc</emphasis>

std::ios::sync_with_stdio(false);
</programlisting>
655 <para>You must do this before performing any I/O via the C++ stream objects.
656 Once you call this, the C++ streams will operate independently of the
657 (unused) C streams. For GCC 3.x, this means that <code>cout</code> and
658 company will become fully buffered on their own.
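<para>In a complete program the call would usually be the very first
statement of <function>main</function>. The sketch below wraps it in a
helper (the name <function>use_cpp_streams_only</function> is made up) so
that the return value, which reports the previous setting, is
visible.</para>

```cpp
#include <ios>
#include <iostream>

// Hypothetical helper, meant to be called before any I/O is performed.
// sync_with_stdio returns the setting that was in effect before the call:
// true the first time, false once synchronization has been switched off.
bool use_cpp_streams_only()
{
    return std::ios::sync_with_stdio(false);
}
```

<para>After the call, mixing <code>std::printf</code> and
<code>std::cout</code> in the same program may interleave the output in
an unexpected order, so only do this when the C streams really are
unused.</para>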
<para>Note, by the way, that the synchronization requirement only applies to
the standard streams (<code>cin</code>, <code>cout</code>,
<code>cerr</code>,
<code>clog</code>, and their wide-character counterparts). File stream
objects that you declare yourself have no such requirement and are fully
buffered.
</para>