1 <?xml version="1.0" encoding="UTF-8"?>
2 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
3 "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [
4 <!ENTITY procfsexample SYSTEM "procfs_example.xml">
5 ]>
7 <book id="LKProcfsGuide">
8 <bookinfo>
9 <title>Linux Kernel Procfs Guide</title>
11 <authorgroup>
12 <author>
13 <firstname>Erik</firstname>
14 <othername>(J.A.K.)</othername>
15 <surname>Mouw</surname>
16 <affiliation>
17 <orgname>Delft University of Technology</orgname>
18 <orgdiv>Faculty of Information Technology and Systems</orgdiv>
19 <address>
20 <email>J.A.K.Mouw@its.tudelft.nl</email>
21 <pob>PO BOX 5031</pob>
22 <postcode>2600 GA</postcode>
23 <city>Delft</city>
24 <country>The Netherlands</country>
25 </address>
26 </affiliation>
27 </author>
28 </authorgroup>
30 <revhistory>
31 <revision>
32 <revnumber>1.0&nbsp;</revnumber>
33 <date>May 30, 2001</date>
34 <revremark>Initial revision posted to linux-kernel</revremark>
35 </revision>
36 <revision>
37 <revnumber>1.1&nbsp;</revnumber>
38 <date>June 3, 2001</date>
39 <revremark>Revised after comments from linux-kernel</revremark>
40 </revision>
41 </revhistory>
43 <copyright>
44 <year>2001</year>
45 <holder>Erik Mouw</holder>
46 </copyright>
49 <legalnotice>
50 <para>
51 This documentation is free software; you can redistribute it
52 and/or modify it under the terms of the GNU General Public
53 License as published by the Free Software Foundation; either
54 version 2 of the License, or (at your option) any later
55 version.
56 </para>
58 <para>
59 This documentation is distributed in the hope that it will be
60 useful, but WITHOUT ANY WARRANTY; without even the implied
61 warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
62 PURPOSE. See the GNU General Public License for more details.
63 </para>
65 <para>
66 You should have received a copy of the GNU General Public
67 License along with this program; if not, write to the Free
68 Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
69 MA 02111-1307 USA
70 </para>
72 <para>
73 For more details see the file COPYING in the source
74 distribution of Linux.
75 </para>
76 </legalnotice>
77 </bookinfo>
82 <toc>
83 </toc>
88 <preface id="Preface">
89 <title>Preface</title>
91 <para>
92 This guide describes the use of the procfs file system from
93 within the Linux kernel. The idea to write this guide came up on
94 the #kernelnewbies IRC channel (see <ulink
95 url="http://www.kernelnewbies.org/">http://www.kernelnewbies.org/</ulink>),
96 when Jeff Garzik explained the use of procfs and forwarded me a
97 message Alexander Viro wrote to the linux-kernel mailing list. I
98 agreed to write it up nicely, so here it is.
99 </para>
101 <para>
102 I'd like to thank Jeff Garzik
103 <email>jgarzik@pobox.com</email> and Alexander Viro
104 <email>viro@parcelfarce.linux.theplanet.co.uk</email> for their input,
105 Tim Waugh <email>twaugh@redhat.com</email> for his <ulink
106 url="http://people.redhat.com/twaugh/docbook/selfdocbook/">Selfdocbook</ulink>,
107 and Marc Joosen <email>marcj@historia.et.tudelft.nl</email> for
108 proofreading.
109 </para>
111 <para>
112 This documentation was written while working on the LART
113 computing board (<ulink
114 url="http://www.lart.tudelft.nl/">http://www.lart.tudelft.nl/</ulink>),
115 which is sponsored by the Mobile Multi-media Communications
116 (<ulink
117 url="http://www.mmc.tudelft.nl/">http://www.mmc.tudelft.nl/</ulink>)
118 and Ubiquitous Communications (<ulink
119 url="http://www.ubicom.tudelft.nl/">http://www.ubicom.tudelft.nl/</ulink>)
120 projects.
121 </para>
123 <para>
124 Erik
125 </para>
126 </preface>
131 <chapter id="intro">
132 <title>Introduction</title>
134 <para>
The <filename class="directory">/proc</filename> file system
(procfs) is a special file system in the Linux kernel. It is a
virtual file system: it is not associated with a block device
but exists only in memory. The files in procfs give userland
programs access to certain information from the kernel (such as
process information in
<filename class="directory">/proc/[0-9]+/</filename>), and some
exist for debugging purposes (such as
<filename>/proc/ksyms</filename>).
143 </para>
145 <para>
146 This guide describes the use of the procfs file system from
147 within the Linux kernel. It starts by introducing all relevant
148 functions to manage the files within the file system. After that
149 it shows how to communicate with userland, and some tips and
150 tricks will be pointed out. Finally a complete example will be
151 shown.
152 </para>
154 <para>
155 Note that the files in <filename
156 class="directory">/proc/sys</filename> are sysctl files: they
157 don't belong to procfs and are governed by a completely
158 different API described in the Kernel API book.
159 </para>
160 </chapter>
165 <chapter id="managing">
166 <title>Managing procfs entries</title>
168 <para>
169 This chapter describes the functions that various kernel
170 components use to populate the procfs with files, symlinks,
171 device nodes, and directories.
172 </para>
174 <para>
175 A minor note before we start: if you want to use any of the
176 procfs functions, be sure to include the correct header file!
177 This should be one of the first lines in your code:
178 </para>
<programlisting>
#include &lt;linux/proc_fs.h&gt;
</programlisting>
187 <sect1 id="regularfile">
188 <title>Creating a regular file</title>
190 <funcsynopsis>
191 <funcprototype>
192 <funcdef>struct proc_dir_entry* <function>create_proc_entry</function></funcdef>
193 <paramdef>const char* <parameter>name</parameter></paramdef>
194 <paramdef>mode_t <parameter>mode</parameter></paramdef>
195 <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
196 </funcprototype>
197 </funcsynopsis>
199 <para>
This function creates a regular file with the name
<parameter>name</parameter> and file mode
<parameter>mode</parameter> in the directory
<parameter>parent</parameter>. To create a file in the root of
procfs, use <constant>NULL</constant> as the
<parameter>parent</parameter> parameter. When successful, the
function returns a pointer to the freshly created
<structname>struct proc_dir_entry</structname>; otherwise it
returns <constant>NULL</constant>. <xref
linkend="userland"/> describes how to do something useful with
regular files.
211 </para>
213 <para>
Passing a path that spans multiple directories is explicitly
supported. For example,
<function>create_proc_entry</function>(<parameter>"drivers/via0/info"</parameter>)
will create the <filename class="directory">via0</filename>
directory if necessary, with standard
<constant>0755</constant> permissions.
220 </para>
222 <para>
223 If you only want to be able to read the file, the function
224 <function>create_proc_read_entry</function> described in <xref
225 linkend="convenience"/> may be used to create and initialise
226 the procfs entry in one single call.
227 </para>
228 </sect1>
233 <sect1 id="Creating_a_symlink">
234 <title>Creating a symlink</title>
236 <funcsynopsis>
237 <funcprototype>
238 <funcdef>struct proc_dir_entry*
239 <function>proc_symlink</function></funcdef> <paramdef>const
240 char* <parameter>name</parameter></paramdef>
241 <paramdef>struct proc_dir_entry*
242 <parameter>parent</parameter></paramdef> <paramdef>const
243 char* <parameter>dest</parameter></paramdef>
244 </funcprototype>
245 </funcsynopsis>
247 <para>
248 This creates a symlink in the procfs directory
249 <parameter>parent</parameter> that points from
250 <parameter>name</parameter> to
251 <parameter>dest</parameter>. This translates in userland to
252 <literal>ln -s</literal> <parameter>dest</parameter>
253 <parameter>name</parameter>.
254 </para>
255 </sect1>
257 <sect1 id="Creating_a_directory">
258 <title>Creating a directory</title>
260 <funcsynopsis>
261 <funcprototype>
262 <funcdef>struct proc_dir_entry* <function>proc_mkdir</function></funcdef>
263 <paramdef>const char* <parameter>name</parameter></paramdef>
264 <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
265 </funcprototype>
266 </funcsynopsis>
268 <para>
269 Create a directory <parameter>name</parameter> in the procfs
270 directory <parameter>parent</parameter>.
271 </para>
272 </sect1>
277 <sect1 id="Removing_an_entry">
278 <title>Removing an entry</title>
280 <funcsynopsis>
281 <funcprototype>
282 <funcdef>void <function>remove_proc_entry</function></funcdef>
283 <paramdef>const char* <parameter>name</parameter></paramdef>
284 <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
285 </funcprototype>
286 </funcsynopsis>
288 <para>
289 Removes the entry <parameter>name</parameter> in the directory
290 <parameter>parent</parameter> from the procfs. Entries are
291 removed by their <emphasis>name</emphasis>, not by the
292 <structname>struct proc_dir_entry</structname> returned by the
293 various create functions. Note that this function doesn't
294 recursively remove entries.
295 </para>
297 <para>
298 Be sure to free the <structfield>data</structfield> entry from
299 the <structname>struct proc_dir_entry</structname> before
300 <function>remove_proc_entry</function> is called (that is: if
301 there was some <structfield>data</structfield> allocated, of
302 course). See <xref linkend="usingdata"/> for more information
303 on using the <structfield>data</structfield> entry.
304 </para>
305 </sect1>
306 </chapter>
311 <chapter id="userland">
312 <title>Communicating with userland</title>
314 <para>
315 Instead of reading (or writing) information directly from
316 kernel memory, procfs works with <emphasis>call back
317 functions</emphasis> for files: functions that are called when
318 a specific file is being read or written. Such functions have
319 to be initialised after the procfs file is created by setting
320 the <structfield>read_proc</structfield> and/or
321 <structfield>write_proc</structfield> fields in the
322 <structname>struct proc_dir_entry*</structname> that the
323 function <function>create_proc_entry</function> returned:
324 </para>
<programlisting>
struct proc_dir_entry* entry;

/* entry is the pointer returned by create_proc_entry() */
entry->read_proc = read_proc_foo;
entry->write_proc = write_proc_foo;
</programlisting>
333 <para>
If you only want to use the
<structfield>read_proc</structfield>, the function
<function>create_proc_read_entry</function> described in <xref
linkend="convenience"/> may be used to create and initialise the
procfs entry in one single call.
339 </para>
343 <sect1 id="Reading_data">
344 <title>Reading data</title>
346 <para>
347 The read function is a call back function that allows userland
348 processes to read data from the kernel. The read function
349 should have the following format:
350 </para>
352 <funcsynopsis>
353 <funcprototype>
354 <funcdef>int <function>read_func</function></funcdef>
355 <paramdef>char* <parameter>buffer</parameter></paramdef>
356 <paramdef>char** <parameter>start</parameter></paramdef>
357 <paramdef>off_t <parameter>off</parameter></paramdef>
358 <paramdef>int <parameter>count</parameter></paramdef>
359 <paramdef>int* <parameter>peof</parameter></paramdef>
360 <paramdef>void* <parameter>data</parameter></paramdef>
361 </funcprototype>
362 </funcsynopsis>
364 <para>
365 The read function should write its information into the
366 <parameter>buffer</parameter>, which will be exactly
367 <literal>PAGE_SIZE</literal> bytes long.
368 </para>
370 <para>
371 The parameter
372 <parameter>peof</parameter> should be used to signal that the
373 end of the file has been reached by writing
374 <literal>1</literal> to the memory location
375 <parameter>peof</parameter> points to.
376 </para>
378 <para>
379 The <parameter>data</parameter>
380 parameter can be used to create a single call back function for
381 several files, see <xref linkend="usingdata"/>.
382 </para>
384 <para>
385 The rest of the parameters and the return value are described
386 by a comment in <filename>fs/proc/generic.c</filename> as follows:
387 </para>
389 <blockquote>
390 <para>
391 You have three ways to return data:
392 </para>
393 <orderedlist>
394 <listitem>
395 <para>
396 Leave <literal>*start = NULL</literal>. (This is the default.)
397 Put the data of the requested offset at that
398 offset within the buffer. Return the number (<literal>n</literal>)
399 of bytes there are from the beginning of the
400 buffer up to the last byte of data. If the
401 number of supplied bytes (<literal>= n - offset</literal>) is
402 greater than zero and you didn't signal eof
403 and the reader is prepared to take more data
404 you will be called again with the requested
405 offset advanced by the number of bytes
406 absorbed. This interface is useful for files
407 no larger than the buffer.
408 </para>
409 </listitem>
410 <listitem>
411 <para>
412 Set <literal>*start</literal> to an unsigned long value less than
413 the buffer address but greater than zero.
414 Put the data of the requested offset at the
415 beginning of the buffer. Return the number of
416 bytes of data placed there. If this number is
417 greater than zero and you didn't signal eof
418 and the reader is prepared to take more data
419 you will be called again with the requested
420 offset advanced by <literal>*start</literal>. This interface is
421 useful when you have a large file consisting
422 of a series of blocks which you want to count
423 and return as wholes.
424 (Hack by Paul.Russell@rustcorp.com.au)
425 </para>
426 </listitem>
427 <listitem>
428 <para>
429 Set <literal>*start</literal> to an address within the buffer.
430 Put the data of the requested offset at <literal>*start</literal>.
431 Return the number of bytes of data placed there.
432 If this number is greater than zero and you
433 didn't signal eof and the reader is prepared to
434 take more data you will be called again with the
435 requested offset advanced by the number of bytes
436 absorbed.
437 </para>
438 </listitem>
439 </orderedlist>
440 </blockquote>
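<para>
The first method in the comment above can be tried out in
userland. What follows is a minimal sketch, not kernel code:
the names <function>sketch_read_func</function>,
<function>sketch_call_read</function> and
<constant>SKETCH_PAGE_SIZE</constant> are hypothetical, and
<function>sketch_call_read</function> only roughly imitates what
the procfs core does with the return value when
<literal>*start</literal> is left at <constant>NULL</constant>.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Stand-in for the kernel's PAGE_SIZE (assumption). */
#define SKETCH_PAGE_SIZE 4096

/* Hypothetical callback with the shape of a read function.
 * Method 1: leave *start alone, fill the page from its
 * beginning, signal eof, return the total length. */
int sketch_read_func(char *page, char **start, off_t off,
                     int count, int *eof, void *data)
{
    const char *name = data;
    int len;

    (void)start;
    (void)off;
    (void)count;

    len = snprintf(page, SKETCH_PAGE_SIZE,
                   "hello from %s\n", name);
    *eof = 1;
    return len;
}

/* Userland imitation of the caller: hand out the bytes
 * from offset `off` onwards, at most `count` of them. */
int sketch_call_read(char *out, int count, off_t off,
                     void *data)
{
    char page[SKETCH_PAGE_SIZE];
    char *start = NULL;
    int eof = 0;
    int n;

    n = sketch_read_func(page, &start, off, count,
                         &eof, data);
    assert(start == NULL);  /* method 1 was used */

    n -= (int)off;          /* supplied bytes = n - offset */
    if (n <= 0)
        return 0;
    if (n > count)
        n = count;
    memcpy(out, page + off, (size_t)n);
    return n;
}
```
]]></programlisting>

<para>
Because the callback always writes the whole text and signals
eof, this style only suits files that fit in a single buffer,
which matches the remark in the quoted comment.
</para>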
442 <para>
443 <xref linkend="example"/> shows how to use a read call back
444 function.
445 </para>
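<para>
For data larger than one buffer, the third method from the
quoted comment applies. The following userland sketch
illustrates the idea under stated assumptions: the names
<function>sketch_big_read</function>,
<function>sketch_read_all</function>,
<varname>sketch_text</varname> and the deliberately tiny
<constant>SKETCH_PAGE_SIZE</constant> are hypothetical, and the
loop is only an approximation of how the caller advances the
offset by the number of bytes absorbed.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Tiny on purpose; the real PAGE_SIZE is much larger. */
#define SKETCH_PAGE_SIZE 16

/* 36 bytes of data, more than one sketch page. */
static const char sketch_text[] =
    "0123456789abcdefghijklmnopqrstuvwxyz";

/* Method 3: put the data for offset `off` at *start
 * (here the beginning of the page) and return how many
 * bytes were placed there. */
int sketch_big_read(char *page, char **start, off_t off,
                    int count, int *eof, void *data)
{
    int total = (int)strlen(sketch_text);
    int left = total - (int)off;

    (void)data;

    if (left <= 0) {
        *eof = 1;
        return 0;
    }
    if (count > SKETCH_PAGE_SIZE)
        count = SKETCH_PAGE_SIZE;
    if (left <= count) {
        count = left;
        *eof = 1;       /* this chunk is the last one */
    }
    memcpy(page, sketch_text + off, (size_t)count);
    *start = page;
    return count;
}

/* Imitation of the caller: call again with the offset
 * advanced until eof or until `outlen` bytes are read. */
int sketch_read_all(char *out, int outlen)
{
    char page[SKETCH_PAGE_SIZE];
    off_t off = 0;
    int eof = 0;

    while (!eof && off < outlen) {
        char *start = NULL;
        int n = sketch_big_read(page, &start, off,
                                outlen - (int)off,
                                &eof, NULL);
        if (n <= 0)
            break;
        memcpy(out + off, start, (size_t)n);
        off += n;
    }
    return (int)off;
}
```
]]></programlisting>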
446 </sect1>
451 <sect1 id="Writing_data">
452 <title>Writing data</title>
454 <para>
455 The write call back function allows a userland process to write
456 data to the kernel, so it has some kind of control over the
457 kernel. The write function should have the following format:
458 </para>
460 <funcsynopsis>
461 <funcprototype>
462 <funcdef>int <function>write_func</function></funcdef>
463 <paramdef>struct file* <parameter>file</parameter></paramdef>
464 <paramdef>const char* <parameter>buffer</parameter></paramdef>
465 <paramdef>unsigned long <parameter>count</parameter></paramdef>
466 <paramdef>void* <parameter>data</parameter></paramdef>
467 </funcprototype>
468 </funcsynopsis>
470 <para>
The write function should read at most
<parameter>count</parameter> bytes from the
<parameter>buffer</parameter>. Note that the
<parameter>buffer</parameter> doesn't live in the kernel's
memory space, so it should first be copied to kernel space
with <function>copy_from_user</function>. The
<parameter>file</parameter> parameter is usually
ignored. <xref linkend="usingdata"/> shows how to use the
<parameter>data</parameter> parameter.
479 </para>
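<para>
The bounded copy described above might look roughly like the
following userland sketch. It is not kernel code:
<function>sketch_copy_from_user</function> is a hypothetical
stand-in for <function>copy_from_user</function> that simply
copies memory and reports zero bytes left uncopied, and
<structname>struct sketch_file</structname>,
<function>sketch_write_func</function> and
<constant>SKETCH_BUF_LEN</constant> are likewise assumptions
made for illustration.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Capacity of the destination, excluding the final NUL. */
#define SKETCH_BUF_LEN 8

struct sketch_file;  /* opaque stand-in for struct file */

/* Userland stand-in: returns the number of bytes that
 * could NOT be copied, which is always 0 here. */
unsigned long sketch_copy_from_user(void *to,
                                    const void *from,
                                    unsigned long n)
{
    memcpy(to, from, n);
    return 0;
}

/* Hypothetical write callback: `data` points to a buffer
 * of SKETCH_BUF_LEN + 1 bytes owned by the caller. */
int sketch_write_func(struct sketch_file *file,
                      const char *buffer,
                      unsigned long count, void *data)
{
    char *dest = data;
    unsigned long len = count;

    (void)file;  /* usually ignored */

    /* never take more than the destination can hold */
    if (len > SKETCH_BUF_LEN)
        len = SKETCH_BUF_LEN;

    if (sketch_copy_from_user(dest, buffer, len))
        return -EFAULT;

    dest[len] = '\0';
    return (int)len;  /* bytes actually consumed */
}
```
]]></programlisting>

<para>
Clamping <parameter>count</parameter> before the copy is the
important design choice here: the length comes from userland
and must not be trusted to fit the kernel-side buffer.
</para>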
481 <para>
482 Again, <xref linkend="example"/> shows how to use this call back
483 function.
484 </para>
485 </sect1>
490 <sect1 id="usingdata">
491 <title>A single call back for many files</title>
493 <para>
494 When a large number of almost identical files is used, it's
495 quite inconvenient to use a separate call back function for
496 each file. A better approach is to have a single call back
497 function that distinguishes between the files by using the
498 <structfield>data</structfield> field in <structname>struct
499 proc_dir_entry</structname>. First of all, the
500 <structfield>data</structfield> field has to be initialised:
501 </para>
<programlisting>
struct proc_dir_entry* entry;
struct my_file_data *file_data;

file_data = kmalloc(sizeof(struct my_file_data), GFP_KERNEL);
entry->data = file_data;
</programlisting>
511 <para>
512 The <structfield>data</structfield> field is a <type>void
513 *</type>, so it can be initialised with anything.
514 </para>
516 <para>
517 Now that the <structfield>data</structfield> field is set, the
518 <function>read_proc</function> and
519 <function>write_proc</function> can use it to distinguish
520 between files because they get it passed into their
521 <parameter>data</parameter> parameter:
522 </para>
<programlisting>
int foo_read_func(char *page, char **start, off_t off,
                  int count, int *eof, void *data)
{
        int len;

        if(data == file_data) {
                /* special case for this file */
        } else {
                /* normal processing */
        }

        return len;
}
</programlisting>
540 <para>
Be sure to free the <structfield>data</structfield> field
when removing the procfs entry.
543 </para>
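<para>
As an illustration of the approach in this section, the
following userland sketch lets one call back function serve two
"files" that differ only in what their
<parameter>data</parameter> pointer refers to. The names
<structname>struct sketch_file_data</structname>,
<function>sketch_shared_read</function> and
<constant>SKETCH_PAGE_SIZE</constant> are hypothetical, and no
procfs entry is actually created.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Stand-in for the kernel's PAGE_SIZE (assumption). */
#define SKETCH_PAGE_SIZE 4096

/* Per-file state that entry->data would point to. */
struct sketch_file_data {
    const char *name;
    int value;
};

/* One callback for every file: all file-specific
 * behaviour comes from the data pointer. */
int sketch_shared_read(char *page, char **start,
                       off_t off, int count, int *eof,
                       void *data)
{
    struct sketch_file_data *d = data;

    (void)start;
    (void)off;
    (void)count;

    assert(d != NULL);
    *eof = 1;
    return snprintf(page, SKETCH_PAGE_SIZE, "%s = %d\n",
                    d->name, d->value);
}
```
]]></programlisting>

<para>
Storing a pointer to a per-file structure, rather than
comparing <parameter>data</parameter> against known pointers,
keeps the call back free of special cases as more files are
added.
</para>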
544 </sect1>
545 </chapter>
550 <chapter id="tips">
551 <title>Tips and tricks</title>
556 <sect1 id="convenience">
557 <title>Convenience functions</title>
559 <funcsynopsis>
560 <funcprototype>
561 <funcdef>struct proc_dir_entry* <function>create_proc_read_entry</function></funcdef>
562 <paramdef>const char* <parameter>name</parameter></paramdef>
563 <paramdef>mode_t <parameter>mode</parameter></paramdef>
564 <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
565 <paramdef>read_proc_t* <parameter>read_proc</parameter></paramdef>
566 <paramdef>void* <parameter>data</parameter></paramdef>
567 </funcprototype>
568 </funcsynopsis>
570 <para>
This function creates a regular file in exactly the same way
as <function>create_proc_entry</function> from <xref
linkend="regularfile"/> does, but also sets the read
function <parameter>read_proc</parameter> in the same call. It
can set the <parameter>data</parameter> as well, as
explained in <xref linkend="usingdata"/>.
577 </para>
578 </sect1>
582 <sect1 id="Modules">
583 <title>Modules</title>
585 <para>
586 If procfs is being used from within a module, be sure to set
587 the <structfield>owner</structfield> field in the
588 <structname>struct proc_dir_entry</structname> to
589 <constant>THIS_MODULE</constant>.
590 </para>
<programlisting>
struct proc_dir_entry* entry;

entry->owner = THIS_MODULE;
</programlisting>
597 </sect1>
602 <sect1 id="Mode_and_ownership">
603 <title>Mode and ownership</title>
605 <para>
606 Sometimes it is useful to change the mode and/or ownership of
607 a procfs entry. Here is an example that shows how to achieve
608 that:
609 </para>
<programlisting>
struct proc_dir_entry* entry;

entry->mode = S_IWUSR | S_IRUSR | S_IRGRP | S_IROTH;
entry->uid = 0;
entry->gid = 100;
</programlisting>
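<para>
To see which permissions such a mask yields, the same constants
can be combined in a small userland program. This is only a
sketch: <function>sketch_proc_mode</function> is a hypothetical
helper, and it uses the POSIX definitions from
<filename>sys/stat.h</filename> rather than the kernel headers.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Owner may read and write; group and others may only
 * read. This is the mask used in the listing above. */
mode_t sketch_proc_mode(void)
{
    return S_IWUSR | S_IRUSR | S_IRGRP | S_IROTH;
}
```
]]></programlisting>

<para>
The result equals octal <constant>0644</constant>, which
<command>ls -l</command> displays as
<literal>rw-r--r--</literal>.
</para>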
619 </sect1>
620 </chapter>
625 <chapter id="example">
626 <title>Example</title>
<!-- be careful with the example code: it shouldn't be wider than
approx. 60 columns, or otherwise it won't fit properly on a page
-->
632 &procfsexample;
634 </chapter>
635 </book>