On Tue, Nov 06, 2007 at 02:33:53AM -0800, akpm@linux-foundation.org wrote:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
        "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [
<!ENTITY procfsexample SYSTEM "procfs_example.xml">
]>
<book id="LKProcfsGuide">
  <bookinfo>
    <title>Linux Kernel Procfs Guide</title>

    <authorgroup>
      <author>
        <firstname>Erik</firstname>
        <othername>(J.A.K.)</othername>
        <surname>Mouw</surname>
        <affiliation>
          <address>
            <email>mouw@nl.linux.org</email>
          </address>
        </affiliation>
      </author>
      <othercredit>
        <contrib>
          This software and documentation were written while working on the
          LART computing board
          (<ulink url="http://www.lartmaker.nl/">http://www.lartmaker.nl/</ulink>),
          which was sponsored by the Delft University of Technology projects
          Mobile Multi-media Communications and Ubiquitous Communications.
        </contrib>
      </othercredit>
    </authorgroup>

    <revhistory>
      <revision>
        <revnumber>1.0</revnumber>
        <date>May 30, 2001</date>
        <revremark>Initial revision posted to linux-kernel</revremark>
      </revision>
      <revision>
        <revnumber>1.1</revnumber>
        <date>June 3, 2001</date>
        <revremark>Revised after comments from linux-kernel</revremark>
      </revision>
    </revhistory>

    <copyright>
      <year>2001</year>
      <holder>Erik Mouw</holder>
    </copyright>

    <legalnotice>
      <para>
        This documentation is free software; you can redistribute it
        and/or modify it under the terms of the GNU General Public
        License as published by the Free Software Foundation; either
        version 2 of the License, or (at your option) any later
        version.
      </para>

      <para>
        This documentation is distributed in the hope that it will be
        useful, but WITHOUT ANY WARRANTY; without even the implied
        warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
        PURPOSE. See the GNU General Public License for more details.
      </para>

      <para>
        You should have received a copy of the GNU General Public
        License along with this program; if not, write to the Free
        Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
        MA 02111-1307 USA
      </para>

      <para>
        For more details see the file COPYING in the source
        distribution of Linux.
      </para>
    </legalnotice>
  </bookinfo>
  <toc>
  </toc>

  <preface id="Preface">
    <title>Preface</title>

    <para>
      This guide describes the use of the procfs file system from
      within the Linux kernel. The idea to write this guide came up on
      the #kernelnewbies IRC channel (see <ulink
      url="http://www.kernelnewbies.org/">http://www.kernelnewbies.org/</ulink>),
      when Jeff Garzik explained the use of procfs and forwarded me a
      message Alexander Viro wrote to the linux-kernel mailing list. I
      agreed to write it up nicely, so here it is.
    </para>

    <para>
      I'd like to thank Jeff Garzik
      <email>jgarzik@pobox.com</email> and Alexander Viro
      <email>viro@parcelfarce.linux.theplanet.co.uk</email> for their input,
      Tim Waugh <email>twaugh@redhat.com</email> for his <ulink
      url="http://people.redhat.com/twaugh/docbook/selfdocbook/">Selfdocbook</ulink>,
      and Marc Joosen <email>marcj@historia.et.tudelft.nl</email> for
      proofreading.
    </para>

    <para>
      Erik
    </para>
  </preface>
  <chapter id="intro">
    <title>Introduction</title>

    <para>
      The <filename class="directory">/proc</filename> file system
      (procfs) is a special file system in the Linux kernel. It's a
      virtual file system: it is not associated with a block device
      but exists only in memory. The files in the procfs are there to
      allow userland programs access to certain information from the
      kernel (like process information in <filename
      class="directory">/proc/[0-9]+/</filename>), but also for debug
      purposes (like <filename>/proc/ksyms</filename>).
    </para>

    <para>
      This guide describes the use of the procfs file system from
      within the Linux kernel. It starts by introducing all relevant
      functions to manage the files within the file system. After that
      it shows how to communicate with userland, and some tips and
      tricks will be pointed out. Finally a complete example will be
      shown.
    </para>

    <para>
      Note that the files in <filename
      class="directory">/proc/sys</filename> are sysctl files: they
      don't belong to procfs and are governed by a completely
      different API described in the Kernel API book.
    </para>
  </chapter>
  <chapter id="managing">
    <title>Managing procfs entries</title>

    <para>
      This chapter describes the functions that various kernel
      components use to populate the procfs with files, symlinks,
      device nodes, and directories.
    </para>

    <para>
      A minor note before we start: if you want to use any of the
      procfs functions, be sure to include the correct header file!
      This should be one of the first lines in your code:
    </para>

    <programlisting>
#include &lt;linux/proc_fs.h&gt;
    </programlisting>
    <sect1 id="regularfile">
      <title>Creating a regular file</title>

      <funcsynopsis>
        <funcprototype>
          <funcdef>struct proc_dir_entry* <function>create_proc_entry</function></funcdef>
          <paramdef>const char* <parameter>name</parameter></paramdef>
          <paramdef>mode_t <parameter>mode</parameter></paramdef>
          <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
        </funcprototype>
      </funcsynopsis>

      <para>
        This function creates a regular file with the name
        <parameter>name</parameter> and file mode
        <parameter>mode</parameter> in the directory
        <parameter>parent</parameter>. To create a file in the root of
        the procfs, use <constant>NULL</constant> as the
        <parameter>parent</parameter> parameter. When successful, the
        function will return a pointer to the freshly created
        <structname>struct proc_dir_entry</structname>; otherwise it
        will return <constant>NULL</constant>. <xref
        linkend="userland"/> describes how to do something useful with
        regular files.
      </para>

      <para>
        Note that passing a path that spans multiple directories is
        explicitly supported. For example,
        <function>create_proc_entry</function>(<parameter>"drivers/via0/info"</parameter>)
        will create the <filename class="directory">via0</filename>
        directory if necessary, with standard
        <constant>0755</constant> permissions.
      </para>

      <para>
        If you only want to be able to read the file, the function
        <function>create_proc_read_entry</function> described in <xref
        linkend="convenience"/> may be used to create and initialise
        the procfs entry in one single call.
      </para>
    </sect1>
    <sect1 id="Creating_a_symlink">
      <title>Creating a symlink</title>

      <funcsynopsis>
        <funcprototype>
          <funcdef>struct proc_dir_entry* <function>proc_symlink</function></funcdef>
          <paramdef>const char* <parameter>name</parameter></paramdef>
          <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
          <paramdef>const char* <parameter>dest</parameter></paramdef>
        </funcprototype>
      </funcsynopsis>

      <para>
        This creates a symlink in the procfs directory
        <parameter>parent</parameter> that points from
        <parameter>name</parameter> to
        <parameter>dest</parameter>. This translates in userland to
        <literal>ln -s</literal> <parameter>dest</parameter>
        <parameter>name</parameter>.
      </para>
    </sect1>
    <sect1 id="Creating_a_directory">
      <title>Creating a directory</title>

      <funcsynopsis>
        <funcprototype>
          <funcdef>struct proc_dir_entry* <function>proc_mkdir</function></funcdef>
          <paramdef>const char* <parameter>name</parameter></paramdef>
          <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
        </funcprototype>
      </funcsynopsis>

      <para>
        Create a directory <parameter>name</parameter> in the procfs
        directory <parameter>parent</parameter>.
      </para>
    </sect1>
    <sect1 id="Removing_an_entry">
      <title>Removing an entry</title>

      <funcsynopsis>
        <funcprototype>
          <funcdef>void <function>remove_proc_entry</function></funcdef>
          <paramdef>const char* <parameter>name</parameter></paramdef>
          <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
        </funcprototype>
      </funcsynopsis>

      <para>
        Removes the entry <parameter>name</parameter> in the directory
        <parameter>parent</parameter> from the procfs. Entries are
        removed by their <emphasis>name</emphasis>, not by the
        <structname>struct proc_dir_entry</structname> returned by the
        various create functions. Note that this function doesn't
        recursively remove entries.
      </para>

      <para>
        Be sure to free the <structfield>data</structfield> entry from
        the <structname>struct proc_dir_entry</structname> before
        <function>remove_proc_entry</function> is called (that is: if
        there was some <structfield>data</structfield> allocated, of
        course). See <xref linkend="usingdata"/> for more information
        on using the <structfield>data</structfield> entry.
      </para>
    </sect1>
  </chapter>
  <chapter id="userland">
    <title>Communicating with userland</title>

    <para>
      Instead of reading (or writing) information directly from
      kernel memory, procfs works with <emphasis>call back
      functions</emphasis> for files: functions that are called when
      a specific file is being read or written. Such functions have
      to be initialised after the procfs file is created by setting
      the <structfield>read_proc</structfield> and/or
      <structfield>write_proc</structfield> fields in the
      <structname>struct proc_dir_entry*</structname> that the
      function <function>create_proc_entry</function> returned:
    </para>

    <programlisting>
struct proc_dir_entry* entry;

/* entry must point to the result of create_proc_entry() here */
entry->read_proc = read_proc_foo;
entry->write_proc = write_proc_foo;
    </programlisting>

    <para>
      If you only want to use the
      <structfield>read_proc</structfield> call back, the function
      <function>create_proc_read_entry</function> described in <xref
      linkend="convenience"/> may be used to create and initialise the
      procfs entry in one single call.
    </para>
    <sect1 id="Reading_data">
      <title>Reading data</title>

      <para>
        The read function is a call back function that allows userland
        processes to read data from the kernel. The read function
        should have the following format:
      </para>

      <funcsynopsis>
        <funcprototype>
          <funcdef>int <function>read_func</function></funcdef>
          <paramdef>char* <parameter>buffer</parameter></paramdef>
          <paramdef>char** <parameter>start</parameter></paramdef>
          <paramdef>off_t <parameter>off</parameter></paramdef>
          <paramdef>int <parameter>count</parameter></paramdef>
          <paramdef>int* <parameter>peof</parameter></paramdef>
          <paramdef>void* <parameter>data</parameter></paramdef>
        </funcprototype>
      </funcsynopsis>

      <para>
        The read function should write its information into the
        <parameter>buffer</parameter>, which will be exactly
        <literal>PAGE_SIZE</literal> bytes long.
      </para>

      <para>
        The parameter
        <parameter>peof</parameter> should be used to signal that the
        end of the file has been reached by writing
        <literal>1</literal> to the memory location
        <parameter>peof</parameter> points to.
      </para>

      <para>
        The <parameter>data</parameter>
        parameter can be used to create a single call back function for
        several files, see <xref linkend="usingdata"/>.
      </para>

      <para>
        The rest of the parameters and the return value are described
        by a comment in <filename>fs/proc/generic.c</filename> as follows:
      </para>

      <blockquote>
        <para>
          You have three ways to return data:
        </para>
        <orderedlist>
          <listitem>
            <para>
              Leave <literal>*start = NULL</literal>. (This is the default.)
              Put the data of the requested offset at that
              offset within the buffer. Return the number (<literal>n</literal>)
              of bytes there are from the beginning of the
              buffer up to the last byte of data. If the
              number of supplied bytes (<literal>= n - offset</literal>) is
              greater than zero and you didn't signal eof
              and the reader is prepared to take more data
              you will be called again with the requested
              offset advanced by the number of bytes
              absorbed. This interface is useful for files
              no larger than the buffer.
            </para>
          </listitem>
          <listitem>
            <para>
              Set <literal>*start</literal> to an unsigned long value less than
              the buffer address but greater than zero.
              Put the data of the requested offset at the
              beginning of the buffer. Return the number of
              bytes of data placed there. If this number is
              greater than zero and you didn't signal eof
              and the reader is prepared to take more data
              you will be called again with the requested
              offset advanced by <literal>*start</literal>. This interface is
              useful when you have a large file consisting
              of a series of blocks which you want to count
              and return as wholes.
              (Hack by Paul.Russell@rustcorp.com.au)
            </para>
          </listitem>
          <listitem>
            <para>
              Set <literal>*start</literal> to an address within the buffer.
              Put the data of the requested offset at <literal>*start</literal>.
              Return the number of bytes of data placed there.
              If this number is greater than zero and you
              didn't signal eof and the reader is prepared to
              take more data you will be called again with the
              requested offset advanced by the number of bytes
              absorbed.
            </para>
          </listitem>
        </orderedlist>
      </blockquote>

      <para>
        <xref linkend="example"/> shows how to use a read call back
        function.
      </para>
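      <para>
        As an illustration of the first of the three methods quoted
        above, here is a minimal sketch of a read call back for a file
        that is much smaller than the buffer. The function name
        <function>read_proc_hello</function>, the output text and the
        <literal>SKETCH_PAGE_SIZE</literal> constant are assumptions
        made for this sketch, and it uses userspace headers so that it
        can be compiled on its own; a module would use the kernel's
        headers and <literal>PAGE_SIZE</literal> instead.
      </para>

      <programlisting><![CDATA[
#include <stdio.h>
#include <sys/types.h>

/* Assumption: stands in for the kernel's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Hypothetical read call back using the first method: *start is
 * left untouched, the whole file is placed at the beginning of the
 * buffer and the total number of bytes is returned.
 */
static int read_proc_hello(char *buffer, char **start, off_t off,
                           int count, int *peof, void *data)
{
        const char *who = data ? (const char *)data : "procfs";
        int len;

        (void)start;    /* method 1: *start stays NULL */
        (void)off;      /* the caller skips the first off bytes */
        (void)count;

        len = snprintf(buffer, SKETCH_PAGE_SIZE, "hello from %s\n", who);
        *peof = 1;      /* everything fits in one call */
        return len;
}
]]></programlisting>

      <para>
        Because the text is far shorter than the buffer, the sketch can
        ignore <parameter>off</parameter> and
        <parameter>count</parameter>; a file that may exceed the buffer
        would need one of the other two methods.
      </para>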
    </sect1>

    <sect1 id="Writing_data">
      <title>Writing data</title>

      <para>
        The write call back function allows a userland process to write
        data to the kernel, so it has some kind of control over the
        kernel. The write function should have the following format:
      </para>

      <funcsynopsis>
        <funcprototype>
          <funcdef>int <function>write_func</function></funcdef>
          <paramdef>struct file* <parameter>file</parameter></paramdef>
          <paramdef>const char* <parameter>buffer</parameter></paramdef>
          <paramdef>unsigned long <parameter>count</parameter></paramdef>
          <paramdef>void* <parameter>data</parameter></paramdef>
        </funcprototype>
      </funcsynopsis>

      <para>
        The write function should read at most
        <parameter>count</parameter> bytes from the
        <parameter>buffer</parameter>. Note
        that the <parameter>buffer</parameter> doesn't live in the
        kernel's memory space, so it should first be copied to kernel
        space with <function>copy_from_user</function>. The
        <parameter>file</parameter> parameter is usually
        ignored. <xref linkend="usingdata"/> shows how to use the
        <parameter>data</parameter> parameter.
      </para>

      <para>
        Again, <xref linkend="example"/> shows how to use this call back
        function.
      </para>
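      <para>
        The shape of such a write call back can be sketched as follows.
        This is only an illustration under stated assumptions:
        <function>write_proc_foo</function>,
        <literal>foo_value</literal> and <literal>FOO_LEN</literal> are
        hypothetical names, and
        <function>sketch_copy_from_user</function> is a userspace
        stand-in so that the sketch builds outside the kernel. A module
        would call the real <function>copy_from_user</function>, which
        returns the number of bytes that could not be copied.
      </para>

      <programlisting><![CDATA[
#include <errno.h>
#include <string.h>

#define FOO_LEN 8       /* hypothetical capacity of the stored value */

struct file;            /* opaque here: the call back ignores it */

static char foo_value[FOO_LEN + 1];

/*
 * Stand-in for copy_from_user() so the sketch builds in userspace.
 * Like the real function it returns the number of bytes NOT copied.
 */
static unsigned long sketch_copy_from_user(void *to, const void *from,
                                           unsigned long n)
{
        memcpy(to, from, n);
        return 0;
}

/* Hypothetical write call back: store at most FOO_LEN bytes. */
static int write_proc_foo(struct file *file, const char *buffer,
                          unsigned long count, void *data)
{
        unsigned long len = count > FOO_LEN ? FOO_LEN : count;

        (void)file;
        (void)data;

        if (sketch_copy_from_user(foo_value, buffer, len))
                return -EFAULT;

        foo_value[len] = '\0';
        return (int)len;        /* number of bytes consumed */
}
]]></programlisting>

      <para>
        Clamping <parameter>count</parameter> before the copy is what
        keeps a long write from overrunning the destination; the return
        value tells the writer how many bytes were actually taken.
      </para>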
    </sect1>

    <sect1 id="usingdata">
      <title>A single call back for many files</title>

      <para>
        When a large number of almost identical files is used, it's
        quite inconvenient to use a separate call back function for
        each file. A better approach is to have a single call back
        function that distinguishes between the files by using the
        <structfield>data</structfield> field in <structname>struct
        proc_dir_entry</structname>. First of all, the
        <structfield>data</structfield> field has to be initialised:
      </para>

      <programlisting>
struct proc_dir_entry* entry;
struct my_file_data *file_data;

file_data = kmalloc(sizeof(struct my_file_data), GFP_KERNEL);
if (file_data == NULL)
        return -ENOMEM; /* assumes an int-returning init function */
entry->data = file_data;
      </programlisting>

      <para>
        The <structfield>data</structfield> field is a <type>void
        *</type>, so it can be initialised with anything.
      </para>

      <para>
        Now that the <structfield>data</structfield> field is set, the
        <function>read_proc</function> and
        <function>write_proc</function> can use it to distinguish
        between files because they get it passed into their
        <parameter>data</parameter> parameter:
      </para>

      <programlisting>
int foo_read_func(char *page, char **start, off_t off,
                  int count, int *eof, void *data)
{
        int len = 0;

        if (data == file_data) {
                /* special case for this file */
        } else {
                /* normal processing */
        }

        return len;
}
      </programlisting>

      <para>
        Be sure to free the <structfield>data</structfield> field
        when removing the procfs entry.
      </para>
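      <para>
        A variation on the listing above, shown here only as a sketch:
        instead of comparing pointers, the call back can dereference
        <parameter>data</parameter> and let the per-file structure
        decide what is printed. The members <literal>name</literal> and
        <literal>value</literal>, the function name
        <function>shared_read_func</function> and the
        <literal>SKETCH_PAGE_SIZE</literal> constant are assumptions
        made for this illustration, which uses userspace headers so it
        can be compiled on its own.
      </para>

      <programlisting><![CDATA[
#include <stdio.h>
#include <sys/types.h>

/* Assumption: stands in for the kernel's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/* Hypothetical per-file data hung off the data field. */
struct my_file_data {
        const char *name;
        int value;
};

/* One read call back shared by every file that carries such data. */
static int shared_read_func(char *page, char **start, off_t off,
                            int count, int *eof, void *data)
{
        const struct my_file_data *file_data = data;
        int len;

        (void)start;
        (void)off;
        (void)count;

        len = snprintf(page, SKETCH_PAGE_SIZE, "%s = %d\n",
                       file_data->name, file_data->value);
        *eof = 1;
        return len;
}
]]></programlisting>

      <para>
        With this layout, adding another file means allocating one more
        <structname>struct my_file_data</structname> rather than
        touching the call back.
      </para>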
    </sect1>
  </chapter>

  <chapter id="tips">
    <title>Tips and tricks</title>

    <sect1 id="convenience">
      <title>Convenience functions</title>

      <funcsynopsis>
        <funcprototype>
          <funcdef>struct proc_dir_entry* <function>create_proc_read_entry</function></funcdef>
          <paramdef>const char* <parameter>name</parameter></paramdef>
          <paramdef>mode_t <parameter>mode</parameter></paramdef>
          <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
          <paramdef>read_proc_t* <parameter>read_proc</parameter></paramdef>
          <paramdef>void* <parameter>data</parameter></paramdef>
        </funcprototype>
      </funcsynopsis>

      <para>
        This function creates a regular file in exactly the same way
        as <function>create_proc_entry</function> from <xref
        linkend="regularfile"/> does, but also allows you to set the
        read function <parameter>read_proc</parameter> in one call. It
        can set the <parameter>data</parameter> as well, as
        explained in <xref linkend="usingdata"/>.
      </para>
    </sect1>

    <sect1 id="Modules">
      <title>Modules</title>

      <para>
        If procfs is being used from within a module, be sure to set
        the <structfield>owner</structfield> field in the
        <structname>struct proc_dir_entry</structname> to
        <constant>THIS_MODULE</constant>.
      </para>

      <programlisting>
struct proc_dir_entry* entry;

entry->owner = THIS_MODULE;
      </programlisting>
    </sect1>

    <sect1 id="Mode_and_ownership">
      <title>Mode and ownership</title>

      <para>
        Sometimes it is useful to change the mode and/or ownership of
        a procfs entry. Here is an example that shows how to achieve
        that:
      </para>

      <programlisting>
struct proc_dir_entry* entry;

entry->mode = S_IWUSR | S_IRUSR | S_IRGRP | S_IROTH;
entry->uid = 0;
entry->gid = 100;
      </programlisting>
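      <para>
        For readers more used to octal modes: the permission flags in
        the listing above add up to <constant>0644</constant>, that is
        read and write for the owner and read-only for group and
        others. The small userspace sketch below checks that
        arithmetic; the helper name
        <function>sketch_proc_mode</function> is hypothetical and not
        part of any kernel API.
      </para>

      <programlisting><![CDATA[
#include <sys/stat.h>

/* Combine the same permission flags as in the listing above. */
static unsigned int sketch_proc_mode(void)
{
        return S_IWUSR | S_IRUSR | S_IRGRP | S_IROTH;
}
]]></programlisting>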
    </sect1>
  </chapter>

  <chapter id="example">
    <title>Example</title>

    <!-- be careful with the example code: it shouldn't be wider than
         approx. 60 columns, or otherwise it won't fit properly on a page
    -->

    &procfsexample;

  </chapter>
</book>