1 .\" $NetBSD: kqueue.2,v 1.32 2012/01/25 00:28:35 christos Exp $
3 .\" Copyright (c) 2000 Jonathan Lemon
4 .\" All rights reserved.
6 .\" Copyright (c) 2001, 2002, 2003 The NetBSD Foundation, Inc.
7 .\" All rights reserved.
9 .\" Portions of this documentation is derived from text contributed by
12 .\" Redistribution and use in source and binary forms, with or without
13 .\" modification, are permitted provided that the following conditions
15 .\" 1. Redistributions of source code must retain the above copyright
16 .\" notice, this list of conditions and the following disclaimer.
17 .\" 2. Redistributions in binary form must reproduce the above copyright
18 .\" notice, this list of conditions and the following disclaimer in the
19 .\" documentation and/or other materials provided with the distribution.
21 .\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
22 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
23 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
24 .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
25 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
26 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
27 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
28 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
29 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
30 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
33 .\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.22 2001/06/27 19:55:57 dd Exp $
42 .Nd kernel event notification mechanism
51 .Fn kqueue1 "int flags"
53 .Fn kevent "int kq" "const struct kevent *changelist" "size_t nchanges" "struct kevent *eventlist" "size_t nevents" "const struct timespec *timeout"
54 .Fn EV_SET "\*[Am]kev" ident filter flags fflags data udata
57 provides a generic method of notifying the user when an event
58 happens or a condition holds, based on the results of small
59 pieces of kernel code termed filters.
A kevent is identified by the (ident, filter) pair; each kqueue holds
at most one kevent for any given pair.
63 The filter is executed upon the initial registration of a kevent
64 in order to detect whether a preexisting condition is present, and is also
65 executed whenever an event is passed to the filter for evaluation.
66 If the filter determines that the condition should be reported,
67 then the kevent is placed on the kqueue for the user to retrieve.
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
75 Multiple events which trigger the filter do not result in multiple
76 kevents being placed on the kqueue; instead, the filter will aggregate
77 the events into a single struct kevent.
80 on a file descriptor will remove any kevents that reference the descriptor.
83 creates a new kernel event queue and returns a descriptor.
also allows setting the following
89 on the returned file descriptor:
90 .Bl -column O_NONBLOCK -offset indent
Set the close-on-exec property.
Set non-blocking I/O.
101 The queue is not inherited by a child created with
105 .\" is called without the
107 .\" flag, then the descriptor table is shared,
108 .\" which will allow sharing of the kqueue between two processes.
111 is used to register events with the queue, and return any pending
114 is a pointer to an array of
116 structures, as defined in
118 All changes contained in the
120 are applied before any pending events are read from the queue.
125 is a pointer to an array of kevent structures.
127 determines the size of
133 pointer, it specifies a maximum interval to wait
134 for an event, which will be interpreted as a struct timespec.
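.Pp
As a minimal sketch of preparing the timeout argument, the helper below
converts a millisecond count into the struct timespec that kevent expects.
The name timeout_from_ms is hypothetical and not part of the kqueue API;
the sketch assumes that a NULL timeout waits indefinitely and that a
zero-valued structure effects a poll.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (not part of the kqueue API): fill *ts from a
 * millisecond count and return the pointer to pass to kevent() as its
 * timeout argument.  A negative count means "wait indefinitely" and
 * yields NULL; zero yields a zero-valued timespec, i.e. a poll.
 */
static const struct timespec *
timeout_from_ms(long ms, struct timespec *ts)
{
	if (ms < 0)
		return NULL;			/* block until an event arrives */
	ts->tv_sec = ms / 1000;			/* whole seconds */
	ts->tv_nsec = (ms % 1000) * 1000000L;	/* remainder in nanoseconds */
	return ts;
}
```

A caller would declare a struct timespec on the stack and pass the result
of timeout_from_ms(1500, &ts) as the last argument of kevent to wait
at most one and a half seconds.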
142 To effect a poll, the
145 .No non- Ns Dv NULL ,
146 pointing to a zero-valued
149 The same array may be used for the
155 is a macro which is provided for ease of initializing a
160 structure is defined as:
163 uintptr_t ident; /* identifier for this event */
164 uint32_t filter; /* filter for event */
165 uint32_t flags; /* action flags for kqueue */
166 uint32_t fflags; /* filter flag value */
167 int64_t data; /* filter data value */
168 intptr_t udata; /* opaque user data identifier */
175 .Bl -tag -width XXXfilter -offset indent
177 Value used to identify this event.
178 The exact interpretation is determined by the attached filter,
179 but often is a file descriptor.
181 Identifies the kernel filter used to process this event.
182 There are pre-defined system filters (which are described below), and
183 other filters may be added by kernel subsystems as necessary.
185 Actions to perform on the event.
187 Filter-specific flags.
189 Filter-specific data value.
191 Opaque user-defined value passed through the kernel unchanged.
196 field can contain the following values:
197 .Bl -tag -width XXXEV_ONESHOT -offset indent
199 Adds the event to the kqueue.
200 Re-adding an existing event will modify the parameters of the original
201 event, and not result in a duplicate entry.
202 Adding an event automatically enables it,
203 unless overridden by the EV_DISABLE flag.
207 to return the event if it is triggered.
212 The filter itself is not disabled.
214 Removes the event from the kqueue.
215 Events which are attached to file descriptors are automatically deleted
216 on the last close of the descriptor.
218 Causes the event to return only the first occurrence of the filter
220 After the user retrieves the event from the kqueue, it is deleted.
222 After the event is retrieved by the user, its state is reset.
223 This is useful for filters which report state transitions
224 instead of the current state.
225 Note that some filters may automatically set this flag internally.
227 Filters may set this flag to indicate filter-specific EOF condition.
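.Pp
The interaction of EV_ADD, EV_DISABLE and EV_ENABLE can be sketched as
follows.
The function name and the KQ_SKETCH_HAVE_KQUEUE guard are assumptions made
for this illustration; the guard only lets the sketch compile on systems
without kqueue, where the function fails with ENOSYS.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>
#include <time.h>
#include <unistd.h>

#if defined(__NetBSD__) || defined(__FreeBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/event.h>
#define KQ_SKETCH_HAVE_KQUEUE 1	/* hypothetical guard for this sketch */
#endif

/*
 * Returns 1 if an event added with EV_DISABLE is withheld and is then
 * delivered once EV_ENABLE is applied, 0 if not, and -1 on failure.
 */
static int
disabled_event_is_withheld(void)
{
#ifdef KQ_SKETCH_HAVE_KQUEUE
	struct kevent ev;
	struct timespec poll = { 0, 0 };	/* zero-valued: do not block */
	int fds[2], kq, withheld, delivered, result = -1;

	if (pipe(fds) == -1)
		return -1;
	if ((kq = kqueue()) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (write(fds[1], "x", 1) == 1) {	/* make the read end readable */
		/* EV_ADD would enable the event; EV_DISABLE overrides that. */
		EV_SET(&ev, fds[0], EVFILT_READ, EV_ADD | EV_DISABLE, 0, 0, 0);
		withheld = kevent(kq, &ev, 1, &ev, 1, &poll) == 0;

		/* Re-enable the existing (ident, filter) pair. */
		EV_SET(&ev, fds[0], EVFILT_READ, EV_ENABLE, 0, 0, 0);
		delivered = kevent(kq, &ev, 1, &ev, 1, &poll) == 1 &&
		    !(ev.flags & EV_ERROR);
		result = withheld && delivered;
	}
	close(kq);
	close(fds[0]);
	close(fds[1]);
	return result;
#else
	errno = ENOSYS;		/* no kqueue on this platform */
	return -1;
#endif
}
```

The same array is used for the changelist and the eventlist here, which
keeps the sketch short at the cost of overwriting the change with the
returned event.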
234 Filters are identified by a number.
There are two types of filters: pre-defined filters, which
are described below, and third-party filters that may be added with
237 .Xr kfilter_register 9
238 by kernel sub-systems, third-party device drivers, or loadable
241 As a third-party filter is referenced by a well-known name instead
242 of a statically assigned number, two
244 are supported on the file descriptor returned by
to map a filter name to a filter number, and vice versa (passing
247 arguments in a structure described below):
248 .Bl -tag -width KFILTER_BYFILTER -offset indent
265 The following structure is used to pass arguments in and out of the
267 .Bd -literal -offset indent
268 struct kfilter_mapping {
269 char *name; /* name to lookup or return */
270 size_t len; /* length of name */
271 uint32_t filter; /* filter to lookup or return */
275 Arguments may be passed to and from the filter via the
279 fields in the kevent structure.
281 The predefined system filters are:
282 .Bl -tag -width EVFILT_SIGNAL
284 Takes a descriptor as the identifier, and returns whenever
285 there is data available to read.
286 The behavior of the filter is slightly different depending
287 on the descriptor type.
291 Sockets which have previously been passed to
293 return when there is an incoming connection pending.
295 contains the size of the listen backlog (i.e., the number of
296 connections ready to be accepted with
299 Other socket descriptors return when there is data to be read,
302 value of the socket buffer.
303 This may be overridden with a per-filter low water mark at the
304 time the filter is added by setting the
308 and specifying the new low water mark in
312 contains the number of bytes in the socket buffer.
If the read direction of the socket has been shut down, then the filter
317 and returns the socket error (if any) in
319 It is possible for EOF to be returned (indicating the connection is gone)
320 while there is still data pending in the socket buffer.
322 Returns when the file pointer is not at the end of file.
324 contains the offset from current position to end of file,
327 Returns when there is data to read;
329 contains the number of bytes available.
331 When the last writer disconnects, the filter will set EV_EOF in
333 This may be cleared by passing in EV_CLEAR, at which point the
334 filter will resume waiting for data to become available before
338 Takes a descriptor as the identifier, and returns whenever
339 it is possible to write to the descriptor.
340 For sockets, pipes, fifos, and ttys,
342 will contain the amount of space remaining in the write buffer.
343 The filter will set EV_EOF when the reader disconnects, and for
344 the fifo case, this may be cleared by use of EV_CLEAR.
345 Note that this filter is not supported for vnodes.
For sockets, the low water mark and socket error handling are
identical to the EVFILT_READ case.
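.Pp
As an illustration of the descriptor filters, the sketch below watches the
read end of a pipe with EVFILT_READ and returns the byte count the filter
reports in the data field.
The function name and the KQ_SKETCH_HAVE_KQUEUE guard are assumptions made
for this example; the guard only lets it compile on systems without
kqueue, where the function fails with ENOSYS.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

#if defined(__NetBSD__) || defined(__FreeBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/event.h>
#define KQ_SKETCH_HAVE_KQUEUE 1	/* hypothetical guard for this sketch */
#endif

/*
 * Write len bytes into a fresh pipe, register its read end with
 * EVFILT_READ, and return the number of readable bytes reported in
 * the data field of the returned kevent.  Returns -1 on failure.
 */
static long
pipe_pending_bytes(const char *buf, size_t len)
{
#ifdef KQ_SKETCH_HAVE_KQUEUE
	struct kevent ev;
	struct timespec poll = { 0, 0 };	/* zero-valued: do not block */
	int fds[2], kq;
	long result = -1;

	if (pipe(fds) == -1)
		return -1;
	if ((kq = kqueue()) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	/* ident is the descriptor to watch; fflags, data, udata unused. */
	EV_SET(&ev, fds[0], EVFILT_READ, EV_ADD, 0, 0, 0);
	if (write(fds[1], buf, len) == (ssize_t)len &&
	    kevent(kq, &ev, 1, &ev, 1, &poll) == 1 &&
	    !(ev.flags & EV_ERROR))
		result = (long)ev.data;		/* bytes available to read */
	close(kq);
	close(fds[0]);
	close(fds[1]);
	return result;
#else
	(void)buf;
	(void)len;
	errno = ENOSYS;		/* no kqueue on this platform */
	return -1;
#endif
}
```

Because the change is applied before pending events are read, a single
kevent call both registers the filter and collects the event.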
350 This is not implemented in
353 The sigevent portion of the AIO request is filled in, with
354 .Va sigev_notify_kqueue
355 containing the descriptor of the kqueue that the event should
358 containing the udata value, and
361 When the aio_* function is called, the event will be registered
362 with the specified kqueue, and the
366 returned by the aio_* function.
367 The filter returns under the same conditions as aio_error.
369 Alternatively, a kevent structure may be initialized, with
371 containing the descriptor of the kqueue, and the
372 address of the kevent structure placed in the
374 field of the AIO request.
375 However, this approach will not work on
376 architectures with 64-bit pointers, and should be considered deprecated.
379 Takes a file descriptor as the identifier and the events to watch for in
381 and returns when one or more of the requested events occurs on the descriptor.
382 The events to monitor are:
383 .Bl -tag -width XXNOTE_RENAME
386 was called on the file referenced by the descriptor.
388 A write occurred on the file referenced by the descriptor.
390 The file referenced by the descriptor was extended.
392 The file referenced by the descriptor had its attributes changed.
394 The link count on the file changed.
396 The file referenced by the descriptor was renamed.
398 Access to the file was revoked via
or the underlying filesystem was unmounted.
405 contains the events which triggered the filter.
407 Takes the process ID to monitor as the identifier and the events to watch for
410 and returns when the process performs one or more of the requested events.
411 If a process can normally see another process, it can attach an event to it.
412 The events to monitor are:
413 .Bl -tag -width XXNOTE_TRACKERR
415 The process has exited.
417 The process has called
420 The process has executed a new process via
424 Follow a process across
427 The parent process will return with NOTE_TRACK set in the
429 field, while the child process will return with NOTE_CHILD set in
431 and the parent PID in
434 This flag is returned if the system was unable to attach an event to
435 the child process, usually due to resource limitations.
440 contains the events which triggered the filter.
442 Takes the signal number to monitor as the identifier and returns
443 when the given signal is delivered to the current process.
444 This coexists with the
448 facilities, and has a lower precedence.
449 The filter will record
450 all attempts to deliver a signal to a process, even if the signal has
451 been marked as SIG_IGN.
452 Event notification happens after normal signal delivery processing.
454 returns the number of times the signal has occurred since the last call to
456 This filter automatically sets the EV_CLEAR flag internally.
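.Pp
The counting behavior of the signal filter can be sketched as below: the
signal is set to SIG_IGN, sent to the current process several times, and
the count is read back from the data field.
The function name and the KQ_SKETCH_HAVE_KQUEUE guard are assumptions made
for this example; on systems without kqueue the function fails with ENOSYS.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

#if defined(__NetBSD__) || defined(__FreeBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/event.h>
#define KQ_SKETCH_HAVE_KQUEUE 1	/* hypothetical guard for this sketch */
#endif

/*
 * Ignore signo, send it to this process `times' times, and return the
 * delivery count recorded by EVFILT_SIGNAL.  Returns -1 on failure.
 */
static long
count_ignored_signal(int signo, int times)
{
#ifdef KQ_SKETCH_HAVE_KQUEUE
	struct kevent ev;
	struct timespec poll = { 0, 0 };	/* zero-valued: do not block */
	long result = -1;
	int i, kq;

	if (signal(signo, SIG_IGN) == SIG_ERR)
		return -1;
	if ((kq = kqueue()) == -1)
		return -1;
	/* The identifier is the signal number to monitor. */
	EV_SET(&ev, signo, EVFILT_SIGNAL, EV_ADD, 0, 0, 0);
	if (kevent(kq, &ev, 1, NULL, 0, NULL) != -1) {
		for (i = 0; i < times; i++)
			kill(getpid(), signo);
		/* data holds the occurrences since the last kevent() call. */
		if (kevent(kq, NULL, 0, &ev, 1, &poll) == 1)
			result = (long)ev.data;
	}
	close(kq);
	return result;
#else
	(void)signo;
	(void)times;
	errno = ENOSYS;		/* no kqueue on this platform */
	return -1;
#endif
}
```

Calling count_ignored_signal(SIGUSR1, 3) is expected to report three
occurrences, since the filter records delivery attempts even for ignored
signals.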
458 Establishes an arbitrary timer identified by
462 specifies the timeout period in milliseconds.
463 The timer will be periodic unless EV_ONESHOT is specified.
466 contains the number of times the timeout has expired since the last call to
468 This filter automatically sets the EV_CLEAR flag internally.
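.Pp
A one-shot timer can be sketched as follows.
The function name, the timer identifier 1, the five second wait limit and
the KQ_SKETCH_HAVE_KQUEUE guard are all assumptions made for this example;
on systems without kqueue the function fails with ENOSYS.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>
#include <time.h>
#include <unistd.h>

#if defined(__NetBSD__) || defined(__FreeBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/event.h>
#define KQ_SKETCH_HAVE_KQUEUE 1	/* hypothetical guard for this sketch */
#endif

/*
 * Arm a one-shot timer of period_ms milliseconds, wait for it, and
 * return the expiry count reported in the data field.
 * Returns -1 on failure or if nothing fired within five seconds.
 */
static long
wait_for_timer(int period_ms)
{
#ifdef KQ_SKETCH_HAVE_KQUEUE
	struct kevent ev;
	struct timespec limit = { 5, 0 };	/* give up after five seconds */
	long result = -1;
	int kq;

	if ((kq = kqueue()) == -1)
		return -1;
	/* ident 1 is an arbitrary timer identifier; data is the period. */
	EV_SET(&ev, 1, EVFILT_TIMER, EV_ADD | EV_ONESHOT, 0, period_ms, 0);
	if (kevent(kq, &ev, 1, &ev, 1, &limit) == 1 &&
	    !(ev.flags & EV_ERROR))
		result = (long)ev.data;	/* expirations since the last call */
	close(kq);
	return result;
#else
	(void)period_ms;
	errno = ENOSYS;		/* no kqueue on this platform */
	return -1;
#endif
}
```

Dropping EV_ONESHOT from the flags would make the timer periodic, in which
case the count can exceed one if the caller is slow to retrieve events.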
472 creates a new kernel event queue and returns a file descriptor.
473 If there was an error creating the kernel event queue, a value of \-1 is
returned and errno is set.
477 returns the number of events placed in the
479 up to the value given by
481 If an error occurs while processing an element of the
483 and there is enough room in the
485 then the event will be placed in the
491 and the system error in
495 will be returned, and
497 will be set to indicate the error condition.
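.Pp
The per-change error reporting can be sketched as below: when the
eventlist has room, a failed change comes back as an event with EV_ERROR
set in flags and the error number in data.
The function name and the KQ_SKETCH_HAVE_KQUEUE guard are assumptions made
for this example; on systems without kqueue the function fails with ENOSYS.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>
#include <time.h>
#include <unistd.h>

#if defined(__NetBSD__) || defined(__FreeBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/event.h>
#define KQ_SKETCH_HAVE_KQUEUE 1	/* hypothetical guard for this sketch */
#endif

/*
 * Try to register EVFILT_READ on fd.  Returns 0 if the change was
 * accepted, the error number reported through EV_ERROR if it was
 * rejected, and -1 if kqueue() or kevent() itself failed.
 */
static int
registration_error(int fd)
{
#ifdef KQ_SKETCH_HAVE_KQUEUE
	struct kevent ev;
	struct timespec poll = { 0, 0 };	/* zero-valued: do not block */
	int kq, n, result = -1;

	if ((kq = kqueue()) == -1)
		return -1;
	EV_SET(&ev, fd, EVFILT_READ, EV_ADD, 0, 0, 0);
	/* One eventlist slot leaves room for the error to be reported. */
	n = kevent(kq, &ev, 1, &ev, 1, &poll);
	if (n == 1 && (ev.flags & EV_ERROR))
		result = (int)ev.data;	/* system error for this change */
	else if (n >= 0)
		result = 0;		/* change accepted */
	close(kq);
	return result;
#else
	(void)fd;
	errno = ENOSYS;		/* no kqueue on this platform */
	return -1;
#endif
}
```

Passing an invalid descriptor such as \-1 is expected to yield EBADF
through the eventlist rather than through errno.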
498 If the time limit expires, then
The following example program monitors a file (provided to it as the first
argument) and prints information about some common events it receives
for that file:
.Bd -literal -offset indent
#include \*[Lt]sys/types.h\*[Gt]
#include \*[Lt]sys/event.h\*[Gt]
#include \*[Lt]sys/time.h\*[Gt]
#include \*[Lt]stdio.h\*[Gt]
#include \*[Lt]unistd.h\*[Gt]
#include \*[Lt]stdlib.h\*[Gt]
#include \*[Lt]fcntl.h\*[Gt]
#include \*[Lt]err.h\*[Gt]

int
main(int argc, char *argv[])
{
	int fd, kq, nev;
	struct kevent ev;
	static const struct timespec tout = { 1, 0 };

	if (argc != 2)
		errx(1, "usage: %s file", argv[0]);
	if ((fd = open(argv[1], O_RDONLY)) == -1)
		err(1, "Cannot open `%s'", argv[1]);

	if ((kq = kqueue()) == -1)
		err(1, "Cannot create kqueue");

	/* Watch every vnode event; EV_CLEAR resets state on retrieval. */
	EV_SET(\*[Am]ev, fd, EVFILT_VNODE, EV_ADD | EV_ENABLE | EV_CLEAR,
	    NOTE_DELETE|NOTE_WRITE|NOTE_EXTEND|NOTE_ATTRIB|NOTE_LINK|
	    NOTE_RENAME|NOTE_REVOKE, 0, 0);
	if (kevent(kq, \*[Am]ev, 1, NULL, 0, \*[Am]tout) == -1)
		err(1, "kevent");
	for (;;) {
		/* Wait up to one second for the next event. */
		nev = kevent(kq, NULL, 0, \*[Am]ev, 1, \*[Am]tout);
		if (nev == -1)
			err(1, "kevent");
		if (nev == 0)
			continue;
		if (ev.fflags \*[Am] NOTE_DELETE) {
			printf("deleted ");
			ev.fflags \*[Am]= ~NOTE_DELETE;
		}
		if (ev.fflags \*[Am] NOTE_WRITE) {
			printf("written ");
			ev.fflags \*[Am]= ~NOTE_WRITE;
		}
		if (ev.fflags \*[Am] NOTE_EXTEND) {
			printf("extended ");
			ev.fflags \*[Am]= ~NOTE_EXTEND;
		}
		if (ev.fflags \*[Am] NOTE_ATTRIB) {
			printf("chmod/chown/utimes ");
			ev.fflags \*[Am]= ~NOTE_ATTRIB;
		}
		if (ev.fflags \*[Am] NOTE_LINK) {
			printf("hardlinked ");
			ev.fflags \*[Am]= ~NOTE_LINK;
		}
		if (ev.fflags \*[Am] NOTE_RENAME) {
			printf("renamed ");
			ev.fflags \*[Am]= ~NOTE_RENAME;
		}
		if (ev.fflags \*[Am] NOTE_REVOKE) {
			printf("revoked ");
			ev.fflags \*[Am]= ~NOTE_REVOKE;
		}
		printf("\\n");
		/* Any bits still set were not requested above. */
		if (ev.fflags)
			warnx("unknown event 0x%x", ev.fflags);
	}
}
.Ed
579 The per-process descriptor table is full.
581 The system file table is full.
583 The kernel failed to allocate enough memory for the kernel queue.
591 The process does not have permission to register a filter.
593 The specified descriptor is invalid.
595 There was an error reading or writing the
599 A signal was delivered before the timeout expired and before any
600 events were placed on the kqueue for return.
602 The specified time limit or filter is invalid.
604 The event could not be found to be modified or deleted.
606 No memory was available to register the event.
608 The specified process to attach to does not exist.
611 .\" .Xr aio_error 2 ,
613 .\" .Xr aio_return 2 ,
621 .Xr kfilter_register 9 ,
625 .%T "Kqueue: A Generic and Scalable Event Notification Facility"
626 .%I USENIX Association
627 .%B Proceedings of the FREENIX Track: 2001 USENIX Annual Technical Conference
629 .%U http://www.usenix.org/event/usenix01/freenix01/full_papers/lemon/lemon.pdf
636 functions first appeared in
642 function first appeared in