.\" $NetBSD: 6.t,v 1.4 2003/05/03 18:10:39 wiz Exp $
.\" Copyright (c) 1983, 1986, 1993
.\" The Regents of the University of California. All rights reserved.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\" @(#)6.t 8.1 (Berkeley) 6/8/93
.\".ds RH "Internal layering
\s+2Internal layering\s0
The internal structure of the network system is divided into
three layers. These
layers correspond to the services provided by the socket
abstraction, those provided by the communication protocols,
and those provided by the hardware interfaces. The communication
protocols are normally layered into two or more individual
cooperating layers, though they are collectively viewed
in the system as one layer providing services supportive
of the appropriate socket abstraction.
The following sections describe the properties of each layer
in the system and the interfaces to which each must conform.
The socket layer deals with the interprocess communication
facilities provided by the system. A socket is a bidirectional
endpoint of communication which is ``typed'' by the semantics
of communication it supports. The system calls described in
the \fIBerkeley Software Architecture Manual\fP [Joy86]
are used to manipulate sockets.
A socket consists of the following data structure:
struct socket {
	short	so_type;		/* generic type */
	short	so_options;		/* from socket call */
	short	so_linger;		/* time to linger while closing */
	short	so_state;		/* internal state flags */
	caddr_t	so_pcb;			/* protocol control block */
	struct	protosw *so_proto;	/* protocol handle */
	struct	socket *so_head;	/* back pointer to accept socket */
	struct	socket *so_q0;		/* queue of partial connections */
	short	so_q0len;		/* partials on so_q0 */
	struct	socket *so_q;		/* queue of incoming connections */
	short	so_qlen;		/* number of connections on so_q */
	short	so_qlimit;		/* max number queued connections */
	struct	sockbuf so_rcv;		/* receive queue */
	struct	sockbuf so_snd;		/* send queue */
	short	so_timeo;		/* connection timeout */
	u_short	so_error;		/* error affecting connection */
	u_short	so_oobmark;		/* chars to oob mark */
	short	so_pgrp;		/* pgrp for signals */
};
Each socket contains two data queues, \fIso_rcv\fP and \fIso_snd\fP,
and a pointer to routines which provide supporting services.
The type of the socket,
\fIso_type\fP, is defined at socket creation time and used in selecting
those services which are appropriate to support it. The supporting
protocol is selected at socket creation time and recorded in
the socket data structure for later use. Protocols are defined
by a table of procedures, the \fIprotosw\fP structure, which will
be described in detail later. A pointer to a protocol-specific
data structure,
the ``protocol control block,'' is also present in the socket structure.
Protocols control this data structure, which normally includes a
back pointer to the parent socket structure to allow easy
lookup when returning information to a user
(for example, placing an error number in the \fIso_error\fP
field). The other entries in the socket structure are used in
queuing connection requests, validating user requests, storing
socket characteristics (e.g.
options supplied at the time a socket is created), and maintaining
a socket's state.
Processes ``rendezvous at a socket'' in many instances. For instance,
when a process wishes to extract data from a socket's receive queue
and it is empty, or lacks sufficient data to satisfy the request,
the process blocks, supplying the address of the receive queue as
a ``wait channel'' to be used in notification. When data arrives
for the process and is placed in the socket's queue, the blocked
process is identified by the fact that it is waiting ``on the queue.''
A socket's state is defined from the following:
.ta \w'#define 'u +\w'SS_ISDISCONNECTING 'u +\w'0x000 'u
#define	SS_NOFDREF	0x001	/* no file table ref any more */
#define	SS_ISCONNECTED	0x002	/* socket connected to a peer */
#define	SS_ISCONNECTING	0x004	/* in process of connecting to peer */
#define	SS_ISDISCONNECTING	0x008	/* in process of disconnecting */
#define	SS_CANTSENDMORE	0x010	/* can't send more data to peer */
#define	SS_CANTRCVMORE	0x020	/* can't receive more data from peer */
#define	SS_RCVATMARK	0x040	/* at mark on input */
#define	SS_PRIV	0x080	/* privileged */
#define	SS_NBIO	0x100	/* non-blocking ops */
#define	SS_ASYNC	0x200	/* async i/o notify */
The state of a socket is manipulated both by the protocols
and the user (through system calls).
When a socket is created, the state is defined based on the type of socket.
It may change as control actions are performed, for example connection
establishment.
It may also change according to the type of
input/output the user wishes to perform, as indicated by options
set with \fIfcntl\fP. ``Non-blocking'' I/O implies that
a process should never be blocked to await resources. Instead, any
call which would block returns prematurely
with the error EWOULDBLOCK, or the service request may be partially
fulfilled, e.g. a request for more data than is present.
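The non-blocking rule just described can be illustrated with a small sketch.
It is not taken from the kernel source: the helper name \fIrcv_outcome\fP,
its arguments, and the use of \-1 to mean ``the caller must sleep'' are all
assumptions made for this example. Only SS_NBIO and EWOULDBLOCK come from the text.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SS_NBIO 0x100	/* non-blocking ops (value from the state table) */

/*
 * Hypothetical helper: decide the outcome of a receive request.
 * Returns 0 and sets *ncopy to the number of characters to deliver,
 * returns EWOULDBLOCK for a non-blocking socket with nothing queued,
 * or returns -1 to mean "the caller must sleep on the receive queue".
 */
int
rcv_outcome(int so_state, size_t queued, size_t wanted, size_t *ncopy)
{
	assert(ncopy != NULL);
	*ncopy = 0;
	if (queued >= wanted) {			/* request can be met in full */
		*ncopy = wanted;
		return 0;
	}
	if ((so_state & SS_NBIO) == 0)		/* blocking socket: wait for more */
		return -1;
	if (queued == 0)			/* nothing at all to hand back */
		return EWOULDBLOCK;
	*ncopy = queued;			/* partial fulfilment */
	return 0;
}
```

With SS_NBIO set, a request for 10 characters against a queue holding 4
yields 4 characters; against an empty queue it yields EWOULDBLOCK.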
If a process requested ``asynchronous'' notification of events
related to the socket, the SIGIO signal is posted to the process
when such events occur.
An event is a change in the socket's state;
examples of such occurrences are: space
becoming available in the send queue, new data available in the
receive queue, connection establishment or disestablishment, etc.
A socket may be marked ``privileged'' if it was created by the
super-user. Only privileged sockets may
bind addresses in privileged portions of an address space
or use ``raw'' sockets to access lower levels of the network.
A socket's data queue contains a pointer to the data stored in
the queue and other entries related to the management of
the data. The following structure defines a data queue:
struct sockbuf {
	u_short	sb_cc;		/* actual chars in buffer */
	u_short	sb_hiwat;	/* max actual char count */
	u_short	sb_mbcnt;	/* chars of mbufs used */
	u_short	sb_mbmax;	/* max chars of mbufs to use */
	u_short	sb_lowat;	/* low water mark */
	short	sb_timeo;	/* timeout */
	struct	mbuf *sb_mb;	/* the mbuf chain */
	struct	proc *sb_sel;	/* process selecting read/write */
	short	sb_flags;	/* flags, see below */
};
Data is stored in a queue as a chain of mbufs.
The actual count of data characters as well as high and low water marks are
used by the protocols in controlling the flow of data.
The amount of buffer space (characters of mbufs and associated data pages)
is also recorded along with the limit on buffer allocation.
The socket routines cooperate in implementing the flow control
policy by blocking a process when it requests to send data and
the high water mark has been reached, or when it requests to
receive data and less than the low water mark is present
(assuming non-blocking I/O has not been specified).*
* The low-water mark is always presumed to be 0
in the current implementation.
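The two limits kept in a data queue can be combined into a single
``space remaining'' figure. The sketch below is an illustration under
assumptions, not the system's own routine: the structure
\fIsockbuf_sketch\fP and the functions \fIsb_space\fP and
\fIsb_send_must_wait\fP are names invented here, and only the four
accounting fields from the structure above are reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down copy of the accounting fields of a data queue (hypothetical). */
struct sockbuf_sketch {
	unsigned short sb_cc;		/* actual chars in buffer */
	unsigned short sb_hiwat;	/* max actual char count */
	unsigned short sb_mbcnt;	/* chars of mbufs used */
	unsigned short sb_mbmax;	/* max chars of mbufs to use */
};

/* Space left: the tighter of the data limit and the buffer limit. */
int
sb_space(const struct sockbuf_sketch *sb)
{
	int by_data, by_mbufs;

	assert(sb != NULL);
	by_data = (int)sb->sb_hiwat - (int)sb->sb_cc;
	by_mbufs = (int)sb->sb_mbmax - (int)sb->sb_mbcnt;
	return by_data < by_mbufs ? by_data : by_mbufs;
}

/* True when a sender has reached the high water mark and must wait. */
int
sb_send_must_wait(const struct sockbuf_sketch *sb)
{
	return sb_space(sb) <= 0;
}
```

A queue holding 100 characters against a high water mark of 2048 has
1948 characters of space, provided the mbuf limit is not the tighter bound.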
When a socket is created, the supporting protocol ``reserves'' space
for the send and receive queues of the socket.
The limit on buffer allocation is set somewhat higher than the limit
on data characters
to account for the granularity of buffer allocation.
The actual storage associated with a
socket queue may fluctuate during a socket's lifetime, but it is assumed
that this reservation will always allow a protocol to acquire enough memory
to satisfy the high water marks.
The timeout and select values are manipulated by the socket routines
in implementing various portions of the interprocess communications
facilities and will not be described here.
Data queued at a socket is stored in one of two styles.
Stream-oriented sockets queue data with no addresses, headers
or record boundaries.
The data are in mbufs linked through the \fIm_next\fP field.
Buffers containing access rights may be present within the chain
if the underlying protocol supports passage of access rights.
Record-oriented sockets, including datagram sockets,
queue data as a list of packets; the sections of packets are distinguished
by the types of the mbufs containing them.
The mbufs which comprise a record are linked through the \fIm_next\fP field;
records are linked from the \fIm_act\fP field of the first mbuf
of one packet to the first mbuf of the next.
Each packet begins with an mbuf containing the ``from'' address
if the protocol provides it,
then any buffers containing access rights, and finally any buffers
containing data.
If a record contains no data,
no data buffers are required unless neither address nor access rights
are present.
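The record-oriented layout can be walked with two nested loops, one
following \fIm_act\fP between records and one following \fIm_next\fP within
a record. The sketch below uses a cut-down buffer structure, \fIstruct mb\fP,
and type names MB_SONAME, MB_RIGHTS and MB_DATA that are hypothetical
stand-ins chosen for this example; only the \fIm_next\fP and \fIm_act\fP
linkage is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer types marking the sections of a packet. */
enum mb_type { MB_SONAME, MB_RIGHTS, MB_DATA };

/* Hypothetical, cut-down mbuf: only the fields the walk needs. */
struct mb {
	struct mb *m_next;	/* next buffer in this record */
	struct mb *m_act;	/* first buffer of the next record */
	enum mb_type m_type;	/* which section this buffer holds */
	size_t m_len;		/* bytes held in this buffer */
};

/* Count the records on a queue and total their data bytes. */
size_t
sb_count_records(const struct mb *head, size_t *data_bytes)
{
	size_t nrec = 0;
	const struct mb *rec, *m;

	assert(data_bytes != NULL);
	*data_bytes = 0;
	for (rec = head; rec != NULL; rec = rec->m_act) {
		nrec++;
		for (m = rec; m != NULL; m = m->m_next)
			if (m->m_type == MB_DATA)
				*data_bytes += m->m_len;
	}
	return nrec;
}
```

Address and access-rights buffers are skipped when totalling, so the
byte count reflects only what a receiver would see as data.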
A socket queue has a number of flags used in synchronizing access
to the data and in acquiring resources:
#define	SB_LOCK	0x01	/* lock on data queue (so_rcv only) */
#define	SB_WANT	0x02	/* someone is waiting to lock */
#define	SB_WAIT	0x04	/* someone is waiting for data/space */
#define	SB_SEL	0x08	/* buffer is selected */
#define	SB_COLL	0x10	/* collision selecting */
The last two flags are manipulated by the system in implementing
the select mechanism.
Socket connection queuing
In dealing with connection oriented sockets (e.g. SOCK_STREAM)
the two ends are considered distinct. One end is termed
\fIactive\fP, and generates connection requests. The other
end is called \fIpassive\fP and accepts connection requests.
From the passive side, a socket is marked with
SO_ACCEPTCONN when a \fIlisten\fP call is made,
creating two queues of sockets: \fIso_q0\fP for connections
in progress and \fIso_q\fP for connections already made and
awaiting user acceptance.
As a protocol is preparing incoming connections, it creates
a socket structure queued on \fIso_q0\fP by calling the routine
\fIsonewconn\fP(). When the connection
is established, the socket structure is then transferred
to \fIso_q\fP, making it available for an \fIaccept\fP.
If an SO_ACCEPTCONN socket is closed with sockets on either
\fIso_q0\fP or \fIso_q\fP, these sockets are dropped,
with notification to the peers as appropriate.
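The movement of a connection from \fIso_q0\fP to \fIso_q\fP can be modelled
with two simple lists. The sketch below is a simplified model, not the
kernel's code: \fIstruct conn\fP, \fIstruct listener\fP and the three
functions are hypothetical names, and the admission test (refuse when the
two queues together already hold \fIqlimit\fP entries) is an assumption
made for the example.

```c
#include <assert.h>
#include <stddef.h>

struct conn {			/* stands in for a queued socket */
	struct conn *next;
	int id;
};

struct listener {		/* stands in for the SO_ACCEPTCONN socket */
	struct conn *q0;	/* connections in progress */
	struct conn *q;		/* connections awaiting accept */
	short q0len, qlen, qlimit;
};

/* Append c to the tail of a list. */
static void
enqueue(struct conn **list, struct conn *c)
{
	while (*list != NULL)
		list = &(*list)->next;
	c->next = NULL;
	*list = c;
}

/* Remove c from a list; returns 1 if it was found. */
static int
unlink_conn(struct conn **list, struct conn *c)
{
	for (; *list != NULL; list = &(*list)->next)
		if (*list == c) {
			*list = c->next;
			c->next = NULL;
			return 1;
		}
	return 0;
}

/* A connection request arrived: park it on q0 if there is room. */
int
conn_new(struct listener *l, struct conn *c)
{
	assert(l != NULL && c != NULL);
	if (l->q0len + l->qlen >= l->qlimit)
		return 0;
	enqueue(&l->q0, c);
	l->q0len++;
	return 1;
}

/* The connection completed: transfer it from q0 to q. */
int
conn_established(struct listener *l, struct conn *c)
{
	if (!unlink_conn(&l->q0, c))
		return 0;
	l->q0len--;
	enqueue(&l->q, c);
	l->qlen++;
	return 1;
}

/* accept: take the oldest completed connection, or NULL if none. */
struct conn *
conn_accept(struct listener *l)
{
	struct conn *c = l->q;

	if (c != NULL) {
		l->q = c->next;
		c->next = NULL;
		l->qlen--;
	}
	return c;
}
```

Only connections that have reached \fIq\fP are visible to the accept
step; entries still on \fIq0\fP count against the limit but cannot be accepted.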
Each socket is created in a communications domain,
which usually implies both an addressing structure (address family)
and a set of protocols which implement various socket types within the domain
(protocol family).
Each domain is defined by the following structure:
.ta .5i +\w'struct 'u +\w'(*dom_externalize)(); 'u
struct domain {
	int	dom_family;		/* PF_xxx */
	char	*dom_name;
	int	(*dom_init)();		/* initialize domain data structures */
	int	(*dom_externalize)();	/* externalize access rights */
	int	(*dom_dispose)();	/* dispose of internalized rights */
	struct	protosw *dom_protosw, *dom_protoswNPROTOSW;
	struct	domain *dom_next;
};
At boot time, each domain configured into the kernel
is added to a linked list of domains.
The initialization procedure of each domain is then called.
After that time, the domain structure is used to locate protocols
within the protocol family.
It may also contain procedure references
for externalization of access rights at the receiving socket
and the disposal of access rights that are not received.
Protocols are described by a set of entry points and certain
socket-visible characteristics, some of which are used in
deciding which socket type(s) they may support.
An entry in the ``protocol switch'' table exists for each
protocol module configured into the system. It has the following form:
.ta .5i +\w'struct 'u +\w'domain *pr_domain; 'u
struct protosw {
	short	pr_type;		/* socket type used for */
	struct	domain *pr_domain;	/* domain protocol a member of */
	short	pr_protocol;		/* protocol number */
	short	pr_flags;		/* socket visible attributes */
/* protocol-protocol hooks */
	int	(*pr_input)();		/* input to protocol (from below) */
	int	(*pr_output)();		/* output to protocol (from above) */
	int	(*pr_ctlinput)();	/* control input (from below) */
	int	(*pr_ctloutput)();	/* control output (from above) */
/* user-protocol hook */
	int	(*pr_usrreq)();		/* user request */
/* utility hooks */
	int	(*pr_init)();		/* initialization routine */
	int	(*pr_fasttimo)();	/* fast timeout (200ms) */
	int	(*pr_slowtimo)();	/* slow timeout (500ms) */
	int	(*pr_drain)();		/* flush any excess space possible */
};
A protocol is called through the \fIpr_init\fP entry before any other.
Thereafter it is called every 200 milliseconds through the
\fIpr_fasttimo\fP entry and
every 500 milliseconds through the \fIpr_slowtimo\fP entry for timer-based actions.
The system will call the \fIpr_drain\fP entry if it is low on space;
the protocol should then discard any non-critical data.
Protocols pass data between themselves as chains of mbufs using
the \fIpr_input\fP and \fIpr_output\fP routines. \fIPr_input\fP
passes data up (towards
the user) and \fIpr_output\fP passes it down (towards the network); control
information passes up and down on \fIpr_ctlinput\fP and \fIpr_ctloutput\fP.
The protocol is responsible for the space occupied by any of the
arguments to these entries and must either pass it onward or dispose of it.
(On output, the lowest level reached must free buffers storing the arguments;
on input, the highest level is responsible for freeing buffers.)
The \fIpr_usrreq\fP routine interfaces protocols to the socket
code and is described below.
The \fIpr_flags\fP field is constructed from the following values:
.ta \w'#define 'u +\w'PR_CONNREQUIRED 'u +8n
#define	PR_ATOMIC	0x01	/* exchange atomic messages only */
#define	PR_ADDR	0x02	/* addresses given with messages */
#define	PR_CONNREQUIRED	0x04	/* connection required by protocol */
#define	PR_WANTRCVD	0x08	/* want PRU_RCVD calls */
#define	PR_RIGHTS	0x10	/* passes capabilities */
Protocols which are connection-based specify the PR_CONNREQUIRED
flag so that the socket routines will never attempt to send data
before a connection has been established. If the PR_WANTRCVD flag
is set, the socket routines will notify the protocol when the user
has removed data from the socket's receive queue. This allows
the protocol to implement acknowledgement on user receipt, and
also update windowing information based on the amount of space
available in the receive queue. The PR_ADDR flag indicates that any
data placed in the socket's receive queue will be preceded by the
address of the sender. The PR_ATOMIC flag specifies that each \fIuser\fP
request to send data must be performed in a single \fIprotocol\fP send
request; it is the protocol's responsibility to maintain record
boundaries on data to be sent. The PR_RIGHTS flag indicates that the
protocol supports the passing of capabilities; this is currently
used only by the protocols in the UNIX protocol family.
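How the socket routines might consult these flags before a send can be
sketched as follows. The function \fIsend_precheck\fP and its parameters
are hypothetical, and the error numbers returned (EPIPE, ENOTCONN,
EMSGSIZE) are an assumption based on common practice rather than
something stated in this section.

```c
#include <assert.h>
#include <errno.h>

#define SS_ISCONNECTED	0x002	/* socket connected to a peer */
#define SS_CANTSENDMORE	0x010	/* can't send more data to peer */
#define PR_ATOMIC	0x01	/* exchange atomic messages only */
#define PR_CONNREQUIRED	0x04	/* connection required by protocol */

/*
 * Hypothetical pre-send check: returns 0 if the request may be passed
 * to the protocol, otherwise an error number. snd_hiwat is the high
 * water mark of the send queue.
 */
int
send_precheck(int so_state, int pr_flags, long len, long snd_hiwat)
{
	assert(len >= 0);
	if (so_state & SS_CANTSENDMORE)
		return EPIPE;
	/* A connection-based protocol never sees data before connecting. */
	if ((pr_flags & PR_CONNREQUIRED) && !(so_state & SS_ISCONNECTED))
		return ENOTCONN;
	/* An atomic message must fit in one protocol send request. */
	if ((pr_flags & PR_ATOMIC) && len > snd_hiwat)
		return EMSGSIZE;
	return 0;
}
```

A stream protocol would typically set PR_CONNREQUIRED and a datagram
protocol PR_ATOMIC, so each takes a different branch of the check.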
When a socket is created, the socket routines scan the protocol
table for the domain
looking for an appropriate protocol to support the type of
socket being created. The \fIpr_type\fP field contains one of the
possible socket types (e.g. SOCK_STREAM), while the \fIpr_domain\fP
is a back pointer to the domain structure.
The \fIpr_protocol\fP field contains the protocol number of the
protocol, normally a well-known value.
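The scan can be pictured as a walk of the domain list followed by a walk
of that domain's slice of the protocol table. The sketch below is a
minimal model under assumptions: \fIstruct proto_entry\fP,
\fIstruct domain_entry\fP and \fIfind_proto\fP are hypothetical names, and
the convention that a protocol number of 0 selects the first entry of the
requested type is chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

struct proto_entry {		/* cut-down protocol switch entry */
	short pr_type;		/* socket type used for */
	short pr_protocol;	/* protocol number */
};

struct domain_entry {		/* cut-down domain structure */
	int dom_family;
	struct proto_entry *dom_protosw, *dom_protoswNPROTOSW;
	struct domain_entry *dom_next;
};

/*
 * Find a protocol of the given type in the given family.
 * protocol == 0 accepts the first entry whose type matches.
 */
struct proto_entry *
find_proto(struct domain_entry *domains, int family, int type, int protocol)
{
	struct domain_entry *dp;
	struct proto_entry *pr;

	for (dp = domains; dp != NULL; dp = dp->dom_next)
		if (dp->dom_family == family)
			break;
	if (dp == NULL)
		return NULL;
	assert(dp->dom_protosw != NULL);
	for (pr = dp->dom_protosw; pr < dp->dom_protoswNPROTOSW; pr++) {
		if (pr->pr_type != type)
			continue;
		if (protocol == 0 || pr->pr_protocol == protocol)
			return pr;
	}
	return NULL;
}
```

The pair \fIdom_protosw\fP and \fIdom_protoswNPROTOSW\fP bounds the
domain's entries, so the inner loop never looks at another domain's protocols.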
Network-interface layer
Each network-interface configured into a system defines a
path through which packets may be sent and received.
Normally a hardware device is associated with this interface,
though there is no requirement for this (for example, all
systems have a software ``loopback'' interface used for
debugging and performance analysis).
In addition to manipulating the hardware device, an interface
module is responsible
for encapsulation and decapsulation of any link-layer header
information required to deliver a message to its destination.
The selection of which interface to use in delivering packets
is a routing decision carried out at a
higher level than the network-interface layer.
An interface may have addresses in one or more address families.
The address is set at boot time using an \fIioctl\fP on a socket
in the appropriate domain; this operation is implemented by the protocol
family, after verifying the operation through the device \fIioctl\fP entry.
An interface is defined by the following structure,
.ta .5i +\w'struct 'u +\w'ifaddr *if_addrlist; 'u
struct ifnet {
	char	*if_name;		/* name, e.g. ``en'' or ``lo'' */
	short	if_unit;		/* sub-unit for lower level driver */
	short	if_mtu;			/* maximum transmission unit */
	short	if_flags;		/* up/down, broadcast, etc. */
	short	if_timer;		/* time 'til if_watchdog called */
	struct	ifaddr *if_addrlist;	/* list of addresses of interface */
	struct	ifqueue if_snd;		/* output queue */
	int	(*if_init)();		/* init routine */
	int	(*if_output)();		/* output routine */
	int	(*if_ioctl)();		/* ioctl routine */
	int	(*if_reset)();		/* bus reset routine */
	int	(*if_watchdog)();	/* timer routine */
	int	if_ipackets;		/* packets received on interface */
	int	if_ierrors;		/* input errors on interface */
	int	if_opackets;		/* packets sent on interface */
	int	if_oerrors;		/* output errors on interface */
	int	if_collisions;		/* collisions on csma interfaces */
	struct	ifnet *if_next;
};
Each interface address has the following form:
.ta \w'#define 'u +\w'struct 'u +\w'struct 'u +\w'sockaddr ifa_addr; 'u-\w'struct 'u
struct ifaddr {
	struct	sockaddr ifa_addr;	/* address of interface */
	union {
		struct	sockaddr ifu_broadaddr;
		struct	sockaddr ifu_dstaddr;
	} ifa_ifu;
	struct	ifnet *ifa_ifp;		/* back-pointer to interface */
	struct	ifaddr *ifa_next;	/* next address for interface */
};
.ta \w'#define 'u +\w'ifa_broadaddr 'u +\w'ifa_ifu.ifu_broadaddr 'u
#define	ifa_broadaddr	ifa_ifu.ifu_broadaddr	/* broadcast address */
#define	ifa_dstaddr	ifa_ifu.ifu_dstaddr	/* other end of p-to-p link */
The protocol generally maintains this structure as part of a larger
structure containing additional information concerning the address.
Each interface has a send queue and routines used for
initialization, \fIif_init\fP, and output, \fIif_output\fP.
If the interface resides on a system bus, the routine \fIif_reset\fP
will be called after a bus reset has been performed.
An interface may also
specify a timer routine, \fIif_watchdog\fP;
if \fIif_timer\fP is non-zero, it is decremented once per second
until it reaches zero, at which time the watchdog routine is called.
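The watchdog countdown amounts to a once-per-second pass over the
interface list. The following sketch models it with a cut-down structure;
\fIstruct ifnet_sketch\fP, \fIif_tick\fP and the demonstration routine
\fIdemo_watchdog\fP are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>

struct ifnet_sketch {			/* cut-down interface structure */
	short if_unit;			/* sub-unit for lower level driver */
	short if_timer;			/* seconds until watchdog; 0 = idle */
	void (*if_watchdog)(int unit);	/* timer routine */
	struct ifnet_sketch *if_next;
};

/* Run once per second: count each armed timer down and fire at zero. */
void
if_tick(struct ifnet_sketch *list)
{
	struct ifnet_sketch *ifp;

	for (ifp = list; ifp != NULL; ifp = ifp->if_next) {
		assert(ifp->if_timer >= 0);
		if (ifp->if_timer == 0 || --ifp->if_timer > 0)
			continue;	/* idle, or still counting */
		if (ifp->if_watchdog != NULL)
			ifp->if_watchdog(ifp->if_unit);
	}
}

/* Demonstration watchdog: records how many times it has fired. */
int demo_fired;

void
demo_watchdog(int unit)
{
	(void)unit;
	demo_fired++;
}
```

Because a zero timer is skipped rather than decremented, a driver must
set \fIif_timer\fP again if it wants a further call after the routine has fired.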
The state of an interface and certain characteristics are stored in
the \fIif_flags\fP field. The following values are possible:
#define	IFF_UP	0x1	/* interface is up */
#define	IFF_BROADCAST	0x2	/* broadcast is possible */
#define	IFF_DEBUG	0x4	/* turn on debugging */
#define	IFF_LOOPBACK	0x8	/* is a loopback net */
#define	IFF_POINTOPOINT	0x10	/* interface is point-to-point link */
#define	IFF_NOTRAILERS	0x20	/* avoid use of trailers */
#define	IFF_RUNNING	0x40	/* resources allocated */
#define	IFF_NOARP	0x80	/* no address resolution protocol */
If the interface is connected to a network which supports transmission
of \fIbroadcast\fP packets, the IFF_BROADCAST flag will be set and
the \fIifa_broadaddr\fP field will contain the address to be used in
sending or accepting a broadcast packet. If the interface is associated
with a point-to-point hardware link (for example, a DEC DMR-11), the
IFF_POINTOPOINT flag will be set and \fIifa_dstaddr\fP will contain the
address of the host on the other side of the connection. These addresses
and the local address of the interface, \fIifa_addr\fP, are used in
filtering incoming packets. The interface sets IFF_RUNNING after
it has allocated system resources and posted an initial read on the
device it manages. This state bit is used to avoid multiple allocation
requests when an interface's address is changed. The IFF_NOTRAILERS
flag indicates the interface should refrain from using a \fItrailer\fP
encapsulation on outgoing packets, or (where per-host negotiation
of trailers is possible) that trailer encapsulations should not be requested;
\fItrailer\fP protocols are described
in section 14. The IFF_NOARP flag indicates the interface should not
use an ``address resolution protocol'' in mapping internetwork addresses
to local network addresses.
Various statistics are also stored in the interface structure. These
may be viewed by users using the \fInetstat\fP(1) program.
The interface address and flags may be set with the SIOCSIFADDR and
SIOCSIFFLAGS \fIioctl\fP\^s. SIOCSIFADDR is used initially to define each
interface's address; SIOCSIFFLAGS can be used to mark
an interface down and perform site-specific configuration.
The destination address of a point-to-point link is set with SIOCSIFDSTADDR.
Corresponding operations exist to read each value.
Protocol families may also support operations to set and read the broadcast
address.
In addition, the SIOCGIFCONF \fIioctl\fP retrieves a list of interface
names and addresses for all interfaces and protocols on the host.
All hardware-related interfaces currently reside on the UNIBUS.
Consequently a common set of utility routines for dealing
with the UNIBUS has been developed. Each UNIBUS interface
uses a structure of the following form:
.ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR]; 'u
struct ifubinfo {
	short	iff_uban;		/* uba number */
	short	iff_hlen;		/* local net header length */
	struct	uba_regs *iff_uba;	/* uba regs, in vm */
	short	iff_flags;		/* used during uballoc's */
};
Additional structures are associated with each receive and transmit buffer,
normally one each per interface; for read,
.ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR]; 'u
struct ifrw {
	caddr_t	ifrw_addr;		/* virt addr of header */
	short	ifrw_bdp;		/* unibus bdp */
	short	ifrw_flags;		/* type, etc. */
#define	IFRW_W	0x01			/* is a transmit buffer */
	int	ifrw_info;		/* value from ubaalloc */
	int	ifrw_proto;		/* map register prototype */
	struct	pte *ifrw_mr;		/* base of map registers */
};
and for write,
.ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR]; 'u
struct ifxmt {
	struct	ifrw ifrw;
	caddr_t	ifw_base;			/* virt addr of buffer */
	struct	pte ifw_wmap[IF_MAXNUBAMR];	/* base pages for output */
	struct	mbuf *ifw_xtofree;		/* pages being DMA'd out */
	short	ifw_xswapd;			/* mask of clusters swapped */
	short	ifw_nmr;			/* number of entries in wmap */
};
.ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR]; 'u
#define	ifw_addr	ifrw.ifrw_addr
#define	ifw_bdp	ifrw.ifrw_bdp
#define	ifw_flags	ifrw.ifrw_flags
#define	ifw_info	ifrw.ifrw_info
#define	ifw_proto	ifrw.ifrw_proto
#define	ifw_mr	ifrw.ifrw_mr
One of each of these structures is conveniently packaged for interfaces
with single buffers for each direction, as follows:
.ta \w'#define 'u +\w'ifw_xtofree 'u +\w'pte ifu_wmap[IF_MAXNUBAMR]; 'u
struct ifuba {
	struct	ifubinfo ifu_info;
	struct	ifrw ifu_r;
	struct	ifxmt ifu_xmt;
};
.ta \w'#define 'u +\w'ifw_xtofree 'u
#define	ifu_uban	ifu_info.iff_uban
#define	ifu_hlen	ifu_info.iff_hlen
#define	ifu_uba	ifu_info.iff_uba
#define	ifu_flags	ifu_info.iff_flags
#define	ifu_w	ifu_xmt.ifrw
#define	ifu_xtofree	ifu_xmt.ifw_xtofree
The \fIifubinfo\fP structure contains the general information needed
to characterize the I/O-mapped buffers for the device.
In addition, there is a structure describing each buffer, including
UNIBUS resources held by the interface.
Sufficient memory pages and bus map registers are allocated to each buffer
upon initialization according to the maximum packet size and header length.
The kernel virtual address of the buffer is held in \fIifrw_addr\fP,
and the map registers begin
at \fIifrw_mr\fP. UNIBUS map register \fIifrw_mr\fP\^[\-1]
maps the local network header
ending on a page boundary. UNIBUS data paths are
reserved for read and for
write, given by \fIifrw_bdp\fP. The prototype of the map
registers for read and for write is saved in \fIifrw_proto\fP.
When write transfers are not at least half-full pages on page boundaries,
the data are just copied into the pages mapped on the UNIBUS
and the transfer is started.
If a write transfer is at least half a page long and on a page
boundary, UNIBUS page table entries are swapped to reference
the pages, and then the initial pages are
remapped from \fIifw_wmap\fP when the transfer completes.
The mbufs containing the mapped pages are placed on the \fIifw_xtofree\fP
queue to be freed after transmission.
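The choice between copying and remapping on output reduces to a test on
the length and alignment of the data. The sketch below states that test
in isolation; the function \fIchoose_write_method\fP, the enumeration
names, and the idea of passing the page size as a parameter are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum xfer_method { XFER_COPY, XFER_REMAP };

/*
 * Hypothetical decision for one piece of output data: buf_offset is
 * its position in the output buffer, len its length in bytes, and
 * page_bytes the page size in use.
 */
enum xfer_method
choose_write_method(size_t buf_offset, size_t len, size_t page_bytes)
{
	assert(page_bytes > 0);
	/* At least half a page, starting on a page boundary: swap pages. */
	if (len >= page_bytes / 2 && buf_offset % page_bytes == 0)
		return XFER_REMAP;
	/* Otherwise the data are copied into the already mapped pages. */
	return XFER_COPY;
}
```

With a 1024-byte page, 512 bytes at offset 0 would be remapped, while
511 bytes at offset 0 or 1024 bytes at offset 100 would be copied.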
When read transfers give at least half a page of data to be input, page
frames are allocated from a network page list and traded
with the pages already containing the data, mapping the allocated
pages to replace the input pages for the next UNIBUS data input.
The following utility routines are available for use in
writing network interface drivers; all use the
structures described above.
if_ubaminit(ifubinfo, uban, hlen, nmr, ifr, nr, ifx, nx);
if_ubainit(ifuba, uban, hlen, nmr);
\fIif_ubaminit\fP allocates resources on UNIBUS adapter \fIuban\fP,
storing the information in the \fIifubinfo\fP, \fIifrw\fP and \fIifxmt\fP
structures referenced.
The \fIifr\fP and \fIifx\fP parameters are pointers to arrays
of \fIifrw\fP and \fIifxmt\fP structures whose dimensions
are \fInr\fP and \fInx\fP, respectively.
\fIif_ubainit\fP is a simpler, backwards-compatible interface used
for hardware with single buffers of each type.
They are called only at boot time or after a UNIBUS reset.
One data path (buffered or unbuffered,
depending on the \fIifu_flags\fP field) is allocated for each buffer.
The \fInmr\fP parameter indicates
the number of UNIBUS mapping registers required to map a maximal
sized packet onto the UNIBUS, while \fIhlen\fP specifies the size
of a local network header, if any, which should be mapped separately
from the data (see the description of trailer protocols in section 14).
Sufficient UNIBUS mapping registers and pages of memory are allocated
to initialize the input data path for an initial read. For the output
data path, mapping registers and pages of memory are also allocated
and mapped onto the UNIBUS. The pages associated with the output
data path are held in reserve in the event a write requires copying
non-page-aligned data (see \fIif_wubaput\fP below).
If \fIif_ubainit\fP is called with memory pages already allocated,
they will be used instead of allocating new ones (this normally
occurs after a UNIBUS reset).
A 1 is returned when allocation and initialization are successful,
0 otherwise.
m = if_ubaget(ifubinfo, ifr, totlen, off0, ifp);
m = if_rubaget(ifuba, totlen, off0, ifp);
\fIif_ubaget\fP and \fIif_rubaget\fP pull input data
out of an interface receive buffer and into an mbuf chain.
The first interface passes pointers to the \fIifubinfo\fP structure
for the interface and the \fIifrw\fP structure for the receive buffer;
the second call may be used for single-buffered devices.
\fItotlen\fP specifies the length of data to be obtained, not counting the
local network header. If \fIoff0\fP is non-zero, it indicates
a byte offset to a trailing local network header which should be
copied into a separate mbuf and prepended to the front of the resultant mbuf
chain. When the data amount to at least half a page,
the previously mapped data pages are remapped
into the mbufs and swapped with fresh pages, thus avoiding
a copy.
The receiving interface is recorded as \fIifp\fP, a pointer to an \fIifnet\fP
structure, for the use of the receiving network protocol.
A 0 return value indicates a failure to allocate resources.
if_ubaput(ifubinfo, ifx, m);
if_wubaput(ifuba, m);
\fIif_ubaput\fP and \fIif_wubaput\fP map a chain of mbufs
onto a network interface in preparation for output.
The first interface is used by devices with multiple transmit buffers.
The chain includes any local network
header, which is copied so that it resides in the mapped and
aligned I/O space.
Page-sized data that are page-aligned in the output buffer
are mapped to the UNIBUS in place of the normal buffer page,
and the corresponding mbuf is placed on a queue to be freed after transmission.
Any other mbufs which contained non-page-sized
data portions are copied to the I/O space and then freed.
Pages mapped from a previous output operation (no longer needed)