1 .\" $NetBSD: c.t,v 1.2 1998/01/09 06:55:50 perry Exp $
3 .\" Copyright (c) 1983, 1986, 1993
4 .\" The Regents of the University of California. All rights reserved.
6 .\" Redistribution and use in source and binary forms, with or without
7 .\" modification, are permitted provided that the following conditions
9 .\" 1. Redistributions of source code must retain the above copyright
10 .\" notice, this list of conditions and the following disclaimer.
11 .\" 2. Redistributions in binary form must reproduce the above copyright
12 .\" notice, this list of conditions and the following disclaimer in the
13 .\" documentation and/or other materials provided with the distribution.
14 .\" 3. Neither the name of the University nor the names of its contributors
15 .\" may be used to endorse or promote products derived from this software
16 .\" without specific prior written permission.
18 .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
19 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
20 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
21 .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
22 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
23 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
24 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
25 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
26 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
27 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
30 .\" @(#)c.t 8.1 (Berkeley) 6/8/93
33 .\".ds RH "Buffering and congestion control
37 \s+2Buffering and congestion control\s0
.PP
One of the major factors in the performance of a protocol is
the buffering policy used.  Lack of a proper buffering policy
can force packets to be dropped, cause falsified windowing
information to be emitted by protocols, fragment host memory,
degrade the overall host performance, etc.  Due to problems
such as these, most systems allocate a fixed pool of memory
to the networking system and impose
a policy optimized for ``normal'' network operation.
.PP
The networking system developed for UNIX is little different in this
respect.  At boot time a fixed amount of memory is allocated by
the networking system.  At later times more system memory
may be requested as the need arises, but at no time is
memory ever returned to the system.  It is possible to
garbage collect memory from the network, but difficult.  In
order to perform this garbage collection properly, some
portion of the network will have to be ``turned off'' as
data structures are updated.  The interval over which this
occurs must be kept small compared to the average inter-packet
arrival time, or too much traffic may
be lost, impacting other hosts on the network, as well as
increasing load on the interconnecting media.  In our
environment we have not experienced a need for such compaction,
and thus have left the problem unresolved.
.PP
The mbuf structure was introduced in chapter 5.  In this
section a brief description will be given of the allocation
mechanisms and the policies used by the protocols in performing
connection level buffering.
.NH 2
Memory management
.PP
The basic memory allocation routines manage a private page map,
the size of which determines the maximum amount of memory
that may be allocated by the network.
A small amount of memory is allocated at boot time
to initialize the mbuf and mbuf page cluster free lists.
When the free lists are exhausted, more memory is requested
from the system memory allocator if space remains in the map.
If memory cannot be allocated,
callers may block awaiting free memory,
or the failure may be reflected to the caller immediately.
The allocator will not block awaiting free map entries, however,
as exhaustion of the page map usually indicates that buffers have been lost
due to a ``leak.''
The private page table is used by the network buffer management
routines in remapping pages to
be logically contiguous as the need arises.  In addition, an
array of reference counts parallels the page table and is used
when multiple references to a page are present.
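.PP
To make the block-or-fail choice concrete, a caller states its
willingness to wait when it requests an mbuf.
The fragment below is only a sketch in terms of the mbuf macros of
\fI<sys/mbuf.h>\fP, not an excerpt from the allocator itself.
.DS
#include <sys/param.h>
#include <sys/errno.h>
#include <sys/mbuf.h>

	struct mbuf *m;

	/* Fail immediately if no memory is available ... */
	MGET(m, M_DONTWAIT, MT_DATA);
	if (m == 0)
		return (ENOBUFS);	/* failure reflected to the caller */

	/* ... or block awaiting free memory. */
	MGET(m, M_WAIT, MT_DATA);
.DE
With M_WAIT the caller may sleep until an mbuf becomes free;
even then the allocator does not wait for free page map entries,
for the reason given above.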
.PP
Mbufs are 128 byte structures, 8 fitting in a 1Kbyte
page of memory.  When data is placed in mbufs,
it is copied or remapped into logically contiguous pages of
memory from the network page pool if possible.
Data smaller than half of the size
of a page is copied into one or more 112 byte mbuf data areas.
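.PP
The size test can be illustrated with a short, hypothetical helper
written against the mbuf macros (MGET, MCLGET, and the MLEN data area
constant); it is a sketch of the policy, not the system's own copy
routine, and it handles only amounts that fit a single page cluster.
.DS
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>

struct mbuf *
sketch_copyin(buf, len)
	caddr_t buf;
	int len;
{
	struct mbuf *m;

	MGET(m, M_DONTWAIT, MT_DATA);
	if (m == 0)
		return (0);
	if (len > MLEN) {
		/* too big for the small data area: attach a cluster */
		MCLGET(m, M_DONTWAIT);
		if ((m->m_flags & M_EXT) == 0) {
			m_freem(m);
			return (0);
		}
	}
	bcopy(buf, mtod(m, caddr_t), (unsigned)len);
	m->m_len = len;
	return (m);
}
.DE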
.NH 2
Protocol buffering policies
.PP
Protocols reserve fixed amounts of
buffering for send and receive queues at socket creation time.  These
amounts define the high and low water marks used by the socket routines
in deciding when to block and unblock a process.  The reservation
of space does not currently
result in any action by the memory management
routines.
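.PP
At the user level the reservation appears as the socket buffer sizes;
the short example below (with an arbitrary 8 Kbyte figure, chosen only
for illustration) changes the reservation, and hence the high water
marks, with the standard socket buffer options.
.DS
#include <sys/types.h>
#include <sys/socket.h>

main()
{
	int s, size = 8 * 1024;		/* illustrative value only */

	s = socket(AF_INET, SOCK_STREAM, 0);
	/* Change the amount reserved for the send and receive queues. */
	setsockopt(s, SOL_SOCKET, SO_SNDBUF, (char *)&size,
	    sizeof (size));
	setsockopt(s, SOL_SOCKET, SO_RCVBUF, (char *)&size,
	    sizeof (size));
	return (0);
}
.DE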
.PP
Protocols which provide connection level flow control do this
based on the amount of space in the associated socket queues.  That
is, send windows are calculated based on the amount of free space
in the socket's receive queue, while receive windows are adjusted
based on the amount of data awaiting transmission in the send queue.
Care has been taken to avoid the ``silly window syndrome'' described
in [Clark82] at both the sending and receiving ends.
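In code terms, the window offered to the sending side is simply the
free space remaining in the receive queue.  A minimal sketch, assuming
the socket buffer macros of \fI<sys/socketvar.h>\fP and with \fIso\fP
standing for the connection's socket:
.DS
	long win;

	/* roughly sb_hiwat - sb_cc, clamped at zero */
	win = sbspace(&so->so_rcv);
	if (win < 0)
		win = 0;
.DE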
.NH 2
Queue limiting
.PP
Incoming packets from the network are always received unless
memory allocation fails.  However, each Level 1 protocol input queue
has an upper bound on the queue's length, and any packets
exceeding that bound are discarded.  It is possible for a host to be
overwhelmed by excessive network traffic (for instance a host
acting as a gateway from a high bandwidth network to a low bandwidth
network).  As a ``defensive'' mechanism the queue limits may be
adjusted to throttle network traffic load on a host.
Consider a host willing to devote some percentage of
its machine to handling network traffic.
If the cost of handling an
incoming packet can be calculated so that an acceptable
``packet handling rate''
can be determined, then input queue lengths may be dynamically
adjusted based on a host's network load and the number of packets
awaiting processing.  Obviously, discarding packets is
not a satisfactory solution to a problem such as this
(simply dropping packets is likely to increase the load on a network);
the queue lengths were incorporated mainly as a safeguard mechanism.
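.PP
The bound is applied when a packet is placed on a protocol input queue
at interrupt level.  The following sketch is phrased in terms of the
\fIifqueue\fP macros of \fI<net/if.h>\fP (the IP input queue,
\fIipintrq\fP, is one such queue); it is illustrative rather than an
excerpt from any driver.
.DS
#include <sys/param.h>
#include <sys/mbuf.h>
#include <net/if.h>

void
sketch_enqueue(inq, m)
	struct ifqueue *inq;
	struct mbuf *m;
{
	int s = splimp();

	if (IF_QFULL(inq)) {
		IF_DROP(inq);		/* over ifq_maxlen: discard */
		m_freem(m);
	} else
		IF_ENQUEUE(inq, m);
	splx(s);
}
.DE
Raising or lowering \fIifq_maxlen\fP on such a queue is the throttling
knob referred to above.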
.NH 2
Packet forwarding
.PP
When packets cannot be forwarded because of memory limitations,
the system attempts to generate a ``source quench'' message.  In addition,
any other problems encountered during packet forwarding are also
reflected back to the sender in the form of ICMP packets.  This
helps hosts avoid unneeded retransmissions.
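A sketch of the source quench case, assuming the 4.4BSD
\fIicmp_error()\fP interface (here \fImcopy\fP is a copy of the
offending packet saved before the forwarding attempt, and \fIdest\fP
and \fIdestifp\fP further describe the failed route):
.DS
#include <netinet/ip_icmp.h>

	if (error == ENOBUFS)
		icmp_error(mcopy, ICMP_SOURCEQUENCH, 0, dest, destifp);
.DE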
.PP
Broadcast packets are never forwarded due to possible dire
consequences.  In an early stage of network development, broadcast
packets were forwarded and a ``routing loop'' resulted in network
saturation and every host on the network crashing.