.\" $NetBSD: a.t,v 1.2 1998/01/09 06:55:48 perry Exp $
.\" Copyright (c) 1983, 1986, 1993
.\" The Regents of the University of California. All rights reserved.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\" @(#)a.t 8.1 (Berkeley) 6/8/93
.\".ds RH "Gateways and routing
\s+2Gateways and routing issues\s0
The system has been designed with the expectation that it will
be used in an internetwork environment. The ``canonical''
environment was envisioned to be a collection of local area
networks connected at one or more points through hosts with
multiple network interfaces (one on each local area network),
and possibly a connection to a long haul network (for example,
the ARPANET). In such an environment, issues of
gatewaying and packet routing become very important. Certain
of these issues, such as congestion
control, have been handled in a simplistic manner or specifically
omitted.
Instead, where possible, the network system
attempts to provide simple mechanisms upon which more involved
policies may be implemented. As some of these problems become
better understood, the solutions developed will be incorporated
into the system.
This section describes the facilities provided for packet
routing. The simplistic mechanisms provided for congestion
control are described in chapter 12.
The network system maintains a set of routing tables for
selecting a network interface to use in delivering a
packet to its destination. The entries in these tables are of the form:
.ta \w'struct 'u +\w'u_long 'u +\w'sockaddr rt_gateway; 'u
struct rtentry {
	u_long	rt_hash;		/* hash key for lookups */
	struct	sockaddr rt_dst;	/* destination net or host */
	struct	sockaddr rt_gateway;	/* forwarding agent */
	short	rt_flags;		/* see below */
	short	rt_refcnt;		/* no. of references to structure */
	u_long	rt_use;		/* packets sent using route */
	struct	ifnet *rt_ifp;	/* interface to give packet to */
};
The routing information is organized in two separate tables, one
for routes to a host and one for routes to a network. The
distinction between hosts and networks is necessary so
that a single mechanism may be used
for both broadcast and multi-drop type networks, and
also for networks built from point-to-point links (e.g
Each table is organized as a hashed set of linked lists.
Two 32-bit hash values are calculated by routines defined for
each address family: one assuming the destination is
a host, and one assuming the target is the network portion
of the address. Each hash value is used to
locate a hash chain to search (by taking the value modulo the
hash table size), and the entire 32-bit value is then
used as a key in scanning the list of routes. Lookups are
applied first to the routing
table for hosts, then to the routing table for networks.
If both lookups fail, a final lookup is made for a ``wildcard''
route (by convention, network 0).
The first appropriate route discovered is used.
Because of this ordering, routes to a specific host on a network may be
present as well as routes to the network. This also allows a
``fall back'' network route to be defined to a ``smart'' gateway
which may then perform more intelligent routing.
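The lookup order described above can be illustrated with a small, self-contained sketch.
It is not the kernel code: the names toy_route, toy_insert and toy_lookup,
the table size of 8, the use of 32-bit integers in place of struct sockaddr,
and the two hash routines (identity for hosts, top 8 bits for networks)
are all assumptions made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RTHASHSIZ 8                 /* assumed table size, illustration only */

struct toy_route {
    uint32_t          hash;             /* full 32-bit key, in the role of rt_hash */
    uint32_t          dst;              /* host address or network number */
    int               ifindex;          /* stands in for rt_ifp */
    struct toy_route *next;             /* next entry on the same hash chain */
};

static struct toy_route *toy_host_tab[TOY_RTHASHSIZ];
static struct toy_route *toy_net_tab[TOY_RTHASHSIZ];

/* Hypothetical per-address-family hash routines. */
static uint32_t toy_hash_host(uint32_t addr) { return addr; }
static uint32_t toy_hash_net(uint32_t addr)  { return addr >> 24; }

/* Push a route onto the chain selected by its hash modulo the table size. */
static void toy_insert(struct toy_route **tab, struct toy_route *rt)
{
    assert(rt != NULL);
    struct toy_route **chain = &tab[rt->hash % TOY_RTHASHSIZ];
    rt->next = *chain;
    *chain = rt;
}

/* Scan one chain; the full hash is compared before the destination. */
static struct toy_route *toy_scan(struct toy_route **tab,
                                  uint32_t hash, uint32_t key)
{
    for (struct toy_route *rt = tab[hash % TOY_RTHASHSIZ]; rt; rt = rt->next)
        if (rt->hash == hash && rt->dst == key)
            return rt;
    return NULL;
}

/* Host table first, then network table, then the wildcard (network 0). */
static struct toy_route *toy_lookup(uint32_t addr)
{
    struct toy_route *rt;

    if ((rt = toy_scan(toy_host_tab, toy_hash_host(addr), addr)) != NULL)
        return rt;
    if ((rt = toy_scan(toy_net_tab, toy_hash_net(addr), addr >> 24)) != NULL)
        return rt;
    return toy_scan(toy_net_tab, 0, 0);
}
```

With a host route, a route to that host's network, and a wildcard installed,
toy_lookup returns the host route for the host itself, the network route for
its neighbours, and the wildcard for everything else.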
Each routing table entry contains a destination (the desired final destination),
a gateway to which to send the packet,
and various flags which indicate the route's status and type (host or
network). A count
of the number of packets sent using the route is kept, along
with a count of ``held references'' to the dynamically
allocated structure to ensure that memory reclamation
occurs only when the route is not in use. Finally, a pointer to
a network interface is kept; packets sent using
the route should be handed to this interface.
Routes are typed in two ways: either as host or network, and as
``direct'' or ``indirect''. The host/network
distinction determines how to compare the \fIrt_dst\fP field
during lookup. If the route is to a network, only a packet's
destination network is compared to the \fIrt_dst\fP entry stored
in the table. If the route is to a host, the addresses must
match exactly.
The distinction between ``direct'' and ``indirect'' routes indicates
whether the destination is directly connected to the source.
This is needed when performing local network encapsulation. If
a packet is destined for a peer at a host or network which is
not directly connected to the source, the internetwork packet
header will
contain the address of the eventual destination, while
the local network header will address the intervening
gateway. Should the destination be directly connected, these addresses
are likely to be identical, or a mapping between the two exists.
The RTF_GATEWAY flag indicates that the route is to an ``indirect''
gateway agent, and that the local network header should be filled in
from the \fIrt_gateway\fP field instead of
from the final internetwork destination address.
It is assumed that multiple routes to the same destination will not
be present; only one of multiple routes, that most recently installed,
will be used.
Routing redirect control messages are used to dynamically
modify existing routing table entries as well as dynamically
create new routing table entries. On hosts where exhaustive
routing information is too expensive to maintain (e.g. workstations), the
combination of wildcard routing entries and routing redirect
messages can be used to provide a simple routing management
scheme without the use of a higher level policy process.
Current connections may be rerouted after notification of the protocols
by means of their \fIpr_ctlinput\fP entries.
Statistics are kept by the routing table routines
on the use of routing redirect messages and their
effect on the routing tables. These statistics may be viewed using
Status information other than routing redirect control messages
may be used in the future, but at present it is ignored.
Likewise, more intelligent ``metrics'' may be used to describe
routes in the future, possibly based on bandwidth and monetary
cost.
Routing table interface
A protocol accesses the routing tables through
three routines,
one to allocate a route, one to free a route, and one
to process a routing redirect control message.
The routine \fIrtalloc\fP performs route allocation; it is
called with a pointer to the following structure containing
the desired destination:
struct route {
	struct	rtentry *ro_rt;
	struct	sockaddr ro_dst;
};
The route returned is assumed ``held'' by the caller until
released with an \fIrtfree\fP call. Protocols which implement
virtual circuits, such as TCP, hold onto routes for the duration
of the circuit's lifetime, while connection-less protocols,
such as UDP, allocate and free routes whenever their destination address
changes.
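The hold-and-release discipline can be sketched as follows.
This is an illustration, not the kernel's \fIrtalloc\fP or \fIrtfree\fP:
the toy_ names, the two-entry table, the integer destinations, and the
helper toy_retarget (which models a connection-less protocol changing
destination) are all assumptions introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

struct toy_rtentry {
    int   dst;                  /* destination, simplified to an integer */
    short refcnt;               /* in the role of rt_refcnt */
};

struct toy_held {               /* in the role of the caller's route holder */
    struct toy_rtentry *ro_rt;
    int                 ro_dst;
};

static struct toy_rtentry toy_table[] = { { 1, 0 }, { 2, 0 } };

/* Look up ro_dst and take a reference on the entry found, if any. */
static void toy_rtalloc(struct toy_held *ro)
{
    assert(ro != NULL);
    if (ro->ro_rt != NULL)
        return;                 /* already holding a route */
    for (size_t i = 0; i < sizeof toy_table / sizeof toy_table[0]; i++) {
        if (toy_table[i].dst == ro->ro_dst) {
            toy_table[i].refcnt++;
            ro->ro_rt = &toy_table[i];
            return;
        }
    }
}

/* Drop one held reference; the entry may be reclaimed once it reaches 0. */
static void toy_rtfree(struct toy_rtentry *rt)
{
    assert(rt != NULL && rt->refcnt > 0);
    rt->refcnt--;
}

/* Connection-less style: free and reallocate only when the target changes. */
static void toy_retarget(struct toy_held *ro, int new_dst)
{
    assert(ro != NULL);
    if (ro->ro_rt != NULL && ro->ro_dst == new_dst)
        return;
    if (ro->ro_rt != NULL) {
        toy_rtfree(ro->ro_rt);
        ro->ro_rt = NULL;
    }
    ro->ro_dst = new_dst;
    toy_rtalloc(ro);
}
```

A virtual-circuit protocol would call toy_rtalloc once when the circuit is
set up and toy_rtfree once when it is torn down, whereas a datagram protocol
would go through toy_retarget each time it sends.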
The routine \fIrtredirect\fP is called to process a routing redirect
control message. It is called with a destination address,
the new gateway to that destination, and the source of the redirect.
Redirects are accepted only from the current router for the destination.
If a non-wildcard route
exists to the destination, the gateway entry in the route is modified
to point at the new gateway supplied. Otherwise, a new routing
table entry is inserted reflecting the information supplied. Routes
to interfaces and routes to gateways which are not directly accessible
from the host are ignored.
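The acceptance rules for a redirect can be sketched in a few lines.
The code below is a simplified model, not \fIrtredirect\fP itself:
the flat array of routes, the toy_ names, the 32-bit addresses, and the
reachability test toy_on_link (which treats network 10 as the only directly
attached network) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAXROUTES 8

struct toy_rt {
    uint32_t dst;
    uint32_t gateway;
    bool     wildcard;          /* the fall-back route (network 0) */
    bool     direct;            /* route to an interface, never redirected */
};

static struct toy_rt toy_routes[TOY_MAXROUTES];
static size_t        toy_nroutes;

/* Hypothetical test for a directly accessible gateway. */
static bool toy_on_link(uint32_t gw) { return (gw >> 24) == 0x0A; }

/* Exact match first; otherwise the wildcard, if one is installed. */
static struct toy_rt *toy_find(uint32_t dst)
{
    struct toy_rt *wild = NULL;

    for (size_t i = 0; i < toy_nroutes; i++) {
        if (!toy_routes[i].wildcard && toy_routes[i].dst == dst)
            return &toy_routes[i];
        if (toy_routes[i].wildcard)
            wild = &toy_routes[i];
    }
    return wild;
}

/* Returns true if the table was changed by the redirect. */
static bool toy_redirect(uint32_t dst, uint32_t new_gw, uint32_t src)
{
    struct toy_rt *rt = toy_find(dst);

    /* Accept only from the router currently used for this destination. */
    if (rt == NULL || rt->direct || rt->gateway != src)
        return false;
    /* Ignore gateways that are not directly accessible. */
    if (!toy_on_link(new_gw))
        return false;
    if (!rt->wildcard) {        /* existing specific route: modify it */
        rt->gateway = new_gw;
        return true;
    }
    if (toy_nroutes == TOY_MAXROUTES)
        return false;
    /* Only the wildcard matched: insert a new specific route. */
    toy_routes[toy_nroutes++] = (struct toy_rt){ dst, new_gw, false, false };
    return true;
}
```

Starting from a table that holds only a wildcard route, the first accepted
redirect for a destination adds a specific entry, and later redirects from
that entry's gateway rewrite it in place.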
User level routing policies
Routing policies implemented in user processes manipulate the
kernel routing tables through two \fIioctl\fP calls. The
commands SIOCADDRT and SIOCDELRT add and delete routing entries,
respectively; the tables are read through the /dev/kmem device.
The decision to place routing policy in a user process implies
that routing table updates may lag a bit behind the identification of
new routes, or the failure of existing routes, but this period
of instability is normally very small with proper implementation
of the routing process. Advisory information, such as ICMP
error messages and IMP diagnostic messages, may be read from
raw sockets (described in the next section).
Several routing policy processes have already been implemented. The
``routing daemon'' uses a variant of the Xerox NS Routing Information
Protocol [Xerox82] to maintain up-to-date routing tables in our local
environment. Interaction with other existing routing protocols,
such as the Internet EGP (Exterior Gateway Protocol), has been
accomplished using a similar process.