   1 .\"     $Id: 8.t,v 1.1 2001/07/04 05:29:25 itojun Exp $
   2 .\"
   3 .\".ds RH Comparisons
   4 .NH 1
   5 Comparisons
   6 .PP
   7 This section compares the following three approaches in terms of
   8 their characteristics and actual behavior:
   9 (1) 4.4BSD
  10 .I m_pullup,
  11 (2) NRL
  12 .I m_pullup2,
  13 and (3) KAME
  14 .I m_pulldown.
  15 .LP
  16 .NH 2
Comparison of assumptions
  18 .PP
  19 Table 1 shows the assumptions made by each of the three approaches.
  20 As mentioned earlier,
  21 .I m_pullup
imposes too stringent a requirement on the total length of packet headers.
  23 .I m_pullup2
  24 is workable in most cases, although
  25 this approach adds more restrictions than the specification claims.
  26 .I m_pulldown
assumes that any single packet header is smaller than MCLBYTES,
  28 but makes
  29 no restriction regarding the total length of packet headers.
  30 With a standard mbuf chain,
  31 this is the best
  32 .I m_pulldown
can do, since there is no way to hold a continuous region longer than MCLBYTES.
  34 This characteristic can contribute to better specification conformance,
  35 since
  36 .I m_pulldown
  37 will impose fewer additional restrictions due to the
  38 requirements of implementation.
  39 .PP
  40 Among the three approaches, only
  41 .I m_pulldown
  42 avoids making unnecessary copies of intermediate header data and
  43 avoids pointer reinitialization after calls to these functions.
  44 These attributes result in smaller overhead during input packet processing.
  45 .PP
  46 .nr table +1
  47 At present,
  48 we know of no other 4.4BSD-based IPv6/IPsec stack that addresses kernel
  49 stack overflow issues,
  50 although we are open to
  51 new perspectives and new information.
  52 .NH 2
  53 Performance comparison based on simulated statistics
  54 .PP
  55 To compare the behavior and performance of
  56 .I m_pulldown
  57 against
  58 .I m_pullup
  59 and
  60 .I m_pullup2
using the same traffic and
mbuf chains, we gathered simulated statistics for
.I m_pullup
and
.I m_pullup2
inside the
.I m_pulldown
function.
  69 By running a kernel using the modified
  70 .I m_pulldown
  71 function,
  72 we can easily
  73 gather statistics for these three functions against exactly the same traffic.
  74 .PP
  75 The comparison was made on a computer
(a 366MHz Celeron CPU with 192M bytes of memory)
  77 running NetBSD 1.4.1 with the KAME IPv6/IPsec stack.
  78 Network drivers allocate mbufs just as normal 4.4BSD does.
  79 .I m_pulldown
  80 is called whenever it is needed to ensure continuity in packet data
  81 during inbound packet processing.
The computer acts as an end node, not a router.
  83 .PP
  84 To describe the content of the following table,
  85 we must look at the source code fragment.
  86 .nr figure +1
  87 Figure \n[figure]
  88 .nr figure -1
  89 shows the code fragment from our source code.
  90 The code fragment will
  91 (1) make the TCP header on the mbuf chain
  92 .I m
  93 at offset
  94 .I hdrlen
continuous, and (2) point to the region with the pointer
  96 .I th.
  97 We use a macro named IP6_EXTHDR_CHECK,
  98 and the code before and after the macro expansion is shown in the figure.
  99 .KF
 100 .LD
 101 .ps 6
 102 .vs 7
\f[CR]/* ensure that *th from hdrlen is continuous */
/* before macro expansion... */
struct tcphdr *th;
IP6_EXTHDR_CHECK(th, struct tcphdr *, m,
        hdrlen, sizeof(*th));
if (th == NULL)
    return;     /* m is already freed */


/* after macro expansion... */
struct tcphdr *th;
int off;
struct mbuf *n;
if (m->m_len < hdrlen + sizeof(*th)) {
    n = m_pulldown(m, hdrlen, sizeof(*th), &off);
    if (n)
        th = (struct tcphdr *)(mtod(n, caddr_t) + off);
    else
        th = NULL;
} else
    th = (struct tcphdr *)(mtod(m, caddr_t) + hdrlen);
if (th == NULL)
    return;\fP
 126 .NL
 127 .DE
 128 .nr figure +1
 129 Figure \n[figure]: code fragment for trimming mbuf chain.
 130 .KE
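.PP
The macro only needs
.I m_pulldown
when the requested region is not already held in one buffer.
The following userland sketch models that condition over a whole chain.
It is an illustration only:
struct seg and range_is_continuous are hypothetical names,
not kernel definitions, and the sketch performs no copying or allocation.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mbuf: only the fields this sketch needs. */
struct seg {
    struct seg *next;   /* next buffer in the chain, or NULL */
    size_t len;         /* bytes of data held in this buffer */
};

/*
 * Return 1 if the byte range [off, off + len) lies entirely inside one
 * buffer of the chain, 0 if it straddles a boundary or runs past the end.
 */
int
range_is_continuous(const struct seg *m, size_t off, size_t len)
{
    /* skip whole buffers that end at or before the start of the range */
    while (m != NULL && off >= m->len) {
        off -= m->len;
        m = m->next;
    }
    if (m == NULL)
        return 0;
    return off + len <= m->len;
}
```

When the function returns 0 for a header, a real kernel would have to rearrange the chain before the header can be accessed through a single pointer.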
.PP
In Table 2,
 132 the first column identifies the test case.
 133 The second column shows the number of times
 134 the IP6_EXTHDR_CHECK macro was used.
 135 In other words, it shows the number of times we have made checks against
 136 mbuf length.
 137 The remaining columns show, from left to right,
 138 the number of times memory allocation/copy was performed in each of the variants.
 139 In the case of
 140 .I m_pullup,
the "fail" column counts the number of cases in which we passed
 142 .I len
 143 in excess of MHLEN (96 bytes in this installation).
With
.I m_pullup2
and
.I m_pulldown,
there were no such failures, which suggests
that no packet had a header portion larger than
MCLBYTES (2048 bytes).
 152 .\" The percentage in parentheses is ratio against the number on the first column.
In this evaluation we used
 154 .I m_pulldown
 155 against IPv6 traffic only.
 156 .1C
 157 .KF
 158 .TS
 159 center box;
 160 l cfI cfI cfI
 161 l c c c.
 162         m_pullup        m_pullup2       m_pulldown
 163 _
 164 total header length     MHLEN(100)      MCLBYTES(2048)  \(mi
 165 single header length    \(mi    \(mi    MCLBYTES(2048)
 166 _
 167 T{
 168 avoids copy on intermediate headers
 169 T}      no      no      yes
 170 _
 171 T{
 172 avoids pointer reinitialization
 173 T}      no      no      yes
 174 .TE
 175 .ce
 176 Table 1: assumptions in mbuf manipulation approaches.
 177 .KE
 178 .KF
 179 .TS
 180 center box;
 181 c |c |cfI s s |cfI s s |cfI s
 182 c |r |c c c |c c c |c c
 183 r |r |r r r |r r r |r r.
 184 test    len checks      m_pulldown      m_pullup        m_pullup2
 185                 call    alloc   copy    alloc   copy    fail    alloc   copy
 186 _
 187 (1)     204923  1706    1595    1596    165     165     1541    1596    1596
 188 (2)     1063995 23786   22931   23008   1171    1229    22557   22895   22953
 189 (3)     520028  1245    948     957     432     432     813     945     945
 190 (4)     438602  180     6       6       178     178     2       24      24
 191 (5)     5570    2236    206     206     812     812     1424    1424    1424
 192 .TE
 193 .ce
Table 2: number of mbuf allocations/copies against traffic
 195 .KE
 196 .KF
 197 .TS
 198 center box;
 199 c |c c c c |c c c
 200 c |r r r r |r r r.
 201 test    IPv6 input      TCP     UDP     ICMPv6  1 mbuf  2 mbufs ext mbuf(s)
 202 _
 203 (1)     29334   20892   2699    5739    3624    15632   10078
 204 (2)     313218  215919  15930   80263   38751   172976  101491
 205 (3)     132267  117822  8561    5882    12782   59799   59686
 206 (4)     73160   66512   5249    1343    7475    42053   23632
 207 (5)     1433    148     53      52      103     1203    127
 208 .TE
 209 .ce
 210 Table 3: Traffic characteristics for tests in Table 2
 211 .KE
 212 .if t .2C
 213 .PP
From these measurements, we can make several interesting observations.
.I m_pullup
actually failed on IPv6 traffic.
 217 If an IPv6 implementation uses
 218 .I m_pullup
 219 for IPv6 input processing,
 220 it must be coded carefully so as to avoid trying
 221 .I m_pullup
 222 against any length longer than MHLEN.
To achieve this, the code must copy the data portion from the mbuf
chain to a separate buffer, and the cost of these memory copies becomes a penalty.
 225 .PP
 226 Due to the nature of this simulation,
 227 the comparison described above may contain an implicit bias.
Since the IPv6 protocol processing code is written using
 229 .I m_pulldown,
 230 the code is somewhat biased toward
 231 .I m_pulldown.
 232 If a programmer had to write the entire IPv6 protocol processing with
 233 .I m_pullup
 234 only, he or she would use
 235 .I m_copydata
 236 to copy intermediate
 237 extension headers buried deep inside the header chains,
 238 thus making it unnecessary to call
 239 .I m_pullup.
 240 In any case, a call to
 241 .I m_copydata
 242 will result in a data copy,
 243 which causes extra overhead.
 244 .\"The author thinks that this bias toward
 245 .\".I m_pulldown
 246 .\"is therefore negligible.
 247 .PP
 248 In all cases, the number of length checks (second column) exceeds the
 249 number of inbound packets.
 250 This behavior is the same as in the original 4.4BSD stack;
 251 we did not add a significant number of length checks to the code.
 252 This is because
 253 .I m_pulldown
 254 (or
 255 .I m_pullup
 256 in the 4.4BSD case)
 257 is called
 258 as necessary during the parsing of the headers.
 259 For example, to process a TCP-over-IPv6 packet, at least 3
 260 checks would be made against m->m_len;
 261 these checks would be made
 262 to grab the IPv6 header (40 bytes),
 263 to grab the TCP header (20 bytes), and to grab the TCP header
 264 and options (20 to 60 bytes).
 265 The length of the TCP option part is kept inside the TCP header,
 266 so the length needs to be checked twice for the TCP part.
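.PP
The three checks can be written out as simple arithmetic.
The sketch below is a userland illustration, not code from the stack:
the function name tcp6_check_lengths is hypothetical,
and it assumes that the TCP header directly follows the IPv6 header
with no extension headers in between.

```c
#include <stddef.h>

#define IP6_HDRLEN   40  /* fixed IPv6 header */
#define TCP_MINHLEN  20  /* TCP header without options */

/*
 * Fill need[] with the packet length that each of the three checks
 * requires.  pkt must hold at least IP6_HDRLEN + TCP_MINHLEN bytes.
 * Returns 0 on success, -1 if the TCP data offset is below the minimum.
 */
int
tcp6_check_lengths(const unsigned char *pkt, size_t need[3])
{
    /* data offset: upper 4 bits of TCP byte 12, in 32-bit words */
    size_t thlen = (size_t)(pkt[IP6_HDRLEN + 12] >> 4) * 4;

    if (thlen < TCP_MINHLEN)
        return -1;
    need[0] = IP6_HDRLEN;                 /* check 1: IPv6 header */
    need[1] = IP6_HDRLEN + TCP_MINHLEN;   /* check 2: fixed TCP header */
    need[2] = IP6_HDRLEN + thlen;         /* check 3: TCP header and options */
    return 0;
}
```

The third length is known only after the second check has succeeded, because the data offset field lives inside the fixed TCP header.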
 267 .\"If the function call overhead is more significant than the actual
 268 .\".I m_pullup
 269 .\"or
 270 .\".I m_pulldown
 271 .\"operation,
 272 .\"we may be able to blindly call
 273 .\".I m_pulldown
 274 .\"with the maximum TCP option length
 275 .\"(60 bytes) in order to reduce the number of function calls.
 276 .KF
 277 .PS
 278 Ao:     box invis ht boxht*2
 279 A:      box at center of Ao "IPv6 header"
 280 Bo:     box invis ht boxht*2
 281 B:      box at center of Bo "TCP header" "(len)"
 282 Co:     box invis ht boxht*2
 283 C:      box at center of Co "TCP options"
 284 D:      box "payload"
 285
 286 arrow from 1/3 of the way between Ao.sw and Ao.se to Ao.sw
 287 arrow from 2/3 of the way between Ao.sw and Ao.se to Ao.se
 288 line invis from Ao.sw to Ao.se "40"
 289 line from Ao.sw to 4/5 of the way between Ao.sw and A.sw
 290 line from Ao.se to 4/5 of the way between Ao.se and A.se
 291
 292 arrow from 1/3 of the way between Bo.nw and Bo.ne to Bo.nw
 293 arrow from 2/3 of the way between Bo.nw and Bo.ne to Bo.ne
 294 line invis from Bo.nw to Bo.ne "20"
 295 line from Bo.nw to 4/5 of the way between Bo.nw and B.nw
 296 line from Bo.ne to 4/5 of the way between Bo.ne and B.ne
 297
 298 arrow from 1/3 of the way between Bo.sw and Co.se to Bo.sw
 299 arrow from 2/3 of the way between Bo.sw and Co.se to Co.se
 300 line invis from Bo.sw to Co.se "20 to 60"
 301 line from Bo.sw to 4/5 of the way between Bo.sw and B.sw
 302 line from Co.se to 4/5 of the way between Co.se and C.se
 303 .PE
 304 .ce
 305 .nr figure +1
 306 Figure \n[figure]: processing a TCP-over-IPv6 packet requires 3 length checks.
 307 .KE
 308 The results suggest that we call
 309 .I m_pulldown
 310 more frequently in ICMPv6 processing than in the processing of other protocols.
 311 These additional calls are made for parsing of ICMPv6 and for neighbor discovery options.
The use of the loopback interface also contributes to the use of
 313 .I m_pulldown.
 314 .PP
 315 In the tests, the number of copies made in the
 316 .I m_pullup2
 317 case is similar to the number made in the
 318 .I m_pulldown
 319 case.
 320 .I m_pulldown
makes fewer copies than
.I m_pullup2
for packets that meet both of the following conditions:
 324 .IP \(sq
The packet is held in multiple mbufs.
With the mbuf allocation policy in
.I m_devget,
two mbufs hold a single packet
if the packet is larger than MHLEN and smaller than MHLEN + MLEN,
or if the packet is larger than MCLBYTES.
 331 .IP \(sq
The extension headers span multiple mbufs.
That is, the header portion of the packet occupies the first mbuf and
continues into subsequent mbufs.
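.LP
The first condition can be modeled with a few comparisons.
This is a simplified sketch of the policy as described above,
not the
.I m_devget
code itself.
The function name mbufs_for_packet is hypothetical,
SK_MHLEN uses the 96 bytes reported for this installation,
SK_MLEN (108) is an assumed value,
and the cluster case is approximated as one mbuf per MCLBYTES of data.

```c
#include <stddef.h>

/* Illustrative sizes; real values depend on the kernel configuration. */
#define SK_MHLEN     96    /* data bytes in a packet header mbuf */
#define SK_MLEN      108   /* data bytes in a plain mbuf (assumed) */
#define SK_MCLBYTES  2048  /* data bytes in a cluster */

/* Estimate how many mbufs hold a received packet of len bytes. */
size_t
mbufs_for_packet(size_t len)
{
    if (len <= SK_MHLEN)
        return 1;                   /* fits in the header mbuf */
    if (len < SK_MHLEN + SK_MLEN)
        return 2;                   /* header mbuf plus one plain mbuf */
    /* clusters: one mbuf per MCLBYTES of data (simplified) */
    return (len + SK_MCLBYTES - 1) / SK_MCLBYTES;
}
```

Under these assumptions a 128-byte packet lands in two mbufs, while a 1500-byte packet fits in a single cluster.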
 335 .LP
To demonstrate the difference, we generated an IPv6 packet with a
routing header containing 4 IPv6 addresses.
The test result is presented as the 5th test in Table 2.
The packet looks like
 340 .nr figure +1
 341 Figure \n[figure].
 342 .nr figure -1
The first 112 bytes are occupied by an IPv6 header and a routing header,
and the remaining 16 bytes are used for an ICMPv6 header and payload.
The packet met the above conditions, and
.I m_pulldown
made fewer copies than
.I m_pullup2.
To process the single incoming ICMPv6 packet shown in the figure,
 350 .I m_pullup2
 351 made 7 copies while
 352 .I m_pulldown
 353 made only 1 copy.
 354 .KF
 355 .LD
 356 .ps 6
 357 .vs 7
\f[CR]node A (source) = 2001:240:0:200:260:97ff:fe07:69ea
node B (destination) = 2001:240:0:200:a00:5aff:fe38:6f86
17:39:43.346078 A > B:
        srcrt (type=0,segleft=4,[0]B,[1]B,[2]B,[3]B):
        icmp6: echo request (len 88, hlim 64)
                 6000 0000 0058 2b40 2001 0240 0000 0200
                 0260 97ff fe07 69ea 2001 0240 0000 0200
                 0a00 5aff fe38 6f86 3a08 0004 0000 0000
                 2001 0240 0000 0200 0a00 5aff fe38 6f86
                 2001 0240 0000 0200 0a00 5aff fe38 6f86
                 2001 0240 0000 0200 0a00 5aff fe38 6f86
                 2001 0240 0000 0200 0a00 5aff fe38 6f86
                 8000 b650 030e 00c8 ce6e fd38 d553 0700\fP
.NL
 371 .DE
 372 .ce
 373 .nr figure +1
Figure \n[figure]: Packet with an IPv6 routing header.
 375 .KE
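.PP
The 112-byte figure can be checked against the dump.
The routing header starts with the bytes 3a 08: next header 0x3a (ICMPv6)
and a header extension length of 8.
The IPv6 specification counts that length field in 8-octet units,
not including the first 8 octets.
The helper name ip6_exthdr_bytes below is hypothetical and is used only for this arithmetic.

```c
#include <stddef.h>

#define IP6_HDRLEN 40   /* fixed IPv6 header */

/*
 * Length in bytes of an extension header whose length field is n.
 * The field counts 8-octet units, not including the first 8 octets.
 */
size_t
ip6_exthdr_bytes(unsigned char n)
{
    return ((size_t)n + 1) * 8;
}
```

With a length field of 8 the routing header is 72 bytes (8 fixed bytes plus 4 addresses of 16 bytes), so the header portion is 40 + 72 = 112 bytes and 128 - 112 = 16 bytes remain for ICMPv6.
Since 112 exceeds the 96-byte MHLEN of this installation, the header portion cannot fit in the first mbuf alone.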
 376 .PP
 377 During the test, we experienced no kernel stack overflow,
 378 thanks to a new calling sequence between IPv6 protocol handlers.
 379 .PP
The number of copies and mbuf allocations varies widely between tests.
We need to investigate the traffic characteristics more carefully,
for example the average length of the header portion of packets.