Network Working Group                                          M. Larson
Request for Comments: 4697                                     P. Barber
BCP: 123                                                  VeriSign, Inc.
Category: Best Current Practice                             October 2006


                  Observed DNS Resolution Misbehavior

Status of This Memo

This document specifies an Internet Best Current Practices for the
Internet Community, and requests discussion and suggestions for
improvements. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2006).

Abstract

This memo describes DNS iterative resolver behavior that results in a
significant query volume sent to the root and top-level domain (TLD)
name servers. We offer implementation advice to iterative resolver
developers to alleviate these unnecessary queries. The
recommendations made in this document are a direct byproduct of
observation and analysis of abnormal query traffic patterns seen at
two of the thirteen root name servers and all thirteen com/net TLD
name servers.

Table of Contents

1. Introduction
   1.1. A Note about Terminology in This Memo
   1.2. Key Words
2. Observed Iterative Resolver Misbehavior
   2.1. Aggressive Requerying for Delegation Information
      2.1.1. Recommendation
   2.2. Repeated Queries to Lame Servers
      2.2.1. Recommendation
   2.3. Inability to Follow Multiple Levels of Indirection
      2.3.1. Recommendation
   2.4. Aggressive Retransmission when Fetching Glue
      2.4.1. Recommendation
   2.5. Aggressive Retransmission behind Firewalls
      2.5.1. Recommendation
   2.6. Misconfigured NS Records
      2.6.1. Recommendation
Larson & Barber          Best Current Practice                 [Page 1]
RFC 4697       Observed DNS Resolution Misbehavior     October 2006
   2.7. Name Server Records with Zero TTL
      2.7.1. Recommendation
   2.8. Unnecessary Dynamic Update Messages
      2.8.1. Recommendation
   2.9. Queries for Domain Names Resembling IPv4 Addresses
      2.9.1. Recommendation
   2.10. Misdirected Recursive Queries
      2.10.1. Recommendation
   2.11. Suboptimal Name Server Selection Algorithm
      2.11.1. Recommendation
3. Security Considerations
4. Acknowledgements
5. Internationalization Considerations
6. References
   6.1. Normative References
   6.2. Informative References

1. Introduction

Observation of query traffic received by two root name servers and
the thirteen com/net Top-Level Domain (TLD) name servers has revealed
that a large proportion of the total traffic often consists of
"requeries". A requery is the same question (<QNAME, QTYPE, QCLASS>)
asked repeatedly at an unexpectedly high rate. We have observed
requeries from both a single IP address and multiple IP addresses
(i.e., the same query received simultaneously from multiple IP
addresses).

By analyzing requery events, we have found that the cause of the
duplicate traffic is almost always a deficient iterative resolver,
stub resolver, or application implementation combined with an
operational anomaly. The implementation deficiencies we have
identified to date include well-intentioned recovery attempts gone
awry, insufficient caching of failures, early abort when multiple
levels of indirection must be followed, and aggressive retry by stub
resolvers or applications. Anomalies that we have seen trigger
requery events include lame delegations, unusual glue records, and
anything that makes all authoritative name servers for a zone
unreachable (Denial of Service (DoS) attacks, crashes, maintenance,
routing failures, congestion, etc.).

In the following sections, we provide a detailed explanation of the
observed behavior and recommend changes that will reduce the requery
rate. None of the changes recommended affects the core DNS protocol
specification; instead, this document consists of guidelines to
implementors of iterative resolvers.

1.1. A Note about Terminology in This Memo

To recast an old saying about standards, the nice thing about DNS
terms is that there are so many of them to choose from. Writing or
talking about DNS can be difficult and can cause confusion resulting
from a lack of agreed-upon terms for its various components. Further
complicating matters are implementations that combine multiple roles
into one piece of software, which makes naming the result
problematic. An example is the entity that accepts recursive
queries, issues iterative queries as necessary to resolve the initial
recursive query, caches responses it receives, and which is also able
to answer questions about certain zones authoritatively. This entity
is an iterative resolver combined with an authoritative name server
and is often called a "recursive name server" or a "caching name
server".

This memo is concerned principally with the behavior of iterative
resolvers, which are typically found as part of a recursive name
server. This memo uses the more precise term "iterative resolver",
because the focus is usually on that component. In instances where
the name server role of this entity requires mentioning, this memo
uses the term "recursive name server". As an example of the
difference, the name server component of a recursive name server
receives DNS queries and the iterative resolver component sends
queries.

The advent of IPv6 requires mentioning AAAA records as well as A
records when discussing glue. To avoid continuous repetition and
qualification, this memo uses the general term "address record" to
encompass both A and AAAA records when a particular situation is
relevant to both types.

1.2. Key Words

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1].

2. Observed Iterative Resolver Misbehavior

2.1. Aggressive Requerying for Delegation Information

There can be times when every name server in a zone's NS RRSet is
unreachable (e.g., during a network outage), unavailable (e.g., the
name server process is not running on the server host), or
misconfigured (e.g., the name server is not authoritative for the
given zone, also known as "lame"). Consider an iterative resolver
that attempts to resolve a query for a domain name in such a zone and
discovers that none of the zone's name servers can provide an answer.
We have observed a recursive name server implementation whose
iterative resolver then verifies the zone's NS RRSet in its cache by
querying for the zone's delegation information: it sends a query for
the zone's NS RRSet to one of the parent zone's name servers. (Note
that queries with QTYPE=NS are not required by the standard
resolution algorithm described in Section 4.3.2 of RFC 1034 [2].
These NS queries represent this implementation's addition to that
algorithm.)

For example, suppose that "example.com" has the following NS RRSet:

   example.com.   IN   NS   ns1.example.com.
   example.com.   IN   NS   ns2.example.com.

Upon receipt of a query for "www.example.com" and assuming that
neither "ns1.example.com" nor "ns2.example.com" can provide an
answer, this iterative resolver implementation immediately queries a
"com" zone name server for the "example.com" NS RRSet to verify that
it has the proper delegation information. This implementation
performs this query to a zone's parent zone for each recursive query
it receives that fails because of a completely unresponsive set of
name servers for the target zone. Consider the effect when a popular
zone experiences a catastrophic failure of all its name servers: now
every recursive query for domain names in that zone sent to this
recursive name server implementation results in a query to the failed
zone's parent name servers. On one occasion when several dozen
popular zones became unreachable, the query load on the com/net name
servers increased by 50%.

We believe this verification query is not reasonable. Consider the
circumstances: when an iterative resolver is resolving a query for a
domain name in a zone it has not previously searched, it uses the
list of name servers in the referral from the target zone's parent.
If on its first attempt to search the target zone, none of the name
servers in the referral is reachable, a verification query to the
parent would be pointless: this query to the parent would come so
quickly on the heels of the referral that it would be almost certain
to contain the same list of name servers. The chance of discovering
any new information is slim.

The other possibility is that the iterative resolver successfully
contacts one of the target zone's name servers and then caches the NS
RRSet from the authority section of a response, the proper behavior
according to Section 5.4.1 of RFC 2181 [3], because the NS RRSet from
the target zone is more trustworthy than delegation information from
the parent zone. If, while processing a subsequent recursive query,
the iterative resolver discovers that none of the name servers
specified in the cached NS RRSet is available or authoritative,
querying the parent would be wrong. An NS RRSet from the parent zone
would now be less trustworthy than data already in the cache.

For this query of the parent zone to be useful, the target zone's
entire set of name servers would have to change AND the former set of
name servers would have to be deconfigured or decommissioned AND the
delegation information in the parent zone would have to be updated
with the new set of name servers, all within the Time to Live (TTL)
of the target zone's NS RRSet. We believe this scenario is uncommon:
administrative best practices dictate that changes to a zone's set of
name servers happen gradually when at all possible, with servers
removed from the NS RRSet left authoritative for the zone as long as
possible. The scenarios that we can envision that would benefit from
the parent requery behavior do not outweigh its damaging effects.

This section should not be understood to claim that all queries to a
zone's parent are bad. In some cases, such queries are not only
reasonable but required. Consider the situation when required
information, such as the address of a name server (i.e., the address
record corresponding to the RDATA of an NS record), has timed out of
an iterative resolver's cache before the corresponding NS record. If
the name of the name server is below the apex of the zone, then the
name server's address record is only available as glue in the parent
zone. For example, consider this NS record:

   example.com.   IN   NS   ns.example.com.

If a cache has this NS record but not the address record for
"ns.example.com", it is unable to contact the "example.com" zone
directly and must query the "com" zone to obtain the address record.
Note, however, that such a query would not have QTYPE=NS according to
the standard resolution algorithm.

2.1.1. Recommendation

An iterative resolver MUST NOT send a query for the NS RRSet of a
non-responsive zone to any of the name servers for that zone's parent
zone. For the purposes of this injunction, a non-responsive zone is
defined as a zone for which every name server listed in the zone's NS
RRSet:

1. is not authoritative for the zone (i.e., lame), or

2. returns a server failure response (RCODE=2), or

3. is dead or unreachable according to Section 7.2 of RFC 2308 [4].
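
The definition above can be sketched as a small predicate. This is an
illustration only, not part of any resolver: the names ServerStatus,
zone_is_non_responsive, and parent_ns_query_forbidden are hypothetical,
and the per-server status is assumed to have been determined elsewhere.

```python
from enum import Enum, auto


class ServerStatus(Enum):
    """Hypothetical per-server outcome recorded by a resolver."""
    OK = auto()
    LAME = auto()         # not authoritative for the zone
    SERVFAIL = auto()     # answered with RCODE=2
    UNREACHABLE = auto()  # dead or unreachable (RFC 2308, Section 7.2)


FAILED = {ServerStatus.LAME, ServerStatus.SERVFAIL, ServerStatus.UNREACHABLE}


def zone_is_non_responsive(ns_status):
    """ns_status maps each name server in the zone's NS RRSet to a status.

    The zone is non-responsive only if EVERY listed server has failed.
    An empty mapping is treated as "unknown", not as non-responsive.
    """
    return bool(ns_status) and all(s in FAILED for s in ns_status.values())


def parent_ns_query_forbidden(ns_status):
    """True when a QTYPE=NS query to the parent MUST NOT be sent."""
    return zone_is_non_responsive(ns_status)
```

A resolver built along these lines would consult the predicate before
any attempt to "verify" delegation information with the parent zone.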

2.2. Repeated Queries to Lame Servers

Section 2.1 describes a catastrophic failure: when every name server
for a zone is unable to provide an answer for one reason or another.
A more common occurrence is when a subset of a zone's name servers is
unavailable or misconfigured. Different failure modes have different
expected durations. Some symptoms indicate problems that are
potentially transient, for example, various types of ICMP unreachable
messages because a name server process is not running or a host or
network is unreachable, or a complete lack of a response to a query.
Such responses could be the result of a host rebooting or temporary
outages; these events do not necessarily require any human
intervention and can be reasonably expected to be temporary.

Other symptoms clearly indicate a condition requiring human
intervention, such as a lame server: if a name server is misconfigured
and not authoritative for a zone delegated to it, it is reasonable to
assume that this condition has potential to last longer than
unreachability or unresponsiveness. Consequently, repeated queries
to known lame servers are not useful. In this case of a condition
with potential to persist for a long time, a better practice would be
to maintain a list of known lame servers and avoid querying them
repeatedly in a short interval.

It should also be noted, however, that some authoritative name server
implementations appear to be lame only for queries of certain types
as described in RFC 4074 [5]. In this case, it makes sense to retry
the "lame" servers for other types of queries, particularly when all
known authoritative name servers appear to be "lame".

2.2.1. Recommendation

Iterative resolvers SHOULD cache name servers that they discover are
not authoritative for zones delegated to them (i.e., lame servers).
If this caching is performed, lame servers MUST be cached against the
specific query tuple <zone name, class, server IP address>. Zone
name can be derived from the owner name of the NS record that was
referenced to query the name server that was discovered to be lame.

Implementations that perform lame server caching MUST refrain from
sending queries to known lame servers for a configurable time
interval after the server is discovered to be lame. A minimum
interval of thirty minutes is RECOMMENDED.

An exception to this recommendation occurs if all name servers for a
zone are marked lame. In that case, the iterative resolver SHOULD
temporarily ignore the servers' lameness status and query one or more
servers. This behavior is a workaround for the type-specific
lameness issue described in the previous section.

Implementors should take care not to make lame server avoidance logic
overly broad: note that a name server could be lame for a parent zone
but not a child zone, e.g., lame for "example.com" but properly
authoritative for "sub.example.com". Therefore, a name server should
not be automatically considered lame for subzones. In the case
above, even if a name server is known to be lame for "example.com",
it should be queried for QNAMEs at or below "sub.example.com" if an
NS record indicates that it should be authoritative for that zone.
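
One possible shape for such a cache is sketched below. It is a minimal
illustration under stated assumptions, not a description of any
existing implementation: the class LameServerCache and its methods are
hypothetical, entries are keyed on the <zone name, class, server IP
address> tuple, the default interval is the thirty-minute minimum, and
an injectable clock is used only so that the behavior can be exercised
without waiting.

```python
import time

LAME_TTL_SECONDS = 30 * 60  # RECOMMENDED minimum of thirty minutes


class LameServerCache:
    """Remembers lame servers per <zone name, class, server IP address>."""

    def __init__(self, ttl=LAME_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._expiry = {}  # (zone, qclass, server_ip) -> expiry time

    @staticmethod
    def _key(zone, qclass, server_ip):
        # Normalize case and the trailing dot so lookups are consistent.
        return (zone.lower().rstrip(".") + ".", qclass.upper(), server_ip)

    def mark_lame(self, zone, qclass, server_ip):
        key = self._key(zone, qclass, server_ip)
        self._expiry[key] = self._clock() + self._ttl

    def is_lame(self, zone, qclass, server_ip):
        key = self._key(zone, qclass, server_ip)
        expiry = self._expiry.get(key)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expiry[key]  # interval elapsed; try the server again
            return False
        return True

    def candidates(self, zone, qclass, server_ips):
        """Servers to query: skip lame ones unless ALL are marked lame."""
        usable = [ip for ip in server_ips
                  if not self.is_lame(zone, qclass, ip)]
        return usable if usable else list(server_ips)
```

Because the zone name is part of the key, a server marked lame for
"example.com" is not treated as lame for "sub.example.com", and the
fallback in candidates() reflects the all-servers-lame exception.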

2.3. Inability to Follow Multiple Levels of Indirection

Some iterative resolver implementations are unable to follow
sufficient levels of indirection. For example, consider the
following delegations:

   foo.example.        IN   NS   ns1.example.com.
   foo.example.        IN   NS   ns2.example.com.

   example.com.        IN   NS   ns1.test.example.net.
   example.com.        IN   NS   ns2.test.example.net.

   test.example.net.   IN   NS   ns1.test.example.net.
   test.example.net.   IN   NS   ns2.test.example.net.

An iterative resolver resolving the name "www.foo.example" must
follow two levels of indirection, first obtaining address records for
"ns1.test.example.net" or "ns2.test.example.net" in order to obtain
address records for "ns1.example.com" or "ns2.example.com" in order
to query those name servers for the address records of
"www.foo.example". Although this situation may appear contrived, we
have seen multiple similar occurrences and expect more as new generic
top-level domains (gTLDs) become active. We anticipate many zones in
new gTLDs will use name servers in existing gTLDs, increasing the
number of delegations using out-of-zone name servers.

2.3.1. Recommendation

Clearly, constructing a delegation that relies on multiple levels of
indirection is not a good administrative practice. However, the
practice is widespread enough to require that iterative resolvers be
able to cope with it. Iterative resolvers SHOULD be able to handle
arbitrary levels of indirection resulting from out-of-zone name
servers. Iterative resolvers SHOULD implement a level-of-effort
counter to avoid loops or otherwise performing too much work in
resolving pathological cases.

A best practice that avoids this entire issue of indirection is to
name one or more of a zone's name servers in the zone itself. For
example, if the zone is named "example.com", consider naming some of
the name servers "ns{1,2,...}.example.com" (or similar).
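
A level-of-effort counter can be illustrated with a toy model in which
no network traffic is sent. Everything here is an assumption made for
the sketch: delegations, glue, and zone_records are plain dictionaries
standing in for referrals, glue, and authoritative data, and Budget,
EffortExceeded, enclosing_zone, and resolve_address are hypothetical
names. Each lookup spends one unit of a shared budget, so indirection
of any depth is followed until the budget runs out.

```python
class EffortExceeded(RuntimeError):
    """Raised when one resolution attempt uses up its work budget."""


class Budget:
    def __init__(self, units):
        self.remaining = units

    def spend(self, what):
        if self.remaining <= 0:
            raise EffortExceeded(what)
        self.remaining -= 1


def enclosing_zone(name, delegations):
    """Longest known delegated zone at or above name."""
    labels = name.rstrip(".").split(".")
    for i in range(len(labels)):
        zone = ".".join(labels[i:]) + "."
        if zone in delegations:
            return zone
    raise LookupError("no known delegation for " + name)


def resolve_address(name, delegations, glue, zone_records, budget):
    """Return an address for name, following out-of-zone name servers."""
    budget.spend(name)
    if name in glue:  # address supplied with the referral
        return glue[name]
    zone = enclosing_zone(name, delegations)
    for ns_name in delegations[zone]:
        try:
            # A server of the zone must be located before it can be asked.
            resolve_address(ns_name, delegations, glue, zone_records, budget)
        except LookupError:
            continue  # this server cannot be located; try the next one
        if name in zone_records:
            return zone_records[name]
        raise LookupError(name + " does not exist")
    raise LookupError("no reachable name server for " + zone)
```

With the delegations from Section 2.3, resolving "www.foo.example."
spends three units. A pair of zones whose name servers are named only
in each other ends with EffortExceeded rather than an endless loop.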

2.4. Aggressive Retransmission when Fetching Glue

When an authoritative name server responds with a referral, it
includes NS records in the authority section of the response.
According to the algorithm in Section 4.3.2 of RFC 1034 [2], the name
server should also "put whatever addresses are available into the
additional section, using glue RRs if the addresses are not available
from authoritative data or the cache." Some name server
implementations take this address inclusion a step further with a
feature called "glue fetching". A name server that implements glue
fetching attempts to include address records for every NS record in
the authority section. If necessary, the name server issues multiple
queries of its own to obtain any missing address records.

Problems with glue fetching can arise in the context of
"authoritative-only" name servers, which only serve authoritative
data and ignore requests for recursion. Such an entity will not
normally generate any queries of its own. Instead it answers non-
recursive queries from iterative resolvers looking for information in
zones it serves. With glue fetching enabled, however, an
authoritative server invokes an iterative resolver to look up an
unknown address record to complete the additional section of a
response.

We have observed situations where the iterative resolver of a glue-
fetching name server can send queries that reach other name servers,
but is apparently prevented from receiving the responses. For
example, perhaps the name server is authoritative-only and therefore
its administrators expect it to receive only queries and not
responses. Perhaps unaware of glue fetching and presuming that the
name server's iterative resolver will generate no queries, its
administrators place the name server behind a network device that
prevents it from receiving responses. If this is the case, all
glue-fetching queries will go unanswered.

We have observed name server implementations whose iterative
resolvers retry excessively when glue-fetching queries are
unanswered. A single com/net name server has received hundreds of
queries per second from a single such source. Judging from the
specific queries received and based on additional analysis, we
believe these queries result from overly aggressive glue fetching.

2.4.1. Recommendation

Implementors whose name servers support glue fetching SHOULD take
care to avoid sending queries at excessive rates. Implementations
SHOULD support throttling logic to detect when queries are sent but
no responses are received.
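
Throttling logic of this kind could take many forms; the memo does not
prescribe one. The sketch below assumes one simple policy: count
consecutive unanswered queries per destination and, once a threshold
is reached, impose an exponentially growing delay up to a cap. The
class QueryThrottle, its parameters, and their default values are all
hypothetical choices made for illustration.

```python
import time


class QueryThrottle:
    """Backs off from a server that receives queries but never answers."""

    def __init__(self, threshold=3, base_delay=1.0, max_delay=300.0,
                 clock=time.monotonic):
        self._threshold = threshold    # unanswered queries tolerated
        self._base_delay = base_delay  # first delay, in seconds
        self._max_delay = max_delay    # upper bound on the delay
        self._clock = clock
        self._unanswered = {}          # server -> consecutive unanswered
        self._next_allowed = {}        # server -> earliest next send time

    def may_send(self, server):
        earliest = self._next_allowed.get(server, float("-inf"))
        return self._clock() >= earliest

    def record_sent(self, server):
        count = self._unanswered.get(server, 0) + 1
        self._unanswered[server] = count
        if count >= self._threshold:
            # Double the delay for each further unanswered query.
            exponent = min(count - self._threshold, 32)
            delay = min(self._base_delay * 2 ** exponent, self._max_delay)
            self._next_allowed[server] = self._clock() + delay

    def record_response(self, server):
        # Any response shows the path works in both directions.
        self._unanswered.pop(server, None)
        self._next_allowed.pop(server, None)
```

A resolver would call may_send() before each transmission and
record_sent() after it. A server behind a device that drops responses
then sees the query rate fall quickly rather than stay constant.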

2.5. Aggressive Retransmission behind Firewalls

A common occurrence and one of the largest sources of repeated
queries at the com/net and root name servers appears to result from
resolvers behind misconfigured firewalls. In this situation, an
iterative resolver is apparently allowed to send queries through a
firewall to other name servers, but not receive the responses. The
result is more queries than necessary because of retransmission, all
of which are useless because the responses are never received. Just
as with the glue-fetching scenario described in Section 2.4, the
queries are sometimes sent at excessive rates. To make matters
worse, sometimes the responses, sent in reply to legitimate queries,
trigger an alarm on the originator's intrusion detection system. We
are frequently contacted by administrators responding to such alarms
who believe our name servers are attacking their systems.

Not only do some resolvers in this situation retransmit queries at an
excessive rate, but they continue to do so for days or even weeks.
This scenario could result from an organization with multiple
recursive name servers, only a subset of whose iterative resolvers'
traffic is improperly filtered in this manner. Stub resolvers in the
organization could be configured to query multiple recursive name
servers. Consider the case where a stub resolver queries a filtered
recursive name server first. The iterative resolver of this
recursive name server sends one or more queries whose replies are
filtered, so it cannot respond to the stub resolver, which times out.
Then the stub resolver retransmits to a recursive name server that is
able to provide an answer. Since resolution ultimately succeeds, the
underlying problem might not be recognized or corrected. A popular
stub resolver implementation has a very aggressive retransmission
schedule, including simultaneous queries to multiple recursive name
servers, which could explain how such a situation could persist
without being detected.

2.5.1. Recommendation

The most obvious recommendation is that administrators SHOULD take
care not to place iterative resolvers behind a firewall that allows
queries, but not the resulting replies, to pass through.

Iterative resolvers SHOULD take care to avoid sending queries at
excessive rates. Implementations SHOULD support throttling logic to
detect when queries are sent but no responses are received.

2.6. Misconfigured NS Records

Sometimes a zone administrator forgets to add the trailing dot on the
domain names in the RDATA of a zone's NS records. Consider this
fragment of the zone file for "example.com":

   $ORIGIN example.com.
   example.com.   3600   IN   NS   ns1.example.com   ; Note missing
   example.com.   3600   IN   NS   ns2.example.com   ; trailing dots

The zone's authoritative servers will parse the NS RDATA as
"ns1.example.com.example.com" and "ns2.example.com.example.com" and
return NS records with this incorrect RDATA in responses, including
typically the authority section of every response containing records
from the "example.com" zone.

Now consider a typical sequence of queries. An iterative resolver
attempting to resolve address records for "www.example.com" with no
cached information for this zone will query a "com" authoritative
server. The "com" server responds with a referral to the
"example.com" zone, consisting of NS records with valid RDATA and
associated glue records. (This example assumes that the
"example.com" zone delegation information is correct in the "com"
zone.) The iterative resolver caches the NS RRSet from the "com"
server and follows the referral by querying one of the "example.com"
authoritative servers. This server responds with the
"www.example.com" address record in the answer section and,
typically, the "example.com" NS records in the authority section and,
if space in the message remains, glue address records in the
additional section. According to Section 5.4.1 of RFC 2181 [3], NS
records in the authority section of an authoritative answer are more
trustworthy than NS records from the authority section of a non-
authoritative answer. Thus, the "example.com" NS RRSet just received
from the "example.com" authoritative server overrides the
"example.com" NS RRSet received moments ago from the "com"
authoritative server.

But the "example.com" zone contains the erroneous NS RRSet as shown
in the example above. Subsequent queries for names in "example.com"
will cause the iterative resolver to attempt to use the incorrect NS
records and so it will try to resolve the nonexistent names
"ns1.example.com.example.com" and "ns2.example.com.example.com". In
this example, since all of the zone's name servers are named in the
zone itself (i.e., "ns1.example.com.example.com" and
"ns2.example.com.example.com" both end in "example.com") and all are
bogus, the iterative resolver cannot reach any "example.com" name
servers. Therefore, attempts to resolve these names result in
address record queries to the "com" authoritative servers. Queries
for such obviously bogus glue address records occur frequently at the
com/net name servers.

2.6.1. Recommendation

An authoritative server can detect this situation. A trailing dot
missing from an NS record's RDATA always results by definition in a
name server name that exists somewhere under the apex of the zone
that the NS record appears in. Note that further levels of
delegation are possible, so a missing trailing dot could
inadvertently create a name server name that actually exists in a
subzone.

An authoritative name server SHOULD issue a warning when one of a
zone's NS records references a name server below the zone's apex when
a corresponding address record does not exist in the zone AND there
are no delegated subzones where the address record could exist.
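
The check described above might be sketched as follows. The sketch
makes several simplifying assumptions: records are supplied as
(owner, type, rdata) tuples with names exactly as written in the zone
file, only names lacking a trailing dot are expanded against the
origin (the "@" shorthand is not handled), and the function names
absolute, is_below, and ns_target_warnings are hypothetical.

```python
def absolute(name, origin):
    """Expand a zone-file name: no trailing dot means relative to origin."""
    return name if name.endswith(".") else name + "." + origin


def is_below(name, ancestor):
    """True if name is a proper descendant of ancestor."""
    return name != ancestor and name.endswith("." + ancestor)


def ns_target_warnings(origin, records):
    """Warn about apex NS targets below the apex that cannot resolve."""
    origin = origin.lower()
    rrs = [(absolute(owner, origin).lower(), rtype.upper(), rdata)
           for owner, rtype, rdata in records]
    address_owners = {o for o, t, _ in rrs if t in ("A", "AAAA")}
    subzones = {o for o, t, _ in rrs if t == "NS" and o != origin}
    warnings = []
    for owner, rtype, rdata in rrs:
        if rtype != "NS" or owner != origin:
            continue
        target = absolute(rdata, origin).lower()
        if not is_below(target, origin):
            continue  # out-of-zone name server; nothing to check here
        if target in address_owners:
            continue  # an address record exists in the zone
        if any(target == z or is_below(target, z) for z in subzones):
            continue  # the address record could exist in a subzone
        warnings.append("NS target " + target + " has no address record "
                        "in " + origin + " (missing trailing dot?)")
    return warnings
```

Applied to the zone file fragment in Section 2.6, the function reports
both "ns1.example.com.example.com." and "ns2.example.com.example.com.".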

2.7. Name Server Records with Zero TTL

Sometimes a popular com/net subdomain's zone is configured with a TTL
of zero on the zone's NS records, which prohibits these records from
being cached and will result in a higher query volume to the zone's
authoritative servers. The zone's administrator should understand
the consequences of such a configuration and provision resources
accordingly. A zero TTL on the zone's NS RRSet, however, carries
additional consequences beyond the zone itself: if an iterative
resolver cannot cache a zone's NS records because of a zero TTL, it
will be forced to query that zone's parent's name servers each time
it resolves a name in the zone. The com/net authoritative servers do
see an increased query load when a popular com/net subdomain's zone
is configured with a TTL of zero on the zone's NS records.

A zero TTL on an RRSet expected to change frequently is extreme but
permissible. A zone's NS RRSet is a special case, however, because
changes to it must be coordinated with the zone's parent. In most
zone parent/child relationships that we are aware of, there is
typically some delay involved in effecting changes. Furthermore,
changes to the set of a zone's authoritative name servers (and
therefore to the zone's NS RRSet) are typically relatively rare:
providing reliable authoritative service requires a reasonably stable
set of servers. Therefore, an extremely low or zero TTL on a zone's
NS RRSet rarely makes sense, except in anticipation of an upcoming
change. In this case, when the zone's administrator has planned a
change and does not want iterative resolvers throughout the Internet
to cache the NS RRSet for a long period of time, a low TTL is
reasonable.

2.7.1. Recommendation

Because of the additional load placed on a zone's parent's
authoritative servers resulting from a zero TTL on a zone's NS RRSet,
under such circumstances authoritative name servers SHOULD issue a
warning when loading a zone.
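
A load-time check for this condition is small. In the sketch below,
the function name zero_ttl_ns_warning and the (owner, ttl, type)
record tuples are assumptions made for illustration, and owner names
are assumed to be absolute.

```python
def zero_ttl_ns_warning(origin, records):
    """Return a warning string if the apex NS RRSet has TTL 0, else None."""
    origin = origin.lower()
    for owner, ttl, rtype in records:
        if rtype.upper() == "NS" and owner.lower() == origin and ttl == 0:
            return ("warning: NS RRSet of " + origin + " has a TTL of "
                    "zero; resolvers must query the parent zone for "
                    "every lookup")
    return None
```

Only the apex NS RRSet is examined here, since that is the RRSet whose
TTL determines how often resolvers return to the parent's servers.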

2.8. Unnecessary Dynamic Update Messages

The UPDATE message specified in RFC 2136 [6] allows an authorized
agent to update a zone's data on an authoritative name server using a
DNS message sent over the network. Consider the case of an agent
desiring to add a particular resource record. Because of zone cuts,
the agent does not necessarily know the proper zone to which the
record should be added. The dynamic update process requires that the
agent determine the appropriate zone so the UPDATE message can be
sent to one of the zone's authoritative servers (typically the
primary master as specified in the zone's Start of Authority (SOA)
record's MNAME field).

The appropriate zone to update is the closest enclosing zone, which
cannot be determined only by inspecting the domain name of the record
to be updated, since zone cuts can occur anywhere. One way to
determine the closest enclosing zone entails walking up the name
space tree by sending repeated UPDATE messages until successful. For
example, consider an agent attempting to add an address record with
the name "foo.bar.example.com". The agent could first attempt to
update the "foo.bar.example.com" zone. If the attempt failed, the
update could be directed to the "bar.example.com" zone, then the
"example.com" zone, then the "com" zone, and finally the root zone.

A popular dynamic agent follows this algorithm. The result is many
UPDATE messages received by the root name servers, the com/net
authoritative servers, and presumably other TLD authoritative
servers. A valid question is why the algorithm proceeds to send
updates all the way to TLD and root name servers. This behavior is
not entirely unreasonable: in enterprise DNS architectures with an
"internal root" design, there could conceivably be private, non-
public TLD or root zones that would be the appropriate targets for a
dynamic update.

A significant deficiency with this algorithm is that knowledge of a
given UPDATE message's failure is not helpful in directing future
UPDATE messages to the appropriate servers. A better algorithm would
be to find the closest enclosing zone by walking up the name space
with queries for SOA or NS rather than "probing" with UPDATE
messages. Once the appropriate zone is found, an UPDATE message can
be sent. In addition, the results of these queries can be cached to
aid in determining the closest enclosing zones for future updates.
Once the closest enclosing zone is determined with this method, the
update will either succeed or fail and there is no need to send
further updates to higher-level zones. The important point is that
walking up the tree with queries yields cacheable information,
whereas walking up the tree by sending UPDATE messages does not.
697 2.8.1. Recommendation
699 Dynamic update agents SHOULD send SOA or NS queries to progressively
700 higher-level names to find the closest enclosing zone for a given
701 name to update. Only after the appropriate zone is found should the
702 client send an UPDATE message to one of the zone's authoritative
703 servers. Update clients SHOULD NOT "probe" using UPDATE messages by
704 walking up the tree to progressively higher-level zones.
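The query-based walk recommended here can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: has_soa is a hypothetical caller-supplied function standing in for a real SOA query (it returns True when the queried name is a zone apex), and zone_cache is an assumed in-memory cache that, for simplicity, ignores TTLs.

```python
from typing import Callable, Dict, Iterator


def ancestors(name: str) -> Iterator[str]:
    """Yield name, then each parent, ending with the root zone "."."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    for i in range(len(labels)):
        yield ".".join(labels[i:]) + "."
    yield "."


def closest_enclosing_zone(
    name: str,
    has_soa: Callable[[str], bool],
    zone_cache: Dict[str, str],
) -> str:
    """Find the zone to update using queries only; no UPDATE is sent here.

    has_soa is a hypothetical stand-in for sending an SOA query and
    checking whether the answer shows that the name is a zone apex.
    """
    visited = []
    for candidate in ancestors(name):
        # A cached answer for this candidate ends the walk early.
        zone = zone_cache.get(candidate)
        if zone is None and has_soa(candidate):
            zone = candidate
        if zone is not None:
            # Remember the result for every name visited on the way up,
            # so later updates under the same zone need fewer queries.
            for seen in visited:
                zone_cache[seen] = zone
            zone_cache[candidate] = zone
            return zone
        visited.append(candidate)
    return "."  # only reached if even the root lookup fails
```

With a shared zone_cache, a second update for a sibling name such as "baz.bar.example.com" needs only one new query before the cached answer for "bar.example.com." ends the walk. A single UPDATE message is then sent to the zone that was found.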
706 2.9. Queries for Domain Names Resembling IPv4 Addresses
708 The root name servers receive a significant number of A record
709 queries where the QNAME looks like an IPv4 address. The source of
710 these queries is unknown. It could be attributed to situations where
711 a user believes that an application will accept either a domain name
712 or an IP address in a given configuration option. The user enters an
713 IP address, but the application assumes that any input is a domain
714 name and attempts to resolve it, resulting in an A record lookup.
715 There could also be applications that produce such queries in a
716 misguided attempt to reverse map IP addresses.
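An application that accepts either form in a configuration option can avoid generating such queries by checking whether the value is an IP address literal before handing it to the resolver. The sketch below is an illustration only, using Python's standard ipaddress module; the function name needs_dns_lookup is hypothetical.

```python
import ipaddress


def needs_dns_lookup(value: str) -> bool:
    """Return False when value is an IPv4 or IPv6 literal, True otherwise."""
    try:
        ipaddress.ip_address(value.strip())
    except ValueError:
        return True   # not an address literal; treat it as a domain name
    return False      # already an address; no A record query is needed
```

Only values for which needs_dns_lookup returns True would be passed to the resolver; address literals are used directly.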
718 These queries result in Name Error (RCODE=3) responses. An iterative
719 resolver can negatively cache such responses, but each response
720 requires a separate cache entry; i.e., a negative cache entry for the
721 domain name "192.0.2.1" does not prevent a subsequent query for the
722 domain name "192.0.2.2".
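The limitation can be seen in a toy negative cache keyed by QNAME. This is only an illustration under simplifying assumptions (a fixed TTL supplied by the caller, no QCLASS handling); the class name NegativeCache is hypothetical.

```python
import time
from typing import Dict, Optional


class NegativeCache:
    """Toy cache of Name Error (RCODE=3) results, keyed by QNAME."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._expires: Dict[str, float] = {}

    def add(self, qname: str, now: Optional[float] = None) -> None:
        """Record that qname does not exist, until the TTL runs out."""
        now = time.monotonic() if now is None else now
        self._expires[qname.lower()] = now + self.ttl_seconds

    def is_cached(self, qname: str, now: Optional[float] = None) -> bool:
        """Return True only for an unexpired entry for this exact name."""
        now = time.monotonic() if now is None else now
        expiry = self._expires.get(qname.lower())
        return expiry is not None and now < expiry
```

After add("192.0.2.1"), a lookup for "192.0.2.2" still misses the cache, so the query for that name is sent to a root name server anyway.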
735 2.9.1. Recommendation
737 It would be desirable for the root name servers not to have to answer
738 these queries: they unnecessarily consume CPU resources and network
739 bandwidth. A possible solution is to delegate these numeric TLDs
740 from the root zone to a separate set of servers to absorb the
741 traffic. The "black hole servers" used by the AS 112 Project
742 (http://www.as112.net), which are currently delegated the
743 in-addr.arpa zones corresponding to RFC 1918 [7] private use address
744 space, would be a possible choice to receive these delegations. Of
745 course, the proper and usual root zone change procedures would have
746 to be followed to make such a change to the root zone.
748 2.10. Misdirected Recursive Queries
750 The root name servers receive a significant number of recursive
751 queries (i.e., queries with the Recursion Desired (RD) bit set in the
header). Since none of the root servers offers recursion, a root
server ignores the request for recursion, and its response probably
does not contain the data the querier anticipated. Some of these
queries result from users
756 configuring stub resolvers to query a root server. (This situation
757 is not hypothetical: we have received complaints from users when this
758 configuration does not work as hoped.) Of course, users should not
759 direct stub resolvers to use name servers that do not offer
760 recursion, but we are not aware of any stub resolver implementation
761 that offers any feedback to the user when so configured, aside from
762 simply "not working".
764 2.10.1. Recommendation
766 When the IP address of a name server that supposedly offers recursion
767 is configured in a stub resolver using an interactive user interface,
768 the resolver could send a test query to verify that the server indeed
769 supports recursion (i.e., verify that the response has the RA bit set
in the header). The user could be notified immediately if the server
is non-recursive.
773 The stub resolver could also report an error, either through a user
774 interface or in a log file, if the queried server does not support
775 recursion. Error reporting SHOULD be throttled to avoid a
notification or log message for every response from a non-recursive
server.
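The test query suggested in Section 2.10.1 can be sketched with only the Python standard library. This is a hedged illustration, not a complete stub resolver: the function names are hypothetical, the choice of query name is left to the caller, and sending the query over UDP and matching the response ID are omitted. The flag values follow the DNS header layout, in which RD is 0x0100 and RA is 0x0080 within the 16-bit flags field.

```python
import struct

RD_FLAG = 0x0100  # Recursion Desired, set by the querier
RA_FLAG = 0x0080  # Recursion Available, set by the responder
QR_FLAG = 0x8000  # set in responses, clear in queries


def build_test_query(qname: str, query_id: int) -> bytes:
    """Build a minimal A/IN query for qname with the RD bit set."""
    header = struct.pack("!HHHHHH", query_id, RD_FLAG, 1, 0, 0, 0)
    question = b""
    for label in qname.rstrip(".").split("."):
        if label:
            encoded = label.encode("ascii")
            question += bytes([len(encoded)]) + encoded
    question += b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def offers_recursion(response: bytes) -> bool:
    """Return True when a DNS response has the RA bit set in its header."""
    if len(response) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    _, flags = struct.unpack("!HH", response[:4])
    if not flags & QR_FLAG:
        raise ValueError("message is a query, not a response")
    return bool(flags & RA_FLAG)
```

A configuration interface could send the output of build_test_query to the configured server and warn the user once, rather than on every response, when offers_recursion returns False.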
791 2.11. Suboptimal Name Server Selection Algorithm
793 An entire document could be devoted to the topic of problems with
794 different implementations of the recursive resolution algorithm. The
795 entire process of recursion is woefully under-specified, requiring
796 each implementor to design an algorithm. Sometimes implementors make
797 poor design choices that could be avoided if a suggested algorithm
and best practices were documented, but that is a topic for another
document.
801 Some deficiencies cause significant operational impact and are
802 therefore worth mentioning here. One of these is name server
803 selection by an iterative resolver. When an iterative resolver wants
804 to contact one of a zone's authoritative name servers, how does it
805 choose from the NS records listed in the zone's NS RRSet? If the
806 selection mechanism is suboptimal, queries are not spread evenly
807 among a zone's authoritative servers. The details of the selection
808 mechanism are up to the implementor, but we offer some suggestions.
810 2.11.1. Recommendation
This list is not exhaustive, but reflects the changes that would
813 produce the most impact in terms of reducing disproportionate query
814 load among a zone's authoritative servers. That is, these changes
815 would help spread the query load evenly.
817 o Do not make assumptions based on NS RRSet order: all NS RRs SHOULD
818 be treated equally. (In the case of the "com" zone, for example,
819 most of the root servers return the NS record for
820 "a.gtld-servers.net" first in the authority section of referrals.
821 Apparently as a result, this server receives disproportionately
more traffic than the other twelve authoritative servers for
"com".)
825 o Use all NS records in an RRSet. (For example, we are aware of
826 implementations that hard-coded information for a subset of the
829 o Maintain state and favor the best-performing of a zone's
830 authoritative servers. A good definition of performance is
831 response time. Non-responsive servers can be penalized with an
832 extremely high response time.
834 o Do not lock onto the best-performing of a zone's name servers. An
835 iterative resolver SHOULD periodically check the performance of
all of a zone's name servers to adjust its determination of the
best-performing one.
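The suggestions in Section 2.11.1 can be combined in a small selection routine. The following Python sketch is one possible design under stated assumptions, not a description of any existing resolver: the class name, the 10-second timeout penalty, the smoothing weights, and the 10% exploration rate are all hypothetical choices.

```python
import random
from typing import Dict, List, Optional

TIMEOUT_PENALTY_MS = 10_000.0  # assumed penalty for a non-responsive server


class ServerSelector:
    """Track a smoothed response time per name server and pick among them."""

    def __init__(self, servers: List[str], explore_probability: float = 0.1,
                 rng: Optional[random.Random] = None) -> None:
        # Every NS record starts equal (0.0 ms), regardless of RRSet order.
        self.srtt_ms: Dict[str, float] = {server: 0.0 for server in servers}
        self.explore_probability = explore_probability
        self.rng = rng or random.Random()

    def record(self, server: str, rtt_ms: Optional[float]) -> None:
        """Update state after a query; rtt_ms=None means it timed out."""
        sample = TIMEOUT_PENALTY_MS if rtt_ms is None else rtt_ms
        # Exponentially weighted moving average of observed response times.
        self.srtt_ms[server] = 0.7 * self.srtt_ms[server] + 0.3 * sample

    def choose(self) -> str:
        """Usually pick the fastest server, but sometimes try another."""
        if self.rng.random() < self.explore_probability:
            return self.rng.choice(list(self.srtt_ms))
        best = min(self.srtt_ms.values())
        # Break ties at random so that list order confers no advantage.
        return self.rng.choice([s for s, t in self.srtt_ms.items() if t == best])
```

Because untried servers start with the lowest possible time, every server in the RRSet is tried at least once, and the occasional random choice keeps the resolver from locking onto a single server whose performance may since have changed.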
847 3. Security Considerations
849 The iterative resolver misbehavior discussed in this document exposes
850 the root and TLD name servers to increased risk of both intentional
851 and unintentional Denial of Service attacks.
853 We believe that implementation of the recommendations offered in this
854 document will reduce the amount of unnecessary traffic seen at root
855 and TLD name servers, thus reducing the opportunity for an attacker
856 to use such queries to his or her advantage.
4. Acknowledgements
860 The authors would like to thank the following people for their
861 comments that improved this document: Andras Salamon, Dave Meyer,
862 Doug Barton, Jaap Akkerhuis, Jinmei Tatuya, John Brady, Kevin Darcy,
863 Olafur Gudmundsson, Pekka Savola, Peter Koch, and Rob Austein. We
864 apologize if we have omitted anyone; any oversight was unintentional.
866 5. Internationalization Considerations
There are no new internationalization considerations introduced by
this memo.
6. References
873 6.1. Normative References
875 [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
876 Levels", BCP 14, RFC 2119, March 1997.
878 [2] Mockapetris, P., "Domain names - concepts and facilities", STD
879 13, RFC 1034, November 1987.
881 6.2. Informative References
[3] Elz, R. and R. Bush, "Clarifications to the DNS Specification",
RFC 2181, July 1997.
[4] Andrews, M., "Negative Caching of DNS Queries (DNS NCACHE)", RFC
2308, March 1998.
889 [5] Morishita, Y. and T. Jinmei, "Common Misbehavior Against DNS
890 Queries for IPv6 Addresses", RFC 4074, May 2005.
892 [6] Vixie, P., Thomson, S., Rekhter, Y., and J. Bound, "Dynamic
Updates in the Domain Name System (DNS UPDATE)", RFC 2136, April
1997.
903 [7] Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G., and E.
Lear, "Address Allocation for Private Internets", BCP 5, RFC
1918, February 1996.
Authors' Addresses
M. Larson
VeriSign, Inc.
21345 Ridgetop Circle
Dulles, VA 20166-6503
EMail: mlarson@verisign.com
P. Barber
VeriSign, Inc.
21345 Ridgetop Circle
Dulles, VA 20166-6503
EMail: pbarber@verisign.com
959 Full Copyright Statement
961 Copyright (C) The Internet Society (2006).
963 This document is subject to the rights, licenses and restrictions
964 contained in BCP 78, and except as set forth therein, the authors
965 retain all their rights.
967 This document and the information contained herein are provided on an
968 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
969 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
970 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
971 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
972 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
973 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
975 Intellectual Property
977 The IETF takes no position regarding the validity or scope of any
978 Intellectual Property Rights or other rights that might be claimed to
979 pertain to the implementation or use of the technology described in
980 this document or the extent to which any license under such rights
981 might or might not be available; nor does it represent that it has
982 made any independent effort to identify any such rights. Information
983 on the procedures with respect to rights in RFC documents can be
984 found in BCP 78 and BCP 79.
986 Copies of IPR disclosures made to the IETF Secretariat and any
987 assurances of licenses to be made available, or the result of an
988 attempt made to obtain a general license or permission for the use of
989 such proprietary rights by implementers or users of this
990 specification can be obtained from the IETF on-line IPR repository at
991 http://www.ietf.org/ipr.
993 The IETF invites any interested party to bring to its attention any
994 copyrights, patents or patent applications, or other proprietary
995 rights that may cover technology that may be required to implement
996 this standard. Please address the information to the IETF at
1001 Funding for the RFC Editor function is provided by the IETF
1002 Administrative Support Activity (IASA).