   1
   2
   3
   4
   5
   6 Network Working Group                                        Ralph Droms
   7 INTERNET DRAFT                                       Bucknell University
   8
   9                                                              Kim Kinnear
  10                                                               Mark Stapp
  11                                                            Cisco Systems
  12
  13                                                              Bernie Volz
  14                                                                  IPWorks
  15
  16                                                             Steve Gonczi
  17                                                          Network Engines
  18
  19                                                               Greg Rabil
  20                                                              Mike Dooley
  21                                                               Arun Kapur
  22                                                      Lucent Technologies
  23
  24                                                                July 2000
  25                                                     Expires January 2001
  26
  27
  28                          DHCP Failover Protocol
  29                     <draft-ietf-dhc-failover-07.txt>
  30
  31 Status of this Memo
  32
  33    This document is an Internet-Draft and is in full conformance with
  34    all provisions of Section 10 of RFC2026.
  35
  36    Internet-Drafts are working documents of the Internet Engineering
  37    Task Force (IETF), its areas, and its working groups.  Note that
  38    other groups may also distribute working documents as Internet-
  39    Drafts.
  40
  41    Internet-Drafts are draft documents valid for a maximum of six months
  42    and may be updated, replaced, or obsoleted by other documents at any
  43    time.  It is inappropriate to use Internet-Drafts as reference
  44    material or to cite them other than as "work in progress."
  45
  46    The list of current Internet-Drafts can be accessed at
  47    http://www.ietf.org/ietf/1id-abstracts.txt
  48
  49    The list of Internet-Draft Shadow Directories can be accessed at
  50    http://www.ietf.org/shadow.html.
  51
  52
  53
  54
  55
  56
  57 Droms, et al.             Expires January 2001                  [Page 1]
  59 Internet Draft           DHCP Failover Protocol               July 2000
  60
  61
  62 Copyright Notice
  63
  64    Copyright (C) The Internet Society (2000). All Rights Reserved.
  65
  66 Abstract
  67
  68    DHCP [RFC 2131] allows multiple servers to operate on a single
  69    network.  Some sites are interested in running multiple servers
  70    in such a way as to provide redundancy in case of server failure.
  71    For this to work reliably, the cooperating primary and secondary
  72    servers must maintain a consistent database of lease information.
  73    This implies that the servers need to coordinate all lease
  74    activity so that this information is synchronized in case of
  75    failover.
  76
  77    This document defines a protocol to provide such synchronization
  78    between two servers.  One server is designated the "primary" server,
  79    the other is the "secondary" server.  This document also describes a
  80    way to integrate the failover protocol with the DHCP load balancing
  81    approach.
  82
  83    This document is a substantial reorganization as well as a technical
  84    and editorial revision of draft-ietf-dhc-failover-05.txt.
  85
  86 Table of Contents
  87
  88
  89     1.  Introduction................................................. 4
  90     2.  Terminology.................................................. 5
  91     2.1.  Requirements terminology................................... 5
  92     2.2.  DHCP and failover terminology.............................. 5
  93     3.  Background and External Requirements......................... 9
  94     3.1.  Key aspects of the DHCP protocol........................... 9
  95     3.2.  BOOTP relay agent implementation........................... 11
  96     3.3.  What does it mean if a server can't communicate with its partner? 12
  97     3.4.  Challenging scenarios for a Failover protocol.............. 12
  98     3.5.  Using TCP to detect partner server failure................. 14
  99     4.  Design Goals................................................. 15
 100     4.1.  Design goals for this protocol............................. 15
 101     4.2.  Limitations of this protocol............................... 16
 102     5.  Protocol Overview............................................ 17
 103     5.1.  Messages and States........................................ 17
 104     5.2.  Fundamental guarantees..................................... 20
 105     5.3.  Load balancing............................................. 26
 106     5.4.  Operating in NORMAL state.................................. 27
 107     5.5.  Operating in COMMUNICATIONS-INTERRUPTED state.............. 27
 108     5.6.  Operating in PARTNER-DOWN state............................ 28
 119     5.7.  Operating in RECOVER state................................. 28
 120     5.8.  Operating in STARTUP state................................. 28
 121     5.9.  Time synchronization between servers....................... 28
 122     5.10.  IP address binding-status................................. 29
 123     5.11.  DNS dynamic update considerations......................... 33
 124     5.12.  Reservations and failover................................. 37
 125     5.13.  Dynamic BOOTP and failover................................ 39
 126     5.14.  Guidelines for selecting MCLT............................. 39
 127     6.  Common Message Format........................................ 40
 128     6.1.  Message header format...................................... 40
 129     6.2.  Common option format....................................... 43
 130     6.3.  Batching multiple binding update transactions in one BNDUPD message 44
 131     7.  Protocol Messages............................................ 46
 132     7.1.  BNDUPD message [3]......................................... 46
 133     7.2.  BNDACK message [4]......................................... 56
 134     7.3.  UPDREQ message [9]......................................... 59
 135     7.4.  UPDREQALL message [7]...................................... 60
 136     7.5.  UPDDONE message [8]........................................ 61
 137     7.6.  POOLREQ message [1]........................................ 62
 138     7.7.  POOLRESP message [2]....................................... 63
 139     7.8.  CONNECT message [5]........................................ 64
 140     7.9.  CONNECTACK message [6]..................................... 68
 141     7.10.  STATE message [10]........................................ 71
 142     7.11.  CONTACT message [11]...................................... 72
 143     7.12.  DISCONNECT message [12]................................... 73
 144     8.  Connection Management........................................ 74
 145     8.1.  Connection granularity..................................... 74
 146     8.2.  Creating the TCP connection................................ 74
 147     8.3.  Using the TCP connection for determining communications status 76
 148     8.4.  Using the TCP connection for binding data.................. 78
 149     8.5.  Using the TCP connection for control messages.............. 78
 150     8.6.  Losing the TCP connection.................................. 78
 151     9.  Failover Endpoint States..................................... 79
 152     9.1.  Server Initialization...................................... 79
 153     9.2.  Server State Transitions................................... 79
 154     9.3.  STARTUP state.............................................. 82
 155     9.4.  PARTNER-DOWN state......................................... 84
 156     9.5.  RECOVER state.............................................. 86
 157     9.6.  NORMAL state............................................... 89
 158     9.7.  COMMUNICATIONS-INTERRUPTED State........................... 91
 159     9.8.  POTENTIAL-CONFLICT state................................... 95
 160     9.9.  RESOLUTION-INTERRUPTED state............................... 96
 161     9.10.  RECOVER-DONE state........................................ 97
 162     9.11.  PAUSED state.............................................. 98
 163     9.12.  SHUTDOWN state............................................ 98
 164     10.  Safe Period................................................. 99
 165     11.  Security.................................................... 101
 174     11.1.  Simple shared secret...................................... 101
 175     11.2.  TLS....................................................... 102
 176     12.  Failover Options............................................ 103
 177     12.1.  addresses-transferred..................................... 103
 178     12.2.  assigned-IP-address....................................... 103
 179     12.3.  binding-status............................................ 104
 180     12.4.  client-identifier......................................... 104
 181     12.5.  client-hardware-address................................... 105
 182     12.6.  client-last-transaction-time.............................. 105
 183     12.7.  client-reply-options...................................... 105
 184     12.8.  client-request-options.................................... 106
 185     12.9.  DDNS...................................................... 107
 186     12.10.  delayed-service-parameter................................ 108
 187     12.11.  hash-bucket-assignment................................... 108
 188     12.12.  lease-expiration-time.................................... 108
 189     12.13.  max-unacked-bndupd....................................... 109
 190     12.14.  MCLT..................................................... 109
 191     12.15.  message.................................................. 109
 192     12.16.  message-digest........................................... 110
 193     12.17.  potential-expiration-time................................ 110
 194     12.18.  receive-timer............................................ 110
 195     12.19.  protocol-version......................................... 111
 196     12.20.  reject-reason............................................ 112
 197     12.21.  sending-server-IP-address................................ 113
 198     12.22.  server-flags............................................. 113
 199     12.23.  server-state............................................. 114
 200     12.24.  start-time-of-state...................................... 114
 201     12.25.  TLS-reply................................................ 115
 202     12.26.  TLS-request.............................................. 115
 203     12.27.  vendor-class-identifier.................................. 115
 204     12.28.  vendor-specific-options.................................. 116
 205     13.  IANA Considerations......................................... 116
 206     14.  Acknowledgments............................................. 116
 207     15.  References.................................................. 118
 208     16.  Author's information........................................ 119
 209     17.  Full Copyright Statement.................................... 120
 210
 211
 212 1.  Introduction
 213
 214    DHCP [RFC 2131] allows multiple servers to operate on a single
 215    network.  Because the DHCP subsystem is in many cases a critical
 216    part of the network infrastructure, some sites are interested in
 217    running multiple servers in such a way as to provide redundancy in
 218    case of server failure.
 219
 220    This document defines a protocol to provide synchronization between
 221    two servers in order that each can take over for the other should
 230    either one fail or become unreachable.
 231
 232    One server is designated the "primary" server and the other the
 233    "secondary" server, and most DHCP client requests are sent to both
 234    servers (see Section 3.1.1 for details).
 235
 236    In order to provide a high-availability DHCP service, these
 237    cooperating primary and secondary servers must maintain a consistent
 238    database of lease information.  This implies that the servers need
 239    to coordinate all lease activity so that this information is
 240    synchronized in case failover is required.  The protocol messages and
 241    processing techniques required to maintain a consistent database are
 242    specified in the protocol described here.
 243
 244    The failover protocol also contains a way to integrate the DHCP load-
 245    balancing algorithm described in [LOADB] with the failover protocol.
 246
 247 2.  Terminology
 248
 249    This section discusses both the generic requirements terminology
 250    common to many IETF protocol specifications and the specialized
 251    terminology specific to DHCP and the failover protocol.
 252
 253 2.1.  Requirements terminology
 254
 255    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
 256    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
 257    document are to be interpreted as described in RFC 2119 [RFC 2119].
 258
 259
 260 2.2.  DHCP and failover terminology
 261
 262    This document uses the following terms:
 263
 264       o  "binding"
 265
 266          A binding is a collection of configuration parameters, includ-
 267          ing at least an IP address, associated with or "bound to" a
 268          DHCP client.  Bindings are managed by DHCP servers.
 269
 270       o  "binding database"
 271
 272          The collection of bindings managed by a primary and secondary
 273          server pair.
 273
 274       o  "binding update transaction"
 275
 276          A binding update transaction refers to the set of information
 277          (contained in options) necessary to perform a binding update
 286          for a single IP address.  It comprises the
 287          assigned-IP-address option and the binding-status option, along
 288          with other options as appropriate.
 289
 290       o  "binding-status"
 291
 292          The binding-status is the status of an IP address with respect
 293          to its association with a client.  There are specific binding-
 294          status values defined for use by the failover protocol, e.g.,
 295          ACTIVE, FREE, RELEASED, ABANDONED, etc.  These are designed to
 296          map more or less directly onto the binding-status values used
 297          internally in most DHCP server implementations.  The term
 298          binding-status refers to the concept also sometimes known as
 299          "lease state" or "IP address state", but in this document the
 300          term "state" is reserved for the failover state of a failover
 301          endpoint, and binding-status is always used to refer to the
 302          state associated with an IP address or lease.
 303
 304       o "DHCP client" or "client"
 305
 306         A DHCP client is an Internet host using DHCP to obtain confi-
 307         guration parameters such as a network address.  The term
 308         "client" used within this document always means a DHCP client,
 309         and never one of the two failover servers.
 310
 311       o "DHCP server" or "server"
 312
 313         A DHCP server is an Internet host that returns configuration
 314         parameters to DHCP clients.
 315
 316       o "DDNS"
 317
 318         An abbreviation for "Dynamic DNS", which refers to the capabil-
 319         ity to update a DNS server's name (actually resource record)
 320         database using an on-the-wire protocol defined in [RFC 2136].
 321
 322       o "DNS"
 323
 324         An abbreviation for "Domain Name System", a scheme where a cen-
 325         tral name repository is used to map names to IP addresses and IP
 326         addresses to names.
 327
 328       o "failover endpoint"
 329
 330         The failover protocol allows for there to be a unique failover
 331         endpoint per partner per role (where role is primary or secon-
 332         dary).  This failover endpoint can take actions and hold unique
 333         states.  There are thus a maximum of two failover endpoints per
 342         server per partner (one for each partner as a primary and one
 343         for that same partner as a secondary).
 344
 345       o "FQDN"
 346
 347         An FQDN is a "fully qualified domain name".  A fully qualified
 348         domain name generally is a host name with at least one zone
 349         name; for example, "www.dhcp.org" is a fully qualified domain
 350         name.
 351
 352       o "lazy update"
 353
 354         Lazy update refers to the requirement placed on a server imple-
 355         menting a failover protocol to update its failover partner when-
 356         ever the binding database changes.  A failover protocol which
 357         didn't support lazy update would require the failover partner
 358         update to be complete before a DHCP server could respond to a
 359         DHCP client request with a DHCPACK.  A failover protocol which
 360         does support lazy update places no such restriction on the
 361         update of the failover partner server, and so a server can allo-
 362         cate an IP address or extend a lease on an IP address and then
 363         update its failover partner as time permits.  A failover proto-
 364         col which supports lazy update not only removes the requirement
 365         to update the failover partner prior to responding to a DHCP
 366         client with a DHCPACK, but also allows gathering up batches of
 367         updates from one failover server to its partner.
 368
 369       o "MCLT"
 370
 371         MCLT stands for "maximum client lead time".  This time is
 372         configured on the primary server and transmitted from the primary
 373         to the secondary server in the CONNECT message.  It is the
 374         maximum amount of time that one server can extend a lease for a
 375         client's binding beyond the time known by the partner server.
 376         See section 5.2.1 for details.
 377
 378       o "partner"
 379
 380         A "partner", for the purposes of this document, refers to a
 381         failover server, typically the other failover server.  In many
 382         (if not most) cases, the failover protocol is symmetric with
 383         respect to the primary or secondary nature of the servers, and
 384         so it is often appropriate to discuss "updating the partner
 385         server", since it could be a primary server updating a secondary
 386         server or a secondary server updating a primary server.
 387
 388       o "Primary server" or "Primary"
 389
 398         A DHCP server configured to provide primary service to a set of
 399         DHCP clients for a particular set of subnet address pools.
 400
 401       o "RR"
 402
 403         "RR" is an abbreviation for "resource record".  All records in
 404         the DNS are resource records.  The resource records of most
 405         relevance to this document are the "A" resource record, which
 406         maps a DNS name to a particular IP address, the "PTR" resource
 407         record, which allows a "reverse map", from the IP address back
 408         to a DNS name, and the "KEY" resource record, which is used in
 409         ways defined in [DDNS] to tag a DNS name with the identity of
 410         the DHCP client with which it is associated.
 411
 412       o "Secondary server" or "Secondary"
 413
 414         A DHCP server configured to act as backup to a primary server
 415         for a particular set of subnet address pools.
 416
 417       o "stable storage"
 418
 419         Every DHCP server is assumed to have some form of what is called
 420         "stable storage".  Stable storage is used to hold information
 421         concerning IP address bindings (among other things) so that this
 422         information is not lost in the event of a server failure that
 423         requires a restart of the server.
 424
 425       o "state"
 426
 427         In this document, the term "state" refers exclusively to the
 428         state of a failover endpoint, for example: NORMAL,
 429         COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN.  It is not used to
 430         refer to any attributes of an IP address or a binding of an IP
 431         address.  See "binding-status".
 432
 433       o "subnet address pool"
 434
 435         A subnet address pool is the set of IP addresses which is asso-
 436         ciated with a particular network number and subnet mask.  In the
 437         simple case, there is a single network number and subnet mask
 438         and a set of IP addresses.  In the more complex case (sometimes
 439         called "secondary subnets", sometimes "superscopes"), several
 440         (apparently unrelated) network number and subnet mask combina-
 441         tions with their associated IP addresses may all be configured
 442         together into one subnet address pool.
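
   The "lazy update" and "MCLT" terms defined above can be illustrated
   with a small Python sketch.  It is not part of the protocol: the
   names LazyUpdater, grant and flush are hypothetical, and the sketch
   assumes that when the partner knows nothing about an address, the
   current time is taken as "the time known by the partner server".
   The precise rules are those of section 5.2.1.

```python
from dataclasses import dataclass, field


@dataclass
class LazyUpdater:
    """Toy model of one failover server (all names are hypothetical)."""
    mclt: int  # maximum client lead time, in seconds
    partner_known: dict = field(default_factory=dict)  # ip -> expiry the partner holds
    pending: list = field(default_factory=list)        # updates not yet sent

    def grant(self, ip: str, now: int, desired: int) -> int:
        """Return the lease expiry handed to the client for `ip`."""
        # Assumption: with no partner record (or a stale one), "now" is
        # treated as the time known by the partner.
        known = max(self.partner_known.get(ip, now), now)
        # Never run more than MCLT beyond what the partner knows.
        expiry = min(now + desired, known + self.mclt)
        # Lazy update: queue the change; the client is answered first.
        self.pending.append((ip, expiry))
        return expiry

    def flush(self) -> list:
        """Send all queued updates as one batch and record the partner's view."""
        batch, self.pending = self.pending, []
        for ip, expiry in batch:
            # Assumes the partner acknowledged every update in the batch.
            self.partner_known[ip] = expiry
        return batch
```

   In this sketch a client asking for a one-day lease from a server
   with an MCLT of one hour first receives a one-hour lease; once the
   partner has been updated, a later renewal can be extended to one
   hour beyond the expiry the partner now holds.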
 443
 454 3.  Background and External Requirements
 455
 456    This section highlights key aspects of the DHCP protocol on which the
 457    failover protocol depends.  It also discusses the requirements that
 458    the failover protocol places on other aspects of the network infras-
 459    tructure, and some general issues surrounding server failure detec-
 460    tion.  Some failure scenarios that provide particular challenges to a
 461    failover protocol are discussed.  Finally, the challenges inherent in
 462    using a TCP connection as a means to detect failure of a partner
 463    server are elaborated.
 464
 465 3.1.  Key aspects of the DHCP protocol
 466
 467    The failover protocol is designed to augment the DHCP protocol as
 468    described in RFC 2131 [RFC 2131].  There are several key aspects of
 469    the DHCP protocol which are required by the failover protocol in
 470    order to successfully meet its design goals.
 471
 472 3.1.1.  Broadcast behavior
 473
 474    There are two aspects of the broadcast behavior of the DHCP protocol
 475    which are key to making the failover protocol operate successfully.
 476    The first is simply that the DHCP protocol requires a DHCP client to
 477    broadcast all DHCPDISCOVER and DHCPREQUEST/INIT-REBOOT messages.
 478    Because of this requirement, a DHCP client that was communicating
 479    with one server will automatically be able to communicate with
 480    another server if one is available.
 481
 482    The second aspect of broadcast behavior is similar to the first, but
 483    involves the distinction between a DHCPREQUEST/RENEW and
 484    DHCPREQUEST/REBINDING.  A DHCPREQUEST/RENEW is the message that a
 485    DHCP client uses to extend its lease.  It is unicast to the DHCP
 486    server from which it acquired the lease.  However, the DHCP protocol
 487    was (in a farsighted move) explicitly designed so that, in the event
 488    that a DHCP client cannot contact the server from which it received a
 489    lease on an IP address using a DHCPREQUEST/RENEW, the client is
 490    required to broadcast its renewal using a DHCPREQUEST/REBINDING to
 491    any available DHCP server.  Since all DHCP clients are required to
 492    implement this algorithm, the failover protocol can have a server
 493    other than the one that initially granted a lease be the server that
 494    renews it.  Thus, one server can take over for another with no
 495    interruption in the service as experienced by the DHCP client or its
 496    associated application software.
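
   The broadcast behavior described in this section can be summarized
   in a short Python sketch.  The function names (destination,
   renewal_attempts) are hypothetical, and the sketch models only the
   choice of destination address, not timers or retransmission.

```python
BROADCAST = "255.255.255.255"  # IPv4 limited broadcast address


def destination(message: str, leasing_server: str) -> str:
    """Return the IP address to which a client sends `message`."""
    if message == "DHCPREQUEST/RENEW":
        # Unicast to the server from which the lease was acquired.
        return leasing_server
    if message in ("DHCPDISCOVER",
                   "DHCPREQUEST/INIT-REBOOT",
                   "DHCPREQUEST/REBINDING"):
        # Broadcast, so any available server (including a failover
        # partner) can answer.
        return BROADCAST
    raise ValueError(f"unhandled message kind: {message}")


def renewal_attempts(leasing_server: str, renew_answered: bool) -> list:
    """List the (message, destination) pairs tried when extending a lease."""
    attempts = [("DHCPREQUEST/RENEW",
                 destination("DHCPREQUEST/RENEW", leasing_server))]
    if not renew_answered:
        # The original server did not answer: fall back to broadcast.
        attempts.append(("DHCPREQUEST/REBINDING",
                         destination("DHCPREQUEST/REBINDING", leasing_server)))
    return attempts
```

   The fallback from RENEW to REBINDING is what lets the partner server
   renew a lease it did not originally grant.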
 497
 498 3.1.2.  Client responsibility
 499
 500    In the DHCP protocol the DHCP clients are entrusted with a consider-
 501    able responsibility.  In particular, after they are granted a lease
 510    on an IP address, they are enjoined to only use that IP address while
 511    their lease is valid.  Every DHCP client is expected to stop using an
 512    IP address if the expiration time on the lease has passed and if it
 513    cannot get an extension on the lease for that IP address from some
 514    DHCP server.  Thus, the correct behavior of every DHCP client in this
 515    regard is required to ensure the integrity of the DHCP service.  On
 516    the other hand, incorrect behavior by a client in this area will tend
 517    to adversely affect at most one other DHCP client.
 518
 519    Furthermore, any DHCP client which sends in a DHCPREQUEST/RENEW or
 520    DHCPREQUEST/REBINDING to a DHCP server (either unicast for a RENEW or
 521    broadcast for a REBINDING) MUST still have time to run on the lease
 522    for that IP address.  The DHCP server sends the DHCPACK back unicast
 523    to the IP address from which the RENEW or REBINDING originated.
 524
 525    Given the existing responsibility placed on the client to only use an
 526    IP address when the lease is valid, and to only send in a RENEW or
 527    REBINDING if the lease is valid, the failover protocol relies on DHCP
 528    clients to perform responsibly and will, in the absence of conflict-
 529    ing information, believe a DHCP client that is attempting to RENEW or
 530    REBIND a lease on an IP address is the legitimate owner of that IP
 531    address.
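
   A minimal sketch of this trust rule follows, in Python.  It assumes
   that "conflicting information" means a binding held for a different
   client; the function name handle_renewal and the dictionary layout
   are hypothetical and chosen only for illustration.

```python
def handle_renewal(bindings: dict, ip: str, client_id: str) -> str:
    """Answer a RENEW or REBINDING for `ip`; returns "DHCPACK" or "DHCPNAK".

    `bindings` maps an IP address to the client the server believes owns it.
    """
    owner = bindings.get(ip)
    if owner is not None and owner != client_id:
        # Conflicting information: the address is bound to another client.
        return "DHCPNAK"
    # No conflict: believe that the client is the legitimate owner.
    bindings[ip] = client_id
    return "DHCPACK"
```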
 532
 533    If clients do not follow these rules, it is possible for an address
 534    to be in use by more than one client.  With a single server, this
 535    happens when the server has leased the expired address to another
 536    client while the original client is still attempting to use it; the
 537    server would NAK the original client's renewal request.  This is made
 538    slightly worse in the failover protocol if the two servers are unable
 539    to communicate with each other and one server leases an available
 540    address to a new client while the other server receives a renewal
 541    from a different client.  In this case, both servers lease the same
 542    address to different clients for the duration of the MCLT.
 543
 544    One troublesome issue is that of the DHCP client's responsibility
 545    when sending DHCPREQUEST/INIT-REBOOT requests.  While the original
 546    DHCP RFC was written to require a DHCP client to have time left to
 547    run on the lease for an IP address if the client is sending an
 548    INIT-REBOOT request, it was sufficiently unclear that some client
 549    vendors didn't realize this until recently.  Since the INIT-REBOOT
 550    request is sent with the IP address in the dhcp-requested-address
 551    option and not in the ciaddr (for perfectly good reasons), the
 552    similarity to the RENEW and REBINDING case was lost on many people.
 553
 554    At present, the failover protocol does not assume that a client send-
 555    ing in an INIT-REBOOT request necessarily has a valid lease on the IP
 556    address appearing in the dhcp-requested-address option in the INIT-
 557    REBOOT request.
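
   The distinction between the kinds of DHCPREQUEST discussed above can
   be sketched as follows in Python.  The field tests follow the
   DHCPREQUEST rules of RFC 2131 (the server identifier test comes from
   that RFC rather than from the text above); the function name
   classify_request and its parameters are hypothetical.

```python
from typing import Optional

ZERO = "0.0.0.0"


def classify_request(ciaddr: str,
                     requested_address: Optional[str],
                     server_identifier: Optional[str],
                     was_broadcast: bool) -> str:
    """Classify a DHCPREQUEST by the fields the client filled in."""
    if server_identifier is not None:
        # Response to a DHCPOFFER from a specific server.
        return "SELECTING"
    if requested_address is not None and ciaddr == ZERO:
        # Address carried in the dhcp-requested-address option, not ciaddr.
        return "INIT-REBOOT"
    if ciaddr != ZERO:
        # Address carried in ciaddr: RENEW is unicast, REBINDING broadcast.
        return "REBINDING" if was_broadcast else "RENEW"
    return "INVALID"
```

   Only the RENEW and REBINDING cases carry the address in ciaddr,
   which is one reason the lease validity requirement for INIT-REBOOT
   was easy to overlook.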
 558
 566    The implications of this are as follows: Assume that there is a DHCP
 567    client that gets a lease from one server while that server is unable
 568    to communicate with its failover partner.  Then, assume that after
 569    that client reboots it is able only to communicate with the other
 570    failover server.  If the failover servers have not been able to com-
 571    municate with each other during this process, then the DHCP client
 572    will get a new IP address instead of being able to continue to use
 573    its existing IP address. This will affect no applications on the DHCP
 574    client, since it is rebooting.  However, it will use up an additional
 575    IP address in this marginal case.
 576
 577 3.1.3.  Stable storage update before DHCPACK
 578
 579    The DHCP protocol allocates resources, and in order to operate
 580    correctly it requires that a DHCP server update some form of stable
 581    storage prior to sending a DHCPACK to a DHCP client in order to grant
 582    that client a lease on an IP address.
 583
 584    One of the goals of the failover protocol is that it not add
 585    significant additional time to this already time-consuming
 586    requirement to update stable storage prior to a DHCPACK.  In
 587    particular, adding a requirement to communicate with another server
 588    prior to sending a DHCPACK would greatly simplify the failover
 589    protocol, but it would unacceptably limit the potential scalability
 590    of any DHCP server which employed the failover protocol.
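
   The ordering described in this section (stable storage first, then
   the DHCPACK, with the partner update deferred) might be sketched in
   Python as follows.  The JSON file layout and the names
   commit_binding and grant_lease are assumptions made for the example
   and say nothing about how a real server stores its bindings.

```python
import json
import os


def commit_binding(path: str, bindings: dict) -> None:
    """Write the bindings to disk and force them to stable storage."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(bindings, f)
        f.flush()
        os.fsync(f.fileno())   # do not return until the data is on disk
    os.replace(tmp, path)      # atomically swap in the new file


def grant_lease(path: str, bindings: dict, ip: str, client_id: str,
                expiry: int, partner_queue: list) -> tuple:
    """Record a lease, then build the reply; the partner is updated later."""
    bindings[ip] = {"client": client_id, "expiry": expiry}
    commit_binding(path, bindings)                  # 1. stable storage first
    reply = ("DHCPACK", ip, expiry)                 # 2. only now answer the client
    partner_queue.append((ip, client_id, expiry))   # 3. partner update deferred
    return reply
```

   Step 3 only appends to an in-memory queue, so the time to answer the
   client does not depend on a round trip to the partner server.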
 591
 592 3.2.  BOOTP relay agent implementation
 593
 594    Many DHCP clients are not resident on the same network segment as a
 595    DHCP server.  In order to support this form of network architecture,
 596    most contemporary routers implement something known as a BOOTP Relay
 597    Agent.  This capability inside a router listens for all broadcasts
 598    to the DHCP server port, UDP port 67, and will relay any broadcasts
 599    that it receives on to a DHCP server.  The IP address of the DHCP
 600    server must have been previously configured into the router.  As part
 601    of the relay process, the relay agent will place the address of the
 602    interface on which it received the broadcast into the giaddr field of
 603    the DHCP packet.
 604
 605    Since the failover protocol requires two DHCP servers to receive any
 606    broadcast DHCP messages, in order to work with DHCP clients which are
 607    not local to the DHCP server, the BOOTP relay agent on the router
 608    closest to the DHCP client must be configured to point at more than
 609    one DHCP server.
 610
 611    Most BOOTP relay agent implementations allow this duplication of
 612    packets.
 613
 622    If this is not possible, an administrator might be able to configure
 623    the relay agent with a subnet broadcast address, but in this case the
 624    primary and secondary DHCP servers in a failover pair must both
 625    reside on the same subnet.
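
        As a rough illustration of the relay behavior described in this
        section, the sketch below copies a client broadcast to every
        configured server and fills in giaddr.  Packets are modeled as
        plain dictionaries, and the relay function, its parameters and
        the addresses are hypothetical.  A real relay agent applies
        further rules that are not modeled here.

```python
import copy

DHCP_SERVER_PORT = 67  # the DHCP port named in the text above


def relay(packet, receiving_interface_addr, servers):
    """Return one ((server, port), packet) pair per configured server."""
    forwarded = []
    for server in servers:
        out = copy.deepcopy(packet)
        # Record the address of the interface that received the broadcast.
        out["giaddr"] = receiving_interface_addr
        forwarded.append(((server, DHCP_SERVER_PORT), out))
    return forwarded


discover = {"type": "DHCPDISCOVER", "giaddr": "0.0.0.0"}
copies = relay(discover, "10.1.1.1", ["192.0.2.1", "192.0.2.2"])
for destination, pkt in copies:
    print(destination, pkt["giaddr"])
```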
 626
 627 3.3.  What does it mean if a server can't communicate with its partner?
 628
 629    In any protocol designed to allow one server to take over some
 630    responsibilities from a partner server in the event of "failure" of
 631    that partner server, there is an inherent difficulty in determining
 632    when that partner server has failed.
 633
 634    In fact, it is fundamentally impossible for one server to distinguish
 635    a network communications failure from the outright failure of the
 636    server with which it is trying to communicate.  In the case where each
 637    server is handing out resources (in this case IP addresses) to a
 638    client community, mistaking an inability to communicate with a
 639    partner server for failure of that partner server could easily cause
 640    both servers to be handing out the same IP addresses to different
 641    clients.
 642
 643    One way that this is sometimes handled is for there to be more than
 644    two servers.  In the case of an odd number of servers, the servers
 645    that can still communicate with a majority of other servers consider
 646    themselves operational, and any server which can't communicate with
 647    a majority of other servers must immediately cease operations.
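
        The majority rule can be sketched as below.  The function name is
        hypothetical, and counting the local server as part of its own
        group is an assumption (the conventional quorum reading), not
        something this draft specifies.

```python
def may_operate(reachable_peers, total_servers):
    """True if this server plus the peers it can reach form a majority.

    Counting the local server itself is an assumption of this sketch.
    """
    return (reachable_peers + 1) * 2 > total_servers


# Three servers: one is cut off, the other two still see each other.
print(may_operate(reachable_peers=1, total_servers=3))
print(may_operate(reachable_peers=0, total_servers=3))
# Two servers that cannot reach each other: neither has a majority.
print(may_operate(reachable_peers=0, total_servers=2))
```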
 648
 649    While this technique works in some domains, having the only server
 650    with which a DHCP client can communicate voluntarily shut itself down
 651    seems like something worth avoiding.
 652
 653    The failover protocol will operate correctly while both servers are
 654    unable to communicate, whether they are both running or not.  At some
 655    point there may be resource contention, and if one of the servers is
 656    actually down, then the operator can inform the operational server
 657    and the operational server will be able to use all of the failed
 658    server's resources.
 659
 660    The protocol also allows detection of an orderly shutdown of a parti-
 661    cipating server.
 662
 663 3.4.  Challenging scenarios for a Failover protocol
 664
 665    There exist two failure scenarios which provide particular challenges
 666    to the correctness guarantees of a failover protocol.
 667
 678 3.4.1.  Primary Server crash before "lazy" update:
 679
 680    In the case where the primary server sends a DHCPACK to a client for
 681    a newly allocated IP address and then crashes prior to sending the
 682    corresponding update to the secondary server, the secondary server
 683    will have no record of the IP address allocation.  When the secondary
 684    server takes over, it may well try to allocate that IP address to a
 685    different client.  In the case where the first client to receive the
 686    IP address is not on the net at that time (although its lease has
 687    not yet expired), an ICMP echo (i.e., ping) will not prevent
 688    the secondary server from allocating that IP address to a different
 689    client.
 690
 691    The failover protocol deals with this situation by having the primary
 692    and secondary servers allocate addresses for new clients from dis-
 693    joint address pools.  See section 5.4 for details.
 694
 695    A more likely (in that DHCPRENEWs are presumably more common than
 696    DHCPDISCOVERs) and more subtle version of this problem is where the
 697    primary server crashes after extending a client's lease time, and
 698    before updating the secondary with a new time using a lazy update.
 699    After the secondary takes over, if the client is not connected to the
 700    network the secondary will believe the client's lease has expired
 701    when, in fact, it has not.  In this case as well, the IP address
 702    might be reallocated to a different client while the first client is
 703    still using it.
 704
 705    This scenario is handled by the failover protocol through control of
 706    the lease time and the use of the maximum client lead time (MCLT).
 707    See section 5.2.1  for details.
 708
 709 3.4.2.  Network partition where DHCP servers can't communicate but each
 710 can talk to clients:
 711
 712    Several conditions are required for this situation to occur.  First,
 713    due to a network failure, the primary and secondary servers cannot
 714    communicate.  As well, some of the DHCP clients must be able to com-
 715    municate with the primary server, and some of the clients must now
 716    only be able to communicate with the secondary server.  When this
 717    condition occurs, both primary and secondary servers could attempt to
 718    allocate IP addresses for new clients from the same pool of available
 719    addresses.  At some point, then, two clients will end up being allo-
 720    cated the same IP address.  This will cause problems when the network
 721    failure that created this situation is corrected.
 722
 723    The failover protocol deals with this situation by having the primary
 724    and secondary servers allocate addresses for new clients from dis-
 725    joint address pools.  See section 5.4 for details.
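
        The effect of disjoint pools during a partition can be sketched
        as follows.  The allocate function, the addresses and the 4/2
        split are hypothetical; the way the protocol actually divides
        addresses is described in section 5.4 and is not reproduced
        here.

```python
def allocate(pool, leases, client):
    """Give `client` the first free address in `pool`, or None."""
    for ip in sorted(pool):
        if ip not in leases:
            leases[ip] = client
            return ip
    return None


addresses = ["10.0.0.%d" % n for n in range(1, 7)]

# Partitioned servers drawing on the same pool pick the same address.
shared_a = allocate(set(addresses), {}, "client-A")
shared_b = allocate(set(addresses), {}, "client-B")
print(shared_a, shared_b)

# With disjoint pools the two allocations cannot collide.
primary_pool, secondary_pool = set(addresses[:4]), set(addresses[4:])
disjoint_a = allocate(primary_pool, {}, "client-A")
disjoint_b = allocate(secondary_pool, {}, "client-B")
print(disjoint_a, disjoint_b)
```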
 726
 734 3.5.  Using TCP to detect partner server failure
 735
 736    There are several characteristics of TCP that are important to the
 737    functioning of the failover protocol, which uses one TCP connection
 738    both for bulk data transfer and to assess communications integrity
 739    with the other server.  Reliable and ordered message delivery are
 740    chief among these important characteristics.
 741
 742    It would be nice to use the capabilities built in to TCP to allow it
 743    to determine if communications integrity exists to the failover
 744    partner, but this strategy contains some problems which require
 745    analysis.  There exist three fundamental cases for an open TCP
 746    connection that must be examined.
 747
 748       1.  When no data is being sent then no messages are traveling
 749           across the TCP connection.
 750
 751       2.  When data is queued to be sent, and the receiver has not
 752           blocked the sending of additional data, then messages are
 753           flowing across the TCP connection containing the application's
 754           data.
 755
 756       3.  When data is queued to be sent, and the receiver has blocked
 757           the transmission of additional data, then persist messages are
 758           flowing between the sender and the receiver to ensure that the
 759           sender doesn't miss the receiver opening the window for
 760           further transmissions.
 761
 762    The first case can be turned into the second case by sending
 763    application-level keep-alive messages periodically when there is no
 764    other data queued to be sent.  Note TCP keep-alive messages might be
 765    used as well, but they present additional problems.
 766
 767    Thus, we can ensure that the TCP connection has messages flowing
 768    periodically across the connection fairly easily.  The question
 769    remains as to what TCP will do if the other end of the connection
 770    fails to respond (either because of network partition or because the
 771    receiving server crashes). TCP will attempt to retransmit a message
 772    with an exponential backoff, and will eventually time out that
 773    retransmission.  However, the length of that timeout cannot, in gen-
 774    eral, be set on a per-connection basis, and is frequently as long as
 775    nine minutes, though in some cases it may be as short as two minutes.
 776    On some systems it can be set system-wide, while on other systems it
 777    cannot be changed at all.
 778
 779    A value for this timeout that would be appropriate for the failover
 780    protocol, say less than 1 minute, could have unpleasant side-effects
 781    on other applications running on the same server, assuming that it
 790    could be changed at all on the host operating system.
 791
 792    Nine minutes is a long time for the DHCP service to be unavailable to
 793    any new clients that were being served by the server which has
 794    crashed, when there is another server running that could respond to
 795    them as soon as it determines that its partner is not operational.
 796
 797    The conclusion drawn from this analysis is that TCP provides very
 798    useful support for the failover protocol in the areas of reliable and
 799    ordered message delivery, but cannot by itself be relied upon to
 800    detect partner server failure in a fashion acceptable to the needs of
 801    the failover protocol.  Additional failover protocol capabilities
 802    have been created to support timely detection of partner server
 803    failure.  See section 8.3 for details on this mechanism.
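
        A minimal sketch of application-level liveness tracking, in the
        spirit of the keep-alive idea discussed above, is given below.
        It is not the mechanism of section 8.3: the ContactMonitor
        class, its method names, and the 10 and 30 second values are all
        hypothetical.  The point is only that the application can choose
        its own timeout instead of relying on the TCP retransmission
        timeout.

```python
class ContactMonitor:
    """Tracks traffic on one failover connection (times in seconds)."""

    def __init__(self, send_interval, max_silence, now=0.0):
        self.send_interval = send_interval  # hypothetical tuning value
        self.max_silence = max_silence      # hypothetical tuning value
        self.last_sent = now
        self.last_received = now

    def note_sent(self, now):
        self.last_sent = now

    def note_received(self, now):
        self.last_received = now

    def keepalive_due(self, now):
        # Send something if nothing else has been sent recently.
        return now - self.last_sent >= self.send_interval

    def partner_silent_too_long(self, now):
        # Application-level timeout, independent of TCP's own timers.
        return now - self.last_received > self.max_silence


monitor = ContactMonitor(send_interval=10, max_silence=30)
monitor.note_received(5)
print(monitor.keepalive_due(10))
print(monitor.partner_silent_too_long(30))
print(monitor.partner_silent_too_long(40))
```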
 804
 805 4.  Design Goals
 806
 807    This section lists the design goals and the limitations of the fail-
 808    over protocol.
 809
 810 4.1.  Design goals for this protocol
 811
 812    The following is a list of goals that are met by this protocol.  They
 813    are listed in priority order.
 814
 815       1.  Implementations of this protocol must work with existing DHCP
 816           client implementations based on the DHCP protocol [1].
 817
 818       2.  Implementations of the protocol must work with existing BOOTP
 819           relay agent implementations.
 820
 821       3.  The protocol must provide failover redundancy between servers
 822           that are not located on the same subnet.
 823
 824       4.  Provide for continued service to DHCP clients through an
 825           automated mechanism in the event of failure of the primary
 826           server.
 827
 828       5.  Avoid binding an IP address to a client while that binding is
 829           currently valid for another client.  In other words, do not
 830           allocate the same IP address to two clients.
 831
 832       6.  Minimize any need for manual administrative intervention.
 833
 834       7.  Introduce no additional delays in server response time as a
 835           result of the network communications required to implement the
 836           failover protocol, i.e., don't require communications with the
 837           partner between the receipt of a DHCPREQUEST and the
 846           corresponding DHCPACK.
 847
 848       8.  Share IP address ranges between primary and secondary servers;
 849           i.e., impose no requirement that the pool of available
 850           addresses be manually or permanently divided between servers.
 851
 852       9.  Continue to meet the goals and objectives of this protocol in
 853           the event of server failure or network partition.
 854
 855       10. Provide graceful reintegration of full protocol service after
 856           server failure or network partition.
 857
 858       11. Allow for one computer to act as a secondary server for multi-
 859           ple primary servers.  The protocol must allow failover primary
 860           and secondary configuration choices to be made at a granular-
 861           ity smaller than "all of the subnets served by a single
 862           server", though individual implementations may choose not to
 863           allow such flexibility.
 864
 865       12. Ensure that an existing client can keep its existing IP
 866           address binding if it can communicate with either the primary
 867           or secondary DHCP server implementing this protocol - not just
 868           whichever server originally offered it the binding.
 869
 870       13. Ensure that a new client can get an IP address from some
 871           server.  Ensure that in the face of partition, where servers
 872           continue to run but cannot communicate with each other, the
 873           above goals and requirements may be met.  In addition, when
 874           the partition condition is removed, allow graceful automatic
 875           re-integration without requiring human intervention.
 876
 877       14. If either primary or secondary server loses all of the infor-
 878           mation that it has stored in stable storage, ensure that it be
 879           able to refresh its stable storage from the other server.
 880
 881       15. Support load balancing between the primary and secondary
 882           servers, and allow configuration of the percentage of the
 883           client population served by each with a moderately fine granu-
 884           larity.
 885
 886
 887 4.2.  Limitations of this protocol
 888
 889    The following are explicit limitations of this protocol.
 890
 891       1.  This protocol provides only one level of redundancy through a
 892           single secondary server for each primary server.
 893
 902       2.  A subset of the address pool is reserved for secondary server
 903           use.  In order to handle the failure case where both servers
 904           are able to communicate with DHCP clients, but unable to com-
 905           municate with each other, a subset of the IP address pool must
 906           be set aside as a private address pool for the secondary
 907           server.  The secondary can use these to service newly arrived
 908           DHCP clients during such a period.  The required size of this
 909           private pool is based only on the arrival rate of new DHCP
 910           clients and the length of expected downtime, and is not influ-
 911           enced in any way by the total number of DHCP clients supported
 912           by the server pair.
 913
 914           The failover protocol can be used in a mode where both the
 915           primary and secondary servers can share the load between them
 916           when both are operating.  In this load balancing mode, the
 917           addresses allocated by the primary server to the secondary
 918           server are not unused, but are used instead to service the
 919           portion of the client base to which the secondary server is
 920           required to respond.  See section 5.3 for more information on
 921           load balancing.
 922
 923       3.  The primary and secondary servers do not respond to client
 924           requests at all while recovering from a failure that could
 925           have resulted in duplicate IP assignments (that is, while
 926           synchronizing in POTENTIAL-CONFLICT state).
 927
 928
 929 5.  Protocol Overview
 930
 931    This section will discuss the failover protocol at a relatively high
 932    level of detail.  In the event that a description in this section
 933    conflicts (or appears to conflict due to the overview nature of this
 934    section) with information in later sections of this draft, the infor-
 935    mation in the later sections should be considered authoritative.
 936
 937 5.1.  Messages and States
 938
 939    This protocol is centered around the message exchange used by one
 940    server to update the other server of binding database changes result-
 941    ing from DHCP client activity:
 942
 943       o Communication of binding database changes
 944
 945         The binding update (BNDUPD) message is used to send the binding
 946         database changes to the partner server, and the partner server
 947         responds with a binding acknowledgement (BNDACK) message when it
 948         has successfully committed those changes to its own stable
 949         storage.
 950
 958    All of the other messages involve ancillary issues:
 959
 960       o Management of available IP addresses
 961
 962         The pool request (POOLREQ) is used by the secondary server to
 963         request an allocation of IP addresses from the primary server.
 964         The pool response (POOLRESP) is used by the primary server to
 965         inform the secondary server how many IP addresses were allocated
 966         to the secondary server as the result of the pool request.
 967
 968       o Synchronization of the binding databases between the servers
 969         after they've been out of communications
 970
 971         The update request (UPDREQ) message is used by one server to
 972         request that its partner send it all binding database informa-
 973         tion that it has not already seen.  The update request all
 974         (UPDREQALL) message is used by one server to request that all
 975         binding database information be sent in order to recover from a
 976         total loss of its binding database by the requesting server.
 977         The update done (UPDDONE) message is used by the responding
 978         server to indicate that all requested updates have been sent by
 979         the responding server and acked by the requesting server.
 980
 981       o Connection establishment
 982
 983         The connect (CONNECT) message is used by the primary server to
 984         establish a high level connection with the other server, and to
 985         transmit several important configuration data items between the
 986         servers.  The connect acknowledgement message (CONNECTACK) is
 987         used by the secondary server to respond to a CONNECT message
 988         from the primary server.  The disconnect (DISCONNECT) message is
 989         used by either server when closing a connection.
 990
 991       o Server synchronization
 992
 993         The state change (STATE) message is used by either server to
 994         inform the other server of a change of failover state.
 995
 996       o Connection integrity management
 997
 998         The contact (CONTACT) message is used by either server to ensure
 999         that the other server continues to see the connection as opera-
1000         tional.  It MUST be transmitted periodically over every esta-
1001         blished connection if other message traffic is not flowing, and
1002         it MAY be sent at any time.
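
        The messages introduced in this section can be summarized in a
        small lookup table.  The table below is only a reading aid: the
        grouping and sender labels paraphrase the text above, the
        FAILOVER_MESSAGES and messages_sent_by names are hypothetical,
        and no wire-format codes are implied.  "either" marks messages
        whose sender the overview does not restrict to one role.

```python
# message name -> (purpose as grouped in section 5.1, sender per the text)
FAILOVER_MESSAGES = {
    "BNDUPD":     ("binding database changes", "either"),
    "BNDACK":     ("binding database changes", "responder"),
    "POOLREQ":    ("management of available IP addresses", "secondary"),
    "POOLRESP":   ("management of available IP addresses", "primary"),
    "UPDREQ":     ("binding database synchronization", "either"),
    "UPDREQALL":  ("binding database synchronization", "either"),
    "UPDDONE":    ("binding database synchronization", "responder"),
    "CONNECT":    ("connection establishment", "primary"),
    "CONNECTACK": ("connection establishment", "secondary"),
    "DISCONNECT": ("connection establishment", "either"),
    "STATE":      ("server synchronization", "either"),
    "CONTACT":    ("connection integrity management", "either"),
}


def messages_sent_by(sender):
    """Names of the messages whose sender label matches `sender`."""
    return sorted(n for n, (_, s) in FAILOVER_MESSAGES.items() if s == sender)


print(messages_sent_by("primary"))
print(messages_sent_by("secondary"))
```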
1003
1014 5.1.1.  Failover endpoints
1015
1016    The proper operation of the failover protocol requires more than the
1017    transmission of messages between one server and the other.  Each end-
1018    point might seem to be a single DHCP server, but in fact there are
1019    many situations where additional flexibility in configuration is use-
1020    ful.
1021
1022    For instance, there might be several servers which are each primary
1023    for a distinct set of address pools, and one server which is secon-
1024    dary for all of those address pools.  The situation with the pri-
1025    maries is straightforward, but the secondary will need to maintain a
1026    separate failover state, partner state, and communications up/down
1027    status for each of the separate primary servers for which it is act-
1028    ing as a secondary.
1029
1030    The failover protocol calls for there to be a unique failover end-
1031    point per partner per role (where role is primary or secondary).
1032    This failover endpoint can take actions and hold unique states.
1033    There are thus a maximum of two failover endpoints per partner (one
1034    for the partner as a primary and one for that same partner as a
1035    secondary.)
1036
1037    Thus, in the case where there are two primary servers A and B each
1038    backed up by a single common secondary server C, there is one fail-
1039    over endpoint on each of A and B, and two different failover end-
1040    points on C.  The two different failover endpoints on C each have
1041    unique states and independent TCP connections.
1042
1043    This document frequently describes the behavior of the protocol in
1044    terms of primary and secondary servers, not primary and secondary
1045    failover endpoints.  However, it is important to remember that every
1046    'server' described in this document is in reality a failover endpoint
1047    that resides in a particular process, and that many failover end-
1048    points may reside in the same process.
1049
1050    It is not the case that there is a unique failover endpoint for each
1051    subnet address pool that participates in a failover relationship.  On
1052    one server, there is one failover endpoint per partner per role,
1053    regardless of how many subnet address pools are managed by that com-
1054    bination of partner and role.  Conversely, on a particular server,
1055    any given subnet address pool will be associated with exactly one
1056    failover endpoint.
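
        One way to picture these rules is a table of endpoints keyed by
        partner and role, with each address pool pointing at exactly one
        endpoint.  The sketch below is illustrative only: the class and
        attribute names and the example subnets are hypothetical, and
        the role recorded is the partner's role, following the wording
        "one for the partner as a primary".

```python
from dataclasses import dataclass, field


@dataclass
class FailoverEndpoint:
    partner: str
    partner_role: str            # "primary" or "secondary"
    comms_up: bool = False       # per-endpoint communications status
    pools: list = field(default_factory=list)


class Server:
    def __init__(self, name):
        self.name = name
        self.endpoints = {}         # (partner, partner_role) -> endpoint
        self.pool_to_endpoint = {}  # pool -> its single endpoint

    def add_pool(self, pool, partner, partner_role):
        if pool in self.pool_to_endpoint:
            raise ValueError("pool already belongs to a failover endpoint")
        key = (partner, partner_role)
        if key not in self.endpoints:
            self.endpoints[key] = FailoverEndpoint(partner, partner_role)
        endpoint = self.endpoints[key]
        endpoint.pools.append(pool)
        self.pool_to_endpoint[pool] = endpoint
        return endpoint


# Server C acts as secondary for primaries A and B.
c = Server("C")
c.add_pool("10.1.0.0/24", "A", "primary")
c.add_pool("10.2.0.0/24", "A", "primary")
c.add_pool("10.3.0.0/24", "B", "primary")
print(len(c.endpoints))
```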
1057
1058    When a connection is received from the partner, the unique failover
1059    endpoint to which the message is directed is determined solely by the
1060    IP address of the partner and the port to which the connection is
1061    directed by the partner.  See section 8.2.
1062
1070 5.2.  Fundamental guarantees
1071
 1072    There are several fundamental restrictions this protocol places on what
1073    one server can do in the absence of knowledge of the other server.
1074    Operating within these restrictions allows certain guarantees to be
1075    made to the partner server, and these are key to the correct opera-
1076    tion of the protocol.
1077
1078 5.2.1.  Control of lease time
1079
1080    The key problem with lazy update is that when a server fails after
1081    updating a client with a particular lease time and before updating
1082    its partner, the partner will believe that a lease has expired even
1083    though the client still retains a valid lease on that IP address.
1084
1085    In order to handle this problem, a period of time known as the "Max-
1086    imum Client Lead Time" (MCLT) is defined and must be known to both
1087    the primary and secondary servers.  Proper use of this time interval
1088    places an upper bound on the difference allowed between the lease
1089    time provided to a DHCP client by a server and the lease time known
1090    by that server's partner.  However, the MCLT is typically much less
1091    than the lease time that a server has been configured to offer a
1092    client, and so some strategy must exist to allow a server to offer
1093    the configured lease time to a client.  During a lazy update the
1094    updating server typically updates its partner with a potential
1095    expiration time which is longer than the lease time previously given
1096    to the client and which is longer than the lease time that the server
1097    has been configured to give a client.  This allows that server to
1098    give a longer lease time to the client the next time the client
1099    renews its lease, since the time that it will give to the client will
 1100    not exceed the potential expiration time acknowledged by its partner
 1101    by more than the MCLT.
1102
1103    The PARTNER-DOWN state exists so that a server can be sure that its
1104    partner is, indeed, down.  Correct operation while in that state
1105    requires (generally) that the server wait the MCLT after anything
1106    that happened prior to its transition into PARTNER-DOWN state (or,
1107    more accurately, when the other server went down if that is known).
1108    Thus, the server MUST wait the MCLT after the partner server went
1109    down before allocating any of the partner's addresses which were
1110    available for allocation.  In the event the partner was not in com-
1111    munication prior to going down, it might have allocated one or more
1112    of its FREE addresses to a DHCP client and been unable to inform the
1113    server entering PARTNER-DOWN prior to going down itself.  By waiting
1114    the MCLT after the time the partner went down, the server in
1115    PARTNER-DOWN state ensures that any clients which have a lease on one
1116    of the partner's FREE addresses will either time out or contact the
1117    server in PARTNER-DOWN by the time that period ends.
1118
1126    In addition, once a server has transitioned to PARTNER-DOWN state, it
1127    MUST NOT reallocate an IP address from one client to another client
1128    until an additional MCLT interval after the lease by the original
1129    client expires.  (Actually, until the maximum client lead time after
1130    what it believes to be the lease expiration time of the client.)
1131
1132    Some optimizations exist for this restriction, in that it only
1133    applies to leases that were issued BEFORE entering PARTNER-DOWN. Once
1134    a server has entered PARTNER-DOWN and it leases out an address, it
1135    need not wait this time as long as it has never communicated with the
1136    partner since the lease was given out.
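
        The waiting rules for PARTNER-DOWN state described above can be
        sketched as simple time arithmetic.  The function and parameter
        names are hypothetical, times are in seconds on a common clock,
        and the one-hour MCLT is only an example value.

```python
def earliest_use_of_partner_free_address(partner_down_time, mclt):
    """When the partner's available addresses may first be allocated."""
    return partner_down_time + mclt


def earliest_reallocation(lease_expiry, mclt, issued_in_partner_down):
    """When an address may be given to a different client.

    issued_in_partner_down: the lease was issued after entering
    PARTNER-DOWN, with no partner communication since (the optimization
    described in the text).
    """
    if issued_in_partner_down:
        return lease_expiry
    return lease_expiry + mclt


MCLT = 3600  # example value: one hour

print(earliest_use_of_partner_free_address(partner_down_time=1000, mclt=MCLT))
print(earliest_reallocation(5000, MCLT, issued_in_partner_down=False))
print(earliest_reallocation(5000, MCLT, issued_in_partner_down=True))
```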
1137
1138    The fundamental relationship on which much of the correctness of this
1139    protocol depends is that the lease expiration time known to a DHCP
1140    client MUST NOT be more than the maximum client lead time greater
1141    than the potential expiration time known to a server's partner.
1142
1143    The remainder of this section makes the above fundamental relation-
1144    ship more explicit.
1145
1146    This protocol requires a DHCP server to deal with several different
1147    lease intervals and places specific restrictions on their relation-
1148    ships. The purpose of these restrictions is to allow the other server
1149    in the pair to be able to make certain assumptions in the absence of
1150    an ability to communicate between servers.
1151
1152    The different lease times are:
1153
1154       o desired lease interval
1155
1156         The desired lease interval is the lease interval that a DHCP
1157         server would like to give to a DHCP client in the absence of any
1158         restrictions imposed by the Failover protocol.  Its determina-
1159         tion is outside of the scope of this protocol. Typically this is
1160         the result of external configuration of a DHCP server.
1161
1162       o actual lease interval
1163
 1164         The actual lease interval is the lease interval that a DHCP
1165         server gives out to a DHCP client in the dhcp-lease-time option
1166         of a DHCPACK packet.  It may be shorter than the desired client
1167         lease interval (as explained below).
1168
1169       o potential lease interval
1170
1171         The potential lease interval is the lease expiration interval
1172         the local server tells to its partner in the potential-
1173         expiration-time option of a BNDUPD message.
1174
1182       o acknowledged potential lease interval
1183
1184         The acknowledged potential lease interval is the potential lease
1185         interval the partner server has most recently acknowledged in
1186         the potential-expiration-time option of a BNDACK message.
1187
1188    The key restriction (and guarantee) that any server makes with
1189    respect to lease intervals is that the actual client lease interval
1190    never exceeds the acknowledged potential lease interval (if any) by
1191    more than a fixed amount.  This fixed amount is called the "Maximum
1192    Client Lead Time" (MCLT).
1193
1194    The MCLT MAY be configurable on the primary server, but for correct
1195    server operation it MUST be the same and known to both the primary
1196    and secondary servers.  The secondary server determines the MCLT from
1197    the MCLT option sent from the primary server to the secondary server
1198    in the CONNECT message.
1199
1200    A server MUST record in its stable storage both the actual lease
1201    interval and the most recently acknowledged potential lease interval
1202    for each IP address binding.  It is assumed that the desired client
1203    lease interval can be determined through techniques outside of the
1204    scope of this protocol.  See section 7.1.5 for more details concern-
1205    ing the times that the server MUST record in its stable storage and
1206    the way that they interact with the lease time that may be offered to
1207    a DHCP client.
1208
1209    Again, the fundamental relationship among these times which MUST be
1210    maintained is:
1211
1212        actual lease interval <
1213        ( acknowledged potential lease interval + MCLT )
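
        As a minimal sketch, the relationship can be checked with a
        one-line predicate.  The function name is hypothetical, and all
        values are assumed to be in seconds measured from the same
        instant.  Treating "no acknowledgement yet" as an acknowledged
        potential interval of zero is also an assumption.  Equality is
        accepted here, following the wording "never exceeds ... by more
        than"; a strict reading would use < instead.

```python
def within_mclt(actual, acked_potential, mclt):
    """True if the client's lease leads the partner's view by at most MCLT.

    Equality is accepted in this sketch; replace <= with < for a strict
    reading of the relationship.
    """
    return actual - acked_potential <= mclt


MCLT = 3600        # example value: one hour
THREE_DAYS = 3 * 24 * 3600

# Nothing acknowledged yet: only a lease of MCLT is acceptable.
print(within_mclt(actual=MCLT, acked_potential=0, mclt=MCLT))
print(within_mclt(actual=THREE_DAYS, acked_potential=0, mclt=MCLT))
# Once the partner has acknowledged three days, three days may be given.
print(within_mclt(actual=THREE_DAYS, acked_potential=THREE_DAYS, mclt=MCLT))
```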
1214
1215
1216    Figure 5.2.1-1 illustrates an initial lease to a client using the
1217    rules discussed in the example which follows it.  Note that this is
1218    only one example -- as long as the fundamental relationship is
1219    preserved, the actual times used could be quite different.
1220
1239               DHCP                 Primary             Secondary
1240        time   Client               Server               Server
1241
1242                 | (time in intervals) |  (absolute time)   |
1243                 |                     |                    |
1244                 | >-DHCPDISCOVER->    |                    |
1245                 |     <---DHCPOFFER-< |                    |
1246                 |                     |                    |
1247                 | >-DHCPREQUEST->     |                    |
1248                 |   (selecting)       |                    |
1249                 |                     |                    |
1250          t      |  <--------DHCPACK-< |                    |
1251                 |  lease-time=MCLT    |                    |
1252                 |                     |    >-BNDUPD-->     |
1253                 |                     |  lease-expiration=t+MCLT
1254                 |                     |  potential-expiration=t+(MCLT/2)+X
1255                 |                     |                    |
1256                 |                     |     <-BNDACK-<     |
1257                 |                     |  potential-expiration=t+(MCLT/2)+X
1258                ...                   ...                  ...
1259                 |                     |                    |
1260       t+MCLT/2  | >-DHCPREQUEST->     |                    |
1261                 |      (renew)        |                    |
1262                 |                     |                    |
1263          t1     |  <--------DHCPACK-< |                    |
1264                 |   lease-time=X      |                    |
1265                 |                     |    >-BNDUPD-->     |
1266                 |                     |  lease-expiration=t1+X
1267                 |                     |  potential-expiration=t1+(X/2)+X
1268                 |                     |                    |
1269                 |                     |     <-BNDACK-<     |
1270                 |                     |  potential-expiration=t1+(X/2)+X
1271                ...                   ...                  ...
1272
1273            Figure 5.2.1-1:  Lazy Update Message Traffic
1274                           X = Desired Lease Interval
1275                           Assumes renewal interval = lease interval / 2
1276
1277
1278    DISCUSSION:
1279
1280       This protocol mandates only that the above fundamental relation-
1281       ship concerning lease intervals is preserved.
1282
1283       In the interests of clarity, however, let's examine a specific
1284       example.  The MCLT in this case is 1 hour.  The desired lease
1285       interval is 3 days, and its renewal time is half the lease
1294       interval.
1295
1296       The rules for this example are:
1297
1298       o What to tell the client:
1299
1300         Take the remainder of the acknowledged potential lease interval.
1301         If this is a new lease, then this value will be zero.  If this
1302         remainder plus the MCLT is greater than the desired lease inter-
1303         val, give the client the desired lease interval; otherwise,
1304         give the client the remainder plus the MCLT.
1305
1306       o What to tell the failover partner server:
1307
1308         Take the renewal interval (typically half of the actual client
1309         lease interval), add to it the desired lease interval, and add
1310         it to the current time to yield the value that goes into the
1311         potential-expiration-time option.
1312
1313         Also tell the failover partner the actual lease interval by
1314         adding it to the current time to yield the value that goes into
1315         the lease-expiration option.
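
The two rules above can be sketched in a few lines of Python.  This is
only an illustration under the assumptions of this example (MCLT of 1
hour, desired lease interval of 3 days, renewal interval equal to half
the actual lease interval).  The function and variable names are
hypothetical and are not defined by this protocol.

```python
from datetime import datetime, timedelta

MCLT = timedelta(hours=1)      # example value from this discussion
DESIRED = timedelta(days=3)    # desired lease interval, example value


def client_lease_interval(ack_potential_remaining, desired=DESIRED, mclt=MCLT):
    """What to tell the client.

    Give the desired interval if the remainder of the acknowledged
    potential lease interval plus the MCLT exceeds it; otherwise give
    the remainder plus the MCLT.
    """
    remaining = max(ack_potential_remaining, timedelta(0))
    return min(desired, remaining + mclt)


def partner_times(now, actual, desired=DESIRED):
    """What to tell the failover partner.

    Returns (lease_expiration, potential_expiration).  The renewal
    interval is assumed to be half of the actual client lease interval.
    """
    renewal = actual / 2
    lease_expiration = now + actual
    potential_expiration = now + renewal + desired
    return lease_expiration, potential_expiration
```

For a new lease the remainder is zero, so client_lease_interval()
yields the MCLT, and partner_times() yields a potential expiration of
the current time plus half the MCLT plus the desired lease interval.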
1316
1317       In operation this might work as follows:
1318
1319       When a server makes an offer for a new lease on an IP address to a
1320       DHCP client, it determines the desired lease interval (in this
1321       case, 3 days).  It then examines the acknowledged potential lease
1322       interval (which in this case is zero) and determines the remainder
1323       of the time left to run, which is also zero.  To this it adds the
1324       MCLT.  Since the actual lease interval cannot be allowed to exceed
1325       the remainder of the current acknowledged potential lease interval
1326       plus the MCLT, the offer made to the client is for the remainder
1327       of the current acknowledged potential lease interval (i.e., zero)
1328       plus the MCLT.  Thus, the actual lease interval is 1 hour.
1329
1330       Once the server has sent the DHCPACK to the DHCP client, it will
1331       update the secondary server with the lease information.  However,
1332       the desired potential lease interval will be composed of one half
1333       of the current actual lease interval added to the desired lease
1334       interval.  Thus, the secondary server is updated with a BNDUPD
1335       with a lease interval of 3 days + 1/2 hour specified in the
1336       potential-expiration-time option.
1337
1338       When the primary server receives an ACK to its update of the
1339       secondary server's (partner's) potential lease interval, it
1340       records that as the acknowledged potential lease interval.  A
1341       server MUST NOT send a BNDACK in response to a BNDUPD message
1350       until it is sure that the information in the BNDUPD message
1351       resides in its stable storage.  Thus, the primary server in this
1352       case can be sure that the secondary server has recorded the poten-
1353       tial lease interval in its stable storage when the primary server
1354       receives a BNDACK message from the secondary server.
1355
1356       When the DHCP client attempts to renew at T1 (approximately one
1357       half an hour from the start of the lease), the primary server
1358       again determines the desired lease interval, which is still 3
1359       days.  It then compares this with the remaining acknowledged
1360       potential lease interval (3 days + 1/2 hour) and adjusts for the
1361       time passed since the secondary was last updated (1/2 hour).  Thus
1362       the time remaining of the acknowledged potential lease interval is
1363       3 days.  Adding the MCLT to this yields 3 days plus 1 hour, which
1364       is more than the desired lease interval of 3 days.  So the client
1365       is renewed for the desired lease interval -- 3 days.
1366
1367       When the primary DHCP server updates the secondary DHCP server
1368       after the DHCP client's renewal ACK is complete, it will calculate
1369       the desired potential lease interval as the T1 fraction of the
1370       actual client lease interval (1/2 of 3 days this time = 1.5 days).
1371       To this it will add the desired client lease interval of 3 days,
1372       yielding a total desired partner server lease interval of 4.5
1373       days.  In this way, the primary attempts to have the secondary
1374       always "lead" the client in its understanding of the client's
1375       lease interval so as to be able to always offer the client the
1376       desired client lease interval.
1377
1378       Once the initial actual client lease interval of the MCLT is past,
1379       the protocol operates effectively like the DHCP protocol does
1380       today in its behavior concerning lease intervals. However, the
1381       guarantee that the actual client lease interval will never exceed
1382       the remaining acknowledged partner server lease interval by more
1383       than the MCLT allows full recovery from a variety of failures.
1384
1385 5.2.2.  Controlled re-allocation of IP addresses
1386
1387    When in PARTNER-DOWN state there is a waiting period after which an
1388    IP address can be re-allocated to another client.  For leases which
1389    are available when the server enters PARTNER-DOWN state, the period
1390    is the MCLT from entry into PARTNER-DOWN state.  For IP addresses
1391    which are not available when the server enters PARTNER-DOWN state,
1392    the period is the MCLT after the lease becomes available.  See sec-
1393    tion 9.4.2 for more details.
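
The waiting period described above can be expressed as a single
calculation.  The following Python sketch is illustrative only; the
function name and parameters are hypothetical, and section 9.4.2
remains the authoritative description.

```python
from datetime import datetime, timedelta


def earliest_reallocation(partner_down_entry, became_available, mclt):
    """Earliest time an address may be given to a different client.

    partner_down_entry: when this server entered PARTNER-DOWN state.
    became_available:   when the lease became available.
    If the lease was already available on entry, the wait runs from the
    entry time; otherwise it runs from when the lease became available.
    """
    return max(partner_down_entry, became_available) + mclt
```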
1394
1395    In any other state, a server cannot reallocate an address from one
1396    client to another without first notifying its partner (through a
1397    BNDUPD message) and receiving acknowledgement (through a BNDACK
1406    message) that its partner is aware that that first client is not
1407    using the address.
1408
1409    This could be modeled in the following way.  Though this specific
1410    implementation is in no way required, it may serve to better illus-
1411    trate the concept.
1412
1413    An "available" IP address on a server may be allocated to any client.
1414    An IP address which was leased to a client and which expired or was
1415    released by that client would take on a new state, EXPIRED or
1416    RELEASED respectively.  The partner server would then be notified
1417    that this IP address was EXPIRED or RELEASED through a BNDUPD.  When
1418    the sending server received the BNDACK for that IP address showing it
1419    was FREE, it would move the IP address from EXPIRED or RELEASED to
1420    FREE, and it would be available for allocation by the primary server
1421    to any clients.
1422
1423    A server MAY reallocate an IP address in the EXPIRED or RELEASED
1424    state to the same client with no restrictions provided it has not
1425    sent a BNDUPD message to its partner.  This situation would exist if
1426    the lease expired or was released after the transition into PARTNER-
1427    DOWN state, for instance.
1428
1429
1430 5.3.  Load balancing
1431
1432    In order to implement load balancing between a primary and secondary
1433    server pair, each server must respond to DHCPDISCOVER requests from
1434    some clients and not from other clients.  In order to do this suc-
1435    cessfully, each server must be able to determine immediately upon
1436    receipt of a DHCP client request whether it is to service this
1437    request or to ignore it in order to allow the other server to service
1438    the request.
1439
1440    In addition, it should be possible to configure the percentage of
1441    clients which will be serviced by either the primary or secondary
1442    server.  This configuration should be more or less continuous, from
1443    all clients serviced by the primary through an even split with half
1444    serviced by each, to all clients serviced by the secondary.
1445
1446    The technique chosen to support these goals is described in [LOADB].
1447
1448    A bitmap-style Hash Bucket Assignment (as described in [LOADB]) is
1449    used to determine which DHCP clients can be processed.  There are two
1450    potential HBAs in a failover server -- a server HBA and a failover
1451    HBA.  The way that a server acquires a server HBA is outside of the
1452    scope of the failover protocol, but both servers in a failover pair
1453    MUST have the same server HBA. The failover HBA is sent by the
1462    primary server to the secondary server whenever a connection is esta-
1463    blished, using the hash-bucket-assignment option defined in section
1464    12.11.
1465
1466    When using the server HBA (if any) and the failover HBA (if any) to
1467    decide whether to process a DHCP request, the server HBA always
1468    applies in every failover state, and the failover HBA (which MUST be
1469    a subset of the server HBA) is used by the secondary server to decide
1470    which packets to process when in NORMAL state.
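
As a rough illustration of how the two HBAs combine, the following
Python sketch shows the decision made by a secondary server.  Several
details are assumptions made for the sketch and are not taken from this
document: the bitmap is assumed to hold 256 buckets in 32 bytes, the
bit ordering is arbitrary, both HBAs are assumed to be present, and
CRC-32 is used as a stand-in hash (it is NOT the hash function
described in [LOADB]).  All function names are hypothetical.

```python
import zlib

BUCKETS = 256  # assumed bucket count: one bit each in a 32-byte bitmap


def bucket_for(client_id: bytes) -> int:
    """Map a client identifier to a bucket (stand-in hash, not [LOADB]'s)."""
    return zlib.crc32(client_id) % BUCKETS


def hba_has(hba: bytes, bucket: int) -> bool:
    """True if the bucket's bit is set (this bit ordering is an assumption)."""
    return bool(hba[bucket // 8] & (1 << (bucket % 8)))


def is_subset(failover_hba: bytes, server_hba: bytes) -> bool:
    """Check that every bucket in the failover HBA is also in the server HBA."""
    return all((f & ~s & 0xFF) == 0 for f, s in zip(failover_hba, server_hba))


def secondary_should_process(client_id, server_hba, failover_hba, state):
    """Decide whether the secondary server processes this client's request."""
    bucket = bucket_for(client_id)
    if not hba_has(server_hba, bucket):   # server HBA applies in every state
        return False
    if state == "NORMAL":                 # failover HBA applies only in NORMAL
        return hba_has(failover_hba, bucket)
    return True
```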
1471
1472 5.4.  Operating in NORMAL state
1473
1474    When in NORMAL state, each server services DHCPDISCOVERs and all
1475    other DHCP requests except DHCPREQUEST/RENEWAL or
1476    DHCPREQUEST/REBINDING from the client set defined by the load balanc-
1477    ing algorithm [LOADB].  Each server services DHCPREQUEST/RENEWAL or
1478    DHCPREQUEST/REBINDING requests from any client.
1479
1480    In general, whenever the binding database is changed in stable
1481    storage (other than a change resulting from receiving a BNDUPD from
1482    the failover partner), then a BNDUPD message is sent with the con-
1483    tents of that change to the partner server.  The partner server then
1484    writes the information about that binding in its bindings database in
1485    stable storage and replies with a BNDACK message.
1486
1487    The binding database in a DHCP server would normally be changed as a
1488    result of DHCP protocol activity with a DHCP client  (e.g., granting
1489    a lease to a DHCP client through the familiar
1490    DISCOVER/OFFER/REQUEST/ACK cycle or extending a lease due to a
1491    renewal from a DHCP client) or possibly (on some servers) because a
1492    lease has expired or undergone another state change that must be
1493    recorded in the DHCP binding database.  These are the state changes
1494    that would be communicated to the partner server using a BNDUPD mes-
1495    sage.  Of course, receipt of a BNDUPD message itself will normally
1496    cause an update of the binding database for all of the IP addresses
1497    contained in the BNDUPD, and a binding database change such as this
1498    MUST NOT trigger a corresponding BNDUPD message to the partner.
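
The update rule above (local changes are sent to the partner, while
changes caused by a received BNDUPD are not echoed back) might be
modeled as follows.  This Python sketch is purely illustrative: the
class, its methods, and the dictionary-based messages are hypothetical,
and the in-memory dictionary merely stands in for stable storage.

```python
class BindingDatabase:
    """Toy binding database for one failover server."""

    def __init__(self, send_to_partner):
        self.bindings = {}              # stand-in for stable storage
        self._send = send_to_partner    # callable that transmits a message

    def local_change(self, ip, binding):
        """A change caused by DHCP client activity or a local state change."""
        self.bindings[ip] = dict(binding)
        # A locally originated change is communicated to the partner.
        self._send({"type": "BNDUPD", "ip": ip, "binding": dict(binding)})

    def receive_bndupd(self, message):
        """A change caused by a BNDUPD from the partner."""
        # Record the update first, then acknowledge it.  No BNDUPD is
        # sent in response: this change MUST NOT be echoed to the partner.
        self.bindings[message["ip"]] = dict(message["binding"])
        return {"type": "BNDACK", "ip": message["ip"]}
```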
1499
1500 5.5.  Operating in COMMUNICATIONS-INTERRUPTED state
1501
1502    When operating in COMMUNICATIONS-INTERRUPTED state, each server is
1503    operating independently, but does not assume that its partner is not
1504    operating.  The partner server might be operating and simply unable
1505    to communicate with this server, or might not be operating.
1506
1507    Each server responds to the full range of DHCP client messages that
1508    it receives, but in such a way that graceful reintegration is always
1509    possible when its partner comes back into contact with it.
1510
1518 5.6.  Operating in PARTNER-DOWN state
1519
1520    When operating in PARTNER-DOWN state, a server assumes that its
1521    partner is not currently operating, but does make allowances for the
1522    possibility that that server was operating in the past, though possi-
1523    bly out of communications with this server.  It responds to all DHCP
1524    client requests in PARTNER-DOWN state.
1525
1526 5.7.  Operating in RECOVER state
1527
1528    A server operating in RECOVER state assumes that it is reintegrating
1529    with a server that has been operating in PARTNER-DOWN state, and that
1530    it needs to update its bindings database before it services DHCP
1531    client requests.
1532
1533    A server may also operate in RECOVER state in order to fully recover
1534    its bindings database from its partner server.
1535
1536 5.8.  Operating in STARTUP state
1537
1538    A server operating in STARTUP state assumes that failover is opera-
1539    tional, and it spends a short time whenever it comes up attempting to
1540    contact the partner.  During this time (generally a few seconds), the
1541    server is unresponsive to DHCP client requests.  This period exists
1542    in order to give a server a chance to determine that its partner has
1543    changed state since it was last in communications, and to react to
1544    that changed state (if any) prior to responding to DHCP client
1545    requests.
1546
1547    The period of time a server remains in STARTUP state SHOULD be long
1548    enough to ensure that it will connect to the other server if that
1549    server is available for connections.
1550
1551 5.9.  Time synchronization between servers
1552
1553    The failover protocol is designed to operate between two servers
1554    which have time values which differ by an arbitrarily large amount.
1555    A particular implementation MAY choose to only support servers whose
1556    time values differ by an arbitrarily small amount.
1557
1558    In any event, whether large or only small differences in time values
1559    are supported, every message that is received MUST be tagged with a
1560    time value as soon as possible after receipt.  This time value is
1561    used along with the time value that is sent in every message between
1562    the failover partners to develop a delta time between the servers.
1563    This delta time is used during the connection process to establish a
1564    baseline delta time between the servers, and upon receipt of each
1565    message, the delta time for that message is used to refine the delta
1574    time for the server pair.
1575
1576    While the algorithm for this refinement of delta time is not speci-
1577    fied as part of this protocol, a server SHOULD allow the delta time
1578    value for a pair of failover servers to be periodically updated to
1579    account for time drift.  In addition, the delta time value between
1580    servers SHOULD be smoothed in some fashion, so that transient network
1581    delays will not cause it to vary wildly.
1582
1583    A server SHOULD treat a drastic change in the delta time value as an
1584    event to be signaled to a network administrator, and SHOULD also
1585    reset the time delta between the failover partners.
1586
1587    The specific definitions of a minor or drastic change in delta time
1588    as well as the algorithm used to smooth minor changes into the run-
1589    ning delta time are implementation issues and are not further
1590    addressed in this document.
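
Since the refinement algorithm is left to the implementation, the
following Python sketch shows just one possible approach: an
exponentially weighted moving average with a threshold for drastic
changes.  Everything here is an assumption made for illustration,
including the class name, the sign convention (receive time minus sent
time), the smoothing factor, and the threshold value.

```python
class DeltaTime:
    """Running estimate of the time difference between failover partners."""

    def __init__(self, alpha=0.125, drastic_threshold=60.0):
        self.alpha = alpha                          # smoothing factor (assumed)
        self.drastic_threshold = drastic_threshold  # seconds (assumed)
        self.delta = None                           # no baseline yet

    def observe(self, sent_time, received_time):
        """Fold one message's delta into the running value.

        sent_time:     time value carried in the partner's message.
        received_time: local time tagged on the message at receipt.
        Returns "baseline", "smoothed", or "drastic".
        """
        sample = received_time - sent_time
        if self.delta is None:
            self.delta = sample          # establish the baseline delta
            return "baseline"
        if abs(sample - self.delta) > self.drastic_threshold:
            self.delta = sample          # reset; caller should alert an admin
            return "drastic"
        self.delta += self.alpha * (sample - self.delta)
        return "smoothed"
```

A caller would typically log or raise an alarm when observe() returns
"drastic", corresponding to the event that is signaled to a network
administrator.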
1591
1592 5.10.  IP address binding-status
1593
1594    In most DHCP servers an IP address can take on several different
1595    binding-status values, sometimes also called states.  While probably
1596    no two DHCP servers have exactly the same set of binding-status
1597    values, the DHCP RFC enforces some commonality among the general
1598    semantics of the binding-status values used by various DHCP server
1599    implementations.
1600
1601    In order to transmit binding database updates between one server and
1602    another using the failover protocol, some common denominator
1603    binding-status values must be defined.  It is not expected that these
1604    binding-status values correspond with any actual implementation of
1605    the DHCP protocol in a DHCP server, but rather that the binding-
1606    status values defined in this document should be a common denominator
1607    of those in use by many DHCP server implementations.  It is a goal of
1608    this protocol that any DHCP server can map the various IP address
1609    binding-status values that it uses internally into these failover IP
1610    address binding-status values on transmission of binding database
1611    updates to its partner, and likewise that it can map any failover IP
1612    address binding-status values it received in a binding update into
1613    its internal IP address binding-status values.
1614
1615    The IP address binding-status values defined for the failover proto-
1616    col are listed below.  Unless otherwise noted below, there MAY be
1617    client information associated with each of these binding-status
1618    values.
1619
1630       o ACTIVE -- Lease is assigned to a client. Client identification
1631         MUST appear.
1632
1633       o EXPIRED -- indicates that a client's binding on an IP address
1634         has expired. When the partner server ACK's the BNDUPD of an
1635         EXPIRED IP address, the server sets its internal state to FREE.
1636         It is then available for allocation to any client of the primary
1637         server.  It may be allocated to the same client on the server
1638         where the lease expired if a BNDUPD containing the EXPIRED state
1639         has not yet been sent to the partner (e.g., in the event that
1640         the servers are not in communication).  Client identification
1641         SHOULD appear.
1642
1643       o RELEASED -- indicates that a DHCP client sent in a DHCPRELEASE
1644         message.  When the partner server ACK's the BNDUPD of a
1645         RELEASED IP address, the server sets its internal state to FREE,
1646         and it is available for allocation by the primary server to any
1647         DHCP client.  It may be allocated to the same client if a BNDUPD
1648         has not yet been sent to the partner.  Client identification
1649         SHOULD appear.
1650
1651       o FREE -- is used when a DHCP server needs to communicate that an
1652         IP address is unused by any DHCP client, but it was not just
1653         released, expired, or reset by a network administrator.  When
1654         the partner server ACK's the BNDUPD of a FREE IP address, the
1655         server sets its internal state such that it is available for
1656         allocation by the primary DHCP server to any DHCP client.  (Note
1657         that in PARTNER-DOWN state, after waiting the MCLT, the IP
1658         address MAY be allocated to a DHCP client by the secondary
1659         server.)
1660
1661         Note that when an IP address that was allocated by the secondary
1662         reverts to the FREE state, it must (like any other IP address)
1663         be assigned to the secondary through the POOLREQ/BNDUPD process
1664         before the secondary can reallocate it.
1665
1666         Client identification MAY appear.
1667
1668       o ABANDONED -- indicates that an IP address is considered unusable
1669         by the DHCP subsystem.  An IP address for which a valid PING
1670         response was received SHOULD be set to ABANDONED.  An IP address
1671         for which a DHCPDECLINE was received should be set to ABANDONED.
1672         Client identification MUST NOT appear.
1673
1674       o RESET -- indicates that this IP address was made available by
1675         operator command.  This is a distinct state so that the reason
1676         that the IP address became FREE can be determined.  Client iden-
1677         tification MAY appear.
1678
1686       o BACKUP -- indicates that this IP address can be allocated by the
1687         secondary server to a DHCP client at any time. When the MCLT has
1688         passed after its time of entry into PARTNER-DOWN state, the IP
1689         address may be allocated by the primary to any DHCP client.
1690         Client identification MAY appear.
1691
1692    These binding-status values are communicated from one failover
1693    partner to another using the binding-status option; see section 12.3
1694    for details of this option.  Unless otherwise noted above, there MAY
1695    be client information associated with each of these binding-status
1696    values.
1697
1698    An IP address will move between these binding-status values using the
1699    following state transition diagram:
1700
1744                                         DHCP client DECLINE or
1745                                         server detected problem
1746                                         from any state
1747                           +----------+     V   +---------+
1748          External   >---->|   RESET  |     |   |ABANDONED|
1749          command          |          |     +-->|         |
1750                           +----------+         +---------+
1751                                |
1752                            Comm w/Partner(1)
1753                                V
1754      +---------+  Comm(1) +----------+   Comm(1) +---------+
1755      | EXPIRED |--------->|  FREE    |<----------| RELEASED|
1756      |         | w/Partner|          | w/Partner |         |
1757      +---------+          +----------+           +---------+
1758        ^     ^             |    |  +-----------+       ^
1759        |     |             |    |              |       |
1760        | Exp. grace     IP |  IP addr alloc.  IP addr  |
1761        | period ends  address  to sec.(2)     reserved |
1762        |     |        leased    V              V       |
1763        |     |        by   | +----------+ +---------+  |
1764        |     |        primary|  BACKUP  | | BACKUP- |  |
1765        |   wait for        | |          | | RESERVED|  |
1766        |  grace period     | +----------+ +---------+  |
1767        |     |             |       |                   |
1768        |     |             |    IP addr leased by      |
1769        |  Expired grace    |       secondary           |
1770        |  period exists    V       V                   |
1771        |     |           +----------+                  |
1772        |     | Lease on  |  ACTIVE  | DHCPRELEASE      |
1773        +-----+-IP addr---|          |------------------+
1774                expires   +----------+
1775
1776
1777        Figure 5.10-1:  Transitions between binding-status values.
1778
1779        (1) This transition MAY also occur if the server is in
1780        PARTNER-DOWN state and the MCLT has passed since entry
1781        into the RELEASED, EXPIRED, or RESET state.
1782
1783        (2) This transition MAY occur if the server is the secondary
1784        and the MCLT has passed since its entry into PARTNER-DOWN state.
1785
1786
1787
1788    Again, note that a DHCP server implementing the failover protocol
1789    does not have to implement either this state machine or use these
1798    particular binding-status values in its normal operation of allocat-
1799    ing IP addresses to DHCP clients.  It only needs to map its internal
1800    binding-status values onto these "standard" binding-status values,
1801    and map these "standard" binding-status values back into its internal
1802    binding-status values.  For example, a server which implements a
1803    grace period for an IP address binding SHOULD simply wait to update
1804    its partner server until the grace period on that binding has run
1805    out.
1806
1807    The process of setting an IP address to FREE deserves some detailed
1808    discussion.  When an IP address is moved to the EXPIRED, RELEASED, or
1809    RESET binding-status on a server, that server will send a BNDUPD with
1810    the binding-status of EXPIRED, RELEASED, or RESET to its partner.  If
1811    its partner agrees that this is acceptable (see sections 7.1.2 and
1812    7.1.3 for why a server might not accept a BNDUPD), it will return a
1813    BNDACK with no reject-reason, signifying that it accepted the update.
1814    As part of the BNDUPD processing, the server returning the BNDACK
1815    will set the binding-status of the IP address to FREE, and upon
1816    receipt of the BNDACK the server which sent the BNDUPD will set the
1817    binding-status of the IP address to FREE.  Thus, the EXPIRED,
1818    RELEASED, or RESET binding-status is something of a transitory state.
1819    This process is encoded in the transition diagram above by "Comm
1820    w/Partner".
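
The exchange just described can be modeled as a small sketch in Python.
It is illustrative only: the class, the method names, the dictionary
message layout, and the "rejected" placeholder are hypothetical, and
the conditions under which a partner rejects an update (sections 7.1.2
and 7.1.3) are reduced to a single boolean.

```python
TRANSITORY = {"EXPIRED", "RELEASED", "RESET"}


class FailoverServer:
    """Tracks the binding-status of IP addresses on one server."""

    def __init__(self):
        self.status = {}

    def mark(self, ip, new_status):
        """Move an address to a transitory status and build the BNDUPD."""
        if new_status not in TRANSITORY:
            raise ValueError("expected EXPIRED, RELEASED, or RESET")
        self.status[ip] = new_status
        return {"type": "BNDUPD", "ip": ip, "binding_status": new_status}

    def handle_bndupd(self, msg, acceptable=True):
        """Partner side: accept the update and set the address to FREE."""
        if not acceptable:
            return {"type": "BNDACK", "ip": msg["ip"], "reject_reason": "rejected"}
        if msg["binding_status"] in TRANSITORY:
            self.status[msg["ip"]] = "FREE"
        else:
            self.status[msg["ip"]] = msg["binding_status"]
        return {"type": "BNDACK", "ip": msg["ip"], "reject_reason": None}

    def handle_bndack(self, ack):
        """Sending side: on an accepted update, set the address to FREE."""
        ip = ack["ip"]
        if ack["reject_reason"] is None and self.status.get(ip) in TRANSITORY:
            self.status[ip] = "FREE"
```

In this model the address is FREE on the partner as soon as the BNDUPD
is processed, and FREE on the sender only once the BNDACK arrives,
which matches the transitory nature of the three statuses.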
1821
1822 5.11.  DNS dynamic update considerations
1823
1824    DHCP servers (and clients) can use DNS Dynamic Updates as described
1825    in [RFC 2136] to maintain DNS name-mappings as they maintain DHCP
1826    leases.  Many different administrative models for DHCP-DNS integra-
1827    tion are possible.  Descriptions of several of these models, and
1828    guidelines that DHCP servers and clients should follow in carrying
1829    them out, are laid out in [DDNS].  The nature of the DHCP failover
1830    protocol introduces some issues concerning dynamic DNS updates that
1831    are not part of non-failover DHCP environments.  This section
1832    describes these issues, and defines the information which failover
1833    partners should exchange and the protocol which they should follow in
1834    order to ensure consistent behavior.  The presence of this section
1835    should not be interpreted as requiring that implementations of the
1836    DHCP failover protocol must also support DDNS updates.  The purpose
1837    of this discussion is to clarify the areas where the DHCP failover
1838    and DHCP-DDNS protocols intersect for the benefit of implementations
1839    which support both protocols, not to introduce a new requirement into
1840    the DHCP failover protocol.  Thus, a DHCP server which implements the
1841    failover protocol MAY also support dynamic DNS updates, but if it
1842    does support dynamic DNS updates it SHOULD utilize the techniques
1843    described here in order to correctly distribute them between the
1844    failover partners.
1845
1854    From the standpoint of the failover protocol, there is no reason why
1855    a server which is utilizing the DDNS protocol to update a DNS server
1856    should not be a partner with a server which is not utilizing the DDNS
1857    protocol to update a DNS server.  However, a server which is not able
1858    to support DDNS or is not configured to support DDNS SHOULD output a
1859    warning message when it receives BNDUPD messages which indicate that
1860    its failover partner is configured to support the DDNS protocol to
1861    update a DNS server.  An implementation MAY consider this an error
1862    and refuse to operate, or it MAY choose to operate anyway, having
1863    warned the user of the problem in some way.
1864
1865 5.11.1.  Relationship between failover and dynamic DNS update
1866
1867    The failover protocol describes the conditions under which each fail-
1868    over server may renew a lease to its current DHCP client, and
1869    describes the conditions under which it may grant a lease to a new
1870    DHCP client.  An analogous set of conditions determines when a fail-
1871    over server should initiate a DDNS update, and when it should attempt
1872    to remove records from the DNS. The failover protocol's conditions
1873    are based on the desired external behavior: avoiding duplicate
1874    address assignments; allowing clients to continue using leases which
1875    they obtained from one failover partner even if they can only commun-
1876    icate with the other partner; allowing the backup DHCP server to
1877    grant new leases even if it is unable to communicate with the primary
1878    server.  The desired external DDNS behavior for DHCP failover servers
1879    is:
1880
1881       1.  Allow timely DDNS updates from the server which grants a
1882           client a lease. Recognize that there is often a DDNS update
1883           lifecycle which parallels the DHCP lease lifecycle. This is
1884           likely to include the addition of records when the lease is
1885           granted, and the removal of DNS records when the lease is sub-
1886           sequently made available for allocation to a different client.
1887
1888       2.  Communicate enough information between the two failover
1889           servers to allow one to complete the DDNS update 'lifecycle'
1890           even if the other server originally granted the lease.
1891
1892       3.  Avoid redundant or overlapping DDNS updates, where both fail-
1893           over servers are attempting to perform DDNS updates for the
1894           same lease-client binding. Avoid situations where one partner
1895           is attempting to add RRs related to a lease binding while the
1896           other partner is attempting to remove RRs related to the same
1897           lease binding.
1898
1899 5.11.2.  Use of the DDNS option
1900
1901    In order for either server to be able to complete a DDNS update, or
1910    to remove DNS records which were added by its partner, both servers
1911    need to know the FQDN associated with the lease-client binding. The
1912    FQDN associated with the client's A RR and PTR RR SHOULD be communi-
1913    cated from the server which adds records into the DNS to its partner.
1914    The initiating server SHOULD use the DDNS option in the BNDUPD mes-
1915    sages to inform the partner server of the status of any DDNS updates
1916    associated with a lease binding. Failover servers MAY choose not to
1917    include the DDNS option in BNDUPD messages if there has been no
1918    change in the status of any DDNS update related to the lease binding.
1919    The partner server receiving BNDUPD messages containing the DDNS
1920    option SHOULD compare the status flags and the FQDN contained in the
1921    option data with the current DDNS information it has associated with
1922    the lease binding, and update its notion of the DDNS status accord-
1923    ingly.
1924
1925    The initiating server MAY send a BNDUPD to its partner before the
1926    DDNS update has been successfully completed. If it does so, it SHOULD
1927    leave the 'C' bit in the Flags field clear, to indicate to the
1928    partner that the DDNS update may not be complete. When the DDNS
1929    update has been successfully acknowledged by the DNS server, the ini-
1930    tiating DHCP server SHOULD include the DDNS option in its next BNDUPD
1931    message about the binding, so that the partner server will be able to
1932    record the final status of the DDNS update. The initiating server
1933    SHOULD set the 'C' bit in the DDNS option if the DDNS update was suc-
1934    cessfully accepted by the DNS server.
1935
1936    Some implementations will choose to send a BNDUPD without waiting for
1937    the DDNS update to complete, and then will send a second BNDUPD once
1938    the DDNS update is complete. Other implementations will delay sending
1939    the partner a BNDUPD until the DDNS update has been acknowledged by
1940    the DNS server, or until some time-limit has elapsed, in order to
1941    avoid sending a second BNDUPD.
1942
1943    The Domain Name field in the DDNS option contains the FQDN that will
1944    be associated with the A RR (if the server is performing an A RR
1945    update for the client) and the PTR RR. This FQDN may be composed in
1946    any of several ways, depending on server configuration and the infor-
1947    mation provided by the client in its DHCP messages. The client may
1948    supply a hostname which it would like the server to use in forming
1949    the FQDN, or it may supply the entire FQDN. The server may be config-
1950    ured to attempt to use the information the client supplies, it may be
1951    configured with an FQDN to use for the client, or it may be config-
1952    ured to synthesize an FQDN. The responsive server SHOULD include the
1953    FQDN that it will be using in DDNS updates it initiates when it sends
1954    the DDNS option.
1955
1956    Since the responsive server may not have completed the DDNS update at
1957    the time it sends the first BNDUPD about the lease binding, there may
1966    be cases where the FQDN in later BNDUPD messages does not match the
1967    FQDN included in earlier messages.  For example, the responsive
1968    server may be configured to handle situations where two or more DHCP
1969    client FQDNs are identical by modifying the most-specific label in
1970    the FQDNs of some of the clients in an attempt to generate unique
1971    FQDNs for them (a process sometimes called "disambiguation").  Alter-
1972    natively, at sites which use some or all of the information which
1973    clients supply to form the FQDN, it's possible that a client's confi-
1974    guration may be changed so that it begins to supply new data.  The
1975    responsive server may react by removing the DNS records which it ori-
1976    ginally added for the client, and replacing them with records that
1977    refer to the client's new FQDN. In such cases, the responsive server
1978    SHOULD include the actual FQDN that was used in subsequent DDNS
1979    options.  The responsive server SHOULD include relevant client-option
1980    data in the client-request-options option in its BNDUPD messages.
1981    This information may be necessary in order to allow the non-
1982    responsive partner to detect client configuration changes that change
1983    the hostname or FQDN data which the client includes in its DHCP
1984    requests.
1985
1986 5.11.3.  Adding RRs to the DNS
1987
1988    A failover server which is going to perform DDNS updates SHOULD ini-
1989    tiate the DDNS update when it grants a new lease to a client. The
1990    non-responsive partner SHOULD NOT initiate a DDNS update when it
1991    receives the BNDUPD after the lease has been granted. The failover
1992    protocol ensures that only one of the partners will grant a lease to
1993    any individual client, so it follows that this requirement will
1994    prevent both partners from initiating updates simultaneously. The
1995    server initiating the update SHOULD follow the protocol in [DDNS].
1996    The server may be configured to perform an A RR update on behalf of
1997    its clients, or not. Ordinarily, a failover server will not initiate
1998    DDNS updates when it renews leases. In two cases, however, a failover
1999    server MAY initiate a DDNS update when it renews a lease to its
2000    existing client:
2001
2002       1.  When the lease was granted before the server was configured to
2003           perform DDNS updates, the server MAY be configured to perform
2004           updates when it next renews existing leases. Since both
2005           servers are responsive to renewals in NORMAL state, it is not
2006           enough to simply require the non-responsive server to avoid a
2007           DNS update in this case.  The server which would be responsive
2008           to a DHCPDISCOVER from this client (even though the current
2009           request is a DHCPREQUEST/RENEW) is the server which should
2010           initiate the DDNS update.
2011
2012       2.  If a server is in PARTNER-DOWN state, it can conclude that its
2013           partner is no longer attempting to perform an update for the
2022           existing client. If the remaining server has not recorded that
2023           an update for the binding has been successfully completed, the
2024           server MAY initiate a DDNS update.  It MAY initiate this
2025           update immediately upon entry to PARTNER-DOWN state, it MAY
2026           perform this in the background, or it MAY initiate this update
2027           upon next hearing from the DHCP client.
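
The rules in this section can be summarized as a small decision function. The following is a minimal sketch and not part of the protocol. The function name, the event labels ("grant", "bndupd", "renew", "background") and the boolean parameters are hypothetical, and the sketch only reports whether an update is permitted, not whether a server is configured to perform one.

```python
def may_initiate_ddns_add(event, state, update_recorded, would_answer_discover):
    """Return True if this server may start a DDNS add for a binding.

    event: "grant" (this server granted a new lease), "bndupd" (the lease
        was learned from the partner's BNDUPD), "renew", or "background".
    state: this server's failover state, e.g. "NORMAL" or "PARTNER-DOWN".
    update_recorded: True if a completed DDNS update is already recorded.
    would_answer_discover: True if this server is the one that would respond
        to a DHCPDISCOVER from this client.
    """
    if event == "grant":
        return True      # the granting server SHOULD initiate the update
    if event == "bndupd":
        return False     # the non-responsive partner SHOULD NOT
    if update_recorded:
        return False     # nothing left to do for this binding
    if state == "PARTNER-DOWN":
        return True      # case 2: the partner is no longer updating
    # Case 1: on renewal in NORMAL state, only the server that would
    # answer a DHCPDISCOVER from this client initiates the update.
    return state == "NORMAL" and event == "renew" and would_answer_discover
```

For instance, a renewal handled in NORMAL state by the server that would not answer a DHCPDISCOVER from the client yields False, which keeps both partners from updating the same binding.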
2028
2029 5.11.4.  Deleting RRs from the DNS
2030
2031    The failover server which makes an IP address FREE SHOULD initiate
2032    any DDNS deletes, if it has recorded that DNS records were added on
2033    behalf of the client.
2034
2035    A server not in PARTNER-DOWN state "makes an IP address FREE" when it
2036    initiates a BNDUPD with a binding-status of FREE, EXPIRED, or
2037    RELEASED.  Its partner confirms this status by acking that BNDUPD,
2038    and upon receipt of the ACK the server has "made the IP address
2039    FREE".  Conversely, a server in PARTNER-DOWN state "makes an IP
2040    address FREE" when it sets the binding-status to FREE, since in
2041    PARTNER-DOWN state no communication is required with the partner.
2042
2043    It is at this point that it should initiate the DDNS operations to
2044    delete RRs from the DNS. Its partner SHOULD NOT initiate DDNS
2045    deletes for DNS records related to the lease binding as part of send-
2046    ing the BNDACK message.   The partner MAY have issued BNDUPD messages
2047    with a binding-status of FREE, EXPIRED, or RELEASED previously, but
2048    the other server will have NAKed these BNDUPD messages.
2049
2050    The failover protocol ensures that only one of the two partner
2051    servers will be able to make a lease FREE. The server making the
2052    lease FREE may be doing so while it is in NORMAL communication with
2053    its partner, or it may be in PARTNER-DOWN state. If a server is in
2054    PARTNER-DOWN state, it may be performing DDNS deletes for RRs which
2055    its partner added originally. This allows a single remaining partner
2056    server to assume responsibility for all of the DDNS activity which
2057    the two servers were undertaking.
2058
2059    Another implication of this approach is that no DDNS RR deletes will
2060    be performed while either server is in COMMUNICATIONS-INTERRUPTED
2061    state, since no IP addresses are moved into the FREE state during
2062    that period.
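
The definition of "making an IP address FREE" and its effect on DDNS deletes can be sketched as follows. This is an illustration under stated assumptions, not a normative algorithm: the function names and the string labels for states and binding-status values are hypothetical, and partner_acked stands for "the BNDACK for this BNDUPD has been received".

```python
FREEING_STATUSES = {"FREE", "EXPIRED", "RELEASED"}

def has_made_address_free(state, binding_status, partner_acked):
    """True once this server has "made the IP address FREE"."""
    if state == "PARTNER-DOWN":
        # No communication with the partner is required in this state.
        return binding_status == "FREE"
    # Otherwise the partner must have acked a BNDUPD carrying one of
    # the freeing binding-status values.
    return binding_status in FREEING_STATUSES and partner_acked

def should_delete_rrs(state, binding_status, partner_acked, records_added):
    """True if this server should initiate the DDNS deletes."""
    return records_added and has_made_address_free(
        state, binding_status, partner_acked)
```

In COMMUNICATIONS-INTERRUPTED state no BNDACK can arrive, so partner_acked stays False and no deletes are initiated, which matches the implication noted above.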
2063
2064 5.12.  Reservations and failover
2065
2066    Some DHCP servers support a capability to offer specific
2067    pre-configured IP addresses to DHCP clients.  These are real DHCP
2068    clients that perform the entire DHCP protocol, but these servers always
2069    offer the client a specific pre-configured IP address -- and they
2078    offer that IP address to no other clients.  Such a capability has
2079    several names, but it is sometimes called a "reservation", in that
2080    the IP address is reserved for a particular DHCP client.
2081
2082    In a situation where there are two DHCP servers serving the same
2083    subnet without using failover, the two DHCP servers need to have
2084    disjoint IP address pools, but identical reservations for the DHCP
2085    clients.
2086
2087    In a failover context, both servers need to be configured with the
2088    proper reservations in an identical manner, but if we stop there
2089    problems can occur around the edge conditions where reservations are
2090    made for an IP address that has already been leased to a different
2091    client.  Different servers handle this conflict in different ways,
2092    but the goal of the failover protocol is to allow correct operation
2093    with any server's approach to the normal processing of the DHCP pro-
2094    tocol.
2095
2096    The general solution with regard to reservations is as follows.
2097    Whenever a reserved IP address becomes FREE (i.e., when first config-
2098    ured or whenever a client frees it or it expires or is reset), the
2099    primary server MUST show that IP address as FREE (and thus available
2100    for its own allocation) and it MUST send it to the secondary server
2101    as BACKUP-RESERVED, in order that the secondary server be able to
2102    allocate it as well.
2103
2104    Note that this implies that a reserved IP address goes through the
2105    normal state changes from FREE to ACTIVE (and possibly back to FREE).
2106    The failover protocol supports this approach to reservations, i.e.,
2107    where the IP address undergoes the normal state changes of any IP
2108    address, but it can only be offered to the client for which it is
2109    reserved.  Other approaches to the support of reservations exist in
2110    some DHCP server implementations (e.g., where the IP address is
2111    apparently leased to a particular client forever, without any expira-
2112    tion).  The goal is for the failover protocol to support any of the
2113    usual approaches to reservations, both those that allow an IP address
2114    to go through different states when reserved, and those that don't.
2115
2116    From the above, it follows that a reservation solely on the secondary
2117    will not necessarily allow the secondary to offer that address to the
2118    client to whom it is reserved.  The reservation must also appear on
2119    the primary for the secondary to be able to offer the IP
2120    address to the client to which it is reserved.
2121
2122    When the reservation on an IP address is cancelled, if the IP address
2123    is currently FREE and the server is the primary, or BACKUP and the
2124    server is the secondary, the server MUST send a BNDUPD to the other
2125    server with the binding-status FREE.
2126
2134 5.13.  Dynamic BOOTP and failover
2135
2136    Some DHCP servers support a capability to offer IP addresses to BOOTP
2137    clients without having a particular address previously allocated for
2138    those clients.  This capability is often called something like
2139    "dynamic BOOTP".  It is discussed briefly in RFC 1534 [RFC 1534].
2140
2141    This capability has a negative interaction with the fundamental ele-
2142    ments of the failover protocol, in that an address handed out to a
2143    BOOTP device has no term (or effectively no term, in that usually
2144    they are considered leases for "forever").  There is no opportunity
2145    to hand out a lease which is only the MCLT long when first hearing
2146    from a BOOTP device, because they may only interact once with the
2147    DHCP server and they have no notion of a lease expiration time.  Thus
2148    the entire concept of the MCLT and waiting the MCLT after entering
2149    PARTNER-DOWN state is defeated when dealing with BOOTP devices.
2150
2151    With some restrictions, however, dynamic BOOTP devices can be sup-
2152    ported in a server on a subnet where failover is supported.  The only
2153    restriction (and it is not small) is that on any portion of the sub-
2154    net (in any address pool) where dynamic BOOTP devices can be allo-
2155    cated IP addresses, a DHCP server MUST NOT ever use any of the IP
2156    addresses which were previously available for allocation by its fail-
2157    over partner.  Thus, the addresses allocated by the primary to the
2158    secondary for allocation that might have been allocated to BOOTP dev-
2159    ices MUST NOT ever be used by the primary server even if it is in
2160    PARTNER-DOWN state and has waited the MCLT after entering that state.
2161    Conversely, addresses available for allocation by the primary MUST
2162    NOT be used by the secondary even if it is in PARTNER-DOWN state.  The
2163    reason is that one of those IP addresses could have been
2164    allocated by the secondary server to a BOOTP device, and the primary
2165    server would have no way of ever knowing that happened.
2166
2167 5.14.  Guidelines for selecting MCLT
2168
2169    There is no one correct value for the MCLT.  There is an explicit
2170    tradeoff between various factors in selecting an MCLT value.
2171
2172 5.14.1.  Short MCLT
2173
2174    A short MCLT value will mean that after entering PARTNER-DOWN state,
2175    a server will only have to wait a short time before it can start
2176    allocating its partner's IP addresses to DHCP clients.  Furthermore,
2177    it will only have to wait a short time after the expiration of a
2178    lease on an IP address before it can reallocate that IP address to
2179    another DHCP client.
2180
2181    However the downside of a short MCLT value is that the initial lease
2190    interval that will be offered to every new DHCP client will be short,
2191    which will cause increased traffic as those clients will need to send
2192    their first renewal after half of the short MCLT.  In addition,
2193    the lease extensions that a server in COMMUNICATIONS-INTERRUPTED
2194    state can give will be only the MCLT after the server has been in
2195    COMMUNICATIONS-INTERRUPTED for around the desired client lease
2196    period.  If a server stays in COMMUNICATIONS-INTERRUPTED for that
2197    long, then the leases it hands out will be short and that will
2198    increase the load on that server, possibly causing difficulty.
2199
2200 5.14.2.  Long MCLT
2201
2202    A long MCLT value will mean that the initial lease period will be
2203    longer and the time that a server in COMMUNICATIONS-INTERRUPTED state
2204    will be able to extend leases (after it has been in COMMUNICATIONS-
2205    INTERRUPTED state for around the desired client lease period) will be
2206    longer.
2207
2208    However, a server entering PARTNER-DOWN state will have to wait the
2209    longer MCLT before being able to allocate its partner's IP addresses
2210    to new DHCP clients.  This may mean that additional IP addresses are
2211    required in order to cover this time period.  Further, the server in
2212    PARTNER-DOWN will have to wait the longer MCLT from every lease
2213    expiration before it can reallocate an IP address to a different DHCP
2214    client.
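
The tradeoff can be made concrete with a little arithmetic. The sketch below is illustrative only. It assumes, as described above, that a new client's initial lease equals the MCLT and that the client sends its first renewal after half of that lease. The function name and dictionary keys are hypothetical.

```python
def mclt_tradeoff(mclt_seconds):
    """Summarize the timing consequences of a given MCLT value."""
    return {
        # Initial lease interval offered to every new DHCP client.
        "initial_lease": mclt_seconds,
        # A client renews after half of its lease, so a short MCLT
        # means an early first renewal and more traffic.
        "first_renew_after": mclt_seconds // 2,
        # Wait after entering PARTNER-DOWN state (and after each lease
        # expiration) before an address can be given to another client.
        "partner_down_wait": mclt_seconds,
    }

short = mclt_tradeoff(10 * 60)      # a 10 minute MCLT
long_ = mclt_tradeoff(4 * 60 * 60)  # a 4 hour MCLT
```

With a 10 minute MCLT every new client renews after 5 minutes, but a server in PARTNER-DOWN state waits only 10 minutes. With a 4 hour MCLT the first renewal comes after 2 hours, at the cost of a 4 hour wait.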
2215
2216 6.  Common Message Format
2217
2218    This section discusses the format that all failover
2219    messages have in common, including the message header format as well
2220    as the common option format.  See section 12 for the definitions
2221    of the specific options used in the failover protocol.
2222
2223 6.1.  Message header format
2224
2225    The options contained in the payload data section of the failover
2226    message all use a two byte option number and two byte length format.
2227
2228    All failover protocol messages are sent over the TCP connection
2229    between failover endpoints and encoded using a message format
2230    specific to the failover protocol.
2231
2232    There exists a common message format for all failover messages, which
2233    utilizes the options in a way similar to the DHCP protocol.  For each
2234    message type, some options are required and some are optional.  In
2235    addition, when a message is received any options that are not under-
2236    stood by the receiving server MUST be ignored.
2237
2246    All of the fields in the fixed portion of the message MUST be filled
2247    with correct data in every message sent.
2248
2249    0                   1                   2                   3
2250    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2251    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2252    |        message length (2)     | msg type (1)  |payload off (1)|
2253    +---------------+---------------+---------------+---------------+
2254    |                            time (4)                           |
2255    +---------------------------------------------------------------+
2256    |                            xid (4)                            |
2257    +---------------------------------------------------------------+
2258    |     0 or more additional header bytes  (variable)             |
2259    +---------------------------------------------------------------+
2260    |                    payload data  (variable)                   |
2261    |                                                               |
2262    |               formatted as DHCP-style options                 |
2263    |           using a two byte option code and two byte length    |
2264    |                  See section 6.2 for details.                 |
2265    +---------------------------------------------------------------+
2266
2267
2268
2269    message length - 2 bytes, network byte order
2270
2271    This is the length of the message.  It includes the two byte message
2272    length itself.  The maximum length is 2048 bytes.  The minimum length
2273    is 12.
2274
2275
2302    msg type - 1 byte
2303
2304    The message type field is used to distinguish between messages.
2305
2306    The following message types are defined:
2307
2308    Value   Message Type
2309    -----   ------------
2310    0       reserved    not used
2311    1       POOLREQ     request allocation of addresses
2312    2       POOLRESP    respond with allocation count
2313    3       BNDUPD      update partner with binding info
2314    4       BNDACK      acknowledge receipt of binding update
2315    5       CONNECT     establish connection with the secondary
2316    6       CONNECTACK  respond to attempt to establish connection with partner
2317    7       UPDREQALL   request full transfer of binding info
2318    8       UPDDONE     ack send and ack of req'd binding info
2319    9       UPDREQ      req transfer of un-acked binding info
2320    10      STATE       inform partner of current state or state change
2321    11      CONTACT     probe communications integrity with partner
2322    12      DISCONNECT  close a connection
2323
2324
2325    New message types should be defined in one of two ranges, 0-127 or
2326    128-255.  The range of 0-127 is used for messages that MUST be
2327    supported by every server, and if a server receives a message in the
2328    range of 0-127 that it doesn't understand, it MUST close the TCP
2329    connection.  The range of 128-255 is used for messages which MAY be
2330    supported but are not required, and if a server receives a message in
2331    this range that it does not understand it SHOULD ignore the message.
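
The handling of unrecognized message types can be sketched as a simple dispatch check. This is a minimal illustration: the function name and the returned action labels ("process", "close", "ignore") are hypothetical, while the type values are those in the table above.

```python
KNOWN_TYPES = {
    1: "POOLREQ", 2: "POOLRESP", 3: "BNDUPD", 4: "BNDACK",
    5: "CONNECT", 6: "CONNECTACK", 7: "UPDREQALL", 8: "UPDDONE",
    9: "UPDREQ", 10: "STATE", 11: "CONTACT", 12: "DISCONNECT",
}

def action_for_message_type(msg_type):
    """Decide what to do with a received message type (one byte)."""
    if not 0 <= msg_type <= 255:
        raise ValueError("msg type must fit in one byte")
    if msg_type in KNOWN_TYPES:
        return "process"
    if msg_type <= 127:
        return "close"    # mandatory range: MUST close the TCP connection
    return "ignore"       # optional range: SHOULD ignore the message
```

Note that type 0 is reserved and not used, so the sketch treats it like any other message in the mandatory range that is not understood.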
2332
2333    payload offset - 1 byte
2334
2335    The byte offset of the Payload Data, from the beginning of the
2336    failover message header. The value for the current protocol version
2337    (version 1) is 8.
2338
2339    time - 4 bytes, network byte order
2340
2341    The absolute time in GMT when the message was transmitted,
2342    represented as seconds elapsed since Jan 1, 1970 (i.e., similar to
2343    the ANSI C time_t time value representation).  While the ANSI C
2344    time_t value is signed, the value used in this specification is
2345    unsigned.
2346
2347    A server SHOULD set this time as close to the actual transmission of
2348    the message as possible.
2349
2358    xid - 4 bytes, network byte order
2359
2360    This is the transaction id of the failover message. The sender of a
2361    failover protocol message is responsible for setting this number, and
2362    the receiver of the message copies the number over into any response
2363    message, treating it as opaque data. The sender MUST ensure that
2364    every message sent from a particular failover endpoint over the
2365    associated TCP connection has a unique transaction id.
2366
2367    For failover messages that have no corresponding response message,
2368    the XID value is meaningless, but MUST be supplied. The XID value is
2369    used solely by the receiver of a response message to determine the
2370    corresponding request message.
2371
2372    Request messages where the XID is used in the corresponding response
2373    messages are: POOLREQ, BNDUPD, CONNECT, UPDREQALL, and UPDREQ. The
2374    corresponding response messages are POOLRESP, BNDACK, CONNECTACK,
2375    UPDDONE, and UPDDONE, respectively.
2376
2377    As requests/responses don't survive connection reestablishment, XIDs
2378    only need to be unique during a specific connection.
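
One way to manage transaction ids on a single connection is sketched below. The class and method names are hypothetical, and a simple counter is only one possible way to obtain unique values. The sketch does not guard against reuse after the counter wraps, which a real implementation would need to consider for long-lived connections.

```python
# Request types whose XID is echoed in a response, and that response.
RESPONSE_FOR = {
    "POOLREQ": "POOLRESP",
    "BNDUPD": "BNDACK",
    "CONNECT": "CONNECTACK",
    "UPDREQALL": "UPDDONE",
    "UPDREQ": "UPDDONE",
}

class XidTracker:
    """Per-connection XID allocation and request/response matching."""

    def __init__(self):
        self._next_xid = 1
        self._pending = {}   # xid -> request message type

    def new_request(self, msg_type):
        """Allocate an XID; remember it if a response is expected."""
        xid = self._next_xid
        self._next_xid = (self._next_xid + 1) % 2**32
        if msg_type in RESPONSE_FOR:
            self._pending[xid] = msg_type
        return xid

    def match_response(self, xid, msg_type):
        """Return the matching request type, or None if there is none."""
        request = self._pending.get(xid)
        if request is None or RESPONSE_FOR[request] != msg_type:
            return None
        del self._pending[xid]
        return request
```

A new tracker would be created for each TCP connection, since requests and responses do not survive connection reestablishment.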
2379
2380
2381    payload data - variable length
2382
2383    The options are placed after the header, after skipping payload
2384    offset bytes from beginning of the message.  The payload data options
2385    are not preceded by a "cookie" value.
2386
2387    The payload data is formatted as DHCP style options using two byte
2388    option codes and two byte option lengths.  The option codes are in a
2389    namespace which is unique to the failover protocol.
2390
2391    The maximum length of the payload data in octets is 2048 less the
2392    size of the header, i.e., the maximum message length is 2048 octets.
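
The fixed header fields described above can be packed and parsed as in the following sketch. It is an illustration only, and the function names are hypothetical. One assumption deserves attention: the sketch writes the size of the fixed fields shown in the diagram (12 bytes) into the payload offset field and, on receipt, skips the offset carried in the message. The text above gives 8 as the payload offset for version 1, so this choice should be treated as an assumption of the sketch, not as a statement about the protocol.

```python
import struct

# message length (2), msg type (1), payload offset (1), time (4), xid (4),
# all in network byte order.
HEADER = struct.Struct("!HBBII")
MIN_LENGTH = 12
MAX_LENGTH = 2048

def pack_message(msg_type, time, xid, payload=b""):
    """Build a failover message with no additional header bytes."""
    length = HEADER.size + len(payload)
    if length > MAX_LENGTH:
        raise ValueError("message exceeds 2048 bytes")
    # Assumption: payload offset is the size of the fixed header.
    return HEADER.pack(length, msg_type, HEADER.size, time, xid) + payload

def parse_message(data):
    """Split one complete failover message into header fields and payload."""
    if len(data) < MIN_LENGTH:
        raise ValueError("message shorter than the fixed header")
    length, msg_type, offset, time, xid = HEADER.unpack_from(data)
    if not MIN_LENGTH <= length <= MAX_LENGTH or length != len(data):
        raise ValueError("bad message length")
    if not HEADER.size <= offset <= length:
        raise ValueError("bad payload offset")
    return {"msg_type": msg_type, "time": time, "xid": xid,
            "payload": data[offset:length]}
```

Because the message length field includes itself, a receiver reading from the TCP stream can read two bytes first and then read the remaining length minus two bytes before calling a parser like this one.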
2393
2394 6.2.  Common option format
2395
2396    The options contained in the payload data section of the failover
2397    message all use a two byte option number and two byte length format.
2398
2399    The option numbers are drawn from an option number space unique to
2400    the failover protocol.  All of the message types share a common
2401    option number space and common options definitions, though not all
2402    options are required or meaningful for every message.
2403
2404    In contrast to the options which appear in DHCP client and server
2405    messages, the options in failover messages are ordered.  That is, for
2414    some messages the order in which the options appear in the payload
2415    data area is significant.  The messages for which option ordering is
2416    significant explicitly describe the ordering requirements.  If no
2417    ordering requirements are mentioned, then the order is not signifi-
2418    cant for that message.
2419
2420    All options which refer to time use an absolute time in
2421    GMT.  Time synchronization has already been achieved between the
2422    source and the target server using the CONNECT message and is updated
2423    and refined using the time in every packet.
2424
2425    The time value is an unsigned 32 bit integer in network byte order
2426    giving the number of seconds since 00:00 UTC, 1st January 1970. This
2427    can be converted to an NTP timestamp by adding decimal 2208988800.
2428    This time format will not wrap until the year 2106.  Until sometime
2429    in 2038, it is equal to the ANSI C time_t value (which is a signed 32
2430    bit value and will overflow into a negative number in 2038).
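
The time representation and the conversions mentioned above can be checked with a short sketch. The helper names are hypothetical. The calendar conversion uses an explicit epoch and a timedelta so that values beyond 2038 are handled the same way on all platforms.

```python
import struct
from datetime import datetime, timedelta, timezone

NTP_OFFSET = 2208988800   # seconds from 1 Jan 1900 to 1 Jan 1970
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def encode_time(seconds):
    """Encode as an unsigned 32 bit integer in network byte order."""
    return struct.pack("!I", seconds)

def decode_time(data):
    """Decode four bytes back into seconds since 00:00 UTC, 1 Jan 1970."""
    return struct.unpack("!I", data)[0]

def to_ntp_seconds(seconds):
    """Seconds part of the corresponding NTP timestamp."""
    return seconds + NTP_OFFSET

def to_datetime(seconds):
    """Calendar time in UTC for a failover time value."""
    return EPOCH + timedelta(seconds=seconds)

last_signed = to_datetime(2**31 - 1)     # last value a signed time_t holds
last_unsigned = to_datetime(2**32 - 1)   # last value before this format wraps
```

Here last_signed falls on 19 January 2038 and last_unsigned falls in February 2106, consistent with the years given above.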
2431
2432    Options should appear only once in each message (except for BNDUPD
2433    and BNDACK messages where batching is used; see section 6.3 for
2434    details).  An option that appears twice is not concatenated, but
2435    treated as an error.
2436
2437    Specific option values are described in section 12.
2438
2439    See section 13 for how to define additional options.
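
A parser for the common option format might look like the following sketch. It assumes, by analogy with DHCP options, that the two byte length counts only the option data and not the four bytes of option code and length. The function name and the allow_repeats parameter are hypothetical. The latter exists for BNDUPD and BNDACK messages that batch several transactions.

```python
import struct

def parse_options(payload, allow_repeats=False):
    """Parse payload data into an ordered list of (code, data) pairs."""
    options = []
    seen = set()
    pos = 0
    while pos < len(payload):
        if pos + 4 > len(payload):
            raise ValueError("truncated option header")
        # Two byte option code, two byte length, network byte order.
        code, length = struct.unpack_from("!HH", payload, pos)
        pos += 4
        if pos + length > len(payload):
            raise ValueError("truncated option data")
        if code in seen and not allow_repeats:
            # A repeated option is an error, not a concatenation.
            raise ValueError("option %d appears more than once" % code)
        seen.add(code)
        options.append((code, payload[pos:pos + length]))
        pos += length
    return options
```

The result is a list, not a dictionary, because the order of options is significant for some messages.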
2440
2441 6.3.  Batching multiple binding update transactions in one BNDUPD
2442       message
2443
2444    Implementations of this protocol MAY send multiple binding update
2445    transactions in one BNDUPD message, where a binding update transac-
2446    tion is defined as the set of options which are associated with the
2447    update of a single IP address.  All implementations of this protocol
2448    MUST be prepared to receive BNDUPD messages which contain multiple
2449    binding update transactions and respond correctly to them, including
2450    replying with a BNDACK message which contains status for the multiple
2451    binding update transactions contained in the BNDUPD message.
2452
2453    In the discussion of sending and receiving BNDUPD messages in section
2454    7.1 and BNDACK messages in section 7.2, each BNDUPD message and
2455    BNDACK message is assumed to contain a single binding update transac-
2456    tion in order to reduce the complexity of the discussions in section
2457    7.
2458
2459    Multiple binding update transactions MAY be batched together in one
2460    BNDUPD protocol message with the data sets for the individual tran-
2461    sactions delimited by the assigned-IP-address option, which MUST
2470    appear first in the option set for each transaction.  Ordering of
2471    options between the assigned-IP-address options is not significant.
2472    This is illustrated in the following schematic representation:
2473
2474
2475        Non-IP Address/Non-client specific options first
2476        assigned-IP-address option for the first IP address
2477            Options pertaining to first address, including
2478            at least the binding-status option and others as
2479            required.
2480        assigned-IP-address option for the second IP address
2481            Options pertaining to second address, including
2482            at least the binding-status option and others as
2483            required.
2484        ...
2485        Trailing options (message digest).
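
The schematic above suggests a straightforward way to split a batched BNDUPD into its transactions. The sketch below works on options that have already been decoded into (name, value) pairs. The option names used as strings, the function name, and the way trailing options are recognized (by a caller-supplied set of names) are assumptions made for illustration.

```python
def split_transactions(options, trailing=("message-digest",)):
    """Group decoded options into leading options, per-address
    transactions, and trailing options.

    Each transaction is a list that starts with its assigned-IP-address
    option, followed by the options pertaining to that address.
    """
    leading, transactions, trailer = [], [], []
    for name, value in options:
        if name in trailing:
            trailer.append((name, value))
        elif name == "assigned-IP-address":
            transactions.append([(name, value)])   # starts a transaction
        elif transactions:
            transactions[-1].append((name, value))
        else:
            leading.append((name, value))
    return leading, transactions, trailer

def missing_binding_status(transactions):
    """Addresses whose transaction lacks the binding-status option."""
    return [txn[0][1] for txn in transactions
            if not any(name == "binding-status" for name, _ in txn)]
```

A receiver could use missing_binding_status to detect transactions that do not carry the required binding-status option before processing the update.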
2486
2487
2488    There MUST be a one-to-one correspondence between BNDUPD and BNDACK
2489    messages, and every BNDACK message MUST contain status for all of the
2490    binding update transactions in the corresponding BNDUPD message.
2491
2492    The BNDACK message corresponding to a BNDUPD message MUST contain
2493    assigned-IP-address options for all of the binding update transac-
2494    tions in the BNDUPD message.  Thus, every BNDACK message contains
2495    exactly the same assigned-IP-address options as does its correspond-
2496    ing BNDUPD message.  The order of the assigned-IP-address options
2497    MAY, however, be different.  Here is a schematic representation of a
2498    BNDACK:
2499
2500
2501        Non-IP Address/Non-client specific options first
2502        assigned-IP-address option for the first IP address
2503            If rejected, reject-reason option and message option.
2504        assigned-IP-address option for the second IP address
2505            If rejected, reject-reason option and message option.
2506        ...
2507        Trailing options (message digest).
2508
2509
2510    In case the server chooses to reject some or all of the IP address
2511    binding information in a BNDUPD message in a BNDACK reply, the BNDACK
2512    message MUST contain a reject-reason option following every
2513    assigned-IP-address option in order to indicate that the binding
2514    update transaction for that IP address was not accepted and why.  As
2515    with a BNDACK message containing a single binding update transaction,
2516    an assigned-IP-address option without any associated reject-reason
2517    option indicates a successful binding update transaction.
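
Building and reading the per-address status in a batched BNDACK can be sketched as follows. As before, options are shown as decoded (name, value) pairs, and the function names, option name strings and example reject reason are hypothetical. Actual option encodings are defined in section 12.

```python
def build_bndack_options(results):
    """results: list of (ip, reject_reason, message) tuples, where
    reject_reason is None for an accepted binding update transaction."""
    options = []
    for ip, reject_reason, message in results:
        options.append(("assigned-IP-address", ip))
        if reject_reason is not None:
            # A rejection follows its assigned-IP-address option.
            options.append(("reject-reason", reject_reason))
            if message:
                options.append(("message", message))
    return options

def rejected_addresses(options):
    """Map each rejected IP address to its reject reason."""
    rejected = {}
    current_ip = None
    for name, value in options:
        if name == "assigned-IP-address":
            current_ip = value
        elif name == "reject-reason" and current_ip is not None:
            rejected[current_ip] = value
    return rejected
```

An address that appears in the BNDACK without an entry in the result of rejected_addresses corresponds to a successful binding update transaction.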
2518
2526 7.  Protocol Messages
2527
2528    This section contains the detailed definition of the protocol mes-
2529    sages, including the information to include when sending the message,
2530    as well as the actions to take upon receiving the message.  The mes-
2531    sage type for each message appears as [n] in the heading for the mes-
2532    sage (see section 6.1).
2533
2534 7.1.  BNDUPD message [3]
2535
2536    The binding update (BNDUPD) message is used to send the binding data-
2537    base changes (known as binding update transactions) to the partner
2538    server, and the partner server responds with a binding acknowledge-
2539    ment (BNDACK) message when it has successfully committed those
2540    changes to its own stable storage.
2541
2542    The rest of the failover protocol exists to determine whether the
2543    partner server is able to communicate or not, and to enable the
2544    partners to exchange BNDUPD/BNDACK messages in order to keep their
2545    binding databases in stable storage synchronized.
2546
2547    The rest of this section is written as though every BNDUPD message
2548    contains only a single binding update transaction in order to reduce
2549    the complexity of the discussion.  See section 6.3 for information on
2550    how to create and process BNDUPD and BNDACK messages which contain
2551    multiple binding update transactions.  Note that while a server MAY
2552    generate BNDUPD messages with multiple binding update transactions,
2553    every server MUST be able to process a BNDUPD message which contains
2554    multiple binding update transactions and generate the corresponding
2555    BNDACK messages with status for multiple binding update transactions.
2556
2557    The following table summarizes the various options for the BNDUPD
2558    message.
2559
2584                                         binding-status            BACKUP
2585                                                                   RESET
2586                                                                   ABANDONED
2587    Option                        ACTIVE     EXPIRED    RELEASED   FREE
2588    ------                        ------     -------    --------   ----
2589    assigned-IP-address (3)       MUST       MUST       MUST       MUST
2590    binding-status                MUST       MUST       MUST       MUST
2591    client-identifier             MAY        MAY        MAY        MAY(2)
2592    client-hardware-address       MUST       MUST       MUST       MAY(2)
2593    lease-expiration-time         MUST       MUST NOT   MUST NOT   MUST NOT
2594    potential-expiration-time     MUST       MUST NOT   MUST NOT   MUST NOT
2595    start-time-of-state           SHOULD     SHOULD     SHOULD     SHOULD
2596    client-last-trans.-time       MUST       SHOULD     MUST       MAY
2597    DDNS(1)                       SHOULD     SHOULD     SHOULD     SHOULD
2598    client-request-options        SHOULD     SHOULD NOT SHOULD     SHOULD NOT
2599    client-reply-options          SHOULD     SHOULD NOT SHOULD NOT SHOULD NOT
2600
2601    (1) MUST if server is performing dynamic DNS for this IP address, else
2602        MUST NOT.
2603    (2) MUST NOT if binding-status is ABANDONED.
2604    (3) assigned-IP-address MUST be the first option for an IP address
2605
2606              Table 7.1-1: Options used in a BNDUPD message
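
   The MUST and MUST NOT entries of Table 7.1-1 lend themselves to a
   mechanical check.  The Python sketch below is one possible encoding,
   not a normative one: the names check_bndupd, _TABLE and _COLUMN are
   invented for this example, SHOULD and MAY entries are not enforced,
   and the DDNS rule of note (1) is left out because it depends on
   server configuration.

```python
from typing import List, Set

MUST, MUST_NOT, OTHER = "MUST", "MUST NOT", "other"

# Column of Table 7.1-1 used for each binding-status value.
_COLUMN = {"ACTIVE": 0, "EXPIRED": 1, "RELEASED": 2,
           "FREE": 3, "BACKUP": 3, "RESET": 3, "ABANDONED": 3}

# Only MUST / MUST NOT cells are encoded; SHOULD and MAY become OTHER.
_TABLE = {
    "assigned-IP-address":          (MUST, MUST, MUST, MUST),
    "binding-status":               (MUST, MUST, MUST, MUST),
    "client-hardware-address":      (MUST, MUST, MUST, OTHER),
    "lease-expiration-time":        (MUST, MUST_NOT, MUST_NOT, MUST_NOT),
    "potential-expiration-time":    (MUST, MUST_NOT, MUST_NOT, MUST_NOT),
    "client-last-transaction-time": (MUST, OTHER, MUST, OTHER),
}


def check_bndupd(status: str, present: Set[str]) -> List[str]:
    """List the MUST / MUST NOT violations for one transaction."""
    column = _COLUMN[status]  # KeyError for an unknown binding-status
    problems = []
    for option, levels in _TABLE.items():
        level = levels[column]
        if level == MUST and option not in present:
            problems.append(f"missing {option}")
        elif level == MUST_NOT and option in present:
            problems.append(f"unexpected {option}")
    if status == "ABANDONED":
        # Note (2): no client information with ABANDONED.
        for option in ("client-identifier", "client-hardware-address"):
            if option in present:
                problems.append(f"unexpected {option}")
    return problems
```

   An empty result means that no MUST or MUST NOT entry of the table was
   violated; it does not mean that the transaction will be accepted.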
2607
2608
2609 7.1.1.  Sending the BNDUPD message
2610
2611    A BNDUPD message SHOULD be generated whenever any binding changes.  A
2612    change might be in the binding-status, the lease-expiration-time, or
2613    even just the last-transaction-time.  In general, any time a DHCP
2614    server writes its stable storage, a BNDUPD message SHOULD be gen-
2615    erated.  This will often be the result of the processing of a DHCP
2616    client request, but it might also be the result of a successful
2617    dynamic DNS update operation.
2618
2619    BNDUPD (and BNDACK) messages refer to the binding-status of the IP
2620    address, and this protocol defines a series of binding-statuses, dis-
2621    cussed in more detail below.  Some servers may not support all of
2622    these binding-statuses, and so in those cases they will not be sent.
2623    Upon receipt of a BNDUPD message which contains an unsupported
2624    binding-status, a reasonable interpretation should be made (see sec-
2625    tion 5.10).
2626
2627    All BNDUPD messages MUST contain the IP address of the binding update
2628    transaction in the assigned-IP-address option.
2629
2638    All binding update transactions contain a binding-status option, and
2639    it will have one of the values found in section 5.10.  Client infor-
2640    mation consists of client-hardware-address and possibly a client-
2641    identifier, and is explained in more detail later in this section.
2642    The following table indicates whether client information should or
2643    should not appear with each binding-status in a binding update tran-
2644    saction:
2645
2646
2647        binding-status       includes client information
2648        ------------------------------------------------
2649        ACTIVE                      MUST
2650        EXPIRED                     SHOULD
2651        RELEASED                    SHOULD
2652        FREE                        MAY
2653        ABANDONED                   MUST NOT
2654        RESET                       MAY
2655        BACKUP                      MAY
2656
2657          Table 7.1.1-1: Client information required by various
2658          binding-status values.
2659
2660
2661    The ACTIVE binding-status requires some options to indicate the
2662    length of the binding:
2663
2664
2665       o lease-expiration-time
2666
2667         The lease-expiration-time option MUST appear, and be set to the
2668         expiration time most recently ACKed to the DHCP client.  Note
2669         that the time ACKed to a DHCP client is a lease duration in
2670         seconds, while the lease-expiration-time option in a BNDUPD mes-
2671         sage is an absolute time value.
2672
2673       o potential-expiration-time
2674
2675         The potential-expiration-time option MUST appear, and be set to
2676         a value beyond that of the lease-expiration time.  This is the
2677         value that is ACKed by the BNDACK message.  A server sending a
2678         BNDUPD message MUST be able to recover the potential-
2679         expiration-time sent in every BNDUPD, not just those that
2680         receive a corresponding BNDACK, in order to be able to protect
2681         against possible duplicate allocation of IP addresses after
2682         transitioning to PARTNER-DOWN state. See section 5.2.1 for
2683         details as to why the potential-expiration-time exists and
2684         guidelines for how to decide on the value.
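
   The relationship between the two ACTIVE time values can be sketched
   as follows.  This Python fragment assumes that times are held as
   integer seconds on a common clock and that the margin added for the
   potential-expiration-time (extra_seconds) has already been chosen;
   how to choose it is the subject of section 5.2.1 and is not modeled
   here.  The function name is hypothetical.

```python
from typing import Tuple


def active_times(now: int, lease_seconds: int,
                 extra_seconds: int) -> Tuple[int, int]:
    """Return (lease_expiration_time, potential_expiration_time).

    now           -- current absolute time, in seconds
    lease_seconds -- lease duration most recently ACKed to the client
    extra_seconds -- assumed margin; must be positive so that the
                     potential expiration lies beyond the lease
    """
    if extra_seconds <= 0:
        raise ValueError("potential expiration must lie beyond the lease")
    # The client was given a duration; the BNDUPD carries an absolute time.
    lease_expiration = now + lease_seconds
    potential_expiration = lease_expiration + extra_seconds
    return lease_expiration, potential_expiration
```

   The sending server would also have to record the returned
   potential-expiration-time before transmitting the BNDUPD, since it
   must be recoverable whether or not a BNDACK ever arrives.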
2685
2694    The following option information applies to all BNDUPD messages,
2695    regardless of the value of the binding-status, unless otherwise
2696    noted.
2697
2698    o Identifying the client
2699
2700      For many of the binding-status values a client MUST appear while
2701      for others a client MAY appear, and for some a client MUST NOT
2702      appear.
2703
2704      A client is identified in a BNDUPD message by at least one and pos-
2705      sibly two options.   The client-hardware-address option MUST appear
2706      any time that a client appears in a BNDUPD message, and contains
2707      the hardware type and chaddr information from the DHCP request
2708      packet.  A failover client-identifier option MUST appear any time
2709      that a client appears in a BNDUPD message if and only if that
2710      client used a DHCP client-identifier option when communicating with
2711      the DHCP server.  See sections 12.5 and 12.4 for details of how to
2712      construct these two options from a DHCP request packet.
2713
2714    o start-time-of-state
2715
2716      The start-time-of-state SHOULD appear.  It is set to the time at
2717      which this IP address first took on the state that corresponds to
2718      the current value of binding-status.
2719
2720    o client-last-transaction-time
2721
2722      The client-last-transaction-time option SHOULD appear.  This is the
2723      time at which this DHCP server last received a packet from the DHCP
2724      client referenced by the client-identifier or
2725      client-hardware-address that was associated with the IP address
2726      referenced by the assigned-IP-address option.
2727
2728    o DDNS
2729
2730      If the DHCP server is performing dynamic DNS operations on behalf
2731      of the DHCP client represented by the client-identifier or client-
2732      hardware-address, then it should include a DDNS option containing
2733      the domain name and status of any dynamic DNS operations enabled.
2734
2735    o client-request-options
2736
2737      If the BNDUPD was triggered by a request from a DHCP client (typi-
2738      cally those with binding-status of ACTIVE and RELEASED), then the
2739      server SHOULD include options of interest to a failover partner
2740      from the client's request packet in the client-request-options for
2741      transmission to its partner (see section 12.8).
2742
2750      A server sending a BNDUPD SHOULD remember the "interesting" options
2751      or the information that would appear in an "interesting" option for
2752      transmission at a time when the BNDUPD is not closely associated
2753      with a DHCP client request.
2754
2755      A server SHOULD send the following "interesting" options.  It MAY
2756      send any DHCP client options.  As new options are defined, the RFC
2757      defining these options SHOULD include information that they are
2758      "interesting to failover servers" if they should be sent as part of
2759      a BNDUPD.
2760
2761
2762          option          option
2763          number          name
2764          -----------------------------------------
2765
2766          12              host-name
2767          81              client-FQDN [DDNS]
2768          82              relay-agent-information [AGENTINFO]
2769          TBD             user-class [USERCLASS]
2770          60              vendor-class-identifier
2771
2772            Table 7.1.1-2: Options which SHOULD be sent in
2773            the client-request-options option in a BNDUPD message.
2774
2775
2776    o client-reply-options
2777
2778      If the BNDUPD was triggered by a request from a DHCP client (typi-
2779      cally those with binding-status of ACTIVE and RELEASED), then the
2780      server SHOULD include options of interest to a failover partner
2781      from the server's DHCP reply packet in the client-reply-options for
2782      transmission to its partner (see section 12.7).
2783
2784      A server sending a BNDUPD SHOULD remember the "interesting" options
2785      or the information that would appear in an "interesting" option for
2786      transmission at a time when the BNDUPD is not closely associated
2787      with a DHCP client request.
2788
2789      A server SHOULD send the following "interesting" options.  It MAY
2790      send any DHCP client options.  As new options are defined, the RFC
2791      defining these options SHOULD include information that they are
2792      "interesting to failover servers" if they should be sent as part of
2793      a BNDUPD.
2794
2795
2808          option          option
2809          number          name
2810          -----------------------------------------
2811
2812          58              renewal-time
2813          59              rebinding-time
2814
2815            Table 7.1.1-3: Options which SHOULD be sent in
2816            the client-reply-options option in a BNDUPD message.
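
   Selecting the "interesting" options listed in Tables 7.1.1-2 and
   7.1.1-3 amounts to a filter on option numbers.  The sketch below
   assumes that a DHCP packet's options are available as a mapping from
   option number to raw value; that representation and the names used
   are illustrative only.  The user-class option is omitted because its
   option number is still listed as TBD.

```python
from typing import Dict

# Option numbers taken from Table 7.1.1-2 (user-class is TBD, so absent).
INTERESTING_REQUEST_OPTIONS = {12, 60, 81, 82}
# Option numbers taken from Table 7.1.1-3.
INTERESTING_REPLY_OPTIONS = {58, 59}


def select_options(packet_options: Dict[int, bytes],
                   interesting: set) -> Dict[int, bytes]:
    """Keep only the options whose number is in the interesting set."""
    return {code: value
            for code, value in packet_options.items()
            if code in interesting}
```

   A server MAY send other DHCP client options as well, so a real
   implementation might make these sets configurable rather than fixed.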
2817
2818
2819    The BNDUPD message SHOULD be sent as soon as possible after the DHCP
2820    client has received a response and the lease bindings database has
2821    been written to stable storage.
2822
2823 7.1.2.  Receiving the BNDUPD message
2824
2825    When a server receives a BNDUPD message, it needs to decide how to
2826    process the binding update transaction it contains and whether that
2827    transaction represents a conflict of any sort. The conflict resolu-
2828    tion process MUST be used on the receipt of every BNDUPD message, not
2829    just those that are received while in POTENTIAL-CONFLICT state, in
2830    order to increase the robustness of the protocol.
2831
2832    There are three sorts of conflicts:
2833
2834       o Two clients, one IP address conflict
2835
2836         This is the duplicate IP address allocation conflict. There are
2837         two different clients each allocated the same address.  See sec-
2838         tion 7.1.3 for how to resolve this conflict.
2839
2840       o Two IP addresses, one client conflict
2841
2842         This conflict exists when a client on one server is associated
2843         with one IP address, and on the other server with a different
2844         IP address in the same or a related subnet. This does not refer
2845         to the case where a single client has addresses in multiple
2846         different subnets or administrative domains, but rather the
2847         case where on the same subnet the client has a lease on one IP
2848         address on one server and on a different IP address on the other
2849         server.
2850
2851         This conflict may or may not be a problem for a given DHCP
2852         server implementation.  In the event that a DHCP server requires
2853         that a DHCP client have only one outstanding lease for an IP
2862         address on one subnet, this conflict should be resolved by
2863         accepting the update which has the latest client-last-
2864         transaction-time.
2865
2866       o binding-status conflict
2867
2868         This is the normal conflict, where one server is updating the
2869         other with newer information.  See section 7.1.3 for details of
2870         how to resolve these conflicts.
2871
2872 7.1.3.  Deciding whether to accept the binding update transaction in a
2873 BNDUPD message
2874
2875    IP addresses undergo binding status changes for several reasons,
2876    including receipt and processing of DHCP client requests, administra-
2877    tive inputs and receipt of BNDUPD messages.  Every DHCP server needs
2878    to respond to DHCP client requests and administrative inputs with
2879    changes to its internal record of the binding-status of an IP
2880    address, and this response is not in the scope of the failover proto-
2881    col.  However, the receipt of BNDUPD messages implies at least a pos-
2882    sible change of the binding-status for an IP address, and must be
2883    discussed here.  See section 7.1.2 for general actions to take upon
2884    receipt of a BNDUPD message.
2885
2886    When receiving a BNDUPD message, it is important to note that it may
2887    not be current, in that the server receiving the BNDUPD message may
2888    have had a more recent interaction with the DHCP client than its
2889    partner who sent the BNDUPD message.  In this case, the receiving
2890    server MUST reject the BNDUPD message.  In addition, it is worth not-
2891    ing that two (and possibly three) binding-status values are the
2892    direct result of interaction with a DHCP client, ACTIVE and RELEASED
2893    (and possibly ABANDONED).  All other binding-status values are either
2894    the result of the expiration of a time period or interaction with an
2895    external agency (e.g., a network administrator).
2896
2897    Every BNDUPD message SHOULD contain a client-last-transaction-time
2898    option, which MUST, if it appears, be the time that the server last
2899    interacted with the DHCP client.  It MUST NOT be, for instance, the
2900    time that the lease on an IP address expired.  If there has been no
2901    interaction with the DHCP client in question (or there is no DHCP
2902    client presently associated with this IP address), then there will be
2903    no client-last-transaction-time option in the BNDUPD message.
2904
2905    The list in Figure 7.1.3-1 is indexed by the binding-status that a
2906    server receives in a BNDUPD message.  In many cases, the binding-
2907    status of an IP address within the receiving server's data storage
2908    will have an effect upon the checks performed prior to accepting the
2909    new binding-status in a BNDUPD message.
2910
2918    In Figure 7.1.3-1, to "accept" a BNDUPD means to update the server's
2919    bindings database with the information contained in the BNDUPD and
2920    once that update is complete, send a BNDACK message corresponding to
2921    the BNDUPD message.  To "reject" a BNDUPD means to respond to the
2922    BNDUPD with a BNDACK with a reject-reason option included.
2923
2924    When interpreting the rules in the following list, if a BNDUPD
2925    doesn't have a client-last-transaction-time value, then it MUST NOT
2926    be considered later than the client-last-transaction-time in the
2927    receiving server's binding.   If the BNDUPD contains a client-last-
2928    transaction-time value and the receiving server's binding does not,
2929    then the client-last-transaction-time value in the BNDUPD MUST be
2930    considered later than the server's.
2931
2932    A second rule concerns clients and IP addresses.  If the clients in
2933    a BNDUPD message and in the receiving server's binding differ, and
2934    the binding-status is ACTIVE both in the receiving server's binding
2935    and in the BNDUPD, then the receiving server accepts the BNDUPD if
2936    it is a secondary server and rejects it otherwise.
2937
2938
2939                         binding-status in received BNDUPD
2940        binding-status
2941        in receiving                                  FREE     RESET
2942        server          ACTIVE   EXPIRED   RELEASED   BACKUP  ABANDONED
2943
2944        ACTIVE          accept    time(2)   time(1)    time(2)  accept
2945        EXPIRED         time(1)   accept    accept     accept   accept
2946        RELEASED        time(1)   time(1)   accept     accept   accept
2947        FREE/BACKUP     accept    accept    accept     accept   accept
2948        RESET           time(3)   accept    accept     accept   accept
2949        ABANDONED       reject    reject    reject     reject   accept
2950
2951        time(1): If the client-last-transaction-time in the BNDUPD
2952        is later than the client-last-transaction-time in the
2953        receiving server's binding, accept it, else reject it.
2954
2955        time(2): If the current time is later than the receiving
2956        server's lease-expiration-time, accept it, else reject it.
2957
2958        time(3): If the client-last-transaction-time in the BNDUPD
2959        is later than the start-time-of-state in the receiving server's
2960        binding, accept it, else reject it.
2961
2962
2963                 Figure 7.1.3-1:  Accepting BNDUPD messages
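
   Figure 7.1.3-1 can be read as a lookup table followed by one of three
   time comparisons.  The Python sketch below is one way to express it
   and is not a reference implementation.  All names are invented for
   this example, times are assumed to be integer seconds on a common
   clock, and None stands for a value that is absent.  Two points are
   assumptions beyond the text: time(2) accepts when the receiving
   server has no lease-expiration-time recorded, and time(3) applies the
   same "missing value" rules as time(1).  The rule for differing
   clients with ACTIVE on both sides is not modeled.

```python
ACCEPT, REJECT, T1, T2, T3 = "accept", "reject", "time1", "time2", "time3"

# Column index for the binding-status carried in the received BNDUPD.
_COLUMN = {"ACTIVE": 0, "EXPIRED": 1, "RELEASED": 2,
           "FREE": 3, "BACKUP": 3, "RESET": 4, "ABANDONED": 4}

# Rows are keyed by the receiving server's own binding-status.
_ALL_ACCEPT = (ACCEPT, ACCEPT, ACCEPT, ACCEPT, ACCEPT)
_RULES = {
    "ACTIVE":    (ACCEPT, T2, T1, T2, ACCEPT),
    "EXPIRED":   (T1, ACCEPT, ACCEPT, ACCEPT, ACCEPT),
    "RELEASED":  (T1, T1, ACCEPT, ACCEPT, ACCEPT),
    "FREE":      _ALL_ACCEPT,
    "BACKUP":    _ALL_ACCEPT,
    "RESET":     (T3, ACCEPT, ACCEPT, ACCEPT, ACCEPT),
    "ABANDONED": (REJECT, REJECT, REJECT, REJECT, ACCEPT),
}


def _later(update_time, local_time):
    """True if update_time counts as later than local_time."""
    if update_time is None:
        return False  # a missing value is never considered later
    if local_time is None:
        return True   # a present value is later than a missing one
    return update_time > local_time


def accept_bndupd(local_status, update_status, *, now,
                  update_cltt=None, local_cltt=None,
                  local_lease_expiration=None, local_start_time=None):
    """Return True to accept the transaction, False to reject it."""
    rule = _RULES[local_status][_COLUMN[update_status]]
    if rule == ACCEPT:
        return True
    if rule == REJECT:
        return False
    if rule == T1:
        return _later(update_cltt, local_cltt)
    if rule == T2:
        # Assumption: no recorded expiration is treated as expired.
        return local_lease_expiration is None or now > local_lease_expiration
    # time(3): compare against the local start-time-of-state.
    return _later(update_cltt, local_start_time)
```

   Keeping the table as data rather than as nested conditionals makes it
   easier to compare the code against the figure cell by cell.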
2964
2974 7.1.4.  Accepting the BNDUPD message
2975
2976    When accepting a BNDUPD message, the information contained in the
2977    client-request-options and client-reply-options SHOULD be examined
2978    for any information of interest to this server.  For instance, a
2979    server which wished to detect changes in client specified host names
2980    might want to examine and save information from the host-name or
2981    client-FQDN options.  Servers which expect to utilize information
2982    from the relay-agent-information option would want to store this
2983    information.
2984
2985 7.1.5.  Time values related to the BNDUPD message
2986
2987    There are four time values that MAY be sent in a BNDUPD message.
2988
2989       o lease-expiration-time
2990
2991         The time that the server gave to the client, i.e., the time that
2992         the server believes that the client's lease will expire.
2993
2994       o potential-expiration-time
2995
2996         The time that the server wants to be sure its partner waits
2997         (added to the MCLT) before assuming that this lease has expired.
2998         Typically some time beyond the desired client lease time.
2999
3000       o client-last-transaction-time
3001
3002         The time that the client last interacted with this server.
3003
3004       o start-time-of-state
3005
3006         The time at which the binding first went into the current state.
3007
3008    As discussed in section 5.2, each server knows what its partner has
3009    ACKed with regard to potential-expiration time.  In addition, each
3010    server needs to remember what it has told its partner as the
3011    potential-expiration-time.  Moreover, each server must remember what
3012    it has acked to the *other* server as the most recent potential-
3013    expiration-time from that server.
3014
3015    Remember that each server sends a potential-expiration-time and
3016    receives an ACK for that as well as receiving a potential-
3017    expiration-time and needing to remember what it has acked for that.
3018
3019    While they don't have to be named in any particular way, the times
3020    that a server needs to remember for every IP address in order to
3021    implement the failover protocol are:
3022
3030       o lease-expiration-time
3031
3032         The time that a server gave to the DHCP client.  A DHCP server
3033         needs to remember this time already, just to be a DHCP server.
3034         A server SHOULD update this time with the lease-expiration time
3035         received from a partner in a BNDUPD if the received lease-
3036         expiration time is later than the lease-expiration time recorded
3037         for this binding.
3038
3039       o sent-potential-expiration-time
3040
3041         The latest time sent to the partner for a potential-expiration-
3042         time.
3043
3044       o acked-potential-expiration-time
3045
3046         The latest time that the partner has acked for a potential
3047         expiration time.  Typically the same as sent-potential-
3048         expiration-time if there is not a BNDUPD outstanding.
3049
3050       o received-potential-expiration-time
3051
3052         The latest time that this server has ever received as a
3053         potential-expiration-time from its partner in a BNDUPD that this
3054         server ACKed.
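
   The four remembered times could be kept in a small per-address
   record, as in the following Python sketch.  The class and method
   names are hypothetical.  Two interpretations are assumed here: a
   value of 0 means "never set", and "latest" is taken to mean the
   largest value seen so far.

```python
from dataclasses import dataclass


@dataclass
class BindingTimes:
    """Times remembered for one IP address, in seconds (0 = never set)."""
    lease_expiration: int = 0
    sent_potential_expiration: int = 0
    acked_potential_expiration: int = 0
    received_potential_expiration: int = 0

    def on_bndupd_sent(self, potential_expiration: int) -> None:
        # Remember what was told to the partner, acked or not.
        self.sent_potential_expiration = max(
            self.sent_potential_expiration, potential_expiration)

    def on_bndack_received(self, potential_expiration: int) -> None:
        # The partner has acknowledged this potential expiration.
        self.acked_potential_expiration = max(
            self.acked_potential_expiration, potential_expiration)

    def on_bndupd_acked_to_partner(self, lease_expiration: int,
                                   potential_expiration: int) -> None:
        # Called when this server accepts and ACKs a partner's BNDUPD.
        self.received_potential_expiration = max(
            self.received_potential_expiration, potential_expiration)
        # Adopt the partner's lease expiration only if it is later.
        self.lease_expiration = max(self.lease_expiration, lease_expiration)
```

   Such a record would need to be written to stable storage along with
   the binding itself for the times to survive a restart.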
3055
3056    So, a server has to remember two additional times concerning BNDUPD
3057    messages that it has initiated, and one additional time concerning
3058    BNDUPD messages that it has received.  How are these times used?
3059
3060    First, let's look at the time that a DHCP server can offer to a DHCP
3061    client.  A server can offer to a DHCP client a time that is no longer
3062    than the MCLT beyond the max( received-potential-expiration-time,
3063    acked-potential-expiration-time).  One might think that the server
3064    should be able to offer only the MCLT beyond the acked-potential-
3065    expiration-time, and while that is certainly simple and easy to
3066    understand, it has negative consequences in actual operation.
3067
3068    To illustrate this, in the simple case where the primary updates the
3069    secondary for a while and then fails, if the secondary can then renew
3070    the client for only the MCLT beyond the acked-potential-expiration-
3071    time, then the secondary will only be able to renew the client for
3072    the MCLT, because the secondary has never sent a BNDUPD packet to the
3073    primary concerning this IP address and client, and so its acked-
3074    potential-expiration-time is zero.
3075
3076    However, since the secondary is allowed to renew the client with the
3077    MCLT beyond the max( received-potential-expiration-time, acked-
3086    potential-expiration-time), then the secondary can usually renew the
3087    client for the full lease period, at least for the first renew it
3088    sees from the client, since the received-potential-expiration-time is
3089    generally longer than the client's desired lease interval.  The
3090    difference in renew times could make a big difference in server load
3091    on the secondary in this case.
3092
3093    What are the consequences of allowing a server to offer a DHCP client
3094    a lease term of the MCLT beyond the max( received-potential-
3095    expiration-time, acked-potential-expiration-time)?  The consequences
3096    appear whenever a server enters PARTNER-DOWN state, and affect how
3097    long that server has to wait before reallocating expired leases.
3098    With this approach, when a server goes into PARTNER-DOWN state, it
3099    must wait the MCLT beyond the max( lease-expiration-time, sent-
3100    potential-expiration-time, acked-potential-expiration-time,
3101    received-potential-expiration-time ) for each IP address before it
3102    can reallocate that IP address to another DHCP client.   One might
3103    normally think that it needed to wait only the MCLT beyond the max(
3104    lease-expiration-time, received-potential-expiration-time ), i.e.,
3105    beyond what it has told the client and what it has explicitly acked
3106    to the other server.  But with the optimization discussed above,
3107    where either server can offer the DHCP client a lease term of the
3108    MCLT beyond the max( received-potential-expiration-time,
3109    acked-potential-expiration-time ), the additional times
3110    sent-potential-expiration-time and acked-potential-expiration-time
3111    must be added into the expression, since the partner could have used
3112    those times as part of its own lease time calculation.
3113
3114    Thus this optimization may require a longer waiting time when enter-
3115    ing PARTNER-DOWN state, but will generally allow servers to operate
3116    considerably more effectively when running in COMMUNICATIONS-
3117    INTERRUPTED state.
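
   The two expressions discussed in this section can be written down
   directly.  In the Python sketch below the function names are
   invented, times are assumed to be integer seconds on a common clock,
   and a value of 0 means "never set".  One further assumption is made:
   when both potential-expiration times lie in the past, the limit is
   measured from the current time, which matches the example in which a
   secondary with no acked value can renew only for the MCLT.

```python
def latest_lease_expiration(now: int, mclt: int,
                            received_pet: int, acked_pet: int) -> int:
    """Latest absolute expiration a server may offer to a DHCP client."""
    # MCLT beyond max(received, acked), but never measured from the past.
    return max(now, received_pet, acked_pet) + mclt


def partner_down_reallocation_time(mclt: int, lease_expiration: int,
                                   sent_pet: int, acked_pet: int,
                                   received_pet: int) -> int:
    """Earliest time an address may be given to another client."""
    # All four remembered times enter the expression, because the
    # partner may have based its own lease calculation on any of them.
    return max(lease_expiration, sent_pet,
               acked_pet, received_pet) + mclt
```

   With an MCLT of 3600 seconds and a received-potential-expiration-time
   well in the future, the first function permits a correspondingly
   longer lease, which is the benefit the optimization is meant to
   provide.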
3118
3119 7.2.  BNDACK message [4]
3120
3121    A server sends a binding acknowledgement (BNDACK) message when it has
3122    processed a BNDUPD message and after it has successfully committed to
3123    stable storage any binding database changes made as a result of
3124    processing the BNDUPD message.  A BNDACK message is used to accept or
3125    to reject a BNDUPD message.  A BNDACK message which contains a
3126    reject-reason option is a rejection of the corresponding BNDUPD mes-
3127    sage.
3128
3129    In order to reduce the complexity of the discussion, the rest of this
3130    section is written as though every BNDUPD message contains only a
3131    single binding update transaction and thus every corresponding BNDACK
3132    message would also contain reply information about only a single
3133    binding update transaction.  See section 6.3 for information on how
3142    to create and process BNDUPD and BNDACK messages which contain multi-
3143    ple binding update transactions.
3144
3145    Note that while a server MAY generate BNDUPD messages with multiple
3146    binding update transactions, every server MUST be able to process a
3147    BNDUPD message which contains multiple binding update transactions
3148    and generate the corresponding BNDACK messages with status for multi-
3149    ple binding update transactions.  If a server does not ever create
3150    BNDUPD messages which contain multiple binding update transactions,
3151    then it does not need to be able to process a received BNDACK message
3152    with multiple binding update transactions.  However, all servers MUST
3153    be able to create BNDACK messages which deal with multiple binding
3154    update transactions received in a BNDUPD message.
3155
3156    Every BNDUPD message that is received by a server MUST be responded
3157    to with a corresponding BNDACK message.  The receiving server SHOULD
3158    respond quickly to every BNDUPD message but it MAY choose to respond
3159    preferentially to DHCP client requests instead of BNDUPD messages,
3160    since there is no absolute time period within which a BNDACK must be
3161    sent in response to a BNDUPD message, while DHCP clients frequently
3162    have strict time constraints.
3163
3164    A BNDACK message can only be sent in response to a BNDUPD message
3165    using the same TCP connection from which the BNDUPD message was
3166    received, since the XIDs in BNDUPD messages are guaranteed unique
3167    only during the life of a single TCP connection.  When a connection
3168    to a partner server goes down, a server with unprocessed BNDUPD mes-
3169    sages MAY simply drop all of those messages, since it can be sure
3170    that the partner will resend them when they are next in communica-
3171    tions (albeit with a different XID), or it MAY instead choose to pro-
3172    cess those BNDUPD messages, but it MUST NOT send any BNDACK messages
3173    in response.
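
   Because an XID is meaningful only within one TCP connection, the
   bookkeeping for unanswered BNDUPD messages is naturally tied to the
   connection object.  The Python sketch below illustrates this under
   stated assumptions: the class, its methods and the dictionary form of
   a message are hypothetical stand-ins, and no real socket handling is
   shown.

```python
class FailoverConnection:
    """Bookkeeping for one TCP connection to the partner server."""

    def __init__(self):
        self.open = True
        self.unanswered = {}  # xid -> received BNDUPD awaiting a BNDACK

    def bndupd_received(self, xid, bndupd):
        self.unanswered[xid] = bndupd

    def make_bndack(self, xid, reject_reason=None):
        """Build the BNDACK for xid, or None if none may be sent."""
        if not self.open or xid not in self.unanswered:
            return None  # never answer on a different or dead connection
        bndupd = self.unanswered.pop(xid)
        return {"xid": xid,
                "assigned-IP-address": bndupd["assigned-IP-address"],
                "reject-reason": reject_reason}

    def closed(self):
        """Mark the connection down and return the unanswered BNDUPDs."""
        self.open = False
        leftover, self.unanswered = self.unanswered, {}
        return leftover
```

   The caller of closed() may discard the returned messages or process
   them locally; either way, make_bndack() will no longer produce a
   reply for them.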
3174
3175    The following table summarizes the options for the BNDACK message.
3176
3200                                         binding-status            BACKUP
3201                                                                   RESET
3202                                                                   ABANDONED
3203    Option                        ACTIVE     EXPIRED    RELEASED   FREE
3204    ------                        ------     -------    --------   ----
3205    assigned-IP-address  (3)      MUST       MUST       MUST       MUST
3206    binding-status                MUST       MUST       MUST       MUST
3207    client-identifier             MAY        MAY        MAY        MAY(2)
3208    client-hardware-address       MUST       MUST       MUST       MAY(2)
3209    reject-reason                 MAY        MAY        MAY        MAY
3210    message                       MAY        MAY        MAY        MAY
3211    lease-expiration-time         MUST       MUST NOT   MUST NOT   MUST NOT
3212    potential-expiration-time     MUST       MUST NOT   MUST NOT   MUST NOT
3213    start-time-of-state           SHOULD     SHOULD     SHOULD     SHOULD
3214    client-last-trans.-time       SHOULD     SHOULD     SHOULD     MAY
3215    DDNS(1)                       SHOULD     SHOULD     SHOULD     SHOULD
3216
3217    (1) MUST if server is performing dynamic DNS for this IP address, else
3218        MUST NOT.
3219    (2) MUST NOT if binding-status is ABANDONED.
3220    (3) assigned-IP-address MUST be the first option for an IP address
3221
3222               Table 7.2-1: Options used in a BNDACK message
3223
3224
3225 7.2.1.  Sending the BNDACK message
3226
3227    The BNDACK message MUST contain the same xid as the corresponding
3228    BNDUPD message.
3229
3230    The assigned-IP-address option from the BNDUPD message MUST be
3231    included in the BNDACK message.  Any additional options from the
3232    BNDUPD message SHOULD NOT appear in the BNDACK message.  Note that
3233    any information sent in options (e.g., a later lease-expiration time)
3234    in the BNDACK message MUST NOT be assumed to necessarily be recorded
3235    in the stable storage of the server that receives the BNDACK message
3236    because there is no corresponding ACK of the BNDACK message.  Any
3237    information that SHOULD be recorded in the partner server's stable
3238    storage MUST be transmitted in a subsequent BNDUPD.
3239
3240    If the server is accepting the BNDUPD, the BNDACK message includes
3241    only the assigned-IP-address option.  If the server is rejecting the
3242    BNDUPD, the additional option reject-reason MUST appear in the BNDACK
3243    message, and the message option SHOULD appear in this case containing
3244    a human-readable error message describing in some detail the reason
3245    for the rejection of the BNDUPD message.
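
   The accept and reject cases described above could be assembled
   roughly as follows.  This is a sketch only: the dict layout, the
   build_bndack name, and the fallback message text are assumptions
   made for illustration, and the reject-reason value used in the
   usage note is a placeholder rather than a value taken from the
   reject-reason option definition.

```python
from typing import Optional


def build_bndack(xid: int, assigned_ip: str,
                 reject_reason: Optional[int] = None,
                 message: Optional[str] = None) -> dict:
    """Return a BNDACK as a plain dict (a hypothetical in-memory form)."""
    # The xid is copied from the BNDUPD being answered.
    options = {"assigned-IP-address": assigned_ip}
    if reject_reason is not None:
        # A rejection MUST carry reject-reason and SHOULD carry message.
        options["reject-reason"] = reject_reason
        options["message"] = message or "BNDUPD rejected"
    return {"type": "BNDACK", "xid": xid, "options": options}
```

   For example, build_bndack(42, "10.0.0.5") yields an accepting
   BNDACK whose only option is assigned-IP-address, while passing
   reject_reason=3 (a placeholder) adds the reject-reason and message
   options.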
3246
3254    If the server rejects the BNDUPD message with a BNDACK and a reject-
3255    reason option, it may be because the server believes that it has
3256    binding information that the other server should know.  A server
3257    which is rejecting a BNDUPD may initiate a BNDUPD of its own in order
3258    to update its partner with what it believes is better binding infor-
3259    mation, but it MUST ensure through some means that it will not end up
3260    in a situation where each server is sending BNDUPD messages as fast
3261    as possible because they can't agree on which server has better bind-
3262    ing data.  Placing a considerable delay on the initiation of a BNDUPD
3263    message after sending a BNDACK with a reject-reason would be one way
3264    to ensure this situation doesn't occur.
3265
3266 7.2.2.  Receiving the BNDACK message
3267
3268    When a server receives a BNDACK message, if it doesn't contain a
3269    reject-reason option that means that the BNDUPD message was accepted,
3270    and the server which sent the BNDUPD SHOULD update its stable storage
3271    with the potential-expiration-time value sent in the BNDUPD message
3272    and returned in the BNDACK message.  Other values sent in the BNDUPD
3273    message MAY be used as desired.
3274
3275    If the BNDACK message contains a reject-reason option, that means
3276    that the BNDUPD was rejected.  There SHOULD be a message option in
3277    the BNDACK giving a text reason for the rejection, and the server
3278    SHOULD log the message in some way.  The server MUST NOT immediately
3279    try to resend the BNDUPD message as there is no reason to believe the
3280    partner won't reject it a second time.  However a server MAY choose
3281    to send another BNDUPD at some future time, for instance when the
3282    server next processes an update request from its partner.
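
   The receive-side handling described in this section might look
   something like the sketch below.  The pending and stable_storage
   dicts, the handle_bndack name, and the returned status strings are
   all hypothetical; a real server would persist to disk rather than
   to an in-memory dict.

```python
import logging

log = logging.getLogger("failover")


def handle_bndack(bndack: dict, pending: dict, stable_storage: dict) -> str:
    """Process a BNDACK against the BNDUPD it answers.

    pending maps xid -> the BNDUPD that was sent (hypothetical structure);
    stable_storage maps IP address -> acknowledged potential expiration.
    Returns "accepted", "rejected", or "unknown".
    """
    bndupd = pending.pop(bndack["xid"], None)
    if bndupd is None:
        return "unknown"
    options = bndack["options"]
    if "reject-reason" in options:
        # Log the rejection; the update is deliberately not re-queued here.
        log.warning("BNDUPD for %s rejected (reason %s): %s",
                    bndupd["ip"], options["reject-reason"],
                    options.get("message", "<no message>"))
        return "rejected"
    # Accepted: record the potential-expiration-time that was sent.
    stable_storage[bndupd["ip"]] = bndupd["potential-expiration-time"]
    return "accepted"
```

   Note that the rejected branch only logs.  Any later retry is left
   to whatever policy the server applies at some future time.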
3283
3284 7.3.  UPDREQ message [9]
3285
3286    The update request (UPDREQ) message is used by one server to request
3287    that its partner send it all of the binding database information that
3288    it has not already seen.   Since each server is required to keep
3289    track at all times of the binding information the other server has
3290    received and ACKed, one server can request transmission of all un-
3291    ACKed binding database information held by the other server by using
3292    the UPDREQ message.
3293
3294    The UPDREQ message is used whenever the sending server cannot proceed
3295    before it has processed all previously un-ACKed binding update infor-
3296    mation, since the UPDREQ message should yield a corresponding UPDDONE
3297    message.  The UPDDONE message is not sent until the server that sent
3298    the UPDREQ message has responded to all of the BNDUPD messages gen-
3299    erated by the UPDREQ message with BNDACK messages (they may either be
3300    accepted or rejected by the BNDACK messages, but they MUST have been
3301    responded to). Thus, the sender of the UPDREQ message can be sure
3310    upon receipt of an UPDDONE message that it has received and committed
3311    to stable storage all outstanding binding database updates.
3312
3313    See section 9, Failover Endpoint States, for the details of when the
3314    UPDREQ message is sent.
3315
3316 7.3.1.  Sending the UPDREQ message
3317
3318    The UPDREQ message has no message specific options.
3319
3320 7.3.2.  Receiving the UPDREQ message
3321
3322    A server receiving an UPDREQ message MUST send all binding database
3323    changes that have not yet been ACKed by the sending server.   These
3324    changes are sent as undistinguished BNDUPD messages.
3325
3326    However, the server which received and is processing the UPDREQ mes-
3327    sage MUST track the BNDACK messages that correspond to the BNDUPD
3328    messages triggered by the UPDREQ message and, when they are all
3329    received, the server MUST send an UPDDONE message.
3330
3331    The server processing the UPDREQ message and sending BNDUPD messages
3332    to its partner SHOULD only track the BNDUPD and BNDACK message pairs
3333    for unACKed binding database changes that were present upon the
3334    receipt of the UPDREQ message.  A server which has received an UPDREQ
3335    message SHOULD send BNDUPD messages for binding database changes that
3336    occur after receipt of the UPDREQ message, but it SHOULD NOT include
3337    those additional BNDUPD messages and their corresponding BNDACK mes-
3338    sages in the accounting necessary to consider the UPDREQ complete and
3339    subsequently send the UPDDONE message.  If some additional binding
3340    database changes end up becoming part of the set of BNDUPD messages
3341    considered as part of the UPDREQ (due to whatever algorithm the
3342    server uses to scan its bindings database for unacked changes) it
3343    will probably not cause any difficulty, but a server MUST NOT attempt
3344    to include all such later BNDUPD messages in the accounting for the
3345    UPDREQ in order to be able to transmit an UPDDONE message.
3346
3347    When queuing up the BNDUPD messages for transmission to the sender of
3348    the UPDREQ message, the server processing the UPDREQ message MUST
3349    honor the value returned in the max-unacked-bndupd option in the CON-
3350    NECT or CONNECTACK message that set up the connection with the send-
3351    ing server.  It MUST NOT send more BNDUPD messages without receiving
3352    corresponding BNDACKs than the value returned in max-unacked-bndupd.
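
   The accounting described in this section can be sketched as a small
   tracker.  The UpdreqTracker class and its method names are
   hypothetical, and the xids are allocated from a counter private to
   the tracker only to keep the example short; an actual server would
   draw them from the xid space of the TCP connection.

```python
from collections import deque


class UpdreqTracker:
    """Tracks the BNDUPDs triggered by one UPDREQ (illustrative only)."""

    def __init__(self, updreq_xid, unacked_changes, max_unacked_bndupd):
        self.updreq_xid = updreq_xid
        self.queue = deque(unacked_changes)   # snapshot taken at receipt
        self.window = max_unacked_bndupd      # from CONNECT or CONNECTACK
        self.in_flight = set()                # xids awaiting a BNDACK
        self.next_xid = 1
        self.done_sent = False

    def pump(self):
        """Return the messages that may be sent right now."""
        out = []
        while self.queue and len(self.in_flight) < self.window:
            change = self.queue.popleft()
            xid = self.next_xid
            self.next_xid += 1
            self.in_flight.add(xid)
            out.append(("BNDUPD", xid, change))
        if not self.queue and not self.in_flight and not self.done_sent:
            # Every tracked BNDUPD has been answered: send UPDDONE once,
            # carrying the xid of the UPDREQ.
            self.done_sent = True
            out.append(("UPDDONE", self.updreq_xid, None))
        return out

    def on_bndack(self, xid):
        """Record a BNDACK (accepting or rejecting) and continue sending."""
        self.in_flight.discard(xid)
        return self.pump()
```

   Changes that arrive after the UPDREQ are simply not added to the
   queue in this sketch, so they are sent as ordinary BNDUPD messages
   elsewhere and never delay the UPDDONE.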
3353
3354 7.4.  UPDREQALL message [7]
3355
3356    The update request all (UPDREQALL) message is used by one server to
3357    request that its partner send it all of the binding database
3366    information.  This message is used to allow one server to recover
3367    from a failure of stable storage and to restore its binding database
3368    in its entirety from the other server.
3369
3370    A server which sends an UPDREQALL message cannot proceed until all of
3371    its binding update information is restored, and it knows that all of
3372    that information is restored when an UPDDONE message is received.
3373
3374    See section 9, Failover Endpoint States, for the details of when
3375    the UPDREQALL message is sent.
3376
3377    The UPDREQALL message has no message specific options.
3378
3379 7.4.1.  Sending the UPDREQALL message
3380
3381    The UPDREQALL message is sent; it carries no message specific options.
3382
3383 7.4.2.  Receiving the UPDREQALL message
3384
3385    A server receiving an UPDREQALL message MUST send all binding data-
3386    base information to the sending server.  These changes are sent as
3387    undistinguished BNDUPD messages. Otherwise the processing is the same
3388    as for the UPDREQ message.  See section 7.3.2 for details.
3389
3390 7.5.  UPDDONE message [8]
3391
3392    The update done (UPDDONE) message is used by a server receiving an
3393    UPDREQ or UPDREQALL message to signify that it has sent all of the
3394    BNDUPD messages requested by the UPDREQ or UPDREQALL request and that
3395    it has received a BNDACK for each of those messages.
3396
3397    While a BNDACK message MUST have been received for each BNDUPD mes-
3398    sage prior to the transmission of the UPDDONE message, this doesn't
3399    necessarily mean that all of the BNDUPD messages were accepted, only
3400    that all of them were responded to with a BNDACK message.  Thus, a
3401    NAK (consisting of a BNDACK message containing a reject-reason option)
3402    could be used to reject a BNDUPD, but for the purposes of the UPDDONE
3403    message, such a NAK would count as a response to the associated BNDUPD
3404    message, and would not block the eventual transmission of the UPDDONE
3405    message.
3406
3407    The xid in an UPDDONE message MUST be identical to the xid in the
3408    UPDREQ or UPDREQALL message that initiated the update process.
3409
3410    The UPDDONE message has no message specific options.
3411
3422 7.5.1.  Sending the UPDDONE message
3423
3424    The UPDDONE message SHOULD be sent as soon as the last BNDACK message
3425    corresponding to a BNDUPD message requested by the UPDREQ or
3426    UPDREQALL is received from the server which sent the UPDREQ or
3427    UPDREQALL.  The XID of the UPDDONE message MUST be the same as the
3428    XID of the corresponding UPDREQ or UPDREQALL message.
3429
3430 7.5.2.  Receiving the UPDDONE message
3431
3432    A server receiving the UPDDONE message knows that all of the informa-
3433    tion that it requested by sending an UPDREQ or UPDREQALL message has
3434    now been sent and that it has recorded this information in its stable
3435    storage.  It typically uses the receipt of an UPDDONE message to move
3436    to a different failover state.  See sections 9.5.2 and 9.8.3 for
3437    details.
3438
3439 7.6.  POOLREQ message [1]
3440
3441    The pool request (POOLREQ) message is used by the secondary server to
3442    request an allocation of IP addresses from the primary server.   It
3443    MUST be sent by a secondary server to a primary server to request IP
3444    address allocation by the primary.  The IP addresses allocated are
3445    transmitted using normal BNDUPD messages from the primary to the
3446    secondary.
3447
3448    The POOLREQ message SHOULD be sent from the secondary to the primary
3449    whenever the secondary transitions into NORMAL state.  It SHOULD
3450    periodically be resent in order that any change in the number of
3451    available IP addresses on the primary be reflected in the pool on the
3452    secondary.  The period may be influenced by the secondary server's
3453    leasing activity.
3454
3455    The POOLREQ message has no message specific options.
3456
3457 7.6.1.  Sending the POOLREQ message
3458
3459    The POOLREQ message is sent; it carries no message specific options.
3460
3461 7.6.2.  Receiving the POOLREQ message
3462
3463    When a primary server receives a POOLREQ message it SHOULD examine
3464    the binding database and determine how many IP addresses the secon-
3465    dary server should have, and set these IP addresses to BACKUP state.
3466    It SHOULD then send BNDUPD messages concerning all of these IP
3467    addresses to the secondary server.
3468
3469    Servers frequently have several kinds of IP addresses available on a
3478    particular network segment.  The failover protocol assumes that both
3479    primary and secondary servers are configured in such a way that each
3480    knows the type and number of IP addresses on every network segment
3481    participating in the failover protocol.  The primary server is
3482    responsible for allocating the secondary server the correct propor-
3483    tion of available IP addresses of each kind, and the secondary server
3484    is responsible for being configured in such a way that it can tell
3485    the kind of every IP address based solely on the IP address itself.
3486
3487    A primary server MUST keep track of how many IP addresses were allo-
3488    cated as a result of processing the POOLREQ message, and send that
3489    number in the POOLRESP message.
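
   One way a primary might process a POOLREQ is sketched below.  This
   section does not specify how the primary decides how many addresses
   the secondary should hold, so the secondary_share parameter (a
   fixed fraction of the FREE plus BACKUP addresses) is purely an
   assumption of this example, as are the handle_poolreq name and the
   dict layouts.  The sketch also ignores the different kinds of IP
   addresses discussed above.

```python
def handle_poolreq(bindings: dict, poolreq_xid: int,
                   secondary_share: float = 0.5):
    """Rebalance FREE/BACKUP addresses and build the POOLRESP.

    bindings maps IP address -> state ("FREE", "BACKUP", "ACTIVE", ...).
    secondary_share is an assumed configuration knob, not a protocol field.
    Returns (poolresp, bndupds).
    """
    free = sorted(ip for ip, st in bindings.items() if st == "FREE")
    backup = [ip for ip, st in bindings.items() if st == "BACKUP"]
    target = int((len(free) + len(backup)) * secondary_share)
    to_move = free[:max(0, target - len(backup))]
    for ip in to_move:
        bindings[ip] = "BACKUP"
    # Each transferred address is announced with an ordinary BNDUPD.
    bndupds = [{"type": "BNDUPD", "assigned-IP-address": ip,
                "binding-status": "BACKUP"} for ip in to_move]
    # The count of addresses moved is reported in the POOLRESP.
    poolresp = {"type": "POOLRESP", "xid": poolreq_xid,
                "addresses-transferred": len(to_move)}
    return poolresp, bndupds
```

   The count placed in the POOLRESP reflects changes made in the
   primary's binding database, so it is available before any of the
   BNDUPD messages have actually been transmitted.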
3490
3491    A primary server MAY choose to defer processing a POOLREQ message
3492    until a more convenient time to process it, but it should not depend
3493    on the secondary server to resend the POOLREQ message in that case.
3494
3495    If a secondary server receives a POOLREQ message it SHOULD report an
3496    error.
3497
3498 7.7.  POOLRESP message [2]
3499
3500    A primary server sends a POOLRESP message to a secondary server after
3501    the allocation process for available addresses to the secondary
3502    server is complete.  Typically this message will precede some of the
3503    BNDUPD messages that the primary uses to send the actual allocated IP
3504    addresses to the secondary.
3505
3506    The xid in the POOLRESP message MUST be identical to the xid in the
3507    POOLREQ message for which this POOLRESP is a response.
3508
3509
3510 7.7.1.  Sending the POOLRESP message
3511
3512    The POOLRESP message MUST contain the same xid as the corresponding
3513    POOLREQ message.
3514
3515    Only one option MUST appear in a POOLRESP message:
3516
3517       o addresses-transferred
3518
3519         The number of addresses allocated to the secondary server by the
3520         primary server as a result of a POOLREQ is contained in the
3521         addresses-transferred option in a POOLRESP message.  Note this
3522         is the number of addresses that are transferred to the secondary
3523         in the primary's binding database as a result of the correspond-
3524         ing POOLREQ message, and that it may be some time before they
3525         can all be transmitted to the secondary server through the use
3534         of BNDUPD messages.
3535
3536 7.7.2.  Receiving the POOLRESP message
3537
3538    When a secondary server receives a POOLRESP message, it SHOULD send
3539    another POOLREQ message if the value of the addresses-transferred
3540    option is non-zero.
3541
3542    Typically, no other action is taken on the reception of a POOLRESP
3543    message.
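
   A sketch of this check, with a hypothetical on_poolresp function
   and dict-based messages, could be as small as:

```python
def on_poolresp(poolresp: dict, next_xid: int):
    """Return a follow-up POOLREQ, or None when nothing was transferred."""
    if poolresp.get("addresses-transferred", 0) != 0:
        # Addresses were moved, so ask again in case more are available.
        return {"type": "POOLREQ", "xid": next_xid}
    return None
```

   The exchange in this sketch ends once a POOLRESP reports that zero
   addresses were transferred.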
3544
3545 7.8.  CONNECT message [5]
3546
3547    The CONNECT message is used to establish an application-level con-
3548    nection over a newly created TCP connection.  It gives the source
3549    information for the connection, and critical configuration informa-
3550    tion.  It MUST be sent only by the primary server.  Either server can
3551    initiate a TCP connection, but the CONNECT message is only sent by
3552    the primary server.
3553
3554    The CONNECT message MUST be the first message sent down a newly esta-
3555    blished connection, and it MUST be sent only by the primary server.
3556
3557    The following table summarizes the options that are associated with
3558    the CONNECT message:
3559
3560
3561    Option
3562    ------
3563    sending-server-IP-address   MUST
3564    max-unacked-bndupd          MUST
3565    receive-timer               MUST
3566    vendor-class-identifier     MUST
3567    protocol-version            MUST
3568    TLS-request                 MUST (1)
3569    MCLT                        MUST
3570    hash-bucket-assignment      MUST
3571
3572    (1) MUST NOT if CONNECT is being sent over a TLS connection
3573
3574               Table 7.8-1: Options used in a CONNECT message
3575
3576
3577 7.8.1.  Sending the CONNECT message
3578
3579    The CONNECT message MUST be the first message sent by the primary
3580    server after the establishment of a new TCP connection with a secon-
3581    dary server participating in the failover protocol.
3582
3590    The xid of the CONNECT message must be unique.
3591
3592    The IP address of the primary server MUST be placed in the sending-
3593    server-IP-address option.  This information is placed in an option
3594    inside of the message in order to allow the identity of the sender to
3595    be covered by a shared secret.
3596
3597    The number of BNDUPD messages the primary server can accept without
3598    blocking the TCP connection MUST be placed in the max-unacked-bndupd
3599    option.  This MUST be a number equal to or greater than 1, SHOULD be
3600    a number greater than 10, and SHOULD be a number less than 100.
3601
3602    The length of the receive timer (tReceive, see section 8.3) MUST be
3603    placed in the receive-timer option.
3604
3605    The MCLT MUST be placed in the MCLT option.
3606
3607    The hash-bucket-assignment option MUST be included in the CONNECT
3608    message.  In the event that load balancing is not configured for this
3609    server, the hash-bucket-assignment option will indicate that.  The
3610    value of the hash-bucket-assignment option is determined from the
3611    specific buckets that the primary server has determined that the
3612    secondary server MUST service as part of the load-balancing algo-
3613    rithm.  The way in which the primary server determines this informa-
3614    tion is outside the scope of this protocol definition.  The primary
3615    server SHOULD be configured with a percentage of clients that the
3616    secondary server will be instructed to service, and the primary
3617    server SHOULD use the algorithm in [LOADB] to generate a Hash Bucket
3618    Assignment which it sends to the secondary server.
3619
3620    The vendor class identifier MUST be placed in the vendor-class-
3621    identifier option.
3622
3623    The protocol-version option MUST be included in every CONNECT mes-
3624    sage.  The current value of the protocol version is 1.
3625
3626    The TLS-request option MUST be sent and contains the desired TLS con-
3627    nection request as well as information concerning whether TLS is sup-
3628    ported.  If this CONNECT message is being sent over an already
3629    created TLS connection, the TLS-request option MUST NOT appear.
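
   The option rules in this section could be collected into a single
   helper along the following lines.  The function name, the dict
   representation, and the argument names are assumptions made for
   this sketch; the units of the timer and MCLT values and the
   encoding of the hash bucket assignment are defined elsewhere and
   are treated here as opaque values.

```python
def build_connect_options(server_ip, max_unacked_bndupd, receive_timer,
                          vendor_class, mclt, hash_buckets,
                          tls_request=None, over_tls=False):
    """Assemble CONNECT options as a dict (hypothetical in-memory form)."""
    if max_unacked_bndupd < 1:
        raise ValueError("max-unacked-bndupd MUST be at least 1")
    options = {
        "sending-server-IP-address": server_ip,
        "max-unacked-bndupd": max_unacked_bndupd,
        "receive-timer": receive_timer,
        "vendor-class-identifier": vendor_class,
        "protocol-version": 1,          # current protocol version
        "MCLT": mclt,
        "hash-bucket-assignment": hash_buckets,
    }
    if over_tls:
        # Over an existing TLS connection the option MUST NOT appear.
        if tls_request is not None:
            raise ValueError("TLS-request MUST NOT appear over TLS")
    else:
        if tls_request is None:
            raise ValueError("TLS-request MUST be sent outside TLS")
        options["TLS-request"] = tls_request
    return options
```

   The SHOULD-level bounds on max-unacked-bndupd (greater than 10 and
   less than 100) are intentionally not enforced in this sketch, since
   only the lower bound of 1 is a MUST.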
3630
3631 7.8.2.  Receiving the CONNECT message
3632
3633    When a server receives a TCP connection on the failover port, if it
3634    is a primary server it should send a CONNECT message, and if it is a
3635    secondary server it should wait for a CONNECT message before sending
3636    any messages.  To avoid denial of service attacks, a secondary should
3637    only wait for a CONNECT message on a new connection for a limited
3646    amount of time and close the connection if none is received during
3647    that time.
3648
3649    When a secondary server receives a CONNECT message it should:
3650
3651       1.  Record the time at which the message was received.
3652
3653       2.  Examine the protocol-version option, and decide if this server
3654           is capable of interoperating with another server running that
3655           protocol version.  If not, send the CONNECTACK message with
3656           the appropriate reject-reason.  The server MUST include its
3657           protocol-version in the CONNECTACK message.
3658
3659       3.  Examine the TLS-request option.  Figure out the TLS-reply
3660           value based on the capabilities and configuration of this
3661           server.  If the result for the TLS-reply value is a 1 and the
3662           connection is accepted, indicating use of TLS, then immedi-
3663           ately send the CONNECTACK message and go into TLS negotiation.
3664           If the TLS-reply value implies rejection of the connection,
3665           then immediately send the CONNECTACK message with the TLS-
3666           reply value and the appropriate reject-reason option value.
3667           In all other cases, save the TLS-reply option information for
3668           the eventual CONNECTACK message.
3669
3670           The possibilities for TLS-request and TLS-reply are:
3671
3672           CONNECT CONNECTACK
3673             TLS     TLS
3674           request  reply
3675                         Reject
3676             t1      t1  Reason   Comments
3677             --      --  ------   --------
3678             0       0           no TLS used
3679             0       1    11     primary won't use TLS, secondary requires TLS
3680             1       0           primary desires TLS, secondary doesn't
3681             1       1           primary desires TLS, secondary will use TLS
3682             2       0    9, 10  primary requires TLS and secondary won't
3683             2       1           primary requires TLS and secondary will use TLS
3684
3685
3686
3687       4.  Check to see if there is a message-digest option in the CON-
3688           NECT message.  If there was, and the server does not support
3689           message-digests, then reject the connection with the appropri-
3690           ate reject-reason in the CONNECTACK.  If the server does sup-
3691           port message-digests, then check this message for validity
3692           based on the message-digest, and reject it if the digest indi-
3693           cates the message was altered.
3694
3702       5.  Determine if the sender (from the sending-server-IP-address
3703           option) and the implicit role of the sender (i.e., primary)
3704           represent a server with which the receiver was configured to
3705           engage in failover activity.  This is performed after any TLS
3706           or message digest processing so that it occurs after a secure
3707           connection is created, to ensure that there is no tampering
3708           with the IP address of the partner.
3709
3710           If not, then the receiving server should reject the CONNECT
3711           request by sending a CONNECTACK message with a reject-reason
3712           value of: 8, invalid failover partner.
3713
3714           If it is, then the receiving failover endpoint should be
3715           determined.
3716
3717       6.  Decide if the time delta between the sending of the message,
3718           in the time field, and the receipt of the message, recorded in
3719           step 1 above, is acceptable.  A server MAY require an arbi-
3720           trarily small delta in time values in order to set up a fail-
3721           over connection with another server.  See section 5.9 for
3722           information on time synchronization.
3723
3724           If the delta between the time values is too great, the server
3725           should reject the CONNECT request by sending a CONNECTACK mes-
3726           sage with a reject-reason of 4, time mismatch too great.
3727
3728           If the time mismatch is not considered too great then the
3729           receiving server MUST record the delta between the servers.
3730           The receiving server MUST use this delta to correct all of the
3731           absolute times received from the other server in all time-
3732           valued options.  Note that servers can participate in failover
3733           with an arbitrarily great time mismatch, as long as it is more
3734           or less constant.
3735
3736       7.  Examine the MCLT option in the CONNECT request and use the
3737           value of the MCLT as the MCLT for this failover endpoint.
3738
3739           The secondary server SHOULD be able to operate with any MCLT
3740           sent by the primary,  but if it cannot, then it should send a
3741           CONNECTACK with a reject-reason of 5, MCLT mismatch.
3742
3743       8.  The server MUST store the hash-bucket-assignment option for use
3744           during processing in NORMAL state.  If this hash bucket
3745           assignment conflicts with the secondary server's configured
3746           hash bucket assignment for use in other than NORMAL state, the
3747           secondary server should send a CONNECTACK with a reject reason
3748           of 19, Hash bucket assignment conflict.
3749
3758       9.  The receiving server MAY use the vendor-class-identifier to do
3759           vendor specific processing.
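
   The table of TLS-request and TLS-reply combinations in step 3 can
   be captured as a lookup.  The TLS_OUTCOMES and tls_outcome names
   and the three result strings are hypothetical; the reject-reason
   numbers and comments are copied from the table.

```python
# (TLS-request t1, TLS-reply t1) -> (reject reasons or (), comment)
TLS_OUTCOMES = {
    (0, 0): ((), "no TLS used"),
    (0, 1): ((11,), "primary won't use TLS, secondary requires TLS"),
    (1, 0): ((), "primary desires TLS, secondary doesn't"),
    (1, 1): ((), "primary desires TLS, secondary will use TLS"),
    (2, 0): ((9, 10), "primary requires TLS and secondary won't"),
    (2, 1): ((), "primary requires TLS and secondary will use TLS"),
}


def tls_outcome(request: int, reply: int) -> str:
    """Classify a TLS-request/TLS-reply pair as 'reject', 'tls' or 'plain'."""
    reasons, _comment = TLS_OUTCOMES[(request, reply)]
    if reasons:
        return "reject"
    return "tls" if reply == 1 else "plain"
```

   Any pair not listed in the table raises KeyError in this sketch,
   which a caller would presumably treat as a malformed message.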
3760
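   The time handling in step 6 of section 7.8.2 might be sketched as
   follows.  The function names and the max_delta parameter are
   hypothetical, times are assumed to be integer seconds, and the
   sketch ignores network transit time, which is folded into the
   recorded delta.

```python
def check_time_delta(sent_time: int, received_time: int, max_delta: int):
    """Return (accepted, delta) for a CONNECT; times are in seconds.

    delta is the local receipt time minus the partner's time field.
    """
    delta = received_time - sent_time
    return abs(delta) <= max_delta, delta


def to_local_time(partner_time: int, delta: int) -> int:
    """Convert an absolute time from the partner into local clock terms."""
    return partner_time + delta
```

   When check_time_delta reports the delta as unacceptable, the
   receiver would send a CONNECTACK with a reject-reason of 4;
   otherwise the recorded delta is applied through to_local_time to
   every time-valued option received later.
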
3761 7.9.  CONNECTACK message [6]
3762
3763    The CONNECTACK message is sent to accept or reject a CONNECT message.
3764    It is sent by the secondary server which received a CONNECT message.
3765
3766    Attempting immediately to reconnect after either receiving a CONNEC-
3767    TACK with a reject-reason or after sending a CONNECTACK with a
3768    reject-reason could yield unwanted looping behavior, since the reason
3769    that the connection was rejected may well not have changed since the
3770    last attempt.  A simple suggested solution is to wait a minute or two
3771    after sending or receiving a CONNECTACK message with a reject-reason
3772    before attempting to reestablish communication.
3773
3774    The following table summarizes the options associated with the CON-
3775    NECTACK message:
3776
3777
3778    Option
3779    ------
3780    sending-server-IP-address   MUST
3781    max-unacked-bndupd          MUST
3782    receive-timer               MUST
3783    vendor-class-identifier     MUST
3784    protocol-version            MUST
3785    TLS-reply                   MUST(1)
3786    reject-reason               MAY(2)
3787    message                     MAY
3788    MCLT                        MUST NOT
3789    hash-bucket-assignment      MUST NOT
3790
3791    (1) MUST NOT if sending CONNECTACK after TLS negotiation
3792    (2) Indicates a rejection of the CONNECT message.
3793
3794               Table 7.9-1: Options used in a CONNECTACK message
3795
3796
3797 7.9.1.  Sending the CONNECTACK message
3798
3799    The xid of the CONNECTACK message MUST be that of the corresponding
3800    CONNECT message.
3801
3802    The IP address of the sending server MUST be placed in the sending-
3803    server-IP-address option.  This information is placed in an option
3804    inside of the message in order to allow the identity of the sender to
3805    be covered by a shared secret.
3806
3814    The protocol-version option MUST be included in every CONNECTACK mes-
3815    sage.  The current value of the protocol version is 1.
3816
3817    If the connection has been rejected, the reject-reason option MUST be
3818    placed in the CONNECTACK message with an appropriate reason, and a
3819    message option SHOULD be included with a human-readable error message
3820    describing the reason for the rejection in some detail.  If the
3821    reject-reason option appears, then the remaining options listed below
3822    do not appear.  The sending server should close the connection after
3823    sending the CONNECTACK if the connection was rejected.
3824
3825    The results of the TLS negotiation MUST be placed in the TLS-reply
3826    option.  If this CONNECTACK message is being sent over an already TLS
3827    secured connection, then there MUST NOT be a TLS-reply option.
3828
3829    If there was a message-digest option in the CONNECT message, then
3830    there MUST be a message-digest in the CONNECTACK message and any sub-
3831    sequent messages if the CONNECTACK does not contain a reject-reason.
3832
3833    The number of BNDUPD messages the server can accept without blocking
3834    the TCP connection MUST be placed in the max-unacked-bndupd option.
3835    This SHOULD be a number greater than 10, and SHOULD be a number less
3836    than 100.
3837
3838    The length of the receive timer (tReceive, see section 8.3) MUST be
3839    placed in the receive-timer option.
3840
3841    The vendor class identifier MUST be placed in the vendor-class-
3842    identifier option.
3843
3844    After a connection is created (either by sending a CONNECTACK message
3845    to the first CONNECT message, or sending a CONNECTACK message to a
3846    CONNECT message received over a TLS connection), the server MUST send
3847    a STATE message.
3848
3849    After a connection is created, the server MUST start two timers for
3850    the connection: tSend and tReceive.   The tSend timer SHOULD be
3851    approximately 33 percent of the time in the receive-timer option in
3852    the corresponding CONNECT message.  The tReceive timer SHOULD be the
3853    time sent in the receive-timer option in the CONNECTACK message.
3854
3855    The tReceive timer is reset whenever a message is received from this
3856    TCP connection.  If it ever expires, the TCP connection is dropped
3857    and communications with this partner is considered not ok.
3858
3859    The tSend timer is reset whenever a message is sent over this connec-
3860    tion. When it expires, a CONTACT message MUST be sent.
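
   The two timers can be sketched with explicit clock values rather
   than real timers.  The ConnectionTimers class, its method names,
   and the strings returned by poll are hypothetical, and the sketch
   is written from the point of view of the server that sent the
   CONNECTACK.

```python
class ConnectionTimers:
    """tSend/tReceive bookkeeping driven by an explicit clock value."""

    def __init__(self, now, partner_receive_timer, own_receive_timer):
        # tSend: roughly a third of the receive-timer from the CONNECT.
        self.t_send = partner_receive_timer / 3.0
        # tReceive: the receive-timer this server sent in its CONNECTACK.
        self.t_receive = own_receive_timer
        self.last_sent = now
        self.last_received = now

    def on_send(self, now):
        """Reset tSend: a message was just sent on this connection."""
        self.last_sent = now

    def on_receive(self, now):
        """Reset tReceive: a message was just received."""
        self.last_received = now

    def poll(self, now):
        """Return 'drop', 'send-contact', or None."""
        if now - self.last_received >= self.t_receive:
            return "drop"            # tReceive expired: drop the connection
        if now - self.last_sent >= self.t_send:
            return "send-contact"    # tSend expired: send a CONTACT
        return None
```

   A caller would invoke poll periodically, send a CONTACT message and
   call on_send when it returns 'send-contact', and close the TCP
   connection when it returns 'drop'.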
3861
3870 7.9.2.  Receiving the CONNECTACK message
3871
3872    If a CONNECTACK message is received with a different XID from the one
3873    in the CONNECT that was sent, it SHOULD be ignored.
3874
3875    When a CONNECTACK message is received, the following actions should
3876    be taken:
3877
3878       1.  Record the time the message was received.
3879
3880       2.  Check to see if the xid on the CONNECTACK matches an outstand-
3881           ing CONNECT message on this TCP connection.
3882
3883       3.  Check to see if there is a reject-reason option in the CONNEC-
3884           TACK message.  If not, continue with step 4.  If there is a
3885           reject-reason option, the server SHOULD report the error code.
3886           If a message option appears, a server SHOULD display the string
3887           from the message option in a user visible way.  The server
3888           MUST close the connection if a reject-reason option appears.
3889
3890       4.  Check the value of the TLS-reply option (if any, which there
3891           won't be if this CONNECT is taking place utilizing TLS), and
3892           if it was 1, then skip processing of the rest of the CONNEC-
3893           TACK message, and immediately enter into TLS connection setup.
3894
3895           This step occurs prior to steps 5 and 6 in order to allow
3896           creation of a secure connection (if required) prior to pro-
3897           cessing the protocol version and IP address information.
3898
3899       5.  Examine the value of the protocol-version option.  If this
3900           server is able to establish connections with another server
3901           running this protocol version, then continue, else close the
3902           connection.
3903
3904       6.  Decide if the time delta between the sending of the message,
3905           in the time field, and the receipt of the message, recorded in
3906           step 1 above, is acceptable.  A server MAY require an arbi-
3907           trarily small delta in time values in order to set up a fail-
3908           over connection with another server.
3909
3910           If the delta between the time values is too great, the server
3911           should drop the TCP connection.
3912
3913           If the time mismatch is not considered too great then the
3914           receiving server MUST record the delta between the servers.
3915           The receiving server MUST use this delta to correct all of the
3916           absolute times received from the other server in all time-
3917           valued options.  Note that the failover protocol is
3926           constructed so that two servers can be failover partners with
3927           arbitrarily great time mismatches.
3928
3929       7.  The receiving server MAY use the vendor-class-identifier to do
3930           vendor specific processing.
3931
3932       8.  After accepting a CONNECTACK message, the server MUST send a
3933           STATE message.
3934
3935           After receiving a CONNECTACK message, the server MUST start
3936           two timers for the connection: tSend and tReceive.   The tSend
3937           timer SHOULD be approximately 20 percent of the time in the
3938           receive-timer option in the corresponding CONNECTACK message.
3939           The tReceive timer SHOULD be set to the time sent in the
3940           receive-timer option in the CONNECT message.
3941
3942           The tReceive timer is reset whenever a message is received
3943           from this TCP connection.  If it ever expires, the TCP connec-
3944           tion is dropped and communications with this partner is con-
3945           sidered not ok.
3946
3947           The tSend timer is reset whenever a message is sent over this
3948           connection. When it expires, a CONTACT message MUST be sent.
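
Illustrative sketch (not part of the protocol specification): steps 1
through 6 above might be expressed as a single decision function.  The
representation of a message as a dict keyed by option name, the
function names, and the returned action strings are all assumptions
made for this example.  The delta is computed without any allowance
for transit delay.

```python
def process_connectack(msg, sent_xid, received_at, supported_versions,
                       max_delta):
    """Return (action, detail) for a CONNECTACK given as a dict of options.

    received_at is the local receive time recorded in step 1.
    """
    # Step 2: ignore a CONNECTACK whose xid does not match the CONNECT sent.
    if msg["xid"] != sent_xid:
        return ("ignore", None)
    # Step 3: a reject-reason option is reported and the connection closed.
    if "reject-reason" in msg:
        text = msg.get("message", "")
        return ("close", f"rejected ({msg['reject-reason']}) {text}".strip())
    # Step 4: a TLS-reply of 1 skips the remaining steps until TLS is set up.
    if msg.get("tls-reply") == 1:
        return ("start-tls", None)
    # Step 5: the partner's protocol version must be one we can work with.
    if msg["protocol-version"] not in supported_versions:
        return ("close", "unsupported protocol version")
    # Step 6: compare the sender's time field with our receive time.
    delta = received_at - msg["time"]
    if abs(delta) > max_delta:
        return ("close", "time mismatch too great")
    return ("accept", delta)


def correct_time(partner_time, delta):
    """Convert an absolute time from the partner into local clock terms."""
    return partner_time + delta
```

On an "accept" result the caller would record the delta, send a STATE
message, and start the tSend and tReceive timers.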
3949
3950 7.10.  STATE message [10]
3951
3952    The state (STATE) message is used to communicate the current failover
3953    state to the partner server.
3954
3955    The STATE message MUST be sent after sending a CONNECTACK message
3956    that didn't contain a reject-reason option, and MUST be sent after
3957    receiving a CONNECTACK message without a reject-reason option.
3958
3959    A STATE message MUST be sent whenever the failover endpoint changes
3960    its failover state and a connection exists to the partner.
3961
3962    The STATE message requires no response from the failover partner.
3963
3964    The following table shows the options that MUST appear in a STATE
3965    message:
3966
3984    Option
3985    ------
3986    server-state                MUST
3987    server-flags                MUST
3988    start-time-of-state         MUST
3989
3990               Table 7.10-1: Options used in a STATE message
3991
3992
3993
3994 7.10.1.  Sending the STATE message
3995
3996    The current failover state is placed in the server-state option and
3997    the current state of the STARTUP flag is placed in the server-flags
3998    option.
3999
4000    The message is sent with a unique xid.
4001
4002    A server SHOULD only send the STATE message either when the connec-
4003    tion is created (i.e., after sending or receiving a CONNECTACK message
4004    with no reject-reason option), or when there is a change from the
4005    values sent in a previous STATE message.
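
Illustrative sketch (not part of the protocol specification): the rule
above, that a STATE message is sent on connection creation and
thereafter only when a value changes, might be implemented as follows.
The class name StateAnnouncer, the dict representation of the message,
and the per-connection xid counter are assumptions made for this
example.

```python
class StateAnnouncer:
    """Builds a STATE message only when there is something new to report."""

    def __init__(self):
        self._last = None  # (state, startup_flag) most recently sent
        self._xid = 0      # simple counter, unique within one connection

    def on_connection_created(self):
        # A STATE message is always due on a newly created connection.
        self._last = None

    def maybe_build(self, state, startup_flag, start_time):
        """Return a STATE message dict, or None if nothing has changed."""
        current = (state, startup_flag)
        if current == self._last:
            return None
        self._last = current
        self._xid += 1
        return {
            "type": "STATE",
            "xid": self._xid,
            "server-state": state,
            "server-flags": {"STARTUP": startup_flag},
            "start-time-of-state": start_time,
        }
```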
4006
4007 7.10.2.  Receiving the STATE message
4008
4009    Every STATE message SHOULD indicate a change in state or a change in
4010    the flags.
4011
4012    When a STATE message is received, any state transitions specified in
4013    section 9 are taken.
4014
4015    No response to a STATE message is required.
4016
4017 7.11.  CONTACT message [11]
4018
4019    The contact (CONTACT) message is sent to verify communications
4020    integrity with a failover partner.  The CONTACT message is sent when
4021    no messages have been sent to the failover partner for a specified
4022    period of time.  This is determined by the tSend timer expiring (see
4023    section 8.3).
4024
4025    The CONTACT message has no message specific options.
4026
4027 7.11.1.  Sending the CONTACT message
4028
4029    The CONTACT message is sent.
4030
4038 7.11.2.  Receiving the CONTACT message
4039
4040    When a CONTACT message is received, the tReceive timer is reset (as
4041    it is with any message that is received).
4042
4043    A server SHOULD use the time in the time field and the time the mes-
4044    sage was received to refine the delta time calculations between the
4045    servers.
4046
4047 7.12.  DISCONNECT message [12]
4048
4049    The DISCONNECT is the last message sent over a connection before
4050    dropping an established connection (note that an established connec-
4051    tion is one where a CONNECTACK has been sent without a reject rea-
4052    son).
4053
4054    After sending or receiving a DISCONNECT message, a server needs to
4055    have some mechanism to prevent an error loop. Simply reconnecting to
4056    the partner immediately is not the best option, especially after
4057    several consecutive attempts.
4058
4059    A simple suggested solution is to wait a minute or two after sending
4060    or receiving a DISCONNECT before attempting to reestablish communica-
4061    tion.
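
Illustrative sketch (not part of the protocol specification): one way
to avoid the error loop described above is to hold off reconnection
for a period that grows with each consecutive DISCONNECT.  The
doubling policy, the 60 second base, the 600 second cap, and the class
name ReconnectGate are assumptions made for this example; the text
above suggests only a wait of a minute or two.

```python
class ReconnectGate:
    """Decides when a reconnection attempt is allowed after a DISCONNECT."""

    def __init__(self, base=60.0, cap=600.0):
        self.base = base          # first hold-off, in seconds
        self.cap = cap            # upper bound on the hold-off
        self.failures = 0         # consecutive DISCONNECTs seen
        self.not_before = 0.0     # earliest clock value for a new attempt

    def on_disconnect(self, now):
        """Record a sent or received DISCONNECT; return the hold-off used."""
        delay = min(self.base * (2 ** self.failures), self.cap)
        self.failures += 1
        self.not_before = now + delay
        return delay

    def on_established(self):
        self.failures = 0         # a working connection clears the history

    def may_connect(self, now):
        return now >= self.not_before
```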
4062
4063    The DISCONNECT message MUST be the last message sent down a connec-
4064    tion before it is closed.
4065
4066    The following table summarizes the options that are associated with
4067    the DISCONNECT message:
4068
4069
4070    Option
4071    ------
4072    reject-reason               MUST
4073    message                     SHOULD
4074
4075               Table 7.12-1: Options used in a DISCONNECT message
4076
4077
4078
4079 7.12.1.  Sending the DISCONNECT message
4080
4081    The DISCONNECT message MUST be the last message sent by a server
4082    which is dropping a TCP connection.
4083
4084    The xid of the DISCONNECT message must be unique.
4085
4094    The reject-reason option MUST appear giving a reason why the connec-
4095    tion was dropped.  A message option SHOULD appear giving a human
4096    readable error message with possibly more details.
4097
4098 7.12.2.  Receiving the DISCONNECT message
4099
4100    When a server receives a DISCONNECT message it should log the message
4101    if there was one and possibly raise an alarm of some sort if the
4102    reject reason was one that was sufficiently serious.
4103
4104 8.  Connection Management
4105
4106    Servers participating in the failover protocol communicate over TCP
4107    connections.   These TCP connections are used both to transmit bind-
4108    ing information from one server to another as well as to allow each
4109    server to determine whether communications is possible with the other
4110    server.
4111
4112    Central to the operation of the failover protocol is a notion of
4113    "communications okay" or "communications failed".  Failover state
4114    transitions are taken in many cases when the status of communications
4115    with the partner changes, and the existence or non-existence of a TCP
4116    connection between failover endpoints is used to determine if com-
4117    munications is "okay" or "failed".
4118
4119    A single TCP connection exists which connects two failover endpoints.
4120
4121 8.1.  Connection granularity
4122
4123    There exists one TCP connection between each set of failover end-
4124    points.  See section 5.1.1 for an explanation of failover endpoints.
4125
4126    There are a maximum of two TCP connections between any two servers
4127    implementing the failover protocol, one for each of the possible
4128    failover endpoints between these two servers.  There is a minimum of
4129    one TCP connection between one server and every other failover server
4130    with which it implements the failover protocol.
4131
4132 8.2.  Creating the TCP connection
4133
4134    There are two ports used for initiating TCP connections, correspond-
4135    ing to the two roles that a server can fill with respect to another
4136    server.  Every server implementing the failover protocol MUST listen
4137    on at least one of these ports.  Port 647 is the port to which pri-
4138    mary servers will attempt a connection, and port TBD is the port to
4139    which secondary servers will attempt a connection.  When a connection
4140    attempt is received on port 647 it is therefore from a primary
4141    server, and it is attempting to connect to this server to become a
4150    secondary server for it.  Likewise, when an attempt to connect is
4151    received on port TBD the connection attempt is from a secondary
4152    server, and it is attempting to connect to this server to be a pri-
4153    mary server.  The source port of any TCP connection is unimportant.
4154    See the schematic representation below:
4155
4156
4157       Primary Server
4158       --------------
4159        Listens on port TBD for secondary server to connect to it
4160        Periodically connects on port 647 to contact secondary
4161
4162       Secondary Server
4163       ----------------
4164        Listens on port 647 for primary server to connect to it
4165        Periodically connects on port TBD to contact primary
4166
4167
4168    Every server implementing the failover protocol SHOULD attempt to
4169    connect to all of its partners periodically, where the period is
4170    implementation dependent and SHOULD be configurable.  In the event
4171    that a connection has been rejected by a CONNECTACK message with a
4172    reject-reason option contained in it or a DISCONNECT message, a
4173    server SHOULD reduce the frequency with which it attempts to connect
4174    to that server but it SHOULD continue to attempt to connect periodi-
4175    cally.
4176
4177    If a connection attempt has been received from another server in a
4178    particular role (i.e., from a specific failover endpoint) then the
4179    receiving server MUST NOT initiate a connection attempt to the
4180    partner server in that same role.
4181
4182    If both servers happen to attempt to connect simultaneously, the
4183    secondary server MUST drop its attempt in favor of the primary's
4184    attempt.  Thus, in the event that a secondary server receives a con-
4185    nection attempt to port 647 from a primary server when it has already
4186    initiated a connection attempt to port TBD on the same primary
4187    server, it MUST accept the connection to port 647 and it MUST drop
4188    the connection attempt to port TBD. In the event that a primary
4189    server receives a connection attempt to port TBD from a secondary
4190    server when it has already initiated a connection attempt to port 647
4191    on that same server, it MUST reject the connection attempt to port
4192    TBD and continue to pursue the connection attempt on port 647.
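
Illustrative sketch (not part of the protocol specification): the
tie-breaking rule above for simultaneous connection attempts might be
written as follows.  The function name, the role strings, and the
returned action strings are assumptions made for this example.  Port
numbers are deliberately left out, since one of them is still TBD.

```python
def resolve_simultaneous(my_role, outgoing_pending, incoming_arrived):
    """Decide what to do with each connection attempt.

    Returns a dict with the action for the "incoming" and "outgoing"
    attempt; an action of None means no such attempt exists.
    """
    if my_role not in ("primary", "secondary"):
        raise ValueError("my_role must be 'primary' or 'secondary'")
    if not (outgoing_pending and incoming_arrived):
        # No collision: whatever attempt exists simply proceeds.
        return {
            "incoming": "accept" if incoming_arrived else None,
            "outgoing": "continue" if outgoing_pending else None,
        }
    if my_role == "secondary":
        # The secondary yields to the primary's attempt.
        return {"incoming": "accept", "outgoing": "drop"}
    # The primary rejects the secondary's attempt and pursues its own.
    return {"incoming": "reject", "outgoing": "continue"}
```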
4193
4194    Once a connection is established, the primary server MUST send a CON-
4195    NECT message across the connection.  A secondary server MUST wait for
4196    the CONNECT message from a primary server.
4197
4206    Every CONNECT message includes a TLS-request option, and if the CON-
4207    NECTACK message does not reject the CONNECT message and the TLS-reply
4208    option says TLS MUST be used, then the servers will immediately enter
4209    into TLS negotiation.
4210
4211    Once TLS negotiation is complete, the primary server MUST resend the
4212    CONNECT message on the newly secured TLS connection and then wait for
4213    the CONNECTACK message in response.  The TLS-request and TLS-reply
4214    options MUST NOT appear in either this second CONNECT or its associ-
4215    ated CONNECTACK message as they had in the first messages.
4216
4217    The second message sent over a new connection (either a bare TCP con-
4218    nection or a connection utilizing TLS) is a STATE message.  Upon the
4219    receipt of this message, the receiver can consider communications up.
4220
4221    It is entirely possible that two servers will attempt to make connec-
4222    tions to each other essentially simultaneously, and in this case the
4223    secondary server will be waiting for a CONNECT message on each con-
4224    nection.  The primary server MUST send a CONNECT message over one
4225    connection and it MUST close the other connection.
4226
4227    A secondary server MUST NOT respond to the closing of a TCP connec-
4228    tion with a blind attempt to reconnect -- there may be another TCP
4229    connection to the same failover partner already in use.
4230
4231 8.3.  Using the TCP connection for determining communications status
4232
4233    The TCP connection is used to determine the communications status of
4234    the other server, i.e., communications-ok, or communications-
4235    interrupted.
4236
4237    Three things must happen for a server to consider that communications
4238    are ok with respect to another server:
4239
4240
4241       1.  A TCP connection must be established to the other server.
4242
4243       2.  A CONNECT message must be received and a CONNECTACK message
4244           sent in response.  The CONNECT message is used to determine
4245           the identity of the failover endpoint at the other end of the
4246           TCP connection -- without it, the failover endpoint cannot be
4247           uniquely determined.  Without knowledge of the failover end-
4248           point, the entity with which communications is ok is
4249           undetermined.
4250
4251       3.  A STATE message must be received from the other server over
4252           the connection.  This STATE message initializes important
4253           information necessary to the operation of the state machine
4262           that governs the behavior of this failover endpoint.
4263
4264    There are two ways that a server can determine that communications
4265    has failed:
4266
4267
4268       1.  The TCP connection can go down, yielding an error when
4269           attempting to send or receive a message. This will happen at
4270           least as often as the period of the tSend timer.
4271
4272       2.  The tReceive timer can expire.
4273
4274    In either of these cases, communications is considered interrupted.
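
Illustrative sketch (not part of the protocol specification): the
three conditions for "communications ok" and the two failure cases
above might be tracked per connection as follows.  The class name
CommsStatus and its method names are assumptions made for this
example.

```python
class CommsStatus:
    """Tracks the three conditions for "communications ok"."""

    def __init__(self):
        self.tcp_established = False
        self.endpoint_identified = False  # CONNECT / CONNECTACK exchanged
        self.state_received = False

    def on_tcp_established(self):
        self.tcp_established = True

    def on_connect_acknowledged(self):
        self.endpoint_identified = True

    def on_state_received(self):
        self.state_received = True

    def on_failure(self):
        # Covers both failure cases: a TCP error or tReceive expiry.
        self.__init__()

    @property
    def ok(self):
        return (self.tcp_established and self.endpoint_identified
                and self.state_received)
```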
4275
4276    Several difficulties arise when trying to use one TCP connection for
4277    both bulk data transfer as well as to sense the communications status
4278    of the other server.   One aspect of the problem stems from the dif-
4279    ferent requirements of both uses.  The bulk data transfer is of
4280    course critically important to the protocol, but the speed with which
4281    it is processed is not terribly significant.  It might well be
4282    minutes before a BNDUPD message is processed, and while not optimal,
4283    such an occasional delay doesn't compromise the correctness of the
4284    protocol. However, the speed with which one server detects the other
4285    server is up (or, more importantly, down) is more highly constrained.
4286    Generally one server should be able to detect that the other server
4287    is not communicating within a minute or less.
4288
4289    These differing time constraints make it difficult to use the same
4290    TCP connection for data transfer as well as to sense communications
4291    integrity.   See section 3.5 for additional details on TCP.
4292
4293    The solution to this problem is to require that some message be
4294    received by each end of the connection within a limited time or that
4295    the connection will be considered down.  If no messages have been
4296    sent recently, then a CONTACT message is sent.
4297
4298    In the case where there is no data queued to be sent, this is not a
4299    problem, but in the case where there is data queued to be sent to the
4300    partner, then the CONTACT message will not actually be transmitted
4301    until the queued data is sent.  Section 3.5 explains why waiting for
4302    TCP to determine that the connection is down is not acceptable, and
4303    leads to a requirement that the receiving server never block the
4304    sending server from sending CONTACT messages.
4305
4306    In order to meet this requirement, each server tells the other server
4307    the number of outstanding BNDUPD messages that it will accept.  The
4308    receiving server is required to always be able to accept that many
4309    BNDUPD messages off of the connection's input queue even if it cannot
4318    process them immediately, and to accept all other messages immedi-
4319    ately.
4320
4321    Thus, the sending server's TCP is never blocked from sending a mes-
4322    sage except for very short periods, less than a few seconds unless
4323    the network connection itself has problems.  In this case, if the
4324    CONTACT messages don't make it to the partner then the partner will
4325    close the connection.
4326
4327    DISCUSSION:
4328
4329       When implementing this capability, one needs to be careful when
4330       sending any message on the TCP connection as TCP can easily block
4331       the server if the local TCP send buffers are full.  This can't be
4332       prevented because if the receiver is not reachable (via the net-
4333       work), the sending TCP can't send and thus it will be unable to
4334       empty the local TCP send buffers.  So, all send operations either
4335       need to assume they may block for some time or non-blocking sends
4336       must be used.
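
Illustrative sketch (not part of the protocol specification): the
non-blocking alternative mentioned in the discussion above might look
like this with the standard Python socket module.  The helper name
try_send is an assumption made for this example, and the socket is
assumed to have been placed in non-blocking mode by the caller.

```python
import socket


def try_send(sock, data):
    """Attempt a send on a non-blocking socket.

    Returns the number of bytes the local TCP accepted, or 0 when the
    local send buffer is full and the call would otherwise have blocked.
    """
    try:
        return sock.send(data)
    except BlockingIOError:
        return 0
```

A caller would keep any unsent remainder queued and retry once the
socket becomes writable, rather than stalling the server.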
4337
4338 8.4.  Using the TCP connection for binding data
4339
4340    Binding data, in the form of BNDUPD messages and BNDACK messages to
4341    respond to them, are sent across the TCP connection.
4342
4343    In order to support timely detection of any failure in the partner
4344    server, the TCP connection MUST NOT block for more than a very short
4345    time, on the order of a few seconds.  Therefore, a server that is
4346    sending BNDUPD messages MUST send only a restricted number before
4347    receiving BNDACK messages about previous messages sent.
4348
4349    The number of outstanding BNDUPD messages that each server will
4350    accept without causing TCP to block transmission of additional data
4351    (i.e., CONTACT messages) is sent by each server in the CONNECT and
4352    CONNECTACK messages in the max-unacked-bndupd option.
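
Illustrative sketch (not part of the protocol specification): the
restriction above on unacknowledged BNDUPD messages amounts to a send
window sized by the partner's max-unacked-bndupd value.  The class
name BndupdWindow and its method names are assumptions made for this
example.

```python
class BndupdWindow:
    """Limits outstanding BNDUPD messages to the partner's advertised maximum."""

    def __init__(self, partner_max_unacked):
        self.limit = partner_max_unacked  # from the max-unacked-bndupd option
        self.unacked = set()              # xids sent but not yet acknowledged

    def can_send(self):
        return len(self.unacked) < self.limit

    def on_bndupd_sent(self, xid):
        if not self.can_send():
            raise RuntimeError("BNDUPD window is full; wait for a BNDACK")
        self.unacked.add(xid)

    def on_bndack(self, xid):
        # An unknown xid is ignored rather than treated as an error.
        self.unacked.discard(xid)
```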
4353
4354 8.5.  Using the TCP connection for control messages
4355
4356    The TCP connection is used for control messages: POOLREQ, UPDREQ,
4357    STATE, CONTACT, UPDREQALL and the corresponding reply messages: POOL-
4358    RESP, UPDDONE.  A server MUST immediately accept all of these mes-
4359    sages from the TCP connection.  A server MUST immediately accept any
4360    BNDACK which is received as well.
4361
4362 8.6.  Losing the TCP connection
4363
4364    When the TCP connection is lost, then communications is not ok with
4365    the other server.  A server which has lost communications SHOULD
4374    immediately attempt to reconnect to the other server, and should
4375    retry these connection attempts periodically.
4376
4377    An acknowledgement message (BNDACK, POOLRESP, UPDDONE) can only be
4378    sent in response to a request message (BNDUPD, POOLREQ, UPDREQ,
4379    UPDREQALL) on the same TCP connection from which the request was
4380    received, in part because the XIDs in the request messages are
4381    guaranteed unique only during the life of a single TCP connection.
4382
4383    When a connection to a partner server goes down, a server with unpro-
4384    cessed request messages MAY simply drop all of those messages, since
4385    it can be sure that the partner will resend them when they are next
4386    in communications.  A server with unprocessed BNDUPD messages when a
4387    TCP connection goes down MAY instead choose to process those BNDUPD
4388    messages, but it MUST NOT send any BNDACK messages in response (again
4389    because of the issues surrounding XID uniqueness).
4390
4391    When the TCP connection is closed explicitly, the DISCONNECT message
4392    with a reject-reason option (and, ideally, a message option) MUST be
4393    sent over the TCP connection.
4394
4395 9.  Failover Endpoint States
4396
4397    This section discusses the various states that a failover endpoint
4398    may take, and the server actions required when entering the state,
4399    operating in the state, and leaving the state, as well as the events
4400    that cause transitions out of the state into another state.
4401
4402    The state transition diagram in Figure 9.2-1 is relevant for this
4403    section. This is the common state transition diagram for both servers
4404    in a failover pair.  In the event that the textual description of a
4405    state differs from the state transition diagram, the textual descrip-
4406    tion is to be considered authoritative.
4407
4408 9.1.  Server Initialization
4409
4410    When a server starts it starts out in STARTUP state.  See section 9.3
4411    below for details.
4412
4413 9.2.  Server State Transitions
4414
4415    Whenever a server transitions into a new state, it MUST record the
4416    state and the time at which it entered that state in stable storage.
4417    If communications is "ok", it MUST also send a STATE message to its
4418    failover partner.
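
Illustrative sketch (not part of the protocol specification): recording
the state and its entry time in stable storage might be done with an
atomic file replacement, so that a crash never leaves a partially
written record.  The JSON layout, the file handling, and the function
names are assumptions made for this example.

```python
import json
import os
import tempfile


def record_state(path, state, entered_at):
    """Durably record the failover state and the time it was entered."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"state": state, "entered_at": entered_at}, f)
            f.flush()
            os.fsync(f.fileno())   # force the data to disk before renaming
        os.replace(tmp, path)      # atomically swap in the new record
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_state(path):
    """Return the recorded state as a dict, or None if no record exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return None
```

Only after record_state() returns would the caller send the STATE
message, and then only if communications is "ok".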
4419
4420    Figure 9.2-1 is the diagram of the server state transitions. The
4421    remainder of this section contains information important to the
4430    understanding of that diagram.
4431
4432    The server stays in the current state until all of the actions speci-
4433    fied on the state transition are complete.  If communications fails
4434    during one of the actions, the server simply stays in the current
4435    state and attempts a transition whenever the conditions for a transi-
4436    tion are later fulfilled.
4437
4438    In the state transition diagram below, the "+" or "-" in the upper
4439    right corner of each state is a notation about whether communication
4440    is ongoing with the other server.
4441
4442    The legend "responsive", "balanced", or "unresponsive" in each state
4443    indicates whether the server is responsive to all DHCP client
4444    requests, running in load balanced mode, or totally unresponsive in
4445    the respective state.  The terms "responsive" and "unresponsive" have
4446    the obvious meanings, while "balanced" means that a DHCP server may
4447    respond to all DHCPREQUEST messages that are RENEWAL or REBINDING,
4448    and to all other messages from clients for which the load balancing
4449    algorithm indicates that it MUST respond.  See sections 5.3 and
4450    9.6.2 for details on load balancing.
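
Illustrative sketch (not part of the protocol specification): the
meaning of the three legends above might be captured in a single
predicate.  The function name, the string values, and the boolean
standing in for the outcome of the load balancing algorithm are
assumptions made for this example.

```python
def should_respond(mode, message_type, client_state, lb_selects_this_server):
    """Decide whether to answer a DHCP client message in the given mode."""
    if mode == "responsive":
        return True
    if mode == "unresponsive":
        return False
    if mode == "balanced":
        # Renewing and rebinding clients may always be answered.
        if (message_type == "DHCPREQUEST"
                and client_state in ("RENEWAL", "REBINDING")):
            return True
        # Everything else is answered only when load balancing says so.
        return lb_selects_this_server
    raise ValueError(f"unknown mode: {mode!r}")
```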
4451
4452    In the state transition diagram below, when communication is reesta-
4453    blished between the two servers, each must record the state of the
4454    partner when communication was restored.  State transitions on one
4455    server in some cases imply state transitions on the partner server,
4456    so a record of the current state of the partner server must be kept
4457    by each server.
4458
4459    If the state of the partner changes while communicating, a server
4460    moves through the communications-failed transition and into whatever
4461    state results.  It then immediately moves through whatever state
4462    transition is appropriate given the current state of the partner
4463    server.  A server performing this operation SHOULD NOT close the TCP
4464    connection to its partner.
4465
4466    DISCUSSION:
4467
4468       The point of this technique is simplicity, both in explanation of
4469       the protocol and in its implementation.  The alternative to this
4470       technique of memory of partner state and automatic state transi-
4471       tion on change of partner state is to have every state in the fol-
4472       lowing diagram have a state transition for every possible state of
4473       the partner.  With the approach adopted, only the states in which
4474       communications are reestablished require a state transition for
4475       each possible partner state.
4476
4477    The current state of a server MUST be recorded in stable storage and
4486    thus be available to the server after a server restart.
4487
4488
4489         +---------------+  V  +--------------+
4490         |    RECOVER  - |  |  |   STARTUP  - |
4491         |(unresponsive) |  +->|(unresponsive)|
4492         +---------------+     +--------------+
4493            Comm. OK            +-----------------+
4494           Other State:-RECOVER |  PARTNER DOWN - |<-----------------+
4495           |      |             | (responsive)    |                  |
4496          All   POTENTIAL-      +-----------------+ +--------------+ |
4497        Others  CONFLICT------------ | --------+    |  RESOLUTION -| |
4498           |                     Comm. OK      |    |  INTERRUPTED | |
4499          UPDREQ(ALL)          Other State:    |  +-| (responsive) | |
4500        Wait UPDDONE            |        |     |  | +--------------+ |
4501      Wait MCLT from fail   RECOVER  All Others| Comm. OK  ^     |   |
4502       +--------------+         |        V     V  V        |    Ext. |
4503       |RECOVER-DONE +|      +--+    +--------------+    Comm.  Cmd. |
4504       |(unresponsive)|      |       |  POTENTIAL + |    Failed  |   |
4505       +--------------+   Wait for +>|  CONFLICT    |------+     +-->|
4506          Comm. OK         Other   | |(unresponsive)|<--------+      |
4507      +--Other State:-+    State:  | +--------------+         |      |
4508      |   |           |   RECOVER  |         |                |      |
4509      |   All      POTENT.  DONE   | Resolve Conflict         |      |
4510      |  Others:  CONFLICT-- | ----+     (see 9.8)            |      |
4511      | Wait for             V               V                |      |
4512      | Other State: NORMAL +-----------------+               |      |
4513      |   V                 |     NORMAL    + | External      |      |
4514      |   +--+----------+-->|   (balanced)    |-Command---+-- | -----+
4515      |      ^          ^   +-----------------+           |   |
4516      |      |          |            |                    |   |
4517      |  Wait for   Comm. OK       Comm.            External  |
4518      |   Other      Other        Failed            Command   |
4519      |   State:     State:          |                or  |   |
4520      |RECOVER-DONE  NORMAL     Start Safe        Safe    |   |
4521      |      |     COMM. INT.  Period Timer       Period  |   |
4522      |   Comm. OK.     |            V            expiration  |
4523      |  Other State:   |  +------------------+           |   |
4524      |    RECOVER      +--| COMMUNICATIONS - |-----------+   |
4525      V      +-------------|   INTERRUPTED    |   Comm. OK    |
4526     RECOVER               |  (responsive)    |--Other State:-+
4527     RECOVER-DONE--------->+------------------+   All Others
4528
4529            Figure 9.2-1:  Server state diagram.
4530
4543 9.3.  STARTUP state
4544
4545    The STARTUP state affords an opportunity for a server to probe its
4546    partner server, before starting to service DHCP clients.
4547
4548    DISCUSSION:
4549
4550       Without the STARTUP state, a server would likely start in a state
4551       derived from its previously stored state (held in stable storage),
4552       if any.  However, this may be inconsistent with the current state
4553       of the partner.  The STARTUP state affords the opportunity for a
4554       server to potentially learn the partner's state and determine if
4555       that state is consistent with its derived starting state or
4556       whether some significant state change has occurred at the partner
4557       that forces the server to start in another state.  This is
4558       especially critical if significant time has elapsed while the
4559       server was down.
4560
4561
4562 9.3.1.  Operation while in STARTUP state
4563
4564    Whenever a server is in STARTUP state, it MUST be unresponsive to
4565    DHCP client requests, and so the time spent in the STARTUP state is
4566    necessarily short, typically on the order of a few seconds to a few
4567    tens of seconds.  The exact time spent in the STARTUP state is imple-
4568    mentation dependent, and the primary and secondary server are not
4569    required to spend the same amount of time in the STARTUP state.
4570
4571    Whenever a STATE message is sent to the partner while in STARTUP
4572    state the STARTUP bit MUST be set in the server-flags option and the
4573    previously recorded failover state MUST be placed in the server-state
4574    option.
4575
4576
4577 9.3.2.  Transition out of STARTUP state
4578
4579    Each server starts out in STARTUP state every time it initializes
4580    itself, and performs the following algorithm as part of its initiali-
4581    zation:
4582
4583       1.  Is there any record in stable storage of a previous failover
4584           state?  If yes, set previous-state to the last recorded state
4585           in stable storage, and continue with step 2.
4586
4587           Is there any configuration information that indicates that
4588           this server was previously running but lost its stable
4589           storage?  Such information must typically come from some
4598           administrative intervention, since it is difficult for a
4599           server to distinguish first startup from a startup after it
4600           has lost its stable storage.  If yes, then set the previous-
4601           state to RECOVER, and set the time-of-failure to whatever time
4602           was configured, and go on to step 2.  This time-of-failure
4603           will be used in the transition out of the RECOVER state into
4604           the RECOVER-DONE state, below.
4605
4606           If there is no record of any previous failover state in stable
4607           storage nor of any previous operational activity for this
4608           server, then set the previous-state to PARTNER-DOWN if this
4609           server is a primary and RECOVER if this server is a secondary,
4610           and set the time-of-failure to a time earlier than the current
4611           time minus the maximum-client-lead-time.  If using standard
4612           POSIX times, 0 would typically do quite well.
4613
4614       2.  Is the previous-state NORMAL?  If yes, set the previous-state
4615           to COMMUNICATIONS-INTERRUPTED.
4616
4617       3.  Start the STARTUP state timer.  The time that a server remains
4618           in the STARTUP state (absent any communications with its
4619           partner) is implementation dependent and SHOULD be configur-
4620           able.  It SHOULD be long enough for a TCP connection to be
4621           created to a heavily loaded partner across a slow network.
4622
4623       4.  Attempt to create a TCP connection to the failover partner.
4624           See section 8.2.
4625
4626       5.  Wait for "communications okay", i.e., the process discussed in
4627           section 8.2 "Creating the TCP Connection", to complete,
4628           including the receipt of a STATE message from the partner.
4629
4630           When and if communications become "okay", clear the STARTUP
4631           flag, and set the current state to the previous-state.
4632
4633           If the partner is in PARTNER-DOWN state, and if the time at
4634           which it entered PARTNER-DOWN state (as received in the
4635           start-time-of-state option in the STATE message) is later than
4636           the last recorded time of operation of this server, then set
4637           the current state to RECOVER.  If the time at which it entered
4638           PARTNER-DOWN state is earlier than the last recorded time of
4639           operation of this server, then set the current state to
4640           POTENTIAL-CONFLICT.
4641
4642           Then, transition to the current state and take the "communica-
4643           tions okay" state transition based on the current state of
4644           this server and the partner.
4645
4654       6.  If the startup time expires, take an implementation dependent
4655           action:  The server MAY go to the previous-state, or the
4656           server MAY wait.
4657
4658           Reasons to go to previous-state and begin processing:
4659
4660           If the current server is the only operational server, then if
4661           it waits, there will be no operational DHCP servers.  This
4662           situation could occur very easily where one server fails and
4663           then the other crashes and reboots.  If the rebooting server
4664           doesn't start processing DHCP client requests without first
4665           being in communication with the other server, then the level
4666           of DHCP redundancy is not particularly high.  This is an
4667           appropriate approach if the possibility of partition is low,
4668           or if the safe period expiration time is well beyond the time
4669           at which an operator would notice and react to a partition
4670           situation.  It is also quite appropriate if the safe period
4671           will never expire.
4672
4673           Reasons to wait:
4674
4675           If the current server has been down for longer than the
4676           maximum-client-lead-time, and it is partitioned from the other
4677           server, then when it returns it will attempt to use its own
4678           available addresses to allocate to new DHCP clients, and the
4679           other server may well be in PARTNER-DOWN state and may have
4680           already allocated some of those available addresses to DHCP
4681           clients.  In cases where the possibility of partition is high,
4682           and the safe period expiration time is less than the likely
4683           operator reaction time, this is a good approach to use.
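
   As an illustration only, steps 1, 2, and 5 of the algorithm above
   might be sketched in Python as follows.  The function and parameter
   names (initial_previous_state, state_after_contact, stored_state,
   configured_failure_time, and so on) are hypothetical and are not
   defined by this protocol; times are assumed to be POSIX seconds.
   The text does not say what to do when the partner's PARTNER-DOWN
   start time exactly equals this server's last recorded time of
   operation, so the sketch simply keeps the previous-state in that
   case.

```python
from typing import Optional, Tuple


def initial_previous_state(
    stored_state: Optional[str],
    configured_failure_time: Optional[float],
    is_primary: bool,
) -> Tuple[str, Optional[float]]:
    """Steps 1 and 2: return (previous-state, time-of-failure)."""
    if stored_state is not None:
        # A previous failover state was found in stable storage.
        # Step 1 sets no time-of-failure in this case.
        previous, time_of_failure = stored_state, None
    elif configured_failure_time is not None:
        # Configuration says the server ran before but lost its storage.
        previous, time_of_failure = "RECOVER", configured_failure_time
    else:
        # First startup ever: no stored state, no record of prior activity.
        previous = "PARTNER-DOWN" if is_primary else "RECOVER"
        time_of_failure = 0.0  # POSIX time 0 is well before now minus MCLT
    if previous == "NORMAL":
        # Step 2: never resume directly in NORMAL.
        previous = "COMMUNICATIONS-INTERRUPTED"
    return previous, time_of_failure


def state_after_contact(
    previous_state: str,
    partner_state: str,
    partner_state_start: float,
    last_operation_time: float,
) -> str:
    """Step 5: choose the current state once communications are okay."""
    if partner_state == "PARTNER-DOWN":
        if partner_state_start > last_operation_time:
            return "RECOVER"
        if partner_state_start < last_operation_time:
            return "POTENTIAL-CONFLICT"
        # Equal times are not covered by the text (assumption: no change).
    return previous_state
```

   A caller would invoke initial_previous_state() once at startup and
   state_after_contact() after the first STATE message arrives from the
   partner, then take the normal "communications okay" transition for
   the resulting state.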
4684
4685 9.4.  PARTNER-DOWN state
4686
4687    PARTNER-DOWN state is a state either server can enter.  When in this
4688    state, the server does not assume that the other server could still
4689    be operating and servicing a different set of clients, but instead
4690    assumes that it is the only server operating. If one server is in
4691    PARTNER-DOWN state, the other server MUST NOT be operating.
4692
4693
4694 9.4.1.  Upon entry to PARTNER-DOWN state
4695
4696    No special actions are required when entering PARTNER-DOWN state.
4697
4698    The server should continue to attempt to connect to the partner
4699    periodically.
4700
4701
4710 9.4.2.  Operation while in PARTNER-DOWN state
4711
4712    A server in PARTNER-DOWN state MUST respond to DHCP client requests.
4713    It will allow renewal of all outstanding leases on IP addresses, and
4714    will allocate IP addresses from its own pool, and after a fixed
4715    period of time (the MCLT interval) has elapsed from entry into
4716    PARTNER-DOWN state, it will allocate IP addresses from the set of all
4717    available IP addresses.
4718
4719    Once a server has entered NORMAL state, the PARTNER-DOWN state is
4720    entered only on command of an external agency (typically an adminis-
4721    trator of some sort) or after the expiration of an externally config-
4722    ured minimum safe-time after the beginning of COMMUNICATIONS-
4723    INTERRUPTED state.
4724
4725    Any available IP address tagged as available for allocation by the
4726    other server (at entry to PARTNER-DOWN state) MUST NOT be allocated
4727    to a new client until the maximum-client-lead-time beyond the entry
4728    into PARTNER-DOWN state has elapsed.
4729
4730    A server in PARTNER-DOWN state MUST NOT allocate an IP address to a
4731    DHCP client different from that to which it was allocated at the
4732    entrance to PARTNER-DOWN state until the maximum-client-lead-time
4733    beyond the maximum of the following times: client expiration time,
4734    most recently transmitted potential-expiration-time, most recently
4735    received ack of potential-expiration-time from the partner, and most
4736    recently acked potential-expiration-time to the partner.  See section
4737    7.1.5 for details.  If this time would be earlier than the current
4738    time plus the maximum-client-lead-time, then the time the server
4739    entered PARTNER-DOWN state plus the maximum-client-lead-time is used.
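
   The reallocation rule above can be read as a small computation.
   The following Python sketch takes the paragraph literally, including
   its fallback clause; it is not a normative statement of the rule,
   and section 7.1.5 remains the authority.  All names are hypothetical
   and all times are assumed to be POSIX seconds.

```python
def earliest_reallocation_time(
    client_expiration: float,
    sent_pet: float,              # most recently transmitted potential-expiration-time
    pet_acked_by_partner: float,  # most recently received ack of a PET from the partner
    pet_acked_to_partner: float,  # most recently acked PET to the partner
    entered_partner_down: float,
    now: float,
    mclt: float,
) -> float:
    """Earliest time the address may be given to a different client."""
    latest = max(client_expiration, sent_pet,
                 pet_acked_by_partner, pet_acked_to_partner)
    candidate = latest + mclt
    if candidate < now + mclt:
        # Literal reading of the fallback: every recorded time is already
        # in the past, so wait MCLT from entry into PARTNER-DOWN instead.
        candidate = entered_partner_down + mclt
    return candidate
```

   Under this reading, an address whose recorded times have all passed
   still cannot be handed to a new client until one MCLT after the
   server entered PARTNER-DOWN state.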
4740
4741    Two options exist for lease times given out while in PARTNER-DOWN
4742    state, with different ramifications flowing from each.
4743
4744    If the server wishes the Failover protocol to protect it from loss of
4745    stable storage in PARTNER-DOWN state, then it should ensure that the
4746    MCLT based lease time restrictions in Section 5.1 are maintained,
4747    even in PARTNER-DOWN state.
4748
4749    If the server wishes to forego the protection of the Failover proto-
4750    col in the event of loss of stable storage, then it need recognize no
4751    restrictions on actual client lease times while in PARTNER-DOWN
4752    state.
4753
4754    A server in PARTNER-DOWN state MUST continue to attempt to establish
4755    communications and synchronization with its partner.
4756
4757
4766 9.4.3.  Transitions out of PARTNER-DOWN state
4767
4768    When a server in PARTNER-DOWN state succeeds in establishing a con-
4769    nection to its partner, its actions are conditional on the state and
4770    flags received in the STATE message from the other server as part of
4771    the process of establishing the connection.
4772
4773    If the STARTUP bit is set in the server-flags option of a received
4774    STATE message, a server in PARTNER-DOWN state MUST NOT take any state
4775    transitions based on reestablishing communications. Essentially, if a
4776    server is in PARTNER-DOWN state, it ignores all STATE messages from
4777    its partner that have the STARTUP bit set in the server-flags option
4778    of the STATE message.
4779
4780    If the STARTUP bit is not set in the server-flags option of a STATE
4781    message received from its partner, then a server in PARTNER-DOWN
4782    state takes the following actions based on the value of the server-
4783    state option in the received STATE message:
4784
4785       o partner in NORMAL, COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN or
4786         POTENTIAL-CONFLICT state
4787
4788         transition to POTENTIAL-CONFLICT state
4789
4790       o partner in RECOVER state
4791
4792         stay in PARTNER-DOWN state
4793
4794       o partner in RECOVER-DONE state
4795
4796         transition into NORMAL state
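
   The transitions listed above amount to a small lookup.  A minimal
   Python sketch follows; the function name and the use of strings for
   state names are assumptions made for illustration.  Partner states
   that the list does not mention raise an error here, since the text
   gives no rule for them.

```python
def next_state_from_partner_down(partner_state: str, startup_bit: bool) -> str:
    """State a server in PARTNER-DOWN moves to after receiving a STATE message."""
    if startup_bit:
        # STATE messages with the STARTUP bit set are ignored.
        return "PARTNER-DOWN"
    if partner_state in ("NORMAL", "COMMUNICATIONS-INTERRUPTED",
                         "PARTNER-DOWN", "POTENTIAL-CONFLICT"):
        return "POTENTIAL-CONFLICT"
    if partner_state == "RECOVER":
        return "PARTNER-DOWN"
    if partner_state == "RECOVER-DONE":
        return "NORMAL"
    # Not covered by the list above (assumption: treat as an error).
    raise ValueError("no transition defined for partner state " + partner_state)
```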
4797
4798 9.5.  RECOVER state
4799
4800    This state indicates that the server has no information in its stable
4801    storage or that it is re-integrating with a server in PARTNER-DOWN
4802    state after it has been down.  A server in this state MUST attempt to
4803    refresh its stable storage from the other server.
4804
4805 9.5.1.  Operation in RECOVER state
4806
4807    A server in RECOVER state MUST NOT respond to DHCP client requests.
4808
4809    A server in RECOVER state will attempt to reestablish communications
4810    with the other server.
4811
4822 9.5.2.  Transitions out of RECOVER state
4823
4824    If the other server is in POTENTIAL-CONFLICT state when communica-
4825    tions are reestablished, then the server in RECOVER state will move
4826    to POTENTIAL-CONFLICT state itself.
4827
4828    If the other server is in RECOVER state, then this server SHOULD sig-
4829    nal an error and halt processing.
4830
4831    If the other server is in any other state, then the server in RECOVER
4832    state will request an update of missing binding information by send-
4833    ing an UPDREQ or UPDREQALL message.  If the server has been instructed
4834    (through configuration or other external agency) that it has lost its
4835    stable storage, or if it has deduced this from the fact that it has
4836    no record of ever having talked to its partner while its partner does
4837    have a record of communicating with it, it MUST send an UPDREQALL
4838    message; otherwise it MUST send an UPDREQ message.
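
   The three cases above can be summarized in a short Python sketch.
   The function name, the tuple return convention, and the
   lost_stable_storage flag (true when loss of stable storage has been
   configured or deduced) are assumptions for illustration only.

```python
from typing import Optional, Tuple


def recover_action(partner_state: str,
                   lost_stable_storage: bool) -> Tuple[str, Optional[str]]:
    """Action for a server in RECOVER once communications are reestablished."""
    if partner_state == "POTENTIAL-CONFLICT":
        return ("transition", "POTENTIAL-CONFLICT")
    if partner_state == "RECOVER":
        # Both servers recovering: signal an error and halt processing.
        return ("halt", None)
    # Any other partner state: ask for the missing binding information.
    return ("send", "UPDREQALL" if lost_stable_storage else "UPDREQ")
```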
4839
4840    It will wait for an UPDDONE message, and upon receipt of that message
4841    it will start a timer whose expiration is set to a time equal to the
4842    time the server went down (if known) or the current time (if the
4843    down-time is unknown) plus the maximum-client-lead-time.  When this
4844    timer goes off, the server will transition into RECOVER-DONE state.
4845    This is to allow any DHCP clients holding IP addresses allocated by
4846    this server prior to loss of its client binding information in stable
4847    storage to contact the other server or to have their leases time out.
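
   The timer arithmetic in the preceding paragraph is simple enough to
   show directly.  In this Python sketch the names are hypothetical and
   times are assumed to be POSIX seconds; time_went_down is None when
   the down-time is unknown.

```python
from typing import Optional


def recover_done_time(now: float, mclt: float,
                      time_went_down: Optional[float] = None) -> float:
    """Absolute time at which the server may enter RECOVER-DONE."""
    base = now if time_went_down is None else time_went_down
    return base + mclt


def remaining_wait(now: float, mclt: float,
                   time_went_down: Optional[float] = None) -> float:
    """Seconds still to wait after UPDDONE; zero if the time has passed."""
    return max(0.0, recover_done_time(now, mclt, time_went_down) - now)
```

   With an MCLT of 3600 seconds, a server that is known to have gone
   down 1000 seconds ago waits only 2600 more seconds, while a server
   with an unknown down-time waits the full 3600 seconds.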
4848
4849    See Figure 9.5.2-1.
4850
4851    DISCUSSION:
4852
4853       The actual requirement on this wait period in RECOVER is that it
4854       start not before the recovering server went down, not necessarily
4855       when it came back up.  If the time when the recovering server
4856       failed is known, it could be communicated to the recovering server
4857       (perhaps through actions of the network administrator), and the
4858       wait period could be reduced to the maximum-client-lead-time less
4859       the difference between the current time and the time the server
4860       failed.  In this way, the waiting period could be minimized.
4861       Various heuristics could be used to estimate this time, for exam-
4862       ple if the recovering server periodically updates stable storage
4863       with a time stamp, the wait period could be calculated to start at
4864       the time of the last update of stable storage plus the time
4865       required for the next update (which never occurred).  This esti-
4866       mate is later than the server went down, but probably not too much
4867       later.
4868
4869    If an UPDDONE message isn't received within an implementation
4878    dependent amount of time, and no BNDUPD messages are being received,
4879    the connection SHOULD be dropped.
4880
4881
4882
4883
4884                 A                                        B
4885               Server                                  Server
4886
4887                 |                                        |
4888              RECOVER                               PARTNER-DOWN
4889                 |                                        |
4890                 | >--UPDREQ-------------------->         |
4891                 |                                        |
4892                 |        <---------------------BNDUPD--< |
4893                 | >--BNDACK-------------------->         |
4894                ...                                      ...
4895                 |                                        |
4896                 |        <---------------------BNDUPD--< |
4897                 | >--BNDACK-------------------->         |
4898                 |                                        |
4899                 |        <--------------------UPDDONE--< |
4900                 |                                        |
4901        Wait MCLT from last known                         |
4902           time of operation                              |
4903                 |                                        |
4904            RECOVER-DONE                                  |
4905                 |                                        |
4906                 | >--STATE-(RECOVER-DONE)------>         |
4907                 |                                     NORMAL
4908                 |        <-------------(NORMAL)-STATE--< |
4909              NORMAL                                      |
4910                 | >--STATE-(NORMAL)------------>         |
4911                 |                                        |
4912                 |                                        |
4913
4914               Figure 9.5.2-1:  Transition out of RECOVER state
4915
4934
4935 9.6.  NORMAL state
4936
4937    NORMAL state is the state used by a server when it is communicating
4938    with the other server, and any required resynchronization has been
4939    performed.  While some binding database synchronization is performed
4940    in NORMAL state, potential conflicts and any loss of binding database
4941    data are resolved prior to entry into NORMAL state.
4942
4943
4944 9.6.1.  Upon entry to NORMAL state
4945
4946    When entering NORMAL state, a server will send to the other server
4947    all currently unacknowledged binding updates as BNDUPD messages.
4948
4949    When the above process is complete, if the server entering NORMAL
4950    state is a secondary server, then it will request IP addresses for
4951    allocation using the POOLREQ message.
4952
4953
4954 9.6.2.  Processing DHCP client requests and load balancing
4955
4956    In NORMAL state, a server MUST process every DHCPREQUEST/RENEWAL or
4957    DHCPREQUEST/REBINDING request it receives.  It processes other
4958    requests only for those clients assigned to it by the load balancing
4959    algorithm specified in [LOADB].
4960
4961    As discussed in section 5.3, each server will take the client-
4962    identifier from each DHCP client request (or the client-hardware-
4963    address, i.e., the htype concatenated to the front of the chaddr if
4964    no client-identifier is present in the request) and use it as the
4965    'Request ID' specified in [LOADB].  After applying the algorithm
4966    specified in [LOADB] and comparing the result with the hash bucket
4967    assignment (performed during connect processing between failover
4968    servers), each failover server will be able to unambiguously deter-
4969    mine if it should process the DHCP client request.
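
   The selection of the 'Request ID' and the bucket lookup might be
   sketched in Python as follows.  This is not the [LOADB] algorithm:
   placeholder_hash is a deliberately trivial stand-in for the hash
   function that [LOADB] specifies, and the use of 256 buckets
   represented as a list of booleans is an assumption of this sketch.
   All names are hypothetical.

```python
from typing import Callable, Optional, Sequence


def request_id(client_identifier: Optional[bytes],
               htype: int, chaddr: bytes) -> bytes:
    """Client-identifier if present, else htype prepended to chaddr."""
    if client_identifier is not None:
        return client_identifier
    return bytes([htype]) + chaddr


def placeholder_hash(data: bytes) -> int:
    """Stand-in only; a real server uses the hash defined in [LOADB]."""
    return sum(data) % 256


def should_process(req_id: bytes, my_buckets: Sequence[bool],
                   hash_fn: Callable[[bytes], int] = placeholder_hash) -> bool:
    """True if this server's bucket assignment covers the request."""
    return my_buckets[hash_fn(req_id)]
```

   If the two servers hold complementary bucket assignments, exactly
   one of them answers True for any given Request ID, which is the
   property the text above relies on.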
4970
4971 9.6.3.  Operation in NORMAL state
4972
4973    When in NORMAL state, for every DHCP client request that it
4974    processes, as determined by the algorithm described in section 9.6.2,
4975    above, a server will operate in the following manner:
4976
4977       o Lease time calculations
4978
4979         As discussed in section 5.2.1, "Control of lease time", the
4980         lease interval given to a DHCP client can never be more than the
4981         MCLT greater than the most recently received potential-
4990         expiration-time from the failover partner or the current time,
4991         whichever is later.
4992
4993         As long as a server adheres to this constraint, the specifics of
4994         the lease interval that it gives to a DHCP client or the value
4995         of the potential-expiration-time sent to its failover partner
4996         are implementation dependent.  One possible approach is dis-
4997         cussed in section 5.2.1, but that particular approach is in no
4998         way required by this protocol.
4999
5000         See section 7.1.5 for details concerning the storage of times
5001         associated with IP addresses and how to use these times when
5002         calculating lease times for DHCP clients.
5003
5004       o Lazy update of partner server
5005
5006         After an ACK of an IP address binding, the server servicing a
5007         DHCP client request attempts to update its partner with the new
5008         binding information.  The lease time used in the update of the
5009         partner MUST be at least that given to the DHCP client in the
5010         DHCPACK, and the potential-expiration-time MUST be at least the
5011         lease time, and SHOULD be considerably longer.
5012
5013       o Reallocation of IP addresses between clients
5014
5015         Whenever a client binding is released or expires, a BNDUPD mes-
5016         sage must be sent to the partner, setting the binding state to
5017         RELEASED or EXPIRED.  However, until a BNDACK is received for
5018         this message, the IP address cannot be allocated to another
5019         client.  It can be allocated to the same client again.
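
   The lease time constraint in the first item above can be expressed
   as a short calculation.  The following Python sketch assumes POSIX
   seconds and hypothetical names; partner_pet is the most recently
   received potential-expiration-time from the partner, or None if
   none has been received.  How a server picks its desired lease time
   is implementation dependent and is outside this sketch.

```python
from typing import Optional


def max_lease_interval(now: float, mclt: float,
                       partner_pet: Optional[float] = None) -> float:
    """Longest lease interval (in seconds) that may be given right now."""
    latest = now if partner_pet is None else max(now, partner_pet)
    return latest + mclt - now


def lease_interval(desired: float, now: float, mclt: float,
                   partner_pet: Optional[float] = None) -> float:
    """Desired lease interval, capped by the MCLT constraint."""
    return min(desired, max_lease_interval(now, mclt, partner_pet))
```

   A first-time client, for which no potential-expiration-time has been
   received from the partner, therefore gets at most one MCLT; later
   renewals can be longer once the partner has supplied a later time.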
5020
5021    In NORMAL state, each server receives binding updates from its
5022    partner server in BNDUPD messages.  It records these in its client
5023    binding database in stable storage and then sends a corresponding
5024    BNDACK message to its partner.  It MUST ensure that the information
5025    is recorded in stable storage prior to sending the BNDACK message
5026    back to its partner.
5027
5028
5029 9.6.4.  Transitions out of NORMAL state
5030
5031    If an external command is received by a server in NORMAL state
5032    informing it that its partner is down, then transition into PARTNER-
5033    DOWN state.  Generally, this would be an unusual situation, where
5034    some external agency knew the partner server was down.  Using the
5035    command in this case would be appropriate if the polling interval and
5036    timeout were long.
5037
5046    If a server in NORMAL state fails to receive acks to messages sent to
5047    its partner for an implementation dependent period of time, it MAY
5048    move into COMMUNICATIONS-INTERRUPTED state.  This situation might
5049    occur if the partner server was capable of maintaining the TCP con-
5050    nection between the servers and also capable of sending a CONTACT
5051    message every tSend seconds, but was (for some reason) incapable of
5052    processing BNDUPD messages.
5053
5054    If communications are determined not to be "ok" (as defined in
5055    section 8), then transition into COMMUNICATIONS-INTERRUPTED state.
5056
5057    If a server in NORMAL state receives any messages from its partner
5058    where the partner has changed state from that expected by the server
5059    in NORMAL state, then the server should transition into
5060    COMMUNICATIONS-INTERRUPTED state and take the appropriate state tran-
5061    sition from there.  For example, it would be expected for the partner
5062    to transition from POTENTIAL-CONFLICT into NORMAL state, but not for
5063    the partner to transition from NORMAL into POTENTIAL-CONFLICT state.
5064
5065 9.7.  COMMUNICATIONS-INTERRUPTED state
5066
5067    A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is
5068    unable to communicate with the other server.  Primary and secondary
5069    servers cycle automatically (without administrative intervention)
5070    between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network
5071    connection between them fails and recovers, or as the partner server
5072    cycles between operational and non-operational.  No duplicate IP
5073    address allocation can occur while the servers cycle between these
5074    states.
5075
5076
5077 9.7.1.  Upon entry to COMMUNICATIONS-INTERRUPTED state
5078
5079    When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been
5080    configured to support an automatic transition out of COMMUNICATIONS-
5081    INTERRUPTED state and into PARTNER-DOWN state (i.e., a "safe period"
5082    has been configured, see section 10), then a timer MUST be started
5083    for the length of the configured safe period.
5084
5085    A server transitioning into the COMMUNICATIONS-INTERRUPTED state from
5086    the NORMAL state SHOULD raise some alarm condition to alert adminis-
5087    trative staff to a potential problem in the DHCP subsystem.
5088
5089
5090 9.7.2.  Operation in COMMUNICATIONS-INTERRUPTED state
5091
5092    In this state a server MUST respond to all DHCP client requests, and
5093    the algorithm for load balancing described in section 5.3 MUST NOT be
5102    used.  When allocating new IP addresses, each server allocates from
5103    its own IP address pool, where the primary MUST allocate only FREE IP
5104    addresses, and the secondary MUST allocate only BACKUP IP addresses.
5105    When responding to renewal requests, each server will allow continued
5106    renewal of a DHCP client's current lease on an IP address irrespec-
5107    tive of whether that lease was given out by the receiving server or
5108    not, although the renewal period MUST NOT exceed the maximum client
5109    lead time (MCLT) beyond the latest of: 1) the potential-expiration-
5110    time already acknowledged by the other server, or 2) the lease-
5111    expiration-time, or 3) the potential-expiration-time received from
5112    the partner server.
5113
5114    However, since the server cannot communicate with its partner in this
5115    state, the acknowledged-potential-expiration time will not be updated
5116    in any new bindings.  This is likely to eventually cause the actual-
5117    client-lease-times to be the current time plus the maximum-client-
5118    lead-time (unless this is greater than the desired-client-lease-
5119    time).
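
   The allocation rule in this section (the primary allocates only FREE
   addresses and the secondary only BACKUP addresses) can be
   illustrated with a minimal Python sketch.  The function name and
   the representation of bindings as a mapping from address to binding
   state are assumptions for illustration; the addresses shown in the
   usage note are documentation addresses.

```python
from typing import Dict, List


def allocatable_addresses(bindings: Dict[str, str],
                          is_primary: bool) -> List[str]:
    """Addresses this server may give to new clients while in
    COMMUNICATIONS-INTERRUPTED state."""
    wanted = "FREE" if is_primary else "BACKUP"
    return sorted(addr for addr, state in bindings.items() if state == wanted)
```

   For example, given bindings of {"192.0.2.10": "FREE", "192.0.2.11":
   "BACKUP"}, the primary may allocate only 192.0.2.10 and the
   secondary only 192.0.2.11, so the two servers cannot hand the same
   new address to different clients while partitioned.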
5120
5121
5122 9.7.3.  Transition out of COMMUNICATIONS-INTERRUPTED state
5123
5124    If the safe period timer expires while a server is in the
5125    COMMUNICATIONS-INTERRUPTED state, it will transition immediately into
5126    PARTNER-DOWN state.
5127
5128    If an external command is received by a server in COMMUNICATIONS-
5129    INTERRUPTED state informing it that its partner is down, it will
5130    transition immediately into PARTNER-DOWN state.
5131
5132    If communications are restored with the other server, then the server
5133    in COMMUNICATIONS-INTERRUPTED state will transition into another
5134    state based on the state of the partner:
5135
5136       o partner in NORMAL or COMMUNICATIONS-INTERRUPTED
5137
5138         The partner SHOULD NOT be in NORMAL state here, since upon res-
5139         toration of communications it MUST have created a new TCP con-
5140         nection which would have forced it into COMMUNICATIONS-
5141         INTERRUPTED state.  Still, we should account for every state
5142         just in case.
5143
5144         Transition into the NORMAL state.
5145
5146       o partner in RECOVER
5147
5148         Stay in COMMUNICATIONS-INTERRUPTED state.
5149
5158       o partner in RECOVER-DONE
5159
5160         Transition into NORMAL state.
5161
5162       o partner in PARTNER-DOWN or POTENTIAL-CONFLICT
5163
5164         Transition into POTENTIAL-CONFLICT state.
5165
5166       o partner in PAUSED
5167
5168         Stay in COMMUNICATIONS-INTERRUPTED state.
5169
5170       o partner in SHUTDOWN
5171
5172         Transition into PARTNER-DOWN state.
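
   The transitions described in this section can be collected into one
   table.  The Python sketch below uses hypothetical event names
   ("safe-period-expired", "external-partner-down", and
   "communications-restored") and strings for state names; neither is
   defined by this protocol.

```python
from typing import Optional

# Next state when communications are restored, keyed by partner state.
ON_RESTORE = {
    "NORMAL": "NORMAL",
    "COMMUNICATIONS-INTERRUPTED": "NORMAL",
    "RECOVER": "COMMUNICATIONS-INTERRUPTED",
    "RECOVER-DONE": "NORMAL",
    "PARTNER-DOWN": "POTENTIAL-CONFLICT",
    "POTENTIAL-CONFLICT": "POTENTIAL-CONFLICT",
    "PAUSED": "COMMUNICATIONS-INTERRUPTED",
    "SHUTDOWN": "PARTNER-DOWN",
}


def next_state_from_comm_interrupted(event: str,
                                     partner_state: Optional[str] = None) -> str:
    """State entered from COMMUNICATIONS-INTERRUPTED for a given event."""
    if event in ("safe-period-expired", "external-partner-down"):
        return "PARTNER-DOWN"
    if event == "communications-restored" and partner_state in ON_RESTORE:
        return ON_RESTORE[partner_state]
    raise ValueError("no transition defined for this event and partner state")
```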
5173
5174    The following figure illustrates the transition from NORMAL to
5175    COMMUNICATIONS-INTERRUPTED state and then back to NORMAL state again.
5176
5215              Primary                                Secondary
5216               Server                                  Server
5217
5218               NORMAL                                  NORMAL
5219                 | >--CONTACT------------------->         |
5220                 |        <--------------------CONTACT--< |
5221                 |         [TCP connection broken]        |
5222            COMMUNICATIONS          :              COMMUNICATIONS
5223              INTERRUPTED           :                INTERRUPTED
5224                 |      [attempt new TCP connection]      |
5225                 |         [connection succeeds]          |
5226                 |                                        |
5227                 | >--CONNECT------------------->         |
5228                 |        <-----------------CONNECTACK--< |
5229                 |        <-------------------STATE-----< |
5230                 |                                     NORMAL
5231                 | >--STATE--------------------->         |
5232               NORMAL                                     |
5233                 | >--BNDUPD-------------------->         |
5234                 |        <---------------------BNDACK--< |
5235                 |                                        |
5236                 |        <---------------------BNDUPD--< |
5237                 | >--BNDACK-------------------->         |
5238                ...                                      ...
5239                 |                                        |
5240                 |        <--------------------POOLREQ--< |
5241                 | >--POOLRESP-(2)-------------->         |
5242                 |                                        |
5243                 | >--BNDUPD-(#1)--------------->         |
5244                 |        <---------------------BNDACK--< |
5245                 |                                        |
5246                 |        <--------------------POOLREQ--< |
5247                 | >--POOLRESP-(0)-------------->         |
5248                 |                                        |
5249                 | >--BNDUPD-(#2)--------------->         |
5250                 |        <---------------------BNDACK--< |
5251                 |                                        |
5252
5253        Figure 9.7.3-1:  Transition from NORMAL to COMMUNICATIONS-
5254                         INTERRUPTED and back (example with 2
5255                         addresses allocated to secondary)
5256
5270
5271 9.8.  POTENTIAL-CONFLICT state
5272
5273    This state indicates that the two servers are attempting to re-
5274    integrate with each other, but at least one of them was running in a
5275    state that did not guarantee automatic reintegration would be
5276    possible.  In POTENTIAL-CONFLICT state the servers may determine that
5277    the same IP address has been offered and accepted by two different
5278    DHCP clients.
5279
5280    It is a goal of this protocol to minimize the possibility that
5281    POTENTIAL-CONFLICT state is ever entered.
5282
5283 9.8.1.  Upon entry to POTENTIAL-CONFLICT state
5284
5285    When a primary server enters POTENTIAL-CONFLICT state it should
5286    request that the secondary send it all updates of which it is
5287    currently unaware by sending an UPDREQ message to the secondary
5288    server.
5289
5290    A secondary server entering POTENTIAL-CONFLICT state will wait for
5291    the primary to send it an UPDREQ message.
5292
5293 9.8.2.  Operation in POTENTIAL-CONFLICT state
5294
5295    Any server in POTENTIAL-CONFLICT state MUST NOT process any incoming
5296    DHCP requests.
5297
5298
5299 9.8.3.  Transitions out of POTENTIAL-CONFLICT state
5300
5301    If communications fails with the partner while in POTENTIAL-CONFLICT
5302    state, then the server will transition to RESOLUTION-INTERRUPTED
5303    state.
5304
5305    Whenever either server receives an UPDDONE message from its partner
5306    while in POTENTIAL-CONFLICT state, it MUST transition to NORMAL
5307    state.  This will cause the primary server to leave POTENTIAL-
5308    CONFLICT state prior to the secondary, since the primary sends an
5309    UPDREQ message and receives an UPDDONE before the secondary sends an
5310    UPDREQ message and receives its UPDDONE message.
5311
5312    When a secondary server receives an indication that the primary
5313    server has transitioned from POTENTIAL-CONFLICT to NORMAL state, it
5314    SHOULD send an UPDREQ message to the primary server.
5315
5328               Primary                                Secondary
5329               Server                                  Server
5330
5331                 |                                        |
5332          POTENTIAL-CONFLICT                    POTENTIAL-CONFLICT
5333                 |                                        |
5334                 | >--UPDREQ-------------------->         |
5335                 |                                        |
5336                 |        <---------------------BNDUPD--< |
5337                 | >--BNDACK-------------------->         |
5338                ...                                      ...
5339                 |                                        |
5340                 |        <---------------------BNDUPD--< |
5341                 | >--BNDACK-------------------->         |
5342                 |                                        |
5343                 |        <--------------------UPDDONE--< |
5344               NORMAL                                     |
5345                 | >--STATE--(NORMAL)----------->         |
5346                 |        <---------------------UPDREQ--< |
5347                 |                                        |
5348                 | >--BNDUPD-------------------->         |
5349                 |        <---------------------BNDACK--< |
5350                ...                                      ...
5351                 | >--BNDUPD-------------------->         |
5352                 |        <---------------------BNDACK--< |
5353                 |                                        |
5354                 | >--UPDDONE------------------->         |
5355                 |                                     NORMAL
5356                 |                                        |
5357                 |        <--------------------POOLREQ--< |
5358                 | >------POOLRESP-(n)---------->         |
5359                 |              addresses                 |
5360
5361            Figure 9.8.3-1:  Transition out of POTENTIAL-CONFLICT
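
   The transitions described in section 9.8.3 can be sketched as a small
   state function.  This is an illustrative sketch only, not part of the
   protocol: the names State, Event, next_state and on_entry are
   hypothetical, and only the behavior stated in sections 9.8.1 and
   9.8.3 is modeled.

```python
from enum import Enum, auto


class State(Enum):
    POTENTIAL_CONFLICT = auto()
    RESOLUTION_INTERRUPTED = auto()
    NORMAL = auto()


class Event(Enum):
    COMMS_FAILED = auto()      # contact with the partner was lost
    UPDDONE_RECEIVED = auto()  # partner finished sending its updates


def next_state(state, event):
    """Return the state that follows `event` (hypothetical helper)."""
    if state is State.POTENTIAL_CONFLICT:
        if event is Event.COMMS_FAILED:
            return State.RESOLUTION_INTERRUPTED
        if event is Event.UPDDONE_RECEIVED:
            return State.NORMAL
    # Any other combination is outside this sketch: stay where we are.
    return state


def on_entry(is_primary):
    """Message to send on entering POTENTIAL-CONFLICT, or None.

    The primary requests updates at once; the secondary waits for the
    primary's UPDREQ and sends its own only after the primary reports
    NORMAL state.
    """
    return "UPDREQ" if is_primary else None
```

   A caller would feed each protocol event through next_state and act
   on the returned state.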
5362
5363
5364 9.9.  RESOLUTION-INTERRUPTED state
5365
5366    This state indicates that the two servers were attempting to re-
5367    integrate with each other in POTENTIAL-CONFLICT state, but
5368    communications failed prior to completion of re-integration.
5369
5370    If the servers remained in POTENTIAL-CONFLICT while communications
5371    was interrupted, neither server would be responsive to DHCP client
5372    requests, and if one server had crashed, then there might be no
5373    server able to process DHCP requests.
5374
5382 9.9.1.  Upon entry to RESOLUTION-INTERRUPTED state
5383
5384    When a server enters RESOLUTION-INTERRUPTED state it SHOULD raise an
5385    alarm condition to alert administrative staff of a problem in the
5386    DHCP subsystem.
5387
5388 9.9.2.  Operation in RESOLUTION-INTERRUPTED state
5389
5390    In this state a server MUST respond to all DHCP client requests, and
5391    any load balancing (described in section 5.3) MUST NOT be used.  When
5392    allocating new IP addresses, each server SHOULD allocate from its own
5393    IP address pool (if that can be determined), where the primary SHOULD
5394    allocate only FREE IP addresses, and the secondary SHOULD allocate
5395    only BACKUP IP addresses.  When responding to renewal requests, each
5396    server will allow continued renewal of a DHCP client's current lease
5397    on an IP address irrespective of whether that lease was given out by
5398    the receiving server or not, although the renewal period MUST NOT
5399    exceed the maximum client lead time (MCLT) beyond the latest of: 1)
5400    the potential-expiration-time already acknowledged by the other
5401    server, 2) the lease-expiration-time, or 3) the potential-
5402    expiration-time received from the partner server.
5403
5404    However, since the server cannot communicate with its partner in this
5405    state, the acknowledged-potential-expiration time will not be updated
5406    in any new bindings.
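
   The renewal limit in section 9.9.2 amounts to a simple bound on the
   expiration time a server may grant.  The sketch below assumes that
   all times are absolute times expressed in seconds; the function and
   parameter names are hypothetical and are not defined by this
   document.

```python
def max_renewal_expiry(mclt, acked_potential_expiration,
                       lease_expiration, partner_potential_expiration):
    """Latest expiration time a renewal may be given.

    The limit is MCLT seconds beyond the latest of the three times
    listed in the text. All arguments are seconds (assumption).
    """
    latest = max(acked_potential_expiration,
                 lease_expiration,
                 partner_potential_expiration)
    return latest + mclt


def clamp_renewal(requested_expiry, limit):
    """Shorten a requested expiration time so it never exceeds the limit."""
    return min(requested_expiry, limit)
```

   For example, with an MCLT of 3600 seconds and candidate times of
   1000, 2000 and 1500, the limit is 2000 + 3600 = 5600.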
5407
5408
5409 9.9.3.  Transitions out of RESOLUTION-INTERRUPTED state
5410
5411    If an external command is received by a server in RESOLUTION-
5412    INTERRUPTED state informing it that its partner is down, it will
5413    transition immediately into PARTNER-DOWN state.
5414
5415    If communications is restored with the other server, then the server
5416    in RESOLUTION-INTERRUPTED state will transition into POTENTIAL-
5417    CONFLICT state.
5418
5419 9.10.  RECOVER-DONE state
5420
5421    This state exists to allow an interlocked transition for one server
5422    from RECOVER state and another server from PARTNER-DOWN or
5423    COMMUNICATIONS-INTERRUPTED state into NORMAL state.
5424
5425 9.10.1.  Operation in RECOVER-DONE state
5426
5427    A server in RECOVER-DONE state MUST respond only to
5428    DHCPREQUEST/RENEWAL and DHCPREQUEST/REBINDING DHCP messages.
5429
5438 9.10.2.  Transitions out of RECOVER-DONE state
5439
5440    When a server in RECOVER-DONE state determines that its partner
5441    server has entered NORMAL state, then it will transition into NORMAL
5442    state as well.
5443
5444    If communications fails while in RECOVER-DONE state, a server will
5445    stay in RECOVER-DONE state.
5446
5447
5448 9.11.  PAUSED state
5449
5450    This state exists to allow one server to inform another that it will
5451    be out of service for what is predicted to be a relatively short
5452    time, and to allow the other server to transition to COMMUNICATIONS-
5453    INTERRUPTED state immediately and to begin servicing all DHCP clients
5454    with no interruption in service to new DHCP clients.
5455
5456    A server which is aware that it is shutting down temporarily SHOULD
5457    send a STATE message with the server-state option containing PAUSED
5458    state and close the TCP connection.
5459
5460    While a server may or may not transition internally into PAUSED
5461    state, the 'previous' state determined when it is restarted MUST be
5462    the state the server was in prior to receiving the command to shut-
5463    down and restart and which precedes its entry into the PAUSED state.
5464    See section 9.3.2 concerning the use of the previous state upon
5465    server restart.
5466
5467 9.11.1.  Upon entry to PAUSED state
5468
5469    When entering PAUSED state, the server MUST store the previous state
5470    in stable storage, and use that state as the previous state when it
5471    is restarted.
5472
5473 9.11.2.  Transitions out of PAUSED state
5474
5475    A server transitions out of PAUSED state by being restarted.  At that
5476    time, the previous state MUST be the state the server was in prior to
5477    entering the PAUSED state.
5478
5479
5480 9.12.  SHUTDOWN state
5481
5482    This state exists to allow one server to inform another that it will
5483    be out of service for what is predicted to be a relatively long time,
5484    and to allow the other server to transition immediately to PARTNER-
5485    DOWN state, and take over completely for the server going down.
5486
5494    A server which is aware that it is shutting down SHOULD send a STATE
5495    message with the server-state field containing SHUTDOWN.
5496
5497    While a server may or may not transition internally into SHUTDOWN
5498    state, the 'previous' state determined when it is restarted MUST be
5499    the state active prior to the command to shutdown.  See section 9.3.2
5500    concerning the use of the previous state upon server restart.
5501
5502 9.12.1.  Upon entry to SHUTDOWN state
5503
5504    When entering SHUTDOWN state, the server MUST record the previous
5505    state in stable storage for use when the server is restarted.  It
5506    also MUST record the current time as the last time operational.
5510
5511 9.12.2.  Operation in SHUTDOWN state
5512
5513    A server in SHUTDOWN state MUST NOT respond to any DHCP client input.
5514
5515    If a server receives any message indicating that the partner has
5516    moved to PARTNER-DOWN state while it is in SHUTDOWN state then it
5517    MUST record RECOVER state as the previous state to be used when it is
5518    restarted.
5519
5520    A server SHOULD wait for a few seconds after informing the partner of
5521    entry into SHUTDOWN state (if communications are okay) to determine
5522    if the partner entered PARTNER-DOWN state.
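
   The bookkeeping required in SHUTDOWN state can be sketched as
   follows.  This is a minimal illustration: a plain dictionary stands
   in for stable storage, and the function names and dictionary keys are
   hypothetical.

```python
def enter_shutdown(store, current_state, now):
    """Record what must survive a restart when entering SHUTDOWN.

    `store` is a dict standing in for stable storage (assumption).
    """
    store["previous_state"] = current_state
    store["last_time_operational"] = now


def on_partner_state(store, partner_state):
    """Handle a partner state report received while in SHUTDOWN.

    If the partner has moved to PARTNER-DOWN, the state to resume from
    after restart becomes RECOVER.
    """
    if partner_state == "PARTNER-DOWN":
        store["previous_state"] = "RECOVER"
```

   A real server would write these values to durable media before
   exiting; the dictionary only shows which values are recorded and
   when they change.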
5523
5524
5525 9.12.3.  Transitions out of SHUTDOWN state
5526
5527    A server transitions out of SHUTDOWN state by being restarted.
5528
5529 10.  Safe Period
5530
5531    Due to the restrictions imposed on each server while in
5532    COMMUNICATIONS-INTERRUPTED state, long-term operation in this state
5533    is not feasible for either server.  One reason that this state
5534    exists at all is to allow the servers to easily survive transient
5535    network communications failures of a few minutes to a few days
5536    (although the actual time periods will depend a great deal on the
5537    DHCP activity of the network in terms of arrival and departure of
5538    DHCP clients on the network).
5539
5540    Eventually, when the servers are unable to communicate, they will
5541    have to move into a state where they no longer can re-integrate
5550    without some possibility of a duplicate IP address allocation.  There
5551    are two ways that they can move into this state (known as PARTNER-
5552    DOWN).
5553
5554    The first way is to be informed by external command that the partner
5555    server is, indeed, down.  In this case, there is no difficulty in mov-
5556    ing into the PARTNER-DOWN state since it is an accurate reflection of
5557    reality and the protocol has been designed to operate correctly (even
5558    during reintegration) as long as, when in PARTNER-DOWN state, the
5559    partner is, indeed, down.
5560
5561    The more difficult scenario is when the servers are running unat-
5562    tended for extended periods, and in this case an option is provided
5563    to configure something called a "safe-period" into each server.  This
5564    OPTIONAL safe-period is the period after which either the primary or
5565    secondary server will automatically transition to PARTNER-DOWN from
5566    COMMUNICATIONS-INTERRUPTED state.  If this transition is completed
5567    and the partner is not down, then the possibility of duplicate IP
5568    address allocations will exist.
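
   The automatic transition driven by the safe-period can be sketched as
   a simple timer check.  The function name and the use of seconds are
   assumptions made for illustration; a safe_period of None models a
   server on which the OPTIONAL safe-period is not configured.

```python
def state_after_interruption(now, interrupted_at, safe_period=None):
    """State of a server that lost contact with its partner.

    `now` and `interrupted_at` are absolute times in seconds and
    `safe_period` is a duration in seconds (assumptions).
    """
    if safe_period is not None and now - interrupted_at >= safe_period:
        # Safe-period expired: move to PARTNER-DOWN automatically.
        return "PARTNER-DOWN"
    return "COMMUNICATIONS-INTERRUPTED"
```

   With no safe-period configured, the server stays in COMMUNICATIONS-
   INTERRUPTED state until an operator or restored communications
   changes its state.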
5569
5570    The goal of the "safe-period" is to allow network operations staff
5571    some time to react to a server moving into COMMUNICATIONS-INTERRUPTED
5572    state.  During the safe-period the only requirement is that the net-
5573    work operations staff determine if both servers are still running --
5574    and if they are, to either fix the network communications failure
5575    between them, or to take one of the servers down before the expira-
5576    tion of the safe-period.
5577
5578    The length of the safe-period is installation dependent, and depends
5579    in large part on the number of unallocated IP addresses within the
5580    subnet address pool and the expected frequency of arrival of previ-
5581    ously unknown DHCP clients requiring IP addresses.  Many environments
5582    should be able to support safe-periods of several days.
5583
5584    During this safe period, either server will allow renewals from any
5585    existing client.  The only limitation concerns the need for IP
5586    addresses for the DHCP server to hand out to new DHCP clients and the
5587    need to re-allocate IP addresses to different DHCP clients.
5588
5589    The number of "extra" IP addresses required is equal to the expected
5590    total number of new DHCP clients encountered during the safe period.
5591    This is dependent only on the arrival rate of new DHCP clients, not
5592    the total number of outstanding leases on IP addresses.
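
   As a rough worked example of this estimate, the number of extra IP
   addresses is the expected arrival rate of new DHCP clients multiplied
   by the length of the safe period.  The function name and the choice
   of hours as the unit are assumptions for illustration only.

```python
import math


def extra_addresses_needed(new_clients_per_hour, safe_period_hours):
    """Expected number of new clients during the safe period, rounded up."""
    return math.ceil(new_clients_per_hour * safe_period_hours)
```

   For instance, 2.5 new clients per hour over a 48 hour safe period
   calls for 120 extra addresses, regardless of how many leases are
   already outstanding.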
5593
5594    In the unlikely event that a relatively short safe period of an hour
5595    is all that can be used (given a dearth of IP addresses or a very
5596    high arrival rate of new DHCP clients), even that can provide sub-
5597    stantial benefits in allowing the DHCP subsystem to ride through
5606    minor problems that could occur and be fixed within that hour.  In
5607    these cases, no possibility of duplicate IP address allocation
5608    exists, and re-integration after the failure is solved will be
5609    automatic and require no operator intervention.
5610
5611 11.  Security
5612
5613    The Failover protocol communicates DHCP lease activity and this data
5614    is generally easily discovered via other means, such as by pinging
5615    addresses and doing DNS lookups. Therefore, the need to encrypt the
5616    data over the wire is likely not great (though some sites may feel
5617    differently).
5618
5619    However, it is very desirable to assure the integrity of failover
5620    partners and to thus ensure proper operation of the servers. For
5621    example, denial of service attacks are possible by the communication
5622    of invalid state information to one or both servers.
5623
5624    Therefore, the Failover protocol MUST be capable of being secured by
5625    using a simple shared secret message digest which covers each mes-
5626    sage.  This provides authentication of the servers, but does not pro-
5627    vide encryption of the data exchange.
5628
5629    The Failover protocol MAY also be secured by using TLS [RFC 2246]
5630    (Transport Layer Security) if encryption of the data exchange is
5631    desired.  The use of the shared secret or TLS will not protect
5632    against TCP or IP layer attacks (such as someone sending fake TCP RST
5633    segments). IPsec SHOULD be used to protect against most (if not all)
5634    of these kinds of attacks.
5635
5636 11.1.  Simple shared secret
5637
5638    Messages between the failover partners are authenticated through the
5639    use of a shared secret, which is never sent over the network and must
5640    be known by each server. How each server is told about this shared
5641    secret and secures its storage of the shared secret is outside the
5642    scope of this document.  If a server is configured with a shared
5643    secret for a partner, it MUST send the message-digest option in ALL
5644    messages to that partner and it MUST treat any messages received from
5645    that partner without a message-digest option as failing authentica-
5646    tion.
5647
5648    If a server is not configured with a shared secret for a partner, it
5649    MUST NOT send the message-digest option in any message to that
5650    partner and it MUST treat any messages received from that partner
5651    with a message-digest option as failing authentication.
5652
5653    The shared secret is used to calculate a 16 octet message-digest
5662    which is sent in every failover message as the message-digest option.
5663    See section 12.16. The message-digest contains a one-way 16 octet MD5
5664    [RFC 1321] hash calculated over a stream of octets consisting of the
5665    entire message concatenated with the shared secret.
5666
5667    For calculation, the message includes the message-digest option with
5668    the message-digest data zeroed (16-octets of zero). Once the calcula-
5669    tion is complete, these 16 octets of zero are replaced by the 16-
5670    octet MD5 hash and the message is sent.
5671
5672    For verification, the received 16-octet message-digest is saved and
5673    replaced with 16 octets of zero, and the MD5 hash is recalculated as
5674    described above.  If the resulting hash matches the saved digest,
5675    the message is considered authenticated.
5676
5677    A failover partner that fails to authenticate a received message or
5678    receives a message without a message-digest option when configured
5679    with a shared secret MUST close the connection immediately and take
5680    steps to notify operators.
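
   The digest calculation and verification described above can be
   sketched as follows.  This is a minimal illustration under stated
   assumptions: the message is treated as an opaque byte string, the
   message-digest option (code 16, length 16) is appended as the last
   option, and any length field in the message header is not modeled.
   The function names are hypothetical.

```python
import hashlib
import hmac
import struct

DIGEST_LEN = 16
# Two-byte option code (16) followed by two-byte length (16).
DIGEST_HEADER = struct.pack("!HH", 16, DIGEST_LEN)


def sign(message_without_digest, secret):
    """Append a message-digest option computed with the shared secret."""
    zeroed = message_without_digest + DIGEST_HEADER + bytes(DIGEST_LEN)
    # MD5 over the entire message (digest zeroed) followed by the secret.
    digest = hashlib.md5(zeroed + secret).digest()
    return message_without_digest + DIGEST_HEADER + digest


def verify(message, secret):
    """Return True if the trailing message-digest option is valid."""
    trailer_len = len(DIGEST_HEADER) + DIGEST_LEN
    if len(message) < trailer_len:
        return False
    if message[-trailer_len:-DIGEST_LEN] != DIGEST_HEADER:
        return False  # no message-digest option in last position
    received = message[-DIGEST_LEN:]
    zeroed = message[:-DIGEST_LEN] + bytes(DIGEST_LEN)
    expected = hashlib.md5(zeroed + secret).digest()
    return hmac.compare_digest(received, expected)
```

   A receiver for which verify returns False would close the connection
   and notify operators, as required above.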
5681
5682    This use of the shared secret is very similar to that used for RADIUS
5683    Accounting [RFC 2139].
5684
5685 11.2.  TLS
5686
5687    TLS, Transport Layer Security, as specified in [RFC 2246] MAY be
5688    used.  The use of TLS would be similar to the way it is used with
5689    SMTP [RFC 2487] and IMAP/POP3/ACAP [RFC 2595].
5690
5691    To request the use of TLS, the server that successfully opened a con-
5692    nection to its peer MUST send the TLS-request option as part of the
5693    CONNECT message.  The server receiving the TLS-request option MUST
5694    respond, in its CONNECTACK message, with a TLS-reply option
5695    indicating its acceptance or rejection of the TLS-request.
5696
5697    If the CONNECTACK message contained a TLS-reply of 1, then both
5698    servers begin TLS negotiation.
5699
5700    Upon completion of this negotiation, the server which originally sent
5701    the CONNECT message MUST resend its CONNECT message without any TLS-
5702    request, and must wait for a corresponding CONNECTACK.
5703
5704    Implementation of the TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [RFC 2246]
5705    cipher suite is REQUIRED in Failover servers supporting TLS. This is
5706    important as it assures that any two compliant implementations can be
5707    configured to interoperate.
5708
5718 12.  Failover Options
5719
5720    This section lists all of the options that are currently defined to
5721    be used with the failover protocol.  See section 6.2 for details con-
5722    cerning time values.
5723
5724
5725 12.1.  addresses-transferred
5726
5727    A 32 bit unsigned long in network byte order. Reports the number of
5728    addresses transferred by the primary to the secondary server
5729    (addresses to be used for the secondary server's private address
5730    pool).
5731
5732         Code        Len       Number of Addresses
5733    +-----+-----+-----+-----+----+-----+-----+-----+
5734    |  0  |  1  |  0  |  4  | n1 |  n2 |  n3 |  n4 |
5735    +-----+-----+-----+-----+----+-----+-----+-----+
5736
5737
5738 12.2.  assigned-IP-address
5739
5740    The DHCP managed IP address to which this message refers.
5741
5742         Code        Len          Address
5743    +-----+-----+-----+-----+----+-----+-----+-----+
5744    |  0  |  2  |  0  |  4  | a1 |  a2 |  a3 |  a4 |
5745    +-----+-----+-----+-----+----+-----+-----+-----+
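
   The two option layouts above share a common shape: a two-byte option
   code, a two-byte length, and the data, all in network byte order.  A
   minimal encoding sketch follows; the helper names are hypothetical.

```python
import ipaddress
import struct


def encode_option(code, data):
    """Two-byte code, two-byte length, then the data bytes."""
    return struct.pack("!HH", code, len(data)) + data


def addresses_transferred(count):
    """Option 1: a 32 bit unsigned count in network byte order."""
    return encode_option(1, struct.pack("!I", count))


def assigned_ip_address(address):
    """Option 2: the four octets of an IPv4 address."""
    return encode_option(2, ipaddress.IPv4Address(address).packed)
```

   For example, addresses_transferred(2) yields the octets
   0 1 0 4 0 0 0 2.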
5746
5747
5775 12.3.  binding-status
5776
5777    This option is used to convey the current state of a binding.
5778
5779        Code         Len     Type
5780    +-----+-----+-----+-----+-----+
5781    |  0  |  3  |  0  |  1  | 1-8 |
5782    +-----+-----+-----+-----+-----+
5783
5784    Legal values for this option are:
5785
5786    Value Binding Status
5787    ----- ------------------------------------------------
5788    1     FREE           Lease is currently available
5789    2     ACTIVE         Lease is assigned to a client
5790    3     EXPIRED        Lease has expired
5791    4     RELEASED       Lease has been released by client
5792    5     ABANDONED      A server or client flagged address as unusable
5793    6     RESET          Lease was freed by some external agent
5794    7     BACKUP         Lease belongs to secondary's private address pool
5795    8     BACKUP-RESERVED Lease belongs to secondary's private address pool
5796                         as well as primary's since it is reserved on primary.
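
   The binding-status values listed above map naturally onto an
   enumeration.  The sketch below is illustrative only; the class and
   function names are hypothetical.

```python
import struct
from enum import IntEnum


class BindingStatus(IntEnum):
    FREE = 1
    ACTIVE = 2
    EXPIRED = 3
    RELEASED = 4
    ABANDONED = 5
    RESET = 6
    BACKUP = 7
    BACKUP_RESERVED = 8


def encode_binding_status(status):
    """Option code 3, length 1, followed by the one-byte status."""
    return struct.pack("!HHB", 3, 1, int(status))


def decode_binding_status(option):
    """Parse a binding-status option; reject anything malformed."""
    code, length, value = struct.unpack("!HHB", option)
    if code != 3 or length != 1:
        raise ValueError("not a binding-status option")
    return BindingStatus(value)  # ValueError if outside 1..8
```

   Decoding through the enumeration rejects values outside the legal
   range instead of passing them on silently.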
5797
5798
5799 12.4.  client-identifier
5800
5801    This is the client-identifier for the client associated with a
5802    binding.  The client-identifier data is subject to the same
5803    conventions as DHCP option 61 [RFC 2132].
5804
5805         Code        Len       Client Identifier
5806    +-----+-----+-----+-----+----+-----+---
5807    |  0  |  4  |  0  |  n  | i1 |  i2 | ...
5808    +-----+-----+-----+-----+----+-----+---
5809
5810
5831 12.5.  client-hardware-address
5832
5833    This is the hardware address for the client associated with a
5834    binding.  Byte t1 (type) MUST be set to the proper ARP hardware
5835    address code, as defined in the ARP section of RFC 1700 (it MUST NOT
5836    be zero).
5837
5838         Code        Len     htype   chaddr
5839    +-----+-----+-----+-----+----+-----+-----+---
5840    |  0  |  5  |  0  |  n  | t1 |  c1 |  c2 | ...
5841    +-----+-----+-----+-----+----+-----+-----+---
5842
5843
5844 12.6.  client-last-transaction-time
5845
5846    The time at which this server last received a DHCP request from a
5847    particular client expressed as an absolute time (see section 6.2).
5848
5849
5850         Code        Len    client last transaction time
5851    +-----+-----+-----+-----+----+-----+-----+-----+
5852    |  0  |  6  |  0  |  4  | t1 |  t2 |  t3 |  t4 |
5853    +-----+-----+-----+-----+----+-----+-----+-----+
5854
5855
5856 12.7.  client-reply-options
5857
5858    This option contains options from a DHCP server's reply to a DHCP
5859    client request.  It is sent in a BNDUPD message.  The first 4 bytes
5860    of the option contain the "magic number" of the option area from
5861    which the DHCP reply options were taken and serve to define the
5862    format of the rest of the sub-options contained in this option.
5863    After the magic number, the options included are in the normal
5864    options format appropriate for that magic number.
5865
5866    A server SHOULD NOT include all of the options in a DHCP server's
5867    reply to a client's request in this option, but rather a server
5868    SHOULD include only those options which are of likely interest to its
5869    partner server.  See section 7.1 for details.
5870
5871         Code        Len         Magic Number      Embedded options
5872    +-----+-----+-----+-----+----+----+----+----+----+----+--
5873    |  0  |  7  |  0  |  n  | m1 | m2 | m3 | m4 | b1 | b2 |  ...
5874    +-----+-----+-----+-----+----+----+----+----+----+----+--
5875
5876
5887 12.8.  client-request-options
5888
5889    This option contains options from a DHCP client's request.  It is
5890    sent in a BNDUPD message.  The first 4 bytes of the option contain
5891    the "magic number" of the option area from which the DHCP client's
5892    request options were taken and serve to define the format of the
5893    rest of the sub-options contained in this option.  After the magic
5894    number, the options included are in the normal options format
5895    appropriate for that magic number.
5896
5897    A server SHOULD NOT include all of the options in a DHCP client
5898    request in this option, but rather a server SHOULD include only those
5899    options which are of likely interest to its partner server.  See
5900    section 7.1 for details.
5901
5902         Code        Len         Magic Number      Embedded options
5903    +-----+-----+-----+-----+----+----+----+----+----+----+--
5904    |  0  |  8  |  0  |  n  | m1 | m2 | m3 | m4 | b1 | b2 |  ...
5905    +-----+-----+-----+-----+----+----+----+----+----+----+--
5906
5907
5943 12.9.  DDNS
5944
5945    If an implementation supports Dynamic DNS updates, this option is
5946    used to communicate the status of the DDNS update associated with a
5947    particular lease binding.  The Flags field conveys the types of DNS
5948    RRs that are to be updated by the DHCP server, and the status of the
5949    DDNS update.  The Domain Name field conveys the DNS FQDN that the
5950    DHCP server is using to refer to the client, in DNS encoding as
5951    specified in [RFC 1035].
5952
5953        Code        Len        Flags      Domain Name
5954    +-----+-----+-----+-----+-----+------+------+-----+------
5955    |  0  |  9  |  0  |  n  |   flags    |  d1  |  d2 | ...
5956    +-----+-----+-----+-----+-----+------+------+-----+------
5957
5958    The Flags field is a 16-bit field; several bit positions are
5959    specified here.
5960
5961                         1 1 1 1 1 1
5962     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
5963    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
5964    |C|A|D|P|       MBZ             |
5965    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
5966
5967    The bits (numbered from the least-significant bit in network
5968    byte-order) are used as follows:
5969
5970    0 (C): A RR update successfully completed
5971    1 (A): Server is controlling A RR on behalf of the client
5972    2 (D): PTR RR update successfully completed (Done)
5973    3 (P): Server is controlling PTR RR on behalf of the client
5974    4-15 : Must be zero
5975
5976    All of the unspecified bit positions SHOULD be set to 0 by servers
5977    sending the Failover-DDNS option, and they MUST be ignored by servers
5978    receiving the option.
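
   The Flags field can be decoded as sketched below.  This sketch makes
   an assumption that should be checked against a reference
   implementation: following the text, bit 0 is taken to be the least-
   significant bit of the 16-bit value read in network byte order,
   although the figure could also be read with bit 0 as the leftmost
   bit.  The names DdnsFlags and parse_ddns_flags are hypothetical.

```python
from enum import IntFlag


class DdnsFlags(IntFlag):
    C = 1 << 0  # A RR update successfully completed
    A = 1 << 1  # server controls the A RR on behalf of the client
    D = 1 << 2  # PTR RR update successfully completed
    P = 1 << 3  # server controls the PTR RR on behalf of the client


def parse_ddns_flags(raw):
    """Decode the two-byte Flags field, ignoring unspecified bits."""
    value = int.from_bytes(raw[:2], "big")
    # Bits 4-15 must be ignored by a receiver, so mask them off.
    return DdnsFlags(value & 0x000F)
```

   Masking before conversion implements the rule that unspecified bit
   positions are ignored on receipt.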
5979
5980
5999 12.10.  delayed-service-parameter
6000
6001    The delayed-service-parameter is an optional load balancing tuning
6002    parameter, defined in [LOADB].  If it is used, it MUST be sent in the
6003    same message as the hash-bucket-assignment option (see section
6004    12.11).  Format:
6005
6006
6007        Code        Len    Seconds
6008    +-----+-----+-----+-----+----+
6009    |  0  |  10 |  0  |  1  | S  |
6010    +-----+-----+-----+-----+----+
6011
6012    S is a one byte value, 1..255.
6013
6014
6015 12.11.  hash-bucket-assignment
6016
6017    A set of load balancing hash values for the secondary server.  See
6018    section 5.3 for more information on how this option is used.
6019
6020    The format and usage of the data in this option is defined in
6021    [LOADB].
6022
6023         Code        Len        Hash Buckets
6024    +-----+-----+-----+-----+-----+-----+-----+-----+
6025    |  0  |  11 |  0  |  32 |  b1 |  b2 | ... | b32 |
6026    +-----+-----+-----+-----+-----+-----+-----+-----+
6027
6028
6029 12.12.  lease-expiration-time
6030
6031    The lease expiration time is the lease interval that a DHCP server
6032    has ACKed to a DHCP client added to the time at which that ACK was
6033    transmitted -- expressed as an absolute time (see section 6.2).
6034
6035
6036         Code        Len          Time
6037    +-----+-----+-----+-----+----+-----+-----+-----+
6038    |  0  |  12 |  0  |  4  | t1 |  t2 |  t3 |  t4 |
6039    +-----+-----+-----+-----+----+-----+-----+-----+
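
   The lease-expiration-time is the sum of two values, which the sketch
   below computes and encodes.  It assumes that absolute times are
   carried as 32 bit unsigned counts of seconds in network byte order
   (see section 6.2 for the authoritative definition of time values);
   the function name is hypothetical.

```python
import struct


def lease_expiration_option(ack_sent_at, lease_interval):
    """Option 12: time the ACK was sent plus the ACKed lease interval."""
    expiration = ack_sent_at + lease_interval
    return struct.pack("!HHI", 12, 4, expiration)
```

   For an ACK sent at time 1000 with a 3600 second lease, the encoded
   time value is 4600.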
6040
6041
6055 12.13.  max-unacked-bndupd
6056
6057    The maximum number of BNDUPD messages that this server is prepared to
6058    accept over the TCP connection without causing the TCP connection to
6059    block.  A 32 bit unsigned integer value, in network byte order.
6060
6061
6062         Code        Len     Maximum Unacked BNDUPD
6063    +-----+-----+-----+-----+----+-----+-----+-----+
6064    |  0  |  13 |  0  |  4  | n1 |  n2 |  n3 |  n4 |
6065    +-----+-----+-----+-----+----+-----+-----+-----+
6066
6067
6068 12.14.  MCLT
6069
6070    Maximum Client Lead Time, an interval, in seconds.  A 32 bit unsigned
6071    integer value, in network byte order.
6072
6073         Code        Len             Time
6074    +-----+-----+-----+-----+----+-----+-----+-----+
6075    |  0  |  14 |  0  |  4  | t1 |  t2 |  t3 |  t4 |
6076    +-----+-----+-----+-----+----+-----+-----+-----+
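
        Options such as max-unacked-bndupd and MCLT carry a single 32 bit
        unsigned integer in network byte order.  A minimal Python sketch
        of decoding such an option might look as follows; the helper name
        and the sample bytes are illustrative only.

```python
import struct

MCLT = 14  # option code from the diagram above


def decode_u32_option(data: bytes, expected_code: int) -> int:
    """Decode one option whose payload is a 32-bit unsigned integer."""
    if len(data) < 8:
        raise ValueError("truncated option")
    # Two-byte code, two-byte length, four-byte value, all big-endian.
    code, length, value = struct.unpack("!HHI", data[:8])
    if code != expected_code or length != 4:
        raise ValueError("unexpected option code or length")
    return value


# Sample bytes: code 14, length 4, value 0x00000E10.
mclt_seconds = decode_u32_option(bytes([0, 14, 0, 4, 0, 0, 14, 16]), MCLT)
```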
6077
6078
6079 12.15.  message
6080
6081    This option is used to supply a human readable message text.  It may
6082    be used in association with the Reject Reason Code to provide a human
6083    readable error message for the reject.
6084
6085
6086         Code        Len         Text
6087    +-----+-----+-----+-----+------+-----+--
6088    |  0  |  15 |  0  |  n  |  c1  | c2  | ...
6089    +-----+-----+-----+-----+------+-----+--
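
        The variable-length layout above can be sketched in Python as
        shown below.  The draft does not state a character set here, so
        the use of ASCII is an assumption, and the function names are
        hypothetical.

```python
import struct

MESSAGE = 15  # option code from the diagram above


def encode_message(text: str) -> bytes:
    """Encode text as a message option (assumption: ASCII payload)."""
    payload = text.encode("ascii")
    if len(payload) > 0xFFFF:
        raise ValueError("text does not fit in a two-byte length")
    return struct.pack("!HH", MESSAGE, len(payload)) + payload


def decode_message(option: bytes) -> str:
    """Decode a message option back into text."""
    if len(option) < 4:
        raise ValueError("truncated option header")
    code, length = struct.unpack("!HH", option[:4])
    if code != MESSAGE or len(option) < 4 + length:
        raise ValueError("not a complete message option")
    return option[4:4 + length].decode("ascii")
```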
6090
6091
6111 12.16.  message-digest
6112
6113    The message digest for this message.
6114
6115    This option consists of a variable number of bytes which contain the
6116    message digest of the message prior to the inclusion of this option.
6117
6118    When this option appears in a message, it MUST appear as the last
6119    option in the message.  It MUST appear in every message if message
6120    digests are required.
6121
6122         Code        Len       Message Digest
6123    +-----+-----+-----+-----+----+-----+-----
6124    |  0  |  16 |  0  |  n  | d1 |  d2 | ...
6125    +-----+-----+-----+-----+----+-----+-----
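
        The two rules above (the digest covers the message as it stood
        before this option was added, and the option comes last) can be
        sketched as follows.  This is only an illustration: it uses plain,
        unkeyed MD5 as a stand-in, treats the message as an opaque byte
        string, and ignores any header length field that a real message
        would need to keep consistent.  The digest algorithm and keying
        actually required are defined elsewhere in this draft, and the
        function names are hypothetical.

```python
import hashlib
import hmac
import struct

MESSAGE_DIGEST = 16   # option code from the diagram above
DIGEST_LEN = 16       # assumption: a 16-byte MD5 digest


def append_digest(message: bytes) -> bytes:
    """Append a message-digest option computed over everything before it."""
    digest = hashlib.md5(message).digest()
    return message + struct.pack("!HH", MESSAGE_DIGEST, DIGEST_LEN) + digest


def digest_is_valid(message: bytes) -> bool:
    """Check that the final option is a digest matching the preceding bytes."""
    option_len = 4 + DIGEST_LEN
    if len(message) < option_len:
        return False
    body, option = message[:-option_len], message[-option_len:]
    code, length = struct.unpack("!HH", option[:4])
    if code != MESSAGE_DIGEST or length != DIGEST_LEN:
        return False
    # Constant-time comparison of the received and recomputed digests.
    return hmac.compare_digest(option[4:], hashlib.md5(body).digest())
```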
6126
6127
6128 12.17.  potential-expiration-time
6129
6130    The potential expiration time is the time that one server tells
6131    another server that it may wish to grant in a lease to a DHCP client.
6132    It is an absolute time.  See section 6.2.
6133
6134
6135         Code        Len          Time
6136    +-----+-----+-----+-----+----+-----+-----+-----+
6137    |  0  |  17 |  0  |  4  | t1 |  t2 |  t3 |  t4 |
6138    +-----+-----+-----+-----+----+-----+-----+-----+
6139
6140
6141 12.18.  receive-timer
6142
6143    The number of seconds (an interval) within which the server must
6144    receive a message from its partner; otherwise it will assume that
6145    communications with the partner are not ok.  An unsigned 32 bit
6146    integer in network byte order.
6147
6148         Code        Len         Receive Timer
6149    +-----+-----+-----+-----+----+-----+-----+-----+
6150    |  0  |  18 |  0  |  4  | s1 |  s2 |  s3 |  s4 |
6151    +-----+-----+-----+-----+----+-----+-----+-----+
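
        A receiving server might apply this interval roughly as in the
        Python sketch below.  The function name is hypothetical, and the
        choice to treat an elapsed time exactly equal to the timer as
        still acceptable is an assumption.

```python
def partner_timed_out(last_received_at: float, now: float,
                      receive_timer: int) -> bool:
    """Return True if no message arrived within receive_timer seconds.

    Both timestamps must come from the same clock, for example
    time.monotonic(), so that wall-clock adjustments do not matter.
    """
    return (now - last_received_at) > receive_timer
```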
6152
6153
6167 12.19.  protocol-version
6168
6169    The protocol version being used by the server. It is only sent in the
6170    CONNECT and CONNECTACK messages.  The current value for the version
6171    is 1.
6172
6173         Code        Len    Version
6174    +-----+-----+-----+-----+-----+
6175    |  0  |  19 |  0  |  1  |  1  |
6176    +-----+-----+-----+-----+-----+
6177
6178
6223 12.20.  reject-reason
6224
6225    This option is used to selectively reject binding updates. It MAY be
6226    used in a BNDACK message or a CONNECTACK message, always associated
6227    with an assigned-IP-address option, which contains the IP address of
6228    the update being rejected.
6229
6230         Code        Len   Reason Code
6231    +-----+-----+-----+-----+-----+
6232    |  0  |  20 |  0  |  1  |  R1 |
6233    +-----+-----+-----+-----+-----+
6234
6235    Reason codes:
6236
6237    0   Reserved
6238    1   Illegal IP address (not part of any address pool).
6239    2   Fatal conflict exists: address in use by other client.
6240    3   Missing binding information.
6241    4   Connection rejected, time mismatch too great.
6242    5   Connection rejected, invalid MCLT.
6243    6   Connection rejected, unknown reason.
6244    7   Connection rejected, duplicate connection.
6245    8   Connection rejected, invalid failover partner.
6246    9   TLS not supported.
6247    10  TLS supported but not configured.
6248    11  TLS required but not supported by partner.
6249    12  Message digest not supported.
6250    13  Message digest not configured.
6251    14  Protocol version mismatch.
6252    15  Outdated binding information.
6253    16  Less critical binding information.
6254    17  No traffic within sufficient time.
6255    18  Hash bucket assignment conflict.
6256    19-253 Reserved.
6257    254 Unknown: Error occurred but does not match any reason code.
6258    255 Reserved for code expansion.
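
        For implementers, the reason codes above could be captured as an
        enumeration along the following lines.  The numeric values come
        from the list; the symbolic names and the helper function are
        hypothetical and carry no normative weight.

```python
from enum import IntEnum


class RejectReason(IntEnum):
    ILLEGAL_IP_ADDRESS = 1
    FATAL_CONFLICT = 2
    MISSING_BINDING_INFO = 3
    TIME_MISMATCH = 4
    INVALID_MCLT = 5
    CONNECTION_REJECTED_UNKNOWN = 6
    DUPLICATE_CONNECTION = 7
    INVALID_FAILOVER_PARTNER = 8
    TLS_NOT_SUPPORTED = 9
    TLS_NOT_CONFIGURED = 10
    TLS_REQUIRED_NOT_SUPPORTED = 11
    DIGEST_NOT_SUPPORTED = 12
    DIGEST_NOT_CONFIGURED = 13
    PROTOCOL_VERSION_MISMATCH = 14
    OUTDATED_BINDING_INFO = 15
    LESS_CRITICAL_BINDING_INFO = 16
    NO_TRAFFIC = 17
    HASH_BUCKET_CONFLICT = 18
    UNKNOWN = 254


def describe_reject_reason(code: int) -> str:
    """Map a one-byte reason code to a name; reserved codes share a label."""
    if not 0 <= code <= 255:
        raise ValueError("reason code must fit in one byte")
    try:
        return RejectReason(code).name
    except ValueError:
        # 0, 19-253 and 255 are reserved in the list above.
        return "RESERVED"
```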
6259
6260
6279 12.21.  sending-server-IP-address
6280
6281    The IP address of the server sending this message.  This option is
6282    required for all messages if the message digest option is used.
6283
6284         Code        Len          Address
6285    +-----+-----+-----+-----+----+-----+-----+-----+
6286    |  0  |  21 |  0  |  4  | a1 |  a2 |  a3 |  a4 |
6287    +-----+-----+-----+-----+----+-----+-----+-----+
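
        A minimal Python sketch of encoding this option, assuming an IPv4
        address given in dotted-decimal form; the function name is
        hypothetical.

```python
import ipaddress
import struct

SENDING_SERVER_IP_ADDRESS = 21  # option code from the diagram above


def encode_sending_server_address(address: str) -> bytes:
    """Encode an IPv4 address as a sending-server-IP-address option."""
    # .packed yields the four address bytes in network byte order.
    packed = ipaddress.IPv4Address(address).packed
    return struct.pack("!HH", SENDING_SERVER_IP_ADDRESS, len(packed)) + packed
```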
6288
6289
6290 12.22.  server-flags
6291
6292    This option is used to convey the current flags of the failover
6293    endpoint in the sending server.
6294
6295        Code         Len     Server Flags
6296    +-----+-----+-----+-----+-------+
6297    |  0  |  22 |  0  |  1  | flags |
6298    +-----+-----+-----+-----+-------+
6299
6300    The flags field is an 8-bit field; one bit position is
6301    specified here.
6302
6303
6304     0 1 2 3 4 5 6 7
6305    +-+-+-+-+-+-+-+-+
6306    |S|   MBZ       |
6307    +-+-+-+-+-+-+-+-+
6308
6309    The bits (numbered from the least-significant bit in network
6310    byte-order) are used as follows:
6311
6312    0 (S): STARTUP,
6313           Bit 0 MUST be set to 1 whenever the server is in STARTUP state,
6314           and set to 0 otherwise.  (Note that when in STARTUP state, the
6315           state transmitted in the server-state option is usually the last
6316           recorded state from stable storage, but see section 9.3 for
6317           details.)
6318    1-7  : Must be zero
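
        The flags byte can be handled as in the sketch below.  It follows
        the prose above in taking bit 0 to be the least-significant bit
        (mask 0x01); that reading is an assumption, since the diagram on
        its own could be read with bit 0 as the leftmost bit.  The
        function names are hypothetical.

```python
import struct

SERVER_FLAGS = 22    # option code from the diagram above
STARTUP_FLAG = 0x01  # assumption: bit 0 is the least-significant bit


def encode_server_flags(in_startup: bool) -> bytes:
    """Build a server-flags option; bits 1-7 are always sent as zero."""
    flags = STARTUP_FLAG if in_startup else 0
    return struct.pack("!HHB", SERVER_FLAGS, 1, flags)


def startup_bit_set(flags: int) -> bool:
    """Return True if the STARTUP bit is set in a received flags byte."""
    return bool(flags & STARTUP_FLAG)
```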
6319
6320
6335 12.23.  server-state
6336
6337    This option is used to convey the current state of the failover
6338    endpoint in the sending server.
6339
6340        Code         Len   Server State
6341    +-----+-----+-----+-----+-----+
6342    |  0  |  23 |  0  |  1  |1-10 |
6343    +-----+-----+-----+-----+-----+
6344
6345    Legal values for this option are:
6346
6347    Value   Server State
6348    -----   -------------------------------------------------------------
6349    0       reserved
6350    1       STARTUP                      Startup state (1)
6351    2       NORMAL                       Normal state
6352    3       COMMUNICATIONS-INTERRUPTED   Communication interrupted (safe)
6353    4       PARTNER-DOWN                 Partner down (unsafe mode)
6354    5       POTENTIAL-CONFLICT           Synchronizing
6355    6       RECOVER                      Recovering bindings from partner
6356    7       PAUSED                       Shutting down for a short period.
6357    8       SHUTDOWN                     Shutting down for an extended
6358                                         period.
6359    9       RECOVER-DONE                 Interlock state prior to NORMAL
6360    10      RESOLUTION-INTERRUPTED       Comm. failed during resolution
6361
6362    (1) The STARTUP state is never sent to the partner server; it is
6363    indicated by the STARTUP bit in the server-flags option (see section
6364    12.22).
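
        The table and note above might be expressed in code as follows.
        The numeric values are taken from the table; the Python names are
        hypothetical, and refusing to encode STARTUP is one way to honor
        note (1), not a requirement stated in that form.

```python
import struct
from enum import IntEnum

SERVER_STATE = 23  # option code from the diagram above


class ServerState(IntEnum):
    STARTUP = 1
    NORMAL = 2
    COMMUNICATIONS_INTERRUPTED = 3
    PARTNER_DOWN = 4
    POTENTIAL_CONFLICT = 5
    RECOVER = 6
    PAUSED = 7
    SHUTDOWN = 8
    RECOVER_DONE = 9
    RESOLUTION_INTERRUPTED = 10


def encode_server_state(state: ServerState) -> bytes:
    """Build a server-state option for transmission to the partner."""
    if state is ServerState.STARTUP:
        # STARTUP is signalled through the server-flags option instead.
        raise ValueError("STARTUP is never sent in server-state")
    return struct.pack("!HHB", SERVER_STATE, 1, int(state))
```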
6365
6366
6367 12.24.  start-time-of-state
6368
6369    This option is used for different states in different messages.  In a
6370    BNDUPD message it represents the start time of the state of the lease
6371    in the BNDUPD message.  In a STATE message, it represents the start
6372    time of the partner server's failover state.  In all cases it is an
6373    absolute time.
6374
6375
6376         Code        Len      Start Time of State
6377    +-----+-----+-----+-----+----+-----+-----+-----+
6378    |  0  |  24 |  0  |  4  | t1 |  t2 |  t3 |  t4 |
6379    +-----+-----+-----+-----+----+-----+-----+-----+
6380
6381
6391 12.25.  TLS-reply
6392
6393    This option contains information relating to TLS security
6394    negotiation.  It is sent in a CONNECTACK message.
6395
6396    A t1 value of 0 indicates no TLS operation; a value of 1 indicates
6397    that TLS operation is required.
6398
6399         Code        Len      TLS
6400    +-----+-----+-----+-----+-----+
6401    |  0  |  25 |  0  |  1  |  t1 |
6402    +-----+-----+-----+-----+-----+
6403
6404
6405 12.26.  TLS-request
6406
6407    This option contains information relating to TLS security
6408    negotiation.  It is sent in a CONNECT message.
6409
6410    The t1 byte is the TLS request from this server.  A value of 0
6411    indicates no TLS operation (to communicate, the other server MUST
6412    NOT require TLS); a value of 1 indicates that TLS operation is
6413    desired but not required (to communicate, the other server MAY
6414    utilize TLS); and a value of 2 indicates that TLS operation is
6415    required (to communicate, the other server MUST utilize TLS).
6417
6418         Code        Len      TLS
6419    +-----+-----+-----+-----+-----+
6420    |  0  |  26 |  0  |  1  |  t1 |
6421    +-----+-----+-----+-----+-----+
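
        One plausible reading of how a responder could map a received
        TLS-request value onto a TLS-reply value is sketched below.  The
        responder's own policy is modelled with the same 0/1/2 values as
        the request, which is an assumption made for illustration; the
        function name is hypothetical, and a result of None simply marks
        a combination under which the two servers cannot communicate.

```python
from typing import Optional

NO_TLS, TLS_DESIRED, TLS_REQUIRED = 0, 1, 2  # TLS-request values


def choose_tls_reply(request: int, local_policy: int) -> Optional[int]:
    """Return the TLS-reply value (0 or 1), or None if incompatible.

    local_policy uses the TLS-request values to describe the responder
    (an assumption of this sketch, not something the draft specifies).
    """
    valid = (NO_TLS, TLS_DESIRED, TLS_REQUIRED)
    if request not in valid or local_policy not in valid:
        raise ValueError("unknown TLS value")
    if request == NO_TLS:
        # The requester does no TLS, so the responder must not require it.
        return None if local_policy == TLS_REQUIRED else 0
    if request == TLS_REQUIRED:
        # The requester insists on TLS, so the responder must be able to.
        return None if local_policy == NO_TLS else 1
    # TLS desired but not required: use it unless the responder does no TLS.
    return 0 if local_policy == NO_TLS else 1
```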
6422
6423
6424 12.27.  vendor-class-identifier
6425
6426    A string which identifies the vendor of the failover protocol
6427    implementation.
6428
6429         Code        Len    vendor class string
6430    +-----+-----+-----+-----+----+-----+---
6431    |  0  |  27 |  0  |  n  | c1 |  c2 |  ...
6432    +-----+-----+-----+-----+----+-----+---
6433
6434
6447 12.28.  vendor-specific-options
6448
6449    This option is used to convey options specific to a particular
6450    vendor's implementation.  The vendor class identifier is used to
6451    specify which option space the embedded options are drawn from.
6452
6453    It functions similarly to the vendor class identifier and vendor
6454    specific options in the DHCP protocol.
6455
6456    This option contains other options in the same two byte code, two
6457    byte length format.  If this option appears in a message without a
6458    corresponding vendor class identifier, it MUST be ignored.
6459
6460         Code        Len     Embedded options
6461    +-----+-----+-----+-----+----+-----+---
6462    |  0  |  28 |  0  |  n  | c1 |  c2 |  ...
6463    +-----+-----+-----+-----+----+-----+---
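
        Because the embedded options reuse the two byte code, two byte
        length format, one parser can serve both levels.  The Python
        sketch below walks a flat run of options and returns the embedded
        vendor options only when a vendor class identifier (option 27) is
        also present.  The function names are hypothetical, and the
        sketch treats its input as just the option bytes of one message.

```python
import struct
from typing import Iterator, List, Tuple

VENDOR_CLASS_IDENTIFIER = 27
VENDOR_SPECIFIC_OPTIONS = 28


def iter_options(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (code, payload) pairs from a run of code/length/value options."""
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if len(data) - offset < length:
            raise ValueError("truncated option payload")
        yield code, data[offset:offset + length]
        offset += length


def embedded_vendor_options(options: bytes) -> List[Tuple[int, bytes]]:
    """Return the embedded options, or [] if they must be ignored."""
    vendor_class = None
    vendor_payload = None
    for code, payload in iter_options(options):
        if code == VENDOR_CLASS_IDENTIFIER:
            vendor_class = payload
        elif code == VENDOR_SPECIFIC_OPTIONS:
            vendor_payload = payload
    if vendor_class is None or vendor_payload is None:
        # Without a vendor class identifier the option is ignored.
        return []
    return list(iter_options(vendor_payload))
```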
6464
6465
6466
6467
6468 13.  IANA Considerations
6469
6470    This document defines several number spaces (failover options, fail-
6471    over message types, and failover reject reason codes). For all of
6472    these number spaces, certain values are defined in this specifica-
6473    tion.  New values may only be defined by IETF Consensus, as described
6474    in [RFC 2434]. Basically, this means that they are defined by RFCs
6475    approved by the IESG.
6476
6477
6478 14.  Acknowledgments
6479
6480    Ralph Droms started it all, by sketching out an initial interserver
6481    draft that embodied ideas from several past IETF meetings.  In that
6482    draft, he acknowledged contributions by Jeff Mogul, Greg Minshall,
6483    Rob Stevens, Walt Wimer, Ted Lemon, and the DHC working group.
6484
6485    Kim Kinnear and Bob Cole each extended that draft, separately and
6486    then together, until they created an interserver draft that supported
6487    any number of servers.  The complexity of that approach was just too
6488    great, and that draft wasn't greeted with enthusiasm by many, includ-
6489    ing its authors.
6490
6491    It did however lead to a much simpler approach embodied in the first
6492    Failover draft by Greg Rabil, Mike Dooley, Arun Kapur and Ralph
6493    Droms.  This draft posited only two servers -- a primary and a
6502    secondary.
6503
6504    Kim Kinnear then wrote the Safe Failover draft to layer on top of the
6505    Failover Draft and increase its robustness in the face of certain
6506    rare network failures.
6507
6508    At the spring 1998 IETF meeting in LA, the DHC working group said
6509    that they wanted a merged Failover and Safe Failover draft.  Steve
6510    Gonczi and Bernie Volz stepped up and produced the raw material for
6511    such a merged draft, along with a new message format designed around
6512    DHCP options and other extensions and clarifications.  Kim Kinnear
6513    edited their work into draft format and made other changes in time
6514    for the Summer Chicago IETF meeting.
6515
6516    During the summer and fall of 1998, two groups worked on separate
6517    implementations of the UDP failover draft.  Bernie Volz and Steve
6518    Gonczi constituted one group, and Kim Kinnear, Mark Stapp and Paul
6519    Fox made up the other.  These two groups worked together to produce
6520    considerable changes and simplifications of the protocol during that
6521    period, and Steve Gonczi and Kim Kinnear edited those changes into
6522    the -03 draft in time for submission to the December 1998 Orlando IETF
6523    meeting.
6524
6525    In February of 1999 Kim Kinnear and Mark Stapp hosted a meeting of
6526    people interested in the failover draft.  During that meeting a gen-
6527    eral agreement was reached to recast the failover protocol to use TCP
6528    instead of UDP.  In addition, the group together brainstormed a work-
6529    able load-balancing technique.  Kim Kinnear rewrote the entire draft
6530    to include the changes made at that meeting as well as to restructure
6531    the draft along guidelines suggested by Thomas Narten.  The result
6532    was the -04 draft, submitted prior to the Oslo IETF meeting.
6533
6534    The initial idea for a hash-based load balancing approach was offered
6535    by Ted Lemon, and the determination of an algorithm and its integra-
6536    tion into the draft was done by Steve Gonczi.  The security section
6537    was spearheaded by Bernie Volz.  Both contributed considerably to the
6538    ideas and text in the rest of the draft with several reviews.
6539
6540    In early October of 1999, three conference calls were held to discuss
6541    the -04 draft.  The -05 draft includes changes resulting from those
6542    calls, perhaps the largest of which was to move the load balancing
6543    approach into a separate draft.  Thanks to all of the many people
6544    who participated in the conference calls.  Changes were made because
6545    of contributions by: Ted Lemon, David Erdmann, Richard Jones, Rob
6546    Stevens, Thomas Narten, Diana Lane, and Andre Kostur.
6547
6548    Another conference call was held in mid-January of 2000, and the -06
6549    draft was produced to tighten up the -05 draft both technically
6558    and editorially.
6559
6560    This, the -07 draft, was edited by Kim Kinnear and was based in part
6561    on reviews by Richard Jones, Bernie Volz, and Steve Gonczi.  It
6562    embodies several technical updates as well as numerous editorial
6563    revisions that enhance both correctness and clarity.
6564
6565    These most recent changes have not been widely circulated among the
6566    other authors prior to submission to the IETF.
6567
6568    Many people have reviewed the various earlier drafts that went into
6569    this result.  At American Internet, ideas were contributed by Brad
6570    Parker.  At Cisco Systems Paul Fox and Ellen Garvey contributed to
6571    the design of the protocol.
6572
6573    Glenn Waters of Nortel Networks contributed ideas and enthusiasm to
6574    make a Failover protocol that was both "safe" and "lazy".
6575
6576
6577 15.  References
6578
6579
6580    [AGENTINFO] Patrick, M., "draft-ietf-dhc-agent-options-11.txt", July,
6581       2000.
6582
6583    [DDNS] Rekhter, Y., Stapp, M., "draft-ietf-dhc-dhcp-dns-12.txt",
6584       March, 2000.
6585
6586    [LOADB] Volz, B., Gonczi, S., Lemon, T., Stevens, R., "draft-ietf-
6587       dhc-loadb-02.txt", July, 1999.
6588
6589    [RFC 1035] Mockapetris, P., "Domain Names - Implementation and
6590       Specification", November, 1987.
6591
6592    [RFC 1321] Rivest, R., and Dusse, S., "The MD5 Message-Digest Algo-
6593       rithm", RFC 1321, MIT Laboratory for Computer Science, RSA Data
6594       Security Inc., April 1992.
6595
6596    [RFC 1534] Droms, R., "Interoperation between DHCP and BOOTP", RFC
6597       1534, October 1993.
6598
6599    [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
6600       Requirement Levels", RFC 2119, March 1997.
6601
6602    [RFC 2131] Droms, R., "Dynamic Host Configuration Protocol", RFC
6603       2131, March 1997.
6604
6605    [RFC 2132] Alexander, S.,  Droms, R., "DHCP Options and BOOTP Vendor
6614       Extensions", Internet RFC 2132, March 1997.
6615
6616    [RFC 2136] P. Vixie, S. Thomson, Y. Rekhter, J. Bound, "Dynamic
6617       Updates in the Domain Name System (DNS UPDATE)", RFC 2136, April
6618       1997
6619
6620    [RFC 2139] Rigney, C., "Radius Accounting", RFC 2139, Livingston
6621       Enterprises, April 1997.
6622
6623    [RFC 2246] Dierks, T., "The TLS Protocol, Version 1.0", RFC 2246,
6624       January 1999.
6625
6626    [RFC 2434] Alvestrand, H. and T. Narten, "Guidelines for Writing an
6627       IANA Considerations Section in RFCs", BCP 26, RFC 2434, October
6628       1998.
6629
6630    [RFC 2487] Hoffman, P., "SMTP Service Extension for Secure SMTP over
6631       TLS", RFC 2487, January 1999.
6632
6633    [RFC 2595] Newman, C., "Using TLS with IMAP, POP3, and ACAP", RFC
6634       2595, June 1999.
6635
6636    [USERCLASS] Droms, R., Demirtjis A., Stump, G., Gu, Y., Vyaghrapuri,
6637       R., Beser, B., Privat, J. "draft-ietf-dhc-userclass-08.txt", July,
6638       2000.
6639
6640 16.  Authors' information
6641
6642       Ralph Droms
6643       323 Dana Engineering
6644       Bucknell University
6645       Lewisburg, PA  17837
6646
6647       Phone: (717) 524-1145
6648       EMail: droms@bucknell.edu
6649
6650
6651       Kim Kinnear
6652       Mark Stapp
6653       Cisco Systems
6654       250 Apollo Drive
6655       Chelmsford, MA  01824
6656
6657       Phone: (978) 244-8000
6658
6659       EMail: kkinnear@cisco.com
6660              mjs@cisco.com
6661
6662
6670       Bernie Volz
6671       IPWorks, Inc.
6672       959 Concord St.
6673       Framingham, MA  01701
6674
6675       Phone: (508) 879-1809
6676
6677       EMail: volz@ipworks.com
6678
6679
6680       Steve Gonczi
6681       Network Engines, Inc.
6682       25 Dan Road
6683       Canton, MA 02021-2817
6684
6685       Phone: (781) 332-1165
6686
6687       Email: steve.gonczi@networkengines.com
6688
6689
6690
6691       Greg Rabil, Mike Dooley, Arun Kapur
6692       Lucent Technologies
6693       400 Lapp Road
6694       Malvern, PA 19355
6695
6696       Phone: (800) 208-2747
6697
6698       EMail: grabil@lucent.com
6699              mdooley@lucent.com
6700              akapur@lucent.com
6701
6702
6703 17.  Full Copyright Statement
6704
6705 Copyright (C) The Internet Society (1999). All Rights Reserved.
6706
6707 This document and translations of it may be copied and furnished to oth-
6708 ers, and derivative works that comment on or otherwise explain it or
6709 assist in its implementation may be prepared, copied, published and dis-
6710 tributed, in whole or in part, without restriction of any kind, provided
6711 that the above copyright notice and this paragraph are included on all
6712 such copies and derivative works.  However, this document itself may not
6713 be modified in any way, such as by removing the copyright notice or
6714 references to the Internet Society or other Internet organizations,
6715 except as needed for the  purpose of developing Internet standards in
6716 which case the procedures for copyrights defined in the Internet Stan-
6717 dards process must be followed, or as required to translate it into
6726 languages other than English.
6727
6728 The limited permissions granted above are perpetual and will not be
6729 revoked by the Internet Society or its successors or assigns.
6730
6731 This document and the information contained herein is provided on an "AS
6732 IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
6733 FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
6734 LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
6735 INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FIT-
6736 NESS FOR A PARTICULAR PURPOSE.
6737
6738 Open Issues
6739
6740    These issues need to be resolved:
6741
6742
6743       1.  Get another port number for connections.
6744
6745       2.  Resolve how to handle secondary IP address allocation.
6746
6747       3.  Figure out a better way to identify vendors.  How about an
6748           SNMP Enterprise MIB value?
6749
6750       4.  Need to tie reject-reasons to text of draft, remove obsolete
6751           reject-reasons.
6752