lib/libbdev/NOTES

Development notes regarding libbdev, by David van Moolenbroek.


GENERAL MODEL

This library is designed mainly for use by file servers. It essentially covers
two use cases: 1) use of the block device that contains the file system itself,
and 2) use of any block device for raw block I/O (on unmounted file systems)
performed by the root file server. In the first case, the file server is
responsible for opening and closing the block device, and recovery from a
driver restart involves reopening those minor devices. Regular file systems
should have one or at most two (for a separate journal) block devices open at
the same time, which is why NR_OPEN_DEVS is set to a value that is quite low.
In the second case, VFS is responsible for opening and closing the block device
(and performing IOCTLs), as well as reopening the block device on a driver
restart -- the root file server only gets raw I/O (and flush) requests.
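
As a rough illustration of the first use case, the set of open minor devices
can be kept in a small fixed-size table, so that it is known which devices to
reopen after a driver restart. The following is a minimal sketch and not the
library's actual code: the value given to NR_OPEN_DEVS, the type minor_dev_t,
and the functions record_open(), record_close() and nr_to_reopen() are all
assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define NR_OPEN_DEVS	4	/* assumed value; the notes only say "quite low" */

typedef unsigned int minor_dev_t;	/* hypothetical stand-in for dev_t */

struct open_dev {
	minor_dev_t dev;	/* minor device identifier */
	int count;		/* open reference count; 0 means the slot is free */
	int access;		/* union of access bits requested at open time */
};

static struct open_dev open_devs[NR_OPEN_DEVS];

/* Record one successful open. Returns 0, or -1 if the table is full. */
int record_open(minor_dev_t dev, int access)
{
	struct open_dev *free_slot = NULL;
	int i;

	for (i = 0; i < NR_OPEN_DEVS; i++) {
		if (open_devs[i].count > 0 && open_devs[i].dev == dev) {
			/* Already open: only bump the reference count. */
			open_devs[i].count++;
			open_devs[i].access |= access;
			return 0;
		}
		if (open_devs[i].count == 0 && free_slot == NULL)
			free_slot = &open_devs[i];
	}

	if (free_slot == NULL)
		return -1;

	free_slot->dev = dev;
	free_slot->count = 1;
	free_slot->access = access;
	return 0;
}

/* Record one close. Returns 0, or -1 if the device was not open. */
int record_close(minor_dev_t dev)
{
	int i;

	for (i = 0; i < NR_OPEN_DEVS; i++) {
		if (open_devs[i].count > 0 && open_devs[i].dev == dev) {
			open_devs[i].count--;
			assert(open_devs[i].count >= 0);
			return 0;
		}
	}
	return -1;
}

/* Number of distinct minor devices to reopen after a driver restart. */
int nr_to_reopen(void)
{
	int i, n = 0;

	for (i = 0; i < NR_OPEN_DEVS; i++)
		if (open_devs[i].count > 0)
			n++;
	return n;
}
```

In this sketch, opening the same minor device twice occupies a single slot, so
a file system with one data device and one journal device needs two slots no
matter how often each is opened.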

At this time, libbdev considers only clean crashes (a crash-only model), and
does not support recovery from behavioral errors. Protocol errors are passed to
the user process, and generally do not have an effect on the overall state of
the library.


RETRY MODEL

The philosophy for recovering from driver restarts in libbdev can be formulated
as follows: we want to tolerate an unlimited number of driver restarts over a
long time, but we do not want to keep retrying individual requests across
driver restarts. As such, we do not keep track of driver restarts on a
per-driver basis, because that would mean we put a hard limit on the number of
restarts for that driver in total. Instead, there are two limits: a driver
restart limit that is kept on a per-request basis, failing only that request
when the limit is reached, and a driver restart limit that is kept during
recovery, limiting the number of restarts and eventually giving up on the
entire driver when even the recovery keeps failing (as no progress is made in
that case).
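
The two limits described above can be sketched as two separate counters, one
stored in each request and one used only while recovering. This is an
illustrative model under stated assumptions, not the library's implementation:
the limit values, the structure layouts and all function names are
hypothetical, and resetting the recovery counter after a successful recovery
is one possible reading of "kept during recovery".

```c
#include <assert.h>
#include <stddef.h>

#define DRIVER_TRIES	3	/* assumed: driver restarts seen per request */
#define RECOVER_TRIES	2	/* assumed: failed recovery attempts in a row */

struct request {
	int driver_tries;	/* restarts this one request has lived through */
};

struct driver {
	int usable;		/* 0 once we have given up on the driver */
	int recover_tries;	/* restarts seen during the current recovery */
};

/*
 * A pending request notices that the driver restarted. Returns 1 if the
 * request may be resubmitted, 0 if only this request must fail.
 */
int request_saw_restart(struct request *req)
{
	assert(req != NULL);
	return ++req->driver_tries < DRIVER_TRIES;
}

/*
 * Report the result of one recovery attempt (reopening minor devices on the
 * new driver instance). 'recovered' is 0 if the driver crashed again before
 * recovery completed. Returns whether the driver is still considered usable.
 */
int recovery_result(struct driver *drv, int recovered)
{
	assert(drv != NULL);

	if (recovered) {
		/* Progress was made: no lifetime count of restarts is kept. */
		drv->recover_tries = 0;
		return 1;
	}
	if (++drv->recover_tries >= RECOVER_TRIES)
		drv->usable = 0;	/* no progress: give up on the driver */
	return drv->usable;
}

/* Simulate n restarts that each recover cleanly; returns the usable flag. */
int usable_after_clean_restarts(int n)
{
	struct driver drv = { 1, 0 };

	while (n-- > 0)
		recovery_result(&drv, 1);
	return drv.usable;
}
```

Because neither counter accumulates over the lifetime of the driver, any
number of restarts is tolerated as long as each recovery eventually succeeds
and no single request is caught in too many of them.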

Each transfer request also has a transfer retry count. The assumption here is
that when a transfer request returns EIO, it can be retried and possibly
succeed upon repetition. The driver restart and transfer retry counts are
tracked independently and thus the first to hit the limit will fail the
request. The behavior should be the same for synchronous and asynchronous
requests in this respect.
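
To make the independence of the two counts concrete, the sketch below replays
a scripted sequence of driver outcomes against a single transfer request. It
is a simplified model: the limit values, the outcome and result enumerations,
and run_transfer() are assumptions made for this example and do not reflect
the library's actual interface or error codes.

```c
#include <assert.h>
#include <stddef.h>

#define TRANSFER_TRIES	3	/* assumed: attempts that may end in EIO */
#define DRIVER_TRIES	2	/* assumed: driver restarts per request */

/* What happened to one attempt at the request (hypothetical model). */
enum outcome { OUT_OK, OUT_EIO, OUT_RESTART };

/* Final fate of the request (hypothetical; not the real error codes). */
enum result { RES_OK, RES_FAIL_TRANSFER, RES_FAIL_DRIVER, RES_PENDING };

/*
 * Apply the scripted outcomes to one request, in order. Each kind of
 * failure increments only its own counter, and neither counter is reset by
 * the other kind of event; whichever limit is reached first fails the
 * request.
 */
enum result run_transfer(const enum outcome *script, int len)
{
	int transfer_tries = 0, driver_tries = 0, i;

	assert(script != NULL || len == 0);

	for (i = 0; i < len; i++) {
		switch (script[i]) {
		case OUT_OK:
			return RES_OK;
		case OUT_EIO:
			if (++transfer_tries >= TRANSFER_TRIES)
				return RES_FAIL_TRANSFER;
			break;
		case OUT_RESTART:
			if (++driver_tries >= DRIVER_TRIES)
				return RES_FAIL_DRIVER;
			break;
		}
	}
	return RES_PENDING;	/* script ended before the request resolved */
}
```

Keeping the counting in one routine like this is one way to give synchronous
and asynchronous requests the same behavior, since both paths would then share
the same limits and the same bookkeeping.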

It could happen that a new driver gets loaded after we have decided that the
current driver is unusable. This could be due to a race condition (VFS sends a
new-driver request after we've given up) or due to user interaction (the user
loads a replacement driver). The latter case may occur legitimately with raw
I/O on the root file server, so we must not mark the driver as unusable
forever. On the other hand, in the former case, we must not continue to send
I/O without first reopening the minor devices. For this reason, we do not clean
up the record of the minor devices when we mark a driver as unusable.
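
The reasoning above can be sketched as a small state model in which marking a
driver unusable changes only a state flag and leaves the list of open minor
devices intact. This is an assumption-laden illustration rather than the
actual code: struct dev_state, its fields, the value of NR_OPEN_DEVS, and the
functions mark_unusable(), new_driver(), may_send_io() and reopen_ok() are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NR_OPEN_DEVS	4	/* assumed value */

enum drv_state { DRV_OK, DRV_UNUSABLE };

struct dev_state {
	enum drv_state state;
	int endpoint;			/* current driver instance (hypothetical) */
	int open_minors[NR_OPEN_DEVS];	/* minors opened by this file server */
	int nr_open;			/* 0 for raw I/O: VFS did the opening */
};

int reopen_calls;	/* counts reopen attempts, for the example only */

/* Stand-in for sending an open request to the driver; always succeeds. */
int reopen_ok(int minor)
{
	(void)minor;
	reopen_calls++;
	return 1;
}

/* Give up on the driver, but deliberately keep the open-minor record. */
void mark_unusable(struct dev_state *s)
{
	assert(s != NULL);
	s->state = DRV_UNUSABLE;
}

/*
 * A new driver instance was announced, possibly after we gave up. Every
 * recorded minor is reopened before I/O is allowed again. Returns 1 if the
 * driver is usable afterwards, 0 otherwise.
 */
int new_driver(struct dev_state *s, int endpoint, int (*reopen)(int minor))
{
	int i;

	assert(s != NULL && s->nr_open <= NR_OPEN_DEVS);

	s->endpoint = endpoint;
	for (i = 0; i < s->nr_open; i++) {
		if (!reopen(s->open_minors[i])) {
			s->state = DRV_UNUSABLE;
			return 0;
		}
	}
	s->state = DRV_OK;
	return 1;
}

/* I/O may be sent only while the driver is considered usable. */
int may_send_io(const struct dev_state *s)
{
	return s->state == DRV_OK;
}
```

With an empty record, as in the raw I/O case where VFS has already reopened
the device, new_driver() makes the driver usable again without sending any
open requests; with a non-empty record, no I/O is allowed until each minor
has been reopened.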