.\"	$NetBSD: disk.9,v 1.34 2009/12/30 01:37:17 jnemeth Exp $
.\"
.\" Copyright (c) 1995, 1996 Jason R. Thorpe.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed for the NetBSD Project
.\"	by Jason R. Thorpe.
.\" 4. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.Nm disk_begindetach ,
.Nd generic disk framework
.Fn disk_init "struct disk *" "const char *name" "const struct dkdriver *driver"
.Fn disk_attach "struct disk *"
.Fn disk_begindetach "struct disk *" "int (*lastclose)(device_t)" "device_t self" "int flags"
.Fn disk_detach "struct disk *"
.Fn disk_destroy "struct disk *"
.Fn disk_busy "struct disk *"
.Fn disk_unbusy "struct disk *" "long bcount" "int read"
.Fn disk_isbusy "struct disk *"
.Fn disk_find "const char *"
.Fn disk_blocksize "struct disk *" "int blocksize"
The
.Nx
generic disk framework is designed to provide flexible,
scalable, and consistent handling of disk state and metrics information.
The fundamental component of this framework is the
.Nm disk
structure, which is defined as follows:
.Bd -literal
struct disk {
	TAILQ_ENTRY(disk) dk_link;	/* link in global disklist */
	const char	*dk_name;	/* disk name */
	prop_dictionary_t dk_info;	/* reference to disk-info dictionary */
	int		dk_bopenmask;	/* block devices open */
	int		dk_copenmask;	/* character devices open */
	int		dk_openmask;	/* composite (bopen|copen) */
	int		dk_state;	/* label state   ### */
	int		dk_blkshift;	/* shift to convert DEV_BSIZE to blks */
	int		dk_byteshift;	/* shift to convert bytes to blks */

	/*
	 * Metrics data; note that some metrics may have no meaning
	 * on certain types of disks.
	 */
	struct io_stats	*dk_stats;

	const struct dkdriver *dk_driver;	/* pointer to driver */

	/*
	 * Information required to be the parent of a disk wedge.
	 */
	kmutex_t	dk_rawlock;	/* lock on these fields */
	u_int		dk_rawopens;	/* # of opens of rawvp */
	struct vnode	*dk_rawvp;	/* vnode for the RAW_PART bdev */

	kmutex_t	dk_openlock;	/* lock on these and openmask */
	u_int		dk_nwedges;	/* # of configured wedges */
					/* all wedges on this disk */
	LIST_HEAD(, dkwedge_softc) dk_wedges;

	/*
	 * Disk label information.  Storage for the in-core disk label
	 * must be dynamically allocated, otherwise the size of this
	 * structure becomes machine-dependent.
	 */
	daddr_t		dk_labelsector;	/* sector containing label */
	struct disklabel *dk_label;	/* label */
	struct cpu_disklabel *dk_cpulabel;
};
.Ed
.Pp
The system maintains a global linked-list of all disks attached to the
system.
This list, called
.Nm disklist ,
may grow or shrink over time as disks are dynamically added and removed
from the system.
Drivers which currently make use of the detachment
capability of the framework are the
pseudo-device drivers.
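.Pp
As a rough illustration of the list handling described above, the following
user-space sketch keeps a global tail queue of disks, adjusts a disk count
on attach and detach, and looks a disk up by name.
It is not the kernel implementation: every toy_ name is hypothetical,
locking is omitted, and assert() stands in for a kernel panic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, cut-down stand-in for struct disk. */
struct toy_disk {
	TAILQ_ENTRY(toy_disk) dk_link;	/* link in the global list */
	const char *dk_name;		/* disk name */
};

/* Global list of attached disks and the matching count. */
static TAILQ_HEAD(, toy_disk) toy_disklist =
    TAILQ_HEAD_INITIALIZER(toy_disklist);
static int toy_disk_count;

/* Insert the disk into the list and increment the disk count. */
static void
toy_attach(struct toy_disk *dk)
{
	TAILQ_INSERT_TAIL(&toy_disklist, dk, dk_link);
	toy_disk_count++;
}

/* Remove the disk from the list and decrement the disk count. */
static void
toy_detach(struct toy_disk *dk)
{
	TAILQ_REMOVE(&toy_disklist, dk, dk_link);
	toy_disk_count--;
	assert(toy_disk_count >= 0);	/* the kernel would panic here */
}

/* Return the disk with the given name, or NULL if there is none. */
static struct toy_disk *
toy_find(const char *name)
{
	struct toy_disk *dk;

	assert(name != NULL);
	TAILQ_FOREACH(dk, &toy_disklist, dk_link) {
		if (strcmp(dk->dk_name, name) == 0)
			return dk;
	}
	return NULL;
}
```

A caller would attach each toy_disk once, look it up with toy_find(),
and detach it before the storage holding it goes away.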
.Pp
The following is a brief description of each function in the framework:
.Bl -tag -width ".Fn disk_blocksize"
.It Fn disk_init
Initialize the disk structure.
.It Fn disk_attach
Attach a disk; allocate storage for the disklabel, set the
timestamp, insert the disk into the disklist, and increment the
system disk count.
.It Fn disk_begindetach
Check whether the disk is open, and if not, return 0.
If the disk is open, and
Otherwise, call the provided
.Fa lastclose
routine and return its exit code.
.It Fn disk_detach
Detach a disk; free storage for the disklabel, remove the disk
from the disklist, and decrement the system disk count.
If the count drops below zero, panic.
.It Fn disk_destroy
Release resources used by the disk structure when it is no longer
required.
.It Fn disk_busy
Increment the disk's busy counter.
If this counter goes from 0 to 1, set the timestamp corresponding to
this transfer.
.It Fn disk_unbusy
Decrement a disk's busy counter.
If the count drops below zero, panic.
Get the current time, subtract the disk's timestamp from it, and add
the difference to the disk's running total.
Set the disk's timestamp to the current time.
If the provided byte count is greater than 0, add it to the disk's
running total and increment the number of transfers performed by the disk.
The argument
.Fa read
specifies the direction of I/O;
if non-zero it means reading from the disk,
otherwise it means writing to the disk.
.It Fn disk_isbusy
Return true
if the disk is marked as busy and false if it is not.
.It Fn disk_find
Return a pointer to the disk structure corresponding to the name provided,
or NULL if the disk does not exist.
.It Fn disk_blocksize
Initialize the
.Fa dk_blkshift
and
.Fa dk_byteshift
members of
.Fa struct disk
with suitable values derived from the supplied physical blocksize.
It is only necessary to call this function if the device's physical blocksize
is not
.Dv DEV_BSIZE .
.El
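.Pp
The functions typically called by device drivers are
.Fn disk_init ,
.Fn disk_attach ,
.Fn disk_begindetach ,
.Fn disk_detach ,
.Fn disk_destroy ,
.Fn disk_busy ,
.Fn disk_unbusy ,
and
.Fn disk_blocksize .
The function
.Fn disk_find
is provided as a utility function.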
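.Pp
To make the two shift values concrete, the sketch below derives them from a
physical blocksize in user space.
It assumes that DEV_BSIZE is 512 and that the blocksize is a power of two no
smaller than DEV_BSIZE; the toy_ names are hypothetical and the kernel may
compute the values differently.

```c
#include <assert.h>

#define TOY_DEV_BSIZE	512	/* assumed value of DEV_BSIZE */

/* Hypothetical holder for the two shift members of struct disk. */
struct toy_shifts {
	int dk_blkshift;	/* shift to convert DEV_BSIZE to blks */
	int dk_byteshift;	/* shift to convert bytes to blks */
};

/* Base-2 logarithm of a value that must be a power of two. */
static int
toy_log2(unsigned int v)
{
	int shift = 0;

	assert(v != 0 && (v & (v - 1)) == 0);
	while (v > 1) {
		v >>= 1;
		shift++;
	}
	return shift;
}

/* Derive both shifts from the physical blocksize in bytes. */
static void
toy_blocksize(struct toy_shifts *dk, int blocksize)
{
	assert(blocksize >= TOY_DEV_BSIZE);
	dk->dk_blkshift = toy_log2((unsigned int)blocksize / TOY_DEV_BSIZE);
	dk->dk_byteshift = toy_log2((unsigned int)blocksize);
}
```

With a 2048-byte blocksize this gives a block shift of 2 and a byte shift
of 11, so a count of DEV_BSIZE units or of bytes can be turned into
physical blocks with a single right shift.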
.Pp
The following ioctls should be implemented by disk drivers:
.Bl -tag -width "xxxxxx"
.It Dv DIOCGDINFO "struct disklabel"
Get disklabel.
.It Dv DIOCSDINFO "struct disklabel"
Set in-memory disklabel.
.It Dv DIOCWDINFO "struct disklabel"
Set in-memory disklabel and write on-disk disklabel.
.It Dv DIOCGPART "struct partinfo"
Get partition information.
This is used internally.
.It Dv DIOCRFORMAT "struct format_op"
Read format.
.It Dv DIOCWFORMAT "struct format_op"
Write format.
.It Dv DIOCSSTEP "int"
Set step rate.
.It Dv DIOCSRETRIES "int"
Set number of retries.
.It Dv DIOCKLABEL "int"
Specify whether to keep or drop the in-memory disklabel
when the device is closed.
.It Dv DIOCWLABEL "int"
Enable or disable writing to the part of the disk that contains the label.
.It Dv DIOCSBAD "struct dkbad"
Set the kernel's bad sector information.
.It Dv DIOCEJECT "int"
Eject removable disk.
.It Dv DIOCLOCK "int"
Lock or unlock disk pack.
For devices with removable media, locking is intended to prevent
the operator from removing the media.
.It Dv DIOCGDEFLABEL "struct disklabel"
Get default label.
.It Dv DIOCGCACHE "int"
Get status of disk read and write caches.
The result is a bitmask containing the following values:
.Bl -tag -width DKCACHE_RCHANGE
.It Dv DKCACHE_READ
Read cache enabled.
.It Dv DKCACHE_WRITE
Write(back) cache enabled.
.It Dv DKCACHE_RCHANGE
Read cache enable is changeable.
.It Dv DKCACHE_WCHANGE
Write cache enable is changeable.
.It Dv DKCACHE_SAVE
Cache parameters may be saved, so that they persist across reboots
or device detach/attach cycles.
.El
.It Dv DIOCSCACHE "int"
Set status of disk read and write caches.
The input is a bitmask in the same format as used for
.Dv DIOCGCACHE .
.It Dv DIOCCACHESYNC "int"
Synchronise the disk cache.
This causes information in the disk's write cache (if any)
to be flushed to stable storage.
The argument specifies whether or not to force a flush even if
the kernel believes that there is no outstanding data.
.It Dv DIOCBSLIST "struct disk_badsecinfo"
Get bad sector list.
.It Dv DIOCBSFLUSH "void"
Flush bad sector list.
.It Dv DIOCAWEDGE "struct dkwedge_info"
Add wedge.
.It Dv DIOCGWEDGEINFO "struct dkwedge_info"
Get wedge information.
.It Dv DIOCDWEDGE "struct dkwedge_info"
Delete wedge.
.It Dv DIOCLWEDGES "struct dkwedge_list"
List wedges.
.It Dv DIOCGSTRATEGY "struct disk_strategy"
Get disk buffer queue strategy.
.It Dv DIOCSSTRATEGY "struct disk_strategy"
Set disk buffer queue strategy.
.It Dv DIOCGDISKINFO "struct plistref"
Get disk-info dictionary.
.El
.Sh USING THE FRAMEWORK
This section includes a description of basic use of the framework
and example usage of its functions.
Actual implementation of a device driver which uses the framework
may vary.
.Pp
Each device in the system uses a
.Dq softc
structure which contains autoconfiguration and state information for that
device.
In the case of disks, the softc should also contain one instance
of the disk structure, e.g.:
.Bd -literal
struct foo_softc {
	device_t	sc_dev;		/* generic device information */
	struct disk	sc_dk;		/* generic disk information */
	kmutex_t	sc_dk_mtx;	/* guards disk_busy()/disk_unbusy() */
	/* ... other driver-specific information ... */
};
.Ed
.Pp
In order for the system to gather metrics data about a disk, the disk must
be registered with the system.
The
.Fn disk_attach
routine performs all of the functions currently required to register a disk
with the system including allocation of disklabel storage space,
recording of the time since boot that the disk was attached, and insertion
into the disklist.
Note that since this function allocates storage space for the disklabel,
it must be called before the disklabel is read from the media or used in
any other way.
Before
.Fn disk_attach
is called, a portion of the disk structure must be initialized with
data specific to that disk.
For example, in the
.Dq foo
disk driver, the following would be performed in the autoconfiguration
.Dq attach
routine:
.Bd -literal
void
fooattach(device_t parent, device_t self, void *aux)
{
	struct foo_softc *sc = device_private(self);
	/* ... other driver-specific setup ... */

	/* Initialize and attach the disk structure. */
	disk_init(\*[Am]sc-\*[Gt]sc_dk, device_xname(self), \*[Am]foodkdriver);
	disk_attach(\*[Am]sc-\*[Gt]sc_dk);

	/* Read geometry and fill in pertinent parts of disklabel. */
	/* ... */
	disk_blocksize(\*[Am]sc-\*[Gt]sc_dk, bytes_per_sector);
}
.Ed
.Pp
The
.Nm foodkdriver
above is the disk's
.Dq driver
switch.
This switch currently includes a pointer to the disk's
.Dq strategy
routine.
This switch needs to have global scope and should be initialized as follows:
.Bd -literal
void	foostrategy(struct buf *);

const struct dkdriver foodkdriver = {
	.d_strategy = foostrategy,
};
.Ed
.Pp
Once the disk is attached, metrics may be gathered on that disk.
In order to gather metrics data, the driver must tell the framework when
the disk starts and stops operations.
This functionality is provided by the
.Fn disk_busy
and
.Fn disk_unbusy
routines.
Because the disk structure
is part of the device driver's private data, it needs to be guarded.
Mutual exclusion must be provided by the driver.
.Pp
The
.Fn disk_busy
routine should be called immediately before a command to the disk is
sent, e.g.:
.Bd -literal
void
foostart(struct foo_softc *sc)
{
	/* ... local declarations ... */

	/* Get buffer from drive's transfer queue. */
	/* ... */

	/* Build command to send to drive. */
	/* ... */

	/* Tell the disk framework we're going busy. */
	mutex_enter(\*[Am]sc-\*[Gt]sc_dk_mtx);
	disk_busy(\*[Am]sc-\*[Gt]sc_dk);
	mutex_exit(\*[Am]sc-\*[Gt]sc_dk_mtx);

	/* Send command to the drive. */
	/* ... */
}
.Ed
.Pp
When
.Fn disk_busy
is called, a timestamp is taken if the disk's busy counter moves from
0 to 1, indicating the disk has gone from an idle to non-idle state.
At the end of a transaction, the
.Fn disk_unbusy
routine should be called.
This routine performs some consistency checks,
such as ensuring that the calls to
.Fn disk_busy
and
.Fn disk_unbusy
are balanced.
This routine also performs the actual metrics calculation.
A timestamp is taken and the difference from the timestamp taken in
.Fn disk_busy
is added to the disk's total running time.
The disk's timestamp is then updated in case there is more than one
pending transfer on the disk.
A byte count is also added to the disk's running total, and if greater than
zero, the number of transfers the disk has performed is incremented.
The third argument
.Fa read
specifies the direction of I/O;
if non-zero it means reading from the disk,
otherwise it means writing to the disk.
.Bd -literal
void
foodone(struct foo_xfer *xfer)
{
	struct foo_softc *sc = (struct foo_softc *)xfer-\*[Gt]xf_softc;
	struct buf *bp = xfer-\*[Gt]xf_buf;
	long nbytes;

	/*
	 * Get number of bytes transferred.  If there is no buf
	 * associated with the xfer, we are being called at the
	 * end of a non-I/O command.
	 */
	if (bp == NULL)
		nbytes = 0;
	else
		nbytes = bp-\*[Gt]b_bcount - bp-\*[Gt]b_resid;

	/* Notify the disk framework that we've completed the transfer. */
	mutex_enter(\*[Am]sc-\*[Gt]sc_dk_mtx);
	disk_unbusy(\*[Am]sc-\*[Gt]sc_dk, nbytes,
	    bp != NULL ? bp-\*[Gt]b_flags \*[Am] B_READ : 0);
	mutex_exit(\*[Am]sc-\*[Gt]sc_dk_mtx);
}
.Ed
.Pp
.Fn disk_isbusy
is used to get the status of a disk device; it returns true if the device is
currently busy and false if it is not.
Like
.Fn disk_busy
and
.Fn disk_unbusy ,
it requires explicit locking by the caller.
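.Pp
The accounting described in this section can be modelled in user space as
follows.
This is only a sketch under stated assumptions: the toy_ names are
hypothetical, time is passed in as a plain microsecond count, separate read
and write totals are assumed, locking is left to the caller, and assert()
stands in for a kernel panic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-disk metrics; not the kernel's struct io_stats. */
struct toy_stats {
	int busy;		/* busy counter */
	long long timestamp;	/* time of the last busy/unbusy event */
	long long time;		/* total time spent busy */
	long long rbytes;	/* bytes read */
	long long wbytes;	/* bytes written */
	long long rxfer;	/* read transfers */
	long long wxfer;	/* write transfers */
};

/* Going busy: take a timestamp only on the idle to non-idle transition. */
static void
toy_busy(struct toy_stats *st, long long now)
{
	if (st->busy++ == 0)
		st->timestamp = now;
}

/* Completing a transfer: account busy time, bytes and transfers. */
static void
toy_unbusy(struct toy_stats *st, long long now, long bcount, int read)
{
	assert(st->busy > 0);	/* unbalanced call; the kernel would panic */
	st->busy--;

	/* Add the time since the last event, then restart the clock. */
	st->time += now - st->timestamp;
	st->timestamp = now;

	if (bcount > 0) {
		if (read) {
			st->rbytes += bcount;
			st->rxfer++;
		} else {
			st->wbytes += bcount;
			st->wxfer++;
		}
	}
}

/* True while at least one transfer is outstanding. */
static bool
toy_isbusy(const struct toy_stats *st)
{
	return st->busy != 0;
}
```

Restarting the timestamp in toy_unbusy() is what keeps overlapping
transfers from being counted twice: each call only adds the time elapsed
since the previous busy or unbusy event.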
.Sh CODE REFERENCES
This section describes places within the
.Nx
source tree where actual
code implementing or using the disk framework can be found.
All pathnames are relative to
The disk framework itself is implemented within the file
.Pa sys/kern/subr_disk.c .
Data structures and function prototypes for the framework are located in
machine-independent SCSI disk and CD-ROM drivers use the
drivers use the detachment capability of the framework.
.Pa sys/dev/dm/device-mapper.c .
.Sh HISTORY
The
.Nx
generic disk framework appeared in
.Sh AUTHORS
The
.Nx
generic disk framework was architected and implemented by
.An Jason R. Thorpe
.Aq thorpej@NetBSD.org .