Fault injection capabilities infrastructure
===========================================

See also drivers/md/md-faulty.c and the "every_nth" module option for scsi_debug.

Available fault injection capabilities
--------------------------------------

o failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

o fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

o fail_futex

  injects futex deadlock and uaddr fault errors.

o fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())

o fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

o fail_function

  injects error return on specific functions, which are marked by the
  ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
  under /sys/kernel/debug/fail_function. No boot option supported.

Configure fault-injection capabilities behavior
-----------------------------------------------

The fault-inject-debugfs kernel module provides some debugfs entries for
runtime configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

	likelihood of failure injection, in percent.

	Note that one-failure-per-hundred is a very high error rate
	for some testcases. Consider setting probability=100 and configuring
	/sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

	specifies the interval between failures, for calls to
	should_fail() that pass all the other tests.

	Note that if you enable this, by setting interval>1, you will
	probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

	specifies how many times failures may happen at most.
	A value of -1 means "no limit".

- /sys/kernel/debug/fail*/space:

	specifies an initial resource "budget", decremented by "size"
	on each call to should_fail(,size). Failure injection is
	suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose:

	specifies the verbosity of the messages when a failure is
	injected. '0' means no messages; '1' will print only a single
	log line per failure; '2' will print a call trace too -- useful
	to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

	A value of 'N' disables filtering by process (default).
	A value of 'Y' limits failures to only processes indicated by
	/proc/<pid>/make-it-fail==1.

- /sys/kernel/debug/fail*/require-start:
- /sys/kernel/debug/fail*/require-end:
- /sys/kernel/debug/fail*/reject-start:
- /sys/kernel/debug/fail*/reject-end:

	specifies the range of virtual addresses tested during
	stacktrace walking. Failure is injected only if some caller
	in the walked stacktrace lies within the required range, and
	none lies within the rejected range.
	Default required range is [0,ULONG_MAX) (whole of virtual address space).
	Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

	specifies the maximum stacktrace depth walked during search
	for a caller within [require-start,require-end) OR
	[reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

	Format: { 'Y' | 'N' }
	default is 'N', setting it to 'Y' won't inject failures into
	highmem/user allocations.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

	Format: { 'Y' | 'N' }
	default is 'N', setting it to 'Y' will inject failures
	only into non-sleep allocations (GFP_ATOMIC allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

	specifies the minimum page allocation order for which
	failures are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

	Format: { 'Y' | 'N' }
	default is 'N', setting it to 'Y' will disable failure injections
	when dealing with private (address space) futexes.

- /sys/kernel/debug/fail_function/inject:

	Format: { 'function-name' | '!function-name' | '' }
	specifies the target function of error injection by name.
	If the function name is prefixed with '!', the given function is
	removed from the injection list. If nothing is specified (''),
	the injection list is cleared.

- /sys/kernel/debug/fail_function/injectable:

	(read only) shows error injectable functions and what type of
	error values can be specified. The error type will be one of
	the following:

	- NULL:	retval must be 0.
	- ERRNO: retval must be -1 to -MAX_ERRNO (-4095).
	- ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).

- /sys/kernel/debug/fail_function/<function-name>/retval:

	specifies the "error" return value to inject into the given
	function. This entry is created when the user specifies a new
	injection entry.
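
The three error types listed by "injectable" restrict which retval values make sense for a function. The rule can be sketched in user space as below. This is only an illustration of the description above, not the kernel's own check: the enum and function names are hypothetical, and MAX_ERRNO is assumed to be 4095 as in include/linux/err.h.

```c
#include <stdbool.h>

#define MAX_ERRNO 4095	/* assumption: same value as include/linux/err.h */

/* Hypothetical names mirroring the types shown by "injectable". */
enum err_type { ET_NULL, ET_ERRNO, ET_ERR_NULL };

/* True if retval lies in the range -1 .. -MAX_ERRNO. */
static bool is_errno_value(long retval)
{
	return retval <= -1 && retval >= -MAX_ERRNO;
}

/* Returns true if retval is acceptable for a function of the given type. */
bool retval_is_valid(enum err_type type, long retval)
{
	switch (type) {
	case ET_NULL:
		return retval == 0;
	case ET_ERRNO:
		return is_errno_value(retval);
	case ET_ERR_NULL:
		return retval == 0 || is_errno_value(retval);
	}
	return false;
}
```

Under this model a retval of -12 (-ENOMEM) would be accepted for an ERRNO or ERR_NULL function, but not for a NULL one.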

In order to inject faults while debugfs is not available (early boot time),
use a boot option of the following form:

	mmc_core.fail_request=<interval>,<probability>,<space>,<times>
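
The four comma-separated fields of such an option can be read with a single sscanf() call. The following user-space sketch shows the idea. struct boot_fault_opts and parse_fault_opts() are hypothetical names, and the field types are assumptions chosen so that times can hold -1.

```c
#include <stdio.h>

/* Hypothetical container for the four fields of the boot option. */
struct boot_fault_opts {
	unsigned long interval;
	unsigned long probability;
	int space;
	int times;
};

/*
 * Parse "<interval>,<probability>,<space>,<times>".
 * Returns 0 on success, -1 if fewer than four fields were read.
 */
int parse_fault_opts(const char *str, struct boot_fault_opts *out)
{
	int n = sscanf(str, "%lu,%lu,%d,%d", &out->interval,
		       &out->probability, &out->space, &out->times);

	return n == 4 ? 0 : -1;
}
```

With this reading, the string "1,100,0,-1" asks for a failure on every eligible call, with no space budget and no limit on the number of failures.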

- /proc/<pid>/fail-nth:
- /proc/self/task/<tid>/fail-nth:

	Writing an integer N to this file makes the N-th call in the task fail.
	Reading from this file returns an integer value. A value of '0' indicates
	that the fault set up by a previous write to this file was injected.
	A positive integer N indicates that the fault wasn't yet injected.
	Note that this file enables all types of faults (slab, futex, etc).
	This setting takes precedence over all other generic debugfs settings
	like probability, interval, times, etc. But per-capability settings
	(e.g. fail_futex/ignore-private) take precedence over it.

	This feature is intended for systematic testing of faults in a single
	system call. See an example below.

How to add new fault injection capability
-----------------------------------------

o #include <linux/fault-inject.h>

o define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

o provide a way to configure fault attributes

- boot option

  If you need to enable the fault injection capability from boot time, you can
  provide a boot option to configure it. There is a helper function for it:

	setup_fault_attr(attr, str);

- debugfs entries

  failslab, fail_page_alloc, and fail_make_request use this approach.
  Helper function:

	fault_create_debugfs_attr(name, parent, attr);

- module parameters

  If the scope of the fault injection capability is limited to a
  single kernel module, it is better to provide module parameters to
  configure the fault attributes.

o add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure.

	should_fail(attr, size);
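
The hook pattern can be tried out in user space with a stand-in for should_fail(). In the sketch below, struct demo_attr, demo_should_fail() and demo_alloc() are all hypothetical. The stand-in models only the interval, times and space attributes as this document describes them, treats probability as 100, and is not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for struct fault_attr. */
struct demo_attr {
	unsigned long interval;	/* fail every interval-th eligible call */
	int times;		/* remaining failures, -1 means no limit */
	long space;		/* budget consumed before failures start */
	unsigned long count;	/* eligible calls seen so far */
};

/* Simplified model of should_fail(); probability is treated as 100. */
static bool demo_should_fail(struct demo_attr *attr, long size)
{
	if (attr->times == 0)
		return false;
	if (attr->space > size) {
		attr->space -= size;
		return false;
	}
	if (attr->interval > 1) {
		attr->count++;
		if (attr->count % attr->interval)
			return false;
	}
	if (attr->times > 0)
		attr->times--;
	return true;
}

static struct demo_attr demo_fail_alloc = { .interval = 3, .times = 2 };

/* The hook: report a failure before doing the real work. */
void *demo_alloc(size_t size)
{
	if (demo_should_fail(&demo_fail_alloc, (long)size))
		return NULL;
	return malloc(size);
}
```

With interval set to 3 and times set to 2, the third and sixth calls to demo_alloc() return NULL and every later call succeeds, because the failure budget is used up.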

o Inject slab allocation failures into module init/exit code

#!/bin/bash

FAILTYPE=failslab
echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

# Run a command with /proc/self/make-it-fail set, so task-filter selects it.
faulty_system()
{
	bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
}

if [ $# -eq 0 ]; then
	echo "Usage: $0 modulename [ modulename ... ]"
	exit 1
fi

for m in "$@"; do
	faulty_system modprobe $m
	faulty_system modprobe -r $m
done

------------------------------------------------------------------------------

o Inject page allocation failures only for a specific module

#!/bin/bash

FAILTYPE=fail_page_alloc
module=$1

if [ -z "$module" ]; then
	echo "Usage: $0 <modulename>"
	exit 1
fi

if [ ! -d /sys/module/$module/sections ]; then
	echo Module $module is not loaded
	exit 1
fi

# Restrict injection to callers located in the module's text.
cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

echo "Injecting errors into the module $module... (interrupt to stop)"
sleep 1000000
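
The script above relies on the require range to confine failures to one module. The decision described for require-start/require-end and reject-start/reject-end can be modelled in user space as below. struct addr_range and trace_allows_fault() are hypothetical names, and the caller addresses are passed in as a plain array rather than taken from a real stack walk. The case of stacktrace-depth set to 0 is not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Half-open address range [start, end). Hypothetical helper type. */
struct addr_range {
	unsigned long start;
	unsigned long end;
};

static bool in_range(const struct addr_range *r, unsigned long addr)
{
	return r->start <= addr && addr < r->end;
}

/*
 * Walk at most 'depth' caller addresses. A fault is allowed only if some
 * caller lies in 'require' and no walked caller lies in 'reject'.
 */
bool trace_allows_fault(const unsigned long *callers, size_t n, size_t depth,
			const struct addr_range *require,
			const struct addr_range *reject)
{
	bool found = false;
	size_t i;

	for (i = 0; i < n && i < depth; i++) {
		if (in_range(reject, callers[i]))
			return false;
		if (in_range(require, callers[i]))
			found = true;
	}
	return found;
}
```

In this model, setting require to the span from the module's .text address to its .data address means that only call chains passing through the module's code are eligible for a failure, and a small stacktrace-depth can hide a matching caller that sits deeper in the chain.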

------------------------------------------------------------------------------

o Inject an open_ctree error during btrfs mount

#!/bin/bash

dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
DEVICE=$(losetup --show -f testfile.img)
mkfs.btrfs -f $DEVICE
mkdir -p tmpmnt

FAILTYPE=fail_function
FAILFUNC=open_ctree
echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
echo -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval	# -12 is -ENOMEM
echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 100 > /sys/kernel/debug/$FAILTYPE/probability
echo 0 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

if mount -t btrfs $DEVICE tmpmnt; then
	echo "mount unexpectedly succeeded"
	umount tmpmnt
else
	echo "mount failed as expected"
fi

# Clear the injection list, then clean up.
echo > /sys/kernel/debug/$FAILTYPE/inject
rmdir tmpmnt
losetup -d $DEVICE
rm testfile.img

Tool to run command with failslab or fail_page_alloc
----------------------------------------------------
In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh. Please run the command
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Run the command "make -C tools/testing/selftests/ run_tests" with slab
allocation failure injection.

	# ./tools/testing/fault-injection/failcmd.sh \
		-- make -C tools/testing/selftests/ run_tests

Same as above, except allow at most 100 failures instead of one.

	# ./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Same as above, except inject page allocation failures instead of slab
allocation failures.

	# env FAILCMD_TYPE=fail_page_alloc \
		./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
--------------------------------

The following code systematically faults the 1-st, 2-nd, 3-rd and so on
capabilities in the socketpair() system call.

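#include <sys/types.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	int i, err, res, fail_nth, fds[2];
	char buf[128];

	system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
	sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
	fail_nth = open(buf, O_RDWR);
	for (i = 1;; i++) {
		/* Arm the i-th fault, then run the system call under test. */
		sprintf(buf, "%d", i);
		write(fail_nth, buf, strlen(buf));
		res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
		err = errno;
		/* Reads back 0 if the fault was injected. */
		pread(fail_nth, buf, sizeof(buf), 0);
		if (res == 0) {
			close(fds[0]);
			close(fds[1]);
		}
		printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
		       res, err);
		if (atoi(buf))
			break;
	}
	return 0;
}
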
An example output:

1-th fault Y: res=-1/23
2-th fault Y: res=-1/23
3-th fault Y: res=-1/12
4-th fault Y: res=-1/12
5-th fault Y: res=-1/23
6-th fault Y: res=-1/23
7-th fault Y: res=-1/23
8-th fault Y: res=-1/12
9-th fault Y: res=-1/12
10-th fault Y: res=-1/12
11-th fault Y: res=-1/12
12-th fault Y: res=-1/12
13-th fault Y: res=-1/12
14-th fault Y: res=-1/12
15-th fault Y: res=-1/12
16-th fault N: res=0/12
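
The second number in each res= pair is the errno value recorded after socketpair(). The small helper below is a sketch with a hypothetical name. It maps the two values seen in this output to their symbolic names, assuming Linux errno numbering where ENOMEM is 12 and ENFILE is 23.

```c
#include <errno.h>

/* Map the errno values appearing in the sample output to their names. */
const char *fault_errno_name(int err)
{
	switch (err) {
	case ENOMEM:
		return "ENOMEM";	/* 12 on Linux */
	case ENFILE:
		return "ENFILE";	/* 23 on Linux */
	default:
		return "other";
	}
}
```

In the last output line res is 0, so the errno value printed there is not meaningful. It is simply left over from the previous, failed iteration.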