What:		/sys/fs/lustre/version
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the current running lustre version.

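All attributes below are plain newline-terminated text values. A minimal userspace sketch for reading them (the helper name is illustrative, not part of the ABI; the paths only exist on a node with the lustre module loaded):

```python
from pathlib import Path

def read_attr(path):
    """Read a single-value sysfs attribute; return None if it is absent."""
    p = Path(path)
    if not p.exists():
        return None
    # sysfs attributes are newline-terminated; strip the trailing whitespace
    return p.read_text().strip()

# e.g. read_attr("/sys/fs/lustre/version") on a node with lustre loaded
```
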
What:		/sys/fs/lustre/pinger
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows if the lustre module has pinger support.
		"on" means yes and "off" means no.

What:		/sys/fs/lustre/health_check
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows whether the current system state is believed to be
		"healthy", "NOT HEALTHY", or "LBUG" if lustre has
		experienced an internal assertion failure.

What:		/sys/fs/lustre/jobid_name
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Currently running job "name" for this node, to be transferred
		to Lustre servers for purposes of QoS and statistics gathering.
		Writing into this file changes the name; reading outputs the
		currently set value.

What:		/sys/fs/lustre/jobid_var
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Control file for the lustre "jobstats" functionality; write a
		value from the list below to change the mode:
		disable - disable job name reporting to the servers (default)
		procname_uid - form the job name from the currently running
			command name and numeric uid with a dot in between
		nodelocal - use the jobid_name value from above.

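The procname_uid mode above amounts to simple string formation; an illustrative sketch (the helper is made up for this note, and the exact format is assumed from the mode's name and description, not confirmed against the implementation):

```python
def procname_uid_jobid(comm, uid):
    # job name reported to the servers: command name and uid,
    # joined with a dot in between as described above
    return f"{comm}.{uid}"
```
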
What:		/sys/fs/lustre/timeout
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the "lustre timeout" variable, also known as
		obd_timeout in some older manuals. In the past obd_timeout
		was of paramount importance as the timeout value used
		everywhere, from which other timeouts were derived. These
		days it is much less important, as network timeouts are
		mostly determined by AT (adaptive timeouts).
		Unit: seconds, default: 100

What:		/sys/fs/lustre/max_dirty_mb
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the total amount of dirty cache (in megabytes)
		allowed across all mounted lustre filesystems.
		Since writeout of dirty pages in Lustre is somewhat expensive,
		allowing too many dirty pages might lead to performance
		degradation as the kernel desperately tries to find pages
		to free or write out.
		Default: 1/2 of RAM. Min value 4, max value 9/10 of RAM.

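The default and bounds above amount to a simple clamp; an illustrative sketch (function names are made up for this note, not part of lustre):

```python
def default_max_dirty_mb(ram_mb):
    # default: half of RAM, in megabytes
    return ram_mb // 2

def clamp_max_dirty_mb(requested_mb, ram_mb):
    # accepted values are clamped to [4, 9/10 of RAM]
    return max(4, min(requested_mb, ram_mb * 9 // 10))
```
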
What:		/sys/fs/lustre/debug_peer_on_timeout
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls if lnet debug information should be printed when
		an RPC timeout occurs.

What:		/sys/fs/lustre/dump_on_timeout
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls if the Lustre debug log should be dumped when an RPC
		timeout occurs. This is useful if your debug buffer typically
		rolls over by the time you notice RPC timeouts.

What:		/sys/fs/lustre/dump_on_eviction
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls if the Lustre debug log should be dumped when this
		client is evicted from one of the servers.
		This is useful if your debug buffer typically rolls over
		by the time you notice the eviction event.

What:		/sys/fs/lustre/at_min
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the minimum adaptive timeout in seconds. If you
		encounter a case where clients time out due to server-reported
		processing time being too short, you might consider increasing
		this value. One common case of this is when the underlying
		network has unpredictable long delays.

What:		/sys/fs/lustre/at_max
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the maximum adaptive timeout in seconds. If at_max
		is reached for an RPC, the RPC will time out.
		Some genuinely slow network hardware might warrant increasing
		this value.
		Setting this value to 0 disables the Adaptive Timeouts
		functionality and the old-style obd_timeout value is then used.

What:		/sys/fs/lustre/at_extra
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls how much extra time, in seconds, to request for
		requests still in processing. Normally a server-side
		parameter, it is also used on the client for responses to
		various LDLM ASTs that are handled by a special server thread
		on the client. This is a way for the servers to ask the
		clients not to time out a request that has reached the
		current servicing time estimate and to give it some more
		time.

What:		/sys/fs/lustre/at_early_margin
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls when to send the early reply for requests that are
		about to time out, as an offset to the estimated service time
		in seconds.

What:		/sys/fs/lustre/at_history
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls for how many seconds to remember the slowest events
		encountered by the adaptive timeouts code.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/blocksize
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Biggest blocksize on an object storage server for this
		filesystem.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/kbytestotal
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the total number of kilobytes of space on this
		filesystem.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/kbytesfree
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the total number of free kilobytes of space on this
		filesystem.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/kbytesavail
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the total number of free kilobytes of space on this
		filesystem actually available for use (taking into account
		per-client grants and filesystem reservations).

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/filestotal
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the total number of inodes on the filesystem.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/filesfree
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the estimated number of free inodes on the filesystem.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/client_type
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows whether this filesystem considers this client to be
		compute cluster-local or remote. Remote clients have
		additional uid/gid converting logic applied.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/fstype
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the filesystem type of the filesystem.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/uuid
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows this filesystem's superblock uuid.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/max_read_ahead_mb
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Sets the maximum number of megabytes of system memory to be
		given to the read-ahead cache.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/max_read_ahead_per_file_mb
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Sets the maximum number of megabytes to read ahead for a
		single file.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/max_read_ahead_whole_mb
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		For small reads, how many megabytes to actually request from
		the server as initial read-ahead.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/checksum_pages
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Enables or disables per-page checksums at the llite layer,
		before the pages are handed to the lower level for network
		transfer.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/stats_track_pid
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Limits gathering of Lustre vfs operation statistics to a
		single pid. 0 to track everything.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/stats_track_ppid
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Limits gathering of Lustre vfs operation statistics to a
		single ppid. 0 to track everything.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/stats_track_gid
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Limits gathering of Lustre vfs operation statistics to a
		single gid. 0 to track everything.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/statahead_max
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the maximum number of statahead requests to send
		when a sequential readdir+stat pattern is detected.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/statahead_agl
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls if AGL (async glimpse ahead - obtain object
		information from OSTs in parallel with the MDS during
		statahead) should be enabled.
		0 to disable, 1 to enable.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/lazystatfs
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls statfs(2) behaviour in the face of down servers.
		If 0, always wait for all servers to come online;
		if 1, ignore inactive servers.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/max_easize
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the maximum number of bytes file striping data could
		occupy in the current configuration of storage.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/default_easize
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the maximum observed size of file striping data seen
		by this filesystem client instance.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/xattr_cache
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the extended attributes client-side cache.
		1 to enable, 0 to disable.

What:		/sys/fs/lustre/llite/<fsname>-<uuid>/unstable_stats
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the number of pages that were sent and acknowledged by
		the server but are not yet committed, and are therefore still
		pinned in client memory even though no longer dirty.

What:		/sys/fs/lustre/ldlm/cancel_unused_locks_before_replay
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls if the client should replay unused locks during
		recovery. If a client tends to have a lot of unused locks in
		the LRU, recovery times might become prolonged.
		1 - just locally cancel unused locks (default)
		0 - replay unused locks.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/resource_count
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Displays the number of lock resources (objects on which
		individual locks are taken) currently allocated in this
		namespace.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/lock_count
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Displays the number of locks allocated in this namespace.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/lru_size
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls and displays the LRU size limit for unused locks for
		this namespace.
		0 - LRU size is unlimited, controlled by server resources
		positive number - number of locks to allow in the lock LRU
		list

What:		/sys/fs/lustre/ldlm/namespaces/<name>/lock_unused_count
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Displays the number of locks currently sitting in the LRU
		list of this namespace.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/lru_max_age
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Maximum number of milliseconds a lock could sit in the LRU
		list before the client would voluntarily cancel it as unused.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/early_lock_cancel
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the "early lock cancellation" feature on this
		namespace if supported by the server.
		When enabled, tries to preemptively cancel locks that would be
		cancelled by various operations and bundle the cancellation
		requests in the same RPC as the main operation, which results
		in significant speedups due to reduced lock-pingpong RPCs.
		0 - disabled
		1 - enabled (default)

What:		/sys/fs/lustre/ldlm/namespaces/<name>/pool/granted
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Displays the number of granted locks in this namespace.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/pool/grant_rate
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of locks granted in this namespace during the last
		time interval.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/pool/cancel_rate
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of lock cancellations in this namespace during the
		last time interval.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/pool/grant_speed
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Calculated speed of lock granting (grant_rate - cancel_rate)
		in this namespace.

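The grant_speed value above is derived directly from the two preceding counters; a trivial sketch of the relationship (the function name is illustrative):

```python
def grant_speed(grant_rate, cancel_rate):
    # positive: the namespace is accumulating granted locks;
    # negative: cancellations outpace new grants
    return grant_rate - cancel_rate
```
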
What:		/sys/fs/lustre/ldlm/namespaces/<name>/pool/grant_plan
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Estimated number of locks to be granted in the next time
		interval in this namespace.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/pool/limit
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the number of allowed locks in this pool.
		When lru_size is 0, this is the effective limit.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/pool/lock_volume_factor
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Multiplier for all lock volume calculations above.
		Default is 1. Increase it to make the client clean its lock
		LRU list for this namespace more aggressively.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/pool/server_lock_volume
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Calculated server lock volume.

What:		/sys/fs/lustre/ldlm/namespaces/<name>/pool/recalc_period
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the length of time between recalculations of the
		above values.

What:		/sys/fs/lustre/ldlm/services/ldlm_cbd/threads_min
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the minimum number of ldlm callback threads to start.

What:		/sys/fs/lustre/ldlm/services/ldlm_cbd/threads_max
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls the maximum number of ldlm callback threads to start.

What:		/sys/fs/lustre/ldlm/services/ldlm_cbd/threads_started
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows the actual number of ldlm callback threads running.

What:		/sys/fs/lustre/ldlm/services/ldlm_cbd/high_priority_ratio
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls what percentage of ldlm callback threads is dedicated
		to "high priority" incoming requests.

What:		/sys/fs/lustre/{obdtype}/{connection_name}/blocksize
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Blocksize on the backend filesystem for the service behind
		this obd device (or the biggest blocksize for compound
		devices like lov and lmv).

What:		/sys/fs/lustre/{obdtype}/{connection_name}/kbytestotal
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Total number of kilobytes of space on the backend filesystem
		for the service behind this obd (or the total amount for
		compound devices like lov and lmv).

What:		/sys/fs/lustre/{obdtype}/{connection_name}/kbytesfree
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of free kilobytes on the backend filesystem for the
		service behind this obd (or the total amount for compound
		devices like lov and lmv).

What:		/sys/fs/lustre/{obdtype}/{connection_name}/kbytesavail
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of kilobytes of free space on the backend filesystem
		for the service behind this obd (or the total amount for
		compound devices like lov and lmv) that is actually available
		for use (taking into account per-client and filesystem
		reservations).

What:		/sys/fs/lustre/{obdtype}/{connection_name}/filestotal
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of inodes on the backend filesystem for the service
		behind this obd (or the total amount for compound devices
		like lov and lmv).

What:		/sys/fs/lustre/{obdtype}/{connection_name}/filesfree
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of free inodes on the backend filesystem for the
		service behind this obd (or the total amount for compound
		devices like lov and lmv).

What:		/sys/fs/lustre/mdc/{connection_name}/max_pages_per_rpc
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Maximum number of readdir pages to fit into a single readdir
		RPC.

What:		/sys/fs/lustre/{mdc,osc}/{connection_name}/max_rpcs_in_flight
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Maximum number of parallel RPCs on the wire to allow on
		this connection. Increasing this number would help on
		higher-latency links, but has a chance of overloading a
		server if too many clients do this.

What:		/sys/fs/lustre/osc/{connection_name}/max_pages_per_rpc
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Maximum number of pages to fit into a single RPC.
		Typically bigger RPCs allow for better performance.
		Default: however many pages are needed to form 1M of data
		(256 pages for 4K-page platforms).

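The default above depends only on the platform page size; a sketch of the arithmetic (the helper name is made up for this note):

```python
def default_max_pages_per_rpc(page_size, rpc_bytes=1 << 20):
    # pages needed to hold 1M of data: 256 on 4K-page platforms
    return rpc_bytes // page_size
```
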
What:		/sys/fs/lustre/osc/{connection_name}/active
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls accessibility of this connection. If set to 0,
		all accesses fail immediately.

What:		/sys/fs/lustre/osc/{connection_name}/checksums
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls whether to checksum bulk RPC data over the wire.
		1: enable (default) ; 0: disable

What:		/sys/fs/lustre/osc/{connection_name}/contention_seconds
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls for how long to consider a file contended once
		indicated as such by the server.
		When a file is considered contended, all operations switch to
		synchronous lockless mode to avoid cache and lock pingpong.

What:		/sys/fs/lustre/osc/{connection_name}/cur_dirty_bytes
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Displays how many dirty bytes are presently in the cache for
		this target.

What:		/sys/fs/lustre/osc/{connection_name}/cur_grant_bytes
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows how many bytes we have as a "dirty cache" grant from
		the server. Writing a value smaller than the one shown
		releases some grant back to the server.
		The dirty cache grant is a way Lustre ensures that cached
		successful writes on the client do not end up discarded by
		the server due to lack of space later on.

What:		/sys/fs/lustre/osc/{connection_name}/cur_lost_grant_bytes
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Shows how many granted bytes were released to the server due
		to lack of write activity on this client.

What:		/sys/fs/lustre/osc/{connection_name}/grant_shrink_interval
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of seconds with no write activity for this target
		before the client starts releasing dirty grant back to the
		server.

What:		/sys/fs/lustre/osc/{connection_name}/destroys_in_flight
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of DESTROY RPCs currently in flight to this target.

What:		/sys/fs/lustre/osc/{connection_name}/lockless_truncate
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls whether lockless truncate RPCs are allowed to this
		target.
		Lockless truncate causes the server to perform the locking,
		which is beneficial if the truncate is not followed by a
		write.
		1: enable ; 0: disable (default)

What:		/sys/fs/lustre/osc/{connection_name}/max_dirty_mb
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls how much dirty data this client can accumulate
		for this target. This is orthogonal to the dirty grant and
		is a hard limit even if the server would allow a bigger
		dirty cache.
		While allowing a higher dirty cache is beneficial for write
		performance, flushing the write cache takes longer, and as
		such the node might be more prone to OOMs.
		Having this value set too low might result in not being able
		to send enough parallel WRITE RPCs.

What:		/sys/fs/lustre/osc/{connection_name}/resend_count
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Controls how many times to try and resend RPCs to this target
		that failed with a "recoverable" status, such as EAGAIN.

What:		/sys/fs/lustre/lov/{connection_name}/numobd
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of OSC targets managed by this LOV instance.

What:		/sys/fs/lustre/lov/{connection_name}/activeobd
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of OSC targets managed by this LOV instance that are
		actually active.

What:		/sys/fs/lustre/lmv/{connection_name}/numobd
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of MDC targets managed by this LMV instance.

What:		/sys/fs/lustre/lmv/{connection_name}/activeobd
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Number of MDC targets managed by this LMV instance that are
		actually active.

What:		/sys/fs/lustre/lmv/{connection_name}/placement
Contact:	"Oleg Drokin" <oleg.drokin@intel.com>
Description:
		Determines the policy of inode placement in case of multiple
		metadata servers:
		CHAR - based on a hash of the file name used at creation time
		NID - based on a hash of the creating client's network id.