.\" Copyright 2015 Nexenta Systems, Inc. All rights reserved.
.\" Copyright (c) 2002, Sun Microsystems, Inc. All Rights Reserved.
.\" The contents of this file are subject to the terms of the Common Development and Distribution License (the "License"). You may not use this file except in compliance with the License.
.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing. See the License for the specific language governing permissions and limitations under the License.
.\" When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner]
.TH KMEM_CACHE_CREATE 9F "Feb 18, 2015"
.SH NAME
kmem_cache_create, kmem_cache_alloc, kmem_cache_free, kmem_cache_destroy,
kmem_cache_set_move \- kernel memory cache allocator operations
.SH SYNOPSIS
.nf
#include <sys/types.h>
#include <sys/kmem.h>
.sp
\fBkmem_cache_t *\fR\fBkmem_cache_create\fR(\fBchar *\fR\fIname\fR, \fBsize_t\fR \fIbufsize\fR,
    \fBsize_t\fR \fIalign\fR, \fBint\fR (*\fIconstructor\fR)(void *, void *, int),
    \fBvoid\fR (*\fIdestructor\fR)(void *, void *), \fBvoid\fR (*\fIreclaim\fR)(void *),
    \fBvoid\fR *\fIprivate\fR, \fBvoid\fR *\fIvmp\fR, \fBint\fR \fIcflags\fR);
.sp
\fBvoid\fR \fBkmem_cache_destroy\fR(\fBkmem_cache_t\fR *\fIcp\fR);
.sp
\fBvoid *\fR\fBkmem_cache_alloc\fR(\fBkmem_cache_t\fR *\fIcp\fR, \fBint\fR \fIkmflag\fR);
.sp
\fBvoid\fR \fBkmem_cache_free\fR(\fBkmem_cache_t\fR *\fIcp\fR, \fBvoid\fR *\fIobj\fR);
.sp
\fBvoid\fR \fBkmem_cache_set_move\fR(\fBkmem_cache_t\fR *\fIcp\fR, \fBkmem_cbrc_t\fR (*\fImove\fR)(\fBvoid\fR *,
    \fBvoid\fR *, \fBsize_t\fR, \fBvoid\fR *));
.sp
[Synopsis for callback functions:]
.sp
\fBint\fR (*\fIconstructor\fR)(\fBvoid\fR *\fIbuf\fR, \fBvoid\fR *\fIuser_arg\fR, \fBint\fR \fIkmflags\fR);
.sp
\fBvoid\fR (*\fIdestructor\fR)(\fBvoid\fR *\fIbuf\fR, \fBvoid\fR *\fIuser_arg\fR);
.sp
\fBkmem_cbrc_t\fR (*\fImove\fR)(\fBvoid\fR *\fIold\fR, \fBvoid\fR *\fInew\fR, \fBsize_t\fR \fIbufsize\fR,
    \fBvoid\fR *\fIuser_arg\fR);
.fi
.SH INTERFACE LEVEL
Solaris DDI specific (Solaris DDI)
.SH PARAMETERS
The parameters for the \fBkmem_cache_*\fR functions are as follows:
.TP
\fB\fIname\fR\fR
Descriptive name of a \fBkstat\fR(9S) structure of class \fBkmem_cache\fR.
Names longer than 31 characters are truncated.
.TP
\fB\fIbufsize\fR\fR
Size of the objects the cache manages.
.TP
\fB\fIalign\fR\fR
Required object alignment.
.TP
\fB\fIconstructor\fR\fR
Pointer to an object constructor function. Parameters are defined below.
.TP
\fB\fIdestructor\fR\fR
Pointer to an object destructor function. Parameters are defined below.
.TP
\fB\fIreclaim\fR\fR
Drivers should pass \fBNULL\fR.
.TP
\fB\fIprivate\fR\fR
Pass-through argument for constructor/destructor.
.TP
\fB\fIvmp\fR\fR
Drivers should pass \fBNULL\fR.
.TP
\fB\fIkmflag\fR\fR
Allocation flags. Possible values are:
.RS
.TP
\fB\fBKM_SLEEP\fR\fR
Allow sleeping (blocking) until memory is available.
.TP
\fB\fBKM_NOSLEEP\fR\fR
Return \fBNULL\fR immediately if memory is not available.
.TP
\fB\fBKM_PUSHPAGE\fR\fR
Allow the allocation to use reserved memory.
.RE
.TP
\fB\fIobj\fR\fR
Pointer to the object allocated by \fBkmem_cache_alloc()\fR.
.TP
\fB\fImove\fR\fR
Pointer to an object relocation function. Parameters are defined below.
.LP
The parameters for the callback constructor function are as follows:
.TP
\fB\fBvoid *\fIbuf\fR\fR\fR
Pointer to the object to be constructed.
.TP
\fB\fBvoid *\fIuser_arg\fR\fR\fR
The \fIprivate\fR parameter from the call to \fBkmem_cache_create()\fR; it is
typically a pointer to the soft-state structure.
.TP
\fB\fBint \fIkmflags\fR\fR\fR
Propagated \fIkmflag\fR values.
.LP
The parameters for the callback destructor function are as follows:
.TP
\fB\fBvoid *\fIbuf\fR\fR\fR
Pointer to the object to be deconstructed.
.TP
\fB\fBvoid *\fIuser_arg\fR\fR\fR
The \fIprivate\fR parameter from the call to \fBkmem_cache_create()\fR; it is
typically a pointer to the soft-state structure.
.LP
The parameters for the callback \fBmove()\fR function are as follows:
.TP
\fB\fBvoid *\fIold\fR\fR\fR
Pointer to the object to be moved.
.TP
\fB\fBvoid *\fInew\fR\fR\fR
Pointer to the object that serves as the copy destination for the contents of
the object pointed to by \fIold\fR.
.TP
\fB\fBsize_t \fIbufsize\fR\fR\fR
Size of the object to be moved.
.TP
\fB\fBvoid *\fIuser_arg\fR\fR\fR
The \fIprivate\fR parameter from the call to \fBkmem_cache_create()\fR; it is
typically a pointer to the soft-state structure.
.SH DESCRIPTION
In many cases, the cost of initializing and destroying an object exceeds the
cost of allocating and freeing memory for it. The functions described here
address this condition.
.LP
Object caching is a technique for dealing with objects that:
.IP \(bu 3
are frequently allocated and freed, and
.IP \(bu 3
have significant setup and initialization costs.
.LP
The idea is to allow the allocator and its clients to cooperate to preserve the
invariant portion of an object's initial state, or constructed state, between
uses, so it does not have to be destroyed and re-created every time the object
is used. For example, an object containing a mutex only needs to have
\fBmutex_init()\fR applied once, the first time the object is allocated. The
object can then be freed and reallocated many times without incurring the
expense of \fBmutex_destroy()\fR and \fBmutex_init()\fR each time. An object's
embedded locks, condition variables, reference counts, lists of other objects,
and read-only data all generally qualify as constructed state. The essential
requirement is that the client must free the object (using
\fBkmem_cache_free()\fR) in its constructed state. The allocator cannot enforce
this, so programming errors will lead to hard-to-find bugs.
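.LP
The constructed-state idea can be illustrated outside the kernel. The following
is a minimal userland sketch, not the kmem implementation:
\fBtoy_cache_alloc()\fR, \fBtoy_cache_free()\fR, \fBfoo_ctor()\fR, and the
\fBlock_ready\fR field (standing in for an initialized mutex) are hypothetical
names invented for illustration. It shows the construction cost being paid once
while the object is allocated and freed repeatedly.
.sp
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object; lock_ready stands in for an initialized mutex. */
struct foo {
	int lock_ready;		/* constructed state, set by foo_ctor() */
	int refcnt;		/* constructed state, 0 whenever free */
	struct foo *next_free;	/* link used only by the toy cache */
};

int ctor_calls;			/* how often construction cost was paid */
struct foo *free_list;		/* constructed objects awaiting reuse */

void
foo_ctor(struct foo *f)
{
	f->lock_ready = 1;
	f->refcnt = 0;
	ctor_calls++;
}

/* Reuse a cached, constructed object; construct only on a miss. */
struct foo *
toy_cache_alloc(void)
{
	struct foo *f = free_list;

	if (f != NULL) {
		free_list = f->next_free;
		return (f);
	}
	f = malloc(sizeof (*f));
	if (f != NULL)
		foo_ctor(f);
	return (f);
}

/* The caller must hand the object back in its constructed state. */
void
toy_cache_free(struct foo *f)
{
	assert(f->lock_ready == 1 && f->refcnt == 0);
	f->next_free = free_list;
	free_list = f;
}
```
.fi
.LP
In this sketch, allocating, freeing, and allocating again returns the same
object with \fBctor_calls\fR still equal to 1, provided the caller restored
\fBrefcnt\fR to 0 before the free.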
.LP
A driver should call \fBkmem_cache_create()\fR at the time of \fB_init\fR(9E)
or \fBattach\fR(9E), and call the corresponding \fBkmem_cache_destroy()\fR at
the time of \fB_fini\fR(9E) or \fBdetach\fR(9E).
.LP
\fBkmem_cache_create()\fR creates a cache of objects, each of size
\fIbufsize\fR bytes, aligned on an \fIalign\fR boundary. Drivers not requiring
a specific alignment can pass 0. \fIname\fR identifies the cache for statistics
and debugging. \fIconstructor\fR and \fIdestructor\fR convert plain memory into
objects and back again; \fIconstructor\fR can fail if it needs to allocate
memory but cannot. \fIprivate\fR is a parameter passed to the constructor and
destructor callbacks to support parameterized caches (for example, a pointer to
an instance of the driver's soft-state structure). To facilitate debugging,
\fBkmem_cache_create()\fR creates a \fBkstat\fR(9S) structure of class
\fBkmem_cache\fR and name \fIname\fR. It returns an opaque pointer to the
object cache.
.LP
\fBkmem_cache_alloc()\fR gets an object from the cache. The object will be in
its constructed state. \fIkmflag\fR has either \fBKM_SLEEP\fR or
\fBKM_NOSLEEP\fR set, indicating whether it is acceptable to wait for memory if
none is currently available.
.LP
A small pool of reserved memory is available to allow the system to progress
toward the goal of freeing additional memory while in a low memory situation.
The \fBKM_PUSHPAGE\fR flag enables use of this reserved memory pool on an
allocation. This flag can be used by drivers that implement \fBstrategy\fR(9E)
on memory allocations associated with a single I/O operation. The driver
guarantees that the I/O operation will complete (or time out) and, on
completion, that the memory will be returned. The \fBKM_PUSHPAGE\fR flag should
be used only in \fBkmem_cache_alloc()\fR calls. All allocations from a given
cache should be consistent in their use of the flag. A driver that adheres to
these restrictions can guarantee progress in a low memory situation without
resorting to complex private allocation and queuing schemes. If
\fBKM_PUSHPAGE\fR is specified, \fBKM_SLEEP\fR can also be used without causing
deadlock.
.LP
\fBkmem_cache_free()\fR returns an object to the cache. The object must be in
its constructed state.
.LP
\fBkmem_cache_destroy()\fR destroys the cache and releases all associated
resources. All allocated objects must have been previously freed.
.LP
\fBkmem_cache_set_move()\fR registers a function that the allocator may call to
move objects from sparsely allocated pages of memory so that the system can
reclaim pages that are tied up by the client. Since caching objects of the same
size and type already makes severe memory fragmentation unlikely, there is
generally no need to register such a function. The idea is to make it possible
to limit worst-case fragmentation in caches that exhibit a tendency to become
highly fragmented. Only clients that allocate a mix of long- and short-lived
objects from the same cache are prone to exhibit this tendency, making them
candidates for a \fBmove()\fR callback.
.LP
The \fBmove()\fR callback supplies the client with two addresses: the allocated
object that the allocator wants to move and a buffer selected by the allocator
for the client to use as the copy destination. The \fInew\fR parameter is an
allocated, constructed object ready to receive the contents of the \fIold\fR
parameter. The \fIbufsize\fR parameter supplies the size of the object, in case
a single move function handles multiple caches whose objects differ only in
size. Finally, the \fIprivate\fR parameter passed to the constructor and
destructor is also passed to the \fBmove()\fR callback.
.LP
Only the client knows about its own data and when it is a good time to move it.
The client cooperates with the allocator to return unused memory to the system,
and the allocator accepts this help at the client's convenience. When asked to
move an object, the client can respond with any of the following:
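.sp
.in +2
.nf
typedef enum kmem_cbrc {
    KMEM_CBRC_YES,
    KMEM_CBRC_NO,
    KMEM_CBRC_LATER,
    KMEM_CBRC_DONT_NEED,
    KMEM_CBRC_DONT_KNOW
} kmem_cbrc_t;
.fi
.in -2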
.LP
The client must not explicitly free either of the objects passed to the
\fBmove()\fR callback, since the allocator wants to free them directly to the
slab layer (bypassing the per-CPU magazine layer). The response tells the
allocator which of the two object parameters to free:
.TP
\fB\fBKMEM_CBRC_YES\fR\fR
The client moved the object; the allocator frees the \fIold\fR parameter.
.TP
\fB\fBKMEM_CBRC_NO\fR\fR
The client refused to move the object; the allocator frees the \fInew\fR
parameter (the unused copy destination).
.TP
\fB\fBKMEM_CBRC_LATER\fR\fR
The client is using the object and cannot move it now; the allocator frees the
\fInew\fR parameter (the unused copy destination). The client should use
\fBKMEM_CBRC_LATER\fR instead of \fBKMEM_CBRC_NO\fR if the object is likely to
become movable soon.
.TP
\fB\fBKMEM_CBRC_DONT_NEED\fR\fR
The client no longer needs the object; the allocator frees both the \fIold\fR
and \fInew\fR parameters. This response is the client's opportunity to be a
model citizen and give back as much as it can.
.TP
\fB\fBKMEM_CBRC_DONT_KNOW\fR\fR
The client does not know about the object because:
.RS
.IP \(bu 3
the client has just allocated the object and has not yet put it wherever it
expects to find known objects
.IP \(bu 3
the client has removed the object from wherever it expects to find known
objects and is about to free the object
.IP \(bu 3
the client has freed the object
.RE
.IP
In all of these cases, the allocator frees the \fInew\fR parameter (the unused
copy destination) and searches for the \fIold\fR parameter in the magazine
layer. If the object is found, it is removed from the magazine layer and freed
to the slab layer so that it will no longer tie up an entire page of memory.
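.LP
The response rules above can be sketched as a decision function. This is a
simplified userland illustration under stated assumptions, not a complete
\fBmove()\fR callback: the enum is redeclared locally so the sketch builds
outside the kernel, and \fBstruct obj\fR with its \fBknown\fR, \fBstale\fR,
\fBpinned\fR, and \fBbusy\fR fields is hypothetical. A real callback would also
need the client's own locking and would have to update every reference to the
old object before answering \fBKMEM_CBRC_YES\fR.
.sp
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for the kernel enum so the sketch builds in userland. */
typedef enum kmem_cbrc {
	KMEM_CBRC_YES,
	KMEM_CBRC_NO,
	KMEM_CBRC_LATER,
	KMEM_CBRC_DONT_NEED,
	KMEM_CBRC_DONT_KNOW
} kmem_cbrc_t;

/* Hypothetical client object; every field is an assumption. */
struct obj {
	int known;	/* linked where the client looks for its objects */
	int stale;	/* contents are no longer needed */
	int pinned;	/* can never be relocated */
	int busy;	/* in use right now, movable later */
	int payload;
};

kmem_cbrc_t
obj_move(void *old, void *new, size_t bufsize, void *user_arg)
{
	struct obj *o = old;

	(void) user_arg;		/* unused in this sketch */
	if (!o->known)
		return (KMEM_CBRC_DONT_KNOW);
	if (o->stale)
		return (KMEM_CBRC_DONT_NEED);
	if (o->pinned)
		return (KMEM_CBRC_NO);
	if (o->busy)
		return (KMEM_CBRC_LATER);
	/* Copy into the destination; the allocator then frees old. */
	memcpy(new, old, bufsize);
	return (KMEM_CBRC_YES);
}
```
.fi
.LP
Note that the function never frees \fIold\fR or \fInew\fR itself; it only
reports which case applies and leaves the freeing to the allocator.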
.LP
Any object passed to the \fBmove()\fR callback is guaranteed to have been
touched only by the allocator or by the client. Because memory patterns applied
by the allocator always set at least one of the two lowest order bits, the
bottom two bits of any pointer member (other than \fBchar *\fR or \fBshort
*\fR, which may not be 8-byte aligned on all platforms) are available to the
client for marking cached objects that the client is about to free. This way,
the client can recognize known objects in the \fBmove()\fR callback by the
unmarked (valid) pointer value.
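.LP
The pointer-marking technique can be sketched in portable C. The names below
(\fBstruct item\fR, \fBit_owner\fR, \fBPTR_DYING\fR, and the three helper
functions) are hypothetical and chosen only for illustration; the sketch
assumes the pointed-to type is at least 4-byte aligned so that the two low
bits of a valid pointer are zero.
.sp
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	PTR_DYING	((uintptr_t)0x1)	/* hypothetical "about to free" mark */
#define	PTR_LOWBITS	((uintptr_t)0x3)	/* the two spare low-order bits */

struct owner {
	int id;
};

struct item {
	struct owner *it_owner;	/* aligned pointer member used for marking */
};

/* Mark the item just before the client frees it. */
void
item_mark_dying(struct item *it)
{
	it->it_owner = (struct owner *)((uintptr_t)it->it_owner | PTR_DYING);
}

/* An item is recognized as known only if both low bits are clear. */
int
item_is_known(const struct item *it)
{
	return (((uintptr_t)it->it_owner & PTR_LOWBITS) == 0);
}

/* Recover the real pointer regardless of the mark. */
struct owner *
item_owner(const struct item *it)
{
	return ((struct owner *)((uintptr_t)it->it_owner & ~PTR_LOWBITS));
}
```
.fi
.LP
A \fBmove()\fR callback built this way would test \fBitem_is_known()\fR first
and answer \fBKMEM_CBRC_DONT_KNOW\fR for anything that fails the test.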
.LP
If the client refuses to move an object with either \fBKMEM_CBRC_NO\fR or
\fBKMEM_CBRC_LATER\fR, and that object later becomes movable, the client can
notify the allocator by calling \fBkmem_cache_move_notify()\fR. Alternatively,
the client can simply wait for the allocator to call back again with the same
object address. Responding \fBKMEM_CBRC_NO\fR even once or responding
\fBKMEM_CBRC_LATER\fR too many times for the same object makes the allocator
less likely to call back again for that object.
.LP
[Synopsis for notification function:]
.sp
.nf
\fBvoid\fR \fBkmem_cache_move_notify\fR(\fBkmem_cache_t\fR *\fIcp\fR, \fBvoid\fR *\fIobj\fR);
.fi
.LP
The parameters for the notification function are as follows:
.TP
\fB\fIcp\fR\fR
Pointer to the object cache.
.TP
\fB\fIobj\fR\fR
Pointer to the object that has become movable since an earlier refusal to move
it.
.SH CONTEXT
Constructors can be invoked during any call to \fBkmem_cache_alloc()\fR, and
will run in that context. Similarly, destructors can be invoked during any call
to \fBkmem_cache_free()\fR, and can also be invoked during
\fBkmem_cache_destroy()\fR. Therefore, the functions that a constructor or
destructor invokes must be appropriate in that context. Furthermore, the
allocator may also call the constructor and destructor on objects still under
its control without client involvement.
.LP
\fBkmem_cache_create()\fR and \fBkmem_cache_destroy()\fR must not be called
from interrupt context. \fBkmem_cache_create()\fR can also block for available
memory.
.LP
\fBkmem_cache_alloc()\fR can be called from interrupt context only if the
\fBKM_NOSLEEP\fR flag is set. It can be called from user or kernel context with
any valid flag.
.LP
\fBkmem_cache_free()\fR can be called from user, kernel, or interrupt context.
.LP
\fBkmem_cache_set_move()\fR is called from the same context as
\fBkmem_cache_create()\fR, immediately after \fBkmem_cache_create()\fR and
before allocating any objects from the cache.
.LP
The registered \fBmove()\fR callback is always invoked in the same global
callback thread dedicated to move requests, guaranteeing that no matter how
many clients register a \fBmove()\fR function, the allocator never tries to
move more than one object at a time. Neither the allocator nor the client can
be assumed to know the object's whereabouts at the time of the callback.
.SH EXAMPLES
\fBExample 1 \fRObject Caching
.LP
Consider the following data structure:
.sp
.in +2
.nf
struct foo {
    kmutex_t foo_lock;
    kcondvar_t foo_cv;
    struct bar *foo_barlist;
    int foo_refcnt;
};
.fi
.in -2
.LP
Assume that a \fBfoo\fR structure cannot be freed until there are no
outstanding references to it (\fBfoo_refcnt == 0\fR) and all of its pending
\fBbar\fR events (whatever they are) have completed (\fBfoo_barlist ==
NULL\fR). The life cycle of a dynamically allocated \fBfoo\fR would be
something like this:
.sp
.in +2
.nf
foo = kmem_alloc(sizeof (struct foo), KM_SLEEP);
mutex_init(&foo->foo_lock, ...);
cv_init(&foo->foo_cv, ...);
foo->foo_refcnt = 0;
foo->foo_barlist = NULL;
    use foo;
ASSERT(foo->foo_barlist == NULL);
ASSERT(foo->foo_refcnt == 0);
cv_destroy(&foo->foo_cv);
mutex_destroy(&foo->foo_lock);
kmem_free(foo, sizeof (struct foo));
.fi
.in -2
.LP
Notice that between each use of a \fBfoo\fR object we perform a sequence of
operations that constitutes nothing but expensive overhead. All of this
overhead (that is, everything other than \fBuse foo\fR above) can be eliminated
by object caching:
.sp
.in +2
.nf
int
foo_constructor(void *buf, void *arg, int kmflags)
{
    struct foo *foo = buf;
    mutex_init(&foo->foo_lock, ...);
    cv_init(&foo->foo_cv, ...);
    foo->foo_refcnt = 0;
    foo->foo_barlist = NULL;
    return (0);
}

void
foo_destructor(void *buf, void *arg)
{
    struct foo *foo = buf;
    ASSERT(foo->foo_barlist == NULL);
    ASSERT(foo->foo_refcnt == 0);
    cv_destroy(&foo->foo_cv);
    mutex_destroy(&foo->foo_lock);
}

user_arg = ddi_get_soft_state(foo_softc, instance);
(void) snprintf(buf, KSTAT_STRLEN, "foo%d_cache",
    ddi_get_instance(dip));
foo_cache = kmem_cache_create(buf,
    sizeof (struct foo), 0,
    foo_constructor, foo_destructor,
    NULL, user_arg,
    NULL, 0);
.fi
.in -2
.LP
To allocate, use, and free a \fBfoo\fR object:
.sp
.in +2
.nf
foo = kmem_cache_alloc(foo_cache, KM_SLEEP);
    use foo;
kmem_cache_free(foo_cache, foo);
.fi
.in -2
.LP
This makes \fBfoo\fR allocation fast, because the allocator will usually do
nothing more than fetch an already-constructed \fBfoo\fR from the cache.
\fBfoo_constructor\fR and \fBfoo_destructor\fR will be invoked only to populate
and drain the cache, respectively.
.LP
\fBExample 2 \fRRegistering a Move Callback
.LP
To register a \fBmove()\fR callback:
.sp
.in +2
.nf
object_cache = kmem_cache_create(...);
kmem_cache_set_move(object_cache, object_move);
.fi
.in -2
.SH RETURN VALUES
If successful, the constructor function must return \fB0\fR. If
\fBKM_NOSLEEP\fR is set and memory cannot be allocated without sleeping, the
constructor must return \fB-1\fR.
.LP
\fBkmem_cache_create()\fR returns a pointer to the allocated cache.
.LP
If successful, \fBkmem_cache_alloc()\fR returns a pointer to the allocated
object. If \fBKM_NOSLEEP\fR is set and memory cannot be allocated without
sleeping, \fBkmem_cache_alloc()\fR returns \fBNULL\fR.
.SH ATTRIBUTES
See \fBattributes\fR(5) for descriptions of the following attributes:
.sp
.TS
box;
c | c
l | l .
ATTRIBUTE TYPE	ATTRIBUTE VALUE
_
Interface Stability	Committed
.TE
.SH SEE ALSO
\fBcondvar\fR(9F), \fBkmem_alloc\fR(9F), \fBmutex\fR(9F), \fBkstat\fR(9S)
.LP
\fIWriting Device Drivers\fR
.LP
\fIThe Slab Allocator: An Object-Caching Kernel Memory Allocator\fR, Bonwick,
J.; USENIX Summer 1994 Technical Conference (1994).
.LP
\fIMagazines and vmem: Extending the Slab Allocator to Many CPUs and Arbitrary
Resources\fR, Bonwick, J. and Adams, J.; USENIX 2001 Technical Conference
(2001).
.SH NOTES
The constructor must be immediately reversible by the destructor, since the
allocator may call the constructor and destructor on objects still under its
control at any time without client involvement.
.LP
The constructor must respect the \fIkmflags\fR argument by forwarding it to
allocations made inside the \fIconstructor\fR, and must not \fBASSERT\fR
anything about the given flags.
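.LP
The flag-forwarding rule can be illustrated with a small userland sketch. The
allocator \fBtoy_alloc()\fR, the \fBTOY_SLEEP\fR and \fBTOY_NOSLEEP\fR flags
(stand-ins for \fBKM_SLEEP\fR and \fBKM_NOSLEEP\fR), the \fBmemory_tight\fR
knob, and the \fBfoo_buf\fR member are all hypothetical; only the shape of the
constructor is the point.
.sp
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define	TOY_SLEEP	0	/* stand-in for KM_SLEEP */
#define	TOY_NOSLEEP	1	/* stand-in for KM_NOSLEEP */

int memory_tight;		/* knob simulating a low-memory condition */

/* Hypothetical allocator: a no-sleep request fails when memory is tight. */
void *
toy_alloc(size_t size, int flags)
{
	if (memory_tight && (flags & TOY_NOSLEEP))
		return (NULL);
	return (malloc(size));
}

struct foo {
	char *foo_buf;		/* secondary allocation made at construction */
};

int
foo_constructor(void *buf, void *user_arg, int kmflags)
{
	struct foo *foo = buf;

	(void) user_arg;	/* unused in this sketch */
	/* Forward the caller's flags unchanged; never assume sleeping is OK. */
	foo->foo_buf = toy_alloc(64, kmflags);
	if (foo->foo_buf == NULL)
		return (-1);	/* report failure instead of blocking */
	return (0);
}
```
.fi
.LP
Because the constructor passes \fIkmflags\fR straight through, a no-sleep
caller gets a prompt failure under memory pressure while a sleeping caller
still succeeds.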
.LP
The user argument forwarded to the constructor must be fully operational before
it is passed to \fBkmem_cache_create()\fR.