The 'wmem' memory manager is Wireshark's memory management framework, replacing
the old 'emem' framework which was removed in Wireshark 2.0.

In order to make memory management easier and to reduce the probability of
memory leaks, Wireshark provides its own memory management API. This API is
implemented inside wsutil/wmem/ and provides memory pools and functions that make
it easy to manage memory even in the face of exceptions (which many dissector
functions can raise). Memory scopes for dissection are defined in epan/wmem_scopes.h.

Correct use of these functions will make your code faster, and greatly reduce
the chances that it will leak memory in exceptional cases.

Wmem was originally conceived in this email to the wireshark-dev mailing list:
https://lists.wireshark.org/archives/wireshark-dev/201210/msg00178.html

2. Usage for Consumers

If you're writing a dissector, or other "userspace" code, then using wmem
should be very similar to using malloc or g_malloc or whatever else you're used
to. All you need to do is include the header (epan/wmem_scopes.h) and optionally
get a handle to a memory pool (if you want to *create* a memory pool, see the
section "3. Usage for Producers" below).

A memory pool is an opaque pointer to an object of type wmem_allocator_t, and
it is the very first parameter passed to almost every call you make to wmem.
Other than that parameter (and the fact that functions are prefixed wmem_)
usage is very similar to GLib and other utility libraries. For example:

    wmem_alloc(myPool, 20);

allocates 20 bytes in the pool pointed to by myPool.

2.1 Memory Pool Lifetimes

Every memory pool should have a defined lifetime, or scope, after which all the
memory in that pool is unconditionally freed. When you choose to allocate memory
in a pool, you *must* be aware of its lifetime: if the lifetime is shorter than
you need, your code will contain use-after-free bugs; if the lifetime is longer
than you need, your code may contain undetectable memory leaks. In either case,
the risks outweigh the benefits.

If no pool exists whose lifetime matches the lifetime of your memory, you have
two options: create a new pool (see section 3 of this document) or use the NULL
pool. Any function that takes a pointer to a wmem_allocator_t can also be passed
NULL instead, in which case the memory is managed manually (just like malloc or
g_malloc). Memory allocated like this *must* be manually passed to wmem_free()
in order to prevent memory leaks (although such leaks will at least show up in
valgrind). Note that passing wmem-allocated memory directly to free() or
g_free() is not safe; the backing type of manually managed memory may be
changed without warning.

2.2 Wireshark Global Pools

Dissectors that include the wmem_scopes.h header file will have three pools
available to them automatically: pinfo->pool, wmem_file_scope() and
wmem_epan_scope(); there is also a wmem_packet_scope() for cases when the
`pinfo` argument is not accessible, but pinfo->pool should be preferred.

The pinfo pool is scoped to the dissection of each packet, meaning that any
memory allocated in it will be automatically freed at the end of the current
packet. The file pool is similarly scoped to the dissection of each file,
meaning that any memory allocated in it will be automatically freed when the
current capture file is closed.

NB: Using these pools outside of the appropriate scope (e.g. using the file
    pool when there isn't a file open) will trigger an assertion failure.
    See the comment in epan/wmem_scopes.c for details.

The epan pool is scoped to the library's lifetime - memory allocated in it is
not freed until epan_cleanup() is called, which is typically but not necessarily
at the very end of the program.

Certain allocations (such as AT_STRINGZ address allocations and anything that
might end up being passed to add_new_data_source) need their memory to stick
around a little longer than the usual packet scope - basically until the
next packet is dissected. This is, in fact, the scope of Wireshark's pinfo
structure, so the pinfo struct has a 'pool' member which is a wmem pool scoped
to the lifetime of the pinfo struct.

Full documentation for each function (parameters, return values, behaviours)
lives (or will live) in Doxygen format in the header files for those functions.
This is just an overview of which header files you should be looking at.

 - Basic memory management functions (wmem_alloc, wmem_realloc, wmem_free).

 - Utility functions for manipulating null-terminated C-style strings.
   Functions like strdup and strdup_printf.

 - A managed string object implementation, similar to std::string in C++.

2.4.3 Container Data Structures

 - A growable array (AKA vector) implementation.

 - A doubly-linked list implementation.

 - A hash map (AKA hash table) implementation.

 - A hash multimap (map that can store multiple values with the same key)
   implementation.

 - A queue implementation (first-in, first-out).

 - A stack implementation (last-in, first-out).

 - A balanced binary tree (red-black tree) implementation.

2.4.4 Miscellaneous Utilities

 - Misc. utility functions like memdup.

WARNING: You probably don't actually need these; use them only when you're
         sure you understand the dangers.

Sometimes (though hopefully rarely) it may be necessary to store data in a wmem
pool that requires additional cleanup before it is freed. For example, perhaps
you have a pointer to a file-handle that needs to be closed. In this case, you
can register a callback with the wmem_register_callback function
declared in wmem_user_cb.h. Every time the memory in a pool is freed, all
registered cleanup functions are called first.

Note that the order in which callbacks are called is not defined; you cannot
rely on a certain callback being called before or after another.

WARNING: Manually freeing or moving memory (with wmem_free or wmem_realloc)
         will NOT trigger any callbacks. It is an error to call either of
         those functions on memory if you have a callback registered to deal
         with the contents of that memory.
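
The "callbacks run before the memory goes away" rule can be illustrated with a
small stand-alone model. This is only a sketch under stated assumptions: it
does not use wmem at all, every name in it (toy_pool, toy_pool_alloc,
toy_pool_register_callback, toy_pool_free_all, toy_handle) is hypothetical,
and its callbacks are one-shot, which is a simplification made here rather
than a statement about how wmem behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_CALLBACKS 8

/* Hypothetical callback type for this sketch (not wmem's own type). */
typedef void (*toy_cleanup_cb)(void *user_data);

/* Block header; the max_align_t member (C11) keeps the payload aligned. */
typedef union toy_block {
    union toy_block *next;
    max_align_t      align;
} toy_block;

/* Toy pool: a list of live blocks plus a small table of callbacks. */
typedef struct {
    toy_block     *blocks;
    toy_cleanup_cb cbs[TOY_MAX_CALLBACKS];
    void          *cb_data[TOY_MAX_CALLBACKS];
    size_t         n_cbs;
} toy_pool;

static void *toy_pool_alloc(toy_pool *pool, size_t size)
{
    toy_block *block = malloc(sizeof *block + size);
    if (block == NULL)
        return NULL;
    block->next = pool->blocks;
    pool->blocks = block;
    return block + 1;   /* payload starts right after the header */
}

/* Returns 0 on success, -1 if the callback table is full. */
static int toy_pool_register_callback(toy_pool *pool, toy_cleanup_cb cb,
                                      void *user_data)
{
    assert(cb != NULL);
    if (pool->n_cbs == TOY_MAX_CALLBACKS)
        return -1;
    pool->cbs[pool->n_cbs] = cb;
    pool->cb_data[pool->n_cbs] = user_data;
    pool->n_cbs++;
    return 0;
}

static void toy_pool_free_all(toy_pool *pool)
{
    size_t i;

    /* 1. Run every callback while the pool memory is still valid.
     *    Callers should not rely on the order used here. */
    for (i = 0; i < pool->n_cbs; i++)
        pool->cbs[i](pool->cb_data[i]);
    pool->n_cbs = 0;   /* one-shot in this sketch */

    /* 2. Only then release the blocks themselves. */
    while (pool->blocks != NULL) {
        toy_block *block = pool->blocks;
        pool->blocks = block->next;
        free(block);
    }
}

/* Stand-in for a resource that needs closing, e.g. a file handle. */
typedef struct {
    int  is_open;
    int *close_count;
} toy_handle;

static void toy_close_handle(void *user_data)
{
    toy_handle *handle = user_data;
    if (handle->is_open) {
        handle->is_open = 0;
        (*handle->close_count)++;
    }
}

/* Returns how many times the handle was closed, or -1 on failure. */
static int toy_callback_demo(void)
{
    toy_pool pool = { 0 };
    int closed = 0;
    toy_handle *handle = toy_pool_alloc(&pool, sizeof *handle);

    if (handle == NULL)
        return -1;
    handle->is_open = 1;
    handle->close_count = &closed;
    if (toy_pool_register_callback(&pool, toy_close_handle, handle) != 0) {
        toy_pool_free_all(&pool);
        return -1;
    }
    toy_pool_free_all(&pool);   /* closes the handle, then frees it */
    toy_pool_free_all(&pool);   /* nothing left to run or free */
    return closed;
}
```

In this model the handle lives inside the pool, so the callback has to run
before the blocks are released; freeing the handle by hand would skip the
callback table entirely, which mirrors the warning above.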

3. Usage for Producers

NB: If you're just writing a dissector, you probably don't need to read
    this section.

One of the problems with the old emem framework was that there were basically
two allocator backends (glib and mmap) that were mixed together in a mess
of if statements, environment variables and #ifdefs. In wmem the different
allocator backends are cleanly separated out, and it's up to the owner of the
pool to pick one.

3.1 Available Allocator Back-Ends

Each available allocator type has a corresponding entry in the
wmem_allocator_type_t enumeration defined in wmem_core.h. See the Doxygen
comments in that header file for details on each type.

To create a pool, include the regular wmem header and call the
wmem_allocator_new() function with the appropriate type value.
For example:

    #include <wsutil/wmem/wmem.h>

    wmem_allocator_t *myPool;
    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);

From here on in, you don't need to remember which type of allocator you used
(although allocator authors are welcome to expose additional allocator-specific
helper functions in their headers). The "myPool" variable can be passed around
and used as normal in allocation requests as described in section 2 of this
document.

3.3 Destroying a Pool

Regardless of which allocator you used to create a pool, it can be destroyed
with a call to the function wmem_destroy_allocator(). For example:

    #include <wsutil/wmem/wmem.h>

    wmem_allocator_t *myPool;

    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);

    /* Allocate some memory in myPool ... */

    wmem_destroy_allocator(myPool);

Destroying a pool will free all the memory allocated in it.

It is possible to free all the memory in a pool without destroying it,
allowing it to be reused later. Depending on the type of allocator, doing this
(by calling wmem_free_all()) can be significantly cheaper than fully destroying
and recreating the pool. This method is therefore recommended, especially when
the pool would otherwise be scoped to a single iteration of a loop. For example:

    #include <wsutil/wmem/wmem.h>

    wmem_allocator_t *myPool;

    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);
    for (...) {

        /* Allocate some memory in myPool ... */

        /* Free the memory, faster than destroying and recreating
           the pool each time through the loop. */
        wmem_free_all(myPool);
    }
    wmem_destroy_allocator(myPool);
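
Why a reset can be so much cheaper than a destroy-and-recreate cycle is easiest
to see in a bump-style arena. The following is a minimal sketch, not wmem's
implementation: the names (toy_arena_t, toy_arena_new, toy_arena_alloc,
toy_arena_free_all, toy_arena_destroy) are hypothetical, and it assumes a
single fixed-size backing buffer, which no real wmem allocator is claimed to
use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy arena, NOT a wmem type: one backing buffer plus a
 * "used" offset that only grows until the arena is reset. */
typedef struct {
    unsigned char *base;
    size_t         capacity;
    size_t         used;
} toy_arena_t;

/* Create an arena with a fixed, non-zero capacity; NULL on failure. */
static toy_arena_t *toy_arena_new(size_t capacity)
{
    toy_arena_t *arena = malloc(sizeof *arena);
    if (arena == NULL)
        return NULL;
    arena->base = malloc(capacity);
    if (arena->base == NULL) {
        free(arena);
        return NULL;
    }
    arena->capacity = capacity;
    arena->used = 0;
    return arena;
}

/* Hand out the next `size` bytes (rounded up to a multiple of 16), or
 * NULL if the arena does not have that much room left. */
static void *toy_arena_alloc(toy_arena_t *arena, size_t size)
{
    size_t rounded = (size + 15u) & ~(size_t)15u;
    void *ptr;

    assert(arena != NULL);
    if (rounded < size || rounded > arena->capacity - arena->used)
        return NULL;
    ptr = arena->base + arena->used;
    arena->used += rounded;
    return ptr;
}

/* "Free everything": a single store, however many blocks were handed out. */
static void toy_arena_free_all(toy_arena_t *arena)
{
    arena->used = 0;
}

/* Release the backing buffer and the arena itself. */
static void toy_arena_destroy(toy_arena_t *arena)
{
    free(arena->base);
    free(arena);
}

/* Reuse one arena across loop iterations; returns the number of
 * successful allocations, or -1 if the arena could not be created. */
static int toy_arena_demo(void)
{
    toy_arena_t *arena = toy_arena_new(64);
    int i, ok = 0;

    if (arena == NULL)
        return -1;
    for (i = 0; i < 1000; i++) {
        /* 48 bytes would fit only once without the reset below. */
        if (toy_arena_alloc(arena, 48) != NULL)
            ok++;
        toy_arena_free_all(arena);
    }
    toy_arena_destroy(arena);
    return ok;
}
```

In this model the per-iteration cost of the reset is one assignment, while
destroying and recreating would cost two free() and two malloc() calls each
time around the loop.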

Despite being written in Wireshark's standard C90, wmem follows a fairly
object-oriented design pattern. Although efficiency is always a concern, the
primary goals in writing wmem were maintainability and preventing memory
leaks.

4.1 struct _wmem_allocator_t

The heart of wmem is the _wmem_allocator_t structure defined in the
wmem_allocator.h header file. This structure uses C function pointers to
implement a common object-oriented design pattern known as an interface (also
known as an abstract class to those who are more familiar with C++).

Different allocator implementations can provide exactly the same interface by
assigning their own functions to the members of an instance of the structure.
The structure has eight members in three groups.

4.1.1 Implementation Details

The private_data pointer is a void pointer that the allocator implementation can
use to store whatever internal structures it needs. This pointer is passed to
almost all of the other functions that the allocator implementation provides.

The type field is an enumeration of type wmem_allocator_type_t (see
section 3.1). Its value is set by the wmem_allocator_new() function, not
by the implementation-specific constructor. This field should be considered
read-only by the allocator implementation.

4.1.2 Consumer Functions

These function pointers should be set to functions with semantics obviously
similar to their standard-library namesakes. Each one takes an extra parameter
that is a copy of the allocator's private_data pointer.

Note that wrealloc() and wfree() are not expected to be called directly by user
code in most cases - they are primarily optimizations for use by data
structures that wmem might want to implement (it's inefficient, for example, to
implement a dynamically sized array without some form of realloc).

Also note that allocators do not have to handle NULL pointers or 0-length
requests in any way - those checks are done in an allocator-agnostic way
higher up in wmem. Allocator authors can assume that all incoming pointers
(to wrealloc and wfree) are non-NULL, and that all incoming lengths (to walloc
and wrealloc) are non-0.
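
The split between "backend function pointers" and "allocator-agnostic checks
higher up" can be sketched as follows. This is an illustrative model only: the
struct is a simplified, hypothetical stand-in (toy_allocator_t) rather than the
real _wmem_allocator_t, the libc_* backend and toy_* wrappers are invented for
this example, and the exact behaviour chosen for the NULL and 0-length cases is
an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified interface; modelled on the description above
 * but NOT the real struct _wmem_allocator_t. */
typedef struct {
    void *(*walloc)(void *private_data, size_t size);
    void  (*wfree)(void *private_data, void *ptr);
    void *(*wrealloc)(void *private_data, void *ptr, size_t size);
    void  *private_data;
} toy_allocator_t;

/* A trivial backend built on the C library. private_data points at a
 * counter of live blocks. The asserts document what the backend may
 * assume: it never sees a NULL pointer or a zero length. */
static void *libc_walloc(void *private_data, size_t size)
{
    void *ptr;

    assert(size != 0);
    ptr = malloc(size);
    if (ptr != NULL)
        ++*(int *)private_data;
    return ptr;
}

static void libc_wfree(void *private_data, void *ptr)
{
    assert(ptr != NULL);
    --*(int *)private_data;
    free(ptr);
}

static void *libc_wrealloc(void *private_data, void *ptr, size_t size)
{
    (void)private_data;
    assert(ptr != NULL && size != 0);
    return realloc(ptr, size);
}

/* Allocator-agnostic layer: all NULL / 0-length handling lives here. */
static void *toy_alloc(toy_allocator_t *a, size_t size)
{
    if (size == 0)
        return NULL;
    return a->walloc(a->private_data, size);
}

static void toy_free(toy_allocator_t *a, void *ptr)
{
    if (ptr == NULL)
        return;
    a->wfree(a->private_data, ptr);
}

static void *toy_realloc(toy_allocator_t *a, void *ptr, size_t size)
{
    if (ptr == NULL)
        return toy_alloc(a, size);
    if (size == 0) {
        toy_free(a, ptr);
        return NULL;
    }
    return a->wrealloc(a->private_data, ptr, size);
}

/* Returns the number of live blocks at the end, or -1 on failure. */
static int toy_interface_demo(void)
{
    int live = 0;
    toy_allocator_t a = { libc_walloc, libc_wfree, libc_wrealloc, NULL };
    void *p, *q;

    a.private_data = &live;
    if (toy_alloc(&a, 0) != NULL)      /* never reaches the backend */
        return -1;
    toy_free(&a, NULL);                /* likewise */
    p = toy_realloc(&a, NULL, 16);     /* behaves like an allocation */
    if (p == NULL)
        return -1;
    q = toy_realloc(&a, p, 64);
    if (q == NULL) {
        toy_free(&a, p);
        return -1;
    }
    if (toy_realloc(&a, q, 0) != NULL) /* behaves like a free */
        return -1;
    return live;
}
```

Keeping the edge cases in one place means each new backend only has to get the
ordinary path right.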

4.1.3 Producer/Manager Functions

All of these functions take only one parameter, which is the allocator's
private_data pointer.

The free_all() function should free all the memory currently allocated in the
pool. Note that this is not necessarily exactly the same as calling free()
on all the allocated blocks - free_all() is allowed to do additional cleanup
or to make use of optimizations not available when freeing one block at a time.

The gc() function should do whatever it can to reduce excess memory usage in
the dissector by returning unused blocks to the OS, optimizing internal data
structures, etc.

The cleanup() function should do any final cleanup and free any and all memory.
It is basically the equivalent of a destructor function. For simplicity, wmem
is guaranteed to call free_all() immediately before calling this function. There
is no such guarantee that gc() has (ever) been called.
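
The ordering guarantee described above (free_all() always runs immediately
before cleanup(), while gc() may never run at all) can be modelled in a few
lines. This is a sketch with hypothetical names (toy_manager_t, toy_trace_t,
toy_destroy); the backend does nothing except record which hook was called.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of an allocator interface for this sketch. */
typedef struct {
    void (*free_all)(void *private_data);
    void (*gc)(void *private_data);
    void (*cleanup)(void *private_data);
    void  *private_data;
} toy_manager_t;

/* Backend state: a short log with one letter per hook invocation. */
typedef struct {
    char   log[16];
    size_t len;
} toy_trace_t;

static void trace(void *private_data, char c)
{
    toy_trace_t *t = private_data;
    if (t->len + 1 < sizeof t->log) {
        t->log[t->len++] = c;
        t->log[t->len] = '\0';
    }
}

static void trace_free_all(void *private_data) { trace(private_data, 'F'); }
static void trace_gc(void *private_data)       { trace(private_data, 'G'); }
static void trace_cleanup(void *private_data)  { trace(private_data, 'C'); }

/* Framework-side destroy: free_all() first, then cleanup().
 * gc() is deliberately not part of this path. */
static void toy_destroy(toy_manager_t *m)
{
    m->free_all(m->private_data);
    m->cleanup(m->private_data);
}

/* Reset the pool once, then destroy it; returns the recorded log. */
static const char *toy_lifecycle_demo(toy_trace_t *t)
{
    toy_manager_t m = { trace_free_all, trace_gc, trace_cleanup, NULL };

    t->len = 0;
    t->log[0] = '\0';
    m.private_data = t;
    m.free_all(m.private_data);   /* an ordinary reset while in use */
    toy_destroy(&m);              /* free_all again, then cleanup */
    return t->log;
}
```

A cleanup() written against this contract can assume the pool is already empty
and only has to release the backend's own bookkeeping.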

4.2 Pool-Agnostic API

One of the issues with emem was that the API (including the public data
structures) required wrapper functions for each scope implemented. Even
if there was a stack implementation in emem, it wasn't necessarily available
for use with file-scope memory unless someone took the time to write se_stack_
wrapper functions for the interface.

In wmem, all public APIs take the pool as the first argument, so that they can
be written once and used with any available memory pool. Data structures like
wmem's stack implementation only take the pool when created - the provided
pointer is stored internally with the data structure, and subsequent calls
(like push and pop) will take the stack itself instead of the pool.
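
The "pool at creation, structure afterwards" convention can be sketched with a
toy stack. Everything below is hypothetical and stands apart from wmem: the
pool (toy_pool_t) merely counts outstanding blocks on top of malloc(), and the
toy_stack_* functions are invented for this example rather than taken from
wmem's stack header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal pool: it only counts outstanding blocks. */
typedef struct {
    int live;
} toy_pool_t;

static void *toy_pool_alloc(toy_pool_t *pool, size_t size)
{
    void *ptr = malloc(size);
    if (ptr != NULL)
        pool->live++;
    return ptr;
}

static void toy_pool_free(toy_pool_t *pool, void *ptr)
{
    assert(ptr != NULL);
    pool->live--;
    free(ptr);
}

typedef struct toy_stack_node {
    void                  *data;
    struct toy_stack_node *next;
} toy_stack_node;

/* The stack remembers which pool it was created in. */
typedef struct {
    toy_pool_t     *pool;
    toy_stack_node *top;
} toy_stack_t;

/* Only the constructor takes the pool. */
static toy_stack_t *toy_stack_new(toy_pool_t *pool)
{
    toy_stack_t *stack = toy_pool_alloc(pool, sizeof *stack);
    if (stack == NULL)
        return NULL;
    stack->pool = pool;
    stack->top = NULL;
    return stack;
}

/* Later calls take the stack and use its stored pool. 0 on success. */
static int toy_stack_push(toy_stack_t *stack, void *data)
{
    toy_stack_node *node = toy_pool_alloc(stack->pool, sizeof *node);
    if (node == NULL)
        return -1;
    node->data = data;
    node->next = stack->top;
    stack->top = node;
    return 0;
}

/* Returns the most recently pushed item, or NULL if the stack is empty. */
static void *toy_stack_pop(toy_stack_t *stack)
{
    toy_stack_node *node = stack->top;
    void *data;

    if (node == NULL)
        return NULL;
    data = node->data;
    stack->top = node->next;
    toy_pool_free(stack->pool, node);
    return data;
}

static void toy_stack_destroy(toy_stack_t *stack)
{
    while (stack->top != NULL)
        (void)toy_stack_pop(stack);
    toy_pool_free(stack->pool, stack);
}

/* The same stack code serves two unrelated pools; returns 1 on success. */
static int toy_stack_demo(void)
{
    toy_pool_t pool_a = { 0 }, pool_b = { 0 };
    toy_stack_t *sa = toy_stack_new(&pool_a);
    toy_stack_t *sb = toy_stack_new(&pool_b);
    int x = 1, y = 2, ok = 0;

    if (sa != NULL && sb != NULL &&
        toy_stack_push(sa, &x) == 0 && toy_stack_push(sa, &y) == 0 &&
        toy_stack_push(sb, &x) == 0) {
        ok = pool_a.live == 3 && pool_b.live == 2 &&
             toy_stack_pop(sa) == &y && toy_stack_pop(sa) == &x &&
             toy_stack_pop(sb) == &x && toy_stack_pop(sb) == NULL;
    }
    if (sa != NULL)
        toy_stack_destroy(sa);
    if (sb != NULL)
        toy_stack_destroy(sb);
    return ok && pool_a.live == 0 && pool_b.live == 0;
}
```

Because the stack never names a particular pool, no per-scope wrapper functions
are needed; the one implementation works for any pool handed to the
constructor.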

The primary debugging control for wmem is the WIRESHARK_DEBUG_WMEM_OVERRIDE
environment variable. If set, this value forces all calls to
wmem_allocator_new() to return the same type of allocator, regardless of which
type is requested normally by the code. It currently has four valid values:

 - The value "simple" forces the use of WMEM_ALLOCATOR_SIMPLE. The valgrind
   script currently sets this value, since the simple allocator is the only
   one whose memory allocations are trackable properly by valgrind.

 - The value "strict" forces the use of WMEM_ALLOCATOR_STRICT. The fuzz-test
   script currently sets this value, since the goal when fuzz-testing is to find
   as many errors as possible.

 - The value "block" forces the use of WMEM_ALLOCATOR_BLOCK. This is not
   currently used by any scripts, but is useful for stress-testing the block
   allocator.

 - The value "block_fast" forces the use of WMEM_ALLOCATOR_BLOCK_FAST. This is
   not currently used by any scripts, but is useful for stress-testing the fast
   block allocator.

Note that regardless of the value of this variable, it will always be safe to
call allocator-specific helper functions. They are required to be safe no-ops
if the allocator argument is of the wrong type.
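
The mapping from the override string to an allocator type can be sketched as a
small pure function. This is an illustration, not wmem's code: the enum
(toy_allocator_type_t) and the function (toy_effective_type) are hypothetical,
and the choice to fall back to the requested type for an unrecognised value is
an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for wmem_allocator_type_t, local to this sketch. */
typedef enum {
    TOY_ALLOCATOR_SIMPLE,
    TOY_ALLOCATOR_BLOCK,
    TOY_ALLOCATOR_STRICT,
    TOY_ALLOCATOR_BLOCK_FAST
} toy_allocator_type_t;

/* `override_value` is the value of the environment variable, or NULL if
 * it is unset. An unset or unrecognised value leaves the requested type
 * alone (the latter is an assumption made for this sketch). */
static toy_allocator_type_t
toy_effective_type(toy_allocator_type_t requested, const char *override_value)
{
    if (override_value == NULL)
        return requested;
    if (strcmp(override_value, "simple") == 0)
        return TOY_ALLOCATOR_SIMPLE;
    if (strcmp(override_value, "strict") == 0)
        return TOY_ALLOCATOR_STRICT;
    if (strcmp(override_value, "block") == 0)
        return TOY_ALLOCATOR_BLOCK;
    if (strcmp(override_value, "block_fast") == 0)
        return TOY_ALLOCATOR_BLOCK_FAST;
    return requested;
}
```

A caller would typically pass getenv("WIRESHARK_DEBUG_WMEM_OVERRIDE") as the
second argument; keeping the lookup separate from the mapping makes the mapping
easy to test.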

There is a simple test suite for wmem that lives in the file wmem_test.c and
should get automatically built into the binary 'wmem_test' when building
Wireshark. It contains at least basic tests for all existing functionality.
The suite is run automatically by the build-bots via the script
test/test.py which calls out to test/suite_unittests.py.

New features added to wmem (allocators, data structures, utility
functions, etc.) MUST also have tests added to this suite.

The test suite could potentially use a clean-up by someone more
intimately familiar with GLib's testing framework, but it does the job.

5. A Note on Performance

Because of my own bad judgment, there is the persistent idea floating around
that wmem is somehow magically faster than other allocators in the general case.
This is not true.

First, wmem supports multiple different allocator backends (see sections 3 and 4
of this document), so it is confusing and misleading to try to compare the
performance of "wmem" in general with another system anyway.

Second, any modern system-provided malloc already has a very clever and
efficient allocator algorithm that makes use of blocks, arenas and all sorts of
other fancy tricks. Trying to be faster than libc's allocator is generally a
waste of time unless you have a specific allocation pattern to optimize for.

Third, while there were historically arguments to be made for putting something
in front of the kernel to reduce the number of context-switches, modern libc
implementations should already do that. Making a dynamic library call is still
marginally more expensive than calling a locally-defined linker-optimized
function, but it's a difference too small to care about.

With all that said, it is true that *some* of wmem's allocators can be
substantially faster than your standard libc malloc, in *some* use cases:
 - The BLOCK and BLOCK_FAST allocators both provide very efficient free_all
   operations, which can be many orders of magnitude faster than calling free()
   on each individual allocation.
 - The BLOCK_FAST allocator in particular is optimized for Wireshark's packet
   scope pool. That pool has an extremely short, well-defined lifetime, and a
   very regular pattern of allocations; I was able to use that knowledge to
   beat libc rather handily, *in that specific use case*.