2 menu "Memory Management options"
4 config SELECT_MEMORY_MODEL
6 depends on ARCH_SELECT_MEMORY_MODEL
10 depends on SELECT_MEMORY_MODEL
11 default DISCONTIGMEM_MANUAL if ARCH_DISCONTIGMEM_DEFAULT
12 default SPARSEMEM_MANUAL if ARCH_SPARSEMEM_DEFAULT
13 default FLATMEM_MANUAL
17 depends on !(ARCH_DISCONTIGMEM_ENABLE || ARCH_SPARSEMEM_ENABLE) || ARCH_FLATMEM_ENABLE
19 This option allows you to change some of the ways that
20 Linux manages its memory internally. Most users will
21 only have one option here: FLATMEM. This is normal
24 Some users of more advanced features like NUMA and
25 memory hotplug may have different options here.
26 DISCONTIGMEM is a more mature, better tested system,
27 but is incompatible with memory hotplug and may suffer
28 decreased performance compared to SPARSEMEM. If unsure between
29 "Sparse Memory" and "Discontiguous Memory", choose
30 "Discontiguous Memory".
32 If unsure, choose this option (Flat Memory) over any other.
34 config DISCONTIGMEM_MANUAL
35 bool "Discontiguous Memory"
36 depends on ARCH_DISCONTIGMEM_ENABLE
38 This option provides enhanced support for discontiguous
39 memory systems, over FLATMEM. These systems have holes
40 in their physical address spaces, and this option provides
41 more efficient handling of these holes. However, the vast
42 majority of hardware has quite flat address spaces, and
43 can have degraded performance from the extra overhead that
46 Many NUMA configurations will have this as the only option.
48 If unsure, choose "Flat Memory" over this option.
50 config SPARSEMEM_MANUAL
52 depends on ARCH_SPARSEMEM_ENABLE
54 This will be the only option for some systems, including
55 memory hotplug systems. This is normal.
57 For many other systems, this will be an alternative to
58 "Discontiguous Memory". This option provides some potential
59 performance benefits, along with decreased code complexity,
60 but it is newer and more experimental.
62 If unsure, choose "Discontiguous Memory" or "Flat Memory"
69 depends on (!SELECT_MEMORY_MODEL && ARCH_DISCONTIGMEM_ENABLE) || DISCONTIGMEM_MANUAL
73 depends on (!SELECT_MEMORY_MODEL && ARCH_SPARSEMEM_ENABLE) || SPARSEMEM_MANUAL
77 depends on (!DISCONTIGMEM && !SPARSEMEM) || FLATMEM_MANUAL
79 config FLAT_NODE_MEM_MAP
84 # Both the NUMA code and DISCONTIGMEM use arrays of pg_data_t's
85 # to represent different areas of memory. This variable allows
86 # those dependencies to exist individually.
88 config NEED_MULTIPLE_NODES
90 depends on DISCONTIGMEM || NUMA
92 config HAVE_MEMORY_PRESENT
94 depends on ARCH_HAVE_MEMORY_PRESENT || SPARSEMEM
97 # SPARSEMEM_EXTREME (which is the default) does some bootmem
98 # allocations when memory_present() is called. If this cannot
99 # be done on your architecture, select this option. However,
100 # statically allocating the mem_section[] array can potentially
101 # consume vast quantities of .bss, so be careful.
103 # This option will also potentially produce smaller runtime code
104 # with gcc 3.4 and later.
106 config SPARSEMEM_STATIC
110 # Architecture platforms which require a two level mem_section in SPARSEMEM
111 # must select this option. This is usually for architecture platforms with
112 # an extremely sparse physical address space.
114 config SPARSEMEM_EXTREME
116 depends on SPARSEMEM && !SPARSEMEM_STATIC
118 config SPARSEMEM_VMEMMAP_ENABLE
121 config SPARSEMEM_VMEMMAP
122 bool "Sparse Memory virtual memmap"
123 depends on SPARSEMEM && SPARSEMEM_VMEMMAP_ENABLE
126 SPARSEMEM_VMEMMAP uses a virtually mapped memmap to optimise
127 pfn_to_page and page_to_pfn operations. This is the most
128 efficient option when sufficient kernel resources are available.
130 config HAVE_MEMBLOCK_NODE_MAP
133 config HAVE_MEMBLOCK_PHYS_MAP
136 config HAVE_GENERIC_GUP
139 config ARCH_DISCARD_MEMBLOCK
142 config MEMORY_ISOLATION
146 # Only set this on architectures that have completely implemented the memory
147 # hotplug feature. If you are not sure, don't touch it.
149 config HAVE_BOOTMEM_INFO_NODE
152 # eventually, we can have this option just 'select SPARSEMEM'
153 config MEMORY_HOTPLUG
154 bool "Allow for memory hot-add"
155 depends on SPARSEMEM || X86_64_ACPI_NUMA
156 depends on ARCH_ENABLE_MEMORY_HOTPLUG
158 config MEMORY_HOTPLUG_SPARSE
160 depends on SPARSEMEM && MEMORY_HOTPLUG
162 config MEMORY_HOTPLUG_DEFAULT_ONLINE
163 bool "Online the newly added memory blocks by default"
165 depends on MEMORY_HOTPLUG
167 This option sets the default memory hotplug onlining policy
168 (/sys/devices/system/memory/auto_online_blocks), which determines
169 what happens to newly added memory regions. The policy can always
170 be changed at runtime.
171 See Documentation/memory-hotplug.txt for more information.
173 Say Y here if you want all hot-plugged memory blocks to appear in
174 'online' state by default.
175 Say N here if you want the default policy to keep all hot-plugged
176 memory blocks in 'offline' state.
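#
# A minimal sketch of inspecting and changing the onlining policy at
# runtime, assuming sysfs is mounted at /sys and the commands run as root.
# "memoryN" is a placeholder for a real memory block directory name.
#
#   cat /sys/devices/system/memory/auto_online_blocks
#   echo online > /sys/devices/system/memory/auto_online_blocks
#   echo online > /sys/devices/system/memory/memoryN/state
#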
178 config MEMORY_HOTREMOVE
179 bool "Allow for memory hot remove"
180 select MEMORY_ISOLATION
181 select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
182 depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
185 # Heavily threaded applications may benefit from splitting the mm-wide
186 # page_table_lock, so that faults on different parts of the user address
187 # space can be handled with less contention: split it at this NR_CPUS.
188 # Default to 4 for wider testing, though 8 might be more appropriate.
189 # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
190 # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
191 # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
193 config SPLIT_PTLOCK_CPUS
195 default "999999" if !MMU
196 default "999999" if ARM && !CPU_CACHE_VIPT
197 default "999999" if PARISC && !PA20
200 config ARCH_ENABLE_SPLIT_PMD_PTLOCK
204 # support for memory balloon
205 config MEMORY_BALLOON
209 # support for memory balloon compaction
210 config BALLOON_COMPACTION
211 bool "Allow for balloon memory compaction/migration"
213 depends on COMPACTION && MEMORY_BALLOON
215 Memory fragmentation introduced by ballooning might significantly
216 reduce the number of 2MB contiguous memory blocks that can be
217 used within a guest, thus imposing performance penalties associated
218 with the reduced number of transparent huge pages that could be used
219 by the guest workload. Allowing compaction and migration of memory
220 pages enlisted as part of memory balloon devices avoids this
221 scenario and helps improve memory defragmentation.
224 # support for memory compaction
226 bool "Allow for memory compaction"
231 Compaction is the only memory management component to form
232 high order (larger physically contiguous) memory blocks
233 reliably. The page allocator relies on compaction heavily and
234 the lack of the feature can lead to unexpected OOM killer
235 invocations for high order memory requests. You shouldn't
236 disable this option unless there really is a strong reason for
237 it and then we would be really interested to hear about that at
241 # support for page migration
244 bool "Page migration"
246 depends on (NUMA || ARCH_ENABLE_MEMORY_HOTREMOVE || COMPACTION || CMA) && MMU
248 Allows the migration of the physical location of pages of processes
249 while the virtual addresses are not changed. This is useful in
250 two situations. The first is on NUMA systems to put pages nearer
251 to the processors accessing them. The second is when allocating huge
252 pages as migration can relocate pages to satisfy a huge page
253 allocation instead of reclaiming.
255 config ARCH_ENABLE_HUGEPAGE_MIGRATION
258 config ARCH_ENABLE_THP_MIGRATION
261 config PHYS_ADDR_T_64BIT
265 bool "Enable bounce buffers"
267 depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM)
269 Enable bounce buffers for devices that cannot access
270 the full range of memory available to the CPU. Enabled
271 by default when ZONE_DMA or HIGHMEM is selected, but you
272 may say n to override this.
282 An architecture should select this if it implements the
283 deprecated interface virt_to_bus(). All new architectures
284 should probably not select this.
292 bool "Enable KSM for page merging"
295 Enable Kernel Samepage Merging: KSM periodically scans those areas
296 of an application's address space that an app has advised may be
297 mergeable. When it finds pages of identical content, it replaces
298 the many instances by a single page with that content, so
299 saving memory until one or another app needs to modify the content.
300 Recommended for use with KVM, or with other duplicative applications.
301 See Documentation/vm/ksm.rst for more information: KSM is inactive
302 until a program has madvised that an area is MADV_MERGEABLE, and
303 root has set /sys/kernel/mm/ksm/run to 1 (if CONFIG_SYSFS is set).
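#
# A minimal sketch of the activation step described in the help text above,
# assuming CONFIG_SYSFS is set and the commands run as root. Writing 0
# stops the scanner again; pages_sharing gives a rough measure of how much
# is currently being saved.
#
#   echo 1 > /sys/kernel/mm/ksm/run
#   cat /sys/kernel/mm/ksm/pages_sharing
#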
305 config DEFAULT_MMAP_MIN_ADDR
306 int "Low address space to protect from user allocation"
310 This is the portion of low virtual memory which should be protected
311 from userspace allocation. Keeping a user from writing to low pages
312 can help reduce the impact of kernel NULL pointer bugs.
314 For most ia64, ppc64 and x86 users with lots of address space
315 a value of 65536 is reasonable and should cause no problems.
316 On arm and other archs it should not be higher than 32768.
317 Programs which use vm86 functionality or have some need to map
318 this low address space will need CAP_SYS_RAWIO, or this
319 protection must be disabled by setting the value to 0.
321 This value can be changed after boot using the
322 /proc/sys/vm/mmap_min_addr tunable.
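#
# A minimal sketch of reading and adjusting the tunable after boot,
# assuming the sysctl utility is available and the write runs as root.
# The value 65536 is the one suggested above for ia64, ppc64 and x86.
#
#   sysctl vm.mmap_min_addr
#   sysctl -w vm.mmap_min_addr=65536
#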
324 config ARCH_SUPPORTS_MEMORY_FAILURE
327 config MEMORY_FAILURE
329 depends on ARCH_SUPPORTS_MEMORY_FAILURE
330 bool "Enable recovery from hardware memory errors"
331 select MEMORY_ISOLATION
334 Enables code to recover from some memory failures on systems
335 with MCA recovery. This allows a system to continue running
336 even when some of its memory has uncorrected errors. This requires
337 special hardware support and typically ECC memory.
339 config HWPOISON_INJECT
340 tristate "HWPoison pages injector"
341 depends on MEMORY_FAILURE && DEBUG_KERNEL && PROC_FS
342 select PROC_PAGE_MONITOR
344 config NOMMU_INITIAL_TRIM_EXCESS
345 int "Turn on mmap() excess space trimming before booting"
349 The NOMMU mmap() frequently needs to allocate large contiguous chunks
350 of memory on which to store mappings, but it can only ask the system
351 allocator for chunks in 2^N*PAGE_SIZE amounts - which is frequently
352 more than it requires. To deal with this, mmap() is able to trim off
353 the excess and return it to the allocator.
355 If trimming is enabled, the excess is trimmed off and returned to the
356 system allocator, which can cause extra fragmentation, particularly
357 if there are a lot of transient processes.
359 If trimming is disabled, the excess is kept, but not used, which for
360 long-term mappings means that the space is wasted.
362 Trimming can be dynamically controlled through a sysctl option
363 (/proc/sys/vm/nr_trim_pages) which specifies the minimum number of
364 excess pages there must be before trimming should occur, or zero if
365 no trimming is to occur.
367 This option specifies the initial value of that sysctl. The default
368 of 1 says that all excess pages should be trimmed.
370 See Documentation/nommu-mmap.txt for more information.
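#
# A minimal sketch of the runtime control mentioned above, assuming a
# NOMMU kernel with procfs mounted at /proc and the write run as root.
# Writing 0 disables trimming; writing 1 trims all excess pages.
#
#   cat /proc/sys/vm/nr_trim_pages
#   echo 0 > /proc/sys/vm/nr_trim_pages
#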
372 config TRANSPARENT_HUGEPAGE
373 bool "Transparent Hugepage Support"
374 depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE
378 Transparent Hugepages allows the kernel to use huge pages and
379 huge tlb transparently to the applications whenever possible.
380 This feature can improve computing performance for certain
381 applications by speeding up page faults during memory
382 allocation, by reducing the number of TLB misses, and by speeding
383 up pagetable walking.
385 On memory-constrained embedded systems, you may want to say N.
388 prompt "Transparent Hugepage Support sysfs defaults"
389 depends on TRANSPARENT_HUGEPAGE
390 default TRANSPARENT_HUGEPAGE_ALWAYS
392 Selects the sysfs defaults for Transparent Hugepage Support.
394 config TRANSPARENT_HUGEPAGE_ALWAYS
397 Always enabling Transparent Hugepage can increase the
398 memory footprint of applications without a guaranteed
399 benefit, but it works automatically for all applications.
401 config TRANSPARENT_HUGEPAGE_MADVISE
404 Enabling Transparent Hugepage under madvise provides a
405 performance benefit only to the applications using
406 madvise(MADV_HUGEPAGE), but it does not risk increasing the
407 memory footprint of applications without a guaranteed
411 config ARCH_WANTS_THP_SWAP
416 depends on TRANSPARENT_HUGEPAGE && ARCH_WANTS_THP_SWAP && SWAP
418 Swap transparent huge pages in one piece, without splitting.
419 XXX: For now, the swap cluster backing a transparent huge page
420 will be split after swapout.
422 For selection by architectures with reasonable THP sizes.
424 config TRANSPARENT_HUGE_PAGECACHE
426 depends on TRANSPARENT_HUGEPAGE
429 # UP and nommu archs use km based percpu allocator
431 config NEED_PER_CPU_KM
437 bool "Enable cleancache driver to cache clean pages if tmem is present"
440 Cleancache can be thought of as a page-granularity victim cache
441 for clean pages that the kernel's pageframe replacement algorithm
442 (PFRA) would like to keep around, but can't since there isn't enough
443 memory. So when the PFRA "evicts" a page, it first attempts to use
444 cleancache code to put the data contained in that page into
445 "transcendent memory", memory that is not directly accessible or
446 addressable by the kernel and is of unknown and possibly
447 time-varying size. And when a cleancache-enabled
448 filesystem wishes to access a page in a file on disk, it first
449 checks cleancache to see if it already contains it; if it does,
450 the page is copied into the kernel and a disk access is avoided.
451 When a transcendent memory driver is available (such as zcache or
452 Xen transcendent memory), a significant I/O reduction
453 may be achieved. When none is available, all cleancache calls
454 are reduced to a single pointer-compare-against-NULL resulting
455 in a negligible performance hit.
457 If unsure, say Y to enable cleancache.
460 bool "Enable frontswap to cache swap pages if tmem is present"
464 Frontswap is so named because it can be thought of as the opposite
465 of a "backing" store for a swap device. The data is stored into
466 "transcendent memory", memory that is not directly accessible or
467 addressable by the kernel and is of unknown and possibly
468 time-varying size. When space in transcendent memory is available,
469 a significant swap I/O reduction may be achieved. When none is
470 available, all frontswap calls are reduced to a single pointer-
471 compare-against-NULL resulting in a negligible performance hit
472 and swap data is stored as normal on the matching swap device.
474 If unsure, say Y to enable frontswap.
477 bool "Contiguous Memory Allocator"
480 select MEMORY_ISOLATION
482 This enables the Contiguous Memory Allocator which allows other
483 subsystems to allocate big physically-contiguous blocks of memory.
484 CMA reserves a region of memory and allows only movable pages to
485 be allocated from it. This way, the kernel can use the memory for
486 pagecache and, when a subsystem requests a contiguous area, the
487 allocated pages are migrated away to serve the contiguous request.
492 bool "CMA debug messages (DEVELOPMENT)"
493 depends on DEBUG_KERNEL && CMA
495 Turns on debug messages in CMA. This produces KERN_DEBUG
496 messages for every CMA call as well as various messages while
497 processing calls such as dma_alloc_from_contiguous().
498 This option does not affect warning and error messages.
501 bool "CMA debugfs interface"
502 depends on CMA && DEBUG_FS
504 Turns on the DebugFS interface for CMA.
507 int "Maximum count of the CMA areas"
511 CMA allows creating CMA areas for a particular purpose, mainly
512 for use as device private areas. This parameter sets the maximum
513 number of CMA areas in the system.
515 If unsure, leave the default value "7".
517 config MEM_SOFT_DIRTY
518 bool "Track memory changes"
519 depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
520 select PROC_PAGE_MONITOR
522 This option enables tracking of memory changes by introducing a
523 soft-dirty bit on PTEs. This bit is set when someone writes
524 into a page, just like the regular dirty bit, but unlike the latter
525 it can be cleared by hand.
527 See Documentation/admin-guide/mm/soft-dirty.rst for more details.
530 bool "Compressed cache for swap pages (EXPERIMENTAL)"
531 depends on FRONTSWAP && CRYPTO=y
536 A lightweight compressed cache for swap pages. It takes
537 pages that are in the process of being swapped out and attempts to
538 compress them into a dynamically allocated RAM-based memory pool.
539 This can result in a significant I/O reduction on the swap device and,
540 in the case where decompressing from RAM is faster than swap device
541 reads, can also improve workload performance.
543 This is marked experimental because it is a new feature (as of
544 v3.11) that interacts heavily with memory reclaim. While these
545 interactions don't cause any known issues on simple memory setups,
546 they have not been fully explored on the large set of potential
547 configurations and workloads that exist.
550 tristate "Common API for compressed memory storage"
553 Compressed memory storage API. This allows using either zbud or
557 tristate "Low (Up to 2x) density storage for compressed pages"
560 A special purpose allocator for storing compressed pages.
561 It is designed to store up to two compressed pages per physical
562 page. While this design limits storage density, it has simple and
563 deterministic reclaim properties that make it preferable to a higher
564 density approach when reclaim will be used.
567 tristate "Up to 3x density storage for compressed pages"
571 A special purpose allocator for storing compressed pages.
572 It is designed to store up to three compressed pages per physical
573 page. It is a ZBUD derivative so the simplicity and determinism are
577 tristate "Memory allocator for compressed pages"
581 zsmalloc is a slab-based memory allocator designed to store
582 compressed RAM pages. zsmalloc uses virtual memory mapping
583 in order to reduce fragmentation. However, this results in a
584 non-standard allocator interface where a handle, not a pointer, is
585 returned by an alloc(). This handle must be mapped in order to
586 access the allocated space.
588 config PGTABLE_MAPPING
589 bool "Use page table mapping to access object in zsmalloc"
592 By default, zsmalloc uses a copy-based object mapping method to
593 access allocations that span two pages. However, if a particular
594 architecture (e.g., ARM) performs VM mapping faster than copying,
595 then you should select this. This causes zsmalloc to use page table
596 mapping rather than copying for object mapping.
598 You can check speed with zsmalloc benchmark:
599 https://github.com/spartacus06/zsmapbench
602 bool "Export zsmalloc statistics"
606 This option enables code in zsmalloc to collect various
607 statistics about what's happening in zsmalloc and export that
608 information to userspace via debugfs.
611 config GENERIC_EARLY_IOREMAP
614 config MAX_STACK_SIZE_MB
615 int "Maximum user stack size for 32-bit processes (MB)"
618 depends on STACK_GROWSUP && (!64BIT || COMPAT)
620 This is the maximum stack size in Megabytes in the VM layout of 32-bit
621 user processes when the stack grows upwards (currently only on parisc
622 arch). The stack will be located at the highest memory address minus
623 the given value, unless the RLIMIT_STACK hard limit is changed to a
624 smaller value in which case that is used.
626 A sane initial value is 80 MB.
628 config DEFERRED_STRUCT_PAGE_INIT
629 bool "Defer initialisation of struct pages to kthreads"
632 depends on !NEED_PER_CPU_KM
635 Ordinarily all struct pages are initialised during early boot in a
636 single thread. On very large machines this can take a considerable
637 amount of time. If this option is set, large machines will bring up
638 a subset of memmap at boot and then initialise the rest in parallel
639 by starting a one-off "pgdatinitX" kernel thread for each node X. This
640 has a potential performance impact on processes running early in the
641 lifetime of the system until these kthreads finish the
644 config IDLE_PAGE_TRACKING
645 bool "Enable idle page tracking"
646 depends on SYSFS && MMU
647 select PAGE_EXTENSION if !64BIT
649 This feature allows estimating the number of user pages that have
650 not been touched during a given period of time. This information can
651 be useful to tune memory cgroup limits and/or for job placement
652 within a compute cluster.
654 See Documentation/admin-guide/mm/idle_page_tracking.rst for
657 # arch_add_memory() comprehends device memory
658 config ARCH_HAS_ZONE_DEVICE
662 bool "Device memory (pmem, HMM, etc...) hotplug support"
663 depends on MEMORY_HOTPLUG
664 depends on MEMORY_HOTREMOVE
665 depends on SPARSEMEM_VMEMMAP
666 depends on ARCH_HAS_ZONE_DEVICE
670 Device memory hotplug support allows for establishing pmem,
671 or other device driver discovered memory regions, in the
672 memmap. This allows pfn_to_page() lookups of otherwise
673 "device-physical" addresses which is needed for using a DAX
674 mapping in an O_DIRECT operation, among other things.
676 If FS_DAX is enabled, then say Y.
681 depends on (X86_64 || PPC64)
682 depends on ZONE_DEVICE
683 depends on MMU && 64BIT
684 depends on MEMORY_HOTPLUG
685 depends on MEMORY_HOTREMOVE
686 depends on SPARSEMEM_VMEMMAP
688 config MIGRATE_VMA_HELPER
691 config DEV_PAGEMAP_OPS
696 select MIGRATE_VMA_HELPER
699 bool "HMM mirror CPU page table into a device page table"
700 depends on ARCH_HAS_HMM
704 Select HMM_MIRROR if you want to mirror a range of the CPU page table of a
705 process into a device page table. Here, mirror means "keep synchronized".
706 Prerequisites: the device must provide the ability to write-protect its
707 page tables (at PAGE_SIZE granularity), and must be able to recover from
708 the resulting potential page faults.
710 config DEVICE_PRIVATE
711 bool "Unaddressable device memory (GPU memory, ...)"
712 depends on ARCH_HAS_HMM
714 select DEV_PAGEMAP_OPS
717 Allows creation of struct pages to represent unaddressable device
718 memory; i.e., memory that is only accessible from the device (or
719 group of devices). You likely also want to select HMM_MIRROR.
722 bool "Addressable device memory (like GPU memory)"
723 depends on ARCH_HAS_HMM
725 select DEV_PAGEMAP_OPS
728 Allows creation of struct pages to represent addressable device
729 memory; i.e., memory that is accessible from both the device and
735 config ARCH_USES_HIGH_VMA_FLAGS
737 config ARCH_HAS_PKEYS
741 bool "Collect percpu memory statistics"
744 This feature collects and exposes statistics via debugfs. The
745 information includes global and per chunk statistics, which can
746 be used to help understand percpu memory usage.
749 bool "Enable infrastructure for get_user_pages_fast() benchmarking"
752 Provides /sys/kernel/debug/gup_benchmark that helps with testing
753 performance of get_user_pages_fast().
755 See tools/testing/selftests/vm/gup_benchmark.c
757 config ARCH_HAS_PTE_SPECIAL