menu "Memory Management options"

config SELECT_MEMORY_MODEL
	depends on ARCH_SELECT_MEMORY_MODEL

	depends on SELECT_MEMORY_MODEL
	default DISCONTIGMEM_MANUAL if ARCH_DISCONTIGMEM_DEFAULT
	default SPARSEMEM_MANUAL if ARCH_SPARSEMEM_DEFAULT
	default FLATMEM_MANUAL

	depends on !(ARCH_DISCONTIGMEM_ENABLE || ARCH_SPARSEMEM_ENABLE) || ARCH_FLATMEM_ENABLE
	  This option allows you to change some of the ways that
	  Linux manages its memory internally. Most users will
	  only have one option here: FLATMEM. This is normal

	  Some users of more advanced features like NUMA and
	  memory hotplug may have different options here.
	  DISCONTIGMEM is a more mature, better tested system,
	  but is incompatible with memory hotplug and may suffer
	  decreased performance over SPARSEMEM. If unsure between
	  "Sparse Memory" and "Discontiguous Memory", choose
	  "Discontiguous Memory".

	  If unsure, choose this option (Flat Memory) over any other.

config DISCONTIGMEM_MANUAL
	bool "Discontiguous Memory"
	depends on ARCH_DISCONTIGMEM_ENABLE
	  This option provides enhanced support for discontiguous
	  memory systems, over FLATMEM. These systems have holes
	  in their physical address spaces, and this option provides
	  more efficient handling of these holes. However, the vast
	  majority of hardware has quite flat address spaces, and
	  can have degraded performance from the extra overhead that

	  Many NUMA configurations will have this as the only option.

	  If unsure, choose "Flat Memory" over this option.

config SPARSEMEM_MANUAL
	depends on ARCH_SPARSEMEM_ENABLE
	  This will be the only option for some systems, including
	  memory hotplug systems. This is normal.

	  For many other systems, this will be an alternative to
	  "Discontiguous Memory". This option provides some potential
	  performance benefits, along with decreased code complexity,
	  but it is newer, and more experimental.

	  If unsure, choose "Discontiguous Memory" or "Flat Memory"

	depends on (!SELECT_MEMORY_MODEL && ARCH_DISCONTIGMEM_ENABLE) || DISCONTIGMEM_MANUAL

	depends on (!SELECT_MEMORY_MODEL && ARCH_SPARSEMEM_ENABLE) || SPARSEMEM_MANUAL

	depends on (!DISCONTIGMEM && !SPARSEMEM) || FLATMEM_MANUAL

config FLAT_NODE_MEM_MAP

# Both the NUMA code and DISCONTIGMEM use arrays of pg_data_t's
# to represent different areas of memory. This variable allows
# those dependencies to exist individually.
config NEED_MULTIPLE_NODES
	depends on DISCONTIGMEM || NUMA

config HAVE_MEMORY_PRESENT
	depends on ARCH_HAVE_MEMORY_PRESENT || SPARSEMEM

# SPARSEMEM_EXTREME (which is the default) does some bootmem
# allocations when memory_present() is called. If this cannot
# be done on your architecture, select this option. However,
# statically allocating the mem_section[] array can potentially
# consume vast quantities of .bss, so be careful.
#
# This option will also potentially produce smaller runtime code
# with gcc 3.4 and later.
config SPARSEMEM_STATIC

# Architecture platforms which require a two level mem_section in SPARSEMEM
# must select this option. This is usually for architecture platforms with
# an extremely sparse physical address space.
config SPARSEMEM_EXTREME
	depends on SPARSEMEM && !SPARSEMEM_STATIC

config SPARSEMEM_VMEMMAP_ENABLE

config SPARSEMEM_VMEMMAP
	bool "Sparse Memory virtual memmap"
	depends on SPARSEMEM && SPARSEMEM_VMEMMAP_ENABLE
	  SPARSEMEM_VMEMMAP uses a virtually mapped memmap to optimise
	  pfn_to_page and page_to_pfn operations. This is the most
	  efficient option when sufficient kernel resources are available.

config HAVE_MEMBLOCK_NODE_MAP

config HAVE_MEMBLOCK_PHYS_MAP

config HAVE_GENERIC_GUP

config ARCH_DISCARD_MEMBLOCK

config MEMORY_ISOLATION

# Only set this on architectures that have completely implemented the
# memory hotplug feature. If you are not sure, don't touch it.
config HAVE_BOOTMEM_INFO_NODE

# eventually, we can have this option just 'select SPARSEMEM'
config MEMORY_HOTPLUG
	bool "Allow for memory hot-add"
	depends on SPARSEMEM || X86_64_ACPI_NUMA
	depends on ARCH_ENABLE_MEMORY_HOTPLUG

config MEMORY_HOTPLUG_SPARSE
	depends on SPARSEMEM && MEMORY_HOTPLUG

config MEMORY_HOTPLUG_DEFAULT_ONLINE
	bool "Online the newly added memory blocks by default"
	depends on MEMORY_HOTPLUG
	  This option sets the default memory hotplug onlining policy
	  (/sys/devices/system/memory/auto_online_blocks), which
	  determines what happens to newly added memory regions. The policy
	  can always be changed at runtime.
	  See Documentation/memory-hotplug.txt for more information.

	  Say Y here if you want all hot-plugged memory blocks to appear in
	  'online' state by default.
	  Say N here if you want the default policy to keep all hot-plugged
	  memory blocks in 'offline' state.
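# Illustrative only (not part of the original help text): assuming sysfs is
# mounted at /sys and the kernel exposes the auto_online_blocks file named
# above, the runtime policy could be inspected and changed from a root shell:
#
#   cat /sys/devices/system/memory/auto_online_blocks
#   echo online > /sys/devices/system/memory/auto_online_blocks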

config MEMORY_HOTREMOVE
	bool "Allow for memory hot remove"
	select MEMORY_ISOLATION
	select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
	depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE

# Heavily threaded applications may benefit from splitting the mm-wide
# page_table_lock, so that faults on different parts of the user address
# space can be handled with less contention: split it at this NR_CPUS.
# Default to 4 for wider testing, though 8 might be more appropriate.
# ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
# PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
# DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
config SPLIT_PTLOCK_CPUS
	default "999999" if !MMU
	default "999999" if ARM && !CPU_CACHE_VIPT
	default "999999" if PARISC && !PA20

config ARCH_ENABLE_SPLIT_PMD_PTLOCK

# support for memory balloon
config MEMORY_BALLOON

# support for memory balloon compaction
config BALLOON_COMPACTION
	bool "Allow for balloon memory compaction/migration"
	depends on COMPACTION && MEMORY_BALLOON
	  Memory fragmentation introduced by ballooning might significantly
	  reduce the number of 2MB contiguous memory blocks that can be
	  used within a guest, thus imposing performance penalties associated
	  with the reduced number of transparent huge pages that could be used
	  by the guest workload. Allowing compaction and migration of memory
	  pages enlisted as being part of memory balloon devices avoids the
	  aforementioned scenario and helps improve memory defragmentation.

# support for memory compaction
	bool "Allow for memory compaction"
	  Compaction is the only memory management component to form
	  high order (larger physically contiguous) memory blocks
	  reliably. The page allocator relies on compaction heavily and
	  the lack of the feature can lead to unexpected OOM killer
	  invocations for high order memory requests. You shouldn't
	  disable this option unless there really is a strong reason for
	  it and then we would be really interested to hear about that at

# support for page migration
	bool "Page migration"
	depends on (NUMA || ARCH_ENABLE_MEMORY_HOTREMOVE || COMPACTION || CMA) && MMU
	  Allows the migration of the physical location of pages of processes
	  while the virtual addresses are not changed. This is useful in
	  two situations. The first is on NUMA systems to put pages nearer
	  to the processors accessing them. The second is when allocating huge
	  pages as migration can relocate pages to satisfy a huge page
	  allocation instead of reclaiming.

config ARCH_ENABLE_HUGEPAGE_MIGRATION

config ARCH_ENABLE_THP_MIGRATION

config PHYS_ADDR_T_64BIT

	bool "Enable bounce buffers"
	depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM)
	  Enable bounce buffers for devices that cannot access
	  the full range of memory available to the CPU. Enabled
	  by default when ZONE_DMA or HIGHMEM is selected, but you
	  may say n to override this.

	  An architecture should select this if it implements the
	  deprecated interface virt_to_bus(). All new architectures
	  should probably not select this.

	bool "Enable KSM for page merging"
	  Enable Kernel Samepage Merging: KSM periodically scans those areas
	  of an application's address space that an app has advised may be
	  mergeable. When it finds pages of identical content, it replaces
	  the many instances by a single page with that content, so
	  saving memory until one or another app needs to modify the content.
	  Recommended for use with KVM, or with other duplicative applications.
	  See Documentation/vm/ksm.rst for more information: KSM is inactive
	  until a program has madvised that an area is MADV_MERGEABLE, and
	  root has set /sys/kernel/mm/ksm/run to 1 (if CONFIG_SYSFS is set).
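# A minimal usage sketch of the two steps named above, assuming KSM and
# sysfs support are built in; "buf" and "len" are hypothetical names for a
# page-aligned buffer and its length:
#
#   madvise(buf, len, MADV_MERGEABLE);    /* in the application */
#   echo 1 > /sys/kernel/mm/ksm/run       # as root, start the KSM scanner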

config DEFAULT_MMAP_MIN_ADDR
	int "Low address space to protect from user allocation"
	  This is the portion of low virtual memory which should be protected
	  from userspace allocation. Keeping a user from writing to low pages
	  can help reduce the impact of kernel NULL pointer bugs.

	  For most ia64, ppc64 and x86 users with lots of address space
	  a value of 65536 is reasonable and should cause no problems.
	  On arm and other archs it should not be higher than 32768.
	  Programs which use vm86 functionality or have some need to map
	  this low address space will need CAP_SYS_RAWIO, or this
	  protection must be disabled by setting the value to 0.

	  This value can be changed after boot using the
	  /proc/sys/vm/mmap_min_addr tunable.
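# Illustrative only: the tunable named above can also be reached through
# sysctl(8). The value 65536 is simply the one suggested in the help text:
#
#   cat /proc/sys/vm/mmap_min_addr
#   sysctl -w vm.mmap_min_addr=65536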

config ARCH_SUPPORTS_MEMORY_FAILURE

config MEMORY_FAILURE
	depends on ARCH_SUPPORTS_MEMORY_FAILURE
	bool "Enable recovery from hardware memory errors"
	select MEMORY_ISOLATION
	  Enables code to recover from some memory failures on systems
	  with MCA recovery. This allows a system to continue running
	  even when some of its memory has uncorrected errors. This requires
	  special hardware support and typically ECC memory.

config HWPOISON_INJECT
	tristate "HWPoison pages injector"
	depends on MEMORY_FAILURE && DEBUG_KERNEL && PROC_FS
	select PROC_PAGE_MONITOR

config NOMMU_INITIAL_TRIM_EXCESS
	int "Turn on mmap() excess space trimming before booting"
	  The NOMMU mmap() frequently needs to allocate large contiguous chunks
	  of memory on which to store mappings, but it can only ask the system
	  allocator for chunks in 2^N*PAGE_SIZE amounts - which is frequently
	  more than it requires. To deal with this, mmap() is able to trim off
	  the excess and return it to the allocator.

	  If trimming is enabled, the excess is trimmed off and returned to the
	  system allocator, which can cause extra fragmentation, particularly
	  if there are a lot of transient processes.

	  If trimming is disabled, the excess is kept, but not used, which for
	  long-term mappings means that the space is wasted.

	  Trimming can be dynamically controlled through a sysctl option
	  (/proc/sys/vm/nr_trim_pages) which specifies the minimum number of
	  excess pages there must be before trimming should occur, or zero if
	  no trimming is to occur.

	  This option specifies the initial value of that sysctl. The default
	  of 1 says that all excess pages should be trimmed.

	  See Documentation/nommu-mmap.txt for more information.
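# Illustrative only, assuming a NOMMU kernel that exposes the sysctl named
# above; the values follow the semantics described in the help text:
#
#   echo 0 > /proc/sys/vm/nr_trim_pages    # never trim
#   echo 1 > /proc/sys/vm/nr_trim_pages    # trim all excess pages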

config TRANSPARENT_HUGEPAGE
	bool "Transparent Hugepage Support"
	depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE
	  Transparent Hugepages allows the kernel to use huge pages and
	  huge tlb transparently to the applications whenever possible.
	  This feature can improve computing performance for certain
	  applications by speeding up page faults during memory
	  allocation, by reducing the number of tlb misses and by speeding
	  up the pagetable walking.

	  On memory-constrained embedded systems, you may want to say N.

	prompt "Transparent Hugepage Support sysfs defaults"
	depends on TRANSPARENT_HUGEPAGE
	default TRANSPARENT_HUGEPAGE_ALWAYS
	  Selects the sysfs defaults for Transparent Hugepage Support.
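# Illustrative only: the compile-time default chosen here corresponds to a
# runtime setting in sysfs, which root can typically inspect and override
# (assuming sysfs is mounted at /sys):
#
#   cat /sys/kernel/mm/transparent_hugepage/enabled
#   echo madvise > /sys/kernel/mm/transparent_hugepage/enabled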

config TRANSPARENT_HUGEPAGE_ALWAYS
	  Enabling Transparent Hugepage always can increase the
	  memory footprint of applications without a guaranteed
	  benefit but it will work automatically for all applications.

config TRANSPARENT_HUGEPAGE_MADVISE
	  Enabling Transparent Hugepage madvise will only provide a
	  performance improvement to the applications using
	  madvise(MADV_HUGEPAGE) but it won't risk increasing the
	  memory footprint of applications without a guaranteed

config ARCH_WANTS_THP_SWAP

	depends on TRANSPARENT_HUGEPAGE && ARCH_WANTS_THP_SWAP && SWAP
	  Swap transparent huge pages in one piece, without splitting.
	  XXX: For now, swap cluster backing transparent huge page
	  will be split after swapout.

	  For selection by architectures with reasonable THP sizes.

config TRANSPARENT_HUGE_PAGECACHE
	depends on TRANSPARENT_HUGEPAGE

# UP and nommu archs use km based percpu allocator
config NEED_PER_CPU_KM

	bool "Enable cleancache driver to cache clean pages if tmem is present"
	  Cleancache can be thought of as a page-granularity victim cache
	  for clean pages that the kernel's pageframe replacement algorithm
	  (PFRA) would like to keep around, but can't since there isn't enough
	  memory. So when the PFRA "evicts" a page, it first attempts to use
	  cleancache code to put the data contained in that page into
	  "transcendent memory", memory that is not directly accessible or
	  addressable by the kernel and is of unknown and possibly
	  time-varying size. And when a cleancache-enabled
	  filesystem wishes to access a page in a file on disk, it first
	  checks cleancache to see if it already contains it; if it does,
	  the page is copied into the kernel and a disk access is avoided.
	  When a transcendent memory driver is available (such as zcache or
	  Xen transcendent memory), a significant I/O reduction
	  may be achieved. When none is available, all cleancache calls
	  are reduced to a single pointer-compare-against-NULL resulting
	  in a negligible performance hit.

	  If unsure, say Y to enable cleancache.

	bool "Enable frontswap to cache swap pages if tmem is present"
	  Frontswap is so named because it can be thought of as the opposite
	  of a "backing" store for a swap device. The data is stored into
	  "transcendent memory", memory that is not directly accessible or
	  addressable by the kernel and is of unknown and possibly
	  time-varying size. When space in transcendent memory is available,
	  a significant swap I/O reduction may be achieved. When none is
	  available, all frontswap calls are reduced to a single
	  pointer-compare-against-NULL resulting in a negligible performance
	  hit and swap data is stored as normal on the matching swap device.

	  If unsure, say Y to enable frontswap.

	bool "Contiguous Memory Allocator"
	select MEMORY_ISOLATION
	  This enables the Contiguous Memory Allocator which allows other
	  subsystems to allocate big physically-contiguous blocks of memory.
	  CMA reserves a region of memory and allows only movable pages to
	  be allocated from it. This way, the kernel can use the memory for
	  pagecache and when a subsystem requests a contiguous area, the
	  allocated pages are migrated away to serve the contiguous request.

	bool "CMA debug messages (DEVELOPMENT)"
	depends on DEBUG_KERNEL && CMA
	  Turns on debug messages in CMA. This produces KERN_DEBUG
	  messages for every CMA call as well as various messages while
	  processing calls such as dma_alloc_from_contiguous().
	  This option does not affect warning and error messages.

	bool "CMA debugfs interface"
	depends on CMA && DEBUG_FS
	  Turns on the DebugFS interface for CMA.

	int "Maximum count of the CMA areas"
	  CMA allows creating CMA areas for a particular purpose, mainly
	  for use as device-private areas. This parameter sets the maximum
	  number of CMA areas in the system.

	  If unsure, leave the default value "7".

config MEM_SOFT_DIRTY
	bool "Track memory changes"
	depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
	select PROC_PAGE_MONITOR
	  This option enables tracking of memory changes by introducing a
	  soft-dirty bit on PTEs. This bit is set when someone writes
	  into a page, just like the regular dirty bit, but unlike the latter
	  it can be cleared by hand.

	  See Documentation/admin-guide/mm/soft-dirty.rst for more details.
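# A rough sketch of the workflow described in soft-dirty.rst, assuming
# procfs is mounted at /proc; PID is a placeholder for the traced task:
#
#   echo 4 > /proc/PID/clear_refs    # clear the soft-dirty bits of the task
#   (let the task run for a while)
#   read /proc/PID/pagemap           # bit 55 of each entry is the soft-dirty bit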

	bool "Compressed cache for swap pages (EXPERIMENTAL)"
	depends on FRONTSWAP && CRYPTO=y
	  A lightweight compressed cache for swap pages. It takes
	  pages that are in the process of being swapped out and attempts to
	  compress them into a dynamically allocated RAM-based memory pool.
	  This can result in a significant I/O reduction on swap device and,
	  in the case where decompressing from RAM is faster than swap device
	  reads, can also improve workload performance.

	  This is marked experimental because it is a new feature (as of
	  v3.11) that interacts heavily with memory reclaim. While these
	  interactions don't cause any known issues on simple memory setups,
	  they have not been fully explored on the large set of potential
	  configurations and workloads that exist.

	tristate "Common API for compressed memory storage"
	  Compressed memory storage API. This allows using either zbud or

	tristate "Low (Up to 2x) density storage for compressed pages"
	  A special purpose allocator for storing compressed pages.
	  It is designed to store up to two compressed pages per physical
	  page. While this design limits storage density, it has simple and
	  deterministic reclaim properties that make it preferable to a higher
	  density approach when reclaim will be used.

	tristate "Up to 3x density storage for compressed pages"
	  A special purpose allocator for storing compressed pages.
	  It is designed to store up to three compressed pages per physical
	  page. It is a ZBUD derivative so the simplicity and determinism are

	tristate "Memory allocator for compressed pages"
	  zsmalloc is a slab-based memory allocator designed to store
	  compressed RAM pages. zsmalloc uses virtual memory mapping
	  in order to reduce fragmentation. However, this results in a
	  non-standard allocator interface where a handle, not a pointer, is
	  returned by an alloc(). This handle must be mapped in order to
	  access the allocated space.

config PGTABLE_MAPPING
	bool "Use page table mapping to access object in zsmalloc"
	  By default, zsmalloc uses a copy-based object mapping method to
	  access allocations that span two pages. However, if a particular
	  architecture (e.g., ARM) performs VM mapping faster than copying,
	  then you should select this. This causes zsmalloc to use page table
	  mapping rather than copying for object mapping.

	  You can check speed with zsmalloc benchmark:
	  https://github.com/spartacus06/zsmapbench

	bool "Export zsmalloc statistics"
	  This option enables code in zsmalloc to collect various
	  statistics about what's happening in zsmalloc and exports that
	  information to userspace via debugfs.

config GENERIC_EARLY_IOREMAP

config MAX_STACK_SIZE_MB
	int "Maximum user stack size for 32-bit processes (MB)"
	depends on STACK_GROWSUP && (!64BIT || COMPAT)
	  This is the maximum stack size in Megabytes in the VM layout of 32-bit
	  user processes when the stack grows upwards (currently only on parisc
	  arch). The stack will be located at the highest memory address minus
	  the given value, unless the RLIMIT_STACK hard limit is changed to a
	  smaller value in which case that is used.

	  A sane initial value is 80 MB.

config DEFERRED_STRUCT_PAGE_INIT
	bool "Defer initialisation of struct pages to kthreads"
	depends on !NEED_PER_CPU_KM
	  Ordinarily all struct pages are initialised during early boot in a
	  single thread. On very large machines this can take a considerable
	  amount of time. If this option is set, large machines will bring up
	  a subset of memmap at boot and then initialise the rest in parallel
	  by starting a one-off "pgdatinitX" kernel thread for each node X. This
	  has a potential performance impact on processes running early in the
	  lifetime of the system until these kthreads finish the

config IDLE_PAGE_TRACKING
	bool "Enable idle page tracking"
	depends on SYSFS && MMU
	select PAGE_EXTENSION if !64BIT
	  This feature allows estimating the number of user pages that have
	  not been touched during a given period of time. This information can
	  be useful to tune memory cgroup limits and/or for job placement
	  within a compute cluster.

	  See Documentation/admin-guide/mm/idle_page_tracking.rst for

# arch_add_memory() comprehends device memory
config ARCH_HAS_ZONE_DEVICE

	bool "Device memory (pmem, HMM, etc...) hotplug support"
	depends on MEMORY_HOTPLUG
	depends on MEMORY_HOTREMOVE
	depends on SPARSEMEM_VMEMMAP
	depends on ARCH_HAS_ZONE_DEVICE
	  Device memory hotplug support allows for establishing pmem,
	  or other device driver discovered memory regions, in the
	  memmap. This allows pfn_to_page() lookups of otherwise
	  "device-physical" addresses which is needed for using a DAX
	  mapping in an O_DIRECT operation, among other things.

	  If FS_DAX is enabled, then say Y.

	depends on (X86_64 || PPC64)
	depends on ZONE_DEVICE
	depends on MMU && 64BIT
	depends on MEMORY_HOTPLUG
	depends on MEMORY_HOTREMOVE
	depends on SPARSEMEM_VMEMMAP

config MIGRATE_VMA_HELPER

config DEV_PAGEMAP_OPS

	select MIGRATE_VMA_HELPER

	bool "HMM mirror CPU page table into a device page table"
	depends on ARCH_HAS_HMM
	  Select HMM_MIRROR if you want to mirror a range of the CPU page table of a
	  process into a device page table. Here, mirror means "keep synchronized".
	  Prerequisites: the device must provide the ability to write-protect its
	  page tables (at PAGE_SIZE granularity), and must be able to recover from
	  the resulting potential page faults.

config DEVICE_PRIVATE
	bool "Unaddressable device memory (GPU memory, ...)"
	depends on ARCH_HAS_HMM
	select DEV_PAGEMAP_OPS
	  Allows creation of struct pages to represent unaddressable device
	  memory; i.e., memory that is only accessible from the device (or
	  group of devices). You likely also want to select HMM_MIRROR.

	bool "Addressable device memory (like GPU memory)"
	depends on ARCH_HAS_HMM
	select DEV_PAGEMAP_OPS
	  Allows creation of struct pages to represent addressable device
	  memory; i.e., memory that is accessible from both the device and

config ARCH_USES_HIGH_VMA_FLAGS

config ARCH_HAS_PKEYS

	bool "Collect percpu memory statistics"
	  This feature collects and exposes statistics via debugfs. The
	  information includes global and per chunk statistics, which can
	  be used to help understand percpu memory usage.

	bool "Enable infrastructure for get_user_pages_fast() benchmarking"
	  Provides /sys/kernel/debug/gup_benchmark that helps with testing
	  performance of get_user_pages_fast().

	  See tools/testing/selftests/vm/gup_benchmark.c

config ARCH_HAS_PTE_SPECIAL