2 menu "Memory Management options"
4 config SELECT_MEMORY_MODEL
6 depends on ARCH_SELECT_MEMORY_MODEL
10 depends on SELECT_MEMORY_MODEL
11 default DISCONTIGMEM_MANUAL if ARCH_DISCONTIGMEM_DEFAULT
12 default SPARSEMEM_MANUAL if ARCH_SPARSEMEM_DEFAULT
13 default FLATMEM_MANUAL
17 depends on !(ARCH_DISCONTIGMEM_ENABLE || ARCH_SPARSEMEM_ENABLE) || ARCH_FLATMEM_ENABLE
19 This option allows you to change some of the ways that
20 Linux manages its memory internally. Most users will
21 only have one option here: FLATMEM. This is normal
24 Some users of more advanced features like NUMA and
25 memory hotplug may have different options here.
26 DISCONTIGMEM is a more mature, better tested system,
27 but is incompatible with memory hotplug and may suffer
28 decreased performance over SPARSEMEM. If unsure between
29 "Sparse Memory" and "Discontiguous Memory", choose
30 "Discontiguous Memory".
32 If unsure, choose this option (Flat Memory) over any other.
34 config DISCONTIGMEM_MANUAL
35 bool "Discontiguous Memory"
36 depends on ARCH_DISCONTIGMEM_ENABLE
38 This option provides enhanced support for discontiguous
39 memory systems, over FLATMEM. These systems have holes
40 in their physical address spaces, and this option provides
41 more efficient handling of these holes. However, the vast
42 majority of hardware has quite flat address spaces, and
43 can have degraded performance from the extra overhead that
46 Many NUMA configurations will have this as the only option.
48 If unsure, choose "Flat Memory" over this option.
50 config SPARSEMEM_MANUAL
52 depends on ARCH_SPARSEMEM_ENABLE
54 This will be the only option for some systems, including
55 memory hotplug systems. This is normal.
57 For many other systems, this will be an alternative to
58 "Discontiguous Memory". This option provides some potential
59 performance benefits, along with decreased code complexity,
60 but it is newer, and more experimental.
62 If unsure, choose "Discontiguous Memory" or "Flat Memory"
69 depends on (!SELECT_MEMORY_MODEL && ARCH_DISCONTIGMEM_ENABLE) || DISCONTIGMEM_MANUAL
73 depends on (!SELECT_MEMORY_MODEL && ARCH_SPARSEMEM_ENABLE) || SPARSEMEM_MANUAL
77 depends on (!DISCONTIGMEM && !SPARSEMEM) || FLATMEM_MANUAL
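#
# Illustrative sketch (not part of the original file). Assuming an
# architecture that sets ARCH_SELECT_MEMORY_MODEL and ARCH_SPARSEMEM_ENABLE,
# choosing "Sparse Memory" in the choice above would be expected to leave a
# .config fragment along these lines:
#
#   CONFIG_SELECT_MEMORY_MODEL=y
#   CONFIG_SPARSEMEM_MANUAL=y
#   CONFIG_SPARSEMEM=y
#
# With SPARSEMEM set and FLATMEM_MANUAL unset, the FLATMEM dependency
# (!DISCONTIGMEM && !SPARSEMEM) || FLATMEM_MANUAL is not satisfied, so
# FLATMEM stays off.
#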
79 config FLAT_NODE_MEM_MAP
84 # Both the NUMA code and DISCONTIGMEM use arrays of pg_data_t's
85 # to represent different areas of memory. This variable allows
86 # those dependencies to exist individually.
88 config NEED_MULTIPLE_NODES
90 depends on DISCONTIGMEM || NUMA
92 config HAVE_MEMORY_PRESENT
94 depends on ARCH_HAVE_MEMORY_PRESENT || SPARSEMEM
97 # SPARSEMEM_EXTREME (which is the default) does some bootmem
98 # allocations when memory_present() is called. If this cannot
99 # be done on your architecture, select this option. However,
100 # statically allocating the mem_section[] array can potentially
101 # consume vast quantities of .bss, so be careful.
103 # This option will also potentially produce smaller runtime code
104 # with gcc 3.4 and later.
106 config SPARSEMEM_STATIC
110 # Architecture platforms which require a two level mem_section in SPARSEMEM
111 # must select this option. This is usually for architecture platforms with
112 # an extremely sparse physical address space.
114 config SPARSEMEM_EXTREME
116 depends on SPARSEMEM && !SPARSEMEM_STATIC
118 config SPARSEMEM_VMEMMAP_ENABLE
121 config SPARSEMEM_VMEMMAP
122 bool "Sparse Memory virtual memmap"
123 depends on SPARSEMEM && SPARSEMEM_VMEMMAP_ENABLE
126 SPARSEMEM_VMEMMAP uses a virtually mapped memmap to optimise
127 pfn_to_page and page_to_pfn operations. This is the most
128 efficient option when sufficient kernel resources are available.
133 config HAVE_MEMBLOCK_NODE_MAP
136 config HAVE_MEMBLOCK_PHYS_MAP
139 config HAVE_GENERIC_GUP
142 config ARCH_DISCARD_MEMBLOCK
148 config MEMORY_ISOLATION
152 # Only set this on architectures that have completely implemented the memory
153 # hotplug feature. If you are not sure, don't touch it.
155 config HAVE_BOOTMEM_INFO_NODE
158 # eventually, we can have this option just 'select SPARSEMEM'
159 config MEMORY_HOTPLUG
160 bool "Allow for memory hot-add"
161 depends on SPARSEMEM || X86_64_ACPI_NUMA
162 depends on ARCH_ENABLE_MEMORY_HOTPLUG
164 config MEMORY_HOTPLUG_SPARSE
166 depends on SPARSEMEM && MEMORY_HOTPLUG
168 config MEMORY_HOTPLUG_DEFAULT_ONLINE
169 bool "Online the newly added memory blocks by default"
171 depends on MEMORY_HOTPLUG
173 This option sets the default for the memory hotplug onlining policy
174 (/sys/devices/system/memory/auto_online_blocks), which determines
175 what happens to newly added memory regions. The policy can always be
176 changed at runtime.
177 See Documentation/memory-hotplug.txt for more information.
179 Say Y here if you want all hot-plugged memory blocks to appear in
180 'online' state by default.
181 Say N here if you want the default policy to keep all hot-plugged
182 memory blocks in 'offline' state.
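#
# Illustrative usage for MEMORY_HOTPLUG_DEFAULT_ONLINE (a sketch, assuming
# sysfs is mounted at /sys and the commands are run as root). The policy file
# named in the help text above can be inspected and changed at runtime:
#
#   cat /sys/devices/system/memory/auto_online_blocks
#   echo online > /sys/devices/system/memory/auto_online_blocks
#   echo offline > /sys/devices/system/memory/auto_online_blocks
#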
184 config MEMORY_HOTREMOVE
185 bool "Allow for memory hot remove"
186 select MEMORY_ISOLATION
187 select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
188 depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
191 # Heavily threaded applications may benefit from splitting the mm-wide
192 # page_table_lock, so that faults on different parts of the user address
193 # space can be handled with less contention: split it at this NR_CPUS.
194 # Default to 4 for wider testing, though 8 might be more appropriate.
195 # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
196 # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
197 # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
199 config SPLIT_PTLOCK_CPUS
201 default "999999" if !MMU
202 default "999999" if ARM && !CPU_CACHE_VIPT
203 default "999999" if PARISC && !PA20
206 config ARCH_ENABLE_SPLIT_PMD_PTLOCK
210 # support for memory balloon
211 config MEMORY_BALLOON
215 # support for memory balloon compaction
216 config BALLOON_COMPACTION
217 bool "Allow for balloon memory compaction/migration"
219 depends on COMPACTION && MEMORY_BALLOON
221 Memory fragmentation introduced by ballooning might significantly
222 reduce the number of 2MB contiguous memory blocks that can be
223 used within a guest, thus imposing performance penalties associated
224 with the reduced number of transparent huge pages that could be used
225 by the guest workload. Allowing the compaction & migration for memory
226 pages enlisted as being part of memory balloon devices avoids the
227 aforementioned scenario and helps improve memory defragmentation.
230 # support for memory compaction
232 bool "Allow for memory compaction"
237 Compaction is the only memory management component to form
238 high order (larger physically contiguous) memory blocks
239 reliably. The page allocator relies on compaction heavily and
240 the lack of the feature can lead to unexpected OOM killer
241 invocations for high order memory requests. You shouldn't
242 disable this option unless there really is a strong reason for
243 it and then we would be really interested to hear about that at
247 # support for page migration
250 bool "Page migration"
252 depends on (NUMA || ARCH_ENABLE_MEMORY_HOTREMOVE || COMPACTION || CMA) && MMU
254 Allows the migration of the physical location of pages of processes
255 while the virtual addresses are not changed. This is useful in
256 two situations. The first is on NUMA systems to put pages nearer
257 to the processors accessing them. The second is when allocating huge
258 pages as migration can relocate pages to satisfy a huge page
259 allocation instead of reclaiming.
261 config ARCH_ENABLE_HUGEPAGE_MIGRATION
264 config ARCH_ENABLE_THP_MIGRATION
267 config PHYS_ADDR_T_64BIT
271 bool "Enable bounce buffers"
273 depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM)
275 Enable bounce buffers for devices that cannot access
276 the full range of memory available to the CPU. Enabled
277 by default when ZONE_DMA or HIGHMEM is selected, but you
278 may say n to override this.
288 An architecture should select this if it implements the
289 deprecated interface virt_to_bus(). All new architectures
290 should probably not select this.
298 bool "Enable KSM for page merging"
301 Enable Kernel Samepage Merging: KSM periodically scans those areas
302 of an application's address space that an app has advised may be
303 mergeable. When it finds pages of identical content, it replaces
304 the many instances by a single page with that content, so
305 saving memory until one or another app needs to modify the content.
306 Recommended for use with KVM, or with other duplicative applications.
307 See Documentation/vm/ksm.rst for more information: KSM is inactive
308 until a program has madvised that an area is MADV_MERGEABLE, and
309 root has set /sys/kernel/mm/ksm/run to 1 (if CONFIG_SYSFS is set).
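#
# Illustrative usage for KSM (a sketch, assuming CONFIG_SYSFS is set and the
# commands are run as root). As the help text above notes, nothing is merged
# until an application has marked a region with madvise(MADV_MERGEABLE) and
# root has switched the scanner on:
#
#   echo 1 > /sys/kernel/mm/ksm/run         # start scanning
#   cat /sys/kernel/mm/ksm/pages_sharing    # how many mappings share merged pages
#   echo 0 > /sys/kernel/mm/ksm/run         # stop scanning, keep merged pages
#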
311 config DEFAULT_MMAP_MIN_ADDR
312 int "Low address space to protect from user allocation"
316 This is the portion of low virtual memory which should be protected
317 from userspace allocation. Keeping a user from writing to low pages
318 can help reduce the impact of kernel NULL pointer bugs.
320 For most ia64, ppc64 and x86 users with lots of address space
321 a value of 65536 is reasonable and should cause no problems.
322 On arm and other archs it should not be higher than 32768.
323 Programs which use vm86 functionality or have some need to map
324 this low address space will need CAP_SYS_RAWIO, or must disable this
325 protection by setting the value to 0.
327 This value can be changed after boot using the
328 /proc/sys/vm/mmap_min_addr tunable.
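#
# Illustrative usage for DEFAULT_MMAP_MIN_ADDR (a sketch; the value 65536 is
# the one the help text above suggests for ia64, ppc64 and x86, and the
# runtime commands assume root):
#
#   CONFIG_DEFAULT_MMAP_MIN_ADDR=65536          (.config, build-time default)
#
#   cat /proc/sys/vm/mmap_min_addr              (current value)
#   echo 65536 > /proc/sys/vm/mmap_min_addr     (change it after boot)
#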
330 config ARCH_SUPPORTS_MEMORY_FAILURE
333 config MEMORY_FAILURE
335 depends on ARCH_SUPPORTS_MEMORY_FAILURE
336 bool "Enable recovery from hardware memory errors"
337 select MEMORY_ISOLATION
340 Enables code to recover from some memory failures on systems
341 with MCA recovery. This allows a system to continue running
342 even when some of its memory has uncorrected errors. This requires
343 special hardware support and typically ECC memory.
345 config HWPOISON_INJECT
346 tristate "HWPoison pages injector"
347 depends on MEMORY_FAILURE && DEBUG_KERNEL && PROC_FS
348 select PROC_PAGE_MONITOR
350 config NOMMU_INITIAL_TRIM_EXCESS
351 int "Turn on mmap() excess space trimming before booting"
355 The NOMMU mmap() frequently needs to allocate large contiguous chunks
356 of memory on which to store mappings, but it can only ask the system
357 allocator for chunks in 2^N*PAGE_SIZE amounts - which is frequently
358 more than it requires. To deal with this, mmap() is able to trim off
359 the excess and return it to the allocator.
361 If trimming is enabled, the excess is trimmed off and returned to the
362 system allocator, which can cause extra fragmentation, particularly
363 if there are a lot of transient processes.
365 If trimming is disabled, the excess is kept, but not used, which for
366 long-term mappings means that the space is wasted.
368 Trimming can be dynamically controlled through a sysctl option
369 (/proc/sys/vm/nr_trim_pages) which specifies the minimum number of
370 excess pages there must be before trimming should occur, or zero if
371 no trimming is to occur.
373 This option specifies the initial value of that sysctl. The default
374 of 1 says that all excess pages should be trimmed.
376 See Documentation/nommu-mmap.txt for more information.
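#
# Illustrative usage for NOMMU_INITIAL_TRIM_EXCESS (a sketch, assuming a
# NOMMU kernel and root access). The build-time value only seeds the sysctl
# described in the help text above, which can then be adjusted at runtime:
#
#   CONFIG_NOMMU_INITIAL_TRIM_EXCESS=1        (.config: trim all excess pages)
#
#   cat /proc/sys/vm/nr_trim_pages            (current threshold)
#   echo 0 > /proc/sys/vm/nr_trim_pages       (disable trimming)
#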
378 config TRANSPARENT_HUGEPAGE
379 bool "Transparent Hugepage Support"
380 depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE
382 select RADIX_TREE_MULTIORDER
384 Transparent Hugepages allows the kernel to use huge pages and
385 huge tlb transparently to the applications whenever possible.
386 This feature can improve computing performance for certain
387 applications by speeding up page faults during memory
388 allocation, by reducing the number of tlb misses and by speeding
389 up the pagetable walking.
391 On memory-constrained embedded systems, you may want to say N.
394 prompt "Transparent Hugepage Support sysfs defaults"
395 depends on TRANSPARENT_HUGEPAGE
396 default TRANSPARENT_HUGEPAGE_ALWAYS
398 Selects the sysfs defaults for Transparent Hugepage Support.
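#
# Illustrative usage (a sketch, assuming the usual sysfs location for the
# THP controls and root access). The choice made here only sets the value
# seen at boot; it can be inspected and overridden at runtime:
#
#   cat /sys/kernel/mm/transparent_hugepage/enabled
#   echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
#   echo always > /sys/kernel/mm/transparent_hugepage/enabled
#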
400 config TRANSPARENT_HUGEPAGE_ALWAYS
403 Enabling Transparent Hugepage always can increase the
404 memory footprint of applications without a guaranteed
405 benefit but it will work automatically for all applications.
407 config TRANSPARENT_HUGEPAGE_MADVISE
410 Enabling Transparent Hugepage madvise will only provide a
411 performance benefit to the applications using
412 madvise(MADV_HUGEPAGE), but it won't risk increasing the
413 memory footprint of applications without a guaranteed
417 config ARCH_WANTS_THP_SWAP
422 depends on TRANSPARENT_HUGEPAGE && ARCH_WANTS_THP_SWAP && SWAP
424 Swap transparent huge pages in one piece, without splitting.
425 XXX: For now, swap cluster backing transparent huge page
426 will be split after swapout.
428 For selection by architectures with reasonable THP sizes.
430 config TRANSPARENT_HUGE_PAGECACHE
432 depends on TRANSPARENT_HUGEPAGE
435 # UP and nommu archs use km based percpu allocator
437 config NEED_PER_CPU_KM
443 bool "Enable cleancache driver to cache clean pages if tmem is present"
446 Cleancache can be thought of as a page-granularity victim cache
447 for clean pages that the kernel's pageframe replacement algorithm
448 (PFRA) would like to keep around, but can't since there isn't enough
449 memory. So when the PFRA "evicts" a page, it first attempts to use
450 cleancache code to put the data contained in that page into
451 "transcendent memory", memory that is not directly accessible or
452 addressable by the kernel and is of unknown and possibly
453 time-varying size. And when a cleancache-enabled
454 filesystem wishes to access a page in a file on disk, it first
455 checks cleancache to see if it already contains it; if it does,
456 the page is copied into the kernel and a disk access is avoided.
457 When a transcendent memory driver is available (such as zcache or
458 Xen transcendent memory), a significant I/O reduction
459 may be achieved. When none is available, all cleancache calls
460 are reduced to a single pointer-compare-against-NULL resulting
461 in a negligible performance hit.
463 If unsure, say Y to enable cleancache.
466 bool "Enable frontswap to cache swap pages if tmem is present"
470 Frontswap is so named because it can be thought of as the opposite
471 of a "backing" store for a swap device. The data is stored into
472 "transcendent memory", memory that is not directly accessible or
473 addressable by the kernel and is of unknown and possibly
474 time-varying size. When space in transcendent memory is available,
475 a significant swap I/O reduction may be achieved. When none is
476 available, all frontswap calls are reduced to a single pointer-
477 compare-against-NULL resulting in a negligible performance hit
478 and swap data is stored as normal on the matching swap device.
480 If unsure, say Y to enable frontswap.
483 bool "Contiguous Memory Allocator"
484 depends on HAVE_MEMBLOCK && MMU
486 select MEMORY_ISOLATION
488 This enables the Contiguous Memory Allocator which allows other
489 subsystems to allocate big physically-contiguous blocks of memory.
490 CMA reserves a region of memory and allows only movable pages to
491 be allocated from it. This way, the kernel can use the memory for
492 pagecache and, when a subsystem requests a contiguous area, the
493 allocated pages are migrated away to serve the contiguous request.
498 bool "CMA debug messages (DEVELOPMENT)"
499 depends on DEBUG_KERNEL && CMA
501 Turns on debug messages in CMA. This produces KERN_DEBUG
502 messages for every CMA call as well as various messages while
503 processing calls such as dma_alloc_from_contiguous().
504 This option does not affect warning and error messages.
507 bool "CMA debugfs interface"
508 depends on CMA && DEBUG_FS
510 Turns on the DebugFS interface for CMA.
513 int "Maximum count of the CMA areas"
517 CMA allows CMA areas to be created for a particular purpose, mainly
518 for use as a device private area. This parameter sets the maximum
519 number of CMA areas in the system.
521 If unsure, leave the default value "7".
523 config MEM_SOFT_DIRTY
524 bool "Track memory changes"
525 depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
526 select PROC_PAGE_MONITOR
528 This option enables tracking of memory changes by introducing a
529 soft-dirty bit on pte-s. This bit is set when someone writes
530 into a page, just like the regular dirty bit, but unlike the latter
531 it can be cleared by hand.
533 See Documentation/admin-guide/mm/soft-dirty.rst for more details.
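#
# Illustrative usage for MEM_SOFT_DIRTY (a sketch based on the procfs
# interface described in the soft-dirty document referenced above; <pid> is
# a placeholder for the task being tracked):
#
#   echo 4 > /proc/<pid>/clear_refs     # clear soft-dirty bits for the task
#   ... let the task run ...
#   read /proc/<pid>/pagemap            # bit 55 of each 64-bit entry is the
#                                       # soft-dirty bit for that page
#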
536 bool "Compressed cache for swap pages (EXPERIMENTAL)"
537 depends on FRONTSWAP && CRYPTO=y
542 A lightweight compressed cache for swap pages. It takes
543 pages that are in the process of being swapped out and attempts to
544 compress them into a dynamically allocated RAM-based memory pool.
545 This can result in a significant I/O reduction on swap device and,
546 in the case where decompressing from RAM is faster than swap device
547 reads, can also improve workload performance.
549 This is marked experimental because it is a new feature (as of
550 v3.11) that interacts heavily with memory reclaim. While these
551 interactions don't cause any known issues on simple memory setups,
552 they have not been fully explored on the large set of potential
553 configurations and workloads that exist.
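#
# Illustrative usage for the compressed swap cache above (a sketch, assuming
# it is built in as zswap and exposes the usual module parameter; run the
# runtime commands as root):
#
#   zswap.enabled=1                                   (kernel command line)
#
#   cat /sys/module/zswap/parameters/enabled          (current state)
#   echo 1 > /sys/module/zswap/parameters/enabled     (enable at runtime)
#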
556 tristate "Common API for compressed memory storage"
559 Compressed memory storage API. This allows using either zbud or
563 tristate "Low (Up to 2x) density storage for compressed pages"
566 A special purpose allocator for storing compressed pages.
567 It is designed to store up to two compressed pages per physical
568 page. While this design limits storage density, it has simple and
569 deterministic reclaim properties that make it preferable to a higher
570 density approach when reclaim will be used.
573 tristate "Up to 3x density storage for compressed pages"
577 A special purpose allocator for storing compressed pages.
578 It is designed to store up to three compressed pages per physical
579 page. It is a ZBUD derivative so the simplicity and determinism are
583 tristate "Memory allocator for compressed pages"
587 zsmalloc is a slab-based memory allocator designed to store
588 compressed RAM pages. zsmalloc uses virtual memory mapping
589 in order to reduce fragmentation. However, this results in a
590 non-standard allocator interface where a handle, not a pointer, is
591 returned by an alloc(). This handle must be mapped in order to
592 access the allocated space.
594 config PGTABLE_MAPPING
595 bool "Use page table mapping to access object in zsmalloc"
598 By default, zsmalloc uses a copy-based object mapping method to
599 access allocations that span two pages. However, if a particular
600 architecture (e.g., ARM) performs VM mapping faster than copying,
601 then you should select this. This causes zsmalloc to use page table
602 mapping rather than copying for object mapping.
604 You can check speed with zsmalloc benchmark:
605 https://github.com/spartacus06/zsmapbench
608 bool "Export zsmalloc statistics"
612 This option enables code in zsmalloc to collect various
613 statistics about what is happening in zsmalloc and exports that
614 information to userspace via debugfs.
617 config GENERIC_EARLY_IOREMAP
620 config MAX_STACK_SIZE_MB
621 int "Maximum user stack size for 32-bit processes (MB)"
624 depends on STACK_GROWSUP && (!64BIT || COMPAT)
626 This is the maximum stack size in Megabytes in the VM layout of 32-bit
627 user processes when the stack grows upwards (currently only on parisc
628 arch). The stack will be located at the highest memory address minus
629 the given value, unless the RLIMIT_STACK hard limit is changed to a
630 smaller value in which case that is used.
632 A sane initial value is 80 MB.
634 config DEFERRED_STRUCT_PAGE_INIT
635 bool "Defer initialisation of struct pages to kthreads"
637 depends on NO_BOOTMEM
639 depends on !NEED_PER_CPU_KM
641 Ordinarily all struct pages are initialised during early boot in a
642 single thread. On very large machines this can take a considerable
643 amount of time. If this option is set, large machines will bring up
644 a subset of memmap at boot and then initialise the rest in parallel
645 by starting a one-off "pgdatinitX" kernel thread for each node X. This
646 has a potential performance impact on processes running early in the
647 lifetime of the system until these kthreads finish the
650 config IDLE_PAGE_TRACKING
651 bool "Enable idle page tracking"
652 depends on SYSFS && MMU
653 select PAGE_EXTENSION if !64BIT
655 This feature allows estimating the number of user pages that have
656 not been touched during a given period of time. This information can
657 be useful to tune memory cgroup limits and/or for job placement
658 within a compute cluster.
660 See Documentation/admin-guide/mm/idle_page_tracking.rst for
663 # arch_add_memory() comprehends device memory
664 config ARCH_HAS_ZONE_DEVICE
668 bool "Device memory (pmem, HMM, etc...) hotplug support"
669 depends on MEMORY_HOTPLUG
670 depends on MEMORY_HOTREMOVE
671 depends on SPARSEMEM_VMEMMAP
672 depends on ARCH_HAS_ZONE_DEVICE
673 select RADIX_TREE_MULTIORDER
676 Device memory hotplug support allows for establishing pmem,
677 or other device driver discovered memory regions, in the
678 memmap. This allows pfn_to_page() lookups of otherwise
679 "device-physical" addresses which is needed for using a DAX
680 mapping in an O_DIRECT operation, among other things.
682 If FS_DAX is enabled, then say Y.
687 depends on (X86_64 || PPC64)
688 depends on ZONE_DEVICE
689 depends on MMU && 64BIT
690 depends on MEMORY_HOTPLUG
691 depends on MEMORY_HOTREMOVE
692 depends on SPARSEMEM_VMEMMAP
694 config MIGRATE_VMA_HELPER
697 config DEV_PAGEMAP_OPS
702 select MIGRATE_VMA_HELPER
705 bool "HMM mirror CPU page table into a device page table"
706 depends on ARCH_HAS_HMM
710 Select HMM_MIRROR if you want to mirror a range of the CPU page table of a
711 process into a device page table. Here, mirror means "keep synchronized".
712 Prerequisites: the device must provide the ability to write-protect its
713 page tables (at PAGE_SIZE granularity), and must be able to recover from
714 the resulting potential page faults.
716 config DEVICE_PRIVATE
717 bool "Unaddressable device memory (GPU memory, ...)"
718 depends on ARCH_HAS_HMM
720 select DEV_PAGEMAP_OPS
723 Allows creation of struct pages to represent unaddressable device
724 memory; i.e., memory that is only accessible from the device (or
725 group of devices). You likely also want to select HMM_MIRROR.
728 bool "Addressable device memory (like GPU memory)"
729 depends on ARCH_HAS_HMM
731 select DEV_PAGEMAP_OPS
734 Allows creation of struct pages to represent addressable device
735 memory; i.e., memory that is accessible from both the device and
741 config ARCH_USES_HIGH_VMA_FLAGS
743 config ARCH_HAS_PKEYS
747 bool "Collect percpu memory statistics"
750 This feature collects and exposes statistics via debugfs. The
751 information includes global and per chunk statistics, which can
752 be used to help understand percpu memory usage.
755 bool "Enable infrastructure for get_user_pages_fast() benchmarking"
758 Provides /sys/kernel/debug/gup_benchmark that helps with testing
759 performance of get_user_pages_fast().
761 See tools/testing/selftests/vm/gup_benchmark.c
763 config ARCH_HAS_PTE_SPECIAL