Documentation/admin-guide/l1tf.rst

   1 L1TF - L1 Terminal Fault
   2 ========================
   3
   4 L1 Terminal Fault is a hardware vulnerability which allows unprivileged
   5 speculative access to data which is available in the Level 1 Data Cache
   6 when the page table entry controlling the virtual address, which is used
   7 for the access, has the Present bit cleared or other reserved bits set.
   8
   9 Affected processors
  10 -------------------
  11
  12 This vulnerability affects a wide range of Intel processors. The
  13 vulnerability is not present on:
  14
  15    - Processors from AMD, Centaur and other non-Intel vendors
  16
  17    - Older processor models, where the CPU family is < 6
  18
  19    - A range of Intel ATOM processors (Cedarview, Cloverview, Lincroft,
  20      Penwell, Pineview, Silvermont, Airmont, Merrifield)
  21
  22    - The Intel XEON PHI family
  23
  24    - Intel processors which have the ARCH_CAP_RDCL_NO bit set in the
  25      IA32_ARCH_CAPABILITIES MSR. If the bit is set the CPU is not affected
  26      by the Meltdown vulnerability either. These CPUs should become
  27      available by end of 2018.
  28
  29 Whether a processor is affected or not can be read out from the L1TF
  30 vulnerability file in sysfs. See :ref:`l1tf_sys_info`.
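
The RDCL_NO check from the list above can be illustrated with a minimal sketch. It assumes that the raw MSR value is available, for instance through the ``/dev/cpu/<N>/msr`` device of the ``msr`` module, which needs root. The helper names are hypothetical; the sysfs file remains the supported way to query the status.

```python
import os
import struct

# IA32_ARCH_CAPABILITIES is MSR 0x10A; RDCL_NO is bit 0 of its value.
MSR_IA32_ARCH_CAPABILITIES = 0x10A
ARCH_CAP_RDCL_NO = 1 << 0


def rdcl_no_set(arch_capabilities: int) -> bool:
    """Return True if the RDCL_NO bit is set in the given MSR value."""
    return bool(arch_capabilities & ARCH_CAP_RDCL_NO)


def read_arch_capabilities(cpu: int = 0) -> int:
    """Read the MSR via /dev/cpu/<cpu>/msr (assumes root and the msr module)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The file offset selects the MSR; each MSR is read as 8 bytes.
        raw = os.pread(fd, 8, MSR_IA32_ARCH_CAPABILITIES)
    finally:
        os.close(fd)
    return struct.unpack("<Q", raw)[0]
```

On CPUs which do not implement the MSR the read fails with an error, so a caller would have to treat that case as "bit not set".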
  31
  32 Related CVEs
  33 ------------
  34
  35 The following CVE entries are related to the L1TF vulnerability:
  36
  37    =============  =================  ==============================
  38    CVE-2018-3615  L1 Terminal Fault  SGX related aspects
  39    CVE-2018-3620  L1 Terminal Fault  OS, SMM related aspects
  40    CVE-2018-3646  L1 Terminal Fault  Virtualization related aspects
  41    =============  =================  ==============================
  42
  43 Problem
  44 -------
  45
  46 If an instruction accesses a virtual address for which the relevant page
  47 table entry (PTE) has the Present bit cleared or other reserved bits set,
  48 then speculative execution ignores the invalid PTE and loads the referenced
  49 data if it is present in the Level 1 Data Cache, as if the page referenced
  50 by the address bits in the PTE was still present and accessible.
  51
  52 While this is a purely speculative mechanism and the instruction will raise
  53 a page fault when it is retired eventually, the pure act of loading the
  54 data and making it available to other speculative instructions opens up the
  55 opportunity for side channel attacks to unprivileged malicious code,
  56 similar to the Meltdown attack.
  57
  58 While Meltdown breaks the user space to kernel space protection, L1TF
  59 allows attacks on any physical memory address in the system, and the attack
  60 works across all protection domains. It allows an attack on SGX and also
  61 works from inside virtual machines because the speculation bypasses the
  62 extended page table (EPT) protection mechanism.
  63
  64
  65 Attack scenarios
  66 ----------------
  67
  68 1. Malicious user space
  69 ^^^^^^^^^^^^^^^^^^^^^^^
  70
  71    Operating Systems store arbitrary information in the address bits of a
  72    PTE which is marked non present. This allows a malicious user space
  73    application to attack the physical memory to which these PTEs resolve.
  74    In some cases user-space can maliciously influence the information
  75    encoded in the address bits of the PTE, thus making attacks more
  76    deterministic and more practical.
  77
  78    The Linux kernel contains a mitigation for this attack vector, PTE
  79    inversion, which is permanently enabled and has no performance
  80    impact. The kernel ensures that the address bits of PTEs, which are not
  81    marked present, never point to cacheable physical memory space.
  82
  83    A system with an up to date kernel is protected against attacks from
  84    malicious user space applications.
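
The idea behind PTE inversion can be sketched as follows. This is a simplified model and not the kernel's implementation: it assumes 4 KiB pages, 52 physical address bits and a Present bit at bit 0, and all function names are hypothetical.

```python
PTE_PRESENT = 1 << 0
# Assumption: 4 KiB pages and 52 physical address bits, so the
# address bits of a PTE occupy bits 12..51.
PFN_MASK = ((1 << 52) - 1) & ~((1 << 12) - 1)


def make_non_present(pte: int) -> int:
    """Clear Present and invert the address bits of the PTE."""
    return (pte & ~PTE_PRESENT) ^ PFN_MASK


def restore_present(pte: int) -> int:
    """Undo the inversion and mark the PTE present again."""
    return (pte ^ PFN_MASK) | PTE_PRESENT


def pte_phys_addr(pte: int) -> int:
    """Physical address that speculation would derive from the raw PTE."""
    return pte & PFN_MASK
```

For an address in the low part of the physical address space, the inverted address bits have the top bit set and therefore point far above the populated memory. This is why the inverted entry is not expected to resolve to cacheable memory on typical systems.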
  85
  86 2. Malicious guest in a virtual machine
  87 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  88
  89    The fact that L1TF breaks all domain protections allows malicious guest
  90    OSes, which can control the PTEs directly, and malicious guest user
  91    space applications, which run on an unprotected guest kernel lacking the
  92    PTE inversion mitigation for L1TF, to attack physical host memory.
  93
  94    A special aspect of L1TF in the context of virtualization is symmetric
  95    multi threading (SMT). The Intel implementation of SMT is called
  96    HyperThreading. The fact that Hyperthreads on the affected processors
  97    share the L1 Data Cache (L1D) is important for this. As the flaw only
  98    allows attacks on data which is present in L1D, a malicious guest running
  99    on one Hyperthread can attack the data which is brought into the L1D by
 100    the context which runs on the sibling Hyperthread of the same physical
 101    core. This context can be host OS, host user space or a different guest.
 102
 103    If the processor does not support Extended Page Tables, the attack is
 104    only possible when the hypervisor does not sanitize the content of the
 105    effective (shadow) page tables.
 106
 107    While solutions exist to mitigate these attack vectors fully, these
 108    mitigations are not enabled by default in the Linux kernel because they
 109    can affect performance significantly. The kernel provides several
 110    mechanisms which can be utilized to address the problem depending on the
 111    deployment scenario. The mitigations, their protection scope and impact
 112    are described in the next sections.
 113
 114    The default mitigations and the rationale for choosing them are explained
 115    at the end of this document. See :ref:`default_mitigations`.
 116
 117 .. _l1tf_sys_info:
 118
 119 L1TF system information
 120 -----------------------
 121
 122 The Linux kernel provides a sysfs interface to enumerate the current L1TF
 123 status of the system: whether the system is vulnerable, and which
 124 mitigations are active. The relevant sysfs file is:
 125
 126 /sys/devices/system/cpu/vulnerabilities/l1tf
 127
 128 The possible values in this file are:
 129
 130   ===========================   ===============================
 131   'Not affected'                The processor is not vulnerable
 132   'Mitigation: PTE Inversion'   The host protection is active
 133   ===========================   ===============================
 134
 135 If KVM/VMX is enabled and the processor is vulnerable then the following
 136 information is appended to the 'Mitigation: PTE Inversion' part:
 137
 138   - SMT status:
 139
 140     =====================  ================
 141     'VMX: SMT vulnerable'  SMT is enabled
 142     'VMX: SMT disabled'    SMT is disabled
 143     =====================  ================
 144
 145   - L1D Flush mode:
 146
 147     ================================  ====================================
 148     'L1D vulnerable'                  L1D flushing is disabled
 149
 150     'L1D conditional cache flushes'   L1D flush is conditionally enabled
 151
 152     'L1D cache flushes'               L1D flush is unconditionally enabled
 153     ================================  ====================================
 154
 155 The resulting grade of protection is discussed in the following sections.
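
A monitoring script could classify the content of the sysfs file roughly as shown below. This is only a sketch: it matches the substrings listed in the tables above, and the exact separators and ordering of the appended parts are an assumption, so no full-line format is relied upon. The function names are hypothetical.

```python
from pathlib import Path

L1TF_PATH = Path("/sys/devices/system/cpu/vulnerabilities/l1tf")


def parse_l1tf_status(text: str) -> dict:
    """Classify the status line by looking for the documented substrings."""
    text = text.strip()
    status = {
        "affected": text != "Not affected",
        "pte_inversion": "PTE Inversion" in text,
        "smt": None,        # None: no SMT information was appended
        "l1d_flush": None,  # None: no L1D flush information was appended
    }
    if "SMT vulnerable" in text:
        status["smt"] = "enabled"
    elif "SMT disabled" in text:
        status["smt"] = "disabled"
    # Check 'conditional' first: it also contains 'cache flushes'.
    if "conditional cache flushes" in text:
        status["l1d_flush"] = "cond"
    elif "cache flushes" in text:
        status["l1d_flush"] = "always"
    elif "L1D vulnerable" in text:
        status["l1d_flush"] = "never"
    return status


def read_l1tf_status(path: Path = L1TF_PATH) -> dict:
    """Read and classify the sysfs file of the running system."""
    return parse_l1tf_status(path.read_text())
```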
 156
 157
 158 Host mitigation mechanism
 159 -------------------------
 160
 161 The kernel is unconditionally protected against L1TF attacks from malicious
 162 user space running on the host.
 163
 164
 165 Guest mitigation mechanisms
 166 ---------------------------
 167
 168 .. _l1d_flush:
 169
 170 1. L1D flush on VMENTER
 171 ^^^^^^^^^^^^^^^^^^^^^^^
 172
 173    To make sure that a guest cannot attack data which is present in the L1D
 174    the hypervisor flushes the L1D before entering the guest.
 175
 176    Flushing the L1D evicts not only the data which should not be accessed
 177    by a potentially malicious guest, it also flushes the guest
 178    data. Flushing the L1D has a performance impact as the processor has to
 179    bring the flushed guest data back into the L1D. Depending on the
 180    frequency of VMEXIT/VMENTER and the type of computations in the guest
 181    performance degradation in the range of 1% to 50% has been observed. For
 182    scenarios where guest VMEXIT/VMENTER are rare the performance impact is
 183    minimal. Virtio and mechanisms like posted interrupts are designed to
 184    confine the VMEXITs to a bare minimum, but specific configurations and
 185    application scenarios might still suffer from a high VMEXIT rate.
 186
 187    The kernel provides two L1D flush modes:
 188     - conditional ('cond')
 189     - unconditional ('always')
 190
 191    The conditional mode avoids L1D flushing after VMEXITs which execute
 192    only audited code paths before the corresponding VMENTER. These code
 193    paths have been verified not to expose secrets or other
 194    interesting data to an attacker, but they can leak information about the
 195    address space layout of the hypervisor.
 196
 197    Unconditional mode flushes L1D on all VMENTER invocations and provides
 198    maximum protection. It has a higher overhead than the conditional
 199    mode. The overhead cannot be quantified correctly as it depends on the
 200    workload scenario and the resulting number of VMEXITs.
 201
 202    The general recommendation is to enable L1D flush on VMENTER. The kernel
 203    defaults to conditional mode on affected processors.
 204
 205    **Note** that L1D flush does not prevent the SMT problem because the
 206    sibling thread will also bring back its data into the L1D which makes it
 207    attackable again.
 208
 209    L1D flush can be controlled by the administrator via the kernel command
 210    line and sysfs control files. See :ref:`mitigation_control_command_line`
 211    and :ref:`mitigation_control_kvm`.
 212
 213 .. _guest_confinement:
 214
 215 2. Guest VCPU confinement to dedicated physical cores
 216 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 217
 218    To address the SMT problem, it is possible to make a guest or a group of
 219    guests affine to one or more physical cores. The proper mechanism for
 220    that is to utilize exclusive cpusets to ensure that no other guest or
 221    host tasks can run on these cores.
 222
 223    If only a single guest or related guests run on sibling SMT threads on
 224    the same physical core then they can only attack their own memory and
 225    restricted parts of the host memory.
 226
 227    Host memory is attackable when one of the sibling SMT threads runs in
 228    host OS (hypervisor) context and the other in guest context. The amount
 229    of valuable information from the host OS context depends on the context
 230    which the host OS executes, i.e. interrupts, soft interrupts and kernel
 231    threads. The amount of valuable data from these contexts cannot be
 232    declared as non-interesting for an attacker without deep inspection of
 233    the code.
 234
 235    **Note** that assigning guests to a fixed set of physical cores affects
 236    the ability of the scheduler to do load balancing and might have
 237    negative effects on CPU utilization depending on the hosting
 238    scenario. Disabling SMT might be a viable alternative for particular
 239    scenarios.
 240
 241    For further information about confining guests to a single or to a group
 242    of cores consult the cpusets documentation:
 243
 244    https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt
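
Before creating exclusive cpusets it helps to know which logical CPUs share a physical core, so that a guest is always given whole cores. The sketch below reads the ``topology/thread_siblings_list`` files in sysfs; the helper names are hypothetical and the cpuset configuration itself is not shown.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list:
    """Expand a kernel CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus


def core_siblings(cpu_root: Path = Path("/sys/devices/system/cpu")) -> set:
    """Return one tuple of logical CPU numbers per physical core."""
    cores = set()
    for f in cpu_root.glob("cpu[0-9]*/topology/thread_siblings_list"):
        cores.add(tuple(parse_cpu_list(f.read_text())))
    return cores
```

Each returned tuple is a candidate unit for guest confinement: assigning only part of a tuple to a guest leaves a sibling thread available to other contexts.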
 245
 246 .. _interrupt_isolation:
 247
 248 3. Interrupt affinity
 249 ^^^^^^^^^^^^^^^^^^^^^
 250
 251    Interrupts can be made affine to logical CPUs. This is not universally
 252    true because there are types of interrupts which are truly per CPU
 253    interrupts, e.g. the local timer interrupt. Aside from that, multi-queue
 254    devices affine their interrupts to single CPUs or groups of CPUs per
 255    queue without allowing the administrator to control the affinities.
 256
 257    Moving the interrupts, which can be affinity controlled, away from CPUs
 258    which run untrusted guests, reduces the attack vector space.
 259
 260    Whether the interrupts which are affine to CPUs that run untrusted
 261    guests provide interesting data for an attacker depends on the system
 262    configuration and the scenarios which run on the system. While for some
 263    of the interrupts it can be assumed that they won't expose interesting
 264    information beyond exposing hints about the host OS memory layout, there
 265    is no way to make general assumptions.
 266
 267    Interrupt affinity can be controlled by the administrator via the
 268    /proc/irq/$NR/smp_affinity[_list] files. Limited documentation is
 269    available at:
 270
 271    https://www.kernel.org/doc/Documentation/IRQ-affinity.txt
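
As an illustration, the bitmask expected by ``smp_affinity`` can be derived from a list of logical CPUs as sketched below. The helper names and the ``proc_root`` parameter are hypothetical; writing the file needs root and can fail for interrupts whose affinity cannot be changed.

```python
def cpus_to_affinity_mask(cpus) -> str:
    """Format logical CPU numbers as a /proc/irq/<N>/smp_affinity bitmask."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    # The mask is written as comma separated 32-bit hexadecimal groups,
    # most significant group first.
    groups = []
    while True:
        groups.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if not mask:
            break
    head, *rest = reversed(groups)
    return ",".join([f"{head:x}"] + [f"{g:08x}" for g in rest])


def set_irq_affinity(irq: int, cpus, proc_root: str = "/proc") -> None:
    """Write the mask for one IRQ (assumes root; errors propagate)."""
    with open(f"{proc_root}/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpus_to_affinity_mask(cpus))
```

The ``smp_affinity_list`` variant of the file accepts a CPU list such as ``0-3`` directly, which avoids the mask conversion.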
 272
 273 .. _smt_control:
 274
 275 4. SMT control
 276 ^^^^^^^^^^^^^^
 277
 278    To prevent the SMT issues of L1TF it might be necessary to disable SMT
 279    completely. Disabling SMT can have a significant performance impact, but
 280    the impact depends on the hosting scenario and the type of workloads.
 281    The impact of disabling SMT also needs to be weighed against the impact
 282    of other mitigation solutions like confining guests to dedicated cores.
 283
 284    The kernel provides a sysfs interface to retrieve the status of SMT and
 285    to control it. It also provides a kernel command line interface to
 286    control SMT.
 287
 288    The kernel command line interface consists of the following options:
 289
 290      =========== ==========================================================
 291      nosmt       Affects the bring up of the secondary CPUs during boot. The
 292                  kernel tries to bring all present CPUs online during the
 293                  boot process. "nosmt" makes sure that from each physical
 294                  core only one, the so-called primary (hyper) thread, is
 295                  activated. Due to a design flaw of Intel processors related
 296                  to Machine Check Exceptions the non primary siblings have
 297                  to be brought up at least partially and are then shut down
 298                  again.  "nosmt" can be undone via the sysfs interface.
 299
 300      nosmt=force Has the same effect as "nosmt" but does not allow the SMT
 301                  disable to be undone via the sysfs interface.
 302      =========== ==========================================================
 303
 304    The sysfs interface provides two files:
 305
 306    - /sys/devices/system/cpu/smt/control
 307    - /sys/devices/system/cpu/smt/active
 308
 309    /sys/devices/system/cpu/smt/control:
 310
 311      This file exposes the SMT control state and provides the
 312      ability to disable or (re)enable SMT. The possible states are:
 313
 314         ==============  ===================================================
 315         on              SMT is supported by the CPU and enabled. All
 316                         logical CPUs can be onlined and offlined without
 317                         restrictions.
 318
 319         off             SMT is supported by the CPU and disabled. Only
 320                         the so called primary SMT threads can be onlined
 321                         and offlined without restrictions. An attempt to
 322                         online a non-primary sibling is rejected.
 323
 324         forceoff        Same as 'off' but the state cannot be controlled.
 325                         Attempts to write to the control file are rejected.
 326
 327         notsupported    The processor does not support SMT. It's therefore
 328                         not affected by the SMT implications of L1TF.
 329                         Attempts to write to the control file are rejected.
 330         ==============  ===================================================
 331
 332      The possible states which can be written into this file to control SMT
 333      state are:
 334
 335      - on
 336      - off
 337      - forceoff
 338
 339    /sys/devices/system/cpu/smt/active:
 340
 341      This file reports whether SMT is enabled and active, i.e. if on any
 342      physical core two or more sibling threads are online.
 343
 344    SMT control is also possible at boot time via the l1tf kernel command
 345    line parameter in combination with L1D flush control. See
 346    :ref:`mitigation_control_command_line`.
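
The sysfs files described above could be used from a script roughly as follows. This is a sketch with hypothetical helper names. It assumes that the ``active`` file contains ``1`` or ``0``, and it checks the documented rejection cases up front instead of relying on the error returned by the kernel.

```python
from pathlib import Path

SMT_DIR = Path("/sys/devices/system/cpu/smt")
WRITABLE_STATES = ("on", "off", "forceoff")
# States in which writes to the control file are rejected.
LOCKED_STATES = ("forceoff", "notsupported")


def smt_state(smt_dir: Path = SMT_DIR) -> tuple:
    """Return (control state, active flag) as read from sysfs."""
    control = (smt_dir / "control").read_text().strip()
    active = (smt_dir / "active").read_text().strip() == "1"
    return control, active


def set_smt(new_state: str, smt_dir: Path = SMT_DIR) -> None:
    """Request a new SMT control state (needs root on a real system)."""
    if new_state not in WRITABLE_STATES:
        raise ValueError(f"invalid SMT state: {new_state!r}")
    current, _ = smt_state(smt_dir)
    if current in LOCKED_STATES:
        raise PermissionError(f"SMT control is locked in state {current!r}")
    (smt_dir / "control").write_text(new_state)
```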
 347
 348 5. Disabling EPT
 349 ^^^^^^^^^^^^^^^^
 350
 351   Disabling EPT for virtual machines provides full mitigation for L1TF even
 352   with SMT enabled, because the effective page tables for guests are
 353   managed and sanitized by the hypervisor. However, disabling EPT has a
 354   significant performance impact, especially when the Meltdown mitigation
 355   KPTI is enabled.
 356
 357   EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter.
 358
 359 There is ongoing research and development for new mitigation mechanisms to
 360 address the performance impact of disabling SMT or EPT.
 361
 362 .. _mitigation_control_command_line:
 363
 364 Mitigation control on the kernel command line
 365 ---------------------------------------------
 366
 367 The L1TF mitigations can be controlled at boot time on the kernel command
 368 line with the option "l1tf=". The valid arguments for this option are:
 369
 370   ============  =============================================================
 371   full          Provides all available mitigations for the L1TF
 372                 vulnerability. Disables SMT and enables all mitigations in
 373                 the hypervisors, i.e. unconditional L1D flushing.
 374
 375                 SMT control and L1D flush control via the sysfs interface
 376                 is still possible after boot.  Hypervisors will issue a
 377                 warning when the first VM is started in a potentially
 378                 insecure configuration, i.e. SMT enabled or L1D flush
 379                 disabled.
 380
 381   full,force    Same as 'full', but disables SMT and L1D flush runtime
 382                 control. Implies the 'nosmt=force' command line option.
 383                 (i.e. sysfs control of SMT is disabled.)
 384
 385   flush         Leaves SMT enabled and enables the default hypervisor
 386                 mitigation, i.e. conditional L1D flushing.
 387
 388                 SMT control and L1D flush control via the sysfs interface
 389                 is still possible after boot.  Hypervisors will issue a
 390                 warning when the first VM is started in a potentially
 391                 insecure configuration, i.e. SMT enabled or L1D flush
 392                 disabled.
 393
 394   flush,nosmt   Disables SMT and enables the default hypervisor mitigation,
 395                 i.e. conditional L1D flushing.
 396
 397                 SMT control and L1D flush control via the sysfs interface
 398                 is still possible after boot.  Hypervisors will issue a
 399                 warning when the first VM is started in a potentially
 400                 insecure configuration, i.e. SMT enabled or L1D flush
 401                 disabled.
 402
 403   flush,nowarn  Same as 'flush', but hypervisors will not warn when a VM is
 404                 started in a potentially insecure configuration.
 405
 406   off           Disables hypervisor mitigations and doesn't emit any
 407                 warnings.
 408   ============  =============================================================
 409
 410 The default is 'flush'. For details about L1D flushing see :ref:`l1d_flush`.
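
For tooling which audits boot configurations, the table above can be transcribed into a lookup structure. The sketch below is an illustration only: the tuple layout and the function name are hypothetical, ``None`` marks properties which the table does not state, and treating the last ``l1tf=`` occurrence as effective is an assumption of this sketch.

```python
# One entry per "l1tf=" argument, transcribed from the table above:
# (disables SMT, L1D flush mode, warns on insecure VM start,
#  runtime control via sysfs)
L1TF_OPTIONS = {
    "full":         (True,  "always", True,  True),
    "full,force":   (True,  "always", True,  False),
    "flush":        (False, "cond",   True,  True),
    "flush,nosmt":  (True,  "cond",   True,  True),
    "flush,nowarn": (False, "cond",   False, True),
    "off":          (False, "never",  False, None),
}
DEFAULT_L1TF = "flush"


def l1tf_option(cmdline: str) -> str:
    """Return the effective l1tf= argument of a kernel command line."""
    option = DEFAULT_L1TF
    for word in cmdline.split():
        if word.startswith("l1tf="):
            option = word[len("l1tf="):]  # assumption: last occurrence wins
    if option not in L1TF_OPTIONS:
        raise ValueError(f"unknown l1tf argument: {option!r}")
    return option
```

Passing the content of ``/proc/cmdline`` to ``l1tf_option()`` yields the argument which was requested at boot; the state actually in effect is still best read from the sysfs vulnerability file.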
 411
 412
 413 .. _mitigation_control_kvm:
 414
 415 Mitigation control for KVM - module parameter
 416 ---------------------------------------------
 417
 418 The KVM hypervisor mitigation mechanism, flushing the L1D cache when
 419 entering a guest, can be controlled with a module parameter.
 420
 421 The option/parameter is "kvm-intel.vmentry_l1d_flush=". It takes the
 422 following arguments:
 423
 424   ============  ==============================================================
 425   always        L1D cache flush on every VMENTER.
 426
 427   cond          Flush L1D on VMENTER only when the code between VMEXIT and
 428                 VMENTER can leak host memory which is considered
 429                 interesting for an attacker. This can still leak host
 430                 memory which reveals e.g. the host's address space layout.
 431
 432   never         Disables the mitigation.
 433   ============  ==============================================================
 434
 435 The parameter can be provided on the kernel command line, as a module
 436 parameter when loading the modules, and modified at runtime via the sysfs
 437 file:
 438
 439 /sys/module/kvm_intel/parameters/vmentry_l1d_flush
 440
 441 The default is 'cond'. If 'l1tf=full,force' is given on the kernel command
 442 line, then 'always' is enforced and the kvm-intel.vmentry_l1d_flush
 443 module parameter is ignored and writes to the sysfs file are rejected.
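
Runtime control through the sysfs file can be sketched as follows. The helper names are hypothetical, writing needs root, and the sketch only validates the argument; a write rejected by the kernel, for example under 'l1tf=full,force', surfaces as an ``OSError``.

```python
from pathlib import Path

PARAM = Path("/sys/module/kvm_intel/parameters/vmentry_l1d_flush")
FLUSH_MODES = ("always", "cond", "never")


def get_l1d_flush(param: Path = PARAM) -> str:
    """Return the current vmentry_l1d_flush mode."""
    return param.read_text().strip()


def set_l1d_flush(mode: str, param: Path = PARAM) -> None:
    """Select a flush mode at runtime (needs root on a real system)."""
    if mode not in FLUSH_MODES:
        raise ValueError(f"invalid L1D flush mode: {mode!r}")
    # A rejected write raises OSError, which propagates to the caller.
    param.write_text(mode)
```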
 444
 445
 446 Mitigation selection guide
 447 --------------------------
 448
 449 1. No virtualization in use
 450 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 451
 452    The system is protected by the kernel unconditionally and no further
 453    action is required.
 454
 455 2. Virtualization with trusted guests
 456 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 457
 458    If the guest comes from a trusted source and the guest OS kernel is
 459    guaranteed to have the L1TF mitigations in place the system is fully
 460    protected against L1TF and no further action is required.
 461
 462    To avoid the overhead of the default L1D flushing on VMENTER the
 463    administrator can disable the flushing via the kernel command line and
 464    sysfs control files. See :ref:`mitigation_control_command_line` and
 465    :ref:`mitigation_control_kvm`.
 466
 467
 468 3. Virtualization with untrusted guests
 469 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 470
 471 3.1. SMT not supported or disabled
 472 """"""""""""""""""""""""""""""""""
 473
 474   If SMT is not supported by the processor or disabled in the BIOS or by
 475   the kernel, it's only required to enforce L1D flushing on VMENTER.
 476
 477   Conditional L1D flushing is the default behaviour and can be tuned. See
 478   :ref:`mitigation_control_command_line` and :ref:`mitigation_control_kvm`.
 479
 480 3.2. EPT not supported or disabled
 481 """"""""""""""""""""""""""""""""""
 482
 483   If EPT is not supported by the processor or disabled in the hypervisor,
 484   the system is fully protected. SMT can stay enabled and L1D flushing on
 485   VMENTER is not required.
 486
 487   EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter.
 488
 489 3.3. SMT and EPT supported and active
 490 """""""""""""""""""""""""""""""""""""
 491
 492   If SMT and EPT are supported and active then various degrees of
 493   mitigations can be employed:
 494
 495   - L1D flushing on VMENTER:
 496
 497     L1D flushing on VMENTER is the minimal protection requirement, but it
 498     is only potent in combination with other mitigation methods.
 499
 500     Conditional L1D flushing is the default behaviour and can be tuned. See
 501     :ref:`mitigation_control_command_line` and :ref:`mitigation_control_kvm`.
 502
 503   - Guest confinement:
 504
 505     Confinement of guests to a single or a group of physical cores which
 506     are not running any other processes, can reduce the attack surface
 507     significantly, but interrupts, soft interrupts and kernel threads can
 508     still expose valuable data to a potential attacker. See
 509     :ref:`guest_confinement`.
 510
 511   - Interrupt isolation:
 512
 513     Isolating the guest CPUs from interrupts can reduce the attack surface
 514     further, but still allows a malicious guest to explore a limited amount
 515     of host physical memory. This can at least be used to gain knowledge
 516     about the host address space layout. The interrupts which have a fixed
 517     affinity to the CPUs which run the untrusted guests can, depending on
 518     the scenario, still trigger soft interrupts and schedule kernel threads
 519     which might expose valuable information. See
 520     :ref:`interrupt_isolation`.
 521
 522 The above three mitigation methods combined can provide protection to a
 523 certain degree, but the risk of the remaining attack surface has to be
 524 carefully analyzed. For full protection the following methods are
 525 available:
 526
 527   - Disabling SMT:
 528
 529     Disabling SMT and enforcing the L1D flushing provides the maximum
 530     amount of protection. This mitigation does not depend on any of the
 531     above mitigation methods.
 532
 533     SMT control and L1D flushing can be tuned by the command line
 534     parameters 'nosmt', 'l1tf', 'kvm-intel.vmentry_l1d_flush' and at run
 535     time with the matching sysfs control files. See :ref:`smt_control`,
 536     :ref:`mitigation_control_command_line` and
 537     :ref:`mitigation_control_kvm`.
 538
 539   - Disabling EPT:
 540
 541     Disabling EPT provides the maximum amount of protection as well. It does
 542     not depend on any of the above mitigation methods. SMT can stay
 543     enabled and L1D flushing is not required, but the performance impact is
 544     significant.
 545
 546     EPT can be disabled in the hypervisor via the 'kvm-intel.ept'
 547     parameter.
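
The decision flow of sections 1 to 3.3 can be condensed into a small function. This is a hedged summary and not a substitute for the risk analysis described above: the function name and its parameters are hypothetical, and ``guests_trusted`` stands for guests from a trusted source whose kernel carries the L1TF mitigations.

```python
def recommend(virtualization: bool, guests_trusted: bool = True,
              smt_active: bool = False, ept_enabled: bool = True) -> str:
    """Map a deployment scenario to the guidance of sections 1 to 3.3."""
    if not virtualization:
        return "no action required: PTE inversion protects the host"
    if guests_trusted:
        return "no action required: L1D flushing may be disabled"
    if not ept_enabled:
        return "fully protected: SMT may stay enabled, no L1D flush needed"
    if not smt_active:
        return "enforce L1D flushing on VMENTER"
    return ("disable SMT and enforce L1D flushing, or disable EPT; "
            "otherwise combine L1D flushing, guest confinement and "
            "interrupt isolation and analyze the remaining risk")
```

The EPT check comes before the SMT check because a system without EPT is described as fully protected regardless of the SMT state.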
 548
 549 3.4. Nested virtual machines
 550 """"""""""""""""""""""""""""
 551
 552 When nested virtualization is in use, three operating systems are involved:
 553 the bare metal hypervisor, the nested hypervisor and the nested virtual
 554 machine.  VMENTER operations from the nested hypervisor into the nested
 555 guest will always be processed by the bare metal hypervisor. If KVM is the
 556 bare metal hypervisor it will:
 557
 558  - Flush the L1D cache on every switch from the nested hypervisor to the
 559    nested virtual machine, so that the nested hypervisor's secrets are not
 560    exposed to the nested virtual machine;
 561
 562  - Flush the L1D cache on every switch from the nested virtual machine to
 563    the nested hypervisor; this is a complex operation, and flushing the L1D
 564    cache prevents the bare metal hypervisor's secrets from being exposed to
 565    the nested virtual machine;
 566
 567  - Instruct the nested hypervisor to not perform any L1D cache flush. This
 568    is an optimization to avoid double L1D flushing.
 569
 570
 571 .. _default_mitigations:
 572
 573 Default mitigations
 574 -------------------
 575
 576   The kernel default mitigations for vulnerable processors are:
 577
 578   - PTE inversion to protect against malicious user space. This is done
 579     unconditionally and cannot be controlled.
 580
 581   - L1D conditional flushing on VMENTER when EPT is enabled for
 582     a guest.
 583
 584   The kernel does not by default enforce the disabling of SMT, which leaves
 585   SMT systems vulnerable when running untrusted guests with EPT enabled.
 586
 587   The rationale for this choice is:
 588
 589   - Force disabling SMT can break existing setups, especially with
 590     unattended updates.
 591
 592   - If regular users run untrusted guests on their machine, then L1TF is
 593     just an add on to other malware which might be embedded in an untrusted
 594     guest, e.g. spam-bots or attacks on the local network.
 595
 596     There is no technical way to prevent a user from running untrusted code
 597     on their machines blindly.
 598
 599   - It's technically extremely unlikely and from today's knowledge even
 600     impossible that L1TF can be exploited via the most popular attack
 601     mechanisms like JavaScript because these mechanisms have no way to
 602     control PTEs. If this were possible and no other mitigation were
 603     available, then the default might be different.
 604
 605   - The administrators of cloud and hosting setups have to carefully
 606     analyze the risk for their scenarios and make the appropriate
 607     mitigation choices, which might even vary across their deployed
 608     machines and also result in other changes of their overall setup.
 609     There is no way for the kernel to provide a sensible default for these
 610     kinds of scenarios.