1 What: /sys/bus/cxl/flush
4 Contact: linux-cxl@vger.kernel.org
6 (WO) If userspace manually unbinds a port the kernel schedules
7 all descendant memdevs for unbind. Writing '1' to this attribute
11 What: /sys/bus/cxl/devices/memX/firmware_version
14 Contact: linux-cxl@vger.kernel.org
16 (RO) "FW Revision" string as reported by the Identify
17 Memory Device Output Payload in the CXL-2.0
21 What: /sys/bus/cxl/devices/memX/ram/size
24 Contact: linux-cxl@vger.kernel.org
26 (RO) "Volatile Only Capacity" as bytes. Represents the
27 identically named field in the Identify Memory Device Output
28 Payload in the CXL-2.0 specification.
31 What: /sys/bus/cxl/devices/memX/ram/qos_class
34 Contact: linux-cxl@vger.kernel.org
36 (RO) For CXL host platforms that support "QoS Telemetry"
37 this attribute conveys a comma delimited list of platform
38 specific cookies that identify a QoS performance class
39 for the volatile partition of the CXL mem device. These
40 class-ids can be compared against a similar "qos_class"
41 published for a root decoder. While it is not required
42 that the endpoints map their local memory-class to a
43 matching platform class, mismatches are not recommended
44 and there are platform specific performance related
45 side-effects that may result. First class-id is displayed.
48 What: /sys/bus/cxl/devices/memX/pmem/size
51 Contact: linux-cxl@vger.kernel.org
53 (RO) "Persistent Only Capacity" as bytes. Represents the
54 identically named field in the Identify Memory Device Output
55 Payload in the CXL-2.0 specification.
58 What: /sys/bus/cxl/devices/memX/pmem/qos_class
61 Contact: linux-cxl@vger.kernel.org
63 (RO) For CXL host platforms that support "QoS Telemetry"
64 this attribute conveys a comma delimited list of platform
65 specific cookies that identify a QoS performance class
66 for the persistent partition of the CXL mem device. These
67 class-ids can be compared against a similar "qos_class"
68 published for a root decoder. While it is not required
69 that the endpoints map their local memory-class to a
70 matching platform class, mismatches are not recommended
71 and there are platform specific performance related
72 side-effects that may result. First class-id is displayed.
75 What: /sys/bus/cxl/devices/memX/serial
78 Contact: linux-cxl@vger.kernel.org
80 (RO) 64-bit serial number per the PCIe Device Serial Number
81 capability. Mandatory for CXL devices, see CXL 2.0 8.1.12.2
82 Memory Device PCIe Capabilities and Extended Capabilities.
85 What: /sys/bus/cxl/devices/memX/numa_node
88 Contact: linux-cxl@vger.kernel.org
90 (RO) If NUMA is enabled and the platform has affinitized the
91 host PCI device for this memory device, emit the CPU node
92 affinity for this device.
95 What: /sys/bus/cxl/devices/memX/security/state
98 Contact: linux-cxl@vger.kernel.org
100 (RO) Reading this file will display the CXL security state for
101 that device. Possible states are 'disabled'; 'sanitize', while
102 a sanitization is underway; and those available only
103 for persistent memory: 'locked', 'unlocked' or 'frozen'. This
104 sysfs entry is select/poll capable from userspace to notify
105 upon completion of a sanitize operation.
108 What: /sys/bus/cxl/devices/memX/security/sanitize
111 Contact: linux-cxl@vger.kernel.org
113 (WO) Write a boolean 'true' string value to this attribute to
114 sanitize the device to securely re-purpose or decommission it.
115 This is done by ensuring that all user data and meta-data,
116 whether it resides in persistent capacity, volatile capacity,
117 or the LSA, is made permanently unavailable by whatever means
118 is appropriate for the media type. This functionality requires
119 the device to be disabled, that is, not actively decoding any
120 HPA ranges. This avoids explicit global CPU cache
121 management, relying instead on it being done when a region
122 transitions between software programmed and hardware committed
123 states. If this file is not present, then there is no hardware
124 support for the operation.
127 What: /sys/bus/cxl/devices/memX/security/erase
130 Contact: linux-cxl@vger.kernel.org
132 (WO) Write a boolean 'true' string value to this attribute to
133 secure erase user data by changing the media encryption keys for
134 all user data areas of the device. This functionality requires
135 the device to be disabled, that is, not actively decoding any
136 HPA ranges. This avoids explicit global CPU cache
137 management, relying instead on it being done when a region
138 transitions between software programmed and hardware committed
139 states. If this file is not present, then there is no hardware
140 support for the operation.
143 What: /sys/bus/cxl/devices/memX/firmware/
146 Contact: linux-cxl@vger.kernel.org
148 (RW) Firmware uploader mechanism. The different files under
149 this directory can be used to upload and activate new
150 firmware for CXL devices. The interfaces under this are
151 documented in sysfs-class-firmware.
154 What: /sys/bus/cxl/devices/*/devtype
157 Contact: linux-cxl@vger.kernel.org
159 (RO) CXL device objects export the devtype attribute which
160 mirrors the same value communicated in the DEVTYPE environment
161 variable for uevents for devices on the "cxl" bus.
164 What: /sys/bus/cxl/devices/*/modalias
167 Contact: linux-cxl@vger.kernel.org
169 (RO) CXL device objects export the modalias attribute which
170 mirrors the same value communicated in the MODALIAS environment
171 variable for uevents for devices on the "cxl" bus.
174 What: /sys/bus/cxl/devices/portX/uport
177 Contact: linux-cxl@vger.kernel.org
179 (RO) CXL port objects are enumerated from either a platform
180 firmware device (ACPI0017 and ACPI0016) or PCIe switch upstream
181 port with CXL component registers. The 'uport' symlink connects
182 the CXL portX object to the device that published the CXL port
186 What: /sys/bus/cxl/devices/{port,endpoint}X/parent_dport
189 Contact: linux-cxl@vger.kernel.org
191 (RO) CXL port objects are instantiated for each upstream port in
192 a CXL/PCIe switch, and for each endpoint to map the
193 corresponding memory device into the CXL port hierarchy. When a
194 descendant CXL port (switch or endpoint) is enumerated it is
195 useful to know which 'dport' object in the parent CXL port
196 routes to this descendant. The 'parent_dport' symlink points to
197 the device representing the downstream port of a CXL switch that
198 routes to {port,endpoint}X.
201 What: /sys/bus/cxl/devices/portX/dportY
204 Contact: linux-cxl@vger.kernel.org
206 (RO) CXL port objects are enumerated from either a platform
207 firmware device (ACPI0017 and ACPI0016) or PCIe switch upstream
208 port with CXL component registers. The 'dportY' symlink
209 identifies one or more downstream ports that the upstream port
210 may target in its decode of CXL memory resources. The 'Y'
211 integer reflects the hardware port unique-id used in the
212 hardware decoder target list.
215 What: /sys/bus/cxl/devices/portX/decoders_committed
218 Contact: linux-cxl@vger.kernel.org
220 (RO) A memory device is considered active when any of its
221 decoders are in the "committed" state (See CXL 3.0 8.2.4.19.7
222 CXL HDM Decoder n Control Register). Hotplug and destructive
223 operations like "sanitize" are blocked while device is actively
224 decoding a Host Physical Address range. Note that this number
225 may be elevated without any regionX objects active or even
226 enumerated, as this may be due to decoders established by
227 platform firmware or a previous kernel (kexec).
230 What: /sys/bus/cxl/devices/decoderX.Y
233 Contact: linux-cxl@vger.kernel.org
235 (RO) CXL decoder objects are enumerated from either a platform
236 firmware description, or a CXL HDM decoder register set in a
237 PCIe device (see CXL 2.0 section 8.2.5.12 CXL HDM Decoder
238 Capability Structure). The 'X' in decoderX.Y represents the
239 cxl_port container of this decoder, and 'Y' represents the
240 instance id of a given decoder resource.
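
The naming rule above can be sketched as a small parser. This is an illustrative userspace helper, not part of the ABI; the function name `parse_decoder_name` is hypothetical.

```python
import re

# 'decoderX.Y': X is the id of the containing cxl_port, Y the instance id.
_DECODER_RE = re.compile(r"^decoder(\d+)\.(\d+)$")


def parse_decoder_name(name: str) -> tuple[int, int]:
    """Split a 'decoderX.Y' object name into (port_id, instance_id)."""
    match = _DECODER_RE.match(name)
    if match is None:
        raise ValueError(f"not a decoder object name: {name!r}")
    return int(match.group(1)), int(match.group(2))
```

For example, `parse_decoder_name("decoder3.1")` yields port id 3 and instance id 1.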
243 What: /sys/bus/cxl/devices/decoderX.Y/{start,size}
246 Contact: linux-cxl@vger.kernel.org
248 (RO) The 'start' and 'size' attributes together convey the
249 physical address base and number of bytes mapped in the
250 decoder's decode window. For decoders of devtype
251 "cxl_decoder_root" the address range is fixed. For decoders of
252 devtype "cxl_decoder_switch" the address is bounded by the
253 decode range of the cxl_port ancestor of the decoder's cxl_port,
254 and dynamically updates based on the active memory regions in
258 What: /sys/bus/cxl/devices/decoderX.Y/locked
261 Contact: linux-cxl@vger.kernel.org
263 (RO) CXL HDM decoders have the capability to lock the
264 configuration until the next device reset. For decoders of
265 devtype "cxl_decoder_root" there is no standard facility to
266 unlock them. For decoders of devtype "cxl_decoder_switch" a
267 secondary bus reset, of the PCIe bridge that provides the bus
268 for this decoder's uport, unlocks / resets the decoder.
271 What: /sys/bus/cxl/devices/decoderX.Y/target_list
274 Contact: linux-cxl@vger.kernel.org
276 (RO) Display a comma separated list of the current decoder
277 target configuration. The list is ordered by the current
278 configured interleave order of the decoder's dport instances.
279 Each entry in the list is a dport id.
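
A minimal sketch of consuming this attribute from userspace, assuming the dport ids are printed as decimal integers and that an empty read means no targets are configured (both are assumptions, as is the helper name `parse_target_list`):

```python
def parse_target_list(text: str) -> list[int]:
    """Parse the comma separated target_list into dport ids.

    The index of each id in the returned list is its position in the
    decoder's interleave order.
    """
    text = text.strip()
    if not text:
        return []
    return [int(token) for token in text.split(",")]
```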
282 What: /sys/bus/cxl/devices/decoderX.Y/cap_{pmem,ram,type2,type3}
285 Contact: linux-cxl@vger.kernel.org
287 (RO) When a CXL decoder is of devtype "cxl_decoder_root", it
288 represents a fixed memory window identified by platform
289 firmware. A fixed window may only support a subset of memory
290 types. The 'cap_*' attributes indicate whether persistent
291 memory, volatile memory, accelerator memory, and / or expander
292 memory may be mapped behind this decoder's memory window.
295 What: /sys/bus/cxl/devices/decoderX.Y/target_type
298 Contact: linux-cxl@vger.kernel.org
300 (RO) When a CXL decoder is of devtype "cxl_decoder_switch", it
301 can optionally decode either accelerator memory (type-2) or
302 expander memory (type-3). The 'target_type' attribute indicates
303 the current setting which may dynamically change based on what
304 memory regions are activated in this decode hierarchy.
307 What: /sys/bus/cxl/devices/endpointX/CDAT
310 Contact: linux-cxl@vger.kernel.org
312 (RO) If this sysfs entry is not present, no DOE mailbox was
313 found to support CDAT data. If it is present and the length of
314 the data is 0, reading the CDAT data failed. Otherwise the CDAT
318 What: /sys/bus/cxl/devices/decoderX.Y/mode
321 Contact: linux-cxl@vger.kernel.org
323 (RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
324 translates from a host physical address range, to a device local
325 address range. Device-local address ranges are further split
326 into a 'ram' (volatile memory) range and 'pmem' (persistent
327 memory) range. The 'mode' attribute emits one of 'ram', 'pmem',
328 'mixed', or 'none'. The 'mixed' indication is for error cases
329 when a decoder straddles the volatile/persistent partition
330 boundary, and 'none' indicates the decoder is not actively
331 decoding, or no DPA allocation policy has been set.
333 'mode' can be written, when the decoder is in the 'disabled'
334 state, with either 'ram' or 'pmem' to set the boundaries for the
338 What: /sys/bus/cxl/devices/decoderX.Y/dpa_resource
341 Contact: linux-cxl@vger.kernel.org
343 (RO) When a CXL decoder is of devtype "cxl_decoder_endpoint",
344 and its 'dpa_size' attribute is non-zero, this attribute
345 indicates the device physical address (DPA) base address of the
349 What: /sys/bus/cxl/devices/decoderX.Y/dpa_size
352 Contact: linux-cxl@vger.kernel.org
354 (RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
355 translates from a host physical address range, to a device local
356 address range. The range, base address plus length in bytes, of
357 DPA allocated to this decoder is conveyed in these 2 attributes.
358 Allocations can be mutated as long as the decoder is in the
359 disabled state. A write to 'dpa_size' releases the previous DPA
360 allocation and then attempts to allocate from the free capacity
361 in the device partition referred to by 'decoderX.Y/mode'.
362 Allocate and free requests can only be performed on the highest
363 instance number disabled decoder with non-zero size. I.e.
364 allocations are enforced to occur in increasing 'decoderX.Y/id'
365 order and frees are enforced to occur in decreasing
366 'decoderX.Y/id' order.
369 What: /sys/bus/cxl/devices/decoderX.Y/interleave_ways
372 Contact: linux-cxl@vger.kernel.org
374 (RO) The number of targets across which this decoder's host
375 physical address (HPA) memory range is interleaved. The device
376 maps every Nth block of HPA (of size ==
377 'interleave_granularity') to consecutive DPA addresses. The
378 decoder's position in the interleave is determined by the
379 device's (endpoint or switch) switch ancestry. For root
380 decoders their interleave is specified by platform firmware and
381 they only specify a downstream target order for host bridges.
384 What: /sys/bus/cxl/devices/decoderX.Y/interleave_granularity
387 Contact: linux-cxl@vger.kernel.org
389 (RO) The number of consecutive bytes of host physical address
390 space this decoder claims at address N before the decode rotates
391 to the next target in the interleave at address N +
392 interleave_granularity (assuming N is aligned to
393 interleave_granularity).
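
As a worked illustration of how 'interleave_ways' and 'interleave_granularity' combine, the sketch below computes which interleave position claims a given host physical address. It assumes plain modulo interleaving over a window whose start is granularity-aligned; platforms may use other arithmetic (for example XOR-based schemes), which this does not model. The function name and parameters are hypothetical.

```python
def interleave_position(hpa: int, start: int, ways: int, granularity: int) -> int:
    """Return the interleave position (0..ways-1) that claims 'hpa'.

    'start' is the base of the decode window, 'ways' the number of
    targets, and 'granularity' the number of consecutive bytes claimed
    before decode rotates to the next target.
    """
    if ways <= 0 or granularity <= 0:
        raise ValueError("ways and granularity must be positive")
    if hpa < start:
        raise ValueError("address is below the decode window")
    return ((hpa - start) // granularity) % ways
```

With 2 ways and a granularity of 256 bytes, offsets 0..255 map to position 0, offsets 256..511 to position 1, and offset 512 wraps back to position 0.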
396 What: /sys/bus/cxl/devices/decoderX.Y/create_{pmem,ram}_region
397 Date: May, 2022, January, 2023
398 KernelVersion: v6.0 (pmem), v6.3 (ram)
399 Contact: linux-cxl@vger.kernel.org
401 (RW) Write a string in the form 'regionZ' to start the process
402 of defining a new persistent, or volatile memory region
403 (interleave-set) within the decode range bounded by root decoder
404 'decoderX.Y'. The value written must match the current value
405 returned from reading this attribute. An atomic compare exchange
406 operation is done on write to assign the requested id to a
407 region and allocate the region-id for the next creation attempt.
408 EBUSY is returned if the region name written does not match the
409 current cached value.
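
The read-then-write-back protocol described above can be sketched as follows. This is a hedged userspace example: the helper name `create_region`, the `base` parameter, and the retry count are illustrative choices, not part of the ABI.

```python
import errno
from pathlib import Path


def create_region(root_decoder: str, kind: str = "pmem",
                  base: str = "/sys/bus/cxl/devices", retries: int = 5) -> str:
    """Claim the next region name from a root decoder and return it."""
    if kind not in ("pmem", "ram"):
        raise ValueError("kind must be 'pmem' or 'ram'")
    attr = Path(base) / root_decoder / f"create_{kind}_region"
    for _ in range(retries):
        # Read the currently advertised name, e.g. 'region0'.
        name = attr.read_text().strip()
        try:
            # Write the same name back to claim it.
            attr.write_text(name)
        except OSError as err:
            if err.errno == errno.EBUSY:
                # Someone else claimed this id first; re-read and retry.
                continue
            raise
        return name
    raise RuntimeError("could not claim a region name; too many EBUSY retries")
```

Retrying on EBUSY reflects the compare-exchange semantics: a mismatch only means the cached id moved on between the read and the write.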
412 What: /sys/bus/cxl/devices/decoderX.Y/delete_region
415 Contact: linux-cxl@vger.kernel.org
417 (WO) Write a string in the form 'regionZ' to delete that region,
418 provided it is currently idle / not bound to a driver.
421 What: /sys/bus/cxl/devices/decoderX.Y/qos_class
424 Contact: linux-cxl@vger.kernel.org
426 (RO) For CXL host platforms that support "QoS Telemetry" this
427 root-decoder-only attribute conveys a platform specific cookie
428 that identifies a QoS performance class for the CXL Window.
429 This class-id can be compared against a similar "qos_class"
430 published for each memory-type that an endpoint supports. While
431 it is not required that endpoints map their local memory-class
432 to a matching platform class, mismatches are not recommended and
433 there are platform specific side-effects that may result.
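
The comparison between a root decoder's class-id and a memdev partition's class-ids could look like the sketch below. It treats the cookies as opaque strings and assumes the memdev value may hold several comma delimited ids; the helper name `qos_class_matches` is hypothetical.

```python
def qos_class_matches(memdev_qos: str, root_qos: str) -> bool:
    """Return True if the root decoder's class-id appears among the
    class-ids read from memX/{ram,pmem}/qos_class."""
    root = root_qos.strip()
    endpoint_ids = {tok.strip() for tok in memdev_qos.split(",") if tok.strip()}
    return bool(root) and root in endpoint_ids
```

A False result is not an error condition by itself; per the description, mismatches are allowed but not recommended.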
436 What: /sys/bus/cxl/devices/regionZ/uuid
439 Contact: linux-cxl@vger.kernel.org
441 (RW) Write a unique identifier for the region. This field must
442 be set for persistent regions and it must not conflict with the
443 UUID of another region. For volatile ram regions this
444 attribute is a read-only empty string.
447 What: /sys/bus/cxl/devices/regionZ/interleave_granularity
450 Contact: linux-cxl@vger.kernel.org
452 (RW) Set the number of consecutive bytes each device in the
453 interleave set will claim. The possible interleave granularity
454 values are determined by the CXL spec and the participating
458 What: /sys/bus/cxl/devices/regionZ/interleave_ways
461 Contact: linux-cxl@vger.kernel.org
463 (RW) Configures the number of devices participating in the
464 region. Each device will provide
465 1/interleave_ways of storage for the region.
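
As a small worked example of the 1/interleave_ways rule, the sketch below computes each device's contribution. It assumes the region size divides evenly by the number of ways; the helper name is hypothetical.

```python
def per_device_bytes(region_size: int, interleave_ways: int) -> int:
    """Return the capacity each device contributes to the region."""
    if interleave_ways <= 0:
        raise ValueError("interleave_ways must be positive")
    if region_size % interleave_ways:
        raise ValueError("region size is not a multiple of interleave_ways")
    return region_size // interleave_ways
```

For instance, a 1 GiB region interleaved across 4 devices takes 256 MiB from each.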
468 What: /sys/bus/cxl/devices/regionZ/size
471 Contact: linux-cxl@vger.kernel.org
473 (RW) System physical address space to be consumed by the region.
474 When written, triggers the driver to allocate space out of the
475 parent root decoder's address space. When read, the size of the
476 address space is reported and should match the span of the
477 region's resource attribute. Size shall be set after the
478 interleave configuration parameters. Once set it cannot be
479 changed, only freed by writing 0. The kernel makes no guarantees
480 that data is maintained over an address space freeing event, and
481 there is no guarantee that a free followed by an allocate
482 results in the same address being allocated.
485 What: /sys/bus/cxl/devices/regionZ/mode
488 Contact: linux-cxl@vger.kernel.org
490 (RO) The mode of a region is established at region creation time
491 and dictates the mode of the endpoint decoders that comprise the
492 region. For more details on the possible modes see
493 /sys/bus/cxl/devices/decoderX.Y/mode
496 What: /sys/bus/cxl/devices/regionZ/resource
499 Contact: linux-cxl@vger.kernel.org
501 (RO) A region is a contiguous partition of a CXL root decoder
502 address space. Region capacity is allocated by writing to the
503 size attribute, the resulting physical address space determined
504 by the driver is reflected here. It is therefore not useful to
505 read this before writing a value to the size attribute.
508 What: /sys/bus/cxl/devices/regionZ/target[0..N]
511 Contact: linux-cxl@vger.kernel.org
513 (RW) Write an endpoint decoder object name to 'targetX' where X
514 is the intended position of the endpoint device in the region
515 interleave and N is the 'interleave_ways' setting for the
516 region. ENXIO is returned if the write results in an impossible
517 to map decode scenario, like the endpoint is unreachable at that
518 position relative to the root decoder interleave. EBUSY is
519 returned if the position in the region is already occupied, or
520 if the region is not in a state to accept interleave
521 configuration changes. EINVAL is returned if the object name is
522 not an endpoint decoder. Once all positions have been
523 successfully written a final validation for decode conflicts is
524 performed before activating the region.
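
A hedged sketch of programming a region from userspace, following the ordering stated in the region entries above (interleave parameters first, then size, then targets). It assumes positions are numbered from 0, that the endpoint decoders already have their mode and DPA allocation set, and that committing the region is done separately. The helper name and the `base` parameter are illustrative.

```python
from pathlib import Path
from typing import Optional, Sequence


def configure_region(region: str, granularity: int, size: int,
                     targets: Sequence[str], uuid: Optional[str] = None,
                     base: str = "/sys/bus/cxl/devices") -> None:
    """Write region attributes in the documented order.

    'targets' lists endpoint decoder names in interleave order; the
    number of targets is used as 'interleave_ways'. Failures surface as
    OSError with the errno values described above (ENXIO, EBUSY, EINVAL).
    """
    rdir = Path(base) / region
    if uuid is not None:
        # Required for persistent regions only.
        (rdir / "uuid").write_text(uuid)
    (rdir / "interleave_granularity").write_text(str(granularity))
    (rdir / "interleave_ways").write_text(str(len(targets)))
    # Size must follow the interleave parameters.
    (rdir / "size").write_text(str(size))
    for position, decoder in enumerate(targets):
        (rdir / f"target{position}").write_text(decoder)
```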
527 What: /sys/bus/cxl/devices/regionZ/commit
530 Contact: linux-cxl@vger.kernel.org
532 (RW) Write a boolean 'true' string value to this attribute to
533 trigger the region to transition from the software programmed
534 state to the actively decoding in hardware state. The commit
535 operation, in addition to validating that the region is in a
536 properly configured state, validates that the decoders are being
537 committed in spec mandated order (last committed decoder id +
538 1), and checks that the hardware accepts the commit request.
539 Reading this value indicates whether the region is committed or
543 What: /sys/bus/cxl/devices/memX/trigger_poison_list
546 Contact: linux-cxl@vger.kernel.org
548 (WO) When a boolean 'true' is written to this attribute the
549 memdev driver retrieves the poison list from the device. The
550 list consists of addresses that are poisoned, or would result
551 in poison if accessed, and the source of the poison. This
552 attribute is only visible for devices supporting the
553 capability. The retrieved errors are logged as kernel
554 events when cxl_poison event tracing is enabled.
557 What: /sys/bus/cxl/devices/regionZ/accessY/read_bandwidth
558 /sys/bus/cxl/devices/regionZ/accessY/write_bandwidth
561 Contact: linux-cxl@vger.kernel.org
563 (RO) The aggregated read or write bandwidth of the region. The
564 number is the accumulated read or write bandwidth of all CXL memory
565 devices that contribute to the region in MB/s. It is
566 identical to the data that should appear in
567 /sys/devices/system/node/nodeX/accessY/initiators/read_bandwidth or
568 /sys/devices/system/node/nodeX/accessY/initiators/write_bandwidth.
569 See Documentation/ABI/stable/sysfs-devices-node. access0 provides
570 the number to the closest initiator and access1 provides the
571 number to the closest CPU.
574 What: /sys/bus/cxl/devices/regionZ/accessY/read_latency
575 /sys/bus/cxl/devices/regionZ/accessY/write_latency
578 Contact: linux-cxl@vger.kernel.org
580 (RO) The read or write latency of the region. The number is
581 the worst read or write latency of all CXL memory devices that
582 contribute to the region in nanoseconds. It is identical to the
583 data that should appear in
584 /sys/devices/system/node/nodeX/accessY/initiators/read_latency or
585 /sys/devices/system/node/nodeX/accessY/initiators/write_latency.
586 See Documentation/ABI/stable/sysfs-devices-node. access0 provides
587 the number to the closest initiator and access1 provides the
588 number to the closest CPU.
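
The aggregation described in the bandwidth and latency entries above (bandwidth accumulated across devices, latency taken as the worst case) can be illustrated with the sketch below. It only mirrors the wording of those descriptions and does not model how the kernel derives the per-device inputs; the function name is hypothetical.

```python
from typing import Iterable, Tuple


def aggregate_region_perf(devices: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Combine per-device (bandwidth_MBps, latency_ns) pairs.

    Returns (total bandwidth in MB/s, worst latency in nanoseconds).
    """
    devs = list(devices)
    if not devs:
        raise ValueError("a region needs at least one device")
    total_bandwidth = sum(bw for bw, _ in devs)
    worst_latency = max(lat for _, lat in devs)
    return total_bandwidth, worst_latency
```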