2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

612 Commits

Author SHA1 Message Date
Li Ming
0a46f60a9f cxl/edac: Fix using wrong repair type to check dram event record
cxl_find_rec_dram() is used to find a DRAM event record based on the
inputted attributes. Different repair_type of the inputted attributes
will check the DRAM event record in different ways.
When EDAC driver is performing a memory rank sparing, it should use
CXL_RANK_SPARING rather than CXL_BANK_SPARING as repair_type for DRAM
event record checking.

Fixes: 588ca944c2 ("cxl/edac: Add CXL memory device memory sparing control feature")
Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Link: https://patch.msgid.link/20250620052924.138892-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-06-25 12:05:45 -07:00
Dan Williams
3c70ec71ab cxl/ras: Fix CPER handler device confusion
By inspection, cxl_cper_handle_prot_err() is making a series of fragile
assumptions that can lead to crashes:

1/ It assumes that endpoints identified in the record are a CXL-type-3
   device, nothing guarantees that.

2/ It assumes that the device is bound to the cxl_pci driver, nothing
   guarantees that.

3/ Minor, it holds the device lock over the switch-port tracing for no
   reason as the trace is 100% generated from data in the record.

Correct those by checking that the PCIe endpoint parents a cxl_memdev
before assuming the format of the driver data, and move the lock to where
it is required. Consequently this also makes the implementation ready for
CXL accelerators that are not bound to cxl_pci.

Fixes: 36f257e3b0 ("acpi/ghes, cxl/pci: Process CXL CPER Protocol Errors")
Cc: Terry Bowman <terry.bowman@amd.com>
Cc: Li Ming <ming.li@zohomail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Reviewed-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Link: https://patch.msgid.link/20250612192043.2254617-1-dan.j.williams@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-06-13 09:02:04 -07:00
Li Ming
a403fe6c0b cxl/edac: Fix potential memory leak issues
In cxl_store_rec_gen_media() and cxl_store_rec_dram(), use kmemdup() to
duplicate a cxl gen_media/dram event to store the event in a xarray by
xa_store(). The cxl gen_media/dram event allocated by kmemdup() should
be freed in the case that the xa_store() fails.

Fixes: 0b5ccb0de1 ("cxl/edac: Support for finding memory operation attributes from the current boot")
Signed-off-by: Li Ming <ming.li@zohomail.com>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20250613011648.102840-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-06-13 08:45:30 -07:00
Li Ming
fdc9be9092 cxl/edac: Fix the min_scrub_cycle of a region miscalculation
When trying to update the scrub_cycle value of a cxl region, which means
updating the scrub_cycle value of each memdev under a cxl region. cxl
driver needs to guarantee the new scrub_cycle value is greater than the
min_scrub_cycle value of a memdev, otherwise the updating operation will
fail(Per Table 8-223 in CXL r3.2 section 8.2.10.9.11.1).

Current implementation logic of getting the min_scrub_cycle value of a
cxl region is that getting the min_scrub_cycle value of each memdevs
under the cxl region, then using the minimum min_scrub_cycle value as
the region's min_scrub_cycle. Checking if the new scrub_cycle value is
greater than this value. If yes, updating the new scrub_cycle value to
each memdevs. The issue is that the new scrub_cycle value is possibly
greater than the minimum min_scrub_cycle value of all memdevs but less
than the maximum min_scrub_cycle value of all memdevs if memdevs have
a different min_scrub_cycle value. The updating operation will always
fail on these memdevs which have a greater min_scrub_cycle than the new
scrub_cycle.

The correct implementation logic is to get the maximum value of these
memdevs' min_scrub_cycle, check if the new scrub_cycle value is greater
than the value. If yes, the new scrub_cycle value is fit for the region.

The change also impacts the result of
cxl_patrol_scrub_get_min_scrub_cycle(), the interface returned the
minimum min_scrub_cycle value among all memdevs under the region before
the change. The interface will return the maximum min_scrub_cycle value
among all memdevs under the region with the change.

Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://patch.msgid.link/20250603104314.25569-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-06-10 13:54:15 -07:00
Dan Carpenter
87b42c114c cxl: fix return value in cxlctl_validate_set_features()
The cxlctl_validate_set_features() function is type bool.  It's supposed
to return true for valid requests and false for invalid.  However, this
error path returns ERR_PTR(-EINVAL) which is true when it was intended to
return false.

The incorrect return will result in kernel failing to prevent a
incorrect op_size passed in from userspace to be detected.

[ dj: Add user impact to commit log ]

Fixes: f76e0bbc8b ("cxl: Update prototype of function get_support_feature_info()")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/aDbFPSCujpJLY1if@stanley.mountain
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-06-09 09:18:15 -07:00
Dave Jiang
9f153b7fb5 Merge branch 'for-6.16/cxl-features-ras' into cxl-for-next
Add CXL RAS Features support. Features include "patrol scrub control",
"error check scrub", "perform maintenance", and "memory sparing". This
support connects the RAS Featurs to EDAC.
2025-05-23 13:26:24 -07:00
Shiju Jose
be9b359e05 cxl/edac: Add CXL memory device soft PPR control feature
Post Package Repair (PPR) maintenance operations may be supported by CXL
devices that implement CXL.mem protocol. A PPR maintenance operation
requests the CXL device to perform a repair operation on its media.
For example, a CXL device with DRAM components that support PPR features
may implement PPR Maintenance operations. DRAM components may support two
types of PPR, hard PPR (hPPR), for a permanent row repair, and Soft PPR
(sPPR), for a temporary row repair. Soft PPR is much faster than hPPR,
but the repair is lost with a power cycle.

During the execution of a PPR Maintenance operation, a CXL memory device:
- May or may not retain data
- May or may not be able to process CXL.mem requests correctly, including
the ones that target the DPA involved in the repair.
These CXL Memory Device capabilities are specified by Restriction Flags
in the sPPR Feature and hPPR Feature.

Soft PPR maintenance operation may be executed at runtime, if data is
retained and CXL.mem requests are correctly processed. For CXL devices with
DRAM components, hPPR maintenance operation may be executed only at boot
because typically data may not be retained with hPPR maintenance operation.

When a CXL device identifies error on a memory component, the device
may inform the host about the need for a PPR maintenance operation by using
an Event Record, where the Maintenance Needed flag is set. The Event Record
specifies the DPA that should be repaired. A CXL device may not keep track
of the requests that have already been sent and the information on which
DPA should be repaired may be lost upon power cycle.
The userspace tool requests for maintenance operation if the number of
corrected error reported on a CXL.mem media exceeds error threshold.

CXL spec 3.2 section 8.2.10.7.1.2 describes the device's sPPR (soft PPR)
maintenance operation and section 8.2.10.7.1.3 describes the device's
hPPR (hard PPR) maintenance operation feature.

CXL spec 3.2 section 8.2.10.7.2.1 describes the sPPR feature discovery and
configuration.

CXL spec 3.2 section 8.2.10.7.2.2 describes the hPPR feature discovery and
configuration.

Add support for controlling CXL memory device soft PPR (sPPR) feature.
Register with EDAC driver, which gets the memory repair attr descriptors
from the EDAC memory repair driver and exposes sysfs repair control
attributes for PRR to the userspace. For example CXL PPR control for the
CXL mem0 device is exposed in /sys/bus/edac/devices/cxl_mem0/mem_repairX/

Add checks to ensure the memory to be repaired is offline and originates
from a CXL DRAM or CXL gen_media error record reported in the current boot,
before requesting a PPR operation on the device.

Note: Tested with QEMU patch for CXL PPR feature.
https://lore.kernel.org/linux-cxl/20250509172229.726-1-shiju.jose@huawei.com/T/#m70b2b010f43f7f4a6f9acee5ec9008498bf292c3

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-9-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-23 13:25:06 -07:00
Shiju Jose
588ca944c2 cxl/edac: Add CXL memory device memory sparing control feature
Memory sparing is defined as a repair function that replaces a portion of
memory with a portion of functional memory at that same DPA. The subclasses
for this operation vary in terms of the scope of the sparing being
performed. The cacheline sparing subclass refers to a sparing action that
can replace a full cacheline. Row sparing is provided as an alternative to
PPR sparing functions and its scope is that of a single DDR row.
As per CXL r3.2 Table 8-125 foot note 1. Memory sparing is preferred over
PPR when possible.
Bank sparing allows an entire bank to be replaced. Rank sparing is defined
as an operation in which an entire DDR rank is replaced.

Memory sparing maintenance operations may be supported by CXL devices
that implement CXL.mem protocol. A sparing maintenance operation requests
the CXL device to perform a repair operation on its media.
For example, a CXL device with DRAM components that support memory sparing
features may implement sparing maintenance operations.

The host may issue a query command by setting query resources flag in the
input payload (CXL spec 3.2 Table 8-120) to determine availability of
sparing resources for a given address. In response to a query request,
the device shall report the resource availability by producing the memory
sparing event record (CXL spec 3.2 Table 8-60) in which the Channel, Rank,
Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields are a copy
of the values specified in the request.

During the execution of a sparing maintenance operation, a CXL memory
device:
- may not retain data
- may not be able to process CXL.mem requests correctly.
These CXL memory device capabilities are specified by restriction flags
in the memory sparing feature readable attributes.

When a CXL device identifies error on a memory component, the device
may inform the host about the need for a memory sparing maintenance
operation by using DRAM event record, where the 'maintenance needed' flag
may set. The event record contains some of the DPA, Channel, Rank,
Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields that
should be repaired. The userspace tool requests for maintenance operation
if the 'maintenance needed' flag set in the CXL DRAM error record.

CXL spec 3.2 section 8.2.10.7.1.4 describes the device's memory sparing
maintenance operation feature.

CXL spec 3.2 section 8.2.10.7.2.3 describes the memory sparing feature
discovery and configuration.

Add support for controlling CXL memory device memory sparing feature.
Register with EDAC driver, which gets the memory repair attr descriptors
from the EDAC memory repair driver and exposes sysfs repair control
attributes for memory sparing to the userspace. For example CXL memory
sparing control for the CXL mem0 device is exposed in
/sys/bus/edac/devices/cxl_mem0/mem_repairX/

Use case
========
1. CXL device identifies a failure in a memory component, report to
   userspace in a CXL DRAM trace event with DPA and other attributes of
   memory to repair such as channel, rank, nibble mask, bank Group,
   bank, row, column, sub-channel.

2. Rasdaemon process the trace event and may issue query request in sysfs
check resources available for memory sparing if either of the following
conditions met.
 - 'maintenance needed' flag set in the event record.
 - 'threshold event' flag set for CVME threshold feature.
 - When the number of corrected error reported on a CXL.mem media to the
   userspace exceeds the threshold value for corrected error count defined
   by the userspace policy.

3. Rasdaemon process the memory sparing trace event and issue repair
   request for memory sparing.

Kernel CXL driver shall report memory sparing event record to the userspace
with the resource availability in order rasdaemon to process the event
record and issue a repair request in sysfs for the memory sparing operation
in the CXL device.

Note: Based on the feedbacks from the community 'query' sysfs attribute is
removed and reporting memory sparing error record to the userspace are not
supported. Instead userspace issues sparing operation and kernel does the
same to the CXL memory device, when 'maintenance needed' flag set in the
DRAM event record.

Add checks to ensure the memory to be repaired is offline and if online,
then originates from a CXL DRAM error record reported in the current boot
before requesting a memory sparing operation on the device.

Note: Tested memory sparing feature control with QEMU patch
      "hw/cxl: Add emulation for memory sparing control feature"
      https://lore.kernel.org/linux-cxl/20250509172229.726-1-shiju.jose@huawei.com/T/#m5f38512a95670d75739f9dad3ee91b95c7f5c8d6

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-8-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-23 13:24:53 -07:00
Shiju Jose
0b5ccb0de1 cxl/edac: Support for finding memory operation attributes from the current boot
Certain operations on memory, such as memory repair, are permitted
only when the address and other attributes for the operation are
from the current boot. This is determined by checking whether the
memory attributes for the operation match those in the CXL gen_media
or CXL DRAM memory event records reported during the current boot.

The CXL event records must be backed up because they are cleared
in the hardware after being processed by the kernel.

Support is added for storing CXL gen_media or CXL DRAM memory event
records in xarrays. Old records are deleted when they expire or when
there is an overflow and which depends on platform correctly report
Event Record Timestamp field of CXL spec Table 8-55 Common Event
Record Format.

Additionally, helper functions are implemented to find a matching
record in the xarray storage based on the memory attributes and
repair type.

Add validity check, when matching attributes for sparing, using
the validity flag in the DRAM event record, to ensure that all
required attributes for a requested repair operation are valid and
set.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-7-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-23 13:24:38 -07:00
Shiju Jose
077ee5f7dd cxl/edac: Add support for PERFORM_MAINTENANCE command
Add support for PERFORM_MAINTENANCE command.

CXL spec 3.2 section 8.2.10.7.1 describes the Perform Maintenance command.
This command requests the device to execute the maintenance operation
specified by the maintenance operation class and the maintenance operation
subclass.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-6-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-23 13:24:28 -07:00
Shiju Jose
85fb6a16ad cxl/edac: Add CXL memory device ECS control feature
CXL spec 3.2 section 8.2.10.9.11.2 describes the DDR5 ECS (Error Check
Scrub) control feature.
The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
Specification (JESD79-5) and allows the DRAM to internally read, correct
single-bit errors, and write back corrected data bits to the DRAM array
while providing transparency to error counts.

The ECS control allows the requester to change the log entry type, the ECS
threshold count (provided the request falls within the limits specified in
DDR5 mode registers), switch between codeword mode and row count mode, and
reset the ECS counter.

Register with EDAC device driver, which retrieves the ECS attribute
descriptors from the EDAC ECS and exposes the ECS control attributes to
userspace via sysfs. For example, the ECS control for the memory media FRU0
in CXL mem0 device is located at /sys/bus/edac/devices/cxl_mem0/ecs_fru0/

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-5-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-23 13:24:20 -07:00
Shiju Jose
0c6e6f1357 cxl/edac: Add CXL memory device patrol scrub control feature
CXL spec 3.2 section 8.2.10.9.11.1 describes the device patrol scrub
control feature. The device patrol scrub proactively locates and makes
corrections to errors in regular cycle.

Allow specifying the number of hours within which the patrol scrub must be
completed, subject to minimum and maximum limits reported by the device.
Also allow disabling scrub allowing trade-off error rates against
performance.

Add support for patrol scrub control on CXL memory devices.
Register with the EDAC device driver, which retrieves the scrub attribute
descriptors from EDAC scrub and exposes the sysfs scrub control attributes
to userspace. For example, scrub control for the CXL memory device
"cxl_mem0" is exposed in /sys/bus/edac/devices/cxl_mem0/scrubX/.

Additionally, add support for region-based CXL memory patrol scrub control.
CXL memory regions may be interleaved across one or more CXL memory
devices. For example, region-based scrub control for "cxl_region1" is
exposed in /sys/bus/edac/devices/cxl_region1/scrubX/.

[dj: A few formatting fixes from Jonathan]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-4-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-23 13:24:09 -07:00
Shiju Jose
f76e0bbc8b cxl: Update prototype of function get_support_feature_info()
Add following changes to function get_support_feature_info()
1. Make generic to share between cxl-fwctl and cxl-edac paths.
2. Rename get_support_feature_info() to cxl_feature_info()
3. Change parameter const struct fwctl_rpc_cxl *rpc_in to
   const uuid_t *uuid.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-3-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-23 13:24:00 -07:00
Alison Schofield
bfc6270ab3 cxl/features: Remove the inline specifier from to_cxlfs()
to_cxlfs() was declared 'inline' in the header but only defined in
drivers/cxl/core/features.c. This has worked because features.c was
the only file using the function and the definition happened to be
available in the same compilation unit.

However, in preparation for a second .c file using the header and
needing to call the function, the inline specifier became an issue.
Sparse flagged the declaration as invalid since 'inline' requires a
visible definition at the point of use.

Defining the function in the header was considered but rejected, as
it depends on internal symbols not visible at that level.

Remove the inline specifier to correct the linkage violation.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250521233625.1745849-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-22 07:39:01 -07:00
Li Ming
6eed708a56 cxl/feature: Remove redundant code of get supported features
In cxlctl_get_supported_features(), there is a code block that handles
the case where the requested is equal to 0. But the code following the
code block can also handle this situation. So the code block is not
needed.

Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250516143220.35302-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-19 08:58:04 -07:00
Alison Schofield
f97bdc61c7 Documentation: Update the CXL Maturity Map
Changes for extended-linear cache, hetero-interleave, and HPA->DPA
address translation.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512214225.1389484-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-15 16:34:26 -07:00
Alison Schofield
d542461211 cxl: Sync up the driver-api/cxl documentation
pmem.c regs.c mbox.c identifiers were missing. Add them to
memory-devices.rst following their respective DOC comment includes.

Two acpi.c identifiers were available, but not used in kernel-doc's:
1) Add add_cxl_resources to memory-devices.rst and fix up the Sphinx
complaint on the ascii art by escaping it.
2) Add cxl_acpi_evaluate_qtg_dsm to access-coordinates.rst.

core/features.c is new. Add a "DOC: cxl features" comment to the
source and identifiers to memory_devices.rst.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20250513215813.1419645-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13 15:12:15 -07:00
Dan Carpenter
a223ce1957 cxl/hdm: Clean up a debug printk
Smatch complains that %pa is for phys_addr_t types and "size" is a u64.

    drivers/cxl/core/hdm.c:521 cxl_dpa_alloc() error: '%pa' expects
    argument of type 'phys_addr_t*', argument 4 has type 'ullong*

Looking at this, to me it seems more useful to print the sizes as
decimal instead of hex.  Let's do that.

[dj: Adjusted based on latest code changes. ]

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/3d3d969d-651d-4e9d-a892-900876a60ab5@moroto.mountain
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 14:04:08 -07:00
Dave Jiang
68d8b4f399 Merge branch 'for-6.16/cxl-cleanups' into cxl-for-next
In preparation for code changes related to AMD Zen5 address translation
support, a number of small code refactor and cleanups are send ahead.
2025-05-09 09:59:28 -07:00
Robert Richter
98a863fee2 cxl: Add a dev_dbg() when a decoder was added to a port
Improve debugging by adding and unifying messages whenever a decoder
was added to a port. It is especially useful to get the decoder
mapping of the involved CXL host bridge or PCI device. This avoids a
complex lookup of the decoder/port/device mappings in sysfs.

Example log messages:

  cxl_acpi ACPI0017:00: decoder0.0 added to root0
  cxl_acpi ACPI0017:00: decoder0.1 added to root0
  ...
   pci0000:e0: decoder1.0 added to port1
   pci0000:e0: decoder1.1 added to port1
  ...
  cxl_mem mem0: decoder5.0 added to endpoint5
  cxl_mem mem0: decoder5.1 added to endpoint5

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-15-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:57:43 -07:00
Robert Richter
d90acdf49e cxl/region: Add a dev_err() on missing target list entries
Broken target lists are hard to discover as the driver fails at a
later initialization stage. Add an error message for this.

Example log messages:

  cxl_mem mem1: failed to find endpoint6:0000:e0:01.3 in target list of decoder1.1
  cxl_port endpoint6: failed to register decoder6.0: -6
  cxl_port endpoint6: probe: 0

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-14-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:57:43 -07:00
Robert Richter
9efefa1c6f cxl/region: Add a dev_warn() on registration failure
Esp. in complex system configurations with multiple endpoints and
interleaving setups it is hard to detect region setup failures as its
registration may silently fail. Add messages to show registration
failures.

Example log message:

  cxl region5: region sort successful
  cxl region5: mem0:endpoint5 decoder5.0 add: mem0:decoder5.0 @ 0 next: none nr_eps: 1 nr_targets: 1
  cxl_port endpoint5: decoder5.0: range: 0x22350000000-0x2634fffffff iw: 1 ig: 256
  cxl region5: pci0000:e0:port1 decoder1.2 add: mem0:decoder5.0 @ 0 next: mem0 nr_eps: 1 nr_targets: 1
  cxl region5: pci0000:e0:port1 iw: 1 ig: 256
  cxl region5: pci0000:e0:port1: decoder1.2 expected 0000:e0:01.2 at 0
  cxl endpoint5: failed to attach decoder5.0 to region5: -6
  cxl_port endpoint5: probe: 0

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-13-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:57:43 -07:00
Robert Richter
d6879d8cfb cxl/region: Add function to find a port's switch decoder by range
Factor out code to find the switch decoder of a port for a specific
address range. Reuse the code to search a root decoder, create the
function cxl_port_find_switch_decoder() and rework
match_root_decoder_by_range() to be usable for switch decoders too.

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-12-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:57:42 -07:00
Robert Richter
868a8f1f04 cxl/region: Factor out code to find a root decoder's region
In function cxl_add_to_region() there is code to determine a root
decoder's region. Factor that code out. This is in preparation to
further rework and simplify function cxl_add_to_region().

The reference count must be decremented after using the region.
cxl_find_region_by_range() is paired with the put_cxl_region cleanup
helper that can be used for this.

[dj: Fixed up "obj __free(...) = NULL" pattern]

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-11-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:57:29 -07:00
Robert Richter
9466ee9816 cxl/region: Factor out code to find the root decoder
In function cxl_add_to_region() there is code to determine the root
decoder associated to an endpoint decoder. Factor out that code for
later reuse. This has the benefit of reducing cxl_add_to_region()'s
function complexity.

The reference count must be decremented after using the root decoder.
cxl_find_root_decoder() is paired with the put_cxl_root_decoder
cleanup helper that can be used for this.

[dj: Fixed up "obj __free(...) = NULL" pattern]

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-10-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:54:21 -07:00
Robert Richter
74bf125abd cxl/port: Replace put_cxl_root() by a cleanup helper
Function put_cxl_root() is only used by its cleanup helper. Remove the
function entirely and only use the helper.

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-9-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:48:26 -07:00
Robert Richter
5ed826fc4b cxl/region: Move find_cxl_root() to cxl_add_to_region()
When adding an endpoint to a region, the root port is determined
first. Move this directly into cxl_add_to_region(). This is in
preparation of the initialization of endpoints that iterates the port
hierarchy from the endpoint up to the root port.

As a side-effect the root argument is removed from the argument lists
of cxl_add_to_region() and related functions. Now, the endpoint is the
only parameter to add a region. This simplifies the function
interface.

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-8-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:48:26 -07:00
Robert Richter
0ee2d97810 cxl/region: Avoid duplicate call of cxl_port_pick_region_decoder()
Function cxl_port_pick_region_decoder() is called twice, in
alloc_region_ref() and cxl_rr_alloc_decoder(). Both functions are
subsequently called from cxl_port_attach_region(). Make the decoder a
function argument to both which avoids a duplicate call of
cxl_port_pick_region_decoder().

Now, cxl_rr_alloc_decoder() no longer allocates the decoder. Instead,
the previously picked decoder is assigned to the region reference.
Hence, rename the function to cxl_rr_assign_decoder().

Moving the call out of alloc_region_ref() also moves it out of the
xa_for_each() loop in there. Now, cxld is determined no longer only
for each auto-generated region, but now once for all regions
regardless of auto-generated or not. This is fine as the cxld argument
is needed for all regions in cxl_rr_assign_decoder() and an error would
be returned otherwise anyway. So it is better to determine the decoder
in front of all this and fail early if missing instead of running
through all that code with multiple calls of
cxl_port_pick_region_decoder().

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-7-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:48:26 -07:00
Robert Richter
a3a96873b2 cxl/region: Rename function to cxl_port_pick_region_decoder()
Current function cxl_region_find_decoder() is used to find a port's
decoder during region setup. In the region creation path the function
is an allocator to find a free port. In the region assembly path, it
is recalling the decoder that platform firmware picked for validation
purposes.

Rename function to cxl_port_pick_region_decoder() that better
describes its use and update the function's description.

The result of cxl_port_pick_region_decoder() is recorded in a 'struct
cxl_region_ref' in @port for later recall when other endpoints might
also be targets of the picked decoder.

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-6-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:48:26 -07:00
Robert Richter
99ff9060b2 cxl: Introduce parent_port_of() helper
Often a parent port must be determined. Introduce the parent_port_of()
helper function to avoid open coding of determination of a parent
port.

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-5-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:48:25 -07:00
Robert Richter
88bc0503c4 cxl/pci: Add comments to cxl_hdm_decode_init()
There are various configuration cases of HDM decoder registers causing
different code paths. Add comments to cxl_hdm_decode_init() to better
explain them.

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-4-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:48:25 -07:00
Robert Richter
d858631b1c cxl/pci: Moving code in cxl_hdm_decode_init()
Commit 3f9e075317 ("cxl/pci: simplify the check of mem_enabled in
cxl_hdm_decode_init()") changed the code flow in this function. The
root port is determined before a check to leave the function. Since
the root port is not used by the check it can be moved to run the
check first. This improves code readability and avoids unnesessary
code execution.

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-3-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:48:25 -07:00
Robert Richter
21339b30f0 cxl: Remove else after return
Remove unnecessary 'else' after return. Improves readability of code.
It is easier to place comments. Check and fix all occurrences under
drivers/cxl/.

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-2-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:48:25 -07:00
Gregory Price
ce32b0c9c5 cxl: core/region - ignore interleave granularity when ways=1
When validating decoder IW/IG when setting up regions, the granularity
is irrelevant when iw=1 - all accesses will always route to the only
target anyway - so all ig values are "correct". Loosen the requirement
that `ig = (parent_iw * parent_ig)` when iw=1.

On some Zen5 platforms, the platform BIOS specifies a 256-byte
interleave granularity window for host bridges when there is only
one target downstream.  This leads to Linux rejecting the configuration
of a region with a x2 root with two x1 hostbridges.

Decoder Programming:
   root - iw:2 ig:256
   hb1  - iw:1 ig:256  (Linux expects 512)
   hb2  - iw:1 ig:256  (Linux expects 512)
   ep1  - iw:2 ig:256
   ep2  - iw:2 ig:256

This change allows all decoders downstream of a passthrough decoder to
also be configured as passthrough (iw:1 ig:X), but still disallows
downstream decoders from applying subsequent interleaves.

e.g. in the above example if there was another decoder south of hb1
attempting to interleave 2 endpoints - Linux would enforce hb1.ig=512
because the southern decoder would have iw:2 and require ig=pig*piw.

[DJ: Fixed up against 6.15-rc1]

Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250402232552.999634-1-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-04-28 08:48:30 -07:00
Dave Jiang
cdafa67c02 cxl: Remove always true condition for cxlctl_validate_hw_command()
smatch warnings:
drivers/cxl/core/features.c:441 cxlctl_validate_hw_command() warn: always true condition '(scope >= 0) => (0-u32max >= 0)'

Remove the check entirely as it has no effect. Expectation is both of these
operations should be allowed for all check levels. They are read only and
have no change effects.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504041033.2HBboAZR-lkp@intel.com/
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250404165418.3032414-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-04-28 08:48:30 -07:00
Smita Koralahalli
078d3ee7c1 cxl/core/regs.c: Skip Memory Space Enable check for RCD and RCH Ports
According to CXL r3.2 section 8.2.1.2, the PCI_COMMAND register fields,
including Memory Space Enable bit, have no effect on the behavior of an
RCD Upstream Port. Retaining this check may incorrectly cause
cxl_pci_probe() to fail on a valid RCD upstream Port.

While the specification is explicit only for RCD Upstream Ports, this
check is solely for accessing the RCRB, which is always mapped through
memory space. Therefore, its safe to remove the check entirely. In
practice, firmware reliably enables the Memory Space Enable bit for
RCH Downstream Ports and no failures have been observed.

Removing the check simplifies the code and avoids unnecessary
special-casing, while relying on BIOS/firmware to configure devices
correctly. Moreover, any failures due to inaccessible RCRB regions
will still be caught either in __rcrb_to_component() or while
parsing the component register block.

The following failure was observed in dmesg when the check was present:
	cxl_pci 0000:7f:00.0: No component registers (-6)

Fixes: d5b1a27143 ("cxl/acpi: Extract component registers of restricted hosts from RCRB")
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://patch.msgid.link/20250407192734.70631-1-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-04-21 08:30:13 -07:00
Li Ming
25174d5cd2 cxl/feature: Update out_len in set feature failure case
CXL subsystem supports userspace to configure features via fwctl
interface, it will configure features by using Set Feature command.
Whatever Set Feature succeeds or fails, CXL driver always needs to
return a structure fwctl_rpc_cxl_out to caller, and returned size is
updated in a out_len parameter. The out_len should be updated not only
when the set feature succeeds, but also when the set feature fails.

Fixes: eb5dfcb9e3 ("cxl: Add support to handle user feature commands for set feature")
Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250410024521.514095-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-04-18 09:33:56 -07:00
Dave Jiang
dc915672f9 cxl: Fix devm host device for CXL fwctl initialization
Testing revealed the following error message for a CXL memdev that has
Feature support:
[   56.690430] cxl mem0: Resources present before probing

Attach the allocation of cxl_fwctl to the parent device of cxl_memdev.
devm_add_* calls for cxl_memdev should not happen before the memdev
probe function or outside the scope of the memdev driver.

cxl_test missed this bug because cxl_test always arranges for the
cxl_mem driver to be loaded before cxl_mock_mem runs. So the driver core
always finds the devres list idle in that case.

[DJ: Updated subject title and added commit log suggestion from djbw]

Fixes: 858ce2f56b ("cxl: Add FWCTL support to CXL")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/linux-cxl/6801aea053466_71fe2944c@dwillia2-xfh.jf.intel.com.notmuch/
Link: https://patch.msgid.link/20250418002933.406439-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-04-18 09:33:04 -07:00
Li Ming
36aace15d9 cxl/pci: Drop the parameter is_port of cxl_gpf_get_dvsec()
The first parameter of cxl_gpf_get_dvsec() is a struct device, can be
used to distinguish if the device is a cxl dport or a cxl pci device by
checking the PCIe type of it, so the parameter is_port is unnecessary
to cxl_gpf_get_dvsec(), using parameter struct device is enough.

Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://patch.msgid.link/20250323093110.233040-4-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-04-09 12:48:18 -07:00
Li Ming
6af941db6a cxl/pci: Update Port GPF timeout only when the first EP attaching
update_gpf_port_dvsec() is used to update GPF Phase timeout, if a CXL
switch is under a CXL root port, update_gpf_port_dvsec() will be invoked
on the CXL root port when each cxl memory device under the CXL switch is
attaching. It is enough to be invoked once, others are redundant.

When the first EP attaching, it always triggers its ancestor dports to
locate their own Port GPF DVSEC. The change is that invoking
update_gpf_port_dvsec() on these ancestor dports after ancestor dport
locating a Port GPF DVSEC. It guarantees that update_gpf_port_dvsec() is
invoked on a dport only happens during the first EP attaching.

Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://patch.msgid.link/20250323093110.233040-3-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-04-09 12:48:18 -07:00
Li Ming
87d2de042c cxl/core: Fix caching dport GPF DVSEC issue
Per Table 8-2 in CXL r3.2 section 8.1.1 and CXL r3.2 section 8.1.6, only
CXL Downstream switch ports and CXL root ports have GPF DVSEC for CXL
Port(DVSEC ID 04h).

CXL subsystem has a gpf_dvsec in struct cxl_port which is used to cache
the offset of a GPF DVSEC in PCIe configuration space. It will be
updated during the first EP attaching to the cxl_port, so the gpf_dvsec
can only cache the GPF DVSEC offset of the dport which the first EP is
under. Will not have chance to update it during other EPs attaching.
That means CXL subsystem will use the same GPF DVSEC offset for all
dports under the port, it will be a problem if the GPF DVSEC offset
cached in cxl_port is not the right offset for a dport.

Moving gpf_dvsec from struct cxl_port to struct cxl_dport, make every
cxl dport has their own GPF DVSEC offset caching, and each cxl dport
uses its own GPF DVSEC offset for GPF DVSEC accessing.

Fixes: a52b6a2c1c ("cxl/pci: Support Global Persistent Flush (GPF)")
Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://patch.msgid.link/20250323093110.233040-2-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-04-09 12:48:18 -07:00
Linus Torvalds
01ecadbe09 cxl for v6.15
- Add support for Global Persistent Flush (GPF)
 - Cleanup of DPA partition metadata handling
 	- Remove the CXL_DECODER_MIXED enum that's not needed anymore
 	- Introduce helpers to access resource and perf meta data
 	- Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info'
 	- Make cxl_dpa_alloc() DPA partition number agnostic
 	- Remove cxl_decoder_mode
 	- Cleanup partition size and perf helpers
 - Remove unused CXL partition values
 - Add logging support for CXL CPER endpoint and port protocol errors
 	- Prefix protocol error struct and function names with cxl_
 	- Move protocol error definitions and structures to a common location
 	- Remove drivers/firmware/efi/cper_cxl.h to include/linux/cper.h
 	- Add support in GHES to process CXL CPER protocol errors
 	- Process CXL CPER protocol errors
 	- Add trace logging for CXL PCIe port RAS errors
 - Remove redundant gp_port init
 - Add validation of cxl device serial number
 - CXL ABI documentation updates/fixups
 - A series that uses guard() to clean up open coded mutex lockings and remove gotos for error
   handling.
 - Some followup patches to support dirty shutdown accounting
 	- Add helper to retrieve DVSEC offset for dirty shutdown registers
 	- Rename cxl_get_dirty_shutdown() to cxl_arm_dirty_shutdown()
 	- Add support for dirty shutdown count via sysfs
 	- cxl_test support for dirty shutdown
 - A series to support CXL mailbox Features commands. Mostly in preparation for CXL EDAC
   code to utilize the Features commands. It's also in preparation for CXL fwctl support
   to utilize the CXL Features. The commands include "Get Supported Features", "Get Feature",
   and "Set Feature".
 - A series to support extended linear cache support described by the ACPI HMAT table. The
   addition helps enumerate the cache and also provides additional RAS reporting support for
   configuration with extended linear cache. (and related fixes for the
   series).
 - An update to cxl_test to support a 3-way capable CFMWS.
 - A documentation fix to remove unused "mixed mode".
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmfqtP4ACgkQYGjFFmlT
 OEqx9A//UsCWf1CH8bvjKXxSTlQmtPlNpcXe+gVR0sc5cL2VFxKf93AY8Zo1Br5A
 b40gtZJz9QwjwGwIvDiki9U2bopOyX3aMOyBJMYmLuL/irY8ENx2ra7ODbxe7uGn
 oZwpwG2sEGQxIAG2bCpVuCDIt8JjNvsTJo45TICs07w9TWTmH4Swpbz1g8VGpDz/
 kCQcXXHSHZleR5BzqVRKxjjqGEUFj2xDMzAI8VSL+7izMMoPLbjwnl2c1fwaLBPd
 iJTMboTXDj7eVMta/qqGkG7pshM81SnkSzy8cxImj3r4SRgRTZg9U8vhrR3K1kdH
 F05Ozd12tljtNXLWthENZPUbfcovy9oTxzMt/gVut7j6C7H3s3KCSbV7zhz5BmfD
 XcapOX4Cu7ptn88KLqE5a98oLuq2DXrLOcX5vKPYBfAO+68rC+gSAPSbzfZlSHa0
 1/TsxVvzDQUBVZWL94DeHvemyQb58GQBOypeNZbH8P4gAhWJqk3hZEO+wlSxpfd+
 R7wgabfKJUJ82KusCZHIW1Wg3/IrXb4yC+UyiObS5RgIJWpRmOkuJEHDvEUje+Dj
 aOWw/H3vZgeZnpW87FRxzvDJx1/0jZI1vsxH65m2wrvz6n5aGIA/Q6pgqCdU/m6c
 I231bl1bmZzJ8u3+vOZL4tFHcYHh4XCwQp+ZQt1uDa0fA5LbLhc=
 =ZME1
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull Compute Express Link (CXL)  updates from Dave Jiang:

 - Add support for Global Persistent Flush (GPF)

 - Cleanup of DPA partition metadata handling:
     - Remove the CXL_DECODER_MIXED enum that's not needed anymore
     - Introduce helpers to access resource and perf meta data
     - Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info'
     - Make cxl_dpa_alloc() DPA partition number agnostic
     - Remove cxl_decoder_mode
     - Cleanup partition size and perf helpers

 - Remove unused CXL partition values

 - Add logging support for CXL CPER endpoint and port protocol errors:
     - Prefix protocol error struct and function names with cxl_
     - Move protocol error definitions and structures to a common location
     - Remove drivers/firmware/efi/cper_cxl.h to include/linux/cper.h
     - Add support in GHES to process CXL CPER protocol errors
     - Process CXL CPER protocol errors
     - Add trace logging for CXL PCIe port RAS errors

 - Remove redundant gp_port init

 - Add validation of cxl device serial number

 - CXL ABI documentation updates/fixups

 - A series that uses guard() to clean up open coded mutex lockings and
   remove gotos for error handling.

 - Some followup patches to support dirty shutdown accounting:
     - Add helper to retrieve DVSEC offset for dirty shutdown registers
     - Rename cxl_get_dirty_shutdown() to cxl_arm_dirty_shutdown()
     - Add support for dirty shutdown count via sysfs
     - cxl_test support for dirty shutdown

 - A series to support CXL mailbox Features commands.

   Mostly in preparation for CXL EDAC code to utilize the Features
   commands. It's also in preparation for CXL fwctl support to utilize
   the CXL Features. The commands include "Get Supported Features", "Get
   Feature", and "Set Feature".

 - A series to support extended linear cache support described by the
   ACPI HMAT table.

   The addition helps enumerate the cache and also provides additional
   RAS reporting support for configuration with extended linear cache.
   (and related fixes for the series).

 - An update to cxl_test to support a 3-way capable CFMWS

 - A documentation fix to remove unused "mixed mode"

* tag 'cxl-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (39 commits)
  cxl/region: Fix the first aliased address miscalculation
  cxl/region: Quiet some dev_warn()s in extended linear cache setup
  cxl/Documentation: Remove 'mixed' from sysfs mode doc
  cxl: Fix warning from emitting resource_size_t as long long int on 32bit systems
  cxl/test: Define a CFMWS capable of a 3 way HB interleave
  cxl/mem: Do not return error if CONFIG_CXL_MCE unset
  tools/testing/cxl: Set Shutdown State support
  cxl/pmem: Export dirty shutdown count via sysfs
  cxl/pmem: Rename cxl_dirty_shutdown_state()
  cxl/pci: Introduce cxl_gpf_get_dvsec()
  cxl/pci: Support Global Persistent Flush (GPF)
  cxl: Document missing sysfs files
  cxl: Plug typos in ABI doc
  cxl/pmem: debug invalid serial number data
  cxl/cdat: Remove redundant gp_port initialization
  cxl/memdev: Remove unused partition values
  cxl/region: Drop goto pattern of construct_region()
  cxl/region: Drop goto pattern in cxl_dax_region_alloc()
  cxl/core: Use guard() to drop goto pattern of cxl_dpa_alloc()
  cxl/core: Use guard() to drop the goto pattern of cxl_dpa_free()
  ...
2025-04-02 20:04:43 -07:00
Li Ming
aae0594a70 cxl/region: Fix the first aliased address miscalculation
In extended linear cache(ELC) case, cxl_port_get_spa_cache_alias() helps
to get the aliased address of a SPA, it considers the first address in
CXL memory range is "region start + region cache size + 1", but it
should be "region start + region cache size".

So if a SPA is equal to "region start + region cache size", its aliased
address should be "SPA - region cache size".

Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20250317070124.815028-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-20 11:28:45 -07:00
Dave Jiang
eb5dfcb9e3 cxl: Add support to handle user feature commands for set feature
Add helper function to parse the user data from fwctl RPC ioctl and
send the parsed input parameters to cxl_set_feature() call.

Link: https://patch.msgid.link/r/20250307205648.1021626-6-dave.jiang@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-17 14:41:37 -03:00
Dave Jiang
5908f3ed6d cxl: Add support to handle user feature commands for get feature
Add helper function to parse the user data from fwctl RPC ioctl and
send the parsed input parameters to cxl_get_feature() call.

Link: https://patch.msgid.link/r/20250307205648.1021626-5-dave.jiang@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-17 14:41:37 -03:00
Dave Jiang
4d1c09cef2 cxl: Add support for fwctl RPC command to enable CXL feature commands
fwctl provides a fwctl_ops->fw_rpc() callback in order to issue ioctls
to a device. The cxl fwctl driver will start by supporting the CXL
Feature commands: Get Supported Features, Get Feature, and Set Feature.

The fw_rpc() callback provides 'enum fwctl_rpc_scope' parameter where
it indicates the security scope of the call. The Get Supported Features
and Get Feature calls can be executed with the scope of
FWCTL_RPC_CONFIGRATION. The Set Feature call is gated by the effects
of the Feature reported by Get Supported Features call for the specific
Feature.

Only "Get Supported Features" is supported in this patch. Additional
commands will be added in follow on patches. "Get Supported Features"
will filter the Features that are exclusive to the kernel. The flag
field of the Feature details will be cleared of the "Changeable"
field and the "set feat size" will be set to 0 to indicate that
the feature is not changeable.

Link: https://patch.msgid.link/r/20250307205648.1021626-4-dave.jiang@intel.com
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-17 14:41:37 -03:00
Dave Jiang
858ce2f56b cxl: Add FWCTL support to CXL
Add fwctl support code to allow sending of CXL feature commands from
userspace through as ioctls via FWCTL. Provide initial setup bits. The
CXL PCI probe function will call devm_cxl_setup_fwctl() after the
cxl_memdev has been enumerated in order to setup FWCTL char device under
the cxl_memdev like the existing memdev char device for issuing CXL raw
mailbox commands from userspace via ioctls.

Link: https://patch.msgid.link/r/20250307205648.1021626-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-17 14:41:36 -03:00
Dave Jiang
3b5d43245f Merge branch 'for-6.15/features' into cxl-for-next
Add CXL Features support. Setup code for enabling in kernel usage of CXL
Features. Expecting EDAC/RAS to utilize CXL Features in kernel for
things such as memory sparing. Also prepartion for enabling of CXL FWCTL
support to issue allowed Features from user space.
2025-03-17 09:22:59 -07:00
Alison Schofield
74d9c59658 cxl/region: Quiet some dev_warn()s in extended linear cache setup
Extended Linear Cache (ELC) setup code emits a dev_warn(), "Extended
linear cache calculation failed." for issues found while setting up
the ELC.

For platforms without CONFIG_ACPI_HMAT, every auto region setup will
emit the warning because the default !ACPI_HMAT return value is
EOPNOTSUPP. Suppress it by skipping the warn for EOPNOTSUPP. Change
the EOPNOTSUPP in the actual ELC failure path to ENXIO.

Remove the check and enusing dev_warn() when region resource size is
NULL. The endpoint decoders hpa_range used to create the resource is
checked in init_hdm_decoder(), so it cannot be NULL here.

For good measure, add the rc value to the dev_warn(). It will either
be the -ENOENT returned by HMAT if the mem target is not found, or
the -ENXIO from the region driver calculation.

Reviewed-by: Li Ming <ming.li@zohomail.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250306213700.2606304-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14 16:28:15 -07:00
Dave Jiang
3d3e3b9444 cxl: Fix warning from emitting resource_size_t as long long int on 32bit systems
Reported by kernel test bot from an ARM build:
drivers/cxl/core/region.c:3263:26: warning: format '%lld' expects argument of type 'long long int', but argument 3 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=]

On a 32bit system, resource_size_t is defined as 32bit long vs on a 64bit
system it is defined as 64bit long. Use %pa format to deal with
resource_size_t.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503010252.mIDhZ5kY-lkp@intel.com/
Fixes: 0ec9849b63 ("acpi/hmat / cxl: Add extended linear cache support for CXL")
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250228204739.3849309-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14 16:27:54 -07:00