PCIe r6.0, sec 5.9, requires a 10ms delay between programming a device to
change to or from D3hot and the time the device is next accessed (unless
Readiness Notifications are used).
The 10ms value (PCI_PM_D3HOT_WAIT) doesn't appear directly here because
some chipsets require 120ms for devices *below* them (pci_pm_d3hot_delay)
and some devices require more or less than 10ms (dev->d3hot_delay).
But msleep(10) typically waits about *20*ms, which is more than we need.
Switch to usleep_range() to improve the delay accuracy.
Based on a commit from Sajid in the Pixel 6 kernel tree [1]. On a Pixel 6,
the 10ms delay for the Exynos PCIe device delayed for an average of 19ms.
Switching to usleep_range() decreased the resume time by about 9ms.
[1] 18a8cad68d
[bhelgaas commit log, add timers-howto.rst link]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/timers/timers-howto.rst?id=v5.19#n73
Link: https://lore.kernel.org/r/20220921212735.2131588-1-willmcvicker@google.com
Signed-off-by: Sajid Dalvi <sdalvi@google.com>
Signed-off-by: Will McVicker <willmcvicker@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
We want to disable PTM on Root Ports because that allows some chips, e.g.,
Intel mobile chips since Coffee Lake, to enter a lower-power PM state.
That means we also have to disable PTM on downstream devices. PCIe r6.0,
sec 2.2.8, recommends that functions support generation of messages in
non-D0 states, so we have to assume Switch Upstream Ports or Endpoints may
send PTM Requests while in D1, D2, and D3hot. A PTM message received by a
Downstream Port (including a Root Port) with PTM disabled must be treated
as an Unsupported Request (sec 6.21.3).
PTM was previously disabled only for Root Ports, and it was disabled in
pci_prepare_to_sleep(), which is not called at all if a driver supports
legacy PM or does its own state saving.
Instead, disable PTM early in pci_pm_suspend() and pci_pm_runtime_suspend()
so we do it in all cases.
Previously PTM was disabled *after* saving device state, so the state
restore on resume automatically re-enabled it. Since we now disable PTM
*before* saving state, we must explicitly re-enable it in pci_pm_resume()
and pci_pm_runtime_resume().
Here's a sample of errors that occur when PTM is disabled only on the Root
Port. With this topology:
0000:00:1d.0 Root Port to [bus 08-71]
0000:08:00.0 Switch Upstream Port to [bus 09-71]
Kai-Heng reported errors like this:
pcieport 0000:00:1d.0: [20] UnsupReq (First)
pcieport 0000:00:1d.0: AER: TLP Header: 34000000 08000052 00000000 00000000
Decoding TLP header 0x34...... (0011 0100b) and 0x08000052:
Fmt 001b 4 DW header, no data
Type 1 0100b Msg (Local - Terminate at Receiver)
Requester ID 0x0800 Bus 08 Devfn 00.0
Message Code 0x52 0101 0010b PTM Request
The 00:1d.0 Root Port logged an Unsupported Request error when it received
a PTM Request with Requester ID 08:00.0.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215453
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216210
Fixes: a697f072f5 ("PCI: Disable PTM during suspend to save power")
Link: https://lore.kernel.org/r/20220909202505.314195-10-helgaas@kernel.org
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
We disable PTM during suspend because that allows some Root Ports to enter
lower-power PM states, which means we also need to disable PTM for all
downstream devices. Add pci_suspend_ptm() and pci_resume_ptm() for this
purpose.
pci_enable_ptm() and pci_disable_ptm() are for drivers to use to enable or
disable PTM. They use dev->ptm_enabled to keep track of whether PTM should
be enabled.
pci_suspend_ptm() and pci_resume_ptm() are PCI core-internal functions to
temporarily disable PTM during suspend and (depending on dev->ptm_enabled)
re-enable PTM during resume.
Enable/disable/suspend/resume all use internal __pci_enable_ptm() and
__pci_disable_ptm() functions that only update the PTM Control register.
Outline:
pci_enable_ptm(struct pci_dev *dev)
{
__pci_enable_ptm(dev);
dev->ptm_enabled = 1;
pci_ptm_info(dev);
}
pci_disable_ptm(struct pci_dev *dev)
{
if (dev->ptm_enabled) {
__pci_disable_ptm(dev);
dev->ptm_enabled = 0;
}
}
pci_suspend_ptm(struct pci_dev *dev)
{
if (dev->ptm_enabled)
__pci_disable_ptm(dev);
}
pci_resume_ptm(struct pci_dev *dev)
{
if (dev->ptm_enabled)
__pci_enable_ptm(dev);
}
Nothing currently calls pci_resume_ptm(); the suspend path saves the PTM
state before disabling PTM, so the PTM state restore in the resume path
implicitly re-enables it. A future change will use pci_resume_ptm() to fix
some problems with this approach.
Link: https://lore.kernel.org/r/20220909202505.314195-5-helgaas@kernel.org
Tested-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
- Remove pci_get_legacy_ide_irq(); use ATA_PRIMARY_IRQ() and
ATA_SECONDARY_IRQ() instead (Stafford Horne)
- Remove isa_dma_bridge_buggy, except for x86_32, the only place it's used
(Stafford Horne)
- Define ARCH_GENERIC_PCI_MMAP_RESOURCE for csky (Stafford Horne)
- Move common PCI definitions that arches sometimes override to
asm-generic/pci.h (Stafford Horne)
- Include <linux/isa-dma.h> for 'isa_dma_bridge_buggy' when needed
(bisection hole here) (Randy Dunlap)
* pci/header-cleanup-immutable:
PCI: Stub __pci_ioport_map() for arches that don't support it at all
x86/cyrix: include header linux/isa-dma.h
asm-generic: Add new pci.h and use it
csky: PCI: Define ARCH_GENERIC_PCI_MMAP_RESOURCE
PCI: Move isa_dma_bridge_buggy out of asm/dma.h
PCI: Remove pci_get_legacy_ide_irq() and asm-generic/pci.h
The isa_dma_bridge_buggy symbol is only used for x86_32, and only x86_32
platforms or quirks ever set it.
Add a new linux/isa-dma.h header that #defines isa_dma_bridge_buggy to 0
except on x86_32, where we keep it as a variable, and remove all the arch-
specific definitions.
[bhelgaas: commit log]
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/r/20220722214944.831438-3-shorne@gmail.com
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
pcie_aspm_pm_state_change() was introduced at the inception of PCIe ASPM
code, but it can cause some issues. For instance, when ASPM config is
changed via sysfs, those changes won't persist across power state change
because pcie_aspm_pm_state_change() overwrites them.
Also, if the driver restores L1SS [1] after system resume, the restored
state will also be overwritten by pcie_aspm_pm_state_change().
Remove pcie_aspm_pm_state_change(). If there's any hardware that really
needs it to function, a quirk can be used instead.
[1] https://lore.kernel.org/linux-pci/20220201123536.12962-1-vidyas@nvidia.com/
Link: https://lore.kernel.org/r/20220509073639.2048236-1-kai.heng.feng@canonical.com
[bhelgaas: remove additional pcie_aspm_pm_state_change() call in
pci_set_low_power_state(), added by
10aa5377fc ("PCI/PM: Split pci_raw_set_power_state()") and moved by
7957d20145 ("PCI/PM: Relocate pci_set_low_power_state()")]
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
92597f97a4 ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold") omitted
braces around the new Elo i2 entry, so it overwrote the existing Gigabyte
X299 entry. Add the appropriate braces.
Found by:
$ make W=1 drivers/pci/pci.o
CC drivers/pci/pci.o
drivers/pci/pci.c:2974:12: error: initialized field overwritten [-Werror=override-init]
2974 | .ident = "Elo i2",
| ^~~~~~~~
Link: https://lore.kernel.org/r/20220526221258.GA409855@bhelgaas
Fixes: 92597f97a4 ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v5.15+
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmKP/tQUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vzH0xAAojQowrSWzZ5FKTqI+L/L9ZXoAb+e
9IvQljKc9taJldmXp+EB9wkS/5B+VtQcC2qUQuWEQXUoECF8qHlcB4l+XQyd1tWO
O0vZxETH22xjLLrjG2F3l5rrfkJZAf2nEugwbDk97YEgiimeOiRcv3bx6AUCtj6I
rPJ13Fop3Jke7sQMcXYJe3gQLT1o1AKiQGghiCFNi/gzx2lXI6mmHBgLxFoiqcby
WpfXbvbJti95HRaahUR3HaDFfHj4HVkQNLlTtIykJ3Tl2/rOhWEJjI8JOIQpAA+M
WBrWw9rfgbScTiGV+dZ3h7hKiPnHKl9YETIX7L0oA2sj0jZcIs0d6mSBZx0kYuI9
eAlx+qSK9xpbQQr/fdYaUdF1q4QdtU0BYOvOWOzWsqYCECMRJ1PUHFSMbmR/+PNB
P5lHnAbggRSoxdAtwFYv1HTr+VpGH9S+5oxHCz3ohpMjYy6mkCZwHpZn3doaU3ci
KG6yIoVKftm3fZdtFvL03qHl/I8+X24ZhT/T/278PRGjkhSyr56hZo8hg0gqqTct
ngip8qNABmSbqpr73/W6Vl42zAbYtNk1BykYahbKupgW8FbT7hqaZTB05V87pVu+
Ko1aJM6VoOP9rMlKHI9ba8eYCzDrZbLZUFn7ljNPDpzutf0tAwtgwzvZBXN3za6+
Z9+D5dxmvrZEIbA=
=hEti
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
"Resource management:
- Restrict E820 clipping to PCI host bridge windows (Bjorn Helgaas)
- Log E820 clipping better (Bjorn Helgaas)
- Add kernel cmdline options to enable/disable E820 clipping (Hans de
Goede)
- Disable E820 reserved region clipping for IdeaPads, Yoga, Yoga
Slip, Acer Spin 5, Clevo Barebone systems where clipping leaves no
usable address space for touchpads, Thunderbolt devices, etc (Hans
de Goede)
- Disable E820 clipping by default starting in 2023 (Hans de Goede)
PCI device hotplug:
- Include files to remove implicit dependencies (Christophe Leroy)
- Only put Root Ports in D3 if they can signal and wake from D3 so
AMD Yellow Carp doesn't miss hotplug events (Mario Limonciello)
Power management:
- Define pci_restore_standard_config() only for CONFIG_PM_SLEEP since
it's unused otherwise (Krzysztof Kozlowski)
- Power up devices completely, including anything platform firmware
needs to do, during runtime resume (Rafael J. Wysocki)
- Move pci_resume_bus() to PM callbacks so we observe the required
bridge power-up delays (Rafael J. Wysocki)
- Drop unneeded runtime_d3cold device flag (Rafael J. Wysocki)
- Split pci_raw_set_power_state() between pci_power_up() and a new
pci_set_low_power_state() (Rafael J. Wysocki)
- Set current_state to D3cold if config read returns ~0, indicating
the device is not accessible (Rafael J. Wysocki)
- Do not call pci_update_current_state() from pci_power_up() so BARs
and ASPM config are restored correctly (Rafael J. Wysocki)
- Write 0 to PMCSR in pci_power_up() in all cases (Rafael J. Wysocki)
- Split pci_power_up() to pci_set_full_power_state() to avoid some
redundant operations (Rafael J. Wysocki)
- Skip restoring BARs if device is not in D0 (Rafael J. Wysocki)
- Rearrange and clarify pci_set_power_state() (Rafael J. Wysocki)
- Remove redundant BAR restores from pci_pm_thaw_noirq() (Rafael J.
Wysocki)
Virtualization:
- Acquire device lock before config space access lock to avoid AB/BA
deadlock with sriov_numvfs_store() (Yicong Yang)
Error handling:
- Clear MULTI_ERR_COR/UNCOR_RCV bits, which a race could previously
leave permanently set (Kuppuswamy Sathyanarayanan)
Peer-to-peer DMA:
- Whitelist Intel Skylake-E Root Ports regardless of which devfn they
are (Shlomo Pongratz)
ASPM:
- Override L1 acceptable latency advertised by Intel DG2 so ASPM L1
can be enabled (Mika Westerberg)
Cadence PCIe controller driver:
- Set up device-specific register to allow PTM Responder to be
enabled by the normal architected bit (Christian Gmeiner)
- Override advertised FLR support since the controller doesn't
implement FLR correctly (Parshuram Thombare)
Cadence PCIe endpoint driver:
- Correct bitmap size for the ob_region_map of outbound window usage
(Dan Carpenter)
Freescale i.MX6 PCIe controller driver:
- Fix PERST# assertion/deassertion so we observe the required delays
before accessing device (Francesco Dolcini)
Freescale Layerscape PCIe controller driver:
- Add "big-endian" DT property (Hou Zhiqiang)
- Update SCFG DT property (Hou Zhiqiang)
- Add "aer", "pme", "intr" DT properties (Li Yang)
- Add DT compatible strings for ls1028a (Xiaowei Bao)
Intel VMD host bridge driver:
- Assign VMD IRQ domain before enumeration to avoid IOMMU interrupt
remapping errors when MSI-X remapping is disabled (Nirmal Patel)
- Revert VMD workaround that kept MSI-X remapping enabled when IOMMU
remapping was enabled (Nirmal Patel)
Marvell MVEBU PCIe controller driver:
- Add of_pci_get_slot_power_limit() to parse the
'slot-power-limit-milliwatt' DT property (Pali Rohár)
- Add mvebu support for sending Set_Slot_Power_Limit message (Pali
Rohár)
MediaTek PCIe controller driver:
- Fix refcount leak in mtk_pcie_subsys_powerup() (Miaoqian Lin)
MediaTek PCIe Gen3 controller driver:
- Reset PHY and MAC at probe time (AngeloGioacchino Del Regno)
Microchip PolarFlare PCIe controller driver:
- Add chained_irq_enter()/chained_irq_exit() calls to mc_handle_msi()
and mc_handle_intx() to avoid lost interrupts (Conor Dooley)
- Fix interrupt handling race (Daire McNamara)
NVIDIA Tegra194 PCIe controller driver:
- Drop tegra194 MSI register save/restore, which is unnecessary since
the DWC core does it (Jisheng Zhang)
Qualcomm PCIe controller driver:
- Add SM8150 SoC DT binding and support (Bhupesh Sharma)
- Fix pipe clock imbalance (Johan Hovold)
- Fix runtime PM imbalance on probe errors (Johan Hovold)
- Fix PHY init imbalance on probe errors (Johan Hovold)
- Convert DT binding to YAML (Dmitry Baryshkov)
- Update DT binding to show that resets aren't required for
MSM8996/APQ8096 platforms (Dmitry Baryshkov)
- Add explicit register names per chipset in DT binding (Dmitry
Baryshkov)
- Add sc7280-specific clock and reset definitions to DT binding
(Dmitry Baryshkov)
Rockchip PCIe controller driver:
- Fix bitmap size when searching for free outbound region (Dan
Carpenter)
Rockchip DesignWare PCIe controller driver:
- Remove "snps,dw-pcie" from rockchip-dwc DT "compatible" property
because it's not fully compatible with rockchip (Peter Geis)
- Reset rockchip-dwc controller at probe (Peter Geis)
- Add rockchip-dwc INTx support (Peter Geis)
Synopsys DesignWare PCIe controller driver:
- Return error instead of success if DMA mapping of MSI area fails
(Jiantao Zhang)
Miscellaneous:
- Change pci_set_dma_mask() documentation references to
dma_set_mask() (Alex Williamson)"
* tag 'pci-v5.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (64 commits)
dt-bindings: PCI: qcom: Add schema for sc7280 chipset
dt-bindings: PCI: qcom: Specify reg-names explicitly
dt-bindings: PCI: qcom: Do not require resets on msm8996 platforms
dt-bindings: PCI: qcom: Convert to YAML
PCI: qcom: Fix unbalanced PHY init on probe errors
PCI: qcom: Fix runtime PM imbalance on probe errors
PCI: qcom: Fix pipe clock imbalance
PCI: qcom: Add SM8150 SoC support
dt-bindings: pci: qcom: Document PCIe bindings for SM8150 SoC
x86/PCI: Disable E820 reserved region clipping starting in 2023
x86/PCI: Disable E820 reserved region clipping via quirks
x86/PCI: Add kernel cmdline options to use/ignore E820 reserved regions
PCI: microchip: Fix potential race in interrupt handling
PCI/AER: Clear MULTI_ERR_COR/UNCOR_RCV bits
PCI: cadence: Clear FLR in device capabilities register
PCI: cadence: Allow PTM Responder to be enabled
PCI: vmd: Revert 2565e5b69c ("PCI: vmd: Do not disable MSI-X remapping if interrupt remapping is enabled by IOMMU.")
PCI: vmd: Assign VMD IRQ domain before enumeration
PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store()
PCI: rockchip-dwc: Add legacy interrupt support
...
The sysfs sriov_numvfs_store() path acquires the device lock before the
config space access lock:
sriov_numvfs_store
device_lock # A (1) acquire device lock
sriov_configure
vfio_pci_sriov_configure # (for example)
vfio_pci_core_sriov_configure
pci_disable_sriov
sriov_disable
pci_cfg_access_lock
pci_wait_cfg # B (4) wait for dev->block_cfg_access == 0
Previously, pci_dev_lock() acquired the config space access lock before the
device lock:
pci_dev_lock
pci_cfg_access_lock
dev->block_cfg_access = 1 # B (2) set dev->block_cfg_access = 1
device_lock # A (3) wait for device lock
Any path that uses pci_dev_lock(), e.g., pci_reset_function(), may
deadlock with sriov_numvfs_store() if the operations occur in the sequence
(1) (2) (3) (4).
Avoid the deadlock by reversing the order in pci_dev_lock() so it acquires
the device lock before the config space access lock, the same as the
sriov_numvfs_store() path.
[bhelgaas: combined and adapted commit log from Jay Zhou's independent
subsequent posting:
https://lore.kernel.org/r/20220404062539.1710-1-jianjay.zhou@huawei.com]
Link: https://lore.kernel.org/linux-pci/1583489997-17156-1-git-send-email-yangyicong@hisilicon.com/
Also-posted-by: Jay Zhou <jianjay.zhou@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The part of pci_set_power_state() related to transitions into
low-power states is unnecessary convoluted, so clearly divide it
into the D3cold special case and the general case covering all of
the other states.
Also fix a potential issue with calling pci_bus_set_current_state()
to set the current state of all devices on the subordinate bus to
D3cold without checking if the power state of the parent bridge has
really changed to D3cold.
Link: https://lore.kernel.org/r/2139440.Mh6RI2rZIc@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Make the following assorted non-essential changes in
pci_set_low_power_state():
1. Drop two redundant checks from it (the caller takes care of these
conditions).
2. Change the log level of a messages printed by it to "debug",
because it only indicates a programming mistake.
Link: https://lore.kernel.org/r/2539071.Lt9SDvczpP@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Do not attempt to restore the device's BARs in
pci_set_full_power_state() if the actual current
power state of the device is not D0.
Link: https://lore.kernel.org/r/1849718.CQOukoFCf9@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
One of the two callers of pci_power_up() invokes
pci_update_current_state() and pci_restore_state() right after calling
it, in which case running the part of it happening after the mandatory
transition delays is redundant, so move that part out of it into a new
function called pci_set_full_power_state() that will be invoked from
pci_set_power_state() which is the other caller of pci_power_up().
Link: https://lore.kernel.org/r/1942150.usQuhbGJ8B@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Make pci_power_up() write 0 to the device's PCI_PM_CTRL register in
order to put it into D0 regardless of the power state returned by
the previous read from that register which should not matter.
Link: https://lore.kernel.org/r/5748066.MhkbZ0Pkbq@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Notice that calling pci_update_current_state() from pci_power_up() is
redundant and may be harmful in some cases.
First, if the device is in a low-power state before pci_power_up()
gets called for it and platform_pci_set_power_state() successfully
changes its power state to D0, pci_update_current_state() will update
current_state to reflect that and pci_power_up() will return success
right away without restoring the device's BARs or reconfiguring ASPM
which may be necessary. This is arguably incorrect and definitely
inconsistent with the case when platform_pci_set_power_state() returns
an error (for example, because the device is not power-manageable by
the platform firmware).
Second, current_state should not be overwritten until the decision
whether or not to restore the device's BARs is made, because that
decision generally depends on its value. Again, calling
pci_update_current_state() in pci_power_up() is not consistent with
this observation.
Next, pci_power_up() attempts to read from the device's PCI_PM_CTRL
register regardless of the current_state value unless it is PCI_D0,
including the case when pci_update_current_state() sets current_state
to PCI_D3cold to indicate that the device is not accessible. If the
register read is not successful, current_state will be set to
PCI_D3cold anyway, so that pci_update_current_state() action is
redundant.
Further, if pci_update_current_state() reads the device's PCI_PM_CTRL
register, pci_power_up() will repeat that read going forward and
it is not necessary to update current_state in the meantime.
Finally, if pm_cap is not set (in which case the PCI_PM_CTRL register
is not present), the power state of the device should be determined
with the help of the platform firmware or set to D0 if that's not
possible and pci_update_current_state() does not do that.
Accordingly, rearrange pci_power_up() so as to address the above
shortcomings.
Link: https://lore.kernel.org/r/3695055.kQq0lBPeGt@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some actions carried out by pci_platform_power_transition(() in
pci_power_up() are redundant, but before dealing with them, make
pci_power_up() call the pci_platform_power_transition() code directly
(and avoid a redundant check when pm_cap is unset while at it).
Link: https://lore.kernel.org/r/1922486.PYKUYFuaPT@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Make pci_power_up() and pci_set_low_power_state() change current_state
to PCI_D3cold when the device is not accessible along the lines of
pci_update_current_state().
Link: https://lore.kernel.org/r/10104376.nUPlyArG6x@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Because pci_set_power_state() is the only caller of
pci_set_low_power_state(), put the latter next to the former.
No functional impact.
Link: https://lore.kernel.org/r/3202976.44csPzL39Z@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The transitions from low-power states to D0 and the other way around
are unnecessarily tangled in pci_raw_set_power_state() which makes it
rather hard to follow.
Moreover, the only caller of pci_raw_set_power_state() passing PCI_D0
as its state argument is pci_power_up(), so the code carrying out
transitions into D0 can be put directly into that function.
Accordingly, move the code handling transitions from low-power states
into D0 directly into pci_power_up() and rename the remaining part
of pci_raw_set_power_state() to pci_set_low_power_state(), because
it only handles transitions into low-power state now.
While at it, fix up some white space, update some comments and modify
messages printed by pci_power_up() and pci_set_low_power_state() to
be less confusing (which is the only expected functional impact of
this change).
Link: https://lore.kernel.org/r/13038676.uLZWGnKmhe@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Save one config space access in pci_update_current_state() by testing the
retrieved PCI_PM_CTRL register value against PCI_POSSIBLE_ERROR() instead
of invoking pci_device_is_present() separately.
While at it, drop a pair of unnecessary parens.
No expected functional impact.
Link: https://lore.kernel.org/r/1917095.PYKUYFuaPT@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The runtime_d3cold flag is not needed any more, so drop it.
Link: https://lore.kernel.org/r/8077784.T7Z3S40VBb@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Calling pci_resume_bus() on the secondary bus from pci_power_up() as it is
done now is questionable, because it depends on the mandatory bridge
power-up delays that are only covered by the PCI bus type PM callbacks.
For this reason, move the subordinate bus resume to those callbacks too and
use the observation that if a bridge is being powered-up during resume from
system-wide suspend, it may be still desirable to runtime-resume its
subordinate bus after completing the system-wide transition (in case the
resume of the devices on that bus is skipped during it).
Link: https://lore.kernel.org/r/3190097.aeNJFYEL58@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
If a Root Port on Elo i2 is put into D3cold and then back into D0, the
downstream device becomes permanently inaccessible, so add a bridge D3 DMI
quirk for that system.
This was exposed by 14858dcc3b ("PCI: Use pci_update_current_state() in
pci_enable_device_flags()"), but before that commit the Root Port in
question had never been put into D3cold for real due to a mismatch between
its power state retrieved from the PCI_PM_CTRL register (which was
accessible even though the platform firmware indicated that the port was in
D3cold) and the state of an ACPI power resource involved in its power
management.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215715
Link: https://lore.kernel.org/r/11980172.O9o76ZdvQC@kreacher
Reported-by: Stefan Gottwald <gottwald@igel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v5.15+
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmHgpugUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vz59g//eWRLb0j2Vgv84ZH4x1iv6MaBboQr
2wScnfoN+MIoh+tuM4kRak15X4nB8rJhNZZCzesMUN6PeZvrkoPo4sz/xdzIrA/N
qY3h8NZ3nC4yCvs/tGem0zZUcSCJsxUAD0eegyMSa142xGIOQTHBSJRflR9osKSo
bnQlKTkugV8t4kD7NlQ5M3HzN3R+mjsII5JNzCqv2XlzAZG3D8DhPyIpZnRNAOmW
KiHOVXvQOocfUlvSs5kBlhgR1HgJkGnruCrJ1iDCWQH1Zk0iuVgoZWgVda6Cs3Xv
gcTJLB7VoSdNZKnct9aMNYPKziHkYc7clilPeDsJs5TlSv3kKERzLj6c/5ZAxFWN
+RsH+zYHDXJSsL/w0twPnaF5WCuVYUyrs3UiSjUvShKl1T9k9J+Jo8zwUUZx8Xb0
qXX8jRGMHolBGwPXm2fHEb4bwTUI8emPj29qK4L96KsQ3zKXWB8eGSosxUP52Tti
RR2WZjkvwlREZCJp6jSEJYkhzoEaVAm8CjKpKUuneX9WcUOsMBSs9k7EXbUy7JeM
hq5Keuqa8PZo/IK2DYYAchNnBJUDMsWJeduBW12qSmx3J+9victP2qOFu+9skP0a
85xlO6Cx8beiQh+XnY7jyROvIFuxTnGKHgkq/89Ham/whEzdJ+GRIiYB218kLLCW
ILdas3C2iiGz99I=
=Vgg4
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Use pci_find_vsec_capability() instead of open-coding it (Andy
Shevchenko)
- Convert pci_dev_present() stub from macro to static inline to avoid
'unused variable' errors (Hans de Goede)
- Convert sysfs slot attributes from default_attrs to default_groups
(Greg Kroah-Hartman)
- Use DWORD accesses for LTR, L1 SS to avoid BayHub OZ711LV2 erratum
(Rajat Jain)
- Remove unnecessary initialization of static variables (Longji Guo)
Resource management:
- Always write Intel I210 ROM BAR on update to work around device
defect (Bjorn Helgaas)
PCIe native device hotplug:
- Fix pciehp lockdep errors on Thunderbolt undock (Hans de Goede)
- Fix infinite loop in pciehp IRQ handler on power fault (Lukas
Wunner)
Power management:
- Convert amd64-agp, sis-agp, via-agp from legacy PCI power
management to generic power management (Vaibhav Gupta)
IOMMU:
- Add function 1 DMA alias quirk for Marvell 88SE9125 SATA controller
so it can work with an IOMMU (Yifeng Li)
Error handling:
- Add PCI_ERROR_RESPONSE and related definitions for signaling and
checking for transaction errors on PCI (Naveen Naidu)
- Fabricate PCI_ERROR_RESPONSE data (~0) in config read wrappers,
instead of in host controller drivers, when transactions fail on
PCI (Naveen Naidu)
- Use PCI_POSSIBLE_ERROR() to check for possible failure of config
reads (Naveen Naidu)
Peer-to-peer DMA:
- Add Logan Gunthorpe as P2PDMA maintainer (Bjorn Helgaas)
ASPM:
- Calculate link L0s and L1 exit latencies when needed instead of
caching them (Saheed O. Bolarinwa)
- Calculate device L0s and L1 acceptable exit latencies when needed
instead of caching them (Saheed O. Bolarinwa)
- Remove struct aspm_latency since it's no longer needed (Saheed O.
Bolarinwa)
APM X-Gene PCIe controller driver:
- Fix IB window setup, which was broken by the fact that IB resources
are now sorted in address order instead of DT dma-ranges order (Rob
Herring)
Apple PCIe controller driver:
- Enable clock gating to save power (Hector Martin)
- Fix REFCLK1 enable/poll logic (Hector Martin)
Broadcom STB PCIe controller driver:
- Declare bitmap correctly for use by bitmap interfaces (Christophe
JAILLET)
- Clean up computation of legacy and non-legacy MSI bitmasks (Florian
Fainelli)
- Update suspend/resume/remove error handling to warn about errors
and not fail the operation (Jim Quinlan)
- Correct the "pcie" and "msi" interrupt descriptions in DT binding
(Jim Quinlan)
- Add DT bindings for endpoint voltage regulators (Jim Quinlan)
- Split brcm_pcie_setup() into two functions (Jim Quinlan)
- Add mechanism for turning on voltage regulators for connected
devices (Jim Quinlan)
- Turn voltage regulators for connected devices on/off when bus is
added or removed (Jim Quinlan)
- When suspending, don't turn off voltage regulators for wakeup
devices (Jim Quinlan)
Freescale i.MX6 PCIe controller driver:
- Add i.MX8MM support (Richard Zhu)
Freescale Layerscape PCIe controller driver:
- Use DWC common ops instead of layerscape-specific link-up functions
(Hou Zhiqiang)
Intel VMD host bridge driver:
- Honor platform ACPI _OSC feature negotiation for Root Ports below
VMD (Kai-Heng Feng)
- Add support for Raptor Lake SKUs (Karthik L Gopalakrishnan)
- Reset everything below VMD before enumerating to work around
failure to enumerate NVMe devices when guest OS reboots (Nirmal
Patel)
Bridge emulation (used by Marvell Aardvark and MVEBU):
- Make emulated ROM BAR read-only by default (Pali Rohár)
- Make some emulated legacy PCI bits read-only for PCIe devices (Pali
Rohár)
- Update reserved bits in emulated PCIe Capability (Pali Rohár)
- Allow drivers to emulate different PCIe Capability versions (Pali
Rohár)
- Set emulated Capabilities List bit for all PCIe devices, since they
must have at least a PCIe Capability (Pali Rohár)
Marvell Aardvark PCIe controller driver:
- Add bridge emulation definitions for PCIe DEVCAP2, DEVCTL2,
DEVSTA2, LNKCAP2, LNKCTL2, LNKSTA2, SLTCAP2, SLTCTL2, SLTSTA2 (Pali
Rohár)
- Add aardvark support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2
registers (Pali Rohár)
- Clear all MSIs at setup to avoid spurious interrupts (Pali Rohár)
- Disable bus mastering when unbinding host controller driver (Pali
Rohár)
- Mask all interrupts when unbinding host controller driver (Pali
Rohár)
- Fix memory leak in host controller unbind (Pali Rohár)
- Assert PERST# when unbinding host controller driver (Pali Rohár)
- Disable link training when unbinding host controller driver (Pali
Rohár)
- Disable common PHY when unbinding host controller driver (Pali
Rohár)
- Fix resource type checking to check only IORESOURCE_MEM, not
IORESOURCE_MEM_64, which is a flavor of IORESOURCE_MEM (Pali Rohár)
Marvell MVEBU PCIe controller driver:
- Implement pci_remap_iospace() for ARM so mvebu can use
devm_pci_remap_iospace() instead of the previous ARM-specific
pci_ioremap_io() interface (Pali Rohár)
- Use the standard pci_host_probe() instead of the device-specific
mvebu_pci_host_probe() (Pali Rohár)
- Replace all uses of ARM-specific pci_ioremap_io() with the ARM
implementation of the standard pci_remap_iospace() interface and
remove pci_ioremap_io() (Pali Rohár)
- Skip initializing invalid Root Ports (Pali Rohár)
- Check for errors from pci_bridge_emul_init() (Pali Rohár)
- Ignore any bridges at non-zero function numbers (Pali Rohár)
- Return ~0 data for invalid config read size (Pali Rohár)
- Disallow mapping interrupts on emulated bridges (Pali Rohár)
- Clear Root Port Memory & I/O Space Enable and Bus Master Enable at
initialization (Pali Rohár)
- Make type bits in Root Port I/O Base register read-only (Pali
Rohár)
- Disable Root Port windows when base/limit set to invalid values
(Pali Rohár)
- Set controller to Root Complex mode (Pali Rohár)
- Set Root Port Class Code to PCI Bridge (Pali Rohár)
- Update emulated Root Port secondary bus numbers to better reflect
the actual topology (Pali Rohár)
- Add PCI_BRIDGE_CTL_BUS_RESET support to emulated Root Ports so
pci_reset_secondary_bus() can reset connected devices (Pali Rohár)
- Add PCI_EXP_DEVCTL Error Reporting Enable support to emulated Root
Ports (Pali Rohár)
- Add PCI_EXP_RTSTA PME Status bit support to emulated Root Ports
(Pali Rohár)
- Add DEVCAP2, DEVCTL2 and LNKCTL2 support to emulated Root Ports on
Armada XP and newer devices (Pali Rohár)
- Export mvebu-mbus.c symbols to allow pci-mvebu.c to be a module
(Pali Rohár)
- Add support for compiling as a module (Pali Rohár)
MediaTek PCIe controller driver:
- Assert PERST# for 100ms to allow power and clock to stabilize
(qizhong cheng)
MediaTek PCIe Gen3 controller driver:
- Disable Mediatek DVFSRC voltage request since lack of DVFSRC to
respond to the request causes failure to exit L1 PM Substate
(Jianjun Wang)
MediaTek MT7621 PCIe controller driver:
- Declare mt7621_pci_ops static (Sergio Paracuellos)
- Give pcibios_root_bridge_prepare() access to host bridge windows
(Sergio Paracuellos)
- Move MIPS I/O coherency unit setup from driver to
pcibios_root_bridge_prepare() (Sergio Paracuellos)
- Add missing MODULE_LICENSE() (Sergio Paracuellos)
- Allow COMPILE_TEST for all arches (Sergio Paracuellos)
Microsoft Hyper-V host bridge driver:
- Add hv-internal interfaces to encapsulate arch IRQ dependencies
(Sunil Muthuswamy)
- Add arm64 Hyper-V vPCI support (Sunil Muthuswamy)
Qualcomm PCIe controller driver:
- Undo PM setup in qcom_pcie_probe() error handling path (Christophe
JAILLET)
- Use __be16 type to store return value from cpu_to_be16()
(Manivannan Sadhasivam)
- Constify static dw_pcie_ep_ops (Rikard Falkeborn)
Renesas R-Car PCIe controller driver:
- Fix aarch32 abort handler so it doesn't check the wrong bus clock
before accessing the host controller (Marek Vasut)
TI Keystone PCIe controller driver:
- Add register offset for ti,syscon-pcie-id and ti,syscon-pcie-mode
DT properties (Kishon Vijay Abraham I)
MicroSemi Switchtec management driver:
- Add Gen4 automotive device IDs (Kelvin Cao)
- Declare state_names[] as static so it's not allocated and
initialized for every call (Kelvin Cao)
Host controller driver cleanups:
- Use of_device_get_match_data(), not of_match_device(), when we only
need the device data in altera, artpec6, cadence, designware-plat,
dra7xx, keystone, kirin (Fan Fei)
- Drop pointless of_device_get_match_data() cast in j721e (Bjorn
Helgaas)
- Drop redundant struct device * from j721e since struct cdns_pcie
already has one (Bjorn Helgaas)
- Rename driver structs to *_pcie in intel-gw, iproc, ls-gen4,
mediatek-gen3, microchip, mt7621, rcar-gen2, tegra194, uniphier,
xgene, xilinx, xilinx-cpm for consistency across drivers (Fan Fei)
- Fix invalid address space conversions in hisi, spear13xx (Bjorn
Helgaas)
Miscellaneous:
- Sort Intel Device IDs by value (Andy Shevchenko)
- Change Capability offsets to hex to match spec (Baruch Siach)
- Correct misspellings (Krzysztof Wilczyński)
- Terminate statement with semicolon in pci_endpoint_test.c (Ming
Wang)"
* tag 'pci-v5.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (151 commits)
PCI: mt7621: Allow COMPILE_TEST for all arches
PCI: mt7621: Add missing MODULE_LICENSE()
PCI: mt7621: Move MIPS setup to pcibios_root_bridge_prepare()
PCI: Let pcibios_root_bridge_prepare() access bridge->windows
PCI: mt7621: Declare mt7621_pci_ops static
PCI: brcmstb: Do not turn off WOL regulators on suspend
PCI: brcmstb: Add control of subdevice voltage regulators
PCI: brcmstb: Add mechanism to turn on subdev regulators
PCI: brcmstb: Split brcm_pcie_setup() into two funcs
dt-bindings: PCI: Add bindings for Brcmstb EP voltage regulators
dt-bindings: PCI: Correct brcmstb interrupts, interrupt-map.
PCI: brcmstb: Fix function return value handling
PCI: brcmstb: Do not use __GENMASK
PCI: brcmstb: Declare 'used' as bitmap, not unsigned long
PCI: hv: Add arm64 Hyper-V vPCI support
PCI: hv: Make the code arch neutral by adding arch specific interfaces
PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
x86/PCI: Remove initialization of static variables to false
PCI: Use DWORD accesses for LTR, L1 SS to avoid erratum
misc: pci_endpoint_test: Terminate statement with semicolon
...
- Add PCI_ERROR_RESPONSE and related definitions for signaling and checking
for transaction errors on PCI (Naveen Naidu)
- Fabricate PCI_ERROR_RESPONSE data (~0) in config read wrappers, instead
of in host controller drivers, when transactions fail on PCI (Naveen
Naidu)
- Use PCI_POSSIBLE_ERROR() to check for possible failure of config reads
(Naveen Naidu)
* pci/errors:
PCI: xgene: Use PCI_ERROR_RESPONSE to identify config read errors
PCI: hv: Use PCI_ERROR_RESPONSE to identify config read errors
PCI: keystone: Use PCI_ERROR_RESPONSE to identify config read errors
PCI: Use PCI_ERROR_RESPONSE to identify config read errors
PCI: cpqphp: Use PCI_POSSIBLE_ERROR() to check config reads
PCI/PME: Use PCI_POSSIBLE_ERROR() to check config reads
PCI/DPC: Use PCI_POSSIBLE_ERROR() to check config reads
PCI: pciehp: Use PCI_POSSIBLE_ERROR() to check config reads
PCI: vmd: Use PCI_POSSIBLE_ERROR() to check config reads
PCI/ERR: Use PCI_POSSIBLE_ERROR() to check config reads
PCI: rockchip-host: Drop error data fabrication when config read fails
PCI: rcar-host: Drop error data fabrication when config read fails
PCI: altera: Drop error data fabrication when config read fails
PCI: mvebu: Drop error data fabrication when config read fails
PCI: aardvark: Drop error data fabrication when config read fails
PCI: kirin: Drop error data fabrication when config read fails
PCI: histb: Drop error data fabrication when config read fails
PCI: exynos: Drop error data fabrication when config read fails
PCI: mediatek: Drop error data fabrication when config read fails
PCI: iproc: Drop error data fabrication when config read fails
PCI: thunder: Drop error data fabrication when config read fails
PCI: Drop error data fabrication when config read fails
PCI: Use PCI_SET_ERROR_RESPONSE() for disconnected devices
PCI: Set error response data when config read fails
PCI: Add PCI_ERROR_RESPONSE and related definitions
Some devices have an erratum such that they only support DWORD accesses to
some registers. E.g., this Bayhub O2 device ([VID:DID] = [0x1217:0x8621])
only supports DWORD accesses to LTR latency registers and L1 PM substates
control registers:
https://github.com/rajatxjain/public_shared/blob/main/OZ711LV2_appnote.pdf
The L1 PM substate control registers are DWORD sized, and hence their
access in the kernel is already DWORD sized, so we don't need to do
anything for them.
However, the LTR registers being WORD sized, are in need of a solution.
Convert the WORD sized accesses to these registers into DWORD sized
accesses while saving and restoring them.
Link: https://lore.kernel.org/r/20211222012105.3438916-1-rajatja@google.com
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The MSI core will introduce runtime allocation of MSI related data. This
data will be devres managed and has to be set up before enabling
PCI/MSI[-X]. This would introduce an ordering issue vs. pcim_release().
The setup order is:
pcim_enable_device()
devres_alloc(pcim_release...);
...
pci_irq_alloc()
msi_setup_device_data()
devres_alloc(msi_device_data_release, ...)
and once the device is released these release functions are invoked in the
opposite order:
msi_device_data_release()
...
pcim_release()
pci_disable_msi[x]()
which is obviously wrong, because pci_disable_msi[x]() requires the MSI
data to be available to tear down the MSI[-X] interrupts.
Remove the MSI[-X] teardown from pcim_release() and add an explicit action
to be installed on the attempt of enabling PCI/MSI[-X].
This allows the MSI core data allocation to be ordered correctly in a
subsequent step.
Reported-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/87tuf9rdoj.ffs@tglx
When config pci_ops.read() can detect failed PCI transactions, the data
returned to the CPU is PCI_ERROR_RESPONSE (~0 or 0xffffffff).
Obviously a successful PCI config read may *also* return that data if a
config register happens to contain ~0, so it doesn't definitively indicate
an error unless we know the register cannot contain ~0.
Use PCI_POSSIBLE_ERROR() to check the response we get when we read data
from hardware. This unifies PCI error response checking and makes error
checks consistent and easier to find.
Link: https://lore.kernel.org/r/f4d18d470cb90f9cb52ea155b01528ba2e76e8d6.1637243717.git.naveennaidu479@gmail.com
Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
- Add PCI automatic error recovery.
- Fix tape driver timer initialization broken during timers api cleanup.
- Fix bogus CPU measurement counters values on CPUs offlining.
- Check the validity of subchanel before reading other fields in
the schib in cio code.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmGPwDMACgkQjYWKoQLX
FBgCoQf/VBel0vDex9NNVo59OmGTNh9NPPT2cUU8vYEmwHfaBeInZVEx5WOxXijl
8MIbEgi6Wt3EwnIghjouC50nk8jCiNOJ8Z/wG+01zZpVpLk2GvKGjxoYxKg+5E6T
sOSr7TMeKOtOp23xKAXGVIzzkrDTSyr3qruTKg/m6TFhQ0XSm/ld2k6tR5AARbuB
UtsxBOtPWyHm1xPyuhjr+c6riK2vGQwJwYya4vGtIW8ix9uZoPabdqqzWsD3meBc
B6fe96YQGxA8Tt80FtyJ6pHEhNDr8CE656aJZNJCnd7q1RmLWC1R/aUme+9wqQtO
i9YmMvc+uzgQonpo+YgWqu9fIQqcXA==
=OjW1
-----END PGP SIGNATURE-----
Merge tag 's390-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Vasily Gorbik:
- Add PCI automatic error recovery.
- Fix tape driver timer initialization broken during timers api
cleanup.
- Fix bogus CPU measurement counters values on CPUs offlining.
- Check the validity of subchanel before reading other fields in the
schib in cio code.
* tag 's390-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cio: check the subchannel validity for dev_busid
s390/cpumf: cpum_cf PMU displays invalid value after hotplug remove
s390/tape: fix timer initialization in tape_std_assign()
s390/pci: implement minimal PCI error recovery
PCI: Export pci_dev_lock()
s390/pci: implement reset_slot for hotplug slot
s390/pci: refresh function handle in iomap
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmGNcPYUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vxTvQ//Wl8O7D+ckdw/4xMVi2N2uc6vCbB6
or5wmW1jfE6H7YvXbUrSMEBFoRfaz9nhuT3w9i60hzYmc9uqbsMCONBHLFAaILCP
By/qkNgMFuskZGHTziqsf2Qfza04KQDv4quUl8SvlUfQeOtzUXe0vtXCrFmMTDGs
QQ/bbVbI6WazTwzC11NALU3Kc8AZQXQMkd5Nt4fBfuZwFhBfnEckJgF3PmSZ7PpO
N/29wJDsdkHSBNqTO1eeqdVIN8GQ6aycUYEQp3gvDCvwQ3CXtPP6SScTnqibbzjO
3wuBk+FFH8Zdq8e7CEYoj0mL2RMKhLrY1MLV8IGI1YiPy8pH1EujOiDG0uNk7YSg
Zqo3NMq9L2IR28odTMc8W8oknz6CTO+rhoW22CbyImtrPT5RnWsyj1xH+k2lMxt/
3viTV5lDzKr6V6c5xdhCP26eNzt06mlrduvQ5EvsfGu6whckyrgd6XOppExdWA/S
24N9GCJuZRCp5wgETq+Vxl48ZWLzkJ3IwoVrIhx4UTChvXPIgYrviGnEFp64ZCuc
FbrINeeddxWdXkkuJ44FhmSkv7Gcl87w0cofy9tnArxZza5v3wGATFp7G1z8WO+0
HWoBhdeGGQ3+ClJUo9ihO2r9hABUbveJqxgjQbCs8P/7bR0acCoeGJCoZVj7KUUe
a6GDw16Lfu87QuQ=
=0aFC
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.16-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Revert conversion to struct device.driver instead of struct
pci_dev.driver.
The device.driver is set earlier, and using it caused the PCI core to
call driver PM entry points before .probe() and after .remove(), when
the driver isn't prepared.
This caused NULL pointer dereferences in i2c_designware_pci and
probably other driver issues"
* tag 'pci-v5.16-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "PCI: Use to_pci_driver() instead of pci_dev->driver"
Revert "PCI: Remove struct pci_dev->driver"
This reverts commit 2a4d9408c9.
Robert reported a NULL pointer dereference caused by the PCI core
(local_pci_probe()) calling the i2c_designware_pci driver's
.runtime_resume() method before the .probe() method. i2c_dw_pci_resume()
depends on initialization done by i2c_dw_pci_probe().
Prior to 2a4d9408c9 ("PCI: Use to_pci_driver() instead of
pci_dev->driver"), pci_pm_runtime_resume() avoided calling the
.runtime_resume() method because pci_dev->driver had not been set yet.
2a4d9408c9 and b5f9c644eb ("PCI: Remove struct pci_dev->driver"),
removed pci_dev->driver, replacing it by device->driver, which *has* been
set by this time, so pci_pm_runtime_resume() called the .runtime_resume()
method when it previously had not.
Fixes: 2a4d9408c9 ("PCI: Use to_pci_driver() instead of pci_dev->driver")
Link: https://lore.kernel.org/linux-i2c/CAP145pgdrdiMAT7=-iB1DMgA7t_bMqTcJL4N0=6u8kNY3EU0dw@mail.gmail.com/
Reported-by: Robert Święcki <robert@swiecki.net>
Tested-by: Robert Święcki <robert@swiecki.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
- Fix support for platforms that do not enumerate every ACPI0016 (CXL
Host Bridge) in the CHBS (ACPI Host Bridge Structure).
- Introduce a common pci_find_dvsec_capability() helper, clean up open
coded implementations in various drivers.
- Add 'cxl_test' for regression testing CXL subsystem ABIs. 'cxl_test'
is a module built from tools/testing/cxl/ that mocks up a CXL topology
to augment the nascent support for emulation of CXL devices in QEMU.
- Convert libnvdimm to use the uuid API.
- Complete the definition of CXL namespace labels in libnvdimm.
- Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.
- Continue to sort and refactor functionality into distinct driver and
core-infrastructure buckets. For example, mailbox handling is now a
generic core capability consumed by the PCI and cxl_test drivers.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYYRtyQAKCRDfioYZHlFs
Z7UsAP9DzUN6IWWnYk1R95YXYVxFriRtRsBjujAqTg49EMghawEAoHaA9lxO3Hho
l25TLYUOmB/zFTlUbe6YQptMJZ5YLwY=
=im9j
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
"More preparation and plumbing work in the CXL subsystem.
From an end user perspective the highlight here is lighting up the CXL
Persistent Memory related commands (label read / write) with the
generic ioctl() front-end in LIBNVDIMM.
Otherwise, the ability to instantiate new persistent and volatile
memory regions is still on track for v5.17.
Summary:
- Fix support for platforms that do not enumerate every ACPI0016 (CXL
Host Bridge) in the CHBS (ACPI Host Bridge Structure).
- Introduce a common pci_find_dvsec_capability() helper, clean up
open coded implementations in various drivers.
- Add 'cxl_test' for regression testing CXL subsystem ABIs.
'cxl_test' is a module built from tools/testing/cxl/ that mocks up
a CXL topology to augment the nascent support for emulation of CXL
devices in QEMU.
- Convert libnvdimm to use the uuid API.
- Complete the definition of CXL namespace labels in libnvdimm.
- Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.
- Continue to sort and refactor functionality into distinct driver
and core-infrastructure buckets. For example, mailbox handling is
now a generic core capability consumed by the PCI and cxl_test
drivers"
* tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (34 commits)
ocxl: Use pci core's DVSEC functionality
cxl/pci: Use pci core's DVSEC functionality
PCI: Add pci_find_dvsec_capability to find designated VSEC
cxl/pci: Split cxl_pci_setup_regs()
cxl/pci: Add @base to cxl_register_map
cxl/pci: Make more use of cxl_register_map
cxl/pci: Remove pci request/release regions
cxl/pci: Fix NULL vs ERR_PTR confusion
cxl/pci: Remove dev_dbg for unknown register blocks
cxl/pci: Convert register block identifiers to an enum
cxl/acpi: Do not fail cxl_acpi_probe() based on a missing CHBS
cxl/pci: Disambiguate cxl_pci further from cxl_mem
Documentation/cxl: Add bus internal docs
cxl/core: Split decoder setup into alloc + add
tools/testing/cxl: Introduce a mock memory device + driver
cxl/mbox: Move command definitions to common location
cxl/bus: Populate the target list at decoder create
tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
cxl/pmem: Add support for multiple nvdimm-bridge objects
cxl/pmem: Translate NVDIMM label commands to CXL label commands
...
Commit e3a9b1212b ("PCI: Export pci_dev_trylock() and pci_dev_unlock()")
already exported pci_dev_trylock()/pci_dev_unlock() however in some
circumstances such as during error recovery it makes sense to block
waiting to get full access to the device so also export pci_dev_lock().
Link: https://lore.kernel.org/all/20210928181014.GA713179@bhelgaas/
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmGFXBkUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vx6Tg/7BsGWm8f+uw/mr9lLm47q2mc4XyoO
7bR9KDp5NM84W/8ZOU7dqqqsnY0ddrSOLBRyhJJYMW3SwJd1y1ajTBsL1Ujqv+eN
z+JUFmhq4Laqm4k6Spc9CEJE+Ol5P6gGUtxLYo6PM2R0VxnSs/rDxctT5i7YOpCi
COJ+NVT/mc/by2loz1kLTSR9GgtBBgd+Y8UA33GFbHKssROw02L0OI3wffp81Oba
EhMGPoD+0FndAniDw+vaOSoO+YaBuTfbM92T/O00mND69Fj1PWgmNWZz7gAVgsXb
3RrNENUFxgw6CDt7LZWB8OyT04iXe0R2kJs+PA9gigFCGbypwbd/Nbz5M7e9HUTR
ray+1EpZib6+nIksQBL2mX8nmtyHMcLiM57TOEhq0+ECDO640MiRm8t0FIG/1E8v
3ZYd9w20o/NxlFNXHxxpZ3D/osGH5ocyF5c5m1rfB4RGRwztZGL172LWCB0Ezz9r
eHB8sWxylxuhrH+hp2BzQjyddg7rbF+RA4AVfcQSxUpyV01hoRocKqknoDATVeLH
664nJIINFxKJFwfuL3E6OhrInNe1LnAhCZsHHqbS+NNQFgvPRznbixBeLkI9dMf5
Yf6vpsWO7ur8lHHbRndZubVu8nxklXTU7B/w+C11sq6k9LLRJSHzanr3Fn9WA80x
sznCxwUvbTCu1r0=
=nsMh
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Conserve IRQs by setting up portdrv IRQs only when there are users
(Jan Kiszka)
- Rework and simplify _OSC negotiation for control of PCIe features
(Joerg Roedel)
- Remove struct pci_dev.driver pointer since it's redundant with the
struct device.driver pointer (Uwe Kleine-König)
Resource management:
- Coalesce contiguous host bridge apertures from _CRS to accommodate
BARs that cover more than one aperture (Kai-Heng Feng)
Sysfs:
- Check CAP_SYS_ADMIN before parsing user input (Krzysztof
Wilczyński)
- Return -EINVAL consistently from "store" functions (Krzysztof
Wilczyński)
- Use sysfs_emit() in endpoint "show" functions to avoid buffer
overruns (Kunihiko Hayashi)
PCIe native device hotplug:
- Ignore Link Down/Up caused by resets during error recovery so
endpoint drivers can remain bound to the device (Lukas Wunner)
Virtualization:
- Avoid bus resets on Atheros QCA6174, where they hang the device
(Ingmar Klein)
- Work around Pericom PI7C9X2G switch packet drop erratum by using
store and forward mode instead of cut-through (Nathan Rossi)
- Avoid trying to enable AtomicOps on VFs; the PF setting applies to
all VFs (Selvin Xavier)
MSI:
- Document that /sys/bus/pci/devices/.../irq contains the legacy INTx
interrupt or the IRQ of the first MSI (not MSI-X) vector (Barry
Song)
VPD:
- Add pci_read_vpd_any() and pci_write_vpd_any() to access anywhere
in the possible VPD space; use these to simplify the cxgb3 driver
(Heiner Kallweit)
Peer-to-peer DMA:
- Add (not subtract) the bus offset when calculating DMA address
(Wang Lu)
ASPM:
- Re-enable LTR at Downstream Ports so they don't report Unsupported
Requests when reset or hot-added devices send LTR messages
(Mingchuang Qiao)
Apple PCIe controller driver:
- Add driver for Apple M1 PCIe controller (Alyssa Rosenzweig, Marc
Zyngier)
Cadence PCIe controller driver:
- Return success when probe succeeds instead of falling into error
path (Li Chen)
HiSilicon Kirin PCIe controller driver:
- Reorganize PHY logic and add support for external PHY drivers
(Mauro Carvalho Chehab)
- Support PERST# GPIOs for HiKey970 external PEX 8606 bridge (Mauro
Carvalho Chehab)
- Add Kirin 970 support (Mauro Carvalho Chehab)
- Make driver removable (Mauro Carvalho Chehab)
Intel VMD host bridge driver:
- If IOMMU supports interrupt remapping, leave VMD MSI-X remapping
enabled (Adrian Huang)
- Number each controller so we can tell them apart in
/proc/interrupts (Chunguang Xu)
- Avoid building on UML because VMD depends on x86 bare metal APIs
(Johannes Berg)
Marvell Aardvark PCIe controller driver:
- Define macros for PCI_EXP_DEVCTL_PAYLOAD_* (Pali Rohár)
- Set Max Payload Size to 512 bytes per Marvell spec (Pali Rohár)
- Downgrade PIO Response Status messages to debug level (Marek Behún)
- Preserve CRS SV (Config Request Retry Software Visibility) bit in
emulated Root Control register (Pali Rohár)
- Fix issue in configuring reference clock (Pali Rohár)
- Don't clear status bits for masked interrupts (Pali Rohár)
- Don't mask unused interrupts (Pali Rohár)
- Avoid code repetition in advk_pcie_rd_conf() (Marek Behún)
- Retry config accesses on CRS response (Pali Rohár)
- Simplify emulated Root Capabilities initialization (Pali Rohár)
- Fix several link training issues (Pali Rohár)
- Fix link-up checking via LTSSM (Pali Rohár)
- Fix reporting of Data Link Layer Link Active (Pali Rohár)
- Fix emulation of W1C bits (Marek Behún)
- Fix MSI domain .alloc() method to return zero on success (Marek
Behún)
- Read entire 16-bit MSI vector in MSI handler, not just low 8 bits
(Marek Behún)
- Clear Root Port I/O Space, Memory Space, and Bus Master Enable bits
at startup; PCI core will set those as necessary (Pali Rohár)
- When operating as a Root Port, set class code to "PCI Bridge"
instead of the default "Mass Storage Controller" (Pali Rohár)
- Add emulation for PCI_BRIDGE_CTL_BUS_RESET since aardvark doesn't
implement this per spec (Pali Rohár)
- Add emulation of option ROM BAR since aardvark doesn't implement
this per spec (Pali Rohár)
MediaTek MT7621 PCIe controller driver:
- Add MediaTek MT7621 PCIe host controller driver and DT binding
(Sergio Paracuellos)
Qualcomm PCIe controller driver:
- Add SC8180x compatible string (Bjorn Andersson)
- Add endpoint controller driver and DT binding (Manivannan
Sadhasivam)
- Restructure to use of_device_get_match_data() (Prasad Malisetty)
- Add SC7280-specific pcie_1_pipe_clk_src handling (Prasad Malisetty)
Renesas R-Car PCIe controller driver:
- Remove unnecessary includes (Geert Uytterhoeven)
Rockchip DesignWare PCIe controller driver:
- Add DT binding (Simon Xue)
Socionext UniPhier Pro5 controller driver:
- Serialize INTx masking/unmasking (Kunihiko Hayashi)
Synopsys DesignWare PCIe controller driver:
- Run dwc .host_init() method before registering MSI interrupt
handler so we can deal with pending interrupts left by bootloader
(Bjorn Andersson)
- Clean up Kconfig dependencies (Andy Shevchenko)
- Export symbols to allow more modular drivers (Luca Ceresoli)
TI DRA7xx PCIe controller driver:
- Allow host and endpoint drivers to be modules (Luca Ceresoli)
- Enable external clock if present (Luca Ceresoli)
TI J721E PCIe driver:
- Disable PHY when probe fails after initializing it (Christophe
JAILLET)
MicroSemi Switchtec management driver:
- Return error to application when command execution fails because an
out-of-band reset has cleared the device BARs, Memory Space Enable,
etc (Kelvin Cao)
- Fix MRPC error status handling issue (Kelvin Cao)
- Mask out other bits when reading of management VEP instance ID
(Kelvin Cao)
- Return EOPNOTSUPP instead of ENOTSUPP from sysfs show functions
(Kelvin Cao)
- Add check of event support (Logan Gunthorpe)
Miscellaneous:
- Remove unused pci_pool wrappers, which have been replaced by
dma_pool (Cai Huoqing)
- Use 'unsigned int' instead of bare 'unsigned' (Krzysztof
Wilczyński)
- Use kstrtobool() directly, sans strtobool() wrapper (Krzysztof
Wilczyński)
- Fix some sscanf(), sprintf() format mismatches (Krzysztof
Wilczyński)
- Update PCI subsystem information in MAINTAINERS (Krzysztof
Wilczyński)
- Correct some misspellings (Krzysztof Wilczyński)"
* tag 'pci-v5.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (137 commits)
PCI: Add ACS quirk for Pericom PI7C9X2G switches
PCI: apple: Configure RID to SID mapper on device addition
iommu/dart: Exclude MSI doorbell from PCIe device IOVA range
PCI: apple: Implement MSI support
PCI: apple: Add INTx and per-port interrupt support
PCI: kirin: Allow removing the driver
PCI: kirin: De-init the dwc driver
PCI: kirin: Disable clkreq during poweroff sequence
PCI: kirin: Move the power-off code to a common routine
PCI: kirin: Add power_off support for Kirin 960 PHY
PCI: kirin: Allow building it as a module
PCI: kirin: Add MODULE_* macros
PCI: kirin: Add Kirin 970 compatible
PCI: kirin: Support PERST# GPIOs for HiKey970 external PEX 8606 bridge
PCI: apple: Set up reference clocks when probing
PCI: apple: Add initial hardware bring-up
PCI: of: Allow matching of an interrupt-map local to a PCI device
of/irq: Allow matching of an interrupt-map local to an interrupt controller
irqdomain: Make of_phandle_args_to_fwspec() generally available
PCI: Do not enable AtomicOps on VFs
...
- Tidy setup-irq.c comments (Pranay Sanghai)
- Fix misspellings (Krzysztof Wilczyński)
- Fix sprintf(), sscanf() format mismatches (Krzysztof Wilczyński)
- Tidy cpqphp code formatting (Krzysztof Wilczyński)
- Remove unused pci_pool wrappers, which have been replaced by dma_pool
(Cai Huoqing)
- Remove a redundant initialization in __pci_reset_function_locked() (Colin
Ian King)
- Use 'unsigned int' instead of 'unsigned' (Krzysztof Wilczyński)
- Update PCI subsystem information in MAINTAINERS (Krzysztof Wilczyński)
- Include generic <linux/> headers instead of <asm/> for cpqphp and vmd
(Krzysztof Wilczyński)
* pci/misc:
PCI: vmd: Drop redundant includes of <asm/device.h>, <asm/msi.h>
PCI: cpqphp: Use <linux/io.h> instead of <asm/io.h>
MAINTAINERS: Update PCI subsystem information
PCI: Prefer 'unsigned int' over bare 'unsigned'
PCI: Remove redundant 'rc' initialization
PCI: Remove unused pci_pool wrappers
PCI: cpqphp: Format if-statement code block correctly
PCI: Use unsigned to match sscanf("%x") in pci_dev_str_match_path()
PCI: hv: Remove unnecessary use of %hx
PCI: Correct misspelled and remove duplicated words
PCI: Tidy comments
- Ignore Link Down/Up caused by error-induced Hot Reset so endpoint driver
can remain bound to device during error recovery (Lukas Wunner)
- Remove unused resume err_handler (Lukas Wunner)
- Remove unused pcie_port_bus_{,un}register() declarations (Lukas Wunner)
- Skip compiling err.c when CONFIG_PCIEAER not set (Lukas Wunner)
* pci/hotplug:
PCI/ERR: Reduce compile time for CONFIG_PCIEAER=n
PCI/portdrv: Remove unused pcie_port_bus_{,un}register() declarations
PCI/portdrv: Remove unused resume err_handler
PCI: pciehp: Ignore Link Down/Up caused by error-induced Hot Reset
PCI/portdrv: Rename pm_iter() to pcie_port_device_iter()
- Drop the struct pci_dev.driver pointer, which is redundant with the
struct device.driver pointer (Uwe Kleine-König)
* pci/driver:
PCI: Remove struct pci_dev->driver
PCI: Use to_pci_driver() instead of pci_dev->driver
x86/pci/probe_roms: Use to_pci_driver() instead of pci_dev->driver
perf/x86/intel/uncore: Use to_pci_driver() instead of pci_dev->driver
powerpc/eeh: Use to_pci_driver() instead of pci_dev->driver
usb: xhci: Use to_pci_driver() instead of pci_dev->driver
cxl: Use to_pci_driver() instead of pci_dev->driver
cxl: Factor out common dev->driver expressions
xen/pcifront: Use to_pci_driver() instead of pci_dev->driver
xen/pcifront: Drop pcifront_common_process() tests of pcidev, pdrv
nfp: use dev_driver_string() instead of pci_dev->driver->name
mlxsw: pci: Use dev_driver_string() instead of pci_dev->driver->name
net: marvell: prestera: use dev_driver_string() instead of pci_dev->driver->name
net: hns3: use dev_driver_string() instead of pci_dev->driver->name
crypto: hisilicon - use dev_driver_string() instead of pci_dev->driver->name
powerpc/eeh: Use dev_driver_string() instead of struct pci_dev->driver->name
ssb: Use dev_driver_string() instead of pci_dev->driver->name
bcma: simplify reference to driver name
crypto: qat - simplify adf_enable_aer()
scsi: message: fusion: Remove unused mpt_pci driver .probe() 'id' parameter
PCI/ERR: Factor out common dev->driver expressions
PCI: Drop pci_device_probe() test of !pci_dev->driver
PCI: Drop pci_device_remove() test of pci_dev->driver
PCI: Return NULL for to_pci_driver(NULL)
- Rename pcibios_add_device() to pcibios_device_add() since it's called
from pci_device_add() (Oliver O'Halloran)
- Don't try to enable AtomicOps on VFs, since they can only be enabled on
the PF (Selvin Xavier)
* pci/enumeration:
PCI: Do not enable AtomicOps on VFs
PCI: Rename pcibios_add_device() to pcibios_device_add()
Host crashes when pci_enable_atomic_ops_to_root() is called for VFs with
virtual buses. The virtual buses added to SR-IOV have bus->self set to NULL
and host crashes due to this.
PID: 4481 TASK: ffff89c6941b0000 CPU: 53 COMMAND: "bash"
...
#3 [ffff9a9481713808] oops_end at ffffffffb9025cd6
#4 [ffff9a9481713828] page_fault_oops at ffffffffb906e417
#5 [ffff9a9481713888] exc_page_fault at ffffffffb9a0ad14
#6 [ffff9a94817138b0] asm_exc_page_fault at ffffffffb9c00ace
[exception RIP: pcie_capability_read_dword+28]
RIP: ffffffffb952fd5c RSP: ffff9a9481713960 RFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff89c6b1096000 RCX: 0000000000000000
RDX: ffff9a9481713990 RSI: 0000000000000024 RDI: 0000000000000000
RBP: 0000000000000080 R8: 0000000000000008 R9: ffff89c64341a2f8
R10: 0000000000000002 R11: 0000000000000000 R12: ffff89c648bab000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff89c648bab0c8
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff9a9481713988] pci_enable_atomic_ops_to_root at ffffffffb95359a6
#8 [ffff9a94817139c0] bnxt_qplib_determine_atomics at ffffffffc08c1a33 [bnxt_re]
#9 [ffff9a94817139d0] bnxt_re_dev_init at ffffffffc08ba2d1 [bnxt_re]
Per PCIe r5.0, sec 9.3.5.10, the AtomicOp Requester Enable bit in Device
Control 2 is reserved for VFs. The PF value applies to all associated VFs.
Return -EINVAL if pci_enable_atomic_ops_to_root() is called for a VF.
Link: https://lore.kernel.org/r/1631354585-16597-1-git-send-email-selvin.xavier@broadcom.com
Fixes: 35f5ace5de ("RDMA/bnxt_re: Enable global atomic ops if platform supports")
Fixes: 430a23689d ("PCI: Add pci_enable_atomic_ops_to_root()")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Here is the big set of staging driver updates and cleanups for 5.16-rc1.
Overall we ended up removing a lot of code this time, a bit over 20,000
lines are now gone thanks to a lot of cleanup work by many developers.
Nothing huge in here functionality wise, just loads of cleanups:
- r8188eu driver major cleanups and removal of unused and dead
code
- wlan-ng minor cleanups
- fbtft driver cleanups
- most driver cleanups
- rtl8* drivers cleanups
- rts5208 driver cleanups
- vt6655 driver cleanups
- vc04_services drivers cleanups
- wfx cleanups on the way to almost getting this merged out of
staging (it's close!)
- tiny mips changes needed for the mt7621 drivers, they have
been acked by the respective subsystem maintainers to go
through this tree.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPZQQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yml9wCeJl83anYno0xh+UP6CsEkbe64VJEAoIEKyry/
tlUowcatxGfz3aYA1wTc
=FyAK
-----END PGP SIGNATURE-----
Merge tag 'staging-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here is the big set of staging driver updates and cleanups for
5.16-rc1.
Overall we ended up removing a lot of code this time, a bit over
20,000 lines are now gone thanks to a lot of cleanup work by many
developers.
Nothing huge in here functionality wise, just loads of cleanups:
- r8188eu driver major cleanups and removal of unused and dead code
- wlan-ng minor cleanups
- fbtft driver cleanups
- most driver cleanups
- rtl8* drivers cleanups
- rts5208 driver cleanups
- vt6655 driver cleanups
- vc04_services drivers cleanups
- wfx cleanups on the way to almost getting this merged out of
staging (it's close!)
- tiny mips changes needed for the mt7621 drivers, they have been
acked by the respective subsystem maintainers to go through this
tree.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (622 commits)
staging: r8188eu: hal: remove goto statement and local variable
staging: rtl8723bs: hal remove the assignment to itself
staging: rtl8723bs: fix unmet dependency on CRYPTO for CRYPTO_LIB_ARC4
staging: vchiq_core: get rid of typedef
staging: fieldbus: anybus: reframe comment to avoid warning
staging: r8188eu: fix missing unlock in rtw_resume()
staging: r8188eu: core: remove the goto from rtw_IOL_accquire_xmit_frame
staging: r8188eu: core: remove goto statement
staging: vt6655: Rename `dwAL7230InitTable` array
staging: vt6655: Rename `dwAL2230PowerTable` array
staging: vt6655: Rename `dwAL7230InitTableAMode` array
staging: vt6655: Rename `dwAL7230ChannelTable2` array
staging: vt6655: Rename `dwAL7230ChannelTable1` array
staging: vt6655: Rename `dwAL7230ChannelTable0` array
staging: vt6655: Rename `dwAL2230ChannelTable1` array
staging: vt6655: Rename `dwAL2230ChannelTable0` array
staging: r8712u: fix control-message timeout
staging: rtl8192u: fix control-message timeouts
staging: mt7621-dts: add missing SPDX license to files
staging: vchiq_core: fix quoted strings split across lines
...
Add pci_find_dvsec_capability to locate a Designated Vendor-Specific
Extended Capability with the specified Vendor ID and Capability ID.
The Designated Vendor-Specific Extended Capability (DVSEC) allows one or
more "vendor" specific capabilities that are not tied to the Vendor ID
of the PCI component. Where the DVSEC Vendor may be a standards body
like CXL.
Cc: David E. Box <david.e.box@linux.intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-pci@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Andrew Donnellan <ajd@linux.ibm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163379787943.692348.6814373487017444007.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The bare "unsigned" type implicitly means "unsigned int", but the preferred
coding style is to use the complete type name.
Update the bare use of "unsigned" to the preferred "unsigned int".
No change to functionality intended.
See a1ce18e4f9 ("checkpatch: warn on bare unsigned or signed declarations
without int").
Link: https://lore.kernel.org/r/20211013014136.1117543-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The variable 'rc' is being initialized with a value that is never read.
Remove the redundant assignment.
Addresses-Coverity: ("Unused value")
Link: https://lore.kernel.org/r/20210910161417.91001-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
The ordering of operations in pci_back_from_sleep() is incorrect,
because the device may be in D3cold when it runs and pci_enable_wake()
needs to access the device's configuration space which cannot be
done in D3cold.
Fix this by calling pci_set_power_state() to put the device into D0
before calling pci_enable_wake() for it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Per PCIe r5.0, sec 7.5.3.16, Downstream Ports must disable LTR if the link
goes down (the Port goes DL_Down status). This is a problem because the
Downstream Port's dev->ltr_path is still set, so we think LTR is still
enabled, and we enable LTR in the Endpoint. When it sends LTR messages,
they cause Unsupported Request errors at the Downstream Port.
This happens in the reset path, where we may enable LTR in
pci_restore_pcie_state() even though the Downstream Port disabled LTR
because the reset caused a link down event.
It also happens in the hot-remove and hot-add path, where we may enable LTR
in pci_configure_ltr() even though the Downstream Port disabled LTR when
the hot-remove took the link down.
In these two scenarios, check the upstream bridge and restore its LTR
enable if appropriate.
The Unsupported Request may be logged by AER as follows:
pcieport 0000:00:1d.0: AER: Uncorrected (Non-Fatal) error received: id=00e8
pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=00e8(Requester ID)
pcieport 0000:00:1d.0: device [8086:9d18] error status/mask=00100000/00010000
pcieport 0000:00:1d.0: [20] Unsupported Request (First)
In addition, if LTR is not configured correctly, the link cannot enter the
L1.2 state, which prevents some machines from entering the S0ix low power
state.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20211012075614.54576-1-mingchuang.qiao@mediatek.com
Reported-by: Utkarsh H Patel <utkarsh.h.patel@intel.com>
Signed-off-by: Mingchuang Qiao <mingchuang.qiao@mediatek.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Struct pci_driver contains a struct device_driver, so for PCI devices, it's
easy to convert a device_driver * to a pci_driver * with to_pci_driver().
The device_driver * is in struct device, so we don't need to also keep
track of the pci_driver * in struct pci_dev.
Replace pci_dev->driver with to_pci_driver(). This is a step toward
removing pci_dev->driver.
[bhelgaas: split to separate patch]
Link: https://lore.kernel.org/r/20211004125935.2300113-11-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The sole non-static function in err.c, pcie_do_recovery(), is only
called from:
* aer.c (if CONFIG_PCIEAER=y)
* dpc.c (if CONFIG_PCIE_DPC=y, which depends on CONFIG_PCIEAER)
* edr.c (if CONFIG_PCIE_EDR=y, which depends on CONFIG_PCIE_DPC)
Thus, err.c need not be compiled if CONFIG_PCIEAER=n.
Also, pci_uevent_ers() and pcie_clear_device_status(), which are called
from err.c, can be #ifdef'ed away unless CONFIG_PCIEAER=y.
Since x86_64_defconfig doesn't enable CONFIG_PCIEAER, this change may
slightly reduce compile time for anyone doing a test build with that
config.
Link: https://lore.kernel.org/r/98f9041151268c1c035ab64cca320ad86803f64a.1627638184.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cppcheck warns that pci_dev_str_match_path() passes pointers to signed ints
to sscanf("%x"), which expects pointers to *unsigned* ints:
invalidScanfArgType_int drivers/pci/pci.c:312 %x in format string (no. 2) requires 'unsigned int *' but the argument type is 'signed int *'.
Declare the variables as unsigned to avoid this issue.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20211008222732.2868493-3-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Drop two invocations of platform_pci_power_manageable() that are not
necessary, because the functions called when it returns 'true' do the
requisite "power manageable" checks themselves.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Ferry Toth <fntoth@gmail.com>
The pci_choose_state() and pci_target_state() implementations are
somewhat divergent without a good reason, because they are used
for similar purposes.
Change the pci_choose_state() implementation to use pci_target_state()
internally except for transitions to the working state of the system
in which case it is expected to return D0.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Ferry Toth <fntoth@gmail.com>
Make pci_target_state() return D3cold or D0 without checking PME
support if the current power state of the device is D3cold or if it
does not support the standard PCI PM, respectively.
Next, drop the tergat_state local variable that has become redundant
from it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Ferry Toth <fntoth@gmail.com>
After previous changes there are no more users of struct
pci_platform_pm_ops in the tree, so drop it along with all of the
remaining related code.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Ferry Toth <fntoth@gmail.com>
pci_remap_iospace() was originally meant as an architecture specific helper
but it moved into generic code after all architectures had the same
requirements. MIPS has different requirements so it should not be shared.
The way for doing this will be using a macro 'pci_remap_iospace' defined
for those architectures that need a special treatment. Hence, put core API
function inside preprocesor conditional code for 'pci_remap_iospace'
definition.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Link: https://lore.kernel.org/r/20210925203224.10419-5-sergio.paracuellos@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Using struct pci_platform_pm_ops for ACPI adds unnecessary
indirection to the interactions between the PCI core and ACPI PM,
which is also subject to retpolines.
Moreover, it is not particularly clear from the current code that,
as far as PCI PM is concerned, "platform" really means just ACPI
except for the special casess when Intel MID PCI PM is used or when
ACPI support is disabled (through the kernel config or command line,
or because there are no usable ACPI tables on the system).
To address the above, rework the PCI PM code to invoke ACPI PM
functions directly as needed and drop the acpi_pci_platform_pm
object that is not necessary any more.
Accordingly, update some of the ACPI PM functions in question to do
extra checks in case the ACPI support is disabled (which previously
was taken care of by avoiding to set the pci_platform_ops pointer
in those cases).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Ferry Toth <fntoth@gmail.com>
There are only two users of struct pci_platform_pm_ops in the tree,
one of which is Intel MID PM and the other one is ACPI. They are
mutually exclusive and the MID PM should take precedence when they
both are enabled, but whether or not this really is the case hinges
on the specific ordering of arch_initcall() calls made by them.
The struct pci_platform_pm_ops abstraction is not really necessary
for just these two users, but it adds complexity and overhead because
of retoplines involved in using all of the function pointers in there.
It also makes following the code a bit more difficult than it would
be otherwise.
Moreover, Intel MID PCI PM doesn't even implement the majority of the
function pointers in struct pci_platform_pm_ops in a meaningful way,
so switch over the PCI core to calling the relevant MID PM routines,
mid_pci_set_power_state() and mid_pci_set_power_state(), directly as
needed and drop mid_pci_platform_pm.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Ferry Toth <fntoth@gmail.com>
The general convention for pcibios_* hooks is that they're named after the
corresponding pci_* function they provide a hook for. The exception is
pcibios_add_device() which provides a hook for pci_device_add().
Rename pcibios_add_device() to pcibios_device_add() so it matches
pci_device_add().
Also, remove the export of the microblaze version. The only caller must be
compiled as a built-in so there's no reason for the export.
Link: https://lore.kernel.org/r/20210913152709.48013-1-oohall@gmail.com
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com> # s390
PCIe Address Translation Services (ATS) provides a mechanism for a device
to provide an on-device caching translation agent (device IOTLB). We
already have a means to disable support for this feature via the pci=noats
option. For untrusted and externally facing devices, we not only disable
ATS support for the device, but we use Access Control Services (ACS)
Transaction Blocking to actively prevent devices from sending TLPs with
non-default AT field values.
Extend pci=noats to also make use of PCI_ACS_TB so that not only is ATS
disabled at the device, but blocked at the downstream ports. This provides
a means to further lock-down ATS for cases such as device assignment, where
it may not be the hardware configuration of the device that makes it
untrusted, but the driver running on the device.
Link: https://lore.kernel.org/r/162404966325.2362347.12176138291577486015.stgit@omen
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rajat Jain <rajatja@google.com>
Change the type of probe argument in functions which implement reset
methods from int to bool to make the context and intent clear.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-10-ameynarkhede03@gmail.com
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
_RST is a standard ACPI method that performs a function level reset of a
device (ACPI v6.3, sec 7.3.25).
Add pci_dev_acpi_reset() to probe for _RST method and execute if present.
The default priority of this reset is set to below device-specific and
above hardware resets.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-9-ameynarkhede03@gmail.com
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Add "reset_method" sysfs attribute to enable user to query and set
preferred device reset methods and their ordering.
[bhelgaas: on invalid sysfs input, return error and preserve previous
config, as in earlier patch versions]
Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-6-ameynarkhede03@gmail.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
"reset_fn" indicates whether the device supports any reset mechanism.
Remove the use of reset_fn in favor of the reset_methods array that tracks
supported reset mechanisms of a device and their ordering.
The octeon driver incorrectly used reset_fn to detect whether the device
supports FLR or not. Use pcie_reset_flr() to probe whether it supports FLR.
Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-5-ameynarkhede03@gmail.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Add reset_methods[] in struct pci_dev to keep track of reset mechanisms
supported by the device and their ordering.
Refactor probing and reset functions to take advantage of calling
convention of reset functions.
Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-4-ameynarkhede03@gmail.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Most reset methods are of the form "pci_*_reset(dev, probe)". pcie_flr()
was an exception because it relied on a separate pcie_has_flr() function
instead of taking a "probe" argument.
Add "pcie_reset_flr(dev, probe)" to follow the convention. Remove
pcie_has_flr().
Some pcie_flr() callers that did not use pcie_has_flr() remain.
[bhelgaas: commit log, rework pcie_reset_flr() to use dev->devcap directly]
Link: https://lore.kernel.org/r/20210817180500.1253-3-ameynarkhede03@gmail.com
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Add a new member called devcap in struct pci_dev for caching the PCIe
Device Capabilities register to avoid reading PCI_EXP_DEVCAP multiple
times.
Refactor pcie_has_flr() to use cached device capabilities.
Link: https://lore.kernel.org/r/20210817180500.1253-2-ameynarkhede03@gmail.com
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
pci_dev_str_match_path() is often called with a spinlock held so the
allocation has to be atomic. The call tree is:
pci_specified_resource_alignment() <-- takes spin_lock();
pci_dev_str_match()
pci_dev_str_match_path()
Fixes: 45db33709c ("PCI: Allow specifying devices using a base bus and path of devfns")
Link: https://lore.kernel.org/r/20210812070004.GC31863@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
PME signaling is only enabled by __pci_enable_wake() if the target
device can signal PME from the given target power state (to avoid
pointless reconfiguration of the device), but if the hierarchy above
the device goes into D3cold, the device itself will end up in D3cold
too, so if it can signal PME from D3cold, it should be enabled to
do so in __pci_enable_wake().
[Note that if the device does not end up in D3cold and it cannot
signal PME from the original target power state, it will not signal
PME, so in that case the behavior does not change.]
Link: https://lore.kernel.org/linux-pm/3149540.aeNJFYEL58@kreacher/
Fixes: 5bcc2fb4e8 ("PCI PM: Simplify PCI wake-up code")
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reported-by: Utkarsh H Patel <utkarsh.h.patel@intel.com>
Reported-by: Koba Ko <koba.ko@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
It is inconsistent to return PCI_D0 from pci_target_state() instead
of the original target state if 'wakeup' is true and the device
cannot signal PME from D0.
This only happens when the device cannot signal PME from the original
target state and any shallower power states (including D0) and that
case is effectively equivalent to the one in which PME singaling is
not supported at all. Since the original target state is returned in
the latter case, make the function do that in the former one too.
Link: https://lore.kernel.org/linux-pm/3149540.aeNJFYEL58@kreacher/
Fixes: 666ff6f83e ("PCI/PM: Avoid using device_may_wakeup() for runtime PM")
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reported-by: Utkarsh H Patel <utkarsh.h.patel@intel.com>
Reported-by: Koba Ko <koba.ko@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
pci_ioremap_bar() and pci_ioremap_wc_bar() shared similar implementations
but differed in unimportant ways. Align them by adding a shared helper,
__pci_ioremap_resource().
Upgrade warning message to error level, since it indicates a driver defect.
Remove WARN_ON() from WC path in favor of the error message.
[bhelgaas: commit log, use ioremap() since pci_iomap_range() doesn't add
anything]
Link: https://lore.kernel.org/r/20210713102436.304693-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Updating the current_state field of struct pci_dev the way it is done
in pci_enable_device_flags() before calling do_pci_enable_device() may
not work. For example, if the given PCI device depends on an ACPI
power resource whose _STA method initially returns 0 ("off"), but the
config space of the PCI device is accessible and the power state
retrieved from the PCI_PM_CTRL register is D0, the current_state
field in the struct pci_dev representing that device will get out of
sync with the power.state of its ACPI companion object and that will
lead to power management issues going forward.
To avoid such issues, make pci_enable_device_flags() call
pci_update_current_state() which takes ACPI device power management
into account, if present, to retrieve the current power state of the
device.
Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/
Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Maximilian Luz <luzmaximilian@gmail.com>
Other places in the kernel use this form, and so just
provide a common path for it.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/20210623022824.308041-2-mcgrof@kernel.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Revert commit 4514d991d9 ("PCI: PM: Do not read power state in
pci_enable_device_flags()") that is reported to cause PCI device
initialization issues on some systems.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213481
Link: https://lore.kernel.org/linux-acpi/YNDoGICcg0V8HhpQ@eldamar.lan
Reported-by: Michael <phyre@rogers.com>
Reported-by: Salvatore Bonaccorso <carnil@debian.org>
Fixes: 4514d991d9 ("PCI: PM: Do not read power state in pci_enable_device_flags()")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The value of the "resource_alignment" can be specified using a kernel
command-line argument ("pci=resource_alignment=") or through the
corresponding sysfs attribute under the /sys/bus/pci path.
Previously, when the value was set via the kernel command-line argument,
and then subsequently accessed through sysfs attribute, the value read back
was not correct:
# grep -oE 'pci=resource_alignment.+' /proc/cmdline
pci=resource_alignment=20@00:1f.2
# cat /sys/bus/pci/resource_alignment
20@00:1f.
This was also true when the value was set through the sysfs attribute
without including a trailing newline:
# echo -n 20@00:1f.2 > /sys/bus/pci/resource_alignment
# cat /sys/bus/pci/resource_alignment
20@00:1f.
When it was set through the sysfs attribute *including* a newline,
reading it back worked as intended:
# echo 20@00:1f.2 > /sys/bus/pci/resource_alignment
# cat /sys/bus/pci/resource_alignment
20@00:1f.2
To fix this inconsistency, append a trailing newline in the show() function
and strip the trailing line in the store() function if one is present.
Also, allow for the value previously set using either a command-line
argument or through the sysfs object to be cleared at run-time.
[bhelgaas: fold in kfree fix from
https://lore.kernel.org/linux-pci/20210604133230.983956-4-kw@linux.com]
Fixes: e499081da1 ("PCI: Force trailing new line to resource_alignment_param in sysfs")
Link: https://lore.kernel.org/r/20210603000112.703037-4-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
The sysfs_emit() and sysfs_emit_at() functions were introduced to make
it less ambiguous which function is preferred when writing to the output
buffer in a device attribute's "show" callback [1].
Convert the PCI sysfs object "show" functions from sprintf(), snprintf()
and scnprintf() to sysfs_emit() and sysfs_emit_at() accordingly, as the
latter is aware of the PAGE_SIZE buffer and correctly returns the number
of bytes written into the buffer.
No functional change intended.
[1] Documentation/filesystems/sysfs.rst
Related commit: ad025f8e46 ("PCI/sysfs: Use sysfs_emit() and
sysfs_emit_at() in "show" functions").
Link: https://lore.kernel.org/r/20210603000112.703037-2-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
pci_parent_bus_reset() resets a device by performing a Secondary Bus Reset
on a PCI-to-PCI bridge leading to the device.
pci_dev_reset_slot_function() does the same, except that it uses a hotplug
driver to keep the reset from looking like a hot-remove followed by a
hot-add.
Add a pci_reset_bus_function() wrapper, which attempts the hotplug driver
slot reset and falls back to the parent bus reset if that fails. This
provides a single interface for performing a Secondary Bus Reset.
[bhelgaas: commit log, don't expose yet]
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210323100625.0021a943@omen.home.shazbot.org/
Link: https://lore.kernel.org/r/20210408182328.12323-1-raphael.norwitz@nutanix.com
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
New drivers/devices
- Support for QCOM SM8150 GPI DMA
Updates:
- Big pile of idxd updates including support for performance monitoring
- Support in dw-edma for interleaved dma
- Support for synchronize() in Xilinx driver
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmCRd8UACgkQfBQHDyUj
g0da8w/+L0o/qmwIYr2WHLIX8fXNSJkVu001p+eqN7UcSy4DBym4YEeo66jYzMHu
lJV9Wa0LC1Yzi0CTwP/bDMMVeT2NpTquHyat4SB8deTI9H4RiAoEX2hogjPYYZ14
ZCCjSpyHjF6VomFu7yyluXPe3s1cCopeiSDMDrHBfTYWhH0SSya6ObGcCdqEV1SO
p2MwW+5mTLjYVMcWTV8tuRS67MVf2tUPT+gvmX8KY0bEeqL+hpzTKDEAHOSW8p9D
PiyKX0bPwfXupXiYmbkQlSEH8+qwarrLNPFU/uxXAym5vsTxP2D3eoeKp/9U/H6H
nOuueFod+7LDgI5fe+BpOXW98G0mnzX/anPLMUInCbkc4JPLdHvnakQ7kxM7EPn3
hMi8DCPv2Ft/cc14KLT1mgnL2+SawVHigyVcSK1YFq2vlzy+m7tbEHXpiGUDlY8h
bwG6gCafN7D8U33vtipQtMmwgRGBXgytUPFq8J73tw+DuHTZqP2eZUQuqNfRPXa6
4cmWAbIn4JLVBxlwADfhMJNdeBEgHqkl2aWZPcoQmKOiBtnOd+tAL5Hb7EQWqyhB
J1cVkYyCGASVxrTTiK3s1gcqJ0lsjFqon+OA4V03GO0yHqkK+LTd3RsubKdp7y7R
db3ab0C0uIH9oc9NmqShN6j9aaQIiEWtQTBlJju/ObLaMfV3mGk=
=Jjb4
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"New drivers/devices:
- Support for QCOM SM8150 GPI DMA
Updates:
- Big pile of idxd updates including support for performance
monitoring
- Support in dw-edma for interleaved dma
- Support for synchronize() in Xilinx driver"
* tag 'dmaengine-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (42 commits)
dmaengine: idxd: Enable IDXD performance monitor support
dmaengine: idxd: Add IDXD performance monitor support
dmaengine: idxd: remove MSIX masking for interrupt handlers
dmaengine: idxd: device cmd should use dedicated lock
dmaengine: idxd: support reporting of halt interrupt
dmaengine: idxd: enable SVA feature for IOMMU
dmaengine: idxd: convert sprintf() to sysfs_emit() for all usages
dmaengine: idxd: add interrupt handle request and release support
dmaengine: idxd: add support for readonly config mode
dmaengine: idxd: add percpu_ref to descriptor submission path
dmaengine: idxd: remove detection of device type
dmaengine: idxd: iax bus removal
dmaengine: idxd: fix cdev setup and free device lifetime issues
dmaengine: idxd: fix group conf_dev lifetime
dmaengine: idxd: fix engine conf_dev lifetime
dmaengine: idxd: fix wq conf_dev 'struct device' lifetime
dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime
dmaengine: idxd: use ida for device instance enumeration
dmaengine: idxd: removal of pcim managed mmio mapping
dmaengine: idxd: cleanup pci interrupt vector allocation management
...
- Configure FC and FTS for functions other than 0 (Ryder Lee)
- Add missing MODULE_DEVICE_TABLE (Qiheng Lin)
- Add YAML schema for MediaTek (Jianjun Wang)
- Export pci_pio_to_address() for module use (Jianjun Wang)
- Add MediaTek MT8192 PCIe controller driver (Jianjun Wang)
- Add MediaTek MT8192 INTx support (Jianjun Wang)
- Add MediaTek MT8192 MSI support (Jianjun Wang)
- Add MediaTek MT8192 system power management support (Jianjun Wang)
* remotes/lorenzo/pci/mediatek:
MAINTAINERS: Add Jianjun Wang as MediaTek PCI co-maintainer
PCI: mediatek-gen3: Add system PM support
PCI: mediatek-gen3: Add MSI support
PCI: mediatek-gen3: Add INTx support
PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192
PCI: Export pci_pio_to_address() for module use
dt-bindings: PCI: mediatek-gen3: Add YAML schema
PCI: mediatek: Add missing MODULE_DEVICE_TABLE
PCI: mediatek: Configure FC and FTS for functions other than 0
This is a shim around vunmap_range, get rid of it.
Move the main API comment from the _noflush variant to the normal
variant, and make _noflush internal to mm/.
[npiggin@gmail.com: fix nommu builds and a comment bug per sfr]
Link: https://lkml.kernel.org/r/1617292598.m6g0knx24s.astroid@bobo.none
[akpm@linux-foundation.org: move vunmap_range_noflush() stub inside !CONFIG_MMU, not !CONFIG_NUMA]
[npiggin@gmail.com: fix nommu builds]
Link: https://lkml.kernel.org/r/1617292497.o1uhq5ipxp.astroid@bobo.none
Link: https://lkml.kernel.org/r/20210322021806.892164-5-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Cédric Le Goater <clg@kaod.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This interface will be used by PCI host drivers for PIO translation,
export it to support compiling those drivers as kernel modules.
Link: https://lore.kernel.org/r/20210420061723.989-3-jianjun.wang@mediatek.com
Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Add pci_disable_parity() to disable reporting of parity errors for a
device by clearing PCI_COMMAND_PARITY.
The device will still set PCI_STATUS_DETECTED_PARITY when it detects
a parity error or receives a Poisoned TLP, but it will not set
PCI_STATUS_PARITY, which means it will not assert PERR#
(conventional PCI) or report Poisoned TLPs (PCIe).
Based-on: https://lore.kernel.org/linux-arm-kernel/d375987c-ea4f-dd98-4ef8-99b2fbfe7c33@gmail.com/
Based-on-patch-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/20210330174318.1289680-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
It should not be necessary to update the current_state field of
struct pci_dev in pci_enable_device_flags() before calling
do_pci_enable_device() for the device, because none of the
code between that point and the pci_set_power_state() call in
do_pci_enable_device() invoked later depends on it.
Moreover, doing that is actively harmful in some cases. For example,
if the given PCI device depends on an ACPI power resource whose _STA
method initially returns 0 ("off"), but the config space of the PCI
device is accessible and the power state retrieved from the
PCI_PM_CTRL register is D0, the current_state field in the struct
pci_dev representing that device will get out of sync with the
power.state of its ACPI companion object and that will lead to
power management issues going forward.
To avoid such issues it is better to leave the current_state value
as is until it is changed to PCI_D0 by do_pci_enable_device() as
appropriate. However, the power state of the device is not changed
to PCI_D0 if it is already enabled when pci_enable_device_flags()
gets called for it, so update its current_state in that case, but
use pci_update_current_state() covering platform PM too for that.
Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/
Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
Tested-by: Maximilian Luz <luzmaximilian@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Add pci_find_vsec_capability() to locate a Vendor-Specific Extended
Capability with the specified VSEC ID.
The Vendor-Specific Extended Capability (VSEC) allows one or more
proprietary capabilities defined by the vendor which aren't standard
or shared between vendors.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/d89506834fb11c6fa0bd5d515c0dd55b13ac6958.1613674948.git.gustavo.pimentel@synopsys.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmA2xiQUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vzRDA/9GCyEskI9DMtyT9UeoTMzpHcUZpaU
eCbLa2BSPjOKlrHLnPY7IwE0nT7ihe4OOcm8uOYOWtulE46XJNCHfxlUYP3SbI0Y
JlG0FBCh4ldzCzzKsftwkSvVhk+gn+ms9ucJ8q2iBSOXVhG/41IbX7++8IfbQM4v
VHjdYUmTCCiOSRDtBVi82p4+GAHxH8IhaB0gDNb1Q7myj+qJKL5nKjK/nukgO0fO
UpCnSxyua48Ij+c59Y1QAIhGeORq5Gg5Q4ussY3FxS9ovhZODEGQwCFniTfilqRw
wEB9Fb8tiPY60ljEyDPnERMkiW69zutTJqOY4LfwmoRM9IEbxD6VPIqF5gin8sB7
pHhX4KUU+eB1hQdK9SGKjkwyehquNKzTdxsu2jccltOKwBm5jcXYeOvu2bJTzZn+
rrZPYJoA1dQig3bEuOzsBxvW4Jaj7IsVfVcao4OzXyh8Y7tLr9kVDXxr7JC/EkPM
zRK24yglERD2J1JXgNMvOuJQj6JmRHhEbV/faZci8x8ZEaz1FawRAUZqHf/gGmnW
2CllarHbRnchPyD8btv03Mp84WG6fCfKy7zG2D8HxOsiStDO/5ICehHtGcvYg7IL
RuE4Tj8OKdcbw/8cO4C3842FqiSj34+jooNIHSLyBqcpJam6VsN4XqNIZCL+DeG5
Q2JXruAaahTWOZg=
=GXL5
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Remove unnecessary locking around _OSC (Bjorn Helgaas)
- Clarify message about _OSC failure (Bjorn Helgaas)
- Remove notification of PCIe bandwidth changes (Bjorn Helgaas)
- Tidy checking of syscall user config accessors (Heiner Kallweit)
Resource management:
- Decline to resize resources if boot config must be preserved (Ard
Biesheuvel)
- Fix pci_register_io_range() memory leak (Geert Uytterhoeven)
Error handling (Keith Busch):
- Clear error status from the correct device
- Retain error recovery status so drivers can use it after reset
- Log the type of Port (Root or Switch Downstream) that we reset
- Always request a reset for Downstream Ports in frozen state
Endpoint framework and NTB (Kishon Vijay Abraham I):
- Make *_get_first_free_bar() take into account 64 bit BAR
- Add helper API to get the 'next' unreserved BAR
- Make *_free_bar() return error codes on failure
- Remove unused pci_epf_match_device()
- Add support to associate secondary EPC with EPF
- Add support in configfs to associate two EPCs with EPF
- Add pci_epc_ops to map MSI IRQ
- Add pci_epf_ops to expose function-specific attrs
- Allow user to create sub-directory of 'EPF Device' directory
- Implement ->msi_map_irq() ops for cadence
- Configure LM_EP_FUNC_CFG based on epc->function_num_map for cadence
- Add EP function driver to provide NTB functionality
- Add support for EPF PCI Non-Transparent Bridge
- Add specification for PCI NTB function device
- Add PCI endpoint NTB function user guide
- Add configfs binding documentation for pci-ntb endpoint function
Broadcom STB PCIe controller driver:
- Add support for BCM4908 and external PERST# signal controller
(Rafał Miłecki)
Cadence PCIe controller driver:
- Retrain Link to work around Gen2 training defect (Nadeem Athani)
- Fix merge botch in cdns_pcie_host_map_dma_ranges() (Krzysztof
Wilczyński)
Freescale Layerscape PCIe controller driver:
- Add LX2160A rev2 EP mode support (Hou Zhiqiang)
- Convert to builtin_platform_driver() (Michael Walle)
MediaTek PCIe controller driver:
- Fix OF node reference leak (Krzysztof Wilczyński)
Microchip PolarFlare PCIe controller driver:
- Add Microchip PolarFire PCIe controller driver (Daire McNamara)
Qualcomm PCIe controller driver:
- Use PHY_REFCLK_USE_PAD only for ipq8064 (Ansuel Smith)
- Add support for ddrss_sf_tbu clock for sm8250 (Dmitry Baryshkov)
Renesas R-Car PCIe controller driver:
- Drop PCIE_RCAR config option (Lad Prabhakar)
- Always allocate MSI addresses in 32bit space (Marek Vasut)
Rockchip PCIe controller driver:
- Add FriendlyARM NanoPi M4B DT binding (Chen-Yu Tsai)
- Make 'ep-gpios' DT property optional (Chen-Yu Tsai)
Synopsys DesignWare PCIe controller driver:
- Work around ECRC configuration hardware defect (Vidya Sagar)
- Drop support for config space in DT 'ranges' (Rob Herring)
- Change size to u64 for EP outbound iATU (Shradha Todi)
- Add upper limit address for outbound iATU (Shradha Todi)
- Make dw_pcie ops optional (Jisheng Zhang)
- Remove unnecessary dw_pcie_ops from al driver (Jisheng Zhang)
Xilinx Versal CPM PCIe controller driver:
- Fix OF node reference leak (Pan Bian)
Miscellaneous:
- Remove tango host controller driver (Arnd Bergmann)
- Remove IRQ handler & data together (altera-msi, brcmstb, dwc)
(Martin Kaiser)
- Fix xgene-msi race in installing chained IRQ handler (Martin
Kaiser)
- Apply CONFIG_PCI_DEBUG to entire drivers/pci hierarchy (Junhao He)
- Fix pci-bridge-emul array overruns (Russell King)
- Remove obsolete uses of WARN_ON(in_interrupt()) (Sebastian Andrzej
Siewior)"
* tag 'pci-v5.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (69 commits)
PCI: qcom: Use PHY_REFCLK_USE_PAD only for ipq8064
PCI: qcom: Add support for ddrss_sf_tbu clock
dt-bindings: PCI: qcom: Document ddrss_sf_tbu clock for sm8250
PCI: al: Remove useless dw_pcie_ops
PCI: dwc: Don't assume the ops in dw_pcie always exist
PCI: dwc: Add upper limit address for outbound iATU
PCI: dwc: Change size to u64 for EP outbound iATU
PCI: dwc: Drop support for config space in 'ranges'
PCI: layerscape: Convert to builtin_platform_driver()
PCI: layerscape: Add LX2160A rev2 EP mode support
dt-bindings: PCI: layerscape: Add LX2160A rev2 compatible strings
PCI: dwc: Work around ECRC configuration issue
PCI/portdrv: Report reset for frozen channel
PCI/AER: Specify the type of Port that was reset
PCI/ERR: Retain status from error notification
PCI/AER: Clear AER status from Root Port when resetting Downstream Port
PCI/ERR: Clear status of the reporting device
dt-bindings: arm: rockchip: Add FriendlyARM NanoPi M4B
PCI: rockchip: Make 'ep-gpios' DT property optional
Documentation: PCI: Add PCI endpoint NTB function user guide
...
Kmemleak reports:
unreferenced object 0xc328de40 (size 64):
comm "kworker/1:1", pid 21, jiffies 4294938212 (age 1484.670s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 e0 d8 fc eb 00 00 00 00 ................
00 00 10 fe 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ad758d10>] pci_register_io_range+0x3c/0x80
[<2c7f139e>] of_pci_range_to_resource+0x48/0xc0
[<f079ecc8>] devm_of_pci_get_host_bridge_resources.constprop.0+0x2ac/0x3ac
[<e999753b>] devm_of_pci_bridge_init+0x60/0x1b8
[<a895b229>] devm_pci_alloc_host_bridge+0x54/0x64
[<e451ddb0>] rcar_pcie_probe+0x2c/0x644
In case a PCI host driver's probe is deferred, the same I/O range may be
allocated again, and be ignored, causing a memory leak.
Fix this by (a) letting logic_pio_register_range() return -EEXIST if the
passed range already exists, so pci_register_io_range() will free it, and
by (b) making pci_register_io_range() not consider -EEXIST an error
condition.
Link: https://lore.kernel.org/r/20210202100332.829047-1-geert+renesas@glider.be
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
RX 5600 XT Pulse advertises support for BAR 0 being 256MB, 512MB,
or 1GB, but it also supports 2GB, 4GB, and 8GB. Add a rebar
size quirk so that the BAR 0 is big enough to cover complete VARM.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20210107175017.15893-5-nirmoy.das@amd.com
Users of pci_resize_resource() need a way to calculate BAR size
from desired bytes. Add a helper function and export it so that
modular drivers can use it.
Signed-off-by: Darren Salt <devspam@moreofthesa.me.uk>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20210107175017.15893-3-nirmoy.das@amd.com
- Save/restore Precision Time Measurement Capability for suspend/resume
(David E. Box)
- Disable PTM during suspend to save power (David E. Box)
* pci/ptm:
PCI: Disable PTM during suspend to save power
PCI/PTM: Save/restore Precision Time Measurement Capability for suspend/resume
- Add sysfs attribute for device power state (Maximilian Luz)
- Rename pci_wakeup_bus() to pci_resume_bus() (Mika Westerberg)
- Do not generate wakeup event when runtime resuming bus (Mika Westerberg)
* pci/pm:
PCI/PM: Do not generate wakeup event when runtime resuming device
PCI/PM: Rename pci_wakeup_bus() to pci_resume_bus()
PCI: Add sysfs attribute for device power state
- Decode PCIe 64 GT/s link speed (Gustavo Pimentel)
- De-duplicate Device IDs in the driver dynamic IDs list (Zhenzhong Duan)
- Return u8 from pci_find_capability() and similar (Puranjay Mohan)
- Return u16 from pci_find_ext_capability() and similar (Bjorn Helgaas)
- Include both device and resource name in config space resources
(Alexander Lobakin)
- Fix ACPI companion lookup for device 0 on the root bus (Rafael J.
Wysocki)
* pci/enumeration:
PCI/ACPI: Fix companion lookup for device 0 on the root bus
PCI: Keep both device and resource name for config space remaps
PCI: Return u16 from pci_find_ext_capability() and similar
PCI: Return u8 from pci_find_capability() and similar
PCI: Avoid duplicate IDs in driver dynamic IDs list
PCI: Move pci_match_device() ahead of new_id_store()
PCI: Decode PCIe 64 GT/s link speed
Follow the rule taken in commit 35bd8c07db ("devres: keep both device
name and resource name in pretty name") and keep both device and resource
names while requesting memory regions for PCI config space to prettify e.g.
/proc/iomem output:
Before (DWC Host Controller):
18b00000-18b01fff : dbi
18b10000-18b11fff : config
18b20000-18b21fff : dbi
18b30000-18b31fff : config
After:
18b00000-18b01fff : 18b00000.pci dbi
18b10000-18b11fff : 18b00000.pci config
18b20000-18b21fff : 18b20000.pci dbi
18b30000-18b31fff : 18b20000.pci config
Link: https://lore.kernel.org/r/WbKfdybjZ6xNIUjcC5oC8NcuLqrJfkxQAlnO80ag@cp3-web-020.plabs.ch
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There are systems (for example, Intel based mobile platforms since Coffee
Lake) where the power drawn while suspended can be significantly reduced by
disabling Precision Time Measurement (PTM) on PCIe root ports as this
allows the port to enter a lower-power PM state and the SoC to reach a
lower-power idle state. To save this power, disable the PTM feature on root
ports during pci_prepare_to_sleep() and pci_finish_runtime_suspend(). The
feature will be returned to its previous state during restore and error
recovery.
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=209361
Link: https://lore.kernel.org/r/20201207223951.19667-2-david.e.box@linux.intel.com
Reported-by: Len Brown <len.brown@intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI subsystem does not currently save and restore the configuration
space for the Precision Time Measurement (PTM) Extended Capability leading
to the possibility of the feature returning disabled on S3 resume. This
has been observed on Intel Coffee Lake desktops. Add save/restore of the
PTM control register. This saves the PTM Enable, Root Select, and Effective
Granularity bits.
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20201207223951.19667-1-david.e.box@linux.intel.com
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Drivers like ehci_hcd and xhci_hcd use pci_set_mwi() and emit an annnoying
message like the following that results in user questions whether something
is broken:
xhci_hcd 0000:00:15.0: cache line size of 64 is not supported
Root cause of the message is that on several chips the Cache Line Size
register is hard-wired to 0.
Change this message to debug level; an interested caller can still inform
the user (if deemed helpful) based on the return code.
Link: https://lore.kernel.org/r/be1ed3a2-98b9-ee1d-20b8-477f3d93961d@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When a PCI bridge is runtime resumed from D3cold, we resume any downstream
devices as well. Previously, we also generated a wakeup event for each
device even though this is not a wakeup signal coming from the hardware.
Normally this does not cause problems but when combined with
/sys/power/wakeup_count like using the steps below:
# count=$(cat /sys/power/wakeup_count)
# echo $count > /sys/power/wakeup_count
# echo mem > /sys/power/state
The system suspend cycle might fail at this point if a PCI bridge that was
runtime suspended (D3cold) was runtime resumed for any reason. The runtime
resume calls pci_resume_bus(), which generates a wakeup event and increases
wakeup_count.
Since this is not a real wakeup event, remove the call to
pci_wakeup_event() from pci_resume_one().
[bhelgaas: reorder, commit log]
Link: https://lore.kernel.org/r/20201125090733.77782-1-mika.westerberg@linux.intel.com
Reported-by: Utkarsh Patel <utkarsh.h.patel@intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A "wakeup" is a signal from a device telling the system that the device or
the whole system should be awakened and made active. PCI devices are made
active by "resuming" them.
pci_wakeup_bus() is not involved with the wakeup signal; it *resumes*
devices on a bus (possibly in response to a wakeup signal, but that's at a
higher level).
Rename pci_wakeup_bus() to pci_resume_bus() to better reflect what it does.
No functional change intended.
[bhelgaas: commit log, reorder before removal of pci_wakeup_event()]
Link: https://lore.kernel.org/r/20201125090733.77782-2-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PCI Express Extended Capabilities are in config space between offsets 256
and 4K. These offsets all fit in 16 bits.
Change the return type of pci_find_ext_capability() and supporting
functions from int to u16 to match the specification. Many callers use
"int", which is fine, but there's no need to store more than a u16.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCI Capabilities are linked in a list that must appear in the first 256
bytes of config space. Each capabilities list pointer is 8 bits.
Change the return type of pci_find_capability() and supporting functions
from int to u8 to match the specification.
[bhelgaas: change other related interfaces, fix HyperTransport typos]
Link: https://lore.kernel.org/r/20201129164626.12887-1-puranjay12@gmail.com
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The shift of 1 by align_order is evaluated using 32 bit arithmetic and the
result is assigned to a resource_size_t type variable that is a 64 bit
unsigned integer on 64 bit platforms. Fix an overflow before widening issue
by making the 1 a ULL.
Addresses-Coverity: ("Unintentional integer overflow")
Fixes: 32a9a682be ("PCI: allow assignment of memory resources with a specified alignment")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
32-bit BARs are limited to 2GB size (2^31). By extension, I assume 64-bit
BARs are limited to 2^63 bytes. Limit the alignment requested by the
"pci=resource_alignment=" command-line parameter to 2^63.
Link: https://lore.kernel.org/r/20201007123045.GS4282@kadam
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously ASPM L1 Substates control registers (CTL1 and CTL2) weren't
saved and restored during suspend/resume leading to L1 Substates
configuration being lost post-resume.
Save the L1 Substates control registers so that the configuration is
retained post-resume.
Link: https://lore.kernel.org/r/20201024190442.871-1-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some devices support ACS functionality even though they don't have a
spec-compliant ACS Capability; pci_enable_acs() has a quirk mechanism to
handle them.
We want to enable ACS whenever possible, but 52fbf5bdee ("PCI: Cache ACS
capability offset in device") inadvertently broke this by calling
pci_enable_acs() only if we find an ACS Capability.
This resulted in ACS not being enabled for these non-compliant devices,
which means devices can't be separated into different IOMMU groups, which
in turn means we may not be able to pass those devices through to VMs, as
reported by Boris V:
https://lore.kernel.org/r/74aeea93-8a46-5f5a-343c-790d4c655da3@bstnet.org
Fixes: 52fbf5bdee ("PCI: Cache ACS capability offset in device")
Link: https://lore.kernel.org/r/20201028231545.4116866-1-rajatja@google.com
Reported-by: Boris V <borisvk@bstnet.org>
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
- Remove unnecessary #includes (Gustavo Pimentel)
- Fix intel_mid_pci.c build error when !CONFIG_ACPI (Randy Dunlap)
- Use scnprintf(), not snprintf(), in sysfs "show" functions (Krzysztof
Wilczyński)
- Simplify pci-pf-stub by using module_pci_driver() (Liu Shixin)
- Print IRQ used by Link Bandwidth Notification (Dongdong Liu)
- Update sysfs mmap-related #ifdef comments (Clint Sbisa)
- Simplify pci_dev_reset_slot_function() (Lukas Wunner)
- Use "NULL" instead of "0" to fix sparse warnings (Gustavo Pimentel)
- Simplify bool comparisons (Krzysztof Wilczyński)
- Drop double zeroing for P2PDMA sg_init_table() (Julia Lawall)
* pci/misc:
PCI: v3-semi: Remove unneeded break
PCI/P2PDMA: Drop double zeroing for sg_init_table()
PCI: Simplify bool comparisons
PCI: endpoint: Use "NULL" instead of "0" as a NULL pointer
PCI: Simplify pci_dev_reset_slot_function()
PCI: Update mmap-related #ifdef comments
PCI/LINK: Print IRQ number used by port
PCI/IOV: Simplify pci-pf-stub with module_pci_driver()
PCI: Use scnprintf(), not snprintf(), in sysfs "show" functions
x86/PCI: Fix intel_mid_pci.c build error when ACPI is not enabled
PCI: Remove unnecessary header includes
- Use for_each_child_of_node() and for_each_node_by_name() instead of
open-coding them (Qinglang Miao)
- Reduce pciehp noisiness on hot removal (Lukas Wunner)
- Remove unused assignment in shpchp (Krzysztof Wilczyński)
* pci/hotplug:
PCI: shpchp: Remove unused 'rc' assignment
PCI: pciehp: Reduce noisiness on hot removal
PCI: rpadlpar: Use for_each_child_of_node() and for_each_node_by_name()
- Tone down message about missing optional MCFG (Jeremy Linton)
- Add schedule point in pci_read_config() (Jiang Biao)
- Add Ampere Altra SOC MCFG quirk (Tuan Phan)
- Add Kconfig options for MPS/MRRS strategy (Jim Quinlan)
* pci/enumeration:
PCI: Add Kconfig options for MPS/MRRS strategy
PCI/ACPI: Add Ampere Altra SOC MCFG quirk
PCI: Add schedule point in pci_read_config()
PCI/ACPI: Tone down missing MCFG message
Add Kconfig options for changing the default pcie_bus_config, i.e., the
strategy for configuration MPS and MRRS, in the same manner as the
CONFIG_PCIEASPM_XXXX choice. The pci_bus_config setting may still be
overridden by kernel command-line parameters, e.g.,
"pci=pcie_bus_tune_off".
[bhelgaas: depend on EXPERT, tweak help texts]
Link: https://lore.kernel.org/r/20200928194651.5393-2-james.quinlan@broadcom.com
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This reverts commit 7e24bc347e.
7e24bc347e was based on PCIe r5.0, sec 5.9, which claims we need a 200 ms
delay when transitioning to or from D2. However, sec 5.3.1.3 states the
delay as 200 μs (microseconds), as does the table in PCIe r4.0, sec 5.9.1.
This looks like a typo in the r5.0 spec, so revert back to a 200 μs delay
instead of a 200 ms delay.
Fixes: 7e24bc347e ("PCI/PM: Apply D2 delay as milliseconds, not microseconds")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Take care about Coccinelle warnings:
drivers/pci/pci.c:6008:6-12: WARNING: Comparison to bool
drivers/pci/pci.c:6024:7-13: WARNING: Comparison to bool
No change to functionality intended.
Link: https://lore.kernel.org/r/20200925224555.1752460-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCI devices support two variants of the D3 power state: D3hot (main power
present) D3cold (main power removed). Previously struct pci_dev contained:
unsigned int d3_delay; /* D3->D0 transition time in ms */
unsigned int d3cold_delay; /* D3cold->D0 transition time in ms */
"d3_delay" refers specifically to the D3hot state. Rename it to
"d3hot_delay" to avoid ambiguity and align with the ACPI "_DSM for
Specifying Device Readiness Durations" in the PCI Firmware spec r3.2,
sec 4.6.9.
There is no change to the functionality.
Link: https://lore.kernel.org/r/20200730210848.1578826-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_dev_reset_slot_function() refuses to reset a hotplug slot if it is
shared by multiple pci_devs. That's the case if and only if the slot is
occupied by a multifunction device.
Simplify the function to check the device's multifunction flag instead
of iterating over the devices on the bus. (Iterating over the devices
requires holding pci_bus_sem, which the function erroneously does not
acquire.)
Link: https://lore.kernel.org/r/c6aab5af096f7b1b3db57f6335cebba8f0fcca89.1595330431.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
When a PCIe card is hot-removed, the Presence Detect State and Data Link
Layer Link Active bits often do not clear simultaneously. I've seen delays
of up to 244 msec between the two events with Thunderbolt.
After pciehp has brought down the slot in response to the first event, the
other bit may still be set. It's not discernible whether it's set because
a new card is already in the slot or if it will soon clear. So pciehp
tries to bring up the slot and in the latter case fails with a bunch of
messages, some of them at KERN_ERR severity. If the slot is no longer
occupied, the messages are false positives and annoy users.
Stuart Hayes reports the following splat on hot removal:
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
KERN_INFO pcieport 0000:3c:06.0: pciehp: Timeout waiting for Presence Detect
KERN_ERR pcieport 0000:3c:06.0: pciehp: link training error: status 0x0001
KERN_ERR pcieport 0000:3c:06.0: pciehp: Failed to check link status
Dongdong Liu complains about a similar splat:
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
KERN_INFO iommu: Removing device 0000:87:00.0 from group 12
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
KERN_INFO pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
KERN_ERR pciehp 0000:80:10.0:pcie004: Failed to check link status
Users are particularly irritated to see a bringup attempt even though the
slot was explicitly brought down via sysfs. In a perfect world, we could
avoid this by setting Link Disable on slot bringdown and re-enabling it
upon a Presence Detect State change. In reality however, there are broken
hotplug ports which hardwire Presence Detect to zero, see 80696f9914
("PCI: pciehp: Tolerate Presence Detect hardwired to zero"). Conversely,
PCIe r1.0 hotplug ports hardwire Link Active to zero because Link Active
Reporting wasn't specified before PCIe r1.1. On unplug, some ports first
clear Presence then Link (see Stuart Hayes' splat) whereas others use the
inverse order (see Dongdong Liu's splat). To top it off, there are hotplug
ports which flap the Presence and Link bits on slot bringup, see
6c35a1ac3d ("PCI: pciehp: Tolerate initially unstable link").
pciehp is designed to work with all of these variants. Surplus attempts at
slot bringup are a lesser evil than not being able to bring up slots at
all. Although we could try to perfect the behavior for specific hotplug
controllers, we'd risk breaking others or increasing code complexity.
But we can certainly minimize annoyance by emitting only a single message
with KERN_INFO severity if bringup is unsuccessful:
* Drop the "Timeout waiting for Presence Detect" message in
pcie_wait_for_presence(). The sole caller of that function,
pciehp_check_link_status(), ignores the timeout and carries on. It emits
error messages of its own and I don't think this particular message adds
much value.
* There's a single error condition in pciehp_check_link_status() which
does not emit a message. Adding one allows dropping the "Failed to check
link status" message emitted by board_added() if
pciehp_check_link_status() returns a non-zero integer.
* Tone down all messages in pciehp_check_link_status() to KERN_INFO
severity and rephrase them to look as innocuous as possible. To this
end, move the message emitted by pcie_wait_for_link_delay() to its
callers.
As a result, Stuart Hayes' splat becomes:
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Cannot train link: status 0x0001
Dongdong Liu's splat becomes:
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): No link
The messages now merely serve as information that presence or link bits
were set a little longer than expected. Bringup failures which are not
false positives are still reported, albeit no longer at KERN_ERR severity.
Link: https://lore.kernel.org/linux-pci/20200310182100.102987-1-stuart.w.hayes@gmail.com/
Link: https://lore.kernel.org/linux-pci/1547649064-19019-1-git-send-email-liudongdong3@huawei.com/
Link: https://lore.kernel.org/r/b45e46fd8a6aa6930aaac9d7718c2e4b787a4e5e.1595935071.git.lukas@wunner.de
Reported-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Reported-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Translation Blocking is a required feature for Downstream Ports (Root
Ports or Switch Downstream Ports) that implement ACS. When enabled, the
Port checks the Address Type (AT) of each upstream Memory Request it
receives.
The default AT (00b) means "untranslated" and the IOMMU can decide whether
to treat the address as I/O virtual or physical.
If AT is not the default, i.e., if the Memory Request contains an
already-translated (physical) address, the Port blocks the request and
reports an ACS error.
When enabling ACS, enable Translation Blocking for external-facing ports
and untrusted (external) devices. This is to help prevent attacks from
external devices that initiate DMA with physical addresses that bypass the
IOMMU.
[bhelgaas: commit log, simplify setting bit and drop warning; TB is
required for Downstream Ports with ACS, so we should never see the warning]
Link: https://lore.kernel.org/r/20200707224604.3737893-4-rajatja@google.com
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
- Use pci_channel_state_t instead of enum pci_channel_state (Luc Van
Oostenryck)
- Simplify __aer_print_error() (Bjorn Helgaas)
- Log AER correctable errors as warning, not error (Matt Jolly)
- Rename pci_aer_clear_device_status() to pcie_clear_device_status() (Bjorn
Helgaas)
- Clear PCIe Device Status errors only if OS owns AER (Jonathan Cameron)
* pci/error:
PCI/ERR: Clear PCIe Device Status errors only if OS owns AER
PCI/ERR: Rename pci_aer_clear_device_status() to pcie_clear_device_status()
PCI/AER: Log correctable errors as warning, not error
PCI/AER: Simplify __aer_print_error()
PCI: Use 'pci_channel_state_t' instead of 'enum pci_channel_state'
pci_aer_clear_device_status() clears the error bits in the PCIe Device
Status Register (PCI_EXP_DEVSTA). Every PCIe device has this register,
regardless of whether it supports AER.
Rename pci_aer_clear_device_status() to pcie_clear_device_status() to make
clear that it is PCIe-specific but not AER-specific. Move it to
drivers/pci/pci.c, again since it's not AER-specific. No functional change
intended.
Link: https://lore.kernel.org/r/20200717195619.766662-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently the ACS capability is being looked up at a number of places. Read
and store it once at enumeration so that it can be used by all later. No
functional change intended.
Link: https://lore.kernel.org/r/20200707224604.3737893-2-rajatja@google.com
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move pci_enable_acs() and dependencies further up in the source code to
avoid having to forward declare it when we make it static in near future.
No functional changes intended.
Link: https://lore.kernel.org/r/20200707224604.3737893-1-rajatja@google.com
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI config accessors (pci_read_config_word(), et al) return
PCIBIOS_SUCCESSFUL (zero) or positive error values like
PCIBIOS_FUNC_NOT_SUPPORTED.
The PCIe capability accessors (pcie_capability_read_word(), et al)
similarly return PCIBIOS errors, but some callers assume they return
generic errno values like -EINVAL.
For example, the Myri-10G probe function returns a positive PCIBIOS error
if the pcie_capability_clear_and_set_word() in pcie_set_readrq() fails:
myri10ge_probe
status = pcie_set_readrq
return pcie_capability_clear_and_set_word
if (status)
return status
A positive return from a PCI driver probe function would cause a "Driver
probe function unexpectedly returned" warning from local_pci_probe()
instead of the desired probe failure.
Convert PCIBIOS errors to generic errno for all callers of:
pcie_capability_read_word
pcie_capability_read_dword
pcie_capability_write_word
pcie_capability_write_dword
pcie_capability_set_word
pcie_capability_set_dword
pcie_capability_clear_word
pcie_capability_clear_dword
pcie_capability_clear_and_set_word
pcie_capability_clear_and_set_dword
that check the return code for anything other than zero.
[bhelgaas: commit log, squash together]
Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
Link: https://lore.kernel.org/r/20200615073225.24061-1-refactormyself@gmail.com
Signed-off-by: Bolarinwa Olayemi Saheed <refactormyself@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
- Check .bridge_d3() hook for NULL before calling it (Bjorn Helgaas)
- Disable PME# for Pericom OHCI/UHCI USB controllers because it's
not reliably asserted on USB hotplug (Kai-Heng Feng)
- Assume ports without DLL Link Active train links in 100 ms to work
around Thunderbolt bridge defects (Mika Westerberg)
* pci/pm:
PCI/PM: Assume ports without DLL Link Active train links in 100 ms
PCI/PM: Adjust pcie_wait_for_link_delay() for caller delay
PCI: Avoid Pericom USB controller OHCI/EHCI PME# defect
serial: 8250_pci: Move Pericom IDs to pci_ids.h
PCI/PM: Call .bridge_d3() hook only if non-NULL
Kai-Heng Feng reported that it takes a long time (> 1 s) to resume
Thunderbolt-connected devices from both runtime suspend and system sleep
(s2idle).
This was because some Downstream Ports that support > 5 GT/s do not also
support Data Link Layer Link Active reporting. Per PCIe r5.0 sec 6.6.1:
With a Downstream Port that supports Link speeds greater than 5.0 GT/s,
software must wait a minimum of 100 ms after Link training completes
before sending a Configuration Request to the device immediately below
that Port. Software can determine when Link training completes by polling
the Data Link Layer Link Active bit or by setting up an associated
interrupt (see Section 6.7.3.3).
Sec 7.5.3.6 requires such Ports to support DLL Link Active reporting, but
at least the Intel JHL6240 Thunderbolt 3 Bridge [8086:15c0] and the Intel
JHL7540 Thunderbolt 3 Bridge [8086:15ea] do not.
Previously we tried to wait for Link training to complete, but since there
was no DLL Link Active reporting, all we could do was wait the worst-case
1000 ms, then another 100 ms.
Instead of using the supported speeds to determine whether to wait for Link
training, check whether the port supports DLL Link Active reporting. The
Ports in question do not, so we'll wait only the 100 ms required for Ports
that support Link speeds <= 5 GT/s.
This of course assumes these Ports always train the Link within 100 ms even
if they are operating at > 5 GT/s, which is not required by the spec.
[bhelgaas: commit log, comment]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206837
Link: https://lore.kernel.org/r/20200514133043.27429-1-mika.westerberg@linux.intel.com
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The caller of pcie_wait_for_link_delay() specifies the time to wait after
the link becomes active. When the downstream port doesn't support link
active reporting, obviously we can't tell when the link becomes active, so
we waited the worst-case time (1000 ms) plus 100 ms, ignoring the delay
from the caller.
Instead, wait for 1000 ms + the delay from the caller.
Fixes: 4827d63891 ("PCI/PM: Add pcie_wait_for_link_delay()")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously we used pcie_find_root_port() to find a Root Port from a PCIe
device and pci_find_pcie_root_port() to find a Root Port from a
Conventional PCI device.
Unify the two functions and use pcie_find_root_port() to find a Root Port
from either a Conventional PCI device or a PCIe device. Then there is no
need to distinguish the type of the device.
Link: https://lore.kernel.org/r/1589019568-5216-1-git-send-email-yangyicong@hisilicon.com
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # thunderbolt
The current codebase makes use of the zero-length array language extension
to the C90 standard, but the preferred mechanism to declare variable-length
types such as these as a flexible array member [1][2], introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that dynamic memory allocations won't be affected by this
change:
Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero. [1]
sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type [1]. There are some instances of code in which
the sizeof() operator is being incorrectly/erroneously applied to
zero-length arrays, and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also help
to get completely rid of those sorts of issues.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
Link: https://lore.kernel.org/r/20200507190544.GA15633@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
26ad34d510 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports") added
the struct pci_platform_pm_ops.bridge_d3() function pointer and
platform_pci_bridge_d3() to use it.
The .bridge_d3() op is implemented by acpi_pci_platform_pm, but not by
mid_pci_platform_pm. We don't expect platform_pci_bridge_d3() to be called
on Intel MID platforms, but nothing in the code itself would prevent that.
Check the .bridge_d3() pointer for NULL before calling it.
Fixes: 26ad34d510 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl6GTQMUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vy3PhAAmqpYBRobOsG8QbmKDjoJEFtkqdvD
z6+4zf/R+hF11RyXjMDwihIe8d+tkQ4eAaYu6Oh5PrTyanz0G0PgeCrivZeytULk
thqQIWzDQMVA5vN/2/Vy8s5s+3HzP8z/MZOFScJ7+xA1MndXptPRTNmFUbjx+GAv
x8/pTp0u9AF6m7itX65DxXvwkzjWamt+Ar4Yx2IcuKAU/M5RtfuZO3PpDnqn7/wk
JFlkRoYeFB6qNnnkPdeyPHl9dALhuhzgdTyklQEnKVW3nf3xThYDhcEwdh6kBQgl
0dH8lL5LXy7PKGN8RES4wB0Vqndw/HlsCF5O4wkkfItbnbJxGJtS139e5973m0ud
sgWvF4yJAT2jCKhIeNz34sePQJMyWALhv0XzZCsJ0YeGHsrV1jrHELkwUT1+eIsT
3UV0iZ6aL06zQJDyKUbbIcQzEQ/wwBC+x9VgsyL54K1quCQZ1N1Nl/dvrb4cRG9m
m9EhJK/brDf4c0uFlOmMTSxV1t5J+z6ZSQnh1ShD/o5yBsxqN6q5brDT6LEs+jbM
LsIkA18jJOd4OyiDs98YiFKvIfFQbQ0LEBQpJwhF0snvfBFMMbUYN/T/NYneWON/
F0TpkFoP7PXDuq55iNaLdnObfzrpC9kdzUyWvePUvjxIl55bkf+/qtUny+H48t4L
dNggvW052d7BHes=
=deWu
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Revert sysfs "rescan" renames that broke apps (Kelsey Skunberg)
- Add more 32 GT/s link speed decoding and improve the implementation
(Yicong Yang)
Resource management:
- Add support for sizing programmable host bridge apertures and fix a
related alpha Nautilus regression (Ivan Kokshaysky)
Interrupts:
- Add boot interrupt quirk mechanism for Xeon chipsets and document
boot interrupts (Sean V Kelley)
PCIe native device hotplug:
- When possible, disable in-band presence detect and use PDS
(Alexandru Gagniuc)
- Add DMI table for devices that don't use in-band presence detection
but don't advertise that correctly (Stuart Hayes)
- Fix hang when powering slots up/down via sysfs (Lukas Wunner)
- Fix an MSI interrupt race (Stuart Hayes)
Virtualization:
- Add ACS quirks for Zhaoxin devices (Raymond Pang)
Error handling:
- Add Error Disconnect Recover (EDR) support so firmware can report
devices disconnected via DPC and we can try to recover (Kuppuswamy
Sathyanarayanan)
Peer-to-peer DMA:
- Add Intel Sky Lake-E Root Ports B, C, D to the whitelist (Andrew
Maier)
ASPM:
- Reduce severity of common clock config message (Chris Packham)
- Clear the correct bits when enabling L1 substates, so we don't go
to the wrong state (Yicong Yang)
Endpoint framework:
- Replace EPF linkup ops with notifier call chain and improve locking
(Kishon Vijay Abraham I)
- Fix concurrent memory allocation in OB address region (Kishon Vijay
Abraham I)
- Move PF function number assignment to EPC core to support multiple
function creation methods (Kishon Vijay Abraham I)
- Fix issue with clearing configfs "start" entry (Kunihiko Hayashi)
- Fix issue with endpoint MSI-X ignoring BAR Indicator and Table
Offset (Kishon Vijay Abraham I)
- Add support for testing DMA transfers (Kishon Vijay Abraham I)
- Add support for testing > 10 endpoint devices (Kishon Vijay Abraham I)
- Add support for tests to clear IRQ (Kishon Vijay Abraham I)
- Add common DT schema for endpoint controllers (Kishon Vijay Abraham I)
Amlogic Meson PCIe controller driver:
- Add DT bindings for AXG PCIe PHY, shared MIPI/PCIe analog PHY (Remi
Pommarel)
- Add Amlogic AXG PCIe PHY, AXG MIPI/PCIe analog PHY drivers (Remi
Pommarel)
Cadence PCIe controller driver:
- Add Root Complex/Endpoint DT schema for Cadence PCIe (Kishon Vijay
Abraham I)
Intel VMD host bridge driver:
- Add two VMD Device IDs that require bus restriction mode (Sushma
Kalakota)
Mobiveil PCIe controller driver:
- Refactor and modularize mobiveil driver (Hou Zhiqiang)
- Add support for Mobiveil GPEX Gen4 host (Hou Zhiqiang)
Microsoft Hyper-V host bridge driver:
- Add support for Hyper-V PCI protocol version 1.3 and
PCI_BUS_RELATIONS2 (Long Li)
- Refactor to prepare for virtual PCI on non-x86 architectures (Boqun
Feng)
- Fix memory leak in hv_pci_probe()'s error path (Dexuan Cui)
NVIDIA Tegra PCIe controller driver:
- Use pci_parse_request_of_pci_ranges() (Rob Herring)
- Add support for endpoint mode and related DT updates (Vidya Sagar)
- Reduce -EPROBE_DEFER error message log level (Thierry Reding)
Qualcomm PCIe controller driver:
- Restrict class fixup to specific Qualcomm devices (Bjorn Andersson)
Synopsys DesignWare PCIe controller driver:
- Refactor core initialization code for endpoint mode (Vidya Sagar)
- Fix endpoint MSI-X to use correct table address (Kishon Vijay
Abraham I)
TI DRA7xx PCIe controller driver:
- Fix MSI IRQ handling (Vignesh Raghavendra)
TI Keystone PCIe controller driver:
- Allow AM654 endpoint to raise MSI-X interrupt (Kishon Vijay Abraham I)
Miscellaneous:
- Quirk ASMedia XHCI USB to avoid "PME# from D0" defect (Kai-Heng
Feng)
- Use ioremap(), not phys_to_virt(), for platform ROM to fix video
ROM mapping with CONFIG_HIGHMEM (Mikel Rychliski)"
* tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (96 commits)
misc: pci_endpoint_test: remove duplicate macro PCI_ENDPOINT_TEST_STATUS
PCI: tegra: Print -EPROBE_DEFER error message at debug level
misc: pci_endpoint_test: Use full pci-endpoint-test name in request_irq()
misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices
tools: PCI: Add 'e' to clear IRQ
misc: pci_endpoint_test: Add ioctl to clear IRQ
misc: pci_endpoint_test: Avoid using module parameter to determine irqtype
PCI: keystone: Allow AM654 PCIe Endpoint to raise MSI-X interrupt
PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address
PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments
misc: pci_endpoint_test: Add support to get DMA option from userspace
tools: PCI: Add 'd' command line option to support DMA
misc: pci_endpoint_test: Use streaming DMA APIs for buffer allocation
PCI: endpoint: functions/pci-epf-test: Print throughput information
PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data
PCI: pciehp: Fix MSI interrupt race
PCI: pciehp: Fix indefinite wait on sysfs requests
PCI: endpoint: Fix clearing start entry in configfs
PCI: tegra: Add support for PCIe endpoint mode in Tegra194
PCI: sysfs: Revert "rescan" file renames
...
The AER interfaces to clear error status registers were a confusing mess:
- pci_cleanup_aer_uncorrect_error_status() cleared non-fatal errors
from the Uncorrectable Error Status register.
- pci_aer_clear_fatal_status() cleared fatal errors from the
Uncorrectable Error Status register.
- pci_cleanup_aer_error_status_regs() cleared the Root Error Status
register (for Root Ports), the Uncorrectable Error Status register,
and the Correctable Error Status register.
Rename them to make them consistent:
From To
---------------------------------------- -------------------------------
pci_cleanup_aer_uncorrect_error_status() pci_aer_clear_nonfatal_status()
pci_aer_clear_fatal_status() pci_aer_clear_fatal_status()
pci_cleanup_aer_error_status_regs() pci_aer_clear_status()
Since pci_cleanup_aer_error_status_regs() (renamed to
pci_aer_clear_status()) is only used within drivers/pci/, move the
declaration from <linux/aer.h> to drivers/pci/pci.h.
[bhelgaas: commit log, add renames]
Link: https://lore.kernel.org/r/d1310a75dc3d28f7e8da4e99c45fbd3e60fe238e.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add PCIE_LNKCAP2_SLS2SPEED macro for transforming raw Link Capabilities 2
values to the pci_bus_speed. This is next to PCIE_SPEED2MBS_ENC() to make
it easier to update both places when adding support for new speeds.
Link: https://lore.kernel.org/r/1581937984-40353-10-git-send-email-yangyicong@hisilicon.com
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously some PCI speed strings came from pci_speed_string(), some came
from the PCIe-specific PCIE_SPEED2STR(), and some came from a PCIe-specific
switch statement. These methods were inconsistent:
pci_speed_string() PCIE_SPEED2STR() switch
------------------ ---------------- ------
33 MHz PCI
...
2.5 GT/s PCIe 2.5 GT/s 2.5 GT/s
5.0 GT/s PCIe 5 GT/s 5 GT/s
8.0 GT/s PCIe 8 GT/s 8 GT/s
16.0 GT/s PCIe 16 GT/s 16 GT/s
32.0 GT/s PCIe 32 GT/s 32 GT/s
Standardize on pci_speed_string() as the single source of these strings.
Note that this adds ".0" and "PCIe" to some messages, including sysfs
"max_link_speed" files, a brcmstb "link up" message, and the link status
dmesg logging, e.g.,
nvme 0000:01:00.0: 16.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x4 link at 0000:00:01.1 (capable of 31.504 Gb/s with 8.0 GT/s PCIe x4 link)
I think it's better to standardize on a single version of the speed text.
Previously we had strings like this:
/sys/bus/pci/slots/0/cur_bus_speed: 8.0 GT/s PCIe
/sys/bus/pci/slots/0/max_bus_speed: 8.0 GT/s PCIe
/sys/devices/pci0000:00/0000:00:1c.0/current_link_speed: 8 GT/s
/sys/devices/pci0000:00/0000:00:1c.0/max_link_speed: 8 GT/s
This changes the latter two to match the slots files:
/sys/devices/pci0000:00/0000:00:1c.0/current_link_speed: 8.0 GT/s PCIe
/sys/devices/pci0000:00/0000:00:1c.0/max_link_speed: 8.0 GT/s PCIe
Based-on-patch by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Several device drivers read their Device Serial Number from the PCIe
extended config space.
Introduce a new helper function, pci_get_dsn(). This function reads the
eight bytes of the DSN and returns them as a u64. If the capability does not
exist for the device, the function returns 0.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several drivers use the following code sequence:
1. Read PCI_STATUS
2. Mask out non-error bits
3. Action based on error bits set
4. Write back set error bits to clear them
As this is a repeated pattern, add a helper to the PCI core.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>