2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

33509 Commits

Author SHA1 Message Date
Srinivasan Shanmugam
083a0c8d17 drm/amdgpu: Enhance Cleaner Shader Handling in GFX v9.0 Architecture v2
This commit modifies the gfx_v9_0_ring_emit_cleaner_shader function
to use a switch statement for cleaner shader emission based on the
specific GFX IP version.

The function now distinguishes between different IP versions, using
PACKET3_RUN_CLEANER_SHADER_9_0 for the versions 9.0.1, 9.1.0,
9.2.1, 9.2.2, 9.3.0, and 9.4.0, while retaining
PACKET3_RUN_CLEANER_SHADER for version 9.4.2.

v2: Simplify logic (Alex).

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:03:06 -04:00
Srinivasan Shanmugam
8896abcfdd drm/amdgpu: Add PACKET3_RUN_CLEANER_SHADER_9_0 for Cleaner Shader execution
This commit introduces the PACKET3_RUN_CLEANER_SHADER_9_0 definition,
which is a command packet utilized to instruct the GPU to execute the
cleaner shader for the GFX9.0 graphics architecture.

The cleaner shader is a piece of GPU code that is responsible for
clearing or initializing essential GPU resources, such as Local Data
Share (LDS), Vector General Purpose Registers (VGPRs), and Scalar
General Purpose Registers (SGPRs). Properly clearing these resources  is
vital for ensuring data isolation and security between different
workloads executed on the GPU.

When the GPU receives this packet, it fetches and runs the cleaner
shader instructions from the specified location in the packet.  Thus by
preventing data leaks and ensuring that previous job states do not
interfere with subsequent workloads.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:03:02 -04:00
Jesse Zhang
cf93f10101 drm/amd/amdgpu: Fix out of bounds warning in amdgpu_hw_ip_info
Fix an array index out of bounds warning in the DMA IP case of
amdgpu_hw_ip_info() where it was incorrectly checking
adev->gfx.gfx_ring[i].no_user_submission instead of
adev->sdma.instance[i].ring.no_user_submission.

The mismatch caused UBSAN to report an array bounds violation since
it was accessing the GFX ring array with SDMA instance indices.

Fixes: 4310acd446 ("drm/amdgpu: add ring flag for no user submissions")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:02:17 -04:00
Eric Huang
4172b556fd drm/amdkfd: add smi events for process start and end
rocm-smi will be able to show the events for KFD process
start/end, it is the implementation of this feature.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:01:25 -04:00
Lijo Lazar
1d9bff4cf8 drm/amdgpu: Use the right function for hdp flush
There are a few prechecks made before HDP flush like a flush is not
required on APU bare metal. Using hdp callback directly bypasses those
checks. Use amdgpu_device_flush_hdp which takes care of prechecks.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:01:06 -04:00
Ellen Pan
5045c6c698 drm/amdgpu: Direct ret in ras_reset_err_cnt on VF
With adding sriov_vf check, we directly return EOPNOTSUPP in
ras_reset_error_count as we should not do anything on VF to reset RAS error
count.

This also fixes the issue that loading guest driver causes register
violations.

Reviewed-by: Ahmad Rehman <Ahmad.Rehman@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:01:00 -04:00
Lijo Lazar
18a878fd8a drm/amdgpu: Use generic hdp flush function
Except HDP v5.2 all use a common logic for HDP flush. Use a generic
function. HDP v5.2 forces NO_KIQ logic, revisit it later.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:00:56 -04:00
Candice Li
d6b22b1dff drm/amdgpu: Set RAS EEPROM table version to v3 for umc v12_5
Set RAS EEPROM table version to v3 for umc v12_5.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:00:50 -04:00
Jesse.zhang@amd.com
e21e1e8bb8 drm/amdgpu: Enable per-queue reset for SDMA v4.4.2 on IP v9.5.0
Add support for per-queue reset on SDMA v4.4.2 when running with:
1. MEC firmware version 17 or later
2. DPM indicates SDMA reset is supported

v2: Fixed supported firmware versions  (Lijo)

Signed-off-by: Jesse.Zhang <Jesse.zhang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:00:44 -04:00
Srinivasan Shanmugam
e62a8bc5d6 drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5.2/11.5.3 GPUs
Enable the cleaner shader for additional GFX11.5.2/11.5.3 series GPUs to
ensure data isolation among GPU tasks. The cleaner shader is tasked with
clearing the Local Data Store (LDS), Vector General Purpose Registers
(VGPRs), and Scalar General Purpose Registers (SGPRs), which helps avoid
data leakage and guarantees the accuracy of computational results.

This update extends cleaner shader support to GFX11.5.2/11.5.3 GPUs,
previously available for GFX11.0.3. It enhances security by clearing GPU
memory between processes and maintains a consistent GPU state across KGD
and KFD workloads.

Cc: Mario Sopena-Novales <mario.novales@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:00:39 -04:00
Alex Deucher
20c50a9a79 drm/amd/display/dml2: use vzalloc rather than kzalloc
The structures are large and they do not require contiguous
memory so use vzalloc.

Fixes: 70839da636 ("drm/amd/display: Add new DCN401 sources")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4126
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 17:00:02 -04:00
Roman Li
309d11b4bb drm/amd/display: Add htmldocs description for fused_io interface
[Why]
htmldocs build warning: "Function parameter or struct member 'fused_io'
not described in 'amdgpu_display_manager'".

[How]
Add missing description.

Fixes: ce801e5d6c ("drm/amd/display: HDCP Locality check using DMUB Fused IO")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:58:19 -04:00
Alex Deucher
2e0454b730 drm/amdgpu: adjust enforce_isolation handling
Switch from a bool to an enum and allow more options
for enforce isolation.  There are now 3 modes of operation:
- Disabled (0)
- Enabled (serialization and cleaner shader) (1)
- Enabled in legacy mode (no serialization or cleaner shader) (2)
This provides better flexibility for more use cases.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:58:15 -04:00
Alex Deucher
b86fd212f3 drm/amdgpu/mes12: use the device value for enforce isolation
Use the local setting rather than the global parameter.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:58:12 -04:00
Alex Deucher
a7bb01337f drm/amdgpu/mes11: use the device value for enforce isolation
Use the local setting rather than the global parameter.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:58:04 -04:00
David Rosca
0f4dfe86fe drm/amdgpu: Add back JPEG to video caps for carrizo and newer
JPEG is not supported on Vega only.

Fixes: 0a6e7b06bd ("drm/amdgpu: Remove JPEG from vega and carrizo video caps")
Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:56:44 -04:00
Prike Liang
9a218d6f47 drm/amdgpu/gfx12: Implement the GFX12 KCQ pipe reset
Implement the GFX12 KCQ pipe reset, and disable the GFX12
kernel compute queue until the CPFW fully supports it.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:56:07 -04:00
Ce Sun
732c6cefc1 drm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset
Checking hive is more readable.

The following smatch warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6820 amdgpu_pci_slot_reset()
warn: iterator used outside loop: 'tmp_adev'

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:55:55 -04:00
ZhenGuo Yin
39938a8ed9 drm/amdgpu: fix warning of drm_mm_clean
Kernel doorbell BOs needs to be freed before ttm_fini.

Fixes: 54c30d2a8d ("drm/amdgpu: create kernel doorbell pages")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:54:57 -04:00
Kenneth Feng
c770ef1967 drm/amd/amdgpu: disable ASPM in some situations
disable ASPM with some ASICs on some specific platforms.
required from PCIe controller owner.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:54:48 -04:00
Prike Liang
0ec7535f5b drm/amdgpu: remove the duplicated mes queue active state setting
The MES queue deactivation and active status are already set in
mes_userq_unmap|map(), so the caller needn't set the queue_active
bit again.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:54:39 -04:00
Ruili Ji
f0ec5926da amd/amdgpu: Implement VCN queue reset for vcn 4.0.3
Add function for vcn queue reset to make driver to
do fine-grained reset instead of the whole gpu reset.

Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:54:27 -04:00
Masha Grinman
8ef4e99674 drm/amdgpu: Move read of snoop register from guest to host
Guest is reading/writing to snoop register which is a security violation
We moved the code to the host driver
And also added a validation on the guest side to check if it's guest

Signed-off-by: Masha Grinman <Masha.Grinman@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:54:02 -04:00
Mario Limonciello
2aabd44aa8 drm/amd: Forbid suspending into non-default suspend states
On systems that default to 'deep' some userspace software likes
to try to suspend in 'deep' first.  If there is a failure for any
reason (such as -ENOMEM) the failure is ignored and then it will
try to use 's2idle' as a fallback. This fails, but more importantly
it leads to graphical problems.

Forbid this behavior and only allow suspending in the last state
supported by the system.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4093
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250408180957.4027643-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:53:08 -04:00
Christian König
8b2ae7d492 drm/amdgpu: use a dummy owner for sysfs triggered cleaner shaders v4
Otherwise triggering sysfs multiple times without other submissions in
between only runs the shader once.

v2: add some comment
v3: re-add missing cast
v4: squash in semicolon fix

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11 16:52:33 -04:00
Dave Airlie
47271a0cae amd-drm-fixes-6.15-2025-04-09:
amdgpu:
 - MES FW version caching fixes
 - Only use GTT as a fallback if we already have a backing store
 - dma_buf fix
 - IP discovery fix
 - Replay and PSR with VRR fix
 - DC FP fixes
 - eDP fixes
 - KIQ TLB invalidate fix
 - Enable dmem groups support
 - Allow pinning VRAM dma bufs if imports can do P2P
 - Workload profile fixes
 - Prevent possible division by 0 in fan handling
 
 amdkfd:
 - Queue reset fixes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZ/akUAAKCRC93/aFa7yZ
 2HtpAP9ptJj5RKxOcdh8lbiOZN3RGjxsqw6wXk7LUnw3A2npVAD5AXoaTis4S84e
 0NIY7bd1Q1njP2akA4O5AKykkpgVtgw=
 =Ha4H
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-fixes-6.15-2025-04-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.15-2025-04-09:

amdgpu:
- MES FW version caching fixes
- Only use GTT as a fallback if we already have a backing store
- dma_buf fix
- IP discovery fix
- Replay and PSR with VRR fix
- DC FP fixes
- eDP fixes
- KIQ TLB invalidate fix
- Enable dmem groups support
- Allow pinning VRAM dma bufs if imports can do P2P
- Workload profile fixes
- Prevent possible division by 0 in fan handling

amdkfd:
- Queue reset fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250409165238.1180153-1-alexander.deucher@amd.com
2025-04-10 17:14:02 +10:00
Alex Deucher
34779e1446 drm/amdgpu/mes12: optimize MES pipe FW version fetching
Don't fetch it again if we already have it.  It seems the
registers don't reliably have the value at resume in some
cases.

Fixes: 785f0f9fe7 ("drm/amdgpu: Add mes v12_0 ip block support (v4)")
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9e7b08d239)
Cc: stable@vger.kernel.org # 6.12.x
2025-04-09 10:53:11 -04:00
Denis Arefev
7ba88b5ccc drm/amd/pm/smu11: Prevent division by zero
The user can set any speed value.
If speed is greater than UINT_MAX/8, division by zero is possible.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 1e866f1fe5 ("drm/amd/pm: Prevent divide by zero")
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit da7dc714a8)
Cc: stable@vger.kernel.org
2025-04-09 10:53:11 -04:00
Alex Deucher
35a5440832 drm/amdgpu: cancel gfx idle work in device suspend for s0ix
This is normally handled in the gfx IP suspend callbacks, but
for S0ix, those are skipped because we don't want to touch
gfx.  So handle it in device suspend.

Fixes: b9467983b7 ("drm/amdgpu: add dynamic workload profile switching for gfx10")
Fixes: 963537ca23 ("drm/amdgpu: add dynamic workload profile switching for gfx11")
Fixes: 5f95a15495 ("drm/amdgpu: add dynamic workload profile switching for gfx12")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 906ad45167)
Cc: stable@vger.kernel.org
2025-04-09 10:53:11 -04:00
Kenneth Feng
50f29ead1f drm/amd/display: pause the workload setting in dm
Pause the workload setting in dm when doing idle optimization

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b23f81c442)
2025-04-09 10:53:11 -04:00
Alex Deucher
c81a3ceedb drm/amdgpu/pm/swsmu: implement pause workload profile
Add the callback for implementation for swsmu.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 92e511d1ce)
2025-04-09 10:53:11 -04:00
Alex Deucher
7f991dd364 drm/amdgpu/pm: add workload profile pause helper
To be used for display idle optimizations when
we want to pause non-default profiles.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6dafb5d4c7)
2025-04-09 10:53:11 -04:00
Alex Deucher
72801504fd drm/amdgpu/sdma7: add support for disable_kq
When the parameter is set, disable user submissions
to kernel queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:23 -04:00
Alex Deucher
fcf5eb979a drm/amdgpu/sdma6: add support for disable_kq
When the parameter is set, disable user submissions
to kernel queues.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:23 -04:00
Alex Deucher
1d65006fc1 drm/amdgpu/sdma: add flag for tracking disable_kq
For SDMA, we still need kernel queues for paging so
they need to be initialized, but we no not want to
accept submissions from userspace when disable_kq
is set.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:23 -04:00
Alex Deucher
0981e0ef18 drm/amdgpu/gfx12: add support for disable_kq
Plumb in support for disabling kernel queues.

v2: use ring counts per Felix' suggestion
v3: fix stream fault handler, enable EOP interrupts
v4: fix MEC interrupt offset (Sunil)
v5: clean up after removing extra sched.ready settings

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:23 -04:00
Alex Deucher
1e63ebc0d4 drm/amdgpu/gfx11: add support for disable_kq
Plumb in support for disabling kernel queues in
GFX11.  We have to bring up a GFX queue briefly in
order to initialize the clear state.  After that
we can disable it.

v2: use ring counts per Felix' suggestion
v3: fix stream fault handler, enable EOP interrupts
v4: fix MEC interrupt offset (Sunil)

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
1f61fc28b9 drm/amdgpu/mes: make more vmids available when disable_kq=1
If we don't have kernel queues, the vmids can be used by
the MES for user queues.

Acked-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
acdc43f270 drm/amdgpu/mes: update hqd masks when disable_kq is set
Make all resources available to user queues.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Suggested-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
f091fa777b drm/amdgpu/gfx: add generic handling for disable_kq
Add proper checks for disable_kq functionality in
gfx helper functions.  Add special logic for families
that require the clear state setup.

v2: use ring count as per Felix suggestion
v3: fix num_gfx_rings handling in amdgpu_gfx_graphics_queue_acquire()
v4: fix error code (Alex)

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
4310acd446 drm/amdgpu: add ring flag for no user submissions
This would be set by IPs which only accept submissions
from the kernel, not userspace, such as when kernel
queues are disabled. Don't expose the rings to userspace
and reject any submissions in the CS IOCTL.

v2: fix error code (Alex)

Reviewed-by: Sunil Khatri<sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
a96a787d6d drm/amdgpu: add parameter to disable kernel queues
On chips that support user queues, setting this option
will disable kernel queues to be used to validate
user queues without kernel queues.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
cf97de5b54 drm/amdgpu/userq: prevent runtime pm when userqs are active
Similar to KFD, prevent runtime pm while user queues are active.

Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
4ce60dbada drm/amdgpu: store userq_managers in a list in adev
So we can iterate across them when we need to manage
all user queues.

v2: add uq_mgr to adev list in amdgpu_userq_mgr_init

Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
100b6010d7 drm/amdgpu: bump version for user queue IP support query
Add the user queue IP support query to the drm_amdgpu_info_device
query.

Cc: marek.olsak@amd.com
Cc: prike.liang@amd.com
Cc: sunil.khatri@amd.com
Cc: yogesh.mohanmarimuthu@amd.com
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
1af6881263 drm/amdgpu: add UAPI to query if user queues are supported
Add an INFO query to check if user queues are supported.

v2: switch to a mask of IPs (Marek)
v3: move to drm_amdgpu_info_device (Marek)

Cc: marek.olsak@amd.com
Cc: prike.liang@amd.com
Cc: sunil.khatri@amd.com
Cc: yogesh.mohanmarimuthu@amd.com
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
a4a3373da2 drm/amdgpu/gfx12: split userq setup to a separate switch
Add a separate switch statement for the userq callback
assignment so that we can assign the callbacks for each
asic as the firmware becomes available.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Alex Deucher
9983ed9693 drm/amdgpu/gfx11: clean up and consolidate sw_init
With the ME details fixed, we can now consolidate
this state.  Also split out the userq setup into a separate
switch statement so that we can set them per IP version
when the firmwares are ready.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:22 -04:00
Arvind Yadav
32bd8b3ea7 drm/amdgpu: Fix display freezing issue when resizing apps
The display is freezing because the amdgpu_userq_wait_ioctl()
is waiting for a non-user queue fence(specifically, the PT update fence).

RootCause:
The resume_work is initiated by both amdgpu_userq_suspend and
amdgpu_userqueue_ensure_ev_fence at same time. The amdgpu_userq_suspend
signals a dma-fence and subsequently triggers the resume_work, which is
intended to replace the existing fence by creating new dma-fence. However,
following this, the amdgpu_userqueue_ensure_ev_fence schedules another
resume_work that generates a new dma-fence, thereby replacing the one
created by amdgpu_userq_suspend. Consequently, the original fence will
never be signaled.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Alex Deucher
b6f190e623 drm/amdgpu/mes: warn on unexpected pipe numbers
Warn if the number of pipes exceeds what the MES supports.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Alex Deucher
9e2bbba1d5 drm/amdgpu/mes: centralize gfx_hqd mask management
Move it to amdgpu_mes to align with the compute and
sdma hqd masks. No functional change.

v2: rebase on new changes
v3: misc optimizations

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri<sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Alex Deucher
4220d2c7c4 drm/amdgpu: remove is_mes_queue flag
This was leftover from MES bring up when we had MES
user queues in the kernel.  It's no longer used so
remove it.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Alex Deucher
cb17fff3a2 drm/amdgpu/mes: remove unused functions
Leftover from the MES self tests that were removed previously.

Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Alex Deucher
158bfbc72c drm/amdgpu: validate user queue parameters
Make sure these are set properly to ensure compatibility if
we ever update the IOCTL interface.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Arvind Yadav
ad6c120f68 drm/amdgpu: fix the memleak caused by fence not released
Encountering a taint issue during the unloading of gpu_sched
due to the fence not being released/put. In this context,
amdgpu_vm_clear_freed is responsible for creating a job to
update the page table (PT). It allocates kmem_cache for
drm_sched_fence and returns the finished fence associated
with job->base.s_fence. In case of Usermode queue this finished
fence is added to the timeline sync object through
amdgpu_gem_update_bo_mapping, which is utilized by user
space to ensure the completion of the PT update.

[  508.900587] =============================================================================
[  508.900605] BUG drm_sched_fence (Tainted: G                 N): Objects remaining in drm_sched_fence on __kmem_cache_shutdown()
[  508.900617] -----------------------------------------------------------------------------

[  508.900627] Slab 0xffffe0cc04548780 objects=32 used=2 fp=0xffff8ea81521f000 flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
[  508.900645] CPU: 3 UID: 0 PID: 2337 Comm: rmmod Tainted: G                 N 6.12.0+ #1
[  508.900651] Tainted: [N]=TEST
[  508.900653] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F34 06/10/2021
[  508.900656] Call Trace:
[  508.900659]  <TASK>
[  508.900665]  dump_stack_lvl+0x70/0x90
[  508.900674]  dump_stack+0x14/0x20
[  508.900678]  slab_err+0xcb/0x110
[  508.900687]  ? srso_return_thunk+0x5/0x5f
[  508.900692]  ? try_to_grab_pending+0xd3/0x1d0
[  508.900697]  ? srso_return_thunk+0x5/0x5f
[  508.900701]  ? mutex_lock+0x17/0x50
[  508.900708]  __kmem_cache_shutdown+0x144/0x2d0
[  508.900713]  ? flush_rcu_work+0x50/0x60
[  508.900719]  kmem_cache_destroy+0x46/0x1f0
[  508.900728]  drm_sched_fence_slab_fini+0x19/0x970 [gpu_sched]
[  508.900736]  __do_sys_delete_module.constprop.0+0x184/0x320
[  508.900744]  ? srso_return_thunk+0x5/0x5f
[  508.900747]  ? debug_smp_processor_id+0x1b/0x30
[  508.900754]  __x64_sys_delete_module+0x16/0x20
[  508.900758]  x64_sys_call+0xdf/0x20d0
[  508.900763]  do_syscall_64+0x51/0x120
[  508.900769]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

v2: call dma_fence_put in amdgpu_gem_va_update_vm
v3: Addressed review comments from Christian.
    - calling amdgpu_gem_update_timeline_node before switch.
    - puting a dma_fence in case of error or !timeline_syncobj.
v4: Addressed review comments from Christian.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Alex Deucher
ecdb0b32e5 drm/amdgpu/userq: move the header to amdgpu directory
To align with other headers.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Alex Deucher
665de8c947 drm/amdgpu/userq: remove BROKEN from config
This can be enabled now.  We have the firmware checks
in place.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Alex Deucher
5ca4095960 drm/amdgpu: add userq firmware version checks
Currently disabled until the firmwares are officially
released.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Alex Deucher
f36e4876c8 drm/amdgpu/gfx11: fix config guard
s/CONFIG_DRM_AMD_USERQ_GFX/CONFIG_DRM_AMDGPU_NAVI3X_USERQ/

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:21 -04:00
Alex Deucher
c4f42c8d0b drm/amdgpu/Kconfig: fix wording of DRM_AMDGPU_NAVI3X_USERQ
The feature is not navi3x specific at this point.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Alex Deucher
df85baa767 drm/amdgpu: return an error in the userq IOCTL when DRM_AMDGPU_NAVI3X_USERQ=n
I'd swear this was already fixed, but I guess the patch never
landed.  Add it now.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Alex Deucher
2a060b3ae9 drm/amdgpu/userq: handle runtime pm
Take a reference when we create a queue and drop it
when we destroy the queue.  We need to keep the device
active while user queues are active.

v2: squash in fix from Sunil
v3: squash in fix from Prike

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Alex Deucher
29adc5c2dd drm/amdgpu/userq: fix hardcoded uq functions
Use the IP type to look up the userq functions rather
than hardcoding it.

Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Arvind Yadav
f15d4e92f7 drm/amdgpu: Fix display freeze lockup error
A deadlock situation has arised between the userq
signal ioctl and the eviction fence. In this scenario,
the function amdgpu_userq_signal_ioctl() has acquired a reservation
lock on the read/write buffer object (BO) through drm_exec.
Subsequently, it calls amdgpu_userqueue_ensure_ev_fence(),
which is in a waiting for the userq resume work.
Meanwhile, the userq suspend worker has initiated the userq resume
work(amdgpu_userqueue_resume_worker). This userq resume work attempts
to validate the vm->done BO, leading to amdgpu_userqueue_validate_bos
also attempting to reservation lock the same write BO that is already
locked by amdgpu_userq_signal_ioctl.
As a result, the resume work becomes stalled, causing
amdgpu_userqueue_ensure_ev_fence to remain in a waiting state.

Call Trace:
[  242.836469] INFO: task gnome-shel:cs0:1288 blocked for more than 120 seconds.
[  242.836486]       Tainted: G           OE      6.12.0-rc2rebased-oct-24+ #4
[  242.836491] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.836494] task:gnome-shel:cs0  state:D stack:0     pid:1288  tgid:1282  ppid:1180   flags:0x00000002
[  242.836503] Call Trace:
[  242.836508]  <TASK>
[  242.836517]  __schedule+0x3e0/0xb10
[  242.836530]  ? srso_return_thunk+0x5/0x5f
[  242.836541]  schedule+0x31/0x120
[  242.836546]  schedule_timeout+0x150/0x160
[  242.836551]  ? srso_return_thunk+0x5/0x5f
[  242.836555]  ? sysvec_call_function+0x69/0xd0
[  242.836562]  ? srso_return_thunk+0x5/0x5f
[  242.836567]  ? preempt_count_add+0x7f/0xd0
[  242.836577]  __wait_for_common+0x91/0x180
[  242.836582]  ? __pfx_schedule_timeout+0x10/0x10
[  242.836590]  wait_for_completion+0x28/0x30
[  242.836595]  __flush_work+0x16c/0x290
[  242.836602]  ? __pfx_wq_barrier_func+0x10/0x10
[  242.836611]  flush_delayed_work+0x3a/0x60
[  242.836621]  amdgpu_userqueue_ensure_ev_fence+0x2d/0xb0 [amdgpu]
[  242.836966]  amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu]
[  242.837171]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  242.837365]  drm_ioctl_kernel+0xae/0x100 [drm]
[  242.837398]  drm_ioctl+0x2a1/0x500 [drm]
[  242.837420]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  242.837622]  ? srso_return_thunk+0x5/0x5f
[  242.837627]  ? srso_return_thunk+0x5/0x5f
[  242.837630]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
[  242.837635]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[  242.837811]  __x64_sys_ioctl+0x99/0xd0
[  242.837820]  x64_sys_call+0x1209/0x20d0
[  242.837825]  do_syscall_64+0x51/0x120
[  242.837830]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  242.837835] RIP: 0033:0x7f2f33f1a94f
[  242.837838] RSP: 002b:00007f2f24ffea30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  242.837842] RAX: ffffffffffffffda RBX: 00007f2f24ffebd0 RCX: 00007f2f33f1a94f
[  242.837845] RDX: 00007f2f24ffebd0 RSI: 00000000c0306457 RDI: 000000000000000d
[  242.837847] RBP: 00007f2f24ffeab0 R08: 0000000000000000 R09: 0000000000000000
[  242.837849] R10: 00007f2f24ffecd0 R11: 0000000000000246 R12: 00007f2f25000640
[  242.837851] R13: 00000000c0306457 R14: 000000000000000d R15: 00007fff3b39c1e0
[  242.837858]  </TASK>
[  242.837865] INFO: task Xwayland:cs0:1517 blocked for more than 120 seconds.
[  242.837869]       Tainted: G           OE      6.12.0-rc2rebased-oct-24+ #4
[  242.837872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.837874] task:Xwayland:cs0    state:D stack:0     pid:1517  tgid:1338  ppid:1282   flags:0x00004002
[  242.837878] Call Trace:
[  242.837880]  <TASK>
[  242.837883]  __schedule+0x3e0/0xb10
[  242.837890]  schedule+0x31/0x120
[  242.837894]  schedule_preempt_disabled+0x1c/0x30
[  242.837897]  __mutex_lock.constprop.0+0x386/0x6e0
[  242.837902]  ? srso_return_thunk+0x5/0x5f
[  242.837905]  ? __timer_delete_sync+0x81/0xe0
[  242.837911]  __mutex_lock_slowpath+0x13/0x20
[  242.837915]  mutex_lock+0x3b/0x50
[  242.837919]  amdgpu_userqueue_ensure_ev_fence+0x35/0xb0 [amdgpu]
[  242.838138]  amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu]
[  242.838340]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  242.838531]  drm_ioctl_kernel+0xae/0x100 [drm]
[  242.838559]  drm_ioctl+0x2a1/0x500 [drm]
[  242.838580]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  242.838778]  ? srso_return_thunk+0x5/0x5f
[  242.838783]  ? srso_return_thunk+0x5/0x5f
[  242.838786]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
[  242.838791]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[  242.838967]  __x64_sys_ioctl+0x99/0xd0
[  242.838972]  x64_sys_call+0x1209/0x20d0
[  242.838975]  do_syscall_64+0x51/0x120
[  242.838979]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  242.838982] RIP: 0033:0x7f9118b1a94f
[  242.838985] RSP: 002b:00007f910cdff760 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  242.838989] RAX: ffffffffffffffda RBX: 00007f910cdff910 RCX: 00007f9118b1a94f
[  242.838991] RDX: 00007f910cdff910 RSI: 00000000c0306457 RDI: 000000000000000c
[  242.838993] RBP: 00007f910cdff7e0 R08: 0000000000000000 R09: 0000000000000001
[  242.838995] R10: 00007f910cdff9d4 R11: 0000000000000246 R12: 00007f910ce00640
[  242.838997] R13: 00000000c0306457 R14: 000000000000000c R15: 00007fff9dd11d10
[  242.839004]  </TASK>

v2: Addressed review comemnts from Christian.
v3/v4: Addressed review comemnts from Christian.
   - Move drm_exec drm_exec loop after userq fence create.
   - cleanup the newly created userq fence in case of error.
v5 - Addressed review comemnts from Christian.
   - Create a new amdgpu_userq_fence_alloc() function for allocation.
   - Calling dma_fence_put for cleanup procedure.
   - make amdgpu_userq_fence_create() function static.
   - drm_exec_init is called after mutex_unlock.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Arunpravin Paneer Selvam
fc4a85c6b2 drm/amdgpu: Modify the seq64 VM cache policy
The seq64 VM cache policy should be set to UC (Uncached) to
match with userqueue fence address kernel mapped memory's
cache settings.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Arunpravin Paneer Selvam
239a310b49 drm/amdgpu: Fix out-of-bounds issue in user fence
Fix out-of-bounds issue in userq fence create when
accessing the userq xa structure. Added a lock to
protect the race condition.

v2:(Christian)
  - Allocate memory with GFP_ATOMIC.

v3:
  - Moved to 2 xa approach.

v4:(Christian)
  - Lock the xa_for_each blocks and memory allocation part
    as well to make sure that xa is not modified in between
    the 2 xa_for_each blocks.

BUG: KASAN: slab-out-of-bounds in amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[  +0.000006] Call Trace:
[  +0.000005]  <TASK>
[  +0.000005]  dump_stack_lvl+0x6c/0x90
[  +0.000011]  print_report+0xc4/0x5e0
[  +0.000009]  ? srso_return_thunk+0x5/0x5f
[  +0.000008]  ? kasan_complete_mode_report_info+0x26/0x1d0
[  +0.000007]  ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[  +0.000405]  kasan_report+0xdf/0x120
[  +0.000009]  ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[  +0.000405]  __asan_report_store8_noabort+0x17/0x20
[  +0.000007]  amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[  +0.000406]  ? __pfx_amdgpu_userq_fence_create+0x10/0x10 [amdgpu]
[  +0.000408]  ? srso_return_thunk+0x5/0x5f
[  +0.000008]  ? ttm_resource_move_to_lru_tail+0x235/0x4f0 [ttm]
[  +0.000013]  ? srso_return_thunk+0x5/0x5f
[  +0.000008]  amdgpu_userq_signal_ioctl+0xd29/0x1c70 [amdgpu]
[  +0.000412]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  +0.000404]  ? try_to_wake_up+0x165/0x1840
[  +0.000010]  ? __pfx_futex_wake_mark+0x10/0x10
[  +0.000011]  drm_ioctl_kernel+0x178/0x2f0 [drm]
[  +0.000050]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  +0.000404]  ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[  +0.000043]  ? __kasan_check_read+0x11/0x20
[  +0.000007]  ? srso_return_thunk+0x5/0x5f
[  +0.000007]  ? __kasan_check_write+0x14/0x20
[  +0.000008]  drm_ioctl+0x513/0xd20 [drm]
[  +0.000040]  ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[  +0.000407]  ? __pfx_drm_ioctl+0x10/0x10 [drm]
[  +0.000044]  ? srso_return_thunk+0x5/0x5f
[  +0.000007]  ? _raw_spin_lock_irqsave+0x99/0x100
[  +0.000007]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[  +0.000006]  ? __rseq_handle_notify_resume+0x188/0xc30
[  +0.000008]  ? srso_return_thunk+0x5/0x5f
[  +0.000008]  ? srso_return_thunk+0x5/0x5f
[  +0.000006]  ? _raw_spin_unlock_irqrestore+0x27/0x50
[  +0.000010]  amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu]
[  +0.000388]  __x64_sys_ioctl+0x135/0x1b0
[  +0.000009]  x64_sys_call+0x1205/0x20d0
[  +0.000007]  do_syscall_64+0x4d/0x120
[  +0.000008]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  +0.000007] RIP: 0033:0x7f7c3d31a94f

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Saleemkhan Jamadar
49cd3353db drm/amdgpu: add db size and offset range for VCN and VPE
VCN and VPE have different offset range, update the doorbell
offset range repsectively.
Doorbell size for VCN and VPE is 32bit.

v1 : add gfx switch case and fix checkpatch warnings (Shashank)

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Saleemkhan Jamadar
3e37fcb57b drm/amdgpu: map doorbell for the requested userq
Introduce db_info structure to the populate the doorbell
information that is required to be mapped.

Made changes to the doorbell mapping func more generic,
by taking parameters that vary based on IPs and/or usecase
into db_info structure.

v2 - Fix space alignment and checkpatch warnings(Shashank)

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Christian König
8639d2f5ca drm/amdgpu: fix call to amdgpu_eviction_fence_detach
That needs to be done after grabbing the lock, not before.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Arvind Yadav
adba092973 drm/amdgpu: Fix Illegal opcode in command stream Error
When applications closes, it triggers the drm_file_free
function which subsequently releases all allocated buffer
objects. Concurrently, the resume_worker thread will attempt
to map the usermode queue. However, since the wptr buffer
object has already been deallocated, this will result in
an Illegal opcode error being raised in the command stream.

Now replacing drm_release() with a new function
amdgpu_drm_release(). This function will set the flag to
prevent the scheduling of any new queue resume/map, stop
all queues and then call drm_release().

V2:
  - Replace drm_release with amdgpu_drm_release(Christian).

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Arunpravin Paneer Selvam
02521454f0 drm/amdgpu: Apply sign extension to seq64
Apply sign extension to seq64 va address.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:20 -04:00
Christian König
91acb5d47b drm/amdgpu: Modify the MES process va end limit
Modify the MES process va end limit to max pfn.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Arunpravin Paneer Selvam
ed5fdc1fc2 drm/amdgpu: Fix the use-after-free issue in wait IOCTL
The xarray pointer which has the userqueue xarray structure
reference should be cleared when the userqueue gets
destroyed. Otherwise, we may access the freed xa memory and
see the below warnings.

warning 1:
BUG: KASAN: slab-use-after-free in _raw_spin_lock+0x7a/0xe0
[  +0.000044] Call Trace:
[  +0.000017]  <TASK>
[  +0.000016]  dump_stack_lvl+0x6c/0x90
[  +0.000025]  print_report+0xc4/0x5e0
[  +0.000025]  ? srso_return_thunk+0x5/0x5f
[  +0.000024]  ? kasan_complete_mode_report_info+0x60/0x1d0
[  +0.000030]  ? _raw_spin_lock+0x7a/0xe0
[  +0.000023]  kasan_report+0xdf/0x120
[  +0.000023]  ? _raw_spin_lock+0x7a/0xe0
[  +0.000025]  kasan_check_range+0xf7/0x1b0
[  +0.000025]  __kasan_check_write+0x14/0x20
[  +0.000024]  _raw_spin_lock+0x7a/0xe0
[  +0.000023]  ? __pfx__raw_spin_lock+0x10/0x10
[  +0.000024]  ? amdgpu_userq_wait_ioctl+0xac0/0x1f30 [amdgpu]
[  +0.000442]  amdgpu_userq_wait_ioctl+0x18fc/0x1f30 [amdgpu]
[  +0.000428]  ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[  +0.000424]  ? __pfx_idr_alloc_u32+0x10/0x10
[  +0.000027]  ? srso_return_thunk+0x5/0x5f
[  +0.000024]  ? __kasan_check_write+0x14/0x20
[  +0.000025]  ? srso_return_thunk+0x5/0x5f
[  +0.000024]  ? idr_alloc+0x72/0xc0
[  +0.000023]  ? srso_return_thunk+0x5/0x5f
[  +0.000023]  ? fput+0x1c/0x2f0
[  +0.000025]  drm_ioctl_kernel+0x178/0x2f0 [drm]
[  +0.000065]  ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[  +0.000425]  ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[  +0.000064]  ? srso_return_thunk+0x5/0x5f
[  +0.000023]  ? __kasan_check_write+0x14/0x20
[  +0.000025]  drm_ioctl+0x513/0xd20 [drm]
[  +0.000058]  ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[  +0.000428]  ? __pfx_drm_ioctl+0x10/0x10 [drm]
[  +0.000061]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[  +0.000027]  ? __count_memcg_events+0x11f/0x3a0
[  +0.000027]  ? srso_return_thunk+0x5/0x5f
[  +0.001040]  ? srso_return_thunk+0x5/0x5f
[  +0.000969]  ? _raw_spin_unlock_irqrestore+0x27/0x50
[  +0.000966]  amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu]
[  +0.001352]  __x64_sys_ioctl+0x135/0x1b0
[  +0.000966]  x64_sys_call+0x1205/0x20d0
[  +0.000968]  do_syscall_64+0x4d/0x120
[  +0.000960]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  +0.000962] RIP: 0033:0x7f42af11a94f

warning 2:
WARNING: at lib/xarray.c:1849 __xa_alloc+0x13a/0x150
[  366.491409] RIP: 0010:__xa_alloc+0x13a/0x150
[  366.491434] Call Trace:
[  366.491437]  <TASK>
[  366.491440]  ? show_regs+0x6d/0x80
[  366.491445]  ? __warn+0x91/0x140
[  366.491450]  ? __xa_alloc+0x13a/0x150
[  366.491453]  ? report_bug+0x1c9/0x1e0
[  366.491459]  ? handle_bug+0x63/0xa0
[  366.491463]  ? exc_invalid_op+0x1d/0x80
[  366.491467]  ? asm_exc_invalid_op+0x1f/0x30
[  366.491476]  ? __xa_alloc+0x13a/0x150
[  366.491484]  amdgpu_userq_wait_ioctl+0xe0e/0xfe0 [amdgpu]
[  366.491743]  ? idr_alloc_u32+0x97/0xd0
[  366.491749]  ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[  366.491912]  drm_ioctl_kernel+0xae/0x100 [drm]
[  366.491942]  drm_ioctl+0x2a1/0x500 [drm]
[  366.491961]  ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[  366.492127]  ? srso_return_thunk+0x5/0x5f
[  366.492132]  ? srso_return_thunk+0x5/0x5f
[  366.492135]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
[  366.492139]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[  366.492288]  __x64_sys_ioctl+0x99/0xd0
[  366.492295]  x64_sys_call+0x1209/0x20d0
[  366.492299]  do_syscall_64+0x51/0x120
[  366.492303]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  366.492418] RIP: 0033:0x7f86f3b1a94f

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Arunpravin Paneer Selvam
c9e20cb005 drm/amdgpu: Fix NULL ptr dereference issue for non userq fences
Add the correct fences count variable [num_fences] in the fences
array iteration to handle the userq / non-userq fences.

v2:(Christian)
  - All fences in the array either come from some reservation object
    or drm_syncobj. If any of those are NULL then there is a bug
    somewhere else.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Arunpravin Paneer Selvam
9ed335d939 drm/amdgpu: Add mqd for userq compute queue
Add mqd for userq compute queue for gfx11/gfx12

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Shashank Sharma
31f7efcdca drm/amdgpu: enable eviction fence
This patch enables attachment and detachment of eviction fences.
This is just a fork of eviction fence enabling code from the first
patch of the series so that the CI testing can happen on fully
fledged code.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Shashank Sharma
a242a3e4b5 drm/amdgpu: simplify eviction fence suspend/resume
The basic idea in this redesign is to add an eviction fence only in UQ
resume path. When userqueue is not present, keep ev_fence as NULL

Main changes are:
 - do not create the eviction fence during evf_mgr_init, keeping
   evf_mgr->ev_fence=NULL until UQ get active.
 - do not replace the ev_fence in evf_resume path, but replace it only in
   uq_resume path, so remove all the unnecessary code from ev_fence_resume.
 - add a new helper function (amdgpu_userqueue_ensure_ev_fence) which
   will do the following:
   - flush any pending uq_resume work, so that it could create an
     eviction_fence
   - if there is no pending uq_resume_work, add a uq_resume work and
     wait for it to execute so that we always have a valid ev_fence
 - call this helper function from two places, to ensure we have a valid
   ev_fence:
   - when a new uq is created
   - when a new uq completion fence is created

v2: Worked on review comments by Christian.
v3: Addressed few more review comments by Christian.
v4: Move mutex lock outside of the amdgpu_userqueue_suspend()
    function (Christian).
v5: squash in build fix (Alex)

Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Arunpravin Paneer Selvam
dd5a376cd2 drm/amdgpu: enable userqueue secure sem for GFX 12
- Add a field in struct amdgpu_mqd_prop for userqueue
  secure sem fence address since now we have a generic
  file for mes_userqueue.c
- Add secure sem fence address mqd support to gfx12 into
  their corresponding init functions.
- Enable secure semaphore IRQ handling

V2: Address review comment from Alex:
    Use fence_address instead of fenceaddress (Shashank)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Somalapuram Amaranath
988c9e7046 drm/amdgpu: enable userqueue support for GFX12
This patch enables Usermode queue support across GFX, Compute
and SDMA IPs on GFX12/SDMA7. It typically reuses Navi3X userqueue
IP functions to create and destroy MQDs.

v2: rebase on proposed changes (Alex)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Alex Deucher
79819d9a0a drm/amdgpu/uq: make MES UQ setup generic
Now that all of the IP specific code has been moved into
the IP specific functions, we can make this code generic.

V2: Fixed build errors and porting logics (Shashank)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Alex Deucher
b965c5d871 drm/amdgpu/uq: remove gfx11 specifics from UQ setup
This can all be handled by in the IP specific mpd init
code.

V2: Removed setting of gds_va, which was removed during UAPI
    review (Shashank)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Alex Deucher
21926b5db8 drm/amdgpu/sdma7: update mqd init for UQ
Set the addresses for the UQ metadata.

V2: Fix lower offset mask (Shashank)
V2: Use lower_32_bits for mqd objects(Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:19 -04:00
Alex Deucher
d07a7fcb8d drm/amdgpu/sdma6: update mqd init for UQ
Set the addresses for the UQ metadata.

V2: Fix lower address mask (Shashank)
V3: Use lower_32_bits for MQD objects (Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Alex Deucher
ab328d9a7b drm/amdgpu/gfx12: update mqd init for UQ
Set the addresses for the UQ metadata.

V2: Fix lower address mask (Shashank)
V3: Use lower_32_bits() for MQD objects (Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Amaranath Somalapuram
f2234816a3 drm/amdgpu: fix IGT CI regression with eviction fence
This patch fixes one of the regressions in eviction fence code with
IGT tests.

Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Amaranath Somalapuram <amaranath.somalapuram@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Alex Deucher
7179439e34 drm/amdgpu/gfx11: update mqd init for UQ
Set the addresses for the UQ metadata.

V2: Fix lower address (Shashank)
V3: Restore lower_32_bits() for MQD addresses (Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Alex Deucher
825f82cf93 drm/amdgpu: add some additional members to amdgpu_mqd_prop
These are needed to make userqueue infrastructure generic.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Shashank Sharma
b8e6d3f68c drm/amdgpu: handle eviction fence race
The eviction process can get into a race condition between the eviction
fence suspend work (which replaces the old fence with new) and kms_close
(which destroys the fence and doesn't expect a new one).

This patch:
- adds a flag to indicate that fd is closing, so fence replacement is
  not required (evf_mgr->fd_closing)
- adds a flush_work() during the ev_fence_destroy routine

V2: Addressed review comments from Christian:
    - Do not use mutex to sync
    - Use flush_work and wait for suspend_work to be done

V3: Fixed state machine for queue->active, which adds into race between
    suspend/resume and queue ops

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Shashank Sharma
44cfdf368f drm/amdgpu: resume gfx userqueues
This patch adds support for userqueue resume. What it typically does is
this:
- adds a new delayed work for resuming all the queues.
- schedules this delayed work from the suspend work.
- validates the BOs and replaces the eviction fence before resuming all
  the queues running under this instance of userq manager.

V2: Addressed Christian's review comments:
    - declare local variables like ret at the bottom.
    - lock all the object first, then start attaching the new fence.
    - dont replace old eviction fence, just attach new eviction fence.
    - no error logs for drm_exec_lock failures
    - no need to reserve bos after drm_exec_locked
    - schedule the resume worker immediately (not after 100 ms)
    - check for NULL BO (Arvind)

V5: Rebased wrt changes in suspend patch
    - moved amdgpu_userqueue_validate_vm_bo in this patch
    - initialized ret in resume_all

V6: Rebase
V7: Addressed review comments from Christian
    - Do not use list_for_each_safe() with vm->invalidated, its not
      correct way
V8: Fixed the race condition between suspend/close/fence

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Shashank Sharma
b0328087c1 drm/amdgpu: suspend gfx userqueues
This patch adds suspend support for gfx userqueues. It typically does
the following:
- adds an enable_signaling function for the eviction fence, so that it
  can trigger the userqueue suspend,
- adds a delayed work to handle suspending of the eviction_fence
- adds a suspend function to handle suspending of userqueues which
  suspends all the queues under this userq manager and signals the
  eviction fence,
- adds a function to replace the old eviction fence with a new one and
  attach it to each of the objects,
- adds reference of userq manager in the eviction fence container so
  that it can be used in the suspend function.

V2: Addressed Christian's review comments:
    - schedule suspend work immediately

V4: Addressed Christian's review comments:
    - wait for pending uq fences before starting suspend, added
      queue->last_fence for the same
    - accommodate ev_fence_mgr into existing code
    - some bug fixes and NULL checks

V5: Addressed Christian's review comments (gitlab)
    - Wait for eviction fence to get signaled in destroy,
      don't signal it
    - Wait for eviction fence to get signaled in replace fence,
      don't signal it

V6: Addressed Christian's review comments
    - Do not destroy the old eviction fence until we have it replaced
    - Change the sequence of fence replacement sub-tasks
    - reusing the ev_fence delayed work for userqueue suspend as well
      (Shashank).

V7: Addressed Christian's review comments
    - give evf_mgr as argument (instead of fpriv) to replace_fence()
    - save ptr to evf_mgr in ev_fence (instead of uq_mgr)
    - modify suspend_all_queues logic to reflect error properly
    - remove the garbage drm_exec_lock section in wait_for_signal
    - grab the userqueue mutex before starting the wait for fence
    - remove the unrelated gobj check from signal_ioctl

V8: Added race condition fixes

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Shashank Sharma
30e4d78138 drm/amdgpu: add userqueue suspend/resume functions
This patch adds userqueue suspend/resume functions at
core MES V11 IP level.

V2: use true/false for queue_active status (Christian)
    added Christian's R-B

V3: reset/set queue status in mqd.create and mqd.destroy

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Shashank Sharma
fb796c3087 drm/amdgpu: add gfx eviction fence helpers
This patch adds basic eviction fence framework for the gfx buffers.
The idea is to:
- One eviction fence is created per gfx process, at kms_open.
- This fence is attached to all the gem buffers created
  by this process.
- This fence is detached to all the gem buffers at postclose_kms.

This framework will be further used for usermode queues.

V2: Addressed review comments from Christian
    - keep fence_ctx and fence_seq directly in fpriv
    - evcition_fence should be dynamically allocated
    - do not save eviction fence instance in BO, there could be many
      such fences attached to one BO
    - use dma_resv_replace_fence() in detach

V3: Addressed review comments from Christian
    - eviction fence create and destroy functions should be called
      only once from fpriv create/destroy
    - use dma_fence_put() in eviction_fence_destroy

V4: Addressed review comments from Christian:
    - create a separate ev_fence_mgr structure
    - cleanup fence init part
    - do not add a domain for fence owner KGD

V5: Addressed review comments from Christian:
    - drop the dma_fence_is_signaled check
    - use a local variable to access evf_mgr->ev_fence under the
      spin_lock() multiple places
    - remove the vm->is_compute_ctx check to attach gfx eviction fence,
      in gem_object_open

V6: Addressed review comments from Christian:
    - drop the return value from eviction_fence_signal
    - reserve_fence should be the first thing inside the
      attach_eviction_fence function, also keep the resv_add_fence inside
      the lock
    - remove the unwanted ev_fence check inside detach function
    - fix wrong variable check in eviction_fence_init function
    - return the error value of eviction_fence_init to the caller, dont
      keep it void.
    - fail gem_object_open if attaching of eviction_fence fails
    - detach the eviction fence only when amdgpu_vm_is_bo_always_valid
      is not true.

V7: Addressed review comments from Christian:
    - Do not add a uq_mgr ptr in ev_fence, rather add evf_mgr

V8: Move eviction fence enabling into separate patch for CI

Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Sunil Khatri
a640126fbd drm/amdgpu: add the argument description for gpu_addr
Add argument description for the input argument
gpu_addr for amdgpu_seq64_alloc.

Fixes the warning raised by the compiler:
drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c:168:
warning: Function parameter or struct member 'gpu_addr' not described in 'amdgpu_seq64_alloc

Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:18 -04:00
Shashank Sharma
90c448fef3 drm/amdgpu: add new AMDGPU_INFO subquery for userq objects
This patch adds a new subquery (AMDGPU_INFO_UQ_FW_AREAS) in
AMDGPU_INFO_IOCTL to get the size and alignment of shadow
and csa objects from the FW setup. This information is
required for the userqueue consumers.

V2: Added Alex's suggestions and addressed review comments:
- make this query IP specific (GFX/SDMA etc)
- give a better title (AMDGPU_INFO_UQ_METADATA)
- restructured the code as per sample code shared by Alex

V3: Split the UAPI patch from shadow_size_fn modifications
V4: Addressed review comments from UAPI review (Marek/Pierre-Eric)
    - Change the query name to AMDGPU_INFO_UQ_FW_AREAS
    - remove unused inpur parameter for AMDGPU_HW_IP*

UAPI link: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/400/

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Shashank Sharma
aed7caf2d4 drm/amdgpu: add get_gfx_shadow_info callback for gfx12
This callback gets the size and alignment requirements
for the gfx shadow buffer for preemption.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam
2761bb9a31 drm/amdgpu: Modify userq signal/wait struct field names
Modify kernel UAPI userq signal/wait struct field names and
description corresponding to the libdrm UAPI review comments.

libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Shashank Sharma
d9e697f19b drm/amdgpu: bypass SRIOV check for shadow size info
Currently, the shadow FW space size and alignment information is
protected under a flag (adev->gfx.cp_gfx_shadow) which gets set
only in case of SRIOV setups.
if (amdgpu_sriov_vf(adev))
    adev->gfx.cp_gfx_shadow = true;

But we need this information for GFX Userqueues, so that user can
create these objects while creating userqueue. This patch series
creates a method to get this information bypassing the dependency
on this check.

This patch:
- adds a new input parameter flag to the gfx.funcs->get_gfx_shadow_info
fptr definition, so that it can accommodate the information without the
check (adev->gfx.cp_gfx_shadow) on request.
- updates the existing definition of amdgpu_gfx_get_gfx_shadow_info to
adjust with this new flag.

Next patch in the series is adding a UAPI which will consume this info.

V2: split this patch from the new UAPI patch

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Shashank Sharma
2e06b175ff drm/amdgpu: fix userqueue UAPI comments
This patch fixes some of the pending UAPI review comments
from the libDRM/UAPI review process.

- It updates some outdated comments in the userqueue UAPI header
  highlighted during the libdrm UAPI review.
- It removes the GDS BO support which was found unused.
- It also removes the unused flags parameter from the UAPI.
- It also adds a padding variables in userqueue in/out structures.

(Pierre-Eric and Marek)
  - clarify comments on top of drm_amdgpu_userq_in
  - clarify comment for queue_id (in)
  - clarify comment for mqd
  - clarify comment for compute MQD size
  - clarify comment for queue_id (out)
  - remove GDB object from BO object list
  - remove the unused flags parameter

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Shashank Sharma
5f2f78314c Revert "drm/amdgpu: don't allow userspace to create a doorbell BO"
This reverts commit 6be2ad4f00.

This patch was to block userspace to use doorbell manager UAPI
until usermode queue UAPI gets approved. UQ UAPI got approved in the
following MR:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392

Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Arvind Yadav
38c67ec9aa drm/amdgpu: Add input fence to sync bo map/unmap
This patch adds input fences to VM_IOCTL for buffer object.
The kernel will map/unmap the BO only when the fence is signaled.
The UAPI for the same has been approved here:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392

V2: Bug fix (Arvind)
V3: Bug fix (Arvind)
V4: Rename UAPI objects as per UAPI review (Marek)
V5: Addressed review comemnts from Christian
     - function should return error.
     - Add 'TODO' comment
     - The input fence should be independent of the operation.
V6: Addressed review comemnts from Christian
    - Release the memory allocated by memdup_user().
V7: Addressed review comemnts from Christian
    - Drop the debug print and add "return r;" for the error handling.

V11: Rebase
v12: Fix 32-bit holes issue in sturct drm_amdgpu_gem_va.
v13: Fix deadlock issue.
v14: Fix merge conflict.
v15: Fix review comment by renaming syncobj handles.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam
189ee986b0 drm/amdgpu: add userq specific kernel config for fence ioctls
Keep the user queue fence signal and wait IOCTLs in the
kernel config CONFIG_DRM_AMDGPU_NAVI3X_USERQ.

v2(Christian):
  - Remove the userq specific config added for kernel queues fence init
    function.

v3(Alex):
  - It will be better to return an error(-ENOTSUPP) in these cases.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam
f7cb6a28e1 drm/amdgpu: Add gpu_addr support to seq64 allocation
Add gpu address support to seq64 alloc function.

v1:(Christian)
  - Add the user of this new interface change to the same
    patch.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam
cb4a73f46f drm/amdgpu: Add separate array of read and write for BO handles
Drop AMDGPU_USERQ_BO_WRITE as this should not be a global option
of the IOCTL, It should be option per buffer. Hence adding separate
array for read and write BO handles.

v2(Marek):
  - Internal kernel details shouldn't be here. This file should only
    document the observed behavior, not the implementation .

v3:
  - Fix DAL CI clang issue.

v4:
  - Added Alex RB to merge the kernel UAPI changes since he has
    already approved the amdgpu_drm.h changes.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Suggested-by: Marek Olšák <marek.olsak@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam
d8675102ba drm/amdgpu: add vm root BO lock before accessing the vm
Add a vm root BO lock before accessing the userqueue VM.

v1:(Christian)
   - Keep the VM locked until you are done with the mapping.
   - Grab a temporary BO reference, drop the VM lock and acquire the BO.
     When you are done with everything just drop the BO lock and
     then the temporary BO reference.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam
fbea3d3174 drm/amdgpu: Add the missing error handling for xa_store() call
Add the missing error handling for xa_store() call in the function
amdgpu_userq_fence_driver_alloc().

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam
e7cf21fbb2 drm/amdgpu: Few optimization and fixes for userq fence driver
Few optimization and fixes for userq fence driver.

v1:(Christian):
  - Remove unnecessary comments.
  - In drm_exec_init call give num_bo_handles as last parameter it would
    making allocation of the array more efficient
  - Handle return value of __xa_store() and improve the error handling of
    amdgpu_userq_fence_driver_alloc().

v2:(Christian):
   - Revert userq_xa xarray init to XA_FLAGS_LOCK_IRQ.
   - move the xa_unlock before the error check of the call xa_err(__xa_store())
     and moved this change to a separate patch as this is adding a missing error
     handling.
   - Removed the unnecessary comments.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam
ac4a1f7f13 drm/amdgpu: Remove the MES self test
Remove MES self test as this conflicts the userqueue fence
interrupts.

v2:(Christian)
  - remove the amdgpu_mes_self_test() function and any now unused code.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:16 -04:00
Arvind Yadav
70773bef4e drm/amdgpu: update userqueue BOs and PDs
This patch updates the VM_IOCTL to allow userspace to synchronize
the mapping/unmapping of a BO in the page table.

The major changes are:
- it adds a drm_timeline object as an input parameter to the VM IOCTL.
- this object is used by the kernel to sync the update of the BO in
  the page table during the mapping of the object.
- the kernel also synchronizes the tlb flush of the page table entry of
  this object during the unmapping (Added in this series:
  https://patchwork.freedesktop.org/series/131276/ and
  https://patchwork.freedesktop.org/patch/584182/)
- the userspace can wait on this timeline, and then the BO is ready to
  be consumed by the GPU.

The UAPI for the same has been approved here:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392

V2:
 - remove the eviction fence coupling

V3:
 - added the drm timeline support instead of input/output fence
   (Christian)

V4:
 - made timeline 64-bit (Christian)
 - bug fix (Arvind)

V5: GLCTS bug fix (Arvind)
V6: Rename syncobj_handle -> timeline_syncobj_out
    Rename point -> timeline_point_in (Marek)
V7: Addressed review comments from Christian:
    - do not send last_update fence in case of vm_clear_freed, instead
      return the fence from gen_va_update_vm
    - move the functions to update bo_mapping  to amdgpu_gem.c
    - do not use amdgpu_userq_update_vm anymore in userq_create()
V8: Addressed review comments from Christian:
    - Split amdgpu_gem_update_bo_mapping function.
    - amdgpu_gem_va_update_vm should return stub for error.
V9: Addressed review comments from Christian:
    - Rename the function amdgpu_gem_update_timeline_node.
    - amdgpu_gem_update_timeline_node should be void function.
    - when timeline_point is zero don't allocate a chain and
      call drm_syncobj_replace_fence() instead of
      drm_syncobj_add_point().
V11: rebase
V12: Fix 32-bit holes issue in sturct drm_amdgpu_gem_va.
V13: Fix the review comment by renaming timeline syncobj (Marek)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam
8949843762 drm/amdgpu: Enable userq fence interrupt support
Add support to handle the userqueue protected fence signal hardware
interrupt.

Create a xarray which maps the doorbell index to the fence driver address.
This would help to retrieve the fence driver information when an userq fence
interrupt is triggered. Firmware sends the doorbell offset value and
this info is compared with the queue's mqd doorbell offset value.
If they are same, we process the userq fence interrupt.

v1:(Christian):
  - use xa_load to extract the fence driver.
  - move the amdgpu_userq_fence_driver_process call within the xa_lock
    as there is a chance that fence_drv might be freed.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam
15e30a6e47 drm/amdgpu: Add wait IOCTL timeline syncobj support
Add user fence wait IOCTL timeline syncobj support.

v2:(Christian)
  - handle dma_fence_wait() return value.
  - shorten the variable name syncobj_timeline_points a bit.
  - move num_points up to avoid padding issues.

v3:(Christian)
  - Handle timeline drm_syncobj_find_fence() call error
    handling
  - Use dma_fence_unwrap_for_each() in timeline fence as
    there could be more than one fence.

v4:(Christian)
  - Drop the first num_fences since fence is always included in
    the dma_fence_unwrap_for_each() iteration, when fence != f
    then fence is most likely just a container.

v5: Added Alex RB to merge the kernel UAPI changes since he has
    already approved the amdgpu_drm.h changes.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam
a292fdecd7 drm/amdgpu: Implement userqueue signal/wait IOCTL
This patch introduces new IOCTL for userqueue secure semaphore.

The signal IOCTL called from userspace application creates a drm
syncobj and array of bo GEM handles and passed in as parameter to
the driver to install the fence into it.

The wait IOCTL gets an array of drm syncobjs, finds the fences
attached to the drm syncobjs and obtain the array of
memory_address/fence_value combintion which are returned to
userspace.

v2: (Christian)
    - Install fence into GEM BO object.
    - Lock all BO's using the dma resv subsystem
    - Reorder the sequence in signal IOCTL function.
    - Get write pointer from the shadow wptr
    - use userq_fence to fetch the va/value in wait IOCTL.

v3: (Christian)
    - Use drm_exec helper for the proper BO drm reserve and avoid BO
      lock/unlock issues.
    - fence/fence driver reference count logic for signal/wait IOCTLs.

v4: (Christian)
    - Fixed the drm_exec calling sequence
    - use dma_resv_for_each_fence_unlock if BO's are not locked
    - Modified the fence_info array storing logic.

v5: (Christian)
    - Keep fence_drv until wait queue execution.
    - Add dma_fence_wait for other fences.
    - Lock BO's using drm_exec as the number of fences in them could
      change.
    - Install signaled fences as well into BO/Syncobj.
    - Move Syncobj fence installation code after the drm_exec_prepare_array.
    - Directly add dma_resv_usage_rw(args->bo_flags....
    - remove unnecessary dma_fence_put.

v6: (Christian)
    - Add xarray stuff to store the fence_drv
    - Implement a function to iterate over the xarray and drop
      the fence_drv references.
    - Add drm_exec_until_all_locked() wrapper
    - Add a check that if we haven't exceeded the user allocated num_fences
      before adding dma_fence to the fences array.

v7: (Christian)
    - Use memdup_user() for kmalloc_array + copy_from_user
    - Move the fence_drv references from the xarray into the newly created fence
      and drop the fence_drv references when we signal this fence.
    - Move this locking of BOs before the "if (!wait_info->num_fences)",
      this way you need this code block only once.
    - Merge the error handling code and the cleanup + return 0 code.
    - Initializing the xa should probably be done in the userq code.
    - Remove the userq back pointer stored in fence_drv.
    - Pass xarray as parameter in amdgpu_userq_walk_and_drop_fence_drv()

v8: (Christian)
    - Move fence_drv references must come before adding the fence to the list.
    - Use xa_lock_irqsave_nested for nested spinlock operations.
    - userq_mgr should be per fpriv and not one per device.
    - Restructure the interrupt process code for the early exit of the loop.
    - The reference acquired in the syncobj fence replace code needs to be
      kept around.
    - Modify the dma_fence acquire placement in wait IOCTL.
    - Move USERQ_BO_WRITE flag to UAPI header file.
    - drop the fence drv reference after telling the hw to stop accessing it.
    - Add multi sync object support to userq signal IOCTL.

V9: (Christian)
    - Store all the fence_drv ref to other drivers and not ourself.
    - Remove the userq fence xa implementation and replace with
      kvmalloc_array.

v10: (Christian)
     - Add a comment for the userq_xa xarray
     - drop the if check of userq_fence->fence_drv_array
     - use the i variable to initialize userq_fence->fence_drv_array_count
     - drop the fence reference before you free the array in the error handling,
       otherwise it could be that some references leaked.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam
2e65ea1ab2 drm/amdgpu: screen freeze and userq driver crash
Screen freeze and userq fence driver crash while playing Xonotic

v2: (Christian)
    - There is change that fence might signal in between testing
      and grabbing the lock. Hence we can move the lock above the
      if..else check and use the dma_fence_is_signaled_locked().

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam
8493312a94 drm/amdgpu: Add mqd support for the fence address
- Add a field in struct v11_gfx_mqd for userqueue
  fence address.

- Assign fence gpu VA address to the userqueue mqd
  fence address fields.

v2: Remove the mask and replace with lower_32_bits (Christian)

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam
97ff194625 drm/amdgpu: Implement a new userqueue fence driver
Developed a userqueue fence driver for the userqueue process shared
BO synchronization.

Create a dma fence having write pointer as the seqno and allocate a
seq64 memory for each user queue process and feed this memory address
into the firmware/hardware, thus the firmware writes the read pointer
into the given address when the process completes it execution.
Compare wptr and rptr, if rptr >= wptr, signal the fences for the waiting
process to consume the buffers.

v2: Worked on review comments from Christian for the following
    modifications

    - Add wptr as sequence number into the fence
    - Add a reference count for the fence driver
    - Add dma_fence_put below the list_del as it might
      frees the userq fence.
    - Trim unnecessary code in interrupt handler.
    - Check dma fence signaled state in dma fence creation
      function for a potential problem of hardware completing
      the job processing beforehand.
    - Add necessary locks.
    - Create a list and process all the unsignaled fences.
    - clean up fences in destroy function.
    - implement .signaled callback function

v3: Worked on review comments from Christian
    - Modify naming convention for reference counted objects
    - Fix fence driver reference drop issue
    - Drop amdgpu_userq_fence_driver_process() function return value

v4: Worked on review comments from Christian
    - Moved fence driver allocation into amdgpu_userq_fence_driver_alloc()
    - Added detail doc mentioning the differences b/w
      two spinlocks declared.

v5: Worked on review comments from Christian
    - Check before upcast and remove local variable
    - Add error handling in fence_drv alloc function.
    - Move rptr read fn outside of the loop and remove WARN_ON in
      destroy function.

v6:
  - clear the seq64 memory in user fence driver(Christian)
  - fix for the wptr va bo mapping(Christian)
  - move the fence_drv xa entry erase code from the interrupt handler
    into user fence destroy function

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:16 -04:00
Shashank Sharma
f540f69256 drm/amdgpu: add kernel config for gfx-userqueue
This patch:
- adds a kernel config option "CONFIG_DRM_AMDGPU_NAVI3X_USERQ"
- moves the usequeue initialization code for all IPs under
  this flag
- cover the core userqueue functions under this config
- adds stub function for userqueue ioctl.

so that the userqueue works only when the config is enabled.

V9:  Introduce this patch
V10: Call it CONFIG_DRM_AMDGPU_NAVI3X_USERQ instead of
     CONFIG_DRM_AMDGPU_USERQ_GFX (Christian)
V11: Add GFX in the config help description message.
V12: Add depends on BROKEN for this config, remove this when the rest of
the code is available.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:16 -04:00
Arvind Yadav
9d3afcb7b9 drm/amdgpu: fix MES GFX mask
Current MES GFX mask prevents FW to enable oversubscription. This patch
does the following:
- Fixes the mask values and adds a description for the same
- Removes the central mask setup and makes it IP specific, as it would
  be different when the number of pipes and queues are different.

v2: squash in fix from Shashank

Cc: Christian König <Christian.Koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
2c695d7c07 drm/amdgpu: enable compute/gfx usermode queue
This patch does the necessary changes required to
enable compute workload support using the existing
usermode queues infrastructure.

V9:  Patch introduced
V10: Add custom IP specific mqd strcuture for compute (Alex)
V11: Rename drm_amdgpu_userq_mqd_compute_gfx_v11 to
     drm_amdgpu_userq_mqd_compute_gfx11 (Marek)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Arvind Yadav
543b614537 drm/amdgpu: enable SDMA usermode queues
This patch does necessary modifications to enable the SDMA
usermode queues using the existing userqueue infrastructure.

V9:  introduced this patch in the series
V10: use header file instead of extern (Alex)
V11: rename drm_amdgpu_userq_mqd_sdma_gfx_v11 to
     drm_amdgpu_userq_mqd_sdma_gfx11 (Marek)

Cc: Christian König <Christian.Koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
a1d201e169 drm/amdgpu: enable GFX-V11 userqueue support
This patch enables GFX-v11 IP support in the usermode queue base
code. It typically:
- adds a GFX_v11 specific MQD structure
- sets IP functions to create and destroy MQDs
- sets MQD objects coming from userspace

V10: introduced this spearate patch for GFX V11 enabling (Alex).
V11: Addressed review comments:
     - update the comments in GFX mqd structure informing user about using
       the INFO IOCTL for object sizes (Alex)
     - rename struct drm_amdgpu_userq_mqd_gfx_v11 to
       drm_amdgpu_userq_mqd_gfx11 (Marek)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
d84607e3f7 drm/amdgpu: cleanup leftover queues
This patch adds code to cleanup any leftover userqueues which
a user might have missed to destroy due to a crash or any other
programming error.

V7:  Added Alex's R-B
V8:  Rebase
V9:  Rebase
V10: Rebase
V11: Rebase

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
f09c1e6077 drm/amdgpu: generate doorbell index for userqueue
The userspace sends us the doorbell object and the relative doobell
index in the object to be used for the usermode queue, but the FW
expects the absolute doorbell index on the PCI BAR in the MQD. This
patch adds a function to convert this relative doorbell index to
absolute doorbell index.

V5:  Fix the db object reference leak (Christian)
V6:  Pin the doorbell bo in userqueue_create() function, and unpin it
     in userqueue destoy (Christian)
V7:  Added missing kfree for queue in error cases
     Added Alex's R-B
V8:  Rebase
V9:  Changed the function names from gfx_v11* to mes_v11*
V10: Rebase
V11: Rebase

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
5fb2f7fc21 drm/amdgpu: map wptr BO into GART
To support oversubscription, MES FW expects WPTR BOs to
be mapped into GART, before they are submitted to usermode
queues. This patch adds a function for the same.

V4: fix the wptr value before mapping lookup (Bas, Christian).

V5: Addressed review comments from Christian:
    - Either pin object or allocate from GART, but not both.
    - All the handling must be done with the VM locks held.

V7: Addressed review comments from Christian:
    - Do not take vm->eviction_lock
    - Use amdgpu_bo_gpu_offset to get the wptr_bo GPU offset

V8:  Rebase
V9:  Changed the function names from gfx_v11* to mes_v11*
V10: Remove unused adev (Harish)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
6c42559f70 drm/amdgpu: map usermode queue into MES
This patch adds new functions to map/unmap a usermode queue into
the FW, using the MES ring. As soon as this mapping is done, the
queue would  be considered ready to accept the workload.

V1: Addressed review comments from Alex on the RFC patch series
    - Map/Unmap should be IP specific.
V2:
    Addressed review comments from Christian:
    - Fix the wptr_mc_addr calculation (moved into another patch)
    Addressed review comments from Alex:
    - Do not add fptrs for map/unmap

V3:  Integration with doorbell manager
V4:  Rebase
V5:  Use gfx_v11_0 for function names (Alex)
V6:  Removed queue->proc/gang/fw_ctx_address variables and doing the
     address calculations locally to keep the queue structure GEN
     independent (Alex)
V7:  Added R-B from Alex
V8:  Rebase
V9:  Rebase
V10: Rebase

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
defb41e8ef drm/amdgpu: create context space for usermode queue
The MES FW expects us to allocate at least one page as context
space to process gang and process related context data. This
patch creates a joint object for the same, and calculates GPU
space offsets of these spaces.

V1: Addressed review comments on RFC patch:
    Alex: Make this function IP specific

V2: Addressed review comments from Christian
    - Allocate only one object for total FW space, and calculate
      offsets for each of these objects.

V3: Integration with doorbell manager

V4: Review comments:
    - Remove shadow from FW space list from cover letter (Alex)
    - Alignment of macro (Luben)

V5: Merged patches 5 and 6 into this single patch
    Addressed review comments:
    - Use lower_32_bits instead of mask (Christian)
    - gfx_v11_0 instead of gfx_v11 in function names (Alex)
    - Shadow and GDS objects are now coming from userspace (Christian,
      Alex)

V6:
    - Add a comment to replace amdgpu_bo_create_kernel() with
      amdgpu_bo_create() during fw_ctx object creation (Christian).
    - Move proc_ctx_gpu_addr, gang_ctx_gpu_addr and fw_ctx_gpu_addr out
      of generic queue structure and make it gen11 specific (Alex).

V7:
   - Using helper function to create/destroy userqueue objects.
   - Removed FW object space allocation.

V8:
   - Updating FW object address from user values.

V9:
   - uppdated function name from gfx_v11_* to mes_v11_*

V10:
   - making this patch independent of IP based changes, moving any
     GFX object related changes in GFX specific patch (Alex)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
fbf136b932 drm/amdgpu: create MES-V11 usermode queue for GFX
A Memory queue descriptor (MQD) of a userqueue defines it in
the hw's context. As MQD format can vary between different
graphics IPs, we need gfx GEN specific handlers to create MQDs.

This patch:
- Adds a new file which will be used for MES based userqueue
  functions targeting GFX and SDMA IP.
- Introduces MQD handler functions for the usermode queues.

V1: Worked on review comments from Alex:
    - Make MQD functions GEN and IP specific

V2: Worked on review comments from Alex:
    - Reuse the existing adev->mqd[ip] for MQD creation
    - Formatting and arrangement of code

V3:
    - Integration with doorbell manager

V4: Review comments addressed:
    - Do not create a new file for userq, reuse gfx_v11_0.c (Alex)
    - Align name of structure members (Luben)
    - Don't break up the Cc tag list and the Sob tag list in commit
      message (Luben)
V5:
   - No need to reserve the bo for MQD (Christian).
   - Some more changes to support IP specific MQD creation.

V6:
   - Add a comment reminding us to replace the amdgpu_bo_create_kernel()
     calls while creating MQD object to amdgpu_bo_create() once eviction
     fences are ready (Christian).

V7:
   - Re-arrange userqueue functions in adev instead of uq_mgr (Alex)
   - Use memdup_user instead of copy_from_user (Christian)

V9:
   - Moved userqueue code from gfx_v11_0.c to new file mes_v11_0.c so
     that it can be reused for SDMA userqueues as well (Shashank, Alex)

V10: Addressed review comments from Alex
   - Making this patch independent of IP engine(GFX/SDMA/Compute) and
     specific to MES V11 only, using the generic MQD structure.
   - Splitting a spearate patch to enabling GFX support from here.
   - Verify mqd va address to be non-NULL.
   - Add a separate header file.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
0385800c2f drm/amdgpu: add helpers to create userqueue object
This patch introduces amdgpu_userqueue_object and its helper
functions to creates and destroy this object. The helper
functions creates/destroys a base amdgpu_bo, kmap/unmap it and
save the respective GPU and CPU addresses in the encapsulating
userqueue object.

These helpers will be used to create/destroy userqueue MQD, WPTR
and FW areas.

V7:
- Forked out this new patch from V11-gfx-userqueue patch to prevent
  that patch from growing very big.
- Using amdgpu_bo_create instead of amdgpu_bo_create_kernel in prep
  for eviction fences (Christian)

V9:
 - Rebase
V10:
 - Added Alex's R-B

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
5501117d24 drm/amdgpu: add new IOCTL for usermode queue
This patch adds:
- A new IOCTL function to create and destroy
- A new structure to keep all the user queue data in one place.
- A function to generate unique index for the queue.

V1: Worked on review comments from RFC patch series:
  - Alex: Keep a list of queues, instead of single queue per process.
  - Christian: Use the queue manager instead of global ptrs,
           Don't keep the queue structure in amdgpu_ctx

V2: Worked on review comments:
 - Christian:
   - Formatting of text
   - There is no need for queuing of userqueues, with idr in place
 - Alex:
   - Remove use_doorbell, its unnecessary
   - Reuse amdgpu_mqd_props for saving mqd fields

 - Code formatting and re-arrangement

V3:
 - Integration with doorbell manager

V4:
 - Accommodate MQD union related changes in UAPI (Alex)
 - Do not set the queue size twice (Bas)

V5:
 - Remove wrapper functions for queue indexing (Christian)
 - Do not save the queue id/idr in queue itself (Christian)
 - Move the idr allocation in the IP independent generic space
  (Christian)

V6:
 - Check the validity of input IP type (Christian)

V7:
 - Move uq_func from uq_mgr to adev (Alex)
 - Add missing free(queue) for error cases (Yifan)

V9:
 - Rebase

V10: Addressed review comments from Christian, and added R-B:
 - Do not initialize the local variable
 - Convert DRM_ERROR to DEBUG.

V11:
  - check the input flags to be zero (Alex)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:15 -04:00
Shashank Sharma
bf33cb6551 drm/amdgpu: add usermode queue base code
This patch adds IP independent skeleton code for amdgpu
usermode queue. It contains:
- A new files with init functions of usermode queues.
- A queue context manager in driver private data.

V1: Worked on design review comments from RFC patch series:
(https://patchwork.freedesktop.org/series/112214/)
- Alex: Keep a list of queues, instead of single queue per process.
- Christian: Use the queue manager instead of global ptrs,
           Don't keep the queue structure in amdgpu_ctx

V2:
 - Reformatted code, split the big patch into two

V3:
- Integration with doorbell manager

V4:
- Align the structure member names to the largest member's column
  (Luben)
- Added SPDX license (Luben)

V5:
- Do not add amdgpu.h in amdgpu_userqueue.h (Christian).
- Move struct amdgpu_userq_mgr into amdgpu_userqueue.h (Christian).

V6: Rebase
V9: Rebase
V10: Rebase + Alex's R-B

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:14 -04:00
Alexandre Demers
9cfb230210 drm/amdgpu: still cleanup sid.h
The defines, shifts and masks are already available in dce_6_0_d.h,
dce_6_0_sh_mask.h.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:14 -04:00
Alexandre Demers
b255b64883 drm/amdgpu: fill in gmc_v6_0_set_clockgating_state()
Pretty much was already there, just not ported to amdgpu.

Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:14 -04:00
Alexandre Demers
a149f0bd0b drm/amd/display/dc: reclassify DCE6 resources and hw sequencer
Classify DCE6 resource and sequencer as they are for other DCE versions

Put dce60_resource.c and .h under amd/display/dc/resource/dce60
Put and rename dce60_hw_sequencer.c and .h under amd/display/dc/hwss/dce60

v2: fix build when CONFIG_DRM_AMD_DC_SI=n (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:14 -04:00
Lijo Lazar
6ffc6e056f drm/amdgpu: Reset RAS table if header is invalid
If a valid header is not found during RAS eeprom init, consider it as
new and reset RAS table info.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:14 -04:00
Tao Zhou
b695dd3bb8 drm/amdgpu: add loop bits for NPS2 page retirement
Support NPS2 RAS.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:14 -04:00
Kenneth Feng
bb00bf1732 drm/amd/amdgpu: decouple ASPM with pcie dpm
ASPM doesn't need to be disabled if pcie dpm is disabled.
So ASPM can be independantly enabled.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:14 -04:00
Ruili Ji
940e772635 amd/amdgpu: Init vcn hardware per instance for vcn 4.0.3
Add interface for hardware init by vcn instance.
v2: fix code format

Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:14 -04:00
Victor Skvortsov
3394069e7d drm/amdgpu: Disable ACA on VFs
VFs query RAS error counts directly from host with
AMDGPU_RAS_VIRT_ERROR_COUNT_QUERY. When ACA is enabled,
an unusable aca_sysfs is created rather than amdgpu_ras_sysfs_create()

Likewise, VFs depend on host support to query CPERs, rather than ACA component.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:14 -04:00
Alexandre Demers
9101b84f8c drm/amdgpu: use "irq" in place of "interrupt" in DCE6/8 as in DCE10/11
"interrupt" becomes "irq" in:
dce_vX_0_set_hpd_interrupt_state()
dce_vX_0_set_crtc_interrupt_state()
dce_vX_0_set_pageflip_interrupt_state()

It is easier when going through the code to just change the DCE number in
the functions' name to find and compare them across DCE versions.

Also, it standardizes function mapping inside a given structure where .set
and .process are both set to functions with a "_irq" suffix.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:13 -04:00
Alexandre Demers
160e6f5108 drm/amdgpu: fix typos in DCEs
In DCE6, DCE8, DCE10, DCE11, "hdp" is replaced by "hpd" and
replace "type" by "hpd" for a uniform parameter naming usage across DCEs.

In link_factory.c, there is a missing "p" to "types"

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:13 -04:00
Alex Deucher
9e7b08d239 drm/amdgpu/mes12: optimize MES pipe FW version fetching
Don't fetch it again if we already have it.  It seems the
registers don't reliably have the value at resume in some
cases.

Fixes: 785f0f9fe7 ("drm/amdgpu: Add mes v12_0 ip block support (v4)")
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:13 -04:00
Denis Arefev
da7dc714a8 drm/amd/pm/smu11: Prevent division by zero
The user can set any speed value.
If speed is greater than UINT_MAX/8, division by zero is possible.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 1e866f1fe5 ("drm/amd/pm: Prevent divide by zero")
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:48:13 -04:00
Alex Deucher
906ad45167 drm/amdgpu: cancel gfx idle work in device suspend for s0ix
This is normally handled in the gfx IP suspend callbacks, but
for S0ix, those are skipped because we don't want to touch
gfx.  So handle it in device suspend.

Fixes: b9467983b7 ("drm/amdgpu: add dynamic workload profile switching for gfx10")
Fixes: 963537ca23 ("drm/amdgpu: add dynamic workload profile switching for gfx11")
Fixes: 5f95a15495 ("drm/amdgpu: add dynamic workload profile switching for gfx12")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:05:35 -04:00
Kenneth Feng
b23f81c442 drm/amd/display: pause the workload setting in dm
Pause the workload setting in dm when doing idle optimization

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:05:26 -04:00
Alex Deucher
92e511d1ce drm/amdgpu/pm/swsmu: implement pause workload profile
Add the callback for implementation for swsmu.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:05:23 -04:00
Alex Deucher
6dafb5d4c7 drm/amdgpu/pm: add workload profile pause helper
To be used for display idle optimizations when
we want to pause non-default profiles.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:05:17 -04:00
Alex Deucher
0e2ebfe276 drm/amdgpu/gfx12: dump full CP packet header FIFOs
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:05:14 -04:00
Alex Deucher
eb15a5d1ae drm/amdgpu/gfx11: dump full CP packet header FIFOs
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:05:11 -04:00
Alex Deucher
867cf768cb drm/amdgpu/gfx10: dump full CP packet header FIFOs
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:05:07 -04:00
Alex Deucher
fd4948494d drm/amdgpu/gfx9.4.3: dump full CP packet header FIFOs
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:08 -04:00
Alex Deucher
a267d1686c drm/amdgpu/gfx9: dump full CP packet header FIFOs
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:08 -04:00
Ruili Ji
9f7ce6a9ab drm/amd/pm: implement dpm vcn reset function
Implement VCN engine reset by sending MSG_ResetVCN
on smu 13.0.6.

v2: fix format for code and message

Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:08 -04:00
Taimur Hassan
dd035239c9 drm/amd/display: Promote DC to 3.2.328
Summary:

* Optimize custom brightness curve
* Correct SSC enable detection for DCN351
* Turn off eDP lcdvdd and backlight if not required
* Use DMUB Fused IO interface for HDCP
* Extend eDP-on-DP1 quirk list

Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:08 -04:00
Sherry Wang
e3895e8a87 drm/amd/display: rename IPS2 entry/exit message
[Why&How]
Fix the confusing entry/exit message name for IPS2

Reviewed-by: Duncan Ma <duncan.ma@amd.com>
Signed-off-by: Sherry Wang <Yao.Wang1@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:08 -04:00
Taimur Hassan
8581214d5e drm/amd/display: [FW Promotion] Release 0.1.5.0
Aligning dmub_cmd header with dmu firmware release 0.1.5.0

Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:08 -04:00
Charlene Liu
0d93e82186 drm/amd/display: turn off eDP lcdvdd and backlight if not required
[why]
A+N configuration, eDP on A-APU is off, extended display active.
Resume from s4, eDP's backlight is still on.

[how]
Turn off inactive eDP backlight and lcdvdd.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Jing Zhou <Jing.Zhou@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:08 -04:00
Ausef Yousof
32be4e39f4 drm/amd/display: dont disable dtb as dto src during dpms off
fix was previously in 25.20 but was reverted out as it was accompanied
by other changes that caused regression.

[why&how]
Disabling dtb as the dto src during dpms off relies on in the same
instance being able to also alter the dto src bit to dpref (or not dtb
in general), but this was recently changed to only take place in
dcn31_program_pix_clk, as that is where we want to perform any dto src
changes because tg is off at that point, it is unsafe to do that
elsewhere. What this means is now instead of disabling dtb as dto src
and modifying source bit, we are left with the configuration for a given
tg that specifies dtb as dto src and dtb dto en simultaneously is unset.
dcn31_program_pix_clk can rectify this but its possible for us to
perform some tg dependant  operation that would simply hang because when
we go to enable say crtc then, the clk we specify as dto src is "off" en
bit is cleared, source bit was never changed, and program_pix_clk hasnt
been called yet (as apart of dpms on)

We cant disable it as dto src during dpms off if we want the luxury of
performing tg dependant operation during dpms off and before dpms on.

Reviewed-by: Yihan Zhu <yihan.zhu@amd.com>
Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:08 -04:00
Ausef Yousof
556db637c2 drm/amd/display: wait for updates to latch before locking
[why&how]
It is possible for an update to acquire otg lock and begin programming
while the previous update has not completed and its values have not
latched. The correct way to go about this is to wait until the vupdate
pulses so we can be sure that previous updates have latched and we can
continue with the current update pipe programming, otherwise during
consecutive full updates we will have corruption flash on the screen.

The corruption flash occurs specifically on configs that require odm
combine, and its local to a specific pipe (will not flash across whole
screen). This ticket is across the otg slave, but it may also appear
across master.

Reviewed-by: Leo Chen <leo.chen@amd.com>
Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:08 -04:00
Mario Limonciello
33056a97ae drm/amd/display: Remove double checks for debug.enable_mem_low_power.bits.cm
[Why]
A variety of the 3DLUT handling functions check
`debug.enable_mem_low_power.bits.cm` both in the caller and function.
This is unnecessary overhead.

[How]
For each of them reduce to just checking just in caller or function.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:07 -04:00
Mario Limonciello
4321742c39 drm/amd/display: Move PSR support message into amdgpu_dm
[Why]
PSR support could vary from the panels connected to one GPU versus
another.

[How]
Move PSR support message into amdgpu_dm which has the scope of the
GPU and use that information.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:07 -04:00
Mario Limonciello
ef62b92b9d drm/amd/display: Adjust all dev_*() messages to drm_*()
[Why]
dev_*() messages don't show that they are from a driver in drm
subsystem.

[How]
Change all dev_*() messages to drm_*() messages.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:07 -04:00
Dominik Kaszewski
ce801e5d6c drm/amd/display: HDCP Locality check using DMUB Fused IO
[Why]
HDCP locality check has strict timing requirements, currently broken
due to reliance on msleep which does not guarantee accuracy.
The PR moves the write-poll-read sequence into DMUB using new generic
Fused IO interface, where the timing accuracy is greatly improved.
New flow is enabled using DCN resource capability bit (none for now),
or using a debug flag.

[How]
* Extended mod_hdcp_config with new function for requesting DMUB
to execute a sequence of fused I2C/AUX commands and synchronously
wait until an outbox reply arrives or a timeout expires.
* If the timeout expires, send an abort to DMUB.
* Update HDCP to use the DMUB for locality check if supported.
* Add DC_HDCP_LC_FORCE_FW_ENABLE and DC_HDCP_LC_ENABLE_SW_FALLBACK.
* Make the first enable new flow regardless of resource capabilities.
* Make the second enable fallback to old SW flow.
* Clean up makefile source file listings for easier updates.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dominik Kaszewski <dominik.kaszewski@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:07 -04:00
Kevin Gao
d01a7306e1 drm/amd/display: Correct SSC enable detection for DCN351
[Why]
Due to very small clock register delta between DCN35 and DCN351, clock
spread is being checked on the wrong register for DCN351, causing the
display driver to believe that DPREFCLK downspread to be disabled when
in some stacks it is enabled. This causes the clock values for audio to
be incorrect.

[How]
Both DCN351 and DCN35 use the same clk_mgr, so we modify the DCN35
function that checks for SSC enable to read CLK6 instead of CLK5 when
using DCN351. This allows us to read for DPREFCLK downspread correctly
so the clock can properly compensate when setting values.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Kevin Gao <kevin.gao3@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:07 -04:00
Mario Limonciello
03b979e102 drm/amd/display: Optimize custom brightness curve
[Why]
When BIOS includes a lot of custom brightness data points, walking
the entire list can be time consuming.  This is most noticed when
dragging a power slider.  The "higher" values are "slower" to drag
around.

[How]
Move custom brightness calculation loop into a static function. Before
starting the loop check the "half way" data point to see how it compares
to the input.  If greater than the half way data point use that as the
starting point instead.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:07 -04:00
Victor Skvortsov
f9fbc33881 drm/amdgpu: Fix CPER error handling on VFs
CPER read will loop infinitely if an error is encountered and
the more bit is set. Add error checks to break upon failure.

v2: added function pointer checks

Suggested-by: Tony Yi <Tony.Yi@amd.com>
Signed-off-by: Victor Skvortsov <Victor.Skvortsov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:00:40 -04:00
Dominik Kaszewski
7bb430f087 drm/amdgpu: Fix typo in DC_DEBUG_MASK kernel-doc
Add missing colon in kernel-doc for DC_DEBUG_MASK enum.

Signed-off-by: Dominik Kaszewski <dominik.kaszewski@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
Sunil Khatri
89dab189a2 drm/amdgpu: Fix the comment to avoid warning
Fix the below comment warning
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:541:
warning: Function parameter or struct member 'adev'
not described in 'amdgpu_sdma_register_on_reset_callbacks'

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
Lijo Lazar
6dee64e765 drm/amdgpu: Fix xgmi v6.4.1 link status reporting
Use the right register offsets for getting link status.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
Andrey Vatoropin
d53a64e9ee drm/amd/display: Remove the redundant NULL check
Static analysis shows that pointer "timing" cannot be NULL because it
points to the object "struct dc_crtc_timing".

Remove the extra NULL check. It is meaningless and harms the readability
of the code.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
Lijo Lazar
5df0d6addb drm/amdgpu: Add basic validation for RAS header
If RAS header read from EEPROM is corrupted, it could result in trying
to allocate huge memory for reading the records. Add some validation to
header fields.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
Apurv Mishra
daafa303d1 drm/amdkfd: Drop workaround for GC v9.4.3 revID 0
Remove workaround code for the early engineering
samples GC v9.4.3 SOCs with revID 0

Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
James Flowers
ca690c7e21 drm/amd/display: removed unused function
Removed unused function mpc401_get_3dlut_fast_load_status.

Signed-off-by: James Flowers <bold.zone2373@fastmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
Alexandre Demers
060708d1fa drm/amdgpu: huge sid.h cleanup, drop substituted defines.
Now that we are using the proper defines, cleanup useless old "substituted" defines.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
Alexandre Demers
d3cd9565c6 drm/amdgpu: move si.c away from sid.h
Replace defines by the ones added earlier to GFX6, SMU6 and DCE6

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
Alexandre Demers
4445c9dfa9 drm/pm/legacy-dpm: move SI away from sid.h and si_enums.h
Replace defines for the ones under smu_6_0_d.h and smu_6_0_sh_mask.h

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
Boyuan Zhang
e66c07864e drm/amdgpu: enable FW workaround for VCN 4_0_5
Enabling VCN FW workaround for drm key injection through shared
memory for vcn 4_0_5

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:59 -04:00
Victor Skvortsov
0c6e39ce6d drm/amdgpu: Add indirect L1_TLB_CNTL reg programming for VFs
VFs on some IP versions are unable to access this register directly.

This register must be programmed before PSP ring is setup,
so use PSP VF mailbox directly. PSP will broadcast the register
value to all VF assigned instances.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:58 -04:00
Prike Liang
4aa8de3d03 drm/amdgpu/gfx12: Implement the gfx12 kgq pipe reset
Implement the GFX12 kgq pipe reset, and temporarily disable
the GFX12 pipe reset until the CPFW fully support it.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:58 -04:00
Alexandre Demers
340f1d9fcd drm/amdgpu: add missing SMU6 defines, shifts and masks
They will be used later when switching away from sid.h/si_enums.h.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:58 -04:00
Charles Han
820116a39f drm/amd/pp: Fix potential NULL pointer dereference in atomctrl_initialize_mc_reg_table
The function atomctrl_initialize_mc_reg_table() and
atomctrl_initialize_mc_reg_table_v2_2() does not check the return
value of smu_atom_get_data_table(). If smu_atom_get_data_table()
fails to retrieve vram_info, it returns NULL which is later
dereferenced.

Fixes: b3892e2bb5 ("drm/amd/pp: Use atombios api directly in powerplay (v2)")
Fixes: 5f92b48cf6 ("drm/amd/pm: add mc register table initialization")
Signed-off-by: Charles Han <hanchunchao@inspur.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:58 -04:00
Prike Liang
d69248cf4c drm/amdgpu/gfx11: Implement the GFX11 KCQ pipe reset
Implement the GFX11 compute pipe reset. As the GFX11 CPFW
still hasn't fully supported pipe reset yet, therefore
disable the KCQ pipe reset temporarily.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:58 -04:00
Prike Liang
dcc8e148e0 drm/amdgpu/gfx11: Implement the GFX11 KGQ pipe reset
Implement the kernel graphics queue pipe reset,and the driver
will fallback to pipe reset when the queue reset fails. However,
the ME FW hasn't fully supported pipe reset yet so disable the
KGQ pipe reset temporarily.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:58 -04:00
Aric Cyr
b5af7525ae drm/amd/display: Promote DAL to 3.2.327
Summary:

* Improve vrr for replay and psr
* Rewrite drm debug message
* Fix clock issues for dcn32 and dcn401
* Fix mst dsc mode validation issue

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:58 -04:00
Dillon Varone
e8cc149ed9 drm/amd/display: Fix Vertical Interrupt definitions for dcn32, dcn401
[WHY&HOW]
- VUPDATE_NO_LOCK should be used in place of VUPDATE always
- Add VERTICAL_INTERRUPT1 and VERTICAL_INTERRUPT2 definitions

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:58 -04:00
Dillon Varone
0fc9635a80 Revert "drm/amd/display: Fix VUpdate offset calculations for dcn401"
This reverts commit fe45e2af4a.

Reason for revert: it causes stuttering in some usecases.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:37 -04:00
Dillon Varone
fe45e2af4a drm/amd/display: Fix VUpdate offset calculations for dcn401
[WHY&HOW]
DCN401 uses a different structure to store the VStartup offset used to
calculate the VUpdate position, so adjust the calculations to use this
value.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:37 -04:00
Fangzhi Zuo
146a4429b5 drm/amd/display: Do Not Consider DSC if Valid Config Not Found
[why]
In the mode validation, mst dsc is considered for bw calculation after
common dsc config is determined. Currently it considered common dsc config
is found if max and min target bpp are non zero which is not accurate. Invalid
max and min target bpp values would not get max_kbps and min_kbps calculated,
leading to falsefully pass a mode that does not have valid dsc parameters
available.

[how]
Use the return value of decide_dsc_bandwidth_range() to determine whether valid
dsc common config is found or not. Prune out modes that do not have valid common
dsc config determined.

Reviewed-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:37 -04:00
Dillon Varone
a3b7dc4a1e drm/amd/display: Add Support for reg inbox0 for host->DMUB CMDs
[WHY]
DCN4+ supports a new register based mailbox for sending messages
from host to DMCUB. This mailbox supports 64 byte commands, which makes
it compatible with the same structure as the frame buffer based mailbox.

[HOW]
The intention for reg_inbox0 is to be slot in replacement for the frame
buffer based mailbox (Inbox1). It supports all of the required features:
- Supports all messages handled by FB Inbox1
- Supports multi command batching

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:37 -04:00
ChunTao Tso
4b884e3f03 drm/amd/display: Add a Panel Replay config option
[Why]
Replay need special policy for the scenario Teams,
add a flag to imply apply special policy or not.

[How]
Add a config option intended for future use for video conferencing applications.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: ChunTao Tso <ChunTao.Tso@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Aurabindo Pillai
16e24a95fb drm/amd/display: use drm_warn instead of DRM_WARN
drm_warn prints the drm device instance which is helpful when
debugging multi gpu issues

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Aurabindo Pillai
50d6714b24 drm/amd/display: use drm_info instead of DRM_INFO
drm_info prints the drm device instance which is helpful when
debugging multi gpu issues

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Dillon Varone
3b258b6f52 drm/amd/display: Consider downspread against max clocks in DML2.1
[WHY&HOW]
Core should evaluate support based on the max clocks after considering
downspread.

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Robin Chen
7a2911b7f4 drm/amd/display: Enable Replay Low Hz feature flag
Enable replay low refresh rate support.

Reviewed-by: ChunTao Tso <chuntao.tso@amd.com>
Signed-off-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Joshua Aberback
7b9f869879 drm/amd/display: Use meaningful size for block_sequence array
[Why]
This array was initially defined as size 50. There were array overflow
issues so the size was increased to 100. To ensure such issues are
avoided in the future, the size should be set based on the possible
contents instead of an arbitrary value.

[How]
 - upper bound, assume every update occurs on max number of pipes
 - define array sizes for function parameters, for static analysis

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Austin Zheng
40b85a9066 drm/amd/display: Set ODM Factor Based On DML Architecture
[Why]
Mapping of ODM enum is different for DML2.0 vs DML2.1.
Configs using DML2.1 will incorrectly trigger an assert meant for DML2.0.

[How]
Use if/else to seperate logic between DML2.0 and DML2.1.

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Aurabindo Pillai
880ab14a4a drm/amd/display: convert more DRM_ERROR to drm_err
prefer drm_err instead of DRM_ERROR since the former prints the
associated DRM device, which is helpful when debugging multi-gpu
use cases.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Aurabindo Pillai
769e07136a drm/amd/display: use drm_err in create_validate_stream_for_sink()
make the drm device available in create_validate_stream_for_sink()
so that drm_err() can be used

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Aurabindo Pillai
93717be16e drm/amd/display: use drm_err in hpd rx offload
add amdgpu_device pointer to data associated with the work struct
such that hpd handlers has access to the drm device for use with
drm_err()

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Aurabindo Pillai
0f774fce44 drm/amd/display: convert DRM_ERROR to drm_err in hpd_rx_irq_create_workqueue()
pass in a pointer to amdgpu_device directly to the function.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Asad Kamal
7ac66f9355 drm/amd/pm: Use gpu_metrics_v1_8 for smu_v13_0_12
Use gpu_metrics_v1_8 for smu_v13_0_12 to fill metrics data

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Asad Kamal
084769f493 drm/amd/pm: Use gpu_metrics_v1_8 for smu_v13_0_6
Use gpu_metrics_v1_8 for smu_v13_0_6 to fill metrics data

v2: Move exposing caps to separate patch, move smu_v13.0.12 gpu metrics
1.8 usage to separate patch (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Asad Kamal
1189c4fb6f drm/amd/pm: Expose smu_v13_0_6 caps
Expose smu_v13_0_6 caps by moving it to common header

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Alex Deucher
a9a8bccaa3 drm/amdgpu/gfx11: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Alex Deucher
683308af03 drm/amdgpu/gfx10: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Alex Deucher
a4a4c0ae67 drm/amdgpu/gfx9: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
c8b8d7a4f1 drm/amdgpu/gfx8: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
be7652c23d drm/amdgpu/gfx7: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
8307ebc15c drm/amdgpu/gfx6: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
9cffd67e80 drm/amdgpu/gfx: assign the actual me0 queues per pipe
Set the actual number of queues per pipe for ME0 (gfx).
This way we will dump all of the queues properly in
dev core dumps.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
8f970c46b5 drm/amdgpu/gfx: decouple the number of kgqs from the hw
The driver currently sets up one kgq per pipe.  As such
adev->gfx.me.num_queue_per_pipe is hardcoded to 1 everywhere.
This is fine for kernel queues, but when we enable user queues
we need to know that actual number of queues per pipe.  Decouple
the kgq setup from the actual hardware count.  For dev core
dumps and user queues, we want to know the actual number
of queues per pipe.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
0d47bb77b5 drm/amdgpu/gfx: make amdgpu_gfx_me_queue_to_bit() static
It's not used outside of amdgpu_gfx.c.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Srinivasan Shanmugam
9eab245326 drm/amdgpu/gfx10: Add Cleaner Shader Support for GFX10.3.x GPUs
Enable the cleaner shader for other GFX10.3.x series of GPUs to provide
data isolation between GPU workloads. The cleaner shader is responsible
for clearing the Local Data Store (LDS), Vector General Purpose
Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which
helps prevent data leakage and ensures accurate computation results.

This update extends cleaner shader support to GFX10.3.x GPUs, previously
available for GFX10.3.0. It enhances security by clearing GPU memory
between processes and maintains a consistent GPU state across KGD and
KFD workloads.

Cc: Mario Sopena-Novales <mario.novales@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
60d4952d89 drm/amdgpu: drop some dead code
Drop the cgs smu firmware code for SI, it's not used.
The smu firmware fetching for SI is done in si_dpm.c.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alexandre Demers
160d3d39f6 drm/amdgpu: continue cleaning up sid.h and si_enums.h
Remove more duplicated defines and move some in sid.h for coherence with
CIK.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Ananta Srikar
3470f80bd3 drm/amd/amdgpu: Fix typo
Fixes a typo in the word "version" in an error message.

Signed-off-by: Ananta Srikar <srikarananta01@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Andres Urian Florez
c6ae8d587e drm/amdgpu: Replace deprecated function strcpy() with strscpy()
Instead of using the strcpy() deprecated function to populate the
fw_name, use the strscpy() function

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Signed-off-by: Andres Urian Florez <andres.emb.sys@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alex Deucher
48b733d99b drm/amdgpu: add rebar parameter
Add a new parameter to disable BAR resizing.  Note that this
only disables the driver from attempting to resize the BAR,
The BIOS may have resized the BAR at boot.

Some teams have found this useful in debugging P2P DMA
issues on systems where the available MMIO space did not allow
for all of the GPUs present to resize their BARs.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
b71b7cd91c drm/amdgpu: cleanup DCE6 a bit more
Use shifts already available in DCE6's defines, masks and shifts.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
d35a412910 drm/amdgpu: keep removing sid.h dependency from si_dma.c
Move and rename DMA_SEM_INCOMPLETE_TIMER_CNTL and DMA_SEM_WAIT_FAIL_TIMER_CNTL
in oss_1_0_d.h

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
14f15aa054 drm/amdgpu: move si_dma.c away from sid.h and si_enums.h
Replace defines for the ones in oss_1_0_d.h and oss_1_0_sh_mask.h

Taking the opportunity to add some comments taken from cik_sdma.c

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
230a4b0528 drm/amdgpu: make GFX6 easier to read
Just fix the style and add a comment for reading easiness

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
535b619190 drm/amdgpu: add missing GFX6 defines
They will be used later when switching away from sid.h/si_enums.h.

v2: fix whitespace (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
0ba7e47e8e drm/amdgpu: add missing DMA defines, shifts and masks
They will be used later when switching away from sid.h/si_enums.h.

v2: fix up whitespace (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
6168cb7a31 drm/amdgpu: move DCE6 away from sid.h and si_enums.h defines
This cleans up DCE6.

I added some minor tweaks taken from CIK to exit early

v2: minor fixes (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
76eb396db3 drm/amdgpu: use GRPH_SECONDARY_SURFACE_ADDRESS_MASK with GRPH_SECONDARY_SURFACE_ADDRESS in DCE6
It seems a copy-paste error: since we are working with
mmGRPH_SECONDARY_SURFACE_ADDRESS,
GRPH_SECONDARY_SURFACE_ADDRESS__GRPH_SECONDARY_SURFACE_ADDRESS_MASK
should be used.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
c82d915fe1 drm/amdgpu: move si_ih.c away from sid.h defines
They are properly defined under oss_1_0_d.h

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
cbd8207e23 drm/amdgpu: remove PACKET3 duplicated defines from si_enums.h
PACKET3 is already in sid.h, as it is done under cikd.h for CIK

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
193e088015 drm/amdgpu: use proper defines, shifts and masks in DCE6 code
By replacing VGA_VSTATUS_CNTL by VGA_RENDER_CONTROL__VGA_VSTATUS_CNTL_MASK,
we also need to fix its usage in GMC6.

Note: VGA_VSTATUS_CNTL's binary value was inverted in dce_6_0_sh_mask.h,
so we need to invert its value where it was used.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
de81b86e96 drm/amdgpu: wire up defines, shifts and masks through SI code
To be able to remove as much duplicated defines, the different files
containing definitions, shifts and masks must be properly included.

Once done, the code will be migrated where needed to shifts and masks and
proper defines, before removing useless defines in the end.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
8e46cabf8e drm/amdgpu: move GFX6 defines into gfx_v6_0.c
Send a few GFX6 defines where it's used in GFX6.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
1be0ae9e12 drm/amdgpu: move X_GB_ADDR_CONFIG_GOLDEN in GFX7
[BONAIRE|HAWAII]_GB_ADDR_CONFIG_GOLDEN are only used by GFX7. So keep them
where they are needed.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
e319f9ec36 drm/amdgpu: small cleanup to CIK SDMA
Tidy cik_sdma_hw_init() by returning directly cik_sdma_start()'s result.

Keep amdgpu_cik_gpu_check_soft_reset() early declaration with others.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
60c53fe7bc drm/amdgpu: use cik_sdma_is_idle() in CIK SDMA
cik_sdma_is_idle() does exactly what we need, so use it.

V2: fix parameter (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
62e0b8f766 drm/amdgpu: use gmc_v7_0_is_idle() since it is available under GMC7
gmc_v7_0_is_idle() does exactly what we need, so use it.

v2: fix parameter (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Saleemkhan Jamadar
cc9428d533 drm/amd/display: add proper error message for vblank init
v1 - DRM_ERROR to drm_err (Mario)

Update message to identifiy the vblank initialization fail case

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Saleemkhan Jamadar
43f668edae drm/amd/display: add proper error message for vblank init
v1 - DRM_ERROR to dev_err (Mario)

Update message to identifiy the vblank initialization fail case

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Ce Sun
969fd18c8d drm/amdgpu/vcn: during dpc recovery will corrupt VCPU buffer
err_event_athub and dpc recovery will corrupt VCPU buffer,
so we need to restore fw data and clear buffer in amdgpu_vcn_resume()

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Ce Sun
8ba904f541 drm/amdgpu: Multi-GPU DPC recovery support
Add support for DPC recover based on refactored code

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Ce Sun
11bb33766f drm/amdgpu: refactor amdgpu_device_gpu_recover
Split amdgpu_device_gpu_recover into the following stages:
halt activities,asic reset,schedule resume and amdgpu resume.
The reason is that the subsequent addition of dpc recover
code will have a high similarity with gpu reset

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Ce Sun
921c040efe drm/amd/pm: Add link reset for SMU 13.0.6
Add link reset implementation

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Lijo Lazar
8e0793b6c5 drm/amdkfd: Use dev_* instead of pr_* for messages
To get the device context, replace pr_ with dev_ functions.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Aric Cyr
eed269da71 drm/amd/display: DC v3.2.326
Summary:

* DML 2.1 resync
* Vblank disable fixes
* Visual confirm debug improvements
* Add command for reading ABM histogram
* Bug fixes & improvements

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
JinZe.Xu
a8f83d0c2d drm/amd/display: Use sync version of indirect register access.
[Why]
Access to indirect registers by DC and other components are not synchronized.

[How]
Use sync version of indirect register access.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: JinZe.Xu <JinZe.Xu@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Aric Cyr
c82d84d1e4 drm/amd/display: Create a temporary scratch dc_link
Create a temporary scratch dc_link for programming purposes
and fix a copy of pipe_ctx on the stack to a pointer reference.

Reviewed-by: Josip Pavic <josip.pavic@amd.com>
Signed-off-by: Aric Cyr <Aric.Cyr@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Charlene Liu
d5a7fdc88a drm/amd/display: fix zero value for APU watermark_c
[why]
the guard of is_apu not in sync, caused no watermark_c output.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Christian König
f5e7fabd1f drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P
Try pinning into VRAM to allow P2P with RDMA NICs without ODP
support if all attachments can do P2P. If any attachment can't do
P2P just pin into GTT instead.

Acked-by: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Tested-by: Pak Nin Lui <pak.lui@amd.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:30 -04:00
Chun-Liang Chang
4a8396d5c2 drm/amd/display: Add Read Histogram command header
[Why]
Read the histogram for VariBright validation

[How]
Add dc/dmub functions to read histogram and ACE

Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Chun-Liang Chang <Chun-Liang.Chang@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:30 -04:00
Maarten Lankhorst
1b5447d773 drm/amdgpu: Add cgroups implementation
Similar to xe, enable some simple management of VRAM only.

Reviewed-by: Christian König <christian.koenig@amd.com>
Co-developed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:30 -04:00
Paul Hsieh
8b8a602c98 drm/amd/display: Skip to enable dsc if it has been off
[Why]
It makes DSC enable when we commit the stream which need
keep power off.And then it will skip to disable DSC if
pipe reset at this situation as power has been off. It may
cause the DSC unexpected enable on the pipe with the
next new stream which doesn't support DSC.

[HOW]
Check the DSC used on current pipe status when update stream.
Skip to enable if it has been off. The operation enable
DSC should happen when set power on.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Paul Hsieh <Paul.Hsieh@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:30 -04:00
Jay Cornwall
3666ed8218 drm/amdgpu: Increase KIQ invalidate_tlbs timeout
KIQ invalidate_tlbs request has been seen to marginally exceed the
configured 100 ms timeout on systems under load.

All other KIQ requests in the driver use a 10 second timeout. Use a
similar timeout implementation on the invalidate_tlbs path.

v2: Poll once before msleep
v3: Fix return value

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Kent Russell <kent.russell@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:30 -04:00
Austin Zheng
7646805506 drm/amd/display: DML21 Reintegration
[Why]
To bring in latest changes in DML21

[List of Changes]
- Unification of DML logging to use DML_LOG_* macro
- Clean up variables that are exclusively used for logging

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:30 -04:00
Cruise
7a505a844c drm/amd/display: Remove BW Allocation from DPIA notification
[Why]
USB4 BW Allocation response will be handled in HPD IRQ.
No need to handle it in DPIA notification callback.

[How]
Remove DP BW allocation response code in DPIA notification.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Cruise <Cruise.Hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:30 -04:00
Leo Zeng
c05da59e97 drm/amd/display: Get visual confirm color for stream
[WHY]
We want to output visual confirm color based on stream.

[HOW]
If visual confirm is for DMUB, use DMUB to get color.
Otherwise, find plane with highest layer index, output visual confirm color
of pipe that contains plane with highest index.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Leo Zeng <Leo.Zeng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:30 -04:00
Leo Zeng
9cdd9c4d03 drm/amd/display: Add override for visual confirm
[WHY]
We want to allow the display manager to override the visual
confirm color in DC when required.

[HOW]
Add new visual confirm mode VISUAL_CONFIRM_EXPLICIT, check mode before
setting visual confirm color.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Leo Zeng <Leo.Zeng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:30 -04:00
Jonathan Kim
b3862d60b1 drm/amdkfd: limit sdma queue reset caps flagging for gfx9
ASICs post GFX 9 are being flagged as SDMA per queue reset supported
in the KGD but KFD and scheduler FW currently have no support.
Limit SDMA queue reset capabilities to GFX 9.

Fixes: ceb7114c96 ("drm/amdkfd: flag per-sdma queue reset supported to user space")
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-04-07 15:18:18 -04:00
Mario Limonciello
1c5fdef30e drm/amd/display: Add HP Elitebook 645 to the quirk list for eDP on DP1
[Why]
HP Elitebook 645 has DP0 and DP1 swapped.

[How]
Add HP Elitebook 645 to DP0/DP1 swap quirk list.

Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3701
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 14:37:25 -04:00
Mario Limonciello
139e99d58e drm/amd/display: Add HP Probook 445 and 465 to the quirk list for eDP on DP1
[Why]
HP Probook 445 and 465 has DP0 and DP1 swapped.

[How]
Add HP Probook 445 and 465 to DP0/DP1 swap quirk list.

Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3995
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Anson Tsao <anson.tsao@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 14:37:09 -04:00
Huacai Chen
366e77cd49 drm/amd/display: Protect FPU in dml2_validate()/dml21_validate()
Commit 7da55c27e7 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".

However, dml2_validate()/dml21_validate() are not protected from their
callers, causing such errors:

 do_fpu invoked from kernel context![#1]:
 CPU: 10 UID: 0 PID: 331 Comm: kworker/10:1H Not tainted 6.14.0-rc6+ #4
 Workqueue: events_highpri dm_irq_work_func [amdgpu]
 pc ffff800003191eb0 ra ffff800003191e60 tp 9000000107a94000 sp 9000000107a975b0
 a0 9000000140ce4910 a1 0000000000000000 a2 9000000140ce49b0 a3 9000000140ce49a8
 a4 9000000140ce49a8 a5 0000000100000000 a6 0000000000000001 a7 9000000107a97660
 t0 ffff800003790000 t1 9000000140ce5000 t2 0000000000000001 t3 0000000000000000
 t4 0000000000000004 t5 0000000000000000 t6 0000000000000000 t7 0000000000000000
 t8 0000000100000000 u0 ffff8000031a3b9c s9 9000000130bc0000 s0 9000000132400000
 s1 9000000140ec0000 s2 9000000132400000 s3 9000000140ce0000 s4 90000000057f8b88
 s5 9000000140ec0000 s6 9000000140ce4910 s7 0000000000000001 s8 9000000130d45010
 ra: ffff800003191e60 dml21_map_dc_state_into_dml_display_cfg+0x40/0x1140 [amdgpu]
   ERA: ffff800003191eb0 dml21_map_dc_state_into_dml_display_cfg+0x90/0x1140 [amdgpu]
  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
  PRMD: 00000004 (PPLV0 +PIE -PWE)
  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
 ESTAT: 000f0000 [FPD] (IS= ECode=15 EsubCode=0)
  PRID: 0014d010 (Loongson-64bit, Loongson-3C6000/S)
 Process kworker/10:1H (pid: 331, threadinfo=000000007bf9ddb0, task=00000000cc4ab9f3)
 Stack : 0000000100000000 0000043800000780 0000000100000001 0000000100000001
         0000000000000000 0000078000000000 0000000000000438 0000078000000000
         0000000000000438 0000078000000000 0000000000000438 0000000100000000
         0000000100000000 0000000100000000 0000000100000000 0000000100000000
         0000000000000001 9000000140ec0000 9000000132400000 9000000132400000
         ffff800003408000 ffff800003408000 9000000132400000 9000000140ce0000
         9000000140ce0000 ffff800003193850 0000000000000001 9000000140ec0000
         9000000132400000 9000000140ec0860 9000000140ec0738 0000000000000001
         90000001405e8000 9000000130bc0000 9000000140ec02a8 ffff8000031b5db8
         0000000000000000 0000043800000780 0000000000000003 ffff8000031b79cc
         ...
 Call Trace:
 [<ffff800003191eb0>] dml21_map_dc_state_into_dml_display_cfg+0x90/0x1140 [amdgpu]
 [<ffff80000319384c>] dml21_validate+0xcc/0x520 [amdgpu]
 [<ffff8000031b8948>] dc_validate_global_state+0x2e8/0x460 [amdgpu]
 [<ffff800002e94034>] create_validate_stream_for_sink+0x3d4/0x420 [amdgpu]
 [<ffff800002e940e4>] amdgpu_dm_connector_mode_valid+0x64/0x240 [amdgpu]
 [<900000000441d6b8>] drm_connector_mode_valid+0x38/0x80
 [<900000000441d824>] __drm_helper_update_and_validate+0x124/0x3e0
 [<900000000441ddc0>] drm_helper_probe_single_connector_modes+0x2e0/0x620
 [<90000000044050dc>] drm_client_modeset_probe+0x23c/0x1780
 [<9000000004420384>] __drm_fb_helper_initial_config_and_unlock+0x44/0x5a0
 [<9000000004403acc>] drm_client_dev_hotplug+0xcc/0x140
 [<ffff800002e9ab50>] handle_hpd_irq_helper+0x1b0/0x1e0 [amdgpu]
 [<90000000038f5da0>] process_one_work+0x160/0x300
 [<90000000038f6718>] worker_thread+0x318/0x440
 [<9000000003901b8c>] kthread+0x12c/0x220
 [<90000000038b1484>] ret_from_kernel_thread+0x8/0xa4

Unfortunately, protecting dml2_validate()/dml21_validate() out of DML2
causes "sleeping function called from invalid context", so protect them
with DC_FP_START() and DC_FP_END() inside.

Fixes: 7da55c27e7 ("drm/amd/display: Remove incorrect FP context start")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Tested-by: Dongyan Qian <qiandongyan@loongson.cn>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 14:36:50 -04:00
Huacai Chen
afcdf51d97 drm/amd/display: Protect FPU in dml2_init()/dml21_init()
Commit 7da55c27e7 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".

However, dml2_init()/dml21_init() are not protected from their callers,
causing such errors:

 do_fpu invoked from kernel context![#1]:
 CPU: 0 UID: 0 PID: 239 Comm: kworker/0:5 Not tainted 6.14.0-rc6+ #2
 Workqueue: events work_for_cpu_fn
 pc ffff80000319de80 ra ffff80000319de5c tp 900000010575c000 sp 900000010575f840
 a0 0000000000000000 a1 900000012f210130 a2 900000012f000000 a3 ffff80000357e268
 a4 ffff80000357e260 a5 900000012ea52cf0 a6 0000000400000004 a7 0000012c00001388
 t0 00001900000015e0 t1 ffff80000379d000 t2 0000000010624dd3 t3 0000006400000014
 t4 00000000000003e8 t5 0000005000000018 t6 0000000000000020 t7 0000000f00000064
 t8 000000000000002f u0 5f5e9200f8901912 s9 900000012d380010 s0 900000012ea51fd8
 s1 900000012f000000 s2 9000000109296000 s3 0000000000000001 s4 0000000000001fd8
 s5 0000000000000001 s6 ffff800003415000 s7 900000012d390000 s8 ffff800003211f80
    ra: ffff80000319de5c dml21_apply_soc_bb_overrides+0x3c/0x960 [amdgpu]
   ERA: ffff80000319de80 dml21_apply_soc_bb_overrides+0x60/0x960 [amdgpu]
  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
  PRMD: 00000004 (PPLV0 +PIE -PWE)
  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
 ESTAT: 000f0000 [FPD] (IS= ECode=15 EsubCode=0)
  PRID: 0014d010 (Loongson-64bit, Loongson-3C6000/S)
 Process kworker/0:5 (pid: 239, threadinfo=00000000927eadc6, task=000000008fd31682)
 Stack : 00040dc000003164 0000000000000001 900000012f210130 900000012eabeeb8
         900000012f000000 ffff80000319fe48 900000012f210000 900000012f210130
         900000012f000000 900000012eabeeb8 0000000000000001 ffff8000031a0064
         900000010575f9f0 900000012f210130 900000012eac0000 900000012ea80000
         900000012f000000 ffff8000031cefc4 900000010575f9f0 ffff8000035859c0
         ffff800003414000 900000010575fa78 900000012f000000 ffff8000031b4c50
         0000000000000000 9000000101c9d700 9000000109c40000 5f5e9200f8901912
         900000012d3c4bd0 900000012d3c5000 ffff8000034aed18 900000012d380010
         900000012d3c4bd0 ffff800003414000 900000012d380000 ffff800002ea49dc
         0000000000000001 900000012d3c6000 00000000ffffe423 0000000000010000
         ...
 Call Trace:
 [<ffff80000319de80>] dml21_apply_soc_bb_overrides+0x60/0x960 [amdgpu]
 [<ffff80000319fe44>] dml21_init+0xa4/0x280 [amdgpu]
 [<ffff8000031a0060>] dml21_create+0x40/0x80 [amdgpu]
 [<ffff8000031cefc0>] dc_state_create+0x100/0x160 [amdgpu]
 [<ffff8000031b4c4c>] dc_create+0x44c/0x640 [amdgpu]
 [<ffff800002ea49d8>] amdgpu_dm_init+0x3f8/0x2060 [amdgpu]
 [<ffff800002ea6658>] dm_hw_init+0x18/0x60 [amdgpu]
 [<ffff800002b16738>] amdgpu_device_init+0x1938/0x27e0 [amdgpu]
 [<ffff800002b18e80>] amdgpu_driver_load_kms+0x20/0xa0 [amdgpu]
 [<ffff800002b0c8f0>] amdgpu_pci_probe+0x1b0/0x580 [amdgpu]
 [<900000000448eae4>] local_pci_probe+0x44/0xc0
 [<9000000003b02b18>] work_for_cpu_fn+0x18/0x40
 [<9000000003b05da0>] process_one_work+0x160/0x300
 [<9000000003b06718>] worker_thread+0x318/0x440
 [<9000000003b11b8c>] kthread+0x12c/0x220
 [<9000000003ac1484>] ret_from_kernel_thread+0x8/0xa4

Unfortunately, protecting dml2_init()/dml21_init() out of DML2 causes
"sleeping function called from invalid context", so protect them with
DC_FP_START() and DC_FP_END() inside.

Fixes: 7da55c27e7 ("drm/amd/display: Remove incorrect FP context start")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 14:36:07 -04:00
Huacai Chen
4408b59eea drm/amd/display: Protect FPU in dml21_copy()
Commit 7da55c27e7 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".

However, dml21_copy() are not protected from their callers, causing such
errors:

 do_fpu invoked from kernel context![#1]:
 CPU: 0 UID: 0 PID: 240 Comm: kworker/0:5 Not tainted 6.14.0-rc6+ #1
 Workqueue: events work_for_cpu_fn
 pc ffff80000318bd2c ra ffff80000315750c tp 9000000105910000 sp 9000000105913810
 a0 0000000000000000 a1 0000000000000002 a2 900000013140d728 a3 900000013140d720
 a4 0000000000000000 a5 9000000131592d98 a6 0000000000017ae8 a7 00000000001312d0
 t0 9000000130751ff0 t1 ffff800003790000 t2 ffff800003790000 t3 9000000131592e28
 t4 000000000004c6a8 t5 00000000001b7740 t6 0000000000023e38 t7 0000000000249f00
 t8 0000000000000002 u0 0000000000000000 s9 900000012b010000 s0 9000000131400000
 s1 9000000130751fd8 s2 ffff800003408000 s3 9000000130752c78 s4 9000000131592da8
 s5 9000000131592120 s6 9000000130751ff0 s7 9000000131592e28 s8 9000000131400008
    ra: ffff80000315750c dml2_top_soc15_initialize_instance+0x20c/0x300 [amdgpu]
   ERA: ffff80000318bd2c mcg_dcn4_build_min_clock_table+0x14c/0x600 [amdgpu]
  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
  PRMD: 00000004 (PPLV0 +PIE -PWE)
  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
 ESTAT: 000f0000 [FPD] (IS= ECode=15 EsubCode=0)
  PRID: 0014d010 (Loongson-64bit, Loongson-3C6000/S)
 Process kworker/0:5 (pid: 240, threadinfo=00000000f1700428, task=0000000020d2e962)
 Stack : 0000000000000000 0000000000000000 0000000000000000 9000000130751fd8
         9000000131400000 ffff8000031574e0 9000000130751ff0 0000000000000000
         9000000131592e28 0000000000000000 0000000000000000 0000000000000000
         0000000000000000 0000000000000000 0000000000000000 0000000000000000
         0000000000000000 0000000000000000 0000000000000000 0000000000000000
         0000000000000000 0000000000000000 0000000000000000 f9175936df5d7fd2
         900000012b00ff08 900000012b000000 ffff800003409000 ffff8000034a1780
         90000001019634c0 900000012b000010 90000001307beeb8 90000001306b0000
         0000000000000001 ffff8000031942b4 9000000130780000 90000001306c0000
         9000000130780000 ffff8000031c276c 900000012b044bd0 ffff800003408000
         ...
 Call Trace:
 [<ffff80000318bd2c>] mcg_dcn4_build_min_clock_table+0x14c/0x600 [amdgpu]
 [<ffff800003157508>] dml2_top_soc15_initialize_instance+0x208/0x300 [amdgpu]
 [<ffff8000031942b0>] dml21_create_copy+0x30/0x60 [amdgpu]
 [<ffff8000031c2768>] dc_state_create_copy+0x68/0xe0 [amdgpu]
 [<ffff800002e98ea0>] amdgpu_dm_init+0x8c0/0x2060 [amdgpu]
 [<ffff800002e9a658>] dm_hw_init+0x18/0x60 [amdgpu]
 [<ffff800002b0a738>] amdgpu_device_init+0x1938/0x27e0 [amdgpu]
 [<ffff800002b0ce80>] amdgpu_driver_load_kms+0x20/0xa0 [amdgpu]
 [<ffff800002b008f0>] amdgpu_pci_probe+0x1b0/0x580 [amdgpu]
 [<9000000003c7eae4>] local_pci_probe+0x44/0xc0
 [<90000000032f2b18>] work_for_cpu_fn+0x18/0x40
 [<90000000032f5da0>] process_one_work+0x160/0x300
 [<90000000032f6718>] worker_thread+0x318/0x440
 [<9000000003301b8c>] kthread+0x12c/0x220
 [<90000000032b1484>] ret_from_kernel_thread+0x8/0xa4

Unfortunately, protecting dml21_copy() out of DML2 causes "sleeping
function called from invalid context", so protect them with DC_FP_START()
and DC_FP_END() inside.

Fixes: 7da55c27e7 ("drm/amd/display: Remove incorrect FP context start")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 14:34:32 -04:00
Tom Chung
69a46ce1f1 drm/amd/display: Do not enable Replay and PSR while VRR is on in amdgpu_dm_commit_planes()
[Why]
Replay and PSR will cause some video corruption while VRR is enabled.

[How]
Do not enable the Replay and PSR while VRR is active in
amdgpu_dm_enable_self_refresh().

Fixes: 67edb81d6e ("drm/amd/display: Disable replay and psr while VRR is enabled")
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-04-07 14:32:36 -04:00
Emily Deng
ba6d8f878d drm/amdkfd: sriov doesn't support per queue reset
Disable per queue reset for sriov.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 14:32:29 -04:00
Flora Cui
2f6dd741cd drm/amdgpu/ip_discovery: add missing ip_discovery fw
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 14:32:20 -04:00
Matthew Auld
c0dd8a9253 drm/amdgpu/dma_buf: fix page_link check
The page_link lower bits of the first sg could contain something like
SG_END, if we are mapping a single VRAM page or contiguous blob which
fits into one sg entry. Rather pull out the struct page, and use that in
our check to know if we mapped struct pages vs VRAM.

Fixes: f44ffd677f ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 14:31:45 -04:00
Christian König
a755906fb2 drm/amdgpu: immediately use GTT for new allocations
Only use GTT as a fallback if we already have a backing store. This
prevents evictions when an application constantly allocates and frees new
memory.

Partially fixes
https://gitlab.freedesktop.org/drm/amd/-/issues/3844#note_2833985.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 216c1282dd ("drm/amdgpu: use GTT only as fallback for VRAM|GTT")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-04-07 14:30:57 -04:00
Alex Deucher
b71a2bb0ce drm/amdgpu/mes11: optimize MES pipe FW version fetching
Don't fetch it again if we already have it.  It seems the
registers don't reliably have the value at resume in some
cases.

Fixes: 028c3fb37e ("drm/amdgpu/mes11: initiate mes v11 support")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4083
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-04-07 14:30:10 -04:00
Thomas Zimmermann
1afba39f93 Merge drm/drm-next into drm-misc-next
Backmerging to get v6.15-rc1 into drm-misc-next. Also fixes a
build issue when enabling CONFIG_DRM_SCHED_KUNIT_TEST.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-04-07 14:35:48 +02:00
Linus Torvalds
16cd1c2657 A set of final cleanups for the timer subsystem:
1) Convert all del_timer[_sync]() instances over to the new
      timer_delete[_sync]() API and remove the legacy wrappers.
 
      Conversion was done with coccinelle plus some manual fixups as
      coccinelle chokes on scoped_guard().
 
   2) The final cleanup of the hrtimer_init() to hrtimer_setup() conversion.
 
      This has been delayed to the end of the merge window, so that all
      patches which have been merged through other trees are in mainline and
      all new users are catched.
 
 Doing this right before rc1 ensures that new code which is merged post rc1
 is not introducing new instances of the original functionality.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmfyXi0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYzlD/4ykDZbUzgTreYOxEQpBJ9elPwBhxfL
 1v8OwDjRWlNrmLup8RiUfKrlbmztGl1J/u9ld0qhjcqkywCCBC1N5S+DhCjYetyP
 MPWLbi2Dc35cFA+M7i8fMgxI2K9MLz2Zj1UKxz1MdsSuNHm07N3mul/3T11Ye4Rz
 nPlzeQBTBDFCKTEGKjr8zjuoD15Wl48sObM0AjV35BPuQR1jfY4CE6VXo2h78+0c
 jYwpJpDmcd+o1bDrfFhWUME2DzABEkHhn4wNSETnM4E5RXZRMUbi4UiigzInibQr
 JOUTKwPJXTMX/Erd0XyXErrYf2qy1X9BQy6NlyDDOv+8kLEVRsC9Efplx9uoEtfi
 QvVT/UmgmhZFJBfIT3/B8OvasrfwOropaYoG4L0zbDpp1b09VY47N5lCLlNr/mZf
 jb2TwIln8Szy2EfIT2RSd0ZNupyU8V4aH/mYNpSlbUJ6mfvfIAttBSS/YH+Zeqku
 7zOJkoCusaySOCZCOQkeikL3ZBN+FHtNteXxmGnp34ed/tsfgGZj1lsbmkM2rrWo
 f2mQsYAclUA4KQeY9z/Xf7/c5wJUkME69PxOaaN23dOpBR7GA58Cvb0PQTnPlAiT
 KnH/JRweBHtcv4KEHMi2f5no4cxcmXyKTj7/TLyYNjc8LATL9Eo/nxG36PLxy4lN
 QPOWz11zEBLjQQ==
 =8Ftq
 -----END PGP SIGNATURE-----

Merge tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer cleanups from Thomas Gleixner:
 "A set of final cleanups for the timer subsystem:

   - Convert all del_timer[_sync]() instances over to the new
     timer_delete[_sync]() API and remove the legacy wrappers.

     Conversion was done with coccinelle plus some manual fixups as
     coccinelle chokes on scoped_guard().

   - The final cleanup of the hrtimer_init() to hrtimer_setup()
     conversion.

     This has been delayed to the end of the merge window, so that all
     patches which have been merged through other trees are in mainline
     and all new users are catched.

  Doing this right before rc1 ensures that new code which is merged post
  rc1 is not introducing new instances of the original functionality"

* tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tracing/timers: Rename the hrtimer_init event to hrtimer_setup
  hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack()
  hrtimers: Rename debug_init() to debug_setup()
  hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper()
  hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns()
  hrtimers: Make callback function pointer private
  hrtimers: Merge __hrtimer_init() into __hrtimer_setup()
  hrtimers: Switch to use __htimer_setup()
  hrtimers: Delete hrtimer_init()
  treewide: Convert new and leftover hrtimer_init() users
  treewide: Switch/rename to timer_delete[_sync]()
2025-04-06 08:35:37 -07:00
Linus Torvalds
758e4c86a1 drm fixes for 6.15-rc1
bridge:
 - tda998x: Select CONFIG_DRM_KMS_HELPER
 
 amdgpu:
 - Guard against potential division by 0 in fan code
 - Zero RPM support for SMU 14.0.2
 - Properly handle SI and CIK support being disabled
 - PSR fixes
 - DML2 fixes
 - DP Link training fix
 - Vblank fixes
 - RAS fixes
 - Partitioning fix
 - SDMA fix
 - SMU 13.0.x fixes
 - Rom fetching fix
 - MES fixes
 - Queue reset fix
 
 xe:
 - Fix NULL pointer dereference on error path
 - Add missing HW workaround for BMG
 - Fix survivability mode not triggering
 - Fix build warning when DRM_FBDEV_EMULATION is not set
 
 i915:
 - Bounds check for scalers in DSC prefill latency computation
 - Fix build by adding a missing include
 
 adp:
 - Fix error handling in plane setup
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmfwQ5wACgkQDHTzWXnE
 hr7OIxAAn8ATKOYsyxaq0t8+gj2jhmGC1L3X6tQUzSh3uAwZpWzYlP8RVMMS9WrL
 Cwj0kBLlq71GasuG/wjEnAxuoahEWebrwTm73it9yWXWW+GsQYhOcWttvmhNSVET
 aTUdB2VM32S0QHf8t6rP0yG4XqJD95sSNnXCTh6/ksk5rhK6NpqyJRkdVBIRxI5O
 IUXxfxmREIrAq5ZfpoPuVUEA2US2W9p+L616RnILewa3iZYzK9XahfcErFU8xlS+
 796ZaYFGEWUmdJKTyCejRHz5z1afXAgKO3vRyGKbLC4MAsxk6MsWzTXsH5EMukxF
 lCtJQuODFDed2gXxLUOeXAUjWoHrgYpJBWRk23Pq83Kx2sf85MD2SF2d8pGJqMa+
 B0JYr/mFtZCi5xraHIH6rwx7k2+KqPVAl+P8f/Fye/AIxBUE3KoI43M69TQ3B2VD
 dG8Jz53DUVQ1TUGJjfZajninFfBdDlr9PgPLmp1gESjbGkI1azGXngwlaVK4kti+
 iZmM/4pbB0A/+PG7Wq1DFEhr0fvQW2RjtgXoh+CujRLgK7ZYwVbCcFpmGQgjOiXb
 IyphOc7tPiOH1EcXGSsrV5JlkyhVL0Ghu6JCKQCzF5MXbrg1fwjjUwVuGtJyqtso
 kKBbx3bx7rv577VXRhM13s0MPwOyhgi56a4kBtX7YRJqY5cVkdM=
 =P/sW
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2025-04-05' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Weekly fixes, mostly from the end of last week, this week was very
  quiet, maybe you scared everyone away. It's mostly amdgpu, and xe,
  with some i915, adp and bridge bits, since I think this is overly
  quiet I'd expect rc2 to be a bit more lively.

  bridge:
   - tda998x: Select CONFIG_DRM_KMS_HELPER

  amdgpu:
   - Guard against potential division by 0 in fan code
   - Zero RPM support for SMU 14.0.2
   - Properly handle SI and CIK support being disabled
   - PSR fixes
   - DML2 fixes
   - DP Link training fix
   - Vblank fixes
   - RAS fixes
   - Partitioning fix
   - SDMA fix
   - SMU 13.0.x fixes
   - Rom fetching fix
   - MES fixes
   - Queue reset fix

  xe:
   - Fix NULL pointer dereference on error path
   - Add missing HW workaround for BMG
   - Fix survivability mode not triggering
   - Fix build warning when DRM_FBDEV_EMULATION is not set

  i915:
   - Bounds check for scalers in DSC prefill latency computation
   - Fix build by adding a missing include

  adp:
   - Fix error handling in plane setup"

  # -----BEGIN PGP SIGNATURE-----

* tag 'drm-next-2025-04-05' of https://gitlab.freedesktop.org/drm/kernel: (34 commits)
  drm/i2c: tda998x: select CONFIG_DRM_KMS_HELPER
  drm/amdgpu/gfx12: fix num_mec
  drm/amdgpu/gfx11: fix num_mec
  drm/amd/pm: Add gpu_metrics_v1_8
  drm/amdgpu: Prefer shadow rom when available
  drm/amd/pm: Update smu metrics table for smu_v13_0_6
  drm/amd/pm: Remove host limit metrics support
  Remove unnecessary firmware version check for gc v9_4_2
  drm/amdgpu: stop unmapping MQD for kernel queues v3
  Revert "drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA"
  drm/amdgpu: Parse all deferred errors with UMC aca handle
  drm/amdgpu: Update ta ras block
  drm/amdgpu: Add NPS2 to DPX compatible mode
  drm/amdgpu: Use correct gfx deferred error count
  drm/amd/display: Actually do immediate vblank disable
  drm/amd/display: prevent hang on link training fail
  Revert "drm/amd/display: dml2 soc dscclk use DPM table clk setting"
  drm/amd/display: Increase vblank offdelay for PSR panels
  drm/amd: Handle being compiled without SI or CIK support better
  drm/amd/pm: Add zero RPM enabled OD setting support for SMU14.0.2
  ...
2025-04-05 15:35:11 -07:00
Thomas Gleixner
8fa7292fee treewide: Switch/rename to timer_delete[_sync]()
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.

Conversion was done with coccinelle plus manual fixups where necessary.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-05 10:30:12 +02:00
Linus Torvalds
2cd5769fb0 Driver core updates for 6.15-rc1
Here is the big set of driver core updates for 6.15-rc1.  Lots of stuff
 happened this development cycle, including:
   - kernfs scaling changes to make it even faster thanks to rcu
   - bin_attribute constify work in many subsystems
   - faux bus minor tweaks for the rust bindings
   - rust binding updates for driver core, pci, and platform busses,
     making more functionaliy available to rust drivers.  These are all
     due to people actually trying to use the bindings that were in 6.14.
   - make Rafael and Danilo full co-maintainers of the driver core
     codebase
   - other minor fixes and updates.
 
 This has been in linux-next for a while now, with the only reported
 issue being some merge conflicts with the rust tree.  Depending on which
 tree you pull first, you will have conflicts in one of them.  The merge
 resolution has been in linux-next as an example of what to do, or can be
 found here:
 	https://lore.kernel.org/r/CANiq72n3Xe8JcnEjirDhCwQgvWoE65dddWecXnfdnbrmuah-RQ@mail.gmail.com
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ+mMrg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylRgwCdH58OE3BgL0uoFY5vFImStpmPtqUAoL5HpVWI
 jtbJ+UuXGsnmO+JVNBEv
 =gy6W
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updatesk from Greg KH:
 "Here is the big set of driver core updates for 6.15-rc1. Lots of stuff
  happened this development cycle, including:

   - kernfs scaling changes to make it even faster thanks to rcu

   - bin_attribute constify work in many subsystems

   - faux bus minor tweaks for the rust bindings

   - rust binding updates for driver core, pci, and platform busses,
     making more functionaliy available to rust drivers. These are all
     due to people actually trying to use the bindings that were in
     6.14.

   - make Rafael and Danilo full co-maintainers of the driver core
     codebase

   - other minor fixes and updates"

* tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (52 commits)
  rust: platform: require Send for Driver trait implementers
  rust: pci: require Send for Driver trait implementers
  rust: platform: impl Send + Sync for platform::Device
  rust: pci: impl Send + Sync for pci::Device
  rust: platform: fix unrestricted &mut platform::Device
  rust: pci: fix unrestricted &mut pci::Device
  rust: device: implement device context marker
  rust: pci: use to_result() in enable_device_mem()
  MAINTAINERS: driver core: mark Rafael and Danilo as co-maintainers
  rust/kernel/faux: mark Registration methods inline
  driver core: faux: only create the device if probe() succeeds
  rust/faux: Add missing parent argument to Registration::new()
  rust/faux: Drop #[repr(transparent)] from faux::Registration
  rust: io: fix devres test with new io accessor functions
  rust: io: rename `io::Io` accessors
  kernfs: Move dput() outside of the RCU section.
  efi: rci2: mark bin_attribute as __ro_after_init
  rapidio: constify 'struct bin_attribute'
  firmware: qemu_fw_cfg: constify 'struct bin_attribute'
  powerpc/perf/hv-24x7: Constify 'struct bin_attribute'
  ...
2025-04-01 11:02:03 -07:00
Linus Torvalds
0c86b42439 drm for 6.15-rc1
uapi:
 - add mediatek tiled fourcc
 - add support for notifying userspace on device wedged
 
 new driver:
 - appletbdrm: support for Apple Touchbar displays on m1/m2
 - nova-core: skeleton rust driver to develop nova inside off
 
 firmware:
 - add some rust firmware pieces
 
 rust:
 - add 'LocalModule' type alias
 
 component:
 - add helper to query bound status
 
 fbdev:
 - fbtft: remove access to page->index
 
 media:
 - cec: tda998x: import driver from drm
 
 dma-buf:
 - add fast path for single fence merging
 
 tests:
 - fix lockdep warnings
 
 atomic:
 - allow full modeset on connector changes
 - clarify semantics of allow_modeset and drm_atomic_helper_check
 - async-flip: support on arbitary planes
 - writeback: fix UAF
 - Document atomic-state history
 
 format-helper:
 - support ARGB8888 to ARGB4444 conversions
 
 buddy:
 - fix multi-root cleanup
 
 ci:
 - update IGT
 
 dp:
 - support extended wake timeout
 - mst: fix RAD to string conversion
 - increase DPCD eDP control CAP size to 5 bytes
 - add DPCD eDP v1.5 definition
 - add helpers for LTTPR transparent mode
 
 panic:
 - encode QR code according to Fido 2.2
 
 scheduler:
 - add parameter struct for init
 - improve job peek/pop operations
 - optimise drm_sched_job struct layout
 
 ttm:
 - refactor pool allocation
 - add helpers for TTM shrinker
 
 panel-orientation:
 - add a bunch of new quirks
 
 panel:
 - convert panels to multi-style functions
 - edp: Add support for B140UAN04.4, BOE NV140FHM-NZ, CSW MNB601LS1-3,
   LG LP079QX1-SP0V, MNE007QS3-7, STA 116QHD024002, Starry 116KHD024006,
   Lenovo T14s Gen6 Snapdragon
 - himax-hx83102: Add support for CSOT PNA957QT1-1, Kingdisplay
   kd110n11-51ie, Starry 2082109qfh040022-50e
 - visionox-r66451: use multi-style MIPI-DSI functions
 - raydium-rm67200: Add driver for Raydium RM67200
 - simple: Add support for BOE AV123Z7M-N17, BOE AV123Z7M-N17
 - sony-td4353-jdi: Use MIPI-DSI multi-func interface
 - summit: Add driver for Apple Summit display panel
 - visionox-rm692e5: Add driver for Visionox RM692E5
 
 bridge:
 - pass full atomic state to various callbacks
 - adv7511: Report correct capabilities
 - it6505: Fix HDCP V compare
 - snd65dsi86: fix device IDs
 - nwl-dsi: set bridge type
 - ti-sn65si83: add error recovery and set bridge type
 - synopsys: add HDMI audio support
 
 xe:
 - support device-wedged event
 - add mmap support for PCI memory barrier
 - perf pmu integration and expose per-engien activity
 - add EU stall sampling support
 - GPU SVM and Xe SVM implementation
 - use TTM shrinker
 - add survivability mode to allow the driver to do
   firmware updates in critical failure states
 - PXP HWDRM support for MTL and LNL
 - expose package/vram temps over hwmon
 - enable DP tunneling
 - drop mmio_ext abstraction
 - Reject BO evcition if BO is bound to current VM
 - Xe suballocator improvements
 - re-use display vmas when possible
 - add GuC Buffer Cache abstraction
 - PCI ID update for Panther Lake and Battlemage
 - Enable SRIOV for Panther Lake
 - Refactor VRAM manager location
 
 i915:
 - enable extends wake timeout
 - support device-wedged event
 - Enable DP 128b/132b SST DSC
 - FBC dirty rectangle support for display version 30+
 - convert i915/xe to drm client setup
 - Compute HDMI PLLS for rates not in fixed tables
 - Allow DSB usage when PSR is enabled on LNL+
 - Enable panel replay without full modeset
 - Enable async flips with compressed buffers on ICL+
 - support luminance based brightness via DPCD for eDP
 - enable VRR enable/disable without full modeset
 - allow GuC SLPC default strategies on MTL+ for performance
 - lots of display refactoring in move to struct intel_display
 
 amdgpu:
 - add device wedged event
 - support async page flips on overlay planes
 - enable broadcast RGB drm property
 - add info ioctl for virt mode
 - OEM i2c support for RGB lights
 - GC 11.5.2 + 11.5.3 support
 - SDMA 6.1.3 support
 - NBIO 7.9.1 + 7.11.2 support
 - MMHUB 1.8.1 + 3.3.2 support
 - DCN 3.6.0 support
 - Add dynamic workload profile switching for GC 10-12
 - support larger VBIOS sizes
 - Mark gttsize parameters as deprecated
 - Initial JPEG queue resset support
 
 amdkfd:
 - add KFD per process flags for setting precision
 - sync pasid values between KGD and KFD
 - improve GTT/VRAM handling for APUs
 - fix user queue validation on GC7/8
 - SDMA queue reset support
 
 raedeon:
 - rs400 hyperz fix
 
 i2c:
 - td998x: drop platform_data, split driver into media and bridge
 
 ast:
 - transmitter chip detection refactoring
 - vbios display mode refactoring
 - astdp: fix connection status and filter unsupported modes
 - cursor handling refactoring
 
 imagination:
 - check job dependencies with sched helper
 
 ivpu:
 - improve command queue handling
 - use workqueue for IRQ handling
 - add support HW fault injection
 - locking fixes
 
 mgag200:
 - add support for G200eH5
 
 msm:
 - dpu: add concurrent writeback support for DPU 10.x+
 - use LTTPR helpers
 - GPU:
   - Fix obscure GMU suspend failure
   - Expose syncobj timeline support
   - Extend GPU devcoredump with pagetable info
   - a623 support
   - Fix a6xx gen1/gen2 indexed-register blocks in gpu snapshot / devcoredump
 - Display:
   - Add cpu-cfg interconnect paths on SM8560 and SM8650
   - Introduce KMS OMMU fault handler, causing devcoredump snapshot
   - Fixed error pointer dereference in msm_kms_init_aspace()
 - DPU:
   - Fix mode_changing handling
   - Add writeback support on SM6150 (QCS615)
   - Fix DSC programming in 1:1:1 topology
   - Reworked hardware resource allocation, moving it to the CRTC code
   - Enabled support for Concurrent WriteBack (CWB) on SM8650
   - Enabled CDM blocks on all relevant platforms
   - Reworked debugfs interface for BW/clocks debugging
   - Clear perf params before calculating bw
   - Support YUV formats on writeback
   - Fixed double inclusion
   - Fixed writeback in YUV formats when using cloned output, Dropped
     wb2_formats_rgb
   - Corrected dpu_crtc_check_mode_changed and struct dpu_encoder_virt
     kerneldocs
   - Fixed uninitialized variable in dpu_crtc_kickoff_clone_mode()
 - DSI:
   - DSC-related fixes
   - Rework clock programming
 - DSI PHY:
   - Fix 7nm (and lower) PHY programming
   - Add proper DT schema definitions for DSI PHY clocks
 - HDMI:
   - Rework the driver, enabling the use of the HDMI Connector framework
 - Bindings:
   - Added eDP PHY on SA8775P
 
 nouveau:
 - move drm_slave_encoder interface into driver
 - nvkm: refactor GSP RPC
 - use LTTPR helpers
 
 mediatek:
 - HDMI fixup and refinement
 - add MT8188 dsc compatible
 - MT8365 SoC support
 
 panthor:
 - Expose sizes of intenral BOs via fdinfo
 - Fix race between reset and suspend
 - Improve locking
 
 qaic:
 - Add support for AIC200
 
 renesas:
 - Fix limits in DT bindings
 
 rockchip:
 - support rk3562-mali
 - rk3576: Add HDMI support
 - vop2: Add new display modes on RK3588 HDMI0 up to 4K
 - Don't change HDMI reference clock rate
 - Fix DT bindings
 - analogix_dp: add eDP support
 - fix shutodnw
 
 solomon:
 - Set SPI device table to silence warnings
 - Fix pixel and scanline encoding
 
 v3d:
 - handle clock
 
 vc4:
 - Use drm_exec
 - Use dma-resv for wait-BO ioctl
 - Remove seqno infrastructure
 
 virtgpu:
 - Support partial mappings of GEM objects
 - Reserve VGA resources during initialization
 - Fix UAF in virtgpu_dma_buf_free_obj()
 - Add panic support
 
 vkms:
 - Switch to a managed modesetting pipeline
 - Add support for ARGB8888
 - fix UAf
 
 xlnx:
 - Set correct DMA segment size
 - use mutex guards
 - Fix error handling
 - Fix docs
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmfmA2cACgkQDHTzWXnE
 hr7lKQ/+I7gmln3ka8FyKnpwG5KusDpxz3OzgUKHpzOTkXL1+vPMt0vjCKRJKE3D
 zfpUTWNbwlVN0krqmUGyFIeCt8wmevBF6HvQ+GsWbiWltUj3xIlnkV0TVH84XTUo
 0evrXNG9K8sLjMKjrf7yrGL53Ayoaq9IO9wtOws+FCgtykAsMR/IWLrYLpj21ZQ6
 Oclhq5Cz21WRoQpzySR23s3DLi4LHri26RGKbGNh2PzxYwyP/euGW6O+ncEduNmg
 vQLgUfptaM/EubJFG6jxDWZJ2ChIAUUxQuhZwt7DKqRsYIcJKcfDSUzqL95t6SYU
 zewlYmeslvusoesCeTJzHaUj34yqJGsvjFPsHFUEvAy8BVncsqS40D6mhrRMo5nD
 chnSJu+IpDOEEqcdbb1J73zKLw6X3GROM8qUQEThJBD2yTsOdw9d0HXekMDMBXSi
 NayKvXfsBV19rI74FYnRCzHt8BVVANh5qJNnR5RcnPZ2KzHQbV0JFOA9YhxE8vFU
 GWkFWKlpAQUQ+yoTy1kuO9dcUxLIC7QseMeS5BYhcJBMEV78xQuRYRxgsL8YS4Yg
 rIhcb3mZwMFj7jBAqfpLKWiI+GTup+P9vcz7Bvm5iIf8gEhZLUTwqBeAYXQkcWd4
 i1AqDuFR0//sAgODoeU2sg1Q3yTL/i/DhcwvCIPgcDZ/x4Eg868=
 =oK/N
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2025-03-28' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "Outside of drm there are some rust patches from Danilo who maintains
  that area in here, and some pieces for drm header check tests.

  The major things in here are a new driver supporting the touchbar
  displays on M1/M2, the nova-core stub driver which is just the vehicle
  for adding rust abstractions and start developing a real driver inside
  of.

  xe adds support for SVM with a non-driver specific SVM core
  abstraction that will hopefully be useful for other drivers, along
  with support for shrinking for TTM devices. I'm sure xe and AMD
  support new devices, but the pipeline depth on these things is hard to
  know what they end up being in the marketplace!

  uapi:
   - add mediatek tiled fourcc
   - add support for notifying userspace on device wedged

  new driver:
   - appletbdrm: support for Apple Touchbar displays on m1/m2
   - nova-core: skeleton rust driver to develop nova inside off

  firmware:
   - add some rust firmware pieces

  rust:
   - add 'LocalModule' type alias

  component:
   - add helper to query bound status

  fbdev:
   - fbtft: remove access to page->index

  media:
   - cec: tda998x: import driver from drm

  dma-buf:
   - add fast path for single fence merging

  tests:
   - fix lockdep warnings

  atomic:
   - allow full modeset on connector changes
   - clarify semantics of allow_modeset and drm_atomic_helper_check
   - async-flip: support on arbitary planes
   - writeback: fix UAF
   - Document atomic-state history

  format-helper:
   - support ARGB8888 to ARGB4444 conversions

  buddy:
   - fix multi-root cleanup

  ci:
   - update IGT

  dp:
   - support extended wake timeout
   - mst: fix RAD to string conversion
   - increase DPCD eDP control CAP size to 5 bytes
   - add DPCD eDP v1.5 definition
   - add helpers for LTTPR transparent mode

  panic:
   - encode QR code according to Fido 2.2

  scheduler:
   - add parameter struct for init
   - improve job peek/pop operations
   - optimise drm_sched_job struct layout

  ttm:
   - refactor pool allocation
   - add helpers for TTM shrinker

  panel-orientation:
   - add a bunch of new quirks

  panel:
   - convert panels to multi-style functions
   - edp: Add support for B140UAN04.4, BOE NV140FHM-NZ, CSW MNB601LS1-3,
     LG LP079QX1-SP0V, MNE007QS3-7, STA 116QHD024002, Starry
     116KHD024006, Lenovo T14s Gen6 Snapdragon
   - himax-hx83102: Add support for CSOT PNA957QT1-1, Kingdisplay
     kd110n11-51ie, Starry 2082109qfh040022-50e
   - visionox-r66451: use multi-style MIPI-DSI functions
   - raydium-rm67200: Add driver for Raydium RM67200
   - simple: Add support for BOE AV123Z7M-N17, BOE AV123Z7M-N17
   - sony-td4353-jdi: Use MIPI-DSI multi-func interface
   - summit: Add driver for Apple Summit display panel
   - visionox-rm692e5: Add driver for Visionox RM692E5

  bridge:
   - pass full atomic state to various callbacks
   - adv7511: Report correct capabilities
   - it6505: Fix HDCP V compare
   - snd65dsi86: fix device IDs
   - nwl-dsi: set bridge type
   - ti-sn65si83: add error recovery and set bridge type
   - synopsys: add HDMI audio support

  xe:
   - support device-wedged event
   - add mmap support for PCI memory barrier
   - perf pmu integration and expose per-engien activity
   - add EU stall sampling support
   - GPU SVM and Xe SVM implementation
   - use TTM shrinker
   - add survivability mode to allow the driver to do firmware updates
     in critical failure states
   - PXP HWDRM support for MTL and LNL
   - expose package/vram temps over hwmon
   - enable DP tunneling
   - drop mmio_ext abstraction
   - Reject BO evcition if BO is bound to current VM
   - Xe suballocator improvements
   - re-use display vmas when possible
   - add GuC Buffer Cache abstraction
   - PCI ID update for Panther Lake and Battlemage
   - Enable SRIOV for Panther Lake
   - Refactor VRAM manager location

  i915:
   - enable extends wake timeout
   - support device-wedged event
   - Enable DP 128b/132b SST DSC
   - FBC dirty rectangle support for display version 30+
   - convert i915/xe to drm client setup
   - Compute HDMI PLLS for rates not in fixed tables
   - Allow DSB usage when PSR is enabled on LNL+
   - Enable panel replay without full modeset
   - Enable async flips with compressed buffers on ICL+
   - support luminance based brightness via DPCD for eDP
   - enable VRR enable/disable without full modeset
   - allow GuC SLPC default strategies on MTL+ for performance
   - lots of display refactoring in move to struct intel_display

  amdgpu:
   - add device wedged event
   - support async page flips on overlay planes
   - enable broadcast RGB drm property
   - add info ioctl for virt mode
   - OEM i2c support for RGB lights
   - GC 11.5.2 + 11.5.3 support
   - SDMA 6.1.3 support
   - NBIO 7.9.1 + 7.11.2 support
   - MMHUB 1.8.1 + 3.3.2 support
   - DCN 3.6.0 support
   - Add dynamic workload profile switching for GC 10-12
   - support larger VBIOS sizes
   - Mark gttsize parameters as deprecated
   - Initial JPEG queue resset support

  amdkfd:
   - add KFD per process flags for setting precision
   - sync pasid values between KGD and KFD
   - improve GTT/VRAM handling for APUs
   - fix user queue validation on GC7/8
   - SDMA queue reset support

  raedeon:
   - rs400 hyperz fix

  i2c:
   - td998x: drop platform_data, split driver into media and bridge

  ast:
   - transmitter chip detection refactoring
   - vbios display mode refactoring
   - astdp: fix connection status and filter unsupported modes
   - cursor handling refactoring

  imagination:
   - check job dependencies with sched helper

  ivpu:
   - improve command queue handling
   - use workqueue for IRQ handling
   - add support HW fault injection
   - locking fixes

  mgag200:
   - add support for G200eH5

  msm:
   - dpu: add concurrent writeback support for DPU 10.x+
   - use LTTPR helpers
   - GPU:
     - Fix obscure GMU suspend failure
     - Expose syncobj timeline support
     - Extend GPU devcoredump with pagetable info
     - a623 support
     - Fix a6xx gen1/gen2 indexed-register blocks in gpu snapshot /
       devcoredump
   - Display:
     - Add cpu-cfg interconnect paths on SM8560 and SM8650
     - Introduce KMS OMMU fault handler, causing devcoredump snapshot
     - Fixed error pointer dereference in msm_kms_init_aspace()
   - DPU:
     - Fix mode_changing handling
     - Add writeback support on SM6150 (QCS615)
     - Fix DSC programming in 1:1:1 topology
     - Reworked hardware resource allocation, moving it to the CRTC code
     - Enabled support for Concurrent WriteBack (CWB) on SM8650
     - Enabled CDM blocks on all relevant platforms
     - Reworked debugfs interface for BW/clocks debugging
     - Clear perf params before calculating bw
     - Support YUV formats on writeback
     - Fixed double inclusion
     - Fixed writeback in YUV formats when using cloned output, Dropped
       wb2_formats_rgb
     - Corrected dpu_crtc_check_mode_changed and struct dpu_encoder_virt
       kerneldocs
     - Fixed uninitialized variable in dpu_crtc_kickoff_clone_mode()
   - DSI:
     - DSC-related fixes
     - Rework clock programming
   - DSI PHY:
     - Fix 7nm (and lower) PHY programming
     - Add proper DT schema definitions for DSI PHY clocks
   - HDMI:
     - Rework the driver, enabling the use of the HDMI Connector
       framework
   - Bindings:
     - Added eDP PHY on SA8775P

  nouveau:
   - move drm_slave_encoder interface into driver
   - nvkm: refactor GSP RPC
   - use LTTPR helpers

  mediatek:
   - HDMI fixup and refinement
   - add MT8188 dsc compatible
   - MT8365 SoC support

  panthor:
   - Expose sizes of intenral BOs via fdinfo
   - Fix race between reset and suspend
   - Improve locking

  qaic:
   - Add support for AIC200

  renesas:
   - Fix limits in DT bindings

  rockchip:
   - support rk3562-mali
   - rk3576: Add HDMI support
   - vop2: Add new display modes on RK3588 HDMI0 up to 4K
   - Don't change HDMI reference clock rate
   - Fix DT bindings
   - analogix_dp: add eDP support
   - fix shutodnw

  solomon:
   - Set SPI device table to silence warnings
   - Fix pixel and scanline encoding

  v3d:
   - handle clock

  vc4:
   - Use drm_exec
   - Use dma-resv for wait-BO ioctl
   - Remove seqno infrastructure

  virtgpu:
   - Support partial mappings of GEM objects
   - Reserve VGA resources during initialization
   - Fix UAF in virtgpu_dma_buf_free_obj()
   - Add panic support

  vkms:
   - Switch to a managed modesetting pipeline
   - Add support for ARGB8888
   - fix UAf

  xlnx:
   - Set correct DMA segment size
   - use mutex guards
   - Fix error handling
   - Fix docs"

* tag 'drm-next-2025-03-28' of https://gitlab.freedesktop.org/drm/kernel: (1762 commits)
  drm/amd/pm: Update feature list for smu_v13_0_6
  drm/amdgpu: Add parameter documentation for amdgpu_sync_fence
  drm/amdgpu/discovery: optionally use fw based ip discovery
  drm/amdgpu/discovery: use specific ip_discovery.bin for legacy asics
  drm/amdgpu/discovery: check ip_discovery fw file available
  drm/amd/pm: Remove unnecessay UQ10 to UINT conversion
  drm/amd/pm: Remove unnecessay UQ10 to UINT conversion
  drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA
  drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush
  drm/amd/amdgpu: Increase max rings to enable SDMA page ring
  drm/amdgpu: Decode deferred error type in gfx aca bank parser
  drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5 GPUs
  drm/amdgpu/mes: clean up SDMA HQD loop
  drm/amdgpu/mes: enable compute pipes across all MEC
  drm/amdgpu/mes: drop MES 10.x leftovers
  drm/amdgpu/mes: optimize compute loop handling
  drm/amdgpu/sdma: guilty tracking is per instance
  drm/amdgpu/sdma: fix engine reset handling
  drm/amdgpu: remove invalid usage of sched.ready
  drm/amdgpu: add cleaner shader trace point
  ...
2025-03-28 17:44:52 -07:00
Alex Deucher
dce8bd9137 drm/amdgpu/gfx12: fix num_mec
GC12 only has 1 mec.

Fixes: 52cb80c12e ("drm/amdgpu: Add gfx v12_0 ip block support (v6)")
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:47:18 -04:00
Alex Deucher
4161050d47 drm/amdgpu/gfx11: fix num_mec
GC11 only has 1 mec.

Fixes: 3d879e81f0 ("drm/amdgpu: add init support for GFX11 (v2)")
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:47:14 -04:00
Asad Kamal
eddff9a58f drm/amd/pm: Add gpu_metrics_v1_8
Add new gpu_metrics_v1_8 to acquire below host limit counters

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:47:09 -04:00
Lijo Lazar
27145f78f5 drm/amdgpu: Prefer shadow rom when available
Fetch VBIOS from shadow ROM when available before trying other methods
like EFI method.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: 9c081c11c6 ("drm/amdgpu: Reorder to read EFI exported ROM first")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4066
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:46:33 -04:00
Asad Kamal
942de4ea69 drm/amd/pm: Update smu metrics table for smu_v13_0_6
Update smu metrics table to vesrion 0x10 for smu_v13_0_6

v2: Host metrics support removal moved to separate patch (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:46:29 -04:00
Asad Kamal
1eef87883c drm/amd/pm: Remove host limit metrics support
Firmware algorithm changed and the values in this version
are not accurate thereby remove host limit metric support
for smu_v13_0_6, smu_v13_0_12 & smu_v13_0_14

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:46:24 -04:00
Candice Li
5b3c08ae9e Remove unnecessary firmware version check for gc v9_4_2
GC v9_4_2 uses a new versioning scheme for CP firmware, making
the warning ("CP firmware version too old, please update!") irrelevant.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:45:53 -04:00
Christian König
1f86f4125e drm/amdgpu: stop unmapping MQD for kernel queues v3
This looks unnecessary and actually extremely harmful since using kmap()
is not possible while inside the ring reset.

Remove all the extra mapping and unmapping of the MQDs.

v2: also fix debugfs
v3: fix coding style typo

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:45:42 -04:00
Jesse.zhang@amd.com
510a16d995 Revert "drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA"
this temporarily reverts
commit 6ec04e38b2 ("drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA")
it cause a regression.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:44:47 -04:00
Xiang Liu
aedc92be96 drm/amdgpu: Parse all deferred errors with UMC aca handle
We should only increase the deferred errors in UMC block.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:44:41 -04:00
Stanley.Yang
cc11dffc14 drm/amdgpu: Update ta ras block
Update ta ra block to keep sync with RAS TA.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:44:34 -04:00
Lijo Lazar
ee97326fb9 drm/amdgpu: Add NPS2 to DPX compatible mode
Compute partition DPX is possible in NPS2 mode. Update the compatible
modes for DPX.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:44:32 -04:00
Xiang Liu
f3f05a0ec5 drm/amdgpu: Use correct gfx deferred error count
In the case of parsing GFX deferred error from SMU corrected error
channel, the error count should be set to 1 instead of parsing from
MISC0 register, which is 0.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:44:27 -04:00
Leo Li
704bc361e3 drm/amd/display: Actually do immediate vblank disable
[Why]

The `vblank_config.offdelay` field follows the same semantics as the
`drm_vblank_offdelay` parameter. Setting it to 0 will never disable
vblank.

[How]

Set `offdelay` to a positive number.

Fixes: e45b6716de ("drm/amd/display: use a more lax vblank enable policy for DCN35+")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:43:47 -04:00
Brendan Tam
8058061ed9 drm/amd/display: prevent hang on link training fail
[Why]
When link training fails, the phy clock will be disabled. However, in
enable_streams, it is assumed that link training succeeded and the
mux selects the phy clock, causing a hang when a register write is made.

[How]
When enable_stream is hit, check if link training failed. If it did, fall
back to the ref clock to avoid a hang and keep the system in a recoverable
state.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Brendan Tam <Brendan.Tam@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:43:25 -04:00
Charlene Liu
0389f2a3a2 Revert "drm/amd/display: dml2 soc dscclk use DPM table clk setting"
[why]
this dscclk use DCN defined per DPM level will cause a DCFCLK increase.
needs to follow up.

This reverts commit 15b959534a

Reviewed-by: Yihan Zhu <yihan.zhu@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:42:48 -04:00
Leo Li
f21e6d149b drm/amd/display: Increase vblank offdelay for PSR panels
[Why]

Depending on when the HW latching event (vupdate) of double-buffered
registers happen relative to the PSR SDP (signals panel psr enter/exit)
deadline, and how bad the Panel clock has drifted since the last ALPM
off event, there can be up to 3 frames of delay between sending the PSR
exit cmd to DMUB fw, and when the panel starts displaying live frames.
This can manifest as micro-stuttering when userspace commit patterns
cause rapid toggling of the DRM vblank counter, since PSR enter/exit is
hooked up to DRM vblank disable/enable respectively.

In the ideal world, the panel should present the live frame immediately
on PSR exit cmd. But due to HW design and PSR limitations, immediate
exit can only happen by chance, when:

1. PSR exit cmd is ack'd by FW before HW latching (vupdate) event, and
2. Panel's SDP deadline -- determined by it's PSR Start Delay in DPCD
  71h -- is after the vupdate event. The PSR exit SDP can then be sent
  immediately after HW latches. Otherwise, we have to wait 1 frame. And
3. There is negligible drift between the panel's clock and source clock.
  Otherwise, there can be up to 1 frame of drift.

Note that this delay is not expected with Panel Replay.

[How]

Since PSR power savings can be quite substantial, and there are a lot of
systems in the wild with PSR panels, It'll be nice to have a middle
ground that balances user experience with power savings.

A simple way to achieve this is by extending the vblank offdelay, such
that additional PSR exit delays will be less perceivable.

We can set:

   20/100 * offdelay_ms = 3_frames_ms
=> offdelay_ms = 5 * 3_frames_ms

This ensures that `3_frames_ms` will only be experienced as a 20% delay
on top how long the panel has been static, and thus make the delay
less perceivable.

If this ends up being too high of a percentage, it can be dropped
further in a future change.

Fixes: 537ef0f888 ("drm/amd/display: use new vblank enable policy for DCN35+")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:42:09 -04:00
Mario Limonciello
5f054ddead drm/amd: Handle being compiled without SI or CIK support better
If compiled without SI or CIK support but amdgpu tries to load it
will run into failures with uninitialized callbacks.

Show a nicer message in this case and fail probe instead.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4050
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:41:55 -04:00
Tomasz Pakuła
b03f1810db drm/amd/pm: Add zero RPM enabled OD setting support for SMU14.0.2
Hook up zero RPM enable for 9070 and 9070 XT based on RDNA3
(smu 13.0.0 and 13.0.7) code.

Tested on 9070 XT Hellhound

Signed-off-by: Tomasz Pakuła <tomasz.pakula.oficjalny@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26 17:41:49 -04:00
Denis Arefev
7c246a05df drm/amd/pm: Prevent division by zero
The user can set any speed value.
If speed is greater than UINT_MAX/8, division by zero is possible.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c52dcf4919 ("drm/amd/pp: Avoid divide-by-zero in fan_ctrl_set_fan_speed_rpm")
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:41:43 -04:00
Denis Arefev
4b8c3c0d17 drm/amd/pm: Prevent division by zero
The user can set any speed value.
If speed is greater than UINT_MAX/8, division by zero is possible.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c52dcf4919 ("drm/amd/pp: Avoid divide-by-zero in fan_ctrl_set_fan_speed_rpm")
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:41:37 -04:00
Denis Arefev
4e3d9508c0 drm/amd/pm: Prevent division by zero
The user can set any speed value.
If speed is greater than UINT_MAX/8, division by zero is possible.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 031db09017 ("drm/amd/powerplay/vega20: enable fan RPM and pwm settings V2")
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:41:31 -04:00
Denis Arefev
7d641c2b83 drm/amd/pm: Prevent division by zero
The user can set any speed value.
If speed is greater than UINT_MAX/8, division by zero is possible.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: b64625a303 ("drm/amd/pm: correct the address of Arcturus fan related registers")
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:41:23 -04:00
Denis Arefev
f23e9116eb drm/amd/pm: Prevent division by zero
The user can set any speed value.
If speed is greater than UINT_MAX/8, division by zero is possible.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c05d1c4015 ("drm/amd/swsmu: add aldebaran smu13 ip support (v3)")
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-03-26 17:41:03 -04:00
Linus Torvalds
a50b4fe095 A treewide hrtimer timer cleanup
hrtimers are initialized with hrtimer_init() and a subsequent store to
   the callback pointer. This turned out to be suboptimal for the upcoming
   Rust integration and is obviously a silly implementation to begin with.
 
   This cleanup replaces the hrtimer_init(T); T->function = cb; sequence
   with hrtimer_setup(T, cb);
 
   The conversion was done with Coccinelle and a few manual fixups.
 
   Once the conversion has completely landed in mainline, hrtimer_init()
   will be removed and the hrtimer::function becomes a private member.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmff5jQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVvRD/wKtuwmiA66NJFgXC0qVq82A6fO3bY8
 GBdbfysDJIbqGu5PTcULTbJ8qkqv3jeLUv6CcXvS4sZ7y/uJQl2lzf8yrD/0bbwc
 rLI6sHiPSZmK93kNVN4X5H7kvt7cE/DYC9nnEOgK3BY5FgKc4n9887d4aVBhL8Lv
 ODwVXvZ+xi351YCj7qRyPU24zt/p4tkkT1o2k4a0HBluqLI0D+V20fke9IERUL8r
 d1uWKlcn0TqYDesE8HXKIhbst3gx52rMJrXBJDHwFmG6v8Pj1fkTXCVpPo8QcBz8
 OTVkpomN9f/Tx4+GZwhZOF86LhLL3OhxD6pT7JhFCXdmSGv+Ez8uyk1YZysM/XpV
 Juy/1yAcBpDIDkmhMFGdAAn48Nn9Fotty0r4je60zSEp1d/4QMXcFme29qr2JTUE
 iWnQ/HD6DxUjVHqy7CYvvo26Xegg1C7qgyOVt4PYZwAM1VKF5P3kzYTb4SAdxtop
 Tpji1sfW9QV08jqMNo6XntD32DSP9S2HqjO9LwBw700jnx2jjJ35fcJs6iodMOUn
 gckIZLMn3L0OoglPdyA5O7SNTbKE7aFiRKdnT/cJtR3Fa39Qu27CwC5gfiyuie9I
 Q+LG8GLuYSBHXAR+PBK4GWlzJ7Dn8k3eqmbnLeKpRMsU6ZzcttgA64xhaviN2wN0
 iJbvLJeisXr3GA==
 =bYAX
 -----END PGP SIGNATURE-----

Merge tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer cleanups from Thomas Gleixner:
 "A treewide hrtimer timer cleanup

  hrtimers are initialized with hrtimer_init() and a subsequent store to
  the callback pointer. This turned out to be suboptimal for the
  upcoming Rust integration and is obviously a silly implementation to
  begin with.

  This cleanup replaces the hrtimer_init(T); T->function = cb; sequence
  with hrtimer_setup(T, cb);

  The conversion was done with Coccinelle and a few manual fixups.

  Once the conversion has completely landed in mainline, hrtimer_init()
  will be removed and the hrtimer::function becomes a private member"

* tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits)
  wifi: rt2x00: Switch to use hrtimer_update_function()
  io_uring: Use helper function hrtimer_update_function()
  serial: xilinx_uartps: Use helper function hrtimer_update_function()
  ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup()
  RDMA: Switch to use hrtimer_setup()
  virtio: mem: Switch to use hrtimer_setup()
  drm/vmwgfx: Switch to use hrtimer_setup()
  drm/xe/oa: Switch to use hrtimer_setup()
  drm/vkms: Switch to use hrtimer_setup()
  drm/msm: Switch to use hrtimer_setup()
  drm/i915/request: Switch to use hrtimer_setup()
  drm/i915/uncore: Switch to use hrtimer_setup()
  drm/i915/pmu: Switch to use hrtimer_setup()
  drm/i915/perf: Switch to use hrtimer_setup()
  drm/i915/gvt: Switch to use hrtimer_setup()
  drm/i915/huc: Switch to use hrtimer_setup()
  drm/amdgpu: Switch to use hrtimer_setup()
  stm class: heartbeat: Switch to use hrtimer_setup()
  i2c: Switch to use hrtimer_setup()
  iio: Switch to use hrtimer_setup()
  ...
2025-03-25 10:54:15 -07:00
Dmitry Baryshkov
fcbb93f1e4 drm/display: dp: change drm_dp_dpcd_read_link_status() return value
drm_dp_dpcd_read_link_status() follows the "return error code or number
of bytes read" protocol, with the code returning less bytes than
requested in case of some errors. However most of the drivers
interpreted that as "return error code in case of any error". Switch
drm_dp_dpcd_read_link_status() to drm_dp_dpcd_read_data() and make it
follow that protocol too.

Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20250324-drm-rework-dpcd-access-v4-2-e80ff89593df@oss.qualcomm.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-03-25 16:20:38 +02:00
Asad Kamal
7547510d4a drm/amd/pm: Update feature list for smu_v13_0_6
Update feature list for smu_v13_0_6 to show vcn & smu deep
sleep feature enable status

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:36 -04:00
Srinivasan Shanmugam
af23d3c9ca drm/amdgpu: Add parameter documentation for amdgpu_sync_fence
The 'flags' parameter, which specifies memory allocation behavior while
creating a sync entry,

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c:162: warning: Function parameter or struct member 'flags' not described in 'amdgpu_sync_fence'

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:36 -04:00
Alex Deucher
80a0e82829 drm/amdgpu/discovery: optionally use fw based ip discovery
On chips without native IP discovery support, use the fw binary
if available, otherwise we can continue without it.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:36 -04:00
Flora Cui
25f602fbbc drm/amdgpu/discovery: use specific ip_discovery.bin for legacy asics
vega10/vega12/vega20/raven/raven2/picasso/arcturus/aldebaran

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Flora Cui
017fbb6690 drm/amdgpu/discovery: check ip_discovery fw file available
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Asad Kamal
62c1ed0a64 drm/amd/pm: Remove unnecessay UQ10 to UINT conversion
Few of the metrics data for smu_v13_0_12 has not been reported
in Q10 format, remove UQ10 to UINT conversion for those

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Asad Kamal
0156d2bcd5 drm/amd/pm: Remove unnecessay UQ10 to UINT conversion
Few of the metrics data for smu_v13_0_6 has not been reported
in Q10 format, remove UQ10 to UINT conversion for those

v2: Move smu_v13_0_12 changes to separate patch(Kevin)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Jesse.zhang@amd.com
6ec04e38b2 drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA
This commit updates the VM flush implementation for the SDMA engine.

- Added a new function `sdma_v4_4_2_get_invalidate_req` to construct the VM_INVALIDATE_ENG0_REQ
  register value for the specified VMID and flush type. This function ensures that all relevant
  page table cache levels (L1 PTEs, L2 PTEs, and L2 PDEs) are invalidated.

- Modified the `sdma_v4_4_2_ring_emit_vm_flush` function to use the new `sdma_v4_4_2_get_invalidate_req`
  function. The updated function emits the necessary register writes and waits to perform a VM flush
  for the specified VMID. It updates the PTB address registers and issues a VM invalidation request
  using the specified VM invalidation engine.

- Included the necessary header file `gc/gc_9_0_sh_mask.h` to provide access to the required register
  definitions.

v2: vm flush by the vm inalidation packet (Lijo)
v3: code stle and define thh macro for the vm invalidation packet (Christian)
v4: Format definition sdma vm invalidate packet (Lijo)

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Jesse.zhang@amd.com
b09cdeb4d3 drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
  SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead of
  allocating a separate engine. This change ensures efficient resource management and
  avoids the issue of insufficient VM invalidation engines.

- Add synchronization for GPU TLB flush operations in gmc_v9_0.c.
  Use spin_lock and spin_unlock to ensure thread safety and prevent race conditions
  during TLB flush operations. This improves the stability and reliability of the driver,
  especially in multi-threaded environments.

 v2: replace the sdma ring check with a function `amdgpu_sdma_is_page_queue`
 to check if a ring is an SDMA page queue.(Lijo)

 v3: Add GC version check, only enabled on GC9.4.3/9.4.4/9.5.0
 v4: Fix code style and add more detailed description (Christian)
 v5: Remove dependency on vm_inv_eng loop order, explicitly lookup shared inv_eng(Christian/Lijo)
 v6: Added search shared ring function amdgpu_sdma_get_shared_ring (Lijo)

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Jesse.zhang@amd.com
ea6dd40caf drm/amd/amdgpu: Increase max rings to enable SDMA page ring
Increase the maximum number of rings supported by the AMDGPU driver from 133 to 149.
This change is necessary to enable support for the SDMA page ring.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Xiang Liu
338f7412c7 drm/amdgpu: Decode deferred error type in gfx aca bank parser
In the case of injecting uncorrected error with background workload,
the deferred error among uncorrected errors need to be specified
by checking the deferred and poison bits of status register.

v2: refine checking for deferred error
v2: log possiable DEs among CEs
v2: generate CPER records for DEs among UEs

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Srinivasan Shanmugam
2ec0a7c337 drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5 GPUs
Enable the cleaner shader for GFX11.5.0/11.5.1 GPUs to provide data
isolation between GPU workloads. The cleaner shader is responsible for
clearing the Local Data Store (LDS), Vector General Purpose Registers
(VGPRs), and Scalar General Purpose Registers (SGPRs), which helps
prevent data leakage and ensures accurate computation results.

This update extends cleaner shader support to GFX11.5.0/11.5.1 GPUs,
previously available for GFX11.0.3. It enhances security by clearing GPU
memory between processes and maintains a consistent GPU state across KGD
and KFD workloads.

Cc: Mario Sopena-Novales <mario.novales@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Alex Deucher
5e93d0e335 drm/amdgpu/mes: clean up SDMA HQD loop
Follow the same logic as the other IP types.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Alex Deucher
a52077b6b6 drm/amdgpu/mes: enable compute pipes across all MEC
Enable pipes on both MECs for MES.

Fixes: 745f46b6a9 ("drm/amdgpu: enable mes v12 self test")
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Alex Deucher
652a06f74a drm/amdgpu/mes: drop MES 10.x leftovers
Leftover from MES bring up.  There is no production
MES support for MES 10.x.  The rest of the MES 10.x
code has already been removed so drop this.

Acked-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:35 -04:00
Alex Deucher
5608ddf6e9 drm/amdgpu/mes: optimize compute loop handling
Break when we get to the end of the supported pipes
rather than continuing the loop.

Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:34 -04:00
Alex Deucher
3bae7916e7 drm/amdgpu/sdma: guilty tracking is per instance
The gfx and page queues are per instance, so track them
per instance.

v2: drop extra parameter (Lijo)

Fixes: fdbfaaaae0 ("drm/amdgpu: Improve SDMA reset logic with guilty queue tracking")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:34 -04:00
Alex Deucher
e02fcf7308 drm/amdgpu/sdma: fix engine reset handling
Move the kfd suspend/resume code into the caller.  That
is where the KFD is likely to detect a reset so on the KFD
side there is no need to call them.  Also add a mutex to
lock the actual reset sequence.

v2: make the locking per instance

Fixes: bac38ca8c4 ("drm/amdkfd: implement per queue sdma reset for gfx 9.4+")
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:34 -04:00
Christian König
fc70d1ea1b drm/amdgpu: remove invalid usage of sched.ready
I can't count how often I had to remove this nonsense.

Probably doesn't need an explanation any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:34 -04:00
Christian König
02ba7543f2 drm/amdgpu: add cleaner shader trace point
Note when the cleaner shader is executed.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:34 -04:00
Christian König
1bb1314d0b drm/amdgpu: add isolation trace point
Note when we switch from one isolation owner to another.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:34 -04:00
Christian König
db1e58ec86 drm/amdgpu: stop reserving VMIDs to enforce isolation
That was quite troublesome for gang submit. Completely drop this
approach and enforce the isolation separately.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:34 -04:00
Christian König
b7fbcd77bb drm/amdgpu: rework how the cleaner shader is emitted v3
Instead of emitting the cleaner shader for every job which has the
enforce_isolation flag set only emit it for the first submission from
every client.

v2: add missing NULL check
v3: fix another NULL pointer deref

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:34 -04:00
Christian König
bd22e44ad4 drm/amdgpu: rework how isolation is enforced v2
Limiting the number of available VMIDs to enforce isolation causes some
issues with gang submit and applying certain HW workarounds which
require multiple VMIDs to work correctly.

So instead start to track all submissions to the relevant engines in a
per partition data structure and use the dma_fences of the submissions
to enforce isolation similar to what a VMID limit does.

v2: use ~0l for jobs without isolation to distinct it from kernel
    submissions which uses NULL for the owner. Add some warning when we
    are OOM.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:16:34 -04:00
Christian König
7f11c59e07 drm/amdgpu: overwrite signaled fence in amdgpu_sync
This allows using amdgpu_sync even without peeking into the fences for a
long time.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:15:08 -04:00
Christian König
16590745b5 drm/amdgpu: use GFP_NOWAIT for memory allocations
In the critical submission path memory allocations can't wait for
reclaim since that can potentially wait for submissions to finish.

Finally clean that up and mark most memory allocations in the critical
path with GFP_NOWAIT. The only exception left is the dma_fence_array()
used when no VMID is available, but that will be cleaned up later on.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:15:08 -04:00
Kenneth Feng
a67f0094c9 drm/amd/amdgpu: Revert "drm/amd/amdgpu: shorten the gfx idle worker timeout"
This reverts commit 55ff973fe1.

Reason for revert: this causes some tests fail with call trace.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:15:08 -04:00
Ahmad Rehman
cfdf8b34b9 drm/amdgpu/sdam: Skip SDMA queue reset for SRIOV
For SRIOV, skip the SDMA queue reset and return
error. The engine/queue reset failure will trigger
FLR in the sequence.

v2: do not add queue reset support mask for sriov

Signed-off-by: Ahmad Rehman <Ahmad.Rehman@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:15:08 -04:00
Ahmad Rehman
0a59fbd5d9 drm/amdgpu: Add support to load PSP TA v13.0.12 for SRIOV
Add case for 13.0.12.

Signed-off-by: Ahmad Rehman <ahrehman@amd.com>
Reviewed-by: Vignesh.Chander@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:15:08 -04:00
Ellen Pan
d7f5c13e45 drm/amdgpu: Enable amdgpu_ras_resume for gfx 9.5.0
This enables ras to be resumed after gpu recovery on mi350 sriov.

Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Reviewed-by: Ahmad Rehman <Ahmad.Rehman@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:15:08 -04:00
Jonathan Kim
f82d27dcff drm/amdkfd: set precise mem ops caps to disabled for gfx 11 and 12
Clause instructions with precise memory enabled currently hang the
shader so set capabilities flag to disabled since it's unsafe to use
for debugging.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Tested-by: Lancelot Six <lancelot.six@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21 12:15:08 -04:00
Victor Skvortsov
9c05636ca7 drm/amdgpu: Skip pcie_replay_count sysfs creation for VF
VFs cannot read the NAK_COUNTER register. This information is only
available through PMFW metrics.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:56:27 -04:00
Candice Li
a5f7e90fe0 drm/amdgpu: Add active_umc_mask to ras init_flags
Add active_umc_mask to ras init_flags.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:56:23 -04:00
Lijo Lazar
a7818b15cf Documentation/amdgpu: Add debug_mask documentation
Add description for debug_mask bit options.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:56:19 -04:00
Lijo Lazar
ab6893402a drm/amd/pm: Add debug bit for smu pool allocation
In certain cases, it's desirable to avoid PMFW log transactions to
system memory. Add a mask bit to decide whether to allocate smu pool in
device memory or system memory.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:56:13 -04:00
Alex Deucher
3b669df92c drm/amdgpu/vcn: adjust workload profile handling
No need to make the workload profile setup dependent
on the results of cancelling the delayed work thread.
We have all of the necessary checking in place for the
workload profile reference counting, so separate the
two.  As it is now, we can theoretically end up with
the call from begin_use happening while the worker
thread is executing which would result in the profile
not getting set for that submission.  It should not
affect the reference counting.

v2: bail early if the the profile is already active (Lijo)

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:56:09 -04:00
Alex Deucher
9e34d8d1a1 drm/amdgpu/gfx: adjust workload profile handling
No need to make the workload profile setup dependent
on the results of cancelling the delayed work thread.
We have all of the necessary checking in place for the
workload profile reference counting, so separate the
two.  As it is now, we can theoretically end up with
the call from begin_use happening while the worker
thread is executing which would result in the profile
not getting set for that submission.  It should not
affect the reference counting.

v2: bail early if the the profile is already active (Lijo)

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:56:04 -04:00
Candice Li
5762f9dcf7 drm/amdgpu: Add EEPROM I2C address support for smu v13_0_12
Add EEPROM I2C address support for smu v13_0_12.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:56:00 -04:00
Alex Deucher
ca6575a32a drm/amdgpu/vcn: fix ref counting for ring based profile handling
We need to make sure the workload profile ref counts are
balanced.  This isn't currently the case because we can
increment the count on submissions, but the decrement may
be delayed as work comes in.  Track when we enable the
workload profile so the references are balanced.

v2: switch to a mutex and active flag
v3: fix mutex init

Fixes: 1443dd3c67 ("drm/amd/pm: fix and simplify workload handling")
Cc: Yang Wang <kevinyang.wang@amd.com>
Cc: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:54:36 -04:00
Alex Deucher
553673a3e1 drm/amdgpu/gfx: fix ref counting for ring based profile handling
We need to make sure the workload profile ref counts are
balanced.  This isn't currently the case because we can
increment the count on submissions, but the decrement may
be delayed as work comes in.  Track when we enable the
workload profile so the references are balanced.

v2: switch to a mutex and active flag
v3: fix mutex init

Fixes: 8fdb3958e3 ("drm/amdgpu/gfx: add ring helpers for setting workload profile")
Cc: Yang Wang <kevinyang.wang@amd.com>
Cc: Kenneth Feng <kenneth.feng@amd.com>
Tested-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:52:56 -04:00
Harish Kasiviswanathan
fed7efbb43 drm/amdkfd: Fix bug in config_dequeue_wait_counts
For certain ASICs where dequeue_wait_count don't need to be initialized,
pm_config_dequeue_wait_counts_v9 return without filling in the packet
information. However, the calling function interprets this as a success
and sends the uninitialized packet to firmware causing hang.

Fix the above bug by not calling pm_config_dequeue_wait_counts_v9 for
ASICs that don't need the value to be initialized.

v2: Removed redudant code.
    Tidy up code based on review comments
v3: Don't call pm_config_dequeue_wait_counts_v9 for certain ASICs

Fixes: ed962f8d06 ("drm/amdkfd: Add pm_config_dequeue_wait_counts API")
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:51:44 -04:00
Christian König
0d9a95099d drm/amdgpu: grab an additional reference on the gang fence v2
We keep the gang submission fence around in adev, make sure that it
stays alive.

v2: fix memory leak on retry

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-19 15:51:31 -04:00
Tomasz Pakuła
d9d4cb224e drm/amdgpu/pm: Handle SCLK offset correctly in overdrive for smu 14.0.2
Currently, it seems like the code was carried over from RDNA3 because
it assumes two possible values to set. RDNA4, instead of having:
0: min SCLK
1: max SCLK
only has
0: SCLK offset

This change makes it so it only reports current offset value instead of
showing possible min/max values and their indices. Moreover, it now only
accepts the offset as a value, without the indice index.

Additionally, the lower bound was printed as %u by mistake.

Old:
OD_SCLK_OFFSET:
0: -500Mhz
1: 1000Mhz
OD_MCLK:
0: 97Mhz
1: 1259MHz
OD_VDDGFX_OFFSET:
0mV
OD_RANGE:
SCLK_OFFSET:    -500Mhz       1000Mhz
MCLK:      97Mhz       1500Mhz
VDDGFX_OFFSET:    -200mv          0mv

New:
OD_SCLK_OFFSET:
0Mhz
OD_MCLK:
0: 97Mhz
1: 1259MHz
OD_VDDGFX_OFFSET:
0mV
OD_RANGE:
SCLK_OFFSET:    -500Mhz       1000Mhz
MCLK:      97Mhz       1500Mhz
VDDGFX_OFFSET:    -200mv          0mv

Setting this offset:
Old: "s 1 <offset>"
New: "s <offset>"

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4036
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Tomasz Pakuła <tomasz.pakula.oficjalny@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1cfeb60e6e)
Cc: stable@vger.kernel.org # 6.12.x
2025-03-18 16:34:06 -04:00
Lo-an Chen
d60073294c drm/amd/display: Fix incorrect fw_state address in dmub_srv
[WHY]
The fw_state in dmub_srv was assigned with wrong address.
The address was pointed to the firmware region.

[HOW]
Fix the firmware state by using DMUB_DEBUG_FW_STATE_OFFSET
in dmub_cmd.h.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Lo-an Chen <lo-an.chen@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f57b38ac85)
2025-03-18 16:33:38 -04:00
Mario Limonciello
acbf16a6ae drm/amd/display: Use HW lock mgr for PSR1 when only one eDP
[WHY]
DMUB locking is important to make sure that registers aren't accessed
while in PSR.  Previously it was enabled but caused a deadlock in
situations with multiple eDP panels.

[HOW]
Detect if multiple eDP panels are in use to decide whether to use
lock. Refactor the function so that the first check is for PSR-SU
and then replay is in use to prevent having to look up number
of eDP panels for those configurations.

Fixes: f245b400a2 ("Revert "drm/amd/display: Use HW lock mgr for PSR1"")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3965
Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ed569e1279)
Cc: stable@vger.kernel.org
2025-03-18 16:33:11 -04:00
Yilin Chen
35f0f9f421 drm/amd/display: Fix message for support_edp0_on_dp1
[WHY]
The info message was wrong when support_edp0_on_dp1 is enabled

[HOW]
Use correct info message for support_edp0_on_dp1

Fixes: f6d17270d1 ("drm/amd/display: add a quirk to enable eDP0 on DP1")
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Yilin Chen <Yilin.Chen@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 79538e6365)
Cc: stable@vger.kernel.org
2025-03-18 16:32:33 -04:00
Philip Yang
542c3bb836 drm/amdkfd: Fix user queue validation on Gfx7/8
To workaround queue full h/w issue on Gfx7/8, when application create
AQL queue, the ring buffer bo allocate size is queue_size/2 and
map queue_size ring buffer to GPU in 2 pieces using 2 attachments, each
attachment map size is queue_size/2, with same ring_bo backing memory.

For Gfx7/8, user queue buffer validation should use queue_size/2 to
verify ring_bo allocation and mapping size.

Fixes: 68e599db7a ("drm/amdkfd: Validate user queue buffers")
Suggested-by: Tomáš Trnka <trnka@scm.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e7a477735f)
Cc: stable@vger.kernel.org
2025-03-18 16:31:25 -04:00
David Belanger
35b6162bb7 drm/amdgpu: Restore uncached behaviour on GFX12
Always use MTYPE_UC if UNCACHED flag is specified.

This makes kernarg region uncached and it restores
usermode cache disable debug flag functionality.

Do not set MTYPE_UC for COHERENT flag, on GFX12 coherence is handled by
shader code.

Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit eb6cdfb807)
Cc: stable@vger.kernel.org # 6.12.x
2025-03-18 16:29:52 -04:00
Wentao Liang
86730b5261 drm/amdgpu/gfx12: correct cleanup of 'me' field with gfx_v12_0_me_fini()
In gfx_v12_0_cp_gfx_load_me_microcode_rs64(), gfx_v12_0_pfp_fini() is
incorrectly used to free 'me' field of 'gfx', since gfx_v12_0_pfp_fini()
can only release 'pfp' field of 'gfx'. The release function of 'me' field
should be gfx_v12_0_me_fini().

Fixes: 52cb80c12e ("drm/amdgpu: Add gfx v12_0 ip block support (v6)")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ebdc52607a)
Cc: stable@vger.kernel.org # 6.12.x
2025-03-18 16:29:16 -04:00
Jay Cornwall
424648c383 drm/amdkfd: Fix instruction hazard in gfx12 trap handler
VALU instructions with SGPR source need wait states to avoid hazard
with SALU using different SGPR.

v2: Eliminate some hazards to reduce code explosion

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7e0459d453)
Cc: stable@vger.kernel.org # 6.12.x
2025-03-18 16:28:34 -04:00
Alex Deucher
5ca0040ecf drm/amdgpu/pm: wire up hwmon fan speed for smu 14.0.2
Add callbacks for fan speed fetching.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4034
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 90df6db62f)
Cc: stable@vger.kernel.org # 6.12.x
2025-03-18 16:27:56 -04:00
Harish Kasiviswanathan
19b53f9685 drm/amd/pm: add unique_id for gfx12
Expose unique_id for gfx12

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 16fbc18cb0)
Cc: stable@vger.kernel.org # 6.12.x
2025-03-18 16:26:53 -04:00
David Rosca
7fc0765208 drm/amdgpu: Remove JPEG from vega and carrizo video caps
JPEG is only supported for VCN1+.

Signed-off-by: David Rosca <david.rosca@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0a6e7b06bd)
Cc: stable@vger.kernel.org
2025-03-18 16:26:12 -04:00
David Rosca
ec33964d9d drm/amdgpu: Fix JPEG video caps max size for navi1x and raven
8192x8192 is the maximum supported resolution.

Signed-off-by: David Rosca <david.rosca@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6e0d2fde3a)
Cc: stable@vger.kernel.org
2025-03-18 16:25:46 -04:00