2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

33509 Commits

Author SHA1 Message Date
David Rosca
f0105e1731 drm/amdgpu: Fix MPEG2, MPEG4 and VC1 video caps max size
1920x1088 is the maximum supported resolution.

Signed-off-by: David Rosca <david.rosca@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1a0807feb9)
Cc: stable@vger.kernel.org
2025-03-18 16:25:16 -04:00
Lijo Lazar
6c11d4a87d drm/amdgpu: Use wafl version for xgmi
XGMI and WAFL share the same versions. Use WAFL version if XGMI version
is not present in discovery.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:47 -04:00
Jesse.zhang@amd.com
cc63bcfd14 drm/amdgpu: Fix SDMA engine reset logic
The scheduler should restart only if the reset operation
succeeds  This ensures that new tasks are only submitted
to the queues after a successful reset.

Fixes: 4c02f73016 ("drm/amdgpu: Introduce conditional user queue suspension for SDMA resets")
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse.Zhang <Jesse.zhang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:47 -04:00
Tomasz Pakuła
1cfeb60e6e drm/amdgpu/pm: Handle SCLK offset correctly in overdrive for smu 14.0.2
Currently, it seems like the code was carried over from RDNA3 because
it assumes two possible values to set. RDNA4, instead of having:
0: min SCLK
1: max SCLK
only has
0: SCLK offset

This change makes it so it only reports current offset value instead of
showing possible min/max values and their indices. Moreover, it now only
accepts the offset as a value, without the indice index.

Additionally, the lower bound was printed as %u by mistake.

Old:
OD_SCLK_OFFSET:
0: -500Mhz
1: 1000Mhz
OD_MCLK:
0: 97Mhz
1: 1259MHz
OD_VDDGFX_OFFSET:
0mV
OD_RANGE:
SCLK_OFFSET:    -500Mhz       1000Mhz
MCLK:      97Mhz       1500Mhz
VDDGFX_OFFSET:    -200mv          0mv

New:
OD_SCLK_OFFSET:
0Mhz
OD_MCLK:
0: 97Mhz
1: 1259MHz
OD_VDDGFX_OFFSET:
0mV
OD_RANGE:
SCLK_OFFSET:    -500Mhz       1000Mhz
MCLK:      97Mhz       1500Mhz
VDDGFX_OFFSET:    -200mv          0mv

Setting this offset:
Old: "s 1 <offset>"
New: "s <offset>"

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4036
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Tomasz Pakuła <tomasz.pakula.oficjalny@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:47 -04:00
Flora Cui
b5aaa82e2b drm/amdgpu: release xcp_mgr on exit
Free on driver cleanup.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:47 -04:00
Taimur Hassan
bed6bc66e8 drm/amd/display: 3.2.325
This version brings along following fixes:
- Use DPM table clk setting for dml2 soc dscclk
- Update static soc table
- Fix incorrect fw_state address in dmub_srv
- Use HW lock mgr for PSR1 when only one eDP
- Revert "Support for reg inbox0 for host->DMUB CMDs"
- Change notification of link BW allocation
- Fix message for support_edp0_on_dp1
- Guard against setting dispclk low for dcn31x
- Prevent VStartup Overflow
- Check pipe->stream before passing it to a function

Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:47 -04:00
Charlene Liu
15b959534a drm/amd/display: Use DPM table clk setting for dml2 soc dscclk
[WHY]
Not like dppclk/dispclk, dml2 will calculate the minimum required clocks.
For dscclk, it is used for pure comparision.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:47 -04:00
Charlene Liu
20c13ca5ba drm/amd/display: Update static soc table
[WHY]
Update the static soc table dcn3_5_soc.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:47 -04:00
Lo-an Chen
f57b38ac85 drm/amd/display: Fix incorrect fw_state address in dmub_srv
[WHY]
The fw_state in dmub_srv was assigned with wrong address.
The address was pointed to the firmware region.

[HOW]
Fix the firmware state by using DMUB_DEBUG_FW_STATE_OFFSET
in dmub_cmd.h.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Lo-an Chen <lo-an.chen@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:47 -04:00
Mario Limonciello
ed569e1279 drm/amd/display: Use HW lock mgr for PSR1 when only one eDP
[WHY]
DMUB locking is important to make sure that registers aren't accessed
while in PSR.  Previously it was enabled but caused a deadlock in
situations with multiple eDP panels.

[HOW]
Detect if multiple eDP panels are in use to decide whether to use
lock. Refactor the function so that the first check is for PSR-SU
and then replay is in use to prevent having to look up number
of eDP panels for those configurations.

Fixes: f245b400a2 ("Revert "drm/amd/display: Use HW lock mgr for PSR1"")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3965
Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:46 -04:00
Dillon Varone
b3d58262dc drm/amd/display: Revert "Support for reg inbox0 for host->DMUB CMDs"
This reverts commit 15d1c2e6bf.

Reason: Cursor movement causes system to hang.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:46 -04:00
Cruise Hung
52af17eabb drm/amd/display: Change notification of link BW allocation
[WHY & HOW]
The response of DP BW allocation is handled in Outbox ISR.
When it failed to request the DP BW allocation, it sent another
DPCD request in Outbox ISR immediately. The DP AUX reply also
uses the Outbox ISR. So, no AUX reply happened in this case.
Change to use HPD IRQ for the notification.

Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Cruise Hung <Cruise.Hung@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:46 -04:00
Yilin Chen
79538e6365 drm/amd/display: Fix message for support_edp0_on_dp1
[WHY]
The info message was wrong when support_edp0_on_dp1 is enabled

[HOW]
Use correct info message for support_edp0_on_dp1

Fixes: f6d17270d1 ("drm/amd/display: add a quirk to enable eDP0 on DP1")
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Yilin Chen <Yilin.Chen@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:46 -04:00
Jing Zhou
9c2f4ae64b drm/amd/display: Guard against setting dispclk low for dcn31x
[WHY]
We should never apply a minimum dispclk value while in
prepare_bandwidth or while displays are active. This is
always an optimizaiton for when all displays are disabled.

[HOW]
Defer dispclk optimization until safe_to_lower = true
and display_count reaches 0.

Since 0 has a special value in this logic (ie. no dispclk
required) we also need adjust the logic that clamps it for
the actual request to PMFW.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Reviewed-by: Chris Park <chris.park@amd.com>
Reviewed-by: Eric Yang <eric.yang@amd.com>
Signed-off-by: Jing Zhou <Jing.Zhou@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:46 -04:00
Ryan Seto
7b59cc671a drm/amd/display: Prevent VStartup Overflow
[WHY & HOW]
Fixed Overflow issue by clamping VStartup to max value of register.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Ryan Seto <ryanseto@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:46 -04:00
Alex Hung
082ec59662 drm/amd/display: Check pipe->stream before passing it to a function
[WHAT & HOW]
dp_is_128b_132b_signal dereferences pipe->stream so it is necessary to
check it in advance.

Also fix erroneous spaces and move a variable declaration to top.

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:46 -04:00
Dominik Kaszewski
84ff589539 drm/amdgpu: Add debug masks for HDCP LC FW testing
HDCP Locality Check is being moved to FW, add debug flags to control
its behavior in existing hardware for validation purposes.

Signed-off-by: Dominik Kaszewski <dominik.kaszewski@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:46 -04:00
Xiang Liu
d6f9bbce18 drm/amdgpu: Fix computation for remain size of CPER ring
The mistake of computation for remain size of CPER ring will cause
unbreakable while cycle when CPER ring overflow.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:46 -04:00
Kenneth Feng
55ff973fe1 drm/amd/amdgpu: shorten the gfx idle worker timeout
Shorten the gfx idle worker timeout. This is to sync with
DAL when there is no activity on the screen. Original 1
second can not sync with DAL, so DAL can not apply MALL
when the workload type is not bootup default.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:38 -04:00
Tao Zhou
05d50ea3ea drm/amdgpu: format old RAS eeprom data into V3 version
Clear old data and save it in V3 format.

v2: only format eeprom data for new ASICs.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:38 -04:00
Alex Deucher
9deacd6c55 drm/amdgpu: don't free conflicting apertures for non-display devices
PCI_CLASS_ACCELERATOR_PROCESSING devices won't ever be
the sysfb, so there is no need to free conflicting
apertures.

Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:38 -04:00
Alex Deucher
e00e5c2238 drm/amdgpu: adjust drm_firmware_drivers_only() handling
Move to probe so we can check the PCI device type and
only apply the drm_firmware_drivers_only() check for
PCI DISPLAY classes.  Also add a module parameter to
override the nomodeset kernel parameter as a workaround
for platforms that have this hardcoded on their kernel
command lines.

Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:38 -04:00
Alex Deucher
4db4c82d4d drm/amdgpu: drop drm_firmware_drivers_only()
There are a number of systems and cloud providers out there
that have nomodeset hardcoded in their kernel parameters
to block nouveau for the nvidia driver.  This prevents the
amdgpu driver from loading. Unfortunately the end user cannot
easily change this.  The preferred way to block modules from
loading is to use modprobe.blacklist=<driver>.  That is what
providers should be using to block specific drivers.

Drop the check to allow the driver to load even when nomodeset
is specified on the kernel command line.

Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-18 14:03:37 -04:00
David Belanger
eb6cdfb807 drm/amdgpu: Restore uncached behaviour on GFX12
Always use MTYPE_UC if UNCACHED flag is specified.

This makes kernarg region uncached and it restores
usermode cache disable debug flag functionality.

Do not set MTYPE_UC for COHERENT flag, on GFX12 coherence is handled by
shader code.

Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:18:02 -04:00
Wentao Liang
ebdc52607a drm/amdgpu/gfx12: correct cleanup of 'me' field with gfx_v12_0_me_fini()
In gfx_v12_0_cp_gfx_load_me_microcode_rs64(), gfx_v12_0_pfp_fini() is
incorrectly used to free 'me' field of 'gfx', since gfx_v12_0_pfp_fini()
can only release 'pfp' field of 'gfx'. The release function of 'me' field
should be gfx_v12_0_me_fini().

Fixes: 52cb80c12e ("drm/amdgpu: Add gfx v12_0 ip block support (v6)")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:17:24 -04:00
Thadeu Lima de Souza Cascardo
42d9d7bed2 drm/amd/display: avoid NPD when ASIC does not support DMUB
ctx->dmub_srv will de NULL if the ASIC does not support DMUB, which is
tested in dm_dmub_sw_init.

However, it will be dereferenced in dmub_hw_lock_mgr_cmd if
should_use_dmub_lock returns true.

This has been the case since dmub support has been added for PSR1.

Fix this by checking for dmub_srv in should_use_dmub_lock.

[   37.440832] BUG: kernel NULL pointer dereference, address: 0000000000000058
[   37.447808] #PF: supervisor read access in kernel mode
[   37.452959] #PF: error_code(0x0000) - not-present page
[   37.458112] PGD 0 P4D 0
[   37.460662] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[   37.465553] CPU: 2 UID: 1000 PID: 1745 Comm: DrmThread Not tainted 6.14.0-rc1-00003-gd62e938120f0 #23 99720e1cb1e0fc4773b8513150932a07de3c6e88
[   37.478324] Hardware name: Google Morphius/Morphius, BIOS Google_Morphius.13434.858.0 10/26/2023
[   37.487103] RIP: 0010:dmub_hw_lock_mgr_cmd+0x77/0xb0
[   37.492074] Code: 44 24 0e 00 00 00 00 48 c7 04 24 45 00 00 0c 40 88 74 24 0d 0f b6 02 88 44 24 0c 8b 01 89 44 24 08 85 f6 75 05 c6 44 24 0e 01 <48> 8b 7f 58 48 89 e6 ba 01 00 00 00 e8 08 3c 2a 00 65 48 8b 04 5
[   37.510822] RSP: 0018:ffff969442853300 EFLAGS: 00010202
[   37.516052] RAX: 0000000000000000 RBX: ffff92db03000000 RCX: ffff969442853358
[   37.523185] RDX: ffff969442853368 RSI: 0000000000000001 RDI: 0000000000000000
[   37.530322] RBP: 0000000000000001 R08: 00000000000004a7 R09: 00000000000004a5
[   37.537453] R10: 0000000000000476 R11: 0000000000000062 R12: ffff92db0ade8000
[   37.544589] R13: ffff92da01180ae0 R14: ffff92da011802a8 R15: ffff92db03000000
[   37.551725] FS:  0000784a9cdfc6c0(0000) GS:ffff92db2af00000(0000) knlGS:0000000000000000
[   37.559814] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   37.565562] CR2: 0000000000000058 CR3: 0000000112b1c000 CR4: 00000000003506f0
[   37.572697] Call Trace:
[   37.575152]  <TASK>
[   37.577258]  ? __die_body+0x66/0xb0
[   37.580756]  ? page_fault_oops+0x3e7/0x4a0
[   37.584861]  ? exc_page_fault+0x3e/0xe0
[   37.588706]  ? exc_page_fault+0x5c/0xe0
[   37.592550]  ? asm_exc_page_fault+0x22/0x30
[   37.596742]  ? dmub_hw_lock_mgr_cmd+0x77/0xb0
[   37.601107]  dcn10_cursor_lock+0x1e1/0x240
[   37.605211]  program_cursor_attributes+0x81/0x190
[   37.609923]  commit_planes_for_stream+0x998/0x1ef0
[   37.614722]  update_planes_and_stream_v2+0x41e/0x5c0
[   37.619703]  dc_update_planes_and_stream+0x78/0x140
[   37.624588]  amdgpu_dm_atomic_commit_tail+0x4362/0x49f0
[   37.629832]  ? srso_return_thunk+0x5/0x5f
[   37.633847]  ? mark_held_locks+0x6d/0xd0
[   37.637774]  ? _raw_spin_unlock_irq+0x24/0x50
[   37.642135]  ? srso_return_thunk+0x5/0x5f
[   37.646148]  ? lockdep_hardirqs_on+0x95/0x150
[   37.650510]  ? srso_return_thunk+0x5/0x5f
[   37.654522]  ? _raw_spin_unlock_irq+0x2f/0x50
[   37.658883]  ? srso_return_thunk+0x5/0x5f
[   37.662897]  ? wait_for_common+0x186/0x1c0
[   37.666998]  ? srso_return_thunk+0x5/0x5f
[   37.671009]  ? drm_crtc_next_vblank_start+0xc3/0x170
[   37.675983]  commit_tail+0xf5/0x1c0
[   37.679478]  drm_atomic_helper_commit+0x2a2/0x2b0
[   37.684186]  drm_atomic_commit+0xd6/0x100
[   37.688199]  ? __cfi___drm_printfn_info+0x10/0x10
[   37.692911]  drm_atomic_helper_update_plane+0xe5/0x130
[   37.698054]  drm_mode_cursor_common+0x501/0x670
[   37.702600]  ? __cfi_drm_mode_cursor_ioctl+0x10/0x10
[   37.707572]  drm_mode_cursor_ioctl+0x48/0x70
[   37.711851]  drm_ioctl_kernel+0xf2/0x150
[   37.715781]  drm_ioctl+0x363/0x590
[   37.719189]  ? __cfi_drm_mode_cursor_ioctl+0x10/0x10
[   37.724165]  amdgpu_drm_ioctl+0x41/0x80
[   37.728013]  __se_sys_ioctl+0x7f/0xd0
[   37.731685]  do_syscall_64+0x87/0x100
[   37.735355]  ? vma_end_read+0x12/0xe0
[   37.739024]  ? srso_return_thunk+0x5/0x5f
[   37.743041]  ? find_held_lock+0x47/0xf0
[   37.746884]  ? vma_end_read+0x12/0xe0
[   37.750552]  ? srso_return_thunk+0x5/0x5f
[   37.754565]  ? lock_release+0x1c4/0x2e0
[   37.758406]  ? vma_end_read+0x12/0xe0
[   37.762079]  ? exc_page_fault+0x84/0xe0
[   37.765921]  ? srso_return_thunk+0x5/0x5f
[   37.769938]  ? lockdep_hardirqs_on+0x95/0x150
[   37.774303]  ? srso_return_thunk+0x5/0x5f
[   37.778317]  ? exc_page_fault+0x84/0xe0
[   37.782163]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
[   37.787218] RIP: 0033:0x784aa5ec3059
[   37.790803] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1d 48 8b 45 c8 64 48 2b 04 25 28 00 0
[   37.809553] RSP: 002b:0000784a9cdf90e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   37.817121] RAX: ffffffffffffffda RBX: 0000784a9cdf917c RCX: 0000784aa5ec3059
[   37.824256] RDX: 0000784a9cdf917c RSI: 00000000c01c64a3 RDI: 0000000000000020
[   37.831391] RBP: 0000784a9cdf9130 R08: 0000000000000100 R09: 0000000000ff0000
[   37.838525] R10: 0000000000000000 R11: 0000000000000246 R12: 0000025c01606ed0
[   37.845657] R13: 0000025c00030200 R14: 00000000c01c64a3 R15: 0000000000000020
[   37.852799]  </TASK>
[   37.854992] Modules linked in:
[   37.864546] gsmi: Log Shutdown Reason 0x03
[   37.868656] CR2: 0000000000000058
[   37.871979] ---[ end trace 0000000000000000 ]---
[   37.880976] RIP: 0010:dmub_hw_lock_mgr_cmd+0x77/0xb0
[   37.885954] Code: 44 24 0e 00 00 00 00 48 c7 04 24 45 00 00 0c 40 88 74 24 0d 0f b6 02 88 44 24 0c 8b 01 89 44 24 08 85 f6 75 05 c6 44 24 0e 01 <48> 8b 7f 58 48 89 e6 ba 01 00 00 00 e8 08 3c 2a 00 65 48 8b 04 5
[   37.904703] RSP: 0018:ffff969442853300 EFLAGS: 00010202
[   37.909933] RAX: 0000000000000000 RBX: ffff92db03000000 RCX: ffff969442853358
[   37.917068] RDX: ffff969442853368 RSI: 0000000000000001 RDI: 0000000000000000
[   37.924201] RBP: 0000000000000001 R08: 00000000000004a7 R09: 00000000000004a5
[   37.931336] R10: 0000000000000476 R11: 0000000000000062 R12: ffff92db0ade8000
[   37.938469] R13: ffff92da01180ae0 R14: ffff92da011802a8 R15: ffff92db03000000
[   37.945602] FS:  0000784a9cdfc6c0(0000) GS:ffff92db2af00000(0000) knlGS:0000000000000000
[   37.953689] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   37.959435] CR2: 0000000000000058 CR3: 0000000112b1c000 CR4: 00000000003506f0
[   37.966570] Kernel panic - not syncing: Fatal exception
[   37.971901] Kernel Offset: 0x30200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   37.982840] gsmi: Log Shutdown Reason 0x02

Fixes: b5c764d6ed ("drm/amd/display: Use HW lock mgr for PSR1")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Cc: Sun peng Li <sunpeng.li@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Daniel Wheeler <daniel.wheeler@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:16:26 -04:00
Shaoyun Liu
f81cd79311 drm/amd/amdgpu: Fix MES init sequence
When MES is been used , the set_hw_resource_1 API is required to
initialize MES internal context correctly

Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:16:08 -04:00
Xiang Liu
13c13bdd1b drm/amdgpu: Enable ACA by default for psp v13_0_6/v13_0_14
Enable ACA by default for psp v13_0_6/v13_0_14.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:15:58 -04:00
Amber Lin
0c7e053448 drm/amdkfd: Correct F8_MODE for gfx950
Correct F8_MODE setting for gfx950 that was removed

Fixes: 61972cd93a ("drm/amdkfd: Set per-process flags only once for gfx9/10/11/12")
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviwanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:13:12 -04:00
ganglxie
a4b6e990d7 drm/amdgpu: Save PA of bad pages for old asics
for old asics that do not support mca translating, we
just save PA for them

Signed-off-by: ganglxie <ganglxie@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:13:08 -04:00
Emily Deng
2da3af5f0b drm/amdgpu: set CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1 for sriov multiple vf.
In sriov multiple vf, Set CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1 to read WPTR from MQD.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:13:02 -04:00
Emily Deng
8d5e70ba5d drm/amdgpu: Add amdgpu_sriov_multi_vf_mode function
Use amdgpu_sriov_multi_vf_mode to replace amdgpu_sriov_vf(adev) && !amdgpu_sriov_is_pp_one_vf(adev).

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:12:52 -04:00
Alex Deucher
15030aeec3 drm/amdgpu/pm: enable vcn busy sysfs for GC 9.3.0
Make it visible for the all GC 9.3.0 chips that support it.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:12:43 -04:00
Alex Deucher
5b3922222c drm/amdgpu/pm: enable vcn busy sysfs for GC 12.x
Make it visible for the all GC 12.x chips that support it.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:12:35 -04:00
Jay Cornwall
7e0459d453 drm/amdkfd: Fix instruction hazard in gfx12 trap handler
VALU instructions with SGPR source need wait states to avoid hazard
with SALU using different SGPR.

v2: Eliminate some hazards to reduce code explosion

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:12:28 -04:00
Alex Deucher
18537feb18 drm/amdgpu/pm: enable vcn busy sysfs for additional GC 11.x
Make it visible for the all GC 11.x chips that support it.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:12:21 -04:00
Alex Deucher
1b81674e0b drm/amdgpu/pm: add VCN activity for SMU 14.0.2
Wire up the query.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:12:12 -04:00
Alex Deucher
2393c1a907 drm/amdgpu/pm: add VCN activity for SMU 13.0.0/7
Wire up the query.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:12:04 -04:00
Alex Hung
6a87982b58 drm/amd/display: Remove incorrect macro guard
This macro guard "__cplusplus" is unnecessary and should not be there.

Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:11:55 -04:00
Lijo Lazar
357506799b drm/amdgpu: Calculate IP specific xgmi bandwidth
Use IP version specific xgmi speed/width for bandwidth calculation.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:11:46 -04:00
Alex Deucher
2bc016737a drm/amdgpu/pm: add VCN activity for renoir
Wire up the query.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:11:32 -04:00
Alex Deucher
90df6db62f drm/amdgpu/pm: wire up hwmon fan speed for smu 14.0.2
Add callbacks for fan speed fetching.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4034
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:10:57 -04:00
Harish Kasiviswanathan
8a7820c072 drm/amdgpu: Reduce dequeue retry timeout for gfx9 family
Dequeue retry timeout controls the interval between checks for unmet
conditions. On MI series, reduce this from 0x40 to 0x1 (~ 1 uS). The
cost of additional bandwidth consumed by CP when polling memory
shouldn't be substantial.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:10:38 -04:00
Asad Kamal
02fc2f3c46 drm/amd/pm: Update feature list for smu_v13_0_12
Update feature list for smu_v13_0_12 to show vcn & smu deep
sleep feature enable status.

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:10:33 -04:00
Alex Deucher
fc3c139cf0 drm/amdgpu/gfx12: don't read registers in mqd init
Just use the default values.  There's not need to
get the value from hardware and it could cause problems
if we do that at runtime and gfxoff is active.

Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:10:30 -04:00
Alex Deucher
e27b36ea6b drm/amdgpu/gfx11: don't read registers in mqd init
Just use the default values.  There's not need to
get the value from hardware and it could cause problems
if we do that at runtime and gfxoff is active.

Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:10:25 -04:00
Emily Deng
f844732e3a drm/amdgpu: Fix the race condition for draining retry fault
Issue:
In the scenario where svm_range_restore_pages is called, but
svm->checkpoint_ts has not been set and the retry fault has not been
drained, svm_range_unmap_from_cpu is triggered and calls svm_range_free.
Meanwhile, svm_range_restore_pages continues execution and reaches
svm_range_from_addr. This results in a "failed to find prange..." error,
 causing the page recovery to fail.

How to fix:
Move the timestamp check code under the protection of svm->lock.

v2:
Make sure all right locks are released before go out.

v3:
Directly goto out_unlock_svms, and return -EAGAIN.

v4:
Refine code.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:10:16 -04:00
Lijo Lazar
b9e75bcb2b drm/amdgpu: Remove unsupported xgmi versions
XGMI v4.8.0 is not used in any SOCs. Remove the associated functions.
Also, ensure get_xgmi_info callback pointer is not NULL before calling
the function.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:10:08 -04:00
Harish Kasiviswanathan
16fbc18cb0 drm/amd/pm: add unique_id for gfx12
Expose unique_id for gfx12

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:09:49 -04:00
David Rosca
19478f2011 drm/amdgpu: Update SRIOV video codec caps
There have been multiple fixes to the video caps that are missing for
SRIOV. Update the SRIOV caps with correct values.

Signed-off-by: David Rosca <david.rosca@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:08:51 -04:00
David Rosca
0a6e7b06bd drm/amdgpu: Remove JPEG from vega and carrizo video caps
JPEG is only supported for VCN1+.

Signed-off-by: David Rosca <david.rosca@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:08:25 -04:00
David Rosca
6e0d2fde3a drm/amdgpu: Fix JPEG video caps max size for navi1x and raven
8192x8192 is the maximum supported resolution.

Signed-off-by: David Rosca <david.rosca@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:08:14 -04:00
David Rosca
1a0807feb9 drm/amdgpu: Fix MPEG2, MPEG4 and VC1 video caps max size
1920x1088 is the maximum supported resolution.

Signed-off-by: David Rosca <david.rosca@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13 23:07:52 -04:00
Natalie Vock
6cc30748e1 drm/amdgpu: NULL-check BO's backing store when determining GFX12 PTE flags
PRT BOs may not have any backing store, so bo->tbo.resource will be
NULL. Check for that before dereferencing.

Fixes: 0cce5f285d ("drm/amdkfd: Check correct memory types for is_system variable")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3e3fcd29b5)
Cc: stable@vger.kernel.org # 6.12.x
2025-03-12 14:59:21 -04:00
Yifan Zha
0882ca4eec drm/amd/amdkfd: Evict all queues even HWS remove queue failed
[Why]
If reset is detected and kfd need to evict working queues, HWS moving queue will be failed.
Then remaining queues are not evicted and in active state.

After reset done, kfd uses HWS to termination remaining activated queues but HWS is resetted.
So remove queue will be failed again.

[How]
Keep removing all queues even if HWS returns failed.
It will not affect cpsch as it checks reset_domain->sem.

v2: If any queue failed, evict queue returns error.
v3: Declare err inside the if-block.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 42c854b8fb)
Cc: stable@vger.kernel.org
2025-03-12 14:59:21 -04:00
Natalie Vock
3e3fcd29b5 drm/amdgpu: NULL-check BO's backing store when determining GFX12 PTE flags
PRT BOs may not have any backing store, so bo->tbo.resource will be
NULL. Check for that before dereferencing.

Fixes: 0cce5f285d ("drm/amdkfd: Check correct memory types for is_system variable")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-11 12:37:10 -04:00
Alexandre Demers
37c890d831 drm/amdgpu: finish wiring up sid.h in DCE6
For coherence with DCE8 et DCE10, add or move some values under sid.h
and remove duplicated from si_enums.h.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-11 12:37:07 -04:00
Yifan Zha
42c854b8fb drm/amd/amdkfd: Evict all queues even HWS remove queue failed
[Why]
If reset is detected and kfd need to evict working queues, HWS moving queue will be failed.
Then remaining queues are not evicted and in active state.

After reset done, kfd uses HWS to termination remaining activated queues but HWS is resetted.
So remove queue will be failed again.

[How]
Keep removing all queues even if HWS returns failed.
It will not affect cpsch as it checks reset_domain->sem.

v2: If any queue failed, evict queue returns error.
v3: Declare err inside the if-block.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-11 12:36:52 -04:00
Alexandre Demers
760632fa2e drm/amdgpu: fix SI's GB_ADDR_CONFIG_GOLDEN values and wire up sid.h in GFX6
By wiring up sid.h in GFX6, we end up with a few duplicated defines such as
the golden registers. Let's clean this up.

[TAHITI,VERDE, HAINAN]_GB_ADDR_CONFIG_GOLDEN were defined both in sid.h
and under si_enums.h, with different values. Keep the values used under radeon
and move them under gfx_v6_0.c where they are used (as it is done under cik)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-11 12:36:48 -04:00
Alexandre Demers
20fb56dfd8 drm/amdgpu: prepare DCE6 uniformisation with DCE8 and DCE10
Let's begin the cleanup in sid.h to prevent warnings and errors when wiring
sid.h into dce_v6_0.c.

This is a bigger cleanup.
Many defines found under sid.h have already been properly moved
into the different "_d.h" and "_sh_mask.h", so they should have been
already removed from sid.h and properly linked in where needed.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-11 12:36:42 -04:00
Dan Carpenter
5b1fa87f30 drm/amdkfd: delete stray tab in kfd_dbg_set_mes_debug_mode()
These lines are indented one tab more than they should be.  Delete
the stray tabs.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-11 12:36:38 -04:00
Dan Carpenter
0ee560d71f drm/amdgpu/gfx: delete stray tabs
These lines are indented one tab too far.  Delete the extra tabs.

Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-11 12:36:29 -04:00
André Almeida
099f273eff drm/amdgpu: Trigger a wedged event for ring reset
Instead of only triggering a wedged event for complete GPU resets,
trigger for ring resets. Regardless of the reset, it's useful for
userspace to know that it happened because the kernel will reject
further submissions from that app.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 14:56:25 -04:00
Alex Deucher
ded6ad4c6e drm/amdgpu/vce2: fix ip block reference
Need to use the correct IP block type.  VCE vs VCN.
Fixes mclk issues on Hawaii.

Suggested by selendym.

Fixes: 82ae6619a4 ("drm/amdgpu: update the handle ptr in wait_for_idle")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3997
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 02438acd25)
Cc: stable@vger.kernel.org
2025-03-10 14:18:04 -04:00
Mario Limonciello
e65e7bea22 drm/amd/display: Fix slab-use-after-free on hdcp_work
[Why]
A slab-use-after-free is reported when HDCP is destroyed but the
property_validate_dwork queue is still running.

[How]
Cancel the delayed work when destroying workqueue.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4006
Fixes: da3fd7ac0b ("drm/amd/display: Update CP property based on HW query")
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 725a04ba5a)
Cc: stable@vger.kernel.org
2025-03-10 14:16:42 -04:00
Alex Hung
79e31396fd drm/amd/display: Assign normalized_pix_clk when color depth = 14
[WHY & HOW]
A warning message "WARNING: CPU: 4 PID: 459 at ... /dc_resource.c:3397
calculate_phy_pix_clks+0xef/0x100 [amdgpu]" occurs because the
display_color_depth == COLOR_DEPTH_141414 is not handled. This is
observed in Radeon RX 6600 XT.

It is fixed by assigning pix_clk * (14 * 3) / 24 - same as the rests.

Also fixes the indentation in get_norm_pix_clk.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 274a87eb38)
Cc: stable@vger.kernel.org
2025-03-10 14:13:16 -04:00
Mario Limonciello
5760388d96 drm/amd/display: Restore correct backlight brightness after a GPU reset
[Why]
GPU reset will attempt to restore cached state, but brightness doesn't
get restored. It will come back at 100% brightness, but userspace thinks
it's the previous value.

[How]
When running resume sequence if GPU is in reset restore brightness
to previous value.

Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5e19e2b57b)
Cc: stable@vger.kernel.org
2025-03-10 14:12:33 -04:00
Mario Limonciello
b5a981e1b3 drm/amd/display: fix default brightness
[Why]
To avoid flickering during boot default brightness level set by BIOS
should be maintained for as much of the boot as feasible.
commit 2fe87f54ab ("drm/amd/display: Set default brightness according
to ACPI") attempted to set the right levels for AC vs DC, but brightness
still got reset to maximum level in initialization code for
setup_backlight_device().

[How]
Remove the hardcoded initialization in setup_backlight_device() and
instead program brightness value to match BIOS (AC or DC).  This avoids a
brightness flicker from kernel changing the value.  Userspace may however
still change it during boot.

Fixes: 2fe87f54ab ("drm/amd/display: Set default brightness according to ACPI")
Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0747acf331)
Cc: stable@vger.kernel.org
2025-03-10 14:11:13 -04:00
Leo Li
40b8c14936 drm/amd/display: Disable unneeded hpd interrupts during dm_init
[Why]

It seems HPD interrupts are enabled by default for all connectors, even
if the hpd source isn't valid. An eDP for example, does not have a valid
hpd source (but does have a valid hpdrx source; see construct_phy()).
Thus, eDPs should have their hpd interrupt disabled.

In the past, this wasn't really an issue. Although the driver gets
interrupted, then acks by writing to hw registers, there weren't any
subscribed handlers that did anything meaningful (see
register_hpd_handlers()).

But things changed with the introduction of IPS. s2idle requires that
the driver allows IPS for DMUB fw to put hw to sleep. Since register
access requires hw to be awake, the driver will block IPS entry to do
so. And no IPS means no hw sleep during s2idle.

This was the observation on DCN35 systems with an eDP. During suspend,
the eDP toggled its hpd pin as part of the panel power down sequence.
The driver was then interrupted, and acked by writing to registers,
blocking IPS entry.

[How]

Since DC marks eDP connections as having invalid hpd sources (see
construct_phy()), DM should disable them at the hw level. Do so in
amdgpu_dm_hpd_init() by disabling all hpd ints first, then selectively
enabling ones for connectors that have valid hpd sources.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7b1ba19eb1)
Cc: stable@vger.kernel.org
2025-03-10 14:10:26 -04:00
Mario Limonciello
4afacc9948 drm/amd: Keep display off while going into S4
When userspace invokes S4 the flow is:

1) amdgpu_pmops_prepare()
2) amdgpu_pmops_freeze()
3) Create hibernation image
4) amdgpu_pmops_thaw()
5) Write out image to disk
6) Turn off system

Then on resume amdgpu_pmops_restore() is called.

This flow has a problem that because amdgpu_pmops_thaw() is called
it will call amdgpu_device_resume() which will resume all of the GPU.

This includes turning the display hardware back on and discovering
connectors again.

This is an unexpected experience for the display to turn back on.
Adjust the flow so that during the S4 sequence display hardware is
not turned back on.

Reported-by: Xaver Hugl <xaver.hugl@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2038
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250306185124.44780-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 68bfdc8dc0)
2025-03-10 14:09:09 -04:00
Aliaksei Urbanski
e204aab79e drm/amd/display: fix missing .is_two_pixels_per_container
Starting from 6.11, AMDGPU driver, while being loaded with amdgpu.dc=1,
due to lack of .is_two_pixels_per_container function in dce60_tg_funcs,
causes a NULL pointer dereference on PCs with old GPUs, such as R9 280X.

So this fix adds missing .is_two_pixels_per_container to dce60_tg_funcs.

Reported-by: Rosen Penev <rosenp@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3942
Fixes: e6a901a008 ("drm/amd/display: use even ODM slice width for two pixels per container")
Signed-off-by: Aliaksei Urbanski <aliaksei.urbanski@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bd4b125eb9)
Cc: stable@vger.kernel.org
2025-03-10 14:08:15 -04:00
David Rosca
df1e82e7ac drm/amdgpu/display: Allow DCC for video formats on GFX12
We advertise DCC as supported for NV12/P010 formats on GFX12,
but it would fail on this check on atomic commit.

Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ba795235a2)
Cc: stable@vger.kernel.org # 6.12.x
2025-03-10 14:07:35 -04:00
Alex Deucher
02438acd25 drm/amdgpu/vce2: fix ip block reference
Need to use the correct IP block type.  VCE vs VCN.
Fixes mclk issues on Hawaii.

Suggested by selendym.

Fixes: 82ae6619a4 ("drm/amdgpu: update the handle ptr in wait_for_idle")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3997
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:36:21 -04:00
Ethan Carter Edwards
315ce6c41a drm/amd/display: change kzalloc to kcalloc in dml1_validate()
We are trying to get rid of all multiplications from allocation
functions to prevent integer overflows. Here the multiplication is
probably safe, but using kcalloc() is more appropriate and improves
readability. This patch has no effect on runtime behavior.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:36:17 -04:00
Ethan Carter Edwards
b17a94f2fe drm/amd/display: change kzalloc to kcalloc in dcn314_validate_bandwidth()
We are trying to get rid of all multiplications from allocation
functions to prevent integer overflows. Here the multiplication is
probably safe, but using kcalloc() is more appropriate and improves
readability. This patch has no effect on runtime behavior.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:36:14 -04:00
Ethan Carter Edwards
934cb529e9 drm/amd/display: change kzalloc to kcalloc in dcn31_validate_bandwidth()
We are trying to get rid of all multiplications from allocation
functions to prevent integer overflows. Here the multiplication is
probably safe, but using kcalloc() is more appropriate and improves
readability. This patch has no effect on runtime behavior.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:36:08 -04:00
Ethan Carter Edwards
f4f086de31 drm/amd/display: change kzalloc to kcalloc in dcn30_validate_bandwidth()
We are trying to get rid of all multiplications from allocation
functions to prevent integer overflows. Here the multiplication is
probably safe, but using kcalloc() is more appropriate and improves
readability. This patch has no effect on runtime behavior.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:36:04 -04:00
Taimur Hassan
2f1b6b24b0 drm/amd/display: Promote DAL to 3.2.324
This version brings along following fixes:
- Fix some Replay/PSR issue
- Fix backlight brightness
- Fix suspend issue
- Fix visual confirm color
- Add scoped mutexes for amdgpu_dm_dhcp

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:36:00 -04:00
Charlene Liu
756e58e83e drm/amd/display: remove minimum Dispclk and apply oem panel timing.
[why & how]
1. apply oem panel timing (not only on OLED)
2. remove MIN_DPP_DISP_CLK request in driver.

This fix will apply for dcn31x but not
sync with DML's output.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:35:55 -04:00
Mario Limonciello
272385483e drm/amd/display: Drop unnecessary ret variable for enable_assr()
[Why]
enable_assr() has a res variable that only is changed in one block with
no cleanup necessary.

[How]
Remove variable and return early from failure cases.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:35:49 -04:00
Mario Limonciello
6b675ab8ef drm/amd/display: Add scoped mutexes for amdgpu_dm_dhcp
[Why]
Guards automatically release mutex when it goes out of scope making
code easier to follow.

[How]
Replace all use of mutex_lock()/mutex_unlock() with guard(mutex).

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:32:58 -04:00
Mario Limonciello
725a04ba5a drm/amd/display: Fix slab-use-after-free on hdcp_work
[Why]
A slab-use-after-free is reported when HDCP is destroyed but the
property_validate_dwork queue is still running.

[How]
Cancel the delayed work when destroying workqueue.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4006
Fixes: da3fd7ac0b ("drm/amd/display: Update CP property based on HW query")
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:30:51 -04:00
Ryan Seto
29c1c20496 drm/amd/display: Prevent VStartup Overflow
[Why]
For some VR headsets with large blanks, it's possible
to overflow the OTG_VSTARTUP_PARAM:VSTARTUP_START
register. This can lead to incorrect DML calculations
and underflow downstream.

[How]
Min the calcualted max_vstartup_lines with the max
value of the register.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Ryan Seto <ryanseto@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:30:45 -04:00
Zhongwei Zhang
34935701b7 drm/amd/display: Correct timing_adjust_pending flag setting.
[Why&How]
stream->adjust will be overwritten by update->crtc_timing_adjust.
We should set update->crtc_timing_adjust->timing_adjust_pending
and then overwrite stream->adjust.
Reset update->crtc_timing_adjust->timing_adjust_pending after
the assignment.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Zhongwei Zhang <Zhongwei.Zhang@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:30:40 -04:00
Zhikai Zhai
d3069feecd drm/amd/display: calculate the remain segments for all pipes
[WHY]
In some cases the remain de-tile buffer segments will be greater
than zero if we don't add the non-top pipe to calculate, at
this time the override de-tile buffer size will be valid and used.
But it makes the de-tile buffer segments used finally for all of pipes
exceed the maximum.

[HOW]
Add the non-top pipe to calculate the remain de-tile buffer segments.
Don't set override size to use the average according to pipe count
if the value exceed the maximum.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Zhikai Zhai <zhikai.zhai@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:30:31 -04:00
Leo Zeng
dd60bfd534 drm/amd/display: Fix visual confirm color not updating
[WHY]
Sometimes visual confirm color is updated, but the
background color is not changed. This causes visual
confrim to show incorrect colors.

[HOW]
Update background color when visual confirm color changes.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Leo Zeng <Leo.Zeng@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:30:25 -04:00
Alex Hung
274a87eb38 drm/amd/display: Assign normalized_pix_clk when color depth = 14
[WHY & HOW]
A warning message "WARNING: CPU: 4 PID: 459 at ... /dc_resource.c:3397
calculate_phy_pix_clks+0xef/0x100 [amdgpu]" occurs because the
display_color_depth == COLOR_DEPTH_141414 is not handled. This is
observed in Radeon RX 6600 XT.

It is fixed by assigning pix_clk * (14 * 3) / 24 - same as the rests.

Also fixes the indentation in get_norm_pix_clk.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:28:53 -04:00
Dillon Varone
15d1c2e6bf drm/amd/display: Add Support for reg inbox0 for host->DMUB CMDs
[WHY]
DCN4+ supports a new register based mailbox for sending messages
from host to DMCUB. This mailbox supports 64 byte commands, which makes
it compatible with the same structure as the frame buffer based mailbox.

[HOW]
The intention for reg_inbox0 is to be slot in replacement for the frame
buffer based mailbox (Inbox1). It supports all of the required features:
- Supports all messages handled by FB Inbox1
- Supports multi command batching

Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:28:47 -04:00
Charlene Liu
50bcdef7b6 drm/amd/display: assume VBIOS supports DSC as default
[Why & How]
The clear_dsc_setting at boot logic was based on dcn version
check.
As such new ASIC lost this DSC clear up logic, change the
assumption to BIOS support eDP DSC for new ASIC.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:28:41 -04:00
George Shen
084e073544 drm/amd/display: Implement PCON regulated autonomous mode handling
[Why/How]
DP spec has been updated recently to make regulated autonomous mode more
well-defined. In case any PCON vendors choose to implement regulated
autonomous mode in the future, pre-emptively add handling for the
regulated autonomous mode based on current spec.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:28:34 -04:00
Peichen Huang
8a21da2842 drm/amd/display: not abort link train when bw is low
[WHY]
DP tunneling should not abort link train even bandwidth become
too low after downgrade. Otherwise, it would fail compliance test.

[HOW}
Do link train with downgrade settings even bandwidth is not enough

Reviewed-by: Cruise Hung <cruise.hung@amd.com>
Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:28:27 -04:00
Danny Wang
bd00b29b5f drm/amd/display: Do not enable replay when vtotal update is pending.
[Why&How]
Vtotal is not applied to HW when handling vsync interrupt.
Make sure vtotal is aligned before enable replay.

Reviewed-by: Anthony Koo <anthony.koo@amd.com>
Reviewed-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Danny Wang <danny.wang@amd.com>
Signed-off-by: Zhongwei Zhang <Zhongwei.Zhang@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:28:17 -04:00
Mario Limonciello
50e0bae34f drm/amd/display: Add and use new dm_prepare_suspend() callback
[Why]
The displays currently don't get turned off until after other IP blocks
have been suspended.  However turning off the displays first gives a
very visible response that the system is on it's way down.

[How]
Turn off displays in a prepare_suspend() callback instead when possible.
This will help for suspend and hibernate sequences.
The shutdown sequence however will not call prepare() so check whether
the state has been already saved to decide what to do.

Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:28:04 -04:00
Mario Limonciello
5e19e2b57b drm/amd/display: Restore correct backlight brightness after a GPU reset
[Why]
GPU reset will attempt to restore cached state, but brightness doesn't
get restored. It will come back at 100% brightness, but userspace thinks
it's the previous value.

[How]
When running resume sequence if GPU is in reset restore brightness
to previous value.

Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:27:39 -04:00
Mario Limonciello
0747acf331 drm/amd/display: fix default brightness
[Why]
To avoid flickering during boot default brightness level set by BIOS
should be maintained for as much of the boot as feasible.
commit 2fe87f54ab ("drm/amd/display: Set default brightness according
to ACPI") attempted to set the right levels for AC vs DC, but brightness
still got reset to maximum level in initialization code for
setup_backlight_device().

[How]
Remove the hardcoded initialization in setup_backlight_device() and
instead program brightness value to match BIOS (AC or DC).  This avoids a
brightness flicker from kernel changing the value.  Userspace may however
still change it during boot.

Fixes: 2fe87f54ab ("drm/amd/display: Set default brightness according to ACPI")
Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:26:29 -04:00
Joshua Aberback
d93b92c976 drm/amd/display: Add more debug data to dmub_srv
[Why]
When analyzing some crash dumps, not all of the expected DMUB info was
available, so we want to add in-object storage for this data.

[How]
 - dmub_srv_debug (renamed to dmub_timeout_info) is already a member of
dmub_diagnostic_data, therefore keep a dmub_diagnostic_data directly in
dmub_srv
 - use dmub_srv->debug when collecting diagnostic info instead of stack
object to allow for easy inspection in crash dumps

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:26:23 -04:00
Leo Li
7b1ba19eb1 drm/amd/display: Disable unneeded hpd interrupts during dm_init
[Why]

It seems HPD interrupts are enabled by default for all connectors, even
if the hpd source isn't valid. An eDP for example, does not have a valid
hpd source (but does have a valid hpdrx source; see construct_phy()).
Thus, eDPs should have their hpd interrupt disabled.

In the past, this wasn't really an issue. Although the driver gets
interrupted, then acks by writing to hw registers, there weren't any
subscribed handlers that did anything meaningful (see
register_hpd_handlers()).

But things changed with the introduction of IPS. s2idle requires that
the driver allows IPS for DMUB fw to put hw to sleep. Since register
access requires hw to be awake, the driver will block IPS entry to do
so. And no IPS means no hw sleep during s2idle.

This was the observation on DCN35 systems with an eDP. During suspend,
the eDP toggled its hpd pin as part of the panel power down sequence.
The driver was then interrupted, and acked by writing to registers,
blocking IPS entry.

[How]

Since DC marks eDP connections as having invalid hpd sources (see
construct_phy()), DM should disable them at the hw level. Do so in
amdgpu_dm_hpd_init() by disabling all hpd ints first, then selectively
enabling ones for connectors that have valid hpd sources.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:23:31 -04:00
Leon Huang
0d9cabc8f5 drm/amd/display: Fix incorrect DPCD configs while Replay/PSR switch
[Why]
When switching between PSR/Replay,
the DPCD config of previous mode is not cleared,
resulting in unexpected behavior in TCON.

[How]
Initialize the DPCD in setup function

Reviewed-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Leon Huang <Leon.Huang1@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:23:25 -04:00
Harish Kasiviswanathan
ed962f8d06 drm/amdkfd: Add pm_config_dequeue_wait_counts API
Update pm_update_grace_period() to more cleaner
pm_config_dequeue_wait_counts(). Previously, grace_period variable was
overloaded as a variable and a macro, making it inflexible to configure
additional dequeue wait times.

pm_config_dequeue_wait_counts() now takes in a cmd / variable. This
allows flexibility to update different dequeue wait times.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:23:12 -04:00
Alex Deucher
2c01befe4a drm/amdgpu/vcn: fix idle work handler for VCN 2.5
VCN 2.5 uses the PG callback to enable VCN DPM which is
a global state.  As such, we need to make sure all instances
are in the same state.

v2: switch to a ref count (Lijo)
v3: switch to its own idle work handler
v4: fix logic in DPG handling

Fixes: 4ce4fe2720 ("drm/amdgpu/vcn: use per instance callbacks for idle work handler")
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:22:05 -04:00
Marek Olšák
3855f1d925 drm/amd/display: allow 256B DCC max compressed block sizes on gfx12
The hw supports it.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10 13:22:01 -04:00
Greg Kroah-Hartman
993a47bd7b Linux 6.14-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmfOKBUeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG1aQH/iC+Oyij4VxAjBek
 BOXIT/p6CwlIXb8ObiWWcRjDPizlcxb3RaV8J2RO+IqaQ2wltxpFANq2G7Re2FPm
 SNcEpIURAOVcxHGedcfFA91srO5F4FzNTO8LVp7MIbcgMYy3pdk+dbZmi6A691R+
 t9pb74m+MAnF1o/MUx7pUlhAT/4ymuuR0F7WCSg4h0Xwe5m0nlJY89kJBC7PCjyd
 n3mdhsz3rDSLmt/z/T7HGD89r8sYSvm9cOKtL3ELgGTrm7boQV8ii9Y9w04DI8PQ
 JmIernugcCxmhH36mVUAHgJf2+/T388xFUh/D5+skeUOUZpaJZG866rnb32WpsHc
 eWLFUeg=
 =Wypt
 -----END PGP SIGNATURE-----

Merge 6.14-rc6 into driver-core-next

We need the driver core fix in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-10 17:37:25 +01:00
Dave Airlie
236f475d29 amdgpu:
- Fix spelling typos
 - RAS updates
 - VCN 5.0.1 updates
 - SubVP fixes
 - DCN 4.0.1 fixes
 - MSO DPCD fixes
 - DIO encoder refactor
 - PCON fixes
 - Misc cleanups
 - DMCUB fixes
 - USB4 DP fixes
 - DM cleanups
 - Backlight cleanups and fixes
 - Support platform backlight curves
 - Misc code cleanups
 - SMU 14 fixes
 - JPEG 4.0.3 reset updates
 - SR-IOV fixes
 - SVM fixes
 - GC 12 DCC fixes
 - DC DCE 6.x fix
 - Hiberation fix
 
 amdkfd:
 - Fix possible NULL pointer in queue validation
 - Remove unnecessary CP domain validation
 - SDMA queue reset support
 - Add per process flags
 
 radeon:
 - Fix spelling typos
 - RS400 hyperZ fix
 
 UAPI:
 - Add KFD per process flags for setting precision
   Proposed user space: 2a64fa5e06
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZ8tfmgAKCRC93/aFa7yZ
 2CmkAP4xoy09aHecz6WOdNN9VBJiSMSZ3UOyBL9Ll2i5nI3YegEA7Co+AmQpXOPn
 HKeD6Vy1tt1XE/Liuur5dbLUKsVYSAI=
 =IQ45
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-next-6.15-2025-03-07' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amdgpu:
- Fix spelling typos
- RAS updates
- VCN 5.0.1 updates
- SubVP fixes
- DCN 4.0.1 fixes
- MSO DPCD fixes
- DIO encoder refactor
- PCON fixes
- Misc cleanups
- DMCUB fixes
- USB4 DP fixes
- DM cleanups
- Backlight cleanups and fixes
- Support platform backlight curves
- Misc code cleanups
- SMU 14 fixes
- JPEG 4.0.3 reset updates
- SR-IOV fixes
- SVM fixes
- GC 12 DCC fixes
- DC DCE 6.x fix
- Hiberation fix

amdkfd:
- Fix possible NULL pointer in queue validation
- Remove unnecessary CP domain validation
- SDMA queue reset support
- Add per process flags

radeon:
- Fix spelling typos
- RS400 hyperZ fix

UAPI:
- Add KFD per process flags for setting precision
  Proposed user space: 2a64fa5e06

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250307211051.1880472-1-alexander.deucher@amd.com
2025-03-10 09:04:52 +10:00
Harish Kasiviswanathan
cf6d949a40 drm/amdkfd: Add support for more per-process flag
Add support for more per-process flags starting with option to configure
MFMA precision for gfx 9.5

v2: Change flag name to KFD_PROC_FLAG_MFMA_HIGH_PRECISION
    Remove unused else condition
v3: Bump the KFD API version
v4: Missed SH_MEM_CONFIG__PRECISION_MODE__SHIFT define. Added it.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 15:33:49 -05:00
Harish Kasiviswanathan
61972cd93a drm/amdkfd: Set per-process flags only once for gfx9/10/11/12
Define set_cache_memory_policy() for these asics and move all static
changes from update_qpd() which is called each time a queue is created
to set_cache_memory_policy() which is called once during process
initialization

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 15:33:49 -05:00
Harish Kasiviswanathan
289e68503a drm/amdkfd: Set per-process flags only once cik/vi
Set per-process static sh_mem config only once during process
initialization. Move all static changes from update_qpd() which is
called each time a queue is created to set_cache_memory_policy() which
is called once during process initialization.

set_cache_memory_policy() is currently defined only for cik and vi
family. So this commit only focuses on these two. A separate commit will
address other asics.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 15:33:49 -05:00
Mario Limonciello
68bfdc8dc0 drm/amd: Keep display off while going into S4
When userspace invokes S4 the flow is:

1) amdgpu_pmops_prepare()
2) amdgpu_pmops_freeze()
3) Create hibernation image
4) amdgpu_pmops_thaw()
5) Write out image to disk
6) Turn off system

Then on resume amdgpu_pmops_restore() is called.

This flow has a problem that because amdgpu_pmops_thaw() is called
it will call amdgpu_device_resume() which will resume all of the GPU.

This includes turning the display hardware back on and discovering
connectors again.

This is an unexpected experience for the display to turn back on.
Adjust the flow so that during the S4 sequence display hardware is
not turned back on.

Reported-by: Xaver Hugl <xaver.hugl@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2038
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250306185124.44780-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 15:33:49 -05:00
Tom St Denis
0d1a686b54 drm/amd/amdgpu: Add missing GC 11.5.0 register
Adds register needed for debugging purposes.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 15:33:48 -05:00
Alex Sierra
59228c6631 drm/amdkfd: clear F8_MODE for gfx950
Default F8_MODE should be OCP format on gfx950.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 15:33:48 -05:00
Alexandre Demers
092da9fb25 drm/amdgpu: add defines for pin_offsets in DCE8
Define pin_offsets values in the same way it is done in DCE8

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:56:08 -05:00
Srinivasan Shanmugam
9c551ca3db drm/amdgpu: Fix annotation for dce_v6_0_line_buffer_adjust function
Updated description for the 'other_mode' parameter. This parameter is
used to determine the display mode of another display controller that
may be sharing the line buffer.

Cc: Ken Wang <Qingqing.Wang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:56:04 -05:00
Wentao Liang
1435e895d4 drm/amdgpu: handle amdgpu_cgs_create_device() errors in amd_powerplay_create()
Add error handling to propagate amdgpu_cgs_create_device() failures
to the caller. When amdgpu_cgs_create_device() fails, release hwmgr
and return -ENOMEM to prevent null pointer dereference.

[v1]->[v2]: Change error code from -EINVAL to -ENOMEM. Free hwmgr.

Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:55:56 -05:00
Aliaksei Urbanski
bd4b125eb9 drm/amd/display: fix missing .is_two_pixels_per_container
Starting from 6.11, AMDGPU driver, while being loaded with amdgpu.dc=1,
due to lack of .is_two_pixels_per_container function in dce60_tg_funcs,
causes a NULL pointer dereference on PCs with old GPUs, such as R9 280X.

So this fix adds missing .is_two_pixels_per_container to dce60_tg_funcs.

Reported-by: Rosen Penev <rosenp@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3942
Fixes: e6a901a008 ("drm/amd/display: use even ODM slice width for two pixels per container")
Signed-off-by: Aliaksei Urbanski <aliaksei.urbanski@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:54:51 -05:00
David Rosca
ba795235a2 drm/amdgpu/display: Allow DCC for video formats on GFX12
We advertise DCC as supported for NV12/P010 formats on GFX12,
but it would fail on this check on atomic commit.

Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:54:25 -05:00
Xiang Liu
148084bbb1 drm/amdgpu: Use unique CPER record id across devices
Encode socket id to CPER record id to be unique across devices.

v2: add pointer check for adev->smuio.funcs->get_socket_id
v2: set 0 if adev->smuio.funcs->get_socket_id is NULL

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:54:08 -05:00
Shiwu Zhang
216be476f1 drm/amdgpu: fix the gb_addr_config_fields init value mismatch
For gfx_v9_4_3 specifically, before regGB_ADDR_CONFIG is overwritten
in gfx hw_init it is read out to popluate the gb_addr_config_fields
in the sw_init stage, which causes mismatch.

Fix it by using the golden value in sw_init as well.

v2: This is a driver-set golden reg and keep as it is (Lijo)

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:53:58 -05:00
Shiwu Zhang
3bc7bc73af drm/amdgpu: retire ip init code specific for A0 rev
For aqua_vanjaram, A0 HW is retired so remove the code
specific for it in gfx ip init.

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:53:43 -05:00
Tao Zhou
334dc5fcc3 drm/amdgpu: increase RAS bad page threshold
For default policy, driver will issue an RMA event when the number of
bad pages is greater than 8 physical rows, rather than reaches 8
physical rows, don't rely on threshold configurable parameters in
default mode.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:53:39 -05:00
Emily Deng
fe2fa3be3d drm/amdgpu: Fix missing drain retry fault the last entry
While the entry get in svm_range_unmap_from_cpu is the last entry, and
the entry is page fault, it also need to be dropped. So for equal case,
it also need to be dropped.

v2:
Only modify the svm_range_restore_pages.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Xiaogang Chen<xiaogang.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:53:30 -05:00
Victor Lu
94b0908b85 drm/amdgpu: Do not set power brake sequence for Aldebaran SRIOV
Aldebaran SRIOV VF cannot access the power brake feature regs.
The accesses can be skipped to avoid a dmesg warning.

v2: Remove redundant asic type check

Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:53:24 -05:00
Jonathan Kim
14c8097ba4 drm/amdkfd: remove unused debug gws support status variable
Remove unused declaration of gws_debug_workaround.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Amber Lin <amber.lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:53:13 -05:00
Charles Han
571d36837c drm/amdgpu: fix inconsistent indenting warning
Fix below inconsistent indenting smatch warning.
smatch warnings:
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:582 amdgpu_sdma_reset_engine() warn: inconsistent indenting

Signed-off-by: Charles Han <hanchunchao@inspur.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:53:06 -05:00
Victor Lu
3646cc65e2 drm/amdgpu: Do not write to GRBM_CNTL if Aldebaran SRIOV
Aldebaran SRIOV VF does not have write permissions to GRBM_CTNL.
This access can be skipped to avoid a dmesg warning.

v2: Use GC IP version check instead of asic check

Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 12:52:29 -05:00
Kenneth Feng
da552bda98 drm/amd/pm: always allow ih interrupt from fw
always allow ih interrupt from fw on smu v14 based on
the interface requirement

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a3199eba46)
Cc: stable@vger.kernel.org # 6.12.x
2025-03-05 12:34:09 -05:00
Andrew Martin
fd617ea3b7 drm/amdkfd: Fix NULL Pointer Dereference in KFD queue
Through KFD IOCTL Fuzzing we encountered a NULL pointer derefrence
when calling kfd_queue_acquire_buffers.

Fixes: 629568d25f ("drm/amdkfd: Validate queue cwsr area and eop buffer size")
Signed-off-by: Andrew Martin <Andrew.Martin@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Andrew Martin <Andrew.Martin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 049e5bf3c8)
Cc: stable@vger.kernel.org
2025-03-05 11:47:00 -05:00
Ma Ke
374c9faac5 drm/amd/display: Fix null check for pipe_ctx->plane_state in resource_build_scaling_params
Null pointer dereference issue could occur when pipe_ctx->plane_state
is null. The fix adds a check to ensure 'pipe_ctx->plane_state' is not
null before accessing. This prevents a null pointer dereference.

Found by code review.

Fixes: 3be5262e35 ("drm/amd/display: Rename more dc_surface stuff to plane_state")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 63e6a77ccf)
Cc: stable@vger.kernel.org
2025-03-05 11:44:53 -05:00
Sathishkumar S
a29936bcd2 drm/amdgpu: Fix core reset sequence for JPEG5_0_1
For cores 1 through 9 repair the core reset sequence by
adjusting offsets to access the expected registers.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:47:36 -05:00
Jonathan Kim
ceb7114c96 drm/amdkfd: flag per-sdma queue reset supported to user space
Similar to compute queue reset, flag SDMA queue reset capabilities to
user space for safe testing.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:47:33 -05:00
Jonathan Kim
bac38ca8c4 drm/amdkfd: implement per queue sdma reset for gfx 9.4+
To reset hung SDMA queues on GFX 9.4+ for the GFX9 family, a soft reset
must be issued through SMU.  Since soft resets will reset an entire SDMA
engine, use a common KGD call to do the reset as the KGD will handle
avoiding a reset of in flight GFX and paging queues on that engine.

In addition, create a common call for all reset types to simplify
the handling of module parameter settings that block gpu resets.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:47:26 -05:00
Victor Lu
057fef20b8 drm/amdgpu: Do not program AGP BAR regs under SRIOV in gfxhub_v1_0.c
SRIOV VF does not have write access to AGP BAR regs.
Skip the writes to avoid a dmesg warning.

Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:47:21 -05:00
Sathishkumar S
20c34e5c4a drm/amdgpu: Fix core reset sequence for JPEG4_0_3
For cores 1 through 7 repair the core reset sequence by
adjusting offsets to access the expected registers.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:47:07 -05:00
Tony Yi
a91d91b600 drm/amdgpu: Add support for CPERs on virtualization
Add support for CPERs on VFs.

VFs do not receive PMFW messages directly; as such, they need to
query them from the host. To avoid hitting host event guard,
CPER queries need to be rate limited. CPER queries share the same
RAS telemetry buffer as error count query, so a mutex protecting
the shared buffer was added as well.

For readability, the amdgpu_detect_virtualization was refactored
into multiple individual functions.

Signed-off-by: Tony Yi <Tony.Yi@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:47:03 -05:00
James Zhu
ca17c8e149 drm/amdkfd: remove unnecessary cpu domain validation
before move to GTT domain.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:46:55 -05:00
Aurabindo Pillai
a89b530373 drm/amd/display: use drm_* instead of DRM_ in apply_edid_quirks()
drm_* macros are more helpful that DRM_* macros since the former
indicates the associated DRM device that prints the error, which maybe
helpful when debugging.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:46:51 -05:00
Aurabindo Pillai
41b8304760 drm/amd/display: Add workaround for a panel
Implement w/a for a panel which requires 10s delay after link detect.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:46:44 -05:00
Tony Yi
d4c60219ac drm/amdgpu: Update headers for CPER support on SRIOV
Update amdgv_sriovmsg.h and mxgpu_nv.h to add new definitions for
CPER support on VFs. PMFW ACA messages are not available on VFs,
and VFs must query CPERs from host.

Signed-off-by: Tony Yi <Tony.Yi@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:46:40 -05:00
Kenneth Feng
a3199eba46 drm/amd/pm: always allow ih interrupt from fw
always allow ih interrupt from fw on smu v14 based on
the interface requirement

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:46:37 -05:00
Lijo Lazar
6ef5ccaad7 drm/amdgpu: Reinit FW shared flags on VCN v5.0.1
After a full device reset, shared memory region will clear out and it's
not possible to reliably save the region in case of RAS errors.
Reinitialize the flags if required.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:46:33 -05:00
Lijo Lazar
6e09402098 drm/amdgpu: Use the right struct for VCN v5.0.1
VCN IP versions >= 5.0 uses VCN5 fw shared struct.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:46:26 -05:00
Andrew Martin
049e5bf3c8 drm/amdkfd: Fix NULL Pointer Dereference in KFD queue
Through KFD IOCTL Fuzzing we encountered a NULL pointer derefrence
when calling kfd_queue_acquire_buffers.

Fixes: 629568d25f ("drm/amdkfd: Validate queue cwsr area and eop buffer size")
Signed-off-by: Andrew Martin <Andrew.Martin@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Andrew Martin <Andrew.Martin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:45:35 -05:00
Alexandre Demers
ab23db6d08 drm/amdgpu: add dce_v6_0_soft_reset() to DCE6
DCE6 was missing soft reset, but it was easily identifiable under radeon.
This should be it, pretty much as it is done under DCE8 and DCE10.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:45:30 -05:00
Alexandre Demers
5f6021d52b drm/amdgpu: fix style in DCE6
Whitespace cleanups.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:45:27 -05:00
Alexandre Demers
029ab8cabd drm/amdgpu: add some comments in DCE6
Add some comments.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:45:23 -05:00
Asad Kamal
fb92daa33a drm/amd/pm: Fix indentation issue
Fix indentation issue for smu_v_13_0_12 get_gpu_metrics

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502272246.OISqUnC1-lkp@intel.com
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:45:04 -05:00
Asad Kamal
8df5f03be5 drm/amdgpu: Set PG state to gating for vcn_v_5_0_1
For vcn_v_5_0_1, set power state to gating during hw fini. Also there may
be scenario where VCN engine hangs during a job execution, then it's not
safe to assume that set_pg_state works fine during hw_fini to put the state
to gated. After a reset, we can assume that it's in the default state,
therefore reset the driver maintained state. Put the default state as gated
during reset as per this assumption.

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:44:56 -05:00
Dr. David Alan Gilbert
1092a4ea1b drm/amdgpu: Remove unused pqm_get_kernel_queue
pqm_get_kernel_queue() has been unused since 2022's
commit 5bdd3eb253 ("drm/amdkfd: Remove unused old debugger
implementation")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:44:51 -05:00
Dr. David Alan Gilbert
dcb5bb0624 drm/amdgpu: Remove unused print__rq_dlg_params_st
print__rq_dlg_params_st() was added in 2017 by
commit 061bfa06a4 ("drm/amdgpu/display: Add dml support for DCN")
but has remained unused.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:44:46 -05:00
Dr. David Alan Gilbert
f281a92abe drm/amdgpu: Remove unused pre_surface_trace
pre_surface_trace() has been unused since 2017's
commit 745cc746da ("drm/amd/display: remove
dc_pre_update_surfaces_to_stream from dc use")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:44:40 -05:00
Dr. David Alan Gilbert
7b111aaae0 drm/amdgpu: Remove powerdown_uvd member
With phm_powerdown_uvd() gone in the previous patch, there's
now no longer anything that reads the powerdown_uvd member of the
pp_hwmgr_func.

Remove it.

There are a few assignments to it; a boring NULL which can just go,
and two functions, but those functions are called explicitly anyway
so the assignments to the member go.

One of those (smu7_powerdown_uvd) wasn't static previously;
make it static.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:44:37 -05:00
Dr. David Alan Gilbert
51cd1bcfac drm/amdgpu: Remove phm_powerdown_uvd
phm_powerdown_uvd() has been unused since 2017's
commit 47047263c5 ("drm/amd/powerplay: delete eventmgr related files.")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:44:29 -05:00
Dr. David Alan Gilbert
8d00cfd5e6 drm/amdgpu: Remove ppatomfwctrl deadcode
pp_atomfwctrl_get_pp_assign_pin() and pp_atomfwctrl_get_pp_assign_pin()
were added in 2017 by
commit 0d2c7569e1 ("drm/amdgpu: add new atomfirmware based helpers for
powerplay")
but have remained unused.

Remove them, and the helper functions they used.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:43:11 -05:00
Mario Limonciello
36d63ce5db drm/amd/display: Add a new dcdebugmask to allow turning off brightness curve
Upgrading the kernel may cause some systems that were previously not using
a firmware specified brightness curve to use one.

In the event of problems with this curve (for example an interpolation
error) add a new dcdebugmask value that can be used to turn it off.  Also
add an info message to show that custom brightness curves are currently in
use.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250228185145.186319-6-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:57 -05:00
Mario Limonciello
578df37b1b drm/amd/display: Add support for custom brightness curve
Some systems specify in the firmware a brightness curve that better
reflects the characteristics of the panel used. This is done in the
form of data points and matching luminance percentage.

When converting a userspace requested brightness value use that curve
to convert to a firmware intended brightness value.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250228185145.186319-5-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:51 -05:00
Mario Limonciello
f25c0f0d4f drm/amd/display: Avoid operating on copies of backlight caps
Making a copy of the backlight caps structure between uses is unnecessary.
Refer to pointers to the same structure when using it.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250228185145.186319-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:46 -05:00
Mario Limonciello
f729e63743 drm/amd: Pass luminance data to amdgpu_dm_backlight_caps
The ATIF method on some systems will provide a backlight curve. Pass
this curve into amdgpu_dm add it to the structures.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250228185145.186319-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:42 -05:00
Mario Limonciello
1c79b5fcdf drm/amd: Copy entire structure in amdgpu_acpi_get_backlight_caps()
As new members are introduced to the structure copying the entire
structure will help avoid missing them.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250228185145.186319-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:37 -05:00
Taimur Hassan
43e88e20d3 drm/amd/display: Promote DAL to 3.2.323
This version brings along following fixes:
- Various cleanups to amdgpu dm
- Add DP tunneling IRQ handler
- Fix display corruption for dcn35
- Fix dmcub reset problem
- Adjust BW determination for PCON
- DIO encoder refactor
- Fix performance with SubVP under gaming

Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:34 -05:00
Mario Limonciello
130d8324ea drm/amd/display: Use drm_err() for handle_hpd_irq_helper()
drm_err() will show which device has the error.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:29 -05:00
Mario Limonciello
f123fda197 drm/amd/display: Use scoped guards for handle_hpd_irq_helper()
Scoped guards will release the mutex when they go out of scope.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:22 -05:00
Mario Limonciello
981a47429e drm/amd/display: Use _free() macro for amdgpu_dm_update_connector_after_detect()
By using a _free() macro multiple duplicated snippets of code to free
the sink can be dropped. The sink will be released when leaving scope.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:18 -05:00
Mario Limonciello
aca9ec9b05 drm/amd/display: Use scoped guard for amdgpu_dm_update_connector_after_detect()
A scoped guard will release the mutex when it goes out of scope.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:14 -05:00
Mario Limonciello
d13fbeb74b drm/amd/display: Use _free(kfree) for dm_gpureset_commit_state()
Using a _free(kfree) macro drops the need for a goto statement
as it will be freed when it goes out of scope.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:08 -05:00
Mario Limonciello
7b3e14acc1 drm/amd/display: Change amdgpu_dm_irq_resume_*() to void
amdgpu_dm_irq_resume_early() and amdgpu_dm_irq_resume_late() don't
have any error flows. Change the return type from integer to void.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:51 -05:00
Mario Limonciello
c2bd614bf8 drm/amd/display: Change amdgpu_dm_irq_resume_*() to use drm_dbg()
drm_dbg() is helpful to show which device had the debug statement.
Adjust to using this instead for debug messages.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:37 -05:00
Mario Limonciello
f24a74d59e drm/amd/display: Use scoped guard for dm_resume()
Scoped guards will release the mutex when they go out of scope.
Adjust the code to use these instead.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:34 -05:00
Mario Limonciello
180998bf30 drm/amd/display: Use drm_err() instead of DRM_ERROR in dm_resume()
drm_err() is helpful to show which device had the error. Adjust to
using this instead for error messages.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:29 -05:00
Mario Limonciello
e3bc320c4b drm/amd/display: Use _free() macro for amdgpu_dm_commit_zero_streams()
All cases except a failure to create a copy of the current context will
call dc_state_release() on the copied context.

Use a _free() macro to free the context and then adjust the error handling
flow to drop the unnecessary use of goto statements.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:25 -05:00
Mario Limonciello
3cf7a0bc87 drm/amd/display: Catch failures for amdgpu_dm_commit_zero_streams()
amdgpu_dm_commit_zero_streams() returns a DC error code that isn't
checked. Add an explicit check to this and fail dm_suspend() if it
is not DC_OK.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:21 -05:00
Mario Limonciello
65890cad2e drm/amd/display: Drop ret variable from dm_suspend()
The `ret` variable in dm_suspend() doesn't get set and is just used
to return 0.  Drop the needless declaration.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:17 -05:00
Mario Limonciello
20ea047768 drm/amd/display: Change amdgpu_dm_irq_suspend() to void
amdgpu_dm_irq_suspend() doesn't have any error flows and always
returns zero.

Change the function to void.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:12 -05:00
Cruise Hung
c286e8501a drm/amd/display: Add tunneling IRQ handler
USB4 DP BW Allocation uses DP_TUNNELING_IRQ to indicate the status update.
The DP_TUNNELING_IRQ is defined in LINK_SERVICE_IRQ_VECTOR_ESI0. When
receiving DP HPD IRQ in USB4, read the LINK_SERVICE_IRQ_VECTOR_ESI0.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Cruise Hung <Cruise.Hung@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:08 -05:00
Leo Zeng
5ad8eed172 drm/amd/display: Added visual confirm for DCC
[WHY]
We want to add a visual confirm mode for DCC and MCache for
debugging purpose.

[HOW]
color pipes based on whether DCC is enabled and what MCache id
is used.
black - DCC disabled
red - DCC enabled
grey - 2 different MCaches used
other colors - 1 MCache used

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Leo Zeng <Leo.Zeng@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:02 -05:00
Nicholas Kazlauskas
c707ea82c7 drm/amd/display: Ensure DMCUB idle before reset on DCN31/DCN35
[Why]
If we soft reset before halt finishes and there are outstanding
memory transactions then the memory interface may produce unexpected
results, such as out of order transactions when the firmware next runs.

These can manifest as random or unexpected load/store violations.

[How]
Increase the timeout before soft reset to ensure the DMCUB has quiesced.
This is effectively 1s maximum based on experimentation.

Use the enable bit check on DCN31 like we're doing on DCN35 and reorder
the reset writes to follow the HW programming guide.

Ensure we're reading SCRATCH7 instead of SCRATCH8 for the HALT code.
No current versions of DMCUB firmware use the SCRATCH8 boot bit to
dynamically switch where the HALT code goes to maintain backwards
compatibility with PSP.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:40:56 -05:00
Nicholas Kazlauskas
a2f72c0717 drm/amd/display: Revert "Increase halt timeout for DMCUB to 1s"
This reverts commit 50f040c53e ("drm/amd/display: Increase halt
timeout for DMCUB to 1s")

There's two issues here:
1. Each poll is closer to 10us than 1us so it stalls for 15s on PNP.
2. We're reading the wrong scratch register to check for the HALT code.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:38 -05:00
Alex Hung
02b2c97824 drm/amd/display: Check NULL connector before it is used
[Why & How]
amdgpu_dm_find_first_crtc_matching_connector can return NULL.
It is necessary to the returned connector before passing it
drm_atomic_get_new_connector_state which always assumes connector is
not NULL.

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:35 -05:00
George Shen
79fc4e856e drm/amd/display: Remove unused struct definition
[Why/How]
The struct is not and will not be used, as it is no longer relevant nor
supported.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:32 -05:00
George Shen
0584bbcf0c drm/amd/display: Skip checking FRL_MODE bit for PCON BW determination
[Why/How]
Certain PCON will clear the FRL_MODE bit despite supporting the link BW
indicated in the other bits.

Thus, skip checking the FRL_MODE bit when interpreting the
hdmi_encoded_link_bw struct.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:28 -05:00
Peichen Huang
54743ca151 drm/amd/display: misc for dio encoder refactor
[WHY]
These are left required changes for dio encoder refactor.

[HOW]
1. original logic is separated by config option
2. new link encoder dp enable/disable code for dcn35
3. process fec only for DP 8b10b encoding

Reviewed-by: Cruise Hung <cruise.hung@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:20 -05:00
Hansen Dsouza
fc215e83d0 drm/amd/display: read mso dpcd caps
[Why & How]
Read if panel support multi-sst links

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Hansen Dsouza <Hansen.Dsouza@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:16 -05:00
Dillon Varone
0dfcc2bf26 drm/amd/display: Fix DMUB reset sequence for DCN401
[WHY]
It should no longer use DMCUB_SOFT_RESET as it can result
in the memory request path becoming desynchronized.

[HOW]
To ensure robustness in the reset sequence:
1) Extend timeout on the "halt" command sent via gpint, and check for
controller to enter "wait" as a stronger guarantee that there are no
requests to memory still in flight.
2) Remove usage of DMCUB_SOFT_RESET
3) Rely on PSP to reset the controller safely

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:10 -05:00
Dillon Varone
a025f424af drm/amd/display: Fix p-state type when p-state is unsupported
[WHY&HOW]
P-state type would remain on previously used when unsupported which
causes confusion in logging and visual confirm, so set back to zero
when unsupported.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:05 -05:00
Aric Cyr
b74f46f3ce drm/amd/display: Request HW cursor on DCN3.2 with SubVP
[why]
When SubVP is active the HW cursor size is limited to 64x64, and
anything larger will force composition which is bad for gaming on
DCN3.2 if the game uses a larger cursor.

[how]
If HW cursor is requested, typically by a fullscreen game, do not
enable SubVP so that up to 256x256 cursor sizes are available for
DCN3.2.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Aric Cyr <Aric.Cyr@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:00 -05:00
Vitaliy Shevtsov
c3c584c18c drm/amd/display: fix type mismatch in CalculateDynamicMetadataParameters()
There is a type mismatch between what CalculateDynamicMetadataParameters()
takes and what is passed to it. Currently this function accepts several
args as signed long but it's called with unsigned integers and integer. On
some systems where long is 32 bits and one of these unsigned int params is
greater than INT_MAX it may cause passing input params as negative values.

Fix this by changing these argument types from long to unsigned int and to
int respectively. Also this will align the function's definition with
similar functions in other dcn* drivers.

Found by Linux Verification Center (linuxtesting.org) with Svace.

Fixes: 6725a88f88 ("drm/amd/display: Add DCN3 DML")
Signed-off-by: Vitaliy Shevtsov <v.shevtsov@mt-integration.ru>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:38:07 -05:00
Lijo Lazar
a734a717dc drm/amdgpu: Avoid HDP flush on JPEG v5.0.1
Similar to JPEG v4.0.3, HDP flush shouldn't be performed by JPEG engine.
Keep it empty.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:38:01 -05:00
Lijo Lazar
6fcfaac604 drm/amdgpu: Initialize RRMT status on JPEG v5.0.1
Initialize RRMT enablement status from register.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:56 -05:00
Jesse.zhang@amd.com
77bd621d14 drm/amdgpu: Update SDMA scheduler mask handling to include page queue
This patch updates the SDMA scheduler mask handling to include the page queue
if it exists. The scheduler mask is calculated based on the number of SDMA
instances and the presence of the page queue. The mask is updated to reflect
the state of both the SDMA gfx ring and the page queue.

Changes:
- Add handling for the SDMA page queue in `amdgpu_debugfs_sdma_sched_mask_set`.
- Update scheduler mask calculations to include the page queue.
- Modify `amdgpu_debugfs_sdma_sched_mask_get` to return the correct mask value.

This change is necessary to verify multiple queues (SDMA gfx queue + page queue)
and ensure proper scheduling and state management for SDMA instances.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:42 -05:00
Lijo Lazar
0b9647d40e drm/amdgpu: Add offset normalization in VCN v5.0.1
VCN v5.0.1 also will need register offset normalization. Reuse the logic
from VCN v4.0.3. Also, avoid HDP flush similar to VCN v4.0.3

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:35 -05:00
Lijo Lazar
b5a3fc54e8 drm/amdgpu: Initialize RRMT status on VCN v5.0.1
Initialize RRMT status from register.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:31 -05:00
Xiang Liu
677ae51f49 drm/amdgpu: Free CPER entry after committing to ring
Free CPER entry when it's committed to CPER ring to avoid memory leak.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:27 -05:00
Alexandre Demers
899634a57a drm/amdgpu: fix spelling typos in SI
Fix typos

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:23 -05:00
Alexandre Demers
ce43abd7ec drm/amdgpu: fix spelling typos
Found some typos while exploring amdgpu code.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:13 -05:00
Dave Airlie
e21cba7047 drm-misc-next for v6.15:
Cross-subsystem Changes:
 
 bus:
 - mhi: Avoid access to uninitialized field
 
 Core Changes:
 
 - Fix docmentation
 
 dp:
 - Add helpers for LTTPR transparent mode
 
 sched:
 - Improve job peek/pop operations
 - Optimize layout of struct drm_sched_job
 
 Driver Changes:
 
 arc:
 - Convert to devm_platform_ioremap_resource()
 
 aspeed:
 - Convert to devm_platform_ioremap_resource()
 
 bridge:
 - ti-sn65dsi86: Support CONFIG_PWM tristate
 
 i915:
 - dp: Use helpers for LTTPR transparent mode
 
 mediatek:
 - Convert to devm_platform_ioremap_resource()
 
 msm:
 - dp: Use helpers for LTTPR transparent mode
 
 nouveau:
 - dp: Use helpers for LTTPR transparent mode
 
 panel:
 - raydium-rm67200: Add driver for Raydium RM67200
 - simple: Add support for BOE AV123Z7M-N17, BOE AV123Z7M-N17
 - sony-td4353-jdi: Use MIPI-DSI multi-func interface
 - summit: Add driver for Apple Summit display panel
 - visionox-rm692e5: Add driver for Visionox RM692E5
 
 repaper:
 - Fix integer overflows
 
 stm:
 - Convert to devm_platform_ioremap_resource()
 
 vc4:
 - Convert to devm_platform_ioremap_resource()
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmfAMnwACgkQaA3BHVML
 eiOszgf+NCaYdCCKXurnKy2DVHZ61ar+Fg/FHYs1oThFVqknRT9CBGY93g3I7VGs
 gYUATrH2fYt4daIF94ap4YLYuMrZSKXSd0duzr8lQzCVAb/UgvUap9yjGwJsCOZA
 bE2jxlKvo6cUJbq19+ATgrAQ7y31/w30PCPyBjMGR0t4C3bBbbCHpxFyKzDWCtFT
 JS6pNbCPblZCCvtFPmQRqQqLSE0K5x4SE49ZpHM6hhA4MavlwoN4ZkWincliSzF2
 XVCo7QHpVy3SItshhOB7ch7pIdkbm6nUvW2poR3UdXe+B+OSKRIz0C+2E98bYAra
 vwkyWoUP2A9nKZBsNy2d5026DeUDwg==
 =KURF
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2025-02-27' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v6.15:

Cross-subsystem Changes:

bus:
- mhi: Avoid access to uninitialized field

Core Changes:

- Fix docmentation

dp:
- Add helpers for LTTPR transparent mode

sched:
- Improve job peek/pop operations
- Optimize layout of struct drm_sched_job

Driver Changes:

arc:
- Convert to devm_platform_ioremap_resource()

aspeed:
- Convert to devm_platform_ioremap_resource()

bridge:
- ti-sn65dsi86: Support CONFIG_PWM tristate

i915:
- dp: Use helpers for LTTPR transparent mode

mediatek:
- Convert to devm_platform_ioremap_resource()

msm:
- dp: Use helpers for LTTPR transparent mode

nouveau:
- dp: Use helpers for LTTPR transparent mode

panel:
- raydium-rm67200: Add driver for Raydium RM67200
- simple: Add support for BOE AV123Z7M-N17, BOE AV123Z7M-N17
- sony-td4353-jdi: Use MIPI-DSI multi-func interface
- summit: Add driver for Apple Summit display panel
- visionox-rm692e5: Add driver for Visionox RM692E5

repaper:
- Fix integer overflows

stm:
- Convert to devm_platform_ioremap_resource()

vc4:
- Convert to devm_platform_ioremap_resource()

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20250227094041.GA114623@linux.fritz.box
2025-02-28 12:36:01 +10:00
Srinivasan Shanmugam
7d83c129a8 drm/amdgpu: Fix parameter annotation in vcn_v5_0_0_is_idle
Update parameter description in the vcn_v5_0_0_is_idle function

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1231: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1231: warning: Excess function parameter 'handle' description in 'vcn_v5_0_0_is_idle'

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:05 -05:00
Philip Yang
fe9d0061c4 drm/amdkfd: debugfs hang_hws skip GPU with MES
debugfs hang_hws is used by GPU reset test with HWS, for MES this crash
the kernel with NULL pointer access because dqm->packet_mgr is not setup
for MES path.

Skip GPU with MES for now, MES hang_hws debugfs interface will be
supported later.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:05 -05:00
Philip Yang
7919b4cad5 drm/amdkfd: Fix pqm_destroy_queue race with GPU reset
If GPU in reset, destroy_queue return -EIO, pqm_destroy_queue should
delete the queue from process_queue_list and free the resource.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:05 -05:00
Srinivasan Shanmugam
0f3fda3117 drm/amdgpu: Fix parameter annotations for VCN clock gating functions
The previous references to a non-existent `adev` parameter have been
removed & corrected to reflect the use of the `vinst` pointer, which
points to the VCN instance structure, in the below files:

- vcn_v1_0.c
- vcn_v2_0.c
- vcn_v3_0.c

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c:624: warning: Function parameter or struct member 'vinst' not described in 'vcn_v1_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c:624: warning: Excess function parameter 'adev' description in 'vcn_v1_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:376: warning: Function parameter or struct member 'vinst' not described in 'vcn_v2_0_mc_resume'
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:376: warning: Excess function parameter 'adev' description in 'vcn_v2_0_mc_resume'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Function parameter or struct member 'vinst' not described in 'vcn_v3_0_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Excess function parameter 'adev' description in 'vcn_v3_0_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Excess function parameter 'inst' description in 'vcn_v3_0_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Function parameter or struct member 'vinst' not described in 'vcn_v3_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Excess function parameter 'adev' description in 'vcn_v3_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Excess function parameter 'inst' description in 'vcn_v3_0_enable_clock_gating'

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:05 -05:00
Philip Yang
f0b4440cdc drm/amdkfd: Fix mode1 reset crash issue
If HW scheduler hangs and mode1 reset is used to recover GPU, KFD signal
user space to abort the processes. After process abort exit, user queues
still use the GPU to access system memory before h/w is reset while KFD
cleanup worker free system memory and free VRAM.

There is use-after-free race bug that KFD allocate and reuse the freed
system memory, and user queue write to the same system memory to corrupt
the data structure and cause driver crash.

To fix this race, KFD cleanup worker terminate user queues, then flush
reset_domain wq to wait for any GPU ongoing reset complete, and then
free outstanding BOs.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Philip Yang
1b9366c601 drm/amdkfd: KFD release_work possible circular locking
If waiting for gpu reset done in KFD release_work, thers is WARNING:
possible circular locking dependency detected

  #2  kfd_create_process
        kfd_process_mutex
          flush kfd release work

  #1  kfd release work
        wait for amdgpu reset work

  #0  amdgpu_device_gpu_reset
        kgd2kfd_pre_reset
          kfd_process_mutex

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock((work_completion)(&p->release_work));
                  lock((wq_completion)kfd_process_wq);
                  lock((work_completion)(&p->release_work));
   lock((wq_completion)amdgpu-reset-dev);

To fix this, KFD create process move flush release work outside
kfd_process_mutex.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Philip Yang
ee3ed10066 drm/amdkfd: Remove kfd_process_hw_exception worker
With GPU reset-domain worker implemented, KFD hw_exception worker is not
needed any more, just call amdgpu_amdkfd_gpu_reset directly from
kfd_hws_hang.

Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Asad Kamal
f9234217d0 drm/amd/amdgpu: Add support for xgmi_v6_4_1
Add support for xgmi_v6_4_1 and use it appropriate places

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Lijo Lazar
485993e2f1 drm/amdgpu: Add xgmi speed/width related info
Add APIs to initialize XGMI speed, width details and get to max
bandwidth supported. It is assumed that a device only supports same
generation of XGMI links with uniform width.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Lijo Lazar
6f16d101da drm/amdgpu: Move xgmi definitions to xgmi header
Move definitions related to xgmi to amdgpu_xgmi header

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Kenneth Feng
0107c595c5 drm/amd/pm: add fan abnormal detection
add fan abnormal detection on smu v14.0.2&smu v14.0.3

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Xiaogang Chen
509d662a57 drm/amdkfd: remove kfd_pasid.c from amdgpu driver build
Since kfd uses pasid values from graphic driver now do not need use kfd pasid
fucntions.

Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
David Yat Sin
e90711946b drm/amdkfd: clamp queue size to minimum
If queue size is less than minimum, clamp it to minimum to prevent
underflow when writing queue mqd.

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
André Almeida
9c696cc57c drm/amdgpu: Create a debug option to disable ring reset
Prior to the addition of ring reset, the debug option
`debug_disable_soft_recovery` could be used to force a full device
reset. Now that we have ring reset, create a debug option to disable
them in amdgpu, forcing the driver to go with the full device
reset path again when both options are combined.

This option is useful for testing and debugging purposes when one wants
to test the full reset from userspace.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Ma Ke
63e6a77ccf drm/amd/display: Fix null check for pipe_ctx->plane_state in resource_build_scaling_params
Null pointer dereference issue could occur when pipe_ctx->plane_state
is null. The fix adds a check to ensure 'pipe_ctx->plane_state' is not
null before accessing. This prevents a null pointer dereference.

Found by code review.

Fixes: 3be5262e35 ("drm/amd/display: Rename more dc_surface stuff to plane_state")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Alex Deucher
1d72fc2e9e drm/amdgpu/mes11: drop amdgpu_mes_suspend()/amdgpu_mes_resume() calls
They are noops on GFX11 for most firmware versions. KFD already
handles its own queues and they should already be unmapped at this
point so even if this runs, it's not doing anything.

Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:03 -05:00
Colin Ian King
eaa3feb16d drm/amdgpu: Fix spelling mistake "initiailize" -> "initialize" and grammar
There is a spelling mistake and a grammatical error in a dev_err
message. Fix it.

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:03 -05:00
Xiang Liu
00f85667fa drm/amdgpu: Decode deferred error type in aca bank parser
In the case of poison inband log, the error type need to be specified
by checking the deferred or poison bit of status register.

v2: check both deferred and poison bit

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:03 -05:00
Le Ma
5b5f01eff7 drm/amdgpu: add sdma page queue irq processing for sdma442
Add the trap irq processing for page queue of sdma442

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by and Tested-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:03 -05:00
Kenneth Feng
7d37bcab97 drm/amd/pm: disable gfxoff on the specific sku
disable gfxoff on the specific sku based on the requirement

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:03 -05:00
Xiang Liu
d4bd7a50ca drm/amdgpu: Report generic instead of unknown boot time errors
Change the DMESG reporting of unknown errors to "Boot Controller
Generic Error" to align with the RAS SPEC and provide more clarity
to customers.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:03 -05:00
Lijo Lazar
b965e42530 drm/amdgpu: Fix logic to fetch supported NPS modes
Correct the logic to find supported NPS modes from firmware.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reported-by: Ava Zhang <niandong.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Fixes: 30eb41f5d1 ("drm/amdgpu: Use firmware supported NPS modes")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:03 -05:00
Xiang Liu
906d2859e1 drm/amdgpu: Disable fru_id field in CPER section
The fru_id field is disabled cause of mis-matching defination
between CPER spec and driver.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:03 -05:00
Srinivasan Shanmugam
fddc450263 drm/amdkfd: Fix Circular Locking Dependency in 'svm_range_cpu_invalidate_pagetables'
This commit addresses a circular locking dependency in the
svm_range_cpu_invalidate_pagetables function. The function previously
held a lock while determining whether to perform an unmap or eviction
operation, which could lead to deadlocks.

Fixes the below:

[  223.418794] ======================================================
[  223.418820] WARNING: possible circular locking dependency detected
[  223.418845] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G     U     OE
[  223.418869] ------------------------------------------------------
[  223.418889] kfdtest/3939 is trying to acquire lock:
[  223.418906] ffff8957552eae38 (&dqm->lock_hidden){+.+.}-{3:3}, at: evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[  223.419302]
               but task is already holding lock:
[  223.419303] ffff8957556b83b0 (&prange->lock){+.+.}-{3:3}, at: svm_range_cpu_invalidate_pagetables+0x9d/0x850 [amdgpu]
[  223.419447] Console: switching to colour dummy device 80x25
[  223.419477] [IGT] amd_basic: executing
[  223.419599]
               which lock already depends on the new lock.

[  223.419611]
               the existing dependency chain (in reverse order) is:
[  223.419621]
               -> #2 (&prange->lock){+.+.}-{3:3}:
[  223.419636]        __mutex_lock+0x85/0xe20
[  223.419647]        mutex_lock_nested+0x1b/0x30
[  223.419656]        svm_range_validate_and_map+0x2f1/0x15b0 [amdgpu]
[  223.419954]        svm_range_set_attr+0xe8c/0x1710 [amdgpu]
[  223.420236]        svm_ioctl+0x46/0x50 [amdgpu]
[  223.420503]        kfd_ioctl_svm+0x50/0x90 [amdgpu]
[  223.420763]        kfd_ioctl+0x409/0x6d0 [amdgpu]
[  223.421024]        __x64_sys_ioctl+0x95/0xd0
[  223.421036]        x64_sys_call+0x1205/0x20d0
[  223.421047]        do_syscall_64+0x87/0x140
[  223.421056]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  223.421068]
               -> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
[  223.421084]        __ww_mutex_lock.constprop.0+0xab/0x1560
[  223.421095]        ww_mutex_lock+0x2b/0x90
[  223.421103]        amdgpu_amdkfd_alloc_gtt_mem+0xcc/0x2b0 [amdgpu]
[  223.421361]        add_queue_mes+0x3bc/0x440 [amdgpu]
[  223.421623]        unhalt_cpsch+0x1ae/0x240 [amdgpu]
[  223.421888]        kgd2kfd_start_sched+0x5e/0xd0 [amdgpu]
[  223.422148]        amdgpu_amdkfd_start_sched+0x3d/0x50 [amdgpu]
[  223.422414]        amdgpu_gfx_enforce_isolation_handler+0x132/0x270 [amdgpu]
[  223.422662]        process_one_work+0x21e/0x680
[  223.422673]        worker_thread+0x190/0x330
[  223.422682]        kthread+0xe7/0x120
[  223.422690]        ret_from_fork+0x3c/0x60
[  223.422699]        ret_from_fork_asm+0x1a/0x30
[  223.422708]
               -> #0 (&dqm->lock_hidden){+.+.}-{3:3}:
[  223.422723]        __lock_acquire+0x16f4/0x2810
[  223.422734]        lock_acquire+0xd1/0x300
[  223.422742]        __mutex_lock+0x85/0xe20
[  223.422751]        mutex_lock_nested+0x1b/0x30
[  223.422760]        evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[  223.423025]        kfd_process_evict_queues+0x8a/0x1d0 [amdgpu]
[  223.423285]        kgd2kfd_quiesce_mm+0x43/0x90 [amdgpu]
[  223.423540]        svm_range_cpu_invalidate_pagetables+0x4a7/0x850 [amdgpu]
[  223.423807]        __mmu_notifier_invalidate_range_start+0x1f5/0x250
[  223.423819]        copy_page_range+0x1e94/0x1ea0
[  223.423829]        copy_process+0x172f/0x2ad0
[  223.423839]        kernel_clone+0x9c/0x3f0
[  223.423847]        __do_sys_clone+0x66/0x90
[  223.423856]        __x64_sys_clone+0x25/0x30
[  223.423864]        x64_sys_call+0x1d7c/0x20d0
[  223.423872]        do_syscall_64+0x87/0x140
[  223.423880]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  223.423891]
               other info that might help us debug this:

[  223.423903] Chain exists of:
                 &dqm->lock_hidden --> reservation_ww_class_mutex --> &prange->lock

[  223.423926]  Possible unsafe locking scenario:

[  223.423935]        CPU0                    CPU1
[  223.423942]        ----                    ----
[  223.423949]   lock(&prange->lock);
[  223.423958]                                lock(reservation_ww_class_mutex);
[  223.423970]                                lock(&prange->lock);
[  223.423981]   lock(&dqm->lock_hidden);
[  223.423990]
                *** DEADLOCK ***

[  223.423999] 5 locks held by kfdtest/3939:
[  223.424006]  #0: ffffffffb82b4fc0 (dup_mmap_sem){.+.+}-{0:0}, at: copy_process+0x1387/0x2ad0
[  223.424026]  #1: ffff89575eda81b0 (&mm->mmap_lock){++++}-{3:3}, at: copy_process+0x13a8/0x2ad0
[  223.424046]  #2: ffff89575edaf3b0 (&mm->mmap_lock/1){+.+.}-{3:3}, at: copy_process+0x13e4/0x2ad0
[  223.424066]  #3: ffffffffb82e76e0 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}, at: copy_page_range+0x1cea/0x1ea0
[  223.424088]  #4: ffff8957556b83b0 (&prange->lock){+.+.}-{3:3}, at: svm_range_cpu_invalidate_pagetables+0x9d/0x850 [amdgpu]
[  223.424365]
               stack backtrace:
[  223.424374] CPU: 0 UID: 0 PID: 3939 Comm: kfdtest Tainted: G     U     OE      6.12.0-amdstaging-drm-next-lol-050225 #14
[  223.424392] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  223.424401] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS PRO WIFI/X570 AORUS PRO WIFI, BIOS F36a 02/16/2022
[  223.424416] Call Trace:
[  223.424423]  <TASK>
[  223.424430]  dump_stack_lvl+0x9b/0xf0
[  223.424441]  dump_stack+0x10/0x20
[  223.424449]  print_circular_bug+0x275/0x350
[  223.424460]  check_noncircular+0x157/0x170
[  223.424469]  ? __bfs+0xfd/0x2c0
[  223.424481]  __lock_acquire+0x16f4/0x2810
[  223.424490]  ? srso_return_thunk+0x5/0x5f
[  223.424505]  lock_acquire+0xd1/0x300
[  223.424514]  ? evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[  223.424783]  __mutex_lock+0x85/0xe20
[  223.424792]  ? evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[  223.425058]  ? srso_return_thunk+0x5/0x5f
[  223.425067]  ? mark_held_locks+0x54/0x90
[  223.425076]  ? evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[  223.425339]  ? srso_return_thunk+0x5/0x5f
[  223.425350]  mutex_lock_nested+0x1b/0x30
[  223.425358]  ? mutex_lock_nested+0x1b/0x30
[  223.425367]  evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[  223.425631]  kfd_process_evict_queues+0x8a/0x1d0 [amdgpu]
[  223.425893]  kgd2kfd_quiesce_mm+0x43/0x90 [amdgpu]
[  223.426156]  svm_range_cpu_invalidate_pagetables+0x4a7/0x850 [amdgpu]
[  223.426423]  ? srso_return_thunk+0x5/0x5f
[  223.426436]  __mmu_notifier_invalidate_range_start+0x1f5/0x250
[  223.426450]  copy_page_range+0x1e94/0x1ea0
[  223.426461]  ? srso_return_thunk+0x5/0x5f
[  223.426474]  ? srso_return_thunk+0x5/0x5f
[  223.426484]  ? lock_acquire+0xd1/0x300
[  223.426494]  ? copy_process+0x1718/0x2ad0
[  223.426502]  ? srso_return_thunk+0x5/0x5f
[  223.426510]  ? sched_clock_noinstr+0x9/0x10
[  223.426519]  ? local_clock_noinstr+0xe/0xc0
[  223.426528]  ? copy_process+0x1718/0x2ad0
[  223.426537]  ? srso_return_thunk+0x5/0x5f
[  223.426550]  copy_process+0x172f/0x2ad0
[  223.426569]  kernel_clone+0x9c/0x3f0
[  223.426577]  ? __schedule+0x4c9/0x1b00
[  223.426586]  ? srso_return_thunk+0x5/0x5f
[  223.426594]  ? sched_clock_noinstr+0x9/0x10
[  223.426602]  ? srso_return_thunk+0x5/0x5f
[  223.426610]  ? local_clock_noinstr+0xe/0xc0
[  223.426619]  ? schedule+0x107/0x1a0
[  223.426629]  __do_sys_clone+0x66/0x90
[  223.426643]  __x64_sys_clone+0x25/0x30
[  223.426652]  x64_sys_call+0x1d7c/0x20d0
[  223.426661]  do_syscall_64+0x87/0x140
[  223.426671]  ? srso_return_thunk+0x5/0x5f
[  223.426679]  ? common_nsleep+0x44/0x50
[  223.426690]  ? srso_return_thunk+0x5/0x5f
[  223.426698]  ? trace_hardirqs_off+0x52/0xd0
[  223.426709]  ? srso_return_thunk+0x5/0x5f
[  223.426717]  ? syscall_exit_to_user_mode+0xcc/0x200
[  223.426727]  ? srso_return_thunk+0x5/0x5f
[  223.426736]  ? do_syscall_64+0x93/0x140
[  223.426748]  ? srso_return_thunk+0x5/0x5f
[  223.426756]  ? up_write+0x1c/0x1e0
[  223.426765]  ? srso_return_thunk+0x5/0x5f
[  223.426775]  ? srso_return_thunk+0x5/0x5f
[  223.426783]  ? trace_hardirqs_off+0x52/0xd0
[  223.426792]  ? srso_return_thunk+0x5/0x5f
[  223.426800]  ? syscall_exit_to_user_mode+0xcc/0x200
[  223.426810]  ? srso_return_thunk+0x5/0x5f
[  223.426818]  ? do_syscall_64+0x93/0x140
[  223.426826]  ? syscall_exit_to_user_mode+0xcc/0x200
[  223.426836]  ? srso_return_thunk+0x5/0x5f
[  223.426844]  ? do_syscall_64+0x93/0x140
[  223.426853]  ? srso_return_thunk+0x5/0x5f
[  223.426861]  ? irqentry_exit+0x6b/0x90
[  223.426869]  ? srso_return_thunk+0x5/0x5f
[  223.426877]  ? exc_page_fault+0xa7/0x2c0
[  223.426888]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  223.426898] RIP: 0033:0x7f46758eab57
[  223.426906] Code: ba 04 00 f3 0f 1e fa 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 41 89 c0 85 c0 75 2c 64 48 8b 04 25 10 00
[  223.426930] RSP: 002b:00007fff5c3e5188 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[  223.426943] RAX: ffffffffffffffda RBX: 00007f4675f8c040 RCX: 00007f46758eab57
[  223.426954] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[  223.426965] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  223.426975] R10: 00007f4675e81a50 R11: 0000000000000246 R12: 0000000000000001
[  223.426986] R13: 00007fff5c3e5470 R14: 00007fff5c3e53e0 R15: 00007fff5c3e5410
[  223.427004]  </TASK>

v2: To resolve this issue, the allocation of the process context buffer
(`proc_ctx_bo`) has been moved from the `add_queue_mes` function to the
`pqm_create_queue` function. This change ensures that the buffer is
allocated only when the first queue for a process is created and only if
the Micro Engine Scheduler (MES) is enabled. (Felix)

v3: Fix typo s/Memory Execution Scheduler (MES)/Micro Engine Scheduler
in commit message. (Lijo)

Fixes: 438b39ac74 ("drm/amdkfd: pause autosuspend when creating pdd")
Cc: Jesse Zhang <jesse.zhang@amd.com>
Cc: Yunxiang Li <Yunxiang.Li@amd.com>
Cc: Philip Yang <Philip.Yang@amd.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:49:32 -05:00
Benjamin Chan
dce1b82398 drm/amdgpu: Add amdisp pinctrl MFD resource
AMDISP GPIO control uses a dedicated pinctrl driver,
and requires MFD hotadd GPIO resources.

Co-developed-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Benjamin Chan <benjamin.chan@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:49 -05:00
Alex Deucher
4343f814e5 drm/amdgpu/mes12: drop amdgpu_mes_suspend()/amdgpu_mes_resume() calls
They are noops on GFX12.  There is no suspend/resume all support
in firmware so the function doesn't do anything.  KFD already
handles its own queues and they should already be unmapped at this
point so even if this runs, it's not doing anything.

Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:43 -05:00
Dr. David Alan Gilbert
82c13da746 drm/amd/display: Remove unused optc3_fpu_set_vrr_m_const
The last use of optc3_fpu_set_vrr_m_const() was removed in 2022's
commit 64f991590f ("drm/amd/display: Fix a compilation failure on PowerPC
caused by FPU code")
which removed the only caller (with a similar) name.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:40 -05:00
Pratap Nirujogi
a67e75beff drm/amdgpu: Replace DRM_ERROR() with drm_err()
DRM_ERROR() is no longer preferred. Replace DRM_ERROR() usage
with drm_err() in isp driver.

Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:36 -05:00
Luan Arcanjo
b5838d1517 drm/amd/display/dc: Refactor remove duplications
All dce command_table_helper's shares a copy-pasted collection
of copy-pasted functions, which are: phy_id_to_atom,
clock_source_id_to_atom_phy_clk_src_id, and engine_bp_to_atom.

This patch removes the multiple copy-pasted by moving them to
the command_table_helper.c and make the command_table_helper's
calls the functions implemented by the command_table_helper.c
instead.

The changes were not tested on actual hardware. I am only able
to verify that the changes keep the code compileable and do my
best to to look repeatedly if I am not actually changing any code.

This is the version 4 of the PATCH, fixed comments about
licence in the new files and the matches From email to
Signed-off-by email. Fixed comments about using
command_table_helper instead of creating a dce_common

Signed-off-by: Luan Icaro Pinto Arcanjo <luanicaro@usp.br>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:32 -05:00
Alex Deucher
4d1b653571 drm/amdgpu/vcn: use dev_info() for firmware information
To properly handle multiple GPUs.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:32 -05:00
Alex Deucher
c51aa7923e drm/amdgpu/vcn: optimize firmware storage
If each instance uses the same fw image, only store one
copy in the driver.

Acked-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:32 -05:00
Alex Deucher
31a37dfc8f drm/amdgpu/vcn5.0.1: use generic set_power_gating_state helper
No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:32 -05:00
Alex Deucher
9b648fa54c drm/amdgpu/vcn5.0.0: use generic set_power_gating_state helper
No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:32 -05:00
Alex Deucher
4bb5879322 drm/amdgpu/vcn4.0.5: use generic set_power_gating_state helper
No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
1ee6b2bff2 drm/amdgpu/vcn4.0.3: use generic set_power_gating_state helper
No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
8bdfa5756b drm/amdgpu/vcn4.0: use generic set_power_gating_state helper
No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
38c0d9882a drm/amdgpu/vcn3.0: use generic set_power_gating_state helper
No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
bd32af6faa drm/amdgpu/vcn2.5: use generic set_power_gating_state helper
No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
3389dd059f drm/amdgpu/vcn2.0: use generic set_power_gating_state helper
No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
cac3dc89f2 drm/amdgpu/vcn1.0: use generic set_power_gating_state helper
No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
a2cf2a883c drm/amdgpu/vcn: add a generic helper for set_power_gating_state
It's common for all VCN variants.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
4ce4fe2720 drm/amdgpu/vcn: use per instance callbacks for idle work handler
Use the vcn instance power gating callbacks rather than
the IP powergating callback.  This limits power gating to
only the instance in use rather than all of the instances.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
592846e3fe drm/amdgpu/vcn5.0.1: add set_pg_state callback
Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
f2eb0a66ca drm/amdgpu/vcn5.0.0: add set_pg_state callback
Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
f9993efed7 drm/amdgpu/vcn4.0.5: add set_pg_state callback
Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
39fb77a8d3 drm/amdgpu/vcn4.0.3: add set_pg_state callback
Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:31 -05:00
Alex Deucher
8b18f03142 drm/amdgpu/vcn4.0: add set_pg_state callback
Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:30 -05:00
Alex Deucher
bda37b68f6 drm/amdgpu/vcn3.0: add set_pg_state callback
Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:30 -05:00
Alex Deucher
307ce8bdc6 drm/amdgpu/vcn2.5: add set_pg_state callback
Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:30 -05:00
Alex Deucher
40c6d55806 drm/amdgpu/vcn2.0: add set_pg_state callback
Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:30 -05:00
Alex Deucher
c5ed3655cd drm/amdgpu/vcn1.0: add set_pg_state callback
Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:30 -05:00
Alex Deucher
55945f08d9 drm/amdgpu/vcn: add new per instance callback for powergating
This is per instance so add a new function pointer for it.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:30 -05:00
Alex Deucher
64303b72de drm/amdgpu/vcn: adjust pause_dpg_mode function signature
Change it to take a vcn instance rather than adev to align
with the vcn instance changes.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:30 -05:00
Alex Deucher
0a3fb7338f drm/amdgpu/vcn5.0.1: convert internal functions to use vcn_inst
Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:30 -05:00
Alex Deucher
e3eb71cd69 drm/amdgpu/vcn5.0.0: convert internal functions to use vcn_inst
Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:30 -05:00
Alex Deucher
c07c0c0df9 drm/amdgpu/vcn4.0.5: convert internal functions to use vcn_inst
Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:30 -05:00
Alex Deucher
4a23b9c670 drm/amdgpu/vcn4.0.3: convert internal functions to use vcn_inst
Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:29 -05:00
Alex Deucher
259873561f drm/amdgpu/vcn4.0: convert internal functions to use vcn_inst
Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:29 -05:00
Alex Deucher
f1ab687040 drm/amdgpu/vcn2.5: convert internal functions to use vcn_inst
Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:29 -05:00
Alex Deucher
38a404f8af drm/amdgpu/vcn2.0: convert internal functions to use vcn_inst
Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

v2: index instances directly on vcn1.0 and 2.0 to make
it clear that they only support a single instance (Lijo)

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:29 -05:00
Alex Deucher
201fee333d drm/amdgpu/vcn1.0: convert internal functions to use vcn_inst
Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:29 -05:00
Alex Deucher
710151263c drm/amdgpu/vcn3.0: convert internal functions to use vcn_inst
Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:29 -05:00
Alex Deucher
f98675638f drm/amdgpu/vcn: switch vcn helpers to be instance based
Pass the instance to the helpers.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:29 -05:00
Alex Deucher
cb10727168 drm/amdgpu/vcn: move more instanced data to vcn_instance
Move more per instance data into the per instance structure.

v2: index instances directly on vcn1.0 and 2.0 to make
it clear that they only support a single instance (Lijo)
v3: fix typo on vcn 2.5

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> (v2)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:29 -05:00
Alex Deucher
9bf9442051 drm/amdgpu/vcn: make powergating status per instance
Store it per instance so we can track it per instance.

v2: index instances directly on vcn1.0 and 2.0 to make
it clear that they only support a single instance (Lijo)

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:29 -05:00
Alex Deucher
bee48570cf drm/amdgpu/vcn: switch work handler to be per instance
Have a separate work handler for each VCN instance. This
paves the way for per instance VCN power gating at runtime.

v2: index instances directly on vcn1.0 and 2.0 to make
it clear that they only support a single instance (Lijo)

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:29 -05:00
Alex Deucher
94629182f3 drm/amdgpu/vcn5.0.1: split code along instances
Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:28 -05:00
Alex Deucher
0797c54502 drm/amdgpu/vcn5.0.0: split code along instances
Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

v2: squash in fix for stop() from Boyuan

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:28 -05:00
Alex Deucher
ecc9ab4e92 drm/amdgpu/vcn4.0.5: split code along instances
Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

v2: squash in fix for stop() from Boyuan

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:28 -05:00
Alex Deucher
5826d5a5d5 drm/amdgpu/vcn4.0.3: split code along instances
Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

v2: squash in fix for stop() from Boyuan

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:28 -05:00
Alex Deucher
f4cd7a85db drm/amdgpu/vcn4.0: split code along instances
Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

v2: squash in fix for stop() from Boyuan

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:28 -05:00
Alex Deucher
d39f1bb577 drm/amdgpu/vcn3.0: split code along instances
Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

v2: squash in fix for stop() from Boyuan

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:28 -05:00
Alex Deucher
dae8700198 drm/amdgpu/vcn2.5: fix VCN stop logic
Need to make sure we call amdgpu_dpm_enable_vcn()
in vcn_v2_5_stop() at the end if there are errors
or DPG is enabled.

Fixes: ebc25499de ("drm/amdgpu/vcn2.5: split code along instances")
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Suggested-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 15:52:11 -05:00
Dave Airlie
425b848175 amd-drm-next-6.15-2025-02-21:
amdgpu:
 - Add OEM i2c support for RGB lights, etc.
 - Add support for GC 11.5.3
 - Add support for GC 11.5.2
 - Add support for SDMA 6.1.3
 - Add support for NBIO 7.11.2
 - Add support for NBIO 7.9.1
 - Add support for MMHUB 3.3.2
 - Add support for MMHUB 1.8.1
 - Add support for SMU 14.0.5
 - Add support for SMUIO 13.0.11
 - Add support for PSP 14.0.5
 - Add support for UMC 12.5.0
 - Add support for DCN 3.6.0
 - JPEG 4.0.3 updates
 - Add dynamic workload profile switching for GC 10-12
 - support larger vbios sizes
 - GC 9.5.0 updates
 - SMU 13.0.12 updates
 - SMU 13.0.6 updates
 - IP discovery updates
 - GC 10 queue reset updates
 - DCN 4.0.1 updates
 - UHBR link rate fixes
 - Aborted suspend fix
 - Mark gttsize parameter as deprecated
 - GC 10 cleaner shader updates
 - PSR-SU fixes
 - Clean up PM4 headers
 - Cursor fixes
 - Enable devcoredump for JPEG
 - Misc cleanups
 - Runpm cleanups
 - MES updates
 - GC 9 gfxoff fixes
 - Vbios fetching cleanups
 - Documentation updates
 - Update secondary plane handling
 - DML2 updates
 - SDMA fixes for MI
 - Cleaner shader fixes for GC 11/12
 - ACA updates
 - Initial JPEG queue reset support
 - RAS updates
 - Initial RAS CPER support
 - DCN/DCE panic screen handling cleanup
 - BT2020 fixes
 - SR-IOV fixes
 
 amdkfd:
 - synchronize pasid values between KGD and KFD
 - Misc cleanups
 - Improve GTT/VRAM handling for APUs
 - Topology updates
 - Fix user queue validation on GC 7/8
 
 UAPI:
 - Enable "Broadcast RGB" drm property
 - Add INFO IOCTL query for virtualization mode
   Proposed userspace:
   e663bed7d6
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZ7jwVwAKCRC93/aFa7yZ
 2EBmAP42gCX0ESuLJlwLTrobhs9mALp5g6AClLTSBuxNcgfOJwEAvqq12R2db8F9
 hFF3rq07C8YaL3bMIi3IhqIXVsnUBgo=
 =uvFs
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-next-6.15-2025-02-21' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.15-2025-02-21:

amdgpu:
- Add OEM i2c support for RGB lights, etc.
- Add support for GC 11.5.3
- Add support for GC 11.5.2
- Add support for SDMA 6.1.3
- Add support for NBIO 7.11.2
- Add support for NBIO 7.9.1
- Add support for MMHUB 3.3.2
- Add support for MMHUB 1.8.1
- Add support for SMU 14.0.5
- Add support for SMUIO 13.0.11
- Add support for PSP 14.0.5
- Add support for UMC 12.5.0
- Add support for DCN 3.6.0
- JPEG 4.0.3 updates
- Add dynamic workload profile switching for GC 10-12
- support larger vbios sizes
- GC 9.5.0 updates
- SMU 13.0.12 updates
- SMU 13.0.6 updates
- IP discovery updates
- GC 10 queue reset updates
- DCN 4.0.1 updates
- UHBR link rate fixes
- Aborted suspend fix
- Mark gttsize parameter as deprecated
- GC 10 cleaner shader updates
- PSR-SU fixes
- Clean up PM4 headers
- Cursor fixes
- Enable devcoredump for JPEG
- Misc cleanups
- Runpm cleanups
- MES updates
- GC 9 gfxoff fixes
- Vbios fetching cleanups
- Documentation updates
- Update secondary plane handling
- DML2 updates
- SDMA fixes for MI
- Cleaner shader fixes for GC 11/12
- ACA updates
- Initial JPEG queue reset support
- RAS updates
- Initial RAS CPER support
- DCN/DCE panic screen handling cleanup
- BT2020 fixes
- SR-IOV fixes

amdkfd:
- synchronize pasid values between KGD and KFD
- Misc cleanups
- Improve GTT/VRAM handling for APUs
- Topology updates
- Fix user queue validation on GC 7/8

UAPI:
- Enable "Broadcast RGB" drm property
- Add INFO IOCTL query for virtualization mode
  Proposed userspace:
  e663bed7d6

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250221213651.4176031-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-02-26 16:41:44 +10:00
Pierre-Eric Pelloux-Prayer
d3c7059b6a drm/amdgpu: init return value in amdgpu_ttm_clear_buffer
Otherwise an uninitialized value can be returned if
amdgpu_res_cleared returns true for all regions.

Possibly closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3812

Fixes: a68c7eaa7a ("drm/amdgpu: Enable clear page functionality")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7c62aacc3b)
Cc: stable@vger.kernel.org
2025-02-25 12:29:12 -05:00
Roman Li
4de141b8b1 drm/amd/display: Fix HPD after gpu reset
[Why]
DC is not using amdgpu_irq_get/put to manage the HPD interrupt refcounts.
So when amdgpu_irq_gpu_reset_resume_helper() reprograms all of the IRQs,
HPD gets disabled.

[How]
Use amdgpu_irq_get/put() for HPD init/fini in DM in order to sync refcounts

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f3dde2ff7f)
Cc: stable@vger.kernel.org
2025-02-25 12:28:39 -05:00
Yilin Chen
b5f7242e49 drm/amd/display: add a quirk to enable eDP0 on DP1
[why]
some board designs have eDP0 connected to DP1, need a way to enable
support_edp0_on_dp1 flag, otherwise edp related features cannot work

[how]
do a dmi check during dm initialization to identify systems that
require support_edp0_on_dp1. Optimize quirk table with callback
functions to set quirk entries, retrieve_dmi_info can set quirks
according to quirk entries

Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Yilin Chen <Yilin.Chen@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f6d17270d1)
Cc: stable@vger.kernel.org
2025-02-25 12:27:57 -05:00
Tom Chung
e8863f8b03 drm/amd/display: Disable PSR-SU on eDP panels
[Why]
PSR-SU may cause some glitching randomly on several panels.

[How]
Temporarily disable the PSR-SU and fallback to PSR1 for
all eDP panels.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3388
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6deeefb820)
Cc: stable@vger.kernel.org
2025-02-25 12:26:51 -05:00
Melissa Wen
12f3b92d1c drm/amd/display: restore edid reading from a given i2c adapter
When switching to drm_edid, we slightly changed how to get edid by
removing the possibility of getting them from dc_link when in aux
transaction mode. As MST doesn't initialize the connector with
`drm_connector_init_with_ddc()`, restore the original behavior to avoid
functional changes.

v2:
- Fix build warning of unchecked dereference (kernel test bot)

CC: Alex Hung <alex.hung@amd.com>
CC: Mario Limonciello <mario.limonciello@amd.com>
CC: Roman Li <Roman.Li@amd.com>
CC: Aurabindo Pillai <Aurabindo.Pillai@amd.com>
Fixes: 48edb2a425 ("drm/amd/display: switch amdgpu_dm_connector to use struct drm_edid")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 81262b1656)
2025-02-25 12:25:15 -05:00
Alex Deucher
748a1f51bb drm/amdgpu/mes: keep enforce isolation up to date
Re-send the mes message on resume to make sure the
mes state is up to date.

Fixes: 8521e3c5f0 ("drm/amd/amdgpu: limit single process inside MES")
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 27b7915147)
2025-02-25 12:23:48 -05:00
Alex Deucher
e7ea88207c drm/amdgpu/gfx: only call mes for enforce isolation if supported
This should not be called on chips without MES so check if
MES is enabled and if the cleaner shader is supported.

Fixes: 8521e3c5f0 ("drm/amd/amdgpu: limit single process inside MES")
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
(cherry picked from commit 80513e3897)
2025-02-25 12:22:56 -05:00
Alex Deucher
099bffc7ca drm/amdgpu: disable BAR resize on Dell G5 SE
There was a quirk added to add a workaround for a Sapphire
RX 5600 XT Pulse that didn't allow BAR resizing.  However,
the quirk caused a regression with runtime pm on Dell laptops
using those chips, rather than narrowing the scope of the
resizing quirk, add a quirk to prevent amdgpu from resizing
the BAR on those Dell platforms unless runtime pm is disabled.

v2: update commit message, add runpm check

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1707
Fixes: 907830b0fc ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5235053f44)
Cc: stable@vger.kernel.org
2025-02-25 12:20:27 -05:00
David Yat Sin
3502ab5022 drm/amdkfd: Preserve cp_hqd_pq_control on update_mqd
When userspace applications call AMDKFD_IOC_UPDATE_QUEUE. Preserve
bitfields that do not need to be modified as they contain flags to
track queue states that are used by CP FW.

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8150827990)
Cc: stable@vger.kernel.org
2025-02-25 12:19:25 -05:00
chr[]
91dcc66b34 amdgpu/pm/legacy: fix suspend/resume issues
resume and irq handler happily races in set_power_state()

* amdgpu_legacy_dpm_compute_clocks() needs lock
* protect irq work handler
* fix dpm_enabled usage

v2: fix clang build, integrate Lijo's comments (Alex)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2524
Fixes: 3712e7a494 ("drm/amd/pm: unified lock protections in amdgpu_dpm.c")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> # on Oland PRO
Signed-off-by: chr[] <chris@rudorff.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ee3dc9e204)
Cc: stable@vger.kernel.org
2025-02-25 12:17:38 -05:00
Tao Zhou
dab993bf15 drm/amdgpu: increase AMDGPU_MAX_RINGS
Increase it since a cper ring is introduced.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:13 -05:00
Srinivasan Shanmugam
59f9c2c9f6 drm/amdgpu: Fix correct parameter desc for VCN idle check functions
Fixes the kdoc for the following VCN idle check functions by updating
the parameter description from 'handle' to 'ip_block':

- vcn_v4_0_is_idle
- vcn_v4_0_3_is_idle
- vcn_v4_0_5_is_idle
- vcn_v5_0_1_is_idle

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c:935: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_1_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c:935: warning: Excess function parameter 'handle' description in 'vcn_v5_0_1_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:1972: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:1972: warning: Excess function parameter 'handle' description in 'vcn_v4_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c:1583: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_3_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c:1583: warning: Excess function parameter 'handle' description in 'vcn_v4_0_3_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1200: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1200: warning: Excess function parameter 'handle' description in 'vcn_v5_0_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1460: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_5_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1460: warning: Excess function parameter 'handle' description in 'vcn_v4_0_5_is_idle'

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:13 -05:00
Pierre-Eric Pelloux-Prayer
7c62aacc3b drm/amdgpu: init return value in amdgpu_ttm_clear_buffer
Otherwise an uninitialized value can be returned if
amdgpu_res_cleared returns true for all regions.

Possibly closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3812

Fixes: a68c7eaa7a ("drm/amdgpu: Enable clear page functionality")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
ganglxie
a8f921a10a drm/amdgpu: Change page/record number calculation based on nps
save only one record to save eeprom space,and
bad_page_num = pa_rec_num + mca_rec_num*16

Signed-off-by: ganglxie <ganglxie@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
ganglxie
0153d27673 drm/amdgpu: Refine bad page adding
bad page adding can be simpler with nps info

Signed-off-by: ganglxie <ganglxie@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
Asad Kamal
e6aae1db41 drm/amd/pm: Get metrics table version for smu_v13_0_12
Get metrics table version for smu_v13_0_12 and populate pm_metrics

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
Jesse.zhang@amd.com
e4e6ae41cc drm/amdgpu: update SDMA sysfs reset mask in late_init
- Added `sdma_v4_4_2_update_reset_mask` function to update the reset mask.
- update the sysfs reset mask to the `late_init` stage to ensure that the SMU  initialization
     and capability setup are completed before checking the SDMA reset capability.
- For IP versions 9.4.3 and 9.4.4, enable per-queue reset if the MEC firmware version is at least 0xb0 and PMFW supports queue reset.
- Add a TODO comment for future support of per-queue reset for IP version 9.5.0.

This change ensures that per-queue reset is only enabled when the MEC and PMFW support it.

v2: fix ip version (9.5.4 -> 9.5.0)(Lijo)

Suggested-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
Xiang Liu
ff930483af drm/amdgpu: Set CPER enabled flag after ring initiailized
Setting cper.enabled to be true only after cper ring is successfully
created.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
ganglxie
f2510355fb drm/amdgpu: Save nps to eeprom
nps info saved together with bad page makes bad page parsing more efficient

Signed-off-by: ganglxie <ganglxie@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
Xiang Liu
ce615fe328 drm/amdgpu: Check if CPER enabled when generating CPER
In the case of CPER disabled, generating CPER will cause kernel NULL
pointer dereference without checking.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
Mangesh Gadre
700e535db4 drm/amd/pm: handling of set performance level
display performance level when set not supported

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
Jonathan Kim
9424a5bf08 drm/amdgpu: simplify xgmi peer info calls
Deprecate KFD XGMI peer info calls in favour of calling directly from
simplified XGMI peer info functions.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
Jonathan Kim
6f0e68b8c7 drm/amdkfd: enable cooperative launch on gfx12
Even though GWS no longer exists, to maintain runtime usage for
cooperative launch, SW set legacy GWS size.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Acked-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
Taimur Hassan
9655a16031 drm/amd/display: Promote DAL to 3.2.322
- Disable PSR-SU on eDP panels
- Fix HPD after GPU reset
- Fixes on dcn4x init, DML2 state policy on DCN36
- Various minor logic fixes

Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
Taimur Hassan
d7dc4917ae drm/amd/display: [FW Promotion] Release 0.0.255.0
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:11 -05:00
Roman Li
f3dde2ff7f drm/amd/display: Fix HPD after gpu reset
[Why]
DC is not using amdgpu_irq_get/put to manage the HPD interrupt refcounts.
So when amdgpu_irq_gpu_reset_resume_helper() reprograms all of the IRQs,
HPD gets disabled.

[How]
Use amdgpu_irq_get/put() for HPD init/fini in DM in order to sync refcounts

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:11 -05:00
Mike Katsnelson
8adeff83a3 drm/amd/display: stop DML2 from removing pipes based on planes
[Why]
Transitioning from low to high resolutions at high refresh rates caused grey corruption.
During the transition state, there is a period where plane size is based on low resultion
state and ODM slices are based on high resoultion state, causing the entire plane to be
contained in one ODM slice. DML2 would turn off the pipe for the ODM slice with no plane,
causing an underflow since the pixel rate for the higher resolution cannot be supported on
one pipe. This change stops DML2 from turning off pipes that are mapped to an ODM slice
with no plane. This is possible to do without negative consequences because pipes can now
take the minimum viewport and draw with zero recout size, removing the need to have the
pipe turned off.

[How]
In map_pipes_from_plane(), remove "check" that skips ODM slices that are not covered by
the plane. This prevents the pipes for those ODM slices from being freed.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Mike Katsnelson <mike.katsnelson@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:11 -05:00
Nicholas Kazlauskas
50f040c53e drm/amd/display: Increase halt timeout for DMCUB to 1s
[Why]
If we soft reset before halt finishes and there are outstanding
memory transactions then the memory interface may produce unexpected
results, such as out of order transactions when the firmware next runs.

These can manifest as random or unexpected load/store violations.

[How]
Increase the timeout before soft reset to ensure the DMCUB has quiesced.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:11 -05:00
Krunoslav Kovac
35079e7eee drm/amd/display: Remove unused header
[Why]
Removes unused header

Reviewed-by: Samson Tam <samson.tam@amd.com>
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:11 -05:00
Yihan Zhu
02a940da2c drm/amd/display: handle max_downscale_src_width fail check
[WHY]
If max_downscale_src_width check fails, we exit early from TAP calculation and left a NULL
value to the scaling data structure to cause the zero divide in the DML validation.

[HOW]
Call set default TAP calculation before early exit in get_optimal_number_of_taps due to
max downscale limit exceed.

Reviewed-by: Samson Tam <samson.tam@amd.com>
Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:11 -05:00
Michael Strauss
7c6518c1c7 drm/amd/display: Update FIXED_VS Link Rate Toggle Workaround Usage
[WHY]
Previously the 128b/132b LTTPR support DPCD field was used to decide if
FIXED_VS training sequence required a rate toggle before initiating LT.

When running DP2.1 4.9.x.x compliance tests, emulated LTTPRs can report
no-128b/132b support which is then forwarded by the FIXED_VS retimer.
As a result this test exposes the rate toggle again, erroneously causing
failures as certain compliance sinks don't expect this behaviour.

[HOW]
Add new DPCD register defines/reads to read LTTPR IEEE OUI and device ID.

Decide whether to perform the rate toggle based on the LTTPR's IEEE OUI
which guarantees that we only perform the toggle on affected retimers.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:11 -05:00
Charlene Liu
23ef388a84 drm/amd/display: fix dcn4x init failed
[why]
failed due to cmdtable not created.
switch atombios cmdtable as default.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:11 -05:00
Aurabindo Pillai
ba93dddfc9 drm/amd/display: Temporarily disable hostvm on DCN31
With HostVM enabled, DCN31 fails to pass validation for 3x4k60. Some Linux
userspace does not downgrade one of the monitors to 4k30, and the result
is that the monitor does not light up. Disable it until the bandwidth
calculation failure is resolved.

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:11 -05:00
Rafal Ostrowski
ab021b9f31 drm/amd/display: ACPI Re-timer Programming
[Why]
We must implement an ACPI re-timer programming interface and notify
ACPI driver whenever a PHY transition is about to take place.

Because some trace lengths on certain platforms are very long,
then a re-timer may need to be programmed whenever a PHY transition
takes place. The implementation of this re-timer programming interface
will notify ACPI driver that PHY transition is taking place and it
will trigger the re-timer as needed.

First we need to gather retimer information from ACPI interface.

Then, in the PRE case, the re-timer interface needs to be called before we call
transmitter ENABLE.
In the POST case, it has to be called after we call transmitter DISABLE.

[How]
Implemented ACPI retimer programming interface.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Rafal Ostrowski <rostrows@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:36 -05:00
Patel, Swapnil
02a2793ab2 drm/amd/display: Refactor DCN4x and related code
[why & how]
Refactor existing code related to DCN4x for better code sharing with
other modules.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Swapnil Patel <Swapnil.Patel@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:02 -05:00
Yilin Chen
f6d17270d1 drm/amd/display: add a quirk to enable eDP0 on DP1
[why]
some board designs have eDP0 connected to DP1, need a way to enable
support_edp0_on_dp1 flag, otherwise edp related features cannot work

[how]
do a dmi check during dm initialization to identify systems that
require support_edp0_on_dp1. Optimize quirk table with callback
functions to set quirk entries, retrieve_dmi_info can set quirks
according to quirk entries

Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Yilin Chen <Yilin.Chen@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:02 -05:00
Peichen Huang
d295786840 drm/amd/display: replace dio encoder access
[WHY]
replace dio encoder access to work with new dio encoder
assignment.

[HOW}
1. before validation, access dio encoder by get_temp_dio_link_enc()
2. after validation, access dio encoder through pipe_ctx->link_res

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:02 -05:00
Navid Assadian
0fe2df4498 drm/amd/display: Add SPL namespace
[Why]
In order to avoid component conflicts, spl namespace is needed.

[How]
Adding SPL namespace to the public API os that each user of SPL can have
their own namespace.

Signed-off-by: Navid Assadian <Navid.Assadian@amd.com>
Reviewed-by: Samson Tam <Samson.Tam@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:02 -05:00
Samson Tam
259eacbfcf drm/amd/display: Fix unit test failure
[Why]
Some of unit tests use large scaling ratio such that when we
 calculate optimal number of taps, max_taps is negative.
 Then in recent change, we changed max_taps to uint instead
 of int so now max_taps wraps and is positive.  This change
 changed the behaviour from returning back false to return
 true and breaks unit test check

[How]
Add check to prevent max_taps from wrapping and set to 0
 instead

Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:02 -05:00
Samson Tam
0d30046476 drm/amd/display: fix check for identity ratio
[Why]
IDENTITY_RATIO check uses 2 bits for integer, which only allows
 checking downscale ratios up to 3.  But we support up to 6x
 downscale

[How]
Update IDENTITY_RATIO to check 3 bits for integer
Add ASSERT to catch if we downscale more than 6x

Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Reviewed-by: Jun Lei <jun.lei@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:02 -05:00
Assadian, Navid
26873260d3 drm/amd/display: Fix mismatch type comparison
The mismatch type comparison/assignment may cause data loss. Since the
values are always non-negative, it is safe to use unsigned variables to
resolve the mismatch.

Signed-off-by: Navid Assadian <navid.assadian@amd.com>
Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:01 -05:00
Navid Assadian
fba4d19f37 drm/amd/display: Add opp recout adjustment
[Why]
For subsampled YUV output formats, more pixels can get fetched and be
used for scaling.

[How]
Add the adjustment to the calculated recout, so the viewport covers the
corresponding pixels on the source plane.

Signed-off-by: Navid Assadian <Navid.Assadian@amd.com>
Reviewed-by: Samson Tam <Samson.Tam@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:01 -05:00
Samson Tam
86f06bcbb5 drm/amd/display: Fix mismatch type comparison in custom_float
[Why & How]
Passing uint into uchar function param.  Pass uint instead

Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:01 -05:00
Nicholas Kazlauskas
97b05c8c2e drm/amd/display: Apply DCN35 DML2 state policy for DCN36 too
[Why]
DCN36 should inherit the same policy as DCN35 for DML2.

[How]
Add it to the list of checks in translation helper.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:01 -05:00
Alex Hung
d8075f5a6d drm/amd/display: update incorrect cursor buffer size
[WHAT & HOW]
Fix the incorrect value of the cursor_buffer_size.

Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:01 -05:00
Tom Chung
6deeefb820 drm/amd/display: Disable PSR-SU on eDP panels
[Why]
PSR-SU may cause some glitching randomly on several panels.

[How]
Temporarily disable the PSR-SU and fallback to PSR1 for
all eDP panels.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3388
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:01 -05:00
Tom Chung
dc84a21f5f drm/amd/display: Revert "Disable PSR-SU on some OLED panel"
This reverts commit c31b41f1cb.

We planning to disable the PSR-SU and fallback to PSR1 for
all eDP panels not only for specific eDP panel temporarily.

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:01 -05:00
Colin Ian King
abefe9fcfb drm/amd/display: Fix spelling mistake "oustanding" -> "outstanding"
There is a spelling mistake in max_oustanding_when_urgent_expected,
fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:01 -05:00
Melissa Wen
81262b1656 drm/amd/display: restore edid reading from a given i2c adapter
When switching to drm_edid, we slightly changed how to get edid by
removing the possibility of getting them from dc_link when in aux
transaction mode. As MST doesn't initialize the connector with
`drm_connector_init_with_ddc()`, restore the original behavior to avoid
functional changes.

v2:
- Fix build warning of unchecked dereference (kernel test bot)

CC: Alex Hung <alex.hung@amd.com>
CC: Mario Limonciello <mario.limonciello@amd.com>
CC: Roman Li <Roman.Li@amd.com>
CC: Aurabindo Pillai <Aurabindo.Pillai@amd.com>
Fixes: 48edb2a425 ("drm/amd/display: switch amdgpu_dm_connector to use struct drm_edid")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:01 -05:00
Dr. David Alan Gilbert
9d8af72fe7 drm/amdgpu: Remove unused nbif_v6_3_1_sriov_funcs
The nbif_v6_3_1_sriov_funcs instance of amdgpu_nbio_funcs was added in
commit 894c6d3522 ("drm/amdgpu: Add nbif v6_3_1 ip block support")
but has remained unused.

Alex has confirmed it wasn't needed.

Remove it, together with the four unused stub functions:
  nbif_v6_3_1_sriov_ih_doorbell_range
  nbif_v6_3_1_sriov_gc_doorbell_init
  nbif_v6_3_1_sriov_vcn_doorbell_range
  nbif_v6_3_1_sriov_sdma_doorbell_range

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:01 -05:00
Sathishkumar S
62431979dd drm/amdgpu: Add ring reset callback for JPEG5_0_1
Add ring reset function callback for JPEG5_0_1 to
recover from job timeouts without a full gpu reset.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:00 -05:00
André Almeida
b7fd6528b5 drm/amdgpu: Log after a successful ring reset
When a ring reset happens, the kernel log shows only "amdgpu: Starting
<ring name> ring reset", but when it finishes nothing appears in the
log. Explicitly write in the log that the reset has finished correctly.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:00 -05:00
André Almeida
28d05f0836 drm/amdgpu: Log the creation of a coredump file
After a GPU reset happens, the driver creates a coredump file. However,
the user might not be aware of it. Log the file creation the user can
find more information about the device and add the file to bug reports.
This is similar to what the xe driver does.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:00 -05:00
Alex Deucher
27b7915147 drm/amdgpu/mes: keep enforce isolation up to date
Re-send the mes message on resume to make sure the
mes state is up to date.

Fixes: 8521e3c5f0 ("drm/amd/amdgpu: limit single process inside MES")
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:00 -05:00
Asad Kamal
0b4119d54b drm/amd/pm: Use separate metrics table for smu_v13_0_12
Use separate metrics table for smu_v13_0_12 and fetch metrics data using
that.

v2: Fix jpeg busy indexing (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:00 -05:00
Sathishkumar S
9b71be8785 drm/amdgpu: Add core reset registers for JPEG5_0_1
Add core reset control register definitions and align
all prior register definitions to end at 100 column
length for uniformity.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:00 -05:00
Sathishkumar S
da120ed561 drm/amdgpu: Per-instance init func for JPEG5_0_1
Add helper functions to handle per-instance and per-core
initialization and deinitialization in JPEG5_0_1.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:00 -05:00
Aurabindo Pillai
a1addcf849 drm/amd/display: fix an indent issue in DML21
Remove extraneous tab and newline in dml2_core_dcn4.c that was
reported by the bot

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502211920.txUfwtSj-lkp@intel.com/
Fixes: 70839da636 ("drm/amd/display: Add new DCN401 sources")
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:00 -05:00
Alex Deucher
5235053f44 drm/amdgpu: disable BAR resize on Dell G5 SE
There was a quirk added to add a workaround for a Sapphire
RX 5600 XT Pulse that didn't allow BAR resizing.  However,
the quirk caused a regression with runtime pm on Dell laptops
using those chips, rather than narrowing the scope of the
resizing quirk, add a quirk to prevent amdgpu from resizing
the BAR on those Dell platforms unless runtime pm is disabled.

v2: update commit message, add runpm check

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1707
Fixes: 907830b0fc ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:00 -05:00
Asad Kamal
25907304cf drm/amd/pm: Fetch fru product info for smu_v13_0_12
Fetch fru product info for smu_v13_0_12 from static metrics table

v2: Field by field copy for fru info(Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:44:00 -05:00
Asad Kamal
95eebc05a7 drm/amd/pm: Fetch static metrics table
Fetch clock frequency table from static metrics table for
smu_v13_0_12

v2: Move PPTable definition, remove unnecessary checks for getting
static metrics table(Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Asad Kamal
6c565218ed drm/amd/pm: Add GetStaticMetricTable message
Add GetStaticMetricTable message for smu_v13_0_12

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Asad Kamal
e2b3f95b47 drm/amd/pm: Update pmfw headers for smu_v13_0_12
Update pmfw headers for smu_v13_0_12 new messages & metrics table.
Static metrics table for frequency added, Separate metrics table
for smu_v13_0_12 added.

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Jesse.zhang@amd.com
c94943b086 drm/amdgpu: Update amdgpu_job_timedout to check if the ring is guilty
This patch updates the `amdgpu_job_timedout` function to check if
the ring is actually guilty of causing the timeout. If not, it
skips error handling and fence completion.

v2: move the is_guilty check down into the queue reset area (Alex)
v3: need to call is_guilty before reset (Alex)
v4: squash in is_guilty logic fixes (Alex)

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Jesse.zhang@amd.com
d190e4d0f7 drm/amd/pm: add support for checking SDMA reset capability
This patch introduces a new function to check if the SMU supports resetting the SDMA engine.
This capability check ensures that the driver does not attempt to reset the SDMA engine
on hardware that does not support it.

The following changes are included:
- New function `amdgpu_dpm_reset_sdma_is_supported` to check SDMA reset
  support at the AMDGPU driver level.
- New function `smu_reset_sdma_is_supported` to check SDMA reset support
  at the SMU level.
- Implementation of `smu_v13_0_6_reset_sdma_is_supported` for the specific
  SMU version v13.0.6.
- Updated `smu_v13_0_6_reset_sdma` to use the new capability check before
  attempting to reset the SDMA engine.

v2: change smu_reset_sdma_is_supported type to bool (Tim)

Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Jesse.zhang@amd.com
8225254492 drm/amdgpu: Add reset function pointer for SDMA v4.4.2 page ring
This patch adds a reset function pointer to the SDMA v4.4.2 page ring
functionality. The new function pointer `reset` is set to
`sdma_v4_4_2_reset_queue`, which is responsible for resetting the SDMA queue.

Changes:
- Add `reset` function pointer to `sdma_v4_4_2_page_ring_funcs`.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Jesse.zhang@amd.com
fdbfaaaae0 drm/amdgpu: Improve SDMA reset logic with guilty queue tracking
This patch includes the remaining improvements to the SDMA reset logic:
- Added `gfx_guilty` and `page_guilty` flags to track guilty queues.
- Updated the reset and resume functions to handle the guilty state.
- Cached the `rptr` before reset.

v2:
   1.replace the caller with a guilty bool.
   If the queue is the guilty one, set the rptr and wptr  to the saved wptr value,
   else, set the rptr and wptr to the saved rptr value. (Alex)
   2. cache the rptr before the reset. (Alex)

v3: Keeping intermediate variables like u64 rwptr simplifies resotre rptr/wptr.(Lijo)

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Jesse.zhang@amd.com
0ad649321a drm/amdgpu/sdma: Introduce is_guilty callbacks for sdma GFX and PAGE rings
This patch introduces the `is_guilty` callbacks for the GFX and PAGE rings.
These callbacks check if a ring is guilty of causing a timeout or error.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Jesse.zhang@amd.com
4d3c4f4f7f drm/amdgpu: Introduce cached_rptr and is_guilty callback in amdgpu_ring
This patch introduces the following changes:
- Add `cached_rptr` to the `amdgpu_ring` structure to store the read pointer before a reset.
- Add `is_guilty` callback to the `amdgpu_ring_funcs` structure to check if a ring is guilty of causing a timeout.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Jesse.zhang@amd.com
4c02f73016 drm/amdgpu: Introduce conditional user queue suspension for SDMA resets
- Modify the `amdgpu_sdma_reset_engine` function to accept a `suspend_user_queues` parameter.
- This parameter allows the function to conditionally suspend and resume user queues during SDMA resets.
- Ensure that user queues are suspended only when necessary to avoid unnecessary overhead and potential deadlocks.
- Restart the scheduler's work queue for the GFX and page rings after the reset to allow new tasks to be submitted.

This change improves synchronization between the KGD and the KFD during SDMA resets,
ensuring proper handling of user queues and avoiding race conditions.

V2: replace the ring_lock with the existed the scheduler
    locks for the queues (ring->sched) on the sdma engine.(Alex)

v3: call drm_sched_wqueue_stop() rather than job_list_lock.
    If a GPU ring reset was already initiated for one ring at amdgpu_job_timedout,
    skip resetting that ring and call drm_sched_wqueue_stop()
    for the other rings (Alex)

   replace  the common lock (sdma_reset_lock) with DQM lock to
   to resolve reset races between the two driver sections during KFD eviction.(Jon)

   Rename the caller to Reset_src and
   Change AMDGPU_RESET_SRC_SDMA_KGD/KFD to AMDGPU_RESET_SRC_SDMA_HWS/RING (Jon)

v4: restart the wqueue if the reset was successful,
    or fall back to a full adapter reset. (Alex)

   move definition of reset source to enumeration AMDGPU_RESET_SRCS, and
   check reset src in amdgpu_sdma_reset_instance (Jon)

v5: Call amdgpu_amdkfd_suspend/resume at the start/end of reset function respectively under !SRC_HWS
    conditions only (Jon)

v6: replace the paramter src with a bool suspend_user_queues,
    remove the paramter src in pre/post func. (Jon)

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Suggested-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Acked-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Lijo Lazar
0ca5751560 drm/amdgpu: Remove redundant logic in GC v9.4.3
GFXOFF check is not needed for GC v9.4.3. Also, save/restore list is
available by default.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Sathishkumar S
793ee232ee drm/amdgpu: Do not poweroff UVDJ in JPEG4_0_3
Update power gate setting to not poweroff UVDJ in JPEG4_0_3.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:58 -05:00
Jesse.zhang@amd.com
d6e6ea5efb drm/amdgpu/sdma: Refactor SDMA reset functionality and add callback support
This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver
to improve modularity and support shared usage between AMDGPU and KFD. The
changes include:

1. **Refactored SDMA Reset Logic**:
   - Split the `sdma_v4_4_2_reset_queue` function into two separate functions:
     - `sdma_v4_4_2_stop_queue`: Stops the SDMA queue before reset.
     - `sdma_v4_4_2_restore_queue`: Restores the SDMA queue after reset.
   - These functions are now used as callbacks for the shared reset mechanism.

2. **Added Callback Support**:
   - Introduced a new structure `sdma_v4_4_2_reset_funcs` to hold the stop and
     restore callbacks.
   - Added `sdma_v4_4_2_set_reset_funcs` to register these callbacks with the
     shared reset mechanism using `amdgpu_set_on_reset_callbacks`.

3. **Fixed Reset Queue Function**:
   - Modified `sdma_v4_4_2_reset_queue` to use the shared `amdgpu_sdma_reset_queue`
     function, ensuring consistency across the driver.

This patch ensures that SDMA reset functionality is more modular, reusable, and
aligned with the shared reset mechanism between AMDGPU and KFD.

v2: Renamed sdma_v4_4_2_set_reset_funcs to sdma_v4_4_2_set_engine_reset_funcs.
    Renamed sdma_v4_4_2_reset_funcs to sdma_v4_4_2_engine_reset_funcs.(Alex)

Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:58 -05:00
Jesse.zhang@amd.com
f33044952c drm/amdgpu/kfd: Add shared SDMA reset functionality with callback support
This patch introduces shared SDMA reset functionality between AMDGPU and KFD.
The implementation includes the following key changes:

1. Added `amdgpu_sdma_reset_queue`:
   - Resets a specific SDMA queue by instance ID.
   - Invokes registered pre-reset and post-reset callbacks to allow KFD and AMDGPU
     to save/restore their state during the reset process.

2. Added `amdgpu_set_on_reset_callbacks`:
   - Allows KFD and AMDGPU to register callback functions for pre-reset and
     post-reset operations.
   - Callbacks are stored in a global linked list and invoked in the correct order
     during SDMA reset.

This patch ensures that both AMDGPU and KFD can handle SDMA reset events
gracefully, with proper state saving and restoration. It also provides a flexible
callback mechanism for future extensions.

v2: fix CamelCase and put the SDMA helper into amdgpu_sdma.c (Alex)

v3: rename the `amdgpu_register_on_reset_callbacks` function to
      `amdgpu_sdma_register_on_reset_callbacks`
    move global reset_callback_list to struct amdgpu_sdma (Alex)

v4: Update the reset callback function description and
   rename the reset function to amdgpu_sdma_reset_engine (Alex)

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:58 -05:00
Likun Gao
71209c9663 drm/amdgpu: correct the name of mes_pipe structure
Correct the structure name admgpu_mes_pipe to amdgpu_mes_pipe.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:58 -05:00
David Yat Sin
8150827990 drm/amdkfd: Preserve cp_hqd_pq_control on update_mqd
When userspace applications call AMDKFD_IOC_UPDATE_QUEUE. Preserve
bitfields that do not need to be modified as they contain flags to
track queue states that are used by CP FW.

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:58 -05:00
chr[]
ee3dc9e204 amdgpu/pm/legacy: fix suspend/resume issues
resume and irq handler happily races in set_power_state()

* amdgpu_legacy_dpm_compute_clocks() needs lock
* protect irq work handler
* fix dpm_enabled usage

v2: fix clang build, integrate Lijo's comments (Alex)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2524
Fixes: 3712e7a494 ("drm/amd/pm: unified lock protections in amdgpu_dpm.c")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> # on Oland PRO
Signed-off-by: chr[] <chris@rudorff.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:58 -05:00
Sunil Khatri
7dc3405403 drm/amdgpu: update the handle ptr in is_idle
Update the *handle to amdgpu_ip_block ptr for all
functions pointers of is_idle.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:58 -05:00
Thomas Zimmermann
130377304e Merge drm/drm-next into drm-misc-next
Backmerging to get fixes from v6.14-rc4.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-02-25 11:43:10 +01:00
Dave Airlie
fb51bf0255 Linux 6.14-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAme7hfkeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGU+IH/1bk6zIvAwXXS5yu
 KNsQ8dEkC3Xme6HqLtPsAhRLF+5YJf6MaGm1ip5dDMyIvasa2gwvCQQQoOpeMbKj
 79VKT+m9t3szMHZaQYjlOuYHBmNSJ4cMCD2Qh6ktXHGPfTTWDFGf7fBwBOkVNeJU
 1Ask+bxeop21aJMhfYXrUta3OYyerLBUR6jCiCM82A/GLtdv6oNGXBu3ygDt9Tjx
 ZHSl+CYjKpmGUP8JnMKwCBHVguEfqgzZ//dY1H16AvOLed9k2jkMFn8O5Vi3vjnx
 TWMMXoiJimuamGzbjxtCCqzxNlFFDT4gRpDqeJxb16W/gDTFmbRr9LDjNehCZe33
 AigLZ6M=
 =Y/7F
 -----END PGP SIGNATURE-----

Merge tag 'v6.14-rc4' into drm-next

Backmerge Linux 6.14-rc4 at the request of tzimmermann so misc-next
can base on rc4.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-02-25 17:36:09 +10:00
Tvrtko Ursulin
80b6ef8ae2 drm/amdgpu: Pop jobs from the queue more robustly
Replace a copy of DRM scheduler's to_drm_sched_job with a copy of a newly
added drm_sched_entity_queue_pop.

This allows breaking the hidden dependency that queue_node has to be the
first element in struct drm_sched_job.

A comment is also added with a reference to the mailing list discussion
explaining the copied helper will be removed when the whole broken
amdgpu_job_stop_all_jobs_on_sched is removed.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Philipp Stanner <phasta@kernel.org>
Cc: Zhang, Hawking <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250221105038.79665-3-tvrtko.ursulin@igalia.com
2025-02-24 10:17:40 +01:00
Christian König
cb0de06d1b drm/amdgpu: remove all KFD fences from the BO on release
Remove all KFD BOs from the private dma_resv object.

This prevents the KFD from being evict unecessarily when an exported BO
is released.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-and-tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-21 10:41:49 -05:00
Thomas Weißschuh
600aa8d31a drm/amd/display: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20241216-sysfs-const-bin_attr-drm-v1-5-210f2b36b9bf@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:31 +01:00
Thomas Weißschuh
2d0f5001b6 drm/amdgpu: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20241216-sysfs-const-bin_attr-drm-v1-4-210f2b36b9bf@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:31 +01:00
Sunil Khatri
3521276ad1 drm/amdgpu: update the handle ptr in get_clockgating_state
Update the *handle to amdgpu_ip_block ptr for all
functions pointers of get_clockgating_state.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:19:05 -05:00
Rodrigo Siqueira
3f670b745d drm/amd/display: Add clear DCC and Tiling callback for DCE
Introduce the DCC and Tiling reset callback to all DCE versions that can
call it.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:18:22 -05:00
Srinivasan Shanmugam
2b04d04de9 drm/amdkfd: Fix error handling for missing PASID in 'kfd_process_device_init_vm'
In the kfd_process_device_init_vm function, a valid error code is now
returned when the associated Process Address Space ID (PASID) is not
present.

If the address space virtual memory (avm) does not have an associated
PASID, the function sets the ret variable to -EINVAL before proceeding
to the error handling section. This ensures that the calling function,
such as kfd_ioctl_acquire_vm, can appropriately handle the error,
thereby preventing any issues during virtual memory initialization.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:1694 kfd_process_device_init_vm()
warn: missing error code 'ret'

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c
    1647 int kfd_process_device_init_vm(struct kfd_process_device *pdd,
    1648                                struct file *drm_file)
    1649 {
    ...
    1690
    1691         if (unlikely(!avm->pasid)) {
    1692                 dev_warn(pdd->dev->adev->dev, "WARN: vm %p has no pasid associated",
    1693                                  avm);
--> 1694                 goto err_get_pasid;

ret = -EINVAL?

    1695         }

Fixes: 8544374c0f ("drm/amdkfd: Have kfd driver use same PASID values from graphic driver")
Reported by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Xiaogang Chen <xiaogang.chen@amd.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:16:41 -05:00
Xiang Liu
2f94469cc0 drm/amdgpu: Remove redundant check of adev
There is no need to check adev for sure.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:16:37 -05:00
Xiang Liu
663a87763b drm/amdgpu: Check aca enabled inside cper init/fini func
Move code about checking aca enabled to the cper init/fini function
to make code clean.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:16:33 -05:00
Lijo Lazar
30eb41f5d1 drm/amdgpu: Use firmware supported NPS modes
If firmware supported NPS modes are available through CAP register, use
those values for supported NPS modes.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:16:29 -05:00
Lijo Lazar
b2a9e562df drm/amd/pm: Fetch current power limit from PMFW
On SMU v13.0.12, always query the firmware to get the current power
limit as it could be updated through other means also.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:16:18 -05:00
Sathishkumar S
c4c3808feb drm/amdgpu: Add ring reset callback for JPEG4_0_3
Add ring reset function callback for JPEG4_0_3 to
recover from job timeouts without a full gpu reset.

V2:
 - sched->ready flag shouldn't be modified by HW backend (Christian)

V3:
 - Dont modifying sched/job-submission state from HW backend (Christian)
 - Implement per-core reset sequence

V4:
 -  Dont create reset_mask sysfs and return -EOPNOTSUPP on VFs (Lijo)

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:16:11 -05:00
Sathishkumar S
58702e1a09 drm/amdgpu: Add JPEG4_0_3 core reset control reg
Add core reset control registers for JPEG4_0_3

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:16:04 -05:00
Srinivasan Shanmugam
dc0297f319 drm/amdgpu: Replace Mutex with Spinlock for RLCG register access to avoid Priority Inversion in SRIOV
RLCG Register Access is a way for virtual functions to safely access GPU
registers in a virtualized environment., including TLB flushes and
register reads. When multiple threads or VFs try to access the same
registers simultaneously, it can lead to race conditions. By using the
RLCG interface, the driver can serialize access to the registers. This
means that only one thread can access the registers at a time,
preventing conflicts and ensuring that operations are performed
correctly. Additionally, when a low-priority task holds a mutex that a
high-priority task needs, ie., If a thread holding a spinlock tries to
acquire a mutex, it can lead to priority inversion. register access in
amdgpu_virt_rlcg_reg_rw especially in a fast code path is critical.

The call stack shows that the function amdgpu_virt_rlcg_reg_rw is being
called, which attempts to acquire the mutex. This function is invoked
from amdgpu_sriov_wreg, which in turn is called from
gmc_v11_0_flush_gpu_tlb.

The [ BUG: Invalid wait context ] indicates that a thread is trying to
acquire a mutex while it is in a context that does not allow it to sleep
(like holding a spinlock).

Fixes the below:

[  253.013423] =============================
[  253.013434] [ BUG: Invalid wait context ]
[  253.013446] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G     U     OE
[  253.013464] -----------------------------
[  253.013475] kworker/0:1/10 is trying to lock:
[  253.013487] ffff9f30542e3cf8 (&adev->virt.rlcg_reg_lock){+.+.}-{3:3}, at: amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]
[  253.013815] other info that might help us debug this:
[  253.013827] context-{4:4}
[  253.013835] 3 locks held by kworker/0:1/10:
[  253.013847]  #0: ffff9f3040050f58 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x3f5/0x680
[  253.013877]  #1: ffffb789c008be40 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_one_work+0x1d6/0x680
[  253.013905]  #2: ffff9f3054281838 (&adev->gmc.invalidate_lock){+.+.}-{2:2}, at: gmc_v11_0_flush_gpu_tlb+0x198/0x4f0 [amdgpu]
[  253.014154] stack backtrace:
[  253.014164] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G     U     OE      6.12.0-amdstaging-drm-next-lol-050225 #14
[  253.014189] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  253.014203] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/18/2024
[  253.014224] Workqueue: events work_for_cpu_fn
[  253.014241] Call Trace:
[  253.014250]  <TASK>
[  253.014260]  dump_stack_lvl+0x9b/0xf0
[  253.014275]  dump_stack+0x10/0x20
[  253.014287]  __lock_acquire+0xa47/0x2810
[  253.014303]  ? srso_alias_return_thunk+0x5/0xfbef5
[  253.014321]  lock_acquire+0xd1/0x300
[  253.014333]  ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]
[  253.014562]  ? __lock_acquire+0xa6b/0x2810
[  253.014578]  __mutex_lock+0x85/0xe20
[  253.014591]  ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]
[  253.014782]  ? sched_clock_noinstr+0x9/0x10
[  253.014795]  ? srso_alias_return_thunk+0x5/0xfbef5
[  253.014808]  ? local_clock_noinstr+0xe/0xc0
[  253.014822]  ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]
[  253.015012]  ? srso_alias_return_thunk+0x5/0xfbef5
[  253.015029]  mutex_lock_nested+0x1b/0x30
[  253.015044]  ? mutex_lock_nested+0x1b/0x30
[  253.015057]  amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu]
[  253.015249]  amdgpu_sriov_wreg+0xc5/0xd0 [amdgpu]
[  253.015435]  gmc_v11_0_flush_gpu_tlb+0x44b/0x4f0 [amdgpu]
[  253.015667]  gfx_v11_0_hw_init+0x499/0x29c0 [amdgpu]
[  253.015901]  ? __pfx_smu_v13_0_update_pcie_parameters+0x10/0x10 [amdgpu]
[  253.016159]  ? srso_alias_return_thunk+0x5/0xfbef5
[  253.016173]  ? smu_hw_init+0x18d/0x300 [amdgpu]
[  253.016403]  amdgpu_device_init+0x29ad/0x36a0 [amdgpu]
[  253.016614]  amdgpu_driver_load_kms+0x1a/0xc0 [amdgpu]
[  253.017057]  amdgpu_pci_probe+0x1c2/0x660 [amdgpu]
[  253.017493]  local_pci_probe+0x4b/0xb0
[  253.017746]  work_for_cpu_fn+0x1a/0x30
[  253.017995]  process_one_work+0x21e/0x680
[  253.018248]  worker_thread+0x190/0x330
[  253.018500]  ? __pfx_worker_thread+0x10/0x10
[  253.018746]  kthread+0xe7/0x120
[  253.018988]  ? __pfx_kthread+0x10/0x10
[  253.019231]  ret_from_fork+0x3c/0x60
[  253.019468]  ? __pfx_kthread+0x10/0x10
[  253.019701]  ret_from_fork_asm+0x1a/0x30
[  253.019939]  </TASK>

v2: s/spin_trylock/spin_lock_irqsave to be safe (Christian).

Fixes: e864180ee4 ("drm/amdgpu: Add lock around VF RLCG interface")
Cc: lin cao <lin.cao@amd.com>
Cc: Jingwen Chen <Jingwen.Chen2@amd.com>
Cc: Victor Skvortsov <victor.skvortsov@amd.com>
Cc: Zhigang Luo <zhigang.luo@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:14:38 -05:00
Taimur Hassan
71e59a4268 drm/amd/display: 3.2.321
Summary:

* Add support for disconnected eDP streams
* Add log for MALL entry on DCN32x
* Add DCC/Tiling reset helper for DCN and DCE
* Guard against setting dispclk low when active
* Other minor fixes

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:14:33 -05:00
Harry VanZyllDeJong
6571bef25f drm/amd/display: Add support for disconnected eDP streams
[Why]
eDP may not be connected to the GPU on driver start causing
fail enumeration.

[How]
Move the virtual signal type check before the eDP connector
signal check.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Harry VanZyllDeJong <hvanzyll@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:14:29 -05:00
Peichen Huang
73e686939c drm/amd/display: dpia should avoid encoder used by dp2
[WHY]
In current HPO DP2 implementation, driver would enable/disable DIG
encoder when configuring HPO DP2. Therefore, usb4 dp tunnelling should
not use the DIG encoder if the corresponded phy is used by a HPO DP2
stream.

[HOW]
A DP2 stream is treated as a dig stream.

Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:14:24 -05:00
Nicholas Kazlauskas
72d7a7fa1f drm/amd/display: Guard against setting dispclk low when active
[Why]
We should never apply a minimum dispclk value while in prepare_bandwidth
or while displays are active. This is always an optimization for when
all displays are disabled.

[How]
Defer dispclk optimization until safe_to_lower = true and display_count
reaches 0.

Since 0 has a special value in this logic (ie. no dispclk required)
we also need adjust the logic that clamps it for the actual request
to PMFW.

Reviewed-by: Gabe Teeger <gabe.teeger@amd.com>
Reviewed-by: Leo Chen <leo.chen@amd.com>
Reviewed-by: Syed Hassan <syed.hassan@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:14:19 -05:00
Ilya Bakoulin
07bc2dcbcf drm/amd/display: Fix BT2020 YCbCr limited/full range input
[Why]
BT2020 YCbCr input is not handled properly when full range
quantization is used and limited range is not supported at all.

[How]
- Add enums for BT2020 YCbCr limited/full range
- Add limited range CSC matrix

Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Robert Mader <robert.mader@collabora.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:14:10 -05:00
Aurabindo Pillai
9856893f75 drm/amd/display: Add log for MALL entry on DCN32x
[Why&How]
Add a dyndbg log entry to check whether the driver requested scanout
from MALL cache to PMFW via DMCUB

Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:13:55 -05:00
Oleh Kuzhylnyi
e619ac4191 drm/amd/display: Add total_num_dpps_required field to informative structure
[Why]
The informative structure needs to be extended by the total number of DPPs
required per each active plane.
The new informative field is going to be used as a statistical indicator.

[How]
The dml2_core_calcs_get_informative() routine must count a total number of DPPs.

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Oleh Kuzhylnyi <okuzhyln@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:13:51 -05:00
George Shen
de84d58012 drm/amd/display: Read LTTPR ALPM caps during link cap retrieval
[Why]
The latest DP spec requires the DP TX to read DPCD F0000h through F0009h
when detecting LTTPR capabilities for the first time.

[How]
Update LTTPR cap retrieval to read up to F0009h (two more bytes than the
previous F0007h), and store the LTTPR ALPM capabilities.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:13:45 -05:00
Alex Hung
5f7e384ab5 drm/amd/display: Print seamless boot message in mark_seamless_boot_stream
[WHAT & HOW]
Add a message so users know the stream will be used for seamless boot.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:13:34 -05:00
Rodrigo Siqueira
d27a1e93f2 drm/amd/display: Add clear DCC and Tiling callback for DCN
Introduce the DCC and Tiling reset callback to all DCN versions that can
call it.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:13:20 -05:00
Rodrigo Siqueira
c905aa6856 drm/amd/display: Rename panic function
Rename dc_plane_force_update_for_panic to
dc_plane_force_dcc_and_tiling_disable to describe the function operation
in the name. Also, this function might be used in other contexts, and a
more generic name can be helpful for this purpose.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:12:49 -05:00
Rodrigo Siqueira
098c9b58be drm/amd/display: Add DCC/Tiling reset helper for DCN and DCE
This commit introduces a function helper for resetting DCN/DCE DCC and
tiling. Those functions are generic for their respective DCN/DCE, so
they were added to the oldest version of each architecture.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:12:34 -05:00
Leo Zeng
8ae6dfc0b6 Revert "drm/amd/display: Request HW cursor on DCN3.2 with SubVP"
This reverts commit 13437c9160.

Reason to revert: idle power regression found in testing.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Leo Zeng <Leo.Zeng@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:11:34 -05:00
Harry Wentland
cbf4890c6f drm/amd/display: Don't treat wb connector as physical in create_validate_stream_for_sink
Don't try to operate on a drm_wb_connector as an amdgpu_dm_connector.
While dereferencing aconnector->base will "work" it's wrong and
might lead to unknown bad things. Just... don't.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:11:11 -05:00
Ovidiu Bunea
c488967488 drm/amd/display: Exit idle optimizations before accessing PHY
[why & how]
By default, DCN HW is in idle optimized state which does not allow access
to PHY registers. If BIOS powers up the DCN, it is fine because they will
power up everything. Only exit idle optimized state when not taking control
from VBIOS.

Fixes: be704e5ef4 ("Revert "drm/amd/display: Exit idle optimizations before attempt to access PHY"")
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19 15:08:56 -05:00
Nam Cao
690d59fee8 drm/amdgpu: Switch to use hrtimer_setup()
hrtimer_setup() takes the callback function pointer as argument and
initializes the timer completely.

Replace hrtimer_init() and the open coded initialization of
hrtimer::function with the new setup mechanism.

Patch was created by using Coccinelle.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/all/2e3664eebb00d3a8c786ee7cc1fba8096bababc9.1738746904.git.namcao@linutronix.de
2025-02-18 11:19:05 +01:00
Thomas Zimmermann
8bd1a8e757 Merge drm/drm-next into drm-misc-next
Backmerging to get bugfixes from v6.14-rc2.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-02-18 07:43:43 +01:00
Xiang Liu
f9d35b945c drm/amdgpu: Generate bad page threshold cper records
Generate CPER record when bad page threshold exceed and
commit to CPER ring.

v2: return -ENOMEM instead of false
v2: check return value of fill section function

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:30 -05:00
Xiang Liu
4058e7cbfd drm/amdgpu: Commit CPER entry
Commit the CPER entry to the ring buffer.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:30 -05:00
Tao Zhou
8652920d2c drm/amdgpu: add mutex lock for cper ring
Avoid the confliction between read and write of ring buffer.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:30 -05:00
Asad Kamal
b1118df145 drm/amd/pm: Limit jpeg rings as per max for jpeg_v_4_0_3
Since pmfw supports for smuv13_0_6 is limited to 8 jpeg rings per instance,
which is the max for jpeg_v_4_0_3. Limit it to same to avoid out
of bound access.

Fixes: 568199a5c7 ("drm/amd/pm: Limit to 8 jpeg rings per instance")
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:30 -05:00
Tao Zhou
a6d9d19290 drm/amdgpu: add data write function for CPER ring
Old CPER data will be overwritten if ring buffer is full, and read
pointer always points to CPER header.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:30 -05:00
Tao Zhou
5a14282429 drm/amdgpu: read CPER ring via debugfs
We read CPER data from read pointer to write pointer without changing
the pointers.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Tao Zhou
4d614ce8ff drm/amdgpu: add RAS CPER ring buffer
And initialize it, this is a pure software ring to store RAS CPER data.

v2: change ring size to 0x100000
v2: update the initialization of count_dw of cper ring, it's dword
variable
v3: skip VM inv eng for cper
v3: init/fini when aca enabled

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Xiang Liu
b3060f5bea drm/amdgpu: Get timestamp from system time
Get system local time and encode it to timestamp for CPER.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Alex Deucher
f3e10e1a0c drm/amdgpu/mes12: allocate hw_resource_1 buffer once
Allocate the buffer at sw init time so we don't alloc
and free it for every suspend/resume or reset cycle.

Reviewed-by: Shaoyun.liu <shaouyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Alex Deucher
13d68ae651 drm/amdgpu/mes11: allocate hw_resource_1 buffer once
Allocate the buffer at sw init time so we don't alloc
and free it for every suspend/resume or reset cycle.

Reviewed-by: Shaoyun.liu <shaouyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Nathan Chancellor
196222dccb drm/amd/display: Reapply 2fde4fdddc
Commit 2563391e57 ("drm/amd/display: DML2.1 resynchronization") blew
away the compiler warning fix from commit 2fde4fdddc
("drm/amd/display: Avoid -Wenum-float-conversion in
add_margin_and_round_to_dfs_grainularity()"), causing the warning to
reappear.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c:183:58: error: arithmetic between enumeration type 'enum dentist_divider_range' and floating-point type 'double' [-Werror,-Wenum-float-conversion]
    183 |         divider = (unsigned int)(DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
        |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

Apply the fix again to resolve the warning.

Fixes: 1b30456150 ("drm/amd/display: DML21 Reintegration")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Hawking Zhang
652e090230 drm/amdgpu: Generate cper records
Encode the error information in CPER format and commit
to the cper ring

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <keivnyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Philip Yang
e7a477735f drm/amdkfd: Fix user queue validation on Gfx7/8
To workaround queue full h/w issue on Gfx7/8, when application create
AQL queue, the ring buffer bo allocate size is queue_size/2 and
map queue_size ring buffer to GPU in 2 pieces using 2 attachments, each
attachment map size is queue_size/2, with same ring_bo backing memory.

For Gfx7/8, user queue buffer validation should use queue_size/2 to
verify ring_bo allocation and mapping size.

Fixes: 68e599db7a ("drm/amdkfd: Validate user queue buffers")
Suggested-by: Tomáš Trnka <trnka@scm.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Hawking Zhang
ad97840f95 drm/amdgpu: Introduce funcs for generating cper record
Introduce new functions that are used to generate
cper ue or ce records.

v2: return -ENOMEM instead of false
v2: check return value of fill section function

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Yang Wang <keivnyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Hawking Zhang
56316ee91b drm/amdgpu: Include ACA error type in aca bank
ACA error types managed by driver a direct 1:1
correspondence with those managed by firmware.

To address this, for each ACA bank, include
both the ACA error type and the ACA SMU type.

This addition is useful for creating CPER records.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <keivnyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Candice Li
76b1f8b32d drm/amdgpu: Optimize the enablement of GECC
Enable GECC only when the default memory ECC mode or
the module parameter amdgpu_ras_enable is activated.

v2: Add kernel message to remind users explicitly set
    amdgpu_ras_enable=1 before driver loading to enable GECC
    and set amdgpu_ras_enable=0 to disable GECC when GECC is
    currently enabled if needed.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Hawking Zhang
92d5d2a09d drm/amdgpu: Introduce funcs for populating CPER
Introduce utility functions designed to assist
in populating CPER records.

v2: call cper_init/fini in device_ip_init/fini.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:29 -05:00
Hawking Zhang
523b69c654 drm/amd/include: Add amd cper header
AMD is using Common Platform Error Record (CPER) format
to report all gpu hardware errors.

v2: add program attribute

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:28 -05:00
Srinivasan Shanmugam
2012aff981 drm/amdgpu: Rename VCN clock gating function for consistency
Change the function name from vcn_v2_5_enable_clock_gating_inst
to vcn_v2_5_enable_clock_gating to ensure consistency in naming.

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:781: warning: expecting prototype for vcn_v2_5_enable_clock_gating_inst(). Prototype was for vcn_v2_5_enable_clock_gating() instead

Cc: Leo Liu <leo.liu@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:28 -05:00
Alex Deucher
eda80f1c2a drm/amdgpu/vcn4.0.3: drop dpm power helpers
VCN 4.0.3 doesn't support powergating so there is
no need to call these.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:25 -05:00
Alex Deucher
7780239809 drm/amdgpu/vcn5.0.1: drop dpm power helpers
VCN 5.0.1 doesn't support powergating so there is
no need to call these.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:18 -05:00
Alex Deucher
0487f50310 drm/amdgpu/vcn5.0.1: use correct dpm helper
The VCN and UVD helpers were split in
commit ff69bba05f ("drm/amd/pm: add inst to dpm_set_powergating_by_smu")
However, this happened in parallel to the vcn 5.0.1
development so it was missed there.

Fixes: 346492f30c ("drm/amdgpu: Add VCN_5_0_1 support")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Sonny Jiang <sonjiang@amd.com>
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
2025-02-17 14:09:11 -05:00
Alex Deucher
56763be400 drm/amdgpu/umsch: tidy up the ucode name string handling
Make the constant parts of the name part of the string
we pass to amdgpu_ucode_request().  Only the version
number varies from IP to IP.

Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lang Yu <Lang.Yu@amd.com>
2025-02-17 14:09:02 -05:00
Alex Deucher
c917e39cbd drm/amdgpu/umsch: fix ucode check
Return an error if the IP version doesn't match otherwise
we end up passing a NULL string to amdgpu_ucode_request.
We should never hit this in practice today since we only
enable the umsch code on the supported IP versions, but
add a check to be safe.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502130406.iWQ0eBug-lkp@intel.com/
Fixes: 020620424b ("drm/amd: Use a constant format string for amdgpu_ucode_request")
Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Lang Yu <Lang.Yu@amd.com>
2025-02-17 14:08:53 -05:00
Amber Lin
5183e69090 drm/amdgpu: Remove extra checks for CPX
As far as the number of XCCs, the number of compute partitions, and the
number of memory partitions qualify, CPX is valid.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:08:49 -05:00
Alex Deucher
fe652becdb drm/amdgpu/umsch: declare umsch firmware
Needed to be properly picked up for the initrd, etc.

Fixes: 3488c79bea ("drm/amdgpu: add initial support for UMSCH")
Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lang Yu <Lang.Yu@amd.com>
2025-02-17 14:08:37 -05:00
Alex Deucher
80513e3897 drm/amdgpu/gfx: only call mes for enforce isolation if supported
This should not be called on chips without MES so check if
MES is enabled and if the cleaner shader is supported.

Fixes: 8521e3c5f0 ("drm/amd/amdgpu: limit single process inside MES")
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
2025-02-17 14:08:28 -05:00
Sathishkumar S
500c04d2a7 drm/amdgpu: Add ring reset callback for JPEG2_0_0
Add ring reset function callback for JPEG2_0_0 to
recover from job timeouts without a full gpu reset.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:08:28 -05:00
Sathishkumar S
09e24a0b52 drm/amdgpu: Add ring reset callback for JPEG2_5_0
Add ring reset function callback for JPEG2_5_0 to
recover from job timeouts without a full gpu reset.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:08:28 -05:00
Sathishkumar S
cb493aee4d drm/amdgpu: Per-instance init func for JPEG2_5_0
Add helper functions to handle per-instance initialization
and deinitialization in JPEG2_5_0.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:08:28 -05:00
Sathishkumar S
03399d0bff drm/amdgpu: Add ring reset callback for JPEG3_0_0
Add ring reset function callback for JPEG3_0_0 to
recover from job timeouts without a full gpu reset.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:08:27 -05:00
Sathishkumar S
74894ffc7d drm/amdgpu: Add ring reset callback for JPEG4_0_0
Add ring reset function callback for JPEG4_0_0 to
recover from job timeouts without a full gpu reset.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:08:27 -05:00
Sathishkumar S
abce7b4fc7 drm/amdgpu: Per-instance init func for JPEG4_0_3
Add helper functions to handle per-instance and per-core
initialization and deinitialization in JPEG4_0_3.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:08:27 -05:00
Yang Wang
8c66312345 drm/amdgpu: refine smu send msg debug log format
remove unnecessary line breaks.

[   51.280860] amdgpu 0000:24:00.0: amdgpu: smu send message: GetEnabledSmuFeaturesHigh(13) param: 0x00000000, resp: 0x00000001,                        readval: 0x00003763

Fixes: 0cd2bc06de ("drm/amd/pm: enable amdgpu smu send message log")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:08:27 -05:00
Saleemkhan Jamadar
b0bebbe4ea drm/amdgpu/umsch: remove vpe test from umsch
current test is more intrusive for user queue test

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Suggested-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:08:27 -05:00
Candice Li
59af05d6a3 drm/amdgpu: Enable ACA by default for psp v13_0_12
Enable ACA by default for psp v13_0_12.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:08:27 -05:00
Dave Airlie
0ed1356af8 drm-misc-next for v6.15:
UAPI Changes:
 
 fourcc:
 - Add modifiers for MediaTek tiled formats
 
 Cross-subsystem Changes:
 
 bus:
 - mhi: Enable image transfer via BHIe in PBL
 
 dma-buf:
 - Add fast-path for single-fence merging
 
 Core Changes:
 
 atomic helper:
 - Allow full modeset on connector changes
 - Clarify semantics of allow_modeset
 - Clarify semantics of drm_atomic_helper_check()
 
 buddy allocator:
 - Fix multi-root cleanup
 
 ci:
 - Update IGT
 
 display:
 - dp: Support Extendeds Wake Timeout
 - dp_mst: Fix RAD-to-string conversion
 
 panic:
 - Encode QR code according to Fido 2.2
 
 probe helper:
 - Cleanups
 
 scheduler:
 - Cleanups
 
 ttm:
 - Refactor pool-allocation code
 - Cleanups
 
 Driver Changes:
 
 amdxdma:
 - Fix error handling
 - Cleanups
 
 ast:
 - Refactor detection of transmitter chips
 - Refactor support of VBIOS display-mode handling
 - astdp: Fix connection status; Filter unsupported display modes
 
 bridge:
 - adv7511: Report correct capabilities
 - it6505: Fix HDCP V compare
 - sn65dsi86: Fix device IDs
 - Cleanups
 
 i915:
 - Enable Extendeds Wake Timeout
 
 imagination:
 - Check job dependencies with DRM-sched helper
 
 ivpu:
 - Improve command-queue handling
 - Use workqueue for IRQ handling
 - Add suport for HW fault injection
 - Locking fixes
 - Cleanups
 
 mgag200:
 - Add support for G200eH5 chips
 
 msm:
 - dpu: Add concurrent writeback support for DPU 10.x+
 
 nouveau:
 - Move drm_slave_encoder interface into driver
 - nvkm: Refactor GSP RPC
 
 omapdrm:
 - Cleanups
 
 panel:
 - Convert several panels to multi-style functions to improve error
   handling
 - edp: Add support for B140UAN04.4, BOE NV140FHM-NZ, CSW MNB601LS1-3,
   LG LP079QX1-SP0V, MNE007QS3-7, STA 116QHD024002, Starry 116KHD024006,
   Lenovo T14s Gen6 Snapdragon
 - himax-hx83102: Add support for CSOT PNA957QT1-1, Kingdisplay
   kd110n11-51ie, Starry 2082109qfh040022-50e
 
 panthor:
 - Expose sizes of intenral BOs via fdinfo
 - Fix race between reset and suspend
 - Cleanups
 
 qaic:
 - Add support for AIC200
 - Cleanups
 
 renesas:
 - Fix limits in DT bindings
 
 rockchip:
 - rk3576: Add HDMI support
 - vop2: Add new display modes on RK3588 HDMI0 up to 4K
 - Don't change HDMI reference clock rate
 - Fix DT bindings
 
 solomon:
 - Set SPI device table to silence warnings
 - Fix pixel and scanline encoding
 
 v3d:
 - Cleanups
 
 vc4:
 - Use drm_exec
 - Use dma-resv for wait-BO ioctl
 - Remove seqno infrastructure
 
 virtgpu:
 - Support partial mappings of GEM objects
 - Reserve VGA resources during initialization
 - Fix UAF in virtgpu_dma_buf_free_obj()
 - Add panic support
 
 vkms:
 - Switch to a managed modesetting pipeline
 - Add support for ARGB8888
 
 xlnx:
 - Set correct DMA segment size
 - Fix error handling
 - Fix docs
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmesYp4ACgkQaA3BHVML
 eiP0agf/ZsUXdVdJ3HMI3vAMbiUnW8ThQZx0M3Zkpf3EOCKgdibbjNkhgTzsp3b/
 GqcP/pmqwmjCUgVADVDVJvRTD1rSadcxCYYTwI9fsjnXWmib9l4sjI3jPRJkIekj
 a5eeCFnMEKzfaxhFPJ/nu+esMZ19ZRE1iSO7WC7YC0SSR6zYP/QaNkGti8QA5TpL
 oBHd4jctoxXw9sw/ACEwzsSRdxL/opXQ5KbFJFmr2Q9/osaTUX7QtslDhnh+2lVK
 ZGxGdFIvFYKB+QrXpfjJU51Q01gqp/PWS+YIDrpr7bLC5YeFQVe3FDjqYIXSoE2t
 dFYtU2mSS/rwijcRc64sghXCdbKS1w==
 =iJGz
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2025-02-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v6.15:

UAPI Changes:

fourcc:
- Add modifiers for MediaTek tiled formats

Cross-subsystem Changes:

bus:
- mhi: Enable image transfer via BHIe in PBL

dma-buf:
- Add fast-path for single-fence merging

Core Changes:

atomic helper:
- Allow full modeset on connector changes
- Clarify semantics of allow_modeset
- Clarify semantics of drm_atomic_helper_check()

buddy allocator:
- Fix multi-root cleanup

ci:
- Update IGT

display:
- dp: Support Extendeds Wake Timeout
- dp_mst: Fix RAD-to-string conversion

panic:
- Encode QR code according to Fido 2.2

probe helper:
- Cleanups

scheduler:
- Cleanups

ttm:
- Refactor pool-allocation code
- Cleanups

Driver Changes:

amdxdma:
- Fix error handling
- Cleanups

ast:
- Refactor detection of transmitter chips
- Refactor support of VBIOS display-mode handling
- astdp: Fix connection status; Filter unsupported display modes

bridge:
- adv7511: Report correct capabilities
- it6505: Fix HDCP V compare
- sn65dsi86: Fix device IDs
- Cleanups

i915:
- Enable Extendeds Wake Timeout

imagination:
- Check job dependencies with DRM-sched helper

ivpu:
- Improve command-queue handling
- Use workqueue for IRQ handling
- Add suport for HW fault injection
- Locking fixes
- Cleanups

mgag200:
- Add support for G200eH5 chips

msm:
- dpu: Add concurrent writeback support for DPU 10.x+

nouveau:
- Move drm_slave_encoder interface into driver
- nvkm: Refactor GSP RPC

omapdrm:
- Cleanups

panel:
- Convert several panels to multi-style functions to improve error
  handling
- edp: Add support for B140UAN04.4, BOE NV140FHM-NZ, CSW MNB601LS1-3,
  LG LP079QX1-SP0V, MNE007QS3-7, STA 116QHD024002, Starry 116KHD024006,
  Lenovo T14s Gen6 Snapdragon
- himax-hx83102: Add support for CSOT PNA957QT1-1, Kingdisplay
  kd110n11-51ie, Starry 2082109qfh040022-50e

panthor:
- Expose sizes of intenral BOs via fdinfo
- Fix race between reset and suspend
- Cleanups

qaic:
- Add support for AIC200
- Cleanups

renesas:
- Fix limits in DT bindings

rockchip:
- rk3576: Add HDMI support
- vop2: Add new display modes on RK3588 HDMI0 up to 4K
- Don't change HDMI reference clock rate
- Fix DT bindings

solomon:
- Set SPI device table to silence warnings
- Fix pixel and scanline encoding

v3d:
- Cleanups

vc4:
- Use drm_exec
- Use dma-resv for wait-BO ioctl
- Remove seqno infrastructure

virtgpu:
- Support partial mappings of GEM objects
- Reserve VGA resources during initialization
- Fix UAF in virtgpu_dma_buf_free_obj()
- Add panic support

vkms:
- Switch to a managed modesetting pipeline
- Add support for ARGB8888

xlnx:
- Set correct DMA segment size
- Fix error handling
- Fix docs

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20250212090625.GA24865@linux.fritz.box
2025-02-14 10:24:02 +10:00
André Almeida
41129e236f drm/amdgpu: Enable async flip on overlay planes
amdgpu can handle async flips on overlay planes, so allow it for atomic
async checks.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250127-tonyk-async_flip-v12-2-0f7f8a8610d3@igalia.com
[DB: fixed checkpatch warning by adding braces]
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2025-02-14 00:54:29 +02:00
André Almeida
6fe52b63f5
drm/amdgpu: Use device wedged event
Use DRM's device wedged event to notify userspace that a reset had
happened. For now, only use `none` method meant for telemetry
capture.

In the future we might want to report a recovery method if the reset didn't
succeed.

Acked-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250204070528.1919158-6-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-13 12:15:44 -05:00
Alex Deucher
7845438718 drm/amdgpu/mes: Add cleaner shader fence address handling in MES for GFX12
This commit introduces enhancements to the handling of the cleaner
shader fence in the AMDGPU MES driver:

- The MES (Microcode Execution Scheduler) now sends a PM4 packet to the
  KIQ (Kernel Interface Queue) to request the cleaner shader, ensuring
  that requests are handled in a controlled manner and avoiding the
  race conditions.
- The CP (Compute Processor) firmware has been updated to use a private
  bus for accessing specific registers, avoiding unnecessary operations
  that could lead to issues in VF (Virtual Function) mode.
- The cleaner shader fence memory address is now set correctly in the
  `mes_set_hw_res_pkt` structure, allowing for proper synchronization of
  the cleaner shader execution.

Cc: Christian König <christian.koenig@amd.com>
Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Shaoyun Liu <shaoyun.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:06:37 -05:00
Xiaogang Chen
10e08943ca drm/amdkfd: Fix pasid value leak
Curret kfd does not allocate pasid values, instead uses pasid value for each
vm from graphic driver. So should not prevent graphic driver from releasing
pasid values since the values are allocated by graphic driver, not kfd driver
anymore. This patch does not stop graphic driver release pasid values.

Fixes: 8544374c0f ("drm/amdkfd: Have kfd driver use same PASID values from graphic driver")
Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:50 -05:00
Shaoyun Liu
15d8c92f10 drm/amd/include : Update MES v12 API for fence update
MES fence_value will be updated in fence_addr if API success,
otherwise upper 32 bit will be used to indicate error code.
In any case, MES will trigger an EOP interrupt with 0xb1 as
context id in the interrupt cookie

Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:50 -05:00
Saleemkhan Jamadar
4b9a3117bb drm/amdgpu/vcn: enable TMZ support for vcn 4_0_5
TMZ support is enabled for vcn on GC IP 11_5_0

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:50 -05:00
Asad Kamal
87d8232f0f drm/amd/pm: Rename pmfw message SetPstatePolicy
Rename pmfw message SelectPstatePolicy to SetThrottlingPolicy as per
pmfw interface header for smu_v_13_0_6

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:49 -05:00
Srinivasan Shanmugam
be2560e4b8 drm/amdgpu/mes: Add cleaner shader fence address handling in MES for GFX11
This commit introduces enhancements to the handling of the cleaner
shader fence in the AMDGPU MES driver:

- The MES (Microcode Execution Scheduler) now sends a PM4 packet to the
  KIQ (Kernel Interface Queue) to request the cleaner shader, ensuring
  that requests are handled in a controlled manner and avoiding the
  race conditions.
- The CP (Compute Processor) firmware has been updated to use a private
  bus for accessing specific registers, avoiding unnecessary operations
  that could lead to issues in VF (Virtual Function) mode.
- The cleaner shader fence memory address is now set correctly in the
  `mes_set_hw_res_pkt` structure, allowing for proper synchronization of
  the cleaner shader execution.

Cc: lin cao <lin.cao@amd.com>
Cc: Jingwen Chen <Jingwen.Chen2@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Shaoyun Liu <shaoyun.liu@amd.com>
Reviewed by: Shaoyun.liu  <Shaoyun.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:49 -05:00
Ying Li
16a5a8fe6f drm/amd/amdgpu: add support for IP version 11.5.2
This initializes drm/amd/amdgpu version 11.5.2

Signed-off-by: YING LI <yingli12@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:49 -05:00
Ying Li
ee9e64549f drm/amd/pm: add support for IP version 11.5.2
This initializes drm/amd/pm version 11.5.2

Signed-off-by: YING LI <yingli12@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:49 -05:00
Philip Yang
23b645231e drm/amdgpu: Unlocked unmap only clear page table leaves
SVM migration unmap pages from GPU and then update mapping to GPU to
recover page fault. Currently unmap clears the PDE entry for range
length >= huge page and free PTB bo, update mapping to alloc new PT bo.
There is race bug that the freed entry bo maybe still on the pt_free
list, reused when updating mapping and then freed, leave invalid PDE
entry and cause GPU page fault.

By setting the update to clear only one PDE entry or clear PTB, to
avoid unmap to free PTE bo. This fixes the race bug and improve the
unmap and map to GPU performance. Update mapping to huge page will
still free the PTB bo.

With this change, the vm->pt_freed list and work is not needed. Add
WARN_ON(unlocked) in amdgpu_vm_pt_free_dfs to catch if unmap to free the
PTB.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:49 -05:00
Alex Deucher
1350dd3691 drm/amdgpu/mes11: fix set_hw_resources_1 calculation
It's GPU page size not CPU page size.  In most cases they
are the same, but not always.  This can lead to overallocation
on systems with larger pages.

Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:49 -05:00
Eric Huang
5ffd56822a drm/amdkfd: fix missing L2 cache info in topology
In some ASICs L2 cache info may miss in kfd topology,
because the first bitmap may be empty, that means
the first cu may be inactive, so to find the first
active cu will solve the issue.

v2: Only find the first active cu in the first xcc

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:49 -05:00
Alex Deucher
ebc25499de drm/amdgpu/vcn2.5: split code along instances
Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:49 -05:00
Taimur Hassan
53472eeb22 drm/amd/display: 3.2.320
Summary:

* Start enabling support for 4-plane MPO
* DML21 Updates
* SPL Updates
* Other minor fixes

Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:49 -05:00
Harish Kasiviswanathan
3394b1f76d drm/amdgpu: Set snoop bit for SDMA for MI series
SDMA writes has to probe invalidate RW lines. Set snoop bit in mmhub for
this to happen.

v2: Missed a few mmhub_v9_4. Added now.
v3: Calculate hub offset once since it doesn't change inside the loop
    Modified function names based on review comments.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:49 -05:00
Samson Tam
53b2e0c24a drm/amd/display: sspl: cleanup filter code
[Why & How]
Remove unused filters and functions
Add static to limit scope

Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Jun Lei <jun.lei@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:05:48 -05:00
Aurabindo Pillai
8f87447a8e drm/amd/display: Make dcn401_program_pipe non static
Allow reuse of code by making dcn401_program_pipe()
non static.

Fixes: 2739bd1237 ("drm/amd/display: Allow reuse of of DCN4x code")
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Karthi Kandasamy <karthi.kandasamy@amd.com>
Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:10 -05:00
Charlene Liu
b40d022ec0 drm/amd/display: pass calculated dram_speed_mts to dml2
[why]
currently dml2 is using a hard coded 16 to convert memclk to dram_speed_mts.
for apu, this depends on wck_ratio.

change to pass the already calculated dram_speed_mts from fpu to dml2.

v2: use existing calculation of dram_speed_mts for now to avoid regression

Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:10 -05:00
Brendan Tam
51d1b33854 drm/amd/display: add workaround flag to link to force FFE preset
[Why]
There have been instances of some monitors being unable to link train on
their reported link speed using their selected FFE preset. If a different
FFE preset is found that has a higher rate of success during link training
this workaround can be used to force its FFE preset.

[How]
A new link workaround flag is made called force_dp_ffe_preset. The flag is
checked in override_training_settings and will set lt_settings->ffe_preset
which is null if the flag is not set. The flag is then set in
override_lane_settings.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Brendan Tam <Brendan.Tam@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Samson Tam
5a20ca32a2 drm/amd/display: add s1_12 filter tables
[Why & How]
Instead of converting tables from s1_10 to s1_12, add s1_12 tables instead.
Remove init calls that do the conversion. Add APIs to read s1_10 tables

Reviewed-by: Navid Assadian <navid.assadian@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Ausef Yousof
c36d7948bb drm/amd/display: limit coverage of optimization skip
[why&how]
causing some regression on dgpu which still needs the
pre-emptive return, limit this to reporter asic version
it is simple to include
different dcn versions from this point forward, each dcn
resource is initialized with the flag and can be enabled
at will

Reviewed-by: Chris Park <chris.park@amd.com>
Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Leo Zeng
b474a6e11f drm/amd/display: add new IRQ enum for underflows
[WHY & HOW]
needed in certain scenarios for debugging and logging

Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Leo Zeng <Leo.Zeng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Samson Tam
2a4519c4e9 drm/amd/display: remove TF check for LLS policy
[Why & How]
LLS policy not affected by TF.
Remove check in don't care case and use
 pixel format only.

Reviewed-by: Navid Assadian <navid.assadian@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Samson Tam
8e539d2dd2 drm/amd/display: use s1_12 filter tables in SPL
[Why & How]
Instead of converting tables from s1_10 to s1_12,
 added s1_12 tables instead in SPL
Remove init calls that do the conversion

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Austin Zheng
1b30456150 drm/amd/display: DML21 Reintegration
For various fixes to mcache_row_bytes calculation.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Ilya Bakoulin
e8bffa52e0 drm/amd/display: Don't try AUX transactions on disconnected link
[Why]
Setting link DPMS off in response to HPD disconnect creates AUX
transactions on a link that is supposed to be disconnected. This can
cause issues in some cases when the sink re-asserts HPD and expects
source to re-enable the link.

[How]
Avoid AUX transactions on disconnected link.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Zaeem Mohamed
fed4c27537 drm/amd/display: docstring definitions MAX_SURFACES and MAX_PLANES
MAX_SURFACES and MAX_PLANES now have docstrings that better show the difference between the two.

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Zaeem Mohamed
14d7ca5273 drm/amd/display: Expose 3 secondary planes for supported ASICs
[why]
For enabling 4-plane MPO, we need dc to expose 4 planes for DCN35 and
beyond, as well as DCN21

[how]
Set dc_caps.max_slave_*planes to 3 for appropriate ASICs

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Tim Huang
76e0410fe0 drm/amdgpu: add discovery support for DCN IP version 3.6.0
Add discovery entry for DCN IP version 3.6.0.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Mario Limonciello
0a0bd4f95c drm/amd: Refactor find_system_memory()
find_system_memory() pulls out two fields from an SMBIOS type 17
device and sets them on KFD devices. The data offsets are counted
to find interesting data.

Instead use a struct representation to access the members and pull
out the two specific fields.

No intended functional changes.

Link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.8.0.pdf p99
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Link: https://lore.kernel.org/r/20250206214847.3334595-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:09 -05:00
Jiang Liu
e92f3f94ca drm/amdgpu: reset psp->cmd to NULL after releasing the buffer
Reset psp->cmd to NULL after releasing the buffer in function psp_sw_fini().

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Asad Kamal
aafe181f7d drm/amdgpu: Add flags to distinguish vf/pf/pt mode
Add extra flag definition for ids_flag field to distinguish
between vf/pf/pt modes

v2: Updated kms driver minor version & removed pf check as default is 0
v3: Fix up version (Alex)
v4: rebase (Alex)

Proposed userspace:
e663bed7d6

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Alex Deucher
759e764f7d drm/amdkfd: use GTT for VRAM on APUs only if GTT is larger
If the user has configured a large carveout on a small APU,
only use GTT for VRAM allocations if GTT is larger than
VRAM.

v2: fix reversed check (Philip)

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Alex Deucher
8b0d068e7d drm/amdkfd: add a new flag to manage where VRAM allocations go
On big and small APUs we send KFD VRAM allocations to GTT
since the carve out is either non-existent or relatively
small.  However, if someone sets the carve out size to be
relatively large, we may end up using GTT rather than VRAM.

No change of logic with this patch, but it allows the
driver to determine which logic to use based on the
carve out size in the future.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Lijo Lazar
cc0e91a755 drm/amdgpu: Make VBIOS image read optional
Keep VBIOS image read optional for select SOCs in passthrough mode.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Lijo Lazar
6e8ca38ebc drm/amdgpu: Add flag to make VBIOS read optional
Certain SOCs may not need much data from VBIOS. Some data like VBIOS
version used will be missed but it doesn't affect functionality. Add a
flag to make VBIOS image optional.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Lijo Lazar
7e0aa70681 drm/amdgpu: Add VBIOS flags
Instead of read_bios, use get_bios_flags to get various options around
reading VBIOS.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Lijo Lazar
e986e89659 drm/amdgpu: Add wrapper for freeing vbios memory
Use bios_release wrapper to release memory allocated for vbios image and
reset the variables.

v2:
	Use the same wrapper for clean up in sw_fini (Alex Deucher)

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Wayne Lin
1846a3472f drm/amd/display: Add DCN36 DM Support
Add DM handling for DCN36.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Wayne Lin
4bc8f12db2 drm/amd/display: Add DCN36 CORE
Add DCN36 support in dc_resource.c.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Wayne Lin
9b7d816f09 drm/amd/display: Support DCN36 HDCP
Add case in hdcp_create_workqueue() to support HDCP on DCN36 as well.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Wayne Lin
23577b3a15 drm/amd/display: Support DCN36 DSC
Add case on clean_up_dsc_blocks() to support DCN36 as well.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:08 -05:00
Wayne Lin
c5dd47d9e6 drm/amd/display: Add DCN36 DMCUB
DMCU-B (Display Micro-Controller Unit B) is a display microcontroller
used for shared display functionality with BIOS and for advanced
power saving display features.

Add case to support DCN3.6 as well.

V2: adjust copyright license text

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:07 -05:00
Wayne Lin
8cb06693bc drm/amd/display: Add DCN36 DML2 support
Enable DML2 for DCN36.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:07 -05:00
Wayne Lin
4bcba9844b drm/amd/display: Add DCN36 GPIO
Add DCN36 support in GPIO.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:07 -05:00
Wayne Lin
9ae42f6120 drm/amd/display: Add DCN36 Resource
Add resource handling for DCN36.

V2: adjust copyright license text
V3: remove unnecessary headers

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:07 -05:00
Wayne Lin
76e3b62db9 drm/amd/display: Add DCN36 IRQ
Add IRQ services for DCN36.
This allows us to create/init and manage irqs for DCN3

V2: adjust copyright license text

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:07 -05:00
Wayne Lin
02efc0a780 drm/amd/display: Add DCN36 BIOS command table support
Add case for DCN36 in command_table_helper2.c.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:07 -05:00
Wayne Lin
59f79d83fc drm/amd/display: Add DCN36 version identifiers
Add DCN3.6 asic identifiers.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:07 -05:00
Wayne Lin
9b194af117 drm/amd/display: Add dcn36 register header files
[Why & How]
Add register headers for DCN36.

V2: adjust copyright license text

Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:07 -05:00
Alex Deucher
15f00b073c drm/amdgpu/gfx9: use amdgpu_gfx_off_ctrl_immediate() for PG
Use amdgpu_gfx_off_ctrl_immediate() when powergating.
There's no need for the delay in gfx off allow.  The
powergating is dynamically disabled/enabled as for
RV/PCO on compute queues and allowing gfx off again as
soon the job is submitted improves power savings.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Suggested-by: Błażej Szczygieł <mumei6102@gmail.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3861
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:07 -05:00
Alex Deucher
250d9769ee drm/amdgpu/gfx: add amdgpu_gfx_off_ctrl_immediate()
Same as amdgpu_gfx_off_ctrl(), but without the delay
for gfxoff disallow.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Suggested-by: Błażej Szczygieł <mumei6102@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:07 -05:00
Shaoyun Liu
1c687c0da9 drm/amd/include : MES v11 and v12 API header update
MES requires driver set cleaner_shader_fence_mc_addr
for cleaner shader support.

Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:06 -05:00
Lijo Lazar
a53cbd9e6f drm/amd/pm: Remove unnecessary device state checks
For amdgpu_get_pp_force_state, amdgpu_get_pp_cur_state already takes
care of device state check. In other cases, values are returned from
driver cached variables and are not dependent on device state.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <feifei.xu@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:06 -05:00
Lijo Lazar
543f6e7163 drm/amd/pm: Fix get_if_active usage
If a device supports runtime pm, then pm_runtime_get_if_active returns 0
if a device is not active and 1 if already active. However, if a device
doesn't support runtime pm, the API returns -EINVAL. A device not
supporting runtime pm implies it's not affected by runtime pm and it's
active. Hence no need to get() to increment usage count. Remove < 0
return value check. Also, ignore runpm state to determine active status.
If the device is already in suspend state, disallow access.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <feifei.xu@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:06 -05:00
Lijo Lazar
55aa33c3fe drm/amd/pm: Add APIs for device access checks
Wrap the checks before device access in helper functions and use them
for device access. The generic order of APIs now is to do input argument
validation first and check if device access is allowed.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <feifei.xu@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:06 -05:00
Lijo Lazar
a5219b41dd drm/amdgpu: Clean up atom header file inclusion
atom bios header files are not required in these files.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:06 -05:00
Alex Deucher
abab978127 drm/amdgpu/sdma4: drop gfxoff calls in dump ip state
SDMA 4.x is not part of the GFX power domain so this is
not necessary.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:04:06 -05:00
Alex Hung
3e7ef261d3 drm/amd/display: Replace pr_info in dc_validate_boot_timing()
Use DC_LOG_DEBUG instead of pr_info to match other uses in dc.c.

Fixes: 091e301c2b ("drm/amd/display: Add debug messages for dc_validate_boot_timing()")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:03 -05:00
Dr. David Alan Gilbert
b0fce908cf drm/amd/display: Remove unused link_enc_cfg_get_link_enc_used_by_stream
link_enc_cfg_get_link_enc_used_by_stream() is no longer used after
2021's:
commit 6366b00346 ("drm/amd/display: Maintain consistent mode of
operation during encoder assignment")
which introduces and uses the _current version instead.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:03 -05:00
Dr. David Alan Gilbert
6d4e03d0b1 drm/amd/display: Remove unused get_max_support_fbc_buffersize
get_max_support_fbc_buffersize() is unused since 2021's
commit 94f0d0c80c ("drm/amd/display/dc/dce110/dce110_compressor: Remove
unused function 'dce110_get_required_compressed_surfacesize")
removed it's only caller.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:03 -05:00
Dr. David Alan Gilbert
9ab737f3ae drm/amd/display: Remove unused hubbub1_toggle_watermark_change_req
hubbub1_toggle_watermark_change_req() last use was removed in 2017 by
commit b8fce2c9d7 ("drm/amd/display: Optimize programming front end")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:03 -05:00
Dr. David Alan Gilbert
6d04e9785c drm/amd/display: Remove unused get_clock_requirements_for_state
get_clock_requirements_for_state() was added in 2018 by
commit 8ab2180f96 ("drm/amd/display: Add function to fetch clock
requirements")
but never used.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Dr. David Alan Gilbert
fa88342931 drm/amd/display: Remove unused dc_stream_get_crtc_position
The last user of dc_stream_get_crtc_position() was
mod_freesync_get_v_position() which is removed in a previous
patch in this series.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Dr. David Alan Gilbert
2d5e8a8997 drm/amd/display: Remove unused freesync functions
mod_freesync_get_vmin_vmax() and mod_freesync_get_v_position() were
added in 2017 by
commit 72ada5f769 ("drm/amd/display: FreeSync Auto Sweep Support")

mod_freesync_is_valid_range() was added in 2018 by
commit e80e944608 ("drm/amd/display: add method to check for supported
range")

mod_freesync_get_settings() was added in 2018 by
commit a3e1737ed6 ("drm/amd/display: Implement stats logging")

and
mod_freesync_calc_field_rate_from_timing() was added in 2020 by
commit 49c70ece54 ("drm/amd/display: Change input parameter for
set_drr")

None of these have been used.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Dr. David Alan Gilbert
3bd202b3c4 drm/amd/display: Remove unused mpc1_is_mpcc_idle
mpc1_is_mpcc_idle() was added in 2017 by
commit feb4a3cd8e ("drm/amd/display: Integrating MPC pseudocode")
but never used.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Lijo Lazar
568199a5c7 drm/amd/pm: Limit to 8 jpeg rings per instance
JPEG 5.0.1 supports upto 10 rings, however PMFW support for SMU v13.0.6
variants is now limited to 8 per instance. Limit to 8 temporarily to
avoid out of bounds access.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Sathishkumar S
64dc2f0029 drm/amdgpu: Enable devcoredump for JPEG5_0_0
Add register list and enable devcoredump for JPEG5_0_0

V2: (Lijo)
 - remove version specific callbacks and use simplified helper functions

V3: (Lijo)
 - move amdgpu_jpeg_reg_dump_fini() to sw_fini() and avoid the call here

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Sathishkumar S
8ecd4ec6a5 drm/amdgpu: Enable devcoredump for JPEG2_5_0
Add register list and enable devcoredump for JPEG2_5_0

V2: (Lijo)
- remove version specific callbacks and use simplified helper functions

V3: (Lijo)
- move amdgpu_jpeg_reg_dump_fini() to sw_fini() and avoid the call here

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Sathishkumar S
63d5f8db53 drm/amdgpu: Enable devcoredump for JPEG2_0_0
Add register list and enable devcoredump for JPEG2_0_0

V2: (Lijo)
- remove version specific callbacks and use simplified helper functions

V3: (Lijo)
- move amdgpu_jpeg_reg_dump_fini() to sw_fini() and avoid the call here

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Sathishkumar S
d949e91b42 drm/amdgpu: Enable devcoredump for JPEG3_0_0
Add register list and enable devcoredump for JPEG3_0_0

V2: (Lijo)
- remove version specific callbacks and use simplified helper functions

V3: (Lijo)
- move amdgpu_jpeg_reg_dump_fini() to sw_fini() and avoid the call here

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Sathishkumar S
2b0ccf3923 drm/amdgpu: Enable devcoredump for JPEG4_0_5
Add register list and enable devcoredump for JPEG4_0_5

V2: (Lijo)
- remove version specific callbacks and use simplified helper functions

V3: (Lijo)
- move amdgpu_jpeg_reg_dump_fini() to sw_fini() and avoid the call here

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Sathishkumar S
c3dddd6029 drm/amdgpu: Enable devcoredump for JPEG4_0_0
Add register list and enable devcoredump for JPEG4_0_0

V2: (Lijo)
- remove version specific callbacks and use simplified helper functions

V3: (Lijo)
- move amdgpu_jpeg_reg_dump_fini() to sw_fini() and avoid the call here

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Sathishkumar S
358b3774a0 drm/amdgpu: Enable devcoredump for JPEG5_0_1
Add register list and enable devcoredump for JPEG5_0_1

V2: (Lijo)
- remove version specific callbacks and use simplified helper functions

V3: (Lijo)
- move amdgpu_jpeg_reg_dump_fini() to sw_fini() and avoid the call here

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:02 -05:00
Sathishkumar S
08527cb534 drm/amdgpu: Enable devcoredump for JPEG4_0_3
Add register list and enable devcoredump for JPEG4_0_3

V2: (Lijo)
 - remove version specific callbacks and use simplified helper functions

V3: (Lijo)
 - move amdgpu_jpeg_reg_dump_fini() to sw_fini() and avoid the call here

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Sathishkumar S
df996b5eff drm/amdgpu: Add helper funcs for jpeg devcoredump
Add devcoredump helper functions that can be reused for all jpeg versions.

V2: (Lijo)
 - add amdgpu_jpeg_reg_dump_init() and amdgpu_jpeg_reg_dump_fini()
 - use reg_list and reg_count from init() to dump and print registers
 - memory allocation and freeing is moved to the init() and fini()

V3: (Lijo)
 - move amdgpu_jpeg_reg_dump_fini() to sw_fini()

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Shiwu Zhang
b3dd2903b0 drm/amdgpu: Enable IFWI update support with PSPv13.0.12
Make psp_vbflash_status and psp_vbflash available in sysfs

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Asad Kamal
1fb85819d6 drm/amd/pm: Skip P2S load for SMU v13.0.12
Skip P2S table load for SMU v13.0.12

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Mangesh Gadre
5caea7a589 drm/amdgpu: Add support for smuio 13.0.11
Add new IP version support

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Mangesh Gadre
37971df806 drm/amdgpu: Add support for nbio 7.9.1
Add new IP version support

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Mangesh Gadre
a03f5f8d56 drm/amdgpu: Add support for smu 13.0.12
Add new IP version support

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Mangesh Gadre
05fd502e04 drm/amdgpu: Add support for umc 12.5.0/mmhub 1.8.1
Add new IP version support

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Taimur Hassan
a77269e33c drm/amd/display: 3.2.319
- Move SPL to a new path
- Request HW cursor on DCN3.2 with SubVP
- Allow reuse of of DCN4x code
- Enable odm 4:1 when debug key is set
- Fix seamless boot sequence
- Support multiple options during psr entry.
- Revert "Exit idle optimizations before attempt to access PHY"
- Fix out-of-bound accesses
- Fixes for mcache programming in DML21

Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Samson Tam
5c06c1df35 drm/amd/display: Move SPL to a new path
[WHY & HOW]
- Move SPL from dc/spl to dc/sspl
- Update build files and header paths
- Remove dc/spl files

Reviewed-by: George Zhang <george.zhang@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Aric Cyr
13437c9160 drm/amd/display: Request HW cursor on DCN3.2 with SubVP
[WHY]
When SubVP is active the HW cursor size is limited to 64x64, and
anything larger will force composition which is bad for gaming on
DCN3.2 if the game uses a larger cursor.

[HOW]
If HW cursor is requested, typically by a fullscreen game, do not
enable SubVP so that up to 256x256 cursor sizes are available for
DCN3.2.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Aric Cyr <Aric.Cyr@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Dmytro
2739bd1237 drm/amd/display: Allow reuse of of DCN4x code
Remove the static qualifier to make it available for code sharing
with other components.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Dmytro <dmytro.laktyushkin@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:01 -05:00
Muhammad Ahmed
503d67484e drm/amd/display: Enable odm 4:1 when debug key is set
[WHAT]
odm 4to1 is enabled when debug key is set.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Muhammad Ahmed <muhammad.ahmed@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Sathishkumar S
580dac7437 drm/amdgpu: Add a func for core specific reg offset
Add an inline function to calculate core specific register offsets for
JPEG v4.0.3 and reuse it, makes code more readable and easier to align.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Martin Tsai
3a5fa55455 drm/amd/display: Support multiple options during psr entry.
[WHY]
Some panels may not handle idle pattern properly during PSR entry.

[HOW]
Add a condition to allow multiple options on power down
sequence during PSR1 entry.

Reviewed-by: Anthony Koo <anthony.koo@amd.com>
Signed-off-by: Martin Tsai <Martin.Tsai@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Brandon Syu
be704e5ef4 Revert "drm/amd/display: Exit idle optimizations before attempt to access PHY"
This reverts commit de612738e9.

Reason to revert: screen flashes or gray screen appeared half of the
screen after resume from S4/S5.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Brandon Syu <Brandon.Syu@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Dillon Varone
c909a49128 drm/amd/display: Fixes for mcache programming in DML21
[WHY & HOW]
- Fix indexing phantom planes for mcache programming in the wrapper
- Fix phantom mcache allocations to align with HW guidance
- Fix mcache assignment for chroma plane for multi-planar formats

Reviewed-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Lijo Lazar
2f9a32b589 drm/amdgpu: Clean up IP version checks in gmcv9.0
Clean up some IP version checks in gmcv9.0

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Lijo Lazar
a52e6cb06b drm/amdgpu: Clean up GFX v9.4.3 IP version checks
Remove unnecessary IP version checks for GFX 9.4.3 and similar variants.
Wrap checks inside meaningful function.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Lijo Lazar
a01e934242 drm/amdgpu: Use version to figure out harvest info
IP tables with version <=2 may use harvest bit. For version 3 and above,
harvest bit is not applicable, instead uses harvest table. Fix the
logic accordingly.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Lijo Lazar
31f9ed5882 drm/amdgpu: Pass IP instance/hwid as parameters
Use IP instance number and hwid as function args for validation checks.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Srinivasan Shanmugam
17585c07c2 drm/amdgpu/gfx10: Enable cleaner shader for GFX10.1.1/10.1.2 GPUs
Enable the cleaner shader for GFX10.1.1/10.1.2 GPUs to provide data
isolation between GPU workloads. The cleaner shader is responsible for
clearing the Local Data Store (LDS), Vector General Purpose Registers
(VGPRs), and Scalar General Purpose Registers (SGPRs), which helps
prevent data leakage and ensures accurate computation results.

This update extends cleaner shader support to GFX10.1.1/10.1.2 GPUs,
previously available for GFX10.1.10. It enhances security by clearing
GPU memory between processes and maintains a consistent GPU state across
KGD and KFD workloads.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Alex Deucher
e818635a31 drm/amdgpu: update and cleanup PM4 headers
Consolidate PM4 definitions.  Most of these were previously
only defined in UMDs.  Add them here as well and sync with
latest packets.  Also no need to include soc15d.h on gfx10+.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Suggested-by: Saurabh Verma <saurabh.verma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Aric Cyr
942bd112c9 drm/amd/display: 3.2.318
This version brings along the following fixes:

- Fixes on psr_version, dcn35 register address, DCPG OP control sequences
- Imporvements to CR AUX RD interval interpretation, dio link encoder
- Disable PSR-SU on some OLED panels

Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:03:00 -05:00
Peichen Huang
a1d79eae96 drm/amd/display: refactor dio link encoder assigning
[WHY]
We would like to have new dio encoder assigning flow.
Which should be aligned with hpo assigning and have
simple logic and data representation.

[HOW}
1. A new config option to enable/disable the new code.
2. Encoder-link mapping is in res_ctx and assigned encoder.
is accessed through pipe_ctx.
3. assign dio encoder when add stream to ctx

Reviewed-by: Jun Lei <jun.lei@amd.com>
Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Sung Lee
c87d202692 drm/amd/display: Guard Possible Null Pointer Dereference
[WHY]
In some situations, dc->res_pool may be null.

[HOW]
Check if pointer is null before dereference.

Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Sung Lee <Sung.Lee@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Hansen Dsouza
871f65a59f drm/amd/display: Add boot option to reduce PHY SSC for HBR3
[Why]
Spread on DPREFCLK by 0.3 percent can have a negative effect on sink
when PHY SSC is also spread by 0.3 percent
[How]
Add boot option for DMU to lower PHY SSC

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Hansen Dsouza <Hansen.Dsouza@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Dillon Varone
cbd97d621e drm/amd/display: Ammend DCPG IP control sequences to align with HW guidance
[WHY&HOW]
IP_REQUEST_CNTL should only be toggled off when it was originally, never
unconditionally.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Tom Chung
c31b41f1cb drm/amd/display: Disable PSR-SU on some OLED panel
[Why]
PSR-SU may cause some glitching randomly on some OLED panel.

[How]
Disable the PSR-SU for certain PSR-SU OLED panel.

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Austin Zheng
36681f15bb drm/amd/display: Account For OTO Prefetch Bandwidth When Calculating Urgent Bandwidth
[Why]
1) The current calculations for OTO prefetch bandwidth do not consider the number of DPP pipes in use.
As a result, OTO prefetch bandwidth may be larger than the vactive bandwidth if multiple DPP pipes are used.
OTO prefetch bandwidth should never exceed the vactive bandwidth.

2) Mode programming may be mismatched with mode support
In cases where mode support has chosen to use the equalized (equ) prefetch schedule,
mode programming may end up using oto prefetch schedule instead.
The bandwidth required to do the oto schedule may end up being higher than the equ schedule.
This can cause the required urgent bandwidth to exceed the available urgent bandwidth.

[How]
Output the oto prefetch bandwidth and incorperate it into the urgent bandwidth calculations
even if the prefetch schedule being used is not the oto schedule.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Zhikai Zhai
4a4077b4b6 drm/amd/display: Update Cursor request mode to the beginning prefetch always
[Why]
The double buffer cursor registers is updated by the cursor
vupdate event. There is a gap between vupdate and cursor data
fetch if cursor fetch data reletive to cursor position.
Cursor corruption will happen if we update the cursor surface
in this gap.

[How]
Modify the cursor request mode to the beginning prefetch always
and avoid wraparound calculation issues.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Zhikai Zhai <zhikai.zhai@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
George Shen
6a7fde4332 drm/amd/display: Update CR AUX RD interval interpretation
[Why]
DP spec updated to have the CR AUX RD interval match the EQ AUX RD
interval interpretation of DPCD 0000Eh/0220Eh for 8b/10b non-LTTPR mode
and LTTPR transparent mode cases.

[How]
Update interpretation of DPCD 0000Eh/0220Eh for CR AUX RD interval
during 8b/10b link training.

Reviewed-by: Michael Strauss <michael.strauss@amd.com>
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Tom Chung
d8c782cac5 drm/amd/display: Initial psr_version with correct setting
[Why & How]
The initial setting for psr_version is not correct while
create a virtual link.

The default psr_version should be DC_PSR_VERSION_UNSUPPORTED.

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Srinivasan Shanmugam
25961bad92 drm/amdgpu/gfx10: Add cleaner shader for GFX10.1.10
This commit adds the cleaner shader microcode for GFX10.1.0 GPUs. The
cleaner shader is a piece of GPU code that is used to clear or
initialize certain GPU resources, such as Local Data Share (LDS), Vector
General Purpose Registers (VGPRs), and Scalar General Purpose Registers
(SGPRs).

Clearing these resources is important for ensuring data isolation
between different workloads running on the GPU. Without the cleaner
shader, residual data from a previous workload could potentially be
accessed by a subsequent workload, leading to data leaks and incorrect
computation results.

The cleaner shader microcode is represented as an array of 32-bit words
(`gfx_10_1_0_cleaner_shader_hex`). This array is the binary
representation of the cleaner shader code, which is written in a
low-level GPU instruction set.

When the cleaner shader feature is enabled, the AMDGPU driver loads this
array into a specific location in the GPU memory. The GPU then reads
this memory location to fetch and execute the cleaner shader
instructions.

The cleaner shader is executed automatically by the GPU at the end of
each workload, before the next workload starts. This ensures that all
GPU resources are in a clean state before the start of each workload.

This addition is part of the cleaner shader feature implementation. The
cleaner shader feature helps resource utilization by cleaning up GPU
resources after they are used. It also enhances security and reliability
by preventing data leaks between workloads.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Victor Skvortsov
0489339776 drm/amdgpu: Skip err_count sysfs creation on VF unsupported RAS blocks
VFs are not able to query error counts for all RAS blocks. Rather than
returning error for queries on these blocks, skip sysfs the creation
all together.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Hawking Zhang
16b85a0942 drm/amdgpu: Update usage for bad page threshold
The driver's behavior varies based on
the configuration of amdgpu_bad_page_threshold setting

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:59 -05:00
Asad Kamal
c003b5ccaf drm/amd/pm: Update pm attr for gc_9_5_0
Update power management & clk attributes for gc_v_9_5_0

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Asad Kamal
b9755229ea drm/amd/pm: Skip showing MCLK_OD level
Skip showing MCLK_OD level if setting UCLK MAX is not supported

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Asad Kamal
7485c30809 drm/amd/pm: Add metrics support for smuv13.0.12
Add metrics table support for smuv13.0.12 to
fetch data from metrics version v2

v2: Update get metric field and get metric size macro (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Asad Kamal
ca7a75183b drm/amd/pm: Add SMUv13.0.12 PPT interface
Add SMUv13.0.12 PPT interface to fetch dpm features

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Asad Kamal
00117e3eb1 drm/amd/pm: Add metrics table header for smu_v13_0_12
Add metrics table header for smu_v13_0_12 as metrics version V2

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Asad Kamal
b2d97a134c drm/amd/pm: Update metrics tbl struct for smu_v_13.0.6
Update metrics table struct name for smu_v_13.0.6 and keep
it as version

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Yifan Zha
b02d6fd855 drm/amd/pm: Update smu_v13_0_0 SRIOV VF flag in msg mapping table
[Why]
Under SRIOV VF, driver send a VF unsupportted smu message causing
a failure.

[How]
Update smu_v13_0_0 message mapping table based on PMFW.

Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Mario Limonciello
16ca828617 drm/amd/display: Refactor mark_seamless_boot_stream()
mark_seamless_boot_stream() can be called multiple times to run
the more expensive checks in dc_validate_boot_timing().

Refactor the function so that if those have already passed once
the function isn't called again.

Also add a message the first time that they have passed to let
the user know the stream will be used for seamless boot.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250120194903.1048811-4-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Mario Limonciello
50e30e3a0e drm/amd: Mark amdgpu.gttsize parameter as deprecated and show warnings on use
When not set `gttsize` module parameter by default will get the
value to use for the GTT pool from the TTM page limit, which is
set by a separate module parameter.

This inevitably leads to people not sure which one to set when they
want more addressable memory for the GPU, and you'll end up seeing
instructions online saying to set both.

Add some messages to try to guide people both who are using or misusing
the parameters and mark the parameter as deprecated with the plan to
drop it after the next LTS kernel release.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Mario Limonciello
196b68aa32 drm/amd/display: Add new log type DC_LOG_INFO
`DC_LOG_INFO` will wrap `drm_info()` and be used for the typical
`INFO` level printk messages but in DC code.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250120194903.1048811-3-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Mario Limonciello
f73767b216 drm/amd/display: Decrease message about seamless boot enabled to debug
The message in amdgpu_dm about seamless boot is about an ASIC version
check and module parameter check.  It doesn't actually mean that seamless
boot will work.

Push this message into debug to avoid being disingenuous about it working
until it's been tested.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250120194903.1048811-2-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Mario Limonciello
091e301c2b drm/amd/display: Add debug messages for dc_validate_boot_timing()
dc_validate_boot_timing() runs through an exhaustive list of checks to
determine whether a boot stream can be marked as seamless. When the
checks fail, a user will be left guessing what the reason was

Add debug statements that will be helpful to validate the specific
reason.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250120194903.1048811-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Jiang Liu
38e8ca3e4b amdgpu/soc15: enable asic reset for dGPU in case of suspend abort
When GPU suspend is aborted, do the same for dGPU as APU to reset
soc15 asic. Otherwise it may cause following errors:
[  547.229463] amdgpu 0001:81:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110)

[  555.126827] amdgpu 0000:0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110)
[  555.126901] [drm:amdgpu_gfx_enable_kcq [amdgpu]] *ERROR* KCQ enable failed
[  555.126957] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_4_3> failed -110
[  555.126959] amdgpu 0000:0a:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
[  555.126965] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -110
[  555.126966] PM: Device 0000:0a:00.0 failed to resume async: error -110

This fix has been tested on Mi308X.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Tested-by: Shuo Liu <shuox.liu@linux.alibaba.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/2462b4b12eb9d025e82525178d568cbaa4c223ff.1736739303.git.gerry@linux.alibaba.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:58 -05:00
Aric Cyr
7597d8f2e5 drm/amd/display: 3.2.317
This version brings along following fixes:

- Reverse the visual confirm recouts
- Exclude clkoffset and ips setting for dcn351 specific
- Fix cursor programming problems
- Increase block_sequence array size
- Use Nominal vBlank to determine vstartup if Provided
- Fix clock frequencies incorrect problems for dcn401
- Add SDP programming for UHBR link as well
- Support "Broadcast RGB" drm property

Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Peterson Guo
3c50bf2196 drm/amd/display: Reverse the visual confirm recouts
[WHY]
When checking if a pipe can disable cursor to prevent duplicate cursors,
having visual confirm on will prevent disabling cursors on planes which
cover the bottom of the screen.

[HOW]
When checking if a plane can disable visual confirm, the pipe first
reverses these calculations before doing the checks.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Peterson Guo <peterson.guo@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Charlene Liu
b9e124a565 drm/amd/display: Exclude clkoffset and ips setting for dcn351 specific
Exclude clock offset and IPS setting for dcn351 specific only.

Reviewed-by: Syed Hassan <syed.hassan@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Joshua Aberback
3a7810c212 drm/amd/display: Increase block_sequence array size
[Why]
It's possible to generate more than 50 steps in hwss_build_fast_sequence,
for example with a 6-pipe asic where all pipes are in one MPC chain. This
overflows the block_sequence buffer and corrupts block_sequence_steps,
causing a crash.

[How]
Expand block_sequence to 100 items. A naive upper bound on the possible
number of steps for a 6-pipe asic, ignoring the potential for steps to be
mutually exclusive, is 91 with current code, therefore 100 is sufficient.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Austin Zheng
41df56b1fc drm/amd/display: Use Nominal vBlank If Provided Instead Of Capping It
[Why/How]
vBlank used to determine the max vStartup is based on the smallest between
the vblank provided by the timing and vblank in ip_caps.
Extra vblank time is not considered if the vblank provided by the timing ends
up being higher than what's defined by the ip_caps

Use 1 less than the vblank size in case the timing is interlaced
so vstartup will always be less than vblank_nom.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Dillon Varone
5f0d1ef6f1 drm/amd/display: Populate register address for dentist for dcn401
[WHY&HOW]
Address was not previously populated which can result in incorrect
clock frequencies being read on boot.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Ian Chen
ae36501515 drm/amd/display: Add AS SDP programming for UHBR link rate.
Add SDP programming for UHB link as well.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Ian Chen <ian.chen@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Josip Pavic
06b0a4ad71 drm/amd/display: log destination of vertical interrupt
[Why]
Knowing the destination of OTG's vertical interrupt 2 is useful for
debugging, but it is not currently included in the OTG state readback
logic

[How]
Read the OTG interrupt destination register to get the vertical interrupt
2 destination on ASICs that have this register when reading back the OTG
state from hardware

Reviewed-by: Sung Lee <sung.lee@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Josip Pavic <Josip.Pavic@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Yan Li
6eb4c13a38 drm/amd/display: Support "Broadcast RGB" drm property
[WHY]
The source device outputs a full RGB signal, but TV may
be set to use limited RGB. The mismatch in color
range leads to a degradation in image quality.
Display driver should have the ability to switch
between the full and limited RGB to match TV's settings.

[HOW]
Add support of the linux DRM "Broadcast RGB" property, which
indicates the Quantization Range (Full vs Limited) used.
User space can set this property to be "Automatic", "Full"
or "Limited 16:235" to adjust the output color range.

Reviewed-by: Jerry Zuo <jerry.zuo@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Yan Li <yan.li@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Colin Ian King
9bbb556868 drm/amd/display: remove extraneous ; after statements
There are a couple of statements with two following semicolons, replace
these with just one semicolon.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Jesse.zhang@amd.com
30f7f53a5b drm/amdgpu/gfx10: implement gfx queue reset via MMIO
Using mmio to do queue reset

v2: Alignment the function with gfx9/gfx9.4.3.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Jesse.zhang@amd.com
ffdd7a7b28 drm/amdgpu/gfx10: implement queue reset via MMIO
Using mmio to do queue reset.

v2: Alignment this function with gfx9/gfx9.4.3.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:57 -05:00
Asad Kamal
884e7e5ae0 drm/amd/pm: Fill ip version for SMU v13.0.12
Fill ip version in pm_metrics for SMU v13.0.12

v2: Remove ip version check(Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Lijo Lazar
f7a594e405 drm/amdgpu: Use active umc info from discovery
There could be configs where some UMC instances are harvested. This
information is obtained through discovery data and populated in
umc.active_mask. Avoid reassigning this as AID mask, instead use the
mask directly while iterating through umc instances. This is to avoid
accesses to harvested UMC instances.

v2: fix warning (Alex)

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Asad Kamal
f5580a9c54 drm/amd/pm: Populate pmfw version for SMU v13.0.12
Populate pmfw version for SMU v13.0.12 to device struct

v2: Remove ip version check to get smu version

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Amber Lin
46d0436a3e drm/amdgpu: Set noretry default for GC 9.5.0
Set GC 9.5.0 noretry default as 1 for better performance. It can be
changed by the administrator using amdgpu.noretry=0 or by the user using
HSA_XNACK=1 environment variable.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviwanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Le Ma
23cb207751 drm/amdgpu: read harvest info from harvest table for gfx950
Harvest table is applied for gfx950.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Shiwu Zhang
667b96134c drm/amdgpu: enlarge the VBIOS binary size limit
Some chips have a larger VBIOS file so raise the size limit to support
the flashing tool.

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Dr. David Alan Gilbert
933dc3c7c9 drm/amdkfd: Remove unused functions
kfd_device_by_pci_dev(), kfd_get_pasid_limit() and kfd_set_pasid_limit()
have been unused since 2023's
commit c99a2e7ae2 ("drm/amdkfd: drop IOMMUv2 support")

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Alex Deucher
e29dad86fa drm/amdgpu/swsmu: set workload profile to bootup default
Now that we can select a workload profile dynamically when
we submit work, it's best to default to the bootup
default workload profile.  Defaulting to other profiles
prevents some power management features from kicking in
during idle periods.  Once all jobs have finished, the
workload profile will automatically move back to default
bootup for max power savings.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Alex Deucher
5f95a15495 drm/amdgpu: add dynamic workload profile switching for gfx12
Enable dynamic workload profile switching for gfx12.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Alex Deucher
963537ca23 drm/amdgpu: add dynamic workload profile switching for gfx11
Enable dynamic workload profile switching for gfx11.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Alex Deucher
b9467983b7 drm/amdgpu: add dynamic workload profile switching for gfx10
Enable dynamic workload profile switching for gfx10.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:56 -05:00
Alex Deucher
8fdb3958e3 drm/amdgpu/gfx: add ring helpers for setting workload profile
Add helpers to switch the workload profile dynamically when
commands are submitted.  This allows us to switch to
the FULLSCREEN3D or COMPUTE profile when work is submitted.
Add a delayed work handler to delay switching out of the
selected profile if additional work comes in.  This works
the same as the VIDEO profile for VCN.  This lets dynamically
enable workload profiles on the fly and then move back
to the default when there is no work.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Xiaogang Chen
8544374c0f drm/amdkfd: Have kfd driver use same PASID values from graphic driver
Current kfd driver has its own PASID value for a kfd process and uses it to
locate vm at interrupt handler or mapping between kfd process and vm. That
design is not working when a physical gpu device has multiple spatial
partitions, ex: adev in CPX mode. This patch has kfd driver use same pasid
values that graphic driver generated which is per vm per pasid.

These pasid values are passed to fw/hardware. We do not need change interrupt
handler though more pasid values are used. Also, pasid values at log are
replaced by user process pid; pasid values are not exposed to user. Users see
their process pids that have meaning in user space.

Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Lijo Lazar
ca44922107 drm/amdgpu: Check RRMT status for JPEG v4.0.3
RRMT could get dynamically enabled/disabled by PSP firmware. Read the
status from register for reading RRMT status. For VFs, this is not
accessible, hence assume that it's always disabled for now.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Lijo Lazar
485380f7fe drm/amdgpu: Check RRMT status for VCN v4.0.3
RRMT could get dynamically enabled/disabled by PSP firmware. Read the
status from register for reading RRMT status. For VFs, this is not
accessible, hence assume that it's always disabled for now.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Lijo Lazar
822b13d19f drm/amdgpu: Add VCN v4.0.3 RRMT register offset
Add RRMT control register offset for VCN v4.0.3

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Tim Huang
e55565f880 drm/amdgpu: add support for PSP IP version 14.0.5
This initializes PSP IP version 14.0.5.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Tim Huang
e7704d7c72 drm/amdgpu: add support for SMU IP version 14.0.5
This initializes SMU IP version 14.0.5.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Tim Huang
6d437d5203 drm/amdgpu: enable VCN/JPEG CGPG for GC IP version 11.5.3
Enable VCN/JPEG CGPG for ASIC with GFX version 11.5.3.

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Tim Huang
6bde08d317 drm/amdgpu: add support for MMHUB IP version 3.3.2
This initializes MMHUB IP version 3.3.2.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Tim Huang
e659c9eb87 drm/amdgpu: add support for NBIO IP version 7.11.2
This initializes NBIO IP version 7.11.2.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Tim Huang
b2e5a04147 drm/amdgpu: add support for SDMA IP version 6.1.3
This initializes SDMA IP version 6.1.3.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Tim Huang
b784faeba2 drm/amdgpu: add support for GC IP version 11.5.3
This initializes GC IP version 11.5.3.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:55 -05:00
Alex Deucher
20f48be63d drm/amdgpu: add OEM i2c bus for polaris chips
It uses the VGADCC bus.  DC doesn't use this bus, so it
is safe to add it here.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Alex Deucher
1c0b144bf7 drm/amdgpu: rework i2c init and fini
No functional change.  Rework the code to allow for
adding some additional i2c buses in conjunction with DC
in the future.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Alex Deucher
ba7f8eb7e4 drm/amdgpu/atombios: drop empty function
This was leftover from when amdgpu was forked from radeon.
The function is empty so drop it.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Alex Deucher
2ed83f2cc4 drm/amd/display/dc: enable oem i2c support for DCE 12.x
Use the value pulled from the vbios just like newer chips.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Alex Deucher
d957d4a3f8 drm/amd/display/dc: add support for oem i2c in atom_firmware_info_v3_1
The fields are marked as reserved in atom_firmware_info_v3_1,
but thet contain valid data in all of the vbios images I've
looked at so add parse these fields as per
atom_firmware_info_v3_2.  The offsets are the same and the
reset of the structure is the same.

v2: squash in NULL checks

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Alex Deucher
3d5470c973 drm/amd/display/dm: add support for OEM i2c bus
Expose the OEM i2c bus on boards that support it.
This bus is used for OEM specific features like RGB, etc.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Alex Deucher
b217105acb drm/amd/display/dm: handle OEM i2c buses in i2c functions
Allow the creation of an OEM i2c bus and use the proper
DC helpers for that case.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Alex Deucher
44810f8de2 drm/amd/display/dc: add a new helper to fetch the OEM ddc_service
This is the i2c bus used by OEMs for board specific i2c features
like RGB.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Alex Deucher
33da70bd1e drm/amd/display/dm: drop hw_support check in amdgpu_dm_i2c_xfer()
DC supports SW i2c as well.  Drop the check.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Alex Deucher
0371dbd423 drm/amd/display/dm: drop extra parameters to create_i2c()
link_index can be fetched from the ddc_service; no need for a separate
parameter.  res is not used.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Sathishkumar S
8064ca6e93 drm/amdgpu: increase amdgpu max rings limit
increase max rings to 132 to support all JPEG5_0_1 cores, else
ring_init fails due to ring count exceeding maximum limit.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 21:02:54 -05:00
Jiang Liu
1abb264869 drm/amdgpu: avoid buffer overflow attach in smu_sys_set_pp_table()
It malicious user provides a small pptable through sysfs and then
a bigger pptable, it may cause buffer overflow attack in function
smu_sys_set_pp_table().

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-02-12 19:47:15 -05:00
Lancelot SIX
d584198a6f drm/amdkfd: Ensure consistent barrier state saved in gfx12 trap handler
It is possible for some waves in a workgroup to finish their save
sequence before the group leader has had time to capture the workgroup
barrier state.  When this happens, having those waves exit do impact the
barrier state.  As a consequence, the state captured by the group leader
is invalid, and is eventually incorrectly restored.

This patch proposes to have all waves in a workgroup wait for each other
at the end of their save sequence (just before calling s_endpgm_saved).

Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
2025-02-12 19:47:15 -05:00
Jiang Liu
a0a455b4bc drm/amdgpu: bail out when failed to load fw in psp_init_cap_microcode()
In function psp_init_cap_microcode(), it should bail out when failed to
load firmware, otherwise it may cause invalid memory access.

Fixes: 07dbfc6b10 ("drm/amd: Use `amdgpu_ucode_*` helpers for PSP")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 19:47:15 -05:00
Zhu Lingshan
a33f7f9660 amdkfd: properly free gang_ctx_bo when failed to init user queue
The destructor of a gtt bo is declared as
void amdgpu_amdkfd_free_gtt_mem(struct amdgpu_device *adev, void **mem_obj);
Which takes void** as the second parameter.

GCC allows passing void* to the function because void* can be implicitly
casted to any other types, so it can pass compiling.

However, passing this void* parameter into the function's
execution process(which expects void** and dereferencing void**)
will result in errors.

Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Fixes: fb91065851 ("drm/amdkfd: Refactor queue wptr_bo GART mapping")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12 19:47:15 -05:00
Alex Deucher
55ed2b1b50 drm/amdgpu: bump version for RV/PCO compute fix
Bump the driver version for RV/PCO compute stability fix
so mesa can use this check to enable compute queues on
RV/PCO.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
2025-02-12 19:47:15 -05:00
Alex Deucher
b35eb9128e drm/amdgpu/gfx9: manually control gfxoff for CS on RV
When mesa started using compute queues more often
we started seeing additional hangs with compute queues.
Disabling gfxoff seems to mitigate that.  Manually
control gfxoff and gfx pg with command submissions to avoid
any issues related to gfxoff.  KFD already does the same
thing for these chips.

v2: limit to compute
v3: limit to APUs
v4: limit to Raven/PCO
v5: only update the compute ring_funcs
v6: Disable GFX PG
v7: adjust order

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Suggested-by: Błażej Szczygieł <mumei6102@gmail.com>
Suggested-by: Sergey Kovalenko <seryoga.engineering@gmail.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3861
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-January/119116.html
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
2025-02-12 19:47:01 -05:00
Alex Deucher
960a628774 drm/amdgpu/pm: fix UVD handing in amdgpu_dpm_set_powergating_by_smu()
UVD and VCN were split into separate dpm helpers in commit
ff69bba05f ("drm/amd/pm: add inst to dpm_set_powergating_by_smu")
as such, there is no need to include UVD in the is_vcn variable since
UVD and VCN are handled by separate dpm helpers now. Fix the check.

Fixes: ff69bba05f ("drm/amd/pm: add inst to dpm_set_powergating_by_smu")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3959
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-February/119827.html
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
2025-02-12 18:13:32 -05:00
Philipp Stanner
796a9f55a8 drm/sched: Use struct for drm_sched_init() params
drm_sched_init() has a great many parameters and upcoming new
functionality for the scheduler might add even more. Generally, the
great number of parameters reduces readability and has already caused
one missnaming, addressed in:

commit 6f1cacf4eb ("drm/nouveau: Improve variable name in
nouveau_sched_init()").

Introduce a new struct for the scheduler init parameters and port all
users.

Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Matthew Brost <matthew.brost@intel.com> # for Xe
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # for Panfrost and Panthor
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> # for Etnaviv
Reviewed-by: Frank Binns <frank.binns@imgtec.com> # for Imagination
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> # for Sched
Reviewed-by: Maíra Canal <mcanal@igalia.com> # for v3d
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> # for amdxdna
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250211111422.21235-2-phasta@kernel.org
2025-02-12 11:59:52 +01:00
Maxime Ripard
93c7dd1b39
Merge drm/drm-next into drm-misc-next
Bring rc1 to start the new release dev.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-02-06 13:47:32 +01:00
Tom Chung
f245b400a2 Revert "drm/amd/display: Use HW lock mgr for PSR1"
This reverts commit
a2b5a99562 ("drm/amd/display: Use HW lock mgr for PSR1")

Because it may cause system hang while connect with two edp panel.

Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-04 17:47:34 -05:00
Nathan Chancellor
820ccf8cb2 drm/amd/display: Respect user's CONFIG_FRAME_WARN more for dml files
Currently, there are several files in drm/amd/display that aim to have a
higher -Wframe-larger-than value to avoid instances of that warning with
a lower value from the user's configuration. However, with the way that
it is currently implemented, it does not respect the user's request via
CONFIG_FRAME_WARN for a higher stack frame limit, which can cause pain
when new instances of the warning appear and break the build due to
CONFIG_WERROR.

Adjust the logic to switch from a hard coded -Wframe-larger-than value
to only using the value as a minimum clamp and deferring to the
requested value from CONFIG_FRAME_WARN if it is higher.

Suggested-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Closes: https://lore.kernel.org/2025013003-audience-opposing-7f95@gregkh/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-04 17:47:10 -05:00
Lo-an Chen
e01f07cb92 drm/amd/display: Fix seamless boot sequence
[WHY]
When the system powers up eDP with external monitors in seamless boot
sequence, stutter get enabled before TTU and HUBP registers being
programmed, which resulting in underflow.

[HOW]
Enable TTU in hubp_init.
Change the sequence that do not perpare_bandwidth and optimize_bandwidth
while having seamless boot streams.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Lo-an Chen <lo-an.chen@amd.com>
Signed-off-by: Paul Hsieh <paul.hsieh@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-03 12:13:38 -05:00
Alex Hung
8adbb2a98b drm/amd/display: Fix out-of-bound accesses
[WHAT & HOW]
hpo_stream_to_link_encoder_mapping has size MAX_HPO_DP2_ENCODERS(=4),
but location can have size up to 6. As a result, it is necessary to
check location against MAX_HPO_DP2_ENCODERS.

Similiarly, disp_cfg_stream_location can be used as an array index which
should be 0..5, so the ASSERT's conditions should be less without equal.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3904
Reviewed-by: Austin Zheng <Austin.Zheng@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-03 12:12:04 -05:00
Marek Olšák
2255b40cac drm/amdgpu: add a BO metadata flag to disable write compression for Vulkan
Vulkan can't support DCC and Z/S compression on GFX12 without
WRITE_COMPRESS_DISABLE in this commit or a completely different DCC
interface.

AMDGPU_TILING_GFX12_SCANOUT is added because it's already used by userspace.

Cc: stable@vger.kernel.org # 6.12.x
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-03 12:11:36 -05:00
Melissa Wen
7f2b5237e3 drm/amd/display: restore invalid MSA timing check for freesync
This restores the original behavior that gets min/max freq from EDID and
only set DP/eDP connector as freesync capable if "sink device is capable
of rendering incoming video stream without MSA timing parameters", i.e.,
`allow_invalid_MSA_timing_params` is true. The condition was mistakenly
removed by 0159f88a99 ("drm/amd/display: remove redundant freesync
parser for DP").

CC: Mario Limonciello <mario.limonciello@amd.com>
CC: Alex Hung <alex.hung@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3915
Fixes: 0159f88a99 ("drm/amd/display: remove redundant freesync parser for DP")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-01-28 16:26:13 -05:00
Prike Liang
9078a5bfa2 drm/amdkfd: only flush the validate MES contex
The following page fault was observed duringthe KFD process release.
In this particular error case, the HIP test (./MemcpyPerformance -h)
does not require the queue. As a result, the process_context_addr was
not assigned when the KFD process was released, ultimately leading to
this page fault during the execution of the function
kfd_process_dequeue_from_all_devices().

[345962.294891] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:153 vmid:0 pasid:0)
[345962.295333] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 10
[345962.295775] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000B33
[345962.296097] amdgpu 0000:03:00.0: amdgpu:     Faulty UTCL2 client ID: CPC (0x5)
[345962.296394] amdgpu 0000:03:00.0: amdgpu:     MORE_FAULTS: 0x1
[345962.296633] amdgpu 0000:03:00.0: amdgpu:     WALKER_ERROR: 0x1
[345962.296876] amdgpu 0000:03:00.0: amdgpu:     PERMISSION_FAULTS: 0x3
[345962.297135] amdgpu 0000:03:00.0: amdgpu:     MAPPING_ERROR: 0x1
[345962.297377] amdgpu 0000:03:00.0: amdgpu:     RW: 0x0
[345962.297682] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:169 vmid:0 pasid:0)

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-01-28 16:24:39 -05:00
loanchen
f88192d233 drm/amd/display: Correct register address in dcn35
[Why]
the offset address of mmCLK5_spll_field_8 was incorrect for dcn35
which causes SSC not to be enabled.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Lo-An Chen <lo-an.chen@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-01-28 16:23:30 -05:00
Lijo Lazar
819bf6662b drm/amd/pm: Mark MM activity as unsupported
Aldebaran doesn't support querying MM activity percentage. Keep the
field as 0xFFs to mark it as unsupported.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2025-01-28 16:23:06 -05:00
Kenneth Feng
5cda56bd86 drm/amd/amdgpu: change the config of cgcg on gfx12
change the config of cgcg on gfx12

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
2025-01-28 16:22:39 -05:00
Jay Cornwall
f214b7beb0 drm/amdkfd: Block per-queue reset when halt_if_hws_hang=1
The purpose of halt_if_hws_hang is to preserve GPU state for driver
debugging when queue preemption fails. Issuing per-queue reset may
kill wavefronts which caused the preemption failure.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
2025-01-28 16:22:02 -05:00
Aric Cyr
024771f3fb drm/amd/display: Optimize cursor position updates
[why]
Updating the cursor enablement register can be a slow operation and accumulates
when high polling rate cursors cause frequent updates asynchronously to the
cursor position.

[how]
Since the cursor enable bit is cached there is no need to update the
enablement register if there is no change to it.  This removes the
read-modify-write from the cursor position programming path in HUBP and
DPP, leaving only the register writes.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sung Lee <sung.lee@amd.com>
Signed-off-by: Aric Cyr <Aric.Cyr@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:56:28 -05:00
Aric Cyr
01130f5260 drm/amd/display: Add hubp cache reset when powergating
[Why]
When HUBP is power gated, the SW state can get out of sync with the
hardware state causing cursor to not be programmed correctly.

[How]
Similar to DPP, add a HUBP reset function which is called wherever
HUBP is initialized or powergated.  This function will clear the cursor
position and attribute cache allowing for proper programming when the
HUBP is brought back up.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sung Lee <sung.lee@amd.com>
Signed-off-by: Aric Cyr <Aric.Cyr@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:56:22 -05:00
Shaoyun Liu
335acfb64e drm/amd/amdgpu: Enable scratch data dump for mes 12
MES internal will check CP_MES_MSCRATCH_LO/HI register to set scratch
data location during ucode start, driver side need to start the MES
one by one with different setting for each pipe

Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:56:13 -05:00
Mario Limonciello
7e4cb7dea2 drm/amd: Clarify kdoc for amdgpu.gttsize
Effectively amdgpu.gttsize gets set to ~1/2 of RAM, but that's controlled
by what the TTM page limit is set to.  Clarify the kdoc.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:56:08 -05:00
Srinivasan Shanmugam
dc915275ea drm/amd/amdgpu: Prevent null pointer dereference in GPU bandwidth calculation
If the parent is NULL, adev->pdev is used to retrieve the PCIe speed and
width, ensuring that  the function can still determine these
capabilities from the device itself.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6193 amdgpu_device_gpu_bandwidth()
	error: we previously assumed 'parent' could be null (see line 6180)

drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
    6170 static void amdgpu_device_gpu_bandwidth(struct amdgpu_device *adev,
    6171                                         enum pci_bus_speed *speed,
    6172                                         enum pcie_link_width *width)
    6173 {
    6174         struct pci_dev *parent = adev->pdev;
    6175
    6176         if (!speed || !width)
    6177                 return;
    6178
    6179         parent = pci_upstream_bridge(parent);
    6180         if (parent && parent->vendor == PCI_VENDOR_ID_ATI) {
                     ^^^^^^
If parent is NULL

    6181                 /* use the upstream/downstream switches internal to dGPU */
    6182                 *speed = pcie_get_speed_cap(parent);
    6183                 *width = pcie_get_width_cap(parent);
    6184                 while ((parent = pci_upstream_bridge(parent))) {
    6185                         if (parent->vendor == PCI_VENDOR_ID_ATI) {
    6186                                 /* use the upstream/downstream switches internal to dGPU */
    6187                                 *speed = pcie_get_speed_cap(parent);
    6188                                 *width = pcie_get_width_cap(parent);
    6189                         }
    6190                 }
    6191         } else {
    6192                 /* use the device itself */
--> 6193                 *speed = pcie_get_speed_cap(parent);
                                                     ^^^^^^ Then we are toasted here.

    6194                 *width = pcie_get_width_cap(parent);
    6195         }
    6196 }

Fixes: 757e8b951c ("drm/amdgpu: cache gpu pcie link width")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:55:26 -05:00
Srinivasan Shanmugam
da29abe71e drm/amd/display: Fix error pointers in amdgpu_dm_crtc_mem_type_changed
The function amdgpu_dm_crtc_mem_type_changed was dereferencing pointers
returned by drm_atomic_get_plane_state without checking for errors. This
could lead to undefined behavior if the function returns an error pointer.

This commit adds checks using IS_ERR to ensure that new_plane_state and
old_plane_state are valid before dereferencing them.

Fixes the below:

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:11486 amdgpu_dm_crtc_mem_type_changed()
error: 'new_plane_state' dereferencing possible ERR_PTR()

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c
    11475 static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
    11476                                             struct drm_atomic_state *state,
    11477                                             struct drm_crtc_state *crtc_state)
    11478 {
    11479         struct drm_plane *plane;
    11480         struct drm_plane_state *new_plane_state, *old_plane_state;
    11481
    11482         drm_for_each_plane_mask(plane, dev, crtc_state->plane_mask) {
    11483                 new_plane_state = drm_atomic_get_plane_state(state, plane);
    11484                 old_plane_state = drm_atomic_get_plane_state(state, plane);
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^ These functions can fail.

    11485
--> 11486                 if (old_plane_state->fb && new_plane_state->fb &&
    11487                     get_mem_type(old_plane_state->fb) != get_mem_type(new_plane_state->fb))
    11488                         return true;
    11489         }
    11490
    11491         return false;
    11492 }

Fixes: 4caacd1671 ("drm/amd/display: Do not elevate mem_type change to full update")
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:55:19 -05:00
Lin.Cao
b529093999 drm/amdgpu: fix ring timeout issue in gfx10 sr-iov environment
commit 26c95e838e ("drm/amdgpu: set the VM pointer to NULL in
amdgpu_job_prepare") set job->vm as NULL if there is no fence. It will
cause emit switch buffer be skippen if job->vm set as NULL.

Check job rather than vm could solve this problem.

Fixes: 26c95e838e ("drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepare")
Signed-off-by: Lin.Cao <lincao12@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:55:04 -05:00
Lijo Lazar
1bf06a1fcd drm/amd/pm: Fix smu v13.0.6 caps initialization
Fix the initialization and usage of SMU v13.0.6 capability values. Use
caps_set/clear functions to set/clear capability.

Also, fix SET_UCLK_MAX capability on APUs, it is supported on APUs.

Fixes: e9b86b841b ("drm/amd/pm: Add capability flags for SMU v13.0.6")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:54:21 -05:00
Jesse.zhang@amd.com
875596b984 drm/amd/pm: Refactor SMU 13.0.6 SDMA reset firmware version checks
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.

V2: return -EOPNOTSUPP for unspported pmfw

Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:54:17 -05:00
Jesse.zhang@amd.com
2e7618457c revert "drm/amdgpu/pm: add definition PPSMC_MSG_ResetSDMA2"
pmfw now unifies PPSMC_MSG_ResetSDMA definitions for different devices.
PPSMC_MSG_ResetSDMA2 is not needed.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:54:11 -05:00
Jesse.zhang@amd.com
941f0cb6c8 revert "drm/amdgpu/pm: Implement SDMA queue reset for different asic"
pmfw unified PPSMC_MSG_ResetSDMA definitions for different devices.
PPSMC_MSG_ResetSDMA2 is not needed.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:54:08 -05:00
Lijo Lazar
e9b86b841b drm/amd/pm: Add capability flags for SMU v13.0.6
Add capability flags for SMU v13.0.6 variants. Initialize the flags
based on firmware support. As there are multiple IP versions maintained,
it is more manageable with one time initialization of caps flags based
on IP version and firmware feature support.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:54:03 -05:00
Alex Deucher
aedf498a2c drm/amd/display: fix SUBVP DC_DEBUG_MASK documentation
This needs to be kerneldoc formatted.

Fixes: 5349658fa4a1 ("drm/amd: Add debug option to disable subvp")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
2025-01-24 09:53:59 -05:00
Alex Deucher
85172c8034 drm/amd/display: fix CEC DC_DEBUG_MASK documentation
This needs to be kerneldoc formatted.

Fixes: 7594874227 ("drm/amd/display: add CEC notifier to amdgpu driver")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Kun Liu <Kun.Liu2@amd.com>
2025-01-24 09:53:34 -05:00
Alex Deucher
64314e3f9c drm/amdgpu: fix the PCIe lanes reporting in the INFO IOCTL
Combine the platform and GPU caps like we do for PCIe Gen.
This aligns properly with expectations and documentation
for the interface.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3820
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:53:30 -05:00
Alex Deucher
757e8b951c drm/amdgpu: cache gpu pcie link width
Get the PCIe link with of the device itself (or it's
integrated upstream bridge) and cache that.

v2: fix typo

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3820
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:53:24 -05:00
Tzung-Bi Shih
a8d42cd228 drm/amd/display: mark static functions noinline_for_stack
When compiling allmodconfig (CONFIG_WERROR=y) with clang-19, see the
following errors:

.../display/dc/dml2/display_mode_core.c:6268:13: warning: stack frame size (3128) exceeds limit (3072) in 'dml_prefetch_check' [-Wframe-larger-than]
.../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:7236:13: warning: stack frame size (3256) exceeds limit (3072) in 'dml_core_mode_support' [-Wframe-larger-than]

Mark static functions called by dml_prefetch_check() and
dml_core_mode_support() noinline_for_stack to avoid them become huge
functions and thus exceed the frame size limit.

A way to reproduce:
$ git checkout next-20250107
$ mkdir build_dir
$ export PATH=/tmp/llvm-19.1.6-x86_64/bin:$PATH
$ make LLVM=1 O=build_dir allmodconfig
$ make LLVM=1 O=build_dir drivers/gpu/drm/ -j

The way how it chose static functions to mark:
[0] Unset CONFIG_WERROR in build_dir/.config.
To get display_mode_core.o without errors.

[1] Get a function list called by dml_prefetch_check().
$ sed -n '6268,6711p' drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c \
  | sed -n -r 's/.*\W(\w+)\(.*/\1/p' | sort -u >/tmp/syms

[2] Get the non-inline function list.
Objdump won't show the symbols if they are inline functions.

$ make LLVM=1 O=build_dir drivers/gpu/drm/ -j
$ objdump -d build_dir/.../display_mode_core.o | \
  ./scripts/checkstack.pl x86_64 0 | \
  grep -f /tmp/syms | cut -d' ' -f2- >/tmp/orig

[3] Get the full function list.
Append "-fno-inline" to `CFLAGS_.../display_mode_core.o` in
drivers/gpu/drm/amd/display/dc/dml2/Makefile.

$ make LLVM=1 O=build_dir drivers/gpu/drm/ -j
$ objdump -d build_dir/.../display_mode_core.o | \
  ./scripts/checkstack.pl x86_64 0 | \
  grep -f /tmp/syms | cut -d' ' -f2- >/tmp/noinline

[4] Get the inline function list.
If a symbol only in /tmp/noinline but not in /tmp/orig, it is a good
candidate to mark noinline.

$ diff /tmp/orig /tmp/noinline

Chosen functions and their stack sizes:
CalculateBandwidthAvailableForImmediateFlip [display_mode_core.o]:144
CalculateExtraLatency [display_mode_core.o]:176
CalculateTWait [display_mode_core.o]:64
CalculateVActiveBandwithSupport [display_mode_core.o]:112
set_calculate_prefetch_schedule_params [display_mode_core.o]:48

CheckGlobalPrefetchAdmissibility [dml2_core_dcn4_calcs.o]:544
calculate_bandwidth_available [dml2_core_dcn4_calcs.o]:320
calculate_vactive_det_fill_latency [dml2_core_dcn4_calcs.o]:272
CalculateDCFCLKDeepSleep [dml2_core_dcn4_calcs.o]:208
CalculateODMMode [dml2_core_dcn4_calcs.o]:208
CalculateOutputLink [dml2_core_dcn4_calcs.o]:176

Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:53:12 -05:00
Jay Cornwall
1241b64d4b drm/amdkfd: Clear MODE.VSKIP in gfx9 trap handler
If user shader issues S_SETVSKIP then this state will persist when
executing the trap handler, causing vector instructions to be
skipped.

VSKIP state is already saved/restored through the MODE register.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:53:05 -05:00
Lijo Lazar
a0db1ea0dd drm/amdgpu: Refine ip detection log message
'add ip block' causes a confusion if the blocks are disabled later with
ip_block_mask. Instead change to 'detected' and also add device context.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:52:58 -05:00
Lijo Lazar
b1df8050e7 drm/amdgpu: Add handler for SDMA context empty
Context empty interrupt is enabled for SDMA 4.4.2. Add a handler for
context empty interrupt so that it is disposed of fast, and not
propagated to KFD layer.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:52:43 -05:00
Aurabindo Pillai
9d63fbf751 drm/amd: Add debug option to disable subvp
Some monitors flicker when subvp is enabled which maybe related to
an uncommon timing they use. To isolate such issues, add a debug
option to help isolate this the issue for debugging.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:52:31 -05:00
Jay Cornwall
36a21f2686 drm/amdkfd: Sync trap handler binary with source
Source and binary have become mismatched during branch activity.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:52:24 -05:00
Emily Deng
b5f022fe8e drm/amdkfd: Fix partial migrate issue
For partial migrate from ram to vram, the migrate->cpages is not
equal to migrate->npages, should use migrate->npages to check all needed
migrate pages which could be copied or not.

And only need to set those pages could be migrated to migrate->dst[i], or
the migrate_vma_pages will migrate the wrong pages based on the migrate->dst[i].

v2:
Add mpages to break the loop earlier.

v3:
Uses MIGRATE_PFN_MIGRATE to identify whether page could be migrated.

v4:
Correct the error part.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Philip Yang<Philip.Yang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-24 09:52:08 -05:00
Srinivasan Shanmugam
19b7f7c721 drm/amdgpu/gfx12: Add Cleaner Shader Support for GFX12.0 GPUs
This commit enables the cleaner shader feature for GFX12.0 and GFX12.0.1
GPUs. The cleaner shader is important for clearing GPU resources such as
Local Data Share (LDS), Vector General Purpose Registers (VGPRs), and
Scalar General Purpose Registers (SGPRs) between workloads.

- This feature ensures that GPU resources are reset between workloads,
  preventing data leaks and ensuring accurate computation.

By enabling the cleaner shader, this update enhances the security and
reliability of GPU operations on GFX12.0 hardware.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-14 11:06:50 -05:00
Kent Russell
f2935a3019 drm/amdgpu: Mark debug KFD module params as unsafe
Mark options only meant to be used for debugging as unsafe so that the
kernel is tainted when they are used.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-14 11:06:50 -05:00
Gui Chengming
62952a38d9 drm/amdgpu: fix fw attestation for MP0_14_0_{2/3}
FW attestation was disabled on MP0_14_0_{2/3}.

V2:
Move check into is_fw_attestation_support func. (Frank)
Remove DRM_WARN log info. (Alex)
Fix format. (Christian)

Signed-off-by: Gui Chengming <Jack.Gui@amd.com>
Reviewed-by: Frank.Min <Frank.Min@amd.com>
Reviewed-by: Christian König <Christian.Koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-14 11:06:50 -05:00
Christian König
def59436fb drm/amdgpu: always sync the GFX pipe on ctx switch
That is needed to enforce isolation between contexts.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-14 11:06:50 -05:00
Christian König
177b76a8d8 drm/amdgpu: mark a bunch of module parameters unsafe
We sometimes have people trying to use debugging options in production
environments.

Mark options only meant to be used for debugging as unsafe so that the
kernel is tainted when they are used.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-14 11:06:50 -05:00
Tvrtko Ursulin
e996127ec1 drm/amdgpu: Use DRM scheduler API in amdgpu_xcp_release_sched
Lets use the existing helper instead of peeking into the structure
directly.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-14 11:06:50 -05:00
Kenneth Feng
2affe2bbc9 drm/amdgpu: disable gfxoff with the compute workload on gfx12
Disable gfxoff with the compute workload on gfx12. This is a
workaround for the opencl test failure.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-14 11:06:50 -05:00
Srinivasan Shanmugam
0b6b2dd383 drm/amdgpu: Fix Circular Locking Dependency in AMDGPU GFX Isolation
This commit addresses a circular locking dependency issue within the GFX
isolation mechanism. The problem was identified by a warning indicating
a potential deadlock due to inconsistent lock acquisition order.

- The `amdgpu_gfx_enforce_isolation_ring_begin_use` and
  `amdgpu_gfx_enforce_isolation_ring_end_use` functions previously
  acquired `enforce_isolation_mutex` and called `amdgpu_gfx_kfd_sch_ctrl`,
  leading to potential deadlocks. ie., If `amdgpu_gfx_kfd_sch_ctrl` is
  called while `enforce_isolation_mutex` is held, and
  `amdgpu_gfx_enforce_isolation_handler` is called while `kfd_sch_mutex` is
  held, it can create a circular dependency.

By ensuring consistent lock usage, this fix resolves the issue:

[  606.297333] ======================================================
[  606.297343] WARNING: possible circular locking dependency detected
[  606.297353] 6.10.0-amd-mlkd-610-311224-lof #19 Tainted: G           OE
[  606.297365] ------------------------------------------------------
[  606.297375] kworker/u96:3/3825 is trying to acquire lock:
[  606.297385] ffff9aa64e431cb8 ((work_completion)(&(&adev->gfx.enforce_isolation[i].work)->work)){+.+.}-{0:0}, at: __flush_work+0x232/0x610
[  606.297413]
               but task is already holding lock:
[  606.297423] ffff9aa64e432338 (&adev->gfx.kfd_sch_mutex){+.+.}-{3:3}, at: amdgpu_gfx_kfd_sch_ctrl+0x51/0x4d0 [amdgpu]
[  606.297725]
               which lock already depends on the new lock.

[  606.297738]
               the existing dependency chain (in reverse order) is:
[  606.297749]
               -> #2 (&adev->gfx.kfd_sch_mutex){+.+.}-{3:3}:
[  606.297765]        __mutex_lock+0x85/0x930
[  606.297776]        mutex_lock_nested+0x1b/0x30
[  606.297786]        amdgpu_gfx_kfd_sch_ctrl+0x51/0x4d0 [amdgpu]
[  606.298007]        amdgpu_gfx_enforce_isolation_ring_begin_use+0x2a4/0x5d0 [amdgpu]
[  606.298225]        amdgpu_ring_alloc+0x48/0x70 [amdgpu]
[  606.298412]        amdgpu_ib_schedule+0x176/0x8a0 [amdgpu]
[  606.298603]        amdgpu_job_run+0xac/0x1e0 [amdgpu]
[  606.298866]        drm_sched_run_job_work+0x24f/0x430 [gpu_sched]
[  606.298880]        process_one_work+0x21e/0x680
[  606.298890]        worker_thread+0x190/0x350
[  606.298899]        kthread+0xe7/0x120
[  606.298908]        ret_from_fork+0x3c/0x60
[  606.298919]        ret_from_fork_asm+0x1a/0x30
[  606.298929]
               -> #1 (&adev->enforce_isolation_mutex){+.+.}-{3:3}:
[  606.298947]        __mutex_lock+0x85/0x930
[  606.298956]        mutex_lock_nested+0x1b/0x30
[  606.298966]        amdgpu_gfx_enforce_isolation_handler+0x87/0x370 [amdgpu]
[  606.299190]        process_one_work+0x21e/0x680
[  606.299199]        worker_thread+0x190/0x350
[  606.299208]        kthread+0xe7/0x120
[  606.299217]        ret_from_fork+0x3c/0x60
[  606.299227]        ret_from_fork_asm+0x1a/0x30
[  606.299236]
               -> #0 ((work_completion)(&(&adev->gfx.enforce_isolation[i].work)->work)){+.+.}-{0:0}:
[  606.299257]        __lock_acquire+0x16f9/0x2810
[  606.299267]        lock_acquire+0xd1/0x300
[  606.299276]        __flush_work+0x250/0x610
[  606.299286]        cancel_delayed_work_sync+0x71/0x80
[  606.299296]        amdgpu_gfx_kfd_sch_ctrl+0x287/0x4d0 [amdgpu]
[  606.299509]        amdgpu_gfx_enforce_isolation_ring_begin_use+0x2a4/0x5d0 [amdgpu]
[  606.299723]        amdgpu_ring_alloc+0x48/0x70 [amdgpu]
[  606.299909]        amdgpu_ib_schedule+0x176/0x8a0 [amdgpu]
[  606.300101]        amdgpu_job_run+0xac/0x1e0 [amdgpu]
[  606.300355]        drm_sched_run_job_work+0x24f/0x430 [gpu_sched]
[  606.300369]        process_one_work+0x21e/0x680
[  606.300378]        worker_thread+0x190/0x350
[  606.300387]        kthread+0xe7/0x120
[  606.300396]        ret_from_fork+0x3c/0x60
[  606.300406]        ret_from_fork_asm+0x1a/0x30
[  606.300416]
               other info that might help us debug this:

[  606.300428] Chain exists of:
                 (work_completion)(&(&adev->gfx.enforce_isolation[i].work)->work) --> &adev->enforce_isolation_mutex --> &adev->gfx.kfd_sch_mutex

[  606.300458]  Possible unsafe locking scenario:

[  606.300468]        CPU0                    CPU1
[  606.300476]        ----                    ----
[  606.300484]   lock(&adev->gfx.kfd_sch_mutex);
[  606.300494]                                lock(&adev->enforce_isolation_mutex);
[  606.300508]                                lock(&adev->gfx.kfd_sch_mutex);
[  606.300521]   lock((work_completion)(&(&adev->gfx.enforce_isolation[i].work)->work));
[  606.300536]
                *** DEADLOCK ***

[  606.300546] 5 locks held by kworker/u96:3/3825:
[  606.300555]  #0: ffff9aa5aa1f5d58 ((wq_completion)comp_1.1.0){+.+.}-{0:0}, at: process_one_work+0x3f5/0x680
[  606.300577]  #1: ffffaa53c3c97e40 ((work_completion)(&sched->work_run_job)){+.+.}-{0:0}, at: process_one_work+0x1d6/0x680
[  606.300600]  #2: ffff9aa64e463c98 (&adev->enforce_isolation_mutex){+.+.}-{3:3}, at: amdgpu_gfx_enforce_isolation_ring_begin_use+0x1c3/0x5d0 [amdgpu]
[  606.300837]  #3: ffff9aa64e432338 (&adev->gfx.kfd_sch_mutex){+.+.}-{3:3}, at: amdgpu_gfx_kfd_sch_ctrl+0x51/0x4d0 [amdgpu]
[  606.301062]  #4: ffffffff8c1a5660 (rcu_read_lock){....}-{1:2}, at: __flush_work+0x70/0x610
[  606.301083]
               stack backtrace:
[  606.301092] CPU: 14 PID: 3825 Comm: kworker/u96:3 Tainted: G           OE      6.10.0-amd-mlkd-610-311224-lof #19
[  606.301109] Hardware name: Gigabyte Technology Co., Ltd. X570S GAMING X/X570S GAMING X, BIOS F7 03/22/2024
[  606.301124] Workqueue: comp_1.1.0 drm_sched_run_job_work [gpu_sched]
[  606.301140] Call Trace:
[  606.301146]  <TASK>
[  606.301154]  dump_stack_lvl+0x9b/0xf0
[  606.301166]  dump_stack+0x10/0x20
[  606.301175]  print_circular_bug+0x26c/0x340
[  606.301187]  check_noncircular+0x157/0x170
[  606.301197]  ? register_lock_class+0x48/0x490
[  606.301213]  __lock_acquire+0x16f9/0x2810
[  606.301230]  lock_acquire+0xd1/0x300
[  606.301239]  ? __flush_work+0x232/0x610
[  606.301250]  ? srso_alias_return_thunk+0x5/0xfbef5
[  606.301261]  ? mark_held_locks+0x54/0x90
[  606.301274]  ? __flush_work+0x232/0x610
[  606.301284]  __flush_work+0x250/0x610
[  606.301293]  ? __flush_work+0x232/0x610
[  606.301305]  ? __pfx_wq_barrier_func+0x10/0x10
[  606.301318]  ? mark_held_locks+0x54/0x90
[  606.301331]  ? srso_alias_return_thunk+0x5/0xfbef5
[  606.301345]  cancel_delayed_work_sync+0x71/0x80
[  606.301356]  amdgpu_gfx_kfd_sch_ctrl+0x287/0x4d0 [amdgpu]
[  606.301661]  amdgpu_gfx_enforce_isolation_ring_begin_use+0x2a4/0x5d0 [amdgpu]
[  606.302050]  ? srso_alias_return_thunk+0x5/0xfbef5
[  606.302069]  amdgpu_ring_alloc+0x48/0x70 [amdgpu]
[  606.302452]  amdgpu_ib_schedule+0x176/0x8a0 [amdgpu]
[  606.302862]  ? drm_sched_entity_error+0x82/0x190 [gpu_sched]
[  606.302890]  amdgpu_job_run+0xac/0x1e0 [amdgpu]
[  606.303366]  drm_sched_run_job_work+0x24f/0x430 [gpu_sched]
[  606.303388]  process_one_work+0x21e/0x680
[  606.303409]  worker_thread+0x190/0x350
[  606.303424]  ? __pfx_worker_thread+0x10/0x10
[  606.303437]  kthread+0xe7/0x120
[  606.303449]  ? __pfx_kthread+0x10/0x10
[  606.303463]  ret_from_fork+0x3c/0x60
[  606.303476]  ? __pfx_kthread+0x10/0x10
[  606.303489]  ret_from_fork_asm+0x1a/0x30
[  606.303512]  </TASK>

v2: Refactor lock handling to resolve circular dependency (Alex)

- Introduced a `sched_work` flag to defer the call to
  `amdgpu_gfx_kfd_sch_ctrl` until after releasing
  `enforce_isolation_mutex`.
- This change ensures that `amdgpu_gfx_kfd_sch_ctrl` is called outside
  the critical section, preventing the circular dependency and deadlock.
- The `sched_work` flag is set within the mutex-protected section if
  conditions are met, and the actual function call is made afterward.
- This approach ensures consistent lock acquisition order.

Fixes: afefd6f245 ("drm/amdgpu: Implement Enforce Isolation Handler for KGD/KFD serialization")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-14 10:38:33 -05:00
Dave Airlie
c3d590f8ba amd-drm-next-6.14-2025-01-10:
amdgpu:
 - Fix max surface handling in DC
 - clang fixes
 - DCN 3.5 fixes
 - DCN 4.0.1 fixes
 - DC CRC fixes
 - DML updates
 - DSC fixes
 - PSR fixes
 - DC add some divide by 0 checks
 - SMU13 updates
 - SR-IOV fixes
 - RAS fixes
 - Cleaner shader support for gfx10.3 dGPUs
 - fix drm buddy trim handling
 - SDMA engine reset updates
 _ Fix RB bitmap setup
 - Fix doorbell ttm cleanup
 - Add CEC notifier support
 - DPIA updates
 - MST fixes
 
 amdkfd:
 - Shader debugger fixes
 - Trap handler cleanup
 - Cleanup includes
 - Eviction fence wq fix
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZ4FXWgAKCRC93/aFa7yZ
 2MudAQCzmzUNF9W29JOcset09IcS24Xe5liXrJWzHIPaHhQ25QD/ZU4JHb1947/8
 EnS3P7vraGPuCCet2aKmiWgtay7zggE=
 =5nDZ
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-next-6.14-2025-01-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.14-2025-01-10:

amdgpu:
- Fix max surface handling in DC
- clang fixes
- DCN 3.5 fixes
- DCN 4.0.1 fixes
- DC CRC fixes
- DML updates
- DSC fixes
- PSR fixes
- DC add some divide by 0 checks
- SMU13 updates
- SR-IOV fixes
- RAS fixes
- Cleaner shader support for gfx10.3 dGPUs
- fix drm buddy trim handling
- SDMA engine reset updates
_ Fix RB bitmap setup
- Fix doorbell ttm cleanup
- Add CEC notifier support
- DPIA updates
- MST fixes

amdkfd:
- Shader debugger fixes
- Trap handler cleanup
- Cleanup includes
- Eviction fence wq fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250110172731.2960668-1-alexander.deucher@amd.com
2025-01-13 11:13:13 +10:00
Ryan Seto
812a33a65d drm/amd/display: 3.2.316
This version brings along following fixes:
- Add some feature for secure display
- Add replay desync error count tracking and reset
- Update chip_cap defines and usage
- Remove unnecessary eDP power down
- Fix some stuttering/corruption issue on PSR panel
- Cleanup and refactoring DML2.1

Acked-by: Wayne Lin <wayne.lin@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Ryan Seto <ryanseto@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:12:47 -05:00
Charlene Liu
0ae47e971b drm/amd/display: avoid reset DTBCLK at clock init
[why & how]
this is to init to HW real DTBCLK.
and use real HW DTBCLK status to update internal logic state

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:12:37 -05:00
Peichen Huang
230dced3e2 drm/amd/display: improve dpia pre-train
[WHY]
We see unstable DP LL 4.2.1.3 test result with dpia pre-train. It is
because the outbox interrupt mechanism can not handle HPD
immediately and require some improvement.

[HOW]
1. not enable link if hpd_pending is true.
2. abort pre-train if training failed and hpd_pending is true.
3. check if 2 lane supported when it is alt mode

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:12:29 -05:00
Austin Zheng
ec6d8d49f4 drm/amd/display: Apply DML21 Patches
[Why & How]
Add several DML21 fixes

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:12:23 -05:00
Tom Chung
a2b5a99562 drm/amd/display: Use HW lock mgr for PSR1
[Why]
Without the dmub hw lock, it may cause the lock timeout issue
while do modeset on PSR1 eDP panel.

[How]
Allow dmub hw lock for PSR1.

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:11:47 -05:00
Dennis Chan
0524dd3a4f drm/amd/display: Revised for Replay Pseudo vblank control
[why & how]
Revised Replay Full screen video Pseudo vblank control.

Reviewed-by: Allen Li <allen.li@amd.com>
Signed-off-by: Dennis Chan <dennis.chan@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:11:38 -05:00
Robin Chen
4e5a9bcc9b drm/amd/display: Add a new flag for replay low hz
[Why & How]
Add a new flag in replay_config to indicate the replay
low hz status.

Reviewed-by: Allen Li <allen.li@amd.com>
Signed-off-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:11:30 -05:00
Karthi Kandasamy
92d100378c drm/amd/display: Remove unused read_ono_state function from Hwss module
[Why]
The functions read_ono_state are no longer in use and have been identified
as redundant.
Removing them helps streamline the codebase and improve maintainability by
eliminating unnecessary code.

[How]
These unused functions were removed from Hwss module, ensuring that no
functionality is affected, and the code is simplified.

Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Karthi Kandasamy <karthi.kandasamy@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:11:22 -05:00
Leo Li
4caacd1671 drm/amd/display: Do not elevate mem_type change to full update
[Why]

There should not be any need to revalidate bandwidth on memory placement
change, since the fb is expected to be pinned to DCN-accessable memory
before scanout. For APU it's DRAM, and DGPU, it's VRAM. However, async
flips + memory type change needs to be rejected.

[How]

Do not set lock_and_validation_needed on mem_type change. Instead,
reject an async_flip request if the crtc's buffer(s) changed mem_type.

This may fix stuttering/corruption experienced with PSR SU and PSR1
panels, if the compositor allocates fbs in both VRAM carveout and GTT
and flips between them.

Fixes: a7c0cad0dc ("drm/amd/display: ensure async flips are only accepted for fast updates")
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:10:08 -05:00
Leo Li
aa6713fa20 drm/amd/display: Do not wait for PSR disable on vbl enable
[Why]

Outside of a modeset/link configuration change, we should not have to
wait for the panel to exit PSR. Depending on the panel and it's state,
it may take multiple frames for it to exit PSR. Therefore, waiting in
all scenarios may cause perceived stuttering, especially in combination
with faster vblank shutdown.

[How]

PSR1 disable is hooked up to the vblank enable event, and vice versa. In
case of vblank enable, do not wait for panel to exit PSR, but still wait
in all other cases.

We also avoid a call to unnecessarily change power_opts on disable -
this ends up sending another command to dmcub fw.

When testing against IGT, some crc tests like kms_plane_alpha_blend and
amd_hotplug were failing due to CRC timeouts. This was found to be
caused by the early return before HW has fully exited PSR1. Fix this by
first making sure we grab a vblank reference, then waiting for panel to
exit PSR1, before programming hw for CRC generation.

Fixes: 58a261bfc9 ("drm/amd/display: use a more lax vblank enable policy for older ASICs")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3743
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:07:05 -05:00
Yiling Chen
f5860c88cd drm/amd/display: Remove unnecessary eDP power down
[why]
When first time of link training is fail,
eDP would be powered down and
would not be powered up for next retry link training.
It causes that all of retry link linking would be fail.

[how]
We has extracted both power up and down sequence from
enable/disable link output function before DCN32.
We remov eDP power down in dcn32_disable_link_output().

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Yiling Chen <yi-ling.chen2@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:06:58 -05:00
Nicholas Susanto
c7ccfc0d42 Revert "drm/amd/display: Enable urgent latency adjustments for DCN35"
Revert commit 284f141f5c ("drm/amd/display: Enable urgent latency adjustments for DCN35")

[Why & How]

Urgent latency increase caused  2.8K OLED monitor caused it to
block this panel support P0.

Reverting this change does not reintroduce the netflix corruption issue
which it fixed.

Fixes: 284f141f5c ("drm/amd/display: Enable urgent latency adjustments for DCN35")
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Nicholas Susanto <Nicholas.Susanto@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:05:09 -05:00
Dillon Varone
3ea943991d drm/amd/display: Add SMU interface to get UMC count for dcn401
[WHY&HOW]
BIOS table will not always contain accurate UMC channel info when
harvesting is enabled, so get the correct info from SMU.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:04:43 -05:00
Alex Hung
e2c4c6c105 drm/amd/display: Initialize denominator defaults to 1
[WHAT & HOW]
Variables, used as denominators and maybe not assigned to other values,
should be initialized to non-zero to avoid DIVIDE_BY_ZERO, as reported
by Coverity.

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:04:06 -05:00
Wayne Lin
44cea2bb9c drm/amd/display: Extend secure display to support DisplayCRC mode
[Why]
For the legacy secure display, it involves PSP + DMUB to confgiure and
retrieve the CRC/ROI result. Have requirement to support mode which all
handled by driver only.

[How]
Add another "DisplayCRC" mode, which doesn't involve PSP + DMUB.
All things are handled by the driver only

Reviewed-by: HaoPing Liu <haoping.liu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:03:52 -05:00
Wayne Lin
b6fcc3867d drm/amd/display: Add support to configure CRC window on specific CRC instance
[Why]
Have the need to specify the CRC window on specific CRC engine.
dc_stream_configure_crc() today calculates CRC on crc engine 0 only and always
resets CRC engine at first.

[How]
Add index parameter to dc_stream_configure_crc() for selecting the desired crc
engine. Additionally, add another parameter to specify whether to skip the
default reset of crc engine.

Reviewed-by: HaoPing Liu <haoping.liu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:03:45 -05:00
Wayne Lin
4a9a918545 drm/amd/display: Reduce accessing remote DPCD overhead
[Why]
Observed frame rate get dropped by tool like glxgear. Even though the
output to monitor is 60Hz, the rendered frame rate drops to 30Hz lower.

It's due to code path in some cases will trigger
dm_dp_mst_is_port_support_mode() to read out remote Link status to
assess the available bandwidth for dsc maniplation. Overhead of keep
reading remote DPCD is considerable.

[How]
Store the remote link BW in mst_local_bw and use end-to-end full_pbn
as an indicator to decide whether update the remote link bw or not.

Whenever we need the info to assess the BW, visit the stored one first.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3720
Fixes: fa57924c76 ("drm/amd/display: Refactor function dm_dp_mst_is_port_support_mode()")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:02:18 -05:00
Wayne Lin
a04d9534a8 drm/amd/display: Validate mdoe under MST LCT=1 case as well
[Why & How]
Currently in dm_dp_mst_is_port_support_mode(), when valdidating mode
under dsc decoding at the last DP link config, we only validate the
case when there is an UFP. However, if the MSTB LCT=1, there is no
UFP.

Under this case, use root_link_bw_in_kbps as the available bw to
compare.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3720
Fixes: fa57924c76 ("drm/amd/display: Refactor function dm_dp_mst_is_port_support_mode()")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:00:56 -05:00
Rafal Ostrowski
63ab80d9ac drm/amd/display: DML2.1 Post-Si Cleanup
[Why]
There are a few cleanup and refactoring tasks that need to be done
with the DML2.1 wrapper and DC interface to remove dependencies on
legacy structures and N-1 prototypes.

[How]
Implemented pipe_ctx->global_sync.
Implemented new functions to use pipe_ctx->hubp_regs and
pipe_ctx->global_sync:
- hubp_setup2
- hubp_setup_interdependent2
- Several other new functions for DCN 4.01 to support newer structures

Removed dml21_update_pipe_ctx_dchub_regs
Removed dml21_extract_legacy_watermark_set
Removed dml21_populate_pipe_ctx_dlg_param
Removed outdated dcn references in DML2.1 wrapper.

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rafal Ostrowski <rostrows@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:00:34 -05:00
Michael Strauss
00d53a0d8a drm/amd/display: Update chip_cap defines and usage
[WHY]
The defines have also been updated with prefix AMD_ and atomfirmware.h
has been temporarily updated with both sets of defines to allow the
transition.
This update is being made to standardize workaround chip_cap flags,
in order to support more workaround flags in the future.

[HOW]
Updated EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN define, the flag is now
an enum masked by EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK. All checks for
DP_FIXED_VS_EN are now performed by masking with EXT_CHIP_MASK and
checking for an exact match rather than the previous bitwise AND check.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:00:23 -05:00
Taimur Hassan
3606115ba8 drm/amd/display: [FW Promotion] Release 0.0.248.0
Refactoring some flags for replay

Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 12:00:01 -05:00
Jack Chang
7d8a4bffe5 drm/amd/display: Add replay desync error count tracking and reset functionality
[Why & How]
Build-up get/reset desync error count interface and implement the functions.

Reviewed-by: ChunTao Tso <chuntao.tso@amd.com>
Reviewed-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Jack Chang <jack.chang@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 11:59:37 -05:00
Sung Lee
59fb2d0697 drm/amd/display: Log Hard Min Clocks and Phantom Pipe Status
[WHY]
On entering/exiting idle power, certain parameters would be
very useful to know for power profiling purposes.

[HOW]
This commit adds certain hard min clocks and pipe types
to log output on idle optimization enter/exit.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Sung Lee <Sung.Lee@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 11:59:27 -05:00
Gabe Teeger
abc0ad6d08 drm/amd/display: Limit Scaling Ratio on DCN3.01
[why]
Underflow and flickering was occuring due to high scaling ratios
when resizing videos.

[how]
Limit the scaling ratios by increasing the max scaling factor

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Gabe Teeger <Gabe.Teeger@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 11:59:18 -05:00
Alex Deucher
d477e39532 drm/amdgpu/smu13: update powersave optimizations
Only apply when compute profile is selected.  This is
the only supported configuration.  Selecting other
profiles can lead to performane degradations.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 11:59:09 -05:00
Kun Liu
7594874227 drm/amd/display: add CEC notifier to amdgpu driver
This patch adds the cec_notifier feature to amdgpu driver.
The changes will allow amdgpu driver code to notify EDID
and HPD changes to an eventual CEC adapter.

Signed-off-by: Kun Liu <Kun.Liu2@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10 11:58:57 -05:00
Victor Zhao
85b73415fd drm/amdgpu: fill the ucode bo during psp resume for SRIOV
refill the ucode bo during psp resume for SRIOV, otherwise ucode load
will fail after VM hibernation and fb clean.

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:57 -05:00
Srinivasan Shanmugam
9814626751 drm/amdgpu/gfx10: Enable cleaner shader for GFX10.3.2/10.3.4/10.3.5 GPUs
Enable the cleaner shader for GFX10.3.2/10.3.4/10.3.5 GPUs to provide
data isolation between GPU workloads. The cleaner shader is responsible
for clearing the Local Data Store (LDS), Vector General Purpose
Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which
helps prevent data leakage and ensures accurate computation results.

This update extends cleaner shader support to GFX10.3.2/10.3.4/10.3.5
GPUs, previously available for GFX10.3.0. It enhances security by
clearing GPU memory between processes and maintains a consistent GPU
state across KGD and KFD workloads.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:57 -05:00
Jonathan Kim
86bde64cb7 drm/amdgpu: fix gpu recovery disable with per queue reset
Per queue reset should be bypassed when gpu recovery is disabled
with module parameter.

Fixes: ee0a469cf9 ("drm/amdkfd: support per-queue reset on gfx9")
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:57 -05:00
Jiang Liu
edec9b0690 drm/amdgpu: wrong array index to get ip block for PSP
The adev->ip_blocks array is not indexed by AMD_IP_BLOCK_TYPE_xxx,
instead we should call amdgpu_device_ip_get_ip_block() to get the
corresponding IP block oject.

Fix some checkpatch issues (Alex)

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:56 -05:00
Jiang Liu
60a2c0c12b drm/amdgpu: tear down ttm range manager for doorbell in amdgpu_ttm_fini()
Tear down ttm range manager for doorbell in function amdgpu_ttm_fini(),
to avoid memory leakage.

Fixes: 792b84fb90 ("drm/amdgpu: initialize ttm for doorbells")
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:56 -05:00
Tim Huang
6b34d0328b drm/amdgpu: fix incorrect number of active RBs for gfx12
The RB bitmap should be global active RB bitmap &
active RB bitmap based on active SA.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:56 -05:00
Tim Huang
4a60c55b3b drm/amdgpu: fix incorrect active RB bitmap in setup RBs
The RB bitmap width per SA may be 0x1 for some ASICs.
Use the actual bitmap of SA instead of 0x3 to determine
the active RB bitmap.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:56 -05:00
Jay Cornwall
62498e797a drm/amdkfd: Move gfx12 trap handler to separate file
gfx12 derivatives will have substantially different trap handler
implementations from gfx10/gfx11. Add a separate source file for
gfx12+ and remove unneeded conditional code.

No functional change.

v2: Revert copyright date to 2018, minor comment fixes

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Cc: Jonathan Kim <jonathan.kim@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:56 -05:00
Dan Carpenter
6ec6cd9acb drm/amdgpu: Fix shift type in amdgpu_debugfs_sdma_sched_mask_set()
The "mask" and "val" variables are type u64.  The problem is that the
BIT() macros are type unsigned long which is just 32 bits on 32bit
systems.

It's unlikely that people will be using this driver on 32bit kernels
and even if they did we only use the lower AMDGPU_MAX_SDMA_INSTANCES (16)
bits.  So this bug does not affect anything in real life.

Still, for correctness sake, u64 bit masks should use BIT_ULL().

Fixes: d2e3961ae3 ("drm/amdgpu: add amdgpu_sdma_sched_mask debugfs")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/d39a9325-87a4-4543-b6ec-1c61fca3a6fc@stanley.mountain
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:56 -05:00
Jesse Zhang
f7e672e6f8 drm/amdgpu: enable gfx12 queue reset flag
Enable the kgq and kcq queue reset flag

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:47 -05:00
Nathan Chancellor
e4479aecf6 drm/amd/display: Increase sanitizer frame larger than limit when compile testing with clang
Commit 24909d9ec7 ("drm/amd/display: Overwriting dualDPP UBF values
before usage") added a new warning in dml2/display_mode_core.c when
building allmodconfig with clang:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6268:13: error: stack frame size (3128) exceeds limit (3072) in 'dml_prefetch_check' [-Werror,-Wframe-larger-than]
   6268 | static void dml_prefetch_check(struct display_mode_lib_st *mode_lib)
        |             ^

Commit be4e350931 ("drm/amd/display: DML21 Reintegration For Various
Fixes") introduced one in dml2_core/dml2_core_dcn4_calcs.c with the same
configuration:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:7236:13: error: stack frame size (3256) exceeds limit (3072) in 'dml_core_mode_support' [-Werror,-Wframe-larger-than]
   7236 | static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out_params)
        |             ^

In the case of the first warning, the stack usage was already at the
limit at the parent change, so the offending change was rather
innocuous. In the case of the second warning, there was a rather
dramatic increase in stack usage compared to the parent:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:7032:13: error: stack frame size (2696) exceeds limit (2048) in 'dml_core_mode_support' [-Werror,-Wframe-larger-than]
   7032 | static bool dml_core_mode_support(struct dml2_core_calcs_mode_support_ex *in_out_params)
        |             ^

This is an unfortunate interaction between an issue with stack slot
reuse in LLVM that gets exacerbated by sanitization (which gets enabled
with all{mod,yes}config) and function calls using a much higher number
of parameters than is typical in the kernel, necessitating passing most
of these values on the stack.

While it is possible that there should be source code changes to address
these warnings, this code is difficult to modify for various reasons, as
has been noted in other changes that have occurred for similar reasons,
such as commit 6740ec97bc ("drm/amd/display: Increase frame warning
limit with KASAN or KCSAN in dml2").

Increase the frame larger than limit when compile testing with clang and
the sanitizers enabled to avoid this breakage in all{mod,yes}config, as
they are commonly used and valuable testing targets. While it is not the
best to hide this issue, it is not really relevant when compile testing,
as the sanitizers are commonly stressful on optimizations and they are
only truly useful at runtime, which COMPILE_TEST states will not occur
with the current build.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202412121748.chuX4sap-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:40 -05:00
Jesse Zhang
da5c9677d2 drm/amdgpu/pm: Implement SDMA queue reset for different asic
Implement sdma queue reset by SMU_MSG_ResetSDMA2

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Suggested-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:28 -05:00
Jesse Zhang
c8fd3a74c7 drm/amdgpu/pm: add definition PPSMC_MSG_ResetSDMA2
add the PPSMC_MSG_ResetSDMA2 definition for smu 13.0.6

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:02:09 -05:00
Jesse Zhang
39b0fa29f6 drm/amdgpu/sdma4.4.2: add apu support in sdma queue reset
Remove apu check in sdma queue reset.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-09 16:01:29 -05:00
Dave Airlie
0739b8ba82 drm-misc-next for 6.14:
UAPI Changes:
 - Clarify drm memory stats documentation
 
 Cross-subsystem Changes:
 
 Core Changes:
  - sched: Documentation fixes,
 
 Driver Changes:
  - amdgpu: Track BO memory stats at runtime
  - amdxdna: Various fixes
  - hisilicon: New HIBMC driver
  - bridges:
    - Provide default implementation of atomic_check for HDMI bridges
    - it605: HDCP improvements, MCCS Support
 -----BEGIN PGP SIGNATURE-----
 
 iJUEABMJAB0WIQTkHFbLp4ejekA/qfgnX84Zoj2+dgUCZ3uZUQAKCRAnX84Zoj2+
 dgM8AX4ur9y2eXLVQPS2IhCouWpFsYgSRnCysdVG43vszZ2kcObvlj4UV8nrrv7j
 W6x9FZYBfRLm6ctAUnu05ppm3zSbSdmsocadu1mfoDbShy31Pc5xklnB1u6M3Asw
 3mOtO5dxkA==
 =QZ6P
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2025-01-06' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for 6.14:

UAPI Changes:
- Clarify drm memory stats documentation

Cross-subsystem Changes:

Core Changes:
 - sched: Documentation fixes,

Driver Changes:
 - amdgpu: Track BO memory stats at runtime
 - amdxdna: Various fixes
 - hisilicon: New HIBMC driver
 - bridges:
   - Provide default implementation of atomic_check for HDMI bridges
   - it605: HDCP improvements, MCCS Support

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250106-augmented-kakapo-of-action-0cf000@houat
2025-01-09 15:48:50 +10:00
Dmitry Baryshkov
26d6fd8191 drm/connector: make mode_valid take a const struct drm_display_mode
The mode_valid() callbacks of drm_encoder, drm_crtc and drm_bridge
take a const struct drm_display_mode argument. Change the mode_valid
callback of drm_connector to also take a const argument.

Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Raphael Gallais-Pou <rgallaispou@gmail.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241214-drm-connector-mode-valid-const-v2-5-4f9498a4c822@linaro.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2025-01-07 12:45:19 +02:00
Dmitry Baryshkov
b255ce4388 drm/amdgpu: don't change mode in amdgpu_dm_connector_mode_valid()
Make amdgpu_dm_connector_mode_valid() duplicate the mode during the
test rather than modifying the passed mode. This is a preparation to
converting the mode_valid() callback of drm_connector to take a const
struct drm_display_mode argument.

Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241214-drm-connector-mode-valid-const-v2-2-4f9498a4c822@linaro.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2025-01-07 12:40:02 +02:00
Kent Russell
6c9c97387b drm/amdgpu: Remove unnecessary NULL check
container_of cannot return NULL, so it is unnecessary to check for
NULL after gem_to_amdgpu_bo, which is just a container_of call

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:29 -05:00
Asad Kamal
24a1b66752 drm/amd/pm: Fill max mem bw & total app clk counter
Fill max memory bandwidth and total app clock counter to metrics v1_7

v2: Remove unnecessary check

v3: Add app clock counter support for apu

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:29 -05:00
Asad Kamal
6caf95b771 drm/amd/pm: Update SMUv13.0.6 PMFW headers
Update pmfw headers for smuv13.0.6 to pmfw version 85.121

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:29 -05:00
Arunpravin Paneer Selvam
3318ba94e5 drm/amdgpu: Add a lock when accessing the buddy trim function
When running YouTube videos and Steam games simultaneously,
the tester found a system hang / race condition issue with
the multi-display configuration setting. Adding a lock to
the buddy allocator's trim function would be the solution.

<log snip>
[ 7197.250436] general protection fault, probably for non-canonical address 0xdead000000000108
[ 7197.250447] RIP: 0010:__alloc_range+0x8b/0x340 [amddrm_buddy]
[ 7197.250470] Call Trace:
[ 7197.250472]  <TASK>
[ 7197.250475]  ? show_regs+0x6d/0x80
[ 7197.250481]  ? die_addr+0x37/0xa0
[ 7197.250483]  ? exc_general_protection+0x1db/0x480
[ 7197.250488]  ? drm_suballoc_new+0x13c/0x93d [drm_suballoc_helper]
[ 7197.250493]  ? asm_exc_general_protection+0x27/0x30
[ 7197.250498]  ? __alloc_range+0x8b/0x340 [amddrm_buddy]
[ 7197.250501]  ? __alloc_range+0x109/0x340 [amddrm_buddy]
[ 7197.250506]  amddrm_buddy_block_trim+0x1b5/0x260 [amddrm_buddy]
[ 7197.250511]  amdgpu_vram_mgr_new+0x4f5/0x590 [amdgpu]
[ 7197.250682]  amdttm_resource_alloc+0x46/0xb0 [amdttm]
[ 7197.250689]  ttm_bo_alloc_resource+0xe4/0x370 [amdttm]
[ 7197.250696]  amdttm_bo_validate+0x9d/0x180 [amdttm]
[ 7197.250701]  amdgpu_bo_pin+0x15a/0x2f0 [amdgpu]
[ 7197.250831]  amdgpu_dm_plane_helper_prepare_fb+0xb2/0x360 [amdgpu]
[ 7197.251025]  ? try_wait_for_completion+0x59/0x70
[ 7197.251030]  drm_atomic_helper_prepare_planes.part.0+0x2f/0x1e0
[ 7197.251035]  drm_atomic_helper_prepare_planes+0x5d/0x70
[ 7197.251037]  drm_atomic_helper_commit+0x84/0x160
[ 7197.251040]  drm_atomic_nonblocking_commit+0x59/0x70
[ 7197.251043]  drm_mode_atomic_ioctl+0x720/0x850
[ 7197.251047]  ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
[ 7197.251049]  drm_ioctl_kernel+0xb9/0x120
[ 7197.251053]  ? srso_alias_return_thunk+0x5/0xfbef5
[ 7197.251056]  drm_ioctl+0x2d4/0x550
[ 7197.251058]  ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
[ 7197.251063]  amdgpu_drm_ioctl+0x4e/0x90 [amdgpu]
[ 7197.251186]  __x64_sys_ioctl+0xa0/0xf0
[ 7197.251190]  x64_sys_call+0x143b/0x25c0
[ 7197.251193]  do_syscall_64+0x7f/0x180
[ 7197.251197]  ? srso_alias_return_thunk+0x5/0xfbef5
[ 7197.251199]  ? amdgpu_display_user_framebuffer_create+0x215/0x320 [amdgpu]
[ 7197.251329]  ? drm_internal_framebuffer_create+0xb7/0x1a0
[ 7197.251332]  ? srso_alias_return_thunk+0x5/0xfbef5

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Fixes: 4a5ad08f53 ("drm/amdgpu: Add address alignment support to DCC buffers")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:29 -05:00
Zhu Lingshan
c901693f36 drm/amdkfd: always include uapi header in priv.h
The header usr/linux/kfd_ioctl.h is a duplicate of uapi/linux/kfd_ioctl.h.
And it is actually not a file in the source code tree.
Ideally, the usr version should be updated whenever the source code is recompiled.

However, I have noticed a discrepancy between the two headers
even after rebuilding the kernel.

This commit modifies kfd_priv.h to always include the header from uapi to ensure
the latest changes are reflected. We should always include the source
code header other than the duplication.

Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:29 -05:00
Prike Liang
2b11179e18 drm/amdgpu: reduce RLC safe mode request for gfx clock gating
The driver can only request one time for the power safe mode instead of
polling and disabling the power feature each time prior to program the
GFX clock gating control registers. This update will reduce the latency
on the GFX clock gating entry.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:29 -05:00
Aurabindo Pillai
a5d258a00b Revert "drm/amd/display: Optimize cursor position updates"
This reverts commit 88c7c56d07c108ed4de319c8dba44aa4b8a38dd1.

SW and HW state are not always matching in some cases causing cursor to
be disabled.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
Srinivasan Shanmugam
8b248b9045 drm/amdgpu/gfx10: Add cleaner shader for GFX10.3.0
This commit adds the cleaner shader microcode for GFX10.3.0 GPUs. The
cleaner shader is a piece of GPU code that is used to clear or
initialize certain GPU resources, such as Local Data Share (LDS), Vector
General Purpose Registers (VGPRs), and Scalar General Purpose Registers
(SGPRs).

Clearing these resources is important for ensuring data isolation
between different workloads running on the GPU. Without the cleaner
shader, residual data from a previous workload could potentially be
accessed by a subsequent workload, leading to data leaks and incorrect
computation results.

The cleaner shader microcode is represented as an array of 32-bit words
(`gfx_10_3_0_cleaner_shader_hex`). This array is the binary
representation of the cleaner shader code, which is written in a
low-level GPU instruction set.

When the cleaner shader feature is enabled, the AMDGPU driver loads this
array into a specific location in the GPU memory. The GPU then reads
this memory location to fetch and execute the cleaner shader
instructions.

The cleaner shader is executed automatically by the GPU at the end of
each workload, before the next workload starts. This ensures that all
GPU resources are in a clean state before the start of each workload.

This addition is part of the cleaner shader feature implementation. The
cleaner shader feature helps resource utilization by cleaning up GPU
resources after they are used. It also enhances security and reliability
by preventing data leaks between workloads.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
Kun Liu
03cc84b102 drm/amd/pm: fix BUG: scheduling while atomic
atomic scheduling will be triggered in interrupt handler for
AC/DC mode switch as following backtrace.
Call Trace:
 <IRQ>
 dump_stack_lvl
 __schedule_bug
 __schedule
 schedule
 schedule_preempt_disabled
 __mutex_lock
 smu_cmn_send_smc_msg_with_param
 smu_v13_0_irq_process
 amdgpu_irq_dispatch
 amdgpu_ih_process
 amdgpu_irq_handler
 __handle_irq_event_percpu
 handle_irq_event
 handle_edge_irq
 __common_interrupt
 common_interrupt
 </IRQ>
 <TASK>
 asm_common_interrupt

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Kun Liu <Kun.Liu2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
Srinivasan Shanmugam
9095567bc3 drm/amdgpu: Fix error handling in amdgpu_ras_add_bad_pages
It ensures that appropriate error codes are returned when an error
condition is detected

Fixes the below;
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2849 amdgpu_ras_add_bad_pages() warn: missing error code here? 'amdgpu_umc_pages_in_a_row()' failed.
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2884 amdgpu_ras_add_bad_pages() warn: missing error code here? 'amdgpu_ras_mca2pa()' failed.

v2: s/-EIO/-EINVAL, retained the use of -EINVAL from
    amdgpu_umc_pages_in_a_row & and amdgpu_ras_mca2pa_by_idx, when the
    RAS context is not initialized or the convert_ras_err_addr function is
    unavailable. (Thomas)

V3: Returning 0 as the absence of eh_data is acceptable. (Tao)

Fixes: a8d133e625 ("drm/amdgpu: parse legacy RAS bad page mixed with new data in various NPS modes")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: YiPeng Chai <yipeng.chai@amd.com>
Cc: Tao Zhou <tao.zhou1@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
Zhu Lingshan
2774ef7625 drm/amdkfd: wq_release signals dma_fence only when available
kfd_process_wq_release() signals eviction fence by
dma_fence_signal() which wanrs if dma_fence
is NULL.

kfd_process->ef is initialized by kfd_process_device_init_vm()
through ioctl. That means the fence is NULL for a new
created kfd_process, and close a kfd_process right
after open it will trigger the warning.

This commit conditionally signals the eviction fence
in kfd_process_wq_release() only when it is available.

[  503.660882] WARNING: CPU: 0 PID: 9 at drivers/dma-buf/dma-fence.c:467 dma_fence_signal+0x74/0xa0
[  503.782940] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu]
[  503.789640] RIP: 0010:dma_fence_signal+0x74/0xa0
[  503.877620] Call Trace:
[  503.880066]  <TASK>
[  503.882168]  ? __warn+0xcd/0x260
[  503.885407]  ? dma_fence_signal+0x74/0xa0
[  503.889416]  ? report_bug+0x288/0x2d0
[  503.893089]  ? handle_bug+0x53/0xa0
[  503.896587]  ? exc_invalid_op+0x14/0x50
[  503.900424]  ? asm_exc_invalid_op+0x16/0x20
[  503.904616]  ? dma_fence_signal+0x74/0xa0
[  503.908626]  kfd_process_wq_release+0x6b/0x370 [amdgpu]
[  503.914081]  process_one_work+0x654/0x10a0
[  503.918186]  worker_thread+0x6c3/0xe70
[  503.921943]  ? srso_alias_return_thunk+0x5/0xfbef5
[  503.926735]  ? srso_alias_return_thunk+0x5/0xfbef5
[  503.931527]  ? __kthread_parkme+0x82/0x140
[  503.935631]  ? __pfx_worker_thread+0x10/0x10
[  503.939904]  kthread+0x2a8/0x380
[  503.943132]  ? __pfx_kthread+0x10/0x10
[  503.946882]  ret_from_fork+0x2d/0x70
[  503.950458]  ? __pfx_kthread+0x10/0x10
[  503.954210]  ret_from_fork_asm+0x1a/0x30
[  503.958142]  </TASK>
[  503.960328] ---[ end trace 0000000000000000 ]---

Fixes: 967d226eaa ("dma-buf: add WARN_ON() illegal dma-fence signaling")
Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
yfeng1
62bf9fe6fa drm/amdgpu: Fix for MEC SJT FW Load Fail on VF
Users might switch to ROCM build does not include MEC SJT FW and driver
needs to consider this case.w

Signed-off-by: yfeng1 <yfeng1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
Martin Leung
da968c3ce4 drm/amd/display: Promote DC to 3.2.315
This version brings along the following:
- Add Interface to Dump DSC Caps from dm
- Add DP required HBlank size calc to link interface
- Add 6bpc RGB case for dcn32 output bpp calculations
- Add VC for VESA Aux Backlight Control
- Add support for setting multiple CRC windows in dc
- Clean up SPL code and outdated interfaces in dcn401_clk_mgr
- Disable replay and psr while VRR is enabled
- Fix PSR-SU not support but still call the  amdgpu_dm_psr_enable
- Implement Replay Low Hz Visual Confirm
- Extend dc_stream_get_crc to support 2nd crc engine
- Update power gating logic for DCN35 hw

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Martin Leung <Martin.Leung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
Wayne Lin
1e36774f16 drm/amd/display: Extend capability to get multiple ROI CRCs
[Why & How]
We already extend our dm, dc and dmub to support setting of multiple CRC
instances, now extend the capability to return back the ROI/CRC pair result
from psp by specifying activated ROI instances.

Reviewed-by: HaoPing Liu <haoping.liu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
Iswara Nagulendran
d566fc42c0 drm/amd/display: Add VC for VESA Aux Backlight Control
[WHY]
There is no way to distinguish
the static backlight control type
being used and the VABC support
without the use of a debugger or
reading DPCD registers.

[HOW]
Add Visual Confirm support
for VESA Aux-based Backlight Control.

Reviewed-by: Harry Vanzylldejong <harry.vanzylldejong@amd.com>
Signed-off-by: Iswara Nagulendran <Iswara.Nagulendran@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
Samson Tam
41c18333d4 drm/amd/display: Clean up SPL code
[Why & How]
Use helper functions for checking formats
Apply cositing offset in rotation case

Reviewed-by: Navid Assadian <navid.assadian@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
Roman Li
f6e09701c3 drm/amd/display: Add check for granularity in dml ceil/floor helpers
[Why]
Wrapper functions for dcn_bw_ceil2() and dcn_bw_floor2()
should check for granularity is non zero to avoid assert and
divide-by-zero error in dcn_bw_ functions.

[How]
Add check for granularity 0.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
George Shen
79a57f9479 drm/amd/display: Add 6bpc RGB case for dcn32 output bpp calculations
[Why]
Current DCN32 calculation doesn't consider RGB 6bpc for the DP case.
This results in an invalid output bpp being calculated when DSC is not
enabled in the configuration, failing the mode validation.

[How]
Add special case to handle 6bpc RGB in the output bpp calculation.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:28 -05:00
Tom Chung
d7879340e9 drm/amd/display: Disable replay and psr while VRR is enabled
[Why]
Replay and PSR will cause some video corruption while VRR is enabled.

[How]
1. Disable the Replay and PSR while VRR is enabled.
2. Change the amdgpu_dm_crtc_vrr_active() parameter to const.
   Because the function will only read data from dm_crtc_state.

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:27 -05:00
Tom Chung
f765e7ce04 drm/amd/display: Fix PSR-SU not support but still call the amdgpu_dm_psr_enable
[Why]
The enum DC_PSR_VERSION_SU_1 of psr_version is 1 and
DC_PSR_VERSION_UNSUPPORTED is 0xFFFFFFFF.

The original code may has chance trigger the amdgpu_dm_psr_enable()
while psr version is set to DC_PSR_VERSION_UNSUPPORTED.

[How]
Modify the condition to psr->psr_version == DC_PSR_VERSION_SU_1

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:27 -05:00
George Shen
1619d4168b drm/amd/display: Add HBlank reduction DPCD write to DPMS sequence
[Why]
Certain small HBlank timings may not have a large enough HBlank to
support audio when low bpp DSC is enabled. HBlank expansion by the
source can solve this problem, but requires the branch/sink to support
HBlank reduction.

[How]
Update DPMS sequence to call DM to perform DPCD write to enable HBlank
reduction on the branch/sink. Add stub in dm_helpers to be implemented
later.

Reviewed-by: Michael Strauss <michael.strauss@amd.com>
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06 14:44:27 -05:00