linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00

Author	SHA1	Message	Date
Arvind Yadav	cade59abaa	drm/amdgpu/gfx12: Add fw minimum version check for usermode queue This patch is load usermode queue based on FW support for gfx12. CP Ucode FW Vesion: [PFP = 2840, ME = 2780, MEC = 3050, MES = 123] v2: Addressed review comments from Alex - Just check the firmware versions directly. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Shashank Sharma <shashank.sharma@amd.com> Cc: Sunil Khatri <sunil.khatri@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Arvind Yadav	61ca97e959	drm/amdgpu/gfx11: Add fw minimum version check for usermode queue This patch is load usermode queue based on FW support for gfx11. CP Ucode FW version: [PFP = 2530, ME = 2390, MEC = 2600, MES = 120] v2: Addressed review comments from Alex. - Just check the firmware versions directly. v3: Firmware version checks only for Navi3x(by Alex). Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Shashank Sharma <shashank.sharma@amd.com> Cc: Sunil Khatri <sunil.khatri@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Srinivasan Shanmugam	3f397cd203	drm/amd/display: Add NULL pointer checks in dm_force_atomic_commit() This commit updates the dm_force_atomic_commit function to replace the usage of PTR_ERR_OR_ZERO with IS_ERR for checking error states after retrieving the Connector (drm_atomic_get_connector_state), CRTC (drm_atomic_get_crtc_state), and Plane (drm_atomic_get_plane_state) states. The function utilized PTR_ERR_OR_ZERO for error checking. However, this approach is inappropriate in this context because the respective functions do not return NULL; they return pointers that encode errors. This change ensures that error pointers are properly checked using IS_ERR before attempting to dereference. Cc: Harry Wentland <harry.wentland@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Alex Deucher	42a6667780	drm/amdgpu/userq: use consistent function naming s/userqueue/userq/ 1. remove the mix of amdgpu_userqueue and amdgpu_userq 2. to be consistent with other amdgpu_userq_fence.c 3. it's shorter Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Alex Deucher	4fdbe3a623	drm/amdgpu/userq: rename eviction helpers suspend/resume -> evict/restore Rename to avoid confusion with the system suspend and resume helpers. v2: update error messages Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Alex Deucher	d13e95967e	drm/amdgpu/userq: move waiting for last fence before umap Need to wait for the last fence before unmapping. This also fixes a memory leak in amdgpu_userqueue_cleanup() when the fence isn't signalled. Fixes: `b0db33c8c5` ("drm/amdgpu/userq: rework front end call sequence") Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Alex Deucher	36b0bc1731	drm/amdgpu/userq: unmap queues amdgpu_userq_mgr_fini() This was missed when the map and unmap were split out of the mqd create and destroy functions. Fixes: `b0db33c8c5` ("drm/amdgpu/userq: rework front end call sequence") Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Alex Deucher	e67b95f0cd	drm/amdgpu: switch from queue_active to queue state Track the state of the queue rather than simple active vs not. This is needed for other states (hung, preempted, etc.). While we are at it, move the state tracking into the user queue front end code. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Alex Deucher	ba324ffb25	drm/amdgpu/userq: optimize enforce isolation and s/r If user queues are disabled for all IPs in the case of suspend and resume and for gfx/compute in the case of enforce isolation, we can return early. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Dr. David Alan Gilbert	d61724056a	drm/amd/display: Remove unused *vbios_smu_set_dprefclk rn_vbios_smu_set_dprefclk() was added in 2019 by commit `4edb6fc918` ("drm/amd/display: Add Renoir clock manager") rv1_vbios_smu_set_dprefclk() was also added in 2019 by commit `dc88b4a684` ("drm/amd/display: make clk mgr soc specific") neither have been used. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Xiang Liu	696e8fa354	drm/amdgpu: Print kernel message when error logged by scrub Print a kernel message when the scrub bit of status register is set to indicate that errors are being logged by the scrub. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Gergo Koteles	20232192a5	drm/amd/display: do not copy invalid CRTC timing info Since `b255ce4388`, it is possible that the CRTC timing information for the preferred mode has not yet been calculated while amdgpu_dm_connector_mode_valid() is running. In this case use the CRTC timing information of the actual mode. Fixes: `b255ce4388` ("drm/amdgpu: don't change mode in amdgpu_dm_connector_mode_valid()") Closes: https://lore.kernel.org/all/ed09edb167e74167a694f4854102a3de6d2f1433.camel@irl.hu/ Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4085 Signed-off-by: Gergo Koteles <soyer@irl.hu> Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
TungYu Lu	33bc89949b	drm/amd/display: Correct prefetch calculation [Why] The minimum value of the dst_y_prefetch_equ was not correct in prefetch calculation whice causes OPTC underflow. [How] Add the min operation of dst_y_prefetch_equ in prefetch calculation for legacy DML. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: TungYu Lu <tungyu.lu@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Dillon Varone	19e743f0fb	drm/amd/display: Refactor SubVP cursor limiting logic [WHY] There are several gaps that can result in SubVP being enabled with incompatible HW cursor sizes, and unjust restrictions to cursor size due to wrong predictions on future usage of SubVP [HOW] - remove "prediction" logic in favor of tagging based on previous SubVP usage - block SubVP if current HW cursor settings are incompatible - provide interface for DM to determine if HW cursor should be disabled due to an attempt to enable SubVP Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Alex Deucher	11772eb73b	drm/amdgpu/userq: add a helper to check which IPs are enabled Add a helper to get a mask of IPs which support user queues. Use this in the INFO IOCTL to get the IP mask to replace the current code. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Arunpravin Paneer Selvam	4b27406380	drm/amdgpu: Add queue id support to the user queue wait IOCTL Add queue id support to the user queue wait IOCTL drm_amdgpu_userq_wait structure. This is required to retrieve the wait user queue and maintain the fence driver references in it so that the user queue in the same context releases their reference to the fence drivers at some point before queue destruction. Otherwise, we would gather those references until we don't have any more space left and crash. v2: Modify the UAPI comment as per the mesa and libdrm UAPI comment. Libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/408 Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34493 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Alex Deucher	4ec2141d23	drm/amdgpu/userq: enable support for secure queues Enable users to create secure GFX/compute queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Tested-by: Jesse.Zhang <Jesse.zhang@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Alex Deucher	87ceff6136	drm/amdgpu/userq/mes: pass the secure flag to mqd init So that we initialize the MQD as a secure queue. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Meenakshikumar Somasundaram	8aaeb25327	drm/amd/display: Fix pixel rate divider policy for 1 pixel per cycle config [Why] Pixel rate dividor was not programmed correctly for 1 pixel per cycle configuration for empty tu case. [How] Included check for empty tu when pixel rate dividor values were selected. Reviewed-by: Michael Strauss <michael.strauss@amd.com> Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Leo Li	8f772d79ef	drm/amd/display: Default IPS to RCG_IN_ACTIVE_IPS2_IN_OFF [Why] Recent findings show negligible power savings between IPS2 and RCG during static desktop. In fact, DCN related clocks are higher when IPS2 is enabled vs RCG. RCG_IN_ACTIVE is also the default policy for another OS supported by DC, and it has faster entry/exit. [How] Remove previous logic that checked for IPS2 support, and just default to `DMUB_IPS_RCG_IN_ACTIVE_IPS2_IN_OFF`. Fixes: `199888aa25` ("drm/amd/display: Update IPS default mode for DCN35/DCN351") Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Charlene Liu	8e40dd9320	drm/amd/display: Revert "not disable dtb as dto src at dpms off" [why] not all the asic using the same code path. need to revisit and limit the impact. This reverts commit `32be4e39f4`. Reviewed-by: Gabe Teeger <gabe.teeger@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
George Shen	1594b60d74	drm/amd/display: Use 16ms AUX read interval for LTTPR with old sinks [Why/How] LTTPR are required to program DPCD 0000Eh to 0x4 (16ms) upon AUX read reply to this register. Since old Sinks witih DPCD rev 1.1 and earlier may not support this register, assume the mandatory value is programmed by the LTTPR to avoid AUX timeout issues. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Mario Limonciello	a918bb4a90	drm/amd/display: Fix ACPI edid parsing on some Lenovo systems [Why] The ACPI EDID in the BIOS of a Lenovo laptop includes 3 blocks, but dm_helpers_probe_acpi_edid() has a start that is 'char'. The 3rd block index starts after 255, so it can't be indexed properly. This leads to problems with the display when the EDID is parsed. [How] Change the variable type to 'short' so that larger values can be indexed. Cc: Renjith Pananchikkal <renjith.pananchikkal@amd.com> Reported-by: Mark Pearson <mpearson@lenovo.com> Suggested-by: David Ober <dober@lenovo.com> Fixes: `c6a837088b` ("drm/amd/display: Fetch the EDID from _DDC if available for eDP") Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Taimur Hassan	ce2e117bfb	drm/amd/display: Promote DC to 3.2.329 Summary: * Implement HDMI Read request * RMCM and MCM 3DLUT support * Enable urgent latency adjustment on DCN35 * Enable phy-ssc reduction by default Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:35 -04:00
Felix Kuehling	372c8d72c3	drm/amdgpu: Allow P2P access through XGMI If peer memory is accessible through XGMI, allow leaving it in VRAM rather than forcing its migration to GTT on DMABuf attachment. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Hao (Claire) Zhou <hao.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:28:36 -04:00
Roman Li	e15d09f510	drm/amd/display: enable phy-ssc reduction by default [Why] Reduction of phy-ssc is needed to support DP2 high pixel clock on dcn35x/36. There's a special flag to enable it in dmub hw params. [How] Set hbr3_phy_ssc to true for dcn35, dcn351 and dcn36. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:28:30 -04:00
Nicholas Susanto	cd74ce1f0c	drm/amd/display: Enable urgent latency adjustment on DCN35 [Why] Urgent latency adjustment was disabled on DCN35 due to issues with P0 enablement on some platforms. Without urgent latency, underflows occur when doing certain high timing configurations. After testing, we found that reenabling urgent latency didn't reintroduce p0 support on multiple platforms. [How] renable urgent latency on DCN35 and setting it to 3000 Mhz. This reverts commit `3412860cc4`. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Nicholas Susanto <nsusanto@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:28:09 -04:00
Yihan Zhu	652968d996	drm/amd/display: DCN42 RMCM and MCM 3DLUT support [WHY & HOW] Providing hardware programming for the RMCM and MCM IPs for 3DLUT in DCN42. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:28:05 -04:00
Yihan Zhu	c9646e5a7e	drm/amd/display: DCN32 null data check [WHY & HOW] Avoid null curve data structure used in the cm block for the potential issue. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:28:01 -04:00
Roman Li	2ba8619b9a	drm/amd/display: Force full update in gpu reset [Why] While system undergoing gpu reset always do full update to sync the dc state before and after reset. [How] Return true in should_reset_plane() if gpu reset detected Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:56 -04:00
Roman Li	d91bc90139	drm/amd/display: Fix gpu reset in multidisplay config [Why] The indexing of stream_status in dm_gpureset_commit_state() is incorrect. That leads to asserts in multi-display configuration after gpu reset. [How] Adjust the indexing logic to align stream_status with surface_updates. Fixes: `cdaae8371a` ("drm/amd/display: Handle GPU reset for DC block") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3808 Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:41 -04:00
Austin Zheng	8fc3959cd4	drm/amd/display: Move Mode Support Prefetch Checks To Its Own Function [Why] Large stack size observed in DCN4 mode support when compiling with clang. Additional instrumentation added by compiler adds to stack size. dml_core_mode_support ends up going over the stack size limit due to the size of the function. [How] Move checks and calculations for prefetch to its own function. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:39 -04:00
Felix Kuehling	05185812ae	drm/amdgpu: Don't pin VRAM without DMABUF_MOVE_NOTIFY Pinning of VRAM is for peer devices that don't support dynamic attachment and move notifiers. But it requires that all such peer devices are able to access VRAM via PCIe P2P. Any device without P2P access requires migration to GTT, which fails if the memory is already pinned for another peer device. Sharing between GPUs should not require pinning in VRAM. However, if DMABUF_MOVE_NOTIFY is disabled in the kernel build, even DMABufs shared between GPUs must be pinned, which can lead to failures and functional regressions on systems where some peer GPUs are not P2P accessible. Disable VRAM pinning if move notifiers are disabled in the kernel build to fix regressions when sharing BOs between GPUs. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Hao (Claire) Zhou <hao.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:34 -04:00
Jack Chang	6df7175263	drm/amd/display: Move desync error counter operation up. [Why & How] Move desync error counter operation up to prevent it from being skipped by force disable desync error. Reviewed-by: Robin Chen <robin.chen@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Jack Chang <jack.chang@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:32 -04:00
Mario Limonciello	7e40f64896	drm/amd/display: Avoid divide by zero by initializing dummy pitch to 1 [Why] If the dummy values in `populate_dummy_dml_surface_cfg()` aren't updated then they can lead to a divide by zero in downstream callers like CalculateVMAndRowBytes() [How] Initialize dummy value to a value to avoid divide by zero. Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:29 -04:00
Chris Park	724a4b400b	drm/amd/display: Implement HDMI Read Request [Why] Read Request provides alterative method to polling to the HDMI sinks that support it. [How] Implement Read Request where interrupt can be generated by the sink. Reviewed-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Chris Park <chris.park@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:25 -04:00
Yiling Chen	f53d0f48a8	drm/amd/display: To apply the adjusted DP ref clock for DP devices [Why] For some pixel clock margin sensitive external monitor, we could not keep original DP ref clock for the ASICs supported SSC DP ref clock. [How] From slicon design team's comment, we have to apply the adjusted DP ref clock for DP devices. DP 128b (DP2) signals uses the DTBCLK not DP ref. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Yiling Chen <yi-ling.chen2@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:26:59 -04:00
Alex Deucher	eec6444923	drm/amdgpu/gfx12: add support for TMZ queues to mqd_init Set up TMZ for queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:00:43 -04:00
Alex Deucher	9486875408	drm/amdgpu/gfx11: add support for TMZ queues to mqd_init Set up TMZ for queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:00:38 -04:00
Felix Kuehling	3940796a6e	drm/amdgpu: Use allowed_domains for pinning dmabufs When determining the domains for pinning DMABufs, filter allowed_domains and fail with a warning if VRAM is forbidden and GTT is not an allowed domain. Fixes: `f5e7fabd1f` ("drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P") Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:59:05 -04:00
Alex Deucher	cb808ab833	drm/amdgpu: add tmz queue parameter to mqd props Use this to track the whether we want TMZ for queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:59:01 -04:00
Srinivasan Shanmugam	d30f610762	drm/amdgpu: Refine Cleaner Shader MEC firmware version for GFX10.1.x GPUs Update the minimum firmware version for the Cleaner Shader in the gfx_v10_0_sw_init function. This change adjusts the minimum required firmware version for the MEC firmware from 152 to 151, allowing for broader compatibility with GFX10.1 GPUs. Fixes: `25961bad92` ("drm/amdgpu/gfx10: Add cleaner shader for GFX10.1.10") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:26 -04:00
Jesse.zhang@amd.com	2200b41428	drm/amdgpu:remove old sdma reset callback mechanism This patch removes the deprecated SDMA reset callback mechanism, which was previously used to register pre-reset and post-reset callbacks for SDMA engine resets. The callback mechanism has been replaced with a more direct and efficient approach using `stop_queue` and `start_queue` functions in the ring's function table. The SDMA reset callback mechanism allowed KFD and AMDGPU to register pre-reset and post-reset functions for handling SDMA engine resets. However, this approach added unnecessary complexity and was no longer needed after the introduction of the `stop_queue` and `start_queue` functions in the ring's function table. 1. Remove Callback Mechanism: - Removed the `amdgpu_sdma_register_on_reset_callbacks` function and its associated data structures (`sdma_on_reset_funcs`). - Removed the callback registration logic from the SDMA v4.4.2 initialization code. 2. Clean Up Related Code: - Removed the `sdma_v4_4_2_set_engine_reset_funcs` function, which was used to register the callbacks. - Removed the `sdma_v4_4_2_engine_reset_funcs` structure, which contained the pre-reset and post-reset callback functions. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:22 -04:00
Sunil Khatri	9018c7fe68	drm/amdgpu/userq: add context and seqno of the fence Add context and seqno of the fence in error logging rather than printing fence ptr. Reviewed-by: Christian König <christian.koenig@amd.com> Suggested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:09 -04:00
Jesse.zhang@amd.com	6a07ac702f	drm/amdgpu: optimize queue reset and stop logic for sdma_v5_2 This patch refactors the SDMA v5.2 queue reset and stop logic to improve code readability, maintainability, and performance. The key changes include: 1. Generalized `sdma_v5_2_gfx_stop` Function: - Added an `inst_mask` parameter to allow stopping specific SDMA instances instead of all instances. This is useful for resetting individual queues. 2. Simplified `sdma_v5_2_reset_queue` Function: - Removed redundant loops and checks by directly using the `ring->me` field to identify the SDMA instance. - Reused the `sdma_v5_2_gfx_stop` function to stop the queue, reducing code duplication. v1: The general coding style is to declare variables like "i" or "r" last. E.g. longest lines first and short lasts. (Chritian) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:07 -04:00
Jesse.zhang@amd.com	574f4b5562	drm/amdgpu: optimize queue reset and stop logic for sdma_v5_0 This patch refactors the SDMA v5.0 queue reset and stop logic to improve code readability, maintainability, and performance. The key changes include: 1. Generalized `sdma_v5_0_gfx_stop` Function: - Added an `inst_mask` parameter to allow stopping specific SDMA instances instead of all instances. This is useful for resetting individual queues. 2. Simplified `sdma_v5_0_reset_queue` Function: - Removed redundant loops and checks by directly using the `ring->me` field to identify the SDMA instance. - Reused the `sdma_v5_0_gfx_stop` function to stop the queue, reducing code duplication. v1: The general coding style is to declare variables like "i" or "r" last. E.g. longest lines first and short lasts. (Chritian) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:04 -04:00
Jesse.zhang@amd.com	47454f2dc0	drm/amdgpu: Register the new sdma function pointers for sdma_v5_2 Register stop/start/soft_reset queue functions for SDMA IP versions v5.2. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:01 -04:00
Jesse.zhang@amd.com	e56d4bf57f	drm/amdgpu/: drm/amdgpu: Register the new sdma function pointers for sdma_v5_0 Register stop/start/soft_reset queue functions for SDMA IP versions v5.0. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:58 -04:00
Jesse.zhang@amd.com	5c3e7c4953	drm/amdgpu: Implement SDMA soft reset directly for v5.x This patch introduces a new function `amdgpu_sdma_soft_reset` to handle SDMA soft resets directly, rather than relying on the DPM interface. 1. New `amdgpu_sdma_soft_reset` Function: - Implements a soft reset for SDMA engines by directly writing to the hardware registers. - Handles SDMA versions 4.x and 5.x separately: - For SDMA 4.x, the existing `amdgpu_dpm_reset_sdma` function is used for backward compatibility. - For SDMA 5.x, the driver directly manipulates the `GRBM_SOFT_RESET` register to reset the specified SDMA instance. 2. Integration into `amdgpu_sdma_reset_engine`: - The `amdgpu_sdma_soft_reset` function is called during the SDMA reset process, replacing the previous call to `amdgpu_dpm_reset_sdma`. v2: r should default to an error (Alex) Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:54 -04:00
Jesse.zhang@amd.com	b22659d5d3	drm/amdgpu: switch amdgpu_sdma_reset_engine to use the new sdma function pointers Replace old callback mechanism with direct calls to stop/start functions. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:49 -04:00
Alex Deucher	a5c34299d8	drm/amdgpu/userq: enable support for queue priorities Enable users to create queues at different priority levels. The highest level is restricted to drm master. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:46 -04:00
Alex Deucher	23a650bb9f	drm/amdgpu/userq/mes: handle user queue priority Handle the queue priority set by the user. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:42 -04:00
Alex Deucher	9546c05628	drm/amdgpu/userq: add priorty to user queue structure So we can track this when we create user queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:39 -04:00
Alex Deucher	a83be6e479	drm/amdgpu/mes12: add conversion for priority levels Convert driver priority levels to MES11 priority levels. At the moment they are the same, but they may not always be. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:36 -04:00
Alex Deucher	3d0a402e7c	drm/amdgpu/mes11: add conversion for priority levels Convert driver priority levels to MES11 priority levels. At the moment they are the same, but they may not always be. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:33 -04:00
Alex Deucher	fced8e7d2d	drm/amdgpu: convert userq UAPI _pad to flags Reuse the _pad field for flags. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:18 -04:00
Wentao Liang	6027cbee19	drm/amd/display: Add error check for avi and vendor infoframe setup function The function fill_stream_properties_from_drm_display_mode() calls the function drm_hdmi_avi_infoframe_from_display_mode() and the function drm_hdmi_vendor_infoframe_from_display_mode(), but does not check its return value. Log the error messages to prevent silent failure if either function fails. Signed-off-by: Wentao Liang <vulab@iscas.ac.cn> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:13 -04:00
Alex Deucher	8f23a97907	drm/amdgpu/userq: integrate with enforce isolation Enforce isolation serializes access to the GFX IP. User queues are isolated in the MES scheduler, but we still need to serialize between kernel queues and user queues. For enforce isolation, group KGD user queues with KFD user queues. v2: split out variable renaming, add config guards v3: use new function names Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:11 -04:00
Alex Deucher	28fc3172e4	drm/amdgpu: rename enforce isolation variables Since they will be used for both KFD and KGD user queues, rename them from kfd to userq. No intended functional change. Acked-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:06 -04:00
Alex Deucher	94976e7e5e	drm/amdgpu/userq: add helpers to start/stop scheduling This will be used to stop/start user queue scheduling for example when switching between kernel and user queues when enforce isolation is enabled. v2: use idx v3: only stop compute/gfx queues Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:59 -04:00
Alex Deucher	56a0a80af0	drm/amdgpu/userq: track the xcp_id associated with the queue Track this to align with KFD for enforce isolation handling. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:55 -04:00
Emily Deng	5ae4591f4e	drm/amdgpu: Clear overflow for SRIOV For VF, it doesn't have the permission to clear overflow, clear the bit by reset. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:51 -04:00
Alex Deucher	fb20954c97	drm/amdgpu/userq: rework driver parameter Replace disable_kq parameter with user_queue parameter. The parameter has the following logic: -1 = auto (ASIC specific default) 0 = user queues disabled 1 = user queues enabled and kernel queues enabled (if supported) 2 = user queues enabled and kernel queues disabled The default behavior (-1) is currently the same as 0 for current ASICs. To enable user queues (in addition to kernel queues) set user_queue=1. To enable user queues and disable kernel queues (to make all resources available to user queues), set user_queue=2. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:47 -04:00
Asad Kamal	172494c4e9	drm/amd/pm: Enable host limit metrics support Enable host limit metrics support for smuv_13_0_12 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:43 -04:00
Alex Deucher	0ed032dc7d	drm/amdgpu/sdma7: properly reference trap interrupts for userqs We need to take a reference to the interrupts to make sure they stay enabled even if the kernel queues have disabled them. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:39 -04:00
Alex Deucher	1197cfb730	drm/amdgpu/sdma6: properly reference trap interrupts for userqs We need to take a reference to the interrupts to make sure they stay enabled even if the kernel queues have disabled them. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:36 -04:00
Asad Kamal	2c8b0d628a	drm/amd/pm: Enable host limit metrics support Enable host limit metrics support for smuv_13_0_6 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:31 -04:00
Sathishkumar S	b574729ff0	drm/amdgpu: Enable doorbell for JPEG5_0_1 Enable doorbell for JPEG5_0_1 and adjust index for VCN5_0_1. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:25 -04:00
Shiwu Zhang	8ae634f10e	drm/amdgpu: Update vcn doorbell range in NBIO 7.9 Increase vcn doorbell range for gfx950 to 11. Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:10 -04:00
Alex Deucher	e10414cf2e	drm/amdgpu/gfx12: properly reference EOP interrupts for userqs Regardless of whether we disable kernel queues, we need to take an extra reference to the pipe interrupts for user queues to make sure they stay enabled in case we disable them for kernel queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:04 -04:00
Alex Deucher	ac9984cee7	drm/amdgpu/gfx11: properly reference EOP interrupts for userqs Regardless of whether we disable kernel queues, we need to take an extra reference to the pipe interrupts for user queues to make sure they stay enabled in case we disable them for kernel queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:54:56 -04:00
Eric Huang	6b9d26089f	drm/amdkfd: fix a bug of smi event for superuser rocm-smi with superuser permission doesn't show some of smi events, i.e. page fault/migration, because the condition of "(events & all)" is false. Superuser should be able to detect all events, the condiiton of "(events & all)" seems redundant, so removing it will fix the issue. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:54:44 -04:00
Alexandre Demers	00ec6732a9	drm/amdgpu: add missing DCE6 to dce_version_to_string() Missing DCE 6.0 6.1 and 6.4 are identified as UNKNOWN. Fix this. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:54:41 -04:00
Alexandre Demers	85207abb40	drm/amdgpu: fix typo in bios_parser.c Probably a cut and paste error from using get_integrated_info_v8's comment. This has to be get_integrated_info_v9 Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:54:36 -04:00
Alexandre Demers	f82e7cf5f5	drm/amdgpu: fix duplicated value setting in dce100_resource_construct() i2c_speed_in_khz was set twice with the same values. Looking at other DCE versions, we probably wanted to set the value for i2c_speed_in_khz_hdcp. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:54:30 -04:00
Alexandre Demers	3d5d0d35a7	drm/amdgpu: fix typo in atombios.h "aligned" not "aligend" Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:54:24 -04:00
Alexandre Demers	66f6ea421a	drm/amdgpu: add missing parameter name in dce110_clk_src_construct() declaration While not needed per speaking, all the other parameters have names but this one. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:54:22 -04:00
Alexandre Demers	34c86a0f44	drm/amdgpu: rename function to follow naming convention in dce110 The prefix dce110 is used on all functions, but init_pipes() and init_hw(). Under DCN, these sames functions are prefixed. Let's keep thing coherent. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:54:18 -04:00
Dan Carpenter	0e023c327b	drm/amdgpu: Clean up error handling in amdgpu_userq_fence_driver_alloc() 1) Checkpatch complains if we print an error message for kzalloc() failure. The kzalloc() failure already has it's own error messages built in. Also this allocation is small enough that it is guaranteed to succeed. 2) Return directly instead of doing a goto free_fence_drv. The "fence_drv" is already NULL so no cleanup is necessary. Reviewed-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:54:10 -04:00
Dan Carpenter	8ff7c78bae	drm/amdgpu: Fix double free in amdgpu_userq_fence_driver_alloc() The goto frees "fence_drv" so this is a double free bug. There is no need to call amdgpu_seq64_free(adev, fence_drv->va) since the seq64 allocation failed so change the goto to goto free_fence_drv. Also propagate the error code from amdgpu_seq64_alloc() instead of hard coding it to -ENOMEM. Fixes: `e7cf21fbb2` ("drm/amdgpu: Few optimization and fixes for userq fence driver") Reviewed-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:52:53 -04:00
Alex Deucher	987718c559	drm/amdgpu/userq: move runpm handling into core userq code Pull it out of the MES code and into the generic code. It's not MES specific and needs to be applied to all user queues regardless of the backend. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:52:49 -04:00
Eric Huang	9315860d05	drm/amdkfd: fix NULL check mistake for process smi event The mistake will lead to NULL kernel oops, so fix it. Fixes: `4172b556fd` ("drm/amdkfd: add smi events for process start and end") Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:49:45 -04:00
Jesse.zhang@amd.com	ce1d40196d	drm/amdgpu/sdma_v4: Register the new sdma function pointers Register stop/start/soft_reset queue functions for sdma v4_4_2. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:49:42 -04:00
Jesse.zhang@amd.com	2989184215	drm/amdgpu: Add the new sdma function pointers for amdgpu_sdma.h This patch introduces new function pointers in the amdgpu_sdma structure to handle queue stop, start and soft reset operations. These will replace the older callback mechanism. The new functions are: - stop_kernel_queue: Stops a specific SDMA queue - start_kernel_queue: Starts/Restores a specific SDMA queue - soft_reset_kernel_queue: Performs soft reset on a specific SDMA queue v2: Update stop_queue/start_queue function paramters to use ring pointer instead of device/instance(Chritian) v3: move stop_queue/start_queue to struct amdgpu_sdma_instance and rename them. (Alex) v4: rework the ordering a bit (Alex) Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:49:36 -04:00
Alex Deucher	94fc88f680	drm/amdgpu: don't swallow errors in amdgpu_userqueue_resume_all() since we loop through the queues \|= the errors. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:49:32 -04:00
Alex Deucher	c2c722217a	drm/amdgpu/userq: handle system suspend and resume Unmap user queues on suspend and map them on resume. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:49:29 -04:00
Alex Deucher	73e12e98ec	drm/amdgpu/userq: add suspend and resume helpers Add helpers to unmap and map user queues on suspend and resume. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:49:26 -04:00
Alex Deucher	c0bbf64870	drm/amdgpu/userq: properly clean up userq fence driver on failure If userq creation fails, we need to properly unwind and free the user queue fence driver. v2: free idr as well (Sunil) Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:49:22 -04:00
Alex Deucher	edc762a51c	drm/amdgpu/userq: move some code around Move some userq fence handling code into amdgpu_userq_fence.c. This matches the other code in that file. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:49:20 -04:00
Alex Deucher	b0db33c8c5	drm/amdgpu/userq: rework front end call sequence Split out the queue map from the mqd create call and split out the queue unmap from the mqd destroy call. This splits the queue setup and teardown with the actual enablement in the firmware. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:49:16 -04:00
Alex Deucher	51a9ea4551	drm/amdgpu/userq: rename suspend/resume callbacks Rename to map and umap to better align with what is happening at the firmware level and remove the extra level of indirection in the MES userq code. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:49:10 -04:00
Alex Deucher	38feab2dea	drm/amdgpu/userq/mes: remove unused header This is unused so remove it. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:47:26 -04:00
Lijo Lazar	c235a71322	drm/amdgpu: Use the right function for hdp flush There are a few prechecks made before HDP flush like a flush is not required on APU bare metal. Using hdp callback directly bypasses those checks. Use amdgpu_device_flush_hdp which takes care of prechecks. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `1d9bff4cf8`)	2025-04-16 15:57:46 -04:00
Alex Deucher	cd9e6d6fdd	drm/amd/display/dml2: use vzalloc rather than kzalloc The structures are large and they do not require contiguous memory so use vzalloc. Fixes: `70839da636` ("drm/amd/display: Add new DCN401 sources") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4126 Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `20c50a9a79`) Cc: stable@vger.kernel.org	2025-04-16 15:55:37 -04:00
David Rosca	2036be3174	drm/amdgpu: Add back JPEG to video caps for carrizo and newer JPEG is not supported on Vega only. Fixes: `0a6e7b06bd` ("drm/amdgpu: Remove JPEG from vega and carrizo video caps") Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `0f4dfe86fe`) Cc: stable@vger.kernel.org	2025-04-16 15:55:00 -04:00
ZhenGuo Yin	e7afa85a0d	drm/amdgpu: fix warning of drm_mm_clean Kernel doorbell BOs needs to be freed before ttm_fini. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4145 Fixes: `54c30d2a8d` ("drm/amdgpu: create kernel doorbell pages") Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `39938a8ed9`) Cc: stable@vger.kernel.org	2025-04-16 15:52:49 -04:00
Mario Limonciello	1657793def	drm/amd: Forbid suspending into non-default suspend states On systems that default to 'deep' some userspace software likes to try to suspend in 'deep' first. If there is a failure for any reason (such as -ENOMEM) the failure is ignored and then it will try to use 's2idle' as a fallback. This fails, but more importantly it leads to graphical problems. Forbid this behavior and only allow suspending in the last state supported by the system. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4093 Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250408180957.4027643-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `2aabd44aa8`)	2025-04-16 15:52:20 -04:00
Christian König	447fab3095	drm/amdgpu: use a dummy owner for sysfs triggered cleaner shaders v4 Otherwise triggering sysfs multiple times without other submissions in between only runs the shader once. v2: add some comment v3: re-add missing cast v4: squash in semicolon fix Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `8b2ae7d492`)	2025-04-16 15:51:47 -04:00
Dave Airlie	683058df13	Merge tag 'drm-misc-next-2025-04-09' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.16-rc1: UAPI Changes: - Add ASAHI uapi header! - Add apple fourcc modifiers. - Add capset virtio definitions to UAPI. - Extend EXPORT_SYNC_FILE for timeline syncobjs. Cross-subsystem Changes: - Adjust DMA-BUF sg handling to not cache map on attach. - Update drm/ci, hlcdc, virtio, maintainers. - Update fbdev todo. - Allow setting dma-device for dma-buf import. - Export efi_mem_desc_lookup to make efidrm build as a module. Core Changes: - Update drm scheduler docs. - Use the correct resv object in TTM delayed destroy. - Fix compiler warning with panic qr code, and other small fixes. - drm/ci updates. - Add debugfs file for listing all bridges. - Small fixes to drm/client, ttm tests. - Add documentation to display/hdmi. - Add kunit tests for bridges. - Dont fail managed device probing if connector polling fails. - Create Kconfig.debug for drm core. - Add tests for the drm scheduler. - Add and use new access helpers for DPCPD. - Add generic and optimized conversions for format-helper. - Begin refcounting panel for improving lifetime handling. - Unify simpledrm and ofdrm sysfb, and add extra features. - Split hdmi audio in bridge to make DP audio work. Driver Changes: - Convert drivers to use devm_platform_ioremap_resource(). - Assorted small fixes to imx/legacy-bridg, gma500, pl111, nouveau, vc4, vmwgfx, ast, mxsfb, xlnx, accel/qaic, v3d, bridge/imx8qxp-ldb, ofdrm, bridge/fsl-ldb, udl, bridge/ti-sn65dsi86, bridge/anx7625, cirrus-qemu, bridge/cdns-dsi, panel/sharp, panel/himax, bridge/sil902x, renesas, imagination, various panels. - Allow attaching more display to vkms. - Add Powertip PH128800T004-ZZA01 panel. - Add rotation quirk for ZOTAC panel. - Convert bridge/tc358775 to atomic. - Remove deprecated panel calls from synaptics, novatek, samsung panels. - Refactor shmem helper page pinning and accel drivers using it. - Add dmabuf support to accel/amdxdna. - Use 4k page table format for panfrost/mediatek. - Add common powerup/down dp link helper and use it. - Assorted compiler warning fixes. - Support dma-buf import for renesas Signed-off-by: Dave Airlie <airlied@redhat.com> # Conflicts: # include/drm/drm_kunit_helpers.h From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/e147ff95-697b-4067-9e2e-7cbd424e162a@linux.intel.com	2025-04-14 15:29:49 +10:00
Shane Xiao	c3abed53ca	drm/amdkfd: Add rec SDMA engines support with limited XGMI This patch adds recommended SDMA engines with limited XGMI SDMA engines. It will help improve overall performance for device to device copies with this optimization. v2: Update the formatting issues and data type Signed-off-by: Shane Xiao <shane.xiao@amd.com> Suggested-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-13 09:56:32 -04:00
Srinivasan Shanmugam	083a0c8d17	drm/amdgpu: Enhance Cleaner Shader Handling in GFX v9.0 Architecture v2 This commit modifies the gfx_v9_0_ring_emit_cleaner_shader function to use a switch statement for cleaner shader emission based on the specific GFX IP version. The function now distinguishes between different IP versions, using PACKET3_RUN_CLEANER_SHADER_9_0 for the versions 9.0.1, 9.1.0, 9.2.1, 9.2.2, 9.3.0, and 9.4.0, while retaining PACKET3_RUN_CLEANER_SHADER for version 9.4.2. v2: Simplify logic (Alex). Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:03:06 -04:00
Srinivasan Shanmugam	8896abcfdd	drm/amdgpu: Add PACKET3_RUN_CLEANER_SHADER_9_0 for Cleaner Shader execution This commit introduces the PACKET3_RUN_CLEANER_SHADER_9_0 definition, which is a command packet utilized to instruct the GPU to execute the cleaner shader for the GFX9.0 graphics architecture. The cleaner shader is a piece of GPU code that is responsible for clearing or initializing essential GPU resources, such as Local Data Share (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs). Properly clearing these resources is vital for ensuring data isolation and security between different workloads executed on the GPU. When the GPU receives this packet, it fetches and runs the cleaner shader instructions from the specified location in the packet. Thus by preventing data leaks and ensuring that previous job states do not interfere with subsequent workloads. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:03:02 -04:00
Jesse Zhang	cf93f10101	drm/amd/amdgpu: Fix out of bounds warning in amdgpu_hw_ip_info Fix an array index out of bounds warning in the DMA IP case of amdgpu_hw_ip_info() where it was incorrectly checking adev->gfx.gfx_ring[i].no_user_submission instead of adev->sdma.instance[i].ring.no_user_submission. The mismatch caused UBSAN to report an array bounds violation since it was accessing the GFX ring array with SDMA instance indices. Fixes: `4310acd446` ("drm/amdgpu: add ring flag for no user submissions") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:02:17 -04:00
Eric Huang	4172b556fd	drm/amdkfd: add smi events for process start and end rocm-smi will be able to show the events for KFD process start/end, it is the implementation of this feature. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:01:25 -04:00
Lijo Lazar	1d9bff4cf8	drm/amdgpu: Use the right function for hdp flush There are a few prechecks made before HDP flush like a flush is not required on APU bare metal. Using hdp callback directly bypasses those checks. Use amdgpu_device_flush_hdp which takes care of prechecks. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:01:06 -04:00
Ellen Pan	5045c6c698	drm/amdgpu: Direct ret in ras_reset_err_cnt on VF With adding sriov_vf check, we directly return EOPNOTSUPP in ras_reset_error_count as we should not do anything on VF to reset RAS error count. This also fixes the issue that loading guest driver causes register violations. Reviewed-by: Ahmad Rehman <Ahmad.Rehman@amd.com> Signed-off-by: Ellen Pan <yunru.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:01:00 -04:00
Lijo Lazar	18a878fd8a	drm/amdgpu: Use generic hdp flush function Except HDP v5.2 all use a common logic for HDP flush. Use a generic function. HDP v5.2 forces NO_KIQ logic, revisit it later. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:00:56 -04:00
Candice Li	d6b22b1dff	drm/amdgpu: Set RAS EEPROM table version to v3 for umc v12_5 Set RAS EEPROM table version to v3 for umc v12_5. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:00:50 -04:00
Jesse.zhang@amd.com	e21e1e8bb8	drm/amdgpu: Enable per-queue reset for SDMA v4.4.2 on IP v9.5.0 Add support for per-queue reset on SDMA v4.4.2 when running with: 1. MEC firmware version 17 or later 2. DPM indicates SDMA reset is supported v2: Fixed supported firmware versions (Lijo) Signed-off-by: Jesse.Zhang <Jesse.zhang@amd.com> Reviewed-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:00:44 -04:00
Srinivasan Shanmugam	e62a8bc5d6	drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5.2/11.5.3 GPUs Enable the cleaner shader for additional GFX11.5.2/11.5.3 series GPUs to ensure data isolation among GPU tasks. The cleaner shader is tasked with clearing the Local Data Store (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which helps avoid data leakage and guarantees the accuracy of computational results. This update extends cleaner shader support to GFX11.5.2/11.5.3 GPUs, previously available for GFX11.0.3. It enhances security by clearing GPU memory between processes and maintains a consistent GPU state across KGD and KFD workloads. Cc: Mario Sopena-Novales <mario.novales@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:00:39 -04:00
Alex Deucher	20c50a9a79	drm/amd/display/dml2: use vzalloc rather than kzalloc The structures are large and they do not require contiguous memory so use vzalloc. Fixes: `70839da636` ("drm/amd/display: Add new DCN401 sources") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4126 Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 17:00:02 -04:00
Roman Li	309d11b4bb	drm/amd/display: Add htmldocs description for fused_io interface [Why] htmldocs build warning: "Function parameter or struct member 'fused_io' not described in 'amdgpu_display_manager'". [How] Add missing description. Fixes: `ce801e5d6c` ("drm/amd/display: HDCP Locality check using DMUB Fused IO") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Roman Li <Roman.Li@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:58:19 -04:00
Alex Deucher	2e0454b730	drm/amdgpu: adjust enforce_isolation handling Switch from a bool to an enum and allow more options for enforce isolation. There are now 3 modes of operation: - Disabled (0) - Enabled (serialization and cleaner shader) (1) - Enabled in legacy mode (no serialization or cleaner shader) (2) This provides better flexibility for more use cases. Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:58:15 -04:00
Alex Deucher	b86fd212f3	drm/amdgpu/mes12: use the device value for enforce isolation Use the local setting rather than the global parameter. Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:58:12 -04:00
Alex Deucher	a7bb01337f	drm/amdgpu/mes11: use the device value for enforce isolation Use the local setting rather than the global parameter. Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:58:04 -04:00
David Rosca	0f4dfe86fe	drm/amdgpu: Add back JPEG to video caps for carrizo and newer JPEG is not supported on Vega only. Fixes: `0a6e7b06bd` ("drm/amdgpu: Remove JPEG from vega and carrizo video caps") Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:56:44 -04:00
Prike Liang	9a218d6f47	drm/amdgpu/gfx12: Implement the GFX12 KCQ pipe reset Implement the GFX12 KCQ pipe reset, and disable the GFX12 kernel compute queue until the CPFW fully supports it. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:56:07 -04:00
Ce Sun	732c6cefc1	drm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset Checking hive is more readable. The following smatch warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6820 amdgpu_pci_slot_reset() warn: iterator used outside loop: 'tmp_adev' Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:55:55 -04:00
ZhenGuo Yin	39938a8ed9	drm/amdgpu: fix warning of drm_mm_clean Kernel doorbell BOs needs to be freed before ttm_fini. Fixes: `54c30d2a8d` ("drm/amdgpu: create kernel doorbell pages") Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:54:57 -04:00
Kenneth Feng	c770ef1967	drm/amd/amdgpu: disable ASPM in some situations disable ASPM with some ASICs on some specific platforms. required from PCIe controller owner. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:54:48 -04:00
Prike Liang	0ec7535f5b	drm/amdgpu: remove the duplicated mes queue active state setting The MES queue deactivation and active status are already set in mes_userq_unmap\|map(), so the caller needn't set the queue_active bit again. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:54:39 -04:00
Ruili Ji	f0ec5926da	amd/amdgpu: Implement VCN queue reset for vcn 4.0.3 Add function for vcn queue reset to make driver to do fine-grained reset instead of the whole gpu reset. Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:54:27 -04:00
Masha Grinman	8ef4e99674	drm/amdgpu: Move read of snoop register from guest to host Guest is reading/writing to snoop register which is a security violation We moved the code to the host driver And also added a validation on the guest side to check if it's guest Signed-off-by: Masha Grinman <Masha.Grinman@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:54:02 -04:00
Mario Limonciello	2aabd44aa8	drm/amd: Forbid suspending into non-default suspend states On systems that default to 'deep' some userspace software likes to try to suspend in 'deep' first. If there is a failure for any reason (such as -ENOMEM) the failure is ignored and then it will try to use 's2idle' as a fallback. This fails, but more importantly it leads to graphical problems. Forbid this behavior and only allow suspending in the last state supported by the system. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4093 Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250408180957.4027643-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:53:08 -04:00
Christian König	8b2ae7d492	drm/amdgpu: use a dummy owner for sysfs triggered cleaner shaders v4 Otherwise triggering sysfs multiple times without other submissions in between only runs the shader once. v2: add some comment v3: re-add missing cast v4: squash in semicolon fix Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-11 16:52:33 -04:00
Dave Airlie	47271a0cae	amd-drm-fixes-6.15-2025-04-09: amdgpu: - MES FW version caching fixes - Only use GTT as a fallback if we already have a backing store - dma_buf fix - IP discovery fix - Replay and PSR with VRR fix - DC FP fixes - eDP fixes - KIQ TLB invalidate fix - Enable dmem groups support - Allow pinning VRAM dma bufs if imports can do P2P - Workload profile fixes - Prevent possible division by 0 in fan handling amdkfd: - Queue reset fixes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZ/akUAAKCRC93/aFa7yZ 2HtpAP9ptJj5RKxOcdh8lbiOZN3RGjxsqw6wXk7LUnw3A2npVAD5AXoaTis4S84e 0NIY7bd1Q1njP2akA4O5AKykkpgVtgw= =Ha4H -----END PGP SIGNATURE----- Merge tag 'amd-drm-fixes-6.15-2025-04-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.15-2025-04-09: amdgpu: - MES FW version caching fixes - Only use GTT as a fallback if we already have a backing store - dma_buf fix - IP discovery fix - Replay and PSR with VRR fix - DC FP fixes - eDP fixes - KIQ TLB invalidate fix - Enable dmem groups support - Allow pinning VRAM dma bufs if imports can do P2P - Workload profile fixes - Prevent possible division by 0 in fan handling amdkfd: - Queue reset fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250409165238.1180153-1-alexander.deucher@amd.com	2025-04-10 17:14:02 +10:00
Alex Deucher	34779e1446	drm/amdgpu/mes12: optimize MES pipe FW version fetching Don't fetch it again if we already have it. It seems the registers don't reliably have the value at resume in some cases. Fixes: `785f0f9fe7` ("drm/amdgpu: Add mes v12_0 ip block support (v4)") Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `9e7b08d239`) Cc: stable@vger.kernel.org # 6.12.x	2025-04-09 10:53:11 -04:00
Denis Arefev	7ba88b5ccc	drm/amd/pm/smu11: Prevent division by zero The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `1e866f1fe5` ("drm/amd/pm: Prevent divide by zero") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `da7dc714a8`) Cc: stable@vger.kernel.org	2025-04-09 10:53:11 -04:00
Alex Deucher	35a5440832	drm/amdgpu: cancel gfx idle work in device suspend for s0ix This is normally handled in the gfx IP suspend callbacks, but for S0ix, those are skipped because we don't want to touch gfx. So handle it in device suspend. Fixes: `b9467983b7` ("drm/amdgpu: add dynamic workload profile switching for gfx10") Fixes: `963537ca23` ("drm/amdgpu: add dynamic workload profile switching for gfx11") Fixes: `5f95a15495` ("drm/amdgpu: add dynamic workload profile switching for gfx12") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `906ad45167`) Cc: stable@vger.kernel.org	2025-04-09 10:53:11 -04:00
Kenneth Feng	50f29ead1f	drm/amd/display: pause the workload setting in dm Pause the workload setting in dm when doing idle optimization Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `b23f81c442`)	2025-04-09 10:53:11 -04:00
Alex Deucher	c81a3ceedb	drm/amdgpu/pm/swsmu: implement pause workload profile Add the callback for implementation for swsmu. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `92e511d1ce`)	2025-04-09 10:53:11 -04:00
Alex Deucher	7f991dd364	drm/amdgpu/pm: add workload profile pause helper To be used for display idle optimizations when we want to pause non-default profiles. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `6dafb5d4c7`)	2025-04-09 10:53:11 -04:00
Alex Deucher	72801504fd	drm/amdgpu/sdma7: add support for disable_kq When the parameter is set, disable user submissions to kernel queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:23 -04:00
Alex Deucher	fcf5eb979a	drm/amdgpu/sdma6: add support for disable_kq When the parameter is set, disable user submissions to kernel queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:23 -04:00
Alex Deucher	1d65006fc1	drm/amdgpu/sdma: add flag for tracking disable_kq For SDMA, we still need kernel queues for paging so they need to be initialized, but we no not want to accept submissions from userspace when disable_kq is set. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:23 -04:00
Alex Deucher	0981e0ef18	drm/amdgpu/gfx12: add support for disable_kq Plumb in support for disabling kernel queues. v2: use ring counts per Felix' suggestion v3: fix stream fault handler, enable EOP interrupts v4: fix MEC interrupt offset (Sunil) v5: clean up after removing extra sched.ready settings Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:23 -04:00
Alex Deucher	1e63ebc0d4	drm/amdgpu/gfx11: add support for disable_kq Plumb in support for disabling kernel queues in GFX11. We have to bring up a GFX queue briefly in order to initialize the clear state. After that we can disable it. v2: use ring counts per Felix' suggestion v3: fix stream fault handler, enable EOP interrupts v4: fix MEC interrupt offset (Sunil) Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	1f61fc28b9	drm/amdgpu/mes: make more vmids available when disable_kq=1 If we don't have kernel queues, the vmids can be used by the MES for user queues. Acked-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	acdc43f270	drm/amdgpu/mes: update hqd masks when disable_kq is set Make all resources available to user queues. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Suggested-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	f091fa777b	drm/amdgpu/gfx: add generic handling for disable_kq Add proper checks for disable_kq functionality in gfx helper functions. Add special logic for families that require the clear state setup. v2: use ring count as per Felix suggestion v3: fix num_gfx_rings handling in amdgpu_gfx_graphics_queue_acquire() v4: fix error code (Alex) Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	4310acd446	drm/amdgpu: add ring flag for no user submissions This would be set by IPs which only accept submissions from the kernel, not userspace, such as when kernel queues are disabled. Don't expose the rings to userspace and reject any submissions in the CS IOCTL. v2: fix error code (Alex) Reviewed-by: Sunil Khatri<sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	a96a787d6d	drm/amdgpu: add parameter to disable kernel queues On chips that support user queues, setting this option will disable kernel queues to be used to validate user queues without kernel queues. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	cf97de5b54	drm/amdgpu/userq: prevent runtime pm when userqs are active Similar to KFD, prevent runtime pm while user queues are active. Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	4ce60dbada	drm/amdgpu: store userq_managers in a list in adev So we can iterate across them when we need to manage all user queues. v2: add uq_mgr to adev list in amdgpu_userq_mgr_init Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	100b6010d7	drm/amdgpu: bump version for user queue IP support query Add the user queue IP support query to the drm_amdgpu_info_device query. Cc: marek.olsak@amd.com Cc: prike.liang@amd.com Cc: sunil.khatri@amd.com Cc: yogesh.mohanmarimuthu@amd.com Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	1af6881263	drm/amdgpu: add UAPI to query if user queues are supported Add an INFO query to check if user queues are supported. v2: switch to a mask of IPs (Marek) v3: move to drm_amdgpu_info_device (Marek) Cc: marek.olsak@amd.com Cc: prike.liang@amd.com Cc: sunil.khatri@amd.com Cc: yogesh.mohanmarimuthu@amd.com Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	a4a3373da2	drm/amdgpu/gfx12: split userq setup to a separate switch Add a separate switch statement for the userq callback assignment so that we can assign the callbacks for each asic as the firmware becomes available. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Alex Deucher	9983ed9693	drm/amdgpu/gfx11: clean up and consolidate sw_init With the ME details fixed, we can now consolidate this state. Also split out the userq setup into a separate switch statement so that we can set them per IP version when the firmwares are ready. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:22 -04:00
Arvind Yadav	32bd8b3ea7	drm/amdgpu: Fix display freezing issue when resizing apps The display is freezing because the amdgpu_userq_wait_ioctl() is waiting for a non-user queue fence(specifically, the PT update fence). RootCause: The resume_work is initiated by both amdgpu_userq_suspend and amdgpu_userqueue_ensure_ev_fence at same time. The amdgpu_userq_suspend signals a dma-fence and subsequently triggers the resume_work, which is intended to replace the existing fence by creating new dma-fence. However, following this, the amdgpu_userqueue_ensure_ev_fence schedules another resume_work that generates a new dma-fence, thereby replacing the one created by amdgpu_userq_suspend. Consequently, the original fence will never be signaled. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Shashank Sharma <shashank.sharma@amd.com> Cc: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Alex Deucher	b6f190e623	drm/amdgpu/mes: warn on unexpected pipe numbers Warn if the number of pipes exceeds what the MES supports. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Alex Deucher	9e2bbba1d5	drm/amdgpu/mes: centralize gfx_hqd mask management Move it to amdgpu_mes to align with the compute and sdma hqd masks. No functional change. v2: rebase on new changes v3: misc optimizations Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri<sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Alex Deucher	4220d2c7c4	drm/amdgpu: remove is_mes_queue flag This was leftover from MES bring up when we had MES user queues in the kernel. It's no longer used so remove it. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Alex Deucher	cb17fff3a2	drm/amdgpu/mes: remove unused functions Leftover from the MES self tests that were removed previously. Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Alex Deucher	158bfbc72c	drm/amdgpu: validate user queue parameters Make sure these are set properly to ensure compatibility if we ever update the IOCTL interface. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Arvind Yadav	ad6c120f68	drm/amdgpu: fix the memleak caused by fence not released Encountering a taint issue during the unloading of gpu_sched due to the fence not being released/put. In this context, amdgpu_vm_clear_freed is responsible for creating a job to update the page table (PT). It allocates kmem_cache for drm_sched_fence and returns the finished fence associated with job->base.s_fence. In case of Usermode queue this finished fence is added to the timeline sync object through amdgpu_gem_update_bo_mapping, which is utilized by user space to ensure the completion of the PT update. [ 508.900587] ============================================================================= [ 508.900605] BUG drm_sched_fence (Tainted: G N): Objects remaining in drm_sched_fence on __kmem_cache_shutdown() [ 508.900617] ----------------------------------------------------------------------------- [ 508.900627] Slab 0xffffe0cc04548780 objects=32 used=2 fp=0xffff8ea81521f000 flags=0x17ffffc0000240(workingset\|head\|node=0\|zone=2\|lastcpupid=0x1fffff) [ 508.900645] CPU: 3 UID: 0 PID: 2337 Comm: rmmod Tainted: G N 6.12.0+ #1 [ 508.900651] Tainted: [N]=TEST [ 508.900653] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F34 06/10/2021 [ 508.900656] Call Trace: [ 508.900659] <TASK> [ 508.900665] dump_stack_lvl+0x70/0x90 [ 508.900674] dump_stack+0x14/0x20 [ 508.900678] slab_err+0xcb/0x110 [ 508.900687] ? srso_return_thunk+0x5/0x5f [ 508.900692] ? try_to_grab_pending+0xd3/0x1d0 [ 508.900697] ? srso_return_thunk+0x5/0x5f [ 508.900701] ? mutex_lock+0x17/0x50 [ 508.900708] __kmem_cache_shutdown+0x144/0x2d0 [ 508.900713] ? flush_rcu_work+0x50/0x60 [ 508.900719] kmem_cache_destroy+0x46/0x1f0 [ 508.900728] drm_sched_fence_slab_fini+0x19/0x970 [gpu_sched] [ 508.900736] __do_sys_delete_module.constprop.0+0x184/0x320 [ 508.900744] ? srso_return_thunk+0x5/0x5f [ 508.900747] ? debug_smp_processor_id+0x1b/0x30 [ 508.900754] __x64_sys_delete_module+0x16/0x20 [ 508.900758] x64_sys_call+0xdf/0x20d0 [ 508.900763] do_syscall_64+0x51/0x120 [ 508.900769] entry_SYSCALL_64_after_hwframe+0x76/0x7e v2: call dma_fence_put in amdgpu_gem_va_update_vm v3: Addressed review comments from Christian. - calling amdgpu_gem_update_timeline_node before switch. - puting a dma_fence in case of error or !timeline_syncobj. v4: Addressed review comments from Christian. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Shashank Sharma <shashank.sharma@amd.com> Cc: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Le Ma <le.ma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Alex Deucher	ecdb0b32e5	drm/amdgpu/userq: move the header to amdgpu directory To align with other headers. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Alex Deucher	665de8c947	drm/amdgpu/userq: remove BROKEN from config This can be enabled now. We have the firmware checks in place. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Alex Deucher	5ca4095960	drm/amdgpu: add userq firmware version checks Currently disabled until the firmwares are officially released. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Alex Deucher	f36e4876c8	drm/amdgpu/gfx11: fix config guard s/CONFIG_DRM_AMD_USERQ_GFX/CONFIG_DRM_AMDGPU_NAVI3X_USERQ/ Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:21 -04:00
Alex Deucher	c4f42c8d0b	drm/amdgpu/Kconfig: fix wording of DRM_AMDGPU_NAVI3X_USERQ The feature is not navi3x specific at this point. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Alex Deucher	df85baa767	drm/amdgpu: return an error in the userq IOCTL when DRM_AMDGPU_NAVI3X_USERQ=n I'd swear this was already fixed, but I guess the patch never landed. Add it now. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Alex Deucher	2a060b3ae9	drm/amdgpu/userq: handle runtime pm Take a reference when we create a queue and drop it when we destroy the queue. We need to keep the device active while user queues are active. v2: squash in fix from Sunil v3: squash in fix from Prike Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Alex Deucher	29adc5c2dd	drm/amdgpu/userq: fix hardcoded uq functions Use the IP type to look up the userq functions rather than hardcoding it. Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Arvind Yadav	f15d4e92f7	drm/amdgpu: Fix display freeze lockup error A deadlock situation has arised between the userq signal ioctl and the eviction fence. In this scenario, the function amdgpu_userq_signal_ioctl() has acquired a reservation lock on the read/write buffer object (BO) through drm_exec. Subsequently, it calls amdgpu_userqueue_ensure_ev_fence(), which is in a waiting for the userq resume work. Meanwhile, the userq suspend worker has initiated the userq resume work(amdgpu_userqueue_resume_worker). This userq resume work attempts to validate the vm->done BO, leading to amdgpu_userqueue_validate_bos also attempting to reservation lock the same write BO that is already locked by amdgpu_userq_signal_ioctl. As a result, the resume work becomes stalled, causing amdgpu_userqueue_ensure_ev_fence to remain in a waiting state. Call Trace: [ 242.836469] INFO: task gnome-shel:cs0:1288 blocked for more than 120 seconds. [ 242.836486] Tainted: G OE 6.12.0-rc2rebased-oct-24+ #4 [ 242.836491] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 242.836494] task:gnome-shel:cs0 state:D stack:0 pid:1288 tgid:1282 ppid:1180 flags:0x00000002 [ 242.836503] Call Trace: [ 242.836508] <TASK> [ 242.836517] __schedule+0x3e0/0xb10 [ 242.836530] ? srso_return_thunk+0x5/0x5f [ 242.836541] schedule+0x31/0x120 [ 242.836546] schedule_timeout+0x150/0x160 [ 242.836551] ? srso_return_thunk+0x5/0x5f [ 242.836555] ? sysvec_call_function+0x69/0xd0 [ 242.836562] ? srso_return_thunk+0x5/0x5f [ 242.836567] ? preempt_count_add+0x7f/0xd0 [ 242.836577] __wait_for_common+0x91/0x180 [ 242.836582] ? __pfx_schedule_timeout+0x10/0x10 [ 242.836590] wait_for_completion+0x28/0x30 [ 242.836595] __flush_work+0x16c/0x290 [ 242.836602] ? __pfx_wq_barrier_func+0x10/0x10 [ 242.836611] flush_delayed_work+0x3a/0x60 [ 242.836621] amdgpu_userqueue_ensure_ev_fence+0x2d/0xb0 [amdgpu] [ 242.836966] amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu] [ 242.837171] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ 242.837365] drm_ioctl_kernel+0xae/0x100 [drm] [ 242.837398] drm_ioctl+0x2a1/0x500 [drm] [ 242.837420] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ 242.837622] ? srso_return_thunk+0x5/0x5f [ 242.837627] ? srso_return_thunk+0x5/0x5f [ 242.837630] ? _raw_spin_unlock_irqrestore+0x2b/0x50 [ 242.837635] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu] [ 242.837811] __x64_sys_ioctl+0x99/0xd0 [ 242.837820] x64_sys_call+0x1209/0x20d0 [ 242.837825] do_syscall_64+0x51/0x120 [ 242.837830] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 242.837835] RIP: 0033:0x7f2f33f1a94f [ 242.837838] RSP: 002b:00007f2f24ffea30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 242.837842] RAX: ffffffffffffffda RBX: 00007f2f24ffebd0 RCX: 00007f2f33f1a94f [ 242.837845] RDX: 00007f2f24ffebd0 RSI: 00000000c0306457 RDI: 000000000000000d [ 242.837847] RBP: 00007f2f24ffeab0 R08: 0000000000000000 R09: 0000000000000000 [ 242.837849] R10: 00007f2f24ffecd0 R11: 0000000000000246 R12: 00007f2f25000640 [ 242.837851] R13: 00000000c0306457 R14: 000000000000000d R15: 00007fff3b39c1e0 [ 242.837858] </TASK> [ 242.837865] INFO: task Xwayland:cs0:1517 blocked for more than 120 seconds. [ 242.837869] Tainted: G OE 6.12.0-rc2rebased-oct-24+ #4 [ 242.837872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 242.837874] task:Xwayland:cs0 state:D stack:0 pid:1517 tgid:1338 ppid:1282 flags:0x00004002 [ 242.837878] Call Trace: [ 242.837880] <TASK> [ 242.837883] __schedule+0x3e0/0xb10 [ 242.837890] schedule+0x31/0x120 [ 242.837894] schedule_preempt_disabled+0x1c/0x30 [ 242.837897] __mutex_lock.constprop.0+0x386/0x6e0 [ 242.837902] ? srso_return_thunk+0x5/0x5f [ 242.837905] ? __timer_delete_sync+0x81/0xe0 [ 242.837911] __mutex_lock_slowpath+0x13/0x20 [ 242.837915] mutex_lock+0x3b/0x50 [ 242.837919] amdgpu_userqueue_ensure_ev_fence+0x35/0xb0 [amdgpu] [ 242.838138] amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu] [ 242.838340] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ 242.838531] drm_ioctl_kernel+0xae/0x100 [drm] [ 242.838559] drm_ioctl+0x2a1/0x500 [drm] [ 242.838580] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ 242.838778] ? srso_return_thunk+0x5/0x5f [ 242.838783] ? srso_return_thunk+0x5/0x5f [ 242.838786] ? _raw_spin_unlock_irqrestore+0x2b/0x50 [ 242.838791] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu] [ 242.838967] __x64_sys_ioctl+0x99/0xd0 [ 242.838972] x64_sys_call+0x1209/0x20d0 [ 242.838975] do_syscall_64+0x51/0x120 [ 242.838979] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 242.838982] RIP: 0033:0x7f9118b1a94f [ 242.838985] RSP: 002b:00007f910cdff760 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 242.838989] RAX: ffffffffffffffda RBX: 00007f910cdff910 RCX: 00007f9118b1a94f [ 242.838991] RDX: 00007f910cdff910 RSI: 00000000c0306457 RDI: 000000000000000c [ 242.838993] RBP: 00007f910cdff7e0 R08: 0000000000000000 R09: 0000000000000001 [ 242.838995] R10: 00007f910cdff9d4 R11: 0000000000000246 R12: 00007f910ce00640 [ 242.838997] R13: 00000000c0306457 R14: 000000000000000c R15: 00007fff9dd11d10 [ 242.839004] </TASK> v2: Addressed review comemnts from Christian. v3/v4: Addressed review comemnts from Christian. - Move drm_exec drm_exec loop after userq fence create. - cleanup the newly created userq fence in case of error. v5 - Addressed review comemnts from Christian. - Create a new amdgpu_userq_fence_alloc() function for allocation. - Calling dma_fence_put for cleanup procedure. - make amdgpu_userq_fence_create() function static. - drm_exec_init is called after mutex_unlock. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Arunpravin Paneer Selvam	fc4a85c6b2	drm/amdgpu: Modify the seq64 VM cache policy The seq64 VM cache policy should be set to UC (Uncached) to match with userqueue fence address kernel mapped memory's cache settings. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Arunpravin Paneer Selvam	239a310b49	drm/amdgpu: Fix out-of-bounds issue in user fence Fix out-of-bounds issue in userq fence create when accessing the userq xa structure. Added a lock to protect the race condition. v2:(Christian) - Allocate memory with GFP_ATOMIC. v3: - Moved to 2 xa approach. v4:(Christian) - Lock the xa_for_each blocks and memory allocation part as well to make sure that xa is not modified in between the 2 xa_for_each blocks. BUG: KASAN: slab-out-of-bounds in amdgpu_userq_fence_create+0x726/0x880 [amdgpu] [ +0.000006] Call Trace: [ +0.000005] <TASK> [ +0.000005] dump_stack_lvl+0x6c/0x90 [ +0.000011] print_report+0xc4/0x5e0 [ +0.000009] ? srso_return_thunk+0x5/0x5f [ +0.000008] ? kasan_complete_mode_report_info+0x26/0x1d0 [ +0.000007] ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu] [ +0.000405] kasan_report+0xdf/0x120 [ +0.000009] ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu] [ +0.000405] __asan_report_store8_noabort+0x17/0x20 [ +0.000007] amdgpu_userq_fence_create+0x726/0x880 [amdgpu] [ +0.000406] ? __pfx_amdgpu_userq_fence_create+0x10/0x10 [amdgpu] [ +0.000408] ? srso_return_thunk+0x5/0x5f [ +0.000008] ? ttm_resource_move_to_lru_tail+0x235/0x4f0 [ttm] [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000008] amdgpu_userq_signal_ioctl+0xd29/0x1c70 [amdgpu] [ +0.000412] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ +0.000404] ? try_to_wake_up+0x165/0x1840 [ +0.000010] ? __pfx_futex_wake_mark+0x10/0x10 [ +0.000011] drm_ioctl_kernel+0x178/0x2f0 [drm] [ +0.000050] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ +0.000404] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm] [ +0.000043] ? __kasan_check_read+0x11/0x20 [ +0.000007] ? srso_return_thunk+0x5/0x5f [ +0.000007] ? __kasan_check_write+0x14/0x20 [ +0.000008] drm_ioctl+0x513/0xd20 [drm] [ +0.000040] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ +0.000407] ? __pfx_drm_ioctl+0x10/0x10 [drm] [ +0.000044] ? srso_return_thunk+0x5/0x5f [ +0.000007] ? _raw_spin_lock_irqsave+0x99/0x100 [ +0.000007] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ +0.000006] ? __rseq_handle_notify_resume+0x188/0xc30 [ +0.000008] ? srso_return_thunk+0x5/0x5f [ +0.000008] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? _raw_spin_unlock_irqrestore+0x27/0x50 [ +0.000010] amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu] [ +0.000388] __x64_sys_ioctl+0x135/0x1b0 [ +0.000009] x64_sys_call+0x1205/0x20d0 [ +0.000007] do_syscall_64+0x4d/0x120 [ +0.000008] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000007] RIP: 0033:0x7f7c3d31a94f Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Saleemkhan Jamadar	49cd3353db	drm/amdgpu: add db size and offset range for VCN and VPE VCN and VPE have different offset range, update the doorbell offset range repsectively. Doorbell size for VCN and VPE is 32bit. v1 : add gfx switch case and fix checkpatch warnings (Shashank) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Saleemkhan Jamadar	3e37fcb57b	drm/amdgpu: map doorbell for the requested userq Introduce db_info structure to the populate the doorbell information that is required to be mapped. Made changes to the doorbell mapping func more generic, by taking parameters that vary based on IPs and/or usecase into db_info structure. v2 - Fix space alignment and checkpatch warnings(Shashank) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Christian König	8639d2f5ca	drm/amdgpu: fix call to amdgpu_eviction_fence_detach That needs to be done after grabbing the lock, not before. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Arvind Yadav	adba092973	drm/amdgpu: Fix Illegal opcode in command stream Error When applications closes, it triggers the drm_file_free function which subsequently releases all allocated buffer objects. Concurrently, the resume_worker thread will attempt to map the usermode queue. However, since the wptr buffer object has already been deallocated, this will result in an Illegal opcode error being raised in the command stream. Now replacing drm_release() with a new function amdgpu_drm_release(). This function will set the flag to prevent the scheduling of any new queue resume/map, stop all queues and then call drm_release(). V2: - Replace drm_release with amdgpu_drm_release(Christian). Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Arunpravin Paneer Selvam	02521454f0	drm/amdgpu: Apply sign extension to seq64 Apply sign extension to seq64 va address. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:20 -04:00
Christian König	91acb5d47b	drm/amdgpu: Modify the MES process va end limit Modify the MES process va end limit to max pfn. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Arunpravin Paneer Selvam	ed5fdc1fc2	drm/amdgpu: Fix the use-after-free issue in wait IOCTL The xarray pointer which has the userqueue xarray structure reference should be cleared when the userqueue gets destroyed. Otherwise, we may access the freed xa memory and see the below warnings. warning 1: BUG: KASAN: slab-use-after-free in _raw_spin_lock+0x7a/0xe0 [ +0.000044] Call Trace: [ +0.000017] <TASK> [ +0.000016] dump_stack_lvl+0x6c/0x90 [ +0.000025] print_report+0xc4/0x5e0 [ +0.000025] ? srso_return_thunk+0x5/0x5f [ +0.000024] ? kasan_complete_mode_report_info+0x60/0x1d0 [ +0.000030] ? _raw_spin_lock+0x7a/0xe0 [ +0.000023] kasan_report+0xdf/0x120 [ +0.000023] ? _raw_spin_lock+0x7a/0xe0 [ +0.000025] kasan_check_range+0xf7/0x1b0 [ +0.000025] __kasan_check_write+0x14/0x20 [ +0.000024] _raw_spin_lock+0x7a/0xe0 [ +0.000023] ? __pfx__raw_spin_lock+0x10/0x10 [ +0.000024] ? amdgpu_userq_wait_ioctl+0xac0/0x1f30 [amdgpu] [ +0.000442] amdgpu_userq_wait_ioctl+0x18fc/0x1f30 [amdgpu] [ +0.000428] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu] [ +0.000424] ? __pfx_idr_alloc_u32+0x10/0x10 [ +0.000027] ? srso_return_thunk+0x5/0x5f [ +0.000024] ? __kasan_check_write+0x14/0x20 [ +0.000025] ? srso_return_thunk+0x5/0x5f [ +0.000024] ? idr_alloc+0x72/0xc0 [ +0.000023] ? srso_return_thunk+0x5/0x5f [ +0.000023] ? fput+0x1c/0x2f0 [ +0.000025] drm_ioctl_kernel+0x178/0x2f0 [drm] [ +0.000065] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu] [ +0.000425] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm] [ +0.000064] ? srso_return_thunk+0x5/0x5f [ +0.000023] ? __kasan_check_write+0x14/0x20 [ +0.000025] drm_ioctl+0x513/0xd20 [drm] [ +0.000058] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu] [ +0.000428] ? __pfx_drm_ioctl+0x10/0x10 [drm] [ +0.000061] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ +0.000027] ? __count_memcg_events+0x11f/0x3a0 [ +0.000027] ? srso_return_thunk+0x5/0x5f [ +0.001040] ? srso_return_thunk+0x5/0x5f [ +0.000969] ? _raw_spin_unlock_irqrestore+0x27/0x50 [ +0.000966] amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu] [ +0.001352] __x64_sys_ioctl+0x135/0x1b0 [ +0.000966] x64_sys_call+0x1205/0x20d0 [ +0.000968] do_syscall_64+0x4d/0x120 [ +0.000960] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000962] RIP: 0033:0x7f42af11a94f warning 2: WARNING: at lib/xarray.c:1849 __xa_alloc+0x13a/0x150 [ 366.491409] RIP: 0010:__xa_alloc+0x13a/0x150 [ 366.491434] Call Trace: [ 366.491437] <TASK> [ 366.491440] ? show_regs+0x6d/0x80 [ 366.491445] ? __warn+0x91/0x140 [ 366.491450] ? __xa_alloc+0x13a/0x150 [ 366.491453] ? report_bug+0x1c9/0x1e0 [ 366.491459] ? handle_bug+0x63/0xa0 [ 366.491463] ? exc_invalid_op+0x1d/0x80 [ 366.491467] ? asm_exc_invalid_op+0x1f/0x30 [ 366.491476] ? __xa_alloc+0x13a/0x150 [ 366.491484] amdgpu_userq_wait_ioctl+0xe0e/0xfe0 [amdgpu] [ 366.491743] ? idr_alloc_u32+0x97/0xd0 [ 366.491749] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu] [ 366.491912] drm_ioctl_kernel+0xae/0x100 [drm] [ 366.491942] drm_ioctl+0x2a1/0x500 [drm] [ 366.491961] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu] [ 366.492127] ? srso_return_thunk+0x5/0x5f [ 366.492132] ? srso_return_thunk+0x5/0x5f [ 366.492135] ? _raw_spin_unlock_irqrestore+0x2b/0x50 [ 366.492139] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu] [ 366.492288] __x64_sys_ioctl+0x99/0xd0 [ 366.492295] x64_sys_call+0x1209/0x20d0 [ 366.492299] do_syscall_64+0x51/0x120 [ 366.492303] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 366.492418] RIP: 0033:0x7f86f3b1a94f Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Arunpravin Paneer Selvam	c9e20cb005	drm/amdgpu: Fix NULL ptr dereference issue for non userq fences Add the correct fences count variable [num_fences] in the fences array iteration to handle the userq / non-userq fences. v2:(Christian) - All fences in the array either come from some reservation object or drm_syncobj. If any of those are NULL then there is a bug somewhere else. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Arunpravin Paneer Selvam	9ed335d939	drm/amdgpu: Add mqd for userq compute queue Add mqd for userq compute queue for gfx11/gfx12 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Shashank Sharma	31f7efcdca	drm/amdgpu: enable eviction fence This patch enables attachment and detachment of eviction fences. This is just a fork of eviction fence enabling code from the first patch of the series so that the CI testing can happen on fully fledged code. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Shashank Sharma	a242a3e4b5	drm/amdgpu: simplify eviction fence suspend/resume The basic idea in this redesign is to add an eviction fence only in UQ resume path. When userqueue is not present, keep ev_fence as NULL Main changes are: - do not create the eviction fence during evf_mgr_init, keeping evf_mgr->ev_fence=NULL until UQ get active. - do not replace the ev_fence in evf_resume path, but replace it only in uq_resume path, so remove all the unnecessary code from ev_fence_resume. - add a new helper function (amdgpu_userqueue_ensure_ev_fence) which will do the following: - flush any pending uq_resume work, so that it could create an eviction_fence - if there is no pending uq_resume_work, add a uq_resume work and wait for it to execute so that we always have a valid ev_fence - call this helper function from two places, to ensure we have a valid ev_fence: - when a new uq is created - when a new uq completion fence is created v2: Worked on review comments by Christian. v3: Addressed few more review comments by Christian. v4: Move mutex lock outside of the amdgpu_userqueue_suspend() function (Christian). v5: squash in build fix (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Arunpravin Paneer Selvam	dd5a376cd2	drm/amdgpu: enable userqueue secure sem for GFX 12 - Add a field in struct amdgpu_mqd_prop for userqueue secure sem fence address since now we have a generic file for mes_userqueue.c - Add secure sem fence address mqd support to gfx12 into their corresponding init functions. - Enable secure semaphore IRQ handling V2: Address review comment from Alex: Use fence_address instead of fenceaddress (Shashank) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Somalapuram Amaranath	988c9e7046	drm/amdgpu: enable userqueue support for GFX12 This patch enables Usermode queue support across GFX, Compute and SDMA IPs on GFX12/SDMA7. It typically reuses Navi3X userqueue IP functions to create and destroy MQDs. v2: rebase on proposed changes (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Arvind Yadav <arvind.yadav@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Alex Deucher	79819d9a0a	drm/amdgpu/uq: make MES UQ setup generic Now that all of the IP specific code has been moved into the IP specific functions, we can make this code generic. V2: Fixed build errors and porting logics (Shashank) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Alex Deucher	b965c5d871	drm/amdgpu/uq: remove gfx11 specifics from UQ setup This can all be handled by in the IP specific mpd init code. V2: Removed setting of gds_va, which was removed during UAPI review (Shashank) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Alex Deucher	21926b5db8	drm/amdgpu/sdma7: update mqd init for UQ Set the addresses for the UQ metadata. V2: Fix lower offset mask (Shashank) V2: Use lower_32_bits for mqd objects(Alex) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:19 -04:00
Alex Deucher	d07a7fcb8d	drm/amdgpu/sdma6: update mqd init for UQ Set the addresses for the UQ metadata. V2: Fix lower address mask (Shashank) V3: Use lower_32_bits for MQD objects (Alex) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Alex Deucher	ab328d9a7b	drm/amdgpu/gfx12: update mqd init for UQ Set the addresses for the UQ metadata. V2: Fix lower address mask (Shashank) V3: Use lower_32_bits() for MQD objects (Alex) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Amaranath Somalapuram	f2234816a3	drm/amdgpu: fix IGT CI regression with eviction fence This patch fixes one of the regressions in eviction fence code with IGT tests. Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Amaranath Somalapuram <amaranath.somalapuram@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Alex Deucher	7179439e34	drm/amdgpu/gfx11: update mqd init for UQ Set the addresses for the UQ metadata. V2: Fix lower address (Shashank) V3: Restore lower_32_bits() for MQD addresses (Alex) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Alex Deucher	825f82cf93	drm/amdgpu: add some additional members to amdgpu_mqd_prop These are needed to make userqueue infrastructure generic. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Shashank Sharma	b8e6d3f68c	drm/amdgpu: handle eviction fence race The eviction process can get into a race condition between the eviction fence suspend work (which replaces the old fence with new) and kms_close (which destroys the fence and doesn't expect a new one). This patch: - adds a flag to indicate that fd is closing, so fence replacement is not required (evf_mgr->fd_closing) - adds a flush_work() during the ev_fence_destroy routine V2: Addressed review comments from Christian: - Do not use mutex to sync - Use flush_work and wait for suspend_work to be done V3: Fixed state machine for queue->active, which adds into race between suspend/resume and queue ops Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Shashank Sharma	44cfdf368f	drm/amdgpu: resume gfx userqueues This patch adds support for userqueue resume. What it typically does is this: - adds a new delayed work for resuming all the queues. - schedules this delayed work from the suspend work. - validates the BOs and replaces the eviction fence before resuming all the queues running under this instance of userq manager. V2: Addressed Christian's review comments: - declare local variables like ret at the bottom. - lock all the object first, then start attaching the new fence. - dont replace old eviction fence, just attach new eviction fence. - no error logs for drm_exec_lock failures - no need to reserve bos after drm_exec_locked - schedule the resume worker immediately (not after 100 ms) - check for NULL BO (Arvind) V5: Rebased wrt changes in suspend patch - moved amdgpu_userqueue_validate_vm_bo in this patch - initialized ret in resume_all V6: Rebase V7: Addressed review comments from Christian - Do not use list_for_each_safe() with vm->invalidated, its not correct way V8: Fixed the race condition between suspend/close/fence Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Acked-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Shashank Sharma	b0328087c1	drm/amdgpu: suspend gfx userqueues This patch adds suspend support for gfx userqueues. It typically does the following: - adds an enable_signaling function for the eviction fence, so that it can trigger the userqueue suspend, - adds a delayed work to handle suspending of the eviction_fence - adds a suspend function to handle suspending of userqueues which suspends all the queues under this userq manager and signals the eviction fence, - adds a function to replace the old eviction fence with a new one and attach it to each of the objects, - adds reference of userq manager in the eviction fence container so that it can be used in the suspend function. V2: Addressed Christian's review comments: - schedule suspend work immediately V4: Addressed Christian's review comments: - wait for pending uq fences before starting suspend, added queue->last_fence for the same - accommodate ev_fence_mgr into existing code - some bug fixes and NULL checks V5: Addressed Christian's review comments (gitlab) - Wait for eviction fence to get signaled in destroy, don't signal it - Wait for eviction fence to get signaled in replace fence, don't signal it V6: Addressed Christian's review comments - Do not destroy the old eviction fence until we have it replaced - Change the sequence of fence replacement sub-tasks - reusing the ev_fence delayed work for userqueue suspend as well (Shashank). V7: Addressed Christian's review comments - give evf_mgr as argument (instead of fpriv) to replace_fence() - save ptr to evf_mgr in ev_fence (instead of uq_mgr) - modify suspend_all_queues logic to reflect error properly - remove the garbage drm_exec_lock section in wait_for_signal - grab the userqueue mutex before starting the wait for fence - remove the unrelated gobj check from signal_ioctl V8: Added race condition fixes Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Acked-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Shashank Sharma	30e4d78138	drm/amdgpu: add userqueue suspend/resume functions This patch adds userqueue suspend/resume functions at core MES V11 IP level. V2: use true/false for queue_active status (Christian) added Christian's R-B V3: reset/set queue status in mqd.create and mqd.destroy Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Shashank Sharma	fb796c3087	drm/amdgpu: add gfx eviction fence helpers This patch adds basic eviction fence framework for the gfx buffers. The idea is to: - One eviction fence is created per gfx process, at kms_open. - This fence is attached to all the gem buffers created by this process. - This fence is detached to all the gem buffers at postclose_kms. This framework will be further used for usermode queues. V2: Addressed review comments from Christian - keep fence_ctx and fence_seq directly in fpriv - evcition_fence should be dynamically allocated - do not save eviction fence instance in BO, there could be many such fences attached to one BO - use dma_resv_replace_fence() in detach V3: Addressed review comments from Christian - eviction fence create and destroy functions should be called only once from fpriv create/destroy - use dma_fence_put() in eviction_fence_destroy V4: Addressed review comments from Christian: - create a separate ev_fence_mgr structure - cleanup fence init part - do not add a domain for fence owner KGD V5: Addressed review comments from Christian: - drop the dma_fence_is_signaled check - use a local variable to access evf_mgr->ev_fence under the spin_lock() multiple places - remove the vm->is_compute_ctx check to attach gfx eviction fence, in gem_object_open V6: Addressed review comments from Christian: - drop the return value from eviction_fence_signal - reserve_fence should be the first thing inside the attach_eviction_fence function, also keep the resv_add_fence inside the lock - remove the unwanted ev_fence check inside detach function - fix wrong variable check in eviction_fence_init function - return the error value of eviction_fence_init to the caller, dont keep it void. - fail gem_object_open if attaching of eviction_fence fails - detach the eviction fence only when amdgpu_vm_is_bo_always_valid is not true. V7: Addressed review comments from Christian: - Do not add a uq_mgr ptr in ev_fence, rather add evf_mgr V8: Move eviction fence enabling into separate patch for CI Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Sunil Khatri	a640126fbd	drm/amdgpu: add the argument description for gpu_addr Add argument description for the input argument gpu_addr for amdgpu_seq64_alloc. Fixes the warning raised by the compiler: drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c:168: warning: Function parameter or struct member 'gpu_addr' not described in 'amdgpu_seq64_alloc Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:18 -04:00
Shashank Sharma	90c448fef3	drm/amdgpu: add new AMDGPU_INFO subquery for userq objects This patch adds a new subquery (AMDGPU_INFO_UQ_FW_AREAS) in AMDGPU_INFO_IOCTL to get the size and alignment of shadow and csa objects from the FW setup. This information is required for the userqueue consumers. V2: Added Alex's suggestions and addressed review comments: - make this query IP specific (GFX/SDMA etc) - give a better title (AMDGPU_INFO_UQ_METADATA) - restructured the code as per sample code shared by Alex V3: Split the UAPI patch from shadow_size_fn modifications V4: Addressed review comments from UAPI review (Marek/Pierre-Eric) - Change the query name to AMDGPU_INFO_UQ_FW_AREAS - remove unused inpur parameter for AMDGPU_HW_IP* UAPI link: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/400/ Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Arvind Yadav <arvind.yadav@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Shashank Sharma	aed7caf2d4	drm/amdgpu: add get_gfx_shadow_info callback for gfx12 This callback gets the size and alignment requirements for the gfx shadow buffer for preemption. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam	2761bb9a31	drm/amdgpu: Modify userq signal/wait struct field names Modify kernel UAPI userq signal/wait struct field names and description corresponding to the libdrm UAPI review comments. libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Shashank Sharma	d9e697f19b	drm/amdgpu: bypass SRIOV check for shadow size info Currently, the shadow FW space size and alignment information is protected under a flag (adev->gfx.cp_gfx_shadow) which gets set only in case of SRIOV setups. if (amdgpu_sriov_vf(adev)) adev->gfx.cp_gfx_shadow = true; But we need this information for GFX Userqueues, so that user can create these objects while creating userqueue. This patch series creates a method to get this information bypassing the dependency on this check. This patch: - adds a new input parameter flag to the gfx.funcs->get_gfx_shadow_info fptr definition, so that it can accommodate the information without the check (adev->gfx.cp_gfx_shadow) on request. - updates the existing definition of amdgpu_gfx_get_gfx_shadow_info to adjust with this new flag. Next patch in the series is adding a UAPI which will consume this info. V2: split this patch from the new UAPI patch Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Arvind Yadav <arvind.yadav@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Shashank Sharma	2e06b175ff	drm/amdgpu: fix userqueue UAPI comments This patch fixes some of the pending UAPI review comments from the libDRM/UAPI review process. - It updates some outdated comments in the userqueue UAPI header highlighted during the libdrm UAPI review. - It removes the GDS BO support which was found unused. - It also removes the unused flags parameter from the UAPI. - It also adds a padding variables in userqueue in/out structures. (Pierre-Eric and Marek) - clarify comments on top of drm_amdgpu_userq_in - clarify comment for queue_id (in) - clarify comment for mqd - clarify comment for compute MQD size - clarify comment for queue_id (out) - remove GDB object from BO object list - remove the unused flags parameter Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Shashank Sharma	5f2f78314c	Revert "drm/amdgpu: don't allow userspace to create a doorbell BO" This reverts commit `6be2ad4f00`. This patch was to block userspace to use doorbell manager UAPI until usermode queue UAPI gets approved. UQ UAPI got approved in the following MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392 Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Arvind Yadav	38c67ec9aa	drm/amdgpu: Add input fence to sync bo map/unmap This patch adds input fences to VM_IOCTL for buffer object. The kernel will map/unmap the BO only when the fence is signaled. The UAPI for the same has been approved here: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392 V2: Bug fix (Arvind) V3: Bug fix (Arvind) V4: Rename UAPI objects as per UAPI review (Marek) V5: Addressed review comemnts from Christian - function should return error. - Add 'TODO' comment - The input fence should be independent of the operation. V6: Addressed review comemnts from Christian - Release the memory allocated by memdup_user(). V7: Addressed review comemnts from Christian - Drop the debug print and add "return r;" for the error handling. V11: Rebase v12: Fix 32-bit holes issue in sturct drm_amdgpu_gem_va. v13: Fix deadlock issue. v14: Fix merge conflict. v15: Fix review comment by renaming syncobj handles. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam	189ee986b0	drm/amdgpu: add userq specific kernel config for fence ioctls Keep the user queue fence signal and wait IOCTLs in the kernel config CONFIG_DRM_AMDGPU_NAVI3X_USERQ. v2(Christian): - Remove the userq specific config added for kernel queues fence init function. v3(Alex): - It will be better to return an error(-ENOTSUPP) in these cases. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam	f7cb6a28e1	drm/amdgpu: Add gpu_addr support to seq64 allocation Add gpu address support to seq64 alloc function. v1:(Christian) - Add the user of this new interface change to the same patch. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam	cb4a73f46f	drm/amdgpu: Add separate array of read and write for BO handles Drop AMDGPU_USERQ_BO_WRITE as this should not be a global option of the IOCTL, It should be option per buffer. Hence adding separate array for read and write BO handles. v2(Marek): - Internal kernel details shouldn't be here. This file should only document the observed behavior, not the implementation . v3: - Fix DAL CI clang issue. v4: - Added Alex RB to merge the kernel UAPI changes since he has already approved the amdgpu_drm.h changes. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Suggested-by: Marek Olšák <marek.olsak@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam	d8675102ba	drm/amdgpu: add vm root BO lock before accessing the vm Add a vm root BO lock before accessing the userqueue VM. v1:(Christian) - Keep the VM locked until you are done with the mapping. - Grab a temporary BO reference, drop the VM lock and acquire the BO. When you are done with everything just drop the BO lock and then the temporary BO reference. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam	fbea3d3174	drm/amdgpu: Add the missing error handling for xa_store() call Add the missing error handling for xa_store() call in the function amdgpu_userq_fence_driver_alloc(). Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:17 -04:00
Arunpravin Paneer Selvam	e7cf21fbb2	drm/amdgpu: Few optimization and fixes for userq fence driver Few optimization and fixes for userq fence driver. v1:(Christian): - Remove unnecessary comments. - In drm_exec_init call give num_bo_handles as last parameter it would making allocation of the array more efficient - Handle return value of __xa_store() and improve the error handling of amdgpu_userq_fence_driver_alloc(). v2:(Christian): - Revert userq_xa xarray init to XA_FLAGS_LOCK_IRQ. - move the xa_unlock before the error check of the call xa_err(__xa_store()) and moved this change to a separate patch as this is adding a missing error handling. - Removed the unnecessary comments. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam	ac4a1f7f13	drm/amdgpu: Remove the MES self test Remove MES self test as this conflicts the userqueue fence interrupts. v2:(Christian) - remove the amdgpu_mes_self_test() function and any now unused code. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:16 -04:00
Arvind Yadav	70773bef4e	drm/amdgpu: update userqueue BOs and PDs This patch updates the VM_IOCTL to allow userspace to synchronize the mapping/unmapping of a BO in the page table. The major changes are: - it adds a drm_timeline object as an input parameter to the VM IOCTL. - this object is used by the kernel to sync the update of the BO in the page table during the mapping of the object. - the kernel also synchronizes the tlb flush of the page table entry of this object during the unmapping (Added in this series: https://patchwork.freedesktop.org/series/131276/ and https://patchwork.freedesktop.org/patch/584182/) - the userspace can wait on this timeline, and then the BO is ready to be consumed by the GPU. The UAPI for the same has been approved here: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392 V2: - remove the eviction fence coupling V3: - added the drm timeline support instead of input/output fence (Christian) V4: - made timeline 64-bit (Christian) - bug fix (Arvind) V5: GLCTS bug fix (Arvind) V6: Rename syncobj_handle -> timeline_syncobj_out Rename point -> timeline_point_in (Marek) V7: Addressed review comments from Christian: - do not send last_update fence in case of vm_clear_freed, instead return the fence from gen_va_update_vm - move the functions to update bo_mapping to amdgpu_gem.c - do not use amdgpu_userq_update_vm anymore in userq_create() V8: Addressed review comments from Christian: - Split amdgpu_gem_update_bo_mapping function. - amdgpu_gem_va_update_vm should return stub for error. V9: Addressed review comments from Christian: - Rename the function amdgpu_gem_update_timeline_node. - amdgpu_gem_update_timeline_node should be void function. - when timeline_point is zero don't allocate a chain and call drm_syncobj_replace_fence() instead of drm_syncobj_add_point(). V11: rebase V12: Fix 32-bit holes issue in sturct drm_amdgpu_gem_va. V13: Fix the review comment by renaming timeline syncobj (Marek) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam	8949843762	drm/amdgpu: Enable userq fence interrupt support Add support to handle the userqueue protected fence signal hardware interrupt. Create a xarray which maps the doorbell index to the fence driver address. This would help to retrieve the fence driver information when an userq fence interrupt is triggered. Firmware sends the doorbell offset value and this info is compared with the queue's mqd doorbell offset value. If they are same, we process the userq fence interrupt. v1:(Christian): - use xa_load to extract the fence driver. - move the amdgpu_userq_fence_driver_process call within the xa_lock as there is a chance that fence_drv might be freed. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam	15e30a6e47	drm/amdgpu: Add wait IOCTL timeline syncobj support Add user fence wait IOCTL timeline syncobj support. v2:(Christian) - handle dma_fence_wait() return value. - shorten the variable name syncobj_timeline_points a bit. - move num_points up to avoid padding issues. v3:(Christian) - Handle timeline drm_syncobj_find_fence() call error handling - Use dma_fence_unwrap_for_each() in timeline fence as there could be more than one fence. v4:(Christian) - Drop the first num_fences since fence is always included in the dma_fence_unwrap_for_each() iteration, when fence != f then fence is most likely just a container. v5: Added Alex RB to merge the kernel UAPI changes since he has already approved the amdgpu_drm.h changes. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam	a292fdecd7	drm/amdgpu: Implement userqueue signal/wait IOCTL This patch introduces new IOCTL for userqueue secure semaphore. The signal IOCTL called from userspace application creates a drm syncobj and array of bo GEM handles and passed in as parameter to the driver to install the fence into it. The wait IOCTL gets an array of drm syncobjs, finds the fences attached to the drm syncobjs and obtain the array of memory_address/fence_value combintion which are returned to userspace. v2: (Christian) - Install fence into GEM BO object. - Lock all BO's using the dma resv subsystem - Reorder the sequence in signal IOCTL function. - Get write pointer from the shadow wptr - use userq_fence to fetch the va/value in wait IOCTL. v3: (Christian) - Use drm_exec helper for the proper BO drm reserve and avoid BO lock/unlock issues. - fence/fence driver reference count logic for signal/wait IOCTLs. v4: (Christian) - Fixed the drm_exec calling sequence - use dma_resv_for_each_fence_unlock if BO's are not locked - Modified the fence_info array storing logic. v5: (Christian) - Keep fence_drv until wait queue execution. - Add dma_fence_wait for other fences. - Lock BO's using drm_exec as the number of fences in them could change. - Install signaled fences as well into BO/Syncobj. - Move Syncobj fence installation code after the drm_exec_prepare_array. - Directly add dma_resv_usage_rw(args->bo_flags.... - remove unnecessary dma_fence_put. v6: (Christian) - Add xarray stuff to store the fence_drv - Implement a function to iterate over the xarray and drop the fence_drv references. - Add drm_exec_until_all_locked() wrapper - Add a check that if we haven't exceeded the user allocated num_fences before adding dma_fence to the fences array. v7: (Christian) - Use memdup_user() for kmalloc_array + copy_from_user - Move the fence_drv references from the xarray into the newly created fence and drop the fence_drv references when we signal this fence. - Move this locking of BOs before the "if (!wait_info->num_fences)", this way you need this code block only once. - Merge the error handling code and the cleanup + return 0 code. - Initializing the xa should probably be done in the userq code. - Remove the userq back pointer stored in fence_drv. - Pass xarray as parameter in amdgpu_userq_walk_and_drop_fence_drv() v8: (Christian) - Move fence_drv references must come before adding the fence to the list. - Use xa_lock_irqsave_nested for nested spinlock operations. - userq_mgr should be per fpriv and not one per device. - Restructure the interrupt process code for the early exit of the loop. - The reference acquired in the syncobj fence replace code needs to be kept around. - Modify the dma_fence acquire placement in wait IOCTL. - Move USERQ_BO_WRITE flag to UAPI header file. - drop the fence drv reference after telling the hw to stop accessing it. - Add multi sync object support to userq signal IOCTL. V9: (Christian) - Store all the fence_drv ref to other drivers and not ourself. - Remove the userq fence xa implementation and replace with kvmalloc_array. v10: (Christian) - Add a comment for the userq_xa xarray - drop the if check of userq_fence->fence_drv_array - use the i variable to initialize userq_fence->fence_drv_array_count - drop the fence reference before you free the array in the error handling, otherwise it could be that some references leaked. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam	2e65ea1ab2	drm/amdgpu: screen freeze and userq driver crash Screen freeze and userq fence driver crash while playing Xonotic v2: (Christian) - There is change that fence might signal in between testing and grabbing the lock. Hence we can move the lock above the if..else check and use the dma_fence_is_signaled_locked(). Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam	8493312a94	drm/amdgpu: Add mqd support for the fence address - Add a field in struct v11_gfx_mqd for userqueue fence address. - Assign fence gpu VA address to the userqueue mqd fence address fields. v2: Remove the mask and replace with lower_32_bits (Christian) Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:16 -04:00
Arunpravin Paneer Selvam	97ff194625	drm/amdgpu: Implement a new userqueue fence driver Developed a userqueue fence driver for the userqueue process shared BO synchronization. Create a dma fence having write pointer as the seqno and allocate a seq64 memory for each user queue process and feed this memory address into the firmware/hardware, thus the firmware writes the read pointer into the given address when the process completes it execution. Compare wptr and rptr, if rptr >= wptr, signal the fences for the waiting process to consume the buffers. v2: Worked on review comments from Christian for the following modifications - Add wptr as sequence number into the fence - Add a reference count for the fence driver - Add dma_fence_put below the list_del as it might frees the userq fence. - Trim unnecessary code in interrupt handler. - Check dma fence signaled state in dma fence creation function for a potential problem of hardware completing the job processing beforehand. - Add necessary locks. - Create a list and process all the unsignaled fences. - clean up fences in destroy function. - implement .signaled callback function v3: Worked on review comments from Christian - Modify naming convention for reference counted objects - Fix fence driver reference drop issue - Drop amdgpu_userq_fence_driver_process() function return value v4: Worked on review comments from Christian - Moved fence driver allocation into amdgpu_userq_fence_driver_alloc() - Added detail doc mentioning the differences b/w two spinlocks declared. v5: Worked on review comments from Christian - Check before upcast and remove local variable - Add error handling in fence_drv alloc function. - Move rptr read fn outside of the loop and remove WARN_ON in destroy function. v6: - clear the seq64 memory in user fence driver(Christian) - fix for the wptr va bo mapping(Christian) - move the fence_drv xa entry erase code from the interrupt handler into user fence destroy function Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:16 -04:00
Shashank Sharma	f540f69256	drm/amdgpu: add kernel config for gfx-userqueue This patch: - adds a kernel config option "CONFIG_DRM_AMDGPU_NAVI3X_USERQ" - moves the usequeue initialization code for all IPs under this flag - cover the core userqueue functions under this config - adds stub function for userqueue ioctl. so that the userqueue works only when the config is enabled. V9: Introduce this patch V10: Call it CONFIG_DRM_AMDGPU_NAVI3X_USERQ instead of CONFIG_DRM_AMDGPU_USERQ_GFX (Christian) V11: Add GFX in the config help description message. V12: Add depends on BROKEN for this config, remove this when the rest of the code is available. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:16 -04:00
Arvind Yadav	9d3afcb7b9	drm/amdgpu: fix MES GFX mask Current MES GFX mask prevents FW to enable oversubscription. This patch does the following: - Fixes the mask values and adds a description for the same - Removes the central mask setup and makes it IP specific, as it would be different when the number of pipes and queues are different. v2: squash in fix from Shashank Cc: Christian König <Christian.Koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	2c695d7c07	drm/amdgpu: enable compute/gfx usermode queue This patch does the necessary changes required to enable compute workload support using the existing usermode queues infrastructure. V9: Patch introduced V10: Add custom IP specific mqd strcuture for compute (Alex) V11: Rename drm_amdgpu_userq_mqd_compute_gfx_v11 to drm_amdgpu_userq_mqd_compute_gfx11 (Marek) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Arvind Yadav	543b614537	drm/amdgpu: enable SDMA usermode queues This patch does necessary modifications to enable the SDMA usermode queues using the existing userqueue infrastructure. V9: introduced this patch in the series V10: use header file instead of extern (Alex) V11: rename drm_amdgpu_userq_mqd_sdma_gfx_v11 to drm_amdgpu_userq_mqd_sdma_gfx11 (Marek) Cc: Christian König <Christian.Koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	a1d201e169	drm/amdgpu: enable GFX-V11 userqueue support This patch enables GFX-v11 IP support in the usermode queue base code. It typically: - adds a GFX_v11 specific MQD structure - sets IP functions to create and destroy MQDs - sets MQD objects coming from userspace V10: introduced this spearate patch for GFX V11 enabling (Alex). V11: Addressed review comments: - update the comments in GFX mqd structure informing user about using the INFO IOCTL for object sizes (Alex) - rename struct drm_amdgpu_userq_mqd_gfx_v11 to drm_amdgpu_userq_mqd_gfx11 (Marek) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	d84607e3f7	drm/amdgpu: cleanup leftover queues This patch adds code to cleanup any leftover userqueues which a user might have missed to destroy due to a crash or any other programming error. V7: Added Alex's R-B V8: Rebase V9: Rebase V10: Rebase V11: Rebase Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	f09c1e6077	drm/amdgpu: generate doorbell index for userqueue The userspace sends us the doorbell object and the relative doobell index in the object to be used for the usermode queue, but the FW expects the absolute doorbell index on the PCI BAR in the MQD. This patch adds a function to convert this relative doorbell index to absolute doorbell index. V5: Fix the db object reference leak (Christian) V6: Pin the doorbell bo in userqueue_create() function, and unpin it in userqueue destoy (Christian) V7: Added missing kfree for queue in error cases Added Alex's R-B V8: Rebase V9: Changed the function names from gfx_v11* to mes_v11* V10: Rebase V11: Rebase Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	5fb2f7fc21	drm/amdgpu: map wptr BO into GART To support oversubscription, MES FW expects WPTR BOs to be mapped into GART, before they are submitted to usermode queues. This patch adds a function for the same. V4: fix the wptr value before mapping lookup (Bas, Christian). V5: Addressed review comments from Christian: - Either pin object or allocate from GART, but not both. - All the handling must be done with the VM locks held. V7: Addressed review comments from Christian: - Do not take vm->eviction_lock - Use amdgpu_bo_gpu_offset to get the wptr_bo GPU offset V8: Rebase V9: Changed the function names from gfx_v11* to mes_v11* V10: Remove unused adev (Harish) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	6c42559f70	drm/amdgpu: map usermode queue into MES This patch adds new functions to map/unmap a usermode queue into the FW, using the MES ring. As soon as this mapping is done, the queue would be considered ready to accept the workload. V1: Addressed review comments from Alex on the RFC patch series - Map/Unmap should be IP specific. V2: Addressed review comments from Christian: - Fix the wptr_mc_addr calculation (moved into another patch) Addressed review comments from Alex: - Do not add fptrs for map/unmap V3: Integration with doorbell manager V4: Rebase V5: Use gfx_v11_0 for function names (Alex) V6: Removed queue->proc/gang/fw_ctx_address variables and doing the address calculations locally to keep the queue structure GEN independent (Alex) V7: Added R-B from Alex V8: Rebase V9: Rebase V10: Rebase Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	defb41e8ef	drm/amdgpu: create context space for usermode queue The MES FW expects us to allocate at least one page as context space to process gang and process related context data. This patch creates a joint object for the same, and calculates GPU space offsets of these spaces. V1: Addressed review comments on RFC patch: Alex: Make this function IP specific V2: Addressed review comments from Christian - Allocate only one object for total FW space, and calculate offsets for each of these objects. V3: Integration with doorbell manager V4: Review comments: - Remove shadow from FW space list from cover letter (Alex) - Alignment of macro (Luben) V5: Merged patches 5 and 6 into this single patch Addressed review comments: - Use lower_32_bits instead of mask (Christian) - gfx_v11_0 instead of gfx_v11 in function names (Alex) - Shadow and GDS objects are now coming from userspace (Christian, Alex) V6: - Add a comment to replace amdgpu_bo_create_kernel() with amdgpu_bo_create() during fw_ctx object creation (Christian). - Move proc_ctx_gpu_addr, gang_ctx_gpu_addr and fw_ctx_gpu_addr out of generic queue structure and make it gen11 specific (Alex). V7: - Using helper function to create/destroy userqueue objects. - Removed FW object space allocation. V8: - Updating FW object address from user values. V9: - uppdated function name from gfx_v11_* to mes_v11_* V10: - making this patch independent of IP based changes, moving any GFX object related changes in GFX specific patch (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Acked-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	fbf136b932	drm/amdgpu: create MES-V11 usermode queue for GFX A Memory queue descriptor (MQD) of a userqueue defines it in the hw's context. As MQD format can vary between different graphics IPs, we need gfx GEN specific handlers to create MQDs. This patch: - Adds a new file which will be used for MES based userqueue functions targeting GFX and SDMA IP. - Introduces MQD handler functions for the usermode queues. V1: Worked on review comments from Alex: - Make MQD functions GEN and IP specific V2: Worked on review comments from Alex: - Reuse the existing adev->mqd[ip] for MQD creation - Formatting and arrangement of code V3: - Integration with doorbell manager V4: Review comments addressed: - Do not create a new file for userq, reuse gfx_v11_0.c (Alex) - Align name of structure members (Luben) - Don't break up the Cc tag list and the Sob tag list in commit message (Luben) V5: - No need to reserve the bo for MQD (Christian). - Some more changes to support IP specific MQD creation. V6: - Add a comment reminding us to replace the amdgpu_bo_create_kernel() calls while creating MQD object to amdgpu_bo_create() once eviction fences are ready (Christian). V7: - Re-arrange userqueue functions in adev instead of uq_mgr (Alex) - Use memdup_user instead of copy_from_user (Christian) V9: - Moved userqueue code from gfx_v11_0.c to new file mes_v11_0.c so that it can be reused for SDMA userqueues as well (Shashank, Alex) V10: Addressed review comments from Alex - Making this patch independent of IP engine(GFX/SDMA/Compute) and specific to MES V11 only, using the generic MQD structure. - Splitting a spearate patch to enabling GFX support from here. - Verify mqd va address to be non-NULL. - Add a separate header file. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	0385800c2f	drm/amdgpu: add helpers to create userqueue object This patch introduces amdgpu_userqueue_object and its helper functions to creates and destroy this object. The helper functions creates/destroys a base amdgpu_bo, kmap/unmap it and save the respective GPU and CPU addresses in the encapsulating userqueue object. These helpers will be used to create/destroy userqueue MQD, WPTR and FW areas. V7: - Forked out this new patch from V11-gfx-userqueue patch to prevent that patch from growing very big. - Using amdgpu_bo_create instead of amdgpu_bo_create_kernel in prep for eviction fences (Christian) V9: - Rebase V10: - Added Alex's R-B Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	5501117d24	drm/amdgpu: add new IOCTL for usermode queue This patch adds: - A new IOCTL function to create and destroy - A new structure to keep all the user queue data in one place. - A function to generate unique index for the queue. V1: Worked on review comments from RFC patch series: - Alex: Keep a list of queues, instead of single queue per process. - Christian: Use the queue manager instead of global ptrs, Don't keep the queue structure in amdgpu_ctx V2: Worked on review comments: - Christian: - Formatting of text - There is no need for queuing of userqueues, with idr in place - Alex: - Remove use_doorbell, its unnecessary - Reuse amdgpu_mqd_props for saving mqd fields - Code formatting and re-arrangement V3: - Integration with doorbell manager V4: - Accommodate MQD union related changes in UAPI (Alex) - Do not set the queue size twice (Bas) V5: - Remove wrapper functions for queue indexing (Christian) - Do not save the queue id/idr in queue itself (Christian) - Move the idr allocation in the IP independent generic space (Christian) V6: - Check the validity of input IP type (Christian) V7: - Move uq_func from uq_mgr to adev (Alex) - Add missing free(queue) for error cases (Yifan) V9: - Rebase V10: Addressed review comments from Christian, and added R-B: - Do not initialize the local variable - Convert DRM_ERROR to DEBUG. V11: - check the input flags to be zero (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:15 -04:00
Shashank Sharma	bf33cb6551	drm/amdgpu: add usermode queue base code This patch adds IP independent skeleton code for amdgpu usermode queue. It contains: - A new files with init functions of usermode queues. - A queue context manager in driver private data. V1: Worked on design review comments from RFC patch series: (https://patchwork.freedesktop.org/series/112214/) - Alex: Keep a list of queues, instead of single queue per process. - Christian: Use the queue manager instead of global ptrs, Don't keep the queue structure in amdgpu_ctx V2: - Reformatted code, split the big patch into two V3: - Integration with doorbell manager V4: - Align the structure member names to the largest member's column (Luben) - Added SPDX license (Luben) V5: - Do not add amdgpu.h in amdgpu_userqueue.h (Christian). - Move struct amdgpu_userq_mgr into amdgpu_userqueue.h (Christian). V6: Rebase V9: Rebase V10: Rebase + Alex's R-B Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:14 -04:00
Alexandre Demers	9cfb230210	drm/amdgpu: still cleanup sid.h The defines, shifts and masks are already available in dce_6_0_d.h, dce_6_0_sh_mask.h. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:14 -04:00
Alexandre Demers	b255b64883	drm/amdgpu: fill in gmc_v6_0_set_clockgating_state() Pretty much was already there, just not ported to amdgpu. Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:14 -04:00
Alexandre Demers	a149f0bd0b	drm/amd/display/dc: reclassify DCE6 resources and hw sequencer Classify DCE6 resource and sequencer as they are for other DCE versions Put dce60_resource.c and .h under amd/display/dc/resource/dce60 Put and rename dce60_hw_sequencer.c and .h under amd/display/dc/hwss/dce60 v2: fix build when CONFIG_DRM_AMD_DC_SI=n (Alex) Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:14 -04:00
Lijo Lazar	6ffc6e056f	drm/amdgpu: Reset RAS table if header is invalid If a valid header is not found during RAS eeprom init, consider it as new and reset RAS table info. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:14 -04:00
Tao Zhou	b695dd3bb8	drm/amdgpu: add loop bits for NPS2 page retirement Support NPS2 RAS. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:14 -04:00
Kenneth Feng	bb00bf1732	drm/amd/amdgpu: decouple ASPM with pcie dpm ASPM doesn't need to be disabled if pcie dpm is disabled. So ASPM can be independantly enabled. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:14 -04:00
Ruili Ji	940e772635	amd/amdgpu: Init vcn hardware per instance for vcn 4.0.3 Add interface for hardware init by vcn instance. v2: fix code format Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:14 -04:00
Victor Skvortsov	3394069e7d	drm/amdgpu: Disable ACA on VFs VFs query RAS error counts directly from host with AMDGPU_RAS_VIRT_ERROR_COUNT_QUERY. When ACA is enabled, an unusable aca_sysfs is created rather than amdgpu_ras_sysfs_create() Likewise, VFs depend on host support to query CPERs, rather than ACA component. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:14 -04:00
Alexandre Demers	9101b84f8c	drm/amdgpu: use "irq" in place of "interrupt" in DCE6/8 as in DCE10/11 "interrupt" becomes "irq" in: dce_vX_0_set_hpd_interrupt_state() dce_vX_0_set_crtc_interrupt_state() dce_vX_0_set_pageflip_interrupt_state() It is easier when going through the code to just change the DCE number in the functions' name to find and compare them across DCE versions. Also, it standardizes function mapping inside a given structure where .set and .process are both set to functions with a "_irq" suffix. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:13 -04:00
Alexandre Demers	160e6f5108	drm/amdgpu: fix typos in DCEs In DCE6, DCE8, DCE10, DCE11, "hdp" is replaced by "hpd" and replace "type" by "hpd" for a uniform parameter naming usage across DCEs. In link_factory.c, there is a missing "p" to "types" Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:13 -04:00
Alex Deucher	9e7b08d239	drm/amdgpu/mes12: optimize MES pipe FW version fetching Don't fetch it again if we already have it. It seems the registers don't reliably have the value at resume in some cases. Fixes: `785f0f9fe7` ("drm/amdgpu: Add mes v12_0 ip block support (v4)") Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:13 -04:00
Denis Arefev	da7dc714a8	drm/amd/pm/smu11: Prevent division by zero The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `1e866f1fe5` ("drm/amd/pm: Prevent divide by zero") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:48:13 -04:00
Alex Deucher	906ad45167	drm/amdgpu: cancel gfx idle work in device suspend for s0ix This is normally handled in the gfx IP suspend callbacks, but for S0ix, those are skipped because we don't want to touch gfx. So handle it in device suspend. Fixes: `b9467983b7` ("drm/amdgpu: add dynamic workload profile switching for gfx10") Fixes: `963537ca23` ("drm/amdgpu: add dynamic workload profile switching for gfx11") Fixes: `5f95a15495` ("drm/amdgpu: add dynamic workload profile switching for gfx12") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:05:35 -04:00
Kenneth Feng	b23f81c442	drm/amd/display: pause the workload setting in dm Pause the workload setting in dm when doing idle optimization Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:05:26 -04:00
Alex Deucher	92e511d1ce	drm/amdgpu/pm/swsmu: implement pause workload profile Add the callback for implementation for swsmu. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:05:23 -04:00
Alex Deucher	6dafb5d4c7	drm/amdgpu/pm: add workload profile pause helper To be used for display idle optimizations when we want to pause non-default profiles. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:05:17 -04:00
Alex Deucher	0e2ebfe276	drm/amdgpu/gfx12: dump full CP packet header FIFOs In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:05:14 -04:00
Alex Deucher	eb15a5d1ae	drm/amdgpu/gfx11: dump full CP packet header FIFOs In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:05:11 -04:00
Alex Deucher	867cf768cb	drm/amdgpu/gfx10: dump full CP packet header FIFOs In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-08 16:05:07 -04:00
Alex Deucher	fd4948494d	drm/amdgpu/gfx9.4.3: dump full CP packet header FIFOs In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-07 18:01:08 -04:00
Alex Deucher	a267d1686c	drm/amdgpu/gfx9: dump full CP packet header FIFOs In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-07 18:01:08 -04:00
Ruili Ji	9f7ce6a9ab	drm/amd/pm: implement dpm vcn reset function Implement VCN engine reset by sending MSG_ResetVCN on smu 13.0.6. v2: fix format for code and message Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-07 18:01:08 -04:00

... 3 4 5 6 7 ...

33509 Commits