linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00

Author	SHA1	Message	Date
Luben Tuikov	56e449603f	drm/sched: Convert the GPU scheduler to variable number of run-queues The GPU scheduler has now a variable number of run-queues, which are set up at drm_sched_init() time. This way, each driver announces how many run-queues it requires (supports) per each GPU scheduler it creates. Note, that run-queues correspond to scheduler "priorities", thus if the number of run-queues is set to 1 at drm_sched_init(), then that scheduler supports a single run-queue, i.e. single "priority". If a driver further sets a single entity per run-queue, then this creates a 1-to-1 correspondence between a scheduler and a scheduled entity. Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Emma Anholt <emma@anholt.net> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com	2023-10-26 12:03:47 -04:00
Mario Limonciello	64ffd2f1d0	drm/amd: Disable ASPM for VI w/ all Intel systems Originally we were quirking ASPM disabled specifically for VI when used with Alder Lake, but it appears to have problems with Rocket Lake as well. Like we've done in the case of dpm for newer platforms, disable ASPM for all Intel systems. Cc: stable@vger.kernel.org # 5.15+ Fixes: `0064b0ce85` ("drm/amd/pm: enable ASPM by default") Reported-and-tested-by: Paolo Gentili <paolo.gentili@canonical.com> Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-25 09:53:17 -04:00
Dave Airlie	0ecf4aa32b	amd-drm-next-6.7-2023-10-20: amdgpu: - SMU 13 updates - UMSCH updates - DC MPO fixes - RAS updates - MES 11 fixes - Fix possible memory leaks in error pathes - GC 11.5 fixes - Kernel doc updates - PSP updates - APU IMU fixes - Misc code cleanups - SMU 11 fixes - OD fix - Frame size warning fixes - SR-IOV fixes - NBIO 7.11 updates - NBIO 7.7 updates - XGMI fixes - devcoredump updates amdkfd: - Misc code cleanups - SVM fixes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZTLX4QAKCRC93/aFa7yZ 2ORJAP9lnoyvUPIP63Hx5TADQtZHiA+ShkATGQmDia94ABtCxwEAlo88TipxAo7c tRX8Mn+rix3M739FDFxV0bp7hCXsbgQ= =fY9I -----END PGP SIGNATURE----- Merge tag 'amd-drm-next-6.7-2023-10-20' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.7-2023-10-20: amdgpu: - SMU 13 updates - UMSCH updates - DC MPO fixes - RAS updates - MES 11 fixes - Fix possible memory leaks in error pathes - GC 11.5 fixes - Kernel doc updates - PSP updates - APU IMU fixes - Misc code cleanups - SMU 11 fixes - OD fix - Frame size warning fixes - SR-IOV fixes - NBIO 7.11 updates - NBIO 7.7 updates - XGMI fixes - devcoredump updates amdkfd: - Misc code cleanups - SVM fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231020195043.4937-1-alexander.deucher@amd.com	2023-10-25 10:54:22 +10:00
Christian König	4984fc578a	drm/amdkfd: reserve a fence slot while locking the BO Looks like the KFD still needs this. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `8abc1eb298` ("drm/amdkfd: switch over to using drm_exec v3") Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231020123306.43978-1-christian.koenig@amd.com	2023-10-23 14:48:47 +02:00
Dave Airlie	7cd62eab9b	Linux 6.6-rc7 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmU1ngkeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGrsIH/0k/+gdBBYFFdEym foRhKir9WV3ZX4oIozJjA1f7T+qVYclKs6kaYm3gNepRBb6AoG8pdgv4MMAqhYsf QMe2XHi0MrO/qKBgfNfivxEa9jq+0QK5uvTbqCRqCAB8LfwVyDqapCmg3EuiZcPW UbMITmnwLIfXgPxvp9rabmCsTqO6FLbf0GDOVIkNSAIDBXMpcO1iffjrWUbhRa7n oIoiJmWJLcXLxPWDsRKbpJwzw2cIG08YhfQYAiQnC3YaeRm1FKLDIICRBsmfYzja rWv9r4dn4TDfV4/AnjggQnsZvz2yPCxNaFSQIT88nIeiLvyuUTJ9j8aidsSfMZQf xZAbzbA= =NoQv -----END PGP SIGNATURE----- BackMerge tag 'v6.6-rc7' into drm-next This is needed to add the msm pr which is based on a higher base. Signed-off-by: Dave Airlie <airlied@redhat.com>	2023-10-23 18:20:06 +10:00
Luben Tuikov	d3df66fd98	drm/amdgpu: Remove redundant call to priority_is_valid() Remove a redundant call to amdgpu_ctx_priority_is_valid() from amdgpu_ctx_priority_permit(), which is called from amdgpu_ctx_init() which is called from amdgpu_ctx_alloc() which is called from amdgpu_ctx_ioctl(), where we've called amdgpu_ctx_priority_is_valid() already first thing in the function. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20231018010359.30393-1-luben.tuikov@amd.com	2023-10-21 20:27:15 -04:00
André Almeida	de009982c6	drm/amdgpu: Create version number for coredumps Even if there's nothing currently parsing amdgpu's coredump files, if we eventually have such tools they will be glad to find a version field to properly read the file. Create a version number to be displayed on top of coredump file, to be incremented when the file format or content get changed. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:29 -04:00
André Almeida	69619868d3	drm/amdgpu: Move coredump code to amdgpu_reset file Giving that we use codedump just for device resets, move it's functions and structs to a more semantic file, the amdgpu_reset.{c, h}. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:29 -04:00
André Almeida	2d6a2a28cd	drm/amdgpu: Encapsulate all device reset info To better organize struct amdgpu_device, keep all reset information related fields together in a separated struct. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Shiwu Zhang	723fac64d0	drm/amdgpu: support the port num info based on the capability flag XGMI TA will set the capability flag to indicate whether the port_num info is supported or not. KGD checks the flag and accordingly picks up the right buffer format and send the right command to TA to retrieve the info. v2: simplify the code by reusing the same statement (lijo) Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Shiwu Zhang	e8a5ded36b	drm/amdgpu: prepare the output buffer for GET_PEER_LINKS command Per the xgmi ta implementation, KGD needs to fill in node_ids in concern into the shared command output buffer rather than the command input buffer. Input buffer is not used for GET_PEER_LINKS command execution. In this way, xgmi ta can reuse the node info in the output buffer just filled in and populate the same buffer with link info directly. Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Tao Zhou	d9443ac4f9	drm/amdgpu: drop status query/reset for GCEA 9.4.3 and MMEA 1.8 PMFW will be responsible for them. v2: remove query interfaces. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Shiwu Zhang	626121fce4	drm/amdgpu: update the xgmi ta interface header Update the header file to the v20.00.00.13 v1: rename TA_COMMAND_XGMI__GET_GET_TOPOLOGY_INFO to TA_COMMAND_XGMI__GET_TOPOLOGY_INFO And also rename struct ta_xgmi_cmd_get_peer_link_info_output to ta_xgmi_cmd_get_peer_link_info accordingly v2: add structs to support xgmi GET_EXTEND_PEER_LINK command Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Tao Zhou	8096df7664	drm/amdgpu: add set/get mca debug mode operations Record the debug mode status in RAS. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Tao Zhou	21226f02d7	drm/amdgpu: replace reset_error_count with amdgpu_ras_reset_error_count Simplify the code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Li Ma	9d7a965e22	drm/amdgpu: add clockgating support for NBIO v7.7.1 add clockgating support for NBIO ip 7.7.1 Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Li Ma	fa9dd7a285	drm/amdgpu: fix missing stuff in NBIO v7.11 add get_clockgating_state, update_medium_grain_light_sleep and update_medium_grain_clock_gating in nbio_v7_11_funcs v1: add missing funcs in nbio_v7_11.c v2: modify the if condition and add spport for nbio v7.11 clockgating. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Stanley.Yang	66d64e4e03	drm/amdgpu: Enable RAS feature by default for APU Enable RAS feature by default for aqua vanjaram on apu platform. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Yang Wang	49c260bef3	drm/amdgpu: fix typo for amdgpu ras error data print typo fix. Fixes: `5b1270beb3` ("drm/amdgpu: add ras_err_info to identify RAS error source") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Bokun Zhang	017634a68d	drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P4 - In VCN 4 SRIOV code path, add code to enable RB decouple feature Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Bokun Zhang	eb9d6256b9	drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P3 - Update VCN header for RB decouple feature - Add metadata struct, metadata will be placed after each RB Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Bokun Zhang	fc3136730b	drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P2 - Add function to check if RB decouple is enabled under SRIOV Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Bokun Zhang	97b2821643	drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P1 - Update SRIOV header with RB decouple flag Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Stanley.Yang	8a65661114	drm/amdgpu: Fix delete nodes that have been relesed Fix delete nodes that it has been freed. Fixes: `5b1270beb3` ("drm/amdgpu: add ras_err_info to identify RAS error source") Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Hawking Zhang	f2176d7063	drm/amdgpu: Add UVD_VCPU_INT_EN2 to dpg sram Add RAS sepcifc programming to dpg sram. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Hawking Zhang	9248462d7e	drm/amdgpu: Enable software RAS in vcn v4_0_3 Set VCN/JPEG RAS masks to enable software RAS for VCN and JPEG. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:26 -04:00
Tao Zhou	472c5fb297	drm/amdgpu: define ras_reset_error_count function Make the code architecture more simple. v2: reuse ras_reset_error_count in ras_reset_error_status. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:26 -04:00
Candice Li	afcf949cf3	drm/amdgpu: Log UE corrected by replay as correctable error Support replay mode where UE could be converted to CE. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:26 -04:00
Dave Airlie	d43c76c820	Short summary of fixes pull: amdgpu: - Disable AMD_CTX_PRIORITY_UNSET bridge: - ti-sn65dsi86: Fix device lifetime edid: - Add quirk for BenQ GW2765 ivpu: - Extend address range for MMU mmap nouveau: - DP-connector fixes - Documentation fixes panel: - Move AUX B116XW03 into panel-simple scheduler: - Eliminate DRM_SCHED_PRIORITY_UNSET ttm: - Fix possible NULL-ptr deref in cleanup -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmUxFsoACgkQaA3BHVML eiNkkwf/YORvSPLN2LEqh4P8to7X92moFR7XwkA1GVofKOC3C04wDBq+Ezdy5pnw 7T/uvEnmrnDP5Iucerly8bE5IdJGkafIB8+oBIcGhGvPgtYNGSV8NWoeA/lhBmsS av56+zBPpBF3s5fZa4rB/WciGRuCsLhbcYUN/LGGVBXdhLgb8LRb/seTv/Ah9Sra LQtFmrDNNE0+FIWuKkUsY/CQYysbMzuHYFiLumtY59lE1R5kyvlzE8OrdcqlDXhC HCFt0KuA+4onzdHKnge/as5T3jBJucGV9mTcgal3158n94U/ODrXEJqVD/KFKle6 052vhpApOL3+RyDKGjXQLm/GumEc9w== =LJB1 -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2023-10-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: amdgpu: - Disable AMD_CTX_PRIORITY_UNSET bridge: - ti-sn65dsi86: Fix device lifetime edid: - Add quirk for BenQ GW2765 ivpu: - Extend address range for MMU mmap nouveau: - DP-connector fixes - Documentation fixes panel: - Move AUX B116XW03 into panel-simple scheduler: - Eliminate DRM_SCHED_PRIORITY_UNSET ttm: - Fix possible NULL-ptr deref in cleanup Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20231019114605.GA22540@linux-uq9g	2023-10-20 14:07:58 +10:00
Felix Kuehling	316baf09d3	drm/amdgpu: Reserve fences for VM update In amdgpu_dma_buf_move_notify reserve fences for the page table updates in amdgpu_vm_clear_freed and amdgpu_vm_handle_moved. This fixes a BUG_ON in dma_resv_add_fence when using SDMA for page table updates. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:56:57 -04:00
Felix Kuehling	51b79f3381	drm/amdgpu: Fix possible null pointer dereference abo->tbo.resource may be NULL in amdgpu_vm_bo_update. Fixes: `1802537820` ("drm/ttm: stop allocating dummy resources during BO creation") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:56:50 -04:00
Felix Kuehling	207430b76a	drm/amdgpu: Reserve fences for VM update In amdgpu_dma_buf_move_notify reserve fences for the page table updates in amdgpu_vm_clear_freed and amdgpu_vm_handle_moved. This fixes a BUG_ON in dma_resv_add_fence when using SDMA for page table updates. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:52 -04:00
Felix Kuehling	e6f8588733	drm/amdgpu: Fix possible null pointer dereference abo->tbo.resource may be NULL in amdgpu_vm_bo_update. Fixes: `1802537820` ("drm/ttm: stop allocating dummy resources during BO creation") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:52 -04:00
Stanley.Yang	b1338a8e71	drm/amdgpu: Workaround to skip kiq ring test during ras gpu recovery This is workaround, kiq ring test failed in suspend stage when do ras recovery. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:52 -04:00
Mario Limonciello	e56690bb37	drm/amd: Read IMU FW version from scratch register during hw_init If the IMU version wasn't discovered from the header, such as when the firmware was directly loaded by PSP then there is no firmware version to show to userspace from sysfs or IOCTL. The IMU F/W stores the version in the first scratch register though, so fetch it in these cases to let the driver export. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Mario Limonciello	4916615fe9	drm/amd: Don't parse IMU ucode version if it won't be loaded When the IMU ucode is loaded by the PSP parsing the version that comes from Linux will vary. Rather than showing the wrong data to kernel interface consumers, avoid populating it in this case. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Mario Limonciello	d757dfd667	drm/amd: Move microcode init step to early_init() The intention for early init is to find any missing microcode early and fail the driver load if it's missing. Move this step to earlier in driver init to match other IP blocks. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Asad Kamal	d8c1925ba8	drm/amdgpu: update retry times for psp BL wait Increase retry time for PSP BL wait, to compensate for longer time to set c2pmsg 35 ready bit during mode1 with RAS Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Alex Deucher	28ab9a02b6	drm/amdgpu/mes11: remove aggregated doorbell code It's not enabled in hardware so the code is dead. Remove it. Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Asad Kamal	53dd920c1f	drm/amdgpu : Add hive ras recovery check If one of the devices in the hive detects a fatal error, need to send ras recovery reset message to PMFW of all devices in the hive. For that add a flag in hive to indicate that it's undergoing ras recovery Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Mangesh Gadre	2d955a06a5	Revert "drm/amdgpu: Program xcp_ctl registers as needed" This reverts commit `0bdebfef3f`. XCP_CTL register is programmed by firmware and register access is protected. Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:50 -04:00
Lang Yu	ab29ac57ad	drm/amdgpu/umsch: add suspend and resume callback Add missing IP callbacks. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:50 -04:00
Christian König	6b18ef481f	drm/amdgpu: ignore duplicate BOs again Looks like RADV is actually hitting this. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `ca6c1e210a` ("drm/amdgpu: use the new drm_exec object for CS v3") Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231017121015.1336786-1-christian.koenig@amd.com	2023-10-19 13:19:44 +02:00
Dave Airlie	27442758e9	amd-drm-next-6.7-2023-10-13: amdgpu: - DC replay fixes - Misc code cleanups and spelling fixes - Documentation updates - RAS EEPROM Updates - FRU EEPROM Updates - IP discovery updates - SR-IOV fixes - RAS updates - DC PQ fixes - SMU 13.0.6 updates - GC 11.5 Support - NBIO 7.11 Support - GMC 11 Updates - Reset fixes - SMU 11.5 Updates - SMU 13.0 OD support - Use flexible arrays for bo list handling - W=1 Fixes - SubVP fixes - DPIA fixes - DCN 3.5 Support - Devcoredump fixes - VPE 6.1 support - VCN 4.0 Updates - S/G display fixes - DML fixes - DML2 Support - MST fixes - VRR fixes - Enable seamless boot in more cases - Enable content type property for HDMI - OLED fixes - Rework and clean up GPUVM TLB flushing - DC ODM fixes - DP 2.x fixes - AGP aperture fixes - SDMA firmware loading cleanups - Cyan Skillfish GPU clock counter fix - GC 11 GART fix - Cache GPU fault info for userspace queries - DC cursor check fixes - eDP fixes - DC FP handling fixes - Variable sized array fixes - SMU 13.0.x fixes - IB start and size alignment fixes for VCN - SMU 14 Support - Suspend and resume sequence rework - vkms fix amdkfd: - GC 11 fixes - GC 10 fixes - Doorbell fixes - CWSR fixes - SVM fixes - Clean up GC info enumeration - Rework memory limit handling - Coherent memory handling fixes - Use partial migrations in GPU faults - TLB flush fixes - DMA unmap fixes - GC 9.4.3 fixes - SQ interrupt fix - GTT mapping fix - GC 11.5 Support radeon: - Misc code cleanups - W=1 Fixes - Fix possible buffer overflow - Fix possible NULL pointer dereference UAPI: - Add EXT_COHERENT memory allocation flags. These allow for system scope atomics. Proposed userspace: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88 - Add support for new VPE engine. This is a memory to memory copy engine with advanced scaling, CSC, and color management features Proposed mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25713 - Add INFO IOCTL interface to query GPU faults Proposed Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 Proposed libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZSmDAQAKCRC93/aFa7yZ 2EdeAQC2lkQ9IHLOon5kIZUK+r9IPYlgFsii+qfmMPLBaMcuwgEA8F4eJln/cc9V 02EKhlapkggYXYa+uhOE2KTnWgMFJgI= =SEXq -----END PGP SIGNATURE----- Merge tag 'amd-drm-next-6.7-2023-10-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.7-2023-10-13: amdgpu: - DC replay fixes - Misc code cleanups and spelling fixes - Documentation updates - RAS EEPROM Updates - FRU EEPROM Updates - IP discovery updates - SR-IOV fixes - RAS updates - DC PQ fixes - SMU 13.0.6 updates - GC 11.5 Support - NBIO 7.11 Support - GMC 11 Updates - Reset fixes - SMU 11.5 Updates - SMU 13.0 OD support - Use flexible arrays for bo list handling - W=1 Fixes - SubVP fixes - DPIA fixes - DCN 3.5 Support - Devcoredump fixes - VPE 6.1 support - VCN 4.0 Updates - S/G display fixes - DML fixes - DML2 Support - MST fixes - VRR fixes - Enable seamless boot in more cases - Enable content type property for HDMI - OLED fixes - Rework and clean up GPUVM TLB flushing - DC ODM fixes - DP 2.x fixes - AGP aperture fixes - SDMA firmware loading cleanups - Cyan Skillfish GPU clock counter fix - GC 11 GART fix - Cache GPU fault info for userspace queries - DC cursor check fixes - eDP fixes - DC FP handling fixes - Variable sized array fixes - SMU 13.0.x fixes - IB start and size alignment fixes for VCN - SMU 14 Support - Suspend and resume sequence rework - vkms fix amdkfd: - GC 11 fixes - GC 10 fixes - Doorbell fixes - CWSR fixes - SVM fixes - Clean up GC info enumeration - Rework memory limit handling - Coherent memory handling fixes - Use partial migrations in GPU faults - TLB flush fixes - DMA unmap fixes - GC 9.4.3 fixes - SQ interrupt fix - GTT mapping fix - GC 11.5 Support radeon: - Misc code cleanups - W=1 Fixes - Fix possible buffer overflow - Fix possible NULL pointer dereference UAPI: - Add EXT_COHERENT memory allocation flags. These allow for system scope atomics. Proposed userspace: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88 - Add support for new VPE engine. This is a memory to memory copy engine with advanced scaling, CSC, and color management features Proposed mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25713 - Add INFO IOCTL interface to query GPU faults Proposed Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 Proposed libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231013175758.1735031-1-alexander.deucher@amd.com	2023-10-18 16:08:07 +10:00
Luben Tuikov	fa8391ad68	gpu/drm: Eliminate DRM_SCHED_PRIORITY_UNSET Eliminate DRM_SCHED_PRIORITY_UNSET, value of -2, whose only user was amdgpu. Furthermore, eliminate an index bug, in that when amdgpu boots, it calls drm_sched_entity_init() with DRM_SCHED_PRIORITY_UNSET, which uses it to index sched->sched_rq[]. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Link: https://lore.kernel.org/r/20231017035656.8211-2-luben.tuikov@amd.com	2023-10-17 20:35:38 -04:00
Luben Tuikov	eab0261967	drm/amdgpu: Unset context priority is now invalid A context priority value of AMD_CTX_PRIORITY_UNSET is now invalid--instead of carrying it around and passing it to the Direct Rendering Manager--and it becomes AMD_CTX_PRIORITY_NORMAL in amdgpu_ctx_ioctl(), the gateway to context creation. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Link: https://lore.kernel.org/r/20231017035656.8211-1-luben.tuikov@amd.com	2023-10-17 20:35:38 -04:00
Ma Ke	cd90511557	drm/amdgpu/vkms: fix a possible null pointer dereference In amdgpu_vkms_conn_get_modes(), the return value of drm_cvt_mode() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_cvt_mode(). Add a check to avoid null pointer dereference. Signed-off-by: Ma Ke <make_ruc2021@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:36:25 -04:00
Yang Wang	3bba4bc6a0	drm/amdgpu: add RAS error info support for umc_v12_0 add RAS error info support for umc_v12_0. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:36:11 -04:00
Yang Wang	8736d17a7f	drm/amdgpu: add RAS error info support for mmhub_v1_8 add RAS error info support for mmhub_v1_8. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:36:03 -04:00
Yang Wang	156c2814c2	drm/amdgpu: add RAS error info support for gfx_v9_4_3 add RAS error info support for gfx_v9_4_3. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:35:55 -04:00
Yang Wang	dd401cd29a	drm/amdgpu: add RAS error info support for sdma_v4_4_2. add RAS error info support for sdma_v4_4_2. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:35:45 -04:00
Yang Wang	5b1270beb3	drm/amdgpu: add ras_err_info to identify RAS error source introduced "ras_err_info" to better identify a RAS ERROR source. NOTE: For legacy chips, keep the original RAS error print format. v1: RAS errors may come from different dies during a RAS error query, therefore, need a new data structure to identify the source of RAS ERROR. v2: - use new data structure 'amdgpu_smuio_mcm_config_info' instead of ras_err_id (in v1 patch) - refine ras error dump function name - refine ras error dump log format Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:35:35 -04:00
Yifan Zhang	6a1c31c7a8	drm/amdgpu: flush the correct vmid tlb for specific pasid flush the correct vmid tlb for specific pasid on gmc 11. Fixes: `041a574388` ("drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid") Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:34:29 -04:00
Yang Wang	1a00cfab37	drm/amdgpu: make err_data structure built-in for ras_manager (No effect outside the ras_mgr data structure) Since a new member was added to the ras_err_data data structure, it becomes unreasonable for the ras_mgr instance to contain this data, because ras mgr only uses the 2 member information of ue_count/ce_count in err_data. This patch changes the code err_data into built-in structure members, making the code directly compatible. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:34:13 -04:00
Jesse Zhang	e341631f4a	drm/amdgpu: disable GFXOFF and PG during compute for GFX9 Temporary workaround to fix issues observed in some compute applications when GFXOFF is enabled on GFX9. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:56 -04:00
Lang Yu	ef2354c70f	drm/amdgpu/umsch: fix missing stuff during rebase These are missed during rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:49 -04:00
Lang Yu	fb5b73acf7	drm/amdgpu/umsch: correct IP version format FW uses IP_VERSION_MAJ_MIN_REV format. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:42 -04:00
Lang Yu	1c1f14a472	drm/amdgpu: don't use legacy invalidation on MMHUB v3.3 Legacy invalidation is not supported. This is missed during rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:29 -04:00
Lang Yu	4661482b9c	drm/amdgpu: correct NBIO v7.11 programing Use v7.7 before, switch to v7.11 now. Fix incorrect programing. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:21 -04:00
Xiaogang Chen	ffa88b0019	drm/amdgpu: Correctly use bo_va->ref_count in compute VMs This is needed to correctly handle BOs imported into compute VM from gfx. Both kfd and gfx should use same bo_va and set bo_va->ref_count correctly when map the Bos into same VM, otherwise we may trigger kernel general protection when iterate mappings over bo_va's valids or invalids list. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Tested-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:08 -04:00
Lijo Lazar	f20f3b0d6c	drm/amd/pm: Add P2S tables for SMU v13.0.6 Add P2S table load support on SMU v13.0.6 ASICs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:01 -04:00
Lijo Lazar	79daf69246	drm/amdgpu: Add support to load P2S tables Add support to load P2S tables through PSP. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:55 -04:00
Lijo Lazar	cd21cb1fcb	drm/amdgpu: Update PSP interface header Adds FW id for P2S table. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:47 -04:00
Lijo Lazar	a8558fce7a	drm/amdgpu: Avoid FRU EEPROM access on APU FRU EEPROM access is not valid for APU devices. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:41 -04:00
Lin.Cao	f74f19c440	drm/amdgpu: save VCN instances init info before jpeg init JPEG init header will overwirte vcn init header info which will loss some debug information Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:34 -04:00
Alex Hung	98a80bb3dd	Revert "drm/amd/display: Hande writeback request from userspace" This reverts commit `cd1a4bc228`. [WHY & HOW] The writeback series cause a regression in thunderbolt display. Signed-off-by: Alex Hung <alex.hung@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:24:47 -04:00
Alex Hung	731a20cb89	Revert "drm/amd/display: Add writeback enable field (wb_enabled)" This reverts commit `f6893fcb10`. [WHY & HOW] The writeback series cause a regression in thunderbolt display. Signed-off-by: Alex Hung <alex.hung@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:08:57 -04:00
Asad Kamal	625e5f3851	drm/amdgpu: Expose ras version & schema info Expose ras table version & schema info to sysfs v2: Updated schema to get poison support info from ras context, removed asic specific checks Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:02:54 -04:00
Lijo Lazar	d4a02673b3	drm/amdgpu: Read PSPv13 OS version from register PSP OS updates the version information in register. On APUs with PSPv13, PSP OS will already be loaded with SBIOS. Hence use the version register instead of using information in driver binary header. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:02:43 -04:00
Lang Yu	faeddb6eab	drm/amdgpu/umsch: enable doorbell for umsch Program vcn_doorbell_range with vcn_ring0_1. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:01:37 -04:00
Mario Limonciello	db99889065	drm/amd: Split up UVD suspend into prepare and suspend steps amdgpu_uvd_suspend() allocates memory and copies objects into that allocated memory. This fails under memory pressure. Instead move majority of this code into a prepare step when swap can still be allocated. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:01:04 -04:00
Mario Limonciello	cb11ca3233	drm/amd: Add concept of running prepare_suspend() sequence for IP blocks If any IP blocks allocate memory during their hw_fini() sequence this can cause the suspend to fail under memory pressure. Introduce a new phase that IP blocks can use to allocate memory before suspend starts so that it can potentially be evicted into swap instead. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:58 -04:00
Mario Limonciello	5095d54181	drm/amd: Evict resources during PM ops prepare() callback Linux PM core has a prepare() callback run before suspend. If the system is under high memory pressure, the resources may need to be evicted into swap instead. If the storage backing for swap is offlined during the suspend() step then such a call may fail. So move this step into prepare() to move evict majority of resources and update all non-pmops callers to call the same callback. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:18 -04:00
Li Ma	31715a8620	drm/amdgpu: enable GFX IP v11.5.0 CG and PG support Add CG support for GFX/MC/HDP/ATHUB/IH/BIF. Add PG support for GFX. Signed-off-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:15 -04:00
Li Ma	ad3e54ab9e	drm/amdgpu/discovery: add SMU 14 support add smu 14 into the IP discovery list. Signed-off-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:00 -04:00
Lang Yu	4acf679f86	drm/amdgpu/umsch: power on/off UMSCH by DLDO VCN 4.0.5 uses DLDO. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:59:32 -04:00
Lang Yu	617b472431	drm/amdgpu/umsch: fix psp frontdoor loading These changes are missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:59:24 -04:00
Lijo Lazar	558fcb7d11	drm/amdgpu: Increase IP discovery region size IP discovery region has increased to > 8K on some SOCs.Maximum reserve size is upto 12K, but not used. For now increase to 10K. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:59:16 -04:00
Lin.Cao	b053117e86	drm/amdgpu: Return -EINVAL when MMSCH init status incorrect Return -EINVAL when MMSCH init fail which can be handle by function amdgpu_device_reset_sriov correctly. Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:48 -04:00
Lang Yu	9a37f65c4e	drm/amdgpu/vpe: fix insert_nop ops Avoid infinite loop when count is 0. This is missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:33 -04:00
Srinivasan Shanmugam	54967d5683	drm/amdgpu: Address member 'gart_placement' not described in 'amdgpu_gmc_gart_location' Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c:274: warning: Function parameter or member 'gart_placement' not described in 'amdgpu_gmc_gart_location' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:22 -04:00
Lang Yu	84aa39ab1e	drm/amdgpu/vpe: align with mcbp changes MCBP is decided by adev->gfx.mcbp now. This is missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:13 -04:00
Lang Yu	99ea82f424	drm/amdgpu/vpe: remove IB end boundary requirement Remove IB end boundary requirement, VPE has no such limitions, use existing amdgpu_ring_generic_pad_ib() instead. This is missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:01 -04:00
Jay Cornwall	757920585d	drm/amdgpu: Improve MES responsiveness during oversubscription When MES is oversubscribed it may not frequently check for new command submissions from driver if the scheduling load is high. Response latency as high as 5 seconds has been observed. Enable a flag which adds a check for new commands between scheduling quantums. Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Cc: Alexandru Tudor <alexandru.tudor@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:57:46 -04:00
Thomas Zimmermann	57390019b6	Merge drm/drm-next into drm-misc-next Updating drm-misc-next to the state of Linux v6.6-rc2. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2023-10-11 09:50:59 +02:00
Icenowy Zheng	3806a8c647	drm/amdgpu: fix SI failure due to doorbells allocation SI hardware does not have doorbells at all, however currently the code will try to do the allocation and thus fail, makes SI AMDGPU not usable. Fix this failure by skipping doorbells allocation when doorbells count is zero. Fixes: `54c30d2a8d` ("drm/amdgpu: create kernel doorbell pages") Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:59:29 -04:00
Christian König	ff89f064dc	drm/amdgpu: add missing NULL check bo->tbo.resource can easily be NULL here. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2902 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org	2023-10-09 17:59:29 -04:00
Icenowy Zheng	219223eca4	drm/amdgpu: fix SI failure due to doorbells allocation SI hardware does not have doorbells at all, however currently the code will try to do the allocation and thus fail, makes SI AMDGPU not usable. Fix this failure by skipping doorbells allocation when doorbells count is zero. Fixes: `54c30d2a8d` ("drm/amdgpu: create kernel doorbell pages") Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:02:52 -04:00
Aaron Liu	ce862c4995	drm/amdgpu/discovery: enable DCN 3.5.0 support Enable DCN 3.5.0 support. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:02:46 -04:00
Arvind Yadav	367a0af433	drm/amdkfd: get doorbell's absolute offset based on the db_size Here, Adding db_size in byte to find the doorbell's absolute offset for both 32-bit and 64-bit doorbell sizes. So that doorbell offset will be aligned based on the doorbell size. v2: - Addressed the review comment from Felix. v3: - Adding doorbell_size as parameter to get db absolute offset. v4: Squash the two patches into one. Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:02:34 -04:00
Christian König	31220ee9dc	drm/amdgpu: add missing NULL check bo->tbo.resource can easily be NULL here. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2902 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org	2023-10-09 17:01:32 -04:00
Yifan Zhang	061863e5db	drm/amdgpu: add hub->ctx_distance in setup_vmid_config add hub->ctx_distance when read CONTEXT1_CNTL, align w/ write back operation. v2: fix coding style errors reported by checkpatch.pl (Christian) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:59:06 -04:00
Stanley.Yang	80285ae1ec	drm/amdgpu: Fix potential null pointer derefernce The amdgpu_ras_get_context may return NULL if device not support ras feature, so add check before using. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:46 -04:00
Lijo Lazar	b3e73b5a8f	Documentation/amdgpu: Add FRU attribute details Add documentation for the newly added manufacturer and fru_id attributes in sysfs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:25 -04:00
Lijo Lazar	ac6b1f275f	drm/amdgpu: Add more FRU field information Add support to read Manufacturer Name and FRU File Id fields. Also add sysfs device attributes for external usage. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:17 -04:00
Lijo Lazar	8a2b51392a	drm/amdgpu: Refactor FRU product information Keep FRU related information together in a separate structure. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:08 -04:00
Yang Wang	be2e8aca06	drm/amdgpu: enable FRU device for SMU v13.0.6 v1: enable GFX v9.4.3 FRU device to query board information. v2: use MP1 version to identify different asic Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:51:58 -04:00
Boyuan Zhang	6cb8e3ee3a	drm/amdgpu: update ib start and size alignment Update IB starting address alignment and size alignment with correct values for decode and encode IPs. Decode IB starting address alignment: 256 bytes Decode IB size alignment: 64 bytes Encode IB starting address alignment: 256 bytes Encode IB size alignment: 4 bytes Also bump amdgpu driver version for this update. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:51:39 -04:00
Kees Cook	c8e7df374b	drm/amdgpu: Annotate struct amdgpu_bo_list with __counted_by Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct amdgpu_bo_list. Additionally, since the element count member must be set before accessing the annotated flexible array member, move its initialization earlier. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-hardening@vger.kernel.org Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Srinivasan Shanmugam	e0a3e7bf62	drm/amdgpu: Drop unnecessary return statements There is no reason to call return at the end of function that returns void. Fixes the below: WARNING: void function return statements are not generally useful Thus remove such a statement in the affected functions. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	4798db85b7	Documentation/amdgpu: Add board info details Add documentation for board info sysfs attribute. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	76da73f026	drm/amdgpu: Add sysfs attribute to get board info Add a sysfs attribute which shows the board form factor like OAM or CEM. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	b0a4553336	drm/amdgpu: Get package types for smuio v13.0 Add support to query package types supported in smuio v13.0 ASICs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	4365d2ed09	drm/amdgpu: Add more smuio v13.0.3 package types Expand support to get other board types like OAM or CEM. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Sathishkumar S	cbad0dd13a	drm/amdgpu: fix ip count query for xcp partitions fix wrong ip count INFO on spatial partitions. update the query to return the instance count corresponding to the partition id. v2: initialize variables only when required to be (Christian) move variable declarations to the beginning of function (Christian) Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	28a3f49609	drm/amdgpu: Move package type enum to amdgpu_smuio Move definition of package type to amdgpu_smuio header and add new package types for CEM and OAM. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Srinivasan Shanmugam	2b6b29f33f	drm/amdgpu: Fix complex macros error Fixes the below: ERROR: Macros with complex values should be enclosed in parentheses WARNING: macros should not use a trailing semicolon +#define amdgpu_inc_vram_lost(adev) atomic_inc(&((adev)->vram_lost_counter)); Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Alex Deucher	7d3f1d76f3	drm/amdgpu: refine fault cache updates Don't update the fault cache if status is 0. In the multiple fault case, subsequent faults will return a 0 status which is useless for userspace and replaces the useful fault status, so only update if status is non-0. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:49:51 -04:00
Alex Deucher	7a41ed8b59	drm/amdgpu: add new INFO ioctl query for the last GPU page fault Add a interface to query the last GPU page fault for the process. Useful for debugging context lost errors. v2: split vmhub representation between kernel and userspace v3: add locking when fetching fault info in INFO IOCTL Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 libdrm MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 Cc: samuel.pitoiset@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:49:39 -04:00
Kees Cook	ac8e62ab25	drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct ip_hw_instance. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230922173216.3823169-2-keescook@chromium.org	2023-10-05 11:29:09 +02:00
Mario Limonciello	134b8c5d86	drm/amd: Fix detection of _PR3 on the PCIe root port On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO. This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3. Cc: stable@vger.kernel.org Suggested-by: Jun Ma <Jun.Ma2@amd.com> Tested-by: David Perry <David.Perry@amd.com> Fixes: `b10c1c5b3a` ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 22:52:05 -04:00
Luben Tuikov	5d061675b7	drm/amdgpu: Fix a memory leak Fix a memory leak in amdgpu_fru_get_product_info(). Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Fixes: `0dbf2c5626` ("drm/amdgpu: Interpret IPMI data for product information (v2)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 22:43:26 -04:00
Philip Yang	fdac890966	drm/amdgpu: ratelimited override pte flags messages Use ratelimited version of dev_dbg to avoid flooding dmesg log. No functional change. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:40:11 -04:00
Mario Limonciello	b8e6aec146	drm/amd: Drop all hand-built MIN and MAX macros in the amdgpu base driver Several files declare MIN() or MAX() macros that ignore the types of the values being compared. Drop these macros and switch to min() min_t(), and max() from `linux/minmax.h`. Suggested-by: Hamza Mahfooz <Hamza.Mahfooz@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:39:52 -04:00
Alex Deucher	8dbf1ba867	drm/amdgpu: cache gpuvm fault information for gmc7+ Cache the current fault info in the vm struct. This can be queried by userspace later to help debug UMDs. Cc: samuel.pitoiset@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:37:07 -04:00
Alex Deucher	2e8ef6a561	drm/amdgpu: add cached GPU fault structure to vm struct When we get a GPU page fault, cache the fault for later analysis. Cc: samuel.pitoiset@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:57 -04:00
Rajneesh Bhardwaj	f4bff6e0b9	drm/amdgpu: Use ttm_pages_limit to override vram reporting On GFXIP9.4.3 APU, allow the memory reporting as per the ttm pages limit in NPS1 mode. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:42 -04:00
Rajneesh Bhardwaj	9b37d45d79	drm/amdgpu: Rework KFD memory max limits To allow bigger allocations specially on systems such as GFXIP 9.4.3 that use GTT memory for VRAM allocations, relax the limits to maximize ROCm allocations. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:17 -04:00
Alex Deucher	67318cb843	drm/amdgpu/gmc11: set gart placement GC11 Needed to avoid a hardware issue. v2: force high for all GC11 parts for consistency (Alex) v3: rebase Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:12 -04:00
Alex Deucher	917f91d8d8	drm/amdgpu/gmc: add a way to force a particular placement for GART We normally place GART based on the location of VRAM and the available address space around that, but provide an option to force a particular location for hardware that needs it. v2: Switch to passing the placement via parameter Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:07 -04:00
Lang Yu	a19d934986	drm/amdgpu: correct gpu clock counter query on cyan skilfish Cayn skilfish uses SMUIO v11.0.8 offset. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: <stable@vger.kernel.org> # v5.15+	2023-10-04 18:35:28 -04:00
Mario Limonciello	c4c8955b8a	drm/amd: Fix detection of _PR3 on the PCIe root port On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO. This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3. Cc: stable@vger.kernel.org Suggested-by: Jun Ma <Jun.Ma2@amd.com> Tested-by: David Perry <David.Perry@amd.com> Fixes: `b10c1c5b3a` ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:43:11 -04:00
Alex Hung	f6893fcb10	drm/amd/display: Add writeback enable field (wb_enabled) [WHAT] Add a new field to keep track whether a crtc is previously writeback-enabled. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:42:38 -04:00
Alex Hung	cd1a4bc228	drm/amd/display: Hande writeback request from userspace [WHAT] Handle writeback requests and fill in the required information for DWB programming and setup. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:42:06 -04:00
Alex Deucher	0021d70a06	drm/amdkfd: drop struct kfd_cu_info I think this was an abstraction back from when kfd supported both radeon and amdgpu. Since we just support amdgpu now, there is no more need for this and we can use the amdgpu structures directly. This also avoids having the kfd_cu_info structures on the stack when inlining which can blow up the stack. Cc: Arnd Bergmann <arnd@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:41:13 -04:00
Dave Airlie	79fb229b88	Merge tag 'drm-misc-next-2023-09-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.7-rc1: UAPI Changes: - drm_file owner is now updated during use, in the case of a drm fd opened by the display server for a client, the correct owner is displayed. - Qaic gains support for the QAIC_DETACH_SLICE_BO ioctl to allow bo recycling. Cross-subsystem Changes: - Disable boot logo for au1200fb, mmpfb and unexport logo helpers. Only fbcon should manage display of logo. - Update freescale in MAINTAINERS. - Add some bridge files to bridge in MAINTAINERS. - Update gma500 driver repo in MAINTAINERS to point to drm-misc. Core Changes: - Move size computations to drm buddy allocator. - Make drm_atomic_helper_shutdown(NULL) a nop. - Assorted small fixes in drm_debugfs, DP-MST payload addition error handling. - Fix DRM_BRIDGE_ATTACH_NO_CONNECTOR handling. - Handle bad (h/v)sync_end in EDID by clipping to htotal. - Build GPUVM as a module. Driver Changes: - Simple drivers don't need to cache prepared result. - Call drm_atomic_helper_shutdown() in shutdown/unbind for a whole lot more drm drivers. - Assorted small fixes in amdgpu, ssd130x, bridge/it6621, accel/qaic, nouveau, tc358768. - Add NV12 for komeda writeback. - Add arbitration lost event to synopsis/dw-hdmi-cec. - Speed up s/r in nouveau by not restoring some big bo's. - Assorted nouveau display rework in preparation for GSP-RM, especially related to how the modeset sequence works and the DP sequence in relation to link training. - Update anx7816 panel. - Support NVSYNC and NHSYNC in tegra. - Allow multiple power domains in simple driver. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/f1fae5eb-25b8-192a-9a53-215e1184ce81@linux.intel.com	2023-09-29 08:27:15 +10:00
Sui Jingfeng	18bf400530	drm/amdgpu: Use pci_get_base_class() to reduce duplicated code Use pci_get_base_class() to reduce duplicated code. No functional change intended. Link: https://lore.kernel.org/r/20230825062714.6325-5-sui.jingfeng@linux.dev Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 16:54:54 -05:00
Tao Zhou	fc59889071	drm/amdgpu: update retry times for psp vmbx wait Increase the retry loops and replace the constant number with macro. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:44:36 -04:00
Tao Zhou	1934907234	drm/amdgpu: exit directly if gpu reset fails No need to perform the full reset operation in case of gpu reset failure. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:44:36 -04:00
Mario Limonciello	93499bd6cd	drm/amd: Move microcode init from sw_init to early_init for CIK SDMA As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:38:17 -04:00
Mario Limonciello	751e293f2c	drm/amd: Move microcode init from sw_init to early_init for SDMA v2.4 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:38:11 -04:00
Mario Limonciello	cc76630483	drm/amd: Move microcode init from sw_init to early_init for SDMA v3.0 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:38:05 -04:00
Mario Limonciello	e0d4fbb58c	drm/amd: Move microcode init from sw_init to early_init for SDMA v5.2 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:37:50 -04:00
Mario Limonciello	95b456d3b0	drm/amd: Move microcode init from sw_init to early_init for SDMA v6.0 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:37:43 -04:00
Mario Limonciello	ed1c1053cd	drm/amd: Move microcode init from sw_init to early_init for SDMA v5.0 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:37:34 -04:00
Mario Limonciello	161d076c2d	drm/amd: Drop error message about failing to load SDMA firmware The error path for SDMA firmware loading is unnecessarily noisy. When a firmware is missing 3 errors show up: ``` amdgpu 0000:07:00.0: Direct firmware load for amdgpu/green_sardine_sdma.bin failed with error -2 [drm:sdma_v4_0_early_init [amdgpu]] ERROR Failed to load sdma firmware! [drm:amdgpu_device_init [amdgpu]] ERROR early_init of IP block <sdma_v4_0> failed -19 ``` The error code for the device init is bubbled up already, remove the second one. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:36:41 -04:00
Le Ma	3152d01e88	drm/amd/pm: deprecate allow_xgmi_power_down interface Replace with set_plpd_mode uniformly for places to use. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:36:23 -04:00
Mario Limonciello	3657a1d5ac	drm/amd: Limit seamless boot by default to APUs A hang is reported on DCN 3.2 with seamless boot enabled. As the benefits come from an eDP setup, limit it to only enabled by default with APUs. Suggested-by: Alexander.Deucher@amd.com Reported-by: feifei.xu@amd.com Closes: https://lore.kernel.org/amd-gfx/85b427f6-11ec-4249-bf6f-eadf9c375f88@amd.com/T/#m2887e919d7c01b2a4860d2261b366d22e070f309 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:35:31 -04:00
Alex Deucher	b2e1cbe628	drm/amdgpu/gmc11: disable AGP on GC 11.5 AGP aperture is deprecated and no longer functional. v2: fix typo (Alex) v3: just skip the agp setup call v4: revert back to the original model v5: back to v3 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:23 -04:00
David (Ming Qiang) Wu	fa1f1cc09d	drm/amdgpu: not to save bo in the case of RAS err_event_athub err_event_athub will corrupt VCPU buffer and not good to be restored in amdgpu_vcn_resume() and in this case the VCPU buffer needs to be cleared for VCN firmware to work properly. Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:23 -04:00
Luben Tuikov	9ed630c5c4	drm/amdgpu: Fix a memory leak Fix a memory leak in amdgpu_fru_get_product_info(). Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Fixes: `0dbf2c5626` ("drm/amdgpu: Interpret IPMI data for product information (v2)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:23 -04:00
Alex Deucher	de59b69932	drm/amdgpu/gmc: set a default disable value for AGP To disable AGP, the start needs to be set to a higher value than the end. Set a default disable value for the AGP aperture and allow the IP specific GMC code to enable it selectively be calling amdgpu_gmc_agp_location(). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:22 -04:00
Alex Deucher	29495d8145	drm/amdgpu/gmc6-8: properly disable the AGP aperture The BOT register needs to be larger than the TOP register for this to be properly disabled. The lower 22 bits of the BOT address are always 0 and the lower 22 bits of the TOP register are always 1 so you need to make the upper bits of BOT larger than the upper bits of BOT. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:22 -04:00
Mangesh Gadre	cd956e7531	drm/amdgpu:Expose physical id of device in XGMI hive This identifies the physical ordering of devices in the hive v2: fix compilation issue Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:22 -04:00
Lang Yu	7021b397c6	drm/amdgpu/vpe: fix truncation warnings Fix truncation warnings. Fixes: `9d4346bdbc` ("drm/amdgpu: add VPE 6.1.0 support") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202309200028.aUVuM8os-lkp@intel.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:21 -04:00
Philip Yang	101b810430	drm/amdkfd: Move dma unmapping after TLB flush Otherwise GPU may access the stale mapping and generate IOMMU IO_PAGE_FAULT. Move this to inside p->mutex to prevent multiple threads mapping and unmapping concurrently race condition. After kfd_mem_dmaunmap_attachment is removed from unmap_bo_from_gpuvm, kfd_mem_dmaunmap_attachment is called if failed to map to GPUs, and before free the mem attachment in case failed to unmap from GPUs. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:10 -04:00
Christian König	08abccc9a7	drm/amdgpu: further move TLB hw workarounds a layer up For the PASID flushing we already handled that at a higher layer, apply those workarounds to the standard flush as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	e2e3788850	drm/amdgpu: rework lock handling for flush_tlb v2 Instead of each implementation doing this more or less correctly move taking the reset lock at a higher level. v2: fix typo Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	3983c9fd2d	drm/amdgpu: drop error return from flush_gpu_tlb_pasid That function never fails, drop the error return. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	041a574388	drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	72cc99205c	drm/amdgpu: cleanup gmc_v10_0_flush_gpu_tlb_pasid The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	e7b90e99fa	drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb_pasid Testing for reset is pointless since the reset can start right after the test. The same PASID can be used by more than one VMID, invalidate each of them. Move the KIQ and all the workaround handling into common GMC code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	0c525aa406	drm/amdgpu: fix and cleanup gmc_v8_0_flush_gpu_tlb_pasid Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than once VMID, build a mask of VMIDs to invalidate instead of just restting the first one. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	fb4c52db69	drm/amdgpu: fix and cleanup gmc_v7_0_flush_gpu_tlb_pasid Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than once VMID, build a mask of VMIDs to invalidate instead of just restting the first one. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	a54db42ff3	drm/amdgpu: cleanup gmc_v11_0_flush_gpu_tlb Remove leftovers from copying this from the gmc v10 code. v2: squash in fix from Yifan Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	a70cb2176f	drm/amdgpu: rework gmc_v10_0_flush_gpu_tlb v2 Move the SDMA workaround necessary for Navi 1x into a higher layer. v2: use dev_err Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:52 -04:00
Tao Zhou	8c14a67bdf	drm/amdgpu: change if condition for bad channel bitmap update The amdgpu_ras_eeprom_control.bad_channel_bitmap is u32 type, but the channel index could be larger than 32. For the ASICs whose channel number is more than 32, the amdgpu_dpm_send_hbm_bad_channel_flag interface is not supported, so we simply bypass channel bitmap update under this condition. v2: replace sizeof with BITS_PER_TYPE, we should check bit number instead of byte number. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:52 -04:00
Tao Zhou	6205b558e1	drm/amdgpu: fix value of some UMC parameters for UMC v12 Prepare for bad page retirement. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:52 -04:00
Philip Yang	e61801f162	drm/amdkfd: Don't use sw fault filter if retry cam enabled If retry cam enabled, we don't use sw retry fault filter and add fault into sw filter ring, so we shouldn't remove fault from sw filter. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Christian König	24a6eb92b7	drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb The KIQ code path was ignoring the second flush. Also avoid long lines and re-calculating the register offsets over and over again. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Philip Yang	bcfb9cee61	drm/amdgpu: Increase IH soft ring size for GFX v9.4.3 dGPU On GFX v9.4.3 dGPU, applications have random timeout failure when XNACK on, dmesg log has "amdgpu: IH soft ring buffer overflow 0x900, 0x900", because dGPU mode has 272 cam entries. After increasing IH soft ring to 512 entries, no more IH soft ring overflow message and application passed. Fixes: `bf80d34b6c` ("drm/amdgpu: Increase soft IH ring size") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Lijo Lazar	c45e38f217	drm/amdgpu: Restore partition mode after reset On a full device reset, PSP FW gets unloaded. Hence restore the partition mode by placing a new request. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Cong Liu	f387bb578d	drm/amdgpu: fix a memory leak in amdgpu_ras_feature_enable This patch fixes a memory leak in the amdgpu_ras_feature_enable() function. The leak occurs when the function sends a command to the firmware to enable or disable a RAS feature for a GFX block. If the command fails, the kfree() function is not called to free the info memory. Fixes: `9f051d6ff1` ("drm/amdgpu: Free ras cmd input buffer properly") Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Cong Liu <liucong2@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 17:27:04 -04:00
Lijo Lazar	06cce38ef5	Revert "drm/amdgpu: Report vbios version instead of PN" This reverts commit `7748ce5b69`. vbios_version sysfs node is used to identify Part Number also. Revert to the same so that it doesn't break scripts/software which parse this. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 17:26:15 -04:00
Lijo Lazar	ff96ddc3f2	drm/amdgpu: Add more fields to IP version Include subrevision and variant fileds also to IP version. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:25:17 -04:00
Tao Zhou	f8754f58d6	drm/amdgpu: print channel index for UMC bad page Print channel index for UMC v12. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:25:17 -04:00
Sathishkumar S	5aba51233b	drm/amdgpu: update IP count INFO query update the query to return the number of functional instances where there is more than an instance of the requested type and for others continue to return one. v2: count must reflect the actual number of engines (Alex) v3: fix wrong number of engines for vcn (Alex) Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:09 -04:00
Stanley.Yang	a83f2bf1f4	drm/amdgpu: Fix false positive error log It should first check block ras obj whether be set, it should return 0 directly if block ras obj or hw_ops is not set. If block doesn't support RAS just return 0 is fine. Changed from V1: return 0 directly if block ras obj or hw ops is not set Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:09 -04:00
Vignesh Chander	8c95cda3e1	drm/amdgpu/jpeg: skip set pg for sriov Host handles PG. Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:09 -04:00
André Almeida	a769178585	drm/amdgpu: Rework coredump to use memory dynamically Instead of storing coredump information inside amdgpu_device struct, move if to a proper separated struct and allocate it dynamically. This will make it easier to further expand the logged information. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:07 -04:00
Cong Liu	5838f74c29	drm/amdgpu: fix a memory leak in amdgpu_ras_feature_enable This patch fixes a memory leak in the amdgpu_ras_feature_enable() function. The leak occurs when the function sends a command to the firmware to enable or disable a RAS feature for a GFX block. If the command fails, the kfree() function is not called to free the info memory. Fixes: `9f051d6ff1` ("drm/amdgpu: Free ras cmd input buffer properly") Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Cong Liu <liucong2@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:07 -04:00
Lijo Lazar	24f60ddc4b	drm/amdgpu: Fix vbios version string search Search for vbios version string in STRING_OFFSET-ATOM_ROM_HEADER region first. If those offsets are not populated, use the hardcoded region. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:06 -04:00
Lijo Lazar	2af351d692	Revert "drm/amdgpu: Report vbios version instead of PN" This reverts commit `7748ce5b69`. vbios_version sysfs node is used to identify Part Number also. Revert to the same so that it doesn't break scripts/software which parse this. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:06 -04:00
David Francis	5f248462c6	drm/amdgpu: Add EXT_COHERENT memory allocation flags These flags (for GEM and SVM allocations) allocate memory that allows for system-scope atomic semantics. On GFX943 these flags cause caches to be avoided on non-local memory. On all other ASICs they are identical in functionality to the equivalent COHERENT flags. Corresponding Thunk patch is at https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88 Reviewed-by: David Yat Sin <David.YatSin@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:06 -04:00
Yang Wang	4051844c66	drm/amdgpu: add amdgpu mca debug sysfs support add amdgpu mca debug sysfs support. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:25:19 -04:00
Alex Deucher	d11bbacee3	drm/amdgpu: add VPE IP discovery info to HW IP info query Add missing IP discovery info. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:25:11 -04:00
Yang Wang	7ff607e272	drm/amdgpu: add amdgpu smu mca dump feature support add amdgpu smu mca dump feature support. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:25:01 -04:00
Mario Limonciello	7f4ce7b50a	drm/amd: Enable seamless boot by default on newer ASICs Seamless boot can technically be supported as far back as DCN1 but to avoid regressions on older hardware, enable it for DCN3 and later. If users report using the module parameter that it works on older ASICs as well, this can be adjusted. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:46 -04:00
Mario Limonciello	5dc270d366	drm/amd: Add a module parameter for seamless boot The module parameter can be used to test more easily enabling seamless boot support on additional ASICs. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:39 -04:00
Timmy Tsai	2fa73a101c	drm/amd: Add HDP flush during jpeg init During jpeg init, CPU writes to frame buffer which can be cached by HDP, occasionally causing invalid header to be sent to MMSCH. Perform HDP flush after writing to frame buffer before continuing with jpeg init sequence. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:28 -04:00
Mario Limonciello	bb0f84293e	drm/amd: Move seamless boot check out of display This will allow base driver to dictate whether seamless should be enabled. No intended functional changes. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:21 -04:00
Mario Limonciello	3ef07651a5	drm/amd: Drop special case for yellow carp without discovery `amdgpu_gmc_get_vbios_allocations` has a special case for how to bring up yellow carp when amdgpu discovery is turned off. As this ASIC ships with discovery turned on, it's generally dead code and worse it causes `adev->mman.keep_stolen_vga_memory` to not be initialized for yellow carp. Remove it. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:04 -04:00
Lijo Lazar	4e8303cf2c	drm/amdgpu: Use function for IP version check Use an inline function for version check. Gives more flexibility to handle any format changes. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:23:28 -04:00
Tvrtko Ursulin	1c7a387ffe	drm: Update file owner during use With the typical model where the display server opens the file descriptor and then hands it over to the client(), we were showing stale data in debugfs. Fix it by updating the drm_file->pid on ioctl access from a different process. The field is also made RCU protected to allow for lockless readers. Update side is protected with dev->filelist_mutex. Before: $ cat /sys/kernel/debug/dri/0/clients command pid dev master a uid magic Xorg 2344 0 y y 0 0 Xorg 2344 0 n y 0 2 Xorg 2344 0 n y 0 3 Xorg 2344 0 n y 0 4 After: $ cat /sys/kernel/debug/dri/0/clients command tgid dev master a uid magic Xorg 830 0 y y 0 0 xfce4-session 880 0 n y 0 1 xfwm4 943 0 n y 0 2 neverball 1095 0 n y 0 3 ) More detailed and historically accurate description of various handover implementation kindly provided by Emil Velikov: """ The traditional model, the server was the orchestrator managing the primary device node. From the fd, to the master status and authentication. But looking at the fd alone, this has varied across the years. IIRC in the DRI1 days, Xorg (libdrm really) would have a list of open fd(s) and reuse those whenever needed, DRI2 the client was responsible for open() themselves and with DRI3 the fd was passed to the client. Around the inception of DRI3 and systemd-logind, the latter became another possible orchestrator. Whereby Xorg and Wayland compositors could ask it for the fd. For various reasons (hysterical and genuine ones) Xorg has a fallback path going the open(), whereas Wayland compositors are moving to solely relying on logind... some never had fallback even. Over the past few years, more projects have emerged which provide functionality similar (be that on API level, Dbus, or otherwise) to systemd-logind. """ v2: * Fixed typo in commit text and added a fine historical explanation from Emil. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230621094824.2348732-1-tvrtko.ursulin@linux.intel.com Signed-off-by: Christian König <christian.koenig@amd.com>	2023-09-20 15:27:44 +02:00
Dave Airlie	1216d49178	amd-drm-fixes-6.6-2023-09-13: amdgpu: - GC 9.4.3 fixes - Fix white screen issues with S/G display on system with >= 64G of ram - Replay fixes - SMU 13.0.6 fixes - AUX backlight fix - NBIO 4.3 SR-IOV fixes for HDP - RAS fixes - DP MST resume fix - Fix segfault on systems with no vbios - DPIA fixes amdkfd: - CWSR grace period fix - Unaligned doorbell fix - CRIU fix for GFX11 - Add missing TLB flush on gfx10 and newer -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZQIRSAAKCRC93/aFa7yZ 2O/nAP4zB0fdLB46Hhz11aYsE9Zghe91b2rcmF4EYpEAQs7awwEAhSjy0Wiy6EYb prEGCdW0O8Tq7fdjr7+JrPmF7dasAQk= =SUbg -----END PGP SIGNATURE----- Merge tag 'amd-drm-fixes-6.6-2023-09-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.6-2023-09-13: amdgpu: - GC 9.4.3 fixes - Fix white screen issues with S/G display on system with >= 64G of ram - Replay fixes - SMU 13.0.6 fixes - AUX backlight fix - NBIO 4.3 SR-IOV fixes for HDP - RAS fixes - DP MST resume fix - Fix segfault on systems with no vbios - DPIA fixes amdkfd: - CWSR grace period fix - Unaligned doorbell fix - CRIU fix for GFX11 - Add missing TLB flush on gfx10 and newer Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230913195009.7714-1-alexander.deucher@amd.com	2023-09-15 09:50:50 +10:00
Daniel Vetter	15794f9dc3	One doc fix for drm/connector, one fix for amdgpu for an crash when VRAM usage is high, and one fix in gm12u320 to fix the timeout units in the code -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZPl/TAAKCRDj7w1vZxhR xWZZAP0b3k5vIuQdbiZBdXy7+guakiJ2DqOMxJJ+sYS5Mun53AEA73Cu1gmBNMoT d8H1uBjOfvPcXANNI0t0OgJfrESOdg8= =atPC -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2023-09-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes One doc fix for drm/connector, one fix for amdgpu for an crash when VRAM usage is high, and one fix in gm12u320 to fix the timeout units in the code Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/w5nlld5ukeh6bgtljsxmkex3e7s7f4qquuqkv5lv4cv3uxzwqr@pgokpejfsyef	2023-09-14 14:00:51 +02:00
Alex Deucher	addd7aef25	drm/amdgpu: add remap_hdp_registers callback for nbio 7.11 Implement support for remapping the HDP aperture registers for NBIO 7.11. Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-12 17:30:22 -04:00
Alex Deucher	b85a17d354	drm/amdgpu: add vcn_doorbell_range callback for nbio 7.11 Implement support for setting up the VCN doorbell range for NBIO 7.11. Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-12 17:30:18 -04:00
Thomas Zimmermann	c900529f3d	Merge drm/drm-fixes into drm-misc-fixes Forwarding to v6.6-rc1. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2023-09-12 08:53:30 +02:00
David Francis	5e7e822542	drm/amdgpu: Handle null atom context in VBIOS info ioctl On some APU systems, there is no atom context and so the atom_context struct is null. Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl to handle this case, returning all zeroes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:25:26 -04:00
Hawking Zhang	ffd6bde302	drm/amdgpu: fallback to old RAS error message for aqua_vanjaram So driver doesn't generate incorrect message until the new format is settled down for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:20:07 -04:00
Alex Deucher	ab43213e7a	drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset for SR-IOV Needed for HDP flush to work correctly. Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:19:48 -04:00
Alex Deucher	1832403cd4	drm/amdgpu/soc21: don't remap HDP registers for SR-IOV This matches the behavior for soc15 and nv. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:19:42 -04:00
Hamza Mahfooz	169ed4ece8	Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory" This reverts commit `70e64c4d52`. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: stable@vger.kernel.org # 6.1+ Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:18:17 -04:00
Mukul Joshi	97e3c6a853	drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3 Currently, we store CU info only for a single XCC assuming that it is the same for all XCCs. However, that may not be true. As a result, store CU info for all XCCs. This info is later used for CU masking. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:16:31 -04:00
Mukul Joshi	81faf9e0c3	drm/amdkfd: Fix reg offset for setting CWSR grace period This patch fixes the case where the code currently passes absolute register address and not the reg offset, which HWS expects, when sending the PM4 packet to set/update CWSR grace period. Additionally, cleanup the signature of build_grace_period_packet_info function as it no longer needs the inst parameter. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:15:43 -04:00
David Francis	86f2ec2265	drm/amdgpu: Handle null atom context in VBIOS info ioctl On some APU systems, there is no atom context and so the atom_context struct is null. Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl to handle this case, returning all zeroes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:22:32 -04:00
André Almeida	ffde72107b	drm/amdgpu: Create an option to disable soft recovery Create a module option to disable soft recoveries on amdgpu, making every recovery go through the device reset path. This option makes easier to force device resets for testing and debugging purposes. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:22:23 -04:00
André Almeida	887db1e49a	drm/amdgpu: Merge debug module parameters Merge all developer debug options available as separated module parameters in one, making it obvious that are for developers. Drop the obsolete module options in favor of the new ones. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:19:54 -04:00
Yifan Zhang	0e64c9aad0	drm/amdgpu: add type conversion for gc info gc info usage misses type conversion. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:17:30 -04:00
Mukul Joshi	68fa72a437	drm/amdgpu: Rename KGD_MAX_QUEUES to AMDGPU_MAX_QUEUES Rename KGD_MAX_QUEUES to AMDGPU_MAX_QUEUES to conform with the naming convention followed in amdgpu_gfx.h. No functional change. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:15:41 -04:00
Tao Zhou	ced575203a	drm/amdgpu: print more address info of UMC bad page Print out row, column and bank value of UMC error address for UMC v12. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:15:15 -04:00
Hawking Zhang	9f9d4651f7	drm/amdgpu: fallback to old RAS error message for aqua_vanjaram So driver doesn't generate incorrect message until the new format is settled down for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:15:02 -04:00
Alex Deucher	6a82822b90	drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset for SR-IOV Needed for HDP flush to work correctly. Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:14:59 -04:00
Alex Deucher	8a6e26e7ef	drm/amdgpu/soc21: don't remap HDP registers for SR-IOV This matches the behavior for soc15 and nv. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:14:50 -04:00
Hamza Mahfooz	601c63ad8e	Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory" This reverts commit `70e64c4d52`. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: stable@vger.kernel.org # 6.1+ Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:12:20 -04:00
Tao Zhou	3cb9ebc9d6	drm/amdgpu: add channel index table for UMC v12 Get UMC phyical channel index according to node id, umc instance and channel instance. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:58 -04:00
Tao Zhou	40a08fe890	drm/amdgpu: add address conversion for UMC v12 Convert MCA error address to physical address and find out all pages in one physical row. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:35 -04:00
Lijo Lazar	ca7aa3bf31	drm/amdgpu: Use default reset method handler When reset method is not passed in reset context, look for the handler for default reset method. On Aldebaran, default reset method for SOCs connected to CPU over XGMI is MODE2. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:25 -04:00
Mukul Joshi	f705a6f021	drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3 Currently, we store CU info only for a single XCC assuming that it is the same for all XCCs. However, that may not be true. As a result, store CU info for all XCCs. This info is later used for CU masking. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:19 -04:00
Ma Jun	a1ce3e1f7c	drm/amd: Fix the flag setting code for interrupt request [1] Remove the irq flags setting code since pci_alloc_irq_vectors() handles these flags. [2] Free the msi vectors in case of error. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:09 -04:00
Lang Yu	dbb8052151	drm/amdgpu: fix unsigned error codes Fixes: `5d5eac7e83` ("drm/amdgpu: add selftest framework for UMSCH") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/all/ZPhddADtKmOuVyDq@lang-desktop Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:09:12 -04:00
Mukul Joshi	56d6daa3c7	drm/amdkfd: Fix reg offset for setting CWSR grace period This patch fixes the case where the code currently passes absolute register address and not the reg offset, which HWS expects, when sending the PM4 packet to set/update CWSR grace period. Additionally, cleanup the signature of build_grace_period_packet_info function as it no longer needs the inst parameter. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:07:33 -04:00
Arunpravin Paneer Selvam	2eb412aa25	drm/amdgpu: Move the size computations to drm buddy - Move roundup_power_of_two() and IS_ALIGNED() computations to drm buddy file to support the new try harder mechanism for contiguous allocation. - Move trim function call to drm_buddy_alloc_blocks() function. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230909160902.15644-2-Arunpravin.PaneerSelvam@amd.com Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2023-09-11 20:18:06 +02:00
Linus Torvalds	a48fa7efaf	drm fixes for 6.6-rc1 amdgpu: - Display replay fixes - Fixes for headless boards - Fix documentation breakage - RAS fixes - Handle newer IP discovery tables - SMU 13.0.6 fixes - SR-IOV fixes - Display vstartup fixes - NBIO 7.9 fixes - Display scaling mode fixes - Debugfs power reporting fix - GC 9.4.3 fixes - Dirty framebuffer fixes for fbcon - eDP fixes - DCN 3.1.5 fix - Display ODM fixes - GPU core dump fix - Re-enable zops property now that IGT test is fixed - Fix possible UAF in CS code - Cursor degamma fix amdkfd: - HMM fixes - Interrupt masking fix - GFX11 MQD fixes i915: - Mark requests for GuC virtual engines to avoid use-after-free nouveau: - Fix fence state in nouveau_fence_emit() ivpu: - replace strncpy -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmT6ihUACgkQDHTzWXnE hr6Vqg/+OGfbxx0qev5C93bLYpg8d4zbply0zTed2hU48zczARkOyDH+h2uYM4tP rmB6mK/0KRy3H8vkngKduR/IF6QBwzOnLDpS+C/TrHYQYqMDwvs3qEDVYqXh3V5H GPIFuu9sb2Nb/o9Fid70pvNACbgGsAUgqMUhQ/is3NjzOR8S/qjdBQ7wSdoUoOqx okTdlwuMK5SEYrihSyTZvhNcwpR/1L8JuxOUXXXUSQ0tRXBer/ZNF2lcEyYmQ0zs bZHKM4dNbdew/EhygbH6LVB5RjFaT5pGw08Xm8zJry+q5tXQV/NIXPQHL3vWqQoX i2QLbvGX/Uu8LJg9YNdsa1kPwNKADAxF64cW38Llv8ybsPHyva/I255j689/TvSG Se7HkTooURKS6GWFHPOkyMNC0Y+Fb/7WG5zUPSDVk9stqJz9pzx48okTsCWeuovD cBOssp8If1QsTyPvDq5A47l5z1oO3J1rdJ9fL0GnpOuXulPhgJGIoUSkftyI6lbw rhUoRd7w6VcgOsA9WIkAf+/325em0Y0AKZBgQnr2jfF56IE4iLa8yFuJNeReZ9oy W9yf14AB0orVm9+P4+PqATXBh+PdCTD1CPcB0MyK1SZAjh836Tc0HPKvJAUU6jp+ 8aw3BKXcaLjIP/dCyhSddXCnuuTPWreff5isktwgEXUtNFv9jx0= =fPeN -----END PGP SIGNATURE----- Merge tag 'drm-next-2023-09-08' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Regular rounds of rc1 fixes, a large bunch for amdgpu since it's three weeks in one go, one i915, one nouveau and one ivpu. I think there might be a few more fixes in misc that I haven't pulled in yet, but we should get them all for rc2. amdgpu: - Display replay fixes - Fixes for headless boards - Fix documentation breakage - RAS fixes - Handle newer IP discovery tables - SMU 13.0.6 fixes - SR-IOV fixes - Display vstartup fixes - NBIO 7.9 fixes - Display scaling mode fixes - Debugfs power reporting fix - GC 9.4.3 fixes - Dirty framebuffer fixes for fbcon - eDP fixes - DCN 3.1.5 fix - Display ODM fixes - GPU core dump fix - Re-enable zops property now that IGT test is fixed - Fix possible UAF in CS code - Cursor degamma fix amdkfd: - HMM fixes - Interrupt masking fix - GFX11 MQD fixes i915: - Mark requests for GuC virtual engines to avoid use-after-free nouveau: - Fix fence state in nouveau_fence_emit() ivpu: - replace strncpy" * tag 'drm-next-2023-09-08' of git://anongit.freedesktop.org/drm/drm: (51 commits) drm/amdgpu: Restrict bootloader wait to SMUv13.0.6 drm/amd/display: prevent potential division by zero errors drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma drm/amd/display: limit the v_startup workaround to ASICs older than DCN3.1 Revert "drm/amd/display: Remove v_startup workaround for dcn3+" drm/amdgpu: fix amdgpu_cs_p1_user_fence Revert "Revert "drm/amd/display: Implement zpos property"" drm/amdkfd: Add missing gfx11 MQD manager callbacks drm/amdgpu: Free ras cmd input buffer properly drm/amdgpu: Hide xcp partition sysfs under SRIOV drm/amdgpu: use read-modify-write mode for gfx v9_4_3 SQ setting drm/amdkfd: use mask to get v9 interrupt sq data bits correctly drm/amdgpu: Allocate coredump memory in a nonblocking way drm/amdgpu: Support query ecc cap for aqua_vanjaram drm/amdgpu: Add umc_info v4_0 structure drm/amd/display: always switch off ODM before committing more streams drm/amd/display: Remove wait while locked drm/amd/display: update blank state on ODM changes drm/amd/display: Add smu write msg id fail retry process drm/amdgpu: Add SMU v13.0.6 default reset methods ...	2023-09-07 19:47:04 -07:00
Lijo Lazar	fbe1a9e0c7	drm/amdgpu: Restrict bootloader wait to SMUv13.0.6 Restrict the wait for boot loader steady state only to SMUv13.0.6. For older SOCs, ASIC init has a longer wait period and that takes care. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 22:11:51 -04:00
Candice Li	7e6ec09974	drm/amdgpu: Add umc v12_0 ras functions Add umc v12_0 ras error querying. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:38:00 -04:00
Lijo Lazar	6b7d211740	drm/amdgpu: Fix refclk reporting for SMU v13.0.6 SMU v13.0.6 SOCs have 100MHz reference clock. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:37:03 -04:00
Hawking Zhang	c2c23a10f1	drm/amdgpu: Correct se_num and reg_inst for gfx v9_4_3 ras counters gfx_v9_4_3_ue\|ce_reg_list is an array per gfx core instance correct the settings of se_num and reg_inst for some of gfx ras counters so all the available register instances can be polled for ras status. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:36:57 -04:00
Lijo Lazar	1b8e56b994	drm/amdgpu: Restrict bootloader wait to SMUv13.0.6 Restrict the wait for boot loader steady state only to SMUv13.0.6. For older SOCs, ASIC init has a longer wait period and that takes care. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:36:52 -04:00
Lijo Lazar	b93fb0fe24	drm/amdgpu: Add only valid firmware version nodes Show only firmware version attributes that have valid version. Hide others. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:36:44 -04:00
Lang Yu	d519072d26	drm/amdgpu: fix incompatible types in conditional expression Use proper type. Fixes: `9d4346bdbc` ("drm/amdgpu: add VPE 6.1.0 support") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Solomon Chiu <solomon.chiu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202309020608.FwP8QMht-lkp@intel.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:35:29 -04:00
Srinivasan Shanmugam	eb3b214c37	drm/amdgpu: Use min_t to replace min Use min_t to replace min, min_t is a bit fast because min use twice typeof. And using min_t is cleaner here since the min/max macros do a typecheck while min_t()/max_t() to an explicit type cast. Fixes the below checkpatch warning: WARNING: min() should probably be min_t() Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:35:22 -04:00
Candice Li	d57e24aa56	drm/amdgpu: Update amdgpu_device_indirect_r/wreg_ext Only calculate pcie_index_hi for register address greater than 32bits. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:34:26 -04:00
Candice Li	a76b2870bd	drm/amdgpu: Add RREG64_PCIE_EXT/WREG64_PCIE_EXT functions Add 64bits register access support on register whose address is greater than 32bits. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:34:18 -04:00
Srinivasan Shanmugam	9b70a1d414	drm/amdgpu: Declare array with strings as pointers constant This warning is for the declaration of a static array, and it is recommended to declare it as type "static const char * const" instead of "static const char ". an array pointer declared as type "static const char " can point to a different character constant because the pointer is mutable. However, if it is declared as type "static const char * const", the pointer will point to an immutable character constant, preventing it from being modified which can better ensure the safety and stability of the program. Fixes the below: WARNING: static const char * array should probably be static const char * const Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:34:11 -04:00
Jiapeng Chong	df04434cb5	drm/amdgpu: clean up some inconsistent indenting No functional modification involved. drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c:34 nbio_v7_11_get_rev_id() warn: inconsistent indenting. v2: drop leftover printk (Alex) Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6316 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:32:25 -04:00
Yifan Zhang	0bdf09cc5e	drm/amdgpu: calling address translation functions to simplify codes Use amdgpu_gmc_vram_pa to simplify codes. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:31:52 -04:00
Simon Pilkington	e2884fe84a	drm/amd: Make fence wait in suballocator uninterruptible Commit `c103a23f2f` ("drm/amd: Convert amdgpu to use suballocation helper.") made the fence wait in amdgpu_sa_bo_new() interruptible but there is no code to handle an interrupt. This caused the kernel to randomly explode in high-VRAM-pressure situations so make it uninterruptible again. Signed-off-by: Simon Pilkington <simonp.git@gmail.com> Fixes: `c103a23f2f` ("drm/amd: Convert amdgpu to use suballocation helper.") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2761 CC: stable@vger.kernel.org # 6.4+	2023-09-01 15:12:07 +02:00
Christian König	35588314e9	drm/amdgpu: fix amdgpu_cs_p1_user_fence The offset is just 32bits here so this can potentially overflow if somebody specifies a large value. Instead reduce the size to calculate the last possible offset. The error handling path incorrectly drops the reference to the user fence BO resulting in potential reference count underflow. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-08-31 18:14:49 -04:00
Hawking Zhang	9f051d6ff1	drm/amdgpu: Free ras cmd input buffer properly Do not access the pointer for ras input cmd buffer if it is even not allocated. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:12:13 -04:00
Rajneesh Bhardwaj	2031c46b09	drm/amdgpu: Hide xcp partition sysfs under SRIOV XCP partitions should not be visible for the VF for GFXIP 9.4.3. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:11:41 -04:00
Tao Zhou	e23b10675a	drm/amdgpu: use read-modify-write mode for gfx v9_4_3 SQ setting Instead of using direct update, avoid touching unrelated fields. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:11:09 -04:00
André Almeida	6d1b345548	drm/amdgpu: Allocate coredump memory in a nonblocking way During a GPU reset, a normal memory reclaim could block to reclaim memory. Giving that coredump is a best effort mechanism, it shouldn't disturb the reset path. Change its memory allocation flag to a nonblocking one. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:10:11 -04:00
Hawking Zhang	e0c5c387ac	drm/amdgpu: Support query ecc cap for aqua_vanjaram Driver queries umc_info v4_0 to identify ecc cap for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:09:45 -04:00
Lijo Lazar	05347402d1	drm/amdgpu: Add SMU v13.0.6 default reset methods For APUs with SMU v13.0.6, mode-2 reset is kept as default and for others mode-1 is the default reset method. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:05:43 -04:00
Lijo Lazar	7c2949c12e	drm/amdgpu: Add bootloader wait for PSP v13 Implement the wait for bootloader call back for PSP v13.0 ASICs. Only for ASICs with PSP v13.0.6, it needs an additional check for VBIOS mailbox status. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:04:10 -04:00
Hamza Mahfooz	0a611560f5	drm/amdgpu: register a dirty framebuffer callback for fbcon fbcon requires that we implement &drm_framebuffer_funcs.dirty. Otherwise, the framebuffer might take a while to flush (which would manifest as noticeable lag). However, we can't enable this callback for non-fbcon cases since it may cause too many atomic commits to be made at once. So, implement amdgpu_dirtyfb() and only enable it for fbcon framebuffers (we can use the "struct drm_file file" parameter in the callback to check for this since it is only NULL when called by fbcon, at least in the mainline kernel) on devices that support atomic KMS. Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: stable@vger.kernel.org # 6.1+ Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2519 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:03:48 -04:00
Mangesh Gadre	7b9f623530	drm/amdgpu: Updated TCP/UTCL1 programming Update TCP/UTCL1 thrashing control settings v2: updated rev_id check Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:02:49 -04:00
Hawking Zhang	7d4424373d	drm/amdgpu: Fix the return for gpu mode1_reset amdgpu_device_mode1_reset will return gpu mode1_reset succeed (ret = 0) as long as wait_for_bootloader call succeed, regardless of the status reported by smu or psp firmware. This results to driver continue executing recovery even smu or psp fail to perform mode1 reset. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:02:10 -04:00
Mangesh Gadre	3f16096795	drm/amdgpu: Remove SRAM clock gater override by driver rlc firmware does required setting, driver need not do it. Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:59:17 -04:00
Lijo Lazar	7656168a8a	drm/amdgpu: Add bootloader status check Add a function to wait till bootloader has reached steady state. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:58:58 -04:00
Horace Chen	8c97e87c13	drm/amdkfd: use correct method to get clock under SRIOV [What] Current SRIOV still using adev->clock.default_XX which gets from atomfirmware. But these fields are abandoned in atomfirmware long ago. Which may cause function to return a 0 value. [How] We don't need to check whether SR-IOV. For SR-IOV one-vf-mode, pm is enabled and VF is able to read dpm clock from pmfw, so we can use dpm clock interface directly. For multi-VF mode, VF pm is disabled, so driver can just react as pm disabled. One-vf-mode is introduced from GFX9 so it shall not have any backward compatibility issue. Signed-off-by: Horace Chen <horace.chen@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:58:29 -04:00
Lijo Lazar	8f1778939b	drm/amdgpu: Unset baco dummy mode on nbio v7.9 BACO dummy mode could be set under reset conditions and that affects framebuffer access. Check If baco dummy mode is set, unset it if so. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:58:23 -04:00
YiPeng Chai	e81c455685	drm/amdgpu: Enable ras for mp0 v13_0_6 sriov Enable ras for mp0 v13_0_6 sriov Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:55:02 -04:00
Samir Dhume	bae44a8fcb	drm/amdgpu/jpeg - skip change of power-gating state for sriov Powergating is handled in the host driver. Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Samir Dhume <samir.dhume@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:54:18 -04:00
Le Ma	46b55e25c9	drm/amdgpu: update gc_info v2_1 from discovery Several new fields are exposed in gc_info v2_1 Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:53:19 -04:00
Le Ma	d4f6425a56	drm/amdgpu: update mall info v2 from discovery Mall info v2 is introduced in ip discovery Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:53:08 -04:00
Candice Li	4b721ed87e	drm/amdgpu: Only support RAS EEPROM on dGPU platform RAS EEPROM device is only supported on dGPU platform for smu v13_0_6. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:52:49 -04:00
Lang Yu	983ac45a06	drm/amdgpu: update SET_HW_RESOURCES definition for UMSCH Align with FW changes. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	eebb06d121	drm/amdgpu: add amdgpu_umsch_mm module parameter Enable Multi Media User Mode Scheduler (0 = disabled (default), 1 = enabled). Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	822f780829	drm/amdgpu/discovery: enable UMSCH 4.0 in IP discovery Enable UMSCH to support VPE and VCN user queues. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	4f94903332	drm/amdgpu: add PSP loading support for UMSCH Add front door loading support. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	40748f9a0a	drm/amdgpu: reserve mmhub engine 3 for UMSCH FW UMSCH FW uses mmhub engine 3 for invalidation. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Acked-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	d591ae0c9f	drm/amdgpu: add VPE queue submission test Submit a fence command through indirect buffer. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	5d5eac7e83	drm/amdgpu: add selftest framework for UMSCH Prepare for VPE and VCN queue submission test. v2: rebase on drm_exec (Alex) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	dc6f3d6ff2	drm/amdgpu: enable UMSCH scheduling for VPE Add VPE into UMSCH hw resourses, set vmid mask to 0xf00, set hqd mask to 0xfe, then UMSCH can schedule VPE queues. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:40:56 -04:00
Lang Yu	3488c79bea	drm/amdgpu: add initial support for UMSCH Add basic data structure, dummy ring functions and ip functions for UMSCH. Implement sw_init(ring_init and init_microcodede) and hw_init(load_microcode), UMSCH can boot up now. Implement hw_init(ring_start) and hw_fini(ring_stop), UMSCH is ready for command submission now. Implement set_hw_resources and add/remove_queue, UMSCH is ready for scheduling now. Aggregated doorbell is used to notify UMSCH FW that there is unmapped queue with corresponding priority level (e.g., AGDB[0] for Real time band, etc.) is updating its job. v2: squash together initial patches to avoid breaking the build (Alex) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:40:53 -04:00
Lang Yu	9c852a42a9	drm/amdgpu: add UMSCH firmware header definition Add firmware header definition for UMSCH. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:40:45 -04:00
Lang Yu	1a29f36781	drm/amdgpu: add UMSCH RING TYPE definition Add RING TYPE definition for Multi Mdeia User Mode Scheduler. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:40:40 -04:00
Saleemkhan Jamadar	1cf36599b9	drm/amdgpu/jpeg: initialize number of jpeg ring Initialize number of jpeg ring for vcn 4.0.5. Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:39:41 -04:00
Christian König	a5492fe27f	drm/amdgpu: fix amdgpu_cs_p1_user_fence The offset is just 32bits here so this can potentially overflow if somebody specifies a large value. Instead reduce the size to calculate the last possible offset. The error handling path incorrectly drops the reference to the user fence BO resulting in potential reference count underflow. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:39:28 -04:00
Evan Quan	90bcb9b595	drm/amdgpu: revise the device initialization sequences By placing the sysfs interfaces creation after `.late_int`. Since some operations performed during `.late_init` may affect how the sysfs interfaces should be created. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:35:33 -04:00
Evan Quan	3e38b634f9	drm/amd/pm: introduce a new set of OD interfaces There will be multiple interfaces(sysfs files) exposed with each representing a single OD functionality. And all those interface will be arranged in a tree liked hierarchy with the top dir as "gpu_od". Meanwhile all functionalities for the same component will be arranged under the same directory. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:35:26 -04:00
Saleemkhan Jamadar	6be6e74b7d	drm/amdgpu: enable PG flags for VCN Enable PG flags for VCN and Jpeg on IP 11_5_0 Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:35:02 -04:00
Saleemkhan Jamadar	844d8dd5b9	drm/amdgpu/discovery: add VCN 4.0.5 Support Enable VCN 4.0.5 on gc 11_5_0. Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:58 -04:00
Saleemkhan Jamadar	c64f389506	drm/amdgpu/soc21: Add video cap query support for VCN_4_0_5 Added the video capability query support for VCN version 4_0_5 Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:52 -04:00
Saleemkhan Jamadar	cc308acc9b	drm/amdgpu:enable CG and PG flags for VCN Enable CG and PG flags for VCN on IP 11_5_0 Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:48 -04:00
Saleemkhan Jamadar	1827b37582	drm/amdgpu: add VCN_4_0_5 firmware support Add VCN_4_0_5 firmware support Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:43 -04:00
Saleemkhan Jamadar	8f98a715da	drm/amdgpu/jpeg: add jpeg support for VCN4_0_5 Add jpeg support for VCN4_0_5 v2 - update license year (Leo Liu) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:36 -04:00
Saleemkhan Jamadar	547aad32ed	drm/amdgpu: add VCN4 ip block support Add VCN 4.0.5 initialization and decoder/encoder ring functions. v2 - update license year (Leo Liu) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:32 -04:00
Lang Yu	f9ecae9a4e	drm/amdgpu: fix VPE front door loading issue Implement proper front door loading for vpe 6.1. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:22 -04:00
Lang Yu	5f6e9cdc83	drm/amdgpu: add VPE FW version query support Add support to query VPE FW version. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:19 -04:00
Lang Yu	3ee8fb7005	drm/amdgpu: enable VPE for VPE 6.1.0 Enable Video Processing Engine on SoCs that contain VPE 6.1.0. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:16 -04:00
Lang Yu	523c12802d	drm/amdgpu: add user space CS support for VPE Enable command submission to VPE from user space. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:14 -04:00
Lang Yu	c5d67a0ec3	drm/amdgpu: add PSP loading support for VPE Add PSP loading support for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:10 -04:00
Lang Yu	9d4346bdbc	drm/amdgpu: add VPE 6.1.0 support Add skeleton driver code. (Ray) Add initial support for Video Processing Engine. (Lang) Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:05 -04:00
Lang Yu	5861e47731	drm/amdgpu: add nbio 7.11 callback for VPE Add nbio callback to configure doorbell settings. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:56 -04:00
Lang Yu	75fdd738ff	drm/amdgpu: add nbio callback for VPE Add nbio callback to configure doorbell settings. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:53 -04:00
Lang Yu	964a36d7a4	drm/amdgpu: add PSP FW TYPE for VPE Add PSP FW TYPE for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:50 -04:00
Lang Yu	4c63735fa8	drm/amdgpu: add UCODE ID for VPE Add UCODE ID for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:47 -04:00
Lang Yu	ce7b59c1e6	drm/amdgpu: add support for VPE firmware name decoding Add decoding VPE firmware name support. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:43 -04:00
Lang Yu	2f3916bedb	drm/amdgpu: add doorbell index for VPE Add doorbell index for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:40 -04:00
Lang Yu	0b233357a6	drm/amdgpu: add HWID for VPE Add HWID for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:31 -04:00
Lang Yu	b0fa855cab	drm/amdgpu: add VPE firmware interface Add initial firmware interface. (Ray) Add more opcodes and rename to vpe_v6_1. (Lang) v2: Update copyright date (Alex) Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:27 -04:00
Lang Yu	878fe05116	drm/amdgpu: add VPE firmware header definition Add firmware header definition for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:25 -04:00
Huang Rui	5b28f1c720	drm/amdgpu: add VPE HW IP BLOCK definition Add HW IP BLOCK for Video Processing Engine. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:21 -04:00
Huang Rui	2d6ea3b07c	drm/amdgpu: add VPE RING TYPE definition Add RING TYPE for Video Processing Engine. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:09 -04:00
Linus Torvalds	b6f6167ea8	pci-v6.6-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmTvfQgUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vyDKA//UBxniXTyxvN8L/agMZngFJd9jLkE p2lnk5eTW6y/aJp1g+ujc7IJEmHG/B1Flp0b5mK8XL7S6OBtAGlPwnuPPpXb0ZxV ofSuQpYoNZGpkYrQMYvATfdLnH2WF3Yj3WCqh5jd2EldPEyqhMV68l7NMzf6+td2 KWJPli1XO8e60JAzbhpXH9vn1I0T8e6Qx8z/ulcydfiOH3PGDPnVrEo8gw9CvJOr aDqSPW7uhTk2SjjUJcAlQVpTGclE4yBxOOhEbuSGc7L6Ab04Y6D0XKx1589AUK6Z W2dQFK3cFYNQQ9aS/2DMUG88H09ca5t8kgUf7Iz3uan1soPzSYK8SLNBgxAPs11S 1jY093rDXXoaCJqxWUwDc/JUpWq6T3g4m445SNvFIOMcSwmMOIfAwfug4UexE1zC Ie8u3Um35Mp25o0o6V1J2EjdBsUsm0p//CsslfoAAIWi85W02Z/46bLLcITchkCe bP05H+c55ZN6maRJiaeghcpY+iWO4XCRCKS9mF1v9yn7FOhNxhBcwgTNPyGBVrYz T9w3ynTHAmuwNqtd6jhpTR/b1902up/Qv9I8uHhBDMqJAXfHocGEXHZblNuZMgfE bu9cjcbFghUPdrhUHYmbEqAzhdlL2SFuMYfn8D4QV4A6x+32xCdwsi39I0Effm5V wl0HmemjKjTYbLw= =iFFM -----END PGP SIGNATURE----- Merge tag 'pci-v6.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Add locking to read/modify/write PCIe Capability Register accessors for Link Control and Root Control - Use pci_dev_id() when possible instead of manually composing ID from dev->bus->number and dev->devfn Resource management: - Move prototypes for __weak sysfs resource files to linux/pci.h to fix 'no previous prototype' warnings - Make more I/O port accesses depend on HAS_IOPORT - Use devm_platform_get_and_ioremap_resource() instead of open-coding platform_get_resource() followed by devm_ioremap_resource() Power management: - Ensure devices are powered up while accessing VPD - If device is powered-up, keep it that way while polling for PME - Only read PCI_PM_CTRL register when available, to avoid reading the wrong register and corrupting dev->current_state Virtualization: - Avoid Secondary Bus Reset on NVIDIA T4 GPUs Error handling: - Remove unused pci_disable_pcie_error_reporting() - Unexport pci_enable_pcie_error_reporting(), used only by aer.c - Unexport pcie_port_bus_type, used only by PCI core VGA: - Simplify and clean up typos in VGA arbiter Apple PCIe controller driver: - Initialize pcie->nvecs (number of available MSIs) before use Broadcom iProc PCIe controller driver: - Use of_property_read_bool() instead of low-level accessors for boolean properties Broadcom STB PCIe controller driver: - Assert PERST# when probing BCM2711 because some bootloaders don't do it Freescale i.MX6 PCIe controller driver: - Add .host_deinit() callback so we can clean up things like regulators on probe failure or driver unload Freescale Layerscape PCIe controller driver: - Add support for link-down notification so the endpoint driver can process LINK_DOWN events - Add suspend/resume support, including manual PME_Turn_off/PME_TO_Ack handshake - Save Link Capabilities during probe so they can be restored when handling a link-up event, since the controller loses the Link Width and Link Speed values during reset Intel VMD host bridge driver: - Fix disable of bridge windows during domain reset; previously we cleared the base/limit registers, which actually left the windows enabled Marvell MVEBU PCIe controller driver: - Remove unused busn member Microchip PolarFlare PCIe controller driver: - Fix interrupt bit definitions so the SEC and DED interrupt handlers work correctly - Make driver buildable as a module - Read FPGA MSI configuration parameters from hardware instead of hard-coding them Microsoft Hyper-V host bridge driver: - To avoid a NULL pointer dereference, skip MSI restore after hibernate if MSI/MSI-X hasn't been enabled NVIDIA Tegra194 PCIe controller driver: - Revert 'PCI: tegra194: Enable support for 256 Byte payload' because Linux doesn't know how to reduce MPS from to 256 to 128 bytes for endpoints below a switch (because other devices below the switch might already be operating), which leads to 'Malformed TLP' errors Qualcomm PCIe controller driver: - Add DT and driver support for interconnect bandwidth voting for 'pcie-mem' and 'cpu-pcie' interconnects - Fix broken SDX65 'compatible' DT property - Configure controller so MHI bus master clock will be switched off while in ASPM L1.x states - Use alignment restriction from EPF core in EPF MHI driver - Add Endpoint eDMA support - Add MHI eDMA support - Add Snapdragon SM8450 support to the EPF MHI driversupport - Add MHI eDMA support - Add Snapdragon SM8450 support to the EPF MHI driversupport - Add MHI eDMA support - Add Snapdragon SM8450 support to the EPF MHI driversupport - Add MHI eDMA support - Add Snapdragon SM8450 support to the EPF MHI driver - Use iATU for EPF MHI transfers smaller than 4K to avoid eDMA setup latency - Add sa8775p DT binding and driver support Rockchip PCIe controller driver: - Use 64-bit mask on MSI 64-bit PCI address to avoid zeroing out the upper 32 bits SiFive FU740 PCIe controller driver: - Set the supported number of MSI vectors so we can use all available MSI interrupts Synopsys DesignWare PCIe controller driver: - Add generic dwc suspend/resume APIs (dw_pcie_suspend_noirq() and dw_pcie_resume_noirq()) to be called by controller driver suspend/resume ops, and a controller callback to send PME_Turn_Off MicroSemi Switchtec management driver: - Add support for PCIe Gen5 devices Miscellaneous: - Reorder and compress to reduce size of struct pci_dev - Fix race in DOE destroy_work_on_stack() - Add stubs to avoid casts between incompatible function types - Explicitly include correct DT includes to untangle headers" * tag 'pci-v6.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (96 commits) PCI: qcom-ep: Add ICC bandwidth voting support dt-bindings: PCI: qcom: ep: Add interconnects path PCI: qcom-ep: Treat unknown IRQ events as an error dt-bindings: PCI: qcom: Fix SDX65 compatible PCI: endpoint: Add kernel-doc for pci_epc_mem_init() API PCI: epf-mhi: Use iATU for small transfers PCI: epf-mhi: Add support for SM8450 PCI: epf-mhi: Add eDMA support PCI: qcom-ep: Add eDMA support PCI: epf-mhi: Make use of the alignment restriction from EPF core PCI/PM: Only read PCI_PM_CTRL register when available PCI: qcom: Add support for sa8775p SoC dt-bindings: PCI: qcom: Add sa8775p compatible PCI: qcom-ep: Pass alignment restriction to the EPF core PCI: Simplify pcie_capability_clear_and_set_word() control flow PCI: Tidy config space save/restore messages PCI: Fix code formatting inconsistencies PCI: Fix typos in docs and comments PCI: Fix pci_bus_resetable(), pci_slot_resetable() name typos PCI: Simplify pci_dev_driver() ...	2023-08-30 20:23:07 -07:00
Srinivasan Shanmugam	8254e05c82	drm/amdgpu: Fix printk_ratelimit() with DRM_ERROR_RATELIMITED in 'amdgpu_cs_ioctl' Replaced printk_ratelimit() with its DRM equivalent to avoid flooding of dmesg logs & hence fixes the following: WARNING: Prefer printk_ratelimited or pr_<level>_ratelimited to printk_ratelimit + if (printk_ratelimit()) Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:17 -04:00
Srinivasan Shanmugam	bf227a4f05	drm/amdgpu: Use READ_ONCE() when reading the values in 'sdma_v4_4_2_ring_get_rptr' Use READ_ONCE() instead of declaring the pointer volatile. To prevent the compiler from refetching or reordering the read, so that the read value is always consistent. Link: https://lwn.net/Articles/624126/ Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: Le Ma <le.ma@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:17 -04:00
Yifan Zhang	4d5dc6260c	drm/amdgpu: remove unused parameter in amdgpu_vmid_grab_idle amdgpu_vm is not used in amdgpu_vmid_grab_idle. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:17 -04:00
Hawking Zhang	bf7aa8bea9	drm/amdgpu: Free ras cmd input buffer properly Do not access the pointer for ras input cmd buffer if it is even not allocated. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Ma Jun	8f9a9a09af	drm/amd: Simplify the bo size check funciton Simplify the code logic of size check function amdgpu_bo_validate_size Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Rajneesh Bhardwaj	d30279a9e3	drm/amdgpu: Hide xcp partition sysfs under SRIOV XCP partitions should not be visible for the VF for GFXIP 9.4.3. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Tao Zhou	ac3343c761	drm/amdgpu: use read-modify-write mode for gfx v9_4_3 SQ setting Instead of using direct update, avoid touching unrelated fields. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
ZhenGuo Yin	9f05cfc78c	drm/amdgpu: access RLC_SPM_MC_CNTL through MMIO in SRIOV runtime Register RLC_SPM_MC_CNTL is not blocked by L1 policy, VF can directly access it through MMIO during SRIOV runtime. v2: use SOC15 interface to access registers Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Lee Jones	668dfc4533	drm/amd/amdgpu/sdma_v6_0: Demote a bunch of half-completed function headers Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or member 'job' not described in 'sdma_v6_0_ring_emit_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or member 'flags' not described in 'sdma_v6_0_ring_emit_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:945: warning: Function parameter or member 'timeout' not described in 'sdma_v6_0_ring_test_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1124: warning: Function parameter or member 'ring' not described in 'sdma_v6_0_ring_pad_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1175: warning: Function parameter or member 'vmid' not described in 'sdma_v6_0_ring_emit_vm_flush' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1175: warning: Function parameter or member 'pd_addr' not described in 'sdma_v6_0_ring_emit_vm_flush' Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
André Almeida	d68ccdb263	drm/amdgpu: Allocate coredump memory in a nonblocking way During a GPU reset, a normal memory reclaim could block to reclaim memory. Giving that coredump is a best effort mechanism, it shouldn't disturb the reset path. Change its memory allocation flag to a nonblocking one. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Hawking Zhang	a8cde40201	drm/amdgpu: Support query ecc cap for aqua_vanjaram Driver queries umc_info v4_0 to identify ecc cap for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:13 -04:00
Lijo Lazar	c4b9dc5313	drm/amdgpu: Add SMU v13.0.6 default reset methods For APUs with SMU v13.0.6, mode-2 reset is kept as default and for others mode-1 is the default reset method. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:31:37 -04:00
Lee Jones	04cef5f583	drm/amd/amdgpu/amdgpu_doorbell_mgr: Correct misdocumented param 'doorbell_index' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c:123: warning: Function parameter or member 'doorbell_index' not described in 'amdgpu_doorbell_index_on_bar' drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c:123: warning: Excess function parameter 'db_index' description in 'amdgpu_doorbell_index_on_bar' Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:27:23 -04:00
Lee Jones	a728342ae4	drm/amd/amdgpu/imu_v11_0: Increase buffer size to ensure all possible values can be stored Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/imu_v11_0.c: In function ‘imu_v11_0_init_microcode’: drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:52:54: warning: ‘_imu.bin’ directive output may be truncated writing 8 bytes into a region of size between 4 and 33 [-Wformat-truncation=] drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:52:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 40 Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:27:18 -04:00
Lee Jones	ac84d99a11	drm/amd/amdgpu/amdgpu_sdma: Increase buffer size to account for all possible values Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c: In function ‘amdgpu_sdma_init_microcode’: drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:217:64: warning: ‘.bin’ directive output may be truncated writing 4 bytes into a region of size between 0 and 32 [-Wformat-truncation=] drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:217:17: note: ‘snprintf’ output between 13 and 52 bytes into a destination of size 40 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:215:66: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=] drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:215:17: note: ‘snprintf’ output between 12 and 41 bytes into a destination of size 40 Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:27:06 -04:00
Lee Jones	3dd8a754a5	drm/amd/amdgpu/amdgpu_ras: Increase buffer size to account for all possible values Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_sysfs_create’: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1406:20: warning: ‘_err_count’ directive output may be truncated writing 10 bytes into a region of size between 1 and 32 [-Wformat-truncation=] drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1405:9: note: ‘snprintf’ output between 11 and 42 bytes into a destination of size 32 Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:27:03 -04:00
Lee Jones	8057a9d656	drm/amd/amdgpu/amdgpu_device: Provide suitable description for param 'xcc_id' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:516: warning: Function parameter or member 'xcc_id' not described in 'amdgpu_mm_wreg_mmio_rlc' Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:58 -04:00
Christophe JAILLET	415b7ba36a	drm/amdgpu: Use kvzalloc() to simplify code kvzalloc() can be used instead of kvmalloc() + memset() + explicit NULL assignments. It is less verbose and more future proof. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:47 -04:00
Christophe JAILLET	5f5c75bf16	drm/amdgpu: Remove amdgpu_bo_list_array_entry() Now that there is an explicit flexible array at the end of 'struct amdgpu_bo_list', it can be used to remove amdgpu_bo_list_array_entry() and simplify some macro. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:45 -04:00
Christophe JAILLET	a23abe1fbd	drm/amdgpu: Remove a redundant sanity check The case where 'num_entries' is too big, is already handled by struct_size(), because kvmalloc() would fail. It will return -ENOMEM instead of -EINVAL, but it is only related to a unlikely to happen sanity check. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:38 -04:00
Christophe JAILLET	ff49bd2c74	drm/amdgpu: Explicitly add a flexible array at the end of 'struct amdgpu_bo_list' 'struct amdgpu_bo_list' is really used as if it was ended by a flex array. So make it more explicit and add a 'struct amdgpu_bo_list_entry entries[]' field at the end of the structure. This way, struct_size() can be used when it is allocated. It is less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:32 -04:00
Hawking Zhang	ec70578c83	drm/amdgpu: Allow issue disable gfx ras cmd to firmware Disable gfx ras command is needed in some use cases Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:25:54 -04:00
Lijo Lazar	e370f8f389	drm/amdgpu: Add bootloader wait for PSP v13 Implement the wait for bootloader call back for PSP v13.0 ASICs. Only for ASICs with PSP v13.0.6, it needs an additional check for VBIOS mailbox status. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:25:44 -04:00
Hamza Mahfooz	1c6b6bd078	drm/amdgpu: register a dirty framebuffer callback for fbcon fbcon requires that we implement &drm_framebuffer_funcs.dirty. Otherwise, the framebuffer might take a while to flush (which would manifest as noticeable lag). However, we can't enable this callback for non-fbcon cases since it may cause too many atomic commits to be made at once. So, implement amdgpu_dirtyfb() and only enable it for fbcon framebuffers (we can use the "struct drm_file file" parameter in the callback to check for this since it is only NULL when called by fbcon, at least in the mainline kernel) on devices that support atomic KMS. Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: stable@vger.kernel.org # 6.1+ Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2519 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:25:34 -04:00
Mangesh Gadre	7caebc8f99	drm/amdgpu: Updated TCP/UTCL1 programming Update TCP/UTCL1 thrashing control settings v2: updated rev_id check Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:20:14 -04:00
Srinivasan Shanmugam	f54e1d47e0	drm/amdgpu: Fix kcalloc over kzalloc in 'gmc_v9_0_init_mem_ranges' Replace kzalloc(n * sizeof(...), ...) with kcalloc(n, sizeof(...), ...) since kcalloc is the preferred API in case of allocating with multiply. Fixes the below: WARNING: Prefer kcalloc over kzalloc with multiply Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:20:02 -04:00
Philip Yang	5d44a766f7	drm/amdkfd: Share the original BO for GTT mapping If mGPUs is on same IOMMU group, or is ram direct mapped, then mGPUs can share the original BO for GTT mapping dma address, without creating new BO from export/import dmabuf. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:19:07 -04:00
Hawking Zhang	2c0f880abc	drm/amdgpu: Fix the return for gpu mode1_reset amdgpu_device_mode1_reset will return gpu mode1_reset succeed (ret = 0) as long as wait_for_bootloader call succeed, regardless of the status reported by smu or psp firmware. This results to driver continue executing recovery even smu or psp fail to perform mode1 reset. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:44 -04:00
benl	96271dd4d5	drm/amdgpu: add gfxhub 11.5.0 support Add initial gfxhub 11.5 support. Signed-off-by: benl <ben.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:15 -04:00
Prike Liang	b90975fa5b	drm/amdgpu: enable gmc11 for GC 11.5.0 Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:13 -04:00
Lang Yu	aba2be4147	drm/amdgpu: add mmhub 3.3.0 support Add initial implementation for mmhub 3.3.0. v2: squash in client id fix (Alex) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:09 -04:00
Prike Liang	b5549a2df0	drm/amdgpu/discovery: enable gfx11 for GC 11.5.0 Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:03 -04:00
Lang Yu	d3ff0189c1	drm/amdgpu/discovery: enable mes block for gc 11.5.0 Add to IP discovery table. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:59 -04:00
Aaron Liu	10c9d86918	drm/amdgpu: add mes firmware support for gc_11_5_0 Add scheduler and kiq firmware support for gc_11_5_0. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:57 -04:00
Aaron Liu	d717da1775	drm/amdgpu: add imu firmware support for gc_11_5_0 Add imu firmware support for gc_11_5_0. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:54 -04:00
Aaron Liu	8e42b463df	drm/amdgpu: add golden setting for gc_11_5_0 Initialize golden setting for gc_11_5_0. v2: squash in latest golden updates (Alex) v3: squash in checkpatch fix (Alex) Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:47 -04:00
Prike Liang	15e7cbd91d	drm/amdgpu/gfx11: initialize gfx11.5.0 Initalize gfx 11.5.0 and set gfx hw configuration. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:44 -04:00
Prike Liang	dd5a326155	drm/amdgpu/gmc11: initialize GMC for GC 11.5.0 memory support Initialize vram attribute and VMHUB for GC 11.5.0. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:39 -04:00
Prike Liang	d9d6833442	drm/amdgpu/discovery: add nbio 7.11.0 support Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:30 -04:00
benl	e44d856eaa	drm/amdgpu: add nbio 7.11 support Add initial nbio 7.11 implementation. Signed-off-by: benl <ben.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:27 -04:00
Prike Liang	bb7249ee45	drm/amdgpu/discovery: enable soc21 support Add 11.5.0 to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:22 -04:00
Prike Liang	0d1db799e7	drm/amdgpu/soc21: add initial GC 11.5.0 soc21 support Disable clock gating and power gating on the early bring up phase. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:18 -04:00
Prike Liang	2c8a7ca164	drm/amdgpu: add new AMDGPU_FAMILY definition add GC 11.5.0 family Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:15 -04:00
Lang Yu	f56c1941eb	drm/amdgpu: use 6.1.0 register offset for HDP CLK_CNTL Use 6.1.0 register offset and remove unused variable. v2: clean up logic (Alex) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:12 -04:00
Mangesh Gadre	559259362e	drm/amdgpu: Remove SRAM clock gater override by driver rlc firmware does required setting, driver need not do it. Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:59:30 -04:00
Lijo Lazar	15c5c5f575	drm/amdgpu: Add bootloader status check Add a function to wait till bootloader has reached steady state. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:59:24 -04:00
Horace Chen	0bc119fa2e	drm/amdkfd: use correct method to get clock under SRIOV [What] Current SRIOV still using adev->clock.default_XX which gets from atomfirmware. But these fields are abandoned in atomfirmware long ago. Which may cause function to return a 0 value. [How] We don't need to check whether SR-IOV. For SR-IOV one-vf-mode, pm is enabled and VF is able to read dpm clock from pmfw, so we can use dpm clock interface directly. For multi-VF mode, VF pm is disabled, so driver can just react as pm disabled. One-vf-mode is introduced from GFX9 so it shall not have any backward compatibility issue. Signed-off-by: Horace Chen <horace.chen@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:59:21 -04:00
Lijo Lazar	36b0f88988	drm/amdgpu: Unset baco dummy mode on nbio v7.9 BACO dummy mode could be set under reset conditions and that affects framebuffer access. Check If baco dummy mode is set, unset it if so. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:59:16 -04:00
YiPeng Chai	80578f1641	drm/amdgpu: Enable ras for mp0 v13_0_6 sriov Enable ras for mp0 v13_0_6 sriov Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:58:16 -04:00
Samir Dhume	00481158ca	drm/amdgpu/jpeg - skip change of power-gating state for sriov Powergating is handled in the host driver. Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Samir Dhume <samir.dhume@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:57:59 -04:00
Lijo Lazar	f8a499aed2	drm/amdgpu: Keep reset handlers shared Instead of maintaining a list per device, keep the reset handlers common per ASIC family. A pointer to the list of handlers is maintained in reset control. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:57:54 -04:00
Le Ma	e240020ad1	drm/amdgpu: update gc_info v2_1 from discovery Several new fields are exposed in gc_info v2_1 Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:57:32 -04:00
Le Ma	f489a41998	drm/amdgpu: update mall info v2 from discovery Mall info v2 is introduced in ip discovery Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:57:29 -04:00
Candice Li	46963ed585	drm/amdgpu: Only support RAS EEPROM on dGPU platform RAS EEPROM device is only supported on dGPU platform for smu v13_0_6. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:57:26 -04:00
Chen Jiahao	d903af1a91	drm/amd/amdgpu: Use kmemdup to simplify kmalloc and memcpy logic Using kmemdup() helper function rather than implementing it again with kmalloc() + memcpy(), which improves the code readability. Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:56:47 -04:00
Ilpo Järvinen	ce7d88110b	drm/amdgpu: Use RMW accessors for changing LNKCTL Don't assume that only the driver would be accessing LNKCTL. ASPM policy changes can trigger write to LNKCTL outside of driver's control. And in the case of upstream bridge, the driver does not even own the device it's changing the registers for. Use RMW capability accessors which do proper locking to avoid losing concurrent updates to the register value. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: `a2e73f56fa` ("drm/amdgpu: Add support for CIK parts") Fixes: `62a3755341` ("drm/amdgpu: add si implementation v10") Link: https://lore.kernel.org/r/20230717120503.15276-6-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-21 14:11:35 -05:00
Lijo Lazar	e20ff05170	drm/amdgpu: Add memory vendor information For ASICs with GC v9.4.3, determine the vendor information from scratch register. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:38:11 -04:00
Mario Limonciello	0dee726395	drm/amd: flush any delayed gfxoff on suspend entry DCN 3.1.4 is reported to hang on s2idle entry if graphics activity is happening during entry. This is because GFXOFF was scheduled as delayed but RLC gets disabled in s2idle entry sequence which will hang GFX IP if not already in GFXOFF. To help this problem, flush any delayed work for GFXOFF early in s2idle entry sequence to ensure that it's off when RLC is changed. commit `4b31b92b14` ("drm/amdgpu: complete gfxoff allow signal during suspend without delay") modified power gating flow so that if called in s0ix that it ensured that GFXOFF wasn't put in work queue but instead processed immediately. This is dead code due to commit `10cb67eb8a` ("drm/amdgpu: skip CG/PG for gfx during S0ix") because GFXOFF will now not be explicitly called as part of the suspend entry code. Remove that dead code. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:35:14 -04:00
Tim Huang	603b9a575d	drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix GFX v11.0.1 reported fence fallback timer expired issue on SDMA and GFX rings after S0ix resume. This is generated by EOP interrupts are disabled when S0ix suspend but fails to re-enable when resume because of the GFX is in GFXOFF. [ 203.349571] [drm] Fence fallback timer expired on ring sdma0 [ 203.349572] [drm] Fence fallback timer expired on ring gfx_0.0.0 [ 203.861635] [drm] Fence fallback timer expired on ring gfx_0.0.0 For S0ix, GFX is in GFXOFF state, avoid to touch the GFX registers to configure the fence driver interrupts for rings that belong to GFX. The interrupts configuration will be restored by GFXOFF exit. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:34:57 -04:00
Lijo Lazar	b5cdadedaa	drm/amdgpu: Remove gfxoff check in GFX v9.4.3 GFXOFF feature is not there for GFX 9.4.3 ASICs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:34:50 -04:00
James Zhu	400a39f1ec	drm/amdgpu: skip xcp drm device allocation when out of drm resource Return 0 when drm device alloc failed with -ENOSPC in order to allow amdgpu drive loading. But the xcp without drm device node assigned won't be visiable in user space. This helps amdgpu driver loading on system which has more than 64 nodes, the current limitation. The proposal to add more drm nodes is discussed in public, which will support up to 2^20 nodes totally. kernel drm: https://lore.kernel.org/lkml/20230724211428.3831636-1-michal.winiarski@intel.com/T/ libdrm: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305 Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:34:11 -04:00
Samir Dhume	dd12b858c2	drm/amdgpu/vcn: Skip vcn power-gating change for sriov CG/PG is handled on the host side. Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:33:59 -04:00
Samir Dhume	d34fecc6e9	drm/amdgpu/jpeg: sriov support for jpeg_v4_0_3 initialization table handshake with mmsch Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:33:59 -04:00
Srinivasan Shanmugam	b828e1004c	drm/amdgpu: Replace ternary operator with min() in 'amdgpu_iomem_write' Fixes the following coccicheck: drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:2482:16-17: WARNING opportunity for min() min() macro is defined in include/linux/minmax.h. It avoids multiple evaluations of the arguments when non-constant and performs strict type-checking. Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:30 -04:00
Mario Limonciello	9366c2e87d	drm/amd: Rename AMDGPU_PP_SENSOR_GPU_POWER Use the clearer name `AMDGPU_PP_SENSOR_GPU_AVG_POWER` instead. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:30 -04:00
GUO Zihua	e8b2ad875f	drm/amdgpu: Remove duplicated includes Remove duplicated includes in amdgpu_amdkfd_gpuvm.c and amdgpu_ttm.c. Resolves checkincludes message. Signed-off-by: GUO Zihua <guozihua@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:29 -04:00
Alex Deucher	4d6fc55ab1	drm/amdgpu: expand runpm parameter Allow the user to specify -2 as auto enabled with displays. By default we don't enter runtime suspend when there are displays attached because it does not work well in some desktop environments due to the driver sending hotplug events on resume in case any new displays were attached while the GPU was powered down. Some users still want this functionality though, so this lets you enable it. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2428 Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:29 -04:00
Aurabindo Pillai	e94e787e37	drm/amd: Remove freesync video mode amdgpu parameter [Why&How] Freesync Video mode was enabled by default. Hence no need for the module parameter, so remove it completely Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:29 -04:00
Samir Dhume	d117fd2964	drm/amdgpu/vcn: sriov support for vcn_v4_0_3 initialization table handshake with mmsch Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:28 -04:00
Srinivasan Shanmugam	44fd83e920	drm/amdgpu: Replace ternary operator with min() in 'amdgpu_iomem_read' Fixes the following coccicheck: drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:2427:16-17: WARNING opportunity for min() min() macro is defined in include/linux/minmax.h. It avoids multiple evaluations of the arguments when non-constant and performs strict type-checking. Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:28 -04:00
Samir Dhume	945355c96e	drm/amdgpu/vcn: change end doorbell index for vcn_v4_0_3 For sriov, doorbell index for vcn0 for AID needs to be on 32 byte boundary so we need to move the vcn end doorbell Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:28 -04:00
Eric Huang	3831989d62	drm/amdkfd: workaround address watch clearing bug for gfx v9.4.2 KFD currently relies on MEC FW to clear tcp watch control register on UNMAP_PROCESS, but FW doesn't work on it, which is a bug. So the solution is to clear the register as gfx v9 in KFD. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:27 -04:00
Samir Dhume	dba24294ff	drm/amdgpu/jpeg: mmsch_v4_0_3 requires doorbell on 32 byte boundary BASE: VCN0 unified (32 byte boundary) BASE+4: MJPEG0 BASE+5: MJPEG1 BASE+6: MJPEG2 BASE+7: MJPEG3 BASE+12: MJPEG4 BASE+13: MJPEG5 BASE+14: MJPEG6 BASE+15: MJPEG7 Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:27 -04:00
Samir Dhume	a31c114bcf	drm/amdgpu/vcn: mmsch_v4_0_3 requires doorbell on 32 byte boundary Align on 32 byte boundary. Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:27 -04:00
Samir Dhume	8d72444288	drm/amdgpu/vcn: Add MMSCH v4_0_3 support for sriov The structures are the same as v4_0 except for the init header Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:27 -04:00
Candice Li	b81fde0dfe	drm/amdgpu: Add I2C EEPROM support on smu v13_0_6 Support I2C EEPROM on smu v13_0_6. v2: Move IP_VERSION(13, 0, 6) ahead of IP_VERSION(13, 0, 10). Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:27 -04:00
Xiongfeng Wang	bdacd16afa	drm/amd: Use pci_dev_id() to simplify the code PCI core API pci_dev_id() can be used to get the BDF number for a pci device. We don't need to compose it mannually. Use pci_dev_id() to simplify the code a little bit. Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:07:42 -04:00
Srinivasan Shanmugam	d0d6928058	drm/amdgpu: Fix identifier names to function definition arguments in atom.h Fixes the following: WARNING: function definition argument 'struct card_info ' should also have an identifier name WARNING: function definition argument 'uint32_t' should also have an identifier name WARNING: function definition argument 'void ' should also have an identifier name WARNING: function definition argument 'struct atom_context ' should also have an identifier name WARNING: function definition argument 'int' should also have an identifier name WARNING: function definition argument 'uint32_t ' should also have an identifier name WARNING: Unnecessary space before function pointer name ERROR: space prohibited after that '*' (ctx:BxW) CHECK: Prefer kernel type 'u32' over 'uint32_t' Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:07:42 -04:00
YiPeng Chai	1b98a5f8e0	drm/amdgpu: mode1 reset needs to recover mp1 for mp0 v13_0_10 Mode1 reset needs to recover mp1 in fatal error case for mp0 v13_0_10. v2: Define a macro to wrap psp function calls. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:07:41 -04:00
Hawking Zhang	8b3a7a707c	drm/amdgpu: Remove unnecessary ras cap check RAS global isr will only be invoked by hardware interrupt. Don't need to query ras capability in isr In addition, amdgpu_ras_interrupt_fatal_error_handler ensures the isr won't be called from guest linux side by accident. The RAS cap check in isr that introduced to fix sriov crash is not needed any more Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:07:41 -04:00
Jiadong Zhu	1e9e15dcf4	drm/amdgpu: disable mcbp if parameter zero is set The parameter amdgpu_mcbp shall have priority against the default value calculated from the chip version. User could disable mcbp by setting the parameter mcbp as zero. v2: do not trigger preemption in sw ring muxer when mcbp is disabled. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 17:44:02 -04:00
Srinivasan Shanmugam	4c452b5c7d	drm/amdgpu: Fix missing comment for mb() in 'amdgpu_device_aper_access' This patch adds the missing code comment for memory barrier WARNING: memory barrier without comment + mb(); WARNING: memory barrier without comment + mb(); Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 17:40:44 -04:00
Alex Deucher	6be2ad4f00	drm/amdgpu: don't allow userspace to create a doorbell BO We need the domains in amdgpu_drm.h for the kernel driver to manage the pool, but we don't want userspace using it until the code is ready. So reject for now. Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-11 14:48:07 -04:00
Alex Deucher	c99a2e7ae2	drm/amdkfd: drop IOMMUv2 support Now that we use the dGPU path for all APUs, drop the IOMMUv2 support. v2: drop the now unused queue manager functions for gfx7/8 APUs Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-11 14:47:25 -04:00
Uros Bizjak	9e761bff03	drm/amdgpu: Use local64_try_cmpxchg in amdgpu_perf_read Use local64_try_cmpxchg instead of local64_cmpxchg (ptr, old, new) == old in amdgpu_perf_read. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg implicitly assigns old ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. No functional change intended. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 18:08:51 -04:00
Asad Kamal	59070fd9cc	drm/amdgpu: Add pci usage to nbio v7.9 Add implementation to get pcie usage for nbio v7.9. Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:05 -04:00
Asad Kamal	8d759dc664	drm/amdgpu: Add pcie usage callback to nbio Add a callback in nbio to get pcie usage Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:05 -04:00
Candice Li	bc0f80802d	drm/amdgpu: Extend poison mode check to SDMA/VCN/JPEG Treat SDMA/VCN/JPEG as RAS capable IP blocks in poison mode. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:05 -04:00
Emily Deng	f734b2133c	drm/amdgpu/irq: Move irq resume to the beginning Need to move irq resume to the beginning of reset sriov, or if one interrupt occurs before irq resume, then the irq won't work anymore. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:04 -04:00
Tao Zhou	7692e1ee24	drm/amdgpu: add RAS fatal error handler for NBIO v7.9 Register RAS fatal error interrupt and add handler. v2: only register NBIO RAS for dGPU platform. change nbio_v7_9_set_ras_controller_irq_state and nbio_v7_9_set_ras_err_event_athub_irq_state to dummy functions. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:04 -04:00
Srinivasan Shanmugam	657db07b32	drm/amdgpu: Fix identation issues in 'kgd_gfx_v9_program_trap_handler_settings' Fixes the following: ERROR: code indent should use tabs where possible WARNING: please, no spaces at the start of a line Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:04 -04:00
Alex Deucher	81af32520e	drm/amdgpu/gfx11: only enable CP GFX shadowing on SR-IOV This is only required for SR-IOV world switches, but it adds additional latency leading to reduced performance in some benchmarks. Disable for now on bare metal. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:04 -04:00
Lijo Lazar	7957ec80ef	drm/amdgpu: Add FRU sysfs nodes only if needed Create sysfs nodes for FRU data only if FRU data is available. Move the logic to FRU specific file. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:45:55 -04:00
Ran Sun	20c7435447	drm/amdgpu: Clean up errors in vcn_v3_0.c Fix the following errors reported by checkpatch: ERROR: space required before the open brace '{' ERROR: "foo * bar" should be "foo *bar" ERROR: space required before the open parenthesis '(' ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:44:00 -04:00
Ran Sun	7bb8c4f6a4	drm/amdgpu: Clean up errors in tonga_ih.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:58 -04:00
Ran Sun	7b57c54c96	drm/amdgpu: Clean up errors in gfx_v7_0.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line ERROR: trailing statements should be on next line ERROR: open brace '{' following struct go on the same line ERROR: space prohibited before that '++' (ctx:WxB) Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:56 -04:00
Ran Sun	2b2b5858f5	drm/amdgpu: Clean up errors in vcn_v4_0.c Fix the following errors reported by checkpatch: spaces required around that '==' (ctx:VxV) ERROR: space required before the open parenthesis '(' ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:54 -04:00
Ran Sun	c8a1439699	drm/amdgpu: Clean up errors in uvd_v3_1.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:52 -04:00
Ran Sun	599f7c8b85	drm/amdgpu: Clean up errors in mxgpu_vi.c Fix the following errors reported by checkpatch: ERROR: spaces required around that '-=' (ctx:WxV) Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:51 -04:00
Ran Sun	939a392f07	drm/amdgpu: Clean up errors in nv.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:49 -04:00
Ran Sun	baa5ede875	drm/amdgpu: Clean up errors in amdgpu_virt.c Fix the following errors reported by checkpatch: ERROR: space required before the open parenthesis '(' Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:47 -04:00
Ran Sun	1b01c010d7	drm/amdgpu: Clean up errors in amdgpu_ring.h Fix the following errors reported by checkpatch: ERROR: spaces required around that ':' (ctx:VxW) Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:45 -04:00
Ran Sun	7b7fbabbff	drm/amdgpu: Clean up errors in amdgpu_trace.h Fix the following errors reported by checkpatch: ERROR: space required after that ',' (ctx:VxV) ERROR: "foo* bar" should be "foo *bar" Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:43 -04:00
Ran Sun	91aafa3c4e	drm/amdgpu: Clean up errors in mes_v11_0.c Fix the following errors reported by checkpatch: ERROR: else should follow close brace '}' Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:42 -04:00
Ran Sun	98268d4033	drm/amdgpu: Clean up errors in amdgpu_atombios.h Fix the following errors reported by checkpatch: ERROR: open brace '{' following struct go on the same line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:40 -04:00
Ran Sun	06d82d87b4	drm/amdgpu: Clean up errors in soc21.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:38 -04:00
Ran Sun	18ef754488	drm/amdgpu: Clean up errors in dce_v8_0.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line ERROR: code indent should use tabs where possible ERROR: space required before the open brace '{' Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:36 -04:00
Ran Sun	665ba81b4a	drm/amdgpu/jpeg: Clean up errors in vcn_v1_0.c Fix the following errors reported by checkpatch: ERROR: space required before the open parenthesis '(' ERROR: space prohibited after that '~' (ctx:WxW) Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:34 -04:00
Ran Sun	e2515e2b90	drm/amdgpu: Clean up errors in mxgpu_nv.c Fix the following errors reported by checkpatch: ERROR: else should follow close brace '}' ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:32 -04:00
Ran Sun	2b77f199a5	drm/amdgpu: Clean up errors in dce_v10_0.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:30 -04:00
Ran Sun	7c29b40236	drm/jpeg: Clean up errors in jpeg_v2_0.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:28 -04:00
Ran Sun	a788b54f3d	drm/amdgpu: Clean up errors in uvd_v7_0.c Fix the following errors reported by checkpatch: ERROR: spaces required around that ':' (ctx:VxE) that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:26 -04:00
Ran Sun	7163dadea2	drm/amdgpu/atomfirmware: Clean up errors in amdgpu_atomfirmware.c Fix the following errors reported by checkpatch: ERROR: spaces required around that '>=' (ctx:WxV) ERROR: spaces required around that '!=' (ctx:WxV) ERROR: code indent should use tabs where possible Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:23 -04:00
Ran Sun	f291f9b9db	drm/amdgpu: Clean up errors in mmhub_v9_4.c Fix the following errors reported by checkpatch: ERROR: code indent should use tabs where possible ERROR: space required before the open parenthesis '(' Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:22 -04:00
Ran Sun	1f45f1c592	drm/amdgpu: Clean up errors in vega20_ih.c Fix the following errors reported by checkpatch: ERROR: trailing statements should be on next line ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:20 -04:00
Ran Sun	46eb29b867	drm/amdgpu: Clean up errors in ih_v6_0.c Fix the following errors reported by checkpatch: ERROR: trailing statements should be on next line ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:18 -04:00
Ran Sun	08110c26ce	drm/amdgpu: Clean up errors in amdgpu_psp.h Fix the following errors reported by checkpatch: ERROR: open brace '{' following struct go on the same line ERROR: open brace '{' following enum go on the same line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:17 -04:00
Ran Sun	042a70e43a	drm/amdgpu: Clean up errors in vce_v3_0.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:15 -04:00
Ran Sun	9c7f00f7d1	drm/amdgpu: Clean up errors in cik_ih.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:13 -04:00
Ruan Jinjie	3b780089fd	drm/amdgpu: Remove a lot of unnecessary ternary operators There are many ternary operators, the true or false judgement of which is unnecessary in C language semantics. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:39:56 -04:00
Alex Deucher	73b0648179	drm/amdgpu: fix possible UAF in amdgpu_cs_pass1() Since the gang_size check is outside of chunk parsing loop, we need to reset i before we free the chunk data. Suggested by Ye Zhang (@VAR10CK) of Baidu Security. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:39:40 -04:00
Lijo Lazar	7748ce5b69	drm/amdgpu: Report vbios version instead of PN Report VBIOS version in vbios_version sysfs node instead of part number. Part number remains constant for a SKU type. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:09 -04:00
Shashank Sharma	664c3b03f9	drm/amdgpu: cleanup MES process level doorbells MES allocates process level doorbells, but there is no userspace client to consume it. It was only being used for the MES ring tests (in kernel), and was written by kernel doorbell write. The previous patch of this series has changed the MES ring test code to use kernel level MES doorbells. This patch now cleans up the process level doorbell allocation code which is not required. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:07 -04:00
Shashank Sharma	e3cbb1f404	drm/amdgpu: use doorbell mgr for MES kernel doorbells This patch: - Removes the existing doorbell management code, and its variables from the doorbell_init function, it will be done in doorbell manager now. - uses the doorbell page created for MES kernel level needs (doorbells for MES self tests) - current MES code was allocating MES doorbells in MES process context, but those were getting written using kernel doorbell calls. This patch instead allocates a MES kernel doorbell for this (in add_hw_queue). V2: Create an extra page of doorbells for MES during kernel doorbell creation (Alex) V4: Move MES doorbell size and page offset objects in this patch from patch 6. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:07 -04:00
Ori Messinger	557d466b15	drm/amdgpu: Report Missing MES Firmware Versions with Sysfs Added missing MES firmware versions to the 'fw_version' sysfs directory, they should now exist as a files named "mes_fw_version" and "mes_kiq_fw_version" found at: /sys/class/drm/cardX/device/fw_version/mes_fw_version /sys/class/drm/cardX/device/fw_version/mes_kiq_fw_version Where X is the card number, and the version is displayed in hexadecimal. Signed-off-by: Ori Messinger <ori.messinger@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:07 -04:00
Shashank Sharma	d124aa0ac9	drm/amdgpu: get absolute offset from doorbell index This patch adds a helper function which converts a doorbell's relative index in a BO to an absolute doorbell offset in the doorbell BAR. V2: No space between the variable name doc (Luben) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:07 -04:00
Shashank Sharma	54c30d2a8d	drm/amdgpu: create kernel doorbell pages This patch: - creates a doorbell page for graphics driver usages. - adds a few new varlables in adev->doorbell structure to keep track of kernel's doorbell-bo. - removes the adev->doorbell.ptr variable, replaces it with kernel-doorbell-bo's cpu address. V2: - Create doorbell BO directly, no wrappe functions (Alex) - no additional doorbell structure (Alex, Christian) - Use doorbell_cpu_ptr, remove ioremap (Christian, Alex) - Allocate one extra page of doorbells for MES (Alex) V4: Move MES doorbell base init into MES related patch (Christian) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Lijo Lazar	36f3f375ed	drm/amdgpu: Use nbio callback for nv and soc21 Make the new ascis to follow nbio callback method to get pcie replay count. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Lijo Lazar	50709d18f4	drm/amdgpu: Add pci replay count to nbio v7.9 Add implementation to get pcie replay count for nbio v7.9. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Shashank Sharma	792b84fb90	drm/amdgpu: initialize ttm for doorbells This patch initialzes the ttm resource manager for doorbells. V2: Do not round up doorbell size (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Alex Deucher	dc3499c71d	drm/amdgpu: accommodate DOMAIN/PL_DOORBELL This patch adds changes: - to accommodate the new GEM domain for DOORBELLs - to accommodate the new TTM PL for DOORBELLs in order to manage doorbell pages as GEM object. V2: Addressed reviwe comments from Christian - drop the doorbell changes for pinning/unpinning - drop the doorbell changes for dma-buf map - drop the doorbell changes for sgt - no need to handle TTM_PL_FLAG_CONTIGUOUS for doorbell - add caching type for doorbell V3: - Removed unrelated empty line (Christian) - Add PL_DOORBELL in mem_type_to_domain() as well (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>	2023-08-07 17:14:06 -04:00
Shashank Sharma	794c33c66f	drm/amdgpu: don't modify num_doorbells for mes This patch removes the check and change in num_kernel_doorbells for MES, which is not being used anywhere by MES code. V2: Fixed checkpatch warnings. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Lijo Lazar	900af4e488	drm/amdgpu: Add pcie replay count callback to nbio Add a callback in nbio to get pcie replay count. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Srinivasan Shanmugam	07867a78f8	drm/amdgpu: Prefer pr_err/_warn/_notice over printk in amdgpu_atpx_handler.c Fixes the following style issues: ERROR: open brace '{' following function definitions go on the next line WARNING: printk() should include KERN_<LEVEL> facility level Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Bert Karwatzki <spasswolf@web.de> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:13:10 -04:00
Srinivasan Shanmugam	a494a7ce54	Revert "drm/amdgpu: Prefer dev_* variant over printk in amdgpu_atpx_handler.c" Usage of container_of is wrong here. struct acpi_device *adev = container_of(handle, struct acpi_device, handle) This reverts commit `b0bd0a92b8`. References: https://gitlab.freedesktop.org/drm/amd/-/issues/2744 Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Bert Karwatzki <spasswolf@web.de> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:13:09 -04:00
Zhigang Luo	e24b2fdaec	drm/amdgpu: init TA microcode for SRIOV VF when MP0 IP is 13.0.6 Init TA ucode for SRIOV. Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:13:09 -04:00
Zhigang Luo	66353ec433	drm/amdgpu: remove SRIOV VF FB location programming For SRIOV VF, FB location is programmed by host driver, no need to program it in guest driver. v2: squash in unused variable removal Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:13:09 -04:00
Prike Liang	f05f4fe6ab	drm/amdgpu: enable SDMA MGCG for SDMA 5.2.x Now the SDMA firmware can support SDMA MGCG properly, so let's enable it from the driver side. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Hawking Zhang	6fc9d92c3d	drm/amdgpu: Issue ras enable_feature for gfx ip only For non-GFX IP blocks, set up ras obj if ras feature is allowed. For GFX IP blocks, force issue ras enable_feature command to firmware and only set up ras obj if ras feature is allowed Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Hawking Zhang	a5c75947b4	drm/amdgpu: Remove gfx v11_0_3 ras_late_init call amdgpu_ras_late_init will invoke ras_late_init call per IP block Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Srinivasan Shanmugam	21539a6d41	drm/amdgpu: Clean up style problems in mmhub_v2_3.c Fixes the following: ERROR: code indent should use tabs where possible WARNING: Missing a blank line after declarations WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: suspect code indent for conditional statements (8, 24) + if (!(data & (DAGB0_CNTL_MISC2__DISABLE_WRREQ_CG_MASK \| [...] + *flags \|= AMD_CG_SUPPORT_MC_MGCG; Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Srinivasan Shanmugam	4e2abc197f	drm/amdgpu: Move vram, gtt & flash defines to amdgpu_ ttm & _psp.h As amdgpu.h is getting decomposed, move vram and gtt extern defines into amdgpu_ttm.h & flash extern to amdgpu_psp.h Fixes: `f9acfafc34` ("drm/amdgpu: Move externs to amdgpu.h file from amdgpu_drv.c") Suggested-by: Christian König <christian.koenig@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Hawking Zhang	62c4b772bd	drm/amdgpu: Apply poison mode check to GFX IP only For GFX IP that only supports poison consumption, GFX RAS won't be marked as enabled. i.e., hardware doesn't support gfx sram ecc. But driver still needs to issue firmware to enable poison consumption mode for GFX IP. In such case, check poison mode and treat GFX IP as RAS capable IP block. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Hawking Zhang	f957138cc3	drm/amdgpu: Only create err_count sysfs when hw_op is supported Some IP blocks only support partial ras feature and don't have ras counter and/or ras error status register at all. Driver should not create err_count sysfs node for those IP blocks. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Srinivasan Shanmugam	e2e42edfe8	drm/amdgpu: Sort the includes in amdgpu/amdgpu_drv.c Sort the include files that are included in amdgpu_drv.c alphabetically. Suggested-by: Mario Limonciello <mario.limonciello@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Srinivasan Shanmugam	5f95f00317	drm/amdgpu: Cleanup amdgpu/amdgpu_cgs.c Fixes the below: ERROR: switch and case should be at the same indent WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Block comments use * on subsequent lines WARNING: Comparisons should place the constant on the right side of the test Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Praful Swarnakar	2d5c04152a	drm/amdgpu: Fix style issues in amdgpu_psp.c Fixes the following to align to linux coding style: WARNING: Block comments use a trailing / on a separate line WARNING: Block comments should align the on each line Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Praful Swarnakar <Praful.Swarnakar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:48 -04:00
Praful Swarnakar	ad19c200b1	drm/amdgpu: Fix style issues in amdgpu_debugfs.c Fixes the following to align to linux coding style: WARNING: Missing a blank line after declarations WARNING: sizeof rd should be sizeof(rd) Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Praful Swarnakar <Praful.Swarnakar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:48 -04:00
Prike Liang	4c340d0034	drm/amdgpu/discovery: add ih 6.1.0 support Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:35:42 -04:00
Ben Li	0ba96fd3c0	drm/amdgpu: add ih 6.1 support Add initial support for IH 6.1. v2: Fix copyright date (Alex) Signed-off-by: Ben Li <ben.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:35:38 -04:00
Prike Liang	eff7a442c1	drm/amdgpu/discovery: add smuio 14.0.0 support Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:35:27 -04:00
Prike Liang	9b9a5e34d4	drm/amdgpu/discovery: add hdp 6.1.0 support Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:35:23 -04:00
Lijo Lazar	161c908d6a	drm/amdgpu: Match against exact bootloader status On PSP v13.x ASICs, boot loader will set only the MSB to 1 and clear the least significant bits for any command submission. Hence match against the exact register value, otherwise a register value of all 0xFFs also could falsely indicate that boot loader is ready. Also, from PSP v13.0.6 and newer, bits[7:0] will be used to indicate command error status. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:34:55 -04:00
Prike Liang	99af9c950d	drm/amdgpu/discovery: enable sdma6 for SDMA 6.1.0 Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:34:04 -04:00
Mario Limonciello	70e64c4d52	drm/amd: Disable S/G for APUs when 64GB or more host memory Users report a white flickering screen on multiple systems that is tied to having 64GB or more memory. When S/G is enabled pages will get pinned to both VRAM carve out and system RAM leading to this. Until it can be fixed properly, disable S/G when 64GB of memory or more is detected. This will force pages to be pinned into VRAM. This should fix white screen flickers but if VRAM pressure is encountered may lead to black screens. It's a trade-off for now. Fixes: `81d0bcf990` ("drm/amdgpu: make display pinning more flexible (v2)") Cc: Hamza Mahfooz <Hamza.Mahfooz@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: <stable@vger.kernel.org> # 6.1.y: `bf0207e172` ("drm/amdgpu: add S/G display parameter") Cc: <stable@vger.kernel.org> # 6.4.y Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2735 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2354 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:32:54 -04:00
Prike Liang	7a22c147f7	drm/amdgpu/sdma6: initialize sdma 6.1.0 Add firmware declaration. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:32:46 -04:00
Daniel Vetter	3d00c59d14	amd-drm-next-6.6-2023-07-28: amdgpu: - Lots of checkpatch cleanups - GFX 9.4.3 updates - Add USB PD and IFWI flashing documentation - GPUVM updates - RAS fixes - DRR fixes - FAMS fixes - Virtual display fixes - Soft IH fixes - SMU13 fixes - Rework PSP firmware loading for other IPs - Kernel doc fixes - DCN 3.0.1 fixes - LTTPR fixes - DP MST fixes - DCN 3.1.6 fixes - SubVP fixes - Display bandwidth calculation fixes - VCN4 secure submission fixes - Allow building DC on RISC-V - Add visible FB info to bo_print_info - HBR3 fixes - Add PSP 14.0 support - GFX9 MCBP fix - GMC10 vmhub index fix - GMC11 vmhub index fix - Create a new doorbell manager - SR-IOV fixes amdkfd: - Cleanup CRIU dma-buf handling - Use KIQ to unmap HIQ - GFX 9.4.3 debugger updates - GFX 9.4.2 debugger fixes - Enable cooperative groups fof gfx11 - SVM fixes radeon: - Lots of checkpatch cleanups -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZMQ0vAAKCRC93/aFa7yZ 2EOOAQCrsNf1IEynXVj0gVYOWFDpBCdaDkw+gXR73nOlwBeZzgD8DAoismXYDY95 pkKlx/HL5O8qyZ25Lc9ZlgsJnTpnpw4= =c/Jk -----END PGP SIGNATURE----- Merge tag 'amd-drm-next-6.6-2023-07-28' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.6-2023-07-28: amdgpu: - Lots of checkpatch cleanups - GFX 9.4.3 updates - Add USB PD and IFWI flashing documentation - GPUVM updates - RAS fixes - DRR fixes - FAMS fixes - Virtual display fixes - Soft IH fixes - SMU13 fixes - Rework PSP firmware loading for other IPs - Kernel doc fixes - DCN 3.0.1 fixes - LTTPR fixes - DP MST fixes - DCN 3.1.6 fixes - SubVP fixes - Display bandwidth calculation fixes - VCN4 secure submission fixes - Allow building DC on RISC-V - Add visible FB info to bo_print_info - HBR3 fixes - Add PSP 14.0 support - GFX9 MCBP fix - GMC10 vmhub index fix - GMC11 vmhub index fix - Create a new doorbell manager - SR-IOV fixes amdkfd: - Cleanup CRIU dma-buf handling - Use KIQ to unmap HIQ - GFX 9.4.3 debugger updates - GFX 9.4.2 debugger fixes - Enable cooperative groups fof gfx11 - SVM fixes radeon: - Lots of checkpatch cleanups Merge conflicts: - drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c The switch to drm eu helpers in `8a206685d3` ("drm/amdgpu: use drm_exec for GEM and CSA handling v2") clashed with the cosmetic cleanups from `30953c4d00` ("drm/amdgpu: Fix style issues in amdgpu_gem.c"). I kept the former since the cleanup up code is gone. - drivers/gpu/drm/amd/amdgpu/atom.c. `adf64e2142` ("drm/amd: Avoid reading the VBIOS part number twice") removed code that `992b8fe106` ("drm/radeon: Replace all non-returning strlcpy with strscpy") polished. From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230728214228.8102-1-alexander.deucher@amd.com [sima: some merge conflict wrangling as noted] Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2023-08-04 11:10:18 +02:00
Lang Yu	6f38bdb86a	drm/amdgpu: correct vmhub index in GMC v10/11 Align with new vmhub definition. v2: use client_id == VMC to decide vmhub(Hawking) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 15:05:31 -04:00
Srinivasan Shanmugam	3dc6d8352e	drm/amdgpu: Fix non-standard format specifiers in 'amdgpu_show_fdinfo' Fixes the following: WARNING: %Lu is non-standard C, use %llu + seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context); WARNING: %Ld is non-standard C, use %lld + seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip], Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 15:05:31 -04:00
Jiadong Zhu	8cbbd11547	drm/amdgpu: set completion status as preempted for the resubmission The driver's CSA buffer is shared by all the ibs. When the high priority ib is submitted after the preempted ib, CP overrides the ib_completion_status as completed in the csa buffer. After that the preempted ib is resubmitted, CP would clear some locals stored for ib resume when reading the completed status, which causes gpu hang in some cases. Always set status as preempted for those resubmitted ib instead of reading everything from the CSA buffer. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2535 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2717 Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 15:04:19 -04:00
Srinivasan Shanmugam	7db36fe942	drm/amdgpu: Use parentheses for sizeof numa_info in 'amdgpu_acpi_get_numa_info' Fixes the below: WARNING: sizeof numa_info should be sizeof(*numa_info) Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:30 -04:00
Srinivasan Shanmugam	6cf20211fc	drm/amdgpu: Fix unnecessary else after return in 'amdgpu_eeprom_xfer' Fixes the following: WARNING: else is not generally useful after a break or return + return -EINVAL; + } else { Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Li Ma	82f33504a4	drm/amdgpu/discovery: enable PSP 14.0.0 support Add it to IP discovery. Signed-off-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Li Ma	14b2760f3c	drm/amdgpu: add PSP 14.0.0 support Uses same driver interface as 13.0. Signed-off-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Jonathan Kim	fc7f1d9697	drm/amdkfd: fix and enable ttmp setup for gfx11 The MES cached process context must be cleared on adding any queue for the first time. For proper debug support, the MES will clear it's cached process context on the first call to SET_SHADER_DEBUGGER. This allows TTMPs to be pesistently enabled in a safe manner. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Eric Huang <jinhuieric@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Srinivasan Shanmugam	f9acfafc34	drm/amdgpu: Move externs to amdgpu.h file from amdgpu_drv.c Fixes the following: WARNING: externs should be avoided in .c files +extern const struct attribute_group amdgpu_vram_mgr_attr_group; WARNING: externs should be avoided in .c files +extern const struct attribute_group amdgpu_gtt_mgr_attr_group; WARNING: externs should be avoided in .c files +extern const struct attribute_group amdgpu_flash_attr_group; And other style fixes: WARNING: Block comments should align the * on each line WARNING: void function return statements are not generally useful WARNING: braces {} are not necessary for single statement blocks Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Srinivasan Shanmugam	b0bd0a92b8	drm/amdgpu: Prefer dev_* variant over printk in amdgpu_atpx_handler.c Changed from printk to dev_* variants so that we get better debug info when there are multiple GPUs in the system. Fixes other style issue: ERROR: open brace '{' following function definitions go on the next line WARNING: printk() should include KERN_<LEVEL> facility level Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Srinivasan Shanmugam	7593164d2f	drm/amdgpu: Fix no new typedefs for enum _AMDGPU_DOORBELL_* Fixes the following: WARNING: do not add new typedefs Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Srinivasan Shanmugam	b8920e1e0d	drm/amdgpu: Fix ENOSYS means 'invalid syscall nr' in amdgpu_device.c ENOSYS should be used for nonexistent syscalls only, replace ENOSYS with EOPNOTSUPP for reset handlers that are not implemented for respective ASIC. WARNING: ENOSYS means 'invalid syscall nr' and nothing else + if (r == -ENOSYS) WARNING: ENOSYS means 'invalid syscall nr' and nothing else + if (r == -ENOSYS) And other following style fixes in amdgpu_device.c: WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. WARNING: Block comments should align the * on each line WARNING: Missing a blank line after declarations WARNING: braces {} are not necessary for single statement blocks Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Kent Russell <kent.russell@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Bob Zhou	8a92e8676c	drm/amdgpu: remove repeat code for mes_add_queue_pkt The setting of mes_add_queue_pkt is repeated, so remove it. Signed-off-by: Bob Zhou <bob.zhou@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:48:13 -04:00
Eric Huang	952ee94593	drm/amdgpu: enable trap of each kfd vmid for gfx v9.4.3 To setup ttmp on as default for gfx v9.4.3 in IP hw init. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:47:52 -04:00
Lijo Lazar	b5ac08806c	drm/amdgpu: Restore HQD persistent state register On GFX v9.4.3, compute queue MQD is populated using the values in HQD persistent state register. Hence don't clear the values on module unload, instead restore it to the default reset value so that MQD is initialized correctly during next module load. In particular, preload flag needs to be set on compute queue MQD, otherwise it could cause uninitialized values being used at device reset state resulting in EDC. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:27 -04:00
YuanShang	30b59910d9	drm/amdgpu: load sdma ucode in the guest machine [why] User mode driver need to check the sdma ucode version to see whether the sdma engine supports a new type of PM4 packet. In SRIOV, sdma is loaded by the host. And, there is no way to check the sdma ucode version of CHIP_NAVI12 and CHIP_SIENNA_CICHLID of the host in the guest machine. [how] Load the sdma ucode for CHIP_NAVI12 and CHIP_SIENNA_CICHLID in the guest machine. Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-By: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Srinivasan Shanmugam	fc8e55f378	drm/amdgpu: Use seq_puts() instead of seq_printf() For a constant format without additional arguments, use seq_puts() instead of seq_printf(). Also, it fixes the following warning. WARNING: Prefer seq_puts to seq_printf And other style fixes: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Block comments should align the * on each line Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Srinivasan Shanmugam	a0cc8e1512	drm/amdgpu: Update min() to min_t() in 'amdgpu_info_ioctl' Fixes the following: WARNING: min() should probably be min_t(size_t, size, sizeof(ip)) + ret = copy_to_user(out, &ip, min((size_t)size, sizeof(ip))); And other style fixes: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Missing a blank line after declarations Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Srinivasan Shanmugam	ce83aa7bad	drm/amdgpu: Remove else after return in 'is_fru_eeprom_supported' Expressions under 'else' branch under case 'CHIP_SIENNA_CICHLID' in function 'is_fru_eeprom_supported' are executed whenever the expression in 'if' is False. Otherwise, return from case occurs. Therefore, there is no need in 'else', and it has been removed. Fixes the following: WARNING: else is not generally useful after a break or return + return false; + } else { Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Srinivasan Shanmugam	50fbe0cc95	drm/amdgpu: Add -ENOMEM error handling when there is no memory Return -ENOMEM, when there is no sufficient dynamically allocated memory Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Stanley.Yang	fcb7a1849a	drm/amdgpu: Check APU flag to disable RAS Only disable RAS by default for aqua vanjaram on APU platform. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Shiwu Zhang	9bc12db4e2	drm/amdgpu: fix the indexing issue during rlcg access ctrl init In case that the GET_INST() is used for looping, only loops for the times of actual num of xcc, otherwise GET_INST() will return the invalid index, a.k.a -1 And also remove the redundant mask checking in case of GET_INST() Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Pierre-Eric Pelloux-Prayer	818c158fd4	drm/amdgpu: add VISIBLE info in amdgpu_bo_print_info This allows tools to distinguish between VRAM and visible VRAM. Use the opportunity to fix locking before accessing bo. v2: squash in unused variable fix Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Srinivasan Shanmugam	30953c4d00	drm/amdgpu: Fix style issues in amdgpu_gem.c Fixes the following to align to linux coding style: WARNING: braces {} are not necessary for any arm of this statement WARNING: Missing a blank line after declarations ERROR: space prohibited before that close parenthesis ')' WARNING: unnecessary whitespace before a quoted newline WARNING: %LX is non-standard C, use %llX Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:41:00 -04:00
Lijo Lazar	6cb209ed68	drm/amdgpu: Update ring scheduler info as needed Not all rings have scheduler associated. Only update scheduler data for rings with scheduler. It could result in out of bound access as total rings are more than those associated with particular IPs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:39:29 -04:00
sguttula	c6195ef5ee	drm/amdgpu: Enabling FW workaround through shared memory for VCN4_0_2 This patch will enable VCN FW workaround using DRM KEY INJECT WORKAROUND method, which is helping in fixing the secure playback. Signed-off-by: sguttula <Suresh.Guttula@amd.com> Reviewed-by: Leo Liu <leo.liiu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:39:21 -04:00
Srinivasan Shanmugam	93125cb704	drm/amd/amdgpu: Fix warnings in amdgpu/amdgpu_display.c Fixes the below checkpatch.pl warnings: WARNING: Block comments use * on subsequent lines WARNING: Block comments use a trailing */ on a separate line WARNING: suspect code indent for conditional statements (8, 12) WARNING: braces {} are not necessary for single statement blocks Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:36:58 -04:00
Srinivasan Shanmugam	37c3fc6620	drm/amdgpu: Return -ENOMEM when there is no memory in 'amdgpu_gfx_mqd_sw_init' Return -ENOMEM, when there is no sufficient dynamically allocated memory to create MQD backup for ring Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:36:16 -04:00
Srinivasan Shanmugam	88dd0b188e	drm/amdgpu: Fix do not add new typedefs in amdgpu_fw_attestation.c Fixes the following to align to coding style: WARNING: do not add new typedefs +typedef struct FW_ATT_DB_HEADER WARNING: do not add new typedefs +typedef struct FW_ATT_RECORD WARNING: Symbolic permissions 'S_IRUSR' are not preferred. Consider using octal permissions '0400'. + S_IRUSR, ERROR: "(foo)" should be "(foo )" WARNING: please, no space before tabs Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:36:08 -04:00
Srinivasan Shanmugam	b25b359926	drm/amdgpu: Prefer #if IS_ENABLED over #if defined in amdgpu_drv.c Adhere to linux coding style Fixes the following: WARNING: Prefer IS_ENABLED(<FOO>) to CONFIG_<FOO> \|\| CONFIG_<FOO>_MODULE +#if defined(CONFIG_DRM_RADEON) \|\| defined(CONFIG_DRM_RADEON_MODULE) WARNING: Prefer IS_ENABLED(<FOO>) to CONFIG_<FOO> \|\| CONFIG_<FOO>_MODULE +#if defined(CONFIG_DRM_RADEON) \|\| defined(CONFIG_DRM_RADEON_MODULE) Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:36:00 -04:00
Jonathan Kim	7a1c5c6753	drm/amdkfd: enable cooperative groups for gfx11 MES can concurrently schedule queues on the device that require exclusive device access if marked exclusively_scheduled without the requirement of GWS. Similar to the F32 HWS, MES will manage quality of service for these queues. Use this for cooperative groups since cooperative groups are device occupancy limited. Since some GFX11 devices can only be debugged with partial CUs, do not allow the debugging of cooperative groups on these devices as the CU occupancy limit will change on attach. In addition, zero initialize the MES add queue submission vector for MES initialization tests as we do not want these to be cooperative dispatches. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:35:43 -04:00
Horace Chen	83f24a8f05	drm/amdgpu: set sw state to gfxoff after SR-IOV reset [Why] Current SR-IOV will not set GC to off state, while it is a real GC hard reset. Whthout GFX off flag, driver may do gfxhub invalidation before firmware load and gfxhub gart enable. This operation may cause CP to become busy because GC is not in the right state for invalidation. [How] Add a function for SR-IOV to clean up some sw state before recover. Set adev->gfx.is_poweron to false to prevent gfxhub invalidation before gfx firmware autoload complete. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: HaiJun Chang <HaiJun.Chang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:35:23 -04:00
Yang Li	f135b0fc31	drm/amdgpu: Fix one kernel-doc comment Use colon to separate parameter name from their specific meaning. silence the warning: drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:793: warning: Function parameter or member 'adev' not described in 'amdgpu_vm_pte_update_noretry_flags' Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:25 -04:00
Mario Limonciello	6b4cf4a35f	drm/amd: Fix an error handling mistake in psp_sw_init() If the second call to amdgpu_bo_create_kernel() fails, the memory allocated from the first call should be cleared. If the third call fails, the memory from the second call should be cleared. Fixes: `b95b539168` ("drm/amdgpu/psp: move PSP memory alloc from hw_init to sw_init") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:25 -04:00
Victor Lu	9196b63bee	drm/amdgpu: Fix infinite loop in gfxhub_v1_2_xcc_gart_enable (v2) An instance of for_each_inst() was not changed to match its new behaviour and is causing a loop. v2: remove tmp_mask variable Fixes: `b579ea632f` ("drm/amdgpu: Modify for_each_inst macro") Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:25 -04:00
Lijo Lazar	0bdebfef3f	drm/amdgpu: Program xcp_ctl registers as needed XCP_CTL register is expected to be programmed by firmware. Under certain conditions FW may not have programmed it correctly. As a workaround, program it when FW has not programmed the right values. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:25 -04:00
sguttula	5dbb59247b	drm/amdgpu: allow secure submission on VCN4 ring This patch will enable secure decode playback on VCN4_0_2 Signed-off-by: sguttula <Suresh.Guttula@amd.com> Reviewed-by: Leo Liu <leo.liiu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:24 -04:00
Mario Limonciello	adf64e2142	drm/amd: Avoid reading the VBIOS part number twice The VBIOS part number is read both in amdgpu_atom_parse() as well as in atom_get_vbios_pn() and stored twice in the `struct atom_context` structure. Remove the first unnecessary read and move the `pr_info` line from that read into the second. v2: squash in unused variable removal Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:24 -04:00
Guchun Chen	18cf073faa	drm/amdgpu: use a macro to define no xcp partition case ~0 as no xcp partition is used in several places, so improve its definition by a macro for code consistency. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:19:02 -04:00
Guchun Chen	e379b5e7dc	drm/amdgpu/vm: use the same xcp_id from root PD Other PDs/PTs allocation should just use the same xcp_id as that stored in root PD. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:18:53 -04:00
Guchun Chen	5003ca63bc	drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_create Recent code set xcp_id stored from file private data when opening device to amdgpu bo for accounting memory usage etc, but not all VMs are attached to this fpriv structure like the vm cases in amdgpu_mes_self_test, otherwise, KASAN will complain below out of bound access. And more importantly, VM code should not touch fpriv structure, so drop fpriv code handling from amdgpu_vm_pt. [ 77.292314] BUG: KASAN: slab-out-of-bounds in amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.293845] Read of size 4 at addr ffff888102c48a48 by task modprobe/1069 [ 77.294146] Call Trace: [ 77.294178] <TASK> [ 77.294208] dump_stack_lvl+0x49/0x63 [ 77.294260] print_report+0x16f/0x4a6 [ 77.294307] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.295979] ? kasan_complete_mode_report_info+0x3c/0x200 [ 77.296057] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.297556] kasan_report+0xb4/0x130 [ 77.297609] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.299202] __asan_load4+0x6f/0x90 [ 77.299272] amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.300796] ? amdgpu_init+0x6e/0x1000 [amdgpu] [ 77.302222] ? amdgpu_vm_pt_clear+0x750/0x750 [amdgpu] [ 77.303721] ? preempt_count_sub+0x18/0xc0 [ 77.303786] amdgpu_vm_init+0x39e/0x870 [amdgpu] [ 77.305186] ? amdgpu_vm_wait_idle+0x90/0x90 [amdgpu] [ 77.306683] ? kasan_set_track+0x25/0x30 [ 77.306737] ? kasan_save_alloc_info+0x1b/0x30 [ 77.306795] ? __kasan_kmalloc+0x87/0xa0 [ 77.306852] amdgpu_mes_self_test+0x169/0x620 [amdgpu] v2: without specifying xcp partition for PD/PT bo, the xcp id is -1. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2686 Fixes: `3ebfd221c1` ("drm/amdkfd: Store xcp partition id to amdgpu bo") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:18:16 -04:00
Guchun Chen	50e633081e	drm/amdgpu: Allocate root PD on correct partition file_priv needs to be setup firstly, otherwise, root PD will always be allocated on partition 0, even if opening the device from other partitions. Fixes: `3ebfd221c1` ("drm/amdkfd: Store xcp partition id to amdgpu bo") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:16:48 -04:00
Victor Lu	8ed49dd1d3	drm/amdgpu: Add RLCG interface driver implementation for gfx v9.4.3 (v3) Add RLCG interface support for gfx v9.4.3 and multiple XCCs. Do not enable it yet. v2: Fix amdgpu_rlcg_reg_access_ctrl init, add support for multiple XCCs in amdgpu_mm_wreg_mmio_rlc v3: Use GET_INST() when indexing amdgpu_rlcg_reg_access_ctrl Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:16:41 -04:00
Shashank Sharma	43c064db65	drm/amdgpu: create a new file for doorbell manager This patch: - creates a new file for doorbell management. - moves doorbell code from amdgpu_device.c to this file. V2: - remove doc from function declaration (Christian) - remove 'device' from function names to make it consistent (Alex) - add SPDX license identifier (Luben) V3: - change license to MIT license(Christian) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:12:08 -04:00
Candice Li	5229a37e17	drm/amdgpu: Allow the initramfs generator to include psp_13_0_6_ta Allow the initramfs generator to automatically include psp_13_0_6_ta firmware to initramfs. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:11:49 -04:00
Stanley.Yang	276f6e8cb7	drm/amdgpu: Disable RAS by default on APU flatform Disable RAS feature by default for aqua vanjaram on APU platform. Changed from V1: Splite Disable RAS by default on APU platform into a separated patch. Changed from V2: Avoid to modify global variable amdgpu_ras_enable. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:11:36 -04:00
Stanley.Yang	cb906ce32b	drm/amdgpu: Enable aqua vanjaram RAS Enable RAS for aqua vanjaram. Changed from V1: Split the change in amdgpu_ras_asic_supported into a separated patch. Changed from V2: Avoid to modify global variable amdgpu_ras_enable. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:11:23 -04:00
Srinivasan Shanmugam	a62e702ee1	drm/amdgpu: Avoid possiblity of kernel crash in 'gmc_v8_0, gmc_v7_0_init_microcode()' If the function 'gmc_v8_0_ or gmc_v7_0_init_microcode()' fails, the driver will just fail to load, hence return -EINVAL rather having BUG(), fixes WARNING: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants Fixes: `2f77b5931f` ("drm/amdgpu: Fix error & warnings in gmc_v8_0.c") Fixes: `0cfc1d6830` ("drm/amdgpu: Fix errors & warnings in gmc_ v6_0, v7_0.c") Suggested-by: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:09:30 -04:00
Saleemkhan Jamadar	33e88286d6	Revert "drm/amdgpu:update kernel vcn ring test" VCN FW depncencies revert it to unblock others This reverts commit `f3fa86f5c7`. Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:06:54 -04:00
Daniel Vetter	6c7f27441d	drm-misc-next for v6.6: UAPI Changes: * fbdev: * Make fbdev userspace interfaces optional; only leaves the framebuffer console active * prime: * Support dma-buf self-import for all drivers automatically: improves support for many userspace compositors Cross-subsystem Changes: * backlight: * Fix interaction with fbdev in several drivers * base: Convert struct platform.remove to return void; part of a larger, tree-wide effort * dma-buf: Acquire reservation lock for mmap() in exporters; part of an on-going effort to simplify locking around dma-bufs * fbdev: * Use Linux device instead of fbdev device in many places * Use deferred-I/O helper macros in various drivers * i2c: Convert struct i2c from .probe_new to .probe; part of a larger, tree-wide effort * video: * Avoid including <linux/screen_info.h> Core Changes: * atomic: * Improve logging * prime: * Remove struct drm_driver.gem_prime_mmap plus driver updates: all drivers now implement this callback with drm_gem_prime_mmap() * gem: * Support execution contexts: provides locking over multiple GEM objects * ttm: * Support init_on_free * Swapout fixes Driver Changes: * accel: * ivpu: MMU updates; Support debugfs * ast: * Improve device-model detection * Cleanups * bridge: * dw-hdmi: Improve support for YUV420 bus format * dw-mipi-dsi: Fix enable/disable of DSI controller * lt9611uxc: Use MODULE_FIRMWARE() * ps8640: Remove broken EDID code * samsung-dsim: Fix command transfer * tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups * Cleanups * ingenic: * Kconfig REGMAP fixes * loongson: * Support display controller * mgag200: * Minor fixes * mxsfb: * Support disabling overlay planes * nouveau: * Improve VRAM detection * Various fixes and cleanups * panel: * panel-edp: Support AUO B116XAB01.4 * Support Visionox R66451 plus DT bindings * Cleanups * ssd130x: * Support per-controller default resolution plus DT bindings * Reduce memory-allocation overhead * Cleanups * tidss: * Support TI AM625 plus DT bindings * Implement new connector model plus driver updates * vkms * Improve write-back support * Documentation fixes -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmSvvRAACgkQaA3BHVML eiNpGQgAs8jq1XjN9t8jZsdgXnoCbkZyVUI2NO0HwoVwpRCLgbXp5AX5qq2oRciE TBhe4Fceh/ZsYqHTZQahnguxgRKM5JgXwbI4Z0iiOVcqasNbycaKAqipxJJ7kdo1 qPhGCbgQFVX7oIq2xjfXehh6O0SYX+R9r88X8dMJxMYv/pcLwOHG74kS040WOcQq uATgcnobOf/D8ZmlqvfKGAeTUoFo/RSR2Uhlauka58qgeUbicrTELZT2barY9d+k as6U5vv4wx2zMklTkjrlkMpAT1ZpbB9d3jGHwL27VEnjlfd3wV2bdH7Dzn9qZRf/ gn0ALg/b3u5yBWk/k7YBvijXyNcH6Q== =bBuG -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2023-07-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.6: UAPI Changes: * fbdev: * Make fbdev userspace interfaces optional; only leaves the framebuffer console active * prime: * Support dma-buf self-import for all drivers automatically: improves support for many userspace compositors Cross-subsystem Changes: * backlight: * Fix interaction with fbdev in several drivers * base: Convert struct platform.remove to return void; part of a larger, tree-wide effort * dma-buf: Acquire reservation lock for mmap() in exporters; part of an on-going effort to simplify locking around dma-bufs * fbdev: * Use Linux device instead of fbdev device in many places * Use deferred-I/O helper macros in various drivers * i2c: Convert struct i2c from .probe_new to .probe; part of a larger, tree-wide effort * video: * Avoid including <linux/screen_info.h> Core Changes: * atomic: * Improve logging * prime: * Remove struct drm_driver.gem_prime_mmap plus driver updates: all drivers now implement this callback with drm_gem_prime_mmap() * gem: * Support execution contexts: provides locking over multiple GEM objects * ttm: * Support init_on_free * Swapout fixes Driver Changes: * accel: * ivpu: MMU updates; Support debugfs * ast: * Improve device-model detection * Cleanups * bridge: * dw-hdmi: Improve support for YUV420 bus format * dw-mipi-dsi: Fix enable/disable of DSI controller * lt9611uxc: Use MODULE_FIRMWARE() * ps8640: Remove broken EDID code * samsung-dsim: Fix command transfer * tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups * Cleanups * ingenic: * Kconfig REGMAP fixes * loongson: * Support display controller * mgag200: * Minor fixes * mxsfb: * Support disabling overlay planes * nouveau: * Improve VRAM detection * Various fixes and cleanups * panel: * panel-edp: Support AUO B116XAB01.4 * Support Visionox R66451 plus DT bindings * Cleanups * ssd130x: * Support per-controller default resolution plus DT bindings * Reduce memory-allocation overhead * Cleanups * tidss: * Support TI AM625 plus DT bindings * Implement new connector model plus driver updates * vkms * Improve write-back support * Documentation fixes Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230713090830.GA23281@linux-uq9g	2023-07-17 15:37:57 +02:00
Dave Airlie	38d88d5e97	amd-drm-fixes-6.5-2023-07-12: amdgpu: - SMU i2c locking fix - Fix a possible deadlock in process restoration for ROCm apps - Disable PCIe lane/speed switching on Intel platforms (the platforms don't support it) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZK7yYQAKCRC93/aFa7yZ 2JvYAQDpMj8/rLUsmWRk30jvkaZivgaeUEAG0FGaMpKaaATbvwEA2eHxN3xk5GKs ethEPp/zdivIrz6h/JWSCFrpCzqg4g8= =rsRJ -----END PGP SIGNATURE----- Merge tag 'amd-drm-fixes-6.5-2023-07-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.5-2023-07-12: amdgpu: - SMU i2c locking fix - Fix a possible deadlock in process restoration for ROCm apps - Disable PCIe lane/speed switching on Intel platforms (the platforms don't support it) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230712184009.7740-1-alexander.deucher@amd.com	2023-07-14 13:19:54 +10:00
Saleemkhan Jamadar	093b21f431	Revert "drm/amdgpu: update kernel vcn ring test" VCN FW depncencies revert it to unlock others This reverts commit `3ebfa943b8`. Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-13 17:32:40 -04:00
Guchun Chen	826c1e923b	drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancel In below thousands of screen rotation loop tests with virtual display enabled, a CPU hard lockup issue may happen, leading system to unresponsive and crash. do { xrandr --output Virtual --rotate inverted xrandr --output Virtual --rotate right xrandr --output Virtual --rotate left xrandr --output Virtual --rotate normal } while (1); NMI watchdog: Watchdog detected hard LOCKUP on cpu 1 ? hrtimer_run_softirq+0x140/0x140 ? store_vblank+0xe0/0xe0 [drm] hrtimer_cancel+0x15/0x30 amdgpu_vkms_disable_vblank+0x15/0x30 [amdgpu] drm_vblank_disable_and_save+0x185/0x1f0 [drm] drm_crtc_vblank_off+0x159/0x4c0 [drm] ? record_print_text.cold+0x11/0x11 ? wait_for_completion_timeout+0x232/0x280 ? drm_crtc_wait_one_vblank+0x40/0x40 [drm] ? bit_wait_io_timeout+0xe0/0xe0 ? wait_for_completion_interruptible+0x1d7/0x320 ? mutex_unlock+0x81/0xd0 amdgpu_vkms_crtc_atomic_disable It's caused by a stuck in lock dependency in such scenario on different CPUs. CPU1 CPU2 drm_crtc_vblank_off hrtimer_interrupt grab event_lock (irq disabled) __hrtimer_run_queues grab vbl_lock/vblank_time_block amdgpu_vkms_vblank_simulate amdgpu_vkms_disable_vblank drm_handle_vblank hrtimer_cancel grab dev->event_lock So CPU1 stucks in hrtimer_cancel as timer callback is running endless on current clock base, as that timer queue on CPU2 has no chance to finish it because of failing to hold the lock. So NMI watchdog will throw the errors after its threshold, and all later CPUs are impacted/blocked. So use hrtimer_try_to_cancel to fix this, as disable_vblank callback does not need to wait the handler to finish. And also it's not necessary to check the return value of hrtimer_try_to_cancel, because even if it's -1 which means current timer callback is running, it will be reprogrammed in hrtimer_start with calling enable_vblank to make it works. v2: only re-arm timer when vblank is enabled (Christian) and add a Fixes tag as well v3: drop warn printing (Christian) v4: drop superfluous check of blank->enabled in timer function, as it's guaranteed in drm_handle_vblank (Christian) Fixes: `84ec374bd5` ("drm/amdgpu: create amdgpu_vkms (v4)") Cc: stable@vger.kernel.org Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-13 17:32:15 -04:00
Srinivasan Shanmugam	2f77b5931f	drm/amdgpu: Fix error & warnings in gmc_v8_0.c Fix below checkpatch error & warnings: ERROR: trailing statements should be on next line + default: BUG(); WARNING: braces {} are not necessary for single statement blocks WARNING: braces {} are not necessary for any arm of this statement WARNING: Block comments should align the * on each line Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-13 17:29:11 -04:00
Luben Tuikov	52b82609bf	drm/amdgpu: Rename to amdgpu_vm_tlb_seq_struct Rename struct amdgpu_vm_tlb_seq_cb {...} to struct amdgpu_vm_tlb_seq_struct {...}, so as to not conflict with documentation processing tools. Of course, C has no problem with this. Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/b5ebc891-ee63-1638-0377-7b512d34b823@infradead.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 12:22:52 -04:00
Srinivasan Shanmugam	bd3c414254	drm/amdkfd: Fix stack size in 'amdgpu_amdkfd_unmap_hiq' Dynamically allocate large local variable instead of putting it onto the stack to avoid exceeding the stack size: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c: In function ‘amdgpu_amdkfd_unmap_hiq’: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:868:1: warning: the frame size of 1280 bytes is larger than 1024 bytes [-Wframe-larger-than=] Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202307080505.V12qS0oz-lkp@intel.com Suggested-by: Guchun Chen <guchun.chen@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 12:22:52 -04:00
Mario Limonciello	188623076d	drm/amd: Move helper for dynamic speed switch check out of smu13 This helper is used for checking if the connected host supports the feature, it can be moved into generic code to be used by other smu implementations as well. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x	2023-07-12 12:09:54 -04:00
gaba	8a774fe912	drm/amdgpu: avoid restore process run into dead loop. In restore process worker, pinned BO cause update PTE fail, then the function re-schedule the restore_work. This will generate dead loop. Signed-off-by: gaba <gaba@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-07-12 12:07:13 -04:00
Mario Limonciello	5d1eb4c4c8	drm/amd: Move helper for dynamic speed switch check out of smu13 This helper is used for checking if the connected host supports the feature, it can be moved into generic code to be used by other smu implementations as well. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 11:12:10 -04:00
Saleemkhan Jamadar	3ebfa943b8	drm/amdgpu: update kernel vcn ring test add session context buffer to decoder ring test for vcn v1 to v3. v3 - correct the cmd for sesssion ctx buf v2 - add the buffer into IB (Leo liu) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 11:12:10 -04:00
Saleemkhan Jamadar	f3fa86f5c7	drm/amdgpu:update kernel vcn ring test add session context buffer to decoder ring test. v5 - clear the session ct buffer (Christian) v4 - data type, explain change of ib size change (Christian) v3 - indent and v2 changes correction. (Christian) v2 - put the buffer at the end of the IB (Christian) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 11:12:09 -04:00
Tao Zhou	bd97449837	drm/amdgpu: add watchdog timer enablement for gfx_v9_4_3 Configure SQ watchdog timer setting. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 11:12:09 -04:00
Mukul Joshi	1879e009a4	drm/amdkfd: Update CWSR grace period for GFX9.4.3 For GFX9.4.3, setup a reduced default CWSR grace period equal to 1000 cycles instead of 64000 cycles. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 11:12:09 -04:00
Lang Yu	1ddcdb7cb6	drm/amdgpu: use psp_execute_load_ip_fw instead Replace the old ones with psp_execute_load_ip_fw. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 11:12:09 -04:00
Lang Yu	45b51acb38	drm/amdgpu: rename psp_execute_non_psp_fw_load and make it global This will make this function more general, and then serve other IPs. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 11:12:09 -04:00
Eric Huang	036e348fdc	drm/amdkfd: add kfd2kgd debugger callbacks for GC v9.4.3 Implement the similarities as GC v9.4.2, and the difference for GC v9.4.3 HW spec, i.e. xcc instance. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 10:58:01 -04:00
Arnd Bergmann	822130b5e8	drm/amdgpu: avoid integer overflow warning in amdgpu_device_resize_fb_bar() On 32-bit architectures comparing a resource against a value larger than U32_MAX can cause a warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1344:18: error: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare] res->start > 0x100000000ull) ~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ As gcc does not warn about this in dead code, add an IS_ENABLED() check at the start of the function. This will always return success but not actually resize the BAR on 32-bit architectures without high memory, which is exactly what we want here, as the driver can fall back to bank switching the VRAM access. Fixes: `31b8adab32` ("drm/amdgpu: require a root bus window above 4GB for BAR resize") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 10:57:41 -04:00
Philip Yang	bf80d34b6c	drm/amdgpu: Increase soft IH ring size Retry faults are delegated to soft IH ring and then processed by deferred worker. Current soft IH ring size PAGE_SIZE can store 128 entries, which may overflow and drop retry faults, causes HW stucks because the retry fault is not recovered. Increase soft IH ring size to 8KB, enough to store 256 CAM entries because we clear the CAM entry after handling the retry fault from soft ring. Define macro IH_RING_SIZE and IH_SW_RING_SIZE to remove duplicate constant. Show warning message if soft IH ring overflows with CAM enabled because this should not happen. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 10:57:25 -04:00
Alex Deucher	95b88ea1af	drm/amdgpu/gfx10: move update_spm_vmid() out of rlc_init() rlc_init() is part of sw_init() so it should not touch hardware. Additionally, calling the rlc update_spm_vmid() callback directly invokes a gfx on/off cycle which could result in powergating being enabled before hw init is complete. Split update_spm_vmid() into an internal implementation for local use without gfxoff interaction and then the rlc callback which includes gfxoff handling. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 10:57:22 -04:00
Alex Deucher	08b6e1725d	drm/amdgpu/gfx9: move update_spm_vmid() out of rlc_init() rlc_init() is part of sw_init() so it should not touch hardware. Additionally, calling the rlc update_spm_vmid() callback directly invokes a gfx on/off cycle which could result in powergating being enabled before hw init is complete. Split update_spm_vmid() into an internal implementation for local use without gfxoff interaction and then the rlc callback which includes gfxoff handling. lbpw_init also touches hardware so mvoe that to rlc_resume as well. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-12 10:57:15 -04:00
Christian König	ca6c1e210a	drm/amdgpu: use the new drm_exec object for CS v3 Use the new component here as well and remove the old handling. v2: drop dupplicate handling v3: fix memory leak pointed out by Tatsuyuki Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230711133122.3710-7-christian.koenig@amd.com	2023-07-12 14:14:50 +02:00
Christian König	2acc73f81f	drm/amdgpu: use drm_exec for MES testing Start using the new component here as well. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230711133122.3710-6-christian.koenig@amd.com	2023-07-12 14:14:44 +02:00
Christian König	8a206685d3	drm/amdgpu: use drm_exec for GEM and CSA handling v2 Start using the new component here as well. v2: ignore duplicates to allow per VM BO mappings Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230711133122.3710-5-christian.koenig@amd.com	2023-07-12 14:14:36 +02:00
Christian König	8abc1eb298	drm/amdkfd: switch over to using drm_exec v3 Avoids quite a bit of logic and kmalloc overhead. v2: fix multiple problems pointed out by Felix v3: two more nit picks from Felix fixed Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230711133122.3710-4-christian.koenig@amd.com	2023-07-12 14:14:26 +02:00
Srinivasan Shanmugam	6dda3f18bd	drm/amdgpu: Fix errors & warnings in gfx_v10_0.c Fix the below checkpatch errors & warnings: ERROR: that open brace { should be on the previous line ERROR: space prohibited before that ',' (ctx:WxV) ERROR: space required after that ',' (ctx:WxV) ERROR: code indent should use tabs where possible ERROR: switch and case should be at the same indent WARNING: please, no spaces at the start of a line WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: space prohibited before semicolon WARNING: Block comments use a trailing / on a separate line WARNING: Block comments use on subsequent lines WARNING: braces {} are not necessary for any arm of this statement WARNING: Missing a blank line after declarations Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:37 -04:00
Srinivasan Shanmugam	fe018cf2a1	drm/amdgpu: Fix warnings in gfxhub_ v3_0, v3_0_3.c Fix the below checkpatch warnings: WARNING: static const char * array should probably be static const char * const +static const char gfxhub_client_ids[] = { WARNING: Prefer 'unsigned int' to bare use of 'unsigned' + unsigned i; WARNING: static const char array should probably be static const char * const +static const char *gfxhub_client_ids[] = { WARNING: Prefer 'unsigned int' to bare use of 'unsigned' + unsigned i; Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:37 -04:00
Srinivasan Shanmugam	e8483e682a	drm/amdgpu: Fix warnings in gmc_v8_0.c Fix below checkpatch warnings: WARNING: braces {} are not necessary for single statement blocks WARNING: braces {} are not necessary for any arm of this statement WARNING: Block comments should align the * on each line Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:37 -04:00
gaba	edc857a682	drm/amdgpu: avoid restore process run into dead loop. In restore process worker, pinned BO cause update PTE fail, then the function re-schedule the restore_work. This will generate dead loop. Signed-off-by: gaba <gaba@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:37 -04:00
Guchun Chen	e2770d76d4	drm/amdgpu/vkms: drop redundant set of fb_modifiers_not_supported Due to a coding typo. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:37 -04:00
Srinivasan Shanmugam	c7a6c2b6b8	drm/amdgpu: Remove else after return statement in 'gfx_v10_0_check_grbm_cam_remapping' Fix below checkpatch warnings: WARNING: else is not generally useful after a break or return + return true; + } else { WARNING: else is not generally useful after a break or return + return true; + } else { Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:37 -04:00
Srinivasan Shanmugam	38d47145b0	drm/amdgpu: Fix warnings in gmc_v11_0.c Fix below checkpatch warnings: WARNING: quoted string split across lines WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: void function return statements are not generally useful WARNING: braces {} are not necessary for any arm of this statement Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:37 -04:00
Srinivasan Shanmugam	b8f68f1da5	drm/amdgpu: Remove else after return statement in 'gmc_v8_0_check_soft_reset' Fix below checkpatch warnings: WARNING: else is not generally useful after a break or return + return true; + } else { Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:37 -04:00
Srinivasan Shanmugam	f51f2088f1	drm/amdgpu: Fix warnings in gfxhub_v2_1.c Fix the below checkpatch warnings: WARNING: static const char * array should probably be static const char * const +static const char *gfxhub_client_ids[] = { WARNING: Prefer 'unsigned int' to bare use of 'unsigned' + unsigned i; WARNING: Missing a blank line after declarations + int i; + adev->gmc.VM_L2_CNTL = RREG32_SOC15(GC, 0, mmGCVM_L2_CNTL); WARNING: Missing a blank line after declarations + int i; + WREG32_SOC15(GC, 0, mmGCVM_L2_CNTL, adev->gmc.VM_L2_CNTL); WARNING: braces {} are not necessary for single statement blocks + if (!time) { + DRM_WARN("failed to wait for GRBM(EA) idle\n"); + } Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:36 -04:00
Srinivasan Shanmugam	0cfc1d6830	drm/amdgpu: Fix errors & warnings in gmc_ v6_0, v7_0.c Fix below checkpatch errors & warnings: ERROR: trailing statements should be on next line + default: BUG(); ERROR: trailing statements should be on next line WARNING: braces {} are not necessary for single statement blocks WARNING: braces {} are not necessary for any arm of this statement WARNING: Block comments use * on subsequent lines WARNING: Missing a blank line after declarations WARNING: Prefer 'unsigned int' to bare use of 'unsigned' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:36 -04:00
Srinivasan Shanmugam	8612a435f3	drm/amdgpu: Fix warnings in gmc_v10_0.c Fix below checkpatch warnings: WARNING: Consider removing the code enclosed by this #if 0 and its #endif WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: quoted string split across lines Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:36 -04:00
Srinivasan Shanmugam	e2710187bb	drm/amdgpu: Prefer dev_warn over printk Fix the below warning: WARNING: Prefer [subsystem eg: netdev]_warn([subsystem]dev, ... then dev_warn(dev, ... then pr_warn(... to printk(KERN_WARNING ... Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:36 -04:00
Srinivasan Shanmugam	0e2b8507c4	drm/amdgpu: Fix warnings in gfxhub_v2_0.c Fix the below checkpatch warnings: WARNING: static const char * array should probably be static const char * const +static const char *gfxhub_client_ids[] = { WARNING: Prefer 'unsigned int' to bare use of 'unsigned' + unsigned i; WARNING: Missing a blank line after declarations + u32 tmp; + tmp = RREG32_SOC15(GC, 0, mmGCVM_L2_PROTECTION_FAULT_CNTL); Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:36 -04:00
Lijo Lazar	67769b7cdd	drm/amdgpu: Remove redundant GFX v9.4.3 sequence Programming of XCC id is already taken care with partition mode change. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:36 -04:00
Srinivasan Shanmugam	62e6771ae8	drm/amdgpu: Fix warnings in gfxhub_ v1_0, v1_2.c Fix the below checkpatch warnings: WARNING: Block comments should align the * on each line + /* + * Raven2 has a HW issue that it is unable to use the WARNING: Prefer 'unsigned int' to bare use of 'unsigned' + unsigned num_level, block_size; WARNING: Prefer 'unsigned int' to bare use of 'unsigned' + unsigned i; WARNING: Missing a blank line after declarations + u32 tmp; + tmp = RREG32_SOC15(GC, 0, mmVM_L2_PROTECTION_FAULT_CNTL); WARNING: Block comments should align the * on each line + /* + * Raven2 has a HW issue that it is unable to use the WARNING: Prefer 'unsigned int' to bare use of 'unsigned' + unsigned num_level, block_size; Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-10 09:02:36 -04:00
Srinivasan Shanmugam	08e8521576	drm/amdgpu: Fix error & warnings in gmc_v9_0.c Fix below checkpatch error & warnings: ERROR: that open brace { should be on the previous line WARNING: static const char * array should probably be static const char * const WARNING: Block comments use * on subsequent lines WARNING: Block comments use a trailing */ on a separate line Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:48 -04:00
Lijo Lazar	4755bfbd99	drm/amdgpu: Change golden settings for GFX v9.4.3 Change the settings applicable for A0. GRBM_MCM_ADDR setting will be applied by firmware. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Tested-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:48 -04:00
Sreekant Somasekharan	c4cde7358d	drm/amd/amdgpu: Add cu_occupancy sysfs file to GFX9.4.3 Include kgd_gfx_v9_get_cu_occupancy call inside kfd2kgd_calls for GFX9.4.3 to expose cu_occupancy sysfs file. Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:48 -04:00
Xiaogang Chen	eb58ad143d	drm/amdgpu: have bos for PDs/PTS cpu accessible when kfd uses cpu to update vm When kfd uses cpu to update vm iterates all current PDs/PTs bos, adds AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag and kmap them to kernel virtual address space before kfd updates the vm that was created by gfx. Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:48 -04:00
Mukul Joshi	9041b53a59	drm/amdkfd: Use KIQ to unmap HIQ Currently, we unmap HIQ by directly writing to HQD registers. This doesn't work for GFX9.4.3. Instead, use KIQ to unmap HIQ, similar to how we use KIQ to map HIQ. Using KIQ to unmap HIQ works for all GFX series post GFXv9. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:48 -04:00
Tao Zhou	a80fe1a698	drm/amdgpu: skip address adjustment for GFX RAS injection The address parameter of GFX RAS injection isn't related to XGMI node number, keep it unchanged. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:47 -04:00
Mukul Joshi	e77673d14f	drm/amdgpu: Update invalid PTE flag setting Update the invalid PTE flag setting with TF enabled. This is to ensure, in addition to transitioning the retry fault to a no-retry fault, it also causes the wavefront to enter the trap handler. With the current setting, the fault only transitions to a no-retry fault. Additionally, have 2 sets of invalid PTE settings, one for TF enabled, the other for TF disabled. The setting with TF disabled, doesn't work with TF enabled. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:47 -04:00
Alex Deucher	bc8ba5f2da	drm/amdgpu: return an error if query_video_caps is not set Should only be an issue for bring up when the function pointer is not set, but check it anyway to be safe. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:47 -04:00
Mario Limonciello	a90d36a49a	drm/amd: adjust whitespace for amdgpu_psp.h Adjust the whitespace to be consistent with the rest of the `struct psp_context` structure. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:47 -04:00
Mario Limonciello	e7347f1c73	drm/amd: Detect IFWI or PD upgrade support in psp_early_init() Rather than evaluating the IP version for visibility, evaluate it at the same time as the IP is initialized. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:47 -04:00
Mario Limonciello	649663af73	drm/amd: Add documentation for how to flash a dGPU The flashing process for dGPUs uses sysfs files in a non-obvious way, so document it for users. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:47 -04:00
Mario Limonciello	98d19a6c49	drm/amd: Convert USB-C PD F/W attributes into groups Rather than special casing the creation of the file, special case the visibility to the supported dGPUs. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:47 -04:00
Mario Limonciello	1cc506f08b	drm/amd: Make flashing messages quieter Debug messages related to the kernel process of flashing an updated IFWI are needlessly noisy and also confusing. Downgrade them to debug instead and clarify what they are actually doing. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:47 -04:00
Mario Limonciello	521289d2a2	drm/amd: Use attribute groups for PSP flashing attributes Individually creating attributes can be racy, instead make attributes using attribute groups and control their visibility with an is_visible callback to only show when using appropriate products. v2: squash in fix for PSP 13.0.10 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:51:46 -04:00
Lijo Lazar	fe9aaddf90	drm/amdgpu: Rename aqua_vanjaram_reg_init.c This contains SOC specific functions, rename to a more generic format <soc>.c => aqua_vanjaram.c Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-07 13:38:27 -04:00
Linus Torvalds	5133c9e51d	drm fixes for 6.5-rc1 fbdev: - Fix module infos on sparc panel: - Fix mode on Starry-ili9882t i915: - Allow DC states along with PW2 only for PWB functionality [adlp+] - Fix SSC selection for MPLLA [mtl] - Use hw.adjusted mode when calculating io/fast wake times [psr] - Apply min softlimit correctly [guc/slpc] - Assign correct hdcp content type [hdcp] - Add missing forward declarations/includes to display power headers - Fix BDW PSR AUX CH data register offsets [psr] - Use mock device info for creating mock device amdgpu: - Misc cleanups - GFX 9.4.3 fixes - DEBUGFS build fix - Fix LPDDR5 reporting - ASPM fixes - DCN 3.1.4 fixes - DP MST fixes - DCN 3.2.x fixes - Display PSR TCON fixes - SMU 13.x fixes - RAS fixes - Vega12/20 SMU fixes - PSP flashing cleanup - GFX9 MCBP fixes - SR-IOV fixes - GPUVM clear mappings fix for always valid BOs - Add FAMS quirk for problematic monitor - Fix possible UAF - Better handle monentary temperature fluctuations - SDMA 4.4.2 fixes - Fencing fix -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmSnZkkACgkQDHTzWXnE hr5qlA//RZ5T0FaCV7lNJUkoHuHkd/nEyeqJCMD1GC+LJH7NgQv2ahIZte218nUZ 602MMsWqJYLRrP13TbuYy/McKe56w3lq87eeILmV6CGAUB96yjqjTaHoKHU49/PC kTzpgBgcKo0gqp73B+FZ1qBE3I3RowPwPYnL3ddH236lv4FBus6nowwJf/pmPU8f ok41lefHyo601ypHFVAWRfJNl+qnXpBAvZxpeTlExqklqZLgvgNTwvJvNHmVUoi3 hDgGJVQaMJH9O3Chwgr7JdE3Lk+ku+XH+CU3Qw0SdBB4NaMxU5KMagwQFYfV18cV VZ9DqL9G/y4xpRVl0anr/MM2eF4E0P/MZMCKGBHOPkP/Ol8mmlAUSDoIzDaeDiPc tGpSNuWywr/DzWBWUZFeZBYdR4a8A4jMg84SPzVvPNNjRUXMutKxFe9eMNCpDc6C yTInY4EuwrYSPFPa2sT/B5efKouieH4XsF6ORK3uct0ZWrGwf05DX+HdykhZQwH1 eQRpWhR6lZKtovCceGFstxaC0B+Gh58D3uUFznxVKXEQyIVZvjEtm/aMLtHZTYT/ Wtj0cAr12a1Mbyy5XXh1ZGeq5QqmIaQt/KGnatp2fxWEmccwd22Uw5q1YZuNm2A7 AlndyRnJtw9u6zfnvkNd/Fj/9v7QdZquuluSKQYoHe4ZPS3+GJs= =OBzT -----END PGP SIGNATURE----- Merge tag 'drm-next-2023-07-07' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Lots of fixes, mostly i915 and amdgpu. It's two weeks of i915, and I think three weeks of amdgpu. fbdev: - Fix module infos on sparc panel: - Fix mode on Starry-ili9882t i915: - Allow DC states along with PW2 only for PWB functionality [adlp+] - Fix SSC selection for MPLLA [mtl] - Use hw.adjusted mode when calculating io/fast wake times [psr] - Apply min softlimit correctly [guc/slpc] - Assign correct hdcp content type [hdcp] - Add missing forward declarations/includes to display power headers - Fix BDW PSR AUX CH data register offsets [psr] - Use mock device info for creating mock device amdgpu: - Misc cleanups - GFX 9.4.3 fixes - DEBUGFS build fix - Fix LPDDR5 reporting - ASPM fixes - DCN 3.1.4 fixes - DP MST fixes - DCN 3.2.x fixes - Display PSR TCON fixes - SMU 13.x fixes - RAS fixes - Vega12/20 SMU fixes - PSP flashing cleanup - GFX9 MCBP fixes - SR-IOV fixes - GPUVM clear mappings fix for always valid BOs - Add FAMS quirk for problematic monitor - Fix possible UAF - Better handle monentary temperature fluctuations - SDMA 4.4.2 fixes - Fencing fix" * tag 'drm-next-2023-07-07' of git://anongit.freedesktop.org/drm/drm: (83 commits) drm/i915: use mock device info for creating mock device drm/i915/psr: Fix BDW PSR AUX CH data register offsets drm/amdgpu: Fix potential fence use-after-free v2 drm/amd/pm: avoid unintentional shutdown due to temperature momentary fluctuation drm/amd/pm: expose swctf threshold setting for legacy powerplay drm/amd/display: 3.2.241 drm/amd/display: Take full update path if number of planes changed drm/amd/display: Create debugging mechanism for Gaming FAMS drm/amd/display: Add monitor specific edid quirk drm/amd/display: For new fast update path, loop through each surface drm/amd/display: Remove Phantom Pipe Check When Calculating K1 and K2 drm/amd/display: Limit new fast update path to addr and gamma / color drm/amd/display: Fix the delta clamping for shaper LUT drm/amdgpu: Keep non-psp path for partition switch drm/amd/display: program DPP shaper and 3D LUT if updated Revert "drm/amd/display: edp do not add non-edid timings" drm/amdgpu: share drm device for pci amdgpu device with 1st partition device drm/amd/pm: Add GFX v9.4.3 unique id to sysfs drm/amd/pm: Enable pp_feature attribute drm/amdgpu/vcn: Need to unpause dpg before stop dpg ...	2023-07-06 22:42:54 -07:00
shanzhulig	2e54154b9f	drm/amdgpu: Fix potential fence use-after-free v2 fence Decrements the reference count before exiting. Avoid Race Vulnerabilities for fence use-after-free. v2 (chk): actually fix the use after free and not just move it. Signed-off-by: shanzhulig <shanzhulig@gmail.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:12:16 -04:00
Evan Quan	b75efe88b2	drm/amd/pm: avoid unintentional shutdown due to temperature momentary fluctuation An intentional delay is added on soft ctf triggered. Then there will be a double check for the GPU temperature before taking further action. This can avoid unintended shutdown due to temperature momentary fluctuation. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:12:16 -04:00
Lijo Lazar	a28eb4871a	drm/amdgpu: Keep non-psp path for partition switch When PSP block is not present, use direct programming. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Tested-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:12:15 -04:00
James Zhu	150c213139	drm/amdgpu: share drm device for pci amdgpu device with 1st partition device To save render node resoure, share drm device setting for pci amdgpu device with 1st XCP partition device. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:12:15 -04:00
Emily Deng	803f31814f	drm/amdgpu/vcn: Need to unpause dpg before stop dpg Need to unpause dpg first, or it will hit follow error during stop dpg: "[drm] Register(1) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000000n" Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:12:15 -04:00
Le Ma	67af691626	drm/amdgpu: remove duplicated doorbell range init for sdma v4.4.2 Handled in earlier phase Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:12:15 -04:00
YiPeng Chai	2c7cd280e5	drm/amdgpu: gpu recovers from fatal error in poison mode Fatal error occurs in ras poison mode, mode1 reset is used to recover gpu. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:12:15 -04:00
Alex Deucher	50a7c8765c	drm/amdgpu: enable mcbp by default on gfx9 It's required for high priority queues. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2535 Reviewed-and-tested-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:12:15 -04:00
Alex Deucher	02ff519e99	drm/amdgpu: make mcbp a per device setting So we can selectively enable it on certain devices. No intended functional change. Reviewed-and-tested-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:12:14 -04:00
Mario Limonciello	5efe0f3eed	drm/amd: Don't initialize PSP twice for Navi3x PSP functions are already set by psp_early_init() so initializing them a second time is unnecessary. No intended functional changes. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:12:14 -04:00
Zhigang Luo	2036b34d4a	drm/amdgpu: port SRIOV VF missed changes port SRIOV VF missed changes from gfx_v9_0 to gfx_v9_4_3. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:11:35 -04:00
Mario Limonciello	5c6d52ff4b	drm/amd: Don't try to enable secure display TA multiple times If the securedisplay TA failed to load the first time, it's unlikely to work again after a suspend/resume cycle or reset cycle and it appears to be causing problems in futher attempts. Fixes: `e42dfa66d5` ("drm/amdgpu: Add secure display TA load for Renoir") Reported-by: Filip Hejsek <filip.hejsek@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2633 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:11:35 -04:00
Christian König	570b295248	drm/amdgpu: fix number of fence calculations Since adding gang submit we need to take the gang size into account while reserving fences. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `4624459c84` ("drm/amdgpu: add gang submit frontend v6") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:11:35 -04:00
Lijo Lazar	b579ea632f	drm/amdgpu: Modify for_each_inst macro Modify it such that it doesn't change the instance mask parameter. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Victor Skvortsov <victor.skvortsov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:11:35 -04:00
Mangesh Gadre	7f03b1d14d	drm/amdgpu:Remove sdma halt/unhalt during frontdoor load sdma halt/unhalt is performed by psp when frontdoor loading used,so this can be skipped. v2: Instead of removing halt/unhalt completely, driver will do it only during backdoor load. Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:11:35 -04:00
Tao Zhou	4ff96bcc0d	drm/amdgpu: check RAS irq existence for VCN/JPEG No RAS irq is allowed. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:11:35 -04:00
Xiaogang Chen	1d7776cc14	drm/amdgpu: remove vm sanity check from amdgpu_vm_make_compute Since we allow kfd and graphic operate on same GPU VM to have interoperation between them GPU VM may have been used by graphic vm operations before kfd turns a GPU VM into a compute VM. Remove vm clean checking at amdgpu_vm_make_compute. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-30 13:11:35 -04:00
Linus Torvalds	1b722407a1	drm changes for 6.5-rc1: core: - replace strlcpy with strscpy - EDID changes to support further conversion to struct drm_edid - Move i915 DSC parameter code to common DRM helpers - Add Colorspace functionality aperture: - ignore framebuffers with non-primary devices fbdev: - use fbdev i/o helpers - add Kconfig options for fb_ops helpers - use new fb io helpers directly in drivers sysfs: - export DRM connector ID scheduler: - Avoid an infinite loop ttm: - store function table in .rodata - Add query for TTM mem limit - Add NUMA awareness to pools - Export ttm_pool_fini() bridge: - fsl-ldb: support i.MX6SX - lt9211, lt9611: remove blanking packets - tc358768: implement input bus formats, devm cleanups - ti-snd65dsi86: implement wait_hpd_asserted - analogix: fix endless probe loop - samsung-dsim: support swapped clock, fix enabling, support var clock - display-connector: Add support for external power supply - imx: Fix module linking - tc358762: Support reset GPIO panel: - nt36523: Support Lenovo J606F - st7703: Support Anbernic RG353V-V2 - InnoLux G070ACE-L01 support - boe-tv101wum-nl6: Improve initialization - sharp-ls043t1le001: Mode fixes - simple: BOE EV121WXM-N10-1850, S6D7AA0 - Ampire AM-800480L1TMQW-T00H - Rocktech RK043FN48H - Starry himax83102-j02 - Starry ili9882t amdgpu: - add new ctx query flag to handle reset better - add new query/set shadow buffer for rdna3 - DCN 3.2/3.1.x/3.0.x updates - Enable DC_FP on loongarch - PCIe fix for RDNA2 - improve DC FAMS/SubVP support for better power management - partition support for lots of engines - Take NUMA into account when allocating memory - Add new DRM_AMDGPU_WERROR config parameter to help with CI - Initial SMU13 overdrive support - Add support for new colorspace KMS API - W=1 fixes amdkfd: - Query TTM mem limit rather than hardcoding it - GC 9.4.3 partition support - Handle NUMA for partitions - Add debugger interface for enabling gdb - Add KFD event age tracking radeon: - Fix possible UAF i915: - new getparam for PXP support - GSC/MEI proxy driver - Meteorlake display enablement - avoid clearing preallocated framebuffers with TTM - implement framebuffer mmap support - Disable sampler indirect state in bindless heap - Enable fdinfo for GuC backends - GuC loading and firmware table handling fixes - Various refactors for multi-tile enablement - Define MOCS and PAT tables for MTL - GSC/MEI support for Meteorlake - PMU multi-tile support - Large driver kernel doc cleanup - Allow VRR toggling and arbitrary refresh rates - Support async flips on linear buffers on display ver 12+ - Expose CRTC CTM property on ILK/SNB/VLV - New debugfs for display clock frequencies - Hotplug refactoring - Display refactoring - I915_GEM_CREATE_EXT_SET_PAT for Mesa on Meteorlake - Use large rings for compute contexts - HuC loading for MTL - Allow user to set cache at BO creation - MTL powermanagement enhancements - Switch to dedicated workqueues to stop using flush_scheduled_work() - Move display runtime init under display/ - Remove 10bit gamma on desktop gen3 parts, they don't support it habanalabs: - uapi: return 0 for user queries if there was a h/w or f/w error - Add pci health check when we lose connection with the firmware. This can be used to distinguish between pci link down and firmware getting stuck. - Add more info to the error print when TPC interrupt occur. - Firmware fixes msm: - Adreno A660 bindings - SM8350 MDSS bindings fix - Added support for DPU on sm6350 and sm6375 platforms - Implemented tearcheck support to support vsync on SM150 and newer platforms - Enabled missing features (DSPP, DSC, split display) on sc8180x, sc8280xp, sm8450 - Added support for DSI and 28nm DSI PHY on MSM8226 platform - Added support for DSI on sm6350 and sm6375 platforms - Added support for display controller on MSM8226 platform - A690 GPU support - Move cmdstream dumping out of fence signaling path - a610 support - Support for a6xx devices without GMU nouveau: - NULL ptr before deref fixes armada: - implement fbdev emulation as client sun4i: - fix mipi-dsi dotclock - release clocks vc4: - rgb range toggle property - BT601 / BT2020 HDMI support vkms: - convert to drmm helpers - add reflection and rotation support - fix rgb565 conversion gma500: - fix iomem access shmobile: - support renesas soc platform - enable fbdev mxsfb: - Add support for i.MX93 LCDIF stm: - dsi: Use devm_ helper - ltdc: Fix potential invalid pointer deref renesas: - Group drivers in renesas subdirectory to prepare for new platform - Drop deprecated R-Car H3 ES1.x support meson: - Add support for MIPI DSI displays virtio: - add sync object support mediatek: - Add display binding document for MT6795 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmSc3UwACgkQDHTzWXnE hr69fQ/+PF9L7FSB/qfjaoqJnk6wJyCehv7pDX2/UK7FUrW0e4EwNVx4KKIRqO/P pKSU9wRlC72ViGgqOYnw0pwzuh45630vWo1stbgxipU2cvM6Ywlq8FiQFdymFe+P tLYWe5MR55Y+E9Y+bCrKn2yvQ7v+f6EZ6ITIX7mrXL77Bpxhv58VzmZawkxmw5MV vwhSqJaaeeWNoyfSIDdN8Oj9fE6ScTyiA0YisOP6jnK/TiQofXQxFrMIdKctCcoA HjolfEEPVCDOSBipkV3hLiyN8lXmt47BmuHp9opSL/g1aASteVeD1/GrccTaA4xV ah+Jx1hBLcH5sm8CZzbCcHhNu3ILnPCFZFCx8gwflQqmDIOZvoMdL75j7lgqJZG8 TePEiifG3kYO/ZiDc5TUBdeMfbgeehPOsxbvOlA3LxJrgyxe/5o9oejX2Uvvzhoq 9fno1PLqeCILqYaMiCocJwyTw/2VKYCCH7Wiypd4o3h0nmAbbqPT3KeZgNOjoa2X GXpiIU9rTQ8LZgSmOXdCt2rc9Jb6q+eCiDgrZzAukbP8veQyOvO16Nx1+XzLhOYc BfjEOoA7nBJD+UPLWkwj42gKtoEWN7IOMTHgcK11d8jdpGISGupl/1nntGhYk0jO +3RRZXMB/Gjwe9ge4K9bFC81pbfuAE7ELQtPsgV9LapMmWHKccY= =FmUA -----END PGP SIGNATURE----- Merge tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "There is one set of patches to misc for a i915 gsc/mei proxy driver. Otherwise it's mostly amdgpu/i915/msm, lots of hw enablement and lots of refactoring. core: - replace strlcpy with strscpy - EDID changes to support further conversion to struct drm_edid - Move i915 DSC parameter code to common DRM helpers - Add Colorspace functionality aperture: - ignore framebuffers with non-primary devices fbdev: - use fbdev i/o helpers - add Kconfig options for fb_ops helpers - use new fb io helpers directly in drivers sysfs: - export DRM connector ID scheduler: - Avoid an infinite loop ttm: - store function table in .rodata - Add query for TTM mem limit - Add NUMA awareness to pools - Export ttm_pool_fini() bridge: - fsl-ldb: support i.MX6SX - lt9211, lt9611: remove blanking packets - tc358768: implement input bus formats, devm cleanups - ti-snd65dsi86: implement wait_hpd_asserted - analogix: fix endless probe loop - samsung-dsim: support swapped clock, fix enabling, support var clock - display-connector: Add support for external power supply - imx: Fix module linking - tc358762: Support reset GPIO panel: - nt36523: Support Lenovo J606F - st7703: Support Anbernic RG353V-V2 - InnoLux G070ACE-L01 support - boe-tv101wum-nl6: Improve initialization - sharp-ls043t1le001: Mode fixes - simple: BOE EV121WXM-N10-1850, S6D7AA0 - Ampire AM-800480L1TMQW-T00H - Rocktech RK043FN48H - Starry himax83102-j02 - Starry ili9882t amdgpu: - add new ctx query flag to handle reset better - add new query/set shadow buffer for rdna3 - DCN 3.2/3.1.x/3.0.x updates - Enable DC_FP on loongarch - PCIe fix for RDNA2 - improve DC FAMS/SubVP support for better power management - partition support for lots of engines - Take NUMA into account when allocating memory - Add new DRM_AMDGPU_WERROR config parameter to help with CI - Initial SMU13 overdrive support - Add support for new colorspace KMS API - W=1 fixes amdkfd: - Query TTM mem limit rather than hardcoding it - GC 9.4.3 partition support - Handle NUMA for partitions - Add debugger interface for enabling gdb - Add KFD event age tracking radeon: - Fix possible UAF i915: - new getparam for PXP support - GSC/MEI proxy driver - Meteorlake display enablement - avoid clearing preallocated framebuffers with TTM - implement framebuffer mmap support - Disable sampler indirect state in bindless heap - Enable fdinfo for GuC backends - GuC loading and firmware table handling fixes - Various refactors for multi-tile enablement - Define MOCS and PAT tables for MTL - GSC/MEI support for Meteorlake - PMU multi-tile support - Large driver kernel doc cleanup - Allow VRR toggling and arbitrary refresh rates - Support async flips on linear buffers on display ver 12+ - Expose CRTC CTM property on ILK/SNB/VLV - New debugfs for display clock frequencies - Hotplug refactoring - Display refactoring - I915_GEM_CREATE_EXT_SET_PAT for Mesa on Meteorlake - Use large rings for compute contexts - HuC loading for MTL - Allow user to set cache at BO creation - MTL powermanagement enhancements - Switch to dedicated workqueues to stop using flush_scheduled_work() - Move display runtime init under display/ - Remove 10bit gamma on desktop gen3 parts, they don't support it habanalabs: - uapi: return 0 for user queries if there was a h/w or f/w error - Add pci health check when we lose connection with the firmware. This can be used to distinguish between pci link down and firmware getting stuck. - Add more info to the error print when TPC interrupt occur. - Firmware fixes msm: - Adreno A660 bindings - SM8350 MDSS bindings fix - Added support for DPU on sm6350 and sm6375 platforms - Implemented tearcheck support to support vsync on SM150 and newer platforms - Enabled missing features (DSPP, DSC, split display) on sc8180x, sc8280xp, sm8450 - Added support for DSI and 28nm DSI PHY on MSM8226 platform - Added support for DSI on sm6350 and sm6375 platforms - Added support for display controller on MSM8226 platform - A690 GPU support - Move cmdstream dumping out of fence signaling path - a610 support - Support for a6xx devices without GMU nouveau: - NULL ptr before deref fixes armada: - implement fbdev emulation as client sun4i: - fix mipi-dsi dotclock - release clocks vc4: - rgb range toggle property - BT601 / BT2020 HDMI support vkms: - convert to drmm helpers - add reflection and rotation support - fix rgb565 conversion gma500: - fix iomem access shmobile: - support renesas soc platform - enable fbdev mxsfb: - Add support for i.MX93 LCDIF stm: - dsi: Use devm_ helper - ltdc: Fix potential invalid pointer deref renesas: - Group drivers in renesas subdirectory to prepare for new platform - Drop deprecated R-Car H3 ES1.x support meson: - Add support for MIPI DSI displays virtio: - add sync object support mediatek: - Add display binding document for MT6795" * tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drm: (1791 commits) drm/i915: Fix a NULL vs IS_ERR() bug drm/i915: make i915_drm_client_fdinfo() reference conditional again drm/i915/huc: Fix missing error code in intel_huc_init() drm/i915/gsc: take a wakeref for the proxy-init-completion check drm/msm/a6xx: Add A610 speedbin support drm/msm/a6xx: Add A619_holi speedbin support drm/msm/a6xx: Use adreno_is_aXYZ macros in speedbin matching drm/msm/a6xx: Use "else if" in GPU speedbin rev matching drm/msm/a6xx: Fix some A619 tunables drm/msm/a6xx: Add A610 support drm/msm/a6xx: Add support for A619_holi drm/msm/adreno: Disable has_cached_coherent in GMU wrapper configurations drm/msm/a6xx: Introduce GMU wrapper support drm/msm/a6xx: Move CX GMU power counter enablement to hw_init drm/msm/a6xx: Extend and explain UBWC config drm/msm/a6xx: Remove both GBIF and RBBM GBIF halt on hw init drm/msm/a6xx: Add a helper for software-resetting the GPU drm/msm/a6xx: Improve a6xx_bus_clear_pending_transactions() drm/msm/a6xx: Move a6xx_bus_clear_pending_transactions to a6xx_gpu drm/msm/a6xx: Move force keepalive vote removal to a6xx_gmu_force_off() ...	2023-06-29 11:00:17 -07:00
Linus Torvalds	582c161cf3	hardening updates for v6.5-rc1 - Fix KMSAN vs FORTIFY in strlcpy/strlcat (Alexander Potapenko) - Convert strreplace() to return string start (Andy Shevchenko) - Flexible array conversions (Arnd Bergmann, Wyes Karny, Kees Cook) - Add missing function prototypes seen with W=1 (Arnd Bergmann) - Fix strscpy() kerndoc typo (Arne Welzel) - Replace strlcpy() with strscpy() across many subsystems which were either Acked by respective maintainers or were trivial changes that went ignored for multiple weeks (Azeem Shaikh) - Remove unneeded cc-option test for UBSAN_TRAP (Nick Desaulniers) - Add KUnit tests for strcat()-family - Enable KUnit tests of FORTIFY wrappers under UML - Add more complete FORTIFY protections for strlcat() - Add missed disabling of FORTIFY for all arch purgatories. - Enable -fstrict-flex-arrays=3 globally - Tightening UBSAN_BOUNDS when using GCC - Improve checkpatch to check for strcpy, strncpy, and fake flex arrays - Improve use of const variables in FORTIFY - Add requested struct_size_t() helper for types not pointers - Add __counted_by macro for annotating flexible array size members -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmSbftQWHGtlZXNjb29r QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJj0MD/9X9jzJzCmsAU+yNldeoAzC84Sk GVU3RBxGcTNysL1gZXynkIgigw7DWc4htMGeSABHHwQRVP65JCH1Kw/VqIkyumbx 9LdX6IklMJb4pRT4PVU3azebV4eNmSjlur2UxMeW54Czm91/6I8RHbJOyAPnOUmo 2oomGdP/hpEHtKR7hgy8Axc6w5ySwQixh2V5sVZG3VbvCS5WKTmTXbs6puuRT5hz iHt7v+7VtEg/Qf1W7J2oxfoghvVBsaRrSLrExWT/oZYh1ZxM7DsCAAoG/IsDgHGA 9LBXiRECgAFThbHVxLvvKZQMXdVk0i8iXLX43XMKC0wTA+NTyH7wlcQQ4RWNMuo8 sfA9Qm9gMArXaf64aymr3Uwn20Zan0391HdlbhOJZAE6v3PPJbleUnM58AzD2d3r 5Lz6AIFBxDImy+3f9iDWgacCT5/PkeiXTHzk9QnKhJyKKtRA58XJxj4q2+rPnGJP n4haXqoxD5FJbxdXiGKk31RS0U5HBug7wkOcUrTqDHUbc/QNU2b7dxTKUx+zYtCU uV5emPzpF4H4z+91WpO47n9gkMAfwV0lt9S2dwS8pxsgqctbmIan+Jgip7rsqZ2G OgLXBsb43eEs+6WgO8tVt/ZHYj9ivGMdrcNcsIfikzNs/xweUJ53k2xSEn2xEa5J cwANDmkL6QQK7yfeeg== =s0j1 -----END PGP SIGNATURE----- Merge tag 'hardening-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "There are three areas of note: A bunch of strlcpy()->strscpy() conversions ended up living in my tree since they were either Acked by maintainers for me to carry, or got ignored for multiple weeks (and were trivial changes). The compiler option '-fstrict-flex-arrays=3' has been enabled globally, and has been in -next for the entire devel cycle. This changes compiler diagnostics (though mainly just -Warray-bounds which is disabled) and potential UBSAN_BOUNDS and FORTIFY _warning_ coverage. In other words, there are no new restrictions, just potentially new warnings. Any new FORTIFY warnings we've seen have been fixed (usually in their respective subsystem trees). For more details, see commit `df8fc4e934`. The under-development compiler attribute __counted_by has been added so that we can start annotating flexible array members with their associated structure member that tracks the count of flexible array elements at run-time. It is possible (likely?) that the exact syntax of the attribute will change before it is finalized, but GCC and Clang are working together to sort it out. Any changes can be made to the macro while we continue to add annotations. As an example of that last case, I have a treewide commit waiting with such annotations found via Coccinelle: https://git.kernel.org/linus/adc5b3cb48a049563dc673f348eab7b6beba8a9b Also see commit `dd06e72e68` for more details. Summary: - Fix KMSAN vs FORTIFY in strlcpy/strlcat (Alexander Potapenko) - Convert strreplace() to return string start (Andy Shevchenko) - Flexible array conversions (Arnd Bergmann, Wyes Karny, Kees Cook) - Add missing function prototypes seen with W=1 (Arnd Bergmann) - Fix strscpy() kerndoc typo (Arne Welzel) - Replace strlcpy() with strscpy() across many subsystems which were either Acked by respective maintainers or were trivial changes that went ignored for multiple weeks (Azeem Shaikh) - Remove unneeded cc-option test for UBSAN_TRAP (Nick Desaulniers) - Add KUnit tests for strcat()-family - Enable KUnit tests of FORTIFY wrappers under UML - Add more complete FORTIFY protections for strlcat() - Add missed disabling of FORTIFY for all arch purgatories. - Enable -fstrict-flex-arrays=3 globally - Tightening UBSAN_BOUNDS when using GCC - Improve checkpatch to check for strcpy, strncpy, and fake flex arrays - Improve use of const variables in FORTIFY - Add requested struct_size_t() helper for types not pointers - Add __counted_by macro for annotating flexible array size members" * tag 'hardening-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (54 commits) netfilter: ipset: Replace strlcpy with strscpy uml: Replace strlcpy with strscpy um: Use HOST_DIR for mrproper kallsyms: Replace all non-returning strlcpy with strscpy sh: Replace all non-returning strlcpy with strscpy of/flattree: Replace all non-returning strlcpy with strscpy sparc64: Replace all non-returning strlcpy with strscpy Hexagon: Replace all non-returning strlcpy with strscpy kobject: Use return value of strreplace() lib/string_helpers: Change returned value of the strreplace() jbd2: Avoid printing outside the boundary of the buffer checkpatch: Check for 0-length and 1-element arrays riscv/purgatory: Do not use fortified string functions s390/purgatory: Do not use fortified string functions x86/purgatory: Do not use fortified string functions acpi: Replace struct acpi_table_slit 1-element array with flex-array clocksource: Replace all non-returning strlcpy with strscpy string: use __builtin_memcpy() in strlcpy/strlcat staging: most: Replace all non-returning strlcpy with strscpy drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy ...	2023-06-27 21:24:18 -07:00
Thomas Zimmermann	71e801b9b4	drm: Clear fd/handle callbacks in struct drm_driver Clear all assignments of struct drm_driver's fd/handle callbacks to drm_gem_prime_fd_to_handle() and drm_gem_prime_handle_to_fd(). These functions are called by default. Add a TODO item to convert vmwgfx to the defaults as well. v2: * remove TODO item (Zack) * also update amdgpu's amdgpu_partition_driver Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Simon Ser <contact@emersion.fr> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Jeffrey Hugo <quic_jhugo@quicinc.com> # qaic Link: https://patchwork.freedesktop.org/patch/msgid/20230620080252.16368-3-tzimmermann@suse.de	2023-06-26 11:08:41 +02:00
Evan Quan	1af3d0a8e8	drm/amd/pm: update the LC_L1_INACTIVITY setting to address possible noise issue It is proved that insufficient LC_L1_INACTIVITY setting can cause audio noise on some platform. With the LC_L1_INACTIVITY increased to 4ms, the issue can be resolved. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-23 15:46:22 -04:00
Zhigang Luo	acbe761046	drm/amdgpu: Skip TMR for MP0_HWIP 13.0.6 For SRIOV VF, no TMR needed. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-23 15:36:48 -04:00
Evan Quan	fd21987274	drm/amd/pm: revise the ASPM settings for thunderbolt attached scenario Also, correct the comment for NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT as 0x0000000E stands for 400ms instead of 4ms. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-23 15:36:03 -04:00
Samuel Pitoiset	ea2c3c0855	drm/amdgpu: fix clearing mappings for BOs that are always valid in VM Per VM BOs must be marked as moved or otherwise their ranges are not updated on use which might be necessary when the replace operation splits mappings. This fixes random GPU hangs when replacing sparse mappings from the userspace, while OP_MAP/OP_UNMAP works fine because always valid BOs are correctly handled there. Cc: stable@vger.kernel.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-23 15:35:51 -04:00
Lijo Lazar	184d838482	drm/amdgpu: Add vbios attribute only if supported Not all devices carry VBIOS version information. Add the device attribute only if supported. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-23 15:35:26 -04:00
Alex Deucher	c09b3bf736	drm/amdgpu/atomfirmware: fix LPDDR5 width reporting LPDDR5 channels are 32 bit rather than 64, report the width properly in the log. v2: Only LPDDR5 are 32 bits per channel. DDR5 is 64 bits per channel Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2468 Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-23 15:35:16 -04:00
Nathan Chancellor	4e3f85d1c0	drm/amdgpu: Remove CONFIG_DEBUG_FS guard around body of amdgpu_rap_debugfs_init() After commit `8020f0f931` ("drm/amd/amdgpu: enable W=1 for amdgpu"), there is an instance of -Wunused-const-variable when CONFIG_DEBUG_FS is disabled: drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c:110:37: error: unused variable 'amdgpu_rap_debugfs_ops' [-Werror,-Wunused-const-variable] 110 \| static const struct file_operations amdgpu_rap_debugfs_ops = { \| ^ 1 error generated. There is no reason for the body of this function to be guarded when CONFIG_DEBUG_FS is disabled, as debugfs_create_file() is a stub that just returns an error pointer in that situation. Remove the preprocessor guards so that the variable never appears unused, while not changing anything at run time. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-23 15:33:09 -04:00
Lijo Lazar	025654ae42	drm/amdgpu: Move calculation of xcp per memory node Its value is required for finding the memory id of xcp. Fixes: `d26ea1b346` ("drm/amdgpu: Add xcp manager num_xcp_per_mem_partition") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-23 15:31:55 -04:00
Jiadong Zhu	ef3c36a6e0	drm/amdgpu: Skip mark offset for high priority rings Only low priority rings are using chunks to save the offset. Bypass the mark offset callings from high priority rings. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-23 15:31:50 -04:00
Thomas Zimmermann	042aeecc02	drm/amdgpu: Remove struct drm_driver.gem_prime_mmap The callback struct drm_driver.gem_prime_mmap as been removed in commit `0adec22702` ("drm: Remove struct drm_driver.gem_prime_mmap"). Do not assign to it. The assigned function, drm_gem_prime_mmap(), is now the default for the operation, so there is no change in functionality. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: `0adec22702` ("drm: Remove struct drm_driver.gem_prime_mmap") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230619141129.2002-1-tzimmermann@suse.de	2023-06-19 16:35:27 +02:00
Thomas Zimmermann	de8a334f21	Merge drm/drm-next into drm-misc-next Backmerging into drm-misc-next to get commit `2c1c7ba457` ("drm/amdgpu: support partition drm devices"), which is required to fix commit `0adec22702` ("drm: Remove struct drm_driver.gem_prime_mmap"). Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2023-06-19 16:33:14 +02:00
Thomas Zimmermann	0adec22702	drm: Remove struct drm_driver.gem_prime_mmap All drivers initialize this field with drm_gem_prime_mmap(). Call the function directly and remove the field. Simplifies the code and resolves a long-standing TODO item. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230613150441.17720-3-tzimmermann@suse.de	2023-06-19 13:56:40 +02:00
Philip Yang	5d1c70bb6e	drm/amdgpu: Increase hmm range get pages timeout If hmm_range_fault returns -EBUSY, we should call hmm_range_fault again to validate the remaining pages. On one system with NUMA auto balancing enabled, hmm_range_fault takes 6 seconds for 1GB range because CPU migrate the range one page at a time. To be safe, increase timeout value to 1 second for 128MB range. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 17:55:41 -04:00
Philip Yang	d728eda3c5	drm/amdgpu: Enable translate further for GC v9.4.3 To extend UTCL2 reach. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 17:55:41 -04:00
Lijo Lazar	0e41639d9a	drm/amdgpu: Remove unused NBIO interface Set compute partition mode interface in NBIO is no longer used. Remove the only implementation from NBIO v7.9 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 14:33:56 -04:00
ZhenGuo Yin	71eaac368d	drm/amdgpu: add entity error check in amdgpu_ctx_get_entity [Why] UMD is not aware of entity error, and will keep submitting jobs into the error entity. [How] Add entity error check when getting entity from ctx. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	f88e295e90	drm/amdgpu: add VM generation token Instead of using the VRAM lost counter add a 64bit token which indicates if a context or job is still valid to use. Should the VRAM be lost or the page tables need re-creation the token will change indicating that userspace needs to act and re-create the contexts and re-submit the work. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	55bf196f60	drm/amdgpu: reset VM when an error is detected When some problem with the updates of page tables is detected reset the state machine of the VM and re-create all page tables from scratch. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	e84e697d92	drm/amdgpu: abort submissions during prepare on error Forward errors from previous submissions to this one. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	89fae8dc41	drm/amdgpu: mark soft recovered fences with -ENODATA Set the fence error code before trying to soft-recover it. It gets overwritten when a hard recovery is required. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	0a33b11d26	drm/amdgpu: mark force completed fences with -ECANCELED When we force complete fences we should mark them as canceled. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:55 -04:00
Christian König	b13eb02ba8	drm/amdgpu: add amdgpu_error_* debugfs file This allows us to insert some error codes into the bottom of the pipeline on an engine. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:54 -04:00
Alex Deucher	2eb841bdbc	drm/amdgpu: mark GC 9.4.3 experimental for now Mark as experimental for now until we get closer to production to avoid possible undesireable behavior when mixing newer boards with older kernels. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:37:54 -04:00
Lijo Lazar	b00f55374c	drm/amdgpu: Use PSP FW API for partition switch Use PSP firmware interface for switching compute partitions. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Lijo Lazar	fe381726c9	drm/amdgpu: Change nbio v7.9 xcp status definition PARTITION_MODE field in PARTITION_COMPUTE_STATUS register is defined as below by firmware. SPX = 0, DPX = 1, TPX = 2, QPX = 3, CPX = 4 Change driver definition accordingly. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Christian König	ca0b954a43	drm/amdgpu: make sure that BOs have a backing store It's perfectly possible that the BO is about to be destroyed and doesn't have a backing store associated with it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Christian König	e2ad8e2df4	drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory We need to grab the lock of the BO or otherwise can run into a crash when we try to inspect the current location. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Stanley.Yang	43aedbf4da	drm/amdgpu: Add checking mc_vram_size Do not compare injection address with mc_vram_size if mc_vram_size is zero. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Stanley.Yang	38298ce6fc	drm/amdgpu: Optimize checking ras supported Using "is_app_apu" to identify device in the native APU mode or carveout mode. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Candice Li	6fac3964a9	drm/amdgpu: Add channel_dis_num to ras init flags Add disabled channel number to ras init flags. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Candice Li	bcd9a5f8b9	drm/amdgpu: Update total channel number for umc v8_10 Update total channel number for umc v8_10. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:59 -04:00
Luben Tuikov	71344a718a	drm/amdgpu: Fix usage of UMC fill record in RAS The fixed commit listed in the Fixes tag below, introduced a bug in amdgpu_ras.c::amdgpu_reserve_page_direct(), in that when introducing the new amdgpu_umc_fill_error_record() and internally in that new function the physical address (argument "uint64_t retired_page"--wrong name) is right-shifted by AMDGPU_GPU_PAGE_SHIFT. Thus, in amdgpu_reserve_page_direct() when we pass "address" to that new function, we should NOT right-shift it, since this results, erroneously, in the page address to be 0 for first 2^(2*AMDGPU_GPU_PAGE_SHIFT) memory addresses. This commit fixes this bug. Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Fixes: `400013b268` ("drm/amdgpu: add umc_fill_error_record to make code more simple") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20230610113536.10621-1-luben.tuikov@amd.com Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:58 -04:00
Alex Deucher	e5df16d942	drm/amdgpu/sdma4: set align mask to 255 The wptr needs to be incremented at at least 64 dword intervals, use 256 to align with windows. This should fix potential hangs with unaligned updates. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:58 -04:00
Luben Tuikov	740f42a28f	drm/amdgpu: Report ras_num_recs in debugfs Report the number of records stored in the RAS EEPROM table in debugfs. This can be used by user-space to calculate the capacity of the RAS EEPROM table since "bad_page_cnt_threshold" is also reported in the same place in debugfs. See commit `7fb6407145` ("drm/amdgpu: Add bad_page_cnt_threshold to debugfs"). ras_num_recs can already be inferred by dumping the RAS EEPROM table, also in the same debugfs location, see commit reference `c65b0805e7` (drm/amdgpu: RAS EEPROM table is now in debugfs, 2021-04-08). This commit makes it an integer value easily shown in a single file. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: John Clements <john.clements@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20230603051043.211548-1-luben.tuikov@amd.com Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:58 -04:00
Lijo Lazar	82a1f42f6a	drm/amdgpu: Release SDMAv4.4.2 ecc irq properly Release ECC irq only if irq is enabled - only when RAS feature is enabled ECC irq gets enabled. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:58 -04:00
Likun Gao	d4a4ff1c8e	drm/amdgpu: add wait_for helper for spirom update Spirom update typically requires extremely long duration for command execution, and special helper function to wait for it completion. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:58 -04:00
Ma Jun	a1c23485b8	drm/amdgpu: Print client id for the unregistered interrupt resource Modify the debug information and print the clien id for these interrupts as well. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 11:06:57 -04:00
Arunpravin Paneer Selvam	59eddd4e21	Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar system" This reverts commit `c105518679`. This patch disables the TOPDOWN flag for APU and few dGPU cards which has the VRAM size equal to the BAR size. When we enable the TOPDOWN flag, we get the free blocks at the highest available memory region and we don't split the lower order blocks. This change is required to keep off the fragmentation related issues particularly in ASIC which has VRAM space <= 500MiB Hence, we are reverting this patch. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 10:43:36 -04:00
Shiwu Zhang	8d8ffe3740	drm/amdgpu: expose num_hops and num_links xgmi info through dev attr Add these two dev attrs for xgmi info details which is helpful for developers checking the xgmi topology by catting the sys file directly. Take 4 cards with xgmi connection as an example, get the num_hops for each device or node through xmig_hive_info dir like, cat /sys/bus/pci/devices/0000:41:00.0/xgmi_hive_info/node1/num_hops will return "00 41 41 41" where "00" stands for the hops to node1 itself and "41" is the hops in hex format to every other node in the same hive. There are node1/node2/node3/node4 representing 4 cards in the hive. The same for num_links dev attr. Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Acked-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 10:43:31 -04:00
Sonny Jiang	188d3f80fc	drm/amdgpu: vcn_4_0 set instance 0 init sched score to 1 Only vcn0 can process AV1 codecx. In order to use both vcn0 and vcn1 in h264/265 transcode to AV1 cases, set vcn0 sched score to 1 at initialization time. Signed-off-by: Sonny Jiang <sonjiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 10:43:26 -04:00
Hamza Mahfooz	8020f0f931	drm/amd/amdgpu: enable W=1 for amdgpu We have a clean build with W=1 as of commit `c168feed5d` ("drm/amd/display/amdgpu_dm/amdgpu_dm_helpers: Move SYNAPTICS_DEVICE_ID into CONFIG_DRM_AMD_DC_DCN ifdef"). So, let's enable these checks unconditionally for the entire module to catch these errors during development. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 10:42:42 -04:00
Srinivasan Shanmugam	e06da81749	drm/amdgpu: Fix kdoc warning Fixes the following gcc with W=1: drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c:76: warning: This comment starts with '/*', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst EEPROM Table structure v1 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c:98: warning: This comment starts with '/*', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst EEPROM Table structrue v2.1 Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 10:42:39 -04:00
Mukul Joshi	41ce6d6d03	drm/amdgpu: Rename DRM schedulers in amdgpu TTM Rename mman.entity to mman.high_pr to make the distinction clearer that this is a high priority scheduler. Similarly, rename the recently added mman.delayed to mman.low_pr to make it clear it is a low priority scheduler. No functional change in this patch. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-15 10:42:33 -04:00
Dave Airlie	901bdf5ea1	Merge tag 'amd-drm-next-6.5-2023-06-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.5-2023-06-02: amdgpu: - SR-IOV fixes - Warning fixes - Misc code cleanups and spelling fixes - DCN 3.2 updates - Improved DC FAMS support for better power management - Improved DC SubVP support for better power management - DCN 3.1.x fixes - Max IB size query - DC GPU reset fixes - RAS updates - DCN 3.0.x fixes - S/G display fixes - CP shadow buffer support - Implement connector force callback - Z8 power improvements - PSP 13.0.10 vbflash support - Mode2 reset fixes - Store MQDs in VRAM to improve queue switch latency - VCN 3.x fixes - JPEG 3.x fixes - Enable DC_FP on LoongArch - GFXOFF fixes - GC 9.4.3 partition support - SDMA 4.4.2 partition support - VCN/JPEG 4.0.3 partition support - VCN 4.0.3 updates - NBIO 7.9 updates - GC 9.4.3 updates - Take NUMA into account when allocating memory - Handle NUMA for partitions - SMU 13.0.6 updates - GC 9.4.3 RAS updates - Stop including unused swiotlb.h - SMU 13.0.7 fixes - Fix clock output ordering on some APUs - Clean up DC FPGA code - GFX9 preemption fixes - Misc irq fixes - S0ix fixes - Add new DRM_AMDGPU_WERROR config parameter to help with CI - PCIe fix for RDNA2 - kdoc fixes - Documentation updates amdkfd: - Query TTM mem limit rather than hardcoding it - GC 9.4.3 partition support - Handle NUMA for partitions radeon: - Fix possible double free - Stop including unused swiotlb.h - Fix possible division by zero ttm: - Add query for TTM mem limit - Add NUMA awareness to pools - Export ttm_pool_fini() UAPI: - Add new ctx query flag to better handle GPU resets Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22290 - Add new interface to query and set shadow buffer for RDNA3 Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21986 - Add new INFO query for max IB size Proposed userspace: https://gitlab.freedesktop.org/bnieuwenhuizen/mesa/-/commits/ib-rejection-v3 amd-drm-next-6.5-2023-06-09: amdgpu: - S0ix fixes - Initial SMU13 Overdrive support - kdoc fixes - Misc clode cleanups - Flexible array fixes - Display OTG fixes - SMU 13.0.6 updates - Revert some broken clock counter updates - Misc display fixes - GFX9 preemption fixes - Add support for newer EEPROM bad page table format - Add missing radeon secondary id - Add support for new colorspace KMS API - CSA fix - Stable pstate fixes for APUs - make vbl interface admin only - Handle PCI accelerator class amdkfd: - Add debugger support for gdb radeon: - Fix possible UAF drm: - Add Colorspace functionality UAPI: - Add debugger interface for enabling gdb Proposed userspace: https://github.com/ROCm-Developer-Tools/ROCdbgapi/tree/wip-dbgapi - Add KMS colorspace API Discussion: https://lists.freedesktop.org/archives/dri-devel/2023-June/408128.html From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230609174817.7764-1-alexander.deucher@amd.com	2023-06-15 14:11:22 +10:00
Arunpravin Paneer Selvam	34e5a54327	Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar system" This reverts commit `c105518679`. This patch disables the TOPDOWN flag for APU and few dGPU cards which has the VRAM size equal to the BAR size. When we enable the TOPDOWN flag, we get the free blocks at the highest available memory region and we don't split the lower order blocks. This change is required to keep off the fragmentation related issues particularly in ASIC which has VRAM space <= 500MiB Hence, we are reverting this patch. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-06-13 17:12:41 -04:00
Sonny Jiang	9db5ec1ceb	drm/amdgpu: vcn_4_0 set instance 0 init sched score to 1 Only vcn0 can process AV1 codecx. In order to use both vcn0 and vcn1 in h264/265 transcode to AV1 cases, set vcn0 sched score to 1 at initialization time. Signed-off-by: Sonny Jiang <sonjiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x	2023-06-13 17:08:42 -04:00
Mario Limonciello	7ab1a4913d	drm/amd: Tighten permissions on VBIOS flashing attributes Non-root users shouldn't be able to try to trigger a VBIOS flash or query the flashing status. This should be reserved for users with the appropriate permissions. Cc: stable@vger.kernel.org Fixes: `8424f2ccb3` ("drm/amdgpu/psp: Add vbflash sysfs interface support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-13 17:04:51 -04:00
Mario Limonciello	3eb1a3a040	drm/amd: Make sure image is written to trigger VBIOS image update flow The VBIOS image update flow requires userspace to: 1) Write the image to `psp_vbflash` 2) Read `psp_vbflash` 3) Poll `psp_vbflash_status` to check for completion If userspace reads `psp_vbflash` before writing an image, it's possible that it causes problems that can put the dGPU into an invalid state. Explicitly check that an image has been written before letting a read succeed. Cc: stable@vger.kernel.org Fixes: `8424f2ccb3` ("drm/amdgpu/psp: Add vbflash sysfs interface support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-13 17:04:45 -04:00
Alex Deucher	e61f67749b	drm/amdgpu: add missing radeon secondary PCI ID 0x5b70 is a missing RV370 secondary id. Add it so we don't try and probe it with amdgpu. Cc: michel@daenzer.net Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Tested-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-06-13 17:02:51 -04:00
Jiadong Zhu	5b711e7f9c	drm/amdgpu: Implement gfx9 patch functions for resubmission Patch the packages including CONTEXT_CONTROL and WRITE_DATA for gfx9 during the resubmission scenario. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x	2023-06-13 16:58:23 -04:00
Jiadong Zhu	87af86ae89	drm/amdgpu: Modify indirect buffer packages for resubmission When the preempted IB frame resubmitted to cp, we need to modify the frame data including: 1. set PRE_RESUME 1 in CONTEXT_CONTROL. 2. use meta data(DE and CE) read from CSA in WRITE_DATA. Add functions to save the location the first time IBs emitted and callback to patch the package when resubmission happens. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x	2023-06-13 16:57:44 -04:00
Jiadong Zhu	94034b306d	drm/amdgpu: Program gds backup address as zero if no gds allocated It is firmware requirement to set gds_backup_addrlo and gds_backup_addrhi of DE meta both zero if no gds partition is allocated for the frame. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x	2023-06-13 16:57:14 -04:00
Jiadong Zhu	1dbcf770cc	drm/amdgpu: Reset CP_VMID_PREEMPT after trailing fence signaled When MEC executes unmap_queue for mid command buffer preemption, it will kick the write pointer of the gfx ring, set CP_VMID_PREEMPT to trigger the preemption and wait for CP_VMID_PREEMPT becomes zero after the preemption done. There is a race condition that PFP may excute the resetting command before MEC set CP_VMID_PREEMPT. As a result, hang happens as CP_VMID_PREEMPT is always 0xffff. To avoid this, we send resetting CP_VMID_PREEMPT command after the trailing fence is siganled and update gfx write pointer explicitly. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.3.x Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2535	2023-06-13 16:51:05 -04:00
Chong Li	80e709ee6e	drm/amdgpu: add option params to enforce process isolation between graphics and compute enforce process isolation between graphics and compute via using the same reserved vmid. v2: remove params "struct amdgpu_vm *vm" from amdgpu_vmid_alloc_reserved and amdgpu_vmid_free_reserved. Signed-off-by: Chong Li <chongli2@amd.com> Reviewed-by: Christian Koenig <Christian.Koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:49:48 -04:00
Nathan Chancellor	4e70da985c	drm/amdgpu: Wrap -Wunused-but-set-variable in cc-option -Wunused-but-set-variable was only supported in clang starting with 13.0.0, so earlier versions will emit a warning, which is turned into a hard error for the kernel to mirror GCC: error: unknown warning option '-Wunused-but-set-variable'; did you mean '-Wunused-const-variable'? [-Werror,-Wunknown-warning-option] The minimum supported version of clang for building the kernel is 11.0.0, so match the rest of the kernel and wrap -Wunused-but-set-variable in a cc-option call, so that it is only used when supported by the compiler. Closes: https://github.com/ClangBuiltLinux/linux/issues/1869 Fixes: `1b320ad3f5` ("drm/amd/amdgpu: introduce DRM_AMDGPU_WERROR") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:49:06 -04:00
Shiwu Zhang	9d65b1b4bc	drm/amdgpu: add the accelerator PCIe class Add the accelerator PCIe class and match the class in amdgpu for 0x1002 devices of that class. From PCI spec: "PCI Code and ID Assignment, r1.9, sec 1, 1.19" Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_ids.h Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:48:57 -04:00
Jonathan Kim	09d49e14ea	drm/amdkfd: fix and enable debugging for gfx11 There are a couple of fixes required to enable gfx11 debugging. First, ADD_QUEUE.trap_en is an inappropriate place to toggle a per-process register so move it to SET_SHADER_DEBUGGER.trap_en. When ADD_QUEUE.skip_process_ctx_clear is set, MES will prioritize the SET_SHADER_DEBUGGER.trap_en setting. Second, to preserve correct save/restore priviledged wave states in coordination with the trap enablement setting, resume suspended waves early in the disable call. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:48:19 -04:00
Mario Limonciello	fe56c6ee04	drm/amd: Tighten permissions on VBIOS flashing attributes Non-root users shouldn't be able to try to trigger a VBIOS flash or query the flashing status. This should be reserved for users with the appropriate permissions. Cc: stable@vger.kernel.org Fixes: `8424f2ccb3` ("drm/amdgpu/psp: Add vbflash sysfs interface support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:48:10 -04:00
Mario Limonciello	3537d6a48c	drm/amd: Make sure image is written to trigger VBIOS image update flow The VBIOS image update flow requires userspace to: 1) Write the image to `psp_vbflash` 2) Read `psp_vbflash` 3) Poll `psp_vbflash_status` to check for completion If userspace reads `psp_vbflash` before writing an image, it's possible that it causes problems that can put the dGPU into an invalid state. Explicitly check that an image has been written before letting a read succeed. Cc: stable@vger.kernel.org Fixes: `8424f2ccb3` ("drm/amdgpu/psp: Add vbflash sysfs interface support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:47:50 -04:00
Yang Wang	cab69d36cc	drm/amdgpu: skip to resume rlcg for gc 9.4.3 in vf side skip to resume rlcg, because rlcg is already enabled in pf side. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:47:35 -04:00
Yang Wang	731b48463b	drm/amdgpu: disable virtual display support on APP device virtual display is not support on APP device. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Gavin Wan <gavin.wan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:47:31 -04:00
Lang Yu	5daff15cd0	drm/amdgpu: unmap and remove csa_va properly Root PD BO should be reserved before unmap and remove a bo_va from VM otherwise lockdep will complain. v2: check fpriv->csa_va is not NULL instead of amdgpu_mcbp (christian) [14616.936827] WARNING: CPU: 6 PID: 1711 at drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1762 amdgpu_vm_bo_del+0x399/0x3f0 [amdgpu] [14616.937096] Call Trace: [14616.937097] <TASK> [14616.937102] amdgpu_driver_postclose_kms+0x249/0x2f0 [amdgpu] [14616.937187] drm_file_free+0x1d6/0x300 [drm] [14616.937207] drm_close_helper.isra.0+0x62/0x70 [drm] [14616.937220] drm_release+0x5e/0x100 [drm] [14616.937234] __fput+0x9f/0x280 [14616.937239] ____fput+0xe/0x20 [14616.937241] task_work_run+0x61/0x90 [14616.937246] exit_to_user_mode_prepare+0x215/0x220 [14616.937251] syscall_exit_to_user_mode+0x2a/0x60 [14616.937254] do_syscall_64+0x48/0x90 [14616.937257] entry_SYSCALL_64_after_hwframe+0x63/0xcd Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:47:26 -04:00
Alex Deucher	c1ac2ea802	drm/amdgpu: add missing radeon secondary PCI ID 0x5b70 is a missing RV370 secondary id. Add it so we don't try and probe it with amdgpu. Cc: michel@daenzer.net Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Tested-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:44:49 -04:00
Stanley.Yang	0bc3137b21	drm/amdgpu: Set EEPROM ras info Set EEPROM ras info: rma status, health percent and bad page threshold. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:44:40 -04:00
Stanley.Yang	7c2551fa1d	drm/amdgpu: Calculate EEPROM table ras info bytes sum It's more reasonable to check EEPROM table ras info bytes. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:44:38 -04:00
Stanley.Yang	7f599fed3b	drm/amdgpu: Add support EEPROM table v2.1 Add ras info to EEPROM table, app can analyse device ECC status without GPU driver through EEPROM table ras info. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:44:34 -04:00
Stanley.Yang	b573cf88c0	drm/amdgpu: Support setting EEPROM table version Add setting EEPROM table version interface for umcv8.10, Add EEPROM table v2.1 to UMC v8.10. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:44:31 -04:00
Stanley.Yang	65183faec8	drm/amdgpu: Add RAS table v2.1 macro definition Add RAS EEPROM table version 2.1 macro definition. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:44:28 -04:00
Stanley.Yang	71c79a1960	drm/amdgpu: Rename ras table version Rename RAS_TABLE_VER to RAS_TABLE_VER_V1, move RAS_TABLE_VER_V1 from amdgpu_ras_eeprom.c to amdgpu_ras_eeprom.h. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:44:24 -04:00
Jiadong Zhu	ea791e704b	drm/amdgpu: Implement gfx9 patch functions for resubmission Patch the packages including CONTEXT_CONTROL and WRITE_DATA for gfx9 during the resubmission scenario. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:44:22 -04:00
Jiadong Zhu	8ff865be93	drm/amdgpu: Modify indirect buffer packages for resubmission When the preempted IB frame resubmitted to cp, we need to modify the frame data including: 1. set PRE_RESUME 1 in CONTEXT_CONTROL. 2. use meta data(DE and CE) read from CSA in WRITE_DATA. Add functions to save the location the first time IBs emitted and callback to patch the package when resubmission happens. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:44:15 -04:00
Emily Deng	f2bcc0c7db	drm/amdgpu/mmsch: Correct the definition for mmsch init header For the header, it is version related, shouldn't use MAX_VCN_INSTANCES. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:44:12 -04:00
YiPeng Chai	869bcf59fd	drm/amdgpu: change reserved vram info print The link object of mgr->reserved_pages is the blocks variable in struct amdgpu_vram_reservation, not the link variable in struct drm_buddy_block. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-06-09 12:41:14 -04:00
Chia-I Wu	5a863904ba	drm/amdgpu: fix xclk freq on CHIP_STONEY According to Alex, most APUs from that time seem to have the same issue (vbios says 48Mhz, actual is 100Mhz). I only have a CHIP_STONEY so I limit the fixup to CHIP_STONEY Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:41:11 -04:00
Alex Deucher	5a03159ab7	Revert "drm/amdgpu: switch to golden tsc registers for raven/raven2" This reverts commit `f03eb1d26c`. This results in inconsistent timing reported via asynchronous GPU queries. Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html Cc: Jesse.Zhang@amd.com Cc: michel@daenzer.net Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:40:56 -04:00
Alex Deucher	f9bfc9fff2	Revert "drm/amdgpu: Differentiate between Raven2 and Raven/Picasso according to revision id" This reverts commit `9d2d1827af`. This results in inconsistent timing reported via asynchronous GPU queries. Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html Cc: Jesse.Zhang@amd.com Cc: michel@daenzer.net Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:40:17 -04:00
Alex Deucher	a15a77c8e6	Revert "drm/amdgpu: change the reference clock for raven/raven2" This reverts commit `fbc24293ca`. This results in inconsistent timing reported via asynchronous GPU queries. Link: https://lists.freedesktop.org/archives/amd-gfx/2023-May/093731.html Cc: Jesse.Zhang@amd.com Cc: michel@daenzer.net Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:40:01 -04:00
Stanley.Yang	3898c8fc42	drm/amdgpu: convert vcn/jpeg logical mask to physical mask Changed from V1: Remove amdgpu_ras_logical_mask_to_physical_mask due to GET_MASK provides same feature. Support convert VCN/JPEG logical mask to physical mask. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:39:58 -04:00
Stanley.Yang	e3959cb547	drm/amdgpu: support check vcn jpeg block mask Support VCN/JPEG instance mask checking, pass logical mask directly except GFX/SDMA/VCN/JPEG blocks. Changed from V1: correct a typo Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:39:54 -04:00
Stanley.Yang	61a7c16239	drm/amdgpu: pass xcc mask to ras ta pass xcc mask to ras ta, ras ta will compare the mask with the one from chiplet topology. Changed from V1: Remove IP version checking. Set ras_cmd->ras_init_message.init_flags.xcc_mask directly due to xcc_mask is common structres to all the devices. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:39:51 -04:00
Srinivasan Shanmugam	25c30a12d7	drm/amdgpu: Mark 'kgd_gfx_aldebaran_clear_address_watch' & 'kgd_gfx_v11_clear_address_watch' functions as static Below two functions cause a warning because they lack a prototype: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c:164:10: warning: no previous prototype for ‘kgd_gfx_aldebaran_clear_address_watch’ [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c:782:10: warning: no previous prototype for ‘kgd_gfx_v11_clear_address_watch’ [-Wmissing-prototypes] There are no callers from other files, so just mark them as 'static'. Also fixes the following checks: CHECK: Alignment should match open parenthesis +static uint32_t kgd_gfx_aldebaran_clear_address_watch(struct amdgpu_device adev, uint32_t watch_id) CHECK: Alignment should match open parenthesis +static uint32_t kgd_gfx_v11_clear_address_watch(struct amdgpu_device adev, uint32_t watch_id) Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:39:36 -04:00
Chia-I Wu	9f0bcf49e9	amdgpu: validate offset_in_bo of drm_amdgpu_gem_va This is motivated by OOB access in amdgpu_vm_update_range when offset_in_bo+map_size overflows. v2: keep the validations in amdgpu_vm_bo_map v3: add the validations to amdgpu_vm_bo_map/amdgpu_vm_bo_replace_map rather than to amdgpu_gem_va_ioctl Fixes: `9f7eb5367d` ("drm/amdgpu: actually use the VM map parameters") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:37:56 -04:00
Jonathan Kim	9bd443cb74	drm/amdgpu: fix debug wait on idle for gfx9.4.1 Wait calls for amd_ip_block_type not amd_hw_ip_block_type. Reported-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:37:50 -04:00
Jonathan Kim	e0f85f4690	drm/amdkfd: add debug set and clear address watch points operation Shader read, write and atomic memory operations can be alerted to the debugger as an address watch exception. Allow the debugger to pass in a watch point to a particular memory address per device. Note that there exists only 4 watch points per devices to date, so have the KFD keep track of what watch points are allocated or not. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:36:46 -04:00
Jonathan Kim	a70a93fa56	drm/amdkfd: add debug suspend and resume process queues operation In order to inspect waves from the saved context at any point during a debug session, the debugger must be able to preempt queues to trigger context save by suspending them. On queue suspend, the KFD will copy the context save header information so that the debugger can correctly crawl the appropriate size of the saved context. The debugger must then also be allowed to resume suspended queues. A queue that is newly created cannot be suspended because queue ids are recycled after destruction so the debugger needs to know that this has occurred. Query functions will be later added that will clear a given queue of its new queue status. A queue cannot be destroyed while it is suspended to preserve its saved context during debugger inspection. Have queue destruction block while a queue is suspended and unblocked when it is resumed. Likewise, if a queue is about to be destroyed, it cannot be suspended. Return the number of queues successfully suspended or resumed along with a per queue status array where the upper bits per queue status show that the request was invalid (new/destroyed queue suspend request, missing queue) or an error occurred (HWS in a fatal state so it can't suspend or resume queues). Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:36:43 -04:00
Jonathan Kim	aea1b4738b	drm/amdkfd: add debug wave launch mode operation Allow the debugger to set wave behaviour on to either normally operate, halt at launch, trap on every instruction, terminate immediately or stall on allocation. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:36:39 -04:00
Jonathan Kim	101827e130	drm/amdkfd: add debug wave launch override operation This operation allows the debugger to override the enabled HW exceptions on the device. On debug devices that only support the debugging of a single process, the HW exceptions are global and set through the SPI_GDBG_TRAP_MASK register. Because they are global, only address watch exceptions are allowed to be enabled. In other words, the debugger must preserve all non-address watch exception states in normal mode operation by barring a full replacement override or a non-address watch override request. For multi-process debugging, all HW exception overrides are per-VMID so all exceptions can be overridden or fully replaced. In order for the debugger to know what is permissible, returned the supported override mask back to the debugger along with the previously enable overrides. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:36:37 -04:00
Jonathan Kim	12fb1ad70d	drm/amdkfd: update process interrupt handling for debug events The debugger must be notified by any debugger subscribed exception that comes from hardware interrupts. If a debugger session exits, any exceptions it subscribed to may still have interrupts in the interrupt ring buffer or KGD/KFD pipeline. To prevent a new session from inheriting stale interrupts, when a new queue is created, open an interrupt drain and allow the IH ring to drain from a timestamped checkpoint. Then inject a custom IV so that once the custom IV is picked up by the KFD, it's safe to close the drain and proceed with queue creation. The drain must also be on debug disable as SW interrupts may still be processed. Drain at this time and clear all the exception status. The debugger may also not be attached nor subscibed to certain exceptions so forward them directly to the runtime. GFX10 also requires its own IV processing, hence the creation of kfd_int_process_v10.c. This is because the IV from SQ interrupts are packed into a new continguous format unlike GFX9. To make this clear, a separate interrupting handling code file was created. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:36:17 -04:00
Jonathan Kim	69a8c3ae2d	drm/amdkfd: apply trap workaround for gfx11 Due to a HW bug, waves in only half the shader arrays can enter trap. When starting a debug session, relocate all waves to the first shader array of each shader engine and mask off the 2nd shader array as unavailable. When ending a debug session, re-enable the 2nd shader array per shader engine. User CU masking per queue cannot be guaranteed to remain functional if requested during debugging (e.g. user cu mask requests only 2nd shader array as an available resource leading to zero HW resources available) nor can runtime be alerted of any of these changes during execution. Make user CU masking and debugging mutual exclusive with respect to availability. If the debugger tries to attach to a process with a user cu masked queue, return the runtime status as enabled but busy. If the debugger tries to attach and fails to reallocate queue waves to the first shader array of each shader engine, return the runtime status as enabled but with an error. In addition, like any other mutli-process debug supported devices, disable trap temporary setup per-process to avoid performance impact from setup overhead. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:35:52 -04:00
Jonathan Kim	a9818854ea	drm/amdgpu: expose debug api for mes Similar to the F32 HWS, the RS64 HWS for GFX11 now supports a multi-process debug API. The skip_process_ctx_clear ADD_QUEUE requirement is to prevent the MES from clearing the process context when the first queue is added to the scheduler in order to maintain debug mode settings during queue preemption and restore. The MES clears the process context in this case due to an unresolved FW caching bug during normal mode operations. During debug mode, the KFD will hold a reference to the target process so the process context should never go stale and MES can afford to skip this requirement. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:35:43 -04:00
Jonathan Kim	7cee6a6824	drm/amdgpu: add configurable grace period for unmap queues The HWS schedule allows a grace period for wave completion prior to preemption for better performance by avoiding CWSR on waves that can potentially complete quickly. The debugger, on the other hand, will want to inspect wave status immediately after it actively triggers preemption (a suspend function to be provided). To minimize latency between preemption and debugger wave inspection, allow immediate preemption by setting the grace period to 0. Note that setting the preepmtion grace period to 0 will result in an infinite grace period being set due to a CP FW bug so set it to 1 for now. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:35:31 -04:00
Jonathan Kim	33f3437ae1	drm/amdgpu: add gfx11 hw debug mode enable and disable calls Implement the per-device calls to enable or disable HW debug mode for GFX11. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:35:28 -04:00
Jonathan Kim	be6f94039e	drm/amdgpu: add gfx9.4.2 hw debug mode enable and disable calls GFX9.4.2 now supports per-VMID debug mode controls registers (SPI_GDBG_PER_VMID_CNTL). Because the KFD lets the HWS handle PASID-VMID mapping, the KFD will forward all debug mode setting register writes to the HWS scheduler using a new MAP_PROCESS API, so instead of writing to registers, return the required register values that the HWS needs to write on debug enable and disable. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:35:25 -04:00
Jonathan Kim	d13f050fee	drm/amdgpu: add gfx10 hw debug mode enable and disable calls Similar to GFX9 debug devices, set the hardware debug mode by draining the SPI appropriately prior the mode setting request. Because GFX10 has waves allocated by the work group boundary and each SE's SPI instances do not communicate, the SPI drain time is much longer. This long drain time will be fixed for GFX11 onwards. Also remove a bunch of deprecated misplaced references for GFX10.3. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:35:21 -04:00
Jonathan Kim	01f648202c	drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls On GFX9.4.1, the implicit wait count instruction on s_barrier is disabled by default in the driver during normal operation for performance requirements. There is a hardware bug in GFX9.4.1 where if the implicit wait count instruction after an s_barrier instruction is disabled, any wave that hits an exception may step over the s_barrier when returning from the trap handler with the barrier logic having no ability to be aware of this, thereby causing other waves to wait at the barrier indefinitely resulting in a shader hang. This bug has been corrected for GFX9.4.2 and onward. Since the debugger subscribes to hardware exceptions, in order to avoid this bug, the debugger must enable implicit wait count on s_barrier for a debug session and disable it on detach. In order to change this setting in the in the device global SQ_CONFIG register, the GFX pipeline must be idle. GFX9.4.1 as a compute device will either dispatch work through the compute ring buffers used for image post processing or through the hardware scheduler by the KFD. Have the KGD suspend and drain the compute ring buffer, then suspend the hardware scheduler and block any future KFD process job requests before changing the implicit wait count setting. Once set, resume all work. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:35:15 -04:00
Jonathan Kim	cde2e087a3	drm/amdgpu: add gfx9 hw debug mode enable and disable calls Implement the per-device calls to enable or disable HW debug mode for GFX9 prior to GFX9.4.1. GFX9.4.1 and onward will require their own enable/disable sequence as follow on patches. When hardware debug mode setting is requested, waves will inherit these settings in the Shader Processor Input's (SPI) Sequencer Global Block (SQG). This means that the KGD must drain all waves from the SPI into SQG (approximately 96 SPI clock cycles) prior to debug mode setting to ensure that the order of operations that the debugger expects with regards to debug mode setting transaction requests and wave inheritence of that mode is upheld. Also ensure that exception overrides are reset to their original state prior to debug enable or disable. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:35:11 -04:00
Mario Limonciello	257d7b7be2	drm/amd: Make lack of `ACPI_FADT_LOW_POWER_S0` or `CONFIG_AMD_PMC` louder during suspend path Users have reported that s2idle wasn't working on OEM Phoenix systems, but it was root caused to be because `CONFIG_AMD_PMC` wasn't set in the distribution kernel config. To make this more apparent, raise the messaging to err instead of warn. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217497 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:34:59 -04:00
Jonathan Kim	4504f14338	drm/amdgpu: setup hw debug registers on driver initialization Add missing debug trap registers references and initialize all debug registers on boot by clearing the hardware exception overrides and the wave allocation ID index. The debugger requires that TTMPs 6 & 7 save the dispatch ID to map waves onto dispatch during compute context inspection. In order to correctly set this up, set the special reserved CP bit by default whenever the MQD is initailized. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:34:56 -04:00
Horatio Zhang	cbb63eccc0	drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vram Use the function of amdgpu_bo_vm_destroy to handle the resource release of shadow bo. During the amdgpu_mes_self_test, shadow bo released, but vmbo->shadow_list was not, which caused a null pointer reference error in amdgpu_device_recover_vram when GPU reset. Fixes: `6c032c37ac` ("drm/amdgpu: Fix vram recover doesn't work after whole GPU reset (v2)") Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Acked-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:34:02 -04:00
Srinivasan Shanmugam	2e9fee9b8e	drm/amdgpu: Fix up kdoc in amdgpu_device.c Fix these warnings by deleting the deviant arguments. gcc with W=1 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:799: warning: Excess function parameter 'pcie_index' description in 'amdgpu_device_indirect_wreg' drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:799: warning: Excess function parameter 'pcie_data' description in 'amdgpu_device_indirect_wreg' drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:870: warning: Excess function parameter 'pcie_index' description in 'amdgpu_device_indirect_wreg64' drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:870: warning: Excess function parameter 'pcie_data' description in 'amdgpu_device_indirect_wreg64' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:33:55 -04:00
Srinivasan Shanmugam	ebe884e8b9	drm/amdgpu: Fix up kdoc 'ring' parameter in sdma_v6_0_ring_pad_ib Fix this warning by adding 'ring' arguments to kdoc. gcc with W=1 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1128: warning: Function parameter or member 'ring' not described in 'sdma_v6_0_ring_pad_ib' Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:33:50 -04:00
Hamza Mahfooz	1b320ad3f5	drm/amd/amdgpu: introduce DRM_AMDGPU_WERROR We want to do -Werror builds on our CI. However, non-amdgpu breakages have prevented us from doing so thus far. Also, there are a number of additional checks that we should enable, that the community cares about and are hidden behind -Wextra. So, define DRM_AMDGPU_WERROR to only enable -Werror for the amdgpu kernel module and enable -Wextra while disabling all of the checks that are too noisy. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Kenny Ho <kenny.ho@amd.com> Suggested-by: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Kenny Ho <Kenny.Ho@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:33:42 -04:00
Mario Limonciello	09521b5d49	drm/amd: Disallow s0ix without BIOS support again commit `cf488dcd0a` ("drm/amd: Allow s0ix without BIOS support") showed improvements to power consumption over suspend when s0ix wasn't enabled in BIOS and the system didn't support S3. This patch however was misguided because the reason the system didn't support S3 was because SMT was disabled in OEM BIOS setup. This prevented the BIOS from allowing S3. Also allowing GPUs to use the s2idle path actually causes problems if they're invoked on systems that may not support s2idle in the platform firmware. `systemd` has a tendency to try to use `s2idle` if `deep` fails for any reason, which could lead to unexpected flows. The original commit also fixed a problem during resume from suspend to idle without hardware support, but this is no longer necessary with commit `ca47518663` ("drm/amd: Don't allow s0ix on APUs older than Raven") Revert commit `cf488dcd0a` ("drm/amd: Allow s0ix without BIOS support") to make it match the expected behavior again. Cc: Rafael Ávila de Espíndola <rafael@espindo.la> Link: https://github.com/torvalds/linux/blob/v6.1/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c#L1060 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2599 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:33:12 -04:00
ZhenGuo Yin	7a66ad6c08	drm/amdgpu: set finished fence error if job timedout Set finished fence to ETIME error if job timedout. Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:32:43 -04:00
Srinivasan Shanmugam	932fc49479	drm/amdgpu: Fix missing parameter desc for 'xcp_id' in amdgpu_amdkfd_reserve_mem_limit Fix these warnings by adding 'xcp_id' argument. gcc with W=1 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:160: warning: Function parameter or member 'xcp_id' not described in 'amdgpu_amdkfd_reserve_mem_limit' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:32:40 -04:00
Srinivasan Shanmugam	1bae03aab2	drm/amdgpu: Fix up missing parameter in kdoc for 'inst' in gmc_ v7, v8, v9, v10, v11.c Fix these warnings by adding 'inst' arguments to kdocs. gcc with W=1 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:428: warning: Function parameter or member 'inst' not described in 'gmc_v7_0_flush_gpu_tlb_pasid' drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:626: warning: Function parameter or member 'inst' not described in 'gmc_v8_0_flush_gpu_tlb_pasid' drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c:423: warning: Function parameter or member 'inst' not described in 'gmc_v10_0_flush_gpu_tlb_pasid' drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:328: warning: Function parameter or member 'inst' not described in 'gmc_v11_0_flush_gpu_tlb_pasid' drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:950: warning: Function parameter or member 'inst' not described in 'gmc_v9_0_flush_gpu_tlb_pasid' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:32:35 -04:00
Srinivasan Shanmugam	3eeb0d037a	drm/amdgpu: Fix up missing kdoc parameter 'inst' in get_wave_count() & kgd_gfx_v9_get_cu_occupancy() Fix these warnings by adding 'inst' arguments to kdocs. gcc with W=1 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c:692: warning: Function parameter or member 'inst' not described in 'get_wave_count' drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c:763: warning: Function parameter or member 'inst' not described in 'kgd_gfx_v9_get_cu_occupancy' Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:32:32 -04:00
Srinivasan Shanmugam	ca2943fe0a	drm/amdgpu: Fix missing parameter desc for 'xcc_id' in gfx_v7_0.c & amdgpu_rlc.c Fix these warnings by adding 'xcc_id' arguments. gcc with W=1 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c:1557: warning: Function parameter or member 'xcc_id' not described in 'gfx_v7_0_select_se_sh' drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c:38: warning: Function parameter or member 'xcc_id' not described in 'amdgpu_gfx_rlc_enter_safe_mode' drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c:62: warning: Function parameter or member 'xcc_id' not described in 'amdgpu_gfx_rlc_exit_safe_mode' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:32:30 -04:00
Lijo Lazar	c6a64ad9b7	drm/amdgpu: Initialize xcc mask For ASICs which are not initialized through discovery, initialize GFX cluster as 1. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:32:21 -04:00
Srinivasan Shanmugam	837d4e071d	drm/amdgpu: Fix create_dmamap_sg_bo kdoc warnings Fix the following gcc with W=1: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:292: warning: Cannot understand * @create_dmamap_sg_bo: Creates a amdgpu_bo object to reflect information Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:32:12 -04:00
Srinivasan Shanmugam	9eba1b8b70	drm/amdgpu: Fix up missing kdoc in sdma_v6_0.c Address a bunch of kdoc warnings: gcc with W=1 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or member 'job' not described in 'sdma_v6_0_ring_emit_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or member 'flags' not described in 'sdma_v6_0_ring_emit_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:946: warning: Function parameter or member 'timeout' not described in 'sdma_v6_0_ring_test_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1125: warning: Function parameter or member 'ring' not described in 'sdma_v6_0_ring_pad_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1176: warning: Function parameter or member 'vmid' not described in 'sdma_v6_0_ring_emit_vm_flush' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1176: warning: Function parameter or member 'pd_addr' not described in 'sdma_v6_0_ring_emit_vm_flush' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:31:29 -04:00
Srinivasan Shanmugam	66dadf1ab1	drm/amdgpu: Fix up kdoc in amdgpu_acpi.c Fix these warnings by adding & deleting the deviant arguments. gcc with W=1 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Function parameter or member 'numa_info' not described in 'amdgpu_acpi_get_node_id' drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Excess function parameter 'nid' description in 'amdgpu_acpi_get_node_id' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:31:26 -04:00
Srinivasan Shanmugam	23616d1ff3	drm/amdgpu: Fix up kdoc in sdma_v4_4_2.c Address a bunch of kdoc warnings: gcc with W=1 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:426: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_gfx_stop' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:457: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_rlc_stop' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:470: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_page_stop' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:506: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_ctx_switch_enable' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:794: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_rlc_resume' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:810: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_load_microcode' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:854: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_start' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:31:23 -04:00
Ikshwaku Chauhan	3a25071a97	drm/amdgpu: enable tmz by default for GC 11.0.1 Add IP GC 11.0.1 in the list of target to have tmz enabled by default. Signed-off-by: Ikshwaku Chauhan <ikshwaku.chauhan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:31:20 -04:00
Guchun Chen	8ffd6f0442	drm/amdgpu: keep irq count in amdgpu_irq_disable_all This can clean up all irq warnings because of unbalanced amdgpu_irq_get/put when unplugging/unbinding device, and leave irq count decrease in each ip fini function. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 12:31:12 -04:00
Dan Carpenter	9c9d501b28	drm/amd/amdgpu: Fix up locking etc in amdgpu_debugfs_gprwave_ioctl() There are two bugs here. 1) Drop the lock if copy_from_user() fails. 2) If the copy fails then the correct error code is -EFAULT instead of -EINVAL. I also broke up the long line and changed "sizeof rd->id" to "sizeof(rd->id)". Fixes: `553f973a0d` ("drm/amd/amdgpu: Update debugfs for XCC support (v3)") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:10:04 -04:00
James Zhu	9938333a46	drm/amdgpu: use amdxcp platform device as spatial partition Use amdxcp platform device as spatial partition device. -v2: remove unused variable Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:08:58 -04:00
Srinivasan Shanmugam	3034983db3	drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused Silencing the compiler from below compilation error: drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration] static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { ^ 1 error generated. Mark the variable as __maybe_unused to make it clear to clang that this is expected, so there is no more warning. Cc: Christian König <christian.koenig@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:08:43 -04:00
Shiwu Zhang	5d6cd20075	drm/amdgpu: add the accelerator pcie class v2: add the base class id for accelerator (lijo) v3: add the new pci class in amdgpu tree (hawking) Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:07:16 -04:00
James Zhu	2310554172	drm/amdgpu: save/restore part of xcp drm_device fields Redirect xcp allocated drm_device::rdev/pdev/driver with amdgpu pci_device/drm_device setting. They need be saved before redirect and restored after unregister xcp drm_device. -v2: fix warning discarded-qualifiers Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:07:12 -04:00
Shiwu Zhang	20997c04b7	drm/amdgpu: set the APU flag based on package type Since currently APU and dGPU share the same pcie class while gmc init needs the flag to set up correctly for upcomming memory allocations v2: call get_pkg_type in smuio 13_0_3 is enough (hawking) Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:07:04 -04:00
James Zhu	28bb7f13e7	drm/jpeg: add init value for num_jpeg_rings Need init new num_jpeg_rings to 1 on jpeg. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Richard Liang <rliang1@amd.com> Tested-by: Ying Li <ying.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:07:00 -04:00
Shiwu Zhang	491ae27829	drm/amdgpu: complement the 4, 6 and 8 XCC cases Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:02:17 -04:00
Shiwu Zhang	89f8576555	drm/amdgpu: golden settings for ASIC rev_id 0 Suggested by FW team that GB_ADDR_CONFIG is handled by golden settings in driver to get the expected value Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:02:15 -04:00
Shiwu Zhang	9535a86a40	drm/amdgpu: bypass bios dependent operations Since bios reading does not work currently so just bypass all operations related to bios v2: hardcode the vram info for APP_APU case (hawking) v3: correct the vram_width with channel number * channel size (lijo) Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:02:12 -04:00
Jiadong Zhu	cfdce59417	drm/amdgpu: Program gds backup address as zero if no gds allocated It is firmware requirement to set gds_backup_addrlo and gds_backup_addrhi of DE meta both zero if no gds partition is allocated for the frame. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:02:08 -04:00
Jiadong Zhu	b7941e2fef	drm/amdgpu: Reset CP_VMID_PREEMPT after trailing fence signaled When MEC executes unmap_queue for mid command buffer preemption, it will kick the write pointer of the gfx ring, set CP_VMID_PREEMPT to trigger the preemption and wait for CP_VMID_PREEMPT becomes zero after the preemption done. There is a race condition that PFP may excute the resetting command before MEC set CP_VMID_PREEMPT. As a result, hang happens as CP_VMID_PREEMPT is always 0xffff. To avoid this, we send resetting CP_VMID_PREEMPT command after the trailing fence is siganled and update gfx write pointer explicitly. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:02:05 -04:00
Srinivasan Shanmugam	8cce16826f	drm/amdgpu: Fix unused variable in amdgpu_gfx.c drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c: In function ‘amdgpu_gfx_disable_kcq’: drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:497:6: warning: variable ‘j’ set but not used [-Wunused-but-set-variable] 497 \| int j; \| ^ drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c: In function ‘amdgpu_gfx_disable_kgq’: drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:528:6: warning: variable ‘j’ set but not used [-Wunused-but-set-variable] 528 \| int j; \| ^ drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c: In function ‘amdgpu_gfx_enable_kgq’: drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:630:12: warning: variable ‘j’ set but not used [-Wunused-but-set-variable] 630 \| int r, i, j; \| Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:01:10 -04:00
Srinivasan Shanmugam	a09e206510	drm/amdgpu: Fix defined but not used gfx9_cs_data in gfx_v9_4_3.c gcc with W=1 In file included from drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:33: drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h:939:36: warning: ‘gfx9_cs_data’ defined but not used [-Wunused-const-variable=] 939 \| static const struct cs_section_def gfx9_cs_data[] = { \| gfx9_cs_data is not used in gfx_v9_4_3.c, hence remove its include in gfx_v9_4_3.c Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 11:00:53 -04:00
Nathan Chancellor	6c882a573b	drm/amdgpu: Fix return types of certain NBIOv7.9 callbacks When building with clang's -Wincompatible-function-pointer-types-strict, which ensures that function pointer signatures match exactly to avoid tripping clang's Control Flow Integrity (kCFI) checks at run time and will eventually be turned on for the kernel, the following instances appear in the NBIOv7.9 code: drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c:465:32: error: incompatible function pointer types initializing 'int ()(struct amdgpu_device )' with an expression of type 'enum amdgpu_gfx_partition (struct amdgpu_device )' [-Werror,-Wincompatible-function-pointer-types-strict] .get_compute_partition_mode = nbio_v7_9_get_compute_partition_mode, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c:467:31: error: incompatible function pointer types initializing 'u32 ()(struct amdgpu_device , u32 )' (aka 'unsigned int ()(struct amdgpu_device , unsigned int )') with an expression of type 'enum amdgpu_memory_partition (struct amdgpu_device , u32 )' (aka 'enum amdgpu_memory_partition (struct amdgpu_device , unsigned int *)') [-Werror,-Wincompatible-function-pointer-types-strict] .get_memory_partition_mode = nbio_v7_9_get_memory_partition_mode, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2 errors generated. Change the return types of these callbacks to match the prototypes to clear up the warning and avoid tripping kCFI at run time. Both functions return a value from ffs(), which is an integer that can fit into either int or unsigned int. Fixes: `98a54e88e8` ("drm/amdgpu: add sysfs node for compute partition mode") Fixes: `ea2d2f8ece` ("drm/amdgpu: detect current GPU memory partition mode") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:59:07 -04:00
Lijo Lazar	3525844d48	drm/amdgpu: Use single copy per SDMA instance type (v2) v1: Only single copy per instance type is required for PSP. All instance types use the same firmware copy. (Lijo) v2: Apply the change into amdgpu_sdma_init_microcode() due to rebase. (Morris) Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:57:16 -04:00
Guchun Chen	93ab59ac6d	drm/amdgpu: switch to unified amdgpu_ring_test_helper This will simplify code. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:57:13 -04:00
Guchun Chen	232f243189	drm/amdgpu/gfx: set sched.ready status after ring/IB test in gfx sched.ready is nothing with ring initialization, it needs to set to be true after ring/IB test in amdgpu_ring_test_helper to tell the ring is ready for submission. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:57:11 -04:00
Guchun Chen	61c31b8b6c	drm/amdgpu/sdma: set sched.ready status after ring/IB test in sdma sched.ready is nothing with ring initialization, it needs to set to be true after ring/IB test in amdgpu_ring_test_helper to tell the ring is ready for submission. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:57:07 -04:00
Lijo Lazar	194224a54c	drm/amdgpu: Fix warnings Fix warnings reported by kernel test bot/smatch smatch warnings: drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c:65 amdgpu_xcp_run_transition() error: buffer overflow 'xcp_mgr->xcp' 8 <= 8 Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202305231453.I0bXngYn-lkp@intel.com/ Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:56:58 -04:00
Srinivasan Shanmugam	413521a4c9	drm/amd/amdgpu: Fix warnings in amdgpu_irq.c checkpatch reports: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: braces {} are not necessary for any arm of this statement + if (nvec <= 0) { [...] + } else { [...] WARNING: Block comments use a trailing */ on a separate line Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:55:38 -04:00
Mukul Joshi	c3aaca43fb	drm/amdgpu: Add a low priority scheduler for VRAM clearing Add a low priority DRM scheduler for VRAM clearing instead of using the exisiting high priority scheduler. Use the high priority scheduler for migrations and evictions. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:54:40 -04:00
Jiapeng Chong	1fbc69b8f5	drm/amdgpu/vcn: Modify mismatched function name No functional modification involved. drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c:374: warning: expecting prototype for vcn_v4_0_mc_resume_dpg_mode(). Prototype was for vcn_v4_0_3_mc_resume_dpg_mode() instead. drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c:631: warning: expecting prototype for vcn_v4_0_enable_clock_gating(). Prototype was for vcn_v4_0_3_enable_clock_gating() instead. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=5284 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:54:37 -04:00
Jiapeng Chong	aaff9c0899	drm/amdgpu: Modify mismatched function name No functional modification involved. drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:426: warning: expecting prototype for sdma_v4_4_2_gfx_stop(). Prototype was for sdma_v4_4_2_inst_gfx_stop() instead. drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:457: warning: expecting prototype for sdma_v4_4_2_rlc_stop(). Prototype was for sdma_v4_4_2_inst_rlc_stop() instead. drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:470: warning: expecting prototype for sdma_v4_4_2_page_stop(). Prototype was for sdma_v4_4_2_inst_page_stop() instead. drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:506: warning: expecting prototype for sdma_v4_4_2_ctx_switch_enable(). Prototype was for sdma_v4_4_2_inst_ctx_switch_enable() instead. drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:561: warning: expecting prototype for sdma_v4_4_2_enable(). Prototype was for sdma_v4_4_2_inst_enable() instead. drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:798: warning: expecting prototype for sdma_v4_4_2_rlc_resume(). Prototype was for sdma_v4_4_2_inst_rlc_resume() instead. drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:814: warning: expecting prototype for sdma_v4_4_2_load_microcode(). Prototype was for sdma_v4_4_2_inst_load_microcode() instead. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=5283 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:54:34 -04:00
Jiapeng Chong	e7665d0ca7	drm/amdgpu: Remove duplicate include ./drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c: amdgpu_xcp.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=5281 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:54:31 -04:00
Philip Yang	acf429dcac	drm/amdkfd: Align partition memory size to page size The compute partition memory size calculated from KFD_XCP_MEMORY_SIZE may not align to page size if xcp_mgr->num_xcp_per_mem_partition is 6. Change the KFD_XCP_MEMORY_SIZE macro to return page align size, so KFD node memory size reported in sysfs is page align size, to avoid application VRAM allocation failure because application may use the size directly and Thunk requires the memory allocation size is page size align. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:53:03 -04:00
Tom Rix	803e4c9efc	drm/amdgpu: remove unused variable num_xcc gcc with W=1 reports drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:2138:13: error: variable ‘num_xcc’ set but not used [-Werror=unused-but-set-variable] 2138 \| int num_xcc; \| ^~~~~~~ This variable is not used so remove it. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:52:45 -04:00
Arnd Bergmann	1501fe94ee	drm/amdgpu: fix acpi build warnings Two newly introduced functions are in the global namespace but have no prototypes or callers outside of amdgpu_acpi.c, another function is static but only has a caller inside of an #ifdef: drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:902:13: error: no previous prototype for 'amdgpu_acpi_get_node_id' [-Werror=missing-prototypes] drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:928:30: error: no previous prototype for 'amdgpu_acpi_get_dev' [-Werror=missing-prototypes] drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:860:33: error: 'amdgpu_acpi_get_numa_info' defined but not used [-Werror=unused-function] Avoid the warnings by marking all of them static and ensuring that the compiler is able to see the callsites. v2: rebase on latest code (Alex) Fixes: `fa0497c34e` ("drm/amdgpu: Add API to get numa information of XCC") Fixes: `1cc823011a` ("drm/amdgpu: Store additional numa node information") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:51:35 -04:00
Arnd Bergmann	1f9bb94f12	drm/amdgpu: use %pad format string for dma_addr_t DMA addresses can be shorter than u64, which results in a broken debug output: drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c: In function 'amdgpu_gart_table_ram_alloc': drivers/gpu/drm/amd/amdgpu/amdgpu.h:41:22: error: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'dma_addr_t' {aka 'unsigned int'} [-Werror=format=] drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:146:9: note: in expansion of macro 'dev_info' 146 \| dev_info(adev->dev, "%s dma_addr:%llx\n", __func__, dma_addr); Use the special %pad format string and pass the DMA address by reference. Fixes: `c9a502e981` ("drm/amdgpu: Allocate GART table in RAM for AMD APU") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:50:55 -04:00
Arnd Bergmann	1b177b5c68	drm/amdgpu:mark aqua_vanjaram_reg_init.c function as static A few newly added global functions have no prototype, which causes warnings: drivers/gpu/drm/amd/amdgpu/aqua_vanjaram_reg_init.c:169:5: error: no previous prototype for 'aqua_vanjaram_select_scheds' [-Werror=missing-prototypes] drivers/gpu/drm/amd/amdgpu/aqua_vanjaram_reg_init.c:310:5: error: no previous prototype for '__aqua_vanjaram_get_xcc_per_xcp' [-Werror=missing-prototypes] drivers/gpu/drm/amd/amdgpu/aqua_vanjaram_reg_init.c:337:5: error: no previous prototype for '__aqua_vanjaram_get_xcp_ip_info' [-Werror=missing-prototypes] drivers/gpu/drm/amd/amdgpu/aqua_vanjaram_reg_init.c:593:5: error: no previous prototype for 'aqua_vanjaram_get_xcp_ip_details' [-Werror=missing-prototypes] There are no callers from other files, so just mark them as 'static'. Fixes: `cd7d8400aa` ("drm/amdgpu: add partition schedule for GC(9, 4, 3)") Fixes: `9cb18287d8` ("drm/amdgpu: Add SOC partition funcs for GC v9.4.3") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:48:43 -04:00
Arnd Bergmann	d522458e63	drm/amdkfd: mark local functions as static The file was newly added and causes some -Wmissing-prototype warnings: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c:57:5: error: no previous prototype for 'kgd_gfx_v9_4_3_hqd_sdma_load' [-Werror=missing-prototypes] drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c:126:5: error: no previous prototype for 'kgd_gfx_v9_4_3_hqd_sdma_dump' [-Werror=missing-prototypes] drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c:163:6: error: no previous prototype for 'kgd_gfx_v9_4_3_hqd_sdma_is_occupied' [-Werror=missing-prototypes] drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c:181:5: error: no previous prototype for 'kgd_gfx_v9_4_3_hqd_sdma_destroy' [-Werror=missing-prototypes] Mark these all as 'static' since there are no outside callers. Fixes: `a805889a15` ("drm/amdkfd: Update SDMA queue management for GFX9.4.3") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:47:53 -04:00
Harshit Mogalapalli	6dabce860d	drm/amdgpu: Fix unsigned comparison with zero in gmc_v9_0_process_interrupt() Smatch warns: drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:579: unsigned 'xcc_id' is never less than zero. gfx_v9_4_3_ih_to_xcc_inst() returns negative numbers as well. Fix this by changing type of xcc_id to int. Fixes: `98b2e9cad2` ("drm/amdgpu: correct the vmhub index when page fault occurs") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:45:46 -04:00
Colin Ian King	9f77af014c	drm/amdgpu: Fix a couple of spelling mistakes in info and debug messages There are a couple of spelling mistakes, one in a dev_info message and the other in a dev_debug message. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:45:43 -04:00
Lijo Lazar	a3ffabb250	drm/amdgpu: Disable interrupt tracker on NBIOv7.9 Enabling nBIF interrupt history tracker prevents LCLK deep sleep, hence disable it Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:45:23 -04:00
Shiwu Zhang	89fb3020d6	drm/amdgpu: init the XCC_DOORBELL_FENCE regs add the the init_registers callback for nbio_v7_9 Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:44:37 -04:00
Tao Zhou	332bb09352	drm/amdgpu: remove unused definition mmhub_v1_8_mmea_cgtt_clk_cntl_reg is defined but not used. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:44:33 -04:00
Srinivasan Shanmugam	1893549af6	drm/amdgpu: Fix uninitialized variable in gfxhub_v1_2_xcp_resume drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c:657:6: error: variable 'ret' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (!amdgpu_sriov_vf(adev)) ^~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c:660:9: note: uninitialized use occurs here return ret; ^~~ drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c:657:2: note: remove the 'if' if its condition is always true if (!amdgpu_sriov_vf(adev)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c:648:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 1 error generated. Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:44:30 -04:00
Srinivasan Shanmugam	40e39d7227	drm/amdgpu: Fix unused amdgpu_acpi_get_numa_info function in amdgpu_acpi_get_node_id() Fix the below compiler complaining error: drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:860:33: error: unused function 'amdgpu_acpi_get_numa_info' [-Werror,-Wunused-function] static struct amdgpu_numa_info *amdgpu_acpi_get_numa_info(uint32_t pxm) ^ 1 error generated. By guarding amdgpu_acpi_get_numa_info & amdgpu_acpi_get_numa_size function, only when CONFIG_ACPI_NUMA is enabled. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:44:27 -04:00
Christoph Hellwig	73ade646c5	drm/amdgpu: stop including swiotlb.h amdgpu does not need swiotlb.h, so stop including it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:42:21 -04:00
Srinivasan Shanmugam	b3b0e016ec	drm/amdgpu: Fix uninitalized variable in jpeg_v4_0_3_is_idle & jpeg_v4_0_3_wait_for_idle drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c:752:4: error: variable 'ret' is uninitialized when used here [-Werror,-Wuninitialized] ret &= ((RREG32_SOC15_OFFSET( ^~~ drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c:745:10: note: initialize the variable 'ret' to silence this warning bool ret; ^ = 0 drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c:774:4: error: variable 'ret' is uninitialized when used here [-Werror,-Wuninitialized] ret &= SOC15_WAIT_ON_RREG_OFFSET( ^~~ drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c:767:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 2 errors generated. Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: James Zhu <James.Zhu@amd.com> Cc: Leo Liu <leo.liu@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:58 -04:00
Srinivasan Shanmugam	ff6b11cc72	drm/amd/amdgpu: Fix errors & warnings in mmhub_v1_8.c Fix below errors & warnings reported by checkpatch: ERROR: code indent should use tabs where possible WARNING: please, no space before tabs WARNING: please, no spaces at the start of a line WARNING: Prefer 'unsigned int' to bare use of 'unsigned' ERROR: space prohibited before that '++' (ctx:WxB) WARNING: Block comments use a trailing */ on a separate line Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:53 -04:00
Likun Gao	20a29ac091	drm/amdgpu: retire set_vga_state for some ASIC set_vga_state operation only allowed on SI generation ASIC, retire the realted function on those ASIC which did not do anything. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:50 -04:00
Likun Gao	b336c681bd	drm/amdgpu: fix vga_set_state NULL pointer issue Fix NULL pointer issue for vga_set_state function as not all the ASIC need this operation. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:40 -04:00
Srinivasan Shanmugam	d7b8e68dc0	drm/amdgpu: Fix uninitialized variable in gfx_v9_4_3_cp_resume drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1925:6: error: variable 'r' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (amdgpu_xcp_query_partition_mode(adev->xcp_mgr, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1931:6: note: uninitialized use occurs here if (r) ^ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1925:2: note: remove the 'if' if its condition is always true if (amdgpu_xcp_query_partition_mode(adev->xcp_mgr, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1923:7: note: initialize the variable 'r' to silence this warning int r, i, num_xcc; ^ = 0 1 error generated. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:35 -04:00
Horatio Zhang	a34b09060a	drm/amdgpu: add RAS POISON interrupt funcs for jpeg_v4_0 Add ras_poison_irq and functions. And fix the amdgpu_irq_put call trace in jpeg_v4_0_hw_fini. [ 50.497562] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu] [ 50.497619] RSP: 0018:ffffaa2400fcfcb0 EFLAGS: 00010246 [ 50.497620] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [ 50.497621] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 50.497621] RBP: ffffaa2400fcfcd0 R08: 0000000000000000 R09: 0000000000000000 [ 50.497622] R10: 0000000000000000 R11: 0000000000000000 R12: ffff99b2105242d8 [ 50.497622] R13: 0000000000000000 R14: ffff99b210500000 R15: ffff99b210500000 [ 50.497623] FS: 0000000000000000(0000) GS:ffff99b518480000(0000) knlGS:0000000000000000 [ 50.497623] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 50.497624] CR2: 00007f9d32aa91e8 CR3: 00000001ba210000 CR4: 0000000000750ee0 [ 50.497624] PKRU: 55555554 [ 50.497625] Call Trace: [ 50.497625] <TASK> [ 50.497627] jpeg_v4_0_hw_fini+0x43/0xc0 [amdgpu] [ 50.497693] jpeg_v4_0_suspend+0x13/0x30 [amdgpu] [ 50.497751] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu] [ 50.497802] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu] [ 50.497854] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu] [ 50.497905] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu] [ 50.498005] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu] [ 50.498060] process_one_work+0x21f/0x400 [ 50.498063] worker_thread+0x200/0x3f0 [ 50.498064] ? process_one_work+0x400/0x400 [ 50.498065] kthread+0xee/0x120 [ 50.498067] ? kthread_complete_and_exit+0x20/0x20 [ 50.498068] ret_from_fork+0x22/0x30 Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:29 -04:00
Horatio Zhang	674f90f83b	drm/amdgpu: add RAS POISON interrupt funcs for jpeg_v2_6 Add ras_poison_irq and functions. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:27 -04:00
Horatio Zhang	18dad20c3d	drm/amdgpu: separate ras irq from jpeg instance irq for UVD_POISON Separate jpegbRAS poison consumption handling from the instance irq, and register dedicated ras_poison_irq src and funcs for UVD_POISON. v2: - Separate ras irq from jpeg instance irq - Improve the subject and code comments v3: - Split the patch into three parts - Improve the code comments Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:23 -04:00
Horatio Zhang	66a11ecbde	drm/amdgpu: add RAS POISON interrupt funcs for vcn_v4_0 Add ras_poison_irq and functions. And fix the amdgpu_irq_put call trace in vcn_v4_0_hw_fini. [ 44.563572] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu] [ 44.563629] RSP: 0018:ffffb36740edfc90 EFLAGS: 00010246 [ 44.563630] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [ 44.563630] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 44.563631] RBP: ffffb36740edfcb0 R08: 0000000000000000 R09: 0000000000000000 [ 44.563631] R10: 0000000000000000 R11: 0000000000000000 R12: ffff954c568e2ea8 [ 44.563631] R13: 0000000000000000 R14: ffff954c568c0000 R15: ffff954c568e2ea8 [ 44.563632] FS: 0000000000000000(0000) GS:ffff954f584c0000(0000) knlGS:0000000000000000 [ 44.563632] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 44.563633] CR2: 00007f028741ba70 CR3: 000000026ca10000 CR4: 0000000000750ee0 [ 44.563633] PKRU: 55555554 [ 44.563633] Call Trace: [ 44.563634] <TASK> [ 44.563634] vcn_v4_0_hw_fini+0x62/0x160 [amdgpu] [ 44.563700] vcn_v4_0_suspend+0x13/0x30 [amdgpu] [ 44.563755] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu] [ 44.563806] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu] [ 44.563858] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu] [ 44.563909] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu] [ 44.564006] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu] [ 44.564061] process_one_work+0x21f/0x400 [ 44.564062] worker_thread+0x200/0x3f0 [ 44.564063] ? process_one_work+0x400/0x400 [ 44.564064] kthread+0xee/0x120 [ 44.564065] ? kthread_complete_and_exit+0x20/0x20 [ 44.564066] ret_from_fork+0x22/0x30 Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:16 -04:00
Horatio Zhang	46d75d2300	drm/amdgpu: add RAS POISON interrupt funcs for vcn_v2_6 Add ras_poison_irq and functions. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:13 -04:00
Horatio Zhang	2ecf927b17	drm/amdgpu: separate ras irq from vcn instance irq for UVD_POISON Separate vcn RAS poison consumption handling from the instance irq, and register dedicated ras_poison_irq src and funcs for UVD_POISON. v2: - Separate ras irq from vcn instance irq - Improve the subject and code comments v3: - Split the patch into three parts - Improve the code comments Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:11 -04:00
YuanShang	c77b3608b8	drm/amdgpu: Remove IMU ucode in vf2pf The IMU firmware is loaded on the host side, not the guest. Remove IMU in vf2pf ucode id enum. Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-By: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:07 -04:00
Shiwu Zhang	e825fb641b	drm/amdgpu: fix the memory override in kiq ring struct This is introduced by the code merge and will let the adev->gfx.kiq[0].ring struct being overrided Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:04 -04:00
Shiwu Zhang	c796d7e039	drm/amdgpu: add the smu_v13_0_6 and gfx_v9_4_3 ip block Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:02 -04:00
Jesse Zhang	b3122c9269	drm/amdgpu: don't enable secure display on incompatible platforms [why] [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7) [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4) amdgpu 0000:04:00.0: amdgpu: Secure display: Generic Failure. [how] don't enable secure display on incompatible platforms Suggested-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Jesse zhang <jesse.zhang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:40:57 -04:00
Sukrut Bellary	1385d88c6a	drm:amd:amdgpu: Fix missing buffer object unlock in failure path smatch warning - 1) drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:3615 gfx_v9_0_kiq_resume() warn: inconsistent returns 'ring->mqd_obj->tbo.base.resv'. 2) drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:6901 gfx_v10_0_kiq_resume() warn: inconsistent returns 'ring->mqd_obj->tbo.base.resv'. Signed-off-by: Sukrut Bellary <sukrut.bellary@linux.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:40:52 -04:00
Bas Nieuwenhuizen	a2b308044d	drm/amdgpu: Validate VM ioctl flags. None have been defined yet, so reject anybody setting any. Mesa sets it to 0 anyway. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-06-09 10:40:25 -04:00
Su Hui	109b4d8cfe	drm/amdgpu: remove unnecessary (void) conversions No need cast (void) to (struct amdgpu_device *). Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:40:12 -04:00
Tong Liu01	04e8595819	drm/amdgpu: fix incorrect pcie_gen_mask in passthrough case [why] Passthrough case is treated as root bus and pcie_gen_mask is set as default value that does not support GEN 3 and GEN 4 for PCIe link speed. So PCIe link speed will be downgraded at smu hw init in passthrough condition [how] Move get pci info after detect virtualization and check if it is passthrough case when set pcie_gen_mask Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:40:00 -04:00
Jack Xiao	e602157ec0	drm/amdgpu: fix S3 issue if MQD in VRAM 1. Need flush HDP for MQD putting in vram 2. Zero out mes MQD Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:33 -04:00
Srinivasan Shanmugam	e03f04b849	drm/amdgpu: Fix warnings in amdgpu _sdma, _ucode.c Fix below checkpatch warnings: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Comparisons should place the constant on the right side of the test WARNING: Missing a blank line after declarations Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:29 -04:00
Srinivasan Shanmugam	f10984a353	drm/amd/amdgpu: Fix errors & warnings in amdgpu _uvd, _vce.c Fix below checkpatch errors & warnings: In amdgpu_uvd.c: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Prefer 'unsigned int ' to bare use of 'unsigned ' WARNING: Missing a blank line after declarations WARNING: %Lx is non-standard C, use %llx ERROR: space required before the open parenthesis '(' ERROR: space required before the open brace '{' WARNING: %LX is non-standard C, use %llX WARNING: Block comments use * on subsequent lines +/* multiple fence commands without any stream commands in between can + crash the vcpu so just try to emmit a dummy create/destroy msg to WARNING: Block comments use a trailing / on a separate line + avoid this / WARNING: braces {} are not necessary for single statement blocks + for (j = 0; j < adev->uvd.num_enc_rings; ++j) { + fences += amdgpu_fence_count_emitted(&adev->uvd.inst[i].ring_enc[j]); + } In amdgpu_vce.c: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Missing a blank line after declarations WARNING: %Lx is non-standard C, use %llx WARNING: Possible repeated word: 'we' ERROR: space required before the open parenthesis '(' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:22 -04:00
YiPeng Chai	6c47a79b3b	drm/amdgpu: perform mode2 reset for sdma fed error on gfx v11_0_3 perform mode2 reset for sdma fed error on gfx v11_0_3. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:19 -04:00
Srinivasan Shanmugam	5d0622705f	drm/amd/amdgpu: Fix errors & warnings in amdgpu_vcn.c Fix below checkpatch insisted error & warnings: ERROR: space required before the open brace '{' WARNING: braces {} are not necessary for any arm of this statement + if ((type == VCN_ENCODE_RING) && (vcn_config & VCN_BLOCK_ENCODE_DISABLE_MASK)) { [...] + } else if ((type == VCN_DECODE_RING) && (vcn_config & VCN_BLOCK_DECODE_DISABLE_MASK)) { [...] + } else if ((type == VCN_UNIFIED_RING) && (vcn_config & VCN_BLOCK_QUEUE_DISABLE_MASK)) { [...] ERROR: code indent should use tabs where possible WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: braces {} are not necessary for single statement blocks + for (i = 0; i < adev->vcn.num_enc_rings; ++i) { + fence[j] += amdgpu_fence_count_emitted(&adev->vcn.inst[j].ring_enc[i]); + ERROR: space required before the open parenthesis '(' WARNING: Missing a blank line after declarations WARNING: please, no spaces at the start of a line WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:15 -04:00
Srinivasan Shanmugam	29f187f71e	drm/amd/amdgpu: Fix warnings in amdgpu_encoders.c Fix below checkpatch warnings: WARNING: Missing a blank line after declarations + struct amdgpu_connector *amdgpu_connector = to_amdgpu_connector(connector); + amdgpu_encoder->active_device = amdgpu_encoder->devices & amdgpu_connector->devices; WARNING: Prefer 'unsigned int' to bare use of 'unsigned' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:09 -04:00
Srinivasan Shanmugam	01c3f46474	drm/amd/amdgpu: Fix errors & warnings in amdgpu_ttm.c Fix below checkpatch insisted error & warnings: ERROR: Macros with complex values should be enclosed in parentheses WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: braces {} are not necessary for single statement blocks WARNING: Block comments use a trailing */ on a separate line WARNING: Missing a blank line after declarations Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:03 -04:00
Alex Deucher	0ce50b2efe	drm/amdgpu/vcn4: fix endian conversion sq.is_enabled is a byte so there is no need to endian swap it. Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:59 -04:00
Alex Deucher	45b3a914d4	drm/amdgpu/gmc9: fix 64 bit division in partition code Rework logic or use do_div() to avoid problems on 32 bit. v2: add a missing case for XCP macro v3: fix out of bounds array access v4: fix xcp handling harder Acked-by: Guchun Chen <guchun.chen@amd.com> (v1) Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> (v3) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:52 -04:00
Tao Zhou	92ecb92ccc	drm/amdgpu: initialize RAS for gfx_v9_4_3 Register GFX RAS functions and initialize GFX RAS. v2: remove xcp operations. v3: reuse the return value of gfx_ras_sw_init. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:50 -04:00
Tao Zhou	0386d52d15	drm/amdgpu: add sq timeout status functions for gfx_v9_4_3 Query and reset sq timeout status. v2: change instance from 0 to xcc_id for register access. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:47 -04:00
Tao Zhou	30feef0676	drm/amdgpu: add RAS error count reset for gfx_v9_4_3 Add GFX RAS error count reset function. v2: remove xcp operation. only select_se_sh when instance number is more than 1. v3: add check for se_num before select_se_sh. change instance from 0 to xcc_id for register access. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:44 -04:00
Tao Zhou	bfa84da618	drm/amdgpu: add RAS error count query for gfx_v9_4_3 Query GFX RAS ce/ue count. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:42 -04:00
Tao Zhou	5c1c09a716	drm/amdgpu: add RAS error count definitions for gfx_v9_4_3 Prepare for the query of GFX RAS ce/ue count. v2: remove xcp operation. only select_se_sh when instance number is more than 1. v3: add more CE/UE registsers to query list. add check for se_num before select_se_sh. change instance from 0 to xcc_id for register access. v4: move gfx memory id definitions to gfx_v9_4_3. v5: create a dedicated patch for adding error count query function. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:39 -04:00
Tao Zhou	77462ab8c6	drm/amdgpu: add RAS definitions for GFX Add common GFX RAS definitions. v2: remove instance from amdgpu_gfx_ras_reg_entry, amdgpu_ras_err_status_reg_entry has already defined it. v3: remove memory id definitions from amdgpu_gfx.h, they are related to IP version. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:37 -04:00
Tao Zhou	47e7f527c8	drm/amdgpu: add RAS status reset for gfx_v9_4_3 Reset GFX RAS status registers. v2: fix typo in title. remove xcp operation. v3: change instance from 0 to xcc_id for register access. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:32 -04:00
Tao Zhou	bf16235b39	drm/amdgpu: add RAS status query for gfx_v9_4_3 Query GFX RAS status. v2: remove xcp operation. v3: change instance from 0 to xcc_id for register access. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:29 -04:00
Tao Zhou	d78c71321e	drm/amdgpu: add GFX RAS common function The common function can help reduce redundant code. v2: remove xcp operation, only need to do RAS operations for all instances. v3: remove check for GFX RAS support, will be checked in higher level. add amdgpu prefix for the function name. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:26 -04:00
Hawking Zhang	9a3ce1a7a9	drm/amdgpu: Do not access members of xcp w/o check (v2) Not all the asic needs xcp. ensure check xcp availabity before accessing its member. v2: add missing change in kfd_topology.c Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:24 -04:00
Tao Zhou	f464c5dd4d	drm/amdgpu: add check for RAS instance mask The mask is only needed to be set when RAS block instance number is more than 1 and invalid bits should be also masked out. We only check valid bits for GFX and SDMA block for now, and will add check for other RAS blocks in the future. v2: move the check under injection operation since the mask is only used by RAS error inject. v3: add valid bits handling for SDMA. v4: print message if the mask is adjusted. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:18 -04:00
Tao Zhou	6e3c51a581	drm/amdgpu: remove RAS GFX injection for gfx_v9_4/gfx_v9_4_2 No special requirement in RAS injection for the two versions, switch to use default injection interface. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:15 -04:00
Tao Zhou	27c5f29526	drm/amdgpu: reorganize RAS injection flow So GFX RAS injection could use default function if it doesn't define its own injection interface. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:13 -04:00
Tao Zhou	2c22ed0bdb	drm/amdgpu: add instance mask for RAS inject User can specify injected instances by the mask. For backward compatibility, the mask value is incorporated into sub block index without interface change of RAS TA. User uses logical mask and driver should convert it to physical value before sending it to RAS TA. v2: update parameter name. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:10 -04:00
Tao Zhou	af2ba36883	drm/amdgpu: convert logical instance mask to physical one Convert instance mask for the convenience of RAS TA. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:08 -04:00
Mukul Joshi	40b832aac0	drm/amdgpu: Enable IH CAM on GFX9.4.3 This patch enables IH CAM on GFX9.4.3 ASIC. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:04 -04:00
Philip Yang	3cde91172d	drm/amdgpu: Correct get_xcp_mem_id calculation Current calculation only works for NPS4/QPX mode, correct it for NPS4/CPX mode. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:02 -04:00
Philip Yang	84b4dd3f84	drm/amdkfd: Refactor migrate init to support partition switch Rename smv_migrate_init to a better name kgd2kfd_init_zone_device because it setup zone devive pgmap for page migration and keep it in kfd_migrate.c to access static functions svm_migrate_pgmap_ops. Call it only once in amdgpu_device_ip_init after adev ip blocks are initialized, but before amdgpu_amdkfd_device_init initialize kfd nodes which enable SVM support based on pgmap. svm_range_set_max_pages is called by kgd2kfd_device_init everytime after switching compute partition mode. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:58 -04:00
Shiwu Zhang	44a9766555	drm/amdgpu: route ioctls on primary node of XCPs to primary device During XCP init, unlike the primary device, there is no amdgpu_device attached to each XCP's drm_device In case that user trying to open/close the primary node of XCP drm_device this rerouting is to solve the NULL pointer issue causing by referring to any member of the amdgpu_device BUG: unable to handle page fault for address: 0000000000020c80 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page Oops: 0002 [#1] PREEMPT SMP NOPTI Call Trace: <TASK> lock_timer_base+0x6b/0x90 try_to_del_timer_sync+0x2b/0x80 del_timer_sync+0x29/0x40 flush_delayed_work+0x1c/0x50 amdgpu_driver_open_kms+0x2c/0x280 [amdgpu] drm_file_alloc+0x1b3/0x260 [drm] drm_open+0xaa/0x280 [drm] drm_stub_open+0xa2/0x120 [drm] chrdev_open+0xa6/0x1c0 Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:53 -04:00
Mukul Joshi	1c77527a69	drm/amdkfd: Fix memory reporting on GFX 9.4.3 This patch fixes memory reporting on the GFX 9.4.3 APU and dGPU by reporting available memory on a per partition basis. If its an APU, available and used memory calculations take into account system and TTM memory. v2: squash in fix ("drm/amdkfd: Fix array out of bound warning") squash in fix ("drm/amdgpu: Update memory reporting for GFX9.4.3") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:48 -04:00
Mukul Joshi	315e29eca5	drm/amdkfd: Move local_mem_info to kfd_node We need to track memory usage on a per partition basis. To do that, store the local memory information in KFD node instead of kfd device. v2: squash in fix ("amdkfd: Use mem_id to access mem_partition info") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:45 -04:00
James Zhu	b125b80bd5	drm/amdgpu: use xcp partition ID for amdgpu_gem Find xcp_id from amdgpu_fpriv, use it for amdgpu_gem_object_create. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:43 -04:00
Philip Yang	2fa9ff25de	drm/amdgpu: KFD graphics interop support compute partition kfd_ioctl_get_dmabuf use the amdgpu bo xcp_id to get the gpu_id of the KFD node from the exported dmabuf_adev, and then create kfd bo on the correct adev and KFD node when importing the amdgpu bo to KFD. Remove function kfd_device_by_adev, it is not needed as it is the same result as dmabuf_adev->kfd.dev->nodes[0]->id. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:41 -04:00
Philip Yang	3ebfd221c1	drm/amdkfd: Store xcp partition id to amdgpu bo For memory accounting per compute partition and export drm amdgpu bo and then import to KFD, we need the xcp id to account the memory usage or find the KFD node of the original amdgpu bo to create the KFD bo on the correct adev KFD node. Set xcp_id_plus1 of amdgpu_bo_param to create bo and store xcp_id to amddgpu bo. Add helper macro to get the mem_id from adev and xcp_id. v2: squash in fix ("drm/amdgpu: Fix BO creation failure on GFX 9.4.3 dGPU") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:38 -04:00
Philip Yang	6cfba94a77	drm/amdgpu: dGPU mode set VRAM range lpfn as exclusive TTM place lpfn is exclusive used as end (start + size) in drm and buddy allocator, adev->gmc memory partition range lpfn is inclusive (start + size - 1), should plus 1 to set TTM place lpfn. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:35 -04:00
Philip Yang	ea7bf2f220	drm/amdgpu: Alloc page table on correct memory partition Alloc kernel mode page table bo uses the amdgpu_vm->mem_id + 1 as bp mem_id_plus1 parameter. For APU mode, select the correct TTM pool to alloc page from the corresponding memory partition, this will be the closest NUMA node. For dGPU mode, select the correct address range for vram manager. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:33 -04:00
Philip Yang	dc12f9edde	drm/amdkfd: Update MTYPE for far memory partition Use MTYPE RW/MTYPE_CC for mapping system memory or VRAM to KFD node within the same memory partition, use MTYPE_NC for mapping on KFD node from the far memory partition of the same socket or from another socket on same XGMI hive. On NPS4 or 4P system, MTYPE will be overridden per page depending on the memory NUMA node id and vm->mem_id. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:30 -04:00
Philip Yang	7f6db89418	drm/amdgpu: dGPU mode placement support memory partition dGPU mode uses VRAM manager to validate bo, amdgpu bo placement use the mem_id to get the allocation range first, last page frame number from xcp manager, pass to drm buddy allocator as the allowed range. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:27 -04:00
Philip Yang	53c5692e7a	drm/amdkfd: Alloc memory of GPU support memory partition For dGPU mode VRAM allocation, create amdgpu_bo from amdgpu_vm->mem_id, to alloc from the correct memory range. For APU mode VRAM allocation, set alloc domain to GTT, and set bp->mem_id_plus1 from amdgpu_vm->mem_id + 1 to create amdgpu_bo, to allocate system memory from correct NUMA node. For GTT allocation, use mem_id -1 to allocate system memory from any NUMA nodes. Remove amdgpu_ttm_tt_set_mem_pool, to avoid the confusion that memory maybe allocated from different mem_id. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:00:03 -04:00
Philip Yang	f24e924b7e	drm/amdgpu: Add memory partition mem_id to amdgpu_bo Add mem_id_plus1 parameter to amdgpu_gem_object_create and pass it to amdgpu_bo_create. For dGPU mode allocation, mem_id is used by VRAM manager to get the memory partition fpfn, lpfn from xcp manager. For APU native mode allocation, mem_id is used to get NUMA node id from xcp manager, then pass to TTM as numa pool id to alloc memory from the specific NUMA node. mem_id -1 means for entire VRAM or any NUMA nodes. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:00:00 -04:00
Philip Yang	4c6ce75fdd	drm/amdkfd: Show KFD node memory partition info Show KFD node memory partition id and size, add helper function KFD_XCP_MEMORY_SIZE to get kfd node memory size, will be used later to support memory accounting per partition. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:58 -04:00
Philip Yang	934deb64fd	drm/amdgpu: Add memory partition id to amdgpu_vm If xcp_mgr is initialized, add mem_id to amdgpu_vm structure to store memory partition number when creating amdgpu_vm for the xcp. The xcp number is decided when opening the render device, for example /dev/dri/renderD129 is xcp_id 0, /dev/dri/renderD130 is xcp_id 1. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:56 -04:00
Philip Yang	d26ea1b346	drm/amdgpu: Add xcp manager num_xcp_per_mem_partition Used by KFD to check memory limit accounting. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:48 -04:00
James Zhu	3e7c6fe387	drm/amdgpu: update ref_cnt before ctx free Update ref_cnt before ctx free. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:44 -04:00
James Zhu	9a18292d41	drm/amdgpu: run partition schedule if it is supported Run partition schedule if it is supported during ctx init entity. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:42 -04:00
James Zhu	cd7d8400aa	drm/amdgpu: add partition schedule for GC(9, 4, 3) Implement partition schedule for GC(9, 4, 3). Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:39 -04:00
James Zhu	c30e326e48	drm/amdgpu: keep amdgpu_ctx_mgr in ctx structure Keep amdgpu_ctx_mgr in ctx structure to track fpriv. v2: add missing fpriv declaration lost in rebase Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:37 -04:00
James Zhu	d425c6f48b	drm/amdgpu: add partition scheduler list update Add partition scheduler list update in late init and xcp partition mode switch. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:35 -04:00
James Zhu	0a9115fd95	drm/amdgpu: update header to support partition scheduling Update header to support partition scheduling. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:32 -04:00
James Zhu	797a0a142c	drm/amdgpu: add partition ID track in ring Keep track partition ID in ring. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:30 -04:00
James Zhu	be3800f57c	drm/amdgpu: find partition ID when open device Find partition ID when open device from render device minor. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-and-tested-by: Philip Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:27 -04:00
James Zhu	2c1c7ba457	drm/amdgpu: support partition drm devices Support partition drm devices on GC_HWIP IP_VERSION(9, 4, 3). This is a temporary solution and will be superceded. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-and-tested-by: Philip Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:20 -04:00
Graham Sider	b9cbd51000	drm/amdgpu/bu: update mtype_local parameter settings Update mtype_local module parameter to use MTYPE_RW by default. 0: MTYPE_RW (default) 1: MTYPE_NC 2: MTYPE_CC Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:15 -04:00
David Francis	76eb9c95a4	drm/amdgpu/bu: add mtype_local as a module parameter Selects the MTYPE to be used for local memory, (0 = MTYPE_CC (default), 1 = MTYPE_NC, 2 = MTYPE_RW) v2: squash in build fix (Alex) Reviewed-by: Graham Sider <Graham.Sider@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:13 -04:00
Felix Kuehling	352b919c1e	drm/amdgpu: Override MTYPE per page on GFXv9.4.3 APUs On GFXv9.4.3 NUMA APUs, system memory locality must be determined per page to choose the correct MTYPE. This patch adds a GMC callback that can provide this per-page override and implements it for native mode. Carve-out mode is not yet supported and will use the safe default (remote) MTYPE for system memory. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:08 -04:00
Felix Kuehling	1e4a00334a	drm/amdgpu: Fix per-BO MTYPE selection for GFXv9.4.3 Treat system memory on NUMA systems as remote by default. Overriding with a more efficient MTYPE per page will be implemented in the next patch. No need for a special case for APP APUs. System memory is handled the same for carve-out and native mode. And VRAM doesn't exist in native mode. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:05 -04:00
Graham Sider	895797d919	drm/amdgpu/bu: Add use_mtype_cc_wa module param By default, set use_mtype_cc_wa to 1 to set PTE coherence flag MTYPE_CC instead of MTYPE_RW by default. This is required for the time being to mitigate a bug causing XCCs to hit stale data due to TCC marking fully dirty lines as exclusive. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:03 -04:00

... 14 15 16 17 18 ...

13994 Commits