linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00

Author	SHA1	Message	Date
Lijo Lazar	ed6e4f0a27	drm/amdgpu: Use another offset for GC 9.4.3 remap The legacy region at 0x7F000 maps to valid registers in GC 9.4.3 SOCs. Use 0x1A000 offset instead as MMIO register remap region. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 18:13:35 -05:00
Candice Li	e0409021e3	drm/amdgpu: Update EEPROM I2C address for smu v13_0_0 Check smu v13_0_0 SKU type to select EEPROM I2C address. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x	2023-11-29 18:11:39 -05:00
Lu Yao	2161e09cd0	drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer For 'AMDGPU_FAMILY_SI' family cards, 'si_common_early_init' initializes 'didt_rreg' and 'didt_wreg' to 'NULL'. However, 'amdgpu_debugfs_regs_didt_read/write' use 'RREG32_DIDT' and 'WREG32_DIDT' without a NULL check. The other 'amdgpu_ip_block_version' entries that use these two definitions are not added for 'AMDGPU_FAMILY_SI'. So, add a null pointer check before calling. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lu Yao <yaolu@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 18:09:53 -05:00
Mario Limonciello	6967741d26	drm/amd: Enable PCIe PME from D3 When a dGPU is put into BOCO it may be in D3cold but still able to send PME on a display hotplug event. For this to work it must be enabled as a wake source from D3. When runpm is enabled use pci_wake_from_d3() to mark wakeup as enabled by default. Cc: stable@vger.kernel.org # 6.1+ Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 18:09:34 -05:00
Alex Deucher	e222b36e96	drm/amdgpu: fix AGP addressing when GART is not at 0 This worked by luck if the GART aperture ended up at 0. When we ended up moving GART on some chips, the GART aperture ended up offsetting the AGP address since the resource->start is a GART offset, not an MC address. Fix this by moving the AGP address setup into amdgpu_bo_gpu_offset_no_check(). v2: check mem_type before checking agp v3: check if the ttm bo has a ttm_tt allocated yet Fixes: `67318cb843` ("drm/amdgpu/gmc11: set gart placement GC11") Tested-by: Mario Limonciello <mario.limonciello@amd.com> Reported-by: Jesse Zhang <Jesse.Zhang@amd.com> Reported-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: christian.koenig@amd.com Cc: mario.limonciello@amd.com	2023-11-29 18:09:00 -05:00
Prike Liang	c6df7f3137	drm/amdgpu: correct the amdgpu runtime dereference usage count Fix the amdgpu runpm dereference usage count. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-29 18:04:55 -05:00
Tim Huang	6b0b7789a7	drm/amdgpu: fix memory overflow in the IB test Fix a memory overflow issue in the gfx IB test for some ASICs. At least 20 bytes are needed for the IB test packet. v2: correct code indentation errors. (Christian) Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-29 18:03:11 -05:00
Li Ma	5c908a3586	drm/amdgpu: add init_registers for nbio v7.11 enable init_registers callback func for nbio v7.11. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 18:02:36 -05:00
Alex Sierra	4b27a33c3b	drm/amdgpu: Force order between a read and write to the same address Setting register to force ordering to prevent read/write or write/read hazards for un-cached modes. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x	2023-11-29 17:57:33 -05:00
Hawking Zhang	884e9b0827	drm/amdgpu: Do not issue gpu reset from nbio v7_9 bif interrupt In nbio v7_9, the host driver should not issue a GPU reset. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 17:57:02 -05:00
Perry Yuan	8c4e9105b2	drm/amdgpu: optimize RLC powerdown notification on Vangogh The SMU needs the RLC power-down message to sync the RLC state with the SMU. The RLC state update message needs to be sent when the SMU begins its suspend sequence; otherwise the SMU will crash because the driver has not notified it of the RLC state. The RLC state has probably changed after that notification, so the driver needs to notify the SMU of the RLC state at the end of the suspend sequence in amdgpu_device_suspend() to make sure the RLC state is correctly set in the SMU. [ 101.000590] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 [ 101.000598] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff! [ 110.838026] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 [ 110.838035] amdgpu 0000:03:00.0: amdgpu: Failed to disable smu features. [ 110.838039] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features! [ 110.838040] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] ERROR suspend of IP block <smu> failed -62 [ 110.884394] PM: suspend of devices aborted after 21213.620 msecs [ 110.884402] PM: start suspend of devices aborted after 21213.882 msecs [ 110.884405] PM: Some devices failed to suspend, or early wake event detected Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Perry Yuan <perry.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 17:55:47 -05:00
Jonathan Kim	b9eab9e0aa	drm/amdgpu: update xgmi num links info post gc9.4.2 GC IP 9.4.2 and up support TA reporting of the number of xGMI links between peers. Tested-by: Vignesh Chander <vignesh.chander@amd.com> Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 17:53:16 -05:00
Lijo Lazar	36fd9969fa	drm/amdgpu: Use another offset for GC 9.4.3 remap The legacy region at 0x7F000 maps to valid registers in GC 9.4.3 SOCs. Use 0x1A000 offset instead as MMIO register remap region. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:35 -05:00
Lijo Lazar	92e508eaf3	drm/amdgpu: Read aquavanjaram XGMI register state Add support to read state of XGMI links in aquavanjaram SOC. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:35 -05:00
Lijo Lazar	081a6eda2b	drm/amdgpu: Read aquavanjaram PCIE register state Add support to read aqua vanjaram PCIE register state Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:35 -05:00
Lijo Lazar	af39e6f4d8	drm/amdgpu: Add reg_state sysfs attribute Add reg_state attribute to fetch the register snapshot of different IPs like XGMI, WAFL,PCIE and USR. To get a snapshot for a particular IP 1) Open the sysfs file 2) Seek to the offset as defined in amdgpu_sysfs_reg_offset 3) Read Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:35 -05:00
Alex Deucher	9a5095e785	drm/amdgpu: add amdgpu_reg_state.h This header defines the reg state structures exposed via sysfs for umr debugging. v2: add content type Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>	2023-11-29 16:49:24 -05:00
Candice Li	ca0ad76089	drm/amdgpu: Update EEPROM I2C address for smu v13_0_0 Check smu v13_0_0 SKU type to select EEPROM I2C address. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:23 -05:00
Felix Kuehling	9a1c1339ab	drm/amdkfd: Run restore_workers on freezable WQs Make restore workers freezable so we don't have to explicitly flush them in suspend and GPU reset code paths, and we don't accidentally try to restore BOs while the GPU is suspended. Not having to flush restore_work also helps avoid lock/fence dependencies in the GPU reset case where we're not allowed to wait for fences. A side effect of this is, that we can now have multiple concurrent threads trying to signal the same eviction fence. Rework eviction fence signaling and replacement to account for that. The GPU reset path can no longer rely on restore_process_worker to resume queues because evict/restore workers can run independently of it. Instead call a new restore_process_helper directly. This is an RFC and request for testing. v2: - Reworked eviction fence signaling - Introduced restore_process_helper v3: - Handle unsignaled eviction fences in restore_process_bos Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:23 -05:00
Mario Limonciello	bd1f6a31e7	drm/amd: Enable PCIe PME from D3 When a dGPU is put into BOCO it may be in D3cold but still able to send PME on a display hotplug event. For this to work it must be enabled as a wake source from D3. When runpm is enabled use pci_wake_from_d3() to mark wakeup as enabled by default. Cc: stable@vger.kernel.org # 6.1+ Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:23 -05:00
Lu Yao	7b194fdccb	drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer For 'AMDGPU_FAMILY_SI' family cards, 'si_common_early_init' initializes 'didt_rreg' and 'didt_wreg' to 'NULL'. However, 'amdgpu_debugfs_regs_didt_read/write' use 'RREG32_DIDT' and 'WREG32_DIDT' without a NULL check. The other 'amdgpu_ip_block_version' entries that use these two definitions are not added for 'AMDGPU_FAMILY_SI'. So, add a null pointer check before calling. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lu Yao <yaolu@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:23 -05:00
Alex Deucher	ca0b006939	drm/amdgpu: fix AGP addressing when GART is not at 0 This worked by luck if the GART aperture ended up at 0. When we ended up moving GART on some chips, the GART aperture ended up offsetting the AGP address since the resource->start is a GART offset, not an MC address. Fix this by moving the AGP address setup into amdgpu_bo_gpu_offset_no_check(). v2: check mem_type before checking agp v3: check if the ttm bo has a ttm_tt allocated yet Fixes: `67318cb843` ("drm/amdgpu/gmc11: set gart placement GC11") Tested-by: Mario Limonciello <mario.limonciello@amd.com> Reported-by: Jesse Zhang <Jesse.Zhang@amd.com> Reported-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: christian.koenig@amd.com Cc: mario.limonciello@amd.com	2023-11-29 16:49:22 -05:00
Lijo Lazar	201761b5eb	drm/amdgpu: Move mca debug mode decision to ras Refactor code such that ras block decides the default mca debug mode, and not swsmu block. By default mca debug mode is set to false. v2: squash in uninitialized value fix (Alex) Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:01 -05:00
Prike Liang	0e6a12884c	drm/amdgpu: correct the amdgpu runtime dereference usage count Fix the amdgpu runpm dereference usage count. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:00 -05:00
Tim Huang	88f4b10a79	drm/amdgpu: fix memory overflow in the IB test Fix a memory overflow issue in the gfx IB test for some ASICs. At least 20 bytes are needed for the IB test packet. v2: correct code indentation errors. (Christian) Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:00 -05:00
Li Ma	ee95135bfe	drm/amdgpu: add init_registers for nbio v7.11 enable init_registers callback func for nbio v7.11. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:49:00 -05:00
Alex Sierra	20b07b0cb3	drm/amdgpu: Force order between a read and write to the same address Setting register to force ordering to prevent read/write or write/read hazards for un-cached modes. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:48:59 -05:00
Hawking Zhang	4b8251e019	drm/amdgpu: Do not issue gpu reset from nbio v7_9 bif interrupt In nbio v7_9, the host driver should not issue a GPU reset. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:48:59 -05:00
Perry Yuan	2e9b152325	drm/amdgpu: optimize RLC powerdown notification on Vangogh The SMU needs the RLC power-down message to sync the RLC state with the SMU. The RLC state update message needs to be sent when the SMU begins its suspend sequence; otherwise the SMU will crash because the driver has not notified it of the RLC state. The RLC state has probably changed after that notification, so the driver needs to notify the SMU of the RLC state at the end of the suspend sequence in amdgpu_device_suspend() to make sure the RLC state is correctly set in the SMU. [ 101.000590] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 [ 101.000598] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff! [ 110.838026] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 [ 110.838035] amdgpu 0000:03:00.0: amdgpu: Failed to disable smu features. [ 110.838039] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features! [ 110.838040] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] ERROR suspend of IP block <smu> failed -62 [ 110.884394] PM: suspend of devices aborted after 21213.620 msecs [ 110.884402] PM: start suspend of devices aborted after 21213.882 msecs [ 110.884405] PM: Some devices failed to suspend, or early wake event detected Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Perry Yuan <perry.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:48:58 -05:00
Hawking Zhang	702e2fb579	drm/amdgpu: Retire query/reset_ras_err_status from gfx_v9_4_3 Not needed anymore. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:48:58 -05:00
Jonathan Kim	35c425f5cc	drm/amdgpu: update xgmi num links info post gc9.4.2 GC IP 9.4.2 and up support TA reporting of the number of xGMI links between peers. Tested-by: Vignesh Chander <vignesh.chander@amd.com> Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:48:45 -05:00
André Almeida	613ecd6563	drm/amd: Document device reset methods Document what each amdgpu driver reset method does. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-29 16:23:31 -05:00
Thomas Zimmermann	26b9a880d2	Merge drm/drm-next into drm-misc-next Backmerging to get commit `8d6ef26501` ("drm/ast: Disconnect BMC if physical connector is connected") into drm-misc-next. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2023-11-28 15:32:24 +01:00
Luben Tuikov	38f922a563	drm/sched: Reverse run-queue priority enumeration Reverse run-queue priority enumeration such that the highest priority is now 0, and for each consecutive integer the priority diminishes. Run-queues correspond to priorities. To an external observer a scheduler created with a single run-queue, and another created with DRM_SCHED_PRIORITY_COUNT number of run-queues, should always schedule sched->sched_rq[0] with the same "priority", as that index run-queue exists in both schedulers, i.e. a scheduler with one run-queue or many. This patch makes it so. In other words, the "priority" of sched->sched_rq[n], n >= 0, is the same for any scheduler created with any allowable number of run-queues (priorities), 0 to DRM_SCHED_PRIORITY_COUNT. Cc: Rob Clark <robdclark@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Luben Tuikov <ltuikov89@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231124052752.6915-6-ltuikov89@gmail.com	2023-11-24 23:03:53 -05:00
Luben Tuikov	fe375c7480	drm/sched: Rename priority MIN to LOW Rename DRM_SCHED_PRIORITY_MIN to DRM_SCHED_PRIORITY_LOW. This mirrors DRM_SCHED_PRIORITY_HIGH, for a list of DRM scheduler priorities in ascending order, DRM_SCHED_PRIORITY_LOW, DRM_SCHED_PRIORITY_NORMAL, DRM_SCHED_PRIORITY_HIGH, DRM_SCHED_PRIORITY_KERNEL. Cc: Rob Clark <robdclark@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Luben Tuikov <ltuikov89@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231124052752.6915-5-ltuikov89@gmail.com	2023-11-24 23:03:53 -05:00
Daniel Vetter	c79b972eb8	Merge tag 'drm-misc-next-2023-11-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 6.8: UAPI Changes: - drm: Introduce CLOSE_FB ioctl - drm/dp-mst: Documentation for the PATH property - fdinfo: Do not align to a MB if the size is larger than 1MiB - virtio-gpu: add explicit virtgpu context debug name Cross-subsystem Changes: - dma-buf: Add dma_fence_timestamp helper Core Changes: - client: Do not acquire module reference - edid: split out drm_eld, add SAD helpers - format-helper: Cache format conversion buffers - sched: Move from a kthread to a workqueue, rename some internal functions to make it clearer, implement dynamic job-flow control - gpuvm: Provide more features to handle GEM objects - tests: Remove slow kunit tests Driver Changes: - ivpu: Update FW API, new debugfs file, a new NOP job submission test mode, improve suspend/resume, PM improvements, MMU PT optimizations, firmware profiling frequency support, support for uncached buffers, switch to gem shmem helpers, replace kthread with threaded interrupts - panfrost: PM improvements - qaic: Allow to run with a single MSI, support host/device time synchronization, misc improvements - simplefb: Support memory-regions, support power-domains - ssd130x: Unitialized variable fixes - omapdrm: dma-fence lockdep annotation fix - tidss: dma-fence lockdep annotation fix - v3d: Support BCM2712 (RaspberryPi5), Support fdinfo and gputop - panel: - edp: Support AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49 V8.0, plus a whole bunch of panels used on Mediatek chromebooks. 
Note that the one missing s-o-b for `0da611a870` ("dma-buf: add dma_fence_timestamp helper") has been supplied here, and rebasing the entire tree with upsetting committers didn't seem worth the trouble: https://lore.kernel.org/dri-devel/ce94020e-a7d4-4799-b87d-fbea7b14a268@gmail.com/ Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/y4awn5vcfy2lr2hpauo7rc4nfpnc6kksr7btmnwaz7zk63pwoi@gwwef5iqpzva	2023-11-20 09:50:09 +01:00
Srinivasan Shanmugam	699d392903	drm/amdgpu: Add function parameter 'xcc_mask' not described in 'amdgpu_vm_flush_compute_tlb' Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1373: warning: Function parameter or member 'xcc_mask' not described in 'amdgpu_vm_flush_compute_tlb' Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 09:30:51 -05:00
Prike Liang	425285d39a	drm/amdgpu: add amdgpu runpm usage trace for separate funcs Add trace for amdgpu runpm separate funcs usage and this will help debugging on the case of runpm usage missed to dereference. In the normal case the runpm usage count referred by one kind of functionality pairwise and usage should be changed from 1 to 0, otherwise there will be an issue in the amdgpu runpm usage dereference. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 09:30:51 -05:00
Shiwu Zhang	75fb313c55	drm/amdgpu: expose the connected port num info through sysfs By catting the xgmi_port_num sysfs node, it prints out the info in the format of <src node id>:<src port num> -> <dst node id>:<dst port num> for one xgmi link. For example, in case of 4 sockets fully and evenly connected setup, it would be like as below for the first node in the hive. 01:02 -> 02:03 01:03 -> 02:02 01:07 -> 03:04 01:04 -> 03:07 01:06 -> 04:05 01:05 -> 04:06 Based on the fact that there is two xgmi links between each socket pair, "01:02 -> 02:03" means that the current socket in question use the port 2 to connect with port 3 of the second node in the hive and so on. v2: print out the src/dst node id for each xgmi link (lijo) v3: replace the current_node++ with +1 to align with dst node (le) and use the dev_err instead of pr_err (lijo) v4: fix checkpatch warning (alex) Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 09:30:51 -05:00
Mario Limonciello	d9b3a066df	drm/amd: Exclude dGPUs in eGPU enclosures from DPM quirks The PCIe speed capabilities advertised by a USB4 or TBT3 link are limited to PCIe gen 1 per the USB4 spec. In reality the speed will change dynamically based on fabric conditions and other traffic. DPM is disabled when dGPUs are connected directly to Intel hosts since the PCIe root port isn't able to handle dynamic speed switching. As this limitation is specifically for PCIe root ports in the SoC, don't apply it when connected to an eGPU enclosure connected to an Intel host. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2885 Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 09:30:51 -05:00
Mario Limonciello	466a7d1153	drm/amd: Use the first non-dGPU PCI device for BW limits When bandwidth limits are looked up using pcie_bandwidth_available() virtual links such as USB4 are analyzed which might not represent the real speed. Furthermore devices may change speeds autonomously which may introduce conditional variation to the results reported in the status registers. Instead look at the capabilities of first PCI device outside of dGPU to decide upper limits that the dGPU will work at. For eGPU this effectively means that it will use the speed of the link partner. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860 Link: https://www.usb.org/document-library/usb4r-specification-v20 USB4 V2 with Errata and ECN through June 2023 Section 11.2.1 Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 09:30:50 -05:00
Srinivasan Shanmugam	8a1de314d1	drm/amdgpu: Refactor 'amdgpu_connector_dvi_detect' in amdgpu_connectors.c Fixes the below: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Missing a blank line after declarations WARNING: Too many leading tabs - consider code refactoring + if (list_connector->connector_type != DRM_MODE_CONNECTOR_VGA) { WARNING: Too many leading tabs - consider code refactoring + if (!amdgpu_display_hpd_sense(adev, amdgpu_connector->hpd.hpd)) { Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 09:29:54 -05:00
Sam James	b5a52d2afe	amdgpu: Adjust kmalloc_array calls for new -Walloc-size GCC 14 introduces a new -Walloc-size included in -Wextra which errors out on various files in drivers/gpu/drm/amd/amdgpu like: ``` amdgpu_amdkfd_gfx_v8.c:241:15: error: allocation of insufficient size ‘4’ for type ‘uint32_t[2]’ {aka ‘unsigned int[2]'} with size ‘8’ [-Werror=alloc-size] ``` This is because each HQD_N_REGS is actually a uint32_t[2]. Move the * 2 to the size argument so GCC sees we're allocating enough. Originally did 'sizeof(uint32_t) * 2' for the size but a friend suggested 'sizeof(**dump)' better communicates the intent. Link: https://lore.kernel.org/all/87wmuwo7i3.fsf@gentoo.org/ Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 09:29:54 -05:00
Felix Kuehling	94e2dae0a8	drm/amdkfd: Move TLB flushing logic into amdgpu This will make it possible for amdgpu GEM ioctls to flush TLBs on compute VMs. This removes VMID-based TLB flushing and always uses PASID-based flushing. This still works because it scans the VMID-PASID mapping registers to find the right VMID. It's only slightly less efficient. This is not a production use case. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 09:29:53 -05:00
Felix Kuehling	e6ed364efa	drm/amdgpu: update mappings not managed by KFD When restoring after an eviction, use amdgpu_vm_handle_moved to update BO VA mappings in KFD VMs that are not managed through the KFD API. This should allow using the render node API to create more flexible memory mappings in KFD VMs. v2: rebase on drm_exec changes (Alex) Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 09:29:53 -05:00
Arunpravin Paneer Selvam	c8031019dc	drm/amdgpu: Implement a new 64bit sequence memory driver Developed a new driver which allocates a 64bit memory on each request in sequence order. At the moment, user queue fence memory is the main consumer of this seq64 driver. v2: Worked on review comments from Christian for the following modifications - Move driver name from "semaphore" to "seq64" - Remove unnecessary PT/PD mapping - Move enable_mes check into init/fini functions. v3: Worked on review comments from Christian - drop enable_mes check - use DECLARE_BITMAP for bit array - added kerneldoc for seq64 v4: Worked on review comments from Christian - Rename amdgpu_seq64_get name with amdgpu_seq64_alloc v5: Worked on review comments from Christian - Fix seq64 lockdep warning - move fpriv->seq64_va check into amdgpu_seq64_unmap() - make the function amdgpu_seq64_unmap() return as void. - reserve the buffers as not interruptible. v6: port to drm_exec (Alex) v7: disable for now (Arun) Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 09:29:53 -05:00
Alex Deucher	e8c2d3e25b	drm/amdgpu/gmc9: disable AGP aperture We've had misc reports of random IOMMU page faults when this is used. It's just a rarely used optimization anyway, so let's just disable it. It can still be toggled via the module parameter for testing. v2: leave it configurable via module parameter Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1) Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:58:41 -05:00
Alex Deucher	61fc93695b	drm/amdgpu/gmc10: disable AGP aperture We've had misc reports of random IOMMU page faults when this is used. It's just a rarely used optimization anyway, so let's just disable it. It can still be toggled via the module parameter for testing. v2: leave it configurable via module parameter Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1) Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:58:34 -05:00
Alex Deucher	0db062eac3	drm/amdgpu/gmc11: disable AGP aperture We've had misc reports of random IOMMU page faults when this is used. It's just a rarely used optimization anyway, so let's just disable it. It can still be toggled via the module parameter for testing. v2: leave it configurable via module parameter Fixes: `67318cb843` ("drm/amdgpu/gmc11: set gart placement GC11") Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1) Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:58:28 -05:00
Alex Deucher	6ba5b61383	drm/amdgpu: add a module parameter to control the AGP aperture Add a module parameter to control the AGP aperture. The AGP aperture is an aperture in the GPU's internal address space which provides direct non-paged access to the platform address space. This access is non-snooped so only uncached memory can be accessed. Add a knob so that we can toggle this for debugging. Fixes: `67318cb843` ("drm/amdgpu/gmc11: set gart placement GC11") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:58:20 -05:00
Alex Deucher	564ca1b53e	drm/amdgpu/gmc11: fix logic typo in AGP check Should be && rather than ||. Fixes: `b2e1cbe628` ("drm/amdgpu/gmc11: disable AGP on GC 11.5") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:58:11 -05:00
Shiwu Zhang	9ddea8c977	drm/amdgpu: add and populate the port num into xgmi topology info The port num info is firstly introduced with 20.00.01.13 xgmi ta and make them as part of topology info. Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:53:49 -05:00
Yang Wang	07ee43faeb	drm/amdgpu: fix ras err_data null pointer issue in amdgpu_ras.c fix ras err_data null pointer issue in amdgpu_ras.c Fixes: `8cc0f5669e` ("drm/amdgpu: Support multiple error query modes") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:51:58 -05:00
YuanShang	50d51374b4	drm/amdgpu: correct chunk_ptr to a pointer to chunk. The variable "chunk_ptr" should be a pointer pointing to a struct drm_amdgpu_cs_chunk instead of to a pointer of that. Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:48:41 -05:00
Srinivasan Shanmugam	8a0173cd90	drm/amdgpu: Address member 'ring' not described in 'amdgpu_ vce, uvd_entity_init()' Fixes the following: drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:237: warning: Function parameter or member 'ring' not described in 'amdgpu_vce_entity_init' drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:405: warning: Function parameter or member 'ring' not described in 'amdgpu_uvd_entity_init' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:47:14 -05:00
Le Ma	bdb72185d3	drm/amdgpu: finalizing mem_partitions at the end of GMC v9 sw_fini The valid num_mem_partitions is required during ttm pool fini, thus move the cleanup at the end of the function. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:47:05 -05:00
Victor Lu	0288603040	drm/amdgpu: Do not program VF copy regs in mmhub v1.8 under SRIOV (v2) MC_VM_AGP_* registers should not be programmed by guest driver. v2: move early return outside of loop Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Samir Dhume <samir.dhume@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-17 00:46:27 -05:00
Maxime Ripard	3bf3e21c15	Merge drm/drm-next into drm-misc-next Let's kickstart the v6.8 release cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>	2023-11-15 10:56:44 +01:00
Linus Torvalds	c0d12d7692	drm fixes for 6.7-rc1 - big pile of amd fixes, but mostly hw support newly added in 6.7 - i915 fixes, mostly minor things - qxl memory leak fix - vc4 uaf fix in mock helpers - syncobj fix for DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEb4nG6jLu8Y5XI+PfTA9ye/CYqnEFAmVOlEQACgkQTA9ye/CY qnGawQ//d7M2CZpd9LvkRnaV2+fG9yGONOwZsG5fVtXrfT4RDQmITC9KMEh2TxJb s1W6HA+UijMMx6RtQN6cTYNHeYDaX2b55/g3lMnreXydii0COJwkWe52iFn0Dpcm RpsT5cYLEiRtiTvEzKbrkxS+rrQMu9jwxcA23b+lMkmybVgqQe1m9hYtRxZCFqr7 6BMKOgrCRoY1mYZrNaccXBHvvgOtcpWPOsuNNgjW3MDKB663BmpABT1iDg3xxPdX BI8SAl6PX6ju2Jwi3WPlscmI199fhcUXDuqb8LXhJsuqynhwU940aUlxcbI7Hz6Y LaFwSK4OiaIsIC8yisa7cZ1z2mqnMIiGXasP6mfYVmYqpGYMW+AZcOmzui/MLiGd duOcvK/bLxN7moSqcKgz+mmrLfZzJkPLV8pGEVk0IbTn239AnR76bqrEQJ+Iqukx d4yLGE4OJchD4zzw5RlVmQIhwA8M/5croJBIo6yNyQB5xgN/Krd4QUc6KjjgzNo8 e402NjaW2/PqWIiPtsL5tK4XdkVtvMvVUq+bJSch6Wfn3j9u06/wgFl1FBRyX7zS 3QYvMtF/QM5/QTpbdl82hSSsJ78iO62tPYSOhycIxgB/BHoc/fap+IOFtKKnT3RZ xr7UiwAQA043gAMvS+TkZAc6bFW8U8Dzxu5XxEPE1L+WU6XCSbs= =l45e -----END PGP SIGNATURE----- Merge tag 'drm-next-2023-11-10' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Daniel Vetter: "Dave's VPN to the big machine died, so it's on me to do fixes pr this and next week while everyone else is at plumbers. 
- big pile of amd fixes, but mostly for hw support newly added in 6.7 - i915 fixes, mostly minor things - qxl memory leak fix - vc4 uaf fix in mock helpers - syncobj fix for DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE" * tag 'drm-next-2023-11-10' of git://anongit.freedesktop.org/drm/drm: (78 commits) drm/amdgpu: fix error handling in amdgpu_vm_init drm/amdgpu: Fix possible null pointer dereference drm/amdgpu: move UVD and VCE sched entity init after sched init drm/amdgpu: move kfd_resume before the ip late init drm/amd: Explicitly check for GFXOFF to be enabled for s0ix drm/amdgpu: Change WREG32_RLC to WREG32_SOC15_RLC where inst != 0 (v2) drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v5) drm/amdgpu: add smu v13.0.6 pcs xgmi ras error query support drm/amdgpu: fix software pci_unplug on some chips drm/amd/display: remove duplicated argument drm/amdgpu: correct mca debugfs dump reg list drm/amdgpu: correct acclerator check architecutre dump drm/amdgpu: add pcs xgmi v6.4.0 ras support drm/amdgpu: Change extended-scope MTYPE on GC 9.4.3 drm/amdgpu: disable smu v13.0.6 mca debug mode by default drm/amdgpu: Support multiple error query modes drm/amdgpu: refine smu v13.0.6 mca dump driver drm/amdgpu: Do not program PF-only regs in hdp_v4_0.c under SRIOV (v2) drm/amdgpu: Skip PCTL0_MMHUB_DEEPSLEEP_IB write in jpegv4.0.3 under SRIOV drm: amd: Resolve Sphinx unexpected indentation warning ...	2023-11-10 14:59:30 -08:00
Daniel Vetter	03df0fc007	amd-drm-next-6.7-2023-11-10: amdgpu: - SR-IOV fixes - DMCUB fixes - DCN3.5 fixes - DP2 fixes - SubVP fixes - SMU14 fixes - SDMA4.x fixes - Suspend/resume fixes - AGP regression fix - UAF fixes for some error cases - SMU 13.0.6 fixes - Documentation fixes - RAS fixes - Hotplug fixes - Scheduling entity ordering fix - GPUVM fixes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZU58aAAKCRC93/aFa7yZ 2PvJAQDF1IHj90BAqH3EzOx7p2jkGVeK1p+em2sS051kOvpgiAD/fvZovVUBmt/V tD0NOtkL8bqmIavP3vDV0Yvf9tW48Qs= =Z4Je -----END PGP SIGNATURE----- Merge tag 'amd-drm-next-6.7-2023-11-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.7-2023-11-10: amdgpu: - SR-IOV fixes - DMCUB fixes - DCN3.5 fixes - DP2 fixes - SubVP fixes - SMU14 fixes - SDMA4.x fixes - Suspend/resume fixes - AGP regression fix - UAF fixes for some error cases - SMU 13.0.6 fixes - Documentation fixes - RAS fixes - Hotplug fixes - Scheduling entity ordering fix - GPUVM fixes Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231110190703.4741-1-alexander.deucher@amd.com	2023-11-10 20:51:38 +01:00
Christian König	8473bfdcb5	drm/amdgpu: fix error handling in amdgpu_vm_init When clearing the root PD fails we need to properly release it again. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-10 11:33:28 -05:00
Felix Kuehling	256503071c	drm/amdgpu: Fix possible null pointer dereference mem = bo->tbo.resource may be NULL in amdgpu_vm_bo_update. Fixes: `1802537820` ("drm/ttm: stop allocating dummy resources during BO creation") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-10 11:33:28 -05:00
Alex Deucher	037b98a231	drm/amdgpu: move UVD and VCE sched entity init after sched init We need kernel scheduling entities to deal with handle clean up if apps are not cleaned up properly. With commit `56e449603f` ("drm/sched: Convert the GPU scheduler to variable number of run-queues") the scheduler entities have to be created after scheduler init, so change the ordering to fix this. v2: Leave logic in UVD and VCE code Fixes: `56e449603f` ("drm/sched: Convert the GPU scheduler to variable number of run-queues") Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <ltuikov89@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: ltuikov89@gmail.com	2023-11-10 11:33:08 -05:00
Tim Huang	8ed79c409e	drm/amdgpu: move kfd_resume before the ip late init The kfd_resume needs to touch GC registers to enable the interrupts, it needs to be done before GFXOFF is enabled to ensure that the GFX is not off and GC registers can be touched. So move kfd_resume before the amdgpu_device_ip_late_init which enables the CGPG/GFXOFF. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-10 11:08:33 -05:00
Mario Limonciello	e4c44b1a19	drm/amd: Explicitly check for GFXOFF to be enabled for s0ix If a user has disabled GFXOFF this may cause problems for the suspend sequence. Ensure that it is enabled in amdgpu_acpi_is_s0ix_active(). The system won't reach the deepest state but it also won't hang. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-10 11:08:20 -05:00
Daniel Vetter	aec3e2e23b	Merge tag 'drm-misc-fixes-2023-11-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-fixes for v6.7-rc1: qxl: - qxl memory leak fix. syncobj: - Fix waiting for DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE vc4: - Fix UAF in mock helpers Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> [sima: Stitch together both changelogs from Maarten. Also because of branch history this contains a few more bugfixes which are already in v6.6, but I didn't feel like this justifies some backmerge since there wasn't any real conflict.] Link: https://patchwork.freedesktop.org/patch/msgid/bc8598ee-d427-4616-8ebd-64107ab9a2d8@linux.intel.com	2023-11-10 16:57:49 +01:00
Danilo Krummrich	a78422e9df	drm/sched: implement dynamic job-flow control Currently, job flow control is implemented simply by limiting the number of jobs in flight. Therefore, a scheduler is initialized with a credit limit that corresponds to the number of jobs which can be sent to the hardware. This implies that for each job, drivers need to account for the maximum job size possible in order to not overflow the ring buffer. However, there are drivers, such as Nouveau, where the job size has a rather large range. For such drivers it can easily happen that job submissions not even filling the ring by 1% can block subsequent submissions, which, in the worst case, can lead to the ring run dry. In order to overcome this issue, allow for tracking the actual job size instead of the number of jobs. Therefore, add a field to track a job's credit count, which represents the number of credits a job contributes to the scheduler's credit limit. Signed-off-by: Danilo Krummrich <dakr@redhat.com> Reviewed-by: Luben Tuikov <ltuikov89@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231110001638.71750-1-dakr@redhat.com	2023-11-10 02:54:29 +01:00
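The credit accounting described in a78422e9df can be pictured with a small model. This is only an illustrative sketch in Python, not the drm/sched code: the class and method names are hypothetical, and it assumes jobs are submitted strictly in queue order. Each job carries its own credit count, and a job is submitted only while the sum of in-flight credits stays within the scheduler's credit limit.

```python
from collections import deque


class CreditScheduler:
    """Toy model of credit-based job flow control (hypothetical names)."""

    def __init__(self, credit_limit):
        self.credit_limit = credit_limit  # total capacity, in credits
        self.credits_in_flight = 0        # credits held by submitted jobs
        self.pending = deque()            # (name, credits) waiting for room

    def queue(self, name, credits):
        # A job larger than the whole limit could never be submitted.
        if credits > self.credit_limit:
            raise ValueError("job exceeds the scheduler's credit limit")
        self.pending.append((name, credits))

    def submit_ready(self):
        """Submit queued jobs in order while the next one still fits."""
        submitted = []
        while self.pending:
            name, credits = self.pending[0]
            if self.credits_in_flight + credits > self.credit_limit:
                break  # not enough room; wait for completions
            self.pending.popleft()
            self.credits_in_flight += credits
            submitted.append(name)
        return submitted

    def complete(self, credits):
        # A job finished on the hardware: give its credits back.
        self.credits_in_flight -= credits
```

With a limit of 10 credits, two 1-credit jobs and one 8-credit job fit together, whereas a plain job-count limit sized for the largest possible job would have admitted far fewer small jobs.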
Victor Lu	1972642843	drm/amdgpu: Change WREG32_RLC to WREG32_SOC15_RLC where inst != 0 (v2) W/RREG32_RLC is hardedcoded to use instance 0. W/RREG32_SOC15_RLC should be used instead when inst != 0. v2: rebase Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:03:16 -05:00
Victor Lu	85150626ea	drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v5) amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0. Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC and amdgpu_device_xcc_wreg/rreg to use the new xcc_id parameter. Using amdgpu_sriov_runtime to determine whether to access via kiq or RLC is sufficient for now. v5: add condition in amdgpu_device_xcc_w/rreg, remove trace func call v4: avoid using amdgpu_sriov_w/rreg v3: use W/RREG32_XCC to handle non-kiq case v2: define amdgpu_device_xcc_wreg/rreg instead of changing parameters of amdgpu_device_wreg/rreg Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:03:07 -05:00
Yang Wang	76d2da18af	drm/amdgpu: add smu v13.0.6 pcs xgmi ras error query support add pcs xgmi ras error query support for smu v13.0.6. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:02:59 -05:00
Vitaly Prosyak	4638e0c29a	drm/amdgpu: fix software pci_unplug on some chips When software 'pci unplug' using IGT is executed we got a sysfs directory entry is NULL for differant ras blocks like hdp, umc, etc. Before call 'sysfs_remove_file_from_group' and 'sysfs_remove_group' check that 'sd' is not NULL. [ +0.000001] RIP: 0010:sysfs_remove_group+0x83/0x90 [ +0.000002] Code: 31 c0 31 d2 31 f6 31 ff e9 9a a8 b4 00 4c 89 e7 e8 f2 a2 ff ff eb c2 49 8b 55 00 48 8b 33 48 c7 c7 80 65 94 82 e8 cd 82 bb ff <0f> 0b eb cc 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 [ +0.000001] RSP: 0018:ffffc90002067c90 EFLAGS: 00010246 [ +0.000002] RAX: 0000000000000000 RBX: ffffffff824ea180 RCX: 0000000000000000 [ +0.000001] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ +0.000001] RBP: ffffc90002067ca8 R08: 0000000000000000 R09: 0000000000000000 [ +0.000001] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ +0.000001] R13: ffff88810a395f48 R14: ffff888101aab0d0 R15: 0000000000000000 [ +0.000001] FS: 00007f5ddaa43a00(0000) GS:ffff88841e800000(0000) knlGS:0000000000000000 [ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000001] CR2: 00007f8ffa61ba50 CR3: 0000000106432000 CR4: 0000000000350ef0 [ +0.000001] Call Trace: [ +0.000001] <TASK> [ +0.000001] ? show_regs+0x72/0x90 [ +0.000002] ? sysfs_remove_group+0x83/0x90 [ +0.000002] ? __warn+0x8d/0x160 [ +0.000001] ? sysfs_remove_group+0x83/0x90 [ +0.000001] ? report_bug+0x1bb/0x1d0 [ +0.000003] ? handle_bug+0x46/0x90 [ +0.000001] ? exc_invalid_op+0x19/0x80 [ +0.000002] ? asm_exc_invalid_op+0x1b/0x20 [ +0.000003] ? sysfs_remove_group+0x83/0x90 [ +0.000001] dpm_sysfs_remove+0x61/0x70 [ +0.000002] device_del+0xa3/0x3d0 [ +0.000002] ? 
ktime_get_mono_fast_ns+0x46/0xb0 [ +0.000002] device_unregister+0x18/0x70 [ +0.000001] i2c_del_adapter+0x26d/0x330 [ +0.000002] arcturus_i2c_control_fini+0x25/0x50 [amdgpu] [ +0.000236] smu_sw_fini+0x38/0x260 [amdgpu] [ +0.000241] amdgpu_device_fini_sw+0x116/0x670 [amdgpu] [ +0.000186] ? mutex_lock+0x13/0x50 [ +0.000003] amdgpu_driver_release_kms+0x16/0x40 [amdgpu] [ +0.000192] drm_minor_release+0x4f/0x80 [drm] [ +0.000025] drm_release+0xfe/0x150 [drm] [ +0.000027] __fput+0x9f/0x290 [ +0.000002] ____fput+0xe/0x20 [ +0.000002] task_work_run+0x61/0xa0 [ +0.000002] exit_to_user_mode_prepare+0x150/0x170 [ +0.000002] syscall_exit_to_user_mode+0x2a/0x50 Cc: Hawking Zhang <hawking.zhang@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:02:49 -05:00
Yang Wang	8140b07b0a	drm/amdgpu: correct mca debugfs dump reg list avoid driver to touch invalid mca reg. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:02:32 -05:00
Hawking Zhang	d406aec8dc	drm/amdgpu: correct acclerator check architecutre dump So driver doesn't touch invalid aca entries. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:02:26 -05:00
Yang Wang	27d80f7d68	drm/amdgpu: add pcs xgmi v6.4.0 ras support add pcs xgmi v6.4.0 ras support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:02:20 -05:00
David Yat Sin	4abf0b0bdf	drm/amdgpu: Change extended-scope MTYPE on GC 9.4.3 Change local memory type to MTYPE_UC on revision id 0 Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:02:14 -05:00
Hawking Zhang	8cc0f5669e	drm/amdgpu: Support multiple error query modes Direct error query mode and firmware error query mode are supported for now. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:01:58 -05:00
Yang Wang	07c1db7036	drm/amdgpu: refine smu v13.0.6 mca dump driver refine smu mca driver to support query ras error from pmfw path. - correct gfx smu bank hwid (from mp5 to smu bank) - retire unused callback function in amdgpu_mca_smu_funcs{} - add new mca_bank_set{} structure to collect mca bank - move enum mca_reg_idx into amdgpu_mca.h header - add mca status register field decode macro Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:01:51 -05:00
Victor Lu	0b1695710a	drm/amdgpu: Do not program PF-only regs in hdp_v4_0.c under SRIOV (v2) The following regs can only be programmed by the PF: HDP_MISC_CNTL HDP_NONSURFACE_BASE HDP_NONSURFACE_BASE_HI v2: update commit message Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Samir Dhume <samir.dhume@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:01:42 -05:00
Victor Lu	a78b481469	drm/amdgpu: Skip PCTL0_MMHUB_DEEPSLEEP_IB write in jpegv4.0.3 under SRIOV PCTL0_MMHUB_DEEPSLEEP_IB is blocked for VF access Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Samir Dhume <samir.dhume@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:01:35 -05:00
Yang Wang	bf13da6ae1	drm/amdgpu: correct smu v13.0.6 umc ras error check correct smu v13.0.0 umc ras error check Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:01:20 -05:00
Victor Lu	bc3c566071	drm/amdgpu: Add xcc param to SRIOV kiq write and WREG32_SOC15_IP_NO_KIQ (v4) WREG32/RREG32_SOC15_IP_NO_KIQ and amdgpu_virt_kiq_reg_write_reg_wait are not using the correct rlcg interface or mec engine, respectively. Add xcc instance parameter to them. v4: Use GET_INST and squash commit with: "drm/amdgpu: Add xcc_inst param to amdgpu_virt_kiq_reg_write_reg_wait" v3: xcc not needed for MMMHUB v2: rebase Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:01:10 -05:00
Victor Lu	f64c3fce46	drm/amdgpu: Add flag to enable indirect RLCG access for gfx v9.4.3 The "rlcg_reg_access_supported" flag is missing. Add it back in. Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:01:01 -05:00
Yang Wang	4eaa007c73	drm/amdgpu: correct amdgpu ip block rev info correct following amdgpu ip block version information: - gfx_v9_4_3 - sdma_v4_4_2 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:00:48 -05:00
Tao Zhou	61d7052216	drm/amdgpu: Don't warn for unsupported set_xgmi_plpd_mode set_xgmi_plpd_mode may be unsupported and this isn't error, no need to print warning for it. v2: add ret2 to save the status of psp_ras_trigger_error. Suggested-by: lijo.lazar@amd.com Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-09 17:00:32 -05:00
Christian König	17daf01ab4	drm/amdgpu: lower CS errors to debug severity Otherwise userspace can spam the logs by using incorrect input values. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-09 17:00:17 -05:00
Christian König	12f76050d8	drm/amdgpu: fix error handling in amdgpu_bo_list_get() We should not leak the pointer where we couldn't grab the reference on to the caller because it can be that the error handling still tries to put the reference then. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-09 16:59:57 -05:00
Alex Deucher	bff3315ba8	drm/amdgpu: fix AGP init order The default AGP settings were overwriting the IP selected ones since the default was getting set after the IP ones were selected. Fixes: `de59b69932` ("drm/amdgpu/gmc: set a default disable value for AGP") Link: https://lists.freedesktop.org/archives/amd-gfx/2023-November/100966.html Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>	2023-11-09 16:59:46 -05:00
Linus Torvalds	25b6377007	drm next and fixes for 6.7-rc1 renesas: - atomic conversion - DT support ssd13xx: - dt binding fix for ssd132x - Initialize ssd130x crtc_state to NULL. amdgpu: - Fix RAS support check - RAS fixes - MES fixes - SMU13 fixes - Contiguous memory allocation fix - BACO fixes - GPU reset fixes - Min power limit fixes - GFX11 fixes - USB4/TB hotplug fixes - ARM regression fix - GFX9.4.3 fixes - KASAN/KCSAN stack size check fixes - SR-IOV fixes - SMU14 fixes - PSP13 fixes - Display blend fixes - Flexible array size fixes amdkfd: - GPUVM fix radeon: - Flexible array size fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmVJmdsACgkQDHTzWXnE hr5hphAAoFdk4ma7TauUNyP3JoUwht+Ohm9NcUHq/9P9kOCwPIIehxMZnwPoTGyw VWwXpqpeVsW6zyMgfxWq/+P1S1C5LvZ1HLccbP3xv327fUZ1QnLapHwFxT1SYNpi Tw5qhQN/cwNOX0Pc9uBavYmJzf54OhvPxt2CHHPDShsHBOBc0Gd88gKJr7GWUF5M Ri6i20Tfsgq8AopWQj9628TT+y/aN3rVIfYYBiNejxejlFtt1HODkKFX3DuBNPOI bZOtQm11cbxmX7/2RI92mI20axUb4UMNIQFDYEl3bVlyPyEhhwKPQMSDxhQQodPg 8zMY9Fbl4Z4VaOFEDbpiRv/0/HeWoLefmpQ5LZbz35RhKLTkwsWXHUELPUEj3uTr 7+EnLwuvQdtbT9W2J8btO7v1dHOy86ArlnmqNjB2cEnvGaR3DNM4jxPVLn60SyAc N7CFWNU4EoJf02XhwAludYa5pQHEaTFL8ss6TSoRWXuHHg1vNdu2SorOvkcGkg28 q/t28gZDZaOpXfMmf3ec+PHhO6nxXLXJRbiksVP/rpXQQ42cEI72hr6//UYLRuzg BbYPZ8uXmDjsSZIqYceZwc3Vr6oEmD6EAzHM9+zwS5h0IZ12jKTmFiDEAhG2DwoG 8PCaj5UXkxVK+6iHndz8Qwg4+Fu1j5nodKM+vBNem/iSnxC/OVw= =TsFN -----END PGP SIGNATURE----- Merge tag 'drm-next-2023-11-07' of git://anongit.freedesktop.org/drm/drm Pull more drm updates from Dave Airlie: "Geert pointed out I missed the renesas reworks in my main pull, so this pull contains the renesas next work for atomic conversion and DT support. It also contains a bunch of amdgpu and some small ssd13xx fixes. renesas: - atomic conversion - DT support ssd13xx: - dt binding fix for ssd132x - Initialize ssd130x crtc_state to NULL. 
amdgpu: - Fix RAS support check - RAS fixes - MES fixes - SMU13 fixes - Contiguous memory allocation fix - BACO fixes - GPU reset fixes - Min power limit fixes - GFX11 fixes - USB4/TB hotplug fixes - ARM regression fix - GFX9.4.3 fixes - KASAN/KCSAN stack size check fixes - SR-IOV fixes - SMU14 fixes - PSP13 fixes - Display blend fixes - Flexible array size fixes amdkfd: - GPUVM fix radeon: - Flexible array size fixes" * tag 'drm-next-2023-11-07' of git://anongit.freedesktop.org/drm/drm: (83 commits) drm/amd/display: Enable fast update on blendTF change drm/amd/display: Fix blend LUT programming drm/amd/display: Program plane color setting correctly drm/amdgpu: Query and report boot status drm/amdgpu: Add psp v13 function to query boot status drm/amd/swsmu: remove fw version check in sw_init. drm/amd/swsmu: update smu v14_0_0 driver if and metrics table drm/amdgpu: Add C2PMSG_109/126 reg field shift/masks drm/amdgpu: Optimize the asic type fix code drm/amdgpu: fix GRBM read timeout when do mes_self_test drm/amdgpu: check recovery status of xgmi hive in ras_reset_error_count drm/amd/pm: only check sriov vf flag once when creating hwmon sysfs drm/amdgpu: Attach eviction fence on alloc drm/amdkfd: Improve amdgpu_vm_handle_moved drm/amd/display: Increase frame warning limit with KASAN or KCSAN in dml2 drm/amd/display: Avoid NULL dereference of timing generator drm/amdkfd: Update cache info for GFX 9.4.3 drm/amdkfd: Populate cache info for GFX 9.4.3 drm/amdgpu: don't put MQDs in VRAM on ARM \| ARM64 drm/amdgpu/smu13: drop compute workload workaround ...	2023-11-07 17:10:02 -08:00
Tao Zhou	20238a2cc9	drm/amdgpu: add RAS reset/query operations for XGMI v6_4 Reset/query RAS error status and count. v2: use XGMI IP version instead of WAFL version. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-07 12:03:31 -05:00
Tao Zhou	61fe5536d0	drm/amdgpu: handle extra UE register entries for gfx v9_4_3 The UE registe list is larger than CE list. Reported-by: yipeng.chai@amd.com Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-07 12:03:31 -05:00
Lijo Lazar	0553eb9f33	drm/amdgpu: Fix sdma 4.4.2 doorbell rptr/wptr init Doorbell rptr/wptr can be set through multiple ways including direct register initialization. Disable doorbell during hw_fini once the ring is disabled so that during next module reload direct initialization takes effect. Also, move the direct initialization after minor update is set to 1 since rptr/wptr are reinitialized back to 0 which could be lower than the previous doorbell value (ex: cases like module reload). Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-07 12:03:30 -05:00
Jiadong Zhu	9c561ca2d3	drm/amdgpu/soc21: add mode2 asic reset for SMU IP v14.0.0 Set the default reset method to mode2 for SMU IP v14.0.0 Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-07 12:03:30 -05:00
Surbhi Kakarya	9256e8d47a	drm/amd: Disable XNACK on SRIOV environment The purpose of this patch is to disable XNACK or set XNACK OFF mode on SRIOV platform which doesn't support it. This will prevent user-space application to fail or result into unexpected behaviour whenever the application need to run test-case in XNACK ON mode. Signed-off-by: Surbhi Kakarya <surbhi.kakarya@amd.com> Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-07 11:15:37 -05:00
Hawking Zhang	23618280cc	drm/amdgpu: Query and report boot status Query boot status and report boot errors. A follow up change is needed to stop GPU initialization if boot fails. v2: only invoke the call for dGPU (Le/Lijo) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-03 12:18:33 -04:00
Hawking Zhang	df57e019d5	drm/amdgpu: Add psp v13 function to query boot status Add psp v13 function to query boot status. v2: limit the use case to dGPU only (Lijo) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-03 12:18:33 -04:00
Ma Jun	dbab63561b	drm/amdgpu: Optimize the asic type fix code Use a new struct array to define the asic information which asic type needs to be fixed. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-03 12:18:32 -04:00
Tim Huang	36e7ff5c13	drm/amdgpu: fix GRBM read timeout when do mes_self_test Use a proper MEID to make sure the CP_HQD_* and CP_GFX_HQD_* registers can be touched when initialize the compute and gfx mqd in mes_self_test. Otherwise, we expect no response from CP and an GRBM eventual timeout. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-03 12:18:32 -04:00
Tao Zhou	18eae367cb	drm/amdgpu: check recovery status of xgmi hive in ras_reset_error_count Handle xgmi hive case. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-03 12:18:32 -04:00
Felix Kuehling	0e2e7c5b3d	drm/amdgpu: Attach eviction fence on alloc Instead of attaching the eviction fence when a KFD BO is first mapped, attach it when it is allocated or imported. This in preparation to allow KFD BOs to be mapped using the render node API. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-03 12:18:32 -04:00
Felix Kuehling	5a104cb97c	drm/amdkfd: Improve amdgpu_vm_handle_moved Let amdgpu_vm_handle_moved update all BO VA mappings of BOs reserved by the caller. This will be useful for handling extra BO VA mappings in KFD VMs that are managed through the render node API. v2: rebase against drm_exec changes (Alex) Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-03 12:18:32 -04:00
Alex Deucher	ba0fb4b48c	drm/amdgpu: don't put MQDs in VRAM on ARM \| ARM64 Issues were reported with commit `1cfb4d6121` ("drm/amdgpu: put MQDs in VRAM") on an ADLINK Ampere Altra Developer Platform (AVA developer platform). Various ARM systems seem to have problems related to PCIe and MMIO access. In this case, I'm not sure if this is specific to the ADLINK platform or ARM in general. Seems to be some coherency issue with VRAM. For now, just don't put MQDs in VRAM on ARM. Link: https://lists.freedesktop.org/archives/amd-gfx/2023-October/100453.html Fixes: `1cfb4d6121` ("drm/amdgpu: put MQDs in VRAM") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: alexey.klimov@linaro.org	2023-11-03 11:59:51 -04:00
Alex Deucher	3938eb956e	drm/amdgpu: add a retry for IP discovery init AMD dGPUs have integrated FW that runs as soon as the device gets power and initializes the board (determines the amount of memory, provides configuration details to the driver, etc.). For direct PCIe attached cards this happens as soon as power is applied and normally completes well before the OS has even started loading. However, with hotpluggable ports like USB4, the driver needs to wait for this to complete before initializing the device. This normally takes 60-100ms, but could take longer on some older boards periodically due to memory training. Retry for up to a second. In the non-hotplug case, there should be no change in behavior and this should complete on the first try. v2: adjust test criteria v3: adjust checks for the masks, only enable on removable devices v4: skip bif_fb_en check Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-03 11:59:51 -04:00
Perry Yuan	886b92f635	drm/amdgpu: ungate power gating when system suspend [Why] During suspend, if GFX DPM is enabled and GFXOFF feature is enabled the system may get hung. So, it is suggested to disable GFXOFF feature during suspend and enable it after resume. [How] Update the code to disable GFXOFF feature during suspend and enable it after resume. [ 311.396526] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 [ 311.396530] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features! [ 311.396531] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] ERROR suspend of IP block <smu> failed -62 Acked-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Perry Yuan <perry.yuan@amd.com> Signed-off-by: Kun Liu <kun.liu2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-03 11:59:51 -04:00
Alex Deucher	7b1c6263ea	drm/amdgpu: don't use pci_is_thunderbolt_attached() It's only valid on Intel systems with the Intel VSEC. Use dev_is_removable() instead. This should do the right thing regardless of the platform. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-03 11:59:44 -04:00
Alex Deucher	432e664e7c	drm/amdgpu: don't use ATRM for external devices The ATRM ACPI method is for fetching the dGPU vbios rom image on laptops and all-in-one systems. It should not be used for external add in cards. If the dGPU is thunderbolt connected, don't try ATRM. v2: pci_is_thunderbolt_attached only works for Intel. Use pdev->external_facing instead. v3: dev_is_removable() seems to be what we want Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-03 11:59:13 -04:00
Alex Deucher	b3c942bb6c	drm/amdgpu/gfx10,11: use memcpy_to/fromio for MQDs Since they were moved to VRAM, we need to use the IO variants of memcpy. Fixes: `1cfb4d6121` ("drm/amdgpu: put MQDs in VRAM") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-03 11:38:19 -04:00
Tao Zhou	f7aeee7346	drm/amdgpu: use mode-2 reset for RAS poison consumption Switch from mode-1 reset to mode-2 for poison consumption. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-03 11:38:13 -04:00
Lin.Cao	b77cc85bdb	drm/amdgpu doorbell range should be set when gpu recovery The GFX doorbell range should be set after FLR; otherwise the GFX doorbell range will overlap with MEC. v2: remove "amdgpu_sriov_vf" and "amdgpu_in_reset" check, and add grbm select for the case of 2 gfx rings. Signed-off-by: Lin.Cao <lincao12@amd.com> Acked-by: ZhenGuo Yin <zhenguo.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-03 11:38:04 -04:00
Linus Torvalds	27beb3ca34	pci-v6.7-changes Merge tag 'pci-v6.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Use acpi_evaluate_dsm_typed() instead of open-coding _DSM evaluation to learn device characteristics (Andy Shevchenko) - Tidy multi-function header checks using new PCI_HEADER_TYPE_MASK definition (Ilpo Järvinen) - Simplify config access error checking in various drivers (Ilpo Järvinen) - Use pcie_capability_clear_word() (not pcie_capability_clear_and_set_word()) when only clearing (Ilpo Järvinen) - Add pci_get_base_class() to simplify finding devices using base class only (ignoring subclass and programming interface) (Sui Jingfeng) - Add pci_is_vga(), which includes ancient PCI_CLASS_NOT_DEFINED_VGA devices from before the Class Code was added to PCI (Sui Jingfeng) - Use pci_is_vga() for vgaarb, sysfs "boot_vga", virtio, qxl to include ancient VGA devices (Sui Jingfeng) Resource management: - Make pci_assign_unassigned_resources() non-init because sparc uses it after init (Randy Dunlap) Driver binding: - Retain .remove() and .probe()
callbacks (previously __init) because sysfs may cause them to be called later (Uwe Kleine-König) - Prevent xHCI driver from claiming AMD VanGogh USB3 DRD device, so it can be claimed by dwc3 instead (Vicki Pfau) PCI device hotplug: - Add Ampere Altra Attention Indicator extension driver for acpiphp (D Scott Phillips) Power management: - Quirk VideoPropulsion Torrent QN16e with longer delay after reset (Lukas Wunner) - Prevent users from overriding drivers that say we shouldn't use D3cold (Lukas Wunner) - Avoid PME from D3hot/D3cold for AMD Rembrandt and Phoenix USB4 because wakeup interrupts from those states don't work if amd-pmc has put the platform in a hardware sleep state (Mario Limonciello) IOMMU: - Disable ATS for Intel IPU E2000 devices with invalidation message endianness erratum (Bartosz Pawlowski) Error handling: - Factor out interrupt enable/disable into helpers (Kai-Heng Feng) Peer-to-peer DMA: - Fix flexible-array usage in struct pci_p2pdma_pagemap in case we ever use pagemaps with multiple entries (Gustavo A. R. 
Silva) ASPM: - Revert a change that broke when drivers disabled L1 and users later enabled an L1.x substate via sysfs, and fix a similar issue when users disabled L1 via sysfs (Heiner Kallweit) Endpoint framework: - Fix double free in __pci_epc_create() (Dan Carpenter) - Use IS_ERR_OR_NULL() to simplify endpoint core (Ruan Jinjie) Cadence PCIe controller driver: - Drop unused "is_rc" member (Li Chen) Freescale Layerscape PCIe controller driver: - Enable 64-bit addressing in endpoint mode (Guanhua Gao) Intel VMD host bridge driver: - Fix multi-function header check (Ilpo Järvinen) Microsoft Hyper-V host bridge driver: - Annotate struct hv_dr_state with __counted_by (Kees Cook) NVIDIA Tegra194 PCIe controller driver: - Drop setting of LNKCAP_MLW (max link width) since dw_pcie_setup() already does this via dw_pcie_link_set_max_link_width() (Yoshihiro Shimoda) Qualcomm PCIe controller driver: - Use PCIE_SPEED2MBS_ENC() to simplify encoding of link speed (Manivannan Sadhasivam) - Add a .write_dbi2() callback so DBI2 register writes, e.g., for setting the BAR size, work correctly (Manivannan Sadhasivam) - Enable ASPM for platforms that use 1.9.0 ops, because the PCI core doesn't enable ASPM states that haven't been enabled by the firmware (Manivannan Sadhasivam) Renesas R-Car Gen4 PCIe controller driver: - Add DesignWare core support (set max link width, EDMA_UNROLL flag, .pre_init(), .deinit(), etc) for use by R-Car Gen4 driver (Yoshihiro Shimoda) - Add driver and DT schema for DesignWare-based Renesas R-Car Gen4 controller in both host and endpoint mode (Yoshihiro Shimoda) Xilinx NWL PCIe controller driver: - Update ECAM size to support 256 buses (Thippeswamy Havalige) - Stop setting bridge primary/secondary/subordinate bus numbers, since PCI core does this (Thippeswamy Havalige) Xilinx XDMA controller driver: - Add driver and DT schema for Zynq UltraScale+ MPSoCs devices with Xilinx XDMA Soft IP (Thippeswamy Havalige) Miscellaneous: - Use FIELD_GET()/FIELD_PREP() to 
simplify and reduce use of _SHIFT macros (Ilpo Järvinen, Bjorn Helgaas) - Remove logic_outb(), _outw(), outl() duplicate declarations (John Sanpe) - Replace unnecessary UTF-8 in Kconfig help text because menuconfig doesn't render it correctly (Liu Song)" * tag 'pci-v6.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (102 commits) PCI: qcom-ep: Add dedicated callback for writing to DBI2 registers PCI: Simplify pcie_capability_clear_and_set_word() to ..._clear_word() PCI: endpoint: Fix double free in __pci_epc_create() PCI: xilinx-xdma: Add Xilinx XDMA Root Port driver dt-bindings: PCI: xilinx-xdma: Add schemas for Xilinx XDMA PCIe Root Port Bridge PCI: xilinx-cpm: Move IRQ definitions to a common header PCI: xilinx-nwl: Modify ECAM size to enable support for 256 buses PCI: xilinx-nwl: Rename the NWL_ECAM_VALUE_DEFAULT macro dt-bindings: PCI: xilinx-nwl: Modify ECAM size in the DT example PCI: xilinx-nwl: Remove redundant code that sets Type 1 header fields PCI: hotplug: Add Ampere Altra Attention Indicator extension driver PCI/AER: Factor out interrupt toggling into helpers PCI: acpiphp: Allow built-in drivers for Attention Indicators PCI/portdrv: Use FIELD_GET() PCI/VC: Use FIELD_GET() PCI/PTM: Use FIELD_GET() PCI/PME: Use FIELD_GET() PCI/ATS: Use FIELD_GET() PCI/ATS: Show PASID Capability register width in bitmasks PCI/ASPM: Fix L1 substate handling in aspm_attr_store_common() ...	2023-11-02 14:05:18 -10:00
Matthew Brost	a6149f0393	drm/sched: Convert drm scheduler to use a work queue rather than kthread In Xe, the new Intel GPU driver, a choice has been made to have a 1 to 1 mapping between a drm_gpu_scheduler and drm_sched_entity. At first this seems a bit odd but let us explain the reasoning below. 1. In Xe the submission order from multiple drm_sched_entity is not guaranteed to be the same as the completion order, even if targeting the same hardware engine. This is because in Xe we have a firmware scheduler, the GuC, which is allowed to reorder, timeslice, and preempt submissions. If using a shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls apart as the TDR expects submission order == completion order. Using a dedicated drm_gpu_scheduler per drm_sched_entity solves this problem. 2. In Xe submissions are done via programming a ring buffer (circular buffer). A drm_gpu_scheduler provides a limit on the number of jobs; if the limit on the number of jobs is set to RING_SIZE / MAX_SIZE_PER_JOB, we get flow control on the ring for free. A problem with this design is that currently a drm_gpu_scheduler uses a kthread for submission / job cleanup. This doesn't scale if a large number of drm_gpu_scheduler are used. To work around the scaling issue, use a worker rather than kthread for submission / job cleanup.
v2: - (Rob Clark) Fix msm build - Pass in run work queue v3: - (Boris) don't have loop in worker v4: - (Tvrtko) break out submit ready, stop, start helpers into own patch v5: - (Boris) default to ordered work queue v6: - (Luben / checkpatch) fix alignment in msm_ringbuffer.c - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue - (Luben) Update comment for drm_sched_wqueue_enqueue - (Luben) Positive check for submit_wq in drm_sched_init - (Luben) s/alloc_submit_wq/own_submit_wq v7: - (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue v8: - (Luben) Adjust var names / comments Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>	2023-11-01 17:29:21 -04:00
Matthew Brost	35963cf2cd	drm/sched: Add drm_sched_wqueue_* helpers Add scheduler wqueue ready, stop, and start helpers to hide the implementation details of the scheduler from the drivers. v2: - s/sched_wqueue/sched_wqueue (Luben) - Remove the extra white line after the return-statement (Luben) - update drm_sched_wqueue_ready comment (Luben) Cc: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20231031032439.1558703-2-matthew.brost@intel.com Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>	2023-11-01 17:29:20 -04:00
Linus Torvalds	7d461b291e	drm for 6.7-rc1 kernel: - add initial vmemdup-user-array core: - fix platform remove() to return void - drm_file owner updated to reflect owner - move size calcs to drm buddy allocator - let GPUVM build as a module - allow variable number of run-queues in scheduler edid: - handle bad h/v sync_end in EDIDs panfrost: - add Boris as maintainer fbdev: - use fb_ops helpers more - only allow logo use from fbcon - rename fb_pgproto to pgprot_framebuffer - add HPD state to drm_connector_oob_hotplug_event - convert to fbdev i/o mem helpers i915: - Enable meteorlake by default - Early Xe2 LPD/Lunarlake display enablement - Rework subplatforms into IP version checks - GuC based TLB invalidation for Meteorlake - Display rework for future Xe driver integration - LNL FBC features - LNL display feature capability reads - update recommended fw versions for DG2+ - drop fastboot module parameter - added deviceid for Arrowlake-S - drop preproduction workarounds - don't disable preemption for resets - cleanup inlines in headers - PXP firmware loading fix - Fix sg list lengths - DSC PPS state readout/verification - Add more RPL P/U PCI IDs - Add new DG2-G12 stepping - DP enhanced framing support to state checker - Improve shared link bandwidth management - stop using GEM macros in display code - refactor related code into display code - locally enable W=1 warnings - remove PSR watchdog timers on LNL amdgpu: - RAS/FRU EEPROM updatse - IP discovery updatses - GC 11.5 support - DCN 3.5 support - VPE 6.1 support - NBIO 7.11 support - DML2 support - lots of IP updates - use flexible arrays for bo list handling - W=1 fixes - Enable seamless boot in more cases - Enable context type property for HDMI - Rework GPUVM TLB flushing - VCN IB start/size alignment fixes amdkfd: - GC 10/11 fixes - GC 11.5 support - use partial migration in GPU faults radeon: - W=1 Fixes - fix some possible buffer overflow/NULL derefs nouveau: - update uapi for NO_PREFETCH - scheduler/fence 
fixes - rework suspend/resume for GSP-RM - rework display in preparation for GSP-RM habanalabs: - uapi: expose tsc clock - uapi: block access to eventfd through control device - uapi: force dma-buf export to PAGE_SIZE alignments - complete move to accel subsystem - move firmware interface include files - perform hard reset on PCIe AXI drain event - optimise user interrupt handling msm: - DP: use existing helpers for DPCD - DPU: interrupts reworked - gpu: a7xx (a730/a740) support - decouple msm_drv from kms for headless devices mediatek: - MT8188 dsi/dp/edp support - DDP GAMMA - 12 bit LUT support - connector dynamic selection capability rockchip: - rv1126 mipi-dsi/vop support - add planar formats ast: - rename constants panels: - Mitsubishi AA084XE01 - JDI LPM102A188A - LTK050H3148W-CTA6 ivpu: - power management fixes qaic: - add detach slice bo api komeda: - add NV12 writeback tegra: - support NVSYNC/NHSYNC - host1x suspend fixes ili9882t: - separate into own driver Merge tag 'drm-next-2023-10-31-1' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "Highlights: - AMD adds some more upcoming HW platforms - Intel made
Meteorlake stable and started adding Lunarlake - nouveau has a bunch of display rework in prepartion for the NVIDIA GSP firmware support - msm adds a7xx support - habanalabs has finished migration to accel subsystem Detail summary: kernel: - add initial vmemdup-user-array core: - fix platform remove() to return void - drm_file owner updated to reflect owner - move size calcs to drm buddy allocator - let GPUVM build as a module - allow variable number of run-queues in scheduler edid: - handle bad h/v sync_end in EDIDs panfrost: - add Boris as maintainer fbdev: - use fb_ops helpers more - only allow logo use from fbcon - rename fb_pgproto to pgprot_framebuffer - add HPD state to drm_connector_oob_hotplug_event - convert to fbdev i/o mem helpers i915: - Enable meteorlake by default - Early Xe2 LPD/Lunarlake display enablement - Rework subplatforms into IP version checks - GuC based TLB invalidation for Meteorlake - Display rework for future Xe driver integration - LNL FBC features - LNL display feature capability reads - update recommended fw versions for DG2+ - drop fastboot module parameter - added deviceid for Arrowlake-S - drop preproduction workarounds - don't disable preemption for resets - cleanup inlines in headers - PXP firmware loading fix - Fix sg list lengths - DSC PPS state readout/verification - Add more RPL P/U PCI IDs - Add new DG2-G12 stepping - DP enhanced framing support to state checker - Improve shared link bandwidth management - stop using GEM macros in display code - refactor related code into display code - locally enable W=1 warnings - remove PSR watchdog timers on LNL amdgpu: - RAS/FRU EEPROM updatse - IP discovery updatses - GC 11.5 support - DCN 3.5 support - VPE 6.1 support - NBIO 7.11 support - DML2 support - lots of IP updates - use flexible arrays for bo list handling - W=1 fixes - Enable seamless boot in more cases - Enable context type property for HDMI - Rework GPUVM TLB flushing - VCN IB start/size alignment fixes amdkfd: - GC 10/11 
fixes - GC 11.5 support - use partial migration in GPU faults radeon: - W=1 Fixes - fix some possible buffer overflow/NULL derefs nouveau: - update uapi for NO_PREFETCH - scheduler/fence fixes - rework suspend/resume for GSP-RM - rework display in preparation for GSP-RM habanalabs: - uapi: expose tsc clock - uapi: block access to eventfd through control device - uapi: force dma-buf export to PAGE_SIZE alignments - complete move to accel subsystem - move firmware interface include files - perform hard reset on PCIe AXI drain event - optimise user interrupt handling msm: - DP: use existing helpers for DPCD - DPU: interrupts reworked - gpu: a7xx (a730/a740) support - decouple msm_drv from kms for headless devices mediatek: - MT8188 dsi/dp/edp support - DDP GAMMA - 12 bit LUT support - connector dynamic selection capability rockchip: - rv1126 mipi-dsi/vop support - add planar formats ast: - rename constants panels: - Mitsubishi AA084XE01 - JDI LPM102A188A - LTK050H3148W-CTA6 ivpu: - power management fixes qaic: - add detach slice bo api komeda: - add NV12 writeback tegra: - support NVSYNC/NHSYNC - host1x suspend fixes ili9882t: - separate into own driver" * tag 'drm-next-2023-10-31-1' of git://anongit.freedesktop.org/drm/drm: (1803 commits) drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo drm/amdgpu: Remove duplicate fdinfo fields drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table drm/amdgpu: Identify data parity error corrected in replay mode drm/amdgpu: Fix typo in IP discovery parsing drm/amd/display: fix S/G display enablement drm/amdxcp: fix amdxcp unloads incompletely drm/amd/amdgpu: fix the GPU power print error in pm info drm/amdgpu: Use pcie domain of xcc acpi objects drm/amd: check num of link levels when update pcie param drm/amdgpu: Add a read to GFX v9.4.3 ring test drm/amd/pm: call 
smu_cmn_get_smc_version in is_mode1_reset_supported. drm/amdgpu: get RAS poison status from DF v4_6_2 drm/amdgpu: Use discovery table's subrevision drm/amd/display: 3.2.256 drm/amd/display: add interface to query SubVP status drm/amd/display: Read before writing Backlight Mode Set Register drm/amd/display: Disable SYMCLK32_SE RCO on DCN314 ...	2023-11-01 06:28:35 -10:00
Yang Wang	2bfb0ca3dd	drm/amdgpu: remove unused macro HW_REV remove unused macro HW_REV Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-31 17:13:59 -04:00
Arunpravin Paneer Selvam	9ae587f850	drm/amdgpu: Fix the vram base start address If the size returned by drm buddy allocator is higher than the required size, we take the higher size to calculate the buffer start address. This is required if we couldn't trim the buffer to the requested size. This will fix the display corruption issue on APU's which has limited VRAM size. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2859 Fixes: `0a1844bf0b` ("drm/buddy: Improve contiguous memory allocation") Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-31 17:10:13 -04:00
Tao Zhou	d539b0ad7c	drm/amdgpu: set XGMI IP version manually for v6_4 The version can't be queried from discovery table. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-31 17:09:53 -04:00
Tong Liu01	853eebe6ec	drm/amdgpu: add unmap latency when gfx11 set kiq resources [why] If the driver does not set an unmap latency for KIQ, the default KIQ unmap latency is zero. When unmapping a queue, KIQ then returns almost immediately after receiving the unmap command, so the queue status may be saved to the MQD incorrectly or lost. [how] Set the unmap latency when setting KIQ resources. The unmap latency is set to 1 second, matching the Windows driver. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-31 16:40:16 -04:00
Kenneth Feng	5f38ac54e6	drm/amd/pm: fix the high voltage and temperature issue fix the high voltage and temperature issue after the driver is unloaded on smu 13.0.0, smu 13.0.7 and smu 13.0.10 v2 - fix the code format and make sure it is used on the unload case only. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-31 16:40:16 -04:00
Yifan Zhang	a17f574ab4	drm/amdgpu: remove amdgpu_mes_self_test in gpu recover gpu tlb flush is skipped if reset sem is held, it makes mes_self_test fail since it involves add_hw_queue/remove_hw_queue which needs tlb flush functional. Remove mes_self_test in gpu recover sequence. This patch is to fix the recover failure in gfx11. [ 1831.768292] [drm] ring sdma_32769.3.3 was added [ 1831.768313] [drm] ring gfx_32769.1.1 ib test pass [ 1831.768337] [drm] ring compute_32769.2.2 ib test pass [ 1831.768399] amdgpu 0000:c2:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:8 pasid:32769, for process pid 0 thread pid 0) [ 1831.768434] amdgpu 0000:c2:00.0: amdgpu: in page starting at address 0x0000aec200000000 from client 10 [ 1831.768456] amdgpu 0000:c2:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00800A30 [ 1831.768473] amdgpu 0000:c2:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 1831.768489] amdgpu 0000:c2:00.0: amdgpu: MORE_FAULTS: 0x0 [ 1831.768501] amdgpu 0000:c2:00.0: amdgpu: WALKER_ERROR: 0x0 [ 1831.768513] amdgpu 0000:c2:00.0: amdgpu: PERMISSION_FAULTS: 0x3 [ 1831.768521] amdgpu 0000:c2:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 1831.768529] amdgpu 0000:c2:00.0: amdgpu: RW: 0x0 [ 1831.931229] amdgpu 0000:c2:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] ERROR ring sdma_32769.3.3 test failed (-110) [ 1832.062917] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] ERROR MES failed to response msg=3 [ 1832.063107] [drm:amdgpu_mes_remove_hw_queue [amdgpu]] ERROR failed to remove hardware queue, queue id = 3 Fixes: `e2e3788850` ("drm/amdgpu: rework lock handling for flush_tlb v2") Reported-by: Li Ma <li.ma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-31 16:40:15 -04:00
Candice Li	e020d01575	drm/amdgpu: Drop deferred error in uncorrectable error check Drop checking deferred error which can be handled by poison consumption. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-31 16:40:15 -04:00
Tao Zhou	d1d4c0b7b6	drm/amdgpu: check RAS supported first in ras_reset_error_count Not all platforms support RAS. Fixes: `73582be11a` ("drm/amdgpu: bypass RAS error reset in some conditions") Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-31 16:40:15 -04:00
Dave Airlie	631808095a	amd-drm-next-6.7-2023-10-27: amdgpu: - RAS fixes - Seamless boot fixes - NBIO 7.7 fix - SMU 14.0 fixes - GC 11.5 fixes - DML2 fixes - ASPM fixes - VPE fixes - Misc code cleanups - SRIOV fixes - Add some missing copyright notices - DCN 3.5 fixes - FAMS fixes - Backlight fix - S/G display fix - fdinfo cleanups - EXT_COHERENT fixes for APU and NUMA systems amdkfd: - Misc fixes - Misc code cleanups - SVM fixes Merge tag 'amd-drm-next-6.7-2023-10-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231027200343.57132-1-alexander.deucher@amd.com	2023-10-31 12:37:19 +10:00
Dave Airlie	915b6d034b	Merge tag 'drm-misc-next-2023-10-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.7-rc1: drm-misc-next-2023-10-19 + following: UAPI Changes: Cross-subsystem Changes: - Convert fbdev drivers to use fbdev i/o mem helpers. Core Changes: - Use cross-references for macros in docs. - Make drm_client_buffer_addb use addfb2. - Add NV20 and NV30 YUV formats. - Documentation updates for create_dumb ioctl. - CI fixes. - Allow variable number of run-queues in scheduler. Driver Changes: - Rename drm/ast constants. - Make ili9882t its own driver. - Assorted fixes in ivpu, vc4, bridge/synopsis, amdgpu. - Add planar formats to rockchip. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/3d92fae8-9b1b-4165-9ca8-5fda11ee146b@linux.intel.com	2023-10-31 10:47:50 +10:00
Umio Yasuno	dd3dd9829b	drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo Remove unused variables from amdgpu_show_fdinfo Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Umio Yasuno <coelacanth_dream@protonmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-27 14:23:01 -04:00
Rob Clark	e8e696c307	drm/amdgpu: Remove duplicate fdinfo fields Some of the fields that are handled by drm_show_fdinfo() crept back in when rebasing the patch. Remove them again. Fixes: `376c25f8ca` ("drm/amdgpu: Switch to fdinfo helper") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: <alexander.deucher@amd.com> Co-developed-by: Umio Yasuno <coelacanth_dream@protonmail.com> Signed-off-by: Umio Yasuno <coelacanth_dream@protonmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-27 14:23:01 -04:00
Kenneth Feng	3ea8dd3758	drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded avoid to disable gfxhub interrupt when driver is unloaded on gmc 11 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-27 14:15:39 -04:00
David Francis	142262a1c0	drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems On gfx943 APU, EXT_COHERENT should give MTYPE_CC for local and MTYPE_UC for nonlocal memory. On NUMA systems, local memory gets the local mtype, set by an override callback. If EXT_COHERENT is set, memory will be set as MTYPE_UC by default, with local memory MTYPE_CC. Add an option in the override function for this case, and add a check to ensure it is not used on UNCACHED memory. V2: Combined APU and NUMA code into one patch V3: Fixed a potential nullptr in amdgpu_vm_bo_update Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-27 14:15:16 -04:00
Candice Li	a395f7ffce	drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table Retrieve correctable error count from ce_count_lo_chip instead of mca_umc_status. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-27 14:15:10 -04:00
Candice Li	d59fcfb084	drm/amdgpu: Identify data parity error corrected in replay mode Use ErrorCodeExt field to identify data parity error in replay mode. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-27 14:15:03 -04:00
Mukul Joshi	f7a17b2b36	drm/amdgpu: Fix typo in IP discovery parsing Fix a typo in parsing of the GC info table header when reading the IP discovery table. Fixes: `0e64c9aad0` ("drm/amdgpu: add type conversion for gc info") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-27 14:13:27 -04:00
Dave Airlie	44117828ed	Merge tag 'amd-drm-fixes-6.6-2023-10-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.6-2023-10-25: amdgpu: - Extend VI ASPM quirks to more platforms Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231026035452.14921-1-alexander.deucher@amd.com	2023-10-27 12:17:26 +10:00
Dave Airlie	6366ffa6ed	Merge tag 'drm-misc-fixes-2023-10-26' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: amdgpu: - ignore duplicated BOs in CS parser - remove redundant call to amdgpu_ctx_priority_is_valid() amdkfd: - reserve fence slot while locking BO dp_mst: - Fix NULL deref in get_mst_branch_device_by_guid_helper() logicvc: - Kconfig: Select REGMAP and REGMAP_MMIO ivpu: - Fix missing VPUIP interrupts Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20231026110132.GA10591@linux-uq9g.fritz.box	2023-10-27 11:51:35 +10:00
Lijo Lazar	d055714a21	drm/amdgpu: Use pcie domain of xcc acpi objects PCI domain/segment information of xccs is available through ACPI DSM methods. Consider that also while looking for devices. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 19:04:39 -04:00
Lijo Lazar	3f69d5860f	drm/amdgpu: Add a read to GFX v9.4.3 ring test Issue a read to confirm the register write before ringing doorbell. With multiple XCCs there is chance for race condition. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 19:04:13 -04:00
Tao Zhou	2cea7bb911	drm/amdgpu: get RAS poison status from DF v4_6_2 Add DF block and RAS poison mode query for DF v4_6_2. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 19:02:52 -04:00
Lijo Lazar	dd2687f5d9	drm/amdgpu: Use discovery table's subrevision Use subrevision of IP version in discovery table to identify SOC revision id for NBIO v7.9 SOCs. Only newer bootloaders update subrevision field. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 19:02:44 -04:00
Mario Limonciello	2757a848cb	drm/amd: Explicitly disable ASPM when dynamic switching disabled Currently there are separate but related checks: * amdgpu_device_should_use_aspm() * amdgpu_device_aspm_support_quirk() * amdgpu_device_pcie_dynamic_switching_supported() Simplify into checking whether DPM was enabled or not in the auto case. This works because amdgpu_device_pcie_dynamic_switching_supported() populates that value. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:23 -04:00
Mario Limonciello	1a6513de49	drm/amd: Move AMD_IS_APU check for ASPM into top level function There is no need for every ASIC driver to perform the same check. Move the duplicated code into amdgpu_device_should_use_aspm(). Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:23 -04:00
Mario Limonciello	fbf1035b03	drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not supported Rather than individual ASICs checking for the quirk, set the quirk at the driver level. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:23 -04:00
Lin.Cao	9ee819285c	drm/amdgpu remove restriction of sriov max_pfn on Vega10 Remove the restriction on sriov max_pfn so that TBA and TMA can move to high 47-bit addresses. Regression test: change the range alloc flag of libdrm to AMDGPU_VA_RANGE_HIGH; no FLR occurs when running amdgpu_test of drm. Signed-off-by: Lin.Cao <lincao12@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:22 -04:00
Qu Huang	5104fdf50d	drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL In certain types of chips, such as VEGA20, reading the amdgpu_regs_smc file could result in an abnormal null pointer access when the smc_rreg pointer is NULL. Below are the steps to reproduce this issue and the corresponding exception log: 1. Navigate to the directory: /sys/kernel/debug/dri/0 2. Execute command: cat amdgpu_regs_smc 3. Exception Log: [4005007.702554] BUG: kernel NULL pointer dereference, address: 0000000000000000 [4005007.702562] #PF: supervisor instruction fetch in kernel mode [4005007.702567] #PF: error_code(0x0010) - not-present page [4005007.702570] PGD 0 P4D 0 [4005007.702576] Oops: 0010 [#1] SMP NOPTI [4005007.702581] CPU: 4 PID: 62563 Comm: cat Tainted: G OE 5.15.0-43-generic #46-Ubuntu [4005007.702590] RIP: 0010:0x0 [4005007.702598] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.702600] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.702605] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.702609] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.702612] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.702615] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.702618] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.702622] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.702626] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.702629] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0 [4005007.702633] Call Trace: [4005007.702636] <TASK> [4005007.702640] amdgpu_debugfs_regs_smc_read+0xb0/0x120 [amdgpu] [4005007.703002] full_proxy_read+0x5c/0x80 [4005007.703011] vfs_read+0x9f/0x1a0 [4005007.703019] ksys_read+0x67/0xe0 [4005007.703023] __x64_sys_read+0x19/0x20 [4005007.703028] do_syscall_64+0x5c/0xc0 [4005007.703034] ? do_user_addr_fault+0x1e3/0x670 [4005007.703040] ? exit_to_user_mode_prepare+0x37/0xb0 [4005007.703047] ? irqentry_exit_to_user_mode+0x9/0x20 [4005007.703052] ? irqentry_exit+0x19/0x30 [4005007.703057] ? exc_page_fault+0x89/0x160 [4005007.703062] ? asm_exc_page_fault+0x8/0x30 [4005007.703068] entry_SYSCALL_64_after_hwframe+0x44/0xae [4005007.703075] RIP: 0033:0x7f5e07672992 [4005007.703079] Code: c0 e9 b2 fe ff ff 50 48 8d 3d fa b2 0c 00 e8 c5 1d 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24 [4005007.703083] RSP: 002b:00007ffe03097898 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [4005007.703088] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5e07672992 [4005007.703091] RDX: 0000000000020000 RSI: 00007f5e06753000 RDI: 0000000000000003 [4005007.703094] RBP: 00007f5e06753000 R08: 00007f5e06752010 R09: 00007f5e06752010 [4005007.703096] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000 [4005007.703099] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000 [4005007.703105] </TASK> [4005007.703107] Modules linked in: nf_tables libcrc32c nfnetlink algif_hash af_alg binfmt_misc nls_iso8859_1 ipmi_ssif ast intel_rapl_msr intel_rapl_common drm_vram_helper drm_ttm_helper amd64_edac ttm edac_mce_amd kvm_amd ccp mac_hid k10temp kvm acpi_ipmi ipmi_si rapl sch_fq_codel ipmi_devintf ipmi_msghandler msr parport_pc ppdev lp parport mtd pstore_blk efi_pstore ramoops pstore_zone reed_solomon ip_tables x_tables autofs4 ib_uverbs ib_core amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) iommu_v2 amd_sched(OE) amdkcl(OE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm igb ahci xhci_pci libahci i2c_piix4 i2c_algo_bit xhci_pci_renesas dca [4005007.703184] CR2: 0000000000000000 [4005007.703188] ---[ end trace ac65a538d240da39 ]--- [4005007.800865] RIP: 0010:0x0 [4005007.800871] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.800874] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.800878] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.800881] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.800883] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.800886] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.800888] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.800891] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.800895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.800898] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0 Signed-off-by: Qu Huang <qu.huang@linux.dev> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:22 -04:00
Tao Zhou	73582be11a	drm/amdgpu: bypass RAS error reset in some conditions PMFW is responsible for RAS error reset in some conditions, driver can skip the operation. v2: add check for ras->in_recovery, it's set earlier than amdgpu_in_reset. v3: fix error in gpu reset check. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:22 -04:00
Tao Zhou	f3a3bbf156	drm/amdgpu: enable RAS poison mode for APU Enable it by default on APU platform. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:22 -04:00
Lang Yu	fc4981b69c	drm/amdgpu/vpe: correct queue stop programming Otherwise IB test would fail during GPU reset. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:22 -04:00
Mario Limonciello	e5f52a84bf	drm/amd: Disable ASPM for VI w/ all Intel systems Originally we were quirking ASPM disabled specifically for VI when used with Alder Lake, but it appears to have problems with Rocket Lake as well. Like we've done in the case of dpm for newer platforms, disable ASPM for all Intel systems. Cc: stable@vger.kernel.org # 5.15+ Fixes: `0064b0ce85` ("drm/amd/pm: enable ASPM by default") Reported-and-tested-by: Paolo Gentili <paolo.gentili@canonical.com> Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:22 -04:00
Lijo Lazar	8eece69ace	drm/amdgpu: Add API to get full IP version Fetch the full version of IP including variant and subrevision. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:21 -04:00
Jiadong Zhu	037fb9c600	drm/amdgpu: add tmz support for GC IP v11.5.0 Add tmz support for GC 11.5.0. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:21 -04:00
Li Ma	493c75bbe3	drm/amdgpu: modify if condition in nbio_v7_7.c remove unnecessary "enable" in if condition. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:21 -04:00
Yang Wang	ec3e0a9167	drm/amdgpu: refine ras error kernel log print refine ras error kernel log to avoid user-ridden ambiguity. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:21 -04:00
Yang Wang	53d4d77927	drm/amdgpu: fix find ras error node error The original function might return the wrong node. Fixes: `5b1270beb3` ("drm/amdgpu: add ras_err_info to identify RAS error source") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:21 -04:00
Alex Deucher	b70438004a	drm/amdgpu: move buffer funcs setting up a level Rather than doing this in the IP code for the SDMA paging engine, move it up to the core device level init level. This should fix the scheduler init ordering. v2: drop extra parens v3: drop SDMA helpers v4: Added a Fixes tag because amdgpu dereferences an uninitialized scheduler without this patch, and this patch fixes this. (Luben) Tested-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20231025171928.3318505-1-alexander.deucher@amd.com Acked-by: Christian König <christian.koenig@amd.com> Fixes: `56e449603f` ("drm/sched: Convert the GPU scheduler to variable number of run-queues") Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>	2023-10-26 16:04:24 -04:00
Luben Tuikov	56e449603f	drm/sched: Convert the GPU scheduler to variable number of run-queues The GPU scheduler has now a variable number of run-queues, which are set up at drm_sched_init() time. This way, each driver announces how many run-queues it requires (supports) per each GPU scheduler it creates. Note, that run-queues correspond to scheduler "priorities", thus if the number of run-queues is set to 1 at drm_sched_init(), then that scheduler supports a single run-queue, i.e. single "priority". If a driver further sets a single entity per run-queue, then this creates a 1-to-1 correspondence between a scheduler and a scheduled entity. Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Emma Anholt <emma@anholt.net> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com	2023-10-26 12:03:47 -04:00
Mario Limonciello	64ffd2f1d0	drm/amd: Disable ASPM for VI w/ all Intel systems Originally we were quirking ASPM disabled specifically for VI when used with Alder Lake, but it appears to have problems with Rocket Lake as well. Like we've done in the case of dpm for newer platforms, disable ASPM for all Intel systems. Cc: stable@vger.kernel.org # 5.15+ Fixes: `0064b0ce85` ("drm/amd/pm: enable ASPM by default") Reported-and-tested-by: Paolo Gentili <paolo.gentili@canonical.com> Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-25 09:53:17 -04:00
Dave Airlie	0ecf4aa32b	Merge tag 'amd-drm-next-6.7-2023-10-20' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.7-2023-10-20: amdgpu: - SMU 13 updates - UMSCH updates - DC MPO fixes - RAS updates - MES 11 fixes - Fix possible memory leaks in error paths - GC 11.5 fixes - Kernel doc updates - PSP updates - APU IMU fixes - Misc code cleanups - SMU 11 fixes - OD fix - Frame size warning fixes - SR-IOV fixes - NBIO 7.11 updates - NBIO 7.7 updates - XGMI fixes - devcoredump updates amdkfd: - Misc code cleanups - SVM fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231020195043.4937-1-alexander.deucher@amd.com	2023-10-25 10:54:22 +10:00
Christian König	4984fc578a	drm/amdkfd: reserve a fence slot while locking the BO Looks like the KFD still needs this. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `8abc1eb298` ("drm/amdkfd: switch over to using drm_exec v3") Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231020123306.43978-1-christian.koenig@amd.com	2023-10-23 14:48:47 +02:00
Dave Airlie	7cd62eab9b	BackMerge tag 'v6.6-rc7' into drm-next This is needed to add the msm pr which is based on a higher base. Signed-off-by: Dave Airlie <airlied@redhat.com>	2023-10-23 18:20:06 +10:00
Luben Tuikov	d3df66fd98	drm/amdgpu: Remove redundant call to priority_is_valid() Remove a redundant call to amdgpu_ctx_priority_is_valid() from amdgpu_ctx_priority_permit(), which is called from amdgpu_ctx_init() which is called from amdgpu_ctx_alloc() which is called from amdgpu_ctx_ioctl(), where we've called amdgpu_ctx_priority_is_valid() already first thing in the function. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20231018010359.30393-1-luben.tuikov@amd.com	2023-10-21 20:27:15 -04:00
André Almeida	de009982c6	drm/amdgpu: Create version number for coredumps Even if there's nothing currently parsing amdgpu's coredump files, if we eventually have such tools they will be glad to find a version field to properly read the file. Create a version number to be displayed on top of coredump file, to be incremented when the file format or content get changed. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:29 -04:00
André Almeida	69619868d3	drm/amdgpu: Move coredump code to amdgpu_reset file Given that we use coredump just for device resets, move its functions and structs to a more semantic file, the amdgpu_reset.{c, h}. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:29 -04:00
André Almeida	2d6a2a28cd	drm/amdgpu: Encapsulate all device reset info To better organize struct amdgpu_device, keep all reset information related fields together in a separated struct. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Shiwu Zhang	723fac64d0	drm/amdgpu: support the port num info based on the capability flag XGMI TA will set the capability flag to indicate whether the port_num info is supported or not. KGD checks the flag and accordingly picks up the right buffer format and send the right command to TA to retrieve the info. v2: simplify the code by reusing the same statement (lijo) Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Shiwu Zhang	e8a5ded36b	drm/amdgpu: prepare the output buffer for GET_PEER_LINKS command Per the xgmi ta implementation, KGD needs to fill in node_ids in concern into the shared command output buffer rather than the command input buffer. Input buffer is not used for GET_PEER_LINKS command execution. In this way, xgmi ta can reuse the node info in the output buffer just filled in and populate the same buffer with link info directly. Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Tao Zhou	d9443ac4f9	drm/amdgpu: drop status query/reset for GCEA 9.4.3 and MMEA 1.8 PMFW will be responsible for them. v2: remove query interfaces. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Shiwu Zhang	626121fce4	drm/amdgpu: update the xgmi ta interface header Update the header file to the v20.00.00.13 v1: rename TA_COMMAND_XGMI__GET_GET_TOPOLOGY_INFO to TA_COMMAND_XGMI__GET_TOPOLOGY_INFO And also rename struct ta_xgmi_cmd_get_peer_link_info_output to ta_xgmi_cmd_get_peer_link_info accordingly v2: add structs to support xgmi GET_EXTEND_PEER_LINK command Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Tao Zhou	8096df7664	drm/amdgpu: add set/get mca debug mode operations Record the debug mode status in RAS. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Tao Zhou	21226f02d7	drm/amdgpu: replace reset_error_count with amdgpu_ras_reset_error_count Simplify the code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Li Ma	9d7a965e22	drm/amdgpu: add clockgating support for NBIO v7.7.1 add clockgating support for NBIO ip 7.7.1 Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Li Ma	fa9dd7a285	drm/amdgpu: fix missing stuff in NBIO v7.11 add get_clockgating_state, update_medium_grain_light_sleep and update_medium_grain_clock_gating in nbio_v7_11_funcs v1: add missing funcs in nbio_v7_11.c v2: modify the if condition and add support for nbio v7.11 clockgating. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Stanley.Yang	66d64e4e03	drm/amdgpu: Enable RAS feature by default for APU Enable RAS feature by default for aqua vanjaram on apu platform. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Yang Wang	49c260bef3	drm/amdgpu: fix typo for amdgpu ras error data print typo fix. Fixes: `5b1270beb3` ("drm/amdgpu: add ras_err_info to identify RAS error source") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Bokun Zhang	017634a68d	drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P4 - In VCN 4 SRIOV code path, add code to enable RB decouple feature Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Bokun Zhang	eb9d6256b9	drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P3 - Update VCN header for RB decouple feature - Add metadata struct, metadata will be placed after each RB Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Bokun Zhang	fc3136730b	drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P2 - Add function to check if RB decouple is enabled under SRIOV Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Bokun Zhang	97b2821643	drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P1 - Update SRIOV header with RB decouple flag Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Stanley.Yang	8a65661114	drm/amdgpu: Fix delete nodes that have been released Fix deletion of nodes that have already been freed. Fixes: `5b1270beb3` ("drm/amdgpu: add ras_err_info to identify RAS error source") Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Hawking Zhang	f2176d7063	drm/amdgpu: Add UVD_VCPU_INT_EN2 to dpg sram Add RAS specific programming to dpg sram. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Hawking Zhang	9248462d7e	drm/amdgpu: Enable software RAS in vcn v4_0_3 Set VCN/JPEG RAS masks to enable software RAS for VCN and JPEG. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:26 -04:00
Tao Zhou	472c5fb297	drm/amdgpu: define ras_reset_error_count function Make the code architecture more simple. v2: reuse ras_reset_error_count in ras_reset_error_status. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:26 -04:00
Candice Li	afcf949cf3	drm/amdgpu: Log UE corrected by replay as correctable error Support replay mode where UE could be converted to CE. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:26 -04:00
Dave Airlie	d43c76c820	Merge tag 'drm-misc-fixes-2023-10-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: amdgpu: - Disable AMD_CTX_PRIORITY_UNSET bridge: - ti-sn65dsi86: Fix device lifetime edid: - Add quirk for BenQ GW2765 ivpu: - Extend address range for MMU mmap nouveau: - DP-connector fixes - Documentation fixes panel: - Move AUX B116XW03 into panel-simple scheduler: - Eliminate DRM_SCHED_PRIORITY_UNSET ttm: - Fix possible NULL-ptr deref in cleanup Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20231019114605.GA22540@linux-uq9g	2023-10-20 14:07:58 +10:00
Felix Kuehling	316baf09d3	drm/amdgpu: Reserve fences for VM update In amdgpu_dma_buf_move_notify reserve fences for the page table updates in amdgpu_vm_clear_freed and amdgpu_vm_handle_moved. This fixes a BUG_ON in dma_resv_add_fence when using SDMA for page table updates. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:56:57 -04:00
Felix Kuehling	51b79f3381	drm/amdgpu: Fix possible null pointer dereference abo->tbo.resource may be NULL in amdgpu_vm_bo_update. Fixes: `1802537820` ("drm/ttm: stop allocating dummy resources during BO creation") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:56:50 -04:00
Felix Kuehling	207430b76a	drm/amdgpu: Reserve fences for VM update In amdgpu_dma_buf_move_notify reserve fences for the page table updates in amdgpu_vm_clear_freed and amdgpu_vm_handle_moved. This fixes a BUG_ON in dma_resv_add_fence when using SDMA for page table updates. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:52 -04:00
Felix Kuehling	e6f8588733	drm/amdgpu: Fix possible null pointer dereference abo->tbo.resource may be NULL in amdgpu_vm_bo_update. Fixes: `1802537820` ("drm/ttm: stop allocating dummy resources during BO creation") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:52 -04:00
Stanley.Yang	b1338a8e71	drm/amdgpu: Workaround to skip kiq ring test during ras gpu recovery This is a workaround: the kiq ring test fails in the suspend stage during ras recovery. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:52 -04:00
Mario Limonciello	e56690bb37	drm/amd: Read IMU FW version from scratch register during hw_init If the IMU version wasn't discovered from the header, such as when the firmware was directly loaded by PSP then there is no firmware version to show to userspace from sysfs or IOCTL. The IMU F/W stores the version in the first scratch register though, so fetch it in these cases to let the driver export. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Mario Limonciello	4916615fe9	drm/amd: Don't parse IMU ucode version if it won't be loaded When the IMU ucode is loaded by the PSP parsing the version that comes from Linux will vary. Rather than showing the wrong data to kernel interface consumers, avoid populating it in this case. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Mario Limonciello	d757dfd667	drm/amd: Move microcode init step to early_init() The intention for early init is to find any missing microcode early and fail the driver load if it's missing. Move this step to earlier in driver init to match other IP blocks. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Asad Kamal	d8c1925ba8	drm/amdgpu: update retry times for psp BL wait Increase retry time for PSP BL wait, to compensate for longer time to set c2pmsg 35 ready bit during mode1 with RAS Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Alex Deucher	28ab9a02b6	drm/amdgpu/mes11: remove aggregated doorbell code It's not enabled in hardware so the code is dead. Remove it. Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Asad Kamal	53dd920c1f	drm/amdgpu : Add hive ras recovery check If one of the devices in the hive detects a fatal error, need to send ras recovery reset message to PMFW of all devices in the hive. For that add a flag in hive to indicate that it's undergoing ras recovery Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:51 -04:00
Mangesh Gadre	2d955a06a5	Revert "drm/amdgpu: Program xcp_ctl registers as needed" This reverts commit `0bdebfef3f`. XCP_CTL register is programmed by firmware and register access is protected. Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:50 -04:00
Lang Yu	ab29ac57ad	drm/amdgpu/umsch: add suspend and resume callback Add missing IP callbacks. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:26:50 -04:00
Christian König	6b18ef481f	drm/amdgpu: ignore duplicate BOs again Looks like RADV is actually hitting this. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `ca6c1e210a` ("drm/amdgpu: use the new drm_exec object for CS v3") Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231017121015.1336786-1-christian.koenig@amd.com	2023-10-19 13:19:44 +02:00
Dave Airlie	27442758e9	Merge tag 'amd-drm-next-6.7-2023-10-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.7-2023-10-13: amdgpu: - DC replay fixes - Misc code cleanups and spelling fixes - Documentation updates - RAS EEPROM Updates - FRU EEPROM Updates - IP discovery updates - SR-IOV fixes - RAS updates - DC PQ fixes - SMU 13.0.6 updates - GC 11.5 Support - NBIO 7.11 Support - GMC 11 Updates - Reset fixes - SMU 11.5 Updates - SMU 13.0 OD support - Use flexible arrays for bo list handling - W=1 Fixes - SubVP fixes - DPIA fixes - DCN 3.5 Support - Devcoredump fixes - VPE 6.1 support - VCN 4.0 Updates - S/G display fixes - DML fixes - DML2 Support - MST fixes - VRR fixes - Enable seamless boot in more cases - Enable content type property for HDMI - OLED fixes - Rework and clean up GPUVM TLB flushing - DC ODM fixes - DP 2.x fixes - AGP aperture fixes - SDMA firmware loading cleanups - Cyan Skillfish GPU clock counter fix - GC 11 GART fix - Cache GPU fault info for userspace queries - DC cursor check fixes - eDP fixes - DC FP handling fixes - Variable sized array fixes - SMU 13.0.x fixes - IB start and size alignment fixes for VCN - SMU 14 Support - Suspend and resume sequence rework - vkms fix amdkfd: - GC 11 fixes - GC 10 fixes - Doorbell fixes - CWSR fixes - SVM fixes - Clean up GC info enumeration - Rework memory limit handling - Coherent memory handling fixes - Use partial migrations in GPU faults - TLB flush fixes - DMA unmap fixes - GC 9.4.3 fixes - SQ interrupt fix - GTT mapping fix - GC 11.5 Support radeon: - Misc code cleanups - W=1 Fixes - Fix possible buffer overflow - Fix possible NULL pointer dereference UAPI: - Add EXT_COHERENT memory allocation flags. These allow for system scope atomics. Proposed userspace: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88 - Add support for new VPE engine. This is a memory to memory copy engine with advanced scaling, CSC, and color management features Proposed mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25713 - Add INFO IOCTL interface to query GPU faults Proposed Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 Proposed libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231013175758.1735031-1-alexander.deucher@amd.com	2023-10-18 16:08:07 +10:00
Luben Tuikov	fa8391ad68	gpu/drm: Eliminate DRM_SCHED_PRIORITY_UNSET Eliminate DRM_SCHED_PRIORITY_UNSET, value of -2, whose only user was amdgpu. Furthermore, eliminate an index bug, in that when amdgpu boots, it calls drm_sched_entity_init() with DRM_SCHED_PRIORITY_UNSET, which uses it to index sched->sched_rq[]. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Link: https://lore.kernel.org/r/20231017035656.8211-2-luben.tuikov@amd.com	2023-10-17 20:35:38 -04:00
Luben Tuikov	eab0261967	drm/amdgpu: Unset context priority is now invalid A context priority value of AMD_CTX_PRIORITY_UNSET is now invalid--instead of carrying it around and passing it to the Direct Rendering Manager--and it becomes AMD_CTX_PRIORITY_NORMAL in amdgpu_ctx_ioctl(), the gateway to context creation. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Link: https://lore.kernel.org/r/20231017035656.8211-1-luben.tuikov@amd.com	2023-10-17 20:35:38 -04:00
Ma Ke	cd90511557	drm/amdgpu/vkms: fix a possible null pointer dereference In amdgpu_vkms_conn_get_modes(), the return value of drm_cvt_mode() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_cvt_mode(). Add a check to avoid null pointer dereference. Signed-off-by: Ma Ke <make_ruc2021@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:36:25 -04:00
Yang Wang	3bba4bc6a0	drm/amdgpu: add RAS error info support for umc_v12_0 add RAS error info support for umc_v12_0. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:36:11 -04:00
Yang Wang	8736d17a7f	drm/amdgpu: add RAS error info support for mmhub_v1_8 add RAS error info support for mmhub_v1_8. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:36:03 -04:00
Yang Wang	156c2814c2	drm/amdgpu: add RAS error info support for gfx_v9_4_3 add RAS error info support for gfx_v9_4_3. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:35:55 -04:00
Yang Wang	dd401cd29a	drm/amdgpu: add RAS error info support for sdma_v4_4_2. add RAS error info support for sdma_v4_4_2. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:35:45 -04:00
Yang Wang	5b1270beb3	drm/amdgpu: add ras_err_info to identify RAS error source introduced "ras_err_info" to better identify a RAS ERROR source. NOTE: For legacy chips, keep the original RAS error print format. v1: RAS errors may come from different dies during a RAS error query, therefore, need a new data structure to identify the source of RAS ERROR. v2: - use new data structure 'amdgpu_smuio_mcm_config_info' instead of ras_err_id (in v1 patch) - refine ras error dump function name - refine ras error dump log format Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:35:35 -04:00
Yifan Zhang	6a1c31c7a8	drm/amdgpu: flush the correct vmid tlb for specific pasid flush the correct vmid tlb for specific pasid on gmc 11. Fixes: `041a574388` ("drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid") Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:34:29 -04:00
Yang Wang	1a00cfab37	drm/amdgpu: make err_data structure built-in for ras_manager (No effect outside the ras_mgr data structure) Since a new member was added to the ras_err_data data structure, it becomes unreasonable for the ras_mgr instance to contain this data, because ras mgr only uses the 2 member information of ue_count/ce_count in err_data. This patch changes the code err_data into built-in structure members, making the code directly compatible. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:34:13 -04:00
Jesse Zhang	e341631f4a	drm/amdgpu: disable GFXOFF and PG during compute for GFX9 Temporary workaround to fix issues observed in some compute applications when GFXOFF is enabled on GFX9. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:56 -04:00
Lang Yu	ef2354c70f	drm/amdgpu/umsch: fix missing stuff during rebase These are missed during rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:49 -04:00
Lang Yu	fb5b73acf7	drm/amdgpu/umsch: correct IP version format FW uses IP_VERSION_MAJ_MIN_REV format. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:42 -04:00
Lang Yu	1c1f14a472	drm/amdgpu: don't use legacy invalidation on MMHUB v3.3 Legacy invalidation is not supported. This is missed during rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:29 -04:00
Lang Yu	4661482b9c	drm/amdgpu: correct NBIO v7.11 programing v7.7 was used before; switch to v7.11 now. Fix incorrect programming. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:21 -04:00
Xiaogang Chen	ffa88b0019	drm/amdgpu: Correctly use bo_va->ref_count in compute VMs This is needed to correctly handle BOs imported into compute VM from gfx. Both kfd and gfx should use the same bo_va and set bo_va->ref_count correctly when mapping the BOs into the same VM; otherwise we may trigger a kernel general protection fault when iterating mappings over bo_va's valids or invalids list. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Tested-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:08 -04:00
Lijo Lazar	f20f3b0d6c	drm/amd/pm: Add P2S tables for SMU v13.0.6 Add P2S table load support on SMU v13.0.6 ASICs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:01 -04:00
Lijo Lazar	79daf69246	drm/amdgpu: Add support to load P2S tables Add support to load P2S tables through PSP. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:55 -04:00
Lijo Lazar	cd21cb1fcb	drm/amdgpu: Update PSP interface header Adds FW id for P2S table. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:47 -04:00
Lijo Lazar	a8558fce7a	drm/amdgpu: Avoid FRU EEPROM access on APU FRU EEPROM access is not valid for APU devices. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:41 -04:00
Lin.Cao	f74f19c440	drm/amdgpu: save VCN instances init info before jpeg init The JPEG init header will overwrite the vcn init header info, which loses some debug information. Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:34 -04:00
Alex Hung	98a80bb3dd	Revert "drm/amd/display: Hande writeback request from userspace" This reverts commit `cd1a4bc228`. [WHY & HOW] The writeback series cause a regression in thunderbolt display. Signed-off-by: Alex Hung <alex.hung@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:24:47 -04:00
Alex Hung	731a20cb89	Revert "drm/amd/display: Add writeback enable field (wb_enabled)" This reverts commit `f6893fcb10`. [WHY & HOW] The writeback series cause a regression in thunderbolt display. Signed-off-by: Alex Hung <alex.hung@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:08:57 -04:00
Asad Kamal	625e5f3851	drm/amdgpu: Expose ras version & schema info Expose ras table version & schema info to sysfs v2: Updated schema to get poison support info from ras context, removed asic specific checks Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:02:54 -04:00
Lijo Lazar	d4a02673b3	drm/amdgpu: Read PSPv13 OS version from register PSP OS updates the version information in register. On APUs with PSPv13, PSP OS will already be loaded with SBIOS. Hence use the version register instead of using information in driver binary header. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:02:43 -04:00
Lang Yu	faeddb6eab	drm/amdgpu/umsch: enable doorbell for umsch Program vcn_doorbell_range with vcn_ring0_1. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:01:37 -04:00
Mario Limonciello	db99889065	drm/amd: Split up UVD suspend into prepare and suspend steps amdgpu_uvd_suspend() allocates memory and copies objects into that allocated memory. This fails under memory pressure. Instead move majority of this code into a prepare step when swap can still be allocated. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:01:04 -04:00
Mario Limonciello	cb11ca3233	drm/amd: Add concept of running prepare_suspend() sequence for IP blocks If any IP blocks allocate memory during their hw_fini() sequence this can cause the suspend to fail under memory pressure. Introduce a new phase that IP blocks can use to allocate memory before suspend starts so that it can potentially be evicted into swap instead. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:58 -04:00
Mario Limonciello	5095d54181	drm/amd: Evict resources during PM ops prepare() callback Linux PM core has a prepare() callback run before suspend. If the system is under high memory pressure, the resources may need to be evicted into swap instead. If the storage backing for swap is offlined during the suspend() step then such a call may fail. So move this step into prepare() to evict the majority of resources and update all non-pmops callers to call the same callback. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:18 -04:00
Li Ma	31715a8620	drm/amdgpu: enable GFX IP v11.5.0 CG and PG support Add CG support for GFX/MC/HDP/ATHUB/IH/BIF. Add PG support for GFX. Signed-off-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:15 -04:00
Li Ma	ad3e54ab9e	drm/amdgpu/discovery: add SMU 14 support add smu 14 into the IP discovery list. Signed-off-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:00 -04:00
Lang Yu	4acf679f86	drm/amdgpu/umsch: power on/off UMSCH by DLDO VCN 4.0.5 uses DLDO. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:59:32 -04:00
Lang Yu	617b472431	drm/amdgpu/umsch: fix psp frontdoor loading These changes are missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:59:24 -04:00
Lijo Lazar	558fcb7d11	drm/amdgpu: Increase IP discovery region size IP discovery region has increased to > 8K on some SOCs. Maximum reserve size is up to 12K, but not used. For now increase to 10K. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:59:16 -04:00
Lin.Cao	b053117e86	drm/amdgpu: Return -EINVAL when MMSCH init status incorrect Return -EINVAL when MMSCH init fails, which can be handled correctly by the function amdgpu_device_reset_sriov. Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:48 -04:00
Lang Yu	9a37f65c4e	drm/amdgpu/vpe: fix insert_nop ops Avoid infinite loop when count is 0. This is missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:33 -04:00
Srinivasan Shanmugam	54967d5683	drm/amdgpu: Address member 'gart_placement' not described in 'amdgpu_gmc_gart_location' Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c:274: warning: Function parameter or member 'gart_placement' not described in 'amdgpu_gmc_gart_location' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:22 -04:00
Lang Yu	84aa39ab1e	drm/amdgpu/vpe: align with mcbp changes MCBP is decided by adev->gfx.mcbp now. This is missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:13 -04:00
Lang Yu	99ea82f424	drm/amdgpu/vpe: remove IB end boundary requirement Remove IB end boundary requirement, VPE has no such limitations, use existing amdgpu_ring_generic_pad_ib() instead. This is missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:01 -04:00
Jay Cornwall	757920585d	drm/amdgpu: Improve MES responsiveness during oversubscription When MES is oversubscribed it may not frequently check for new command submissions from driver if the scheduling load is high. Response latency as high as 5 seconds has been observed. Enable a flag which adds a check for new commands between scheduling quantums. Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Cc: Alexandru Tudor <alexandru.tudor@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:57:46 -04:00
Thomas Zimmermann	57390019b6	Merge drm/drm-next into drm-misc-next Updating drm-misc-next to the state of Linux v6.6-rc2. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2023-10-11 09:50:59 +02:00
Icenowy Zheng	3806a8c647	drm/amdgpu: fix SI failure due to doorbells allocation SI hardware does not have doorbells at all; however, the code currently tries to do the allocation and fails, making SI AMDGPU unusable. Fix this failure by skipping doorbells allocation when doorbells count is zero. Fixes: `54c30d2a8d` ("drm/amdgpu: create kernel doorbell pages") Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:59:29 -04:00
Christian König	ff89f064dc	drm/amdgpu: add missing NULL check bo->tbo.resource can easily be NULL here. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2902 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org	2023-10-09 17:59:29 -04:00
Icenowy Zheng	219223eca4	drm/amdgpu: fix SI failure due to doorbells allocation SI hardware does not have doorbells at all; however, the code currently tries to do the allocation and fails, making SI AMDGPU unusable. Fix this failure by skipping doorbells allocation when doorbells count is zero. Fixes: `54c30d2a8d` ("drm/amdgpu: create kernel doorbell pages") Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:02:52 -04:00
Aaron Liu	ce862c4995	drm/amdgpu/discovery: enable DCN 3.5.0 support Enable DCN 3.5.0 support. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:02:46 -04:00
Arvind Yadav	367a0af433	drm/amdkfd: get doorbell's absolute offset based on the db_size Add db_size in bytes to find the doorbell's absolute offset for both 32-bit and 64-bit doorbell sizes, so that the doorbell offset is aligned based on the doorbell size. v2: - Addressed the review comment from Felix. v3: - Adding doorbell_size as parameter to get db absolute offset. v4: Squash the two patches into one. Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:02:34 -04:00
Christian König	31220ee9dc	drm/amdgpu: add missing NULL check bo->tbo.resource can easily be NULL here. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2902 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org	2023-10-09 17:01:32 -04:00
Yifan Zhang	061863e5db	drm/amdgpu: add hub->ctx_distance in setup_vmid_config add hub->ctx_distance when read CONTEXT1_CNTL, align w/ write back operation. v2: fix coding style errors reported by checkpatch.pl (Christian) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:59:06 -04:00
Stanley.Yang	80285ae1ec	drm/amdgpu: Fix potential null pointer derefernce amdgpu_ras_get_context may return NULL if the device does not support the ras feature, so add a check before using it. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:46 -04:00
Lijo Lazar	b3e73b5a8f	Documentation/amdgpu: Add FRU attribute details Add documentation for the newly added manufacturer and fru_id attributes in sysfs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:25 -04:00
Lijo Lazar	ac6b1f275f	drm/amdgpu: Add more FRU field information Add support to read Manufacturer Name and FRU File Id fields. Also add sysfs device attributes for external usage. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:17 -04:00
Lijo Lazar	8a2b51392a	drm/amdgpu: Refactor FRU product information Keep FRU related information together in a separate structure. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:08 -04:00
Yang Wang	be2e8aca06	drm/amdgpu: enable FRU device for SMU v13.0.6 v1: enable GFX v9.4.3 FRU device to query board information. v2: use MP1 version to identify different asic Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:51:58 -04:00
Boyuan Zhang	6cb8e3ee3a	drm/amdgpu: update ib start and size alignment Update IB starting address alignment and size alignment with correct values for decode and encode IPs. Decode IB starting address alignment: 256 bytes Decode IB size alignment: 64 bytes Encode IB starting address alignment: 256 bytes Encode IB size alignment: 4 bytes Also bump amdgpu driver version for this update. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:51:39 -04:00
Kees Cook	c8e7df374b	drm/amdgpu: Annotate struct amdgpu_bo_list with __counted_by Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct amdgpu_bo_list. Additionally, since the element count member must be set before accessing the annotated flexible array member, move its initialization earlier. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-hardening@vger.kernel.org Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Srinivasan Shanmugam	e0a3e7bf62	drm/amdgpu: Drop unnecessary return statements There is no reason to call return at the end of function that returns void. Fixes the below: WARNING: void function return statements are not generally useful Thus remove such a statement in the affected functions. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	4798db85b7	Documentation/amdgpu: Add board info details Add documentation for board info sysfs attribute. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	76da73f026	drm/amdgpu: Add sysfs attribute to get board info Add a sysfs attribute which shows the board form factor like OAM or CEM. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	b0a4553336	drm/amdgpu: Get package types for smuio v13.0 Add support to query package types supported in smuio v13.0 ASICs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	4365d2ed09	drm/amdgpu: Add more smuio v13.0.3 package types Expand support to get other board types like OAM or CEM. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Sathishkumar S	cbad0dd13a	drm/amdgpu: fix ip count query for xcp partitions fix wrong ip count INFO on spatial partitions. update the query to return the instance count corresponding to the partition id. v2: initialize variables only when required to be (Christian) move variable declarations to the beginning of function (Christian) Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	28a3f49609	drm/amdgpu: Move package type enum to amdgpu_smuio Move definition of package type to amdgpu_smuio header and add new package types for CEM and OAM. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Srinivasan Shanmugam	2b6b29f33f	drm/amdgpu: Fix complex macros error Fixes the below: ERROR: Macros with complex values should be enclosed in parentheses WARNING: macros should not use a trailing semicolon +#define amdgpu_inc_vram_lost(adev) atomic_inc(&((adev)->vram_lost_counter)); Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Alex Deucher	7d3f1d76f3	drm/amdgpu: refine fault cache updates Don't update the fault cache if status is 0. In the multiple fault case, subsequent faults will return a 0 status which is useless for userspace and replaces the useful fault status, so only update if status is non-0. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:49:51 -04:00
Alex Deucher	7a41ed8b59	drm/amdgpu: add new INFO ioctl query for the last GPU page fault Add an interface to query the last GPU page fault for the process. Useful for debugging context lost errors. v2: split vmhub representation between kernel and userspace v3: add locking when fetching fault info in INFO IOCTL Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 libdrm MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 Cc: samuel.pitoiset@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:49:39 -04:00
Kees Cook	ac8e62ab25	drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct ip_hw_instance. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230922173216.3823169-2-keescook@chromium.org	2023-10-05 11:29:09 +02:00
Mario Limonciello	134b8c5d86	drm/amd: Fix detection of _PR3 on the PCIe root port On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO. This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3. Cc: stable@vger.kernel.org Suggested-by: Jun Ma <Jun.Ma2@amd.com> Tested-by: David Perry <David.Perry@amd.com> Fixes: `b10c1c5b3a` ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 22:52:05 -04:00
Luben Tuikov	5d061675b7	drm/amdgpu: Fix a memory leak Fix a memory leak in amdgpu_fru_get_product_info(). Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Fixes: `0dbf2c5626` ("drm/amdgpu: Interpret IPMI data for product information (v2)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 22:43:26 -04:00
Philip Yang	fdac890966	drm/amdgpu: ratelimited override pte flags messages Use ratelimited version of dev_dbg to avoid flooding dmesg log. No functional change. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:40:11 -04:00
Mario Limonciello	b8e6aec146	drm/amd: Drop all hand-built MIN and MAX macros in the amdgpu base driver Several files declare MIN() or MAX() macros that ignore the types of the values being compared. Drop these macros and switch to min() min_t(), and max() from `linux/minmax.h`. Suggested-by: Hamza Mahfooz <Hamza.Mahfooz@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:39:52 -04:00
Alex Deucher	8dbf1ba867	drm/amdgpu: cache gpuvm fault information for gmc7+ Cache the current fault info in the vm struct. This can be queried by userspace later to help debug UMDs. Cc: samuel.pitoiset@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:37:07 -04:00
Alex Deucher	2e8ef6a561	drm/amdgpu: add cached GPU fault structure to vm struct When we get a GPU page fault, cache the fault for later analysis. Cc: samuel.pitoiset@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:57 -04:00
Rajneesh Bhardwaj	f4bff6e0b9	drm/amdgpu: Use ttm_pages_limit to override vram reporting On GFXIP9.4.3 APU, allow the memory reporting as per the ttm pages limit in NPS1 mode. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:42 -04:00
Rajneesh Bhardwaj	9b37d45d79	drm/amdgpu: Rework KFD memory max limits To allow bigger allocations specially on systems such as GFXIP 9.4.3 that use GTT memory for VRAM allocations, relax the limits to maximize ROCm allocations. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:17 -04:00
Alex Deucher	67318cb843	drm/amdgpu/gmc11: set gart placement GC11 Needed to avoid a hardware issue. v2: force high for all GC11 parts for consistency (Alex) v3: rebase Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:12 -04:00
Alex Deucher	917f91d8d8	drm/amdgpu/gmc: add a way to force a particular placement for GART We normally place GART based on the location of VRAM and the available address space around that, but provide an option to force a particular location for hardware that needs it. v2: Switch to passing the placement via parameter Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:07 -04:00
Lang Yu	a19d934986	drm/amdgpu: correct gpu clock counter query on cyan skilfish Cyan skilfish uses SMUIO v11.0.8 offset. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: <stable@vger.kernel.org> # v5.15+	2023-10-04 18:35:28 -04:00
Mario Limonciello	c4c8955b8a	drm/amd: Fix detection of _PR3 on the PCIe root port On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO. This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3. Cc: stable@vger.kernel.org Suggested-by: Jun Ma <Jun.Ma2@amd.com> Tested-by: David Perry <David.Perry@amd.com> Fixes: `b10c1c5b3a` ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:43:11 -04:00
Alex Hung	f6893fcb10	drm/amd/display: Add writeback enable field (wb_enabled) [WHAT] Add a new field to keep track whether a crtc is previously writeback-enabled. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:42:38 -04:00
Alex Hung	cd1a4bc228	drm/amd/display: Hande writeback request from userspace [WHAT] Handle writeback requests and fill in the required information for DWB programming and setup. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:42:06 -04:00
Alex Deucher	0021d70a06	drm/amdkfd: drop struct kfd_cu_info I think this was an abstraction back from when kfd supported both radeon and amdgpu. Since we just support amdgpu now, there is no more need for this and we can use the amdgpu structures directly. This also avoids having the kfd_cu_info structures on the stack when inlining which can blow up the stack. Cc: Arnd Bergmann <arnd@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:41:13 -04:00
Dave Airlie	79fb229b88	Merge tag 'drm-misc-next-2023-09-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.7-rc1: UAPI Changes: - drm_file owner is now updated during use, in the case of a drm fd opened by the display server for a client, the correct owner is displayed. - Qaic gains support for the QAIC_DETACH_SLICE_BO ioctl to allow bo recycling. Cross-subsystem Changes: - Disable boot logo for au1200fb, mmpfb and unexport logo helpers. Only fbcon should manage display of logo. - Update freescale in MAINTAINERS. - Add some bridge files to bridge in MAINTAINERS. - Update gma500 driver repo in MAINTAINERS to point to drm-misc. Core Changes: - Move size computations to drm buddy allocator. - Make drm_atomic_helper_shutdown(NULL) a nop. - Assorted small fixes in drm_debugfs, DP-MST payload addition error handling. - Fix DRM_BRIDGE_ATTACH_NO_CONNECTOR handling. - Handle bad (h/v)sync_end in EDID by clipping to htotal. - Build GPUVM as a module. Driver Changes: - Simple drivers don't need to cache prepared result. - Call drm_atomic_helper_shutdown() in shutdown/unbind for a whole lot more drm drivers. - Assorted small fixes in amdgpu, ssd130x, bridge/it6621, accel/qaic, nouveau, tc358768. - Add NV12 for komeda writeback. - Add arbitration lost event to synopsis/dw-hdmi-cec. - Speed up s/r in nouveau by not restoring some big bo's. - Assorted nouveau display rework in preparation for GSP-RM, especially related to how the modeset sequence works and the DP sequence in relation to link training. - Update anx7816 panel. - Support NVSYNC and NHSYNC in tegra. - Allow multiple power domains in simple driver. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/f1fae5eb-25b8-192a-9a53-215e1184ce81@linux.intel.com	2023-09-29 08:27:15 +10:00
Sui Jingfeng	18bf400530	drm/amdgpu: Use pci_get_base_class() to reduce duplicated code Use pci_get_base_class() to reduce duplicated code. No functional change intended. Link: https://lore.kernel.org/r/20230825062714.6325-5-sui.jingfeng@linux.dev Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 16:54:54 -05:00
Tao Zhou	fc59889071	drm/amdgpu: update retry times for psp vmbx wait Increase the retry loops and replace the constant number with macro. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:44:36 -04:00
Tao Zhou	1934907234	drm/amdgpu: exit directly if gpu reset fails No need to perform the full reset operation in case of gpu reset failure. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:44:36 -04:00
Mario Limonciello	93499bd6cd	drm/amd: Move microcode init from sw_init to early_init for CIK SDMA As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:38:17 -04:00
Mario Limonciello	751e293f2c	drm/amd: Move microcode init from sw_init to early_init for SDMA v2.4 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:38:11 -04:00
Mario Limonciello	cc76630483	drm/amd: Move microcode init from sw_init to early_init for SDMA v3.0 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:38:05 -04:00
Mario Limonciello	e0d4fbb58c	drm/amd: Move microcode init from sw_init to early_init for SDMA v5.2 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:37:50 -04:00
Mario Limonciello	95b456d3b0	drm/amd: Move microcode init from sw_init to early_init for SDMA v6.0 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:37:43 -04:00
Mario Limonciello	ed1c1053cd	drm/amd: Move microcode init from sw_init to early_init for SDMA v5.0 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:37:34 -04:00
Mario Limonciello	161d076c2d	drm/amd: Drop error message about failing to load SDMA firmware The error path for SDMA firmware loading is unnecessarily noisy. When a firmware is missing 3 errors show up: ``` amdgpu 0000:07:00.0: Direct firmware load for amdgpu/green_sardine_sdma.bin failed with error -2 [drm:sdma_v4_0_early_init [amdgpu]] ERROR Failed to load sdma firmware! [drm:amdgpu_device_init [amdgpu]] ERROR early_init of IP block <sdma_v4_0> failed -19 ``` The error code for the device init is bubbled up already, remove the second one. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:36:41 -04:00
Le Ma	3152d01e88	drm/amd/pm: deprecate allow_xgmi_power_down interface Replace with set_plpd_mode uniformly for places to use. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:36:23 -04:00
Mario Limonciello	3657a1d5ac	drm/amd: Limit seamless boot by default to APUs A hang is reported on DCN 3.2 with seamless boot enabled. As the benefits come from an eDP setup, limit it to only enabled by default with APUs. Suggested-by: Alexander.Deucher@amd.com Reported-by: feifei.xu@amd.com Closes: https://lore.kernel.org/amd-gfx/85b427f6-11ec-4249-bf6f-eadf9c375f88@amd.com/T/#m2887e919d7c01b2a4860d2261b366d22e070f309 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:35:31 -04:00
Alex Deucher	b2e1cbe628	drm/amdgpu/gmc11: disable AGP on GC 11.5 AGP aperture is deprecated and no longer functional. v2: fix typo (Alex) v3: just skip the agp setup call v4: revert back to the original model v5: back to v3 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:23 -04:00
David (Ming Qiang) Wu	fa1f1cc09d	drm/amdgpu: not to save bo in the case of RAS err_event_athub err_event_athub will corrupt VCPU buffer and not good to be restored in amdgpu_vcn_resume() and in this case the VCPU buffer needs to be cleared for VCN firmware to work properly. Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:23 -04:00
Luben Tuikov	9ed630c5c4	drm/amdgpu: Fix a memory leak Fix a memory leak in amdgpu_fru_get_product_info(). Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Fixes: `0dbf2c5626` ("drm/amdgpu: Interpret IPMI data for product information (v2)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:23 -04:00
Alex Deucher	de59b69932	drm/amdgpu/gmc: set a default disable value for AGP To disable AGP, the start needs to be set to a higher value than the end. Set a default disable value for the AGP aperture and allow the IP specific GMC code to enable it selectively by calling amdgpu_gmc_agp_location(). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:22 -04:00
Alex Deucher	29495d8145	drm/amdgpu/gmc6-8: properly disable the AGP aperture The BOT register needs to be larger than the TOP register for this to be properly disabled. The lower 22 bits of the BOT address are always 0 and the lower 22 bits of the TOP register are always 1 so you need to make the upper bits of BOT larger than the upper bits of TOP. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:22 -04:00
Mangesh Gadre	cd956e7531	drm/amdgpu:Expose physical id of device in XGMI hive This identifies the physical ordering of devices in the hive v2: fix compilation issue Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:22 -04:00
Lang Yu	7021b397c6	drm/amdgpu/vpe: fix truncation warnings Fix truncation warnings. Fixes: `9d4346bdbc` ("drm/amdgpu: add VPE 6.1.0 support") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202309200028.aUVuM8os-lkp@intel.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:21 -04:00
Philip Yang	101b810430	drm/amdkfd: Move dma unmapping after TLB flush Otherwise GPU may access the stale mapping and generate IOMMU IO_PAGE_FAULT. Move this to inside p->mutex to prevent multiple threads mapping and unmapping concurrently race condition. After kfd_mem_dmaunmap_attachment is removed from unmap_bo_from_gpuvm, kfd_mem_dmaunmap_attachment is called if failed to map to GPUs, and before free the mem attachment in case failed to unmap from GPUs. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:10 -04:00
Christian König	08abccc9a7	drm/amdgpu: further move TLB hw workarounds a layer up For the PASID flushing we already handled that at a higher layer, apply those workarounds to the standard flush as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	e2e3788850	drm/amdgpu: rework lock handling for flush_tlb v2 Instead of each implementation doing this more or less correctly move taking the reset lock at a higher level. v2: fix typo Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	3983c9fd2d	drm/amdgpu: drop error return from flush_gpu_tlb_pasid That function never fails, drop the error return. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	041a574388	drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	72cc99205c	drm/amdgpu: cleanup gmc_v10_0_flush_gpu_tlb_pasid The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	e7b90e99fa	drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb_pasid Testing for reset is pointless since the reset can start right after the test. The same PASID can be used by more than one VMID, invalidate each of them. Move the KIQ and all the workaround handling into common GMC code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	0c525aa406	drm/amdgpu: fix and cleanup gmc_v8_0_flush_gpu_tlb_pasid Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than one VMID, build a mask of VMIDs to invalidate instead of just resetting the first one. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	fb4c52db69	drm/amdgpu: fix and cleanup gmc_v7_0_flush_gpu_tlb_pasid Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than one VMID, build a mask of VMIDs to invalidate instead of just resetting the first one. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	a54db42ff3	drm/amdgpu: cleanup gmc_v11_0_flush_gpu_tlb Remove leftovers from copying this from the gmc v10 code. v2: squash in fix from Yifan Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	a70cb2176f	drm/amdgpu: rework gmc_v10_0_flush_gpu_tlb v2 Move the SDMA workaround necessary for Navi 1x into a higher layer. v2: use dev_err Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:52 -04:00
Tao Zhou	8c14a67bdf	drm/amdgpu: change if condition for bad channel bitmap update The amdgpu_ras_eeprom_control.bad_channel_bitmap is u32 type, but the channel index could be larger than 32. For the ASICs whose channel number is more than 32, the amdgpu_dpm_send_hbm_bad_channel_flag interface is not supported, so we simply bypass channel bitmap update under this condition. v2: replace sizeof with BITS_PER_TYPE, we should check bit number instead of byte number. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:52 -04:00
Tao Zhou	6205b558e1	drm/amdgpu: fix value of some UMC parameters for UMC v12 Prepare for bad page retirement. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:52 -04:00
Philip Yang	e61801f162	drm/amdkfd: Don't use sw fault filter if retry cam enabled If retry CAM is enabled, we neither use the sw retry fault filter nor add the fault to the sw filter ring, so we shouldn't remove the fault from the sw filter. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Christian König	24a6eb92b7	drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb The KIQ code path was ignoring the second flush. Also avoid long lines and re-calculating the register offsets over and over again. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Philip Yang	bcfb9cee61	drm/amdgpu: Increase IH soft ring size for GFX v9.4.3 dGPU On GFX v9.4.3 dGPU, applications have random timeout failure when XNACK on, dmesg log has "amdgpu: IH soft ring buffer overflow 0x900, 0x900", because dGPU mode has 272 cam entries. After increasing IH soft ring to 512 entries, no more IH soft ring overflow message and application passed. Fixes: `bf80d34b6c` ("drm/amdgpu: Increase soft IH ring size") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Lijo Lazar	c45e38f217	drm/amdgpu: Restore partition mode after reset On a full device reset, PSP FW gets unloaded. Hence restore the partition mode by placing a new request. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Cong Liu	f387bb578d	drm/amdgpu: fix a memory leak in amdgpu_ras_feature_enable This patch fixes a memory leak in the amdgpu_ras_feature_enable() function. The leak occurs when the function sends a command to the firmware to enable or disable a RAS feature for a GFX block. If the command fails, the kfree() function is not called to free the info memory. Fixes: `9f051d6ff1` ("drm/amdgpu: Free ras cmd input buffer properly") Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Cong Liu <liucong2@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 17:27:04 -04:00
Lijo Lazar	06cce38ef5	Revert "drm/amdgpu: Report vbios version instead of PN" This reverts commit `7748ce5b69`. vbios_version sysfs node is used to identify Part Number also. Revert to the same so that it doesn't break scripts/software which parse this. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 17:26:15 -04:00
Lijo Lazar	ff96ddc3f2	drm/amdgpu: Add more fields to IP version Include subrevision and variant fields also to IP version. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:25:17 -04:00
Tao Zhou	f8754f58d6	drm/amdgpu: print channel index for UMC bad page Print channel index for UMC v12. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:25:17 -04:00
Sathishkumar S	5aba51233b	drm/amdgpu: update IP count INFO query Update the query to return the number of functional instances where there is more than one instance of the requested type, and for others continue to return one. v2: count must reflect the actual number of engines (Alex) v3: fix wrong number of engines for vcn (Alex) Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:09 -04:00
Stanley.Yang	a83f2bf1f4	drm/amdgpu: Fix false positive error log First check whether the block ras obj is set, and return 0 directly if the block ras obj or hw_ops is not set. If the block doesn't support RAS, returning 0 is fine. Changed from V1: return 0 directly if block ras obj or hw ops is not set Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:09 -04:00
Vignesh Chander	8c95cda3e1	drm/amdgpu/jpeg: skip set pg for sriov Host handles PG. Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:09 -04:00
André Almeida	a769178585	drm/amdgpu: Rework coredump to use memory dynamically Instead of storing coredump information inside amdgpu_device struct, move it to a proper separate struct and allocate it dynamically. This will make it easier to further expand the logged information. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:07 -04:00
Cong Liu	5838f74c29	drm/amdgpu: fix a memory leak in amdgpu_ras_feature_enable This patch fixes a memory leak in the amdgpu_ras_feature_enable() function. The leak occurs when the function sends a command to the firmware to enable or disable a RAS feature for a GFX block. If the command fails, the kfree() function is not called to free the info memory. Fixes: `9f051d6ff1` ("drm/amdgpu: Free ras cmd input buffer properly") Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Cong Liu <liucong2@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:07 -04:00
Lijo Lazar	24f60ddc4b	drm/amdgpu: Fix vbios version string search Search for vbios version string in STRING_OFFSET-ATOM_ROM_HEADER region first. If those offsets are not populated, use the hardcoded region. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:06 -04:00
Lijo Lazar	2af351d692	Revert "drm/amdgpu: Report vbios version instead of PN" This reverts commit `7748ce5b69`. vbios_version sysfs node is used to identify Part Number also. Revert to the same so that it doesn't break scripts/software which parse this. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:06 -04:00
David Francis	5f248462c6	drm/amdgpu: Add EXT_COHERENT memory allocation flags These flags (for GEM and SVM allocations) allocate memory that allows for system-scope atomic semantics. On GFX943 these flags cause caches to be avoided on non-local memory. On all other ASICs they are identical in functionality to the equivalent COHERENT flags. Corresponding Thunk patch is at https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88 Reviewed-by: David Yat Sin <David.YatSin@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:06 -04:00
Yang Wang	4051844c66	drm/amdgpu: add amdgpu mca debug sysfs support add amdgpu mca debug sysfs support. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:25:19 -04:00
Alex Deucher	d11bbacee3	drm/amdgpu: add VPE IP discovery info to HW IP info query Add missing IP discovery info. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:25:11 -04:00
Yang Wang	7ff607e272	drm/amdgpu: add amdgpu smu mca dump feature support add amdgpu smu mca dump feature support. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:25:01 -04:00
Mario Limonciello	7f4ce7b50a	drm/amd: Enable seamless boot by default on newer ASICs Seamless boot can technically be supported as far back as DCN1 but to avoid regressions on older hardware, enable it for DCN3 and later. If users report using the module parameter that it works on older ASICs as well, this can be adjusted. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:46 -04:00
Mario Limonciello	5dc270d366	drm/amd: Add a module parameter for seamless boot The module parameter can be used to test more easily enabling seamless boot support on additional ASICs. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:39 -04:00
Timmy Tsai	2fa73a101c	drm/amd: Add HDP flush during jpeg init During jpeg init, CPU writes to frame buffer which can be cached by HDP, occasionally causing invalid header to be sent to MMSCH. Perform HDP flush after writing to frame buffer before continuing with jpeg init sequence. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:28 -04:00
Mario Limonciello	bb0f84293e	drm/amd: Move seamless boot check out of display This will allow base driver to dictate whether seamless should be enabled. No intended functional changes. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:21 -04:00
Mario Limonciello	3ef07651a5	drm/amd: Drop special case for yellow carp without discovery `amdgpu_gmc_get_vbios_allocations` has a special case for how to bring up yellow carp when amdgpu discovery is turned off. As this ASIC ships with discovery turned on, it's generally dead code and worse it causes `adev->mman.keep_stolen_vga_memory` to not be initialized for yellow carp. Remove it. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:04 -04:00
Lijo Lazar	4e8303cf2c	drm/amdgpu: Use function for IP version check Use an inline function for version check. Gives more flexibility to handle any format changes. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:23:28 -04:00
Tvrtko Ursulin	1c7a387ffe	drm: Update file owner during use With the typical model where the display server opens the file descriptor and then hands it over to the client(*), we were showing stale data in debugfs. Fix it by updating the drm_file->pid on ioctl access from a different process. The field is also made RCU protected to allow for lockless readers. Update side is protected with dev->filelist_mutex. Before: $ cat /sys/kernel/debug/dri/0/clients command pid dev master a uid magic Xorg 2344 0 y y 0 0 Xorg 2344 0 n y 0 2 Xorg 2344 0 n y 0 3 Xorg 2344 0 n y 0 4 After: $ cat /sys/kernel/debug/dri/0/clients command tgid dev master a uid magic Xorg 830 0 y y 0 0 xfce4-session 880 0 n y 0 1 xfwm4 943 0 n y 0 2 neverball 1095 0 n y 0 3 *) More detailed and historically accurate description of various handover implementation kindly provided by Emil Velikov: """ The traditional model, the server was the orchestrator managing the primary device node. From the fd, to the master status and authentication. But looking at the fd alone, this has varied across the years. IIRC in the DRI1 days, Xorg (libdrm really) would have a list of open fd(s) and reuse those whenever needed, DRI2 the client was responsible for open() themselves and with DRI3 the fd was passed to the client. Around the inception of DRI3 and systemd-logind, the latter became another possible orchestrator. Whereby Xorg and Wayland compositors could ask it for the fd. For various reasons (hysterical and genuine ones) Xorg has a fallback path going the open(), whereas Wayland compositors are moving to solely relying on logind... some never had fallback even. Over the past few years, more projects have emerged which provide functionality similar (be that on API level, Dbus, or otherwise) to systemd-logind. """ v2: * Fixed typo in commit text and added a fine historical explanation from Emil. 
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230621094824.2348732-1-tvrtko.ursulin@linux.intel.com Signed-off-by: Christian König <christian.koenig@amd.com>	2023-09-20 15:27:44 +02:00
Dave Airlie	1216d49178	amd-drm-fixes-6.6-2023-09-13: amdgpu: - GC 9.4.3 fixes - Fix white screen issues with S/G display on system with >= 64G of ram - Replay fixes - SMU 13.0.6 fixes - AUX backlight fix - NBIO 4.3 SR-IOV fixes for HDP - RAS fixes - DP MST resume fix - Fix segfault on systems with no vbios - DPIA fixes amdkfd: - CWSR grace period fix - Unaligned doorbell fix - CRIU fix for GFX11 - Add missing TLB flush on gfx10 and newer -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZQIRSAAKCRC93/aFa7yZ 2O/nAP4zB0fdLB46Hhz11aYsE9Zghe91b2rcmF4EYpEAQs7awwEAhSjy0Wiy6EYb prEGCdW0O8Tq7fdjr7+JrPmF7dasAQk= =SUbg -----END PGP SIGNATURE----- Merge tag 'amd-drm-fixes-6.6-2023-09-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.6-2023-09-13: amdgpu: - GC 9.4.3 fixes - Fix white screen issues with S/G display on system with >= 64G of ram - Replay fixes - SMU 13.0.6 fixes - AUX backlight fix - NBIO 4.3 SR-IOV fixes for HDP - RAS fixes - DP MST resume fix - Fix segfault on systems with no vbios - DPIA fixes amdkfd: - CWSR grace period fix - Unaligned doorbell fix - CRIU fix for GFX11 - Add missing TLB flush on gfx10 and newer Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230913195009.7714-1-alexander.deucher@amd.com	2023-09-15 09:50:50 +10:00
Daniel Vetter	15794f9dc3	One doc fix for drm/connector, one fix for amdgpu for a crash when VRAM usage is high, and one fix in gm12u320 to fix the timeout units in the code -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZPl/TAAKCRDj7w1vZxhR xWZZAP0b3k5vIuQdbiZBdXy7+guakiJ2DqOMxJJ+sYS5Mun53AEA73Cu1gmBNMoT d8H1uBjOfvPcXANNI0t0OgJfrESOdg8= =atPC -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2023-09-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes One doc fix for drm/connector, one fix for amdgpu for a crash when VRAM usage is high, and one fix in gm12u320 to fix the timeout units in the code Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/w5nlld5ukeh6bgtljsxmkex3e7s7f4qquuqkv5lv4cv3uxzwqr@pgokpejfsyef	2023-09-14 14:00:51 +02:00
Alex Deucher	addd7aef25	drm/amdgpu: add remap_hdp_registers callback for nbio 7.11 Implement support for remapping the HDP aperture registers for NBIO 7.11. Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-12 17:30:22 -04:00
Alex Deucher	b85a17d354	drm/amdgpu: add vcn_doorbell_range callback for nbio 7.11 Implement support for setting up the VCN doorbell range for NBIO 7.11. Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-12 17:30:18 -04:00
Thomas Zimmermann	c900529f3d	Merge drm/drm-fixes into drm-misc-fixes Forwarding to v6.6-rc1. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2023-09-12 08:53:30 +02:00
David Francis	5e7e822542	drm/amdgpu: Handle null atom context in VBIOS info ioctl On some APU systems, there is no atom context and so the atom_context struct is null. Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl to handle this case, returning all zeroes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:25:26 -04:00
Hawking Zhang	ffd6bde302	drm/amdgpu: fallback to old RAS error message for aqua_vanjaram So driver doesn't generate incorrect message until the new format is settled down for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:20:07 -04:00
Alex Deucher	ab43213e7a	drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset for SR-IOV Needed for HDP flush to work correctly. Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:19:48 -04:00
Alex Deucher	1832403cd4	drm/amdgpu/soc21: don't remap HDP registers for SR-IOV This matches the behavior for soc15 and nv. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:19:42 -04:00
Hamza Mahfooz	169ed4ece8	Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory" This reverts commit `70e64c4d52`. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: stable@vger.kernel.org # 6.1+ Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:18:17 -04:00
Mukul Joshi	97e3c6a853	drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3 Currently, we store CU info only for a single XCC assuming that it is the same for all XCCs. However, that may not be true. As a result, store CU info for all XCCs. This info is later used for CU masking. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:16:31 -04:00
Mukul Joshi	81faf9e0c3	drm/amdkfd: Fix reg offset for setting CWSR grace period This patch fixes the case where the code currently passes absolute register address and not the reg offset, which HWS expects, when sending the PM4 packet to set/update CWSR grace period. Additionally, cleanup the signature of build_grace_period_packet_info function as it no longer needs the inst parameter. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:15:43 -04:00
David Francis	86f2ec2265	drm/amdgpu: Handle null atom context in VBIOS info ioctl On some APU systems, there is no atom context and so the atom_context struct is null. Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl to handle this case, returning all zeroes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:22:32 -04:00
André Almeida	ffde72107b	drm/amdgpu: Create an option to disable soft recovery Create a module option to disable soft recoveries on amdgpu, making every recovery go through the device reset path. This option makes it easier to force device resets for testing and debugging purposes. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:22:23 -04:00
André Almeida	887db1e49a	drm/amdgpu: Merge debug module parameters Merge all developer debug options available as separate module parameters into one, making it obvious that they are for developers. Drop the obsolete module options in favor of the new ones. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:19:54 -04:00
Yifan Zhang	0e64c9aad0	drm/amdgpu: add type conversion for gc info gc info usage misses type conversion. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:17:30 -04:00
Mukul Joshi	68fa72a437	drm/amdgpu: Rename KGD_MAX_QUEUES to AMDGPU_MAX_QUEUES Rename KGD_MAX_QUEUES to AMDGPU_MAX_QUEUES to conform with the naming convention followed in amdgpu_gfx.h. No functional change. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:15:41 -04:00
Tao Zhou	ced575203a	drm/amdgpu: print more address info of UMC bad page Print out row, column and bank value of UMC error address for UMC v12. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:15:15 -04:00
Hawking Zhang	9f9d4651f7	drm/amdgpu: fallback to old RAS error message for aqua_vanjaram So driver doesn't generate incorrect message until the new format is settled down for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:15:02 -04:00
Alex Deucher	6a82822b90	drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset for SR-IOV Needed for HDP flush to work correctly. Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:14:59 -04:00
Alex Deucher	8a6e26e7ef	drm/amdgpu/soc21: don't remap HDP registers for SR-IOV This matches the behavior for soc15 and nv. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:14:50 -04:00
Hamza Mahfooz	601c63ad8e	Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory" This reverts commit `70e64c4d52`. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: stable@vger.kernel.org # 6.1+ Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:12:20 -04:00
Tao Zhou	3cb9ebc9d6	drm/amdgpu: add channel index table for UMC v12 Get UMC physical channel index according to node id, umc instance and channel instance. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:58 -04:00
Tao Zhou	40a08fe890	drm/amdgpu: add address conversion for UMC v12 Convert MCA error address to physical address and find out all pages in one physical row. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:35 -04:00
Lijo Lazar	ca7aa3bf31	drm/amdgpu: Use default reset method handler When reset method is not passed in reset context, look for the handler for default reset method. On Aldebaran, default reset method for SOCs connected to CPU over XGMI is MODE2. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:25 -04:00
Mukul Joshi	f705a6f021	drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3 Currently, we store CU info only for a single XCC assuming that it is the same for all XCCs. However, that may not be true. As a result, store CU info for all XCCs. This info is later used for CU masking. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:19 -04:00
Ma Jun	a1ce3e1f7c	drm/amd: Fix the flag setting code for interrupt request [1] Remove the irq flags setting code since pci_alloc_irq_vectors() handles these flags. [2] Free the msi vectors in case of error. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:09 -04:00
Lang Yu	dbb8052151	drm/amdgpu: fix unsigned error codes Fixes: `5d5eac7e83` ("drm/amdgpu: add selftest framework for UMSCH") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/all/ZPhddADtKmOuVyDq@lang-desktop Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:09:12 -04:00
Mukul Joshi	56d6daa3c7	drm/amdkfd: Fix reg offset for setting CWSR grace period This patch fixes the case where the code currently passes absolute register address and not the reg offset, which HWS expects, when sending the PM4 packet to set/update CWSR grace period. Additionally, cleanup the signature of build_grace_period_packet_info function as it no longer needs the inst parameter. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:07:33 -04:00
Arunpravin Paneer Selvam	2eb412aa25	drm/amdgpu: Move the size computations to drm buddy - Move roundup_power_of_two() and IS_ALIGNED() computations to drm buddy file to support the new try harder mechanism for contiguous allocation. - Move trim function call to drm_buddy_alloc_blocks() function. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230909160902.15644-2-Arunpravin.PaneerSelvam@amd.com Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2023-09-11 20:18:06 +02:00
Linus Torvalds	a48fa7efaf	drm fixes for 6.6-rc1 amdgpu: - Display replay fixes - Fixes for headless boards - Fix documentation breakage - RAS fixes - Handle newer IP discovery tables - SMU 13.0.6 fixes - SR-IOV fixes - Display vstartup fixes - NBIO 7.9 fixes - Display scaling mode fixes - Debugfs power reporting fix - GC 9.4.3 fixes - Dirty framebuffer fixes for fbcon - eDP fixes - DCN 3.1.5 fix - Display ODM fixes - GPU core dump fix - Re-enable zops property now that IGT test is fixed - Fix possible UAF in CS code - Cursor degamma fix amdkfd: - HMM fixes - Interrupt masking fix - GFX11 MQD fixes i915: - Mark requests for GuC virtual engines to avoid use-after-free nouveau: - Fix fence state in nouveau_fence_emit() ivpu: - replace strncpy -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmT6ihUACgkQDHTzWXnE hr6Vqg/+OGfbxx0qev5C93bLYpg8d4zbply0zTed2hU48zczARkOyDH+h2uYM4tP rmB6mK/0KRy3H8vkngKduR/IF6QBwzOnLDpS+C/TrHYQYqMDwvs3qEDVYqXh3V5H GPIFuu9sb2Nb/o9Fid70pvNACbgGsAUgqMUhQ/is3NjzOR8S/qjdBQ7wSdoUoOqx okTdlwuMK5SEYrihSyTZvhNcwpR/1L8JuxOUXXXUSQ0tRXBer/ZNF2lcEyYmQ0zs bZHKM4dNbdew/EhygbH6LVB5RjFaT5pGw08Xm8zJry+q5tXQV/NIXPQHL3vWqQoX i2QLbvGX/Uu8LJg9YNdsa1kPwNKADAxF64cW38Llv8ybsPHyva/I255j689/TvSG Se7HkTooURKS6GWFHPOkyMNC0Y+Fb/7WG5zUPSDVk9stqJz9pzx48okTsCWeuovD cBOssp8If1QsTyPvDq5A47l5z1oO3J1rdJ9fL0GnpOuXulPhgJGIoUSkftyI6lbw rhUoRd7w6VcgOsA9WIkAf+/325em0Y0AKZBgQnr2jfF56IE4iLa8yFuJNeReZ9oy W9yf14AB0orVm9+P4+PqATXBh+PdCTD1CPcB0MyK1SZAjh836Tc0HPKvJAUU6jp+ 8aw3BKXcaLjIP/dCyhSddXCnuuTPWreff5isktwgEXUtNFv9jx0= =fPeN -----END PGP SIGNATURE----- Merge tag 'drm-next-2023-09-08' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Regular rounds of rc1 fixes, a large bunch for amdgpu since it's three weeks in one go, one i915, one nouveau and one ivpu. I think there might be a few more fixes in misc that I haven't pulled in yet, but we should get them all for rc2. 
amdgpu: - Display replay fixes - Fixes for headless boards - Fix documentation breakage - RAS fixes - Handle newer IP discovery tables - SMU 13.0.6 fixes - SR-IOV fixes - Display vstartup fixes - NBIO 7.9 fixes - Display scaling mode fixes - Debugfs power reporting fix - GC 9.4.3 fixes - Dirty framebuffer fixes for fbcon - eDP fixes - DCN 3.1.5 fix - Display ODM fixes - GPU core dump fix - Re-enable zops property now that IGT test is fixed - Fix possible UAF in CS code - Cursor degamma fix amdkfd: - HMM fixes - Interrupt masking fix - GFX11 MQD fixes i915: - Mark requests for GuC virtual engines to avoid use-after-free nouveau: - Fix fence state in nouveau_fence_emit() ivpu: - replace strncpy" * tag 'drm-next-2023-09-08' of git://anongit.freedesktop.org/drm/drm: (51 commits) drm/amdgpu: Restrict bootloader wait to SMUv13.0.6 drm/amd/display: prevent potential division by zero errors drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma drm/amd/display: limit the v_startup workaround to ASICs older than DCN3.1 Revert "drm/amd/display: Remove v_startup workaround for dcn3+" drm/amdgpu: fix amdgpu_cs_p1_user_fence Revert "Revert "drm/amd/display: Implement zpos property"" drm/amdkfd: Add missing gfx11 MQD manager callbacks drm/amdgpu: Free ras cmd input buffer properly drm/amdgpu: Hide xcp partition sysfs under SRIOV drm/amdgpu: use read-modify-write mode for gfx v9_4_3 SQ setting drm/amdkfd: use mask to get v9 interrupt sq data bits correctly drm/amdgpu: Allocate coredump memory in a nonblocking way drm/amdgpu: Support query ecc cap for aqua_vanjaram drm/amdgpu: Add umc_info v4_0 structure drm/amd/display: always switch off ODM before committing more streams drm/amd/display: Remove wait while locked drm/amd/display: update blank state on ODM changes drm/amd/display: Add smu write msg id fail retry process drm/amdgpu: Add SMU v13.0.6 default reset methods ...	2023-09-07 19:47:04 -07:00
Lijo Lazar	fbe1a9e0c7	drm/amdgpu: Restrict bootloader wait to SMUv13.0.6 Restrict the wait for boot loader steady state only to SMUv13.0.6. For older SOCs, ASIC init has a longer wait period and that takes care. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 22:11:51 -04:00
Candice Li	7e6ec09974	drm/amdgpu: Add umc v12_0 ras functions Add umc v12_0 ras error querying. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:38:00 -04:00
Lijo Lazar	6b7d211740	drm/amdgpu: Fix refclk reporting for SMU v13.0.6 SMU v13.0.6 SOCs have 100MHz reference clock. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:37:03 -04:00
Hawking Zhang	c2c23a10f1	drm/amdgpu: Correct se_num and reg_inst for gfx v9_4_3 ras counters gfx_v9_4_3_ue|ce_reg_list is an array per gfx core instance. Correct the settings of se_num and reg_inst for some of the gfx ras counters so all the available register instances can be polled for ras status. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:36:57 -04:00
Lijo Lazar	1b8e56b994	drm/amdgpu: Restrict bootloader wait to SMUv13.0.6 Restrict the wait for boot loader steady state only to SMUv13.0.6. For older SOCs, ASIC init has a longer wait period and that takes care. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:36:52 -04:00
Lijo Lazar	b93fb0fe24	drm/amdgpu: Add only valid firmware version nodes Show only firmware version attributes that have valid version. Hide others. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:36:44 -04:00
Lang Yu	d519072d26	drm/amdgpu: fix incompatible types in conditional expression Use proper type. Fixes: `9d4346bdbc` ("drm/amdgpu: add VPE 6.1.0 support") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Solomon Chiu <solomon.chiu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202309020608.FwP8QMht-lkp@intel.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:35:29 -04:00
Srinivasan Shanmugam	eb3b214c37	drm/amdgpu: Use min_t to replace min Use min_t to replace min. min_t is a bit faster because min uses typeof twice. Using min_t is also cleaner here, since the min/max macros do a type check while min_t()/max_t() do an explicit type cast. Fixes the below checkpatch warning: WARNING: min() should probably be min_t() Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:35:22 -04:00
Candice Li	d57e24aa56	drm/amdgpu: Update amdgpu_device_indirect_r/wreg_ext Only calculate pcie_index_hi for register address greater than 32bits. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:34:26 -04:00
Candice Li	a76b2870bd	drm/amdgpu: Add RREG64_PCIE_EXT/WREG64_PCIE_EXT functions Add 64bits register access support on register whose address is greater than 32bits. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:34:18 -04:00
Srinivasan Shanmugam	9b70a1d414	drm/amdgpu: Declare array with strings as pointers constant This warning is for the declaration of a static array, and it is recommended to declare it as type "static const char * const" instead of "static const char *". An array pointer declared as type "static const char *" can point to a different character constant because the pointer is mutable. However, if it is declared as type "static const char * const", the pointer will point to an immutable character constant, preventing it from being modified which can better ensure the safety and stability of the program. Fixes the below: WARNING: static const char * array should probably be static const char * const Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:34:11 -04:00
Jiapeng Chong	df04434cb5	drm/amdgpu: clean up some inconsistent indenting No functional modification involved. drivers/gpu/drm/amd/amdgpu/nbio_v7_11.c:34 nbio_v7_11_get_rev_id() warn: inconsistent indenting. v2: drop leftover printk (Alex) Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6316 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:32:25 -04:00
Yifan Zhang	0bdf09cc5e	drm/amdgpu: calling address translation functions to simplify codes Use amdgpu_gmc_vram_pa to simplify codes. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:31:52 -04:00
Simon Pilkington	e2884fe84a	drm/amd: Make fence wait in suballocator uninterruptible Commit `c103a23f2f` ("drm/amd: Convert amdgpu to use suballocation helper.") made the fence wait in amdgpu_sa_bo_new() interruptible but there is no code to handle an interrupt. This caused the kernel to randomly explode in high-VRAM-pressure situations so make it uninterruptible again. Signed-off-by: Simon Pilkington <simonp.git@gmail.com> Fixes: `c103a23f2f` ("drm/amd: Convert amdgpu to use suballocation helper.") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2761 CC: stable@vger.kernel.org # 6.4+	2023-09-01 15:12:07 +02:00
Christian König	35588314e9	drm/amdgpu: fix amdgpu_cs_p1_user_fence The offset is just 32bits here so this can potentially overflow if somebody specifies a large value. Instead reduce the size to calculate the last possible offset. The error handling path incorrectly drops the reference to the user fence BO resulting in potential reference count underflow. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-08-31 18:14:49 -04:00
Hawking Zhang	9f051d6ff1	drm/amdgpu: Free ras cmd input buffer properly Do not access the pointer for ras input cmd buffer if it is even not allocated. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:12:13 -04:00
Rajneesh Bhardwaj	2031c46b09	drm/amdgpu: Hide xcp partition sysfs under SRIOV XCP partitions should not be visible for the VF for GFXIP 9.4.3. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:11:41 -04:00
Tao Zhou	e23b10675a	drm/amdgpu: use read-modify-write mode for gfx v9_4_3 SQ setting Instead of using direct update, avoid touching unrelated fields. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:11:09 -04:00
André Almeida	6d1b345548	drm/amdgpu: Allocate coredump memory in a nonblocking way During a GPU reset, a normal memory reclaim could block to reclaim memory. Giving that coredump is a best effort mechanism, it shouldn't disturb the reset path. Change its memory allocation flag to a nonblocking one. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:10:11 -04:00
Hawking Zhang	e0c5c387ac	drm/amdgpu: Support query ecc cap for aqua_vanjaram Driver queries umc_info v4_0 to identify ecc cap for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:09:45 -04:00
Lijo Lazar	05347402d1	drm/amdgpu: Add SMU v13.0.6 default reset methods For APUs with SMU v13.0.6, mode-2 reset is kept as default and for others mode-1 is the default reset method. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:05:43 -04:00
Lijo Lazar	7c2949c12e	drm/amdgpu: Add bootloader wait for PSP v13 Implement the wait for bootloader call back for PSP v13.0 ASICs. Only for ASICs with PSP v13.0.6, it needs an additional check for VBIOS mailbox status. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:04:10 -04:00
Hamza Mahfooz	0a611560f5	drm/amdgpu: register a dirty framebuffer callback for fbcon fbcon requires that we implement &drm_framebuffer_funcs.dirty. Otherwise, the framebuffer might take a while to flush (which would manifest as noticeable lag). However, we can't enable this callback for non-fbcon cases since it may cause too many atomic commits to be made at once. So, implement amdgpu_dirtyfb() and only enable it for fbcon framebuffers (we can use the "struct drm_file *file" parameter in the callback to check for this since it is only NULL when called by fbcon, at least in the mainline kernel) on devices that support atomic KMS. Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: stable@vger.kernel.org # 6.1+ Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2519 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:03:48 -04:00
Mangesh Gadre	7b9f623530	drm/amdgpu: Updated TCP/UTCL1 programming Update TCP/UTCL1 thrashing control settings v2: updated rev_id check Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:02:49 -04:00
Hawking Zhang	7d4424373d	drm/amdgpu: Fix the return for gpu mode1_reset amdgpu_device_mode1_reset will return gpu mode1_reset succeed (ret = 0) as long as wait_for_bootloader call succeed, regardless of the status reported by smu or psp firmware. This results to driver continue executing recovery even smu or psp fail to perform mode1 reset. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:02:10 -04:00
Mangesh Gadre	3f16096795	drm/amdgpu: Remove SRAM clock gater override by driver rlc firmware does required setting, driver need not do it. Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:59:17 -04:00
Lijo Lazar	7656168a8a	drm/amdgpu: Add bootloader status check Add a function to wait till bootloader has reached steady state. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:58:58 -04:00
Horace Chen	8c97e87c13	drm/amdkfd: use correct method to get clock under SRIOV [What] Current SRIOV still using adev->clock.default_XX which gets from atomfirmware. But these fields are abandoned in atomfirmware long ago. Which may cause function to return a 0 value. [How] We don't need to check whether SR-IOV. For SR-IOV one-vf-mode, pm is enabled and VF is able to read dpm clock from pmfw, so we can use dpm clock interface directly. For multi-VF mode, VF pm is disabled, so driver can just react as pm disabled. One-vf-mode is introduced from GFX9 so it shall not have any backward compatibility issue. Signed-off-by: Horace Chen <horace.chen@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:58:29 -04:00
Lijo Lazar	8f1778939b	drm/amdgpu: Unset baco dummy mode on nbio v7.9 BACO dummy mode could be set under reset conditions and that affects framebuffer access. Check If baco dummy mode is set, unset it if so. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:58:23 -04:00
YiPeng Chai	e81c455685	drm/amdgpu: Enable ras for mp0 v13_0_6 sriov Enable ras for mp0 v13_0_6 sriov Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:55:02 -04:00
Samir Dhume	bae44a8fcb	drm/amdgpu/jpeg - skip change of power-gating state for sriov Powergating is handled in the host driver. Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Samir Dhume <samir.dhume@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:54:18 -04:00
Le Ma	46b55e25c9	drm/amdgpu: update gc_info v2_1 from discovery Several new fields are exposed in gc_info v2_1 Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:53:19 -04:00
Le Ma	d4f6425a56	drm/amdgpu: update mall info v2 from discovery Mall info v2 is introduced in ip discovery Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:53:08 -04:00
Candice Li	4b721ed87e	drm/amdgpu: Only support RAS EEPROM on dGPU platform RAS EEPROM device is only supported on dGPU platform for smu v13_0_6. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:52:49 -04:00
Lang Yu	983ac45a06	drm/amdgpu: update SET_HW_RESOURCES definition for UMSCH Align with FW changes. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	eebb06d121	drm/amdgpu: add amdgpu_umsch_mm module parameter Enable Multi Media User Mode Scheduler (0 = disabled (default), 1 = enabled). Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	822f780829	drm/amdgpu/discovery: enable UMSCH 4.0 in IP discovery Enable UMSCH to support VPE and VCN user queues. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	4f94903332	drm/amdgpu: add PSP loading support for UMSCH Add front door loading support. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	40748f9a0a	drm/amdgpu: reserve mmhub engine 3 for UMSCH FW UMSCH FW uses mmhub engine 3 for invalidation. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Acked-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	d591ae0c9f	drm/amdgpu: add VPE queue submission test Submit a fence command through indirect buffer. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	5d5eac7e83	drm/amdgpu: add selftest framework for UMSCH Prepare for VPE and VCN queue submission test. v2: rebase on drm_exec (Alex) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:14:21 -04:00
Lang Yu	dc6f3d6ff2	drm/amdgpu: enable UMSCH scheduling for VPE Add VPE into UMSCH hw resources, set vmid mask to 0xf00, set hqd mask to 0xfe, then UMSCH can schedule VPE queues. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:40:56 -04:00
Lang Yu	3488c79bea	drm/amdgpu: add initial support for UMSCH Add basic data structure, dummy ring functions and ip functions for UMSCH. Implement sw_init(ring_init and init_microcode) and hw_init(load_microcode), UMSCH can boot up now. Implement hw_init(ring_start) and hw_fini(ring_stop), UMSCH is ready for command submission now. Implement set_hw_resources and add/remove_queue, UMSCH is ready for scheduling now. Aggregated doorbell is used to notify UMSCH FW that there is unmapped queue with corresponding priority level (e.g., AGDB[0] for Real time band, etc.) is updating its job. v2: squash together initial patches to avoid breaking the build (Alex) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:40:53 -04:00
Lang Yu	9c852a42a9	drm/amdgpu: add UMSCH firmware header definition Add firmware header definition for UMSCH. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:40:45 -04:00
Lang Yu	1a29f36781	drm/amdgpu: add UMSCH RING TYPE definition Add RING TYPE definition for Multi Media User Mode Scheduler. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:40:40 -04:00
Saleemkhan Jamadar	1cf36599b9	drm/amdgpu/jpeg: initialize number of jpeg ring Initialize number of jpeg ring for vcn 4.0.5. Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:39:41 -04:00
Christian König	a5492fe27f	drm/amdgpu: fix amdgpu_cs_p1_user_fence The offset is just 32bits here so this can potentially overflow if somebody specifies a large value. Instead reduce the size to calculate the last possible offset. The error handling path incorrectly drops the reference to the user fence BO resulting in potential reference count underflow. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:39:28 -04:00
Evan Quan	90bcb9b595	drm/amdgpu: revise the device initialization sequences Create the sysfs interfaces after `.late_init`, since some operations performed during `.late_init` may affect how the sysfs interfaces should be created. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:35:33 -04:00
Evan Quan	3e38b634f9	drm/amd/pm: introduce a new set of OD interfaces There will be multiple interfaces (sysfs files) exposed, each representing a single OD functionality. All of those interfaces will be arranged in a tree-like hierarchy with "gpu_od" as the top directory. Meanwhile, all functionalities for the same component will be arranged under the same directory. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:35:26 -04:00
Saleemkhan Jamadar	6be6e74b7d	drm/amdgpu: enable PG flags for VCN Enable PG flags for VCN and Jpeg on IP 11_5_0 Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:35:02 -04:00
Saleemkhan Jamadar	844d8dd5b9	drm/amdgpu/discovery: add VCN 4.0.5 Support Enable VCN 4.0.5 on gc 11_5_0. Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:58 -04:00
Saleemkhan Jamadar	c64f389506	drm/amdgpu/soc21: Add video cap query support for VCN_4_0_5 Added the video capability query support for VCN version 4_0_5 Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:52 -04:00
Saleemkhan Jamadar	cc308acc9b	drm/amdgpu:enable CG and PG flags for VCN Enable CG and PG flags for VCN on IP 11_5_0 Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:48 -04:00
Saleemkhan Jamadar	1827b37582	drm/amdgpu: add VCN_4_0_5 firmware support Add VCN_4_0_5 firmware support Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:43 -04:00
Saleemkhan Jamadar	8f98a715da	drm/amdgpu/jpeg: add jpeg support for VCN4_0_5 Add jpeg support for VCN4_0_5 v2 - update license year (Leo Liu) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:36 -04:00
Saleemkhan Jamadar	547aad32ed	drm/amdgpu: add VCN4 ip block support Add VCN 4.0.5 initialization and decoder/encoder ring functions. v2 - update license year (Leo Liu) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:32 -04:00
Lang Yu	f9ecae9a4e	drm/amdgpu: fix VPE front door loading issue Implement proper front door loading for vpe 6.1. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:22 -04:00
Lang Yu	5f6e9cdc83	drm/amdgpu: add VPE FW version query support Add support to query VPE FW version. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:19 -04:00
Lang Yu	3ee8fb7005	drm/amdgpu: enable VPE for VPE 6.1.0 Enable Video Processing Engine on SoCs that contain VPE 6.1.0. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:16 -04:00
Lang Yu	523c12802d	drm/amdgpu: add user space CS support for VPE Enable command submission to VPE from user space. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:14 -04:00
Lang Yu	c5d67a0ec3	drm/amdgpu: add PSP loading support for VPE Add PSP loading support for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:10 -04:00
Lang Yu	9d4346bdbc	drm/amdgpu: add VPE 6.1.0 support Add skeleton driver code. (Ray) Add initial support for Video Processing Engine. (Lang) Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:34:05 -04:00
Lang Yu	5861e47731	drm/amdgpu: add nbio 7.11 callback for VPE Add nbio callback to configure doorbell settings. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:56 -04:00
Lang Yu	75fdd738ff	drm/amdgpu: add nbio callback for VPE Add nbio callback to configure doorbell settings. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:53 -04:00
Lang Yu	964a36d7a4	drm/amdgpu: add PSP FW TYPE for VPE Add PSP FW TYPE for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:50 -04:00
Lang Yu	4c63735fa8	drm/amdgpu: add UCODE ID for VPE Add UCODE ID for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:47 -04:00
Lang Yu	ce7b59c1e6	drm/amdgpu: add support for VPE firmware name decoding Add decoding VPE firmware name support. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:43 -04:00
Lang Yu	2f3916bedb	drm/amdgpu: add doorbell index for VPE Add doorbell index for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:40 -04:00
Lang Yu	0b233357a6	drm/amdgpu: add HWID for VPE Add HWID for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:31 -04:00
Lang Yu	b0fa855cab	drm/amdgpu: add VPE firmware interface Add initial firmware interface. (Ray) Add more opcodes and rename to vpe_v6_1. (Lang) v2: Update copyright date (Alex) Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:27 -04:00
Lang Yu	878fe05116	drm/amdgpu: add VPE firmware header definition Add firmware header definition for Video Processing Engine. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:25 -04:00
Huang Rui	5b28f1c720	drm/amdgpu: add VPE HW IP BLOCK definition Add HW IP BLOCK for Video Processing Engine. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:21 -04:00
Huang Rui	2d6ea3b07c	drm/amdgpu: add VPE RING TYPE definition Add RING TYPE for Video Processing Engine. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:33:09 -04:00
Linus Torvalds	b6f6167ea8	pci-v6.6-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmTvfQgUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vyDKA//UBxniXTyxvN8L/agMZngFJd9jLkE p2lnk5eTW6y/aJp1g+ujc7IJEmHG/B1Flp0b5mK8XL7S6OBtAGlPwnuPPpXb0ZxV ofSuQpYoNZGpkYrQMYvATfdLnH2WF3Yj3WCqh5jd2EldPEyqhMV68l7NMzf6+td2 KWJPli1XO8e60JAzbhpXH9vn1I0T8e6Qx8z/ulcydfiOH3PGDPnVrEo8gw9CvJOr aDqSPW7uhTk2SjjUJcAlQVpTGclE4yBxOOhEbuSGc7L6Ab04Y6D0XKx1589AUK6Z W2dQFK3cFYNQQ9aS/2DMUG88H09ca5t8kgUf7Iz3uan1soPzSYK8SLNBgxAPs11S 1jY093rDXXoaCJqxWUwDc/JUpWq6T3g4m445SNvFIOMcSwmMOIfAwfug4UexE1zC Ie8u3Um35Mp25o0o6V1J2EjdBsUsm0p//CsslfoAAIWi85W02Z/46bLLcITchkCe bP05H+c55ZN6maRJiaeghcpY+iWO4XCRCKS9mF1v9yn7FOhNxhBcwgTNPyGBVrYz T9w3ynTHAmuwNqtd6jhpTR/b1902up/Qv9I8uHhBDMqJAXfHocGEXHZblNuZMgfE bu9cjcbFghUPdrhUHYmbEqAzhdlL2SFuMYfn8D4QV4A6x+32xCdwsi39I0Effm5V wl0HmemjKjTYbLw= =iFFM -----END PGP SIGNATURE----- Merge tag 'pci-v6.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Add locking to read/modify/write PCIe Capability Register accessors for Link Control and Root Control - Use pci_dev_id() when possible instead of manually composing ID from dev->bus->number and dev->devfn Resource management: - Move prototypes for __weak sysfs resource files to linux/pci.h to fix 'no previous prototype' warnings - Make more I/O port accesses depend on HAS_IOPORT - Use devm_platform_get_and_ioremap_resource() instead of open-coding platform_get_resource() followed by devm_ioremap_resource() Power management: - Ensure devices are powered up while accessing VPD - If device is powered-up, keep it that way while polling for PME - Only read PCI_PM_CTRL register when available, to avoid reading the wrong register and corrupting dev->current_state Virtualization: - Avoid Secondary Bus Reset on NVIDIA T4 GPUs Error handling: - Remove unused pci_disable_pcie_error_reporting() - Unexport pci_enable_pcie_error_reporting(), used only by 
aer.c - Unexport pcie_port_bus_type, used only by PCI core VGA: - Simplify and clean up typos in VGA arbiter Apple PCIe controller driver: - Initialize pcie->nvecs (number of available MSIs) before use Broadcom iProc PCIe controller driver: - Use of_property_read_bool() instead of low-level accessors for boolean properties Broadcom STB PCIe controller driver: - Assert PERST# when probing BCM2711 because some bootloaders don't do it Freescale i.MX6 PCIe controller driver: - Add .host_deinit() callback so we can clean up things like regulators on probe failure or driver unload Freescale Layerscape PCIe controller driver: - Add support for link-down notification so the endpoint driver can process LINK_DOWN events - Add suspend/resume support, including manual PME_Turn_off/PME_TO_Ack handshake - Save Link Capabilities during probe so they can be restored when handling a link-up event, since the controller loses the Link Width and Link Speed values during reset Intel VMD host bridge driver: - Fix disable of bridge windows during domain reset; previously we cleared the base/limit registers, which actually left the windows enabled Marvell MVEBU PCIe controller driver: - Remove unused busn member Microchip PolarFire PCIe controller driver: - Fix interrupt bit definitions so the SEC and DED interrupt handlers work correctly - Make driver buildable as a module - Read FPGA MSI configuration parameters from hardware instead of hard-coding them Microsoft Hyper-V host bridge driver: - To avoid a NULL pointer dereference, skip MSI restore after hibernate if MSI/MSI-X hasn't been enabled NVIDIA Tegra194 PCIe controller driver: - Revert 'PCI: tegra194: Enable support for 256 Byte payload' because Linux doesn't know how to reduce MPS from 256 to 128 bytes for endpoints below a switch (because other devices below the switch might already be operating), which leads to 'Malformed TLP' errors Qualcomm PCIe controller driver: - Add DT and driver support for interconnect bandwidth 
voting for 'pcie-mem' and 'cpu-pcie' interconnects - Fix broken SDX65 'compatible' DT property - Configure controller so MHI bus master clock will be switched off while in ASPM L1.x states - Use alignment restriction from EPF core in EPF MHI driver - Add Endpoint eDMA support - Add MHI eDMA support - Add Snapdragon SM8450 support to the EPF MHI driver - Use iATU for EPF MHI transfers smaller than 4K to avoid eDMA setup latency - Add sa8775p DT binding and driver support Rockchip PCIe controller driver: - Use 64-bit mask on MSI 64-bit PCI address to avoid zeroing out the upper 32 bits SiFive FU740 PCIe controller driver: - Set the supported number of MSI vectors so we can use all available MSI interrupts Synopsys DesignWare PCIe controller driver: - Add generic dwc suspend/resume APIs (dw_pcie_suspend_noirq() and dw_pcie_resume_noirq()) to be called by controller driver suspend/resume ops, and a controller callback to send PME_Turn_Off MicroSemi Switchtec management driver: - Add support for PCIe Gen5 devices Miscellaneous: - Reorder and compress to reduce size of struct pci_dev - Fix race in DOE destroy_work_on_stack() - Add stubs to avoid casts between incompatible function types - Explicitly include correct DT includes to untangle headers" * tag 'pci-v6.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (96 commits) PCI: qcom-ep: Add ICC bandwidth voting support dt-bindings: PCI: qcom: ep: Add interconnects path PCI: qcom-ep: Treat unknown IRQ events as an error dt-bindings: PCI: qcom: Fix SDX65 compatible PCI: endpoint: Add kernel-doc for pci_epc_mem_init() API PCI: epf-mhi: Use iATU for small transfers PCI: epf-mhi: Add support for SM8450 PCI: epf-mhi: Add eDMA support PCI: qcom-ep: Add eDMA 
support PCI: epf-mhi: Make use of the alignment restriction from EPF core PCI/PM: Only read PCI_PM_CTRL register when available PCI: qcom: Add support for sa8775p SoC dt-bindings: PCI: qcom: Add sa8775p compatible PCI: qcom-ep: Pass alignment restriction to the EPF core PCI: Simplify pcie_capability_clear_and_set_word() control flow PCI: Tidy config space save/restore messages PCI: Fix code formatting inconsistencies PCI: Fix typos in docs and comments PCI: Fix pci_bus_resetable(), pci_slot_resetable() name typos PCI: Simplify pci_dev_driver() ...	2023-08-30 20:23:07 -07:00
Srinivasan Shanmugam	8254e05c82	drm/amdgpu: Fix printk_ratelimit() with DRM_ERROR_RATELIMITED in 'amdgpu_cs_ioctl' Replaced printk_ratelimit() with its DRM equivalent to avoid flooding of dmesg logs & hence fixes the following: WARNING: Prefer printk_ratelimited or pr_<level>_ratelimited to printk_ratelimit + if (printk_ratelimit()) Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:17 -04:00
Srinivasan Shanmugam	bf227a4f05	drm/amdgpu: Use READ_ONCE() when reading the values in 'sdma_v4_4_2_ring_get_rptr' Use READ_ONCE() instead of declaring the pointer volatile. To prevent the compiler from refetching or reordering the read, so that the read value is always consistent. Link: https://lwn.net/Articles/624126/ Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: Le Ma <le.ma@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:17 -04:00
Yifan Zhang	4d5dc6260c	drm/amdgpu: remove unused parameter in amdgpu_vmid_grab_idle amdgpu_vm is not used in amdgpu_vmid_grab_idle. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:17 -04:00
Hawking Zhang	bf7aa8bea9	drm/amdgpu: Free ras cmd input buffer properly Do not access the pointer for ras input cmd buffer if it is even not allocated. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Ma Jun	8f9a9a09af	drm/amd: Simplify the bo size check funciton Simplify the code logic of size check function amdgpu_bo_validate_size Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Rajneesh Bhardwaj	d30279a9e3	drm/amdgpu: Hide xcp partition sysfs under SRIOV XCP partitions should not be visible for the VF for GFXIP 9.4.3. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Tao Zhou	ac3343c761	drm/amdgpu: use read-modify-write mode for gfx v9_4_3 SQ setting Instead of using direct update, avoid touching unrelated fields. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
ZhenGuo Yin	9f05cfc78c	drm/amdgpu: access RLC_SPM_MC_CNTL through MMIO in SRIOV runtime Register RLC_SPM_MC_CNTL is not blocked by L1 policy, VF can directly access it through MMIO during SRIOV runtime. v2: use SOC15 interface to access registers Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Lee Jones	668dfc4533	drm/amd/amdgpu/sdma_v6_0: Demote a bunch of half-completed function headers Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or member 'job' not described in 'sdma_v6_0_ring_emit_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or member 'flags' not described in 'sdma_v6_0_ring_emit_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:945: warning: Function parameter or member 'timeout' not described in 'sdma_v6_0_ring_test_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1124: warning: Function parameter or member 'ring' not described in 'sdma_v6_0_ring_pad_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1175: warning: Function parameter or member 'vmid' not described in 'sdma_v6_0_ring_emit_vm_flush' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1175: warning: Function parameter or member 'pd_addr' not described in 'sdma_v6_0_ring_emit_vm_flush' Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
André Almeida	d68ccdb263	drm/amdgpu: Allocate coredump memory in a nonblocking way During a GPU reset, a normal memory reclaim could block to reclaim memory. Giving that coredump is a best effort mechanism, it shouldn't disturb the reset path. Change its memory allocation flag to a nonblocking one. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Hawking Zhang	a8cde40201	drm/amdgpu: Support query ecc cap for aqua_vanjaram Driver queries umc_info v4_0 to identify ecc cap for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:13 -04:00
Lijo Lazar	c4b9dc5313	drm/amdgpu: Add SMU v13.0.6 default reset methods For APUs with SMU v13.0.6, mode-2 reset is kept as default and for others mode-1 is the default reset method. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:31:37 -04:00
Lee Jones	04cef5f583	drm/amd/amdgpu/amdgpu_doorbell_mgr: Correct misdocumented param 'doorbell_index' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c:123: warning: Function parameter or member 'doorbell_index' not described in 'amdgpu_doorbell_index_on_bar' drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c:123: warning: Excess function parameter 'db_index' description in 'amdgpu_doorbell_index_on_bar' Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:27:23 -04:00
Lee Jones	a728342ae4	drm/amd/amdgpu/imu_v11_0: Increase buffer size to ensure all possible values can be stored Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/imu_v11_0.c: In function ‘imu_v11_0_init_microcode’: drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:52:54: warning: ‘_imu.bin’ directive output may be truncated writing 8 bytes into a region of size between 4 and 33 [-Wformat-truncation=] drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:52:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 40 Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:27:18 -04:00
Lee Jones	ac84d99a11	drm/amd/amdgpu/amdgpu_sdma: Increase buffer size to account for all possible values Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c: In function ‘amdgpu_sdma_init_microcode’: drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:217:64: warning: ‘.bin’ directive output may be truncated writing 4 bytes into a region of size between 0 and 32 [-Wformat-truncation=] drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:217:17: note: ‘snprintf’ output between 13 and 52 bytes into a destination of size 40 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:215:66: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=] drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:215:17: note: ‘snprintf’ output between 12 and 41 bytes into a destination of size 40 Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:27:06 -04:00
Lee Jones	3dd8a754a5	drm/amd/amdgpu/amdgpu_ras: Increase buffer size to account for all possible values Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_sysfs_create’: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1406:20: warning: ‘_err_count’ directive output may be truncated writing 10 bytes into a region of size between 1 and 32 [-Wformat-truncation=] drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1405:9: note: ‘snprintf’ output between 11 and 42 bytes into a destination of size 32 Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:27:03 -04:00
Lee Jones	8057a9d656	drm/amd/amdgpu/amdgpu_device: Provide suitable description for param 'xcc_id' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:516: warning: Function parameter or member 'xcc_id' not described in 'amdgpu_mm_wreg_mmio_rlc' Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:58 -04:00
Christophe JAILLET	415b7ba36a	drm/amdgpu: Use kvzalloc() to simplify code kvzalloc() can be used instead of kvmalloc() + memset() + explicit NULL assignments. It is less verbose and more future proof. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:47 -04:00
Christophe JAILLET	5f5c75bf16	drm/amdgpu: Remove amdgpu_bo_list_array_entry() Now that there is an explicit flexible array at the end of 'struct amdgpu_bo_list', it can be used to remove amdgpu_bo_list_array_entry() and simplify some macro. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:45 -04:00
Christophe JAILLET	a23abe1fbd	drm/amdgpu: Remove a redundant sanity check The case where 'num_entries' is too big, is already handled by struct_size(), because kvmalloc() would fail. It will return -ENOMEM instead of -EINVAL, but it is only related to a unlikely to happen sanity check. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:38 -04:00
Christophe JAILLET	ff49bd2c74	drm/amdgpu: Explicitly add a flexible array at the end of 'struct amdgpu_bo_list' 'struct amdgpu_bo_list' is really used as if it was ended by a flex array. So make it more explicit and add a 'struct amdgpu_bo_list_entry entries[]' field at the end of the structure. This way, struct_size() can be used when it is allocated. It is less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:32 -04:00
Hawking Zhang	ec70578c83	drm/amdgpu: Allow issue disable gfx ras cmd to firmware Disable gfx ras command is needed in some use cases Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:25:54 -04:00
Lijo Lazar	e370f8f389	drm/amdgpu: Add bootloader wait for PSP v13 Implement the wait for bootloader call back for PSP v13.0 ASICs. Only for ASICs with PSP v13.0.6, it needs an additional check for VBIOS mailbox status. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:25:44 -04:00
Hamza Mahfooz	1c6b6bd078	drm/amdgpu: register a dirty framebuffer callback for fbcon fbcon requires that we implement &drm_framebuffer_funcs.dirty. Otherwise, the framebuffer might take a while to flush (which would manifest as noticeable lag). However, we can't enable this callback for non-fbcon cases since it may cause too many atomic commits to be made at once. So, implement amdgpu_dirtyfb() and only enable it for fbcon framebuffers (we can use the "struct drm_file file" parameter in the callback to check for this since it is only NULL when called by fbcon, at least in the mainline kernel) on devices that support atomic KMS. Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: stable@vger.kernel.org # 6.1+ Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2519 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:25:34 -04:00
Mangesh Gadre	7caebc8f99	drm/amdgpu: Updated TCP/UTCL1 programming Update TCP/UTCL1 thrashing control settings v2: updated rev_id check Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:20:14 -04:00
Srinivasan Shanmugam	f54e1d47e0	drm/amdgpu: Fix kcalloc over kzalloc in 'gmc_v9_0_init_mem_ranges' Replace kzalloc(n * sizeof(...), ...) with kcalloc(n, sizeof(...), ...) since kcalloc is the preferred API in case of allocating with multiply. Fixes the below: WARNING: Prefer kcalloc over kzalloc with multiply Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:20:02 -04:00
Philip Yang	5d44a766f7	drm/amdkfd: Share the original BO for GTT mapping If mGPUs is on same IOMMU group, or is ram direct mapped, then mGPUs can share the original BO for GTT mapping dma address, without creating new BO from export/import dmabuf. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:19:07 -04:00
Hawking Zhang	2c0f880abc	drm/amdgpu: Fix the return for gpu mode1_reset amdgpu_device_mode1_reset will return gpu mode1_reset succeed (ret = 0) as long as wait_for_bootloader call succeed, regardless of the status reported by smu or psp firmware. This results to driver continue executing recovery even smu or psp fail to perform mode1 reset. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:44 -04:00
benl	96271dd4d5	drm/amdgpu: add gfxhub 11.5.0 support Add initial gfxhub 11.5 support. Signed-off-by: benl <ben.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:15 -04:00
Prike Liang	b90975fa5b	drm/amdgpu: enable gmc11 for GC 11.5.0 Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:13 -04:00
Lang Yu	aba2be4147	drm/amdgpu: add mmhub 3.3.0 support Add initial implementation for mmhub 3.3.0. v2: squash in client id fix (Alex) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:09 -04:00
Prike Liang	b5549a2df0	drm/amdgpu/discovery: enable gfx11 for GC 11.5.0 Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:03 -04:00
Lang Yu	d3ff0189c1	drm/amdgpu/discovery: enable mes block for gc 11.5.0 Add to IP discovery table. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:59 -04:00
Aaron Liu	10c9d86918	drm/amdgpu: add mes firmware support for gc_11_5_0 Add scheduler and kiq firmware support for gc_11_5_0. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:57 -04:00
Aaron Liu	d717da1775	drm/amdgpu: add imu firmware support for gc_11_5_0 Add imu firmware support for gc_11_5_0. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:54 -04:00
Aaron Liu	8e42b463df	drm/amdgpu: add golden setting for gc_11_5_0 Initialize golden setting for gc_11_5_0. v2: squash in latest golden updates (Alex) v3: squash in checkpatch fix (Alex) Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:47 -04:00
Prike Liang	15e7cbd91d	drm/amdgpu/gfx11: initialize gfx11.5.0 Initialize gfx 11.5.0 and set gfx hw configuration. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:44 -04:00
Prike Liang	dd5a326155	drm/amdgpu/gmc11: initialize GMC for GC 11.5.0 memory support Initialize vram attribute and VMHUB for GC 11.5.0. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:39 -04:00
Prike Liang	d9d6833442	drm/amdgpu/discovery: add nbio 7.11.0 support Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:30 -04:00
benl	e44d856eaa	drm/amdgpu: add nbio 7.11 support Add initial nbio 7.11 implementation. Signed-off-by: benl <ben.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:27 -04:00
Prike Liang	bb7249ee45	drm/amdgpu/discovery: enable soc21 support Add 11.5.0 to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:22 -04:00
Prike Liang	0d1db799e7	drm/amdgpu/soc21: add initial GC 11.5.0 soc21 support Disable clock gating and power gating on the early bring up phase. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:18 -04:00
Prike Liang	2c8a7ca164	drm/amdgpu: add new AMDGPU_FAMILY definition add GC 11.5.0 family Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:15 -04:00
Lang Yu	f56c1941eb	drm/amdgpu: use 6.1.0 register offset for HDP CLK_CNTL Use 6.1.0 register offset and remove unused variable. v2: clean up logic (Alex) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:00:12 -04:00
Mangesh Gadre	559259362e	drm/amdgpu: Remove SRAM clock gater override by driver rlc firmware does required setting, driver need not do it. Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:59:30 -04:00
Lijo Lazar	15c5c5f575	drm/amdgpu: Add bootloader status check Add a function to wait till bootloader has reached steady state. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:59:24 -04:00
Horace Chen	0bc119fa2e	drm/amdkfd: use correct method to get clock under SRIOV [What] Current SRIOV still using adev->clock.default_XX which gets from atomfirmware. But these fields are abandoned in atomfirmware long ago. Which may cause function to return a 0 value. [How] We don't need to check whether SR-IOV. For SR-IOV one-vf-mode, pm is enabled and VF is able to read dpm clock from pmfw, so we can use dpm clock interface directly. For multi-VF mode, VF pm is disabled, so driver can just react as pm disabled. One-vf-mode is introduced from GFX9 so it shall not have any backward compatibility issue. Signed-off-by: Horace Chen <horace.chen@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:59:21 -04:00
Lijo Lazar	36b0f88988	drm/amdgpu: Unset baco dummy mode on nbio v7.9 BACO dummy mode could be set under reset conditions and that affects framebuffer access. Check If baco dummy mode is set, unset it if so. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:59:16 -04:00
YiPeng Chai	80578f1641	drm/amdgpu: Enable ras for mp0 v13_0_6 sriov Enable ras for mp0 v13_0_6 sriov Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:58:16 -04:00
Samir Dhume	00481158ca	drm/amdgpu/jpeg - skip change of power-gating state for sriov Powergating is handled in the host driver. Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Samir Dhume <samir.dhume@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:57:59 -04:00
Lijo Lazar	f8a499aed2	drm/amdgpu: Keep reset handlers shared Instead of maintaining a list per device, keep the reset handlers common per ASIC family. A pointer to the list of handlers is maintained in reset control. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:57:54 -04:00
Le Ma	e240020ad1	drm/amdgpu: update gc_info v2_1 from discovery Several new fields are exposed in gc_info v2_1 Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:57:32 -04:00
Le Ma	f489a41998	drm/amdgpu: update mall info v2 from discovery Mall info v2 is introduced in ip discovery Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:57:29 -04:00
Candice Li	46963ed585	drm/amdgpu: Only support RAS EEPROM on dGPU platform RAS EEPROM device is only supported on dGPU platform for smu v13_0_6. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:57:26 -04:00
Chen Jiahao	d903af1a91	drm/amd/amdgpu: Use kmemdup to simplify kmalloc and memcpy logic Using kmemdup() helper function rather than implementing it again with kmalloc() + memcpy(), which improves the code readability. Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:56:47 -04:00
Ilpo Järvinen	ce7d88110b	drm/amdgpu: Use RMW accessors for changing LNKCTL Don't assume that only the driver would be accessing LNKCTL. ASPM policy changes can trigger write to LNKCTL outside of driver's control. And in the case of upstream bridge, the driver does not even own the device it's changing the registers for. Use RMW capability accessors which do proper locking to avoid losing concurrent updates to the register value. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: `a2e73f56fa` ("drm/amdgpu: Add support for CIK parts") Fixes: `62a3755341` ("drm/amdgpu: add si implementation v10") Link: https://lore.kernel.org/r/20230717120503.15276-6-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-21 14:11:35 -05:00
Lijo Lazar	e20ff05170	drm/amdgpu: Add memory vendor information For ASICs with GC v9.4.3, determine the vendor information from scratch register. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:38:11 -04:00
Mario Limonciello	0dee726395	drm/amd: flush any delayed gfxoff on suspend entry DCN 3.1.4 is reported to hang on s2idle entry if graphics activity is happening during entry. This is because GFXOFF was scheduled as delayed but RLC gets disabled in s2idle entry sequence which will hang GFX IP if not already in GFXOFF. To help this problem, flush any delayed work for GFXOFF early in s2idle entry sequence to ensure that it's off when RLC is changed. commit `4b31b92b14` ("drm/amdgpu: complete gfxoff allow signal during suspend without delay") modified power gating flow so that if called in s0ix that it ensured that GFXOFF wasn't put in work queue but instead processed immediately. This is dead code due to commit `10cb67eb8a` ("drm/amdgpu: skip CG/PG for gfx during S0ix") because GFXOFF will now not be explicitly called as part of the suspend entry code. Remove that dead code. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:35:14 -04:00
Tim Huang	603b9a575d	drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix GFX v11.0.1 reported fence fallback timer expired issue on SDMA and GFX rings after S0ix resume. This is generated by EOP interrupts are disabled when S0ix suspend but fails to re-enable when resume because of the GFX is in GFXOFF. [ 203.349571] [drm] Fence fallback timer expired on ring sdma0 [ 203.349572] [drm] Fence fallback timer expired on ring gfx_0.0.0 [ 203.861635] [drm] Fence fallback timer expired on ring gfx_0.0.0 For S0ix, GFX is in GFXOFF state, avoid to touch the GFX registers to configure the fence driver interrupts for rings that belong to GFX. The interrupts configuration will be restored by GFXOFF exit. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:34:57 -04:00
Lijo Lazar	b5cdadedaa	drm/amdgpu: Remove gfxoff check in GFX v9.4.3 GFXOFF feature is not there for GFX 9.4.3 ASICs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:34:50 -04:00
James Zhu	400a39f1ec	drm/amdgpu: skip xcp drm device allocation when out of drm resource Return 0 when drm device alloc failed with -ENOSPC in order to allow amdgpu driver loading. But the xcp without drm device node assigned won't be visible in user space. This helps amdgpu driver loading on system which has more than 64 nodes, the current limitation. The proposal to add more drm nodes is discussed in public, which will support up to 2^20 nodes totally. kernel drm: https://lore.kernel.org/lkml/20230724211428.3831636-1-michal.winiarski@intel.com/T/ libdrm: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305 Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:34:11 -04:00
Samir Dhume	dd12b858c2	drm/amdgpu/vcn: Skip vcn power-gating change for sriov CG/PG is handled on the host side. Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:33:59 -04:00
Samir Dhume	d34fecc6e9	drm/amdgpu/jpeg: sriov support for jpeg_v4_0_3 initialization table handshake with mmsch Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:33:59 -04:00
Srinivasan Shanmugam	b828e1004c	drm/amdgpu: Replace ternary operator with min() in 'amdgpu_iomem_write' Fixes the following coccicheck: drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:2482:16-17: WARNING opportunity for min() min() macro is defined in include/linux/minmax.h. It avoids multiple evaluations of the arguments when non-constant and performs strict type-checking. Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:30 -04:00
Mario Limonciello	9366c2e87d	drm/amd: Rename AMDGPU_PP_SENSOR_GPU_POWER Use the clearer name `AMDGPU_PP_SENSOR_GPU_AVG_POWER` instead. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:30 -04:00
GUO Zihua	e8b2ad875f	drm/amdgpu: Remove duplicated includes Remove duplicated includes in amdgpu_amdkfd_gpuvm.c and amdgpu_ttm.c. Resolves checkincludes message. Signed-off-by: GUO Zihua <guozihua@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:29 -04:00
Alex Deucher	4d6fc55ab1	drm/amdgpu: expand runpm parameter Allow the user to specify -2 as auto enabled with displays. By default we don't enter runtime suspend when there are displays attached because it does not work well in some desktop environments due to the driver sending hotplug events on resume in case any new displays were attached while the GPU was powered down. Some users still want this functionality though, so this lets you enable it. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2428 Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:29 -04:00
Aurabindo Pillai	e94e787e37	drm/amd: Remove freesync video mode amdgpu parameter [Why&How] Freesync Video mode was enabled by default. Hence no need for the module parameter, so remove it completely Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:29 -04:00
Samir Dhume	d117fd2964	drm/amdgpu/vcn: sriov support for vcn_v4_0_3 initialization table handshake with mmsch Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:28 -04:00
Srinivasan Shanmugam	44fd83e920	drm/amdgpu: Replace ternary operator with min() in 'amdgpu_iomem_read' Fixes the following coccicheck: drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:2427:16-17: WARNING opportunity for min() min() macro is defined in include/linux/minmax.h. It avoids multiple evaluations of the arguments when non-constant and performs strict type-checking. Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:28 -04:00
Samir Dhume	945355c96e	drm/amdgpu/vcn: change end doorbell index for vcn_v4_0_3 For sriov, doorbell index for vcn0 for AID needs to be on 32 byte boundary so we need to move the vcn end doorbell Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:28 -04:00
Eric Huang	3831989d62	drm/amdkfd: workaround address watch clearing bug for gfx v9.4.2 KFD currently relies on MEC FW to clear tcp watch control register on UNMAP_PROCESS, but FW doesn't work on it, which is a bug. So the solution is to clear the register as gfx v9 in KFD. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:27 -04:00
Samir Dhume	dba24294ff	drm/amdgpu/jpeg: mmsch_v4_0_3 requires doorbell on 32 byte boundary BASE: VCN0 unified (32 byte boundary) BASE+4: MJPEG0 BASE+5: MJPEG1 BASE+6: MJPEG2 BASE+7: MJPEG3 BASE+12: MJPEG4 BASE+13: MJPEG5 BASE+14: MJPEG6 BASE+15: MJPEG7 Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:27 -04:00
Samir Dhume	a31c114bcf	drm/amdgpu/vcn: mmsch_v4_0_3 requires doorbell on 32 byte boundary Align on 32 byte boundary. Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:27 -04:00
Samir Dhume	8d72444288	drm/amdgpu/vcn: Add MMSCH v4_0_3 support for sriov The structures are the same as v4_0 except for the init header Signed-off-by: Samir Dhume <samir.dhume@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:27 -04:00
Candice Li	b81fde0dfe	drm/amdgpu: Add I2C EEPROM support on smu v13_0_6 Support I2C EEPROM on smu v13_0_6. v2: Move IP_VERSION(13, 0, 6) ahead of IP_VERSION(13, 0, 10). Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:08:27 -04:00
Xiongfeng Wang	bdacd16afa	drm/amd: Use pci_dev_id() to simplify the code PCI core API pci_dev_id() can be used to get the BDF number for a pci device. We don't need to compose it manually. Use pci_dev_id() to simplify the code a little bit. Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:07:42 -04:00
Srinivasan Shanmugam	d0d6928058	drm/amdgpu: Fix identifier names to function definition arguments in atom.h Fixes the following: WARNING: function definition argument 'struct card_info ' should also have an identifier name WARNING: function definition argument 'uint32_t' should also have an identifier name WARNING: function definition argument 'void ' should also have an identifier name WARNING: function definition argument 'struct atom_context ' should also have an identifier name WARNING: function definition argument 'int' should also have an identifier name WARNING: function definition argument 'uint32_t ' should also have an identifier name WARNING: Unnecessary space before function pointer name ERROR: space prohibited after that '*' (ctx:BxW) CHECK: Prefer kernel type 'u32' over 'uint32_t' Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:07:42 -04:00
YiPeng Chai	1b98a5f8e0	drm/amdgpu: mode1 reset needs to recover mp1 for mp0 v13_0_10 Mode1 reset needs to recover mp1 in fatal error case for mp0 v13_0_10. v2: Define a macro to wrap psp function calls. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:07:41 -04:00
Hawking Zhang	8b3a7a707c	drm/amdgpu: Remove unnecessary ras cap check RAS global isr will only be invoked by hardware interrupt. Don't need to query ras capability in isr In addition, amdgpu_ras_interrupt_fatal_error_handler ensures the isr won't be called from guest linux side by accident. The RAS cap check in isr that introduced to fix sriov crash is not needed any more Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 18:07:41 -04:00
Jiadong Zhu	1e9e15dcf4	drm/amdgpu: disable mcbp if parameter zero is set The parameter amdgpu_mcbp shall have priority against the default value calculated from the chip version. User could disable mcbp by setting the parameter mcbp as zero. v2: do not trigger preemption in sw ring muxer when mcbp is disabled. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 17:44:02 -04:00
Srinivasan Shanmugam	4c452b5c7d	drm/amdgpu: Fix missing comment for mb() in 'amdgpu_device_aper_access' This patch adds the missing code comment for memory barrier WARNING: memory barrier without comment + mb(); WARNING: memory barrier without comment + mb(); Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 17:40:44 -04:00
Alex Deucher	6be2ad4f00	drm/amdgpu: don't allow userspace to create a doorbell BO We need the domains in amdgpu_drm.h for the kernel driver to manage the pool, but we don't want userspace using it until the code is ready. So reject for now. Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-11 14:48:07 -04:00
Alex Deucher	c99a2e7ae2	drm/amdkfd: drop IOMMUv2 support Now that we use the dGPU path for all APUs, drop the IOMMUv2 support. v2: drop the now unused queue manager functions for gfx7/8 APUs Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-11 14:47:25 -04:00
Uros Bizjak	9e761bff03	drm/amdgpu: Use local64_try_cmpxchg in amdgpu_perf_read Use local64_try_cmpxchg instead of local64_cmpxchg (ptr, old, new) == old in amdgpu_perf_read. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg implicitly assigns old ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. No functional change intended. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 18:08:51 -04:00
Asad Kamal	59070fd9cc	drm/amdgpu: Add pci usage to nbio v7.9 Add implementation to get pcie usage for nbio v7.9. Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:05 -04:00
Asad Kamal	8d759dc664	drm/amdgpu: Add pcie usage callback to nbio Add a callback in nbio to get pcie usage Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:05 -04:00
Candice Li	bc0f80802d	drm/amdgpu: Extend poison mode check to SDMA/VCN/JPEG Treat SDMA/VCN/JPEG as RAS capable IP blocks in poison mode. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:05 -04:00
Emily Deng	f734b2133c	drm/amdgpu/irq: Move irq resume to the beginning Need to move irq resume to the beginning of reset sriov, or if one interrupt occurs before irq resume, then the irq won't work anymore. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:04 -04:00
Tao Zhou	7692e1ee24	drm/amdgpu: add RAS fatal error handler for NBIO v7.9 Register RAS fatal error interrupt and add handler. v2: only register NBIO RAS for dGPU platform. change nbio_v7_9_set_ras_controller_irq_state and nbio_v7_9_set_ras_err_event_athub_irq_state to dummy functions. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:04 -04:00
Srinivasan Shanmugam	657db07b32	drm/amdgpu: Fix identation issues in 'kgd_gfx_v9_program_trap_handler_settings' Fixes the following: ERROR: code indent should use tabs where possible WARNING: please, no spaces at the start of a line Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:04 -04:00
Alex Deucher	81af32520e	drm/amdgpu/gfx11: only enable CP GFX shadowing on SR-IOV This is only required for SR-IOV world switches, but it adds additional latency leading to reduced performance in some benchmarks. Disable for now on bare metal. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:04 -04:00
Lijo Lazar	7957ec80ef	drm/amdgpu: Add FRU sysfs nodes only if needed Create sysfs nodes for FRU data only if FRU data is available. Move the logic to FRU specific file. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:45:55 -04:00
Ran Sun	20c7435447	drm/amdgpu: Clean up errors in vcn_v3_0.c Fix the following errors reported by checkpatch: ERROR: space required before the open brace '{' ERROR: "foo * bar" should be "foo *bar" ERROR: space required before the open parenthesis '(' ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:44:00 -04:00
Ran Sun	7bb8c4f6a4	drm/amdgpu: Clean up errors in tonga_ih.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:58 -04:00
Ran Sun	7b57c54c96	drm/amdgpu: Clean up errors in gfx_v7_0.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line ERROR: trailing statements should be on next line ERROR: open brace '{' following struct go on the same line ERROR: space prohibited before that '++' (ctx:WxB) Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:56 -04:00
Ran Sun	2b2b5858f5	drm/amdgpu: Clean up errors in vcn_v4_0.c Fix the following errors reported by checkpatch: spaces required around that '==' (ctx:VxV) ERROR: space required before the open parenthesis '(' ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:54 -04:00
Ran Sun	c8a1439699	drm/amdgpu: Clean up errors in uvd_v3_1.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:52 -04:00
Ran Sun	599f7c8b85	drm/amdgpu: Clean up errors in mxgpu_vi.c Fix the following errors reported by checkpatch: ERROR: spaces required around that '-=' (ctx:WxV) Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:51 -04:00
Ran Sun	939a392f07	drm/amdgpu: Clean up errors in nv.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:49 -04:00
Ran Sun	baa5ede875	drm/amdgpu: Clean up errors in amdgpu_virt.c Fix the following errors reported by checkpatch: ERROR: space required before the open parenthesis '(' Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:47 -04:00
Ran Sun	1b01c010d7	drm/amdgpu: Clean up errors in amdgpu_ring.h Fix the following errors reported by checkpatch: ERROR: spaces required around that ':' (ctx:VxW) Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:45 -04:00
Ran Sun	7b7fbabbff	drm/amdgpu: Clean up errors in amdgpu_trace.h Fix the following errors reported by checkpatch: ERROR: space required after that ',' (ctx:VxV) ERROR: "foo* bar" should be "foo *bar" Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:43 -04:00
Ran Sun	91aafa3c4e	drm/amdgpu: Clean up errors in mes_v11_0.c Fix the following errors reported by checkpatch: ERROR: else should follow close brace '}' Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:42 -04:00
Ran Sun	98268d4033	drm/amdgpu: Clean up errors in amdgpu_atombios.h Fix the following errors reported by checkpatch: ERROR: open brace '{' following struct go on the same line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:40 -04:00
Ran Sun	06d82d87b4	drm/amdgpu: Clean up errors in soc21.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:38 -04:00
Ran Sun	18ef754488	drm/amdgpu: Clean up errors in dce_v8_0.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line ERROR: code indent should use tabs where possible ERROR: space required before the open brace '{' Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:36 -04:00
Ran Sun	665ba81b4a	drm/amdgpu/jpeg: Clean up errors in vcn_v1_0.c Fix the following errors reported by checkpatch: ERROR: space required before the open parenthesis '(' ERROR: space prohibited after that '~' (ctx:WxW) Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:34 -04:00
Ran Sun	e2515e2b90	drm/amdgpu: Clean up errors in mxgpu_nv.c Fix the following errors reported by checkpatch: ERROR: else should follow close brace '}' ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:32 -04:00
Ran Sun	2b77f199a5	drm/amdgpu: Clean up errors in dce_v10_0.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:30 -04:00
Ran Sun	7c29b40236	drm/jpeg: Clean up errors in jpeg_v2_0.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:28 -04:00
Ran Sun	a788b54f3d	drm/amdgpu: Clean up errors in uvd_v7_0.c Fix the following errors reported by checkpatch: ERROR: spaces required around that ':' (ctx:VxE) that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:26 -04:00
Ran Sun	7163dadea2	drm/amdgpu/atomfirmware: Clean up errors in amdgpu_atomfirmware.c Fix the following errors reported by checkpatch: ERROR: spaces required around that '>=' (ctx:WxV) ERROR: spaces required around that '!=' (ctx:WxV) ERROR: code indent should use tabs where possible Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:23 -04:00
Ran Sun	f291f9b9db	drm/amdgpu: Clean up errors in mmhub_v9_4.c Fix the following errors reported by checkpatch: ERROR: code indent should use tabs where possible ERROR: space required before the open parenthesis '(' Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:22 -04:00
Ran Sun	1f45f1c592	drm/amdgpu: Clean up errors in vega20_ih.c Fix the following errors reported by checkpatch: ERROR: trailing statements should be on next line ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:20 -04:00
Ran Sun	46eb29b867	drm/amdgpu: Clean up errors in ih_v6_0.c Fix the following errors reported by checkpatch: ERROR: trailing statements should be on next line ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:18 -04:00
Ran Sun	08110c26ce	drm/amdgpu: Clean up errors in amdgpu_psp.h Fix the following errors reported by checkpatch: ERROR: open brace '{' following struct go on the same line ERROR: open brace '{' following enum go on the same line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:17 -04:00
Ran Sun	042a70e43a	drm/amdgpu: Clean up errors in vce_v3_0.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:15 -04:00
Ran Sun	9c7f00f7d1	drm/amdgpu: Clean up errors in cik_ih.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:43:13 -04:00
Ruan Jinjie	3b780089fd	drm/amdgpu: Remove a lot of unnecessary ternary operators There are many ternary operators, the true or false judgement of which is unnecessary in C language semantics. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:39:56 -04:00
Alex Deucher	73b0648179	drm/amdgpu: fix possible UAF in amdgpu_cs_pass1() Since the gang_size check is outside of chunk parsing loop, we need to reset i before we free the chunk data. Suggested by Ye Zhang (@VAR10CK) of Baidu Security. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:39:40 -04:00
Lijo Lazar	7748ce5b69	drm/amdgpu: Report vbios version instead of PN Report VBIOS version in vbios_version sysfs node instead of part number. Part number remains constant for a SKU type. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:09 -04:00
Shashank Sharma	664c3b03f9	drm/amdgpu: cleanup MES process level doorbells MES allocates process level doorbells, but there is no userspace client to consume it. It was only being used for the MES ring tests (in kernel), and was written by kernel doorbell write. The previous patch of this series has changed the MES ring test code to use kernel level MES doorbells. This patch now cleans up the process level doorbell allocation code which is not required. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:07 -04:00
Shashank Sharma	e3cbb1f404	drm/amdgpu: use doorbell mgr for MES kernel doorbells This patch: - Removes the existing doorbell management code, and its variables from the doorbell_init function, it will be done in doorbell manager now. - uses the doorbell page created for MES kernel level needs (doorbells for MES self tests) - current MES code was allocating MES doorbells in MES process context, but those were getting written using kernel doorbell calls. This patch instead allocates a MES kernel doorbell for this (in add_hw_queue). V2: Create an extra page of doorbells for MES during kernel doorbell creation (Alex) V4: Move MES doorbell size and page offset objects in this patch from patch 6. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:07 -04:00
Ori Messinger	557d466b15	drm/amdgpu: Report Missing MES Firmware Versions with Sysfs Added missing MES firmware versions to the 'fw_version' sysfs directory, they should now exist as a files named "mes_fw_version" and "mes_kiq_fw_version" found at: /sys/class/drm/cardX/device/fw_version/mes_fw_version /sys/class/drm/cardX/device/fw_version/mes_kiq_fw_version Where X is the card number, and the version is displayed in hexadecimal. Signed-off-by: Ori Messinger <ori.messinger@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:07 -04:00
Shashank Sharma	d124aa0ac9	drm/amdgpu: get absolute offset from doorbell index This patch adds a helper function which converts a doorbell's relative index in a BO to an absolute doorbell offset in the doorbell BAR. V2: No space between the variable name doc (Luben) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:07 -04:00
Shashank Sharma	54c30d2a8d	drm/amdgpu: create kernel doorbell pages This patch: - creates a doorbell page for graphics driver usages. - adds a few new variables in adev->doorbell structure to keep track of kernel's doorbell-bo. - removes the adev->doorbell.ptr variable, replaces it with kernel-doorbell-bo's cpu address. V2: - Create doorbell BO directly, no wrapper functions (Alex) - no additional doorbell structure (Alex, Christian) - Use doorbell_cpu_ptr, remove ioremap (Christian, Alex) - Allocate one extra page of doorbells for MES (Alex) V4: Move MES doorbell base init into MES related patch (Christian) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Lijo Lazar	36f3f375ed	drm/amdgpu: Use nbio callback for nv and soc21 Make the new ASICs follow the nbio callback method to get pcie replay count. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Lijo Lazar	50709d18f4	drm/amdgpu: Add pci replay count to nbio v7.9 Add implementation to get pcie replay count for nbio v7.9. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Shashank Sharma	792b84fb90	drm/amdgpu: initialize ttm for doorbells This patch initializes the ttm resource manager for doorbells. V2: Do not round up doorbell size (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Alex Deucher	dc3499c71d	drm/amdgpu: accommodate DOMAIN/PL_DOORBELL This patch adds changes: - to accommodate the new GEM domain for DOORBELLs - to accommodate the new TTM PL for DOORBELLs in order to manage doorbell pages as GEM object. V2: Addressed review comments from Christian - drop the doorbell changes for pinning/unpinning - drop the doorbell changes for dma-buf map - drop the doorbell changes for sgt - no need to handle TTM_PL_FLAG_CONTIGUOUS for doorbell - add caching type for doorbell V3: - Removed unrelated empty line (Christian) - Add PL_DOORBELL in mem_type_to_domain() as well (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>	2023-08-07 17:14:06 -04:00
Shashank Sharma	794c33c66f	drm/amdgpu: don't modify num_doorbells for mes This patch removes the check and change in num_kernel_doorbells for MES, which is not being used anywhere by MES code. V2: Fixed checkpatch warnings. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Lijo Lazar	900af4e488	drm/amdgpu: Add pcie replay count callback to nbio Add a callback in nbio to get pcie replay count. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:14:06 -04:00
Srinivasan Shanmugam	07867a78f8	drm/amdgpu: Prefer pr_err/_warn/_notice over printk in amdgpu_atpx_handler.c Fixes the following style issues: ERROR: open brace '{' following function definitions go on the next line WARNING: printk() should include KERN_<LEVEL> facility level Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Bert Karwatzki <spasswolf@web.de> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:13:10 -04:00
Srinivasan Shanmugam	a494a7ce54	Revert "drm/amdgpu: Prefer dev_* variant over printk in amdgpu_atpx_handler.c" Usage of container_of is wrong here. struct acpi_device *adev = container_of(handle, struct acpi_device, handle) This reverts commit `b0bd0a92b8`. References: https://gitlab.freedesktop.org/drm/amd/-/issues/2744 Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Bert Karwatzki <spasswolf@web.de> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:13:09 -04:00
Zhigang Luo	e24b2fdaec	drm/amdgpu: init TA microcode for SRIOV VF when MP0 IP is 13.0.6 Init TA ucode for SRIOV. Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:13:09 -04:00
Zhigang Luo	66353ec433	drm/amdgpu: remove SRIOV VF FB location programming For SRIOV VF, FB location is programmed by host driver, no need to program it in guest driver. v2: squash in unused variable removal Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:13:09 -04:00
Prike Liang	f05f4fe6ab	drm/amdgpu: enable SDMA MGCG for SDMA 5.2.x Now the SDMA firmware can support SDMA MGCG properly, so let's enable it from the driver side. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Hawking Zhang	6fc9d92c3d	drm/amdgpu: Issue ras enable_feature for gfx ip only For non-GFX IP blocks, set up ras obj if ras feature is allowed. For GFX IP blocks, force issue ras enable_feature command to firmware and only set up ras obj if ras feature is allowed Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Hawking Zhang	a5c75947b4	drm/amdgpu: Remove gfx v11_0_3 ras_late_init call amdgpu_ras_late_init will invoke ras_late_init call per IP block Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Srinivasan Shanmugam	21539a6d41	drm/amdgpu: Clean up style problems in mmhub_v2_3.c Fixes the following: ERROR: code indent should use tabs where possible WARNING: Missing a blank line after declarations WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: suspect code indent for conditional statements (8, 24) + if (!(data & (DAGB0_CNTL_MISC2__DISABLE_WRREQ_CG_MASK | [...] + *flags |= AMD_CG_SUPPORT_MC_MGCG; Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Srinivasan Shanmugam	4e2abc197f	drm/amdgpu: Move vram, gtt & flash defines to amdgpu_ ttm & _psp.h As amdgpu.h is getting decomposed, move vram and gtt extern defines into amdgpu_ttm.h & flash extern to amdgpu_psp.h Fixes: `f9acfafc34` ("drm/amdgpu: Move externs to amdgpu.h file from amdgpu_drv.c") Suggested-by: Christian König <christian.koenig@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Hawking Zhang	62c4b772bd	drm/amdgpu: Apply poison mode check to GFX IP only For GFX IP that only supports poison consumption, GFX RAS won't be marked as enabled. i.e., hardware doesn't support gfx sram ecc. But driver still needs to issue firmware to enable poison consumption mode for GFX IP. In such case, check poison mode and treat GFX IP as RAS capable IP block. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Hawking Zhang	f957138cc3	drm/amdgpu: Only create err_count sysfs when hw_op is supported Some IP blocks only support partial ras feature and don't have ras counter and/or ras error status register at all. Driver should not create err_count sysfs node for those IP blocks. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Srinivasan Shanmugam	e2e42edfe8	drm/amdgpu: Sort the includes in amdgpu/amdgpu_drv.c Sort the include files that are included in amdgpu_drv.c alphabetically. Suggested-by: Mario Limonciello <mario.limonciello@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Srinivasan Shanmugam	5f95f00317	drm/amdgpu: Cleanup amdgpu/amdgpu_cgs.c Fixes the below: ERROR: switch and case should be at the same indent WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Block comments use * on subsequent lines WARNING: Comparisons should place the constant on the right side of the test Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:49 -04:00
Praful Swarnakar	2d5c04152a	drm/amdgpu: Fix style issues in amdgpu_psp.c Fixes the following to align to linux coding style: WARNING: Block comments use a trailing */ on a separate line WARNING: Block comments should align the * on each line Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Praful Swarnakar <Praful.Swarnakar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:48 -04:00
Praful Swarnakar	ad19c200b1	drm/amdgpu: Fix style issues in amdgpu_debugfs.c Fixes the following to align to linux coding style: WARNING: Missing a blank line after declarations WARNING: sizeof rd should be sizeof(rd) Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Praful Swarnakar <Praful.Swarnakar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 17:12:48 -04:00
Prike Liang	4c340d0034	drm/amdgpu/discovery: add ih 6.1.0 support Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:35:42 -04:00
Ben Li	0ba96fd3c0	drm/amdgpu: add ih 6.1 support Add initial support for IH 6.1. v2: Fix copyright date (Alex) Signed-off-by: Ben Li <ben.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:35:38 -04:00
Prike Liang	eff7a442c1	drm/amdgpu/discovery: add smuio 14.0.0 support Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:35:27 -04:00
Prike Liang	9b9a5e34d4	drm/amdgpu/discovery: add hdp 6.1.0 support Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:35:23 -04:00
Lijo Lazar	161c908d6a	drm/amdgpu: Match against exact bootloader status On PSP v13.x ASICs, boot loader will set only the MSB to 1 and clear the least significant bits for any command submission. Hence match against the exact register value, otherwise a register value of all 0xFFs also could falsely indicate that boot loader is ready. Also, from PSP v13.0.6 and newer, bits[7:0] will be used to indicate command error status. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:34:55 -04:00
Prike Liang	99af9c950d	drm/amdgpu/discovery: enable sdma6 for SDMA 6.1.0 Add to IP discovery table. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:34:04 -04:00
Mario Limonciello	70e64c4d52	drm/amd: Disable S/G for APUs when 64GB or more host memory Users report a white flickering screen on multiple systems that is tied to having 64GB or more memory. When S/G is enabled pages will get pinned to both VRAM carve out and system RAM leading to this. Until it can be fixed properly, disable S/G when 64GB of memory or more is detected. This will force pages to be pinned into VRAM. This should fix white screen flickers but if VRAM pressure is encountered may lead to black screens. It's a trade-off for now. Fixes: `81d0bcf990` ("drm/amdgpu: make display pinning more flexible (v2)") Cc: Hamza Mahfooz <Hamza.Mahfooz@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: <stable@vger.kernel.org> # 6.1.y: `bf0207e172` ("drm/amdgpu: add S/G display parameter") Cc: <stable@vger.kernel.org> # 6.4.y Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2735 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2354 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:32:54 -04:00
Prike Liang	7a22c147f7	drm/amdgpu/sdma6: initialize sdma 6.1.0 Add firmware declaration. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:32:46 -04:00
Daniel Vetter	3d00c59d14	Merge tag 'amd-drm-next-6.6-2023-07-28' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.6-2023-07-28: amdgpu: - Lots of checkpatch cleanups - GFX 9.4.3 updates - Add USB PD and IFWI flashing documentation - GPUVM updates - RAS fixes - DRR fixes - FAMS fixes - Virtual display fixes - Soft IH fixes - SMU13 fixes - Rework PSP firmware loading for other IPs - Kernel doc fixes - DCN 3.0.1 fixes - LTTPR fixes - DP MST fixes - DCN 3.1.6 fixes - SubVP fixes - Display bandwidth calculation fixes - VCN4 secure submission fixes - Allow building DC on RISC-V - Add visible FB info to bo_print_info - HBR3 fixes - Add PSP 14.0 support - GFX9 MCBP fix - GMC10 vmhub index fix - GMC11 vmhub index fix - Create a new doorbell manager - SR-IOV fixes amdkfd: - Cleanup CRIU dma-buf handling - Use KIQ to unmap HIQ - GFX 9.4.3 debugger updates - GFX 9.4.2 debugger fixes - Enable cooperative groups for gfx11 - SVM fixes radeon: - Lots of checkpatch cleanups Merge conflicts: - drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c The switch to drm eu helpers in `8a206685d3` ("drm/amdgpu: use drm_exec for GEM and CSA handling v2") clashed with the cosmetic cleanups from `30953c4d00` ("drm/amdgpu: Fix style issues in amdgpu_gem.c"). I kept the former since the cleaned-up code is gone. - drivers/gpu/drm/amd/amdgpu/atom.c. `adf64e2142` ("drm/amd: Avoid reading the VBIOS part number twice") removed code that `992b8fe106` ("drm/radeon: Replace all non-returning strlcpy with strscpy") polished. From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230728214228.8102-1-alexander.deucher@amd.com [sima: some merge conflict wrangling as noted] Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2023-08-04 11:10:18 +02:00
Lang Yu	6f38bdb86a	drm/amdgpu: correct vmhub index in GMC v10/11 Align with new vmhub definition. v2: use client_id == VMC to decide vmhub(Hawking) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 15:05:31 -04:00
Srinivasan Shanmugam	3dc6d8352e	drm/amdgpu: Fix non-standard format specifiers in 'amdgpu_show_fdinfo' Fixes the following: WARNING: %Lu is non-standard C, use %llu + seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context); WARNING: %Ld is non-standard C, use %lld + seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip], Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 15:05:31 -04:00
Jiadong Zhu	8cbbd11547	drm/amdgpu: set completion status as preempted for the resubmission The driver's CSA buffer is shared by all the ibs. When the high priority ib is submitted after the preempted ib, CP overrides the ib_completion_status as completed in the csa buffer. After that the preempted ib is resubmitted, CP would clear some locals stored for ib resume when reading the completed status, which causes gpu hang in some cases. Always set status as preempted for those resubmitted ib instead of reading everything from the CSA buffer. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2535 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2717 Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 15:04:19 -04:00
Srinivasan Shanmugam	7db36fe942	drm/amdgpu: Use parentheses for sizeof numa_info in 'amdgpu_acpi_get_numa_info' Fixes the below: WARNING: sizeof numa_info should be sizeof(*numa_info) Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:30 -04:00
Srinivasan Shanmugam	6cf20211fc	drm/amdgpu: Fix unnecessary else after return in 'amdgpu_eeprom_xfer' Fixes the following: WARNING: else is not generally useful after a break or return + return -EINVAL; + } else { Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Li Ma	82f33504a4	drm/amdgpu/discovery: enable PSP 14.0.0 support Add it to IP discovery. Signed-off-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Li Ma	14b2760f3c	drm/amdgpu: add PSP 14.0.0 support Uses same driver interface as 13.0. Signed-off-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Jonathan Kim	fc7f1d9697	drm/amdkfd: fix and enable ttmp setup for gfx11 The MES cached process context must be cleared on adding any queue for the first time. For proper debug support, the MES will clear its cached process context on the first call to SET_SHADER_DEBUGGER. This allows TTMPs to be persistently enabled in a safe manner. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Eric Huang <jinhuieric@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Srinivasan Shanmugam	f9acfafc34	drm/amdgpu: Move externs to amdgpu.h file from amdgpu_drv.c Fixes the following: WARNING: externs should be avoided in .c files +extern const struct attribute_group amdgpu_vram_mgr_attr_group; WARNING: externs should be avoided in .c files +extern const struct attribute_group amdgpu_gtt_mgr_attr_group; WARNING: externs should be avoided in .c files +extern const struct attribute_group amdgpu_flash_attr_group; And other style fixes: WARNING: Block comments should align the * on each line WARNING: void function return statements are not generally useful WARNING: braces {} are not necessary for single statement blocks Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Srinivasan Shanmugam	b0bd0a92b8	drm/amdgpu: Prefer dev_* variant over printk in amdgpu_atpx_handler.c Changed from printk to dev_* variants so that we get better debug info when there are multiple GPUs in the system. Fixes other style issue: ERROR: open brace '{' following function definitions go on the next line WARNING: printk() should include KERN_<LEVEL> facility level Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Srinivasan Shanmugam	7593164d2f	drm/amdgpu: Fix no new typedefs for enum _AMDGPU_DOORBELL_* Fixes the following: WARNING: do not add new typedefs Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Srinivasan Shanmugam	b8920e1e0d	drm/amdgpu: Fix ENOSYS means 'invalid syscall nr' in amdgpu_device.c ENOSYS should be used for nonexistent syscalls only, replace ENOSYS with EOPNOTSUPP for reset handlers that are not implemented for respective ASIC. WARNING: ENOSYS means 'invalid syscall nr' and nothing else + if (r == -ENOSYS) WARNING: ENOSYS means 'invalid syscall nr' and nothing else + if (r == -ENOSYS) And other following style fixes in amdgpu_device.c: WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. WARNING: Block comments should align the * on each line WARNING: Missing a blank line after declarations WARNING: braces {} are not necessary for single statement blocks Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Kent Russell <kent.russell@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Bob Zhou	8a92e8676c	drm/amdgpu: remove repeat code for mes_add_queue_pkt The setting of mes_add_queue_pkt is repeated, so remove it. Signed-off-by: Bob Zhou <bob.zhou@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:48:13 -04:00
Eric Huang	952ee94593	drm/amdgpu: enable trap of each kfd vmid for gfx v9.4.3 To setup ttmp on as default for gfx v9.4.3 in IP hw init. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:47:52 -04:00
Lijo Lazar	b5ac08806c	drm/amdgpu: Restore HQD persistent state register On GFX v9.4.3, compute queue MQD is populated using the values in HQD persistent state register. Hence don't clear the values on module unload, instead restore it to the default reset value so that MQD is initialized correctly during next module load. In particular, preload flag needs to be set on compute queue MQD, otherwise it could cause uninitialized values being used at device reset state resulting in EDC. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:27 -04:00
YuanShang	30b59910d9	drm/amdgpu: load sdma ucode in the guest machine [why] User mode driver need to check the sdma ucode version to see whether the sdma engine supports a new type of PM4 packet. In SRIOV, sdma is loaded by the host. And, there is no way to check the sdma ucode version of CHIP_NAVI12 and CHIP_SIENNA_CICHLID of the host in the guest machine. [how] Load the sdma ucode for CHIP_NAVI12 and CHIP_SIENNA_CICHLID in the guest machine. Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-By: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Srinivasan Shanmugam	fc8e55f378	drm/amdgpu: Use seq_puts() instead of seq_printf() For a constant format without additional arguments, use seq_puts() instead of seq_printf(). Also, it fixes the following warning. WARNING: Prefer seq_puts to seq_printf And other style fixes: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Block comments should align the * on each line Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Srinivasan Shanmugam	a0cc8e1512	drm/amdgpu: Update min() to min_t() in 'amdgpu_info_ioctl' Fixes the following: WARNING: min() should probably be min_t(size_t, size, sizeof(ip)) + ret = copy_to_user(out, &ip, min((size_t)size, sizeof(ip))); And other style fixes: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Missing a blank line after declarations Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Srinivasan Shanmugam	ce83aa7bad	drm/amdgpu: Remove else after return in 'is_fru_eeprom_supported' Expressions under 'else' branch under case 'CHIP_SIENNA_CICHLID' in function 'is_fru_eeprom_supported' are executed whenever the expression in 'if' is False. Otherwise, return from case occurs. Therefore, there is no need in 'else', and it has been removed. Fixes the following: WARNING: else is not generally useful after a break or return + return false; + } else { Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Srinivasan Shanmugam	50fbe0cc95	drm/amdgpu: Add -ENOMEM error handling when there is no memory Return -ENOMEM, when there is no sufficient dynamically allocated memory Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Stanley.Yang	fcb7a1849a	drm/amdgpu: Check APU flag to disable RAS Only disable RAS by default for aqua vanjaram on APU platform. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Shiwu Zhang	9bc12db4e2	drm/amdgpu: fix the indexing issue during rlcg access ctrl init In case that the GET_INST() is used for looping, only loops for the times of actual num of xcc, otherwise GET_INST() will return the invalid index, a.k.a -1 And also remove the redundant mask checking in case of GET_INST() Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Pierre-Eric Pelloux-Prayer	818c158fd4	drm/amdgpu: add VISIBLE info in amdgpu_bo_print_info This allows tools to distinguish between VRAM and visible VRAM. Use the opportunity to fix locking before accessing bo. v2: squash in unused variable fix Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:47:26 -04:00
Srinivasan Shanmugam	30953c4d00	drm/amdgpu: Fix style issues in amdgpu_gem.c Fixes the following to align to linux coding style: WARNING: braces {} are not necessary for any arm of this statement WARNING: Missing a blank line after declarations ERROR: space prohibited before that close parenthesis ')' WARNING: unnecessary whitespace before a quoted newline WARNING: %LX is non-standard C, use %llX Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:41:00 -04:00
Lijo Lazar	6cb209ed68	drm/amdgpu: Update ring scheduler info as needed Not all rings have scheduler associated. Only update scheduler data for rings with scheduler. It could result in out of bound access as total rings are more than those associated with particular IPs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:39:29 -04:00
sguttula	c6195ef5ee	drm/amdgpu: Enabling FW workaround through shared memory for VCN4_0_2 This patch will enable VCN FW workaround using DRM KEY INJECT WORKAROUND method, which is helping in fixing the secure playback. Signed-off-by: sguttula <Suresh.Guttula@amd.com> Reviewed-by: Leo Liu <leo.liiu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:39:21 -04:00
Srinivasan Shanmugam	93125cb704	drm/amd/amdgpu: Fix warnings in amdgpu/amdgpu_display.c Fixes the below checkpatch.pl warnings: WARNING: Block comments use * on subsequent lines WARNING: Block comments use a trailing */ on a separate line WARNING: suspect code indent for conditional statements (8, 12) WARNING: braces {} are not necessary for single statement blocks Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:36:58 -04:00
Srinivasan Shanmugam	37c3fc6620	drm/amdgpu: Return -ENOMEM when there is no memory in 'amdgpu_gfx_mqd_sw_init' Return -ENOMEM, when there is no sufficient dynamically allocated memory to create MQD backup for ring Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:36:16 -04:00
Srinivasan Shanmugam	88dd0b188e	drm/amdgpu: Fix do not add new typedefs in amdgpu_fw_attestation.c Fixes the following to align to coding style: WARNING: do not add new typedefs +typedef struct FW_ATT_DB_HEADER WARNING: do not add new typedefs +typedef struct FW_ATT_RECORD WARNING: Symbolic permissions 'S_IRUSR' are not preferred. Consider using octal permissions '0400'. + S_IRUSR, ERROR: "(foo*)" should be "(foo *)" WARNING: please, no space before tabs Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:36:08 -04:00
Srinivasan Shanmugam	b25b359926	drm/amdgpu: Prefer #if IS_ENABLED over #if defined in amdgpu_drv.c Adhere to linux coding style Fixes the following: WARNING: Prefer IS_ENABLED(<FOO>) to CONFIG_<FOO> || CONFIG_<FOO>_MODULE +#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE) WARNING: Prefer IS_ENABLED(<FOO>) to CONFIG_<FOO> || CONFIG_<FOO>_MODULE +#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE) Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:36:00 -04:00
Jonathan Kim	7a1c5c6753	drm/amdkfd: enable cooperative groups for gfx11 MES can concurrently schedule queues on the device that require exclusive device access if marked exclusively_scheduled without the requirement of GWS. Similar to the F32 HWS, MES will manage quality of service for these queues. Use this for cooperative groups since cooperative groups are device occupancy limited. Since some GFX11 devices can only be debugged with partial CUs, do not allow the debugging of cooperative groups on these devices as the CU occupancy limit will change on attach. In addition, zero initialize the MES add queue submission vector for MES initialization tests as we do not want these to be cooperative dispatches. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:35:43 -04:00
Horace Chen	83f24a8f05	drm/amdgpu: set sw state to gfxoff after SR-IOV reset [Why] Current SR-IOV will not set GC to off state, while it is a real GC hard reset. Without GFX off flag, driver may do gfxhub invalidation before firmware load and gfxhub gart enable. This operation may cause CP to become busy because GC is not in the right state for invalidation. [How] Add a function for SR-IOV to clean up some sw state before recover. Set adev->gfx.is_poweron to false to prevent gfxhub invalidation before gfx firmware autoload complete. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: HaiJun Chang <HaiJun.Chang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:35:23 -04:00
Yang Li	f135b0fc31	drm/amdgpu: Fix one kernel-doc comment Use colon to separate parameter name from their specific meaning. silence the warning: drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:793: warning: Function parameter or member 'adev' not described in 'amdgpu_vm_pte_update_noretry_flags' Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:25 -04:00
Mario Limonciello	6b4cf4a35f	drm/amd: Fix an error handling mistake in psp_sw_init() If the second call to amdgpu_bo_create_kernel() fails, the memory allocated from the first call should be cleared. If the third call fails, the memory from the second call should be cleared. Fixes: `b95b539168` ("drm/amdgpu/psp: move PSP memory alloc from hw_init to sw_init") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:25 -04:00
Victor Lu	9196b63bee	drm/amdgpu: Fix infinite loop in gfxhub_v1_2_xcc_gart_enable (v2) An instance of for_each_inst() was not changed to match its new behaviour and is causing a loop. v2: remove tmp_mask variable Fixes: `b579ea632f` ("drm/amdgpu: Modify for_each_inst macro") Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:25 -04:00
Lijo Lazar	0bdebfef3f	drm/amdgpu: Program xcp_ctl registers as needed XCP_CTL register is expected to be programmed by firmware. Under certain conditions FW may not have programmed it correctly. As a workaround, program it when FW has not programmed the right values. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:25 -04:00
sguttula	5dbb59247b	drm/amdgpu: allow secure submission on VCN4 ring This patch will enable secure decode playback on VCN4_0_2 Signed-off-by: sguttula <Suresh.Guttula@amd.com> Reviewed-by: Leo Liu <leo.liiu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:24 -04:00
Mario Limonciello	adf64e2142	drm/amd: Avoid reading the VBIOS part number twice The VBIOS part number is read both in amdgpu_atom_parse() as well as in atom_get_vbios_pn() and stored twice in the `struct atom_context` structure. Remove the first unnecessary read and move the `pr_info` line from that read into the second. v2: squash in unused variable removal Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-21 16:52:24 -04:00
Guchun Chen	18cf073faa	drm/amdgpu: use a macro to define no xcp partition case ~0 as no xcp partition is used in several places, so improve its definition by a macro for code consistency. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:19:02 -04:00
Guchun Chen	e379b5e7dc	drm/amdgpu/vm: use the same xcp_id from root PD Other PDs/PTs allocation should just use the same xcp_id as that stored in root PD. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:18:53 -04:00
Guchun Chen	5003ca63bc	drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_create Recent code stores the xcp_id from the file private data (set when the device is opened) in the amdgpu bo, for purposes such as memory usage accounting. However, not all VMs are attached to this fpriv structure (for example, the VM in amdgpu_mes_self_test), and in that case KASAN reports the out-of-bounds access below. More importantly, VM code should not touch the fpriv structure, so drop the fpriv handling from amdgpu_vm_pt.
[ 77.292314] BUG: KASAN: slab-out-of-bounds in amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[ 77.293845] Read of size 4 at addr ffff888102c48a48 by task modprobe/1069
[ 77.294146] Call Trace:
[ 77.294178] <TASK>
[ 77.294208] dump_stack_lvl+0x49/0x63
[ 77.294260] print_report+0x16f/0x4a6
[ 77.294307] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[ 77.295979] ? kasan_complete_mode_report_info+0x3c/0x200
[ 77.296057] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[ 77.297556] kasan_report+0xb4/0x130
[ 77.297609] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[ 77.299202] __asan_load4+0x6f/0x90
[ 77.299272] amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[ 77.300796] ? amdgpu_init+0x6e/0x1000 [amdgpu]
[ 77.302222] ? amdgpu_vm_pt_clear+0x750/0x750 [amdgpu]
[ 77.303721] ? preempt_count_sub+0x18/0xc0
[ 77.303786] amdgpu_vm_init+0x39e/0x870 [amdgpu]
[ 77.305186] ? amdgpu_vm_wait_idle+0x90/0x90 [amdgpu]
[ 77.306683] ? kasan_set_track+0x25/0x30
[ 77.306737] ? kasan_save_alloc_info+0x1b/0x30
[ 77.306795] ? __kasan_kmalloc+0x87/0xa0
[ 77.306852] amdgpu_mes_self_test+0x169/0x620 [amdgpu]
v2: without specifying xcp partition for PD/PT bo, the xcp id is -1.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2686 Fixes: `3ebfd221c1` ("drm/amdkfd: Store xcp partition id to amdgpu bo") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:18:16 -04:00
Guchun Chen	50e633081e	drm/amdgpu: Allocate root PD on correct partition file_priv needs to be set up first; otherwise, the root PD will always be allocated on partition 0, even if the device is opened from another partition. Fixes: `3ebfd221c1` ("drm/amdkfd: Store xcp partition id to amdgpu bo") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:16:48 -04:00
Victor Lu	8ed49dd1d3	drm/amdgpu: Add RLCG interface driver implementation for gfx v9.4.3 (v3) Add RLCG interface support for gfx v9.4.3 and multiple XCCs. Do not enable it yet. v2: Fix amdgpu_rlcg_reg_access_ctrl init, add support for multiple XCCs in amdgpu_mm_wreg_mmio_rlc v3: Use GET_INST() when indexing amdgpu_rlcg_reg_access_ctrl Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:16:41 -04:00
Shashank Sharma	43c064db65	drm/amdgpu: create a new file for doorbell manager This patch: - creates a new file for doorbell management. - moves doorbell code from amdgpu_device.c to this file. V2: - remove doc from function declaration (Christian) - remove 'device' from function names to make it consistent (Alex) - add SPDX license identifier (Luben) V3: - change license to MIT license(Christian) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:12:08 -04:00
Candice Li	5229a37e17	drm/amdgpu: Allow the initramfs generator to include psp_13_0_6_ta Allow the initramfs generator to automatically include psp_13_0_6_ta firmware to initramfs. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:11:49 -04:00
Stanley.Yang	276f6e8cb7	drm/amdgpu: Disable RAS by default on APU platform Disable the RAS feature by default for aqua vanjaram on the APU platform. Changed from V1: Split "Disable RAS by default on APU platform" into a separate patch. Changed from V2: Avoid modifying the global variable amdgpu_ras_enable. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:11:36 -04:00
Stanley.Yang	cb906ce32b	drm/amdgpu: Enable aqua vanjaram RAS Enable RAS for aqua vanjaram. Changed from V1: Split the change in amdgpu_ras_asic_supported into a separate patch. Changed from V2: Avoid modifying the global variable amdgpu_ras_enable. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:11:23 -04:00
Srinivasan Shanmugam	a62e702ee1	drm/amdgpu: Avoid possibility of kernel crash in 'gmc_v8_0, gmc_v7_0_init_microcode()' If the function 'gmc_v8_0_ or gmc_v7_0_init_microcode()' fails, the driver will just fail to load, so return -EINVAL rather than calling BUG(). This fixes the following warning: WARNING: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants Fixes: `2f77b5931f` ("drm/amdgpu: Fix error & warnings in gmc_v8_0.c") Fixes: `0cfc1d6830` ("drm/amdgpu: Fix errors & warnings in gmc_ v6_0, v7_0.c") Suggested-by: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:09:30 -04:00
Saleemkhan Jamadar	33e88286d6	Revert "drm/amdgpu:update kernel vcn ring test" VCN FW dependencies: revert it to unblock others. This reverts commit `f3fa86f5c7`. Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:06:54 -04:00
Daniel Vetter	6c7f27441d	Merge tag 'drm-misc-next-2023-07-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.6: UAPI Changes: * fbdev: * Make fbdev userspace interfaces optional; only leaves the framebuffer console active * prime: * Support dma-buf self-import for all drivers automatically: improves support for many userspace compositors Cross-subsystem Changes: * backlight: * Fix interaction with fbdev in several drivers * base: Convert struct platform.remove to return void; part of a larger, tree-wide effort * dma-buf: Acquire reservation lock for mmap() in exporters; part of an on-going effort to simplify locking around dma-bufs * fbdev: * Use Linux device instead of fbdev device in many places * Use deferred-I/O helper macros in various drivers * i2c: Convert struct i2c from .probe_new to .probe; part of a larger, tree-wide effort * video: * Avoid including <linux/screen_info.h> Core Changes: * atomic: * Improve logging * prime: * Remove struct drm_driver.gem_prime_mmap plus driver updates: all drivers now implement this callback with drm_gem_prime_mmap() * gem: * Support execution contexts: provides locking over multiple GEM objects * ttm: * Support init_on_free * Swapout fixes Driver Changes: * accel: * ivpu: MMU updates; Support debugfs * ast: * Improve device-model detection * Cleanups * bridge: * dw-hdmi: Improve support for YUV420 bus format * dw-mipi-dsi: Fix enable/disable of DSI controller * lt9611uxc: Use MODULE_FIRMWARE() * ps8640: Remove broken EDID code * samsung-dsim: Fix command transfer * tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups * Cleanups * ingenic: * Kconfig REGMAP fixes * loongson: * Support display controller * mgag200: * Minor fixes * mxsfb: * Support disabling overlay planes * nouveau: * Improve VRAM detection * Various fixes and cleanups * panel: * panel-edp: Support AUO B116XAB01.4 * Support Visionox R66451 plus DT bindings * Cleanups * ssd130x: * Support per-controller default resolution plus DT bindings * Reduce memory-allocation overhead * Cleanups * tidss: * Support TI AM625 plus DT bindings * Implement new connector model plus driver updates * vkms * Improve write-back support * Documentation fixes Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230713090830.GA23281@linux-uq9g	2023-07-17 15:37:57 +02:00
Dave Airlie	38d88d5e97	Merge tag 'amd-drm-fixes-6.5-2023-07-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.5-2023-07-12: amdgpu: - SMU i2c locking fix - Fix a possible deadlock in process restoration for ROCm apps - Disable PCIe lane/speed switching on Intel platforms (the platforms don't support it) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230712184009.7740-1-alexander.deucher@amd.com	2023-07-14 13:19:54 +10:00
Saleemkhan Jamadar	093b21f431	Revert "drm/amdgpu: update kernel vcn ring test" VCN FW dependencies: revert it to unlock others. This reverts commit `3ebfa943b8`. Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-13 17:32:40 -04:00
Guchun Chen	826c1e923b	drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancel When the screen rotation loop below is run thousands of times with virtual display enabled, a CPU hard lockup may occur, leaving the system unresponsive and crashing it.
do {
xrandr --output Virtual --rotate inverted
xrandr --output Virtual --rotate right
xrandr --output Virtual --rotate left
xrandr --output Virtual --rotate normal
} while (1);
NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
? hrtimer_run_softirq+0x140/0x140
? store_vblank+0xe0/0xe0 [drm]
hrtimer_cancel+0x15/0x30
amdgpu_vkms_disable_vblank+0x15/0x30 [amdgpu]
drm_vblank_disable_and_save+0x185/0x1f0 [drm]
drm_crtc_vblank_off+0x159/0x4c0 [drm]
? record_print_text.cold+0x11/0x11
? wait_for_completion_timeout+0x232/0x280
? drm_crtc_wait_one_vblank+0x40/0x40 [drm]
? bit_wait_io_timeout+0xe0/0xe0
? wait_for_completion_interruptible+0x1d7/0x320
? mutex_unlock+0x81/0xd0
amdgpu_vkms_crtc_atomic_disable
It is caused by a lock dependency between different CPUs in this scenario:
CPU1                              CPU2
drm_crtc_vblank_off               hrtimer_interrupt
grab event_lock (irq disabled)    __hrtimer_run_queues
grab vbl_lock/vblank_time_block   amdgpu_vkms_vblank_simulate
amdgpu_vkms_disable_vblank        drm_handle_vblank
hrtimer_cancel                    grab dev->event_lock
CPU1 is stuck in hrtimer_cancel because the timer callback keeps running on the current clock base: the timer queue on CPU2 can never finish it, since it fails to take the lock. The NMI watchdog reports the errors once its threshold is exceeded, and all later CPUs are impacted or blocked. Use hrtimer_try_to_cancel to fix this, as the disable_vblank callback does not need to wait for the handler to finish. It is also unnecessary to check the return value of hrtimer_try_to_cancel: even if it is -1, which means the timer callback is currently running, the timer will be reprogrammed by hrtimer_start when enable_vblank is called.
v2: only re-arm timer when vblank is enabled (Christian) and add a Fixes tag as well v3: drop warn printing (Christian) v4: drop superfluous check of blank->enabled in timer function, as it's guaranteed in drm_handle_vblank (Christian) Fixes: `84ec374bd5` ("drm/amdgpu: create amdgpu_vkms (v4)") Cc: stable@vger.kernel.org Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-13 17:32:15 -04:00
Srinivasan Shanmugam	2f77b5931f	drm/amdgpu: Fix error & warnings in gmc_v8_0.c Fix below checkpatch error & warnings: ERROR: trailing statements should be on next line + default: BUG(); WARNING: braces {} are not necessary for single statement blocks WARNING: braces {} are not necessary for any arm of this statement WARNING: Block comments should align the * on each line Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-13 17:29:11 -04:00
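
The error-handling fix listed above as 6b4cf4a35f ("drm/amd: Fix an error handling mistake in psp_sw_init()") concerns a function that makes three allocations in sequence, where a failure in a later one must release the earlier ones. A minimal user-space sketch of that unwinding pattern follows. It is not the driver code: `demo_ctx`, `demo_alloc`, `demo_sw_init`, `demo_sw_fini` and the `fail_at` parameter are hypothetical names introduced for illustration, and `malloc`/`free` stand in for amdgpu_bo_create_kernel() and its matching free routine.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for three buffers allocated one after another. */
struct demo_ctx {
	void *fw_buf;
	void *fence_buf;
	void *cmd_buf;
};

/* Hypothetical allocator: fails on demand so error paths can be exercised. */
static int demo_alloc(void **out, size_t size, int fail)
{
	*out = fail ? NULL : malloc(size);
	return *out ? 0 : -ENOMEM;
}

/* Free a buffer and clear the pointer so a double free cannot happen. */
static void demo_free(void **buf)
{
	free(*buf);
	*buf = NULL;
}

/*
 * fail_at: 0 means every allocation succeeds; 1, 2 or 3 forces that
 * allocation to fail. Each failure jumps to a label that releases only
 * what was allocated before it, in reverse order.
 */
static int demo_sw_init(struct demo_ctx *ctx, int fail_at)
{
	int ret;

	ret = demo_alloc(&ctx->fw_buf, 64, fail_at == 1);
	if (ret)
		return ret;

	ret = demo_alloc(&ctx->fence_buf, 64, fail_at == 2);
	if (ret)
		goto failed_fw;

	ret = demo_alloc(&ctx->cmd_buf, 64, fail_at == 3);
	if (ret)
		goto failed_fence;

	return 0;

failed_fence:
	demo_free(&ctx->fence_buf);
failed_fw:
	demo_free(&ctx->fw_buf);
	return ret;
}

/* Release everything after a successful init. */
static void demo_sw_fini(struct demo_ctx *ctx)
{
	demo_free(&ctx->cmd_buf);
	demo_free(&ctx->fence_buf);
	demo_free(&ctx->fw_buf);
}
```

Calling `demo_sw_init(&ctx, 3)` on a zeroed `struct demo_ctx` returns `-ENOMEM` and leaves all three pointers NULL, which is the property the commit message describes: nothing allocated by an earlier step survives a later failure.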


13994 Commits