linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00

Author	SHA1	Message	Date
Evan Quan	915b5ce774	drm/amdgpu: enable more GFX clockgating features for GC 11.0.0 Support more GFX clockgating features(3D_CGCG, 3D_CGLS, MGCG, FGCG and PERF_CLK). Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-05 16:50:58 -04:00
Minghao Chi	38c1c73670	drm/amdgpu: simplify the return expression of vega10_ih_hw_init() Simplify the return expression. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-05 16:50:22 -04:00
Minghao Chi	3f92a7d828	drm/amdgpu: simplify the return expression Simplify the return expression. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-05 16:50:20 -04:00
Mike Lothian	0c1c5e4aae	drm/amdgpu/gfx11: Avoid uninitialised variable 'index' This stops clang complaining: drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:376:6: warning: variable 'index' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized] if (ring->is_mes_queue) { ^~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:433:30: note: uninitialized use occurs here amdgpu_device_wb_free(adev, index); ^~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:376:2: note: remove the 'if' if its condition is always false if (ring->is_mes_queue) { ^~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:364:16: note: initialize the variable 'index' to silence this warning unsigned index; ^ = 0 Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-05 16:50:17 -04:00
Mike Lothian	8fab8e2ecc	drm/amdgpu/gfx10: Avoid uninitialised variable 'index' This stops clang complaining: drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:3846:6: warning: variable 'index' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized] if (ring->is_mes_queue) { ^~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:3903:30: note: uninitialized use occurs here amdgpu_device_wb_free(adev, index); ^~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:3846:2: note: remove the 'if' if its condition is always false if (ring->is_mes_queue) { ^~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:3839:16: note: initialize the variable 'index' to silence this warning unsigned index; ^ = 0 Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-05 16:50:15 -04:00
Mike Lothian	0a8c5ec66a	drm/amdgpu/gfx11: Add missing break This stops clang complaining: drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:5895:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough] default: ^ drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:5895:2: note: insert 'break;' to avoid fall-through default: ^ break; Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-05 16:50:11 -04:00
Alex Deucher	5a90c24ad0	Revert "drm/amdgpu: disable runpm if we are the primary adapter" This reverts commit `b95dc06af3`. This workaround is no longer necessary. We have a better workaround in commit `f95af4a923` ("drm/amdgpu: don't runtime suspend if there are displays attached (v3)"). Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-05 16:50:00 -04:00
Alex Deucher	98bae89647	drm/amdgpu/gfx11: remove some register fields that no longer exist Some copy paste leftovers for older asics. They were protected by __BIG_ENDIAN, so we didn't notice them initially. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-05 16:33:08 -04:00
James Zhu	d6ffefccf7	drm/amdgpu/discovery: add VCN 4.0 Support Enable VCN 4.0 on asics where it is present. Signed-off-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:56 -04:00
James Zhu	9ac0edaa0f	drm/amdgpu: add vcn_4_0_0 video codec query Add vcn_4_0_0 video codec query. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:56 -04:00
James Zhu	04270390fe	drm/amdgpu/vcn: enable vcn4 dpg mode Enable vcn4 dpg mode. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:56 -04:00
James Zhu	7c507d35a5	drm/amdgpu/jpeg: enable JPEG PG and CG for VCN4_0_0 Enable JPEG PG and CG for VCN4_0_0. Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:56 -04:00
Leo Liu	8b719b968f	drm/amdgpu: enable VCN4 PG and CG for VCN4_0_0 Most of the tiles can be power/clock gated. Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:56 -04:00
James Zhu	b13111de32	drm/amdgpu/jpeg: add jpeg support for VCN4_0_0 Add jpeg support for VCN4_0_0. Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:56 -04:00
Leo Liu	8da1170a16	drm/amdgpu: add VCN4 ip block support Add VCN 4.0 initialization and decoder/encoder ring functions. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:56 -04:00
James Zhu	b857e1477d	drm/amdgpu: move out asic specific definition from common header Move out asic specific definition from common header. Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:56 -04:00
Leo Liu	1218a2e39f	drm/amdgpu: make software ring functions reuseable for newer VCN Software ring will be supported only from VCN4 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:56 -04:00
Stanley Yang	8143b87c9d	drm/amdgpu/discovery: add SDMA v6_0 ip block Add SDMA v6 ip block for asics which support it. Signed-off-by: Stanley Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:55 -04:00
Stanley Yang	61a039d175	drm/amdgpu: add initial support for sdma v6.0 Add functions for SDMA version 6. Signed-off-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:55 -04:00
Hawking Zhang	5e779b1745	drm/amdgpu: add sdma v6_0_0 pkt header v3 v1: add sdma v6_0_0 pkt definitions (Hawking) v2: add gcr control field definition (Likun) v3: correct some definitions (Likun) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:55 -04:00
Alex Deucher	e97b07208d	drm/amdgpu/discovery: add MES11 support Enable MES 11 on asics which support it. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:55 -04:00
Likun Gao	f6abd4d9f5	drm/amdgpu/discovery: add GFX 11.0 Support Enable GFX 11.0 on asics where it is present. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:55 -04:00
Jack Xiao	d81d75c999	drm/amdgpu/gfx11: enable kiq to map mes ring Enable KIQ to map MES ring: 1). add MES queue mapping support in MAP_QUEUES packet. 2). use correct MQD settings for MES queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:55 -04:00
Jack Xiao	12ec9a432b	drm/amdgpu/gfx10: enable kiq to map mes ring Enable KIQ to map MES ring: 1). add MES queue mapping support in MAP_QUEUES packet. 2). use correct MQD settings for MES queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Hawking Zhang	65b462fc7e	drm/amdgpu: enable GENERIC0_INT for gfx/compute pipes To generate an interrupt to RLC for accessing indirect registers that CP can not access directly Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Evan Quan	b21348a28b	drm/amdgpu: enable fgcg for soc21 Enable Fine Grained Clock Gating on soc21 asics. Signed-off-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Evan Quan	390db4b84a	drm/amdgpu: enable GFX CGCG/CGLS for GC11.0.0 Enable GFX CGCG (coarse grained clockgating) and CGLS (coarse grained light sleep) for GC11.0.0. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Mukul Joshi	cc009e613d	drm/amdkfd: Add KFD support for soc21 v3 Add initial support for soc21 in KFD compute driver (Mukul) - Add new definition for soc21 device. - Add new file for amdgpu-kfd interface for GFX11 family. - Add new file for queue management, interrupt handling, mqd management for GFX11 family in KFD driver. - Related changes/updates for soc21 device in KFD driver. - Repurpose last 2 entries of SDMA MQD for driver use. v2: Add an optional argument into update queue operation (Mukul) v3: Switch to ip version check, replace kgd_dev with amdgpu_device (Hawking) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Hawking Zhang	3d879e81f0	drm/amdgpu: add init support for GFX11 (v2) Add initial support for GC version 11. GC is the graphics and compute block on the GPU. v1: add initial gfx11 support (Wenhui) v2: switch to new amdgpu_gfx_is_high_priority_compute_queue interface (Hawking) v3: fix num_mec (Alex) Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Jack Xiao	028c3fb37e	drm/amdgpu/mes11: initiate mes v11 support Initiate mes v11 code base from mes v10, rename function and register names. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Likun Gao	289bcffb9d	drm/amdgpu: support imu for gfx11 Add support to initialize imu for gfx v11. IMU is a new power management block for gfx which manages gfx power. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Jack Xiao	18ee4ce63e	drm/amdgpu: add mes unmap legacy queue routine For mes kiq has been taken over by mes sched, drv can't directly use mes kiq to unmap queues. drv has to use mes sched api to unmap legacy queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Likun Gao	14ab292418	drm/amdgpu: support RS64 CP fw front door load Support to load RS64 CP firmware front door load. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Likun Gao	8e070831d3	drm/amdgpu: renovate sdma fw struct Add sdma firmware struct version 2 to support new SDMA v6 and forward firmware version. v2: squash in fix Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Alex Deucher	a8bc892398	drm/amdgpu/discovery: handle AMDGPU_FW_LOAD_RLC_BACKDOOR_AUTO in SMU Handle SMU load ordering when firmware load type is AMDGPU_FW_LOAD_RLC_BACKDOOR_AUTO. This works similarly to AMDGPU_FW_LOAD_DIRECT where the SMU load order is different from the standard ordering when front door loading is enabled. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Likun Gao	aca670e41f	drm/amdgpu: fix the fw size for sdma For SDMA, if use the total size of SDMA TH0 and TH1 to allocate fw BO may result to the ucode data overflow when copy ucode to BO as the PAGE alignment. IMU have the same issue. Fix the above issue by alignment the fw size per fw ID. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Chengming Gui	a76be7bbc3	drm/amd/amdgpu: add more fw load type to fit new ASICs Align exported fw load types with internal used. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:54 -04:00
Jack Xiao	fd0ed91ae8	drm/amdgpu: correct cp doorbell range 1. move MES doorbell inside the mec doorbell range, for mes belongs to mec block 2. setting the correct gfx/mec doorbell range, so that fw can correctly detect gfx/compute work load to enter/exit power saving state. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Tested-and-acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Chengming Gui	ae2d50be7e	drm/amd/amdgpu: adjust the fw load type list Use 0 for legacy backdoor and 1 for frontdoor. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Likun Gao	7edda6749f	drm/amdgpu/gfx: refine fw hdr check fuction The return value of function amdgpu_ucode_hdr_version doesn't make sense, so change it to return true when fw header version is match with passed in parameters. Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Likun Gao	619c94c3b5	drm/amdgpu: extend the show ucode name function Extend amdgpu_ucode_name function to show SDMA TH0, TH1, IMU, RLCP, RLCV and MES related ucode name via ucode id. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Likun Gao	4e9d10ce44	drm/amdgpu: init SDMA v6 microcode with PSP load type Update to use new SDMA UCODE ID when init sdma microcode for sdma6 with psp front door load type. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Likun Gao	be3a3409ef	drm/amdgpu: add convert for new gfx type Add convert for CP RS64 related gfx ip type. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Likun Gao	a32fa02921	drm/amdgpu: support IMU front door load Support for front door to load IMU firmware. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Jack Xiao	d6b4014ad7	drm/amdgpu: add new CP_MES ucode ids Needed for MES KIQ firmware loading. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Likun Gao	6777c8cfca	drm/amdgpu: support for new SDMA front door load Support for SDMA v6_0 ucode front door load. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Likun Gao	8e41a56a79	drm/amdgpu: support RLCV firmware front door load Support RLCV firmware front door load. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Likun Gao	a0fe38b490	drm/amdgpu: support RLCP firmware front door load Support RLCP firmware front door load. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Mukul Joshi	464913c0dd	drm/amdgpu/mes: Update the doorbell function signatures Update the function signatures for process doorbell allocations with MES enabled to make them more generic. KFD would need to access these functions to allocate/free doorbells when MES is enabled. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Jack Xiao	da1c0338f0	drm/amdgpu/mes: disable mes sdma queue test Disable mes sdma queue test on sienna cichlid+, for fw hasn't supported to map sdma queue. The test can be enabled if fw supports. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Jack Xiao	7c18b40e22	drm/amdgpu/mes: fix vm csa update issue Need reserve VM buffers before update VM csa. v2: rebase fixes Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Jack Xiao	2131733594	drm/amdgpu/mes10.1: add mes self test in late init Add MES self test in late init. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:53 -04:00
Jack Xiao	6624d16103	drm/amdgpu/mes: implement mes self test Add mes self test to verify its fundamental functionality by running ring test and ib test of mes kernel queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	cdb7476d96	drm/amdgpu/mes: add ring/ib test for mes self test Run the ring test and ib test for mes self test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	f1d93c9c27	drm/amdgpu/mes: create gang and queues for mes self test Create gang and queues for mes self test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	a22f760a02	drm/amdgpu/mes: map ctx metadata for mes self test Map ctx metadata for mes self test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	712ce87221	drm/amdgpu: kiq takes charge of all queues To make kgq/kcq and mes queue co-exist, kiq needs take charge of all queues. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	a4a5f5cab6	drm/amdgpu: skip gds switch for mes queue For mes manages gds allocation, skip gds switch. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	9d3bccdc72	drm/amdgpu: skip kiq ib tests if mes enabled For kiq conflicts with mes, skip kiq ib tests. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	f89703f561	drm/amdgpu: skip some checking for mes queue ib submission Skip some checking for mes queue ib submission. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Mukul Joshi	c004d44e10	drm/amdgpu: Enable KFD with MES enabled Enable KFD initialization with MES enabled. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	9c12f5cd06	drm/amdgpu: skip kfd routines when mes enabled For kfd hasn't supported mes, skip kfd routines. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	e3652b0976	drm/amdgpu/mes: add helper functions to alloc/free ctx metadata Add the helper functions to allocate/free context metadata. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	9cc654c8ce	drm/amdgpu/mes: implement removing mes ring Remove the mes ring and its resources. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	d0c423b647	drm/amdgpu/mes: use ring for kernel queue submission Use ring as the front end for kernel queue submission. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	11ec5b3605	drm/amdgpu/mes: add helper function to get the ctx meta data offset Add the helper function to get the corresponding ctx meta data offset. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	1a27aacb6e	drm/amdgpu/mes: add helper function to convert ring to queue property Add the helper function to convert ring to queue property. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	bcc4e1e1d4	drm/amdgpu/mes: implement removing mes queue Remove the MES queue from MES scheduling and free its resources. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:52 -04:00
Jack Xiao	be5609de15	drm/amdgpu/mes: implement adding mes queue Allocate related resources for the queue and add it to mes for scheduling. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	5fa963d0fc	drm/amdgpu/mes: initialize mqd from queue properties Add helper function to initialize mqd from queue properties. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	ea756bd5cc	drm/amdgpu/mes: implement resuming all gangs Implement resuming all gangs. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	c8bb10572c	drm/amdgpu/mes: implement suspending all gangs Implement suspending all gangs. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	b0306e5840	drm/amdgpu/mes: implement removing mes gang Free the mes gang and its resources. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	5d0f619f72	drm/amdgpu/mes: implement adding mes gang Gang is a group of the same type queue, which is the scheduling unit of mes hardware scheduler. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	063a38d662	drm/amdgpu/mes: implement destroying mes process Destroy the mes process, which free resources of the process. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	48dcd2b751	drm/amdgpu/mes: implement creating mes process v2 Create a mes process which contains process-related resources, like vm, doorbell bitmap, process ctx bo and etc. v2: move the simple variable to the end Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	29634c3f8b	drm/amdgpu/mes10.1: implement the suspend/resume routine Implement the suspend/resume routine of mes. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	7149599be4	drm/amdgpu/mes10.1: add delay after mes engine enable Add delay after mes engine enable, for it needs more time to complete engine initialising. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	4df8092737	drm/amdgpu/mes10.1: call general mes initialization Call general mes initialization/finalization. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	0bf478f01a	drm/amdgpu/mes: relocate status_fence slot allocation Move the status_fence slot allocation from ip specific function to general mes function. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	b04c1d6468	drm/amdgpu/mes: initialize/finalize common mes structure v2 Initialize/finalize common mes structure. v2: add mutex_init for adev->mes.mutex Cc: Le Ma <le.ma@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	534000c080	drm/amdgpu: add mes queue id mask v2 Add MES queue id mask. v2: move queue id mask to amdgpu_mes_ctx.h Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	32de57e9ef	drm/amdgpu/mes: manage mes doorbell allocation It is used to manage the doorbell allocation of mes processes and queues. Driver calls into process doorbell allocation to get the slice doorbell for the process, then the doorbell for a queue is allocated from the process doorbell slice. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:51 -04:00
Jack Xiao	f10e80e3a4	drm/amdgpu: enable mes kiq N-1 test on sienna cichlid Enable kiq support on gfx10.3, enable mes kiq (n-1) test on sienna cichlid, so that mes kiq can be tested on sienna cichlid. The patch can be dropped once mes kiq is functional. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	b0f340288b	drm/amdgpu: add mes kiq frontdoor loading support Add mes kiq frontdoor loading support. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	cf064b4589	drm/amdgpu/mes: add mes kiq callback Needed to properly initialize mes kiq. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Likun Gao	c1248e1124	drm/amdgpu: add mes kiq PSP GFX FW type Add MES KIQ PSP GFX FW type and the convert type. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	8183d7436a	drm/amdgpu/sdma5: add mes support for sdma ib test Add MES support for sdma ib test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	ea93ac2f4e	drm/amdgpu/sdma5: add mes support for sdma ring test Add MES support for sdma ring test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	76411afd5b	drm/amdgpu/sdma5: add mes queue fence handling From IH ring buffer look up the coresponding kernel queue and process. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	217d29f138	drm/amdgpu/sdma5: associate mes queue id with fence Associate mes queue id with fence, so that EOP trap handler can look up which queue issues the fence. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	810479bad3	drm/amdgpu/sdma5: initialize sdma mqd Initialize sdma mqd according to ring settings. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	c097aac7d9	drm/amdgpu/sdma5.2: add mes support for sdma ib test Add MES support for sdma ib test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	7e5e7971ce	drm/amdgpu/sdma5.2: add mes support for sdma ring test Add MES support for sdma ring test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	254492b66c	drm/amdgpu/sdma5.2: add mes queue fence handling From IH ring buffer, look up the coresponding kernel queue and process. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	6f120134ff	drm/amdgpu/sdma5.2: associate mes queue id with fence Associate mes queue id with fence, so that EOP trap handler can look up which queue issues the fence. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	e0f5b4c9af	drm/amdgpu/sdma5.2: initialize sdma mqd Initialize sdma mqd according to ring settings. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:50 -04:00
Jack Xiao	065891958d	drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue Use per context sdma csa address for mes sdma queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:49 -04:00
Jack Xiao	a3d686a6ad	drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled If MES is enabled, don't use kiq to flush gpu tlb, for it would result in conflicting with mes fw. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:49 -04:00
Jack Xiao	15d839c16a	drm/amdgpu/gfx10: add mes support for gfx ib test Add mes support for gfx ib test. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:49 -04:00
Jack Xiao	954e0a72b4	drm/amdgpu/gfx10: add mes queue fence handling From IH ring buffer, look up the coresponding kernel queue and process. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:49 -04:00
Jack Xiao	207e8bbe66	drm/amdgpu/mes: extend mes framework to support multiple mes pipes Add support for multiple mes pipes, so that reuse the existing code to initialize more mes pipe and queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:49 -04:00
Jack Xiao	b608e785e1	drm/amdgpu: allocate doorbell index for mes kiq Allocate a doorbell index for mes kiq queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:49 -04:00
Jack Xiao	928fe236c0	drm/amdgpu: add mes_kiq module parameter v2 mes_kiq parameter is used to enable mes kiq pipe. This module parameter is unneccessary or enabled by default in final version. v2: reword commit message. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:49 -04:00
Jack Xiao	3a42c7f38b	drm/amdgpu: update mes process/gang/queue definitions Update the definitions of MES process/gang/queue. v2: add missing includes v3: rebase fix, include mm.h Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:43:49 -04:00
Jack Xiao	de33a32968	drm/amdgpu: use the whole doorbell space for mes Use the whole doorbell space for mes. Each queue in one process occupies one doorbell slot to ring the queue submitting. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:04:01 -04:00
Jack Xiao	564434020a	drm/amdgpu/gmc10: skip emitting pasid mapping packet For MES FW manages IH_VMID_x_LUT updating, skip emitting pasid mapping packet. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:58 -04:00
Jack Xiao	115efa440f	drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2 For MES queue VM flush, use INVALIDATE_TLBS to invalidate TLBs. This packet can let CP firmware to determine the current vmid and inv eng to invalidate. v2: unify invalidate_tlbs functions Cc: Le Ma <le.ma@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:54 -04:00
Jack Xiao	1f0f303c85	drm/amdgpu/gfx10: inherit vmid from mqd For MES manages vmid assignment, let vmid inherit from mqd instead of ib packet setting. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:47 -04:00
Jack Xiao	11f39576ac	drm/amdgpu/gfx10: associate mes queue id with fence v2 Associate mes queue id with fence, so that EOP trap handler can look up which queue has issued the fence. v2: move mes queue flag to amdgpu_mes_ctx.h Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:45 -04:00
Jack Xiao	34ec3c2e0e	drm/amdgpu/gfx10: use per ctx CSA for de metadata As MES requires per context preemption, use per context CSA address for DE metadata to correctly enable context MCBP preemption. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:43 -04:00
Jack Xiao	75df9e88c5	drm/amdgpu/gfx10: use per ctx CSA for ce metadata As MES requires per context preemption, use per context CSA address for CE metadata to correctly enable context MCBP preemption. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:40 -04:00
Jack Xiao	c755f68095	drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2 Refine the existing gfx/compute mqd functions, and add them to engine mqd layer. v2: rebase fix. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:35 -04:00
Jack Xiao	ae9fd76fd8	drm/amdgpu: assign the cpu/gpu address of fence from ring assign the cpu/gpu address of fence for the normal or mes ring from ring structure. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:31 -04:00
Jack Xiao	502b6cef8f	drm/amdgpu: initialize/finalize the ring for mes queue Iniailize/finalize the ring for mes queue which submits the command stream to the mes-managed hardware queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:29 -04:00
Jack Xiao	3748424ba9	drm/amdgpu: use ring structure to access rptr/wptr v2 Use ring structure to access the cpu/gpu address of rptr/wptr. v2: merge gfx10/sdma5/sdma5.2 patches Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:27 -04:00
Jack Xiao	d74c5b06e6	drm/amdgpu: define ring structure to access rptr/wptr/fence Define ring structure to access the cpu/gpu address of rptr/wptr/fence instead of dynamic calculation. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:23 -04:00
Jack Xiao	c6abbcbc76	drm/amdgpu: add mes ctx data in amdgpu_ring Add mes context data structure in amdgpu_ring. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:20 -04:00
Jack Xiao	2bc956ef54	drm/amdgpu: add the per-context meta data v3 The per-context meta data is a per-context data structure associated with a mes-managed hardware ring, which includes MCBP CSA, ring buffer and etc. v2: fix typo v3: a. use structure instead of typedef b. move amdgpu_mes_ctx_get_offs_* to amdgpu_ring.h c. use __aligned to make alignement Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:17 -04:00
Jack Xiao	80af9daa62	drm/amdgpu: add helper function to initialize mqd from ring v4 Add the helper function to initialize mqd from ring configuration. v2: use if/else pair instead of ?/: pair v3: use simpler way to judge hqd_active v4: fix parameters to amdgpu_gfx_is_high_priority_compute_queue Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:14 -04:00
Jack Xiao	5405a52627	drm/amdgpu: define MQD abstract layer for hw ip Define MQD abstract layer for hw ip, for the passing mqd configuration not only from ring but more sources, like user queue. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:11 -04:00
Likun Gao	d142f56e4f	drm/amdgpu: add imu fw structure Add IMU firmware structure. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:07 -04:00
Likun Gao	89466f49b2	drm/amdgpu: add rlc TOC header file for soc21 (v2) Add RLC autoload TOC header file for soc21 ASIC. v2: squash in updates Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:03:04 -04:00
Likun Gao	550bb28e64	drm/amdgpu: support rlc v2_3 ucode struct Add support for rlc v2_3 to support RLCV and RLCP fw load. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:02:38 -04:00
Likun Gao	641f053e3e	drm/amdgpu: add gfx firmware header v2_0 We need define new firmware header to support CP RS64 fw. Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:02:36 -04:00
Hawking Zhang	6c982cf878	drm/amdgpu: add gfx11 clearstate header Add gfx11 clearstate register arrays Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:02:20 -04:00
Likun Gao	7d33614285	drm/amdgpu/discovery: Set GC family for GC 11.0 IP Set GC family for GC 11.0 IPs. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:02:14 -04:00
Alice Wong	ab0cd4a9ae	drm/amdgpu/ucode: Remove firmware load type check in amdgpu_ucode_free_bo When psp_hw_init failed, it will set the load_type to AMDGPU_FW_LOAD_DIRECT. During amdgpu_device_ip_fini, amdgpu_ucode_free_bo checks that load_type is AMDGPU_FW_LOAD_DIRECT and skips deallocating fw_buf causing memory leak. Remove load_type check in amdgpu_ucode_free_bo. Signed-off-by: Alice Wong <shiwei.wong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 10:01:53 -04:00
Likun Gao	40c487409a	drm/amdgpu/discovery: Enable SMU for SMU 13.0.0 Enable SMU on SMU IP version 13.0.0 Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:59:01 -04:00
Evan Quan	a6dec86840	drm/amdgpu/soc21: enable ATHUB and MMHUB PG Enable ATHUB and MMHUB powergating. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:58:59 -04:00
Evan Quan	b37c41f2cb	drm/amdgpu: enable pptable ucode loading With SCPM enabled, pptable cannot be uploaded to SMU directly. The transferring has to be via PSP. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:58:00 -04:00
Hawking Zhang	f0b0a1b806	drm/amdgpu: query core refclk from bios for smu v13 The smu_info structrue for smu v13 is changed that core_refclk in v31 strucuture is not correct for smu v13_0_0 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:57:35 -04:00
Likun Gao	0984d38441	drm/amdgpu/discovery: add GMC 11.0 Support Enable GMC 11.0 on asics where it is present. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:57:29 -04:00
Tianci.Yin	1c2014da77	drm/amdgpu: add gmc v11_0 ip block (v3) Add support for GPU memory controller v11. v1: Add support for gmc v11.0 Add gmc 11 block (Tianci) v2: drop unused amdgpu_bo_late_init (Hawking) v3: squash in various fix Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:57:26 -04:00
Jack Xiao	d7dab4fc44	drm/amdgpu: save the setting of VM_CONTEXT_CNTL MES firmware needs the setting of VM_CONTEXT_CNTL to perform vmid switch. Save the initial setting when hub initializing. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:57:22 -04:00
Tianci.Yin	98a0f8687e	drm/amdgpu: add mmhub v3_0 ip block Add support for mmhub v3.0 Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:57:18 -04:00
Tianci.Yin	2279b4e596	drm/amdgpu: add gfxhub v3_0 ip block Add support for gfxhub v3.0 Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:57:13 -04:00
Tianci.Yin	ae460cd566	drm/amdgpu: add athub v3_0 ip block Add support for athub v3.0 Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:57:10 -04:00
Likun Gao	55a800da49	drm/amdgpu/discovery: Enable PSP for PSP 13.0.0 Enable PSP on PSP IP version 13.0.0 Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:57:00 -04:00
Likun Gao	7f318f4e30	drm/amdgpu: add tracking for the enablement of SCPM Add parmeter to shows whether SCPM feature is enabled or not, and whether is valid. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:56:51 -04:00
Likun Gao	a6b6d38ed8	drm/amdgpu: rework psp firmware name Use the new helper for deriving the fw name from the IP version. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:55:30 -04:00
Likun Gao	911a75043f	drm/amdgpu: support psp v13_0_0 microcode init Support psp v13_0_0 microcode init. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:55:27 -04:00
Likun Gao	e995e2ecdf	drm/amdgpu: add support for spl fw load on psp v13 Support for spl firmware load on psp v13. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:55:23 -04:00
Likun Gao	47a2038554	drm/amdgpu: extend PSP GFX FW type Extend PSP GFX FW type to support IMU, LSDMA, SDMA v6, RS64 MES related fw load. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:55:18 -04:00
Hawking Zhang	5fea10d5a9	drm/amdgpu: support print psp v2_0 hdr debug information print out psp firmware v2_0 hdr information for debugging purpose Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:55:13 -04:00
Alice Wong	e2c34219d1	drm/amdgpu/psp: deallocate memory when psp_load_fw failed psp_load_fw failure would cause memory leak for psp tmr and psp ring because psp_hw_init is not called as psp block is not fully initialized. Clean up psp tmr and psp ring when psp_load_fw fail by calling psp_free_shared_bufs and psp_ring_destroy. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alice Wong <shiwei.wong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:55:07 -04:00
Alex Deucher	da40bf8f93	drm/amdgpu/psp: move shared buffer frees into single function So we can properly clean up if any of the TAs or TMR fails to properly initialize or terminate. This avoids any memory leaks in the error case. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:55:05 -04:00
Alex Deucher	fb4f4f4256	drm/amdgpu/psp: fix memory leak in terminate functions Make sure we free the memory even if the unload fails. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:55:02 -04:00
Alex Deucher	f03d97b0bd	drm/amdgpu/psp: drop load/unload/init_shared_buf wrappers Just call the load/unload/init_shared_buf functions directly. Makes the code easier to follow. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:54:58 -04:00
Hawking Zhang	996ea8591b	drm/amdgpu: init smuio v13_0_6 callbacks initialize smuio callback for soc21 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:54:46 -04:00
Alex Deucher	b95b539168	drm/amdgpu/psp: move PSP memory alloc from hw_init to sw_init Memory allocations should be done in sw_init. hw_init should just be hardware programming needed to initialize the IP block. This is how most other IP blocks work. Move the GPU memory allocations from psp hw_init to psp sw_init and move the memory free to sw_fini. This also fixes a potential GPU memory leak if psp hw_init fails. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:54:41 -04:00
Hawking Zhang	e6e405e048	drm/amdgpu: add smuio v13_0_6 support add smuio v13_0_6 callbacks to support read rom image Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:54:38 -04:00
Likun Gao	1761e5efab	drm/amdgpu/discovery: add HDP v6 Enable HDP v6 on asics where it is present. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:54:01 -04:00
Likun Gao	563fcfbf31	drm/amdgpu: add hdp version 6 functions Unify hdp related function into hdp structure for hdp version 6. V2: Remove hdp invalidate function as hdp v6 doesn't have read cache. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:53:58 -04:00
Philip Yang	068421b173	drm/amdgpu: Free user pages if kvmalloc_array fails To cleanup the BOs of bo_list which have got user pages. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:53:40 -04:00
Minghao Chi	364d453f4d	drm/amdgpu: simplify the return expression of navi10_ih_hw_init() Simplify the return expression. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:53:28 -04:00
Minghao Chi	3453677aea	drm/amdgpu: simplify the return expression of iceland_ih_hw_init Simplify the return expression. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:53:18 -04:00
Likun Gao	2929a6bfa1	drm/amdgpu/discovery: add IH v6 Enable IH v6 on asics where it is present. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:53:15 -04:00
Stanley.Yang	6e02c0ed4b	drm/amdgpu: add ih v6_0 ip block v2 This adds ih v6_0 ip block support. IH is the interrupt handler. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:53:12 -04:00
Likun Gao	2c0e7ddd1f	drm/amdgpu/discovery: add NBIO 4.3 Support Enable NBIO 4.3 on asics where it is present. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:52:53 -04:00
Stanley.Yang	0d09a60e3e	drm/amdgpu: add nbio v4_3_0 ip block v2 This adds nbio v4_3_0 ip block support Changed from v1: use WREG32_SOC15/RREG32_SOC15 instead of WREG32_PCIE/RREG32_PCIE remove the programming of PCIE_CONFIG_CNTL Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:52:44 -04:00
Likun Gao	759693aced	drm/amdgpu/discovery: add soc21 common Support Enable soc21 common support on asics where it is present. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-04 09:52:35 -04:00
Christian König	8d62a974ac	drm/amdgpu: fix drm-next merge fallout That hunk somehow got missing while solving the conflict between the TTM and AMDGPU changes for drm-next. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220503063613.46925-1-christian.koenig@amd.com	2022-05-04 04:20:53 +10:00
Dave Airlie	e954d2c94d	Linux 5.18-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmJu9FYeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGAyEH/16xtJSpLmLwrQzG o+4ToQxSQ+/9UHyu0RTEvHg2THm9/8emtIuYyc/5FgdoWctcSa3AaDcveWmuWmkS KYcdhfJsaEqjNHS3OPYXN84fmo9Hel7263shu5+IYmP/sN0DfQp6UWTryX1q4B3Q 4Pdutkuq63Uwd8nBZ5LXQBumaBrmkkuMgWEdT4+6FOo1mPzwdIGBxCuz1UsNNl5k chLWxkQfe2eqgWbYJrgCQfrVdORXVtoU2fGilZUNrHRVGkkldXkkz5clJfapyZD3 odmZCEbrE4GPKgZwCmDERMfD1hzhZDtYKiHfOQ506szH5ykJjPBcOjHed7dA60eB J3+wdek= =39Ca -----END PGP SIGNATURE----- Backmerge tag 'v5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next Linux 5.18-rc5 There was a build fix for arm I wanted in drm-next, so backmerge rather then cherry-pick. Signed-off-by: Dave Airlie <airlied@redhat.com>	2022-05-03 16:08:48 +10:00
Dave Airlie	15e2b419a8	drm-misc-next for 5.19: UAPI Changes: Cross-subsystem Changes: Core Changes: - Introduction of display-helper module, and rework of the DP, DSC, HDCP, HDMI and SCDC headers - doc: Improvements for tiny drivers, link to external resources - formats: helper to convert from RGB888 and RGB565 to XRGB8888 - modes: make width-mm/height-mm check mandatory in of_get_drm_panel_display_mode - ttm: Convert from kvmalloc_array to kvcalloc Driver Changes: - bridge: - analogix_dp: Fix error handling in probe - dw_hdmi: Coccinelle fixes - it6505: Fix Kconfig dependency on DRM_DP_AUX_BUS - panel: - new panel: DataImage FG040346DSSWBG04 - amdgpu: ttm_eu cleanups - mxsfb: Rework CRTC mode setting - nouveau: Make some variables static - sun4i: Drop drm_display_info.is_hdmi caching, support for the Allwinner D1 - vc4: Drop drm_display_info.is_hdmi caching - vmwgfx: Fence improvements -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYmpHrQAKCRDj7w1vZxhR xTC3AQDrZx4SIu9p76z9cRCDwvkOFs1H2Q/YJGnv6zU6QiSWBwD9ENTmaJslcSVw +tYfhRo2Z0GEvhiKVmQPZ561TNUJzQU= =4Bw4 -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2022-04-28' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.19: UAPI Changes: Cross-subsystem Changes: Core Changes: - Introduction of display-helper module, and rework of the DP, DSC, HDCP, HDMI and SCDC headers - doc: Improvements for tiny drivers, link to external resources - formats: helper to convert from RGB888 and RGB565 to XRGB8888 - modes: make width-mm/height-mm check mandatory in of_get_drm_panel_display_mode - ttm: Convert from kvmalloc_array to kvcalloc Driver Changes: - bridge: - analogix_dp: Fix error handling in probe - dw_hdmi: Coccinelle fixes - it6505: Fix Kconfig dependency on DRM_DP_AUX_BUS - panel: - new panel: DataImage FG040346DSSWBG04 - amdgpu: ttm_eu cleanups - mxsfb: Rework CRTC mode setting - nouveau: Make some variables static - sun4i: Drop drm_display_info.is_hdmi caching, support for the Allwinner D1 - vc4: Drop drm_display_info.is_hdmi caching - vmwgfx: Fence improvements Signed-off-by: Dave Airlie <airlied@redhat.com> # gpg: Signature made Thu 28 Apr 2022 17:52:13 AEST # gpg: using EDDSA key 5C1337A45ECA9AEB89060E9EE3EF0D6F671851C5 # gpg: Can't check signature: No public key From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220428075237.yypztjha7hetphcd@houat	2022-04-29 11:33:00 +10:00
Dave Airlie	9d9f720733	Merge tag 'amd-drm-fixes-5.18-2022-04-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.18-2022-04-27: amdgpu: - Runtime pm fix - DCN memory leak fix in error path - SI DPM deadlock fix - S0ix fix amdkfd: - GWS fix - GWS support for CRIU Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220428023232.5794-1-alexander.deucher@amd.com	2022-04-29 10:27:05 +10:00
Philip Yang	3da2c38231	drm/amdgpu: Free user pages if amdgpu_cs_parser_bos failed Otherwise userspace resubmit the BOs again will trigger kernel WARNING and fail the command submission. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Robert Święcki <robert@swiecki.net> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:49:04 -04:00
Candice Li	86e18ac3ae	drm/amdgpu: Fix build warning for TA debugfs interface Remove the redundant codes to fix build warning when CONFIG_DEBUG_FS is disabled. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:48:44 -04:00
Stanley.Yang	71199aa47b	drm/amdgpu: add soc21 common ip block v2 This adds soc21 common ip block support Changed from v1: Switch WREG32/RREG32_PCIE to use indirect reg access helper for sco15 and onwards Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:48:40 -04:00
Stanley.Yang	ba9e7a4a31	drm/amdgpu: add new write field for soc21 add new write field macro to handle soc21 registers with reg prefix Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:48:35 -04:00
Hawking Zhang	fb1d683513	drm/amdgpu: add nbio callback to query rom offset Add nbio callback func used to query rom offset. Used to query the rom offset for fetching the vbios. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:48:32 -04:00
Hawking Zhang	a8d59943b8	drm/amdgpu: update query ref clk from bios Handle atom_gfx_info_v3_0 structure. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:48:21 -04:00
Hawking Zhang	f5fb30b6b3	drm/amdgpu: update gc info from bios table Handle newer gc info tables. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:48:15 -04:00
Hawking Zhang	7089dd3cc0	drm/amdgpu: support query vram_info v3_0 vram_info table provides various vram information including vram_vendor, vram_type, vram_width, etc. v2: correct the calculation of vram_width Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:48:11 -04:00
Hawking Zhang	85d1bcc6e0	drm/amdgpu: switch to atomfirmware_asic_init Some initial settings now are not available from the atom data table. The assumption that !ps[0] \|\| !ps[1] in amdgpu_atom_asic_init is not valid. In addition, driver needs to strictly follow atomfirmware structure (asic_init_parameters) to initialize parameters used to execute asic_init function, otherwise, the execution of asic_init would fail. This shall be applicable to all soc15 adapters,but let make the transition on soc21 first. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:48:00 -04:00
Hawking Zhang	ba75f6eb87	drm/amdgpu: add helper to execute atomfirmware asic_init Add helper function to execute atomfirmware asic_init from the cmd table Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:47:52 -04:00
Alex Deucher	e24d0e91b3	drm/amdgpu/discovery: move all table parsing into amdgpu_discovery.c This data has no dependencies, so encapsulate it all within amdgpu_discovery.c. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:47:43 -04:00
Alex Deucher	622469c87f	drm/amdgpu/discovery: add a function to parse the vcn info table To get the codec disable fuse mask. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:47:39 -04:00
Alex Deucher	f716113aac	drm/amdgpu/discovery: add additional validation Check the table signatures and checksums and verify that the tables exist before accessing them. v2: disable MALL table for now Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:47:33 -04:00
Alex Deucher	24681cb50b	drm/amdgpu/discovery: add a function to get the mall_size Add a function to fetch the mall size from the IP discovery table. Properly handle harvest configurations where more or less cache may be available. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:47:30 -04:00
Alex Deucher	478d338bb0	drm/amdgpu/discovery: handle UMC harvesting in IP discovery Check the harvesting table to determing if any UMC blocks have been harvested. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:47:27 -04:00
Alex Deucher	a2efebf1a4	drm/amdgpu/discovery: store the number of UMC IPs on the asic For chips with IP discovery get this from the table, hardcode it for older asics. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:47:22 -04:00
Alex Deucher	053d35dedd	drm/amdgpu: store the mall size in the gmc structure This will be useful in future patches. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:47:16 -04:00
Alex Deucher	8eece29c4e	drm/amdgpu/discovery: fix byteswapping in gc info parsing The table is in little endian format. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:47:09 -04:00
Guchun Chen	d1acd68b2b	drm/amdgpu: disable runtime pm on several sienna cichlid cards(v2) Disable runtime power management on several sienna cichlid cards, otherwise SMU will possibly fail to be resumed from runtime suspend. Will drop this after a clean solution between kernel driver and SMU FW is available. amdgpu 0000:63:00.0: amdgpu: GECC is enabled amdgpu 0000:63:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available amdgpu 0000:63:00.0: amdgpu: SMU is resuming... amdgpu 0000:63:00.0: amdgpu: SMU: I'm not done with your command: SMN_C2PMSG_66:0x0000000E SMN_C2PMSG_82:0x00000080 amdgpu 0000:63:00.0: amdgpu: Failed to SetDriverDramAddr! amdgpu 0000:63:00.0: amdgpu: Failed to setup smc hw! [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] ERROR resume of IP block <smu> failed -62 amdgpu 0000:63:00.0: amdgpu: amdgpu_device_ip_resume failed (-62) v2: seperate to a function. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:46:42 -04:00
Alex Deucher	5cb1cfd5f1	drm/amdgpu/discovery: populate additional GC info From the GC info table to the gfx config structure in the driver. The driver will use this data to configure the card correctly. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:46:38 -04:00
Likun Gao	1d5eee7dd6	drm/amdgpu: add function to decode ip version Add function to decode IP version. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:46:28 -04:00
Likun Gao	3202c7e782	drm/amdgpu: increase HWIP MAX INSTANCE Extend HWIP MAX INSTANCE to 11. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:46:24 -04:00
Marek Marczykowski-Górecki	78b12008f2	drm/amdgpu: do not use passthrough mode in Xen dom0 While technically Xen dom0 is a virtual machine too, it does have access to most of the hardware so it doesn't need to be considered a "passthrough". Commit `b818a5d374` ("drm/amdgpu/gmc: use PCI BARs for APUs in passthrough") changed how FB is accessed based on passthrough mode. This breaks amdgpu in Xen dom0 with message like this: [drm:dc_dmub_srv_wait_idle [amdgpu]] ERROR Error waiting for DMUB idle: status=3 While the reason for this failure is unclear, the passthrough mode is not really necessary in Xen dom0 anyway. So, to unbreak booting affected kernels, disable passthrough mode in this case. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1985 Fixes: `b818a5d374` ("drm/amdgpu/gmc: use PCI BARs for APUs in passthrough") Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-28 17:46:01 -04:00
Dave Airlie	4eaf02db9c	Merge tag 'amd-drm-next-5.19-2022-04-22' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.19-2022-04-22: amdgpu: - SMU message documentation update - Misc code cleanups - Documenation updates - PSP TA updates - Runtime PM regression fix - SR-IOV header cleanup - Misc fixes amdkfd: - TLB flush fixes - GWS fixes - CRIU GWS support radeon: - Misc code cleanups Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220422150049.5859-1-alexander.deucher@amd.com	2022-04-28 14:56:04 +10:00
Dave Airlie	dbe946287e	Merge tag 'amd-drm-next-5.19-2022-04-15' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.19-2022-04-15: amdgpu: - USB-C updates - GPUVM updates - TMZ fixes for RV - DCN 3.1 pstate fixes - Display z state fixes - RAS fixes - Misc code cleanups and spelling fixes - More DC FP rework - GPUVM TLB handling rework - Power management sysfs code cleanup - Add RAS support for VCN - Backlight fix - Add unique id support for more asics - Misc display updates - SR-IOV fixes - Extend CG and PG flags to 64 bits - Enable VCN clk sysfs nodes for navi12 amdkfd: - Fix IO link cleanup during device removal - RAS fixes - Retry fault fixes - Asynchronously free events - SVM fixes radeon: - Drop some dead code - Misc code cleanups From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220415135144.5700-1-alexander.deucher@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com>	2022-04-28 14:33:20 +10:00
Prike Liang	fb8cc3318e	drm/amdgpu: keep mmhub clock gating being enabled during s2idle suspend Without MMHUB clock gating being enabled then MMHUB will not disconnect from DF and will result in DF C-state entry can't be accessed during S2idle suspend, and eventually s0ix entry will be blocked. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-27 17:38:02 -04:00
Alex Deucher	f95af4a923	drm/amdgpu: don't runtime suspend if there are displays attached (v3) We normally runtime suspend when there are displays attached if they are in the DPMS off state, however, if something wakes the GPU we send a hotplug event on resume (in case any displays were connected while the GPU was in suspend) which can cause userspace to light up the displays again soon after they were turned off. Prior to commit `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."), the driver took a runtime pm reference when the fbdev emulation was enabled because we didn't implement proper shadowing support for vram access when the device was off so the device never runtime suspended when there was a console bound. Once that commit landed, we now utilize the core fb helper implementation which properly handles the emulation, so runtime pm now suspends in cases where it did not before. Ultimately, we need to sort out why runtime suspend in not working in this case for some users, but this should restore similar behavior to before. v2: move check into runtime_suspend v3: wake ups -> wakeups in comment, retain pm_runtime behavior in runtime_idle callback Fixes: `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Link: https://lore.kernel.org/r/20220403132322.51c90903@darkstar.example.org/ Tested-by: Michele Ballabio <ballabio.m@gmail.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-04-27 17:18:53 -04:00
Dan Carpenter	2f33a397e9	drm/amdgpu: debugfs: fix NULL dereference in ta_if_invoke_debugfs_write() If the kzalloc() fails then this code will crash. Return -ENOMEM instead. Fixes: `e50d9ba0d2` ("drm/amdgpu: Add debugfs TA load/unload/invoke support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:52:57 -04:00
Dan Carpenter	a52ad5b6ce	drm/amdgpu: debugfs: fix error codes in write functions There are two error code bugs here. The copy_to/from_user() functions return the number of bytes remaining (a positive number). We should return -EFAULT if the copy fails. Second if we fail because "context.resp_status" is non-zero then return -EINVAL instead of zero. Fixes: `e50d9ba0d2` ("drm/amdgpu: Add debugfs TA load/unload/invoke support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:52:20 -04:00
Prike Liang	3bbeaa307b	drm/amdgpu: keep mmhub clock gating being enabled during s2idle suspend Without MMHUB clock gating being enabled then MMHUB will not disconnect from DF and will result in DF C-state entry can't be accessed during S2idle suspend, and eventually s0ix entry will be blocked. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:51:25 -04:00
Haohui Mai	428f273cbb	drm/amdgpu: Fix out-of-bound access for gfx_v10_0_ring_test_ib() The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16 bytes during the test. The patch sets the size of the allocation to be 4-byte larger to match the actual usage. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:44:09 -04:00
Haohui Mai	ca5d251b3b	drm/amdgpu/sdma: Remove redundant lower_32_bits() calls when settings SDMA doorbell Updated the patch for the pre-vega hardware. I kept the clamping code to be safe. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:44:06 -04:00
Haohui Mai	7dba6e838e	drm/amdgpu/sdma: Fix incorrect calculations of the wptr of the doorbells This patch fixes the issue where the driver miscomputes the 64-bit values of the wptr of the SDMA doorbell when initializing the hardware. SDMA engines v4 and later on have full 64-bit registers for wptr thus they should be set properly. Older generation hardwares like CIK / SI have only 16 / 20 / 24bits for the WPTR, where the calls of lower_32_bits() will be removed in a following patch. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:44:01 -04:00
Haowen Bai	b3ef3205bc	drm/amdgpu: Remove useless kfree After alloc fail, we do not need to kfree. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:05:38 -04:00
Thomas Zimmermann	da68386d9e	drm: Rename dp/ to display/ Rename dp/ to display/ to account for additional display-related helpers, such as HDMI. Update all related include statements. No functional changes. Various drivers, such as i915 and amdgpu, use similar naming scheme by putting code for video-output standards into a local display/ directory. The new directory's name is aligned with this convention. v2: * update commit message (Javier) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220421073108.19226-3-tzimmermann@suse.de	2022-04-25 11:17:45 +02:00
Dave Airlie	c18a2a280c	Two fixes for the raspberrypi panel initialisation, one fix for a logic inversion in radeon, a build and pm refcounting fix for vc4, two reverts for drm_of_get_bridge that caused a number of regression and a locking regression for amdgpu. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYmJqDwAKCRDj7w1vZxhR xQlAAP9N78SStxmzZ3UjjU2h4fj7JXs3y97DddJZpzyu92+d5QD7BFP8i3mKLGhq hmmYabGl58dWK+bXZRD85kOsIxv80A0= =RgD5 -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2022-04-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Two fixes for the raspberrypi panel initialisation, one fix for a logic inversion in radeon, a build and pm refcounting fix for vc4, two reverts for drm_of_get_bridge that caused a number of regression and a locking regression for amdgpu. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220422084403.2xrhf3jusdej5yo4@houat	2022-04-23 15:00:44 +10:00
David Yu	a2443ef0a8	drm/amdgpu: Ta fw needs to be loaded for SRIOV aldebaran Load ta fw during psp_init_sriov_microcode to enable XGMI. It is required to be loaded by both guest and host starting from Arcturus. Cap fw needs to be loaded first. Signed-off-by: David Yu <David.Yu@amd.com> Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:51:51 -04:00
Tao Zhou	b3c76814ce	drm/amdgpu: add RAS fatal error interrupt handler The fatal error handler is independent from general ras interrupt handler since there is no related IH ring. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:50:18 -04:00
Tao Zhou	66f8794961	drm/amdgpu: add RAS poison consumption handler (v2) Add support for general RAS poison consumption handler. v2: remove callback function for poison consumption. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:50:13 -04:00
Tao Zhou	50a7d025ca	drm/amdgpu: add RAS poison creation handler (v2) Prepare for the implementation of poison consumption handler. v2: separate umc handler from poison creation. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:50:07 -04:00
Bokun Zhang	e15c9d06e9	drm/amd/amdgpu: Update PF2VF header - In the latest version of the header, there is a variable name change. This should not cause any backward compatibility since the variable is at the same offset in the struct. Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 16:00:14 -04:00
Bokun Zhang	451913e980	drm/amd/amdgpu: Properly indent PF2VF header - Clean up the identation in the header file Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 16:00:09 -04:00
Bokun Zhang	c649287aba	drm/amd/amdgpu: Update MIT license in SRIOV msg header - Update MIT license header Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 16:00:01 -04:00
Alex Deucher	4020c22802	drm/amdgpu: don't runtime suspend if there are displays attached (v3) We normally runtime suspend when there are displays attached if they are in the DPMS off state, however, if something wakes the GPU we send a hotplug event on resume (in case any displays were connected while the GPU was in suspend) which can cause userspace to light up the displays again soon after they were turned off. Prior to commit `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."), the driver took a runtime pm reference when the fbdev emulation was enabled because we didn't implement proper shadowing support for vram access when the device was off so the device never runtime suspended when there was a console bound. Once that commit landed, we now utilize the core fb helper implementation which properly handles the emulation, so runtime pm now suspends in cases where it did not before. Ultimately, we need to sort out why runtime suspend in not working in this case for some users, but this should restore similar behavior to before. v2: move check into runtime_suspend v3: wake ups -> wakeups in comment, retain pm_runtime behavior in runtime_idle callback Fixes: `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Link: https://lore.kernel.org/r/20220403132322.51c90903@darkstar.example.org/ Tested-by: Michele Ballabio <ballabio.m@gmail.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:59:37 -04:00
Candice Li	e50d9ba0d2	drm/amdgpu: Add debugfs TA load/unload/invoke support v1: Add debugfs support to load/unload/invoke TA in runtime. v2: 1. Update some variables to static. 2. Use PAGE_ALIGN to calculate shared buf size directly. 3. Remove fp check. 4. Update debugfs from read to write. Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:58:22 -04:00
Candice Li	fe96e5636a	drm/amdgpu: Use indirect buffer and save response status for TA load/invoke The upcoming TA debugfs interface needs to use indirect buffer when performing TA invoke and check psp response status for TA load and invoke. Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:58:10 -04:00
Christian König	94f4c4965e	drm/amdgpu: partial revert "remove ctx->lock" v2 This reverts commit `461fa7b0ac`. We are missing some inter dependencies here so re-introduce the lock until we have figured out what's missing. Just drop/retake it while adding dependencies. v2: still drop the lock while adding dependencies Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> (v1) Fixes: `461fa7b0ac` ("drm/amdgpu: remove ctx->lock") Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220419110633.166236-1-christian.koenig@amd.com	2022-04-21 11:26:20 +02:00
Christian König	32c2d7a536	drm/amdgpu: remove pointless ttm_eu usage from vkms We just need to reserve one BO here, no need for using ttm_eu to reserve multiple BOs. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220419141915.122157-1-christian.koenig@amd.com	2022-04-21 11:10:37 +02:00
Zack Rusin	7212d24cec	drm/amdgpu: Use TTM builtin resource manager debugfs code Switch to using the TTM resource manager debugfs helpers. It's exactly the same functionality but the debugfs code is shared with other drivers. The TTM resource managers need to stay valid for as long as the drm debugfs_root is valid. Signed-off-by: Zack Rusin <zackr@vmware.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Nirmoy Das <nirmoy.das@amd.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20220412033526.369115-4-zack@kde.org Reviewed-by: Christian König <christian.koenig@amd.com>	2022-04-20 21:06:02 -04:00
Paul Cercueil	40f458b781	Merge drm/drm-next into drm-misc-next drm/drm-next has a build fix for the NewVision NV3052C panel (drivers/gpu/drm/panel/panel-newvision-nv3052c.c), which needs to be merged back to drm-misc-next, as it was failing to build there. Signed-off-by: Paul Cercueil <paul@crapouillou.net>	2022-04-18 20:46:55 +01:00
Gavin Wan	d68cf992de	drm/amd/amdgpu: Remove static from variable in RLCG Reg RW [why] These static variables save the RLC Scratch registers address. When we install multiple GPUs (for example: XGMI setting) and multiple GPUs call the function at same time. The RLC Scratch registers address are changed each other. Then it caused reading/writing from/to wrong GPU. [how] Removed the static from the variables. The variables are on the stack. Fixes: `5d447e2967` ("drm/amdgpu: add helper for rlcg indirect reg access") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-14 15:29:20 -04:00
xinhui pan	7c703a7d3f	drm/amdgpu: Fix one use-after-free of VM VM might already be freed when amdgpu_vm_tlb_seq_cb() is called. We see the calltrace below. Fix it by keeping the last flush fence around and wait for it to signal BUG kmalloc-4k (Not tainted): Poison overwritten 0xffff9c88630414e8-0xffff9c88630414e8 @offset=5352. First byte 0x6c instead of 0x6b Allocated in amdgpu_driver_open_kms+0x9d/0x360 [amdgpu] age=44 cpu=0 pid=2343 __slab_alloc.isra.0+0x4f/0x90 kmem_cache_alloc_trace+0x6b8/0x7a0 amdgpu_driver_open_kms+0x9d/0x360 [amdgpu] drm_file_alloc+0x222/0x3e0 [drm] drm_open+0x11d/0x410 [drm] Freed in amdgpu_driver_postclose_kms+0x3e9/0x550 [amdgpu] age=22 cpu=1 pid=2485 kfree+0x4a2/0x580 amdgpu_driver_postclose_kms+0x3e9/0x550 [amdgpu] drm_file_free+0x24e/0x3c0 [drm] drm_close_helper.isra.0+0x90/0xb0 [drm] drm_release+0x97/0x1a0 [drm] __fput+0xb6/0x280 ____fput+0xe/0x10 task_work_run+0x64/0xb0 Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-14 15:27:02 -04:00
Tomasz Moń	4593c1b6d1	drm/amdgpu: Enable gfxoff quirk on MacBook Pro Enabling gfxoff quirk results in perfectly usable graphical user interface on MacBook Pro (15-inch, 2019) with Radeon Pro Vega 20 4 GB. Without the quirk, X server is completely unusable as every few seconds there is gpu reset due to ring gfx timeout. Signed-off-by: Tomasz Moń <desowin@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-04-13 22:22:27 -04:00
Kai-Heng Feng	887f75cfd0	drm/amdgpu: Ensure HDA function is suspended before ASIC reset DP/HDMI audio on AMD PRO VII stops working after S3: [ 149.450391] amdgpu 0000:63:00.0: amdgpu: MODE1 reset [ 149.450395] amdgpu 0000:63:00.0: amdgpu: GPU mode1 reset [ 149.450494] amdgpu 0000:63:00.0: amdgpu: GPU psp mode1 reset [ 149.983693] snd_hda_intel 0000:63:00.1: refused to change power state from D0 to D3hot [ 150.003439] amdgpu 0000:63:00.0: refused to change power state from D0 to D3hot ... [ 155.432975] snd_hda_intel 0000:63:00.1: CORB reset timeout#2, CORBRP = 65535 The offending commit is `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)"). Commit `34452ac303` ("drm/amdgpu: don't use BACO for reset in S3 ") doesn't help, so the issue is something different. Assuming that to make HDA resume to D0 fully realized, it needs to be successfully put to D3 first. And this guesswork proves working, by moving amdgpu_asic_reset() to noirq callback, so it's called after HDA function is in D3. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-04-13 22:21:58 -04:00
Alex Deucher	e3cf2e0544	drm/amdgpu: fix VCN 3.1.2 firmware name Drop the trailing vcn. Fixes: `afc2f27605` ("drm/amdgpu/vcn: add vcn support for vcn 3.1.2") Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-13 22:21:33 -04:00
Yongqiang Sun	eb85fc2389	drm/amd/amdgpu: Not request init data for MS_HYPERV with vega10 MS_HYPERV with vega10 doesn't have the interface to process request init data msg. Check hypervisor type to not send the request for MS_HYPERV. Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Reviewed-by: Alice Wong <shiwei.wong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-13 09:14:22 -04:00
Dave Airlie	b85ffe47c4	drm-misc-next for 5.19: UAPI Changes: Cross-subsystem Changes: Core Changes: - atomic: Add atomic_print_state to private objects - edid: Constify the EDID parsing API, rework of the API - dma-buf: Add dma_resv_replace_fences, dma_resv_get_singleton, make dma_resv_excl_fence private - format: Support monochrome formats - fbdev: fixes for cfb_imageblit and sys_imageblit, pagelist corruption fix - selftests: several small fixes - ttm: Rework bulk move handling Driver Changes: - Switch all relevant drivers to drm_mode_copy or drm_mode_duplicate - bridge: conversions to devm_drm_of_get_bridge and panel_bridge, autosuspend for analogix_dp, audio support for it66121, DSI to DPI support for tc358767, PLL fixes and I2C support for icn6211 - bridge_connector: Enable HPD if supported - etnaviv: fencing improvements - gma500: GEM and GTT improvements, connector handling fixes - komeda: switch to plane reset helper - mediatek: MIPI DSI improvements - omapdrm: GEM improvements - panel: DT bindings fixes for st7735r, few fixes for ssd130x, new panels: ltk035c5444t, B133UAN01, NV3052C - qxl: Allow to run on arm64 - sysfb: Kconfig rework, support for VESA graphic mode selection - vc4: Add a tracepoint for CL submissions, HDMI YUV output, HDMI and clock improvements - virtio: Remove restriction of non-zero blob_flags, - vmwgfx: support for CursorMob and CursorBypass 4, various improvements and small fixes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYk6nlQAKCRDj7w1vZxhR xaTTAP0ZeeXRWIYxFfmuEAUd3H4ztvr3cx/QU/85qMXQUM4gSgD/cvQHMeucrFlX 2Bafjzl/p1tQrth0HNOkSz85dABUUws= =rJSD -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2022-04-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.19: UAPI Changes: Cross-subsystem Changes: Core Changes: - atomic: Add atomic_print_state to private objects - edid: Constify the EDID parsing API, rework of the API - dma-buf: Add dma_resv_replace_fences, dma_resv_get_singleton, make dma_resv_excl_fence private - format: Support monochrome formats - fbdev: fixes for cfb_imageblit and sys_imageblit, pagelist corruption fix - selftests: several small fixes - ttm: Rework bulk move handling Driver Changes: - Switch all relevant drivers to drm_mode_copy or drm_mode_duplicate - bridge: conversions to devm_drm_of_get_bridge and panel_bridge, autosuspend for analogix_dp, audio support for it66121, DSI to DPI support for tc358767, PLL fixes and I2C support for icn6211 - bridge_connector: Enable HPD if supported - etnaviv: fencing improvements - gma500: GEM and GTT improvements, connector handling fixes - komeda: switch to plane reset helper - mediatek: MIPI DSI improvements - omapdrm: GEM improvements - panel: DT bindings fixes for st7735r, few fixes for ssd130x, new panels: ltk035c5444t, B133UAN01, NV3052C - qxl: Allow to run on arm64 - sysfb: Kconfig rework, support for VESA graphic mode selection - vc4: Add a tracepoint for CL submissions, HDMI YUV output, HDMI and clock improvements - virtio: Remove restriction of non-zero blob_flags, - vmwgfx: support for CursorMob and CursorBypass 4, various improvements and small fixes [airlied: fixup conflict with newvision panel callbacks] Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085940.pnflvjojs4qw4b77@houat	2022-04-12 17:44:27 +10:00
Stanley.Yang	05eee31c08	drm/amdgpu: add umc query error status function In order to debug ras error, driver will print IPID/SYND/MISC0 register value if detect correctable or uncorrectable error. Provide umc_query_error_status_helper function to reduce code redundancy. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-11 13:50:35 -04:00
Grigory Vasilyev	6f90a49bc0	drm/amdgpu: Fix incorrect enum type Instead of the 'amdgpu_ring_priority_level' type, the 'amdgpu_gfx_pipe_priority' type was used, which is an error when setting ring priority. This is a minor error, but may cause problems in the future. Instead of AMDGPU_RING_PRIO_2 = 2, we can use AMDGPU_RING_PRIO_MAX = 3, but AMDGPU_RING_PRIO_2 = 2 is used for compatibility with AMDGPU_GFX_PIPE_PRIO_HIGH = 2, and not change the behavior of the code. Signed-off-by: Grigory Vasilyev <h0tc0d3@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-11 13:50:35 -04:00
Tom St Denis	dc2947b35f	drm/amd/amdgpu: Update debugfs GCA data The data revision was not changed to 5 from 4 when the CG flags were extended to 64-bits. Since this was missed I took the opportunity to add future upper 64-bits of PG flags as well so we don't need to bump it again when that comes. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-11 13:50:35 -04:00
Yongqiang Sun	d9e50239a9	drm/amd/amdgpu: Fix asm/hypervisor.h build error. Add CONFIG_X86 check to fix the build error. Fixes: `49aa98ca30` ("drm/amd/amdgpu: Only reserve vram for firmware with vega9 MS_HYPERV host.") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-11 13:50:35 -04:00
Lijo Lazar	73bce7a423	drm/amdgpu: Use flexible array member Use flexible array member in ip discovery struct as recommended[1]. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays v2: squash in struct_size fixes Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-11 13:50:35 -04:00
Evan Quan	25faeddcf3	drm/amdgpu: expand cg_flags from u32 to u64 With this, we can support more CG flags. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-08 17:24:24 -04:00
Arunpravin Paneer Selvam	c9cad937c0	drm/amdgpu: add drm buddy support to amdgpu - Switch to drm buddy allocator - Add resource cursor support for drm buddy v2(Matthew Auld): - replace spinlock with mutex as we call kmem_cache_zalloc (..., GFP_KERNEL) in drm_buddy_alloc() function - lock drm_buddy_block_trim() function as it calls mark_free/mark_split are all globally visible v3(Matthew Auld): - remove trim method error handling as we address the failure case at drm_buddy_block_trim() function v4: - fix warnings reported by kernel test robot <lkp@intel.com> v5: - fix merge conflict issue v6: - fix warnings reported by kernel test robot <lkp@intel.com> v7: - remove DRM_BUDDY_RANGE_ALLOCATION flag usage v8: - keep DRM_BUDDY_RANGE_ALLOCATION flag usage - resolve conflicts created by drm/amdgpu: remove VRAM accounting v2 v9(Christian): - merged the below patch - drm/amdgpu: move vram inline functions into a header - rename label name as fallback - move struct amdgpu_vram_mgr to amdgpu_vram_mgr.h - remove unnecessary flags from struct amdgpu_vram_reservation - rewrite block NULL check condition - change else style as per coding standard - rewrite the node max size - add a helper function to fetch the first entry from the list v10(Christian): - rename amdgpu_get_node() function name as amdgpu_vram_mgr_first_block v11: - if size is not aligned with min_page_size, enable is_contiguous flag, therefore, the size round up to the power of two and trimmed to the original size. v12: - rename the function names having prefix as amdgpu_vram_mgr_*() - modify the round_up() logic conforming to contiguous flag enablement or if size is not aligned to min_block_size - modify the trim logic - rename node as block wherever applicable Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220407224843.2416-1-Arunpravin.PaneerSelvam@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>	2022-04-08 12:58:15 +02:00
Yongqiang Sun	49aa98ca30	drm/amd/amdgpu: Only reserve vram for firmware with vega9 MS_HYPERV host. driver loading failed on VEGA10 SRIOV VF with linux host due to a wide range of stolen reserved vram. Since VEGA10 SRIOV VF need to reserve vram for firmware with windows Hyper_V host specifically, check hypervisor type to only reserve memory for it, and the range of the reserved vram can be limited to between 5M-7M area. Fixes: `faad5ccac1` ("drm/amdgpu: Add stolen reserved memory for MI25 SRIOV.") Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-07 16:38:53 -04:00
Felix Kuehling	3cd3e731f3	drm/amdkfd: Fix NULL pointer dereference Check that adev->gfx.ras is valid before using it. Fixes: `6475ae2b74` ("drm/amdgpu: add UTCL2 RAS poison query for Aldebaran (v2)") CC: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-07 16:37:39 -04:00
Tomasz Moń	9b6a1ec792	drm/amdgpu: Enable gfxoff quirk on MacBook Pro Enabling gfxoff quirk results in perfectly usable graphical user interface on MacBook Pro (15-inch, 2019) with Radeon Pro Vega 20 4 GB. Without the quirk, X server is completely unusable as every few seconds there is gpu reset due to ring gfx timeout. Signed-off-by: Tomasz Moń <desowin@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-07 16:37:24 -04:00
Kai-Heng Feng	9e051720f9	drm/amdgpu: Ensure HDA function is suspended before ASIC reset DP/HDMI audio on AMD PRO VII stops working after S3: [ 149.450391] amdgpu 0000:63:00.0: amdgpu: MODE1 reset [ 149.450395] amdgpu 0000:63:00.0: amdgpu: GPU mode1 reset [ 149.450494] amdgpu 0000:63:00.0: amdgpu: GPU psp mode1 reset [ 149.983693] snd_hda_intel 0000:63:00.1: refused to change power state from D0 to D3hot [ 150.003439] amdgpu 0000:63:00.0: refused to change power state from D0 to D3hot ... [ 155.432975] snd_hda_intel 0000:63:00.1: CORB reset timeout#2, CORBRP = 65535 The offending commit is `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)"). Commit `34452ac303` ("drm/amdgpu: don't use BACO for reset in S3 ") doesn't help, so the issue is something different. Assuming that to make HDA resume to D0 fully realized, it needs to be successfully put to D3 first. And this guesswork proves working, by moving amdgpu_asic_reset() to noirq callback, so it's called after HDA function is in D3. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-07 16:35:48 -04:00
Alex Deucher	dd48182897	drm/amdgpu: fix VCN 3.1.2 firmware name Drop the trailing vcn. Fixes: `afc2f27605` ("drm/amdgpu/vcn: add vcn support for vcn 3.1.2") Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-07 16:34:42 -04:00
Christian König	8bb3158782	drm/ttm: remove bo->moving This is now handled by the DMA-buf framework in the dma_resv obj. Also remove the workaround inside VMWGFX to update the moving fence. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-14-christian.koenig@amd.com	2022-04-07 12:53:54 +02:00
Christian König	46b35b33cc	dma-buf: wait for map to complete for static attachments We have previously done that in the individual drivers but it is more defensive to move that into the common code. Dynamic attachments should wait for map operations to complete by themselves. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-12-christian.koenig@amd.com	2022-04-07 12:53:54 +02:00
Christian König	0cc848a75b	dma-buf: add DMA_RESV_USAGE_BOOKKEEP v3 Add an usage for submissions independent of implicit sync but still interesting for memory management. v2: cleanup the kerneldoc a bit v3: separate amdgpu changes from this Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-10-christian.koenig@amd.com	2022-04-07 12:53:54 +02:00
Christian König	c35fcfa344	drm/amdgpu: use DMA_RESV_USAGE_KERNEL Wait only for kernel fences before kmap or UVD direct submission. This also makes sure that we always wait in amdgpu_bo_kmap() even when returning a cached pointer. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-6-christian.koenig@amd.com	2022-04-07 12:53:54 +02:00
Christian König	047a1b877e	dma-buf & drm/amdgpu: remove dma_resv workaround Rework the internals of the dma_resv object to allow adding more than one write fence and remember for each fence what purpose it had. This allows removing the workaround from amdgpu which used a container for this instead. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-4-christian.koenig@amd.com	2022-04-07 12:53:53 +02:00
Christian König	73511edf8b	dma-buf: specify usage while adding fences to dma_resv obj v7 Instead of distingting between shared and exclusive fences specify the fence usage while adding fences. Rework all drivers to use this interface instead and deprecate the old one. v2: some kerneldoc comments suggested by Daniel v3: fix a missing case in radeon v4: rebase on nouveau changes, fix lockdep and temporary disable warning v5: more documentation updates v6: separate internal dma_resv changes from this patch, avoids to disable warning temporary, rebase on upstream changes v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com	2022-04-07 12:53:53 +02:00
Christian König	7bc80a5462	dma-buf: add enum dma_resv_usage v4 This change adds the dma_resv_usage enum and allows us to specify why a dma_resv object is queried for its containing fences. Additional to that a dma_resv_usage_rw() helper function is added to aid retrieving the fences for a read or write userspace submission. This is then deployed to the different query functions of the dma_resv object and all of their users. When the write paratermer was previously true we now use DMA_RESV_USAGE_WRITE and DMA_RESV_USAGE_READ otherwise. v2: add KERNEL/OTHER in separate patch v3: some kerneldoc suggestions by Daniel v4: some more kerneldoc suggestions by Daniel, fix missing cases lost in the rebase pointed out by Bas. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-2-christian.koenig@amd.com	2022-04-07 12:53:53 +02:00
Ma Jun	ef1a0808a2	drm/amdgpu: Sync up header and implementation to use the same parameter names Sync up header and implementation to use the same parameter names in function amdgpu_ring_init. ring_size -> max_dw, prio -> hw_prio Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-06 12:02:57 -04:00
Ruili Ji	96f2b7a357	drm/amdgpu: fix incorrect GCR_GENERAL_CNTL address gfx10.3.3/gfx10.3.6/gfx10.3.7 shall use 0x1580 address for GCR_GENERAL_CNTL Acked-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-04-06 12:02:57 -04:00
Christian König	c8d4c18bfb	dma-buf/drivers: make reserving a shared slot mandatory v4 Audit all the users of dma_resv_add_excl_fence() and make sure they reserve a shared slot also when only trying to add an exclusive fence. This is the next step towards handling the exclusive fence like a shared one. v2: fix missed case in amdgpu v3: and two more radeon, rename function v4: add one more case to TTM, fix i915 after rebase Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com	2022-04-06 17:38:25 +02:00
tiancyin	dda81d9761	drm/amd/vcn: fix an error msg on vcn 3.0 Some video card has more than one vcn instance, passing 0 to vcn_v3_0_pause_dpg_mode is incorrect. Error msg: Register(1) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002 Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: tiancyin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-04-06 11:25:22 -04:00
Boyuan Zhang	945da79e6d	drm/amdgpu/vcn3: send smu interface type For VCN FW to detect ASIC type, in order to use different mailbox registers. V2: simplify codes and fix format issue. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-06 11:24:23 -04:00
Grigory Vasilyev	d1826081bb	drm/amdgpu: Remove leftover igp_lane_info Variable igp_lane_info always is 0. 0 & any value = 0 and false. In this way, all сonditional statements will false. The code was leftover from when the code was ported from radeon where igp_lane_info was derived from the vbios on supported platforms. [update commit message - Alex] Signed-off-by: Grigory Vasilyev <h0tc0d3@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-06 10:20:29 -04:00
Philip Yang	0f12a22f37	drm/amdgpu: Flush TLB after mapping for VG20+XGMI For VG20 + XGMI bridge, all mappings PTEs cache in TC, this may have stall invalid PTEs in TC because one cache line has 8 pages. Need always flush_tlb after updating mapping. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-05 10:29:47 -04:00
Haowen Bai	7e97de3e7f	drm/amdgpu/vcn: Remove unneeded semicolon report by coccicheck: drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:1951:2-3: Unneeded semicolon Fixes: `c543dcbe42` ("drm/amdgpu/vcn: Add VCN ras error query support") Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-05 10:26:42 -04:00
Christian König	30671b44aa	drm/amdgpu: fix TLB flushing during eviction Testing the valid bit is not enough to figure out if we need to invalidate the TLB or not. During eviction it is quite likely that we move a BO from VRAM to GTT and update the page tables immediately to the new GTT address. Rework the whole function to get all the necessary parameters directly as value. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-05 10:26:20 -04:00
Maxime Ripard	9cbbd694a5	Merge drm/drm-next into drm-misc-next Let's start the 5.19 development cycle. Signed-off-by: Maxime Ripard <maxime@cerno.tech>	2022-04-05 11:06:58 +02:00
Christian König	ba5f33cccc	drm/amdgpu: use dma_resv_get_singleton in amdgpu_pasid_free_cb Makes the code a bit more simpler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20220321135856.1331-15-christian.koenig@amd.com	2022-04-03 20:15:51 +02:00
Christian König	644704740b	drm/amdgpu: use dma_resv_for_each_fence for CS workaround v2 Get the write fence using dma_resv_for_each_fence instead of accessing it manually. v2: add TODO comment Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20220321135856.1331-9-christian.koenig@amd.com	2022-04-03 18:50:49 +02:00
Ma Jun	cf8cc382aa	drm/amdgpu: Sync up header and implementation to use the same parameter names Sync up header and implementation to use the same parameter names in function amdgpu_ring_init. ring_size -> max_dw, prio -> hw_prio Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:54 -04:00
Ruili Ji	058497e1f5	drm/amdgpu: fix incorrect GCR_GENERAL_CNTL address gfx10.3.3/gfx10.3.6/gfx10.3.7 shall use 0x1580 address for GCR_GENERAL_CNTL Acked-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:54 -04:00
Christian König	4499c90e90	drm/amdgpu: fix incorrect size printing in error msg That are bytes not pages. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:54 -04:00
Christian König	55a2d21bba	drm/amdgpu: fix some kerneldoc in the VM code v2 Fix two incorrect kerneldocs for the recent VM code changes. v2: fix one more typo Signed-off-by: Christian König <christian.koenig@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:54 -04:00
Philip Yang	44e121fbf1	drm/amdgpu: Add tlb_cb for unlocked update Flush TLB needs wait for GPU update fence done. MMU notify callback to unmap range from GPUs uses unlocked GPU page table update, so add tlb_cb to unlocked update fence to increase vm->tlb_seq. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:53 -04:00
Philip Yang	9563e1ec92	drm/amdgpu: Correct unlocked update fence handling To fix two issues with unlocked update fence: 1. vm->last_unlocked store the latest fence without taking refcount. 2. amdgpu_vm_bo_update_mapping returns old fence, not the latest fence. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:53 -04:00
Christian König	77ef271fae	drm/amdgpu: drop amdgpu_gtt_node We have the BO pointer in the base structure now as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Link: https://patchwork.freedesktop.org/patch/msgid/20220321132601.2161-6-christian.koenig@amd.com	2022-03-29 10:57:12 +02:00
Christian König	fee2ede155	drm/ttm: rework bulk move handling v5 Instead of providing the bulk move structure for each LRU update set this as property of the BO. This should avoid costly bulk move rebuilds with some games under RADV. v2: some name polishing, add a few more kerneldoc words. v3: add some lockdep v4: fix bugs, handle pin/unpin as well v5: improve kerneldoc Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220321132601.2161-5-christian.koenig@amd.com	2022-03-29 10:55:32 +02:00
Christian König	6a9b028994	drm/ttm: move the LRU into resource handling v4 This way we finally fix the problem that new resource are not immediately evict-able after allocation. That has caused numerous problems including OOM on GDS handling and not being able to use TTM as general resource manager. v2: stop assuming in ttm_resource_fini that res->bo is still valid. v3: cleanup kerneldoc, add more lockdep annotation v4: consistently use res->num_pages Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220321132601.2161-1-christian.koenig@amd.com	2022-03-28 20:05:32 +02:00
Mohammad Zafar Ziya	749831acb1	drm/amdgpu/jpeg: Add jpeg ras error query support RAS error query support addition for JPEG 2.6 V2: removed unused options and corrected comment format. Moved register definition to header file. V3: poison query status check added. Removed the error query support V4: Return statement refactored. Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Mohammad Zafar Ziya	c543dcbe42	drm/amdgpu/vcn: Add VCN ras error query support RAS error query support addition for VCN 2.6 V2: removed unused option and corrected comment format Moved the register definition under header file V3: poison query status check added. Removed error query interface V4: MMSCH poison check option removed, return true/false refactored. Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Mohammad Zafar Ziya	edd08fa137	drm/amdgpu/jpeg: Add jpeg block ras support Ras support addition for JPEG block V2: removed default callback Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Mohammad Zafar Ziya	60fce7417f	drm/amdgpu/vcn: Add vcn ras support VCN block ras feature support addition V2: default ras callback removed Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Mohammad Zafar Ziya	a3d63c62bd	drm/amdgpu: Add vcn and jpeg ras support flag Add vcn and jpeg ras support options V2: vcn and jpeg ras flag enabled for aldebaran asic only V3: vcn and jpeg ras flag disabled for error counter query Generic poison query interface added VCN and JPEG ras enabled based on IP version check V4: vcn and jpeg ras flag moved under ecc flag for dGPU Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
tiancyin	425d7a87e5	drm/amd/vcn: fix an error msg on vcn 3.0 Some video card has more than one vcn instance, passing 0 to vcn_v3_0_pause_dpg_mode is incorrect. Error msg: Register(1) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002 Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: tiancyin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Sean Paul	9f07550b3c	drm/amdgpu: Re-classify some log messages in commit path ATOMIC and DRIVER log categories do not typically contain per-frame log messages. This patch re-classifies some messages in amd to chattier categories to keep ATOMIC/DRIVER quiet. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Boyuan Zhang	e3026a057f	drm/amdgpu/vcn3: send smu interface type For VCN FW to detect ASIC type, in order to use different mailbox registers. V2: simplify codes and fix format issue. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:38 -04:00
Christian König	8f8cc3fb43	drm/amdgpu: remove table_freed param from the VM code Better to leave the decision when to flush the VM changes in the TLB to the VM code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:52 -04:00
Christian König	4d30a83c74	drm/amdkfd: use tlb_seq from the VM subsystem for SVM as well v2 Instead of hand rolling the table_freed parameter. v2: add some changes suggested by Philip Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:52 -04:00
Christian König	5255e146c9	drm/amdgpu: rework TLB flushing Instead of tracking the VM updates through the dependencies just use a sequence counter for page table updates which indicates the need to flush the TLB. This reduces the need to flush the TLB drastically. v2: squash in NULL check fix (Christian) Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:51 -04:00
Christian König	e997b82745	drm/amdgpu: simplify VM update tracking a bit Store the 64bit sequence directly. Makes it simpler to use and saves a bit of fence reference counting overhead. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Christian König	184a69ca4d	drm/amdgpu: separate VM PT handling into amdgpu_vm_pt.c Separate the VM page table backend operations from the state machine since the amdgpu_vm.c file is becoming to complex. The allocating, freeing and updating page tables and page directories can easily be moved into a separate file. While at it cleanup everything checkpatch.pl reported and rename the functions a bit to make more clear that they belong together. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Christian König	6e97c2f968	drm/amdgpu: move VM PDEs to idle after update Move the page tables to the idle list after updating the PDEs. We have gone back and forth with that a couple of times because of problems with the inter PD dependencies, but it should work now that we have the state handling cleanly separated. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Guchun Chen	f3fa490960	drm/amdgpu: drop redundant check of harvest info Harvest bit setting in IP data structure promises this, so no need to set it explicitly. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Colin Ian King	2f78f0d3e3	drm/amdgpu: Fix spelling mistake "regiser" -> "register" There is a spelling mistake in a dev_error error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Tao Zhou	6475ae2b74	drm/amdgpu: add UTCL2 RAS poison query for Aldebaran (v2) Add help functions to query and reset RAS UTCL2 poison status. v2: implement it on amdgpu side and kfd only calls it. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Prike Liang	15f9cd4334	drm/amdgpu/gfx10: enable gfx1037 clock counter retrieval function Enable gfx1037 clock counter retrieval function for KFDPerfCountersTest.ClockCountersBasicTest. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Prike Liang	0dc386add5	drm/amdgpu: set noretry for gfx 10.3.7 Disable xnack on the gfx10.3.7 for the KFD test. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Felix Kuehling	609910db56	drm/amdgpu: set noretry=1 for GFX 10.3.4 Retry faults are not supported on GFX 10.3.4. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Alex Deucher	c5b266810c	drm/amdgpu: make amdgpu_display_gem_fb_verify_and_init() static Unused outside of amdgpu_display.c. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Yifan Zhang	7057c81773	drm/amdgpu: set noretry=1 for gc 10.3.6 this patch to set noretry=1 for gc 10.3.6. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Alex Deucher	60da2f7440	drm/amdgpu: drop amdgpu_display_gem_fb_init() Unused. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Alex Deucher	5f3854f1f4	drm/amdgpu: add more cases to noretry=1 Port current list from amd-staging-drm-next. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Alex Deucher	31d5c52346	drm/amdgpu: make amdgpu_display_framebuffer_init() static It's not used outside of amdgpu_display.c. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Tianci Yin	6ea239adc2	drm/amdgpu/vcn: improve vcn dpg stop procedure Prior to disabling dpg, VCN need unpausing dpg mode, or VCN will hang in S3 resuming. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Tianci Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Tushar Patel	b7dfbd2e60	drm/amdkfd: Fix Incorrect VMIDs passed to HWS Compute-only GPUs have more than 8 VMIDs allocated to KFD. Fix this by passing correct number of VMIDs to HWS v2: squash in warning fix (Alex) Signed-off-by: Tushar Patel <tushar.patel@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Emily Deng	02fc996d50	drm/amdgpu/vcn: Fix the register setting for vcn1 Correct the code error for setting register UVD_GFX10_ADDR_CONFIG. Need to use inst_idx, or it only will set VCN0. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-03-25 12:40:24 -04:00
Lang Yu	0d8e4eb337	drm/amdgpu: add workarounds for VCN TMZ issue on CHIP_RAVEN It is a hardware issue that VCN can't handle a GTT backing stored TMZ buffer on CHIP_RAVEN series ASIC. Move such a TMZ buffer to VRAM domain before command submission as a workaround. v2: - Use patch_cs_in_place callback. v3: - Bail out early if unsecure IBs. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-03-25 12:40:24 -04:00
Alex Deucher	b818a5d374	drm/amdgpu/gmc: use PCI BARs for APUs in passthrough If the GPU is passed through to a guest VM, use the PCI BAR for CPU FB access rather than the physical address of carve out. The physical address is not valid in a guest. v2: Fix HDP handing as suggested by Michel Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:24 -04:00
Dan Carpenter	1647b54ed5	drm/amdgpu: fix off by one in amdgpu_gfx_kiq_acquire() This post-op should be a pre-op so that we do not pass -1 as the bit number to test_bit(). The current code will loop downwards from 63 to -1. After changing to a pre-op, it loops from 63 to 0. Fixes: `71c37505e7` ("drm/amdgpu/gfx: move more common KIQ code to amdgpu_gfx.c") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:24 -04:00
Guchun Chen	2d505453f3	drm/amdgpu: conduct a proper cleanup of PDB bo Use amdgpu_bo_free_kernel instead of amdgpu_bo_unref to perform a proper cleanup of PDB bo. v2: update subject to be more accurate Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:24 -04:00
Guchun Chen	32f90e6525	drm/amdgpu: prevent memory wipe in suspend/shutdown stage On GPUs with RAS enabled, below call trace is observed when suspending or shutting down device. The cause is we have enabled memory wipe flag for BOs on such GPUs by default, and such BOs will go to memory wipe by amdgpu_fill_buffer, however, because ring is off already, it fails to clean up the memory and throw this error message. So add a suspend/shutdown check before wipping memory. [drm:amdgpu_fill_buffer [amdgpu]] ERROR Trying to clear memory with ring turned off. v2: fix coding style issue Fixes: `fc6ea4bee1` ("drm/amdgpu: Wipe all VRAM on free when RAS is enabled") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:24 -04:00
Christian König	548e7432dc	dma-buf: add dma_resv_replace_fences v2 This function allows to replace fences from the shared fence list when we can gurantee that the operation represented by the original fence has finished or no accesses to the resources protected by the dma_resv object any more when the new fence finishes. Then use this function in the amdkfd code when BOs are unmapped from the process. v2: add an example when this is usefull. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220321135856.1331-1-christian.koenig@amd.com	2022-03-24 12:02:26 +01:00
Aurabindo Pillai	c5c948aa89	drm/amd: Add USBC connector ID [Why&How] Add a dedicated AMDGPU specific ID for use with newer ASICs that support USB-C output Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-22 10:53:39 -04:00
Ville Syrjälä	426c89aa20	drm/amdgpu: Use drm_mode_copy() struct drm_display_mode embeds a list head, so overwriting the full struct with another one will corrupt the list (if the destination mode is on a list). Use drm_mode_copy() instead which explicitly preserves the list head of the destination mode. Even if we know the destination mode is not on any list using drm_mode_copy() seems decent as it sets a good example. Bad examples of not using it might eventually get copied into code where preserving the list head actually matters. Obviously one case not covered here is when the mode itself is embedded in a larger structure and the whole structure is copied. But if we are careful when copying into modes embedded in structures I think we can be a little more reassured that bogus list heads haven't been propagated in. @is_mode_copy@ @@ drm_mode_copy(...) { ... } @depends on !is_mode_copy@ struct drm_display_mode mode; expression E, S; @@ ( - mode = E + drm_mode_copy(mode, &E) \| - memcpy(mode, E, S) + drm_mode_copy(mode, E) ) @depends on !is_mode_copy@ struct drm_display_mode mode; expression E; @@ ( - mode = E + drm_mode_copy(&mode, &E) \| - memcpy(&mode, E, S) + drm_mode_copy(&mode, E) ) @@ struct drm_display_mode mode; @@ - &mode + mode Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: amd-gfx@lists.freedesktop.org Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:12 -04:00
Paul Menzel	07d0146932	drm/amdgpu: Use ternary operator in `vcn_v1_0_start()` Remove the boilerplate of declaring a variable and using an if else statement by using the ternary operator. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:12 -04:00
Julia Lawall	58398727e6	drm/amdgpu: fix typos in comments Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:12 -04:00
Yongqiang Sun	faad5ccac1	drm/amdgpu: Add stolen reserved memory for MI25 SRIOV. MI25 SRIOV guest driver loading failed due to allocated memory overlaps with firmware reserved area. Allocate stolen reserved memory for MI25 SRIOV specifically to avoid the memory overlap. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:12 -04:00
Yongqiang Sun	3f54355284	drm/amdgpu: Merge get_reserved_allocation to get_vbios_allocations. Some ASICs need reserved memory for firmware or other components, which is not allowed to be used by driver. amdgpu_gmc_get_reserved_allocation is to handle additional areas. To avoid any missing calling, merged amdgpu_gmc_get_reserved_allocation to amdgpu_gmc_get_vbios_allocations. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:12 -04:00
Tianci Yin	4e2f50e230	drm/amdgpu/vcn: fix vcn ring test failure in igt reload test [why] On Renoir, vcn ring test failed on the second time insmod in the reload test. After investigation, it proves that vcn only can disable dpg under dpg unpause mode (dpg unpause mode is default for dec only, dpg pause mode is for dec/enc). [how] unpause dpg in dpg stopping procedure. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Tianci Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:02 -04:00
Lang Yu	8c0f11ff38	drm/amdgpu: only allow secure submission on rings which support that Only GFX ring, SDMA ring and VCN decode ring support secure submission at the moment. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:42:27 -04:00
yipechai	8476269f75	drm/amdgpu: fixed the warnings reported by kernel test robot The reported warnings are as follows: 1.warning:no-previous-prototype-for-amdgpu_hdp_ras_fini. 2.warning:no-previous-prototype-for-amdgpu_mmhub_ras_fini. Amdgpu_hdp_ras_fini and amdgpu_mmhub_ras_fini are unused in the code, they are the only functions in amdgpu_hdp.c and amdgpu_mmhub.c. After removing these two functions, both amdgpu_hdp.c and amdgpu_mmhub.c are empty, so these two files can be deleted to fix the warning. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:42:20 -04:00
Philip Yang	436afdfa35	drm/amdgpu: Move reset domain init before calling RREG32 amdgpu_detect_virtualization reads register, amdgpu_device_rreg access adev->reset_domain->sem if kernel defined CONFIG_LOCKDEP, below is the random boot hang backtrace on Vega10. It may get random NULL pointer access backtrace if amdgpu_sriov_runtime is true too. Move amdgpu_reset_create_reset_domain before calling to RREG32. BUG: kernel NULL pointer dereference, address: #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI Workqueue: events work_for_cpu_fn RIP: 0010:down_read_trylock+0x13/0xf0 Call Trace: <TASK> amdgpu_device_skip_hw_access+0x38/0x80 [amdgpu] amdgpu_device_rreg+0x1b/0x170 [amdgpu] amdgpu_detect_virtualization+0x73/0x100 [amdgpu] amdgpu_device_init.cold.60+0xbe/0x16b1 [amdgpu] ? pci_bus_read_config_word+0x43/0x70 amdgpu_driver_load_kms+0x15/0x120 [amdgpu] amdgpu_pci_probe+0x1a1/0x3a0 [amdgpu] Fixes: `d0fb18b535` ("drm/amdgpu: Move reset sem into reset_domain") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:40:49 -04:00
Tianci.Yin	72a98763b4	drm/amd: fix gfx hang on renoir in IGT reload test [why] CP hangs in igt reloading test on renoir, more precisely, hangs on the second time insmod. [how] mode2 reset can make it recover, and mode2 reset only effects gfx core, dcn and the screen will not be impacted. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:34:45 -04:00
Alex Deucher	85ac2021fe	drm/amdgpu: only check for _PR3 on dGPUs We don't support runtime pm on APUs. They support more dynamic power savings using clock and powergating. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:34:41 -04:00
Hawking Zhang	a03b288650	drm/amdgpu: drop xmgi23 error query/reset support xgmi_ras is only initialized when host to GPU interface is PCIE. in such case, xgmi23 is disabled and protected by security firmware. Host access will results to security violation Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:34:26 -04:00
Jonathan Kim	6f172ae59a	drm/amdgpu: fix aldebaran xgmi topology for vf VFs must also distinguish whether or not the TA supports full duplex or half duplex link records in order to report the correct xGMI topology. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:34:20 -04:00
Stanley.Yang	69691c8235	drm/amdgpu: message smu to update bad channel info It should notice SMU to update bad channel info when detected uncorrectable error in UMC block Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:25:16 -04:00
Lijo Lazar	bb7c3e9ce2	drm/amdgpu: Disable baco dummy mode On aldebaran, BACO dummy mode may be enabled during reset. Disable it during resume. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:25:15 -04:00
Lang Yu	96a2f0f2c8	drm/amdgpu: fix a wrong ib reference It should be p->job->ibs[j] instead of p->job->ibs[i] here. Fixes: `cdc7893fc9` ("drm/amdgpu: use job and ib structures directly in CS parsers") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-09 17:28:37 -05:00
Christian König	48e9fbd1a2	drm/amdgpu: initialize the vmid_wait with the stub fence This way we don't need to check for NULL any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Christian König	6103b2f24e	drm/amdgpu: properly embed the IBs into the job We now have standard macros for that. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Christian König	cdc7893fc9	drm/amdgpu: use job and ib structures directly in CS parsers Instead of providing the ib index provide the job and ib pointers directly to the patch and parse functions for UVD and VCE. Also move the set/get functions for IB values to the IB declerations. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Christian König	a190f8dc4a	drm/amdgpu: header cleanup No function change, just move a bunch of definitions from amdgpu.h into separate header files. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Jingwen Chen	8c7442f026	drm/amd/amdgpu: set disabled vcn to no_schduler [Why] after the reset domain introduced, the sched.ready will be init after hw_init, which will overwrite the setup in vcn hw_init, and lead to vcn ib test fail. [How] set disabled vcn to no_scheduler Fixes: `5fd8518d18` ("drm/amdgpu: Move scheduler init to after XGMI is ready") Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Christian König	d18b8eadd8	drm/amdgpu: install ctx entities with cmpxchg Since we removed the context lock we need to make sure that not two threads are trying to install an entity at the same time. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `461fa7b0ac` ("drm/amdgpu: remove ctx->lock") Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Yifan Zhang	b664a56e86	drm/amdkfd: implement get_atc_vmid_pasid_mapping_info for gfx10.3 This patch implements get_atc_vmid_pasid_mapping_info for gfx10.3 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Ruijing Dong	11eb648d01	drm/amdgpu/vcn: Add vcn firmware log vcn fwlog is for debugging purpose only, by default, it is disabled. Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Ruijing Dong	b6065ebf55	drm/amdgpu/vcn: Update fw shared data structure Add fw log in fw shared data structure. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
David Yu	811c04dbb3	drm/amdgpu: Add DFC CAP support for aldebaran Add DFC CAP support for aldebaran Initialize cap microcode in psp_init_sriov_microcode, the ta microcode will be initialized in psp_vxx_init_microcode Signed-off-by: David Yu <David.Yu@amd.com> Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Harish Kasiviswanathan	24bf9fd197	drm/amdgpu: Set correct DMA mask for aldebaran Aldebaran has 48-bit physical address support Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:29 -05:00
Lijo Lazar	9e08564727	drm/amdgpu: Refactor mode2 reset logic for v13.0.2 Use IP version and refactor reset logic to apply to a list of devices. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:29 -05:00
Weiguo Li	3192f1d9b6	drm/amdgpu: remove redundant null check Remove the redundant null check since the caller ensures that 'ctx' is never NULL. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Weiguo Li <liwg06@foxmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Alex Deucher	825e0af0d4	drm/amdgpu/sdma5: drop unused cyan skillfish firmware Leftover from bring up. Not used anymore. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Alex Deucher	31f5f46043	drm/amdgpu/gfx10: drop unused cyan skillfish firmware Leftover from bring up. Not used anymore. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Alex Deucher	1b537e6410	drm/amdgpu: remove unused gpu_info firmwares These were leftover from bring up and are no longer necessary. The information is available via the IP discovery table. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Alex Deucher	45a3e06be4	drm/amdgpu: Use IP versions in convert_tiling_flags_to_modifier() Rather than checking the asic_type. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Prike Liang	d7709eb6a1	drm/amdgpu: enable gfxoff routine for GC 10.3.7 Enable gfxoff routine for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Prike Liang	fabe175385	drm/amdgpu: enable gfx power gating for GC 10.3.7 Enable gfx power gating for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Prike Liang	9a1358bb2c	drm/amdgpu/nv: enable clock gating for GC 10.3.7 subblock This will enable the following block clock gating. - MC - SDMA - HDP - ATHUB - IH - VCN/JPEG Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Prike Liang	00bfab4457	drm/amdgpu: enable gfx clock gating control for GC 10.3.7 Enable gfx cg gate/ungate control for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Qiang Yu	b6901d93cc	drm/amdgpu: fix suspend/resume hang regression Regression has been reported that suspend/resume may hang with the previous vm ready check commit. So bring back the evicted list check as a temp fix. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1922 Fixes: `c1a66c3bc4` ("drm/amdgpu: check vm ready by amdgpu_vm->evicting flag") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Qiang Yu <qiang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Yifan Zha	e6fac6a9c9	drm/amdgpu: Move CAP firmware loading to the beginning of PSP firmware list [Why] As PSP needs to verify the signature, CAP firmware must be loaded first when PSP loads firmwares. Otherwise, when DFC feature is enabled, CP firmwares would be loaded failed. [ 1149.160480] [drm] MM table gpu addr = 0x800022f000, cpu addr = 00000000a62afcea. [ 1149.209874] [drm] failed to load ucode CP_CE(0x8) [ 1149.209878] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.215914] [drm] failed to load ucode CP_PFP(0x9) [ 1149.215917] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.221941] [drm] failed to load ucode CP_ME(0xA) [ 1149.221944] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.228082] [drm] failed to load ucode CP_MEC1(0xB) [ 1149.228085] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.234209] [drm] failed to load ucode CP_MEC2(0xD) [ 1149.234212] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.242379] [drm] failed to load ucode VCN(0x1C) [ 1149.242382] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [How] Move CAP UCODE ID to the beginning of AMDGPU_UCODE_ID enum list. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Andrey Grodzovsky	5aa061474b	drm/amdgpu: Bump minor version for hot plug tests enabling. This will allow to enable the tests only after latest fix after which the tests passed on my system. I tested on NV21 standalone and Vega 10 and Polaris as pair with DRI_PRIME. It's possible there might be still issues on ASICs i don't have at my posession but that that the point of enbling the tests finally - if other people during testing will encounter errors they will report and I will be able to fix. The releated merge request for enabling libdrm tests suite is in https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/227 Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Andrey Grodzovsky	57230f0ce6	drm/amdgpu: Fix sigsev when accessing MMIO on hot unplug. Protect with drm_dev_enter/exit Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Yifan Zhang	7d4108e4ce	drm/amdgpu: convert code name to ip version for noretry set Use IP version rather than codename for noretry set. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Yifan Zhang	957b0787ee	drm/amdgpu: move amdgpu_gmc_noretry_set after ip_versions populated otherwise adev->ip_versions is still empty when amdgpu_gmc_noretry_set is called. Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	80e0c2cb37	drm/amdgpu: Remove redundant .ras_fini initialization in some ras blocks 1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as .ras_fini common function, which is called when .ras_fini of ras block isn't initialized. 2. Remove the code of using amdgpu_ras_block_late_fini to initialize .ras_fini in ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	30e58102d5	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	149d7ba1f8	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	aa8e65dfc7	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in hdp ras block Remove redundant calls of amdgpu_ras_block_late_fini in hdp ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	f148c143ef	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in xgmi ras block Remove redundant calls of amdgpu_ras_block_late_fini in xgmi ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	0dca257d6d	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	f578a37d19	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in nbio ras block Remove redundant calls of amdgpu_ras_block_late_fini in nbio ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	9dad47c50f	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in mmhub ras block Remove redundant calls of amdgpu_ras_block_late_fini in mmhub ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	35366481d0	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in gfx ras block Remove redundant calls of amdgpu_ras_block_late_fini in gfx ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	1f211a827c	drm/amdgpu: centrally calls the .ras_fini function of all ras blocks centrally calls the .ras_fini function of all ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	667c7091a3	drm/amdgpu: Optimize xxx_ras_fini function of each ras block 1. Move the variables of ras block instance members from specific xxx_ras_fini to general ras_fini call. 2. Function calls inside the modules only use parameters passed from xxx_ras_fini instead of ras block instance members. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	01d468d9a4	drm/amdgpu: Modify .ras_fini function pointer parameter Modify .ras_fini function pointer parameter so that we can remove redundant intermediate calls in some ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Tom Rix	ca6fcfa8d4	drm/amdgpu: Fix realloc of ptr Clang static analysis reports this error amdgpu_debugfs.c:1690:9: warning: 1st function call argument is an uninitialized value tmp = krealloc_array(tmp, i + 1, ^~~~~~~~~~~~~~~~~~~~~~~~~~~ realloc uses tmp, so tmp can not be garbage. And the return needs to be checked. Fixes: `5ce5a584cb` ("drm/amdgpu: add debugfs for reset registers list") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Dave Airlie	38a15ad948	Merge tag 'amd-drm-next-5.18-2022-02-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.18-2022-02-25: amdgpu: - Raven2 suspend/resume fix - SDMA 5.2.6 updates - VCN 3.1.2 updates - SMU 13.0.5 updates - DCN 3.1.5 updates - Virtual display fixes - SMU code cleanup - Harvest fixes - Expose benchmark tests via debugfs - Drop no longer relevant gart aperture tests - More RAS restructuring - W=1 fixes - PSR rework - DP/VGA adapter fixes - DP MST fixes - GPUVM eviction fix - GPU reset debugfs register dumping support - Misc display fixes - SR-IOV fix - Aldebaran mGPU fix - Add module parameter to disable XGMI for testing amdkfd: - IH ring overflow logging fixes - CRIU fixes - Misc fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220225183535.5907-1-alexander.deucher@amd.com	2022-03-01 16:19:02 +10:00
Dave Airlie	6c64ae228f	Linux 5.17-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmIb/PEeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGugMH/R8icwG5gOjAuxBr fuz9032ECVFS36Cy3ps9Eqgf5EdS4G6yz3joM4aNtp4B7e5FzI9lzBaHS8OAguNL y7puFtBr9CywsnniJumZzciB9pEHmF/yyEKfMlRZA3JsRfLDacFstETp+duJnXoA +s49IWsy1ot5zoherhDXcFLqDoFhLVU4hYwE1xpLpW/cllqmSnW8SDZMtWW1Ui/B Of7zqINR4zBMmRjP4ymGq/ZrPWlFyWLdtOo0xxVoAQkeMgm33kfaaeGzfyK25FqR JDGUZ7lkKvwz3PYh2hqJ7dc5K+vhJ18I+F1UmOiL6QAAUF/k8jq7i3Qaf6MKqh2Z 2v+A5k0= =Hiar -----END PGP SIGNATURE----- Backmerge tag 'v5.17-rc6' into drm-next This backmerges v5.17-rc6 so I can merge some amdgpu and some tegra changes on top. Signed-off-by: Dave Airlie <airlied@redhat.com>	2022-02-28 14:57:14 +10:00
Andrey Grodzovsky	2656fd230d	drm/amdgpu: Exclude PCI reset method for now. According to my investigation of the state of PCI reset recently it's not working. The reason is due to the fact the kernel PCI code rejects SBR when there are more then one PF under same bridge which we always have (at least AUDIO PF but usually more) and that because SBR will reset all the PFS and devices under the same bridge as you and you cannot assume they support SBR. Once we anble FLR support we can reenable this option as FLR is doable on single PF and doens't have this restriction. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-24 17:25:20 -05:00
Alex Sierra	158a05a0b8	drm/amdgpu: Add use_xgmi_p2p module parameter This parameter controls xGMI p2p communication, which is enabled by default. However, it can be disabled by setting it to 0. In case xGMI p2p is disabled in a dGPU, PCIe p2p interface will be used instead. This parameter is ignored in GPUs that do not support xGMI p2p configuration. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-24 17:25:12 -05:00
Xiaogang Chen	45f0ff404c	drm/amdgpu: config HDP_MISC_CNTL.READ_BUFFER_WATERMARK To fix applications running across multiple GPU config hang. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-24 17:24:43 -05:00
Dave Airlie	54f43c17d6	drm-misc-next for v5.18: UAPI Changes: Cross-subsystem Changes: - Split out panel-lvds and lvds dt bindings . - Put yes/no on/off disabled/enabled strings in linux/string_helpers.h and use it in drivers and tomoyo. - Clarify dma_fence_chain and dma_fence_array should never include eachother. - Flatten chains in syncobj's. - Don't double add in fbdev/defio when page is already enlisted. - Don't sort deferred-I/O pages by default in fbdev. Core Changes: - Fix missing pm_runtime_put_sync in bridge. - Set modifier support to only linear fb modifier if drivers don't advertise support. - As a result, we remove allow_fb_modifiers. - Add missing clear for EDID Deep Color Modes in drm_reset_display_info. - Assorted documentation updates. - Warn once in drm_clflush if there is no arch support. - Add missing select for dp helper in drm_panel_edp. - Assorted small fixes. - Improve fb-helper's clipping handling. - Don't dump shmem mmaps in a core dump. - Add accounting to ttm resource manager, and use it in amdgpu. - Allow querying the detected eDP panel through debugfs. - Add helpers for xrgb8888 to 8 and 1 bits gray. - Improve drm's buddy allocator. - Add selftests for the buddy allocator. Driver Changes: - Add support for nomodeset to a lot of drm drivers. - Use drm_module__driver in a lot of drm drivers. - Assorted small fixes to bridge/lt9611, v3d, vc4, vmwgfx, mxsfb, nouveau, bridge/dw-hdmi, panfrost, lima, ingenic, sprd, bridge/anx7625, ti-sn65dsi86. - Add bridge/it6505. - Create DP and DVI-I connectors in ast. - Assorted nouveau backlight fixes. - Rework amdgpu reset handling. - Add dt bindings for ingenic,jz4780-dw-hdmi. - Support reading edid through aux channel in ingenic. - Add a drm driver for Solomon SSD130x OLED displays. - Add simple support for sharp LQ140M1JW46. - Add more panels to nt35560. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmIWLSEACgkQ/lWMcqZw E8OP7hAAjix94EX5fhFa7OAdqUbFtsiKhK/4zNtV9FWpFiEsDBz+dlbfDQWIx5an FIiiiQtSfWjpDv6pcMhoNf80w+dDbc/Cuauz6nNGO7Pkaerh2D/EPG74FD7f7nE3 EIScVs1heYtzM9usKrFKupNYgIdgZxmeovClWuE0OTjLOes2PGvvxXK6iJqNqJMX VlDO5SR7GRqsDUDV6vmwl63uKL77xJXAahAXIx+BQ/1xrtEhlu6NwsgHIsmPmMSN YluX34zc1xD/6/uUqvEdp7u46/5/He1c5Q/ia1WV3wRxsO/eMZ+axXqCZP3XGZdt rMdGNtj1MWKkudYiowStWkCVSG/0fXJCFIAhvRmeZy+YqPdVlqZ2W7g4H1l9iJoo UVfT9cHrKoxHsukvIEckC5Ov9v1yr39Bd4wUuqaUTUSxY8VID5vjY63TsXl9Zke1 SluTFe9qybbnRNz/hYRvwIS1eT8HvUauAfAhypGTLI5DYHTD7PawcfMJkNzCtJm4 Ta4SC3rTpkpN+7oc8SoNgqRHQ8U9KL5oksP0wVa8vwHsMptSd3X4pUljc6TcfjLv GEo41D5AuJz3HRVcn9yqPbLoPE2FFB7bfwIMH77yNnoos4Izy/LGhKpN0YdImmI5 W5XVFB0jltGSIhkzLe1mFpLrdJwdUTSUVeCK4H5PhZZQEHLkVtg= =HuwD -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2022-02-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.18: UAPI Changes: Cross-subsystem Changes: - Split out panel-lvds and lvds dt bindings . - Put yes/no on/off disabled/enabled strings in linux/string_helpers.h and use it in drivers and tomoyo. - Clarify dma_fence_chain and dma_fence_array should never include eachother. - Flatten chains in syncobj's. - Don't double add in fbdev/defio when page is already enlisted. - Don't sort deferred-I/O pages by default in fbdev. Core Changes: - Fix missing pm_runtime_put_sync in bridge. - Set modifier support to only linear fb modifier if drivers don't advertise support. - As a result, we remove allow_fb_modifiers. - Add missing clear for EDID Deep Color Modes in drm_reset_display_info. - Assorted documentation updates. - Warn once in drm_clflush if there is no arch support. - Add missing select for dp helper in drm_panel_edp. - Assorted small fixes. - Improve fb-helper's clipping handling. - Don't dump shmem mmaps in a core dump. - Add accounting to ttm resource manager, and use it in amdgpu. - Allow querying the detected eDP panel through debugfs. - Add helpers for xrgb8888 to 8 and 1 bits gray. - Improve drm's buddy allocator. - Add selftests for the buddy allocator. Driver Changes: - Add support for nomodeset to a lot of drm drivers. - Use drm_module__driver in a lot of drm drivers. - Assorted small fixes to bridge/lt9611, v3d, vc4, vmwgfx, mxsfb, nouveau, bridge/dw-hdmi, panfrost, lima, ingenic, sprd, bridge/anx7625, ti-sn65dsi86. - Add bridge/it6505. - Create DP and DVI-I connectors in ast. - Assorted nouveau backlight fixes. - Rework amdgpu reset handling. - Add dt bindings for ingenic,jz4780-dw-hdmi. - Support reading edid through aux channel in ingenic. - Add a drm driver for Solomon SSD130x OLED displays. - Add simple support for sharp LQ140M1JW46. - Add more panels to nt35560. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/686ec871-e77f-c230-22e5-9e3bb80f064a@linux.intel.com	2022-02-25 05:50:18 +10:00
Qiang Yu	c1a66c3bc4	drm/amdgpu: check vm ready by amdgpu_vm->evicting flag Workstation application ANSA/META v21.1.4 get this error dmesg when running CI test suite provided by ANSA/META: [drm:amdgpu_gem_va_ioctl [amdgpu]] ERROR Couldn't update BO_VA (-16) This is caused by: 1. create a 256MB buffer in invisible VRAM 2. CPU map the buffer and access it causes vm_fault and try to move it to visible VRAM 3. force visible VRAM space and traverse all VRAM bos to check if evicting this bo is valuable 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable() will set amdgpu_vm->evicting, but latter due to not in visible VRAM, won't really evict it so not add it to amdgpu_vm->evicted 5. before next CS to clear the amdgpu_vm->evicting, user VM ops ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted) but fail in amdgpu_vm_bo_update_mapping() (check amdgpu_vm->evicting) and get this error log This error won't affect functionality as next CS will finish the waiting VM ops. But we'd better clear the error log by checking the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling amdgpu_vm_bo_update_mapping() later. Another reason is amdgpu_vm->evicted list holds all BOs (both user buffer and page table), but only page table BOs' eviction prevent VM ops. amdgpu_vm->evicting flag is set only for page table BOs, so we should use evicting flag instead of evicted list in amdgpu_vm_ready(). The side effect of this change is: previously blocked VM op (user buffer in "evicted" list but no page table in it) gets done immediately. v2: update commit comments. Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Qiang Yu <qiang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-23 16:31:06 -05:00
Guchun Chen	e2b993302f	drm/amdgpu: bypass tiling flag check in virtual display case (v2) vkms leverages common amdgpu framebuffer creation, and also as it does not support FB modifier, there is no need to check tiling flags when initing framebuffer when virtual display is enabled. This can fix below calltrace: amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu] v2: check adev->enable_virtual_display instead as vkms can be enabled in bare metal as well. Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 16:31:06 -05:00
Guchun Chen	97c61e0b7c	Revert "drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()" This reverts commit `4046afcebf`. No need to support modifier in virtual kms, otherwise, in SRIOV mode, when lanuching X server, set crtc will fail due to mismatch between primary plane modifier and framebuffer modifier. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 16:31:06 -05:00
Chen Gong	1e2be869c8	drm/amdgpu: do not enable asic reset for raven2 The GPU reset function of raven2 is not maintained or tested, so it should be very unstable. Now the amdgpu_asic_reset function is added to amdgpu_pmops_suspend, which causes the S3 test of raven2 to fail, so the asic_reset of raven2 is ignored here. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Chen Gong <curry.gong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-23 16:31:06 -05:00
Mario Limonciello	7294863a6f	drm/amd: Check if ASPM is enabled from PCIe subsystem commit `0064b0ce85` ("drm/amd/pm: enable ASPM by default") enabled ASPM by default but a variety of hardware configurations it turns out that this caused a regression. * PPC64LE hardware does not support ASPM at a hardware level. CONFIG_PCIEASPM is often disabled on these architectures. * Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem disables it Check with the PCIe subsystem to see that ASPM has been enabled or not. Fixes: `0064b0ce85` ("drm/amd/pm: enable ASPM by default") Link: https://wiki.raptorcs.com/w/images/a/ad/P9_PHB_version1.0_27July2018_pub.pdf Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1723 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1739 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1907 Tested-by: koba.ko@canonical.com Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-23 16:31:06 -05:00
Alex Deucher	e776a755ab	drm/amdgpu: fix typo in amdgpu_discovery.c disocvery -> discovery Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Somalapuram Amaranath	15fd09a05a	drm/amdgpu: add reset register dump trace on GPU Dump the list of register values to trace event on GPU reset. Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Somalapuram Amaranath	5ce5a584cb	drm/amdgpu: add debugfs for reset registers list List of register populated for dump collection during the GPU reset. Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Qiang Yu	b74e2476ef	drm/amdgpu: check vm ready by amdgpu_vm->evicting flag Workstation application ANSA/META v21.1.4 get this error dmesg when running CI test suite provided by ANSA/META: [drm:amdgpu_gem_va_ioctl [amdgpu]] ERROR Couldn't update BO_VA (-16) This is caused by: 1. create a 256MB buffer in invisible VRAM 2. CPU map the buffer and access it causes vm_fault and try to move it to visible VRAM 3. force visible VRAM space and traverse all VRAM bos to check if evicting this bo is valuable 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable() will set amdgpu_vm->evicting, but latter due to not in visible VRAM, won't really evict it so not add it to amdgpu_vm->evicted 5. before next CS to clear the amdgpu_vm->evicting, user VM ops ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted) but fail in amdgpu_vm_bo_update_mapping() (check amdgpu_vm->evicting) and get this error log This error won't affect functionality as next CS will finish the waiting VM ops. But we'd better clear the error log by checking the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling amdgpu_vm_bo_update_mapping() later. Another reason is amdgpu_vm->evicted list holds all BOs (both user buffer and page table), but only page table BOs' eviction prevent VM ops. amdgpu_vm->evicting flag is set only for page table BOs, so we should use evicting flag instead of evicted list in amdgpu_vm_ready(). The side effect of this change is: previously blocked VM op (user buffer in "evicted" list but no page table in it) gets done immediately. v2: update commit comments. Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Qiang Yu <qiang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Prike Liang	db749b769f	drm/amdgpu/nv: set mode2 reset for MP1 13.0.8 Set mode2 reset support for MP1 13.0.8. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Prike Liang	9e148e8ce2	drm/amdgpu/nv: enable gfx10.3.7 clock gating support This will enable the following gfx clock gating. - Fine clock gating - Medium Grain clock gating - 3D Coarse clock gating - Coarse Grain clock gating - RLC/CP light sleep clock gating Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Yifan Zhang	5043906024	drm/amdgpu: add mode2 reset support for smu 13.0.5 This patch adds mode2 reset support for smu 13.0.5. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
yipechai	29c9b6cd58	drm/amdgpu: Fixed warning reported by kernel test robot Fixed warning reported by kernel test robot: 1.warning: variable 'ras_obj' is used uninitialized whenever '\|\|' condition is true. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:35 -05:00
Maíra Canal	78be946dad	drm/amdgpu: Remove unused get_umc_v8_7_channel_index function Remove get_umc_v8_7_channel_index function, which is not used in the codebase. This was pointed by clang with the following warning: drivers/gpu/drm/amd/amdgpu/umc_v8_7.c:50:24: warning: unused function 'get_umc_v8_7_channel_index' [-Wunused-function] static inline uint32_t get_umc_v8_7_channel_index(struct amdgpu_device *adev, ^ Signed-off-by: Maíra Canal <maira.canal@usp.br> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Maíra Canal	d41ff22a4e	drm/amdgpu: Change amdgpu_ras_block_late_init_default function scope Turn previously global function into a static function to avoid the following Clang warning: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2459:5: warning: no previous prototype for function 'amdgpu_ras_block_late_init_default' [-Wmissing-prototypes] int amdgpu_ras_block_late_init_default(struct amdgpu_device adev, ^ drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2459:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int amdgpu_ras_block_late_init_default(struct amdgpu_device adev, ^ static Signed-off-by: Maíra Canal <maira.canal@usp.br> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	4683af148f	drm/amdgpu: use ktime rather than jiffies for benchmark results To protect against wraparounds. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	5a82b01823	drm/amdgpu: use kernel BO API for benchmark buffer management Simplifies the code quite a bit. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	a7f520bfd0	drm/amdgpu: derive GTT display support from DM Rather than duplicating the logic in two places, consolidate the logic in the display manager. Acked-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	b784f42cf7	drm/amdgpu: drop testing module parameter This test is not particularly useful now that GTT and GART are decoupled in the driver. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	0b1a63487b	drm/amdgpu: drop benchmark module parameter Now that we expose the benchmarks via debugfs, there is no longer a need for the module parameter. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	e7c4723103	drm/amdgpu: expose benchmarks via debugfs They provide a nice smoke test of transfer performance using SDMA. Allow the user to run these at runtime rather than only at init time. v2: fix permissions (Alex) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:50 -05:00
Alex Deucher	f113cc32e3	drm/amdgpu: add a benchmark mutex To avoid multiple runs in parallel to avoid mixing results. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:50 -05:00
Alex Deucher	b887d5f9b9	drm/amdgpu: print the selected benchmark test in the log So you can tell which benchmark was run. v2: print the test description as well Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:50 -05:00
Alex Deucher	e460f244fb	drm/amdgpu: plumb error handling though amdgpu_benchmark() So we can tell when this function fails. v2: squash in error handling fix (Alex) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:50 -05:00
Jiawei Gu	8ab62eda17	drm/sched: Add device pointer to drm_gpu_scheduler Add device pointer so scheduler's printing can use DRM_DEV_ERROR() instead, which makes life easier under multiple GPU scenario. v2: amend all calls of drm_sched_init() v3: fill dev pointer for all drm_sched_init() calls Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220221095705.5290-1-Jiawei.Gu@amd.com	2022-02-23 10:04:14 +01:00
Alex Deucher	091cd9c3ab	drm/amdgpu/benchmark: use dev_info rather than DRM macros for logging So we can tell which output goes to which device when multiple GPUs are present. Also while we are here, convert DRM_ERROR to dev_info. The error cases are not critical. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:52:39 -05:00
Paul Menzel	cec2cc7b1c	drm/amdgpu: Fix typo in whether in comment Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:52:39 -05:00
Guchun Chen	e1dd4bbf86	drm/amdgpu: read harvest bit per IP data on legacy GPUs Based on firmware team's input, harvest table in VBIOS does not apply well to legacy products like Navi1x, so seperate harvest mask configuration retrieve from different places. On legacy GPUs, scan harvest bit per IP data stuctures, while for newer ones, still read IP harvest info from harvest table. v2: squash in fix to limit it to specific skus (Guchun) Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:52:39 -05:00
Prike Liang	7342bf6530	drm/amdgpu: enable TMZ option for onwards asic The TMZ is disabled by default and enable TMZ option for the IP discovery based asic will help on the TMZ function verification. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:44:27 -05:00
Guchun Chen	d4a7eac27e	drm/amdgpu: bypass tiling flag check in virtual display case (v2) vkms leverages common amdgpu framebuffer creation, and also as it does not support FB modifier, there is no need to check tiling flags when initing framebuffer when virtual display is enabled. This can fix below calltrace: amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu] v2: check adev->enable_virtual_display instead as vkms can be enabled in bare metal as well. Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:44:12 -05:00
Guchun Chen	fa3e5a43ec	Revert "drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()" This reverts commit `4046afcebf`. No need to support modifier in virtual kms, otherwise, in SRIOV mode, when lanuching X server, set crtc will fail due to mismatch between primary plane modifier and framebuffer modifier. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:41:24 -05:00
Evan Quan	f626dd0ff0	drm/amdgpu: disable MMHUB PG for Picasso MMHUB PG needs to be disabled for Picasso for stability reasons. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-21 17:55:17 -05:00
Yifan Zhang	181ebed7dc	drm/amdgpu: add dm ip block for dcn 3.1.5 this patch adds dm ip block for dcn 3.1.5. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:07:01 -05:00
Yifan Zhang	068ea8bdc0	drm/amd/pm: add smu_v13_0_5_ppt implementation this patch adds smu_v13_0_5_ppt implementation. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:07:00 -05:00
Yifan Zhang	d7fd297cb0	drm/amdgpu: add support for psp 13.0.5 Enabl psp support for psp 13.0.5. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:07:00 -05:00
Yifan Zhang	ec3ca07885	drm/amdgpu: add smuio support for smuio 13.0.10 this patch adds smuio support for smuio 13.0.10. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:07:00 -05:00
Yifan Zhang	935ad3a74c	drm/amdgpu: add support for nbio 7.3.0 this patch adds support for nbio 7.3.0. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:07:00 -05:00
Boyuan Zhang	87b5e77f02	drm/amdgpu: enable vcn pg and cg for vcn 3.1.2 Enable PG and CG for VCN/JPEG Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:06:59 -05:00
Boyuan Zhang	afc2f27605	drm/amdgpu/vcn: add vcn support for vcn 3.1.2 Load VCN FW, set caps. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:06:59 -05:00
Yifan Zhang	93afe15837	drm/amdgpu: add support for sdma 5.2.6 This patch adds support for sdma 5.2.6. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:06:59 -05:00
Chen Gong	89bfcd82b3	drm/amdgpu: do not enable asic reset for raven2 The GPU reset function of raven2 is not maintained or tested, so it should be very unstable. Now the amdgpu_asic_reset function is added to amdgpu_pmops_suspend, which causes the S3 test of raven2 to fail, so the asic_reset of raven2 is ignored here. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Chen Gong <curry.gong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:06:59 -05:00
Yifan Zhang	874bfdfa47	drm/amdgpu: add gc 10.3.6 support this patch adds gc 10.3.6 support. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:06 -05:00
Yifan Zhang	a142606d54	drm/amdgpu: add support for gmc10 for gc 10.3.6 this patch adds support for gmc10. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Yifan Zhang	50e14a62ac	drm/amdgpu: add Clock and Power Gating support for gc 10.3.6 Add below supports: GFX Coarse Grain Clock Gating(CGCG) GFX Coarse grain light sleep/deep sleep(CGLS) GFX Medium Grain Clock Gating(MGCG) GFX Medium Grain light sleep/deep sleep(MGLS) GFX Fine Grain Clock Gating(FGCG) RLC MGLS CP MGLS MMHUB Clock Gating SDMA Clock Gating HDP Clock Gating ATHUB Clock Gating IH Clock Gating GFX Power Gating Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Yifan Zhang	1957f27de2	drm/amdgpu: add nv common init for gc 10.3.6 This patch adds add nv common init for gc 10.3.6. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Tom Rix	779596ce6a	drm/amdgpu: fix amdgpu_ras_block_late_init error handler Clang build fails with amdgpu_ras.c:2416:7: error: variable 'ras_obj' is used uninitialized whenever 'if' condition is true if (adev->in_suspend \|\| amdgpu_in_reset(adev)) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ amdgpu_ras.c:2453:6: note: uninitialized use occurs here if (ras_obj->ras_cb) ^~~~~~~ There is a logic error in the error handler's labels. ex/ The sysfs: is the last goto label in the normal code but is the middle of error handler. Rework the error handler. cleanup: is the first error, so it's handler should be last. interrupt: is the second error, it's handler is next. interrupt: handles the failure of amdgpu_ras_interrupt_add_hander() by calling amdgpu_ras_interrupt_remove_handler(). This is wrong, remove() assumes the interrupt has been setup, not torn down by add(). Change the goto label to cleanup. sysfs is the last error, it's handler should be first. sysfs: handles the failure of amdgpu_ras_sysfs_create() by calling amdgpu_ras_sysfs_remove(). But when the create() fails there is nothing added so there is nothing to remove. This error handler is not needed. Remove the error handler and change goto label to interrupt. Fixes: `b293e891b0` ("drm/amdgpu: add helper function to do common ras_late_init/fini (v3)") Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Luben Tuikov	6b5033831f	drm/amdgpu: Dynamically initialize IP instance attributes Dynamically initialize IP instance attributes. This eliminates bugs stemming from adding new attributes to an IP instance. Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Tom StDenis <tom.stdenis@amd.com> Fixes: `4d7ba312dd` ("drm/amdgpu: Add "harvest" to IP discovery sysfs") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Tom St Denis	8f74f68d90	drm/amd/amdgpu: Add APU flag to gca_config debugfs data (v3) Needed by umr to detect if ip discovered ASIC is an APU or not. (v2): Remove asic type from packet it's not strictly needed (v3): Correct comment Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Mario Limonciello	d01899d3db	drm/amd: Use amdgpu_device_should_use_aspm on navi umd pstate switching The `program_aspm` callback is already guarded for aspm, but the `enable_aspm` callback doesn't follow the module parameter. Update it to use the helper `amdgpu_device_should_use_aspm`. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Mario Limonciello	0ab5d711ec	drm/amd: Refactor `amdgpu_aspm` to be evaluated per device Evaluating `pcie_aspm_enabled` as part of driver probe has the implication that if one PCIe bridge with an AMD GPU connected doesn't support ASPM then none of them do. This is an invalid assumption as the PCIe core will configure ASPM for individual PCIe bridges. Create a new helper function that can be called by individual dGPUs to react to the `amdgpu_aspm` module parameter without having negative results for other dGPUs on the PCIe bus. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Luben Tuikov	f0d5409895	drm/amdgpu: Fix ARM compilation warning Fix this ARM warning: drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:664:35: warning: format '%ld' expects argument of type 'long int', but argument 4 has type 'size_t' {aka 'unsigned int'} [-Wformat=] Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: kbuild-all@lists.01.org Cc: linux-kernel@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Fixes: `a6c40b1780` ("drm/amdgpu: Show IP discovery in sysfs") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Mario Limonciello	cba07cce39	drm/amd: Check if ASPM is enabled from PCIe subsystem commit `0064b0ce85` ("drm/amd/pm: enable ASPM by default") enabled ASPM by default but a variety of hardware configurations it turns out that this caused a regression. * PPC64LE hardware does not support ASPM at a hardware level. CONFIG_PCIEASPM is often disabled on these architectures. * Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem disables it Check with the PCIe subsystem to see that ASPM has been enabled or not. Fixes: `0064b0ce85` ("drm/amd/pm: enable ASPM by default") Link: https://wiki.raptorcs.com/w/images/a/ad/P9_PHB_version1.0_27July2018_pub.pdf Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1723 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1739 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1907 Tested-by: koba.ko@canonical.com Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
yipechai	418abce203	drm/amdgpu: Remove redundant .ras_late_init initialization in some ras blocks 1. Define amdgpu_ras_block_late_init_default in amdgpu_ras.c as .ras_late_init common function, which is called when .ras_late_init of ras block isn't initialized. 2. Remove the code of using amdgpu_ras_block_late_init to initialize .ras_late_init in ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
yipechai	867e24ca49	drm/amdgpu: define amdgpu_ras_late_init to call all ras blocks' .ras_late_init Define amdgpu_ras_late_init to call all ras blocks' .ras_late_init. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
yipechai	caae42f009	drm/amdgpu: Optimize xxx_ras_late_init function of each ras block 1. Move calling ras block instance members from module internal function to the top calling xxx_ras_late_init. 2. Module internal function calls can only use parameter variables of xxx_ras_late_init instead of ras block instance members. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
yipechai	20c43547ad	drm/amdgpu: Remove redundant calls of ras_late_init in mca ras block Remove redundant calls of ras_late_init in mca ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
yipechai	068001b711	drm/amdgpu: Remove redundant calls of ras_late_init in mmhub ras block Remove redundant calls of ras_late_init in mmhub ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:04 -05:00
yipechai	72b3588e27	drm/amdgpu: Remove redundant calls of ras_late_init in hdp ras block Remove redundant calls of ras_late_init in hdp ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:04 -05:00
yipechai	4e9b1fa5a2	drm/amdgpu: Modify .ras_late_init function pointer parameter Modify .ras_late_init function pointer parameter so that it can remove redundant intermediate calls in some ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:04 -05:00
Prike Liang	f83e14011e	drm/amdgpu/discovery: Add sw DM function for 3.1.6 DCE Add 3.1.6 DCE IP and assign relevant sw DM function for the new DCE. Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:04 -05:00
Prike Liang	a65dbf7cde	drm/amdgpu/gfx10: Add GC 10.3.7 Support Needed to properly initialize GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	967af863f2	drm/amdgpu/sdma5.2: add support for SDMA 5.2.7 Initialize SDMA engine firmware loading. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	db090ff8f9	drm/amd/pm: Add support for MP1 13.0.8 Set smu sw function and enable swSMU support for MP1. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	f99a7eb2d1	drm/amdgpu/psp: Add support for MP0 13.0.8 Set psp sw funcs callback and firmware loading for MP0. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	97437f475c	drm/amdgpu/gmc10: add support for GC 10.3.7 Set gfxhub function and configure VM for GC block. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Sathishkumar S	35c27d9578	drm/amdgpu: update vcn/jpeg PG flags for VCN 3.1.1 update vcn and jpeg power gating flags for VCN 3.1.1 Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	b67f00e06f	drm/amdgpu: set new revision id for 10.3.7 GC Add new revision ID for GC 10.3.7 and set cg/pg flags. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	2fbc508697	drm/amdgpu/discovery: set sw common init for GC 10.3.7 Set nv_common_ip_block for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Prike Liang	2019bf7cd2	drm/amdgpu/discovery: Add 13.0.9 SMUIO block Add SMUIO sw function for the new SMUIO block. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Prike Liang	01cbf049e1	drm/amdgpu/discovery: add nbio sw func for 7.5.1 nbio add nbio sw func for the new 7.5.1 nbio block. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Alex Deucher	dfcc3e8c24	drm/amdgpu: make cyan skillfish support code more consistent Since this is an existing asic, adjust the code to follow the same logic as previously so the driver state is consistent. No functional change intended. Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Victor Skvortsov	aa79d3808e	drm/amdgpu: Fix wait for RLCG command completion if (!(tmp & flag)) condition will always evaluate to true when the flag is 0x0 (AMDGPU_RLCG_GC_WRITE). Instead check that address bits are cleared to determine whether the command is complete. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Tested-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Luben Tuikov	4d7ba312dd	drm/amdgpu: Add "harvest" to IP discovery sysfs Add the "harvest" field to the IP attributes in the IP discovery sysfs visualization, as this field is present in the binary data. At the time of this commit, the harvest data isn't consistently correct in VBIOS, but it is exposed for completeness, in the hopes that VBIOS will be fixed in the future. Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:02:12 -05:00
Evan Quan	e506db5905	drm/amdgpu: disable MMHUB PG for Picasso MMHUB PG needs to be disabled for Picasso for stability reasons. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:01:28 -05:00
Alex Deucher	92ede25ece	drm/amdgpu/sdma5.2: Adjust the name string for firmware This will make it easier to add new firmwares in the future. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 16:44:40 -05:00
Tom Rix	eed1a5c742	drm/amdgpu: check return status before using stable_pstate Clang static analysis reports this problem amdgpu_ctx.c:616:26: warning: Assigned value is garbage or undefined args->out.pstate.flags = stable_pstate; ^ ~~~~~~~~~~~~~ amdgpu_ctx_stable_pstate can fail without setting stable_pstate. So check. Fixes: `8cda7a4f96` ("drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 16:44:40 -05:00
Surbhi Kakarya	7258fa31ea	drm/amdgpu: Handle the GPU recovery failure in SRIOV environment. This patch handles the GPU recovery failure in sriov environment by retrying the reset if the first reset fails. To determine the condition of retry, a new macro AMDGPU_RETRY_SRIOV_RESET is added which returns true if failure is due to ETIMEDOUT, EINVAL or EBUSY, otherwise return false.A new macro AMDGPU_MAX_RETRY_LIMIT is used to limit the retry to 2. It also handles the return status in Post Asic Reset by updating the return code with asic_reset_res and eventually return the return code in amdgpu_job_timedout(). Signed-off-by: Surbhi Kakarya <surbhi.kakarya@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
Stanley.Yang	1ec1944eb5	drm/amdgpu: print more error info print more error info when deferred uncorrectable ras error changed from V1: move Defferred error msg into query uncorrectable error count function. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	563285c85e	drm/amdgpu: Merge amdgpu_ras_late_init/amdgpu_ras_late_fini to amdgpu_ras_block_late_init/amdgpu_ras_block_late_fini 1. Merge amdgpu_ras_late_init to amdgpu_ras_block_late_init. 2. Remove amdgpu_ras_late_init since no ras block calls amdgpu_ras_late_init. 3. Merge amdgpu_ras_late_fini to amdgpu_ras_block_late_fini. 4. Remove amdgpu_ras_late_fini since no ras block calls amdgpu_ras_late_fini. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	9252d33df5	drm/amdgpu: Optimize operating sysfs and interrupt function interface in amdgpu_ras.c In order to reduce redundant struct conversion, modify operating sysfs and interrupt function interface parameters. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	892a57a975	drm/amdgpu: Optimize amdgpu_xgmi_ras_late_init/amdgpu_xgmi_ras_fini function code Optimize amdgpu_xgmi_ras_late_init/amdgpu_xgmi_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	a3ace75cdb	drm/amdgpu: Optimize amdgpu_umc_ras_late_init/amdgpu_umc_ras_fini function code Optimize amdgpu_umc_ras_late_init/amdgpu_umc_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	683bac6b00	drm/amdgpu: Optimize amdgpu_sdma_ras_late_init/amdgpu_sdma_ras_fini function code Optimize amdgpu_sdma_ras_late_init/amdgpu_sdma_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	80ed77f971	drm/amdgpu: Optimize amdgpu_nbio_ras_late_init/amdgpu_nbio_ras_fini function code Optimize amdgpu_nbio_ras_late_init/amdgpu_nbio_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	cb9561d0e3	drm/amdgpu: Optimize amdgpu_mmhub_ras_late_init/amdgpu_mmhub_ras_fini function code Optimize amdgpu_mmhub_ras_late_init/amdgpu_mmhub_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	88bc3cd845	drm/amdgpu: Optimize amdgpu_mca_ras_late_init/amdgpu_mca_ras_fini function code Optimize amdgpu_mca_ras_late_init/amdgpu_mca_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	634b56b0f8	drm/amdgpu: Optimize amdgpu_hdp_ras_late_init/amdgpu_hdp_ras_fini function code Optimize amdgpu_hdp_ras_late_init/amdgpu_hdp_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	311065086e	drm/amdgpu: Optimize amdgpu_gfx_ras_late_init/amdgpu_gfx_ras_fini function code Optimize amdgpu_gfx_ras_late_init/amdgpu_gfx_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
yipechai	bdb3489cfc	drm/amdgpu: Optimize xxx_ras_late_init/xxx_ras_late_fini for each ras block 1. Define amdgpu_ras_block_late_init to create sysfs nodes and interrupt handles. 2. Define amdgpu_ras_block_late_fini to remove sysfs nodes and interrupt handles. 3. Replace ras block variable members in struct amdgpu_ras_block_object with struct ras_common_if, which can make it easy to associate each ras block instance with each ras block functional interface. 4. Add .ras_cb to struct amdgpu_ras_block_object. 5. Change each ras block to fit for the changement of struct amdgpu_ras_block_object. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
Guchun Chen	22b1df28c0	drm/amdgpu: no rlcg legacy read in SRIOV case rlcg legacy read is not available in SRIOV configration. Otherwise, gmc_v9_0_flush_gpu_tlb will always complain timeout and finally breaks driver load. v2: bypass read in amdgpu_virt_get_rlcg_reg_access_flag (from Victor) Fixes: `97d1a3b967` ("drm/amdgpu: switch to get_rlcg_reg_access_flag for gfx9") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Victor Skvortsov <Victor.Skvortsov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
Rajneesh Bhardwaj	7157934699	drm/amdgpu: Fix a kerneldoc warning Add missing parameters to fix a kerneldoc warning Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
Luben Tuikov	a6c40b1780	drm/amdgpu: Show IP discovery in sysfs Add IP discovery data in sysfs. The format is: /sys/class/drm/cardX/device/ip_discovery/die/D/B/I/<attrs> where, X is the card ID, an integer, D is the die ID, an integer, B is the IP HW ID, an integer, aka block type, I is the IP HW ID instance, an integer. <attrs> are the attributes of the block instance. At the moment these include HW ID, instance number, major, minor, revision, number of base addresses, and the base addresses themselves. A symbolic link of the acronym HW ID is also created, under D/, if you prefer to browse by something humanly accessible. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Tom StDenis <tom.stdenis@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
Rajneesh Bhardwaj	77608faa77	drm/amdgpu: Fix some kerneldoc warnings Fix few kerneldoc warnings and one typo. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
Rajib Mahapatra	f8f4e2a518	drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix. [Why] SDMA ring buffer test failed if suspend is aborted during S0i3 resume. [How] If suspend is aborted for some reason during S0i3 resume cycle, it follows SDMA ring test failing and errors in amdgpu resume. For RN/CZN/Picasso, SMU saves and restores SDMA registers during S0ix cycle. So, skipping SDMA suspend and resume from driver solves the issue. This time, the system is able to resume gracefully even the suspend is aborted. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajib Mahapatra <rajib.mahapatra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-14 14:59:46 -05:00
Christian König	7db47b8388	drm/amdgpu: remove VRAM accounting v2 This is provided by TTM now. Also switch man->size to bytes instead of pages and fix the double printing of size and usage in debugfs. v2: fix size checking as well Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220214093439.2989-8-christian.koenig@amd.com	2022-02-14 15:05:39 +01:00
Christian König	3fc2b087df	drm/amdgpu: remove PL_PREEMPT accounting This is provided by TTM now. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220214093439.2989-7-christian.koenig@amd.com	2022-02-14 15:05:39 +01:00
Christian König	dfa714b88e	drm/amdgpu: remove GTT accounting v2 This is provided by TTM now. Also switch man->size to bytes instead of pages and fix the double printing of size and usage in debugfs. v2: fix size checking as well Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220214093439.2989-6-christian.koenig@amd.com	2022-02-14 15:05:39 +01:00
Dave Airlie	123db17ddf	Merge tag 'amd-drm-next-5.18-2022-02-11-1' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.18-2022-02-11-1: amdgpu: - Clean up of power management code - Enable freesync video mode by default - Clean up of RAS code - Improve VRAM access for debug using SDMA - Coding style cleanups - SR-IOV fixes - More display FP reorg - TLB flush fixes for Arcuturus, Vega20 - Misc display fixes - Rework special register access methods for SR-IOV - DP2 fixes - DP tunneling fixes - DSC fixes - More IP discovery cleanups - Misc RAS fixes - Enable both SMU i2c buses where applicable - s2idle improvements - DPCS header cleanup - Add new CAP firmware support for SR-IOV amdkfd: - Misc cleanups - SVM fixes - CRIU support - Clean up MQD manager UAPI: - Add interface to amdgpu CTX ioctl to request a stable power state for profiling https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207 - Add amdkfd support for CRIU https://github.com/checkpoint-restore/criu/pull/1709 - Remove old unused amdkfd debugger interface Was only implemented for Kaveri and was only ever used by an old HSA tool that was never open sourced radeon: - Fix error handling in radeon_driver_open_kms - UVD suspend fix - Misc fixes From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220211220706.5803-1-alexander.deucher@amd.com	2022-02-14 10:31:51 +10:00
Stanley.Yang	1915a43395	drm/amdgpu: adjust register address calculation the UMC_STATUS register is not linear, adjust offset calculation formula to get correct address Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:19:59 -05:00
Rajib Mahapatra	f3986e86b2	drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix. [Why] SDMA ring buffer test failed if suspend is aborted during S0i3 resume. [How] If suspend is aborted for some reason during S0i3 resume cycle, it follows SDMA ring test failing and errors in amdgpu resume. For RN/CZN/Picasso, SMU saves and restores SDMA registers during S0ix cycle. So, skipping SDMA suspend and resume from driver solves the issue. This time, the system is able to resume gracefully even the suspend is aborted. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajib Mahapatra <rajib.mahapatra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:19:41 -05:00
Ken Xue	461fa7b0ac	drm/amdgpu: remove ctx->lock KMD reports a warning on holding a lock from drm_syncobj_find_fence, when running amdgpu_test case “syncobj timeline test”. ctx->lock was designed to prevent concurrent "amdgpu_ctx_wait_prev_fence" calls and avoid dead reservation lock from GPU reset. since no reservation lock is held in latest GPU reset any more, ctx->lock can be simply removed and concurrent "amdgpu_ctx_wait_prev_fence" call also can be prevented by PD root bo reservation lock. call stacks: ================= //hold lock amdgpu_cs_ioctl->amdgpu_cs_parser_init->mutex_lock(&parser->ctx->lock); … //report warning amdgpu_cs_dependencies->amdgpu_cs_process_syncobj_timeline_in_dep \ ->amdgpu_syncobj_lookup_and_add_to_sync -> drm_syncobj_find_fence \ -> lockdep_assert_none_held_once … amdgpu_cs_ioctl->amdgpu_cs_parser_fini->mutex_unlock(&parser->ctx->lock); Signed-off-by: Ken Xue <Ken.Xue@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:19:23 -05:00
Stanley.Yang	8bbd4d83a6	drm/amdgpu: Reset OOB table error count info The OOB table error count info should be reset after reset eeprom table Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:13:00 -05:00
Tao Zhou	69f915cc97	drm/amdgpu: loose check for umc poison mode No need to check poison setting for each channel, check for umc0 channel0 is enough. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:12:07 -05:00
Lang Yu	f9ed188d5a	drm/amdgpu: add support for GC 10.1.4 Add basic support for GC 10.1.4, it uses same IP blocks with GC 10.1.3 Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:11:55 -05:00
Andrey Grodzovsky	c7703ce38c	drm/amdgpu: Fix htmldoc warning Update function name. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220211205500.601391-1-andrey.grodzovsky@amd.com	2022-02-11 16:05:08 -05:00
Andrey Grodzovsky	f5666d4823	drm/amdgpu: Fix compile error. Seems I forgot to add this to the relevant commit when submitting. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220210031724.440943-1-andrey.grodzovsky@amd.com	2022-02-10 10:23:40 +01:00
Yang Wang	63b5fa9dbb	drm/amdgpu: fix gmc init fail in sriov mode "adev->gfx.rlc.rlcg_reg_access_supported = true;" the above varible were set too late during driver initialization. it will cause the driver to fail to write/read register during GMC hw init in sriov mode. move gfx_xxx_init_rlcg_reg_access_ctrl() function to gfx early init stage to avoid this issue. Fixes: `5d447e2967` ("drm/amdgpu: add helper for rlcg indirect reg access") Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-09 16:57:52 -05:00
zhanglianjie	db7b81545f	drm/amd/amdgpu/amdgpu_uvd: Fix forgotten unmap buffer object After the buffer object is successfully mapped, call amdgpu_bo_kunmap before the function returns. Signed-off-by: zhanglianjie <zhanglianjie@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-09 16:57:52 -05:00
Mukul Joshi	5bdd3eb253	drm/amdkfd: Remove unused old debugger implementation Cleanup the kfd code by removing the unused old debugger implementation. The address watch was only ever implemented in the upstream driver for GFXv7 (Kaveri). The user mode tools runtime using this API was never open-sourced. Work on the old debugger prototype that used this API has been discontinued years ago. Only a small piece of resetting wavefronts is kept and is moved to kfd_device_queue_manager.c. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-09 16:57:51 -05:00
Aaron Liu	a072312f43	drm/amdgpu: add utcl2_harvest to gc 10.3.1 Confirmed with hardware team, there is harvesting for gc 10.3.1. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-09 15:08:05 -05:00
Andrey Grodzovsky	3675c2f26f	drm/amdgpu: Revert 'drm/amdgpu: annotate a false positive recursive locking' Since we have a single instance of reset semaphore which we lock only once even for XGMI hive we don't need the nested locking hint anymore. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74120.html	2022-02-09 12:19:14 -05:00
Andrey Grodzovsky	e923be9934	drm/amdgpu: Rework amdgpu_device_lock_adev This functions needs to be split into 2 parts where one is called only once for locking single instance of reset_domain's sem and reset flag and the other part which handles MP1 states should still be called for each device in XGMI hive. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74118.html	2022-02-09 12:18:39 -05:00
Andrey Grodzovsky	89a7a87093	drm/amdgpu: Move in_gpu_reset into reset_domain We should have a single instance per entrire reset domain. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74116.html	2022-02-09 12:17:57 -05:00
Andrey Grodzovsky	d0fb18b535	drm/amdgpu: Move reset sem into reset_domain We want single instance of reset sem across all reset clients because in case of XGMI we should stop access cross device MMIO because any of them could be in a reset in the moment. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74117.html	2022-02-09 12:17:32 -05:00
Andrey Grodzovsky	cfbb6b0047	drm/amdgpu: Rework reset domain to be refcounted. The reset domain contains register access semaphor now and so needs to be present as long as each device in a hive needs it and so it cannot be binded to XGMI hive life cycle. Adress this by making reset domain refcounted and pointed by each member of the hive and the hive itself. v4: Fix crash on boot witrh XGMI hive by adding type to reset_domain. XGMI will only create a new reset_domain if prevoius was of single device type meaning it's first boot. Otherwsie it will take a refocunt to exsiting reset_domain from the amdgou device. Add a wrapper around reset_domain->refcount get/put and a wrapper around send to reset wq (Lijo) Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74121.html	2022-02-09 12:17:09 -05:00
Andrey Grodzovsky	f287a3c5b0	drm/amdgpu: Drop concurrent GPU reset protection for device Since now all GPU resets are serialzied there is no need for this. This patch also reverts 'drm/amdgpu: race issue when jobs on 2 ring timeout' Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74119.html	2022-02-09 12:16:53 -05:00
Andrey Grodzovsky	681260df4d	drm/amdgpu: Drop hive->in_reset Since we serialize all resets no need to protect from concurrent resets. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74115.html	2022-02-09 12:16:29 -05:00
Andrey Grodzovsky	02599bc7f7	drm/amd/virt: For SRIOV send GPU reset directly to TDR queue. No need to to trigger another work queue inside the work queue. v3: Problem: Extra reset caused by host side FLR notification following guest side triggered reset. Fix: Preven qeuing flr_work from mailbox irq if guest already executing a reset. Suggested-by: Liu Shaoyun <Shaoyun.Liu@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Liu Shaoyun <Shaoyun.Liu@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74114.html	2022-02-09 12:16:06 -05:00
Andrey Grodzovsky	54f329cc7a	drm/amdgpu: Serialize non TDR gpu recovery with TDRs Use reset domain wq also for non TDR gpu recovery trigers such as sysfs and RAS. We must serialize all possible GPU recoveries to gurantee no concurrency there. For TDR call the original recovery function directly since it's already executed from within the wq. For others just use a wrapper to qeueue work and wait on it to finish. v2: Rename to amdgpu_recover_work_struct Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74113.html	2022-02-09 12:15:23 -05:00
Andrey Grodzovsky	5fd8518d18	drm/amdgpu: Move scheduler init to after XGMI is ready Before we initialize schedulers we must know which reset domain are we in - for single device there iis a single domain per device and so single wq per device. For XGMI the reset domain spans the entire XGMI hive and so the reset wq is per hive. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74112.html	2022-02-09 12:15:04 -05:00
Andrey Grodzovsky	a4c63cafa5	drm/amdgpu: Introduce reset domain Defined a reset_domain struct such that all the entities that go through reset together will be serialized one against another. Do it for both single device and XGMI hive cases. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Suggested-by: Christian König <ckoenig.leichtzumerken@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74111.html	2022-02-09 12:14:32 -05:00
Christian König	e09b9aef68	drm/amdgpu: use dma_fence_chain_contained Instead of manually extracting the fence. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220204100429.2049-7-christian.koenig@amd.com	2022-02-08 09:25:40 +01:00
Alex Deucher	3786a9bc04	drm/amdgpu: drop experimental flag on aldebaran These have been at production level for a while. Drop the flag. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:03:50 -05:00
Christian König	b6fba4ecf3	drm/amdgpu: reserve the pd while cleaning up PRTs We want to have lockdep annotation here, so make sure that we reserve the PD while removing PRTs even if it isn't strictly necessary since the VM object is about to be destroyed anyway. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:03:50 -05:00
Christian König	d7d7ddc156	drm/amdgpu: move lockdep assert to the right place. Since newly added BOs don't have any mappings it's ok to add them without holding the VM lock. Only when we add per VM BOs the lock is mandatory. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-by: Bhardwaj, Rajneesh <Rajneesh.Bhardwaj@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:03:50 -05:00
Aaron Liu	29ba7b16b9	drm/amdgpu: check the GART table before invalidating TLB Bypass group programming (utcl2_harvest) aims to forbid UTCL2 to send invalidation command to harvested SE/SA. Once invalidation command comes into harvested SE/SA, SE/SA has no response and system hang. This patch is to add checking if the GART table is already allocated before invalidating TLB. The new procedure is as following: 1. Calling amdgpu_gtt_mgr_init() in amdgpu_ttm_init(). After this step GTT BOs can be allocated, but GART mappings are still ignored. 2. Calling amdgpu_gart_table_vram_alloc() from the GMC code. This allocates the GART backing store. 3. Initializing the hardware, and programming the backing store into VMID0 for all VMHUBs. 4. Calling amdgpu_gtt_mgr_recover() to make sure the table is updated with the GTT allocations done before it was allocated. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:01:16 -05:00
Aaron Liu	6d53b115be	drm/amdgpu: add utcl2_harvest to gc 10.3.1 Confirmed with hardware team, there is harvesting for gc 10.3.1. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:01:16 -05:00
Tao Zhou	4e781873fa	drm/amdgpu: fix list add issue in vram reserve The parameter order in the list_add_tail is incorrect, it causes the reuse of ras reserved page. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:01:16 -05:00
yipechai	a50b048276	Revert "drm/amdgpu: Add judgement to avoid infinite loop" The commit `d5e8ff5f7b` ("drm/amdgpu: Fixed the defect of soft lock caused by infinite loop") had fixed this defect. Revert workaround commit `a2170b4af6` ("drm/amdgpu: Add judgement to avoid infinite loop"). Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:00:05 -05:00
yipechai	d5e8ff5f7b	drm/amdgpu: Fixed the defect of soft lock caused by infinite loop 1. The infinite loop case only occurs on multiple cards support ras functions. 2. The explanation of root cause refer to commit 76641cbbf196 ("drm/amdgpu: Add judgement to avoid infinite loop"). 3. Create new node to manage each unique ras instance to guarantee each device .ras_list is completely independent. 4. Fixes: commit 7a6b8ab3231b51 ("drm/amdgpu: Unify ras block interface for each ras block"). 5. The soft locked logs are as follows: [ 262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G OE 5.13.0-27-generic #29~20.04.1-Ubuntu [ 262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020 [ 262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [ 262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu] [ 262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45 [ 262.166243] RSP: 0018:ffffac908fa87d80 EFLAGS: 00000202 [ 262.166247] RAX: ffffffffc1394248 RBX: ffff91e4ab8d6e20 RCX: ffffffffc1394248 [ 262.166249] RDX: ffff91e4aa356e20 RSI: 000000000000000e RDI: ffff91e4ab8c0000 [ 262.166252] RBP: ffffac908fa87da8 R08: 0000000000000007 R09: 0000000000000001 [ 262.166254] R10: ffff91e4930b64ec R11: 0000000000000000 R12: 000000000000000e [ 262.166256] R13: ffff91e4aa356df8 R14: ffffffffc1394320 R15: 0000000000000003 [ 262.166258] FS: 0000000000000000(0000) GS:ffff92238fb40000(0000) knlGS:0000000000000000 [ 262.166261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 262.166264] CR2: 00000001004865d0 CR3: 000000406d796000 CR4: 0000000000350ee0 [ 262.166267] Call Trace: [ 262.166272] amdgpu_ras_do_recovery+0x130/0x290 [amdgpu] [ 262.166529] ? psi_task_switch+0xd2/0x250 [ 262.166537] ? __switch_to+0x11d/0x460 [ 262.166542] ? __switch_to_asm+0x36/0x70 [ 262.166549] process_one_work+0x220/0x3c0 [ 262.166556] worker_thread+0x4d/0x3f0 [ 262.166560] ? process_one_work+0x3c0/0x3c0 [ 262.166563] kthread+0x12b/0x150 [ 262.166568] ? set_kthread_struct+0x40/0x40 [ 262.166571] ret_from_fork+0x22/0x30 Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Luben Tuikov	00d6936dbd	drm/amdgpu: Set FRU bus for Aldebaran and Vega 20 The FRU and RAS EEPROMs share the same I2C bus on Aldebaran and Vega 20 ASICs. Set the FRU bus "pointer" to this single bus, as access to the FRU is sought through that bus "pointer" and not through the RAS bus "pointer". Cc: Roy Sun <Roy.Sun@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Fixes: `2f60dd5076` ("drm/amd: Expose the FRU SMU I2C bus") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Rajneesh Bhardwaj	447c7997b6	drm/amdgpu: Fix recursive locking warning Noticed the below warning while running a pytorch workload on vega10 GPUs. Change to trylock to avoid conflicts with already held reservation locks. [ +0.000003] WARNING: possible recursive locking detected [ +0.000003] 5.13.0-kfd-rajneesh #1030 Not tainted [ +0.000004] -------------------------------------------- [ +0.000002] python/4822 is trying to acquire lock: [ +0.000004] ffff932cd9a259f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000203] but task is already holding lock: [ +0.000003] ffff932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_eu_reserve_buffers+0x270/0x470 [ttm] [ +0.000017] other info that might help us debug this: [ +0.000002] Possible unsafe locking scenario: [ +0.000003] CPU0 [ +0.000002] ---- [ +0.000002] lock(reservation_ww_class_mutex); [ +0.000004] lock(reservation_ww_class_mutex); [ +0.000003] * DEADLOCK * [ +0.000002] May be due to missing lock nesting notation [ +0.000003] 7 locks held by python/4822: [ +0.000003] #0: ffff932c4ac028d0 (&process->mutex){+.+.}-{3:3}, at: kfd_ioctl_map_memory_to_gpu+0x10b/0x320 [amdgpu] [ +0.000232] #1: ffff932c55e830a8 (&info->lock#2){+.+.}-{3:3}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x64/0xf60 [amdgpu] [ +0.000241] #2: ffff932cc45b5e68 (&(mem)->lock){+.+.}-{3:3}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0xdf/0xf60 [amdgpu] [ +0.000236] #3: ffffb2b35606fd28 (reservation_ww_class_acquire){+.+.}-{0:0}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x232/0xf60 [amdgpu] [ +0.000235] #4: ffff932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_eu_reserve_buffers+0x270/0x470 [ttm] [ +0.000015] #5: ffffffffc045f700 ((sspp++)){....}-{0:0}, at: drm_dev_enter+0x5/0xa0 [drm] [ +0.000038] #6: ffff932c52da7078 (&vm->eviction_lock){+.+.}-{3:3}, at: amdgpu_vm_bo_update_mapping+0xd5/0x4f0 [amdgpu] [ +0.000195] stack backtrace: [ +0.000003] CPU: 11 PID: 4822 Comm: python Not tainted 5.13.0-kfd-rajneesh #1030 [ +0.000005] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F02 08/29/2018 [ +0.000003] Call Trace: [ +0.000003] dump_stack+0x6d/0x89 [ +0.000010] __lock_acquire+0xb93/0x1a90 [ +0.000009] lock_acquire+0x25d/0x2d0 [ +0.000005] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000184] ? lock_is_held_type+0xa2/0x110 [ +0.000006] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000184] __ww_mutex_lock.constprop.17+0xca/0x1060 [ +0.000007] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000183] ? lock_release+0x13f/0x270 [ +0.000005] ? lock_is_held_type+0xa2/0x110 [ +0.000006] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000183] amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000185] ttm_bo_release+0x4c6/0x580 [ttm] [ +0.000010] amdgpu_bo_unref+0x1a/0x30 [amdgpu] [ +0.000183] amdgpu_vm_free_table+0x76/0xa0 [amdgpu] [ +0.000189] amdgpu_vm_free_pts+0xb8/0xf0 [amdgpu] [ +0.000189] amdgpu_vm_update_ptes+0x411/0x770 [amdgpu] [ +0.000191] amdgpu_vm_bo_update_mapping+0x324/0x4f0 [amdgpu] [ +0.000191] amdgpu_vm_bo_update+0x251/0x610 [amdgpu] [ +0.000191] update_gpuvm_pte+0xcc/0x290 [amdgpu] [ +0.000229] ? amdgpu_vm_bo_map+0xd7/0x130 [amdgpu] [ +0.000190] amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x912/0xf60 [amdgpu] [ +0.000234] kfd_ioctl_map_memory_to_gpu+0x182/0x320 [amdgpu] [ +0.000218] kfd_ioctl+0x2b9/0x600 [amdgpu] [ +0.000216] ? kfd_ioctl_unmap_memory_from_gpu+0x270/0x270 [amdgpu] [ +0.000216] ? lock_release+0x13f/0x270 [ +0.000006] ? __fget_files+0x107/0x1e0 [ +0.000007] __x64_sys_ioctl+0x8b/0xd0 [ +0.000007] do_syscall_64+0x36/0x70 [ +0.000004] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.000007] RIP: 0033:0x7fbff90a7317 [ +0.000004] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48 [ +0.000005] RSP: 002b:00007fbe301fe648 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ +0.000006] RAX: ffffffffffffffda RBX: 00007fbcc402d820 RCX: 00007fbff90a7317 [ +0.000003] RDX: 00007fbe301fe690 RSI: 00000000c0184b18 RDI: 0000000000000004 [ +0.000003] RBP: 00007fbe301fe690 R08: 0000000000000000 R09: 00007fbcc402d880 [ +0.000003] R10: 0000000002001000 R11: 0000000000000246 R12: 00000000c0184b18 [ +0.000003] R13: 0000000000000004 R14: 00007fbf689593a0 R15: 00007fbcc402d820 Cc: Christian König <christian.koenig@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Luben Tuikov	00b14ce075	drm/amdgpu: Prevent random memory access in FRU code Prevent random memory access in the FRU EEPROM code by passing the size of the destination buffer to the reading routine, and reading no more than the size of the buffer. Cc: Kent Russell <kent.russell@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Luben Tuikov	3f3a24a0a3	drm/amdgpu: Don't offset by 2 in FRU EEPROM Read buffers no longer expose the I2C address, and so we don't need to offset by two when we get the read data. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Kent Russell <kent.russell@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Fixes: `bd607166af` ("drm/amdgpu: Enable reading FRU chip via I2C v3") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Luben Tuikov	3f1e2e9d99	drm/amdgpu: Nerf "buff" to "buf" Buffer is abbreviated "buf" (buf-fer), not "buff" (buff-er). This is consistent with the rest of the kernel code. Cc: Kent Russell <kent.russell@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Rajneesh Bhardwaj	011bbb0302	drm/amdkfd: CRIU Implement KFD resume ioctl This adds support to create userptr BOs on restore and introduces a new ioctl op to restart memory notifiers for the restored userptr BOs. When doing CRIU restore MMU notifications can happen anytime after we call amdgpu_mn_register. Prevent MMU notifications until we reach stage-4 of the restore process i.e. criu_resume ioctl op is received, and the process is ready to be resumed. This ioctl is different from other KFD CRIU ioctls since its called by CRIU master restore process for all the target processes being resumed by CRIU. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: David Yat Sin <david.yatsin@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:41 -05:00
Rajneesh Bhardwaj	5ccbb057c0	drm/amdkfd: CRIU Implement KFD checkpoint ioctl This adds support to discover the buffer objects that belong to a process being checkpointed. The data corresponding to these buffer objects is returned to user space plugin running under criu master context which then stores this info to recreate these buffer objects during a restore operation. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: David Yat Sin <david.yatsin@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:29 -05:00
Luben Tuikov	afa3731591	drm/amdgpu: Print once if RAS unsupported MESA polls for errors every 2-3 seconds. Printing with dev_info() causes the dmesg log to fill up with the same message, e.g, [18028.206676] amdgpu 0000:0b:00.0: amdgpu: df doesn't config ras function. Make it dev_dbg_once(), as it isn't something correctible during boot or thereafter, so printing just once is sufficient. Also sanitize the message. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Tao Zhou <tao.zhou1@amd.com> Cc: yipechai <YiPeng.Chai@amd.com> Fixes: `8b0fb0e967` ("drm/amdgpu: Modify gfx block to fit for the unified ras block data and ops") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:14:16 -05:00
Christian König	e56694f718	drm/amdgpu: rename amdgpu_vm_bo_rmv to _del Some people complained about the name and this matches much more Linux naming conventions for object functions. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:14:10 -05:00
Christian König	2d022081b3	drm/amdgpu: add some lockdep checks to the VM code Whenever a bo_va structure is added or removed the VM and eventually added BO should be locked. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:13:52 -05:00
Lucas De Marchi	b8c75bd974	drm: Convert open-coded yes/no strings to yesno() linux/string_helpers.h provides a helper to return "yes"/"no" strings. Replace the open coded versions with str_yes_no(). The places were identified with the following semantic patch: @@ expression b; @@ - b ? "yes" : "no" + str_yes_no(b) Then the includes were added, so we include-what-we-use, and parenthesis adjusted in drivers/gpu/drm/v3d/v3d_debugfs.c. After the conversion we still see the same binary sizes: text data bss dec hex filename 51149 3295 212 54656 d580 virtio/virtio-gpu.ko.old 51149 3295 212 54656 d580 virtio/virtio-gpu.ko 1441491 60340 800 1502631 16eda7 radeon/radeon.ko.old 1441491 60340 800 1502631 16eda7 radeon/radeon.ko 6125369 328538 34000 6487907 62ff63 amd/amdgpu/amdgpu.ko.old 6125369 328538 34000 6487907 62ff63 amd/amdgpu/amdgpu.ko 411986 10490 6176 428652 68a6c drm.ko.old 411986 10490 6176 428652 68a6c drm.ko 98129 1636 264 100029 186bd dp/drm_dp_helper.ko.old 98129 1636 264 100029 186bd dp/drm_dp_helper.ko 1973432 109640 2352 2085424 1fd230 nouveau/nouveau.ko.old 1973432 109640 2352 2085424 1fd230 nouveau/nouveau.ko Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220126093951.1470898-10-lucas.demarchi@intel.com	2022-02-07 13:04:25 -08:00
Maarten Lankhorst	542898c5aa	Merge remote-tracking branch 'drm/drm-next' into drm-misc-next First backmerge into drm-misc-next. Required for more helpers backmerged, and to pull in 5.17 (rc2). Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2022-02-07 17:03:24 +01:00
Christian König	e8ae38720e	drm/amdgpu: fix logic inversion in check We probably never trigger this, but the logic inside the check is inverted. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:35:00 -05:00
Mario Limonciello	e55a3aea41	drm/amd: avoid suspend on dGPUs w/ s2idle support when runtime PM enabled dGPUs connected to Intel systems configured for suspend to idle will not have the power rails cut at suspend and resetting the GPU may lead to problematic behaviors. Fixes: `e25443d276` ("drm/amdgpu: add a dev_pm_ops prepare callback (v2)") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1879 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:35:00 -05:00
Lang Yu	bca52455a3	drm/amdgpu: fix a potential GPU hang on cyan skillfish We observed a GPU hang when querying GMC CG state(i.e., cat amdgpu_pm_info) on cyan skillfish. Acctually, cyan skillfish doesn't support any CG features. Just prevent it from accessing GMC CG registers. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-02 18:35:00 -05:00
Mario Limonciello	04ef860469	drm/amd: Only run s3 or s0ix if system is configured properly This will cause misconfigured systems to not run the GPU suspend routines. * In APUs that are properly configured system will go into s2idle. * In APUs that are intended to be S3 but user selects s2idle the GPU will stay fully powered for the suspend. * In APUs that are intended to be s2idle and system misconfigured the GPU will stay fully powered for the suspend. * In systems that are intended to be s2idle, but AMD dGPU is also present, the dGPU will go through S3 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:35:00 -05:00
Mario Limonciello	f52a2b8bad	drm/amd: add support to check whether the system is set to s3 This will be used to help make decisions on what to do in misconfigured systems. v2: squash in semicolon fix from Stephen Rothwell Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:35:00 -05:00
Somalapuram Amaranath	4f860edecd	drm/amdgpu: limit the number of dst address in trace trace_amdgpu_vm_update_ptes trace unable to log when nptes too large Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:27:52 -05:00
Mario Limonciello	9308a49d8e	drm/amd: avoid suspend on dGPUs w/ s2idle support when runtime PM enabled dGPUs connected to Intel systems configured for suspend to idle will not have the power rails cut at suspend and resetting the GPU may lead to problematic behaviors. Fixes: `e25443d276` ("drm/amdgpu: add a dev_pm_ops prepare callback (v2)") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1879 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:27:01 -05:00
Christian König	22f7cc7524	drm/amdgpu: restructure amdgpu_fill_buffer v2 We ran into the problem that clearing really larger buffer (60GiB) caused an SDMA timeout. Restructure the function to use the dst window instead of mapping the whole buffer into the GART and then fill only 2MiB/256MiB chunks at a time. v2: rebase on restructured window map. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:55 -05:00
Christian König	6927913d70	drm/amdgpu: rework GART copy window handling Instead of limiting the size before we call the mapping function let the function itself limit the size. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:50 -05:00
Christian König	e0a4459d45	drm/amdgpu: lower BUG_ON into WARN_ON for AMDGPU_PL_PREEMPT That should never happen, but make sure that we only warn instead of crash. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:44 -05:00
Christian König	fcd6b0e270	drm/amdgpu: fix logic inversion in check We probably never trigger this, but the logic inside the check is inverted. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:32 -05:00
Guchun Chen	274b924c3e	drm/amdgpu: drop flood print in rlcg reg access function A lot of below message are outputed in SRIOV case. amdgpu: indirect registers access through rlcg is not supported Also drop redundant ret set, as it's initialized to be false already. Fixes: `29dbcac82f` ("drm/amdgpu: add helper to query rlcg reg access flag") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:31 -05:00
Lijo Lazar	889f84798c	drm/amdgpu: Fix uninitialized variable use warning Fix uninitialized variable use warning: variable 'reg_access_ctrl' is uninitialized when used here [-Wuninitialized] scratch_reg0 = (void __iomem )adev->rmmio + 4 reg_access_ctrl->scratch_reg0; Fixes: `5d447e2967` ("drm/amdgpu: add helper for rlcg indirect reg access") Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:31 -05:00
yipechai	a2170b4af6	drm/amdgpu: Add judgement to avoid infinite loop 1. The infinite loop causing soft lock occurs on multiple amdgpu cards supporting ras feature. 2. This a workaround patch to fix `6492e1b07c`. It is valid for multiple amdgpu cards of the same type. 3. The root cause is that each GPU card device has a separate .ras_list link header, but the instance and linked list node of each ras block are unique. When each device is initialized, each ras instance will repeatedly add link node to the device every time. In this way, only the .ras_list of the last initialized device is completely correct. the .ras_list->prev and .ras_list->next of the device initialzied before can still point to the correct ras instance, but the prev pointer and next pointer of the pointed ras instance both point to the last initialized device's .ras_ list instead of the beginning .ras_ list. When using list_for_each_entry_safe searches for non-existent Ras nodes on devices other than the last device, the last ras instance next pointer cannot always be equal to the beginning .ras_list, so that the loop cannot be terminated, the program enters a infinite loop. BTW: Since the data and initialization process of each card are the same, the link list between ras instances will not be destroyed every time the device is initialized. 4. The soft locked logs are as follows: [ 262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G OE 5.13.0-27-generic #29~20.04.1-Ubuntu [ 262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020 [ 262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [ 262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu] [ 262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45 [ 262.166243] RSP: 0018:ffffac908fa87d80 EFLAGS: 00000202 [ 262.166247] RAX: ffffffffc1394248 RBX: ffff91e4ab8d6e20 RCX: ffffffffc1394248 [ 262.166249] RDX: ffff91e4aa356e20 RSI: 000000000000000e RDI: ffff91e4ab8c0000 [ 262.166252] RBP: ffffac908fa87da8 R08: 0000000000000007 R09: 0000000000000001 [ 262.166254] R10: ffff91e4930b64ec R11: 0000000000000000 R12: 000000000000000e [ 262.166256] R13: ffff91e4aa356df8 R14: ffffffffc1394320 R15: 0000000000000003 [ 262.166258] FS: 0000000000000000(0000) GS:ffff92238fb40000(0000) knlGS:0000000000000000 [ 262.166261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 262.166264] CR2: 00000001004865d0 CR3: 000000406d796000 CR4: 0000000000350ee0 [ 262.166267] Call Trace: [ 262.166272] amdgpu_ras_do_recovery+0x130/0x290 [amdgpu] [ 262.166529] ? psi_task_switch+0xd2/0x250 [ 262.166537] ? __switch_to+0x11d/0x460 [ 262.166542] ? __switch_to_asm+0x36/0x70 [ 262.166549] process_one_work+0x220/0x3c0 [ 262.166556] worker_thread+0x4d/0x3f0 [ 262.166560] ? process_one_work+0x3c0/0x3c0 [ 262.166563] kthread+0x12b/0x150 [ 262.166568] ? set_kthread_struct+0x40/0x40 [ 262.166571] ret_from_fork+0x22/0x30 Fixes: `6492e1b07c` ("drm/amdgpu: Unify ras block interface for each ras block") Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:31 -05:00
Changcheng Deng	6a77bce58c	drm/amdgpu: remove duplicate include in 'amdgpu_device.c' 'linux/pci.h' included in 'amdgpu_device.c' is duplicated. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:31 -05:00
Lang Yu	d2895ec4ca	drm/amdgpu: fix a potential GPU hang on cyan skillfish We observed a GPU hang when querying GMC CG state(i.e., cat amdgpu_pm_info) on cyan skillfish. Acctually, cyan skillfish doesn't support any CG features. Just prevent it from accessing GMC CG registers. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:31 -05:00
Mario Limonciello	d2a197a45d	drm/amd: Only run s3 or s0ix if system is configured properly This will cause misconfigured systems to not run the GPU suspend routines. * In APUs that are properly configured system will go into s2idle. * In APUs that are intended to be S3 but user selects s2idle the GPU will stay fully powered for the suspend. * In APUs that are intended to be s2idle and system misconfigured the GPU will stay fully powered for the suspend. * In systems that are intended to be s2idle, but AMD dGPU is also present, the dGPU will go through S3 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:30 -05:00
Mario Limonciello	18b66ace6b	drm/amd: add support to check whether the system is set to s3 This will be used to help make decisions on what to do in misconfigured systems. v2: squash in semicolon fix from Stephen Rothwell Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:30 -05:00
Dave Airlie	53dbee4926	drm-misc-next for v5.18: UAPI Changes: - Fix invalid IN_FORMATS blob when plane->format_mod_supported is NULL. Cross-subsystem Changes: - Assorted dt bindings updates. - Fix vga16fb vga checking on x86. - Fix extra semicolon in rwsem.h's _down_write_nest_lock. - Assorted small fixes to agp and fbdev drivers. - Fix oops in creating a udmabuf with 0 pages. - Hot-unplug firmware fb devices on forced removal - Reqquest memory region in simplefb and simpledrm, and don't make the ioresource as busy. Core Changes: - Mock a drm_plane in drm-plane-helper selftest. - Assorted bug fixes to device logging, dbi. - Use DP helper for sink count in mst. - Assorted documentation fixes. - Assorted small fixes. - Move DP headers to drm/dp, and add a drm dp helper module. - Move the buddy allocator from i915 to common drm. - Add simple pci and platform module init macros to remove a lot of boilerplate from some drivers. - Support microsoft extension for HMDs and specialized monitors. - Improve edid parser's deep color handling. - Add type 7 timing support to edid parser. - Add a weak backpointer to the ttm_bo from ttm_resource - Add 3 eDP panels. Driver Changes: - Add support for HDMI and JZ4780 to ingenic. - Add support for higher DP/eDP bitrates to nouveau. - Assorted driver fixes to tilcdc, vmwgfx, sn65dsi83, meson, stm, panfrost, v3d, gma500, vc4, virtio, mgag200, ast, radeon, amdgpu, nouveau, various bridge drivers. - Convert and revert exynos dsi support to bridge driver. - Add vcc supply regulator support for sn65dsi83. - More conversion of bridge/chipone-icn6211 to atomic. - Remove conflicting fb's from stm, and add support for new hw version. - Add device link in parade-ps8640 to fix suspend/resume. - Update Boe-tv110c9m init sequence. - Add wide screen support to AST2600. - Fix omapdrm implicit dma_buf fencing. - Add support for multiple overlay planes to vkms. - Convert bridge/anx7625 to atomic, add HDCP support, add eld support for audio, and fix HPD. - Add driver for ChromeOS privacy screen. - Handover display from firmware to vc4 more gracefully, and support nomodeset. - Add flexible and ycbcr pixel formats to stm/ltdc. - Convert exynos mipi dsi to atomic. - Add initial dual core group GPUs support to panfrost. - No longer add exclusive fence in amdgpu as shared fence. - Add CSC and full range supoprt to vc4. - Shutdown the display on system shutdown and unbind. - Add Multi-Inno Technology MI0700S4T-6 simple panel. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmHyiHMACgkQ/lWMcqZw E8OLfQ//Xd1njt93nRGoQofuQkz23n2AUTAnmbwzQKcvmat8ugXbRJ5JaVQJrFpu OQEYM46eZIyu2LekMiz4HgPK8CjS156QJ1WtltUFglOY1KLejb6HF5boBYxLkIC7 wLhkaRiwed4t7WOTrftgzpH5FNj/7Vi+Hav9l8rYRC74sWanEZNGBJL2OD9GRdlU 3tlmY8oXVAN8YDD/43Cv+foOTzLS/COI7JCFgFRhfzoFss3EVR061u55uOq18STB UI29NusqX7/K6hQAWCKl0EQBEZWMR02/dgu3ZpOEHHAa96RgHxIuRYsIO9kvGgiF VyW0EW6AyD/KsOSBYnsfUqkFfNchx9Xb8ZDjIhHUYxPsxe4iUJneCrdIKEmLWgSd 1bVNrltLJKBQARW4Whpy/gaiKV8RD8YVJobA/+/COeCUXCnNAT43O9aJmix/7253 Q7ORXTss5WRpuYswMWmObebf8p3IhFjTvlzzenXynl7mkaohGzHPf6SUSUZbJ8Df PZCh17McwIEQ1BtYeegeAGM6s8lrv5+yZaY4bnkQsJNOHeab0cPqmQ8/s+hUeRtp 3VDRVhkgzz2XuTaiKia0gWcAQbdZ2KornkP4QMyDH7w0+6bsuJnNXe4L1XY9lt4J 5v411FaD61FbGDhu5PFtYI7+ZlgM0h5sqlhVkUEzbckzTF3SC9c= =IMtm -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2022-01-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next [airlied: add two missing Kconfig] drm-misc-next for v5.18: UAPI Changes: - Fix invalid IN_FORMATS blob when plane->format_mod_supported is NULL. Cross-subsystem Changes: - Assorted dt bindings updates. - Fix vga16fb vga checking on x86. - Fix extra semicolon in rwsem.h's _down_write_nest_lock. - Assorted small fixes to agp and fbdev drivers. - Fix oops in creating a udmabuf with 0 pages. - Hot-unplug firmware fb devices on forced removal - Reqquest memory region in simplefb and simpledrm, and don't make the ioresource as busy. Core Changes: - Mock a drm_plane in drm-plane-helper selftest. - Assorted bug fixes to device logging, dbi. - Use DP helper for sink count in mst. - Assorted documentation fixes. - Assorted small fixes. - Move DP headers to drm/dp, and add a drm dp helper module. - Move the buddy allocator from i915 to common drm. - Add simple pci and platform module init macros to remove a lot of boilerplate from some drivers. - Support microsoft extension for HMDs and specialized monitors. - Improve edid parser's deep color handling. - Add type 7 timing support to edid parser. - Add a weak backpointer to the ttm_bo from ttm_resource - Add 3 eDP panels. Driver Changes: - Add support for HDMI and JZ4780 to ingenic. - Add support for higher DP/eDP bitrates to nouveau. - Assorted driver fixes to tilcdc, vmwgfx, sn65dsi83, meson, stm, panfrost, v3d, gma500, vc4, virtio, mgag200, ast, radeon, amdgpu, nouveau, various bridge drivers. - Convert and revert exynos dsi support to bridge driver. - Add vcc supply regulator support for sn65dsi83. - More conversion of bridge/chipone-icn6211 to atomic. - Remove conflicting fb's from stm, and add support for new hw version. - Add device link in parade-ps8640 to fix suspend/resume. - Update Boe-tv110c9m init sequence. - Add wide screen support to AST2600. - Fix omapdrm implicit dma_buf fencing. - Add support for multiple overlay planes to vkms. - Convert bridge/anx7625 to atomic, add HDCP support, add eld support for audio, and fix HPD. - Add driver for ChromeOS privacy screen. - Handover display from firmware to vc4 more gracefully, and support nomodeset. - Add flexible and ycbcr pixel formats to stm/ltdc. - Convert exynos mipi dsi to atomic. - Add initial dual core group GPUs support to panfrost. - No longer add exclusive fence in amdgpu as shared fence. - Add CSC and full range supoprt to vc4. - Shutdown the display on system shutdown and unbind. - Add Multi-Inno Technology MI0700S4T-6 simple panel. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/456a23c6-7324-7543-0c45-751f30ef83f7@linux.intel.com	2022-02-01 19:02:41 +10:00
Mario Limonciello	a6ed203587	drm/amd: Warn users about potential s0ix problems On some OEM setups users can configure the BIOS for S3 or S2idle. When configured to S3 users can still choose 's2idle' in the kernel by using `/sys/power/mem_sleep`. Before commit `6dc8265f98` ("drm/amdgpu: always reset the asic in suspend (v2)"), the GPU would crash. Now when configured this way, the system should resume but will use more power. As such, adjust the `amdpu_acpi_is_s0ix function` to warn users about potential power consumption issues during their first attempt at suspending. Reported-by: Bjoren Dasse <bjoern.daase@gmail.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-31 18:10:47 -05:00
Mario Limonciello	f588a1bbfc	drm/amd: Warn users about potential s0ix problems On some OEM setups users can configure the BIOS for S3 or S2idle. When configured to S3 users can still choose 's2idle' in the kernel by using `/sys/power/mem_sleep`. Before commit `6dc8265f98` ("drm/amdgpu: always reset the asic in suspend (v2)"), the GPU would crash. Now when configured this way, the system should resume but will use more power. As such, adjust the `amdpu_acpi_is_s0ix function` to warn users about potential power consumption issues during their first attempt at suspending. Reported-by: Bjoren Dasse <bjoern.daase@gmail.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-31 16:12:29 -05:00
Tomohito Esaki	2af104290d	drm: introduce fb_modifiers_not_supported flag in mode_config If only linear modifier is advertised, since there are many drivers that only linear supported, the DRM core should handle this rather than open-coding in every driver. However, there are legacy drivers such as radeon that do not support modifiers but infer the actual layout of the underlying buffer. Therefore, a new flag fb_modifiers_not_supported is introduced for these legacy drivers, and allow_fb_modifiers is replaced with this new flag. v3: - change the order as follows: 1. add fb_modifiers_not_supported flag 2. add default modifiers 3. remove allow_fb_modifiers flag - add a conditional disable in amdgpu_dm_plane_init() v4: - modify kernel docs v5: - modify kernel docs Signed-off-by: Tomohito Esaki <etom@igel.co.jp> Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220128060836.11216-2-etom@igel.co.jp	2022-01-31 21:45:23 +01:00
huangqu	c57f5ba2c8	drm/amdgpu: Wrong order for config and counter_id parameters Wrong order for config and counter_id parameters was passed, when calling df_v3_6_pmc_set_deferred and df_v3_6_pmc_is_deferred functions. Signed-off-by: huangqu <jinsdb@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:50:22 -05:00
tangmeng	1ec5a44331	drm/amd/amdgpu: fix spelling mistake "disbale" -> "disable" There is a spelling mistake. Fix it. Signed-off-by: tangmeng <tangmeng@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:50:17 -05:00
Alex Deucher	ded81d5b2b	drm/amdgpu: bump driver version for new CTX OP to set/get stable pstates So mesa and tools know when this is available. Mesa MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207 Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:50:14 -05:00
Alex Deucher	8cda7a4f96	drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates Add a new CTX ioctl operation to set stable pstates for profiling. When creating traces for tools like RGP or using SPM or doing performance profiling, it's required to enable a special stable profiling power state on the GPU. These profiling states set fixed clocks and disable certain other power features like powergating which may impact the results. Historically, these profiling pstates were enabled via sysfs, but this adds an interface to enable it via the CTX ioctl from the application. Since the power state is global only one application can set it at a time, so if multiple applications try and use it only the first will get it, the ioctl will return -EBUSY for others. The sysfs interface will override whatever has been set by this interface. Mesa MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207 v2: don't default r = 0; v3: rebase on Evan's PM cleanup Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:50:08 -05:00
Luben Tuikov	3ed893396b	drm/amd: Enable FRU EEPROM for Sienna Cichlid Enable the FRU EEPROM I2C bus for Sienna Cichlid server boards, for which it is enabled by checking the VBIOS version. Cc: Roy Sun <Roy.Sun@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:56 -05:00
Luben Tuikov	2f60dd5076	drm/amd: Expose the FRU SMU I2C bus Expose both SMU I2C buses. Some boards use the same bus for both the RAS and FRU EEPROMs and others use different buses. This enables the additional I2C bus and sets the right buses to use for RAS and FRU EEPROM access. Cc: Roy Sun <Roy.Sun@amd.com> Co-developed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:48 -05:00
Aaron Liu	f06d9e4eec	drm/amdgpu: add 1.3.1/2.4.0 athub CG support This patch adds 1.3.1/2.4.0 athub clock gating support. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:42 -05:00
Aaron Liu	4e13b063d2	drm/amdgpu: convert code name to ip version for athub Use IP version rather than codename for athub. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:36 -05:00
Tao Zhou	bee7f8d092	drm/amdgpu: get hash bit for CH4 in umc channel index On ALDEBARAN, the umc channel bits are not original values, they are hashed. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:13 -05:00
Tao Zhou	e63fa4dcea	drm/amdgpu: update algorithm of umc address conversion On ALDEBARAN, we need to traverse all column bits higher than BIT11(C4C3C2) in a row, the shift of R14 bit should be also taken into account. Retire all pages we find. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:08 -05:00
Tao Zhou	498d46fe7a	drm/amdgpu: increase bad page number for umc ras query One piece of umc normalizing address can be mapped to 16 pieces of physical address in each umc channel on ALDEBARAN. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:02 -05:00
Tao Zhou	400013b268	drm/amdgpu: add umc_fill_error_record to make code more simple Create common amdgpu_umc_fill_error_record function for all versions of UMC and clean up related codes. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:48:56 -05:00
Felix Kuehling	fc6ea4bee1	drm/amdgpu: Wipe all VRAM on free when RAS is enabled On GPUs with RAS, poison can propagate between processes if VRAM is not cleared when it is freed or allocated. The reason is, that not all write accesses clear RAS poison. 32-byte writes by the SDMA engine do clear RAS poison. Clearing memory in the background when it is freed should avoid major performance impact. KFD has been doing this already for a long time. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:48:41 -05:00
Tianci.Yin	7270e8957e	drm/amdgpu: Fix an error message in rmmod [why] In rmmod procedure, kfd sends cp a dequeue request, but the request does not get response, then an error message "cp queue pipe 4 queue 0 preemption failed" printed. [how] Performing kfd suspending after disabling gfxoff can fix it. Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:48:30 -05:00
Victor Zhao	039cacd239	drm/amdgpu: add determine passthrough under arm64 add determine for passthrough mode under arm64 by reading CurrentEL register v2: squash in warning fix (Alex) Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:47:34 -05:00
Christian König	3f268ef06f	drm/ttm: add back a reference to the bdev to the res manager It is simply a lot cleaner to have this around instead of adding the device throughout the call chain. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220124122514.1832-3-christian.koenig@amd.com	2022-01-26 15:29:24 +01:00
Christian König	de3688e469	drm/ttm: add ttm_resource_fini v2 Make sure we call the common cleanup function in all implementations of the resource manager. v2: fix missing case in i915, rudimentary kerneldoc, should be filled in more when we add more functionality Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220124122514.1832-2-christian.koenig@amd.com	2022-01-26 15:23:51 +01:00
Tim Huang	37d6b1506b	drm/amdgpu: convert to UVD IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Tim Huang	d726d43c20	drm/amdgpu: convert to NBIO IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Alex Deucher	82c3a7a5ed	drm/amdgpu: convert amdgpu_display_supported_domains() to IP versions Check IP versions rather than asic types. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Alex Deucher	243c719e87	drm/amdgpu: handle BACO synchronization with secondary funcs Extend secondary function handling for runtime pm beyond audio to USB and UCSI. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Alex Deucher	d0d66b8c66	drm/amdgpu: move runtime pm init after drm and fbdev init Seems more logical to enable runtime pm at the end of the init sequence so we don't end up entering runtime suspend before init is finished. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Alex Deucher	901e2be20d	drm/amdgpu: move PX checking into amdgpu_device_ip_early_init We need to set the APU flag from IP discovery before we evaluate this code. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Alex Deucher	212021297e	drm/amdgpu: set APU flag based on IP discovery table Use the IP versions to set the APU flag when necessary. Reviewed-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
yipechai	e9287ef8d4	Revert "drm/amdgpu: No longer insert ras blocks into ras_list if it already exists in ras_list" This reverts commit `df4f0041c6`. Xgmi ras initialization had been moved from .late_init to early_init, the defect of repeated calling amdgpu_ras_register_ras_block had been fixed, so revert this patch. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:34 -05:00
yipechai	1f33bd18d7	drm/amdgpu: Move xgmi ras initialization from .late_init to .early_init Move xgmi ras initialization from .late_init to .early_init, which let xgmi ras can be initialized only once. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00

... 9 10 11 12 13 ...

11309 Commits