linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00

Author	SHA1	Message	Date
Srinivasan Shanmugam	40e39d7227	drm/amdgpu: Fix unused amdgpu_acpi_get_numa_info function in amdgpu_acpi_get_node_id() Fix the below compiler complaining error: drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:860:33: error: unused function 'amdgpu_acpi_get_numa_info' [-Werror,-Wunused-function] static struct amdgpu_numa_info *amdgpu_acpi_get_numa_info(uint32_t pxm) ^ 1 error generated. By guarding amdgpu_acpi_get_numa_info & amdgpu_acpi_get_numa_size function, only when CONFIG_ACPI_NUMA is enabled. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:44:27 -04:00
Christoph Hellwig	73ade646c5	drm/amdgpu: stop including swiotlb.h amdgpu does not need swiotlb.h, so stop including it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:42:21 -04:00
Srinivasan Shanmugam	b3b0e016ec	drm/amdgpu: Fix uninitalized variable in jpeg_v4_0_3_is_idle & jpeg_v4_0_3_wait_for_idle drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c:752:4: error: variable 'ret' is uninitialized when used here [-Werror,-Wuninitialized] ret &= ((RREG32_SOC15_OFFSET( ^~~ drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c:745:10: note: initialize the variable 'ret' to silence this warning bool ret; ^ = 0 drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c:774:4: error: variable 'ret' is uninitialized when used here [-Werror,-Wuninitialized] ret &= SOC15_WAIT_ON_RREG_OFFSET( ^~~ drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c:767:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 2 errors generated. Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: James Zhu <James.Zhu@amd.com> Cc: Leo Liu <leo.liu@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:58 -04:00
Srinivasan Shanmugam	ff6b11cc72	drm/amd/amdgpu: Fix errors & warnings in mmhub_v1_8.c Fix below errors & warnings reported by checkpatch: ERROR: code indent should use tabs where possible WARNING: please, no space before tabs WARNING: please, no spaces at the start of a line WARNING: Prefer 'unsigned int' to bare use of 'unsigned' ERROR: space prohibited before that '++' (ctx:WxB) WARNING: Block comments use a trailing */ on a separate line Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:53 -04:00
Likun Gao	20a29ac091	drm/amdgpu: retire set_vga_state for some ASIC set_vga_state operation only allowed on SI generation ASIC, retire the realted function on those ASIC which did not do anything. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:50 -04:00
Likun Gao	b336c681bd	drm/amdgpu: fix vga_set_state NULL pointer issue Fix NULL pointer issue for vga_set_state function as not all the ASIC need this operation. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:40 -04:00
Srinivasan Shanmugam	d7b8e68dc0	drm/amdgpu: Fix uninitialized variable in gfx_v9_4_3_cp_resume drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1925:6: error: variable 'r' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (amdgpu_xcp_query_partition_mode(adev->xcp_mgr, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1931:6: note: uninitialized use occurs here if (r) ^ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1925:2: note: remove the 'if' if its condition is always true if (amdgpu_xcp_query_partition_mode(adev->xcp_mgr, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1923:7: note: initialize the variable 'r' to silence this warning int r, i, num_xcc; ^ = 0 1 error generated. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:35 -04:00
Horatio Zhang	a34b09060a	drm/amdgpu: add RAS POISON interrupt funcs for jpeg_v4_0 Add ras_poison_irq and functions. And fix the amdgpu_irq_put call trace in jpeg_v4_0_hw_fini. [ 50.497562] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu] [ 50.497619] RSP: 0018:ffffaa2400fcfcb0 EFLAGS: 00010246 [ 50.497620] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [ 50.497621] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 50.497621] RBP: ffffaa2400fcfcd0 R08: 0000000000000000 R09: 0000000000000000 [ 50.497622] R10: 0000000000000000 R11: 0000000000000000 R12: ffff99b2105242d8 [ 50.497622] R13: 0000000000000000 R14: ffff99b210500000 R15: ffff99b210500000 [ 50.497623] FS: 0000000000000000(0000) GS:ffff99b518480000(0000) knlGS:0000000000000000 [ 50.497623] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 50.497624] CR2: 00007f9d32aa91e8 CR3: 00000001ba210000 CR4: 0000000000750ee0 [ 50.497624] PKRU: 55555554 [ 50.497625] Call Trace: [ 50.497625] <TASK> [ 50.497627] jpeg_v4_0_hw_fini+0x43/0xc0 [amdgpu] [ 50.497693] jpeg_v4_0_suspend+0x13/0x30 [amdgpu] [ 50.497751] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu] [ 50.497802] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu] [ 50.497854] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu] [ 50.497905] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu] [ 50.498005] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu] [ 50.498060] process_one_work+0x21f/0x400 [ 50.498063] worker_thread+0x200/0x3f0 [ 50.498064] ? process_one_work+0x400/0x400 [ 50.498065] kthread+0xee/0x120 [ 50.498067] ? kthread_complete_and_exit+0x20/0x20 [ 50.498068] ret_from_fork+0x22/0x30 Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:29 -04:00
Horatio Zhang	674f90f83b	drm/amdgpu: add RAS POISON interrupt funcs for jpeg_v2_6 Add ras_poison_irq and functions. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:27 -04:00
Horatio Zhang	18dad20c3d	drm/amdgpu: separate ras irq from jpeg instance irq for UVD_POISON Separate jpegbRAS poison consumption handling from the instance irq, and register dedicated ras_poison_irq src and funcs for UVD_POISON. v2: - Separate ras irq from jpeg instance irq - Improve the subject and code comments v3: - Split the patch into three parts - Improve the code comments Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:23 -04:00
Horatio Zhang	66a11ecbde	drm/amdgpu: add RAS POISON interrupt funcs for vcn_v4_0 Add ras_poison_irq and functions. And fix the amdgpu_irq_put call trace in vcn_v4_0_hw_fini. [ 44.563572] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu] [ 44.563629] RSP: 0018:ffffb36740edfc90 EFLAGS: 00010246 [ 44.563630] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [ 44.563630] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 44.563631] RBP: ffffb36740edfcb0 R08: 0000000000000000 R09: 0000000000000000 [ 44.563631] R10: 0000000000000000 R11: 0000000000000000 R12: ffff954c568e2ea8 [ 44.563631] R13: 0000000000000000 R14: ffff954c568c0000 R15: ffff954c568e2ea8 [ 44.563632] FS: 0000000000000000(0000) GS:ffff954f584c0000(0000) knlGS:0000000000000000 [ 44.563632] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 44.563633] CR2: 00007f028741ba70 CR3: 000000026ca10000 CR4: 0000000000750ee0 [ 44.563633] PKRU: 55555554 [ 44.563633] Call Trace: [ 44.563634] <TASK> [ 44.563634] vcn_v4_0_hw_fini+0x62/0x160 [amdgpu] [ 44.563700] vcn_v4_0_suspend+0x13/0x30 [amdgpu] [ 44.563755] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu] [ 44.563806] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu] [ 44.563858] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu] [ 44.563909] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu] [ 44.564006] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu] [ 44.564061] process_one_work+0x21f/0x400 [ 44.564062] worker_thread+0x200/0x3f0 [ 44.564063] ? process_one_work+0x400/0x400 [ 44.564064] kthread+0xee/0x120 [ 44.564065] ? kthread_complete_and_exit+0x20/0x20 [ 44.564066] ret_from_fork+0x22/0x30 Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:16 -04:00
Horatio Zhang	46d75d2300	drm/amdgpu: add RAS POISON interrupt funcs for vcn_v2_6 Add ras_poison_irq and functions. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:13 -04:00
Horatio Zhang	2ecf927b17	drm/amdgpu: separate ras irq from vcn instance irq for UVD_POISON Separate vcn RAS poison consumption handling from the instance irq, and register dedicated ras_poison_irq src and funcs for UVD_POISON. v2: - Separate ras irq from vcn instance irq - Improve the subject and code comments v3: - Split the patch into three parts - Improve the code comments Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:11 -04:00
YuanShang	c77b3608b8	drm/amdgpu: Remove IMU ucode in vf2pf The IMU firmware is loaded on the host side, not the guest. Remove IMU in vf2pf ucode id enum. Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-By: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:07 -04:00
Shiwu Zhang	e825fb641b	drm/amdgpu: fix the memory override in kiq ring struct This is introduced by the code merge and will let the adev->gfx.kiq[0].ring struct being overrided Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:04 -04:00
Shiwu Zhang	c796d7e039	drm/amdgpu: add the smu_v13_0_6 and gfx_v9_4_3 ip block Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:41:02 -04:00
Jesse Zhang	b3122c9269	drm/amdgpu: don't enable secure display on incompatible platforms [why] [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7) [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4) amdgpu 0000:04:00.0: amdgpu: Secure display: Generic Failure. [how] don't enable secure display on incompatible platforms Suggested-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Jesse zhang <jesse.zhang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:40:57 -04:00
Sukrut Bellary	1385d88c6a	drm:amd:amdgpu: Fix missing buffer object unlock in failure path smatch warning - 1) drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:3615 gfx_v9_0_kiq_resume() warn: inconsistent returns 'ring->mqd_obj->tbo.base.resv'. 2) drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:6901 gfx_v10_0_kiq_resume() warn: inconsistent returns 'ring->mqd_obj->tbo.base.resv'. Signed-off-by: Sukrut Bellary <sukrut.bellary@linux.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:40:52 -04:00
Bas Nieuwenhuizen	a2b308044d	drm/amdgpu: Validate VM ioctl flags. None have been defined yet, so reject anybody setting any. Mesa sets it to 0 anyway. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-06-09 10:40:25 -04:00
Su Hui	109b4d8cfe	drm/amdgpu: remove unnecessary (void) conversions No need cast (void) to (struct amdgpu_device *). Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:40:12 -04:00
Tong Liu01	04e8595819	drm/amdgpu: fix incorrect pcie_gen_mask in passthrough case [why] Passthrough case is treated as root bus and pcie_gen_mask is set as default value that does not support GEN 3 and GEN 4 for PCIe link speed. So PCIe link speed will be downgraded at smu hw init in passthrough condition [how] Move get pci info after detect virtualization and check if it is passthrough case when set pcie_gen_mask Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:40:00 -04:00
Jack Xiao	e602157ec0	drm/amdgpu: fix S3 issue if MQD in VRAM 1. Need flush HDP for MQD putting in vram 2. Zero out mes MQD Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:33 -04:00
Srinivasan Shanmugam	e03f04b849	drm/amdgpu: Fix warnings in amdgpu _sdma, _ucode.c Fix below checkpatch warnings: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Comparisons should place the constant on the right side of the test WARNING: Missing a blank line after declarations Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:29 -04:00
Srinivasan Shanmugam	f10984a353	drm/amd/amdgpu: Fix errors & warnings in amdgpu _uvd, _vce.c Fix below checkpatch errors & warnings: In amdgpu_uvd.c: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Prefer 'unsigned int ' to bare use of 'unsigned ' WARNING: Missing a blank line after declarations WARNING: %Lx is non-standard C, use %llx ERROR: space required before the open parenthesis '(' ERROR: space required before the open brace '{' WARNING: %LX is non-standard C, use %llX WARNING: Block comments use * on subsequent lines +/* multiple fence commands without any stream commands in between can + crash the vcpu so just try to emmit a dummy create/destroy msg to WARNING: Block comments use a trailing / on a separate line + avoid this / WARNING: braces {} are not necessary for single statement blocks + for (j = 0; j < adev->uvd.num_enc_rings; ++j) { + fences += amdgpu_fence_count_emitted(&adev->uvd.inst[i].ring_enc[j]); + } In amdgpu_vce.c: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Missing a blank line after declarations WARNING: %Lx is non-standard C, use %llx WARNING: Possible repeated word: 'we' ERROR: space required before the open parenthesis '(' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:22 -04:00
YiPeng Chai	6c47a79b3b	drm/amdgpu: perform mode2 reset for sdma fed error on gfx v11_0_3 perform mode2 reset for sdma fed error on gfx v11_0_3. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:19 -04:00
Srinivasan Shanmugam	5d0622705f	drm/amd/amdgpu: Fix errors & warnings in amdgpu_vcn.c Fix below checkpatch insisted error & warnings: ERROR: space required before the open brace '{' WARNING: braces {} are not necessary for any arm of this statement + if ((type == VCN_ENCODE_RING) && (vcn_config & VCN_BLOCK_ENCODE_DISABLE_MASK)) { [...] + } else if ((type == VCN_DECODE_RING) && (vcn_config & VCN_BLOCK_DECODE_DISABLE_MASK)) { [...] + } else if ((type == VCN_UNIFIED_RING) && (vcn_config & VCN_BLOCK_QUEUE_DISABLE_MASK)) { [...] ERROR: code indent should use tabs where possible WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: braces {} are not necessary for single statement blocks + for (i = 0; i < adev->vcn.num_enc_rings; ++i) { + fence[j] += amdgpu_fence_count_emitted(&adev->vcn.inst[j].ring_enc[i]); + ERROR: space required before the open parenthesis '(' WARNING: Missing a blank line after declarations WARNING: please, no spaces at the start of a line WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:15 -04:00
Srinivasan Shanmugam	29f187f71e	drm/amd/amdgpu: Fix warnings in amdgpu_encoders.c Fix below checkpatch warnings: WARNING: Missing a blank line after declarations + struct amdgpu_connector *amdgpu_connector = to_amdgpu_connector(connector); + amdgpu_encoder->active_device = amdgpu_encoder->devices & amdgpu_connector->devices; WARNING: Prefer 'unsigned int' to bare use of 'unsigned' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:09 -04:00
Srinivasan Shanmugam	01c3f46474	drm/amd/amdgpu: Fix errors & warnings in amdgpu_ttm.c Fix below checkpatch insisted error & warnings: ERROR: Macros with complex values should be enclosed in parentheses WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: braces {} are not necessary for single statement blocks WARNING: Block comments use a trailing */ on a separate line WARNING: Missing a blank line after declarations Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:38:03 -04:00
Alex Deucher	0ce50b2efe	drm/amdgpu/vcn4: fix endian conversion sq.is_enabled is a byte so there is no need to endian swap it. Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:59 -04:00
Alex Deucher	45b3a914d4	drm/amdgpu/gmc9: fix 64 bit division in partition code Rework logic or use do_div() to avoid problems on 32 bit. v2: add a missing case for XCP macro v3: fix out of bounds array access v4: fix xcp handling harder Acked-by: Guchun Chen <guchun.chen@amd.com> (v1) Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> (v3) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:52 -04:00
Tao Zhou	92ecb92ccc	drm/amdgpu: initialize RAS for gfx_v9_4_3 Register GFX RAS functions and initialize GFX RAS. v2: remove xcp operations. v3: reuse the return value of gfx_ras_sw_init. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:50 -04:00
Tao Zhou	0386d52d15	drm/amdgpu: add sq timeout status functions for gfx_v9_4_3 Query and reset sq timeout status. v2: change instance from 0 to xcc_id for register access. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:47 -04:00
Tao Zhou	30feef0676	drm/amdgpu: add RAS error count reset for gfx_v9_4_3 Add GFX RAS error count reset function. v2: remove xcp operation. only select_se_sh when instance number is more than 1. v3: add check for se_num before select_se_sh. change instance from 0 to xcc_id for register access. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:44 -04:00
Tao Zhou	bfa84da618	drm/amdgpu: add RAS error count query for gfx_v9_4_3 Query GFX RAS ce/ue count. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:42 -04:00
Tao Zhou	5c1c09a716	drm/amdgpu: add RAS error count definitions for gfx_v9_4_3 Prepare for the query of GFX RAS ce/ue count. v2: remove xcp operation. only select_se_sh when instance number is more than 1. v3: add more CE/UE registsers to query list. add check for se_num before select_se_sh. change instance from 0 to xcc_id for register access. v4: move gfx memory id definitions to gfx_v9_4_3. v5: create a dedicated patch for adding error count query function. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:39 -04:00
Tao Zhou	77462ab8c6	drm/amdgpu: add RAS definitions for GFX Add common GFX RAS definitions. v2: remove instance from amdgpu_gfx_ras_reg_entry, amdgpu_ras_err_status_reg_entry has already defined it. v3: remove memory id definitions from amdgpu_gfx.h, they are related to IP version. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:37 -04:00
Tao Zhou	47e7f527c8	drm/amdgpu: add RAS status reset for gfx_v9_4_3 Reset GFX RAS status registers. v2: fix typo in title. remove xcp operation. v3: change instance from 0 to xcc_id for register access. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:32 -04:00
Tao Zhou	bf16235b39	drm/amdgpu: add RAS status query for gfx_v9_4_3 Query GFX RAS status. v2: remove xcp operation. v3: change instance from 0 to xcc_id for register access. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:29 -04:00
Tao Zhou	d78c71321e	drm/amdgpu: add GFX RAS common function The common function can help reduce redundant code. v2: remove xcp operation, only need to do RAS operations for all instances. v3: remove check for GFX RAS support, will be checked in higher level. add amdgpu prefix for the function name. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:26 -04:00
Hawking Zhang	9a3ce1a7a9	drm/amdgpu: Do not access members of xcp w/o check (v2) Not all the asic needs xcp. ensure check xcp availabity before accessing its member. v2: add missing change in kfd_topology.c Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:24 -04:00
Tao Zhou	f464c5dd4d	drm/amdgpu: add check for RAS instance mask The mask is only needed to be set when RAS block instance number is more than 1 and invalid bits should be also masked out. We only check valid bits for GFX and SDMA block for now, and will add check for other RAS blocks in the future. v2: move the check under injection operation since the mask is only used by RAS error inject. v3: add valid bits handling for SDMA. v4: print message if the mask is adjusted. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:18 -04:00
Tao Zhou	6e3c51a581	drm/amdgpu: remove RAS GFX injection for gfx_v9_4/gfx_v9_4_2 No special requirement in RAS injection for the two versions, switch to use default injection interface. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:15 -04:00
Tao Zhou	27c5f29526	drm/amdgpu: reorganize RAS injection flow So GFX RAS injection could use default function if it doesn't define its own injection interface. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:13 -04:00
Tao Zhou	2c22ed0bdb	drm/amdgpu: add instance mask for RAS inject User can specify injected instances by the mask. For backward compatibility, the mask value is incorporated into sub block index without interface change of RAS TA. User uses logical mask and driver should convert it to physical value before sending it to RAS TA. v2: update parameter name. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:10 -04:00
Tao Zhou	af2ba36883	drm/amdgpu: convert logical instance mask to physical one Convert instance mask for the convenience of RAS TA. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:08 -04:00
Mukul Joshi	40b832aac0	drm/amdgpu: Enable IH CAM on GFX9.4.3 This patch enables IH CAM on GFX9.4.3 ASIC. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:04 -04:00
Philip Yang	3cde91172d	drm/amdgpu: Correct get_xcp_mem_id calculation Current calculation only works for NPS4/QPX mode, correct it for NPS4/CPX mode. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:37:02 -04:00
Philip Yang	84b4dd3f84	drm/amdkfd: Refactor migrate init to support partition switch Rename smv_migrate_init to a better name kgd2kfd_init_zone_device because it setup zone devive pgmap for page migration and keep it in kfd_migrate.c to access static functions svm_migrate_pgmap_ops. Call it only once in amdgpu_device_ip_init after adev ip blocks are initialized, but before amdgpu_amdkfd_device_init initialize kfd nodes which enable SVM support based on pgmap. svm_range_set_max_pages is called by kgd2kfd_device_init everytime after switching compute partition mode. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:58 -04:00
Shiwu Zhang	44a9766555	drm/amdgpu: route ioctls on primary node of XCPs to primary device During XCP init, unlike the primary device, there is no amdgpu_device attached to each XCP's drm_device In case that user trying to open/close the primary node of XCP drm_device this rerouting is to solve the NULL pointer issue causing by referring to any member of the amdgpu_device BUG: unable to handle page fault for address: 0000000000020c80 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page Oops: 0002 [#1] PREEMPT SMP NOPTI Call Trace: <TASK> lock_timer_base+0x6b/0x90 try_to_del_timer_sync+0x2b/0x80 del_timer_sync+0x29/0x40 flush_delayed_work+0x1c/0x50 amdgpu_driver_open_kms+0x2c/0x280 [amdgpu] drm_file_alloc+0x1b3/0x260 [drm] drm_open+0xaa/0x280 [drm] drm_stub_open+0xa2/0x120 [drm] chrdev_open+0xa6/0x1c0 Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:53 -04:00
Mukul Joshi	1c77527a69	drm/amdkfd: Fix memory reporting on GFX 9.4.3 This patch fixes memory reporting on the GFX 9.4.3 APU and dGPU by reporting available memory on a per partition basis. If its an APU, available and used memory calculations take into account system and TTM memory. v2: squash in fix ("drm/amdkfd: Fix array out of bound warning") squash in fix ("drm/amdgpu: Update memory reporting for GFX9.4.3") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:48 -04:00
Mukul Joshi	315e29eca5	drm/amdkfd: Move local_mem_info to kfd_node We need to track memory usage on a per partition basis. To do that, store the local memory information in KFD node instead of kfd device. v2: squash in fix ("amdkfd: Use mem_id to access mem_partition info") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:45 -04:00
James Zhu	b125b80bd5	drm/amdgpu: use xcp partition ID for amdgpu_gem Find xcp_id from amdgpu_fpriv, use it for amdgpu_gem_object_create. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:43 -04:00
Philip Yang	2fa9ff25de	drm/amdgpu: KFD graphics interop support compute partition kfd_ioctl_get_dmabuf use the amdgpu bo xcp_id to get the gpu_id of the KFD node from the exported dmabuf_adev, and then create kfd bo on the correct adev and KFD node when importing the amdgpu bo to KFD. Remove function kfd_device_by_adev, it is not needed as it is the same result as dmabuf_adev->kfd.dev->nodes[0]->id. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:41 -04:00
Philip Yang	3ebfd221c1	drm/amdkfd: Store xcp partition id to amdgpu bo For memory accounting per compute partition and export drm amdgpu bo and then import to KFD, we need the xcp id to account the memory usage or find the KFD node of the original amdgpu bo to create the KFD bo on the correct adev KFD node. Set xcp_id_plus1 of amdgpu_bo_param to create bo and store xcp_id to amddgpu bo. Add helper macro to get the mem_id from adev and xcp_id. v2: squash in fix ("drm/amdgpu: Fix BO creation failure on GFX 9.4.3 dGPU") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:38 -04:00
Philip Yang	6cfba94a77	drm/amdgpu: dGPU mode set VRAM range lpfn as exclusive TTM place lpfn is exclusive used as end (start + size) in drm and buddy allocator, adev->gmc memory partition range lpfn is inclusive (start + size - 1), should plus 1 to set TTM place lpfn. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:35 -04:00
Philip Yang	ea7bf2f220	drm/amdgpu: Alloc page table on correct memory partition Alloc kernel mode page table bo uses the amdgpu_vm->mem_id + 1 as bp mem_id_plus1 parameter. For APU mode, select the correct TTM pool to alloc page from the corresponding memory partition, this will be the closest NUMA node. For dGPU mode, select the correct address range for vram manager. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:33 -04:00
Philip Yang	dc12f9edde	drm/amdkfd: Update MTYPE for far memory partition Use MTYPE RW/MTYPE_CC for mapping system memory or VRAM to KFD node within the same memory partition, use MTYPE_NC for mapping on KFD node from the far memory partition of the same socket or from another socket on same XGMI hive. On NPS4 or 4P system, MTYPE will be overridden per page depending on the memory NUMA node id and vm->mem_id. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:30 -04:00
Philip Yang	7f6db89418	drm/amdgpu: dGPU mode placement support memory partition dGPU mode uses VRAM manager to validate bo, amdgpu bo placement use the mem_id to get the allocation range first, last page frame number from xcp manager, pass to drm buddy allocator as the allowed range. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:36:27 -04:00
Philip Yang	53c5692e7a	drm/amdkfd: Alloc memory of GPU support memory partition For dGPU mode VRAM allocation, create amdgpu_bo from amdgpu_vm->mem_id, to alloc from the correct memory range. For APU mode VRAM allocation, set alloc domain to GTT, and set bp->mem_id_plus1 from amdgpu_vm->mem_id + 1 to create amdgpu_bo, to allocate system memory from correct NUMA node. For GTT allocation, use mem_id -1 to allocate system memory from any NUMA nodes. Remove amdgpu_ttm_tt_set_mem_pool, to avoid the confusion that memory maybe allocated from different mem_id. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:00:03 -04:00
Philip Yang	f24e924b7e	drm/amdgpu: Add memory partition mem_id to amdgpu_bo Add mem_id_plus1 parameter to amdgpu_gem_object_create and pass it to amdgpu_bo_create. For dGPU mode allocation, mem_id is used by VRAM manager to get the memory partition fpfn, lpfn from xcp manager. For APU native mode allocation, mem_id is used to get NUMA node id from xcp manager, then pass to TTM as numa pool id to alloc memory from the specific NUMA node. mem_id -1 means for entire VRAM or any NUMA nodes. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 10:00:00 -04:00
Philip Yang	4c6ce75fdd	drm/amdkfd: Show KFD node memory partition info Show KFD node memory partition id and size, add helper function KFD_XCP_MEMORY_SIZE to get kfd node memory size, will be used later to support memory accounting per partition. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:58 -04:00
Philip Yang	934deb64fd	drm/amdgpu: Add memory partition id to amdgpu_vm If xcp_mgr is initialized, add mem_id to amdgpu_vm structure to store memory partition number when creating amdgpu_vm for the xcp. The xcp number is decided when opening the render device, for example /dev/dri/renderD129 is xcp_id 0, /dev/dri/renderD130 is xcp_id 1. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:56 -04:00
Philip Yang	d26ea1b346	drm/amdgpu: Add xcp manager num_xcp_per_mem_partition Used by KFD to check memory limit accounting. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:48 -04:00
James Zhu	3e7c6fe387	drm/amdgpu: update ref_cnt before ctx free Update ref_cnt before ctx free. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:44 -04:00
James Zhu	9a18292d41	drm/amdgpu: run partition schedule if it is supported Run partition schedule if it is supported during ctx init entity. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:42 -04:00
James Zhu	cd7d8400aa	drm/amdgpu: add partition schedule for GC(9, 4, 3) Implement partition schedule for GC(9, 4, 3). Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:39 -04:00
James Zhu	c30e326e48	drm/amdgpu: keep amdgpu_ctx_mgr in ctx structure Keep amdgpu_ctx_mgr in ctx structure to track fpriv. v2: add missing fpriv declaration lost in rebase Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:37 -04:00
James Zhu	d425c6f48b	drm/amdgpu: add partition scheduler list update Add partition scheduler list update in late init and xcp partition mode switch. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:35 -04:00
James Zhu	0a9115fd95	drm/amdgpu: update header to support partition scheduling Update header to support partition scheduling. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:32 -04:00
James Zhu	797a0a142c	drm/amdgpu: add partition ID track in ring Keep track partition ID in ring. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:30 -04:00
James Zhu	be3800f57c	drm/amdgpu: find partition ID when open device Find partition ID when open device from render device minor. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-and-tested-by: Philip Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:27 -04:00
James Zhu	2c1c7ba457	drm/amdgpu: support partition drm devices Support partition drm devices on GC_HWIP IP_VERSION(9, 4, 3). This is a temporary solution and will be superceded. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-and-tested-by: Philip Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:20 -04:00
Graham Sider	b9cbd51000	drm/amdgpu/bu: update mtype_local parameter settings Update mtype_local module parameter to use MTYPE_RW by default. 0: MTYPE_RW (default) 1: MTYPE_NC 2: MTYPE_CC Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:15 -04:00
David Francis	76eb9c95a4	drm/amdgpu/bu: add mtype_local as a module parameter Selects the MTYPE to be used for local memory, (0 = MTYPE_CC (default), 1 = MTYPE_NC, 2 = MTYPE_RW) v2: squash in build fix (Alex) Reviewed-by: Graham Sider <Graham.Sider@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:13 -04:00
Felix Kuehling	352b919c1e	drm/amdgpu: Override MTYPE per page on GFXv9.4.3 APUs On GFXv9.4.3 NUMA APUs, system memory locality must be determined per page to choose the correct MTYPE. This patch adds a GMC callback that can provide this per-page override and implements it for native mode. Carve-out mode is not yet supported and will use the safe default (remote) MTYPE for system memory. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:08 -04:00
Felix Kuehling	1e4a00334a	drm/amdgpu: Fix per-BO MTYPE selection for GFXv9.4.3 Treat system memory on NUMA systems as remote by default. Overriding with a more efficient MTYPE per page will be implemented in the next patch. No need for a special case for APP APUs. System memory is handled the same for carve-out and native mode. And VRAM doesn't exist in native mode. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:05 -04:00
Graham Sider	895797d919	drm/amdgpu/bu: Add use_mtype_cc_wa module param By default, set use_mtype_cc_wa to 1 to set PTE coherence flag MTYPE_CC instead of MTYPE_RW by default. This is required for the time being to mitigate a bug causing XCCs to hit stale data due to TCC marking fully dirty lines as exclusive. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:59:03 -04:00
Graham Sider	2e8cc5d317	drm/amdgpu: Use legacy TLB flush for gfx943 Invalidate TLBs via a legacy flush request (flush_type=0) prior to the heavyweight flush requests (flush_type=2) in gmc_v9_0.c. This is temporarily required to mitigate a bug causing CPC UTCL1 to return stale translations after invalidation requests in address range mode. v2: squash in long term fix "drm/amdgpu: disable extra gfx943 legacy flush on rev1+" Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:58 -04:00
Harish Kasiviswanathan	f915f3af99	drm/amdgpu: For GFX 9.4.3 APU fix vram_usage value For GFX 9.4.3 APP APU VRAM is allocated in GTT domain. While freeing memory check for GTT domain instead of VRAM if it is APP APU Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:52 -04:00
Philip Yang	fc021438d0	drm/amdgpu: Enable NPS4 CPX mode CPX compute mode is valid mode for NPS4 memory partition mode. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:48 -04:00
Philip Yang	610dab118f	drm/amdkfd: Move pgmap to amdgpu_kfd_dev structure VRAM pgmap resource is allocated every time when switching compute partitions because kfd_dev is re-initialized by post_partition_switch, As a result, it causes memory region resource leaking and system memory usage accounting unbalanced. pgmap resource should be allocated and registered only once when loading driver and freed when unloading driver, move it from kfd_dev to amdgpu_kfd_dev. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:45 -04:00
Lijo Lazar	00e1ab02c2	drm/amdgpu: Skip halting RLC on GFX v9.4.3 RLC-PMFW handshake happens periodically when GFXCLK DPM is enabled and halting RLC may cause unexpected results. Avoid halting RLC from driver side. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:41 -04:00
Lijo Lazar	1e91a5f791	drm/amdgpu: Fix register accesses in GFX v9.4.3 Access registers with the right xcc id. Also, remove the unused logic as PG is not used in GFX v9.4.3 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:39 -04:00
Hawking Zhang	9b337b7d62	drm/amdgpu: Adjust the sequence to query ras error info It turns out STATUS_VALID_FLAG needs to be checked ahead of any other fields. ADDRESS_VALID_FLAG and ERR_INFO_VALID_FLAG only manages ADDRESS and ERR_INFO field respectively. driver should continue poll ERR CNT field even ERR_INFO_VALD_FLAG is not set. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:32 -04:00
Hawking Zhang	35d54e21e0	drm/amdgpu: Initialize jpeg v4_0_3 ras function Initialize jpeg v4_0_3 ras function. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:30 -04:00
Hawking Zhang	570df4bca6	drm/amdgpu: Add reset_ras_error_count for jpeg v4_0_3 Add reset_ras_error_count callback for jpeg v4_0_3. It will be used to reset jpeg ras error count. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:28 -04:00
Hawking Zhang	41e491d8b6	drm/amdgpu: Add query_ras_error_count for jpeg v4_0_3 Add query_ras_error_count callback for jpeg v4_0_3. It will be used to query and log jpeg error count. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:25 -04:00
Hawking Zhang	85f23b0a8c	drm/amdgpu: Re-enable VCN RAS if DPG is enabled VCN RAS enablement sequence needs to be added in DPG HW init sequence. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:22 -04:00
Hawking Zhang	c3f05ab8c4	drm/amdgpu: Initialize vcn v4_0_3 ras function Initialize vcn v4_0_3 ras function Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:20 -04:00
Hawking Zhang	6d39fa3fc8	drm/amdgpu: Add reset_ras_error_count for vcn v4_0_3 Add reset_ras_error_count callback for vcn v4_0_3. It will be used to reset vcn ras error count. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:17 -04:00
Hawking Zhang	5e1e227fb7	drm/amdgpu: Add query_ras_error_count for vcn v4_0_3 Add query_ras_error_count callback for vcn v4_0_3. It will be used to query and log vcn error count. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:15 -04:00
Gavin Wan	b4520bfd80	drm/amdgpu: Checked if the pointer NULL before use it. For SRIOV on some parts, the host driver does not post VBIOS. So the guest cannot get bios information. Therefore, adev->virt.fw_reserve.p_pf2vf and adev->mode_info.atom_context are NULL. Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Reviewed-by: Zhigang Luo <Zhigang.Luo@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:09 -04:00
Gavin Wan	46f7b4deb3	drm/amdgpu: Set memory partitions to 1 for SRIOV. For SRIOV, the memory partitions are set on host drover. Each VF only has one memory partition. We need set the memory partitions to 1 on guest driver for SRIOV. V2: sqaush in fix ("drm/amdgpu: Fix memory range info of GC 9.4.3 VFs") Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Acked-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:06 -04:00
Gavin Wan	b0a3bbf947	drm/amdgpu: Skip using MC FB Offset when APU flag is set for SRIOV. The MC_VM_FB_OFFSET is PF only register. It cannot be read on VF. So, the driver should not use MC_VM_FB_OFFSET address to set the address of dev->gmc.aper_base. Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:58:02 -04:00
Gavin Wan	63630c9e5c	drm/amdgpu: Add PSP supporting PSP 13.0.6 SRIOV ucode init. Add PSP supporting PSP 13.0.6 SRIOV ucode init. Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:55 -04:00
Lijo Lazar	ba08e9cb6f	drm/amdgpu: Add PSP spatial parition interface Add PSP ring command interface for spatial partitioning. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:53 -04:00
Lijo Lazar	b6b85c8b43	drm/amdgpu: Return error on invalid compute mode Return error if an invalid compute partition mode is requested. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:50 -04:00
Lijo Lazar	f9632096be	drm/amdgpu: Add compute mode descriptor function Keep a helper function to get description of compute partition mode. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:48 -04:00
Lijo Lazar	a0ba127960	drm/amdgpu: Fix unmapping of aperture When aperture size is zero, there is no mapping done. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:46 -04:00
Rajneesh Bhardwaj	e181be58cc	drm/amdgpu: Fix xGMI access P2P mapping failure on GFXIP 9.4.3 On GFXIP 9.4.3, we dont need to rely on xGMI hive info to determine P2P access. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-and-tested-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:43 -04:00
Rajneesh Bhardwaj	fcfefd85f1	drm/amdkfd: Native mode memory partition support For native mode, after amdgpu_bo is created on CPU domain, then call amdgpu_ttm_tt_set_mem_pool to select the TTM pool using bo->mem_id. ttm_bo_validate will allocate the memory to the correct memory partition before mapping to GPUs. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-and-tested-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:37 -04:00
Philip Yang	1e03322cfe	drm/amdgpu: Set TTM pools for memory partitions For native mode only, create TTM pool for each memory partition to store the NUMA node id, then the TTM pool will be selected using memory partition id to allocate memory from the correct partition. Acked-by: Christian König <christian.koenig@amd.com> (rajneesh: changed need_swiotlb and need_dma32 to false for pool init) Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-and-tested-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:31 -04:00
Lijo Lazar	570de94b9c	drm/amdgpu: Add auto mode for compute partition When auto mode is specified, driver will choose the right compute partition mode. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:27 -04:00
Lijo Lazar	1589c82a10	drm/amdgpu: Check memory ranges for valid xcp mode Check the memory ranges available to the device also for deciding a valid partition mode. Only select combinations are valid for a particular mode. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:25 -04:00
Lijo Lazar	e47947abb9	drm/amdgpu: Move initialization of xcp before kfd After partition switch, fill all relevant xcp information before kfd starts initialization. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:14 -04:00
Lijo Lazar	15e3eee8d3	drm/amdgpu: Fill xcp mem node in aquavanjaram Implement callbacks to fill memory node information in aquavanjaram. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:11 -04:00
Lijo Lazar	da539b213d	drm/amdgpu: Add callback to fill xcp memory id Add callback in xcp interface to fill xcp memory id information. Memory id is used to identify the range/partition of an XCP from the available memory partitions in device. Also, fill the id information. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:08 -04:00
Lijo Lazar	a433f1f594	drm/amdgpu: Initialize memory ranges for GC 9.4.3 GC 9.4.3 ASICS may have memory split into multiple partitions.Initialize the memory partition information for each range. The information may be in the form of a numa node id or a range of pages. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:03 -04:00
Lijo Lazar	14493cb99b	drm/amdgpu: Add memory partitions to gmc Some ASICs have the device memory divided into multiple partitions. The parititions could be denoted by a numa node or by a range of pages. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:57:01 -04:00
Lijo Lazar	fa0497c34e	drm/amdgpu: Add API to get numa information of XCC Add interface to get numa information of ACPI XCC object. The interface uses logical id to identify an XCC. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:59 -04:00
Lijo Lazar	1cc823011a	drm/amdgpu: Store additional numa node information Use a struct to store additional numa node information including size and base address. Add numa_info pointer to xcc object to point to the relevant structure based on its proximity domain. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:57 -04:00
Lijo Lazar	0f2e1d620e	drm/amdgpu: Get supported memory partition modes Expand the interface to get supported memory partition modes also along with the current memory partition mode. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:54 -04:00
Lijo Lazar	b6f90baafe	drm/amdgpu: Move memory partition query to gmc GMC block handles memory related information, it makes more sense to keep memory partition functions in gmc block. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:50 -04:00
Lijo Lazar	4bdca20579	drm/amdgpu: Add utility functions for xcp Add utility functions to get details of xcp and iterate through available xcps. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:48 -04:00
Lijo Lazar	db3b5cb64a	drm/amdgpu: Use apt name for FW reserved region Use the generic term fw_reserved_memory for FW reserve region. This region may also hold discovery TMR in addition to other reserve regions. This region size could be larger than discovery tmr size, hence don't change the discovery tmr size based on this. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:42 -04:00
Lijo Lazar	bc71daff4f	drm/amdgpu: Use GPU VA space for IH v4.4.2 in APU For IH ring buffer and read/write pointers, use GPU VA space rather than Guest PA on APU configs. Access through Guest PA doesn't work when IOMMU is enabled. It is also beneficial in NUMA configs as it allocates from the closest numa pool in a numa enabled system. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:39 -04:00
Lijo Lazar	672c883c26	drm/amdgpu: Simplify aquavanjram instance mapping Simplify so as to use the same sequence to assign logical to physical ids for all IPs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:35 -04:00
Lijo Lazar	a3edd1ac70	drm/amdgpu/vcn: Use buffer object's deletion logic VCN DPG buffer object is intialized to NULL. If allotted, buffer object deletion logic will take care of NULL check and delete accordingly. This is useful for cases where indirect sram flag could be manipulated later after buffer allocation. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:32 -04:00
Sonny Jiang	e7947c021a	drm/amdgpu: Use a different value than 0xDEADBEEF for jpeg ring test The 0xDEADBEEF standard anti-hang value. Use it may cause fake pass. Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:28 -04:00
Sonny Jiang	96e693ad78	drm/amdgpu: Add a read after write DB_CTRL for vcn_v4_0_3 To make sure VCN DB_CTRL is delivered before doorbell write. Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:26 -04:00
Sonny Jiang	55ff23d9eb	drm/amdgpu: fixes a JPEG get write/read pointer bug Need parentheses for the micro parameters. Signed-off-by: Sonny Jiang <sonjiang@amd.com> Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:19 -04:00
Sonny Jiang	26dc0448ef	drm/amdgpu: A workaround for JPEG_v4_0_3 ring test fail The jpeg_v4_0_3 jpeg_pitch register uses UVD_JRBC_SCRATCH0. It needs to move WREG() to after jpeg_start. Switch to a posted register write when doing the ring test to make sure the register write lands before we test the result. Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:15 -04:00
James Zhu	358e6c3830	drm/amdgpu: use physical AID index for ring name Use physical AID index for VCN/JPEG ring name instead of logical AID index. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:13 -04:00
James Zhu	d3e53452b0	drm/amdgpu/vcn: use dummy register selects AID for VCN_RAM ucode Use dummy register 0xDEADBEEF selects AID for PSP VCN_RAM ucode. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:10 -04:00
Lijo Lazar	6a944ccbf5	drm/amdgpu: Fix harvest reporting of VCN Use VCN instance mask to check if an instance is harvested or not. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:08 -04:00
Lijo Lazar	fd91d38b52	drm/amdgpu: Use logical ids for VCN/JPEG v4.0.3 Address VCN/JPEG instances using logical ids. Whenever register access is required, get the physical instance using GET_INST. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:06 -04:00
Lijo Lazar	07bc0ac8ff	drm/amdgpu: Add VCN logical to physical id mapping Add mappings for logical to physical id for VCN/JPEG 4.0.3 v2: make local function static (Alex) Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:56:03 -04:00
Lijo Lazar	aaf1090a6c	drm/amdgpu: Add instance mask for VCN and JPEG Keep an instance mask formed by physical instance numbers for VCN and JPEG IPs. Populate the mask from discovery table information. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:59 -04:00
Sonny Jiang	48d19834ea	drm/amdgpu: Load vcn_v4_0_3 ucode during early_init VCN loading ucode is moved to early_init with using 'amdgpu_ucode_*' helpers. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Sonny Jiang <sonjiang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:56 -04:00
Shiwu Zhang	5ae0ec8b80	drm/amdgpu: preserve the num_links in case of reflection For topology reflection, each socket to every other socket has the exactly same topology info as the other way around. So it is safe to keep the reflected num_links value otherwise it will be overriden by the link info output of GET_PEER_LINKS command. Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:54 -04:00
Lijo Lazar	f2b8447b1f	drm/amdgpu: Fix discovery sys node harvest info Initalize syfs nodes after harvest information is fetched and fetch the correct harvest info based on each IP instance. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:51 -04:00
Lijo Lazar	ac772a3c07	drm/amdgpu: Add fallback path for discovery info If SOC doesn't expose dedicated vram, discovery region may be available through system memory. Rename the existing interface to generic read_binary_from_mem and add a fallback path to read from system memory. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:45 -04:00
Lijo Lazar	368bb1bcfb	drm/amdgpu: Read discovery info from system memory On certain ASICs, discovery info is available at reserved region in system memory. The location is available through ACPI interface. Add API to read discovery info from there. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:42 -04:00
Lijo Lazar	6e01882267	drm/amdgpu: Add API to get tmr info from acpi In certain configs, TMR information is available from ACPI. Add API to fetch the information. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:40 -04:00
Lijo Lazar	4d5275ab0b	drm/amdgpu: Add parsing of acpi xcc objects Add parsing of ACPI xcc objects and fill in relevant info from them by invoking the DSM methods. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:38 -04:00
Lijo Lazar	01ef47477d	drm/amdgpu: Add FGCG for GFX v9.4.3 It's not fine grain, behaves similar to MGCG. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:33 -04:00
Lijo Lazar	46d79cbf9a	drm/amdgpu: Use transient mode during xcp switch During partition switch, keep the state as transient mode. Fetch the latest state if switch fails. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:31 -04:00
Lijo Lazar	ded7d99eb5	drm/amdgpu: Add flags for partition mode query It's not required to take lock on all cases while querying partition mode. Querying partition mode during KFD init process doesn't need to take a lock. Init process after a switch will already be happening under lock. Control the behaviour by adding flags to xcp_query_partition_mode. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:29 -04:00
Lijo Lazar	8f2ccaaa37	drm/amdgpu: Add mode-2 reset in SMU v13.0.6 Modifications to mode-2 reset flow for SMU v13.0.6 ASICs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:55:12 -04:00
Stanley.Yang	1ad29cb343	drm/amdgpu: fix sdma instance It should change logical instance to device instance to query ras info Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:54:15 -04:00
Le Ma	0c451baf3b	drm/amdgpu: change the print level to warn for ip block disabled Avoid to mislead users as it's not a real error. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:54:12 -04:00
Mukul Joshi	9e4216cf2d	drm/amdgpu: Increase Max GPU instance to 64 Increase Max GPU instances to 64 to handle multi-socket system with GFX 9.4.3 asic. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:54:09 -04:00
Le Ma	bb0ed57b44	drm/amdgpu: increase AMDGPU_MAX_RINGS On newer GPUs, the number of kernel rings are increased. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:54:05 -04:00
Rajneesh Bhardwaj	970c1646b5	drm/amdgpu: Create VRAM BOs on GTT for GFXIP9.4.3 On GFXIP9.4.3 APP APU where there is no dedicated VRAM domain handle VRAM BO allocation requests on CPU domain and validate them on GTT. Support for handling multi-socket and multi-numa partitions within a socket will be added by future patches, this enables 1P NPS1 asic bringup configuration. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:54:01 -04:00
Rajneesh Bhardwaj	f431393d60	drm/amdgpu: Implement new dummy vram manager This adds dummy vram manager to support ASICs that do not have a dedicated or carvedout vram domain. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:53:57 -04:00
Rajneesh Bhardwaj	228ce17643	drm/amdgpu: Handle VRAM dependencies on GFXIP9.4.3 [For 1P NPS1 mode driver bringup] Changes required to initialize the amdgpu driver with frontdoor firmware loading and discovery=2 with the native mode SBIOS that enables CPU GPU unified interleaved memory. sudo modprobe amdgpu discovery=2 Once PSP TMR region is reported via the ACPI interface, the dependency on the ip_discovery.bin will be removed. Choice of where to allocate driver table is given to each IP version. In general, both GTT and VRAM domains will be considered. If one of the tables has a strict restriction for VRAM domain, then only VRAM domain is considered. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> (lijo: Modified the handling for SMU Tables) Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:53:52 -04:00
Asad kamal	9faf929fbf	drm/amdgpu: Enable CG for IH v4.4.2 Enable clock gating on IH v4.4.2 versions. Signed-off-by: Asad kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:53:49 -04:00
Hawking Zhang	8107e4996f	drm/amdgpu: Enable persistent edc harvesting in APP APU Persistent edc harvesting is supported in APP APU Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:53:45 -04:00
Hawking Zhang	73c2b3fd2c	drm/amdgpu: Initialize mmhub v1_8 ras function Initialize mmhub v1_8 ras function. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:53:42 -04:00
Hawking Zhang	ccfdbd4bdc	drm/amdgpu: Add reset_ras_error_status for mmhub v1_8 Add reset_ras_error_status callback for mmhub v1_8. It will be used to reset mmhub error status. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-06-09 09:53:34 -04:00

1 2 3 4 5 ...

12597 Commits