linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00

Author	SHA1	Message	Date
Lang Yu	f9070b0f2f	drm/amdgpu/vpe: add VPE 6.1.1 support Add initial support for VPE 6.1.1. v2: squash in updates (Alex) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-07 15:33:10 -05:00
Lang Yu	d40f6213b5	drm/amdgpu/vpe: don't emit cond exec command under collaborate mode Not ready now. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-07 15:33:04 -05:00
Lang Yu	26f5f34e6e	drm/amdgpu/vpe: add collaborate mode support for VPE Under collaborate mode, multiple VPE instances share a ring buffer and work together to finish a job. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-07 15:33:01 -05:00
Lang Yu	72f4ae0a64	drm/amdgpu/vpe: add PRED_EXE and COLLAB_SYNC OPCODE To support multi VPE collaborate mode. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-07 15:32:58 -05:00
Lang Yu	709ef39f95	drm/amdgpu/vpe: add multi instance VPE support Add support for multi instance VPE processing. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-07 15:32:48 -05:00
Likun Gao	79698b145f	drm/amdgpu/discovery: add nbif v6_3_1 ip block Add nbif v6_3_1 ip block. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-07 15:32:45 -05:00
Hawking Zhang	894c6d3522	drm/amdgpu: Add nbif v6_3_1 ip block support Add nbif v6_3_1 ip block support. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-07 15:32:42 -05:00
Sunil Khatri	5e592956cc	drm/amdgpu: add ring timeout information in devcoredump Add ring timeout related information in the amdgpu devcoredump file for debugging purposes. During the gpu recovery process the registered call is triggered and add the debug information in data file created by devcoredump framework under the directory /sys/class/devcoredump/devcdx/ Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-06 15:24:50 -05:00
Yifan Zhang	2bdebcb1e4	drm/amdgpu: add dcn3.5.1 support This patch to add dcn3.5.1 support. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-06 15:24:50 -05:00
Pierre-Eric Pelloux-Prayer	bf909454fe	drm/amdgpu: disable ring_muxer if mcbp is off Using the ring_muxer without preemption adds overhead for no reason since mcbp cannot be triggered. Moving back to a single queue in this case also helps when high priority apps are used: in this case the gpu_scheduler priority handling will work as expected - much better than ring_muxer with its 2 independent schedulers competing for the same hardware queue. This change requires moving amdgpu_device_set_mcbp above amdgpu_device_ip_early_init because we use adev->gfx.mcbp. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-06 15:24:49 -05:00
Jesse Zhang	bb8863cc9d	drm/amdgpu: remove unused code Remove the unused function - amdgpu_vm_pt_is_root_clean and remove the impossible condition v1: entries == 0 is not possible any more, so this condition could probably be removed (Felix) Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-06 15:24:24 -05:00
Christian König	8bc75586ea	drm/amdgpu: workaround to avoid SET_Q_MODE packets v2 It turned out that executing the SET_Q_MODE packet on every submission creates too much overhead. Implement a workaround which allows skipping the SET_Q_MODE packet if subsequent submissions all use the same parameters. v2: add a NULL check for ring_obj Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-04 15:59:08 -05:00
Christian König	c68cbbfd54	drm/amdgpu: cleanup conditional execution First of all calculating the number of dw to patch into a conditional execution is not something HW generation specific. This is just standard ring buffer calculations. While at it also reduce the BUG_ON() into WARN_ON(). Then instead of a random bit pattern use 0 as default value for the number of dw skipped, this way it's not mandatory any more to patch the conditional execution. And last make the address to check a parameter of the conditional execution instead of getting this from the ring. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-04 15:59:08 -05:00
Ma Jun	86e14a7386	drm/amdgpu: Use rpm_mode flag instead of checking it again for rpm Because the rpm_mode flag is already set when the driver is initialized, we use it directly for runtime suspend/resume instead of checking it again Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-04 15:59:08 -05:00
Shashank Sharma	b8f67b9ddf	drm/amdgpu: change vm->task_info handling This patch changes the handling and lifecycle of vm->task_info object. The major changes are: - vm->task_info is a dynamically allocated ptr now, and its usage is reference counted. - introducing two new helper funcs for task_info lifecycle management - amdgpu_vm_get_task_info: reference counts up task_info before returning this info - amdgpu_vm_put_task_info: reference counts down task_info - last put to task_info() frees task_info from the vm. This patch also does logistical changes required for existing usage of vm->task_info. V2: Do not block all the prints when task_info not found (Felix) V3: Fixed review comments from Felix - Fix wrong indentation - No debug message for -ENOMEM - Add NULL check for task_info - Do not duplicate the debug messages (ti vs no ti) - Get first reference of task_info in vm_init(), put last in vm_fini() V4: Fixed review comments from Felix - fix double reference increment in create_task_info - change amdgpu_vm_get_task_info_pasid - additional changes in amdgpu_gem.c while porting Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-04 15:59:08 -05:00
Jesse Zhang	feb13f52c8	Revert "drm/amdgpu: remove vm sanity check from amdgpu_vm_make_compute" for Raven Fix the issue: "amdgpu: Failed to create process VM object". [Why] When amdgpu is initialized, seq64 does mapping and updates the bo mapping in the vm page table. But when clinfo runs, it also initializes a vm for a process device through the function kfd_process_device_init_vm and ensures the root PD is clean through the function amdgpu_vm_pt_is_root_clean. So they conflict, and clinfo always fails. v1: - remove all the pte_supports_ats stuff from the amdgpu_vm code (Felix) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-03-04 15:58:45 -05:00
Christian König	216c1282dd	drm/amdgpu: use GTT only as fallback for VRAM\|GTT Try to fill up VRAM as well by setting the busy flag on GTT allocations. This fixes the issue that when VRAM was evacuated for suspend it's never filled up again unless the application is restarted. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240229134003.3688-2-christian.koenig@amd.com	2024-03-01 17:12:26 +01:00
Bjorn Helgaas	b07395d5d5	drm/amdgpu: remove misleading amdgpu_pmops_runtime_idle() comment After `4020c22802` ("drm/amdgpu: don't runtime suspend if there are displays attached (v3)"), "ret" is unconditionally set later before being used, so there's no point in initializing it and the associated comment is no longer meaningful. Remove the comment and the unnecessary initialization. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-29 20:35:39 -05:00
Alex Deucher	959143dab1	Revert "drm/amd: Remove freesync video mode amdgpu parameter" This reverts commit `e94e787e37`. This conflicts with how compositors want to handle VRR. Now that compositors actually handle VRR, we probably don't need freesync video. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2985 Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-29 20:35:31 -05:00
Tao Zhou	2c684b9342	drm/amdgpu: add deferred error check for UMC v12 address query Both RAS UE and deferred errors need page retirement. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-29 20:35:14 -05:00
Eric Huang	1761d9a688	amd/amdkfd: remove unused parameter The adev can be found from bo by amdgpu_ttm_adev(bo->tbo.bdev), and adev is also not used in the function amdgpu_amdkfd_map_gtt_bo_to_gart(). Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-28 17:10:53 -05:00
Srinivasan Shanmugam	eb4f139888	drm/amd/amdgpu: Fix potential ioremap() memory leaks in amdgpu_device_init() This ensures that the memory mapped by ioremap for adev->rmmio, is properly handled in amdgpu_device_init(). If the function exits early due to an error, the memory is unmapped. If the function completes successfully, the memory remains mapped. Reported by smatch: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4337 amdgpu_device_init() warn: 'adev->rmmio' from ioremap() not released on lines: 4035,4045,4051,4058,4068,4337 Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-27 11:06:58 -05:00
Srinivasan Shanmugam	7cf1ad2fe1	drm/amdgpu: Fix missing break in ATOM_ARG_IMM Case of atom_get_src_int() Missing break statement in the ATOM_ARG_IMM case of a switch statement, adds the missing break statement, ensuring that the program's control flow is as intended. Fixes the below: drivers/gpu/drm/amd/amdgpu/atom.c:323 atom_get_src_int() warn: ignoring unreachable code. Fixes: `d38ceaf99e` ("drm/amdgpu: add core driver (v4)") Cc: Jammy Zhou <Jammy.Zhou@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-27 11:06:29 -05:00
Mario Limonciello	0887054d14	drm/amd: Drop abm_level property This vendor specific property has never been used by userspace software and conflicts with the panel_power_savings sysfs file. That is a compositor and user could fight over the same data. Fixes: `63d0b87213` ("drm/amd/display: add panel_power_savings sysfs entry to eDP connectors") Suggested-by: Harry Wentland <Harry.Wentland@amd.com> Cc: Hamza Mahfooz <Hamza.Mahfooz@amd.com> Cc: "Sun peng Li (Leo)" <Sunpeng.Li@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-27 10:46:59 -05:00
Tim Huang	93d64097f7	drm/amdgpu: reserve more memory for MES runtime DRAM This patch fixes a MES firmware boot failure issue when backdoor loading the MES firmware. MES firmware runtime DRAM size is changed to 512k, the driver needs to reserve this amount of memory in FB, otherwise adjacent memory will be overwritten by the MES firmware startup code. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-26 11:15:32 -05:00
Prike Liang	63fcd306c0	drm/amdgpu: Enable gpu reset for S3 abort cases on Raven series GPU resets can now be performed successfully on the Raven series, and a GPU reset is required for the S3 suspend abort case. So enable GPU reset for S3 abort cases on the Raven series. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-26 11:15:25 -05:00
Victor Lu	56f7d2ac6d	drm/amdgpu: Do not program SQ_TIMEOUT_CONFIG in SRIOV VF should not program this register. Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Zhigang Luo <Zhigang.Luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-26 11:14:46 -05:00
Stanley.Yang	7ec11c2f65	drm/amdgpu: Fix ineffective ras_mask settings Check amdgpu_ras_mask to fix ineffective ras_mask setting due to special asic without sram ecc enable but with poison supported. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-26 11:14:37 -05:00
Lijo Lazar	e1f6746f33	drm/amdkfd: Skip packet submission on fatal error If fatal error is detected, packet submission won't go through. Return error in such cases. Also, avoid waiting for fence when fatal error is detected. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-26 11:14:31 -05:00
Lijo Lazar	1b6ef74b2b	drm/amdgpu: Add fatal error detected flag For a RAS error that needs a full reset to recover, set the fatal error status. Clear the status once the device is reset. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-26 11:14:24 -05:00
Ma Jun	f435b5156b	drm/amdgpu: Fix the runtime resume failure issue Don't set power state flag when system enter runtime suspend, or it may cause runtime resume failure issue. Fixes: `3a9626c816` ("drm/amd: Stop evicting resources on APUs in suspend") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-26 11:11:46 -05:00
Daniel Vetter	f112b68f27	Merge v6.8-rc6 into drm-next Thomas Zimmermann asked to backmerge -rc6 for drm-misc branches, there's a few same-area-changed conflicts (xe and amdgpu mostly) that are getting a bit too annoying. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2024-02-26 11:41:07 +01:00
Daniel Vetter	71ab34f72f	Merge tag 'drm-misc-next-2024-02-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.9: UAPI Changes: - changes to fdinfo stats Cross-subsystem Changes: agp: - remove unused type field from struct agp_bridge_data Core Changes: ci: - update test names - cleanups gem: - add stats for shared buffers plus updates to amdgpu, i915, xe Documentation: - fixes syncobj: - fixes to waiting and sleeping Driver Changes: bridge: - adv7511: fix crash on irq during probe - dw_hdmi: set bridge type host1x: - cleanups ivpu: - updates to firmware API - refactor BO allocation meson: - fix error handling in probe panel: - revert "drm/panel-edp: Add auo_b116xa3_mode" - add Himax HX83112A plus DT bindings - ltk500hd1829: add support for ltk101b4029w and admatec 9904370 - simple: add BOE BP082WX1-100 8.2" panel plus DT bindings renesas: - add RZ/G2L DU support plus DT bindings Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240222135841.GA6677@localhost.localdomain	2024-02-26 09:51:49 +01:00
Nathan Chancellor	2947a4567f	treewide: update LLVM Bugzilla links LLVM moved their issue tracker from their own Bugzilla instance to GitHub issues. While all of the links are still valid, they may not necessarily show the most up to date information around the issues, as all updates will occur on GitHub, not Bugzilla. Another complication is that the Bugzilla issue number is not always the same as the GitHub issue number. Thankfully, LLVM maintains this mapping through two shortlinks: https://llvm.org/bz<num> -> https://bugs.llvm.org/show_bug.cgi?id=<num> https://llvm.org/pr<num> -> https://github.com/llvm/llvm-project/issues/<mapped_num> Switch all "https://bugs.llvm.org/show_bug.cgi?id=<num>" links to the "https://llvm.org/pr<num>" shortlink so that the links show the most up to date information. Each migrated issue links back to the Bugzilla entry, so there should be no loss of fidelity of information here. Link: https://lkml.kernel.org/r/20240109-update-llvm-links-v1-3-eb09b59db071@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Fangrui Song <maskray@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Mykola Lysenko <mykolal@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-02-22 15:38:51 -08:00
Ma Jun	bbfaf2aea7	drm/amdgpu: Fix the runtime resume failure issue Don't set power state flag when system enter runtime suspend, or it may cause runtime resume failure issue. Fixes: `3a9626c816` ("drm/amd: Stop evicting resources on APUs in suspend") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-02-22 12:28:27 -05:00
Yifan Zhang	a24029cc40	drm/amdgpu: add vcn 4.0.6 discovery support This patch is to add vcn 4.0.6 support Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 12:05:23 -05:00
Ilpo Järvinen	bb87e511b2	drm/amdgpu: Use RMW accessors for changing LNKCTL2 Convert open coded RMW accesses for LNKCTL2 to use pcie_capability_clear_and_set_word() which makes it easier to understand what the code tries to do. LNKCTL2 is not really owned by any driver because it is a collection of control bits that PCI core might need to touch. RMW accessors already have support for proper locking for a selected set of registers (LNKCTL2 is not yet among them but likely will be in the future) to avoid losing concurrent updates. Acked-by: Alex Deucher <alexander.deucher@amd.com> Suggested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 12:05:20 -05:00
Kunwu Chan	a5fc4e5014	drm/amdgpu: Simplify the allocation of sync slab caches Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 12:05:16 -05:00
Veerabadhran Gopalakrishnan	84eaa2c2c6	drm/amdgpu/soc21: Enabling PG and CG flags for VCN 4.0.6 Enabled the VCN Power Gating and Clock Gating flags for VCN 4.0.6. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 12:05:12 -05:00
Kunwu Chan	e4e4618bc1	drm/amdgpu: Simplify the allocation of mux_chunk slab caches Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:28:24 -05:00
Kunwu Chan	3d14cb0263	drm/amdgpu: Simplify the allocation of fence slab caches Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:28:19 -05:00
Veerabadhran Gopalakrishnan	437591d237	drm/amdgpu/soc21: Added Video Capabilities for VCN 406 Updated Query Video codecs for VCN 406 Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:28:16 -05:00
Veerabadhran Gopalakrishnan	2b53b3668e	drm/amdgpu/vcn: Enable VCN 4.0.6 Support Modified driver to use the appropriate FW files and instance. v2: squash in fixes (Alex) Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:28:11 -05:00
Saleemkhan Jamadar	07cb7fd0fd	drm/amdgpu/jpeg: add support for jpeg multi instance Enable support for multi instance on JPEG 4.0.6. v2: squash in fixes (Alex) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:28:04 -05:00
Victor Lu	8f4de8f72e	drm/amdgpu: Use correct SRIOV macro for gmc_v9_0_vm_fault_interrupt_state Under SRIOV, programming to VM_CONTEXT_CNTL regs failed because the current macro does not pass through the correct xcc instance. Use the REG32_XCC macro in this case. The behaviour without SRIOV is the same without this patch. Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Zhigang Luo <Zhigang.Luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:27:42 -05:00
Victor Lu	bea07b215d	drm/amdgpu: Do not program IH_CHICKEN in vega20_ih.c under SRIOV IH_CHICKEN is blocked for VF writes; this access should be skipped. Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:27:31 -05:00
Victor Lu	8093383ae7	drm/amdgpu: Improve error checking in amdgpu_virt_rlcg_reg_rw (v2) The current error detection only looks for a timeout. This should be changed to also check scratch_reg1 for any errors returned from RLCG. v2: remove new error value Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:27:23 -05:00
Yifan Zhang	d6a76c0a5a	drm/amdgpu: enable MES discovery for GC 11.5.1 This patch to enable MES for GC 11.5.1 Reviewed-by: shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:27:18 -05:00
Yifan Zhang	e2442d3e32	drm/amdgpu: add GC 11.5.1 discovery support This patch to add GC 11.5.1 support Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:27:13 -05:00
Tim Huang	455918cf28	drm/amdgpu: enable CGPG for GFX ip v11.5.1 Enable CGPG support for GFX ip v11.5.1 Signed-off-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:27:07 -05:00
Yifan Zhang	7c15ac1183	drm/amdgpu: initialize gfx11.5.1 Initialize gfx 11.5.0 and set gfx hw configuration. v2: squash in CG, PG, GFXOFF fixes (Alex) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:27:03 -05:00
Yifan Zhang	846f7385bf	drm/amdgpu: add mes firmware support for GC 11.5.1 This patch to add MES PIPE0 and PIPE1 firmware support for gc_11_5_1. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:27:01 -05:00
Yifan Zhang	fa744c0dd2	drm/amdgpu: add imu firmware support for GC 11.5.1 This patch is to add imu firmware support for GC 11.5.1 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:26:58 -05:00
Yifan Zhang	dad4f543ac	drm/amdgpu: add firmware for GC 11.5.1 This patch is to add firmware for GC 11.5.1 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:26:55 -05:00
Yifan Zhang	93c5cc8312	drm/amdgpu: add GC 11.5.1 to GC 11.5.0 family This patch to add GC 11.5.1 to GC 11.5.0 family. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:26:52 -05:00
Yifan Zhang	f1c40b6ea4	drm/amdgpu: enable soc21 discovery support for GC 11.5.1 This patch to enable soc21 support for GC 11.5.1 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:26:49 -05:00
Yifan Zhang	e971995657	drm/amdgpu: add initial GC 11.5.1 soc21 support Disable clock gating and power gating for now. v2: squash in revision fix (Alex) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:26:44 -05:00
Yifan Zhang	278318d371	drm/amdgpu: enable gmc11 discovery support for GC 11.5.1 This patch to enable gmc11 for GC 11.5.1 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:26:40 -05:00
Asad Kamal	5fe4a8d3c6	drm/amdgpu: Remove pcie bw sys entry Remove pcie bw sys entry for asics not supporting such function Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:23:45 -05:00
Asad Kamal	c607e76e64	Revert "drm/amdgpu: Add pcie usage callback to nbio" pcie usage is now handled by fw This reverts commit `8d759dc664`. Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:21:27 -05:00
Ma Jun	4acd31e6c2	drm/amdgpu: Drop redundant parameter in amdgpu_gfx_kiq_init_ring Drop redundant parameters in function amdgpu_gfx_kiq_init_ring to simplify the code Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:17:45 -05:00
Asad Kamal	86a08f1af2	Revert "drm/amdgpu: Add pci usage to nbio v7.9" Remove implementation to get pcie usage for nbio v7.9 as pcie usage is handled by fw This reverts commit `59070fd9cc`. Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:15:26 -05:00
Yifan Zhang	2612c8313f	drm/amdgpu: add tmz support for GC IP v11.5.1 Add tmz support for GC 11.5.1. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:14:30 -05:00
Hawking Zhang	5c07015619	drm/amdgpu: Do not toggle bif ras irq from guest Only do this from host side. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:14:24 -05:00
Yifan Zhang	46e5de77b3	drm/amdgpu: add GFXHUB 11.5.1 support This patch to add GFXHUB 11.5.1 support. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-22 10:14:17 -05:00
Dave Airlie	40d47c5fb4	Merge tag 'amd-drm-next-6.9-2024-02-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.9-2024-02-19: amdgpu: - ATHUB 4.1 support - EEPROM support updates - RAS updates - LSDMA 7.0 support - JPEG DPG support - IH 7.0 support - HDP 7.0 support - VCN 5.0 support - Misc display fixes - Retimer fixes - DCN 3.5 fixes - VCN 4.x fixes - PSR fixes - PSP 14.0 support - VA_RESERVED cleanup - SMU 13.0.6 updates - NBIO 7.11 updates - SDMA 6.1 updates - MMHUB 3.3 updates - Suspend/resume fixes - DMUB updates amdkfd: - Trap handler enhancements - Fix cache size reporting - Relocate the trap handler radeon: - fix typo in print statement Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240219214810.4911-1-alexander.deucher@amd.com	2024-02-22 13:21:19 +10:00
Yifan Zhang	31e0a586f3	drm/amdgpu: add MMHUB 3.3.1 support This patch to add MMHUB 3.3.1 support. v2: squash in fault info fix (Alex) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-19 14:50:46 -05:00
Mario Limonciello	2bb2ad58f6	drm/amd: Change `jpeg_v4_0_5_start_dpg_mode()` to void jpeg_v4_0_5_start_dpg_mode() always returns 0 and the return value doesn't get used in the caller jpeg_v4_0_5_start(). Modify the function to be void. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1583635 ("Code maintainability issues") Fixes: `0a119d53f7` ("drm/amdgpu/jpeg: add support for jpeg DPG mode") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-16 15:44:30 -05:00
Srinivasan Shanmugam	6f18d7ad9d	drm/amdgpu: Fix missing parameter descriptions in ih_v7_0.c Rectifies kdoc warnings related to the 'ih' parameter in the 'ih_v7_0_get_wptr', 'ih_v7_0_irq_rearm', and 'ih_v7_0_set_rptr' functions within the 'ih_v7_0.c' file. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/ih_v7_0.c:392: warning: Function parameter or member 'ih' not described in 'ih_v7_0_get_wptr' drivers/gpu/drm/amd/amdgpu/ih_v7_0.c:432: warning: Function parameter or member 'ih' not described in 'ih_v7_0_irq_rearm' drivers/gpu/drm/amd/amdgpu/ih_v7_0.c:458: warning: Function parameter or member 'ih' not described in 'ih_v7_0_set_rptr' Fixes: `12443fc53e` ("drm/amdgpu: Add ih v7_0 ip block support") Cc: Likun Gao <Likun.Gao@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-16 15:42:26 -05:00
Yifan Zhang	a02cfac90f	drm/amdgpu: add SDMA 6.1.1 discovery support This patch to add SDMA 6.1.1 support. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-16 15:42:23 -05:00
Yifan Zhang	c40797d320	drm/amdgpu: add sdma 6.1.1 firmware This patch to add sdma 6.1.1 firmware declaration. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-16 15:42:21 -05:00
Yifan Zhang	aec765a4dc	drm/amdgpu: add psp 14.0.1 discovery support This patch to add psp 14.0.1 support. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-16 15:42:18 -05:00
Yifan Zhang	24b5a5df94	drm/amdgpu: add PSP 14.0.1 support This patch to add PSP 14.0.1 support. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-16 15:42:15 -05:00
Yifan Zhang	c5ce1f1a21	drm/amdgpu: add smuio 14.0.1 support This patch to add smuio 14.0.1 support. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-16 15:42:11 -05:00
Yifan Zhang	bd377b1281	drm/amdgpu: add nbio 7.11.1 discovery support This patch to add nbio 7.11.1 support. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-16 15:42:08 -05:00
Yifan Zhang	dc84f52eb2	drm/amdgpu/nbio: Add NBIO 7.11.1 Support Fix up doorbell setup and clockgating. v2: squash in fixes (Alex) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-16 15:42:03 -05:00
Felix Kuehling	34a1de0f79	drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole The TBA and TMA, along with an unused IB allocation, reside at low addresses in the VM address space. A stray VM fault which hits these pages must be serviced by making their page table entries invalid. The scheduler depends upon these pages being resident and fails, preventing a debugger from inspecting the failure state. By relocating these pages above 47 bits in the VM address space they can only be reached when bits [63:48] are set to 1. This makes it much less likely for a misbehaving program to generate accesses to them. The current placement at VA (PAGE_SIZE*2) is readily hit by a NULL access with a small offset. v2: - Move it to the reserved space to avoid conflicts with Mesa - Add macros to make reserved space management easier v3: - Move VM max PFN calculation into AMDGPU_VA_RESERVED macros Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-16 15:41:50 -05:00
Alex Deucher	ba1a58d5b9	drm/amdgpu: add shared fdinfo stats Add shared stats. Useful for seeing shared memory. v2: take dma-buf into account as well v3: use the new gem helper Link: https://lore.kernel.org/all/20231207180225.439482-1-alexander.deucher@amd.com/ Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Rob Clark <robdclark@gmail.com> Reviewed-by: Christian König <christian.keonig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2024-02-16 12:52:50 +01:00
Thong	2f542421a4	drm/amdgpu/soc21: update VCN 4 max HEVC encoding resolution Update the maximum resolution reported for HEVC encoding on VCN 4 devices to reflect its 8K encoding capability. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3159 Signed-off-by: Thong <thong.thai@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-02-15 14:18:43 -05:00
Mario Limonciello	9163616853	Revert "drm/amd: flush any delayed gfxoff on suspend entry" commit `ab4750332d` ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") caused GFXOFF control to be used more heavily and the codepath that was removed from commit `0dee726395` ("drm/amd: flush any delayed gfxoff on suspend entry") now can be exercised at suspend again. Users report that by using GNOME to suspend the lockscreen trigger will cause SDMA traffic and the system can deadlock. This reverts commit `0dee726395`. Acked-by: Alex Deucher <alexander.deucher@amd.com> Fixes: `ab4750332d` ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-15 14:18:43 -05:00
Mario Limonciello	3a9626c816	drm/amd: Stop evicting resources on APUs in suspend commit `5095d54181` ("drm/amd: Evict resources during PM ops prepare() callback") intentionally moved the eviction of resources to earlier in the suspend process, but this introduced a subtle change that it occurs before adev->in_s0ix or adev->in_s3 are set. This meant that APUs actually started to evict resources at suspend time as well. Explicitly set s0ix or s3 in the prepare() stage, and unset them if the prepare() stage failed. v2: squash in warning fix from Stephen Rothwell Reported-by: Jürg Billeter <j@bitron.ch> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3132#note_2271038 Fixes: `5095d54181` ("drm/amd: Evict resources during PM ops prepare() callback") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-15 14:18:43 -05:00
Hamza Mahfooz	d16df040c8	drm/amdgpu: make damage clips support configurable We have observed that there are quite a number of PSR-SU panels on the market that are unable to keep up with what user space throws at them, resulting in hangs and random black screens. So, make damage clips support configurable and disable it by default for PSR-SU displays. Cc: stable@vger.kernel.org Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-15 14:18:42 -05:00
Likun Gao	efc11f34e2	drm/amdgpu: support psp ip block discovery for psp v14 Support PSP ip block discovery for psp v14. Add psp ip block for psp v14_0_2 and v14_0_3. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-14 17:18:27 -05:00
Likun Gao	8152825498	drm/amdgpu: add psp_timeout to limit PSP related operation Add a new parameter psp_timeout to limit psp related operation to unify the timeout limitation for psp. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-14 17:18:24 -05:00
Likun Gao	e71658299d	drm/amdgpu/psp: set boot_time_tmr flag Set boot_time_tmr flag for the ASIC which MP0 ip version newer than 14.0.2 For runtime TMR: Init tmr and load tmr should did. For boottime TMR: If do not support autoload, skip init TMR. If support autoload, execute init TMR but skip load tmr. v2: rebase (Alex) Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-14 17:18:20 -05:00
Likun Gao	2fb4460fb8	drm/amdgpu/psp: handle TMR type via flag Add flag boot_time_tmr to indicate boot time TMR or runtime TMR instead of function. v2: rework logic (Alex) Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-14 17:18:16 -05:00
Likun Gao	8d339b0df2	drm/amdgpu/psp: set autoload support by default Set psp->autoload_supported to true by default, as only a few version of ASIC not support autoload, and the future version of PSP should support this. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-14 17:18:06 -05:00
Likun Gao	a78791c2b2	drm/amdgpu: support psp ip block for psp v14 Support PSP ip block for psp v14. Add psp ip block for psp v14_0_2 and v14_0_3. v2: squash in 14.0.3 firmware fix (Alex) Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-14 17:16:12 -05:00
Likun Gao	f19cb91615	drm/amdgpu: use spirom update wait_for helper for psp v14 Spirom update typically requires extremely long duration for command execution, and special helper function to wait for its completion. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-14 17:16:07 -05:00
Felix Kuehling	efe0f34c2b	drm/amdgpu: Reduce VA_RESERVED_BOTTOM to 64KB The reservation is there to catch NULL pointer dereferences from the GPU. Reduce the size to 64KB to make sure that shared virtual address programming models can map all CPU-accessible virtual addresses for GPU access. This is also the default for CPU virtual address mappings as seen in /proc/sys/vm/mmap_min_addr. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-14 17:15:50 -05:00
Hawking Zhang	876fa5f8a0	drm/amdgpu: Add psp v14_0 ip block support Add psp v14_0 ip block support. v2: rebase (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-14 17:15:46 -05:00
Thong	fc2d4230e5	drm/amdgpu/soc21: update VCN 4 max HEVC encoding resolution Update the maximum resolution reported for HEVC encoding on VCN 4 devices to reflect its 8K encoding capability. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3159 Signed-off-by: Thong <thong.thai@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-14 17:13:46 -05:00
Mario Limonciello	ce311df91d	Revert "drm/amd: flush any delayed gfxoff on suspend entry" commit `ab4750332d` ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") caused GFXOFF control to be used more heavily and the codepath that was removed from commit `0dee726395` ("drm/amd: flush any delayed gfxoff on suspend entry") now can be exercised at suspend again. Users report that by using GNOME to suspend the lockscreen trigger will cause SDMA traffic and the system can deadlock. This reverts commit `0dee726395`. Acked-by: Alex Deucher <alexander.deucher@amd.com> Fixes: `ab4750332d` ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-13 08:59:50 -05:00
Mario Limonciello	226db36032	drm/amd: Stop evicting resources on APUs in suspend commit `5095d54181` ("drm/amd: Evict resources during PM ops prepare() callback") intentionally moved the eviction of resources to earlier in the suspend process, but this introduced a subtle change that it occurs before adev->in_s0ix or adev->in_s3 are set. This meant that APUs actually started to evict resources at suspend time as well. Explicitly set s0ix or s3 in the prepare() stage, and unset them if the prepare() stage failed. v2: squash in warning fix from Stephen Rothwell Reported-by: Jürg Billeter <j@bitron.ch> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3132#note_2271038 Fixes: `5095d54181` ("drm/amd: Evict resources during PM ops prepare() callback") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-13 08:59:49 -05:00
Dave Airlie	b344e64fbd	Merge tag 'amd-drm-next-6.9-2024-02-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.9-2024-02-09: amdgpu: - Validate DMABuf imports in compute VMs - Add RAS ACA framework - PSP 13 fixes - Misc code cleanups - Replay fixes - Atom interpreter PS, WS bounds checking - DML2 fixes - Audio fixes - DCN 3.5 Z state fixes - Remove deprecated ida_simple usage - UBSAN fixes - RAS fixes - Enable seq64 infrastructure - DC color block enablement - Documentation updates - DC documentation updates - DMCUB updates - S3 fixes - VCN 4.0.5 fixes - DP MST fixes - SR-IOV fixes amdkfd: - Validate DMABuf imports in compute VMs - SVM fixes - Trap handler updates radeon: - Atom interpreter PS, WS bounds checking - Misc code cleanups UAPI: - Bump KFD version so UMDs know that the fixes that enable the management of VA mappings in compute VMs using the GEM_VA ioctl for DMABufs exported from KFD are present - Add INFO query for input power. This matches the existing INFO query for average power. Used in gaming HUDs, etc. Example userspace: https://github.com/Umio-Yasuno/libdrm-amdgpu-sys-rs/tree/input_power From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240209221459.5453-1-alexander.deucher@amd.com	2024-02-13 11:32:23 +10:00
Alex Deucher	ddc23e6e23	drm/amdgpu/psp: update define to better align with its meaning MEM_TRAINING_ENCROACHED_SIZE is for BIST training data. It's not memory type specific. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:14:12 -05:00
Hamza Mahfooz	040fdcde28	drm/amdgpu: respect the abmlevel module parameter value if it is set Currently, if the abmlevel module parameter is set, it is possible for user space to override the ABM level at some point after boot. However, that is undesirable because it means that we aren't respecting the user's wishes with regard to the level that they want to use. So, prevent user space from changing the ABM level if the module parameter is set to a non-auto value. Tested-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:14:05 -05:00
Sonny Jiang	cff9960317	drm/amdgpu: Add jpeg_v5_0_0 ip block support Enable support for jpeg_v5_0_0 ip block. Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:13:44 -05:00
Sonny Jiang	75a178926c	drm/amdgpu/jpeg5: Enable doorbell Add doorbell for JPEG5 Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:13:39 -05:00
Sonny Jiang	785e53a83b	drm/amdgpu/jpeg5: add power gating support Add PG support for JPEG5 Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:13:35 -05:00
Sonny Jiang	dfad65c657	drm/amdgpu: Add JPEG5 support Add support for JPEG5 Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:12:00 -05:00
Sonny Jiang	470675f6bf	amdgpu/drm: Add vcn_v5_0_0_ip_block support Enable support for vcn_v5_0_0_ip_block Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:11:53 -05:00
Hamza Mahfooz	fc184dbe9f	drm/amdgpu: make damage clips support configurable We have observed that there are quite a number of PSR-SU panels on the market that are unable to keep up with what user space throws at them, resulting in hangs and random black screens. So, make damage clips support configurable and disable it by default for PSR-SU displays. Cc: stable@vger.kernel.org Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:10:24 -05:00
Sonny Jiang	b6d1a06320	drm/amdgpu: add VCN_5_0_0 IP block support Add VCN_5_0_0 IP init, ring functions, DPG support. v2: squash in warning fixes (Alex) v3: squash in block and ring init, boot, doorbell enablement, DPG support (Alex) Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:10:18 -05:00
Sonny Jiang	816dae1d69	drm/amdgpu: add VCN_5_0_0 firmware support Add VCN5_0_0 firmware support Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:10:14 -05:00
Likun Gao	ca46c25909	drm/amdgpu/discovery: Add hdp v7_0 ip block Add hdp v7_0 ip block Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:10:00 -05:00
Likun Gao	f3bcdf2d90	drm/amdgpu: Add hdp v7_0 ip block support Add hdp v7_0 ip block support. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:09:57 -05:00
Likun Gao	56018e8363	drm/amdgpu/discovery: Add ih v7_0 ip block Add ih v7_0 ip block. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:09:45 -05:00
Likun Gao	12443fc53e	drm/amdgpu: Add ih v7_0 ip block support Add ih v7_0 ip block support. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:09:42 -05:00
Saleemkhan Jamadar	0a119d53f7	drm/amdgpu/jpeg: add support for jpeg DPG mode Jpeg DPG support for GC IP v11_5_0 Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:09:32 -05:00
Saleemkhan Jamadar	617efef4af	drm/amdgpu: add ucode id for jpeg DPG support add ucode id and cmd buffer for jpeg psp sram programming and Jpeg DPG support. Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:09:11 -05:00
Likun Gao	39df603d2c	drm/amdgpu/discovery: Add lsdma v7_0 ip block Add lsdma v7_0 ip block. v2: squash in updates (Alex) Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:08:46 -05:00
Likun Gao	aa2fb23605	drm/amdgpu: Add lsdma v7_0 ip block support Add lsdma v7_0 ip block support. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:08:41 -05:00
Yang Wang	f579c06bdc	drm/amdgpu: send smu rma reason event in ras eeprom driver send smu rma reason event to smu in ras eeprom driver. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:08:27 -05:00
Hawking Zhang	53edf77179	drm/amdgpu: Add athub v4_1_0 ip block support Add athub v4_1_0 ip block support. v2: fix clang warning (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:08:12 -05:00
Likun Gao	b35c3feafe	drm/amdgpu: support rlc auotload type set Support to set fw_load_type=3 to use backdoor rlc autoload. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:07:45 -05:00
Likun Gao	45b801c24c	drm/amdgpu: skip ucode bo reserve for RLC AUTOLOAD Skip ucode BO reservation for backdoor RLC autoload. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-12 16:07:42 -05:00
Lijo Lazar	534c8a5b9d	drm/amdgpu: Fix HDP flush for VFs on nbio v7.9 HDP flush remapping is not done for VFs. Keep the original offsets in VF environment. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 18:30:11 -05:00
Lijo Lazar	55173942a6	drm/amdgpu: Avoid fetching VRAM vendor info The present way to fetch VRAM vendor information turns out to be not reliable on GFX 9.4.3 dGPUs as well. Avoid using the data. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-02-07 18:28:31 -05:00
Stanley.Yang	2dcf82a8e8	drm/amdgpu: Fix shared buff copy to user ta if invoke node buffer \|-------- ta type ----------\| \|-------- ta id ----------\| \|-------- cmd id ----------\| \|------ shared buf len -----\| \|------ shared buffer ------\| ta if invoke node buffer is as above, copy shared buffer data to correct location Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 18:22:04 -05:00
Li Ma	897925dcc5	drm/amdgpu: remove asymmetrical irq disabling in jpeg 4.0.5 suspend A supplement to commit: `615dd56ac5` There is an irq warning of jpeg during resume in s2idle process. No irq enabled in jpeg 4.0.5 resume. Fixes: `615dd56ac5` ("drm/amdgpu: remove asymmetrical irq disabling in vcn 4.0.5 suspend") Signed-off-by: Li Ma <li.ma@amd.com> Acked-By: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 18:19:22 -05:00
Prike Liang	6ef82ac664	drm/amdgpu: reset gpu for s3 suspend abort case In the s3 suspend abort case some type of gfx9 power rail not turn off from FCH side and this will put the GPU in an unknown power status, so let's reset the gpu to a known good power state before reinitialize gpu device. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 18:19:08 -05:00
Prike Liang	93bafa32a6	drm/amdgpu: skip to program GFXDEC registers for suspend abort In the suspend abort cases, the gfx power rail doesn't turn off so some GFXDEC registers/CSB can't reset to default value and at this moment reinitialize GFXDEC/CSB will result in an unexpected error. So let skip those program sequence for the suspend abort case. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 18:19:04 -05:00
Lijo Lazar	d559744403	drm/amdgpu: Fix HDP flush for VFs on nbio v7.9 HDP flush remapping is not done for VFs. Keep the original offsets in VF environment. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 12:26:24 -05:00
Lijo Lazar	3d1554d999	drm/amdgpu: Avoid fetching VRAM vendor info The present way to fetch VRAM vendor information turns out to be not reliable on GFX 9.4.3 dGPUs as well. Avoid using the data. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 12:26:23 -05:00
Stanley.Yang	28b34ad207	drm/amdgpu: Fix shared buff copy to user ta if invoke node buffer \|-------- ta type ----------\| \|-------- ta id ----------\| \|-------- cmd id ----------\| \|------ shared buf len -----\| \|------ shared buffer ------\| ta if invoke node buffer is as above, copy shared buffer data to correct location Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 12:26:22 -05:00
Srinivasan Shanmugam	cdb637d339	drm/amdgpu: Fix potential out-of-bounds access in 'amdgpu_discovery_reg_base_init()' The issue arises when the array 'adev->vcn.vcn_config' is accessed before checking if the index 'adev->vcn.num_vcn_inst' is within the bounds of the array. The fix involves moving the bounds check before the array access. This ensures that 'adev->vcn.num_vcn_inst' is within the bounds of the array before it is used as an index. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1289 amdgpu_discovery_reg_base_init() error: testing array offset 'adev->vcn.num_vcn_inst' after use. Fixes: `a0ccc717c4` ("drm/amdgpu/discovery: validate VCN and SDMA instances") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 12:26:22 -05:00
Li Ma	0a8ff0cbee	drm/amdgpu: remove asymmetrical irq disabling in jpeg 4.0.5 suspend A supplement to commit: `615dd56ac5` There is an irq warning of jpeg during resume in s2idle process. No irq enabled in jpeg 4.0.5 resume. Fixes: `615dd56ac5` ("drm/amdgpu: remove asymmetrical irq disabling in vcn 4.0.5 suspend") Signed-off-by: Li Ma <li.ma@amd.com> Acked-By: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 12:26:22 -05:00
Prike Liang	3f719cf22f	drm/amdgpu: reset gpu for s3 suspend abort case In the s3 suspend abort case some type of gfx9 power rail not turn off from FCH side and this will put the GPU in an unknown power status, so let's reset the gpu to a known good power state before reinitialize gpu device. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 12:26:22 -05:00
Prike Liang	0326de4c44	drm/amdgpu: skip to program GFXDEC registers for suspend abort In the suspend abort cases, the gfx power rail doesn't turn off so some GFXDEC registers/CSB can't reset to default value and at this moment reinitialize GFXDEC/CSB will result in an unexpected error. So let skip those program sequence for the suspend abort case. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 12:26:22 -05:00
Qiang Ma	aeaf3e6cf8	drm/amdgpu: Clear the hotplug interrupt ack bit before hpd initialization Problem: The computer in the bios initialization process, unplug the HDMI display, wait until the system up, plug in the HDMI display, did not enter the hotplug interrupt function, the display is not bright. Fix: After the above problem occurs, and the hpd ack interrupt bit is 1, the interrupt should be cleared during hpd_init initialization so that when the driver is ready, it can respond to the hpd interrupt normally. Signed-off-by: Qiang Ma <maqianga@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 10:01:10 -05:00
Alex Deucher	39a82d304b	drm/amdgpu: fix typo in parameter description Missing space. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 10:01:05 -05:00
shaoyunl	3fad156572	drm/amdgpu: Only create mes event log debugfs when mes is enabled Skip the debugfs file creation for mes event log if the GPU doesn't use MES. This to prevent potential kernel oops when user try to read the event log in debugfs on a GPU without MES Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-02-07 10:00:56 -05:00
Thomas Zimmermann	0e85f1ae4a	Merge drm/drm-next into drm-misc-next Backmerging to update drm-misc-next to the state of v6.8-rc3. Also fixes a build problem with xe. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2024-02-07 13:02:20 +01:00
Friedrich Vock	7330256268	drm/amdgpu: Reset IH OVERFLOW_CLEAR bit Allows us to detect subsequent IH ring buffer overflows as well. Cc: Joshua Ashton <joshua@froggi.es> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 17:39:47 -05:00
Yifan Zhang	4f56acdee4	drm/amdgpu: remove asymmetrical irq disabling in vcn 4.0.5 suspend There is no irq enabled in vcn 4.0.5 resume, causing wrong amdgpu_irq_src status. Besides, the current set function callbacks are empty and have no real effect. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 17:39:40 -05:00
Yifan Zhang	de4a733868	drm/amdgpu: drm/amdgpu: remove golden setting for gfx 11.5.0 No need to set GC golden settings in driver from gfx 11.5.0 onwards. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 17:38:02 -05:00
Lang Yu	9c29282ecb	drm/amdkfd: reserve the BO before validating it Fix a warning. v2: Avoid unmapping attachment repeatedly when ERESTARTSYS. v3: Lock the BO before accessing ttm->sg to avoid race conditions.(Felix) [ 41.708711] WARNING: CPU: 0 PID: 1463 at drivers/gpu/drm/ttm/ttm_bo.c:846 ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.708989] Call Trace: [ 41.708992] <TASK> [ 41.708996] ? show_regs+0x6c/0x80 [ 41.709000] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709008] ? __warn+0x93/0x190 [ 41.709014] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709024] ? report_bug+0x1f9/0x210 [ 41.709035] ? handle_bug+0x46/0x80 [ 41.709041] ? exc_invalid_op+0x1d/0x80 [ 41.709048] ? asm_exc_invalid_op+0x1f/0x30 [ 41.709057] ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu] [ 41.709185] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709197] ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu] [ 41.709337] ? srso_alias_return_thunk+0x5/0x7f [ 41.709346] kfd_mem_dmaunmap_attachment+0x9e/0x1e0 [amdgpu] [ 41.709467] amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x56/0x80 [amdgpu] [ 41.709586] kfd_ioctl_unmap_memory_from_gpu+0x1b7/0x300 [amdgpu] [ 41.709710] kfd_ioctl+0x1ec/0x650 [amdgpu] [ 41.709822] ? __pfx_kfd_ioctl_unmap_memory_from_gpu+0x10/0x10 [amdgpu] [ 41.709945] ? srso_alias_return_thunk+0x5/0x7f [ 41.709949] ? tomoyo_file_ioctl+0x20/0x30 [ 41.709959] __x64_sys_ioctl+0x9c/0xd0 [ 41.709967] do_syscall_64+0x3f/0x90 [ 41.709973] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: `101b810430` ("drm/amdkfd: Move dma unmapping after TLB flush") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 17:37:53 -05:00
Srinivasan Shanmugam	16da399091	drm/amdgpu: Fix missing error code in 'gmc_v6/7/8/9_0_hw_init()' Return 0 for success scenarios in 'gmc_v6/7/8/9_0_hw_init()' Fixes the below: drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:920 gmc_v6_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1104 gmc_v7_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1224 gmc_v8_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2347 gmc_v9_0_hw_init() warn: missing error code? 'r' Fixes: `fac4ebd79f` ("drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 17:36:44 -05:00
David McFarland	8ef85a0ce2	drm/amd: Don't init MEC2 firmware when it fails to load The same calls are made directly above, but conditional on the firmware loading and validating successfully. Cc: stable@vger.kernel.org Fixes: `9931b67690` ("drm/amd: Load GFX10 microcode during early_init") Signed-off-by: David McFarland <corngood@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 17:34:14 -05:00
Ma Jun	bb34bc2cd3	drm/amdgpu: Fix the warning info in mode1 reset Fix the warning info below during mode1 reset. [ +0.000004] Call Trace: [ +0.000004] <TASK> [ +0.000006] ? show_regs+0x6e/0x80 [ +0.000011] ? __flush_work.isra.0+0x2e8/0x390 [ +0.000005] ? __warn+0x91/0x150 [ +0.000009] ? __flush_work.isra.0+0x2e8/0x390 [ +0.000006] ? report_bug+0x19d/0x1b0 [ +0.000013] ? handle_bug+0x46/0x80 [ +0.000012] ? exc_invalid_op+0x1d/0x80 [ +0.000011] ? asm_exc_invalid_op+0x1f/0x30 [ +0.000014] ? __flush_work.isra.0+0x2e8/0x390 [ +0.000007] ? __flush_work.isra.0+0x208/0x390 [ +0.000007] ? _prb_read_valid+0x216/0x290 [ +0.000008] __cancel_work_timer+0x11d/0x1a0 [ +0.000007] ? try_to_grab_pending+0xe8/0x190 [ +0.000012] cancel_work_sync+0x14/0x20 [ +0.000008] amddrm_sched_stop+0x3c/0x1d0 [amd_sched] [ +0.000032] amdgpu_device_gpu_recover+0x29a/0xe90 [amdgpu] This warning info was printed after applying the patch "drm/sched: Convert drm scheduler to use a work queue rather than kthread". The root cause is that amdgpu driver tries to use the uninitialized work_struct in the struct drm_gpu_scheduler v2: - Rename the function to amdgpu_ring_sched_ready and move it to amdgpu_ring.c (Alex) v3: - Fix a few more checks based on Vitaly's patch (Alex) v4: - squash in fix noticed by Bert in https://gitlab.freedesktop.org/drm/amd/-/issues/3139 Fixes: `11b3b9f461` ("drm/sched: Check scheduler ready before calling timeout handling") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 17:34:05 -05:00
Le Ma	db2aad036e	drm/amdgpu: move the drm client creation behind drm device registration This patch eliminates the interrupt warning below: "[drm] Fence fallback timer expired on ring sdma0.0". An early vm pt clearing job is sent to SDMA before interrupts are enabled. Relocating the drm client creation to after drm_dev_register also looks like a more proper flow. v2: wrap the drm client creation Fixes: `1819200166` ("drm/amdkfd: Export DMABufs from KFD using GEM handles") Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 15:33:52 -05:00
Friedrich Vock	9217b91c64	drm/amdgpu: Reset IH OVERFLOW_CLEAR bit Allows us to detect subsequent IH ring buffer overflows as well. Cc: Joshua Ashton <joshua@froggi.es> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 14:05:28 -05:00
Yifan Zhang	615dd56ac5	drm/amdgpu: remove asymmetrical irq disabling in vcn 4.0.5 suspend There is no irq enabled in vcn 4.0.5 resume, causing wrong amdgpu_irq_src status. Besides, the current set function callbacks are empty and have no real effect. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 14:05:20 -05:00
Tao Zhou	01087a1974	drm/amdgpu: use PSP address query command Get UMC physical address from PSP in RAS error address conversion. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 14:05:19 -05:00
Tao Zhou	a1eac5bd91	drm/amdgpu: add PSP RAS address query command Convert mca address to physical address or vice versa via RAS TA. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 14:05:19 -05:00
Yifan Zhang	e4d65510e8	drm/amdgpu: drm/amdgpu: remove golden setting for gfx 11.5.0 No need to set GC golden settings in driver from gfx 11.5.0 onwards. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 14:05:19 -05:00
Lang Yu	0c93bd4957	drm/amdkfd: reserve the BO before validating it Fix a warning. v2: Avoid unmapping attachment repeatedly when ERESTARTSYS. v3: Lock the BO before accessing ttm->sg to avoid race conditions.(Felix) [ 41.708711] WARNING: CPU: 0 PID: 1463 at drivers/gpu/drm/ttm/ttm_bo.c:846 ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.708989] Call Trace: [ 41.708992] <TASK> [ 41.708996] ? show_regs+0x6c/0x80 [ 41.709000] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709008] ? __warn+0x93/0x190 [ 41.709014] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709024] ? report_bug+0x1f9/0x210 [ 41.709035] ? handle_bug+0x46/0x80 [ 41.709041] ? exc_invalid_op+0x1d/0x80 [ 41.709048] ? asm_exc_invalid_op+0x1f/0x30 [ 41.709057] ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu] [ 41.709185] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709197] ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu] [ 41.709337] ? srso_alias_return_thunk+0x5/0x7f [ 41.709346] kfd_mem_dmaunmap_attachment+0x9e/0x1e0 [amdgpu] [ 41.709467] amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x56/0x80 [amdgpu] [ 41.709586] kfd_ioctl_unmap_memory_from_gpu+0x1b7/0x300 [amdgpu] [ 41.709710] kfd_ioctl+0x1ec/0x650 [amdgpu] [ 41.709822] ? __pfx_kfd_ioctl_unmap_memory_from_gpu+0x10/0x10 [amdgpu] [ 41.709945] ? srso_alias_return_thunk+0x5/0x7f [ 41.709949] ? tomoyo_file_ioctl+0x20/0x30 [ 41.709959] __x64_sys_ioctl+0x9c/0xd0 [ 41.709967] do_syscall_64+0x3f/0x90 [ 41.709973] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: `101b810430` ("drm/amdkfd: Move dma unmapping after TLB flush") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 14:05:19 -05:00
Srinivasan Shanmugam	fa8a91b0e5	drm/amdgpu: Fix missing error code in 'gmc_v6/7/8/9_0_hw_init()' Return 0 for success scenarios in 'gmc_v6/7/8/9_0_hw_init()' Fixes the below: drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:920 gmc_v6_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1104 gmc_v7_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1224 gmc_v8_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2347 gmc_v9_0_hw_init() warn: missing error code? 'r' Fixes: `fac4ebd79f` ("drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 14:05:19 -05:00
YiPeng Chai	adb4d6a40d	drm/amdgpu: Need to resume ras during gpu reset for gfx v9_4_3 sriov Need to resume ras during gpu reset for gfx v9_4_3 sriov Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 14:05:18 -05:00
Tao Zhou	edfdde9013	drm/amdgpu: disable RAS feature when fini Send RAS disable feature command in fini. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 14:05:18 -05:00
Hawking Zhang	1731ba9b64	drm/amdgpu: Update boot time errors polling sequence Update boot time errors polling sequence to align with the latest firmware change. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 14:04:55 -05:00
David McFarland	c3ec8c4f9a	drm/amd: Don't init MEC2 firmware when it fails to load The same calls are made directly above, but conditional on the firmware loading and validating successfully. Cc: stable@vger.kernel.org Fixes: `9931b67690` ("drm/amd: Load GFX10 microcode during early_init") Signed-off-by: David McFarland <corngood@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 13:53:06 -05:00
Ma Jun	9749c86843	drm/amdgpu: Fix the warning info in mode1 reset Fix the warning info below during mode1 reset. [ +0.000004] Call Trace: [ +0.000004] <TASK> [ +0.000006] ? show_regs+0x6e/0x80 [ +0.000011] ? __flush_work.isra.0+0x2e8/0x390 [ +0.000005] ? __warn+0x91/0x150 [ +0.000009] ? __flush_work.isra.0+0x2e8/0x390 [ +0.000006] ? report_bug+0x19d/0x1b0 [ +0.000013] ? handle_bug+0x46/0x80 [ +0.000012] ? exc_invalid_op+0x1d/0x80 [ +0.000011] ? asm_exc_invalid_op+0x1f/0x30 [ +0.000014] ? __flush_work.isra.0+0x2e8/0x390 [ +0.000007] ? __flush_work.isra.0+0x208/0x390 [ +0.000007] ? _prb_read_valid+0x216/0x290 [ +0.000008] __cancel_work_timer+0x11d/0x1a0 [ +0.000007] ? try_to_grab_pending+0xe8/0x190 [ +0.000012] cancel_work_sync+0x14/0x20 [ +0.000008] amddrm_sched_stop+0x3c/0x1d0 [amd_sched] [ +0.000032] amdgpu_device_gpu_recover+0x29a/0xe90 [amdgpu] This warning info was printed after applying the patch "drm/sched: Convert drm scheduler to use a work queue rather than kthread". The root cause is that amdgpu driver tries to use the uninitialized work_struct in the struct drm_gpu_scheduler v2: - Rename the function to amdgpu_ring_sched_ready and move it to amdgpu_ring.c (Alex) v3: - Fix a few more checks based on Vitaly's patch (Alex) v4: - squash in fix noticed by Bert in https://gitlab.freedesktop.org/drm/amd/-/issues/3139 Fixes: `11b3b9f461` ("drm/sched: Check scheduler ready before calling timeout handling") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-31 09:40:42 -05:00
Jani Nikula	345a36c4f1	drm/amdgpu: prefer snprintf over sprintf This will trade the W=1 warning -Wformat-overflow to -Wformat-truncation. This lets us enable -Wformat-overflow subsystem wide. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Pan, Xinhui <Xinhui.Pan@amd.com> Cc: amd-gfx@lists.freedesktop.org Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/fea7a52924f98b1ac24f4a7e6ba21d7754422430.1704908087.git.jani.nikula@intel.com	2024-01-31 11:04:25 +02:00
Yang Wang	788686e2d9	drm/amdgpu: use helper macro HW_ERR instead of Hardware error string Use the helper macro HW_ERR instead of the hardware error string. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-29 15:47:02 -05:00
Le Ma	c0125b848a	drm/amdgpu: move the drm client creation behind drm device registration This patch eliminates the interrupt warning below: "[drm] Fence fallback timer expired on ring sdma0.0". An early vm pt clearing job is sent to SDMA before interrupts are enabled. Relocating the drm client creation to after drm_dev_register also looks like a more proper flow. v2: wrap the drm client creation Fixes: `1819200166` ("drm/amdkfd: Export DMABufs from KFD using GEM handles") Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-29 15:35:13 -05:00
Mario Limonciello	c92c108403	Revert "drm/amd/pm: fix the high voltage and temperature issue" This reverts commit `5f38ac54e6`. This causes issues with rebooting and the 7800XT. Cc: Kenneth Feng <kenneth.feng@amd.com> Cc: stable@vger.kernel.org Fixes: `5f38ac54e6` ("drm/amd/pm: fix the high voltage and temperature issue") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3062 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-29 08:56:13 -05:00
Maxime Ripard	4db102dcb0	Merge drm/drm-next into drm-misc-next Kickstart 6.9 development cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-01-29 14:20:23 +01:00
Alex Deucher	3380fcad2c	drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs This needs to be set to 1 to avoid a potential deadlock in the GC 10.x and newer. On GC 9.x and older, this needs to be set to 0. This can lead to hangs in some mixed graphics and compute workloads. Updated firmware is also required for AQL. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-01-25 15:48:57 -05:00
Alex Deucher	03ff6d7238	drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs This needs to be set to 1 to avoid a potential deadlock in the GC 10.x and newer. On GC 9.x and older, this needs to be set to 0. This can lead to hangs in some mixed graphics and compute workloads. Updated firmware is also required for AQL. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-01-25 15:48:51 -05:00
Tom St Denis	e7a8594cc2	drm/amd/amdgpu: Assign GART pages to AMD device mapping This allows kernel mapped pages like the PDB and PTB to be read via the iomem debugfs when there is no vram in the system. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x	2024-01-25 15:47:36 -05:00
Lijo Lazar	89a7c0bd74	drm/amdgpu: Show vram vendor only if available Show vram vendor info in sysfs only if it is available. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x	2024-01-25 15:44:11 -05:00
Lijo Lazar	90751bdeee	drm/amdgpu: Avoid fetching vram vendor information For GFX 9.4.3 APUs, the current method of fetching vram vendor information is not reliable. Avoid fetching the information. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x	2024-01-25 15:41:57 -05:00
Victor Skvortsov	362936d613	amdgpu/drm: Use vram manager for virtualization page retirement In runtime, use vram manager for virtualization page retirement. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-25 14:58:03 -05:00
Victor Skvortsov	2474414c60	drm/amdgpu: Add RAS_POISON_READY host response message In a non-FLR page avoidance scenario, the host driver will provide the bad pages in the pf2vf exchange region. Adding a new host response message to indicate when the pf2vf exchange region has been updated. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-25 14:58:03 -05:00
YiPeng Chai	ed1e1e42fd	drm/amdgpu: Support passing poison consumption ras block to SRIOV Support passing poison consumption ras blocks to SRIOV. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-25 14:58:03 -05:00
Yang Wang	c0c48f0d61	drm/amdgpu: adjust aca init/fini sequence to match gpu reset - move aca init/fini function into ras init/fini to adapt gpu reset sequence. - add new function amdgpu_aca_reset() Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-25 14:58:02 -05:00
Yang Wang	6eb726a082	drm/amdgpu: add aca sysfs remove support add aca sysfs remove support. Fixes: `37973b69ea` ("drm/amdgpu: add aca sysfs support") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-25 14:58:02 -05:00
Mukul Joshi	c84a7e21db	drm/amdgpu: Fix module unload hang with RAS enabled The driver unload hangs because the page retirement kthread cannot be stopped as it is sleeping and waiting on page retirement event to occur. Add kthread_should_stop() to the event condition to wake up the kthread when kthread stop is called during driver unload. Fixes: `3fdcd0a31d` ("drm/amdgpu: Prepare for asynchronous processing of umc page retirement") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-25 14:57:52 -05:00
Alex Deucher	fc8f5a29d4	drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs This needs to be set to 1 to avoid a potential deadlock in the GC 10.x and newer. On GC 9.x and older, this needs to be set to 0. This can lead to hangs in some mixed graphics and compute workloads. Updated firmware is also required for AQL. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-01-25 14:49:03 -05:00
Alex Deucher	ca01082353	drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs This needs to be set to 1 to avoid a potential deadlock in the GC 10.x and newer. On GC 9.x and older, this needs to be set to 0. This can lead to hangs in some mixed graphics and compute workloads. Updated firmware is also required for AQL. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-01-25 14:48:58 -05:00
Tom St Denis	8352ca1090	drm/amd/amdgpu: Assign GART pages to AMD device mapping This allows kernel mapped pages like the PDB and PTB to be read via the iomem debugfs when there is no vram in the system. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-25 14:48:52 -05:00
Srinivasan Shanmugam	8d1717fb64	drm/amdgpu: Fix return type in 'aca_bank_hwip_is_matched()' Change the return type of "if (!bank || type == ACA_HWIP_TYPE_UNKNOW)" to be bool instead of int. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c:185 aca_bank_hwip_is_matched() warn: signedness bug returning '(-22)' Fixes: `f5e4cc8461` ("drm/amdgpu: implement RAS ACA driver framework") Cc: Yang Wang <kevinyang.wang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-25 14:47:03 -05:00
Somalapuram Amaranath	a78a8da51b	drm/ttm: replace busy placement with flags v6 Instead of a list of separate busy placement add flags which indicate that a placement should only be used when there is room or if we need to evict. v2: add missing TTM_PL_FLAG_IDLE for i915 v3: fix auto build test ERROR on drm-tip/drm-tip v4: fix some typos pointed out by checkpatch v5: cleanup some rebase problems with VMWGFX v6: implement some missing VMWGFX functionality pointed out by Zack, rename the flags as suggested by Michel, rebase on drm-tip and adjust XE as well Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Reviewed-by: Zack Rusin <zack.rusin@broadcom.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240112125158.2748-4-christian.koenig@amd.com	2024-01-25 09:59:44 +01:00
Mario Limonciello	7055c5856a	Revert "drm/amd/pm: fix the high voltage and temperature issue" This reverts commit `5f38ac54e6`. This causes issues with rebooting and the 7800XT. Cc: Kenneth Feng <kenneth.feng@amd.com> Cc: stable@vger.kernel.org Fixes: `5f38ac54e6` ("drm/amd/pm: fix the high voltage and temperature issue") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3062 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:28 -05:00
Yang Wang	2866a4549c	drm/amdgpu: skip call ras_late_init if ras block is not supported Skip calling the ras_late_init callback if the ras block is not supported. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:28 -05:00
Lijo Lazar	9c3f6e2c4e	drm/amdgpu: Show vram vendor only if available Show vram vendor info in sysfs only if it is available. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:28 -05:00
Lijo Lazar	e0eb08dcec	drm/amdgpu: Avoid fetching vram vendor information For GFX 9.4.3 APUs, the current method of fetching vram vendor information is not reliable. Avoid fetching the information. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:28 -05:00
Tao Zhou	1757bb7dab	drm/amdgpu: update check condition of query for ras page retire Support page retirement handling in debug mode. v2: revert smu_v13_0_6_get_ecc_info directly. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:26 -05:00
Srinivasan Shanmugam	be91a828d0	drm/amdgpu: Cleanup inconsistent indenting in 'amdgpu_gfx_enable_kcq()' Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:645 amdgpu_gfx_enable_kcq() warn: inconsistent indenting Cc: Le Ma <Le.Ma@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:26 -05:00
YiPeng Chai	0795b5d234	drm/amdgpu:Support retiring multiple MCA error address pages Support retiring multiple MCA error address pages in one in-band query for umc v12_0. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:25 -05:00
YiPeng Chai	afb617f38f	drm/amdgpu: add interface to check mca umc status Add interface to check mca umc status. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:25 -05:00
YiPeng Chai	6c23f3d12a	drm/amdgpu: Use asynchronous polling to handle umc_v12_0 poisoning Use asynchronous polling to handle umc_v12_0 poisoning. v2: 1. Change function name. 2. Change the debugging information content. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:25 -05:00
Stanley.Yang	ee9c3031d0	drm/amdgpu: Fix ras features value calltrace The high three bits of the ras features mask indicate the socket id, so skip checking the high three bits of the mask before disabling all ras features. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:25 -05:00
YiPeng Chai	3fdcd0a31d	drm/amdgpu: Prepare for asynchronous processing of umc page retirement Preparing for asynchronous processing of umc page retirement. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:25 -05:00
YiPeng Chai	22f6e3e112	drm/amdgpu: Add log info for umc_v12_0 Add log info for umc_v12_0. v2: Delete redundant logs. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-22 17:13:25 -05:00
Arunpravin Paneer Selvam	00a11f977b	drm/amdgpu: Enable seq64 manager and fix bugs - Enable the seq64 mapping sequence. - Fix wflinfo va conflict and other bugs. v1: - The seq64 area needs to be included in the AMDGPU_VA_RESERVED_SIZE otherwise the areas will conflict with user space allocations (Alex) - It needs to be mapped read only in the user VM (Alex) v2: - Instead of just one define for TOP/BOTTOM reserved space separate them into two (Christian) - Fix the CPU and VA calculations and while at it also cleanup error handling and kerneldoc (Christian) Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2024-01-22 17:13:18 -05:00
Linus Torvalds	e08b575815	drm fixes for 6.8-rc1 amdgpu: - DSC fixes - DC resource pool fixes - OTG fix - DML2 fixes - Aux fix - GFX10 RLC firmware handling fix - Revert a broken workaround for SMU 13.0.2 - DC writeback fix - Enable gfxoff when ROCm apps are active on gfx11 with the proper FW version amdkfd: - Fix dma-buf exports using GEM handles nouveau: - fix a unneeded WARN_ON triggering xe: - Fix for definition of wakeref_t - Fix for an error code aliasing - Fix for VM_UNBIND_ALL in the case there are no bound VMAs - Fixes for a number of __iomem address space mismatches reported by sparse - Fixes for the assignment of exec_queue priority - A Fix for skip_guc_pc not taking effect - Workaround for a build problem on GCC 11 - A couple of fixes for error paths - Fix a Flat CCS compression metadata copy issue - Fix a misplace array bounds checking - Don't have display support depend on EXPERT (as discussed on IRC) Merge tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm Pull more drm fixes from Dave Airlie: "This is mostly amdgpu and xe fixes, with an amdkfd and nouveau fix thrown in.
The amdgpu ones are just the usual couple of weeks of fixes. The xe ones are bunch of cleanups for the new xe driver, the fix you put in on the merge commit and the kconfig fix that was hiding the problem from me. amdgpu: - DSC fixes - DC resource pool fixes - OTG fix - DML2 fixes - Aux fix - GFX10 RLC firmware handling fix - Revert a broken workaround for SMU 13.0.2 - DC writeback fix - Enable gfxoff when ROCm apps are active on gfx11 with the proper FW version amdkfd: - Fix dma-buf exports using GEM handles nouveau: - fix a unneeded WARN_ON triggering xe: - Fix for definition of wakeref_t - Fix for an error code aliasing - Fix for VM_UNBIND_ALL in the case there are no bound VMAs - Fixes for a number of __iomem address space mismatches reported by sparse - Fixes for the assignment of exec_queue priority - A Fix for skip_guc_pc not taking effect - Workaround for a build problem on GCC 11 - A couple of fixes for error paths - Fix a Flat CCS compression metadata copy issue - Fix a misplace array bounds checking - Don't have display support depend on EXPERT (as discussed on IRC)" * tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm: (71 commits) nouveau/vmm: don't set addr on the fail path to avoid warning drm/amdgpu: Enable GFXOFF for Compute on GFX11 drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for writeback requests. 
drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2" drm/amdkfd: init drm_client with funcs hook drm/amd/display: Fix a switch statement in populate_dml_output_cfg_from_stream_state() drm/amdgpu: Fix the null pointer when load rlc firmware drm/amd/display: Align the returned error code with legacy DP drm/amd/display: Fix DML2 watermark calculation drm/amd/display: Clear OPTC mem select on disable drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A drm/amd/display: Add logging resource checks drm/amd/display: Init link enc resources in dc_state only if res_pool presents drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()' drm/amd/display: Avoid enum conversion warning drm/amd/pm: Fix smuv13.0.6 current clock reporting drm/amd/pm: Add error log for smu v13.0.6 reset drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()' drm/amdgpu: drop exp hw support check for GC 9.4.3 drm/amdgpu: move debug options init prior to amdgpu device init ...	2024-01-19 11:50:00 -08:00
Linus Torvalds	ed8d84530a	This cycle, I2C removes the currently unused CLASS_DDC support (controllers set the flag, but there is no client to use it). Also, CLASS_SPD support gets simplified to prepare removal in the future. Class based instantiation is not recommended these days anyhow. Furthermore, I2C core now creates a debugfs directory per I2C adapter. Current bus driver users were converted to use it. Then, there are also quite some driver updates. Standing out are patches for the wmt-driver which is refactored to support more variants. This is the rebased pull request where a large series for the designware driver was dropped. Merge tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "This removes the currently unused CLASS_DDC support (controllers set the flag, but there is no client to use it). Also, CLASS_SPD support gets simplified to prepare removal in the future. Class based instantiation is not recommended these days anyhow. Furthermore, I2C core now creates a debugfs directory per I2C adapter. 
Current bus driver users were converted to use it. Finally, quite some driver updates. Standing out are patches for the wmt-driver which is refactored to support more variants. This is the rebased pull request where a large series for the designware driver was dropped" * tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits) MAINTAINERS: use proper email for my I2C work i2c: stm32f7: add support for stm32mp25 soc i2c: stm32f7: perform I2C_ISR read once at beginning of event isr dt-bindings: i2c: document st,stm32mp25-i2c compatible i2c: stm32f7: simplify status messages in case of errors i2c: stm32f7: perform most of irq job in threaded handler i2c: stm32f7: use dev_err_probe upon calls of devm_request_irq i2c: i801: Add lis3lv02d for Dell XPS 15 7590 i2c: i801: Add lis3lv02d for Dell Precision 3540 i2c: wmt: Reduce redundant: REG_CR setting i2c: wmt: Reduce redundant: function parameter i2c: wmt: Reduce redundant: clock mode setting i2c: wmt: Reduce redundant: wait event complete i2c: wmt: Reduce redundant: bus busy check i2c: mux: reg: Remove class-based device auto-detection support i2c: make i2c_bus_type const dt-bindings: at24: add ROHM BR24G04 eeprom: at24: use of_match_ptr() i2c: cpm: Remove linux,i2c-index conversion from be32 i2c: imx: Make SDA actually optional for bus recovering ...	2024-01-18 17:29:01 -08:00
Ori Messinger	aa0901a900	drm/amdgpu: Enable GFXOFF for Compute on GFX11 On GFX version 11, GFXOFF was disabled due to a MES KIQ firmware issue, which has since been fixed after version 64. This patch only re-enables GFXOFF for GFX version 11 if the GPU's MES KIQ firmware version is newer than version 64. V2: Keep GFXOFF disabled on GFX11 if MES KIQ is below version 64. V3: Add parentheses to avoid GCC warning for parentheses: "suggest parentheses around comparison in operand of ‘&’" V4: Remove "V3" from commit title V5: Change commit description and insert 'Acked-by' Signed-off-by: Ori Messinger <Ori.Messinger@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-01-18 16:45:19 -05:00
Christian König	fb1c93c2e9	drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2" Calling amdgpu_device_ip_resume_phase1() during shutdown leaves the HW in an active state and is an unbalanced use of the IP callbacks. Using the IP callbacks like this can lead to memory leaks, double free and imbalanced reference counters. Leaving the HW in an active state can lead to DMA accesses to memory now freed by the driver. Both is a complete no-go for driver unload so completely revert the workaround for now. This reverts commit `f5c7e77970`. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-01-18 16:43:42 -05:00
Flora Cui	3c4e4eb5d8	drm/amdkfd: init drm_client with funcs hook otherwise drm_client_dev_unregister() would try to kfree(&adev->kfd.client). Fixes: `1819200166` ("drm/amdkfd: Export DMABufs from KFD using GEM handles") Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 16:42:54 -05:00
Ma Jun	bc03c02cc1	drm/amdgpu: Fix the null pointer when load rlc firmware If the RLC firmware is invalid because of wrong header size, the pointer to the rlc firmware is released in function amdgpu_ucode_request. There will be a null pointer error in subsequent use. So skip validation to fix it. Fixes: `3da9b71563` ("drm/amd: Use `amdgpu_ucode_*` helpers for GFX10") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-01-18 16:34:45 -05:00
Tao Zhou	a9e4f61df1	drm/amdgpu: update error condition check for umc_v12_0_query_error_address Deferred error is also taken into account. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:47:24 -05:00
Stanley.Yang	601429cca9	drm/amdgpu: Skip do PCI error slot reset during RAS recovery Why: The PCI error slot reset may be triggered after injecting UE into UMC multiple times, which causes a system hang. [ 557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume [ 557.373718] [drm] PCIE GART of 512M enabled. [ 557.373722] [drm] PTB located at 0x0000031FED700000 [ 557.373788] [drm] VRAM is lost due to GPU reset! [ 557.373789] [drm] PSP is resuming... [ 557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset [ 557.547067] [drm] PCI error: detected callback, state(1)!! [ 557.547069] [drm] No support for XGMI hive yet... [ 557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter [ 557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations [ 557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered [ 557.610492] [drm] PCI error: slot reset callback!! ... [ 560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded! 
[ 560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI [ 560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G OE 5.15.0-91-generic #101-Ubuntu [ 560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023 [ 560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu] [ 560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00 [ 560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202 [ 560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0 [ 560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010 [ 560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08 [ 560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000 [ 560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000 [ 560.803889] FS: 0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000 [ 560.812973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0 [ 560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 560.843444] PKRU: 55555554 [ 560.846480] Call Trace: [ 560.849225] <TASK> [ 560.851580] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.856488] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.861379] ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.867778] ? show_regs.part.0+0x23/0x29 [ 560.872293] ? __die_body.cold+0x8/0xd [ 560.876502] ? die_addr+0x3e/0x60 [ 560.880238] ? exc_general_protection+0x1c5/0x410 [ 560.885532] ? asm_exc_general_protection+0x27/0x30 [ 560.891025] ? 
amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.898323] amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.904520] process_one_work+0x228/0x3d0 How: In RAS recovery, a mode-1 reset is issued from RAS fatal error handling and is expected to reset all the nodes in a hive, so there is no need to issue another mode-1 reset during this procedure. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:47:15 -05:00
Stanley.Yang	2c7a1560e8	drm/amdgpu: Show deferred error count for UMC Show deferred error count for UMC sysfs node Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:47:07 -05:00
Ori Messinger	776b0953ab	drm/amdgpu: Enable GFXOFF for Compute on GFX11 On GFX version 11, GFXOFF was disabled due to a MES KIQ firmware issue, which has since been fixed after version 64. This patch only re-enables GFXOFF for GFX version 11 if the GPU's MES KIQ firmware version is newer than version 64. V2: Keep GFXOFF disabled on GFX11 if MES KIQ is below version 64. V3: Add parentheses to avoid GCC warning for parentheses: "suggest parentheses around comparison in operand of ‘&’" V4: Remove "V3" from commit title V5: Change commit description and insert 'Acked-by' Signed-off-by: Ori Messinger <Ori.Messinger@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:46:56 -05:00
Yang Wang	7ed97155b2	drm/amdgpu: fix UBSAN array-index-out-of-bounds for ras_block_string[] fix array index out of bounds issue for ras_block_string[] array. Fixes: `30df05fb74` ("drm/amdgpu: Align ras block enum with firmware") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:46:07 -05:00
YuanShang	b5387349ca	drm/amd/amdgpu: Update RLC_SPM_MC_CNT by ring wreg in guest Submit command of wreg in GFX and COMPUTE ring to update RLC_SPM_MC_CNT in guest machine during runtime. Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:45:58 -05:00
Felix Kuehling	5394fb2a5b	drm/amdgpu: Remove unnecessary NULL check A static checker pointed out that bo_va->base.bo was already dereferenced earlier in the same scope. Therefore this check is unnecessary here. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: `50661eb1a2` ("drm/amdgpu: Auto-validate DMABuf imports in compute VMs") Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:44:49 -05:00
Christian König	087a3e13ec	drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2" Calling amdgpu_device_ip_resume_phase1() during shutdown leaves the HW in an active state and is an unbalanced use of the IP callbacks. Using the IP callbacks like this can lead to memory leaks, double free and imbalanced reference counters. Leaving the HW in an active state can lead to DMA accesses to memory now freed by the driver. Both is a complete no-go for driver unload so completely revert the workaround for now. This reverts commit `f5c7e77970`. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:42:20 -05:00
Christophe JAILLET	8a1f7fddab	drm/amdgpu: Remove usage of the deprecated ida_simple_xx() API ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). Note that the upper limit of ida_simple_get() is exclusive, but the one of ida_alloc_range() is inclusive. So a -1 has been added when needed. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:42:13 -05:00
Flora Cui	733965a90f	drm/amdkfd: init drm_client with funcs hook otherwise drm_client_dev_unregister() would try to kfree(&adev->kfd.client). Fixes: `1819200166` ("drm/amdkfd: Export DMABufs from KFD using GEM handles") Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:41:08 -05:00
chenxuebing	7937b6f63f	drm/amdgpu: Clean up errors in umc_v6_0.c Fix the following errors reported by checkpatch: ERROR: space required after that ',' (ctx:VxV) Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:38:00 -05:00
chenxuebing	ac4d654f3d	drm/amdgpu: Clean up errors in clearstate_si.h Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-18 15:37:52 -05:00
Heiner Kallweit	e965a70727	drm: remove I2C_CLASS_DDC support After removal of the legacy EEPROM driver and I2C_CLASS_DDC support in olpc_dcon there's no i2c client driver left supporting I2C_CLASS_DDC. Class-based device auto-detection is a legacy mechanism and shouldn't be used in new code. So we can remove this class completely now. Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Acked-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2024-01-18 21:10:41 +01:00
chenxuebing	762343f79e	drm/amdgpu: Clean up errors in amdgpu.h Fix the following errors reported by checkpatch: ERROR: open brace '{' following struct go on the same line Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:40 -05:00
chenxuebing	f5e1f90b67	drm/amdgpu: Clean up errors in amdgpu_gmc.c Fix the following errors reported by checkpatch: ERROR: need consistent spacing around '-' (ctx:WxV) Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:40 -05:00
chenxuebing	0b0fb6da9b	drm/amdgpu: Clean up errors in jpeg_v2_5.c Fix the following errors reported by checkpatch: ERROR: space required before the open parenthesis '(' ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:40 -05:00
chenxuebing	7230ebeb0a	drm/amdgpu: Clean up errors in gfx_v9_4.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:40 -05:00
chenxuebing	b679566bf0	drm/amdgpu: Clean up errors in amdgpu_drv.c Fix the following errors reported by checkpatch: ERROR: do not initialise globals to 0 Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:40 -05:00
chenxuebing	995d629f74	drm/amd: Clean up errors in amdgpu_vkms.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:39 -05:00
Ma Jun	849e133c97	drm/amdgpu: Fix the null pointer when load rlc firmware If the RLC firmware is invalid because of wrong header size, the pointer to the rlc firmware is released in function amdgpu_ucode_request. There will be a null pointer error in subsequent use. So skip validation to fix it. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:39 -05:00
Hawking Zhang	4e2965bd3b	drm/amdgpu: Centralize ras cap query to amdgpu_ras_check_supported Move ras capability check to amdgpu_ras_check_supported. Driver will query ras capability through psp interface, or vbios interface, or specific ip callbacks. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:39 -05:00
Hawking Zhang	0e14eb0cef	drm/amdgpu: Query ras capablity from psp v2 Instead of traditional atomfirmware interfaces for RAS capability, host driver can query ras capability from psp starting from psp v13_0_6. v2: drop redundant local variable from get_ras_capability. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:39 -05:00
chenxuebing	73888bad4d	drm/amdgpu: Clean up errors in amdgpu_rlc.c Fix the following errors reported by checkpatch: ERROR: space prohibited before that '++' (ctx:WxB) Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:38 -05:00
chenxuebing	1ed8ccf268	drm/amd: Clean up errors in sdma_v2_4.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line ERROR: trailing statements should be on next line Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:38 -05:00
chenxuebing	32a0a398fc	drm/amd/amdgpu: Clean up errors in amdgpu_umr.h Fix the following errors reported by checkpatch: spaces required around that '=' (ctx:VxV) Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:38 -05:00
chenxuebing	2bb012138d	drm/amdgpu: Clean up errors in amdgpu_atomfirmware.h Fix the following errors reported by checkpatch: ERROR: "foo* bar" should be "foo *bar" Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:38 -05:00
chenxuebing	05ec623147	drm/amdgpu: Clean up errors in clearstate_gfx9.h Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:38 -05:00
chenxuebing	2763da27f9	drm/amdgpu: Clean up errors in navi10_ih.c Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <chenxb_99091@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:38 -05:00
Alexander Richards	4630d5031c	drm/amdgpu: check PS, WS index Theoretically, it would be possible for a buggy or malicious VBIOS to overwrite past the bounds of the passed parameters (or its own workspace); add bounds checking to prevent this from happening. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3093 Signed-off-by: Alexander Richards <electrodeyt@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:37 -05:00
Hawking Zhang	30df05fb74	drm/amdgpu: Align ras block enum with firmware Driver and firmware share the same ras block enum. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:37 -05:00
Yang Wang	1714a1ffaf	drm/amdgpu: replace MCA macro with ACA for XGMI Use the new ACA macro instead of MCA. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:37 -05:00
Candice Li	46e2231ce0	drm/amdgpu: Log deferred error separately Separate deferred error from UE and CE and log it individually. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:37 -05:00
Candice Li	9c97bf88f4	drm/amdgpu: Do bad page retirement for deferred errors Needs to do bad page retirement for deferred errors. v2: Drop unused dev_info. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:37 -05:00
Yang Wang	bbcbfd4363	drm/amdgpu: add xgmi v6.4.0 ACA support add xgmi v6.4.0 ACA driver support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:37 -05:00
Yang Wang	f45e6f2b5c	drm/amdgpu: add mmhub v1.8 ACA support v1: add mmhub v1.8 ACA driver support v2: use macro to define smn address value. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:36 -05:00
Yang Wang	373e970a4a	drm/amdgpu: add sdma v4.4.2 ACA support v1: add sdma v4.4.2 ACA driver support v2: use macro to define smn address value. v3: squash in fix for unbalanced irqs Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:36 -05:00
Yang Wang	0f3cd24e96	drm/amdgpu: add gfx v9.4.3 ACA support v1: add gfx v9.4.3 ACA driver support v2: use macro to define smn address value. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:36 -05:00
Ma Jun	e372baeb3d	drm/amdgpu: Check extended configuration space register when system uses large bar Some customer platforms do not enable mmconfig for various reasons, such as a BIOS bug, and therefore cannot access the GPU extended configuration space through MMIO. When the system enters the d3cold state and resumes, the amdgpu driver fails to resume because the extended configuration space registers of the GPU can't be restored. At this point, we usually only see some failure messages printed to dmesg by the amdgpu driver, and it is difficult to find the root cause. Therefore, print a warning message if the system can't access the extended configuration space register when using large bar. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:36 -05:00
Yang Wang	f38765de83	drm/amdgpu: add umc v12.0 ACA support add umc v12.0 ACA driver support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:36 -05:00
Yang Wang	37973b69ea	drm/amdgpu: add aca sysfs support add aca sysfs node support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:36 -05:00
Yang Wang	04c4fcd263	drm/amdgpu: add amdgpu ras aca query interface v1: add ACA error query interface v2: Add a new helper function to determine whether to use ACA or MCA. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:36 -05:00
Alex Deucher	26405ff430	drm/amdgpu: move kiq_reg_write_reg_wait() out of amdgpu_virt.c It's used for more than just SR-IOV now, so move it to amdgpu_gmc.c and rename it to better match the functionality and update the comments in the code paths to better document when each path is used and why. No functional change. Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Shaoyun.Liu@amd.com Cc: Christian.Koenig@amd.com	2024-01-15 18:35:36 -05:00
Alex Deucher	d3f452f3a0	drm/amdgpu: add new INFO IOCTL query for input power Some chips provide both average and input power. Previously we just exposed average power, add a new query for input power. Example userspace: https://github.com/Umio-Yasuno/libdrm-amdgpu-sys-rs/tree/input_power Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:36 -05:00
Hawking Zhang	d4b9cfe2c7	drm/amdgpu: Query boot status if boot failed Check and report firmware boot status if it doesn't reach steady status. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:36 -05:00
Yang Wang	33dcda51e9	drm/amdgpu: add ACA bank dump debugfs support add ACA bank dump debugfs support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:35 -05:00
Yang Wang	0599849c32	drm/amdgpu: add ACA kernel hardware error log support add ACA kernel hardware error log support. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:35 -05:00
Hawking Zhang	c8cb7e09db	drm/amdgpu: Query boot status if discovery failed Check and report boot status if discovery failed. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:35 -05:00
Hawking Zhang	cce4febb27	drm/amdgpu: Add ras helper to query boot errors v2 Add ras helper function to query boot time gpu errors. v2: use aqua_vanjaram smn addressing pattern Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:35 -05:00
Yang Wang	f5e4cc8461	drm/amdgpu: implement RAS ACA driver framework v1: implement new RAS ACA driver code framework. v2: - rename aca_bank_set to aca_banks. - rename aca_source_xxx to aca_handle_xxx. v3: Optimize some function implementation details. (from Hawking's suggestion) Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:35 -05:00
Hawking Zhang	ad390542ec	drm/amdgpu: Init pcie_index/data address as fallback (v2) To allow using this helper for indirect access when nbio funcs is not available. For instance, in ip discovery phase. v2: define macro for pcie_index/data/index_hi fallback. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:35 -05:00
Hawking Zhang	ea0f6dfeec	drm/amdgpu: drop psp v13 query_boot_status implementation Will replace it with new implementation to cover boot fails in ip discovery phase. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:35 -05:00
Hawking Zhang	ac3ff8a906	drm/amdgpu: Replace DRM_* with dev_* in amdgpu_psp.c So kernel message has the device pcie bdf information, which helps issue debugging especially in multiple GPU system. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:35 -05:00
Felix Kuehling	50661eb1a2	drm/amdgpu: Auto-validate DMABuf imports in compute VMs DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the process_info->kfd_bo_list. There is no explicit KFD API call to validate them or add eviction fences to them. This patch automatically validates and fences dynamic DMABuf imports when they are added to a compute VM. Revalidation after evictions is handled in the VM code. v2: * Renamed amdgpu_vm_validate_evicted_bos to amdgpu_vm_validate * Eliminated evicted_user state, use evicted state for VM BOs and user BOs * Fixed and simplified amdgpu_vm_fence_imports, depends on reserved BOs * Moved dma_resv_reserve_fences for amdgpu_vm_fence_imports into amdgpu_vm_validate, outside the vm->status_lock * Added dummy version of amdgpu_amdkfd_bo_validate_and_fence for builds without KFD v4: Eliminate amdgpu_vm_fence_imports. It's not needed because the reservation with its fences is shared with the export, as long as all imports are from KFD, with the exports already reserved, validated and fenced by the KFD restore worker. v5: Reintroduced separate evicted_user state to simplify the state machine and CS error handling when amdgpu_vm_validate is called without a ticket. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:35:35 -05:00
Alex Deucher	c3d5e297dc	drm/amdgpu: drop exp hw support check for GC 9.4.3 No longer needed. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x	2024-01-15 18:32:56 -05:00
Le Ma	51258acdc4	drm/amdgpu: move debug options init prior to amdgpu device init To bring debug options into effect in early initialization phase Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:32:54 -05:00
Le Ma	d20e1aec88	drm/amdgpu: add debug flag to place fw bo on vram for frontdoor loading Use debug_mask=0x8 param to help isolating data path issues on new systems in early phase. v2: rename the flag for explicitness (lijo) Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:32:49 -05:00
Le Ma	6c5683bd9e	Revert "drm/amdgpu: add param to specify fw bo location for front-door loading" This reverts commit `c572abffe9`. Will use debug module param instead of independent module param. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:32:23 -05:00
Yifan Zhang	2b9a073b73	drm/amdgpu: update regGL2C_CTRL4 value in golden setting This patch updates regGL2C_CTRL4 in the golden setting. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x	2024-01-15 18:32:15 -05:00
Srinivasan Shanmugam	8a44fdd3cf	drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()' In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' - 'adev->pm.fw' may not be released before return. Using the function release_firmware() to release adev->pm.fw. Thus fixing the below: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 'adev->pm.fw' from request_firmware() not released on lines: 1554. Cc: Monk Liu <Monk.Liu@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:32:06 -05:00
Srinivasan Shanmugam	8e8272f0dc	drm/amdgpu: Fix unsigned comparison with less than zero in vpe_u1_8_from_fraction() The variables 'numerator' and 'denominator', are unsigned 16-bit integer types, that can never be less than 0. Thus fixing the below: drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:62 vpe_u1_8_from_fraction() warn: unsigned 'numerator' is never less than zero. drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:63 vpe_u1_8_from_fraction() warn: unsigned 'denominator' is never less than zero. Cc: Peyton Lee <peytolee@amd.com> Cc: Lang Yu <lang.yu@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Peyton Lee <peyton.lee@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:32:03 -05:00
Srinivasan Shanmugam	fac4ebd79f	drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()' The amdgpu_gmc_vram_checking() function in emulation checks whether all of the memory range of shared system memory could be accessed by GPU, from this aspect, -EIO is returned for error scenarios. Fixes the below: drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:919 gmc_v6_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1103 gmc_v7_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1223 gmc_v8_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2344 gmc_v9_0_hw_init() warn: missing error code? 'r' Cc: Xiaojian Du <Xiaojian.Du@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:31:59 -05:00
Victor Lu	30d8dffab7	drm/amdgpu: Do not program VM_L2_CNTL under SRIOV VM_L2_CNTL* should not be programmed on driver unload under SRIOV. These regs are skipped during SRIOV driver init. Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Vignesh Chander <Vignesh.Chander@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:31:56 -05:00
Yifan Zhang	c9edcc1864	drm/amdgpu: update ATHUB_MISC_CNTL offset for athub v3.3 This patch updates the ATHUB_MISC_CNTL offset for athub v3.3. v2: correct a typo (Tim) v3: correct patch title (Lang) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:31:45 -05:00
Alex Deucher	d02069850f	drm/amdgpu: fall back to INPUT power for AVG power via INFO IOCTL For backwards compatibility with userspace. Fixes: `47f1724db4` ("drm/amd: Introduce `AMDGPU_PP_SENSOR_GPU_INPUT_POWER`") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2897 Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-15 18:31:24 -05:00
James Zhu	50e60184bf	drm/amdgpu: make a correction on comment Use a generic comment for AMDGPU_VM_RESERVED_VRAM size. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-09 15:44:13 -05:00
Felix Kuehling	c147ddc68e	drm/amdkfd: Fix sparse __rcu annotation warnings Properly mark kfd_process->ef as __rcu and consistently use the right accessor functions. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312052245.yFpBSgNH-lkp@intel.com/ Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-09 15:44:13 -05:00
Hawking Zhang	73cb81dc54	drm/amdgpu: Packed socket_id to ras feature mask Initialize RAS feature mask bit[31:29] with socket_id. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-09 15:44:13 -05:00
Candice Li	fb1e917199	drm/amdgpu: Support poison error injection via ras_ctrl debugfs Support poison error injection. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-09 15:44:13 -05:00
Likun Gao	f4a94dbb6d	drm/amdgpu: correct the cu count for gfx v11 Correct the algorithm of active CU to skip disabled sa for gfx v11. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-01-09 15:43:54 -05:00
Candice Li	90bd01471d	drm/amdgpu: Drop unnecessary sentences about CE and deferred error. Remove "no user action is needed" for correctable and deferred error to avoid confusion. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-09 15:43:54 -05:00
Dave Airlie	e54478fbda	Merge tag 'amd-drm-next-6.8-2024-01-05' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.8-2024-01-05: amdgpu: - VRR fixes - PSR-SU fixes - SubVP fixes - DCN 3.5 fixes - Documentation updates - DMCUB fixes - DML2 fixes - UMC 12.0 updates - GPUVM fix - Misc code cleanups and whitespace cleanups - DP MST fix - Let KFD sync with GPUVM fences - GFX11 reset fix - SMU 13.0.6 fixes - VSC fix for DP/eDP - Navi12 display fix - RN/CZN system aperture fix - DCN 2.1 bandwidth validation fix - DCN INIT cleanup amdkfd: - SVM fixes - Revert TBA/TMA location change Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240105220522.4976-1-alexander.deucher@amd.com	2024-01-09 09:07:50 +10:00
Alex Deucher	16783d8ef0	drm/amdgpu: apply the RV2 system aperture fix to RN/CZN as well These chips need the same fix. This was previously not seen on them since the AGP aperture expanded the system aperture, but this showed up again when AGP was disabled. Reviewed-and-tested-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-05 16:10:44 -05:00
Srinivasan Shanmugam	bf2ad4fb8a	drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()' Return value of container_of(...) can't be null, so null check is not required for 'fence'. Hence drop its NULL check. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c:93 to_amdgpu_amdkfd_fence() warn: can 'fence' even be NULL? Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-05 16:10:43 -05:00
Srinivasan Shanmugam	13a1851f92	drm/amdgpu: Fix 'fw' from request_firmware() not released in 'amdgpu_ucode_request()' Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c:1404 amdgpu_ucode_request() warn: 'fw' from request_firmware() not released on lines: 1404. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-05 16:10:43 -05:00
Srinivasan Shanmugam	4f32504a2f	drm/amdgpu: Fix variable 'mca_funcs' dereferenced before NULL check in 'amdgpu_mca_smu_get_mca_entry()' Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c:377 amdgpu_mca_smu_get_mca_entry() warn: variable dereferenced before check 'mca_funcs' (see line 368) 357 int amdgpu_mca_smu_get_mca_entry(struct amdgpu_device adev, enum amdgpu_mca_error_type type, 358 int idx, struct mca_bank_entry entry) 359 { 360 const struct amdgpu_mca_smu_funcs *mca_funcs = adev->mca.mca_funcs; 361 int count; 362 363 switch (type) { 364 case AMDGPU_MCA_ERROR_TYPE_UE: 365 count = mca_funcs->max_ue_count; mca_funcs is dereferenced here. 366 break; 367 case AMDGPU_MCA_ERROR_TYPE_CE: 368 count = mca_funcs->max_ce_count; mca_funcs is dereferenced here. 369 break; 370 default: 371 return -EINVAL; 372 } 373 374 if (idx >= count) 375 return -EINVAL; 376 377 if (mca_funcs && mca_funcs->mca_get_mca_entry) ^^^^^^^^^ Checked too late! Cc: Yang Wang <kevinyang.wang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-05 16:10:43 -05:00
Le Ma	c572abffe9	drm/amdgpu: add param to specify fw bo location for front-door loading This param can help isolating data path issues on new systems in early phase. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-05 16:10:43 -05:00
Srinivasan Shanmugam	8e317a811f	drm/amdgpu: Remove unreachable code in 'atom_skip_src_int()' Fixes the below: drivers/gpu/drm/amd/amdgpu/atom.c:398 atom_skip_src_int() warn: ignoring unreachable code. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-05 16:10:33 -05:00
Alex Deucher	fb915c87ed	drm/amdgpu: skip gpu_info fw loading on navi12 It's no longer required. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2318 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-05 16:05:08 -05:00
Hawking Zhang	6697dbf0af	Revert "drm/amdgpu: enable mca debug mode on APU by default" Not needed any more with firmware fixes Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-05 16:04:36 -05:00
Srinivasan Shanmugam	b8d55a90fd	drm/amdgpu: Fix possible NULL dereference in amdgpu_ras_query_error_status_helper() Return invalid error code -EINVAL for invalid block id. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1183 amdgpu_ras_query_error_status_helper() error: we previously assumed 'info' could be null (see line 1176) Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-03 11:16:06 -05:00
Srinivasan Shanmugam	5ce9a6ad8e	drm/amdgpu: Drop redundant unsigned >=0 comparision 'amdgpu_gfx_rlc_init_microcode()' unsigned int "version_minor" is always >= 0 Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.c:534 amdgpu_gfx_rlc_init_microcode() warn: always true condition '(version_minor >= 0) => (0-u16max >= 0)' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-03 11:16:06 -05:00
Srinivasan Shanmugam	b57e3ca1fb	drm/amdgpu: Use kvcalloc instead of kvmalloc_array in amdgpu_cs_parser_bos() kvmalloc_array + __GFP_ZERO is the same with kvcalloc. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:873 amdgpu_cs_parser_bos() warn: Please consider using kvcalloc instead of kvmalloc_array Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-03 11:16:06 -05:00
Srinivasan Shanmugam	091411be7a	drm/amdgpu: Use kzalloc instead of kmalloc+__GFP_ZERO in amdgpu_ras.c Fixes the below smatch warnings: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2543 amdgpu_ras_recovery_init() warn: Please consider using kzalloc instead of kmalloc drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2830 amdgpu_ras_init() warn: Please consider using kzalloc instead of kmalloc Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-03 11:16:06 -05:00
Srinivasan Shanmugam	abaf0666a6	drm/amdgpu: Cleanup indenting in amdgpu_connector_dvi_detect() drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c:1106 amdgpu_connector_dvi_detect() warn: inconsistent indenting Fixes: `8a1de314d1` ("drm/amdgpu: Refactor 'amdgpu_connector_dvi_detect' in amdgpu_connectors.c") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-03 11:16:05 -05:00
Jack Xiao	4b5c5f5ad3	drm/amdgpu/gfx11: need acquire mutex before access CP_VMID_RESET v2 It's required to take the gfx mutex before access to CP_VMID_RESET, for there is a race condition with CP firmware to write the register. v2: add extra code to ensure the mutex releasing is successful. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-03 10:46:52 -05:00
Felix Kuehling	ec9ba4821f	drm/amdgpu: Let KFD sync with VM fences Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update page tables for VM mappings managed through render nodes. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-03 10:46:13 -05:00
Stanley.Yang	a32c6f7f57	drm/amdgpu: Fix ecc irq enable/disable unpaired The ecc_irq is disabled during the GPU mode2 reset suspend process, but is not re-enabled during the GPU mode2 reset resume process. Changed from V1: only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip; delete amdgpu_ras_late_resume function. Changed from V2: check umc ras supported before put ecc_irq. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-03 10:30:49 -05:00
Mangesh Gadre	5eb8094a9b	drm/amdgpu: Add register read/write debugfs support for AID's SMN address is larger than 32 bits for registers on different AID's Updating existing interface to support access to such registers. Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-01-03 10:30:19 -05:00
Dave Airlie	22a2decedf	Merge tag 'drm-msm-next-2023-12-15' of https://gitlab.freedesktop.org/drm/msm into drm-next Updates for v6.8: Core: - Add support for SDM670, SM8650 - Handle the CFG interconnect to fix the obscure hangs / timeouts on register write - Kconfig fix for QMP dependency - DT schema fixes DPU: - Add support for SDM670, SM8650 - Enable SmartDMA on SM8350 and SM8450 - Correct UBWC settings for SC8280XP - Fix catalog settings for SC8180X - Actually make use of the version to switch between QSEED3/3LITE/4 scalers - Use devres-managed and drm-managed allocations where appropriate - misc other fixes - Enabled YUV writeback on SC7280, SM8250 - Enabled writeback on SM8350, SM8450 - CRC fix when encoder is selected as the input source - other misc fixes MDP4: - Use devres-managed and drm-managed allocations where appropriate - flush vblank event on CRTC disable MDP5: - Use devres-managed and drm-managed allocations where appropriate DP: - Add support for SM8650 - Enable PM runtime support - Merge msm-specific debugfs dir with the generic one - Described DisplayPort on SM8150 in DeviceTree bindings - Moved dp_display_get_next_bridge() to probe() DSI: - Add support for SM8650 - Enable PM runtime support GPU/GEM: - demote userspace triggerable warnings to debug - add GEM object metadata UAPI - move GPU devcoredumps to GPU device - fix hangcheck to skip retired submits - expose UBWC config to userspace - fix a680 chip-id - drm_exec conversion - drm/ci: remove rebase-merge directory (to unblock CI) [airlied: fix drm_exec/amd interaction] Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs9auYqmo-7NSd9FsbNBCDf7aBevd=4xkcF3A5G_OGvMQ@mail.gmail.com	2023-12-20 07:54:03 +10:00
ZhenGuo Yin	87825c860e	drm/amdgpu: re-create idle bo's PTE during VM state machine reset Idle bo's PTE needs to be re-created when resetting VM state machine. Set idle bo's vm_bo as moved to mark it as invalid. Fixes: `55bf196f60` ("drm/amdgpu: reset VM when an error is detected") Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-19 14:59:03 -05:00
YiPeng Chai	99cab331a4	drm/amdgpu: Add umc page retirement for umc v12_0 Add umc page retirement for umc v12_0. V2: 1. Changed umc page retirement check condition to call umc_v12_0_is_uncorrectable_error. 2. Use memset to clear the contents of the umc error address structure. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-19 14:59:03 -05:00
YiPeng Chai	a8c77a121c	drm/amdgpu: Add poison mode check error condition for umc v12_0 Add poison mode check error condition for umc v12_0. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-19 14:59:03 -05:00
YiPeng Chai	9f91e983ee	drm/amdgpu: MCA supports recording umc address information MCA supports recording umc address information. V2: Move err_addr variable from struct ras_err_node to struct ras_err_info. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-19 14:59:03 -05:00
James Zhu	0c8c0e7a9e	drm/amdgpu: make an improvement on amdgpu_hmm_range_get_pages Only schedule when hmm_range_fault returns error. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-15 12:16:42 -05:00
James Zhu	78b4dfd359	drm/amdgpu: increase hmm range get pages timeout When an application tries to allocate all system memory and causes memory to swap out, hmm_range_fault needs more time to validate the remaining pages for allocation. To be safe, increase the timeout value to 1 second for a 64MB range. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-15 12:16:34 -05:00
Alex Deucher	afe58346d5	drm/amdgpu/debugfs: fix error code when smc register accessors are NULL Should be -EOPNOTSUPP. Fixes: `5104fdf50d` ("drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-14 15:27:42 -05:00
Peyton Lee	5f82a0c90c	drm/amdgpu/vpe: enable vpe dpm enable vpe dpm Signed-off-by: Peyton Lee <peytolee@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-14 15:26:21 -05:00
Melissa Wen	b8b92c1bd7	drm/amd/display: add plane CTM driver-specific property Plane CTM for pre-blending color space conversion. Only enable driver-specific plane CTM property on drivers that support both pre- and post-blending gamut remap matrix, i.e., DCN3+ family. Otherwise it conflicts with DRM CRTC CTM property. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-14 15:26:15 -05:00
Wang, Beyond	94aeb41173	drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap Issue: when an evict or validate happens on an amdgpu_bo, 'from' and 'to' are always the same in the amdgpu_bo_move ftrace event emitted by trace_amdgpu_bo_move. The comment says move_notify is called before the move happens, but it is actually called after the move happens, so new_mem is the same as bo->resource. Fix: move trace_amdgpu_bo_move from move_notify to amdgpu_bo_move Signed-off-by: Wang, Beyond <Wang.Beyond@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-14 15:25:48 -05:00
Christian König	65d2765d62	drm/amdgpu: warn when there are still mappings when a BO is destroyed v2 This can only happen when there is a reference counting bug. v2: fix typo Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-13 16:33:02 -05:00
Christian König	683b8c7e7a	drm/amdgpu: fix tear down order in amdgpu_vm_pt_free When freeing PD/PT with shadows it can happen that the shadow destruction races with detaching the PD/PT from the VM causing a NULL pointer dereference in the invalidation code. Fix this by detaching the PD/PT from the VM first and then freeing the shadow instead. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: https://gitlab.freedesktop.org/drm/amd/-/issues/2867 Cc: <stable@vger.kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-13 16:08:01 -05:00
Jani Nikula	d9501844d5	drm/amd: include drm/drm_edid.h only where needed Including drm_edid.h from amdgpu_mode.h causes the rebuild of literally hundreds of files when drm_edid.h is modified, while there are only a handful of files that actually need to include drm_edid.h. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-13 16:08:01 -05:00
Melissa Wen	0f5afa190b	drm/amd/display: add CRTC gamma TF driver-specific property Add AMD pre-defined transfer function property to default DRM CRTC gamma to convert to wire encoding with or without a user gamma LUT. There is no post-blending regamma ROM for pre-defined TF. When setting Gamma TF (!= Identity) and LUT at the same time, the color module will combine the pre-defined TF and the custom LUT values into the LUT that's actually programmed. v2: - enable CRTC prop in the end of driver-specific prop sequence - define inverse EOTFs as supported regamma TFs - reword driver-specific function doc to remove shaper/3D LUT v3: - spell out TF+LUT behavior in the commit and comments (Harry) Reviewed-by: Harry Wentland <harry.wentland@amd.com> Co-developed-by: Joshua Ashton <joshua@froggi.es> Signed-off-by: Joshua Ashton <joshua@froggi.es> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-13 16:08:01 -05:00
Joshua Ashton	0ef47454dc	drm/amd/display: add plane blend LUT and TF driver-specific properties Blend 1D LUT or a pre-defined transfer function (TF) can be set to linearize content before blending, so that it's positioned just before blending planes in the AMD color mgmt pipeline, and after 3D LUT (non-linear space). Shaper and Blend LUTs are 1D LUTs that sandwich 3D LUT. Drivers should advertize blend properties according to HW caps. There is no blend ROM for pre-defined TF. When setting blend TF (!= Identity) and LUT at the same time, the color module will combine the pre-defined TF and the custom LUT values into the LUT that's actually programmed. v3: - spell out TF+LUT behavior in the commit and comments (Harry) v5: - get blend blob correctly Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Joshua Ashton <joshua@froggi.es> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-13 16:08:00 -05:00
Melissa Wen	f545d82479	drm/amd/display: add plane shaper LUT and TF driver-specific properties On AMD HW, 3D LUT always assumes a preceding shaper 1D LUT used for delinearizing and/or normalizing the color space before applying a 3D LUT. Add pre-defined transfer function to enable delinearizing content with or without shaper LUT, where AMD color module calculates the resulted shaper curve. We apply an inverse EOTF to go from linear values to encoded values. If we are already in a non-linear space and/or don't need to normalize values, we can bypass shaper LUT with a linear transfer function that is also the default TF value. There is no shaper ROM. When setting shaper TF (!= Identity) and LUT at the same time, the color module will combine the pre-defined TF and the custom LUT values into the LUT that's actually programmed. v2: - squash commits for shaper LUT and shaper TF - define inverse EOTF as supported shaper TFs v3: - spell out TF+LUT behavior in the commit and comments (Harry) - replace BT709 EOTF by inv OETF v5: - get shaper blob correctly (Joshua) Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-13 16:08:00 -05:00
Jonathan Kim	bd33bb1409	drm/amdkfd: fix mes set shader debugger process management MES provides the driver a call to explicitly flush stale process memory within the MES to avoid a race condition that results in a fatal memory violation. When SET_SHADER_DEBUGGER is called, the driver passes a memory address that represents a process context address MES uses to keep track of future per-process calls. Normally, MES will purge its process context list when the last queue has been removed. The driver, however, can call SET_SHADER_DEBUGGER regardless of whether a queue has been added or not. If SET_SHADER_DEBUGGER has been called with no queues as the last call prior to process termination, the passed process context address will still reside within MES. On a new process call to SET_SHADER_DEBUGGER, the driver may end up passing an identical process context address value (based on per-process gpu memory address) to MES but is now pointing to a new allocated buffer object during KFD process creation. Since the MES is unaware of this, access of the passed address points to the stale object within MES and triggers a fatal memory violation. The solution is for KFD to explicitly flush the process context address from MES on process termination. Note that the flush call and the MES debugger calls use the same MES interface but are separated as KFD calls to avoid conflicting with each other. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Tested-by: Alice Wong <shiwei.wong@amd.com> Reviewed-by: Eric Huang <jinhuieric.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-12-13 16:07:43 -05:00

13994 Commits