2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

14881 Commits

Author SHA1 Message Date
Prike Liang
ad17b124c3 drm/amdgpu/gfx9.4.3: Implement compute pipe reset
Implement the compute pipe reset, and the driver will
fallback to pipe reset when queue reset fails.
The pipe reset only deactivates the queue which is
scheduled in the pipe, and meanwhile the MEC pipe
will be reset to the firmware _start pointer. So,
it seems pipe reset will cost more cycles than the
queue reset; therefore, the driver tries to recover
by doing queue reset first.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-02 11:32:32 -04:00
Alex Deucher
6c0a7c3c69 drm/amdgpu: always allocate cleared VRAM for GEM allocations
This adds allocation latency, but aligns better with user
expectations.  The latency should improve with the drm buddy
clearing patches that Arun has been working on.

In addition this fixes the high CPU spikes seen when doing
wipe on release.

v2: always set AMDGPU_GEM_CREATE_VRAM_CLEARED (Christian)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3528
Fixes: a68c7eaa7a ("drm/amdgpu: Enable clear page functionality")
Acked-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Cc: Christian König <christian.koenig@amd.com>
2024-08-29 14:56:37 -04:00
Jack Xiao
52491d97aa drm/amdgpu/mes: add mes mapping legacy queue switch
For mes11 old firmware has issue to map legacy queue,
add a flag to switch mes to map legacy queue.

Fixes: f9d8c5c785 ("drm/amdgpu/gfx: enable mes to map legacy queue support")
Reported-by: Andrew Worsley <amworsley@gmail.com>
Link: https://lists.freedesktop.org/archives/amd-gfx/2024-August/112773.html
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29 13:41:02 -04:00
Alex Deucher
1125f95cd2 drm/amdgpu/gfx12: return early in preempt_ib()
When MES is enabled KIQ is not available.  Return an error
when someone uses the debugfs preempt test interface in
that case.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29 13:39:57 -04:00
Alex Deucher
1e487c9173 drm/amdgpu/gfx11: return early in preempt_ib()
When MES is enabled KIQ is not available.  Return an error
when someone uses the debugfs preempt test interface in
that case.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29 13:39:48 -04:00
Sunil Khatri
30e8f4c2bd drm/amdgpu: Move the dumping log out of for loop
log message "Dumping IP State Completed" needs to
be logged only once when state dumping is complete.

Hence moving it out of the for loop.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Trigger Huang <Trigger.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29 13:39:19 -04:00
Victor Zhao
af76ca8e18 drm/amd/amdgpu: move drain_workqueue before shutdown is set
[background] when unloading amdgpu driver right after running a
workload, drain_workqueue is causing "Fence fallback timer
expired on ring sdma0.0". Under sriov, this issue will cause sriov
full access timeout and a reset happening.

move drain_workqueue before shutdown is set to allow ih process and
before enter full access under sriov to avoid full access time cost.

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29 13:39:07 -04:00
Trigger Huang
c67db6a6a6 drm/amdgpu: Do core dump immediately when job tmo
Do the coredump immediately after a job timeout to get a closer
representation of GPU's error status.

V2: This will skip printing vram_lost as the GPU reset is not
happened yet (Alex)

V3: Unconditionally call the core dump as we care about all the reset
functions(soft-recovery and queue reset and full adapter reset, Alex)

V4: Do the dump after adev->job_hang = true (Sunil)

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Acked-by:  Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29 13:39:00 -04:00
Trigger Huang
6122f5c72e drm/amdgpu: skip printing vram_lost if needed
The vm lost status can only be obtained after a GPU reset occurs, but
sometimes a dev core dump can be happened before GPU reset. So a new
argument is added to tell the dev core dump implementation whether to
skip printing the vram_lost status in the dump.
And this patch is also trying to decouple the core dump function from
the GPU reset function, by replacing the argument amdgpu_reset_context
with amdgpu_job to specify the context for core dump.

V2: Inform user if VRAM lost check is skipped so users don't assume
VRAM wasn't lost (Alex)

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29 13:38:53 -04:00
Alex Deucher
7c1a2d8aba drm/amdgpu/gfx9: put queue resets behind a debug option
Pending extended validation.

Reviewed-and-tested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29 13:38:48 -04:00
Alex Deucher
a9b67c036c drm/amdgpu: add experimental resets debug flag
Add this flag to enable experimental resets for testing before they
are fully validated.

Reviewed-and-tested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29 13:38:36 -04:00
Daniel Vetter
e55ef65510 amd-drm-next-6.12-2024-08-26:
amdgpu:
 - SDMA devcoredump support
 - DCN 4.0.1 updates
 - DC SUBVP fixes
 - Refactor OPP in DC
 - Refactor MMHUBBUB in DC
 - DC DML 2.1 updates
 - DC FAMS2 updates
 - RAS updates
 - GFX12 updates
 - VCN 4.0.3 updates
 - JPEG 4.0.3 updates
 - Enable wave kill (soft recovery) for compute queues
 - Clean up CP error interrupt handling
 - Enable CP bad opcode interrupts
 - VCN 4.x fixes
 - VCN 5.x fixes
 - GPU reset fixes
 - Fix vbios embedded EDID size handling
 - SMU 14.x updates
 - Misc code cleanups and spelling fixes
 - VCN devcoredump support
 - ISP MFD i2c support
 - DC vblank fixes
 - GFX 12 fixes
 - PSR fixes
 - Convert vbios embedded EDID to drm_edid
 - DCN 3.5 updates
 - DMCUB updates
 - Cursor fixes
 - Overdrive support for SMU 14.x
 - GFX CP padding optimizations
 - DCC fixes
 - DSC fixes
 - Preliminary per queue reset infrastructure
 - Initial per queue reset support for GFX 9
 - Initial per queue reset support for GFX 7, 8
 - DCN 3.2 fixes
 - DP MST fixes
 - SR-IOV fixes
 - GFX 9.4.3/4 devcoredump support
 - Add process isolation framework
 - Enable process isolation support for GFX 9.4.3/4
 - Take IOMMU remapping into account for P2P DMA checks
 
 amdkfd:
 - CRIU fixes
 - Improved input validation for user queues
 - HMM fix
 - Enable process isolation support for GFX 9.4.3/4
 - Initial per queue reset support for GFX 9
 - Allow users to target recommended SDMA engines
 
 radeon:
 - remove .load and drm_dev_alloc
 - Fix vbios embedded EDID size handling
 - Convert vbios embedded EDID to drm_edid
 - Use GEM references instead of TTM
 - r100 cp init cleanup
 - Fix potential overflows in evergreen CS offset tracking
 
 UAPI:
 - KFD support for targetting queues on recommended SDMA engines
   Proposed userspace:
   2f588a2406
   eb30a5bbc7
 
 drm/buddy:
 - Add start address support for trim function
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZszhcQAKCRC93/aFa7yZ
 2M4ZAQD+xgIJkQ9HISQeqER5GblnfrorARd32yP/BH0c+JbGUAD9H/BIB41teZ80
 vw2WTx+4TyB39awgvtpDH8iEQdkcSAE=
 =717w
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-next-6.12-2024-08-26' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.12-2024-08-26:

amdgpu:
- SDMA devcoredump support
- DCN 4.0.1 updates
- DC SUBVP fixes
- Refactor OPP in DC
- Refactor MMHUBBUB in DC
- DC DML 2.1 updates
- DC FAMS2 updates
- RAS updates
- GFX12 updates
- VCN 4.0.3 updates
- JPEG 4.0.3 updates
- Enable wave kill (soft recovery) for compute queues
- Clean up CP error interrupt handling
- Enable CP bad opcode interrupts
- VCN 4.x fixes
- VCN 5.x fixes
- GPU reset fixes
- Fix vbios embedded EDID size handling
- SMU 14.x updates
- Misc code cleanups and spelling fixes
- VCN devcoredump support
- ISP MFD i2c support
- DC vblank fixes
- GFX 12 fixes
- PSR fixes
- Convert vbios embedded EDID to drm_edid
- DCN 3.5 updates
- DMCUB updates
- Cursor fixes
- Overdrive support for SMU 14.x
- GFX CP padding optimizations
- DCC fixes
- DSC fixes
- Preliminary per queue reset infrastructure
- Initial per queue reset support for GFX 9
- Initial per queue reset support for GFX 7, 8
- DCN 3.2 fixes
- DP MST fixes
- SR-IOV fixes
- GFX 9.4.3/4 devcoredump support
- Add process isolation framework
- Enable process isolation support for GFX 9.4.3/4
- Take IOMMU remapping into account for P2P DMA checks

amdkfd:
- CRIU fixes
- Improved input validation for user queues
- HMM fix
- Enable process isolation support for GFX 9.4.3/4
- Initial per queue reset support for GFX 9
- Allow users to target recommended SDMA engines

radeon:
- remove .load and drm_dev_alloc
- Fix vbios embedded EDID size handling
- Convert vbios embedded EDID to drm_edid
- Use GEM references instead of TTM
- r100 cp init cleanup
- Fix potential overflows in evergreen CS offset tracking

UAPI:
- KFD support for targetting queues on recommended SDMA engines
  Proposed userspace:
  2f588a2406
  eb30a5bbc7

drm/buddy:
- Add start address support for trim function

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240826201528.55307-1-alexander.deucher@amd.com
2024-08-27 14:33:12 +02:00
Daniel Vetter
4461e9e5c3 Linux 6.11-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmbK2B8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGFwkH/10QpUgzIfbFKbF+
 5hwcvaqS5myxWwJ4PjN0eR1qGE6RzVO0Tb24+TVql+7pxu+iWm1kYgC3+/T5xJsP
 ECAszdmPWSco1xaHrh2y3PyCJjaBiqFbIxdjPp7odjDpG9qarbcty8YpWs44u/gd
 RDXzHUuScEShBhEt0ZhvE1pIDL8jJ8JL3yqOMZ+XaDxtJbjaHw4GHp8efxlBWc8N
 jZKIVJi22q5NWG5T0tGtPWwzCm0ewA/JNMTEvE9leoSoAgO85NZ0ivxMC76q/tbj
 BrYk5KnzfhJs4b/n/KtIwWaLTgLyXKGqHMaMq8sbXtp410aUdgnRJO2cl3fI+1vc
 vxQfAfk=
 =RemI
 -----END PGP SIGNATURE-----

Merge v6.11-rc5 into drm-next

amdgpu pr conconflicts due to patches cherry-picked to -fixes, I might
as well catch up with a backmerge and handle them all. Plus both misc
and intel maintainers asked for a backmerge anyway.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-08-27 14:09:45 +02:00
Xiaogang Chen
6ef29715ac drm/amdkfd: Change kfd/svm page fault drain handling
When app unmap vm ranges(munmap) kfd/svm starts drain pending page fault and
not handle any incoming pages fault of this process until a deferred work item
got executed by default system wq. The time period of "not handle page fault"
can be long and is unpredicable. That is advese to kfd performance on page
faults recovery.

This patch uses time stamp of incoming page fault to decide to drop or recover
page fault. When app unmap vm ranges kfd records each gpu device's ih ring
current time stamp. These time stamps are used at kfd page fault recovery
routine.

Any page fault happened on unmapped ranges after unmap events is application
bug that accesses vm range after unmap. It is not driver work to cover that.

By using time stamp of page fault do not need drain page faults at deferred
work. So, the time period that kfd does not handle page faults is reduced
and can be controlled.

Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-23 10:55:13 -04:00
Likun Gao
875ff9a7ee drm/amdgpu: support for gc_info table v1.3
Add gc_info table v1.3 for IP discovery.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-23 10:54:57 -04:00
Yang Wang
4416377ae1 drm/amdgpu: add list empty check to avoid null pointer issue
Add list empty check to avoid null pointer issues in some corner cases.
- list_for_each_entry_safe()

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-23 10:53:45 -04:00
Alex Deucher
40318a2406 drm/amdgpu/gfx12: set UNORD_DISPATCH in compute MQDs
This needs to be set to 1 to avoid a potential deadlock in
the GC 10.x and newer.  On GC 9.x and older, this needs
to be set to 0. This can lead to hangs in some mixed
graphics and compute workloads.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3575
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-23 10:53:25 -04:00
Hawking Zhang
b05d6476ae drm/amdgpu: Retire query_utcl2_poison_status callback
Driver switches to interrupt source id to identify
utcl2 poison event. polling interface is not needed.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-23 10:53:16 -04:00
Rahul Jain
75f0efbc4b drm/amdgpu: Take IOMMU remapping into account for p2p checks
when trying to enable p2p the amdgpu_device_is_peer_accessible()
checks the condition where address_mask overlaps the aper_base
and hence returns 0, due to which the p2p disables for this platform

IOMMU should remap the BAR addresses so the device can access
them. Hence check if peer_adev is remapping DMA

v5: (Felix, Alex)
- fixing comment as per Alex feedback
- refactor code as per Felix

v4: (Alex)
- fix the comment and description

v3:
- remove iommu_remap variable

v2: (Alex)
- Fix as per review comments
- add new function amdgpu_device_check_iommu_remap to check if iommu
  remap

Signed-off-by: Rahul Jain <Rahul.Jain@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-23 10:53:05 -04:00
Alex Deucher
9cead81eff drm/amdgpu: fix eGPU hotplug regression
The driver needs to wait for the on board firmware
to finish its initialization before probing the card.
Commit 959056982a ("drm/amdgpu: Fix discovery initialization failure during pci rescan")
switched from using msleep() to using usleep_range() which
seems to have caused init failures on some navi1x boards. Switch
back to msleep().

Fixes: 959056982a ("drm/amdgpu: Fix discovery initialization failure during pci rescan")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3559
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3500
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Ma Jun <Jun.Ma2@amd.com>
(cherry picked from commit c69b07f7bb)
Cc: stable@vger.kernel.org # 6.10.x
2024-08-20 23:07:11 -04:00
Candice Li
c99769bcea drm/amdgpu: Validate TA binary size
Add TA binary size validation to avoid OOB write.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c0a04e3570)
Cc: stable@vger.kernel.org
2024-08-20 23:04:17 -04:00
Alex Deucher
e3e4bf58ba drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1
The workaround seems to cause stability issues on other
SDMA 5.2.x IPs.

Fixes: a03ebf1163 ("drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3556
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2dc3851ef7)
Cc: stable@vger.kernel.org
2024-08-20 22:51:37 -04:00
Yang Wang
0b43312902 drm/amdgpu: fixing rlc firmware loading failure issue
Skip rlc firmware validation to ignore firmware header size mismatch issues.
This restores the workaround added in
commit 849e133c97 ("drm/amdgpu: Fix the null pointer when load rlc firmware")

Fixes: 3af2c80ae2 ("drm/amdgpu: refine gfx10 firmware loading")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3551
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 89ec85d16e)
2024-08-20 22:51:31 -04:00
Alex Deucher
88c511dea1 drm/amd/gfx11: move the gfx mutex into the caller
Otherwise we can fail to drop the software mutex when
we fail to take the hardware mutex.

Fixes: 76acba7b7f ("drm/amdgpu/gfx11: add a mutex for the gfx semaphore")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:14:14 -04:00
Victor Zhao
bf2bc61638 drm/amd/amdgpu: allow use kiq to do hdp flush under sriov
when use cpu to do page table update under sriov runtime, since mmio
access is blocked, kiq has to be used to flush hdp.

change WREG32_NO_KIQ to WREG32 to allow kiq.

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:14:14 -04:00
Alex Deucher
c69b07f7bb drm/amdgpu: fix eGPU hotplug regression
The driver needs to wait for the on board firmware
to finish its initialization before probing the card.
Commit 959056982a ("drm/amdgpu: Fix discovery initialization failure during pci rescan")
switched from using msleep() to using usleep_range() which
seems to have caused init failures on some navi1x boards. Switch
back to msleep().

Fixes: 959056982a ("drm/amdgpu: Fix discovery initialization failure during pci rescan")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3559
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3500
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Ma Jun <Jun.Ma2@amd.com>
2024-08-20 22:14:14 -04:00
Candice Li
c0a04e3570 drm/amdgpu: Validate TA binary size
Add TA binary size validation to avoid OOB write.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:14:13 -04:00
Mukul Joshi
ccf8ef6b75 drm/amdgpu: Implement MES Suspend and Resume APIs for GFX11
Add implementation for MES Suspend and Resume APIs to unmap/map
all queues for GFX11. Support for GFX12 will be added when the
corresponding firmware support is in place.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:14:04 -04:00
Srinivasan Shanmugam
f846250b8a drm/amdgpu/gfx_v9_4_3: Apply Isolation Enforcement to GFX & Compute rings
This commit applies isolation enforcement to the GFX and Compute rings
in the gfx_v9_4_3 module.

The commit sets `amdgpu_gfx_enforce_isolation_ring_begin_use` and
`amdgpu_gfx_enforce_isolation_ring_end_use` as the functions to be
called when a ring begins and ends its use, respectively.

`amdgpu_gfx_enforce_isolation_ring_begin_use` is called when a ring
begins its use. This function cancels any scheduled
`enforce_isolation_work` and, if necessary, signals the Kernel Fusion
Driver (KFD) to stop the runqueue.

`amdgpu_gfx_enforce_isolation_ring_end_use` is called when a ring ends
its use. This function schedules `enforce_isolation_work` to be run
after a delay.

These functions are part of the Enforce Isolation Handler, which
enforces shader isolation on AMD GPUs to prevent data leakage between
different processes.

The commit also includes a check for the type of the ring. If the type
of the ring is `AMDGPU_RING_TYPE_COMPUTE`, the `xcp_id` of the
`enforce_isolation` structure in the `gfx` structure of the
`amdgpu_device` is set to the `xcp_id` of the ring. This ensures that
the correct `xcp_id` is used when enforcing isolation on compute rings.
The `xcp_id` is an identifier for an XCP partition, and different rings
can be associated with different XCP partitions.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
2024-08-20 22:08:02 -04:00
Srinivasan Shanmugam
b710dbe55d drm/amdgpu/gfx9: Apply Isolation Enforcement to GFX & Compute rings
This commit applies isolation enforcement to the GFX and Compute rings
in the gfx_v9_0 module.

The commit sets `amdgpu_gfx_enforce_isolation_ring_begin_use` and
`amdgpu_gfx_enforce_isolation_ring_end_use` as the functions to be
called when a ring begins and ends its use, respectively.

`amdgpu_gfx_enforce_isolation_ring_begin_use` is called when a ring
begins its use. This function cancels any scheduled
`enforce_isolation_work` and, if necessary, signals the Kernel Fusion
Driver (KFD) to stop the runqueue.

`amdgpu_gfx_enforce_isolation_ring_end_use` is called when a ring ends
its use. This function schedules `enforce_isolation_work` to be run
after a delay.

These functions are part of the Enforce Isolation Handler, which
enforces shader isolation on AMD GPUs to prevent data leakage between
different processes.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
2024-08-20 22:07:58 -04:00
Srinivasan Shanmugam
afefd6f245 drm/amdgpu: Implement Enforce Isolation Handler for KGD/KFD serialization
This commit introduces the Enforce Isolation Handler designed to enforce
shader isolation on AMD GPUs, which helps to prevent data leakage
between different processes.

The handler counts the number of emitted fences for each GFX and compute
ring. If there are any fences, it schedules the `enforce_isolation_work`
to be run after a delay of `GFX_SLICE_PERIOD`. If there are no fences,
it signals the Kernel Fusion Driver (KFD) to resume the runqueue.

The function is synchronized using the `enforce_isolation_mutex`.

This commit also introduces a reference count mechanism
(kfd_sch_req_count) to keep track of the number of requests to enable
the KFD scheduler. When a request to enable the KFD scheduler is made,
the reference count is decremented. When the reference count reaches
zero, a delayed work is scheduled to enforce isolation after a delay of
GFX_SLICE_PERIOD.

When a request to disable the KFD scheduler is made, the function first
checks if the reference count is zero. If it is, it cancels the delayed
work for enforcing isolation and checks if the KFD scheduler is active.
If the KFD scheduler is active, it sends a request to stop the KFD
scheduler and sets the KFD scheduler state to inactive. Then, it
increments the reference count.

The function is synchronized using the kfd_sch_mutex to ensure that the
KFD scheduler state and reference count are updated atomically.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:07:35 -04:00
Amber Lin
234eebe161 drm/amdkfd: APIs to stop/start KFD scheduling
Provide amdgpu_amdkfd_stop_sched() for amdgpu to stop KFD scheduling
compute work on HIQ. amdgpu_amdkfd_start_sched() resumes the scheduling.
When amdgpu_amdkfd_stop_sched is called, KFD will unmap queues from
runlist. If users send ioctls to KFD to create queues, they'll be added
but those queues won't be mapped to runlist (so not scheduled) until
amdgpu_amdkfd_start_sched is called.

v2: fix build (Alex)

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:07:28 -04:00
Srinivasan Shanmugam
b1f49ff9cb drm/amdgpu/gfx9: Add cleaner shader support for GFX9.4.4 hardware
This commit extends the cleaner shader feature to support GFX9.4.4
hardware.

The cleaner shader feature is used to clear or initialize certain GPU
resources, such as Local Data Share (LDS), Vector General Purpose
Registers (VGPRs), and Scalar General Purpose Registers (SGPRs). This
operation needs to be performed in isolation, while no other tasks
should be running on the GPU at the same time.

Previously, the cleaner shader feature was implemented for GFX9.4.3
hardware. This commit adds support for GFX9.4.4 hardware by allowing the
cleaner shader to be used with this hardware version.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:07:23 -04:00
Srinivasan Shanmugam
335288315a drm/amdgpu/gfx9: Add cleaner shader for GFX9.4.3
This commit adds the cleaner shader microcode for GFX9.4.3 GPUs. The
cleaner shader is a piece of GPU code that is used to clear or
initialize certain GPU resources, such as Local Data Share (LDS), Vector
General Purpose Registers (VGPRs), and Scalar General Purpose Registers
(SGPRs).

Clearing these resources is important for ensuring data isolation
between different workloads running on the GPU. Without the cleaner
shader, residual data from a previous workload could potentially be
accessed by a subsequent workload, leading to data leaks and incorrect
computation results.

The cleaner shader microcode is represented as an array of 32-bit words
(`gfx_9_4_3_cleaner_shader_hex`). This array is the binary
representation of the cleaner shader code, which is written in a
low-level GPU instruction set.

When the cleaner shader feature is enabled, the AMDGPU driver loads this
array into a specific location in the GPU memory. The GPU then reads
this memory location to fetch and execute the cleaner shader
instructions.

The cleaner shader is executed automatically by the GPU at the end of
each workload, before the next workload starts. This ensures that all
GPU resources are in a clean state before the start of each workload.

This addition is part of the cleaner shader feature implementation. The
cleaner shader feature helps improve GPU performance and resource
utilization by cleaning up GPU resources after they are used. It also
enhances security and reliability by preventing data leaks between
workloads.

v2: fix copyright date (Alex)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:07:16 -04:00
Srinivasan Shanmugam
d4c3815495 drm/amdgpu/gfx9: Implement cleaner shader support for GFX9.4.3 hardware
The patch modifies the gfx_v9_4_3_kiq_set_resources function to write
the cleaner shader's memory controller address to the ring buffer. It
also adds a new function, gfx_v9_4_3_ring_emit_cleaner_shader, which
emits the PACKET3_RUN_CLEANER_SHADER packet to the ring buffer.

This patch adds support for the PACKET3_RUN_CLEANER_SHADER packet in the
gfx_v9_4_3 module. This packet is used to emit the cleaner shader, which
is used to clear GPU memory before it's reused, helping to prevent data
leakage between different processes.

Finally, the patch updates the ring function structures to include the
new gfx_v9_4_3_ring_emit_cleaner_shader function. This allows the
cleaner shader to be emitted as part of the ring's operations.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:07:11 -04:00
Srinivasan Shanmugam
c2e70d307f drm/amdgpu/gfx9: Implement cleaner shader support for GFX9 hardware
The patch modifies the gfx_v9_0_kiq_set_resources function to write
the cleaner shader's memory controller address to the ring buffer. It
also adds a new function, gfx_v9_0_ring_emit_cleaner_shader, which
emits the PACKET3_RUN_CLEANER_SHADER packet to the ring buffer.

This patch adds support for the PACKET3_RUN_CLEANER_SHADER packet in the
gfx_v9_0 module. This packet is used to emit the cleaner shader, which
is used to clear GPU memory before it's reused, helping to prevent data
leakage between different processes.

Finally, the patch updates the ring function structures to include the
new gfx_v9_0_ring_emit_cleaner_shader function. This allows the
cleaner shader to be emitted as part of the ring's operations.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:07:07 -04:00
Srinivasan Shanmugam
22ff907d4f drm/amdgpu: Add PACKET3_RUN_CLEANER_SHADER for cleaner shader execution
This commit adds the PACKET3_RUN_CLEANER_SHADER definition. This packet
is a command packet used to instruct the GPU to execute the cleaner
shader.

The cleaner shader is a piece of GPU code that is used to clear or
initialize certain GPU resources, such as Local Data Share (LDS), Vector
General Purpose Registers (VGPRs), and Scalar General Purpose Registers
(SGPRs). Clearing these resources is important for ensuring data
isolation between different workloads running on the GPU.

The PACKET3_RUN_CLEANER_SHADER packet is used to trigger the execution
of the cleaner shader on the GPU. The packet consists of a header
followed by a RESERVED field, which is programmed to zero. When the GPU
receives this packet, it fetches and executes the cleaner shader
instructions from the location specified in the packet.

The cleaner shader feature helps to enhances security and reliability by
preventing data leaks between workloads.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:07:02 -04:00
Srinivasan Shanmugam
d361ad5d2f drm/amdgpu: Add sysfs interface for running cleaner shader
This patch adds a new sysfs interface for running the cleaner shader on
AMD GPUs. The cleaner shader is used to clear GPU memory before it's
reused, which can help prevent data leakage between different processes.

The new sysfs file is write-only and is named `run_cleaner_shader`.
Write the number of the partition to this file to trigger the cleaner shader
on that partition. There is only one partition on GPUs which do not
support partitioning.

Changes made in this patch:

- Added `amdgpu_set_run_cleaner_shader` function to handle writes to the
  `run_cleaner_shader` sysfs file.
- Added `run_cleaner_shader` to the list of device attributes in
  `amdgpu_device_attrs`.
- Updated `default_attr_update` to handle `run_cleaner_shader`.
- Added `AMDGPU_DEVICE_ATTR_WO` macro to create write-only device
  attributes.

v2: fix error handling (Alex)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
2024-08-20 22:06:56 -04:00
Srinivasan Shanmugam
e189be9b2e drm/amdgpu: Add enforce_isolation sysfs attribute
This commit adds a new sysfs attribute 'enforce_isolation' to control
the 'enforce_isolation' setting per GPU. The attribute can be read and
written, and accepts values 0 (disabled) and 1 (enabled).

When 'enforce_isolation' is enabled, reserved VMIDs are allocated for
each ring. When it's disabled, the reserved VMIDs are freed.

The set function locks a mutex before changing the 'enforce_isolation'
flag and the VMIDs, and unlocks it afterwards. This ensures that these
operations are atomic and prevents race conditions and other concurrency
issues.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20 22:06:52 -04:00
Srinivasan Shanmugam
dba1a6cfc3 drm/amdgpu: Enforce isolation as part of the job
This patch adds a new parameter 'enforce_isolation' to the amdgpu_job
structure. This parameter is used to determine whether shader isolation
should be enforced for a job. The enforce_isolation parameter is then
stored in the amdgpu_job structure and used when flushing the VM.

The enforce_isolation field of the amdgpu_job structure is set directly
after the job is allocated

This change allows more fine-grained control over shader isolation,
making it possible to enforce isolation on a per-job basis rather than
globally. This can be useful in scenarios where only certain jobs
require isolation.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
2024-08-20 22:06:43 -04:00
Victor Skvortsov
19cff16559 drm/amdgpu: abort KIQ waits when there is a pending reset
Stop waiting for the KIQ to return back when there is a reset pending.
It's quite likely that the KIQ will never response.

Signed-off-by: Koenig Christian <Christian.Koenig@amd.com>
Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com>
Tested-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:27:50 -04:00
Srinivasan Shanmugam
9659520419 drm/amdgpu: Make enforce_isolation setting per GPU
This commit makes enforce_isolation setting to be per GPU and per
partition by adding the enforce_isolation array to the adev structure.
The adev variable is set based on the global enforce_isolation module
parameter during device initialization.

In amdgpu_ids.c, the adev->enforce_isolation value for the current GPU
is used to determine whether to enforce isolation between graphics and
compute processes on that GPU.

In amdgpu_ids.c, the adev->enforce_isolation value for the current GPU
and partition is used to determine whether to enforce isolation between
graphics and compute processes on that GPU and partition.

This allows the enforce_isolation setting to be controlled individually
for each GPU and each partition, which is useful in a system with
multiple GPUs and partitions where different isolation settings might be
desired for different GPUs and partitions.

v2: fix loop in amdgpu_vmid_mgr_init() (Alex)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
2024-08-16 14:27:45 -04:00
Alex Deucher
ee7a846ea2 drm/amdgpu: Emit cleaner shader at end of IB submission
This commit introduces the emission of a cleaner shader at the end of
the IB submission process. This is achieved by adding a new function
pointer, `emit_cleaner_shader`, to the `amdgpu_ring_funcs` structure. If
the `emit_cleaner_shader` function is set in the ring functions, it is
called during the VM flush process.

The cleaner shader is only emitted if the `enable_cleaner_shader` flag
is set in the `amdgpu_device` structure. This allows the cleaner shader
emission to be controlled on a per-device basis.

By emitting a cleaner shader at the end of the IB submission, we can
ensure that the VM state is properly cleaned up after each submission.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
2024-08-16 14:27:42 -04:00
Srinivasan Shanmugam
aec773a1fb drm/amdgpu: Add infrastructure for Cleaner Shader feature
The cleaner shader is used by the CP firmware to clean LDS and GPRs
between processes on the CUs.

This adds an internal API for GFX IP code to allocate and initialize the
cleaner shader.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
2024-08-16 14:27:34 -04:00
Alex Deucher
f49280ffd2 drm/amdgpu: handle enforce isolation on non-0 gfxhub
Some chips have more than one gfxhub so check if we
are a gfxhub rather than just gfxhub 0.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:27:28 -04:00
Alex Deucher
2dc3851ef7 drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1
The workaround seems to cause stability issues on other
SDMA 5.2.x IPs.

Fixes: a03ebf1163 ("drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3556
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:27:04 -04:00
Sunil Khatri
1a2103d685 drm/amdgpu: add vcn ip dump support for vcn_v2_6
Add support for logging the registers in devcoredump
buffer for vcn_v2_6.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:26:57 -04:00
Sunil Khatri
bc62abe1b9 drm/amdgpu: add print support for vcn_v2_5 ip dump
Add support for logging the registers in devcoredump
buffer for vcn_v2_5.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:26:52 -04:00
Sunil Khatri
0eea81ee2e drm/amdgpu: add vcn_v2_5 ip dump support
Add support of vcn ip dump in the devcoredump
for vcn_v2_5.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:26:47 -04:00
Sunil Khatri
b910cacb4e drm/amdgpu: add print support for vcn_v2_0 ip dump
Add support for logging the registers in devcoredump
buffer for vcn_v2_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:26:41 -04:00
Sunil Khatri
2239aaa204 drm/amdgpu: add vcn_v2_0 ip dump support
Add support of vcn ip dump in the devcoredump
for vcn_v2_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:26:36 -04:00
Sunil Khatri
ef9f3b5fd9 drm/amdgpu: add print support for vcn_v1_0 ip dump
Add support for logging the registers in devcoredump
buffer for vcn_v1_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:26:31 -04:00
Sunil Khatri
837cc7f1bf drm/amdgpu: add vcn_v1_0 ip dump support
Add support of vcn ip dump in the devcoredump
for vcn_v1_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:26:25 -04:00
Sunil Khatri
439c3b124e drm/amdgpu: add print support for vcn_v4_0_5 ip dump
Add support for logging the registers in devcoredump
buffer for vcn_v4_0_5.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:26:19 -04:00
Sunil Khatri
3a50a51d04 drm/amdgpu: add print support for vcn_v4_0 ip dump
Add support for logging the registers in devcoredump
buffer for vcn_v4_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:26:12 -04:00
Sunil Khatri
dc57edda81 drm/amdgpu: add print support for vcn_v4_0_3 ip dump
Add support for logging the registers in devcoredump
buffer for vcn_v4_0_3.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:26:06 -04:00
Sunil Khatri
46553db49c drm/amdgpu: add vcn_v4_0_5 ip dump support
Add support of vcn ip dump in the devcoredump
for vcn_v4_0_5.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:25:59 -04:00
Sunil Khatri
9d87dac3f9 drm/amdgpu: add vcn_v4_0 ip dump support
Add support of vcn ip dump in the devcoredump
for vcn_v4_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:25:53 -04:00
Sunil Khatri
8962915044 drm/amdgpu: add vcn_v4_0_3 ip dump support
Add support of vcn ip dump in the devcoredump
for vcn_v4_0_3.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:25:30 -04:00
Sunil Khatri
f3c958ab85 drm/amdgpu: add print support for vcn_v5_0 ip dump
Add support for logging the registers in devcoredump
buffer for vcn_v5_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:25:12 -04:00
Alex Deucher
32aada4d0a drm/amdgpu/mes12: add API for user queue reset
Add API for resetting user queues.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:25:09 -04:00
Alex Deucher
d4f1fde734 drm/amdgpu/mes11: add API for user queue reset
Add API for resetting user queues.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:25:06 -04:00
Alex Deucher
5b7a59de48 drm/amdgpu/mes: add API for user queue reset
Add API for resetting user queues.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:25:02 -04:00
Alex Deucher
478efcb90b drm/amdgpu/gfx11: export gfx_v11_0_request_gfx_index_mutex()
It will be used by the queue reset code.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:24:56 -04:00
Alex Deucher
76acba7b7f drm/amdgpu/gfx11: add a mutex for the gfx semaphore
This will be used in more places in the future so
add a mutex.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:24:33 -04:00
Alex Deucher
b5be054c58 drm/amdgpu/gfx11: enter safe mode before touching CP_INT_CNTL
Need to enter safe mode before touching GC MMIO.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:24:23 -04:00
Alex Deucher
d479158f65 drm/amdgpu/gfx7: add ring reset callback for gfx
Add ring reset callback for gfx.

v2: fix operator precedence (kernel test robot)

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:24:18 -04:00
Alex Deucher
4af8071b65 drm/amdgpu/gfx8: add ring reset callback for gfx
Add ring reset callback for gfx.

v2: fix operator precedence (kernel test robot)

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:24:09 -04:00
Sunil Khatri
f685b38455 drm/amdgpu: add vcn_v5_0 ip dump support
Add support of vcn ip dump in the devcoredump
for vcn_v5_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:24:04 -04:00
Sunil Khatri
6d88c0f94a drm/amdgpu: add print support for vcn_v3_0 ip dump
Add support for logging the registers in devcoredump
buffer for vcn_v3_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:23:59 -04:00
Sunil Khatri
ab10f77487 drm/amdgpu: add vcn_v3_0 ip dump support
Add support of vcn ip dump in the devcoredump
for vcn_v3_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:23:53 -04:00
Sunil Khatri
27a74c125d drm/amdgpu: add vcn ip dump ptr in vcn global struct
Add pointer to the vcn ip dump in the vcn global structure
to be accessible for all vcn version via global adev.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:23:45 -04:00
Zhang Zekun
20588d5afc drm/amd: Remove unused declarations
amdgpu_gart_table_vram_pin() and amdgpu_gart_table_vram_unpin() has
been removed since commit 575e55ee4f ("drm/amdgpu: recover gart table
at resume") remain the declarations untouched in the header files.

Besides, amdgpu_dm_display_resume() has also beed removed since
commit a80aa93de1 ("drm/amd/display: Unify dm resume sequence into a
single call"). So, let's remove this unused declarations.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:23:16 -04:00
Yang Wang
89ec85d16e drm/amdgpu: fixing rlc firmware loading failure issue
Skip rlc firmware validation to ignore firmware header size mismatch issues.
This restores the workaround added in
commit 849e133c97 ("drm/amdgpu: Fix the null pointer when load rlc firmware")

Fixes: 3af2c80ae2 ("drm/amdgpu: refine gfx10 firmware loading")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3551
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:19:16 -04:00
Sunil Khatri
0f2c243dbf drm/amdgpu: remove ME0 registers from mi300 dump
Remove ME0 registers from MI300 gfx_9_4_3 ipdump
MI300 does not have  gfx ME and hence those register
are just empty one and could be dropped.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:54 -04:00
Alex Deucher
3ec2ad7c34 drm/amdgpu/gfx9: use rlc safe mode for soft recovery
Protect the MMIO access with safe mode.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:51 -04:00
Alex Deucher
d082e5cde4 drm/amdgpu/gfx9.4.3: use rlc safe mode for soft recovery
Protect the MMIO access with safe mode.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:46 -04:00
Alex Deucher
a48f31fb78 drm/amdgpu/gfx9.4.3: use proper rlc safe mode helpers
Rather than open coding it for the queue reset.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:44 -04:00
Alex Deucher
27ef61f961 drm/amdgpu/gfx9: use proper rlc safe mode helpers
Rather than open coding it for the queue reset.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:41 -04:00
Alex Deucher
c4f503551f drm/amdgpu/gfx9: add ring reset callback for gfx
Add ring reset callback for gfx.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:38 -04:00
Alex Deucher
31ef969301 drm/amdgpu/gfx9: per queue reset only on bare metal
It's not supported under SR-IOV at the moment.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:35 -04:00
Jiadong Zhu
4dc4422f11 drm/amdgpu/gfx9.4.3: implement reset_hw_queue for gfx9.4.3
Using mmio to do queue reset. Enter safe mode
before writing mmio registers.

v2: set register instance offset according to xcc id.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:30 -04:00
Jiadong Zhu
2e9bbdd7b7 drm/amdgpu/gfx9: implement reset_hw_queue for gfx9
Using mmio to do queue reset. Enter safe mode
when writing registers.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:27 -04:00
Jiadong Zhu
186020c166 drm/amdgpu/gfx: add a new kiq_pm4_funcs callback for reset_hw_queue
Add reset_hw_queue in kiq_pm4_funcs callbacks.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:25 -04:00
Jiadong Zhu
4c953e53cc drm/amdgpu/gfx_9.4.3: wait for reset done before remap
There is a racing condition that cp firmware modifies
MQD in reset sequence after driver updates it for
remapping. We have to wait till CP_HQD_ACTIVE becoming
false then remap the queue.

v2: fix KIQ locking (Alex)
v3: fix KIQ locking harder

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:22 -04:00
Jiadong Zhu
6f38589e17 drm/amdgpu/gfx9.4.3: remap queue after reset successfully
Kiq command unmap_queues only does the dequeueing action.
We have to map the queue back with clean mqd.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:19 -04:00
Alex Deucher
5d0112f777 drm/amdgpu/gfx9.4.3: add ring reset callback
Add ring reset callback for compute.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:14 -04:00
Jiadong Zhu
fdbd69486b drm/amdgpu/gfx9: wait for reset done before remap
There is a racing condition that cp firmware modifies
MQD in reset sequence after driver updates it for
remapping. We have to wait till CP_HQD_ACTIVE becoming
false then remap the queue.

v2: fix KIQ locking (Alex)
v3: fix KIQ locking harder

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:06 -04:00
Jiadong Zhu
b5e1a3874f drm/amdgpu/gfx9: remap queue after reset successfully
Kiq command unmap_queues only does the dequeueing action.
We have to map the queue back with clean mqd.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:03 -04:00
Alex Deucher
5fb4d2a771 drm/amdgpu/gfx9: add ring reset callback
Add ring reset callback for compute.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:18:00 -04:00
Prike Liang
fb0a5834a3 drm/amdgpu: increase the reset counter for the queue reset
Update the reset counter for the amdgpu_cs_query_reset_state()

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:17:56 -04:00
Alex Deucher
15789fa0f0 drm/amdgpu: add per ring reset support (v5)
If a specific job is hung, try and reset just the
ring associated with the job.

v2: move to amdgpu_job.c
v3: fix drm_sched_stop() handling when ring reset fails
v4: drop unnecessary amdgpu_fence_driver_clear_job_fences() and
    drm_sched_increase_karma()
v5: rework sched_stop handling

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:17:52 -04:00
Alex Deucher
57a372f676 drm/amdgpu: add new ring reset callback
Use this to reset just a single ring.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:17:40 -04:00
Soham Dandapat
406792dc2a drm/amdgpu: Return earlier in amdgpu_sw_ring_ib_end if mcbp is off
As we don't trigger preemption is sw ring muxer when mcbp is
disabled,so return earlier in amdgpu_sw_ring_ib_end function
if mcbp is disabled ,not required to call amdgpu_ring_mux_end_ib

Signed-off-by: Soham Dandapat <sdandapa@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:17:31 -04:00
Sunil Khatri
37ee145623 drm/amdgpu: add cp queue registers print for gfx9_4_3
Add gfx9_4_3 print support of CP queue registers
for all queues to be used by devcoredump.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:17:26 -04:00
Sunil Khatri
f9e491c863 drm/amdgpu: add cp queue registers for gfx9_4_3 ipdump
Add gfx9 support of CP queue registers for all queues
to be used by devcoredump.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-16 14:17:14 -04:00
Thomas Zimmermann
ddda6542c8 drm/amdgpu: Use backlight power constants
Replace FB_BLANK_ constants with their counterparts from the
backlight subsystem. The values are identical, so there's no
change in functionality or semantics.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Xinhui Pan <Xinhui.Pan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240731122311.1143153-2-tzimmermann@suse.de
2024-08-16 09:27:15 +02:00
Kenneth Feng
23acd1f344 drm/amd/amdgpu: add HDP_SD support on gc 12.0.0/1
add HDP_SD support on gc 12.0.0/1

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 61cffacb3a)
2024-08-13 13:20:43 -04:00
Yinjie Yao
507a2286c0 drm/amdgpu: Update kmd_fw_shared for VCN5
kmd_fw_shared changed in VCN5

Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit aa02486fb1)
2024-08-13 13:20:36 -04:00
David (Ming Qiang) Wu
470516c292 drm/amd/amdgpu: command submission parser for JPEG
Add JPEG IB command parser to ensure registers
in the command are within the JPEG IP block.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a7f670d5d8)
Cc: stable@vger.kernel.org
2024-08-13 13:17:36 -04:00
Jack Xiao
4246b1077f drm/amdgpu/mes12: fix suspend issue
Use mes pipe to unmap kcq and kgq.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f7fb9d677f)
2024-08-13 13:17:36 -04:00
Jack Xiao
af401543df drm/amdgpu/mes12: sw/hw fini for unified mes
Free memory for two pipes and unmap pipe0 via pipe1.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 98cae695a8)
2024-08-13 13:17:36 -04:00
Jack Xiao
7254027e1e drm/amdgpu/mes12: configure two pipes hardware resources
Configure two pipes with different hardware resources.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ea5d6db17a)
2024-08-13 13:17:36 -04:00
Jack Xiao
1097727d6d drm/amdgpu/mes12: adjust mes12 sw/hw init for multiple pipes
Adjust mes12 sw/hw initiailization for both pipe0 and pipe1
enablement. The two pipes are almost identical pipe. Pipe0
behaves like schq and pipe1 like kiq, pipe0 was mapped by pipe1.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit aa539da8af)
2024-08-13 13:17:36 -04:00
Jack Xiao
3738a7f0dd drm/amdgpu/mes12: add mes pipe switch support
Add mes pipe switch to let caller choose pipe
to submit packet.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b2dee0837a)
2024-08-13 13:17:36 -04:00
Jack Xiao
a13d91bf3c drm/amdgpu/mes12: load unified mes fw on pipe0 and pipe1
Enable unified mes firmware to load on pipe0 and pipe1.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e69c2dd753)
2024-08-13 13:17:36 -04:00
Jack Xiao
2029b3d7e1 drm/amdgpu/mes: add multiple mes ring instances support
Add multiple mes ring instances in mes structure to support
multiple mes pipes.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c7d4355648)
2024-08-13 13:04:48 -04:00
Bas Nieuwenhuizen
0573a1e2ea drm/amdgpu: Actually check flags for all context ops.
Missing validation ...

Checked libdrm and it clears all the structs, so we should be
safe to just check everything.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c6b86421f1)
Cc: stable@vger.kernel.org
2024-08-13 13:03:57 -04:00
Alex Deucher
e6c6bd6253 drm/amdgpu/jpeg4: properly set atomics vmid field
This needs to be set as well if the IB uses atomics.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c6c2e8b6a4)
Cc: stable@vger.kernel.org
2024-08-13 13:03:11 -04:00
Alex Deucher
e414a304f2 drm/amdgpu/jpeg2: properly set atomics vmid field
This needs to be set as well if the IB uses atomics.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 35c628774e)
Cc: stable@vger.kernel.org
2024-08-13 13:02:48 -04:00
Jack Xiao
11752c013f drm/amdgpu/mes: fix mes ring buffer overflow
wait memory room until enough before writing mes packets
to avoid ring buffer overflow.

v2: squash in sched_hw_submission fix

Fixes: de32462541 ("drm/amdgpu: cleanup MES11 command submission")
Fixes: fffe347e14 ("drm/amdgpu: cleanup MES12 command submission")
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 34e087e892)
Cc: stable@vger.kernel.org
2024-08-13 12:50:01 -04:00
Sunil Khatri
b232c4a63a drm/amdgpu: add print support for gfx9_4_3 ipdump
Add support of gfx9_4_3 ipdump print so devcoredump
could trigger it to dump the captured registers
in devcoredump.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:13:03 -04:00
Sunil Khatri
1091796fb1 drm/amdgpu: add gfx9_4_3 register support in ipdump
Add general registers of gfx9_4_3 in ipdump for
devcoredump support.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:13:03 -04:00
David (Ming Qiang) Wu
6a28a072d9 drm/amd/amdgpu: cleanup parse_cs callbacks
Because gpu_addr is updated in the calling routine
(amdgpu_cs_patch_ibs()),it is removed in the callback.

Use .patch_cs_in_place instead of .parse_cs for
amdgpu_vce_ring_parse_cs_vm() as there is no need for keeping
a temporary IB, therefore ib->sa_bo is NULL and amdgpu_ib_free()
is removed.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:13:03 -04:00
David (Ming Qiang) Wu
a7f670d5d8 drm/amd/amdgpu: command submission parser for JPEG
Add JPEG IB command parser to ensure registers
in the command are within the JPEG IP block.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:13:03 -04:00
Jack Xiao
f7fb9d677f drm/amdgpu/mes12: fix suspend issue
Use mes pipe to unmap kcq and kgq.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:13:03 -04:00
Jack Xiao
98cae695a8 drm/amdgpu/mes12: sw/hw fini for unified mes
Free memory for two pipes and unmap pipe0 via pipe1.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:13:03 -04:00
Jack Xiao
ea5d6db17a drm/amdgpu/mes12: configure two pipes hardware resources
Configure two pipes with different hardware resources.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Jack Xiao
aa539da8af drm/amdgpu/mes12: adjust mes12 sw/hw init for multiple pipes
Adjust mes12 sw/hw initiailization for both pipe0 and pipe1
enablement. The two pipes are almost identical pipe. Pipe0
behaves like schq and pipe1 like kiq, pipe0 was mapped by pipe1.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Jack Xiao
b2dee0837a drm/amdgpu/mes12: add mes pipe switch support
Add mes pipe switch to let caller choose pipe
to submit packet.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Victor Skvortsov
9e823f3070 drm/amdgpu: Block MMR_READ IOCTL in reset
Register access from userspace should be blocked until
reset is complete.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Jonathan Kim
a85c3db6b3 drm/amdkfd: fallback to pipe reset on queue reset fail for gfx9
If queue reset fails, tell the CP to reset the pipe.
Since queues multiplex context per pipe and we've issued a device wide
preemption prior to the hang, we can assume the hung pipe only has one
queue to reset on pipe reset.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Lijo Lazar
9c081c11c6 drm/amdgpu: Reorder to read EFI exported ROM first
On EFI BIOSes, PCI ROM may be exported through EFI_PCI_IO_PROTOCOL and
expansion ROM BARs may not be enabled. Choose to read from EFI exported
ROM data before reading PCI Expansion ROM BAR.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Jack Xiao
e69c2dd753 drm/amdgpu/mes12: load unified mes fw on pipe0 and pipe1
Enable unified mes firmware to load on pipe0 and pipe1.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Victor Skvortsov
f83cec3b3a drm/amdgpu: Disable dpm_enabled flag while VF is in reset
VFs do not perform HW fini/suspend in FLR, so the dpm_enabled
is incorrectly kept enabled. Add interface to disable it in
virt_pre_reset call.

v2: Made implementation generic for all asics
v3: Re-order conditionals so PP_MP1_STATE_FLR is only evaluated on VF

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Victor Skvortsov
35c7152202 Revert "drm/amdgpu: Extend KIQ reg polling wait for VF"
KIQ timeouts no longer seen.

This reverts commit 3a19a8af64.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Yinjie Yao
aa02486fb1 drm/amdgpu: Update kmd_fw_shared for VCN5
kmd_fw_shared changed in VCN5

Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Kenneth Feng
61cffacb3a drm/amd/amdgpu: add HDP_SD support on gc 12.0.0/1
add HDP_SD support on gc 12.0.0/1

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:51 -04:00
Victor Zhao
ef6c2cb349 drm/amd/sriov: extend NV_MAILBOX_POLL_MSG_TIMEDOUT
on MI300/MI308 UBB products, when doing mode1 reset, since 1 gpu need to
wait all 8 gpus finish mode1 reset and then do re-init. As observed,
sometimes the gpu which triggered the reset need to wait 15s for all
gpus to finish.

If poll msg timeout, guest driver will send the reset message again, and
may mess up the following reinit sequence on other gpus.

So extend the time to cover the maximum time needed to recover.

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:51 -04:00
Sunil Khatri
9b7e697839 drm/amdgpu: fix ptr check warning in gfx12 ip_dump
Change condition, if (ptr == NULL) to if (!ptr)
for a better format and fix the warning.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:34:51 -04:00
Sunil Khatri
bd15f805cd drm/amdgpu: fix ptr check warning in gfx11 ip_dump
Change condition, if (ptr == NULL) to if (!ptr)
for a better format and fix the warning.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:34:31 -04:00
Sunil Khatri
98df5a7732 drm/amdgpu: fix ptr check warning in gfx10 ip_dump
Change condition, if (ptr == NULL) to if (!ptr)
for a better format and fix the warning.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:34:23 -04:00
Sunil Khatri
07f4f9c00e drm/amdgpu: fix ptr check warning in gfx9 ip_dump
Change if (ptr == NULL) to if (!ptr) for a better
format and fix the warning.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:33:07 -04:00
Sunil Khatri
311f2b5874 Revert "drm/amdgpu: add vcn ip dump ptr in vcn global struct"
This reverts commit f3392e662e.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:30:49 -04:00
Sunil Khatri
434b3554d6 Revert "drm/amdgpu: add vcn_v3_0 ip dump support"
This reverts commit 58d283801d.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:30:24 -04:00
Sunil Khatri
2f93ec07ab Revert "drm/amdgpu: add print support for vcn_v3_0 ip dump"
This reverts commit cd162ae9bc.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:29:32 -04:00
Jack Xiao
c7d4355648 drm/amdgpu/mes: add multiple mes ring instances support
Add multiple mes ring instances in mes structure to support
multiple mes pipes.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:29:25 -04:00
Sunil Khatri
3df3433414 Revert "drm/amdgpu: add vcn_v5_0 ip dump support"
This reverts commit a46a7bef7d.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:28:51 -04:00
Sunil Khatri
a46a7bef7d drm/amdgpu: add vcn_v5_0 ip dump support
Add support of vcn ip dump in the devcoredump
for vcn_v5_0.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:28:13 -04:00
Alex Deucher
947c080869 drm/amdgpu/mes12: add API for legacy queue reset
Add API for resetting kernel queues.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:28:11 -04:00
Alex Deucher
45a2a45143 drm/amdgpu/mes11: add API for legacy queue reset
Add API for resetting kernel queues.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:28:07 -04:00
Alex Deucher
c30fb344a2 drm/amdgpu/mes: add API for legacy queue reset
Add API for resetting kernel queues.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:27:59 -04:00
Bas Nieuwenhuizen
c6b86421f1 drm/amdgpu: Actually check flags for all context ops.
Missing validation ...

Checked libdrm and it clears all the structs, so we should be
safe to just check everything.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:27:22 -04:00
Arnd Bergmann
020620424b drm/amd: Use a constant format string for amdgpu_ucode_request
Multiple files in amdgpu call amdgpu_ucode_request() with a fw_name
variable that the compiler cannot check for being a valid format string,
as seen by enabling the (default-disabled) -Wformat-security option:

drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c: In function 'amdgpu_mes_init_microcode':
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1517:61: error: format not a string literal and no format arguments [-Werror=format-security]
 1517 |         r = amdgpu_ucode_request(adev, &adev->mes.fw[pipe], fw_name);
      |                                                             ^~~~~~~
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c: In function 'amdgpu_uvd_sw_init':
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:9: error: format not a string literal and no format arguments [-Werror=format-security]
  263 |         r = amdgpu_ucode_request(adev, &adev->uvd.fw, fw_name);
      |         ^
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c: In function 'amdgpu_vce_sw_init':
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:161:9: error: format not a string literal and no format arguments [-Werror=format-security]
  161 |         r = amdgpu_ucode_request(adev, &adev->vce.fw, fw_name);
      |         ^
drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c: In function 'amdgpu_umsch_mm_init_microcode':
drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c:590:9: error: format not a string literal and no format arguments [-Werror=format-security]
  590 |         r = amdgpu_ucode_request(adev, &adev->umsch_mm.fw, fw_name);
      |         ^
drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c: In function 'amdgpu_cgs_get_firmware_info':
drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:417:72: error: format not a string literal and no format arguments [-Werror=format-security]
  417 |                         err = amdgpu_ucode_request(adev, &adev->pm.fw, fw_name);
      |                                                                        ^~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In function 'load_dmcu_fw':
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:2221:9: error: format not a string literal and no format arguments [-Werror=format-security]
 2221 |         r = amdgpu_ucode_request(adev, &adev->dm.fw_dmcu, fw_name_dmcu);
      |         ^
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In function 'dm_init_microcode':
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:5147:9: error: format not a string literal and no format arguments [-Werror=format-security]
 5147 |         r = amdgpu_ucode_request(adev, &adev->dm.dmub_fw, fw_name_dmub);
      |         ^

Change these all to use a "%s" format with the actual name as an argument,
to let the compiler prove this to be correct.

Fixes: e5a7d047f4 ("drm/amd: Use `amdgpu_ucode_*` helpers for CGS")
Fixes: 52215e2a5d ("drm/amd: Use `amdgpu_ucode_*` helpers for VCE")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:27:03 -04:00
Tobias Jakobi
8641b81739 drm/amd: Make amd_ip_funcs static for SDMA v5.2
The struct can be static, as it is only used in this
translation unit.

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:26:59 -04:00
Tobias Jakobi
9a12b1c7a0 drm/amd: Make amd_ip_funcs static for SDMA v5.0
The struct can be static, as it is only used in this
translation unit.

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:26:53 -04:00
WangYuli
0cee47cde4 drm/amd/amdgpu: Properly tune the size of struct
The struct assertion is failed because sparse cannot parse
`#pragma pack(push, 1)` and `#pragma pack(pop)` correctly.
GCC's output is still 1-byte-aligned. No harm to memory layout.

The error can be filtered out by sparse-diff, but sometimes
multiple lines queezed into one, making the sparse-diff thinks
its a new error. I'm trying to aviod this by fixing errors.

Link: https://lore.kernel.org/all/20230620045919.492128-1-suhui@nfschina.com/
Link: https://lore.kernel.org/all/93d10611-9fbb-4242-87b8-5860b2606042@suswa.mountain/
Fixes: 1721bc1b2a ("drm/amdgpu: Update VF2PF interface")
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: wenlunpeng <wenlunpeng@uniontech.com>
Reported-by: Su Hui <suhui@nfschina.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:25:49 -04:00
Alex Deucher
c6c2e8b6a4 drm/amdgpu/jpeg4: properly set atomics vmid field
This needs to be set as well if the IB uses atomics.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:25:45 -04:00
Alex Deucher
35c628774e drm/amdgpu/jpeg2: properly set atomics vmid field
This needs to be set as well if the IB uses atomics.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 10:25:16 -04:00
Thomas Zimmermann
7a26f18119 drm/amdgpu: Do not set struct drm_driver.lastclose
Remove the implementation of struct drm_driver.lastclose. The hook
was only necessary before in-kernel DRM clients existed, but is now
obsolete. The code in amdgpu_driver_lastclose_kms() is performed by
drm_lastclose().

v2:
- update commit message

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812083000.337744-3-tzimmermann@suse.de
2024-08-13 16:21:08 +02:00
Jack Xiao
34e087e892 drm/amdgpu/mes: fix mes ring buffer overflow
wait memory room until enough before writing mes packets
to avoid ring buffer overflow.

v2: squash in sched_hw_submission fix

Fixes: de32462541 ("drm/amdgpu: cleanup MES11 command submission")
Fixes: fffe347e14 ("drm/amdgpu: cleanup MES12 command submission")
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 09:57:52 -04:00
Daniel Vetter
4e996697a4 drm-misc-next for v6.12:
UAPI Changes:
 
 - remove Power Saving Policy property
 
 Core Changes:
 
 - update connector documentation
 
 CI:
 - add tests for mediatek, meson, rockchip
 
 Driver Changes:
 
 amdgpu:
 - revert support for Power Saving Policy property
 
 bridge:
 - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR
 
 mgag200:
 - transparently support BMC outputs
 
 omapdrm:
 - use common helper for_each_endpoint_of_node()
 
 panel:
 - panel-edp: fix name for HKC MB116AN01
 
 vkms:
 - clean up endianess warnings
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAma1wMsACgkQaA3BHVML
 eiM+/Af9Fuq5tLo9MlRMDS5kF41UJhEiqd805tlz0mofoIbsbynKIxbTD+sCrXxZ
 mzWX5lVO6tPPudtYbHX9tkD4eV8neiI9QTuTPiBgP4oRpPlJbnJzuG77/qG3gPIe
 L+ByiEx1e88A7BY0rY/xjBAKirYaQfVkGouD6m3NgZasiop0Bie3hTDtfPoRYh5u
 odsEHvrOf4tFOTjEZI9/KlEmW2lBziydTtm2yIiFyQIAy0O+FZIfDtMRjpo3ivKR
 wfPm15SleKGuLvDL8QsBNl+HshFywqn489yvbp0g56aRfIeQgNFF5g9Aj1+DXniE
 C/Dm7GDQhK98Wso3yECoK0/AuMHBmA==
 =JEjA
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-08-09' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v6.12:

UAPI Changes:

- remove Power Saving Policy property

Core Changes:

- update connector documentation

CI:
- add tests for mediatek, meson, rockchip

Driver Changes:

amdgpu:
- revert support for Power Saving Policy property

bridge:
- lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR

mgag200:
- transparently support BMC outputs

omapdrm:
- use common helper for_each_endpoint_of_node()

panel:
- panel-edp: fix name for HKC MB116AN01

vkms:
- clean up endianess warnings

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809071241.GA222501@localhost.localdomain
2024-08-09 10:41:59 +02:00
Daniel Vetter
91dae758bd drm-misc-next for v6.12:
UAPI Changes:
 
 virtio:
 - Define DRM capset
 
 Cross-subsystem Changes:
 
 dma-buf:
 - heaps: Clean up documentation
 
 printk:
 - Pass description to kmsg_dump()
 
 Core Changes:
 
 CI:
 - Update IGT tests
 - Point upstream repo to GitLab instance
 
 modesetting:
 - Introduce Power Saving Policy property for connectors
 - Add might_fault() to drm_modeset_lock priming
 - Add dynamic per-crtc vblank configuration support
 
 panic:
 - Avoid build-time interference with framebuffer console
 
 docs:
 - Document Colorspace property
 
 scheduler:
 - Remove full_recover from drm_sched_start
 
 TTM:
 - Make LRU walk restartable after dropping locks
 - Allow direct reclaim to allocate local memory
 
 Driver Changes:
 
 amdgpu:
 - Support Power Saving Policy connector property
 
 ast:
 - astdp: Support AST2600 with VGA; Clean up HPD
 
 bridge:
 - Silence error message on -EPROBE_DEFER
 - analogix: Clean aup
 - bridge-connector: Fix double free
 - lt6505: Disable interrupt when powered off
 - tc358767: Make default DP port preemphasis configurable
 
 gma500:
 - Update i2c terminology
 
 ivpu:
 - Add MODULE_FIRMWARE()
 
 lcdif:
 - Fix pixel clock
 
 loongson:
 - Use GEM refcount over TTM's
 
 mgag200:
 - Improve BMC handling
 - Support VBLANK intterupts
 
 nouveau:
 - Refactor and clean up internals
 - Use GEM refcount over TTM's
 
 panel:
 - Shutdown fixes plus documentation
 - Refactor several drivers for better code sharing
 - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus
   DT; Fix porch parameter
 - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1,
   BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
   CMN N116BCP-EA2, CSW MNB601LS1-4
 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
 - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
 - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor
   for code sharing
 
 sti:
 - Fix module owner
 
 stm:
 - Avoid UAF wih managed plane and CRTC helpers
 - Fix module owner
 - Fix error handling in probe
 - Depend on COMMON_CLK
 - ltdc: Fix transparency after disabling plane; Remove unused interrupt
 
 tegra:
 - Call drm_atomic_helper_shutdown()
 
 v3d:
 - Clean up perfmon
 
 vkms:
 - Clean up
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmareygACgkQaA3BHVML
 eiO2vwf9FirbMiq4lfHzgcbNIU1dTUtjRAZjrlwGmqk5cb9lUshAMCMBMOEQBDdg
 XMQQj/RMBvRUuxzsPGk78ObSz5FBaBLgKwFprer0V6uslQaJxj4YRsnkp0l2n+0k
 +ebhfo2rUgZOdgNOkXH326w9UhqiydIa7GaA2aq1vUzXKFDfvGXtSN75BMlEWlKP
 rTft56AiwjwcKu7zYFHGlFUMSNpKAQy7lnV3+dBXAfFNHu4zVNoI/yWGEOdR7eVo
 WhiEcpvismsOh+BfUvMNPP3RKwjXHdwMlJYb+v9XGgH27hqc50lSceWydHtoJTto
 DTXF9WQhJ+/GQR9ZGmBjos9GVbECDA==
 =L/1W
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-08-01' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v6.12:

UAPI Changes:

virtio:
- Define DRM capset

Cross-subsystem Changes:

dma-buf:
- heaps: Clean up documentation

printk:
- Pass description to kmsg_dump()

Core Changes:

CI:
- Update IGT tests
- Point upstream repo to GitLab instance

modesetting:
- Introduce Power Saving Policy property for connectors
- Add might_fault() to drm_modeset_lock priming
- Add dynamic per-crtc vblank configuration support

panic:
- Avoid build-time interference with framebuffer console

docs:
- Document Colorspace property

scheduler:
- Remove full_recover from drm_sched_start

TTM:
- Make LRU walk restartable after dropping locks
- Allow direct reclaim to allocate local memory

Driver Changes:

amdgpu:
- Support Power Saving Policy connector property

ast:
- astdp: Support AST2600 with VGA; Clean up HPD

bridge:
- Silence error message on -EPROBE_DEFER
- analogix: Clean aup
- bridge-connector: Fix double free
- lt6505: Disable interrupt when powered off
- tc358767: Make default DP port preemphasis configurable

gma500:
- Update i2c terminology

ivpu:
- Add MODULE_FIRMWARE()

lcdif:
- Fix pixel clock

loongson:
- Use GEM refcount over TTM's

mgag200:
- Improve BMC handling
- Support VBLANK intterupts

nouveau:
- Refactor and clean up internals
- Use GEM refcount over TTM's

panel:
- Shutdown fixes plus documentation
- Refactor several drivers for better code sharing
- boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus
  DT; Fix porch parameter
- edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1,
  BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
  CMN N116BCP-EA2, CSW MNB601LS1-4
- himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
- ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
- jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor
  for code sharing

sti:
- Fix module owner

stm:
- Avoid UAF wih managed plane and CRTC helpers
- Fix module owner
- Fix error handling in probe
- Depend on COMMON_CLK
- ltdc: Fix transparency after disabling plane; Remove unused interrupt

tegra:
- Call drm_atomic_helper_shutdown()

v3d:
- Clean up perfmon

vkms:
- Clean up

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240801121406.GA102996@linux.fritz.box
2024-08-08 18:58:46 +02:00
Arunpravin Paneer Selvam
6ad9dafba1 drm/amdgpu: Add DCC GFX12 flag to enable address alignment
We require this flag AMDGPU_GEM_CREATE_GFX12_DCC or any other
kernel level GFX12 DCC flag to differentiate the DCC buffers and other
pinned display buffers(which has TTM_PL_FLAG_CONTIGUOUS enabled).

If we use the TTM_PL_FLAG_CONTIGUOUS flag for DCC buffers, we may over
allocate for all the pinned display buffers unnecessarily that leads to
memory allocation failure.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 46142cc1b9)
2024-08-07 18:23:59 -04:00
Frank Min
7fc5f252c0 drm/amdgpu: correct sdma7 max dw
correct sdma7 max dw into 8

Signed-off-by: Frank Min <Frank.Min@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 86598c3819)
2024-08-07 18:23:49 -04:00
Arunpravin Paneer Selvam
4a5ad08f53 drm/amdgpu: Add address alignment support to DCC buffers
Add address alignment support to the DCC VRAM buffers.

v2:
  - adjust size based on the max_texture_channel_caches values
    only for GFX12 DCC buffers.
  - used AMDGPU_GEM_CREATE_GFX12_DCC flag to apply change only
    for DCC buffers.
  - roundup non power of two DCC buffer adjusted size to nearest
    power of two number as the buddy allocator does not support non
    power of two alignments. This applies only to the contiguous
    DCC buffers.

v3:(Alex)
  - rewrite the max texture channel caches comparison code in an
    algorithmic way to determine the alignment size.

v4:(Alex)
  - Move the logic from amdgpu_vram_mgr_dcc_alignment() to gmc_v12_0.c
    and add a new gmc func callback for dcc alignment. If the callback
    is non-NULL, call it to get the alignment, otherwise, use the default.

v5:(Alex)
  - Set the Alignment to a default value if the callback doesn't exist.
  - Add the callback to amdgpu_gmc_funcs.

v6:
  - Fix checkpatch warning reported by Intel CI.

v7:(Christian)
  - remove the AMDGPU_GEM_CREATE_GFX12_DCC flag and keep a flag that
    checks the BO pinning and for a specific hw generation.

v8:(Christian)
  - move this check into gmc_v12_0_get_dcc_alignment.

v9:
  - Fix 32bit build errors

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit aa94b623cb)
2024-08-07 18:23:42 -04:00
Frank Min
5d687a67fd drm/amdgpu: change non-dcc buffer copy configuration
Without setting cpv bit and 7th ib dw, non-dcc buffer copy will have
random corruption

So set the cpv bit and clear the 7th ib dw for copy non-dcc buffers

Signed-off-by: Frank Min <Frank.Min@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5aacf8917f)
2024-08-07 18:21:05 -04:00
Joshua Ashton
829798c789 drm/amdgpu: Forward soft recovery errors to userspace
As we discussed before[1], soft recovery should be
forwarded to userspace, or we can get into a really
bad state where apps will keep submitting hanging
command buffers cascading us to a hard reset.

1: https://lore.kernel.org/all/bf23d5ed-9a6b-43e7-84ee-8cbfd0d60f18@froggi.es/
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 434967aadb)
Cc: stable@vger.kernel.org
2024-08-07 18:20:07 -04:00
Likun Gao
8ff3bb44cc drm/amdgpu: add golden setting for gc v12
Adding Manual GDB golden setting for gc v12
revision 0 ASIC.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c9875d0a78)
2024-08-07 18:19:38 -04:00
Likun Gao
aa5c9701eb drm/amdgpu: force to use legacy inv in mmhub
MMHUB v4.1.0 only support fixed cache mode, so
only use legacy invalidation accordingly.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9192c7613c)
2024-08-07 18:16:38 -04:00
Arunpravin Paneer Selvam
46142cc1b9 drm/amdgpu: Add DCC GFX12 flag to enable address alignment
We require this flag AMDGPU_GEM_CREATE_GFX12_DCC or any other
kernel level GFX12 DCC flag to differentiate the DCC buffers and other
pinned display buffers(which has TTM_PL_FLAG_CONTIGUOUS enabled).

If we use the TTM_PL_FLAG_CONTIGUOUS flag for DCC buffers, we may over
allocate for all the pinned display buffers unnecessarily that leads to
memory allocation failure.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:03 -04:00
Tim Huang
92549780e3 drm/amdgpu: fix unchecked return value warning for amdgpu_atombios
This resolves the unchecded return value warning reported by Coverity.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:03 -04:00
Tim Huang
c0277b9d7c drm/amdgpu: fix unchecked return value warning for amdgpu_gfx
This resolves the unchecded return value warning reported by Coverity.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:03 -04:00
Frank Min
86598c3819 drm/amdgpu: correct sdma7 max dw
correct sdma7 max dw into 8

Signed-off-by: Frank Min <Frank.Min@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:03 -04:00
Arunpravin Paneer Selvam
aa94b623cb drm/amdgpu: Add address alignment support to DCC buffers
Add address alignment support to the DCC VRAM buffers.

v2:
  - adjust size based on the max_texture_channel_caches values
    only for GFX12 DCC buffers.
  - used AMDGPU_GEM_CREATE_GFX12_DCC flag to apply change only
    for DCC buffers.
  - roundup non power of two DCC buffer adjusted size to nearest
    power of two number as the buddy allocator does not support non
    power of two alignments. This applies only to the contiguous
    DCC buffers.

v3:(Alex)
  - rewrite the max texture channel caches comparison code in an
    algorithmic way to determine the alignment size.

v4:(Alex)
  - Move the logic from amdgpu_vram_mgr_dcc_alignment() to gmc_v12_0.c
    and add a new gmc func callback for dcc alignment. If the callback
    is non-NULL, call it to get the alignment, otherwise, use the default.

v5:(Alex)
  - Set the Alignment to a default value if the callback doesn't exist.
  - Add the callback to amdgpu_gmc_funcs.

v6:
  - Fix checkpatch warning reported by Intel CI.

v7:(Christian)
  - remove the AMDGPU_GEM_CREATE_GFX12_DCC flag and keep a flag that
    checks the BO pinning and for a specific hw generation.

v8:(Christian)
  - move this check into gmc_v12_0_get_dcc_alignment.

v9:
  - Fix 32bit build errors

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:03 -04:00
Frank Min
5aacf8917f drm/amdgpu: change non-dcc buffer copy configuration
Without setting cpv bit and 7th ib dw, non-dcc buffer copy will have
random corruption

So set the cpv bit and clear the 7th ib dw for copy non-dcc buffers

Signed-off-by: Frank Min <Frank.Min@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:02 -04:00
Tao Zhou
b3a3c9a6b2 drm/amdgpu: report bad status in GPU recovery
Instead of printing GPU reset failed.

v2: add check for reset_context->src.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:02 -04:00
Tao Zhou
dd3e296289 drm/amdgpu: update bad state check in GPU recovery
Return RMA status without message print.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:02 -04:00
Joshua Ashton
434967aadb drm/amdgpu: Forward soft recovery errors to userspace
As we discussed before[1], soft recovery should be
forwarded to userspace, or we can get into a really
bad state where apps will keep submitting hanging
command buffers cascading us to a hard reset.

1: https://lore.kernel.org/all/bf23d5ed-9a6b-43e7-84ee-8cbfd0d60f18@froggi.es/
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:01 -04:00
Yang Wang
671af06690 drm/amdgpu: remove RAS unused paramter 'err_addr'
- amdgpu_ras_error_statistic_ue_count()
- amdgpu_ras_error_statistic_ce_count()
- amdgpu_ras_error_statistic_de_count()

The parameter 'err_addr' is no longer used since following patch.

Fixes: a7e8467fbe ("drm/amdgpu: Remove unused code")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:01 -04:00
Likun Gao
c9875d0a78 drm/amdgpu: add golden setting for gc v12
Adding Manual GDB golden setting for gc v12
revision 0 ASIC.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:01 -04:00
Tao Zhou
792be2e23a drm/amdgpu: create function to check RAS RMA status
In the convenience of calling it globally.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:01 -04:00
Sunil Khatri
62341f7bc2 drm/amdgpu: optimize the padding for gfx_v9_4_3
Adding NOP packets one by one in the ring
does not use the CP efficiently.

Solution:
Use CP optimization while adding NOP packet's so PFP
can discard NOP packets based on information of count
from the Header instead of fetching all NOP packets
one by one.

Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: Tvrtko Ursulin <tursulin@igalia.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:00 -04:00
Sunil Khatri
dee44a7cb5 drm/amdgpu: optimize the padding for gfx9
Adding NOP packets one by one in the ring
does not use the CP efficiently.

Solution:
Use CP optimization while adding NOP packet's so PFP
can discard NOP packets based on information of count
from the Header instead of fetching all NOP packets
one by one.

Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: Tvrtko Ursulin <tursulin@igalia.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:11:00 -04:00
Tvrtko Ursulin
bb670c31e1 drm/amdpgu: Micro-optimise amdgpu_ring_commit
For some value of optimisation we can replace the division with an
bitwise and. And it even shrinks the code. Before:

     6c9:       53                      push   %rbx
     6ca:       4c 8b 47 08             mov    0x8(%rdi),%r8
     6ce:       31 d2                   xor    %edx,%edx
     6d0:       48 89 fb                mov    %rdi,%rbx
     6d3:       8b 87 c8 05 00 00       mov    0x5c8(%rdi),%eax
     6d9:       41 8b 48 04             mov    0x4(%r8),%ecx
     6dd:       f7 d0                   not    %eax
     6df:       21 c8                   and    %ecx,%eax
     6e1:       83 c1 01                add    $0x1,%ecx
     6e4:       83 c0 01                add    $0x1,%eax
     6e7:       f7 f1                   div    %ecx
     6e9:       89 d6                   mov    %edx,%esi
     6eb:       41 ff 90 88 00 00 00    call   *0x88(%r8)

After:

     6c9:       53                      push   %rbx
     6ca:       48 8b 57 08             mov    0x8(%rdi),%rdx
     6ce:       48 89 fb                mov    %rdi,%rbx
     6d1:       8b 87 c8 05 00 00       mov    0x5c8(%rdi),%eax
     6d7:       8b 72 04                mov    0x4(%rdx),%esi
     6da:       f7 d0                   not    %eax
     6dc:       21 f0                   and    %esi,%eax
     6de:       83 c0 01                add    $0x1,%eax
     6e1:       21 c6                   and    %eax,%esi
     6e3:       ff 92 88 00 00 00       call   *0x88(%rdx)

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:10:40 -04:00
Hawking Zhang
dfe9d047b1 drm/amdgpu: Add more types for boot time error reporting
Data abort exception and unknown errors are supported.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:10:17 -04:00
Likun Gao
9192c7613c drm/amdgpu: force to use legacy inv in mmhub
MMHUB v4.1.0 only support fixed cache mode, so
only use legacy invalidation accordingly.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 11:10:08 -04:00
Sunil Khatri
f59902ffcc drm/amdgpu: optimize the padding for gfx12
Adding NOP packets one by one in the ring
does not use the CP efficiently.

Solution:
Use CP optimization while adding NOP packet's so PFP
can discard NOP packets based on information of count
from the Header instead of fetching all NOP packets
one by one.

Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: Tvrtko Ursulin <tursulin@igalia.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 10:44:37 -04:00
Yifan Zhang
62eefd10ac drm/amdgpu: use CPU for page table update if SDMA is unavailable
avoid using SDMA if it is unavailable.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 10:43:57 -04:00
Sunil Khatri
847e387e00 drm/amdgpu: optimize the padding for gfx11
Adding NOP packets one by one in the ring
does not use the CP efficiently.

Solution:
Use CP optimization while adding NOP packet's so PFP
can discard NOP packets based on information of count
from the Header instead of fetching all NOP packets
one by one.

Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: Tvrtko Ursulin <tursulin@igalia.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 10:43:49 -04:00
Sunil Khatri
67c4ca9f79 drm/amdgpu: do not call insert_nop fn for zero count
Do not make a function call for zero size NOP as it
does not add anything in the ring and is unnecessary
function call.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 10:43:28 -04:00
Jonathan Kim
ee0a469cf9 drm/amdkfd: support per-queue reset on gfx9
Support per-queue reset for GFX9.  The recommendation is for the driver
to target reset the HW queue via a SPI MMIO register write.

Since this requires pipe and HW queue info and MEC FW is limited to
doorbell reports of hung queues after an unmap failure, scan the HW
queue slots defined by SET_RESOURCES first to identify the user queue
candidates to reset.

Only signal reset events to processes that have had a queue reset.

If queue reset fails, fall back to GPU reset.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 10:43:18 -04:00
Sunil Khatri
e89d2fec4c drm/amdgpu: optimize the padding for gfx10
Adding NOP packets one by one in the ring
does not use the CP efficiently.

Solution:
Use CP optimization while adding NOP packet's so PFP
can discard NOP packets based on information of count
from the Header instead of fetching all NOP packets
one by one.

Cc: Christian König <christian.koenig@amd.com>
Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: Tvrtko Ursulin <tursulin@igalia.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 10:43:01 -04:00
Sunil Khatri
836af5be1b drm/amdgpu: Clean up the register dump via debugfs list
debugfs register list for dump is cleaned as it have
some issues related to proper power state of the IP
before register read.

Since the above mentioned is removed we no longer want
this to be dumped part of the devcoredump and hence
removed.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 10:42:22 -04:00
Sunil Khatri
17277da266 drm/amdgpu: Remove debugfs amdgpu_reset_dump_register_list
There are some problem with existing amdgpu_reset_dump_register_list
debugfs node. It is supposed to read a list of registers but there
could be cases when the IP is not in correct power state. Register
read in such cases could lead to more problems.

We are taking care of all such power states in devcoredump and
dumping the registers of need for debugging. So cleaning this code
and we dont need this functionality via debugfs anymore.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06 10:42:09 -04:00
Hamza Mahfooz
717b432b6d
Revert "drm/amd: Add power_saving_policy drm property to eDP connectors"
This reverts commit 9d8c094dda.

It was merged without meeting userspace requirements.

Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240802145946.48073-2-hamza.mahfooz@amd.com
2024-08-02 11:29:17 -04:00
Thomas Zimmermann
0e8655b4e8 Merge drm/drm-next into drm-misc-next
Backmerging to get a late RC of v6.10 before moving into v6.11.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-29 09:35:54 +02:00
Michael Chen
9038e25c80 drm/amdgpu: increase mes log buffer size for gfx12
MES firmware requires larger log buffer for gfx12. Allocate
proper buffer respectively for gfx11 and gfx12.

Signed-off-by: Michael Chen <michael.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 739d0f3e1f)
2024-07-27 18:10:12 -04:00
Christian König
f3572db3c0 drm/amdgpu: fix contiguous handling for IB parsing v2
Otherwise we won't get correct access to the IB.

v2: keep setting AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS to avoid problems in
    the VRAM backend.

Signed-off-by: Christian König <christian.koenig@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3501
Fixes: e362b7c8f8 ("drm/amdgpu: Modify the contiguous flags behaviour")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Tested-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fbfb5f0342)
2024-07-27 18:09:38 -04:00
Thomas Weißschuh
aeb81b62c7 drm/amdgpu: convert bios_hardcoded_edid to drm_edid
Instead of manually passing around 'struct edid *' and its size,
use 'struct drm_edid', which encapsulates a validated combination of
both.

As the drm_edid_ can handle NULL gracefully, the explicit checks can be
dropped.

Also save a few characters by transforming '&array[0]' to the equivalent
'array' and using 'max_t(int, ...)' instead of manual casts.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27 17:35:05 -04:00
Sunil Khatri
13d8850a33 drm/amdgpu: trigger ip dump before suspend of IP's
Problem:
IP dump right now is done post suspend of all
IP's which for some IP's could change power
state and software state too which we do not want
to reflect in the dump as it might not be same at
the time of hang.

Solution:
IP should be dumped as close to the HW state when
the GPU was in hung state without trying to reinitialize
any resource.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27 17:34:26 -04:00
Michael Chen
739d0f3e1f drm/amdgpu: increase mes log buffer size for gfx12
MES firmware requires larger log buffer for gfx12. Allocate
proper buffer respectively for gfx11 and gfx12.

Signed-off-by: Michael Chen <michael.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27 17:32:05 -04:00
Sunil Khatri
076362d931 drm/amdgpu: print VCN instance dump for valid instance
VCN dump is dependent on power state of the ip. Dump is
valid if VCN was powered up at the time of ip dump.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27 17:31:10 -04:00
Venkata Narendra Kumar Gutta
25dd25f86e drm/amdgpu: Add MFD support for ISP I2C bus
ISP I2C bus device can't be enumerated via ACPI mechanism
since it shares the memory map with the AMDGPU.
So use the MFD mechanism for registering the ISP I2C device
and add the required resources.

Signed-off-by: Venkata Narendra Kumar Gutta <vengutta@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27 17:28:58 -04:00
Christian König
fbfb5f0342 drm/amdgpu: fix contiguous handling for IB parsing v2
Otherwise we won't get correct access to the IB.

v2: keep setting AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS to avoid problems in
    the VRAM backend.

Signed-off-by: Christian König <christian.koenig@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3501
Fixes: e362b7c8f8 ("drm/amdgpu: Modify the contiguous flags behaviour")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Tested-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27 17:28:47 -04:00
Sunil Khatri
cd162ae9bc drm/amdgpu: add print support for vcn_v3_0 ip dump
Add support for logging the registers in devcoredump
buffer for vcn_v3_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27 17:28:41 -04:00
Sunil Khatri
58d283801d drm/amdgpu: add vcn_v3_0 ip dump support
Add support of vcn ip dump in the devcoredump
for vcn_v3_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27 17:28:35 -04:00
Sunil Khatri
50d10d9271 drm/amdgpu: add macro to calculate offset with instance
Add macro definition which calculate offset of the
register with index override.

This is useful in case when there is an array of
registers which is common for all instances.
To read registers in that case it is easy to define
registers once and the index value is manually passed
to calculate proper offset of register for each instance.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27 17:28:28 -04:00
Sunil Khatri
f3392e662e drm/amdgpu: add vcn ip dump ptr in vcn global struct
Add pointer to the vcn ip dump in the vcn global structure
to be accessible for all vcn version via global adev.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-27 17:28:17 -04:00
Alex Deucher
8155566a26 drm/amdgpu: properly handle vbios fake edid sizing
The comment in the vbios structure says:
// = 128 means EDID length is 128 bytes, otherwise the EDID length = ucFakeEDIDLength*128

This fake edid struct has not been used in a long time, so I'm
not sure if there were actually any boards out there with a non-128 byte
EDID, but align the code with the comment.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Reported-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/109964.html
Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-25 17:41:46 -04:00
Christian König
83b501c179 drm/scheduler: remove full_recover from drm_sched_start
This was basically just another one of amdgpus hacks. The parameter
allowed to restart the scheduler without turning fence signaling on
again.

That this is absolutely not a good idea should be obvious by now since
the fences will then just sit there and never signal.

While at it cleanup the code a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240722083816.99685-1-christian.koenig@amd.com
2024-07-25 14:05:12 +02:00
ZhenGuo Yin
5659b0c93a drm/amdgpu: reset vm state machine after gpu reset(vram lost)
[Why]
Page table of compute VM in the VRAM will lost after gpu reset.
VRAM won't be restored since compute VM has no shadows.

[How]
Use higher 32-bit of vm->generation to record a vram_lost_counter.
Reset the VM state machine when vm->genertaion is not equal to
the new generation token.

v2: Check vm->generation instead of calling drm_sched_entity_error
in amdgpu_vm_validate.
v3: Use new generation token instead of vram_lost_counter for check.

Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
(cherry picked from commit 47c0388b05)
2024-07-24 17:30:49 -04:00
Tim Huang
fab1ead0ae drm/amdgpu: add missed harvest check for VCN IP v4/v5
To prevent below probe failure, add a check for models with VCN
IP v4.0.6 where VCN1 may be harvested.

v2:
Apply the same check to VCN IP v4.0 and v5.0.

[   54.070117] RIP: 0010:vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[   54.071055] Code: 80 fb ff 8d 82 00 80 fe ff 81 fe 00 06 00 00 0f 43
c2 49 69 d5 38 0d 00 00 48 8d 71 04 c1 e8 02 4c 01 f2 48 89 b2 50 f6 02
00 <89> 01 48 8b 82 50 f6 02 00 48 8d 48 04 48 89 8a 50 f6 02 00 c7 00
[   54.072408] RSP: 0018:ffffb17985f736f8 EFLAGS: 00010286
[   54.072793] RAX: 00000000000000d6 RBX: ffff99a82f680000 RCX:
0000000000000000
[   54.073315] RDX: ffff99a82f680000 RSI: 0000000000000004 RDI:
ffff99a82f680000
[   54.073835] RBP: ffffb17985f73730 R08: 0000000000000001 R09:
0000000000000000
[   54.074353] R10: 0000000000000008 R11: ffffb17983c05000 R12:
0000000000000000
[   54.074879] R13: 0000000000000000 R14: ffff99a82f680000 R15:
0000000000000001
[   54.075400] FS:  00007f8d9c79a000(0000) GS:ffff99ab2f140000(0000)
knlGS:0000000000000000
[   54.075988] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   54.076408] CR2: 0000000000000000 CR3: 0000000140c3a000 CR4:
0000000000750ef0
[   54.076927] PKRU: 55555554
[   54.077132] Call Trace:
[   54.077319]  <TASK>
[   54.077484]  ? show_regs+0x69/0x80
[   54.077747]  ? __die+0x28/0x70
[   54.077979]  ? page_fault_oops+0x180/0x4b0
[   54.078286]  ? do_user_addr_fault+0x2d2/0x680
[   54.078610]  ? exc_page_fault+0x84/0x190
[   54.078910]  ? asm_exc_page_fault+0x2b/0x30
[   54.079224]  ? vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[   54.079941]  ? vcn_v4_0_5_start_dpg_mode+0xe6/0x36b0 [amdgpu]
[   54.080617]  vcn_v4_0_5_set_powergating_state+0x82/0x19b0 [amdgpu]
[   54.081316]  amdgpu_device_ip_set_powergating_state+0x64/0xc0
[amdgpu]
[   54.082057]  amdgpu_vcn_ring_begin_use+0x6f/0x1d0 [amdgpu]
[   54.082727]  amdgpu_ring_alloc+0x44/0x70 [amdgpu]
[   54.083351]  amdgpu_vcn_dec_sw_ring_test_ring+0x40/0x110 [amdgpu]
[   54.084054]  amdgpu_ring_test_helper+0x22/0x90 [amdgpu]
[   54.084698]  vcn_v4_0_5_hw_init+0x87/0xc0 [amdgpu]
[   54.085307]  amdgpu_device_init+0x1f96/0x2780 [amdgpu]
[   54.085951]  amdgpu_driver_load_kms+0x1e/0xc0 [amdgpu]
[   54.086591]  amdgpu_pci_probe+0x19f/0x550 [amdgpu]
[   54.087215]  local_pci_probe+0x48/0xa0
[   54.087509]  pci_device_probe+0xc9/0x250
[   54.087812]  really_probe+0x1a4/0x3f0
[   54.088101]  __driver_probe_device+0x7d/0x170
[   54.088443]  driver_probe_device+0x24/0xa0
[   54.088765]  __driver_attach+0xdd/0x1d0
[   54.089068]  ? __pfx___driver_attach+0x10/0x10
[   54.089417]  bus_for_each_dev+0x8e/0xe0
[   54.089718]  driver_attach+0x22/0x30
[   54.090000]  bus_add_driver+0x120/0x220
[   54.090303]  driver_register+0x62/0x120
[   54.090606]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
[   54.091255]  __pci_register_driver+0x62/0x70
[   54.091593]  amdgpu_init+0x67/0xff0 [amdgpu]
[   54.092190]  do_one_initcall+0x5f/0x330
[   54.092495]  do_init_module+0x68/0x240
[   54.092794]  load_module+0x201c/0x2110
[   54.093093]  init_module_from_file+0x97/0xd0
[   54.093428]  ? init_module_from_file+0x97/0xd0
[   54.093777]  idempotent_init_module+0x11c/0x2a0
[   54.094134]  __x64_sys_finit_module+0x64/0xc0
[   54.094476]  do_syscall_64+0x58/0x120
[   54.094767]  entry_SYSCALL_64_after_hwframe+0x6e/0x76

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
(cherry picked from commit 0b071245dd)
2024-07-24 17:30:23 -04:00
Stanley.Yang
1a8825259a drm/amdgpu: Fix eeprom max record count
The eeprom table is empty before initializing,
set eeprom table version first before initializing.

Changed from V1:
	Reuse amdgpu_ras_set_eeprom_table_version function

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 015b8a2fdf)
2024-07-24 17:30:23 -04:00
YiPeng Chai
afac8c6554 drm/amdgpu: fix ras UE error injection failure issue
The ras command shared memory is allocated from
VRAM and the response status of the command
buffer will not be zero due to gpu being in
fatal error state after ras UE error injection.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8284951a6e)
2024-07-24 17:30:23 -04:00
Jane Jian
6728f55590 drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF
For VCN/JPEG 4.0.3, use only the local addressing scheme.

- Mask bit higher than AID0 range

v2
remain the case for mmhub use master XCC

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit caaf576292)
2024-07-24 17:30:23 -04:00
Lijo Lazar
485432d090 drm/amdgpu: Add empty HDP flush function to VCN v4.0.3
VCN 4.0.3 does not HDP flush with RRMT enabled. Instead, mmsch
will do the HDP flush.

This change is necessary for VCN v4.0.3, no need for backward compatibility

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 49cfaebe48)
2024-07-24 17:30:23 -04:00
Lijo Lazar
23df34997d drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3
JPEG v4.0.3 doesn't support HDP flush when RRMT is enabled. Instead,
mmsch fw will do the flush.

This change is necessary for JPEG v4.0.3, no need for backward compatibility

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 585e3fdb36)
2024-07-24 17:30:23 -04:00
Ma Ke
df65aabef3 drm/amd/amdgpu: Fix uninitialized variable warnings
Return 0 to avoid returning an uninitialized variable r.

Cc: stable@vger.kernel.org
Fixes: 230dd6bb61 ("drm/amd/amdgpu: implement mode2 reset on smu_v13_0_10")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6472de66c0)
2024-07-24 17:30:23 -04:00
David Belanger
73048bda46 drm/amdgpu: Fix atomics on GFX12
If PCIe supports atomics, configure register to prevent DF from
breaking atomics in separate load/store operations.

Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 666f14cab2)
2024-07-24 17:30:23 -04:00
Alex Deucher
a03ebf1163 drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell
We seem to have a case where SDMA will sometimes miss a doorbell
if GFX is entering the powergating state when the doorbell comes in.
To workaround this, we can update the wptr via MMIO, however,
this is only safe because we disallow gfxoff in begin_ring() for
SDMA 5.2 and then allow it again in end_ring().

Enable this workaround while we are root causing the issue with
the HW team.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440
Tested-by: Friedrich Vock <friedrich.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
(cherry picked from commit f2ac526349)
2024-07-24 17:29:57 -04:00
ZhenGuo Yin
47c0388b05 drm/amdgpu: reset vm state machine after gpu reset(vram lost)
[Why]
Page table of compute VM in the VRAM will lost after gpu reset.
VRAM won't be restored since compute VM has no shadows.

[How]
Use higher 32-bit of vm->generation to record a vram_lost_counter.
Reset the VM state machine when vm->genertaion is not equal to
the new generation token.

v2: Check vm->generation instead of calling drm_sched_entity_error
in amdgpu_vm_validate.
v3: Use new generation token instead of vram_lost_counter for check.

Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-24 14:45:24 -04:00
Tim Huang
0b071245dd drm/amdgpu: add missed harvest check for VCN IP v4/v5
To prevent below probe failure, add a check for models with VCN
IP v4.0.6 where VCN1 may be harvested.

v2:
Apply the same check to VCN IP v4.0 and v5.0.

[   54.070117] RIP: 0010:vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[   54.071055] Code: 80 fb ff 8d 82 00 80 fe ff 81 fe 00 06 00 00 0f 43
c2 49 69 d5 38 0d 00 00 48 8d 71 04 c1 e8 02 4c 01 f2 48 89 b2 50 f6 02
00 <89> 01 48 8b 82 50 f6 02 00 48 8d 48 04 48 89 8a 50 f6 02 00 c7 00
[   54.072408] RSP: 0018:ffffb17985f736f8 EFLAGS: 00010286
[   54.072793] RAX: 00000000000000d6 RBX: ffff99a82f680000 RCX:
0000000000000000
[   54.073315] RDX: ffff99a82f680000 RSI: 0000000000000004 RDI:
ffff99a82f680000
[   54.073835] RBP: ffffb17985f73730 R08: 0000000000000001 R09:
0000000000000000
[   54.074353] R10: 0000000000000008 R11: ffffb17983c05000 R12:
0000000000000000
[   54.074879] R13: 0000000000000000 R14: ffff99a82f680000 R15:
0000000000000001
[   54.075400] FS:  00007f8d9c79a000(0000) GS:ffff99ab2f140000(0000)
knlGS:0000000000000000
[   54.075988] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   54.076408] CR2: 0000000000000000 CR3: 0000000140c3a000 CR4:
0000000000750ef0
[   54.076927] PKRU: 55555554
[   54.077132] Call Trace:
[   54.077319]  <TASK>
[   54.077484]  ? show_regs+0x69/0x80
[   54.077747]  ? __die+0x28/0x70
[   54.077979]  ? page_fault_oops+0x180/0x4b0
[   54.078286]  ? do_user_addr_fault+0x2d2/0x680
[   54.078610]  ? exc_page_fault+0x84/0x190
[   54.078910]  ? asm_exc_page_fault+0x2b/0x30
[   54.079224]  ? vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[   54.079941]  ? vcn_v4_0_5_start_dpg_mode+0xe6/0x36b0 [amdgpu]
[   54.080617]  vcn_v4_0_5_set_powergating_state+0x82/0x19b0 [amdgpu]
[   54.081316]  amdgpu_device_ip_set_powergating_state+0x64/0xc0
[amdgpu]
[   54.082057]  amdgpu_vcn_ring_begin_use+0x6f/0x1d0 [amdgpu]
[   54.082727]  amdgpu_ring_alloc+0x44/0x70 [amdgpu]
[   54.083351]  amdgpu_vcn_dec_sw_ring_test_ring+0x40/0x110 [amdgpu]
[   54.084054]  amdgpu_ring_test_helper+0x22/0x90 [amdgpu]
[   54.084698]  vcn_v4_0_5_hw_init+0x87/0xc0 [amdgpu]
[   54.085307]  amdgpu_device_init+0x1f96/0x2780 [amdgpu]
[   54.085951]  amdgpu_driver_load_kms+0x1e/0xc0 [amdgpu]
[   54.086591]  amdgpu_pci_probe+0x19f/0x550 [amdgpu]
[   54.087215]  local_pci_probe+0x48/0xa0
[   54.087509]  pci_device_probe+0xc9/0x250
[   54.087812]  really_probe+0x1a4/0x3f0
[   54.088101]  __driver_probe_device+0x7d/0x170
[   54.088443]  driver_probe_device+0x24/0xa0
[   54.088765]  __driver_attach+0xdd/0x1d0
[   54.089068]  ? __pfx___driver_attach+0x10/0x10
[   54.089417]  bus_for_each_dev+0x8e/0xe0
[   54.089718]  driver_attach+0x22/0x30
[   54.090000]  bus_add_driver+0x120/0x220
[   54.090303]  driver_register+0x62/0x120
[   54.090606]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
[   54.091255]  __pci_register_driver+0x62/0x70
[   54.091593]  amdgpu_init+0x67/0xff0 [amdgpu]
[   54.092190]  do_one_initcall+0x5f/0x330
[   54.092495]  do_init_module+0x68/0x240
[   54.092794]  load_module+0x201c/0x2110
[   54.093093]  init_module_from_file+0x97/0xd0
[   54.093428]  ? init_module_from_file+0x97/0xd0
[   54.093777]  idempotent_init_module+0x11c/0x2a0
[   54.094134]  __x64_sys_finit_module+0x64/0xc0
[   54.094476]  do_syscall_64+0x58/0x120
[   54.094767]  entry_SYSCALL_64_after_hwframe+0x6e/0x76

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-24 14:44:05 -04:00
Yifan Zhang
3b37e2725a drm/amdgpu: skip kfd init if GFX is not ready.
avoid kfd init crash in that case.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-24 14:43:58 -04:00
Alex Deucher
bd4bea5ab2 drm/amdgpu/gfx9.4.3: Enable bad opcode interrupt
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:46 -04:00
Alex Deucher
238352b494 drm/amdgpu/gfx9: Enable bad opcode interrupt
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:46 -04:00
Jesse Zhang
5ebca62eb8 drm/amdgpu/gfx12: Enable bad opcode interrupt
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.

v2: update irq naming (drop priv) (Alex)

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:46 -04:00
Jesse Zhang
bc6c2a6f64 drm/amdgpu/gfx10: Enable bad opcode interrupt
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.

v2: update irq naming (drop priv) (Alex)

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
Jesse Zhang
a790902237 drm/amdgpu/gfx11: Enable bad opcode interrupt
For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.

v2: update irq naming (drop priv) (Alex)

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
Alex Deucher
acddd5cf70 drm/amdgpu/gfx: add bad opcode interrupt
Add the irq source for bad opcodes.

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
Alex Deucher
48695573d2 drm/amdgpu/gfx9: properly handle error ints on all pipes
Need to handle the interrupt enables for all pipes.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
Alex Deucher
3987932176 drm/amdgpu/gfx12: properly handle error ints on all pipes
Need to handle the interrupt enables for all pipes.

v2: fix indexing (Jessie)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
Alex Deucher
2662b7d9d8 drm/amdgpu/gfx11: properly handle error ints on all pipes
Need to handle the interrupt enables for all pipes.

v2: fix indexing (Jessie)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
Alex Deucher
4b95cec689 drm/amdgpu/gfx10: properly handle error ints on all pipes
Need to handle the interrupt enables for all pipes.

v2: fix indexing (Jessie)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
Alex Deucher
af4808ac40 drm/amdgpu/gfx12: enable wave kill for compute queues
It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
Alex Deucher
f53f526f70 drm/amdgpu/gfx11: enable wave kill for compute queues
It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
Alex Deucher
a2737c404c drm/amdgpu/gfx10: enable wave kill for compute queues
It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
Stanley.Yang
015b8a2fdf drm/amdgpu: Fix eeprom max record count
The eeprom table is empty before initializing,
set eeprom table version first before initializing.

Changed from V1:
	Reuse amdgpu_ras_set_eeprom_table_version function

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:45:45 -04:00
YiPeng Chai
8284951a6e drm/amdgpu: fix ras UE error injection failure issue
The ras command shared memory is allocated from
VRAM and the response status of the command
buffer will not be zero due to gpu being in
fatal error state after ras UE error injection.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:43:06 -04:00
Philip Yang
834368eab3 drm/amdkfd: Ensure user queue buffers residency
Add atomic queue_refcount to struct bo_va, return -EBUSY to fail unmap
BO from the GPU if the bo_va queue_refcount is not zero.

Create queue to increase the bo_va queue_refcount, destroy queue to
decrease the bo_va queue_refcount, to ensure the queue buffers mapped on
the GPU when queue is active.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:42:54 -04:00
Alex Deucher
22a9d5cbf8 drm/amdgpu/gfx9.4.3: implement wave kill for compute queues
Based on gfx9.0 implementation.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:42:45 -04:00
Sunil Khatri
eac3b274aa drm/amdgpu: add print support for sdma_v_4_4_2 ip_dump
Add print support for ip dump for sdma_v_4_4_2 in
devcoredump.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:42:37 -04:00
Alex Deucher
9c7e69d2e1 drm/amdgpu/gfx9: enable wave kill for compute queues
It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:42:11 -04:00
Alex Deucher
7e60ecc2b7 drm/amdgpu/gfx8: enable wave kill for compute queues
It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:42:08 -04:00
Alex Deucher
12fb3e9c88 drm/amdgpu/gfx7: enable wave kill for compute queues
It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:42:03 -04:00
Philip Yang
fb91065851 drm/amdkfd: Refactor queue wptr_bo GART mapping
Add helper function kfd_queue_acquire_buffers to get queue wptr_bo
reference from queue write_ptr if it is mapped to the KFD node with
expected size.

Add wptr_bo to structure queue_properties because structure queue is
allocated after queue buffers are validated, then we can remove wptr_bo
parameter from pqm_create_queue.

Rename structure queue wptr_bo_gart to hold wptr_bo reference for GART
mapping and umapping. Move MES wptr_bo_gart mapping to init_user_queue,
the same location with queue ctx_bo GART mapping.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:35:04 -04:00
Sunil Khatri
db54a725d5 drm/amdgpu: Add sdma_v4_4_2 ip dump for devcoredump
Add ip dump for sdma_v4_4_2 for devcoredump for all
instances of sdma.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:34:58 -04:00
Sunil Khatri
a11b36ba9c drm/amdgpu: add print support for sdma_v_4_0 ip_dump
Add print support for ip dump for sdma_v_4_0 in
devcoredump.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:34:49 -04:00
Philip Yang
c86ad39140 drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer
Pass pointer reference to amdgpu_bo_unref to clear the correct pointer,
otherwise amdgpu_bo_unref clear the local variable, the original pointer
not set to NULL, this could cause use-after-free bug.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:34:44 -04:00
Philip Yang
f9e292cbba drm/amdkfd: kfd_bo_mapped_dev support partition
Change amdgpu_amdkfd_bo_mapped_to_dev to use drm_priv as parameter
instead of adev, to support spatial partition. This is only used by CRIU
checkpoint restore now. No functional change.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:34:36 -04:00
Jane Jian
caaf576292 drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF
For VCN/JPEG 4.0.3, use only the local addressing scheme.

- Mask bit higher than AID0 range

v2
remain the case for mmhub use master XCC

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:34:28 -04:00
Lijo Lazar
49cfaebe48 drm/amdgpu: Add empty HDP flush function to VCN v4.0.3
VCN 4.0.3 does not HDP flush with RRMT enabled. Instead, mmsch
will do the HDP flush.

This change is necessary for VCN v4.0.3, no need for backward compatibility

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:34:18 -04:00
Lijo Lazar
585e3fdb36 drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3
JPEG v4.0.3 doesn't support HDP flush when RRMT is enabled. Instead,
mmsch fw will do the flush.

This change is necessary for JPEG v4.0.3, no need for backward compatibility

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:34:12 -04:00
Pierre-Eric Pelloux-Prayer
fec5f8e8c6 drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit
Before this commit, only submits with both a BO_HANDLES chunk and a
'bo_list_handle' would be rejected (by amdgpu_cs_parser_bos).

But if UMD sent multiple BO_HANDLES, what would happen is:
* only the last one would be really used
* all the others would leak memory as amdgpu_cs_p1_bo_handles would
  overwrite the previous p->bo_list value

This commit rejects submissions with multiple BO_HANDLES chunks to
match the implementation of the parser.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:33:57 -04:00
Sunil Khatri
80237bfc03 drm/amdgpu: Add sdma_v4_0 ip dump for devcoredump
Add ip dump for sdma_v4_0 for devcoredump for all
instances of sdma.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:33:52 -04:00
Sunil Khatri
abf839f5eb drm/amdgpu: add print support for sdma_v_7_0 ip_dump
Add print support for ip dump for sdma_v_7_0 in
devcoredump.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:33:45 -04:00
Ma Ke
6472de66c0 drm/amd/amdgpu: Fix uninitialized variable warnings
Return 0 to avoid returning an uninitialized variable r.

Cc: stable@vger.kernel.org
Fixes: 230dd6bb61 ("drm/amd/amdgpu: implement mode2 reset on smu_v13_0_10")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:33:38 -04:00
Ma Ke
93381e6b61 drm/amdgpu: fix a possible null pointer dereference
In amdgpu_connector_add_common_modes(), the return value of drm_cvt_mode()
is assigned to mode, which will lead to a NULL pointer dereference on
failure of drm_cvt_mode(). Add a check to avoid npd.

Cc: stable@vger.kernel.org
Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:33:25 -04:00
David Belanger
666f14cab2 drm/amdgpu: Fix atomics on GFX12
If PCIe supports atomics, configure register to prevent DF from
breaking atomics in separate load/store operations.

Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:33:17 -04:00
Sunil Khatri
4df9e2200f drm/amdgpu: Add sdma_v7_0 ip dump for devcoredump
Add ip dump for sdma_v7_0 for devcoredump for all
instances of sdma.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23 17:33:10 -04:00