2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

95 Commits

Author SHA1 Message Date
Ruijing Dong
11eb648d01 drm/amdgpu/vcn: Add vcn firmware log
vcn fwlog is for debugging purpose only,
by default, it is disabled.

Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-04 13:03:30 -05:00
Tom Rix
ca6fcfa8d4 drm/amdgpu: Fix realloc of ptr
Clang static analysis reports this error
amdgpu_debugfs.c:1690:9: warning: 1st function call
  argument is an uninitialized value
  tmp = krealloc_array(tmp, i + 1,
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~

realloc uses tmp, so tmp can not be garbage.
And the return needs to be checked.

Fixes: 5ce5a584cb ("drm/amdgpu: add debugfs for reset registers list")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-02 18:40:05 -05:00
Dave Airlie
38a15ad948 Merge tag 'amd-drm-next-5.18-2022-02-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.18-2022-02-25:

amdgpu:
- Raven2 suspend/resume fix
- SDMA 5.2.6 updates
- VCN 3.1.2 updates
- SMU 13.0.5 updates
- DCN 3.1.5 updates
- Virtual display fixes
- SMU code cleanup
- Harvest fixes
- Expose benchmark tests via debugfs
- Drop no longer relevant gart aperture tests
- More RAS restructuring
- W=1 fixes
- PSR rework
- DP/VGA adapter fixes
- DP MST fixes
- GPUVM eviction fix
- GPU reset debugfs register dumping support
- Misc display fixes
- SR-IOV fix
- Aldebaran mGPU fix
- Add module parameter to disable XGMI for testing

amdkfd:
- IH ring overflow logging fixes
- CRIU fixes
- Misc fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225183535.5907-1-alexander.deucher@amd.com
2022-03-01 16:19:02 +10:00
Dave Airlie
54f43c17d6 drm-misc-next for v5.18:
UAPI Changes:
 
 Cross-subsystem Changes:
 - Split out panel-lvds and lvds dt bindings .
 - Put yes/no on/off disabled/enabled strings in linux/string_helpers.h
   and use it in drivers and tomoyo.
 - Clarify dma_fence_chain and dma_fence_array should never include eachother.
 - Flatten chains in syncobj's.
 - Don't double add in fbdev/defio when page is already enlisted.
 - Don't sort deferred-I/O pages by default in fbdev.
 
 Core Changes:
 - Fix missing pm_runtime_put_sync in bridge.
 - Set modifier support to only linear fb modifier if drivers don't
   advertise support.
 - As a result, we remove allow_fb_modifiers.
 - Add missing clear for EDID Deep Color Modes in drm_reset_display_info.
 - Assorted documentation updates.
 - Warn once in drm_clflush if there is no arch support.
 - Add missing select for dp helper in drm_panel_edp.
 - Assorted small fixes.
 - Improve fb-helper's clipping handling.
 - Don't dump shmem mmaps in a core dump.
 - Add accounting to ttm resource manager, and use it in amdgpu.
 - Allow querying the detected eDP panel through debugfs.
 - Add helpers for xrgb8888 to 8 and 1 bits gray.
 - Improve drm's buddy allocator.
 - Add selftests for the buddy allocator.
 
 Driver Changes:
 - Add support for nomodeset to a lot of drm drivers.
 - Use drm_module_*_driver in a lot of drm drivers.
 - Assorted small fixes to bridge/lt9611, v3d, vc4, vmwgfx, mxsfb, nouveau,
   bridge/dw-hdmi, panfrost, lima, ingenic, sprd, bridge/anx7625, ti-sn65dsi86.
 - Add bridge/it6505.
 - Create DP and DVI-I connectors in ast.
 - Assorted nouveau backlight fixes.
 - Rework amdgpu reset handling.
 - Add dt bindings for ingenic,jz4780-dw-hdmi.
 - Support reading edid through aux channel in ingenic.
 - Add a drm driver for Solomon SSD130x OLED displays.
 - Add simple support for sharp LQ140M1JW46.
 - Add more panels to nt35560.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmIWLSEACgkQ/lWMcqZw
 E8OP7hAAjix94EX5fhFa7OAdqUbFtsiKhK/4zNtV9FWpFiEsDBz+dlbfDQWIx5an
 FIiiiQtSfWjpDv6pcMhoNf80w+dDbc/Cuauz6nNGO7Pkaerh2D/EPG74FD7f7nE3
 EIScVs1heYtzM9usKrFKupNYgIdgZxmeovClWuE0OTjLOes2PGvvxXK6iJqNqJMX
 VlDO5SR7GRqsDUDV6vmwl63uKL77xJXAahAXIx+BQ/1xrtEhlu6NwsgHIsmPmMSN
 YluX34zc1xD/6/uUqvEdp7u46/5/He1c5Q/ia1WV3wRxsO/eMZ+axXqCZP3XGZdt
 rMdGNtj1MWKkudYiowStWkCVSG/0fXJCFIAhvRmeZy+YqPdVlqZ2W7g4H1l9iJoo
 UVfT9cHrKoxHsukvIEckC5Ov9v1yr39Bd4wUuqaUTUSxY8VID5vjY63TsXl9Zke1
 SluTFe9qybbnRNz/hYRvwIS1eT8HvUauAfAhypGTLI5DYHTD7PawcfMJkNzCtJm4
 Ta4SC3rTpkpN+7oc8SoNgqRHQ8U9KL5oksP0wVa8vwHsMptSd3X4pUljc6TcfjLv
 GEo41D5AuJz3HRVcn9yqPbLoPE2FFB7bfwIMH77yNnoos4Izy/LGhKpN0YdImmI5
 W5XVFB0jltGSIhkzLe1mFpLrdJwdUTSUVeCK4H5PhZZQEHLkVtg=
 =HuwD
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2022-02-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.18:

UAPI Changes:

Cross-subsystem Changes:
- Split out panel-lvds and lvds dt bindings .
- Put yes/no on/off disabled/enabled strings in linux/string_helpers.h
  and use it in drivers and tomoyo.
- Clarify dma_fence_chain and dma_fence_array should never include eachother.
- Flatten chains in syncobj's.
- Don't double add in fbdev/defio when page is already enlisted.
- Don't sort deferred-I/O pages by default in fbdev.

Core Changes:
- Fix missing pm_runtime_put_sync in bridge.
- Set modifier support to only linear fb modifier if drivers don't
  advertise support.
- As a result, we remove allow_fb_modifiers.
- Add missing clear for EDID Deep Color Modes in drm_reset_display_info.
- Assorted documentation updates.
- Warn once in drm_clflush if there is no arch support.
- Add missing select for dp helper in drm_panel_edp.
- Assorted small fixes.
- Improve fb-helper's clipping handling.
- Don't dump shmem mmaps in a core dump.
- Add accounting to ttm resource manager, and use it in amdgpu.
- Allow querying the detected eDP panel through debugfs.
- Add helpers for xrgb8888 to 8 and 1 bits gray.
- Improve drm's buddy allocator.
- Add selftests for the buddy allocator.

Driver Changes:
- Add support for nomodeset to a lot of drm drivers.
- Use drm_module_*_driver in a lot of drm drivers.
- Assorted small fixes to bridge/lt9611, v3d, vc4, vmwgfx, mxsfb, nouveau,
  bridge/dw-hdmi, panfrost, lima, ingenic, sprd, bridge/anx7625, ti-sn65dsi86.
- Add bridge/it6505.
- Create DP and DVI-I connectors in ast.
- Assorted nouveau backlight fixes.
- Rework amdgpu reset handling.
- Add dt bindings for ingenic,jz4780-dw-hdmi.
- Support reading edid through aux channel in ingenic.
- Add a drm driver for Solomon SSD130x OLED displays.
- Add simple support for sharp LQ140M1JW46.
- Add more panels to nt35560.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/686ec871-e77f-c230-22e5-9e3bb80f064a@linux.intel.com
2022-02-25 05:50:18 +10:00
Somalapuram Amaranath
5ce5a584cb drm/amdgpu: add debugfs for reset registers list
List of register populated for dump collection during the GPU reset.

Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-23 14:26:36 -05:00
Alex Deucher
e7c4723103 drm/amdgpu: expose benchmarks via debugfs
They provide a nice smoke test of transfer performance
using SDMA.  Allow the user to run these at runtime
rather than only at init time.

v2: fix permissions (Alex)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-23 14:02:50 -05:00
Tom St Denis
8f74f68d90 drm/amd/amdgpu: Add APU flag to gca_config debugfs data (v3)
Needed by umr to detect if ip discovered ASIC is an APU or not.

(v2): Remove asic type from packet it's not strictly needed
(v3): Correct comment

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
Andrey Grodzovsky
d0fb18b535 drm/amdgpu: Move reset sem into reset_domain
We want single instance of reset sem across all
reset clients because in case of XGMI we should stop
access cross device MMIO because any of them could be
in a reset in the moment.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74117.html
2022-02-09 12:17:32 -05:00
Yongzhi Liu
4bd8dd0d61 drm/amdgpu: Add missing pm_runtime_put_autosuspend
pm_runtime_get_sync() increments the runtime PM usage counter even
when it returns an error code, thus a matching decrement is needed
on the error handling path to keep the counter balanced.

Signed-off-by: Yongzhi Liu <lyz_cs@pku.edu.cn>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-18 17:43:36 -05:00
Evan Quan
bc143d8b83 drm/amd/pm: do not expose implementation details to other blocks out of power
Those implementation details(whether swsmu supported, some ppt_funcs supported,
accessing internal statistics ...)should be kept internally. It's not a good
practice and even error prone to expose implementation details.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:51:14 -05:00
Evan Quan
7e31a8585b drm/amdgpu: move smu_debug_mask to a more proper place
As the smu_context will be invisible from outside(of power). Also,
the smu_debug_mask can be shared around all power code instead of
some specific framework(swSMU) only.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 16:09:11 -05:00
Lang Yu
6ff7fddbd1 drm/amdgpu: add support for SMU debug option
SMU firmware expects the driver maintains error context
and doesn't interact with SMU any more when SMU errors
occurred. That will aid in debugging SMU firmware issues.

Add SMU debug option support for this request, it can be
enabled or disabled via amdgpu_smu_debug debugfs file.
Use a 32-bit mask to indicate corresponding debug modes.
Currently, only one mode(HALT_ON_ERROR) is supported.
When enabled, it brings hardware to a kind of halt state
so that no one can touch it any more in the envent of SMU
errors.

The dirver interacts with SMU via sending messages. And
threre are three ways to sending messages to SMU in current
implementation. Handle them respectively as following:

1, smu_cmn_send_smc_msg_with_param() for normal timeout cases

  Halt on any error.

2, smu_cmn_send_msg_without_waiting()/smu_cmn_wait_for_response()
for longer timeout cases

  Halt on errors apart from ETIME. Otherwise this way won't work.
  Let the user handle ETIME error in such a case.

3, smu_cmn_send_msg_without_waiting() for no waiting cases

  Halt on errors apart from ETIME. Otherwise second way won't work.

== Command Guide ==

1, enable HALT_ON_ERROR mode

 # echo 0x1 > /sys/kernel/debug/dri/0/amdgpu_smu_debug

2, disable HALT_ON_ERROR mode

 # echo 0x0 > /sys/kernel/debug/dri/0/amdgpu_smu_debug

v5:
 - Use bit mask to allow more debug features.(Evan)
 - Use WRAN() instead of BUG().(Evan)

v4:
 - Set to halt state instead of a simple hang.(Christian)

v3:
 - Use debugfs_create_bool().(Christian)
 - Put variable into smu_context struct.
 - Don't resend command when timeout.

v2:
 - Resend command when timeout.(Lijo)
 - Use debugfs file instead of module parameter.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Nirmoy Das
58144d2837 drm/amdgpu: unify BO evicting method in amdgpu_ttm
Unify BO evicting functionality for possible memory
types in amdgpu_ttm.c.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-07 11:55:46 -04:00
Nirmoy Das
5b9581df9f drm/amdgpu: return early if debugfs is not initialized
Check first if debugfs is initialized before creating
amdgpu debugfs files.

References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-06 15:53:14 -04:00
Christian König
c8365dbda0 drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"
This reverts commit 728e7e0cd6.

Further discussion reveals that this feature is severely broken
and needs to be reverted ASAP.

GPU reset can never be delayed by userspace even for debugging or
otherwise we can run into in kernel deadlocks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:22:36 -04:00
Alex Deucher
81d1bf01e4 drm/amdgpu: add debugfs access to the IP discovery table
Useful for debugging and new asic validation.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
xinhui pan
0fcfb30019 drm/amdgpu: Fix a race of IB test
Direct IB submission should be exclusive. So use write lock.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
Nirmoy Das
62d266b2bd drm/amdgpu: cleanup debugfs for amdgpu rings
Use debugfs_create_file_size API for creating ring debugfs, and as its a
NULL returning API, change the return type for amdgpu_debugfs_ring_init
API as well. Also cleanup surrounding code.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:35:59 -04:00
Nirmoy Das
59715cffce drm/amdgpu: use IS_ERR for debugfs APIs
debugfs APIs returns encoded error so use
IS_ERR for checking return value.

v2: return PTR_ERR(ent)

References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-By: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:35:44 -04:00
Tom St Denis
37df9560cd drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)
This new debugfs interface uses an IOCTL interface in order to pass
along state information like SRBM and GRBM bank switching.  This
new interface also allows a full 32-bit MMIO address range which
the previous didn't.  With this new design we have room to grow
the flexibility of the file as need be.

(v2): Move read/write to .read/.write, fix style, add comment
      for IOCTL data structure

(v3): C style comments

(v4): use u32 in struct and remove offset variable

(v5): Drop flag clearing in op function, use 0xFFFFFFFF for broadcast
      instead of 0x3FF, use mutex for op/ioctl.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Jack Zhang
c530b02f39 drm/amd/amdgpu embed hw_fence into amdgpu_job
Why: Previously hw fence is alloced separately with job.
It caused historical lifetime issues and corner cases.
The ideal situation is to take fence to manage both job
and fence's lifetime, and simplify the design of gpu-scheduler.

How:
We propose to embed hw_fence into amdgpu_job.
1. We cover the normal job submission by this method.
2. For ib_test, and submit without a parent job keep the
legacy way to create a hw fence separately.
v2:
use AMDGPU_FENCE_FLAG_EMBED_IN_JOB_BIT to show that the fence is
embedded in a job.
v3:
remove redundant variable ring in amdgpu_job
v4:
add tdr sequence support for this feature. Add a job_run_counter to
indicate whether this job is a resubmit job.
v5
add missing handling in amdgpu_fence_enable_signaling

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Signed-off-by: Jack Zhang <Jack.Zhang7@hotmail.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:16:58 -04:00
Nirmoy Das
391629bdfc drm/amdgpu: remove amdgpu_vm_pt
Page table entries are now in embedded in VM BO, so
we do not need struct amdgpu_vm_pt. This patch replaces
struct amdgpu_vm_pt with struct amdgpu_vm_bo_base.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-15 17:25:42 -04:00
Lee Jones
e72d4a8b08 drm/amd/amdgpu/amdgpu_debugfs: Fix a couple of misnamed functions
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1004: warning: expecting prototype for amdgpu_debugfs_regs_gfxoff_write(). Prototype was for amdgpu_debugfs_gfxoff_write() instead
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1053: warning: expecting prototype for amdgpu_debugfs_regs_gfxoff_status(). Prototype was for amdgpu_debugfs_gfxoff_read() instead

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:32:13 -04:00
Kevin Wang
43fb6c195d drm/amdgpu: fix parameter error of RREG32_PCIE() in amdgpu_regs_pcie
the register offset isn't needed division by 4 to pass RREG32_PCIE()

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-03 10:51:37 -05:00
Yang Li
7271a5c2ae drm/amdgpu: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE
Fix the following coccicheck warning:
./drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1589:0-23: WARNING:
fops_ib_preempt should be defined with DEFINE_DEBUGFS_ATTRIBUTE
./drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1592:0-23: WARNING:
fops_sclk_set should be defined with DEFINE_DEBUGFS_ATTRIBUTE

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Nirmoy Das
98d28ac2f5 drm/amdgpu: do not use drm middle layer for debugfs
Use debugfs API directly instead of drm middle layer.

This also includes following debugfs file output changes:
1 amdgpu_evict_vram/amdgpu_evict_gtt output will not contain any braces.
  e.g. (0) --> 0
2 amdgpu_gpu_recover output will print return value of
  amdgpu_device_gpu_recover() instead of not so important "gpu recover"
  message.

v2: * checkpatch.pl: use '0444' instead of S_IRUGO.
    * remove S_IFREG from mode.
    * remove mode variable.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Nirmoy Das
373720f79d drm/amd/pm: do not use drm middle layer for debugfs
Use debugfs API directly instead of drm middle layer.

v2: * checkpatch.pl: use '0444' instead of S_IRUGO.
    * remove S_IFREG from mode.
    * remove mode variable.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Nirmoy Das
afd3a359c4 drm/amd/display: do not use drm middle layer for debugfs
Use debugfs API directly instead of drm middle layer.

v2: * checkpatch.pl: use '0444' instead of S_IRUGO.
    * remove S_IFREG from mode.
    * remove mode variable.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Nirmoy Das
88293c03c8 drm/amdgpu: do not keep debugfs dentry
Cleanup unnecessary debugfs dentries and surrounding functions.

v3: remove return value check for debugfs_create_file()
v2: remove ttm_debugfs_entries array.
    do not init variables.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Daniel Vetter
a6b8720c2f Merge tag 'amd-drm-next-5.12-2021-01-20' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.12-2021-01-20:

amdgpu:
- Fix non-x86 build
- W=1 fixes from Lee Jones
- Enable GPU reset on Navy Flounder
- Kernel doc fixes
- SMU workload profile fixes for APUs
- Display updates
- SR-IOV fixes
- Vangogh SMU feature enablment and bug fixes
- GPU reset support for Vangogh
- Misc cleanups

Conflicts:
	drivers/gpu/drm/amd/display/dc/dcn10/dcn10_mpc.c

Resolve the conflict by picking the initialization value from amd from
f03e80d2e8 ("drm/amd/display: Initialize stack variable") over the
one Linus picked in 61d791365b ("drm/amd/display: avoid
uninitialized variable warning"). It shouldn't matter.

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210120060951.22600-1-alexander.deucher@amd.com
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2021-01-20 13:08:18 +01:00
Jinzhou Su
ecaafb7b5a drm/amdgpu: Add secure display TA interface
Add interface to load, unload, invoke command for
secure display TA.

v2: Add debugfs interface for secure display TA
v3: fix warning in copy_from_user (Alex)

Signed-off-by: Jinzhou.Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-13 23:58:14 -05:00
Maarten Lankhorst
ae75a0431f Merge drm/drm-next into drm-misc-next
Required backmerge since we will be based on top of v5.11, and there
has been a request to backmerge already to upstream some features.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2020-12-15 11:05:43 +01:00
Luben Tuikov
6efa4b465c gpu/drm: ring_mirror_list --> pending_list
Rename "ring_mirror_list" to "pending_list",
to describe what something is, not what it does,
how it's used, or how the hardware implements it.

This also abstracts the actual hardware
implementation, i.e. how the low-level driver
communicates with the device it drives, ring, CAM,
etc., shouldn't be exposed to DRM.

The pending_list keeps jobs submitted, which are
out of our control. Usually this means they are
pending execution status in hardware, but the
latter definition is a more general (inclusive)
definition.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/405573/

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Christian König <christian.koenig@amd.com>
2020-12-08 14:38:03 +01:00
Luben Tuikov
8935ff00e3 drm/scheduler: "node" --> "list"
Rename "node" to "list" in struct drm_sched_job,
in order to make it consistent with what we see
being used throughout gpu_scheduler.h, for
instance in struct drm_sched_entity, as well as
the rest of DRM and the kernel.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/403515/

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Christian König <christian.koenig@amd.com>
2020-12-08 14:37:55 +01:00
Lee Jones
20ed491bbb drm/amd/amdgpu/amdgpu_debugfs: Demote obvious abuse of kernel-doc formatting
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:308: warning: Function parameter or member 'f' not described in 'amdgpu_debugfs_regs_read'
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:308: warning: Function parameter or member 'buf' not described in 'amdgpu_debugfs_regs_read'
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:308: warning: Function parameter or member 'size' not described in 'amdgpu_debugfs_regs_read'
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:308: warning: Function parameter or member 'pos' not described in 'amdgpu_debugfs_regs_read'
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:317: warning: Function parameter or member 'f' not described in 'amdgpu_debugfs_regs_write'
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:317: warning: Function parameter or member 'buf' not described in 'amdgpu_debugfs_regs_write'
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:317: warning: Function parameter or member 'size' not described in 'amdgpu_debugfs_regs_write'
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:317: warning: Function parameter or member 'pos' not described in 'amdgpu_debugfs_regs_write'

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-13 17:29:47 -05:00
Dave Airlie
5b8c596976 Merge tag 'amd-drm-next-5.11-2020-11-05' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.11-2020-11-05:

amdgpu:
- Add initial support for Vangogh
- Add support for Green Sardine
- Add initial support for Dimgrey Cavefish
- Scatter/Gather display support for Renoir
- Updates for Sienna Cichlid
- Updates for Navy Flounder
- SMU7 power improvements
- Modifier support for gfx9+
- CI BACO fixes
- Arcturus SMU fixes
- Lots of code cleanups
- DC fixes
- Kernel doc fixes
- Add more GPU HW client information to page fault error logging
- MPO clock tuning for RV
- FP fixes for DCN3 on ARM and PPC

radeon:
- Expose voltage via hwmon on Sumo APUs

amdkfd:
- Fix unique id handling
- Misc fixes

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105222749.201798-1-alexander.deucher@amd.com
2020-11-10 17:48:47 +10:00
Dave Airlie
1cd260a790 drm-misc-next for 5.11:
UAPI Changes:
 
   - doc: rules for EBUSY on non-blocking commits; requirements for fourcc
     modifiers; on parsing EDID
   - fbdev/sbuslib: Remove unused FBIOSCURSOR32
   - fourcc: deprecate DRM_FORMAT_MOD_NONE
   - virtio: Support blob resources for memory allocations; Expose host-visible
     and cross-device features
 
 Cross-subsystem Changes:
 
   - devicetree: Add vendor Prefix for Yes Optoelectronics, Shanghai Top Display
     Optoelectronics
   - dma-buf: Add struct dma_buf_map that stores DMA pointer and I/O-memory flag;
     dma_buf_vmap()/vunmap() return address in dma_buf_map; Use struct_size() macro
 
 Core Changes:
 
   - atomic: pass full state to CRTC atomic enable/disable; warn for EBUSY during
     non-blocking commits
   - dp: Prepare for DP 2.0 DPCD
   - dp_mst: Receive extended DPCD caps
   - dma-buf: Documentation
   - doc: Format modifiers; dma-buf-map; Cleanups
   - fbdev: Don't use compat_alloc_user_space(); mark as orphaned
   - fb-helper: Take lock in drm_fb_helper_restore_work_fb()
   - gem: Convert implementation and drivers to GEM object functions, remove
     GEM callbacks from struct drm_driver (expect gem_prime_mmap)
   - panel: Cleanups
   - pci: Add legacy infix to drm_irq_by_busid()
   - sched: Avoid infinite waits in drm_sched_entity_destroy()
   - switcheroo: Cleanups
   - ttm: Remove AGP support; Don't modify caching during swapout; Major
     refactoring of the implementation and API that affects all depending
     drivers; Add ttm_bo_wait_ctx(); Add ttm_bo_pin()/unpin() in favor of
     TTM_PL_FLAG_NO_EVICT; Remove ttm_bo_create(); Remove fault_reserve_notify()
     callback; Push move() implementation into drivers; Remove TTM_PAGE_FLAG_WRITE;
     Replace caching flags with init-time cache setting; Push ttm_tt_bind() into
     drivers; Replace move_notify() with delete_mem_notify(); No overlapping memcpy();
     no more ttm_set_populated()
   - vram-helper: Fix BO top-down placement; TTM-related changes; Init GEM
     object functions with defaults; Default placement in system memory; Cleanups
 
 Driver Changes:
 
   - amdgpu: Use GEM object functions
   - armada: Use GEM object functions
   - aspeed: Configure output via sysfs; Init struct drm_driver with
   - ast: Reload LUT after FB format changes
   - bridge: Add driver and DT bindings for anx7625; Cleanups
   - bridge/dw-hdmi: Constify ops
   - bridge/ti-sn65dsi86: Add retries for link training
   - bridge/lvds-codec: Add support for regulator
   - bridge/tc358768: Restore connector support DRM_GEM_CMA_DRIVEROPS; Cleanups
   - display/ti,j721e-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
     dma-coherent
   - display/ti,am65s-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
     dma-coherent
   - etnaviv: Use GEM object functions
   - exynos: Use GEM object functions
   - fbdev: Cleanups and compiler fixes throughout framebuffer drivers
   - fbdev/cirrusfb: Avoid division by 0
   - gma500: Use GEM object functions; Fix double-free of connector; Cleanups
   - hisilicon/hibmc: I2C-based DDC support; Use to_hibmc_drm_device(); Cleanups
   - i915: Use GEM object functions
   - imx/dcss: Init driver with DRM_GEM_CMA_DRIVER_OPS; Cleanups
   - ingenic: Reset pixel clock when parent clock changes; support reserved
     memory; Alloc F0 and F1 DMA channels at once; Support different pixel formats;
     Revert support for cached mmap buffers
     on F0/F1; support 30-bit/24-bit/8-bit-palette modes
   - komeda: Use DEFINE_SHOW_ATTRIBUTE
   - mcde: Detect platform_get_irq() errors
   - mediatek: Use GEM object functions
   - msm: Use GEM object functions
   - nouveau: Cleanups; TTM-related changes; Use GEM object functions
   - omapdrm: Use GEM object functions
   - panel: Add driver and DT bindings for Novatak nt36672a; Add driver and DT
     bindings for YTC700TLAG-05-201C; Add driver and DT bindings for TDO TL070WSH30;
     Cleanups
   - panel/mantix: Fix reset; Fix deref of NULL pointer in mantix_get_modes()
   - panel/otm8009a: Allow non-continuous dsi clock; Cleanups
   - panel/rm68200: Allow non-continuous dsi clock; Fix mode to 50 FPS
   - panfrost: Fix job timeout handling; Cleanups
   - pl111: Use GEM object functions
   - qxl: Cleanups; TTM-related changes; Pin new BOs with ttm_bo_init_reserved()
   - radeon: Cleanups; TTM-related changes; Use GEM object functions
   - rockchip: Use GEM object functions
   - shmobile: Cleanups
   - tegra: Use GEM object functions
   - tidss: Set drm_plane_helper_funcs.prepare_fb
   - tilcdc: Don't keep vblank interrupt enabled all the time
   - tve200: Detect platform_get_irq() errors
   - vc4: Use GEM object functions; Only register components once DSI is attached;
     Add Maxime as maintainer
   - vgem: Use GEM object functions
   - via: Simplify critical section in via_mem_alloc()
   - virtgpu: Use GEM object functions
   - virtio: Implement blob resources, host-visible and cross-device features;
     Support mapping of host-allocated resources; Use UUID APi; Cleanups
   - vkms: Use GEM object functions; Switch to SHMEM
   - vmwgfx: TTM-related changes; Inline ttm_bo_swapout_all()
   - xen: Use GEM object functions
   - xlnx: Use GEM object functions
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAl+X8N4ACgkQaA3BHVML
 eiPulggAu/RudQn0GmWPcDg9DpM30zE/ppBrmKk+WGWqj6M2DgK9gy+KhJps+Uht
 Fb2jMnUirNS5ZNa5kVhkdazOKqHq5jHHV+SbRPziySzV56TW8lbPU/HOUhKSQbkF
 FUB/YCWbb2kJA23So9VwNkjSJUXKpy896WoVxH7b/gLYL7c+sHUK9TOWAlsbFEmD
 t3kjxQgsHdVhqaZIKE7zg72Vi1AkkhjCVraPQeZY1GgmmLxdQeEKhNO8xdfG3OzY
 US4MYwJ51RfaCDTFr5t1UA224ODxoJtV3dTDDtrx4R5sf4MYJUC4SJYZHIyHyUkm
 9KXjFFzB9+Hd0JjpUHFUyl+4k8JjHQ==
 =GWwb
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2020-10-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.11:

UAPI Changes:

  - doc: rules for EBUSY on non-blocking commits; requirements for fourcc
    modifiers; on parsing EDID
  - fbdev/sbuslib: Remove unused FBIOSCURSOR32
  - fourcc: deprecate DRM_FORMAT_MOD_NONE
  - virtio: Support blob resources for memory allocations; Expose host-visible
    and cross-device features

Cross-subsystem Changes:

  - devicetree: Add vendor Prefix for Yes Optoelectronics, Shanghai Top Display
    Optoelectronics
  - dma-buf: Add struct dma_buf_map that stores DMA pointer and I/O-memory flag;
    dma_buf_vmap()/vunmap() return address in dma_buf_map; Use struct_size() macro

Core Changes:

  - atomic: pass full state to CRTC atomic enable/disable; warn for EBUSY during
    non-blocking commits
  - dp: Prepare for DP 2.0 DPCD
  - dp_mst: Receive extended DPCD caps
  - dma-buf: Documentation
  - doc: Format modifiers; dma-buf-map; Cleanups
  - fbdev: Don't use compat_alloc_user_space(); mark as orphaned
  - fb-helper: Take lock in drm_fb_helper_restore_work_fb()
  - gem: Convert implementation and drivers to GEM object functions, remove
    GEM callbacks from struct drm_driver (expect gem_prime_mmap)
  - panel: Cleanups
  - pci: Add legacy infix to drm_irq_by_busid()
  - sched: Avoid infinite waits in drm_sched_entity_destroy()
  - switcheroo: Cleanups
  - ttm: Remove AGP support; Don't modify caching during swapout; Major
    refactoring of the implementation and API that affects all depending
    drivers; Add ttm_bo_wait_ctx(); Add ttm_bo_pin()/unpin() in favor of
    TTM_PL_FLAG_NO_EVICT; Remove ttm_bo_create(); Remove fault_reserve_notify()
    callback; Push move() implementation into drivers; Remove TTM_PAGE_FLAG_WRITE;
    Replace caching flags with init-time cache setting; Push ttm_tt_bind() into
    drivers; Replace move_notify() with delete_mem_notify(); No overlapping memcpy();
    no more ttm_set_populated()
  - vram-helper: Fix BO top-down placement; TTM-related changes; Init GEM
    object functions with defaults; Default placement in system memory; Cleanups

Driver Changes:

  - amdgpu: Use GEM object functions
  - armada: Use GEM object functions
  - aspeed: Configure output via sysfs; Init struct drm_driver with
  - ast: Reload LUT after FB format changes
  - bridge: Add driver and DT bindings for anx7625; Cleanups
  - bridge/dw-hdmi: Constify ops
  - bridge/ti-sn65dsi86: Add retries for link training
  - bridge/lvds-codec: Add support for regulator
  - bridge/tc358768: Restore connector support DRM_GEM_CMA_DRIVEROPS; Cleanups
  - display/ti,j721e-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
    dma-coherent
  - display/ti,am65s-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
    dma-coherent
  - etnaviv: Use GEM object functions
  - exynos: Use GEM object functions
  - fbdev: Cleanups and compiler fixes throughout framebuffer drivers
  - fbdev/cirrusfb: Avoid division by 0
  - gma500: Use GEM object functions; Fix double-free of connector; Cleanups
  - hisilicon/hibmc: I2C-based DDC support; Use to_hibmc_drm_device(); Cleanups
  - i915: Use GEM object functions
  - imx/dcss: Init driver with DRM_GEM_CMA_DRIVER_OPS; Cleanups
  - ingenic: Reset pixel clock when parent clock changes; support reserved
    memory; Alloc F0 and F1 DMA channels at once; Support different pixel formats;
    Revert support for cached mmap buffers
    on F0/F1; support 30-bit/24-bit/8-bit-palette modes
  - komeda: Use DEFINE_SHOW_ATTRIBUTE
  - mcde: Detect platform_get_irq() errors
  - mediatek: Use GEM object functions
  - msm: Use GEM object functions
  - nouveau: Cleanups; TTM-related changes; Use GEM object functions
  - omapdrm: Use GEM object functions
  - panel: Add driver and DT bindings for Novatak nt36672a; Add driver and DT
    bindings for YTC700TLAG-05-201C; Add driver and DT bindings for TDO TL070WSH30;
    Cleanups
  - panel/mantix: Fix reset; Fix deref of NULL pointer in mantix_get_modes()
  - panel/otm8009a: Allow non-continuous dsi clock; Cleanups
  - panel/rm68200: Allow non-continuous dsi clock; Fix mode to 50 FPS
  - panfrost: Fix job timeout handling; Cleanups
  - pl111: Use GEM object functions
  - qxl: Cleanups; TTM-related changes; Pin new BOs with ttm_bo_init_reserved()
  - radeon: Cleanups; TTM-related changes; Use GEM object functions
  - rockchip: Use GEM object functions
  - shmobile: Cleanups
  - tegra: Use GEM object functions
  - tidss: Set drm_plane_helper_funcs.prepare_fb
  - tilcdc: Don't keep vblank interrupt enabled all the time
  - tve200: Detect platform_get_irq() errors
  - vc4: Use GEM object functions; Only register components once DSI is attached;
    Add Maxime as maintainer
  - vgem: Use GEM object functions
  - via: Simplify critical section in via_mem_alloc()
  - virtgpu: Use GEM object functions
  - virtio: Implement blob resources, host-visible and cross-device features;
    Support mapping of host-allocated resources; Use UUID APi; Cleanups
  - vkms: Use GEM object functions; Switch to SHMEM
  - vmwgfx: TTM-related changes; Inline ttm_bo_swapout_all()
  - xen: Use GEM object functions
  - xlnx: Use GEM object functions

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027100936.GA4858@linux-uq9g
2020-11-04 11:49:10 +10:00
Deepak R Varma
f3729f7b1a drm/amdgpu/amdgpu: improve code indentation and alignment
General code indentation and alignment changes such as replace spaces
by tabs or align function arguments as per the coding style
guidelines. The patch corrects issues for various amdgpu_*.c files
for this driver. Issue reported by checkpatch script.

Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-02 15:34:29 -05:00
John Clements
19ae333001 drm/amdgpu: added support for psp fw attestation
loaded fw can be queried from sys fs interface

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-26 13:27:00 -04:00
Mihir Bhogilal Patel
ff72bc4031 drm/amdgpu: Add debugfs entry for printing VM info
Create new debugfs entry to print memory info using VM buffer
objects.

V2: Added Common function for printing BO info.
    Dump more VM lists for evicted, moved, relocated, invalidated.
    Removed dumping VM mapped BOs.
V3: Fixed coding style comments, renamed print API and variables.
V4: Fixed coding style comments.

Signed-off-by: Mihir Bhogilal Patel <Mihir.Patel@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-15 12:23:24 -04:00
Christian König
4ce032d64c drm/ttm: nuke ttm_bo_evict_mm and rename mgr function v3
Make it more clear what the resource manager function
does and nuke the wrapper function.

v2: nuke the wrapper
v3: fix typo in radeon, rebased

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v2)
Link: https://patchwork.freedesktop.org/patch/393914/
2020-10-07 13:53:08 +02:00
Hawking Zhang
f7ee1874b0 drm/amdgpu: support indirect access reg outside of mmio bar (v2)
support both direct and indirect accessor in unified
helper functions.

v2: Retire indirect mmio access via mm_index/data

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01 10:42:55 -04:00
Luben Tuikov
4a580877bd drm/amdgpu: Get DRM dev from adev by inline-f
Add a static inline adev_to_drm() to obtain
the DRM device pointer from an amdgpu_device pointer.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:06 -04:00
Luben Tuikov
1348969ab6 drm/amdgpu: drm_device to amdgpu_device by inline-f (v2)
Get the amdgpu_device from the DRM device by use
of an inline function, drm_to_adev(). The inline
function resolves a pointer to struct drm_device
to a pointer to struct amdgpu_device.

v2: Use a typed visible static inline function
    instead of an invisible macro.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:06 -04:00
Dennis Li
6049db43d6 drm/amdgpu: change reset lock from mutex to rw_semaphore
clients don't need reset-lock for synchronization when no
GPU recovery.

v2:
change to return the return value of down_read_killable.

v3:
if GPU recovery begin, VF ignore FLR notification.

Reviewed-by: Monk Liu <monk.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:23:48 -04:00
Dennis Li
53b3f8f40e drm/amdgpu: refine codes to avoid reentering GPU recovery
if other threads have holden the reset lock, recovery will
fail to try_lock. Therefore we introduce atomic hive->in_reset
and adev->in_gpu_reset, to avoid reentering GPU recovery.

v2:
drop "? true : false" in the definition of amdgpu_in_reset

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:22:56 -04:00
Christian König
f1403342eb drm/amdgpu: revert "fix system hang issue during GPU reset"
The whole approach wasn't thought through till the end.

We already had a reset lock like this in the past and it caused the same problems like this one.

Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.

This reverts commit df9c8d1aa2.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Wenhui Sheng
a4322e1881 drm/amdgpu: add debugfs interface for RAP test
After amdgpu driver loading successfully, we can use
RAP debugfs interface <debugfs_dir>/dri/xxx/rap_test
to trigger RAP test.

Currently only L0 validate test is supported.

v2: refine amdgpu_rap.h

Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: Guchun Chen <Guchun.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Dennis Li
df9c8d1aa2 drm/amdgpu: fix system hang issue during GPU reset
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover,
the atomic adev->in_gpu_reset and hive->in_reset are used to avoid
re-entering GPU recovery.

During GPU reset and resume, it is unsafe that other threads access GPU,
which maybe cause GPU reset failed. Therefore the new rw_semaphore
adev->reset_sem is introduced, which protect GPU from being accessed by
external threads during recovery.

v2:
1. add rwlock for some ioctls, debugfs and file-close function.
2. change to use dqm->is_resetting and dqm_lock for protection in kfd
driver.
3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid
re-enter GPU recovery for the same GPU hang.

v3:
1. change back to use adev->reset_sem to protect kfd callback
functions, because dqm_lock couldn't protect all codes, for example:
free_mqd must be called outside of dqm_lock;

[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249]  dump_stack+0x98/0xd5
[ 1230.179443]  amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673]  gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882]  amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098]  amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239]  ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394]  ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558]  ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707]  ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832]  ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979]  ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522]  amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833]  free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143]  destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475]  pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819]  kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154]  kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458]  ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656]  ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831]  ksys_ioctl+0x98/0xb0
[ 1230.204004]  __x64_sys_ioctl+0x1a/0x20
[ 1230.205174]  do_syscall_64+0x5f/0x250
[ 1230.206339]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

2. remove try_lock and introduce atomic hive->in_reset, to avoid
re-enter GPU recovery.

v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove comment codes in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset

v5:
1. Fix some style issues.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Suggested-by: Luben Tukov <luben.tuikov@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-27 16:21:37 -04:00
Jinzhou.Su
443c7f3c36 drm/amdgpu: add read amdgpu_gfxoff status in debugfs
Add interface for SMU12 device, used by UMR.

v2: fix code style

Signed-off-by: Jinzhou.Su <Jinzhou.Su@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21 15:37:37 -04:00