2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

20 Commits

Author SHA1 Message Date
Christian König
dd80d9c8ee drm/amdgpu: revert "partial revert "remove ctx->lock" v2"
This reverts commit 94f4c4965e.

We found that the bo_list is missing a protection for its list entries.
Since that is fixed now this workaround can be removed again.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:33:01 -04:00
Christian König
af0b541670 drm/amdgpu: Convert to common fdinfo format v5
Convert fdinfo format to one documented in drm-usage-stats.rst.

It turned out that the existing implementation was actually completely
nonsense. The calculated percentages indeed represented the usage of the
engine, but with varying time slices.

So 10% usage for application A could mean something completely different
than 10% usage for application B.

Completely nuke that and just use the now standardized nanosecond
interface.

v2: drop the documentation change for now, nuke percentage calculation
v3: only account for each hw_ip, move the time_spend to the ctx mgr.
v4: move general ctx changes into separate patch, rework the fdinfo to
    ctx_mgr interface so that all usages are calculated at once, drop
    some unecessary and dangerous refcount dance.
v5: add one more comment how we calculate the time spend

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Christian König
69493c034d drm/amdgpu: cleanup ctx implementation
Let each context have a pointer to the ctx manager and properly
initialize the adev pointer inside the context manager.

Reduce the BUG_ON() in amdgpu_ctx_add_fence() into a WARN_ON() and
directly return the sequence number instead of writing into a parmeter.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Christian König
94f4c4965e drm/amdgpu: partial revert "remove ctx->lock" v2
This reverts commit 461fa7b0ac.

We are missing some inter dependencies here so re-introduce the lock
until we have figured out what's missing. Just drop/retake it while
adding dependencies.

v2: still drop the lock while adding dependencies

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> (v1)
Fixes: 461fa7b0ac ("drm/amdgpu: remove ctx->lock")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220419110633.166236-1-christian.koenig@amd.com
2022-04-21 11:26:20 +02:00
Ken Xue
461fa7b0ac drm/amdgpu: remove ctx->lock
KMD reports a warning on holding a lock from drm_syncobj_find_fence,
when running amdgpu_test case “syncobj timeline test”.

ctx->lock was designed to prevent concurrent "amdgpu_ctx_wait_prev_fence"
calls and avoid dead reservation lock from GPU reset. since no reservation
lock is held in latest GPU reset any more, ctx->lock can be simply removed
and concurrent "amdgpu_ctx_wait_prev_fence" call also can be prevented by
PD root bo reservation lock.

call stacks:
=================
//hold lock
amdgpu_cs_ioctl->amdgpu_cs_parser_init->mutex_lock(&parser->ctx->lock);
…
//report warning
amdgpu_cs_dependencies->amdgpu_cs_process_syncobj_timeline_in_dep \
->amdgpu_syncobj_lookup_and_add_to_sync -> drm_syncobj_find_fence \
-> lockdep_assert_none_held_once
…
amdgpu_cs_ioctl->amdgpu_cs_parser_fini->mutex_unlock(&parser->ctx->lock);

Signed-off-by: Ken Xue <Ken.Xue@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-11 16:19:23 -05:00
Alex Deucher
8cda7a4f96 drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates
Add a new CTX ioctl operation to set stable pstates for profiling.
When creating traces for tools like RGP or using SPM or doing
performance profiling, it's required to enable a special
stable profiling power state on the GPU.  These profiling
states set fixed clocks and disable certain other power
features like powergating which may impact the results.

Historically, these profiling pstates were enabled via sysfs,
but this adds an interface to enable it via the CTX ioctl
from the application.  Since the power state is global
only one application can set it at a time, so if multiple
applications try and use it only the first will get it,
the ioctl will return -EBUSY for others.  The sysfs interface
will override whatever has been set by this interface.

Mesa MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207

v2: don't default r = 0;
v3: rebase on Evan's PM cleanup

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-27 15:50:08 -05:00
Nirmoy Das
84d588c3de drm/amdgpu: rework context priority handling
To get a hardware queue priority for a context, we are currently
mapping AMDGPU_CTX_PRIORITY_* to DRM_SCHED_PRIORITY_* and then
to hardware queue priority, which is not the right way to do that
as DRM_SCHED_PRIORITY_* is software scheduler's priority and it is
independent from a hardware queue priority.

Use userspace provided context priority, AMDGPU_CTX_PRIORITY_* to
map a context to proper hardware queue priority.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
David M Nieto
5c439c38f5 drm/amdgpu: fix fence calculation (v2)
The proper metric for fence utilization over several
contexts is an harmonic mean, but such calculation is
prohibitive in kernel space, so the code approximates it.

Because the approximation diverges when one context has a
very small ratio compared with the other context, this change
filter out ratios smaller that 0.01%

v2: make the fence calculation static and initialize variables
within that function

v3: Fix warnings (Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David M Nieto <david.nieto@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210513174539.27409-2-david.nieto@amd.com
2021-05-13 14:09:12 -04:00
Roy Sun
8744425411 drm/amdgpu: Add show_fdinfo() interface
Tracking devices, process info and fence info using
/proc/pid/fdinfo

Signed-off-by: David M Nieto <David.Nieto@amd.com>
Signed-off-by: Roy Sun <Roy.Sun@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210426062701.39732-2-Roy.Sun@amd.com
2021-05-05 09:26:53 +02:00
Nirmoy Das
1c6d567bdf drm/amdgpu: rework sched_list generation
Generate HW IP's sched_list in amdgpu_ring_init() instead of
amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
This patch also stores sched_list for all HW IPs in one big
array in struct amdgpu_device which makes amdgpu_ctx_init_entity()
much more leaner.

v2:
fix a coding style issue
do not use drm hw_ip const to populate amdgpu_ring_type enum

v3:
remove ctx reference and move sched array and num_sched to a struct
use num_scheds to detect uninitialized scheduler list

v4:
use array_index_nospec for user space controlled variables
fix possible checkpatch.pl warnings

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 10:43:14 -04:00
Nirmoy Das
977f7e1068 drm/amdgpu: allocate entities on demand
Currently we pre-allocate entities and fences for all the HW IPs on
context creation and some of which are might never be used.

This patch tries to resolve entity/fences wastage by creating entity
only when needed.

v2: allocate memory for entity and fences together

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-30 17:15:42 -05:00
Nirmoy Das
63e3ab9a82 drm/amdgpu: individualize fence allocation per entity
Allocate fences for each entity and remove ctx->fences reference as
fences should be bound to amdgpu_ctx_entity instead amdgpu_ctx.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Nirmoy Das
f880799d7f amd/amdgpu: add sched array to IPs with multiple run-queues
This sched array can be passed on to entity creation routine
instead of manually creating such sched array on every context creation.

v2: squash in missing break fix

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Guchun Chen
64cc5414fb drm/amdgpu: correct ras error count type
Use unsigned long type for the same ras count variable.
This will avoid overflow on 64 bit system.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-23 11:30:32 -05:00
Christian König
56753e73fb drm/amdgpu: wait for VM to become idle during flush
Make sure that not only the entities are flush, but that
we also wait for the HW to finish all processing.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:36:58 -05:00
xinhui pan
ae363a212b drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2
Add AMDGPU_CTX_QUERY2_FLAGS_RAS_CE/UE which indicate if any error happened
between previous query and this query.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:36:51 -05:00
Christian König
85eff20020 drm/amdgpu: amdgpu_ctx_add_fence can't fail
No more waiting for a fence done here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 15:12:10 -05:00
Christian König
1b1f2fecb6 drm/amdgpu: rework ctx entity creation
Use a fixed number of entities for each hardware IP.

The number of compute entities is reduced to four, SDMA keeps it two
entities and all other engines just expose one entity.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:11:11 -05:00
Christian König
0d346a14c6 drm/amdgpu: use entity instead of ring for CS
Further demangle ring from entity handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:48 -05:00
Christian König
8290268f31 drm/amdgpu: move context related stuff to amdgpu_ctx.h
Further unmangle amdgpu.h.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:48 -05:00