2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

58 Commits

Author SHA1 Message Date
Maxime Ripard
a1ff5a7d78
Merge drm/drm-fixes into drm-misc-fixes
Let's start the new drm-misc-fixes cycle by bringing in 6.11-rc1.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-07-30 09:09:23 +02:00
Tvrtko Ursulin
32df4abc44 drm/v3d: Fix potential memory leak in the performance extension
If fetching of userspace memory fails during the main loop, all drm sync
objs looked up until that point will be leaked because of the missing
drm_syncobj_put.

Fix it by exporting and using a common cleanup helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: bae7cb5d68 ("drm/v3d: Create a CPU job extension for the reset performance query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-4-tursulin@igalia.com
(cherry picked from commit 484de39fa5)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-18 15:49:28 +02:00
Tvrtko Ursulin
0e50fcc20b drm/v3d: Fix potential memory leak in the timestamp extension
If fetching of userspace memory fails during the main loop, all drm sync
objs looked up until that point will be leaked because of the missing
drm_syncobj_put.

Fix it by exporting and using a common cleanup helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 9ba0ff3e08 ("drm/v3d: Create a CPU job extension for the timestamp query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-3-tursulin@igalia.com
(cherry picked from commit 753ce4fea6)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-18 15:49:08 +02:00
Maíra Canal
7c78fdbace
drm/v3d: Add V3D tech revision to the device information
The V3D tech revision can be a useful information when configuring
jobs. Therefore, expose it in the `struct v3d_dev` with the V3D tech
version.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-1-mcanal@igalia.com
2024-07-15 12:45:59 -03:00
Tvrtko Ursulin
9c3951ec27
drm/v3d: Fix perfmon build error/warning
Move static const array into the source file to fix the "defined but not
used" errors.

The fix is perhaps not the prettiest due hand crafting the array sizes
in v3d_performance_counters.h, but I did add some build time asserts to
validate the counts look sensible, so hopefully it is good enough for a
quick fix.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 3cbcbe016c ("drm/v3d: Add Performance Counters descriptions for V3D 4.2 and 7.1")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405211137.hueFkLKG-lkp@intel.com/Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240604160210.24073-1-tursulin@igalia.com
2024-06-05 10:44:51 +02:00
Maíra Canal
f5b798bdc9
drm/v3d: Use V3D_MAX_COUNTERS instead of V3D_PERFCNT_NUM
V3D_PERFCNT_NUM represents the maximum number of performance counters
for V3D 4.2, but not for V3D 7.1. This means that, if we use
V3D_PERFCNT_NUM, we might go out-of-bounds on V3D 7.1.

Therefore, use the number of performance counters on V3D 7.1 as the
maximum number of counters. This will allow us to create arrays on the
stack with reasonable size. Note that userspace must use the value
provided by DRM_V3D_PARAM_MAX_PERF_COUNTERS.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-6-mcanal@igalia.com
2024-05-20 16:38:03 -03:00
Maíra Canal
f33fe58298
drm/v3d: Create new IOCTL to expose performance counters information
Userspace usually needs some information about the performance counters
available. Although we could replicate this information in the kernel
and user-space, let's use the kernel as the "single source of truth" to
avoid issues in the future (e.g. list of performance counters is updated
in user-space, but not in the kernel, generating invalid requests).

Therefore, create a new IOCTL to expose the performance counters
information, that is name, category, and description.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-5-mcanal@igalia.com
2024-05-20 16:38:02 -03:00
Maíra Canal
c606043ddb
drm/v3d: Different V3D versions can have different number of perfcnt
Currently, even though V3D 7.1 has 93 performance counters, it is not
possible to create counters bigger than 87, as
`v3d_perfmon_create_ioctl()` understands that counters bigger than 87
are invalid.

Therefore, create a device variable to expose the maximum
number of counters for a given V3D version and make
`v3d_perfmon_create_ioctl()` check this variable.

This commit fixes CTS failures in the performance queries tests
`dEQP-VK.query_pool.performance_query.*` [1]

Link: ea1f09a5f2 [1]
Fixes: 6fd9487147 ("drm/v3d: add brcm,2712-v3d as a compatible V3D device")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-3-mcanal@igalia.com
2024-05-20 16:38:00 -03:00
Maíra Canal
3cbcbe016c
drm/v3d: Add Performance Counters descriptions for V3D 4.2 and 7.1
Add name, category and description for each one of the 93 performance
counters available on V3D.

Note that V3D 4.2 has 87 performance counters, while V3D 7.1 has 93.
Therefore, there are two performance counters arrays. The index of the
performance counter for each V3D version is represented by its position
on the array.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-2-mcanal@igalia.com
2024-05-20 16:37:59 -03:00
Maíra Canal
6abe93b621
drm/v3d: Fix race-condition between sysfs/fdinfo and interrupt handler
In V3D, the conclusion of a job is indicated by a IRQ. When a job
finishes, then we update the local and the global GPU stats of that
queue. But, while the GPU stats are being updated, a user might be
reading the stats from sysfs or fdinfo.

For example, on `gpu_stats_show()`, we could think about a scenario where
`v3d->queue[queue].start_ns != 0`, then an interrupt happens, we update
the value of `v3d->queue[queue].start_ns` to 0, we come back to
`gpu_stats_show()` to calculate `active_runtime` and now,
`active_runtime = timestamp`.

In this simple example, the user would see a spike in the queue usage,
that didn't match reality.

In order to address this issue properly, use a seqcount to protect read
and write sections of the code.

Fixes: 09a93cc4f7 ("drm/v3d: Implement show_fdinfo() callback for GPU usage stats")
Reported-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-7-mcanal@igalia.com
2024-04-23 19:32:49 -03:00
Maíra Canal
12d1624ce3
drm/v3d: Decouple stats calculation from printing
Create a function to decouple the stats calculation from the printing.
This will be useful in the next step when we add a seqcount to protect
the stats.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-6-mcanal@igalia.com
2024-04-23 19:32:48 -03:00
Maíra Canal
b136b1953f
drm/v3d: Create a struct to store the GPU stats
This will make it easier to instantiate the GPU stats variables and it
will create a structure where we can store all the variables that refer
to GPU stats.

Note that, when we created the struct `v3d_stats`, we renamed
`jobs_sent` to `jobs_completed`. This better express the semantics of
the variable, as we are only accounting jobs that have been completed.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-4-mcanal@igalia.com
2024-04-23 19:32:46 -03:00
Maíra Canal
52ce97765c
drm/v3d: Create two functions to update all GPU stats variables
Currently, we manually perform all operations to update the GPU stats
variables. Apart from the code repetition, this is very prone to errors,
as we can see on commit 35f4f8c9fc ("drm/v3d: Don't increment
`enabled_ns` twice").

Therefore, create two functions to manage updating all GPU stats
variables. Now, the jobs only need to call for `v3d_job_update_stats()`
when the job is done and `v3d_job_start_stats()` when starting the job.

Co-developed-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-3-mcanal@igalia.com
2024-04-23 19:32:45 -03:00
Maíra Canal
51b76c1f30
drm/v3d: Enable V3D to use different PAGE_SIZE
Currently, the V3D driver uses PAGE_SHIFT over the assumption that
PAGE_SHIFT = 12, as the PAGE_SIZE = 4KB. But, the RPi 5 is using
PAGE_SIZE = 16KB, so the MMU PAGE_SHIFT is different than the system's
PAGE_SHIFT.

Enable V3D to be used in system's with any PAGE_SIZE by making sure that
everything MMU-related uses the MMU page shift.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240214193503.164462-1-mcanal@igalia.com
2024-02-23 16:37:20 -03:00
Maíra Canal
209e8d2695
drm/v3d: Create a CPU job extension for the copy performance query job
A CPU job is a type of job that performs operations that requires CPU
intervention. A copy performance query job is a job that copy the complete
or partial result of a query to a buffer. In order to copy the result of
a performance query to a buffer, we need to get the values from the
performance monitors.

So, create a user extension for the CPU job that enables the creation
of a copy performance query job. This user extension will allow the creation
of a CPU job that copy the results of a performance query to a BO with the
possibility to indicate the availability with a availability bit.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-19-mcanal@igalia.com
2023-12-01 09:47:36 -03:00
Maíra Canal
bae7cb5d68
drm/v3d: Create a CPU job extension for the reset performance query job
A CPU job is a type of job that performs operations that requires CPU
intervention. A reset performance query job is a job that resets the
performance queries by resetting the values of the perfmons. Moreover,
we also reset the syncobjs related to the availability of the query.

So, create a user extension for the CPU job that enables the creation
of a reset performance job. This user extension will allow the creation of
a CPU job that resets the perfmons values and resets the availability syncobj.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-18-mcanal@igalia.com
2023-12-01 09:47:35 -03:00
Maíra Canal
6745f3e44a
drm/v3d: Create a CPU job extension to copy timestamp query to a buffer
A CPU job is a type of job that performs operations that requires CPU
intervention. A copy timestamp query job is a job that copy the complete
or partial result of a query to a buffer. As V3D doesn't provide any
mechanism to obtain a timestamp from the GPU, it is a job that needs
CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a copy timestamp query job. This user extension will allow the creation
of a CPU job that copy the results of a timestamp query to a BO with the
possibility to indicate the timestamp availability with a availability bit.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-17-mcanal@igalia.com
2023-12-01 09:47:31 -03:00
Maíra Canal
34a101e642
drm/v3d: Create a CPU job extension for the reset timestamp job
A CPU job is a type of job that performs operations that requires CPU
intervention. A reset timestamp job is a job that resets the timestamp
queries based on the value offset of the first query. As V3D doesn't
provide any mechanism to obtain a timestamp from the GPU, it is a job
that needs CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a reset timestamp job. This user extension will allow the creation of
a CPU job that resets the timestamp value in the timestamp BO and resets
the availability syncobj.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-16-mcanal@igalia.com
2023-12-01 09:42:47 -03:00
Maíra Canal
9ba0ff3e08
drm/v3d: Create a CPU job extension for the timestamp query job
A CPU job is a type of job that performs operations that requires CPU
intervention. A timestamp query job is a job that calculates the
query timestamp and updates the query availability by signaling a
syncobj. As V3D doesn't provide any mechanism to obtain a timestamp
from the GPU, it is a job that needs CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a timestamp query job. This user extension will allow the creation of
a CPU job that performs the timestamp query calculation and updates the
timestamp BO with the proper value.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-15-mcanal@igalia.com
2023-12-01 09:41:37 -03:00
Maíra Canal
18b8413b25
drm/v3d: Create a CPU job extension for a indirect CSD job
A CPU job is a type of job that performs operations that requires CPU
intervention. An indirect CSD job is a job that, when executed in the
queue, will map the indirect buffer, read the dispatch parameters, and
submit a regular dispatch. Therefore, it is a job that needs CPU
intervention.

So, create a user extension for the CPU job that enables the creation
of an indirect CSD. This user extension will allow the creation of a CSD
job linked to a CPU job. The CPU job will wait for the indirect CSD job
dependencies and, once they are signaled, it will update the CSD job
parameters.

Co-developed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-14-mcanal@igalia.com
2023-12-01 09:40:15 -03:00
Maíra Canal
7c13132c40
drm/v3d: Enable BO mapping
For the indirect CSD CPU job, we will need to access the internal
contents of the BO with the dispatch parameters. Therefore, create
methods to allow the mapping and unmapping of the BO.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-13-mcanal@igalia.com
2023-12-01 09:37:50 -03:00
Melissa Wen
aafc1a2bea
drm/v3d: Add a CPU job submission
Create a new type of job, a CPU job. A CPU job is a type of job that
performs operations that requires CPU intervention. The overall idea is
to use user extensions to enable different types of CPU job, allowing the
CPU job to perform different operations according to the type of user
extension. The user extension ID identify the type of CPU job that must
be dealt.

Having a CPU job is interesting for synchronization purposes as a CPU
job has a queue like any other V3D job and can be synchoronized by the
multisync extension.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Co-developed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-9-mcanal@igalia.com
2023-12-01 09:34:19 -03:00
Melissa Wen
9032d5f633
drm/v3d: Detach job submissions IOCTLs to a new specific file
We will include a new job submission type, the CPU job submission. For
readability and maintability, separate the job submission IOCTLs and
related operations from v3d_gem.c.

Minor fix in the CSD submission kernel doc:
CSD (texture formatting) -> CSD (compute shader).

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-5-mcanal@igalia.com
2023-12-01 09:34:01 -03:00
Melissa Wen
a8ad9d63a1
drm/v3d: Move wait BO ioctl to the v3d_bo file
IOCTLs related to BO operations reside on the file v3d_bo.c. The wait BO
ioctl is the only IOCTL regarding BOs that is placed in a different file.
So, move it to the v3d_bo.c file.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-4-mcanal@igalia.com
2023-12-01 09:33:56 -03:00
Melissa Wen
780b9463ce
drm/v3d: Remove unused function header
v3d_mmu_get_offset header was added but the function was never defined.
Just remove it.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-3-mcanal@igalia.com
2023-12-01 09:33:52 -03:00
Maíra Canal
509433d814
drm/v3d: Expose the total GPU usage stats on sysfs
The previous patch exposed the accumulated amount of active time per
client for each V3D queue. But this doesn't provide a global notion of
the GPU usage.

Therefore, provide the accumulated amount of active time for each V3D
queue (BIN, RENDER, CSD, TFU and CACHE_CLEAN), considering all the jobs
submitted to the queue, independent of the client.

This data is exposed through the sysfs interface, so that if the
interface is queried at two different points of time the usage percentage
of each of the queues can be calculated.

Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com
2023-11-06 10:09:29 -03:00
Maíra Canal
09a93cc4f7
drm/v3d: Implement show_fdinfo() callback for GPU usage stats
This patch exposes the accumulated amount of active time per client
through the fdinfo infrastructure. The amount of active time is exposed
for each V3D queue: BIN, RENDER, CSD, TFU and CACHE_CLEAN.

In order to calculate the amount of active time per client, a CPU clock
is used through the function local_clock(). The point where the jobs has
started is marked and is finally compared with the time that the job had
finished.

Moreover, the number of jobs submitted to each queue is also exposed on
fdinfo through the identifier "v3d-jobs-<queue>".

Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com
2023-11-06 10:09:23 -03:00
Kees Cook
9586e24017 drm/v3d: Annotate struct v3d_perfmon with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct v3d_perfmon.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <mwen@igalia.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230922173216.3823169-9-keescook@chromium.org
2023-10-05 11:31:33 +02:00
Nathan Chancellor
b27211db61
drm/v3d: Avoid -Wconstant-logical-operand in nsecs_to_jiffies_timeout()
A proposed update to clang's -Wconstant-logical-operand to warn when the
left hand side is a constant shows the following instance in
nsecs_to_jiffies_timeout() when NSEC_PER_SEC is not a multiple of HZ,
such as CONFIG_HZ=300:

  In file included from drivers/gpu/drm/v3d/v3d_debugfs.c:12:
  drivers/gpu/drm/v3d/v3d_drv.h:343:24: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
    343 |         if (NSEC_PER_SEC % HZ &&
        |             ~~~~~~~~~~~~~~~~~ ^
  drivers/gpu/drm/v3d/v3d_drv.h:343:24: note: use '&' for a bitwise operation
    343 |         if (NSEC_PER_SEC % HZ &&
        |                               ^~
        |                               &
  drivers/gpu/drm/v3d/v3d_drv.h:343:24: note: remove constant to silence this warning
  1 warning generated.

Turn this into an explicit comparison against zero to make the
expression a boolean to make it clear this should be a logical check,
not a bitwise one.

Link: https://reviews.llvm.org/D142609
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20230718-nsecs_to_jiffies_timeout-constant-logical-operand-v1-1-36ed8fc8faea@kernel.org
2023-07-27 13:01:27 -03:00
Melissa Wen
e4165ae830 drm/v3d: add multiple syncobjs support
Using the generic extension from the previous patch, a specific multisync
extension enables more than one in/out binary syncobj per job submission.
Arrays of syncobjs are set in struct drm_v3d_multisync, that also cares
of determining the stage for sync (wait deps) according to the job
queue.

v2:
- subclass the generic extension struct (Daniel)
- simplify adding dependency conditions to make understandable (Iago)

v3:
- fix conditions to consider single or multiples in/out_syncs (Iago)
- remove irrelevant comment (Iago)

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ffd8b2e3dd2e0c686db441a0c0a4a0181ff85328.1633016479.git.mwen@igalia.com
2021-10-04 10:08:46 +01:00
Daniel Vetter
da3208e863 drm/v3d: Use scheduler dependency handling
With the prep work out of the way this isn't tricky anymore.

Aside: The chaining of the various jobs is a bit awkward, with the
possibility of failure in bad places. I think with the
drm_sched_job_init/arm split and maybe preloading the
job->dependencies xarray this should be fixable.

v2: Rebase over renamed function names for adding dependencies.

Reviewed-by: Melissa Wen <mwen@igalia.com> (v1)
Acked-by: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-11-daniel.vetter@ffwll.ch
2021-08-30 10:58:47 +02:00
Daniel Vetter
916044fac8 drm/v3d: Move drm_sched_job_init to v3d_job_init
Prep work for using the scheduler dependency handling. We need to call
drm_sched_job_init earlier so we can use the new drm_sched_job_await*
functions for dependency handling here.

v2: Slightly better commit message and rebase to include the
drm_sched_job_arm() call (Emma).

v3: Cleanup jobs under construction correctly (Emma)

v4: Rebase over perfmon patch

Reviewed-by: Melissa Wen <mwen@igalia.com> (v3)
Acked-by: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-10-daniel.vetter@ffwll.ch
2021-08-30 10:58:40 +02:00
Juan A. Suarez Romero
26a4dc29b7 drm/v3d: Expose performance counters to userspace
The V3D engine has several hardware performance counters that can of
interest for userspace performance analysis tools.

This exposes new ioctls to create and destroy performance monitor
objects, as well as to query the counter values.

Each created performance monitor object has an ID that can be attached
to CL/CSD submissions, so the driver enables the requested counters when
the job is submitted, and updates the performance monitor values when
the job is done.

It is up to the user to ensure all the jobs have been finished before
getting the performance monitor values. It is also up to the user to
properly synchronize BCL jobs when submitting jobs with different
performance monitors attached.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Emma Anholt <emma@anholt.net>
To: dri-devel@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608111541.461991-1-jasuarez@igalia.com
2021-07-21 00:19:59 +01:00
Daniel Vetter
0df3ac7657 drm/v3d: Delete v3d_dev->pdev
We already have it in v3d_dev->drm.dev with zero additional pointer
chasing. Personally I don't like duplicated pointers like this
because:
- reviewers need to check whether the pointer is for the same or
different objects if there's multiple
- compilers have an easier time too

To avoid having to pull in some big headers I implemented the casting
function as a macro instead of a static inline. Typechecking thanks to
container_of still assured.

But also a bit a bikeshed, so feel free to ignore.

v2: More parens for v3d_to_pdev macro (checkpatch)

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-11-daniel.vetter@ffwll.ch
2020-04-28 15:15:59 +02:00
Daniel Vetter
bc662528e2 drm/v3d: Delete v3d_dev->dev
We already have it in v3d_dev->drm.dev with zero additional pointer
chasing. Personally I don't like duplicated pointers like this
because:
- reviewers need to check whether the pointer is for the same or
  different objects if there's multiple
- compilers have an easier time too

But also a bit a bikeshed, so feel free to ignore.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-10-daniel.vetter@ffwll.ch
2020-04-28 15:15:52 +02:00
Daniel Vetter
af25c16bd1 drm/v3d: Don't set drm_device->dev_private
And switch the helper over to container_of, which is a bunch faster
than chasing a pointer. Plus allows gcc to see through this maze.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-8-daniel.vetter@ffwll.ch
2020-04-28 15:15:41 +02:00
Wambui Karuga
7ce84471e3 drm: convert .debugfs_init() hook to return void.
As a result of commit 987d65d013 (drm: debugfs: make
drm_debugfs_create_files() never fail) and changes to various debugfs
functions in drm/core and across various drivers, there is no need for
the drm_driver.debugfs_init() hook to have a return value. Therefore,
declare it as void.

This also includes refactoring all users of the .debugfs_init() hook to
return void across the subsystem.

v2: include changes to the hook and drivers that use it in one patch to
prevent driver breakage and enable individual successful compilation of
this change.

References: https://lists.freedesktop.org/archives/dri-devel/2020-February/257183.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310133121.27913-18-wambui.karugax@gmail.com
2020-03-18 17:53:28 +01:00
James Hughes
9daee6141c drm/v3d: Replace wait_for macros to remove use of msleep
The wait_for macro's for Broadcom V3D driver used msleep, which is
inappropriate due to its inaccuracy at low values (minimum wait time
is about 30ms on the Raspberry Pi).  This sleep was triggering in
v3d_clean_caches(), causing us to only be able to dispatch ~33 compute
jobs per second.

This patch replaces the macro with the one from the Intel i915 version
which uses usleep_range to provide more accurate waits.

v2: Split from the vc4 patch so that we can confidently apply to
    stable (by anholt)

Signed-off-by: James Hughes <james.hughes@raspberrypi.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200217153145.13780-1-james.hughes@raspberrypi.com
Link: https://github.com/raspberrypi/linux/issues/3460
Fixes: 57692c94dc ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
2020-03-04 22:15:34 -08:00
Sam Ravnborg
220989e709 drm/v3d: drop use of drmP.h
Drop use of the deprecated drmP.h header file.
Made v3d_drv.h self-contained with only sufficient
include files.
Fixed fallout in remaining files.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190716064220.18157-3-sam@ravnborg.org
2019-07-17 12:52:20 +02:00
Eric Anholt
38c2c7917a drm/v3d: Fix and extend MMU error handling.
We were setting the wrong flags to enable PTI errors, so we were
seeing reads to invalid PTEs show up as write errors.  Also, we
weren't turning on the interrupts.  The AXI IDs we were dumping
included the outstanding write number and so they looked basically
random.  And the VIO_ADDR decoding was based on the MMU VA_WIDTH for
the first platform I worked on and was wrong on others.  In short,
this was a thorough mess from early HW enabling.

Tested on V3D 4.1 and 4.2 with intentional L2T, CLE, PTB, and TLB
faults.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419001014.23579-4-eric@anholt.net
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
2019-05-16 09:24:52 -07:00
Eric Anholt
dffa9b7a78 drm/v3d: Add missing implicit synchronization.
It is the expectation of existing userspace (X11 + Mesa, in
particular) that jobs submitted to the kernel against a shared BO will
get implicitly synchronized by their submission order.  If we want to
allow clever userspace to disable implicit synchronization, we should
do that under its own submit flag (as amdgpu and lima do).

Note that we currently only implicitly sync for the rendering pass,
not binning -- if you texture-from-pixmap in the binning vertex shader
(vertex coordinate generation), you'll miss out on synchronization.

Fixes flickering when multiple clients are running in parallel,
particularly GL apps and compositors.

v2: Fix a missing refcount on the CSD done fence for L2 cleaning.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-6-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-18 09:54:16 -07:00
Eric Anholt
d223f98f02 drm/v3d: Add support for compute shader dispatch.
The compute shader dispatch interface is pretty simple -- just pass in
the regs that userspace has passed us, with no CLs to run.  However,
with no CL to run it means that we need to do manual cache flushing of
the L2 after the HW execution completes (for SSBO, atomic, and
image_load_store writes that are the output of compute shaders).

This doesn't yet expose the L2 cache's ability to have a region of the
address space not write back to memory (which could be used for
shared_var storage).

So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing
the ES31 tests), and on the kernel side on 7278 (failing atomic
compswap tests in a way that doesn't reproduce on simpenrose).

v2: Fix excessive allocation for the clean_job (reported by Dan
    Carpenter).  Keep refs on jobs until clean_job is finished, to
    avoid spurious MMU errors if the output BOs are freed by userspace
    before L2 cleaning is finished.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-4-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-18 09:54:10 -07:00
Eric Anholt
a783a09ee7 drm/v3d: Refactor job management.
The CL submission had two jobs embedded in an exec struct.  When I
added TFU support, I had to replicate some of the exec stuff and some
of the job stuff.  As I went to add CSD, it became clear that actually
what was in exec should just be in the two CL jobs, and it would let
us share a lot more code between the 4 queues.

v2: Fix missing error path in TFU ioctl's bo[] allocation.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-3-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-18 09:54:07 -07:00
Eric Anholt
d4c3022a23 drm/v3d: Switch the type of job-> to reduce casting.
All consumers wanted drm_gem_object * now.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-2-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-18 09:53:56 -07:00
Eric Anholt
3f0b646e1a drm/v3d: Rename the fence signaled from IRQs to "irq_fence".
We have another thing called the "done fence" that tracks when the
scheduler considers the job done, and having the shared name was
confusing.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190313235211.28995-2-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>
2019-04-01 10:44:34 -07:00
Eric Anholt
40609d4820 drm/v3d: Use the new shmem helpers to reduce driver boilerplate.
The new shmem helpers from Noralf and Rob abstract out a bunch of our
BO creation and mapping code.

v2: Use the new sgt getter, and flag pages as dirty before freeing.
v3: Remove the mismatched put_pages.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314163451.13431-1-eric@anholt.net
Reviewed-by: Rob Herring <robh@kernel.org> (v2)
2019-03-14 12:06:44 -07:00
Eric Anholt
a83e47e421 drm/v3d: Remove some dead members of struct v3d_bo.
vmas was from the previous model of page table management (one per
fd), and vaddr was left over from vc4.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308161716.2466-4-eric@anholt.net
Acked-by: Rob Herring <robh@kernel.org>
2019-03-14 09:22:58 -07:00
Eric Anholt
eea9b97b45 drm/v3d: Add support for V3D v4.2.
No compatible string for it yet, just the version-dependent changes.
They've now tied the hub and the core interrupt lines into a single
interrupt line coming out of the block.  It also turns out I made a
mistake in modeling the V3D v3.3 and v4.1 bridge as a part of V3D
itself -- the bridge is going away in favor of an external reset
controller in a larger HW module.

v2: Use consistent checks for whether we're on 4.2, and fix a leak in
    an error path.
v3: Use more general means of determining if the current 4.2 changes
    are in place, as apparently other platforms may switch back (noted
    by Dave).  Update the binding doc.
v4: Improve error handling for IRQ init.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308174336.7866-2-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>
2019-03-08 15:09:56 -08:00
Eric Anholt
fc22771547 drm/v3d: Handle errors from IRQ setup.
Noted in review by Dave Emett for V3D 4.2 support.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308174336.7866-1-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>
2019-03-08 15:09:42 -08:00
Rob Herring
8d66830976
drm: v3d: Switch to use drm_gem_object reservation_object
Now that the base struct drm_gem_object has a reservation_object, use it
and remove the private BO one.

Cc: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190202154158.10443-5-robh@kernel.org
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
2019-02-19 11:08:40 +01:00