2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

11309 Commits

Author SHA1 Message Date
Hawking Zhang
f66f48471b drm/amdgpu: declare firmware for new SDMA 6.0.3
To support new sdma ip block

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 18:00:54 -04:00
John Clements
773562364a drm/amdgpu: enable smu block for smu 13.0.10
Force to enable smu block for SMU v13.0.10

Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 18:00:54 -04:00
Frank Min
a60d219137 drm/amdgpu: add new ip block for PSP 13.0
Add ip block support for psp v13_0_10.

Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:59:30 -04:00
John Clements
10f8927d74 drm/amdgpu: added firmware module for psp 13.0.10
added missing firmware module

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:59:30 -04:00
Frank Min
7ab47ba22e drm/amdgpu: support psp v13_0_10 ip block
Add psp v13_0_10 ip block, initialize firmware and
psp functions

Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:59:30 -04:00
Hawking Zhang
fc968efdf0 drm/amdgpu: add new ip block for SOC21
Add ip block support for soc21_common.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:59:30 -04:00
Sonny Jiang
0f05a2e528 drm/amdgpu: Enable pg/cg flags on GC11_0_3 for VCN
This enable VCN PG, CG, DPG and JPEG PG, CG

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:59:30 -04:00
Hawking Zhang
6b46251c50 drm/amdgpu: initialize common sw config for v11_0_3
init cp/pg_flags and extenal_rev_id

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:59:30 -04:00
Hawking Zhang
6ec128c3ff drm/amdgpu: drop gc 11_0_0 golden settings
driver doesn't need to program any gc 11_0_0 golden

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:59:30 -04:00
Alex Sierra
ab23c5b9c7 drm/amdgpu: ensure no PCIe peer access for CPU XGMI iolinks
[Why] Devices with CPU XGMI iolink do not support PCIe peer access.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:59:30 -04:00
YuBiao Wang
2581c5d85e drm/amdgpu: Fix use-after-free in amdgpu_cs_ioctl
[Why]
In amdgpu_cs_ioctl, amdgpu_job_free could be performed ealier if there
is -ERESTARTSYS error. In this case, job->hw_fence could be not
initialized yet. Putting hw_fence during amdgpu_job_free could lead to a
use-after-free warning.

[How]
Check if drm_sched_job_init is performed before job_free by checking
s_fence.

v2: Check hw_fence.ops instead since it could be NULL if fence is not
initialized. Reverse the condition since !=NULL check is discouraged in
kernel.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:45:36 -04:00
Yang Yingliang
6b11af6d1c drm/amdgpu: add missing pci_disable_device() in amdgpu_pmops_runtime_resume()
Add missing pci_disable_device() if amdgpu_device_resume() fails.

Fixes: 8e4d5d43cc ("drm/amdgpu: Handling of amdgpu_device_resume return value for graceful teardown")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:44:15 -04:00
ye xingchen
1d5d194777 drm/amdgpu: Remove the unneeded result variable
Return the value sdma_v5_2_start() directly instead of storing it in
another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:43:59 -04:00
Graham Sider
2aefa9a38f drm/amdgpu: Update mes_v11_api_def.h
New GFX11 MES FW adds the trap_en bit. For now hardcode to 1 (traps
enabled).

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:43:50 -04:00
Guchun Chen
73515bbdc4 drm/amdgpu: disable FRU access on special SIENNA CICHLID card
Below driver load error will be printed, not friendly to end user.

amdgpu: ATOM BIOS: 113-D603GLXE-077
[drm] FRU: Failed to get size field
[drm:amdgpu_fru_get_product_info [amdgpu]] *ERROR* Failed to read FRU Manufacturer, ret:-5

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-29 17:43:29 -04:00
Qu Huang
58dcc22106 drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly
The mmVM_L2_CNTL3 register is not assigned an initial value

Signed-off-by: Qu Huang <jinsdb@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-25 13:35:18 -04:00
Rafael J. Wysocki
61ebd2fe6f drm: amd: amdgpu: ACPI: Add comment about ACPI_FADT_LOW_POWER_S0
According to the ACPI specification [1], the ACPI_FADT_LOW_POWER_S0
flag merely means that it is better to use low-power S0 idle on the
given platform than S3 (provided that the latter is supported) and it
doesn't preclude using either of them (which of them will be used
depends on the choices made by user space).

However, on some systems that flag is used to indicate whether or not
to enable special firmware mechanics allowing the system to save more
energy when suspended to idle.  If that flag is unset, doing so is
generally risky.

Accordingly, add a comment to explain the ACPI_FADT_LOW_POWER_S0 check
in amdgpu_acpi_is_s0ix_active(), the purpose of which is otherwise
somewhat unclear.

Link: https://uefi.org/specs/ACPI/6.4/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html#fixed-acpi-description-table-fadt # [1]
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-25 13:35:18 -04:00
Guchun Chen
638bc30f85 drm/amdgpu: use dev_info to benefit mGPU case
'free PSP TMR buffer' happens in suspend, but sometimes
in mGPU config, it mixes with PSP resume log printing from
another GPU, which is confusing. So use dev_info instead of
DRM_INFO for printing.

[drm] PSP is resuming...
[drm] reserve 0xa00000 from 0x877e000000 for PSP TMR
amdgpu 0000:e3:00.0: amdgpu: GECC is enabled
amdgpu 0000:e3:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
amdgpu 0000:e3:00.0: amdgpu: SMU is resuming...
amdgpu 0000:e3:00.0: amdgpu: smu driver if version = 0x00000040, smu fw if version = 0x00000041, smu fw program = 0, version = 0x003a5400 (58.84.0)
amdgpu 0000:e3:00.0: amdgpu: SMU driver if version not matched
amdgpu 0000:e3:00.0: amdgpu: dpm has been enabled
amdgpu 0000:e3:00.0: amdgpu: SMU is resumed successfully!
[drm] DMUB hardware initialized: version=0x02020014
[drm] free PSP TMR buffer
[drm] kiq ring mec 2 pipe 1 q 0

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-25 13:35:18 -04:00
Guchun Chen
a79f56d191 drm/amdgpu: use adev_to_drm to get drm device
adev_to_drm is used everywhere in amdgpu code, so modify
it to keep consistency.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-25 13:35:18 -04:00
Likun Gao
7201023910 drm/amdgpu: add MGCG perfmon setting for gfx11
Enable GFX11 MGCG perfmon setting.
V2: set rlc to saft mode before setting.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-25 13:35:18 -04:00
Chengming Gui
d3ef9d57f2 drm/amd/amdgpu: avoid soft reset check when gpu recovery disabled
Avoid soft reset, even ip hang check (ring/ib test) when gpu recovery
disabled.

v2: add missing "}"

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-25 13:35:18 -04:00
Mukul Joshi
f47f9b2e9c drm/amdgpu: Fix page table setup on Arcturus
When translate_further is enabled, page table depth needs to
be updated. This was missing on Arcturus MMHUB init. This was
causing address translations to fail for SDMA user-mode queues.

Fixes: 352e683b72 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-25 13:35:17 -04:00
Vignesh Chander
7c55b598b3 drm/amdgpu: skip set_topology_info for VF
Skip set_topology_info as xgmi TA will now block it
and host needs to program it.

Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com>
Reviewed-By : Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-25 13:19:23 -04:00
Tim Huang
345c0bc0a3 drm/amdgpu: add sdma instance check for gfx11 CGCG
For some ASICs, like GFX IP v11.0.1, only have one SDMA instance,
so not need to configure SDMA1_RLC_CGCG_CTRL for this case.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-25 13:19:04 -04:00
Hans de Goede
da11ef8329 drm/amdgpu: Don't register backlight when another backlight should be used (v3)
Before this commit when we want userspace to use the acpi_video backlight
device we register both the GPU's native backlight device and acpi_video's
firmware acpi_video# backlight device. This relies on userspace preferring
firmware type backlight devices over native ones.

Registering 2 backlight devices for a single display really is
undesirable, don't register the GPU's native backlight device when
another backlight device should be used.

Changes in v2:
- To avoid linker errors when amdgpu is builtin and video_detect.c is in
  a module, select ACPI_VIDEO and its deps if ACPI is enabled.
  When ACPI is disabled, ACPI_VIDEO is also always disabled, ensuring
  the stubs from acpi/video.h will be used.

Changes in v3:
- Use drm_info(drm_dev, "...") to log messages
- ACPI_VIDEO can now be enabled on non X86 too,
  adjust the Kconfig changes to match this.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-08-25 10:56:20 +02:00
Tim Huang
9407feacd2 drm/amdgpu: enable NBIO IP v7.7.0 Clock Gating
Enable AMD_CG_SUPPORT_BIF_MGCG and AMD_CG_SUPPORT_BIF_LS support.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-22 16:47:28 -04:00
Tim Huang
c4d0d69999 drm/amdgpu: add NBIO IP v7.7.0 Clock Gating support
Add BIF Clock Gating MGCG and LS support for NBIO IP v7.7.0.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-22 16:47:23 -04:00
shaoyunl
947f63f17e drm/amdgpu: Remove the additional kfd pre reset call for sriov
The additional call is caused by merge conflict

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-22 16:47:09 -04:00
Candice Li
4bb5fed169 drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup.
No need to set up rb when no gfx rings.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-22 16:47:09 -04:00
YiPeng Chai
9dfa4860ef drm/amdgpu: fix hive reference leak when adding xgmi device
Only amdgpu_get_xgmi_hive but no amdgpu_put_xgmi_hive
which will leak the hive reference.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-22 16:47:09 -04:00
YiPeng Chai
d8adafc7fe drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini
V1:
The amdgpu_xgmi_remove_device function will send unload command
to psp through psp ring to terminate xgmi, but psp ring has been
destroyed in psp_hw_fini.

V2:
1. Change the commit title.
2. Restore amdgpu_xgmi_remove_device to its original calling location.
   Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to
   psp_hw_fini.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-22 16:47:09 -04:00
Tim Huang
95a72fb73c drm/amdgpu: enable GFXOFF allow control for GC IP v11.0.1
Enable GFXOFF allow control when set the GFX power gating.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-22 16:47:08 -04:00
Arunpravin Paneer Selvam
6d3c900c12 drm/ttm: Switch to using the new res callback
Apply new intersect and compatible callback instead
of having a generic placement range verfications.

v2: Added a separate callback for compatiblilty
    checks (Christian)
v3: Cleanups and removal of workarounds

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220820073304.178444-6-Arunpravin.PaneerSelvam@amd.com
2022-08-22 15:36:29 +02:00
Arunpravin Paneer Selvam
ded910f368 drm/amdgpu: Implement intersect/compatible functions
Implemented a new intersect and compatible callback function
fetching start offset from backend drm buddy allocator.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220820073304.178444-3-Arunpravin.PaneerSelvam@amd.com
2022-08-22 15:35:22 +02:00
Dave Airlie
b1fb6b87ed Merge tag 'amd-drm-fixes-6.0-2022-08-17' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.0-2022-08-17:

amdgpu:
- Revert some DML stack changes
- Rounding fixes in KFD allocations
- atombios vram info table parsing fix
- DCN 3.1.4 fixes
- Clockgating fixes for various new IPs
- SMU 13.0.4 fixes
- DCN 3.1.4 FP fixes
- TMDS fixes for YCbCr420 4k modes
- DCN 3.2.x fixes
- USB 4 fixes
- SMU 13.0 fixes
- SMU driver unload memory leak fixes
- Display orientation fix
- Regression fix for generic fbdev conversion
- SDMA 6.x fixes
- SR-IOV fixes
- IH 6.x fixes
- Use after free fix in bo list handling
- Revert pipe1 support
- XGMI hive reset fix

amdkfd:
- Fix potential crach in kfd_create_indirect_link_prop()

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220818025206.6463-1-alexander.deucher@amd.com
2022-08-19 09:45:22 +10:00
André Almeida
a021e2aa4d drm/amdgpu: Document gfx_off members of struct amdgpu_gfx
Add comments to document gfx_off related members of struct amdgpu_gfx.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:17:32 -04:00
André Almeida
0ad7347a64 drm/amd: Add detailed GFXOFF stats to debugfs
Add debugfs interface to log GFXOFF statistics:

- Read amdgpu_gfxoff_count to get the total GFXOFF entry count at the
  time of query since system power-up

- Write 1 to amdgpu_gfxoff_residency to start logging, and 0 to stop.
  Read it to get average GFXOFF residency % multiplied by 100
  during the last logging interval.

Both features are designed to be keep the values persistent between
suspends.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:17:31 -04:00
shaoyunl
1d32af4fac drm/amdgpu: use sjt mec fw on aldebaran for sriov
The second jump table is required on live migration or mulitple VF
configuration on Aldebaran. With this implemented, the first level
jump table(hw used) will be same, mec fw internal will use the
second level jump table jump to the real functionality implementation.
so the different VF can load different version of MEC as long as
they support sjt

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:17:31 -04:00
Michel Dänzer
085292c3d7 Revert "drm/amd/amdgpu: add pipe1 hardware support"
This reverts commit 4c7631800e.

Triggered GFX hangs with GNOME Wayland on Navi 21.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2117
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:31 -04:00
Maíra Canal
bbca24d0a3 drm/amdgpu: Fix use-after-free on amdgpu_bo_list mutex
If amdgpu_cs_vm_handling returns r != 0, then it will unlock the
bo_list_mutex inside the function amdgpu_cs_vm_handling and again on
amdgpu_cs_parser_fini. This problem results in the following
use-after-free problem:

[ 220.280990] ------------[ cut here ]------------
[ 220.281000] refcount_t: underflow; use-after-free.
[ 220.281019] WARNING: CPU: 1 PID: 3746 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
[ 220.281029] ------------[ cut here ]------------
[ 220.281415] CPU: 1 PID: 3746 Comm: chrome:cs0 Tainted: G W L ------- --- 5.20.0-0.rc0.20220812git7ebfc85e2cd7.10.fc38.x86_64 #1
[ 220.281421] Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 4403 04/27/2022
[ 220.281426] RIP: 0010:refcount_warn_saturate+0xba/0x110
[ 220.281431] Code: 01 01 e8 79 4a 6f 00 0f 0b e9 42 47 a5 00 80 3d de
7e be 01 00 75 85 48 c7 c7 f8 98 8e 98 c6 05 ce 7e be 01 01 e8 56 4a
6f 00 <0f> 0b e9 1f 47 a5 00 80 3d b9 7e be 01 00 0f 85 5e ff ff ff 48
c7
[ 220.281437] RSP: 0018:ffffb4b0d18d7a80 EFLAGS: 00010282
[ 220.281443] RAX: 0000000000000026 RBX: 0000000000000003 RCX: 0000000000000000
[ 220.281448] RDX: 0000000000000001 RSI: ffffffff988d06dc RDI: 00000000ffffffff
[ 220.281452] RBP: 00000000ffffffff R08: 0000000000000000 R09: ffffb4b0d18d7930
[ 220.281457] R10: 0000000000000003 R11: ffffa0672e2fffe8 R12: ffffa058ca360400
[ 220.281461] R13: ffffa05846c50a18 R14: 00000000fffffe00 R15: 0000000000000003
[ 220.281465] FS: 00007f82683e06c0(0000) GS:ffffa066e2e00000(0000) knlGS:0000000000000000
[ 220.281470] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 220.281475] CR2: 00003590005cc000 CR3: 00000001fca46000 CR4: 0000000000350ee0
[ 220.281480] Call Trace:
[ 220.281485] <TASK>
[ 220.281490] amdgpu_cs_ioctl+0x4e2/0x2070 [amdgpu]
[ 220.281806] ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu]
[ 220.282028] drm_ioctl_kernel+0xa4/0x150
[ 220.282043] drm_ioctl+0x21f/0x420
[ 220.282053] ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu]
[ 220.282275] ? lock_release+0x14f/0x460
[ 220.282282] ? _raw_spin_unlock_irqrestore+0x30/0x60
[ 220.282290] ? _raw_spin_unlock_irqrestore+0x30/0x60
[ 220.282297] ? lockdep_hardirqs_on+0x7d/0x100
[ 220.282305] ? _raw_spin_unlock_irqrestore+0x40/0x60
[ 220.282317] amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[ 220.282534] __x64_sys_ioctl+0x90/0xd0
[ 220.282545] do_syscall_64+0x5b/0x80
[ 220.282551] ? futex_wake+0x6c/0x150
[ 220.282568] ? lock_is_held_type+0xe8/0x140
[ 220.282580] ? do_syscall_64+0x67/0x80
[ 220.282585] ? lockdep_hardirqs_on+0x7d/0x100
[ 220.282592] ? do_syscall_64+0x67/0x80
[ 220.282597] ? do_syscall_64+0x67/0x80
[ 220.282602] ? lockdep_hardirqs_on+0x7d/0x100
[ 220.282609] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 220.282616] RIP: 0033:0x7f8282a4f8bf
[ 220.282639] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10
00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00
0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00
00
[ 220.282644] RSP: 002b:00007f82683df410 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 220.282651] RAX: ffffffffffffffda RBX: 00007f82683df588 RCX: 00007f8282a4f8bf
[ 220.282655] RDX: 00007f82683df4d0 RSI: 00000000c0186444 RDI: 0000000000000018
[ 220.282659] RBP: 00007f82683df4d0 R08: 00007f82683df5e0 R09: 00007f82683df4b0
[ 220.282663] R10: 00001d04000a0600 R11: 0000000000000246 R12: 00000000c0186444
[ 220.282667] R13: 0000000000000018 R14: 00007f82683df588 R15: 0000000000000003
[ 220.282689] </TASK>
[ 220.282693] irq event stamp: 6232311
[ 220.282697] hardirqs last enabled at (6232319): [<ffffffff9718cd7e>] __up_console_sem+0x5e/0x70
[ 220.282704] hardirqs last disabled at (6232326): [<ffffffff9718cd63>] __up_console_sem+0x43/0x70
[ 220.282709] softirqs last enabled at (6232072): [<ffffffff970ff669>] __irq_exit_rcu+0xf9/0x170
[ 220.282716] softirqs last disabled at (6232061): [<ffffffff970ff669>] __irq_exit_rcu+0xf9/0x170
[ 220.282722] ---[ end trace 0000000000000000 ]---

Therefore, remove the mutex_unlock from the amdgpu_cs_vm_handling
function, so that amdgpu_cs_submit and amdgpu_cs_parser_fini can handle
the unlock.

Fixes: 90af0ca047 ("drm/amdgpu: Protect the amdgpu_bo_list list with a mutex v2")
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:31 -04:00
Mukul Joshi
de8341ee3c drm/amdgpu: Fix interrupt handling on ih_soft ring
There are no backing hardware registers for ih_soft ring.
As a result, don't try to access hardware registers for read
and write pointers when processing interrupts on the IH soft
ring.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:31 -04:00
Victor Zhao
194eb174cb drm/amdgpu: reduce reset time
In multi container use case, reset time is important, so skip ring
tests and cp halt wait during ip suspending for reset as they are
going to fail and cost more time on reset

v2: add a hang flag to indicate the reset comes from a job timeout,
skip ring test and cp halt wait in this case

v3: move hang flag to adev

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:31 -04:00
Victor Zhao
72fadb1367 drm/amdgpu: revert context to stop engine before mode2 reset
For some hang caused by slow tests, engine cannot be stopped which
may cause resume failure after reset. In this case, force halt
engine by reverting context addresses

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:31 -04:00
Victor Zhao
bfaced6ee7 drm/amdgpu: save and restore gc hub regs
Save and restore gfxhub regs as they will be reset during mode 2

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:31 -04:00
Victor Zhao
5bd8d53f6f drm/amdgpu: add debugfs amdgpu_reset_level
Introduce amdgpu_reset_level debugfs in order to help debug and
test specific type of reset. Also helps blocking unwanted type of
resets.

By default, mode2 reset will not be enabled

v2: make this debugfs in adev and use debugfs_create_u32

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:31 -04:00
Victor Zhao
dac6b80818 drm/amdgpu: let mode2 reset fallback to default when failure
- introduce AMDGPU_SKIP_MODE2_RESET flag
- let mode2 reset fallback to default reset method if failed

v2: move this part out from the asic specific part

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:31 -04:00
Victor Zhao
672c0218e3 drm/amdgpu: add mode2 reset for sienna_cichlid
To meet the requirement for multi container usecase which needs
a quicker reset and not causing VRAM lost, adding the Mode2
reset handler for sienna_cichlid.

v2: move skip mode2 flag part separately

v3: remove the use of asic_reset_res

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:31 -04:00
Shane Xiao
e42dfa66d5 drm/amdgpu: Add secure display TA load for Renoir
Add secure display TA load for Renoir

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:14:30 -04:00
Khalid Masum
385bf5a856 drm/amdgpu/vcn: Return void from the stop_dbg_mode
There is no point in returning an int here. It only returns 0 which
the caller never uses. Therefore return void and remove the unnecessary
assignment.

Addresses-Coverity: 1504988 ("Unused value")
Fixes: 8da1170a16 ("drm/amdgpu: add VCN4 ip block support")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Suggested-by: Ruijing Dong <ruijing.dong@amd.com>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:10:10 -04:00
Andrey Strachuk
bf7f7efbe0 drm/amdgpu: remove useless condition in amdgpu_job_stop_all_jobs_on_sched()
Local variable 'rq' is initialized by an address
of field of drm_sched_job, so it does not make
sense to compare 'rq' with NULL.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Andrey Strachuk <strochuk@ispras.ru>
Fixes: 7c6e68c777 ("drm/amdgpu: Avoid HW GPU reset for RAS.")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:09:47 -04:00
Harish Kasiviswanathan
1af9add1f1 drm/amdgpu: Add decode_iv_ts helper for ih_v6 block
Was missing.  Add it.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:09:32 -04:00
Dusica Milinkovic
373008bfc9 drm/amdgpu: Increase tlb flush timeout for sriov
[Why]
During multi-vf executing benchmark (Luxmark) observed kiq error timeout.
It happenes because all of VFs do the tlb invalidation at the same time.
Although each VF has the invalidate register set, from hardware side
the invalidate requests are queue to execute.

[How]
In case of 12 VF increase timeout on 12*100ms

Signed-off-by: Dusica Milinkovic <Dusica.Milinkovic@amd.com>
Acked-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:08:01 -04:00
Tim Huang
fa0bbd3be9 drm/amdgpu: enable IH Clock Gating for OSS IP v6.0.1
Enable AMD_CG_SUPPORT_IH_CG support.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:06:54 -04:00
Tim Huang
8e78c7c4fe drm/amdgpu: enable ATHUB IP v3.0.1 Clock Gating
Enable AMD_CG_SUPPORT_ATHUB_MGCG and AMD_CG_SUPPORT_ATHUB_LS support.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:06:18 -04:00
Tim Huang
7e4a77de08 drm/amdgpu: enable HDP IP v5.2.1 Clock Gating
Enable AMD_CG_SUPPORT_HDP_MGCG and AMD_CG_SUPPORT_HDP_LS support.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:06:11 -04:00
Tim Huang
adcd15dc47 drm/amdgpu: enable MMHUB IP v3.0.1 Clock Gating
Enable AMD_CG_SUPPORT_MC_MGCG and AMD_CG_SUPPORT_MC_LS support.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:06:06 -04:00
Tim Huang
cede849e9e drm/amdgpu: add ATHUB IP v3.0.1 Clock Gating support
Add ATHUB IP v3.0.1 in athub_v3_0_set_clockgating.

The regATHUB_MISC_CNTL has different offset for ATHUB IP v3.0.1,
so need to add IP version checking to use the right REG offset.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:06:00 -04:00
Tim Huang
73c49a624a drm/amdgpu: add HDP IP v5.2.1 Clock Gating support
Add set/get_clockgating for HDP IP v5.2.1.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:05:54 -04:00
Tim Huang
8172cebac5 drm/amdgpu: add MMHUB IP v3.0.1 Clock Gating support
Add set/get_clockgating for MMHUB IP v3.0.1.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:05:46 -04:00
Kenneth Feng
b4ddb27d1d drm/amd/amdgpu: add ih cg and hdp sd on smu_v13_0_7
add ih cg and hdp sd on smu_v13_0_7

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:03:10 -04:00
Evan Quan
1b586595df drm/amdgpu: disable 3DCGCG/CGLS temporarily due to stability issue
Some stability issues were reported with these features.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:02:42 -04:00
Sebin Sebastian
ad2feebd71 drm/amdgpu: double free error and freeing uninitialized null pointer
Fix a double free and an uninitialized pointer read error. Both tmp and
new are pointing at same address and both are freed which leads to
double free. Adding a check to verify if new and tmp are free in the
error_free label fixes the double free issue. new is not initialized to
null which also leads to a free on an uninitialized pointer.

Reviewed-by: André Almeida <andrealmeid@igalia.com>
Suggested by: S. Amaranath <Amaranath.Somalapuram@amd.com>
Signed-off-by: Sebin Sebastian <mailmesebin00@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 15:41:23 -04:00
Alex Deucher
a6250bdb6c drm/amdgpu: Only disable prefer_shadow on hawaii
We changed it for all asics due to a hibernation regression
on hawaii, but the workaround breaks suspend on a polaris12.
Just disable it for hawaii.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216119
Fixes: 3a4b1cc28f ("drm/amdgpu/display: disable prefer_shadow for generic fb helpers")
Reviewed-and-tested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-08-10 15:41:23 -04:00
Tim Huang
dc0a096bcc drm/amdgpu: add GFX Power Gating support for GC IP v11.0.1
Add AMD_PG_SUPPORT_GFX_PG support.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 15:41:23 -04:00
Tim Huang
cb9c7ab1b3 drm/amdgpu: enable GFX Power Gating for GC IP v11.0.1
Enable GFX Power Gating control for GC IP v11.0.1.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 15:41:23 -04:00
Rajneesh Bhardwaj
c4c10a68e8 drm/amdgpu: Avoid direct cast to amdgpu_ttm_tt
For typesafety, use container_of() instead of implicit cast from struct
ttm_tt to struct amdgpu_ttm_tt.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 15:41:23 -04:00
Joseph Greathouse
352e683b72 drm/amdgpu: Enable translate_further to extend UTCL2 reach
Enable translate_further on Arcturus and Aldebaran server chips
in order to increase the UTCL2 reach from 8 GiB to 64 GiB,
which is more in line with the amount of framebuffer DRAM in
the devices.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Kent Russell <kent.russell@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 15:41:23 -04:00
Tim Huang
fb1a140b7b drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.1
Enable GFX CG gate/ungate control.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 15:10:26 -04:00
Tim Huang
8df436d5cc drm/amdgpu: add GFX Clock Gating support for GC IP v11.0.1
Add below GFX Clock Gating supports:

1. GFX Coarse Grain Clock Gating(CGCG)
2. GFX Coarse grain light sleep/deep sleep(CGLS)
3. GFX Medium Grain Clock Gating(MGCG)
4. GFX Fine Grain Clock Gating(FGCG)
5. Repeater Fine Grain Clock Gating
6. Perfmon Clock Gating

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 15:10:19 -04:00
Ma Jun
616699d77b drm/amdgpu: Remove redundant reference of header file
Remove redundant reference of header file dev_printk.h

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 15:10:04 -04:00
Lijo Lazar
0a83bb35d8 drm/amdgpu: Avoid another list of reset devices
A list of devices to be reset is already created in
amdgpu_device_gpu_recover function. Creating another list with the
same nodes is incorrect and not supported in list_head. Instead, pass
the device list as part of reset context.

Fixes: 9e08564727 (drm/amdgpu: Refactor mode2 reset logic for v13.0.2)
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 15:07:14 -04:00
Likun Gao
4a0a2cf4c0 drm/amdgpu: change vram width algorithm for vram_info v3_0
Update the vram width algorithm for vram_info v3_0 to align with the
changes of latest IFWI.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.19.x
2022-08-10 14:59:04 -04:00
Daniel Phillips
1ac354beec drm/amdgpu: Pessimistic availability based on rounded up allocations
Separately accumulate a statistic of rounded up allocations to use
to report availability, with a view to increasing the likelihood a
buffer object can be successfully allocated at exactly the size
reported by the availability API.

Signed-off-by: Daniel Phillips <daniel.phillips@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 14:58:57 -04:00
Daniel Phillips
aec208eecf drm/amdgpu: Remove rounding from vram allocation path
Rounding up allocations in the allocation path caused test regressions,
so now just round in the availability path.

Signed-off-by: Daniel Phillips <daniel.phillips@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-10 14:58:44 -04:00
Sudip Mukherjee
f0a892f599 drm/amd/amdgpu: fix build failure due to implicit declaration
The builds for alpha and mips allmodconfig fails with the error:

  drivers/gpu/drm/amd/amdgpu/psp_v13_0.c:534:23: error: implicit declaration of function 'vmalloc'; did you mean 'kvmalloc'? [-Werror=implicit-function-declaration]
  drivers/gpu/drm/amd/amdgpu/psp_v13_0.c:534:21: error: assignment to 'void *' from 'int' makes pointer from integer without a cast [-Werror=int-conversion]
  drivers/gpu/drm/amd/amdgpu/psp_v13_0.c:545:33: error: implicit declaration of function 'vfree'; did you mean 'kvfree'? [-Werror=implicit-function-declaration]

Add the header file for vmalloc and vfree.

Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-04 12:29:10 -07:00
Thomas Zimmermann
9cf26c8968 Merge drm/drm-next into drm-misc-next
Backmerging to pick up fixes from amdgpu.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2022-08-01 16:04:00 +02:00
Xiaojian Du
7e8a3ca972 drm/amdgpu: enable support for psp 13.0.4 block
This patch will enable support for psp 13.0.4 blcok.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-29 15:24:38 -04:00
Xiaojian Du
2605e60c82 drm/amdgpu: add files for PSP 13.0.4
This patch will add files for PSP 13.0.4.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-29 15:24:38 -04:00
Yifan Zhang
a16161a869 drm/amdgpu: correct RLC_RLCS_BOOTLOAD_STATUS offset and index
This patch corrects RLC_RLCS_BOOTLOAD_STATUS offset and index for
GC 11.0.1.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-29 15:24:21 -04:00
Jonathan Kim
1ff186ff32 drm/amdgpu: fix hive reference leak when reflecting psp topology info
Hives that require psp topology info to be reflected will leak hive
reference so fix it.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:28:54 -04:00
Jack Xiao
ed67f7292b drm/amdgpu: move mes self test after drm sched re-started
mes self test rely on vm mapping, move it after
drm sched re-started so that vm mapping can work
during gpu reset.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-and-tested-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:28:54 -04:00
Evan Quan
0da0def770 drm/amdgpu: drop non-necessary call trace dump
This extra call trace dump comes out in every gpu reset.
And it gives people a wrong impression that something
went wrong. Although actually there was not.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:28:54 -04:00
Sonny Jiang
47231d5e39 drm/amdgpu: enable VCN cg and JPEG cg/pg
Not enable VCN pg because encode issue

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:28:54 -04:00
Sonny Jiang
1c0a903648 drm/amdgpu: vcn_4_0_2 video codec query
Enable support for vcn_4_0_2 video codec

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:28:54 -04:00
Sonny Jiang
cbe93a234b drm/amdgpu: add VCN_4_0_2 firmware support
Add VCN_4_0_2 firmware support

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:28:54 -04:00
Sonny Jiang
4ac77cce84 drm/amdgpu: add VCN function in NBIO v7.7
Add function to support VCN_4_0_2 doorbell

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:28:54 -04:00
Sonny Jiang
736f7308d3 drm/amdgpu: fix a vcn4 boot poll bug in emulation mode
The return value should be set in vcn4 boot poll.

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:28:54 -04:00
Chengming Gui
6fdd2077ec drm/amd/amdgpu: add memory training support for PSP_V13
Add PSP_V13 memory training support funcs.

v2: replace DRM_{DEBUG/ERROR} with dev_{dbg/err}. (Hawking)
v3: fix checkpatch error (Alex)

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:28:28 -04:00
Lang Yu
e22ec18750 drm/amdkfd: remove an unnecessary amdgpu_bo_ref
No need to reference the BO here, dmabuf framework will handle that.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:26:01 -04:00
Chengming Gui
7c8e4a2572 drm/amd/amdgpu: add additional page fault settings for gfx11
Add three additional page fault settings.

V2: move reg offset definition to header file. (Alex)
V3: add all shift/mask definitions of used reg. (Hawking)

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:19:59 -04:00
Vijendar Mukunda
b2065fb21d drm/amdgpu: fix i2s_pdata out of bound array access
Fixed following Smatch static checker warning:

    drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:393 acp_hw_init()
    error: buffer overflow 'i2s_pdata' 3 <= 3
    drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:396 acp_hw_init()
    error: buffer overflow 'i2s_pdata' 3 <= 3

Fixes: 4c33e5179f ("drm/amdgpu: create I2S platform devices for Jadeite platform")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:05:16 -04:00
Lang Yu
0dc204bc3f drm/amdkfd: fix kgd_mem memory leak when importing dmabuf
The kgd_mem memory allocated in amdgpu_amdkfd_gpuvm_import_dmabuf()
is not freed properly.

Explicitly free it in amdgpu_amdkfd_gpuvm_free_memory_of_gpu()
under condition "mem->bo->kfd_bo != mem".

Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:05:16 -04:00
Alex Sierra
3d2af401cf drm/amdgpu: add debugfs for kfd system and ttm mem used
This keeps track of kfd system mem used and kfd ttm mem used.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:05:16 -04:00
Alex Sierra
f9af3c16bf drm/amdkfd: track unified memory reservation with xnack off
[WHY]
Unified memory with xnack off should be tracked, as userptr mappings
and legacy allocations do. To avoid oversuscribe system memory when
xnack off.
[How]
Exposing functions reserve_mem_limit and unreserve_mem_limit to SVM
API and call them on every prange creation and free.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:05:16 -04:00
Philip Yang
79b2c54f19 drm/amdgpu: Allow TTM to evict svm bo from same process
To support SVM range VRAM overcommitment, TTM should be able to evict
svm bo of same process to system memory, to get space to alloc new svm
bo.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:05:14 -04:00
Roy Sun
1f83db6be3 drm/amdgpu: Fix the incomplete product number
The comments say that the product number is a 16-digit HEX string so the
buffer needs to be at least 17 characters to hold the NUL terminator. Expand
the buffer size to 20 to avoid the alignment issues.

The comment:Product number should only be 16 characters. Any
more,and something could be wrong. Cap it at 16 to be safe

Signed-off-by: Roy Sun <Roy.Sun@amd.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:05:14 -04:00
Guchun Chen
8585732baa drm/amdgpu: use adev_to_drm for consistency
Keep code consistency when accessing drm_device from amdgpu driver.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:05:14 -04:00
Danilo Krummrich
c4f306e316
drm/amdgpu: use idr_init_base() to initialize fpriv->bo_list_handles
idr_init_base(), implemented by commit 6ce711f275 ("idr: Make 1-based
IDRs more efficient"), let us set an arbitrary base other than
idr_init(), which uses base 0.

Since, for this IDR, no ID < 1 is ever requested/allocated, using
idr_init_base(&idr, 1) avoids unnecessary tree walks.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220701185303.284082-3-dakr@redhat.com
2022-07-28 15:35:55 +01:00
Danilo Krummrich
2ddd1e6ccb
drm/amdgpu: use idr_init_base() to initialize mgr->ctx_handles
idr_init_base(), implemented by commit 6ce711f275 ("idr: Make 1-based
IDRs more efficient"), let us set an arbitrary base other than
idr_init(), which uses base 0.

Since, for this IDR, no ID < 1 is ever requested, using
idr_init_base(&idr, 1) avoids unnecessary tree walks.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220701185303.284082-2-dakr@redhat.com
2022-07-28 15:35:55 +01:00
Dave Airlie
ee8b1ef9a6 Merge tag 'amd-drm-next-5.20-2022-07-26' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amdgpu:
- VCN4 fixes
- RAS support for UMC 8.10
- ACP support for jadeite platforms
- NBIO HDP flush fixes
- Misc spelling and grammar fixes
- Runtime PM fixes
- Non-DC HPD fix
- Clean up amdgpu DM code
- DSC fixes
- Expose some additional GFXOFF data via debugfs
- More FP clean up for new DCN blocks
- PPC DC FP fixes
- DCN 3.1.4 fixes
- DC DML stack usage fixes
- GMC fixes
- SPM fixes for RDNA2

amdkfd:
- MMU notifier fix
- Mutex fix

UAPI:
- Add a comment about VCN4 unified queues
- IP version information for UMDs
  Proposed mesa change: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411/diffs?commit_id=c8a63590dfd0d64e6e6a634dcfed993f135dd075

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220726181536.5759-1-alexander.deucher@amd.com
2022-07-27 09:33:45 +10:00
Thomas Zimmermann
254e5e8829 drm: Remove unnecessary include statements of drm_plane_helper.h
Remove the include statement for drm_plane_helper.h from all the files
that don't need it. Althogh the header file is almost empty, many drivers
include it somewhere.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220720083058.15371-5-tzimmermann@suse.de
2022-07-26 18:42:04 +02:00
Thomas Zimmermann
cce32e4e38 drm/atomic-helper: Remove _HELPER_ infix from DRM_PLANE_HELPER_NO_SCALING
Rename DRM_PLANE_HELPER_NO_SCALING to DRM_PLANE_NO_SCALING. The constant
is not really a helper, but rather a characteristic of the plane itself.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220720083058.15371-4-tzimmermann@suse.de
2022-07-26 18:42:00 +02:00
Aaron Liu
4c5aa59492 drm/amdgpu: enable swiotlb for gmc 11.0
Enable swiotlb for gmc 11.0.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 17:14:35 -04:00
Aaron Liu
3616d49da5 drm/amdgpu: enable swiotlb for gmc 10.0 (V2)
Enable swiotlb for gmc 10.0.
v2: include drm_cache.h to use the function ‘drm_need_swiotlb’

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 17:14:19 -04:00
Slark Xiao
9dd4545f65 drm/amd: Fix typo 'the the' in comment
Replace 'the the' with 'the' in the comment.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 09:31:05 -04:00
Rajneesh Bhardwaj
44998fbdcd drm/amdgpu: Refactor code to handle non coherent and uncached
This simplifies existing coherence handling for Arcturus and Aldabaran
to account for !coherent && uncached scenarios.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 09:31:04 -04:00
Chengming Gui
2207efdd83 drm/amd/amdgpu: add TAP_DELAYS upload support for gfx10
Support {GLOBAL/SE0/SE1/SE2/SE3}_TAP_DELAYS uploading.

v2: upload TAP_DELAYS before RLC autoload was triggered. (Hawking)

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 09:31:04 -04:00
Leo Li
792a0cdde3 drm/amd/display: Add visualconfirm module parameter
[Why]

Being able to configure visual confirm at boot or in cmdline is helpful
when debugging.

[How]

Add a module parameter to configure DC visual confirm, which works the
same way as the equivalent debugfs entry.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 09:31:03 -04:00
Alex Deucher
465576ca48 drm/amdgpu: bump driver version for IP discovery info in HW INFO
So userspace knows when it is available.

Proposed mesa patch:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411/diffs?commit_id=c8a63590dfd0d64e6e6a634dcfed993f135dd075

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 09:31:03 -04:00
Alex Deucher
af14e7c2fc drm/amdgpu: add the IP discovery IP versions for HW INFO data
Use the former pad element to store the IP versions from the
IP discovery table.  This allows userspace to get the IP
version from the kernel to better align with hardware IP
versions.

Proposed mesa patch:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411/diffs?commit_id=c8a63590dfd0d64e6e6a634dcfed993f135dd075

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 09:31:03 -04:00
Jason Wang
c19a23fadd drm/amdgpu: Fix comment typo
The double `to' is duplicated in the comment, remove one.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 09:31:03 -04:00
Roman Li
869b10ac8d drm/amdgpu: add dm ip block for dcn 3.1.4
Adding dm ip block to enable display on dcn 3.1.4.

Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 09:31:02 -04:00
André Almeida
4686177f7d drm/amd/debugfs: Expose GFXOFF state to userspace
GFXOFF has two different "state" values: one to define if the GPU is
allowed/disallowed to enter GFXOFF, usually called state; and another
one to define if currently GFXOFF is being used, usually called status.
Even when GFXOFF is allowed, GPU firmware can decide to not used it
accordingly to the GPU load.

Userspace can allow/disallow GPUs to enter into GFXOFF via debugfs. The
kernel maintains a counter of requests for GFXOFF (gfx_off_req_count)
that should be decreased to allow GFXOFF and increased to disallow.

The issue with this interface is that userspace can't be sure if GFXOFF
is currently allowed. Even by checking amdgpu_gfxoff file, one might get
an ambiguous 2, that means that GPU is currently out of GFXOFF, but that
can be either because it's currently disallowed or because it's allowed
but given the current GPU load it's enabled. Then, userspace needs to
rely on the fact that GFXOFF is enabled by default on boot and to track
this information.

To make userspace life easier and GFXOFF more reliable, return the
current state of GFXOFF to userspace when reading amdgpu_gfxoff with the
same semantics of writing: 0 means not allowed, not 0 means allowed.

Expose the current status of GFXOFF through a new file,
amdgpu_gfxoff_status.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-25 09:31:02 -04:00
Dave Airlie
cb6b81b21b Short summary of fixes pull:
* amdgpu: Fix for drm buddy memory corruption
  * nouveau: PM fixes; DP fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmLY+rcACgkQaA3BHVML
 eiNuGwf/e5tnPLnJMYbkrGy3UxqoAxA/LsJ0ZhUfznsb8yKE8FyIcINJQ3BTtjrG
 9wEiVoUfitKttPtnkFBPrhR+8VDhVOd8qi953PEO+EwLNDmfv+hkKAONyc4oMdLG
 3WucvtvZDYhvWitSH3nJQoPF/W8Ctj6WGsq5JlhQ5AvePt7Fvw7mjYRi+20mLmFS
 cMUEF+R75iW94RfK2aQ8JMz6IKhnh/nGzTdIz1qiO9lK06OAfkhaJbYyJVYt1s7d
 PkL9DcqlMjh1u5uNllTN1RK+UNNgj12DXeFWpUfLfPP1Jd4ywcGgV61tcyHw3D8G
 r34vxdiuaDuFMv8yYJMvlnQSqjp0Hg==
 =vspz
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-fixes-2022-07-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

Short summary of fixes pull:

 * amdgpu: Fix for drm buddy memory corruption
 * nouveau: PM fixes; DP fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Ytj65+PdAJs4jIEO@linux-uq9g
2022-07-22 13:44:07 +10:00
Maíra Canal
40835624ef drm/amdgpu: Write masked value to control register
On the dce_v6_0 and dce_v8_0 hpd tear down callback, the tmp variable
should be written into the control register instead of 0.

Reviewed-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-20 16:03:59 -04:00
Gavin Wan
dc2b9c70eb drm/amdgpu: fix scratch register access method in SRIOV
The scratch register should be accessed through MMIO instead of RLCG
in SRIOV, since it being used in RLCG register access function.

Fixes: d54762cc3e ("drm/amdgpu: nuke dynamic gfx scratch reg allocation")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-20 16:03:53 -04:00
Alex Sierra
86bd6706c4 drm/amdgpu: remove acc_size from reserve/unreserve mem
TTM used to track the "acc_size" of all BOs internally. We needed to
keep track of it in our memory reservation to avoid TTM running out
of memory in its own accounting. However, that "acc_size" accounting
has since been removed from TTM. Therefore we don't really need to
track it any more.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-20 16:03:49 -04:00
Luben Tuikov
5df79aeb6e drm/amdgpu: Protect the amdgpu_bo_list list with a mutex v2
Protect the struct amdgpu_bo_list with a mutex. This is used during command
submission in order to avoid buffer object corruption as recorded in
the link below.

v2 (chk): Keep the mutex looked for the whole CS to avoid using the
	  list from multiple CS threads at the same time.

Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Vitaly Prosyak <Vitaly.Prosyak@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2048
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-20 16:03:41 -04:00
Kenneth Feng
a53bc32182 drm/amd/pm: enable mode1 reset for smu_v13_0_7
enable mode1 reset for smu_v13_0_7 since it's missing.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:42:39 -04:00
Hawking Zhang
5877b7ddbc drm/amdgpu: correct the PSP_BL_CMD enum
To match with the enum defined in trusted os

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:42:39 -04:00
Guchun Chen
9c913f3803 drm/amdgpu: drop runpm from amdgpu_device structure
It's redundant, as now switching to rpm_mode to indicate
runtime power management mode.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:42:39 -04:00
Guchun Chen
75a9ad8c1b drm/amdgpu: drop runtime pm disablement quirk on several sienna cichlid cards
This quirk is not needed any more as it's fixed by bypassing
SMU FW reloading in runtime resume.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:42:39 -04:00
Guchun Chen
f746556aa9 drm/amdgpu: skip SMU FW reloading in runpm BACO case
SMU is always alive, so it's fine to skip SMU FW reloading
when runpm resumed from BACO, this can avoid some race issues
when resuming SMU.

Suggested-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:42:39 -04:00
Guchun Chen
50fe04d46a drm/amdgpu: introduce runtime pm mode
It can benefit code consistency in future.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:42:39 -04:00
André Almeida
133dc89c64 drm/amdgpu: Clarify asics naming in Kconfig options
Clarify which architecture those asics acronyms refers to.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:42:39 -04:00
Alex Deucher
958afce98c drm/amdgpu: restore original stable pstate on ctx fini
Save the original stable pstate on ctx init and restore
it on ctx fini so that we restore a manually selected
stable pstate on ctx exit.

v2: fix init order (Alex)
v3: don't add new variable to ctx struct (Evan)

Fixes: c65b364c52 ("drm/amdgpu/ctx: only reset stable pstate if the user changed it (v2)")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:42:33 -04:00
Alex Deucher
98a90f1f0f drm/amdgpu: use the same HDP flush registers for all nbio 2.3.x
Align RDNA2.x with other asics.  One HDP bit per SDMA instance,
aligned with firmware.  This is effectively a revert of
commit 369b7d04ba ("drm/amdgpu/nbio2.3: don't use GPU_HDP_FLUSH bit 12").
On further discussions with the relevant hardware teams,
re-align the bits for SDMA.

Fixes: 369b7d04ba ("drm/amdgpu/nbio2.3: don't use GPU_HDP_FLUSH bit 12")
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:42:18 -04:00
Alex Deucher
912db6a587 drm/amdgpu: use the same HDP flush registers for all nbio 7.4.x
Align aldebaran with all other asics.  One HDP bit per
SDMA instance, aligned with firmware.  This is effectively
a revert of
commit a0f9f85466 ("drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12").
On further discussions with the relevant hardware teams,
re-align the bits for SDMA.

Fixes: a0f9f85466 ("drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12")
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:41:55 -04:00
Vijendar Mukunda
4c33e5179f drm/amdgpu: create I2S platform devices for Jadeite platform
Jadeite platform uses I2S MICSP instance.
Create platform devices for DMA controller and I2S controller for
Jadeite platform.

Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:38:25 -04:00
Vijendar Mukunda
49062ee374 drm/amdgpu: add dmi check for jadeite platform
DMI check is required to distinguish Jadeite platform from
Stoney base variant.
Add DMI check logic for Jadeite platform.

Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:38:19 -04:00
Vijendar Mukunda
604d3a3f0d drm/amdgpu: fix for coding style issues
Fixed below checkpatch warnings and errors

drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:131: CHECK: Comparison to NULL could be written "apd"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:150: CHECK: Comparison to NULL could be written "apd"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:196: CHECK: Prefer kernel type 'u64' over 'uint64_t'
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:224: CHECK: Please don't use multiple blank lines
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:226: CHECK: Comparison to NULL could be written "!adev->acp.acp_genpd"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:233: CHECK: Please don't use multiple blank lines
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:239: CHECK: Alignment should match open parenthesis
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:241: CHECK: Comparison to NULL could be written "!adev->acp.acp_cell"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:247: CHECK: Comparison to NULL could be written "!adev->acp.acp_res"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:253: CHECK: Comparison to NULL could be written "!i2s_pdata"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:350: CHECK: Alignment should match open parenthesis
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:550: ERROR: that open brace { should be on the previous line

Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:38:03 -04:00
YiPeng Chai
e4b1edf48f drm/amdgpu: add umc ras functions for umc v8_10_0
1. Support query umc ras error counter.
2. Support ras umc ue error address remapping.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:37:47 -04:00
Andrey Grodzovsky
f6a3f66063 drm/amdgpu: Get rid of amdgpu_job->external_hw_fence
This is a follow-up cleanup to [1]. See bellow refcount balancing
for calling amdgpu_job_submit_direct after this cleanup as far
as I calculated.

amdgpu_fence_emit
	dma_fence_init 1
	dma_fence_get(fence) 2
	rcu_assign_pointer(*ptr, dma_fence_get(fence) 3

---> amdgpu_job_submit_direct completes before fence signaled
			amdgpu_sa_bo_free
				(*sa_bo)->fence = dma_fence_get(fence) 4

			amdgpu_job_free
				dma_fence_put 3

			amdgpu_vcn_enc_get_destroy_msg
				*fence = dma_fence_get(f) 4
				dma_fence_put(f); 3

			amdgpu_vcn_enc_ring_test_ib
				dma_fence_put(fence) 2

			amdgpu_fence_process
				dma_fence_put 1

			amdgpu_sa_bo_remove_locked
				dma_fence_put 0

---> amdgpu_job_submit_direct completes after fence signaled
			amdgpu_fence_process
				dma_fence_put 2

			amdgpu_job_free
				dma_fence_put 1

			amdgpu_vcn_enc_get_destroy_msg
				*fence = dma_fence_get(f) 2
				dma_fence_put(f); 1

			amdgpu_vcn_enc_ring_test_ib
				dma_fence_put(fence) 0

[1] - https://patchwork.kernel.org/project/dri-devel/cover/20220624180955.485440-1-andrey.grodzovsky@amd.com/

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:37:25 -04:00
Sonny Jiang
0b15205c73 drm/amdgpu: limiting AV1 to first instance on VCN4 decode
AV1 is only supported on first instance.

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 16:37:09 -04:00
Arunpravin Paneer Selvam
6f2c8d5f16 drm/amdgpu: Fix for drm buddy memory corruption
User reported gpu page fault when running graphics applications
and in some cases garbaged graphics are observed as soon as X
starts. This patch fixes all the issues.

Fixed the typecast issue for fpfn and lpfn variables, thus
preventing the overflow problem which resolves the memory
corruption.

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20220714101214.7620-1-Arunpravin.PaneerSelvam@amd.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
2022-07-15 15:41:51 +02:00
Christian König
59dad4a0d1 drm/amdgpu: re-apply "move internal vram_mgr function into the C file""
This re-applys commit 708d19d9f3.

The original problem this was reverted for was found and the correct fix
will be merged to drm-misc-next-fixes.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220714132315.587217-2-christian.koenig@amd.com
2022-07-15 13:24:38 +02:00
Christian König
55b3d6a63f drm/amdgpu: reapply "fix start calculation in amdgpu_vram_mgr_new""
This re-applys commit 5e3f1e7729.

The original problem this was reverted for was found and the correct fix
will be merged to drm-misc-next-fixes.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220714132315.587217-1-christian.koenig@amd.com
2022-07-15 13:24:01 +02:00
Dave Airlie
60693e3a38 Merge tag 'amd-drm-next-5.20-2022-07-14' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.20-2022-07-14:

amdgpu:
- DCN3.2 updates
- DC SubVP support
- DP MST fixes
- Audio fixes
- DC code cleanup
- SMU13 updates
- Adjust GART size on newer APUs for S/G display
- Soft reset for GFX 11
- Soft reset for SDMA 6
- Add gfxoff status query for vangogh
- Improve BO domain pinning
- Fix timestamps for cursor only commits
- MES fixes
- DCN 3.1.4 support
- Misc fixes
- Misc code cleanup

amdkfd:
- Simplify GPUVM validation
- Unified memory for CWSR save/restore area
- fix possible list corruption on queue failure

radeon:
- Fix bogus power of two warning

UAPI:
- Unified memory for CWSR save/restore area for KFD
  Proposed userspace: https://lists.freedesktop.org/archives/amd-gfx/2022-June/080952.html

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220714214716.8203-1-alexander.deucher@amd.com
2022-07-15 15:07:26 +10:00
Leo Li
f5ba140436 drm/amdgpu: Check BO's requested pinning domains against its preferred_domains
When pinning a buffer, we should check to see if there are any
additional restrictions imposed by bo->preferred_domains. This will
prevent the BO from being moved to an invalid domain when pinning.

For example, this can happen if the user requests to create a BO in GTT
domain for display scanout. amdgpu_dm will allow pinning to either VRAM
or GTT domains, since DCN can scanout from either or. However, in
amdgpu_bo_pin_restricted(), pinning to VRAM is preferred if there is
adequate carveout. This can lead to pinning to VRAM despite the user
requesting GTT placement for the BO.

v2: Allow the kernel to override the domain, which can happen when
    exporting a BO to a V4L camera (for example).

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-07-13 20:56:44 -04:00
Jack Xiao
af019bef6d drm/amdgpu/gfx11: add aggregated doorbell support
Port aggregated doorbell support to gfx11.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:17 -04:00
Jack Xiao
86ef6eae08 drm/amdgpu/sdma6: add aggregated doorbell support
Port aggregated doorbell support to sdma6.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:17 -04:00
Le Ma
2d7a1f7183 drm/amdgpu/mes: ring aggregatged doorbell when mes queue is unmapped
Ring aggregated doorbel to make unmapped queue scheduled in mes firmware.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:17 -04:00
Jack Xiao
b7320117b3 drm/amdgpu/mes11: initialize aggregated doorbell
Allocate and enable aggregated doorbell.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:17 -04:00
Le Ma
0fe6906203 drm/amdgpu/mes: init aggregated doorbell
Allocate and enable aggregated doorbell.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:17 -04:00
Likun Gao
f1549c09c5 drm/amdgpu: support reset flag set for gpu reset
Move reset_context out of gpu recover function to make it configurable
for different reset purpose.
For the reset way of call gpu_recovery sysfs, force to use full reset
method. Otherwise, try soft reset by default if the related ASIC
supportted, if soft reset failed, will use full reset.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:17 -04:00
Likun Gao
58e969b60d drm/amdgpu: support SDMA soft recovery for sdma v6
Support SDMA soft reset for SDMA v6.

V3: use ib test to check soft reset.
V4: squash in unused variable fix (Alex)

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:16 -04:00
Likun Gao
c0ff84cb58 drm/amdgpu: enable soft reset for gfx 11
Enable soft reset for gfx 11.
V2: enable both gfx v11.0.0 and gfx v11.0.2.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:16 -04:00
Likun Gao
a84e43b81e drm/amdgpu: support gfx soft reset for gfx v11
Support GFX soft reset for gfx v11.

V3: use ib test check soft reset.
V4: squash in unused variable fix (Alex)

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:16 -04:00
Maxime Ripard
4de395f2c6
Merge drm/drm-next into drm-misc-next
I need to have some vc4 patches merged in -rc4, but drm-misc-next is
only at -rc2 for now.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2022-07-13 10:33:00 +02:00
Jack Xiao
636774860a drm/amdgpu/mes: set correct mes ring ready flag
Set corresponding ready flag for mes ring when enable or disable
mes ring.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-12 15:33:17 -04:00