linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00

Author	SHA1	Message	Date
Thomas Zimmermann	da68386d9e	drm: Rename dp/ to display/ Rename dp/ to display/ to account for additional display-related helpers, such as HDMI. Update all related include statements. No functional changes. Various drivers, such as i915 and amdgpu, use similar naming scheme by putting code for video-output standards into a local display/ directory. The new directory's name is aligned with this convention. v2: * update commit message (Javier) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220421073108.19226-3-tzimmermann@suse.de	2022-04-25 11:17:45 +02:00
Dave Airlie	c18a2a280c	Two fixes for the raspberrypi panel initialisation, one fix for a logic inversion in radeon, a build and pm refcounting fix for vc4, two reverts for drm_of_get_bridge that caused a number of regression and a locking regression for amdgpu. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYmJqDwAKCRDj7w1vZxhR xQlAAP9N78SStxmzZ3UjjU2h4fj7JXs3y97DddJZpzyu92+d5QD7BFP8i3mKLGhq hmmYabGl58dWK+bXZRD85kOsIxv80A0= =RgD5 -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2022-04-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Two fixes for the raspberrypi panel initialisation, one fix for a logic inversion in radeon, a build and pm refcounting fix for vc4, two reverts for drm_of_get_bridge that caused a number of regression and a locking regression for amdgpu. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220422084403.2xrhf3jusdej5yo4@houat	2022-04-23 15:00:44 +10:00
David Yu	a2443ef0a8	drm/amdgpu: Ta fw needs to be loaded for SRIOV aldebaran Load ta fw during psp_init_sriov_microcode to enable XGMI. It is required to be loaded by both guest and host starting from Arcturus. Cap fw needs to be loaded first. Signed-off-by: David Yu <David.Yu@amd.com> Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:51:51 -04:00
Tao Zhou	b3c76814ce	drm/amdgpu: add RAS fatal error interrupt handler The fatal error handler is independent from general ras interrupt handler since there is no related IH ring. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:50:18 -04:00
Tao Zhou	66f8794961	drm/amdgpu: add RAS poison consumption handler (v2) Add support for general RAS poison consumption handler. v2: remove callback function for poison consumption. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:50:13 -04:00
Tao Zhou	50a7d025ca	drm/amdgpu: add RAS poison creation handler (v2) Prepare for the implementation of poison consumption handler. v2: separate umc handler from poison creation. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:50:07 -04:00
Bokun Zhang	e15c9d06e9	drm/amd/amdgpu: Update PF2VF header - In the latest version of the header, there is a variable name change. This should not cause any backward compatibility since the variable is at the same offset in the struct. Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 16:00:14 -04:00
Bokun Zhang	451913e980	drm/amd/amdgpu: Properly indent PF2VF header - Clean up the identation in the header file Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 16:00:09 -04:00
Bokun Zhang	c649287aba	drm/amd/amdgpu: Update MIT license in SRIOV msg header - Update MIT license header Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 16:00:01 -04:00
Alex Deucher	4020c22802	drm/amdgpu: don't runtime suspend if there are displays attached (v3) We normally runtime suspend when there are displays attached if they are in the DPMS off state, however, if something wakes the GPU we send a hotplug event on resume (in case any displays were connected while the GPU was in suspend) which can cause userspace to light up the displays again soon after they were turned off. Prior to commit `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."), the driver took a runtime pm reference when the fbdev emulation was enabled because we didn't implement proper shadowing support for vram access when the device was off so the device never runtime suspended when there was a console bound. Once that commit landed, we now utilize the core fb helper implementation which properly handles the emulation, so runtime pm now suspends in cases where it did not before. Ultimately, we need to sort out why runtime suspend in not working in this case for some users, but this should restore similar behavior to before. v2: move check into runtime_suspend v3: wake ups -> wakeups in comment, retain pm_runtime behavior in runtime_idle callback Fixes: `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Link: https://lore.kernel.org/r/20220403132322.51c90903@darkstar.example.org/ Tested-by: Michele Ballabio <ballabio.m@gmail.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:59:37 -04:00
Candice Li	e50d9ba0d2	drm/amdgpu: Add debugfs TA load/unload/invoke support v1: Add debugfs support to load/unload/invoke TA in runtime. v2: 1. Update some variables to static. 2. Use PAGE_ALIGN to calculate shared buf size directly. 3. Remove fp check. 4. Update debugfs from read to write. Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:58:22 -04:00
Candice Li	fe96e5636a	drm/amdgpu: Use indirect buffer and save response status for TA load/invoke The upcoming TA debugfs interface needs to use indirect buffer when performing TA invoke and check psp response status for TA load and invoke. Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:58:10 -04:00
Christian König	94f4c4965e	drm/amdgpu: partial revert "remove ctx->lock" v2 This reverts commit `461fa7b0ac`. We are missing some inter dependencies here so re-introduce the lock until we have figured out what's missing. Just drop/retake it while adding dependencies. v2: still drop the lock while adding dependencies Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> (v1) Fixes: `461fa7b0ac` ("drm/amdgpu: remove ctx->lock") Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220419110633.166236-1-christian.koenig@amd.com	2022-04-21 11:26:20 +02:00
Christian König	32c2d7a536	drm/amdgpu: remove pointless ttm_eu usage from vkms We just need to reserve one BO here, no need for using ttm_eu to reserve multiple BOs. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220419141915.122157-1-christian.koenig@amd.com	2022-04-21 11:10:37 +02:00
Zack Rusin	7212d24cec	drm/amdgpu: Use TTM builtin resource manager debugfs code Switch to using the TTM resource manager debugfs helpers. It's exactly the same functionality but the debugfs code is shared with other drivers. The TTM resource managers need to stay valid for as long as the drm debugfs_root is valid. Signed-off-by: Zack Rusin <zackr@vmware.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Nirmoy Das <nirmoy.das@amd.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20220412033526.369115-4-zack@kde.org Reviewed-by: Christian König <christian.koenig@amd.com>	2022-04-20 21:06:02 -04:00
Paul Cercueil	40f458b781	Merge drm/drm-next into drm-misc-next drm/drm-next has a build fix for the NewVision NV3052C panel (drivers/gpu/drm/panel/panel-newvision-nv3052c.c), which needs to be merged back to drm-misc-next, as it was failing to build there. Signed-off-by: Paul Cercueil <paul@crapouillou.net>	2022-04-18 20:46:55 +01:00
Gavin Wan	d68cf992de	drm/amd/amdgpu: Remove static from variable in RLCG Reg RW [why] These static variables save the RLC Scratch registers address. When we install multiple GPUs (for example: XGMI setting) and multiple GPUs call the function at same time. The RLC Scratch registers address are changed each other. Then it caused reading/writing from/to wrong GPU. [how] Removed the static from the variables. The variables are on the stack. Fixes: `5d447e2967` ("drm/amdgpu: add helper for rlcg indirect reg access") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-14 15:29:20 -04:00
xinhui pan	7c703a7d3f	drm/amdgpu: Fix one use-after-free of VM VM might already be freed when amdgpu_vm_tlb_seq_cb() is called. We see the calltrace below. Fix it by keeping the last flush fence around and wait for it to signal BUG kmalloc-4k (Not tainted): Poison overwritten 0xffff9c88630414e8-0xffff9c88630414e8 @offset=5352. First byte 0x6c instead of 0x6b Allocated in amdgpu_driver_open_kms+0x9d/0x360 [amdgpu] age=44 cpu=0 pid=2343 __slab_alloc.isra.0+0x4f/0x90 kmem_cache_alloc_trace+0x6b8/0x7a0 amdgpu_driver_open_kms+0x9d/0x360 [amdgpu] drm_file_alloc+0x222/0x3e0 [drm] drm_open+0x11d/0x410 [drm] Freed in amdgpu_driver_postclose_kms+0x3e9/0x550 [amdgpu] age=22 cpu=1 pid=2485 kfree+0x4a2/0x580 amdgpu_driver_postclose_kms+0x3e9/0x550 [amdgpu] drm_file_free+0x24e/0x3c0 [drm] drm_close_helper.isra.0+0x90/0xb0 [drm] drm_release+0x97/0x1a0 [drm] __fput+0xb6/0x280 ____fput+0xe/0x10 task_work_run+0x64/0xb0 Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-14 15:27:02 -04:00
Tomasz Moń	4593c1b6d1	drm/amdgpu: Enable gfxoff quirk on MacBook Pro Enabling gfxoff quirk results in perfectly usable graphical user interface on MacBook Pro (15-inch, 2019) with Radeon Pro Vega 20 4 GB. Without the quirk, X server is completely unusable as every few seconds there is gpu reset due to ring gfx timeout. Signed-off-by: Tomasz Moń <desowin@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-04-13 22:22:27 -04:00
Kai-Heng Feng	887f75cfd0	drm/amdgpu: Ensure HDA function is suspended before ASIC reset DP/HDMI audio on AMD PRO VII stops working after S3: [ 149.450391] amdgpu 0000:63:00.0: amdgpu: MODE1 reset [ 149.450395] amdgpu 0000:63:00.0: amdgpu: GPU mode1 reset [ 149.450494] amdgpu 0000:63:00.0: amdgpu: GPU psp mode1 reset [ 149.983693] snd_hda_intel 0000:63:00.1: refused to change power state from D0 to D3hot [ 150.003439] amdgpu 0000:63:00.0: refused to change power state from D0 to D3hot ... [ 155.432975] snd_hda_intel 0000:63:00.1: CORB reset timeout#2, CORBRP = 65535 The offending commit is `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)"). Commit `34452ac303` ("drm/amdgpu: don't use BACO for reset in S3 ") doesn't help, so the issue is something different. Assuming that to make HDA resume to D0 fully realized, it needs to be successfully put to D3 first. And this guesswork proves working, by moving amdgpu_asic_reset() to noirq callback, so it's called after HDA function is in D3. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-04-13 22:21:58 -04:00
Alex Deucher	e3cf2e0544	drm/amdgpu: fix VCN 3.1.2 firmware name Drop the trailing vcn. Fixes: `afc2f27605` ("drm/amdgpu/vcn: add vcn support for vcn 3.1.2") Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-13 22:21:33 -04:00
Yongqiang Sun	eb85fc2389	drm/amd/amdgpu: Not request init data for MS_HYPERV with vega10 MS_HYPERV with vega10 doesn't have the interface to process request init data msg. Check hypervisor type to not send the request for MS_HYPERV. Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Reviewed-by: Alice Wong <shiwei.wong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-13 09:14:22 -04:00
Dave Airlie	b85ffe47c4	drm-misc-next for 5.19: UAPI Changes: Cross-subsystem Changes: Core Changes: - atomic: Add atomic_print_state to private objects - edid: Constify the EDID parsing API, rework of the API - dma-buf: Add dma_resv_replace_fences, dma_resv_get_singleton, make dma_resv_excl_fence private - format: Support monochrome formats - fbdev: fixes for cfb_imageblit and sys_imageblit, pagelist corruption fix - selftests: several small fixes - ttm: Rework bulk move handling Driver Changes: - Switch all relevant drivers to drm_mode_copy or drm_mode_duplicate - bridge: conversions to devm_drm_of_get_bridge and panel_bridge, autosuspend for analogix_dp, audio support for it66121, DSI to DPI support for tc358767, PLL fixes and I2C support for icn6211 - bridge_connector: Enable HPD if supported - etnaviv: fencing improvements - gma500: GEM and GTT improvements, connector handling fixes - komeda: switch to plane reset helper - mediatek: MIPI DSI improvements - omapdrm: GEM improvements - panel: DT bindings fixes for st7735r, few fixes for ssd130x, new panels: ltk035c5444t, B133UAN01, NV3052C - qxl: Allow to run on arm64 - sysfb: Kconfig rework, support for VESA graphic mode selection - vc4: Add a tracepoint for CL submissions, HDMI YUV output, HDMI and clock improvements - virtio: Remove restriction of non-zero blob_flags, - vmwgfx: support for CursorMob and CursorBypass 4, various improvements and small fixes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYk6nlQAKCRDj7w1vZxhR xaTTAP0ZeeXRWIYxFfmuEAUd3H4ztvr3cx/QU/85qMXQUM4gSgD/cvQHMeucrFlX 2Bafjzl/p1tQrth0HNOkSz85dABUUws= =rJSD -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2022-04-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.19: UAPI Changes: Cross-subsystem Changes: Core Changes: - atomic: Add atomic_print_state to private objects - edid: Constify the EDID parsing API, rework of the API - dma-buf: Add dma_resv_replace_fences, dma_resv_get_singleton, make dma_resv_excl_fence private - format: Support monochrome formats - fbdev: fixes for cfb_imageblit and sys_imageblit, pagelist corruption fix - selftests: several small fixes - ttm: Rework bulk move handling Driver Changes: - Switch all relevant drivers to drm_mode_copy or drm_mode_duplicate - bridge: conversions to devm_drm_of_get_bridge and panel_bridge, autosuspend for analogix_dp, audio support for it66121, DSI to DPI support for tc358767, PLL fixes and I2C support for icn6211 - bridge_connector: Enable HPD if supported - etnaviv: fencing improvements - gma500: GEM and GTT improvements, connector handling fixes - komeda: switch to plane reset helper - mediatek: MIPI DSI improvements - omapdrm: GEM improvements - panel: DT bindings fixes for st7735r, few fixes for ssd130x, new panels: ltk035c5444t, B133UAN01, NV3052C - qxl: Allow to run on arm64 - sysfb: Kconfig rework, support for VESA graphic mode selection - vc4: Add a tracepoint for CL submissions, HDMI YUV output, HDMI and clock improvements - virtio: Remove restriction of non-zero blob_flags, - vmwgfx: support for CursorMob and CursorBypass 4, various improvements and small fixes [airlied: fixup conflict with newvision panel callbacks] Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085940.pnflvjojs4qw4b77@houat	2022-04-12 17:44:27 +10:00
Stanley.Yang	05eee31c08	drm/amdgpu: add umc query error status function In order to debug ras error, driver will print IPID/SYND/MISC0 register value if detect correctable or uncorrectable error. Provide umc_query_error_status_helper function to reduce code redundancy. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-11 13:50:35 -04:00
Grigory Vasilyev	6f90a49bc0	drm/amdgpu: Fix incorrect enum type Instead of the 'amdgpu_ring_priority_level' type, the 'amdgpu_gfx_pipe_priority' type was used, which is an error when setting ring priority. This is a minor error, but may cause problems in the future. Instead of AMDGPU_RING_PRIO_2 = 2, we can use AMDGPU_RING_PRIO_MAX = 3, but AMDGPU_RING_PRIO_2 = 2 is used for compatibility with AMDGPU_GFX_PIPE_PRIO_HIGH = 2, and not change the behavior of the code. Signed-off-by: Grigory Vasilyev <h0tc0d3@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-11 13:50:35 -04:00
Tom St Denis	dc2947b35f	drm/amd/amdgpu: Update debugfs GCA data The data revision was not changed to 5 from 4 when the CG flags were extended to 64-bits. Since this was missed I took the opportunity to add future upper 64-bits of PG flags as well so we don't need to bump it again when that comes. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-11 13:50:35 -04:00
Yongqiang Sun	d9e50239a9	drm/amd/amdgpu: Fix asm/hypervisor.h build error. Add CONFIG_X86 check to fix the build error. Fixes: `49aa98ca30` ("drm/amd/amdgpu: Only reserve vram for firmware with vega9 MS_HYPERV host.") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-11 13:50:35 -04:00
Lijo Lazar	73bce7a423	drm/amdgpu: Use flexible array member Use flexible array member in ip discovery struct as recommended[1]. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays v2: squash in struct_size fixes Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-11 13:50:35 -04:00
Evan Quan	25faeddcf3	drm/amdgpu: expand cg_flags from u32 to u64 With this, we can support more CG flags. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-08 17:24:24 -04:00
Arunpravin Paneer Selvam	c9cad937c0	drm/amdgpu: add drm buddy support to amdgpu - Switch to drm buddy allocator - Add resource cursor support for drm buddy v2(Matthew Auld): - replace spinlock with mutex as we call kmem_cache_zalloc (..., GFP_KERNEL) in drm_buddy_alloc() function - lock drm_buddy_block_trim() function as it calls mark_free/mark_split are all globally visible v3(Matthew Auld): - remove trim method error handling as we address the failure case at drm_buddy_block_trim() function v4: - fix warnings reported by kernel test robot <lkp@intel.com> v5: - fix merge conflict issue v6: - fix warnings reported by kernel test robot <lkp@intel.com> v7: - remove DRM_BUDDY_RANGE_ALLOCATION flag usage v8: - keep DRM_BUDDY_RANGE_ALLOCATION flag usage - resolve conflicts created by drm/amdgpu: remove VRAM accounting v2 v9(Christian): - merged the below patch - drm/amdgpu: move vram inline functions into a header - rename label name as fallback - move struct amdgpu_vram_mgr to amdgpu_vram_mgr.h - remove unnecessary flags from struct amdgpu_vram_reservation - rewrite block NULL check condition - change else style as per coding standard - rewrite the node max size - add a helper function to fetch the first entry from the list v10(Christian): - rename amdgpu_get_node() function name as amdgpu_vram_mgr_first_block v11: - if size is not aligned with min_page_size, enable is_contiguous flag, therefore, the size round up to the power of two and trimmed to the original size. v12: - rename the function names having prefix as amdgpu_vram_mgr_*() - modify the round_up() logic conforming to contiguous flag enablement or if size is not aligned to min_block_size - modify the trim logic - rename node as block wherever applicable Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220407224843.2416-1-Arunpravin.PaneerSelvam@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>	2022-04-08 12:58:15 +02:00
Yongqiang Sun	49aa98ca30	drm/amd/amdgpu: Only reserve vram for firmware with vega9 MS_HYPERV host. driver loading failed on VEGA10 SRIOV VF with linux host due to a wide range of stolen reserved vram. Since VEGA10 SRIOV VF need to reserve vram for firmware with windows Hyper_V host specifically, check hypervisor type to only reserve memory for it, and the range of the reserved vram can be limited to between 5M-7M area. Fixes: `faad5ccac1` ("drm/amdgpu: Add stolen reserved memory for MI25 SRIOV.") Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-07 16:38:53 -04:00
Felix Kuehling	3cd3e731f3	drm/amdkfd: Fix NULL pointer dereference Check that adev->gfx.ras is valid before using it. Fixes: `6475ae2b74` ("drm/amdgpu: add UTCL2 RAS poison query for Aldebaran (v2)") CC: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-07 16:37:39 -04:00
Tomasz Moń	9b6a1ec792	drm/amdgpu: Enable gfxoff quirk on MacBook Pro Enabling gfxoff quirk results in perfectly usable graphical user interface on MacBook Pro (15-inch, 2019) with Radeon Pro Vega 20 4 GB. Without the quirk, X server is completely unusable as every few seconds there is gpu reset due to ring gfx timeout. Signed-off-by: Tomasz Moń <desowin@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-07 16:37:24 -04:00
Kai-Heng Feng	9e051720f9	drm/amdgpu: Ensure HDA function is suspended before ASIC reset DP/HDMI audio on AMD PRO VII stops working after S3: [ 149.450391] amdgpu 0000:63:00.0: amdgpu: MODE1 reset [ 149.450395] amdgpu 0000:63:00.0: amdgpu: GPU mode1 reset [ 149.450494] amdgpu 0000:63:00.0: amdgpu: GPU psp mode1 reset [ 149.983693] snd_hda_intel 0000:63:00.1: refused to change power state from D0 to D3hot [ 150.003439] amdgpu 0000:63:00.0: refused to change power state from D0 to D3hot ... [ 155.432975] snd_hda_intel 0000:63:00.1: CORB reset timeout#2, CORBRP = 65535 The offending commit is `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)"). Commit `34452ac303` ("drm/amdgpu: don't use BACO for reset in S3 ") doesn't help, so the issue is something different. Assuming that to make HDA resume to D0 fully realized, it needs to be successfully put to D3 first. And this guesswork proves working, by moving amdgpu_asic_reset() to noirq callback, so it's called after HDA function is in D3. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-07 16:35:48 -04:00
Alex Deucher	dd48182897	drm/amdgpu: fix VCN 3.1.2 firmware name Drop the trailing vcn. Fixes: `afc2f27605` ("drm/amdgpu/vcn: add vcn support for vcn 3.1.2") Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-07 16:34:42 -04:00
Christian König	8bb3158782	drm/ttm: remove bo->moving This is now handled by the DMA-buf framework in the dma_resv obj. Also remove the workaround inside VMWGFX to update the moving fence. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-14-christian.koenig@amd.com	2022-04-07 12:53:54 +02:00
Christian König	46b35b33cc	dma-buf: wait for map to complete for static attachments We have previously done that in the individual drivers but it is more defensive to move that into the common code. Dynamic attachments should wait for map operations to complete by themselves. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-12-christian.koenig@amd.com	2022-04-07 12:53:54 +02:00
Christian König	0cc848a75b	dma-buf: add DMA_RESV_USAGE_BOOKKEEP v3 Add an usage for submissions independent of implicit sync but still interesting for memory management. v2: cleanup the kerneldoc a bit v3: separate amdgpu changes from this Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-10-christian.koenig@amd.com	2022-04-07 12:53:54 +02:00
Christian König	c35fcfa344	drm/amdgpu: use DMA_RESV_USAGE_KERNEL Wait only for kernel fences before kmap or UVD direct submission. This also makes sure that we always wait in amdgpu_bo_kmap() even when returning a cached pointer. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-6-christian.koenig@amd.com	2022-04-07 12:53:54 +02:00
Christian König	047a1b877e	dma-buf & drm/amdgpu: remove dma_resv workaround Rework the internals of the dma_resv object to allow adding more than one write fence and remember for each fence what purpose it had. This allows removing the workaround from amdgpu which used a container for this instead. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-4-christian.koenig@amd.com	2022-04-07 12:53:53 +02:00
Christian König	73511edf8b	dma-buf: specify usage while adding fences to dma_resv obj v7 Instead of distingting between shared and exclusive fences specify the fence usage while adding fences. Rework all drivers to use this interface instead and deprecate the old one. v2: some kerneldoc comments suggested by Daniel v3: fix a missing case in radeon v4: rebase on nouveau changes, fix lockdep and temporary disable warning v5: more documentation updates v6: separate internal dma_resv changes from this patch, avoids to disable warning temporary, rebase on upstream changes v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com	2022-04-07 12:53:53 +02:00
Christian König	7bc80a5462	dma-buf: add enum dma_resv_usage v4 This change adds the dma_resv_usage enum and allows us to specify why a dma_resv object is queried for its containing fences. Additional to that a dma_resv_usage_rw() helper function is added to aid retrieving the fences for a read or write userspace submission. This is then deployed to the different query functions of the dma_resv object and all of their users. When the write paratermer was previously true we now use DMA_RESV_USAGE_WRITE and DMA_RESV_USAGE_READ otherwise. v2: add KERNEL/OTHER in separate patch v3: some kerneldoc suggestions by Daniel v4: some more kerneldoc suggestions by Daniel, fix missing cases lost in the rebase pointed out by Bas. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-2-christian.koenig@amd.com	2022-04-07 12:53:53 +02:00
Ma Jun	ef1a0808a2	drm/amdgpu: Sync up header and implementation to use the same parameter names Sync up header and implementation to use the same parameter names in function amdgpu_ring_init. ring_size -> max_dw, prio -> hw_prio Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-06 12:02:57 -04:00
Ruili Ji	96f2b7a357	drm/amdgpu: fix incorrect GCR_GENERAL_CNTL address gfx10.3.3/gfx10.3.6/gfx10.3.7 shall use 0x1580 address for GCR_GENERAL_CNTL Acked-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-04-06 12:02:57 -04:00
Christian König	c8d4c18bfb	dma-buf/drivers: make reserving a shared slot mandatory v4 Audit all the users of dma_resv_add_excl_fence() and make sure they reserve a shared slot also when only trying to add an exclusive fence. This is the next step towards handling the exclusive fence like a shared one. v2: fix missed case in amdgpu v3: and two more radeon, rename function v4: add one more case to TTM, fix i915 after rebase Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com	2022-04-06 17:38:25 +02:00
tiancyin	dda81d9761	drm/amd/vcn: fix an error msg on vcn 3.0 Some video card has more than one vcn instance, passing 0 to vcn_v3_0_pause_dpg_mode is incorrect. Error msg: Register(1) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002 Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: tiancyin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-04-06 11:25:22 -04:00
Boyuan Zhang	945da79e6d	drm/amdgpu/vcn3: send smu interface type For VCN FW to detect ASIC type, in order to use different mailbox registers. V2: simplify codes and fix format issue. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-06 11:24:23 -04:00
Grigory Vasilyev	d1826081bb	drm/amdgpu: Remove leftover igp_lane_info Variable igp_lane_info always is 0. 0 & any value = 0 and false. In this way, all сonditional statements will false. The code was leftover from when the code was ported from radeon where igp_lane_info was derived from the vbios on supported platforms. [update commit message - Alex] Signed-off-by: Grigory Vasilyev <h0tc0d3@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-06 10:20:29 -04:00
Philip Yang	0f12a22f37	drm/amdgpu: Flush TLB after mapping for VG20+XGMI For VG20 + XGMI bridge, all mappings PTEs cache in TC, this may have stall invalid PTEs in TC because one cache line has 8 pages. Need always flush_tlb after updating mapping. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-05 10:29:47 -04:00
Haowen Bai	7e97de3e7f	drm/amdgpu/vcn: Remove unneeded semicolon report by coccicheck: drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:1951:2-3: Unneeded semicolon Fixes: `c543dcbe42` ("drm/amdgpu/vcn: Add VCN ras error query support") Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-05 10:26:42 -04:00
Christian König	30671b44aa	drm/amdgpu: fix TLB flushing during eviction Testing the valid bit is not enough to figure out if we need to invalidate the TLB or not. During eviction it is quite likely that we move a BO from VRAM to GTT and update the page tables immediately to the new GTT address. Rework the whole function to get all the necessary parameters directly as value. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-05 10:26:20 -04:00
Maxime Ripard	9cbbd694a5	Merge drm/drm-next into drm-misc-next Let's start the 5.19 development cycle. Signed-off-by: Maxime Ripard <maxime@cerno.tech>	2022-04-05 11:06:58 +02:00
Christian König	ba5f33cccc	drm/amdgpu: use dma_resv_get_singleton in amdgpu_pasid_free_cb Makes the code a bit more simpler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20220321135856.1331-15-christian.koenig@amd.com	2022-04-03 20:15:51 +02:00
Christian König	644704740b	drm/amdgpu: use dma_resv_for_each_fence for CS workaround v2 Get the write fence using dma_resv_for_each_fence instead of accessing it manually. v2: add TODO comment Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20220321135856.1331-9-christian.koenig@amd.com	2022-04-03 18:50:49 +02:00
Ma Jun	cf8cc382aa	drm/amdgpu: Sync up header and implementation to use the same parameter names Sync up header and implementation to use the same parameter names in function amdgpu_ring_init. ring_size -> max_dw, prio -> hw_prio Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:54 -04:00
Ruili Ji	058497e1f5	drm/amdgpu: fix incorrect GCR_GENERAL_CNTL address gfx10.3.3/gfx10.3.6/gfx10.3.7 shall use 0x1580 address for GCR_GENERAL_CNTL Acked-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:54 -04:00
Christian König	4499c90e90	drm/amdgpu: fix incorrect size printing in error msg That are bytes not pages. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:54 -04:00
Christian König	55a2d21bba	drm/amdgpu: fix some kerneldoc in the VM code v2 Fix two incorrect kerneldocs for the recent VM code changes. v2: fix one more typo Signed-off-by: Christian König <christian.koenig@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:54 -04:00
Philip Yang	44e121fbf1	drm/amdgpu: Add tlb_cb for unlocked update Flush TLB needs wait for GPU update fence done. MMU notify callback to unmap range from GPUs uses unlocked GPU page table update, so add tlb_cb to unlocked update fence to increase vm->tlb_seq. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:53 -04:00
Philip Yang	9563e1ec92	drm/amdgpu: Correct unlocked update fence handling To fix two issues with unlocked update fence: 1. vm->last_unlocked store the latest fence without taking refcount. 2. amdgpu_vm_bo_update_mapping returns old fence, not the latest fence. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-31 23:05:53 -04:00
Christian König	77ef271fae	drm/amdgpu: drop amdgpu_gtt_node We have the BO pointer in the base structure now as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Link: https://patchwork.freedesktop.org/patch/msgid/20220321132601.2161-6-christian.koenig@amd.com	2022-03-29 10:57:12 +02:00
Christian König	fee2ede155	drm/ttm: rework bulk move handling v5 Instead of providing the bulk move structure for each LRU update set this as property of the BO. This should avoid costly bulk move rebuilds with some games under RADV. v2: some name polishing, add a few more kerneldoc words. v3: add some lockdep v4: fix bugs, handle pin/unpin as well v5: improve kerneldoc Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220321132601.2161-5-christian.koenig@amd.com	2022-03-29 10:55:32 +02:00
Christian König	6a9b028994	drm/ttm: move the LRU into resource handling v4 This way we finally fix the problem that new resource are not immediately evict-able after allocation. That has caused numerous problems including OOM on GDS handling and not being able to use TTM as general resource manager. v2: stop assuming in ttm_resource_fini that res->bo is still valid. v3: cleanup kerneldoc, add more lockdep annotation v4: consistently use res->num_pages Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220321132601.2161-1-christian.koenig@amd.com	2022-03-28 20:05:32 +02:00
Mohammad Zafar Ziya	749831acb1	drm/amdgpu/jpeg: Add jpeg ras error query support RAS error query support addition for JPEG 2.6 V2: removed unused options and corrected comment format. Moved register definition to header file. V3: poison query status check added. Removed the error query support V4: Return statement refactored. Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Mohammad Zafar Ziya	c543dcbe42	drm/amdgpu/vcn: Add VCN ras error query support RAS error query support addition for VCN 2.6 V2: removed unused option and corrected comment format Moved the register definition under header file V3: poison query status check added. Removed error query interface V4: MMSCH poison check option removed, return true/false refactored. Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Mohammad Zafar Ziya	edd08fa137	drm/amdgpu/jpeg: Add jpeg block ras support Ras support addition for JPEG block V2: removed default callback Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Mohammad Zafar Ziya	60fce7417f	drm/amdgpu/vcn: Add vcn ras support VCN block ras feature support addition V2: default ras callback removed Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Mohammad Zafar Ziya	a3d63c62bd	drm/amdgpu: Add vcn and jpeg ras support flag Add vcn and jpeg ras support options V2: vcn and jpeg ras flag enabled for aldebaran asic only V3: vcn and jpeg ras flag disabled for error counter query Generic poison query interface added VCN and JPEG ras enabled based on IP version check V4: vcn and jpeg ras flag moved under ecc flag for dGPU Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
tiancyin	425d7a87e5	drm/amd/vcn: fix an error msg on vcn 3.0 Some video card has more than one vcn instance, passing 0 to vcn_v3_0_pause_dpg_mode is incorrect. Error msg: Register(1) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002 Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: tiancyin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Sean Paul	9f07550b3c	drm/amdgpu: Re-classify some log messages in commit path ATOMIC and DRIVER log categories do not typically contain per-frame log messages. This patch re-classifies some messages in amd to chattier categories to keep ATOMIC/DRIVER quiet. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:39 -04:00
Boyuan Zhang	e3026a057f	drm/amdgpu/vcn3: send smu interface type For VCN FW to detect ASIC type, in order to use different mailbox registers. V2: simplify codes and fix format issue. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-28 12:54:38 -04:00
Christian König	8f8cc3fb43	drm/amdgpu: remove table_freed param from the VM code Better to leave the decision when to flush the VM changes in the TLB to the VM code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:52 -04:00
Christian König	4d30a83c74	drm/amdkfd: use tlb_seq from the VM subsystem for SVM as well v2 Instead of hand rolling the table_freed parameter. v2: add some changes suggested by Philip Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:52 -04:00
Christian König	5255e146c9	drm/amdgpu: rework TLB flushing Instead of tracking the VM updates through the dependencies just use a sequence counter for page table updates which indicates the need to flush the TLB. This reduces the need to flush the TLB drastically. v2: squash in NULL check fix (Christian) Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:51 -04:00
Christian König	e997b82745	drm/amdgpu: simplify VM update tracking a bit Store the 64bit sequence directly. Makes it simpler to use and saves a bit of fence reference counting overhead. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Christian König	184a69ca4d	drm/amdgpu: separate VM PT handling into amdgpu_vm_pt.c Separate the VM page table backend operations from the state machine since the amdgpu_vm.c file is becoming to complex. The allocating, freeing and updating page tables and page directories can easily be moved into a separate file. While at it cleanup everything checkpatch.pl reported and rename the functions a bit to make more clear that they belong together. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Christian König	6e97c2f968	drm/amdgpu: move VM PDEs to idle after update Move the page tables to the idle list after updating the PDEs. We have gone back and forth with that a couple of times because of problems with the inter PD dependencies, but it should work now that we have the state handling cleanly separated. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Guchun Chen	f3fa490960	drm/amdgpu: drop redundant check of harvest info Harvest bit setting in IP data structure promises this, so no need to set it explicitly. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Colin Ian King	2f78f0d3e3	drm/amdgpu: Fix spelling mistake "regiser" -> "register" There is a spelling mistake in a dev_error error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Tao Zhou	6475ae2b74	drm/amdgpu: add UTCL2 RAS poison query for Aldebaran (v2) Add help functions to query and reset RAS UTCL2 poison status. v2: implement it on amdgpu side and kfd only calls it. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:26 -04:00
Prike Liang	15f9cd4334	drm/amdgpu/gfx10: enable gfx1037 clock counter retrieval function Enable gfx1037 clock counter retrieval function for KFDPerfCountersTest.ClockCountersBasicTest. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Prike Liang	0dc386add5	drm/amdgpu: set noretry for gfx 10.3.7 Disable xnack on the gfx10.3.7 for the KFD test. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Felix Kuehling	609910db56	drm/amdgpu: set noretry=1 for GFX 10.3.4 Retry faults are not supported on GFX 10.3.4. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Alex Deucher	c5b266810c	drm/amdgpu: make amdgpu_display_gem_fb_verify_and_init() static Unused outside of amdgpu_display.c. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Yifan Zhang	7057c81773	drm/amdgpu: set noretry=1 for gc 10.3.6 this patch to set noretry=1 for gc 10.3.6. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Alex Deucher	60da2f7440	drm/amdgpu: drop amdgpu_display_gem_fb_init() Unused. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Alex Deucher	5f3854f1f4	drm/amdgpu: add more cases to noretry=1 Port current list from amd-staging-drm-next. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Alex Deucher	31d5c52346	drm/amdgpu: make amdgpu_display_framebuffer_init() static It's not used outside of amdgpu_display.c. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Tianci Yin	6ea239adc2	drm/amdgpu/vcn: improve vcn dpg stop procedure Prior to disabling dpg, VCN need unpausing dpg mode, or VCN will hang in S3 resuming. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Tianci Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Tushar Patel	b7dfbd2e60	drm/amdkfd: Fix Incorrect VMIDs passed to HWS Compute-only GPUs have more than 8 VMIDs allocated to KFD. Fix this by passing correct number of VMIDs to HWS v2: squash in warning fix (Alex) Signed-off-by: Tushar Patel <tushar.patel@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:25 -04:00
Emily Deng	02fc996d50	drm/amdgpu/vcn: Fix the register setting for vcn1 Correct the code error for setting register UVD_GFX10_ADDR_CONFIG. Need to use inst_idx, or it only will set VCN0. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-03-25 12:40:24 -04:00
Lang Yu	0d8e4eb337	drm/amdgpu: add workarounds for VCN TMZ issue on CHIP_RAVEN It is a hardware issue that VCN can't handle a GTT backing stored TMZ buffer on CHIP_RAVEN series ASIC. Move such a TMZ buffer to VRAM domain before command submission as a workaround. v2: - Use patch_cs_in_place callback. v3: - Bail out early if unsecure IBs. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-03-25 12:40:24 -04:00
Alex Deucher	b818a5d374	drm/amdgpu/gmc: use PCI BARs for APUs in passthrough If the GPU is passed through to a guest VM, use the PCI BAR for CPU FB access rather than the physical address of carve out. The physical address is not valid in a guest. v2: Fix HDP handing as suggested by Michel Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:24 -04:00
Dan Carpenter	1647b54ed5	drm/amdgpu: fix off by one in amdgpu_gfx_kiq_acquire() This post-op should be a pre-op so that we do not pass -1 as the bit number to test_bit(). The current code will loop downwards from 63 to -1. After changing to a pre-op, it loops from 63 to 0. Fixes: `71c37505e7` ("drm/amdgpu/gfx: move more common KIQ code to amdgpu_gfx.c") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:24 -04:00
Guchun Chen	2d505453f3	drm/amdgpu: conduct a proper cleanup of PDB bo Use amdgpu_bo_free_kernel instead of amdgpu_bo_unref to perform a proper cleanup of PDB bo. v2: update subject to be more accurate Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:24 -04:00
Guchun Chen	32f90e6525	drm/amdgpu: prevent memory wipe in suspend/shutdown stage On GPUs with RAS enabled, below call trace is observed when suspending or shutting down device. The cause is we have enabled memory wipe flag for BOs on such GPUs by default, and such BOs will go to memory wipe by amdgpu_fill_buffer, however, because ring is off already, it fails to clean up the memory and throw this error message. So add a suspend/shutdown check before wipping memory. [drm:amdgpu_fill_buffer [amdgpu]] ERROR Trying to clear memory with ring turned off. v2: fix coding style issue Fixes: `fc6ea4bee1` ("drm/amdgpu: Wipe all VRAM on free when RAS is enabled") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-25 12:40:24 -04:00
Christian König	548e7432dc	dma-buf: add dma_resv_replace_fences v2 This function allows to replace fences from the shared fence list when we can gurantee that the operation represented by the original fence has finished or no accesses to the resources protected by the dma_resv object any more when the new fence finishes. Then use this function in the amdkfd code when BOs are unmapped from the process. v2: add an example when this is usefull. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220321135856.1331-1-christian.koenig@amd.com	2022-03-24 12:02:26 +01:00
Aurabindo Pillai	c5c948aa89	drm/amd: Add USBC connector ID [Why&How] Add a dedicated AMDGPU specific ID for use with newer ASICs that support USB-C output Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-22 10:53:39 -04:00
Ville Syrjälä	426c89aa20	drm/amdgpu: Use drm_mode_copy() struct drm_display_mode embeds a list head, so overwriting the full struct with another one will corrupt the list (if the destination mode is on a list). Use drm_mode_copy() instead which explicitly preserves the list head of the destination mode. Even if we know the destination mode is not on any list using drm_mode_copy() seems decent as it sets a good example. Bad examples of not using it might eventually get copied into code where preserving the list head actually matters. Obviously one case not covered here is when the mode itself is embedded in a larger structure and the whole structure is copied. But if we are careful when copying into modes embedded in structures I think we can be a little more reassured that bogus list heads haven't been propagated in. @is_mode_copy@ @@ drm_mode_copy(...) { ... } @depends on !is_mode_copy@ struct drm_display_mode mode; expression E, S; @@ ( - mode = E + drm_mode_copy(mode, &E) \| - memcpy(mode, E, S) + drm_mode_copy(mode, E) ) @depends on !is_mode_copy@ struct drm_display_mode mode; expression E; @@ ( - mode = E + drm_mode_copy(&mode, &E) \| - memcpy(&mode, E, S) + drm_mode_copy(&mode, E) ) @@ struct drm_display_mode mode; @@ - &mode + mode Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: amd-gfx@lists.freedesktop.org Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:12 -04:00
Paul Menzel	07d0146932	drm/amdgpu: Use ternary operator in `vcn_v1_0_start()` Remove the boilerplate of declaring a variable and using an if else statement by using the ternary operator. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:12 -04:00
Julia Lawall	58398727e6	drm/amdgpu: fix typos in comments Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:12 -04:00
Yongqiang Sun	faad5ccac1	drm/amdgpu: Add stolen reserved memory for MI25 SRIOV. MI25 SRIOV guest driver loading failed due to allocated memory overlaps with firmware reserved area. Allocate stolen reserved memory for MI25 SRIOV specifically to avoid the memory overlap. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:12 -04:00
Yongqiang Sun	3f54355284	drm/amdgpu: Merge get_reserved_allocation to get_vbios_allocations. Some ASICs need reserved memory for firmware or other components, which is not allowed to be used by driver. amdgpu_gmc_get_reserved_allocation is to handle additional areas. To avoid any missing calling, merged amdgpu_gmc_get_reserved_allocation to amdgpu_gmc_get_vbios_allocations. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:12 -04:00
Tianci Yin	4e2f50e230	drm/amdgpu/vcn: fix vcn ring test failure in igt reload test [why] On Renoir, vcn ring test failed on the second time insmod in the reload test. After investigation, it proves that vcn only can disable dpg under dpg unpause mode (dpg unpause mode is default for dec only, dpg pause mode is for dec/enc). [how] unpause dpg in dpg stopping procedure. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Tianci Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 15:01:02 -04:00
Lang Yu	8c0f11ff38	drm/amdgpu: only allow secure submission on rings which support that Only GFX ring, SDMA ring and VCN decode ring support secure submission at the moment. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:42:27 -04:00
yipechai	8476269f75	drm/amdgpu: fixed the warnings reported by kernel test robot The reported warnings are as follows: 1.warning:no-previous-prototype-for-amdgpu_hdp_ras_fini. 2.warning:no-previous-prototype-for-amdgpu_mmhub_ras_fini. Amdgpu_hdp_ras_fini and amdgpu_mmhub_ras_fini are unused in the code, they are the only functions in amdgpu_hdp.c and amdgpu_mmhub.c. After removing these two functions, both amdgpu_hdp.c and amdgpu_mmhub.c are empty, so these two files can be deleted to fix the warning. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:42:20 -04:00
Philip Yang	436afdfa35	drm/amdgpu: Move reset domain init before calling RREG32 amdgpu_detect_virtualization reads register, amdgpu_device_rreg access adev->reset_domain->sem if kernel defined CONFIG_LOCKDEP, below is the random boot hang backtrace on Vega10. It may get random NULL pointer access backtrace if amdgpu_sriov_runtime is true too. Move amdgpu_reset_create_reset_domain before calling to RREG32. BUG: kernel NULL pointer dereference, address: #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI Workqueue: events work_for_cpu_fn RIP: 0010:down_read_trylock+0x13/0xf0 Call Trace: <TASK> amdgpu_device_skip_hw_access+0x38/0x80 [amdgpu] amdgpu_device_rreg+0x1b/0x170 [amdgpu] amdgpu_detect_virtualization+0x73/0x100 [amdgpu] amdgpu_device_init.cold.60+0xbe/0x16b1 [amdgpu] ? pci_bus_read_config_word+0x43/0x70 amdgpu_driver_load_kms+0x15/0x120 [amdgpu] amdgpu_pci_probe+0x1a1/0x3a0 [amdgpu] Fixes: `d0fb18b535` ("drm/amdgpu: Move reset sem into reset_domain") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:40:49 -04:00
Tianci.Yin	72a98763b4	drm/amd: fix gfx hang on renoir in IGT reload test [why] CP hangs in igt reloading test on renoir, more precisely, hangs on the second time insmod. [how] mode2 reset can make it recover, and mode2 reset only effects gfx core, dcn and the screen will not be impacted. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:34:45 -04:00
Alex Deucher	85ac2021fe	drm/amdgpu: only check for _PR3 on dGPUs We don't support runtime pm on APUs. They support more dynamic power savings using clock and powergating. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:34:41 -04:00
Hawking Zhang	a03b288650	drm/amdgpu: drop xmgi23 error query/reset support xgmi_ras is only initialized when host to GPU interface is PCIE. in such case, xgmi23 is disabled and protected by security firmware. Host access will results to security violation Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:34:26 -04:00
Jonathan Kim	6f172ae59a	drm/amdgpu: fix aldebaran xgmi topology for vf VFs must also distinguish whether or not the TA supports full duplex or half duplex link records in order to report the correct xGMI topology. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:34:20 -04:00
Stanley.Yang	69691c8235	drm/amdgpu: message smu to update bad channel info It should notice SMU to update bad channel info when detected uncorrectable error in UMC block Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:25:16 -04:00
Lijo Lazar	bb7c3e9ce2	drm/amdgpu: Disable baco dummy mode On aldebaran, BACO dummy mode may be enabled during reset. Disable it during resume. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-15 14:25:15 -04:00
Lang Yu	96a2f0f2c8	drm/amdgpu: fix a wrong ib reference It should be p->job->ibs[j] instead of p->job->ibs[i] here. Fixes: `cdc7893fc9` ("drm/amdgpu: use job and ib structures directly in CS parsers") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-09 17:28:37 -05:00
Christian König	48e9fbd1a2	drm/amdgpu: initialize the vmid_wait with the stub fence This way we don't need to check for NULL any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Christian König	6103b2f24e	drm/amdgpu: properly embed the IBs into the job We now have standard macros for that. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Christian König	cdc7893fc9	drm/amdgpu: use job and ib structures directly in CS parsers Instead of providing the ib index provide the job and ib pointers directly to the patch and parse functions for UVD and VCE. Also move the set/get functions for IB values to the IB declerations. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Christian König	a190f8dc4a	drm/amdgpu: header cleanup No function change, just move a bunch of definitions from amdgpu.h into separate header files. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Jingwen Chen	8c7442f026	drm/amd/amdgpu: set disabled vcn to no_schduler [Why] after the reset domain introduced, the sched.ready will be init after hw_init, which will overwrite the setup in vcn hw_init, and lead to vcn ib test fail. [How] set disabled vcn to no_scheduler Fixes: `5fd8518d18` ("drm/amdgpu: Move scheduler init to after XGMI is ready") Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Christian König	d18b8eadd8	drm/amdgpu: install ctx entities with cmpxchg Since we removed the context lock we need to make sure that not two threads are trying to install an entity at the same time. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `461fa7b0ac` ("drm/amdgpu: remove ctx->lock") Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Yifan Zhang	b664a56e86	drm/amdkfd: implement get_atc_vmid_pasid_mapping_info for gfx10.3 This patch implements get_atc_vmid_pasid_mapping_info for gfx10.3 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Ruijing Dong	11eb648d01	drm/amdgpu/vcn: Add vcn firmware log vcn fwlog is for debugging purpose only, by default, it is disabled. Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Ruijing Dong	b6065ebf55	drm/amdgpu/vcn: Update fw shared data structure Add fw log in fw shared data structure. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
David Yu	811c04dbb3	drm/amdgpu: Add DFC CAP support for aldebaran Add DFC CAP support for aldebaran Initialize cap microcode in psp_init_sriov_microcode, the ta microcode will be initialized in psp_vxx_init_microcode Signed-off-by: David Yu <David.Yu@amd.com> Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:30 -05:00
Harish Kasiviswanathan	24bf9fd197	drm/amdgpu: Set correct DMA mask for aldebaran Aldebaran has 48-bit physical address support Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:29 -05:00
Lijo Lazar	9e08564727	drm/amdgpu: Refactor mode2 reset logic for v13.0.2 Use IP version and refactor reset logic to apply to a list of devices. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-04 13:03:29 -05:00
Weiguo Li	3192f1d9b6	drm/amdgpu: remove redundant null check Remove the redundant null check since the caller ensures that 'ctx' is never NULL. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Weiguo Li <liwg06@foxmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Alex Deucher	825e0af0d4	drm/amdgpu/sdma5: drop unused cyan skillfish firmware Leftover from bring up. Not used anymore. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Alex Deucher	31f5f46043	drm/amdgpu/gfx10: drop unused cyan skillfish firmware Leftover from bring up. Not used anymore. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Alex Deucher	1b537e6410	drm/amdgpu: remove unused gpu_info firmwares These were leftover from bring up and are no longer necessary. The information is available via the IP discovery table. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Alex Deucher	45a3e06be4	drm/amdgpu: Use IP versions in convert_tiling_flags_to_modifier() Rather than checking the asic_type. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Prike Liang	d7709eb6a1	drm/amdgpu: enable gfxoff routine for GC 10.3.7 Enable gfxoff routine for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:07 -05:00
Prike Liang	fabe175385	drm/amdgpu: enable gfx power gating for GC 10.3.7 Enable gfx power gating for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Prike Liang	9a1358bb2c	drm/amdgpu/nv: enable clock gating for GC 10.3.7 subblock This will enable the following block clock gating. - MC - SDMA - HDP - ATHUB - IH - VCN/JPEG Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Prike Liang	00bfab4457	drm/amdgpu: enable gfx clock gating control for GC 10.3.7 Enable gfx cg gate/ungate control for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Qiang Yu	b6901d93cc	drm/amdgpu: fix suspend/resume hang regression Regression has been reported that suspend/resume may hang with the previous vm ready check commit. So bring back the evicted list check as a temp fix. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1922 Fixes: `c1a66c3bc4` ("drm/amdgpu: check vm ready by amdgpu_vm->evicting flag") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Qiang Yu <qiang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Yifan Zha	e6fac6a9c9	drm/amdgpu: Move CAP firmware loading to the beginning of PSP firmware list [Why] As PSP needs to verify the signature, CAP firmware must be loaded first when PSP loads firmwares. Otherwise, when DFC feature is enabled, CP firmwares would be loaded failed. [ 1149.160480] [drm] MM table gpu addr = 0x800022f000, cpu addr = 00000000a62afcea. [ 1149.209874] [drm] failed to load ucode CP_CE(0x8) [ 1149.209878] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.215914] [drm] failed to load ucode CP_PFP(0x9) [ 1149.215917] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.221941] [drm] failed to load ucode CP_ME(0xA) [ 1149.221944] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.228082] [drm] failed to load ucode CP_MEC1(0xB) [ 1149.228085] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.234209] [drm] failed to load ucode CP_MEC2(0xD) [ 1149.234212] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [ 1149.242379] [drm] failed to load ucode VCN(0x1C) [ 1149.242382] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007) [How] Move CAP UCODE ID to the beginning of AMDGPU_UCODE_ID enum list. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Andrey Grodzovsky	5aa061474b	drm/amdgpu: Bump minor version for hot plug tests enabling. This will allow to enable the tests only after latest fix after which the tests passed on my system. I tested on NV21 standalone and Vega 10 and Polaris as pair with DRI_PRIME. It's possible there might be still issues on ASICs i don't have at my posession but that that the point of enbling the tests finally - if other people during testing will encounter errors they will report and I will be able to fix. The releated merge request for enabling libdrm tests suite is in https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/227 Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Andrey Grodzovsky	57230f0ce6	drm/amdgpu: Fix sigsev when accessing MMIO on hot unplug. Protect with drm_dev_enter/exit Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Yifan Zhang	7d4108e4ce	drm/amdgpu: convert code name to ip version for noretry set Use IP version rather than codename for noretry set. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
Yifan Zhang	957b0787ee	drm/amdgpu: move amdgpu_gmc_noretry_set after ip_versions populated otherwise adev->ip_versions is still empty when amdgpu_gmc_noretry_set is called. Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	80e0c2cb37	drm/amdgpu: Remove redundant .ras_fini initialization in some ras blocks 1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as .ras_fini common function, which is called when .ras_fini of ras block isn't initialized. 2. Remove the code of using amdgpu_ras_block_late_fini to initialize .ras_fini in ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	30e58102d5	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	149d7ba1f8	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	aa8e65dfc7	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in hdp ras block Remove redundant calls of amdgpu_ras_block_late_fini in hdp ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	f148c143ef	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in xgmi ras block Remove redundant calls of amdgpu_ras_block_late_fini in xgmi ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	0dca257d6d	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	f578a37d19	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in nbio ras block Remove redundant calls of amdgpu_ras_block_late_fini in nbio ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:06 -05:00
yipechai	9dad47c50f	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in mmhub ras block Remove redundant calls of amdgpu_ras_block_late_fini in mmhub ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	35366481d0	drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in gfx ras block Remove redundant calls of amdgpu_ras_block_late_fini in gfx ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	1f211a827c	drm/amdgpu: centrally calls the .ras_fini function of all ras blocks centrally calls the .ras_fini function of all ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	667c7091a3	drm/amdgpu: Optimize xxx_ras_fini function of each ras block 1. Move the variables of ras block instance members from specific xxx_ras_fini to general ras_fini call. 2. Function calls inside the modules only use parameters passed from xxx_ras_fini instead of ras block instance members. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
yipechai	01d468d9a4	drm/amdgpu: Modify .ras_fini function pointer parameter Modify .ras_fini function pointer parameter so that we can remove redundant intermediate calls in some ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Tom Rix	ca6fcfa8d4	drm/amdgpu: Fix realloc of ptr Clang static analysis reports this error amdgpu_debugfs.c:1690:9: warning: 1st function call argument is an uninitialized value tmp = krealloc_array(tmp, i + 1, ^~~~~~~~~~~~~~~~~~~~~~~~~~~ realloc uses tmp, so tmp can not be garbage. And the return needs to be checked. Fixes: `5ce5a584cb` ("drm/amdgpu: add debugfs for reset registers list") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-03-02 18:40:05 -05:00
Dave Airlie	38a15ad948	Merge tag 'amd-drm-next-5.18-2022-02-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.18-2022-02-25: amdgpu: - Raven2 suspend/resume fix - SDMA 5.2.6 updates - VCN 3.1.2 updates - SMU 13.0.5 updates - DCN 3.1.5 updates - Virtual display fixes - SMU code cleanup - Harvest fixes - Expose benchmark tests via debugfs - Drop no longer relevant gart aperture tests - More RAS restructuring - W=1 fixes - PSR rework - DP/VGA adapter fixes - DP MST fixes - GPUVM eviction fix - GPU reset debugfs register dumping support - Misc display fixes - SR-IOV fix - Aldebaran mGPU fix - Add module parameter to disable XGMI for testing amdkfd: - IH ring overflow logging fixes - CRIU fixes - Misc fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220225183535.5907-1-alexander.deucher@amd.com	2022-03-01 16:19:02 +10:00
Dave Airlie	6c64ae228f	Linux 5.17-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmIb/PEeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGugMH/R8icwG5gOjAuxBr fuz9032ECVFS36Cy3ps9Eqgf5EdS4G6yz3joM4aNtp4B7e5FzI9lzBaHS8OAguNL y7puFtBr9CywsnniJumZzciB9pEHmF/yyEKfMlRZA3JsRfLDacFstETp+duJnXoA +s49IWsy1ot5zoherhDXcFLqDoFhLVU4hYwE1xpLpW/cllqmSnW8SDZMtWW1Ui/B Of7zqINR4zBMmRjP4ymGq/ZrPWlFyWLdtOo0xxVoAQkeMgm33kfaaeGzfyK25FqR JDGUZ7lkKvwz3PYh2hqJ7dc5K+vhJ18I+F1UmOiL6QAAUF/k8jq7i3Qaf6MKqh2Z 2v+A5k0= =Hiar -----END PGP SIGNATURE----- Backmerge tag 'v5.17-rc6' into drm-next This backmerges v5.17-rc6 so I can merge some amdgpu and some tegra changes on top. Signed-off-by: Dave Airlie <airlied@redhat.com>	2022-02-28 14:57:14 +10:00
Andrey Grodzovsky	2656fd230d	drm/amdgpu: Exclude PCI reset method for now. According to my investigation of the state of PCI reset recently it's not working. The reason is due to the fact the kernel PCI code rejects SBR when there are more then one PF under same bridge which we always have (at least AUDIO PF but usually more) and that because SBR will reset all the PFS and devices under the same bridge as you and you cannot assume they support SBR. Once we anble FLR support we can reenable this option as FLR is doable on single PF and doens't have this restriction. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-24 17:25:20 -05:00
Alex Sierra	158a05a0b8	drm/amdgpu: Add use_xgmi_p2p module parameter This parameter controls xGMI p2p communication, which is enabled by default. However, it can be disabled by setting it to 0. In case xGMI p2p is disabled in a dGPU, PCIe p2p interface will be used instead. This parameter is ignored in GPUs that do not support xGMI p2p configuration. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-24 17:25:12 -05:00
Xiaogang Chen	45f0ff404c	drm/amdgpu: config HDP_MISC_CNTL.READ_BUFFER_WATERMARK To fix applications running across multiple GPU config hang. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-24 17:24:43 -05:00
Dave Airlie	54f43c17d6	drm-misc-next for v5.18: UAPI Changes: Cross-subsystem Changes: - Split out panel-lvds and lvds dt bindings . - Put yes/no on/off disabled/enabled strings in linux/string_helpers.h and use it in drivers and tomoyo. - Clarify dma_fence_chain and dma_fence_array should never include eachother. - Flatten chains in syncobj's. - Don't double add in fbdev/defio when page is already enlisted. - Don't sort deferred-I/O pages by default in fbdev. Core Changes: - Fix missing pm_runtime_put_sync in bridge. - Set modifier support to only linear fb modifier if drivers don't advertise support. - As a result, we remove allow_fb_modifiers. - Add missing clear for EDID Deep Color Modes in drm_reset_display_info. - Assorted documentation updates. - Warn once in drm_clflush if there is no arch support. - Add missing select for dp helper in drm_panel_edp. - Assorted small fixes. - Improve fb-helper's clipping handling. - Don't dump shmem mmaps in a core dump. - Add accounting to ttm resource manager, and use it in amdgpu. - Allow querying the detected eDP panel through debugfs. - Add helpers for xrgb8888 to 8 and 1 bits gray. - Improve drm's buddy allocator. - Add selftests for the buddy allocator. Driver Changes: - Add support for nomodeset to a lot of drm drivers. - Use drm_module__driver in a lot of drm drivers. - Assorted small fixes to bridge/lt9611, v3d, vc4, vmwgfx, mxsfb, nouveau, bridge/dw-hdmi, panfrost, lima, ingenic, sprd, bridge/anx7625, ti-sn65dsi86. - Add bridge/it6505. - Create DP and DVI-I connectors in ast. - Assorted nouveau backlight fixes. - Rework amdgpu reset handling. - Add dt bindings for ingenic,jz4780-dw-hdmi. - Support reading edid through aux channel in ingenic. - Add a drm driver for Solomon SSD130x OLED displays. - Add simple support for sharp LQ140M1JW46. - Add more panels to nt35560. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmIWLSEACgkQ/lWMcqZw E8OP7hAAjix94EX5fhFa7OAdqUbFtsiKhK/4zNtV9FWpFiEsDBz+dlbfDQWIx5an FIiiiQtSfWjpDv6pcMhoNf80w+dDbc/Cuauz6nNGO7Pkaerh2D/EPG74FD7f7nE3 EIScVs1heYtzM9usKrFKupNYgIdgZxmeovClWuE0OTjLOes2PGvvxXK6iJqNqJMX VlDO5SR7GRqsDUDV6vmwl63uKL77xJXAahAXIx+BQ/1xrtEhlu6NwsgHIsmPmMSN YluX34zc1xD/6/uUqvEdp7u46/5/He1c5Q/ia1WV3wRxsO/eMZ+axXqCZP3XGZdt rMdGNtj1MWKkudYiowStWkCVSG/0fXJCFIAhvRmeZy+YqPdVlqZ2W7g4H1l9iJoo UVfT9cHrKoxHsukvIEckC5Ov9v1yr39Bd4wUuqaUTUSxY8VID5vjY63TsXl9Zke1 SluTFe9qybbnRNz/hYRvwIS1eT8HvUauAfAhypGTLI5DYHTD7PawcfMJkNzCtJm4 Ta4SC3rTpkpN+7oc8SoNgqRHQ8U9KL5oksP0wVa8vwHsMptSd3X4pUljc6TcfjLv GEo41D5AuJz3HRVcn9yqPbLoPE2FFB7bfwIMH77yNnoos4Izy/LGhKpN0YdImmI5 W5XVFB0jltGSIhkzLe1mFpLrdJwdUTSUVeCK4H5PhZZQEHLkVtg= =HuwD -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2022-02-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.18: UAPI Changes: Cross-subsystem Changes: - Split out panel-lvds and lvds dt bindings . - Put yes/no on/off disabled/enabled strings in linux/string_helpers.h and use it in drivers and tomoyo. - Clarify dma_fence_chain and dma_fence_array should never include eachother. - Flatten chains in syncobj's. - Don't double add in fbdev/defio when page is already enlisted. - Don't sort deferred-I/O pages by default in fbdev. Core Changes: - Fix missing pm_runtime_put_sync in bridge. - Set modifier support to only linear fb modifier if drivers don't advertise support. - As a result, we remove allow_fb_modifiers. - Add missing clear for EDID Deep Color Modes in drm_reset_display_info. - Assorted documentation updates. - Warn once in drm_clflush if there is no arch support. - Add missing select for dp helper in drm_panel_edp. - Assorted small fixes. - Improve fb-helper's clipping handling. - Don't dump shmem mmaps in a core dump. - Add accounting to ttm resource manager, and use it in amdgpu. - Allow querying the detected eDP panel through debugfs. - Add helpers for xrgb8888 to 8 and 1 bits gray. - Improve drm's buddy allocator. - Add selftests for the buddy allocator. Driver Changes: - Add support for nomodeset to a lot of drm drivers. - Use drm_module__driver in a lot of drm drivers. - Assorted small fixes to bridge/lt9611, v3d, vc4, vmwgfx, mxsfb, nouveau, bridge/dw-hdmi, panfrost, lima, ingenic, sprd, bridge/anx7625, ti-sn65dsi86. - Add bridge/it6505. - Create DP and DVI-I connectors in ast. - Assorted nouveau backlight fixes. - Rework amdgpu reset handling. - Add dt bindings for ingenic,jz4780-dw-hdmi. - Support reading edid through aux channel in ingenic. - Add a drm driver for Solomon SSD130x OLED displays. - Add simple support for sharp LQ140M1JW46. - Add more panels to nt35560. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/686ec871-e77f-c230-22e5-9e3bb80f064a@linux.intel.com	2022-02-25 05:50:18 +10:00
Qiang Yu	c1a66c3bc4	drm/amdgpu: check vm ready by amdgpu_vm->evicting flag Workstation application ANSA/META v21.1.4 get this error dmesg when running CI test suite provided by ANSA/META: [drm:amdgpu_gem_va_ioctl [amdgpu]] ERROR Couldn't update BO_VA (-16) This is caused by: 1. create a 256MB buffer in invisible VRAM 2. CPU map the buffer and access it causes vm_fault and try to move it to visible VRAM 3. force visible VRAM space and traverse all VRAM bos to check if evicting this bo is valuable 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable() will set amdgpu_vm->evicting, but latter due to not in visible VRAM, won't really evict it so not add it to amdgpu_vm->evicted 5. before next CS to clear the amdgpu_vm->evicting, user VM ops ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted) but fail in amdgpu_vm_bo_update_mapping() (check amdgpu_vm->evicting) and get this error log This error won't affect functionality as next CS will finish the waiting VM ops. But we'd better clear the error log by checking the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling amdgpu_vm_bo_update_mapping() later. Another reason is amdgpu_vm->evicted list holds all BOs (both user buffer and page table), but only page table BOs' eviction prevent VM ops. amdgpu_vm->evicting flag is set only for page table BOs, so we should use evicting flag instead of evicted list in amdgpu_vm_ready(). The side effect of this change is: previously blocked VM op (user buffer in "evicted" list but no page table in it) gets done immediately. v2: update commit comments. Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Qiang Yu <qiang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-23 16:31:06 -05:00
Guchun Chen	e2b993302f	drm/amdgpu: bypass tiling flag check in virtual display case (v2) vkms leverages common amdgpu framebuffer creation, and also as it does not support FB modifier, there is no need to check tiling flags when initing framebuffer when virtual display is enabled. This can fix below calltrace: amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu] v2: check adev->enable_virtual_display instead as vkms can be enabled in bare metal as well. Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 16:31:06 -05:00
Guchun Chen	97c61e0b7c	Revert "drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()" This reverts commit `4046afcebf`. No need to support modifier in virtual kms, otherwise, in SRIOV mode, when lanuching X server, set crtc will fail due to mismatch between primary plane modifier and framebuffer modifier. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 16:31:06 -05:00
Chen Gong	1e2be869c8	drm/amdgpu: do not enable asic reset for raven2 The GPU reset function of raven2 is not maintained or tested, so it should be very unstable. Now the amdgpu_asic_reset function is added to amdgpu_pmops_suspend, which causes the S3 test of raven2 to fail, so the asic_reset of raven2 is ignored here. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Chen Gong <curry.gong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-23 16:31:06 -05:00
Mario Limonciello	7294863a6f	drm/amd: Check if ASPM is enabled from PCIe subsystem commit `0064b0ce85` ("drm/amd/pm: enable ASPM by default") enabled ASPM by default but a variety of hardware configurations it turns out that this caused a regression. * PPC64LE hardware does not support ASPM at a hardware level. CONFIG_PCIEASPM is often disabled on these architectures. * Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem disables it Check with the PCIe subsystem to see that ASPM has been enabled or not. Fixes: `0064b0ce85` ("drm/amd/pm: enable ASPM by default") Link: https://wiki.raptorcs.com/w/images/a/ad/P9_PHB_version1.0_27July2018_pub.pdf Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1723 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1739 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1907 Tested-by: koba.ko@canonical.com Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-23 16:31:06 -05:00
Alex Deucher	e776a755ab	drm/amdgpu: fix typo in amdgpu_discovery.c disocvery -> discovery Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Somalapuram Amaranath	15fd09a05a	drm/amdgpu: add reset register dump trace on GPU Dump the list of register values to trace event on GPU reset. Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Somalapuram Amaranath	5ce5a584cb	drm/amdgpu: add debugfs for reset registers list List of register populated for dump collection during the GPU reset. Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Qiang Yu	b74e2476ef	drm/amdgpu: check vm ready by amdgpu_vm->evicting flag Workstation application ANSA/META v21.1.4 get this error dmesg when running CI test suite provided by ANSA/META: [drm:amdgpu_gem_va_ioctl [amdgpu]] ERROR Couldn't update BO_VA (-16) This is caused by: 1. create a 256MB buffer in invisible VRAM 2. CPU map the buffer and access it causes vm_fault and try to move it to visible VRAM 3. force visible VRAM space and traverse all VRAM bos to check if evicting this bo is valuable 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable() will set amdgpu_vm->evicting, but latter due to not in visible VRAM, won't really evict it so not add it to amdgpu_vm->evicted 5. before next CS to clear the amdgpu_vm->evicting, user VM ops ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted) but fail in amdgpu_vm_bo_update_mapping() (check amdgpu_vm->evicting) and get this error log This error won't affect functionality as next CS will finish the waiting VM ops. But we'd better clear the error log by checking the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling amdgpu_vm_bo_update_mapping() later. Another reason is amdgpu_vm->evicted list holds all BOs (both user buffer and page table), but only page table BOs' eviction prevent VM ops. amdgpu_vm->evicting flag is set only for page table BOs, so we should use evicting flag instead of evicted list in amdgpu_vm_ready(). The side effect of this change is: previously blocked VM op (user buffer in "evicted" list but no page table in it) gets done immediately. v2: update commit comments. Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Qiang Yu <qiang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Prike Liang	db749b769f	drm/amdgpu/nv: set mode2 reset for MP1 13.0.8 Set mode2 reset support for MP1 13.0.8. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Prike Liang	9e148e8ce2	drm/amdgpu/nv: enable gfx10.3.7 clock gating support This will enable the following gfx clock gating. - Fine clock gating - Medium Grain clock gating - 3D Coarse clock gating - Coarse Grain clock gating - RLC/CP light sleep clock gating Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
Yifan Zhang	5043906024	drm/amdgpu: add mode2 reset support for smu 13.0.5 This patch adds mode2 reset support for smu 13.0.5. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:36 -05:00
yipechai	29c9b6cd58	drm/amdgpu: Fixed warning reported by kernel test robot Fixed warning reported by kernel test robot: 1.warning: variable 'ras_obj' is used uninitialized whenever '\|\|' condition is true. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:26:35 -05:00
Maíra Canal	78be946dad	drm/amdgpu: Remove unused get_umc_v8_7_channel_index function Remove get_umc_v8_7_channel_index function, which is not used in the codebase. This was pointed by clang with the following warning: drivers/gpu/drm/amd/amdgpu/umc_v8_7.c:50:24: warning: unused function 'get_umc_v8_7_channel_index' [-Wunused-function] static inline uint32_t get_umc_v8_7_channel_index(struct amdgpu_device *adev, ^ Signed-off-by: Maíra Canal <maira.canal@usp.br> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Maíra Canal	d41ff22a4e	drm/amdgpu: Change amdgpu_ras_block_late_init_default function scope Turn previously global function into a static function to avoid the following Clang warning: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2459:5: warning: no previous prototype for function 'amdgpu_ras_block_late_init_default' [-Wmissing-prototypes] int amdgpu_ras_block_late_init_default(struct amdgpu_device adev, ^ drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2459:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int amdgpu_ras_block_late_init_default(struct amdgpu_device adev, ^ static Signed-off-by: Maíra Canal <maira.canal@usp.br> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	4683af148f	drm/amdgpu: use ktime rather than jiffies for benchmark results To protect against wraparounds. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	5a82b01823	drm/amdgpu: use kernel BO API for benchmark buffer management Simplifies the code quite a bit. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	a7f520bfd0	drm/amdgpu: derive GTT display support from DM Rather than duplicating the logic in two places, consolidate the logic in the display manager. Acked-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	b784f42cf7	drm/amdgpu: drop testing module parameter This test is not particularly useful now that GTT and GART are decoupled in the driver. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	0b1a63487b	drm/amdgpu: drop benchmark module parameter Now that we expose the benchmarks via debugfs, there is no longer a need for the module parameter. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:51 -05:00
Alex Deucher	e7c4723103	drm/amdgpu: expose benchmarks via debugfs They provide a nice smoke test of transfer performance using SDMA. Allow the user to run these at runtime rather than only at init time. v2: fix permissions (Alex) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:50 -05:00
Alex Deucher	f113cc32e3	drm/amdgpu: add a benchmark mutex To avoid multiple runs in parallel to avoid mixing results. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:50 -05:00
Alex Deucher	b887d5f9b9	drm/amdgpu: print the selected benchmark test in the log So you can tell which benchmark was run. v2: print the test description as well Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:50 -05:00
Alex Deucher	e460f244fb	drm/amdgpu: plumb error handling though amdgpu_benchmark() So we can tell when this function fails. v2: squash in error handling fix (Alex) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-23 14:02:50 -05:00
Jiawei Gu	8ab62eda17	drm/sched: Add device pointer to drm_gpu_scheduler Add device pointer so scheduler's printing can use DRM_DEV_ERROR() instead, which makes life easier under multiple GPU scenario. v2: amend all calls of drm_sched_init() v3: fill dev pointer for all drm_sched_init() calls Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220221095705.5290-1-Jiawei.Gu@amd.com	2022-02-23 10:04:14 +01:00
Alex Deucher	091cd9c3ab	drm/amdgpu/benchmark: use dev_info rather than DRM macros for logging So we can tell which output goes to which device when multiple GPUs are present. Also while we are here, convert DRM_ERROR to dev_info. The error cases are not critical. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:52:39 -05:00
Paul Menzel	cec2cc7b1c	drm/amdgpu: Fix typo in whether in comment Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:52:39 -05:00
Guchun Chen	e1dd4bbf86	drm/amdgpu: read harvest bit per IP data on legacy GPUs Based on firmware team's input, harvest table in VBIOS does not apply well to legacy products like Navi1x, so seperate harvest mask configuration retrieve from different places. On legacy GPUs, scan harvest bit per IP data stuctures, while for newer ones, still read IP harvest info from harvest table. v2: squash in fix to limit it to specific skus (Guchun) Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:52:39 -05:00
Prike Liang	7342bf6530	drm/amdgpu: enable TMZ option for onwards asic The TMZ is disabled by default and enable TMZ option for the IP discovery based asic will help on the TMZ function verification. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:44:27 -05:00
Guchun Chen	d4a7eac27e	drm/amdgpu: bypass tiling flag check in virtual display case (v2) vkms leverages common amdgpu framebuffer creation, and also as it does not support FB modifier, there is no need to check tiling flags when initing framebuffer when virtual display is enabled. This can fix below calltrace: amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu] v2: check adev->enable_virtual_display instead as vkms can be enabled in bare metal as well. Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:44:12 -05:00
Guchun Chen	fa3e5a43ec	Revert "drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()" This reverts commit `4046afcebf`. No need to support modifier in virtual kms, otherwise, in SRIOV mode, when lanuching X server, set crtc will fail due to mismatch between primary plane modifier and framebuffer modifier. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-22 14:41:24 -05:00
Evan Quan	f626dd0ff0	drm/amdgpu: disable MMHUB PG for Picasso MMHUB PG needs to be disabled for Picasso for stability reasons. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-21 17:55:17 -05:00
Yifan Zhang	181ebed7dc	drm/amdgpu: add dm ip block for dcn 3.1.5 this patch adds dm ip block for dcn 3.1.5. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:07:01 -05:00
Yifan Zhang	068ea8bdc0	drm/amd/pm: add smu_v13_0_5_ppt implementation this patch adds smu_v13_0_5_ppt implementation. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:07:00 -05:00
Yifan Zhang	d7fd297cb0	drm/amdgpu: add support for psp 13.0.5 Enabl psp support for psp 13.0.5. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:07:00 -05:00
Yifan Zhang	ec3ca07885	drm/amdgpu: add smuio support for smuio 13.0.10 this patch adds smuio support for smuio 13.0.10. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:07:00 -05:00
Yifan Zhang	935ad3a74c	drm/amdgpu: add support for nbio 7.3.0 this patch adds support for nbio 7.3.0. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:07:00 -05:00
Boyuan Zhang	87b5e77f02	drm/amdgpu: enable vcn pg and cg for vcn 3.1.2 Enable PG and CG for VCN/JPEG Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:06:59 -05:00
Boyuan Zhang	afc2f27605	drm/amdgpu/vcn: add vcn support for vcn 3.1.2 Load VCN FW, set caps. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:06:59 -05:00
Yifan Zhang	93afe15837	drm/amdgpu: add support for sdma 5.2.6 This patch adds support for sdma 5.2.6. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:06:59 -05:00
Chen Gong	89bfcd82b3	drm/amdgpu: do not enable asic reset for raven2 The GPU reset function of raven2 is not maintained or tested, so it should be very unstable. Now the amdgpu_asic_reset function is added to amdgpu_pmops_suspend, which causes the S3 test of raven2 to fail, so the asic_reset of raven2 is ignored here. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Chen Gong <curry.gong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-18 14:06:59 -05:00
Yifan Zhang	874bfdfa47	drm/amdgpu: add gc 10.3.6 support this patch adds gc 10.3.6 support. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:06 -05:00
Yifan Zhang	a142606d54	drm/amdgpu: add support for gmc10 for gc 10.3.6 this patch adds support for gmc10. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Yifan Zhang	50e14a62ac	drm/amdgpu: add Clock and Power Gating support for gc 10.3.6 Add below supports: GFX Coarse Grain Clock Gating(CGCG) GFX Coarse grain light sleep/deep sleep(CGLS) GFX Medium Grain Clock Gating(MGCG) GFX Medium Grain light sleep/deep sleep(MGLS) GFX Fine Grain Clock Gating(FGCG) RLC MGLS CP MGLS MMHUB Clock Gating SDMA Clock Gating HDP Clock Gating ATHUB Clock Gating IH Clock Gating GFX Power Gating Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Yifan Zhang	1957f27de2	drm/amdgpu: add nv common init for gc 10.3.6 This patch adds add nv common init for gc 10.3.6. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Tom Rix	779596ce6a	drm/amdgpu: fix amdgpu_ras_block_late_init error handler Clang build fails with amdgpu_ras.c:2416:7: error: variable 'ras_obj' is used uninitialized whenever 'if' condition is true if (adev->in_suspend \|\| amdgpu_in_reset(adev)) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ amdgpu_ras.c:2453:6: note: uninitialized use occurs here if (ras_obj->ras_cb) ^~~~~~~ There is a logic error in the error handler's labels. ex/ The sysfs: is the last goto label in the normal code but is the middle of error handler. Rework the error handler. cleanup: is the first error, so it's handler should be last. interrupt: is the second error, it's handler is next. interrupt: handles the failure of amdgpu_ras_interrupt_add_hander() by calling amdgpu_ras_interrupt_remove_handler(). This is wrong, remove() assumes the interrupt has been setup, not torn down by add(). Change the goto label to cleanup. sysfs is the last error, it's handler should be first. sysfs: handles the failure of amdgpu_ras_sysfs_create() by calling amdgpu_ras_sysfs_remove(). But when the create() fails there is nothing added so there is nothing to remove. This error handler is not needed. Remove the error handler and change goto label to interrupt. Fixes: `b293e891b0` ("drm/amdgpu: add helper function to do common ras_late_init/fini (v3)") Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Luben Tuikov	6b5033831f	drm/amdgpu: Dynamically initialize IP instance attributes Dynamically initialize IP instance attributes. This eliminates bugs stemming from adding new attributes to an IP instance. Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Tom StDenis <tom.stdenis@amd.com> Fixes: `4d7ba312dd` ("drm/amdgpu: Add "harvest" to IP discovery sysfs") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Tom St Denis	8f74f68d90	drm/amd/amdgpu: Add APU flag to gca_config debugfs data (v3) Needed by umr to detect if ip discovered ASIC is an APU or not. (v2): Remove asic type from packet it's not strictly needed (v3): Correct comment Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Mario Limonciello	d01899d3db	drm/amd: Use amdgpu_device_should_use_aspm on navi umd pstate switching The `program_aspm` callback is already guarded for aspm, but the `enable_aspm` callback doesn't follow the module parameter. Update it to use the helper `amdgpu_device_should_use_aspm`. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Mario Limonciello	0ab5d711ec	drm/amd: Refactor `amdgpu_aspm` to be evaluated per device Evaluating `pcie_aspm_enabled` as part of driver probe has the implication that if one PCIe bridge with an AMD GPU connected doesn't support ASPM then none of them do. This is an invalid assumption as the PCIe core will configure ASPM for individual PCIe bridges. Create a new helper function that can be called by individual dGPUs to react to the `amdgpu_aspm` module parameter without having negative results for other dGPUs on the PCIe bus. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Luben Tuikov	f0d5409895	drm/amdgpu: Fix ARM compilation warning Fix this ARM warning: drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:664:35: warning: format '%ld' expects argument of type 'long int', but argument 4 has type 'size_t' {aka 'unsigned int'} [-Wformat=] Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: kbuild-all@lists.01.org Cc: linux-kernel@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Fixes: `a6c40b1780` ("drm/amdgpu: Show IP discovery in sysfs") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
Mario Limonciello	cba07cce39	drm/amd: Check if ASPM is enabled from PCIe subsystem commit `0064b0ce85` ("drm/amd/pm: enable ASPM by default") enabled ASPM by default but a variety of hardware configurations it turns out that this caused a regression. * PPC64LE hardware does not support ASPM at a hardware level. CONFIG_PCIEASPM is often disabled on these architectures. * Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem disables it Check with the PCIe subsystem to see that ASPM has been enabled or not. Fixes: `0064b0ce85` ("drm/amd/pm: enable ASPM by default") Link: https://wiki.raptorcs.com/w/images/a/ad/P9_PHB_version1.0_27July2018_pub.pdf Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1723 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1739 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1907 Tested-by: koba.ko@canonical.com Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
yipechai	418abce203	drm/amdgpu: Remove redundant .ras_late_init initialization in some ras blocks 1. Define amdgpu_ras_block_late_init_default in amdgpu_ras.c as .ras_late_init common function, which is called when .ras_late_init of ras block isn't initialized. 2. Remove the code of using amdgpu_ras_block_late_init to initialize .ras_late_init in ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
yipechai	867e24ca49	drm/amdgpu: define amdgpu_ras_late_init to call all ras blocks' .ras_late_init Define amdgpu_ras_late_init to call all ras blocks' .ras_late_init. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
yipechai	caae42f009	drm/amdgpu: Optimize xxx_ras_late_init function of each ras block 1. Move calling ras block instance members from module internal function to the top calling xxx_ras_late_init. 2. Module internal function calls can only use parameter variables of xxx_ras_late_init instead of ras block instance members. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
yipechai	20c43547ad	drm/amdgpu: Remove redundant calls of ras_late_init in mca ras block Remove redundant calls of ras_late_init in mca ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:05 -05:00
yipechai	068001b711	drm/amdgpu: Remove redundant calls of ras_late_init in mmhub ras block Remove redundant calls of ras_late_init in mmhub ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:04 -05:00
yipechai	72b3588e27	drm/amdgpu: Remove redundant calls of ras_late_init in hdp ras block Remove redundant calls of ras_late_init in hdp ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:04 -05:00
yipechai	4e9b1fa5a2	drm/amdgpu: Modify .ras_late_init function pointer parameter Modify .ras_late_init function pointer parameter so that it can remove redundant intermediate calls in some ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:04 -05:00
Prike Liang	f83e14011e	drm/amdgpu/discovery: Add sw DM function for 3.1.6 DCE Add 3.1.6 DCE IP and assign relevant sw DM function for the new DCE. Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-17 15:59:04 -05:00
Prike Liang	a65dbf7cde	drm/amdgpu/gfx10: Add GC 10.3.7 Support Needed to properly initialize GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	967af863f2	drm/amdgpu/sdma5.2: add support for SDMA 5.2.7 Initialize SDMA engine firmware loading. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	db090ff8f9	drm/amd/pm: Add support for MP1 13.0.8 Set smu sw function and enable swSMU support for MP1. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	f99a7eb2d1	drm/amdgpu/psp: Add support for MP0 13.0.8 Set psp sw funcs callback and firmware loading for MP0. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	97437f475c	drm/amdgpu/gmc10: add support for GC 10.3.7 Set gfxhub function and configure VM for GC block. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Sathishkumar S	35c27d9578	drm/amdgpu: update vcn/jpeg PG flags for VCN 3.1.1 update vcn and jpeg power gating flags for VCN 3.1.1 Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	b67f00e06f	drm/amdgpu: set new revision id for 10.3.7 GC Add new revision ID for GC 10.3.7 and set cg/pg flags. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:03 -05:00
Prike Liang	2fbc508697	drm/amdgpu/discovery: set sw common init for GC 10.3.7 Set nv_common_ip_block for GC 10.3.7. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Prike Liang	2019bf7cd2	drm/amdgpu/discovery: Add 13.0.9 SMUIO block Add SMUIO sw function for the new SMUIO block. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Prike Liang	01cbf049e1	drm/amdgpu/discovery: add nbio sw func for 7.5.1 nbio add nbio sw func for the new 7.5.1 nbio block. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Alex Deucher	dfcc3e8c24	drm/amdgpu: make cyan skillfish support code more consistent Since this is an existing asic, adjust the code to follow the same logic as previously so the driver state is consistent. No functional change intended. Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Victor Skvortsov	aa79d3808e	drm/amdgpu: Fix wait for RLCG command completion if (!(tmp & flag)) condition will always evaluate to true when the flag is 0x0 (AMDGPU_RLCG_GC_WRITE). Instead check that address bits are cleared to determine whether the command is complete. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Tested-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Luben Tuikov	4d7ba312dd	drm/amdgpu: Add "harvest" to IP discovery sysfs Add the "harvest" field to the IP attributes in the IP discovery sysfs visualization, as this field is present in the binary data. At the time of this commit, the harvest data isn't consistently correct in VBIOS, but it is exposed for completeness, in the hopes that VBIOS will be fixed in the future. Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:02:12 -05:00
Evan Quan	e506db5905	drm/amdgpu: disable MMHUB PG for Picasso MMHUB PG needs to be disabled for Picasso for stability reasons. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:01:28 -05:00
Alex Deucher	92ede25ece	drm/amdgpu/sdma5.2: Adjust the name string for firmware This will make it easier to add new firmwares in the future. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 16:44:40 -05:00
Tom Rix	eed1a5c742	drm/amdgpu: check return status before using stable_pstate Clang static analysis reports this problem amdgpu_ctx.c:616:26: warning: Assigned value is garbage or undefined args->out.pstate.flags = stable_pstate; ^ ~~~~~~~~~~~~~ amdgpu_ctx_stable_pstate can fail without setting stable_pstate. So check. Fixes: `8cda7a4f96` ("drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 16:44:40 -05:00
Surbhi Kakarya	7258fa31ea	drm/amdgpu: Handle the GPU recovery failure in SRIOV environment. This patch handles the GPU recovery failure in sriov environment by retrying the reset if the first reset fails. To determine the condition of retry, a new macro AMDGPU_RETRY_SRIOV_RESET is added which returns true if failure is due to ETIMEDOUT, EINVAL or EBUSY, otherwise return false.A new macro AMDGPU_MAX_RETRY_LIMIT is used to limit the retry to 2. It also handles the return status in Post Asic Reset by updating the return code with asic_reset_res and eventually return the return code in amdgpu_job_timedout(). Signed-off-by: Surbhi Kakarya <surbhi.kakarya@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
Stanley.Yang	1ec1944eb5	drm/amdgpu: print more error info print more error info when deferred uncorrectable ras error changed from V1: move Defferred error msg into query uncorrectable error count function. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	563285c85e	drm/amdgpu: Merge amdgpu_ras_late_init/amdgpu_ras_late_fini to amdgpu_ras_block_late_init/amdgpu_ras_block_late_fini 1. Merge amdgpu_ras_late_init to amdgpu_ras_block_late_init. 2. Remove amdgpu_ras_late_init since no ras block calls amdgpu_ras_late_init. 3. Merge amdgpu_ras_late_fini to amdgpu_ras_block_late_fini. 4. Remove amdgpu_ras_late_fini since no ras block calls amdgpu_ras_late_fini. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	9252d33df5	drm/amdgpu: Optimize operating sysfs and interrupt function interface in amdgpu_ras.c In order to reduce redundant struct conversion, modify operating sysfs and interrupt function interface parameters. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	892a57a975	drm/amdgpu: Optimize amdgpu_xgmi_ras_late_init/amdgpu_xgmi_ras_fini function code Optimize amdgpu_xgmi_ras_late_init/amdgpu_xgmi_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	a3ace75cdb	drm/amdgpu: Optimize amdgpu_umc_ras_late_init/amdgpu_umc_ras_fini function code Optimize amdgpu_umc_ras_late_init/amdgpu_umc_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	683bac6b00	drm/amdgpu: Optimize amdgpu_sdma_ras_late_init/amdgpu_sdma_ras_fini function code Optimize amdgpu_sdma_ras_late_init/amdgpu_sdma_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	80ed77f971	drm/amdgpu: Optimize amdgpu_nbio_ras_late_init/amdgpu_nbio_ras_fini function code Optimize amdgpu_nbio_ras_late_init/amdgpu_nbio_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	cb9561d0e3	drm/amdgpu: Optimize amdgpu_mmhub_ras_late_init/amdgpu_mmhub_ras_fini function code Optimize amdgpu_mmhub_ras_late_init/amdgpu_mmhub_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	88bc3cd845	drm/amdgpu: Optimize amdgpu_mca_ras_late_init/amdgpu_mca_ras_fini function code Optimize amdgpu_mca_ras_late_init/amdgpu_mca_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	634b56b0f8	drm/amdgpu: Optimize amdgpu_hdp_ras_late_init/amdgpu_hdp_ras_fini function code Optimize amdgpu_hdp_ras_late_init/amdgpu_hdp_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:41 -05:00
yipechai	311065086e	drm/amdgpu: Optimize amdgpu_gfx_ras_late_init/amdgpu_gfx_ras_fini function code Optimize amdgpu_gfx_ras_late_init/amdgpu_gfx_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
yipechai	bdb3489cfc	drm/amdgpu: Optimize xxx_ras_late_init/xxx_ras_late_fini for each ras block 1. Define amdgpu_ras_block_late_init to create sysfs nodes and interrupt handles. 2. Define amdgpu_ras_block_late_fini to remove sysfs nodes and interrupt handles. 3. Replace ras block variable members in struct amdgpu_ras_block_object with struct ras_common_if, which can make it easy to associate each ras block instance with each ras block functional interface. 4. Add .ras_cb to struct amdgpu_ras_block_object. 5. Change each ras block to fit for the changement of struct amdgpu_ras_block_object. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
Guchun Chen	22b1df28c0	drm/amdgpu: no rlcg legacy read in SRIOV case rlcg legacy read is not available in SRIOV configration. Otherwise, gmc_v9_0_flush_gpu_tlb will always complain timeout and finally breaks driver load. v2: bypass read in amdgpu_virt_get_rlcg_reg_access_flag (from Victor) Fixes: `97d1a3b967` ("drm/amdgpu: switch to get_rlcg_reg_access_flag for gfx9") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Victor Skvortsov <Victor.Skvortsov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
Rajneesh Bhardwaj	7157934699	drm/amdgpu: Fix a kerneldoc warning Add missing parameters to fix a kerneldoc warning Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
Luben Tuikov	a6c40b1780	drm/amdgpu: Show IP discovery in sysfs Add IP discovery data in sysfs. The format is: /sys/class/drm/cardX/device/ip_discovery/die/D/B/I/<attrs> where, X is the card ID, an integer, D is the die ID, an integer, B is the IP HW ID, an integer, aka block type, I is the IP HW ID instance, an integer. <attrs> are the attributes of the block instance. At the moment these include HW ID, instance number, major, minor, revision, number of base addresses, and the base addresses themselves. A symbolic link of the acronym HW ID is also created, under D/, if you prefer to browse by something humanly accessible. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Tom StDenis <tom.stdenis@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
Rajneesh Bhardwaj	77608faa77	drm/amdgpu: Fix some kerneldoc warnings Fix few kerneldoc warnings and one typo. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-14 15:08:40 -05:00
Rajib Mahapatra	f8f4e2a518	drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix. [Why] SDMA ring buffer test failed if suspend is aborted during S0i3 resume. [How] If suspend is aborted for some reason during S0i3 resume cycle, it follows SDMA ring test failing and errors in amdgpu resume. For RN/CZN/Picasso, SMU saves and restores SDMA registers during S0ix cycle. So, skipping SDMA suspend and resume from driver solves the issue. This time, the system is able to resume gracefully even the suspend is aborted. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajib Mahapatra <rajib.mahapatra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-14 14:59:46 -05:00
Christian König	7db47b8388	drm/amdgpu: remove VRAM accounting v2 This is provided by TTM now. Also switch man->size to bytes instead of pages and fix the double printing of size and usage in debugfs. v2: fix size checking as well Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220214093439.2989-8-christian.koenig@amd.com	2022-02-14 15:05:39 +01:00
Christian König	3fc2b087df	drm/amdgpu: remove PL_PREEMPT accounting This is provided by TTM now. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220214093439.2989-7-christian.koenig@amd.com	2022-02-14 15:05:39 +01:00
Christian König	dfa714b88e	drm/amdgpu: remove GTT accounting v2 This is provided by TTM now. Also switch man->size to bytes instead of pages and fix the double printing of size and usage in debugfs. v2: fix size checking as well Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220214093439.2989-6-christian.koenig@amd.com	2022-02-14 15:05:39 +01:00
Dave Airlie	123db17ddf	Merge tag 'amd-drm-next-5.18-2022-02-11-1' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.18-2022-02-11-1: amdgpu: - Clean up of power management code - Enable freesync video mode by default - Clean up of RAS code - Improve VRAM access for debug using SDMA - Coding style cleanups - SR-IOV fixes - More display FP reorg - TLB flush fixes for Arcuturus, Vega20 - Misc display fixes - Rework special register access methods for SR-IOV - DP2 fixes - DP tunneling fixes - DSC fixes - More IP discovery cleanups - Misc RAS fixes - Enable both SMU i2c buses where applicable - s2idle improvements - DPCS header cleanup - Add new CAP firmware support for SR-IOV amdkfd: - Misc cleanups - SVM fixes - CRIU support - Clean up MQD manager UAPI: - Add interface to amdgpu CTX ioctl to request a stable power state for profiling https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207 - Add amdkfd support for CRIU https://github.com/checkpoint-restore/criu/pull/1709 - Remove old unused amdkfd debugger interface Was only implemented for Kaveri and was only ever used by an old HSA tool that was never open sourced radeon: - Fix error handling in radeon_driver_open_kms - UVD suspend fix - Misc fixes From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220211220706.5803-1-alexander.deucher@amd.com	2022-02-14 10:31:51 +10:00
Stanley.Yang	1915a43395	drm/amdgpu: adjust register address calculation the UMC_STATUS register is not linear, adjust offset calculation formula to get correct address Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:19:59 -05:00
Rajib Mahapatra	f3986e86b2	drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix. [Why] SDMA ring buffer test failed if suspend is aborted during S0i3 resume. [How] If suspend is aborted for some reason during S0i3 resume cycle, it follows SDMA ring test failing and errors in amdgpu resume. For RN/CZN/Picasso, SMU saves and restores SDMA registers during S0ix cycle. So, skipping SDMA suspend and resume from driver solves the issue. This time, the system is able to resume gracefully even the suspend is aborted. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajib Mahapatra <rajib.mahapatra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:19:41 -05:00
Ken Xue	461fa7b0ac	drm/amdgpu: remove ctx->lock KMD reports a warning on holding a lock from drm_syncobj_find_fence, when running amdgpu_test case “syncobj timeline test”. ctx->lock was designed to prevent concurrent "amdgpu_ctx_wait_prev_fence" calls and avoid dead reservation lock from GPU reset. since no reservation lock is held in latest GPU reset any more, ctx->lock can be simply removed and concurrent "amdgpu_ctx_wait_prev_fence" call also can be prevented by PD root bo reservation lock. call stacks: ================= //hold lock amdgpu_cs_ioctl->amdgpu_cs_parser_init->mutex_lock(&parser->ctx->lock); … //report warning amdgpu_cs_dependencies->amdgpu_cs_process_syncobj_timeline_in_dep \ ->amdgpu_syncobj_lookup_and_add_to_sync -> drm_syncobj_find_fence \ -> lockdep_assert_none_held_once … amdgpu_cs_ioctl->amdgpu_cs_parser_fini->mutex_unlock(&parser->ctx->lock); Signed-off-by: Ken Xue <Ken.Xue@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:19:23 -05:00
Stanley.Yang	8bbd4d83a6	drm/amdgpu: Reset OOB table error count info The OOB table error count info should be reset after reset eeprom table Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:13:00 -05:00
Tao Zhou	69f915cc97	drm/amdgpu: loose check for umc poison mode No need to check poison setting for each channel, check for umc0 channel0 is enough. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:12:07 -05:00
Lang Yu	f9ed188d5a	drm/amdgpu: add support for GC 10.1.4 Add basic support for GC 10.1.4, it uses same IP blocks with GC 10.1.3 Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-11 16:11:55 -05:00
Andrey Grodzovsky	c7703ce38c	drm/amdgpu: Fix htmldoc warning Update function name. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220211205500.601391-1-andrey.grodzovsky@amd.com	2022-02-11 16:05:08 -05:00
Andrey Grodzovsky	f5666d4823	drm/amdgpu: Fix compile error. Seems I forgot to add this to the relevant commit when submitting. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220210031724.440943-1-andrey.grodzovsky@amd.com	2022-02-10 10:23:40 +01:00
Yang Wang	63b5fa9dbb	drm/amdgpu: fix gmc init fail in sriov mode "adev->gfx.rlc.rlcg_reg_access_supported = true;" the above varible were set too late during driver initialization. it will cause the driver to fail to write/read register during GMC hw init in sriov mode. move gfx_xxx_init_rlcg_reg_access_ctrl() function to gfx early init stage to avoid this issue. Fixes: `5d447e2967` ("drm/amdgpu: add helper for rlcg indirect reg access") Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-09 16:57:52 -05:00
zhanglianjie	db7b81545f	drm/amd/amdgpu/amdgpu_uvd: Fix forgotten unmap buffer object After the buffer object is successfully mapped, call amdgpu_bo_kunmap before the function returns. Signed-off-by: zhanglianjie <zhanglianjie@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-09 16:57:52 -05:00
Mukul Joshi	5bdd3eb253	drm/amdkfd: Remove unused old debugger implementation Cleanup the kfd code by removing the unused old debugger implementation. The address watch was only ever implemented in the upstream driver for GFXv7 (Kaveri). The user mode tools runtime using this API was never open-sourced. Work on the old debugger prototype that used this API has been discontinued years ago. Only a small piece of resetting wavefronts is kept and is moved to kfd_device_queue_manager.c. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-09 16:57:51 -05:00
Aaron Liu	a072312f43	drm/amdgpu: add utcl2_harvest to gc 10.3.1 Confirmed with hardware team, there is harvesting for gc 10.3.1. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-09 15:08:05 -05:00
Andrey Grodzovsky	3675c2f26f	drm/amdgpu: Revert 'drm/amdgpu: annotate a false positive recursive locking' Since we have a single instance of reset semaphore which we lock only once even for XGMI hive we don't need the nested locking hint anymore. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74120.html	2022-02-09 12:19:14 -05:00
Andrey Grodzovsky	e923be9934	drm/amdgpu: Rework amdgpu_device_lock_adev This functions needs to be split into 2 parts where one is called only once for locking single instance of reset_domain's sem and reset flag and the other part which handles MP1 states should still be called for each device in XGMI hive. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74118.html	2022-02-09 12:18:39 -05:00
Andrey Grodzovsky	89a7a87093	drm/amdgpu: Move in_gpu_reset into reset_domain We should have a single instance per entrire reset domain. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74116.html	2022-02-09 12:17:57 -05:00
Andrey Grodzovsky	d0fb18b535	drm/amdgpu: Move reset sem into reset_domain We want single instance of reset sem across all reset clients because in case of XGMI we should stop access cross device MMIO because any of them could be in a reset in the moment. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74117.html	2022-02-09 12:17:32 -05:00
Andrey Grodzovsky	cfbb6b0047	drm/amdgpu: Rework reset domain to be refcounted. The reset domain contains register access semaphor now and so needs to be present as long as each device in a hive needs it and so it cannot be binded to XGMI hive life cycle. Adress this by making reset domain refcounted and pointed by each member of the hive and the hive itself. v4: Fix crash on boot witrh XGMI hive by adding type to reset_domain. XGMI will only create a new reset_domain if prevoius was of single device type meaning it's first boot. Otherwsie it will take a refocunt to exsiting reset_domain from the amdgou device. Add a wrapper around reset_domain->refcount get/put and a wrapper around send to reset wq (Lijo) Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74121.html	2022-02-09 12:17:09 -05:00
Andrey Grodzovsky	f287a3c5b0	drm/amdgpu: Drop concurrent GPU reset protection for device Since now all GPU resets are serialzied there is no need for this. This patch also reverts 'drm/amdgpu: race issue when jobs on 2 ring timeout' Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74119.html	2022-02-09 12:16:53 -05:00
Andrey Grodzovsky	681260df4d	drm/amdgpu: Drop hive->in_reset Since we serialize all resets no need to protect from concurrent resets. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74115.html	2022-02-09 12:16:29 -05:00
Andrey Grodzovsky	02599bc7f7	drm/amd/virt: For SRIOV send GPU reset directly to TDR queue. No need to to trigger another work queue inside the work queue. v3: Problem: Extra reset caused by host side FLR notification following guest side triggered reset. Fix: Preven qeuing flr_work from mailbox irq if guest already executing a reset. Suggested-by: Liu Shaoyun <Shaoyun.Liu@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Liu Shaoyun <Shaoyun.Liu@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74114.html	2022-02-09 12:16:06 -05:00
Andrey Grodzovsky	54f329cc7a	drm/amdgpu: Serialize non TDR gpu recovery with TDRs Use reset domain wq also for non TDR gpu recovery trigers such as sysfs and RAS. We must serialize all possible GPU recoveries to gurantee no concurrency there. For TDR call the original recovery function directly since it's already executed from within the wq. For others just use a wrapper to qeueue work and wait on it to finish. v2: Rename to amdgpu_recover_work_struct Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74113.html	2022-02-09 12:15:23 -05:00
Andrey Grodzovsky	5fd8518d18	drm/amdgpu: Move scheduler init to after XGMI is ready Before we initialize schedulers we must know which reset domain are we in - for single device there iis a single domain per device and so single wq per device. For XGMI the reset domain spans the entire XGMI hive and so the reset wq is per hive. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74112.html	2022-02-09 12:15:04 -05:00
Andrey Grodzovsky	a4c63cafa5	drm/amdgpu: Introduce reset domain Defined a reset_domain struct such that all the entities that go through reset together will be serialized one against another. Do it for both single device and XGMI hive cases. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Suggested-by: Christian König <ckoenig.leichtzumerken@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://www.spinics.net/lists/amd-gfx/msg74111.html	2022-02-09 12:14:32 -05:00
Christian König	e09b9aef68	drm/amdgpu: use dma_fence_chain_contained Instead of manually extracting the fence. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220204100429.2049-7-christian.koenig@amd.com	2022-02-08 09:25:40 +01:00
Alex Deucher	3786a9bc04	drm/amdgpu: drop experimental flag on aldebaran These have been at production level for a while. Drop the flag. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:03:50 -05:00
Christian König	b6fba4ecf3	drm/amdgpu: reserve the pd while cleaning up PRTs We want to have lockdep annotation here, so make sure that we reserve the PD while removing PRTs even if it isn't strictly necessary since the VM object is about to be destroyed anyway. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:03:50 -05:00
Christian König	d7d7ddc156	drm/amdgpu: move lockdep assert to the right place. Since newly added BOs don't have any mappings it's ok to add them without holding the VM lock. Only when we add per VM BOs the lock is mandatory. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-by: Bhardwaj, Rajneesh <Rajneesh.Bhardwaj@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:03:50 -05:00
Aaron Liu	29ba7b16b9	drm/amdgpu: check the GART table before invalidating TLB Bypass group programming (utcl2_harvest) aims to forbid UTCL2 to send invalidation command to harvested SE/SA. Once invalidation command comes into harvested SE/SA, SE/SA has no response and system hang. This patch is to add checking if the GART table is already allocated before invalidating TLB. The new procedure is as following: 1. Calling amdgpu_gtt_mgr_init() in amdgpu_ttm_init(). After this step GTT BOs can be allocated, but GART mappings are still ignored. 2. Calling amdgpu_gart_table_vram_alloc() from the GMC code. This allocates the GART backing store. 3. Initializing the hardware, and programming the backing store into VMID0 for all VMHUBs. 4. Calling amdgpu_gtt_mgr_recover() to make sure the table is updated with the GTT allocations done before it was allocated. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:01:16 -05:00
Aaron Liu	6d53b115be	drm/amdgpu: add utcl2_harvest to gc 10.3.1 Confirmed with hardware team, there is harvesting for gc 10.3.1. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:01:16 -05:00
Tao Zhou	4e781873fa	drm/amdgpu: fix list add issue in vram reserve The parameter order in the list_add_tail is incorrect, it causes the reuse of ras reserved page. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:01:16 -05:00
yipechai	a50b048276	Revert "drm/amdgpu: Add judgement to avoid infinite loop" The commit `d5e8ff5f7b` ("drm/amdgpu: Fixed the defect of soft lock caused by infinite loop") had fixed this defect. Revert workaround commit `a2170b4af6` ("drm/amdgpu: Add judgement to avoid infinite loop"). Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 18:00:05 -05:00
yipechai	d5e8ff5f7b	drm/amdgpu: Fixed the defect of soft lock caused by infinite loop 1. The infinite loop case only occurs on multiple cards support ras functions. 2. The explanation of root cause refer to commit 76641cbbf196 ("drm/amdgpu: Add judgement to avoid infinite loop"). 3. Create new node to manage each unique ras instance to guarantee each device .ras_list is completely independent. 4. Fixes: commit 7a6b8ab3231b51 ("drm/amdgpu: Unify ras block interface for each ras block"). 5. The soft locked logs are as follows: [ 262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G OE 5.13.0-27-generic #29~20.04.1-Ubuntu [ 262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020 [ 262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [ 262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu] [ 262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45 [ 262.166243] RSP: 0018:ffffac908fa87d80 EFLAGS: 00000202 [ 262.166247] RAX: ffffffffc1394248 RBX: ffff91e4ab8d6e20 RCX: ffffffffc1394248 [ 262.166249] RDX: ffff91e4aa356e20 RSI: 000000000000000e RDI: ffff91e4ab8c0000 [ 262.166252] RBP: ffffac908fa87da8 R08: 0000000000000007 R09: 0000000000000001 [ 262.166254] R10: ffff91e4930b64ec R11: 0000000000000000 R12: 000000000000000e [ 262.166256] R13: ffff91e4aa356df8 R14: ffffffffc1394320 R15: 0000000000000003 [ 262.166258] FS: 0000000000000000(0000) GS:ffff92238fb40000(0000) knlGS:0000000000000000 [ 262.166261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 262.166264] CR2: 00000001004865d0 CR3: 000000406d796000 CR4: 0000000000350ee0 [ 262.166267] Call Trace: [ 262.166272] amdgpu_ras_do_recovery+0x130/0x290 [amdgpu] [ 262.166529] ? psi_task_switch+0xd2/0x250 [ 262.166537] ? __switch_to+0x11d/0x460 [ 262.166542] ? __switch_to_asm+0x36/0x70 [ 262.166549] process_one_work+0x220/0x3c0 [ 262.166556] worker_thread+0x4d/0x3f0 [ 262.166560] ? process_one_work+0x3c0/0x3c0 [ 262.166563] kthread+0x12b/0x150 [ 262.166568] ? set_kthread_struct+0x40/0x40 [ 262.166571] ret_from_fork+0x22/0x30 Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Luben Tuikov	00d6936dbd	drm/amdgpu: Set FRU bus for Aldebaran and Vega 20 The FRU and RAS EEPROMs share the same I2C bus on Aldebaran and Vega 20 ASICs. Set the FRU bus "pointer" to this single bus, as access to the FRU is sought through that bus "pointer" and not through the RAS bus "pointer". Cc: Roy Sun <Roy.Sun@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Fixes: `2f60dd5076` ("drm/amd: Expose the FRU SMU I2C bus") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Rajneesh Bhardwaj	447c7997b6	drm/amdgpu: Fix recursive locking warning Noticed the below warning while running a pytorch workload on vega10 GPUs. Change to trylock to avoid conflicts with already held reservation locks. [ +0.000003] WARNING: possible recursive locking detected [ +0.000003] 5.13.0-kfd-rajneesh #1030 Not tainted [ +0.000004] -------------------------------------------- [ +0.000002] python/4822 is trying to acquire lock: [ +0.000004] ffff932cd9a259f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000203] but task is already holding lock: [ +0.000003] ffff932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_eu_reserve_buffers+0x270/0x470 [ttm] [ +0.000017] other info that might help us debug this: [ +0.000002] Possible unsafe locking scenario: [ +0.000003] CPU0 [ +0.000002] ---- [ +0.000002] lock(reservation_ww_class_mutex); [ +0.000004] lock(reservation_ww_class_mutex); [ +0.000003] * DEADLOCK * [ +0.000002] May be due to missing lock nesting notation [ +0.000003] 7 locks held by python/4822: [ +0.000003] #0: ffff932c4ac028d0 (&process->mutex){+.+.}-{3:3}, at: kfd_ioctl_map_memory_to_gpu+0x10b/0x320 [amdgpu] [ +0.000232] #1: ffff932c55e830a8 (&info->lock#2){+.+.}-{3:3}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x64/0xf60 [amdgpu] [ +0.000241] #2: ffff932cc45b5e68 (&(mem)->lock){+.+.}-{3:3}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0xdf/0xf60 [amdgpu] [ +0.000236] #3: ffffb2b35606fd28 (reservation_ww_class_acquire){+.+.}-{0:0}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x232/0xf60 [amdgpu] [ +0.000235] #4: ffff932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_eu_reserve_buffers+0x270/0x470 [ttm] [ +0.000015] #5: ffffffffc045f700 ((sspp++)){....}-{0:0}, at: drm_dev_enter+0x5/0xa0 [drm] [ +0.000038] #6: ffff932c52da7078 (&vm->eviction_lock){+.+.}-{3:3}, at: amdgpu_vm_bo_update_mapping+0xd5/0x4f0 [amdgpu] [ +0.000195] stack backtrace: [ +0.000003] CPU: 11 PID: 4822 Comm: python Not tainted 5.13.0-kfd-rajneesh #1030 [ +0.000005] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F02 08/29/2018 [ +0.000003] Call Trace: [ +0.000003] dump_stack+0x6d/0x89 [ +0.000010] __lock_acquire+0xb93/0x1a90 [ +0.000009] lock_acquire+0x25d/0x2d0 [ +0.000005] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000184] ? lock_is_held_type+0xa2/0x110 [ +0.000006] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000184] __ww_mutex_lock.constprop.17+0xca/0x1060 [ +0.000007] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000183] ? lock_release+0x13f/0x270 [ +0.000005] ? lock_is_held_type+0xa2/0x110 [ +0.000006] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000183] amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000185] ttm_bo_release+0x4c6/0x580 [ttm] [ +0.000010] amdgpu_bo_unref+0x1a/0x30 [amdgpu] [ +0.000183] amdgpu_vm_free_table+0x76/0xa0 [amdgpu] [ +0.000189] amdgpu_vm_free_pts+0xb8/0xf0 [amdgpu] [ +0.000189] amdgpu_vm_update_ptes+0x411/0x770 [amdgpu] [ +0.000191] amdgpu_vm_bo_update_mapping+0x324/0x4f0 [amdgpu] [ +0.000191] amdgpu_vm_bo_update+0x251/0x610 [amdgpu] [ +0.000191] update_gpuvm_pte+0xcc/0x290 [amdgpu] [ +0.000229] ? amdgpu_vm_bo_map+0xd7/0x130 [amdgpu] [ +0.000190] amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x912/0xf60 [amdgpu] [ +0.000234] kfd_ioctl_map_memory_to_gpu+0x182/0x320 [amdgpu] [ +0.000218] kfd_ioctl+0x2b9/0x600 [amdgpu] [ +0.000216] ? kfd_ioctl_unmap_memory_from_gpu+0x270/0x270 [amdgpu] [ +0.000216] ? lock_release+0x13f/0x270 [ +0.000006] ? __fget_files+0x107/0x1e0 [ +0.000007] __x64_sys_ioctl+0x8b/0xd0 [ +0.000007] do_syscall_64+0x36/0x70 [ +0.000004] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.000007] RIP: 0033:0x7fbff90a7317 [ +0.000004] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48 [ +0.000005] RSP: 002b:00007fbe301fe648 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ +0.000006] RAX: ffffffffffffffda RBX: 00007fbcc402d820 RCX: 00007fbff90a7317 [ +0.000003] RDX: 00007fbe301fe690 RSI: 00000000c0184b18 RDI: 0000000000000004 [ +0.000003] RBP: 00007fbe301fe690 R08: 0000000000000000 R09: 00007fbcc402d880 [ +0.000003] R10: 0000000002001000 R11: 0000000000000246 R12: 00000000c0184b18 [ +0.000003] R13: 0000000000000004 R14: 00007fbf689593a0 R15: 00007fbcc402d820 Cc: Christian König <christian.koenig@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Luben Tuikov	00b14ce075	drm/amdgpu: Prevent random memory access in FRU code Prevent random memory access in the FRU EEPROM code by passing the size of the destination buffer to the reading routine, and reading no more than the size of the buffer. Cc: Kent Russell <kent.russell@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Luben Tuikov	3f3a24a0a3	drm/amdgpu: Don't offset by 2 in FRU EEPROM Read buffers no longer expose the I2C address, and so we don't need to offset by two when we get the read data. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Kent Russell <kent.russell@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Fixes: `bd607166af` ("drm/amdgpu: Enable reading FRU chip via I2C v3") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Luben Tuikov	3f1e2e9d99	drm/amdgpu: Nerf "buff" to "buf" Buffer is abbreviated "buf" (buf-fer), not "buff" (buff-er). This is consistent with the rest of the kernel code. Cc: Kent Russell <kent.russell@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:53 -05:00
Rajneesh Bhardwaj	011bbb0302	drm/amdkfd: CRIU Implement KFD resume ioctl This adds support to create userptr BOs on restore and introduces a new ioctl op to restart memory notifiers for the restored userptr BOs. When doing CRIU restore MMU notifications can happen anytime after we call amdgpu_mn_register. Prevent MMU notifications until we reach stage-4 of the restore process i.e. criu_resume ioctl op is received, and the process is ready to be resumed. This ioctl is different from other KFD CRIU ioctls since its called by CRIU master restore process for all the target processes being resumed by CRIU. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: David Yat Sin <david.yatsin@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:41 -05:00
Rajneesh Bhardwaj	5ccbb057c0	drm/amdkfd: CRIU Implement KFD checkpoint ioctl This adds support to discover the buffer objects that belong to a process being checkpointed. The data corresponding to these buffer objects is returned to user space plugin running under criu master context which then stores this info to recreate these buffer objects during a restore operation. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: David Yat Sin <david.yatsin@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:59:29 -05:00
Luben Tuikov	afa3731591	drm/amdgpu: Print once if RAS unsupported MESA polls for errors every 2-3 seconds. Printing with dev_info() causes the dmesg log to fill up with the same message, e.g, [18028.206676] amdgpu 0000:0b:00.0: amdgpu: df doesn't config ras function. Make it dev_dbg_once(), as it isn't something correctible during boot or thereafter, so printing just once is sufficient. Also sanitize the message. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Tao Zhou <tao.zhou1@amd.com> Cc: yipechai <YiPeng.Chai@amd.com> Fixes: `8b0fb0e967` ("drm/amdgpu: Modify gfx block to fit for the unified ras block data and ops") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:14:16 -05:00
Christian König	e56694f718	drm/amdgpu: rename amdgpu_vm_bo_rmv to _del Some people complained about the name and this matches much more Linux naming conventions for object functions. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:14:10 -05:00
Christian König	2d022081b3	drm/amdgpu: add some lockdep checks to the VM code Whenever a bo_va structure is added or removed the VM and eventually added BO should be locked. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-07 17:13:52 -05:00
Lucas De Marchi	b8c75bd974	drm: Convert open-coded yes/no strings to yesno() linux/string_helpers.h provides a helper to return "yes"/"no" strings. Replace the open coded versions with str_yes_no(). The places were identified with the following semantic patch: @@ expression b; @@ - b ? "yes" : "no" + str_yes_no(b) Then the includes were added, so we include-what-we-use, and parenthesis adjusted in drivers/gpu/drm/v3d/v3d_debugfs.c. After the conversion we still see the same binary sizes: text data bss dec hex filename 51149 3295 212 54656 d580 virtio/virtio-gpu.ko.old 51149 3295 212 54656 d580 virtio/virtio-gpu.ko 1441491 60340 800 1502631 16eda7 radeon/radeon.ko.old 1441491 60340 800 1502631 16eda7 radeon/radeon.ko 6125369 328538 34000 6487907 62ff63 amd/amdgpu/amdgpu.ko.old 6125369 328538 34000 6487907 62ff63 amd/amdgpu/amdgpu.ko 411986 10490 6176 428652 68a6c drm.ko.old 411986 10490 6176 428652 68a6c drm.ko 98129 1636 264 100029 186bd dp/drm_dp_helper.ko.old 98129 1636 264 100029 186bd dp/drm_dp_helper.ko 1973432 109640 2352 2085424 1fd230 nouveau/nouveau.ko.old 1973432 109640 2352 2085424 1fd230 nouveau/nouveau.ko Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220126093951.1470898-10-lucas.demarchi@intel.com	2022-02-07 13:04:25 -08:00
Maarten Lankhorst	542898c5aa	Merge remote-tracking branch 'drm/drm-next' into drm-misc-next First backmerge into drm-misc-next. Required for more helpers backmerged, and to pull in 5.17 (rc2). Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2022-02-07 17:03:24 +01:00
Christian König	e8ae38720e	drm/amdgpu: fix logic inversion in check We probably never trigger this, but the logic inside the check is inverted. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:35:00 -05:00
Mario Limonciello	e55a3aea41	drm/amd: avoid suspend on dGPUs w/ s2idle support when runtime PM enabled dGPUs connected to Intel systems configured for suspend to idle will not have the power rails cut at suspend and resetting the GPU may lead to problematic behaviors. Fixes: `e25443d276` ("drm/amdgpu: add a dev_pm_ops prepare callback (v2)") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1879 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:35:00 -05:00
Lang Yu	bca52455a3	drm/amdgpu: fix a potential GPU hang on cyan skillfish We observed a GPU hang when querying GMC CG state(i.e., cat amdgpu_pm_info) on cyan skillfish. Acctually, cyan skillfish doesn't support any CG features. Just prevent it from accessing GMC CG registers. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-02-02 18:35:00 -05:00
Mario Limonciello	04ef860469	drm/amd: Only run s3 or s0ix if system is configured properly This will cause misconfigured systems to not run the GPU suspend routines. * In APUs that are properly configured system will go into s2idle. * In APUs that are intended to be S3 but user selects s2idle the GPU will stay fully powered for the suspend. * In APUs that are intended to be s2idle and system misconfigured the GPU will stay fully powered for the suspend. * In systems that are intended to be s2idle, but AMD dGPU is also present, the dGPU will go through S3 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:35:00 -05:00
Mario Limonciello	f52a2b8bad	drm/amd: add support to check whether the system is set to s3 This will be used to help make decisions on what to do in misconfigured systems. v2: squash in semicolon fix from Stephen Rothwell Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:35:00 -05:00
Somalapuram Amaranath	4f860edecd	drm/amdgpu: limit the number of dst address in trace trace_amdgpu_vm_update_ptes trace unable to log when nptes too large Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:27:52 -05:00
Mario Limonciello	9308a49d8e	drm/amd: avoid suspend on dGPUs w/ s2idle support when runtime PM enabled dGPUs connected to Intel systems configured for suspend to idle will not have the power rails cut at suspend and resetting the GPU may lead to problematic behaviors. Fixes: `e25443d276` ("drm/amdgpu: add a dev_pm_ops prepare callback (v2)") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1879 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:27:01 -05:00
Christian König	22f7cc7524	drm/amdgpu: restructure amdgpu_fill_buffer v2 We ran into the problem that clearing really larger buffer (60GiB) caused an SDMA timeout. Restructure the function to use the dst window instead of mapping the whole buffer into the GART and then fill only 2MiB/256MiB chunks at a time. v2: rebase on restructured window map. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:55 -05:00
Christian König	6927913d70	drm/amdgpu: rework GART copy window handling Instead of limiting the size before we call the mapping function let the function itself limit the size. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:50 -05:00
Christian König	e0a4459d45	drm/amdgpu: lower BUG_ON into WARN_ON for AMDGPU_PL_PREEMPT That should never happen, but make sure that we only warn instead of crash. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:44 -05:00
Christian König	fcd6b0e270	drm/amdgpu: fix logic inversion in check We probably never trigger this, but the logic inside the check is inverted. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:32 -05:00
Guchun Chen	274b924c3e	drm/amdgpu: drop flood print in rlcg reg access function A lot of below message are outputed in SRIOV case. amdgpu: indirect registers access through rlcg is not supported Also drop redundant ret set, as it's initialized to be false already. Fixes: `29dbcac82f` ("drm/amdgpu: add helper to query rlcg reg access flag") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:31 -05:00
Lijo Lazar	889f84798c	drm/amdgpu: Fix uninitialized variable use warning Fix uninitialized variable use warning: variable 'reg_access_ctrl' is uninitialized when used here [-Wuninitialized] scratch_reg0 = (void __iomem )adev->rmmio + 4 reg_access_ctrl->scratch_reg0; Fixes: `5d447e2967` ("drm/amdgpu: add helper for rlcg indirect reg access") Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:31 -05:00
yipechai	a2170b4af6	drm/amdgpu: Add judgement to avoid infinite loop 1. The infinite loop causing soft lock occurs on multiple amdgpu cards supporting ras feature. 2. This a workaround patch to fix `6492e1b07c`. It is valid for multiple amdgpu cards of the same type. 3. The root cause is that each GPU card device has a separate .ras_list link header, but the instance and linked list node of each ras block are unique. When each device is initialized, each ras instance will repeatedly add link node to the device every time. In this way, only the .ras_list of the last initialized device is completely correct. the .ras_list->prev and .ras_list->next of the device initialzied before can still point to the correct ras instance, but the prev pointer and next pointer of the pointed ras instance both point to the last initialized device's .ras_ list instead of the beginning .ras_ list. When using list_for_each_entry_safe searches for non-existent Ras nodes on devices other than the last device, the last ras instance next pointer cannot always be equal to the beginning .ras_list, so that the loop cannot be terminated, the program enters a infinite loop. BTW: Since the data and initialization process of each card are the same, the link list between ras instances will not be destroyed every time the device is initialized. 4. The soft locked logs are as follows: [ 262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G OE 5.13.0-27-generic #29~20.04.1-Ubuntu [ 262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020 [ 262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [ 262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu] [ 262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45 [ 262.166243] RSP: 0018:ffffac908fa87d80 EFLAGS: 00000202 [ 262.166247] RAX: ffffffffc1394248 RBX: ffff91e4ab8d6e20 RCX: ffffffffc1394248 [ 262.166249] RDX: ffff91e4aa356e20 RSI: 000000000000000e RDI: ffff91e4ab8c0000 [ 262.166252] RBP: ffffac908fa87da8 R08: 0000000000000007 R09: 0000000000000001 [ 262.166254] R10: ffff91e4930b64ec R11: 0000000000000000 R12: 000000000000000e [ 262.166256] R13: ffff91e4aa356df8 R14: ffffffffc1394320 R15: 0000000000000003 [ 262.166258] FS: 0000000000000000(0000) GS:ffff92238fb40000(0000) knlGS:0000000000000000 [ 262.166261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 262.166264] CR2: 00000001004865d0 CR3: 000000406d796000 CR4: 0000000000350ee0 [ 262.166267] Call Trace: [ 262.166272] amdgpu_ras_do_recovery+0x130/0x290 [amdgpu] [ 262.166529] ? psi_task_switch+0xd2/0x250 [ 262.166537] ? __switch_to+0x11d/0x460 [ 262.166542] ? __switch_to_asm+0x36/0x70 [ 262.166549] process_one_work+0x220/0x3c0 [ 262.166556] worker_thread+0x4d/0x3f0 [ 262.166560] ? process_one_work+0x3c0/0x3c0 [ 262.166563] kthread+0x12b/0x150 [ 262.166568] ? set_kthread_struct+0x40/0x40 [ 262.166571] ret_from_fork+0x22/0x30 Fixes: `6492e1b07c` ("drm/amdgpu: Unify ras block interface for each ras block") Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:31 -05:00
Changcheng Deng	6a77bce58c	drm/amdgpu: remove duplicate include in 'amdgpu_device.c' 'linux/pci.h' included in 'amdgpu_device.c' is duplicated. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:31 -05:00
Lang Yu	d2895ec4ca	drm/amdgpu: fix a potential GPU hang on cyan skillfish We observed a GPU hang when querying GMC CG state(i.e., cat amdgpu_pm_info) on cyan skillfish. Acctually, cyan skillfish doesn't support any CG features. Just prevent it from accessing GMC CG registers. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:31 -05:00
Mario Limonciello	d2a197a45d	drm/amd: Only run s3 or s0ix if system is configured properly This will cause misconfigured systems to not run the GPU suspend routines. * In APUs that are properly configured system will go into s2idle. * In APUs that are intended to be S3 but user selects s2idle the GPU will stay fully powered for the suspend. * In APUs that are intended to be s2idle and system misconfigured the GPU will stay fully powered for the suspend. * In systems that are intended to be s2idle, but AMD dGPU is also present, the dGPU will go through S3 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:30 -05:00
Mario Limonciello	18b66ace6b	drm/amd: add support to check whether the system is set to s3 This will be used to help make decisions on what to do in misconfigured systems. v2: squash in semicolon fix from Stephen Rothwell Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-02 18:26:30 -05:00
Dave Airlie	53dbee4926	drm-misc-next for v5.18: UAPI Changes: - Fix invalid IN_FORMATS blob when plane->format_mod_supported is NULL. Cross-subsystem Changes: - Assorted dt bindings updates. - Fix vga16fb vga checking on x86. - Fix extra semicolon in rwsem.h's _down_write_nest_lock. - Assorted small fixes to agp and fbdev drivers. - Fix oops in creating a udmabuf with 0 pages. - Hot-unplug firmware fb devices on forced removal - Reqquest memory region in simplefb and simpledrm, and don't make the ioresource as busy. Core Changes: - Mock a drm_plane in drm-plane-helper selftest. - Assorted bug fixes to device logging, dbi. - Use DP helper for sink count in mst. - Assorted documentation fixes. - Assorted small fixes. - Move DP headers to drm/dp, and add a drm dp helper module. - Move the buddy allocator from i915 to common drm. - Add simple pci and platform module init macros to remove a lot of boilerplate from some drivers. - Support microsoft extension for HMDs and specialized monitors. - Improve edid parser's deep color handling. - Add type 7 timing support to edid parser. - Add a weak backpointer to the ttm_bo from ttm_resource - Add 3 eDP panels. Driver Changes: - Add support for HDMI and JZ4780 to ingenic. - Add support for higher DP/eDP bitrates to nouveau. - Assorted driver fixes to tilcdc, vmwgfx, sn65dsi83, meson, stm, panfrost, v3d, gma500, vc4, virtio, mgag200, ast, radeon, amdgpu, nouveau, various bridge drivers. - Convert and revert exynos dsi support to bridge driver. - Add vcc supply regulator support for sn65dsi83. - More conversion of bridge/chipone-icn6211 to atomic. - Remove conflicting fb's from stm, and add support for new hw version. - Add device link in parade-ps8640 to fix suspend/resume. - Update Boe-tv110c9m init sequence. - Add wide screen support to AST2600. - Fix omapdrm implicit dma_buf fencing. - Add support for multiple overlay planes to vkms. - Convert bridge/anx7625 to atomic, add HDCP support, add eld support for audio, and fix HPD. - Add driver for ChromeOS privacy screen. - Handover display from firmware to vc4 more gracefully, and support nomodeset. - Add flexible and ycbcr pixel formats to stm/ltdc. - Convert exynos mipi dsi to atomic. - Add initial dual core group GPUs support to panfrost. - No longer add exclusive fence in amdgpu as shared fence. - Add CSC and full range supoprt to vc4. - Shutdown the display on system shutdown and unbind. - Add Multi-Inno Technology MI0700S4T-6 simple panel. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmHyiHMACgkQ/lWMcqZw E8OLfQ//Xd1njt93nRGoQofuQkz23n2AUTAnmbwzQKcvmat8ugXbRJ5JaVQJrFpu OQEYM46eZIyu2LekMiz4HgPK8CjS156QJ1WtltUFglOY1KLejb6HF5boBYxLkIC7 wLhkaRiwed4t7WOTrftgzpH5FNj/7Vi+Hav9l8rYRC74sWanEZNGBJL2OD9GRdlU 3tlmY8oXVAN8YDD/43Cv+foOTzLS/COI7JCFgFRhfzoFss3EVR061u55uOq18STB UI29NusqX7/K6hQAWCKl0EQBEZWMR02/dgu3ZpOEHHAa96RgHxIuRYsIO9kvGgiF VyW0EW6AyD/KsOSBYnsfUqkFfNchx9Xb8ZDjIhHUYxPsxe4iUJneCrdIKEmLWgSd 1bVNrltLJKBQARW4Whpy/gaiKV8RD8YVJobA/+/COeCUXCnNAT43O9aJmix/7253 Q7ORXTss5WRpuYswMWmObebf8p3IhFjTvlzzenXynl7mkaohGzHPf6SUSUZbJ8Df PZCh17McwIEQ1BtYeegeAGM6s8lrv5+yZaY4bnkQsJNOHeab0cPqmQ8/s+hUeRtp 3VDRVhkgzz2XuTaiKia0gWcAQbdZ2KornkP4QMyDH7w0+6bsuJnNXe4L1XY9lt4J 5v411FaD61FbGDhu5PFtYI7+ZlgM0h5sqlhVkUEzbckzTF3SC9c= =IMtm -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2022-01-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next [airlied: add two missing Kconfig] drm-misc-next for v5.18: UAPI Changes: - Fix invalid IN_FORMATS blob when plane->format_mod_supported is NULL. Cross-subsystem Changes: - Assorted dt bindings updates. - Fix vga16fb vga checking on x86. - Fix extra semicolon in rwsem.h's _down_write_nest_lock. - Assorted small fixes to agp and fbdev drivers. - Fix oops in creating a udmabuf with 0 pages. - Hot-unplug firmware fb devices on forced removal - Reqquest memory region in simplefb and simpledrm, and don't make the ioresource as busy. Core Changes: - Mock a drm_plane in drm-plane-helper selftest. - Assorted bug fixes to device logging, dbi. - Use DP helper for sink count in mst. - Assorted documentation fixes. - Assorted small fixes. - Move DP headers to drm/dp, and add a drm dp helper module. - Move the buddy allocator from i915 to common drm. - Add simple pci and platform module init macros to remove a lot of boilerplate from some drivers. - Support microsoft extension for HMDs and specialized monitors. - Improve edid parser's deep color handling. - Add type 7 timing support to edid parser. - Add a weak backpointer to the ttm_bo from ttm_resource - Add 3 eDP panels. Driver Changes: - Add support for HDMI and JZ4780 to ingenic. - Add support for higher DP/eDP bitrates to nouveau. - Assorted driver fixes to tilcdc, vmwgfx, sn65dsi83, meson, stm, panfrost, v3d, gma500, vc4, virtio, mgag200, ast, radeon, amdgpu, nouveau, various bridge drivers. - Convert and revert exynos dsi support to bridge driver. - Add vcc supply regulator support for sn65dsi83. - More conversion of bridge/chipone-icn6211 to atomic. - Remove conflicting fb's from stm, and add support for new hw version. - Add device link in parade-ps8640 to fix suspend/resume. - Update Boe-tv110c9m init sequence. - Add wide screen support to AST2600. - Fix omapdrm implicit dma_buf fencing. - Add support for multiple overlay planes to vkms. - Convert bridge/anx7625 to atomic, add HDCP support, add eld support for audio, and fix HPD. - Add driver for ChromeOS privacy screen. - Handover display from firmware to vc4 more gracefully, and support nomodeset. - Add flexible and ycbcr pixel formats to stm/ltdc. - Convert exynos mipi dsi to atomic. - Add initial dual core group GPUs support to panfrost. - No longer add exclusive fence in amdgpu as shared fence. - Add CSC and full range supoprt to vc4. - Shutdown the display on system shutdown and unbind. - Add Multi-Inno Technology MI0700S4T-6 simple panel. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/456a23c6-7324-7543-0c45-751f30ef83f7@linux.intel.com	2022-02-01 19:02:41 +10:00
Mario Limonciello	a6ed203587	drm/amd: Warn users about potential s0ix problems On some OEM setups users can configure the BIOS for S3 or S2idle. When configured to S3 users can still choose 's2idle' in the kernel by using `/sys/power/mem_sleep`. Before commit `6dc8265f98` ("drm/amdgpu: always reset the asic in suspend (v2)"), the GPU would crash. Now when configured this way, the system should resume but will use more power. As such, adjust the `amdpu_acpi_is_s0ix function` to warn users about potential power consumption issues during their first attempt at suspending. Reported-by: Bjoren Dasse <bjoern.daase@gmail.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-31 18:10:47 -05:00
Mario Limonciello	f588a1bbfc	drm/amd: Warn users about potential s0ix problems On some OEM setups users can configure the BIOS for S3 or S2idle. When configured to S3 users can still choose 's2idle' in the kernel by using `/sys/power/mem_sleep`. Before commit `6dc8265f98` ("drm/amdgpu: always reset the asic in suspend (v2)"), the GPU would crash. Now when configured this way, the system should resume but will use more power. As such, adjust the `amdpu_acpi_is_s0ix function` to warn users about potential power consumption issues during their first attempt at suspending. Reported-by: Bjoren Dasse <bjoern.daase@gmail.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-31 16:12:29 -05:00
Tomohito Esaki	2af104290d	drm: introduce fb_modifiers_not_supported flag in mode_config If only linear modifier is advertised, since there are many drivers that only linear supported, the DRM core should handle this rather than open-coding in every driver. However, there are legacy drivers such as radeon that do not support modifiers but infer the actual layout of the underlying buffer. Therefore, a new flag fb_modifiers_not_supported is introduced for these legacy drivers, and allow_fb_modifiers is replaced with this new flag. v3: - change the order as follows: 1. add fb_modifiers_not_supported flag 2. add default modifiers 3. remove allow_fb_modifiers flag - add a conditional disable in amdgpu_dm_plane_init() v4: - modify kernel docs v5: - modify kernel docs Signed-off-by: Tomohito Esaki <etom@igel.co.jp> Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220128060836.11216-2-etom@igel.co.jp	2022-01-31 21:45:23 +01:00
huangqu	c57f5ba2c8	drm/amdgpu: Wrong order for config and counter_id parameters Wrong order for config and counter_id parameters was passed, when calling df_v3_6_pmc_set_deferred and df_v3_6_pmc_is_deferred functions. Signed-off-by: huangqu <jinsdb@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:50:22 -05:00
tangmeng	1ec5a44331	drm/amd/amdgpu: fix spelling mistake "disbale" -> "disable" There is a spelling mistake. Fix it. Signed-off-by: tangmeng <tangmeng@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:50:17 -05:00
Alex Deucher	ded81d5b2b	drm/amdgpu: bump driver version for new CTX OP to set/get stable pstates So mesa and tools know when this is available. Mesa MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207 Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:50:14 -05:00
Alex Deucher	8cda7a4f96	drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates Add a new CTX ioctl operation to set stable pstates for profiling. When creating traces for tools like RGP or using SPM or doing performance profiling, it's required to enable a special stable profiling power state on the GPU. These profiling states set fixed clocks and disable certain other power features like powergating which may impact the results. Historically, these profiling pstates were enabled via sysfs, but this adds an interface to enable it via the CTX ioctl from the application. Since the power state is global only one application can set it at a time, so if multiple applications try and use it only the first will get it, the ioctl will return -EBUSY for others. The sysfs interface will override whatever has been set by this interface. Mesa MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207 v2: don't default r = 0; v3: rebase on Evan's PM cleanup Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:50:08 -05:00
Luben Tuikov	3ed893396b	drm/amd: Enable FRU EEPROM for Sienna Cichlid Enable the FRU EEPROM I2C bus for Sienna Cichlid server boards, for which it is enabled by checking the VBIOS version. Cc: Roy Sun <Roy.Sun@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:56 -05:00
Luben Tuikov	2f60dd5076	drm/amd: Expose the FRU SMU I2C bus Expose both SMU I2C buses. Some boards use the same bus for both the RAS and FRU EEPROMs and others use different buses. This enables the additional I2C bus and sets the right buses to use for RAS and FRU EEPROM access. Cc: Roy Sun <Roy.Sun@amd.com> Co-developed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:48 -05:00
Aaron Liu	f06d9e4eec	drm/amdgpu: add 1.3.1/2.4.0 athub CG support This patch adds 1.3.1/2.4.0 athub clock gating support. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:42 -05:00
Aaron Liu	4e13b063d2	drm/amdgpu: convert code name to ip version for athub Use IP version rather than codename for athub. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:36 -05:00
Tao Zhou	bee7f8d092	drm/amdgpu: get hash bit for CH4 in umc channel index On ALDEBARAN, the umc channel bits are not original values, they are hashed. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:13 -05:00
Tao Zhou	e63fa4dcea	drm/amdgpu: update algorithm of umc address conversion On ALDEBARAN, we need to traverse all column bits higher than BIT11(C4C3C2) in a row, the shift of R14 bit should be also taken into account. Retire all pages we find. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:08 -05:00
Tao Zhou	498d46fe7a	drm/amdgpu: increase bad page number for umc ras query One piece of umc normalizing address can be mapped to 16 pieces of physical address in each umc channel on ALDEBARAN. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:49:02 -05:00
Tao Zhou	400013b268	drm/amdgpu: add umc_fill_error_record to make code more simple Create common amdgpu_umc_fill_error_record function for all versions of UMC and clean up related codes. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:48:56 -05:00
Felix Kuehling	fc6ea4bee1	drm/amdgpu: Wipe all VRAM on free when RAS is enabled On GPUs with RAS, poison can propagate between processes if VRAM is not cleared when it is freed or allocated. The reason is, that not all write accesses clear RAS poison. 32-byte writes by the SDMA engine do clear RAS poison. Clearing memory in the background when it is freed should avoid major performance impact. KFD has been doing this already for a long time. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:48:41 -05:00
Tianci.Yin	7270e8957e	drm/amdgpu: Fix an error message in rmmod [why] In rmmod procedure, kfd sends cp a dequeue request, but the request does not get response, then an error message "cp queue pipe 4 queue 0 preemption failed" printed. [how] Performing kfd suspending after disabling gfxoff can fix it. Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:48:30 -05:00
Victor Zhao	039cacd239	drm/amdgpu: add determine passthrough under arm64 add determine for passthrough mode under arm64 by reading CurrentEL register v2: squash in warning fix (Alex) Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:47:34 -05:00
Christian König	3f268ef06f	drm/ttm: add back a reference to the bdev to the res manager It is simply a lot cleaner to have this around instead of adding the device throughout the call chain. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220124122514.1832-3-christian.koenig@amd.com	2022-01-26 15:29:24 +01:00
Christian König	de3688e469	drm/ttm: add ttm_resource_fini v2 Make sure we call the common cleanup function in all implementations of the resource manager. v2: fix missing case in i915, rudimentary kerneldoc, should be filled in more when we add more functionality Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220124122514.1832-2-christian.koenig@amd.com	2022-01-26 15:23:51 +01:00
Tim Huang	37d6b1506b	drm/amdgpu: convert to UVD IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Tim Huang	d726d43c20	drm/amdgpu: convert to NBIO IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Alex Deucher	82c3a7a5ed	drm/amdgpu: convert amdgpu_display_supported_domains() to IP versions Check IP versions rather than asic types. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Alex Deucher	243c719e87	drm/amdgpu: handle BACO synchronization with secondary funcs Extend secondary function handling for runtime pm beyond audio to USB and UCSI. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Alex Deucher	d0d66b8c66	drm/amdgpu: move runtime pm init after drm and fbdev init Seems more logical to enable runtime pm at the end of the init sequence so we don't end up entering runtime suspend before init is finished. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Alex Deucher	901e2be20d	drm/amdgpu: move PX checking into amdgpu_device_ip_early_init We need to set the APU flag from IP discovery before we evaluate this code. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
Alex Deucher	212021297e	drm/amdgpu: set APU flag based on IP discovery table Use the IP versions to set the APU flag when necessary. Reviewed-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:36 -05:00
yipechai	e9287ef8d4	Revert "drm/amdgpu: No longer insert ras blocks into ras_list if it already exists in ras_list" This reverts commit `df4f0041c6`. Xgmi ras initialization had been moved from .late_init to early_init, the defect of repeated calling amdgpu_ras_register_ras_block had been fixed, so revert this patch. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:34 -05:00
yipechai	1f33bd18d7	drm/amdgpu: Move xgmi ras initialization from .late_init to .early_init Move xgmi ras initialization from .late_init to .early_init, which let xgmi ras can be initialized only once. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Stanley.Yang	d6dac2bc12	drm/amdgpu: fix channel index mapping for SIENNA_CICHLID Pmfw read ecc info registers in the following order, umc0: ch_inst 0, 1, 2 ... 7 umc1: ch_inst 0, 1, 2 ... 7 The position of the register value stored in eccinfo table is calculated according to the below formula, channel_index = umc_inst * channel_in_umc + ch_inst Driver directly use the index of eccinfo table array as channel index, it's not correct, driver needs convert eccinfo table array index to channel index according to channel_idx_tbl. Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	04022982fc	drm/amdgpu: switch to common helper to read bios from rom create a common helper function for soc15 and onwards to read bios image from rom Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	381519dff8	drm/amdgpu: retire rlc callbacks sriov_rreg/wreg Not needed anymore. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	1b2dc99e2d	drm/amdgpu: switch to amdgpu_sriov_rreg/wreg Instead of ip specific implementation for rlcg indirect register access Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	5d447e2967	drm/amdgpu: add helper for rlcg indirect reg access The helper will be used to access registers from sriov guest in full access time Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	f8f96b17f0	drm/amdgpu: init rlcg_reg_access_ctrl for gfx10 Initialize all the register offsets that will be used in rlcg indirect reg access path for gfx10 in sw_init phase Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	4819732f59	drm/amdgpu: init rlcg_reg_access_ctrl for gfx9 Initialize all the register offsets that will be used in rlcg indirect reg access path for gfx9 in sw_init phase Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	b12252b053	drm/amdgpu: add structures for rlcg indirect reg access Add structures that are used to cache registers offsets for rlcg indirect reg access ctrl and flag availability of such interface Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	7bbe43f8a4	drm/amdgpu: switch to get_rlcg_reg_access_flag for gfx10 Switch to common helper to query rlcg access flag specified by sriov host driver for gfx10 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	97d1a3b967	drm/amdgpu: switch to get_rlcg_reg_access_flag for gfx9 Switch to common helper to query rlcg access flag specified by sriov host driver for gfx9 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	29dbcac82f	drm/amdgpu: add helper to query rlcg reg access flag Query rlc indirect register access approach specified by sriov host driver per ip blocks Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Xin Xiong	dfced44f12	drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj This issue takes place in an error path in amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into default case, the function simply returns -EINVAL, forgetting to decrement the reference count of a dma_fence obj, which is bumped earlier by amdgpu_cs_get_fence(). This may result in reference count leaks. Fix it by decreasing the refcount of specific object before returning the error code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Alex Deucher	25c6aefcee	drm/amdgpu: filter out radeon secondary ids as well Older radeon boards (r2xx-r5xx) had secondary PCI functions which we solely there for supporting multi-head on OSs with special requirements. Add them to the unsupported list as well so we don't attempt to bind to them. The driver would fail to bind to them anyway, but this does so in a cleaner way that should not confuse the user. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:32 -05:00
Guchun Chen	f9130b81ae	drm/amdgpu: drop WARN_ON in amdgpu_gart_bind/unbind NULL pointer check has guarded it already. calltrace: amdgpu_ttm_gart_bind+0x49/0xa0 [amdgpu] amdgpu_ttm_alloc_gart+0x13f/0x180 [amdgpu] amdgpu_bo_create_reserved+0x139/0x2c0 [amdgpu] ? amdgpu_ttm_debugfs_init+0x120/0x120 [amdgpu] amdgpu_bo_create_kernel+0x17/0x80 [amdgpu] amdgpu_ttm_init+0x542/0x5e0 [amdgpu] Fixes: `1b08dfb889` ("drm/amdgpu: remove gart.ready flag") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:32 -05:00
Lang Yu	6a6c2ab687	drm/amdgpu: enable amdgpu_dc module parameter It doesn't work under IP discovery mode. Make it work! Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Alex Deucher <aleander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:32 -05:00
Mario Limonciello	828904660a	drm/amd: Fix MSB of SMU version printing Yellow carp has been outputting versions like `1093.24.0`, but this is supposed to be 69.24.0. That is the MSB is being interpreted incorrectly. The MSB is not part of the major version, but has generally been treated that way thus far. It's actually the program, and used to distinguish between two programs from a similar family but different codebase. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:32 -05:00
shaoyunl	901abf367d	drm/amdgpu: Disable FRU EEPROM access for SRIOV VF acces the EEPROM is blocked by security policy, we might need other way to get SKUs info for VF v2: squash in compilation fix from Luben Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:32 -05:00
Alex Deucher	9e5a14bce2	drm/amdgpu: filter out radeon secondary ids as well Older radeon boards (r2xx-r5xx) had secondary PCI functions which we solely there for supporting multi-head on OSs with special requirements. Add them to the unsupported list as well so we don't attempt to bind to them. The driver would fail to bind to them anyway, but this does so in a cleaner way that should not confuse the user. Cc: stable@vger.kernel.org Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 17:50:13 -05:00
Maxime Ripard	4adc33f36d	drm/edid: Split deep color modes between RGB and YUV444 The current code assumes that the RGB444 and YUV444 formats are the same, but the HDMI 2.0 specification states that: The three DC_XXbit bits above only indicate support for RGB 4:4:4 at that pixel size. Support for YCBCR 4:4:4 in Deep Color modes is indicated with the DC_Y444 bit. If DC_Y444 is set, then YCBCR 4:4:4 is supported for all modes indicated by the DC_XXbit flags. So if we have YUV444 support and any DC_XXbit flag set but the DC_Y444 flag isn't, we'll assume that we support that deep colour mode for YUV444 which breaks the specification. In order to fix this, let's split the edid_hdmi_dc_modes field in struct drm_display_info into two fields, one for RGB444 and one for YUV444. Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: `d0c94692e0` ("drm/edid: Parse and handle HDMI deep color modes.") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220120151625.594595-4-maxime@cerno.tech	2022-01-25 10:01:24 +01:00
Christian König	b3bddb7a38	drm/amdgpu: use ttm_resource_manager_debug Instead of calling the debug operation directly. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211124124430.20859-10-christian.koenig@amd.com	2022-01-24 11:24:06 +01:00
Xiaojian Du	a357dca964	drm/amdgpu: fix the page fault caused by uninitialized variables This patch will fix the page fault caused by uninitialized variables. Error Log: ...... [ 130.246323] [drm] GART: num cpu pages 131072, num gpu pages 131072 [ 131.963112] [drm] PCIE GART of 512M enabled (table at 0x0000008000000000). [ 131.963130] BUG: unable to handle page fault for address: 000000000002db80 [ 131.963181] #PF: supervisor write access in kernel mode [ 131.963210] #PF: error_code(0x0002) - not-present page [ 131.963233] PGD 0 P4D 0 [ 131.963253] Oops: 0002 [#1] SMP NOPTI [ 131.963273] CPU: 3 PID: 1411 Comm: modprobe Not tainted 5.13.0+ #1 [ 131.963338] RIP: 0010:osq_lock+0x4d/0x120 [ 131.963381] Code: 10 00 00 00 00 48 c7 02 00 00 00 00 89 42 14 87 07 85 c0 0f 84 d0 00 00 00 83 e8 01 48 98 48 03 0c c5 00 d9 ea 9c 48 89 4a 08 <48> 89 11 44 8b 42 10 45 85 c0 0f 85 af 00 00 00 55 48 89 fe 65 4c [ 131.963460] RSP: 0018:ffffa40481717768 EFLAGS: 00010202 [ 131.963483] RAX: fffffffffffffffe RBX: ffffa40481717920 RCX: 000000000002db80 [ 131.963520] RDX: ffff9256fecedb80 RSI: ffff9256cbed2e80 RDI: ffffa40481717ac4 [ 131.963547] RBP: ffffa40481717808 R08: ffffa40481717920 R09: 00000000ffffffff [ 131.963582] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 [ 131.963609] R13: ffffa40481717ac4 R14: ffffa40481717ab8 R15: ffff9256c9480000 [ 131.963646] FS: 00007f23d9b9c540(0000) GS:ffff9256fecc0000(0000) knlGS:0000000000000000 [ 131.963687] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 131.963721] CR2: 000000000002db80 CR3: 0000000008444000 CR4: 00000000000506e0 [ 131.963758] Call Trace: [ 131.963772] ? __ww_mutex_lock.isra.0+0x3a2/0x760 [ 131.963810] ? prb_read_valid+0x1c/0x20 [ 131.963830] ? console_unlock+0x2fe/0x4f0 [ 131.963849] __ww_mutex_lock_interruptible_slowpath+0x16/0x20 [ 131.963882] ww_mutex_lock_interruptible+0x83/0x90 [ 131.963908] amdgpu_bo_create_reserved+0xf0/0x1e0 [amdgpu] [ 131.964237] amdgpu_bo_create_kernel+0x17/0x80 [amdgpu] [ 131.964509] amdgpu_gmc_vram_checking+0x41/0xf0 [amdgpu] [ 131.964807] gmc_v10_0_hw_init+0x105/0x120 [amdgpu] [ 131.965108] amdgpu_device_init.cold+0x1aa4/0x1e3e [amdgpu] ...... Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-21 16:25:49 -05:00
Stanley.Yang	37ff945f80	drm/amdgpu: fix convert bad page retiremt Pmfw read ecc info registers and store values in eccinfo_table in the following order umc0 ch_inst 0, 1, 2 ... 7 umc1 ch_inst 0, 1, 2 ... 7 ... umc3 ch_inst 0, 1, 2 ... 7 Driver should convert eccinfo_table_idx to channel_index according to channel_idx_tbl. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-21 16:25:36 -05:00
Linus Torvalds	c2c94b3b18	drm fixes for 5.17-rc1 amdgpu: - SR-IOV fix - VCN harvest fix - Suspend/resume fixes - Tahiti fix - Enable GPU recovery on yellow carp radeon: - Fix error handling regression in radeon_driver_open_kms i915: - Update EHL display voltage swing table - Fix programming the ADL-P display TC voltage swing -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmHp+xAACgkQDHTzWXnE hr47LQ//ZeXWUZSOxFiYa8mzRkUuBWCihj7xdGiKlHiSBz6FaGiiaMutqorG9V3O ktQKji16Q48vvvLZmRecigrZ3maOtisAgNWgdlKT1XbgMnVCmXcbhb57mNbLC2D/ HcV6b5wKvLmTpNMyto6gRlPXyDMgczP76ChqyHb+MdUZfXEmAh6yAeP06sR9KaG6 XF17SMI+KB9OLnnRrwg+ws+Lh6KCHZYVA8LGAapTTGUbn8yAS49/JrE2QjKTCDZo 1v2i77dblnxHNvI4kPlrDJEndwa+VJdUoqseZTyRwwVBm3vrggNLvkclzCRH9AuI 61p8RW6+w0xqfM73+5B+HEFb8dpVkts+E6JdYL9ZkQ+5/Hz1EamBDqKcZKd5f6Yd DC7yit07rzRPEV/YvAnJV0AMxLKy8RKjbxfB7Q6SapCENVp9kGc8mGJa5nlfbGBh 3dz1Moop8/tiqf2WRYOY5yotcXBxySDKFzrW9QDABqBb8m8UVbsW9EO4iL+0fhvW hosbPWop6CvsvT2QSyHhpeVPhpkZwNmwPzrrONzjf+K6Q7jm9fDYqbbmFkQMrGeL c93Ii4OQRjSok/dKTWIH+YCPdQF9bmwtjae8ul6CDkWniBW/p0u5T9fXD2ylUGxW D0F0NPcV4G1S/MsrFzAmJXJE7n4Fjd39nnIRiOMg4d4cdRkAIUQ= =cNda -----END PGP SIGNATURE----- Merge tag 'drm-next-2022-01-21' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Thanks to Daniel for taking care of things while I was out, just a set of merge window fixes that came in this week, two i915 display fixes and a bunch of misc amdgpu, along with a radeon regression fix. amdgpu: - SR-IOV fix - VCN harvest fix - Suspend/resume fixes - Tahiti fix - Enable GPU recovery on yellow carp radeon: - Fix error handling regression in radeon_driver_open_kms i915: - Update EHL display voltage swing table - Fix programming the ADL-P display TC voltage swing" * tag 'drm-next-2022-01-21' of git://anongit.freedesktop.org/drm/drm: drm/radeon: fix error handling in radeon_driver_open_kms drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOV drm/amdgpu: apply vcn harvest quirk drm/i915/display/adlp: Implement new step in the TC voltage swing prog sequence drm/i915/display/ehl: Update voltage swing table drm/amd/display: Revert W/A for hard hangs on DCN20/DCN21 drm/amdgpu: drop flags check for CHIP_IP_DISCOVERY drm/amdgpu: Fix rejecting Tahiti GPUs drm/amdgpu: don't do resets on APUs which don't support it drm/amdgpu: invert the logic in amdgpu_device_should_recover_gpu() drm/amdgpu: Enable recovery on yellow carp	2022-01-21 09:25:38 +02:00
Eric Huang	f61c40c075	drm/amdkfd: enable heavy-weight TLB flush on Arcturus SDMA FW fixes the hang issue for adding heavy-weight TLB flush on Arcturus, so we can enable it. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-19 22:33:27 -05:00
Jonathan Kim	590e86fe34	drm/amdgpu: fix broken debug sdma vram access function Debug VRAM access through SDMA has several broken parts resulting in silent MMIO fallback. BO kernel creation takes the location of the cpu addr pointer, not the pointer itself for address kmap. drm_dev_enter return true on success so change access check. The source BO is reserved but not pinned so find the address using the cursor offset relative to its memory domain start. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-19 22:33:03 -05:00
Christian König	1b08dfb889	drm/amdgpu: remove gart.ready flag That's just a leftover from old radeon days and was preventing CS and GART bindings before the hardware was initialized. But nowdays that is perfectly valid. The only thing we need to warn about are GART binding before the table is even allocated. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-19 22:32:47 -05:00
Stanley.Yang	5904e4135f	drm/amdgpu: remove unused variable warning Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-19 22:32:38 -05:00
mziya	33cd016e60	drm/amdgpu: remove unused variable Remove set but unused variable. warning: variable 'umc_reg_offset' set but not used Signed-off-by: mziya <Mohammadzafar.ziya@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-19 22:32:28 -05:00
yipechai	8eb53bb2aa	drm/amdgpu: Remove repeated calls Remove repeated calls. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-19 22:32:19 -05:00
Xiaojian Du	86700a4026	drm/amdgpu: modify a pair of functions for the pcie port wreg/rreg This patch will modify a pair of functions for pcie port wreg/rreg. AMD GPU have had an independent NBIO block from SOC15 arch. If the dirver wants to read/write the address space of the pcie devices, it has to go through the NBIO block. This patch will move the pcie port wreg/rreg functions to "amdgpu_device.c", so that to reuse the functions on the future GPU ASICs. Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-19 22:32:11 -05:00
Xiaojian Du	479e3b02b7	drm/amdgpu: add vram check function for GMC This patch will add vram check function for GMC block. It will write pattern data to the vram and then read back from the vram, so that to verify the work status of vram. This patch will cover gmc v6/7/8/9/10. Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-19 22:31:51 -05:00
Christian König	75ab2b3633	dma-buf: drop excl_fence parameter from dma_resv_get_fences Returning the exclusive fence separately is no longer used. Instead add a write parameter to indicate the use case. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20211207123411.167006-4-christian.koenig@amd.com	2022-01-19 10:03:56 +01:00
Christian König	acde6234f6	drm/amdgpu: remove excl as shared workarounds This was added because of the now dropped shared on excl dependency. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20211123142111.3885-15-christian.koenig@amd.com	2022-01-19 08:25:50 +01:00
Jingwen Chen	9a458402fb	drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOV [Why] This fixes `892deb4826` ("drm/amdgpu: Separate vf2pf work item init from virt data exchange"). we should read pf2vf data based at mman.fw_vram_usage_va after gmc sw_init. commit `892deb4826` breaks this logic. [How] calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to set the right base in the right sequence. v2: call amdgpu_virt_init_data_exchange after gmc sw_init to make data exchange workqueue run v3: clean up the code logic v4: add some comment and make the code more readable Fixes: `892deb4826` ("drm/amdgpu: Separate vf2pf work item init from virt data exchange") Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 18:00:58 -05:00
Guchun Chen	520d9cd267	drm/amdgpu: apply vcn harvest quirk This is a following patch to apply the workaround only on those boards with a bad harvest table in ip discovery. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 18:00:58 -05:00
Minghao Chi	a5e7ffa119	amdgpu/amdgpu_psp: remove unneeded ret variable Return value from amdgpu_bo_create_kernel() directly instead of taking this in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: CGEL ZTE <cgel.zte@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:43:36 -05:00
Yongzhi Liu	4bd8dd0d61	drm/amdgpu: Add missing pm_runtime_put_autosuspend pm_runtime_get_sync() increments the runtime PM usage counter even when it returns an error code, thus a matching decrement is needed on the error handling path to keep the counter balanced. Signed-off-by: Yongzhi Liu <lyz_cs@pku.edu.cn> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:43:36 -05:00
Jingwen Chen	22c16d251a	drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOV [Why] This fixes `892deb4826` ("drm/amdgpu: Separate vf2pf work item init from virt data exchange"). we should read pf2vf data based at mman.fw_vram_usage_va after gmc sw_init. commit `892deb4826` breaks this logic. [How] calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to set the right base in the right sequence. v2: call amdgpu_virt_init_data_exchange after gmc sw_init to make data exchange workqueue run v3: clean up the code logic v4: add some comment and make the code more readable Fixes: `892deb4826` ("drm/amdgpu: Separate vf2pf work item init from virt data exchange") Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:41:03 -05:00
yipechai	71b6c4a277	drm/amdgpu: Fix the code style warnings in hdp xgmi mca and umc drm/amdgpu: Fix the code style warnings in hdp xgmi mca and umc: 1. WARNING: missing space after struct definition. 2. WARNING: please, no space before tabs. 3. WARNING: line length of xxx exceeds 100 columns. 4. ERROR: "foo* bar" should be "foo *bar". 5. ERROR: space required before the open parenthesis '('. 6. ERROR: space prohibited after that open parenthesis '('. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:24:24 -05:00
yipechai	8697a19ee9	drm/amdgpu: Fix the code style warnings in sdma Fix the code style warnings in sdma: 1. WARNING: Missing a blank line after declarations. 2. ERROR: that open brace { should be on the previous line. 3. WARNING: unnecessary whitespace before a quoted newline. 4. ERROR: space required after that ',' (ctx:VxV). Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:24:18 -05:00
yipechai	d622c094f8	drm/amdgpu: Fix the code style warnings in gmc Fix the code style warnings in gmc: ERROR: space required after that ',' (ctx:VxV). Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:24:12 -05:00
yipechai	4f64ccf4f2	drm/amdgpu: Fix the code style warnings in gfx Fix the code style warnings in gfx: 1. WARNING: suspect code indent for conditional statements. 2. ERROR: spaces required around that '=' (ctx:WxV). Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:24:05 -05:00
yipechai	b6efdb02d2	drm/amdgpu: Fix the code style warnings in amdgpu_ras Fix the code style warnings in amdgpu_ras: 1. ERROR: space required before the open parenthesis '('. 2. WARNING: line length of xxx exceeds 100 columns. 3. ERROR: "foo* bar" should be "foo *bar". 4. WARNING: unnecessary whitespace before a quoted newline. 5. WARNING: space prohibited before semicolon. 6. WARNING: suspect code indent for conditional statements. 7. WARNING: braces {} are not necessary for single statement blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:23:58 -05:00
Guchun Chen	03f6fb84bd	drm/amdgpu: apply vcn harvest quirk This is a following patch to apply the workaround only on those boards with a bad harvest table in ip discovery. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:23:51 -05:00
Guchun Chen	e475986f18	drm/amdgpu: drop redundant check of ip discovery_bin Early check in amdgpu_discovery_reg_base_init promises this. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:23:43 -05:00
Stanley.Yang	79c0462159	drm/amdgpu: handle denied inject error into critical regions v2 Changed from v1: remove unused brace Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:22:36 -05:00
mziya	c34242eea1	drm/amdgpu: add new query interface for umc_v8_7 block add smu message query error information interface, function name align with IP version number V2: Removed unused err cnt entry Signed-off-by: mziya <Mohammadzafar.ziya@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-18 17:22:10 -05:00
Thomas Zimmermann	5b529e8d9c	drm/dp: Move public DisplayPort headers into dp/ Move all public DisplayPort headers into dp/ and update users. No functional changes. v3: * rebased onto latest drm-tip Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Daniel Vetter <daniel@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220114114535.29157-5-tzimmermann@suse.de	2022-01-17 11:25:44 +01:00
Linus Torvalds	59d41458f1	drm fixes for 5.17-rc1: drivers fixes: - i915 fixes for ttm backend + one pm wakelock fix - amdgpu fixes, fairly big pile of small things all over. Note this doesn't yet containe the fixed version of the otg sync patch that blew up - small driver fixes: meson, sun4i, vga16fb probe fix drm core fixes: - cma-buf heap locking - ttm compilation - self refresh helper state check - wrong error message in atomic helpers - mipi-dbi buffer mapping -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEb4nG6jLu8Y5XI+PfTA9ye/CYqnEFAmHhrxUACgkQTA9ye/CY qnEowxAAgBPwGEobRGMbR3Me98vEKvcWqSxBe/k1VC4LhO5DrvbG5iW9cuxCJZM2 wlGlGAtU7C7pcCP5Xp1UlMqZ5a0rSVhqMPPkMKO9+7033ofSlAQatnMI1EENH6Hn BkhXwTyuOBSN6zqskg8FKqzF+VPTt5ZV2U5qJzQweP/wFtZPAKI4tWE4oKiHactH fJHnAi7T6ytF6a7J21BsSEluk4z7BjmcmFF0tW6iuq7Y6TXDFXFq9QFDR041b2rI GYDUXl2mebp/L+2M3sPYuMiIiyJ8enh7crNIdmi+EstmzRADa7RMjnY3j2tJg/7M pqnuJZAVcpkCurb7NMr3ycmrxnhfUsZfbuXvm+k5yJYfQCaGNiKy1ObsFWH9zBDz XMuxcE+csSaX/7rjoyXrL2ZTRPXnVwJNJ8x1CuKn3giLxMSqnPnDMjyHmNLB8qa1 R0wbPQbdx5+jWgs/ngUGFNo4vFBnNmqQP4G3LaWJ/Ku5cSrEM+Jt9GJOw5jjVgun gaOlKlUpMBlKmXOwkvhRW2AwHRcL7lrBuIw0oFOThdMzkSNlKSNBMpAkgvD/9C7g IDtJgA7a3MqzLjhPLmhUB3rFyXz5dg5rNZhoH8z4DFiJVpTKYYA5/UoU/RTXZGqW yFdrBkFmNJeKTZZAe+e/4OP/dm1vKVEKK6Ko/M9CrELzBKSussg= =h74X -----END PGP SIGNATURE----- Merge tag 'drm-next-2022-01-14' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Daniel Vetter: "drivers fixes: - i915 fixes for ttm backend + one pm wakelock fix - amdgpu fixes, fairly big pile of small things all over. Note this doesn't yet containe the fixed version of the otg sync patch that blew up - small driver fixes: meson, sun4i, vga16fb probe fix drm core fixes: - cma-buf heap locking - ttm compilation - self refresh helper state check - wrong error message in atomic helpers - mipi-dbi buffer mapping" * tag 'drm-next-2022-01-14' of git://anongit.freedesktop.org/drm/drm: (49 commits) drm/mipi-dbi: Fix source-buffer address in mipi_dbi_buf_copy drm: fix error found in some cases after the patch d1af5cd86997 drm/ttm: fix compilation on ARCH=um dma-buf: cma_heap: Fix mutex locking section video: vga16fb: Only probe for EGA and VGA 16 color graphic cards drm/amdkfd: Fix ASIC name typos drm/amdkfd: Fix DQM asserts on Hawaii drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2 drm/amd/pm: only send GmiPwrDnControl msg on master die (v3) drm/amdgpu: use spin_lock_irqsave to avoid deadlock by local interrupt drm/amdgpu: not return error on the init_apu_flags drm/amdkfd: Use prange->update_list head for remove_list drm/amdkfd: Use prange->list head for insert_list drm/amdkfd: make SPDX License expression more sound drm/amdkfd: Check for null pointer after calling kmemdup drm/amd/display: invalid parameter check in dmub_hpd_callback Revert "drm/amdgpu: Don't inherit GEM object VMAs in child process" drm/amd/display: reset dcn31 SMU mailbox on failures drm/amdkfd: use default_groups in kobj_type drm/amdgpu: use default_groups in kobj_type ...	2022-01-16 06:52:38 +02:00
Alex Deucher	a8e6398ffe	drm/amdgpu: drop flags check for CHIP_IP_DISCOVERY Support for IP based discovery is in place now so this check is no longer required. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 18:08:01 -05:00
Lukas Fink	5f0754ab27	drm/amdgpu: Fix rejecting Tahiti GPUs `eb4fd29afd` ("drm/amdgpu: bind to any 0x1002 PCI diplay class device") added generic bindings to amdgpu so that that it binds to all display class devices with VID 0x1002 and then rejects those in amdgpu_pci_probe. Unfortunately it reuses a driver_data value of 0 to detect those new bindings, which is already used to denote CHIP_TAHITI ASICs. The driver_data value given to those new bindings was changed in dd0761fd24ea1 ("drm/amdgpu: set CHIP_IP_DISCOVERY as the asic type by default") to CHIP_IP_DISCOVERY (=36), but it seems that the check in amdgpu_pci_probe was forgotten to be changed. Therefore, it still rejects Tahiti GPUs. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1860 Fixes: `eb4fd29afd` ("drm/amdgpu: bind to any 0x1002 PCI diplay class device") Signed-off-by: Lukas Fink <lukas.fink1@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 18:07:55 -05:00
Alex Deucher	06cf9bd61a	drm/amdgpu: don't do resets on APUs which don't support it It can cause a hang. This is normally not enabled for GPU hangs on these asics, but was recently enabled for handling aborted suspends. This causes hangs on some platforms on suspend. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1858 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 18:07:48 -05:00
Alex Deucher	d82ce3cd30	drm/amdgpu: drop flags check for CHIP_IP_DISCOVERY Support for IP based discovery is in place now so this check is no longer required. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 18:06:44 -05:00
Lukas Fink	3993a799fc	drm/amdgpu: Fix rejecting Tahiti GPUs `eb4fd29afd` ("drm/amdgpu: bind to any 0x1002 PCI diplay class device") added generic bindings to amdgpu so that that it binds to all display class devices with VID 0x1002 and then rejects those in amdgpu_pci_probe. Unfortunately it reuses a driver_data value of 0 to detect those new bindings, which is already used to denote CHIP_TAHITI ASICs. The driver_data value given to those new bindings was changed in dd0761fd24ea1 ("drm/amdgpu: set CHIP_IP_DISCOVERY as the asic type by default") to CHIP_IP_DISCOVERY (=36), but it seems that the check in amdgpu_pci_probe was forgotten to be changed. Therefore, it still rejects Tahiti GPUs. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1860 Fixes: `eb4fd29afd` ("drm/amdgpu: bind to any 0x1002 PCI diplay class device") Cc: stable@vger.kernel.org Signed-off-by: Lukas Fink <lukas.fink1@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 18:06:44 -05:00
Alex Deucher	e8309d50e9	drm/amdgpu: don't do resets on APUs which don't support it It can cause a hang. This is normally not enabled for GPU hangs on these asics, but was recently enabled for handling aborted suspends. This causes hangs on some platforms on suspend. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Cc: stable@vger.kernel.org Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1858 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 18:06:44 -05:00
Alex Deucher	0ffb1fd158	drm/amdgpu: invert the logic in amdgpu_device_should_recover_gpu() Rather than opting into GPU recovery support, default to on, and opt out if it's not working on a particular GPU. This avoids the need to add new asics to this list since this is a core feature. Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 18:06:44 -05:00
CHANDAN VURDIGERE NATARAJ	4175c32be5	drm/amdgpu: Enable recovery on yellow carp Add yellow carp to devices which support recovery Signed-off-by: CHANDAN VURDIGERE NATARAJ <chandan.vurdigerenataraj@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 18:06:44 -05:00
Alex Deucher	b3523c4573	drm/amdgpu: invert the logic in amdgpu_device_should_recover_gpu() Rather than opting into GPU recovery support, default to on, and opt out if it's not working on a particular GPU. This avoids the need to add new asics to this list since this is a core feature. Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:50 -05:00
CHANDAN VURDIGERE NATARAJ	31425abeda	drm/amdgpu: Enable recovery on yellow carp Add yellow carp to devices which support recovery Signed-off-by: CHANDAN VURDIGERE NATARAJ <chandan.vurdigerenataraj@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:50 -05:00
yipechai	e3d833f41c	drm/amdgpu: fix compile warning for ras_block_match_default fix compile warning for ras_block_match_default Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:44 -05:00
yipechai	954ea6aa15	drm/amdgpu: Use ARRAY_SIZE to get array length Use ARRAY_SIZE to get array length. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:26 -05:00
Yang Li	ab3b9de65b	drm/amdgpu: clean up some inconsistent indenting Eliminate the follow smatch warnings: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3504 amdgpu_device_init() warn: inconsistent indenting drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1716 amdgpu_ras_error_status_query() warn: if statement not indented drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1058 amdgpu_ras_error_inject() warn: inconsistent indenting Reported-by: Abaci Robot <abaci@linux.alibaba.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:10 -05:00
Yang Li	69f91d32c6	drm/amdgpu: remove unneeded semicolon Eliminate the following coccicheck warning: ./drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2725:16-17: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:07 -05:00
yipechai	df4f0041c6	drm/amdgpu: No longer insert ras blocks into ras_list if it already exists in ras_list No longer insert ras blocks into ras_list if it already exists in ras_list. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:04 -05:00
yipechai	df01fe73ee	drm/amdgpu: Add ras supported check for register_ras_block Add ras supported check for register_ras_block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:00 -05:00
Bokun Zhang	c4381d0ee8	drm/amdgpu: Add interface to load SRIOV cap FW - Add interface to load SRIOV cap FW. If the FW does not exist, simply skip this FW loading routine. This FW will only be loaded under SRIOV. Other driver configuration will not be affected. By adding this interface, it will make us easier to prepare SRIOV Linux guest driver for different users. - Update sysfs interface to read cap FW version. - Refactor PSP FW loading routine under SRIOV to use a unified SWITCH statement instead of using IF statement - Remove redundant amdgpu_sriov_vf() check in FW loading routine Acked-by: Monk Liu <monk.liu@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:00 -05:00
Jonathan Kim	400ef298f4	drm/amdgpu: cleanup ttm debug sdma vram access function Some suggested cleanups to declutter ttm when doing debug VRAM access over SDMA. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:00 -05:00
Jonathan Kim	cb5cc4f573	drm/amdgpu: improve debug VRAM access performance using sdma For better performance during VRAM access for debugged processes, do read/write copies over SDMA. In order to fulfill post mortem debugging on a broken device, fallback to stable MMIO access when gpu recovery is disabled or when job submission time outs are set to max. Failed SDMA access should automatically fall back to MMIO access. Use a pre-allocated GTT bounce buffer pre-mapped into GART to avoid page-table updates and TLB flushes on access. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:00 -05:00
yipechai	7389a5b837	drm/amdgpu: Removed redundant ras code Removed redundant ras code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:00 -05:00
yipechai	22d4ba53b1	drm/amdgpu: Adjust error inject function code style in amdgpu_ras.c 1. Move xgmi special error inject function from amdgpu_ras.c to xgmi block. 2. Support to use psp_ras_trigger_error as default error inject function in amdgpu_ras.c. If .ras_error_inject isn't defined in ras block, default error inject function will take effect. v2: squash in warning fix (Alex) Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:00 -05:00
yipechai	b0e2062dc8	drm/amdgpu: Modify mca block to fit for the unified ras block data and ops 1.Modify mca block to fit for the unified ras block data and ops. 2.Define special .ras_block_match function for mca block to identify itself. 3.Change amdgpu_mca_ras_funcs to amdgpu_mca_ras_block(amdgpu_mca_ras had been used), and the corresponding variable name remove _funcs suffix. 4.Remove the const flag of cma ras variable so that cma ras block can be able to be inserted into amdgpu device ras block link list. 5.Invoke amdgpu_ras_register_ras_block function to register cma ras block into amdgpu device ras block link list. 6.Remove the redundant code about cma in amdgpu_ras.c after using the unified ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:00 -05:00
yipechai	bdc4292bd3	drm/amdgpu: Modify sdma block to fit for the unified ras block data and ops 1.Modify sdma block to fit for the unified ras block data and ops. 2.Change amdgpu_sdma_ras_funcs to amdgpu_sdma_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of sdma ras variable so that sdma ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register sdma ras block into amdgpu device ras block link list. 5.Remove the redundant code about sdma in amdgpu_ras.c after using the unified ras block. 6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of sdma versions. If .ras_late_init and .ras_fini had been defined by the selected sdma version, the defined functions will take effect; if not defined, default fill them with amdgpu_sdma_ras_late_init and amdgpu_sdma_ras_fini. v2: squash in warning fix (Alex) Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:52:00 -05:00
yipechai	efe17d5a21	drm/amdgpu: Modify umc block to fit for the unified ras block data and ops 1.Modify umc block to fit for the unified ras block data and ops. 2.Change amdgpu_umc_ras_funcs to amdgpu_umc_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of umc ras variable so that umc ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register umc ras block into amdgpu device ras block link list. 5.Remove the redundant code about umc in amdgpu_ras.c after using the unified ras block. 6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of umc versions. If .ras_late_init and .ras_fini had been defined by the selected umc version, the defined functions will take effect; if not defined, default fill them with amdgpu_umc_ras_late_init and amdgpu_umc_ras_fini. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:59 -05:00
yipechai	2e54fe5d05	drm/amdgpu: Modify nbio block to fit for the unified ras block data and ops 1.Modify nbio block to fit for the unified ras block data and ops. 2.Change amdgpu_nbio_ras_funcs to amdgpu_nbio_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of mmhub ras variable so that nbio ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register nbio ras block into amdgpu device ras block link list. 5.Remove the redundant code about nbio in amdgpu_ras.c after using the unified ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:59 -05:00
yipechai	5e67bba301	drm/amdgpu: Modify mmhub block to fit for the unified ras block data and ops 1.Modify mmhub block to fit for the unified ras block data and ops. 2.Change amdgpu_mmhub_ras_funcs to amdgpu_mmhub_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of mmhub ras variable so that mmhub ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register mmhub ras block into amdgpu device ras block link list. 5.Remove the redundant code about mmhub in amdgpu_ras.c after using the unified ras block. 5.Remove the redundant code about mmhub in amdgpu_ras.c after using the unified ras block. 6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of mmhub versions. If .ras_late_init and .ras_fini had been defined by the selected mmhub version, the defined functions will take effect; if not defined, default fill them with amdgpu_mmhub_ras_late_init and amdgpu_mmhub_ras_fini. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:59 -05:00
yipechai	6d76e9049a	drm/amdgpu: Modify hdp block to fit for the unified ras block data and ops 1.Modify hdp block to fit for the unified ras block data and ops. 2.Change amdgpu_hdp_ras_funcs to amdgpu_hdp_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of hdp ras variable so that hdp ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register hdp ras block into amdgpu device ras block link list. 5.Remove the redundant code about hdp in amdgpu_ras.c after using the unified ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:59 -05:00
yipechai	6c2453861f	drm/amdgpu: Modify xgmi block to fit for the unified ras block data and ops 1.Modify gmc block to fit for the unified ras block data and ops. 2.Change amdgpu_xgmi_ras_funcs to amdgpu_xgmi_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of gmc ras variable so that gmc ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register gmc ras block into amdgpu device ras block link list. 5.Remove the redundant code about gmc in amdgpu_ras.c after using the unified ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:59 -05:00
yipechai	8b0fb0e967	drm/amdgpu: Modify gfx block to fit for the unified ras block data and ops 1.Modify gfx block to fit for the unified ras block data and ops. 2.Change amdgpu_gfx_ras_funcs to amdgpu_gfx_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of gfx ras variable so that gfx ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register gfx ras block into amdgpu device ras block link list. 5.Remove the redundant code about gfx in amdgpu_ras.c after using the unified ras block. 6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of gfx versions. If .ras_late_init and .ras_fini had been defined by the selected gfx version, the defined functions will take effect; if not defined, default fill with amdgpu_gfx_ras_late_init and amdgpu_gfx_ras_fini. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:59 -05:00
yipechai	7cab212405	drm/amdgpu: Modify the compilation failed problem when other ras blocks' .h include amdgpu_ras.h Modify the compilation failed problem when other ras blocks' .h include amdgpu_ras.h. v2: squash in forward declaration warning fix (Alex) Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:59 -05:00
yipechai	6492e1b07c	drm/amdgpu: Unify ras block interface for each ras block 1. Define unified ops interface for each block. 2. Add ras_block_match function pointer in ops interface, each ras block can customize specail match function to identify itself. 3. Add amdgpu_ras_block_match_default new function. If a ras block doesn't define .ras_block_match, default execute amdgpu_ras_block_match_default to identify this ras block. 4. Define unified basic ras block data for each ras block. 5. Create dedicated amdgpu device ras block link list to manage all of the ras blocks. 6. Add amdgpu_ras_register_ras_block new function interface for each ras block to register itself to ras controlling block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:59 -05:00
Evan Quan	1a408c710d	drm/amdgpu: wrap those atombios APIs used by SI under CONFIG_DRM_AMDGPU_SI No need to compile them on CONFIG_DRM_AMDGPU_SI disabled. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:15 -05:00
Evan Quan	ebfc253335	drm/amd/pm: do not expose the smu_context structure used internally in power This can cover the power implementation details. And as what did for powerplay framework, we hook the smu_context to adev->powerplay.pp_handle. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:14 -05:00
Evan Quan	d698a2c485	drm/amd/pm: move pp_force_state_enabled member to amdgpu_pm structure As it lables an internal pm state and amdgpu_pm structure is the more proper place than amdgpu_device structure for it. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:14 -05:00
Evan Quan	84176663e7	drm/amd/pm: create a new holder for those APIs used only by legacy ASICs(si/kv) Those APIs are used only by legacy ASICs(si/kv). They cannot be shared by other ASICs. So, we create a new holder for them. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:14 -05:00
Evan Quan	bc143d8b83	drm/amd/pm: do not expose implementation details to other blocks out of power Those implementation details(whether swsmu supported, some ppt_funcs supported, accessing internal statistics ...)should be kept internally. It's not a good practice and even error prone to expose implementation details. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:14 -05:00
Solomon Chiu	de05abe6b9	drm/amd/display: Enable Freesync Video Mode by default [Why&How] Freesync Video Mode is a experimental feature previously, and need to be enabled by kernel parameter. We enable it by default with removing module paramterter in amdgpu_dm. v2: squash the patches together Signed-off-by: Solomon Chiu <solomon.chiu@amd.com> Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-14 17:51:13 -05:00
Daniel Vetter	4efdddbce7	Merge tag 'amd-drm-next-5.17-2022-01-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.17-2022-01-12: amdgpu: - SR-IOV fixes - Suspend/resume fixes - Display fixes - DMCUB fixes - DP alt mode fixes - RAS fixes - UBSAN fix - Navy Flounder VCN fix - ttm resource manager cleanup - default_groups change for kobj_type - vkms fix - Aldebaran fixes amdkfd: - SDMA ECC interrupt fix - License clarification - Pointer check fix - DQM fixes for hawaii - default_groups change for kobj_type - Typo fixes Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220113030537.5758-1-alexander.deucher@amd.com	2022-01-14 15:42:28 +01:00
Daniel Vetter	71e4a70290	Two DT bindings fixes for meson, a device refcounting fix for sun4i, a probe fix for vga16fb, a locking fix for the CMA dma-buf heap and a compilation fix for ttm. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYeFyZQAKCRDj7w1vZxhR xel6AQCm6cP8oxsADsMgKCmJQXff0a3/AH+LhXAsl0bX5coUVQEAwN2tsBgjJi2D R8imZ0Ex9llmuxAGx6izKSyyFmeLKAM= =teXV -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2022-01-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-next Two DT bindings fixes for meson, a device refcounting fix for sun4i, a probe fix for vga16fb, a locking fix for the CMA dma-buf heap and a compilation fix for ttm. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> [danvet: I made sure I have exactly the same conflict resolution as Linus in `8d0749b4f8` ("Merge tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm") to avoid further conflict fun. From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220114125454.zs46ny52lrxk3ljz@houat	2022-01-14 15:17:17 +01:00
Harry Wentland	dc5d4aff2e	drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2 For some reason this file isn't using the appropriate register headers for DCN headers, which means that on DCN2 we're getting the VIEWPORT_DIMENSION offset wrong. This means that we're not correctly carving out the framebuffer memory correctly for a framebuffer allocated by EFI and therefore see corruption when loading amdgpu before the display driver takes over control of the framebuffer scanout. Fix this by checking the DCE_HWIP and picking the correct offset accordingly. Long-term we should expose this info from DC as GMC shouldn't need to know about DCN registers. Cc: stable@vger.kernel.org Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:28 -05:00
Guchun Chen	2096b74b1d	drm/amdgpu: use spin_lock_irqsave to avoid deadlock by local interrupt This is observed in SRIOV case with virtual KMS as display. _raw_spin_lock_irqsave+0x37/0x40 drm_handle_vblank+0x69/0x350 [drm] ? try_to_wake_up+0x432/0x5c0 ? amdgpu_vkms_prepare_fb+0x1c0/0x1c0 [amdgpu] drm_crtc_handle_vblank+0x17/0x20 [drm] amdgpu_vkms_vblank_simulate+0x4d/0x80 [amdgpu] __hrtimer_run_queues+0xfb/0x230 hrtimer_interrupt+0x109/0x220 __sysvec_apic_timer_interrupt+0x64/0xe0 asm_call_irq_on_stack+0x12/0x20 Fixes: `84ec374bd5` ("drm/amdgpu: create amdgpu_vkms (v4)") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Kelly Zytaruk <kelly.zytaruk@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:28 -05:00
Prike Liang	4eaf21b752	drm/amdgpu: not return error on the init_apu_flags In some APU project we needn't always assign flags to identify each other, so we may not need return an error. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:27 -05:00
Rajneesh Bhardwaj	8b5da5a458	Revert "drm/amdgpu: Don't inherit GEM object VMAs in child process" This reverts commit `fbcdbfde87`. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:27 -05:00
Greg Kroah-Hartman	7ff61cdcc8	drm/amdgpu: use default_groups in kobj_type There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the amdgpu sysfs code to use default_groups field which has been the preferred way since `aa30f47cf6` ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jonathan Kim <jonathan.kim@amd.com> Cc: Kevin Wang <kevin1.wang@amd.com> Cc: shaoyunl <shaoyun.liu@amd.com> Cc: Tao Zhou <tao.zhou1@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:26 -05:00
Tom St Denis	4cc9f86f85	drm/amd/amdgpu: Add pcie indirect support to amdgpu_mm_wreg_mmio_rlc() The function amdgpu_mm_wreg_mmio_rlc() is used by debugfs to write to MMIO registers. It didn't support registers beyond the BAR mapped MMIO space. This adds pcie indirect write support. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:26 -05:00
Nirmoy Das	575e55ee4f	drm/amdgpu: recover gart table at resume Get rid off pin/unpin of gart BO at resume/suspend and instead pin only once and try to recover gart content at resume time. This is much more stable in case there is OOM situation at 2nd call to amdgpu_device_evict_resources() while evicting GART table. v3: remove gart recovery from other places v2: pin gart at amdgpu_gart_table_vram_alloc() Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:26 -05:00
Nirmoy Das	ec6aae9711	drm/amdgpu: do not pass ttm_resource_manager to vram_mgr Do not allow exported amdgpu_vram_mgr_*() to accept any ttm_resource_manager pointer. Also there is no need to force other module to call a ttm function just to eventually call vram_mgr functions. v2: pass adev's vram_mgr instead of adev Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:26 -05:00
Nirmoy Das	ffb378fb30	drm/amdkfd: remove unused function Remove unused amdgpu_amdkfd_get_vram_usage() CC: Felix.Kuehling@amd.com Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Fixes: `dfcbe6d5f4` ("drm/amdgpu: Remove unused function pointers")	2022-01-11 15:44:26 -05:00
Nirmoy Das	1dd8b1b987	drm/amdgpu: do not pass ttm_resource_manager to gtt_mgr Do not allow exported amdgpu_gtt_mgr_*() to accept any ttm_resource_manager pointer. Also there is no need to force other module to call a ttm function just to eventually call gtt_mgr functions. v4: remove unused adev. v3: upcast mgr from ttm resopurce manager instead of getting it from adev. v2: pass adev's gtt_mgr instead of adev. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:26 -05:00
Leslie Shi	62d5f9f711	drm/amdgpu: Unmap MMIO mappings when device is not unplugged Patch: 3efb17ae7e92 ("drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure") makes call to amdgpu_device_unmap_mmio() conditioned on device unplugged. This patch unmaps MMIO mappings even when device is not unplugged. v2: Add condition of drm_dev_enter() to deleted unmaps in patch "drm/amdgpu: Unmap all MMIO mappings" Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:26 -05:00
Peng Ju Zhou	6638391b9f	drm/amdgpu: Enable second VCN for certain Navy Flounder. Certain Navy Flounder cards have 2 VCNs, enable it. Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:25 -05:00
Jiawei Gu	b54ce6c92c	drm/amdgpu: Clear garbage data in err_data before usage Memory of err_data should be cleaned before usage when there're multiple entry in ras ih. Otherwise garbage data from last loop will be used. Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-11 15:44:25 -05:00
Linus Torvalds	8d0749b4f8	drm for 5.17-rc1 core: - add privacy screen support - move nomodeset option into drm subsystem - clean up nomodeset handling in drivers - make drm_irq.c legacy - fix stack_depot name conflicts - remove DMA_BUF_SET_NAME ioctl restrictions - sysfs: send hotplug event - replace several DRM_* logging macros with drm_* - move hashtable to legacy code - add error return from gem_create_object - cma-helper: improve interfaces, drop CONFIG_DRM_KMS_CMA_HELPER - kernel.h related include cleanups - support XRGB2101010 source buffers ttm: - don't include drm hashtable - stop pruning fences after wait - documentation updates dma-buf: - add dma_resv selftest - add debugfs helpers - remove dma_resv_get_excl_unlocked - documentation - make fences mandatory in dma_resv_add_excl_fence dp: - add link training delay helpers gem: - link shmem/cma helpers into separate modules - use dma_resv iteratior - import dma-buf namespace into gem helper modules scheduler: - fence grab fix - lockdep fixes bridge: - switch to managed MIPI DSI helpers - register and attach during probe fixes - convert to YAML in several places. panel: - add bunch of new panesl simpledrm: - support FB_DAMAGE_CLIPS - support virtual screen sizes - add Apple M1 support amdgpu: - enable seamless boot for DCN 3.01 - runtime PM fixes - use drm_kms_helper_connector_hotplug_event - get all fences at once - use generic drm fb helpers - PSR/DPCD/LTTPR/DSC/PM/RAS/OLED/SRIOV fixes - add smart trace buffer (STB) for supported GPUs - display debugfs entries - new SMU debug option - Documentation update amdkfd: - IP discovery enumeration refactor - interface between driver fixes - SVM fixes - kfd uapi header to define some sysfs bitfields. i915: - support VESA panel backlights - enable ADL-P by default - add eDP privacy screen support - add Raptor Lake S (RPL-S) support - DG2 page table support - lots of GuC/HuC fw refactoring - refactored i915->gt interfaces - CD clock squashing support - enable 10-bit gamma support - update ADL-P DMC fw to v2.14 - enable runtime PM autosuspend by default - ADL-P DSI support - per-lane DP drive settings for ICL+ - add support for pipe C/D DMC firmware - Atomic gamma LUT updates - remove CCS FB stride restrictions on ADL-P - VRR platform support for display 11 - add support for display audio codec keepalive - lots of display refactoring - fix runtime PM handling during PXP suspend - improved eviction performance with async TTM moves - async VMA unbinding improvements - VMA locking refactoring - improved error capture robustness - use per device iommu checks - drop bits stealing from i915_sw_fence function ptr - remove dma_resv_prune - add IC cache invalidation on DG2 nouveau: - crc fixes - validate LUTs in atomic check - set HDMI AVI RGB quant to full tegra: - buffer objects reworks for dma-buf compat - NVDEC driver uAPI support - power management improvements etnaviv: - IOMMU enabled system support - fix > 4GB command buffer mapping - close a DoS vector - fix spurious GPU resets ast: - fix i2c initialization rcar-du: - DSI output support exynos: - replace legacy gpio interface - implement generic GEM object mmap msm: - dpu plane state cleanup in prep for multirect - dpu debugfs cleanups - dp support for sc7280 - a506 support - removal of struct_mutex - remove old eDP sub-driver anx7625: - support MIPI DSI input - support HDMI audio - fix reading EDID lvds: - fix bridge DT bindings megachips: - probe both bridges before registering dw-hdmi: - allow interlace on bridge ps8640: - enable runtime PM - support aux-bus tx358768: - enable reference clock - add pulse mode support ti-sn65dsi86: - use regmap bulk write - add PWM support etnaviv: - get all fences at once gma500: - gem object cleanups kmb: - enable fb console radeon: - use dma_resv_wait_timeout rockchip: - add DSP hold timeout - suspend/resume fixes - PLL clock fixes - implement mmap in GEM object functions - use generic fbdev emulation sun4i: - use CMA helpers without vmap support vc4: - fix HDMI-CEC hang with display is off - power on HDMI controller while disabling - support 4K@60Hz modes - support 10-bit YUV 4:2:0 output vmwgfx: - fix leak on probe errors - fail probing on broken hosts - new placement for MOB page tables - hide internal BOs from userspace - implement GEM support - implement GL 4.3 support virtio: - overflow fixes xen: - implement mmap as GEM object function omapdrm: - fix scatterlist export - support virtual planes mediatek: - MT8192 support - CMDQ refinement -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmHX1vMACgkQDHTzWXnE hr6rAw/9ES5RO5N3Ku9foFk1CI9bqy1Kh663KLkkEc+rDdhKpiZbBnAsrKkZ9sGu fNuHmWNN5nWXtDSOqHWuslt3F7Gh+qEBQtlkqC9mZsBm3bWB0aJK6E4QaJxfeSaK ta6AmyGx8DaV+C69i86dnemQurYSDVjROd7LDPKnCU0Fye/JxiXSXQmXksKMFVxd x5vmO9yfeDSg3EF+u1yB6nJNUYZBV0vhrAfjPqxPCRBXuQc7akuaglE/SFwlGnEk vn0GjVHEQcRTqYKrHr64xvQxIoKXcJP0pkDUyT7KYCsyj8GJkvxkb7/ls5pp5DvL SwyNg3J3vwUVP6w6GEvzf3ffG720qqUZvCbvLmE+A/t2DhGILiAm+HXSo43PTOW8 uagT7Gxma8dy8EovjSxioS9HPX8Gcu+S+XYavgOsevOZ7oeEt4f4TLW7LXsw9d6y 75FrMhiUpreab5hAh8Le0swuLYZHjdnJRdjSTqZJ/T6VdTdVftLT6IfwvSDx5CHy cWuufgcAjd7xVTXFquHWYXWLTQkiSMGf1M02jx9IWolTd4Cm41LNBhqMEDHZLHJD 7ngGgoaREVDQ+MqjG90yfIwJFIpJPI3YOaHLi/Kznga+iDzFY6cyOQWW2vX7ZdY5 7+LJWsgGT8Feb7/bzD5hX1mYqJLxh1pWUIaqIKMl+7LJL7gTVU8= =MByd -----END PGP SIGNATURE----- Merge tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "Highlights are support for privacy screens found in new laptops, a bunch of nomodeset refactoring, and i915 enables ADL-P systems by default, while starting to add RPL-S support. vmwgfx adds GEM and support for OpenGL 4.3 features in userspace. Lots of internal refactorings around dma reservations, and lots of driver refactoring as well. Summary: core: - add privacy screen support - move nomodeset option into drm subsystem - clean up nomodeset handling in drivers - make drm_irq.c legacy - fix stack_depot name conflicts - remove DMA_BUF_SET_NAME ioctl restrictions - sysfs: send hotplug event - replace several DRM_* logging macros with drm_* - move hashtable to legacy code - add error return from gem_create_object - cma-helper: improve interfaces, drop CONFIG_DRM_KMS_CMA_HELPER - kernel.h related include cleanups - support XRGB2101010 source buffers ttm: - don't include drm hashtable - stop pruning fences after wait - documentation updates dma-buf: - add dma_resv selftest - add debugfs helpers - remove dma_resv_get_excl_unlocked - documentation - make fences mandatory in dma_resv_add_excl_fence dp: - add link training delay helpers gem: - link shmem/cma helpers into separate modules - use dma_resv iteratior - import dma-buf namespace into gem helper modules scheduler: - fence grab fix - lockdep fixes bridge: - switch to managed MIPI DSI helpers - register and attach during probe fixes - convert to YAML in several places. panel: - add bunch of new panesl simpledrm: - support FB_DAMAGE_CLIPS - support virtual screen sizes - add Apple M1 support amdgpu: - enable seamless boot for DCN 3.01 - runtime PM fixes - use drm_kms_helper_connector_hotplug_event - get all fences at once - use generic drm fb helpers - PSR/DPCD/LTTPR/DSC/PM/RAS/OLED/SRIOV fixes - add smart trace buffer (STB) for supported GPUs - display debugfs entries - new SMU debug option - Documentation update amdkfd: - IP discovery enumeration refactor - interface between driver fixes - SVM fixes - kfd uapi header to define some sysfs bitfields. i915: - support VESA panel backlights - enable ADL-P by default - add eDP privacy screen support - add Raptor Lake S (RPL-S) support - DG2 page table support - lots of GuC/HuC fw refactoring - refactored i915->gt interfaces - CD clock squashing support - enable 10-bit gamma support - update ADL-P DMC fw to v2.14 - enable runtime PM autosuspend by default - ADL-P DSI support - per-lane DP drive settings for ICL+ - add support for pipe C/D DMC firmware - Atomic gamma LUT updates - remove CCS FB stride restrictions on ADL-P - VRR platform support for display 11 - add support for display audio codec keepalive - lots of display refactoring - fix runtime PM handling during PXP suspend - improved eviction performance with async TTM moves - async VMA unbinding improvements - VMA locking refactoring - improved error capture robustness - use per device iommu checks - drop bits stealing from i915_sw_fence function ptr - remove dma_resv_prune - add IC cache invalidation on DG2 nouveau: - crc fixes - validate LUTs in atomic check - set HDMI AVI RGB quant to full tegra: - buffer objects reworks for dma-buf compat - NVDEC driver uAPI support - power management improvements etnaviv: - IOMMU enabled system support - fix > 4GB command buffer mapping - close a DoS vector - fix spurious GPU resets ast: - fix i2c initialization rcar-du: - DSI output support exynos: - replace legacy gpio interface - implement generic GEM object mmap msm: - dpu plane state cleanup in prep for multirect - dpu debugfs cleanups - dp support for sc7280 - a506 support - removal of struct_mutex - remove old eDP sub-driver anx7625: - support MIPI DSI input - support HDMI audio - fix reading EDID lvds: - fix bridge DT bindings megachips: - probe both bridges before registering dw-hdmi: - allow interlace on bridge ps8640: - enable runtime PM - support aux-bus tx358768: - enable reference clock - add pulse mode support ti-sn65dsi86: - use regmap bulk write - add PWM support etnaviv: - get all fences at once gma500: - gem object cleanups kmb: - enable fb console radeon: - use dma_resv_wait_timeout rockchip: - add DSP hold timeout - suspend/resume fixes - PLL clock fixes - implement mmap in GEM object functions - use generic fbdev emulation sun4i: - use CMA helpers without vmap support vc4: - fix HDMI-CEC hang with display is off - power on HDMI controller while disabling - support 4K@60Hz modes - support 10-bit YUV 4:2:0 output vmwgfx: - fix leak on probe errors - fail probing on broken hosts - new placement for MOB page tables - hide internal BOs from userspace - implement GEM support - implement GL 4.3 support virtio: - overflow fixes xen: - implement mmap as GEM object function omapdrm: - fix scatterlist export - support virtual planes mediatek: - MT8192 support - CMDQ refinement" * tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm: (1241 commits) drm/amdgpu: no DC support for headless chips drm/amd/display: fix dereference before NULL check drm/amdgpu: always reset the asic in suspend (v2) drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform drm/amd/display: Fix the uninitialized variable in enable_stream_features() drm/amdgpu: fix runpm documentation amdgpu/pm: Make sysfs pm attributes as read-only for VFs drm/amdgpu: save error count in RAS poison handler drm/amdgpu: drop redundant semicolon drm/amd/display: get and restore link res map drm/amd/display: support dynamic HPO DP link encoder allocation drm/amd/display: access hpo dp link encoder only through link resource drm/amd/display: populate link res in both detection and validation drm/amd/display: define link res and make it accessible to all link interfaces drm/amd/display: 3.2.167 drm/amd/display: [FW Promotion] Release 0.0.98 drm/amd/display: Undo ODM combine drm/amd/display: Add reg defs for DCN303 drm/amd/display: Changed pipe split policy to allow for multi-display pipe split drm/amd/display: Set optimize_pwr_state for DCN31 ...	2022-01-10 12:58:46 -08:00
Linus Torvalds	7e740ae635	- First part of a series to move the AMD address translation code from arch/x86/ to amd64_edac as that is its only user anyway - Some MCE error injection improvements to the AMD side - Reorganization of the #MC handler code and the facilities it calls to make it noinstr-safe - Add support for new AMD MCA bank types and non-uniform banks layout - The usual set of cleanups and fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcGZ4ACgkQEsHwGGHe VUr6Zw//WBvNvfV/akQGsvVo94G0DaF+buYB+Tl1p0goMd7QfKA5iHxjB1alEJC2 dTchIr7pjiiE3nr4svuWLLQZamx8kMwQNqipioBHXg3YThj0wD4PbUOC9TlIceBR 3yxVbvwlD7Y7sb2PII6IMlagzTiIeW0/ps29DHFr5vqDBvEanNdAHoV/h2vQi+76 Ma96psIxzTMSk11yGB6l9k66EASCdDGBU7sODjup7wuQmuRaQ/1oJAWY0wIJvJez frjpaz/YKmlTwTf9bxoJbky2FkeBsD4yXXUGwjDgMq0EyUUaeSbvaQkm8gSHX9Yr VDDv1WvT6QIw6x7Wc4skS8lWmZghNBbAHOoNS31BPJ2IDmFWkF5Q2bNEuHrtU4EC 0mkNeyN6x48L/F8j/1aE/tm+SjiGexZX4zhi6MNWReTV140I1zqQq/r7CCu5+MEa PAB1YH/96k2dMPT6mbFrRIFJmkDuBuZOAkuwYWEjO/XjPl2SGBGj1jKolWW3qjRR Po7vBJnDt7wgigWFh6+R4rJv+fh87XfB7B2wEOt4Yn37jUkK6dNRIy0zFmDaC1J2 bHgsJbWC+Sgs1G57gnYABJYzLj7RRdDyCu1/UUVyBBP7/WfZJw0kjABE7p3AaYTd 15JV1L0c/Ypuv05LJf40LkyF2F5w2fnP5QM2Rr8U4xW/GumEyWs= =8Hu7 -----END PGP SIGNATURE----- Merge tag 'ras_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: "A relatively big amount of movements in RAS-land this time around: - First part of a series to move the AMD address translation code from arch/x86/ to amd64_edac as that is its only user anyway - Some MCE error injection improvements to the AMD side - Reorganization of the #MC handler code and the facilities it calls to make it noinstr-safe - Add support for new AMD MCA bank types and non-uniform banks layout - The usual set of cleanups and fixes" * tag 'ras_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/mce: Reduce number of machine checks taken during recovery x86/mce/inject: Avoid out-of-bounds write when setting flags x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types x86/mce: Check regs before accessing it x86/mce: Mark mce_start() noinstr x86/mce: Mark mce_timed_out() noinstr x86/mce: Move the tainting outside of the noinstr region x86/mce: Mark mce_read_aux() noinstr x86/mce: Mark mce_end() noinstr x86/mce: Mark mce_panic() noinstr x86/mce: Prevent severity computation from being instrumented x86/mce: Allow instrumentation during task work queueing x86/mce: Remove noinstr annotation from mce_setup() x86/mce: Use mce_rdmsrl() in severity checking code x86/mce: Remove function-local cpus variables x86/mce: Do not use memset to clear the banks bitmaps x86/mce/inject: Set the valid bit in MCA_STATUS before error injection x86/mce/inject: Check if a bank is populated before injecting x86/mce: Get rid of cpu_missing ...	2022-01-10 11:43:09 -08:00
Len Brown	df5bc0aa7f	Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)" This reverts commit `f7d6779df6`. This bisected regression has impacted suspend-resume stability since 5.15-rc1. It regressed -stable via 5.14.10. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215315 Fixes: `f7d6779df6` ("drm/amdgpu: stop scheduler when calling hw_fini (v2)") Cc: Guchun Chen <guchun.chen@amd.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: <stable@vger.kernel.org> # 5.14+ Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-01-09 10:32:16 -08:00
Mario Limonciello	eac4c54bf7	drm/amdgpu: don't set s3 and s0ix at the same time This makes it clearer which codepaths are in use specifically in one state or the other. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-07 17:20:55 -05:00
Mario Limonciello	e53d9665ab	drm/amdgpu: explicitly check for s0ix when evicting resources This codepath should be running in both s0ix and s3, but only does currently because s3 and s0ix are both set in the s0ix case. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-07 17:20:38 -05:00
James Yao	216a987319	drm/amdgpu: add dummy event6 for vega10 [why] Malicious mailbox event1 fails driver loading on vega10. A dummy event6 prevent driver from taking response from malicious event1 as its own. [how] On vega10, send a mailbox event6 before sending event1. Signed-off-by: James Yao <yiqing.yao@amd.com> Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-07 17:19:34 -05:00
Alex Deucher	b95dc06af3	drm/amdgpu: disable runpm if we are the primary adapter If we are the primary adapter (i.e., the one used by the firwmare framebuffer), disable runtime pm. This fixes a regression caused by commit `55285e21f0` which results in the displays waking up shortly after they go to sleep due to the device coming out of runtime suspend and sending a hotplug uevent. v2: squash in reworked fix from Evan Fixes: `55285e21f0` ("fbdev/efifb: Release PCI device's runtime PM ref during FB destroy") Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215203 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1840 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-31 08:57:45 -05:00
Dave Airlie	ce9b333c73	Merge branch 'drm-misc-fixes' of ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes This merges two fixes that haven't been sent to me yet, but I wanted to get in. One amdgpu fix, but one nouveau regression fixer. Signed-off-by: Dave Airlie <airlied@redhat.com>	2021-12-31 11:40:29 +10:00
Dave Airlie	cb6846fbb8	Merge tag 'amd-drm-next-5.17-2021-12-30' of ssh://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.17-2021-12-30: amdgpu: - Suspend/resume fixes - Fence fix - Misc code cleanups - IP discovery fixes - SRIOV fixes - RAS fixes - GMC 8 VRAM detection fix - FRU fixes for Aldebaran - Display fixes amdkfd: - SVM fixes - IP discovery fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211230141032.613596-1-alexander.deucher@amd.com	2021-12-31 10:59:17 +10:00
Alex Deucher	0637d41786	drm/amdgpu: no DC support for headless chips Chips with no display hardware should return false for DC support. v2: drop Arcturus and Aldebaran Fixes: `f7f12b2582` ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support") Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reported-by: Tareque Md.Hanif <tarequemd.hanif@yahoo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:45 -05:00
Alex Deucher	6dc8265f98	drm/amdgpu: always reset the asic in suspend (v2) If the platform suspend happens to fail and the power rail is not turned off, the GPU will be in an unknown state on resume, so reset the asic so that it will be in a known good state on resume even if the platform suspend failed. v2: handle s0ix Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:45 -05:00
Evan Quan	4a700546ec	drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform By setting mp1_state as PP_MP1_STATE_UNLOAD, MP1 will do some proper cleanups and put itself into a state ready for PNP. That can workaround some random resuming failure observed on BOCO capable platforms. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:45 -05:00
Alex Deucher	937ed9c866	drm/amdgpu: fix runpm documentation It's not only supported by HG/PX laptops. It's supported by all dGPUs which supports BOCO/BACO functionality (runtime D3). BOCO - Bus Off, Chip Off. The entire chip is powered off. This is controlled by ACPI. BACO - Bus Active, Chip Off. The chip still shows up on the PCI bus, but the device itself is powered down. v2: fix missed HG/PX reference Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:45 -05:00
Tao Zhou	fec8c5244f	drm/amdgpu: save error count in RAS poison handler Otherwise the RAS error count couldn't be queried from sysfs. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:45 -05:00
Guchun Chen	45e3d1db7d	drm/amdgpu: drop redundant semicolon A minor typo. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:45 -05:00
Surbhi Kakarya	b6fd6e0f5e	drm/amdgpu: Check the memory can be accesssed by ttm_device_clear_dma_mappings. If the event guard is enabled and VF doesn't receive an ack from PF for full access, the guest driver load crashes. This is caused due to the call to ttm_device_clear_dma_mappings with non-initialized mman during driver tear down. This patch adds the necessary condition to check if the mman initialization passed or not and takes the path based on the condition output. Signed-off-by: Surbhi Kakarya <Surbhi.Kakarya@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:43 -05:00
Kent Russell	67416bf853	drm/amdgpu: Access the FRU on Aldebaran This is supported, although the offset is different from VG20, so fix that with a variable and enable getting the product name and serial number from the FRU. Do this for all SKUs since all SKUs have the FRU Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:43 -05:00
Kent Russell	6c92fe5fa5	drm/amdgpu: Increase potential product_name to 64 characters Having seen at least 1 42-character product_name, bump the number up to 64, and put that definition into amdgpu.h to make future adjustments simpler. Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:43 -05:00
yipechai	fd5256cbe1	drm/amdgpu: Remove the redundant code of psp bootloader functions The psp bootloader functions code of psp_v13_0.c had been optimized before. According the code style of psp_v13_0.c to remove the redundant code of psp_v11_0.c. v2: squash in drop unused variable (Alex) Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:43 -05:00
Leslie Shi	87172e89dc	drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure [Why] In amdgpu_driver_load_kms, when amdgpu_device_init returns error during driver modprobe, it will start the error handle path immediately and call into amdgpu_device_unmap_mmio as well to release mapped VRAM. However, in the following release callback, driver stills visits the unmapped memory like vcn.inst[i].fw_shared_cpu_addr in vcn_v3_0_sw_fini. So a kernel crash occurs. [How] call amdgpu_device_unmap_mmio() if device is unplugged to prevent invalid memory address in vcn_v3_0_sw_fini() when GPU initialization failure. Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-30 08:54:24 -05:00
Zongmin Zhou	11544d77e3	drm/amdgpu: fixup bad vram size on gmc v8 Some boards(like RX550) seem to have garbage in the upper 16 bits of the vram size register. Check for this and clamp the size properly. Fixes boards reporting bogus amounts of vram. after add this patch,the maximum GPU VRAM size is 64GB, otherwise only 64GB vram size will be used. Signed-off-by: Zongmin Zhou<zhouzongmin@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:05:21 -05:00
sashank saye	4da8b63944	drm/amdgpu: Send Message to SMU on aldebaran passthrough for sbr handling For Aldebaran chip passthrough case we need to intimate SMU about special handling for SBR.On older chips we send LightSBR to SMU, enabling the same for Aldebaran. Slight difference, compared to previous chips, is on Aldebaran, SMU would do a heavy reset on SBR. Hence, the word Heavy instead of Light SBR is used for SMU to differentiate. Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: sashank saye <sashank.saye@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:03:19 -05:00
Rajneesh Bhardwaj	fbcdbfde87	drm/amdgpu: Don't inherit GEM object VMAs in child process When an application having open file access to a node forks, its shared mappings also get reflected in the address space of child process even though it cannot access them with the object permissions applied. With the existing permission checks on the gem objects, it might be reasonable to also create the VMAs with VM_DONTCOPY flag so a user space application doesn't need to explicitly call the madvise(addr, len, MADV_DONTFORK) system call to prevent the pages in the mapped range to appear in the address space of the child process. It also prevents the memory leaks due to additional reference counts on the mapped BOs in the child process that prevented freeing the memory in the parent for which we had worked around earlier in the user space inside the thunk library. Additionally, we faced this issue when using CRIU to checkpoint restore an application that had such inherited mappings in the child which confuse CRIU when it mmaps on restore. Having this flag set for the render node VMAs helps. VMAs mapped via KFD already take care of this so this is needed only for the render nodes. To limit the impact of the change to user space consumers such as OpenGL etc, limit it to KFD BOs only. Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: David Yat Sin <david.yatsin@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:03:08 -05:00
Tao Zhou	b6485bed40	drm/amdkfd: reset queue which consumes RAS poison (v2) CP supports unmap queue with reset mode which only destroys specific queue without affecting others. Replacing whole gpu reset with reset queue mode for RAS poison consumption saves much time, and we can also fallback to gpu reset solution if reset queue fails. v2: Return directly if process is NULL; Reset queue solution is not applicable to SDMA, fallback to legacy way; Call kfd_unref_process after lookup process. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:02:59 -05:00
Tao Zhou	f4409ee846	drm/amdgpu: add gpu reset control for umc page retirement Add a reset parameter for umc page retirement, let user decide whether call gpu reset in umc page retirement. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:02:32 -05:00
Victor Skvortsov	d764fb2af6	drm/amdgpu: Modify indirect register access for gfx9 sriov Expand RLCG interface for new GC read & write commands. New interface will only be used if the PF enables the flag in pf2vf msg. v2: Added a description for the scratch registers Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: David Nieto <david.nieto@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:02:25 -05:00
Victor Skvortsov	4a0165f060	drm/amdgpu: get xgmi info before ip_init Driver needs to call get_xgmi_info() before ip_init to determine whether it needs to handle a pending hive reset. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: David Nieto <david.nieto@amd.com> Reviewed by: shaoyun.liu <Shaoyun.lui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:02:17 -05:00
Victor Skvortsov	4aa325ae54	drm/amdgpu: Modify indirect register access for amdkfd_gfx_v9 sriov Modify GC register access from MMIO to RLCG if the indirect flag is set Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: David Nieto <david.nieto@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:02:10 -05:00
Victor Skvortsov	92f153bb5a	drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov Modify GC register access from MMIO to RLCG if the indirect flag is set v2: Replaced ternary operator with if-else for better readability Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: David Nieto <david.nieto@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:02:02 -05:00
Victor Skvortsov	0da6f6e587	drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions Add helper macros to change register access from direct to indirect. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: David Nieto <david.nieto@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:01:55 -05:00
Bokun Zhang	b18ff6925d	drm/amdgpu: Filter security violation registers Recently, there is security policy update under SRIOV. We need to filter the registers that hit the violation and move the code to the host driver side so that the guest driver can execute correctly. Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 16:00:47 -05:00
Alex Deucher	ebae897388	drm/amdgpu: no DC support for headless chips Chips with no display hardware should return false for DC support. v2: drop Arcturus and Aldebaran Fixes: `f7f12b2582` ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support") Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reported-by: Tareque Md.Hanif <tarequemd.hanif@yahoo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-28 09:07:30 -05:00
Evan Quan	7be3be2b02	drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform By setting mp1_state as PP_MP1_STATE_UNLOAD, MP1 will do some proper cleanups and put itself into a state ready for PNP. That can workaround some random resuming failure observed on BOCO capable platforms. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-27 13:08:28 -05:00
Alex Deucher	daf8de0874	drm/amdgpu: always reset the asic in suspend (v2) If the platform suspend happens to fail and the power rail is not turned off, the GPU will be in an unknown state on resume, so reset the asic so that it will be in a known good state on resume even if the platform suspend failed. v2: handle s0ix Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-27 12:16:34 -05:00
Alex Deucher	4d625a97a7	drm/amdgpu: fix runpm documentation It's not only supported by HG/PX laptops. It's supported by all dGPUs which supports BOCO/BACO functionality (runtime D3). BOCO - Bus Off, Chip Off. The entire chip is powered off. This is controlled by ACPI. BACO - Bus Active, Chip Off. The chip still shows up on the PCI bus, but the device itself is powered down. v2: fix missed HG/PX reference Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-22 21:51:30 -05:00
Dave Airlie	b06103b532	Merge tag 'amd-drm-next-5.17-2021-12-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amdgpu: - Add some display debugfs entries - RAS fixes - SR-IOV fixes - W=1 fixes - Documentation fixes - IH timestamp fix - Misc power fixes - IP discovery fixes - Large driver documentation updates - Multi-GPU memory use reductions - Misc display fixes and cleanups - Add new SMU debug option amdkfd: - SVM fixes radeon: - Fix typo in comment From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211216202731.5900-1-alexander.deucher@amd.com	2021-12-23 11:55:28 +10:00
Yazen Ghannam	91f75eb481	x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration AMD systems currently lay out MCA bank types such that the type of bank number "i" is either the same across all CPUs or is Reserved/Read-as-Zero. For example: Bank # \| CPUx \| CPUy 0 LS LS 1 RAZ UMC 2 CS CS 3 SMU RAZ Future AMD systems will lay out MCA bank types such that the type of bank number "i" may be different across CPUs. For example: Bank # \| CPUx \| CPUy 0 LS LS 1 RAZ UMC 2 CS NBIO 3 SMU RAZ Change the structures that cache MCA bank types to be per-CPU and update smca_get_bank_type() to handle this change. Move some SMCA-specific structures to amd.c from mce.h, since they no longer need to be global. Break out the "count" for bank types from struct smca_hwid, since this should provide a per-CPU count rather than a system-wide count. Apply the "const" qualifier to the struct smca_hwid_mcatypes array. The values in this array should not change at runtime. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211216162905.4132657-3-yazen.ghannam@amd.com	2021-12-22 17:22:09 +01:00
Alex Deucher	5e713c6afa	drm/amdgpu: add support for IP discovery gc_info table v2 Used on gfx9 based systems. Fixes incorrect CU counts reported in the kernel log. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1833 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-12-17 12:47:29 -05:00
chen gong	b7865173cf	drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled Play a video on the raven (or PCO, raven2) platform, and then do the S3 test. When resume, the following error will be reported: amdgpu 0000:02:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] ERROR ring vcn_dec test failed (-110) [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] ERROR resume of IP block <vcn_v1_0> failed -110 amdgpu 0000:02:00.0: amdgpu: amdgpu_device_ip_resume failed (-110). PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110 [why] When playing the video: The power state flag of the vcn block is set to POWER_STATE_ON. When doing suspend: There is no change to the power state flag of the vcn block, it is still POWER_STATE_ON. When doing resume: Need to open the power gate of the vcn block and set the power state flag of the VCN block to POWER_STATE_ON. But at this time, the power state flag of the vcn block is already POWER_STATE_ON. The power status flag check in the "8f2cdef drm/amd/pm: avoid duplicate powergate/ungate setting" patch will return the amdgpu_dpm_set_powergating_by_smu function directly. As a result, the gate of the power was not opened, causing the subsequent ring test to fail. [how] In the suspend function of the vcn block, explicitly change the power state flag of the vcn block to POWER_STATE_OFF. BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828 Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-12-17 12:47:29 -05:00
Huang Rui	bf67014d6b	drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence The job embedded fence donesn't initialize the flags at dma_fence_init(). Then we will go a wrong way in amdgpu_fence_get_timeline_name callback and trigger a null pointer panic once we enabled the trace event here. So introduce new amdgpu_fence object to indicate the job embedded fence. [ 156.131790] BUG: kernel NULL pointer dereference, address: 00000000000002a0 [ 156.131804] #PF: supervisor read access in kernel mode [ 156.131811] #PF: error_code(0x0000) - not-present page [ 156.131817] PGD 0 P4D 0 [ 156.131824] Oops: 0000 [#1] PREEMPT SMP PTI [ 156.131832] CPU: 6 PID: 1404 Comm: sdma0 Tainted: G OE 5.16.0-rc1-custom #1 [ 156.131842] Hardware name: Gigabyte Technology Co., Ltd. Z170XP-SLI/Z170XP-SLI-CF, BIOS F20 11/04/2016 [ 156.131848] RIP: 0010:strlen+0x0/0x20 [ 156.131859] Code: 89 c0 c3 0f 1f 80 00 00 00 00 48 01 fe eb 0f 0f b6 07 38 d0 74 10 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 c3 48 89 f8 c3 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31 [ 156.131872] RSP: 0018:ffff9bd0018dbcf8 EFLAGS: 00010206 [ 156.131880] RAX: 00000000000002a0 RBX: ffff8d0305ef01b0 RCX: 000000000000000b [ 156.131888] RDX: ffff8d03772ab924 RSI: ffff8d0305ef01b0 RDI: 00000000000002a0 [ 156.131895] RBP: ffff9bd0018dbd60 R08: ffff8d03002094d0 R09: 0000000000000000 [ 156.131901] R10: 000000000000005e R11: 0000000000000065 R12: ffff8d03002094d0 [ 156.131907] R13: 000000000000001f R14: 0000000000070018 R15: 0000000000000007 [ 156.131914] FS: 0000000000000000(0000) GS:ffff8d062ed80000(0000) knlGS:0000000000000000 [ 156.131923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.131929] CR2: 00000000000002a0 CR3: 000000001120a005 CR4: 00000000003706e0 [ 156.131937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 156.131942] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 156.131949] Call Trace: [ 156.131953] <TASK> [ 156.131957] ? trace_event_raw_event_dma_fence+0xcc/0x200 [ 156.131973] ? ring_buffer_unlock_commit+0x23/0x130 [ 156.131982] dma_fence_init+0x92/0xb0 [ 156.131993] amdgpu_fence_emit+0x10d/0x2b0 [amdgpu] [ 156.132302] amdgpu_ib_schedule+0x2f9/0x580 [amdgpu] [ 156.132586] amdgpu_job_run+0xed/0x220 [amdgpu] v2: fix mismatch warning between the prototype and function name (Ray, kernel test robot) Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-17 12:47:28 -05:00
Christian König	fc74881c28	drm/amdgpu: fix dropped backing store handling in amdgpu_dma_buf_move_notify bo->tbo.resource can now be NULL. Signed-off-by: Christian König <christian.koenig@amd.com> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1811 Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211210083927.1754-1-christian.koenig@amd.com	2021-12-17 11:26:40 +01:00
Alex Deucher	0cd7f378b0	drm/amdgpu: add support for IP discovery gc_info table v2 Used on gfx9 based systems. Fixes incorrect CU counts reported in the kernel log. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1833 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-16 14:08:20 -05:00
Victor Skvortsov	892deb4826	drm/amdgpu: Separate vf2pf work item init from virt data exchange We want to be able to call virt data exchange conditionally after gmc sw init to reserve bad pages as early as possible. Since this is a conditional call, we will need to call it again unconditionally later in the init sequence. Refactor the data exchange function so it can be called multiple times without re-initializing the work item. v2: Cleaned up the code. Kept the original call to init_exchange_data() inside early init to initialize the work item, afterwards call exchange_data() when needed. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed By: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-16 14:08:20 -05:00
chen gong	d4c2933fb8	drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled Play a video on the raven (or PCO, raven2) platform, and then do the S3 test. When resume, the following error will be reported: amdgpu 0000:02:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] ERROR ring vcn_dec test failed (-110) [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] ERROR resume of IP block <vcn_v1_0> failed -110 amdgpu 0000:02:00.0: amdgpu: amdgpu_device_ip_resume failed (-110). PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110 [why] When playing the video: The power state flag of the vcn block is set to POWER_STATE_ON. When doing suspend: There is no change to the power state flag of the vcn block, it is still POWER_STATE_ON. When doing resume: Need to open the power gate of the vcn block and set the power state flag of the VCN block to POWER_STATE_ON. But at this time, the power state flag of the vcn block is already POWER_STATE_ON. The power status flag check in the "8f2cdef drm/amd/pm: avoid duplicate powergate/ungate setting" patch will return the amdgpu_dpm_set_powergating_by_smu function directly. As a result, the gate of the power was not opened, causing the subsequent ring test to fail. [how] In the suspend function of the vcn block, explicitly change the power state flag of the vcn block to POWER_STATE_OFF. BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828 Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-16 14:08:10 -05:00
Huang Rui	5c1e6fa49e	drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence The job embedded fence donesn't initialize the flags at dma_fence_init(). Then we will go a wrong way in amdgpu_fence_get_timeline_name callback and trigger a null pointer panic once we enabled the trace event here. So introduce new amdgpu_fence object to indicate the job embedded fence. [ 156.131790] BUG: kernel NULL pointer dereference, address: 00000000000002a0 [ 156.131804] #PF: supervisor read access in kernel mode [ 156.131811] #PF: error_code(0x0000) - not-present page [ 156.131817] PGD 0 P4D 0 [ 156.131824] Oops: 0000 [#1] PREEMPT SMP PTI [ 156.131832] CPU: 6 PID: 1404 Comm: sdma0 Tainted: G OE 5.16.0-rc1-custom #1 [ 156.131842] Hardware name: Gigabyte Technology Co., Ltd. Z170XP-SLI/Z170XP-SLI-CF, BIOS F20 11/04/2016 [ 156.131848] RIP: 0010:strlen+0x0/0x20 [ 156.131859] Code: 89 c0 c3 0f 1f 80 00 00 00 00 48 01 fe eb 0f 0f b6 07 38 d0 74 10 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 c3 48 89 f8 c3 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31 [ 156.131872] RSP: 0018:ffff9bd0018dbcf8 EFLAGS: 00010206 [ 156.131880] RAX: 00000000000002a0 RBX: ffff8d0305ef01b0 RCX: 000000000000000b [ 156.131888] RDX: ffff8d03772ab924 RSI: ffff8d0305ef01b0 RDI: 00000000000002a0 [ 156.131895] RBP: ffff9bd0018dbd60 R08: ffff8d03002094d0 R09: 0000000000000000 [ 156.131901] R10: 000000000000005e R11: 0000000000000065 R12: ffff8d03002094d0 [ 156.131907] R13: 000000000000001f R14: 0000000000070018 R15: 0000000000000007 [ 156.131914] FS: 0000000000000000(0000) GS:ffff8d062ed80000(0000) knlGS:0000000000000000 [ 156.131923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.131929] CR2: 00000000000002a0 CR3: 000000001120a005 CR4: 00000000003706e0 [ 156.131937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 156.131942] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 156.131949] Call Trace: [ 156.131953] <TASK> [ 156.131957] ? trace_event_raw_event_dma_fence+0xcc/0x200 [ 156.131973] ? ring_buffer_unlock_commit+0x23/0x130 [ 156.131982] dma_fence_init+0x92/0xb0 [ 156.131993] amdgpu_fence_emit+0x10d/0x2b0 [amdgpu] [ 156.132302] amdgpu_ib_schedule+0x2f9/0x580 [amdgpu] [ 156.132586] amdgpu_job_run+0xed/0x220 [amdgpu] v2: fix mismatch warning between the prototype and function name (Ray, kernel test robot) Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-16 13:42:36 -05:00
Thomas Zimmermann	9758ff2fa2	Merge drm/drm-next into drm-misc-next Backmerging for v5.16-rc5. Resolves a conflict between drm-misc-next and drm-misc-fixes in the vc4 driver. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2021-12-16 14:48:27 +01:00
Evan Quan	17c65d6fca	drm/amdgpu: correct the wrong cached state for GMC on PICASSO Pair the operations did in GMC ->hw_init and ->hw_fini. That can help to maintain correct cached state for GMC and avoid unintention gate operation dropping due to wrong cached state. BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828 Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-14 17:56:53 -05:00
Hawking Zhang	841933d5b8	drm/amdgpu: don't override default ECO_BITs setting Leave this bit as hardware default setting Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-12-14 17:50:36 -05:00
Le Ma	f3a8076eb2	drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE should count on GC IP base address Signed-off-by: Le Ma <le.ma@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-12-14 17:49:50 -05:00
Yann Dirson	326db0dc00	amdgpu: fix some comment typos Signed-off-by: Yann Dirson <ydirson@free.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-14 16:10:58 -05:00
Yann Dirson	03f2abb07e	amdgpu: fix some kernel-doc markup Those are not today pulled by the sphinx doc, but better be ready. Signed-off-by: Yann Dirson <ydirson@free.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-14 16:10:53 -05:00
Jingwen Chen	948e7ce014	drm/amd/amdgpu: fix gmc bo pin count leak in SRIOV [Why] gmc bo will be pinned during loading amdgpu and reset in SRIOV while only unpinned in unload amdgpu [How] add amdgpu_in_reset and sriov judgement to skip pin bo v2: fix wrong judgement Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-14 16:10:00 -05:00
Jingwen Chen	85dfc1d692	drm/amd/amdgpu: fix psp tmr bo pin count leak in SRIOV [Why] psp tmr bo will be pinned during loading amdgpu and reset in SRIOV while only unpinned in unload amdgpu [How] add amdgpu_in_reset and sriov judgement to skip pin bo v2: fix wrong judgement Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-14 16:09:49 -05:00
Evan Quan	17252701ec	drm/amdgpu: correct the wrong cached state for GMC on PICASSO Pair the operations did in GMC ->hw_init and ->hw_fini. That can help to maintain correct cached state for GMC and avoid unintention gate operation dropping due to wrong cached state. BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828 Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-14 16:09:31 -05:00
Guchun Chen	e0f943b4f9	drm/amdgpu: use adev_to_drm to get drm_device pointer Updated for consistency when accessing drm_device from amdgpu driver. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-14 16:09:24 -05:00
Evan Quan	7e31a8585b	drm/amdgpu: move smu_debug_mask to a more proper place As the smu_context will be invisible from outside(of power). Also, the smu_debug_mask can be shared around all power code instead of some specific framework(swSMU) only. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-14 16:09:11 -05:00
Victor Skvortsov	fa4a427d84	drm/amdgpu: SRIOV flr_work should use down_write Host initiated VF FLR may fail if someone else is already holding a read_lock. Change from down_write_trylock to down_write to guarantee the reset goes through. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-14 16:09:02 -05:00
Daniel Vetter	99b03ca651	Linux 5.16-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmG2fU0eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC7EH/3R7Rt+OD8Wn8Ss3 w8V+dBxVwa2u2oMTyUHPxaeOXZ7bi38XlUdLFPOK/76bGwO0a5TmYZqsWdRbGyT0 HfcYjHsQ0lbJXk/nh2oM47oJxJXVpThIHXJEk0FZ0Y5t+DYjIYlNHzqZymUyhLem St74zgWcyT+MXuqY34vB827FJDUnOxhhhi85tObeunaSPAomy9aiYidSC1ARREnz iz2VUntP/QnRnKVvL2nUZNzcz1xL5vfCRSKsRGRSv3qW1Y/1M71ylt6JVmSftWq+ VmMdFxFhdrb1OK/1ct/930Un/UP2NG9EJsWxote2XYlnVSZHzDqH7lUhbqgdCcLz 1m2tVNY= =7wRd -----END PGP SIGNATURE----- Merge v5.16-rc5 into drm-next Thomas Zimmermann requested a fixes backmerge, specifically also for `96c5f82ef0` ("drm/vc4: fix error code in vc4_create_object()") Just a bunch of adjacent changes conflicts, even the big pile of them in vc4. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2021-12-14 10:24:28 +01:00
chiminghao	47d9c6faa7	drm:amdgpu:remove unneeded variable return value form directly instead of taking this in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cm> Signed-off-by: chiminghao <chi.minghao@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:34:27 -05:00
Isabella Basso	ba6f8c135a	drm/amdgpu: re-format file header comments Fix the warning below: warning: Cannot understand * \file amdgpu_ioc32.c on line 2 - I thought it was a doc line Changes since v1: - As suggested by Alexander Deucher: 1. Reduce diff to minimum as this DOC section doesn't provide much value. Signed-off-by: Isabella Basso <isabbasso@riseup.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:17 -05:00
Isabella Basso	929bb8e200	drm/amdgpu: fix amdgpu_ras_mca_query_error_status scope This commit fixes the compile-time warning below: warning: no previous prototype for ‘amdgpu_ras_mca_query_error_status’ [-Wmissing-prototypes] Changes since v1: - As suggested by Alexander Deucher: 1. Make function static instead of adding prototype. Signed-off-by: Isabella Basso <isabbasso@riseup.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:17 -05:00
Philip Yang	28fe416466	drm/amdgpu: Reduce SG bo memory usage for mGPUs For userptr bo, if adev is not in IOMMU isolation mode, RAM direct map to GPU, multiple GPUs use same system memory dma mapping address, they can share the original mem->bo in attachment to reduce dma address array memory usage. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:16 -05:00
Philip Yang	4a74c38cd6	drm/amdgpu: Detect if amdgpu in IOMMU direct map mode If host and amdgpu IOMMU is not enabled or IOMMU is pass through mode, set adev->ram_is_direct_mapped flag which will be used to optimize memory usage for multi GPU mappings. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:16 -05:00
Lang Yu	6ff7fddbd1	drm/amdgpu: add support for SMU debug option SMU firmware expects the driver maintains error context and doesn't interact with SMU any more when SMU errors occurred. That will aid in debugging SMU firmware issues. Add SMU debug option support for this request, it can be enabled or disabled via amdgpu_smu_debug debugfs file. Use a 32-bit mask to indicate corresponding debug modes. Currently, only one mode(HALT_ON_ERROR) is supported. When enabled, it brings hardware to a kind of halt state so that no one can touch it any more in the envent of SMU errors. The dirver interacts with SMU via sending messages. And threre are three ways to sending messages to SMU in current implementation. Handle them respectively as following: 1, smu_cmn_send_smc_msg_with_param() for normal timeout cases Halt on any error. 2, smu_cmn_send_msg_without_waiting()/smu_cmn_wait_for_response() for longer timeout cases Halt on errors apart from ETIME. Otherwise this way won't work. Let the user handle ETIME error in such a case. 3, smu_cmn_send_msg_without_waiting() for no waiting cases Halt on errors apart from ETIME. Otherwise second way won't work. == Command Guide == 1, enable HALT_ON_ERROR mode # echo 0x1 > /sys/kernel/debug/dri/0/amdgpu_smu_debug 2, disable HALT_ON_ERROR mode # echo 0x0 > /sys/kernel/debug/dri/0/amdgpu_smu_debug v5: - Use bit mask to allow more debug features.(Evan) - Use WRAN() instead of BUG().(Evan) v4: - Set to halt state instead of a simple hang.(Christian) v3: - Use debugfs_create_bool().(Christian) - Put variable into smu_context struct. - Don't resend command when timeout. v2: - Resend command when timeout.(Lijo) - Use debugfs file instead of module parameter. Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:16 -05:00
Lang Yu	34f3a4a98b	drm/amdgpu: introduce a kind of halt state for amdgpu device It is useful to maintain error context when debugging SW/FW issues. Introduce amdgpu_device_halt() for this purpose. It will bring hardware to a kind of halt state, so that no one can touch it any more. Compare to a simple hang, the system will keep stable at least for SSH access. Then it should be trivial to inspect the hardware state and see what's going on. v2: - Set adev->no_hw_access earlier to avoid potential crashes.(Christian) Suggested-by: Christian Koenig <christian.koenig@amd.com> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.co> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:16 -05:00
Hawking Zhang	cace4bff75	drm/amdgpu: check df_funcs and its callback pointers in case they are not avaiable in early phase Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:16 -05:00
Hawking Zhang	4ac955baa9	drm/amdgpu: don't override default ECO_BITs setting Leave this bit as hardware default setting Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:16 -05:00
Le Ma	2c113b999c	drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE should count on GC IP base address Signed-off-by: Le Ma <le.ma@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:16 -05:00
Hawking Zhang	2cb6577a30	drm/amdgpu: read and authenticate ip discovery binary read and authenticate ip discovery binary getting from vram first, if it is not valid, read and authenticate the one getting from file Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:16 -05:00
Hawking Zhang	32f0e1a330	drm/amdgpu: add helper to verify ip discovery binary signature To be used to check ip discovery binary signature Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:15 -05:00
Hawking Zhang	f6dcaf0c07	drm/amdgpu: rename discovery_read_binary helper add _from_vram in the funciton name to diffrentiate the one used to read from file Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:15 -05:00
Hawking Zhang	43a80bd511	drm/amdgpu: add helper to load ip_discovery binary from file To be used when ip_discovery binary is not carried by vbios Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:15 -05:00
Leslie Shi	c40bdfb2ff	drm/amdgpu: fix incorrect VCN revision in SRIOV Guest OS will setup VCN instance 1 which is disabled as an enabled instance and execute initialization work on it, but this causes VCN ib ring test failure on the disabled VCN instance during modprobe: amdgpu 0000:00:08.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 5 on hub 1 amdgpu 0000:00:08.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on vcn_dec_0 (-110). amdgpu 0000:00:08.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on vcn_enc_0.0 (-110). [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] ERROR ib ring test failed (-110). v2: drop amdgpu_discovery_get_vcn_version and rename sriov_config to vcn_config v3: modify VCN's revision in SR-IOV and bare-metal Fixes: `baf3f8f374` ("drm/amdgpu: handle SRIOV VCN revision parsing") Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:15 -05:00
Leslie Shi	4046afcebf	drm/amdgpu: add modifiers in amdgpu_vkms_plane_init() Fix following warning in SRIOV during modprobe: amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu] Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:33:15 -05:00
Lang Yu	613aa3ea74	drm/amdgpu: only hw fini SMU fisrt for ASICs need that We found some headaches on ASICs don't need that, so remove that for them. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Kevin Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:35 -05:00
Philip Yang	0771c80591	drm/amdgpu: Handle fault with same timestamp Remove not unique timestamp WARNING as same timestamp interrupt happens on some chips, Drain fault need to wait for the processed_timestamp to be truly greater than the checkpoint or the ring to be empty to be sure no stale faults are handled. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1818 Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:35 -05:00
Isabella Basso	e105b64a36	drm/amdgpu: fix location of prototype for amdgpu_kms_compat_ioctl This fixes the warning below by changing the prototype to a location that's actually included by the .c files that call amdgpu_kms_compat_ioctl: warning: no previous prototype for ‘amdgpu_kms_compat_ioctl’ [-Wmissing-prototypes] 37 \| long amdgpu_kms_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) \| ^~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Isabella Basso <isabbasso@riseup.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:34 -05:00
Isabella Basso	64cf26f04a	drm/amd: append missing includes This fixes warnings caused by global functions lacking prototypes:, such as: warning: no previous prototype for 'dcn303_hw_sequencer_construct' [-Wmissing-prototypes] 12 \| void dcn303_hw_sequencer_construct(struct dc *dc) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ... warning: no previous prototype for ‘amdgpu_has_atpx’ [-Wmissing-prototypes] 76 \| bool amdgpu_has_atpx(void) { \| ^~~~~~~~~~~~~~~ Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Isabella Basso <isabbasso@riseup.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:34 -05:00
Isabella Basso	2351b7d4e3	drm/amdgpu: fix function scopes This turns previously global functions into static, thus removing compile-time warnings such as: warning: no previous prototype for 'amdgpu_vkms_output_init' [-Wmissing-prototypes] 399 \| int amdgpu_vkms_output_init(struct drm_device *dev, \| ^~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Isabella Basso <isabbasso@riseup.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:34 -05:00
Isabella Basso	bbe04dec5c	drm/amd: fix improper docstring syntax This fixes various warnings relating to erroneous docstring syntax, of which some are listed below: warning: Function parameter or member 'adev' not described in 'amdgpu_atomfirmware_ras_rom_addr' ... warning: expecting prototype for amdgpu_atpx_validate_functions(). Prototype was for amdgpu_atpx_validate() instead ... warning: Excess function parameter 'mem' description in 'amdgpu_preempt_mgr_new' ... warning: Cannot understand * @kfd_get_cu_occupancy - Collect number of waves in-flight on this device ... warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Signed-off-by: Isabella Basso <isabbasso@riseup.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:34 -05:00
Zhigang Luo	85a774d9ad	drm/amdgpu: extended waiting SRIOV VF reset completion timeout to 10s For the ASIC has big FB, it need more time to clear FB during reset. This change extended SRIOV VF waiting reset completion timeout from 5s to 10s. Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Acked-by: Shaoyun Liu <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:34 -05:00
Zhigang Luo	a5f67c939e	drm/amdgpu: recover XGMI topology for SRIOV VF after reset For SRIOV VF, the XGMI topology was not recovered after reset. This change added code to SRIOV VF reset function to update XGMI topology for SRIOV VF after reset. Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:34 -05:00
Zhigang Luo	dd26e018aa	drm/amdgpu: added PSP XGMI initialization for SRIOV VF during recover For SRIOV VF, XGMI was not initialized in PSP during recover. This change added PSP XGMI initialization for SRIOV VF during recover. Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:34 -05:00
Zhigang Luo	175ac6ec6b	drm/amdgpu: skip reset other device in the same hive if it's SRIOV VF On SRIOV, host driver can support FLR(function level reset) on individual VF within the hive which might bring the individual device back to normal without the necessary to execute the hive reset. If the FLR failed , host driver will trigger the hive reset, each guest VF will get reset notification before the real hive reset been executed. The VF device can handle the reset request individually in it's reset work handler. This change updated gpu recover sequence to skip reset other device in the same hive for SRIOV VF. Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:34 -05:00
Tao Zhou	655ff3538e	drm/amdgpu: enable RAS poison flag when GPU is connected to CPU The RAS poison mode is enabled by default on the platform. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-13 16:32:34 -05:00
Dave Airlie	f8eb96b4df	Merge tag 'amd-drm-next-5.17-2021-12-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.17-2021-12-02: amdgpu: - Use generic drm fb helpers - PSR fixes - Rework DCN3.1 clkmgr - DPCD 1.3 fixes - Misc display fixes can cleanups - Clock query fixes for APUs - LTTPR fixes - DSC fixes - Misc PM fixes - RAS fixes - OLED backlight fix - SRIOV fixes - Add STB (Smart Trace Buffer) for supported dGPUs - IH rework - Enable seamless boot for DCN3.01 amdkfd: - Rework more stuff around IP discovery enumeration - Further clean up of interfaces with amdgpu - SVM fixes radeon: - Indentation fixes UAPI: - Add a new KFD header that defines some of the sysfs bitfields and enums that userspace has been using for a while The corresponding bit-fields and enums in user mode are defined in https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/blob/master/include/hsakmttypes.h Signed-off-by: Dave Airlie <airlied@redhat.com> # Conflicts: # drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211202191643.5970-1-alexander.deucher@amd.com	2021-12-10 13:52:51 +10:00
Christian König	21a6732f46	drm/amdgpu: don't skip runtime pm get on A+A config The runtime PM get was incorrectly added after the check. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211206084551.92502-1-christian.koenig@amd.com	2021-12-09 17:13:17 +01:00
Daniel Vetter	c8a04cbeed	drm-misc-next for 5.17: UAPI Changes: Cross-subsystem Changes: * Move 'nomodeset' kernel boot option into DRM subsystem Core Changes: * Replace several DRM_() logging macros with drm_() equivalents * panel: Add quirk for Lenovo Yoga Book X91F/L * ttm: Documentation fixes Driver Changes: * Cleanup nomodeset handling in drivers * Fixes * bridge/anx7625: Fix reading EDID; Fix error code * bridge/megachips: Probe both bridges before registering * vboxvideo: Fix ERR_PTR usage -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmGklRkACgkQaA3BHVML eiPweQf7B6r4Bu5XRdodHs3arqt5A7st1ZIZKOwb6SdgriisViZPdAdmgoHjBxtj rMj0eUac80ssPmtXf6si0jDx2G/QS0bOT38g/NQ2rjxiR3jt5mjbKnd4m/fwVzxw uD9Qmk/6IlLdAQa4hpgmaF5er11zHxCxhKbDYlNzQX9dPqvHKq+KjN00FWkupqTl 5CSA6wRcZuQXbEIKrueXBWz0oX9/KX8LraI/U4Ze9I4PmP6iD09Au8dgx5Mc8pUC OfA94pTdUVjg9ZHBhI4jm1LRIYKDlQcOkAPSCGgrA61dM/J0JOYmiE0fRAV0KH0S 7lzKDXqnc/KoyohI8oGFYXNbMKf2cQ== =xiFx -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2021-11-29' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.17: UAPI Changes: Cross-subsystem Changes: * Move 'nomodeset' kernel boot option into DRM subsystem Core Changes: * Replace several DRM_() logging macros with drm_() equivalents * panel: Add quirk for Lenovo Yoga Book X91F/L * ttm: Documentation fixes Driver Changes: * Cleanup nomodeset handling in drivers * Fixes * bridge/anx7625: Fix reading EDID; Fix error code * bridge/megachips: Probe both bridges before registering * vboxvideo: Fix ERR_PTR usage Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/YaSVz15Q7dAlEevU@linux-uq9g.fritz.box	2021-12-09 09:31:45 +01:00
Claudio Suarez	3c02193102	drm/amdgpu: replace drm_detect_hdmi_monitor() with drm_display_info.is_hdmi Once EDID is parsed, the monitor HDMI support information is available through drm_display_info.is_hdmi. The amdgpu driver still calls drm_detect_hdmi_monitor() to retrieve the same information, which is less efficient. Change to drm_display_info.is_hdmi This is a TODO task in Documentation/gpu/todo.rst Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Claudio Suarez <cssk@net-c.es> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-07 13:13:07 -05:00
Claudio Suarez	20543be93c	drm/amdgpu: update drm_display_info correctly when the edid is read drm_display_info is updated by drm_get_edid() or drm_connector_update_edid_property(). In the amdgpu driver it is almost always updated when the edid is read in amdgpu_connector_get_edid(), but not always. Change amdgpu_connector_get_edid() and amdgpu_connector_free_edid() to keep drm_display_info updated. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Claudio Suarez <cssk@net-c.es> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-07 13:12:58 -05:00
Stanley.Yang	cf63b70272	drm/amdgpu: skip umc ras error count harvest remove in recovery stat check, skip umc ras err cnt harvest in amdgpu_ras_log_on_err_counter Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-07 13:12:19 -05:00
Flora Cui	30c1e39197	drm/amdgpu: free vkms_output after use Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Leslie Shi <Yuliang.Shi@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-07 13:12:13 -05:00
Flora Cui	f7ed3f90b2	drm/amdgpu: drop the critial WARN_ON in amdgpu_vkms Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Leslie Shi <Yuliang.Shi@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-07 13:12:05 -05:00
Stanley.Yang	aed1faab9d	drm/amdgpu: only skip get ecc info for aldebaran skip get ecc info for aldebarn through check ip version do not affect other asic type Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-07 13:07:55 -05:00
Zhou Qingyang	b220110e4c	drm/amdgpu: Fix a NULL pointer dereference in amdgpu_connector_lcd_native_mode() In amdgpu_connector_lcd_native_mode(), the return value of drm_mode_duplicate() is assigned to mode, and there is a dereference of it in amdgpu_connector_lcd_native_mode(), which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Fix this bug add a check of mode. This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_DRM_AMDGPU=m show no new warnings, and our static analyzer no longer warns about this code. Fixes: `d38ceaf99e` ("drm/amdgpu: add core driver (v4)") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-02 12:43:31 -05:00
Alex Deucher	baf3f8f374	drm/amdgpu: handle SRIOV VCN revision parsing For SR-IOV, the IP discovery revision number encodes additional information. Handle that case here. v2: drop additional IP versions Reviewed-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-02 12:43:25 -05:00
Stanley.Yang	bab73f092d	drm/amdgpu: skip query ecc info in gpu recovery this is a workaround due to get ecc info failed during gpu recovery [ 700.236122] amdgpu 0000:09:00.0: amdgpu: Failed to export SMU ecc table! [ 700.236128] amdgpu 0000:09:00.0: amdgpu: GPU reset begin! [ 704.331171] amdgpu: qcm fence wait loop timeout expired [ 704.331194] amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption [ 704.332445] amdgpu 0000:09:00.0: amdgpu: GPU reset begin! [ 704.332448] amdgpu 0000:09:00.0: amdgpu: Bailing on TDR for s_job:ffffffffffffffff, as another already in progress [ 704.332456] amdgpu: Pasid 0x8000 destroy queue 0 failed, ret -62 [ 710.360924] amdgpu 0000:09:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000013 SMN_C2PMSG_82:0x00000007 [ 710.360964] amdgpu 0000:09:00.0: amdgpu: Failed to disable smu features. [ 710.361002] amdgpu 0000:09:00.0: amdgpu: Fail to disable dpm features! [ 710.361014] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] ERROR suspend of IP block <smu> failed -62 Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-02 12:43:06 -05:00
shaoyunl	428890a3fe	drm/amdgpu: adjust the kfd reset sequence in reset sriov function This change revert previous commits: `9f4f2c1a35` ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") `271fd38ce5` ("drm/amdgpu: move kfd post_reset out of reset_sriov function") This change moves the amdgpu_amdkfd_pre_reset to an earlier place in amdgpu_device_reset_sriov, presumably to address the sequence issue that the first patch was originally meant to fix. Some register access(GRBM_GFX_CNTL) only be allowed on full access mode. Move kfd_pre_reset and kfd_post_reset back inside reset_sriov function. Fixes: `9f4f2c1a35` ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") Fixes: `271fd38ce5` ("drm/amdgpu: move kfd post_reset out of reset_sriov function") Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 17:09:30 -05:00
Philip Yang	494f2e42ce	drm/amdkfd: fix double free mem structure drm_gem_object_put calls release_notify callback to free the mem structure and unreserve_mem_limit, move it down after the last access of mem and make it conditional call. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 17:08:00 -05:00
Lijo Lazar	e0570f0b6e	drm/amdgpu: Don't halt RLC on GFX suspend On aldebaran, RLC also controls GFXCLK. Skip halting RLC during GFX IP suspend and keep it running till PMFW disables all DPMs. [ 578.019986] amdgpu 0000:23:00.0: amdgpu: GPU reset begin! [ 583.245566] amdgpu 0000:23:00.0: amdgpu: Failed to disable smu features. [ 583.245621] amdgpu 0000:23:00.0: amdgpu: Fail to disable dpm features! [ 583.245639] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] ERROR suspend of IP block <smu> failed -62 [ 583.248504] [drm] free PSP TMR buffer Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 17:02:40 -05:00
Guchun Chen	7551f70ab9	drm/amdgpu: fix the missed handling for SDMA2 and SDMA3 There is no base reg offset or ip_version set for SDMA2 and SDMA3 on SIENNA_CICHLID, so add them. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Kevin Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 17:00:55 -05:00
Flora Cui	1053b9c948	drm/amdgpu: check atomic flag to differeniate with legacy path since vkms support atomic KMS interface Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <aleander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:59:38 -05:00
Flora Cui	3e467e478e	drm/amdgpu: cancel the correct hrtimer on exit Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:58:52 -05:00
Jane Jian	da3b36a23b	drm/amdgpu/sriov/vcn: add new vcn ip revision check case for SIENNA_CICHLID [WHY] for sriov odd# vf will modify vcn0 engine ip revision(due to multimedia bandwidth feature), which will be mismatched with original vcn0 revision [HOW] add new version check for vcn0 disabled revision(3, 0, 192), typically modified under sriov mode Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:58:11 -05:00
Yann Dirson	ddb267b66a	drm/amdgpu: update fw_load_type module parameter doc to match code amdgpu_ucode_get_load_type() does not interpret this parameter as documented. It is ignored for many ASIC types (which presumably only support one load_type), and when not ignored it is only used to force direct loading instead of PSP loading. SMU loading is only available for ASICs for which the parameter is ignored. Signed-off-by: Yann Dirson <ydirson@free.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:16:06 -05:00
Philip Yang	a899fe8b43	drm/amdkfd: err_pin_bo path leaks kfd_bo_list Refactor userptr and pin_bo path to make it less confusing, move err_pin_bo label up to remove mem from process_info kfd_bo_list. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:16:00 -05:00
shaoyunl	992110d747	drm/amdgpu: adjust the kfd reset sequence in reset sriov function This change revert previous commits: `9f4f2c1a35` ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") `271fd38ce5` ("drm/amdgpu: move kfd post_reset out of reset_sriov function") This change moves the amdgpu_amdkfd_pre_reset to an earlier place in amdgpu_device_reset_sriov, presumably to address the sequence issue that the first patch was originally meant to fix. Some register access(GRBM_GFX_CNTL) only be allowed on full access mode. Move kfd_pre_reset and kfd_post_reset back inside reset_sriov function. Fixes: `9f4f2c1a35` ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") Fixes: `271fd38ce5` ("drm/amdgpu: move kfd post_reset out of reset_sriov function") Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:07:33 -05:00
Philip Yang	a872c152fd	drm/amdkfd: fix double free mem structure drm_gem_object_put calls release_notify callback to free the mem structure and unreserve_mem_limit, move it down after the last access of mem and make it conditional call. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:07:13 -05:00
Lijo Lazar	81d104f4af	drm/amdgpu: Don't halt RLC on GFX suspend On aldebaran, RLC also controls GFXCLK. Skip halting RLC during GFX IP suspend and keep it running till PMFW disables all DPMs. [ 578.019986] amdgpu 0000:23:00.0: amdgpu: GPU reset begin! [ 583.245566] amdgpu 0000:23:00.0: amdgpu: Failed to disable smu features. [ 583.245621] amdgpu 0000:23:00.0: amdgpu: Fail to disable dpm features! [ 583.245639] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] ERROR suspend of IP block <smu> failed -62 [ 583.248504] [drm] free PSP TMR buffer Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:04:23 -05:00
Lijo Lazar	fe9c5c9aff	drm/amdgpu: Use MAX_HWIP instead of HW_ID_MAX HW_ID_MAX considers HWID of all IPs, far more than what amdgpu uses. amdgpu tracks only the IPs defined by amd_hw_ip_block_type whose max is MAX_HWIP. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:04:10 -05:00
Guchun Chen	3700169886	drm/amdgpu: fix the missed handling for SDMA2 and SDMA3 There is no base reg offset or ip_version set for SDMA2 and SDMA3 on SIENNA_CICHLID, so add them. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Kevin Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:04:02 -05:00
Guchun Chen	6c18ecefab	drm/amdgpu: declare static function to fix compiler warning >> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:503:6: warning: no previous prototype for function 'release_psp_cmd_buf' [-Wmissing-prototypes] void release_psp_cmd_buf(struct psp_context psp) ^ drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:503:1: note: declare 'static' if the function is not intended to be used outside of this translation unit void release_psp_cmd_buf(struct psp_context psp) ^ static 1 warning generated. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Kevin Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:03:56 -05:00
Philip Yang	3c2d6ea279	drm/amdgpu: handle IH ring1 overflow IH ring1 is used to process GPU retry fault, overflow is enabled to drain retry fault because we want receive other interrupts while handling retry fault to recover range. There is no overflow flag set when wptr pass rptr. Use timestamp of rptr and wptr to handle overflow and drain retry fault. If fault timestamp goes backward, the fault is filtered and should not be processed. Drain fault is finished if processed_timestamp is equal to or larger than checkpoint timestamp. Add amdgpu_ih_functions interface decode_iv_ts for different chips to get timestamp from IV entry with different iv size and timestamp offset. amdgpu_ih_decode_iv_ts_helper is used for vega10, vega20, navi10. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:03:34 -05:00
Stanley.Yang	232d1d43b5	drm/amdgpu: fix disable ras feature failed when unload drvier v2 v2: still need call ras_disable_all_featrures to handle ras initilization failure case. Function amdgpu_device_fini_hw is called before amdgpu_device_fini_sw, so ras ta will unload before send ras disable command, ras dsiable operation must before hw fini. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:03:22 -05:00
Flora Cui	700de2c8aa	drm/amdgpu: check atomic flag to differeniate with legacy path since vkms support atomic KMS interface Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <aleander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:03:03 -05:00
Flora Cui	deefd07eed	drm/amdgpu: fix vkms crtc settings otherwise adev->mode_info.crtcs[] is NULL Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:02:57 -05:00
Flora Cui	4f7ee199d9	drm/amdgpu: cancel the correct hrtimer on exit Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:02:46 -05:00
Jane Jian	981b304546	drm/amdgpu/sriov/vcn: add new vcn ip revision check case for SIENNA_CICHLID [WHY] for sriov odd# vf will modify vcn0 engine ip revision(due to multimedia bandwidth feature), which will be mismatched with original vcn0 revision [HOW] add new version check for vcn0 disabled revision(3, 0, 192), typically modified under sriov mode Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-01 16:02:04 -05:00
Javier Martinez Canillas	6a2d2ddf2c	drm: Move nomodeset kernel parameter to the DRM subsystem The "nomodeset" kernel cmdline parameter is handled by the vgacon driver but the exported vgacon_text_force() symbol is only used by DRM drivers. It makes much more sense for the parameter logic to be in the subsystem of the drivers that are making use of it. Let's move the vgacon_text_force() function and related logic to the DRM subsystem. While doing that, rename it to drm_firmware_drivers_only() and make it return true if "nomodeset" was used and false otherwise. This is a better description of the condition that the drivers are testing for. Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211112133230.1595307-4-javierm@redhat.com	2021-11-27 13:52:22 +01:00
Javier Martinez Canillas	35f7775f81	drm: Don't print messages if drivers are disabled due nomodeset The nomodeset kernel parameter handler already prints a message that the DRM drivers will be disabled, so there's no need for drivers to do that. Suggested-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211112133230.1595307-2-javierm@redhat.com	2021-11-27 13:52:16 +01:00
Alex Deucher	692cd92e66	drm/amd/display: update bios scratch when setting backlight Update the bios scratch register when updating the backlight level. Some platforms apparently read this scratch register and do additional operations in their hotkey handlers. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1518 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:14:36 -05:00
Lijo Lazar	57961c4c18	drm/amdgpu: Skip ASPM programming on aldebaran There is no need for additional programming, keep the default settings. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:14:03 -05:00
Yang Wang	fd08953b2d	drm/amdgpu: fix byteorder error in amdgpu discovery fix some byteorder issues about amdgpu discovery. This will result in running errors on the big end system. (e.g:MIPS) Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:13:54 -05:00
Philip Yang	c4ef8a73bf	drm/amdgpu: enable Navi retry fault wptr overflow If xnack is on, VM retry fault interrupt send to IH ring1, and ring1 will be full quickly. IH cannot receive other interrupts, this causes deadlock if migrating buffer using sdma and waiting for sdma done while handling retry fault. Remove VMC from IH storm client, enable ring1 write pointer overflow, then IH will drop retry fault interrupts and be able to receive other interrupts while driver is handling retry fault. IH ring1 write pointer doesn't writeback to memory by IH, and ring1 write pointer recorded by self-irq is not updated, so always read the latest ring1 write pointer from register. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:13:10 -05:00
Philip Yang	8888e2fe9c	drm/amdgpu: enable Navi 48-bit IH timestamp counter By default this timestamp is 32 bit counter. It gets overflowed in around 10 minutes. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:12:55 -05:00
Philip Yang	4d62555f62	drm/amdgpu: IH process reset count when restart Otherwise when IH process restart, count is zero, the loop will not exit to wake_up_all after processing AMDGPU_IH_MAX_NUM_IVS interrupts. Cc: stable@vger.kernel.org Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:09:46 -05:00
Alex Deucher	53af98c091	drm/amdgpu/gfx9: switch to golden tsc registers for renoir+ Renoir and newer gfx9 APUs have new TSC register that is not part of the gfxoff tile, so it can be read without needing to disable gfx off. Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:09:17 -05:00
Alex Deucher	244ee39885	drm/amdgpu/gfx10: add wraparound gpu counter check for APUs as well Apply the same check we do for dGPUs for APUs as well. Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:09:09 -05:00
shaoyunl	271fd38ce5	drm/amdgpu: move kfd post_reset out of reset_sriov function Fixes: `9f4f2c1a35` ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") For sriov XGMI configuration, the host driver will handle the hive reset, so in guest side, the reset_sriov only be called once on one device. This will make kfd post_reset unblanced with kfd pre_reset since kfd pre_reset already been moved out of reset_sriov function. Move kfd post_reset out of reset_sriov function to make them balance . Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:08:58 -05:00
xinhui pan	4eb6bb649f	drm/amdgpu: Fix double free of dmabuf amdgpu_amdkfd_gpuvm_free_memory_of_gpu drop dmabuf reference increased in amdgpu_gem_prime_export. amdgpu_bo_destroy drop dmabuf reference increased in amdgpu_gem_prime_import. So remove this extra dma_buf_put to avoid double free. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Tested-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:04:05 -05:00
Felix Kuehling	d3a21f7e35	drm/amdgpu: Fix MMIO HDP flush on SRIOV Disable HDP register remapping on SRIOV and set rmmio_remap.reg_offset to the fixed address of the VF register for hdp_v*_flush_hdp. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 15:02:25 -05:00
Alex Deucher	1f57925493	drm/amd/display: update bios scratch when setting backlight Update the bios scratch register when updating the backlight level. Some platforms apparently read this scratch register and do additional operations in their hotkey handlers. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1518 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:54 -05:00
Lijo Lazar	6ff53495ce	drm/amdgpu: Skip ASPM programming on aldebaran There is no need for additional programming, keep the default settings. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:53 -05:00
Yang Wang	cc7818d709	drm/amdgpu: fix byteorder error in amdgpu discovery fix some byteorder issues about amdgpu discovery. This will result in running errors on the big end system. (e.g:MIPS) Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:53 -05:00
Philip Yang	23eb49251b	drm/amdgpu: enable Navi retry fault wptr overflow If xnack is on, VM retry fault interrupt send to IH ring1, and ring1 will be full quickly. IH cannot receive other interrupts, this causes deadlock if migrating buffer using sdma and waiting for sdma done while handling retry fault. Remove VMC from IH storm client, enable ring1 write pointer overflow, then IH will drop retry fault interrupts and be able to receive other interrupts while driver is handling retry fault. IH ring1 write pointer doesn't writeback to memory by IH, and ring1 write pointer recorded by self-irq is not updated, so always read the latest ring1 write pointer from register. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:53 -05:00
Philip Yang	71ee9236ab	drm/amdgpu: enable Navi 48-bit IH timestamp counter By default this timestamp is 32 bit counter. It gets overflowed in around 10 minutes. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:53 -05:00
Philip Yang	514f4a99c7	drm/amdgpu: IH process reset count when restart Otherwise when IH process restart, count is zero, the loop will not exit to wake_up_all after processing AMDGPU_IH_MAX_NUM_IVS interrupts. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:53 -05:00
Evan Quan	1223c15c78	drm/amdgpu: update the domain flags for dumb buffer creation After switching to generic framebuffer framework, we rely on the ->dumb_create routine for frame buffer creation. However, the different domain flags used are not optimal. Add the contiguous flag to directly allocate the scanout BO as one linear buffer. Fixes: `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:53 -05:00
Ramesh Errabolu	37ba5bbc89	drm/amdgpu: Declare Unpin BO api as static Fixes warning report from kernel test robot Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:53 -05:00
Alex Deucher	7b37c7f8f5	drm/amdgpu/gfx9: switch to golden tsc registers for renoir+ Renoir and newer gfx9 APUs have new TSC register that is not part of the gfxoff tile, so it can be read without needing to disable gfx off. Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:53 -05:00
Alex Deucher	f75de84475	drm/amdgpu/gfx10: add wraparound gpu counter check for APUs as well Apply the same check we do for dGPUs for APUs as well. Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:52 -05:00
shaoyunl	4f30d920d1	drm/amdgpu: move kfd post_reset out of reset_sriov function Fixes: `9f4f2c1a35` ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") For sriov XGMI configuration, the host driver will handle the hive reset, so in guest side, the reset_sriov only be called once on one device. This will make kfd post_reset unblanced with kfd pre_reset since kfd pre_reset already been moved out of reset_sriov function. Move kfd post_reset out of reset_sriov function to make them balance . Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-24 14:06:52 -05:00
Dave Airlie	c18c889111	drm-misc-next for 5.17: UAPI Changes: * Remove restrictions on DMA_BUF_SET_NAME ioctl * connector: State of privacy screen * sysfs: Send hotplug uevent Cross-subsystem Changes: * clk/bmc-2835: Fixes * dma-buf: Add dma_resv selftest; Error-handling fixes; Add debugfs helpers; Remove dma_resv_get_excl_unlocked(); Documentation fixes * pwm: Introduce of_pwm_single_xlate() Core Changes: * Support for privacy screens * Make drm_irq.c legacy * Fix __stack_depot_* name conflict * Documentation fixes * Fixes and cleanups * dp-helper: Reuse 8b/10b link-training delay helpers * format-helper: Update interfaces * fb-helper: Allocate shadow buffer of correct size * gem: Link GEM SHMEM and CMA helpers into separate modules; Use dma_resv iterator; Import DMA_BUF namespace into GEM-helper modules * gem/shmem-helper: Interface cleanups * scheduler: Grab fence in drm_sched_job_add_implicit_dependencies(); Lockdep fixes * kms-helpers: Link several files from core into the KMS-helper module Driver Changes: * Use dma_resv_iter in several places * Fixes and cleanups * amdgpu: Use drm_kms_helper_connector_hotplug_event(); Get all fences at once * bridge: Switch to managed MIPI DSI helpers in several places; Register and attach during probe in several places; Convert to YAML in several places * bridge/anx7625: Support MIPI DPI input; Support HDMI audio; Fixes * bridge/dw-hdmi: Allow interlace on bridge * bridge/ps8640: Enable PM; Support aux-bus * bridge/tc358768: Enabled reference clock; Support pulse mode; Modesetting fixes * bridge/ti-sn65dsi86: Use regmap_bulk_write(); Implement PWM * etnaviv: Get all fences at once * gma500: GEM object cleanups; Remove generic drivers in probe function * i915: Support VESA panel backlights * ingenic: Fixes and cleanups * kirin: Adjust probe order * kmb: Enable framebuffer console * lima: Kconfig fixes * meson: Refactoring to supperot DRM_BRIDGE_ATTACH_NO_ENCODER * msm: Fixes and cleanups * msm/dsi: Adjust probe order * omap: Fixes and cleanups * nouveau: CRC fixes; Validate LUTs in atomic check; Set HDMI AVI RGB quantization to FULL; Fixes and cleanups * panel: Support Innolux G070Y2-T02, Vivax TPC-9150, JDI R63452, Newhaven 1.8-128160EF, Wanchanglong W552964ABA, Novatek NT35950, BOE BF060Y8M, Sony Tulip Truly NT35521; Use dev_err_probe() throughout drivers; Fixes and cleanups * panel/ili9881c: Orientation fixes * radeon: Use dma_resv_wait_timeout() * rockchip: Add timeout for DSP hold; Suspend/resume fixes; PLL clock fixes; Implement mmap in GEM object functions * simpledrm: Support FB_DAMAGE_CLIPS and virtual screen sizes * sun4i: Use CMA helpers without vmap support * tidss: Fixes and cleanups * v3d: Cleanups * vc4: Fix HDMI-CEC hang when display is off; Power on HDMI controller while disabling; Support 4k@60 Hz modes; Fixes and cleanups * video: Convert to sysfs_emit() in several places * video/omapfb: Fix fall-through * virtio: Overflow fixes * xen: Implement mmap as GEM object functions -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmGWF0wACgkQaA3BHVML eiN55ggAr6QN7S7Uxu98XnqfAHC9RErY7r3PoTXXS6ODvxY41bWOpHk8TQzuw626 JCNnpQCk6Gi8L3yl8r/l1fqoirGXrfDR1YvrnmG4I9xhPxOqBmgxDWw7HQrROm2B FctOvgFukvzn5jzQk2FqYgs5JBV20WqLrfEhttPFFMvjLGti/U31/+d+aGMJdRIQ kE2eVpPZtAlbBAP+S4mglKp6w+WrrzNHHULSGOPcGS5jwLrNFaQg7w75JiLInzyR RWum28USXSkKE0d6XBqHw+PWAwo3B/Vq9XSnbCX4+3GQX4tJh+hosyU7ju+rA0BY FJCyN/YgZvc+474o/LVtMh7PbKMEHQ== =TV6Q -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2021-11-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.17: UAPI Changes: * Remove restrictions on DMA_BUF_SET_NAME ioctl * connector: State of privacy screen * sysfs: Send hotplug uevent Cross-subsystem Changes: * clk/bmc-2835: Fixes * dma-buf: Add dma_resv selftest; Error-handling fixes; Add debugfs helpers; Remove dma_resv_get_excl_unlocked(); Documentation fixes * pwm: Introduce of_pwm_single_xlate() Core Changes: * Support for privacy screens * Make drm_irq.c legacy * Fix __stack_depot_* name conflict * Documentation fixes * Fixes and cleanups * dp-helper: Reuse 8b/10b link-training delay helpers * format-helper: Update interfaces * fb-helper: Allocate shadow buffer of correct size * gem: Link GEM SHMEM and CMA helpers into separate modules; Use dma_resv iterator; Import DMA_BUF namespace into GEM-helper modules * gem/shmem-helper: Interface cleanups * scheduler: Grab fence in drm_sched_job_add_implicit_dependencies(); Lockdep fixes * kms-helpers: Link several files from core into the KMS-helper module Driver Changes: * Use dma_resv_iter in several places * Fixes and cleanups * amdgpu: Use drm_kms_helper_connector_hotplug_event(); Get all fences at once * bridge: Switch to managed MIPI DSI helpers in several places; Register and attach during probe in several places; Convert to YAML in several places * bridge/anx7625: Support MIPI DPI input; Support HDMI audio; Fixes * bridge/dw-hdmi: Allow interlace on bridge * bridge/ps8640: Enable PM; Support aux-bus * bridge/tc358768: Enabled reference clock; Support pulse mode; Modesetting fixes * bridge/ti-sn65dsi86: Use regmap_bulk_write(); Implement PWM * etnaviv: Get all fences at once * gma500: GEM object cleanups; Remove generic drivers in probe function * i915: Support VESA panel backlights * ingenic: Fixes and cleanups * kirin: Adjust probe order * kmb: Enable framebuffer console * lima: Kconfig fixes * meson: Refactoring to supperot DRM_BRIDGE_ATTACH_NO_ENCODER * msm: Fixes and cleanups * msm/dsi: Adjust probe order * omap: Fixes and cleanups * nouveau: CRC fixes; Validate LUTs in atomic check; Set HDMI AVI RGB quantization to FULL; Fixes and cleanups * panel: Support Innolux G070Y2-T02, Vivax TPC-9150, JDI R63452, Newhaven 1.8-128160EF, Wanchanglong W552964ABA, Novatek NT35950, BOE BF060Y8M, Sony Tulip Truly NT35521; Use dev_err_probe() throughout drivers; Fixes and cleanups * panel/ili9881c: Orientation fixes * radeon: Use dma_resv_wait_timeout() * rockchip: Add timeout for DSP hold; Suspend/resume fixes; PLL clock fixes; Implement mmap in GEM object functions * simpledrm: Support FB_DAMAGE_CLIPS and virtual screen sizes * sun4i: Use CMA helpers without vmap support * tidss: Fixes and cleanups * v3d: Cleanups * vc4: Fix HDMI-CEC hang when display is off; Power on HDMI controller while disabling; Support 4k@60 Hz modes; Fixes and cleanups * video: Convert to sysfs_emit() in several places * video/omapfb: Fix fall-through * virtio: Overflow fixes * xen: Implement mmap as GEM object functions Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/YZYZSypIrr+qcih3@linux-uq9g.fritz.box	2021-11-23 09:38:55 +10:00
Christian König	11b4da9827	drm/amdgpu: partially revert "svm bo enable_signal call condition" Partially revert commit `5f319c5c21`. First of all this is illegal use of RCU to call dma_fence_enable_sw_signaling() since we don't hold a reference to the fence in question and can crash badly. Then the code doesn't seem to have the intended effect since only the exclusive fence is handled, but the KFD fences are always added as shared fence. Only keep the handling to throw away the content of SVM BOs. Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211122123926.385017-1-christian.koenig@amd.com Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>	2021-11-22 21:27:04 +01:00
xinhui pan	4aaea9d72e	drm/amdgpu: Fix double free of dmabuf amdgpu_amdkfd_gpuvm_free_memory_of_gpu drop dmabuf reference increased in amdgpu_gem_prime_export. amdgpu_bo_destroy drop dmabuf reference increased in amdgpu_gem_prime_import. So remove this extra dma_buf_put to avoid double free. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Tested-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:59:08 -05:00
Felix Kuehling	e39938117e	drm/amdgpu: Fix MMIO HDP flush on SRIOV Disable HDP register remapping on SRIOV and set rmmio_remap.reg_offset to the fixed address of the VF register for hdp_v*_flush_hdp. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:54 -05:00
Stanley.Yang	fdcb279d5b	drm/amdgpu: query umc error info from ecc_table v2 if smu support ECCTABLE, driver can message smu to get ecc_table then query umc error info from ECCTABLE v2: optimize source code makes logical more reasonable Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:46 -05:00
Stanley.Yang	8882f90a3f	drm/amdgpu: add new query interface for umc block v2 add message smu to query error information v2: rename message_smu to ecc_info Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:14 -05:00
Alex Deucher	92020e81dd	drm/amdgpu/display: set vblank_disable_immediate for DC Disable vblanks immediately to save power. I think this was missed when we merged DC support. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1781 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:03 -05:00
Bernard Zhao	7b833d6804	drm/amd/amdgpu: fix potential memleak In function amdgpu_get_xgmi_hive, when kobject_init_and_add failed There is a potential memleak if not call kobject_put. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:03 -05:00
Bernard Zhao	8b11e14bd5	drm/amd/amdgpu: cleanup the code style a bit This change is to cleanup the code style a bit. Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:02 -05:00
Bernard Zhao	7b755d6510	drm/amd/amdgpu: remove useless break after return This change is to remove useless break after return. Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:02 -05:00
hongao	6c5af7d2f8	drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga and dvi connectors amdgpu_connector_vga_get_modes missed function amdgpu_get_native_mode which assign amdgpu_encoder->native_mode with *preferred_mode result in amdgpu_encoder->native_mode.clock always be 0. That will cause amdgpu_connector_set_property returned early on: if ((rmx_type != DRM_MODE_SCALE_NONE) && (amdgpu_encoder->native_mode.clock == 0)) when we try to set scaling mode Full/Full aspect/Center. Add the missing function to amdgpu_connector_vga_get_mode can fix this. It also works on dvi connectors because amdgpu_connector_dvi_helper_funcs.get_mode use the same method. Signed-off-by: hongao <hongao@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:02 -05:00
Candice Li	d9a69fe512	drm/amdgpu: Add recovery_lock to save bad pages function Fix race condition failure during UMC UE injection. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:02 -05:00
Evan Quan	6c08e0ef87	drm/amd/pm: avoid duplicate powergate/ungate setting Just bail out if the target IP block is already in the desired powergate/ungate state. This can avoid some duplicate settings which sometimes may cause unexpected issues. Link: https://lore.kernel.org/all/YV81vidWQLWvATMM@zn.tnic/ Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789 Signed-off-by: Evan Quan <evan.quan@amd.com> Tested-by: Borislav Petkov <bp@suse.de> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:02 -05:00
Ramesh Errabolu	d25e35bc26	drm/amdgpu: Pin MMIO/DOORBELL BO's in GTT domain MMIO/DOORBELL BOs encode control data and should be pinned in GTT domain before enabling PCIe connected peer devices in accessing it Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:01 -05:00
Ramesh Errabolu	f441dd33db	drm/amdgpu: Update BO memory accounting to rely on allocation flag Accounting system to track amount of available memory (system, TTM and VRAM of a device) relies on BO's domain. The change is to rely instead on allocation flag indicating BO type - VRAM, GTT, USERPTR, MMIO or DOORBELL Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-22 14:45:01 -05:00
Thomas Zimmermann	a713ca234e	Merge drm/drm-next into drm-misc-next Backmerging from drm/drm-next for v5.16-rc1. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2021-11-18 09:36:39 +01:00
Bernard Zhao	27dfaedc0d	drm/amd/amdgpu: fix potential memleak In function amdgpu_get_xgmi_hive, when kobject_init_and_add failed There is a potential memleak if not call kobject_put. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 23:04:57 -05:00
hongao	bf55208391	drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga and dvi connectors amdgpu_connector_vga_get_modes missed function amdgpu_get_native_mode which assign amdgpu_encoder->native_mode with *preferred_mode result in amdgpu_encoder->native_mode.clock always be 0. That will cause amdgpu_connector_set_property returned early on: if ((rmx_type != DRM_MODE_SCALE_NONE) && (amdgpu_encoder->native_mode.clock == 0)) when we try to set scaling mode Full/Full aspect/Center. Add the missing function to amdgpu_connector_vga_get_mode can fix this. It also works on dvi connectors because amdgpu_connector_dvi_helper_funcs.get_mode use the same method. Signed-off-by: hongao <hongao@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-11-17 23:03:08 -05:00
Evan Quan	6ee27ee27b	drm/amd/pm: avoid duplicate powergate/ungate setting Just bail out if the target IP block is already in the desired powergate/ungate state. This can avoid some duplicate settings which sometimes may cause unexpected issues. Link: https://lore.kernel.org/all/YV81vidWQLWvATMM@zn.tnic/ Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789 Fixes: `bf756fb833` ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend") Signed-off-by: Evan Quan <evan.quan@amd.com> Tested-by: Borislav Petkov <bp@suse.de> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-11-17 17:41:20 -05:00
Guchun Chen	69650a879b	drm/amdgpu: add error print when failing to add IP block(v2) Driver initialization is driven by IP version from IP discovery table. So add error print when failing to add ip block during driver initialization, this will be more friendly to user to know which IP version is not correct. [ 40.467361] [drm] host supports REQ_INIT_DATA handshake [ 40.474076] [drm] add ip block number 0 <nv_common> [ 40.474090] [drm] add ip block number 1 <gmc_v10_0> [ 40.474101] [drm] add ip block number 2 <psp> [ 40.474103] [drm] add ip block number 3 <navi10_ih> [ 40.474114] [drm] add ip block number 4 <smu> [ 40.474119] [drm] add ip block number 5 <amdgpu_vkms> [ 40.474134] [drm] add ip block number 6 <gfx_v10_0> [ 40.474143] [drm] add ip block number 7 <sdma_v5_2> [ 40.474147] amdgpu 0000:00:08.0: amdgpu: Fatal error during GPU init [ 40.474545] amdgpu 0000:00:08.0: amdgpu: amdgpu: finishing device. v2: use dev_err to multi-GPU system Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 17:41:07 -05:00
Guchun Chen	73729a7d07	drm/amdgpu: add error print when failing to add IP block(v2) Driver initialization is driven by IP version from IP discovery table. So add error print when failing to add ip block during driver initialization, this will be more friendly to user to know which IP version is not correct. [ 40.467361] [drm] host supports REQ_INIT_DATA handshake [ 40.474076] [drm] add ip block number 0 <nv_common> [ 40.474090] [drm] add ip block number 1 <gmc_v10_0> [ 40.474101] [drm] add ip block number 2 <psp> [ 40.474103] [drm] add ip block number 3 <navi10_ih> [ 40.474114] [drm] add ip block number 4 <smu> [ 40.474119] [drm] add ip block number 5 <amdgpu_vkms> [ 40.474134] [drm] add ip block number 6 <gfx_v10_0> [ 40.474143] [drm] add ip block number 7 <sdma_v5_2> [ 40.474147] amdgpu 0000:00:08.0: amdgpu: Fatal error during GPU init [ 40.474545] amdgpu 0000:00:08.0: amdgpu: amdgpu: finishing device. v2: use dev_err to multi-GPU system Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 17:09:18 -05:00
Graham Sider	02274fc0f6	drm/amdkfd: replace trivial funcs with direct access These get funcs simply return an adev field. Replace funcs/calls with direct field accesses instead. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:11 -05:00
Nirmoy Das	26db557e35	drm/amdgpu: return early on error while setting bar0 memtype We set WC memtype for aper_base but don't check return value of arch_io_reserve_memtype_wc(). Be more defensive and return early on error. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:04 -05:00
Nirmoy Das	d5a28852e8	drm/amdgpu: remove unnecessary checks amdgpu_ttm_backend_bind() only needed for TTM_PL_TT and AMDGPU_PL_PREEMPT. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:04 -05:00
Evan Quan	087451f372	drm/amdgpu: use generic fb helpers instead of setting up AMD own's. With the shadow buffer support from generic framebuffer emulation, it's possible now to have runpm kicked when no update for console. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:03 -05:00
Graham Sider	b5d1d755c1	drm/amdkfd: remove kgd_dev declaration and initialization Completes removal of kgd_dev. Direct references to amdgpu_device objects should now be used instead. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:03 -05:00
Graham Sider	56c5977eae	drm/amdkfd: replace/remove remaining kgd_dev references Remove get_amdgpu_device and other remaining kgd_dev references aside from declaration/kfd struct entry and initialization. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:02 -05:00
Graham Sider	dff63da93e	drm/amdkfd: replace kgd_dev in gpuvm amdgpu_amdkfd funcs Modified definitions: - amdgpu_amdkfd_gpuvm_acquire_process_vm - amdgpu_amdkfd_gpuvm_release_process_vm - amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu - amdgpu_amdkfd_gpuvm_free_memory_of_gpu - amdgpu_amdkfd_gpuvm_map_memory_to_gpu - amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu - amdgpu_amdkfd_gpuvm_sync_memory - amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel - amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel - amdgpu_amdkfd_gpuvm_get_vm_fault_info - amdgpu_amdkfd_gpuvm_import_dmabuf - amdgpu_amdkfd_get_tile_config Removed: - get_amdgpu_device Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:02 -05:00
Graham Sider	574c4183ef	drm/amdkfd: replace kgd_dev in get amdgpu_amdkfd funcs Modified definitions: - amdgpu_amdkfd_get_fw_version - amdgpu_amdkfd_get_local_mem_info - amdgpu_amdkfd_get_gpu_clock_counter - amdgpu_amdkfd_get_max_engine_clock_in_mhz - amdgpu_amdkfd_get_cu_info - amdgpu_amdkfd_get_dmabuf_info - amdgpu_amdkfd_get_vram_usage - amdgpu_amdkfd_get_hive_id - amdgpu_amdkfd_get_unique_id - amdgpu_amdkfd_get_mmio_remap_phys_addr - amdgpu_amdkfd_get_num_gws - amdgpu_amdkfd_get_asic_rev_id - amdgpu_amdkfd_get_noretry - amdgpu_amdkfd_get_xgmi_hops_count - amdgpu_amdkfd_get_xgmi_bandwidth_mbytes - amdgpu_amdkfd_get_pcie_bandwidth_mbytes Also replaces kfd_device_by_kgd with kfd_device_by_adev, now searching via adev rather than kgd. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:02 -05:00
Graham Sider	6bfc7c7e17	drm/amdkfd: replace kgd_dev in various amgpu_amdkfd funcs Modified definitions: - amdgpu_amdkfd_submit_ib - amdgpu_amdkfd_set_compute_idle - amdgpu_amdkfd_have_atomics_support - amdgpu_amdkfd_flush_gpu_tlb_pasid - amdgpu_amdkfd_flush_gpu_tlb_pasid - amdgpu_amdkfd_gpu_reset - amdgpu_amdkfd_alloc_gtt_mem - amdgpu_amdkfd_free_gtt_mem - amdgpu_amdkfd_alloc_gws - amdgpu_amdkfd_free_gws - amdgpu_amdkfd_ras_poison_consumption_handler Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:01 -05:00
Graham Sider	3356c38dc1	drm/amdkfd: replace kgd_dev in various kfd2kgd funcs Modified definitions: - program_sh_mem_settings - set_pasid_vmid_mapping - init_interrupts - address_watch_disable - address_watch_execute - wave_control_execute - address_watch_get_offset - get_atc_vmid_pasid_mapping_info - set_scratch_backing_va - set_vm_context_page_table_base - read_vmid_from_vmfault_reg - get_cu_occupancy - program_trap_handler_settings Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:01 -05:00
Graham Sider	420185fdad	drm/amdkfd: replace kgd_dev in hqd/mqd kfd2kgd funcs Modified definitions: - hqd_load - hiq_mqd_load - hqd_sdma_load - hqd_dump - hqd_sdma_dump - hqd_is_occupied - hqd_destroy - hqd_sdma_is_occupied - hqd_sdma_destroy Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:01 -05:00
Graham Sider	c531a58bb6	drm/amdkfd: replace kgd_dev in static gfx v10_3 funcs Static funcs in amdgpu_amdkfd_gfx_v10_3.c now using amdgpu_device. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:00 -05:00
Graham Sider	4056b03377	drm/amdkfd: replace kgd_dev in static gfx v10 funcs Static funcs in amdgpu_amdkfd_gfx_v10.c now using amdgpu_device. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:00 -05:00
Graham Sider	9a17c9b79b	drm/amdkfd: replace kgd_dev in static gfx v9 funcs Static funcs in amdgpu_amdkfd_gfx_v9.c now using amdgpu_device. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:58:00 -05:00
Graham Sider	1cca608742	drm/amdkfd: replace kgd_dev in static gfx v8 funcs Static funcs in amdgpu_amdkfd_gfx_v8.c now using amdgpu_device. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:57:59 -05:00
Graham Sider	9365fbf3d7	drm/amdkfd: replace kgd_dev in static gfx v7 funcs Static funcs in amdgpu_amdkfd_gfx_v7.c now using amdgpu_device. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-17 16:57:59 -05:00
Christian König	fa78e367a2	drm/amdgpu: stop getting excl fence separately Just grab all fences for the display flip in one go. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211028132630.2330-2-christian.koenig@amd.com	2021-11-17 14:32:27 +01:00
Linus Torvalds	304ac8032d	drm next/fixes for 5.16-rc1 bridge: - HPD improvments for lt9611uxc - eDP aux-bus support for ps8640 - LVDS data-mapping selection support ttm: - remove huge page functionality (needs reworking) - fix a race condition during BO eviction panels: - add some new panels fbdev: - fix double-free - remove unused scrolling acceleration - CONFIG_FB dep improvements locking: - improve contended locking logging - naming collision fix dma-buf: - add dma_resv_for_each_fence iterator - fix fence refcounting bug - name locking fixesA prime: - fix object references during mmap nouveau: - various code style changes - refcount fix - device removal fixes - protect client list with a mutex - fix CE0 address calculation i915: - DP rates related fixes - Revert disabling dual eDP that was causing state readout problems - put the cdclk vtables in const data - Fix DVO port type for older platforms - Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown - CCS FBs related fixes - Fix recursive lock in GuC submission - Revert guc_id from i915_request tracepoint - Build fix around dmabuf amdgpu: - GPU reset fix - Aldebaran fix - Yellow Carp fixes - DCN2.1 DMCUB fix - IOMMU regression fix for Picasso - DSC display fixes - BPC display calculation fixes - Other misc display fixes - Don't allow partial copy from user for DC debugfs - SRIOV fixes - GFX9 CSB pin count fix - Various IP version check fixes - DP 2.0 fixes - Limit DCN1 MPO fix to DCN1 amdkfd: - SVM fixes - Fix gfx version for renoir - Reset fixes udl: - timeout fix imx: - circular locking fix virtio: - NULL ptr deref fix -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmGN3YwACgkQDHTzWXnE hr6aZQ/+Pobf1VE7V3wPUcopxccJYmgBvG/uY8EDyjA8qaxHs2pQqGN2IooOGxr6 F8G1N94Hem/PCDn3T8JI2Tqw5z4sy4UwLahEWISurFCen1IMAfA7hYfutp9X3O7X 8h7b+PgkvVruEAHF7z0kqnWGPHmcro29cIHNkXVRjnJuz+Gmn1XRfo6Jj65n6D7u NfMeU4/lWRR3767oJQzTqyAYtGxsKaZT3/tBD5WggZBzEKC7hqhAl8EUoOLWwojo fDqwiEpLXpraPRIQH8trkXVHhzPeLAmG916WwS8JG3CEk9mUQ+I7Jshhd8cw+bsQ XPuk3OBfU9mtuiGgNzrLP3xXJZs/QN3EkpKZWLefTnJY+C4BgiP2RifTnghmwV31 6/7Pr83CX/cn3BRd7r0xaeBZYvVYBZmwoZcsZFJBM8SVjd/ofKUfAmCzZZKheio2 5qa6bj9DQoyjEoFAULh23plcX6hvATGP7wzfRTnJ9AlAJ0KyEjVJ3r0qE6jHMDc/ uzcTAnKIWCxt9kSgE5qwLQtxLBaBpr/iOniZbCqGkPjiZeMzqP/ug1AKVP7kk39x FxZVT8ZOKk8Xt4iLZx8jmHi2KKheXYZi9LqieoTrJd44qMXDOmR9DCtQX9FZuWJS EJAlMj6sCowAZdODPZMVpoMc3Gti9nZ2Fpu7mLrRcMk1gKfjKwo= =qMNk -----END PGP SIGNATURE----- Merge tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm Pull more drm updates from Dave Airlie: "I missed a drm-misc-next pull for the main pull last week. It wasn't that major and isn't the bulk of this at all. This has a bunch of fixes all over, a lot for amdgpu and i915. bridge: - HPD improvments for lt9611uxc - eDP aux-bus support for ps8640 - LVDS data-mapping selection support ttm: - remove huge page functionality (needs reworking) - fix a race condition during BO eviction panels: - add some new panels fbdev: - fix double-free - remove unused scrolling acceleration - CONFIG_FB dep improvements locking: - improve contended locking logging - naming collision fix dma-buf: - add dma_resv_for_each_fence iterator - fix fence refcounting bug - name locking fixesA prime: - fix object references during mmap nouveau: - various code style changes - refcount fix - device removal fixes - protect client list with a mutex - fix CE0 address calculation i915: - DP rates related fixes - Revert disabling dual eDP that was causing state readout problems - put the cdclk vtables in const data - Fix DVO port type for older platforms - Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown - CCS FBs related fixes - Fix recursive lock in GuC submission - Revert guc_id from i915_request tracepoint - Build fix around dmabuf amdgpu: - GPU reset fix - Aldebaran fix - Yellow Carp fixes - DCN2.1 DMCUB fix - IOMMU regression fix for Picasso - DSC display fixes - BPC display calculation fixes - Other misc display fixes - Don't allow partial copy from user for DC debugfs - SRIOV fixes - GFX9 CSB pin count fix - Various IP version check fixes - DP 2.0 fixes - Limit DCN1 MPO fix to DCN1 amdkfd: - SVM fixes - Fix gfx version for renoir - Reset fixes udl: - timeout fix imx: - circular locking fix virtio: - NULL ptr deref fix" * tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm: (126 commits) drm/ttm: Double check mem_type of BO while eviction drm/amdgpu: add missed support for UVD IP_VERSION(3, 0, 64) drm/amdgpu: drop jpeg IP initialization in SRIOV case drm/amd/display: reject both non-zero src_x and src_y only for DCN1x drm/amd/display: Add callbacks for DMUB HPD IRQ notifications drm/amd/display: Don't lock connection_mutex for DMUB HPD drm/amd/display: Add comment where CONFIG_DRM_AMD_DC_DCN macro ends drm/amdkfd: Fix retry fault drain race conditions drm/amdkfd: lower the VAs base offset to 8KB drm/amd/display: fix exit from amdgpu_dm_atomic_check() abruptly drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov drm/amdgpu: fix uvd crash on Polaris12 during driver unloading drm/i915/adlp/fb: Prevent the mapping of redundant trailing padding NULL pages drm/i915/fb: Fix rounding error in subsampled plane size calculation drm/i915/hdmi: Turn DP++ TMDS output buffers back on in encoder->shutdown() drm/locking: fix __stack_depot_* name conflict drm/virtio: Fix NULL dereference error in virtio_gpu_poll drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support() drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs drm/amd/amdkfd: Don't sent command to HWS on kfd reset ...	2021-11-12 12:11:07 -08:00
Dave Airlie	951bad0bd9	Merge tag 'amd-drm-fixes-5.16-2021-11-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-fixes-5.16-2021-11-10: amdgpu: - Don't allow partial copy from user for DC debugfs - SRIOV fixes - GFX9 CSB pin count fix - Various IP version check fixes - DP 2.0 fixes - Limit DCN1 MPO fix to DCN1 amdkfd: - SVM fixes - Reset fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211110222536.7527-1-alexander.deucher@amd.com	2021-11-11 10:14:44 +10:00
Dave Airlie	f8ca7b7419	Removed the TTM Huge Page functionnality to address a crash, a timeout fix for udl, CONFIG_FB dependency improvements, a fix for a circular locking depency in imx, a NULL pointer dereference fix for virtio, and a naming collision fix for drm/locking. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYYuA2wAKCRDj7w1vZxhR xQdZAP4oqiWpCO1JfnZdvEJ/lOULqvdzYkbUZexshGLdbb4ECwEA83TzIbQvXP8p jsC1hPNAIsOBkQ+nGZwJkTOtDcpEAQ4= =N2pK -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-fixes-2021-11-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-next Removed the TTM Huge Page functionnality to address a crash, a timeout fix for udl, CONFIG_FB dependency improvements, a fix for a circular locking depency in imx, a NULL pointer dereference fix for virtio, and a naming collision fix for drm/locking. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20211110082114.vfpkpnecwdfg27lk@gilmour	2021-11-11 08:14:19 +10:00
Guchun Chen	4d395f938a	drm/amdgpu: add missed support for UVD IP_VERSION(3, 0, 64) Fixes: `96b8dd4423` ("drm/amdgpu/amdgpu_vcn: convert to IP version checking") Signed-off-by: Flora Cui <flora.cui@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-10 12:03:41 -05:00
Guchun Chen	b45a36032d	drm/amdgpu: drop jpeg IP initialization in SRIOV case Fixes: `b05b9c591f` ("drm/amdgpu: clean up set IP function") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-10 12:00:08 -05:00
shaoyunl	9f4f2c1a35	drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov The KFD pre_reset should be called before reset been executed, it will hold the lock to prevent other rocm process to sent the packlage to hiq during host execute the real reset on the HW Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-09 17:08:00 -05:00
Evan Quan	4fc30ea780	drm/amdgpu: fix uvd crash on Polaris12 during driver unloading There was a change(below) target for such issue: `d82e2c249c` ("drm/amdgpu: Fix crash on device remove/driver unload") But the fix for VI ASICs was missing there. This is a supplement for that. Fixes: `d82e2c249c` ("drm/amdgpu: Fix crash on device remove/driver unload") Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-09 17:06:15 -05:00
Alex Deucher	2d32ffd6e9	drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support() Properly handle SI DC support when CONFIG_DRM_AMD_DC_SI is not set. Fixes: `f7f12b2582` ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support") Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-05 14:12:53 -04:00
Felix Kuehling	5702d05295	drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs If a kfd_bo was shared (e.g. a dmabuf export), the original kfd_bo may be freed when the amdgpu_bo still lives on. Free the kfd_bo struct in the release_notify callback then the amdgpu_bo is freed. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-By: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-05 14:12:45 -04:00
Evan Quan	e6ef9b396b	drm/amdgpu: correctly toggle gfx on/off around RLC_SPM_* register access As part of the ib padding process, accessing the RLC_SPM_* register may trigger gfx hang. Since gfxoff may be already kicked during the whole period. To address that, we manually toggle gfx on/off around the RLC_SPM_* register access. This can resolve the gfx hang issue observed on running Talos with RDP launched in parallel. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-05 14:12:29 -04:00
Tao Zhou	7513c9ff44	drm/amdgpu: correct xgmi ras error count reset The error count reset for xgmi3x16 pcs is missed. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-05 14:12:21 -04:00
YuBiao Wang	6ddc0eb7a2	drm/amd/amdgpu: Fix csb.bo pin_count leak on gfx 9 [Why] csb bo is not unpinned in gfx 9. It will lead to pin_count leak on driver unload. [How] Call bo_free_kernel corresponding to bo_create_kernel in gfx_rlc_init_csb. This will also unify the code path with other gfx versions. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-05 14:11:32 -04:00
YuBiao Wang	c4fc13b581	drm/amd/amdgpu: Avoid writing GMC registers under sriov in gmc9 [Why] For Vega10, disabling gart of gfxhub could mess up KIQ and PSP under sriov mode, and lead to DMAR on host side. [How] Skip writing GMC registers under sriov. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-05 14:11:20 -04:00
Kent Russell	7ef6b7f844	drm/amdgpu: Make sure to reserve BOs before adding or removing BOs need to be reserved before they are added or removed, so ensure that they are reserved during kfd_mem_attach and kfd_mem_detach Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-05 14:11:05 -04:00
Jason Gunthorpe	0d97950953	drm/ttm: remove ttm_bo_vm_insert_huge() The huge page functionality in TTM does not work safely because PUD and PMD entries do not have a special bit. get_user_pages_fast() considers any page that passed pmd_huge() as usable: if (unlikely(pmd_trans_huge(pmd) \|\| pmd_huge(pmd) \|\| pmd_devmap(pmd))) { And vmf_insert_pfn_pmd_prot() unconditionally sets entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); eg on x86 the page will be _PAGE_PRESENT \| PAGE_PSE. As such gup_huge_pmd() will try to deref a struct page: head = try_grab_compound_head(pmd_page(orig), refs, flags); and thus crash. Thomas further notices that the drivers are not expecting the struct page to be used by anything - in particular the refcount incr above will cause them to malfunction. Thus everything about this is not able to fully work correctly considering GUP_fast. Delete it entirely. It can return someday along with a proper PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast. Fixes: `314b6580ad` ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> [danvet: Update subject per Thomas' &Christian's review] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com	2021-11-05 11:13:19 +01:00
Linus Torvalds	5c904c66ed	Char/Misc driver update for 5.16-rc1 Here is the big set of char and misc and other tiny driver subsystem updates for 5.16-rc1. Loads of things in here, all of which have been in linux-next for a while with no reported problems (except for one called out below.) Included are: - habanana labs driver updates, including dma_buf usage, reviewed and acked by the dma_buf maintainers - iio driver update (going through this tree not staging as they really do not belong going through that tree anymore) - counter driver updates - hwmon driver updates that the counter drivers needed, acked by the hwmon maintainer - xillybus driver updates - binder driver updates - extcon driver updates - dma_buf module namespaces added (will cause a build error in arm64 for allmodconfig, but that change is on its way through the drm tree) - lkdtm driver updates - pvpanic driver updates - phy driver updates - virt acrn and nitr_enclaves driver updates - smaller char and misc driver updates Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPX2A8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymUUgCbB4EKysgLuXYdjUalZDx+vvZO4k0AniS14O4k F+2dVSZ5WX6wumUzCaA6 =bXQM -----END PGP SIGNATURE----- Merge tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big set of char and misc and other tiny driver subsystem updates for 5.16-rc1. Loads of things in here, all of which have been in linux-next for a while with no reported problems (except for one called out below.) Included are: - habanana labs driver updates, including dma_buf usage, reviewed and acked by the dma_buf maintainers - iio driver update (going through this tree not staging as they really do not belong going through that tree anymore) - counter driver updates - hwmon driver updates that the counter drivers needed, acked by the hwmon maintainer - xillybus driver updates - binder driver updates - extcon driver updates - dma_buf module namespaces added (will cause a build error in arm64 for allmodconfig, but that change is on its way through the drm tree) - lkdtm driver updates - pvpanic driver updates - phy driver updates - virt acrn and nitr_enclaves driver updates - smaller char and misc driver updates" * tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (386 commits) comedi: dt9812: fix DMA buffers on stack comedi: ni_usb6501: fix NULL-deref in command paths arm64: errata: Enable TRBE workaround for write to out-of-range address arm64: errata: Enable workaround for TRBE overwrite in FILL mode coresight: trbe: Work around write to out of range coresight: trbe: Make sure we have enough space coresight: trbe: Add a helper to determine the minimum buffer size coresight: trbe: Workaround TRBE errata overwrite in FILL mode coresight: trbe: Add infrastructure for Errata handling coresight: trbe: Allow driver to choose a different alignment coresight: trbe: Decouple buffer base from the hardware base coresight: trbe: Add a helper to pad a given buffer area coresight: trbe: Add a helper to calculate the trace generated coresight: trbe: Defer the probe on offline CPUs coresight: trbe: Fix incorrect access of the sink specific data coresight: etm4x: Add ETM PID for Kryo-5XX coresight: trbe: Prohibit trace before disabling TRBE coresight: trbe: End the AUX handle on truncation coresight: trbe: Do not truncate buffer on IRQ coresight: trbe: Fix handling of spurious interrupts ...	2021-11-04 08:21:47 -07:00
James Zhu	93cec18478	drm/amdgpu: remove duplicated kfd_resume_iommu Remove duplicated kfd_resume_iommu which already runs in mdgpu_amdkfd_device_init. Tested-By: Ken Moffat <zarniwhoop@ntlworld.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-03 12:22:08 -04:00
Aaron Liu	e8a423c589	drm/amdgpu: update RLC_PG_DELAY_3 Value to 200us for yellow carp For yellow carp, the desired CGPG hysteresis value is 0x4E20. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-03 12:22:08 -04:00
Mario Limonciello	c92f909614	drm/amdgpu: Convert SMU version to decimal in debugfs This is more useful when talking to the SMU team to have the information in this format, save one less step to manually do it. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-03 12:22:07 -04:00
Oak Zeng	72c148d776	drm/amdgpu: use correct register mask to extract field Aldebaran has different register mask definitions for regiter MC_VM_XGMI_LFB_CNTL. Use the correct masks to interpret fields of this register. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-03 12:22:07 -04:00
Jingwen Chen	38d4e4638e	drm/amd/amdgpu: fix bad job hw_fence use after free in advance tdr [Why] In advance tdr mode, the real bad job will be resubmitted twice, while in drm_sched_resubmit_jobs_ext, there's a dma_fence_put, so the bad job is put one more time than other jobs. [How] Adding dma_fence_get before resbumit job in amdgpu_device_recheck_guilty_jobs and put the fence for normal jobs Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-11-03 12:22:07 -04:00
Linus Torvalds	56d3375448	drm for 5.16-rc1 core: - improve dma_fence, lease and resv documentation - shmem-helpers: allocate WC pages on x86, use vmf_insert_pin - sched fixes/improvements - allow empty drm leases - add dma resv iterator - add more DP 2.0 headers - DP MST helper improvements for DP2.0 dma-buf: - avoid warnings, remove fence trace macros bridge: - new helper to get rid of panels - probe improvements for it66121 - enable DSI EOTP for anx7625 fbdev: - efifb: release runtime PM on destroy ttm: - kerneldoc switch - helper to clear all DMA mappings - pool shrinker optimizaton - remove ttm_tt_destroy_common - update ttm_move_memcpy for async use panel: - add new panel-edp driver amdgpu: - Initial DP 2.0 support - Initial USB4 DP tunnelling support - Aldebaran MCE support - Modifier support for DCC image stores for GFX 10.3 - Display rework for better FP code handling - Yellow Carp/Cyan Skillfish updates - Cyan Skillfish display support - convert vega/navi to IP discovery asic enumeration - validate IP discovery table - RAS improvements - Lots of fixes i915: - DG1 PCI IDs + LMEM discovery/placement - DG1 GuC submission by default - ADL-S PCI IDs updated + enabled by default - ADL-P (XE_LPD) fixed and updates - DG2 display fixes - PXP protected object support for Gen12 integrated - expose multi-LRC submission interface for GuC - export logical engine instance to user - Disable engine bonding on Gen12+ - PSR cleanup - PSR2 selective fetch by default - DP 2.0 prep work - VESA vendor block + MSO use of it - FBC refactor - try again to fix fast-narrow vs slow-wide eDP training - use THP when IOMMU enabled - LMEM backup/restore for suspend/resume - locking simplification - GuC major reworking - async flip VT-D workaround changes - DP link training improvements - misc display refactorings bochs: - new PCI ID rcar-du: - Non-contiguious buffer import support for rcar-du - r8a779a0 support prep omapdrm: - COMPILE_TEST fixes sti: - COMPILE_TEST fixes msm: - fence ordering improvements - eDP support in DP sub-driver - dpu irq handling cleanup - CRC support for making igt happy - NO_CONNECTOR bridge support - dsi: 14nm phy support for msm8953 - mdp5: msm8x53, sdm450, sdm632 support stm: - layer alpha + zpo support v3d: - fix Vulkan CTS failure - support multiple sync objects gud: - add R8/RGB332/RGB888 pixel formats vc4: - convert to new bridge helpers vgem: - use shmem helpers virtio: - support mapping exported vram zte: - remove obsolete driver rockchip: - use bridge attach no connector for LVDS/RGB -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmGByPYACgkQDHTzWXnE hr6fxA//cXUvTHlEtF7UJDBRAYv+9lXH39NbGYU4aLJuBNlZztCuUi5JOSyDFDH1 N9VI5biVseev2PEnCzJUubWxTqbUO7FBQTw0TyvZ4Eqn+UZMuFeo0dvdKZRAkvjV VHSUc0fm0+WSYanKUK7XK0fwG8aE6JVyYngzgKPSjifhszTdiiRsbU21iTinFhkS rgh3HEVELp+LqfoG4qzAYqFUjYqUjvCjd/hX/UkzCII8ZXKr38/4127e95443WOk +jes0gWGJe9TvSDrqo9TMx4qukcOniINFUvnzoD2RhOS+Jzr/i5rBh51Xy92g3NO Q7hy6byZdk/ZO/MXCDQ2giUOkBiqn5fQjlRGQp4iAZYw9pb3HU+/xrTq0BWVWd8o /vmzZYEKKU/sCGpxVDMZxsHV3mXIuVBvuZq6bjmSGcybgOBCiDx5F/Rum4nY2yHp lr3cuc0HP3m3f4b/HVvACO4tGd1nDDpVcon7CuhBB7HB7t6Zl9u18qc/qFw0tCTh 3sgAhno6XFXtPFcSX2KAeeg0mhKDKKrsOnq5y3bDRr05Z0jLocJk95aXEKs6em4j gbyHwNaX3CHtiCnFn2/5169+n1K7zqHBtVSGmQlmFDv55rcdx7L3Spk7tCahQeSQ ur24r+sEggm8d5Wjl+MYq6wW3oP31s04JFaeV6oCkaSp1wS+alg= =jdhH -----END PGP SIGNATURE----- Merge tag 'drm-next-2021-11-03' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "Summary below. i915 starts to add support for DG2 GPUs, enables DG1 and ADL-S support by default, lots of work to enable DisplayPort 2.0 across drivers. Lots of documentation updates and fixes across the board. core: - improve dma_fence, lease and resv documentation - shmem-helpers: allocate WC pages on x86, use vmf_insert_pin - sched fixes/improvements - allow empty drm leases - add dma resv iterator - add more DP 2.0 headers - DP MST helper improvements for DP2.0 dma-buf: - avoid warnings, remove fence trace macros bridge: - new helper to get rid of panels - probe improvements for it66121 - enable DSI EOTP for anx7625 fbdev: - efifb: release runtime PM on destroy ttm: - kerneldoc switch - helper to clear all DMA mappings - pool shrinker optimizaton - remove ttm_tt_destroy_common - update ttm_move_memcpy for async use panel: - add new panel-edp driver amdgpu: - Initial DP 2.0 support - Initial USB4 DP tunnelling support - Aldebaran MCE support - Modifier support for DCC image stores for GFX 10.3 - Display rework for better FP code handling - Yellow Carp/Cyan Skillfish updates - Cyan Skillfish display support - convert vega/navi to IP discovery asic enumeration - validate IP discovery table - RAS improvements - Lots of fixes i915: - DG1 PCI IDs + LMEM discovery/placement - DG1 GuC submission by default - ADL-S PCI IDs updated + enabled by default - ADL-P (XE_LPD) fixed and updates - DG2 display fixes - PXP protected object support for Gen12 integrated - expose multi-LRC submission interface for GuC - export logical engine instance to user - Disable engine bonding on Gen12+ - PSR cleanup - PSR2 selective fetch by default - DP 2.0 prep work - VESA vendor block + MSO use of it - FBC refactor - try again to fix fast-narrow vs slow-wide eDP training - use THP when IOMMU enabled - LMEM backup/restore for suspend/resume - locking simplification - GuC major reworking - async flip VT-D workaround changes - DP link training improvements - misc display refactorings bochs: - new PCI ID rcar-du: - Non-contiguious buffer import support for rcar-du - r8a779a0 support prep omapdrm: - COMPILE_TEST fixes sti: - COMPILE_TEST fixes msm: - fence ordering improvements - eDP support in DP sub-driver - dpu irq handling cleanup - CRC support for making igt happy - NO_CONNECTOR bridge support - dsi: 14nm phy support for msm8953 - mdp5: msm8x53, sdm450, sdm632 support stm: - layer alpha + zpo support v3d: - fix Vulkan CTS failure - support multiple sync objects gud: - add R8/RGB332/RGB888 pixel formats vc4: - convert to new bridge helpers vgem: - use shmem helpers virtio: - support mapping exported vram zte: - remove obsolete driver rockchip: - use bridge attach no connector for LVDS/RGB" * tag 'drm-next-2021-11-03' of git://anongit.freedesktop.org/drm/drm: (1259 commits) drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits drm/amd/display: MST support for DPIA drm/amdgpu: Fix even more out of bound writes from debugfs drm/amdgpu/discovery: add SDMA IP instance info for soc15 parts drm/amdgpu/discovery: add UVD/VCN IP instance info for soc15 parts drm/amdgpu/UAPI: rearrange header to better align related items drm/amd/display: Enable dpia in dmub only for DCN31 B0 drm/amd/display: Fix USB4 hot plug crash issue drm/amd/display: Fix deadlock when falling back to v2 from v3 drm/amd/display: Fallback to clocks which meet requested voltage on DCN31 drm/amd/display: move FPU associated DCN301 code to DML folder drm/amd/display: fix link training regression for 1 or 2 lane drm/amd/display: add two lane settings training options drm/amd/display: decouple hw_lane_settings from dpcd_lane_settings drm/amd/display: implement decide lane settings drm/amd/display: adopt DP2.0 LT SCR revision 8 drm/amd/display: FEC configuration for dpia links in MST mode drm/amd/display: FEC configuration for dpia links drm/amd/display: Add workaround flag for EDID read on certain docks drm/amd/display: Set phy_mux_sel bit in dmub scratch register ...	2021-11-02 16:47:49 -07:00
Linus Torvalds	6e5772c8d9	Add an interface called cc_platform_has() which is supposed to be used by confidential computing solutions to query different aspects of the system. The intent behind it is to unify testing of such aspects instead of having each confidential computing solution add its own set of tests to code paths in the kernel, leading to an unwieldy mess. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmF/uLUACgkQEsHwGGHe VUqGbQ/+LOmz8hmL5vtbXw/lVonCSBRKI2KVefnN2VtQ3rjtCq8HlNoq/hAdi15O WntABFV8u4daNAcssp+H/p+c8Mt/NzQa60TRooC5ZIynSOCj4oZQxTWjcnR4Qxrf oABy4sp09zNW31qExtTVTwPC/Ejzv4hA0Vqt9TLQOSxp7oYVYKeDJNp79VJK64Yz Ky7epgg8Pauk0tAT76ATR4kyy9PLGe4/Ry0bOtAptO4NShL1RyRgI0ywUmptJHSw FV/MnoexdAs4V8+4zPwyOkf8YMDnhbJcvFcr7Yd9AEz2q9Z1wKCgi1M3aZIoW8lV YMXECMGe9DfxmEJbnP5zbnL6eF32x+tbq+fK8Ye4V2fBucpWd27zkcTXjoP+Y+zH NLg+9QykR9QCH75YCOXcAg1Q5hSmc4DaWuJymKjT+W7MKs89ywjq+ybIBpLBHbQe uN9FM/CEKXx8nQwpNQc7mdUE5sZeCQ875028RaLbLx3/b6uwT6rBlNJfxl/uxmcZ iF1kG7Cx4uO+7G1a9EWgxtWiJQ8GiZO7PMCqEdwIymLIrlNksAk7nX2SXTuH5jIZ YDuBj/Xz2UUVWYFm88fV5c4ogiFlm9Jeo140Zua/BPdDJd2VOP013rYxzFE/rVSF SM2riJxCxkva8Fb+8TNiH42AMhPMSpUt1Nmd1H2rcEABRiT83Ow= =Na0U -----END PGP SIGNATURE----- Merge tag 'x86_cc_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull generic confidential computing updates from Borislav Petkov: "Add an interface called cc_platform_has() which is supposed to be used by confidential computing solutions to query different aspects of the system. The intent behind it is to unify testing of such aspects instead of having each confidential computing solution add its own set of tests to code paths in the kernel, leading to an unwieldy mess" * tag 'x86_cc_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: treewide: Replace the use of mem_encrypt_active() with cc_platform_has() x86/sev: Replace occurrences of sev_es_active() with cc_platform_has() x86/sev: Replace occurrences of sev_active() with cc_platform_has() x86/sme: Replace occurrences of sme_active() with cc_platform_has() powerpc/pseries/svm: Add a powerpc version of cc_platform_has() x86/sev: Add an x86 version of cc_platform_has() arch/cc: Introduce a function to check for confidential computing features x86/ioremap: Selectively build arch override encryption functions	2021-11-01 15:16:52 -07:00
Alex Deucher	403475be6d	drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits The DMA mask on SI parts is 40 bits not 44. Copy paste typo. Fixes: `244511f386` ("drm/amdgpu: simplify and cleanup setting the dma mask") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1762 Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:27:00 -04:00
Alex Deucher	58f8c7fa88	drm/amdgpu/discovery: add SDMA IP instance info for soc15 parts Add secondary instance version info for soc15 parts. Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:26:59 -04:00
Alex Deucher	074b2092d9	drm/amdgpu/discovery: add UVD/VCN IP instance info for soc15 parts Add secondary instance version info for vega20, arcturure, and aldebaran. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:26:59 -04:00
Tao Zhou	3d1a8d950d	drm/amdgpu: remove GPRs init for ALDEBARAN in gpu reset (v3) Remove GPRs init for ALDEBARAN in gpu reset temporarily, will add the init once the algorithm is stable. v2: Only remove GPRs init in gpu reset. v3: Suspend needs it, only skip it in gpu reset. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:26:13 -04:00
Tao Zhou	f7e053435c	drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN Skip GPRs init in specific condition since current GPRs init algorithm only works for some CU settings. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:26:13 -04:00
Candice Li	4320e6f86d	drm/amdgpu: Update TA version output in driver TA version should only be displayed in firmware version column. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:26:12 -04:00
Lang Yu	a5c5d8d50e	drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw() amdgpu_fence_driver_sw_fini() should be executed before amdgpu_device_ip_fini(), otherwise fence driver resource won't be properly freed as adev->rings have been tore down. Fixes: `72c8c97b15` ("drm/amdgpu: Split amdgpu_device_fini into early and late") Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:26:12 -04:00
Lang Yu	68df0f195a	drm/amdkfd: Separate pinned BOs destruction from general routine Currently, all kfd BOs use same destruction routine. But pinned BOs are not unpinned properly. Separate them from general routine. v2 (Felix): Add safeguard to prevent user space from freeing signal BO. Kunmap signal BO in the event of setting event page error. Just kunmap signal BO to avoid duplicating the code. Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:26:12 -04:00
Philip Yang	3b8a23ae52	drm/amdkfd: restore userptr ignore bad address error The userptr can be unmapped by application and still registered to driver, restore userptr work return user pages will get -EFAULT bad address error. Pretend this error as succeed. GPU access this userptr will have VM fault later, it is better than application soft hangs with stalled user mode queues. v2: squash in warning fix (Alex) Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:26:12 -04:00
Kent Russell	68daadf3d6	drm/amdgpu: Add kernel parameter support for ignoring bad page threshold When a GPU hits the bad_page_threshold, it will not be initialized by the amdgpu driver. This means that the table cannot be cleared, nor can information gathering be performed (getting serial number, BDF, etc). If the bad_page_threshold kernel parameter is set to -2, continue to initialize the GPU, while printing a warning to dmesg that this action has been done v2: squash in Luben's fix to restore RAS info reporting Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Mukul Joshi <Mukul.Joshi@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:26:12 -04:00
Kent Russell	8483fdfea7	drm/amdgpu: Warn when bad pages approaches 90% threshold dmesg doesn't warn when the number of bad pages approaches the threshold for page retirement. WARN when the number of bad pages is at 90% or greater for easier checks and planning, instead of waiting until the GPU is full of bad pages. Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Mukul Joshi <Mukul.Joshi@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-28 14:26:12 -04:00
Dave Airlie	970eae1560	Linux 5.15-rc7 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmF298ceHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGIJYH/1rsEFQQ6caeQdy1 z9eFIe48DNM4l7bFk+qEj2UAbzPdahVJ299Mg5fW0n2CDemOc9/n0b9TxQ37YObi mOzu0xwJVupIxkyFMPQSSc2q8aLm67NSpJy08DsmaNses5hSvu8x15RPHLQTybjt SwtKns+jpCq79P1GWbrB5e5UkLb0VNoxNp4L1U4pMrYGcEkJUXbaxNY2V/JcXdM7 Vtn+qN0T/J6V6QVftv0t8Ecj3bjEnmL3kZHaTaNg3dGeKRpCGyHc5lcBQ0cNFG6t vjZ9VbuhBzGI3TN2tHH5hpA1UXo7HPBBCwQqxF1jeGLGHULikYwZ3TAPWqL3QZqC 9cxr9SY= =p75d -----END PGP SIGNATURE----- BackMerge tag 'v5.15-rc7' into drm-next The msm next tree is based on rc3, so let's just backmerge rc7 before pulling it in. Signed-off-by: Dave Airlie <airlied@redhat.com>	2021-10-28 14:59:38 +10:00
Maxime Ripard	736638246e	Merge drm/drm-next into drm-misc-next drm-misc-next hasn't been updated in a while and I need a post -rc2 state to merge some vc4 patches. Signed-off-by: Maxime Ripard <maxime@cerno.tech>	2021-10-25 15:27:56 +02:00
Greg Kroah-Hartman	16b0314aa7	dma-buf: move dma-buf symbols into the DMA_BUF module namespace In order to better track where in the kernel the dma-buf code is used, put the symbols in the namespace DMA_BUF and modify all users of the symbols to properly import the namespace to not break the build at the same time. Now the output of modinfo shows the use of these symbols, making it easier to watch for users over time: $ modinfo drivers/misc/fastrpc.ko \| grep import import_ns: DMA_BUF Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: dri-devel@lists.freedesktop.org Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20211010124628.17691-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-10-25 14:53:08 +02:00
Alex Deucher	df9feb1a69	drm/amdgpu/nbio7.4: use original HDP_FLUSH bits The extended bits were not available for use on vega20 and presumably arcturus as well. Fixes: `a0f9f85466` ("drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12") Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-22 10:11:41 -04:00
Christian König	25b8a14e88	drm/amdgpu: use new iterator in amdgpu_ttm_bo_eviction_valuable Simplifying the code a bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-13-christian.koenig@amd.com	2021-10-22 14:41:59 +02:00
Christian König	930ca2a7cb	drm/amdgpu: use the new iterator in amdgpu_sync_resv Simplifying the code a bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-12-christian.koenig@amd.com	2021-10-22 14:41:07 +02:00
Alex Deucher	8cbc52c207	drm/amdgpu: Workaround harvesting info for some navy flounder boards Some navy flounder boards do not properly mark harvested VCN instances. Fix that here. v2: use IP versions Fixes: `1b592d00b4` ("drm/amdgpu/vcn: remove manual instance setting") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1743 Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-21 23:39:00 -04:00
Alex Deucher	47be978be0	drm/amdgpu/vcn3.0: remove intermediate variable No need to use the id variable, just use the constant plus instance offset directly. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-21 23:38:57 -04:00
Alex Deucher	7876c7ea14	drm/amdgpu/vcn2.0: remove intermediate variable No need to use the tmp variable, just use the constant directly. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-21 23:38:53 -04:00
Alex Deucher	c5dd5667f4	drm/amdgpu: Consolidate VCN firmware setup code Roughly the same code was present in all VCN versions. Consolidate it into a single function. v2: use AMDGPU_UCODE_ID_VCN + i, check if num_inst >= 2 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com>	2021-10-21 23:38:46 -04:00
Alex Deucher	e8ac9e93b4	drm/amdgpu/vcn3.0: handle harvesting in firmware setup Only enable firmware for the instance that is enabled. v2: use AMDGPU_UCODE_ID_VCN + i Fixes: `1b592d00b4` ("drm/amdgpu/vcn: remove manual instance setting") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1743 Reviewed-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-21 23:38:41 -04:00
Jingwen Chen	e77f0f5c6a	drm/amd/amdgpu: add dummy_page_addr to sriov msg Add dummy_page_addr to sriov msg for host driver to set GCVM_L2_PROTECTION_DEFAULT_ADDR* registers correctly. v2: should update vf2pf msg instead Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-21 23:38:16 -04:00
Huang Rui	a61794bd2f	drm/amdgpu: remove grbm cam index/data operations for gfx v10 PSP firmware will be responsible for applying the GRBM CAM remapping in the production. And the GRBM_CAM_INDEX / GRBM_CAM_DATA registers will be protected by PSP under security policy. So remove it according to the new security policy. Signed-off-by: Huang Rui <ray.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-21 23:38:10 -04:00
Aaron Liu	53c2ff8bcb	drm/amdgpu: support B0&B1 external revision id for yellow carp B0 internal rev_id is 0x01, B1 internal rev_id is 0x02. The external rev_id for B0 and B1 is 0x20. The original expression is not suitable for B1. v2: squash in fix for display code (Alex) Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-10-20 15:27:31 -04:00
Kent Russell	dcd5ea9f94	drm/amdgpu: Clarify error when hitting bad page threshold Change the error message when the bad_page_threshold is reached, explicitly stating that the GPU will not be initialized. Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Mukul Joshi <Mukul.Joshi@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-20 11:43:57 -04:00
Alex Deucher	0d055f09e1	drm/amdgpu: drop navi reg init functions No longer used since IP enumeration is driven by the IP discovery table now. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-20 11:43:57 -04:00
Alex Deucher	bf99b9b032	drm/amdgpu: drop nv_set_ip_blocks() No longer used since IP enumeration is now driven by amdgpu IP discovery code. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-20 11:43:57 -04:00
Alex Deucher	7092432e3c	drm/amdgpu: drop soc15_set_ip_blocks() No longer used since IP enumeration is now driven by amdgpu IP discovery code. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-20 11:43:57 -04:00
Alex Deucher	c9c7d18045	drm/amdgpu/gfx10: fix typo in gfx_v10_0_update_gfx_clock_gating() Check was incorrectly converted to IP version checking. Fixes: `4b0ad84254` ("drm/amdgpu/gfx10: convert to IP version checking") Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-20 11:43:56 -04:00
Qing Wang	40320159f0	drm/amdgpu: replace snprintf in show functions with sysfs_emit show() must not use snprintf() when formatting the value to be returned to user space. Fix the following coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c:427: WARNING: use scnprintf or sprintf. Signed-off-by: Qing Wang <wangqing@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-20 11:43:56 -04:00
Aaron Liu	5efacdf072	drm/amdgpu: support B0&B1 external revision id for yellow carp B0 internal rev_id is 0x01, B1 internal rev_id is 0x02. The external rev_id for B0 and B1 is 0x20. The original expression is not suitable for B1. v2: squash in fix for display code (Alex) Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-20 11:43:56 -04:00
Christian König	a0a8e75948	drm/amdgpu: use new iterator in amdgpu_vm_prt_fini No need to actually allocate an array of fences here. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-14-christian.koenig@amd.com	2021-10-20 14:06:35 +02:00
Guchun Chen	dac35c4239	drm/amdgpu/discovery: parse hw_id_name for SDMA instance 2 and 3 Otherwise, hw_id_name string is NULL for SDMA 2 and 3 when dumping ip version from VBIOS. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-19 17:32:52 -04:00
Tao Zhou	42f88ab772	drm/amdgpu: output warning for unsupported ras error inject (v2) Output a warning message if RAS TA returns UNSUPPORTED_ERROR_INJ status. v2: implement it in psp_ras_ta_check_status function. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-19 17:32:52 -04:00
Tao Zhou	1b5254e8d9	drm/amdgpu: centralize checking for RAS TA status Create new function to check status returned by RAS TA. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-19 17:32:52 -04:00
YuBiao Wang	a3848df60b	drm/amd/amdgpu: Do irq_fini_hw after ip_fini_early [Why] drm_irq_uninstall is called in irq_fini_hw so that irq is disabled in sw stage. SMU (and maybe other IP blocks) fini_hw will call irq_put for cleanup and the whole cleanup process will be skipped because of drm->irq_enable = false. [How] Move ip_fini_early before irq_fini_hw. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-19 17:16:50 -04:00
Tao Zhou	c72942c167	drm/amdgpu: load PSP RL in resume path Some registers' access will fail without PSP RL after resume. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-19 17:14:33 -04:00
Lang Yu	5aeeac6fa3	drm/amdkfd: Fix an inappropriate error handling in allloc memory of gpu We should unreference a gem object instead of an amdgpu bo here. Fixes: `fd9a9f8801` ("drm/amdgpu: Use GEM obj reference for KFD BOs") Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-19 17:13:48 -04:00
Alex Deucher	48737ac4d7	drm/amdgpu/psp: add some missing cases to psp_check_pmfw_centralized_cstate_management Missed a few asics. v2: update comment Fixes: `82d05736c4` ("drm/amdgpu/amdgpu_psp: convert to IP version checking") Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-13 22:51:41 -04:00
Philip Yang	43fc10c187	drm/amdkfd: unregistered svm range not overlap with TTM range When creating unregistered new svm range to recover retry fault, avoid new svm range to overlap with ranges or userptr ranges managed by TTM, otherwise svm migration will trigger TTM or userptr eviction, to evict user queues unexpectedly. Change helper amdgpu_ttm_tt_affect_userptr to return userptr which is inside the range. Add helper svm_range_check_vm_userptr to scan all userptr of the vm, and return overlap userptr bo start, last. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-13 22:20:13 -04:00
Yifan Zhang	afd18180c0	drm/amdkfd: fix boot failure when iommu is disabled in Picasso. When IOMMU disabled in sbios and kfd in iommuv2 path, iommuv2 init will fail. But this failure should not block amdgpu driver init. Reported-by: youling <youling257@gmail.com> Tested-by: youling <youling257@gmail.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-13 14:15:52 -04:00
Mukul Joshi	91a1a52d03	drm/amdgpu: Fix RAS page retirement with mode2 reset on Aldebaran During mode2 reset, the GPU is temporarily removed from the mgpu_info list. As a result, page retirement fails because it cannot find the GPU in the GPU list. To fix this, create our own list of GPUs that support MCE notifier based page retirement and use that list to check if the UMC error occurred on a GPU that supports MCE notifier based page retirement. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-13 14:14:48 -04:00
Mukul Joshi	a4967a1ebf	drm/amdgpu: Enable RAS error injection after mode2 reset on Aldebaran Add the missing call to re-enable RAS error injections on the Aldebaran mode2 reset code path. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-13 14:14:35 -04:00
Lang Yu	fe04957e26	drm/amdgpu: enable display for cyan skillfish Display support for cyan skillfish is ready now. Enable it! Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-13 14:14:34 -04:00
Alex Deucher	369b7d04ba	drm/amdgpu/nbio2.3: don't use GPU_HDP_FLUSH bit 12 It's used internally by firmware. Using it in the driver could conflict with firmware. v2: squash in fix for navi1x (Alex) Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-13 14:14:34 -04:00
Alex Deucher	a0f9f85466	drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12 It's used internally by firmware. Using it in the driver could conflict with firmware. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-11 11:57:00 -04:00
Dave Airlie	797d72ce8e	drm-misc-next for v5.16: UAPI Changes: - Allow empty drm leases for creating separate GEM namespaces. Cross-subsystem Changes: - Slightly rework dma_buf_poll. - Add dma_resv_for_each_fence_unlocked to iterate, and use it inside the lockless dma-resv functions. Core Changes: - Allow devm_drm_of_get_bridge to build without CONFIG_OF for compile testing. - Add more DP2 headers. - fix CONFIG_FB dependency in fb_helper. - Add DRM_FORMAT_R8 to drm_format_info, and helpers for RGB332 and RGB888. - Fix crash on a 0 or invalid EDID. Driver Changes: - Apply and revert DRM_MODESET_LOCK_ALL_BEGIN. - Add mode_valid to ti-sn65dsi86 bridge. - Support multiple syncobjs in v3d. - Add R8, RGB332 and RGB888 pixel formats to GUD. - Use devm_add_action_or_reset in dw-hdmi-cec. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmFdfuwACgkQ/lWMcqZw E8OTgg/+Nmsqhj1tsbSCWF1yx81CXHVSOhExPaMl+GPs6+y+sZ+U2rN99dnbULvA U56eOmjc8FvgmK89BwhSYNt++QYIRRpzjBGlCYm4bwpgqFOmYsK+en35PYMwHdxM Ke8newhzqa6/detvjX52igddZzrBv1Cs8aXuV5rw7Dg0ivlSlQUV0MO8JYwCliWI arRT8bg7wzUzhyRZqwqOqKXjvRirqBlFjJmvfL0WgHevZbzYuXbn4eWCUgCVthMH pU9QgK6FMW912pBxVppDO2aTDmNvqwj1BsB3RFfRuqS/JJ4s/gf39JxsipnI+/qn kPxZVFzzonR8Nl6h9sPi1jZrcVDCBebFgyG8hSgIVb/09U7AVYomtP18VKeh8yCy Pp4iQINqOcyMPmXKF491LIL92dcXZAIRaRQFKc/ZSHcfIDA7ZB1+7zf1ixBjlxjP GqtjLbmPspI2DzBRlTFEdf58jvX70E5nFYdQyYcy3VprJHuqEgL5PKz2Xcnve6R0 dEkGA2vMrGtb23YyjbFTNfkdvg9WYXze9HbQLt7kc8mI77TugkG0/rCcwv5pEEu3 WSwqMeb+5H+7va4AI715MoXbxgnCba2zPTUm1s8kSqTK0Oighc/vWcnnJ4iVuEGE 8Xt8AIIYUtccufR6ujucVUh7nju2ZOnFE7S92LybnGnByAIADfM= =qxpr -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2021-10-06' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.16: UAPI Changes: - Allow empty drm leases for creating separate GEM namespaces. Cross-subsystem Changes: - Slightly rework dma_buf_poll. - Add dma_resv_for_each_fence_unlocked to iterate, and use it inside the lockless dma-resv functions. Core Changes: - Allow devm_drm_of_get_bridge to build without CONFIG_OF for compile testing. - Add more DP2 headers. - fix CONFIG_FB dependency in fb_helper. - Add DRM_FORMAT_R8 to drm_format_info, and helpers for RGB332 and RGB888. - Fix crash on a 0 or invalid EDID. Driver Changes: - Apply and revert DRM_MODESET_LOCK_ALL_BEGIN. - Add mode_valid to ti-sn65dsi86 bridge. - Support multiple syncobjs in v3d. - Add R8, RGB332 and RGB888 pixel formats to GUD. - Use devm_add_action_or_reset in dw-hdmi-cec. Signed-off-by: Dave Airlie <airlied@redhat.com> # gpg: Signature made Wed 06 Oct 2021 20:48:12 AEST # gpg: using RSA key B97BD6A80CAC4981091AE547FE558C72A67013C3 # gpg: Good signature from "Maarten Lankhorst <maarten.lankhorst@linux.intel.com>" [expired] # gpg: aka "Maarten Lankhorst <maarten@debian.org>" [expired] # gpg: aka "Maarten Lankhorst <maarten.lankhorst@canonical.com>" [expired] # gpg: Note: This key has expired! # Primary key fingerprint: B97B D6A8 0CAC 4981 091A E547 FE55 8C72 A670 13C3 From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/2602f4e9-a8ac-83f8-6c2a-39fd9ca2e1ba@linux.intel.com	2021-10-11 12:39:15 +10:00
Guchun Chen	c58a863b1c	drm/amdgpu: use adev_to_drm for consistency when accessing drm_device adev_to_drm is used everywhere, so improve recent changes when accessing drm_device pointer from amdgpu_device. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-08 13:22:13 -04:00
Alex Deucher	35bdf463de	drm/amdgpu: add missing case for HDP for renoir Missing 4.1.2. Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-08 13:20:13 -04:00
Alex Deucher	73bf66712d	drm/amdgpu/discovery: add missing case for SMU 11.0.5 Was missed when converting the driver over to IP based initialization. Tested-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-08 13:20:13 -04:00
Nirmoy Das	58144d2837	drm/amdgpu: unify BO evicting method in amdgpu_ttm Unify BO evicting functionality for possible memory types in amdgpu_ttm.c. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-07 11:55:46 -04:00
Mukul Joshi	12b2cab790	drm/amdgpu: Register MCE notifier for Aldebaran RAS On Aldebaran, GPU driver will handle bad page retirement for GPU memory even though UMC is host managed. As a result, register a bad page retirement handler on the mce notifier chain to retire bad pages on Aldebaran. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-06 15:53:38 -04:00
Nirmoy Das	5b9581df9f	drm/amdgpu: return early if debugfs is not initialized Check first if debugfs is initialized before creating amdgpu debugfs files. References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686 Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-06 15:53:14 -04:00
Marek Olšák	f2e7d85680	drm/amd/display: fix DCC settings for DCN3 ind_block_64b_no_128bcl means INDEP_64B && INDEP_128B && MAX_COMPRESSED_BLOCK_SIZE == 64B. Only used by gfx10.3. ind_block_64b means INDEP_64B && !INDEP_128B && MAX_COMPRESSED_BLOCK_SIZE == 64B. Only used by gfx9 and gfx10. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-06 15:50:43 -04:00
Guchun Chen	248b061689	drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume In current code, when a PCI error state pci_channel_io_normal is detectd, it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI driver will continue the execution of PCI resume callback report_resume by pci_walk_bridge, and the callback will go into amdgpu_pci_resume finally, where write lock is releasd unconditionally without acquiring such lock first. In this case, a deadlock will happen when other threads start to acquire the read lock. To fix this, add a member in amdgpu_device strucutre to cache pci_channel_state, and only continue the execution in amdgpu_pci_resume when it's pci_channel_io_frozen. Fixes: `c9a6b82f45` ("drm/amdgpu: Implement DPC recovery") Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-05 13:02:31 -04:00
Yifan Zhang	714d9e4574	drm/amdgpu: init iommu after amdkfd device init This patch is to fix clinfo failure in Raven/Picasso: Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 2.2 AMD-APP (3364.0) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. Platform Extensions: cl_khr_icd cl_amd_event_callback Platform Name: AMD Accelerated Parallel Processing Number of devices: 0 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-05 13:02:20 -04:00
Guchun Chen	e17e27f9bd	drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume In current code, when a PCI error state pci_channel_io_normal is detectd, it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI driver will continue the execution of PCI resume callback report_resume by pci_walk_bridge, and the callback will go into amdgpu_pci_resume finally, where write lock is releasd unconditionally without acquiring such lock first. In this case, a deadlock will happen when other threads start to acquire the read lock. To fix this, add a member in amdgpu_device strucutre to cache pci_channel_state, and only continue the execution in amdgpu_pci_resume when it's pci_channel_io_frozen. Fixes: `c9a6b82f45` ("drm/amdgpu: Implement DPC recovery") Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-05 12:23:56 -04:00
Christian König	127aedf979	drm/amdgpu: print warning and taint kernel if lockup timeout is disabled Make sure that we notice this in error reports. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-05 12:23:42 -04:00
Christian König	c8365dbda0	drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8" This reverts commit `728e7e0cd6`. Further discussion reveals that this feature is severely broken and needs to be reverted ASAP. GPU reset can never be delayed by userspace even for debugging or otherwise we can run into in kernel deadlocks. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-05 12:22:36 -04:00
Yifan Zhang	286826d7d9	drm/amdgpu: init iommu after amdkfd device init This patch is to fix clinfo failure in Raven/Picasso: Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 2.2 AMD-APP (3364.0) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. Platform Extensions: cl_khr_icd cl_amd_event_callback Platform Name: AMD Accelerated Parallel Processing Number of devices: 0 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-05 12:22:29 -04:00
Lijo Lazar	1d617c029f	drm/amdgpu: During s0ix don't wait to signal GFXOFF In the rare event when GFX IP suspend coincides with a s0ix entry, don't schedule a delayed work, instead signal PMFW immediately to allow GFXOFF entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be signaled about GFXOFF status before amd-pmc module passes OS HINT to PMFW telling that everything is ready for a safe s0ix entry. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-10-05 10:55:07 -04:00
Lang Yu	b072ef1215	drm/amdkfd: fix a potential ttm->sg memory leak Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr, but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it! Fixes: `264fb4d332` ("drm/amdgpu: Add multi-GPU DMA mapping helpers") Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-05 10:53:45 -04:00
Alex Deucher	630e959f25	drm/amdgpu/gmc9: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:02 -04:00
Guo Zhengkui	8001ba85d0	drm/amdgpu: remove some repeated includings Remove two repeated includings in line 46 and 47. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:02 -04:00
Lijo Lazar	d04287d062	drm/amdgpu: During s0ix don't wait to signal GFXOFF In the rare event when GFX IP suspend coincides with a s0ix entry, don't schedule a delayed work, instead signal PMFW immediately to allow GFXOFF entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be signaled about GFXOFF status before amd-pmc module passes OS HINT to PMFW telling that everything is ready for a safe s0ix entry. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:02 -04:00
Alex Deucher	4b3a624c4c	drm/amdgpu: consolidate case statements IP_VERSION(11, 0, 13) does the exact same thing as IP_VERSION(11, 0, 12) so squash them together. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:02 -04:00
James Zhu	c60511493b	drm/amdgpu/jpeg: add jpeg2.6 start/end Add jpeg2.6 start/end with updated PCTL0_MMHUB_DEEPSLEEP_IB address. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.lilu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:02 -04:00
James Zhu	d4b0ee65de	drm/amdgpu/jpeg2: move jpeg2 shared macro to header file Move jpeg2 shared macro to header file Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.lilu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:02 -04:00
Lang Yu	546dc20fed	drm/amdkfd: fix a potential ttm->sg memory leak Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr, but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it! Fixes: `264fb4d332` ("drm/amdgpu: Add multi-GPU DMA mapping helpers") Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:02 -04:00
Alex Deucher	a79d3709c4	drm/amdgpu: add an option to override IP discovery table from a file If you set amdgpu.discovery=2 you can force the the driver to fetch the IP discovery table from a file rather than from the table shipped on the device. This is useful for debugging and for device bring up and emulation when the tables may be in flux. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:02 -04:00
Alex Deucher	5b983db8c3	drm/amdkfd: clean up parameters in kgd2kfd_probe We can get the pdev and asic type from the adev. No need to pass them explicitly. v2: squash in build fix for !CONFIG_HSA_AMD from Anson Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:02 -04:00
Alex Deucher	6d46d419af	drm/amdgpu: add support for SRIOV in IP discovery path Handle SRIOV requirements when adding IP blocks. v2: add comment about UVD/VCE support on vega20 SR-IOV Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	b05b9c591f	drm/amdgpu: clean up set IP function Split into several smaller per IP functions to make it easier to handle ordering issues for things like SR-IOV in a follow up patch. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	1d789535a0	drm/amdgpu: convert IP version array to include instances Allow us to query instances versions more cleanly. Instancing support is not consistent unfortunately. SDMA is a good example. Sienna cichlid has 4 total SDMA instances, each enumerated separately (HWIDs 42, 43, 68, 69). Arcturus has 8 total SDMA instances, but they are enumerated as multiple instances of the same HWIDs (4x HWID 42, 4x HWID 43). UMC is another example. On most chips there are multiple instances with the same HWID. This allows us to support both forms. v2: rebase v3: clarify instancing support Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	d0761fd24e	drm/amdgpu: set CHIP_IP_DISCOVERY as the asic type by default For new chips with no explicit entry in the PCI ID list. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	3ae695d691	drm/amdgpu: add new asic_type for IP discovery Add a new asic type for asics where we don't have an explicit entry in the PCI ID list. We don't need an asic type for these asics, other than something higher than the existing ones, so just use this for all new asics. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	aa9f8cc349	drm/amdgpu/ucode: add default behavior Default to PSP ucode loading unless the user specifies direct. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	f174161517	drm/amdgpu: get VCN harvest information from IP discovery table Use the table rather than asic specific harvest registers. v2: remove harvesting register checking Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	1b592d00b4	drm/amdgpu/vcn: remove manual instance setting Handled by IP discovery now. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	fe323f039d	drm/amdgpu/sdma: remove manual instance setting Handled by IP discovery now. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	5c3720be7d	drm/amdgpu: get VCN and SDMA instances from IP discovery table Rather than hardcoding it. We already have the number of VCN instances from a previous patch, so just update the VCN instances for chips with static tables. v2: squash in checks for SDMA3,4 (Guchun) v3: clarify VCN changes Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	5eceb20192	drm/amdgpu: add VCN1 hardware IP So we can store the VCN IP revision for each instance of VCN. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	75a07bcd1d	drm/amdgpu/soc15: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:01 -04:00
Alex Deucher	0b64a5a852	drm/amdgpu/vcn2.5: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	96b8dd4423	drm/amdgpu/amdgpu_vcn: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. v2: squash in fix for navy flounder and sienna cichlid Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	1fcc208cd7	drm/amdgpu/psp_v13.0: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	e47868ea15	drm/amdgpu/psp_v11.0: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	82d05736c4	drm/amdgpu/amdgpu_psp: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	9d0cb2c318	drm/amdgpu/gfx9.0: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	24be2d7004	drm/amdgpu/hdp4.0: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	43bf00f21e	drm/amdgpu/sdma4.0: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	f7f12b2582	drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support We are not going to support any new chips with the old non-DC code so make it the default. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	9878844094	drm/amdgpu: drive all vega asics from the IP discovery table Rather than hardcoding based on asic_type, use the IP discovery table to configure the driver. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	91e9db33be	drm/amdgpu/soc15: get rev_id in soc15_common_early_init for consistency with other SoCs. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00
Alex Deucher	d4c6e870bd	drm/amdgpu: add initial IP discovery support for vega based parts Hardcode the IP versions for asics without IP discovery tables and then enumerate the asics based on the IP versions. TODO: fix SR-IOV support v2: Squash in HDP fix for Renoir Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:23:00 -04:00

... 13 14 15 16 17 ...

11309 Commits