linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00

Author	SHA1	Message	Date
Jingwen Chen	b5840166dc	drm/amdgpu: SRIOV flr_work should take write_lock [Why] If flr_work takes read_lock, then other threads who takes read_lock can access hardware when host is doing vf flr. [How] flr_work should take write_lock to avoid this case. Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-08 15:12:58 -04:00
Luben Tuikov	c0838d3a93	drm/amdgpu: The I2C IP doesn't support 0 writes/reads The I2C IP doesn't support writes or reads of 0 bytes. In order for a START/STOP transaction to take place on the bus, the data written/read has to be at least one byte. That is, you cannot generate a write with 0 bytes, just to get the ACK from a device, just so you can probe that device if it is on the bus and so to discover all devices on the bus--you'll have to read at least one byte. Writes of 0 bytes generate no START/STOP on this I2C IP--the bus is not engaged at all. Set the I2C_AQ_NO_ZERO_LEN to the existing I2C quirk tables for Aldebaran, Arcturus, Navi10 and Sienna Cichlid, and add a quirk table to the I2C driver which drives the bus when the SMU doesn't--for instance on Vega20. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Lijo Lazar <Lijo.Lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-08 15:12:51 -04:00
Luben Tuikov	ae87df0775	drm/amd/pm: Add I2C quirk table to Aldebaran Add I2C quirk table to Aldebaran. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Lijo Lazar <Lijo.Lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-08 15:12:44 -04:00
YuBiao Wang	5af4438f1e	drm/amdgpu: Read clock counter via MMIO to reduce delay (v5) [Why] GPU timing counters are read via KIQ under sriov, which will introduce a delay. [How] It could be directly read by MMIO. v2: Add additional check to prevent carryover issue. v3: Only check for carryover for once to prevent performance issue. v4: Add comments of the rough frequency where carryover happens. v5: Remove mutex and gfxoff ctrl unused with current timing registers. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.co> Reviewed-by: Monk Liu <monk.liu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-08 15:12:36 -04:00
Eric Huang	51627f0380	drm/amdkfd: Only apply TLB flush optimization on ALdebaran It is based on reverting two patches back. drm/amdkfd: Make TLB flush conditional on mapping drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-08 15:12:26 -04:00
Nirmoy Das	88f7f88159	drm/amdgpu: separate out vm pasid assignment Use new helper function amdgpu_vm_set_pasid() to assign vm pasid value. This also ensures that we don't free a pasid from vm code as pasids are allocated somewhere else. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-08 15:12:20 -04:00
Nirmoy Das	dcb388eddb	drm/amdgpu: use xarray for storing pasid in vm Replace idr with xarray as we actually need hash functionality. Cleanup code related to vm pasid by adding helper function. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-08 15:12:07 -04:00
Jiri Kosina	36f5f9d37e	drm/amdgpu: Avoid printing of stack contents on firmware load error In case when psp_init_asd_microcode() fails to load ASD microcode file, psp_v12_0_init_microcode() tries to print the firmware filename that failed to load before bailing out. This is wrong because: - the firmware filename it would want it print is an incorrect one as psp_init_asd_microcode() and psp_v12_0_init_microcode() are loading different filenames - it tries to print fw_name, but that's not yet been initialized by that time, so it prints random stack contents, e.g. amdgpu 0000:04:00.0: Direct firmware load for amdgpu/renoir_asd.bin failed with error -2 amdgpu 0000:04:00.0: amdgpu: fail to initialize asd microcode amdgpu 0000:04:00.0: amdgpu: psp v12.0: Failed to load firmware "\xfeTO\x8e\xff\xff" Fix that by bailing out immediately, instead of priting the bogus error message. Reported-by: Vojtech Pavlik <vojtech@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 12:09:25 -04:00
Jiri Kosina	4ef87d8f10	drm/amdgpu: Fix resource leak on probe error path This reverts commit `4192f7b576`. It is not true (as stated in the reverted commit changelog) that we never unmap the BAR on failure; it actually does happen properly on amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> amdgpu_device_fini() error path. What's worse, this commit actually completely breaks resource freeing on probe failure (like e.g. failure to load microcode), as amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too early, leaving all the resources that'd normally be freed in amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading to all sorts of oopses when someone tries to, for example, access the sysfs and procfs resources which are still around while the driver is gone. Fixes: `4192f7b576` ("drm/amdgpu: unmap register bar on device init failure") Reported-by: Vojtech Pavlik <vojtech@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 12:09:19 -04:00
Lang Yu	6312333210	drm/amdgpu: show explicit name instead of id in psp_cmd_submit_buf Use amdgpu_ucode_name to show ucode name and psp_gfx_cmd_name to show psp_gfx_cmd name in psp_cmd_submit_buf. v2: adjust function name Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:25:33 -04:00
Lang Yu	dc739d18c6	drm/amdgpu: add function to show psp_gfx_cmd name via id Implement function psp_gfx_cmd_name to show cmd name via cmd id. v2: rename it to psp_gfx_cmd_name Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:25:33 -04:00
Lang Yu	aae435c6e8	drm/amdgpu: add function to show ucode name via id Implement function amdgpu_ucode_name to show ucode name via ucode id. v2: rename it to amdgpu_ucode_name Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:25:33 -04:00
Alex Deucher	0677e42256	drm/amdgpu: add license to umc_8_7_0_sh_mask.h Was missing. Add it. Fixes: `6b36fa6143` ("drm/amdgpu: add umc v8_7_0 IP headers") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:25:33 -04:00
Lukas Bulwahn	c11ffa54be	drm/amdgpu: rectify line endings in umc v8_7_0 IP headers Commit `6b36fa6143` ("drm/amdgpu: add umc v8_7_0 IP headers") adds the new file ./drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_7_0_sh_mask.h with DOS line endings, which is very uncommon for the kernel repository. Rectify the line endings in this file with dos2unix. Identified by a checkpatch evaluation on the whole kernel repository and spot-checking for really unexpected checkpatch rule violations. Reported-by: Dwaipayan Ray <dwaipayanray1@gmail.com> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:25:33 -04:00
Luben Tuikov	da98d99b0a	drm/amd/pm: Simplify managed I2C transfer of Aldebaran Simplify Aldebaran managed I2C transfer function to correctly play with the upper I2C layers. This gets it in line with Navi10, Acturus, and Sienna Cichlid. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:25:33 -04:00
Luben Tuikov	9de96f3f7e	drm/amdgpu: Correctly disable the I2C IP block On long transfers to the EEPROM device, i.e. write, it is observed that the driver aborts the transfer. The reason for this is that the driver isn't patient enough--the IC_STATUS register's contents is 0x27, which is MST_ACTIVITY \| TFE \| TFNF \| ACTIVITY. That is, while the transmission FIFO is empty, we, the I2C master device, are still driving the bus. Implement the correct procedure to disable the block, as described in the DesignWare I2C Databook, section 3.8.3 Disabling DW_apb_i2c on page 56. Now there are no premature aborts on long data transfers. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:25:33 -04:00
Luben Tuikov	e2e04041a2	drm/amdgpu: Use a single loop In smu_v11_0_i2c_transmit() use a single loop to transmit bytes, instead of two nested loops. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:25:33 -04:00
Luben Tuikov	1d9d2ca85b	drm/amdgpu: Fix koops when accessing RAS EEPROM Debugfs RAS EEPROM files are available when the ASIC supports RAS, and when the debugfs is enabled, an also when "ras_enable" module parameter is set to 0. However in this case, we get a kernel oops when accessing some of the "ras_..." controls in debugfs. The reason for this is that struct amdgpu_ras::adev is unset. This commit sets it, thus enabling access to those facilities. Note that this facilitates EEPROM access and not necessarily RAS features or functionality. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:25:33 -04:00
Alex Deucher	d456f3875a	drm/amdgpu: fix 64 bit divide in eeprom code pos is 64 bits. Fixes: `c65b0805e7` ("drm/amdgpu: RAS EEPROM table is now in debugfs") Cc: luben.tuikov@amd.com Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	c65b0805e7	drm/amdgpu: RAS EEPROM table is now in debugfs Add "ras_eeprom_size" file in debugfs, which reports the maximum size allocated to the RAS table in EEROM, as the number of bytes and the number of records it could store. For instance, $cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size 262144 bytes or 10921 records $_ Add "ras_eeprom_table" file in debugfs, which dumps the RAS table stored EEPROM, in a formatted way. For instance, $cat ras_eeprom_table Signature Version FirstOffs Size Checksum 0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA Index Offset ErrType Bank/CU TimeStamp Offs/Addr MemChl MCUMCID RetiredPage 0 0x00014 ue 0x00 0x00000000607608DC 0x000000000000 0x00 0x00 0x000000000000 1 0x0002C ue 0x00 0x00000000607608DC 0x000000001000 0x00 0x00 0x000000000001 2 0x00044 ue 0x00 0x00000000607608DC 0x000000002000 0x00 0x00 0x000000000002 3 0x0005C ue 0x00 0x00000000607608DC 0x000000003000 0x00 0x00 0x000000000003 4 0x00074 ue 0x00 0x00000000607608DC 0x000000004000 0x00 0x00 0x000000000004 5 0x0008C ue 0x00 0x00000000607608DC 0x000000005000 0x00 0x00 0x000000000005 6 0x000A4 ue 0x00 0x00000000607608DC 0x000000006000 0x00 0x00 0x000000000006 7 0x000BC ue 0x00 0x00000000607608DC 0x000000007000 0x00 0x00 0x000000000007 8 0x000D4 ue 0x00 0x00000000607608DD 0x000000008000 0x00 0x00 0x000000000008 $_ Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Xinhui Pan <xinhui.pan@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	63d4c081a5	drm/amdgpu: Optimize EEPROM RAS table I/O Split functionality between read and write, which simplifies the code and exposes areas of optimization and more or less complexity, and take advantage of that. Read and write the table in one go; use a separate stage to decode or encode the data, as opposed to on the fly, which keeps the I2C bus busy. Use a single read/write to read/write the table or at most two if the number of records we're reading/writing wraps around. Check the check-sum of a table in EEPROM on init. Update the checksum at the same time as when updating the table header signature, when the threshold was increased on boot. Take advantage of arithmetic modulo 256, that is, use a byte!, to greatly simplify checksum arithmetic. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	017dad64db	drm/amdgpu: Get rid of test function The code is now tested from userspace. Remove already macroed out test function. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	0686627b3f	drm/amdgpu: Some renames Qualify with "ras_". Use kernel's own--don't redefine your own. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	d7edde3dea	drm/amdgpu: Nerf buff buff --> buf. Essentially buffer abbreviates to buf, remove 1/2 of it, or just the iron part, as opposed to just the Er, Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	e4e6a58935	drm/amdgpu: Use explicit cardinality for clarity RAS_MAX_RECORD_NUM may mean the maximum record number, as in the maximum house number on your street, or it may mean the maximum number of records, as in the count of records, which is also a number. To make this distinction whether the number is ordinal (index) or cardinal (count), rename this macro to RAS_MAX_RECORD_COUNT. This makes it easy to understand what it refers to, especially when we compute quantities such as, how many records do we have left in the table, especially when there are so many other numbers, quantities and numerical macros around. Also rename the long, amdgpu_ras_eeprom_get_record_max_length() to the more succinct and clear, amdgpu_ras_eeprom_max_record_count(). When computing the threshold, which also deals with counts, i.e. "how many", use cardinal "max_eeprom_records_count", than the quantitative "max_eeprom_records_len". Simplify the logic here and there, as well. Cc: Guchun Chen <guchun.chen@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	803c6ebdd3	drm/amdgpu: Simplify RAS EEPROM checksum calculations Rename update_table_header() to write_table_header() as this function is actually writing it to EEPROM. Use kernel types; use u8 to carry around the checksum, in order to take advantage of arithmetic modulo 8-bits (256). Tidy up to 80 columns. When updating the checksum, just recalculate the whole thing. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	dce4400e65	drm/amdgpu: Fix amdgpu_ras_eeprom_init() No need to account for the 2 bytes of EEPROM address--this is now well abstracted away by the fixes the the lower layers. Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	cf696091d3	drm/amdgpu: Return result fix in RAS The low level EEPROM write method, doesn't return 1, but the number of bytes written. Thus do not compare to 1, instead, compare to greater than 0 for success. Other cleanup: if the lower layers returned -errno, then return that, as opposed to overwriting the error code with one-fits-all -EINVAL. For instance, some return -EAGAIN. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	36b1a00d2b	drm/amdgpu: Fix width of I2C address The I2C address is kept as a 16-bit quantity in the kernel. The I2C_TAR::I2C_TAR field is 10-bit wide. Fix the width of the I2C address for Vega20 from 8 bits to 16 bits to accommodate the full spectrum of I2C address space. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	ebe57d0c8e	drm/amd/pm: Simplify managed I2C transfer functions Now that we have an I2C quirk table for SMU-managed I2C controllers, the I2C core does the checks for us, so we don't need to do them, and so simplify the managed I2C transfer functions. Also, for Arcturus and Navi10, fix setting the command type from "cmd->CmdConfig" to "cmd->Cmd". The latter is what appears to be taking in the enumeration I2C_CMD_... as an integer, not a bit-flag. For Sienna, the "Cmd" field seems to have been eliminated, and command type and flags all live in the "CmdConfig" field--this is left untouched. Fix: Detect and add changing of direction bit-flag, as this is necessary for the SMU to detect the direction change in the 1-d array of data it gets. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:41 -04:00
Luben Tuikov	1673662761	drm/amd/pm: Extend the I2C quirk table Extend the I2C quirk table for SMU access controlled I2C adapters. Let the kernel I2C layer check that the messages all have the same address, and that their combined size doesn't exceed the maximum size of a SMU software I2C request. Suggested-by: Jean Delvare <jdelvare@suse.de> Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	16ef797737	drm/amdgpu: EEPROM: add explicit read and write Add explicit amdgpu_eeprom_read() and amdgpu_eeprom_write() for clarity. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	1fab841ff6	drm/amdgpu: RAS xfer to read/write Wrap amdgpu_ras_eeprom_xfer(..., bool write), into amdgpu_ras_eeprom_read() and amdgpu_ras_eeprom_write(), as that makes reading and understanding the code clearer. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	a43996573a	drm/amdgpu: Rename misspelled function Instead of fixing the spelling in amdgpu_ras_eeprom_process_recods(), rename it to, amdgpu_ras_eeprom_xfer(), to look similar to other I2C and protocol transfer (read/write) functions. Also to keep the column span to within reason by using a shorter name. Change the "num" function parameter from "int" to "const u32" since it is the number of items (records) to xfer, i.e. their count, which cannot be a negative number. Also swap the order of parameters, keeping the pointer to records and their number next to each other, while the direction now becomes the last parameter. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	c28aa44de8	drm/amdgpu: RAS: EEPROM --> RAS In amdgpu_ras_eeprom.c--the interface from RAS to EEPROM, rename macros from EEPROM to RAS, to indicate that the quantities and objects are RAS specific, not EEPROM. We can decrease the RAS table, or put it in different offset of EEPROM as needed in the future. Remove EEPROM_ADDRESS_SIZE macro definition, equal to 2, from the file and calculations, as that quantity is computed and added on the stack, in the lower layer, amdgpu_eeprom_xfer(). Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	f4322d80ad	drm/amdgpu: I2C class is HWMON Set the auto-discoverable class of I2C bus to HWMON. Remove SPD. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	edb63a5308	drm/amdgpu: Fix wrap-around bugs in RAS Fix the size of the EEPROM from 256000 bytes to 262144 bytes (256 KiB). Fix a couple or wrap around bugs. If a valid value/address is 0 <= addr < size, the inverse of this inequality (barring negative values which make no sense here) is addr >= size. Fix this in the RAS code. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	ccdfbfec9e	drm/amdgpu: RAS and FRU now use 19-bit I2C address Convert RAS and FRU code to use the 19-bit I2C memory address and remove all "slave_addr", as this is now absolved into the 19-bit address. Cc: Jean Delvare <jdelvare@suse.de> Cc: John Clements <john.clements@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	025a64a587	drm/amdgpu: I2C EEPROM full memory addressing * "eeprom_addr" is now 32-bit wide. * Remove "slave_addr" from the I2C EEPROM driver interface. The I2C EEPROM Device Type Identifier is fixed at 1010b, and the rest of the bits of the Device Address Byte/Device Select Code, are memory address bits, where the first three of those bits are the hardware selection bits. All this is now a 19-bit address and passed as "eeprom_addr". This abstracts the I2C bus for EEPROM devices for this I2C EEPROM driver. Now clients only pass the 19-bit EEPROM memory address, to the I2C EEPROM driver, as the 32-bit "eeprom_addr", from which they want to read from or write to. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	93ade343bb	drm/amdgpu: EEPROM respects I2C quirks Consult the i2c_adapter.quirks table for the maximum read/write data length per bus transaction. Do not exceed this transaction limit. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	746b584762	drm/amdgpu: Fixes to the AMDGPU EEPROM driver * When reading from the EEPROM device, there is no device limitation on the number of bytes read--they're simply sequenced out. Thus, read the whole data requested in one go. * When writing to the EEPROM device, there is a 256-byte page limit to write to before having to generate a STOP on the bus, as well as the address written to mustn't cross over the page boundary (it actually rolls over). Maximize the data written to per bus acquisition. * Return the number of bytes read/written, or -errno. * Add kernel doc. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Luben Tuikov	daaa75fd98	drm/amdgpu: Fix Vega20 I2C to be agnostic (v2) Teach Vega20 I2C to be agnostic. Allow addressing different devices while the master holds the bus. Set STOP as per the controller's specification. v2: Qualify generating ReSTART before the 1st byte of the message, when set by the caller, as those functions are separated, as caught by Andrey G. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Andrey Grodzovsky	35ed27032c	drm/amdgpu/pm: ADD I2C quirk adapter table To be used by kernel clients of the adapter. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Andrey Grodzovsky	14df56504f	drm/amd/pm: SMU I2C: Return number of messages processed Fix from number of processed bytes to number of processed I2C messages. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Andrey Grodzovsky	6a0a55a2eb	drm/amdgpu: Send STOP for the last byte of msg only Let's just ignore the I2C_M_STOP hint from upper layer for SMU I2C code as there is no clean mapping between single per I2C message STOP flag at the kernel I2C layer and the SMU, per each byte STOP flag. We will just by default set it at the end of the SMU I2C message. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Andrey Grodzovsky	965ec37c46	drm/amdgpu: Drop i > 0 restriction for issuing RESTART Drop i > 0 restriction for issuing RESTART. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Andrey Grodzovsky	6240da4dfc	dmr/amdgpu: Add RESTART handling also to smu_v11_0_i2c (VG20) Also generilize the code to accept and translate to HW bits any I2C relvent flags both for read and write. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:24:40 -04:00
Andrey Grodzovsky	2485f8cfff	drm/amdgpu: Remember to wait 10ms for write buffer flush v2 EEPROM spec requests this. v2: Only to be done for write data transactions. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Alex Deucher	b36d8d6b77	drm/amdgpu: only set restart on first cmd of the smu i2c transaction Not sure how the firmware interprets these. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Aaron Rice	73a5784a5b	drm/amdgpu: rework smu11 i2c for generic operation Handle things besides EEPROMS. Signed-off-by: Aaron Rice <wolf@lovehindpa.ws> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Alex Deucher	3e2eae8db2	drm/amdgpu: add I2C_CLASS_HWMON to SMU i2c buses Not sure that this really matters that much, but these could have various other hwmon chips on them. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Alex Deucher	39ed82d1d9	drm/amdgpu: i2c subsystem uses 7 bit addresses Convert from 8 bit to 7 bit. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Alex Deucher	25e5c09f2b	drm/amdgpu/ras: switch fru eeprom handling to use generic helper (v2) Use the new helper rather than doing i2c transfers directly. v2: fix typo Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Alex Deucher	24f55c0559	drm/amdgpu/ras: switch ras eeprom handling to use generic helper Use the new helper rather than doing i2c transfers directly. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Alex Deucher	00e3a289d9	drm/amdgpu: add new helper for handling EEPROM i2c transfers Encapsulates the i2c protocol handling so other parts of the driver can just tell it the offset and size of data to write. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Alex Deucher	af01340bc4	drm/amdgpu/pm: add smu i2c implementation for navi1x (v5) And handle more than just EEPROMs. v2: fix restart handling between transactions. v3: handle 7 to 8 bit addr conversion v4: Fix &req --> req. (Luben T) v5: squash in i2c channel fix Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Alex Deucher	f400b6cec8	drm/amdgpu/pm: rework i2c xfers on arcturus (v5) Make it generic so we can support more than just EEPROMs. v2: fix restart handling between transactions. v3: handle 7 to 8 bit addr conversion v4: Fix &req --> req. (Luben T) v5: squash in i2c channel fix Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Alex Deucher	5125c96a9d	drm/amdgpu/pm: rework i2c xfers on sienna cichlid (v4) Make it generic so we can support more than just EEPROMs. v2: fix restart handling between transactions. v3: handle 7 to 8 bit addr conversion v4: Fix &req --> req. (Luben T) Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Alex Deucher	6963d6c176	drm/amdgpu: add a mutex for the smu11 i2c bus (v2) So we lock software as well as hardware access to the bus. v2: fix mutex handling. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>	2021-07-01 00:24:39 -04:00
Mukul Joshi	93c5bcd4ea	drm/amdgpu: Conditionally reset SDMA RAS error counts Reset SDMA RAS error counts during init only if persistent EDC harvesting is not supported. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	7981ec6549	drm/amdkfd: Maintain svm_bo reference in page->zone_device_data Each zone-device page holds a reference to the SVM BO that manages its backing storage. This is necessary to correctly hold on to the BO in case zone_device pages are shared with a child-process. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	3bf8282c6b	drm/amdkfd: add invalid pages debug at vram migration This is for debug purposes only. It conditionally generates partial migrations to test mixed CPU/GPU memory domain pages in a prange easily. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	6ffecc946f	drm/amdkfd: skip migration for pages already in VRAM Migration skipped for pages that are already in VRAM domain. These could be the result of previous partial migrations to SYS RAM, and prefetch back to VRAM. Ex. Coherent pages in VRAM that were not written/invalidated after a copy-on-write. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	1ade5f84cc	drm/amdkfd: skip invalid pages during migrations Invalid pages can be the result of pages that have been migrated already due to copy-on-write procedure or pages that were never migrated to VRAM in first place. This is not an issue anymore, as pranges now support mixed memory domains (CPU/GPU). Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	1d5dbfe6c0	drm/amdkfd: classify and map mixed svm range pages in GPU [Why] svm ranges can have mixed pages from device or system memory. A good example is, after a prange has been allocated in VRAM and a copy-on-write is triggered by a fork. This invalidates some pages inside the prange. Endding up in mixed pages. [How] By classifying each page inside a prange, based on its type. Device or system memory, during dma mapping call. If page corresponds to VRAM domain, a flag is set to its dma_addr entry for each GPU. Then, at the GPU page table mapping. All group of contiguous pages within the same type are mapped with their proper pte flags. v2: Instead of using ttm_res to calculate vram pfns in the svm_range. It is now done by setting the vram real physical address into drm_addr array. This makes more flexible VRAM management, plus removes the need to have a BO reference in the svm_range. v3: Remove mapping member from svm_range Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	278a708758	drm/amdkfd: use hmm range fault to get both domain pfns Now that prange could have mixed domains (VRAM or SYSRAM), actual_loc nor svm_bo can not be used to check its current domain and eventually get its pfns to map them in GPU. Instead, pfns from both domains, are now obtained from hmm_range_fault through amdgpu_hmm_range_get_pages call. This is done everytime a GPU map occur. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	1fc160cfe1	drm/amdgpu: get owner ref in validate and map Get the proper owner reference for amdgpu_hmm_range_get_pages function. This is useful for partial migrations. To avoid migrating back to system memory, VRAM pages, that are accessible by all devices in the same memory domain. Ex. multiple devices in the same hive. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	a010d98a78	drm/amdkfd: set owner ref to svm range prefault svm_range_prefault is called right before migrations to VRAM, to make sure pages are resident in system memory before the migration. With partial migrations, this reference is used by hmm range get pages to avoid migrating pages that are already in the same VRAM domain. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	8c21fc49a8	drm/amdkfd: add owner ref param to get hmm pages The parameter is used in the dev_private_owner to decide if device pages in the range require to be migrated back to system memory, based if they are or not in the same memory domain. In this case, this reference could come from the same memory domain with devices connected to the same hive. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	3a61dae854	drm/amdkfd: device pgmap owner at the svm migrate init GPUs in the same XGMI hive have direct access to all members'VRAM. When mapping memory to a GPU, we don't need hmm_range_fault to fault device-private pages in the same hive back to the host. Identifying the page owner as the hive, rather than the individual GPU, accomplishes this. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:41 -04:00
Alex Sierra	9e4a91cd9e	drm/amdkfd: inc counter on child ranges with xnack off During GPU page table invalidation with xnack off, new ranges split may occur concurrently in the same prange. Creating a new child per split. Each child should also increment its invalid counter, to assure GPU page table updates in these ranges. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:40 -04:00
Nicholas Kazlauskas	1d40ef902d	drm/amd/display: Extend DMUB diagnostic logging to DCN3.1 [Why & How] Extend existing support for DCN2.1 DMUB diagnostic logging to DCN3.1 so we can collect useful information if the DMUB hangs. Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:40 -04:00
Joseph Greathouse	aa61581126	drm/amdgpu: Update NV SIMD-per-CU to 2 Navi series GPUs have 2 SIMDs per CU (and then 2 CUs per WGP). The NV enum headers incorrectly listed this as 4, which later meant we were incorrectly reporting the number of SIMDs in the HSA topology. This could cause problems down the line for user-space applications that want to launch a fixed amount of work to each SIMD. Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-07-01 00:05:18 -04:00
Alex Deucher	06ac9b6c73	drm/amdgpu: add new dimgrey cavefish DID Add new PCI device id. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-07-01 00:05:18 -04:00
Shyam Sundar S K	0e2125227e	drm/amd/pm: skip PrepareMp1ForUnload message in s0ix The documentation around PrepareMp1ForUnload message says that anything sent to SMU after this command would be stalled as the PMFW would not be in a state to take further job requests. Technically this is right in case of S3 scenario. But, this might not be the case during s0ix as the PMC driver would be the last to send the SMU on the OS_HINT. If SMU gets a PrepareMp1ForUnload message before the OS_HINT, this would stall the entire S0ix process. Results show that, this message to SMU is not required during S0ix and hence skip it. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:18 -04:00
Huang Rui	9f6a785720	drm/amdgpu: move apu flags initialization to the start of device init In some asics, we need to adjust the behavior according to the apu flags at very early stage. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:18 -04:00
Reka Norman	25f178bbd0	drm/amd/display: Respect CONFIG_FRAME_WARN=0 in dml Makefile Setting CONFIG_FRAME_WARN=0 should disable 'stack frame larger than' warnings. This is useful for example in KASAN builds. Make the dml Makefile respect this config. Fixes the following build warnings with CONFIG_KASAN=y and CONFIG_FRAME_WARN=0: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3642:6: warning: stack frame size of 2216 bytes in function 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than=] drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3957:6: warning: stack frame size of 2568 bytes in function 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than=] Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Reka Norman <rekanorman@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:18 -04:00
Michal Suchanek	c339a80d3a	drm/amdgpu/dc: Really fix DCN3.1 Makefile for PPC64 Also copy over the part that makes old gcc handling cross-platform. Fixes: `df7a1658f2` ("drm/amdgpu/dc: fix DCN3.1 Makefile for PPC64") Fixes: `926d6972ef` ("drm/amd/display: Add DCN3.1 blocks to the DC Makefile") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:18 -04:00
Oak Zeng	8dbe43e99f	drm/amdgpu: Set ttm caching flags during bo allocation The ttm caching flags (ttm_cached, ttm_write_combined etc) are used to determine a buffer object's mapping attributes in both CPU page table and GPU page table (when that buffer is also accessed by GPU). Currently the ttm caching flags are set in function amdgpu_ttm_io_mem_reserve which is called during DRM_AMDGPU_GEM_MMAP ioctl. This has a problem since the GPU mapping of the buffer object (ioctl DRM_AMDGPU_GEM_VA) can happen earlier than the mmap time, thus the GPU page table update code can't pick up the right ttm caching flags to decide the right GPU page table attributes. This patch moves the ttm caching flags setting to function amdgpu_vram_mgr_new - this function is called during the first step of a buffer object create (eg, DRM_AMDGPU_GEM_CREATE) so the later both CPU and GPU mapping function calls will pick up this flag for CPU/GPU page table set up. v2: rebase (Alex) Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Suggested-by: Christian Koenig <Christian.Koenig@amd.com> Reviewed-by: Christian Koenig <Christian.Koenig@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Tested-by: Po Huang <Po.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:05:12 -04:00
Guchun Chen	b66596f626	drm/amd/display: fix null pointer access in gpu reset During GPU reset, when receiving a DMCUB OUTBUX0 interrupt, DAL code will set it to be OUTBOX interrupt and sets hw interrupt. However, OUTBOX interrupt is not registered yet, so a NULL pointer access will be executed. Call Trace: dal_irq_service_set+0x30/0x90 [amdgpu] dc_interrupt_set+0x24/0x30 [amdgpu] amdgpu_dm_set_dmub_outbox_irq_state+0x22/0x30 [amdgpu] amdgpu_irq_update+0x77/0xa0 [amdgpu] amdgpu_irq_gpu_reset_resume_helper+0x67/0xa0 [amdgpu] amdgpu_do_asic_reset+0x219/0x260 [amdgpu] amdgpu_device_gpu_recover.cold+0x8c5/0xb64 [amdgpu] amdgpu_debugfs_gpu_recover_show+0x2c/0x60 [amdgpu] seq_read_iter+0xc2/0x450 ? do_anonymous_page+0x22c/0x3b0 seq_read+0xf9/0x140 full_proxy_read+0x5c/0x90 vfs_read+0xaa/0x190 ksys_read+0x67/0xe0 __x64_sys_read+0x1a/0x20 Fixes: `effbf6ca7e` ("drm/amdgpu/display: remove an old DCN3 guard") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-01 00:03:13 -04:00
Guchun Chen	e38ca7e422	drm/amd/display: fix incorrrect valid irq check valid DAL irq should be < DAL_IRQ_SOURCES_NUMBER. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-07-01 00:02:41 -04:00
Aaron Liu	e2329e74a6	drm/amdgpu: enable sdma0 tmz for Raven/Renoir(V2) Without driver loaded, SDMA0_UTCL1_PAGE.TMZ_ENABLE is set to 1 by default for all asic. On Raven/Renoir, the sdma goldsetting changes SDMA0_UTCL1_PAGE.TMZ_ENABLE to 0. This patch restores SDMA0_UTCL1_PAGE.TMZ_ENABLE to 1. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-07-01 00:02:22 -04:00
Chengming Gui	a2f55040cf	drm/amd/amdgpu: enable gpu recovery for beige_goby Enable gpu recovery for beige_goby. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:56 -04:00
Darren Powell	91161b06be	amdgpu/pm: remove code duplication in show_power_cap calls v3: updated patch to apply to latest code v2: reorder to check pointers before calling pm_runtime_* functions created generic function and call with enum from * amdgpu_hwmon_show_power_cap_max * amdgpu_hwmon_show_power_cap * amdgpu_hwmon_show_power_cap_default === Test === AMDGPU_PCI_ADDR=`lspci -nn \| grep "VGA\\|Display" \| cut -d " " -f 1` AMDGPU_HWMON=`ls -la /sys/class/hwmon \| grep $AMDGPU_PCI_ADDR \| cut -d " " -f 10` HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON} cp pp_show_power_cap.txt{,.old} lspci -nn \| grep "VGA\\|Display" > pp_show_power_cap.test.log FILES=" power1_cap power1_cap_max power1_cap_default " for f in $FILES do echo $f = `cat $HWMON_DIR/$f` >> pp_show_power_cap.test.log done Signed-off-by: Darren Powell <darren.powell@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:56 -04:00
Alex Deucher	ed50995514	drm/amdgpu/display: drop unused variable Remove unused variable. Fixes: `e7d9560aea` ("Revert "drm/amd/display: Fix overlay validation by considering cursors"") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:23 -04:00
Rodrigo Siqueira	e7d9560aea	Revert "drm/amd/display: Fix overlay validation by considering cursors" This reverts commit `33f409e60e`. The patch that we are reverting here was originally applied because it fixes multiple IGT issues and flickering in Android. However, after a discussion with Sean Paul and Mark, it looks like that this patch might cause problems on ChromeOS. For this reason, we decided to revert this patch. Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Cc: Harry Wentland <Harry.Wentland@amd.com> Cc: Hersen Wu <hersenxs.wu@amd.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: Mark Yacoub <markyacoub@chromium.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Reviewed-by: Sean Paul <seanpaul@chromium.org> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-06-30 00:18:23 -04:00
Veerabadhran Gopalakrishnan	b3a24461f9	amdgpu/nv.c - Added codec query for Beige Goby Added the Beige Goby capabilities in codec query. v2: fix build error and indent (James) Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:23 -04:00
Aaron Liu	c8af9390e5	drm/amdgpu: enable tmz on yellow carp The tmz functions are verified on yellow carp. So enable it by default. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:23 -04:00
Philip Yang	d4ebc20070	drm/amdkfd: implement counters for vm fault and migration Add helper function to get process device data structure from adev to update counters. Update vm faults, page_in, page_out counters will no be executed in parallel, use WRITE_ONCE to avoid any form of compiler optimizations. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:23 -04:00
Philip Yang	751580b3ff	drm/amdkfd: add sysfs counters for vm fault and migration This is part of SVM profiling API, export sysfs counters for per-process, per-GPU vm retry fault, pages migrated in and out of GPU vram. counters will not be updated in parallel in GPU retry fault handler and migration to vram/ram path, use READ_ONCE to avoid compiler optimization. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:23 -04:00
Philip Yang	dcdb4d904b	drm/amdkfd: fix sysfs kobj leak 3 cases of kobj leak, which causes memory leak: kobj_type must have release() method to free memory from release callback. Don't need NULL default_attrs to init kobj. sysfs files created under kobj_status should be removed with kobj_status as parent kobject. Remove queue sysfs files when releasing queue from process MMU notifier release callback. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:23 -04:00
Philip Yang	75ae84c89b	drm/amdkfd: add helper function for kfd sysfs create No functionality change. Modify kfd_sysfs_create_file to use kobject as parameter, so it becomes common helper function to remove duplicate code and will simplify new kfd sysfs file create in future. Move pr_warn to helper function if sysfs file create failed. Set helper function as void return because caller doesn't use the helper function return value. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:23 -04:00
Evan Quan	ff4b601a05	drm/amdgpu: update HDP LS settings Avoid unnecessary register programming on feature disablement. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:23 -04:00
Evan Quan	3e7fbfb40f	drm/amdgpu: update GFX MGCG settings Update GFX MGCG related settings. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:22 -04:00
Evan Quan	754e9883d4	drm/amdgpu: correct clock gating settings on feature unsupported Clock gating setting is still performed even when the corresponding CG feature is not supported. And the tricky part is disablement is actually performed no matter for enablement or disablement request. That seems not logically right. Considering HW should already properly take care of the CG state, we will just skip the corresponding clock gating setting when the feature is not supported. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:18:22 -04:00
Evan Quan	adcf949e66	drm/amdgpu: fix the hang caused by PCIe link width switch SMU had set all the necessary fields for a link width switch but the width switch wasn't occurring because the link was idle in the L1 state. Setting LC_L1_RECONFIG_EN=0x1 will allow width switches to also be initiated while in L1 instead of waiting until the link is back in L0. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-06-30 00:18:14 -04:00
Evan Quan	5a5da8ae95	drm/amdgpu: fix NAK-G generation during PCI-e link width switch A lot of NAK-G being generated when link widht switching is happening. WA for this issue is to program the SPC to 4 symbols per clock during bootup when the native PCIE width is x4. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-06-30 00:17:56 -04:00
Evan Quan	9c26ddb1c5	drm/amdgpu: fix Navi1x tcp power gating hang when issuing lightweight invalidaiton Fix TCP hang when a lightweight invalidation happens on Navi1x. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:17:47 -04:00
Evan Quan	0dbc2c81a1	drm/amdgpu: correct tcp harvest setting Add missing settings for SQC bits. And correct some confusing logics around active wgp bitmap calculation. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-30 00:16:55 -04:00
Chengzhe Liu	dafff0476d	drm/amdgpu: Power down VCN and JPEG before disabling SMU features When unloading driver, if VCN is powered on, sending message DisableAllSmuFeatures to SMU will cause SMU hang. We need to power down VCN and JPEG before clean up SMU. Signed-off-by: Chengzhe Liu <ChengZhe.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-29 23:41:23 -04:00
Zhan Liu	a51482458d	drm/amd/display: Enabling eDP no power sequencing with DAL feature mask [Why] Sometimes, DP receiver chip power-controlled externally by an Embedded Controller could be treated and used as eDP, if it drives mobile display. In this case, we shouldn't be doing power-sequencing, hence we can skip waiting for T7-ready and T9-ready." [How] Added a feature mask to enable eDP no power sequencing feature. To enable this, set 0x10 flag in amdgpu.dcfeaturemask on Linux command line. Signed-off-by: Zhan Liu <zhan.liu@amd.com> Reviewed-by: Nikola Cornij <Nikola.Cornij@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-29 23:41:22 -04:00
Alex Deucher	8fe44c080a	drm/amdgpu/display: fold DRM_AMD_DC_DCN3_1 into DRM_AMD_DC_DCN No need for a separate flag now that DCN3.1 is not in bring up. Fold into DRM_AMD_DC_DCN like previous DCN IPs. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-22 16:51:45 -04:00
Aric Cyr	a7268cf9a4	drm/amd/display: 3.2.141 Signed-off-by: Aric Cyr <aric.cyr@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-22 16:51:45 -04:00
Anthony Koo	021eaef8ae	drm/amd/display: [FW Promotion] Release 0.0.71 - Introduce CMD for EDID CEA block parsing - Add SCR5 definition for reporting eDP power sequencer status Signed-off-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-22 16:51:45 -04:00
Josip Pavic	7335d95659	drm/amd/display: do not compare integers of different widths [Why & How] Increase width of some variables to avoid comparing integers of different widths Signed-off-by: Josip Pavic <Josip.Pavic@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-22 16:51:44 -04:00
Stylon Wang	715bfff397	drm/amd/display: Revert "Guard ASSR with internal display flag" This reverts commit `9127daa0a8`. [Why] 1. Previous patch regresses on some embedded panels. 2. Project coreboot doesn't support passing of internal display flag. [How] This reverts "Guard ASSR with internal display flag" commit. Fixes: `9127daa0a8` ("drm/amd/display: Guard ASSR with internal display flag") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1620 Signed-off-by: Stylon Wang <stylon.wang@amd.com> Reviewed-by: Wesley Chalmers <Wesley.Chalmers@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-06-22 16:45:10 -04:00
Logush Oliver	eeb90e26ed	drm/amd/display: Fix edp_bootup_bl_level initialization issue [why] Updating the file to fix the missing line Signed-off-by: Logush Oliver <ollogush@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-21 17:45:15 -04:00
Charlene Liu	452c76dfd2	drm/amd/display: get refclk from MICROSECOND_TIME_BASE_DIV HW register [why] recent VBIOS dce_infotable reference clock change caused a I2c regression. instead of relying on vbios, let's get it from HW directly. Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Reviewed-by: Chris Park <Chris.Park@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-21 17:45:14 -04:00
Roman Li	1a365683d6	drm/amd/display: Delay PSR entry [Why] After panel power up, if PSR entry attempted too early, PSR state may get stuck in transition. This could happen if the panel is not ready to respond to the SDP PSR entry message. In this case dmub f/w is unable to abort PSR entry since abortion is not permitted after the SDP has been sent. [How] Skip 5 pageflips before PSR enable. Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-21 17:45:14 -04:00
Aurabindo Pillai	9253e11503	drm/amd/display: get socBB from VBIOS for dcn302 and dcn303 [why] Some SOC BB paramters may vary per SKU, and it does not make sense for driver to hardcode these values. This change was added for dcn30 and dcn301, but not for dcn302 and dcn303 [how] Parse the values from VBIOS if available, and use them if valid Fixes: `93669c8e48` ("drm/amd/display: get socBB from VBIOS") Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-21 17:45:14 -04:00
Wesley Chalmers	d8ddeb155c	drm/amd/display: Fix incorrect variable name [WHY] extended_end_address can only be calculated from the extended_address and extended_size Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com> Reviewed-by: Ashley Thomas <Ashley.Thomas2@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-21 17:45:14 -04:00
Martin Tsai	068312559d	drm/amd/display: Clear lane settings after LTTPRs have been trained [Why] The voltage swing has to start from the minimum level when transmit TPS1 over Main-Link in clock recovery sequence. The lane settings from current design will inherit the existing VS/PE values that could be adjusted by Repeater X, and to use the adjusted voltage swing level in Repeater X-1 or DPRX could violate DP specs. [How] To reset VS from lane settings after LTTPRs have been trained to meet the requirement. Signed-off-by: Martin Tsai <martin.tsai@amd.com> Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-21 17:45:14 -04:00
Nikola Cornij	5d9e7fe8ef	drm/amd/display: Clamp VStartup value at DML calculations time [why] Some timings with a large VBlank cause the value to overflow the register related, while also producing other wrong values in DML output. [how] Clamp VStartup at the DCN3.1 maximum value Signed-off-by: Nikola Cornij <nikola.cornij@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-21 17:45:14 -04:00
Aric Cyr	d9b20b45ec	drm/amd/display: Multiplane cursor position incorrect when plane rotated [Why] When video plane is rotate the cursor position is incorrect and not matching the desktop location. [How] When a plane is rotated 90 or 270 degrees, the src_rect.width and height should be swapped when determining the scaling factor compared to the dst_rect. Signed-off-by: Aric Cyr <aric.cyr@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-21 17:45:14 -04:00
Yifan Zhang	962f2f1ae2	Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue." This reverts commit `631003101c`. Reason for revert: side effect of enlarging CP_MEC_DOORBELL_RANGE may cause some APUs fail to enter gfxoff in certain user cases. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-21 17:44:16 -04:00
Yifan Zhang	a334bb6979	Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell." This reverts commit `1ba7b24ba6`. Reason for revert: Side effect of enlarging CP_MEC_DOORBELL_RANGE may cause some APUs fail to enter gfxoff in certain user cases. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-21 17:43:56 -04:00
Michel Dänzer	24981fa336	drm/amdgpu: Call drm_framebuffer_init last for framebuffer init Once drm_framebuffer_init has returned 0, the framebuffer is hooked up to the reference counting machinery and can no longer be destroyed with a simple kfree. Therefore, it must be called last. If drm_framebuffer_init returns 0 but its caller then returns non-0, there will likely be memory corruption fireworks down the road. The following lead me to this fix: [ 12.891228] kernel BUG at lib/list_debug.c:25! [...] [ 12.891263] RIP: 0010:__list_add_valid+0x4b/0x70 [...] [ 12.891324] Call Trace: [ 12.891330] drm_framebuffer_init+0xb5/0x100 [drm] [ 12.891378] amdgpu_display_gem_fb_verify_and_init+0x47/0x120 [amdgpu] [ 12.891592] ? amdgpu_display_user_framebuffer_create+0x10d/0x1f0 [amdgpu] [ 12.891794] amdgpu_display_user_framebuffer_create+0x126/0x1f0 [amdgpu] [ 12.891995] drm_internal_framebuffer_create+0x378/0x3f0 [drm] [ 12.892036] ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm] [ 12.892075] drm_mode_addfb2+0x34/0xd0 [drm] [ 12.892115] ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm] [ 12.892153] drm_ioctl_kernel+0xe2/0x150 [drm] [ 12.892193] drm_ioctl+0x3da/0x460 [drm] [ 12.892232] ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm] [ 12.892274] amdgpu_drm_ioctl+0x43/0x80 [amdgpu] [ 12.892475] __se_sys_ioctl+0x72/0xc0 [ 12.892483] do_syscall_64+0x33/0x40 [ 12.892491] entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: `f258907fdd` "drm/amdgpu: Verify bo size can fit framebuffer size on init." Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:14:44 -04:00
Wan Jiabing	d9db759652	drm/display: Fix duplicated argument Fix coccicheck warning: ./drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c: 55:12-42: duplicated argument to && or \|\| Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:14:41 -04:00
Shaokun Zhang	dc22356c8f	drm/amd/display: Remove the repeated dpp1_full_bypass declaration Function 'dpp1_full_bypass' is declared twice, so remove the repeated declaration and unnessary blank line. Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:14:36 -04:00
Gustavo A. R. Silva	bb82ea3b04	drm/amd/display: Fix fall-through warning for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix the following warning by replacing a /* fall through / comment with the new pseudo-keyword macro fallthrough: rivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.c:672:4: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough] case AUX_TRANSACTION_REPLY_I2C_OVER_AUX_DEFER: ^ Notice that Clang doesn't recognize / fall through */ comments as implicit fall-through markings, so in order to globally enable -Wimplicit-fallthrough for Clang, these comments need to be replaced with fallthrough; in the whole codebase. Link: https://github.com/KSPP/linux/issues/115 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:14:33 -04:00
Pu Lehui	23549470ea	drm/amd/display: remove unused variable 'dc' GCC reports the following warning with W=1: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_psr.c:70:13: warning: variable ‘dc’ set but not used [-Wunused-but-set-variable] 70 \| struct dc *dc = NULL; \| ^~ This variable is not used in function, this commit remove it to fix the warning. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:14:30 -04:00
Pu Lehui	85019b19d4	drm/amd/display: Fix gcc unused variable warning GCC reports the following warning with W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:3635:17: warning: variable ‘status’ set but not used [-Wunused-but-set-variable] 3635 \| enum dc_status status = DC_ERROR_UNEXPECTED; \| ^~~~~~ The variable should be used for error check, let's fix it. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:14:19 -04:00
xinhui pan	56f221b638	drm/amdkfd: Walk through list with dqm lock hold To avoid any list corruption. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:14:13 -04:00
Bokun Zhang	376002f4b0	drm/amd/amdgpu: Use IP discovery data to determine VCN enablement instead of MMSCH In the past, we use MMSCH to determine whether a VCN is enabled or not. This is not reliable since after a FLR, MMSCH may report junk data. It is better to use IP discovery data. Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:14:01 -04:00
Yifan Zhang	942ab769c5	drm/amdgpu: remove unused parameter in amdgpu_gart_bind Pagelist is no long used in amdgpu_gart_bind. Remove it. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:12:47 -04:00
Yifan Zha	f1802aa706	drm/amd/pm: Disable SMU messages in navi10 sriov [Why] sriov vf send unsupported SMU message lead to fail. [How] disable related messages in sriov. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Acked-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:12:41 -04:00
Stanley.Yang	513befa634	drm/amdgpu: message smu to update hbm bad page number Use SMU to update the bad pages rather than directly accessing the EEPROM from the driver. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:11:56 -04:00
Ashish Pawar	7c5f3d7d61	drm/amdgpu: PWRBRK sequence changes for Aldebaran Modify power brake enablement sequence on Aldebaran Signed-off-by: Ashish Pawar <ashish.pawar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:11:49 -04:00
Stanley.Yang	6ec598cc9d	drm/amdgpu: fix bad address translation for sienna_cichlid Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:11:40 -04:00
Darren Powell	09b6744cc6	amdgpu/pm: replaced snprintf usage in amdgpu_pm.c with sysfs_emit replaced snprintf usage in amdgpu_pm.c with sysfs_emit fixed warning on comparing int with uint32_t in amdgpu_get_pp_num_states() == Test == AMDGPU_PCI_ADDR=`lspci -nn \| grep "VGA\\|Display" \| cut -d " " -f 1` AMDGPU_HWMON=`ls -la /sys/class/hwmon \| grep $AMDGPU_PCI_ADDR \| cut -d " " -f 10` HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON} lspci -nn \| grep "VGA\\|Display" > scnprintf.test.log FILES="pp_num_states pp_od_clk_voltage pp_features pp_dpm_sclk pp_dpm_mclk pp_dpm_socclk pp_dpm_fclk pp_dpm_vclk pp_dpm_dclk pp_dpm_dcefclk pp_power_profile_mode " for f in $FILES do echo === $f === >> scnprintf.test.log cat $HWMON_DIR/device/$f >> scnprintf.test.log done Signed-off-by: Darren Powell <darren.powell@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:11:28 -04:00
Eric Huang	c9cfbf7f44	drm/amdkfd: Set iolink non-coherent in topology Fix non-coherent bit of iolink properties flag which always is 0. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:11:17 -04:00
Rodrigo Siqueira	5fd953a3f6	drm/amd/display: Add Freesync video documentation Recently, we added support for an experimental feature named Freesync video; for more details on that, refer to: commit `6f59f229f8` ("drm/amd/display: Skip modeset for front porch change") commit `d10cd527f5` ("drm/amd/display: Add freesync video modes based on preferred modes") commit `0eb1af2e82` ("drm/amd/display: Add module parameter for freesync video mode") Nevertheless, we did not document it in detail in our driver. This commit introduces a kernel-doc and expands the module parameter description. Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: Harry Wentland <Harry.Wentland@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Reviewed by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:06:43 -04:00
Alex Deucher	26c0504ad3	drm/amdgpu/vcn3: drop extraneous Beige Goby hunk Probably a rebase leftover. This doesn't apply to SR-IOV, and the non-SR-IOV code below it already handles this properly. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:01:35 -04:00
Aurabindo Pillai	ceaf9f5719	drm/amd/display: Increase stutter watermark for dcn302 and dcn303 [Why] Current watermarks end up programming lowers watermarks which results in screen flickering and underflow for certain modes like 1440p. [How] Add 11us to stutter exit & stutter enter plus exit watermark. Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Jun Lei <jun.lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:01:35 -04:00
Stanley.Yang	e11d5e0d68	drm/amdgpu: add vega20 to ras quirk list Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:01:35 -04:00
xinhui pan	84408d5f38	drm/amdgpu: Set TTM_PAGE_FLAG_SG earlier for userprt BOs Because TTM do page counting on userptr BOs which is actually not needed. To avoid that, lets set TTM_PAGE_FLAG_SG after tt_create and before tt_populate. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-18 17:01:35 -04:00
Wan Jiabing	a4b0b97aac	drm: display: Fix duplicate field initialization in dcn31 Fix the following coccicheck warning: drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c:917:56-57: pstate_enabled: first occurrence line 935, second occurrence line 937 Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:42 -04:00
Amber Lin	a7b2451d31	drm/amdkfd: Fix circular lock in nocpsch path Calling free_mqd inside of destroy_queue_nocpsch_locked can cause a circular lock. destroy_queue_nocpsch_locked is called under a DQM lock, which is taken in MMU notifiers, potentially in FS reclaim context. Taking another lock, which is BO reservation lock from free_mqd, while causing an FS reclaim inside the DQM lock creates a problematic circular lock dependency. Therefore move free_mqd out of destroy_queue_nocpsch_locked and call it after unlocking DQM. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:42 -04:00
Felix Kuehling	d760895d55	drm/amdgpu: Use spinlock_irqsave for pasid_lock This should fix a kernel LOCKDEP warning on Vega10: [ 149.416604] ================================ [ 149.420877] WARNING: inconsistent lock state [ 149.425152] 5.11.0-kfd-fkuehlin #517 Not tainted [ 149.429770] -------------------------------- [ 149.434053] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. [ 149.440059] swapper/3/0 [HC1[1]:SC0[0]:HE0:SE1] takes: [ 149.445198] ffff9ac80e005d68 (&adev->vm_manager.pasid_lock){?.+.}-{2:2}, at: amdgpu_vm_get_task_info+0x25/0x90 [amdgpu] [ 149.456252] {HARDIRQ-ON-W} state was registered at: [ 149.461136] lock_acquire+0x242/0x390 [ 149.464895] _raw_spin_lock+0x2c/0x40 [ 149.468647] amdgpu_vm_handle_fault+0x44/0x380 [amdgpu] [ 149.474187] gmc_v9_0_process_interrupt+0xa8/0x410 [amdgpu] ... Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:42 -04:00
Yifan Zhang	1ba7b24ba6	drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell. If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC. Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:42 -04:00
Yifan Zhang	631003101c	drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue. If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC. Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:42 -04:00
Nirmoy Das	e18aaea733	drm/amdgpu: move shadow_list to amdgpu_bo_vm Move shadow_list to struct amdgpu_bo_vm as shadow BOs are part of PT/PD BOs. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:42 -04:00
Nirmoy Das	23e24fbb76	drm/amdgpu: parameterize ttm BO destroy callback Make provision to pass different ttm BO destroy callback while creating a amdgpu_bo. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:42 -04:00
Nirmoy Das	391629bdfc	drm/amdgpu: remove amdgpu_vm_pt Page table entries are now in embedded in VM BO, so we do not need struct amdgpu_vm_pt. This patch replaces struct amdgpu_vm_pt with struct amdgpu_vm_bo_base. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:42 -04:00
Hawking Zhang	ed4454c384	drm/amdgpu: correct psp ucode arrary start address For ASICs that need to load sys_drv_aux and sos_aux, the sys_start_addr is not the start address of psp ucode array because the sys_drv_aux and sos_aux actaully located at the end of the ucode array, instead, the psp ucode arrary start address should be sos_hdr + sos_hdr_offset. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:42 -04:00
Felix Kuehling	5a75ea56e3	drm/amdkfd: Disable SVM per GPU, not per process When some GPUs don't support SVM, don't disabe it for the entire process. That would be inconsistent with the information the process got from the topology, which indicates SVM support per GPU. Instead disable SVM support only for the unsupported GPUs. This is done by checking any per-device attributes against the bitmap of supported GPUs. Also use the supported GPU bitmap to initialize access bitmaps for new SVM address ranges. Don't handle recoverable page faults from unsupported GPUs. (I don't think there will be unsupported GPUs that can generate recoverable page faults. But better safe than sorry.) Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:41 -04:00
Wesley Chalmers	d0414a834c	drm/amd/display: Extend AUX timeout for DP initial reads [WHY] DP LL Compliance tests require that the first DPCD transactions after a hotplug have a timeout interval of 3.2 ms. In cases where LTTPR is disabled, this means that the first reads from DP_SET_POWER and DP_DPCD_REV must have an extended timeout. Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Anson Jacob <Anson.Jacob@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:41 -04:00
Wesley Chalmers	78ebca3219	drm/amd/display: Cover edge-case when changing DISPCLK WDIVIDER [WHY] When changing the DISPCLK_WDIVIDER value from 126 to 127, the change in clock rate is too great for the FIFOs to handle. This can cause visible corruption during clock change. HW has handed down this register sequence to fix the issue. [HOW] The sequence, from HW: a. 127 -> 126 Read DIG_FIFO_CAL_AVERAGE_LEVEL FIFO level N = DIG_FIFO_CAL_AVERAGE_LEVEL / 4 Set DCCG_FIFO_ERRDET_OVR_EN = 1 Write 1 to OTGx_DROP_PIXEL for (N-4) times Set DCCG_FIFO_ERRDET_OVR_EN = 0 Write DENTIST_DISPCLK_RDIVIDER = 126 Because of frequency stepping, sequence a can be executed to change the divider from 127 to any other divider value. b. 126 -> 127 Read DIG_FIFO_CAL_AVERAGE_LEVEL FIFO level N = DIG_FIFO_CAL_AVERAGE_LEVEL / 4 Set DCCG_FIFO_ERRDET_OVR_EN = 1 Write 1 to OTGx_ADD_PIXEL for (12-N) times Set DCCG_FIFO_ERRDET_OVR_EN = 0 Write DENTIST_DISPCLK_RDIVIDER = 127 Because of frequency stepping, divider must first be set from any other divider value to 126 before executing sequence b. Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Anson Jacob <Anson.Jacob@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:41 -04:00
Wesley Chalmers	a659f2fdf8	drm/amd/display: Add interface to get Calibrated Avg Level from FIFO [WHY] Hardware has handed down a new sequence requiring the value of this register be read from clk_mgr. Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Anson Jacob <Anson.Jacob@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:41 -04:00
Wesley Chalmers	9cf9498f66	drm/amd/display: Partition DPCD address space and break up transactions [WHY] SCR for DP 2.0 spec says that multiple LTTPRs must not be accessed in a single AUX transaction. There may be other places in future where breaking up AUX accesses is necessary. [HOW] Partition the entire DPCD address space into blocks. When an incoming AUX request spans multiple blocks, break up the request into multiple requests. Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Anson Jacob <Anson.Jacob@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-15 17:25:41 -04:00

1 2 3 4 5 ...

19310 Commits