diff options
| author | Rodrigo Siqueira <siqueira@igalia.com> | 2025-11-21 06:55:28 -0700 |
|---|---|---|
| committer | Alex Deucher <alexander.deucher@amd.com> | 2025-11-24 12:34:40 -0500 |
| commit | 34355e61835e772abddb49ad819f88bf185f8d42 (patch) | |
| tree | 133f6dbdb14edf2026f299caf37ad2d94e3eba06 /drivers/gpu/drm/amd/amdgpu | |
| parent | e12603bf2c3d571476a21debfeab80bb70d8c0cc (diff) | |
drm/amdgpu: Fix GFX hang on SteamDeck when amdgpu is reloaded
When trying to unload amdgpu in the SteamDeck (TTY mode), the following
set of errors happens and the system gets unstable:
[..]
[drm] Initialized amdgpu 3.64.0 for 0000:04:00.0 on minor 0
amdgpu 0000:04:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on gfx_0.0.0 (-110).
amdgpu 0000:04:00.0: amdgpu: ib ring test failed (-110).
[..]
amdgpu 0000:04:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
amdgpu 0000:04:00.0: amdgpu: Failed to disable gfxoff!
amdgpu 0000:04:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
amdgpu 0000:04:00.0: amdgpu: Failed to disable gfxoff!
[..]
When the driver initializes the GPU, the PSP validates all the firmware
loaded, and after that, it is not possible to load any other firmware
unless the device is reset. What is happening in the load/unload
situation is that PSP halts the GC engine because it suspects that
something is amiss. To address this issue, this commit ensures that the
GPU is reset (mode 2 reset) in the unload sequence.
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu')
| -rw-r--r-- | drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 |
1 files changed, 14 insertions, 0 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index ca4e47ce2417..a7594ae44b20 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3681,6 +3681,20 @@ static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev) "failed to release exclusive mode on fini\n"); } + /* + * Driver reload on the APU can fail due to firmware validation because + * the PSP is always running, as it is shared across the whole SoC. + * This same issue does not occur on dGPU because it has a mechanism + * that checks whether the PSP is running. A solution for those issues + * in the APU is to trigger a GPU reset, but this should be done during + * the unload phase to avoid adding boot latency and screen flicker. + */ + if ((adev->flags & AMD_IS_APU) && !adev->gmc.is_app_apu) { + r = amdgpu_asic_reset(adev); + if (r) + dev_err(adev->dev, "asic reset on %s failed\n", __func__); + } + return 0; } |