I accidentally discovered a useful trick with my open-source #AMD #Radeon R9 270 GPU drivers on #Linux: when running the #Wayland GNOME session, if I encounter corrupt graphics (garbled on a black screen) while resuming from standby sleep, I can… simply press the power button to put the computer back to sleep, resume again, and it fixes itself!
No need to reboot or restart the display server!

I wonder how that's possible… & why it doesn't do that automatically.

#Mesa #RadeonSI #AMDGPU #drivers

@nekohayo Wayland has had issues with screen corruption for some time now.

@simplycorbett
Maybe it was merely coincidence from running older drivers when I was still on Xorg last year, but I don't remember being able to simply recover the GPU state by suspending & resuming when I was on Xorg (I also don't remember with great accuracy whether Xorg had garbled artifacts like that on resume occasionally, but I _think_ I had seen those before).

Attached image shows the use-after-free kfence error I saw in journalctl from the first resume tonight; absent in 2nd resume.
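
For anyone wanting to compare resumes the same way, here is a rough sketch (not a tested tool) that scans kernel log text for KFENCE report headers. It assumes the header looks like the one in the linked issue title, `BUG: KFENCE: use-after-free read in amdgpu_bo_move+0x1ce/0x710 [amdgpu]`. The names `KfenceReport` and `find_kfence_reports` are made up for illustration.

```python
import re
from typing import List, NamedTuple, Optional


class KfenceReport(NamedTuple):
    """One KFENCE report header found in the log (hypothetical helper type)."""
    kind: str              # e.g. "use-after-free read"
    symbol: str            # e.g. "amdgpu_bo_move"
    module: Optional[str]  # e.g. "amdgpu", or None when no [module] tag is present


# Assumed header shape: "BUG: KFENCE: <kind> in <symbol>+0x<off>/0x<size> [<module>]"
_KFENCE_RE = re.compile(
    r"BUG: KFENCE: (?P<kind>.+?) in (?P<symbol>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def find_kfence_reports(log_text: str) -> List[KfenceReport]:
    """Return every KFENCE report header found in the given log text."""
    reports = []
    for line in log_text.splitlines():
        match = _KFENCE_RE.search(line)
        if match:
            reports.append(
                KfenceReport(match["kind"], match["symbol"], match["module"])
            )
    return reports
```

Feeding it the text from `journalctl -k -b` (kernel messages for the current boot) after each resume would show whether the report appears again; an empty list means no KFENCE header matched the assumed pattern.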

@simplycorbett This error sounds a lot like https://gitlab.freedesktop.org/drm/amd/-/issues/3171 (where some reporters were running Xorg as well) on kernels 6.7.x and 6.8.x 🤔 but if that's the one, it's unclear to me what exact kernel versions the fix is supposed to be in…

@nekohayo I'm not sure, sorry! I have nvidia cards, so when I do use desktop Linux (instead of servers) it's with nvidia. All I can tell you is that there has been VRAM/memory corruption on nvidia for a while when it comes to the screen on Wayland. I -think- it's been fixed, but you would have to be running a rolling release with the most up-to-date version of everything. I've never heard of an issue with AMD cards; are you sure it's a Linux bug and not hardware?

@simplycorbett In my casual experience over the years, AMD Mesa/Linux drivers are a bit temperamental, prone to lockups & crashes. Especially the infamous "ring stalled" errors!
Seems someone wrote an open letter about it in general: https://www.reddit.com/r/Amd/comments/1bsjm5a/letter_to_amd_ongoing_amd/

It once took me 2 years to discover that the Mesa drivers for AMD GPUs don't do any thermal throttling on Linux: they just crash, unlike the drivers on Windows (that would throttle the card's performance when overheating, and never crash).

@nekohayo Sounds like a race condition.