I really can’t wait until GPU reset on Linux becomes a thing…

Many motherboards (especially reference boards from silicon vendors, or laptops) let you enable/disable power to the slot via a GPIO (SoC pin control).

You can spot this by grepping for “rtd” in the src/mainboard/* directories of the coreboot tree; there are pins specified for PCIe power and reset.

Imagine being able to cut power to something like a dedicated GPU in your system by setting the PCIE_PWR pin low and dropping your desktop's power consumption by ~30W (or even ~80W if you happen to use a first-generation Intel Arc GPU).

Then, when you want to play a game, you'd set the pin high, DRM would re-initialize the card, and you could either play as-is (displays connected to the iGPU would cost you some performance due to the DMA framebuffer copy) or press the button on a (very inexpensive these days) DP/HDMI switch and play at full performance straight from the dGPU.

Of course, none of that would be required if board vendors implemented ASPM correctly, but the vast majority of them fuck up their power management/board designs, so this would be nice to have.

(One day I will go insane enough to design an open-source ATX board that implements everything correctly and uses an STM32 as the Embedded Controller)

@[email protected] omg that would be amazing

I had this unhinged idea of a cluster worker node that's also usable as a desktop; being able to cut power to the video card and drop to nearly idle draw while still helping cluster stability would be amazing

@elly The horrors of fixing the mess...

though at least through runtime power management it _might_ be doable already, it just requires userspace to do a thingy 🙃 and most people plug their displays into the discrete GPU anyway (or don't even have an iGPU in the first place)...
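For the record, the "userspace thingy" mostly boils down to flipping the standard sysfs runtime-PM knob on the dGPU's PCI device. A minimal sketch (the device address in the docstring is just an example; the sysfs layout is the generic Linux PCI power-management ABI):

```python
from pathlib import Path

def enable_runtime_pm(device: Path) -> None:
    """Allow the kernel to runtime-suspend a PCI device.

    `device` is a sysfs PCI device directory, e.g.
    Path("/sys/bus/pci/devices/0000:01:00.0") (address is an example).
    Writing "auto" to power/control opts the device into runtime PM;
    the default "on" keeps it awake forever.
    """
    (device / "power" / "control").write_text("auto")

def runtime_status(device: Path) -> str:
    """Current runtime-PM state: 'active', 'suspended', etc."""
    return (device / "power" / "runtime_status").read_text().strip()
```

Whether the GPU then actually reaches D3cold still depends on the firmware/board doing its part, which is the whole problem.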

_but_ we also already have eGPUs and that's already a huge dumpster fire.

@karolherbst oh yeah, eGPUs are a total mess...
Two years ago my company organized a hackathon somewhere in rural Germany; I brought some Chromebooks and my desktop, and a coworker brought his laptop and an RTX 2080 (I think?) in an eGPU enclosure.

The eGPU setup basically... didn't work. It behaved much better with the RX 7800 XT we yoinked from my workstation, but still wasn't super stable either way.
I ended up letting him swap his SSD into my desktop and use it for a few days, after which he sold the nvidia card and the eGPU enclosure. He bought a used Ryzen system with the same GPU I have (I think) for pretty much the same amount of money he got from selling that setup.

That's why it might be a hot take, but eGPUs are inherently stupid. The setup costs the same as used parts and you give up A LOT of performance due to bus constraints (it's like what, 32Gb/s on TB4?).
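Quick back-of-the-envelope on that, assuming TB4's PCIe tunnel behaves roughly like a PCIe 3.0 x4 link while a desktop dGPU gets a PCIe 4.0 x16 slot:

```python
def pcie_gbps(rate_gt_s: float, lanes: int) -> float:
    """Effective one-direction bandwidth in Gb/s for a PCIe 3.0+ link:
    raw rate per lane x lane count x 128b/130b encoding efficiency."""
    return rate_gt_s * lanes * 128 / 130

tb4_tunnel = pcie_gbps(8, 4)      # ~32 Gb/s: what the eGPU actually gets
desktop_slot = pcie_gbps(16, 16)  # ~252 Gb/s: a PCIe 4.0 x16 slot
print(round(tb4_tunnel), round(desktop_slot))
```

So the enclosure is working with roughly an eighth of the desktop slot's bandwidth before you even get to the stability problems.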

I'm pretty hyped for the dual-OcuLink board for my work laptop (Framework 16) that someone's been working on.
It exposes a direct PCIe 4.0 x8 link on the back of the laptop that you can do whatever you want with.
Plus, because AMD doesn't have the whole IFD nonsense, you can split those lanes however you want (at least according to my findings while reverse-engineering this stuff so far), so you can literally have 8 PCIe 4.0 x1 slots hanging off your laptop if you really want.

This effectively means that once I'm done with porting coreboot to that system, you'll have a toggle that will allow you to split that PCIe port into:
- 1x 8-lanes
- 2x 4-lanes
- 4x 2-lanes
- 8x 1-lane
- Maybe something custom, like 2x2 for two additional (internal) NVMe SSDs, plus an external OcuLink with 4 lanes of PCIe 4.0 (~64Gb/s)?
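For a sense of scale, here's what each split option works out to per port, assuming PCIe 4.0's 16 GT/s per lane with 128b/130b encoding (so an x4 OcuLink port lands around 63 Gb/s of effective bandwidth, 64 raw):

```python
PCIE4_GT_S = 16   # raw PCIe 4.0 rate per lane, GT/s
ENC = 128 / 130   # 128b/130b encoding efficiency

# (ports, lanes per port) for each split option listed above
splits = [(1, 8), (2, 4), (4, 2), (8, 1)]

for ports, lanes in splits:
    per_port = PCIE4_GT_S * lanes * ENC
    print(f"{ports} port(s) of x{lanes}: {per_port:.1f} Gb/s each")
```

Even the x2 option still comfortably beats a TB4 PCIe tunnel.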

That would be really neat in my opinion, and the board costs basically pennies (~40EUR) compared to a licensed TB4 implementation (I don't even want to think how much TB5 will cost, given signal integrity etc.)

@elly it's great for driver development though... I have my eGPU set up in a way that I can unload the GPU driver at any moment and power cycle the case without rebooting!
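The trick here is the standard sysfs hot-remove/rescan dance. A sketch, with a made-up device address and the bus root as a parameter just to show the shape (the real paths live under /sys/bus/pci):

```python
from pathlib import Path
import time

def power_cycle_egpu(bdf: str, bus: Path = Path("/sys/bus/pci"),
                     settle_s: float = 2.0) -> None:
    """Hot-remove a PCI device so its enclosure can be power cycled,
    then rescan the bus to bring it back.

    `bdf` is the device address, e.g. "0000:05:00.0" (made up here);
    unload/unbind the GPU driver before calling this.
    """
    # Delete the device from the kernel's view.
    (bus / "devices" / bdf / "remove").write_text("1")
    # Power cycle the enclosure in this window.
    time.sleep(settle_s)
    # Rediscover everything on the bus.
    (bus / "rescan").write_text("1")
```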

So that's kinda working, but yeah.. do not unplug the cable 🙃

@elly my magic trick: create a fake "card1" device file and chmod 000 it lol. This way the Wayland compositor won't even dare touch it.
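i.e. something like this (the path is whatever your compositor would try to open, e.g. /dev/dri/card1; a plain mode-000 file is enough, since open() on it fails with EACCES for non-root processes):

```python
from pathlib import Path

def make_decoy_node(path: Path) -> None:
    """Drop an inaccessible stand-in where a DRM node would be
    (e.g. /dev/dri/card1) so the compositor gives up on opening it
    instead of waking the real GPU."""
    path.touch()
    path.chmod(0o000)
```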

@elly I'm confused, DRM has had support for this for over 10 years via ACPI. It power-gates the GPU after 4 seconds of inactivity.

Maybe coreboot doesn't expose the right methods?

@mupuf I think it's just the cursedness of my board. I thought I had it working at one point, but then the NVMe fell off the bus
(it's https://doc.coreboot.org/mainboard/erying/tgl/tgl_matx.html)

@elly @mupuf ah yeah, that's the issue I was debugging where the GPU fell off the bus after power cycling. The workaround was to disable some PCI power-management feature on the bridge beforehand...