Today I submitted Vulkan 1.2 conformance for NVK on all Kepler GPUs. (They're not conformant yet. There's a review period.)

Sometimes I wonder why I even bother putting time into old hardware.

Sure, there's a lot of GTX 1060s out there that are gonna become e-waste one day. But also, we can't reclock them with nouveau and likely will never be able to, so they should probably just be e-waste. If you plug one in and want to run nouveau on it, it's just going to burn a lot of PCIe power plugs and render slower than the GPU you got for free with your old Intel CPU.

"Oh, but some of them can reclock!" Yes. Kepler, which is a 13-year-old architecture at this point. Sure, maybe with enough effort we can make it sing but it's so old and missing so many features that it can't run modern stuff anyway. So, yeah, maybe we can replicate a 2012 gaming experience in 2025. And the point is?...

A big part of it, though, is because I care about Linux users. There are a lot of people who still have one of those old cards and still want to use it. Maybe they can't afford a new GPU and the one they got second-hand from their rich uncle 5 years ago is the best they have. There's a lot of those people running Linux. There's a limit as to how much of my time I can really justify spending on ancient hardware but I want to give them what I can.

Another is that I really like the archeology of it. Today, I went on a deep dive down the rabbit hole of how NVIDIA has handled reading from global constant memory over the years. Tesla and Fermi didn't have any real magic here. Then Kepler B added a magic texture instruction that fetches from arbitrary 64-bit addresses. Then, on Maxwell, they unified a bunch of caches and added a non-coherent cache mode which skips all the coherency checks and just grabs the first thing that matches. On Volta, they reworked their coherency model but kept the non-coherent mode, only under a different name. I love learning about this stuff!

The third big reason is maintenance. Even if that old hardware will never run well, it does need to work. And the old nouveau stack is such a pain in the ass to work on and try to maintain. If we can make Zink+NVK even remotely decent on older hardware, that'll be a huge win for users in terms of reliability and stability and a huge win for the nouveau community in terms of support costs.

In the end, my job is to be a steward of the Linux graphics stack. To ensure that Linux graphics is good, actually. Sometimes that means deciding that my time is better spent on new features and hardware so that Linux can stay ahead of the curve. And sometimes that means I spend two weeks reviewing community patches and fixing the last few Kepler bugs so we can breathe some new life into some old GPUs.

@gfxstrand a very noble endeavour

> In the end, my job is to be a steward of the Linux graphics stack.

@gfxstrand No pressure at all there 😅 More seriously, don't overdo it... for your own sanity's sake! We need a healthy Faith for the years to come!

@mupuf I said A steward, not THE steward. There's quite a few of us. I know it's too big of a job, even for me. 💜
@gfxstrand Why can 10-series cards still not be reclocked if other newer cards now can be?
@gfxstrand great job, thank you for all the effort you are putting into it 🙏
@gfxstrand me reading this with a 1060 on my pc:
@gfxstrand thank you for your hard work. as someone who has a GTX 1070 in their home server and was just recently frustrated by the NVIDIA experience on Linux, I’m so so happy to see that there’s still someone caring about those old devices 🥹

@gfxstrand

Thank you. I think it's a worthy cause.

@gfxstrand what's the actual technical reason behind not being able to reclock newer pre-GSP GPUs?

@a1ba Reclocking is controlled by the firmware. (There's no way to do it with the registers exposed directly to PCI.) The firmware is signed. And Nvidia hasn't given us firmwares that are capable of reclocking.

On Turing+, we're using the GSP firmware, which is the same firmware blob they use in their Windows and Linux drivers and it just handles everything for us.

@gfxstrand is it signed at runtime by the driver, or is it already signed by nvidia and distributed in the driver package?

I mean, besides obvious legal implications, what stops one from using the firmware they distribute in the driver package?

@a1ba It's signed ahead of time.

Could we use the firmwares from the blob? Maybe in theory. But it's a whole complex chain of firmwares loading other firmwares and we have no docs for any of it. And even if someone did figure it out, they'd never be able to ship it because of licensing issues.

And it's not like no one has ever tried going down that path. Many have. And in the end they've all thrown up their hands and moved on to something they might actually succeed at. So if someone wants to go Don Quixote on it, I'm not gonna stop them but I'm also not going to suggest anyone take up that cursed project.

@gfxstrand yeah, that makes sense.

Thanks for taking my question seriously :)

@a1ba @gfxstrand +1 thank you

if you don't mind one more question: when the binary driver is probed and then unloaded, the blessed firmware remains active on the card, right? can nouveau interact with the card in that state without uploading the signed-for-nouveau-but-incapable-of-reclocking firmware variant?

@amonakov @a1ba People have toyed with ideas like that but loading one driver just for the firmware, unloading it, and then loading another is not something that's ever going to be robust. It's maybe a decent idea for the initial reverse engineering to separate loading from figuring out interfaces a bit but it's not something you can ship.

Also, I don't know what the proprietary driver does on teardown. That it leaves everything in a useable state is a big assumption.

@gfxstrand @a1ba weren't there bugs discovered in signature verification on some cards that would allow loading unsigned code?

EDIT: also, didn't signatures appear only in late Maxwell?

@gfxstrand Vulkan conformance for the Matrox G400 when?

*jk*

You're doing an awesome job with NVK.

@sascha You can already get conformant Vulkan displayed on your G400. Lavapipe should already be installed. 😜

@gfxstrand You're a hero.

My girlfriend today mostly uses her macbook, but she still has a desktop PC with a Kepler (660 ti) card in it! Perhaps there's a near future where we'll be switching that over from win10 to Linux.

Kepler really was a good micro-architecture in its day...

@gfxstrand I think the GPUs that can’t be reclocked also use a symmetric algorithm to sign the firmware, so in theory it ought to be possible to extract the signing key from the hardware. That might run afoul of DMCA and other similar laws, though, and to be honest the days of being able to keep old hardware running forever as anything more than a toy were killed by Meltdown and Spectre. (“I only run trusted code” implies “I don’t browse the web”, which is enough of a corner case that it can be ignored.)
@gfxstrand With a move to Zink+NVK, how much of the old nouveau stack jank does that (could that) let you cut out?
@developing_agent Basically all of it on the userspace side. The kernel module is still a bit janky but the Nova team is working on that. I've just gotta give them time.
@gfxstrand The nova team is working on making the nouveau kernel module less janky?
@developing_agent They're rewriting it.
@gfxstrand I thought the nova kernel module wasn't going to support older cards? Or do you mean they're rewriting the nouveau kernel module too?
@gfxstrand Or I guess you mean on the kernel side the nova rewrite only helps with newer cards, not older ones?
@developing_agent Correct. But NVK also has limits. We can't rewrite everything for everything. There just isn't a good justification for that.
@gfxstrand If anything, the thought that general pre-turing support trickles down in some way to Tegra (which _does_ have all needed firmware upstream iirc) already makes me happy. So it's still much appreciated!