Was thinking again a while ago about what a waste PBR textures can be under most lighting.

Kind of weird to do a 4x texture memory increase - assuming BC1-BC5 and no alpha/metalness, e.g. BC1 base color + BC5 normal map + BC4 roughness - for detail that will only show up under specific lighting conditions and appears flat elsewhere.

Though doubling texture resolution in both dimensions is also a 4x increase that might never show up (esp. with upscaling) so all things considered, maybe 4x isn't that bad.

#gamedev
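
The 4x figure checks out as back-of-the-envelope arithmetic (block-compressed formats store a fixed bit rate per texel: BC1 and BC4 are 4 bpp, BC5 is 8 bpp). A quick sketch:

```python
# Bits per texel for the block-compressed formats mentioned above.
BPP = {"BC1": 4, "BC4": 4, "BC5": 8}

def texture_bytes(width, height, fmt):
    """Size of the top mip of a width x height texture, in bytes."""
    return width * height * BPP[fmt] // 8

# A 4096x4096 material set:
base_color = texture_bytes(4096, 4096, "BC1")  # 8 MiB
normal     = texture_bytes(4096, 4096, "BC5")  # 16 MiB
roughness  = texture_bytes(4096, 4096, "BC4")  # 8 MiB

full_pbr = base_color + normal + roughness
print(full_pbr / base_color)  # 4.0 -- the full PBR set is 4x the diffuse-only cost
```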

@archo Yeah but at sensible resolutions, the higher-rez textures will never be loaded. So all they're wasting is disk space and their own production time. Whereas PBR is burning my precious DRAM for minor LSB differences. Boooooo.

(I say this with honest love to all my PBR shader writers)

(it's a joke. This is a bit)

(or is it)

@TomF Texture loading time from disk and download time/bandwidth would be wasted as well, which seems relevant with modern 150 GB games. (Also not a lot of such games fit on 1 TB consoles.)

But I suspect for high res textures there could be a way to stream the biggest mips on-demand from the CDN to the GPU based on what the GPU needs which AFAIK nobody's doing yet. PBR seems uniquely disadvantaged in this regard (but at least the number of parameters doesn't seem to be growing infinitely).

@archo I keep waiting for @rygorous to come up with a compression scheme for PBR textures. Gotta be a lot of zeros in that sparse matrix, right?
@TomF @archo Wronski et al. have already done it

@rygorous @TomF IIRC those methods mainly dealt with correlated color/normal/roughness. Though I've seen a post from 2020 that deals with tile repetition as well.

Many emissive/metalness maps I've seen could also benefit from tile deduplication in VRAM (lots of solid black/white space).

Geometry data (normal, AO, curvature) seems extremely difficult to compress though, despite looking very regular visually.

@archo @rygorous @TomF a lot of your typical AAA content is forming the final surface parameters “on-the-fly” by combining lots of tilers together in the shader. Although in some cases they will indeed ship “flattened” versions that were produced in Substance or a proprietary tool, it depends on the content. But historically shipping tiling maps has been an effective “compression” scheme, at the cost of performance and shader complexity. Or you even have VT systems that “flatten” at runtime.

@mjp @archo @rygorous Absolutely. But you can do that flatten/baking step based on predictive heuristics as well, and they work just fine.

Now if you're arguing that you could use sampler feedback for the SOURCES of that baking - since you don't really care about the latency there - absolutely! Although I suspect the heuristics work Just Fine Enough even there.

@archo @TomF cod already does network streaming for high res mips :')
On Demand Texture streaming - How we made all our Cod's fit on one PS4

@dotstdy @archo Yeah I was streaming mips off a DVD in 2000 - I'm sure people have been doing it from the interwebs for a long time. Even at the time we joked it was higher bandwidth and lower seek time...
@TomF @archo go full guerilla games and stream the whole damn game package
@dotstdy @TomF @archo Alas that's not been working so well for MS Flight Simulator 2024....
@jamesfmilne @TomF @archo well they only do it internally, it's a bit easier when you can just make sure everyone has 10gbe to the workstation. :')

@dotstdy @TomF IIRC there was also some software that used a virtual file system to download Steam game files on-demand (I forgot the name), which together with local mip streaming from separate files would work similarly.

The part I was specifically thinking nobody's done yet was detecting which mips are actually being requested by the pixel shader. It seems this wasn't mentioned in the video but I'm curious if they've been experimenting with something like that as well.

@archo @dotstdy It's called "virtual texturing" and it's been tried many ways over the years. My favourite talk is this one by Sean Barrett: https://www.youtube.com/watch?v=MejJL87yNgI

Also called "megatextures" and implemented in the "id Tech 5" engine and used in a bunch of games, most notably Rage.

I don't remember if it's persisted in newer engines like UE5. There's a certain overhead to using it, and I think most of the time analytic solutions are just as effective and cheaper e.g. https://tomforsyth1000.github.io/blog.wiki.html#%5B%5BKnowing%20which%20mipmap%20levels%20are%20needed%5D%5D
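
The analytic approach boils down to estimating texels-per-pixel per draw, no GPU feedback required. A toy sketch (the names and the coverage model are illustrative, not Tom's actual method):

```python
import math

def needed_mip(texture_size, uv_extent, screen_pixels):
    """Roughly which mip a draw needs: log2 of texels per screen pixel.

    texture_size: texture width in texels
    uv_extent: fraction of the texture the mesh spans in UV space
    screen_pixels: on-screen width of that span, in pixels
    """
    ratio = (texture_size * uv_extent) / screen_pixels
    # mip 0 is full res; a magnified texture (ratio <= 1) still needs mip 0
    return max(0, math.floor(math.log2(ratio))) if ratio > 1 else 0

# A 4096-texel-wide texture fully mapped across 512 screen pixels:
print(needed_mip(4096, 1.0, 512))  # 3  (8 texels/pixel => mip 3)
```

A real engine would derive the same quantity from object distance, FOV, and triangle UV density, but the log2 at the end is the common core.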

Virtual Textures (aka Megatextures) talk (2008)


@archo @dotstdy Hardware implementations go as far back as 1998 when we showed a Permedia3 streaming a giant dataset, all demand-paged in only when the pixel shader requested the data.

The problem with all these demand-paged approaches is you get this gigantic stall in the middle of rendering your scene.

@archo @dotstdy So you need to solve this with two things:

1. have a fallback, i.e. let the pixel shader use the mipmap level it has, not the one it wants, and fetch the desired data async.

2. predict what the shader will want and prefetch aggressively. OK, but how do you do that? You use the analytic methods. But that reduces the advantage of the pixel-shader method, while its costs are still there.
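
Point 1 could be sketched roughly like this (hypothetical names; real residency tracking is per-page and lives GPU-side):

```python
def sample_with_fallback(residency, texture, desired_mip, fetch_queue):
    """Use the best mip already resident; queue the desired one asynchronously.

    residency: dict mapping texture -> finest resident mip (lower number =
    finer). If the shader wants something finer than what's loaded, it
    renders with what it has and the sharper mip is fetched async.
    """
    finest_resident = residency[texture]            # e.g. 4 => mips 4+ loaded
    if desired_mip < finest_resident:
        fetch_queue.append((texture, desired_mip))  # fetch sharper mip async
        return finest_resident                      # render with what we have
    return desired_mip

queue = []
mip = sample_with_fallback({"rock_albedo": 4}, "rock_albedo", 1, queue)
print(mip, queue)  # 4 [('rock_albedo', 1)]
```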

@TomF @dotstdy Oh yeah I'm aware of VT, was specifically talking about detecting shader demand in the context of streaming content from the internet (not even as granularly as VT impls do it, whole mips would be fine), sorry for the confusion.

It seems as if the solutions have been implemented 90% from one end (megatextures going from disk to GPU) and 90% from the other end (mips downloaded from a CDN depending on upcoming content/possibly draw calls), but I haven't seen an end-to-end solution.

@archo @dotstdy Got it. Xbox games can download in the background, and will stop if you hit a chunk that hasn't arrived yet, but doing mip-by-mip would indeed be pretty aggressive!

I bet Flight Simulator does this. It streams the entire world, so it kinda has to, right?

@TomF @dotstdy I suspect that for Flight Simulator it's easy to determine how far each model is from the camera so they likely wouldn't have to measure at pixel level to start downloading some chunk of the world. But I haven't looked into how their streaming works.
@archo @TomF tbh i'm not sure many titles use sampler feedback for their streaming. it seems kinda not really that effective in practice, and the existing huge piles of heuristics work quite well even if it's a bit of a nightmare.
@archo @TomF but there's a difference there between two problems, one being "how do i decide which mips to load", where the choices are between "feedback from the gpu sampling" and "cpu heuristics". And then the second problem is "where do i pull those mips from", where games like CoD and Flight Simulator are pulling (some of) them dynamically from the network, and most games just pull them from the disk package (maybe with a "high res texture dlc" for the stupid mip levels).
@archo @TomF In practice with sampler feedback (at least with the feature, ymmv if you're doing something bespoke with software paging) it seems that you really can't pull a large amount of data without compromising performance. So in order to use it you're stochastically sampling sampler feedback at an extremely low rate. Plus, I think titles just don't need to use it when they already have a highly tuned, working, texture streamer, so there's not necessarily a huge demand to change it up.

@dotstdy @archo Also if you rely only on sampler feedback, as the camera turns, the edge of the screen (which is where your eyes are naturally looking) is always wrong. I like the term "Just Too Late" rendering instead of Just In Time. It looks kinda crappy.

So as Josh says, you always need good PREDICTIVE heuristics anyway, and if you do then why even use sampler feedback?

@TomF @dotstdy @archo Turnkey sampler feedback failed the "no implicit behavior" test of inserting bits into your shaders, as well. But the primary piece is as you note, that your predictive heuristics are needed and Good Enough(tm). The most interesting uses for sampler feedback are actually for offline analysis in tooling to build offline CPU-side predictive guidance hierarchies/etc. for your streamer.

@wadeb is on here if you want to ask specifics about COD on-demand texture streaming.

@mtothevizzah @TomF @dotstdy @archo I feel like there's still space for stochastic sampler feedback - but not as an alternative to traditional streaming, and not as a side-channel hw feature.
@mtothevizzah @TomF @dotstdy @archo Roughly, the usual predictive streamer manages through mip N-2, then the largest mips are managed page-by-page via stochastic sampler feedback.
@mtothevizzah @TomF @dotstdy @archo It'd maximize the memory-saving-precision of sampler feedback, while limiting worst case blurriness from camera cuts.
Aki and I had a design for an intern to try on PS5 a few years ago but it didn't get done.
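
A rough sketch of that split (the 1/256 rate and the two-mip cutoff are made-up illustrative numbers, not the actual design):

```python
import random

FEEDBACK_RATE = 1 / 256  # write feedback for only ~0.4% of samples

def record_feedback(requests, texture, mip, page, rng=random.random):
    """Stochastically record which page of a large mip a sample touched.

    Only the two largest mips (0 and 1) are feedback-managed page-by-page;
    everything coarser is owned by the predictive streamer.
    """
    if mip >= 2:
        return False          # predictive streamer's territory
    if rng() < FEEDBACK_RATE:
        requests.add((texture, mip, page))
        return True
    return False
```

The stochastic rate keeps the GPU-to-CPU feedback traffic tiny, at the cost of needing a few frames before a hot page is reliably noticed.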
@wadeb @mtothevizzah @dotstdy @archo The real problem is eviction. If you only get data back to the CPU about page faults, how does it know what it can safely evict? You need some sort of frame counter on each page, and then the GPU has to check those and tell the CPU which pages are LRU, and it's all getting excitingly complex and expensive again. I mean - I realise this is all how standard CPU VM works - it's really really ugly and "how does this function at all" and so on, but...
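
The frame-counter LRU might look something like this on the CPU side (a toy sketch with hypothetical names; a real implementation reads these counters back from the GPU, which is exactly the complexity being complained about):

```python
def evict_lru_pages(pages, budget, current_frame, keep_frames=2):
    """Evict least-recently-used pages once the resident set exceeds budget.

    pages: dict of page_id -> last frame the GPU sampled it (fed back).
    Pages touched within the last keep_frames frames are never evicted,
    to avoid thrashing pages that are still in flight.
    """
    if len(pages) <= budget:
        return []
    evictable = sorted(
        (pid for pid, frame in pages.items()
         if current_frame - frame > keep_frames),
        key=lambda pid: pages[pid],  # oldest first
    )
    victims = evictable[: len(pages) - budget]
    for pid in victims:
        del pages[pid]
    return victims

resident = {"a": 10, "b": 90, "c": 40, "d": 99}
print(evict_lru_pages(resident, budget=2, current_frame=100))  # ['a', 'c']
```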
@TomF @dotstdy @archo On PC, definitely. Not only do you have to contend with the latency of readback, and potentially spinning platters, but the API also requires all the shaders to be instrumented. Bang for buck is awful.
On consoles with shared memory and dedicated IO HW though, you can get the just too late time down to 1-2 frames after the first texel is requested, and at that point you can probably get away with some _really_ dumb heuristics for fallbacks, and only predicting on camera cuts.

@TomF @dotstdy I would agree with "just too late" looking crappy but it's also ubiquitous and happens in ways that are far more noticeable (e.g. culling).

I also specifically recall GR:Wildlands (2017) doing "just too late" mip streaming, most noticeably with road textures (but probably at draw call granularity and going all the way between low and high LOD).

I agree with SF not adding much though. It at most removes the need for reducing detail in graphics options manually.

@archo @dotstdy Yeah I really hate just-too-late pixel-feedback culling especially. It's really obvious. Pop pop pop.

In general the problem with occlusion culling is it doesn't help the worst case, and although it helps framerate in the normal case, it also adds artifacts. Makes me very hesitant to add it to any engine - very poor bang-for-the-buck.

@TomF @archo @dotstdy we use it extensively in ue5, it's a lot cheaper now that there is first-class hardware support, and it saves me a lot of work
@1st_C_Lord @archo @dotstdy You don't see it pop in annoying ways? Is this another case where TAA hides the sins? :-(
@TomF @archo @dotstdy there are cases where you can catch it streaming in, but unreal's traditional mip level streaming is much worse anyway
@1st_C_Lord @archo @dotstdy That's pretty impressive - well done.
@archo I’m not sure what you mean by this, all of your typical “PBR” maps will be active in all lighting scenarios unless you literally have no lighting.

@mjp I meant that it would be possible to replicate huge chunks of scenes even with the PBR textures removed (it may need baking/keeping the lighting in the color map, aka old-school diffuse maps).

Depends on the game and the scene of course but most materials have high roughness and a good amount of games have lots of heavily shadowed areas or cloudy/foggy skies.

Attached some examples where the numerous flat-appearing areas should be easily visible:

@archo those two scenes would look wildly different without roughness/normal maps, and *especially* if you dropped specular entirely. Even if you had infinite resolution baked diffuse response through VT or similar they wouldn’t look the same, and you’d have no ability to do dynamic lighting.
@archo the whole “wet look” of the road in the left screenshot is coming from normal and roughness maps. I’m not sure how you would replicate that with just diffuse. Or even if you could, it would surely not look right from multiple view angles.

@mjp I'm not saying that the entire screenshot/scene doesn't benefit from PBR textures, just that there are huge parts that don't - for example, the walls in the first screenshot and grass/most trees in the second one.

My note on diffuse baking was about being able to assume that lighting will generally come from known average directions, which allows the normal map to be baked into base color with hard-to-notice loss assuming (near-)max roughness (e.g. for the characters in the screenshots).
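
As a sketch of that baking step, assuming a single known average light direction and a purely Lambertian (max-roughness) response, so there is no view dependence to lose:

```python
import math

def bake_lambert(albedo, normal, light_dir):
    """Fold a normal map's diffuse response into the base color, assuming a
    fixed light direction and (near-)max roughness (pure Lambert)."""
    n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    return tuple(c * n_dot_l for c in albedo)

# A texel whose normal tilts 60 degrees away from an overhead light:
tilted = (0.0, math.sin(math.radians(60)), math.cos(math.radians(60)))
print(bake_lambert((0.8, 0.8, 0.8), tilted, (0.0, 0.0, 1.0)))
# darkens to ~(0.4, 0.4, 0.4): the shading now lives in the color map
```

Once baked like this, the normal map can be dropped for that material; the obvious cost is that the result is only correct for the assumed light direction.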

@archo I very much doubt that assumption would hold for all viewing angles or lighting conditions in which that wall material ends up getting used. Grass has tons of specular, especially up close! Very few materials have effectively no noticeable specular, especially at grazing angles.
@archo if you were to bake lighting/normals into the color map you would effectively be having a gigantic combinatorial explosion on your texture sizes, MegaTexture-style. That's a very tough pill to swallow for all kinds of reasons, both in terms of dev workflow and end user experience. People already complain about filling up their HDDs. Which is why that idea of shipping unique VTs never went anywhere post-Rage.
@archo VT page generation at runtime is very much a thing though, since you still get the massive compression benefit of shipping tiling textures that can be used over and over again throughout the game. But even then the VT system will still spit out normal + roughness maps, since you want that for dynamic lighting. Or you can go even further and do full texture-space shading using the bones of a runtime VT system.

@mjp I specifically mentioned lighting as a precondition in my initial post.

As for grazing angles, they shouldn't make a difference for (near-)max roughness materials.

Grass may have tons of specular elsewhere, but not in that screenshot.

Also not concerned with solutions that require >1 baked color maps.

All I'm saying is that with frequently appearing low-contrast lighting, PBR maps have little value.

Certainly not claiming there's a better way to support different lighting conditions. 🤷‍♂️