New blog post: "Exact UNORM8 to float" https://fgiesen.wordpress.com/2024/11/06/exact-unorm8-to-float/ nobody asked, but here we go.

GPUs support UNORM formats that represent a number inside [0,1] as an 8-bit unsigned integer. In exact arithmetic, the conversion to a floating-point number is straightforward: take the integer and…

The ryg blog

@rygorous Super interesting! I recently ran into this precision problem when converting colors. I came up with a slightly different solution that gives equally exact results and is about 10% faster. Code in #csharp

static float ByteToFloat(byte value) => (value + value + value) * (1.0f / (3.0f * 255.0f));
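For anyone who wants to check the "exact" claim themselves, here is a minimal sketch of an exhaustive test over all 256 inputs. The class name `Unorm8Check` and the helpers `ByteToFloatNaive`, `Reference` and `CountMismatches` are made up for illustration; only `ByteToFloat` comes from the post above. It assumes float arithmetic is actually carried out in single precision (the C# spec does not strictly guarantee that) and takes the correctly rounded IEEE float division `value / 255.0f` as the reference.

```csharp
using System;

static class Unorm8Check
{
    // The one-liner from the post above: scale by 3, then multiply by 1/(3*255).
    public static float ByteToFloat(byte value) => (value + value + value) * (1.0f / (3.0f * 255.0f));

    // Hypothetical naive variant for comparison: one multiply by the rounded reciprocal of 255.
    public static float ByteToFloatNaive(byte value) => value * (1.0f / 255.0f);

    // Reference result: a single float division, which IEEE 754 rounds correctly.
    public static float Reference(byte value) => value / 255.0f;

    // Counts the inputs in 0..255 where the candidate differs bit-for-bit from the reference.
    public static int CountMismatches(Func<byte, float> convert)
    {
        int mismatches = 0;
        for (int v = 0; v <= 255; v++)
        {
            byte b = (byte)v;
            int got = BitConverter.SingleToInt32Bits(convert(b));
            int want = BitConverter.SingleToInt32Bits(Reference(b));
            if (got != want) mismatches++;
        }
        return mismatches;
    }
}
```

Calling `Unorm8Check.CountMismatches(Unorm8Check.ByteToFloat)` should report no mismatches, while passing `Unorm8Check.ByteToFloatNaive` should report a nonzero count (the input 3 is one of the failing values).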

@xoofx @rygorous Woah, nice! Is there theory behind that, or just trial and error?
@dougall @rygorous It was purely gut feeling and brute force! 😅 I looked for an integer i that would mitigate the loss of precision from dividing by 255 alone, tried each i in (value * i) * (1.0f / (i * 255.0f)), and 3 came out quickly. Working out the theory backward from there should be possible, but I'm a lazy person ☺️
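The brute-force search described here could look roughly like the sketch below. This is a guess at the procedure, not the code that was actually run; `MultiplierSearch`, `IsExact` and `FirstExactMultiplier` are hypothetical names, and the sketch again assumes single-precision float evaluation with `value / 255.0f` as the reference.

```csharp
using System;

static class MultiplierSearch
{
    // True if (value * i) * (1.0f / (i * 255.0f)) equals value / 255.0f for every byte value.
    // Assumes i * 255 stays below 2^24 so that both i * 255.0f and value * i are exact in float.
    public static bool IsExact(int i)
    {
        float scale = 1.0f / (i * 255.0f);
        for (int value = 0; value <= 255; value++)
        {
            float candidate = (value * i) * scale;
            float reference = value / 255.0f;
            if (candidate != reference) return false;
        }
        return true;
    }

    // Returns the smallest multiplier in 1..maxMultiplier that is exact for all bytes, or -1 if none is.
    public static int FirstExactMultiplier(int maxMultiplier)
    {
        for (int i = 1; i <= maxMultiplier; i++)
        {
            if (IsExact(i)) return i;
        }
        return -1;
    }
}
```

With these assumptions, `MultiplierSearch.FirstExactMultiplier(16)` lands on 3: i=1 is the plain reciprocal multiply, and i=2 only rescales it by a power of two, so neither changes the rounding behaviour.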

@xoofx @rygorous Hmm... Messing around, the best "i" I could find (by how far beyond 255 you need to go before getting an incorrect answer) is 341.

This gives a number that, when represented as float, is accurate to 40 bits (i.e. it has 16 zero bits after the 24 significant bits):

11000000111100001111110100000000000000001... 1/(341*255)
10000000100000001000000010000000100000001... 1/255

i=3 doesn't have this property, but is a little (enough?) closer to its rounded representation than i=1...
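One way to put numbers on "how close is the rounded constant" is to model the float as a 24-bit integer mantissa m over a power of two 2^e and look at the integer residue 255*i*m - 2^e, which is zero only if m/2^e hits 1/(255*i) exactly. The sketch below does this with plain integer arithmetic; `ReciprocalResidue` and `Analyze` are hypothetical names, and it models round-to-nearest for normal floats only, which is all that matters for these constants.

```csharp
using System;

static class ReciprocalResidue
{
    // Models rounding 1/(255*i) to a float as m / 2^e with a 24-bit mantissa m.
    // Returns (e, m, r) where r = 255*i*m - 2^e; a small |r| relative to 2^e means an accurate constant.
    public static (int Exponent, long Mantissa, long Residue) Analyze(int i)
    {
        long n = 255L * i;
        int e = 0;
        // Smallest e for which 2^e / n reaches 2^23, so that m has 24 significant bits.
        while ((1L << e) / n < (1L << 23)) e++;
        long m = ((1L << e) + n / 2) / n;   // round 2^e / n to the nearest integer
        long r = m * n - (1L << e);
        return (e, m, r);
    }
}
```

For i=1 this gives m = 8421505 at e = 31 with residue 127, for i=3 it gives m = 11228673 at e = 33 with residue 253, and for i=341 it gives m = 12644605 (hex C0F0FD, the 24 leading bits shown above) at e = 40 with residue -1. Relative to 2^e, that makes the i=3 constant roughly twice as accurate as the i=1 constant, and the i=341 constant accurate to about 40 bits.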

@xoofx @rygorous (I feel like the other factor is increasing the number of significant bits on the left hand side, so the multiplication result is wider. I wouldn't be surprised if I was making an error there though.)

This is probably a better way to look at it: https://x.com/corsix/status/1854539270981025903

(I _think_ my ideas are kind of equivalent?)

Pete Cawley (@corsix) on X

@rygorous Intuition: multiplication by the reciprocal involves multiplication by integer m then division by 2^e, where m/2^e approximates 1/255. We want a slightly more accurate approximation, which means increasing |m|, but FP format limits |m|. Solution? Factorise m = m_1 * m_2.
