@rygorous Super interesting! I recently ran into this precision problem when converting colors. I came up with a slightly different solution that gives similarly exact results and is about 10% faster. Code in #csharp
static float ByteToFloat(byte value) => (value + value + value) * (1.0f / (3.0f * 255.0f));
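A quick way to check the "exact results" claim is to compare against a true division for every byte value. This is only a minimal sketch: the `CountMismatches` harness and the `Naive` baseline are illustrative additions, not part of the original post, and it assumes the runtime evaluates `float` arithmetic in single precision.

```csharp
using System;

static class ByteToFloatCheck
{
    // The one-liner from the post above.
    public static float ByteToFloat(byte value) => (value + value + value) * (1.0f / (3.0f * 255.0f));

    // Baseline for comparison: plain multiplication by the rounded reciprocal of 255.
    public static float Naive(byte value) => value * (1.0f / 255.0f);

    // Counts the byte values for which convert(x) differs from the division x / 255.0f.
    public static int CountMismatches(Func<byte, float> convert)
    {
        int mismatches = 0;
        for (int i = 0; i <= 255; i++)
        {
            float expected = i / 255.0f;
            float actual = convert((byte)i);
            if (actual != expected) mismatches++;
        }
        return mismatches;
    }

    static void Main()
    {
        Console.WriteLine($"x * (1/255):      {CountMismatches(Naive)} mismatches");
        Console.WriteLine($"3x * (1/(3*255)): {CountMismatches(ByteToFloat)} mismatches");
    }
}
```

Running it on the target runtime shows how the two variants compare; the counts are deliberately not quoted here.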
@xoofx @rygorous Hmm... Messing around, the best multiplier "i" I could find for (i * value) * (1 / (i * 255)), judged by how far beyond 255 the input can go before you get an incorrect answer, is 341.
This gives a number that, when represented as float, is accurate to 40 bits (i.e. it has 16 zero bits after the 24 significant bits):
11000000111100001111110100000000000000001... 1/(341*255)
10000000100000001000000010000000100000001... 1/255
i=3 doesn't have this property, but is a little (enough?) closer to its rounded representation than i=1...
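The 40-bit claim can be checked with integer arithmetic alone. A minimal sketch, assuming the 24 leading bits of the first row are read as the hex constant `0xC0F0FD`; the helper name `RepeatsWithPeriod` is hypothetical, not something from the thread:

```csharp
using System;

static class ReciprocalBits
{
    // True when d * m == 2^bits - 1, which means the binary expansion of 1/d
    // is the bits-wide pattern m repeated forever.
    public static bool RepeatsWithPeriod(long d, long m, int bits) => d * m == (1L << bits) - 1;

    static void Main()
    {
        // 1/(341*255): 24-bit pattern 0xC0F0FD padded to a 40-bit period.
        Console.WriteLine(RepeatsWithPeriod(341 * 255, 0xC0F0FD, 40)); // True
        // 1/255: the 8-bit pattern 00000001 repeating.
        Console.WriteLine(RepeatsWithPeriod(255, 0x01, 8));            // True
    }
}
```

Because `0xC0F0FD` occupies exactly 24 bits, each 40-bit period is 24 significant bits followed by 16 zeros, which is why rounding 1/(341*255) to a float loses nothing until bit 41.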
@xoofx @rygorous (I feel like the other factor is increasing the number of significant bits on the left hand side, so the multiplication result is wider. I wouldn't be surprised if I was making an error there though.)
This is probably a better way to look at it: https://x.com/corsix/status/1854539270981025903
(I _think_ my ideas are kind of equivalent?)
@rygorous Intuition: multiplication by the reciprocal involves multiplication by integer m then division by 2^e, where m/2^e approximates 1/255. We want a slightly more accurate approximation, which means increasing |m|, but FP format limits |m|. Solution? Factorise m = m_1 * m_2.
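One way to make the factorisation concrete, using i = 341 from earlier in the thread. This is a sketch under assumptions: `M1` is taken to be the integer pre-multiplier 341, `M2` the 24-bit significand `0xC0F0FD` of the float constant 1/(341*255), and e = 40. The names `M1`, `M2`, `M` and `BitLength` are illustrative only.

```csharp
using System;
using System.Numerics;

static class FactoredReciprocal
{
    public const long M1 = 341;       // applied exactly to the integer input
    public const long M2 = 0xC0F0FD;  // fits a float's 24-bit significand
    public const long M = M1 * M2;    // the effective multiplier m

    // Number of bits needed to represent v (v > 0).
    public static int BitLength(long v) => 64 - BitOperations.LeadingZeroCount((ulong)v);

    static void Main()
    {
        Console.WriteLine($"m = 0x{M:X}");              // m = 0x101010101
        Console.WriteLine(BitLength(M));                // 33: too wide for one float
        Console.WriteLine(BitLength(M1));               // 9
        Console.WriteLine(BitLength(M2));               // 24
        // m / 2^40 falls short of 1/255 by only 1 / (255 * 2^40).
        Console.WriteLine((1L << 40) - 255 * M);        // 1
    }
}
```

The 33-bit m cannot be stored in a single float, but its 9-bit and 24-bit factors each fit comfortably, which is the point of splitting m = m_1 * m_2.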