@rygorous Super interesting! I recently ran into this precision problem when converting colors. I came up with a slightly different solution that gives similarly exact results and is about 10% faster. Code in #csharp
static float ByteToFloat(byte value) => (value + value + value) * (1.0f / (3.0f * 255.0f));
@corsix @xoofx @rygorous I think the best I can do on UNORM16 is:
float unorm16(int x) {
    float f = x * (1 / 65536.f);
    return f + f * (0x10001 / 4294967296.f);
}
Not amazing, but the first scale factor is a power-of-two, so it can be folded into the int-to-float operation on A64 (https://docsmirror.github.io/A64/2022-09/scvtf_float_fix.html), such that it's just two ops, SCVTF + FMADD.
It works without FMAs too, and x86 just needs an extra MULSS, but it's less convincing in those cases: https://godbolt.org/z/bsfx7WM7h
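For anyone who wants to check the fused (SCVTF + FMADD) path without A64 hardware, here is a rough sketch, not the code behind the godbolt link. It assumes that doing the multiply-add in double is a fair stand-in for an FMA: for 16-bit inputs the intermediate needs at most 49 bits, so the only rounding is the final conversion to float. The names unorm16_fused and unorm16_mismatches are made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper: emulates the fused multiply-add by computing in
   double, where every intermediate is exact for inputs in 0..65535. */
static float unorm16_fused(int x) {
    assert(x >= 0 && x <= 65535);
    double f = x * (1 / 65536.0);                /* exact: power-of-two scale */
    double t = f + f * (0x10001 / 4294967296.0); /* exact: at most 49 significant bits */
    return (float)t;                             /* the single rounding step */
}

/* Hypothetical helper: counts inputs whose result differs from a plain
   float division by 65535. */
static int unorm16_mismatches(void) {
    int bad = 0;
    for (int x = 0; x <= 65535; ++x) {
        float ref = (float)x / 65535.0f;
        if (unorm16_fused(x) != ref)
            ++bad;
    }
    return bad;
}
```

Calling unorm16_mismatches() sweeps all 65536 inputs and returns how many disagree with the division. This only covers the fused case; the plain-float version without FMA rounds twice and would need its own sweep.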