The Beach Boys song about Constructive Cost Model is pretty catchy.
It will also eliminate the annoying zero case that clz doesn't like.
Thinking of revisiting the idea of changing decomposition of significand for SIMD (to get 16 digits that can be processed in parallel) by splitting the least-significant digit instead of the most-significant one. This should allow getting rid of an extra multiplication.
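A rough sketch of the two ways to split a 17-digit significand, just to make the idea concrete. The names (`split_msd`, `split_lsd`, `split_most_significant`, `split_least_significant`) are made up for illustration and are not taken from the zmij source; the SIMD digit conversion itself is not shown, and this sketch does not demonstrate the saved multiplication, only the two decompositions.

```cpp
#include <cstdint>

// Hypothetical result types, for illustration only.
struct split_msd { uint32_t head; uint64_t tail16; };   // 1 digit + 16 digits
struct split_lsd { uint64_t head16; uint32_t last; };   // 16 digits + 1 digit

constexpr uint64_t pow10_16 = 10'000'000'000'000'000ULL;  // 10^16

// Peel off the most-significant digit; the remaining 16 digits
// would go to the parallel (SIMD) digit conversion.
constexpr split_msd split_most_significant(uint64_t sig) {
  return {static_cast<uint32_t>(sig / pow10_16), sig % pow10_16};
}

// Peel off the least-significant digit instead; the leading 16 digits
// would go to the parallel (SIMD) digit conversion.
constexpr split_lsd split_least_significant(uint64_t sig) {
  return {sig / 10, static_cast<uint32_t>(sig % 10)};
}
```

Either way one digit is handled separately and the other 16 form a value below 10^16; assuming `sig` has at most 17 decimal digits, both splits are lossless.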
At least that's what happened in my last few interactions with it.
Maybe it's good that Claude is so dumb, because while explaining things to it you understand the problem better yourself.
Another 8% perf improvement for fixed formatting in Żmij on macOS
https://github.com/vitaut/zmij/pull/114
Improvement on NEON path of writing fixed doubles by Antares0982 · Pull Request #114 · vitaut/zmij
This is a continuation of #110.
The only tricky part is that we have to reverse the hi-lo of ffgghhii_bbccddee_64, because the original order of unshuffled bytes cannot be used to calculate length...