Does anybody know why GCC and Clang can't optimize this byte-by-byte load with shifts into a single 16-bit load on most architectures?
uint16_t test(uint16_t *buf) {
    return (uint16_t)((char *)buf)[0] | ((uint16_t)((char *)buf)[1] << 8);
}
I tested it on godbolt.org, and it only really produced good code on PowerPC.
On any little-endian system this should be equivalent to just *buf, which compiles to much shorter assembly.