Saw this when randomly flipping through the code base of something posted to Lobsters. C or C++ is not the language for you if you think you can write code like that. https://c.godbolt.org/z/8q5zoh6Ke
Compiler Explorer - C (x86-64 gcc 15.2)

// https://github.com/abdimoallim/jit
#include <stddef.h>
#include <stdint.h>

typedef int32_t i32;
typedef uint8_t u8;
typedef size_t usz;

typedef struct {
    u8* buf;
    usz len;
} jit_buf;

// stubbed out for the codegen comparison
void jit_ensure(jit_buf* j, usz n) {}

// bad: every u8 store may alias *j, so j->buf and j->len are reloaded each time
void jit_emit_i32(jit_buf* j, i32 v) {
    jit_ensure(j, 4);
    j->buf[j->len + 0] = (u8)(v);
    j->buf[j->len + 1] = (u8)(v >> 8);
    j->buf[j->len + 2] = (u8)(v >> 16);
    j->buf[j->len + 3] = (u8)(v >> 24);
    j->len += 4;
}

// good: fields are loaded once into locals and len is stored back once
void jit_emit_i32_fixed(jit_buf* j, i32 v) {
    jit_ensure(j, 4);
    u8* buf = j->buf;
    usz len = j->len;
    buf[len + 0] = (u8)(v);
    buf[len + 1] = (u8)(v >> 8);
    buf[len + 2] = (u8)(v >> 16);
    buf[len + 3] = (u8)(v >> 24);
    j->len = len + 4;
}

int main() {}

There's a certain systems language with pervasive noalias guarantees where you can write code like that but C is assuredly not that language.
@pervognsen eh in that language every [] is a branch so it’s not magically better
@zeux @pervognsen You can use get_unchecked.
@foonathan @pervognsen In both cases, the generated code for the naive version is suboptimal and changing the code to get optimal codegen isn’t too hard.
@pervognsen Isn't this one of the cases that strict aliasing optimizations are supposed to enable?
@nothings Not in this case. A store via a char pointer is expressly allowed to alias any type.
@pervognsen Oh right, of course. For some reason the page loaded with the struct scrolled off the top so I didn't see the types, and I didn't really look close enough to notice the casts.
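
To make the char-pointer rule above concrete, here is a minimal sketch. The helper names (store_then_load, aliased_case, disjoint_case) are made up for illustration and are not from the linked repo, and it assumes uint8_t is a character type (unsigned char), as it is on mainstream targets. A byte store through a u8 pointer can legally change a size_t, so the compiler has to reload the size_t afterwards.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;
typedef size_t usz;

// Hypothetical helper: writes *len, stores one byte through p, reads *len back.
usz store_then_load(usz *len, u8 *p) {
    *len = 1;
    *p = 2;        // u8 is assumed to be a character type, so this may modify *len
    return *len;   // must be reloaded; the compiler cannot assume it is still 1
}

// p points at the first byte of len itself: the byte store changes len.
usz aliased_case(void) {
    usz len = 0;
    return store_then_load(&len, (u8 *)&len);
}

// p points at an unrelated byte: len keeps the value written to it.
usz disjoint_case(void) {
    usz len = 0;
    u8 byte = 0;
    return store_then_load(&len, &byte);
}
```

In the aliased case the exact result depends on endianness, but it is never 1, which is the reason the "bad" version keeps reloading j->buf and j->len after each store.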

@pervognsen @nothings Both GCC & clang produce the desired code if 'restrict' is added to the jit_buf pointer. That seems "wrong" to me.

https://c.godbolt.org/z/6sGsh7xnY

Compiler Explorer - C

// https://github.com/abdimoallim/jit
#include <stddef.h>
#include <stdint.h>

typedef int32_t i32;
typedef uint8_t u8;
typedef size_t usz;

typedef struct {
    u8* buf;
    usz len;
} jit_buf;

// stubbed out for the codegen comparison
void jit_ensure(jit_buf* j, usz n) {}

// bad
void jit_emit_i32(jit_buf* j, i32 v) {
    jit_ensure(j, 4);
    j->buf[j->len + 0] = (u8)(v);
    j->buf[j->len + 1] = (u8)(v >> 8);
    j->buf[j->len + 2] = (u8)(v >> 16);
    j->buf[j->len + 3] = (u8)(v >> 24);
    j->len += 4;
}

// seems wrong to me
void jit_emit_i32_humm(jit_buf* restrict j, i32 v) {
    jit_ensure(j, 4);
    j->buf[j->len + 0] = (u8)(v);
    j->buf[j->len + 1] = (u8)(v >> 8);
    j->buf[j->len + 2] = (u8)(v >> 16);
    j->buf[j->len + 3] = (u8)(v >> 24);
    j->len += 4;
}

// good
void jit_emit_i32_fixed(jit_buf* j, i32 v) {
    jit_ensure(j, 4);
    u8* buf = j->buf;
    usz len = j->len;
    buf[len + 0] = (u8)(v);
    buf[len + 1] = (u8)(v >> 8);
    buf[len + 2] = (u8)(v >> 16);
    buf[len + 3] = (u8)(v >> 24);
    j->len = len + 4;
}

int main() {}

@mbr @nothings No? That's as expected.
@pervognsen @mbr
I'm not sure what can look suspicious here
@amonakov @pervognsen It's my brain today that's suspicious.
@pervognsen @mbr The char pointer inside the struct "inherits" the restrict qualifier from the pointer to the struct? I didn't realize that (I think I've never actually used restrict.)
@nothings @pervognsen @mbr the char buf doesn't alias with the struct because the pointer to the struct is restricted.
@pkhuong @pervognsen Ah right, of course, that makes sense. Thanks!
@nothings @mbr No, that's not exactly how it works. It's declaring that this jit_buf *restrict j pointer (and anything derived from it) is the only way to access the pointee object in this scope (and it's UB if you then violate that). "Derived" means things like (j+1)-1, &j->buf, &j->len, but not a pointer like j->buf which just happens to be the value of a field in j. So, this 'restrict' example would optimize just as well if you pass in char *buf as a separate param to the function.
@nothings @mbr But yeah, it's what Paul said, I'm just trying to clarify why the j->buf pointer isn't considered derived from j in the way that &j->buf is when it comes to aliasing.
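
A rough sketch of the "derived" distinction described above, in case it helps. The function name emit_i32_restrict is invented for this example, and the caller-side contract in the comments is the assumption that makes the restrict qualifier valid: the byte storage must not overlap the jit_buf struct itself.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t i32;
typedef uint8_t u8;
typedef size_t usz;
typedef struct { u8 *buf; usz len; } jit_buf;

// Hypothetical variant: restrict promises that, inside this function, *j is
// only accessed through j or pointers derived from it (&j->buf, &j->len).
// j->buf is a value loaded from a field, not derived from j, so stores
// through it are assumed not to touch *j. Breaking that promise is UB.
// Assumes the caller has already made room for 4 bytes.
void emit_i32_restrict(jit_buf *restrict j, i32 v) {
    uint32_t u = (uint32_t)v;        // shift an unsigned copy of the value
    j->buf[j->len + 0] = (u8)(u);
    j->buf[j->len + 1] = (u8)(u >> 8);
    j->buf[j->len + 2] = (u8)(u >> 16);
    j->buf[j->len + 3] = (u8)(u >> 24);
    j->len += 4;
}
```

Calling it with a jit_buf whose buf points at a separate byte array satisfies the contract; pointing buf back at the struct would not.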
@nothings @mbr FWIW, despite the fact that 'restrict' has been in C since C99, after rustc started actually relying on LLVM's implementation of 'noalias' (which is what 'restrict' is compiled down to with clang) it took literally years of fixing the long tail of LLVM bugs due to the fact that approximately no-one uses it in C.
@pervognsen
@nothings @mbr
I think a couple years ago, restrict started appearing all over glibc manpages - could this be related?
@wolf480pl @nothings @mbr Dunno. I'm curious how bug-free gcc's implementation of 'restrict' will turn out to be. Until rustc_codegen_gcc is further along I doubt we'll know for sure. I think the experience with LLVM suggests that there aren't enough C use cases in the wild to flush out bugs via C compilers.
@pervognsen @nothings @mbr I think this is also partially related to C++ not supporting it and thus AFAIK C library headers using it are not compatible with C++
@antopatriarca @pervognsen @nothings @mbr tbf, most libraries would wrap the restrict in a macro with a compiler specific keyword. Some popular C compiler feature not being standardized never was a showstopper ;)

@pervognsen Generally agree, but the buf pointee is uint8_t, so technically shouldn't strict aliasing apply here and guarantee that buf cannot alias j->len?

That said, this whole situation is clearly a mess in C++ and clearly much better in Rust.

@nh I'm not trying to promote anything else, I'm just saying _if_ you want to write code like that then C or C++ isn't the right language, i.e. you need to be explicit about loads and stores and understand when they're happening (especially bad with C++ member variables).
@nh Regarding uint8_t I don't know what the standard says but I've never seen a compiler that doesn't optimize uint8_t and friends the same way as char when it comes to the special type-based aliasing rules for char pointer reads/writes.
@nh @pervognsen iirc char can alias with anything and depending on the platform uint8_t may be char, so this potentially still aliases on some platforms?

@bas @nh @pervognsen realistically uint8_t can't be anything but unsigned char in C on modern platforms, no?

even attempting to side-step the issue with
typedef int i8 __attribute__((mode(QI)));
won't help, the resulting type still aliases like char in both Clang and GCC

(however, in C++20 char8_t is a distinct type and does not alias everything)
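
For anyone who wants to check the uint8_t question on their own target, a small C11 sketch. The function name u8_is_unsigned_char is made up here; it only reports whether uint8_t is the same type as unsigned char and says nothing about what a given optimizer does with it.

```c
#include <stdint.h>

// Hypothetical check: returns 1 if uint8_t is the very same type as
// unsigned char on this target, 0 if it is some other 8-bit type.
int u8_is_unsigned_char(void) {
    return _Generic((uint8_t)0, unsigned char: 1, default: 0);
}
```

On mainstream platforms this is expected to return 1, in which case a uint8_t store gets exactly the character-type aliasing treatment discussed above.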

@pervognsen -fstrict-aliasing doesn't help...
@tekknolagi It's on by default starting at -O2. It doesn't matter here because of the char pointer type-based aliasing rules.
Maybe even better? c.godbolt.org/z/Ef9hbqMhE, since it clarifies the intent (and saves one whole move!)
Compiler Explorer - C (x86-64 gcc 15.2)

// https://github.com/abdimoallim/jit
#include <stddef.h>
#include <stdint.h>

typedef int32_t i32;
typedef uint8_t u8;
typedef size_t usz;

typedef struct {
    u8* buf;
    usz len;
} jit_buf;

// stubbed out for the codegen comparison
void jit_ensure(jit_buf* j, usz n) {}

// bad
void jit_emit_i32(jit_buf* j, i32 v) {
    jit_ensure(j, 4);
    j->buf[j->len + 0] = (u8)(v);
    j->buf[j->len + 1] = (u8)(v >> 8);
    j->buf[j->len + 2] = (u8)(v >> 16);
    j->buf[j->len + 3] = (u8)(v >> 24);
    j->len += 4;
}

// good
void jit_emit_i32_fixed(jit_buf* j, i32 v) {
    jit_ensure(j, 4);
    u8* buf = j->buf;
    usz len = j->len;
    buf[len + 0] = (u8)(v);
    buf[len + 1] = (u8)(v >> 8);
    buf[len + 2] = (u8)(v >> 16);
    buf[len + 3] = (u8)(v >> 24);
    j->len = len + 4;
}

// restrict on a separately passed buf pointer
#define jit_emit_i32(_j, _v) jit_emit_i32_restrict((_j), (_j)->buf, (_v))
void jit_emit_i32_restrict(jit_buf* j, u8 * restrict buf, i32 v) {
    jit_ensure(j, 4);
    buf[j->len + 0] = (u8)(v);
    buf[j->len + 1] = (u8)(v >> 8);
    buf[j->len + 2] = (u8)(v >> 16);
    buf[j->len + 3] = (u8)(v >> 24);
    j->len += 4;
}

int main() {}

@namandixit.net Yeah, restrict/noalias works. The problem is that it isn't practical to apply this all over a code base (C programmers aren't used to thinking about restrict/noalias semantics and it's yet another UB footgun), so you need to have good default habits built into your C coding style, e.g. "don't rely on load/store elimination except for non-escaping locals" is a good place to start.
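
One way the "non-escaping locals" habit might look when applied to a loop. emit_bytes is a hypothetical helper, not something from the linked repo, and it assumes the caller has already ensured there is room for n more bytes.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;
typedef size_t usz;
typedef struct { u8 *buf; usz len; } jit_buf;

// Hypothetical helper: appends n bytes from src to the buffer.
// The struct fields are loaded once into locals whose addresses never
// escape, so no store in the loop can force them to be reloaded.
void emit_bytes(jit_buf *j, const u8 *src, usz n) {
    u8 *buf = j->buf;   // one explicit load per field
    usz len = j->len;
    for (usz i = 0; i < n; i++) {
        buf[len + i] = src[i];
    }
    j->len = len + n;   // one explicit store at the end
}
```

The loads and stores that touch the struct are now visible in the source, so the result no longer depends on what the optimizer can prove about aliasing.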
@pervognsen I forget how bad it can be :') I need to post this for our juniors (and potentially seniors)