something that worries me is padding in hashmap key types. a while back i got a nasty surprise when i discovered that clang optimizes copy operations to skip the unused bytes in structs, leaving the padding with uninitialized memory and thus messing up hashes.

#devlog #C

@lritter is it possible to put all hashmap related functions in the same translation unit and compile the object file with a different optimization level than the rest of the project before linking?
That sounds like a pain to manage but I'm curious if that would work

@greenmoonmoon that wouldn't be the level at which i would counteract it, as it's quite fragile to maintain.

better would be to explicitly memclear a buffer then init the fields in it - before passing the key to any hashmap function. at least that's my plan when i encounter this again.
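
A minimal sketch of that plan in C, under assumptions: `demo_key`, `demo_key_init` and `demo_hash_bytes` are hypothetical names, and FNV-1a stands in for whatever hash the hashmap really uses. The key is cleared and filled in place through a pointer, so no struct copy is involved. Note that the C standard still leaves padding bytes unspecified after a member store, so this is best-effort rather than guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key type: typical ABIs put 3 padding bytes after `tag`. */
typedef struct {
    uint8_t  tag;
    uint32_t id;
} demo_key;

/* Zero the whole object first, then assign the fields in place. */
static void demo_key_init(demo_key *out, uint8_t tag, uint32_t id)
{
    memset(out, 0, sizeof *out);
    out->tag = tag;
    out->id  = id;
}

/* FNV-1a over the raw object bytes, padding included (illustrative hash). */
static uint64_t demo_hash_bytes(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Usage sketch: keys built the same way are expected to hash the same. */
static void demo_key_check(void)
{
    demo_key a, b;
    demo_key_init(&a, 7, 1234u);
    demo_key_init(&b, 7, 1234u);
    assert(demo_hash_bytes(&a, sizeof a) == demo_hash_bytes(&b, sizeof b));
}
```

Hashing right after `demo_key_init`, before the key is copied or passed by value, keeps the window in which padding could change as small as possible.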

@greenmoonmoon personally i think it's crazy¹ not to clear memory on allocation. speedfreaks cut too many corners.

¹ inviting trouble

@lritter @greenmoonmoon I bet the compiler will still find a way to mess up your padding bytes. The only safe way is packed structs (not standardized though, but all C/C++ compilers support them).
@floooh @lritter @greenmoonmoon yep, explicitly clearing to zero won't clear padding bytes either.

@dotstdy @floooh @lritter @greenmoonmoon There is the GCC extension __builtin_clear_padding: https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fclear_005fpadding

But as soon as you move or modify the objects, the padding bits may change again. So hash it immediately after clearing the padding bits.

Unfortunately, GCC is the only compiler that has it AFAIK.
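
A hedged sketch of how the builtin might be wrapped in C. `demo_key` and `demo_clear_padding` are hypothetical names; the `__has_builtin` guard is an assumption about how one would detect support, so on compilers without the builtin the helper just reports failure instead of failing to compile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compilers without __has_builtin are treated as not supporting the builtin. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical key type with padding between `tag` and `id`. */
typedef struct {
    uint8_t  tag;
    uint32_t id;
} demo_key;

/* Returns 1 if the padding bytes were zeroed, 0 if the builtin is missing. */
static int demo_clear_padding(demo_key *k)
{
#if __has_builtin(__builtin_clear_padding)
    __builtin_clear_padding(k);
    return 1;
#else
    (void)k;
    return 0;
#endif
}
```

A caller would fill in the fields, call `demo_clear_padding(&key)`, and hash the bytes straight away; a return value of 0 means some other strategy is needed on that compiler.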

@foonathan @dotstdy @lritter @greenmoonmoon hmm interesting... apparently MSVC has a similar `__builtin_zero_non_value_bits`.

LLVM seems to be missing something similar so far though (https://github.com/llvm/llvm-project/issues/64830)... seems to be in progress though:

https://github.com/llvm/llvm-project/actions/runs/19788158702

@floooh @lritter @greenmoonmoon Yeah, surrounding the structs you want this on with:

#pragma pack(push, 1)

// structs

#pragma pack(pop)

That's right, right? Been a while since I learnt about that.
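
For reference, a small sketch of that pattern with compile-time checks, using hypothetical struct names. The `push`/`pop` form of the pragma is accepted by GCC, Clang and MSVC; the asserts assume a platform where `uint32_t` normally has an alignment greater than 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: padding is inserted so that `id` is aligned. */
typedef struct {
    uint8_t  tag;
    uint32_t id;
} demo_plain;

#pragma pack(push, 1)
/* Packed layout: members are placed back to back, no padding bytes. */
typedef struct {
    uint8_t  tag;
    uint32_t id;
} demo_packed;
#pragma pack(pop)

/* With packing, the size is exactly the sum of the member sizes. */
static_assert(sizeof(demo_packed) == sizeof(uint8_t) + sizeof(uint32_t),
              "packed struct should have no padding");
static_assert(offsetof(demo_packed, id) == 1,
              "id should directly follow tag");
```

The price is that `id` is now misaligned, so it should only be accessed through the struct or copied out with `memcpy`.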

@miblo @lritter @greenmoonmoon ah right, that actually works across gcc/clang and msvc, alignment used to be compiler-specific, but at least in C it has been standardized since C11 (don't know about C++).
@floooh @miblo @greenmoonmoon i need a method that doesn't come out of a preprocessor directive, because i set up the relevant structs by macro already
@lritter @floooh @miblo @greenmoonmoon build with -Wpadded.

@pkhuong @floooh @miblo @greenmoonmoon these are not manually set up structs - they appear through the nudl codegen.

so i would have to ban unaligned table column configurations, and the only reason i could state for this is "because i am incompetent".

if i have no other course of action, i'll have to rewrite the relevant bits of stb_ds to use my macro type dispatch stuff. it's ok. just more work. 🫥

@lritter @pkhuong @miblo @greenmoonmoon for code-generated structs in the sokol shader compiler (to map GLSL structs with 'custom alignment' to C structs) I just use explicit padding bytes inside packed+aligned struct declarations... e.g. like this:
@floooh @pkhuong @miblo @greenmoonmoon it's easier for me to modify stb_ds than to add size/alignment knowledge to the compiler frontend (so far i have managed to avoid this).
@lritter @floooh @miblo @greenmoonmoon as long as you're OK with 0-sized arrays, I'm pretty sure you could have the C compiler figure out how many padding bytes you need after each field (add sizeof, mod alignof).
@pkhuong i don't understand

@lritter Let's say you have struct { int x; char y; double z; };

You'd have

```
enum
{
    /* alignof needs <stdalign.h> in C11; each value is the padding after that field */
    after_x = (alignof(char) - sizeof(int) % alignof(char)) % alignof(char),
    after_y = (alignof(double) - (sizeof(int) + after_x + sizeof(char)) % alignof(double)) % alignof(double),
    ...
};
```

and tail padding would use `alignof(union { one-of-every-type; })`.

and instead generate `struct {int x; char padding_after_x[after_x]; char y; char padding_after_y[after_y]; double z; char tail_padding[tail];}`.
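
Spelled out as a compilable sketch for the three-field example: the `PAD_TO` macro and the `demo_` names are assumptions added for illustration, and the zero-length arrays (here `after_x` and `tail` come out as 0) rely on the GNU extension that GCC and Clang accept.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <string.h>

/* Bytes needed at `offset` to reach the next multiple of `align`. */
#define PAD_TO(offset, align) (((align) - (offset) % (align)) % (align))

/* One member of every field type, so its alignment is the strictest one. */
union demo_all { int x; char y; double z; };

enum {
    after_x = PAD_TO(sizeof(int), alignof(char)),
    after_y = PAD_TO(sizeof(int) + after_x + sizeof(char), alignof(double)),
    tail    = PAD_TO(sizeof(int) + after_x + sizeof(char) + after_y + sizeof(double),
                     alignof(union demo_all))
};

/* What the compiler lays out on its own, with implicit padding. */
struct demo_plain { int x; char y; double z; };

/* Same layout, but every padding byte is a named member. */
struct demo_padded {
    int    x; char padding_after_x[after_x]; /* may be zero-length (extension) */
    char   y; char padding_after_y[after_y];
    double z; char tail_padding[tail];
};

static_assert(sizeof(struct demo_padded) == sizeof(struct demo_plain),
              "explicit padding should reproduce the natural size");
static_assert(offsetof(struct demo_padded, z) == offsetof(struct demo_plain, z),
              "z should land at the same offset");
```

Because the padding is made of real members, an initializer such as `struct demo_padded k = { .x = 1, .y = 'k', .z = 2.5 };` zero-initializes it along with every other omitted member, which is what makes byte-wise hashing and `memcmp` predictable.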

@pkhuong hm this is indeed possible.
@lritter @pkhuong @miblo @greenmoonmoon the SOKOL_SHDC_ALIGN macro looks like this (I think that can be unified in more recent C versions):
@floooh @lritter @greenmoonmoon Packed structs are almost never "safe" in any way.
As soon as you pass a reference or pointer to a packed and non-aligned member in that struct, you are in Undefined Behaviour land, because non-aligned objects must not exist, and the compiler will break that basically immediately. The non-alignedness is not encoded in the type system, neither for #pragma pack, nor for __attribute__((packed)).
A simple call to std::min is enough to break things.
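
A C sketch of the hazard and one common way around it, with hypothetical names throughout: instead of forming a `uint32_t *` to the misaligned member, the accessors copy its bytes with `memcpy`, which has no alignment requirement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
/* `id` sits at offset 1, so it is misaligned for a uint32_t. */
typedef struct {
    uint8_t  tag;
    uint32_t id;
} demo_packed;
#pragma pack(pop)

/* Risky: `uint32_t *p = &rec->id;` produces a pointer whose type promises
   an alignment the object does not have. The accessors below avoid that. */

/* Read the member by copying its bytes out of the struct. */
static uint32_t demo_packed_get_id(const demo_packed *rec)
{
    uint32_t v;
    memcpy(&v, (const unsigned char *)rec + offsetof(demo_packed, id), sizeof v);
    return v;
}

/* Write the member by copying bytes into the struct. */
static void demo_packed_set_id(demo_packed *rec, uint32_t v)
{
    memcpy((unsigned char *)rec + offsetof(demo_packed, id), &v, sizeof v);
}
```

Direct member access like `rec->id` is also fine, since the compiler knows the struct is packed; the trouble starts only once the member's address escapes as a plain pointer or reference.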

@manx @lritter @greenmoonmoon it should be safe when each "regular" struct member is still properly aligned (e.g. when using packed structs with "manually defined" padding, like below).

(it gets interesting though when the GPU alignment rules differ from the C alignment rules)

@floooh @lritter @greenmoonmoon At that point, #pragma pack is kind of pointless, because it does nothing. You have already filled all the holes that packing would eliminate.
If you want to model alignment that does not match the language alignment rules precisely (i.e. GPU, or file formats), I would strongly suggest doing what boost::endian does: Use distinct types that are naturally byte aligned (i.e. arrays of bytes), and (in C) accessor functions or (in C++) implicit conversions.
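
In C, the byte-array approach could look roughly like this. It is a sketch in the spirit of that suggestion, not boost::endian's API: `demo_u32le` and its accessors are made-up names, and little-endian storage is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* A 32-bit value stored as four bytes: alignment 1, no padding needed. */
typedef struct {
    uint8_t b[4];
} demo_u32le;

/* Encode a native integer into little-endian bytes. */
static demo_u32le demo_u32le_make(uint32_t v)
{
    demo_u32le s;
    s.b[0] = (uint8_t)(v & 0xFFu);
    s.b[1] = (uint8_t)((v >> 8) & 0xFFu);
    s.b[2] = (uint8_t)((v >> 16) & 0xFFu);
    s.b[3] = (uint8_t)((v >> 24) & 0xFFu);
    return s;
}

/* Decode the bytes back into a native integer. */
static uint32_t demo_u32le_get(demo_u32le s)
{
    return (uint32_t)s.b[0]
         | ((uint32_t)s.b[1] << 8)
         | ((uint32_t)s.b[2] << 16)
         | ((uint32_t)s.b[3] << 24);
}

/* A record built only from byte-aligned types has no implicit padding. */
typedef struct {
    uint8_t    tag;
    demo_u32le id;
} demo_record;

static_assert(sizeof(demo_record) == 5, "no padding expected");
static_assert(alignof(demo_record) == 1, "byte alignment expected");
```

No `#pragma pack` is involved, and taking the address of `id` is harmless because its type only ever requires byte alignment.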

@manx @lritter @greenmoonmoon

> At that point, #pragma pack is kind of pointless, because it does nothing.

...the whole point of the thread is to make the content of padding bytes explicitly defined, and this is one way to do it (the other solution that came up is to use a compiler-specific builtin to zero the padding bytes, not yet supported by clang though)

@manx @lritter @greenmoonmoon tbf though, in my screenshot the explicit packing/alignment/padding has a different purpose: to make those structs compatible with GPU-side structs (e.g. those *must* be 16-byte aligned even if the C struct would only require 4-byte alignment), and there are some subtle differences in alignment rules for arrays of float3 vectors (those need 16-byte stride instead of 12). In any case, the resulting member alignments should be greater than what C requires.
@floooh @lritter @greenmoonmoon Yes, but that does not require #pragma pack at all. Just the explicit padding members are enough.
@floooh @lritter @greenmoonmoon In C++, you can actually check whether there are any implicit padding bits in a struct with std::has_unique_object_representations (i.e. <https://godbolt.org/z/ea45r6773>).
std::is_standard_layout is probably also useful in related situations.

struct foo { std::uint8_t x; std::uint32_t y; }; static_assert(!std::has_unique_object_representations<foo>::value); struct bar { std::uint8_t x; std::uint8_t padding1[3]; std::uint32_t y; }; static_assert(std::has_unique_object_representations<bar>::value);

@lritter @greenmoonmoon If you modify the struct, the padding might change again (e.g. because the compiler promoted a 16 bit store of a field to a 32 bit store of a field and the neighboring padding).

@lritter It's not just Clang. Padding bytes are a 'no-go-area' for C and C++ compilers.

One area where I've seen padding bytes being lost was calling a function which takes a struct by value (and I seem to remember not just for small structs where the struct content is passed in registers), or even just assigning one struct to another without using memcpy (but tbh I wouldn't even trust memset or memcpy, since these are essentially builtins today).

@lritter PS: I think this is the relevant part from the C standard (via https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf):