wait hang on. why does ECS always use bare arrays for holding structs? that makes the bookkeeping way more difficult. wouldn't it be easier to use a simple array based hash table, so that every entity can simply store their data using their ID as the key?

that way iterating over all the values in the table is still fast and cache coherent (it's still just an array) but all the bookkeeping becomes way less complex. no more need for archetypes, just have one hash table per struct and if an entity doesn't have that component, it's as simple as just not adding it to the table

... am i missing something here? O_o i feel like i'm missing something obvious

hmmm i guess if you have a system that operates on multiple components then the access for any but the first is gonna be random, so that sucks

@eniko can have a linked hashmap, basically normal hashmap but entry is kept with a pointer to next element

uses more memory, but has reliable iteration order

@eniko Oh I love the Enhanced Chip Set
Amiga Enhanced Chip Set - Wikipedia

@eniko I think that is how ECS often stores structs in a few implementations.
@boggo oh i thought they all did the thing where they have one tightly packed struct of arrays per archetype (combination of components)
@eniko Well that definitely sounds more efficient to me as even if you just used a simple array hash table you'd eventually have quite a bit of fragmentation without having a really big table. But you can probably still use some sort of indirection table. I'm far from an expert on it though.
@eniko i recall reading how python's dict works, and i think hash tables are much the same. It takes vastly more memory than is in the dict bec they leave space for inserts to prevent reordering. I can see that being bad for tight memory to cache use, hence arrays.
Take with pinch of 🧂
@eniko That's how I've been implementing it in my little toy engine. I have an array based hashmap per component where the entity IDs are the keys, along with bitsets so I can logical OR and get which entities have multiple components.

@eniko No, it really is that simple. In the most "extreme" case you can do away with the concept of entity entirely and just make it an ID and make sure the components belonging to an entity are all tagged with the correct ID.

I've thought of using a sparse array to keep the cache locality and then just when looping over the components check if it's "alive" or not and skipping "dead" components. It'd probably benefit from periodically defragging the array and packing the alive components closer. Keeping an alive and dead list would probably be better than that though.

@eniko the hash into an array approach results in array fragmentation and loading a much larger array, much of which is empty.

Is that particular thing *absolutely necessary* for the degree of memory coherence you need? Probably not in a lot of cases, but I feel like a lot of ECS implementations maximize memory contiguity (no way that's a real word) *just in case*.