Working on an alternative std::vector implementation that's not contiguous and can be the base for my ankerl::unordered_dense::map. https://github.com/martinus/unordered_dense

The big advantage is it doesn't have allocation spikes. Surprisingly, insertion is faster because no elements have to be moved around #cpp

GitHub - martinus/unordered_dense: A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion

@martinus What's your implementation strategy? I'm using a segmented vector that uses increasing powers of two as block sizes, for the ability to safely store pointers to elements.

@foonathan I tried that too, even with CLZ intrinsics to find the index. For some reason that was still not as fast as this approach:

I allocate fixed-size segments, e.g. 4096 bytes (it's a template argument), and make sure that exactly 2^x elements fit into each one. The segments are stored in a std::vector. Indexing then becomes

return m_segments[idx >> num_bits][idx & mask];

where num_bits and mask are constexpr members.

@martinus Ah right, I've abandoned contiguous indices and use num_bits == 21 (2 MiB) regardless of actual block size.
@foonathan I might change my implementation so that it behaves like std::vector until the block size is reached (doubling size and moving elements), and then switches to always allocating same-size segments. Then big segment sizes aren't a problem even when a container doesn't need to hold many elements, and indexing stays the same speed
@martinus Wait, if you don't actually care about address stability, why have a segmented implementation? push_front?
@foonathan I don't want to have the memory allocation spikes when the vector gets full and relocates
@martinus Ah, I see. Yeah, then I reckon a hybrid approach might work.

> @foonathan Ah right, I've abandoned contiguous indices and use num_bits == 21 (2 MiB) regardless of actual block size.

I don't think I understand this. What I did previously was basically implement this here: https://plflib.org/matt_bentley_-_pot_blocks_bit_tricks.pdf

That way you can have contiguous indices. Finding the index of the actual bucket is more or less a CLZ operation, and finding the index within the bucket means removing the highest set bit

@martinus Yes, I had that before as well, but I also found it too slow. Since I don't need contiguous indices I basically treat every block as if it were 2 MiB - so the indices are something like 0, 1, 2, 3, 2097152, ...
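As I understand that scheme, it could be sketched like this (hypothetical names; every block gets a 2^21 index range whether or not it is full):

```cpp
#include <cassert>
#include <cstddef>

// Sketch of non-contiguous indexing: every block is treated as if it
// held 2^21 elements, so indices jump ahead at each block boundary.
// (Illustrative names, not the actual implementation.)
constexpr std::size_t block_bits = 21;
constexpr std::size_t block_mask = (std::size_t{1} << block_bits) - 1;

constexpr std::size_t make_index(std::size_t block, std::size_t offset) {
    return (block << block_bits) | offset;
}
constexpr std::size_t block_of(std::size_t idx) { return idx >> block_bits; }
constexpr std::size_t offset_of(std::size_t idx) { return idx & block_mask; }
```

With this, the first element of block 1 gets index 2097152, matching the 0, 1, 2, 3, 2097152, ... sequence above.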
@foonathan Ah, thanks for explaining, that's a neat trick! I didn't think about just skipping indices
@martinus It is weird, especially when you're debugging. You normally don't see numbers that big in a debugging session for a small example... :D