I have finally understood a key part of the encoding of #quicktake 150 images, and that gives me hope of finally shrinking this bit buffer to 8 bits.
I have already failed three times, and only managed to get it down from 32 to 16 bits.
This thing is not just Huffman encoding: in one situation it reads bits without consuming some of them (or pushes some back). This is weird, and it forces wild left-and-right shifting in all cases, which is vastly suboptimal.
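
For illustration, a classic MSB-first bit reader with an 8-bit buffer looks something like this (a sketch of the general technique, not the actual decoder; all names are made up):

#include <stdint.h>

static const uint8_t *src;  /* compressed input stream */
static uint8_t bit_buf;     /* the 8-bit buffer */
static uint8_t bits_left;   /* valid bits remaining in bit_buf */

/* Read n bits (1..8), MSB first. */
static uint8_t getbits(uint8_t n)
{
    uint8_t out = 0;
    while (n--) {
        if (bits_left == 0) {  /* refill one byte at a time */
            bit_buf = *src++;
            bits_left = 8;
        }
        out = (out << 1) | (bit_buf >> 7);  /* take the top bit */
        bit_buf <<= 1;
        bits_left--;
    }
    return out;
}

A reader like this only ever consumes bits forward, which is what makes a one-byte buffer sufficient; having to re-read or push back bits forces a wider window.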
#Retrocomputing #apple2
I did not sleep enough, but this is finally done, and it was fully worth it: a solid 10 seconds off the decoding time, down to 110s now!
#RetroComputing
Holy hell, it does not do that; I was misled by a bug I introduced a LONG time ago. Well, that's gonna simplify things!

So the bug I had introduced was that I had simplified this dcraw code incorrectly:

s = 3;
for (c = 0; c < 256; c++) {
  /* dcraw huff entries: high byte = code length (8-s bits),
     low byte = decoded value; I only keep the low byte here */
  uint16_t tmp = (8-s) << 8 | c >> s << s | 1 << (s-1);
  huff[18][c] = tmp & 0xFF;
}

I originally replaced it with tmp = 1284 | c, but it should have been tmp = (c << 3) + 4. With s = 3, the original expression reduces to 1284 | (c & 0xF8), so the stored low byte is (c & 0xF8) | 4; my version wrongly kept the low three bits that c >> s << s is supposed to drop.
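
A quick host-side check of that reduction (my own sanity check, not part of the decoder; the last assertion assumes the rewritten table is indexed by the 5-bit prefix c >> 3 rather than the full byte):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const int s = 3;
    for (int c = 0; c < 256; c++) {
        uint16_t tmp = (8-s) << 8 | c >> s << s | 1 << (s-1);

        /* with s == 3 the whole expression collapses to this */
        assert(tmp == (1284 | (c & 0xF8)));

        /* the buggy version differs whenever either of the two
           lowest bits of c is set */
        if (c & 3)
            assert((1284 | c) != tmp);

        /* the corrected form reproduces the stored low byte when
           indexed by the 5-bit prefix */
        assert((((c >> 3) << 3) + 4) == (tmp & 0xFF));
    }
    puts("all identities hold");
    return 0;
}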

I have no excuse. It was so clear 🤦‍♂️

@colin_mcmillen
That sounds vaguely similar to an optimization I've used for decoding non-byte-aligned data that could be variable-length. Not necessarily Huffman codes, but it could be used there.

I count how many bits of the byte under consideration have been processed, and use (bit_count * 0x100 + input_byte) to index into a table (or multiple tables). The table entry indicates how many bits to accept from the input byte (1-8), and what output token is generated, if any.
1/

@colin_mcmillen
Then I shift the output buffer left by the number of bits accepted, OR in the accepted bits, increment the bit pointer, and advance to a new input byte if it's greater than 7. Then I store the output if a token was accepted, or use it as an input into a lookup table.

I've glossed over some details that should be apparent, but I could find a code sample if anyone wants it.

Anyhow, it doesn't sound exactly like what you've described, but maybe it's a variation on it.
2/
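
To make that concrete, here is a compilable sketch of one decoding step as I read the two posts above (my names, MSB-first bit order assumed; the toy table uses fixed 4-bit codes so every code fits within one byte, whereas a real table would be built from the actual code book and could emit partial tokens):

#include <stdint.h>
#include <stdio.h>

/* One table entry: how many bits of the current byte to accept
   (1-8), and the token produced, if any (-1 = none yet). */
typedef struct {
    uint8_t bits_accepted;
    int16_t token;
} DecodeEntry;

/* Indexed by bit_count * 0x100 + input_byte, where bit_count is how
   many bits of input_byte have already been consumed (0..7). */
static DecodeEntry decode_table[8 * 0x100];

/* Toy table: fixed 4-bit codes; the token is the 4 bits themselves,
   so only bit counts 0 and 4 are ever reached. */
static void build_toy_table(void)
{
    for (int bc = 0; bc < 8; bc += 4)
        for (int b = 0; b < 0x100; b++) {
            decode_table[bc * 0x100 + b].bits_accepted = 4;
            decode_table[bc * 0x100 + b].token = (b >> (4 - bc)) & 0x0F;
        }
}

/* One step: look up the entry, shift the output buffer left by the
   accepted bit count, OR in the accepted bits, and advance the bit
   pointer, moving to the next byte past bit 7. */
static int decode_step(const uint8_t **in, unsigned *bit_count,
                       uint32_t *out_buf)
{
    const DecodeEntry *e = &decode_table[*bit_count * 0x100 + **in];
    uint8_t fresh = (uint8_t)(**in << *bit_count); /* drop used bits */
    fresh >>= 8 - e->bits_accepted;                /* keep accepted bits */
    *out_buf = (*out_buf << e->bits_accepted) | fresh;
    *bit_count += e->bits_accepted;
    if (*bit_count > 7) {
        *bit_count -= 8;
        (*in)++;
    }
    return e->token;  /* store it, or feed it into a further table */
}

int main(void)
{
    const uint8_t data[] = { 0xAB, 0xCD };
    const uint8_t *p = data;
    unsigned bit_count = 0;
    uint32_t out_buf = 0;
    build_toy_table();
    for (int i = 0; i < 4; i++)
        printf("%X ", (unsigned) decode_step(&p, &bit_count, &out_buf));
    putchar('\n');  /* prints: A B C D */
    return 0;
}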