a decade or so ago, I was writing an H.264 decoder (I needed a custom one for stupid reasons, which of course had to do with hardware reverse engineering).

the first order of business was to implement CABAC: the final entropy coding stage of H.264 (i.e. the first layer I had to peel starting from the bitstream), a funny variant of arithmetic coding. the whole thing is quite carefully optimized to squeeze bits out of video frames by exploiting their statistics. in addition to carefully implementing the delicate core logic, I also had to copy-paste a few huge probability tables from the PDF, which of course resisted copy-paste as PDFs like to do, and I had to apply some violence until they became proper static initializers in C source code.
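to give a feel for the "delicate core logic": the heart of a CABAC-style decoder is a binary arithmetic decode step plus renormalization. this is a heavily simplified sketch, not the real thing — the probability table here is a made-up 4-entry stand-in for the spec's 64×4 rangeTabLPS, and the state machine is reduced to a simple counter, but the shape of the interval arithmetic is the same:

```c
/* minimal sketch of a CABAC-style binary arithmetic decode step.
   the table and state machine are simplified stand-ins, NOT the
   H.264 spec's rangeTabLPS / transIdx tables (those are exactly
   the huge tables that had to be transcribed from the PDF). */
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *buf;   /* input bitstream */
    size_t len, bitpos;   /* read position, in bits */
    uint32_t range;       /* current interval width (9 bits) */
    uint32_t offset;      /* value read so far from the stream */
} cabac_t;

typedef struct {
    uint8_t state;  /* probability state, 0..15 (simplified) */
    uint8_t mps;    /* current most-probable-symbol value */
} ctx_t;

static int read_bit(cabac_t *c) {
    if (c->bitpos >= c->len * 8) return 0;  /* zero-pad past end */
    int b = (c->buf[c->bitpos >> 3] >> (7 - (c->bitpos & 7))) & 1;
    c->bitpos++;
    return b;
}

static void cabac_init(cabac_t *c, const uint8_t *buf, size_t len) {
    c->buf = buf; c->len = len; c->bitpos = 0;
    c->range = 510;
    c->offset = 0;
    for (int i = 0; i < 9; i++)             /* prime the offset */
        c->offset = (c->offset << 1) | read_bit(c);
}

/* hypothetical stand-in LPS width table, indexed by two bits of
   the current range like the real one; shifted by the context
   state so a more confident context gets a narrower LPS slice */
static const uint8_t lps_tab[4] = { 64, 80, 96, 112 };

static int decode_bin(cabac_t *c, ctx_t *ctx) {
    uint32_t r_lps = lps_tab[(c->range >> 6) & 3] >> (ctx->state >> 2);
    c->range -= r_lps;
    int bin;
    if (c->offset >= c->range) {            /* LPS path */
        bin = !ctx->mps;
        c->offset -= c->range;
        c->range = r_lps;
        if (ctx->state == 0) ctx->mps = !ctx->mps;
        else ctx->state--;
    } else {                                /* MPS path */
        bin = ctx->mps;
        if (ctx->state < 15) ctx->state++;
    }
    while (c->range < 256) {                /* renormalize */
        c->range <<= 1;
        c->offset = (c->offset << 1) | read_bit(c);
    }
    return bin;
}
```

note what's conspicuously absent: a failure path. any input bits whatsoever decode into *some* sequence of bins — which is precisely why the "looks plausible" testing described below proves nothing.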

furthermore, testing such code is non-trivial: the input is, of course, completely random-looking bits. and the way bitstreams work, I’d have to implement pretty much the whole thing before I got to the interesting part.

so, a few hours later, I figured I was done with CABAC and reconstructing H.264 data structures, and pointed my new tool at some random test videos. and it worked first try! the structures my program spat out looked pretty much as expected, the transform coefficient matrices had pretty shapes and looked just as you’d expect them to, and I was quite happy with that.

and then I moved on to actually decoding the picture from the coefficients, and this time absolutely nothing worked. random garbage on screen. I spent a long time looking at my 2D transform code searching for bugs, but couldn’t find anything.

and then it hit me exactly what “entropy coding” means. I implemented something that intimately knows and exploits the statistical properties of what video transform coefficients and other structures look like, their probabilities and internal correlations, and uses that to squeeze out entropy and reconstruct it on the other end. my “looks good” testing meant absolute jack shit: I could’ve thrown /dev/urandom into the CABAC decoder instead of actual H.264 video, and it would still look like good video data at this stage until you actually tried to reconstruct the picture.

and sure enough, it turned out I fucked up transcribing some rows from the PDF around a page break or something.

10 years later, I think of this experience every time I see a vibecoded pull request, or some other manifestation of AI bullshit: all the right shape, and no substance behind it.

and people really should learn to tell the fucking difference.

@[email protected] Michael Crichton took great pains to make a similar point. In Jurassic Park, there's a lot of talk between Ian Malcolm (the mathematician) and some of the scientists at the park while they try to figure out what happened.

They took surveys of the dinosaurs' sizes, got a nice Gaussian/normal distribution, went okay, great, looks fantastic, and never looked at them again. Ian interrogates them on this: supposedly, they only hatched 'female' dinosaurs and released them into the park. But hatching occurred in stages! They'd done three staged hatches so far. And if there was no breeding in the wild, they should have seen not one well-formed Gaussian distribution of sizes, but a size graph with three peaks, each peak corresponding to one of the hatchings/releases.

The scientists saw the 'expected' graph and thought great, it looks fine, everything is good. But their expectation was not only wrong; in actuality, a well-formed normal distribution meant breeding was happening in the wild (because eggs were now hatching continuously instead of only when the staff hatched them, so dinosaurs of all sizes were appearing). They ignored the evidence because it fit their expectation, and a lot of people died because of it.

man oh man, it's almost like there's so many examples of this particular problem...