Mastodawn

ray__ 3d ago

TurboQuant: Redefining AI efficiency with extreme compression

https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/

benob 3d ago

This is the worst lay-people explanation of an AI component I have seen in a long time. It doesn't even seem AI generated.

spencerflem 3d ago

I think it is, though:

“TurboQuant, QJL, and PolarQuant are more than just practical engineering solutions; they’re fundamental algorithmic contributions backed by strong theoretical proofs. These methods don't just work well in real-world applications; they are provably efficient and operate near theoretical lower bounds.”

integralid 3d ago

I also instinctively reacted to that fragment, but at this point I think this is overreacting to a single expression. It's not just a normal thing to say in English, it's something people have been saying for a long time before LLMs existed.

nvme0n1p1 3d ago

There are tells all over the page:

> Redefining AI efficiency with extreme compression

"Redefine" is a favorite word of AI. Honestly no need to read further.

> the key-value cache, a high-speed "digital cheat sheet" that stores frequently used information under simple labels

No competent engineer would describe a cache as a "cheat sheet". Cheat sheets are static, but caches dynamically update during execution. Students don't rewrite their cheat sheets during the test, do they? LLMs love their inaccurate metaphors.

> QJL: The zero-overhead, 1-bit trick

> It reduces each resulting vector number to a single sign bit (+1 or -1). This algorithm essentially creates a high-speed shorthand that requires zero memory overhead.

Why does it keep emphasizing zero overhead? Why is storing a single bit a "trick?" Either there's currently an epidemic of algorithms that use more than one bit to store a bit, or the AI is shoving in extra plausible-sounding words to pad things out. You decide which is more likely.

It's 1:30am and I can't sleep, and I still regret wasting my time on this slop.
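
For anyone wondering what "reduces each resulting vector number to a single sign bit" could mean in practice, here is a minimal sketch of generic sign random projection. This is an assumption about the general family of technique, not the article's actual QJL algorithm, and the names `random_projection`, `sign_bits`, and `estimate_cosine` are made up for illustration. The idea is that after a random Gaussian projection, the fraction of sign bits on which two vectors disagree approximates their angle divided by pi, so only the bits need to be stored.

```python
import math
import random

def random_projection(dim_in, dim_out, seed=0):
    # Hypothetical helper: a dim_out x dim_in matrix of Gaussian entries.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim_in)] for _ in range(dim_out)]

def sign_bits(matrix, vec):
    # Project vec, then keep only the sign (+1 or -1) of each coordinate.
    return [1 if sum(m * v for m, v in zip(row, vec)) >= 0 else -1
            for row in matrix]

def estimate_cosine(bits_a, bits_b):
    # P(signs disagree) = angle / pi for Gaussian projections,
    # so the disagreement rate gives an estimate of the angle.
    disagree = sum(a != b for a, b in zip(bits_a, bits_b)) / len(bits_a)
    return math.cos(math.pi * disagree)

proj = random_projection(dim_in=8, dim_out=2048, seed=0)
a = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
b = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # true cosine with a is about 0.707
print(estimate_cosine(sign_bits(proj, a), sign_bits(proj, b)))
```

The estimate gets tighter as `dim_out` grows. The sketch says nothing about how the article's methods handle scaling or which side of the dot product is quantized.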

veunes

Looks like Google canned all their tech writers just to pivot the budget into H100s for training these very same writers