This might be huge (esp. for future #Gemini versions):
#Google introduced #TurboQuant, a new compression algorithm that shrinks the #LLM key-value cache by at least 6x and delivers up to an 8x speedup with no loss in accuracy, redefining #AI efficiency.
TurboQuant: Redefining AI efficiency with extreme compression https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/

👾