This might be huge (esp. for future #Gemini versions):

#Google introduced #TurboQuant, a new compression algorithm that reduces #LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining #AI efficiency.
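For context on where KV-cache compression savings come from: a minimal, illustrative sketch of naive 4-bit quantization of a KV-cache-like tensor. This is NOT the TurboQuant algorithm (the blog post linked below describes Google's actual method); the shapes and the per-row scaling scheme here are assumptions purely for demonstration.

```python
import numpy as np

# Toy per-row 4-bit quantizer for a KV-cache-like float tensor.
# Illustrative only -- not TurboQuant's actual scheme.

def quantize_4bit(x):
    """Quantize each row of a float32 array to 4-bit codes plus one scale per row."""
    scale = np.abs(x).max(axis=-1, keepdims=True) / 7.0  # signed 4-bit range: [-7, 7]
    scale[scale == 0] = 1.0                              # avoid division by zero
    codes = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    """Reconstruct an approximation of the original values."""
    return codes.astype(np.float32) * scale

kv = np.random.randn(128, 64).astype(np.float32)  # hypothetical KV-cache slice
codes, scale = quantize_4bit(kv)

# FP16 baseline stores 2 bytes/value; 4-bit codes take 0.5 bytes/value,
# plus one FP16 scale per row of overhead.
fp16_bytes = kv.size * 2
quant_bytes = kv.size * 0.5 + scale.size * 2
print(f"compression vs FP16: {fp16_bytes / quant_bytes:.1f}x")  # ~3.8x for this layout
```

Even this naive scheme approaches 4x savings over FP16; the post's "at least 6x with zero accuracy loss" claim is what sets TurboQuant apart from simple quantizers like the one above.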

TurboQuant: Redefining AI efficiency with extreme compression https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/
@firusvg I wonder if the #NASDAQ will sink on the news. I mean, I don’t know how much this particular optimization would cut real datacenter costs for the same tasks, but judging by how the market reacted when DeepSeek merely raised the possibility of doing the same work with less money …
@santi I never expect markets to react rationally or logically. Too much of the market is driven by hype and reality distortion fields these days. So, everything is possible. ¯\_(ツ)_/¯