LLMs are surprisingly great at compressing images and audio, DeepMind researchers find
Skimming through the linked paper, I noticed this:
Scaling beyond a certain point will deteriorate the compression performance since the model parameters need to be accounted for in the compressed output.
So it sounds like the model parameters needed to decompress the file are included in the file itself.
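To make that accounting concrete, here is a rough sketch of how counting the model parameters changes the picture. The function name and all the byte counts are made up for illustration and are not taken from the paper; the only idea borrowed from it is that the model size gets added to the compressed output.

```python
def adjusted_compression_rate(raw_bytes, compressed_bytes, model_bytes=0):
    """Return (compressed output + model parameters) / raw size.

    Lower is better; a value above 1.0 means the "compressed" result
    is larger than the original data once the model is counted.
    """
    if raw_bytes <= 0:
        raise ValueError("raw_bytes must be positive")
    return (compressed_bytes + model_bytes) / raw_bytes


# Hypothetical numbers: 1 GB of raw data.
raw = 1_000_000_000

# A small model compresses less well but costs little to ship.
small = adjusted_compression_rate(raw, 300_000_000, model_bytes=10_000_000)

# A much larger model compresses better but its parameters dominate.
large = adjusted_compression_rate(raw, 200_000_000, model_bytes=2_000_000_000)

print(f"small model: {small:.2f}, large model: {large:.2f}")
```

With these invented numbers the larger model wins on raw compression but loses badly once its own size is included, which is the effect the quoted sentence describes.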
Training tends to be more compute-intensive, while inference is more likely to run on a smaller hardware footprint.
The neater idea would be a standard model or set of models, so that a single 30 GB program could cover ~80% of target cases. Games and video seem like good candidates for this.
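A back-of-the-envelope sketch of why a shared standard model could pay off: the model is downloaded once, so its size is spread over every file it helps compress. The function name and the per-file sizes below are hypothetical; only the 30 GB model size comes from the comment above.

```python
def break_even_files(model_bytes, baseline_bytes_per_file, model_based_bytes_per_file):
    """Smallest number of files at which a one-time model download plus
    model-based compression is strictly smaller than a conventional
    baseline compressor applied to the same files.
    """
    savings_per_file = baseline_bytes_per_file - model_based_bytes_per_file
    if savings_per_file <= 0:
        raise ValueError("model-based compression must beat the baseline per file")
    # Need n * savings_per_file > model_bytes, with n a whole number of files.
    return model_bytes // savings_per_file + 1


GB = 1_000_000_000

# Hypothetical: a 30 GB shared model, and games that shrink to 25 GB with a
# conventional compressor versus 20 GB with the model-based one.
n = break_even_files(30 * GB, 25 * GB, 20 * GB)
print(f"shared model pays for itself after {n} files")
```

Under these assumed sizes the shared model only wins once a user has downloaded several large files, so the idea fits best where the same model is reused a lot.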