I'm fascinated by this section in an Apple paper about how they're using ASTC to compress models to 4-bit, then using the hardware decoder to decompress with no overhead. I don't understand how ASTC could ever be even remotely close to 4-bit quantization in terms of NRMSE, though…
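
For a rough sense of the quantization side of that comparison, here's a minimal sketch of the NRMSE you'd get from plain 4-bit group quantization. Everything in it is my own assumption rather than anything from the paper: synthetic Gaussian weights, a group size of 32, asymmetric min/max scaling, and NRMSE defined as RMSE divided by the RMS of the original values (other normalizations exist). It doesn't model ASTC at all; it only gives the baseline number an ASTC encoding would have to get close to.

```python
import math
import random

def quantize_4bit(group):
    """Asymmetric uniform 4-bit quantization of one group (16 levels)."""
    lo, hi = min(group), max(group)
    if hi == lo:
        return list(group)  # constant group: nothing to quantize
    scale = (hi - lo) / 15  # 16 levels -> 15 steps between min and max
    return [lo + round((w - lo) / scale) * scale for w in group]

def nrmse(original, restored):
    """RMSE divided by the RMS of the original values (assumed convention)."""
    err = sum((a - b) ** 2 for a, b in zip(original, restored))
    ref = sum(a * a for a in original)
    return math.sqrt(err / ref)

random.seed(0)
# Hypothetical stand-in for a weight tensor: 4096 unit-variance Gaussian values.
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]
group_size = 32  # assumed group size, not taken from the paper

restored = []
for i in range(0, len(weights), group_size):
    restored.extend(quantize_4bit(weights[i:i + group_size]))

score = nrmse(weights, restored)
print(f"4-bit group-quantization NRMSE: {score:.4f}")
```

Swapping in real weights and a different group size changes the number, but the structure of the comparison stays the same: run the ASTC round trip on the same tensor and compute `nrmse` against the original.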
