Quantization from the ground up https://ngrok.com/blog/quantization
Quantization from the ground up | ngrok blog

A complete guide to what quantization is, how it works, and how it's used to compress large language models

@jchyip I think it's the best explanation I have seen in years!
I don't want to hear anyone pretending to know about AI and LLMs who doesn't have this level of understanding. That's quite a masterpiece of knowledge sharing. And I really appreciate the end... so you can reproduce all the steps by yourself.