Mastodawn

RT @syhw
Do you need to quantize models? Try diffq, `pip install diffq` and https://github.com/facebookresearch/diffq#usage

GitHub - facebookresearch/diffq: DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight or group of weights, in order to achieve a given trade-off between model size and accuracy.
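
To make "pseudo quantization noise" concrete, here is a minimal sketch of the general idea in plain Python. It does not use the diffq API; the function names, the uniform noise distribution, and the fixed `[lo, hi]` range are assumptions for illustration only, and the library itself may differ on each of these. Rounding has no useful gradient, so during training it is replaced by additive noise of the same scale, which also lets the bit width be a fractional, tunable value.

```python
import random

def quantize(w, lo, hi, bits):
    """Real quantization: snap w to the nearest of 2**bits evenly spaced levels in [lo, hi]."""
    step = (hi - lo) / (2 ** bits - 1)
    return lo + round((w - lo) / step) * step

def pseudo_quantize(w, lo, hi, bits, rng=random):
    """Training-time stand-in (hypothetical helper): perturb w by noise no larger
    than half a quantization step. `bits` may be fractional here, because nothing is rounded."""
    step = (hi - lo) / (2 ** bits - 1)
    return w + step * rng.uniform(-0.5, 0.5)  # uniform noise is an assumption of this sketch

def model_size_bits(bits_per_group, group_size):
    """Size term that can be traded off against accuracy: total bits over all weight groups."""
    return sum(b * group_size for b in bits_per_group)
```

Fewer bits means a larger step, hence more noise and a smaller size term; a training loss that adds a penalty on `model_size_bits` can then balance the two.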
