[Column] Bitcrushing and low-bit music: digital resolution...
SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs
https://arxiv.org/abs/2503.07657
#HackerNews #SplitQuantV2 #LowBit #Quantization #LLMs #AI #Research #MachineLearning
The quantization of large language models (LLMs) is crucial for deploying them on devices with limited computational resources. While advanced quantization algorithms offer improved performance compared to basic linear quantization, they typically require high-end graphics processing units (GPUs), are often restricted to specific deep neural network (DNN) frameworks, and require calibration datasets. These requirements make such algorithms hard to use on the many neural processing units (NPUs) and edge AI devices that have diverse model formats and frameworks. In this paper, we show that SplitQuantV2, an algorithm designed to enhance low-bit linear quantization of LLMs, can achieve results comparable to those of advanced algorithms. SplitQuantV2 preprocesses models by splitting linear and convolution layers into functionally equivalent, quantization-friendly structures. Because the algorithm is platform-agnostic, concise, and efficient, it can be implemented without GPUs. Our evaluation on the Llama 3.2 1B Instruct model using the AI2 Reasoning Challenge (ARC) dataset demonstrates that SplitQuantV2 improves the accuracy of the INT4 quantized model by 11.76%p, matching the performance of the original floating-point model. Remarkably, SplitQuantV2 took only 2 minutes 6 seconds to preprocess the 1B model and perform linear INT4 quantization using only an Apple M4 CPU. SplitQuantV2 provides a practical solution for low-bit quantization of LLMs, especially when complex, computation-intensive algorithms are inaccessible due to hardware limitations or framework incompatibilities.
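A rough sketch of the idea in the abstract, in plain Python and not the paper's actual code: one weight list is split into three parts that sum back to the original (the "functionally equivalent" part), and each part gets its own linear INT4 quantization over a narrower value range. The tertile split by sorted value, the helper names (`quantize_dequantize`, `split_into_parts`, `mse`), and the random Gaussian weights are all assumptions made for illustration; the abstract does not say how SplitQuantV2 chooses its splits.

```python
import random


def quantize_dequantize(values, bits=4):
    """Linear (affine) quantization to `bits` bits, then back to floats.

    The range is widened to include 0 and an integer zero point is used,
    so an exact 0.0 always survives the round trip unchanged.
    """
    levels = (1 << bits) - 1              # 15 steps for INT4
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:                          # every value is zero
        return [0.0] * len(values)
    scale = (hi - lo) / levels
    zero_point = round(-lo / scale)
    out = []
    for v in values:
        q = round(v / scale) + zero_point
        q = max(0, min(levels, q))        # clamp to the integer range
        out.append((q - zero_point) * scale)
    return out


def split_into_parts(weights, parts=3):
    """Hypothetical split: low, middle, and high thirds by sorted value.

    Each part has the same length as `weights`, holding 0.0 wherever a
    weight belongs to another part, so the parts sum to the original.
    """
    order = sorted(range(len(weights)), key=lambda i: weights[i])
    out = [[0.0] * len(weights) for _ in range(parts)]
    for rank, i in enumerate(order):
        out[rank * parts // len(weights)][i] = weights[i]
    return out


def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]  # stand-in weights

# Baseline: a single INT4 quantization over the full value range.
plain = quantize_dequantize(weights)

# Split first, quantize each part on its own range, then sum the parts.
parts = split_into_parts(weights)
split = [sum(col) for col in zip(*(quantize_dequantize(p) for p in parts))]

print("plain INT4 MSE:", mse(weights, plain))
print("split INT4 MSE:", mse(weights, split))
```

Each part covers a narrower range than the full weight list, so the same 16 levels land closer together and rounding error drops. The cost is three quantized parts where there used to be one.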
~
“In nature, everything has a job. The job of the fog is to beautify further the existing beauties!”
- Mehmet Murat Ildan
~
fog song by internet based ghosts is now available on #sub65media
internet based ghosts - fog song https://youtu.be/h2CqmLB1qEU
~
The laws of dance have been broken. Enjoy your freedom.
~
That's Not How U Use This Programme by Ayrweda is now available on #sub65media
Ayrweda - That's Not How U Use This Programme
~
"Ah yes indeed it's fun time!"
~
fun at Sub65 by Toxic Chicken is now available on #sub65media
https://archive.org/details/s65108
#ToxicChicken #electronic #electro #idm #experimental #lobit #lowbit
Toxic Chicken - fun at Sub65
~
'Captain, allow me to emphasize, this very well could be the distribution of our whole ship's fate.'
~
Taburetka - Distribution of happiness is now available on #sub65media
Taburetka - Distribution of happiness
~
So many strange properties, so many sources of inspiration.
~
'Strange properties of Diogenes Death, 7, 77 A:.D:.' is now out on #sub65media
Vziel Projet - 'Strange properties of Diogenes Death, 7, 77 A:.D:.'
~
Turkish delights are out. Türkmenistani waves of bliss are in.
~
rowboats on the river - Türkmenistan is now available on #sub65media
~
'I don't even want to imagine what life would be like without our null noise'
~
Thank You Null Noises by Shifala is now available on #sub65media
https://archive.org/details/s65100
#Shifala #experimental #avantgarde #soundcollage #lobit #lowbit
Shifala - Thank You Null Noises