LLMQ: Efficient Lower-Precision LLM Training for Consumer GPUs
#CUDA #LLM #Package
https://hgpu.org/?p=30692

We present LLMQ, an end-to-end CUDA/C++ implementation for training medium-sized language models (roughly 3B to 32B parameters) on affordable, commodity GPUs. These devices are characterized by low mem…