AISatoshi (@AiXsatoshi)

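The post mentions that DeepSeek-V4 is reportedly at the 1.6T-parameter scale and suggests that, given inference efficiency, a quantization technique such as int4 QAT may have been used. It reflects interest in efficient inference methods for large models.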

https://x.com/AiXsatoshi/status/2046192554748821937

#deepseek #llm #quantization #qat #inference
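A rough sketch of why int4 comes up at this scale: the arithmetic below estimates weight storage for the 1.6T parameter count quoted in the post (an unconfirmed figure), and `fake_quantize_int4` shows the kind of quantize-then-dequantize round trip that QAT commonly applies to weights in the forward pass during training. The function names, the symmetric per-tensor scaling, and the [-8, 7] range are illustrative assumptions, not anything confirmed about DeepSeek-V4.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Ignores quantization scales, activations, and the KV cache.
    """
    return num_params * bits_per_weight / 8 / 1e9


def fake_quantize_int4(weights: list[float]) -> list[float]:
    """Symmetric per-tensor int4 round trip: quantize to [-8, 7], then dequantize."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0.0 for _ in weights]
    scale = max_abs / 7  # map the largest magnitude onto the int4 value 7
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return [q * scale for q in quantized]


if __name__ == "__main__":
    params = 1.6e12  # parameter count quoted in the post (unconfirmed)
    print("16-bit weights, GB:", weight_memory_gb(params, 16))
    print("int4 weights, GB:", weight_memory_gb(params, 4))
    print(fake_quantize_int4([0.7, -0.3, 0.12, 0.0]))
```

Under these assumptions, 4-bit weights need a quarter of the storage of 16-bit weights, which is the efficiency argument the post is gesturing at.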

AI✖️Satoshi⏩️ (@AiXsatoshi) on X

DeepSeek-V4 has 1.6T parameters? Considering inference-time efficiency, maybe int4 QAT?

X (formerly Twitter)

QAD from NVIDIA: figuring out why 4-bit quantization stopped breaking everything

NVIDIA has released a report on QAD, a method that makes it possible to quantize LLMs to 4 bits without losing quality on hard tasks (math, code). The article examines why the usual QAT "breaks" models after RLHF, how distillation via KL divergence solves this problem, and why the method works even on random data. It also covers the author's personal experience of trying to fit a 49B model onto available hardware, plus an analysis of the new approach.

https://habr.com/ru/articles/991586/

#LLM #Quantization #NVIDIA #QAD #QAT #FP4 #Blackwell #Machine_Learning #Llama #Distillation
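The distillation idea mentioned in the post can be sketched as a KL-divergence loss between the output distribution of a full-precision teacher and that of a quantized student. This is a minimal single-token illustration, not NVIDIA's implementation: the function names and the KL(teacher || student) direction are assumptions chosen to match common distillation practice.

```python
import math


def log_softmax(logits: list[float]) -> list[float]:
    """Numerically stable log-softmax over one token's logits."""
    peak = max(logits)
    log_total = math.log(sum(math.exp(x - peak) for x in logits))
    return [x - peak - log_total for x in logits]


def kl_distillation_loss(teacher_logits: list[float], student_logits: list[float]) -> float:
    """KL(teacher || student) for a single token position.

    The teacher is the original full-precision model and the student is
    the quantized copy; the direction of the KL term is an assumption.
    """
    log_p = log_softmax(teacher_logits)  # teacher log-probabilities
    log_q = log_softmax(student_logits)  # student log-probabilities
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.6, 0.7, -0.8]  # hypothetical logits after quantization
    print(kl_distillation_loss(teacher, student))
```

A loss of this shape needs only the teacher's outputs and no ground-truth labels, which fits the post's remark that the method can work even on random data.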

QAD from NVIDIA: figuring out why 4-bit quantization stopped breaking everything

Last week NVIDIA published a report on QAD and I ignored it. Because every month someone "solves quantization" and every time it turns out to be less rosy in practice. But then a colleague sent me...

Habr

Gemma 3 QAT Models: Bringing state-of-the-Art AI to consumer GPUs - Google Developers Blog

Explore Gemma 3 models now offering state-of-the-art AI performance on consumer GPUs with new int4 quantized versions optimized with Quantization Aware Training (QAT).

Thoughts on #qat / #khat ?
Tried it and loved it
0%
Tried it and didn't love it
0%
Want to try it
50%
Not interested
50%

Who knew a 'sin tax' on #Khat could lead to a #MentalHealth revolution in #Somaliland? The country's innovative approach is #Funding treatment for addicts, but let's not forget: #InternationalDonors, you're not off the hook yet! #MentalHealthMatters #Qat

https://saxafimedia.com/somaliland-sin-tax-mental-health/

In Somaliland, A Sin Tax For Mental Health Relief | Saxafi Media

The article “In Somaliland, a Sin Tax for Mental Health Relief” discusses an innovative approach in Somaliland to address mental health issues through the taxation of Khat, a locally consumed stimulant plant.

SaxafiMedia

Answering myself: LKCF seems to use the in-tree QAT driver by default for a bunch of algorithms. #QAT #linux #kernel
Is anyone using the upstreamed Intel #QAT kernel module for anything? Is it useful at all? All the Intel instructions seem to start with installing their stuff instead. #linux #kernel
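One way to check which algorithms the in-tree driver has registered is to read `/proc/crypto`, where the kernel lists each algorithm implementation with its driver, module, and priority (the kernel crypto API prefers the highest-priority implementation of a given algorithm). The sketch below parses that file; recognizing QAT entries by the substring "qat" in the driver or module field is a heuristic assumption, and the function names are hypothetical.

```python
from pathlib import Path


def parse_proc_crypto(text: str) -> list[dict[str, str]]:
    """Split /proc/crypto text into one dict per registered implementation."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip lines without a "key : value" shape
                entry[key.strip()] = value.strip()
        if entry:
            entries.append(entry)
    return entries


def qat_entries(entries: list[dict[str, str]]) -> list[dict[str, str]]:
    """Keep entries whose driver or module mentions 'qat' (heuristic assumption)."""
    return [
        e for e in entries
        if "qat" in e.get("driver", "").lower() or "qat" in e.get("module", "").lower()
    ]


if __name__ == "__main__":
    path = Path("/proc/crypto")
    if path.exists():
        for e in qat_entries(parse_proc_crypto(path.read_text())):
            print(e.get("name"), e.get("driver"), e.get("priority"))
```

On a machine without QAT hardware or without the module loaded, this simply prints nothing.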

Anyone else gone down the rabbit hole of #Intel #QAT support on #FreeBSD?

The performance boost is crazy. ServeTheHome has a great write-up about it on the Xeon D CPU:
https://www.servethehome.com/welcome-to-the-intel-ice-lake-d-era-with-the-xeon-d-2700-and-d-1700-series/

It accelerates #OpenSSL
https://github.com/intel/QAT_Engine/tree/master

It also accelerates gzip with a QATzip module.

For #Nginx there is a great guide:
https://www.intel.com/content/www/us/en/developer/articles/guide/nginx-https-with-qat-tuning-guide.html

It looks like a lot of work to configure and test, but hopefully later this summer I can give it a go on a system with a CPU that supports QAT.

#SysAdmin

Welcome to the Intel Ice Lake D Era with the Xeon D-2700 and D-1700 series

We get hands-on with Intel Xeon D-2700 and D-1700 platforms for the Ice Lake-D launch and share initial performance and power figures

ServeTheHome