AISatoshi (@AiXsatoshi)
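The post notes that DeepSeek-V4 is said to have 1.6T parameters and suggests that, given inference-efficiency constraints, a quantization technique such as int4 QAT may have been used. It reflects ongoing interest in efficient inference for large models.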
QAD from NVIDIA: figuring out why 4-bit quantization stopped breaking everything
NVIDIA has released a report on QAD, a method that quantizes LLMs to 4 bits without losing quality on hard tasks (math, code). The article looks at why conventional QAT "breaks" models after RLHF, how distillation via KL divergence solves this, and why the method works even on random data. It also covers the author's own attempts to fit a 49B model onto available hardware, plus an analysis of the new approach.
https://habr.com/ru/articles/991586/
#LLM #Квантизация #NVIDIA #QAD #QAT #FP4 #Blackwell #Machine_Learning #Llama #Distillation
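The idea summarized in the post above (a quantized student trained to match a full-precision teacher through a KL-divergence loss) can be sketched roughly as below. This is only an illustrative sketch, not NVIDIA's implementation: the function names are hypothetical, symmetric int4 rounding stands in for the FP4 format the hashtags mention, and only the loss value is computed, with no training loop.

```python
import math


def fake_quant_int4(weights):
    """Symmetric per-tensor fake quantization onto the int4 grid [-8, 7].

    Assumption for illustration: one scale for the whole tensor.
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / 7.0
    # Round to the nearest integer level, clamp, then map back to floats.
    return [max(-8, min(7, round(w / scale))) * scale for w in weights]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(teacher_logits, student_logits):
    """KL(teacher || student) over the two output distributions."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def linear(weight_rows, x):
    """Tiny dense layer: one output logit per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight_rows]


# Hypothetical toy "model": a single 3x3 linear layer.
teacher_w = [[0.62, -0.31, 0.08], [-0.45, 0.90, 0.27], [0.13, 0.05, -0.71]]
student_w = [fake_quant_int4(row) for row in teacher_w]

x = [1.0, -2.0, 0.5]
distill_loss = kl_divergence(linear(teacher_w, x), linear(student_w, x))
print(f"distillation loss: {distill_loss:.6f}")
```

In a real setup this loss would be minimized over the student's weights. Because it depends only on the teacher's output distribution and not on labels, it is at least plausible that the choice of input data matters less, as the post claims.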
Gemma 3 QAT Models: Bringing AI to Consumer GPUs
#HackerNews #Gemma3 #QAT #AI #ConsumerGPUs #MachineLearning #TechInnovation
Google Releases Gemma 3 QAT AI Models for Consumer GPUs
#AI #AIModels #GoogleAI #Gemma3 #LLM #OpenSourceAI #GPUs #QAT #Quantization #DeepLearning #MachineLearning #NVIDIA #RTX3090 #Kaggle
https://winbuzzer.com/2025/04/20/google-releases-gemma-3-qat-ai-models-for-consumer-gpus-xcxwbn/
Who knew a 'sin tax' on #Khat could lead to a #MentalHealth revolution in #Somaliland? The country's innovative approach is #Funding treatment for addicts, but let's not forget: #InternationalDonors, you're not off the hook yet! #MentalHealthMatters #Qat
Anyone else gone down the rabbit hole of #Intel #QAT support on #FreeBSD?
The performance boost is crazy. Serve The Home has a great write-up about it on the Xeon D CPU:
https://www.servethehome.com/welcome-to-the-intel-ice-lake-d-era-with-the-xeon-d-2700-and-d-1700-series/
It accelerates #OpenSSL
https://github.com/intel/QAT_Engine/tree/master
It also accelerates gzip with a QATzip module.
For #Nginx there is a great guide:
https://www.intel.com/content/www/us/en/developer/articles/guide/nginx-https-with-qat-tuning-guide.html
It looks like a lot of work to configure and test, but hopefully later this summer I can give it a go on a system with a CPU that supports QAT.