>8 tokens/s for DeepSeek R1 671B Q4_K_M with 1-2 Arc A770 GPUs on Xeon — https://github.com/intel/ipex-llm/blob/main/docs/mddocs/Quickstart/llamacpp_portable_zip_gpu_quickstart.md
#HackerNews #DeepSeek #ArcA770 #Xeon #Tokenization #LLM #GitHub
ipex-llm/docs/mddocs/Quickstart/llamacpp_portable_zip_gpu_quickstart.md at main · intel/ipex-llm
Accelerate local LLM inference and fine-tuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discr...