RT @rumgewieselt: TRANSLATION: The llama.cpp community is currently testing this PR! 😆 IT HAS BEEN MERGED!!! https://github.com/ggml-org/llama.cpp/pull/22673

More at Arint.info

#llamacpp #llamacppcommunity #arint_info

https://x.com/rumgewieselt/status/2055672028774981804#m

llama + spec: MTP Support by am17an · Pull Request #22673 · ggml-org/llama.cpp

Overview: This PR adds support for MTP (Multi Token Prediction) heads. I tested this on Qwen3.6 27B and Qwen3.6 35BA3B but in principle it should work for any MTP model. I've posted the detaile...

GitHub