Sam Altman says ChatGPT should be 'much less lazy now'
PSA: give open-source LLMs a try, folks. If you’re on Linux or macOS, ollama makes it incredibly easy to try most of the popular open-source LLMs, like Mistral 7B, Mixtral 8x7B, CodeLlama, etc. Obviously it’s faster if you have a CUDA/ROCm-capable GPU, but it also works in CPU mode (albeit slowly if the model is huge), provided you have enough RAM.
You can combine that with a UI like ollama-webui or a text-based UI like oterm.
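If you'd rather script it than use a UI, ollama also exposes a local HTTP API. Here's a minimal sketch, assuming the server is running on its default port (11434) and that you've already pulled a model tagged `mistral`; the helper names `build_request` and `generate` are just mine for illustration.

```python
import json
import urllib.request

# Assumption: ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt):
    """Build a non-streaming POST request for ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, timeout=300):
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example (needs a running ollama server with the model pulled):
# print(generate("mistral", "Why is the sky blue?"))
```

With `"stream": False` the server sends one JSON object once generation finishes, which keeps the parsing trivial; drop that flag if you want tokens as they arrive.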
ROCm is decent right now: I can do deep learning stuff and CUDA programming with it on an AMD APU. However, ollama doesn’t work out of the box with APUs yet, though users report that it works with dedicated AMD GPUs.
As for Mixtral 8x7B, I couldn’t run it on a system with 32GB of RAM and an RTX 2070S with 8GB of VRAM. I’ll probably try with another system soon. That same system runs CodeLlama-34B fine, though.
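For a rough sense of scale, here's a back-of-the-envelope sketch of how much memory the weights alone take. The parameter counts are approximate, and the 4.5 bits per weight figure is an assumption (roughly what a 4-bit quantized format works out to once you include scaling factors). It ignores the KV cache, the OS, and everything else that's resident, so treat it as a lower bound rather than an explanation of why a particular run failed.

```python
def weights_gib(params_billion, bits_per_weight):
    """Approximate size of the model weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Approximate parameter counts in billions (assumptions, rounded).
MODELS = [
    ("Mistral 7B", 7.2),
    ("CodeLlama-34B", 34.0),
    ("Mixtral 8x7B", 46.7),
]

BITS_PER_WEIGHT = 4.5  # assumption: typical 4-bit quantization overhead included

for name, params in MODELS:
    print(f"{name}: ~{weights_gib(params, BITS_PER_WEIGHT):.1f} GiB of weights")
```

By this estimate Mixtral 8x7B needs noticeably more room for its weights than CodeLlama-34B, so the headroom left on a 32GB machine is a lot thinner.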
So far I’m happy with Mistral 7B. It’s extremely fast on my RTX 2070S, and it’s not really slow when running in CPU mode on an AMD Ryzen 7. Its speed is okayish (~1 token/sec) in CPU mode on an old ThinkPad T480 with an 8th-gen i5 CPU.
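If you want to measure tokens/sec on your own hardware instead of eyeballing it, the final JSON object from ollama's generate endpoint includes `eval_count` (tokens generated) and `eval_duration` (in nanoseconds). A small sketch, where `tokens_per_second` is a hypothetical helper and the sample dict is made up purely to show the arithmetic:

```python
def tokens_per_second(response):
    """Compute generation speed from an ollama generate response dict.

    Expects 'eval_count' (tokens generated) and 'eval_duration'
    (nanoseconds spent generating them).
    """
    duration_s = response["eval_duration"] / 1e9
    if duration_s <= 0:
        raise ValueError("eval_duration must be positive")
    return response["eval_count"] / duration_s


# Made-up sample values, not a real measurement:
sample = {"eval_count": 120, "eval_duration": 60_000_000_000}
print(f"{tokens_per_second(sample):.2f} tokens/sec")
```

This only covers generation time; prompt processing is reported separately, so long prompts will feel slower than this number suggests.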