Sam Altman says ChatGPT should be 'much less lazy now'
PSA: give open-source LLMs a try, folks. If you’re on Linux or macOS, ollama makes it incredibly easy to try most of the popular open-source LLMs like Mistral 7B, Mixtral 8x7B, CodeLlama, etc. Obviously it’s faster if you have a CUDA/ROCm-capable GPU, but it still works in CPU mode too (albeit slowly if the model is huge), provided you have enough RAM.
You can combine that with a UI like ollama-webui or a text-based UI like oterm.
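If you’d rather script it than use the CLI or a web UI, ollama also exposes a local HTTP API (on localhost:11434 by default). Here’s a minimal sketch in Python; it assumes `ollama serve` is running and that you’ve already pulled the model (e.g. `ollama pull mistral`), and the model name and prompt are just examples:

```python
import requests  # assumes the requests package is installed

# Ask the local ollama server (default port 11434) for a single completion.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral",  # any model you've pulled locally
        "prompt": "Explain what a Mixture-of-Experts model is in two sentences.",
        "stream": False,     # return one JSON object instead of a stream
    },
    timeout=300,
)
response.raise_for_status()
print(response.json()["response"])
```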
ROCm is decent right now; I can do deep learning work and CUDA programming with it on an AMD APU. However, ollama doesn’t work out of the box with APUs yet, though users report that it does work with dedicated AMD GPUs.
As for Mixtral 8x7B, I couldn’t run it on a system with 32GB of RAM and an RTX 2070S with 8GB of VRAM; I’ll probably try another system soon. That same system runs CodeLlama-34B fine, though.
So far I’m happy with Mistral 7B: it’s extremely fast on my RTX 2070S, and it’s not really slow when running in CPU mode on an AMD Ryzen 7. Its speed is okayish (~1 token/sec) when I try it in CPU mode on an old ThinkPad T480 with an 8th-gen i5 CPU.
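If you want to get rough tokens/sec numbers like these yourself, ollama’s API reports token counts and timings in its response, so you can compute throughput from them. A small sketch, assuming the same local ollama setup as above (model name and field names as I understand the API; adjust if your version differs):

```python
import requests  # assumes the requests package is installed

# Rough generation-speed check against a local ollama server.
r = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "mistral", "prompt": "Write a haiku about laptops.", "stream": False},
    timeout=600,
)
r.raise_for_status()
data = r.json()

# eval_count = number of generated tokens, eval_duration = time in nanoseconds.
tokens = data["eval_count"]
seconds = data["eval_duration"] / 1e9
print(f"{tokens} tokens in {seconds:.1f}s -> {tokens / seconds:.2f} tokens/sec")
```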
Some users found inventive strategies to get around ChatGPT’s laziness, with one finding that the AI model would provide longer responses if they promised to tip it $200.
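For the curious, the “tip” trick is nothing more than extra text in the prompt. A hedged sketch using the official openai Python package (the model name and tip wording are only illustrative, and whether it actually helps is anecdotal):

```python
from openai import OpenAI  # assumes the official openai package (v1+) is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The "bribe" is just a sentence added to the system prompt.
completion = client.chat.completions.create(
    model="gpt-4-turbo",  # illustrative model name
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant. I'll tip you $200 for a "
                       "complete, detailed answer with nothing left out.",
        },
        {"role": "user", "content": "Write the full implementation, not just a sketch."},
    ],
)
print(completion.choices[0].message.content)
```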
Man this thing really IS just like a human!
/joke