Tried out the Mixtral-8x22B model that Mistral.ai released a few days ago. Have to say that I'm impressed. It is the first open foundation model that can speak German flawlessly, without grammar mistakes.
I haven't tested the fine-tunes that are starting to be released now, but I think it is clear that the foundation model is very strong.
In terms of hardware requirements, it will be out of reach for most people to run at home, I fear. Yet it is still in the same range as the biggest Llama 2 based models. Being a Mixture of Experts model, it is comparable in speed to a 70B Llama 2 for me, and faster than the bigger Frankenstein models like Goliath (120B).
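The reason a Mixture of Experts model of this size can keep up with a much smaller dense model is that only a fraction of the experts are active per token. A rough back-of-the-envelope sketch, using the approximate public figures for Mixtral-8x22B (8 experts, 2 active per token, ~141B total parameters) as assumptions:

```python
# Crude estimate of parameters touched per generated token in a sparse
# MoE model. Numbers are approximate public figures, not exact.

TOTAL_PARAMS_B = 141   # ~141B total parameters (assumption)
NUM_EXPERTS = 8        # experts per MoE layer
ACTIVE_EXPERTS = 2     # the router picks 2 of 8 experts per token

# Simplification: treat the whole model as sparse and scale by the active
# fraction. (In reality attention and embeddings are shared/dense, so the
# true active count is somewhat higher, around 39B.)
active_params_b = TOTAL_PARAMS_B * ACTIVE_EXPERTS / NUM_EXPERTS
print(f"~{active_params_b:.0f}B parameters touched per token")
```

That active count being well below 70B is consistent with the model feeling as fast as (or faster than) a 70B dense model, even though all ~141B weights still have to sit in memory.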
I would say the minimum hardware requirement is 24 GB of VRAM (e.g. an Nvidia 3090/4090) and around 100 GB of system RAM. In such a setup the CPU becomes the bottleneck anyway, and the Mixture of Experts architecture gives a significant speedup in the phase when the output tokens are being generated.
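To see where the ~100 GB figure comes from, here is a back-of-the-envelope memory estimate, again assuming ~141B total parameters and a typical 4-bit quantization (both figures are assumptions, and the real footprint also depends on context length and KV cache):

```python
# Rough memory estimate for running a ~141B-parameter model quantized.

total_params_b = 141      # billions of parameters (assumption)
bits_per_weight = 4.5     # ~4-bit quantization plus per-block overhead

weights_gb = total_params_b * bits_per_weight / 8
print(f"~{weights_gb:.0f} GB just for the quantized weights")

vram_gb = 24              # e.g. a single RTX 3090/4090
spill_gb = weights_gb - vram_gb
print(f"~{spill_gb:.0f} GB spills over into system RAM (plus KV cache and OS)")
```

So roughly 80 GB of weights, of which only a fraction fits in 24 GB of VRAM; the rest lives in system RAM, which is why the CPU and memory bandwidth end up as the bottleneck.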
But kudos to Mistral.ai for releasing such a great model!