How to use GPUs over multiple computers for local AI?
https://lemmy.dbzer0.com/post/41844010
The problem is simple: consumer motherboards don't have that many PCIe slots, and consumer CPUs don't have enough lanes to run 3+ GPUs at full PCIe Gen 3 or Gen 4 speeds. My idea was to buy three or four cheap computers, slot a GPU into each of them, and use them in tandem. I imagine this will require some sort of agent running on each node, with the nodes connected over a 10GbE network; I can get a 10GbE network running for this project. Does Ollama or any other local AI project support this? Getting a server motherboard with a CPU gets expensive very quickly, so this would be a great alternative. Thanks
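As a sanity check on the 10GbE plan: in the pipeline-parallel setups these tools use, only one activation vector crosses each node boundary per generated token, so bandwidth is rarely the bottleneck (per-hop TCP latency matters more). A rough arithmetic sketch, with illustrative assumed numbers (hidden size 8192, fp16 activations):

```python
# Back-of-envelope: can a 10GbE link keep up with pipeline-parallel
# single-token inference? All model numbers below are illustrative
# assumptions, not measurements.

hidden_size = 8192          # assumed hidden dimension (70B-class model)
bytes_per_value = 2         # fp16 activations
link_gbit = 10              # 10GbE
link_bytes_per_s = link_gbit * 1e9 / 8

# Pipeline parallelism: one activation vector crosses each node
# boundary per generated token.
activation_bytes = hidden_size * bytes_per_value   # 16 KiB per token per hop
transfer_s = activation_bytes / link_bytes_per_s

print(f"activation per token per hop: {activation_bytes / 1024:.0f} KiB")
print(f"wire time per hop: {transfer_s * 1e6:.0f} us")

# Even at 100 tokens/s across 3 node-to-node hops, the link is
# almost idle; round-trip latency, not bandwidth, is the real cost.
utilization = 100 * 3 * activation_bytes / link_bytes_per_s
print(f"link utilization at 100 tok/s, 3 hops: {utilization:.4%}")
```

Tensor parallelism is a different story (an all-reduce per layer), which is why most hobbyist multi-box setups shard by layers instead.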
There are several solutions:
github.com/b4rtaz/distributed-llama
github.com/exo-explore/exo
github.com/kalavai-net/kalavai-client
petals.dev
Didn’t try any of them and haven’t looked for 6 months, so maybe something better has arrived…
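These projects all follow roughly the pattern the question describes: an agent on each node holds a contiguous shard of the model's layers, and activations hop node to node over the network. A toy single-process sketch of that layer-sharding idea (no real networking or tensors, just the data flow; details of each tool differ):

```python
# Toy sketch of layer-sharded ("pipeline parallel") inference:
# each "node" owns a contiguous slice of the model's layers and
# forwards its output activation to the next node. Real tools
# (distributed-llama, exo, petals) do this over the network with
# actual tensors; here a "layer" is just a function on a float.

from typing import Callable, List

Layer = Callable[[float], float]

class Node:
    """One worker machine holding a shard of the model's layers."""
    def __init__(self, layers: List[Layer]):
        self.layers = layers

    def forward(self, activation: float) -> float:
        # In a real cluster this result would be sent over the
        # 10GbE link to the next node instead of returned locally.
        for layer in self.layers:
            activation = layer(activation)
        return activation

def shard(layers: List[Layer], num_nodes: int) -> List[Node]:
    """Split the layer list into roughly equal contiguous shards."""
    per_node = -(-len(layers) // num_nodes)  # ceiling division
    return [Node(layers[i:i + per_node])
            for i in range(0, len(layers), per_node)]

# A fake 8-layer "model": each layer just adds its own index.
model = [lambda x, i=i: x + i for i in range(8)]

nodes = shard(model, num_nodes=4)   # 4 cheap machines, 2 layers each
x = 0.0
for node in nodes:                  # activations hop node to node
    x = node.forward(x)

print(x)  # sum of 0..7 = 28.0
```

The upshot: each box only needs VRAM for its own shard, and only a small activation crosses the wire per token, which is what makes the cheap-multi-PC approach viable at all.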

Thank you for the links. I will go through them.
I’ve tried Exo and it worked fairly well for me. Combined my 7900 XTX, GTX 1070, and M2 MacBook Pro.