Mastodawn

marauding_gibberish142 Apr 9, 2025

How to use GPUs over multiple computers for local AI?

https://lemmy.dbzer0.com/post/41844010

How to use GPUs over multiple computers for local AI? - Divisions by zero

The problem is simple: consumer motherboards don’t have that many PCIe slots, and consumer CPUs don’t have enough lanes to run 3+ GPUs at full PCIe gen 3 or gen 4 speeds. My idea was to buy 3-4 computers for cheap, slot a GPU into each of them and use 4 of them in tandem. I imagine this will require some sort of agent running on each node which will be connected through a 10Gbe network. I can get a 10Gbe network running for this project. Does Ollama or any other local AI project support this? Getting a server motherboard with CPU is going to get expensive very quickly, but this would be a great alternative. Thanks

Show thread

Sims Apr 9, 2025

There are several solutions:

github.com/b4rtaz/distributed-llama

github.com/exo-explore/exo

github.com/kalavai-net/kalavai-client

petals.dev

Didn’t try any of them and haven’t looked for 6 months, so maybe something better have arrived…

GitHub - b4rtaz/distributed-llama: Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.

Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. - b4rtaz/distributed-llama

GitHub

Show thread

marauding_gibberish142 Apr 9, 2025

Thank you for the links. I will go through them

Show thread

Wigglytuff Apr 10, 2025

I’ve tried Exo and it worked fairly well for me. Combined my 7900 XTX, GTX 1070, and M2 MacBook Pro.

Show thread

ebuttonsdude

+1 on exo, worked for me across the 7900xtx, 6800xt, and 1070ti