Hey everyone 👋

I’m diving deeper into running AI models locally—because, let’s be real, the cloud is just someone else’s computer, and I’d rather have full control over my setup. Renting server space is cheap and easy, but it doesn’t give me the hands-on freedom I’m craving.

So, I’m thinking about building my own AI server/workstation! I’ve been eyeing some used ThinkStations (like the P620) or even a server rack, depending on cost and value. But I’d love your advice!

My Goal:
Run larger LLMs locally on a budget-friendly but powerful setup. Since I don’t need gaming features (ray tracing, DLSS, etc.), I’m leaning toward used server GPUs that offer great performance for AI workloads.

Questions for the Community:
1. Does anyone have experience with these GPUs? Which one would you recommend for running larger LLMs locally?
2. Are there other budget-friendly server GPUs I might have missed that are great for AI workloads?
3. Any tips for building a cost-effective AI workstation? (Cooling, power supply, compatibility, etc.)
4. What’s your go-to setup for local AI inference? I’d love to hear about your experiences!

I’m all about balancing cost and performance, so any insights or recommendations are hugely appreciated.

Thanks in advance! 🙌

@[email protected] #AIServer #LocalAI #BudgetBuild #LLM #GPUAdvice #Homelab #AIHardware #DIYAI #ServerGPU #ThinkStation #UsedTech #AICommunity #OpenSourceAI #SelfHostedAI #TechAdvice #AIWorkstation #LocalAI #LLM #MachineLearning #AIResearch #FediverseAI #LinuxAI #AIBuild #DeepLearning #OpenSourceAI #ServerBuild #ThinkStation #BudgetAI #AIEdgeComputing #Questions #CommunityQuestions #HomeLab #HomeServer #Ailab #llmlab

Hi everyone! 👋
Questions for the community:

1. Does anyone have experience with these GPUs? Which one would you recommend for running larger LLMs locally?
2. Are there other budget-friendly server GPUs I might have missed that are great for AI workloads?
3. Any tips for building a cost-effective AI workstation? (Cooling, power supply, compatibility, etc.)
4. What’s your favorite setup for local AI inference? I’d love to hear about your experiences!

Thanks in advance! 🙌
#AIServer #LokaleAI #BudgetBuild #LLM #GPUAdvies #ThuisLab #AIHardware #DIYAI #ServerGPU #TweedehandsTech #AIGemeenschap #OpenSourceAI #ZelfGehosteAI #TechAdvies #AIWorkstation #MachineLeren #AIOnderzoek #FediverseAI #LinuxAI #AIBouw #DeepLearning #ServerBouw #BudgetAI #AIEdgeComputing #Vragen #CommunityVragen

@debby hi
Go visit /r/localllama on Reddit, they have plenty of advice and opinions.

Hi everyone! 👋
Questions for the community:

1. Does anyone have experience with these GPUs? Which one would you recommend for running larger LLMs locally?
2. Are there other budget-friendly server GPUs I might have missed that are great for AI workloads?
3. Any tips for building a cost-effective AI workstation? (Cooling, power supply, compatibility, etc.)
4. What’s your favorite setup for local AI inference? I’d love to hear about your experiences!

Thanks in advance! 🙌

#ServeurIA #IALocale #MontageBudget #LLM #ConseilsGPU #LaboMaison #MatérielIA #IAFaitesVousMême #GPUServeur #TechOccasion #CommunautéIA #IAOpenSource #IAAutoHébergée #ConseilsTech #StationIA #ApprentissageAutomatique #RechercheIA #FediverseIA #IALinux #MontageIA #ApprentissageProfond #MontageServeur #IABudget #CalculEnPériphérieIA #Questions #QuestionsCommunauté

@debby Hello everyone 👋

Questions for the community:

1. Does anyone have experience with these GPUs? Which one would you recommend for running larger LLMs locally?
2. Are there other budget-friendly server GPUs that may have escaped my attention but are good for AI workloads?
3. Any tips for building an inexpensive AI workstation? (Cooling, power supply, compatibility, etc.)
4. What’s your preferred setup for local AI inference?

Thanks in advance! 🙌

#HejmaServilo #HejmaLabo #UzitaTek #AIKomunumo #Demandoj

@debby my advice, maybe you won’t love it. In my own journey I found that to run really big models you need the biggest, most expensive GPUs: most 20B+ models need a lot of video RAM, which puts you in “I need 2+ beefy GPUs” territory, and fairly soon after that you’ll need to upgrade again. So, economically speaking, it makes no sense for 1-2 person usage.

What I settled on is segmenting my LLMs. I installed LiteLLM as the main proxy, and behind it I have two setups: a local Ollama server with 1-7B models, and for anything beyond that I rent cloud inference from Hyperbolic, which I find more economical. It’s all stitched back together through Open WebUI, where the models centralized in LiteLLM are the ones available for selection.

With this, I figure I avoid the 1-2 year GPU upgrade cycle, whose cost is hard to justify in my case (only 2 users).
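For anyone wondering what that split looks like in practice: because LiteLLM exposes an OpenAI-compatible endpoint, one client can hop between the local Ollama models and the rented cloud models just by changing the model name. A minimal sketch, assuming the proxy listens on its default port 4000 and that aliases like "local-llama-7b" and "cloud-large" (made-up names here) are defined in the LiteLLM config:

```python
# Minimal sketch: talk to a LiteLLM proxy through its OpenAI-compatible API.
# "local-llama-7b" and "cloud-large" are hypothetical aliases that the LiteLLM
# config would map to an Ollama model and a cloud provider model respectively.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000/v1", api_key="dummy")  # key only matters if proxy auth is enabled

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(ask("local-llama-7b", "Rewrite this sentence more formally: ..."))  # stays on the local Ollama box
print(ask("cloud-large", "Draft a detailed migration plan for ..."))      # routed out to cloud inference
```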

@Prozak You're absolutely right—running 20B+ models locally can be quite costly. From a purely economic standpoint, your setup with LiteLLM + Ollama + cloud for heavy lifting makes the most sense for most people.

However, I still find myself drawn to the idea of experimenting with a local setup, even if it's not the most cost-effective choice. There's a certain appeal to tinkering with hardware and having full control over the system. It's not just about efficiency; it's about learning, autonomy, and the satisfaction of building something with your own hands. It's akin to building a custom PC just for the enjoyment of the process—sometimes, the journey itself is the reward!

Have you ever felt the urge to go fully local, even if just for the experience? Or are you firmly in the "hybrid is the best approach" camp?

#PassionProject #AILab #DIYTech #LocalAI #TechEnthusiast

@debby I have a fully local pipeline for that urge. I understand what you mean and agree. However, I did NOT buy a GPU; I’m using my Mac Studio M1 Max 32 GB for the tinkering. Beyond 32B everything is too slow to serve as a “useful assistant” for my needs, but to your point, my full end-to-end solution includes my offline-only pipeline (even down to playing with scikit-learn, RAG, etc.), all local.

@Prozak @debby just want to mention, I run Ollama with 27B and 30B models on my MacBook Pro M1 Pro with 32 GB of RAM. It’s a 4-year-old machine and it’s doing a really good job.

I’m satisfied with what it can do, and won’t be searching for anything else.

I really like that everything is local (and I know how much power it takes).
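If anyone wants to reproduce that kind of Mac-local setup programmatically, here is a minimal sketch using the ollama Python package (it assumes the Ollama server is running and that a 27B model such as gemma2:27b has already been pulled; the tag is only an example):

```python
# Minimal sketch: chat with a locally pulled Ollama model, no cloud involved.
# "gemma2:27b" is just an example tag; use whatever `ollama list` shows.
import ollama

response = ollama.chat(
    model="gemma2:27b",
    messages=[{"role": "user", "content": "Summarize the trade-offs of running LLMs locally."}],
)
print(response["message"]["content"])
```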

@debby I don't have any recommendations, but I'm also interested in your findings. What is your expected budget?
@a Great question!
My budget is still a bit flexible, but I’m aiming for a realistic range of $550 to $2,500+, depending on how ambitious I get. Since I already have RAM and an M.2 SSD on hand, I can focus the budget on the core components: a solid workstation base and a capable GPU.
@debby that is quite a range! 😆 I'd love to know what you end up buying. For me, the only real use of LLMs (other than the infrequent grammar checking) is programming.

@a
Using LLMs mostly for programming? Sounds very reasonable! 💻✨ (I might not be reasonable? 😅)

For me, LLMs are like a Swiss Army knife—I use them for programming and debugging, sure, but also for voice typing and correcting my spelling and grammar (they save me daily!). Tools like Whisper AI are amazing for real-time transcription, but I’m still chasing the dream of local real-time translation—it feels so close but just out of reach with my current setup.
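In case it helps anyone with the transcription part, here’s a minimal sketch with the open-source openai-whisper package (the audio path is a placeholder, and this is batch transcription rather than true real-time):

```python
# Minimal sketch: local speech-to-text with openai-whisper (pip install openai-whisper).
# Larger checkpoints ("medium", "large") are more accurate but need more VRAM.
import whisper

model = whisper.load_model("base")        # checkpoint downloads on first run
result = model.transcribe("audio.wav")    # source language is auto-detected
print(result["text"])

# Whisper can also translate speech into English text (still not real-time, English output only):
print(model.transcribe("audio.wav", task="translate")["text"])
```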

What I’d really love is to run bigger models locally—things like Intellect-2, Mistral-Large, or Llama 3.3—but most of these require 30GB+ of VRAM, which is a tough hurdle. I’d love to integrate my entire digital library and personal data into a local LLM—a truly private, personal AI assistant that understands my context without sending everything to the cloud.
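To put rough numbers on that VRAM hurdle, here’s a back-of-envelope sketch (weights only plus a flat overhead guess; KV cache and long contexts add more on top, so treat it as an estimate, not a sizing tool):

```python
# Rough rule of thumb: VRAM ≈ parameter count × bytes per weight + overhead.
def estimate_vram_gb(params_billion: float, bits_per_weight: float, overhead_gb: float = 2.0) -> float:
    weights_gb = params_billion * 1e9 * (bits_per_weight / 8) / 1024**3
    return weights_gb + overhead_gb

for name, params, bits in [("7B @ 4-bit", 7, 4), ("70B @ 4-bit", 70, 4), ("70B @ 16-bit", 70, 16)]:
    print(f"{name}: ~{estimate_vram_gb(params, bits):.0f} GB")
# 7B @ 4-bit:   ~5 GB   -> comfortable on a 12-16 GB card
# 70B @ 4-bit:  ~35 GB  -> already beyond a single 24 GB consumer GPU
# 70B @ 16-bit: ~132 GB -> multi-GPU or heavy CPU/RAM offloading territory
```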

I’m still in the discovery phase—figuring out what’s possible, but finding a sensible configuration is the real challenge.

Once I find my perfect setup, I’ll definitely share the build! 🛠️✨

@debby @[email protected] We can discuss this for a while.

We’re using RTX 4060 Ti 16GB with memory offloading to RAM, all running inside a VM with PCIe passthrough.

For everything related to LLMs, it’s not really an issue once the adjustments are made.

Some models can run without a GPU as long as there’s enough RAM, which is a good starting point for testing.
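For anyone who wants to try that kind of RAM/VRAM split (or pure CPU inference) themselves, here is a minimal sketch with llama-cpp-python; the GGUF path is a placeholder, and n_gpu_layers is the knob that decides how much of the model lands on the GPU:

```python
# Minimal sketch: partial GPU offload with llama-cpp-python.
# n_gpu_layers=0 keeps everything in system RAM (CPU only); -1 offloads all layers.
from llama_cpp import Llama

llm = Llama(
    model_path="models/some-14b-instruct-q4_k_m.gguf",  # placeholder: any local GGUF file
    n_gpu_layers=24,   # tune so VRAM usage stays under the card's 16 GB
    n_ctx=4096,        # context window; bigger values cost more memory
)
out = llm("Q: Why run LLM inference locally? A:", max_tokens=128)
print(out["choices"][0]["text"])
```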

What kind of server model are you currently using?
@debby

I went the opposite direction. I got a cheap mini PC with as powerful an integrated graphics chip as I could find (pictured). I upgraded it to 64GB of RAM (the maximum it supports). Since the GPU shares system memory, I can run models that require 32GB or more of VRAM. The only disadvantage to this setup? It's *very* slow!
@debby
I just came across your toot during my own search for the right GPU(s) to buy: https://chaos.social/@musevg/115288876536469616
I'm aiming lower than you, budget-wise, and also want to avoid older architectures with lower compute capability, in the hope of being able to use the GPU(s) for longer if future versions of Ollama require higher capabilities. That's why older models like the V100/M40 aren't on my list.
What are your thoughts, and do you have any new insights?
M Schommer (@[email protected])

So. Now. My quest for #Ollama #localAI has left me somewhat wiser than before, but in return I now have 3 (instead of the previous 3) options on the list. Please advise me… what would you buy?
1. Two used #RTX 2080 Ti cards with 11GB from eBay for 200-250€ each
2. Two new 3060s with 12GB for ~230€ each
3. A new 5060 Ti with 16GB for ~420€
4. Something else, namely…?


@musevg First off, thanks for commenting on my GPU post – glad it caught your eye. I totally get the “budget-but-future-proof” vibe you’re after, especially with Ollama’s fast-moving roadmap.

Quick Takeaways

- CUDA-centric GPUs (Nvidia) still provide the smoothest LLM experience.
- Huawei’s Atlas line offers a lot of VRAM for the price and is worth considering if you’re okay with a bit of tinkering.

All three sit roughly around the $2,000 total mark when you add a modest CPU and chassis, which is pretty competitive compared with Nvidia’s high-end cards.
Pre‑Built Alternative: Framework Max+ 395

If DIY feels like too much hassle, the Framework Max+ 395 (128 GB RAM, pre‑order $2 500) is a solid plug‑and‑play workstation. It’s not a GPU monster, but the massive RAM lets you offload some model parts to system memory, which can be handy for larger LLMs. Just keep in mind the shipping delay until December and the higher price tag for the convenience it offers.
My Current Game Plan (and Why)

1. Hold off until January – gives me time to see real-world benchmarks from folks who’ve already run Atlas cards in their rigs.
2. Compare DIY vs. pre-built – I’ll weigh the extra VRAM of an Atlas 300I Duo against the hassle-free setup of a Framework box.
3. Gather more community feedback – the Chinese-speaking tech forums have been surprisingly helpful, and I’m still learning a bit of Mandarin to decode the driver docs.

Bottom Line

- If you need a lot of VRAM now and don’t mind a little driver-tinkering, the Atlas 300I Duo is a compelling budget-friendly choice.
- If you prefer a hassle-free experience and can wait for the pre-order, the Framework Max+ 395 gives you a ready-made workstation with plenty of RAM.
- Nvidia still leads on raw performance, so if absolute speed is your top priority and the budget allows, a mid-range RTX 3060/3070 combo remains a safe bet.

Hope this helps you cut through the noise! Let me know which direction you end up leaning toward; I’m curious to hear how it works out for you.