'At some point you've got to make money': Goldman's top AI skeptic warns the clock is running out ahead of OpenAI and Anthropic IPOs

https://lemmy.today/post/54311965


I’m not much of an AI skeptic compared to most on Lemmy. I think the technology is incredibly useful and probably beneficial to society if we can remove the control of the ruling class.

That said, I truly don’t understand how the AI business model is supposed to work. I’m sure there is some market among businesses, governments, etc., basically people with too much money who may want to pay for the latest and greatest models.

But I don’t really see the average consumer doing this when slightly worse versions will almost certainly be available for free. And those customers alone will not be able to support the level of investment that’s going on right now.

The business model should be that, with economies of scale, they can provide compute much cheaper than the average consumer can buy it to run locally. So yeah, that means they gotta be able to support these $20/mo plans indefinitely.

If they jack up the prices, I can just buy a 128GB Ryzen AI machine for the price of a year of $200/mo Claude.
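For a rough sense of that trade-off, here is a back-of-the-envelope sketch in Python. The $2,400 hardware price is an assumption derived from the comparison above ($200/mo times 12), not a quoted price, and the function name is made up for illustration. Electricity, resale value, and the quality gap between models are all ignored.

```python
import math


def months_to_break_even(hardware_cost: float, monthly_fee: float) -> int:
    """Whole months of subscription fees needed to cover a one-time hardware purchase."""
    if monthly_fee <= 0:
        raise ValueError("monthly_fee must be positive")
    return math.ceil(hardware_cost / monthly_fee)


# Assumed hardware price: one year of the $200/mo plan, per the comparison above.
HARDWARE_COST = 200 * 12

print(months_to_break_even(HARDWARE_COST, 200))  # 12
print(months_to_break_even(HARDWARE_COST, 20))   # 120
```

Under these assumptions the local machine pays for itself in a year against the $200 tier, but takes ten years against a $20 plan, so the math only favors buying hardware once prices actually go up.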

I’m not an expert, but my understanding is that most of the computation is in the training. The actual queries are not too difficult to manage. So I think that’s what makes it harder to monetize: you’re trying to position yourself as a digital gatekeeper for work that has already been done. Yes, some industries have survived in this position, but it limits the amount of profit you can make because there are always ways to copy someone else’s homework.

Local is potentially even cheaper than that. This guy talks about how to get 17 t/s on the Qwen 3.6 35B MoE model with a GTX 1060 that has 6GB of VRAM: m.youtube.com/watch?v=8F_5pdcD3HY. He’s using a fork of llama.cpp with TurboQuant, and his newest video, made after this one, uses an even more optimized 28B version of the model. I have cmake running in a Dockerfile at the moment, and we’ll see how this performs on my $800 laptop with an RTX 4060.

I’m also impressed by how good OpenCode is compared to Claude Code. Qwen 3.6 is not quite as good as Claude, but it also doesn’t come with a $200 a month bill, usage limits, and a company training its models on your data. If it’s anywhere near “good enough”, I can see this being a daily driver.


The business model should be that, with economies of scale, they can provide compute much cheaper than the average consumer can buy it to run locally.

That business model assumes that the huge cloud models will always maintain a gap worth paying for, compared to the local models. I’m just not convinced that the average consumer will need cloud models for summarizing their emails or the news of the day.

And as for the actual costs of their data centers, there literally aren’t enough humans in the world for $20/month of AI spending per person to let them break even. They’ll need to sell big accounts (many businesses spending billions per year) in order to break even.
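To sanity-check a claim like this, a minimal Python sketch can compute the revenue ceiling of a $20/month plan and the subscriber count a given cost would require. The world population of roughly 8 billion is an approximation, and the $100 billion annual cost is a made-up placeholder, not a real figure for any company; swap in whatever estimate you trust. The function names are hypothetical too.

```python
import math

WORLD_POPULATION = 8_000_000_000  # rough approximation, an assumption
MONTHLY_FEE = 20                  # the $20/month plan discussed above


def annual_revenue(subscribers: int, monthly_fee: float) -> float:
    """Yearly revenue if every subscriber pays the monthly fee for 12 months."""
    return subscribers * monthly_fee * 12


def subscribers_needed(annual_cost: float, monthly_fee: float) -> int:
    """Subscribers required for yearly fees to cover a given annual cost."""
    return math.ceil(annual_cost / (monthly_fee * 12))


# Upper bound: every person on Earth subscribes.
ceiling = annual_revenue(WORLD_POPULATION, MONTHLY_FEE)
print(f"Revenue ceiling: ${ceiling:,.0f} per year")

# Placeholder cost, purely illustrative; replace with your own estimate.
HYPOTHETICAL_ANNUAL_COST = 100_000_000_000
needed = subscribers_needed(HYPOTHETICAL_ANNUAL_COST, MONTHLY_FEE)
print(f"Subscribers needed: {needed:,}")
print(f"Share of world population: {needed / WORLD_POPULATION:.1%}")
```

Whether the numbers work out depends entirely on the cost figure you plug in and on how many people would realistically pay, which is the part nobody in this thread has hard data on.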