The whole LLM-as-a-service business model has a fundamental flaw: the cost of operating the data centres is an order of magnitude higher than the revenue coming in.

But if models get efficient enough to bring the costs down, then they become efficient enough to run locally. So, either it’s too expensive to operate, or nobody will want to use it as a service because running your own gives you privacy and flexibility.

The fact that investors don't get this is frankly incredible.

#llm #economy

@yogthos I think a fair amount of them do, but are frantically looking to scout the top ahead of time and leave someone else holding the bag.
@yogthos Because when it comes to increases in value, there are few places that are as extreme as the inflation end of a bubble, so riding _anything else_ is a "bad financial decision".

@pettter @yogthos

It's hucksters all the way down.

@yogthos

Ah, but have you priced in the (likely technically impossible) possibility of reaching AGI with LLMs?

It's like a lottery ticket for the wealthy who don't mind setting a bit of their money and the economy and the planet on fire for the slimmest of chances to become obscenely wealthy, even by billionaire standards.

@FantasticalEconomics @yogthos I think they see themselves as Prometheus, stealing fire from the Gods and bringing it to their billionaire friends, finally releasing them from a fate worse than death: having obligations towards the Poors
@FantasticalEconomics the whole AGI thing is basically marketing for gullible investors in my opinion

@yogthos

It mostly is. But there are some true believers out there who take AGI quickly followed by super intelligence as a near certainty.

The rationalist movement - the same asshats who spawned effective altruism and the likes of Sam Bankman-Fried - has strong pull in Silicon Valley and takes the mythology of AI super intelligence it created literally.

But you are spot on in thinking that, for the monied interests, AGI is largely just a marketing con.

https://en.wikipedia.org/wiki/Zizians


@FantasticalEconomics yeah, some people start getting high on their own supply

@FantasticalEconomics @yogthos if AGI is possible (and it really, absolutely isn't within the framework of LLMs/transformers), I still don't see the connection to making their money back.

1. Create single universal global super-intelligence
2. ???
3. Profit!

@jimbob @FantasticalEconomics yeah I also can't see how LLMs on their own can lead to AGI, they might be a piece of a bigger puzzle at best

@jimbob @yogthos

I think step 2 is: ask god-AI how to profit haha.

But you are both absolutely right. It's not a well thought-out strategy, and AGI is almost certainly not coming out of LLMs. They aren't the tech that can produce it and, at best, they'd be a distant stepping stone towards a very different-looking tech.

@yogthos ::motions at NFTs, crypto, time-shares and other such wonderful ideas "investors" have fallen for ::

@yogthos That or they believe that you won't have the same capacity or ability to pirate the data required for the models to be useful.

They get a carve out, you get a parrot that still needs training.

@yoasif I mean large open models already exist thanks to Chinese companies releasing them, and we're past the point where shoving more data into models actually improves anything. The future is going to be in architectural improvements.

@yogthos That seems doubtful. With Jensen Huang claiming we've "reached AGI", the bet is that the existing models will pay off before we see even more investment in something that is just a glint in the eye of most engineers working in this field - it ain't happening without MUCH more investment than we're already getting.

For the existing models, they tend to become less useful as the "state of the art" changes, so ongoing piracy is required (see the deals with Wikipedia from big tech).

@yogthos Training will always be a bottleneck, and ongoing piracy from China isn't guaranteed, especially as they pose a risk to administration-affiliated businesses.

See the ban on foreign routers from yesterday: https://www.usatoday.com/story/tech/news/2026/03/24/fcc-bans-new-router-imports/89300646007/

FCC bans imports of new foreign-made routers over security fears

China is estimated to control at least 60% of the U.S. market for home routers, boxes that connect computers, phones and smart devices to the internet.


@yoasif yet, training can be done differently from the way we do it now. There is already plenty of research, coming out of China incidentally, on how to train models more intelligently. Here's one example https://arxiv.org/abs/2512.24873

Chinese companies don't need to do distillation from US models given that Chinese models are already competitive. Having more data isn't the bottleneck at this point. It's how you analyze the data that matters.

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Agentic crafting requires LLMs to operate in real-world environments over multiple turns by taking actions, observing outcomes, and iteratively refining artifacts. Despite its importance, the open-source community lacks a principled, end-to-end ecosystem to streamline agent development. We introduce the Agentic Learning Ecosystem (ALE), a foundational infrastructure that optimizes the production pipeline for agentic model. ALE consists of three components: ROLL, a post-training framework for weight optimization; ROCK, a sandbox environment manager for trajectory generation; and iFlow CLI, an agent framework for efficient context engineering. We release ROME, an open-source agent grounded by ALE and trained on over one million trajectories. Our approach includes data composition protocols for synthesizing complex behaviors and a novel policy optimization algorithm, Interaction-Perceptive Agentic Policy Optimization (IPA), which assigns credit over semantic interaction chunks rather than individual tokens to improve long-horizon training stability. Empirically, we evaluate ROME within a structured setting and introduce Terminal Bench Pro, a benchmark with improved scale and contamination control. ROME demonstrates strong performance across benchmarks like SWE-bench Verified and Terminal Bench, proving the effectiveness of ALE.


@yogthos Sorry, how does this not continue to rely on piracy?

> We select approximately one million high-quality GitHub repositories based on criteria such as star counts, fork statistics, and contributor activity. Following Seed-Coder, we concatenate multiple source files within the same repository to form training samples at the project-level code structure, preventing the model from learning only isolated code snippets and promoting understanding of real-world engineering context.

@yoasif first of all, training on open repos on GitHub isn't piracy. But the point you've evidently missed is that they don't need more data than what they already have available. What the paper actually says is that the structure of the network is what matters. Their innovation is in how relationships in the data are expressed within the model.

@yogthos LOL, training on open repos isn't piracy?

I'm clearly wasting my time with you.

@yoasif lol clearly you haven't read the MIT license, or even the GPL for that matter, which says that as long as you're making your derivative work open you're compliant. I don't see what you're even trying to say here when talking about open models. You sound confused.

@yogthos Sorry, I'm not going to waste my time trying to explain things to you when I'm avoiding working on an explainer post on the topic of LLM piracy and open source.

PS: I am not confused.

@yoasif do feel free to stop trying to explain things to me which you clearly have little understanding of

@yoasif Jensen Huang is a snake oil salesman who is currently selling shovels in a gold rush. I don't see why anybody would take what he says seriously.

I also don't understand how existing models become less useful. If the model solves a problem today, it will continue to do so tomorrow when a better model is available.

@yogthos

> I also don't understand how existing models become less useful. If the model solves a problem today, it will continue to do so tomorrow when a better model is available.

That is fair enough, but assumes that the problem domains will be the same in the future.

That is clearly what the model builders want, but to promote dependence, they need to cater to new use cases, not just existing ones.

@yoasif and my original point was that they have a dead end business model
@yogthos I think that's also why they're putting so much effort into causing a chip shortage. IIRC a bunch of them want to get rid of personal computers and force people to use the cloud.
@mauve I can't really see this working out in practice, if anything we'll likely just end up in a similar situation to solar and EVs where China steps up. Of course, we already see Chinese tech increasingly banned in the west. It is possible that the west just becomes a hermit kingdom while the rest of the world moves on.
@yogthos Yeah, I think it's a failing strategy too. I expect the costs would be higher than businesses would be willing to pay to play along. I think the average consumer would just go with whatever seems popular and lets them stay addicted to their algorithms, though. πŸ˜