The whole LLM-as-a-service business model has a fundamental flaw. The cost of operating the data centres is an order of magnitude higher than the revenue they bring in.

But if models get efficient enough to bring the costs down, then they become efficient enough to run locally. So, either it’s too expensive to operate, or nobody will want to use it as a service because running your own gives you privacy and flexibility.

The fact that investors don't get this is frankly incredible.

#llm #economy

@yogthos That, or they believe that you won't have the same capacity or ability to pirate the data required for the models to be useful.

They get a carve-out, you get a parrot that still needs training.

@yoasif I mean, large open models already exist thanks to Chinese companies releasing them, and we're past the point where shoving more data into models actually improves anything. The future is going to be in architectural improvements.

@yogthos That seems doubtful. With Jensen Huang claiming we've "reached AGI", the existing models are going to pay off before we see even more investment into something that is just a glint in the eye of most engineers working in this field. It ain't happening without MUCH more investment than we're already getting.

Existing models tend to become less useful as the "state of the art" changes, so ongoing piracy is required (see big tech's deals with Wikipedia).

@yoasif Jensen Huang is a snake oil salesman who is currently selling shovels in a gold rush. I don't see why anybody would take what he says seriously.

I also don't understand how existing models become less useful. If the model solves a problem today, it will continue to do so tomorrow when a better model is available.

@yogthos

> I also don't understand how existing models become less useful. If the model solves a problem today, it will continue to do so tomorrow when a better model is available.

That is fair enough, but assumes that the problem domains will be the same in the future.

That is clearly what the model builders want, but to promote dependence, they need to cater to new use cases, not just existing ones.

@yoasif And my original point was that they have a dead-end business model.