There's a shift happening. OpenAI is shutting down Sora, presumably to focus on enterprise offerings. Walmart and Disney are cutting ties with OpenAI. Sam's about to get sued by Microsoft.

Conversely,

Nvidia's NemoClaw seems like a legitimate effort towards on-device AI. Apple's M5 chips are decked out with new AI technology, hinting that they might do something in the local AI space.

I think cloud-based AI is cooked. It's too expensive. The market is shifting. Or I'm high on my own supply.

I say this not as someone who is pro-local AI, but as someone who is against the cloud. And against the sort of mass power grab enacted by the hyperscalers over the past five years.

It appears they're failing. And short of passing legislation to codify a monopoly of some sort, their vision of our digital lives is not coming to fruition.

@fromjason There's obviously an AI bubble that's going to burst at some point (I don't know when). But until local models at least as good as the current frontier models can run on a phone, we're going to have cloud AI. We are years away from that.

Cloud-based AI will always be cheaper, surely, because machines get shared in a fairly efficient way?

@john @fromjason famously, "cloud" is just somebody else's Linux machine. A lot depends on who that somebody else is, what their business model and market power are, etc. If we condition on a world where three US conglomerates remain the only gatekeepers of any and all computing, then "AI" is the least of our problems. The trouble is that the only challenge to this configuration comes from China. What is missing is a blueprint for democratic, widely distributed compute that is not a prop for oligarchy.
@openrisk @fromjason Don't get me wrong, I understand the problems here (I run a Mastodon instance, for God's sake, so I'm not exactly on team tech-oligarchy!), but the fact that Big Server™ can run their AI servers at something like 70% load, whereas local AI hardware will sit at far less than that, is a structural advantage. That shared nature can also mean way more RAM than most people would ever get their hands on.
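The utilization argument above is really just amortization arithmetic. A minimal sketch, with entirely hypothetical numbers (hardware prices, lifetimes, and utilization rates are assumptions for illustration, not measured figures):

```python
# Back-of-envelope: amortized hardware cost per hour of *useful* work.
# A cheap local device used 2% of the time can cost more per useful
# hour than an expensive datacenter GPU kept at 70% load.

def cost_per_useful_hour(hardware_cost: float, lifetime_hours: float, utilization: float) -> float:
    """Hardware cost spread over the hours the machine actually works."""
    return hardware_cost / (lifetime_hours * utilization)

# Hypothetical figures: a shared datacenter accelerator vs. a personal device.
datacenter = cost_per_useful_hour(hardware_cost=30_000, lifetime_hours=35_000, utilization=0.70)
local = cost_per_useful_hour(hardware_cost=1_500, lifetime_hours=35_000, utilization=0.02)

print(f"datacenter: ${datacenter:.2f} per useful hour")  # ~$1.22
print(f"local:      ${local:.2f} per useful hour")       # ~$2.14
```

Under these made-up numbers the 20x-cheaper local device still loses per useful hour, which is the structural advantage being described; electricity, networking, and margin are all ignored here.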
@john @fromjason no crystal ball, but it seems there are still many uncertainties about which architectures will be viable in the long term. The gigantic hyperconnected GPU datacenters reflect a breathless rush to develop the largest possible LLMs as fast as possible, while also grabbing all public content while nobody is watching. Literally "move fast and break things," to create a captive user base, etc. But is this sustainable, and is it even needed for developing useful open models in the long run?
@john @fromjason it would take serious technical analysis 😅 to sketch the space of possibilities, but my gut feeling is that if you remove the anxiety of building or defending trillion-dollar valuations and treat the algorithms as lowly tools rather than pretending they're an impending superintelligence, then there is a lot of design space to explore in orchestrating computations (for the model-building phase).