A big problem with Apple's Private Cloud Compute is that its servers 'merely' have the same horsepower as the M2 Ultra machines you can buy at retail and run on your desk, whereas the cloud-based LLMs they're supposed to compete with run in datacenters full of orders-of-magnitude-more-powerful Nvidia GPUs. And as we've seen, even with Apple's Foundation Models, a simple request can take tens of seconds to process. There is a gulf between what Apple wants to do and the hardware it's trying to build it on.
This likely means that deciding to use a cloud provider like OpenAI to power Siri, instead of trying to do it on-device, is a practical inevitability. Apple may only now be realizing that their silicon is still generations away from where it needs to be to power the features they want to build in any performant manner, even with all the unified RAM in the world.
@stroughtonsmith Doesn’t Android have it worse?
@mikecane yeah but Google has servers and is not afraid to use them
@stroughtonsmith Right. Google: Where Privacy Was Murdered In Daylight While Everyone Watched.