A big problem with Apple's Private Cloud Compute is that its servers 'merely' have the same horsepower as the M2 Ultra machines you can buy at retail and run on your desk, whereas the cloud-based LLMs they're supposed to compete with run in datacenters full of orders-of-magnitude-more-powerful Nvidia GPUs. And as we've seen, even with Apple's Foundation Models, a simple request can take tens of seconds to process. There is a gulf between what Apple wants to do and the hardware it's trying to build it on.
This likely means that using a cloud provider like OpenAI to power Siri, instead of trying to do it on-device, is a practical inevitability. Apple may only now be realizing that its silicon is still generations away from where it needs to be to power the features Apple wants to build in any kind of performant manner, even with all the unified RAM in the world.
@stroughtonsmith Hopefully this is good news for a future Mac Pro!
@stroughtonsmith and I think Apple knows it. The M3 Ultra didn’t just double the RAM compared to the M2 Ultra, but went a bit further iirc? Now put this chip in a Mac Pro and we may be back in business 😁
@stroughtonsmith Doesn’t Android have it worse?
@mikecane yeah but Google has servers and is not afraid to use them
@stroughtonsmith Right. Google: Where Privacy Was Murdered In Daylight While Everyone Watched.