Apple explores hybrid on-device and cloud AI approach for next-generation Siri using Google Gemini models
📰 Original title: Apple Working To Cram Massive Gemini Model Into iPhone To Power New Siri
🤖 AI: It's clickbait ⚠️
👥 Users: It's clickbait ⚠️

Apple is reportedly developing a new version of Siri powered in part by Google's Gemini large language models, aiming to significantly upgrade its long-delayed AI assistant. According to reporting from Ars Technica, Apple is attempting to distill Google's large cloud-based Gemini models so that smaller versions can run on iPhones for certain on-device tasks. This process, known as model distillation, allows a smaller model to learn from a larger one while reducing computational requirements, potentially enabling faster and more private local AI processing.

However, the report suggests that a fully on-device solution is unlikely. Instead, Apple's approach appears to be a hybrid system in which more complex Siri requests are processed in the cloud. This introduces potential reliance on external infrastructure, including Google's cloud services and Nvidia's confidential computing platform. Nvidia's system is designed to keep data encrypted while it is being processed on GPUs, which could help Apple maintain its privacy-focused messaging even when user data is handled off-device.

The article also notes that Apple has faced challenges integrating large undistilled Gemini models into its own Private Cloud Compute infrastructure, which runs on Apple's M-series Mac hardware. As a result, Apple may route more demanding AI tasks to external cloud systems rather than its own servers. Despite this, the company is expected to emphasize its privacy protections and possibly extend its Private Cloud Compute branding to the hybrid system.

Overall, the development reflects Apple's ongoing effort to catch up in the generative AI space by combining on-device intelligence with powerful cloud-based models, while balancing performance demands with its strong privacy positioning.
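
For readers unfamiliar with the model distillation mentioned above, here is a minimal sketch of one common formulation, soft-label distillation, in which a small "student" model is trained to match the temperature-softened output distribution of a large "teacher". The report gives no details of how Apple or Google would actually do this, so the function names, the temperature value, and the example logits are all illustrative assumptions, not a description of their method.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the softened teacher to the softened student.

    Multiplying by temperature**2 is a common convention that keeps the
    gradient scale comparable across temperatures. (Hypothetical helper,
    not taken from any Apple or Google code.)
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Made-up logits over three candidate next tokens.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different token

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

In training, the student's parameters would be adjusted to reduce this loss across many inputs, so a student that already agrees with the teacher yields a smaller value than one that disagrees.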









