What a difference a day makes: Was optimistic about Foundation Models yesterday, and today I think I know why they didn't ship the improved Siri. The local model really is pretty thick. I thought it would be capable of stringing together tool calls in a logical way, and sometimes it is, but other times it fails to understand. Exactly the same prompt will work one time, and fail the next. Sounds like what Apple was saying about the new Siri.
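For context, wiring tools into a session looks roughly like this. A minimal sketch; the tool name, arguments, and search logic are invented for illustration:

```swift
import FoundationModels

// Hypothetical tool for illustration only.
struct FindNotesTool: Tool {
    let name = "findNotes"
    let description = "Searches the user's notes for a query string."

    @Generable
    struct Arguments {
        @Guide(description: "The text to search for.")
        var query: String
    }

    func call(arguments: Arguments) async throws -> ToolOutput {
        // A real implementation would query a local index here.
        ToolOutput("Found 3 notes matching \"\(arguments.query)\".")
    }
}

// The model decides when (and whether) to invoke the tool, which is
// exactly where the inconsistency described above shows up.
let session = LanguageModelSession(
    tools: [FindNotesTool()],
    instructions: "You are an assistant that automates note search."
)
let response = try await session.respond(to: "Find my notes about WWDC")
```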
greedy | Apple Developer Documentation: “A sampling mode that always chooses the most likely token.”
@alpennec I’ll look into it. What is the difference between this and a low temperature?
@drewmccormack I haven’t touched it enough to know, unfortunately.
@alpennec @drewmccormack AFAIK it’s exactly the same as 0 temperature. It does make things deterministic (within any given model) but it will make certain use cases really boring.
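In code the two look something like this (the prompts are invented; fresh sessions so both start from an identical, empty transcript):

```swift
import FoundationModels

let greedySession = LanguageModelSession()
let sampledSession = LanguageModelSession()

// Greedy: always take the most likely token. Deterministic for a given
// model version and transcript.
let greedy = GenerationOptions(sampling: .greedy)

// A low but nonzero temperature still samples, just from a sharper
// distribution, so repeated runs can differ.
let lowTemp = GenerationOptions(temperature: 0.1)

let a = try await greedySession.respond(to: "Summarize my inbox", options: greedy)
let b = try await sampledSession.respond(to: "Summarize my inbox", options: lowTemp)
```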
@brandonhorst @alpennec And I assume it doesn’t guarantee that the answer will be any more “right”, correct? Or is the zero-temperature solution the most likely to be right? I guess it is.
@drewmccormack @alpennec It’ll be more “probable” haha. “Right” is in the eye of the beholder with these sorts of things
@brandonhorst @alpennec In my case, it should just do the logical thing. Given my use case is mostly search and automation, a low temp probably makes sense. Personality is not very important here.
@drewmccormack @brandonhorst the question is: can we always get the same result between sessions? Is it stable?
@alpennec @drewmccormack Yes, that is definitely true, until they update the model itself https://developer.apple.com/videos/play/wwdc2025/301?time=380
Deep dive into the Foundation Models framework - WWDC25 - Videos - Apple Developer
@brandonhorst @drewmccormack so if I provide the same input across multiple app launches, the result will be the same?
@brandonhorst @drewmccormack what does “IIUS” stand for, please?
@alpennec @drewmccormack Lol I meant to say “IIUC”, if I understand correctly, sorry
@brandonhorst @drewmccormack it seems the session needs to be in the same state. Does that mean a fresh session with only the same instructions can produce the same output?
@alpennec @brandonhorst My understanding is that it will be deterministic based on the whole conversation. So a particular set of prompts will lead to the same response.
@drewmccormack @brandonhorst even if the responses to the same prompts differ? This would mean the session transcript could contain the same instructions and the same prompts but different responses.
@alpennec @brandonhorst The responses are also part of the history. If you start a chain with a specific prompt, and continue exactly as last time, you should get the same responses each time in the .greedy case. (Although I have a feeling the temperature would also have to be 0, because that might relate to a different part of the LLM.)
@drewmccormack yes, since responses are part of the transcript, they would all need to be .greedy to get a stable session state. If one is not .greedy, it breaks the whole session, I guess.
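A minimal sketch of that reasoning, assuming no model update between runs (the instructions and prompts here are invented):

```swift
import FoundationModels

// Replay the same conversation from scratch. Because each response is
// appended to the transcript, every turn must be .greedy; one sampled
// turn changes the transcript, and everything after it can diverge.
func runConversation() async throws -> [String] {
    let session = LanguageModelSession(
        instructions: "You are a terse automation assistant."
    )
    let options = GenerationOptions(sampling: .greedy)

    var replies: [String] = []
    for prompt in ["Find invoices from May", "Export them as a PDF"] {
        let response = try await session.respond(to: prompt, options: options)
        replies.append(response.content)
    }
    return replies
}

// Calling runConversation() twice on the same model version should
// produce identical transcripts.
```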