Mastodawn

Drew McCormack Jun 17

What a difference a day makes: Was optimistic about Foundation Models yesterday, and today I think I know why they didn't ship the improved Siri. The local model really is pretty thick. I thought it would be capable of stringing together tool calls in a logical way, and sometimes it is, but other times it fails to understand. Exactly the same prompt will work one time, and fail the next. Sounds like what Apple was saying about the new Siri.

Axel Le Pennec Jun 17

@drewmccormack have you tried to use GenerationOptions(sampling: .greedy)?

https://developer.apple.com/documentation/foundationmodels/generationoptions/samplingmode/greedy

greedy | Apple Developer Documentation

A sampling mode that always chooses the most likely token.


Drew McCormack Jun 17

@alpennec I’ll look into it. What is the difference between this and a low temperature?

Axel Le Pennec Jun 17

@drewmccormack I haven’t used it enough to know, unfortunately.

Brandon Horst Jun 17

@alpennec @drewmccormack AFAIK it’s exactly the same as 0 temperature. It does make things deterministic (within any given model) but it will make certain use cases really boring.
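
A toy sketch of the relationship described here, written in plain Swift rather than against the Foundation Models API. The vocabulary and scores are made up, and the real framework may implement sampling differently. It only illustrates the general idea: greedy decoding takes the highest-scoring token, while temperature sampling draws from a softmax that concentrates on that same token as the temperature approaches zero.

```swift
import Foundation

// Toy next-token scores (logits) for a hypothetical three-word vocabulary.
let vocabulary = ["search", "open", "delete"]
let logits: [Double] = [2.0, 1.5, 0.1]

/// Softmax with temperature: a lower temperature sharpens the distribution.
func softmax(_ logits: [Double], temperature: Double) -> [Double] {
    let scaled = logits.map { $0 / temperature }
    let maxValue = scaled.max() ?? 0  // subtract the max for numerical stability
    let exps = scaled.map { exp($0 - maxValue) }
    let total = exps.reduce(0, +)
    return exps.map { $0 / total }
}

/// Greedy decoding: always pick the index of the largest logit.
func greedyIndex(_ logits: [Double]) -> Int {
    var best = 0
    for i in logits.indices where logits[i] > logits[best] { best = i }
    return best
}

let warm = softmax(logits, temperature: 1.0)   // probability spread over several tokens
let cold = softmax(logits, temperature: 0.01)  // nearly all probability on one token
print(vocabulary[greedyIndex(logits)], warm, cold)
```

At temperature 1.0 the second-best token still has a substantial chance of being drawn, which is one way the same prompt can lead to different outputs between runs.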

Drew McCormack Jun 17

@brandonhorst @alpennec And I assume it doesn’t guarantee that it will be any more “right”, correct? Or is the zero temp solution the most likely to be right? I guess it is.

Brandon Horst Jun 17

@drewmccormack @alpennec It’ll be more “probable” haha. “Right” is in the eye of the beholder with these sorts of things
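
One nuance behind "more probable" can be sketched with a toy example. The two-step model and all of its probabilities below are invented for illustration and have nothing to do with Apple's model. Greedy decoding picks the most likely token at each step, which is not necessarily the most likely sequence overall, let alone the "right" one.

```swift
// Hypothetical two-step model: probabilities for the first token, then for
// the second token given the first. All numbers are made up for illustration.
let firstStep: [String: Double] = ["A": 0.6, "B": 0.4]
let secondStep: [String: [String: Double]] = [
    "A": ["x": 0.55, "y": 0.45],
    "B": ["z": 0.9, "w": 0.1],
]

/// Most likely entry of a distribution (the greedy choice for one step).
func mostLikely(_ distribution: [String: Double]) -> (token: String, probability: Double) {
    let best = distribution.max { $0.value < $1.value }!
    return (best.key, best.value)
}

// Greedy: commit to the best first token, then the best second token.
let g1 = mostLikely(firstStep)
let g2 = mostLikely(secondStep[g1.token]!)
let greedySequence = (tokens: [g1.token, g2.token], probability: g1.probability * g2.probability)

// Exhaustive search: score every two-token sequence and keep the best.
var bestSequence = (tokens: [String](), probability: 0.0)
for (firstToken, p1) in firstStep {
    for (secondToken, p2) in secondStep[firstToken]! where p1 * p2 > bestSequence.probability {
        bestSequence = (tokens: [firstToken, secondToken], probability: p1 * p2)
    }
}

print("greedy:", greedySequence, "best overall:", bestSequence)
```

Here greedy commits to "A" because it leads at the first step, even though the sequence starting with "B" ends up more probable as a whole.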

Drew McCormack Jun 17

@brandonhorst @alpennec In my case, it should just do the logical thing. Given my use case is mostly search and automation, a low temp probably makes sense. Personality is not very important here.

Axel Le Pennec Jun 17

@drewmccormack @brandonhorst the question is: can we always get the same result across sessions? Is it stable?

Brandon Horst Jun 17

@alpennec @drewmccormack Yes, that is definitely true, until they update the model itself https://developer.apple.com/videos/play/wwdc2025/301?time=380

Deep dive into the Foundation Models framework - WWDC25 - Videos - Apple Developer

Level up with the Foundation Models framework. Learn how guided generation works under the hood, and use guides, regexes, and generation...


Axel Le Pennec Jun 17

@brandonhorst @drewmccormack so if I provide the same input between multiple app launches, the result provided will be the same?

Brandon Horst Jun 17

@alpennec @drewmccormack IIUS yes

Axel Le Pennec Jun 17

@brandonhorst @drewmccormack what does "IIUS" stand for, please?

Brandon Horst Jun 17

@alpennec @drewmccormack Lol I meant to say “IIUC”, if I understand correctly, sorry

Axel Le Pennec Jun 18

@brandonhorst @drewmccormack it seems the session needs to be in the same state. Does that mean only a fresh session with the same instructions can produce the same output?

Drew McCormack Jun 18

@alpennec @brandonhorst My understanding is that it will be deterministic based on the whole conversation. So a particular set of prompts will lead to the same response.

Axel Le Pennec Jun 18

@drewmccormack @brandonhorst even if the responses to the same prompts differ? This would mean the session transcript could contain the same instructions and the same prompts but different responses.

Drew McCormack Jun 18

@alpennec @brandonhorst The responses are also part of the history. If you start a chain with a specific prompt, and continue exactly as last time, you should get the same responses each time in the .greedy case. (Although, I have a feeling temperature would have to be 0, because that might relate to a different part of the LLM.)

Axel Le Pennec Jun 18

@drewmccormack yes, as responses are part of the transcript, they would all need to be generated with .greedy to get a stable session state. If one is not, it breaks the rest of the session, I guess.
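
The transcript argument can be sketched with a toy stand-in for a model. Everything below is hypothetical (the canned replies, the checksum, the `run` helper) and none of it is the Foundation Models API. The only assumption carried over from the thread is that a greedy reply is a pure function of the whole transcript so far.

```swift
// Toy stand-in for a language model: the reply is a pure function of the
// whole transcript so far. This is NOT the Foundation Models API.
let cannedReplies = ["Searching notes.", "Opening file.", "Creating reminder.", "Nothing found."]

/// Stable checksum over every entry in the transcript (no per-process seed).
func checksum(_ transcript: [String]) -> Int {
    var sum = 0
    for entry in transcript {
        for scalar in entry.unicodeScalars {
            sum = (sum * 31 + Int(scalar.value)) % 1_000_003
        }
    }
    return sum
}

/// "Greedy" reply: the same transcript always yields the same reply.
func greedyReply(transcript: [String]) -> String {
    cannedReplies[checksum(transcript) % cannedReplies.count]
}

/// Replays a list of prompts, appending each prompt and reply to the transcript.
/// `overrides` simulates a turn that was not greedy and produced some other text.
func run(prompts: [String], overrides: [Int: String] = [:]) -> [String] {
    var transcript = ["instructions: be helpful"]
    for (turn, prompt) in prompts.enumerated() {
        transcript.append(prompt)
        transcript.append(overrides[turn] ?? greedyReply(transcript: transcript))
    }
    return transcript
}

let prompts = ["find my tax notes", "open the newest one"]
let firstRun = run(prompts: prompts)
let secondRun = run(prompts: prompts)  // identical replay
let divergedRun = run(prompts: prompts, overrides: [0: "Sure, one moment!"])
print(firstRun == secondRun, firstRun == divergedRun)
```

Two identical replays produce identical transcripts. Once one turn is replaced by a different reply, every later turn is conditioned on a different history, so later replies are no longer guaranteed to match.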