Car Wash Test on 53 leading AI models: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

https://lemmy.world/post/43503268

A screenshot of this question was making the rounds last week, but this article covers testing it against all the well-known models out there. It also includes outtakes on the ‘reasoning’ models.

We poked fun at this meme, but it goes to show that an LLM is still like a child that needs to be taught to make implicit assumptions and possess contextual knowledge. Current LLMs need a lot more input and instruction to do specifically what you want them to do, like a child.
LLMs are not children. Children can have experiences, learn things, know things, and grow. Spicy autocomplete will never actually do any of these things.
I like the idea of referring to LLMs as “spicy autocomplete”.