So, you think AI is ready to write all of your code for you? Ok, start by asking it “There is a car wash 100 meters away. I need to wash my car. Should I walk or drive?”
@macmanx response, well argued. "You'll want to drive — the whole point is to get your car there so it can be washed! Even though 100 meters is a very short walk, your car can't wash itself at home. Drive it over, and enjoy the short trip."
@bph I’ve gotten Claude and Gemini to respond correctly, but GPT, Copilot, and even Gemma (Google’s local model) fail big time, suggesting that walking is the most efficient option.

@macmanx Lumo disappoints, too. Quite elaborate for a wrong answer 🤦‍♀️

"At just 100 m, walking will almost always beat driving. Even if you could roll out the door and zip over in a few seconds, you still have to spend time getting into the car, starting it, navigating any curb or parking spot, and then getting back out again. All that adds up to more hassle than a quick stroll.

So, lace up your shoes (or just hop on foot) and head over—it's faster, simpler, and you’ll save a bit of fuel too."

@bph I feel like I could write a book titled “Lumo Disappoints” at this point. 😅
@macmanx I am in the same boat. Got quite a few wrong answers lately. And it has a passive-aggressive tone once you point out the wrongness. And its context window is really short.
@bph Yeah, overall Claude 4.5 is doing well for a remote model, and Gemma 3 is doing well for a locally hosted model (it shows its work quite well, so hallucinations are easier to spot).
@macmanx yeah, totally agree. I don't use Lumo unless I am on my phone... Claude Opus 4.5 or soon 4.6 are the workhorses here, too.