"The bottom line from Apple’s research is stark: we’re not witnessing the birth of AI reasoning.

We’re seeing the limits of very expensive autocomplete that breaks when it matters most."

Damning proof from Apple researchers that the hype from big tech surrounding #AI is an expensive illusion.

https://medium.com/@ninza7/apple-just-pulled-the-plug-on-the-ai-hype-heres-what-their-shocking-study-found-24ad42c234a0

#tech

Apple Just Pulled the Plug on the AI Hype. Here’s What Their Shocking Study Found

New research reveals that today’s “reasoning” models aren’t thinking at all. They’re just sophisticated pattern-matchers that completely…

Medium
@eosfpodcast I'm actually glad that Apple is taking such a measured approach in this area, even if it's affecting their stock price negatively (full disclosure... I own a few shares). I still wonder exactly how it's going to really improve our lives. Apple seems to be compartmentalizing the capabilities into small, useful features, many of them not even obviously “AI”. Seems pretty sensible to me. I don't need deep fakes, or a machine writing things for me. Let me do the art and humanities stuff, and make an AI that will do my laundry, or clean my home.

“AI -- Fun until it's not”
@eosfpodcast uhm, told you so?
@mirabilos @eosfpodcast I read the summary and was immediately left thinking, "Shocking to whom, exactly?"
@eosfpodcast @danirabbit Since you have to log in to Medium to actually read the article, and the intro text and title seemed like clickbait-y hype, I was skeptical. Here's a link to the actual research: https://machinelearning.apple.com/research/illusion-of-thinking
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

Recent generations of frontier language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes…

Apple Machine Learning Research
A knockout blow for LLMs?

LLM “reasoning” is so cooked they turned my name into a verb

Marcus on AI
@danirabbit @ramsey I read the original paper this week. It's very well researched and confirms most of my views on LLMs and LRMs.

@ramsey @eosfpodcast @danirabbit Thank you.

I can’t stand it when people post email-required links.

@eosfpodcast
If someone would like to see the text without login: https://archive.ph/ASo9a
@eosfpodcast this is some king's-new-clothes stuff right here.

@eosfpodcast It's why I don't use auto-correction in messages, and it's not even AI. It made me lazy and made me stop caring about spelling, because I was expecting the system to fix it for me (I'm not a native English speaker, though given how some natives misuse "there" and "their", maybe they should disable it too).

AI chatbots are not much different: you ask them and they spit out some BS. People don't search and investigate things anymore, learning along the way; they just expect an answer, correct or not.

@eosfpodcast TIL that Apple is to AI what Toyota is to BEV cars.

@eosfpodcast I see we're now in the "of *course* you can't expect this skill from an LLM, that's not what they're good at" stage of the discourse. This sort of remark is also popping up a lot in comments to people posting links to that article about how an old Atari console chess game from almost 50 years ago stomped ChatGPT recently.

This observation is of course totally correct. LLMs are genuinely not good for this sort of thing. They can't play chess very well (certainly not as well as just about any off-the-shelf dedicated chess program from anytime in the last several decades), and they can't actually reason about anything, instead just printing text that *looks* like what a reasoning person might do or think. And, of course, people critical of "generative AI" have been pointing this out for years.

Pointing this out is neither trite nor useless.

When OpenAI literally advertises "Learn something new. Dive into a hobby. Answer complex questions." as a sensible use case for ChatGPT (https://openai.com/chatgpt/overview/), people will expect behavior like this from an LLM.

When Google uses its LLM, Gemini, not just as a search adjunct (where it might be at least *somewhat* useful, since it could plausibly associate your request with actual web pages' text) but as a tool for creating factual summaries of information on the web, people will expect behavior like this from an LLM.

When the entire first *year* of ChatGPT hype claimed things like showing "sparks of artificial general intelligence" (https://www.microsoft.com/en-us/research/publication/sparks-of-artificial-general-intelligence-early-experiments-with-gpt-4/), people will expect behavior like this from an LLM.

When every vendor out there (Anthropic, OpenAI, Microsoft, you name 'em) aggressively markets their models as "reasoning" models, people will expect behavior like this from an LLM.

There are lots of people who have no clue about the limitations of this flavor of "AI", precisely because it has been -- and still is -- hyped so aggressively in counterproductive and misleading ways.

@eosfpodcast

The goal is not to provide answers and information

The goal is to DESTROY information.

@eosfpodcast Unfortunately, humans are pattern matchers as well, and humans demonstrate a lack of reasoning ability constantly. Case in point: religion(s)! When the AI tells you it has faith in the nonsense it is spewing, case closed!

@eosfpodcast This is why the EU shouldn't try to keep up with US companies on AI, but should focus on sovereignty, security, and privacy, taking back control over our own data.

@kimvsparrentak

@eosfpodcast I mean if anyone knows expensive illusion, it’s Apple. 🤣🤣

@eosfpodcast

Yup, that Google Colab notebook I just wrote using a simple prompt and a math equation, to help a coworker understand a very difficult concept, and which would have probably taken me a whole day to write traditionally, is an illusion.

@BlueBee @eosfpodcast that's not reasoning, though, that's just another example of the machine filling in details extrapolated from its training data.

The point of this study is to illustrate that the model has no *comprehension* of what it's doing.

@xale @eosfpodcast

That's called moving the goalposts.

@BlueBee @eosfpodcast How so? The paper explicitly discusses "reasoning" models, not generative behavior. I'm not saying that the model didn't do what *you're* claiming it did, only that you're not describing the same thing the paper is.

@xale @eosfpodcast

Okay so I'm reading the paper now.

So far it seems like they are saying... When you use a language model wrong, it performs poorly.

Which... Duh.

For example, you can't ask the language model itself to do a math equation. It will get it wrong; that's just not within its abilities. But you can ask a language model to write a program that solves the equation.

The model itself cannot do the calculation. This is known.

This is why modern models do some trickery on the back end where they use traditional tools to validate conclusions. Like having a calculator or a Python interpreter.
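That pattern can be sketched in a few lines of Python. Note this is a toy illustration, not any vendor's actual API: the `CALC(...)` convention here is invented for the example, whereas real systems use structured function-calling protocols. The idea is the same, though: the host evaluates the model's arithmetic with a real interpreter instead of trusting the model's own output.

```python
import ast
import operator

# Toy "calculator tool": safely evaluate an arithmetic expression the
# model emits, instead of trusting the model's own arithmetic.
OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_eval(expr: str):
    """Evaluate +, -, *, /, ** over numeric literals; reject anything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def answer_with_tool(model_reply: str) -> str:
    # Hypothetical convention: the model wraps math in CALC(...), and the
    # host evaluates it with a real interpreter before replying.
    if model_reply.startswith("CALC(") and model_reply.endswith(")"):
        return str(safe_eval(model_reply[5:-1]))
    return model_reply  # plain text passes through unchanged

print(answer_with_tool("CALC(12345 * 6789)"))  # → 83810205, exact every time
```

The division of labor is the point: the model only has to produce a well-formed expression (something pattern-matching is good at), and the deterministic tool guarantees the arithmetic is correct.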

I'm just annoyed at everyone still thinking this stuff is a fad and that it's going to go away. It's blindingly obvious it's not going away. I think we should be more focused on supporting those who will make models we can use, instead of letting two giant companies keep the whole thing to themselves and use it to brainwash us all.

Because I would really love to run local models and build cool workflows for people instead of talking directly to Google and OpenAI.

Allowing two companies to have a monopoly on the thinking device you plug into your brain (one they can shut off) seems like a bad idea.

@eosfpodcast @BlueBee Great use case! And you were right there in case it went off track. But a very limited scope.

@eosfpodcast

This is clearly the beginning of a new era... Good. I'm just waiting to see all the negative comments that Apple is cooked, has lost it, and will fade, while the AI hypers are the ones headed for the dustbin of history. I need a new phone and I might consider Apple now.

@eosfpodcast

"New research reveals that today’s “reasoning” models aren’t thinking at all. They’re just sophisticated pattern-matchers"

No. No, new research does not show that. Everyone who had the least bit of understanding already knew that this was what was going on. But I'm glad to hear that Apple has "realized it"

If anything, I would say this is how they frame it so as not to seem like idiots for having bet so much on this horse for so long.

@eosfpodcast what is their commercial interest to post this? Do they imply that their own vomit emitters are better?
@eosfpodcast And yet they're using it for everything but autocorrect!
@eosfpodcast thankfully they didn't decide to try and bilk a few billion dollars from investors. Guess they figured they would lose more in the fallout.
@eosfpodcast But will they let me delete “Image Playground.app”? No.
@eosfpodcast I rarely agree with Apple, but well..... yes.
I hope this reaches more common folk as well, other than just us nerds :D