A new #AI paper confirms a component of #StrategicReflectivism: when accuracy is relatively easy for an intelligent system, reflective inference isn't worth its added tokens (https://doi.org/10.48550/arXiv.2505.22987).

See CODA (Compute Allocation by Difficulty Awareness): https://doi.org/10.48550/arXiv.2603.08659

As recommended in #StrategicReflectivism (https://doi.org/10.48550/arXiv.2505.22987), #AI models can increase efficiency by tactically reflecting on initial answers.

There are a few ways to do this with #LLMs.

Kim et al. recently tested "metacognitive behavioral tuning": https://doi.org/10.48550/arXiv.2602.22508

A new #AI paper found what I predicted a while ago.

In #StrategicReflectivism, I argued that having #LLMs reflect on low-confidence or high-uncertainty outputs of small models can increase accuracy and decrease cost: https://doi.org/10.48550/arXiv.2505.22987

🔒 The result: https://ieeexplore.ieee.org/abstract/document/11393760
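
The routing idea in the post above (reflect only when the small model's output is low-confidence) can be sketched roughly as follows. This is a minimal illustration, not the method of any linked paper: `strategic_answer`, `Routed`, the `0.8` threshold, and the toy stand-in models are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Routed:
    answer: str      # the answer that is finally returned
    reflected: bool  # True if the costly reflective step was used


def strategic_answer(
    question: str,
    fast: Callable[[str], Tuple[str, float]],  # returns (draft answer, confidence in [0, 1])
    reflect: Callable[[str, str], str],        # revises a draft answer
    threshold: float = 0.8,                    # assumed cutoff, not taken from any paper
) -> Routed:
    """Keep the cheap answer when confidence is high; otherwise pay for reflection."""
    draft, confidence = fast(question)
    if confidence >= threshold:
        return Routed(draft, reflected=False)
    return Routed(reflect(question, draft), reflected=True)


# Toy stand-ins for a small model and a reflective pass (hypothetical behavior).
def toy_fast(question: str) -> Tuple[str, float]:
    return ("4", 0.95) if question == "2+2" else ("7", 0.40)


def toy_reflect(question: str, draft: str) -> str:
    return "8" if question == "3+5" else draft


print(strategic_answer("2+2", toy_fast, toy_reflect))  # confident: no reflection
print(strategic_answer("3+5", toy_fast, toy_reflect))  # uncertain: reflection runs
```

In practice the confidence signal and the threshold would have to be chosen and calibrated per model and task; the sketch only shows where the switch sits.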

Does additional reflective thinking improve #AI reasoning models' math decisions?

Actually, additional reflection improved initial (more than final) answers: https://doi.org/10.48550/arXiv.2510.08308

More reason for #StrategicReflectivism: https://doi.org/10.48550/arXiv.2505.22987

#epistemology #cogSci #philMind

Yet another paper showing dual-minded #LLMs (intuitive + reflective) can improve accuracy-cost tradeoffs: https://doi.org/10.48550/arXiv.2504.12329

As I argue in #StrategicReflectivism, pragmatic switching between the two modes is key to intelligent systems: https://www.researchgate.net/publication/390166382

#AI #cogSci

Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time

Recent advances leverage post-training to enhance model reasoning performance, which typically requires costly training pipelines and still suffers from inefficient, overly lengthy outputs. We introduce Speculative Thinking, a training-free framework that enables large reasoning models to guide smaller ones during inference at the reasoning level, distinct from speculative decoding, which operates at the token level. Our approach is based on two observations: (1) reasoning-supportive tokens such as "wait" frequently appear after structural delimiters like "\n\n", serving as signals for reflection or continuation; and (2) larger models exhibit stronger control over reflective behavior, reducing unnecessary backtracking while improving reasoning quality. By strategically delegating reflective steps to a more capable model, our method significantly boosts the reasoning accuracy of reasoning models while shortening their output. With the assistance of the 32B reasoning model, the 1.5B model's accuracy on MATH500 increases from 83.2% to 89.4%, marking a substantial improvement of 6.2%. Simultaneously, the average output length is reduced from 5439 tokens to 4583 tokens, representing a 15.7% decrease. Moreover, when applied to a non-reasoning model (Qwen-2.5-7B-Instruct), our framework boosts its accuracy from 74.0% to 81.8% on the same benchmark, achieving a relative improvement of 7.8%.

arXiv.org
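
A quick arithmetic check on the figures quoted in the abstract above, as a sketch: the helper names `point_gain` and `relative_change` are mine, not the paper's. Computed this way, the reported accuracy gains (6.2 and 7.8) match percentage-point differences, while the 15.7% length reduction matches a relative change.

```python
def point_gain(before: float, after: float) -> float:
    """Absolute difference, in percentage points when inputs are percentages."""
    return after - before


def relative_change(before: float, after: float) -> float:
    """Change relative to the starting value, expressed in percent."""
    return (after - before) / before * 100


# 1.5B model on MATH500: 83.2% -> 89.4%
print(round(point_gain(83.2, 89.4), 1))       # 6.2 (percentage points)
print(round(relative_change(83.2, 89.4), 1))  # 7.5 (percent, relative)

# Average output length: 5439 -> 4583 tokens
print(round(relative_change(5439, 4583), 1))  # -15.7 (percent, relative)

# Qwen-2.5-7B-Instruct on MATH500: 74.0% -> 81.8%
print(round(point_gain(74.0, 81.8), 1))       # 7.8 (percentage points)
print(round(relative_change(74.0, 81.8), 1))  # 10.5 (percent, relative)
```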

My next #CogSci paper should be about how #LLMs get us from #BoundedReflectivism (https://doi.org/10.1111/meta.12534) to #StrategicReflectivism.

#AI research shows PRAGMATIC SHIFTING between single (intuitive) and dual (reflective) models outperforms either kind of model on its own.

#epistemology #PhilMind