Thinking into the Future: Latent Lookahead Training for Transformers

This paper was accepted at the Workshop on Latent & Implicit Thinking – Going Beyond CoT Reasoning 2026 at ICLR.

Autoregressive language models trained with next-token prediction generate text by sampling one discrete token at a time. Although very scalable, this objective forces the model to commit at every step, preventing it from exploring or reflecting upon…
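For reference, the next-token prediction objective the excerpt describes is the standard shifted cross-entropy loss. The sketch below illustrates that baseline only, not the paper's latent lookahead training; the function and variable names are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """Standard next-token prediction: logits at position t predict the token at t+1.

    logits: (batch, seq_len, vocab_size) model outputs
    tokens: (batch, seq_len) input token ids
    """
    # Positions 0..seq_len-2 predict tokens 1..seq_len-1.
    pred = logits[:, :-1, :].reshape(-1, logits.size(-1))
    target = tokens[:, 1:].reshape(-1)
    return F.cross_entropy(pred, target)

# Toy example with random data, just to show the shapes involved.
logits = torch.randn(2, 8, 100)
tokens = torch.randint(0, 100, (2, 8))
loss = next_token_loss(logits, tokens)
```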

https://machinelearning.apple.com/research/latent-lookahead

Apple Machine Learning Research