This account is a replica from Hacker News; its author can't see replies here.
Actually not surprised.
I suspect this works for the same reason "say it twice" [1] works. Because LLMs are trained as causal language models, past tokens cannot attend to future tokens.
One extra copy of the layer set solves this.
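A minimal sketch of that causal-mask argument in numpy (the prompt length `n` here is hypothetical, just for illustration): with a lower-triangular attention mask, a token in the first copy of a prompt cannot attend forward to the rest of the prompt, but every token in a repeated second copy can attend back over the entire first copy.

```python
import numpy as np

# Causal attention mask over a prompt of length n repeated twice.
# mask[i, j] == True means token i may attend to token j.
n = 4                       # hypothetical prompt length
seq_len = 2 * n             # prompt repeated twice
mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))

# A token early in the first copy cannot see the rest of the prompt...
assert not mask[0, n - 1]

# ...but every token in the second copy sees the whole first copy,
# so each prompt token can be re-encoded with full-prompt context.
assert mask[n:, :n].all()
```

This is only an illustration of the masking pattern, not the mechanism of any specific model.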
[1] Prompt Repetition Improves Non-Reasoning LLMs — https://arxiv.org/html/2512.14982v1