Mastodawn

"Across multiple coding agents and LLMs, we find that context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%"

I've suspected this all along. Folks spending mucho-plenty time curating project-level .md files have been deluding themselves that it helps.

https://arxiv.org/abs/2602.11988

Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?

A widespread practice in software development is to tailor coding agents to repositories using context files, such as AGENTS.md, by either manually or automatically generating them. Although this practice is strongly encouraged by agent developers, there is currently no rigorous investigation into whether such context files are actually effective for real-world tasks. In this work, we study this question and evaluate coding agents' task completion performance in two complementary settings: established SWE-bench tasks from popular repositories, with LLM-generated context files following agent-developer recommendations, and a novel collection of issues from repositories containing developer-committed context files. Across multiple coding agents and LLMs, we find that context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%. Behaviorally, both LLM-generated and developer-provided context files encourage broader exploration (e.g., more thorough testing and file traversal), and coding agents tend to respect their instructions. Ultimately, we conclude that unnecessary requirements from context files make tasks harder, and human-written context files should describe only minimal requirements.

arXiv.org

Show thread

Stefan Arentz Mar 3

@jasongorman I am confused about this. I have a basic CLAUDE.md and also some CODING_GUIDELINES.md that describe how I would like the generated Go code to look. If I do not include these, those instructions are not followed and I have to specify these things every time I start a new task. Are you saying I should not do these things at all? Or is there a better way?

Show thread

Jason Gorman Mar 3

@st3fan Task-specific context files

Show thread

Stefan Arentz Mar 3

@jasongorman pretty much every task is me asking Claude Code to write code for which I want to give the same guidance. . I can move these instructions into a skill but then instead of letting the agent read my once at the beginning of a day it will read it every time I start the “work on a feature” skill. Which seems less optimal re token spend?

Show thread

Jason Gorman Mar 4

@st3fan If it only reads it at the start of the day, does that mean you're letting the context run for the whole day?

Show thread

Stefan Arentz Mar 5

@jasongorman I don't think that is actually correct. I am pretty sure that Claude will use your CLAUDE.md files when you start a new session or when you /clear or /compact - i will test this but I am pretty confident they become part of the always present system prompt.

I do use long running sessions but work in smaller features / changes .. I find it helps a lot to keep a lot of context alive. (Also costs more tokens)

Show thread

Jason Gorman

@st3fan LLMs are stateless. That context will have to be fed into the model with every interaction. It's only a "session" because the client (e.g. Claude Code) maintains state client-side.