"Across multiple coding agents and LLMs, we find that context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%"

I've suspected this all along. Folks spending mucho-plenty time curating project-level .md files have been deluding themselves that it helps.

https://arxiv.org/abs/2602.11988

Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?

A widespread practice in software development is to tailor coding agents to repositories using context files, such as AGENTS.md, by either manually or automatically generating them. Although this practice is strongly encouraged by agent developers, there is currently no rigorous investigation into whether such context files are actually effective for real-world tasks. In this work, we study this question and evaluate coding agents' task completion performance in two complementary settings: established SWE-bench tasks from popular repositories, with LLM-generated context files following agent-developer recommendations, and a novel collection of issues from repositories containing developer-committed context files. Across multiple coding agents and LLMs, we find that context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%. Behaviorally, both LLM-generated and developer-provided context files encourage broader exploration (e.g., more thorough testing and file traversal), and coding agents tend to respect their instructions. Ultimately, we conclude that unnecessary requirements from context files make tasks harder, and human-written context files should describe only minimal requirements.

arXiv.org

The best results I've managed to get were when I kept contexts small and task-specific, solving one problem at a time. They can pay attention (literally) to surprisingly little at a time.
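To make that concrete, here's a rough sketch of what a small, task-scoped context might look like, as opposed to a sprawling whole-repo AGENTS.md. The file name, paths, and commands are all hypothetical, purely to illustrate the "one problem at a time, minimal requirements" idea:

```markdown
<!-- hypothetical task-scoped context, not a whole-repo AGENTS.md -->
# Task: fix the flaky date parsing in utils/dates.py

- Only touch `utils/dates.py` and `tests/test_dates.py`.
- Run `pytest tests/test_dates.py` before and after the change.
- Do not reformat or refactor unrelated code.
```

Nothing about architecture, coding style, or project history: just the constraints that matter for this one change, which also lines up with the paper's closing advice that human-written context files should describe only minimal requirements.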

And summaries of the code are out of date the moment the tool starts changing it.

Just like in real life :-)

This is a big tick for some of my AI-Ready Software Developer principles.

But, heck, it just seems so obvious!

https://codemanship.wordpress.com/2025/10/28/the-ai-ready-software-developer-12-ground-truth/

The AI-Ready Software Developer #12 – Ground Truth

When Large Language Models hit the headlines in late 2022, with much speculation about impending Artificial General Intelligence (AGI) and the displacement of hundreds of millions of knowledge work…

Codemanship's Blog

Well, this has stirred a hornet's nest. Some folks luuurve their project context files!

"Yes, I see the hard data, Jason. But in my experience..."

@jasongorman
My thesis is that it's an Illusion of Control problem. The context file gives you the illusion that you have control. Changing it changes the output. Accepting that the changes are basically random takes your only tool of "control" away.
It reminds me of the story that people would rather keep using a map of the wrong city than give up the map and navigate with no map at all 😬
@realn2s I think you're very probably right. The problem with probabilistic systems that *seem* to understand us is that we very easily fool ourselves. Confirmation bias is very much in play.

@jasongorman @realn2s Jason, I quote your comment about seeing the face of Jesus in a piece of toast all the time.

Turing, apparently, thought rather too highly of human intelligence.