RE: https://mastodon.social/@mgifford/116106634552309135

(I asked Mike on LinkedIn.)

Has anyone tried this, measured outcomes?

Asking because:
https://arxiv.org/abs/2602.11988

It found AGENTS.md “files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%.”

#accessibility #a11y

ACCESSIBILITY.md | Mike Gifford, CPWA

I have seen this problem repeatedly when talking to developers, maintainers, and accessibility practitioners: projects often lack a clear, living statement of how accessibility is handled. That gap is why I drafted ACCESSIBILITY.md guidance: https://lnkd.in/gpRZqKfS

Why should projects have an ACCESSIBILITY.md? Because it makes a project’s accessibility commitments explicit, discoverable, and actionable for contributors and users.

How can it help both humans and machines? For humans, it clarifies expectations, processes, and responsibilities. For machines, having a consistent, structured file enables automation, tooling, and compliance scanning to integrate accessibility into workflows rather than leave it to chance.

This is a community effort. The initial draft was created with AI under my direction; I shaped the intent, structure, and editing. I want practical feedback. What’s missing? What should be clearer? What formats or examples would make this genuinely useful for real projects?


@aardrian The methodology doesn't lend itself to practical application per se (it's an experimental design), and there's no discussion section (which is where you'd propose practical and theoretical approaches for follow-up).

Taking the study as-is, devs should manually write/test their context files before use; otherwise they risk increased failures and costs. But the study needs additional robustness; other contexts would have to be further vetted (which the authors do note; one size doesn't fit all).

@justin Yeah, I figure experimental design lacks the rigor of real-world use informed by, well, real-world use.
@aardrian It's not so much lacking real-world rigor as attempting to test behavior and theorize a means of measurement against a sample (in this case, repos/contexts on GitHub). And while the study methodology may not work out of the box for practitioner use cases (not its purpose, obviously), it does highlight an interesting paradox to consider when writing such instructions in a repo (e.g., that the weight of those instructions has non-positive effects on outcomes for certain tasks).