In « Promises, Perils, and (Timely) Heuristics for Mining Coding Agent Activity » (#openaccess preprint at: https://hal.science/hal-05487636) we review the potential and risks of mining agent-based coding activities in empirical #SoftwareEngineering #research. To be presented at the #MSR2026 conference this summer.
Promises, Perils, and (Timely) Heuristics for Mining Coding Agent Activity
In 2025, coding agents saw very rapid adoption. They leverage Large Language Models (LLMs) in ways that are markedly different from LLM-based code completion, making their study critical. Moreover, unlike LLM-based completion, coding agents leave visible traces in software repositories, enabling the use of Mining Software Repositories (MSR) techniques to study their impact on software engineering (SE) practices. This paper documents the promises, perils, and heuristics that we have gathered from studying coding agent activity on GitHub.
