This makes me so sad. In the year 2026. The Fritzbox shows the improvements and new features for the repeater update and botches the text encoding as if it were 1999.😩
Show HN: Agent Postmortem Skill – Force AI coding agents to prove their work
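agent-postmortem-skill is an open-source verification tool that forces AI coding agents to present real evidence when they claim a task is complete. It collects hard signals such as git status, diffs, and command execution results to verify whether the work is actually done, blocks false "done" states up front, and standardizes quality. It is compatible with any coding agent that can run shell commands and check git status, and it generates a post-task verification report that can be shared and audited. It does not replace CI; it provides focused false-claim detection for what the agent says about its work.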
https://github.com/plus8bit/agent-postmortem-skill
#aiagent #verification #softwarequality #opensource #postmortem
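A minimal sketch of the idea in the post above, collecting hard signals before accepting a "done" claim. This is not the tool's actual implementation: the names `run_command`, `collect_evidence`, and `unsupported_reasons` are hypothetical, and treating uncommitted changes as a reason for doubt is an assumed policy.

```python
import subprocess
from typing import Callable, Dict, List

Evidence = Dict[str, object]


def run_command(args: List[str]) -> Evidence:
    """Run a command and record its exit code and combined output."""
    proc = subprocess.run(args, capture_output=True, text=True)
    return {
        "cmd": " ".join(args),
        "exit_code": proc.returncode,
        "output": (proc.stdout + proc.stderr).rstrip(),
    }


def collect_evidence(
    test_cmd: List[str],
    run: Callable[[List[str]], Evidence] = run_command,
) -> Dict[str, Evidence]:
    """Gather git state and test results; `run` is injectable for testing."""
    return {
        "status": run(["git", "status", "--porcelain"]),
        "diff": run(["git", "diff", "--stat"]),
        "tests": run(test_cmd),
    }


def unsupported_reasons(evidence: Dict[str, Evidence]) -> List[str]:
    """Return reasons the 'done' claim is not supported (empty list = supported)."""
    reasons = []
    if evidence["status"]["exit_code"] != 0:
        reasons.append("git status failed")
    elif evidence["status"]["output"]:
        # Assumed policy: leftover uncommitted changes cast doubt on "done".
        reasons.append("uncommitted changes remain")
    if evidence["tests"]["exit_code"] != 0:
        reasons.append("test command failed")
    return reasons
```

Calling `unsupported_reasons(collect_evidence(["pytest", "-q"]))` inside a repository would return an empty list only when the tree is clean and the test command exits with 0; the test command itself is an assumption and would differ per project.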
Debt Behind the AI Boom: A Large-Scale Study of AI-Generated Code in the Wild
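This paper analyzes at scale whether code generated by AI coding assistants in real-world software development causes long-term technical debt. Tracking more than 300,000 AI-generated commits across 6,299 GitHub repositories, it found more than 480,000 issues, including code smells, correctness problems, and security issues, and confirmed that 22.7% of them remain unresolved in the latest version. The findings suggest that AI-generated code contributes to productivity but also brings the challenge of higher quality-assurance and maintenance costs.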
https://arxiv.org/abs/2603.28592
#aigeneratedcode #technicaldebt #softwarequality #github #codeanalysis

AI coding assistants are now widely used in software development. Software developers increasingly integrate AI-generated code into their codebases to improve productivity. Prior studies have shown that AI-generated code may contain code quality issues under controlled settings. However, we still know little about the real-world impact of AI-generated code on software quality and maintenance after it is introduced into production repositories. In other words, it remains unclear whether such issues are quickly fixed or persist and accumulate over time as technical debt. In this paper, we conduct a large-scale empirical study on the technical debt introduced by AI coding assistants in the wild. To achieve that, we built a dataset of 302.6k verified AI-authored commits from 6,299 GitHub repositories, covering five widely used AI coding assistants. For each commit, we run static analysis before and after the change to precisely attribute which code smells, correctness issues, and security issues the AI introduced. We then track each introduced issue from the introducing commit to the latest repository revision to study its lifecycle. Our results show that we identified 484,366 distinct issues, and that code smells are by far the most common type, accounting for 89.3% of all issues. We also find that more than 15% of commits from every AI coding assistant introduce at least one issue, although the rates vary across tools. More importantly, 22.7% of tracked AI-introduced issues still survive at the latest version of the repository. These findings show that AI-generated code can introduce long-term maintenance costs into real software projects and highlight the need for stronger quality assurance in AI-assisted development.
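The before/after attribution and lifecycle tracking described in the abstract can be sketched as set operations. This is an illustration under assumptions, not the paper's pipeline: the `Issue` fields and the idea of matching issues by rule, path, and a normalized snippet (rather than line numbers) are hypothetical choices, and the abstract does not say how issues are matched across revisions.

```python
from typing import NamedTuple, Set


class Issue(NamedTuple):
    rule: str     # analyzer rule id, e.g. "unused-variable" (hypothetical)
    path: str     # file the issue was reported in
    snippet: str  # normalized offending code, so line shifts do not matter


def introduced_issues(before: Set[Issue], after: Set[Issue]) -> Set[Issue]:
    """Issues present after the commit but not before are attributed to it."""
    return after - before


def survival_rate(introduced: Set[Issue], latest: Set[Issue]) -> float:
    """Fraction of introduced issues still reported at the latest revision."""
    if not introduced:
        return 0.0
    return len(introduced & latest) / len(introduced)


# Toy data: one pre-existing issue, two introduced by the commit.
old = Issue("long-method", "app.py", "def handle(...)")
new_a = Issue("unused-variable", "app.py", "tmp = 1")
new_b = Issue("bare-except", "util.py", "except:")

introduced = introduced_issues({old}, {old, new_a, new_b})
rate = survival_rate(introduced, latest={old, new_b})
print(sorted(i.rule for i in introduced), rate)
```

In this toy case the commit is charged with two issues and one of them survives, so the rate is 0.5; the 22.7% figure in the abstract is the same kind of ratio computed over all tracked issues.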
やねうら王 (@yaneuraou)
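A post was shared about a case in which a coding AI found bugs in a shogi AI one after another. AI is moving toward automatically uncovering quality problems in other AI systems, a meaningful trend for software quality improvement and verification automation.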
https://x.com/yaneuraou/status/2048598232440426544
#codingai #bugfinding #shogi #softwarequality #aiverification
Spent an hour today reviewing code the AI agents wrote this week. More drift than I expected. Not wrong, just... inconsistent. Conventions slowly becoming suggestions.
Does AI-driven bug finding software threaten to make closed-source s/w better than FOSS?
Discover how #Meta improved #SoftwareQuality with a Just-in-Time (JiT) testing approach that dynamically generates tests during code review.
The system increases bug detection by approx 4x in AI-assisted development using LLMs, mutation testing, and intent-aware workflows like Dodgy Diff.
More on #InfoQ ⇨ https://bit.ly/4tqylKc
#SoftwareArchitecture #SoftwareTesting #AI #LLMs #CodeReviews
My VSCode hung again, and I had to quit it. Like it has at least once a day for the last week. And then I have to rearrange all windows on virtual desktops again. And again…
I don’t know if this is because of the dozen projects I have open at the same time, currently many Python ones which use automatic VSCode Python stuff, or because of Microslop being very productive. But it sucks.