CircleCI's analysis of 28 million CI workflows confirms the same picture the DORA data shows. While feature branch activity is up significantly, the median impact on *main* (i.e. release) branch activity is a net-negative 7%.

Only the top 5% of teams saw significant gains. The top 10% flatlined at 1%.

For the average team, AI slows them down overall.

Told ya!

https://www.linkedin.com/pulse/what-28-million-workflows-reveal-ai-codings-biggest-risk-circleci-j9syc/

What 28 million workflows reveal about AI coding’s biggest risk

In our last issue, we shared a preview of data from our upcoming 2026 State of Software Delivery showing that the promised AI productivity boom isn’t all hype. Throughput across the CircleCI platform increased 59% year-over-year, by far the largest productivity jump we've ever recorded…

"But Jason, this is only 28 million data points comprising actual observations from real projects..."
Now let's watch engineering leaders nod sagely in agreement and then proceed to do nothing about it. Like they always did.
@jasongorman "But we are the top 5% team"…

@mosmann @jasongorman 👆This! 100% this!

Teams & developers generally just don’t realise how bad they are. The apparent arrogance and lack of humility is staggering, and IMO is getting worse.

@thirstybear @mosmann I've known for years that the teams who need my help most believe they need it the least

@jasongorman @thirstybear @mosmann

You are Nanny McPhee 😁

“When you need me, but do not want me, then I must stay. When you want me, but no longer need me, then I have to go.”

@chrisoldwood @jasongorman @mosmann Funnily enough that’s how I describe my coaching approach too 🙂
@mosmann @jasongorman I would not take the mentioned "Top 5% Teams" as the actual best teams in terms of real-world impact. It just means that the teams that pushed a lot to the main branch now do it even more. But that could just be teams with no code review dumping code into main and testing there, including hotfixes because stuff did not work.
@mormund @mosmann Exactly. It means "top 5% in the data"
@jasongorman @mormund I think we lost the "build reliable, maintainable software" KPI years ago :)
@jasongorman @mormund @mosmann given that their average CI time is 6s, I don't think we need to ponder about this for too long...
@mosmann @jasongorman the first thing I thought. I know that "top 5%" is used in the article in a strictly statistical sense (as a percentile), but I fear that too many people won't realize that it's unlikely that you can know how to behave to land there - it's just a statistical outlier. So the real take-away is the median negative rate, IMHO.

@jasongorman This is a very interesting report, though it’s no surprise. Thanks for sharing.

Two very important things these data don't tell us: changes in code quality, and in product quality. I suspect enshittification isn't making things better here, either.

@meduz There's data on failed merges and MTTR which hints at degradation.

@jasongorman Very likely, though it doesn’t say much about what ends up merged. It’s probably nuanced depending on how much a team would need guidance (even from AI 😨).

It would also be interesting to cross-reference such data with development team burnout in the coming years.

@jasongorman the average AI-adopting engineering leader will think that they must be in that top 5%

@jasongorman the definition of top here is interesting. If it's only defined as most active, it could very well be the teams driving their projects off a cliff with slop.

Top teams vibes as best teams, but I don't see any method of measuring that in the article.

@iwein They also measured failure rates and MTTR.

For sure, we don't know what was in those merges or what value or quality it had. But the DORA data also shows that positive AI impact is correlated with better release outcomes.

@iwein @jasongorman from the article:

"Roughly 3 in 10 attempts to merge changes into production are now failing, and when the main branch breaks, nothing else can ship until it’s fixed.

That brings us to a third place AI code tends to get stuck: fixing it when it breaks. Recovery times for failed builds have been climbing since AI-assisted development went mainstream in 2022. This year, the typical team takes 72 minutes to get back to green, a 13% increase from the previous year.

Recovery times have been getting longer as teams incorporate more AI-generated code into their workflows. The same challenges that slow down manual review – PR size and complexity – also make failures harder to diagnose. Larger changes create more places for bugs to hide. And when something breaks in code a model generated, the developer debugging it often has little intuition for why it was written the way it was or what it might have affected downstream."

And presumably the "top teams" who are managing to "successfully" push more slop into their production code include "AI first" companies such as Microsoft and AWS, who are seeing their production systems break as a result of not catching all the bugs.