CircleCI's analysis of 28 million CI workflows confirms the same picture the DORA data shows. While feature branch activity is up significantly, the median impact on *main* (i.e. release) branch activity is a net-negative 7%.

Only the top 5% of teams saw significant gains. The top 10% flatlined at 1%.

For the average team, AI slows them down overall.

Told ya!

https://www.linkedin.com/pulse/what-28-million-workflows-reveal-ai-codings-biggest-risk-circleci-j9syc/

What 28 million workflows reveal about AI coding’s biggest risk

In our last issue, we shared a preview of data from our upcoming 2026 State of Software Delivery showing that the promised AI productivity boom isn’t all hype. Throughput across the CircleCI platform increased 59% year-over-year, by far the largest productivity jump we've ever recorded…

"But Jason, this is only 28 million data points comprising actual observations from real projects..."
Now let's watch engineering leaders nod sagely in agreement and then proceed to do nothing about it. Like they always did.
@jasongorman "But we are the top 5% team"…

@mosmann @jasongorman 👆This! 100% this!

Teams & developers generally just don’t realise how bad they are. The apparent arrogance and lack of humility is staggering, and IMO is getting worse.

@thirstybear @mosmann I've known for years that the teams who need my help most believe they need it the least

@jasongorman @thirstybear @mosmann

You are Nanny McPhee 😁

“When you need me, but do not want me, then I must stay. When you want me, but no longer need me, then I have to go.”

@chrisoldwood @jasongorman @mosmann Funnily enough that’s how I describe my coaching approach too 🙂
@mosmann @jasongorman I would not take the mentioned "Top 5% Teams" as the actual best teams in terms of real-world impact. It just means that the teams that pushed a lot to the main branch now do it even more. But that could just be teams with no code review dumping code into main and testing there, including hotfixes because stuff did not work.
@mormund @mosmann Exactly. It means "top 5% in the data"
@jasongorman @mormund I think we lost the "build reliable, maintainable software" KPI years ago :)
@jasongorman @mormund @mosmann given that their average CI time is 6s, I don't think we need to ponder about this for too long...
@mosmann @jasongorman the first thing I thought. I know that "top 5%" is used in the article in a strictly statistical sense (as a percentile), but I fear that too many people won't realize that it's unlikely that you can know how to behave to land there - it's just a statistical outlier. So the real take-away is the median negative rate, IMHO.

@jasongorman This is a very interesting report, though it’s no surprise. Thanks for sharing.

Two very important things these data don't tell us: changes in code quality, and in product quality. I suspect enshittification isn't making things better here, either.

@meduz There's data on failed merges and MTTR which hints at degradation.

@jasongorman Very likely, though it doesn’t say much about what ends up merged. It’s probably nuanced depending on how much a team would need guidance (even from AI 😨).

It would also be interesting to cross-reference such data with development team burnout in the coming years.

@jasongorman the average AI-adopting engineering leader will think that they must be in that top 5%

@jasongorman the definition of top here is interesting. If it's only defined as most active, it could very well be the teams driving their projects off a cliff with slop.

Top teams vibes as best teams, but I don't see any method of measuring that in the article.

@iwein They also measured failure rates and MTTR.

For sure, we don't know what was in those merges or what value or quality it had. But the DORA data also shows that positive AI impact is correlated with better release outcomes.

@iwein @jasongorman from the article:

"Roughly 3 in 10 attempts to merge changes into production are now failing, and when the main branch breaks, nothing else can ship until it’s fixed.

That brings us to a third place AI code tends to get stuck: fixing it when it breaks. Recovery times for failed builds have been climbing since AI-assisted development went mainstream in 2022. This year, the typical team takes 72 minutes to get back to green, a 13% increase from the previous year.

Recovery times have been getting longer as teams incorporate more AI-generated code into their workflows. The same challenges that slow down manual review – PR size and complexity – also make failures harder to diagnose. Larger changes create more places for bugs to hide. And when something breaks in code a model generated, the developer debugging it often has little intuition for why it was written the way it was or what it might have affected downstream."

And presumably the "top teams" who are managing to "successfully" push more slop into their production code include "AI first" companies such as Microsoft and AWS, who are seeing their production systems break as a result of not catching all the bugs.