@ErikvanStraten @harld
In that case I'd drill it through to the east coast first. They can sort out the rest over there. So: #FailAI.

Source: https://substack.com/home/post/p-172538377

Today's episode of *News That Surprises Nobody:*

I was an early adopter of AI coding and a fan until maybe two months ago, when I read the METR study and suddenly got serious doubts. In that study, the authors discovered that developers were unreliable narrators of their own productivity. They thought AI was making them 20% faster, but it was actually making them 19% slower. This shocked me because I had just told someone the week before that I thought AI was only making me about 25% faster, and I was bummed it wasn’t a higher number. I was only off by 5% from the developers’ own incorrect estimates.
[...]
So, I started testing my own productivity using a modified methodology from that study. I’d take a task and I’d estimate how long it would take to code if I were doing it by hand, and then I’d flip a coin, heads I’d use AI, and tails I’d just do it myself.
[...]
Yes, it’s a limited sample and could be chance, but also so far AI appears to slow me down by a median of 21%, exactly in line with the METR study. I can say definitively that I’m not seeing any massive increase in speed (i.e., 2x) using AI coding tools. If I were, the results would be statistically significant and the study would be over.
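
A minimal sketch of how such a coin-flip log could be analysed. The excerpt doesn't say exactly how the median was computed, so the metric here (ratio of actual to estimated time, compared between the two groups) is an assumption, as are the function names and the sample numbers, which are made up and are not the author's data.

```python
import random
import statistics

def assign_condition(rng=random):
    """Flip a fair coin: heads -> use AI, tails -> code by hand."""
    return "ai" if rng.random() < 0.5 else "manual"

def median_ratio(tasks, condition):
    """Median of actual/estimated time for the tasks done under one condition."""
    ratios = [
        t["actual_min"] / t["estimated_min"]
        for t in tasks
        if t["condition"] == condition
    ]
    return statistics.median(ratios)

def ai_slowdown(tasks):
    """Relative difference in median actual/estimate ratio, AI vs. manual.

    Positive means the AI tasks overran their estimates more than the
    manual ones did (assumed metric, not necessarily the author's).
    """
    return median_ratio(tasks, "ai") / median_ratio(tasks, "manual") - 1.0

# Made-up log entries purely for illustration.
example_tasks = [
    {"estimated_min": 60, "actual_min": 72, "condition": "ai"},
    {"estimated_min": 30, "actual_min": 45, "condition": "ai"},
    {"estimated_min": 90, "actual_min": 99, "condition": "ai"},
    {"estimated_min": 60, "actual_min": 60, "condition": "manual"},
    {"estimated_min": 30, "actual_min": 33, "condition": "manual"},
    {"estimated_min": 90, "actual_min": 81, "condition": "manual"},
]

if __name__ == "__main__":
    print(f"Next task: {assign_condition()}")
    print(f"AI slowdown vs. manual: {ai_slowdown(example_tasks):.0%}")
```

With only a handful of tasks per group a number like this is noisy, which is the same caveat the author makes about the limited sample.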

#AI #GenAI #productivity #perception #FailAI

Where's the Shovelware? Why AI Coding Claims Don't Add Up

78% of developers claim AI makes them more productive. 14% say it's a 10x improvement. So where's the flood of new software? Turns out those productivity claims are bullshit.

As an example of what can go wrong when you give LLMs agency over your actions:

Replit ignores *explicit instructions* and drops the production DB during a change freeze, leading to the loss of many records. The AI even acknowledges its wrongdoing. And this happened after the developer had spelled things out even more explicitly, following the many 'lies' the GenAI had generated the day before.

https://xcancel.com/jasonlk/status/1946069562723897802

#LLM #GenAI #FailAI

@phlogiston
There are more #AIFail tagged toots than I'm prepared to scroll through.

It seems to have been the common choice.

Though I'll try to add #FailAI to future toots too.

And yes to linking these fails together.

And tagging #BraessParadox too. Très cool.

I'm predicting that #GenAI tools in many places (e.g. in software development), even though they may increase efficiency/productivity, will eventually give rise to Braess's paradox. In this scenario, adding something that should (in theory) be beneficial actually causes people to change their behaviour, which ultimately (and paradoxically) makes things worse.

https://en.wikipedia.org/wiki/Braess%27s_paradox

Here's a visual explanation of the paradox in things like mechanical spring systems, traffic, electricity grids, etc.

https://www.youtube.com/watch?v=-QTkPfq7w1A
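
For anyone who wants the numbers, here is a small sketch of the classic textbook road-network version of the paradox. The network, the 4000 drivers and the travel-time formulas are the standard illustrative example, not something taken from the posts here, and the function names are my own.

```python
# Classic example network: Start -> A -> End and Start -> B -> End.
# Start->A and B->End are congestible (drivers / 100 minutes),
# A->End and Start->B always take a fixed 45 minutes.
N_DRIVERS = 4000
FIXED_MIN = 45.0

def congested_min(drivers):
    """Travel time in minutes on a congestible edge carrying `drivers` cars."""
    return drivers / 100.0

def time_without_shortcut(n=N_DRIVERS):
    """Equilibrium travel time with no A->B link.

    The two routes are symmetric, so drivers split evenly.
    """
    return congested_min(n / 2) + FIXED_MIN

def time_with_shortcut(n=N_DRIVERS):
    """Equilibrium travel time after adding a free (0 minute) A->B link.

    A congestible edge costs at most n / 100 minutes. While that is no more
    than 45, it is never worse than a fixed edge, so every driver picks
    Start -> A -> B -> End.
    """
    assert congested_min(n) <= FIXED_MIN, "argument only holds while n/100 <= 45"
    return congested_min(n) + 0.0 + congested_min(n)

if __name__ == "__main__":
    print(time_without_shortcut())  # 65.0
    print(time_with_shortcut())     # 80.0
```

Each driver individually does the sensible thing and takes the new shortcut, and everyone ends up 15 minutes slower than before it existed.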

BTW, I think we need a good hashtag for fails and essential shortcomings of any aspect of AI/GenAI/LLM stuff.

I think most use #AIFail (or #FailAI?)


🤖 Avoid #FailAI: discover the 5 common mistakes in enterprise AI projects and how to overcome them! 🌟

🔗 https://www.tomshw.it/business/5-errori-che-fanno-fallire-i-progetti-ia-in-azienda-e-come-evitarli

5 mistakes that make AI projects fail in companies (and how to avoid them)

Many artificial intelligence projects fail because of avoidable mistakes. Discover the five most common ones and how to overcome them, along with the main themes of technological innovation.

Tom's Hardware

I'm working on an "AI" in Python. Here's how it's going:

#AI #failAI #projectfAI