"I used AI to....", is nothing more than, "Listen I'm not an asshole but....", for the 21st Century.
@glynmoody
I was one of the first to study #AI in college back in the mid-80s.
What people refer to as "AI" today is nothing more than "Deep Database Scrubbing" (which is why it requires so much power).
It constantly scours billions of data sources and "makes assumptions" based on the number of links/connections between two bits of info. Garbage_In/Garbage_Out.
It doesn't test for accuracy, and the more unreliable the sources, the more unreliable the conclusions.
Collapse is inevitable.
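To make that GIGO point concrete, here is a toy bigram "model" that learns nothing but co-occurrence counts between adjacent words. This is a deliberate caricature, not how modern transformer LLMs actually work, but it shows literally how output quality tracks source quality when there is no accuracy check anywhere in the loop:

```python
from collections import Counter, defaultdict

def train(corpus_lines):
    """Count the "links" between adjacent words; that's all we learn."""
    counts = defaultdict(Counter)
    for line in corpus_lines:
        words = line.lower().split()
        for a, b in zip(words, words[1:]):
            counts[a][b] += 1
    return counts

def generate(counts, start, length=8):
    """Follow the most-linked successor each step; no truth-testing."""
    word, out = start, [start]
    for _ in range(length):
        if word not in counts:
            break
        word = counts[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

reliable = ["the earth orbits the sun", "the moon orbits the earth"]
garbage = ["the earth is flat"] * 3

print(generate(train(reliable), "the"))            # echoes its sources
print(generate(train(reliable + garbage), "the"))  # garbage in, garbage out
```

Swamp the reliable lines with unreliable ones and the "most likely" continuation flips accordingly; the model has no notion of which sources were trustworthy.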
@glynmoody Sounds like what @jensorensen explained beautifully in this cartoon…
@glynmoody @stuartl @jensorensen
My analogy was inbreeding, but mad cow is similar.
https://mstdn.social/@stekopf/114581174083947454
@jensorensen@mastodon.social Thank you. I ended up at the Rice University page: "Self-Consuming Generative Models Go MAD" https://news.rice.edu/news/2024/breaking-mad-generative-ai-could-break-internet
The subtitle, "Training #AI systems on synthetic data could include negative consequences", is quite the understatement if one looks at the left image. 🤪
The left image shows #LLM deterioration when trained on artificial data; the right image shows the same when trained on prepared artificial data (see link for details). Even there, 20% of the info gets distorted quickly.
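The paper's "self-consuming" loop can be caricatured in one dimension: fit a model to samples drawn from its predecessor, then repeat, and watch the estimation errors compound. A stdlib-only sketch (my own toy illustration, not the Rice experiments, which used image models):

```python
import random
import statistics

random.seed(0)
human_data = [random.gauss(0.0, 1.0) for _ in range(1000)]  # "real" data
mu, sigma = statistics.mean(human_data), statistics.stdev(human_data)

for gen in range(1, 21):
    # "Generate" synthetic data from the current model...
    synthetic = [random.gauss(mu, sigma) for _ in range(100)]
    # ...then train the next generation on synthetic data alone.
    mu, sigma = statistics.mean(synthetic), statistics.stdev(synthetic)
    print(f"gen {gen:2d}: mean={mu:+.3f}  stdev={sigma:.3f}")

# Each generation re-estimates from only 100 of its predecessor's samples,
# so the errors compound: the fitted mean and spread typically wander away
# from the real data, and run long enough the variance collapses.
```

Mixing some fresh real data into each generation slows the drift, which matches the point above: even the curated-synthetic setup in the right image distorts quickly.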
@daz @NicholasLaney @glynmoody
The scrapers haven't been following robots.txt for a while.
https://www.technologyreview.com/2025/02/11/1111518/ai-crawler-wars-closed-web/
I think the only way people can speed up the collapse is by finding as many ways as possible to poison the content they create. AI devs will never voluntarily limit AI. It needs to be made a nightmare for them to create training sets.
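On the robots.txt point: it's worth remembering the file is purely advisory. A compliant crawler has to *choose* to check it before fetching; nothing in the protocol enforces the answer. A minimal sketch using Python's stdlib parser (example.com is a placeholder site; GPTBot is OpenAI's published crawler token):

```python
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")  # placeholder site
rp.read()  # fetch and parse the file

url = "https://example.com/some/article"
# Compliance happens only because the crawler volunteers this check:
if rp.can_fetch("GPTBot", url):
    print("polite crawler: fetching", url)
else:
    print("polite crawler: skipping", url)

# An impolite scraper simply fetches the URL and never calls can_fetch().
```

Which is why, as the Technology Review piece describes, sites are escalating beyond robots.txt to outright blocking and paywalls; the polite protocol has no teeth.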
@KimSJ @glynmoody Yep. It should be obvious to anyone who isn't susceptible to hype that flooding the internet with slop will only poison the training data for future iterations of genAI, degrading the quality of their output over time. The models will increasingly end up feeding on *each other's* error-strewn output.
These shitty LLMs we have now? They're the pinnacle of how good such models will ever get, no matter how many more billions are "invested" in their development.
@ApostateEnglishman @KimSJ @glynmoody
we used to call it GIGO
@samiamsam @KimSJ @glynmoody Yep. Only in this case, on an industrial scale - and then we feed all that garbage *back into* the machines and get even worse garbage back out.
We keep doing this until everything is garbage.
@chris_e_simpson
They may well have to do that anyway, to stop it marking its own homework; IIUC people are beginning to see its search functionality starting to degrade.
Makes me think of this:
@glynmoody They have just about reached the limits of how much training data they can use with existing data collection and artificial refinement methods.
The irony is, one way they could exponentially improve their model training is the one thing that still won't occur to them: pay people to actually write stuff for it with consistent quality and formatting instead of just stealing data from all over the Internet. They could also get higher quality out of smaller models. And then they wouldn't have to pay Meta/etc to help steal data for them.
But they'll die before they'll consider doing it that way, lol. And that may just happen at the rate they're going. I just hope the "pretend LLMs are AI when they aren't" crash doesn't take others out with it.
@glynmoody Not really, no.
I'm talking about actually paying people to sit down and write actual training data. I don't just mean taking journalists' articles and feeding them in, or that sort of thing. I mean writing data specifically for the models.
@glynmoody Interesting that the article's examples of AI "getting worse" are actually AI doing exactly what it was designed to do: create passable language summaries of the crap that's out there.
What they want it to do—separate truth from fiction, valuable data from useless data—is not something it was designed to do nor is it capable of doing. That requires the hard work of research (journalistic, academic, professional) that people seem to naively believe they can skip if they use AI to write their papers for them.