Haha, that's one way to deal with it. I'm not sure how effective it is, though.
@nixCraft
It will not work.
Language models will be far better at understanding the text than humans are ... even with the mess inside.
The reason is: the AI doesn't "read" a text ... it weighs probabilities. So the "mess" inside doesn't fit the pattern and just gets a lower weight.
Try it with your phone and the words it suggests while you type to complete the sentence.
The AI handles messed-up grammar, misspelled words, wrong words, ... in an input ... and still understands the text and gives an output.
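A quick way to see this weighing for yourself, assuming you have the Hugging Face transformers and torch packages installed (GPT-2 here is just a small stand-in model, my pick for the sketch): score a clean sentence and a garbled one. The garbled one scores lower, but still well enough to be understood.

```python
# Sketch: how a language model "weighs" clean vs. messy text.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def avg_log_prob(text: str) -> float:
    """Average per-token log-probability the model assigns to `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=ids the model returns the mean cross-entropy,
        # i.e. the negative average log-probability of the tokens.
        loss = model(ids, labels=ids).loss
    return -loss.item()

print(avg_log_prob("The cat sat on the mat."))
print(avg_log_prob("Teh cat sta on hte mat."))  # lower, but far from hopeless
```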
I was chatting with one for a while (knowingly) and testing it.
Even typos and ... "abstract stories" ... it handled well.
You could tell the output came from an AI, since the text was well-written English with no typos ... and too long for anyone to have typed in that short a time.
@Zada_Bury @nixCraft It wouldn't work at a small scale.
But imagine this at a large enough scale that it started making it into the training data... It would completely warp the training, with very unpredictable results.
Obviously that requires a much larger scale than is realistically possible, but the original post really is just a joke after all.
@nazokiyoubinbou
From what I've heard/read, this was already partially happening with ChatGPT, version 4 (?) ...
As it got training data that was itself AI-generated, the results got worse.
Version 3 or 3.5 (?) was more accurate.
For the details you'll have to do your own research, to be honest - I don't understand that much about this stuff.
@Zada_Bury @nixCraft Yeah, whenever "AI" feeds into "AI" training, it degrades. People can use it a little bit to clean and process certain training data, but only just a tiny bit. The data has to be very carefully handled with a lot of manual controls. The problem is that at the scale OpenAI is working at, they can't do that. Physically impossible.
So they're going to loop back more and more, and they probably don't even know why it's degrading anymore.
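A toy illustration of that loopback (just a numpy sketch of my own, nothing like any real training pipeline): fit a simple distribution to data, sample from the fit, refit on the samples, and repeat. Each generation loses a little of the original spread, and what's lost never comes back.

```python
# Toy "model collapse": each generation is trained only on the
# previous generation's output.
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(loc=0.0, scale=1.0, size=25)  # small "human" dataset
mu, sigma = real.mean(), real.std()             # generation 0 "model"

for gen in range(1, 201):
    synthetic = rng.normal(mu, sigma, size=25)     # model's own output
    mu, sigma = synthetic.mean(), synthetic.std()  # retrain on it
    if gen % 25 == 0:
        print(f"gen {gen:3d}: mean={mu:+.3f}  std={sigma:.3f}")

# The std shrinks toward zero over the generations: the tails of the
# original data disappear and can never be re-learned inside the loop.
```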
@nazokiyoubinbou
I remember some experiment about "what an AI sees in random noise" ...
... and it drew out the pictures it had been trained to detect - but the input was just ... "white noise".
Well ... "Do Androids Dream of Electric Sheep?" ... 😅
@Zada_Bury @nixCraft You do know that image generation (such as Stable Diffusion) LITERALLY takes random noise and tries to produce the prompt from it? The concept actually came from noise-reduction algorithms. Basically it reduces the noise into the patterns it expects to see. At least the "generation" aspect does this. Though even if you do img2img, it adds noise to help the process along.
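Here's a toy sketch of that denoising loop (pure numpy, and it "cheats": instead of a trained network predicting the noise, the known target pattern plays that role):

```python
# Toy denoising: start from pure Gaussian noise and repeatedly nudge
# the sample toward the pattern the "model" expects to see.
import numpy as np

rng = np.random.default_rng(1)
target = np.sin(np.linspace(0, 2 * np.pi, 64))  # the pattern it "knows"
x = rng.normal(size=64)                         # start from pure noise

for step in range(50):
    predicted_noise = x - target   # a real model would *predict* this
    x = x - 0.1 * predicted_noise  # one small denoising step

print(np.abs(x - target).max())    # close to zero: noise became pattern
```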
The issue with "AI" feeding into "AI" training is that it multiplies the deficiencies but can't really improve on what it gets right. Every mistake gets multiplied into the new data more and more. They're spending inordinate amounts of money to steal real data from humans because the model simply can't feed itself.
Androids may dream of electric sheep, but these are toasters.
@nazokiyoubinbou
That experiment was years before "Stable Diffusion" ... 😅
But very interesting, really.
I never thought about how an AI starts to ... "draw".
I mean ... every artist also needs to "plot" the picture, even when drawing a landscape from their own point of view ...
(e.g.: the tree there, the river there ... the rocks a bit smaller when painting from a higher angle, ...)