Here's a fun AI story: a security researcher noticed that large companies' AI-authored source code repeatedly referenced a nonexistent library (an AI "hallucination"), so he created a (defanged) malicious library with that name and uploaded it, and thousands of developers automatically downloaded it and incorporated it into their builds:

https://www.theregister.com/2024/03/28/ai_bots_hallucinate_software_packages/

1/

AI hallucinates software packages and devs download them – even if potentially poisoned with malware

Simply look out for libraries imagined by ML and make them real, with actual malicious code. No wait, don't do that

The Register
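The attack works because nothing in the toolchain asks whether a dependency was ever vetted by a human. Here's a minimal sketch of that missing step; the allowlist and the package names are my own invented examples, not anything from the story:

```python
# Hypothetical dependency vetting: any package name not on a
# human-reviewed allowlist gets flagged before it reaches the installer.
REVIEWED = {"requests", "numpy", "flask"}

def vet(requested):
    """Split requested package names into (approved, suspect) lists."""
    approved = [p for p in requested if p in REVIEWED]
    suspect = [p for p in requested if p not in REVIEWED]
    return approved, suspect

# A hallucinated name lands in `suspect` instead of being installed.
approved, suspect = vet(["requests", "hallucinated-sdk"])
```

It's a trivial gate, but it's exactly the human judgment that "just pip install whatever the model said" skips.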

If you'd like an essay-formatted version of this thread to read or share, here's a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:

https://pluralistic.net/2024/04/01/human-in-the-loop/#monkey-in-the-middle

2/

Pluralistic: Humans are not perfectly vigilant (01 Apr 2024) – Pluralistic: Daily links from Cory Doctorow

These "hallucinations" are a stubbornly persistent feature of large language models, because these models only give the illusion of understanding; in reality, they are just sophisticated forms of autocomplete, drawing on huge databases to make shrewd (but reliably fallible) guesses about which word comes next:

https://dl.acm.org/doi/10.1145/3442188.3445922

3/

On the Dangers of Stochastic Parrots | Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency

ACM Conferences

Guessing the next word without understanding the meaning of the resulting sentence makes unsupervised LLMs unsuitable for high-stakes tasks. The whole AI bubble is based on convincing investors that one or more of the following is true:

I. There are low-stakes, high-value tasks that will recoup the massive costs of AI training and operation;

II. There are high-stakes, high-value tasks that can be made cheaper by adding an AI to a human operator;

4/

III. Adding more training data to an AI will make it stop hallucinating, so that it can take over high-stakes, high-value tasks without a "human in the loop."

5/

These are dubious propositions. There's a universe of low-stakes, low-value tasks - political disinformation, spam, fraud, academic cheating, nonconsensual porn, dialog for video-game NPCs - but none of them seem likely to generate enough revenue for AI companies to justify the billions spent on models, nor the trillions in valuation attributed to AI companies:

https://locusmag.com/2023/12/commentary-cory-doctorow-what-kind-of-bubble-is-ai/

6/

Cory Doctorow: What Kind of Bubble is AI?

Of course AI is a bubble. It has all the hallmarks of a classic tech bubble. Pick up a rental car at SFO and drive in either direction on the 101 – north to San Francisco, south to Palo Alto – and …

Locus Online

The proposition that increasing training data will decrease hallucinations is hotly contested among AI practitioners. I confess that I don't know enough about AI to evaluate opposing sides' claims, but even if you stipulate that adding lots of human-generated training data will make the software a better guesser, there's a serious problem.

7/

All those low-value, low-stakes applications are flooding the internet with botshit. After all, the one thing AI is unarguably *very* good at is producing bullshit at scale. As the web becomes an anaerobic lagoon for botshit, the quantum of human-generated "content" in any internet core sample is dwindling to homeopathic levels:

https://pluralistic.net/2024/03/14/inhuman-centipede/#enshittibottification

8/

Pluralistic: The Coprophagic AI crisis (14 Mar 2024) – Pluralistic: Daily links from Cory Doctorow

This means that adding another order of magnitude more training data to AI won't just add massive computational expense - the data will be many orders of magnitude more expensive to acquire, even without factoring in the additional liability arising from new legal theories about scraping:

https://pluralistic.net/2023/09/17/how-to-think-about-scraping/

9/

How To Think About Scraping – Pluralistic: Daily links from Cory Doctorow

That leaves us with "humans in the loop" - the idea that an AI's business model is selling software to businesses that will pair it with human operators who will closely scrutinize the code's guesses. There's a version of this that sounds plausible - the one in which the human operator is in charge, and the AI acts as an eternally vigilant "sanity check" on the human's activities.

10/

For example, my car has a system that notices when I activate my blinker while there's another car in my blind-spot. I'm pretty consistent about checking my blind spot, but I'm also a fallible human and there've been a couple times where the alert saved me from making a potentially dangerous maneuver. As disciplined as I am, I'm also sometimes forgetful about turning off lights, or waking up in time for work, or remembering someone's phone number (or birthday).

11/

I like having an automated system that does the robotically perfect trick of never forgetting something important.

There's a name for this in automation circles: a "centaur." I'm the human head, and I've fused with a powerful robot body that supports me, doing things that humans are innately bad at.

12/

That's the good kind of automation, and we all benefit from it. But it only takes a small twist to turn this good automation into a *nightmare*. I'm speaking here of the *reverse-centaur*: automation in which the computer is in charge, bossing a human around so it can get its job done.

13/

Think of Amazon warehouse workers, who wear haptic bracelets and are continuously observed by AI cameras as autonomous shelves shuttle in front of them and demand that they pick and pack items at a pace that destroys their bodies and drives them mad:

https://pluralistic.net/2022/04/17/revenge-of-the-chickenized-reverse-centaurs/

Automation centaurs are great: they relieve humans of drudgework and let them focus on the creative and satisfying parts of their jobs.

14/

Revenge of the Chickenized Reverse-Centaurs – Pluralistic: Daily links from Cory Doctorow

That's how AI-assisted coding is pitched: rather than looking up tricky syntax and other tedious programming tasks, an AI "co-pilot" is billed as freeing up its human "pilot" to focus on the creative puzzle-solving that makes coding so satisfying.

15/

But a hallucinating AI is a *terrible* co-pilot. It's just good enough to get the job done much of the time, but it also sneakily inserts booby-traps that are statistically *guaranteed* to look as plausible as the *good* code (that's what a next-word-guessing program does: guesses the statistically most likely word).
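That parenthetical can be made concrete with a toy next-word guesser (my own illustration, nothing like a production LLM): count bigrams in a tiny corpus, then always emit the statistically most frequent successor, with no notion of whether the sentence it builds is true.

```python
from collections import Counter, defaultdict

# Count word-pair frequencies in a toy corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()
bigrams = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    bigrams[a][b] += 1

def complete(word, length=4):
    """Greedily append the most frequent successor, `length` times."""
    out = [word]
    for _ in range(length):
        successors = bigrams[out[-1]].most_common(1)
        if not successors:  # no observed successor: stop
            break
        out.append(successors[0][0])
    return " ".join(out)
```

`complete("the")` produces a fluent-looking string assembled purely from frequency, which is the whole trick, and the whole problem.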

16/

This turns AI-"assisted" coders into *reverse* centaurs. The AI can churn out code at superhuman speed, and you, the human in the loop, must maintain perfect vigilance and attention as you review that code, spotting the cleverly disguised hooks for malicious code that the AI can't be prevented from inserting. As "Lena" writes, "code review [is] difficult relative to writing new code":

https://twitter.com/qntm/status/1773779967521780169

17/

qntm (@qntm) on X

What I dislike about AI-powered coding assistance is that I have to very carefully review the new code to be sure that it does the right thing. And I, personally, find code review difficult relative to writing new code (to an equivalent standard of quality)

X (formerly Twitter)

Why is that? "Passively reading someone else's code just doesn't engage my brain in the same way. It's harder to do properly":

https://twitter.com/qntm/status/1773780355708764665

There's a name for this phenomenon: "automation blindness." Humans are just not equipped for eternal vigilance. We get good at spotting patterns that occur frequently - so good that we miss the anomalies.

18/

qntm (@qntm) on X

I don't know if anybody else has this experience, but I understand code better by refactoring it directly by hand, bringing it into existence. Passively reading someone else's code just doesn't engage my brain in the same way. It's harder to do properly. More omissions

X (formerly Twitter)

That's why TSA agents are so good at spotting harmless shampoo bottles on X-rays, even as they miss nearly every gun and bomb that a red team smuggles through their checkpoints:

https://pluralistic.net/2023/08/23/automation-blindness/#humans-in-the-loop

"Lena"'s thread points out that this is as true for AI-assisted driving as it is for AI-assisted coding: "self-driving cars replace the experience of driving with the experience of being a driving instructor":

https://twitter.com/qntm/status/1773841546753831283

19/

Pluralistic: Supervised AI isn’t (23 August 2023) – Pluralistic: Daily links from Cory Doctorow

In other words, they turn you into a reverse-centaur. Whereas my blind-spot double-checking robot allows me to make maneuvers at human speed and points out the things I've missed, a "supervised" self-driving car makes maneuvers at a computer's frantic pace, and demands that its human supervisor tirelessly and perfectly assess each of those maneuvers.

20/

No wonder Cruise's murderous "self-driving" taxis replaced each low-waged driver with 1.5 high-waged technical robot supervisors:

https://pluralistic.net/2024/01/11/robots-stole-my-jerb/#computer-says-no

AI radiology programs are said to be able to spot cancerous masses that human radiologists miss.

21/

Pluralistic: The REAL AI automation threat to workers (11 Jan 2024) – Pluralistic: Daily links from Cory Doctorow

@pluralistic well, self-driving vehicles work.

It's just that they won't solve any issue besides making rich people richer by removing employees from the operational cost sheet.

@pluralistic Quick correction, the person behind the quoted account (qntm) is Sam; "Lena", the first word of the bio, is a story they wrote that was recently published.

(I now return to reading the thread intently)

@boterbug thanks, fixed at the permalink

@pluralistic
>"As the web becomes an anaerobic lagoon for botshit, the quantum of human-generated "content" in any internet core sample is dwindling to homeopathic levels:"

This sentence is perfect.

@pluralistic Incorporating LLM generated content into LLM training datasets is like sticking a microphone into a speaker. Classic feedback loop, immediately drowns out the desired signal.

This means unless a reliable method of detecting LLM output comes along, the technology is self-limiting because it rapidly poisons its reproductive ecosystem.

LLMs are therefore best understood as an invasive species. The question is whether the explosive growth ends in desertification.
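The feedback-loop point can be demonstrated with a toy simulation (my own sketch, not a claim about any particular model): treat each "generation" as trained only on samples of the previous generation's output, and watch vocabulary diversity collapse, because resampling never reintroduces a token once it's been lost.

```python
import random

random.seed(0)  # deterministic toy run

# Start with 100 distinct "tokens" standing in for human-written text.
corpus = list(range(100))

def next_generation(prev, sample_size=200):
    """A 'model generation' trained only on samples of its predecessor."""
    return [random.choice(prev) for _ in range(sample_size)]

for _ in range(10):
    corpus = next_generation(corpus)

# Rare tokens vanish generation by generation; diversity only shrinks.
diversity = len(set(corpus))
```

The microphone-into-speaker analogy in one loop: every pass amplifies what's already common and drowns out the rest.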

@pluralistic I’d even say dialog for video game NPCs is an unsuitable task for LLMs.

As a writer, you know how important it is to say exactly what needs to be said to move the story—in advancing the setting, characters, or narrative.

An LLM just cranking out "kind of fits" dialog really doesn't do any of that, and if anything makes it harder to know whether you triggered a state change you were supposed to, or learned something you needed to know.

@thedansimonson might be good for barks and/or tertiary and bystander NPCs

@stooovie you would think, but think about that for a second. Most bystander NPCs sort of grunt, act confused, or repeat a canned comment you've heard before. These are all signals that the NPC doesn't have much to add to the progression of the story: an indicator that they're a leaf on the game tree, not a branch.

Putting in all of ChatGPT, with its weird sycophancy and its infinite desire to blab: does that add anything to the game? Maybe the first time, but not the fifth.

@thedansimonson @stooovie Reminds me of the parlor walls in Fahrenheit 451. Instead of books, people spend all their time on semi-interactive soap operas where you can talk to the characters.
@thedansimonson I agree to an extent. It probably could be curtailed to react to some events with one or two sentences and not spew too much BS. It's a balancing act, as everything beyond what we have now is either repetition or a game that only the likes of Rockstar can actually produce.
@pluralistic
The key phrase is "convincing investors". AI companies are just a more sophisticated pump-and-dump scam. It doesn't matter whether AI can actually do any of this stuff. It only matters that the AI companies can convince people for long enough for the investors to unload their stock at a hefty profit. After that, they don't really care if the whole sector crashes and burns.
@pluralistic thanks for sharing. That's a very good summary explanation of the challenges of AI/LLMs.
It goes back to the question of what truth is. Can we trust AI with important decisions?
@Babadofar @pluralistic no, we cannot, what are you even on about. It is just a fricking random-choice sorting hat.

@pluralistic

«The willingness of AI models to confidently cite non-existent court cases is now well known and has caused no small amount of embarrassment among attorneys unaware of this tendency. And as it turns out, generative AI models will do the same for software packages.»

*snerk*

@pluralistic Please don't add copyright information to the image description, it makes it useless for blind people.

@alttexthalloffame No, I will continue to do so because the alternative is to risk $150,000 in statutory copyright damages at the hands of predatory copyleft trolls:

https://pluralistic.net/2022/01/24/a-bug-in-early-creative-commons-licenses-has-enabled-a-new-breed-of-superpredator/

A Bug in Early Creative Commons Licenses Has Enabled a New Breed of Superpredator – Pluralistic: Daily links from Cory Doctorow

@pluralistic "I did that in multiple places: both in the Twitter thread and in the alt text of the image."

Why is it necessary to do both? I am not seeing anything online about attribution being required as part of alt text, only that it's present. (Happy to be corrected.)

@alttexthalloffame There is NO standard for attribution, hence the need to do AS MUCH attribution as possible, in order to allow any claims to be knocked back prior to expensive litigation.

@pluralistic I'd love to hear @Gargron's thoughts on this.

https://mastodon.social/@Gargron/112118260677357857

"Content created by others must be attributed"

Sounds like we might need to add a new field separate from alt text?

Only about a third of images have a description (per @AltTextHealthCheck), and who knows how much of that is actually usable. That's really bad.

@alttexthalloffame @Gargron @AltTextHealthCheck

IMO - as an avid caption writer - the most useful thing would be a field in IMAGES (e.g. EXIF) that could contain the descriptions and maintain them between services. My workflow is farcically complex and tracking descriptions across days is just a bridge too far, but if I could embed the description in the image so that it was available wherever I posted the image, that would be huge.

@pluralistic @Gargron @AltTextHealthCheck I've seen people make this point about alt text being part of EXIF data, this would definitely make the most sense.
@pluralistic @alttexthalloffame @Gargron @AltTextHealthCheck the EXIF 2.1 and 2.2 metadata standards include attributes for both ImageDescription and Copyright. The definition for the latter identifies it as "copyright holder" but it could be used for broader license information.
@lukethelibrarian @pluralistic @Gargron @AltTextHealthCheck Great! We'd then just need Mastodon (and the rest of the fediverse platforms) to allow editing and displaying this information, without having to change the ActivityPub standard.
@alttexthalloffame @pluralistic @Gargron @AltTextHealthCheck maybe, maybe not. Cory said this was about reusability within his own POSSE process, so it's really about whether the tools he uses as part of that process can populate Alt Text based on the EXIF ImageDescription, or could be modified to do so.

@alttexthalloffame @pluralistic @Gargron @AltTextHealthCheck also note that embedding alt text in EXIF data is far from a panacea for reusability of images. @eric captured some really important considerations at https://ericwbailey.website/published/thoughts-on-embedding-alternative-text-metadata-into-images/ - but he's mostly looking at the reuse of alt text between different users/contexts.

Cory's POSSE use-case is different: he's reusing/syndicating images across platforms, but all tied back to a single publication/context.

Thoughts on embedding alternative text metadata into images

The idea of “solving” alternate text descriptions by automating them away so that they are not a consideration is a bad frame.

@lukethelibrarian @pluralistic @alttexthalloffame @Gargron @AltTextHealthCheck sounds like <cite>...</cite> and not necessarily the same as the copyright holder.

@alttexthalloffame Oh, Gargron's thoughts on image descriptions are not going to help here, he's said "when you're trying to make a point, or maybe you're out and about somewhere and you're just posting something quickly, you don't always have the time to type it in. So it's a good attitude to watch out and make sure that everything is accessible, but it can go overboard."

https://www.platformer.news/mastodon-interview-eugen-rochko-meta-bluesky-threads-federation/

How Mastodon made friends with Meta

Founder Eugen Rochko on helping Threads federate, dodging venture capital, and why he hopes Bluesky abandons its protocol

Platformer
@pluralistic @alttexthalloffame Then put it in the main text, just like these words.
@reinhilde @alttexthalloffame I do, on character-unlimited platforms (e.g. Medium and Tumblr).
@pluralistic @alttexthalloffame Ask mamot to patch their installation to raise the character limit.

@pluralistic @alttexthalloffame You still aren’t helping your case, Cory.

Attribution belongs in the main text. Not cluttering the alt text, which is for screen reader users to know what the image is supposed to be.

@pluralistic Geeez. You'd think, since hallucination is a well-known phenomenon and citing a non-existent package was a known manifestation of it in AI-generated coding, that some programmer would have put an 'if' statement in to screen for it and tell the AI not to cite that package.

@pluralistic Wow, this is bad.

"Our findings revealed that several large companies either use or recommend this package in their repositories. For instance, instructions for installing this package can be found in the README of a repository dedicated to research conducted by Alibaba"

Recommend the dummy package? Like WTF.
Check your dependencies FFS.

@pluralistic

I wonder if he added a call for it to periodically emit an error message that reads "You know, you should really vet your AI-produced code more carefully."

@pluralistic I had regular old Gemini suddenly sending me on a 3 day cruise when I asked it to summarize a trip itinerary for me.

It started out as a 2 hour ferry ride.

Also is it still a supply chain attack if the supply chain didn't exist before it was used in an attack?

@pluralistic that's a lot of clickbait headline and build-up for a single simple typo from '-' to '_', and I've made the same mistake as a human on that exact huggingface CLI library. This is just typosquatting rewarmed.
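For what it's worth, PyPI itself can't be typosquatted on '-' versus '_': PEP 503 normalizes project names so that runs of '-', '_' and '.' compare equal. The confusion in this story is between a command name and a package name, which normalization doesn't help with. A sketch of the PEP 503 rule (the huggingface names here just illustrate the dash/underscore point):

```python
import re

def normalize(name: str) -> str:
    """PEP 503 name normalization: collapse runs of -, _, . and lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()

# These resolve to the same PyPI project, so a dash/underscore "typo"
# still fetches whatever is published under the normalized name.
same = normalize("huggingface_cli") == normalize("huggingface-cli")
```

So the dangerous mistakes aren't dash-versus-underscore slips; they're plausible names that were never registered at all, which is exactly the gap the researcher filled.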