As someone who has participated in multi-year edit wars over, yes, Nazi shit, I will say that my biggest concern here isn’t about unedited LLM text hitting wikipedia articles—that’s v bad but probably largely fixable—but with the way Talk page sophistry is about to become absolutely fucking unmanageable as malicious editors set chatbots to do their infinite argumentation for them

To generalize: LLMs on the web’s surfaces are bad. LLMs in the backstage are much worse.

https://www.vice.com/en/article/v7bdba/ai-is-tearing-wikipedia-apart

AI Is Tearing Wikipedia Apart

Volunteers who maintain the digital encyclopedia are divided on how to deal with the rise of AI-generated content and misinformation.

@kissane [Non-AI citation needed]

@kissane Very good point. Sadly, even if you had a reliable form of human verification, that wouldn't stop a single person from managing multiple discussions with the help of an LLM; their output is still multiplied.

Either way I suspect LLMs will reduce the accessibility of online spaces, because speaking a human language no longer means being human, so additional barriers will be put in place that end up hurting everyone.

@kissane and LLMs that astroturf citations 😑

@kissane
This is related to a problem I've mentioned elsewhere. I am (still, for now) making my living as a magazine editor, and my immediate problem isn't that the CEO believes ChatGPT can do the writing part of my job better... it's that my inbox is filling up with story pitches that are obviously AI generated.

They don't look like spam or press releases at first glance, but at second glance, they either get basic facts dangerously wrong or are weirdly incomplete and non-specific.

@grantimatter I hate this for you! And all of us

@grantimatter: So, let's figure out a way to make LLMs measure the specificity of a claim.

Near-fully automatic fact-checking, unfortunately, will require more advanced AI than LLMs can do.

@kissane

@grantimatter @kissane I almost can’t believe I didn’t think librarians were needed when I was a child in 1990

Now I think we all need to be librarians

@kissane Truth. One of Wikipedia's worst problems (and I say this as a pretty big fan most of the time) is the "endurance effect," where tedious arguing can stand in for actually having either support from others or an actual good point, and this slowly bends Wikipedia toward survival of the ones with the most free time. I am curious to see how this goes, and vaguely concerned about it.
@jessamyn @kissane same; it’s a big factor in many open source projects too, where cultural norms around “time contribution” being an inherently virtuous thing even frame it as a feature rather than a bug
@eaton @jessamyn @kissane I’m pretty sure you could automate every single delete discussion to always produce a delete result just by throwing up random combinations of policy acronyms.
@eaton @jessamyn @kissane that’s … quite a general problem in civilisation as well though. Any form of politics bears that out. We have to build safeguards and Overton windows to protect ourselves.
@jessamyn @kissane I find this stuff a much more plausible mechanism for disengagement from the open web and the Dark Forest theory (which I generally don't subscribe to) than fear of surveillance capitalism
@ted @jessamyn yeah, I think player killers drive a lot more people out of open spaces than surveillance capitalism does, and chatbots have the potential to act as a multiplier there, I fear
@jessamyn @kissane This has been a problem for a decade. The only fix is a policy one but the inmates are running the asylum so I don't see that happening.
@jessamyn @kissane
Yes, this is my experience as a reader on topics that I know about. Multiple times I've seen correct changes get backed out by an admin who didn't agree with them. In fact I went back there recently to show someone, and all reference to the issue had been removed altogether! As a Maths teacher there are things I could settle (textbook references), but what I've seen puts me off. I've always said Wikipedia is "like an encyclopedia" in the same way that Madonna is like a virgin.
@jessamyn @kissane
For anyone who has the patience for any such edit wars, there's no such thing as "implicit multiplication" - only people who don't know the actual rules of Maths call it that. The correct name is The Distributive Law (an understanding of Terms is also relevant here), law as in "must be obeyed at all times", i.e. not optional or ambiguous at all. Here are a couple of the memes I made (one includes one of many textbook references on the topic)...
@kissane @colinaut I abandoned translating, writing, and editing articles more than a decade ago because malicious editors kept arguing with me.
@kissane Dumb question, is there a way to give non-human editors a (technically speaking) scarlet letter and therefore segregate them from making edits?
@smokler I don't know! I think the problem is that it would be very easy to use a chatbot part of the time and write as yourself the rest of the time
@kissane @smokler there are bot labels that can be applied (and various limits/controls on bot accounts) but they mostly only make sense for good-faith bots, which often these won’t be.
@luis_in_brief @kissane What is a good faith bot? (I was with you up until that point).
@smokler @luis_in_brief Helpful reminder bots, things like that, IIRC. Super simple old school bots.
@smokler @kissane They also, among other things, help with various anti-vandalism work, and (in some languages, generally not English) create very short articles from databases of verified facts. The Wikipedia article is pretty good! https://en.wikipedia.org/wiki/Wikipedia_bots?wprov=sfti1

@kissane @donmelton
This is what I saw in my mind when chatbots were mainstreamed.

@kissane Oh dear lord yes, this. I gave up on trying to edit Wikipedia years ago after having people "well akshully" me with Scientology propaganda that they insisted on including in mental health articles.

It would be trivial to program a chatbot to do that sort of thing. The 19-year-olds who are "too cool to graduate, man, 'cuz they were scared of my ideas" have been a problem on the internet for decades.

There's a phrase, uh, "mental m..." you get the drift. LLMs are that with a premature end.

@kissane
The only thing worse than trolls are automated trolls.

The developers of LLMs pretend that they have no control over whether or not their LLM tells the truth, but I think that will eventually be tried in court, and they will be found liable, because they could have trained their models to recognize patterns of ethical and unethical behavior and to distinguish between fact and fiction before producing results.

I believe that #ChatGPT and others of this generation of AI will eventually become a very expensive cautionary tale.

The Ford Pinto or Lawn Darts of #AI.

The danger of which was known and acknowledged by the developers and ignored by their corporate sponsors in order to make a quick buck.

@eggmont @kissane I have worried for some time now that the pollution of the information space by these products is deliberate, with the idea that once that's complete enough only people who can pay money will be able to get good information.

Doesn't need a conspiracy to do it, just some bad actors. The truism goes that ascribing malice is not necessary when stupidity will do, but it seems to me that people can't really be this stupid, can they?

@rcorless @kissane
I agree, and since most of our tech giants have their roots in unethical, amoral, or actually criminal behavior, I am concerned that the motives are less than pure, and that safeguarding the public is not foremost in their minds as much as fleecing the public is.

I think these can be very useful tools in the right hands, but we are already seeing paywalls being put in place to rent yet another technology to the unsuspecting public.

Monthly subscriptions are the modern equivalent of the Enclosure Act whereby public information is privatized by moneyed interests.

“Stupidity is a more dangerous enemy of the good than malice. One may protest against evil; it can be exposed and, if need be, prevented by use of force. Evil always carries within itself the germ of its own subversion in that it leaves behind in human beings at least a sense of unease. Against stupidity we are defenseless."
― Dietrich Bonhoeffer, Letters and Papers from Prison

@eggmont @kissane They've done much better at creating bullshit machines than at making bullshit-detecting machines - that's for sure. It's all down to priorities.
@kissane I did a bit on Covid vaccine pages, & at one point an admin just collapsed and hid a bunch of argument - which I thought was great. (I'll try & remember which page it was, so I can call it as a precedent when I encounter the problem.)
@kissane: Wow. For all of the ways Wikipedia sucks, this is one I didn't see coming.
@kissane Thanks for highlighting the risks to community as well as to the sharing of knowledge
@kissane if only AI services could tell you whether a given text snippet was in their output history

@kissane @corbden ah yes, the spam avalanche truly begins. I also expect LLMs to clog our legal system or, moreover, our legal texts. Imagine contract law being supercharged, where “Click Agree” was just the template for chaos.

It will really test the limits of “ignorance is no defence”. Do we already allow an exception based on complexity? Should we?

@kissane Too many people worry about Skynet, not enough about the first effective sealioning LLM chatbot.
@kissane I cannot imagine what the proposer of LLMs in Wikipedia was thinking. They're known to hallucinate, invent non-existent sources and can produce megabytes of text in a fraction of a second. I've heard of two people threatening libel actions over what LLMs have written about them. The whole idea sounds like a dumpster fire parade!
@kissane The only reasonable answer, in the end, will be to put in place authentication systems tied to a web of nations, so that humans can identify themselves at the very least as humans, bots can be identified as owned by particular humans, and there is some idea of which legal jurisdiction people are speaking from, as well as, potentially, their educational level. All of that could be done in a very flexible way. https://co-operating.systems/2020/06/01/WoN.pdf
@kissane interesting point, and one that doesn't seem to have been considered in the ongoing policy discussions
Wikipedia:Large language models - Wikipedia

@kissane oh jesus

mind you, it couldn't argue worse than some of the present partisans

@kissane
You can just get an admin to look. If they think it's a bot arguing on the talk page they can summarily block the editor. The user still gets an appeal, and then people get to challenge them to prove they're real.
@bitnik I think that all of the "You could simply _____" solutions here misunderstand how a truly massive scale-up affects things.
@kissane
It's how Wikipedia generally does things. Admins keep the place running by excluding bad actors. No scale-up is required.

@kissane

Sigh. As one who is already exhausted from transphobic posts on Wikipedia Talk pages, not looking forward to this getting even worse.

@kissane

Could you enlarge on that a bit pls?
Have listened to the article.

"...with the way Talk page sophistry is about to become absolutely fucking unmanageable as malicious editors set chatbots to do their infinite argumentation for them..."