Wikipedia has banned AI-generated text, with two exceptions
Saved you a click:
After much debate, the new policy is in effect: Wikipedia authors are not allowed to use LLMs for generating or rewriting article content. There are two primary exceptions, though.
First, editors can use LLMs to suggest refinements to their own writing, as long as the edits are checked for accuracy. In other words, it’s being treated like any other grammar checker or writing-assistance tool. The policy warns, “LLMs can go beyond what you ask of them and change the meaning of the text such that it is not supported by the sources cited.”
The second exemption is translation assistance. Editors can use AI tools for a first pass at translating text, but they still need to be fluent enough in both languages to catch errors. As with regular writing refinements, anyone using LLMs also has to check that incorrect information hasn’t been injected.
AIbros: we’re creating God!!!
AI users: it can do translation &amp; reformatting pretty well but you’ve got to check it’s not chatting shit
The takeaway from all LLM-based AI is the user needs to be smart enough to do whatever they’re asking anyway. All output needs to be verified before being used or relied upon.
The “AI” is just streamlining the process to save time.
Relying on it otherwise is stupid and just proves instantly that you are incompetent.
the user needs to be smart enough to do whatever they’re asking anyway
I’m gonna say that’s ideal but not quite necessary. What’s needed is that the user is capable of properly verifying the output. Which anyone who could do it themselves definitely can, but it can be done more broadly. It’s an easier skill to verify a result than it is to obtain that result. Think: how film critics don’t necessarily need to be film*makers*, or the P=NP question in computer science.
Making text flow naturally, grouping and ordering information, good writing.
You can verify that two texts contain the same facts and information, yet one reads far better than the other. But writing a text that reads well is quite hard.
As someone who codes, I specifically didn’t say “always” because of course it’s not always true. Especially in the cases of “garbage in, garbage out.”
But there’s still an argument to be made for mental load and context, for which I’d argue that planning solutions and then writing the code generally is more taxing than someone handing you suggested solutions with semi-complete code or pseudo-code, and then identifying road blocks.
On the other hand, if someone you trust unexpectedly hands you hallucinated garbage, then you’re likely to spin your wheels trying to identify what they did.
Relying on it otherwise is stupid and just proves instantly that you are incompetent.
Relying on it in any circumstances (though medical stuff is understandable if you’re simply too poor or don’t have access) while it is exhausting water supplies and polluting the planet is stupid and instantly proves that you are stupid and inconsiderate.
This is absolutely the case, and honestly, at least for now, how it needs to be across the board.
No one should be using AI to do things they’re incapable of doing (or undoing).
Liar. I already read the article before opening the comments. YOU SAVED ME NOTHING.
;-)
Wikipedia probably wants to sell access to LLMs to train. It’s only valuable if Wikipedia remains a high-quality, slop-free source.
I think even AI zealots think there should be silos of content to train from that are fully human generated. Training slop on slop makes the slop even worse.
AI already trains on Wikipedia.
Yes, but…
en.wikipedia.org/…/Wikipedia%3ADatabase_download
That’s because viewing the page uses server resources, as does API access. If you want the data, you can download the database directly.
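As a concrete sketch of what you do with a dump once downloaded: the dumps from dumps.wikimedia.org are bz2-compressed XML in MediaWiki’s export schema (pages containing revisions containing wikitext). The snippet below parses a tiny inline sample with the same shape rather than a real dump, to stay self-contained; note that real dump files also declare an XML namespace, which you would need to handle in the element lookups.

```python
# Minimal sketch: pulling (title, wikitext) pairs out of
# MediaWiki-export-style XML, as found in the database dumps.
# Real dumps are bz2-compressed and namespaced; this inline sample
# mirrors the structure only.
import xml.etree.ElementTree as ET

SAMPLE = """\
<mediawiki>
  <page>
    <title>Example article</title>
    <revision>
      <text>Example wikitext content.</text>
    </revision>
  </page>
</mediawiki>
"""

def iter_pages(xml_text):
    """Yield (title, wikitext) pairs from export-style XML."""
    root = ET.fromstring(xml_text)
    for page in root.iter("page"):
        title = page.findtext("title")
        text = page.findtext("./revision/text")
        yield title, text

for title, text in iter_pages(SAMPLE):
    print(title)  # Example article
```

For a real multi-gigabyte dump you would stream with `bz2.open` and `ET.iterparse` instead of loading everything into memory, but the extraction logic is the same idea.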
That is a reasonable exception to no-AI policies in research papers and newspaper articles, but not for Wikipedia. As a tertiary source, Wikipedia has a strict “no original research” policy. Using AI to provide examples of AI output would be original research, and should not be done.
Quoting AI output shared in primary and secondary sources should be allowed for that reason, though.
Eh, that’s not quite original research. There are plenty of other examples of images and sound files created for Wikipedia. A representative example isn’t research, it’s just indicating what something is.
The Wikipedia articles on AI slop and generative AI have a few instances of representative content used to illustrate a sourced statement, as opposed to serving as evidence or something.
It’s similar to the various charts and animations.
You could make that argument about any tool Wikipedia editors use. Why should they need spellcheck? They were typing words just fine before.
…except it just makes it easier to spot errors or get little suggestions on how you could reword something, and thus makes the whole process a little smoother.
It’s not strictly necessary, but this could definitely be helpful to people for translation and proofreading. Doesn’t have to be something people are wholly reliant on to still be beneficial to their ability to edit Wikipedia.
Why should we use (insert tool) when we did just fine before?
Because when used correctly it can be great for helping you be more productive and for finding errors and making improvements. The two exceptions are for grammar and translation, which AI does a surprisingly good job with. Would you have gotten mad if they used Grammarly &gt;5 years ago? Having it rewrite an entire article is gonna be a bad idea, but asking it to rephrase a sentence or check your phrasing for potential issues is much safer. Not everyone who speaks Spanish uses it the same way: some words are innocuous in some regions but offensive in others.
en.wikipedia.org/…/Wikipedia:Writing_articles_wit…
en.wikipedia.org/…/Wikipedia:LLM-assisted_transla…
The two related “policies” are rather short; you should read them if you haven’t.
AI shouldn’t be altering databases of knowledge, especially when it is so inconsistent
The policy only allows usage as an auto-translator (a task at which LLMs are no worse than the old-style machine translators that were always allowed) and as a spellcheck/grammar check (where they are also no worse than other allowed options).
None of those tools were previously seen as altering Wikipedia by themselves. The goal is that LLMs should be used and considered like they were.
To be clear, there have always been Articles for Creation submissions built from obviously Google-translated text, and they have always been dismissed as slop. To get an auto-translated article accepted, you need to remove all the crap until all the information is correct and the grammar is good enough. This is a fairly standard workflow for translations, and the same thing should apply to LLMs.
The new issue here is that LLMs can “organically” change information when asked to translate. When a classic auto-translator changes the information, it often (not always) leaves a noticeable mess in the grammar. LLMs insert their errors much more cleanly. Both policy pages acknowledge this, and, well, the texts will change if it becomes a recurring issue.
AI isn’t altering databases of knowledge. AI is telling the writer there’s a better way to do this, and the writer has to explicitly change their wording.
You only know to look at a dictionary for alternative wordings if you know there’s a problem. How do you know there’s a problem?
If you ask someone else, what if that same someone uses your regional dialect and not the one that has problems? Your average writer can’t review every single word they use against a dictionary for every article they edit. But AI can, and that’s something it’s actually good at. You may only know five Spanish speakers, but AI knows everything it was trained on.
Just for more clarity: they workshopped ideas for improving clarity and accessibility with some editors at an event. They ran some small experiments, developed a plan to trial a few of them, and presented that plan to a wider audience for feedback. After they got the feedback, they decided not to proceed.
It’s not quite the editors pushing back on Wikipedia. Or rather, it’s not the “rebellion” people want to make it out to be.
mediawiki.org/…/Wikimania_2024,_"Written_by_AI"_H…
www.mediawiki.org/…/Simple_Article_Summaries
It rubs me the wrong way when the process going the way it should gets cast as controversial and dramatic. Asking the community whether you should do something and listening to them is how it’s supposed to go. It’s not resistance; it’s everyone being on the same team and talking.