Mastodawn

corbin Mar 23

Wikipedia has banned AI-generated text, with two exceptions

https://infosec.pub/post/43865778

infeeeee

Saved you a click:

After much debate, the new policy is in effect: Wikipedia authors are not allowed to use LLMs for generating or rewriting article content. There are two primary exceptions, though.

First, editors can use LLMs to suggest refinements to their own writing, as long as the edits are checked for accuracy. In other words, it’s being treated like any other grammar checker or writing assistance tool. The policy says, “LLMs can go beyond what you ask of them and change the meaning of the text such that it is not supported by the sources cited.”

The second exemption for LLMs is for translation assistance. Editors can use AI tools for the first pass at translating text, but they still need to be fluent enough in both languages to catch errors. As with regular writing refinements, anyone using LLMs also has to check that incorrect information hasn’t been injected.

Rioting Pacifist Mar 23

AIbros: we’re creating God!!!

AI users: it can do translation & reformatting pretty well but you got to check it’s not chatting shit

halcyoncmdr Mar 23

The takeaway from all LLM-based AI is the user needs to be smart enough to do whatever they’re asking anyway. All output needs to be verified before being used or relied upon.

The “AI” is just streamlining the process to save time.

Relying on it otherwise is stupid and just proves instantly that you are incompetent.

Zagorath Mar 23

> the user needs to be smart enough to do whatever they’re asking anyway

I’m gonna say that’s ideal but not quite necessary. What’s needed is that the user is capable of properly verifying the output. Which anyone who could do it themselves definitely can, but it can be done more broadly. It’s an easier skill to verify a result than it is to obtain that result. Think: how film critics don’t necessarily need to be film*makers*, or the P=NP question in computer science.
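The verify-versus-produce gap can be illustrated with a toy case. This is only a minimal sketch, assuming a subset-sum puzzle as the task; the function names `verify` and `find` are made up for illustration and do not come from the thread. Checking a proposed answer takes one pass over it, while producing one by brute force may try every subset (no polynomial-time method is known for the general problem).

```python
from itertools import combinations


def verify(numbers, chosen, target):
    """Check a proposed answer: cheap, one pass over the candidate."""
    remaining = list(numbers)
    for x in chosen:
        if x not in remaining:
            return False  # candidate uses a number that is not available
        remaining.remove(x)
    return sum(chosen) == target


def find(numbers, target):
    """Produce an answer by brute force: up to 2**len(numbers) subsets."""
    for size in range(len(numbers) + 1):
        for combo in combinations(numbers, size):
            if sum(combo) == target:
                return list(combo)
    return None  # no subset adds up to the target


numbers = [3, 34, 4, 12, 5, 2]
answer = find(numbers, 9)
print(answer, verify(numbers, answer, 9))
```

Someone who could never write `find` can still run `verify` on whatever they are handed, which is the distinction being drawn here.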

Pyro Mar 24

But if the output has issues, what’re you going to do, prompt it again? If you are only able to verify but not do the task, you cannot correct the AI’s mistakes yourself.

Zagorath 6d ago

At the risk of sounding like an overly obsequious AI… You know what, you’re completely right. I’m honestly not sure what use case I was imagining when I wrote that last comment.

Redjard 6d ago

Making text flow naturally, grouping and ordering information, good writing.

You can verify two texts have the same facts and information, yet one reads way better than the other. But writing a text that reads well is quite hard.

EldritchFemininity 6d ago

You were thinking logically about a normal production chain. In that case, QA or whoever says “This is wrong, rework it and correct the issue” and that’s that. With AI, it does the whole thing over again and may or may not come back with the same issue or an entirely new one.

Redjard 6d ago

If you don’t have the ability then you would do what you would have 5 years ago: not do it.
Either submit without, or not submit at all.

WhiskyTangoFoxtrot 6d ago

I can’t draw, but I could probably photoshop out some minor issues in an AI-generated image.

fartographer 6d ago

If you’re unable to brute-force verification (research, testing, consulting the ancient texts), that’s where you stop what you’re doing, and take a breath. Then, consult an expert. Just like the film critic analogy, it’s easier to verify than to create, so you’re saving the expert time and effort while learning about something that you were obviously already passionate enough about to have started this endeavor.

alsimoneau 5d ago

As someone who codes, it’s not always easier to verify than to create.

fartographer 5d ago

As someone who codes, I specifically didn’t say “always” because of course it’s not always true. Especially in the cases of “garbage in, garbage out.”

But there’s still an argument to be made for mental load and context, for which I’d argue that planning solutions and then writing the code generally is more taxing than someone handing you suggested solutions with semi-complete code or pseudo-code, and then identifying road blocks.

On the other hand, if someone you trust unexpectedly hands you hallucinated garbage, then you’re likely to spin your wheels trying to identify what they did.

Aralakh 6d ago

This is where domain expertise would come in, no? It’s speeding up the work but it usually outputs generic content, and whatever else it injects while hallucinating. Therefore the validation part holds up I’d say.

7101334 6d ago

> Relying on it otherwise is stupid and just proves instantly that you are incompetent.

Relying on it in any circumstances (though medical stuff is understandable if you’re simply too poor or don’t have access) while it is exhausting water supplies and polluting the planet is stupid and instantly proves that you are stupid and inconsiderate.

rumba 6d ago

This is absolutely the case, and honestly, at least for now, how it needs to be across the board.

No one should be using AI to do things they’re incapable of doing (or undoing).

youcantreadthis Mar 23

Fucking hate those anti human filth pushing slop into everything. I want to take one apart with power tools.

Paranoid Factoid Mar 23

Scrollone Mar 23

Damn that movie was funny. I need to rewatch it.

onlyhalfminotaur Mar 23

It holds up better than any movie from the late 90s that I can think of.

SocialMediaRefugee 6d ago

Yaaah, but I’ll need you to come in this weekend though. Yaaaahhhh…

XLE Mar 23

I don’t think AI users would say it does reformatting either (if they’re honest): If you tell a chatbot to reformat text without changing it, it will change the text, because it does not understand the concept of not changing text. It should only take one time for someone to get burned for them to learn that lesson.
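One way to catch that kind of silent change is to compare the words before and after, ignoring whitespace. This is a rough sketch using Python's standard `difflib`; the helper name `word_changes` is made up for illustration, and it only works for whitespace-level reformatting (rewrapping, indentation), not for anything that legitimately adds or alters characters such as markup.

```python
import difflib


def word_changes(original: str, reformatted: str) -> list[tuple[str, str, str]]:
    """Return (tag, old, new) for each word-level difference, ignoring whitespace."""
    a, b = original.split(), reformatted.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only inserted, deleted, or replaced words
    ]


before = "The policy bans LLM text,\nwith two exceptions."
after = "The policy bans LLM text, with three exceptions."
for change in word_changes(before, after):
    print(change)
```

An empty list means only the whitespace moved; anything else is a word the tool touched and a human should look at.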

ji59 Mar 23

So, it should be used reasonably, as it should have always been.

MissesAutumnRains Mar 23

Seems pretty reasonable to use it as a grammar checker. As long as it’s not changing content, just form or readability, that seems like a pretty decent use for it, at least with a purely educational resource like Wikipedia.

🌞 Alexander Daychilde 🌞 Mar 23

Liar. I already read the article before opening the comments. YOU SAVED ME NOTHING.

;-)

errer Mar 23

Wikipedia probably wants to sell LLM makers access to its content for training. That’s only valuable if Wikipedia remains a high-quality, slop-free source.

I think even AI zealots think there should be silos of content to train from that are fully human generated. Training slop on slop makes the slop even worse.

SuspciousCarrot78 Mar 23

AI already trains on Wikipedia.

commoncrawl.org

Common Crawl - Open Repository of Web Crawl Data

We build and maintain an open repository of web crawl data that can be accessed and analyzed by anyone.

Grimy Mar 23

Sell licenses of what? It’s already all in the creative commons iirc.

Zagorath Mar 23

The content is CC licensed, but they are trying to block AI scraping because it overloads their servers. They have a paid API that uses a lot less compute for both Wikipedia and the AI, as well as being a revenue source for Wikipedia.

ricecake 6d ago

Yes, but…

en.wikipedia.org/…/Wikipedia%3ADatabase_download

That’s because viewing the page uses server resources, as does API access. If you want the data you can download the database directly.

Wikipedia:Database download - Wikipedia

MountingSuspicion Mar 23

This was only done because the editors pushed to minimize AI involvement. There’s a comment here already mentioning that: lemmy.world/comment/22826863

FauxPseudo Mar 23

Seems like there should be a third exception, for those occasions where the article is about LLM-generated text. They should be able to quote it when it’s appropriate for an article.

Zagorath Mar 23

That is a reasonable exception to no-AI policies in research papers and newspaper articles, but not for Wikipedia. As a tertiary source, Wikipedia has a strict “no original research” policy. Using AI to provide examples of AI output would be original research, and should not be done.

Quoting AI output shared in primary and secondary sources should be allowed for that reason, though.

ricecake 6d ago

Eh, that’s not quite original research. There are plenty of other examples of images and sound files created for Wikipedia. A representative example isn’t research, it’s just indicating what something is.

The Wikipedia articles on AI slop and generative AI have a few instances of content that’s there to illustrate a sourced statement, as opposed to being evidence or something.

It’s similar to the various charts and animations.

Goodlucksil 6d ago

To save you another few clicks: this is the discussion (RfC) that implemented the changes, and the policy is linked at the top.

Wikipedia:Writing articles with large language models/RfC - Wikipedia

arcine 5d ago

Treating it like a tool instead of treating it like a God. What a novel idea!