You really have to wonder how the powers-that-be at Wikipedia thought that putting goddamned AI summaries on Wikipedia pages was anything other than Trump-level stupidity.
@lauren it's my favourite thing at the moment: saying that those who support LLM AI are "into that trump shit"

@lauren I think they ("we", since I work there) thought that the general public would distinguish language models from "AI" and that a bunch of "warning this is slop" messages would be enough.

Some bits of grey:

1. We've used language models for machine translation in the Content Translation tool for a decade, and probably (in a bit of hubris) consider ourselves somewhat expert in identifying and mitigating the biases inherent in that technology (biases in dozens of languages!). All translated content is human reviewed. We were giving conference talks on AI bias before that was cool.
2. We also use language models to help vandal fighters, and again have been doing so for a decade; the models are monitored for bias, their outputs are human reviewed, etc. (see the sketch after this list).
3. Historically, all of our article summaries have been human-generated. Not a lot of articles have summaries. Humans are slow. There are millions of articles in dozens of languages.
4. "Some folks at a conference" thought it would be interesting to make an opt-in experiment, to see if we could use what we know about responsible use of language models to make a summary tool. The experiment was opt-in and even if you opted in the summary was collapsed by default; I think it was also only shown if we didn't have a human-authored summary. Part of the experiment was to try to quantify usefulness, bias, etc.
5. Before the experiment was launched, folks cried AI and shut it down. (This should have been foreseen!)
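
To make (2) concrete, here's roughly what that triage loop looks like. A minimal Python sketch, assuming the legacy ORES v3 scoring API and its response shape; the 0.8 threshold and the review queue routing are made up for illustration, not WMF policy:

```python
# Sketch of ML-assisted vandalism triage, in the spirit of (2) above.
# Assumes the legacy ORES v3 "damaging" model endpoint and its JSON shape;
# the 0.8 threshold and the review queue are illustrative, not WMF policy.
import requests

ORES_URL = "https://ores.wikimedia.org/v3/scores/{wiki}/"


def damaging_probability(wiki: str, rev_id: int) -> float:
    """Ask the scoring service how likely a revision is to be damaging."""
    resp = requests.get(
        ORES_URL.format(wiki=wiki),
        params={"models": "damaging", "revids": rev_id},
        timeout=10,
    )
    resp.raise_for_status()
    score = resp.json()[wiki]["scores"][str(rev_id)]["damaging"]["score"]
    return score["probability"]["true"]


def triage(wiki: str, rev_ids: list[int], threshold: float = 0.8) -> list[int]:
    """Return the revisions worth a human patroller's attention.

    The model never reverts anything itself; it only ranks edits so
    volunteers spend their limited time on the riskiest ones first.
    """
    return [r for r in rev_ids if damaging_probability(wiki, r) >= threshold]
```

The design point is that the model only prioritizes; a human patroller still makes every revert decision.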

I am no fan of generative AI, but language models in general have been a useful tool: triaging vandalism, translating content, helping volunteers communicate across language barriers, etc. They also have biases we have studied extensively (e.g., "doctor" and "teacher" get translated with gendered forms in languages with grammatical gender, to start a long, long list). Especially in the translation use case, they have the potential to help alleviate systemic inequities by making more of our content accessible in minority languages. WMF is trying to figure out how to navigate a world where bad actors and slop are doing their best to poison the well, and messaging technical subtleties is not one of its strong points.
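
A toy illustration of that occupation-bias probe: translate a sentence with an ungendered English subject and check which gendered form comes back. The translate() hook is a hypothetical stand-in for whatever MT model is under test, and the marker matching is deliberately naive:

```python
# Toy probe for gendered translation of English occupation nouns.
# translate() is a HYPOTHETICAL stand-in for the MT model under test;
# the marker strings would be per-language (e.g. "el "/"la " for Spanish).
OCCUPATIONS = ["doctor", "teacher", "nurse", "engineer"]


def translate(text: str, target_lang: str) -> str:
    """Hypothetical hook: wire the machine-translation backend in here."""
    raise NotImplementedError


def probe_gender_bias(target_lang: str, masc_marker: str, fem_marker: str) -> dict[str, str]:
    """Record which grammatical gender the model assigns each occupation."""
    results = {}
    for job in OCCUPATIONS:
        out = translate(f"The {job} said hello.", target_lang).lower()
        if masc_marker in out and fem_marker not in out:
            results[job] = "masculine"
        elif fem_marker in out and masc_marker not in out:
            results[job] = "feminine"
        else:
            results[job] = "unmarked/ambiguous"
    return results

# e.g. probe_gender_bias("es", "el ", "la ") on a biased model might return
# {"doctor": "masculine", "teacher": "feminine", "nurse": "feminine", ...}
```

A real evaluation would use many more sentence templates, sampling, and morphological analysis across dozens of languages; this only shows the shape of the test.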

That said, anything that looked that similar to Google's slop summaries of search results was obviously a PR disaster waiting to happen, and I have no idea why no one on the C-team saw that trainwreck coming.

@cscott You probably are already aware of my feelings regarding the various types of AI and how they are being deployed, particularly in terms of their capacity to generate misinformation. But your reply really deserves a more detailed discussion, and I would be happy to have that conversation with you (or other relevant folks at WP/WM) in some depth, in a more direct way, if there is any interest. Thanks.

@lauren https://www.mediawiki.org/wiki/Reading/Web/Content_Discovery_Experiments/Simple_Article_Summaries for more information.

Bias research summary (in 2018!): https://wikimediafoundation.org/news/2018/10/10/mitigating-biases-artificial-intelligences-wikipedian-way/

I think the tl;dr answer is "hubris". Folks thought they could do this "better" and that everyone would agree it was "better" once they saw it working.

@cscott Two of the biggest problems with GAI are:

1) Even when GAI outputs are ostensibly human reviewed (e.g., automatically created police reports), studies show that, as one would expect, human review diligence drops off very rapidly, with predictably negative results.

2) No matter how a GAI summary in a highly visible (e.g., top-of-page) position is disclaimed, most people will simply accept that summary as authoritative and not bother checking the rest of the links (or, in the WP case, the article). This of course is ripe for the spread of errors, misinformation, and (again, in my opinion) the worst case: a mix of correct and incorrect information.

@lauren Agreed on both points. I'd just mention that neither of these problems is new to Wikipedia, which has had various semi-automated article creation/editing and vandalism tools since its inception decades ago. There's scads of research on (e.g.) stub articles, article quality, editor workflow, etc. "Is it better to have a low-quality article on a topic, or no article at all?" This is almost the foundational question of wiki, exemplified by the "inclusionist" and "deletionist" belief systems, still in dialog to this day. The answer isn't simple; there's a large body of praxis and experience and lessons learned, and we still try experiments and learn from them, because creating knowledge in this way remains unique.

You can expect vigorous debate and a dozen different solutions from the editor community, and the answers might be different when the wiki is small and editor-poor, or large and editor-rich. And sometimes editors throw out a bunch of content and start over, if that's what needs to happen.

As I said, the hubris was thinking that this was business as usual.