So about five days ago, people on Bsky and Twttr started highlighting Elsevier science papers riddled with GPT/LLM hallmark phrases. [Dozens and dozens (at least)] of peer-reviewed papers.

As I said then, and as I discussed in my dissertation, knowledge-making and expertise have always been tricky processes, and they need deep, intentional confrontation and reform:
https://media.proquest.com/media/hms/PRVW/1/twSaS?_s=yIAhHtzhif4xd76I%2BihtcJJXTPw%3D

Anyway, now it looks like @404mediaco has dug into this and found *Even More of It*, and I am genuinely and completely struggling against despair at what the future of being an educator, researcher, and writer will even mean over the next 5 years.
https://www.404media.co/scientific-journals-are-publishing-papers-with-ai-generated-text/

Quite frankly, this should genuinely a) be the death of peer review as we know it (again: AS WE KNOW IT), and b) lead to a complete reformulation of the knowledge-making and expertise processes, but it won't, and that terrifies and saddens me.

@Wolven @404mediaco Are you sure it's thousands of peer-reviewed papers? I'd love to see the evidence. Paper mills have certainly published many thousands of articles, but most of those including AI shibboleths seem to be in predatory journals, preprints, and grey literature.
@mattjhodgkinson I may be misremembering the total, or misapplying it to "AI" papers in particular, when it was about deceptive practices as a whole. Going to tag @gcabanac in to help clarify (he does most of the research I'm talking about), and I'll amend my language in the post in the meantime. @404mediaco
@Wolven @gcabanac @404mediaco Guillaume is indeed the expert on 'tortured phrases'!

@mattjhodgkinson @Wolven @404mediaco I screen the literature with fingerprints that reflect 2 things.

First: paraphrasing with synonyms that creates ‘tortured phrases’; see https://www.irit.fr/~Guillaume.Cabanac/problematic-paper-screener/tortured .

Second: conversational prologues from ChatGPT; see the screenshot and https://retractionwatch.com/papers-and-peer-reviews-with-evidence-of-chatgpt-writing/

See my slides https://hal.science/hal-04225515v7 (and part of them are in English here https://hal.science/hal-04225515v4)
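The screening described above can be sketched as simple fingerprint matching. This is a minimal illustration only: the phrase lists below are a few published examples, not the Problematic Paper Screener's actual fingerprint sets or detection logic.

```python
# Minimal sketch of fingerprint-based literature screening.
# The lists are tiny illustrative samples; real detectors use
# large, curated fingerprint collections.

# 'Tortured phrases': synonym-mangled versions of standard terms.
TORTURED_PHRASES = [
    "counterfeit consciousness",  # artificial intelligence
    "profound learning",          # deep learning
    "colossal information",       # big data
]

# Conversational markers left behind when ChatGPT output is pasted in.
CHATGPT_MARKERS = [
    "as an ai language model",
    "regenerate response",
    "certainly, here is",
]

def screen(text: str) -> dict:
    """Return the fingerprints found in `text`, grouped by type."""
    lowered = text.lower()
    return {
        "tortured": [p for p in TORTURED_PHRASES if p in lowered],
        "chatgpt": [m for m in CHATGPT_MARKERS if m in lowered],
    }

hits = screen("We train a profound learning model. As an AI language model, I cannot...")
# hits["tortured"] -> ["profound learning"]
# hits["chatgpt"]  -> ["as an ai language model"]
```

Substring matching like this is crude (it misses paraphrases and can false-positive on legitimate quotations), which is why flagged papers still need a human look, e.g. via PubPeer.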



@Wolven @mattjhodgkinson @gcabanac @404mediaco is it possible that LLMs were trained on these Elsevier papers and the hallmark phrases are recognizable regurgitations? Elsevier seems like it would happily charge a fee and let AI companies have at its database, especially since it may accept papers under terms that give it the legal control to do so
@metaphase @Wolven @gcabanac @404mediaco These are conversational markers of someone using ChatGPT.
@Wolven @404mediaco more stuff for guys and gals at PubPeer to look through :)

@Wolven @404mediaco
LLM phrases are one thing, though I wonder how they survived any kind of peer review...

I see the real problem in the "publish or perish" mindset, encouraging researchers to go for quantity over quality, "salami-slice dissemination" and so on.

To weed out most bad papers, having someone actually read them would suffice. And maybe we should stop thinking in "half paper" (4 pages) and "paper" (8 pages) categories, allow omitting the state of the art when it isn't needed, and... maybe even find ways to automatically generate a proper introduction.
With less LLM magic and more science, maybe.

@wakame @Wolven @404mediaco

Indeed, we do not need LLMs to write unreliable articles at all. If you just take "regular" psychology and cancer biology articles, they contribute ~200k unreliable articles to the literature every year - and that's just two fields:

https://bjoern.brembs.net/2024/02/how-reliable-is-the-scholarly-literature/

@brembs @wakame @Wolven @404mediaco I am sure the “publish or perish” rat race contributes to this state of affairs but so does Elsevier’s monopoly position. They have a license to print money in this field, it’s not like they need to actually put in the effort of closely reviewing what they publish.
@brembs @wakame @Wolven @404mediaco Oh the only thing they care to invest in is DRM and rent-seeking, haha, of course; we live in hell.

@Wolven @404mediaco Just my opinion, but I think education will become more important as this misinformation/branding-obsessed/bullshit era develops, rapidly fueled by what is stupidly called "AI". However, I think traditional teaching will need to be combined with training to distinguish between reality and fakery.

On my reading list (I haven't read it yet) is a book by @ct_bergstrom and J. West called "Calling Bullshit: The Art of Skepticism in a Data-Driven World". Looks like one good source of ideas on how to move forward, especially for educators.

@aebrockwell @404mediaco @ct_bergstrom Carl and I are due for a conversation, because we work the same beat on a lot of these things
@Wolven I am firmly of the opinion that a lot more educators, researchers, and knowledge-makers of all kinds need to take some cues from Paulo Freire.
@Wolven @404mediaco "What looks like a Crisis is often simply the end of an Illusion."
http://webseitz.fluxent.com/wiki/Crisis
@Wolven @404mediaco clearly peer+editorial review was already fake.
@Wolven it is hugely demoralizing. Peer review has obviously been broken for a long time, but this is definitely unveiling the magnitude in a highly visible way
@Wolven @404mediaco I mean my institution is *actively encouraging* academics to use LLMs in their work, it's part of the official strategy document, so it goes deeper than just flaws in peer review; a bunch of researchers have concluded that this is a legitimate way to do science 🤮
@jimbob @Wolven @404mediaco I think using LLMs in and of itself won't make a work illegitimate. One can think of it as a sophisticated autocomplete: if it suggests "iture" after you type "furn", that doesn't make your work illegitimate. It's harder to believe the analogy holds for LLMs, though, but I bet it does
@sanfierro @Wolven @404mediaco had a student recently, whose written English was not great, who tried to use an LLM to improve the writing on a paper, employing it sentence by sentence. The results were *much* worse than their own writing had been... even at that small scale, small, nonsensical mistakes appeared.

@Wolven @404mediaco @steve

Peer review has been a broken joke for a while, imho.

Sometimes it works. But more often than not, it slows the publication of legit work because of academic beefs and fiefdoms, while still letting junk through.

All while delaying knowledge, enriching publishers, and overburdening researchers.

I used to think arXiv and similar were the answer. But how can those largely volunteer-run community publications be expected to keep LLM junk out?