Andrew Mar 26, 2024

I have a preprint out estimating how many scholarly papers are written using ChatGPT etc. I estimate that upwards of 60k articles (>1% of global output) published in 2023 were LLM-assisted. https://arxiv.org/abs/2403.16887

How can we identify this? Simple: there are certain words that LLMs love, and they suddenly started showing up *a lot* last year. Twice as many papers call something "intricate", with big rises for "commendable" and "meticulous".

#bibliometrics #scholcomm #chatgpt

ChatGPT "contamination": estimating the prevalence of LLMs in the scholarly literature

The use of ChatGPT and similar Large Language Model (LLM) tools in scholarly communication and academic publishing has been widely discussed since they became easily accessible to a general audience in late 2022. This study uses keywords known to be disproportionately present in LLM-generated text to provide an overall estimate for the prevalence of LLM-assisted writing in the scholarly literature. For the publishing year 2023, it is found that several of those keywords show a distinctive and disproportionate increase in their prevalence, individually and in combination. It is estimated that at least 60,000 papers (slightly over 1% of all articles) were LLM-assisted, though this number could be extended and refined by analysis of other characteristics of the papers or by identification of further indicative keywords.

arXiv.org
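
The keyword approach described above can be sketched in a few lines. This is only an illustration under stated assumptions, not necessarily the preprint's procedure: the function name `excess_keyword_papers`, the flat (mean-share) baseline, and every number in the example are hypothetical.

```python
def excess_keyword_papers(keyword_counts, total_counts, target_year):
    """Estimate how many papers in target_year use a keyword beyond its prior share.

    keyword_counts: {year: papers containing the keyword}
    total_counts:   {year: all papers indexed that year}
    Assumption: the expected share is the mean share over all earlier years.
    """
    baseline_years = [y for y in keyword_counts if y < target_year]
    if not baseline_years:
        raise ValueError("need at least one year before target_year")

    baseline_shares = [keyword_counts[y] / total_counts[y] for y in baseline_years]
    expected_share = sum(baseline_shares) / len(baseline_shares)
    observed_share = keyword_counts[target_year] / total_counts[target_year]

    # Papers above what the earlier share would predict; never negative.
    excess = (observed_share - expected_share) * total_counts[target_year]
    return max(0.0, excess)


# Made-up counts, for illustration only (not data from the preprint).
keyword_counts = {2019: 10_000, 2020: 10_000, 2021: 10_000, 2022: 10_000, 2023: 20_000}
total_counts = {year: 1_000_000 for year in keyword_counts}
print(round(excess_keyword_papers(keyword_counts, total_counts, 2023)))
```

With these invented inputs the keyword's share doubles in 2023, so the excess comes out at about 10,000 papers. A flat baseline is the simplest choice; a keyword that was already trending before 2023 would need a trend-aware baseline instead.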

Steffen Christensen Mar 26, 2024

@generalising Fantastic work, Andrew!
Thank you so much. Now I can search web data for posts, searches, and media using the same token words. :)

Steffen Christensen Mar 26, 2024

@generalising I'm number 2!!

Steffen Christensen Mar 26, 2024

@generalising Do you have the data table for the 90 top words? I'd love to see how the also-rans performed vis-a-vis the top 10! :)

Andrew Mar 26, 2024

@Wikisteff No, these were all done by hand so I didn't want to spend a full week on doing all 100! Might be practical to test them all using the Dimensions API, though?

Steffen Christensen Mar 27, 2024

@generalising I computed the number of standard deviations by which 2023-2024 usage sits above the 2016-2022 baseline, using a quadratic time series model of language use. All your control words came out significantly different from the model, except for "before" and "earlier".

Steffen Christensen Mar 27, 2024

@generalising It's not a great model; I should probably use ARIMA for the baseline trend, but I don't have that implemented in my Excel library the way I do quadratic regression.
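
A rough sketch of the calculation described in the two posts above, in Python rather than Excel: fit a quadratic to the 2016-2022 values, then express later years as residual standard deviations above the extrapolated curve. The function name `baseline_z_scores`, the use of `numpy.polyfit`, the residual-based sigma, and the sample frequencies are all assumptions for illustration; the spreadsheet model in the thread may differ in its details.

```python
import numpy as np


def baseline_z_scores(years, freq, fit_start=2016, fit_end=2022, degree=2):
    """Return {year: z} for years after fit_end, relative to a polynomial baseline."""
    years = np.asarray(years, dtype=float)
    freq = np.asarray(freq, dtype=float)

    in_fit = (years >= fit_start) & (years <= fit_end)
    dof = int(in_fit.sum()) - (degree + 1)
    if dof <= 0:
        raise ValueError("not enough baseline years for this polynomial degree")

    # Center the years so the polynomial fit is numerically well conditioned.
    center = years[in_fit].mean()
    coeffs = np.polyfit(years[in_fit] - center, freq[in_fit], degree)

    # Residual standard deviation of the baseline fit.
    residuals = freq[in_fit] - np.polyval(coeffs, years[in_fit] - center)
    sigma = np.sqrt((residuals ** 2).sum() / dof)

    # Extrapolate the baseline and score the later observations against it.
    later = years > fit_end
    predicted = np.polyval(coeffs, years[later] - center)
    z = (freq[later] - predicted) / sigma
    return dict(zip(years[later].astype(int).tolist(), z.tolist()))


# Made-up word frequencies per year, for illustration only.
years = [2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024]
freq = [100, 104, 107, 112, 115, 121, 124, 180, 260]
print(baseline_z_scores(years, freq))
```

This simplified version uses only the in-sample residual spread, so it ignores the extra uncertainty that comes from extrapolating beyond the fitted years and will tend to overstate significance somewhat.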

Steffen Christensen Mar 27, 2024

@generalising The LLM-boosted adjectives are all up, except for "fresh", "potent", and "ingenious". Enormous effect size in 2024, as you noted.

Steffen Christensen Mar 27, 2024

@generalising For the adverbs, insane significance levels for "meticulously", "methodically", "compellingly", "impressively", and "strategically", alongside a significant decline in "reportedly", "excellently", and "undoubtedly".

Steffen Christensen

@generalising All your synthetic tests do great by this measure, but of course you already knew that. :)