I have a preprint out estimating how many scholarly papers were written using ChatGPT and similar tools. I estimate upwards of 60k articles (>1% of global output) published in 2023. https://arxiv.org/abs/2403.16887

How can we identify this? Simple: there are certain words that LLMs love, and they suddenly started showing up *a lot* last year. Twice as many papers call something "intricate", with big rises for "commendable" and "meticulous".

#bibliometrics #scholcomm #chatgpt

ChatGPT "contamination": estimating the prevalence of LLMs in the scholarly literature

The use of ChatGPT and similar Large Language Model (LLM) tools in scholarly communication and academic publishing has been widely discussed since they became easily accessible to a general audience in late 2022. This study uses keywords known to be disproportionately present in LLM-generated text to provide an overall estimate for the prevalence of LLM-assisted writing in the scholarly literature. For the publishing year 2023, it is found that several of those keywords show a distinctive and disproportionate increase in their prevalence, individually and in combination. It is estimated that at least 60,000 papers (slightly over 1% of all articles) were LLM-assisted, though this number could be extended and refined by analysis of other characteristics of the papers or by identification of further indicative keywords.
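
A rough sketch of the kind of keyword arithmetic the abstract describes: take the share of papers using a word in earlier years, project it onto the target year, and count the papers above that expectation. This is not the preprint's actual procedure. The flat-average baseline, the function names, and every number in the example are made-up placeholders for illustration.

```python
def prevalence(hits, total):
    """Share of a year's papers that contain the keyword."""
    return hits / total


def excess_papers(hits_by_year, totals_by_year, target_year, baseline_years):
    """Papers using the keyword beyond what the baseline years would predict.

    Assumption: the expected share is the plain average of the baseline
    years' shares, which is a simplification.
    """
    shares = [prevalence(hits_by_year[y], totals_by_year[y]) for y in baseline_years]
    expected_share = sum(shares) / len(shares)
    expected_hits = expected_share * totals_by_year[target_year]
    return hits_by_year[target_year] - expected_hits


# Placeholder counts, NOT real data: the keyword doubles in the target year.
hits = {2021: 10, 2022: 10, 2023: 20}
totals = {2021: 1000, 2022: 1000, 2023: 1000}

ratio = prevalence(hits[2023], totals[2023]) / prevalence(hits[2022], totals[2022])
excess = excess_papers(hits, totals, 2023, [2021, 2022])
```

With these placeholder counts the year-on-year ratio is 2 and about 10 papers sit above the baseline expectation. Adding up such excesses across several keywords would double-count papers that use more than one of them, so a real estimate has to handle combinations separately.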

@generalising Fantastic work, Andrew!
Thank you so much. Now I can search web data for posts, searches, and media using the same token words. :)
@generalising I'm number 2!!
@generalising Do you have the data table for the 90 top words? I'd love to see how the also-rans performed vis-a-vis the top 10! :)
@Wikisteff No, these were all done by hand so I didn't want to spend a full week on doing all 100! Might be practical to test them all using the Dimensions API, though?
@generalising For 2023-2024, I computed how many standard deviations each word sits above a baseline quadratic time-series model of language use fitted to 2016-2022. All your control words came out significantly different from the model, except for "before" and "earlier".
@generalising It's not a great model. I should probably use an ARIMA for the baseline trend, but I don't have that implemented in my Excel library the way I do quadratic regression.
@generalising The LLM-boosted adjectives are all up, except for "fresh", "potent", and "ingenious". Enormous effect size in 2024, as you noted.
@generalising For the adverbs, insane significance levels for "meticulously", "methodically", "compellingly", "impressively", and "strategically"; alongside a significant decline in "reportedly", "excellently", and "undoubtedly".
@generalising All your synthetic tests do great by this measure, but of course you already knew that. :)
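
For anyone who wants to try the quadratic-baseline check from the replies above outside Excel, here is a minimal sketch in plain Python. It is a guess at the approach, not the actual spreadsheet: it fits a quadratic to the baseline years, then expresses each later year's deviation in units of the baseline residuals' standard deviation. The function names and the example series are hypothetical placeholders, not real word frequencies.

```python
from statistics import stdev


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Build the 3x3 normal equations, augmented with the right-hand side.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def z_scores(series, baseline_years, test_years):
    """series maps year -> word frequency; returns test year -> z-score."""
    origin = baseline_years[0]  # shift years so the power sums stay small
    xs = [y - origin for y in baseline_years]
    ys = [series[y] for y in baseline_years]
    a, b, c = fit_quadratic(xs, ys)

    def predict(year):
        x = year - origin
        return a + b * x + c * x ** 2

    # Simplification: sample standard deviation of the baseline residuals,
    # without correcting for the three fitted parameters.
    sigma = stdev(series[y] - predict(y) for y in baseline_years)
    return {y: (series[y] - predict(y)) / sigma for y in test_years}


# Placeholder series (NOT real data): flat-ish baseline, then a jump.
series = {2016: 10.0, 2017: 10.4, 2018: 9.9, 2019: 10.3, 2020: 10.1,
          2021: 10.5, 2022: 10.2, 2023: 20.0, 2024: 25.0}
z = z_scores(series, list(range(2016, 2023)), [2023, 2024])
```

A baseline that happens to be perfectly quadratic gives zero residual spread, so the division would fail. Real frequency data is noisy enough that this should not come up, but a guard would be sensible in anything beyond a sketch.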