The PNAS paper from Monday got a lot of attention:

https://doi.org/10.1073/pnas.2420092122

One particularly attention-grabbing point was the growth of paper mill papers, i.e., the red line.
The area under the black curve is the entire scholarly literature. Judging from reproducibility projects, I have added the number of articles that are likely to be irreproducible (yellow).
Sure, paper mills may someday become a problem. But compared to irreproducibility, they are a really minor one:

https://bjoern.brembs.net/2024/02/how-reliable-is-the-scholarly-literature/

P.S.:
Yes, the Y-axis should be labelled "per year"

I wonder what makes the red curve so interesting and attention-grabbing that people completely forget the yellow curve? Is an AI-generated unreliable paper somehow worse than a human-generated unreliable paper?

@brembs I wonder if it is because the yellow line is a constant fraction of the black line, which means maybe we can live with it. Science has been doing well over those years (despite there being some irreproducible papers*). But exponential growth signals a problem that will soon get out of hand. Can we stop it before it gets to be a pandemic?

* As Stuart Firestein asks in his wonderful book "Failure": what proportion of initial results should we expect to be irreproducible in a robust and healthy scientific enterprise? That's not a trivial question.

@adredish
I probably should have emphasized more that the yellow line is not in the original figure. I added it as a very rough guesstimate from reproducibility studies. My implied assumption that the rate was constant is very likely a gross oversimplification.

The original authors do not seem concerned at all and neither were the commentators I've seen.

@brembs @adredish I don't understand the yellow line, Bjoern. Isn't it reasonable to assume that paper mill articles have higher irreproducibility rates than other articles, in which case that yellow line should be moving toward the black line?

@UlrikeHahn @adredish

Excellent catch! I created the yellow line simply by copying the black one and positioning it at roughly 50%.

That being said, if paper mill papers, say, plagiarize figures, their replicability should be the same as that of the source material. Not sure what fraction of experimental papers is totally made up, though.

Either way, assuming a constant 50% is of course a gross oversimplification! But it was just meant as an illustration to give a better sense of proportion.

@UlrikeHahn @adredish

My linked post has more numbers...