The strain on scientific publishing 📄:

The publishing sector has a problem. Scientists are overwhelmed, editors are overworked, special issue invitations are constant, and then there are the research paper mills, article retractions, journal delistings… JUST WHAT IS GOING ON!?

Pablo, @paolocrosetto, Dan and I have spent the last few months investigating just that.
https://arxiv.org/abs/2309.15884

A thread 🧵 1/n

#AcademicChatter #PublishOrPerish #Elsevier #Springer #MDPI #Wiley #Frontiers #PhDAdvice #PhDChat #SciComm

Scientists are increasingly overwhelmed by the volume of articles being published. Total articles indexed in Scopus and Web of Science have grown exponentially in recent years; in 2022 the article total was ~47% higher than in 2016, which has outpaced the limited growth (if any) in the number of practising scientists. Thus, publication workload per scientist (writing, reviewing, editing) has increased dramatically. We define this problem as the strain on scientific publishing. To analyse this strain, we present five data-driven metrics showing publisher growth, processing times, and citation behaviours. We draw these data from web scrapes, requests for data from publishers, and material that is freely available through publisher websites. Our findings are based on millions of papers produced by leading academic publishers. We find specific groups have disproportionately grown in their articles published per year, contributing to this strain. Some publishers enabled this growth by adopting a strategy of hosting special issues, which publish articles with reduced turnaround times. Given pressures on researchers to publish or perish to be competitive for funding applications, this strain was likely amplified by these offers to publish more articles. We also observed widespread year-over-year inflation of journal impact factors coinciding with this strain, which risks confusing quality signals. Such exponential growth cannot be sustained. The metrics we define here should enable this evolving conversation to reach actionable solutions to address the strain on scientific publishing.


First things first: growth in articles published each year has outpaced the number of scientists doing the publishing. With #PublishOrPerish, we all face an ever-increasing workload (writing, reviewing, editing…). It’s been rough.

Strain itself is neutral: this could be a welcome change! Are we becoming more efficient? Are we combatting biases (academic racism, positive result bias)?

If that’s all it were, the solution to strain would be to build a better infrastructure.

But… well… it’s not. 2/n

We see that certain groups are major drivers of this article growth, in some cases seemingly out of nowhere. This includes your classic publishers like #Elsevier and #Springer, but also the upstarts #Frontiers and… most significantly, #MDPI.

In numbers, there were nearly 1 million more articles per year published in 2022 (2.8m) compared to 2016 (1.9m). MDPI takes the lion’s share at 27% of that growth, with Elsevier (16%) a distant 2nd.

How did we get to this point? 3/n
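For anyone who likes to see the arithmetic, those numbers check out in a few lines of Python. A quick sketch: the totals are the ones quoted above (in articles per year), and the growth shares are the ones from our preprint.

```python
# Back-of-the-envelope check of the growth numbers quoted above.
# Totals are articles per year, from the post (1.9m in 2016, 2.8m in 2022);
# the growth shares (27%, 16%) are the ones from the preprint.
articles_2016 = 1.9e6
articles_2022 = 2.8e6

growth = articles_2022 - articles_2016  # extra articles per year by 2022

mdpi_share = 0.27      # MDPI's share of that growth
elsevier_share = 0.16  # Elsevier's share

print(f"Growth:   {growth / 1e6:.1f}m articles/year")  # 0.9m
print(f"MDPI:     {mdpi_share * growth:,.0f} articles/year")
print(f"Elsevier: {elsevier_share * growth:,.0f} articles/year")
```

So MDPI alone accounts for roughly a quarter of a million of those extra articles each year.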

I could be nuanced (it's in the preprint!). But let’s be frank: it’s special issues.

“Dear Dr ___, your preeminent work in [FIELDYOUDONTWORKIN] drew our attention to your [COPYPASTEPAPERTITLE] and we were thoroughly aroused. We invite you to submit to special issue with us, who love your preeminence. Yours faithfully, [AROUSED].”

The figure speaks for itself. With my leftover characters, I instead wanna ask y’all to send me screenshots of your favourite SI invitations! Hit me! 😀 4/n

So still… is it worth it? Strain itself is neutral. Maybe these special issues are just giving a voice to authors with less privilege?

Or maybe not. The publishers hosting special issues drastically reduced their turnaround times (TATs: submission to acceptance). And let’s be clear: that’s INCLUDING revisions. 5/n

Now, it’s not our place to judge what an average TAT is supposed to be, but we’re very confident it’s not 37 days across all research fields. Reviewer-requested experiments take weeks in fruit flies, but months in mice.

TATs are also supposed to vary from article to article: some articles are great on the 1st draft, some need a little TLC, and some need… a lot… Yet #MDPI journals in particular, across the board, accept everything in a blistering 37 days with almost no variation. 6/n
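To see why the “almost no variation” part is the tell, here’s a tiny sketch with TATs I made up (hypothetical numbers, not our data): the spread relative to the mean (the coefficient of variation) is what collapses.

```python
import statistics

# Hypothetical TATs in days (submission to acceptance) for two journals.
# These numbers are invented for illustration; they are not our data.
varied_journal = [45, 90, 160, 210, 75, 300, 120]  # the spread you'd expect
uniform_journal = [36, 37, 38, 37, 36, 38, 37]     # suspiciously flat

for name, tats in [("varied", varied_journal), ("uniform", uniform_journal)]:
    mean = statistics.mean(tats)
    cv = statistics.stdev(tats) / mean  # coefficient of variation
    print(f"{name}: mean {mean:.0f} days, CV {cv:.2f}")
```

A journal-wide CV that close to zero means how much revision a paper needs barely moves the needle on acceptance time. That should not happen under genuine peer review.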

But it’s not just #MDPI: #Frontiers and #Hindawi also grew their share of special issues. One might argue: “These are just labels publishers use. The peer review process is the same.”

Au contraire, mon ami: no, it’s not. Special issues have lower TATs. They’re intended to be lax. They’re for authors to voice ideas that could turn out to be wrong, but advance the conversation in the field. That’s what they used to be, at least… and what made them “special.” But I digress… 7/n

We also looked at rejection rates (RRs), with some caveats: we took each publisher’s word for what their RRs were, and we don’t know the underlying methods. But we figured RRs will at least be calc’d consistently within publisher groups. We compared relative RRs over time, and RRs against proportions of special issues.

Again, #MDPI was the maverick, with a unique decline in RRs over time. Not only that, but in both #Hindawi & MDPI, more special issues means lower RRs. The review process *is not* the same. 8/n

Lastly, let’s talk #ImpactFactor (IF). Reminder: IF = the average citations a journal’s articles receive within their first 2 years. IF counts total cites, no matter the source.

IFs are going up 📈: they’re literally being inflated like a currency. So if you see a journal celebrating its year-over-year increase in IF, you’ve gotta normalize for inflation. This inflation accompanies the huge crush of special issues from earlier. But(!) a citation network-adjusted rank (Scimago Journal Rank, SJR) hasn’t changed accordingly. What gives? 9/n
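For the record, the standard two-year IF calculation is just a ratio. A minimal sketch (the counts below are invented for illustration):

```python
# Toy version of the standard two-year Impact Factor:
# citations received in year Y to items published in years Y-1 and Y-2,
# divided by the number of citable items published in Y-1 and Y-2.
def impact_factor(cites_to_prev_2y: int, citable_items_prev_2y: int) -> float:
    return cites_to_prev_2y / citable_items_prev_2y

# Invented example: 900 cites to 300 recent articles gives IF = 3.0.
# The numerator counts ALL cites, including self-cites, which is
# exactly what makes IF inflatable.
print(impact_factor(900, 300))  # 3.0
```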

Well, SJR is complex, but the main thing is it doesn’t reward self-citations, or circular citations from so-called “citation cartels.”

In other words:

** IF just cares about total citations, but doesn’t pay attention to where they come from.
** SJR pays attention, and doesn’t reward you or your buddies for reciprocal back scritchies

10/n

Then there’s Goodhart’s law: “when a measure becomes a target, it ceases to be a good measure.”

We use IFs and publications as a measure, but now they’re targets. There are many studies on the consequences, such as @abalkinaanna’s work on paper mills:
https://onlinelibrary.wiley.com/doi/pdf/10.1002/leap.1574

And then there’s this: https://fediscience.org/@MarkHanson/111104919139171425

That’s what you get from #PublishOrPerish 🤷‍♂️ 11/n

We developed a new metric that we call “Impact Inflation”: the ratio of Impact Factor to Scimago Journal Rank (IF/SJR). Because IF counts total cites (no matter the source), while SJR doesn’t reward aggressive self- and co-citation, IF can become extremely inflated relative to SJR for journals hosting citation cartels.

Key point: Impact Inflation is a metric that shows to what extent a journal has succumbed to Goodhart’s law. And well… once again #MDPI leads the pack. 12/n
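Since Impact Inflation is just the ratio IF/SJR, it’s one line of code. A sketch with hypothetical values (not any real journal):

```python
# Impact Inflation, as defined in the thread: Impact Factor over
# Scimago Journal Rank. High values flag citations that boost IF
# but that SJR's source-weighting discounts (e.g. citation cartels).
def impact_inflation(impact_factor: float, sjr: float) -> float:
    return impact_factor / sjr

# Hypothetical journals with identical IFs but different citation sources:
print(impact_inflation(impact_factor=4.0, sjr=1.6))  # broadly cited
print(impact_inflation(impact_factor=4.0, sjr=0.4))  # heavily self-cited
```

Same IF, very different story once you ask where the cites come from.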

Talking within-journal self-cites: once again, #MDPI has the highest rates…

What’s more, we also see groups like #Hindawi with higher Impact Inflation but normal self-cite levels. What gives?

Well, SJR also weights a citation based on where it comes from, and because MDPI journals aren’t well-cited (except by themselves), their citations aren’t worth much. And because MDPI’s growth came out of nowhere, they’re now exporting huge numbers of citations to others, with a particular penchant for Hindawi. 13/n

So where does that leave us? Well, it’s easy to talk about #MDPI because… scroll up. But fundamentally we need to address strain. We’re all overworked, and we can’t let this go on.

Our metrics tell us this growth isn’t rigorous science. Special issues are lowering standards, which nets groups like MDPI more articles, and more money 💱. We don’t have revenue data, but for-profit gold OA ties revenue to articles published. So it’s no surprise that some groups are gonna spam these engines of growth. 14/n

Science needs accountability. The public needs to trust “peer-reviewed” papers have some minimum standard. These crazy-prolific special issues are damaging the authority and integrity of science.

It’s also costly: millions of scientists writing, reviewing, editing, and for what? These extra ~1m annual articles aren’t necessary. What’s more: we’re under-describing the strain because we’re only using journals indexed in both Scopus and Web of Science. Surprise! It’s actually even worse 🙃 15/n

That said: we’re just four white guys who all got fascinated with the craziness of the publishing sector. But you, the reader, can help. Publishing scientific articles can’t be like ordering fast food: “I’d like one special issue article please, hold the critiques.”

Special issues need to be a rare treat. A “sometimes” food. And when you’re invited to publish in one, or host one, that invite shouldn’t come from an algorithm. We should try to establish this basic #ResearchCulture 16/n

You know who CAN make a difference though? Funders, universities, academies of science, @wellcometrust, @ukrio, @snsf_ch, @DORAssessment, etc.… we need your help!

We need policies that treat special issues differently, because they are different. We need guidelines from #COPE on a reasonable minimum rigour for #peerreview. We need standard reporting of key metrics like RRs, profit margins, etc.… We need leadership. Thank you for all you’ve already done and all you’re going to do. We’re up to chat! 17/n

Now the mushy stuff. Pablo, @paolocrosetto, Dan: it’s been an incredible privilege to work with you all on this. I learned a ton about coding and reproducible research practices through this project.

Also: thanks for putting up with me, I know I’m a lot. As we’ve heard many times from folks over the last months: “this work you guys are doing is really important.” I believe it. Still banned from the text though. 18/n

A last point: we really hummed and hawed about if and how we could release scripts and data, but we just can’t right now. Lawyers told us not to. We’re like… 99% sure we didn’t do anything risqué, but these things can’t be rushed. We’ll update the preprint if/once we’ve confirmed everything. Sorry about that, but hope it’s understandable. 19/n
@MarkHanson I mean, I guess? But somehow ironically, this also ties neatly into a whole different issue of what's borked about publishing as well. The lack of transparency and reproducibility. 😞
@adamhsparks Oh I know. We will make the data available to peer reviewers and we really hope to update the preprint as soon as we've got the all-clear. But we web scraped a ton of data that makes this very tricky across international boundaries. We're confident we'll be protected within UK Fair Dealings (they have an explicit section on text mining for non-commercial research), but as our author group is spread out, we didn't have definite protections from elsewhere and lawyers told us "wait."