Great! A bunch of us here wanted it. Now it exists. 👍

It's a "dark archive" of the arXiv - a non-public backup to save the data in case of attack by hackers or the US government. The arXiv, I hope you know, is the biggest source of modern math and physics papers.

Who got the job done? The TIB: the Technische Informationsbibliothek, run by the Leibniz Information Centre for Science and Technology, in Hannover, Germany.

They write:

"The TIB has now set up a so-called dark archive for the arXiv content in order to be able to make the backed-up data accessible if the data stored in the USA is lost. The archive functions as a silent reserve: the complete copy of the content is stored decentrally at the TIB, but is not publicly accessible. This means that the data stock – almost 10 terabytes – is protected against potential outages and can be activated in an emergency.

The TIB is currently working on processes to keep the archive up to date: new submissions and updated versions must be backed up regularly in order to preserve the state of research as completely as possible.

“Building a Dark Archive is an expression of our longstanding commitment for a reliable, international academic provision, and as a partner of arXiv. Even though the Dark Archive today only works in the background, it is a key element in safeguarding digital research contents in the long term, because in case of a crisis, we could open the archive,” explains Dr Irina Sens, Deputy Director of the TIB."

We should call it the darXiv.

More details here:

https://blog.tib.eu/2025/05/14/protecting-science-tib-builds-dark-archive-for-arxiv/

Protecting Science: TIB builds Dark Archive for arXiv - TIB-Blog

Research and science are international; it is not for nothing that we speak of international specialist communities. Although a service such as arXiv is operated by an institution based in the USA, namely Cornell University, it is used by researchers worldwide. Part of arXiv's funding has also been internationalised since 2010 with the introduction of arXiv membership. The TIB finances the German contribution together with the Helmholtz Association of German Research Centres (HGF) and the Max Planck Society (MPG). The TIB has now set up a so-called dark archive for the arXiv content in order to make the backed-up data accessible in the event that the data located in the USA is lost.

@johncarlosbaez I second "darXiv" and would have suggested it if you hadn't
@bstacey @johncarlosbaez Yes, even if the name sounds as if the archive was run by Darx Vader.
@johncarlosbaez Dark energy, Dark matter and now, what? Dark Archive? Joe Rogan will be confused.
@johncarlosbaez Oh good. Perhaps the coming interval of barbarism can be reduced from thirty thousand years to a single thousand?
@johncarlosbaez
Not the worst point in time to take a snapshot of humankind's knowledge.
Just before it gets watered by Maledicant Intelligence.
At least there'll be something to train our kids on afterwards.
Sorry for being too realistic _.._

@blausand @johncarlosbaez

Or something our kids can use to train AI?

@johncarlosbaez This is amazing! I hope something similar is being done for the climate science literature and datasets, and medical research - ideally, all fields of science, but at the very least those under daily assault by the current US government.
@johncarlosbaez can they do that for all the science hosted in the U.S.? Especially by the government?
@johncarlosbaez is there a torrent for arxiv content? Would seed it.

@f4grx @johncarlosbaez Academic Torrents has one, but it's suspiciously small so I think it's not complete, may just be a catalogue or so: https://academictorrents.com/collection/arxiv

There is also one for biorxiv that seems to be more in the right ballpark: https://academictorrents.com/collection/biorxiv

@f4grx @johncarlosbaez see also the stuff that @SafeguardingResearch does under #sciop : decentralized backups of datasets in danger via BitTorrent. There's a ton of stuff to seed there too, sorted by threat level: https://sciop.net/datasets/?sort=-threat

@johncarlosbaez On a much smaller scale, Scientific American has helpfully preserved the pre-RFKJr vaccine recommendations from the U.S. Advisory Committee on Immunization Practices as of November 22, 2024.

https://www.scientificamerican.com/article/see-vaccine-recommendations-backed-by-science-in-these-handy-charts/

@gregeganSF - I'm glad people are saving old documents. The current US government attitude to medicine is worse than medieval.

https://www.nytimes.com/2025/06/25/health/hiv-lenacapavir-vaccine-trump-cuts-africa.html

https://archive.is/93JD3

Promise of Victory Over H.I.V. Fades as U.S. Withdraws Support

A new drug that gives almost complete protection against the virus was to be administered across Africa this year. Now, much of the funding for that effort is gone.

@johncarlosbaez

I am redeemed!

For a few decades now, I have been advocating for privacy and cyberfreedoms.
My argument always was that governments cannot be trusted with your data and your digital rights.

Every time I argued this, most people would smirk. Surely, we would never go back to Authoritarianism...

It took less than 6 months for the US government to turn into the enemy of the people.

Thanks for doing the good work of preserving public knowledge.

@n_dimension @johncarlosbaez I am worried that Trump may turn his attention to NARA... maybe execute DEI scrubbing on NARA's archives. That would be really disastrous! I went into the NARA building as part of a Wikimedia DC event, and I learned several things from it. Worth it. Now, how would we safeguard NARA from such actions? I have no idea. Especially since Trump really follows the unitary executive theory to the hilt by controlling NARA.
@thebluewizard - I don't know what NARA is!

@johncarlosbaez *gasp* It is the United States National Archives and Records Administration.

https://en.wikipedia.org/wiki/National_Archives_and_Records_Administration

@johncarlosbaez Wherein we are hiding our public libraries from the governments. Strange times we’re living in. Kudos to TIB!

https://mathstodon.xyz/@johncarlosbaez/114752342143968527

@fm_volker @johncarlosbaez There's a lovely manga called "Library Wars" in English (I don't know the Japanese title) in which libraries carry out full-scale military defence against governments trying to censor them.

@johncarlosbaez

If there ever came a time that the data was to be restored, how would anyone know that it was authoritative?

@number6 - I don't know. Luckily it's not like climate data where the numbers have to be exactly right and are easily changed in ways that are hard to notice.
@number6 @johncarlosbaez
Trust in the institutional setup, just like arXiv itself

@johncarlosbaez

Money quote: "the S3 transfer came to about 900 Euros"

@johncarlosbaez As a Leibniz University alum, this makes me insanely proud.

It's immeasurably sad though that it had to come to this.

(darXiv is such a fitting name!)