Incredible essay about the importance and challenges of digital archival by Maxwell Neely-Cohen, as well as the various imperfect strategies to achieve “century-scale” digital archives.

https://lil.law.harvard.edu/century-scale-storage/

"We picked a century scale because most physical objects can survive 100 years in good care. It is attainable, and yet we selected it because the design of mainstream digital storage mediums are nowhere close to even considering this mark."

1/

#archival

Century-Scale Storage

If you had to store something for 100 years, how would you do it?

"The current web pages and marketing for Microsoft Azure and Google Cloud do not mention cultural or historical preservation at any point. ... At this precise moment all of these services mention AI (a lot) and how it’s going to change everything. ... Two years ago their marketing materials mentioned web3 and the metaverse (a lot) and how it was going to change everything, and how if your business did not adapt you were going to be left behind—yet those sentiments no longer appear."

2/

"The Jack Welch school of shareholder supremacy is completely incompatible with the sorts of values that would ensure a cloud storage provider would reliably exist for a century."

3/

"The progeny of [early internet filesharing] platforms still exist, and in some cases, thrive, though they are no longer a dominant means of distributing media. Sci-Hub, Library Genesis, and Z-Library offer academic journal articles for free to anyone who wants to download them, flouting intellectual property laws and invoking the right to science and culture under Article 27 of the Universal Declaration of Human Rights." 4/
"It’s worth considering the efficacy of piracy and the intentional breaking of intellectual property law as a long-term preservation tactic. Abigail De Kosnik, a professor in the Berkeley Center for New Media, contends that, given the nature of digital cultural output and the failures of the current corporate and institutional orders to properly care for them, ..." 5/
"... piracy-based media preservation efforts are more likely to survive catastrophic future events than traditional institutions. On the other hand, as the notorious prosecution of Aaron Swartz or the legal cases against the Internet Archive demonstrate, engaging in copyright infringement at scale runs the constant risk of sanction and shutdown from state actors." 6/

On blockchain-based file storage projects:

"If providing storage generates revenue, that revenue will centralize because it is incentivized to centralize, just like other supposedly decentralized offerings in an unregulated market context. The untested legal status of these systems also poses potential problems. ..." 7/

"... None of these schemes have so far proven that they can function, let alone thrive, as functional viable marketplaces for a sustained period of time, nor that they can reliably incentivize storage in times of strife or scarcity. ... To directly peg an archival storage method to a market system with stakeholders that feed on volatility is equivalent to burying your hard drives in a 100-year flood zone." 8/
"If your goal in century-scale storage is avoiding kinetic, Hollywood-ready catastrophes, then decentralized solutions are ideal, but whether they can combat neglect is less clear. If a decentralized scheme wants to be successful at century scale, this is what they should and must attack. One of the few clear benefits of centralization is that it inspires care. If people know something is important, of value, potentially even the last of something, they tend to fight every day to protect it." 9/
"What is consistent about these examples is that they all involve groups who care. The most enduring decentralized efforts don’t owe their success to technological or organizational innovation, but rather by having enlisted generations of people with an emotional and intellectual investment in their worth. For both cloud storage services and distributed storage schemes, the question is whether they can provoke the necessary level of passion and watchfulness." 10/
"Are they and their technologies empowering those who care, or setting them up to fail? Can cloud storage corporations transform themselves into wardens? Can distributed storage systems turn each node into a guardian?" 11/
"The librarians and archivists of the world have been tackling the challenges of digital preservation for decades—the issue is that no one else is. The real solution to century-scale storage, especially at scale, is to change this reality. Successful century-scale storage will require a massive investment in digital preservation, a societal commitment. Politicians, governments, companies, and investors will have to be convinced, incentivized, or even bullied." 12/
"Every time a media company destroys an archive, every time a video game company prosecutes the preservers of content it has abandoned, every time a tech company kills a well-used product with no plan for preservation, these actions should be met with attention and resistance. We are on the brink of a dark age, or have already entered one." 13/
"The scale of art, music, and literature being lost each day as the World Wide Web shifts and degenerates represents the biggest loss of human cultural production since World War II. My generation was continuously warned by teachers, parents, and authority figures that we should be careful online because the internet is written in ink, and yet it turned out to be the exact opposite. As writer and researcher Kevin T. Baker remarked, 'On the internet, Alexandria burns daily.'" 14/
"In order to survive, a data storer, and the makers of the tools they use, must be prepared to adopt a skeptical and even defiant attitude toward the societies in which they live. They must accept the protection of a patron while also preparing for the possibility of betrayal." 15/
"If you’re wondering why much of this essay takes such an antagonistic pose toward external political and economic actors, while also considering the fruits of their offerings, it is because the century-scale archivist must sometimes be in service of an ideology that only answers to itself—to the protection of the collected artifacts at all costs." 16/
"This ideology, an 'Archivism,' entails a belief in the preservation of that which we make and think for future generations, at the expense of anything else. Century-scale storage can span methods and platforms, be enabled by governments and titans of industry, be helped by religions, cultures, artists, scenes, fans, collectors, technocrats, and engineers, but it must, at the end of the day, retain its values internally." 17/
"This is where, once again, the only true solution is an aggressive and massive investment in archives, libraries, digital preservationists, and software and hardware maintainers at every level, in every form of practice and economic circumstance. This needs to happen not just for states, corporations, and institutions, but for hobbyists and consumers." 18/
"The goal of century-scale storage must be to preserve that which we have created so that others, those we will never meet, may experience their intricacies and ecstasies, their capacities for enlightenment. This should be done by whatever means necessary, whatever method or decision ensures the possibility of that future—one day at a time,—and be willing to change at any moment, to scrap and claw against the forces attempting to smother the light." 19/19
@molly0xfff I mean, I use Zip disks. That's safe, right?

@d_j_fitzgerald @molly0xfff

I think magnetic storage media are doomed long term. The earth's magnetic field, the air temperature, and the slow chemical decay of the plastic disks themselves will eventually demagnetize them. And solid state / flash drives rely on static electricity that slowly discharges.

I am thinking of getting some "M-Discs," which are supposedly "archival quality" Blu-rays. Does anyone know if THOSE are safe? Just for my own personal files?

https://www.pcworld.com/article/2015499/storing-data-long-term-m-disc-best-method.html

Ultimate backup: Archival M-Discs store your data for 1000 years

Long-term storage of digital data is important, but also challenging. Archival CDs such as M-Discs offer a good solution with high durability.

PCWorld
@d_j_fitzgerald @molly0xfff OTOH, retrocomputing as a hobby is a new wellspring of software preservation, something the article repeatedly hammers on as critical for getting digital information to last a long time.
@molly0xfff Interesting read, thanks. Another aspect in addition to the physical layer here is the encoding or formatting. I can easily read TeX and LaTeX from the 1980s, but not the Word 1 or Word 2 documents, nor (easily) the DVI files. RTF usually works. I am afraid my brilliant 1990-era checkbook Excel spreadsheets, where I could click and unclick checks till I matched the bank statement, are just meaningless blobs now (maybe LibreOffice can read them?). Not that I use checks anymore either.
@molly0xfff Also, this is asking for a sci-fi book where the Archivists have to battle continually against the bands of Extractors that try to create IP monopolies based on the old culture.

@molly0xfff it seems the rule "whoever owns the information, owns the world" also works the other direction: "whoever owns the world, owns the information".

Regarding Amendments and Human Rights: all that is a thing of the past. The 24th of Feb 2022 proved that there is no more UN Charter, no more International Laws, no more Sheriff who guards the Great Peace established in 1945. Force rules the world, not Law.

The major powers in the world no longer refer to internationally agreed borders when they talk about the peace deal. Thus, Law is the product of Force.

It is a shame to be a law-abiding citizen in WW3, when Force is reshaping the world.

Remember the words a Nazi shouted at Prof. Viktor Frankl in Auschwitz? He shouted "Bullshit!" after the professor tried to explain to the Nazis that he had the manuscript of an important book with him.

Fahrenheit 451 is real in 2024. The first paper to be burnt? The Budapest Memorandum.

@molly0xfff I've recently started switching cold storage stuff to M-Discs. I still have online and offsite copies, but these are kept in a fire-resistant safe too.
@molly0xfff Sounds like Dolores Umbridge?

@molly0xfff "by whatever means": "Funny" that when Archive.org does that with old 78 rpm vinyls it's called "robbery" – but when OpenAI does it with contemporary & copyrighted material it's fine as it's a requirement for their business model…

Note: it's only a valid business model if some rich a**hole is making profit from it. A not-for-profit thus is not allowed to

Yes, we should be allowed to preserve – even if (or especially when) no money-maker gets money-profit from it, it's human profit.

@molly0xfff Thread for you @hex (this is the middle, but I liked the quotation; I'm sure you'll manage) 🙂
@denny This is an extremely good article, thank you very much!
@molly0xfff From a former peer-to-peer protocols researcher: this sounds to me like a call for all archivers to gather unique hashes of the contents they hold today, and to adopt a UUID scheme that would allow a-posteriori identification of contents on our drives as being part of a lost archive after the archive is lost.
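A rough sketch of what that hash-gathering step could look like, in Python. This is only an illustration under assumptions that are not in the reply: it uses the SHA-256 digest of each file as the identifier (instead of a separate UUID scheme), and the function names `build_manifest` and `find_survivors` are hypothetical. How a lost archive's hash list would be published and trusted is left open.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, list[str]]:
    """Map each content hash to the relative paths under root holding that content."""
    manifest: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            manifest.setdefault(sha256_of_file(path), []).append(relative)
    return manifest


def find_survivors(
    local: dict[str, list[str]], lost_archive_hashes: set[str]
) -> dict[str, list[str]]:
    """Return local entries whose hash appears in a lost archive's hash list.

    Assumption: the lost archive published its hashes before it disappeared.
    """
    return {h: paths for h, paths in local.items() if h in lost_archive_hashes}
```

Usage would be something like `find_survivors(build_manifest(Path("/my/drive")), published_hashes)`, where `published_hashes` is a set of hex digests obtained from wherever the archive listed them. Because the identifier is derived from the bytes themselves, two people holding the same file arrive at the same ID without coordinating, which is the property the a-posteriori matching depends on.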

@molly0xfff every time they do this, a tech bro angel investor gets his wings?

No, wait, I might be confusing things again.

@molly0xfff Hi Molly, I enjoyed all 19 parts of your segment. The part about 'politicians feeling bullied' remains a waste process of Party Whips grooming. "?Strong politicians, ie senior party politicians w/ such a stock position in market that congress being so bad at regulation is a reason no one be reelected. They have companies in lockjaws, fail even then to bring market back to ethics. stock market rich congress is self damning of their poor performance to regulate commerce.". MBenchB
@molly0xfff Apologies if you've referred to this (I've only seen the thread reference to libgen and other older systems) but if not, highly relevant: https://annas-archive.org/blog/critical-window.html
The critical window of shadow libraries

How can we claim to preserve our collections in perpetuity, when they are already approaching 1 PB?