#Digital ⚓️ #Vagabond 🦈

@beet_keeper

When you can’t pay for things the currency of payment is psychic…

by @beet_keeper

Contributing back to the commons in digital preservation hasn’t been for everyone.

We know the famous XKCD that touches on the underappreciated work of maintainers in obscurity. When you, your institutions, or your services use free and open source software, or other information and data in the commons, without contributing back, you’re perpetuating this. What’s more, there’s a virtuous cycle that we’re missing out on.

I read something the other day and it felt like a red flag.



#community #contributing #CreativeCommons #digipres #digitalLiteracy #DigitalPreservation #dpip #Ecosystem #empathy #empathyInTechnology #FOSS #Infrastructure #OpenData #OpenSource #professionalMaturity

The Painter Goblin: Becoming Corporeal


by @beet_keeper

When you move country you have to be prepared to change quite a lot about your life. Back at the end of 2020, apart from literally everything else going on, my partner and I also moved from Canada to Germany.

For me, this was my fifth or so international move (including shorter temporary stays) in as many years.

Being able to pick up sticks and move like that means living a drastically minimized life. Most of the things you have fit in a suitcase. Most of the things you have are small, and largely not overly whimsical. Sure, you can fit a few treasures into your bag, but you learn to value small ones, not things you might otherwise use to decorate an entire apartment!! 

So, what do you do when you do have an apartment to decorate?

You ask the best-known painter in your family to conjure some magic: The Painter Goblin!


#art #coding #creativecommons #cybercats #daumier #digital #digitalArt #digitalHumanities #digitalart #digitization #frozen #gameboy #glam #keithharing #letterWriting #lollyrocket #magentaconverse #marieantoinette #materialCulture #openData #painterGoblin #paintergoblin #postcard #postcards #print #remixArt #remixart #twitterBot #vagabond #wikidata

Revisiting bsdiff as a tool for digital preservation


by @beet_keeper

I introduced bsdiff in a blog post in 2014. bsdiff compares two files, e.g. broken_file_a and corrected_file_b, and creates a patch that can be applied to broken_file_a to generate a byte-for-byte match for corrected_file_b.

On the face of it, in an archive, we probably only care about corrected_file_b, so why would we care about a technology that patches a broken file?

In all of the use-cases we can imagine, the primary reasons are cost savings and removing redundancy in file storage or in the transmission of digital information. In one very special case we can record the difference between broken_file_a and corrected_file_b and give users a totally objective method of recreating corrected_file_b from broken_file_a, providing 100% verifiable proof of the migration pathway taken between the two files.
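The patch-then-verify idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard library's difflib to build a simple copy/insert patch, which is not bsdiff's algorithm or patch format, and the file contents and the helper names make_patch and apply_patch are hypothetical.

```python
import hashlib
from difflib import SequenceMatcher


def make_patch(old: bytes, new: bytes) -> list:
    """Describe how to build `new` from `old` as copy/insert steps."""
    patch = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes shared by both files are referenced, not stored again.
            patch.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Only bytes that exist solely in the new file are stored.
            patch.append(("insert", new[j1:j2]))
        # "delete" needs no step: those old bytes are simply never copied.
    return patch


def apply_patch(old: bytes, patch: list) -> bytes:
    """Rebuild the new file from the old file plus the patch."""
    out = bytearray()
    for step in patch:
        if step[0] == "copy":
            out += old[step[1]:step[2]]
        else:
            out += step[1]
    return bytes(out)


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Hypothetical file contents standing in for the two files in the text.
broken_file_a = b"header:OK;payload:\x00\x00\xff\xff\xff;trailer:OK"
corrected_file_b = b"header:OK;payload:\x00\x00\x01\x02;trailer:OK"

patch = make_patch(broken_file_a, corrected_file_b)
rebuilt = apply_patch(broken_file_a, patch)

# Matching checksums are the verifiable proof that the patch recreates
# corrected_file_b from broken_file_a byte for byte.
checksums_match = sha256(rebuilt) == sha256(corrected_file_b)
print(checksums_match)
```

Keeping broken_file_a, the patch, and the checksum of corrected_file_b is enough for anyone to repeat the transformation and confirm the result independently.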


#ac3 #archives #audio #audiovisual #audit #authenticity #av #bash #bsdiff #checksums #code4lib #corruption #corruptionIndex #digipres #digitalArchiving #digitalForensics #digitalLiteracy #digitalPreservation #digitalStorage #diplomatics #fileFormats #glitch #glitchAudio #glitchart #integrity #preservationAnalysis #preservationMetadata #provenance #sensitivityIndex #storage

Turning NASA Wake-up Calls into data

by @beet_keeper

For a while back then I was into space flight again. Scientists, science communicators, and engineers were all excited for a new era of rocket launches and the potential unification of the human race as we look towards the future.

During that time I discovered Colin Fries’ work in the NASA History Division to document all NASA “Wake-up calls”. A wake-up call is simply a piece of music used to wake astronauts on missions, a different piece of music, daily, for the duration of the flight.

Take, for example, the last Space Shuttle mission (Space Transportation System) STS-135; it was in flight for 13 days, and the wake-up call on day one was Coldplay’s Viva la Vida, while on day 13 it was Kate Smith singing God Bless America.

As a huge music buff who has the radio or music television on 18 hours a day, I really wanted to delve into this further. While Colin’s work is great, it’s just a PDF file (@wtfpdf). A PDF is not an ideal file format for querying data and gleaning new insights. So, while I wanted to explore it, I first decided to turn it into a true dataset. The result was a set of resources: a website, a JSON file, a CSV file, and an SQLite database, each of which is more functional and more maintainable over time.
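To give a flavour of why these formats are easier to query than a PDF, here is a minimal sketch using only Python's standard library. The two rows come from the STS-135 example above; the field names, the table name wakeup_calls, and the layout are assumptions made for illustration and are not the schema of the published dataset.

```python
import csv
import io
import json
import sqlite3

# Two rows from the STS-135 example; the field names are hypothetical.
records = [
    {"mission": "STS-135", "flight_day": 1,
     "artist": "Coldplay", "song": "Viva la Vida"},
    {"mission": "STS-135", "flight_day": 13,
     "artist": "Kate Smith", "song": "God Bless America"},
]

# JSON: one call serialises the whole list.
as_json = json.dumps(records, indent=2)

# CSV: written to an in-memory buffer so the sketch touches no files.
csv_buffer = io.StringIO()
writer = csv.DictWriter(csv_buffer, fieldnames=list(records[0]))
writer.writeheader()
writer.writerows(records)
as_csv = csv_buffer.getvalue()

# SQLite: an in-memory database that can be queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wakeup_calls "
    "(mission TEXT, flight_day INTEGER, artist TEXT, song TEXT)"
)
conn.executemany(
    "INSERT INTO wakeup_calls VALUES (:mission, :flight_day, :artist, :song)",
    records,
)

# Which song closed out the mission?
last_song = conn.execute(
    "SELECT artist, song FROM wakeup_calls "
    "WHERE mission = ? ORDER BY flight_day DESC LIMIT 1",
    ("STS-135",),
).fetchone()
print(last_song)
```

The same question asked of a PDF would mean reading the document by eye or scraping its text first.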

Let’s take a look at the results and https://nasawakeupcalls.github.io below!



#ApacheTika #Code #Coding #DataWrangling #Datasette #DatasetteLite #DH #DigitalHumanities #GLAM #harkive #NASA #NASAWakeUpCall #NASAWakeUpCalls #OpenData #PersonalProjects #Science #Space #SpaceHistory #Twitter #WakeUpCall

Looking after your URLs: tikalinkextract eight years on


by @beet_keeper

We might not have a second life, but what if I told you there was a second internet? Not the deep web, but another web that we engage with nearly every day?

Think about it, that QR code you scanned for more information? That payment link you followed on your electricity bill? The website you’re told to visit at the end of a television ad?

The antipodes of the internet are these terminal endpoints, material and not necessarily material objects that represent the end of the freely navigable web — the QR code on a concert poster is the web printed onto the physical world. There is every chance it will be scanned and followed by someone from a mobile device, but it’s a transient object, something that will exist for a short amount of time, and then disappear into the palimpsest of the poster board or wall it was pasted on until it eventually disappears.

This is part of the materiality of the internet that has long fascinated me. Perhaps it comes from being a student of material culture, but if we look around, we see the Internet everywhere!


#Archives #digipres #DigitalArchiving #digitalContinuity #DigitalPreservation #httpreserve #Memento #outreach #RobustLinks #RobustWebLinks #WebArchives #webArchiving

Declarative programming for Digital Preservationists @ NTTW8


by @beet_keeper

Just released on the No Time to Wait (NTTW) YouTube channel is my presentation from NTTW8 in Karlsruhe, Germany. (Slides also available here).

The presentation follows up on my proposal for iPRES 2024 and allowed me to present parts of what was, in the end, a pretty significant paper (in terms of word count).

Some of my reflections on the presentation are below.


#Code #Coding #Conferences #declarative #declarativeLanguages #declarativeProgramming #jsonid #KVAL #kvalAccessLanguage #NoTimeToWait #NTTW #NTTW8 #NTTW9 #Programming #programmingParadigms #software #SoftwareDevelopment #talks

2024-09-12-NTTW-Declarative

Declarative programming for digital preservationists NTTW8 September 2024 Ross Spencer


File formats as Emoji: 0xffae


by @beet_keeper

tldr: https://emoji.exponentialdecay.co.uk

File Formats As Emoji (0xFFAE or 0xffae) might be my most random file format hack yet. Indeed, it is a random page generator! But it generates random pages of file formats represented as Emoji.

The idea came in 2016 with radare releasing a new version that supported an emoji hexdump. I wondered whether I could do something fun combining file formats and the radare output to create a web-page.

Along came a spare moment one weekend, some pyscript, and a bit of SQLite, et voilà. File Formats as Emoji (0xFFAE) was made a reality.
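For anyone curious what an emoji hexdump might look like in code, here is a minimal sketch. The mapping is an assumption chosen for simplicity (each byte value is offset into a block of 256 pictographs starting at U+1F400); it is not radare's table, nor necessarily the one used by the site, and the function names are hypothetical.

```python
EMOJI_BASE = 0x1F400  # assumption: an arbitrary run of 256 pictograph code points


def byte_to_emoji(value: int) -> str:
    """Map one byte value (0-255) to a single emoji character."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a single byte value")
    return chr(EMOJI_BASE + value)


def emoji_dump(data: bytes, width: int = 8) -> str:
    """Render bytes as rows of emoji, prefixed with a hex offset."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        rows.append(f"{offset:08x}  " + " ".join(byte_to_emoji(b) for b in chunk))
    return "\n".join(rows)


def emoji_to_bytes(emoji: str) -> bytes:
    """Reverse the mapping for a space-separated run of emoji."""
    return bytes(ord(symbol) - EMOJI_BASE for symbol in emoji.split())


# The first four bytes of a PDF file are the ASCII characters "%PDF".
pdf_magic = b"%PDF"
print(emoji_dump(pdf_magic))
```

Because the mapping is one-to-one, the dump is reversible: emoji_to_bytes applied to the emoji portion of a row returns the original bytes.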


#0xffae #Code #Coding #digipres #digitalLiteracy #DigitalPreservation #emoji #FileFormat #FileFormatIdentification #FileFormats #learning #PRONOM #pyscript #Python #SkeletonTestCorpus #teaching

0xffae.ffdev.info

Returns a random file-format represented in emoji

File format building blocks: primitives in digital preservation


by @beet_keeper

A primitive in software development can be described as:

a fundamental data type or code that can be used to build more complex software programs or interfaces.

– via https://www.capterra.com/glossary/primitive/ (also Wiki: language primitives)

Like bricks and mortar in the building industry, or oil and acrylic for a painter, a primitive helps a software developer to create increasingly more complex software, from your shell scripts, to entire digital preservation systems.

Primitives also help us to create file formats, as we’ve seen with the Eyeglass example I have presented previously. The file format is, at its most fundamental level, a representation of a data structure as a binary stream that can be written out of the data structure onto disk, and likewise read from disk back into a data structure in code.

As file format developers, we have at our disposal all of the primitives that the software developer has, and, like them, we also have “file formats” (as we tend to understand them in digital preservation terms) that serve as our primitives as well.
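The round trip between a data structure and a binary stream can be sketched with Python's struct module. The format below is entirely hypothetical (a four-byte magic number DEMO followed by three integer fields) and is not the Eyeglass format; it only shows primitives such as fixed-width integers being composed into a file header.

```python
import struct
from dataclasses import dataclass

MAGIC = b"DEMO"  # hypothetical magic number for an imaginary format

# Little-endian layout built from primitives:
# 4-byte magic, uint8 version, uint16 width, uint32 record count.
HEADER = struct.Struct("<4sBHI")


@dataclass
class Header:
    version: int
    width: int
    count: int

    def to_bytes(self) -> bytes:
        """Data structure -> binary stream (what would be written to disk)."""
        return HEADER.pack(MAGIC, self.version, self.width, self.count)

    @classmethod
    def from_bytes(cls, raw: bytes) -> "Header":
        """Binary stream -> data structure (what would be read from disk)."""
        magic, version, width, count = HEADER.unpack(raw[:HEADER.size])
        if magic != MAGIC:
            raise ValueError("magic number does not match the DEMO format")
        return cls(version, width, count)


original = Header(version=1, width=640, count=3)
stream = original.to_bytes()
restored = Header.from_bytes(stream)
print(stream.hex(), restored == original)
```

The magic number doubles as the hook a format identification tool would use to recognise the file.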


#archives #digipres #digitalPreservation #digitalPreservationEssentialism #diplomatics #eyeglass #eygl #fileFormats #informationRecordsManagement #irm #json #jsonid #openData #openSource #rdm #researchData #researchDataManagement #xml

@beet_keeper and Andrea K. Byrne present "Introduction to Digital Preservation is People" at #nttw8 at ZKM in Karlsruhe: https://youtu.be/eZ3rQ3SRtm0
No Time To Wait - S08E19 - Day 1 - Digital Preservation is People - Ross Spencer, Andrea K. Byrne


Informed consent: considering steganographic techniques to fingerprint Generative AI output


by @beet_keeper

Artificial intelligence (AI) is a polarizing topic. For every reasoned assessment of the technology and its potential to make some of our smaller, onerous, or more repetitive tasks easier, there are probably 100 reactive pieces predicting some radical overhaul of societal norms, from the service industry receiving new intakes of out-of-work software developers to laypeople taking on roles traditionally occupied by those with a college education, if they just start asking their AI the right questions ¯\_(ツ)_/¯

The amount of AI-propaganda is draining, and the reaction is often spread across the board too: some cheerleading, some decrying, plenty taking their time to offer skilled and nuanced rebuttals, or suggestions for improvements.

I find myself largely trying to stay out of the conversations. Much like the blockchain conversations of 8 years ago, it will take another half decade for the hype-cycle to plateau and for us to see where it can truly complement our work.

One part of the conversation that is increasingly hard to ignore is being informed about when AI has been used in the generation of text or images. It is the property of knowing, or having the tools to know, that I feel is the most important.

How can we be better informed about when AI is used, so that we are better prepared, as consumers, to receive and understand content?

In this blog I want to explore the potential for steganographic techniques to be used in the output of AI to fingerprint content and provide a way for front-end mechanisms to identify it, much as we identify file formats using magic numbers, so that users can be given the chance of informed consent: the opportunity to opt in to, or out of, engaging with AI content.
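As a rough illustration of the idea, the sketch below hides a short marker at the start of a piece of text using invisible zero-width characters, and a front-end style check then looks for it in the way a format identification tool looks for a magic number. Everything here is an assumption made for the example: the marker value, the choice of characters, and the function names. It is not C2PA, nor any vendor's watermarking scheme, and a marker like this is trivially stripped.

```python
from typing import Optional

ZERO = "\u200b"  # zero-width space encodes a 0 bit
ONE = "\u200c"   # zero-width non-joiner encodes a 1 bit
MARKER = b"AI"   # hypothetical fingerprint, playing the role of a magic number


def embed(text: str, marker: bytes = MARKER) -> str:
    """Prefix the text with the marker encoded as invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in marker)
    hidden = "".join(ONE if bit == "1" else ZERO for bit in bits)
    return hidden + text


def extract(text: str, length: int = len(MARKER)) -> Optional[bytes]:
    """Decode `length` hidden bytes from the start of the text, if present."""
    prefix = text[:length * 8]
    if len(prefix) < length * 8 or any(ch not in (ZERO, ONE) for ch in prefix):
        return None
    bits = "".join("1" if ch == ONE else "0" for ch in prefix)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def is_fingerprinted(text: str) -> bool:
    """The check a front end could run before deciding how to present content."""
    return extract(text) == MARKER


generated = embed("A paragraph produced by a model.")
print(is_fingerprinted(generated), is_fingerprinted("A paragraph written by a person."))
```

A front end that finds the marker could then label the content or offer the reader the choice of whether to view it.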


#ai #aiEthics #artificialIntelligence #c2pa #cyberSecurity #digitalFingerrprint #digitalObfuscation #euRegulation20241689 #genai #generativeai #imatag #journalism #llm #machineLearning #misinformation #onlineSafety #steganography #watermark #whistleblower