"…could we use a COG as a shard [inside Zarr]?!?
It turns out…we can! Check out the linked notebook, where we show exactly this."
https://element84.com/software-engineering/is-zarr-the-new-cog/
University of Michigan: U-M develops free tool to empower municipalities, modernize financial reporting. “When Congress passed the Financial Data and Transparency Act in 2022, it required most municipalities in the U.S. to modernize and digitize their financial reports. This is a heavy lift for small towns and school districts, most of which still report their financial information in PDF […]
Hilarious #SMBC on the #plural of #files:
https://www.smbc-comics.com/comic/plural-2
File formats as Emoji: 0xffae
by @beet_keeper
tldr: https://emoji.exponentialdecay.co.uk
File Formats As Emoji (0xFFAE or 0xffae) might be my most random file format hack yet. Indeed, it is a random page generator! But it generates random pages of file formats represented as Emoji.
The idea came in 2016 with radare releasing a new version that supported an emoji hexdump. I wondered whether I could do something fun combining file formats and the radare output to create a web-page.
Along came a spare moment one weekend, some pyscript, and a bit of sqlite, et voilà. File Formats as Emoji (0xFFAE) was made a reality.
Continue reading “File formats as Emoji: 0xffae”…
#0xffae #Code #Coding #digipres #digitalLiteracy #DigitalPreservation #emoji #FileFormat #FileFormatIdentification #FileFormats #learning #PRONOM #pyscript #Python #SkeletonTestCorpus #teaching
@wyatt @kawa Until archive format overhead becomes the limiting factor for size.
At one point in early 2002, I briefly considered using the UNIX tape archive (tar) format to bundle assets in a Game Boy Advance game. Many of these assets were 2048 bytes or smaller. I looked up the spec for a POSIX tar file, and it involved rounding up each file's size to a multiple of the 512-byte block size and adding a 512-byte block header. That kind of overhead adds up. On top of that, searching for a particular file in a tarball takes linear time, not constant or even logarithmic time.
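The overhead described above can be put into numbers with a small sketch. It assumes the plain POSIX layout mentioned in the post (one 512-byte header block per file, data padded up to a multiple of 512 bytes) plus the two zero-filled blocks that mark the end of an archive. It ignores long-name extension headers and the record-level padding that real tar tools often add. The function names are illustrative, not part of any tar library.

```python
BLOCK = 512  # tar block size in bytes


def tar_member_size(payload_bytes: int) -> int:
    """Bytes one regular file occupies: a header block plus block-padded data."""
    padded = -(-payload_bytes // BLOCK) * BLOCK  # round up to a multiple of BLOCK
    return BLOCK + padded


def tar_archive_size(payload_sizes) -> int:
    """Total size of a minimal archive, including the two-block end marker."""
    return sum(tar_member_size(n) for n in payload_sizes) + 2 * BLOCK


def overhead_ratio(payload_sizes) -> float:
    """Archive bytes per byte of actual payload (1.0 would mean no overhead)."""
    sizes = list(payload_sizes)
    return tar_archive_size(sizes) / sum(sizes)


if __name__ == "__main__":
    # A 2048-byte asset costs 2560 bytes; a 100-byte asset costs 1024 bytes.
    for size in (2048, 100):
        print(size, "->", tar_member_size(size))
```

Under these assumptions a 2048-byte asset grows by a quarter, and anything much smaller than a block is dominated by the header and padding.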
That led me to devise and document a simpler, more fit-for-purpose archive format called GBFS. Other specialized archive formats may benefit from packing files so as to avoid crossing 16K, 32K, 64K, or 128K block boundaries in the medium.
File format building blocks: primitives in digital preservation
by @beet_keeper
A primitive in software development can be described as:
a fundamental data type or code that can be used to build more complex software programs or interfaces.
– via https://www.capterra.com/glossary/primitive/ (also Wiki: language primitives)
Like bricks and mortar in the building industry, or oil and acrylic for a painter, a primitive helps a software developer to create increasingly more complex software, from your shell scripts, to entire digital preservation systems.
Primitives also help us to create file formats. As we’ve seen with the Eyeglass example I have presented previously, the file format is, at its most fundamental level, a representation of a data structure as a binary stream: one that can be written out of the data structure onto disk and, likewise, read from disk back into a data structure in code.
For the file format developer we have at our disposal all of the primitives that the software developer has, and like them, we also have “file formats” (as we tend to understand them in digital preservation terms) that serve as our primitives as well.
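As a minimal sketch of the idea that a file format is a data structure serialised to a binary stream and back, the following uses Python's `struct` module. The layout here (a 4-byte magic, a 16-bit version, a 32-bit record count, little-endian) and the names `Header` and `DEMO` are invented for illustration; this is not the Eyeglass format.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: 4-byte magic, uint16 version, uint32 record count.
# "<" means little-endian with no alignment padding, so the size is 10 bytes.
HEADER = struct.Struct("<4sHI")


@dataclass
class Header:
    magic: bytes
    version: int
    record_count: int

    def to_bytes(self) -> bytes:
        """Data structure -> binary stream (what would be written to disk)."""
        return HEADER.pack(self.magic, self.version, self.record_count)

    @classmethod
    def from_bytes(cls, data: bytes) -> "Header":
        """Binary stream -> data structure (what would be read from disk)."""
        return cls(*HEADER.unpack(data[:HEADER.size]))


if __name__ == "__main__":
    raw = Header(b"DEMO", 1, 3).to_bytes()
    print(raw.hex(" "))
    print(Header.from_bytes(raw))
```

The primitives in play are fixed-width integers and a byte string; the "format" is only the agreed order, width, and byte order in which they appear.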
Continue reading “File format building blocks: primitives in digital preservation”…
#Archives #digipres #DigitalPreservation #DigitalPreservationEssentialism #diplomatics #eyeglass #eygl #FileFormats #InformationRecordsManagement #IRM #JSON #OpenData #OpenSource #RDM #ResearchData #ResearchDataManagement #XML
Huzzah! A small personal 15-year-old goal achieved -- finally got a service to update automatically from the PRONOM release notes we first minted at TNA back in 2010 with signature file V31. (Rather than something going wrong and my needing to step in by hand. We only trigger a download and update on a new release, and it finally worked!)
Today's release V120 reflected live on the ffdev.info PRONOM dashboard and API.