I'm getting pretty close to a point where I feel I would not totally embarrass myself when publishing my first Rust code 😅
I already added some basic integration tests, removed more hardcoded stuff, etc. But I still need to rebuild a "test backup" for testing, collect my notes on things that are missing, write at least a little bit of a README, etc.
I'll push something to Codeberg in the next couple of days 🤞
Got distracted when I tested a slightly older backup created with Arq 6.2.54 (current major version is 7). The Arq 5 and 7 formats are "documented", the Arq 6 format is not. At first I thought this can't possibly work, but the extra undocumented byte I found in a metadata object in Arq 7 is not used in Arq 6. Other than that everything seems to work just fine.
Your backup repository can contain a mix of Arq 5, 6 and 7 data. Arq 7 simply reuses data created by older versions.
I just realized that I can hash the restored files instead of writing them to disk. This way I can easily build a test mode which writes a text file with all files that would be restored with their respective hashes and compare that against a golden file. Could do the same with other file metadata too… 🤔
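A minimal sketch of that idea, not the actual code: a `Write` sink that hashes whatever the restore writes into it, plus a helper that turns the results into a sorted manifest. The names `HashingWriter`, `build_manifest` and `matches_golden` and the line format are made up for illustration, and FNV-1a is only used to keep the sketch dependency-free. A real test mode would more likely use a cryptographic hash such as SHA-256.

```rust
use std::io::{self, Write};

/// Sink that hashes everything written to it instead of storing it.
/// Assumption: FNV-1a (64-bit) stands in for a real cryptographic hash.
pub struct HashingWriter {
    state: u64,
    bytes: u64,
}

impl HashingWriter {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;

    pub fn new() -> Self {
        Self { state: Self::OFFSET_BASIS, bytes: 0 }
    }

    /// One manifest line in the (hypothetical) format "<hex hash>  <size>  <path>".
    pub fn manifest_line(&self, path: &str) -> String {
        format!("{:016x}  {}  {}", self.state, self.bytes, path)
    }
}

impl Write for HashingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // FNV-1a: XOR each byte into the state, then multiply by the prime.
        for &b in buf {
            self.state ^= u64::from(b);
            self.state = self.state.wrapping_mul(Self::PRIME);
        }
        self.bytes += buf.len() as u64;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Builds a manifest, sorted by path, from (path, content) pairs.
/// In a real restore the content would be streamed, not held in memory.
pub fn build_manifest(files: &[(&str, &[u8])]) -> io::Result<String> {
    let mut sorted = files.to_vec();
    sorted.sort_by(|a, b| a.0.cmp(b.0));
    let mut lines = Vec::with_capacity(sorted.len());
    for (path, data) in sorted {
        let mut sink = HashingWriter::new();
        sink.write_all(data)?;
        lines.push(sink.manifest_line(path));
    }
    Ok(lines.join("\n"))
}

/// Compares a fresh manifest against a golden file's content,
/// ignoring trailing whitespace such as a final newline.
pub fn matches_golden(manifest: &str, golden: &str) -> bool {
    manifest.trim_end() == golden.trim_end()
}
```

Because the sink implements `Write`, the same restore code path could write either to a real file or to the hasher, which is what makes the test mode cheap.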
Yesterday I mostly did all kinds of refactorings, which was VERY rewarding! Most unwraps are gone, cleaned up a TON of “&str, String and PathBuf”-conversion mess and other unnecessarily complicated stuff.
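One pattern that tends to help with this kind of conversion mess (a sketch, not necessarily what I ended up with): accept `impl AsRef<Path>` at function boundaries, so callers can pass `&str`, `String`, `&Path` or `PathBuf` without converting first. The helper name `object_path` and the `objects/<id>` layout are invented for the example and are not the real repository layout.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: location of an object inside a repository.
/// Assumption: the "objects/<id>" layout is made up for this sketch.
pub fn object_path(repo_root: impl AsRef<Path>, object_id: &str) -> PathBuf {
    // as_ref() gives a &Path regardless of what the caller passed in.
    repo_root.as_ref().join("objects").join(object_id)
}
```

Called as `object_path("/backup", "abc")` or `object_path(some_path_buf, "abc")`, both produce the same `PathBuf`.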
I also caught myself trying to make sure I don't accidentally mutate something… only to eventually realize that the compiler won't let me unless I say so 😅
Slightly better error reporting to the user is next on my list, before I want to look at some basic integration tests.
Oh, one more thing: There is plenty of metadata stored for files (like owner, group, permissions, extended attributes, …), but no checksums whatsoever 🙄 This means that when a large or slow restore gets interrupted, there is no good way to resume and safely skip files that have provably been restored correctly.
That's not a good design IMO. And let me tell you, the oddities don't end there 😅
Today I did some testing using a read-only mount of one of my real backups using “rclone serve nfs” so I don't have to implement an SFTP client. This works quite well so far.
Also: I'm quickly learning that quite a few of my assumptions about the semantics of certain repository metadata objects are incorrect. As already mentioned, the docs are quite spotty to begin with, so this is basically reverse engineering now to figure out how this is meant to be used 🙄
I also reduced a good amount of “unwraps”, but I'm having a hard time figuring out what a good way is to handle various errors and how to structure the code accordingly.
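For reference, one common way to structure this (a sketch under assumptions, not the project's actual code): a single error enum with `From` impls, so `?` can replace `unwrap()` and the caller decides how to report the failure. The `RestoreError` type and the `name:length` record format are invented purely for illustration.

```rust
use std::fmt;
use std::io;
use std::num::ParseIntError;

/// Hypothetical error type collecting the failure modes of this sketch.
#[derive(Debug)]
pub enum RestoreError {
    Io(io::Error),
    BadLength(ParseIntError),
    Malformed(String),
}

impl fmt::Display for RestoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RestoreError::Io(e) => write!(f, "I/O error: {e}"),
            RestoreError::BadLength(e) => write!(f, "invalid length: {e}"),
            RestoreError::Malformed(line) => write!(f, "malformed record: {line:?}"),
        }
    }
}

impl std::error::Error for RestoreError {}

// The From impls are what allow `?` to convert errors automatically.
impl From<io::Error> for RestoreError {
    fn from(e: io::Error) -> Self {
        RestoreError::Io(e)
    }
}

impl From<ParseIntError> for RestoreError {
    fn from(e: ParseIntError) -> Self {
        RestoreError::BadLength(e)
    }
}

/// Parses a made-up "name:length" record without any unwrap().
pub fn parse_record(line: &str) -> Result<(String, u64), RestoreError> {
    let (name, len) = line
        .split_once(':')
        .ok_or_else(|| RestoreError::Malformed(line.to_string()))?;
    let len: u64 = len.trim().parse()?; // ParseIntError -> RestoreError
    Ok((name.to_string(), len))
}

/// Reads a record from a file; I/O errors are converted by `?` as well.
pub fn read_record(path: &str) -> Result<(String, u64), RestoreError> {
    let text = std::fs::read_to_string(path)?; // io::Error -> RestoreError
    parse_record(text.trim())
}
```

With this shape, only the top level (for example `main`) needs to turn a `RestoreError` into a message for the user.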
For the restore, I obviously have to do a ton of testing. But I'll probably first need to improve the ergonomics of the CLI arguments so I can test restoring subtrees of one of my larger "real" backups.
Today, I added an "inspect" CLI subcommand to simplify debugging. It's not super ergonomic for now, but it knows the types of files in a backup repository and also how to decrypt and decompress them for inspection. Some are just JSON, others are in a custom binary format 🙄. The binary ones are parsed and printed as JSON too.
More excitingly: A (very) minimal version of "restore" is also working now 🎉 Permissions and custom attributes are ignored (for now).
It doesn't look like much, but wow is my brain not wired at all for Rust (yet) 😅
I can now parse enough of all relevant backup objects to dump a list of directories and files within a backup 🚀 The backup location and "snapshot" are hardcoded for now, so I can delay exploring the "CLI argument parsing" rabbit hole.
Next step is a very simple "restore everything at a given location", ignoring file permissions, extended attributes etc.