Hey Unix history nerds:

Do we know who wrote tar(1)? Like, the original tar.c included in V7 Research Unix (which is the earliest one I can find). The list of potential culprits is pretty short, but there's no attribution I can find.

(If you have an answer, please clarify where you got it from, and reference any contemporaneous/primary sources if you've got them.)

Hmmm this site attempts to assign authorship to pre-version-control Unix sources, and credits tar to Charles B Haley:

https://www.spinellis.gr/pubs/jrnl/2016-EMPSE-unix-history/html/unix-history.html

However, this credit is described as coming from "primary research" and the sources are not cited. So far I haven't been able to locate a primary source. (Haley _is_ explicitly credited in several other areas of the V7 sources.)

@cliffle Chasing links from there, I found https://github.com/dspinellis/unix-history-make/blob/0c545137e10f8e8edcca3ae8d2dad54f2d251f8e/src/author-path/Research-V7#L26-L38 which (contra the paper - possibly new information since) quotes Doug McIlroy as saying that Ken Thompson was the original author.

@zwol hmmmm. While Doug may be correct there, the tar.c code does not (to me) read like Ken's hand.

That's not a guarantee, by any means, but it does mean I'm still searching just in case.

(Asking Ken is of course an option, but memory can be complex, so I'd love to find a contemporaneous source.)

@cliffle no answers, I'm afraid, just happy at the mention of seventh edition (my fav, for sentimental reasons). 😹
@cliffle @vmbrasseur if you’re looking to tell some stories about why you’re looking for this info, shoot me a DM. It would probably be a fun discussion on https://hackerhistory.com/
@cliffle Do you have the TUHS CD-ROM? It may be in the SCCS files there. Contact Warren Toomey ([email protected]), who will probably have the definitive answer.

@ChuckMcManis Interesting, I'll reach out -- but I'm pretty sure tar.c started life before sccs had escaped the PWB Unix branch into Research. So I'd be surprised if it had revision history from before the V7 release.

I would be delighted to be wrong about this!

@cliffle I do think tar started life earlier, just that when it was added to sccs the initial commit might have included "this is an archive utility written by ..."
@ChuckMcManis well that would be marvelous, let's find out

@ChuckMcManis @cliffle (I don’t have anything to add to the tar history, but…)

The JDK sources prior to #OpenJDK were maintained in Teamware/CodeMgr which used SCCS as a storage format. I occasionally have a hunt through them, using the Mac x64 port by the late Jörg Schilling, running in emulation on my Mac M4. All my old shell aliases still work.

#Java

@stuartmarks @cliffle Is "Teamware" what Avocet became? You can go back and see some of my old commits 😆
@ChuckMcManis @cliffle Yep, Avocet turned into Code Manager and then Teamware. Or Teamware was the product line, or something. I didn’t know you worked on Avocet. I knew Evan Adams, Claeton Giordano, and John Treacy.
@stuartmarks @cliffle I didn't work on Avocet, I was on the original Java team so some of my commits to the Java repo would be from me 🙂 I had switched to Avocet because it was part of the SunOS/SVR4 merge and I was doing ONC stuff, and then joined First Person and it was back to SunOS 4 and SCCS. But I did do the first JVM on Solaris 2.0 and got "Hello Hello Hello World World World" running. (Java + Green threads)

@ChuckMcManis @cliffle Oh you worked on early Java and not early Avocet! Ok maybe I’ll look for some of your changesets (erm, “deltas” in SCCS parlance) in the old JDK sources. ah green threads, how we’ve missed you… NOT.

BTW we are finally removing Thread.stop() for good.

@stuartmarks @cliffle on thread.stop that was my comment on the halting problem and making final really final 😀 we had so many discussions about that

@ChuckMcManis @cliffle Oh yes and the discussions continue! Well Thread.stop is gone. But final still doesn’t quite mean final, yet… see JEP 500. (Dunno if you care to follow this stuff.)

https://openjdk.org/jeps/500

@stuartmarks @cliffle Yeah. See, while I was part of the original Java group I wasn't a *language* guy, I was a *systems* guy. So I was worried about things like "How are we going to make it secure, exportable according to the NSA, AND remotely executing?" I was amused by the language debates (is Boolean a first class thing? How about unsigned ints? Are different sized floats distinct types? Etc.) 1/2
@stuartmarks @cliffle Thread.stop() and 'final' were a very long very deep debate about could you stop a thread AND guarantee final? Did it violate the language safety if a thread never stopped even if you told it to? (because it was hung) and if you create a deadlock by garbage collecting a dead thread who was holding state until a final clause could execute could you detect that? There were many worms in that can, many worms.
@cliffle @ChuckMcManis
TUHS email list seems to indicate Ken wrote the first cut of tar because he wasn’t a fan of cpio. He kept the tp/stp interface and there are other references to upgrade from V6 to V7 Unix using tar (see below). There was a v6tar.c that existed specifically to tar up V6 to restore on v7 (file systems were not compatible).

https://www.tuhs.org/pipermail/tuhs/2019-September/018452.html

Guide for moving to v7 from v6:
https://minnie.tuhs.org/PUPS/Setup/v7_setup.html

@cliffle all i know is i get yelled at for tarring one file even though im pretty sure a tape archive might very well just have one file and its Fine. i hate pedantry!!!

@anna @cliffle

There are all kinds of fun things you can do with tar, because of what its basic function is.

@cliffle As a last resort, you might try asking here: https://elists.isoc.org/mailman/listinfo/internet-history

It's not exactly an "Internet" history question, but it's probably close enough that I don't think you'll get a lot of complaints, and more importantly there's a good chance you'll get an answer.

@cliffle

Multics had an author-maintained command called tape_archive (tar) written by Chris Tavares. Before that, Multics had the archive (ac) command, which did the basic things tar does: create an archive file that can contain copies of multiple files. An archive file can be used to keep a related set of files together. See the Multics System Programmer’s Manual writeup for archive: https://www.multicians.org/mspm/bx-9-04.680611.archive-command.pdf

The Multics archive command was descended from Barry Wolman’s ARCHIV command for CTSS. See https://people.csail.mit.edu/saltzer/Multics/MHP-Saltzer-060508/bookcases/CTSS%20Bulletins/CB-44.pdf

@thvv that's interesting.

While it sounds like the programs have little in common in practice (the format is pretty wildly different, plus Multics's segment model is a whole thing), this _does_ appear to be where tar got its idiosyncratic syntax. The d/r/x verbs match.

(A similarly idiosyncratic syntax appeared in ar years earlier, which I bet was also Multics-influenced, given the similarity.)

I don't think this gets me any closer to proving authorship, but it's interesting context, thanks!

@thvv unrelatedly, I see that we share an affection for Borges' works. Always nice to encounter another warped mind. 🙂
@cliffle I'm just glad Mac didn't originate tar Mac - as faults of mine go.
@cliffle if you find them, I will bow down to them as a God. I hate zip. (Even though jar is basically the same, but 'rebranding' tm)

@cliffle What if we created a long-term #blockchain for #copyrights related to #freesoftware production?🤔

#Linux #JustAnIdea

@s4mdf0o1 @cliffle this...this is a joke, right?
@endrift @cliffle
Just an idea
but I guess you will explain to me why it's stupid 🙂

@cliffle I bet when you start unpacking this it will turn out to be a bunch of people.

Ha!

I’ll get my coat….

@cliffle V7 tar was a replacement for tp and tap, are you looking for “Whose idea?” or “Who did tar v1?”

@cliffle There's the pre-existing 'ar' command as well ("archive"), dating to 1971 per Wikipedia.

Long-ago lore (1990s-era?) suggests some inheritance by tar, though colour that very shady.

@dredmorbius yeah, someone else suggested that, and I can't find evidence for it. The code is very different, the format is very different... Only the names and command line syntax have parallels, but both appear to have gotten that from similar tools on Multics.

@cliffle @dredmorbius bear in mind the "t" in tar. When I first encountered tar, it was mostly for tape backups. Distributing source code was mostly through shar (shell archives). It was years later that .tar.Z became a popular thing.

Ar had a very different use case than tar. When I joined the scene, ar was used only for binary libraries.

Fun fact: even back then I hated "curl | sudo bash". I had an assortment of unshar tools to unpack shars without having to feed them to sh. But I digress...

@cliffle @dredmorbius A related research question I've idly wondered about on occasion: 'ar' is a completely general "shove a bunch of files in a container" format. It's not optimized for tape, and I don't know what its name length limitations and so on are, but it *could* be used to store anything. Why did it get so strongly pigeonholed as the format for Unix static code libraries?
@cliffle @dredmorbius (The *only* other use I know of is that the .deb software package format is an AR archive wrapping two TAR archives and a marker file.)

@zwol I know that's one of my own major exposure routes, if not necessarily how I learned of ar in the first place.

@cliffle

@dredmorbius @cliffle hm, looking at https://pubs.opengroup.org/onlinepubs/9799919799/utilities/ar.html reveals one big glaring reason why the .a format is unsuitable for general backup use: it doesn't store *path* names, only leaf filenames. And the mechanism for permitting arbitrarily long filenames depends on this, so it's a limitation of the format, not just the ar command.
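
For anyone who wants to poke at the layout being discussed, here is a minimal Python sketch of the common ar container: an 8-byte magic string, then a fixed 60-byte header per member followed by the data, padded to an even length. The helper names (`build_archive`, `write_member`, `list_members`) are made up for illustration, the trailing-slash name terminator follows the GNU/SysV convention and is an assumption here, and long-name tables are not handled at all. The sample member names mirror the .deb layout mentioned earlier in the thread.

```python
import io

AR_MAGIC = b"!<arch>\n"   # global header at the start of every ar file
HEADER_LEN = 60           # fixed size of each member header


def write_member(out, name, data):
    """Append one member: 60-byte ASCII header, data, optional pad byte."""
    stored = name + "/"   # assumption: GNU/SysV-style '/' name terminator
    if len(stored) > 16:
        raise ValueError("name too long for this sketch (no long-name table)")
    # Fields: name(16) mtime(12) uid(6) gid(6) mode(8) size(10) end marker(2)
    header = (
        f"{stored:<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{len(data):<10}"
    ).encode("ascii") + b"`\n"
    assert len(header) == HEADER_LEN
    out.write(header)
    out.write(data)
    if len(data) % 2:     # members start on even offsets
        out.write(b"\n")


def build_archive(files):
    """Build an in-memory ar archive from (name, bytes) pairs."""
    out = io.BytesIO()
    out.write(AR_MAGIC)
    for name, data in files:
        write_member(out, name, data)
    return out.getvalue()


def list_members(blob):
    """Return (name, size) for each member; the name field has no directory part."""
    if not blob.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    pos = len(AR_MAGIC)
    while pos + HEADER_LEN <= len(blob):
        header = blob[pos:pos + HEADER_LEN]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header")
        name = header[0:16].decode("ascii").rstrip()
        if name.endswith("/") and name not in ("/", "//"):
            name = name[:-1]
        size = int(header[48:58].decode("ascii"))
        members.append((name, size))
        pos += HEADER_LEN + size + (size % 2)
    return members


if __name__ == "__main__":
    sample = build_archive([
        ("debian-binary", b"2.0\n"),
        ("control.tar.gz", b"abc"),   # placeholder bytes, not a real tarball
        ("data.tar.gz", b""),
    ])
    for name, size in list_members(sample):
        print(name, size)
```

The name field is a flat 16-byte slot, which is why anything path-like or long has to be handled by an extension rather than by the basic header.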