Thalia Archibald

Computer science PhD student at the University of Utah, researching compiler verification with @regehr.
UNIX historian and archivist.
Occasional Rust standard library contributor.
Site: https://thalia.dev
GitHub: https://github.com/thaliaarchi

Watch this space.

a.out(V), UNIX Programmer's Manual First Edition, 3 November 1971.

#unix #retrocomputing #vintagecomputing

@vnikolov Yes. Casts weren't introduced until V7 and a constant address couldn't be dereferenced, so this was necessary.
@aap It feels pretty experimental and I haven't seen it anywhere else. Currently reimplementing it in Rust :). Finding lots of little bugs in it.

This snippet deliberately triggers a "Bus error -- Core dumped":

int o1[];
o1 = -3;
*o1;

From Dennis_Tapes/dmr/cgd/cg1.c:expr.

Another strange pattern from the same program.

This one reassigns the address of an array, `int nlist[250]`, in char increments. Array names are no longer assignable, so this doesn't work anymore. Also, the address is unaligned for int access every other iteration.

lbp;
nlist[250];

getnam()
{
    extern nlist, lbp;
    char nlist[], lbp[], c;

loop:
    c = *lbp++;
    if (c==';' | c=='\n')
        goto el;
    *nlist++ = c;
    goto loop;
el:
    *nlist++ = '\0';
}

Somewhat simplified from Dennis_Tapes/dmr/cgd/cg1.c:getnam.
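
For comparison, here is a rough sketch of how the same routine might look in modern C, where the array name can't be assigned to, so a separate cursor pointer has to walk the buffer. The names `nlist_buf` and the `static` storage are my own assumptions for illustration, not anything from the original source, and the bounds check is an addition the original doesn't have.

```c
#include <assert.h>

/* Hypothetical modern counterparts of the original globals. */
static char nlist_buf[250 * sizeof(int)]; /* the storage `int nlist[250]` reserved */
static char *nlist = nlist_buf;           /* explicit cursor; replaces assigning to the array name */
static const char *lbp;                   /* input cursor */

/* Copy one name from lbp into the buffer, stopping at ';' or newline. */
static void getnam(void)
{
    char c;

    for (;;) {
        c = *lbp++;
        if (c == ';' || c == '\n')
            break;
        /* Bounds check added for this sketch; leaves room for the terminator. */
        assert(nlist < nlist_buf + sizeof nlist_buf - 1);
        *nlist++ = c;
    }
    *nlist++ = '\0';
}
```

With `lbp` pointing at "foo;bar\n", two calls leave "foo" and "bar" stored back to back in `nlist_buf`, each NUL-terminated, with `nlist` pointing just past the second terminator. The original gets the same effect by bumping the array's own address.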

@usul PDP-11 assembly with UNIX syntax. Those are octal addresses. Unless an address has a $, it refers to the value at that address. * is a dereference.
@djl I'm afraid I don't speak PDP-6 / PDP-10 assembly (yet?). Could you elucidate?
@djl @bms48 Lovely. I'd love to get a PDP-7, but they're incredibly rare.

This snippet appears in cvft, a compiler from June 1972 that translates Fortran threaded code to machine code and is notably derived from the early C code generator. See putchar (and also getcha) in dmr/cgd/cvft.c.

The earliest extant C compiler is last1120c from July 1972, the last C version for the PDP-11/20, before they migrated to the PDP-11/45. This version still has the label lvalue behavior of B seen in cvft. By the time of prestruct-c, from December 1972, it had been changed to the modern behavior. That version supports structures, but does not yet use them itself.

All three can be found in Dennis_Tapes: https://www.tuhs.org/Archive/Applications/Dennis_Tapes


At the time, C was rapidly evolving into what we recognize today.

It started as B, an untyped and interpreted language that only had one kind of value, a word. This was a perfect fit for the PDP-7 that UNIX started on, with 18-bit words, but as they transitioned to the PDP-11, with 16-bit words and byte addressing, this became a limitation.

Soon, types were added to distinguish char, int, and pointers, and it became known as NB ("New B"). But B wasn't particularly fast, as it was interpreted (well, threaded).

Once it was rewritten to be compiled, the language became known as C (perhaps initially as NC). For a short while, everything in C was an lvalue, as in B, giving the above snippet, but this was dropped a few months later, presumably for efficiency.

Some UNIX utilities had been written in B/C from the start, but efforts to rewrite the kernel itself in B/C had failed. Finally, once structs were added to C, it was powerful enough to support the kernel and it was rewritten in C over the summer of 1973, culminating in the release of UNIX V4.

#c #unix #retrocomputing #vintagecomputing