Mastodawn

A cursed feature of C in 1972: Labels and functions were reassignable (i.e., lvalues)!

For example, this is a clever way to initialize once:

goto init;
init:
ouptr = oubuf;
init = init1;
init1:

which is compiled to:

jmp *4120
mov 4136,4144
mov 4122,4120

Note the indirect jump and assignment to that address. All gotos used indirect jumps. This apparently would have also worked with functions.
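As a rough modern analogue (a sketch, not how cvft does it), the same run-once trick can be written in standard C with a reassignable function pointer standing in for the reassignable label. The names `put`, `put_init`, `put_ready`, and `init_runs` are hypothetical; only `ouptr`, `oubuf`, and `init` come from the snippet, and the buffer size is an assumption. The labels-as-values extension in GCC and Clang (`&&label`, `goto *p`) is closer in spirit but nonstandard.

```c
/* Sketch: one-time initialization through a reassignable function pointer.
 * Hypothetical names: put, put_init, put_ready, init_runs.
 * The buffer size is an assumption; the original does not show it. */

char oubuf[512];
char *ouptr;
int init_runs;                    /* counts how often the init code ran */

void put_ready(void) { }          /* plays the role of init1: nothing to do */
void put_init(void);

void (*init)(void) = put_init;    /* plays the role of the cell at 4120 */

void put_init(void)
{
    ouptr = oubuf;                /* ouptr = oubuf;  */
    init_runs++;
    init = put_ready;             /* init = init1;   */
}

void put(char c)
{
    init();                       /* indirect call, like jmp *4120 */
    *ouptr++ = c;
}
```

Calling `put` twice runs the initialization only on the first call; every later call pays for one indirect call to an empty function.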

#c #unix #retrocomputing #vintagecomputing


Thalia Archibald May 10

At the time, C was rapidly evolving into what we recognize today.

It started as B, an untyped, interpreted language with only one kind of value, a word. This was a perfect fit for the PDP-7 that UNIX started on, with its 18-bit words, but once they moved to the PDP-11, with 16-bit words and byte addressing, this became a limitation.

Soon, types were added to distinguish char, int, and pointers, and the language became known as NB ("New B"). But B wasn't particularly fast, as it was interpreted (well, threaded).

Once it was rewritten to be compiled, the language became known as C (perhaps initially as NC). For a short while, everything in C was an lvalue, as in B, which allowed the above snippet, but this was dropped a few months later, presumably for efficiency.

Some UNIX utilities had been written in B/C from the start, but efforts to rewrite the kernel itself in B/C had failed. Finally, once structs were added, C was powerful enough to support the kernel, which was rewritten in C over the summer of 1973, culminating in the release of UNIX V4.

#c #unix #retrocomputing #vintagecomputing


Thalia Archibald

This snippet appears in cvft, a compiler from June 1972 that translates Fortran threaded code to machine code and is notably derived from the early C code generator. See putchar (and also getcha) in dmr/cgd/cvft.c.

The earliest extant C compiler is last1120c from July 1972, the last C version for the PDP-11/20, before they migrated to the PDP-11/45. This version still has the label lvalue behavior of B seen in cvft. Then, it was changed to the modern behavior by the time of prestruct-c from December 1972. That version supports structures, but does not yet use them itself.

All three can be found in Dennis_Tapes: https://www.tuhs.org/Archive/Applications/Dennis_Tapes



Thalia Archibald May 10

Another strange pattern from the same program.

This one reassigns the address of an array, `int nlist[250]`, in char increments. Arrays are no longer modifiable lvalues, so this doesn't work anymore. Also, the address is unaligned every other iteration.

lbp;
nlist[250];

getnam()
{
    extern nlist, lbp;
    char nlist[], lbp[], c;

loop:
    c = *lbp++;
    if (c==';' | c=='\n')
        goto el;
    *nlist++ = c;
    goto loop;
el:
    *nlist++ = '\0';
}

Somewhat simplified from Dennis_Tapes/dmr/cgd/cg1.c:getnam.
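For comparison, here is a hedged sketch of how the same loop might look in today's C, where the array keeps its address and a separate cursor does the walking. The cursor `nlist_cur`, the bounds checks, and the stop on `'\0'` are assumptions added for safety and are not in the original; `nlist`, `lbp`, and `getnam` keep their names from the snippet.

```c
/* Sketch: a modern equivalent of getnam with an explicit cursor.
 * nlist_cur, the bounds checks, and the '\0' stop are additions. */

int nlist[250];                        /* storage; its address is fixed  */
char *nlist_cur = (char *)nlist;       /* hypothetical byte-wise cursor  */
const char *lbp;                       /* input cursor, as in the source */

void getnam(void)
{
    char *end = (char *)nlist + sizeof nlist;
    char c;

    for (;;) {
        c = *lbp++;
        /* the original uses bitwise | here; || is the usual choice now */
        if (c == ';' || c == '\n' || c == '\0')
            break;
        if (nlist_cur < end - 1)       /* leave room for the terminator */
            *nlist_cur++ = c;
    }
    if (nlist_cur < end)
        *nlist_cur++ = '\0';
}
```

Because the cursor persists between calls, successive names are packed one after another into `nlist`, which is also why the write position lands on an odd byte offset after some calls.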


Thalia Archibald May 10

This snippet deliberately triggers a "Bus error -- Core dumped":

int o1[];
o1 = -3;
*o1;

From Dennis_Tapes/dmr/cgd/cg1.c:expr.


aap May 10

@thalia Very interesting program too :) Did it ever end up in anything later, or was it more like an experiment?


Thalia Archibald May 10

@aap It feels pretty experimental and I haven't seen it anywhere else. Currently reimplementing it in Rust :). Finding lots of little bugs in it.


Vassil Nikolov | Васил Николов May 10

@thalia wrote:
«This snippet deliberately triggers a "Bus error -- Core dumped":

int o1[];
o1 = -3;
*o1;»

Apart from the array variable not being a constant in those days,
isn't this essentially the same as
* ((int *) -3);
(with platform-dependent behavior, of course)?


Thalia Archibald May 10

@vnikolov Yes. Casts weren't introduced until V7 and a constant address couldn't be dereferenced, so this was necessary.
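A small worked sketch of the arithmetic, assuming (the thread does not say so explicitly) that the bus error comes from a word access at an odd address on the PDP-11: truncated to a 16-bit address, -3 is 0xFFFD (octal 0177775), which is odd. The helper names `as_pdp11_address` and `is_word_aligned` are hypothetical.

```c
#include <stdint.h>

/* Sketch: what address does -3 denote on a 16-bit machine?
 * Both helper names are hypothetical. */

/* Reinterpret a signed value as a 16-bit address (wraps modulo 65536). */
uint16_t as_pdp11_address(int value)
{
    return (uint16_t)value;
}

/* A 16-bit word access is aligned only at an even address. */
int is_word_aligned(uint16_t addr)
{
    return (addr & 1) == 0;
}
```

Here `as_pdp11_address(-3)` yields 0xFFFD and `is_word_aligned` reports it as unaligned, so dereferencing it as an `int` would be a misaligned word access under the stated assumption.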