Mastodawn

Jean Hominal 3d ago

This article, "C Is Not a Low-Level Language; Your Computer is Not A Fast PDP-11", by David Chisnall of Cambridge, is a critical self-assessment of C, modern CPU architectures that pander to C, and modern C compiler writers, like himself.

The intertwined successes of UNIX and C were inevitable, given the nature of the computing technology in the late 1960s. But that instant, meteoric success continues to demand backward compatibility through the decades, and that backward compatibility imposes much restraint on future advances.

As a long-time fan of C and PDP-11/70, I find Chisnall's critiques painfully true. The same could be said of UNIX, my all-time favourite operating system. And x86, too, followed a similar path to immediate success and perpetual dominance.

The power of inertia is terrifying: even after 50 years, the current score of the computer architecture and programming language game remains "von Neumann 1 v Backus 0".

https://spawn-queue.acm.org/doi/pdf/10.1145/3212477.3212479

@AmenZwa I found it unpersuasive (granted, it's from 2018, but it didn't persuade me then either :-)). I admit I'm something of a C bigot too, so grain of salt and all that, but it wasn't until 2015 that one could reasonably build a useful massively parallel system. That was when GPUs got solder on the back HBM. And system design is the key because languages in isolation are intellectually interesting but useless. 1/2

@AmenZwa Rob Pike took a swing at a language that was designed for a multi-core system environment with Go. I think the results are mixed, mostly because of the stupid cluster architecture inside of Google, but conceptually it was good. We could build CPUs now that are more GPU-like but we don't. And without a nexus of design we're destined to get "compatibility" designs. As a systems guy this makes me sad. 2/2

amen zwa, esq. 5d ago

@ChuckMcManis
Yeah, I see your point.👍

But there exists inertia in many areas of our work.

James Widman 4d ago

@AmenZwa @david_chisnall David, re:

> A processor designed purely for speed, not for a compromise between speed and C support [...]

How feasible do you think it would be to support existing C programs on a system like that through some set of bridging extensions to the C language (maybe something reminiscent of the extensions that were added to ObjC in order to support Swift)?

Does the answer to that question change if we assume that this new architecture supports CHERI or similar?

amen zwa, esq. 4d ago

@JamesWidman @david_chisnall

James, I am aware that your question was addressed to David, and I'm just a little DSP-on-MCU guy who has never designed a CPU microarchitecture.

But I offer my own confined perspective. It is eminently possible to break the x86-C-Linux stranglehold of modern computing. Indeed, well into the early 1980s, that was how things were done. Each new architecture was incompatible with the one prior. Innovation was explosive, and the laws of Moore and Dennard were extant. Such exploratory divergence was inevitable. But once the industry stumbled upon the cost-performance-familiarity tripod in the shape of x86-C-UNIXen, it took the most steady-profit path.

I doubt, though, that today's profit-driven, risk-averse industry has the moxie to take on risky, groundbreaking innovations. No one dares to play chicken with a half-billion-dollar investment that would likely flop—no one would switch from their familiar environment.

Here is a wild thought. Everyone is obsessed with LLMs these days, using them indiscriminately to generate code they cannot understand. Why not just have AI generate Verilog or VHDL and make FPGAs for small-scale and ASICs for large-scale directly, and be done with architectures, languages, and operating systems? (Forgive me; at my age, I am prone to suffer from these frequent pangs of resentment.)

David Chisnall (*Now with 50% more sarcasm!*) 4d ago

@AmenZwa @JamesWidman

> Indeed, well into the early 1980s, that was how things were done. Each new architecture was incompatible with the one prior.

Even by the '80s, backwards binary compatibility was important. The original IBM PC was from the late '70s, the original Mac the early '80s and subsequent iterations of both ran older binaries. By the '90s, most software wasn't written in assembly, but abstract-machine portability was important. Even in the microcontroller space today, you'd be hard pressed to sell a chip that can't run C code: recompiling is normal, but rewriting is not.

But the sunk cost keeps going up. Every year, there's more C code in the world and so the cost of building a completely new alternative ecosystem is higher.

amen zwa, esq. 4d ago

@david_chisnall @JamesWidman

Very true. At the inception of a new architecture, backward compatibility is the ultimate economic decision factor. For instance, the mighty S/360 (designed in the early 1960s) was obliged to run code for 7090 (designed in the mid 1950s).

And the longer that backward-compatible architecture lived on and became established, the more the sunk cost became the ultimate economic decision factor.

We geeks can't have nice things....😀

darth_cheney 3d ago

@david_chisnall @AmenZwa @JamesWidman

Taking a step back: do we need this kind of compatibility today? Unlike the 80s when we had that explosion of different holistic but incompatible systems, today we have widely adopted open standards for data and communications formats. This is an advantage that implementers simply didn't have back then. It grants the possibility of making a totally new kind of computing system from the ground up

amen zwa, esq. 3d ago

@darth_cheney
I don't mean to cut in front of others with a quick reply, but this is my week off, so I'm idle now. Here goes.

Compatibility is essential in all interconnected systems. All engineers live by that ethos. In EE, for instance, we have numerous standards, ranging from federal statutes and state regulations down to IEEE guidelines and company policies. But one thing I noticed through the decades is that none of those "commandments" interfere with an individual practitioner's ability to apply mathematical reasoning, engineering principles, and innate creativity to his problem analysis, solution synthesis, and production manufacturing tasks.

The software community (of which I am a member, too), for the past six decades at least, has been desperately trying to convert itself into an engineering discipline, without success. There have been tonnes of software "standards". But none are legally binding and only a few, and narrow, areas (like life-critical applications) of software practice actually follow their own standards.

The baseline in engineering (like in medicine and law) is licensure and continuing education, with serious legal consequences for breach of professional ethics. There is no equivalent regulatory oversight in software, because it is impracticable to hold software practitioners legally responsible. And it is legally impossible to define a court-provable causation chain for a "bug" as having been caused by accident, negligence, recklessness, or intent. Even if such a scheme could be developed, there is no reliable way to apportion blame amongst the culprits.

Software is not an engineering practice; it is more akin to an art form.

@david_chisnall @JamesWidman

darth_cheney 3d ago

I meant "compatibility" more in terms of the "abstract machine portability" David was describing above.

A totally new computing system can get a long way by speaking standards like TCP/IP and being able to parse common formats like XML, etc, etc., _without_ having to be a unix at all, or having anything to do with C.

@david_chisnall @JamesWidman

David Chisnall (*Now with 50% more sarcasm!*) 3d ago

@darth_cheney @AmenZwa @JamesWidman

It could. Last time I looked, there were over ten billion lines of C and C++ just counting code in public repos. You need at least a couple of hundred million for a functional desktop.

If you build a computer that can’t run C or a C-like abstract machine, you have to rewrite all of this code. That’s feasible technically, but not commercially. And people keep doing things like saying ‘let’s build a portable runtime that uses the worst possible common subset of a C-like abstract machine’ and putting it in web browsers and making the problem worse.

darth_cheney 3d ago

@david_chisnall @AmenZwa @JamesWidman

What are your thoughts on STEPS? There were some interesting insights in that project about what it might take to replicate the "desktop" experience (hardware aside).

But yeah, commercial viability is going to be the limiting factor. As it stands, capitalism likes a specific kind of free software and everything is molded around it

David Chisnall (*Now with 50% more sarcasm!*) 3d ago

@darth_cheney @AmenZwa @JamesWidman

STEPS was amazing. One of our biggest inspirations for Étoilé. They took the ideas to extremes, which is absolutely the right thing to do for a research project (if you don’t go a little way beyond the sensible limit, you can’t tell where the limit is).

For all the software novelty, it ran on a VM with a very conventional architecture, which means that everything that they did would work nicely on existing hardware.

darth_cheney 3d ago

@david_chisnall @AmenZwa @JamesWidman

Good point! Brings up an interesting question about where to start for novel computing -- do you make a new kind of architecture and then see what kinds of way higher level environments can emerge, or do you go in the reverse order?

Lately I've been thinking about Dale Schumacher's uFork project, and what kinds of higher level environments might become more feasible atop such a system

amen zwa, esq. 3d ago

@darth_cheney
Ah yes, of course; absolutely. 👍 I misunderstood "compatibility" as those continually evolving "standards" that no one followed. My apologies.

@david_chisnall @JamesWidman

David Chisnall (*Now with 50% more sarcasm!*) 4d ago

@JamesWidman @AmenZwa

I worked on an in-house architecture at Microsoft and the consensus was that a 20% speedup on the same process / design complexity with easy source compatibility is not worth it commercially. To bring a new CPU architecture to market that's much faster, you need at least a 2x, ideally a 10x speedup. The problem with doing a non-C-friendly architecture is that you mostly get speedups from removing things that you'd put on to make C fast. If you then also want to run C, you need to add them all back.

The places alternative designs have been successful are things like GPUs. GPUs are well over 10x faster than CPUs for workloads that work on GPUs. They're not displacing CPUs, but they are augmenting them, and a load of things moved to running on GPUs. I could imagine an architecture that's able to run GPU workloads maybe 70% as fast as a good GPU design, but also able to run CPU-like workloads written in a non-C-like language appearing as a more flexible accelerator and then gradually displacing more of the things that CPUs do. It may be that you could get to the point where nothing performance-critical then runs on the CPU and that would give you a path, but it's quite hard (both technically and economically).

amen zwa, esq. 4d ago

@david_chisnall @JamesWidman

In my small corner of the DSP world, the Cortex CPUs and MCUs rule today. But the 80s were the heyday of the TMS320. The TMS320 was a purpose-built DSP CPU. The Cortex-M4F is an M3 augmented with an FPU, the MAC instruction, and a couple of vector facilities. The M3 is as pedestrian as an MCU can be. Yet a general-purpose MCU like the M4, with a couple of tweaks, can live happily in a narrow, specialised field like DSP.

Oh, by the way, the Cortex-M family specifically targets C, with internal components designed to accommodate the way C compilers generate code, including C's calling convention of pushing the stack in printf-friendly reverse order: printf-style variadic argument passing specialisation on an MCU/DSP chip. Fancy that!

This type of single-minded pursuit of design artificially restrains progress, as David pointed out.

In 1977, Backus pleaded—of all the places, at his own ACM award ceremony for his work on FORTRAN—to move away from the von Neumann bottleneck and its software reflection, PP languages. He even proposed his own "FP" language. In the mid 1980s, at the peak of LISP machines, Knight published an insightful article on the architectures designed for FP languages. But the industrial flywheel ensured that none of those ideas escaped academia.

Today, there are loads of opportunities for researchers to explore truly novel CPU architectures in affordable ways. Yet, the economic gravity of the industry keeps pulling the talent towards the spinning flywheel.

I don't have a solution in mind; I'm just complaining, bitterly....

Vassil Nikolov | Васил Николов 4d ago

«In 1977, Backus ... even proposed his own "FP" language. In the mid 1980s, at the peak of LISP machines, Knight published an insightful article on the architectures designed for FP languages. But the industrial flywheel ensured that none of those ideas escaped academia.»

Not just the industrial flywheel.

Imperative programming is easier than functional,
and procedural is easier than declarative.
Easier as the path of least immediate resistance, but it matters.

(If this is included in the Industrial flywheel, then scratch my first sentence above.)

@AmenZwa @david_chisnall @JamesWidman

amen zwa, esq. 4d ago

@vnikolov
No no, you’re spot on. Indeed, all those points we have been cogitating over in this thread could be explained with that one fundamental physics principle: the path of least resistance, you know, the path all EEs tread upon.😀

@david_chisnall @JamesWidman

David Chisnall (*Now with 50% more sarcasm!*) 4d ago

@vnikolov @AmenZwa @JamesWidman

I'm curious what this claim is based on. There are a lot of 'X is easier' claims, but they generally include a lot of survivorship bias: people were taught X, then they learned Y. The ones who couldn't understand X never got to see Y, so you're left with a population that can do X, of which some prefer Y, but all of them can do X so X must be easier.

For example, I remember the thing I found most confusing when I started learning to program was that subroutines executed synchronously. My mental model was that a subroutine call started something running and eventually it would finish. It took me years to get to grips with the idea that these things happened synchronously.

You're told that programs are like recipes, but every single recipe I've read has instructions like 'start the pot simmering' and 'meanwhile...' Each step is a thing that happens in a distinct time, but with interdependencies and steps that are atomic with respect to the rest of the process.
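As a rough illustration of the recipe point, here is a minimal sketch using Python's `asyncio`. It is not the Behaviour-Oriented Concurrency model from the paper linked in this post; the function names (`simmer_pot`, `chop_vegetables`, `cook`) and the sleep durations are made up for the example. It only shows a "start this, meanwhile do that" structure where serving depends on both steps finishing.

```python
import asyncio

async def simmer_pot(log):
    # "Start the pot simmering": begins now, finishes some time later.
    log.append("pot: start simmering")
    await asyncio.sleep(0.05)
    log.append("pot: done")

async def chop_vegetables(log):
    # "Meanwhile...": runs while the pot is still simmering.
    log.append("veg: start chopping")
    await asyncio.sleep(0.01)
    log.append("veg: done")

async def cook():
    log = []
    # Both steps are in flight at once; serving depends on both finishing.
    await asyncio.gather(simmer_pot(log), chop_vegetables(log))
    log.append("serve")
    return log
```

Calling `asyncio.run(cook())` returns a log in which the chopping starts and finishes while the pot is still simmering, which a strictly synchronous sequence of subroutine calls would not produce.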

This is why I like our Behaviour-Oriented Concurrency model: it directly corresponds to things that you see in recipes, and a lot more people can follow a recipe than can write code.

When Concurrency Matters: Behaviour-Oriented Concurrency | Proceedings of the ACM on Programming Languages

Vassil Nikolov | Васил Николов 4d ago

My claim, or perhaps thesis, is based on the accumulation of my observations on the multitude of programs that I have seen.
I don't think the explanation of the prevalence of imperative and procedural programs is only that their authors never heard of other ways to do it.

Another thing that feeds my intuition about the above is the introduction of things like the IO monad in Haskell.

And it is a much bigger topic, of course.

@david_chisnall @AmenZwa @JamesWidman

amen zwa, esq. 4d ago

@vnikolov
I learned assembly in the early 1980s, as was common in those days. Over the next five or so years, I learned C, LISP, Smalltalk, and ML. This was unusual, and perhaps unwise, by conventional wisdom. It was surely a mess trying to juggle PP, OO, and FP concepts nearly simultaneously, when my only mental reference was circuits and signals. These days, my favourite is dependently typed FP, as expressed in my Fortran modernisation article.
https://amenzwa.github.io/stem/PL/FortranModernisation/

I agree with you that the imperative paradigm (machine, procedural, and objective) is easier for novices to pick up, compared to the declarative paradigm (functional, relational, and logical). By “easier”, I mean the transition from none to some level of comprehension is shorter. This observation is based on my teaching undergraduate CS students C, ML, and Prolog in the 1990s and my continued instructional efforts in industry teaching OCaml, Haskell, Scala, etc., to Java aficionados.

I also observe that in the 80s and 90s, CS curricula were far narrower (naturally) and there was a lot of theoretical background given: computability, complexity, category, and so on, even if only at the shallow introductory depths. That is no longer true amongst ordinary CS curricula. Most students are focusing on Python and PyTorch.🤦‍♂️ So, when I teach Haskell to a gaggle of JavaScript and Python industry experts, it stresses them. My 90s undergraduate students, who were more accustomed to theoretical concepts, learned ML (granted a far simpler language) with less dread and resistance.

@david_chisnall @JamesWidman

A Forlorn Hope of Fortran Modernisation · Amen Zwa, Esq.

Vassil Nikolov | Васил Николов 4d ago

Indeed.
And your final paragraph is about yet another important part of this picture.

At least, JavaScript has `const' and Python has several kinds of immutable data structures, some small stepping stones.
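For readers who have not met them, a small sketch of a few of Python's built-in immutable structures follows. The variable names and the `rejects` helper are hypothetical, chosen only for this example. Note that this immutability is shallow: a tuple can still hold a mutable list.

```python
import operator
from dataclasses import dataclass

point = (1, 2)                    # tuple: immutable sequence
tags = frozenset({"c", "unix"})   # frozenset: immutable set

@dataclass(frozen=True)
class Reading:
    # frozen=True makes attribute assignment raise an error
    sensor: str
    value: float

reading = Reading("adc0", 1.5)

def rejects(mutation):
    """Return True if calling `mutation` is refused by the data structure."""
    try:
        mutation()
    except (TypeError, AttributeError):
        return True
    return False

# Each attempted mutation below is refused at run time.
tuple_rejected = rejects(lambda: operator.setitem(point, 0, 9))
frozenset_rejected = rejects(lambda: tags.add("plan9"))
dataclass_rejected = rejects(lambda: setattr(reading, "value", 2.0))
```

All three flags end up `True`, and the original values are left unchanged.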

@david_chisnall @JamesWidman

amen zwa, esq. 4d ago

@vnikolov
Also, Python 3’s “typing” module helps, a little. And I advise my mentees who still live and breathe JavaScript to switch to TypeScript, or at least use Flow, instead.
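A minimal sketch of why the help is only partial: type hints document intent and can be checked by an external static checker such as mypy, but the Python interpreter itself does not enforce them. The function names here are hypothetical.

```python
from typing import List, Optional

def find_index(items: List[str], target: str) -> Optional[int]:
    """Return the position of `target` in `items`, or None if absent."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return None

def double(n: int) -> int:
    # The annotation says int, but nothing checks it at run time.
    return n * 2

found = find_index(["c", "ml", "prolog"], "ml")
missing = find_index(["c", "ml", "prolog"], "haskell")

# A static checker would flag this call; the interpreter happily runs it.
unchecked = double("ab")
```

Here `unchecked` ends up as the string `"abab"`, which is exactly the kind of mistake the annotations describe but do not prevent on their own.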

@david_chisnall @JamesWidman

David Chisnall (*Now with 50% more sarcasm!*) 4d ago

@vnikolov @AmenZwa @JamesWidman

> I don't think the explanation of the prevalence of imperative and procedural programs is only that their authors never heard of other ways to do it.

Please reread what I wrote. That is not at all what I’m saying.

Imagine you have a school that teaches French and German as foreign languages. Everyone starts learning French. After one year, you may drop French. If you do two years of French, you will do one year of German and may then continue to do either French, German, or both.

At the end, you have a large proportion of people who never tried German because they were not good at French and so never got taught any German. The ones who have learned German have more experience with French but some may find German easier and become more proficient. Now go and ask the people who have learned French whether they find French or German easier. Most will say French, because none of the people who would have found German easy and French hard ever got the opportunity to try German, whereas the ones who found French easy and German hard got to try both.

Now, it may be that German is much harder than French and that’s why they’re taught in that order. If you flipped the order of teaching, you may end up with half as many people making it to the class where they got to try the other language. But you can’t draw that conclusion from this data set.
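The selection effect in the French/German example can be sketched with a toy model. Everything here is an assumption made for illustration: aptitudes are integers from 0 to 9, every combination of French and German aptitude occurs exactly once, and only students with a French aptitude of at least `PASS_MARK` ever get to try German. By construction neither language is easier across the whole population.

```python
from itertools import product

LEVELS = range(10)   # assumed aptitude scale, 0 (weak) to 9 (strong)
PASS_MARK = 5        # assumed French aptitude needed to reach the German class

# One student per (french_aptitude, german_aptitude) pair: fully symmetric.
students = list(product(LEVELS, LEVELS))

def survey(group):
    """Count who finds French easier and who finds German easier."""
    french_easier = sum(1 for f, g in group if f > g)
    german_easier = sum(1 for f, g in group if g > f)
    return french_easier, german_easier

# Whole population: the two languages are equally "easy".
everyone = survey(students)

# Only students who passed French were ever taught German, so only they
# can be asked to compare the two.
tried_both = [(f, g) for f, g in students if f >= PASS_MARK]
survivors = survey(tried_both)
```

In this toy model `everyone` is `(45, 45)` while `survivors` is `(35, 10)`: the people able to compare both languages favour French by a wide margin even though the underlying population has no preference.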

Teaching imperative programming, pretty much independent of the pedagogy, fails for around 80-90% of the population (I am particularly interested in the fact that this number also corresponds to the proportion of the population that doesn’t think of organisation in terms of hierarchies, given the hierarchical nature of most programming languages’ core concepts). So you’re starting with a population of under 20% who were able to learn imperative programming. Of those, some subset learn other programming styles.

Oh, and note that the proportion who are able to learn and become proficient in algebra is much higher than the proportion that successfully learn to program (typically well over 50%).

amen zwa, esq. 4d ago

@david_chisnall @vnikolov @JamesWidman
Learning appears to be evaluation-order dependent, then. It sure appears to be imperative, in a “Do the assignments or else!” sort of way.😀

I hadn’t quite thought about college-level programming education in this way, but what you described rings true, based on my limited experience in that world. In my days, most of us started with assembly. In my case, electronics first, then assembly. In my students’ days, everyone started in C. Going from C PP to C++ OO wasn’t a big deal, naturally. And C++ was but a simple, thin objective layer atop C, back then. And most students enjoyed the transition. But going from PP and OO to FP was indeed rough for every one of my students.

I personally didn’t experience much difference in the transition from PP to OO v to FP, since I dove into all three paradigms about the same time, when I hadn’t a preferred way of thinking, yet. The unintentional syntopic learning of PP, OO, and FP was long and rough, for sure. But it was a valuable learning experience, too.

David Chisnall (*Now with 50% more sarcasm!*) 3d ago

@AmenZwa @vnikolov @JamesWidman

One other data point: the most widely used programming language in the world, Excel, has around a billion users. It is a reactive declarative programming environment (and now it has lambdas!). No other programming language or even set of programming abstractions has managed to be used by 10% of that number of people.
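To make "reactive declarative" concrete, here is a toy sketch of spreadsheet-style cells. It is not how Excel is implemented: the `Sheet` class and its methods are hypothetical, and formulas are simply recomputed on every read, whereas a real spreadsheet engine tracks dependencies. The point is only that the user states what a cell is, not when to update it.

```python
class Sheet:
    """A toy spreadsheet: a cell holds either a value or a formula."""

    def __init__(self):
        self._cells = {}

    def set(self, name, value_or_formula):
        # A formula is any callable that takes the sheet and returns a value.
        self._cells[name] = value_or_formula

    def get(self, name):
        cell = self._cells[name]
        # Formulas are evaluated on demand, so they always see current inputs.
        return cell(self) if callable(cell) else cell

sheet = Sheet()
sheet.set("A1", 2)
sheet.set("A2", 3)
sheet.set("A3", lambda s: s.get("A1") + s.get("A2"))  # like =A1+A2

before = sheet.get("A3")
sheet.set("A1", 10)        # change an input; A3 is never touched directly
after = sheet.get("A3")
```

Here `before` is 5 and `after` is 13: the dependent cell reflects the new input without any explicit update step written by the user.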

amen zwa, esq. 3d ago

@david_chisnall
👍 That is indeed a data point, a massive one, the one no programmers like to talk about. Truth can be stranger than fiction, at times.

@vnikolov @JamesWidman

Vassil Nikolov | Васил Николов 3d ago

Some of those billion Excel users write their own Excel formulas.
Some of the latter subset of users write non-trivial formulas.
I don't know how easy or difficult it would be to find out how many such users there are.

@david_chisnall @AmenZwa @JamesWidman

David Chisnall (*Now with 50% more sarcasm!*) 3d ago

@vnikolov @AmenZwa @JamesWidman

Ironically, we actually didn’t have good telemetry on this at Microsoft. Getting data on what functions are used in formulae took a lot of work to clear the privacy approvals and the group asking for it just wanted aggregate data on the most common functions to know what to prioritise for the web version. The corpora we had were also quite non-representative (the biggest corpus of Excel spreadsheets is from the court filings for Enron).

Vassil Nikolov | Васил Николов 3d ago

I think there is significant variation in the teaching scenarios, perhaps a lot of variation.

For example, judging by _Structure and Interpretation of Computer Programs_, the introductory computer programming course at MIT (and presumably elsewhere) didn't use to be taught in procedural style, or at least not entirely (and not judging by the choice of programming language).

In the final paragraph, what exactly does algebra refer to, and fifty percent of whom become proficient in it?

@david_chisnall @AmenZwa @JamesWidman

Vassil Nikolov | Васил Николов 4d ago

P.S.
Myself, I prefer to avoid writing in an imperative or procedural style as much as I can, so it's not about what is easier for me, but what seems to me to be easier for people in general.

P.P.S.
I have always considered a cooking recipe as a rather weak metaphor for an algorithm or a program most of all because executing a recipe requires judgement.
Concurrency and real-time constraints come after that, in my view.

@david_chisnall @AmenZwa @JamesWidman