So now I am getting sidetracked again, from TGE to making a language that is ā€˜orthogonal’ but which requires no ā€˜two-level grammar’ yadda yadda yadda, nor a report that Wirth, Dijkstra, and Hoare could not read and about which Hoare would end up writing a paper roughly entitled ā€˜Algol 68 Sucks’.

Indeed, my notion is to not write any formal grammar at all, and to use only A PRATT PARSER.

Which, admittedly, is a thing that did not exist yet for Algol 68, but that hardly matters.

Incidentally, Wirth developed a language called ā€˜Algol 68 Sucks’ (actually it is called Pascal) and the United States Department of Defense funded the development of another language called ā€˜Algol 68 Sucks’ (actually it is called Ada).

Neither of these languages sucks.

Pascal as originally developed is a whole-program-only language, mind you. There is no separate compilation. You must understand that to understand the language. Everything is nested within ā€˜program’.

It is with Modula that things get more serious in that line of development. (Not with Object Pascal aka Turbo Pascal, in which was written the worst computer code I have ever seen or worked on.)

I am fond of safety but also of simplicity.

It must be admitted, for instance, that ATS (any version of it) is a complicated language. It is much simpler than an Adriaan van Wijngaarden would have made it. It surely is simpler and much more practical than, say, Agda (which is a hugely moving target, anyway), which attempts to be ā€˜theoretical’ whilst ATS does not.

So we want ATS-like safety but we also want simplicity. Can that be done with an ā€˜orthogonal’ language and Pratt parsing?

Note, for instance, the problem of buffer overruns. But there is another problem with arrays that is addressed by ATS, and which is NOT addressed widely otherwise: aliasing. That is, the problem of entries in arrays being aliased, because it is easy to refer to the same array in two places.

ATS solves both problems by making an array a linear type that is typechecked via a ā€˜view’. A view has a particular length, measured in entries of the entry type. The entry type can be changed, and so the length with it, but let us ignore that.

A view can also be split into mutually exclusive parts: two or three, or more if need be. Whilst split, it becomes that many arrays. Then these must be joined together again at some time, because the types are linear. They must be consumed.

These are all typechecking operations, but they must be written by the programmer or put in subprograms.

Thus, to read or write an array entry, you must split the array into two or three parts and also somehow transform the one entry into an ordinary variable.

Then you have to join it all back together.

But none of this steals any speed from your code, because it is all typechecking code. So you end up with safe array access that requires no runtime bounds checking!
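
The split/join discipline can be sketched as a runtime simulation. In ATS all of these checks happen at typechecking and vanish at runtime; the Python class below, with names entirely of my own invention, merely mimics the rules dynamically:

```python
# A runtime caricature of the linear 'view' discipline described above.
# In ATS these checks cost nothing at runtime; here they are simulated.

class ArrayView:
    def __init__(self, storage, lo, hi):
        self.storage = storage    # the shared underlying buffer
        self.lo, self.hi = lo, hi
        self.consumed = False     # a linear value may be used only once

    def _live(self):
        if self.consumed:
            raise RuntimeError("view already consumed")

    def split(self, k):
        """Consume this view, yielding two mutually exclusive views."""
        self._live()
        self.consumed = True
        mid = self.lo + k
        return (ArrayView(self.storage, self.lo, mid),
                ArrayView(self.storage, mid, self.hi))

    def join(self, other):
        """Consume two adjacent views, yielding the rejoined whole."""
        self._live()
        other._live()
        assert self.hi == other.lo, "views must be adjacent"
        self.consumed = other.consumed = True
        return ArrayView(self.storage, self.lo, other.hi)

    def get(self, i):
        self._live()
        assert 0 <= i < self.hi - self.lo   # bounds local to the view
        return self.storage[self.lo + i]

    def set(self, i, x):
        self._live()
        assert 0 <= i < self.hi - self.lo
        self.storage[self.lo + i] = x

buf = [10, 20, 30, 40]
whole = ArrayView(buf, 0, 4)
left, right = whole.split(2)   # 'whole' is consumed by the split
left.set(0, 99)                # writes through to buf[0]
whole2 = left.join(right)      # must rejoin to recover the whole array
```

Once `whole` has been split, any further use of it is an error, which is the linearity the text is describing.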

The question is, can we achieve something similar by being ā€˜orthogonal’ with a Pratt parser? That is, can we do it by adjusting the SYNTAX of the language?

How about if we make fundamental syntactic operations on arrays be splitting and joining?

We can also make assignment and evaluation be entry-wise in the manner of Fortran and Matlab, rather than require it be reduced to a proper variable as in ATS. This way we still avoid the aliasing problem.

For example, if we do a quicksort, the two parts of the array are still fed separately to the two parts of the algorithm. Neither subprogram call can touch the other subprogram call’s subarray.
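
Here is that quicksort shape as a Python sketch of my own: the two recursive calls receive index ranges that are disjoint by construction, so neither call can touch the other's subarray.

```python
def quicksort(a, lo=0, hi=None):
    """In-place quicksort; the two recursive calls get disjoint ranges."""
    if hi is None:
        hi = len(a)
    if hi - lo <= 1:
        return
    pivot = a[hi - 1]             # Lomuto partition around the last entry
    i = lo
    for j in range(lo, hi - 1):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi - 1] = a[hi - 1], a[i]
    quicksort(a, lo, i)           # left part: indices [lo, i)
    quicksort(a, i + 1, hi)       # right part: indices [i+1, hi), disjoint

v = [5, 3, 8, 1, 9, 2]
quicksort(v)
# v is now [1, 2, 3, 5, 8, 9]
```

Python does nothing to *enforce* the disjointness, of course; the point of the split/join syntax would be to make the compiler enforce it.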

But we have done it with the Pratt parser, so have done it ā€˜orthogonally’, not via semantics. And we have done it without writing an unreadable report.

Mind you, this requires that splitting an array CONSUME the original array, and that the array must be rejoined if it is to be recovered.

We do not have to enforce linear typing, only that splitting and joining be destructive operations.

But that’s just control of your symbol tables and such. That’s just more syntax. You could probably even go full linear through syntax.

It likely does mean you aren’t going to be happy with the classic ā€˜orthogonal’ syntax example, if it should appear as an initialization:

(if b then x else y fi) := 3

Still, if assignment to an initialized variable implies consumption of the previous value, then it is okay when it is not an initialization.

Oh, let’s write that in Dijkstra’s nondeterministic if-fi:

(if b => x | ~b => y fi) := 3

because let us say we want to use that, if only to introduce a new generation of programmers to it.

The big difference here is that there is no sequence of operations. The code takes any one of the branches in the if-fi; which one is not specified. The only thing that decides whether a branch may actually be taken is its guard.
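
A toy evaluator for the if-fi, under an encoding of my own: each alternative is a (guard, action) pair of thunks, any alternative whose guard is true may be chosen, the choice among them is left to chance, and if no guard is true the program aborts.

```python
import random

def if_fi(*alternatives):
    """Dijkstra's guarded if-fi: alternatives are (guard, action) thunks."""
    live = [action for guard, action in alternatives if guard()]
    if not live:
        # no guard is true: there is no fallthrough, the program aborts
        raise RuntimeError("if-fi: no guard is true")
    return random.choice(live)()   # nondeterministic choice among them

b = True
result = if_fi(
    (lambda: b,     lambda: "x"),
    (lambda: not b, lambda: "y"),
)
# result is "x", because only the first guard holds
```

When the guards overlap, `random.choice` makes the nondeterminism visible, which is exactly what a proof over the guards must tolerate.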

(I read stuff here in the Fediverse that makes me think, ā€˜These people have no idea what ā€œnondeterminismā€ and ā€œdeterminismā€ even mean’.)

Of course the compiler could have decided ~b is the negation of b and so produced the same code as ā€˜if b then x else y fi’. But that is for the compiler writer to decide.

Meanwhile, the programmer actually has a simpler situation with which to do proofs. To do proofs with the earlier, non-Dijkstra formulation, one first usually throws away part of the information and mentally transforms to the Dijkstra formulation.

Also, suppose I left out the ~b guard. Then there would be no specified operation for when b is false. There is no fallthrough. The program is erroneous and presumably will terminate with an error message.

The program would have been erroneous in this case, anyway, because you cannot assign to ā€˜nothing’. But the principle is more general.

Absence of a guard is a common cause of nondeterminism in the Mercury language, though in Mercury I think guards are gone through in sequence.

Obviously, if the guards cover all the cases and are mutually exclusive, then the program is ACTUALLY deterministic. And a compiler may be able to confirm this.

As indeed a Mercury compiler has to be able to do, for the Mercury language, though there I believe the cases do not have to be mutually exclusive. In one’s mind, one assumes earlier tests that pass are actually excluded from later tests, and does proofs by using the Dijkstra formulation.

Strictly speaking, however, the order of the expressions is a part of the information that is not included in the proofs and yet is relevant, and bits of code that would break the proper functioning, if the order were changed, are not explicitly accounted for.

I am merely trying to make an ā€˜orthogonal’ language. I can let the program terminate with an error if there is a case left out. At least I need not REQUIRE that a program be deterministic. In fact I think it silly to REQUIRE that a program be deterministic, and most programming languages in fact do not require this.

Indeed, they are happy to let you have, for instance, a loop even though one has failed to prove it either terminating or nonterminating.

ATS, incidentally, has but one method for proving a recursion or loop terminating, which is ā€˜termination metrics’. It involves having a typechecking variable, or a tuple of typechecking variables, move progressively towards zero without passing zero, on each iteration of the recursion or loop. It is mathematical induction mapped to counting backwards, although the counting can be by leaps and bounds (division, for instance).
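
A caricature of a termination metric, in Python of my own devising: Euclid's algorithm, where the second argument serves as the metric. ATS would verify statically that the metric decreases toward zero without passing it; here the same condition is merely asserted at runtime.

```python
def gcd(a, b, metric=None):
    """Euclid's algorithm; 'b' itself serves as the termination metric."""
    if metric is not None:
        # the metric must strictly decrease and never go below zero
        assert 0 <= b < metric, "metric failed to decrease toward zero"
    if b == 0:
        return a
    return gcd(b, a % b, metric=b)

# gcd(48, 18) -> 6, with the metric running 18, 12, 6, 0
```

The ā€˜leaps and bounds’ remark applies here: the metric drops by division-like steps, not by one each time, and that is fine so long as it drops.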

ATS2 has both recursions and loops. Loops are poorly documented, tho’.

In a way it matters little, though, because recursions can be done with reference variables and so tail recursions effectively resemble loops closely. But, as I say, these can be written as loops more properly, resembling C loops, except with a lot of proof notations.

The following occurs to me:

If you start with bytes as the basic type, you can make EVERY type in the language be an array. And that includes records. For accessing a record is merely splitting of an array.

And a multidimensional array is also merely splitting of an array. This is already how it is done in Fortran. Indeed, in Fortran you can change the shape of an array simply by calling a subprogram that refers to the array differently.

In ATS things CAN be done this way in typechecking.
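
A sketch, with layout and field names of my own, of records and multidimensional arrays as mere splittings of a flat byte array:

```python
import struct

# a 'record' { x: int32, y: int32 } stored as 8 raw bytes
rec = struct.pack("<ii", 7, 42)

# accessing a field is just splitting the byte array at a known offset
x_bytes, y_bytes = rec[:4], rec[4:]
x = struct.unpack("<i", x_bytes)[0]   # 7
y = struct.unpack("<i", y_bytes)[0]   # 42

# and a 2-D array is index arithmetic over the same flat storage
# (row-major here; Fortran itself is column-major)
def entry(flat, ncols, i, j):
    return flat[i * ncols + j]

m = [1, 2, 3,
     4, 5, 6]   # a 2x3 array, flattened
```

Changing `ncols` is the reshape-by-calling-convention trick the Fortran remark alludes to: the storage never moves, only the splitting changes.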

@chemoelectric I don't have the quote handy but Wirth's opinion of committees for language design sure seems valid :)

@troi He did like for everything to be understood by one individual.

A GNU/Linux system could be understood by one individual but that person would go insane.

Eric Raymond probably is not that person but was already insane, anyway.

@troi Ada is small enough you could understand it, if you were versed in the semantic details of it. But these do exist and I have no grasp of them. I simply assume they are thought out. Wirth might not have been able to fathom them, either.

I believe Hoare said of early Ada it was MOSTLY okay.

I have real problems with Wirth’s later languages. The Modula 2 libraries are a mess except for the ISO one. And CARDINAL is no name for ordinal numbers!

Oberon is definitely better. None of the ^ crap

@troi But there is no good Oberon compiler for GNU/Linux systems. There are compilers, but they are awful.

@troi I’m in the middle of incorporating Boehm GC as part of hiha, as a replacement for malloc/calloc/realloc. You can’t just use the system Boehm GC for that, it has to be compiled special, I think. So I have committed the Boehm GC 8.2.12 sources to my local hiha repo already.

I’m just tired of dealing with free without having ATS2 to tell me when to call free.

A viable alternative would be to start again in ATS2 and use linear types to tell me when to call free. This is less code.

@troi Being an old hand at writing Icon code, I am used to using characters themselves as the tokens for a parser. This is fine for recursive descent parsing, which is the usual method in Icon coding, but not so good for Pratt parsing. I could not think of a really good way to do that in Pratt parsing without at least pushing back tokens and backtracking and other unholy stuff.

So I thought, what about just do passes? And I thought, how about you just do it until it reaches a fixed point?

@troi In other words, until the output is the same as the input. There are no more changes in the token stream.

Now you can feed it to the parser proper that generates a parse tree.
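
The fixed-point pass could look like this in Python; the sample pass, which merges adjacent ':' '=' tokens into ':=', is purely my own illustration.

```python
def merge_pass(tokens):
    """One rewriting pass: merge ':' followed by '=' into ':='."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == ":" and tokens[i + 1] == "=":
            out.append(":=")
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def scan_to_fixed_point(tokens, one_pass):
    """Run the pass until its output equals its input."""
    while True:
        new = one_pass(tokens)
        if new == tokens:      # no more changes: fixed point reached
            return new
        tokens = new

# scan_to_fixed_point(list("x:=3"), merge_pass) -> ['x', ':=', '3']
```

Any number of such passes can be chained, each one run to its own fixed point, before the stream reaches the parser proper.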

@troi A Pratt parser does two things: it reads a token and treats it as a prefix operator or a literal, or it reads a token and treats it as a postfix or infix operator of some given numeric precedence. This should be adequate for a lexical scanner, if you do not demand too much of it and let it work in multiple passes.
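
And the Pratt core itself, in exactly the two-roles form just described, as a Python toy with a grammar and binding powers of my own choosing: a token is handled either as a prefix/literal or as an infix operator with a numeric precedence.

```python
INFIX_POWER = {"+": 10, "*": 20}   # toy binding powers

def parse(tokens):
    """Minimal Pratt parser over a token list, returning a tuple tree."""
    pos = 0

    def expr(min_power):
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "-":                      # prefix role: negation
            left = ("neg", expr(30))
        else:                               # literal role
            left = int(tok)
        while pos < len(tokens):            # infix role, by binding power
            op = tokens[pos]
            power = INFIX_POWER.get(op)
            if power is None or power <= min_power:
                break
            pos += 1
            left = (op, left, expr(power))  # left-associative
        return left

    return expr(0)

# parse(["1", "+", "2", "*", "3"]) -> ('+', 1, ('*', 2, 3))
```

That is the whole machine; everything else is a matter of which prefix and infix handlers you register, which is where the ā€˜orthogonality’ would come from.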
@troi The language is ā€œorthogonalā€ because the whole dang thing will do whatever the fƘKK you want and yet it will be made from almost nothing but this fricking Pratt parser.

@troi Yes, Wirth could understand his entire OS, including the compiler.

I could probably put something of the sort together, were I not disabled. OTOH then I would be employed and so not have the time or thoughtfulness. I would be a fricking idiot like most people are. Thank goodness I am a basket case.

@troi But of course I am also not a CS professor. :)

I am just a generalist hobbyist with an electrical engineering education. I had things come in a parcel today. They were small pieces of hardwood. Half of it was sappy black walnut. Half of it was pieces of light colored woods, perhaps some of it maple. I like maple, but I am not sure what they all are. Maybe some are birch, say.

My shoulder is getting better, so I have ordered small pieces of wood. I even have a strip of purpleheart or two.

@troi BTW I burnt my bridges when I got my EE education. The people in the department had had enough of the guy with bad OCD.

@troi To this day I have no idea how PAM works. I have not even bothered to try to understand how PAM works, mind you.

I understand that PAM can do something useful. I also understand that PAM does nothing useful that cannot be done without PAM.

When I was still working at CrossWorks Inc. (formerly Pioneer Blah Blah Blah) over 20 years ago, we had a Red Hat type trying to tell me how to set up my interface to something or another. He simply could not comprehend that my Slackware had no init.d.

@troi I kept saying, ā€˜You are telling me nonsense. Would you just let me write the necessary shell code?’ Because it was Slackware. It had BSD structure. It could be understood very simply. To get to multiuser mode, there was one script. Just one script. Not a bunch of scripts with different priority levels, like an init.d. Just one script, in rc.d. And it had to be edited.

I finally just went ahead and edited the script, and also the script for going to shutdown and the one for going to reboot.

@troi It’s a clumsy system, actually, but at least it could be understood.

I prefer OpenRC, the Gentoo system that is used by a few other distros. An increasing number, as distros jettison systemd because it sucks. But OpenRC is a simple init.d whereas systemd is a complicated variant.

@troi Of course the latest Slackwares might have systemd. I don’t tell Patrick Volkerding what to do. He lives clear across Minnesota from me, and the state is enormous! :) (I’m from New Jersey, after all, and even New Jersey seemed big to me, given I was from the narrow part of it. That old man’s neck. 20 miles to New Hope, Pa., by Route 518, 20 miles to Staten Island if you could fly.)