#BabelOfCode 2024
Week 4
Language: FORTRAN

Confidence level: High

PREV WEEK: https://mastodon.social/@mcc/113867584791780280
NEXT WEEK: https://mastodon.social/@mcc/113975448813565537
RULES: https://mastodon.social/@mcc/113676228091546556

I was very excited about doing TCL this week, but I told myself the first time I get a two-dimensional array problem I'd go FORTRAN, so I guess this week is FORTRAN.

A friend of mine who did AOC2024 in December noted the early challenges this year were *very* easy. Today's definitely is. I wonder if part 2 will have any depth.

I went into this thinking: C is basically cleaned up FORTRAN, right? I know C? This should be easy, right? Right off the bat I find there will be a lot of difficulties entirely not of the kind I'm used to in programming. After a brief adventure with accidentally naming my file .f and not .f90 causing horrific and baffling errors, I run a hello world off the Internet. There's a space before the printout. Hm, how do I turn that off?

https://stackoverflow.com/a/31236043

Oh my fuck, *what*?

How to get rid of unwanted spacing in Fortran's print output?

I get frustrated with C all the time for being fundamentally a 70s language. It may be I'm about to learn the pain of using a *50s language*.

(Alternately, I hear modern FORTRAN has all kinds of fancy niceties like operator overloading and might not resemble traditional FORTRAN all that much. But then I have the problem if I pick up a random tutorial it's hard to guess which *decade's* standard it's teaching me from, or if it's the GNU extension, if the GNU extension is that different, etc.)

Just learned FORTRAN has an aint() function

Don't that just beat all

I am getting a weird floaty feeling from FORTRAN. I do not at any point really understand what I'm doing, but I am having little to no problem doing any one particular thing. I keep doing Google searches and getting answers which I do not understand the syntax of (`write(error_unit, *) "String to write"`... wait, what?) but which work. I do not know if I'd be able to get anywhere with this language if I didn't have either a more experienced programmer or Google+Stack Overflow.

Here is my current program. At the moment, all it does is take a command line argument (a path) and attempt to open the specified file. I build it with `gfortran src/puzzle.f90 -std=f2023 -o program`

https://github.com/mcclure/aoc2024/blob/822e460f81b944c21ca675303b868c45b22a4c2b/04-01-wordsearch/src/puzzle.f90

I'm having two problems, one serious, one unserious.

The unserious problem: I want to abort if the # of arguments is bad. If I do "error stop", it prints a backtrace, which I didn't ask for. If I do "call abort", gfortran fails to link ("undefined reference to 'abort_'").

Here's my serious FORTRAN problem (code link in previous post):

The recommended GET_COMMAND_ARGUMENT function seems to assume you know the length of the argument ahead of time. Obviously, I don't.

https://gcc.gnu.org/onlinedocs/gfortran/GET_005fCOMMAND_005fARGUMENT.html

It allows me to pass in an `allocatable` string, but I believe it is leaving that string of length 0 if I do not ALLOCATE() it, and the fetched string is ''. Is there a way to get the length of a command-line argument before GET-ing it?

I see GET_COMMAND, but that's… awkward
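For what it's worth, the standard GET_COMMAND_ARGUMENT also takes an optional LENGTH argument, so one way to get the length before GET-ing the value is to call it twice. A minimal sketch, assuming a made-up module `arg_demo` and helper `get_arg` (neither name is from the linked puzzle.f90):

```fortran
module arg_demo
  implicit none
contains
  ! Hypothetical helper: return command-line argument number `idx` as a
  ! string of exactly the right length, or '' if it cannot be read.
  function get_arg(idx) result(arg)
    integer, intent(in) :: idx
    character(len=:), allocatable :: arg
    integer :: arg_len, stat

    ! First call: ask only for the length; no buffer is needed yet.
    call get_command_argument(idx, length=arg_len, status=stat)
    if (stat /= 0) then
      arg = ''
      return
    end if

    ! Second call: the buffer can now be allocated at the right size.
    allocate(character(len=arg_len) :: arg)
    call get_command_argument(idx, value=arg, status=stat)
    if (stat /= 0) arg = ''
  end function get_arg
end module arg_demo
```

A caller would `use arg_demo` and write something like `path = get_arg(1)`, where `path` is a `character(len=:), allocatable` variable. Nothing here depends on the 2023 auto-allocation behavior.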

…hm. I am concerned.

According to the FORTRAN working group

https://wg5-fortran.org/N2201-N2250/N2212.pdf

As of the 2023 standard, an un-allocated deferred-length variable may have its length set by calling intrinsic procedures; they give GET_COMMAND as an example specifically, and StackOverflow users assert GET_COMMAND_ARGUMENT is also included.

I don't get this behavior. My argument is being input as ''.

Do you think this means my code (linked above) is wrong, or that gfortran -std=f2023 is nonconformant?

Oh my hell lol, I sincerely believe I have found a bug (standard nonconformance) in GNU FORTRAN 14.2.0 and I have a repro case

https://github.com/mcclure/aoc2024/blob/b31be91adb5a0721f97e2ba8f145da4f36129753/04-01-wordsearch/src/puzzle.f90

Am I going to have to figure out how to report a bug on GNU. Geez. Is this going to be like the bureaucracy planet scene in Jupiter Ascending

Now that I have successfully figured out how to read an argument from the command line without knowing its length ahead of time, I am stuck on figuring out how to read a line of text from a file without knowing its length ahead of time. Apparently not an expected FORTRAN use case. I think the trick I was using before will not work, or at least, I *think*

read(10,"(a)",size=line_length,advance='NO') line_in

should read a line without advancing the filehandle and save the size. But it saves 0.
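One possible cause (a guess, since the declaration of `line_in` isn't shown in this post) is that a zero-length or unallocated variable gives READ nothing to fill, so SIZE= reports 0. A common workaround is to read into a fixed-length chunk and append until the end-of-record condition fires. A rough sketch, with the module name `line_demo`, the subroutine `read_line`, and the 256-character chunk size all being my own assumptions:

```fortran
module line_demo
  implicit none
contains
  ! Hypothetical helper: read one whole record (line) from `unit` into
  ! `line`, whatever its length. `ios` is 0 on success; at end of file
  ! is_iostat_end(ios) is true; anything else is a read error.
  subroutine read_line(unit, line, ios)
    integer, intent(in) :: unit
    character(len=:), allocatable, intent(out) :: line
    integer, intent(out) :: ios
    character(len=256) :: chunk   ! fixed, non-zero-length buffer
    integer :: got

    line = ''
    do
      ! Non-advancing read: fill `chunk`; `got` reports how many
      ! characters actually came from the file.
      read(unit, '(a)', advance='no', size=got, iostat=ios) chunk
      if (ios /= 0 .and. .not. is_iostat_eor(ios)) exit   ! EOF or error
      line = line // chunk(1:got)
      if (is_iostat_eor(ios)) then   ! reached the end of this record
        ios = 0
        exit
      end if
    end do
  end subroutine read_line
end module line_demo
```

Each call consumes one line, so calling it in a loop until `is_iostat_end(ios)` is true walks the file. Long lines just take more trips around the inner loop.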

This is the second time in the longform challenge I have found myself having to implement fscanf() [the other time was Forth]. Both languages do predate C, but I was hoping both would have, at some point in the last 55 years, realized "oh… people want to be able to input line-delimited text files" and added such a feature.

Incidentally, I am very comfortable usually with reading language specifications, but I got hold of a copy of the FORTRAN 2008 spec and… this is one of the least friendly language specifications I've ever seen, when reading it to try to determine how a program should be written.

(Trivia: When the image on the right says something like "the value must be YES or NO" what they mean is "the value must be 'YES' or 'NO'".)

FORTRAN 2008 spec 9.1:
"A file is composed of either a sequence of file storage units (9.3.5) or a sequence of records… A file composed of file storage units is called a stream file."

The definition of "file storage units" (9.3.5):
"A file storage unit is the basic unit of storage in a stream file or an unformatted record file."

The linked definition of "Stream file" (1.3.139)
A file composed of a sequence of file storage units (9.1)

YOU ASSHOLES, THIS DEFINITION IS CIRCULAR!

Like okay I understand what you are trying to do here is not commit to a particular byte length because you want to support 7-bit bytes, 8-bit bytes, and possibly UTF-16 characters or something, but YOU COULD HAVE *SAID SO* INSTEAD OF MAKING A LOOP OF 3 DEFINITIONS LINKING TO EACH OTHER

git discovery: you cannot check a file you do not have permissions to into a repository.

problem: my repo contains a directory of test files. this directory contains a second directory of "invalid" tests, i.e., tests that the program is *expected* to fail on (and if it doesn't fail cleanly that's an error). one of the "invalid" tests is a file the program doesn't have read permissions to.

I cannot check my "file with no read permissions" test case into the repository :(

Since automatic deferred length initialization is not working as the spec seems to require, I decided to just read the file in line by line.

This was much harder than I expected! The GNU docs do not well describe READ/WRITE, but do well describe the standard library functions. So I wrote around FGETC(), but then it turned out this is a GNU extension (or possibly was in FORTRAN 77 but not future FORTRANS?) so I had to rewrite to use normal READ in "Stream mode"… not in the GNU docs * _ *

(1/2)

I wound up slowly picking my way through the 2008 language spec, as previously mentioned not super readable. But I got it to work:

https://github.com/mcclure/aoc2024/blob/a8835dcd8ad0162268ef16708f9a7cd8b6d45958/04-01-wordsearch/src/puzzle.f90

Annoyed I had to spend all this time just doing file management, I decided to do the file management REALLY WELL. It's got unique error messages for different types of file handling errors and everything!

Now if only I could make gfortran suppress that darn `error stop` backtrace..

(2/2)

By the way, check out this vaguely hilarious section from the FORTRAN 2008 spec, where they state in the *most noncommittal terms possible* that a byte is *probably* 8 bits (they don't say "byte"; they don't like the word "byte"; the word "byte" appears exactly twice in the 621-page spec, both times in the C interoperability section) then have this long additional note falling all over themselves to apologize for the hubris of suggesting that a byte is probably 8 bits.

A reply I got on a previous post in this thread suggested that the reason the FORTRAN spec is so strict about using the word "file storage unit" instead of the normal terminology "byte" is that FORTRAN is from 1957 and literally predates the world at large adopting the vocabulary "byte" for the minimum addressable storage in a filesystem. Well, geez. That might really be it.

(Also note FORTRAN's commitment to line numbers is so great that even the English text of the spec has line numbers.)

An interesting thing about the gfortran compiler I've never seen elsewhere… when it wants to mark the position of an error on a line, it has a little ASCII art arrow like Rust or Clang. But instead of an arrow it uses a number 1, which lets it say "blahblah wrong at (1)". It takes a moment to get used to, but it's actually really smart. If you think about it it creates the potential for sentences like "found glarg at (1), but expected blarg because of (2)". rustc struggles with such sentences.

So the reason I picked this puzzle for FORTRAN was I heard FORTRAN's builtin multidimensional array types were nice. I probably won't be truly making deep use of those in the end, but I'm at least learning one thing from the experience: FORTRAN's builtins are *not* especially nice for *growable* arrays, regardless of dimension. You can *do* it, but it's not efficient or super friendly. I realized my growable implementation would be O(n^2) on file size, just to read the file into memory.

(1/2)

Fortunately, I can make everything easy again by just scanning the file to determine the size of the multidimensional array I need to allocate, seeking back to the beginning, then actually reading the array in. It probably sounds to you like I just made a very basic statement. But something to make clear: fseek() is actually considered a very advanced feature in FORTRAN, and was only added to the language in 2003 (in other words 46 years after the language was first specified)

(2/2)
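The scan, seek back, then read plan could look roughly like the sketch below. It is not the code from the linked repo: the module `grid_demo`, the subroutine `load_grid`, the column-major `grid(column, row)` layout, and the assumption of plain line-feed line endings are all mine. It opens the file in stream mode and uses REWIND as the "seek to the beginning" step.

```fortran
module grid_demo
  implicit none
contains
  ! Hypothetical helper: load a text file into grid(column, row).
  ! Assumes line-feed line endings; short rows are padded with blanks.
  subroutine load_grid(path, grid, ios)
    character(len=*), intent(in) :: path
    character(len=1), allocatable, intent(out) :: grid(:,:)
    integer, intent(out) :: ios
    character(len=1) :: ch
    integer :: u, width, height, x, y

    open(newunit=u, file=path, access='stream', form='unformatted', &
         action='read', status='old', iostat=ios)
    if (ios /= 0) return

    ! Pass 1: count the rows and find the widest row.
    width = 0; height = 0; x = 0
    do
      read(u, iostat=ios) ch
      if (ios /= 0) exit
      if (ch == achar(10)) then
        height = height + 1
        width = max(width, x)
        x = 0
      else
        x = x + 1
      end if
    end do
    if (.not. is_iostat_end(ios)) then   ! a real read error, not just EOF
      close(u)
      return
    end if
    if (x > 0) then                      ! last line had no trailing newline
      height = height + 1
      width = max(width, x)
    end if

    ! Pass 2: allocate once, go back to the start, and fill the array.
    allocate(grid(width, height))
    grid = ' '
    rewind(u)
    x = 0; y = 1
    do
      read(u, iostat=ios) ch
      if (ios /= 0) exit
      if (ch == achar(10)) then
        x = 0; y = y + 1
      else
        x = x + 1
        grid(x, y) = ch
      end if
    end do
    if (is_iostat_end(ios)) ios = 0      ! hitting EOF here is the normal exit
    close(u)
  end subroutine load_grid
end module grid_demo
```

Reading the file twice costs two linear passes, which avoids the repeated reallocation that made the growable version O(n^2).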

Decided to do a bit more on the FORTRAN project this morning… found myself running in circles for a decent bit trying to understand why code wasn't acting the way I expected. I eventually realized the problem was that when I wrote the expression `char_in == '\n'`, this was behaving differently than I expected because '\n' is a string containing a backslash followed by an n. Of course it is. Of course? Why would I have expected it to be anything else?
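For the record, the usual spellings of a line feed are the intrinsic `new_line('a')` or `achar(10)`. A tiny sketch, where the module `newline_demo` and both function names are made up for illustration:

```fortran
module newline_demo
  implicit none
contains
  ! True when `ch` is the newline character (what C spells '\n').
  logical function is_newline(ch)
    character(len=1), intent(in) :: ch
    is_newline = (ch == new_line('a'))
  end function is_newline

  ! Length of the literal '\n'. With gfortran's defaults (no -fbackslash)
  ! the backslash is an ordinary character, so this is a 2-character string.
  integer function backslash_n_length()
    backslash_n_length = len('\n')
  end function backslash_n_length
end module newline_demo
```

So `char_in == new_line('a')` is the comparison that does what `char_in == '\n'` looks like it should do.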
On a related note, shortly after this I was in a situation it would be convenient if I could cause a particular loop iteration to abort and resume from the beginning of the loop. I experimentally typed "continue". It was accepted. I run the code. It does not behave as I expected. I go to look it up.
(So there's no confusion: I think this is a fine feature for a language to have, I think every language should have a "no-op" keyword, it's basically the same as Python "pass" and exists for the same reason [as a target for otherwise-productive flow control] [GOTO labels must point to statements, so you need a way to write a statement with no effect])
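The keyword that does what C's `continue` does is `cycle`. A small side-by-side sketch (the module `loop_demo` and the function names are invented for the example): both functions walk 1..n and try to skip the even numbers, but only the `cycle` version actually skips anything.

```fortran
module loop_demo
  implicit none
contains
  ! Sum the odd numbers in 1..n, skipping the evens with `cycle`.
  integer function sum_odds_cycle(n) result(total)
    integer, intent(in) :: n
    integer :: i
    total = 0
    do i = 1, n
      if (mod(i, 2) == 0) cycle      ! jump straight to the next iteration
      total = total + i
    end do
  end function sum_odds_cycle

  ! The same loop with `continue`, which is a no-op statement:
  ! nothing is skipped, so every number in 1..n gets added.
  integer function sum_odds_continue(n) result(total)
    integer, intent(in) :: n
    integer :: i
    total = 0
    do i = 1, n
      if (mod(i, 2) == 0) continue   ! does nothing at all
      total = total + i
    end do
  end function sum_odds_continue
end module loop_demo
```

With n = 5 the first returns 9 (1 + 3 + 5) and the second returns 15 (1 + 2 + 3 + 4 + 5).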
@mcc I wonder if it could have also been useful as something that could have easily been replaced with a GOTO as a patch without having to redo a ton of paper tape or punch cards long ago.
@foobarsoft @mcc
paper tape would need to be repunched, but punchcards, indeed, just adding a card into the stack as a bug fix was quite convenient (no, I'm not very old, I just grew up in a backward place called USSR, and so I learned to program on a Soviet clone of IBM360/370)
@mcc RTFM would save so much stumbling in the dark. Why do you do this to yourself?
@hyc I have no idea what you are trying to communicate here. The screenshot above is literally a screenshot of the manual.
@mcc I mean read the language text thoroughly before trying to write it. Instead of guessing and looking up your mistake after the fact.
@hyc Typing a nine-letter word, identifying that the syntax highlighter didn't bold it which is suspicious, and running the program once to see what happens is not exactly a substantial investment of time.
@mcc you see, it continues executing the code as requested
@mcc Adding dynamic memory allocation to Fortran has been like adding type safety to C

@sklrmths What they came up with is not terrible. I'm not even willing to say it's worse than what we have in C.

Like, it would be *nice* to have realloc, but I've never felt *really* convinced that realloc in C actually works

@mcc back in the day you didn't have this luxury (of allocatable arrays). So you just read everything in a big static 1-dim array (typically called WORK :-)), and then passed it to a subroutine, which took it into a 2-dim array (I never needed anything higher dimension to input like this, the point is that one could then make it look like a dynamically allocated 2d array of the right dimensions).
@dimpase I guess if your program is fit to a particular data set, this is harmless
@mcc
do you mean fitting in "the RAM of the machine"? 🙂
Surely, multitasking OSs are a later invention than Fortran. In the olden days you just allocated (statically) however much RAM you could, and used it as you pleased. If your data won't fit into RAM, then it's a different, much harder, issue...
@mcc you could say that gfortran loves line numbers so much that it adds extra numbers to the middle of the line
@mcc I first saw this a few months ago while trying to get Helsinki MONSTER to build with a newer version of the VAX Pascal compiler on VMS. I wonder if it got to `gfortran` by way of DEC FORTRAN, which was probably the most widely used at one point.
@eschaton Hm, I don't think I know what "Helsinki MONSTER" refers to
@mcc @eschaton you see, Linux is the doctor's name
@mcc It’s an enhanced version of the MONSTER MUD, written in Pascal for VMS. https://web.archive.org/web/20221126161847/http://www.skrenta.com/monster/

@eschaton Oh, that's super neat.
@mcc If I'm understanding Wikipedia right, the first usage of the word "byte" was in 1956, literally just a year before. Knuth puts the year that byte standardized around being exactly eight bits as being as late as 1975... it's breaking my brain to think about.
@xgranade another really funny thing you'll find if you look carefully at wikipedia is that FORTRAN was designed in 1957, but the first FORTRAN program was not successfully compiled until 1958.
@mcc That makes way, way too much sense.
@xgranade It was of course, like, the second compiler. Ever. (Actually, although my memory is ALGOL beats FORTRAN as first compiled language, if it turns out the ALGOL team took longer to get to hello world on *implementing* the compiler then it might be once you work out the timelines you could argue ALGOL was the first compiled language but FORTRAN was the first compiler… I haven't done that math :P)
@mcc As much fun as I make of FORTRAN, and I do, it's really impressive that the second compiled language ever is still usable today, even if only just.
@xgranade Frankly, although some of the pain points are in *very* awkward places, I'm finding this a *lot* more usable than I was expecting. This is not the constant start and stop I had with Forth.
@mcc Maybe it's improved since I last used it, then... I kept having weird segfaults when using functions that didn't happen when rewriting as subroutines. I eventually figured out what was happening, but the compiler did *not* help with that. At all.
@xgranade You may have been writing more complex code than I yet have. I am still on attempting to load the contents of a file into a two-dimensional array.
@mcc Was for homework in a numerical relativity class, and for writing higher-order Runge–Kutta integrators.
@mcc (I won't mention the other time I used FORTRAN, namely to call into code generated by a numerical relativity DSL.)

@xgranade @mcc *has flashback to Runge-Kutta-Gill integrators in programming for Chemical Engineering*

😖

@xgranade @mcc One of those wild bits of trivia about C++ is that there are parts of the spec that don't assume a byte is 8 bits, because it's old enough to have cared about targets where that isn't true.

I want to say it's around C++, like, 20? Where they introduced some modifiers on string literals to let you be explicit about the encoding, because otherwise encoding is an implementation detail and that means your strings might get jammed into 7-bit representations on some compilation targets.

@mark @mcc I mean, I've worked on embedded systems with 18-bit integer representations, so it makes a lot of sense.

Just still wild to think about, somehow.

@xgranade @mcc ... unless you're under NDA, I must know more. 😄
@mark @mcc It's all in https://arxiv.org/abs/2206.12950, but the rough version was that it was a really cool prototype where 18 bits was what folks could eke out.
Advancing Hybrid Quantum-Classical Computation with Real-Time Execution
@mark @xgranade I once heard someone claim that the reason C++ took so long to standardize that, and so long to remove trigraphs, was IBM was still trying to support a very small number of SERIOUSLY old mainframe installations
@mcc @mark @xgranade For C DSPs were one of the big holdouts with unusual byte sizes AIUI - the users didn’t really care about data not represented as a word so no point implementing the hardware to handle that efficiently.
@xgranade @mcc 8-bit bytes were introduced with the 360 in 1964. Obviously it took some time for it to catch on universally after that, so 70's seems like a reasonable guess.

@mcc apparently the term "byte" was coined in 1956, so. yeah that's entirely plausible

on the other hand you are reading the specification for fortran 2008, which was approved in 2010, not the original fortran specification drafted in 1954

we knew what bytes are in 2008

@00dani sometimes organizational momentum is hard to reverse
@mcc @00dani I can believe that in a programming language spec there are so many places that reference a term like that that changing it would be a bunch of work.
@mcc @00dani holy moly, fortran is 10th on TIOBE. i was not expecting it to be beating rust, swift, assembly, php and ruby. i need to reset my image of how many programmers are doing scientific computing
@leon @00dani Based on my experience so far, I would not intentionally use FORTRAN again, but also if I wound up in a situation where I joined a project and they told me "yeah, we're already using FORTRAN for this", I would not throw a tantrum.

@mcc @leon @00dani I wouldn't use anything earlier than F2008; F2018 is pretty well supported by now.

Depending on the application I might start a greenfield project in modern Fortran. For me the biggest problem is the absolute lack of a standard library; I have a set of trusted third-party libraries I use for logging, CLI option processing, unit testing, and common file formats, but if I need to sort anything or do any sophisticated parsing I'm SOL. The C interface is standardized and I've had success linking to sqlite and Lua, but I'd rather avoid mixed language programming if I could avoid it.

Still, it's difficult to find a fast low magic language with decent array operations, vectorizable (elemental) functions, and easy-to-assure static binaries.

There really isn't a good language for writing dependable engineering code. Fortran does fast numerics but lacks domain-specific safety features like units of measure, constrained types (from Ada), contracts (from Eiffel), and canonical physical constants (e.g. CODATA). C++ is a swirling vortex of incomprehensible and gratuitous abstraction, Rust is single vendor with the-code-is-the-spec disease, and Ada is, well, Ada. Fortran lacks a lot of the basic features we expect from a programming language (the aforementioned lack of a standard library) and the devotion to backwards compatibility necessitates it being handcuffed to 1956 design decisions. Otherwise it's a straightforward and fairly simple and fast language that does a decent job on a certain class of problems and is uniformly poor at some critical tasks, mainly input processing. Expect that 25-30% of your Fortran application will be devoted to input processing and validation and all that parsing and checking will be a tedious pain in the ass. :/

@arclight @mcc @leon @00dani that 25-30% number is not unique to fortran and i'm hoping to improve it with my phd work
@arclight @mcc @leon @00dani take a look at Julia. It's idiosyncratic, but I think the matrix support is good, and it's not hard to write parsers. (Still miss scanf, though. Didn't appreciate it until I didn't have it.)
@00dani @mcc Yeah but some people were still running PDP-10s in 2008, so...
@00dani @mcc Beware, though -- this is one of many technical terms whose meaning has drifted over time. It was originally defined to mean any set of contiguous bits in a word (in word-addressable machines); this usage survived as late as the "byte instructions" of the PDP-10, which operated on contiguous bitfields within that computer's 36-bit words.

@00dani @mcc Early on in my career i got into a _monstrous_ (and foolish) argument about a byte having 8 bits.

Anyway, in Common Lisp a byte is a variable sized range of contiguous bits within an arbitrary sized integer. See DPB, for deposit byte, https://lisp-docs.github.io/cl-language-reference/chap-12/bc-c-dictionary/dpb_function (and LDB); named after... a PDP (8?) instruction of course.

@mcc I wonder if that's a FORTRAN thing or an IBM thing? It's been an age since I read Mythical Man Month, but I remember Fred Brooks mentioning one of the innovations at IBM was to keep documentation in binders so that individual pages could be hot-swapped as errata were published. I mean, nowadays we use word processors and cloud collaboration tools, but I'm sure that was hot shit back in the '70s.

I bet line numbers would have made it a lot easier to publish errata that said stuff like "Lines 305-335 on page 69 are OBSOLETE. Replace with the following:"