#BabelOfCode 2024
Week 6
Language: Nameless experimental LISP

Confidence level: High

PREV WEEK: https://mastodon.social/@mcc/113975448813565537
NEXT WEEK: https://mastodon.social/@mcc/114433465965880352
RULES: https://mastodon.social/@mcc/113676228091546556

Okay… here things get weird!

My project to, gradually over the course of 2025, do each puzzle from Advent of Code 2024 in a different programming language I've never used before, has stalled out for exactly two months now as I've instead been creating…

…the programming language I'm going to use this week!

For "Week 5" I did TCL, and it kind of set my brain on fire with "wait… you can just DO that?".

In TCL, strings are anonymous functions and vice versa. In my old Emily language, "if" and "while" blocks were implemented by passing in lambdas; but in TCL you do the same thing by *passing in a string containing {the code to execute}*. I started trying to imagine what "hygienic" TCL would be; if TCL is a C macro, I want an ML or Rust macro. I tried to imagine something closer to LISP than TCL.

This is what I came up with; I think of it as "0-lisp". In LISP a "2-LISP" is a LISP where functions and variables live in different namespaces, and in a "1-LISP" they live in one namespace. Projecting backward, in my 0-LISP, the distinction itself disappears totally; functions are lists and *any list can be executed*. I wanted to know what this changed. Here's what I found:

It helps very little, and makes certain important things a huge pain. Oops! Still, it was very educational to learn this.

Anyway, now I have a LISP interpreter written in Rust.

https://github.com/mcclure/lisp0-experiment/tree/2025-04-07

It's a very minimal "MVP"; the language currently lacks:

- Variable scopes
- Loops
- Conditionals
- Floating point numbers
- Errors lack backtraces or line numbers (and cannot be recovered from)

It has

- One global scope
- Ints, strings, arrays and hashtables
- Tail recursion
- File I/O

But math and recursion means it's Turing complete. Which means it's sufficient to use it for a AOC problem!

GitHub - mcclure/lisp0-experiment at 2025-04-07

???? don't look at this. Contribute to mcclure/lisp0-experiment development by creating an account on GitHub.

GitHub

Now, since *I'm* the one writing this LISP, I do get to make some changes. The main thing I wanted to change, as it's the #1 thing that infuriates me writing LISP, is parentheses. This "LISP" has a syntax where each line (as separated by newlines or commas) implicitly has () around it. Meanwhile, {} and [] are shorthand, for

{x,y,z} => '((x)(y)(z))

[x,y,z] => (make-array [(x),(y),(z)])

In other words {} is functions (quoted lists) and [] is array literals.

Is this still a LISP? I don't care!

Before I can start, I need to make some utilities. By "utilities" I mean "if" and "while" statements, which again, this language currently lacks. MVP, remember. Implementing these in userland is possible! It's *ugly*, though. For `if`, I simulate branching by putting the two possible outcomes (functions) into an array, casting the condition to a bool and then the bool to an int, and using that as an index to the array.

"while" is even more nightmarish. I don't… understand the use of make-quote.

Like, I wrote the code, but I don't understand it. I initially wrote it with all the `make-quote`s being `return`s, and that didn't work, so I experimentally replaced them with make-quotes and it did work. I need to reread the interpreter code to really understand why that was required.

If even the designer can't understand the model here, the model is probably too complicated for anyone ELSE to write self-modifying code either! That's a sign of a problem, or better abstractions needed.

Anyway, I like testing things before I use them, so after implementing my "if" and "while" I decide to write a simple Fizzbuzz program. This causes me to immediately realize—

I FORGOT TO IMPLEMENT > AND <

I just forgot!! Fortunately I *did* include bit arithmetic ops, so I implement <, >, <=, >= in userland as well, by testing bit (1<<63) as a proxy for 2's compliment negative. This is actually a little worrisome. I'm not sure if this code truly "works" or if I'm just leveraging UB in Rust.

So after a prelude containing reimplementations of if, while, int comparisons, and "noop", I am now ready to implement "fizzbuzz". I do not need an implementation of "fizzbuzz", I'm supposed to be doing AOC2024 day 6. But at least I know my helper routines work!

Fizzbuzz source:

https://github.com/mcclure/aoc2024/blob/6d12e9cf65158b5f511da8545d263f4ed1dcad99/06-01-guard/src/test-fizzbuzz.l0

I was GOING to implement a `proc` that modified function code in-place to add local variable support, but debugging `while` took a lot out of me, so I think I will not do that for this project.

aoc2024/06-01-guard/src/test-fizzbuzz.l0 at 6d12e9cf65158b5f511da8545d263f4ed1dcad99 · mcclure/aoc2024

Advent of Code 2024 challenge (laid-back/"babel" version) - mcclure/aoc2024

GitHub

Got part 1 of the puzzle working this morning!

https://github.com/mcclure/aoc2024/blob/2656c75da28669ad11568fae6f5f3a9c24a311a6/06-01-guard/src/puzzle.l0

App code "proper" starts around line 124 (43-123 are "library code" I didn't post above; I implemented "for" and "switch", and a very small vector math library.)

Writing this program was not at all hard (other than the library code… `for` was painful), but nor was it particularly *pleasant*; it didn't feel fluid. That might speak poorly for my minimal LISP, and the question of whether I use it again after this project.

aoc2024/06-01-guard/src/puzzle.l0 at 2656c75da28669ad11568fae6f5f3a9c24a311a6 · mcclure/aoc2024

Advent of Code 2024 challenge (laid-back/"babel" version) - mcclure/aoc2024

GitHub
In particular, what worries me is the "hard part" of this code turned out to be the higher order functions (if, while, for, foreach, proc). This LISP is contrived compared to other LISPs, with no special "forms", and no sugar other than the {} and [] which are supposed to be swiss army knives you can construct anything else you need from. I did it this way to make self-modifying / generated-at-runtime code easy & fluid. *But that's exactly what turned out to be hardest*. And *that's* a bad sign!

Making this little LISP was *in part* a "can I do this?" for the Babel project. But there's another thing, which is I want a general purpose assembler. I've recently started wanting to target some unusual/retro systems (NES, N64, Saturn) and I forsee frustration with having no common toolchain among these projects for any language I like, nor a common macroassembler syntax I can take with me across different ISAs.

So I was hoping my tiny LISP could be reworked into a macroassembler.

I've realized something:

mov rax, r10
add r8, 1
jmp _main_loop_reset

Assembly looks a lot like LISP once you remove the parentheses! So I wanted to design an ASM syntax that exists as a data structure within a LISP, and write code as a LISP program that generates that data structure. In other words make a DSL similar to "Amaranth" or my old "No Compiler" blog posts https://msm.runhello.com/p/category/plt/no-compiler . I may be just confusing you at this point— I'm not sure I can describe this idea clearly in 500 chars.

Run Hello » No Compiler

Anyway, before I move on to Week 6 Part 2, I'm considering going back and making changes to the interpreter. I told myself whatever I had written as of 04-07, I'd stick with it— I could fix bugs but not language design issues. Well, I hit some stuff straddling the line:

- lack of "fail" (exit with error message) is m i s e r a b l e, the second half of my program is all `if (!abort)` every few lines
- "the return problem"

…Oh geez. i can't describe the return problem in 500 characters either.

The point of my "no parentheses"/l0 syntax was each independent line has an implicit () around it— each line is a function call. This creates a problem: How do you form an expression that returns a value without executing code? I foresaw this during design and added a function named "return" that takes one argument and returns it. So a fn def looks like

{
set 'result
return result
}

Kludgy but readable. *But I forgot about lists!* So all lists turn out an ugly traffic jam of `return`s:

So the obvious fix for *this* is to have the interpreter simply treat any single-symbol line as a value, not a function call. But NOW I have to introduce a *separate* syntax for no-argument function calls! (emily had a higher order function named "do" for this), and I risk sneaky bugs if I ever forget to use that syntax on some given line. Gradually I'm losing the elegant simplicity that inspired me to base this language on the idea of LISPness rather than resurrecting Emily or something. Messy.
Anyway, the GOOD news from part 1 is that I found a bug in my garbage collector! That's not a joke or sarcasm, I am legitimately excited I'm now exercising my GC enough that it's possible to find bugs in it. If you run my part 1 solution in verbose-debug mode it crashes at some random point partway through execution, which I haven't fully investigated but to me screams "GC bug". So now I get to do more work on my GC (my favorite part of language implementation).

So that flows well into this next post…
I tried out "part 2" last night. Part 2 involves rerunning the simulation from part 1, but repeatedly stopping it and backing it up to test untaken paths.

That GC bug I was excited about? Yeah, it turns out my part 2 solution stresses my GC enough I get the GC bug *just running the basic program*. Every time I run, I get a random failure in a random place after memory corrupts (usually by randomly scrambing which object which pointer points to).

"LOL"

I mentioned before I was looking forward to fixing this GC bug, but that doesn't mean I want to fix it *now*.

Guessing the bug happens after a random number of GCs, I tried increasing the GC space size (by default it's 56 MB, and doubles when a collection ends with the space more than half full— which in this program never happens), to increase the time between GCs. Oddly, this *also* caused the program to survive for more GCs without crashing. But I couldn't find a size it didn't crash.

So yesterday got… a little silly.

Let me explain the puzzle.

The idea is a guard (▲) is walking a room with various obstructions (#) [📎 1]. They turn right every time they hit an obstruction, and continue until they reach the room border and "escape" [📎 2]. My goal in part two (SPOILERS!!!!) is to find all the places I can insert a new obstruction (X) which would cause the guard to enter a "loop" and prevent them from escaping [📎 3].

Interestingly, this problem is "embarrassingly parallel".

The "walk" must be done linearly, but in a setup where you can multithread, you could split up the search space for wall insertion into chunks and have each chunk be tested separately. My handmade language can't multithread; it can't even single-thread, insofar as it crashes after a certain amount of time. But the solution is the same! I simply ran the program 14 times, by hand, each time checking only a few hundred of the 5,950 possible wall insertion positions, bailing out before it crashes.
Add together the outputs from the 14 runs, and I have an answer! My answer's wrong. It's too high. I instantly realize why: I find wall candidate positions by looking "one step ahead" of the guard at each of the 5,950 steps. This creates a potential for double counting if the guard twice approaches the same square from different directions. Good thing I thought ahead and added proper file I/O! I rewrite the script to output every wall location to a file as it goes and rerun it (again in chunks)…

…then deduplicate the results using, of course, a small script written in my experimental LISP:

https://github.com/mcclure/aoc2024/blob/stable/06-02-guard/src/dedup.l0

(I could have just used UNIX uniq, but that's! Not! The point! Of this! Exercise!)

It's still too high!!! And now I'm in trouble. After double checking I conclude the problem isn't my weird multisegmented run process; there's either a very fundamental bug or I've misread the prompt.

My LISP isn't unpleasant to write, but it's not well instrumented for intricate debugging :(

aoc2024/06-02-guard/src/dedup.l0 at stable · mcclure/aoc2024

Advent of Code 2024 challenge (laid-back/"babel" version) - mcclure/aoc2024

GitHub
Today went back and carefully reread my [aoc 2024, day 6, part 2] source looking for problems. Discovered there was a line where I believed— had from the beginning intended!— for it to do test A, but in fact it was doing slightly different test B which would produce bad results. The fact I didn't notice this until now emphasizes the importance of languages which encourage clear, readable code! However even with the fix I am still delivering the wrong answer.

This might be the worst I've ever done on an AOC problem (grading by number of unsuccessful submissions)! The site is no longer giving me too high/too low hints, I guess to discourage binary search.

I think the only way I can proceed is to rewrite my code to be "dumber" ( https://mastodon.social/@mcc/114144927522971157 ) — right now I have some clever optimizations, but those optimizations are potential sites for bugs. Increasingly worried about potential weasel wording in the problem statement https://mastodon.social/@mcc/114332477221007074

Okay. So.

- I rewrote my search program to be less optimized in terms of time and more optimized in terms of memory (eg, creating less garbage).

https://github.com/mcclure/aoc2024/blob/4fd6939cbac0a024ed0b5ea85670311688d3fb6a/06-02-guard/src/puzzle-exhaustive.l0#L383

- In this less optimized state, it took nearly all day to run— like, six hours (but despite doing more work, only crashed and had to be restarted 8 times… so that part worked… sorta… I guess…)

- The "dumber" version produced the correct answer on the first try.

I'm not sure what, if anything, I've learned here.

aoc2024/06-02-guard/src/puzzle-exhaustive.l0 at 4fd6939cbac0a024ed0b5ea85670311688d3fb6a · mcclure/aoc2024

Advent of Code 2024 challenge (laid-back/"babel" version) - mcclure/aoc2024

GitHub
My original implementation was very smart about avoiding unnecessary work; it only checked the 5,930 "one step ahead" squares, on the logic a wall that never crosses the guard's path can't affect the guard's behavior, and when speculatively placing a wall it would only run the simulation from the moment the guard first reached that square on. Somewhere in that "smart" logic hid a bug. By contrast, the version that eventually worked ran the entire simulation start to finish 16,900 times.

However I don't really get to go "there's a moral here!" because the brute force placement method only occurred to me *after* I'd implemented the "smart" time-rewinding method. So I didn't do it the smart/broken way because I prematurely optimized; I did it because I didn't want to rewrite the program. But then I had to do that anyway. Oh well!

Anyway, the second half of this was miserable, but I think I've got some useful feedback (from myself) about how to proceed with the "tiny LISP".

Most of the issues I found [in the language] were obvious or things I'd already planned to fix. (Errors without line numbers suck; programming without locals is a pain. Well, obviously.) That "macros" made by constructing lists by-append would turn out to be so painful… that was more of a surprise. I think I have a clearer idea of what I want function declaration to look like now (the plans I had before were too naive). Getting a good repro case that activates the bugs in my GC *is* a blessing.
This language may show up again in this post series, once I change it enough it constitutes a "different language". Before that though I want to do weeks with textual WASM (to get ideas about LISP-flavored assembly), Dylan & Haskell (to get ideas about multimethods) and Fennel (so I can try out HoneyLISP https://freeradical.zone/@suetanvil/114314911972830726). Alternately, the next version may just be a self-hosted macroassembler! I also hope to do an Emily 2 week, so my occasional "Emily" references won't just be incoherent.
Chris [list of emoji] (@suetanvil@freeradical.zone)

@mcc@mastodon.social Did I already mention Honeylisp? It does exactly that, but for the Apple 2. On the off chance that I'd planned to mention it but forgot, here it is: https://git.information-superhighway.net/SpindleyQ/honeylisp

Free Radical

As followup, I'm adding now to the language everything I missed while writing my test program.

This means adding > < >= <=, which means writing test cases for string comparison, which means an excuse to come up with cute Hanzi testcases…

Fun fact: Did you know that in Unicode the Chinese numerals are not actually sorted in lexicographic order?? Or any apparent order at all?

(一 < 四)
but
(四 > 五)

So one of the unusual solutions I have created to one of the unusual problems my new language has is that every array, every array ever created by the language, now remembers what line of code it was created on forever¹, because the language doesn't yet have an a priori way of knowing which arrays might later become functions and it *might* be needed for error messages

¹ Oddly, due to another questionable decision I made, this doesn't result in any additional memory usage

Post-field-test updates:

Finished:
- Line numbers in error messages
- do (call unary function) apply (call function given car element and cdr array) and if (if) builtins now exist
- Lone symbols on a line now return the value instead of calling it as a function

Fizzbuzz is now MUCH more readable and doesn't have to define any library functions except `while`:

https://github.com/mcclure/lisp0-experiment/blob/3b7a6676aca477fe529680fe3e008be24308c720/sample/fizzbuzz.l0

TODO:
- `proc` (maybe `fun`) to create functions with named args and locals
- builtin `while`/`for`/`foreach`

lisp0-experiment/sample/fizzbuzz.l0 at 3b7a6676aca477fe529680fe3e008be24308c720 · mcclure/lisp0-experiment

???? don't look at this. Contribute to mcclure/lisp0-experiment development by creating an account on GitHub.

GitHub

Final update for this thread:

So based on the field test of this AOC, the Nameless Experimental LISP has been improved to a point I'm no longer embarrassed publishing it and "released", as open source, on Codeberg:

https://codeberg.org/mcc/nameless-experimental-lisp/src/branch/2025-04-26#readme

I also documented it, pretty extensively, the documentation .md is now the longest single file in the project.

The main new thing is { } lambdas no longer accept arguments, and there's a `fn` and `set-fn` functions that upgrade a { } to have args/locals.

Making sure you're not a bot!

Some fun benefits of the refactor:
- "do" ("create a function and immediately call it") now can be used EITHER as a feral flow control construct OR to inject variables into a block's scope (my `for` impl uses this)
- I seem to have fixed the GC crash by accident?! That's terrifying?

The AOC answers, upgraded to the new variant (they're sample code!) are now part 1:

https://codeberg.org/mcc/nameless-experimental-lisp/src/commit/36435712e43afcde5bc85e74341928e5cbf3d12c/sample/aoc/2024-day6-part1.l0

And part 2, although I think there's a bug:
https://codeberg.org/mcc/nameless-experimental-lisp/src/commit/36435712e43afcde5bc85e74341928e5cbf3d12c/sample/aoc/2024-day6-part2.l0

And I find them much more readable!

Making sure you're not a bot!

@mcc There is a specific Unicode character property for this, although I'm sure you already knew this.

Specifically, the nv property that is the numeric value of a character which represents a number: https://en.wikipedia.org/wiki/Unicode_character_property

Unicode character property - Wikipedia

@loke I'm going with codepoint/lexicographic ordering because I'm inheriting the Rust implementation, and I think it's the correct approach since afaik there are multiple different "content aware" Unicode sorts so really none of them should be treated as privileged.
@loke If you want a unicode ordering that should be something you specify longform, not in the short-form > < operators, and probably that code should live in a library.

@mcc Yeah, I'd never actually suggest it's a good idea. In fact, I'm not sure the nv property can ever be really useful for much either.

No "smart" ordering of Unicode characters is ever going to be correct (they can't be, because of conflicting expectations).

@loke @mcc Like the delightful Scandawegian characters "å", "ä" and "ö". Which in Denmark and Norway are in the order "ä ö å" (thus nicely mapping to "{ | }", for those who want to do pre-Latin-1 archaeology), and in Sweden as "å ä ö" (and thus not allowing Swedish to be sorted correctly when sorted ASCIIbetically).
@mcc Are you talking about in code point order, or according to the Unicode Collation Algorithm?
@unlambda code point order, which is what I would consider [byte] lexicographic order.
@mcc Oh, also, turns out that for Han ideographs, Unicode collation order is the same as code point order, they don't have explicit collation weights: https://www.unicode.org/reports/tr10/#Implicit_Weights
UTS #10: Unicode Collation Algorithm

@mcc

Dylan’s multimethods are very pretty. To have a random deep cut, REBOL desperately needed multimethods. It could’ve worked, and it would’ve been beautiful.

@mcc about locals - isn't it solvable by macros? lambda that parses the argument list and pushes to a linked list of environments
@mcc the most natural thing would be dynamic scoping - this would only require saving globals with the argument names to the stack and restoring them after. then other variables won't be closed (as in closures) but would come from the scope of the calling function. somewhat like file descriptors, or fp flags, or formatting directives in c++ streams, or logging handles, etc. lexical scoping would require walking the code and converting all get/set operations of external variables to use the explicit stack. it's also possible to do very different conventions, e.g. forth-style explicit stack handling.

@Pashhur There are two problems with solving it with a macro:

- If I place any extra code at the end of the function-- for example, setting the locals to previous values, or to nil-- then I will break any tail recursion the function performs.

- I don't have any builtins for environment manipulation; if I did I'd want to choose them very carefully, because part of the point of this language is to be less aggressively dynamic than my previous language attempts, so I can easily write a compiler.

@mcc drunk driving does let people get to work on time sort of vibe