#BabelOfCode 2024
Week 7
Language: Haskell

Confidence level: Medium low

PREV WEEK: https://mastodon.social/@mcc/114308850669653826
NEXT WEEK: https://mastodon.social/@mcc/114463342416949024
RULES: https://mastodon.social/@mcc/113676228091546556

I was going to do Fennel this week, but then I looked at the problem and thought "this is ideal for Haskell's 'amb'". I have been looking for an excuse to use Haskell's "amb" for 25 years. So Haskell.

I have tried to learn Haskell 3 times now and failed. This "Babel of Code" thing was originally, in part, an excuse to do Haskell.

I am not sure whether the reason I previously failed Haskell is

1. Because it's actually hard
2. Because of a mental block caused by failing at it more than once already
3. Because Haskell users are really bad at explaining things

I think it's a little 2 and mostly 3. I *love* ML, I know two MLs (3 if you count Rust) plus have in the past written my own ML. I understand the parts of Haskell that are just ML and get lost whenever I hit "do"— the point of divergence from ML; the dreaded Monad.

Question: The Haskell 2010 documentation describes its basic I/O functions, somewhat ambiguously, as "character oriented". I assume this means "ASCII character oriented". Is there a way in Haskell to get equivalents of getChar, putChar, string operations etc which are *UTF-8 character* oriented? I don't need graphemes, I'm happy with codepoint resolution.
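(Partially answering myself after some doc reading, hedged because this is my understanding and not verified against the Report: Char is already a full Unicode codepoint in GHC, and the handle's TextEncoding controls how bytes are decoded, so getChar/putChar become codepoint-oriented once the encoding is UTF-8. A sketch, assuming GHC's base:)

```haskell
import System.IO

main :: IO ()
main = do
  hSetEncoding stdout utf8  -- decode/encode the handle as UTF-8
  putStrLn "héllo ☃"        -- each Char here is one codepoint
  print (length "héllo")    -- 5: length counts codepoints, not bytes
```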

In the Haskell docs

https://wiki.haskell.org/Haskell_in_5_steps

it states that this is how you build and run a Haskell program.

Setting aside that I can drop -threaded, is this actually the easiest/correct way to build a Haskell program in the year 2025?


I run the given ghc --make command. It leaves some crap in src/. Say I do not want intermediate files in my source tree; I would like them moved to bin/ or obj/ or something, or simply not retained. Is this possible, or is Haskell in 2025 simply a "leaves crap in src/" kind of language?

I found -no-keep-hi-files and -no-keep-o-files (despite them technically not being documented) but say I want to retain them, just in a place of my choosing.

Welp, after 25+ years of trying, I have written my first working Haskell program. It reads one line from stdin and then prints it back out. I have now finally used a "monad", although I still don't feel I know what one ~is~.
I am trying to switch my program to Cabal-driven builds. This is one of the questions it asks you when you run `cabal init`. I think I understand what it is about the Haskell community that led them to do it this way, but in my opinion, this is bad user experience. If the newest version of the Cabal format isn't the recommended one, then why did you release it at all?

I am attempting to call "openFile" on the first command line argument in Haskell¹. It doesn't like it.

I'm not sure I'm looking at the right docs. I searched Google for "haskell system.io" and got https://hackage.haskell.org/package/base-4.21.0.0/docs/System-IO.html#v:openFile . I don't know if this is the newest Haskell2010 or if $GOOG is confused.

The doc (every doc I find) claims the type of openFile is FilePath -> IOMode -> IO Handle. But hls on my computer seems to think it's FilePath -> IOMode -> Bool -> IO Handle. Am I missing something?


Answer to my previous question: I naively took the first suggestion from hls and imported GHC primitives where I should have imported System.IO. Cool. Works now.

Haskell people, please help me.
There are 3 image attachments to this post, showing the same code block but with different amounts of indentation.

The first code block works,
the second block does not work,
the third one REALLY does not work.

According to my editor, none of these blocks of code contains tabs.

Haskell appears (?) to treat three spaces, four spaces, and eight spaces radically differently.

I dislike multiple-of-3 indents.

What do I need to read to understand what I am missing?

Another cursed question.

See attachment 1. This code compiles.

Reading the documentation ( https://hackage.haskell.org/package/megaparsec-9.7.0/docs/Text-Megaparsec-Char.html#v:space ), I realize I do not want space but "space1" (see attachment 2).

I change the symbol "L.space" to "L.space1". No!! Says GHC. L does *not* export space1!! only space!!

But the documentation says it exports space1?


My problem can be explained if when I put dependency "megaparsec ^>=9.7.0" in my cabal file it picked like version 5 or 6 or something.

Is there a way to get cabal to print out for me what version it actually chose of each solved dependency? In npm or Rust for example I would consult the lock file.

I never solved the space1 problem but worked around it with a solution from @dysfun . I now have three new questions.

1. In attachment 1, why is "return" not required on L.decimal? I originally wrote "return" and it gave me a hint saying I could remove it, and removing it works. But return *is* required on (lsum, nums)?

2. In attachment 2: If att. 1 is allowed, why is this not allowed? It gives "parse error (possibly incorrect indentation or mismatched brackets)" on the _. Wrong type syntax?

I went from OCaml to Rust, and coming back to Functional Land, one thing I'm really noticing is just *how much fricking better* Rust's error messages are than OCaml's, and consequently, how much fricking better they are than Haskell's. One thing is: since Rust has less extensive type inference, you get way less "spooky action at a distance" in Rust than in OCaml/Haskell, and thus errors tend to actually be marked at the site where they really occur.
I've been having extensive problems in my program using the symbol "spaceChar" exported from Megaparsec, because if I use spaceChar in an expression, but the *variable to which the expression using spaceChar is assigned* is unused, everything breaks (and the error is inscrutable). OCaml is a spooky language, but I never saw anything THIS spooky happen. Writing normal idiomatic code, too much winds up implicit, and the compiler cannot even explain to me what it is that it doesn't understand.

One more question (I think this question might be outright goofy).

Is there a specific syntax for "calling a monad of a different type" from the current monad?

I have constructed a megaparsec-monad combinator that parses my pattern. I've made an IO-monad function that reads lines one at a time. If I call the megaparsec combinator I made inside my IO monad, I get a confusing error.

The megaparsec tutorial implies a combinator can act like a function that takes strings: https://markkarpov.com/tutorial/megaparsec.html#forcing-consumption-of-input-with-eof


Haskell Programmers Will Literally Write Multiparagraph Comments Instead Of Just Giving The Parameter A Name Longer Than One Letter

So the answer to my last question was to use "parse"/"runParser" (aliases for 1 function) from megaparsec. Great.

It's not working and I think the problem is I don't understand destructuring. I want the equivalent of Rust

let Some(x) = f() else { panic!("Not found!"); }

I *think* I'm getting an Either, and I need to match its cases. But the way I know how to do that is "case…of". And the arrows in that "point the wrong way"?? Compare this similar attempt to unpack a list into its 1 item:
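(For posterity, the shape I eventually understood: matching an Either's two cases with case…of. A hedged sketch using a plain Either String Int rather than megaparsec's actual error type:)

```haskell
-- Unpack an Either by matching both constructors.
-- (Illustrative types; megaparsec's Left carries a ParseErrorBundle.)
describe :: Either String Int -> String
describe e = case e of
  Left err -> "parse failed: " ++ err  -- the failure case
  Right n  -> "parsed: " ++ show n     -- the success case
```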

Alright. Thanks for the explanations y'all. I am now correctly parsing my input file. Current source:

https://github.com/mcclure/aoc2024/blob/d23ee9139e6645022504fce2dc26f39601e66933/07-01-operations/app/Puzzle.hs

I STILL can't figure out how to make the match on line 43 give a more human-readable error (on the list destructure in the command line argument processing) than "user error (Pattern match failure in 'do' block at app/Puzzle.hs:43:5-12)", but since I'm the only one running this I guess this only matters for "this hurts my sense of professionalism!!" reasons.
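(For the record, the shape I'd expect the fix to take: match the argument list explicitly instead of destructuring in the do-block, so the failure message is mine. A hedged sketch; inputPath is a hypothetical helper, not something from my actual source:)

```haskell
import System.Environment (getArgs)

-- Hypothetical helper: validate argv by hand instead of a bare
-- pattern bind, so the error text is human-readable.
inputPath :: [String] -> Either String FilePath
inputPath [inFile] = Right inFile
inputPath _        = Left "usage: puzzle INFILE"

main :: IO ()
main = do
  args <- getArgs
  case inputPath args of
    Left usage -> fail usage                       -- my message, not GHC's
    Right path -> putStrLn ("reading " ++ path)
```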


I will say. I could have done *this entire 50 line Haskell program* in the following two lines of perl:

perl -e 'open(FH,"$ARGV[0]"); while (<FH>) { /^(\d+):\s+(\d+(?:\s+\d+)*)$/ or die "Invalid input"; my $sum=$1; my @ops = split(/\s+/, $2); }'

…and I suspect writing that not only required far, far less thought for me, but would have required far less thought for someone who was already versed in both Haskell and Megaparsec.

The Haskell documentation is clearly written for humans to read— *some* human, *somewhere*. It is definitely not written with the intent of *me* reading it.

Haskell syntax question: Is there a thing I can do to convince GHC that when I said "-1" I REALLY, definitely wanted the unary integer negation operator and not something else, so I don't get a spurious warning?

(No one agrees with me but I still think SML was right to use ~ for integer negation.)

EDIT: Seems the problem isn't unary - at all but that Haskell feels unsure this is int and not some other Num. -(1 :: Int) solved it.

I've reached the part of the puzzle that inspired me to write this in Haskell to start with: The "amb" combinator. I read about this decades ago and it captivated me as the one example I'd seen of what made Haskell's weirdo pure-functional model "useful".

Here's my outsider's awareness of what a monad is:

You write some code, and it's executed in a sequence.
*What does that mean*?

The monad defines what *sequentiality* means. For example "IO" means "linearly in time, as external events allow"

(Okay functional fans, THIS is the point where you can reply to object I'm describing monads wrong.)
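(To make my claim concrete, and maximally objectionable: here is my mental model applied to Maybe, where "sequentially" means "keep going only while each step succeeds". A sketch, assuming I've got this right:)

```haskell
-- Division that can fail.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv a b = Just (a `div` b)

-- "Sequence" two divisions. Maybe's definition of sequentiality
-- short-circuits the rest of the block on the first Nothing.
twoDivs :: Int -> Maybe Int
twoDivs n = do
  a <- safeDiv 100 n
  b <- safeDiv a 2
  return (a + b)
```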

I think this is why the Haskell folks latch onto category theory. As best I understand, category theory is about explaining what it means to "apply" a thing to another thing:

https://web.archive.org/web/20250107162216mp_/https://cohost.org/mcc/post/75444-this-video-is-the-on

I think I understand why the correspondence from the category functor to the Haskell functor typeclass is exact, so I'm willing to believe Haskell Monads mostly correspond to…something categorical.

This video is the only thing I've ever seen that made "categories" make sense to me

So the video takes about four minutes to start getting to the point, and frankly I'm not sure it will make much sense to someone who doesn't already sorta get group theory, but I stumbled on this YouTube video, "A Sensible Introduction to Category Theory": [https://staging.cohostcdn.org/attachment/474a951d-02f2-4a93-831d-c98a11f37a1b/cateogry.png] [https://www.youtube.com/watch?v=yAi3XWCBkDo]

And the critical idea in this video that made things click for me is: Categories are about things that can be composed. When category theory people start trying to describe it they often go on about "structure preserving maps", and that does seem to be what the theory is mainly used to talk about, but a category itself is something simpler:

1. You have a collection¹ of things;
2. You have a collection of relationships between the things (just defined pairs that mean something to you; these are the "morphisms"); and
3. You know how to compose the relationships ("morphisms").

Composing the relationships has to follow an associativity law, and there have to exist "identity morphisms" for every object that "do nothing" (in the sense that if you have an a -> b relationship, "precomposing" the identity for a, or "postcomposing" the identity for b, is "equivalent" to just a -> b within whatever meaning the relationship you defined has). And... that's all.

So usually the morphisms are things "shaped like" one-argument functions so that they can stand in for set injections or group homomorphisms or continuous functions between topologies, and in these cases, they are structure-preserving maps because that's what we decided to consider. But the video gives a much simpler example of a category where the objects are real numbers, and the morphisms are every accurate instance of "less than or equal to". There's a morphism X -> Y if X ≤ Y. Composition is just the trivial stringing together X ≤ Y ≤ Z. The identity is comparing a number to itself. Some morphisms don't exist, like there is no 3 -> 2 because in fact 3 is larger than 2, but that's fine. What's important is when relationships exist, you know how to compose them.

So if this is all category theory is, this explains why computer scientists latch onto it so hard, and why when category theorists start talking it sometimes sounds like they're just doing graph theory with some slightly obtuse terminology. Because they are. I still think their terminology is obtuse, but at least now I get why they're doing it!

I DON'T KNOW WHAT A "GROUP" IS AND I DON'T GET WHY YOU THINK THIS IS INTERESTING

...Well.

----------------------------------------

Okay. So did you ever have a day in elementary school that the teacher taught you about "properties"? Like, the "commutative property" meaning that 3 + 4 and 4 + 3 are the same. So in 1854 this guy named Arthur Cayley started writing down all the properties of grade-school number-line addition and subtraction and asked the question: Imagine we had this list of properties², but we didn't know what "numbers" were. Say we started trying to write proofs about the behavior of numbers based only on these properties. What would we have?

It turns out you can reproduce quite a lot of math about the integers, but Cayley noticed also you can reproduce interesting behaviors of some other things 1800s mathematicians were playing with that were not numbers, like substitution ciphers and matrices and quaternions (an unbelievably freakish thing involving imaginary numbers but more imaginary), all of which follow those same properties. He named these things "groups" (i.e., the quaternions are "a group" because they follow the rules for a group) after a term the substitution cipher people had been using.

Once people started picking at the group idea they started realizing certain groups were interchangeable. Like, if you start imagining groups where there are exactly four "numbers" to do arithmetic on, it turns out there are only two of those. Two groups of size four, I mean. Any other group of size four you try to invent will just turn out to be one of the main two in disguise.

This interchangeability idea turned out to be incredibly important in the twentieth century because it meant we could convert³ things that are hard to do math on into things that are easy to do math on, do the math, and then convert back. Like at some point someone figured out the group of quaternions of magnitude one (a subgroup of the quaternions) is exactly interchangeable with the group of rotations in three dimensions, making graphics programmers miserable for the rest of time.

Groups turned out to be the first acquisition in the zoo of "abstract algebra". If you add a second operation (cuz maybe you want to add and multiply) to a group we call that a ring, and if you add some more properties to the ring we call that a field, and there's various categories⁴ like this: if you start with "I have a set and these binary operators" you can slot it into one of the categories based on how many of the standard properties the operators follow, and now you know lots of useful facts about it and possibly have a list of well-researched existing [groups, rings, fields] that the thing you invented is convertible from or into.

Anyway I think the reason this video jumped out to me is that by just plainly stating what the definition of a Category is it gave me a way of kind of sorting Categories into the abstract algebra zoo, or an annex to the zoo maybe. Categories are about one-argument functions ("morphisms") where the abstract algebra beasties are all about two-argument functions (binary operators), but that's not such an interesting distinction to me. Groups are about things you can add (in some sense) and Categories are about things you can compose (in some sense), but you can think of either as being about collections and the properties those collections follow, and thinking about it that way gives me a way to bring category theory (which I've never understood) into context with something I do understand. Although I do still wish their terminology were not so obtuse.

¹ The word "collection" is used in many definitions of "a Category" I find, and it seems to be intentionally not defined in any rigorous way, I assume because they don't want to paint themselves into a corner where they say "a set of objects" and now all of a sudden they can't talk about categories where the objects form a proper class. I do hope the loosey-goosey terminology there doesn't turn out to create problems of unsoundness.

² Though commutativity wasn't on his list, actually. Adding commutativity to a group gets you an "abelian group", and rings (which I mention a little later on) are actually based on abelian groups, not groups.

³ I couldn't find any natural way to just slip this in here: The "conversion" from one group to another I'm handwaving here is called a "homomorphism", which you might notice I mentioned earlier as being the "morphism" relationship that category theory considers important on groups. For the Category of groups the "objects" are the groups themselves, the "morphisms" are the homomorphisms/conversions between groups, and composition means performing the homomorphisms serially.

⁴ Notice my use of the lowercase letter here⁵.

⁵ …although incidentally all the things I mention here are also Categories.

andi on cohost

Anyway, the Amb monad, as offered by the "nondeterminism" package¹:

https://hackage.haskell.org/package/nondeterminism-1.5#readme

Stretches this definition by defining "sequential code" thus: *It is nonsequential*. The Amb monad can fork computations, and answer questions like "do all paths return?" "does ANY path return?" "give me a list of all possible results".

¹ For some reason until today I thought it was a language builtin. It's not; it's just something multiple people have implemented in various ways in the last 30 years.


So here's a test program with Amb. You can probably read it without knowing Haskell! "aMemberOf" in sumTen forks the computation by running the code following it with *both* members of the list -1, 1.

We de-monadify this, making it a real single computation, with the nondeterminism package's "isPossible", which returns True if any path returns True. Does *either* 6 + 4, *or* 6 - 4, sum to ten? 6 + 4 does, so this code prints true. Wow! That was shockingly easy!
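(For anyone following along without the nondeterminism package: here's a base-only sketch of the same idea using the built-in list monad, which behaves like Amb for this purpose. This is my analogue, not the package's API:)

```haskell
-- Each draw from the list "forks" the computation;
-- the result collects every path's answer.
sumTenPaths :: [Bool]
sumTenPaths = do
  sign <- [-1, 1 :: Int]       -- fork: try both signs
  return (6 + sign * 4 == 10)  -- does this path sum to ten?

-- "isPossible" analogue: did ANY path return True?
main :: IO ()
main = print (or sumTenPaths)  -- prints True (6 + 4 == 10)
```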

Then I try to make it *marginally* more complex and I run facefirst into the wall that is Haskell's baroque syntax.

It took me like… 15 minutes to figure out how to modify the type signature of sumTen to take 1 argument. I kept expecting it to be Amb Bool Bool Int but no, it's Int -> (Amb Bool Bool). I *think* I now understand why it's the way it is, and I *think?* I also understand why takeLine upthread returned "IO Int" and not "Int -> IO", but I'm not sure how I was supposed to have known it.

(Note: If my current mental model of the syntax is correct, this actually isn't a case of Haskell's syntax being overly baroque but a case of it being extremely minimal, with things just working out the way they do "by coincidence". I think the problem I'm having is that the code keeps *alternating* between being overly smart [like the indentation rules, which still feel complex to me] and being overly bone-dry simple [meaning you get one character off and the compiler can't give you a helpful error].)
@mcc I think you're right. Layout is smart, but type syntax is minimal and complex things happen by composing things, but basic rules are deceptively simple
@mcc (- 1) or negate 1
@amy Hmm, it still puts a warning on "negate". I wonder if there's some way to explicitly tell it which instance of Num I meant by 1?
@mcc oh right. sry i thought this was the other issue with the unary negation operator. times :: Int <- amb [-1, 1] or [-1 :: Int, 1] (or [-1, 1 :: Int] etc). basically put a type signature anywhere that one will go
@amy Thank you very much.

@mcc the underlying issue is that your code works for any numeric type that can also be checked for equality, and since you're returning a boolean (and taking no argument of that same type) there's no way to figure out what type was meant or make your function polymorphic in that type. and the type needs to be figured out so the code can be compiled.

the standard has some oddball rules to satisfy the constraints when you're only using prelude classes like Eq, Num, Floating etc but in more complicated cases you'll get an ambiguous type variable error (like potentially the unused let spaces = skipSome spaceChar you were seeing yesterday; nothing determined what instance to use so ghc could not make your code into a complete program)
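(a sketch of the annotation trick described above; any single signature pins the type for the whole expression. Names here are illustrative, not from the actual puzzle code:)

```haskell
-- The numeric type appears only inside the expression, so nothing
-- outside can determine it; one annotation anywhere settles it.
sumsToTen :: Bool
sumsToTen = any (\x -> 6 + x * 4 == 10) [-1, 1 :: Int]
```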

@mcc c'mon you're likely just 6 ICFP papers away from understanding a programming dad joke. :D
@mcc love perl's "or die"
@Norgg <Larry Wall> THEN PERISH
@mcc @Norgg
die unless [extremely trivial computery thing];

@adriano @Norgg

fix(YOUR, $hearts) or die;

@mcc @Norgg bless;
@mcc @Norgg (I’d have to check but I think all of these are in the Camel Book already)

@Norgg @mcc i think `or die` might technically also work in php, but it wouldn't pass code review.

Something for my coworkers to watch out for. 😉

@mcc I kinda feel like I should go back to Perl. All I do is admin stuff anyway and I'm not sure anything else is really better for it.
@mcc is this just summing numbers from a file?
print . sum . fmap (read @int) . words =<< readFile . head =<< getArgs

@zmz @int No; it is splitting a file into lines, then for each line matching the pattern

num: num [num num…]
and returning the tuple (initial_number, [list_of_following_numbers])

So that is a little more complicated.

I also intentionally did it such that it interprets each line one by line instead of buffering the entire file (the perl oneliner solution does this) which might complicate things.

@mcc @int sorry, distracted and wasn't reading carefully. still pretty simple to do idiomatically, but too much for my phone. readFile isn't lazy io, but there is a version that does lazy io that would make that stream the file

@zmz @mcc @int readFile is lazy

ghci> x <- readFile "test.hs"
ghci> writeFile "test.hs" "awaga"
*** Exception: test.hs: withFile: resource busy (file is locked)

(so is hGetContents, which it is)

@amy @mcc @int ahh, I was misremembering
@amy @mcc @int I think because we use Relude (standard library replacement) for most things at work and that exports `readFile :: FilePath -> Text` (although it actually doesn't anymore because that assumes utf8) that's probably where my brain went for readFile being strict

@mcc @zmz my username has landed me in a Haskell community discussion on regular expressions.

Wrong number. Figures.

@int @mcc Sorry 😅 @ is used in Haskell for type application (a way to tell the compiler that you want to use a polymorphic function at a concrete type)
@mcc I haven't touched perl in at least 15 years and that example seems clear (it's a one-liner and therefore too terse). I can only imagine how alien and unreadable the Haskell version would be to me. Not that one-liner perl is any great prize but Haskell's syntax eludes me.
@arclight it's linked in the post before . You'll find it maybe not so bad because I don't use any of the . $ <*> << >> <* nonsense
@mcc the argument order for parse is wrong. it takes the parser first and the file second
@mcc for destructuring, if a pattern match fails in do notation it calls "fail" from MonadFail. for IO that throws an error. You could use a Monad Transformer over IO to get different behavior or have the first case be
"[x] -> [x]"
@zmz I did try it with [inFile] -> [inFile] and the line after was still an indentation error.
@mcc There's some library, I think it's one of the Lens ones, where the type variables for a type spell "s t a b". It doesn't help me know what they mean or do, but it does give me a chuckle.
@j3rn @mcc same-type attack bonus?
@j3rn @mcc I know! Soprano-tenor-alto-bass!
@j3rn @mcc I wouldn't blame you for taking a stab at it
@mcc you're interested in "lift IO" or "parser combinators"
@mcc Something to do with monads? The do-block does something similar to the transform from async/await to callbacks and maybe you're doing the equivalent of dropping a future instead of awaiting it? But those error messages tell you none of that.
@mcc haven't done Haskell in a while, but I don't think it cares how many spaces, just that they're the same. Your problem is that the alignment point is the start column of the first token after the `do`.
@kw217 oh no. okay, I see :(

@mcc only in the first case is the "let" keyword (and following code) aligned to "do"

It's just significant whitespace. You can either align everything on the column after "do " (that's where the 3 chars come from) or indent on a newline with an indent width of your choice.

@mcc in pic 3, it's the "if" that breaks the alignment

@mcc

This should compile, with 'do' on the prev line:

```
takeLine :: Handle -> Int -> IO Int
takeLine inHandle acc = do
  inEof <- hIsEOF inHandle
  if inEof
    then return acc
    else do
      inStr <- hGetLine inHandle
      let result = 1
      takeLine inHandle (acc + result)
```

@mcc in my opinion, the neatest way to get consistent indents is to have a newline straight after "do", then your preferred number of spaces on the next line