I somehow wish that whenever I invoke 'reader mode' on a website, particularly a blog, the owner of the website were notified. Okay, I don't mean me specifically, but if website owners could see what percentage of their visitors invoke reader mode, that might make them think "hmm, maybe my website could be more readable".
@simon Have you read https://felix.dognebula.com/art/html-parsers-in-portland.html
I thought you might be very interested. In particular the author posits that these HTML5 ports are not really 'ports' in the traditional sense, but actually 're-writes', so the agent decides "Oh this is an HTML5 parser in X so I will write an HTML5 parser in Y", rather than "I will translate this X library into Y".
I'm not 100% convinced, and the author clearly has some motivated reasoning going on (though the conclusion is not exactly anti-AI per se).
HTML parsers in Portland

A small exploration of weird results when AI coding agents translate an HTML parser into different languages. 2000 words - 10 minutes

A new link post to Ned Batchelder's testing conundrum: https://blog.poleprediction.com/posts/link-ned-batchelder-testing-conundrum/
Link: A testing conundrum

A thoughtful blog post by Ned Batchelder regarding a difficulty in testing a class that on the face of it should be straightforward to test. The class seems straightforward to test because: "This is a pure function: inputs map to outputs with no side-effects or other interactions. It should be very testable." The code is basically a hashing function, for which he wants “equal” values to hash to the same hash, and non-equal values to hash to different hashes.

Allanderek's blog
I made a new post, regarding LLMs and code duplication: https://blog.poleprediction.com/posts/llms-and-code-reuse/
LLMs and code duplication

I’m working on a project for which I’m using Go, as well as LLMs, mostly Sonnet 4.5 via Claude Code. I’ve noticed that it is frequently passing up on opportunities to factor out common code. But I’m not at all sure this is a bad thing. Here is an example, which I’ve chosen as relatively short but it illustrates a pattern which is relatively common:

```go
if config.DatabaseType == "postgres" {
	query = `
		SELECT p.slug, COALESCE(NULLIF(p.menu_title, ''), p.title) as menu_title
		FROM pages p
		INNER JOIN site_pages sp ON sp.page_id = p.id AND sp.site_id = $1
		WHERE sp.is_homepage = false AND sp.site_id = $1
		ORDER BY p.title
	`
} else {
	query = `
		SELECT p.slug, COALESCE(NULLIF(p.menu_title, ''), p.title) as menu_title
		FROM pages p
		INNER JOIN site_pages sp ON sp.page_id = p.id AND sp.site_id = ?
		WHERE sp.is_homepage = 0 AND sp.site_id = ?
		ORDER BY p.title
	`
}
```

Ignore the fact that there may be better ways to handle differences between database flavours. The point here is that these two queries are almost identical, except for the parameter placeholder and boolean value. A much better way to write this would be to have a single query with the differences parameterized:

Small functions and Elm

Cindy Sridharan writes an interesting post which questions the wisdom of keeping functions small. Although the author uses the provocative title “Small functions considered harmful”, that is actually not their point, which is better stated in the conclusion: “This post’s intention was neither to argue that DRY nor small functions are inherently bad (even if the title disingenuously suggested so). Only that they aren’t inherently good either.” I’ve long had a feeling that an urge to keep functions small is counter-productive in many cases, particularly in Elm and other languages which have nested functions. I think the key point is that you wish to decompose problems into sub-problems well, but that doesn’t have to mean that each function is necessarily short. In this post I’m going to attempt to argue for this by comparing code fragments which are clearly pretty similar structurally, but in which one has a single long function and the other does not. In this way I will argue that a simple measurement of how long a function is, is not particularly helpful.
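Go doesn't nest named functions the way Elm does, but closures give a rough analogue of the trade-off the post describes. This hypothetical sketch (not from the post) shows one longer function whose sub-problem stays nested: the function's line count goes up, but the helper's scope stays as narrow as possible.

```go
package main

import "fmt"

// summarize is deliberately one longer function: the helper stays
// nested as a closure, so it sees `threshold` without extra parameters
// and cannot be called from anywhere else. Extracting `grade` to the
// top level would make summarize shorter, but would widen the helper's
// scope and force threshold to be passed explicitly.
func summarize(scores []int, threshold int) (passed, failed int) {
	grade := func(s int) bool { return s >= threshold } // nested sub-problem
	for _, s := range scores {
		if grade(s) {
			passed++
		} else {
			failed++
		}
	}
	return passed, failed
}

func main() {
	p, f := summarize([]int{40, 70, 55, 90}, 60)
	fmt.Println(p, f) // 2 2
}
```

The problem is still decomposed (grading is separated from counting); the decomposition just doesn't show up as a second top-level function.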

New post asking why gmail's spam detection doesn't work on obvious flirty scams: https://blog.poleprediction.com/posts/gmail-and-spam/
Gmail and spam

I regularly (more than once a week) receive obvious spam direct into my Gmail inbox. It looks like: “Hello, handsome man, I would be very happy to meet a man like you and have a coffee with you to make each other happy.” The ‘from’ address usually contains ‘love’, ‘sweet’ or even ‘sexy’, such as ‘[email protected]’. I generally think Gmail’s spam detection is pretty decent: other than these obvious attempts to scam lonely single men, I don’t get much other spam directly in the inbox, and if I dare to look in the spam folder it contains mostly stuff that, yes, I didn’t need to see.

New post regarding stacks and laziness: https://blog.poleprediction.com/posts/stacks-and-laziness/
Stacks and Laziness

I sometimes find that developers do not really have a good grasp on the point of laziness in a programming language, believing that it is mostly an optimisation. It’s not really an optimisation, it’s a way of writing generic code which doesn’t need to be specialised for a particular use case. I’ve made an attempt before to explain why laziness can result in cleaner or less complicated code. In this post I wish to attempt to explain another example where laziness can solve a complexity issue in code, but it comes at some cost, and understanding that cost can go a long way to explaining why laziness is not the norm (aside from the fact that purity is also not the norm).
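Go is strict, so the laziness the post discusses has to be emulated, but doing so makes both the benefit and the cost visible. A minimal sketch (the generic `Lazy` helper is my own illustration, not from the post): the computation runs at most once and only if demanded, at the price of an extra closure allocation and a flag check on every access.

```go
package main

import "fmt"

// Lazy wraps a computation in a memoized thunk: the work runs at most
// once, and only if the value is actually demanded. The cost — the
// extra closure and the done-flag check on every access — is the
// per-value overhead that strict languages avoid by default.
func Lazy[T any](compute func() T) func() T {
	var (
		done  bool
		value T
	)
	return func() T {
		if !done {
			value = compute()
			done = true
		}
		return value
	}
}

func main() {
	calls := 0
	expensive := Lazy(func() int {
		calls++
		return 6 * 7
	})
	fmt.Println(expensive(), expensive(), calls) // 42 42 1
}
```

This hand-rolled version is single-goroutine only; Go's standard library offers `sync.OnceValue` (Go 1.21+) for a concurrency-safe equivalent.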

In case anyone is struggling with an Elm application with a long compile time, this may be of use: https://blog.poleprediction.com/posts/elm-compilation-elmi-files/
Elm compilation time with large records

This post details a problem with the Elm compiler that results in long compilation times or large memory usage (and possibly OOM issues), and a possible way to fix it: changing your application so that your compilation times are shorter. I have a large (~150k lines) Elm application and it can take over 2 minutes to compile. One reason for this is the state that the Elm compiler stores in order to avoid recompiling (or at least re-type-checking) code that hasn’t changed. It stores this information in .elmi and .elmo files. The .elmo files do not seem to grow overly large, but the .elmi files can.

A foray into the trepidatious world of A.I. and water consumption: https://blog.poleprediction.com/posts/empire-ai-water-usage/
Don't worry, no firm conclusions were drawn.
A.I. Water usage - Empire of A.I.

I’ve been following some of the debate on the water usage of LLMs. A lot of people seem concerned that this is a large issue. I’m not sure either way, but I’m coming round to the view that water usage is not a major concern in comparison to, for example, energy usage. On the recommendation of a friend I’m reading “Empire of A.I.” by Karen Hao; in the chapter ‘Plundered Earth’ the author touches on the water issue. I think she makes a pretty big error that should have been caught by an editor. But investigating this error has caused me to update a little in favour of the water issue actually being an issue, even if just a small consideration rather than a complete non-issue. So first of all I’ll give a couple of links to roughly explain where I think we are on the LLM water-usage issue.

New link post to Martin Janiczek's post regarding Elm Queue libraries:
https://blog.poleprediction.com/posts/elm-queue-shootout/
Link: Elm Queue Shootout

Martin Janiczek has a great post, “Elm Queue Shootout”, which is well worth reading. He uses property-based testing to test several Elm queue libraries. He then benchmarks those libraries as well, though he is careful to note that for the most part queue libraries are fast enough, and you don’t really need to worry about performance unless you have queues of many items in a hot loop somewhere: “I need to preface this with: this is all on a nanosecond scale. Don’t be wooed by the absolute differences here - does your webapp really care about 0.2ns vs 3ns? Which operations will it do often? The O(1) vs O(N) time complexities will probably be more instructive, though again you have to think about the realistic sizes of your queues. Are they going to hold more than a few hundred items?”
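A common way functional queue libraries get the O(1) amortized operations those benchmarks measure is the classic two-stack representation. A rough Go sketch of the idea (my own illustration, not code from any of the benchmarked Elm libraries):

```go
package main

import "fmt"

// Queue is the classic two-stack queue: pushes go onto back, pops come
// from front; when front runs dry, back is reversed onto front. Each
// element is moved from back to front at most once, so Pop is O(1)
// amortized even though a single Pop can take O(n).
type Queue[T any] struct {
	front, back []T
}

// Push appends a value to the back stack in O(1).
func (q *Queue[T]) Push(v T) { q.back = append(q.back, v) }

// Pop removes and returns the oldest value; ok is false when empty.
func (q *Queue[T]) Pop() (v T, ok bool) {
	if len(q.front) == 0 {
		// Reverse the back stack onto the front stack.
		for i := len(q.back) - 1; i >= 0; i-- {
			q.front = append(q.front, q.back[i])
		}
		q.back = q.back[:0]
	}
	if len(q.front) == 0 {
		return v, false
	}
	v = q.front[len(q.front)-1]
	q.front = q.front[:len(q.front)-1]
	return v, true
}

func main() {
	var q Queue[int]
	q.Push(1)
	q.Push(2)
	q.Push(3)
	a, _ := q.Pop()
	b, _ := q.Pop()
	fmt.Println(a, b) // 1 2
}
```

The occasional O(n) reversal is exactly the kind of cost that only matters at the queue sizes the post tells you to think about.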
