how to make programming terrible for everyone | jneens web site

how to make programming terrible for everyone

you can tell i have adhd by the 17 footnotes
@jneen god i love footnotes
@demize they're so good. terry pratchett used to have like 5 levels of nested footnotes spanning 3 pages and i aspire to that level of whimsy
@jneen I just thought you were being thorough...
@arclight that's how i feel too lol

@jneen My most recent project was a Python implementation of an aerosol scrubbing model that is implemented as part of a larger R code. The R code is not documented to the level we need, so the Python implementation was mainly an excuse to chase down references for everything left undocumented or unattributed in the R application. The original documentation was about 50 pages for the entire application; my documentation was about 200 pages for just one model (though strip out the example plots and source code and there's still at least 50 pages of design info and technical basis). In this case the documentation was far more valuable than the code because it cited chapter and verse where every equation, correlation, and piece of data came from. It's good having a pedantic technical reviewer who holds you accountable.

So yeah, I appreciate the work that went into your post and the background detail. :)

Aside: Is a commercial chatbot even capable of providing references for its work? My understanding is that all the attribution is laundered away when the LLM is constructed; all it can produce is obsequious hearsay...

@arclight i'm glad you had the documentation for that project, sounds like a life-saver. human communication >>>>
@arclight to your other question, part of the reason there's ambiguity on this is that LLMs can *claim* to provide references and introspect about its output, but that introspection and those references are still just... output

@jneen We've seen, from the legal cases where lawyers have been caught out citing fabricated precedents and from journal papers with fabricated references, that the system will simply stick tokens together to meet its optimization threshold. So even if attribution hadn't been intentionally bleached away, nothing the chatbot emitted could be trusted unless some trustworthy deterministic (non-LLM) system could verify that the citations actually exist and assess their relevance. *Everything* is a fabrication.

What concerns me is not the LLM part of the chatbot - that's just a pile of linear algebra - it's the cobbled-together UI that responds like an obsequious servile intern, Stepford ELIZA on Prozac. That part of the system is built on 30+ years of dark pattern research to keep people spending tokens. Right, wrong, as long as users keep spending, the system is operating as designed. The only acceptance test is that line goes up.
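(a minimal sketch of what that deterministic citation check might look like, assuming a DOI-based workflow — the regex and the `KNOWN_DOIS` index are illustrative stand-ins, not anyone's real system; an actual checker would query a registry like Crossref rather than a local set:)

```python
import re

# Sketch of a deterministic "does this citation exist?" check:
# pull a DOI out of the citation string, then look it up in a
# trusted index. KNOWN_DOIS is a placeholder for a real registry
# lookup (e.g. Crossref); everything here is illustrative.

DOI_PATTERN = re.compile(r"\b(10\.\d{4,9}/[^\s,;]+)")

KNOWN_DOIS = {
    "10.1000/example.doi",  # placeholder entry for illustration
}

def citation_checks_out(citation: str) -> bool:
    """True only if the citation contains a DOI present in the index."""
    match = DOI_PATTERN.search(citation)
    if match is None:
        return False  # no DOI at all: nothing verifiable, so reject
    return match.group(1) in KNOWN_DOIS
```

the point being that this path is boring, deterministic, and auditable: a citation either resolves in the index or it doesn't, with no token-smashing in between.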

@arclight exactly! i've seen someone ask a chatbot "would you hallucinate if i asked you X" and it's just... not how it works.
@jneen @arclight a point I like to emphasize in these discussions is that if you gave me the same kind of money currently being lit on fire, I could build you a system that would take a natural-language query about code you want to write and then come up with a short list of open-source projects (along with their licenses) that have code that does that thing, along with a pointer to that exact snippet of code. Some results wouldn't be great, of course, and there would be an art to querying it effectively, like any search engine... But it would be a much better programming assistant than code LLMs.
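(for what it's worth, a toy sketch of the shape of that system — the corpus, project names, and scoring here are all made up for illustration; a real version would index actual repositories and use proper ranking, but the contract is the same: query in, attributed and license-tagged snippets out:)

```python
# Toy license-aware code search: score each indexed snippet by how
# many query terms it shares with the query, and return the best
# matches with their project and license attached. The hand-written
# CORPUS stands in for a real index of open-source repositories.

CORPUS = [
    {"project": "leftpad-rs", "license": "MIT",
     "snippet": "fn left_pad(s: &str, width: usize) -> String { ... }",
     "terms": {"pad", "string", "left", "width"}},
    {"project": "csvkit-demo", "license": "MIT",
     "snippet": "def parse_csv(path): ...",
     "terms": {"csv", "parse", "file", "rows"}},
    {"project": "tinyhttp", "license": "Apache-2.0",
     "snippet": "fn serve(addr: &str) { ... }",
     "terms": {"http", "server", "serve", "socket"}},
]

def search(query: str, top_n: int = 2):
    """Return (project, license, snippet) tuples ranked by term overlap."""
    q = set(query.lower().split())
    scored = []
    for doc in CORPUS:
        score = len(q & doc["terms"])
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: -pair[0])
    return [(d["project"], d["license"], d["snippet"])
            for _, d in scored[:top_n]]
```

every result carries its provenance and license, so nothing is laundered — which is exactly what the LLM pipeline throws away.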
@arclight @jneen — this! Keeping engagement, keeping one hoping that the next response will be the right one. Feels like gambling. A recent JA Westenberg piece goes well with this: https://mastodon.social/@Daojoan/116219554271259845.
@jneen I get extra twitchy about chatbot use because my job is software QA on nuclear safety analysis code (here's a decent technical basis for an earlier related code https://www.osti.gov/biblio/10200672/) We have enough problems with coarse models and missing or uncertain data, we don't need a machine confidently fabricating nonsense. I'm not going in front of a regulator to explain our answers are bullshit because someone trusted a chatbot to fill in the blanks. Health & safety of the people comes first, then environmental protection, then protection of equipment. It's simply unethical to use these systems in any part of the safety analysis or design or licensing process. There's too much at stake.
An assessment of the potential for in-vessel fission product scrubbing following a core damage event in IFR (Technical Report) | OSTI.GOV

A model has been developed to analyze fission product scrubbing in sodium pools. The modeling approach is to apply classical theories of aerosol scrubbing, developed for the case of isolated bubbles rising through water, to the decontamination of gases produced as a result of a postulated core damage event in the liquid metal-cooled IFR. The modeling considers aerosol capture by Brownian diffusion, inertial deposition, and gravitational sedimentation. In addition, the effect of sodium vapor condensation on aerosol scrubbing is treated using both approximate and detailed transient models derived from the literature. The modeling currently does not address thermophoresis or diffusiophoresis scrubbing mechanisms, and is also limited to the scrubbing of discrete aerosol particulate; i.e., the decontamination of volatile gaseous fission products through vapor-phase condensation is not addressed in this study. The model is applied to IFR through a set of parametric calculations focused on determining key modeling uncertainties and sensitivities. Although the design of IFR is not firmly established, representative parameters for the calculations were selected based on the design of the Large Pool Plant (LPP). The results of the parametric calculations regarding aerosol scrubbing in sodium for conditions relevant to the LPP during a fuel pin failure incident are summarized as follows. The overall decontamination (DF) for the reference case (8.2 m pool depth, 770 K pool temperature, 2.4 cm initial bubble diameter, 0.1 μm aerosol particle diameter, 1573 K initial gas phase temperature, and 72.9 mole % initial sodium vapor fraction) is predicted to be 36. The overall DF may fall as low as 15 for aerosol particle diameters in the range 0.2-0.3 μm. For particle diameters of <0.06 μm or >1 μm, the overall DF is predicted to be >100. Factors which strongly influence the overall DF include the inlet sodium vapor fraction, inlet gas bubble diameter, and aerosol particle diameter. The sodium pool depth also plays a significant role in determining the overall DF, but the inlet gas phase temperature has a negligible effect on the DF.
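(a hedged aside for readers outside the field: "decontamination factor" here is the standard ratio of aerosol entering the pool to aerosol escaping it — this definition is standard terminology, not something quoted from the report itself:)

```latex
\mathrm{DF} = \frac{m_{\text{in}}}{m_{\text{out}}},
\qquad
\text{fraction retained} = 1 - \frac{1}{\mathrm{DF}}
```

so the reference-case DF of 36 corresponds to roughly 97% of the aerosol being retained in the pool, DF = 15 to about 93%, and DF > 100 to better than 99%.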

@jneen er is the body of the blog post supposed to be ~7MB of HTML and take ~28s to load?
@packetcat oops i left the inlined images in there
@packetcat there we go. should be about 43kb now
@jneen yep, loads very quickly for me now!
@jneen amazing read. Thanks!
@jneen this is spot on and great writing!
@whack thank you so much! i had a lot of help, it was a whole mess for a while lmao
@jneen grandpa mentioned
@nex3 oh my god really?
@jneen yeah there are only two Weizenbaum branches that still exist and I'm that one

@jneen I was in the middle of reading this when a friend texted me saying, “read this!”

It was absolutely worth reading. Thank you for writing it.

@jneen this is brilliant. I loved this from the ending:

"""
I think the ultimate fate of AI programming won’t be too far from that of The Last One. When a programming tool is unreliable, completely resists mental-modeling, and is incapbable of consistently rejecting invalid input, I think it’s reasonable to say it’s not fit for purpose, and is certainly not the future of programming. We simply cannot develop mental models of AI through traditional means. But we have to remember that just because we don’t understand it doesn’t mean it’s hiding secret insight or power.
"""

Have you read of the "Sim City effect" (disclaimer: Noah Wardrip-Fruin was one of my graduate advisors)? It's a nice discussion that extends the "Eliza effect" in several dimensions and seems to match up with a lot of what you're saying here.

@[email protected]

Yet we can build a mental model of #AI: it's simply a corrupted decompression of a lossy compressed archive of our competitors' work.

Those who build the archive give our competitors access to our work in exchange for giving us access to theirs. All for the largest fee they can get at any given time.

@[email protected]

@giacomo @jneen the size of the archive makes this unusable as a predictive model of the LLM's behavior though, at least for the kinds of predictions necessary to support use for building programs.

To be useful as a tool in the way that a programming language is, humans would need to be able to build a mental model of its operation that could fit in our heads and allow us to predict what it would do. I can understand the math behind an LLM fairly well and can even make *some* kinds of predictions like "it will tend to fail in these ways." I might even be able to leverage this to exert some level of control over the system if all I want to make it do is fail in a particular way or email me the user's github tokens or something. But to control it precisely in the ways necessary to effectively program with it is harder-to-impossible. It might *seem* to be giving me what I want because (especially for simple programming tasks) it sometimes or even often gives me what I ask for. But that's an illusion of control as it's not dependable and when it fails I've got little recourse other than to abandon it for a different tool.

@[email protected]

Of course, such a mental model won't let you predict a #CodingAgent's output, if only because you know a certain amount of random input is included in its computation to give an illusion of creativity and counter plagiarism accusations. That's why I included the word "corrupted."

On the other hand, it gives you useful insight into what you might exfiltrate from a competitor's work, and how to set up the context to get specific parts of their proprietary software, such as secret algorithms, test suites, or even secret keys (in particular short ones like ed25519) that the agent got access to.

@[email protected]

@Jetengineweasel The top half of the essay linked from what I'm replying to -- up till "Evaluating AI as a programming language" -- is relevant to what we were talking about earlier today. "predictability" and "discoverability" especially are important for _any_ program, not just languages

( @jneen sorry for the tangent <3 )

@zwol @Jetengineweasel i think there may be part of this thread I can't see, but yes, it's 100% rehashing of a lot of basic UX stuff
@jneen i was referring back to an off-masto conversation, sorry for the confusion
@jneen regarding the ELIZA effect, I think it points at a more basic cognitive error than just "having no reference point to fall back on". so much of our own understanding of how we think is grounded in language that it's hard to believe that something that can (convincingly mimic) talk *can't* think.

@jneen but i went through a master's program in cognitive science and one of the things the intro classes really hammered on is: a great deal of the work the human brain does isn't based on language at all. or vision, which is the other thing we're really *aware* of relying on.

and -- not coincidentally -- that work is the work we struggle to write algorithms *of any kind* to handle.

@zwol that's incredibly interesting - do you have any reading on that topic you'd recommend?
@jneen Hmm, it's been a long time, but I'd suggest you start with "Cognition in the Wild" (E. Hutchins) and "The Way We Think" (Fauconnier and Turner). Also anything you can find on nonhuman animal cognition
@jneen great post. minor nitpick, I think it's "capable"? anyway I often find myself wishing I could "write computer programs using natural language" but like, I don't want to start speaking INFORM
@amsomniac ha! inform and applescript are my usual cautionary tales. inform is actually very interesting as an art project itself, though - the fact that it exists is a marvel, and the things people have made in it as well
@amsomniac i'm not sure what you mean by "capable" here, did I make a typo somewhere?
@jneen you know, I thought I saw one but I just looked for it and couldn't find it. sorry!
@amsomniac ha it's all good, nw
@jneen "incapbable" that's it!
@amsomniac oh my god lmao that's a really funny one.
@jneen anyway minor typos like these are a charming attribute of human writing
@amsomniac it took me an embarrassingly long time to see it even quoted here in your post lmao

@jneen fantastic piece, thank you! puts into very readable words some deep concerns I found difficult to express.

👏👏👏

@Slash909uk thank you so much, means a lot :]