Mastodawn

When I say LLMs are good at writing code that they're bad at modifying, no matter how we prompt, this is what I'm talking about.

Note also how problem completion rates never get even close to 100% under any conditions. I've never seen it, either. I suspect nobody has.

This is that "Fool's Errand" I've been talking about.

https://arxiv.org/html/2603.24755v1

SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks

wwolf 5d ago

Keith Mar 31

Copilot is just for entertainment? Per the TOS...
Highlighting is my own. From that last boost.

https://www.microsoft.com/en-us/microsoft-copilot/for-individuals/termsofuse

wwolf 5d ago

abadidea 6d ago

IT'S HAPPENING

GITHUB, THE FIRST ENTERPRISE CLOUD SOLUTION TO REACH ZERO NINES RELIABILITY

https://mrshu.github.io/github-statuses/

#github

wwolf 5d ago

Daniel Kochmański 6d ago

I'm going to throw some alpha stuff into the release to avoid unnecessary pestering :)

#McCLIM #lisp #sdl2

wwolf Mar 27

Daniel Kochmański Mar 27

With pleasure we are announcing a new Embeddable Common Lisp release:

ECL-26.3.27: https://ecl.common-lisp.dev/posts/ECL-26327-release.html

#lisp

wwolf Mar 27

Angus McIntyre Mar 26

Whatever the hypesters may tell you, LLMs do NOT reason. Given two conflicting versions of a story, they’ll go for the one that is repeated more often. The sequence of tokens representing a false narrative is – if the astroturfers have done their job right – statistically more probable than the sequence representing a factual account, so it's the false narrative that will get coded into the model and trotted out on demand.

wwolf Mar 27

Angus McIntyre Mar 26

LLMs are essentially gullible. And many people, even otherwise smart people, are gullible enough to believe that "AI" distillations of facts are trustworthy. It's a problem of gullibility compounded. But there's also an entire industry that's devoted to trying to convince us NOT to be skeptical of AI, not to see it for what it is -- an often-naive statistical model that can and will increasingly be gamed by bad actors.

wwolf Mar 21

Daniel Kochmański Mar 20

No no, we are not vibe coders anymore - it sounds unprofessional. We are now doing (checks notes) harness engineering and we are the real engineers :3 /s

wwolf Mar 11

Manon Mar 10

Metal Mystery Patch - help needed!
My kid bought this patch at a concert where this band opened. Unfortunately they remember neither the concert (because the opening act wasn’t on the ticket) nor the band’s name. And it’s unreadable. 😬

Google Lens was of no help.
Please boost widely! Thx!

wwolf Feb 23

Kent Pitman Feb 22

I put up a new blog essay, The Tedious Pained, a defense of the legitimacy of technical & social critiques of "AI" tech, even as positives are claimed. Positives don't cancel negatives, in other words.

Well, it's only new if you didn't see it as a comment on someone else's post a bit ago at LinkedIn. This is me just giving it a better placement and a name for later reference. And perhaps a wider audience.

https://netsettlement.blogspot.com/2026/02/the-tedious-pained.html

#AI #LLM #LLMs #SocialCost #critique #ethics #AIEthics #tech #technology #harm #society #politics
