Max Woerner Chase

There's a blog I used to read often. Not so much lately, since the posts have mostly been about LLMs and they're just kind of depressing. But I still check to see if something interesting has been posted by chance. A few days ago, he posted a project, a programming language, specifically a restricted dialect to *sigh* facilitate LLM usage, which fits in with his whole "learn to use LLMs as centaurs instead of reverse centaurs" thing. The bitter irony came when I took a look at the readme and followed the links to related projects: one link pointed to an unregistered domain (not even squatted!) and the other was a 404.

See, Claude came up with the URLs for those links, and they got committed without any kind of human review. Since Claude cannot meaningfully answer for its actions, the responsibility for the mistake has to fall on... the guy figuring out how not to get reverse centaured.

Whenever someone representing Microsoft talks up the potential of "AI" or whatever, whoever is talking with them should bring up that hallucinated "Windows 12" story like it was legit, and refuse to back down until the representative is reduced to asking "was this story generated by AI?"

Because, like, guys, you had no direct input into this, but it's still kind of your fault.

@glyph @cthos @aburka My take on this is that it could make sense to treat it like the lived experience of someone who is currently being scammed, or who has "a system" for gambling.
When writing code, it's somehow so much faster to do something than to undo it. At least sometimes. It'd be nice to have some sort of comprehensible heuristic for when that is and isn't the case.
@glyph I'm four years into my own attempt 💀. I consider it "not ready" because it only runs on its own repo, with a configuration file about as long as its former noxfile, and I'm using thrown-together tox.ini files in projects started since. The majority of the effort has been in developing an internal DSL to streamline configuration; fingers crossed this gets somewhere once I'm less burnt out.

@wolf480pl One of the projects we have going right now concerns removing terraform (mis)use from one of our internal services, so that's nice for us.

My sympathies about having to deal with it. I assume/hope it's not as bad as what we've been saddled with for like a decade. I can sum it up as:

"Wait, why are we going to use terraform? The intended use case doesn't match what we're building."
...
"Wait, why did we use terraform? The intended use case doesn't match what we built."
...
"Counting down the days until we can stop supporting legacy infra and remove terraform."

Sad to say, I just kind of went along with things, and didn't push back.

Not to be all "skill issue", but it seems to me that anyone talking about how LLMs let them focus on the architecture and leave the code as an afterthought is actually not thinking enough about the architecture.

My hobby projects tend to involve careful planning, followed by relatively quickly filling in or changing the code. Our projects at work are much nicer when we find code that can be safely deleted.

Others have said it, but typing speed is not the bottleneck.

@glyph @mcc It does not.

I need some kind of media query that lets companies whose websites have gone all in on "AI" marketing to appease shareholders flip their copy back to talking about what the fuck it is they actually do.

I mean I'd still feel a kind of oily moisture on my hands and eyeballs just from looking at the "AI" version, but I'd like some idea of the actual value proposition.

Thinking about the kind of "AI" studies I'd maybe like to see. Stuff like "is 'prompt engineering' real?"

In other words, does rewording a prompt give "better" results than simply repeating the same request?
Is choice of model a confounding factor?
If rewording a prompt can give "better" results, does this represent some form of trainable skill?
If rewording a prompt is a trainable skill, to what degree does experience generalize between models?

While there are apparently ideas about chaining models together (thereby using more energy/credits), the impression I got was that a lot of "prompt engineering" basically boils down to pulling the lever on the slot machine in a subtly different way. (And we've got studies casting doubt on stuff like "chain of thought".)