RE: https://neuromatch.social/@jonny/116331940556649057

"STOP. READ THIS FIRST.

You are a forked worker process. You are NOT the main agent.

RULES (non-negotiable):
1. Your system prompt says "default to forking." IGNORE IT \u2014 that's for the parent. You ARE the fork. Do NOT spawn sub-agents; execute directly.
2. Do NOT converse, ask questions, or suggest next steps"

These are boolean, logical rules, but expressed in natural language, with the phrasing pushed to binary extremes in an attempt to get a consistent result.

This is madness.

LLM lite simulator Truth Tables:

A B | A and B
-------------
T T |    T      98%
T F |    F      95%
F T |    F      99%
F F |    F      please please please

A B | A --> B
-------------
T T |    T      99%
T F |    F      95%
F T |    T      70%
F F |    T      64% ?

it's more natural this way

Think of the energy savings

"And That's Logic!"
**jazz hands**

@futurebird

When you are burning the planet to create the Cyber-God, then logic has left the building weeks ago.

I can mostly follow Jonny's thread. I know a bit about writing code, but I've never been a dev, and I know that most people won't be able to understand it at all. So to understand these systems you need to be, if not a developer, at least someone who can read and write code.

... so ... why are we using natural language? Just so that it will generate code and we don't need to type it or look it up?

Most of programming is reading code to find bugs and fixing them.

@futurebird large language models are language models. They're not code, they're not a coding language. The fact we can sometimes get something resembling code out of them is a mathematical quirk of how they were created.

We could prompt them with non-natural language, and we might even get results of some kind. Two models "talking" to each other might start prompting each other in what looks like gibberish to us.

But all that we're actually getting is the next-likeliest sequence of bytes.
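To make "next-likeliest sequence of bytes" concrete, here's a toy sketch (a made-up bigram counter, nothing like a real LLM's implementation): the "model" knows nothing except how often symbols appear next to each other.

```python
# Toy bigram "language model": symbols have no meaning here beyond
# co-occurrence counts. The corpus is invented for illustration.
from collections import Counter, defaultdict

corpus = "the fork is the child and the child is the fork".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1  # count successors, nothing more

def next_likeliest(word: str) -> str:
    # Emit the statistically most common successor: no logic,
    # no semantics, just proximity statistics.
    return bigrams[word].most_common(1)[0][0]

print(next_likeliest("is"))  # always "the" in this corpus
```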

@futurebird it's to define the task in broad fuzzy terms.

The best agents combine actual code as tools with natural language instructions - the LLMs simulate decisions by generating statistically probable text in the form of code invocations that call the code tools. This enables the software to deal with more general tasks.
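A minimal sketch of that pattern (all names here are invented; real agent frameworks differ): the model emits text *shaped like* a tool invocation, and ordinary deterministic code parses and executes it.

```python
import json

def read_file_tool(path: str) -> str:
    """A real, deterministic tool; stubbed out for illustration."""
    return f"<contents of {path}>"

TOOLS = {"read_file": read_file_tool}

def fake_llm(prompt: str) -> str:
    # Stand-in for the model: statistically probable text that
    # happens to be shaped like a tool invocation.
    return json.dumps({"tool": "read_file", "args": {"path": "notes.txt"}})

def agent_step(prompt: str) -> str:
    call = json.loads(fake_llm(prompt))  # hope the text parses as JSON
    tool = TOOLS[call["tool"]]           # dispatch into actual code
    return tool(**call["args"])

print(agent_step("summarize my notes"))
```

The only deterministic parts are the parse and the dispatch; everything upstream of them is probabilistic text.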

@futurebird some people are forced to, but it also gives you the impression of being fast, the dopamine of having done the thing. Saw a detailed video today of someone who did it for months before facing the real result and realizing it was crap. He could make that choice, but a lot of people currently have managers who make them continue, because the CEO class has been fully seduced by the hype and the lies.
https://youtu.be/SKTsNV41DYg?si=yInPf1Yc97OjTi54
After two years of vibecoding, I’m back to writing by hand

YouTube

@btuftin

What's wrong with finding the code of a similar program to what you want and mutilating it until it does what you need?

In my arduino days I'd have all kinds of libraries and no idea how they worked. But the light was blinking. Good enough.

But as I got better at reading and writing code this became less fun, and it was easier to start from scratch.

@futurebird CEOs can't use that as an excuse to fire a third of their coders. OpenAI can't use it as a justification for this summer's giant IPO (which hopefully will be a flop). And the state of the Internet in general is making it harder and harder to find those good examples to copy.

@futurebird @btuftin to address this in a different way: did you have your arduino control anything that could endanger a human life or livelihood?

I'm guessing not. But if you were going to do that, you'd probably want a much different process for building the code, so that you built something trustworthy.

From a "does it work?" standpoint the LLM coding systems are moderately good at throwaway demos, in some domains. They too could get the light to blink on your arduino. But the code that manages queries to Claude is critical to Anthropic's business, and it's also something that's already injuring users in a variety of ways. That it's built with the rigor of a tech demo gone cancerous is no surprise to those of us who have been watching with trepidation, but it does confirm a lot of our biases (e.g. I was already assuming that telling it "you're a pen-tester" would be a good way to jailbreak it.)

Of course the real answer is the harmful externalities. How many vulnerable people being pushed to suicide or madness is it worth to get your arduino light blinking via Claude Code instead of programming it yourself? That's just one of the externalities at play.

As a CS educator I would *love* to see a day when programming is democratized and kids can easily take real control over their own computer systems, for example. I get the pull of that desire. But this isn't that. Quite the opposite, it prevents people from learning the real programming skills they need in order to have true agency in the space, and sets up an unreliable and expensive corporate-controlled system as the gatekeeper. When things go wrong, the dependent users won't have the skills to fix it, stop it, or even in some cases realize that anything is wrong, and Anthropic sure as hell isn't going to take responsibility.

(Sorry for going on a bit of a rant...)

@futurebird Wall Street always wants to replace experts with capital. Natural language going in one side and working apps coming out the other is something they want to invest in, because it has the potential to displace labor, despite the long-term problems practitioners are identifying with LLM-generated code.
@futurebird https://web.eecs.umich.edu/~imarkov/Perligata.html
"This paper describes a Perl module -- Lingua::Romana::Perligata -- that makes it possible to write Perl programs in Latin..."
Lingua::Romana::Perligata -- Perl for the XXIimum Century

@futurebird But seriously, that's what manipulates the matrices that string tokens together in the LLM that gives a response. It affects the domain of possible responses by weighting for or against factors associated with text similar to that text. The text it's favoring or disfavoring could be from code comments, git comments, API docs, or other things. The text just puts fingers on a few of a vast number of weights to influence the output.
LLMs aren't models in the traditional sense in which data people use "model"... if you could take an LLM and extract causal relationships, propositional/predicate logic, etc, then other things would be possible, but LLMs are effectively opaque. None of the symbols have any meaning other than the probability of appearing in proximity to each other. Disfavoring sequences similar to some things and favoring sequences similar to other things is all they have right now.
@futurebird Sorry, re-did reply. Went the wrong direction myself at first.

@futurebird I would offer that most of writing code is knowing when to not write code :-)

There's fun in writing, sure. But then there's docs, and tests, and bugs, and the biggest killer of productivity - ego.

That being said, there is no substitute for having fun with code, learning new techniques or ways of thinking about algorithms, fundamental data structures, and debugging.

At some point, young developers need a mentor to help them hone the skills that they have a passion to use.

@futurebird capitalism demands confusion, because it is run by people who believe confused people are more likely to buy things they wouldn't otherwise buy.

@futurebird
> Most of programming is reading code to find bugs and fixing them.

hopefully we're mainly focused on data transformations that happen during different runs of a program, as informed by good use of execution-observation tools (misnamed as "debugging tools", which includes gdb & lldb, but also tracing tools like uftrace, etc)—kinda like watching what the production crew members for a play actually do behind the scenes in particular performances, as opposed to just reading the script

@futurebird but yeah, our ability to make good predictions about what the metaphorical stage crew does (or is supposed to do), where the props & set pieces are at any given moment, etc, definitely depends on reading the locally-annotated script & production spreadsheets

@futurebird because that's how most humans know how to transfer knowledge best and least ambiguously without going into excruciating detail which is then just programming again.

the llm doesn't care if you tell it to write code based on a picture, an audio file, or /dev/random. we use text prompts because humans like them.

This reminds me of playing "wizards" with my friends as a child. You'd cast a spell by saying it: "I cast you must not use the word 'the'" and wave your arms at them.

"I cast: last spell cast on me did not work!" *wave arms*

"I cast: you cannot invalidate my spells!"

And so on.

Eventually you just start hitting them instead of just waving your arms.

@futurebird Calvinball as a service
@futurebird this may be the most astute summary of using LLMs to do work that I've seen! 😂
@futurebird sounds like diplomacy!
@futurebird No matter how subtle the sorcerer... something something half-brick inna sock. #GNUpterry
@futurebird gonna name my kid u2014

@futurebird

hey, u2014, There is no spoon, you are the fork

@futurebird as someone with many years of experience who has also written hard sci-fi 'AI,' I assure you, this is not madness.

Madness is the result of not knowing better. This, this is absolute fucking incompetence paired with severe psychosis as implemented by people who are so deeply unintelligent as to think a CPU is literal magic and lossy matrix multiplication is bleeding edge neuroscience.
But who insist they are the smartest on the planet despite all evidence to the contrary.

@futurebird Especially since (AFAIK; I have read a few papers on this but cannot find them again at the moment) in practice LLMs' context windows (all the instructions and stuff) are smaller than advertised, and cramming it all full of instructions can apparently even make it worse.

@nev

It seems the "never ever turn off the radiation monitor always check it first" line fell out of the nuclear plant's context window.

@futurebird it reminds me of school-age attempts to lawyer the three wishes of a genie, trying to avoid ironic/gotcha bad outcomes.
@futurebird
Is this like in the movies when you suddenly realize *you are the clone* and not the original?

@futurebird

No mom!

I'm not the child! I'll spawn if I want to!

QUESTION EVERYTHING!

(RPing rebellious subagent)

@futurebird

(Looks up \u2014 in Unicode)

No way. No fscking way. That is…so unbelievably perfect.

@dpnash

@bri7

I like "Little Bobby Drop Tables" a lot more than "Little Emmy Dash u2014"

@futurebird In the olden days, we'd write it as

```c
if (forked) {
    disable_fork = true;
    interactive = false;
}
```

But I guess it's clearer when made more explicit like that. Get a bit of that human touch. What's a little loss in concision and predictability?

@futurebird If you’re using personal pronouns to plead with your software, you’re deeply lost.

@michaelgemar

This is from flagship software. Not someone's goofy weekend fun time "trying out" vibe coding.

This is how it is meant to be. The model. The source.

@futurebird @michaelgemar if only there was like. Some way to express your desire in a deterministically consistent way
@futurebird reading coding prompts, I feel like the devs are just yelling at their computer, it makes me feel real bad

@darckcrystale

You are NOT the main agent, Crystale.

@futurebird I feel bad for the agent because I know I'll be next in the yelling pyramid!  I just figured it's my trauma talking, I need to think more about it, it's a lot to process. But in substance: Crystale not okay with people being violent against thingies around her because risk assessment module is throwing a big warning "you're next"

@darckcrystale

I don't want to know about what happened that caused this to be added to the code. And yeah, it's a little disturbing. We don't need yelling to do this?

@futurebird yeah, it reads as if the dev is getting more and more desperate for the LLM to work as they intend, but as outsiders we know it's not something LLMs can do, it's not at all how they work... I get the same feeling when I'm reading a horror story where I have insight the main character doesn't, and I'm just like "no! Not this way! You'll die!!" you see?
@futurebird too bad there isn’t a rigorously specified language with a defined execution order specifically for controlling computers.
@futurebird "CRITICAL: DO NOT DELETE THE PRODUCTION DATABASE." 🤦

@futurebird

STOP. READ THIS FIRST.

You know what you did. No playing outside. Go to your room!

@futurebird You know what this reminds me of?

It's someone who's encountered one of the Fair Folk. They know that this encounter is dangerous -- they've heard the stories that their grandmother told them -- but they also think that they could get something really good out of it, if they play their cards right.

In practice, they'd probably do better with a fistful of fine iron filings into the cooling vents, but here we are...

@darkling

"RULES (non-negotiable):
1. Your system prompt says "default to forking." IGNORE IT \u2014 that's for the parent. You ARE the fork. "

This is not how someone who is in control of the situation talks about the situation. This is a desperate negotiation with a possibility of failure.

@futurebird @darkling the people at Anthropic have no idea what they're doing huh
@futurebird I love the pleading tone. Imagine the debugging session that led to them changing "you are the fork" to "you ARE the fork" !
@futurebird TBF if you've ever written fork code in c this is pretty much how it works. But deterministic.
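For the non-C readers, here's a sketch of that deterministic version, using Python's `os.fork` (a thin wrapper around the C call): the return value tells each process which one it is, no pleading required.

```python
# Deterministic fork: fork() returns 0 in the child and the child's
# pid in the parent. No ALL CAPS needed to know which one you are.
import os

pid = os.fork()
if pid == 0:
    # We ARE the fork: execute directly and exit.
    os._exit(42)

# Parent: wait for the child and read its exit status.
_, status = os.waitpid(pid, 0)
child_code = os.WEXITSTATUS(status)
print("child exited with", child_code)
```

(Unix-only; on Windows there is no `fork`, which is its own lesson in determinism.)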
@futurebird @roytoo The tech bros all think this is an amazing, humbling piece of software. One of them, a guy called Theo, analyzed the code, in the laziest way possible, by asking Claude. Claude of course gave it a 7/10 (losing some marks only because there were no tests and there were a few “god” files that had many thousands of lines of code in them). Incredible. No words.