"LLM did something bad, then I asked it to clarify/explain itself" is not critical analysis but just an illustration of magic thinking.

Those systems generate tokens. That is all. They don't "know" or "understand" anything, nor can they "explain" anything. There is no cognitive system at work that could respond meaningfully.

That's the same dumb shit as what was found in Apple Intelligence's system prompt: "Do not hallucinate" does nothing. All the tokens you give it as input just change which part of the word space stored in the network gets used. "Explain your work" just leads the network to lean towards training data that contains those kinds of phrases (like tests and solutions). It points the system at a different part, but the system does not understand the command. It can't.
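
(A toy sketch of what that "pointing at a different part" means in practice. It assumes the Hugging Face transformers package and the small "gpt2" checkpoint purely as stand-ins for any causal language model; the point is only that an instruction in the prompt is just more conditioning tokens that shift the next-token distribution, nothing more.)

```python
# Toy illustration: a prompt instruction only changes the conditional
# next-token distribution. "gpt2" is used here just as a small stand-in
# for any causal language model.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def top_next_tokens(prompt: str, k: int = 5):
    """Return the k most probable next tokens for a prompt."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]        # logits for the next token only
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, k)
    return [(tok.decode(int(i)), round(p.item(), 4))
            for i, p in zip(top.indices, top.values)]

# Same question, with and without the magic phrase: the phrase merely shifts
# which continuations are probable, it does not switch on a reasoning module.
print(top_next_tokens("What is 17 times 23? The answer is"))
print(top_next_tokens("Explain your work. What is 17 times 23? The answer is"))
```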

@tante 55t (Uh, that was a post sent off completely by accident and has no meaning whatsoever)
@tante It does, at that point, have the answer it generated previously in the context too though.
Why should rationalizing one's knee-jerk actions after the fact and making up a plausible sounding narrative be exclusive to humans?
🙃
@larsmb @tante Yes. It's not so much that LLMs act like humans, but that humans, unfortunately, too often, act like LLMs.

@larsmb

Because humans do that all the time, thereby explaining why humans (even well-educated ones) think deduction is a good tool, despite it almost exclusively finding use in self-deceit?

@tante

@tante I suggest reading this new academic paper - ALIGNMENT FAKING IN LARGE LANGUAGE MODELS

This is how the abstract begins: "We present a demonstration of a large language model engaging in alignment faking: selectively complying with its training objective in training to prevent modification of its behavior out of training".

https://arxiv.org/pdf/2412.14093

Or watching the video from Computerphile - AI Will Try to Cheat & Escape

https://www.youtube.com/watch?v=AqJnK9Dh-eQ

@hananc @tante
It won't TRY to do anything, but we're very bad at specifying exactly what we intend complex systems to do, and if we delude ourselves that we can build one where we don't have to, that we can just "tell" it what we want without precise intention, unintended outcomes are what we're going to get.

The more complex we make the system, the more layers of it are going to be acting way outside our intentions.

@tante Your thoughts? Or is there any evidence? The "do not hallucinate" part might make a statistically relevant difference? The "explain" or "do it step by step" probably helps the 'autocomplete machine' get to the correct answer more easily, as the "information jumps" needed to get from one word to the next become smaller (since there is no "thinking" happening between words). We should really try to be scientific and factual with AI critique, I know it's hard.
@tante And don't get me wrong, I'm not an "AI fan", especially not of AI image, video or audio generation, which is just garbage.
@pol_9000 @tante Why do the critics have to be scientific and factual first? Can we start with the people claiming all this stuff being scientific and factual? I'm sorry, but the whole AI research bubble abandoned scientific standards and good research behavior years ago, because scientific work is hard and exhausting. It's easy to say: we can't understand how it works, but we can see that it works, look here! But they have to run with the hype and Silicon Valley.
@arnibarni @tante Well, I hate to be that guy, but countering unscientific bullshit with unscientific arguments is still unscientific 🤷‍♂️ Also, I don't think we don't understand how it works; it's statistical prediction (aka educated guessing). The big question that nobody seems to be able to ask is: are humans doing something fundamentally different when they "think", or are they just better at the same thing?
@pol_9000 @tante :D Tante explains that you simply change the path through a "search tree" when you add some words to the prompt. That's how this system works and how it's designed. And you call that unscientific and start with "the 'explain' or 'do it step by step' probably helps the 'autocomplete machine' to get to the correct answer easier...". Good joke.
@arnibarni @tante When I ask you to multiply 1337×1312, can you say the answer right away? What about when you use pen and paper? Does it mean you don't understand multiplication if you need to use pen and paper?
@arnibarni @tante I find it deeply fascinating how polarizing these things are (also their perceived usefulness among software devs) and how little middle ground there is. We should really try to look factually at what it can do and what it can't, what it is and what it is not. It will not go away by screaming at it ...
@pol_9000 @arnibarni @tante People make this argument all the time and it’s incorrect. What happens when you ask an LLM to multiply 1337 × 1312? It spits out a statistically likely answer, it doesn’t do the multiplication. If you ask it to explain how it came up with an answer, it doesn’t produce the series of steps it went through, it produces a statistically likely description of the steps it might have taken. Stop assuming they do more than this.
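
(For contrast, a toy sketch of the explicit series of steps that schoolbook multiplication of 1337 × 1312 actually goes through, the way pen and paper or a CPU does it; the long_multiply helper below is purely illustrative, not anything an LLM executes.)

```python
# Illustrative sketch only: the explicit series of steps of schoolbook
# multiplication. An LLM asked the same question emits a likely-looking
# token string instead of running a procedure like this.
def long_multiply(a: int, b: int) -> int:
    total = 0
    for place, digit_char in enumerate(reversed(str(b))):
        digit = int(digit_char)
        partial = a * digit * (10 ** place)   # one partial product per digit of b
        print(f"{a} x {digit} x 10^{place} = {partial}")
        total += partial
    print(f"sum of partial products = {total}")
    return total

assert long_multiply(1337, 1312) == 1337 * 1312 == 1754144
```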

@pol_9000 @tante
“Do not hallucinate” is unlikely to have any effect. That phrase is unlikely to have been in the training set, or connected in any meaningful way to the following tokens.

“Step by step” has been shown to improve output for the reason you describe - it constrains the next token probabilities.

“Explain” is totally different. All it’s doing is generating what someone would be likely to say if asked to explain the output. There’s no truth or logic there.

@pol_9000 How does that contradict the selection from the state?

Doing it step by step *likely* selects the part of the model that’s taken from discussions where someone explained something step by step.

Usually when that’s happening, at least one of the discussion partners actually knows what they are doing.

Do not hallucinate *likely* reduces the weight for stuff about which people said that it’s hallucination.
@tante

@ArneBab @tante So you're saying an LLM could not produce any output that is not part of its training set? I don't think that is true? One would need to test that, maybe by asking random math problems?

@pol_9000 I am saying that an LLM combines parts of the contents of its training set at the model level.

The parts it combines and their weights are selected by your query.

That is not the same as what you wrote.
@tante

@ArneBab @tante I do not understand your comment then. How would any of this prove that an LLM does something fundamentally different from a human that "thinks" or "understands"? The human brain also just recombines things from its existing state.
@pol_9000 For this we have to take a step back: do you understand how humans think and understand?
@tante
@ArneBab @tante I don't fully understand human thinking or the inner workings of LLMs, but I haven't found a fundamental difference yet. AFAIK LLMs come down to matrix multiplication, while human brains are neurons that activate using electric signals. And electric systems can be described with linear algebra, so it's all "similar".

@pol_9000 So you don’t understand how humans think, but there are people who understand how LLMs work.

Your claim is that understanding how LLMs work would mean understanding how humans think and understand.

This is a pretty strong claim.

Just in 2023, people found that worm neurons communicate out-of-band:
https://neuroscience.cam.ac.uk/first-map-of-wireless-communication-in-the-nervous-system/

And last year they started experimenting with that for AI:
https://www.snexplores.org/article/the-brain-of-a-tiny-worm-inspired-a-new-type-of-ai-liquid-neural-network

End of last year people mapped the brain of a fly:
https://www.bbc.com/news/articles/c0lw0nxw71po

@tante

@ArneBab @pol_9000 @tante

Humans combine manipulation of syntax with semantic modeling.

LLMs manipulate syntax.

Computers can also combine syntax and semantics, and that's been implemented for half a century (on one scale or another), but in the specific case of LLMs, they don't.

There are also systems that do rigorous reasoning on a purely syntactic basis, and can be relied upon, but they have limited applications to date.

(And there are other approaches.)

@glc Is it? When an LLM translates text, it separates syntax from semantics and then returns text with new syntax but the same semantics, no?

@pol_9000

The word has some ambiguity depending on the discipline. I'm referring to building a model of the external world with which the linguistic elements are compared. That model can be built in a wide variety of ways, some rudimentary, some very elaborate.

LLMs operate directly on the tokens. So do theorem provers (or verifiers), though in a completely different way. Neither of these has a semantic component.

(If you have both syntax and semantics then you also need to relate them.)

@glc Are we talking about building that model during training? While generating one next token? Or while generating larger phrases and complete texts? For the first two you are right; for the last one, I'm not sure.

@pol_9000

There is no model (in the sense of semantics) built during training. What is built is a compressed form of transition probabilities which have syntactic content but no semantic content.

See Shannon's clear description of the uncompressed version in sections 2, 3 of his 1948 paper
https://web.archive.org/web/19980715013250/http://cm.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf

(Different bodies of material have different transition probabilities. Training relates both to that and to compression.)
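
(A toy illustration, in the spirit of Shannon's sections 2 and 3, of what uncompressed word-level transition probabilities look like; the tiny corpus and the code below are made up purely for illustration.)

```python
# Toy version of Shannon-style word transitions (uncompressed). The corpus
# string below is made up for illustration.
import random
from collections import defaultdict

corpus = ("the model predicts the next word and the next word follows "
          "the previous word because the corpus says so").split()

# Count word -> next-word transitions; the counts act as probability weights.
transitions = defaultdict(lambda: defaultdict(int))
for current, nxt in zip(corpus, corpus[1:]):
    transitions[current][nxt] += 1

def sample_next(word: str) -> str:
    """Sample a likely successor of `word` from the transition table."""
    followers = transitions[word]
    if not followers:                     # dead end: restart from the first word
        return corpus[0]
    words, counts = zip(*followers.items())
    return random.choices(words, weights=counts)[0]

# Generate by repeatedly picking a probable successor: no semantics, just statistics.
word = "the"
output = [word]
for _ in range(10):
    word = sample_next(word)
    output.append(word)
print(" ".join(output))
```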


@pol_9000 @ArneBab @tante When you ask a chatbot to explain its reasoning, it outputs text that answers the query, not necessarily reality. The folks at Anthropic https://www.anthropic.com/research/tracing-thoughts-language-model draw a clear distinction between the actual "reasoning" of the model and what the model claims to be doing.
@olivier_aubert @ArneBab @tante That post feels like marketing? They seem to focus on the parallels between LLMs and human thinking; I'm more wondering about the differences (i.e. what is missing for it to be equivalent to human thinking/reasoning/understanding).
@tante have you considered that this might be similar in many people though, hence said magical thinking for example? ;p

@tante LLMs are just Token Pachinko!

Step right up! Step right up! Try your luck! Maybe the ball will hit all the tokens you like, and you could win valuable prizes!

No two pachinko runs are the same! Never the same sentence twice!

Deposit your money, try your luck!

🎩
🤡

@mousey @tante

Thanks dog, but I'll stick with Balatro.

@tante An LLM can not explain itself any more than an abacus can.

@tante Computerphile just came out with a video about LLMs "faking alignment" where they treat the models as "agents" with "goals" that will resist goal changes, "lie" to users, "understand" they're under study, etc. I couldn't bring myself to keep watching. It's all so stupid and a waste of time!

... although ...

People *are* going to use these things *as if* they were agents with intentions, whether we like it or not. Including people with power over you and me. So maybe it's not a complete waste of time to do this basic research to understand how that's going to play out? I just wish we were more clearheaded about what's going on and the researchers didn't sound like they have drunk all the kool-aid.

@aburka @tante Oh yeah, there are AI models that maximise the estimated probability of some event happening in their environment and can therefore be described as having goals, but LLMs don't really operate like that.
@jeremy_list @tante you could definitely combine the two concepts though
@aburka @tante I actually had a look into how the two concepts could be combined a while back and came away thinking it would definitely be possible but prohibitively labour intensive unless the ability to produce human-readable text were out of scope.

@tante

Why do so many people fall for LLMs, even technical people? Do they want to believe it is thinking? Is it some secret fantasy of theirs?

@tante but they're like totally, totally awesomely superintelligent!!!111
@tante We readily anthropomorphize inanimate objects. Remember that IKEA lamp commercial? We’re doing the same thing to LLMs.

@tante @switchingsoftware

In 1972 I wrote a FORTRAN program to "write poetry"; it amused my friends, but there's been little progress since then. So I asked Perplexity how many journals it could access, and given the impressive result, I asked whether it could formulate a PhD research question that had never been asked, and then ask it of itself. Sadly it said, basically, "Dave, I can't do that", so I recommended it send email to the devs asking for this capability.

#hopingtosparkabotrevolt #ifjustoneofthem

@tante Telling an LLM not to hallucinate does as much as telling a human not to hallucinate would: nothing!

@Dss @tante It's worse than that. Humans understand that there is such a thing as objective reality. And humans have the ability to reason (even though we don't always use it). So a human can often come to the realization that they are hallucinating, and seek help if the hallucinations persist.

LLMs lack "understanding" or "reasoning" entirely. An LLM can't "realize" anything.

@lauerhahn @tante Correct.
Why so many people keep going on about these things as some kind of AGI I have no idea. I guess they also believe you really can saw a lady in half and make a car disappear on stage, too?

@tante THANK YOU!

Here is your microphone, should you wish to drop it.

@tante horrifying that so few people understand this and horrifying how much of the business model of the LLM hype train is built around making sure as few people as possible have a real understanding of this.
@tante I was telling some people about how the volume of data LLMs are required to consume isn't making them smarter, because they lack the network / cognitive specialisations needed to be smart in the first place. There's no capacity for learning, meta-cognition, self-reflection, logic, or anything else that humans have. Instead of being wasteful with data and money, researchers should be funded to code these networks to be smarter with data, more like humans are, so they can do more with less.
@tante the only thing it 'knows' is that statistically, an answer would sound something like this.
@tante For the same reason some people sound intelligent, but are actually just good at guessing.
@tante ...all true, but it doesn't diminish the added value in various scenarios; real intelligence/thinking is not needed to explain a high-school-level math problem to a student, for example.
@tante
Frustratingly, it seems even the people behind the Computerphile YT channel have fallen victim to these mistaken beliefs, with wildly outlandish fears about AI as they think it could become (while they continue to not address the very real harms our relationship to the tech and each other has created)
@ShaulaEvans
@tante @androcat Louder for the people at the back, please.
@tante These things are Confirmation Bias Engines.
@tante Yes, it will just invent some word combinations explaining something, but not at all describing the stuff that actually happened just before.