David Gerard Oct 13, 2024

LLMs can’t reason — they just crib reasoning-like steps from their training data - awful.systems

When you ask an LLM a reasoning question, you’re not expecting it to think for you; you’re expecting that it has crawled multiple people asking semantically the same question and getting semantically the same answer from other people, and that those exchanges are now encoded in its vectors.

That’s why you can ask it: because it encodes semantics.

ebu Oct 14, 2024

“because it encodes semantics.”

if it really did so, performance wouldn’t swing up or down when you change syntactic or symbolic elements of problems. the only information encoded is language-statistical

froztbyte Oct 14, 2024

did you ask an LLM for a post to make here? that might explain this mess of a comment

leftzero Oct 14, 2024

Paraphrasing Neil Gaiman, LLMs don’t give you information; they give you information shaped sentences.

They don’t encode semantics. They encode the statistical likelihood that each token will follow a given sequence of tokens.

LainTrain Oct 14, 2024

It’s worth pointing out that it does happen to reconstruct information remarkably well, considering it’s just likelihood. They’re pretty useful tools, like any other. It’s funny ofc to watch Silicon Valley stumble all over each other chasing the next smartphone.

swlabr Oct 14, 2024

“remarkably well” as long as the remark is “this is still garbage!”

LainTrain Oct 14, 2024

Anything in particular the LLMs are bad at?

V0ldek Oct 15, 2024

The only remarkable thing is how fucking easy it is to convince the median consumer that vaguely-correct-shape sentences are correct.

LainTrain Oct 15, 2024

It was all lost long before the LLMs when people took random schizo opinions on Facebook as gospel.

We live in a post-truth world, and all things considered I’m not too fussed about LLMs being fallible on occasion when the average person is wrong far more.

self Oct 14, 2024

thank you for bravely rushing in and providing yet another counterexample to the “but nobody’s actually stupid enough to think they’re anything more than statistical language generators” talking point

blakestacey Oct 14, 2024

Rooting around for that Luke Skywalker “every single word in that sentence was wrong” GIF…

vrighter Oct 14, 2024

so… a stochastic parrot?

sc_griffith Oct 14, 2024

*guy who totally gets what these words mean* an llm simply encodes the semantics into the vectors

self Oct 14, 2024

all you gotta do is, you know, ground the symbols, and as long as you’re writing enough Lisp that should be sufficient for GAI

froztbyte Oct 14, 2024

both your comments made my eye twitch

like what’d happen if bob fucked up the symbols in a pentacle

froztbyte Oct 14, 2024

also why do we need getaddrinfo? the promptfans will always readily tell you who they are

V0ldek Oct 15, 2024

“because it encodes semantics.”

Please enlighten me on how? I admit I don’t know all the internals of the transformer model, but from what I know it encodes precisely only syntactical information, i.e. what next syntactical token is most likely to follow based on a syntactical context window.

How does it encode semantics? What is the semantics that it encodes? I doubt they have denotational or operational semantics of natural language, I don’t think something like that even exists, so it has to be some smaller model. Actually, it would be enlightening if you could tell me at least what the semantic domain here is, because I don’t think there’s any naturally obvious choice for that.
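
For anyone following along, the "which next token is most likely to follow, given a context" idea that keeps coming up in this thread can be sketched with a toy bigram counter. This is only an illustration under loud assumptions: the corpus, the function names `train_bigram` and `next_token_probs`, and the one-token context are all made up for the example, and a real transformer learns its distribution with a neural network over a much longer context window rather than by counting.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn the raw follow-counts for `prev` into relative frequencies."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # context never seen, so the toy model has nothing to say
    return {tok: n / total for tok, n in followers.items()}


# Hypothetical ten-token corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat ate".split()
model = train_bigram(corpus)
probs = next_token_probs(model, "the")
print(probs)  # "cat" follows "the" twice, "mat" once
```

The table holds nothing except co-occurrence counts over surface tokens. Whether scaling that idea up ever amounts to "encoding semantics" is exactly what the thread is arguing about, and the sketch does not settle it either way.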