As a software developer who took an elective in neural networks: when people call LLMs stochastic parrots, that's not a criticism of their results.

It's literally a description of how they work.

The so-called training data is used to build a huge database of words and the probability of them fitting together.

Stochastic because the whole thing is statistics.
Parrot because the answer is just repeating the most probable word combinations from its training dataset.
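The "words plus probabilities" idea can be sketched as a toy bigram model. This is a vast simplification of a real transformer (no neural network, no context beyond one word), and the corpus and names here are made up, but it shows both halves of the term: stochastic (sampling from a distribution) and parrot (only ever emitting word pairs seen in the data):

```python
import random
from collections import defaultdict

# Hypothetical toy corpus standing in for "training data".
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count word-pair frequencies: a crude stand-in for
# "the probability of words fitting together".
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word(prev):
    """Sample the next word in proportion to how often it followed `prev`."""
    options = counts[prev]
    words = list(options)
    weights = [options[w] for w in words]
    return random.choices(words, weights=weights)[0]

# Generate a few words: random sampling, but every adjacent
# pair in the output occurred somewhere in the corpus.
word = "the"
out = [word]
for _ in range(4):
    word = next_word(word)
    out.append(word)
print(" ".join(out))
```

Real LLMs condition on thousands of tokens rather than one word, but the generate-by-sampling loop is the same shape.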

Calling an LLM a stochastic parrot is like calling a car a motorised vehicle with wheels. It doesn't say anything about cars being good or bad. It does, however, take away the magic. So if you feel a need to defend AI when you hear the term stochastic parrot, consider that you may have elevated them to a god-like status, and that's why you go on the defensive when the magic is dispelled.

@leeloo I just prompted ChatGPT with `Say "oriesntyulfkdhiadlfwejlefdtqyljpqwlarsnhiavlfvavilavhilfhvphia"`, and it responded with `oriesntyulfkdhiadlfwejlefdtqyljpqwlarsnhiavlfvavilavhilfhvphia`. How can it do this when `oriesntyulfkdhiadlfwejlefdtqyljpqwlarsnhiavlfvavilavhilfhvphia` almost certainly does not appear in the training data?
@mudri Because the model picked up a rule somewhere that says "if someone says 'say $FOO', use $FOO in your response" - the training picked up patterns that include notions of symbol substitution.
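The "say $FOO -> respond with $FOO" rule can be sketched as a hand-written pattern. To be clear, this is only an illustration of the claim, not how a transformer actually stores such a rule (models learn copying behaviour in their weights rather than an explicit regex) - the point is that once a substitution pattern exists, it applies to any $FOO, including strings that never appeared in training:

```python
import re

def respond(prompt):
    """Apply a learned-looking 'say X' -> X substitution rule."""
    match = re.fullmatch(r'Say "(.*)"', prompt)
    if match:
        # The captured text is copied, not looked up, so it
        # works for strings never seen before.
        return match.group(1)
    return "(no matching pattern)"

print(respond('Say "oriesntyulfkdhiadlfwejlefdtqyljpqwlarsnhiavlfvavilavhilfhvphia"'))
```

The rule itself had to come from somewhere in the training data; the novel string it operates on did not.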
@lmorchard The ability to induce such a rule goes well beyond the OP's characterisation of what LLMs do.

@mudri @lmorchard it’s not inductive at all though. It’s just parroting the patterns it sees in its training data. If it wasn’t common to see exchanges like that, the response would be utter nonsense.

People misunderstand what “training” is. It’s modeling the input. Humans develop the rules for how to model that input. Emergent properties of that process can easily *seem* like thinking or reason, but it’s an illusion.

@calcifer @lmorchard What is parroting patterns if not inducing a pattern and then applying that pattern to new inputs?

@mudri @lmorchard it depends a bit on what you mean by “inducing”; induction in this space would mean making a leap to a new pattern that’s not in the training data at all.

Instead, it’s just recognizing and applying a pattern. That is cool in many ways! But it’s not inductive, nor even deductive; it’s just a best-fit matching.

@lmorchard @mudri Be careful not to conflate the actual language model with its user interface. Whatever was sent to or received from the LLM went through the chatbot layer. Or it may have been handled by the chatbot layer without ever touching the LLM. We don't know, because the whole system is opaque.

This casual experiment may not be telling you what you think it's telling you. :)