Absolutely fascinating piece by David Oks connecting language model oddities to human cultural development. Picked this up courtesy of Deena Mousa's "Under Development" newsletter.
https://davidoks.blog/p/language-models-are-weird-for-the
Language models are weird for the same reason human cultures are weird

You can’t have adaptive learning without strange tics

David Oks
@shriramk I like that the text traces these connections while mostly avoiding anthropomorphism (until the end, where it rears its ugly head among mentions of "psychology" and "mind-like" entities [sic]). A science of connectionist language patterns is waiting, which is exciting, and we may learn a great deal about humans, too. But ascribing human traits to models is IMHO a bridge too far, and it ignores that the judgment of oddness and weirdness ultimately arises from our human desire and need for interpretation. The linear algebra does not care, and that makes LLM technology not less but *more* prone to being used for manipulating humans.

@burakemir Okay, but most writers (not all, obviously) do not actually mean to anthropomorphize. It's just a convenient shorthand.

Like if I say "the LLM says" or "the LLM knows", sure, the first time I could write «the LLM "says"» or «the LLM "knows"» but the tenth time you read that in the same essay, it's irritating as heck.

Note that I'm also the kind of person who refuses to assign a gender to Alexa, Siri, etc.: it's an "it".

@shriramk No disagreement: the shorthand is more than convenient, and we don't seem to have good ways (for now?) to sustainably talk about "speak, know, answer" and goddamn "intelligence" while preserving that nuance. Anthropomorphism is very old in the general public's use of computers, and it would not be an issue if everyone had the capacity to see it as a playful shorthand. Alas, anthropomorphism is normalized and instrumentalized in marketing what amounts to a war on human labor. Apart from the risk of being unknowingly co-opted, it is also intellectually interesting to ponder why researchers do not seem capable of consistently using technical and precise language here.

@burakemir What "consistent and precise" language would you use that is not a mouthful every time you use it?

@shriramk I am no Kamlah and Lorenzen, but let me try.

* Instead of "says/replies", use "emits" or "generates" output.
* Instead of "thinks/understands", use "computes probabilities" or "maps vector spaces".
* Instead of "hallucinates", use "emits misaligned patterns".
* Instead of "learns", use "optimizes weights".
* Instead of "cooperates", say the human "initiates" interaction by providing constraints, and the model solves these constraints.

@burakemir I like "emit" and "generate", but for the most part I think the LHSs fall into exactly the "convenient shorthand" category. (I do think "think" and "understand" are egregious, but we can just say "computes" instead.)

@shriramk The question is: shorthands for *what*? :) There is something genuinely lacking: a scientific language of semiotic (self-)interaction and control.
The vector spaces are not even the point; we are interacting with a machine that retrieves lossy/compressed accounts of "meaning", and it is only our interpretation and evals that assign this meaning.