Anna Ivanova

436 Followers
179 Following
37 Posts
Postdoctoral Associate at MIT. Language and thought in brains & in machines.
Pronouns: she/her
Website: https://anna-ivanova.net/
Google Scholar: https://scholar.google.com/citations?user=hBUjCB0AAAAJ&hl=en
On the other hand, LLMs are still quite bad at most aspects of functional competence (math, reasoning, world knowledge), especially when the task deviates from commonly occurring text patterns. 5/
Armed with the formal/functional distinction, we thoroughly review the NLP literature. We show that, on the one hand, LLMs are surprisingly good at *formal* linguistic competence, making significant progress at learning phonology, morphosyntax, and more. 4/

We ground this distinction in cognitive neuroscience.

Years of empirical work show that humans have specialized neural machinery for language processing (reading, listening, speaking, etc), which is distinct from brain mechanisms underlying other cognitive capacities (social reasoning, intuitive physics, logic and math…) 3/

The key point we’re making is the distinction between *formal competence* - the knowledge of linguistic rules and patterns - and *functional competence* - a set of skills required to use language in real-world situations. 2/

Bonus: a preliminary exploration of #ChatGPT responses shows that it might also have an impossible-implausible gap (although a more detailed investigation is of course needed).

9/end

- explicit plausibility information emerges in the middle LLM layers and then stays high
- implausibility signatures generalize poorly across animate-inanimate (impossible) events and animate-animate (unlikely) events
- a probe trained on both active and passive voice sentences is as successful as a within-voice probe (but a probe trained on only one voice type fails to generalize)

6/
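The cross-condition probing setup above can be sketched roughly like this. This is a toy illustration with synthetic stand-in "activations" and a minimal nearest-centroid probe, not the paper's actual code or data:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # hypothetical hidden-state dimensionality

def fake_activations(n, offset):
    """Stand-in for LLM layer activations; real work would extract these
    from a model's hidden states for plausible vs. implausible sentences."""
    base = rng.normal(size=(n, d))
    base[:, 0] += offset  # in this toy, the plausibility signal lives on one axis
    return base

# "Active voice" items: plausible (+2) vs. implausible (-2)
X_active = np.vstack([fake_activations(50, +2), fake_activations(50, -2)])
y_active = np.array([1] * 50 + [0] * 50)
# "Passive voice" items drawn from the same toy distribution
X_passive = np.vstack([fake_activations(50, +2), fake_activations(50, -2)])
y_passive = np.array([1] * 50 + [0] * 50)

def train_centroid_probe(X, y):
    """A minimal linear probe: classify by nearest class centroid."""
    mu1, mu0 = X[y == 1].mean(0), X[y == 0].mean(0)
    w = mu1 - mu0
    b = -0.5 * (mu1 + mu0) @ w
    return lambda X_new: (X_new @ w + b > 0).astype(int)

# Train on one "voice", test on the other: cross-condition generalization
probe = train_centroid_probe(X_active, y_active)
acc = (probe(X_passive) == y_passive).mean()
```

In the paper's actual setup the probe would fail to transfer when trained on only one voice type; here the two toy conditions share a distribution, so transfer succeeds by construction.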

- LLMs generalize very well between active and passive versions of the same sentence BUT not as well as humans for synonymous sentences (The teacher bought the laptop / The instructor purchased the computer).

5/

In follow-up tests, we show that
- LLM scores depend on both plausibility and surface-level factors like word frequency (meaning that score distributions for plausible and implausible sentences are highly overlapping)

4/

LLMs are almost perfect when assigning likelihood to possible vs. impossible events but aren’t as good when it comes to likely vs. unlikely events.

(our baseline language models also show this effect)

3/
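The possible-vs.-impossible comparison rests on minimal-pair scoring: sum the model's token log-probabilities for each sentence and compare. A minimal sketch with a toy bigram model and made-up probabilities (the paper uses real LLM log-probabilities):

```python
import math

# Hypothetical bigram log-probabilities, illustrative values only.
LOGP = {
    ("<s>", "the"): math.log(0.5),
    ("the", "lawyer"): math.log(0.1),
    ("lawyer", "questioned"): math.log(0.2),
    ("questioned", "the"): math.log(0.4),
    ("the", "witness"): math.log(0.1),
    ("the", "courtroom"): math.log(0.05),
    ("courtroom", "questioned"): math.log(0.001),
}

def sentence_logprob(tokens):
    """Sum log P(w_i | w_{i-1}) -- the same quantity one reads off an
    autoregressive LLM when scoring a minimal pair."""
    total, prev = 0.0, "<s>"
    for tok in tokens:
        total += LOGP.get((prev, tok), math.log(1e-6))  # floor for unseen pairs
        prev = tok
    return total

# Possible event (animate agent) vs. impossible event (inanimate agent)
likely = sentence_logprob("the lawyer questioned the witness".split())
unlikely = sentence_logprob("the courtroom questioned the witness".split())
```

Under this toy model the possible event outscores the impossible one; the finding in the thread is that real LLMs handle this contrast well but separate likely from unlikely events much less reliably.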

#introduction #cogneuro #language

Hello! I am a researcher at MIT studying the relationship between language and other aspects of human cognition. I do so using (a) cognitive neuroscience and (b) studies of large language models. I'm originally from Russia but have lived in the US for over a decade. Attaching a gif of my brain just for fun.

