🗞️ Now out in Nature Communications:

Deep neural networks and humans both benefit from compositional structure.

w/ Yoav Ram and Limor Raviv

Investigating the relationship between language learning and language structure, we find striking similarities between humans and two kinds of neural learners: small recurrent neural networks trained from scratch, and large pre-trained language models learning via in-context learning.
All these learning systems (small RNNs, pre-trained LLMs, and humans) show *very* similar memorization and generalization behavior: more structured languages lead to generalizations that are more systematic and more similar to the generalizations of human participants.
We find a similar effect for memorization errors. In the memorization test, the task for in-context LLMs boils down to copying a word that appears earlier in the prompt. But even here, more structured languages have an advantage.
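To make the copying framing concrete, here is a minimal sketch of what such an in-context memorization prompt could look like. The prompt format, the toy lexicon, and the helper name are illustrative assumptions, not the paper's actual materials:

```python
# Hypothetical sketch of an in-context memorization test (assumed
# format, not the paper's actual prompts): the prompt lists
# meaning -> word pairs from an artificial language, then queries a
# meaning already shown, so a perfect answer is literally a copy of
# an earlier token.

def build_memorization_prompt(lexicon, query_meaning):
    """Concatenate all meaning = word pairs, then the open query."""
    lines = [f"{meaning} = {word}" for meaning, word in lexicon.items()]
    lines.append(f"{query_meaning} = ")
    return "\n".join(lines)

# Toy lexicon; a compositional language would reuse parts across
# entries (e.g. "wu" for circle, "ki" for blue).
lexicon = {
    "blue circle": "wuki",
    "red circle": "wupa",
    "blue square": "moki",
}

prompt = build_memorization_prompt(lexicon, "red circle")
print(prompt)
# The correct continuation, "wupa", already appears verbatim in the prompt.
assert lexicon["red circle"] in prompt
```

Even in this pure copying setup, structure could plausibly help: reused sub-word parts make the target word more predictable from its neighbors in the prompt.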
When analyzing the learning trajectories of RNNs throughout training, we make several other interesting observations: medium-structured languages have a learnability advantage early in training (likely due to ambiguous terms in those languages) but fall behind highly structured languages later.
What can we conclude? Humans and deep nets are not so different after all when learning a new language. The simplicity bias of overparameterized models seems to guide them toward learning compositional structure, even though they could easily memorize each combination separately.