🗞️ Now out in Nature Communications:

Deep neural networks and humans both benefit from compositional structure.

w/ Yoav Ram and Limor Raviv

Investigating the relationship between language learning and language structure, we find striking similarities between humans and two kinds of neural learners: small recurrent neural networks trained from scratch, and large pre-trained language models learning via in-context learning.
All these learning systems (small RNNs, pre-trained LLMs, and humans) show *very* similar memorization and generalization behavior: more structured languages lead to generalizations that are more systematic and more similar to the generalizations of human participants.
We find a similar effect for memorization errors. In the memorization test, the task for in-context LLMs boils down to copying a word that appears earlier in the prompt. But even here, more structured languages have an advantage.
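To make the copying framing concrete, here is a minimal sketch of what such an in-context memorization prompt could look like. The prompt format, the toy lexicon, and the helper name are illustrative assumptions, not the paper's actual materials:

```python
# Hypothetical sketch of an in-context memorization test (assumed
# format, not the paper's actual prompts): the prompt lists
# meaning -> word pairs from an artificial language, then queries a
# meaning already shown, so a perfect answer is literally a copy of
# an earlier token.

def build_memorization_prompt(lexicon, query_meaning):
    """Concatenate all meaning = word pairs, then the open query."""
    lines = [f"{meaning} = {word}" for meaning, word in lexicon.items()]
    lines.append(f"{query_meaning} = ")
    return "\n".join(lines)

# Toy lexicon; a compositional language would reuse parts across
# entries (e.g. "wu" for circle, "ki" for blue).
lexicon = {
    "blue circle": "wuki",
    "red circle": "wupa",
    "blue square": "moki",
}

prompt = build_memorization_prompt(lexicon, "red circle")
print(prompt)
# The correct continuation, "wupa", already appears verbatim in the prompt.
assert lexicon["red circle"] in prompt
```

Even in this pure copying setup, structure could plausibly help: reused sub-word parts make the target word more predictable from its neighbors in the prompt.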
When analyzing the learning trajectories of RNNs throughout training, we make several other interesting observations: medium-structured languages have a learnability advantage early in training (likely due to ambiguous terms in those languages) but fall behind highly structured languages later.
What can we conclude? Humans and deep nets are not so different after all when learning a new language. The simplicity bias of overparameterized models seems to guide them toward learning compositional structure, even though they could easily memorize each combination separately.