Lukas Galke

586 Followers
385 Following
423 Posts

Assistant Professor of Data Science and Advanced Machine Learning at the University of Southern Denmark in Odense

Machine Learning, Natural Language Processing, Interpretability

Previously:
Postdoc @mpi_nl
PhD @ Kiel University, Germany

#ML, #NLProc

Website: http://lpag.de
ORCID: https://orcid.org/0000-0001-6124-1092
Google Scholar: https://scholar.google.de/citations?hl=en&user=AHGGdYQAAAAJ&view_op=list_works&sortby=pubdate
What can we conclude? Humans and deep nets are not so different after all when learning a new language. The simplicity bias of overparameterized models seems to guide them toward learning compositional structures, even though they could easily memorize all the different combinations.
When analyzing the learning trajectories of RNNs throughout training, we make several other interesting observations: medium-structured languages have a learnability advantage early in training (likely due to ambiguous terms in those languages) but fall behind highly structured languages later.
We find a similar effect when looking at memorization errors. In the memorization test, the task for in-context LLMs boils down to copying a word that appears earlier in the prompt. But even here, we can see an advantage of language structure.
All these learning systems (small RNNs, pre-trained LLMs, and humans) show *very* similar memorization and generalization behavior: more structured languages lead to generalizations that are more systematic and more similar to those of human participants.
Investigating the relationship between language learning and language structure, we find striking similarities between humans and language models, both small recurrent neural networks trained from scratch and large pre-trained language models evaluated via in-context learning.

๐Ÿ—ž๏ธ Now out in Nature Communications:

Deep neural networks and humans both benefit from compositional structure.

w/ Yoav Ram and Limor Raviv

Preventing catastrophic forgetting in NLP! 🌟 Our discrete key-value bottleneck enables efficient continual learning in encoder-only language models: no major updates, just localized tweaks. With Andor Diera and @lpag. Learn more! 🚀 https://arxiv.org/abs/2412.08528
Continual Learning for Encoder-only Language Models via a Discrete Key-Value Bottleneck

Continual learning remains challenging across various natural language understanding tasks. When models are updated with new training data, they risk catastrophic forgetting of prior knowledge. In the present work, we introduce a discrete key-value bottleneck for encoder-only language models, allowing for efficient continual learning by requiring only localized updates. Inspired by the success of a discrete key-value bottleneck in vision, we address new and NLP-specific challenges. We experiment with different bottleneck architectures to find the most suitable variants for language, and present a generic, task-independent discrete key initialization technique for NLP. We evaluate the discrete key-value bottleneck in four continual learning NLP scenarios and demonstrate that it alleviates catastrophic forgetting. We showcase that it offers performance competitive with other popular continual learning methods, at lower computational cost.
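To make the mechanism concrete, here is a minimal PyTorch sketch of such a bottleneck layer. This is my own illustration, not the paper's implementation: all names, shapes, and the random key initialization are assumptions.

```python
import torch
import torch.nn as nn

class DiscreteKeyValueBottleneck(nn.Module):
    """Sketch of a discrete key-value bottleneck layer.

    Each of `num_heads` query sub-vectors is snapped to its nearest key in a
    frozen per-head codebook; the matching learnable value vector is returned.
    Because gradients only reach the values that were actually selected,
    updates stay localized.
    """

    def __init__(self, num_heads: int = 8, codes_per_head: int = 256,
                 key_dim: int = 64, value_dim: int = 64):
        super().__init__()
        # Keys are frozen after initialization (random here for brevity;
        # the paper proposes a data-driven key initialization for NLP).
        self.keys = nn.Parameter(
            torch.randn(num_heads, codes_per_head, key_dim), requires_grad=False)
        # Values are the only trainable parameters of the bottleneck.
        self.values = nn.Parameter(torch.zeros(num_heads, codes_per_head, value_dim))

    def forward(self, queries: torch.Tensor) -> torch.Tensor:
        # queries: (batch, num_heads, key_dim), e.g. a chunked encoder embedding
        dists = torch.cdist(queries.transpose(0, 1), self.keys)  # (heads, batch, codes)
        idx = dists.argmin(dim=-1)                                # (heads, batch)
        # Look up the value vector behind the selected key in every head.
        out = torch.stack([self.values[h][idx[h]]
                           for h in range(self.values.shape[0])], dim=1)
        return out  # (batch, num_heads, value_dim)

bottleneck = DiscreteKeyValueBottleneck()
queries = torch.randn(4, 8, 64)    # stand-in for chunked encoder outputs
retrieved = bottleneck(queries)    # (4, 8, 64), fed to a lightweight task head
```

The design choice that matters for continual learning is that the keys stay frozen: a new task can only write to the value slots it actually retrieves, leaving the rest of the model untouched.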

arXiv.org

We have some openings for PhD/Postdoc positions on multilingual language modeling at SDU's Centre for Machine Learning, Denmark. Topics range from the core of pre-training and instruction tuning to adjacent areas such as efficient language modeling. Please consider applying and/or resharing :)

https://tinyurl.com/dfm2025phd

https://tinyurl.com/dfm2025postdoc

Several PhD scholarships in Artificial Intelligence

Application deadline: 19 December 2024 at 23:59 hours local Danish time

SDU Career Site
