+++ PUBLICATION ALERT +++

My colleague Anastasia Menshikova and I have published our work on dependency parsing in the DGS conference proceedings!

Read it here (in German):

#sociology #llm #dependencyParsing

https://publikationen.soziologie.de/index.php/kongressband_2022/article/view/1693

Text als Daten: Extraktion von Variablen mittels LSTM-Netzwerken | Polarisierte Welten. Verhandlungen des 41. Kongresses der Deutschen Gesellschaft für Soziologie in Bielefeld 2022


The Double Helix inside the NLP Transformer
https://arxiv.org/abs/2306.13817

We introduce a framework for analyzing types of information in an NLP Transformer. We distinguish four layers of information: positional, syntactic, semantic, and contextual.

We show that the distilled positional components of the embedding vectors follow the path of a helix, both on the encoder side & on the decoder side.
...

#NLP #MachineLearning #transformers #DependencyParsing #PartsOfSpeech #VectorSpace #informatics


We introduce a framework for analyzing various types of information in an NLP Transformer. In this approach, we distinguish four layers of information: positional, syntactic, semantic, and contextual. We also argue that the common practice of adding positional information to semantic embeddings is suboptimal and propose instead a Linear-and-Add approach. Our analysis reveals an autogenetic separation of positional information through the deep layers. We show that the distilled positional components of the embedding vectors follow the path of a helix, both on the encoder side and on the decoder side. We additionally show that on the encoder side, the conceptual dimensions generate Part-of-Speech (PoS) clusters. On the decoder side, we show that a di-gram approach helps to reveal the PoS clusters of the next token. Our approach paves the way to elucidating the processing of information through the deep layers of an NLP Transformer.
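For a rough intuition about why positional information can be helical (this is not the paper's distilled components, just the classic sinusoidal positional encoding of Vaswani et al., 2017): each (sin, cos) dimension pair of the standard encoding lies on a circle of constant radius, so plotting such a pair against the position index traces a helix. A minimal NumPy sketch:

```python
import numpy as np

def sinusoidal_pe(n_pos, d_model, base=10000.0):
    """Classic sinusoidal positional encoding (Vaswani et al., 2017)."""
    pos = np.arange(n_pos)[:, None]           # (n_pos, 1)
    i = np.arange(d_model // 2)[None, :]      # (1, d_model/2) frequency index
    angle = pos / base ** (2 * i / d_model)   # (n_pos, d_model/2)
    pe = np.zeros((n_pos, d_model))
    pe[:, 0::2] = np.sin(angle)               # even dims: sine
    pe[:, 1::2] = np.cos(angle)               # odd dims: cosine
    return pe

pe = sinusoidal_pe(n_pos=128, d_model=64)

# Each (sin, cos) pair has constant radius 1 at every position,
# so (pe[:, 0], pe[:, 1], position) traces a helix in 3D.
radius = np.sqrt(pe[:, 0] ** 2 + pe[:, 1] ** 2)
print(np.allclose(radius, 1.0))  # True
```

The paper's finding concerns positional components distilled from learned embeddings inside the deep layers, which is a stronger statement than this textbook construction.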
