Simplification for cognitive disabilities

Simplifying for specific disabilities, not for undefined readers

They propose:
1⃣ a dataset
2⃣ a taxonomy of the needed operations
3⃣ a model

@[email protected] @[email protected] @[email protected]
#EMNLP2022livetweet (in delay)

🧞‍♂️Human annotations for free🥳

As everyone reinvents their annotation process, GENIE🧞
decided to standardize the whole pipeline (API, MTurk...)
Upload your process, and
the next papers will be able to replicate it exactly
And they pay for annotation!

#EMNLP2022livetweet

Efficient finetuning can predict transferability
Tuning changes some of the weights
If two tasks change the same weights similarly -> similar tasks
Indeed, this predicts inter-training improvements better than previous work
link👇
@[email protected] @[email protected] @[email protected]
#EMNLP2022livetweet
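The idea can be sketched in a few lines (a toy model and a hypothetical `task_similarity` helper, not the paper's code): compare the directions in which fine-tuning moved the weights for two tasks.

```python
import numpy as np

def delta(base, tuned):
    """Flatten the per-parameter change introduced by fine-tuning."""
    return np.concatenate([(t - b).ravel() for b, t in zip(base, tuned)])

def task_similarity(base, tuned_a, tuned_b):
    """Cosine similarity of two update directions; higher ~ more related tasks."""
    da, db = delta(base, tuned_a), delta(base, tuned_b)
    return float(da @ db / (np.linalg.norm(da) * np.linalg.norm(db)))

# Toy 2-parameter "model": base weights plus fine-tuned variants.
rng = np.random.default_rng(0)
base = [rng.normal(size=(4, 4)), rng.normal(size=(4,))]
shared = [rng.normal(size=w.shape) for w in base]             # common update direction
tuned_a = [w + 0.1 * u for w, u in zip(base, shared)]
tuned_b = [w + 0.2 * u for w, u in zip(base, shared)]         # same direction, bigger step
tuned_c = [w + 0.1 * rng.normal(size=w.shape) for w in base]  # unrelated task

print(task_similarity(base, tuned_a, tuned_b))  # ≈ 1.0: same weights moved the same way
print(task_similarity(base, tuned_a, tuned_c))  # much lower: different update direction
```

The numbers are made up; the point is only that agreement between update directions is a cheap proxy for task relatedness.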

Comics is a language (keynote)🧵

language has many modalities that often mix
(e.g. text+images, but also speech, sign...)
let's understand the language of paintings

#EMNLP2022livetweet @neilcohn

Second day at #EMNLP2022. My personal favorite of the day: Razvan Pascanu's keynote at the Multilingual Representation Learning Workshop. Some ideas here will inform the next decade, I believe. #EMNLP2022livetweet

Metaphor dataset with some twists
They annotated 117K Spanish tokens (3.6K sentences) as metaphoric or not

🟦new dataset
🟦2-10% of sentences have a metaphor🤨
🟦Cross-lingual - SoTA on Spanish, and they didn't even multitask

#EMNLP2022livetweet #CoNLL2022
links👇

TyDiP: A Dataset for Politeness Classification in Nine Typologically Diverse Languages

We study politeness phenomena in nine typologically diverse languages. Politeness is an important facet of communication and is sometimes argued to be cultural-specific, yet existing computational linguistic study is limited to English. We create TyDiP, a dataset containing three-way politeness annotations for 500 examples in each language, totaling 4.5K examples. We evaluate how well multilingual models can identify politeness levels -- they show a fairly robust zero-shot transfer ability, yet fall short of estimated human accuracy significantly. We further study mapping the English politeness strategy lexicon into nine languages via automatic translation and lexicon induction, analyzing whether each strategy's impact stays consistent across languages. Lastly, we empirically study the complicated relationship between formality and politeness through transfer experiments. We hope our dataset will support various research questions and applications, from evaluating multilingual models to constructing polite multilingual agents.

Syntactic Surprisal From Neural Models Predicts, But Underestimates, Human Processing Difficulty From Syntactic Ambiguities

Humans exhibit garden path effects: When reading sentences that are temporarily structurally ambiguous, they slow down when the structure is disambiguated in favor of the less preferred alternative. Surprisal theory (Hale, 2001; Levy, 2008), a prominent explanation of this finding, proposes that these slowdowns are due to the unpredictability of each of the words that occur in these sentences. Challenging this hypothesis, van Schijndel & Linzen (2021) find that estimates of the cost of word predictability derived from language models severely underestimate the magnitude of human garden path effects. In this work, we consider whether this underestimation is due to the fact that humans weight syntactic factors in their predictions more highly than language models do. We propose a method for estimating syntactic predictability from a language model, allowing us to weigh the cost of lexical and syntactic predictability independently. We find that treating syntactic predictability independently from lexical predictability indeed results in larger estimates of garden path. At the same time, even when syntactic predictability is independently weighted, surprisal still greatly underestimate the magnitude of human garden path effects. Our results support the hypothesis that predictability is not the only factor responsible for the processing cost associated with garden path sentences.

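The paper's core quantity, surprisal, is easy to sketch (the probabilities below are hypothetical, not the paper's model outputs):

```python
import math

def surprisal(prob):
    """Surprisal in bits: -log2 P(word | context); higher = less expected."""
    return -math.log2(prob)

# Hypothetical LM probabilities for the disambiguating word in a
# garden-path sentence vs. an unambiguous control.
p_garden_path = 0.001
p_control = 0.2

# Surprisal theory predicts reading slowdown tracks this difference.
extra_bits = surprisal(p_garden_path) - surprisal(p_control)
print(round(extra_bits, 2))  # 7.64
```

The paper's move is to split this quantity into lexical and syntactic parts and weight them separately; even then, it underestimates human garden-path effects.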

Mismatches between LMs and humans

John/Johanna knew Robert/a hated *him*

Humans & GPTs find "him" surprising after Johanna
But GPTs are also surprised when they shouldn't be

@[email protected]
(No arXiv?)
https://underline.io/events/342/posters/12852/poster/66600-incremental-processing-of-principle-b-mismatches-between-neural-models-and-humans
#EMNLP2022livetweet #CoNLL2022


Semantics is not only BERT
DALLE-2:
Instead of disambiguating, it uses information from all meanings of a word simultaneously

(cool pictures that we #NLProc don't usually have as a bonus)
https://arxiv.org/abs/2210.10606
@[email protected] @[email protected] @[email protected]
#EMNLP2022livetweet #CoNLL2022

DALLE-2 is Seeing Double: Flaws in Word-to-Concept Mapping in Text2Image Models

We study the way DALLE-2 maps symbols (words) in the prompt to their references (entities or properties of entities in the generated image). We show that in stark contrast to the way human process language, DALLE-2 does not follow the constraint that each word has a single role in the interpretation, and sometimes re-use the same symbol for different purposes. We collect a set of stimuli that reflect the phenomenon: we show that DALLE-2 depicts both senses of nouns with multiple senses at once; and that a given word can modify the properties of two distinct entities in the image, or can be depicted as one object and also modify the properties of another object, creating a semantic leakage of properties between entities. Taken together, our study highlights the differences between DALLE-2 and human language processing and opens an avenue for future study on the inductive biases of text-to-image models.
